Wiktionary thwiktionary https://th.wiktionary.org/wiki/%E0%B8%A7%E0%B8%B4%E0%B8%81%E0%B8%B4%E0%B8%9E%E0%B8%88%E0%B8%99%E0%B8%B2%E0%B8%99%E0%B8%B8%E0%B8%81%E0%B8%A3%E0%B8%A1:%E0%B8%AB%E0%B8%99%E0%B9%89%E0%B8%B2%E0%B8%AB%E0%B8%A5%E0%B8%B1%E0%B8%81 MediaWiki 1.46.0-wmf.24 case-sensitive สื่อ พิเศษ พูดคุย ผู้ใช้ คุยกับผู้ใช้ วิกิพจนานุกรม คุยเรื่องวิกิพจนานุกรม ไฟล์ คุยเรื่องไฟล์ มีเดียวิกิ คุยเรื่องมีเดียวิกิ แม่แบบ คุยเรื่องแม่แบบ วิธีใช้ คุยเรื่องวิธีใช้ หมวดหมู่ คุยเรื่องหมวดหมู่ ภาคผนวก คุยเรื่องภาคผนวก ดัชนี คุยเรื่องดัชนี สัมผัส คุยเรื่องสัมผัส อรรถาภิธาน คุยเรื่องอรรถาภิธาน TimedText TimedText talk มอดูล คุยเรื่องมอดูล Event Event talk ເອົາ 0 5652 5720724 2189812 2026-04-21T04:48:00Z Ai Ku Karng 17824 /* ภาษาลาว */ 5720724 wikitext text/x-wiki {{also/auto}} == ภาษาลาว == === รากศัพท์ === {{inh+|lo|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|th|เอา}}, {{cog|tts|เอา}}, {{cog|nod|ᩐᩣ}}, {{cog|kkh|ᩐᩢᩣ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|twh|ꪹꪮꪱ}}, {{cog|shn|ဢဝ်}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}} === การออกเสียง === {{lo-pron}} === คำกริยา === {{lo-verb}} # {{lb|lo|สกรรม}} [[เอา]] ==== ลูกคำ ==== {{col4|lo |ເອົາການ |ເອົາການເອົາງານ |ເອົາງານ |ເອົາຈິງເອົາຈັງ |ເອົາໃຈໃສ່ |ເອົາປຽບ |ເອົາຜົວ |ເອົາເມຍ |ເອົາເລື່ອງ |ເອົາໜ້າ }} pxifdp81ge8ulzd2sq906qyjff23gck city 0 10271 5720694 2188018 2026-04-21T01:37:58Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}}) 5720694 wikitext text/x-wiki == ภาษาอังกฤษ == {{wp|lang=en}} === รูปแบบอื่น === * {{alter|en|citie|cittie|cyte|cytee||เลิกใช้}} === รากศัพท์ === {{inh+|en|enm|city}}, {{m|enm|citie}}, {{m|enm|citee}}, {{m|enm|cite}}, จาก{{der|en|fro|cité}}, จาก{{der|en|la|cīvitās|t=[[citizenry]]; [[community]]; a city with its [[hinterland]]}}, จาก {{m|la|cīvis|t=[[native]]; [[townsman]]; [[citizen]]}}, จาก{{der|en|ine-pro|*ḱey-|t=to lie down, settle; home, family; love; beloved}} ร่วมเชื้อสายกับ{{cog|ang|hīwan|g=p|t=members of one's household, servants}}; ดูเพิ่มที่ 
{{m|en|hewe}}; {{doublet|en|civitas}} เข้าแทนที่คำพื้นถิ่น {{noncog|enm|burgh}}, {{m|enm|borough|t=[[fortified]] [[town]]; [[incorporated]] city}} และ {{m|enm|sted}}, {{m|enm|stede|t=[[place]], [[stead]]; city}} === การออกเสียง === [[ไฟล์:Empire State Building Aerial.JPG|thumb|Part of New York City, a large '''city''' with many tall buildings.]] [[ไฟล์:Aerial view of Wells.jpg|thumb|Despite its small size, Wells is a '''city''' because of its cathedral.]] * {{IPA|en|/ˈsɪti/}} * {{IPA|en|/sɪtɪ/|a=Northern England}} * {{IPA|en|/ˈsɪɾi/|a=US}} * {{audio|en|en-us-city.ogg|a=US}} * {{audio|en|En-uk-city.ogg|a=UK}} * {{audio|en|EN-AU ck1 city.ogg|a=AU}} * {{rhymes|en|ɪti}} * {{hyphenation|en|ci|ty}} === คำนาม === {{en-noun}} # [[เมือง]][[ใหญ่]], [[นคร]] #: {{ux|en|São Paulo is the largest '''city''' in South America.}} #* {{RQ:Ferguson Zollenstein|IV}} #*: So this was my future home, I thought!{{...}}Backed by towering hills, the but faintly discernible purple line of the French boundary off to the southwest, a sky of palest Gobelin flecked with fat, fleecy little clouds, it in truth looked a dear little '''city'''; the '''city''' of one's dreams. #* {{quote-journal|en|date=2014-06-14|volume=411|issue=8891|magazine={{w|The Economist}} | title=[http://www.economist.com/news/science-and-technology/21604091-it-possible-sniff-out-problems-sewer-pipes-they-happen-its-gas It's a gas] | passage=One of the hidden glories of Victorian engineering is proper drains. Isolating a '''city'''’s effluent and shipping it away in underground sewers has probably saved more lives than any medical procedure except vaccination.}} #* {{quote-journal|en|date=2020 July 15|author=Mike Brown talks to Paul Clifton|title=Leading London's "hidden heroes"|journal=Rail|page=42|text=All our stations have changed. We have to constrain numbers. We have to mandate face coverings. These are massive changes in what is a public transport '''city'''. 
This is not a car '''city'''.}} # {{label|en|บริเตน}} A settlement granted special status by royal charter or [[letters patent]]; traditionally, a settlement with a [[cathedral]] regardless of size. #* '''1976''', Cornelius P. Darcy, ''The Encouragement of the Fine Arts in Lancashire, 1760-1860'', Manchester University Press ({{ISBN|9780719013300}}), page 20 #*: Manchester, incorporated in 1838, was made the centre of a bishopric in 1847 and became a '''city''' in 1853. Liverpool was transformed into a '''city''' by Royal Charter when the new diocese of Liverpool was created in 1880. #* '''2014''', Graham Rutt, ''Cycling Britain's Cathedrals Volume 1'', Lulu.com ({{ISBN|9781326056049}}), page 307 #*: St Davids itself is the smallest '''city''' in Great Britain, with a population of less than 2,000. # {{lb|en|ออสเตรเลีย}} [[เขต]][[ศูนย์กลาง]][[ธุรกิจ]], [[ตัว]]เมือง, [[ใน]]เมือง #: {{ux|en|I'm going into the '''city''' today to do some shopping.}} # {{lb|en|สแลง}} [[ปริมาณ]][[มหาศาล]] {{q|ใช้หลังคำนาม}} #: ''It's video game '''city''' in here!'' ==== คำจ่ากลุ่ม ==== * {{l|en|settlement}} ==== ลูกคำ ==== {{col4|en | cathedral city | cidiot | citify | citizen | city and county | city banker | city block | city boy | city center | city centre | city clerk | city desk | city district | city father | city gent | city girl | city hall | [[city limit]](s) | city line | city man | city manager | city map | city planning | city room | city slicker | city-state | cityite | cityscape | citywide | cross-city | freedom of the city | free of the city | garden city | Hanseatic city | holy city | host city | [[inner city]], [[inner-city]] | megacity | sister city | the city | twin city }} {{col4|en|title=สถานที่ที่ลงท้ายด้วย ''City'' | Archer City | Arkansas City | Ashland City | Atlantic City | Bay City | Beaver City | Belize City | Center City | Charles City | Columbia City | Cross City | Dade City | Dakota City | Dodge City | Forrest City | Garden City | Granite City | Hill City 
| Ivy City | Jefferson City | Jersey City | Johnson City | Junction City | Kansas City | Lake City | Long Island City | Loup City | Mexico City | Nebraska City | Ness City | New York City | Oklahoma City | Panama City | Pawnee City | Pine City | Quebec City | Quezon City | Rapid City | Redwood City | Reed City | Rio Grande City | Rogers City | Sac City | Salt Lake City | Sioux City | Surf City | Tawas City | Traverse City | Tunnel City | Union City | Valley City | Vatican City | White City | Yazoo City | Yuba City }} ==== คำเกี่ยวข้อง ==== * {{l|en|civic}} * {{l|en|civil}} ==== คำสืบทอด ==== * {{desc|fr|City|bor=1}} * {{desc|de|City|bor=1}} * {{desc|it|city|bor=1}} * {{desc|sv|city|bor=1}} === ดูเพิ่ม === * {{l|en|metropolis}} * {{l|en|megalopolis}} * {{l|en|megacity}} * {{l|en|multicity}} === แหล่งข้อมูลอื่น === * {{R:Keywords|page=55}} === คำสลับอักษร === * {{anagrams|en|a=city|ICTY}} {{topics|en|นคร}} pqv3z7l1nqqnukaycvup5yywxcsis6r เชียงใหม่ 0 25817 5720690 2032418 2026-04-21T01:34:01Z OctraBot 3198 5720690 wikitext text/x-wiki == ภาษาไทย == {{วิกิพีเดีย|จังหวัดเชียงใหม่}} {{วิกิพีเดีย|เทศบาลนครเชียงใหม่}} {{วิกิพีเดีย|มหาวิทยาลัยเชียงใหม่}} [[ไฟล์:Amphoe Chiang Mai.svg|thumb|right|150px|เชียงใหม่]] === รากศัพท์ === {{คำประสม|th|เชียง|ใหม่}} === การออกเสียง === {{th-pron|เชียง-ไหฺม่}} === คำวิสามานยนาม === {{th-proper noun}} # {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาค]][[เหนือ]]ของ[[ประเทศไทย]] #: {{syn|th|ชม|q1=อักษรย่อ}} <!-- undecided former names พิงค์|นพบุรี|นพีสี --> # ชื่อ[[เทศบาลนคร]]ในจังหวัดเชียงใหม่ # [[ชื่อ]][[มหาวิทยาลัย]][[ใน]][[กำกับ]]ของ[[รัฐ]] [[แห่ง]][[หนึ่ง]]ใน[[ประเทศไทย]] ==== คำแปลภาษาอื่น ==== {{trans-top|ชื่อจังหวัด}} * ไทลื้อ: {{t+|khb|ᦵᦋᧂᦺᦖᧈ}} * ไทใหญ่: {{t+|shn|ၵဵင်းမႆႇ}}, {{t+|shn|ၵဵင်းမႂ်ႇ}} * ลาว: {{t+|lo|ຊຽງໃໝ່}} * อังกฤษ: {{t+|en|Chiang Mai}} {{trans-bottom}} {{topics|th|เชียงใหม่|นครในไทย}} afgegm2hy4ifdtdke6cis3zw09r1d6k นนทบุรี 0 25852 5720692 1871885 2026-04-21T01:34:20Z OctraBot 3198 /* คำแปลภาษาอื่น */ 5720692 
wikitext text/x-wiki == ภาษาไทย == {{วิกิพีเดีย|จังหวัดนนทบุรี}} {{วิกิพีเดีย|เทศบาลนครนนทบุรี}} [[ไฟล์:Amphoe Nonthaburi.png|thumb|150px|right|นนทบุรี]] === รากศัพท์ === {{com|th|นนท|บุรี}} === การออกเสียง === {{th-pron|นน-ทะ-บุ-รี|นน-บุ-รี}} === คำวิสามานยนาม === {{th-proper noun}} # {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาคกลาง]]ของ[[ประเทศไทย]] #: {{syn|th|นบ|q1=อักษรย่อ|นนท์|q2=ภาษาปาก}} # ชื่อ[[เทศบาลนคร]]ในจังหวัดนนทบุรี ==== คำแปลภาษาอื่น ==== {{trans-top| (1) ชื่อจังหวัด}} * รัสเซีย: {{t+|ru|Нонтхабури}} * ลาว: {{t|lo|ນົນທະບຸລີ}} * [[ภาษาอังกฤษ|อังกฤษ]] : {{t+|en|Nonthaburi}} {{trans-bottom}} {{topics|th|จังหวัดในไทย|นครในไทย}} 5lcdbk0efwjefc2xhrbw0lytsjcb82k ประจวบคีรีขันธ์ 0 25876 5720691 5644801 2026-04-21T01:34:09Z OctraBot 3198 /* คำแปลภาษาอื่น */ 5720691 wikitext text/x-wiki == ภาษาไทย == {{wp|จังหวัด+}} [[ไฟล์:Amphoe Prachuap Khiri Khan.png|thumb|150px|right|ประจวบคีรีขันธ์]] === รากศัพท์ === {{คำประสม|th|ประจวบ|คีรี|ขันธ์}} === การออกเสียง === {{th-pron|ปฺระ-จวบ-คี-รี-ขัน}} === คำวิสามานยนาม === {{th-proper noun}} # {{lang|th|([[จังหวัด]]~)}} ชื่อ[[จังหวัด]]ใน[[ภาคตะวันตก]]ของ[[ประเทศไทย]] #: {{syn|th|ปข|q1=อักษรย่อ|ประจวบ|q2=ภาษาปาก}} # ชื่อ[[อำเภอ]]เมืองในจังหวัดประจวบคีรีขันธ์ # ชื่อ[[เทศบาลเมือง]]ในจังหวัดประจวบคีรีขันธ์ ====คำเกี่ยวข้อง==== * [[ปัจจันตคิรีเขตร]] ====คำแปลภาษาอื่น==== {{trans-top|ชื่อจังหวัด}} * [[ภาษาจีน|จีน]] : [[班武里府]], [[巴蜀府]] * [[ภาษาพม่า|พม่า]] : [[ပရာချွတ်ခီရိခန်း]] * [[ภาษาอังกฤษ|อังกฤษ]] : [[Prachuap Khiri Khan]] {{trans-bottom}} {{topics|th|เมือง|จังหวัดในไทย|นครในไทย}} pm8rzlzdifmu9kfilevcb9t1cv9zptf levrette 0 27342 5720713 1118285 2026-04-21T02:11:32Z OctraBot 3198 /* ภาษาฝรั่งเศส */ เก็บกวาด 5720713 wikitext text/x-wiki == ภาษาฝรั่งเศส == === รากศัพท์ === จาก{{suffix|fr|lévrier|ette|id2=female}}. 
=== การออกเสียง === * {{fr-IPA}} * {{audio|fr|LL-Q150 (fra)-Mecanautes-levrette.wav|a=France}} ** {{audio|fr|LL-Q150 (fra)-Lepticed7-levrette.wav|a=<<France>> (<<Toulouse>>)}} ** {{audio|fr|LL-Q150 (fra)-Poslovitch-levrette.wav|a=<<France>> (<<Vosges>>)}} * {{rhymes|fr|ɛt|s=2}} === คำนาม === {{fr-noun|f}} # [[สุนัข]][[เกรย์ฮาวด์]][[ตัวเมีย]] # {{lb|fr|slang}} [[ท่าหมา]] #: {{uxi|fr|Moi, j'aime la '''levrette'''.|I like it '''doggy style'''.}} #* {{RQ:Despentes King Kong|page=90|chapter=Porno sorcières|text=Censure et interdiction sont réclamées à cor et à cri par des militants effarés, comme si leur vie en dépendait. Cette attitude est objectivement surprenante : est-ce qu'une '''levrette''' en gros plan menace la sûreté de l'État ?}} ==== ลูกคำ ==== * {{l|fr|leuleu}} ==== คำสืบทอด ==== * {{desc|ro|levretă|bor=1}} === อ่านเพิ่ม === * {{R:fr:TLFi}} {{C|fr|หมา|Sighthounds|Sex positions}} c4rijsd46bzzyhnvrlnyief23ksg8jv 5720714 5720713 2026-04-21T02:11:54Z OctraBot 3198 /* ภาษาฝรั่งเศส */ 5720714 wikitext text/x-wiki == ภาษาฝรั่งเศส == === รากศัพท์ === จาก{{suffix|fr|lévrier|ette|id2=female}} === การออกเสียง === * {{fr-IPA}} * {{audio|fr|LL-Q150 (fra)-Mecanautes-levrette.wav|a=France}} ** {{audio|fr|LL-Q150 (fra)-Lepticed7-levrette.wav|a=<<France>> (<<Toulouse>>)}} ** {{audio|fr|LL-Q150 (fra)-Poslovitch-levrette.wav|a=<<France>> (<<Vosges>>)}} * {{rhymes|fr|ɛt|s=2}} === คำนาม === {{fr-noun|f}} # [[สุนัข]][[เกรย์ฮาวด์]][[ตัวเมีย]] # {{lb|fr|slang}} [[ท่าหมา]] #: {{uxi|fr|Moi, j'aime la '''levrette'''.|I like it '''doggy style'''.}} #* {{RQ:Despentes King Kong|page=90|chapter=Porno sorcières|text=Censure et interdiction sont réclamées à cor et à cri par des militants effarés, comme si leur vie en dépendait. 
Cette attitude est objectivement surprenante : est-ce qu'une '''levrette''' en gros plan menace la sûreté de l'État ?}} ==== ลูกคำ ==== * {{l|fr|leuleu}} ==== คำสืบทอด ==== * {{desc|ro|levretă|bor=1}} === อ่านเพิ่ม === * {{R:fr:TLFi}} {{C|fr|หมา|Sighthounds|Sex positions}} ov4uxb7yhdpa9cgxwyv41de9a3po9wm capital city 0 28316 5720695 5677150 2026-04-21T01:38:12Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}}) 5720695 wikitext text/x-wiki == ภาษาอังกฤษ == === การออกเสียง === * {{IPA|en|/ˌkæpɪtəl ˈsɪti/}} * {{audio|en|EN-AU ck1 capital city.ogg|a=AU}} === คำนาม === {{en-noun}} # {{senseid|en|Q5119}} [[เมืองหลวง]], [[เมือง]]ที่เป็น[[ที่ตั้ง]]ของ[[รัฐบาล]] # เมือง (หรือหลายเมือง) ที่มีขนาดใหญ่กว่าหรือมีความสำคัญต่อประเทศมากกว่าเมืองอื่น ๆ ทั้งหมด โดยไม่คำนึงถึงที่ตั้งของรัฐบาลที่แท้จริง (เช่น [[มอสโก]]และ[[เซนต์ปีเตอร์สเบิร์ก]] โดยไม่คำนึงถึงการย้ายที่ตั้งของรัฐบาล[[รัสเซีย]] หรือ[[อัมสเตอร์ดัม]] ถึงแม้ว่ารัฐบาล[[เนเธอร์แลนด์]]จะตั้งอยู่ที่[[เฮก]]ตั้งแต่ปี 1588) # {{label|en|AU}} [[เขตมหานคร]]หลัก ๆ ของ[[ออสเตรเลีย]] ([[ซิดนีย์]] [[เมลเบิร์น]] [[บริสเบน]] [[เพิร์ท]] [[แอดิเลด]]) ซึ่งทั้งหมดเป็นเมืองหลวงของ[[รัฐ]] และโดยนัยเดียวกันก็รวมถึงเขตมหานครของประเทศอื่น ๆ ด้วย ในขณะที่เมืองหลวงของรัฐอื่นอย่างเช่น [[โฮบาร์ต]] [[ดาร์วิน]] และเมืองหลวงของประเทศ [[แคนเบอร์รา]] มักจะไม่นับรวม ==== คำพ้องความ ==== * {{qualifier|การละ}} [[capital]] {{C|en|นคร}} 1uxv4oxusbprr7e644vd9ayzq8942hb capital gain 0 28987 5720715 1332023 2026-04-21T02:15:21Z OctraBot 3198 /* ภาษาอังกฤษ */ เก็บกวาด 5720715 wikitext text/x-wiki == ภาษาอังกฤษ == === การออกเสียง === * {{audio|en|LL-Q1860 (eng)-Vealhurl-capital gain.wav|a=Southern England}} === คำนาม === {{en-noun|~}} # {{lb|en|economics|business|finance}} [[กำไรประเภททุน]]; การเพิ่มขึ้นของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของเกินกว่าต้นทุนของเจ้าของ #: {{ant|en|capital loss}} #: {{rfquote-sense|en}} 6stokupp5u4u0vliq5kvuk0n3tfpiuk 5720717 5720715 2026-04-21T02:17:09Z OctraBot 
3198 /* คำนาม */ 5720717 wikitext text/x-wiki == ภาษาอังกฤษ == === การออกเสียง === * {{audio|en|LL-Q1860 (eng)-Vealhurl-capital gain.wav|a=Southern England}} === คำนาม === {{en-noun|~}} # {{lb|en|economics|business|finance}} [[กำไรประเภททุน]]; การเพิ่มขึ้นของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของ เกินกว่าต้นทุนของเจ้าของ #: {{ant|en|capital loss}} #: {{rfquote-sense|en}} 0ey9orpnzvhsc2waa4x13p6luk60ycn เที่ยว 0 31114 5720719 1885100 2026-04-21T03:43:19Z Ai Ku Karng 17824 /* ภาษาไทย */ 5720719 wikitext text/x-wiki {{also/auto}} == ภาษาไทย == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|za|deuh|tr=แต่ว}}, {{cog|zzj|teuh|tr=แท่ว}}; สำหรับคำกริยา เทียบ{{cog|za|liuh|tr=ลิ่ว}}, {{cog|zzj|lieuh|tr=เลี่ยว}}, {{cog|lo|ທ່ຽວ}} === การออกเสียง === {{th-pron}} === คำนาม === {{th-noun}} # ใช้เรียกการไปยังที่ซึ่ง[[กำหนด]]ไว้ครั้งหนึ่ง ๆ หรือ[[ไปกลับ]]รอบหนึ่ง ๆ #: {{ux|th|เที่ยว[[ขึ้น]], เที่ยว[[ล่อง]], เที่ยวไป, เที่ยวกลับ}} === คำลักษณนาม === {{th-cls}} # [[ลักษณนาม]]บอกอาการเช่นนั้น #: {{ux|th|ไป ๒ เที่ยว}} #: {{ux|th|มา ๓ เที่ยว}} === คำกริยา === {{th-verb}} # [[กิริยา]]ที่ไปที่โน่นที่นี่[[เรื่อยไป]] มักใช้พูดประกอบกับกริยาอื่น #: {{ux|th|เที่ยวหา, เที่ยวพูด, เที่ยวกิน, เที่ยวนอน}} # ไปไหน ๆ เพื่อความ[[เพลิดเพลิน]]ตาม[[สบาย]] #: {{ux|th|ไปเที่ยว, เดินเที่ยว, ท่องเที่ยว}} # [[เตร็ดเตร่]]ไปเพื่อหาความ[[สนุก]]เพลิดเพลินตามที่ต่าง ๆ #: {{ux|th|เที่ยวงานกาชาด}} ==== คำเกี่ยวข้อง ==== * [[ท่องเที่ยว]] ==== ดูเพิ่ม ==== * {{l|th|เทียว}} j5qpgbjz1e8p3tmlwt8iuzyhtinqfth เอา 0 32916 5720725 5030228 2026-04-21T04:48:42Z Ai Ku Karng 17824 /* ภาษาไทย */ 5720725 wikitext text/x-wiki {{also/auto}} == ภาษาไทย == === รากศัพท์ === {{inh+|th|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|tts|เอา}}, {{cog|lo|ເອົາ}}, {{cog|nod|ᩐᩣ}}, {{cog|kkh|ᩐᩢᩣ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|shn|ဢဝ်}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}} === การออกเสียง === {{th-pron}} === คำกริยา === {{th-verb}} # [[ยึด]] #: 
{{ux|th|เอาไว้อยู่}} # [[รับ]][[ไว้]] #: {{ux|th|เขาให้ก็เอา}} # [[พา]], [[นำ]] #: {{ux|th|เอาตัวมา}} # [[ต้องการ]] #: {{ux|th|ทำเอาชื่อ}} #: {{ux|th|ทำงานเอาหน้า}} # [[ถือ]][[เป็น]][[สำคัญ]] #: {{ux|th|เจรจาเอาถ้อยคำ}} #: {{ux|th|เอาพี่เอาน้อง}} # {{lb|th|ปาก}} [[คำ]]ใช้แทน[[กริยา]]อื่น ๆ บางคำได้ ==== คำแปลภาษาอื่น ==== {{trans-top|พา, นำ}} * คำเมือง: {{t+|nod|ᩐᩣ}} * ไทดำ: {{t+|blt|ꪹꪮꪱ}} * ไทใหญ่: {{t+|shn|ဢဝ်}} * ลาว: {{t+|lo|ເອົາ}} * อังกฤษ: {{t+|en|take|tr=เทค}} {{trans-bottom}} === คำกริยาวิเศษณ์ === {{th-adv|-}} # เมื่อใช้[[ลงท้าย]]กริยา เป็นการ[[เน้น]]กริยาแสดงถึงการ[[ตั้งหน้าตั้งตา]][[ทำ]][[ต่อเนื่อง]][[กัน]] #: {{ux|th|กินเอา ๆ}} == ภาษาคำเมือง == === การออกเสียง === * {{IPA|nod|/ʔaw˧˧/|a=เชียงใหม่}} === คำกริยา === {{nod-verb}} # {{lb|nod|สกรรม}} {{alternative form of|nod|ᩐᩣ}} == ภาษาชอง == === รากศัพท์ === {{inh+|cog|mkh-pro|*ʔaawʔ}} === การออกเสียง === * {{IPA|cog|/ʔaw/|a=จันทบุรี,ตราด,กาญจนบุรี}} === คำนาม === {{cog-noun}} # [[เสื้อ]] ghoxth4hdikrhgmbvnv5m8kwdru0jep ທ່ອງທ່ຽວ 0 36283 5720721 1550623 2026-04-21T03:57:30Z Ai Ku Karng 17824 /* ภาษาลาว */ 5720721 wikitext text/x-wiki == ภาษาลาว == === รากศัพท์ === {{com|lo|ທ່ອງ|ທ່ຽວ|t1=ท่อง|t2=เที่ยว}}; ร่วมเชื้อสายกับ{{cog|th|ท่องเที่ยว}} === การออกเสียง === {{lo-pron|ທ່ອງ-ທ່ຽວ}} === คำกริยา === {{lo-verb}} # [[ท่องเที่ยว]] opxsaderw42zcb5py7keqz8oteb2g2s มอดูล:languages/data/exceptional 828 36360 5720769 5720544 2026-04-21T07:01:21Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720769 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["aav-khs-pro"] = { "คาเซียนดั้งเดิม", 116773216, "aav-khs", "Latn", type = "reconstructed", } m["aav-nic-pro"] = { "นิโคบารีสดั้งเดิม", 116773793, "aav-nic", "Latn", type = "reconstructed", } m["aav-pkl-pro"] = { "ปนัร-คาซี-ลึงงัมดั้งเดิม", 116773259, "aav-pkl", "Latn", type = "reconstructed", } m["aav-pro"] = { -- mkh-pro will merge into this "ออสโตรเอเชียติกดั้งเดิม", 116773186, "aav", "Latn", type = "reconstructed", } m["afa-pro"] = { "แอฟโฟรเอเชียติกดั้งเดิม", 269125, "afa", "Latn", type = "reconstructed", } m["alg-aga"] = { "Agawam", nil, "alg-eas", "Latn", } m["alg-pro"] = { "แอลกองเคียนดั้งเดิม", -- silent u 7251834, "alg", "Latn", type = "reconstructed", sort_key = {remove_diacritics = "·"}, } m["alv-ama"] = { "Amasi", 4740400, "nic-grs", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron}, } m["alv-bgu"] = { "Baïnounk Gubëeher", 17002646, "alv-bny", "Latn", } m["alv-bua-pro"] = { "Proto-Bua", 116773723, "alv-bua", "Latn", type = "reconstructed", } m["alv-cng-pro"] = { "Proto-Cangin", 116773726, "alv-cng", "Latn", type = "reconstructed", } m["alv-edo-pro"] = { "Proto-Edoid", 116773206, "alv-edo", "Latn", type = "reconstructed", } m["alv-fli-pro"] = { "Proto-Fali", 116773754, "alv-fli", "Latn", type = "reconstructed", } m["alv-gbe-pro"] = { "กเบดั้งเดิม", 116773208, "alv-gbe", "Latn", type = "reconstructed", } m["alv-gng-pro"] = { "Proto-Guang", 116773757, "alv-gng", "Latn", type = "reconstructed", } m["alv-gtm-pro"] = { "Proto-Central Togo", 116773732, "alv-gtm", "Latn", type = "reconstructed", } m["alv-gwa"] = { "Gwara", 16945580, "nic-pla", "Latn", } m["alv-hei-pro"] = { "Proto-Heiban", 116773760, "alv-hei", "Latn", type = "reconstructed", } m["alv-ido-pro"] = { "Proto-Idomoid", 116773764, "alv-ido", "Latn", type = "reconstructed", } m["alv-igb-pro"] = { "Proto-Igboid", 116773765, "alv-igb", "Latn", type 
= "reconstructed", } m["alv-kwa-pro"] = { "Proto-Kwa", 116773780, "alv-kwa", "Latn", type = "reconstructed", } m["alv-mum-pro"] = { "Proto-Mumuye", 116773791, "alv-mum", "Latn", type = "reconstructed", } m["alv-nup-pro"] = { "Proto-Nupoid", 116773795, "alv-nup", "Latn", type = "reconstructed", } m["alv-pro"] = { "แอตแลนติก-คองโกดั้งเดิม", 116732838, "alv", "Latn", type = "reconstructed", } m["alv-edk-pro"] = { "Proto-Edekiri", nil, "alv-edk", "Latn", type = "reconstructed", } m["alv-yor-pro"] = { "โยรูบาดั้งเดิม", nil, "alv-yor", "Latn", type = "reconstructed", } m["alv-yrd-pro"] = { "โยรูบอยด์ดั้งเดิม", 116773824, "alv-yrd", "Latn", type = "reconstructed", } m["alv-von-pro"] = { "วอลตา-ไนเจอร์ดั้งเดิม", 116773820, "alv-von", "Latn", type = "reconstructed", } m["apa-pro"] = { "Proto-Apachean", 116773135, "apa", "Latn", type = "reconstructed", } m["aql-pro"] = { "แอลจิกดั้งเดิม", 18389588, "aql", "Latn", type = "reconstructed", sort_key = {remove_diacritics = "·"}, } m["art-adu"] = { "Adûni", 1232159, "art", "Latn", type = "appendix-constructed", } m["art-bel"] = { "Belter Creole", 108055510, "art", "Latn", type = "appendix-constructed", sort_key = { remove_diacritics = c.acute, from = {"ɒ"}, to = {"a"}, }, } m["art-blk"] = { "Bolak", 2909283, "art", "Latn", type = "appendix-constructed", } m["art-bsp"] = { "แบล็กสปีช", 686210, "art", "Latn, Teng", type = "appendix-constructed", } m["art-com"] = { "Communicationssprache", 35227, "art", "Latn", type = "appendix-constructed", } m["art-dtk"] = { "Dothraki", 2914733, "art", "Latn", type = "appendix-constructed", } m["art-elo"] = { "Eloi", nil, "art", "Latn", type = "appendix-constructed", } m["art-gld"] = { "Goa'uld", 19823, "art", "Latn, Egyp, Mero", type = "appendix-constructed", } m["art-lap"] = { "Lapine", 6488195, "art", "Latn", type = "appendix-constructed", } m["art-man"] = { "Mandalorian", 54289, "art", "Latn", type = "appendix-constructed", } m["art-mun"] = { "Mundolinco", 851355, "art", "Latn", type = 
"appendix-constructed", } m["art-nav"] = { "Naʼvi", 316939, "art", "Latn", type = "appendix-constructed", } m["art-vlh"] = { "High Valyrian", 64483808, "art", "Latn", type = "appendix-constructed", } m["ath-nic"] = { "Nicola", 20609, "ath-nor", "Latn", } m["ath-pro"] = { "Proto-Athabaskan", 104841722, "ath", "Latn", type = "reconstructed", } m["auf-pro"] = { "Proto-Arawa", 116773706, "auf", "Latn", type = "reconstructed", } m["aus-alu"] = { "Alungul", 16827670, "aus-pmn", "Latn", } m["aus-and"] = { "Andjingith", 4754509, "aus-pmn", "Latn", } m["aus-ang"] = { "Angkula", 16828520, "aus-pmn", "Latn", } m["aus-arn-pro"] = { "Proto-Arnhem", 116773720, "aus-arn", "Latn", type = "reconstructed", } m["aus-bra"] = { "Barranbinya", 4863220, "aus-pmn", "Latn", } m["aus-brm"] = { "Barunggam", 4865914, "aus-pmn", "Latn", } m["aus-cww-pro"] = { "Proto-Central New South Wales", 116773199, "aus-cww", "Latn", type = "reconstructed", } m["aus-dal-pro"] = { "Proto-Daly", 116773743, "aus-dal", "Latn", type = "reconstructed", } m["aus-guw"] = { "Guwar", 6652138, "aus-pam", "Latn", } m["aus-lsw"] = { "Little Swanport", 6652138, "qfa-unc", "Latn", } m["aus-mbi"] = { "Mbiywom", 6799701, "aus-pmn", "Latn", } m["aus-ngk"] = { "Ngkoth", 7022405, "aus-pmn", "Latn", } m["aus-nyu-pro"] = { "Proto-Nyulnyulan", 116773797, "aus-nyu", "Latn", type = "reconstructed", } m["aus-pam-pro"] = { "Proto-Pama-Nyungan", 33942, "aus-pam", "Latn", type = "reconstructed", } m["aus-tul"] = { "Tulua", 16938541, "aus-pam", "Latn", } m["aus-uwi"] = { "Uwinymil", 7903995, "aus-arn", "Latn", } m["aus-wdj-pro"] = { "Proto-Iwaidjan", 116773767, "aus-wdj", "Latn", type = "reconstructed", } m["aus-won"] = { "Wong-gie", nil, "aus-pam", "Latn", } m["aus-wul"] = { "Wulguru", 8039196, "aus-dyb", "Latn", } m["aus-ynk"] = { -- contrast nny "Yangkaal", 3913770, "aus-tnk", "Latn", } m["awd-amc-pro"] = { "Proto-Amuesha-Chamicuro", nil, "awd", "Latn", type = "reconstructed", } m["awd-kmp-pro"] = { "Proto-Kampa", nil, "awd", 
"Latn", type = "reconstructed", } m["awd-prw-pro"] = { "Proto-Paresi-Waura", nil, "awd", "Latn", type = "reconstructed", } m["awd-ama"] = { "Amarizana", 16827787, "awd", "Latn", } m["awd-ana"] = { "Anauyá", 16828252, "awd", "Latn", } m["awd-apo"] = { "Apolista", 16916645, "awd", "Latn", } m["awd-cab"] = { "Cabre", 16850160, "awd", "Latn", } m["awd-gnu"] = { "Guinau", 3504087, "awd", "Latn", } m["awd-kar"] = { "Cariay", 16920253, "awd", "Latn", } m["awd-kaw"] = { "Kawishana", 6379993, "awd-nwk", "Latn", } m["awd-kus"] = { "Kustenau", 5196293, "awd", "Latn", } m["awd-man"] = { "Manao", 6746920, "awd", "Latn", } m["awd-mar"] = { "Marawan", 6755108, "awd", "Latn", } m["awd-mpr"] = { "Maipure", 6736872, "awd", "Latn", } m["awd-mrt"] = { "Mariaté", 16910017, "awd-nwk", "Latn", } m["awd-nwk-pro"] = { "Proto-Nawiki", 116773234, "awd-nwk", "Latn", type = "reconstructed", } m["awd-pai"] = { "Paikoneka", 128807835, "awd", "Latn", } m["awd-pas"] = { "Pasé", 7143168, "awd-nwk", "Latn", } m["awd-pro"] = { "Proto-Arawak", 97573478, "awd", "Latn", type = "reconstructed", } m["awd-she"] = { "Shebayo", 7492248, "awd", "Latn", } m["awd-taa-pro"] = { "Proto-Ta-Arawak", 116773282, "awd-taa", "Latn", type = "reconstructed", } m["awd-wai"] = { "Wainumá", 16910017, "awd-nwk", "Latn", } m["awd-yum"] = { "Yumana", 8061062, "awd-nwk", "Latn", } m["azc-caz"] = { "Cazcan", 5055514, "azc", "Latn", } m["azc-cup-pro"] = { "Proto-Cupan", 116773738, "azc-cup", "Latn", type = "reconstructed", } m["azc-ktn"] = { "Kitanemuk", 3197558, "azc-tak", "Latn", } m["azc-nah-pro"] = { "นาวันดั้งเดิม", 7251860, "azc-nah", "Latn", type = "reconstructed", } m["azc-num-pro"] = { "Proto-Numic", 116773247, "azc-num", "Latn", type = "reconstructed", } m["azc-pro"] = { "ยูโต-แอซเทกันดั้งเดิม", 96400333, "azc", "Latn", type = "reconstructed", } m["azc-tak-pro"] = { "Proto-Takic", 116773283, "azc-tak", "Latn", type = "reconstructed", } m["azc-tat"] = { "Tataviam", 743736, "azc", "Latn", } m["ber-pro"] = { 
"เบอร์เบอร์ดั้งเดิม", 2855698, "ber", "Latn", type = "reconstructed", } m["ber-fog"] = { "Fogaha", 107610173, "ber", "Latn", } m["ber-zuw"] = { "Zuwara", 4117169, "ber", "Latn", } m["bnt-bal"] = { "Balong", 93935237, "bnt-bbo", "Latn", } m["bnt-bon"] = { "Boma Nkuu", nil, "bnt", "Latn", } m["bnt-boy"] = { "Boma Yumu", nil, "bnt", "Latn", } m["bnt-bwa"] = { "Bwala", 128810345, "bnt-tek", "Latn", } m["bnt-cmw"] = { "Chimwiini", 4958328, "bnt-swh", "Latn", } m["bnt-ind"] = { "Indanga", 51412803, "bnt", "Latn", } m["bnt-lal"] = { "Lala (South Africa)", 6480154, "bnt-ngu", "Latn", } m["bnt-mpi"] = { "Mpiin", 93937013, "bnt-bdz", "Latn", } m["bnt-mpu"] = { "Mpuono", -- not to be confused with Mbuun zmp 36056, "bnt", "Latn", } m["bnt-ngu-pro"] = { "งูนีดั้งเดิม", 961559, "bnt-ngu", "Latn", type = "reconstructed", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron}, } m["bnt-phu"] = { "Phuthi", 33796, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute}, } m["bnt-pro"] = { "แบนทูดั้งเดิม", 3408025, "bnt", "Latn", type = "reconstructed", sort_key = "bnt-pro-sortkey", } m["bnt-sab-pro"] = { "Proto-Sabaki", nil, -- Q2209395 is the code for the Sabaki family "bnt-sab", "Latn", type = "reconstructed", } m["bnt-sbo"] = { "South Boma", nil, "bnt", "Latn", } m["bnt-sts-pro"] = { "Proto-Sotho-Tswana", 116773278, "bnt-sts", "Latn", type = "reconstructed", } m["btk-pro"] = { "Proto-Batak", 116773191, "btk", "Latn", type = "reconstructed", } m["cau-abz-pro"] = { "Proto-Abkhaz-Abaza", 7251831, "cau-abz", "Latn", type = "reconstructed", } m["cau-and-pro"] = { "Proto-Andian", nil, "cau-and", "Latn", type = "reconstructed", } m["cau-ava-pro"] = { "Proto-Avaro-Andian", 116773187, "cau-ava", "Latn", type = "reconstructed", } m["cau-cir-pro"] = { "Proto-Circassian", 7251838, "cau-cir", "Latn", type = "reconstructed", } m["cau-drg-pro"] = { "Proto-Dargwa", 116773205, "cau-drg", "Latn", type = "reconstructed", } m["cau-lzg-pro"] = { 
"Proto-Lezghian", 116773223, "cau-lzg", "Latn", type = "reconstructed", } m["cau-nec-pro"] = { "คอเคเซียนตะวันออกเฉียงเหนือดั้งเดิม", 116773244, "cau-nec", "Latn", type = "reconstructed", } m["cau-nkh-pro"] = { "นัคดั้งเดิม", 108032840, "cau-nkh", "Latn", type = "reconstructed", } m["cau-nwc-pro"] = { "คอเคเซียนตะวันตกเฉียงเหนือดั้งเดิม", 7251861, "cau-nwc", "Latn", type = "reconstructed", } m["cau-tsz-pro"] = { "Proto-Tsezian", 116773287, "cau-tsz", "Latn", type = "reconstructed", } m["cba-ata"] = { "Atanques", 4812783, "cba", "Latn", } m["cba-cat"] = { "Catío Chibcha", 7083619, "cba", "Latn", } m["cba-dor"] = { "Dorasque", 5297532, "cba", "Latn", } m["cba-dui"] = { "Duit", 3041061, "cba", "Latn", } m["cba-hue"] = { "Huetar", 35514, "cba", "Latn", } m["cba-nut"] = { "Nutabe", 7070405, "cba", "Latn", } m["cba-pro"] = { "Proto-Chibchan", 116773203, "cba", "Latn", type = "reconstructed", } m["ccs-pro"] = { "คาร์ทเวเลียนดั้งเดิม", 2608203, "ccs", "Latn", type = "reconstructed", strip_diacritics = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["ccs-gzn-pro"] = { "จอร์เจียน-แซนดั้งเดิม", 23808119, "ccs-gzn", "Latn", type = "reconstructed", strip_diacritics = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["cdc-cbm-pro"] = { "ชาดิกตอนกลางดั้งเดิม", 116773197, "cdc-cbm", "Latn", type = "reconstructed", } m["cdc-mas-pro"] = { "Proto-Masa", 116773789, "cdc-mas", "Latn", type = "reconstructed", } m["cdc-pro"] = { "ชาดิกดั้งเดิม", 116773201, "cdc", "Latn", type = "reconstructed", } m["cdd-pro"] = { "Proto-Caddoan", 116773725, "cdd", "Latn", type = "reconstructed", } m["cel-bry-pro"] = { "บริทอนิกดั้งเดิม", 1248800, "cel-bry", "Latn, Polyt", sort_key = { Latn = "cel-bry-pro-sortkey", }, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cel-gal"] = { "Gallaecian", 3094789, "cel-his", } m["cel-gau"] = { "กอล", 29977, "cel", "Latn, Polyt, Ital", strip_diacritics = { Latn = {remove_diacritics = 
c.macron .. c.breve .. c.diaer}, }, sort_key = { Latn = "cel-bry-pro-sortkey", }, -- Ital translit in [[Module:scripts/data]] -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cel-pro"] = { "เคลติกดั้งเดิม", 653649, "cel", "Latn", type = "reconstructed", sort_key = "cel-pro-sortkey", } m["chi-pro"] = { "Proto-Chimakuan", 116773734, "chi", "Latn", type = "reconstructed", } m["chm-pro"] = { "Proto-Mari", 116773788, "chm", "Latn", type = "reconstructed", } m["cmc-pro"] = { "จามิกดั้งเดิม", 114793834, "cmc", "Latn", type = "reconstructed", } m["crp-bip"] = { "Basque-Icelandic Pidgin", 810378, "crp", "Latn", ancestors = "eu", } m["crp-gep"] = { "West Greenlandic Pidgin", 17036301, "crp", "Latn", ancestors = "kl", } m["crp-kia"] = { "Kiautschou German Pidgin", 108314615, "crp", "Latn", ancestors = "de", } m["crp-mar"] = { "Maroon Spirit Language", 1093206, "crp", "Latn", ancestors = "en", } m["crp-mpp"] = { "Macau Pidgin Portuguese", 128804537, "crp", "Hant, Latn", ancestors = "pt", sort_key = {Hant = "Hani-sortkey"}, } m["crp-rsn"] = { "Russenorsk", 505125, "crp", "Cyrl, Latn", ancestors = "nn, ru", translit = {Cyrl = "ru-translit"}, } m["crp-spp"] = { "Samoan Plantation Pidgin", 7409948, "crp", "Latn", ancestors = "en", } m["crp-slb"] = { "Solombala English", 7558525, "crp", "Cyrl, Latn", ancestors = "en, ru", translit = {Cyrl = "ru-translit"}, } m["crp-tpr"] = { "Taimyr Pidgin Russian", 16930506, "crp", "Cyrl", ancestors = "ru", translit = "ru-translit", } m["csu-bba-pro"] = { "Proto-Bongo-Bagirmi", 116773722, "csu-bba", "Latn", type = "reconstructed", } m["csu-maa-pro"] = { "Proto-Mangbetu", 116773786, "csu-maa", "Latn", type = "reconstructed", } m["csu-pro"] = { "Proto-Central Sudanic", 116773730, "csu", "Latn", type = "reconstructed", } m["csu-sar-pro"] = { "Proto-Sara", 116773809, "csu-sar", "Latn", type = "reconstructed", } m["cus-ash"] = { "Ashraaf", 4805855, "cus-som", "Latn", } m["cus-hec-pro"] = { "Proto-Highland East 
Cushitic", 116773761, "cus-hec", "Latn", type = "reconstructed", } m["cus-som-pro"] = { "โซมาลอยด์ดั้งเดิม", nil, "cus-som", "Latn", type = "reconstructed", } m["cus-sou-pro"] = { "Proto-South Cushitic", 126081567, "cus-sou", "Latn", type = "reconstructed", } m["cus-pro"] = { "Proto-Cushitic", 116773204, "cus", "Latn", type = "reconstructed", } m["dmn-dam"] = { "Dama (Sierra Leone)", 19601574, "dmn", "Latn", } m["dra-bry"] = { "Beary", 1089116, "qfa-mix", "Mlym, Knda", ancestors = "ml, tcy", -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["dra-cen-pro"] = { "ดราวิเดียนตอนกลางดั้งเดิม", nil, "dra-cen", "Latn", type = "reconstructed", } m["dra-mkn"] = { "Middle Kannada", 128810572, "dra-kan", "Knda", -- Knda translit in [[Module:scripts/data]] } m["dra-nor-pro"] = { "ดราวิเดียนเหนือดั้งเดิม", 124433593, "dra-nor", "Latn", type = "reconstructed", } m["dra-okn"] = { "Old Kannada", 15723156, "dra-kan", "Knda", -- Knda translit in [[Module:scripts/data]] } m["dra-ote"] = { "Old Telugu", 126720868, "dra-tel", "Telu", translit = "Telu-translit", } m["dra-pro"] = { "ดราวิเดียนดั้งเดิม", 1702853, "dra", "Latn", type = "reconstructed", } m["dra-sdo-pro"] = { "ดราวิเดียนใต้ที่หนึ่งดั้งเดิม", 104847952, -- Wikipedia's "Proto-South Dravidian" is Proto-South Dravidian I in this scheme. 
"dra-sdo", "Latn", type = "reconstructed", } m["dra-sdt-pro"] = { "ดราวิเดียนใต้ที่สองดั้งเดิม", 128885257, "dra-sdt", "Latn", type = "reconstructed", } m["dra-sou-pro"] = { "ดราวิเดียนใต้ดั้งเดิม", 128886121, "dra-sou", "Latn", type = "reconstructed", } m["egx-dem"] = { "Demotic Egyptian", 36765, "egx", "Latn, Egyd, Polyt", sort_key = { Latn = { remove_diacritics = "'%-%s", from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"}, to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]} }, }, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["dmn-pro"] = { "Proto-Mande", 116773785, "dmn", "Latn", type = "reconstructed", } m["dmn-mdw-pro"] = { "Proto-Western Mande", 116773822, "dmn-mdw", "Latn", type = "reconstructed", } m["dru-pro"] = { "Proto-Rukai", 116773807, "map", "Latn", type = "reconstructed", } m["ero-gsz"] = { "Geshiza", nil, "ero", "Latn", } m["ero-nya"] = { "Nyagrong Minyag", nil, "ero", "Latn", } m["ero-tau"] = { "Stau", nil, "ero", "Latn", } m["esx-esk-pro"] = { "เอสกิโมดั้งเดิม", 7251842, "esx-esk", "Latn", type = "reconstructed", } m["esx-ink"] = { "Inuktun", 1671647, "esx-inu", "Latn", } m["esx-inq"] = { "Inuinnaqtun", 28070, "esx-inu", "Latn", } m["esx-inu-pro"] = { "อินุอิตดั้งเดิม", 60785588, "esx-inu", "Latn", type = "reconstructed", } m["esx-pro"] = { "Proto-Eskimo-Aleut", 7251843, "esx", "Latn", type = "reconstructed", } m["esx-tut"] = { "Tunumiisut", 15665389, "esx-inu", "Latn", } m["euq-pro"] = { "บาสก์ดั้งเดิม", 938011, "euq", "Latn", type = "reconstructed", } m["gba-pro"] = { "Proto-Gbaya", nil, "gba", "Latn", type = "reconstructed", } m["gem-pro"] = { "เจอร์แมนิกดั้งเดิม", 669623, "gem", "Latn", type = "reconstructed", sort_key = "gem-pro-sortkey", } 
m["gme-bur"] = { "Burgundian", 47625, "gme", "Latn", } m["gme-cgo"] = { "กอทแบบไครเมีย", 36211, "gme", "Latn", } m["gmq-gut"] = { "Gutnish", 1256646, "gmq", "Latn", ancestors = "gmq-ogt", } m["gmq-jmk"] = { "Jamtish", 35512, "gmq-eas", "Latn", } m["gmq-mno"] = { "นอร์เวย์กลาง", 3417070, "gmq-wes", "Latn", } m["gmq-oda"] = { "เดนมาร์กเก่า", 12330003, "gmq-eas", "Latn, Runr", strip_diacritics = {remove_diacritics = c.macron}, } m["gmq-ogt"] = { "Old Gutnish", 1133488, "gmq", "Latn, Runr", ancestors = "non", } m["gmq-osw"] = { "สวีเดนเก่า", 2417210, "gmq-eas", "Latn, Runr", strip_diacritics = {remove_diacritics = c.macron}, } m["gmq-pro"] = { "นอร์สดั้งเดิม", 1671294, "gmq", "Runr", translit = "Runr-translit", } m["gmq-scy"] = { "Scanian", 768017, "gmq-eas", "Latn", } m["gmw-bgh"] = { "Bergish", 329030, "gmw-frk", "Latn", } m["gmw-cfr"] = { "ภาษาฟรังโกเนียตอนกลาง", 572197, "gmw-hgm", "Latn", ancestors = "gmh", wikimedia_codes = "ksh", } m["gmw-ecg"] = { "เยอรมันตอนกลางตะวันออก", 499344, -- subsumes Q699284, Q152965 "gmw-hgm", "Latn", ancestors = "gmh", } m["gmw-fin"] = { "Fingallian", 3072588, "gmw-ian", "Latn", } m["gmw-gts"] = { "Gottscheerish", 533109, "gmw-hgm", "Latn", ancestors = "bar", } m["gmw-jdt"] = { "Jersey Dutch", 1687911, "gmw-frk", "Latn", ancestors = "nl", } m["gmw-msc"] = { "Middle Scots", 3327000, "gmw-ang", "Latn", ancestors = "enm-esc", } m["gmw-pro"] = { "เจอร์แมนิกตะวันตกดั้งเดิม", 78079021, "gmw", "Latn, Runr", -- type = "reconstructed", -- largely but not entirely reconstructed (like Proto-Norse); see April '24 BP, set back to reconstructed (?) 
if 'anti-asterisk' is added sort_key = "gmw-pro-sortkey", } m["gmw-rfr"] = { "ฟรังโกเนียแบบไรน์", 707007, "gmw-hgm", "Latn", ancestors = "gmh", } m["gmw-stm"] = { "Sathmar Swabian", 2223059, "gmw-hgm", "Latn", ancestors = "swg", } m["gmw-tsx"] = { "Transylvanian Saxon", 260942, "gmw-hgm", "Latn", ancestors = "gmw-cfr", } m["gmw-vog"] = { "เยอรมันแบบว็อลกา", 312574, "gmw-hgm", "Latn", ancestors = "gmw-rfr", } m["gmw-zps"] = { "เยอรมันแบบซิพเซอร์", 205548, "gmw-hgm", "Latn", ancestors = "gmh", } m["gn-cls"] = { "กัวรานีคลาสสิก", 17478065, "gn", "Latn", } m["grk-cal"] = { "Calabrian Greek", 1146398, "grk", "Latn, Grek", ancestors = "grk-ita", translit = { Grek = "el-translit", }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-ita"] = { "Italiot Greek", 19720507, "grk", "Latn, Grek", ancestors = "gkm", translit = { Grek = "el-translit", }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-mar"] = { "Mariupol Greek", 4400023, "grk", "Cyrl, Latn, Grek", ancestors = "gkm", translit = { Cyrl = "grk-mar-translit", Grek = "grk-mar-translit", }, override_translit = true, strip_diacritics = { Cyrl = {remove_diacritics = c.acute}, }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-pro"] = { "เฮลเลนิกดั้งเดิม", 1231805, "grk", "Latn, Polyt", type = "reconstructed", sort_key = {Latn = { from = {"ʰ", "ʷ"}, to = {"h", "w"}, remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. 
c.caron }}, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: formerly no translit specified for Polyt; presumably an accidental omission; if not, set Polyt = false in -- the translit section } m["hmn-pro"] = { "ม้งดั้งเดิม", 116773210, "hmn", "Latn", type = "reconstructed", } m["hmx-mie-pro"] = { "เมี่ยนดั้งเดิม", 116773229, "hmx-mie", "Latn", type = "reconstructed", } m["hmx-pro"] = { "ม้ง-เมี่ยนดั้งเดิม", 7251846, "hmx", "Latn", type = "reconstructed", } m["hyx-pro"] = { "อาร์มีเนียนดั้งเดิม", 3848498, "hyx", "Latn", type = "reconstructed", } m["iir-nur-pro"] = { "Proto-Nuristani", 116773248, "iir-nur", "Latn", type = "reconstructed", } m["iir-pro"] = { "อินโด-อิเรเนียนดั้งเดิม", 966439, "iir", "Latn", type = "reconstructed", } m["ijo-pro"] = { "Proto-Ijoid", 116773766, "ijo", "Latn", type = "reconstructed", } m["inc-apa"] = { "Apabhramsa", 616419, "inc-mid", "Deva, Shrd, Sidd", ancestors = "pra", translit = { Deva = "Deva-translit", -- Shrd translit in [[Module:scripts/data]] -- Sidd translit in [[Module:scripts/data]] }, } m["inc-ash"] = { "ปรากฤตแบบอโศก", 104854379, "inc-mid", "Brah, Khar", ancestors = "sa", translit = { -- Brah translit in [[Module:scripts/data]] Khar = "Khar-translit", }, } m["inc-dng-pro"] = { "Proto-Dangari", nil, "inc-dng", "Latn", type = "reconstructed", } m["inc-kam"] = { "กามรูป", 6356097, "inc-bas", "Brah, Sidd", -- Brah, Sidd translit in [[Module:scripts/data]] } m["inc-kho"] = { "Kholosi", 24952008, "inc-snd", "Latn", } m["inc-krd-pro"] = { "Proto-Kamta", 128816843, "inc-bas", "Latn", ancestors = "inc-kam", type = "reconstructed", } m["inc-mas"] = { "อัสสัมกลาง", 128806836, "inc-bas", "as-Beng", ancestors = "inc-oas", translit = "Beng-translit", } m["inc-mbn"] = { "เบงกอลกลาง", 113559927, "inc-bas", "Beng", ancestors = "inc-obn", translit = "Beng-translit", } m["inc-mgu"] = { "คุชราตกลาง", 24907429, "inc-wes", "Deva", ancestors = "inc-ogu", translit = { Deva = "Deva-translit", }, } 
m["inc-mor"] = { "โอริยากลาง", 128810882, "inc-eas", "Orya", ancestors = "inc-oor", } m["inc-oas"] = { "อัสสัมช่วงต้น", 85758237, "inc-bas", "as-Beng", ancestors = "inc-kam", translit = "Beng-translit", } m["inc-oaw"] = { "Old Awadhi", nil, "inc-hie", "Deva, Kthi, ur-Arab", strip_diacritics = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", ["ur-Arab"] = "inc-ohi-translit", }, } m["inc-obn"] = { "เบงกอลเก่า", 113559926, "inc-bas", "Beng", translit = "Beng-translit", } m["inc-ogu"] = { "คุชราตเก่า", 24907427, "inc-wes", "Deva", translit = "Deva-translit", } m["inc-ohi"] = { "ฮินดีเก่า", 48767781, "inc-hiw", "Deva, ur-Arab", strip_diacritics = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, translit = { Deva = "Deva-translit", ["ur-Arab"] = "ur-translit", }, } m["inc-oor"] = { "โอริยาเก่า", 128807801, "inc-eas", "Orya", } m["inc-opa"] = { "ปัญจาบเก่า", 115270971, "inc-pan", "Guru, pa-Arab", translit = { Guru = "Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. 
c.sukun}, } m["inc-pro"] = { "อินโด-อารยันดั้งเดิม", 23808344, "inc", "Latn", type = "reconstructed", } m["ine-ana-pro"] = { "อานาโตเลียนดั้งเดิม", 7251833, "ine-ana", "Latn", type = "reconstructed", } m["ine-bsl-pro"] = { "บอลโต-สลาวิกดั้งเดิม", 1703347, "ine-bsl", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"}, to = {"a", "e", "i", "o", "u"} }, } m["ine-kal"] = { "Kalašma", 122770439, "ine-ana", "Xsux", } m["ine-pae"] = { "Paeonian", 2705672, "ine", "Polyt", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ine-pro"] = { "อินโด-ยูโรเปียนดั้งเดิม", 37178, "ine", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron}, to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"} }, } m["ine-toc-pro"] = { "โทแคเรียนดั้งเดิม", 104841462, "ine-toc", "Latn", type = "reconstructed", } m["xme-old"] = { "Old Median", 36461, "xme", "Polyt, Latn", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["xme-mid"] = { "Middle Median", 12836150, "xme", "Latn", } m["xme-ker"] = { "Kermanic", 129850, "xme", "fa-Arab, Latn, Hebr", ancestors = "xme-mid", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["xme-taf"] = { "Tafreshi", nil, "xme", "fa-Arab, Latn", ancestors = "xme-mid", } m["xme-ttc-pro"] = { "Proto-Tatic", 122973870, "xme-ttc", "Latn", ancestors = "xme-mid", } m["xme-kls"] = { "Kalasuri", nil, "xme-ttc", ancestors = "xme-ttc-nor", } m["xme-klt"] = { "Kilit", 3612452, "xme-ttc", "Cyrl", -- and fa-Arab? 
} m["xme-ott"] = { "Old Tati", 434697, "xme-ttc", "fa-Arab, Latn", } m["ira-kms-pro"] = { "Proto-Komisenian", 116773777, "ira-kms", "Latn", type = "reconstructed", } m["ira-mpr-pro"] = { "Proto-Medo-Parthian", 116773227, "ira-mpr", "Latn", type = "reconstructed", } m["ira-pat-pro"] = { "ปาทานดั้งเดิม", 116773255, "ira-pat", "Latn", type = "reconstructed", } m["ira-pro"] = { "อิเรเนียนดั้งเดิม", 4167865, "ira", "Latn", type = "reconstructed", } m["ira-zgr-pro"] = { "Proto-Zaza-Gorani", 116775031, "ira-zgr", "Latn", type = "reconstructed", } m["xsc-pro"] = { "Proto-Scythian", 116773273, "xsc", "Latn", type = "reconstructed", } m["xsc-sar-pro"] = { "Proto-Sarmatian", 116773249, "xsc-sar", "Latn", type = "reconstructed", } m["xsc-skw-pro"] = { "Proto-Saka-Wakhi", 116773267, "xsc-skw", "Latn", type = "reconstructed", } m["xsc-sak-pro"] = { "Proto-Saka", 116773264, "xsc-sak", "Latn", type = "reconstructed", } m["ira-sym-pro"] = { "Proto-Shughni-Yazghulami-Munji", 116773813, "ira-sym", "Latn", type = "reconstructed", } m["ira-sgi-pro"] = { "Proto-Sanglechi-Ishkashimi", 116773808, "ira-sgi", "Latn", type = "reconstructed", } m["ira-mny-pro"] = { "Proto-Munji-Yidgha", 116773792, "ira-mny", "Latn", type = "reconstructed", } m["ira-shy-pro"] = { "Proto-Shughni-Yazghulami", 116773812, "ira-shy", "Latn", type = "reconstructed", } m["ira-shr-pro"] = { "Proto-Shughni-Roshani", 116773811, "ira-shr", "Latn", type = "reconstructed", } m["ira-sgc-pro"] = { "ซอกดิกดั้งเดิม", 116773276, "ira-sgc", "Latn", type = "reconstructed", } m["ira-wnj"] = { "Vanji", 3398419, "ira-shy", "Latn", } m["iro-ere"] = { "Erie", 5388365, "iro-nor", "Latn", } m["iro-min"] = { "Mingo", 128531, "iro-nor", "Latn", ietf_subtag = "i-mingo", -- grandfathered IETF tag } m["iro-nor-pro"] = { "Proto-North Iroquoian", 116773242, "iro-nor", "Latn", type = "reconstructed", } m["iro-pro"] = { "Proto-Iroquoian", 7251852, "iro", "Latn", type = "reconstructed", } m["itc-pro"] = { "อิตาลิกดั้งเดิม", 17102720, "itc", 
"Latn", type = "reconstructed", } m["itc-psa"] = { "Pre-Samnite", 7239186, "itc-sbl", "Ital, Polyt, Latn", -- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission) -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["jpx-hcj"] = { "Hachijō", 5637049, "jpx", "Jpan", ancestors = "ojp-eas", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["jpx-pro"] = { "แจพอนิกดั้งเดิม", 3924309, "jpx", "Latn", type = "reconstructed", } m["jpx-ryu-pro"] = { "Proto-Ryukyuan", 56349069, "jpx-ryu", "Latn", type = "reconstructed", } m["kar-pro"] = { "กะเหรี่ยงดั้งเดิม", 85794783, "kar", "Latn", type = "reconstructed", } m["kca-eas"] = { "Eastern Khanty", 30304622, "kca", "Cyrl", translit = "kca-translit", override_translit = true, -- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side) sort_key = { Cyrl = { from = {"ᲊ"}, to = {"Ᲊ"} } }, } m["kca-nor"] = { "Northern Khanty", 30304527, "kca", "Cyrl", translit = "kca-translit", override_translit = true, -- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side) sort_key = { Cyrl = { from = {"ᲊ"}, to = {"Ᲊ"} } }, } m["kca-pro"] = { "Proto-Khanty", 127505171, "kca", "Latn", type = "reconstructed", } m["kca-sou"] = { "Southern Khanty", 30304618, "kca", "Cyrl", translit = "kca-translit", override_translit = true, } m["khi-kho-pro"] = { "Proto-Khoe", 116773218, "khi-kho", "Latn", type = "reconstructed", } m["khi-kun"] = { "ǃKung", 32904, "khi-kxa", "Latn", } m["ko-ear"] = { "เกาหลีใหม่ช่วงต้น", 756014, "qfa-kor", "Kore", ancestors = "okm", translit = "okm-translit", -- Kore strip_diacritics in [[Module:scripts/data]] } m["kro-pro"] = { "Proto-Kru", 116773778, "kro", "Latn", type = "reconstructed", } m["ku-pro"] = { "เคอร์ดิชดั้งเดิม", 116773221, "ku", "Latn", type = 
"reconstructed", } m["map-ata-pro"] = { "Proto-Atayalic", 116773151, "map-ata", "Latn", type = "reconstructed", } m["map-bms"] = { "Banyumasan", 33219, "map", "Latn, Java", } m["map-pro"] = { "ออสโตรนีเซียนดั้งเดิม", 49230, "map", "Latn", type = "reconstructed", } m["mis-hkl"] = { "Kelantan Peranakan Hokkien", 108794818, "qfa-mix", ancestors = "nan-hbl, sou, mfa", } m["mis-idn"] = { "Idiom Neutral", 35847, "art", "Latn", type = "appendix-constructed", } m["mis-isa"] = { "Isaurian", 16956868, nil, -- "Xsux, Hluw, Latn", } m["mis-jie"] = { "Jie", 124424186, nil, "Hani", sort_key = "Hani-sortkey", } m["mis-jzh"] = { "Jizhao", 45242758, "qfa-bej", "Latn", } m["mis-kas"] = { "Kassite", 35612, nil, "Xsux", } m["mis-mmd"] = { "Mimi of Decorse", 6862206, nil, "Latn", } m["mis-mmn"] = { "Mimi of Nachtigal", 6862207, nil, "Latn", } m["mis-phi"] = { "Philistine", 2230924, nil, "Phnx", -- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["mis-rou"] = { "Rouran", 48816637, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tdl"] = { "Turdulian", 133176492, } m["mis-tdt"] = { "Turdetanian", 133176461, } m["mis-tnw"] = { "Tangwang", 7683179, "qfa-mix", "Latn", ancestors = "cmn, sce", } m["mis-tuh"] = { "Tuyuhun", 48816625, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tuo"] = { "Tuoba", 48816629, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-wuh"] = { "Wuhuan", 118976867, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-xbi"] = { "Xianbei", 4448647, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-xnu"] = { "Xiongnu", 10901674, nil, "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mjg-mgl"] = { "Mongghul", 53765528, "mjg", "Latn", -- also Mong, Cyrl ? } m["mjg-mgr"] = { "Mangghuer", 56285392, "mjg", "Latn", -- also Mong, Cyrl ? 
} m["mkh-asl-pro"] = { "Proto-Aslian", 55630680, "mkh-asl", "Latn", type = "reconstructed", } m["mkh-ban-pro"] = { "Proto-Bahnaric", 116773189, "mkh-ban", "Latn", type = "reconstructed", } m["mkh-kat-pro"] = { "Proto-Katuic", 116773772, "mkh-kat", "Latn", type = "reconstructed", } m["mkh-khm-pro"] = { "ขมุอิกดั้งเดิม", 116773774, "mkh-khm", "Latn", type = "reconstructed", } m["mkh-kmr-pro"] = { "เขมรดั้งเดิม", 55630684, "mkh-kmr", "Latn", type = "reconstructed", } m["mkh-mmn"] = { "มอญกลาง", 121337926, "mkh-mnc", "Latn, Mymr", --and also Pallava ancestors = "omx", } m["mkh-mnc-pro"] = { "มอญดั้งเดิม", 116773231, "mkh-mnc", "Latn", type = "reconstructed", } m["mkh-mvi"] = { "เวียดนามกลาง", 9199, "mkh-vie", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mkh-pal-pro"] = { "ปะหล่องดั้งเดิม", 104847372, "mkh-pal", "Latn", type = "reconstructed", } m["mkh-pea-pro"] = { "Proto-Pearic", 116773804, "mkh-pea", "Latn", type = "reconstructed", } m["mkh-pkn-pro"] = { "Proto-Pakanic", 116773803, "mkh-pkn", "Latn", type = "reconstructed", } m["mkh-pro"] = { --This will be merged into 2015 aav-pro. "มอญ-เขมรดั้งเดิม", 7251859, "mkh", "Latn", type = "reconstructed", } m["mnw-tha"] = { -- To be removed. 
"มอญแบบไทย", nil, "mkh-mnc", "Mymr, Thai", ancestors = "mkh-mmn", sort_key = { from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"}, to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"} }, } m["mkh-vie-pro"] = { "เวียตติกดั้งเดิม", 109432616, "mkh-vie", "Latn", type = "reconstructed", } m["mns-cen"] = { "Central Mansi", 128810384, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-nor"] = { "Northern Mansi", 30304537, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-pro"] = { "Proto-Mansi", 128883093, "mns", "Latn", type = "reconstructed", } m["mns-sou"] = { "Southern Mansi", 30304629, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mun-pro"] = { "มุนดาดั้งเดิม", 105102373, "mun", "Latn", type = "reconstructed", } m["myn-chl"] = { -- the stage after ''emy'' "Ch'olti'", 873995, "myn", "Latn", } m["myn-pro"] = { "มายันดั้งเดิม", 3321532, "myn", "Latn", type = "reconstructed", } m["nai-ala"] = { "Alazapa", 128810233, nil, "Latn", } m["nai-bay"] = { "Bayogoula", 1563704, nil, "Latn", } m["nai-cal"] = { "Calusa", 51782, nil, "Latn", } m["nai-chi"] = { "Chiquimulilla", 25339627, "nai-xin", "Latn", } m["nai-chu-pro"] = { "Proto-Chumash", 116773736, "nai-chu", "Latn", type = "reconstructed", } m["nai-cig"] = { "Ciguayo", 20741700, nil, "Latn", } m["nai-ckn-pro"] = { "Proto-Chinookan", 116773735, "nai-ckn", "Latn", type = "reconstructed", } m["nai-guz"] = { "Guazacapán", 19572028, "nai-xin", "Latn", } m["nai-hit"] = { "Hitchiti", 1542882, "nai-mus", "Latn", } m["nai-ipa"] = { "Ipai", 3027474, "nai-yuc", "Latn", } m["nai-jtp"] = { "Jutiapa", nil, "nai-xin", "Latn", } m["nai-jum"] = { "Jumaytepeque", 25339626, "nai-xin", "Latn", } m["nai-kat"] = { "Kathlamet", 6376639, "nai-ckn", "Latn", } m["nai-klp-pro"] = { "Proto-Kalapuyan", 116773771, "nai-klp", "Latn", type = "reconstructed", } m["nai-knm"] = { "Konomihu", 3198734, "nai-shs", 
"Latn", } m["nai-kum"] = { "Kumeyaay", 4910139, "nai-yuc", "Latn", } m["nai-mac"] = { "Macoris", 21070851, nil, "Latn", } m["nai-mdu-pro"] = { "Proto-Maidun", 116773784, "nai-mdu", "Latn", type = "reconstructed", } m["nai-miz-pro"] = { "Proto-Mixe-Zoque", 7251858, "nai-miz", "Latn", type = "reconstructed", } m["nai-mus-pro"] = { "Proto-Muskogean", 116775368, "nai-mus", "Latn", type = "reconstructed", } m["nai-nao"] = { "Naolan", 6964594, nil, "Latn", } m["nai-nrs"] = { "New River Shasta", 7011254, "nai-shs", "Latn", } m["nai-okw"] = { "Okwanuchu", 3350126, "nai-shs", "Latn", } m["nai-per"] = { "Pericú", 3375369, nil, "Latn", } m["nai-pic"] = { "Picuris", 7191257, "nai-kta", "Latn", } m["nai-plp-pro"] = { "Proto-Plateau Penutian", 116773806, "nai-plp", "Latn", type = "reconstructed", } m["nai-pom-pro"] = { "Proto-Pomo", 116773262, "nai-pom", "Latn", type = "reconstructed", } m["nai-qng"] = { "Quinigua", 36360, nil, "Latn", } m["nai-sca-pro"] = { -- NB 'sio-pro' "Proto-Siouan" which is Proto-Western Siouan "Proto-Siouan-Catawban", 116773275, "nai-sca", "Latn", type = "reconstructed", } m["nai-sin"] = { "Sinacantán", 24190249, "nai-xin", "Latn", } m["nai-sln"] = { "Salvadoran Lenca", 3229434, "nai-len", "Latn", } m["nai-spt"] = { "Sahaptin", 3833015, "nai-shp", "Latn", } m["nai-tap"] = { "Tapachultec", 7684401, "nai-miz", "Latn", } m["nai-taw"] = { "Tawasa", 7689233, nil, "Latn", } m["nai-teq"] = { "Tequistlatec", 2964454, "nai-tqn", "Latn", } m["nai-tip"] = { "Tipai", 3027471, "nai-yuc", "Latn", } m["nai-tot-pro"] = { "Proto-Totozoquean", 116773285, "nai-tot", "Latn", type = "reconstructed", } m["nai-tsi-pro"] = { "Proto-Tsimshianic", nil, "nai-tsi", "Latn", type = "reconstructed", } m["nai-utn-pro"] = { "Proto-Utian", 116773290, "nai-utn", "Latn", type = "reconstructed", } m["nai-wai"] = { "Waikuri", 3118702, nil, "Latn", } m["nai-wji"] = { "Western Jicaque", 3178610, "nai-jcq", "Latn", } m["nai-yup"] = { "Yupiltepeque", 25339628, "nai-xin", "Latn", } m["nan-dat"] = 
{ "Datian Min", 19855572, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-hbl"] = { "ฮกเกี้ยน", 1624231, "zhx-nan", "Hants, Latn, Bopo, Kana", wikimedia_codes = "zh-min-nan", generate_forms = "zh-generateforms", sort_key = { Hani = "Hani-sortkey", Kana = "Kana-sortkey" }, } m["nan-hlh"] = { "Hailufeng Min", 120755728, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-lnx"] = { "Longyan Min", 6674568, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-tws"] = { "แต้จิ๋ว", 36759, "zhx-nan", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["nan-zhe"] = { "Zhenan Min", 3846710, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-zsh"] = { "Sanxiang Min", 7420769, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nds-de"] = { "เยอรมันต่ำแบบเยอรมนี", 25433, "gmw-lgm", "Latn", ancestors = "nds", ietf_subtag = "nds-DE", -- should we make this the actual code? wikimedia_codes = "nds", } m["nds-nl"] = { "Dutch Low Saxon", 516137, "gmw-lgm", "Latn", ancestors = "nds", ietf_subtag = "nds-NL", -- should we make this the actual code? 
wikimedia_codes = "nds-nl", } m["ngf-pro"] = { "Proto-Trans-New Guinea", 85794785, "ngf", "Latn", type = "reconstructed", } m["nic-bco-pro"] = { "เบนูเอ-คองโกดั้งเดิม", 116773194, "nic-bco", "Latn", type = "reconstructed", } m["nic-bod-pro"] = { "แบนทอยด์ดั้งเดิม", 116773190, "nic-bod", "Latn", type = "reconstructed", } m["nic-eov-pro"] = { "Proto-Eastern Oti-Volta", 116773753, "nic-eov", "Latn", type = "reconstructed", } m["nic-gns-pro"] = { "Proto-Gurunsi", 116773759, "nic-gns", "Latn", type = "reconstructed", } m["nic-grf-pro"] = { "Proto-Grassfields", 116773755, "nic-grf", "Latn", type = "reconstructed", } m["nic-gur-pro"] = { "กูร์ดั้งเดิม", 116773758, "nic-gur", "Latn", type = "reconstructed", } m["nic-jkn-pro"] = { "Proto-Jukunoid", 116773769, "nic-jkn", "Latn", type = "reconstructed", } m["nic-lcr-pro"] = { "Proto-Lower Cross River", 116773782, "nic-lcr", "Latn", type = "reconstructed", } m["nic-ogo-pro"] = { "Proto-Ogoni", 116773799, "nic-ogo", "Latn", type = "reconstructed", } m["nic-ovo-pro"] = { "Proto-Oti-Volta", 116773802, "nic-ovo", "Latn", type = "reconstructed", } m["nic-plt-pro"] = { "Proto-Plateau", 116773805, "nic-plt", "Latn", type = "reconstructed", } m["nic-pro"] = { "ไนเจอร์-คองโกดั้งเดิม", 108000748, "nic", "Latn", type = "reconstructed", } m["nic-ubg-pro"] = { "Proto-Ubangian", 116773818, "nic-ubg", "Latn", type = "reconstructed", } m["nic-ucr-pro"] = { "Proto-Upper Cross River", 116773819, "nic-ucr", "Latn", type = "reconstructed", } m["nic-vco-pro"] = { "วอลตา-คองโกดั้งเดิม", 116773293, "nic-vco", "Latn", type = "reconstructed", } m["nub-har"] = { "Haraza", 19572059, "nub", "Arab, Latn", } m["nub-pro"] = { "นูเบียนดั้งเดิม", 116773246, "nub", "Latn", type = "reconstructed", } m["omq-cha-pro"] = { "Proto-Chatino", 116773202, "omq-cha", "Latn", type = "reconstructed", } m["omq-maz-pro"] = { "Proto-Mazatec", 116773790, "omq-maz", "Latn", type = "reconstructed", } m["omq-mix-pro"] = { "Proto-Mixtecan", 21573423, "omq-mix", "Latn", type = 
"reconstructed", } m["omq-mxt-pro"] = { "Proto-Mixtec", 21573424, "omq-mxt", "Latn", type = "reconstructed", } m["omq-otp-pro"] = { "Proto-Oto-Pamean", 116773251, "omq-otp", "Latn", type = "reconstructed", } m["omq-pro"] = { "Proto-Oto-Manguean", 33669, "omq", "Latn", type = "reconstructed", } m["omq-sjq"] = { "San Juan Quiahije Chatino", 17003130, "omq-cha", "Latn", } m["omq-tel"] = { "Teposcolula Mixtec", nil, "omq-mxt", "Latn", } m["omq-teo"] = { "Teojomulco Chatino", 25340451, "omq-cha", "Latn", } m["omq-tri-pro"] = { "Proto-Triqui", 116773817, "omq-tri", "Latn", type = "reconstructed", } m["omq-zap-pro"] = { "Proto-Zapotecan", 116773297, "omq-zap", "Latn", type = "reconstructed", } m["omq-zpc-pro"] = { "Proto-Zapotec", 116773296, "omq-zpc", "Latn", type = "reconstructed", } m["omv-aro-pro"] = { "Proto-Aroid", 116773721, "omv-aro", "Latn", type = "reconstructed", } m["omv-diz-pro"] = { "Proto-Dizoid", 116773750, "omv-diz", "Latn", type = "reconstructed", } m["omv-pro"] = { "Proto-Omotic", 116773800, "omv", "Latn", type = "reconstructed", } m["oto-otm-pro"] = { "Proto-Otomi", 5908710, "oto-otm", "Latn", type = "reconstructed", } m["oto-pro"] = { "Proto-Otomian", 116773252, "oto", "Latn", type = "reconstructed", } m["paa-bin-pro"] = { "Proto-Binanderean", 137881672, "paa-bin", "Latn", type = "reconstructed", } m["paa-kom"] = { "Kómnzo", 18344310, "paa-yam", "Latn", } m["paa-kwn"] = { "Kuwani", 6449056, "qfa-unc", -- poorly attested, possibly the same as or related to Kalabra "Latn", } m["paa-nha-pro"] = { "Proto-North Halmahera", 116773241, "paa-nha", "Latn", type = "reconstructed" } m["paa-nun"] = { "Nungon", 128807788, "ngf-fin", "Latn", } m["phi-din"] = { "Dinapigue Agta", 16945774, "phi", "Latn", } m["phi-kal-pro"] = { "คาลาเมียนดั้งเดิม", 116773213, "phi-kal", "Latn", type = "reconstructed", } m["phi-nag"] = { "Nagtipunan Agta", 16966111, "phi", "Latn", } m["phi-pro"] = { "ฟิลิปปินส์ดั้งเดิม", 18204898, "phi", "Latn", type = "reconstructed", } m["poz-abi"] = 
{ "Abai", 19570729, "poz-san", "Latn", } m["poz-bal"] = { "Baliledo", 4850912, "poz", "Latn", } m["poz-btk-pro"] = { "Proto-Bungku-Tolaki", 116773724, "poz-btk", "Latn", type = "reconstructed", } m["poz-cet-pro"] = { "มาลาโย-พอลินีเชียนตอนกลาง-ตะวันออกดั้งเดิม", 2269883, "poz-cet", "Latn", type = "reconstructed", } m["poz-hce-pro"] = { "Proto-Halmahera-Cenderawasih", 116773209, "poz-hce", "Latn", type = "reconstructed", } m["poz-lgx-pro"] = { "ลัมปุงกิกดั้งเดิม", 116773222, "poz-lgx", "Latn", type = "reconstructed", } m["poz-mcm-pro"] = { "มาลาโย-จามิกดั้งเดิม", 116773225, "poz-mcm", "Latn", type = "reconstructed", } m["poz-mic-pro"] = { "ไมโครนีเซียนดั้งเดิม", 111939079, "poz-mic", "Latn", type = "reconstructed", } m["poz-mly-pro"] = { "มาเลย์อิกดั้งเดิม", 98057728, "poz-mly", "Latn", type = "reconstructed", } m["poz-msa-pro"] = { "มาลาโย-ซุมบาวันดั้งเดิม", 116773226, "poz-msa", "Latn", type = "reconstructed", } m["poz-oce-pro"] = { "โอเชียนิกดั้งเดิม", 141741, "poz-oce", "Latn", type = "reconstructed", } m["poz-pep-pro"] = { "พอลินีเชียนตะวันออกดั้งเดิม", 113988745, "poz-pep", "Latn", type = "reconstructed", } m["poz-pnp-pro"] = { "นิวเคลียร์พอลินีเชียนดั้งเดิม", 113988746, "poz-pnp", "Latn", type = "reconstructed", } m["poz-pol-pro"] = { "พอลินีเชียนดั้งเดิม", 1658709, "poz-pol", "Latn", type = "reconstructed", } m["poz-pro"] = { "มาลาโย-พอลินีเชียนดั้งเดิม", 3832960, "poz", "Latn", type = "reconstructed", } m["poz-sml"] = { "Sarawak Malay", 4251702, "poz-mly", "Latn, ms-Arab", } m["poz-ssw-pro"] = { "ซูลาเวซีใต้ดั้งเดิม", 116773279, "poz-ssw", "Latn", type = "reconstructed", } m["poz-swa-pro"] = { "ซาราวักเหนือดั้งเดิม", 116773243, "poz-swa", "Latn", type = "reconstructed", } m["poz-ter"] = { "มลายูแบบตรังกานู", 4207412, "poz-mly", "Latn, ms-Arab", } m["pqe-pro"] = { "มาลาโย-พอลินีเชียนตะวันออกดั้งเดิม", 2269883, "pqe", "Latn", type = "reconstructed", } m["pra-niy"] = { "Niya Prakrit", 11991601, "inc-mid", "Khar", ancestors = "inc-ash", translit = 
"Khar-translit", } m["qfa-adm-pro"] = { "เกรตอันดามันนีสดั้งเดิม", 116773756, "qfa-adm", "Latn", type = "reconstructed", } m["qfa-bet-pro"] = { "เบ-ไทดั้งเดิม", 116773193, "qfa-bet", "Latn", type = "reconstructed", } m["qfa-cka-pro"] = { "Proto-Chukotko-Kamchatkan", 7251837, "qfa-cka", "Latn", type = "reconstructed", } m["qfa-hur-pro"] = { "Proto-Hurro-Urartian", 116773211, "qfa-hur", "Latn", type = "reconstructed", } m["qfa-kad-pro"] = { "Proto-Kadu", 116773770, "qfa-kad", "Latn", type = "reconstructed", } m["qfa-kms-pro"] = { "Proto-Kam-Sui", 55630682, "qfa-kms", "Latn", type = "reconstructed", } m["qfa-kor-pro"] = { "เกาหลีดั้งเดิม", 467883, "qfa-kor", "Latn", type = "reconstructed", } m["qfa-kra-pro"] = { "ขร้าดั้งเดิม", 7251854, "qfa-kra", "Latn", type = "reconstructed", } m["qfa-lic-pro"] = { "ไหลดั้งเดิม", 7251845, "qfa-lic", "Latn", type = "reconstructed", } m["qfa-onb-pro"] = { "เบดั้งเดิม", 116773192, "qfa-onb", "Latn", type = "reconstructed", } m["qfa-ong-pro"] = { "Proto-Ongan", 116773801, "qfa-ong", "Latn", type = "reconstructed", } m["qfa-tak-pro"] = { "ขร้า-ไทดั้งเดิม", 104901616, "qfa-tak", "Latn", type = "reconstructed", } m["qfa-yen-pro"] = { "Proto-Yeniseian", 27639, "qfa-yen", "Latn", type = "reconstructed", } m["qfa-yuk-pro"] = { "Proto-Yukaghir", 116773294, "qfa-yuk", "Latn", type = "reconstructed", } m["qwe-kch"] = { "Kichwa", 1740805, "qwe", "Latn", ancestors = "qu", } m["qwe-pro"] = { "เกชวนดั้งเดิม", 5575757, "qwe", "Latn", type = "reconstructed", } m["roa-ang"] = { "Angevin", 56782, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-bbn"] = { "บูร์บอแน-แบรีชง", 2899128, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-brg"] = { "Bourguignon", 508332, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-can"] = { "Cantabrian", 917021, "roa-asl", "Latn", } m["roa-cha"] = { "Champenois", 430018, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-fcm"] = { "Franc-Comtois", 510561, "roa-oil", "Latn", 
sort_key = s["roa-oil-sortkey"], } m["roa-gal"] = { "Gallo", 37300, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-gib"] = { "Gallo-Italic of Basilicata", 3094838, "roa-git", "Latn", } m["roa-gis"] = { "Gallo-Italic of Sicily", 2629019, "roa-git", "Latn", } m["roa-leo"] = { "เลออน", 34108, "roa-asl", "Latn", } m["roa-lor"] = { "Lorrain", 671198, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-oca"] = { "กาตาลาเก่า", 15478520, "roa-ocr", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"}, } m["roa-ole"] = { "เลออนเก่า", 125977465, "roa-asl", "Latn", } m["roa-ona"] = { "Old Navarro-Aragonese", 2736184, "roa-nar", "Latn", } m["roa-opt"] = { "กาลิเซีย-โปรตุเกสเก่า", 1072111, "roa-gap", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ}, } m["roa-orl"] = { "Orléanais", 28497058, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-poi"] = { "Poitevin-Saintongeais", 514123, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-tar"] = { "Tarantino", 695526, "roa-itr", "Latn", wikimedia_codes = "roa-tara", } m["sai-all"] = { "Allentiac", 19570789, "sai-hrp", "Latn", } m["sai-and"] = { -- not to be confused with 'cbc' or 'ano' "Andoquero", 16828359, "sai-wit", "Latn", } m["sai-ayo"] = { "Ayomán", 16937754, "sai-jir", "Latn", } m["sai-bae"] = { "Baenan", 3401998, "qfa-unc", -- extinct, poorly attested; only known through 9 words "Latn", } m["sai-bag"] = { "Bagua", 5390321, "qfa-unc", -- extinct, poorly attested; possibly Cariban "Latn", } m["sai-bet"] = { "Betoi", 926551, "qfa-iso", "Latn", } m["sai-bor-pro"] = { "Proto-Boran", nil, "sai-bor", "Latn", } m["sai-cac"] = { "Cacán", 945482, "qfa-unc", -- extinct, poorly attested; no consensus on classification "Latn", } m["sai-caq"] = { "Caranqui", 2937753, "sai-bar", "Latn", } m["sai-car-pro"] = { "Proto-Cariban", 116773196, "sai-car", "Latn", type = "reconstructed", } m["sai-cat"] = { "Catacao", 5051136, "sai-ctc", 
"Latn", } m["sai-cer-pro"] = { "Proto-Cerrado", 116773200, "sai-cer", "Latn", type = "reconstructed", } m["sai-chi"] = { "Chirino", 5390321, "qfa-unc", -- extinct, only four words known; possibly related to Candoshi-Shapra (cbu) "Latn", } m["sai-chn"] = { "Chaná", 5072718, "sai-crn", "Latn", } m["sai-chp"] = { "Chapacura", 5072884, "sai-cpc", "Latn", } m["sai-chr"] = { "Charrua", 5086680, "sai-crn", "Latn", } m["sai-chu"] = { "Churuya", 5118339, "sai-guh", "Latn", } m["sai-cje-pro"] = { "Proto-Central Jê", 116773198, "sai-cje", "Latn", type = "reconstructed", } m["sai-cmg"] = { "Comechingon", 6644203, "qfa-unc", -- extinct, poorly attested; no consensus on classification "Latn", } m["sai-cno"] = { "Chono", 5104704, "qfa-unc", -- extinct, poorly attested; no consensus on classification, possibly spurious "Latn", } m["sai-cnr"] = { "Cañari", 5055572, "qfa-unc", -- extinct, poorly attested; possibly Chimuan or Barbacoan "Latn", } m["sai-coe"] = { "Coeruna", 6425639, "sai-wit", "Latn", } m["sai-col"] = { "Colán", 5141893, "sai-ctc", "Latn", } m["sai-cop"] = { "Copallén", 5390321, "qfa-unc", -- extinct, only four words attested; possibly Cholonan "Latn", } m["sai-crd"] = { "Coroado Puri", 24191321, "sai-mje", "Latn", } m["sai-ctq"] = { "Catuquinaru", 16858455, "qfa-unc", -- extinct, poorly attested; vocabulary does not resemble other languages "Latn", } m["sai-cul"] = { "Culli", 2879660, "qfa-unc", -- extinct, poorly attested; often considered an isolate "Latn", } m["sai-cva"] = { "Cueva", 5192644, "qfa-unc", -- extinct, poorly attested; possibly Chocoan "Latn", } m["sai-esm"] = { "Esmeralda", 3058083, "qfa-unc", -- extinct, poorly attested; possibly related to Yaruro "Latn", } m["sai-ewa"] = { "Ewarhuyana", 16898104, nil, "Latn", } m["sai-gam"] = { "Gamela", 5403661, "qfa-unc", -- extinct, poorly attested; possibly an isolate "Latn", } m["sai-gay"] = { "Gayón", 5528902, "sai-jir", "Latn", } m["sai-gmo"] = { "Guamo", 5613495, "qfa-unc", -- extinct; "Kaufman (1990) finds 
a connection with the Chapacuran languages convincing." [Wikipedia] Considered an isolate by Campbell (2024). "Latn", } m["sai-gua"] = { "Guachí", 5613172, "sai-guc", "Latn", } m["sai-gue"] = { "Güenoa", 5626799, "sai-crn", "Latn", } m["sai-hau"] = { "Haush", 3128376, "sai-cho", "Latn", } m["sai-jee-pro"] = { "Proto-Jê", 116773212, "sai-jee", "Latn", type = "reconstructed", } m["sai-jko"] = { "Jeikó", 6176527, "sai-mje", "Latn", } m["sai-jrj"] = { "Jirajara", 6202966, "sai-jir", "Latn", } m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc "Katembri", 6375925, "qfa-unc", -- extinct, poorly attested; "Kaufman (1990) has linked it with the nearly extinct Taruma, although this has not been accepted by other scholars." [Wikipedia] "Latn", } m["sai-mal"] = { "Malalí", 6741212, "sai-mje", -- considered the most divergent Maxakalían language (a subdivision of Macro-Jê), for which we have no entry "Latn", } m["sai-mar"] = { "Maratino", 6755055, "qfa-unc", -- extinct, poorly attested; possibly Uto-Aztecan "Latn", } m["sai-mat"] = { "Matanawi", 6786047, "qfa-unc", -- extinct; either an isolate or distantly related to the Muran languages; Campbell (2024) lists it as an isolate, Glottolog gives it as unclassified "Latn", } m["sai-mcn"] = { "Mocana", 3402048, "qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade) "Latn", } m["sai-men"] = { "Menien", 16890110, "sai-mje", "Latn", } m["sai-mil"] = { "Millcayac", 19573012, "sai-hrp", "Latn", } m["sai-mlb"] = { "Malibu", 3402048, "qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade) "Latn", } m["sai-msk"] = { "Masakará", 6782426, "sai-mje", "Latn", } m["sai-muc"] = { "Mucuchí", 6931290, nil, -- generally considered Timotean, for which we have no entry "Latn", } m["sai-mue"] = { "Muellama", 16886936, "sai-bar", "Latn", } m["sai-muz"] = { "Muzo", 6644203, "qfa-unc", -- extinct language of Colombia, poorly attested; may be 
Pijao (Cariban) "Latn", } m["sai-mys"] = { "Maynas", 16919393, "sai-cah", -- per Campbell (2024); formerly considered unclassified "Latn", } m["sai-nat"] = { "Natú", 9006749, "qfa-unc", -- extinct, poorly attested; "only Greenberg dares to classify [it]".[Wikipedia, quoting Moseley, Christopher; Asher, R. E.; Tait, Mary (1994), Atlas of the world's languages] "Latn", } m["sai-nje-pro"] = { "Proto-Northern Jê", 116773245, "sai-nje", "Latn", type = "reconstructed", } m["sai-opo"] = { "Opón", 7099152, "sai-car", "Latn", } m["sai-oto"] = { "Otomaco", 16879234, "sai-otm", "Latn", } m["sai-pal"] = { "Palta", 3042978, "qfa-unc", -- extinct, unclassified; possibly Chicham "Latn", } m["sai-pam"] = { "Pamigua", 5908689, "sai-otm", "Latn", } m["sai-par"] = { "Paratió", 16890038, "qfa-unc", -- extinct, poorly attested; possibly Xukuruan "Latn", } m["sai-peb"] = { "Peba", 3373890, "sai-pey", "Latn", } m["sai-pnz"] = { "Panzaleo", 3123275, "qfa-unc", -- extinct, unclassified; possibly Paezan "Latn", } m["sai-prh"] = { "Puruhá", 3410994, "qfa-unc", -- extinct, poorly attested; possibly in a family with Cañari "Latn", } m["sai-ptg"] = { "Patagón", 128807870, "sai-tar", -- extinct, only known from 4 words, which suggest Cariban lineage (Campbell 2024) "Latn", } m["sai-pur"] = { "Purukotó", 7261622, "sai-pem", "Latn", } m["sai-pyg"] = { "Payaguá", 7156643, "sai-guc", "Latn", } m["sai-pyk"] = { "Pykobjê", 98113977, "sai-nje", "Latn", } m["sai-qmb"] = { "Quimbaya", 7272043, "qfa-unc", -- extinct, might not exist; few known words "Latn", } m["sai-qtm"] = { "Quitemo", 7272651, "sai-cpc", "Latn", } m["sai-rab"] = { "Rabona", 6644203, "qfa-unc", -- extinct, poorly attested, mostly plant names; possibly Candoshi-Shapra "Latn", } m["sai-ram"] = { "Ramanos", 16902824, "qfa-unc", -- extinct, poorly attested, possibly an isolate; per Glottolog: "the minuscule wordlist ... 
shows no convincing resemblances to surrounding languages" "Latn", } m["sai-sac"] = { "Sácata", 5390321, "qfa-unc", -- extinct, only 3 words known; possibly Candoshí or Arawakan "Latn", } m["sai-san"] = { "Sanaviron", 16895999, "qfa-unc", -- extinct; no consensus on classification "Latn", } m["sai-sap"] = { "Sapará", 7420922, "sai-car", "Latn", } m["sai-sec"] = { "Sechura", 7442912, "qfa-unc", -- extinct, poorly attested; possibly Catacaoan "Latn", } m["sai-sin"] = { "Sinúfana", 7525275, "qfa-unc", -- moribund, poorly attested; possibly Chocoan "Latn", } m["sai-sje-pro"] = { "Proto-Southern Jê", 116773814, "sai-sje", "Latn", type = "reconstructed", } m["sai-tab"] = { "Tabancale", 5390321, "qfa-unc", -- extinct, only 5 words known; no obvious connections, might be an isolate "Latn", } m["sai-tal"] = { "Tallán", 16910468, "qfa-unc", -- extinct, poorly attested; might be Catacaoan "Latn", } m["sai-tap"] = { "Tapayuna", 30719984, "sai-nje", "Latn", } m["sai-tar-pro"] = { "Proto-Taranoan", 116773816, "sai-tar", "Latn", type = "reconstructed", } m["sai-teu"] = { "Teushen", 3519243, "qfa-unc", -- probably extinct by the 1950s; possibly Chonan "Latn", } m["sai-tim"] = { "Timote", 7806995, nil, -- possibly in a small Timotean family "Latn", } m["sai-tpr"] = { "Taparita", 7684460, "sai-otm", "Latn", } m["sai-trr"] = { "Tarairiú", 7685313, "qfa-unc", -- extinct, too poorly attested to classify "Latn", } m["sai-wai"] = { "Waitaká", 16918610, "qfa-unc", -- extinct, possibly Purian "Latn", } m["sai-way"] = { "Wayumara", 7960726, "sai-car", "Latn", } m["sai-wit-pro"] = { "Proto-Witotoan", 116773823, "sai-wit", "Latn", type = "reconstructed", } m["sai-wnm"] = { "Wanham", 16879440, "sai-cpc", "Latn", } m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat "Xocó", 12953620, "qfa-unc", -- extinct and poorly attested; not clear if one or three languages "Latn", } m["sai-yao"] = { "Yao (South America)", 16979655, "sai-ven", "Latn", } m["sai-yar"] = { -- not the same family as 
'suy' "Yarumá", 3505859, "sai-pek", "Latn", } m["sai-yri"] = { "Yuri", 2669157, "sai-tyu", "Latn", } m["sai-yup"] = { "Yupua", 8061430, "sai-tuc", "Latn", } m["sai-yur"] = { "Yurumanguí", 1281291, "qfa-unc", -- extinct, too poorly attested to classify "Latn", } m["sal-pro"] = { "Proto-Salish", 116773269, "sal", "Latn", type = "reconstructed", } m["sdv-daj-pro"] = { "Proto-Daju", 116773739, "sdv-daj", "Latn", type = "reconstructed", } m["sdv-eje-pro"] = { "Proto-Eastern Jebel", 116773751, "sdv-eje", "Latn", type = "reconstructed", } m["sdv-nil-pro"] = { "Proto-Nilotic", 116773794, "sdv-nil", "Latn", type = "reconstructed", } m["sdv-nyi-pro"] = { "Proto-Nyima", 116773796, "sdv-nyi", "Latn", type = "reconstructed", } m["sdv-tmn-pro"] = { "Proto-Taman", 116773815, "sdv-tmn", "Latn", type = "reconstructed", } m["sel-nor"] = { "Northern Selkup", 30304565, "sel", "Cyrl", translit = "sel-nor-translit", } m["sel-pro"] = { "Proto-Selkup", 128884235, "sel", "Latn", type = "reconstructed", } m["sel-sou"] = { "Southern Selkup", 30304639, "sel", "Cyrl", translit = "sel-sou-translit", } m["sem-amm"] = { "อัมโมน", 279181, "sem-can", "Phnx", -- Phnx translit in [[Module:scripts/data]] } m["sem-amo"] = { "Amorite", 35941, "sem-nwe", "Xsux, Latn", } m["sem-cha"] = { "Chaha", 35543, "sem-eth", "Ethi", translit = "Ethi-translit", } m["sem-dad"] = { "Dadanitic", 21838040, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-dum"] = { "Dumaitic", 128810397, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-has"] = { "Hasaitic", 3541433, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-his"] = { "Hismaic", 22948260, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-mhr"] = { "Muher", 33743, "sem-eth", "Latn", } m["sem-pro"] = { "เซมิติกดั้งเดิม", 1658554, "sem", "Latn", type = "reconstructed", } m["sem-saf"] = { "Safaitic", 472586, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } 
m["sem-sam"] = { "Samalian", 85847147, "sem-nwe", "Phnx", -- Phnx translit in [[Module:scripts/data]] } m["sem-srb"] = { "Old South Arabian", 35025, "sem-osa", "Sarb", -- Sarb translit in [[Module:scripts/data]] } m["sem-tay"] = { "Taymanitic", 24912301, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-tha"] = { "Thamudic", 843030, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-wes-pro"] = { "เซมิติกตะวันตกดั้งเดิม", 98021726, "sem-wes", "Latn", type = "reconstructed", } m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro' "Proto-Siouan", 34181, "sio", "Latn", type = "reconstructed", } m["sit-aao-pro"] = { "Proto-Central Naga", nil, "sit-aao", "Latn", type = "reconstructed", } m["sit-bai-pro"] = { "Proto-Bai", nil, "sit-bai", "Latn", type = "reconstructed", } m["sit-ban"] = { "Bangru", 56071779, "sit-hrs", "Latn", } m["sit-bdi-pro"] = { "Proto-Bodish", nil, "sit-bdi", "Latn", type = "reconstructed", } m["sit-bok"] = { "Bokar", 4938727, "sit-tan", "Latn, Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["sit-cai"] = { "Caijia", 5017528, "sit-cln", "Latn" } m["sit-cha"] = { "Chairel", 5068066, "sit-luu", "Latn", } m["sit-ers-pro"] = { "Proto-Ersuic", nil, "sit-ers", "Latn", type = "reconstructed", } m["sit-hrs-pro"] = { "Proto-Hrusish", 116773762, "sit-hrs", "Latn", type = "reconstructed", } m["sit-jap"] = { "Japhug", 3162245, "sit-egy", "Latn", } m["sit-kha-pro"] = { "Proto-Kham", 116773773, "sit-kha", "Latn", type = "reconstructed", } m["sit-khb-pro"] = { "Proto-Kho-Bwa", nil, "sit-khb", "Latn", type = "reconstructed", } m["sit-khp-pro"] = { "Proto-Puroik", nil, "sit-khb", "Latn", type = "reconstructed", } m["sit-khw-pro"] = { "Proto-Western Kho-Bwa", nil, "sit-khw", "Latn", type = "reconstructed", } m["sit-kon-pro"] = { "Proto-Northern Naga", nil, "sit-kon", "Latn", type = "reconstructed", } m["sit-liz"] = { "Lizu", 6660653, 
"sit-ers", "Latn", -- and Ersu Shaba } m["sit-lnj"] = { "Longjia", 17096251, "sit-cln", "Latn" } m["sit-lrn"] = { "Luren", 16946370, "sit-cln", "Latn" } m["sit-luu-pro"] = { "ลูอิชดั้งเดิม", 116773783, "sit-luu", "Latn", type = "reconstructed", } m["sit-nas-pro"] = { "Proto-Naish", nil, "sit-nas", "Latn", type = "reconstructed", } m["sit-prn"] = { "Puiron", 7259048, "sit-zem", } m["sit-pro"] = { "ซีโน-ทิเบตันดั้งเดิม", 24839178, "sit", "Latn", type = "reconstructed", } m["sit-sit"] = { "Situ", 19840830, "sit-egy", "Latn", } m["sit-tam-pro"] = { "Proto-Tamangic", 117469295, "sit-tam", "Latn", type = "reconstructed", } m["sit-tan-pro"] = { "Proto-Tani", 116773284, "sit-tan", "Latn", -- needs verification type = "reconstructed", } m["sit-tgm"] = { "ตางัม", 17041370, "sit-tan", "Latn", } m["sit-tng-pro"] = { "Proto-Tangkhulic", nil, "sit-tng", "Latn", type = "reconstructed" } m["sit-tos"] = { "Tosu", 7827899, "sit-ers", "Latn", -- also Ersu Shaba } m["sit-tsh"] = { "Tshobdun", 19840950, "sit-egy", "Latn", } m["sit-zbu"] = { "Zbu", 19841106, "sit-egy", "Latn", } m["sla-pro"] = { "สลาวิกดั้งเดิม", 747537, "sla", "Latn", type = "reconstructed", strip_diacritics = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. 
c.invbreve, remove_exceptions = {'ś'}, }, sort_key = { from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"}, to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"}, } } m["smi-pro"] = { "ซามิกดั้งเดิม", 7251862, "smi", "Latn", type = "reconstructed", sort_key = { from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"}, to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"} }, } m["son-pro"] = { "Proto-Songhay", 116773277, "son", "Latn", type = "reconstructed", } m["sqj-pro"] = { "แอลเบเนียนดั้งเดิม", 18210846, "sqj", "Latn", type = "reconstructed", } m["ssa-klk-pro"] = { "Proto-Kuliak", 116773779, "ssa-klk", "Latn", type = "reconstructed", } m["ssa-kom-pro"] = { "Proto-Koman", 116773775, "ssa-kom", "Latn", type = "reconstructed", } m["ssa-pro"] = { "Proto-Nilo-Saharan", 116773236, "ssa", "Latn", type = "reconstructed", } m["syd-pro"] = { "Proto-Samoyedic", 7251863, "syd", "Latn", type = "reconstructed", } m["tai-pro"] = { "ไทดั้งเดิม", 6583709, "tai", "Latn", type = "reconstructed", } m["tai-swe-pro"] = { "ไทตะวันตกเฉียงใต้ดั้งเดิม", 116773280, "tai-swe", "Latn", type = "reconstructed", } m["tbq-bdg-pro"] = { "โบโด-กาโรดั้งเดิม", 116773195, "tbq-bdg", "Latn", type = "reconstructed", } m["tbq-blg"] = { "Bailang", 2879843, "tbq-lob", "Hani", sort_key = "Hani-sortkey", } m["tbq-brm-pro"] = { "Proto-Burmish", nil, "tbq-brm", "Latn", type = "reconstructed", } m["tbq-gkh"] = { "Gokhy", 5578069, "tbq-sil", "Latn", } m["tbq-kuk-pro"] = { "Proto-Kuki-Chin", 116773220, "tbq-kuk", "Latn", type = "reconstructed", } m["tbq-lal-pro"] = { "Proto-Lalo", 116773781, "tbq-lal", "Latn", type = "reconstructed", } m["tbq-laz"] = { "Laze", 17007626, "sit-nas", "Latn", } m["tbq-lob-pro"] = { "โลโล-เบอร์มีซดั้งเดิม", 116773224, "tbq-lob", "Latn", type = "reconstructed", } m["tbq-lol-pro"] = { "โลโลอิชดั้งเดิม", 7251855, "tbq-lol", "Latn", type = "reconstructed", } m["tbq-mil"] = { "Milang", 6850761, "sit-gsi", 
"Deva, Latn", translit = { Deva = "Deva-translit", }, } m["tbq-mor"] = { "Moran", 6909216, "tbq-bdg", "Latn", } m["tbq-ngo"] = { "Ngochang", 56582, "tbq-brm", "Latn", } -- tbq-pro is now etymology-only m["trk-dkh"] = { "Dukhan", 12809273, "trk-ssb", "Latn, Cyrl, Mong", -- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]] } -- As described in Mahmud al-Kashgari's 11th century ''Dīwān Lughāt al-Turk''. m["trk-eog"] = { "Early Old Oghuz", nil, "trk-ogz", "ota-Arab", strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"}, } m["trk-oat"] = { "ตุรกีแบบอานาโตเลียเก่า", 7083390, "trk-ogz", "ota-Arab", strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"}, ancestors = "trk-eog", } m["trk-pro"] = { "เตอร์กิกดั้งเดิม", 3657773, "trk", "Latn", type = "reconstructed", standard_chars = { Latn = " ()-abdegiklmnoprstuxyzïöüāčēīĺŋōŕšūǖȫẹ" .. c.macron, } } m["tup-gua-pro"] = { "ตูปี-กัวรานีดั้งเดิม", 116773288, "tup-gua", "Latn", type = "reconstructed", } m["tup-kab"] = { "Kabishiana", 15302988, "tup", "Latn", } m["tup-pro"] = { "ตูเปียนดั้งเดิม", 10354700, "tup", "Latn", type = "reconstructed", } m["tuw-alk"] = { "Alchuka", 113553616, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-bal"] = { "Bala", 86730632, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-kkl"] = { "Kyakala", 118875708, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-kli"] = { "Kili", 6406892, "tuw-ewe", "Cyrl", } m["tuw-pro"] = { "Proto-Tungusic", 85872335, "tuw", "Latn", type = "reconstructed", } m["tuw-sol"] = { "Solon", 30004, "tuw-ewe", } m["urj-fin-pro"] = { "ฟินนิกดั้งเดิม", 11883720, "urj-fin", "Latn", type = "reconstructed", } m["urj-koo"] = { "Old Komi", 86679962, "kv", "Perm, Cyrs", translit = "urj-koo-translit", -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; previously, Cyrs strip_diacritics not present } m["urj-kuk"] = { "Kukkuzi", 107410460, "urj-fin", "Latn", ancestors = "vot", } 
m["urj-kya"] = { "Komi-Yazva", 2365210, "kv", "Cyrl", translit = "kv-translit", override_translit = true, strip_diacritics = {remove_diacritics = c.acute}, } m["urj-mdv-pro"] = { "Proto-Mordvinic", 116773232, "urj-mdv", "Latn", type = "reconstructed", } m["urj-prm-pro"] = { "เปอร์มิกดั้งเดิม", 116773257, "urj-prm", "Latn", type = "reconstructed", } m["urj-pro"] = { "ยูราลิกดั้งเดิม", 288765, "urj", "Latn", type = "reconstructed", } m["urj-ugr-pro"] = { "ยูกริกดั้งเดิม", 156631, "urj-ugr", "Latn", type = "reconstructed", } m["xnd-pro"] = { "Proto-Na-Dene", 116773233, "xnd", "Latn", type = "reconstructed", } m["xgn-pro"] = { "มองโกลิกดั้งเดิม", 2493677, "xgn", "Latn", type = "reconstructed", sort_key = { from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"}, to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]}, }, } m["yok-bvy"] = { "Buena Vista Yokuts", 4985474, "yok", "Latn", } m["yok-dly"] = { "Delta Yokuts", 70923266, "yok", "Latn", } m["yok-gsy"] = { "Gashowu Yokuts", 3098708, "yok", "Latn", } m["yok-kry"] = { "Kings River Yokuts", 6413014, "yok", "Latn", } m["yok-nvy"] = { "Northern Valley Yokuts", 85789777, "yok", "Latn", } m["yok-ply"] = { "Palewyami Yokuts", 2387391, "yok", "Latn", } m["yok-svy"] = { "Southern Valley Yokuts", 12642473, "yok", "Latn", } m["yok-tky"] = { "Tule-Kaweah Yokuts", 7851988, "yok", "Latn", } m["ypk-pro"] = { "Proto-Yupik", 116773295, "ypk", "Latn", type = "reconstructed", } m["yrk-for"] = { "Forest Nenets", 1295107, "yrk", "Cyrl", translit = "yrk-for-translit", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. 
c.dotabove}, } m["yrk-tun"] = { "Tundra Nenets", 36452, "yrk", "Cyrl", strip_diacritics = { from = {"ӑ", "а̄", "э̇", "ӣ", "ы̄", "ӯ", "ю̄", "я̆", "я̄"}, to = {"а", "а", "э", "и", "ы", "у", "ю", "я", "я"}, }, translit = "yrk-tun-translit", } m["zhx-min-pro"] = { "หมิ่นดั้งเดิม", 19646347, "zhx-min", "Latn", type = "reconstructed", } m["zhx-sht"] = { "Shaozhou Tuhua", 1920769, "zhx", "Nshu, Hants", generate_forms = "zh-generateforms", sort_key = {Hani = "Hani-sortkey"}, } m["zhx-sic"] = { "เสฉวน", 2278732, "zhx-man", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zhx-tai"] = { "ห่อยซัน", 2208940, "zhx-yue", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zle-ono"] = { "Old Novgorodian", 162013, "zle", "Cyrs, Glag", translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"}, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["zle-ort"] = { "รูซินเก่า", 13211, "zle", "Arab, Cyrs, Latn", ancestors = "orv", translit = { Cyrs = "zle-ort-translit", Arab = "zle-ort-Arab-translit", }, strip_diacritics = { Cyrs = { remove_diacritics = m_langdata.chars_substitutions["Cyrs_remove_diacritics"], remove_exceptions = {"Ї", "ї"}, }, Arab = "ar-stripdiacritics", }, -- Cyrs sort_key in [[Module:scripts/data]] } m["zls-chs"] = { "Church Slavonic", 33251, "zls", "Cyrs, Glag, Latn", ancestors = "cu", translit = { Cyrs = "Cyrs-translit", Glag = "Glag-translit" }, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["zlw-ocs"] = { "เช็กเก่า", 593096, "zlw", "Latn", } m["zlw-opl"] = { "โปแลนด์เก่า", 149838, "zlw-lch", "Latn", strip_diacritics = {remove_diacritics = c.ringabove}, } m["zlw-osk"] = { "สโลวักเก่า", 12776676, "zlw", "Latn", } m["zlw-slv"] = { "สโลวินช์", 36822, "zlw-pom", "Latn", strip_diacritics = {remove_diacritics = c.macron .. 
c.breve}, } return require("Module:languages").finalizeData(m, "language") c7n6allndmel74lnv21o7pg4zdh1o88 มอดูล:languages/data/3/w 828 36364 5720768 5684169 2026-04-21T07:01:19Z OctraBot 3198 Bot: automatic text replacement (-standardChars +standard_chars) 5720768 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["waa"] = { "Walla Walla", 12953960, "nai-shp", "Latn", ancestors = "nai-spt", } m["wab"] = { "Wab", 11222271, "poz-ocw", "Latn", } m["wac"] = { "Wasco-Wishram", 12645081, "nai-ckn", "Latn", } m["wad"] = { "Wandamen", 2806128, "poz-hce", "Latn", } m["waf"] = { "Wakoná", 7961205, } m["wag"] = { "Wa'ema", 12953264, "poz-ocw", "Latn", } m["wah"] = { "Watubela", 7975070, "poz-cma", "Latn", } m["waj"] = { "Waffa", 3565058, "ngf-kag", "Latn", } m["wal"] = { "Wolaytta", 36943, "omv-nom", "Latn, Ethi", } m["wam"] = { "Massachusett", 56519, "alg-eas", "Latn", } m["wan"] = { "Wan", 3913272, "dmn-nbe", } m["wao"] = { "Wappo", 56530, "nai-ykn", "Latn", } m["wap"] = { "Wapishana", 3450493, "awd", "Latn", } m["waq"] = { "Wageman", 3436843, "aus-gun", "Latn", } m["war"] = { "วาไร", 34279, "phi", "Latn", strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. 
c.circ}}, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey", }, } m["was"] = { "Washo", 34198, "qfa-iso", "Latn", } m["wat"] = { "Kaninuwa", 12952565, "poz-ocw", "Latn", } m["wau"] = { "Wauja", 3450522, "awd", "Latn", } m["wav"] = { "Waka", 3913394, "alv-mye", } m["waw"] = { "Waiwai", 56632, "sai-prk", "Latn", } m["wax"] = { "Watam", 3566597, "paa-ram", "Latn", } m["way"] = { "Wayana", 5908753, "sai-gui", "Latn", } m["waz"] = { "Wampur", 7966957, "poz-ocw", "Latn", } m["wba"] = { "วาราโอ", 36946, "qfa-iso", "Latn", } m["wbb"] = { "Wabo", 7958701, "poz-hce", "Latn", } m["wbe"] = { "Waritai", 7969453, "paa-lkp", "Latn", } m["wbf"] = { "Wara", 3914052, "alv-wan", } m["wbh"] = { "Wanda", 7967153, "bnt-mwi", } m["wbi"] = { "Wanji", 3376818, "bnt-bki", "Latn", } m["wbj"] = { "Alagwa", 56621, "cus-sou", "Latn", } m["wbk"] = { "Waigali", 34196, "nur-sou", "Latn", } m["wbl"] = { "Wakhi", 34208, "xsc-skw", "Cyrl, Latn, Arab", translit = {Cyrl = "tg-translit"}, } m["wbm"] = { "Wa", 12644869, "mkh-pal", } m["wbp"] = { "Warlpiri", 1639998, "aus-pam", "Latn", } m["wbq"] = { "Waddar", 6708569, "dra-tel", } m["wbr"] = { "Wagdi", 7959490, "inc-bhi", "Deva", } m["wbt"] = { "Wanman", 7967989, } m["wbv"] = { "Wajarri", 3913856, "aus-psw", "Latn", } m["wbw"] = { "Woi", 8029092, "poz-hce", "Latn", } m["wca"] = { "Yanomam", 7960056, "sai-ynm", "Latn", } m["wci"] = { "Waci Gbe", 36987, "alv-gbe", } m["wdd"] = { "Wandji", 36976, "bnt-nze", } m["wdg"] = { "Wadaginam", 7958930, "ngf-sad", "Latn", } m["wdj"] = { "Wadjiginy", 7959489, } m["wdt"] = { "Wendat", 3567223, "iro-nor", "Latn", ancestors = "iro-ohu", } m["wdu"] = { "Wadjigu", 10719025, } m["wdy"] = { "Wadjabangayi", 63313681, } m["wea"] = { "Wewaw", 15895870, } m["wec"] = { "Wè Western", 11159067, "kro-wee", } m["wed"] = { "Wedau", 12953294, "poz-ocw", "Latn", } m["weh"] = { "Weh", 7979690, "nic-rnw", } m["wei"] = { "Kiunum", 7983230, "paa-ani", "Latn", } m["wem"] = { "Weme 
Gbe", 18379970, "alv-gbe", } m["weo"] = { "Wemale", 7982165, "poz-cma", } m["wer"] = { "Weri", 11732752, "paa-kun", "Latn", } m["wes"] = { "Cameroon Pidgin", 35541, "crp", "Latn", ancestors = "en", } m["wet"] = { "Perai", 12953035, "poz-tim", } m["weu"] = { "Welaung", 7980503, "tbq-kuk", } m["wew"] = { "Weyewa", 4314526, "poz-cet", "Latn", } m["wfg"] = { "Yafi", 8074520, "paa-pau", } m["wga"] = { "Wagaya", 7959487, "aus-pam", } m["wgb"] = { "Wagawaga", 7959485, } m["wgg"] = { "Wangganguru", 7967859, "aus-kar", "Latn", } m["wgi"] = { "Wahgi", -- not to be confused with North Wahgi 3565122, "ngf-chw", "Latn", } m["wgo"] = { "Waigeo", 7959937, "poz-hce", "Latn", } m["wgu"] = { "Wirangu", 2092286, "aus-pam", "Latn", } m["wgy"] = { "Warrgamay", 3915942, "aus-pam", "Latn", } m["wha"] = { "Manusela", 3287127, "poz-cma", "Latn", } m["whg"] = { "North Wahgi", 12953273, "ngf-chw", "Latn", } m["whk"] = { "Wahau Kenyah", 7959737, "poz-swa", "Latn", } m["whu"] = { "Wahau Kayan", 12473397, } m["wib"] = { "Southern Toussian", 11158982, "alv-sav", } m["wic"] = { "Wichita", 56513, "cdd", "Latn", } m["wie"] = { "Wik-Epa", 10720035, "aus-pmn", } m["wif"] = { "Wik-Keyangan", 10720037, "aus-pmn", } m["wig"] = { "Wik-Ngathana", 3915695, "aus-pmn", } m["wih"] = { "Wik-Me'anha", 10720039, "aus-pmn", } m["wii"] = { "Minidien", 6865237, "paa-tor", "Latn", } m["wij"] = { "Wik-Iiyanh", 10720036, "aus-pmn", } m["wik"] = { "Wikalkan", 7999800, "aus-pmn", } m["wil"] = { "Wilawila", 10720050, "aus-wor", } m["wim"] = { "Wik-Mungkan", 2092246, "aus-pmn", "Latn", } m["win"] = { "วินเนอเบโก", 1957108, "sio-msv", "Latn", } m["wir"] = { "Wiraféd", 12953970, "tup-gua", "Latn", } m["wiu"] = { "Wiru", 8027044, "qfa-dis", -- Papuan; isolate in Glottolog; grouped with Teberan by Usher (2020) "Latn", } m["wiv"] = { "Muduapa", 3121040, "poz-ocw", "Latn", } m["wiy"] = { "Wiyot", 36937, "aql", "Latn", } m["wja"] = { "Waja", 3914415, "alv-wjk", } m["wji"] = { "Warji", 3440381, "cdc-wst", "Latn", } m["wka"] = { 
"Kw'adza", 3807652, "cus-sou", } m["wkb"] = { "Kumbaran", 16878146, "dra-sdo", } m["wkd"] = { "Mo", 7960881, "poz-ocw", "Latn", } m["wkl"] = { "Kalanadi", 6350515, "dra-mal", } m["wku"] = { "Kunduvadi", 6444383, "dra-mal", } m["wkw"] = { "Wakawaka", 10719110, "aus-pam", } m["wky"] = { "Wangkayutyuru", 33060533, "aus-kar", } m["wla"] = { "Walio", 7961958, "paa-wal", "Latn", } m["wlc"] = { "Mwali Comorian", 3319155, "bnt-com", "Latn", sort_key = "bnt-com-sortkey", } m["wle"] = { "Wolane", 12645275, "sem-eth", "Ethi", } m["wlg"] = { "Kunbarlang", 5618523, "aus-gun", "Latn", } m["wli"] = { "Waioli", 7960241, "paa-nha", "Latn", } m["wlk"] = { "Wailaki", 20832, "ath-pco", "Latn", } m["wll"] = { "Wali (Sudan)", 30597440, "nub-hil", } m["wlm"] = { "เวลส์กลาง", 2487263, "cel-brw", "Latn", ancestors = "owl", strip_diacritics = { from = {"Ð", "ð"}, to = {"D", "d"} }, sort_key = "wlm-sortkey", } m["wlo"] = { "Wolio", 1185114, "poz-wot", "Latn, Arab", } m["wlr"] = { "Wailapa", 7960062, "poz-vnn", "Latn", } m["wls"] = { "Wallisian", 36979, "poz-pnp", "Latn", } m["wlu"] = { "Wuliwuli", 8039208, } m["wlv"] = { "Wichí Lhamtés Vejoz", 13526867, "sai-wic", "Latn", } m["wlw"] = { "Walak", 7961258, "ngf-dan", "Latn", } m["wlx"] = { "Wali (Ghana)", 36895, "nic-mre", "Latn", } m["wly"] = { "Waling", 7961957, "sit-kic", ancestors = "bap", } m["wmb"] = { "Wambaya", 2083197, "aus-mir", "Latn", } m["wmc"] = { "Wamas", 7966909, "ngf-nwh", "Latn", } m["wmd"] = { "Mamaindé", 3284890, "sai-nmk", "Latn", } m["wme"] = { "Wambule", 56785, "sit-kiw", "Latn", } m["wmh"] = { "Waima'a", 7960132, "poz-tim", "Latn", } m["wmi"] = { "Wamin", 7966934, } m["wmm"] = { "Maiwa (Indonesia)", 6737226, "poz", "Latn", } m["wmn"] = { "Waamwang", 7958575, "poz-cln", "Latn", } m["wmo"] = { "Wam", 8030620, "paa-tor", "Latn", } m["wms"] = { "Wambon", 7966922, "ngf-gaw", "Latn", } m["wmt"] = { "Walmajarri", 2232696, "aus-pam", "Latn", } m["wmw"] = { "มวานี", 3042206, "bnt-swh", "Latn", } m["wmx"] = { "Womo", 8031646, 
"paa-msk", "Latn", } m["wnb"] = { "Wanambre", 7967057, "ngf-tib", "Latn", } m["wnc"] = { "Wantoat", 7968184, "ngf-fin", "Latn", } m["wnd"] = { "Wandarang", 3913767, "aus-arn", "Latn", } m["wne"] = { "Waneci", 7967334, "ira-pat", "ps-Arab", } m["wng"] = { "Wanggom", 11732736, "ngf-gaw", "Latn", } m["wni"] = { "Ndzwani Comorian", 2850262, "bnt-com", "Latn", sort_key = "bnt-com-sortkey", } m["wnk"] = { "Wanukaka", 2370136, "poz", "Latn", } m["wnm"] = { "Wanggamala", 7967860, "aus-kar", "Latn", } m["wno"] = { "Wano", 3566166, "ngf-dan", "Latn", } m["wnp"] = { "Wanap", 7967060, "paa-tor", "Latn", } m["wnu"] = { "Usan", 7901709, "ngf-num", "Latn", } m["wnw"] = { "Wintu", 56754, "nai-wtq", "Latn", } m["wny"] = { "Wanyi", 7968201, "aus-gar", "Latn", } m["woa"] = { "Tyaraity", 10706951, } m["wob"] = { "Wobé", 3915363, "kro-wee", } m["woc"] = { "Wogeo", 8029061, "poz-ocw", "Latn", } m["wod"] = { "Wolani", 8029704, "ngf-pan", "Latn", } m["woe"] = { "Woleaian", 34037, "poz-mic", "Latn, Wole", } m["wog"] = { "Wogamusin", 56991, "paa-spk", "Latn", } m["woi"] = { "Kamang", 8029096, "paa-tap", "Latn", } m["wok"] = { "Longto", 35795, "alv-dur", "Latn", } m["wom"] = { "Perema", 3913378, "alv-lek", "Latn", } m["won"] = { "Wongo", 8032058, "bnt-bsh", "Latn", } m["woo"] = { "Manombai", 6751253, "poz", "Latn", } m["wor"] = { "Woria", 8034514, "paa-egb", "Latn", } m["wos"] = { "Hanga Hundi", 6450232, "paa-ndu", "Latn", } m["wow"] = { "Wawonii", 3566780, "poz-btk", "Latn", } m["woy"] = { "Weyto", 3915918, "qfa-unc", -- speculated to have been Agaw } m["wpc"] = { "Wirö", 12953684, nil, "Latn", } m["wra"] = { "Warapu", 56739, "paa-msk", "Latn", } m["wrb"] = { "Warluwara", 3913761, "aus-pam", "Latn", } m["wrg"] = { "Warungu", 7970854, "aus-pam", "Latn", } m["wrh"] = { "Wiradjuri", 3913840, "aus-cww", "Latn", } m["wri"] = { "Wariyangga", 10719289, "aus-psw", "Latn", } m["wrk"] = { "Garawa", 2524022, "aus-gar", "Latn", } m["wrl"] = { "Warlmanpa", 3913823, "aus-pam", } m["wrm"] = { "Warumungu", 
1764544, "aus-pam", "Latn", } m["wrn"] = { "Warnang", 36971, "alv-hei", } m["wro"] = { "Worora", 3504106, "aus-wor", "Latn", } m["wrp"] = { "Waropen", 7969851, "poz-hce", "Latn", } m["wrr"] = { "Wardaman", 3913842, "aus-yng", } m["wrs"] = { "Waris", 3502610, "paa-brd", "Latn", } m["wru"] = { "Waru", 3566463, } m["wrv"] = { "Waruna", 7971078, "ngf-gsu", "Latn", } m["wrw"] = { "Gugu Warra", 5615286, } m["wrx"] = { "Wae Rana", 7959375, } m["wrz"] = { "Warray", 7969971, "aus-gun", } m["wsa"] = { "Warembori", 56459, } m["wsi"] = { "Wusi", 8039349, "poz-vnn", "Latn", } m["wsk"] = { "Waskia", 7972683, "ngf-kow", "Latn", } m["wsr"] = { "Owenia", 7114727, "ngf-kag", "Latn", } -- "wss" Wasa is treated as "ak" Akan, see [[WT:LT]] m["wsu"] = { "Wasu", 7972892, } m["wsv"] = { "Wotapuri-Katarqalai", 3877569, "inc-koh", } m["wtf"] = { "Watiwa", 35316, "ngf-eva", "Latn", } m["wth"] = { "Wathaurong", 7974656, "aus-pam", "Latn", } m["wti"] = { "Berta", 33178, "qfa-iso", -- may be ssa "Latn", } m["wtk"] = { "Watakataui", 7972975, "paa-spk", "Latn", } m["wtm"] = { "Mewati", 2605943, "raj", "Deva", translit = "Deva-translit", } m["wtw"] = { "Wotu", 12473488, } m["wua"] = { "Wikngenchera", 10720045, "aus-pmn", } m["wub"] = { "Wunambal", 3913805, "aus-wor", } m["wud"] = { "Wudu", 36972, "alv-gbe", "Latn", } m["wuh"] = { "Wutunhua", 1012917, "qfa-mix", "Latn", ancestors = "cmn, bo, peh", } m["wul"] = { "Silimo", 11732514, "ngf-dan", "Latn", } m["wum"] = { "Wumbvu", 36891, "bnt-kel", "Latn", } m["wun"] = { "Bungu", 4997686, "bnt-mby", "Latn", } m["wur"] = { "Wurrugu", 8039305, "aus-wdj", } m["wut"] = { "Wutung", 56743, "paa-msk", "Latn", } m["wuu"] = { "อู๋", 34290, "zhx", "Hants", ancestors = "ltc", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["wuv"] = { "Wuvulu-Aua", 3062746, "poz-aay", "Latn", } m["wux"] = { "Wulna", 13591670, } m["wuy"] = { "Wauyai", 12953295, "poz-hce", "Latn", } m["wwa"] = { "Waama", 7958576, "nic-eov", "Latn", } 
m["wwo"] = { "Dorig", 3037047, "poz-vnn", "Latn", }
m["wwr"] = { "Warrwa", 7970852, }
m["www"] = { "Wawa", 36889, "nic-mmb", "Latn", }
m["wxa"] = { "Waxiang", 2252191, "zhx", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", }
m["wxw"] = { "Wardandi", 61999705, }
m["wya"] = { "Wyandot", 1185119, "iro-nor", "Latn", }
m["wyb"] = { "Ngiyambaa", 3913825, "aus-cww", "Latn", }
m["wyi"] = { "Woiwurrung", 8029099, "aus-pam", "Latn", }
m["wym"] = { "วีลามอวิตแซ", 56485, "gmw-hgm", "Latn", ancestors = "gmh", strip_diacritics = {remove_diacritics = c.dotabove}, }
m["wyr"] = { "Wayoró", 2875044, "tup", }
m["wyy"] = { "Western Fijian", 3062751, "poz-pcc", "Latn", }
return require("Module:languages").finalizeData(m, "language")
r4syvgbv90anygy45o3ulhprz6ijuvx
มอดูล:languages/data/3/v 828 36365 5720767 5683855 2026-04-21T07:01:18Z OctraBot 3198 Bot: automatic text replacement (-standardChars +standard_chars) 5720767 Scribunto text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["vaa"] = { "Vaagri Booli", 7907798, "inc", "Taml", translit = { Taml = "Taml-translit", }, }
m["vae"] = { "Vale", 3450194, "csu-val", }
m["vag"] = { "Vagla", 36637, "nic-gnw", }
m["vah"] = { "Varhadi", 155645, "inc-sou", "Deva, Modi", ancestors = "omr", translit = { Deva = "Deva-translit", Modi = "Modi-translit", }, }
m["vai"] = { "ไว", 36939, "dmn-vak", "Vaii", translit = "Vaii-translit", }
m["vaj"] = { "Sekele", 56528, }
m["val"] = { "Vehes", 7918407, }
m["vam"] = { "Vanimo", 3327415, "paa-msk", "Latn", }
m["van"] = { "Valman", 7912479, "paa-tor", }
m["vao"] = { "Vao", 2160405, "poz-vnc", "Latn", }
m["vap"] = { "Vaiphei", 56368, "tbq-kuk", }
m["var"] = { "Huarijio", 10974017, "azc-trc", "Latn", }
m["vas"] = { "Vasavi", 765418, }
m["vau"] = { "Vanuma", 7915259, "bnt-nya", }
m["vav"] = { "Varli", 7915983, "inc-sou", "Deva, Gujr", translit = { Deva = "Deva-translit", Gujr = "Gujr-translit", }, }
m["vay"] = { "Vayu", 7917585, "sit-kiw", }
m["vbb"] = { "Southeast Babar", 12952247, "poz-tim", }
m["vbk"] = { "Southwestern Bontoc", 63313677, "phi", "Latn", }
m["vec"] = {
	"เวเนโต", 32724, "roa-iwr", "Latn, Armn",
	-- Armn translit in [[Module:scripts/data]]
}
m["ved"] = { "Veddah", 2567934, "crp", "Sinh" }
m["vem"] = { "Vemgo-Mabas", 56268, }
m["veo"] = { "Ventureño", 56712, "nai-chu", "Latn", }
m["vep"] = { "เวปส์", 32747, "urj-fin", "Latn", display_text = { from = {"'"}, to = {"ʹ"} }, strip_diacritics = { from = {"'"}, to = {"ʹ"} }, sort_key = { from = {
	"č", "š", "ž", "ü", "ä", "ö", -- 2 chars
	"z", "ʹ" -- 1 char
}, to = { "c" .. p[1], "s" .. p[1], "s" .. p[3], "y" .. p[1], "y" .. p[2], "y" .. p[3], "s" .. p[2], "y" ..
p[4], } }, }
m["ver"] = { "Mom Jango", 35862, "alv-dur", }
m["vgr"] = { "Vaghri", 7908480, "inc-bhi", "Gujr", translit = "Gujr-translit", }
m["vgt"] = { "Flemish Sign Language", 2107617, "sgn", }
m["vic"] = { "Virgin Islands Creole", 7933935, "crp", "Latn", ancestors = "en", }
m["vid"] = { "Vidunda", 7928151, "bnt-ruv", }
m["vif"] = { "Vili", 3558409, "bnt-kng", "Latn", }
m["vig"] = { "Viemo", 36912, "alv-sav", }
m["vil"] = { "Vilela", 3409297, }
m["vis"] = { "Vishavan", 14916908, "dra-mal", }
m["vit"] = { "Viti", 11011055, "nic-grf", }
m["viv"] = { "Iduna", 5989839, "poz-ocw", "Latn", }
m["vjk"] = { "Bajjika", 3636001, "inc-bih", "Deva, Kthi", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, }
m["vka"] = { "Kariyarra", 13586632, "aus-nga", "Latn", }
m["vki"] = { "Ija-Zuba", 11011389, "nic-pls", ancestors = "uji", }
m["vkj"] = {
	"Kujarge", 33448, "qfa-unc",
	-- Chadic, Cushitic or an isolate; still living but only 200 words known
}
m["vkk"] = {
	"Kaur", 6378867, "poz-mly", -- per Wikipedia
}
m["vkl"] = { "Kulisusu", 3200326, "poz-btk", "Latn", }
m["vkm"] = { "Kamakan", 3192316, "sai-mje", "Latn", }
m["vko"] = {
	"Kodeoha", 3198209, "poz-btk", -- per Wikipedia
}
m["vkp"] = { "Korlai Creole Portuguese", 3915520, "crp", "Latn", ancestors = "idb", }
m["vkt"] = {
	"Tenggarong Kutai Malay", 12683226, "poz-mly", -- per Wikipedia
}
m["vku"] = { "Kurrama", 3915684, "aus-nga", "Latn", }
m["vlp"] = { "Valpei", 7912582, "poz-vnn", "Latn", }
m["vls"] = { "West Flemish", 100103, "gmw-frk", "Latn", ancestors = "dum", }
m["vma"] = { "Martuthunira", 975399, "aus-nga", "Latn", }
m["vmb"] = { "Mbabaram", 3303475, "aus-pam", "Latn", }
m["vmc"] = { "Juxtlahuaca Mixtec", 25559582, "omq-mxt", "Latn", }
m["vmd"] = {
	"Mudu Koraga", 12952656, "dra-kor", "Knda",
	-- Knda translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["vme"] = { "East Masela", 18487451, "poz-tim", "Latn", }
m["vmf"] = { "East Franconian", 497345, "gmw-hgm",
"Latn", ancestors = "gmh", sort_key = "vmf-sortkey", } m["vmg"] = { "Minigir", 17053237, "poz-ocw", "Latn", } m["vmh"] = { "Maraghei", 36220, "xme-ttc", ancestors = "xme-ttc-eas", } m["vmi"] = { "Miwa", 10586712, "aus-wor", } m["vmj"] = { "Ixtayutla Mixtec", 6101163, "omq-mxt", "Latn", } m["vmk"] = { "Makhuwa-Shirima", 2963909, "bnt-mak", "Latn", ancestors = "vmw", } m["vml"] = { "Malgana", 6743201, "aus-psw", "Latn", } m["vmm"] = { "Mitlatongo Mixtec", 6881813, "omq-mxt", "Latn", } m["vmp"] = { "Soyaltepec Mazatec", 7572000, "omq-maz", -- per Wikipedia "Latn", } m["vmq"] = { "Soyaltepec Mixtec", 7572001, "omq-mxt", "Latn", } m["vmr"] = { "Marenje", 11128833, ancestors = "vmw", "bnt-mak", } -- vms "Moskela" is extinct and unattested; see Wikipedia m["vmu"] = { "Muluridyi", 10590149, "aus-pam", -- Yalanjic but we don't have that family } m["vmv"] = { "Valley Maidu", 5096458, "nai-mdu", "Latn", } m["vmw"] = { "Makhuwa", 33882, "bnt-mak", "Latn", strip_diacritics = { remove_diacritics = c.acute }, } m["vmx"] = { "Tamazola Mixtec", 12953734, "omq-mxt", "Latn", } m["vmy"] = { "Ayautla Mazatec", 14916912, "omq-maz", "Latn", } m["vmz"] = { "Mazatlán Mazatec", 12953706, "omq-maz", "Latn", } m["vnk"] = { "Lovono", 3211090, "poz-tem", "Latn", } m["vnm"] = { "Neve'ei", 2157431, "poz-vnc", "Latn", } m["vnp"] = { "Vunapu", 7943647, "poz-vnn", "Latn", } m["vor"] = { "Voro", 3914407, "alv-yun", "Latn", } m["vot"] = { "โวต", 32858, "urj-fin", "Latn", display_text = { from = {"'"}, to = {"ʹ"} }, strip_diacritics = { from = {"'"}, to = {"ʹ"} }, sort_key = { from = { "tš", "š", "ž", "õ", "ä", "ö", "ü", "z", "ʹ" }, to = { "t" .. p[1], "s" .. p[1], "s" .. p[3], "w" .. p[1], "w" .. p[2], "w" .. p[3], "w" .. p[4], "s" .. p[2], "" } }, standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšZzŽžTtUuVvÕõÄäÖöÜüʹ" .. 
c.punc, }
m["vra"] = { "Vera'a", 3555689, "poz-vnn", "Latn", }
m["vro"] = { "เวอโร", 32762, "urj-fin", "Latn", wikimedia_codes = "fiu-vro", strip_diacritics = { remove_diacritics = c.caronbelow }, sort_key = { from = { "ǵ", "ḿ", "ń", "ṕ", "ŕ", "ś", "v́" }, to = { "g'", "m'", "n'", "p'", "r'", "s'", "v'" } }, }
m["vrs"] = { "Varisi", 3554807, "poz-ocw", }
m["vrt"] = { "Burmbar", 2928522, "poz-vnc", "Latn", }
m["vsi"] = { "Moldova Sign Language", 12953478, "sgn", }
m["vsl"] = { "Venezuelan Sign Language", 3322064, "sgn", }
m["vsv"] = { "Valencian Sign Language", 32663, "sgn", }
m["vto"] = { "Vitou", 7937210, "paa-tkw", }
m["vum"] = { "Vumbu", 36629, "bnt-sir", }
m["vun"] = { "Vunjo", 12953261, "bnt-chg", "Latn", }
m["vut"] = { "Vute", 36897, "nic-mmb", "Latn", }
m["vwa"] = { "Awa (China)", 2874642, "mkh-pal", "Latn, Mymr", }
return require("Module:languages").finalizeData(m, "language")
4ysixpmc3nwqt77xaxchmknzwdl9m6r
มอดูล:languages/data/3/t 828 36367 5720766 5688007 2026-04-21T07:01:15Z OctraBot 3198 Bot: automatic text replacement (-standardChars +standard_chars) 5720766 Scribunto text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["taa"] = { "Lower Tanana", 28565, "ath-nor", "Latn", } m["tab"] = { "ทาบาซารัน", 34079, "cau-esm", "Cyrl, Latn, Arab", translit = { Cyrl = "tab-translit", }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = "tab-sortkey", } } m["tac"] = { "Lowland Tarahumara", 15616384, "azc-trc", "Latn", } m["tad"] = { "Tause", 2356440, "paa-lkp", "Latn", } m["tae"] = { "Tariana", 732726, "awd-nwk", "Latn", } m["taf"] = { "Tapirapé", 7684673, "tup-gua", "Latn", } m["tag"] = { "Tagoi", 36537, "nic-ras", "Latn", } m["taj"] = { "Eastern Tamang", 12953177, "sit-tam", "sit-tam-Tibt, Deva", translit = { Deva = "Deva-translit", }, -- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake. } m["tak"] = { "Tala", 3914494, "cdc-wst", "Latn", } m["tal"] = { "Tal", 3440387, "cdc-wst", "Latn", } m["tan"] = { "Tangale", 529921, "cdc-wst", "Latn", } m["tao"] = { "ยามี", 715760, "phi", "Latn", } m["tap"] = { "Taabwa", 7673650, "bnt-sbi", "Latn", } m["tar"] = { "Central Tarahumara", 20090009, "azc-trc", "Latn", sort_key = {remove_diacritics = c.acute .. 
"ꞌ"}, } m["tas"] = { "Tây Bồi", 2233794, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["tau"] = { "Upper Tanana", 28281, "ath-nor", "Latn", } m["tav"] = { "Tatuyo", 2524007, "sai-tuc", "Latn", } m["taw"] = { "Tai", 7675861, "ngf-kak", "Latn", } m["tax"] = { "Tamki", 3449082, "cdc-est", "Latn", } m["tay"] = { "Atayal", 715766, "map-ata", "Latn", } m["taz"] = { "Tocho", 36680, "alv-tal", "Latn", } m["tba"] = { "Aikanã", 3409307, "qfa-iso", "Latn", } m["tbb"] = { "Tapeba", 12953908, } m["tbc"] = { "Takia", 3514336, "poz-oce", "Latn", } m["tbd"] = { "Kaki Ae", 6349417, "qfa-iso", -- isolate in Glottolog and Pawley and Hammarström (2018); tentatively in Eleman family by Ross (2005) -- (and Usher?), but they don't address counterarguments of Clifton 1997 "Latn", } m["tbe"] = { "Tanimbili", 3515188, "poz-tem", "Latn", } m["tbf"] = { "Mandara", 3285424, "poz-ocw", "Latn", } m["tbg"] = { "North Tairora", 20210398, "ngf-kag", "Latn", } m["tbh"] = { "Thurawal", 3537135, "aus-yuk", } m["tbi"] = { "Gaam", 35338, "sdv-eje", "Latn", } m["tbj"] = { "Tiang", 3528020, "poz-ocw", "Latn", } m["tbk"] = { "Calamian Tagbanwa", 3915487, "phi-kal", "Tagb, Latn", } m["tbl"] = { "ตโบลี", 7690594, "phi", "Latn", } m["tbm"] = { "Tagbu", 7675188, "nic-ser", } m["tbn"] = { "Barro Negro Tunebo", 12953943, "cba", } m["tbo"] = { "Tawala", 7689206, "poz-ocw", "Latn", } m["tbp"] = { "Taworta", 7689337, "paa-lkp", "Latn", } m["tbr"] = { "Tumtum", 3407029, "qfa-kad", } m["tbs"] = { "Tanguat", 7683166, "paa-ram", "Latn", } m["tbt"] = { "Kitembo", 13123561, "bnt-shh", "Latn", } m["tbu"] = { "Tubar", 56730, "azc-trc", "Latn", } m["tbv"] = { -- considered a dialect of Kulungtfu-Yuanggeng-Tobo [kgf] by Glottolog "Tobo", 7811712, "ngf-huo", "Latn", } m["tbw"] = { "Aborlan Tagbanwa", 3915475, "phi", "Latn", } m["tbx"] = { "Kapin", 6366665, "poz-ocw", "Latn", } m["tby"] = { "Tabaru", 11732670, "paa-nha", "Latn", } m["tbz"] = { "Ditammari", 35186, "nic-eov", } m["tca"] = { "Ticuna", 
1815205, "sai-tyu", "Latn", } m["tcb"] = { "Tanacross", 28268, "ath-nor", "Latn", } m["tcc"] = { "Datooga", 35327, "sdv-nis", "Latn", } m["tcd"] = { "Tafi", 36545, "alv-ktg", } m["tce"] = { "Southern Tutchone", 31091048, "ath-nor", "Latn", } m["tcf"] = { "Malinaltepec Tlapanec", 25559732, "omq", "Latn", } m["tcg"] = { "Tamagario", 7680531, "paa-kay", "Latn", } m["tch"] = { "Turks and Caicos Creole English", 7855478, "crp", "Latn", ancestors = "en", } m["tci"] = { "Wára", 20825638, "paa-yam", "Latn", } m["tck"] = { "Tchitchege", 36595, "bnt-tek", } m["tcl"] = { "Taman (Myanmar)", 15616518, "sit-jnp", "Latn", } m["tcm"] = { "Tanahmerah", 3514927, "qfa-dis", -- Papuan; isolate per Glottolog and Palmer (2018), considered an independent branch of TNG by Usher -- (2020); seems based only on some pronoun correspondences "Latn", } m["tco"] = { "Taungyo", 12953186, "tbq-brm", ancestors = "obr", } m["tcp"] = { "Tawr Chin", 7689338, "tbq-kuk", } m["tcq"] = { "Kaiy", 6348709, "paa-lkp", "Latn", } m["tcs"] = { "Torres Strait Creole", 36648, "crp", "Latn", ancestors = "en", } m["tct"] = { "T'en", 3442330, "qfa-kms", } m["tcu"] = { "Southeastern Tarahumara", 36807, "azc-trc", "Latn", } m["tcw"] = { "Tecpatlán Totonac", 7692795, "nai-ttn", "Latn", } m["tcx"] = { "โตทา", 34042, "dra-tkt", "Taml", translit = {Taml = "Taml-translit"}, } m["tcy"] = { "ตูลู", 34251, "dra-tlk", "Tutg, Mlym, Knda", -- Mlym is nearer than Knda but both lack ɛ/ɛː. 
translit = { Tutg = "tcy-Tutg-translit", }, -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["tcz"] = { "Thado Chin", 6583558, "tbq-kuk", } m["tda"] = { "Tagdal", 36570, "son", } m["tdb"] = { "Panchpargania", 21946879, "inc-eas", "Deva, as-Beng, Orya, Chis", translit = { Deva = "Deva-translit", ["as-Beng"] = "Beng-translit", Orya = "Orya-translit", }, ancestors = "bh", } m["tdc"] = { "Emberá-Tadó", 3052041, "sai-chc", "Latn", } m["tdd"] = { "ไทใต้คง", 36556, "tai-swe", "Tale", translit = "Tale-translit", strip_diacritics = {remove_diacritics = c.ZWNJ .. c.ZWJ}, } m["tde"] = { "Tiranige Diga Dogon", 5313387, "nic-dgw", } m["tdf"] = { "Talieng", 37525108, "mkh-ban", } m["tdg"] = { "Western Tamang", 12953178, "sit-tam", "sit-tam-Tibt, Deva", translit = { Deva = "Deva-translit", }, -- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake. 
} m["tdh"] = { "Thulung", 56553, "sit-kiw", } m["tdi"] = { "Tomadino", 7818197, "poz-btk", "Latn", } m["tdj"] = { "Tajio", 7676870, "poz", "Latn", } m["tdk"] = { "Tambas", 3440392, "cdc-wst", } m["tdl"] = { "Sur", 3914453, "nic-tar", } m["tdm"] = { "Taruma", 5559094, } m["tdn"] = { "Tondano", 3531514, "phi", "Latn", } m["tdo"] = { "Teme", 3913994, "alv-mye", } m["tdq"] = { "Tita", 3914899, "nic-bco", } m["tdr"] = { "Todrah", 7812881, "mkh", } m["tds"] = { "Doutai", 5302331, "paa-lkp", "Latn", } m["tdt"] = { "Tetun Dili", 12643484, "poz-tim", "Latn", } m["tdu"] = { "Tempasuk Dusun", 3529155, "poz-san", } m["tdv"] = { "Toro", 3438367, "nic-alu", "Latn", } m["tdy"] = { "Tadyawan", 7674700, "phi", "Latn", } m["tea"] = { "เตอเมียร์", 3914693, "mkh-asl", "Latn", } m["teb"] = { "Tetete", 7706087, "sai-tuc", "Latn", } m["tec"] = { "Terik", 3518379, "sdv-nma", } m["ted"] = { "Tepo Krumen", 11152243, "kro-grb", } m["tee"] = { "Huehuetla Tepehua", 56455, "nai-ttn", "Latn", } m["tef"] = { "Teressa", 3518362, "aav-nic", } m["teg"] = { "Teke-Tege", 36478, "bnt-tek", } m["teh"] = { "Tehuelche", 33930, "sai-cho", "Latn", } m["tei"] = { "Torricelli", 3450788, "paa-tor", "Latn", } m["tek"] = { "Ibali Teke", 2802914, "bnt-tek", } m["tem"] = { "Temne", 36613, "alv-mel", "Latn", } m["ten"] = { "Tama (Colombia)", 3832969, "sai-tuc", "Latn", } m["teo"] = { "Ateso", 29474, "sdv-ttu", "Latn", } m["tep"] = { "Tepecano", 3915525, "azc-pim", "Latn", } m["teq"] = { "Temein", 7698064, "sdv", } m["ter"] = { "Tereno", 3314742, "awd", "Latn", } m["tes"] = { "Tengger", 12473479, "poz", "Latn, Java", } m["tet"] = { "เตตุน", 34125, "poz-tim", "Latn", } m["teu"] = { "Soo", 3437607, "ssa-klk", } m["tev"] = { "Teor", 12953198, "poz-cma", } m["tew"] = { "Tewa", 56492, "nai-kta", "Latn", } m["tex"] = { "Tennet", 56346, "sdv", } m["tey"] = { "Tulishi", 12911106, "qfa-kad", "Latn", } m["tez"] = { "Tetserret", 7706841, "ber", "Latn", } m["tfi"] = { "Tofin Gbe", 3530330, "alv-pph", } m["tfn"] = { "Dena'ina", 
27785, "ath-nor", "Latn", } m["tfo"] = { "Tefaro", 7694618, "paa-egb", "Latn", } m["tfr"] = { "Teribe", 36533, "cba", "Latn", } m["tft"] = { "เตอร์นาเต", 3518492, "paa-nha", "Latn, Arab", } m["tga"] = { "Sagalla", 12953082, "bnt-cht", } m["tgb"] = { "Tobilung", 12953913, "poz-san", "Latn", } m["tgc"] = { "Tigak", 3528276, "poz-ocw", "Latn", } m["tgd"] = { "Ciwogai", 3438799, "cdc-wst", "Latn", } m["tge"] = { "Eastern Gorkha Tamang", 12953175, "sit-tam", "sit-tam-Tibt, Deva", translit = { Deva = "Deva-translit", }, -- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake. } m["tgf"] = { "Chali", 3695197, "sit-ebo", "Tibt, Latn", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["tgh"] = { "Tobagonian Creole English", 7811541, "crp", ancestors = "en", } m["tgi"] = { "Lawunuia", 3219937, "poz-ocw", } m["tgn"] = { "ตันดากาโนน", 63311769, "phi", "Latn", } m["tgo"] = { "Sudest", 7675351, "poz-ocw", } m["tgp"] = { "Tangoa", 2410276, "poz-vnn", "Latn", } m["tgq"] = { "Tring", 7842360, "poz-swa", } m["tgr"] = { "Tareng", 25559541, "mkh", } m["tgs"] = { "Nume", 3346290, "poz-vnn", "Latn", } m["tgt"] = { "Central Tagbanwa", 3915515, "phi", "Tagb", } m["tgu"] = { "Tanggu", 7682930, "paa-ram", "Latn", } m["tgv"] = { "Tingui-Boto", 7808195, "sai-mje", "Latn", } m["tgw"] = { "Tagwana Senoufo", 36514, "alv-tdj", } m["tgx"] = { "Tagish", 28064, "ath-nor", "Latn", } m["tgy"] = { "Togoyo", 36825, "nic-ser", } m["thc"] = { "Tai Hang Tong", 7675753, "tai-nor", } m["thd"] = { "Kuuk Thaayorre", 6448718, "aus-pmn", "Latn", } m["the"] = { "Chitwania Tharu", 22083804, "inc-tha", "Deva", } m["thf"] = { "Thangmi", 7710314, "sit-new", } m["thh"] = { "Northern Tarahumara", 15616395, "azc-trc", "Latn", } m["thi"] = { "Tai Long", 25559562, "tai-swe", } m["thk"] = { "Tharaka", 15407179, "bnt-kka", } m["thl"] 
= { "Dangaura Tharu", 22083815, "inc-tha", "Deva", } m["thm"] = { "ทะวืง", 34780, "mkh-vie", "Thai", --Laoo is feasible but no evidence yet. --sort_key = "Thai-sortkey", } m["thn"] = { "Thachanadan", 7708880, "dra-mal", } m["thp"] = { "Thompson", 1755054, "sal", "Latn, Dupl", } m["thq"] = { "Kochila Tharu", 22083826, "inc-tha", } m["thr"] = { "Rana Tharu", 12953920, "inc-tha", "Deva", } m["ths"] = { "Thakali", 7709348, "sit-tam", } m["tht"] = { "Tahltan", 30125, "ath-nor", "Latn", } m["thu"] = { "Thuri", 7799291, "sdv-lon", } m["thy"] = { "Tha", 3915849, "alv-bwj", } m["tic"] = { "Tira", 36677, "alv-hei", } m["tif"] = { "Tifal", 11732691, "ngf-okk", "Latn", } m["tig"] = { "ทือเกร", 34129, "sem-eth", "Ethi", translit = "Ethi-translit", } m["tih"] = { "Timugon Murut", 7807680, "poz-san", "Latn", } m["tii"] = { "Tiene", 36469, "bnt-tek", } m["tij"] = { "Tilung", 7803037, "sit-kiw", } m["tik"] = { "Tikar", 36483, "nic-bdn", "Latn", } m["til"] = { "Tillamook", 2109432, "sal", "Latn", } m["tim"] = { "Timbe", 7804599, "ngf-huo", "Latn", } m["tin"] = { "Tindi", 36860, "cau-and", "Cyrl", display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], } m["tio"] = { "Teop", 3518239, "poz-ocw", "Latn", } m["tip"] = { "Trimuris", 7842270, "paa-tkw", } m["tiq"] = { "Tiéfo", 3914874, "alv-sav", } m["tis"] = { "Masadiit Itneg", 18748769, "phi", } m["tit"] = { "Tinigua", 3029805, } m["tiu"] = { "Adasen", 11214797, "phi", } m["tiv"] = { "Tiv", 34131, "nic-tvc", "Latn", } m["tiw"] = { "Tiwi", 1656014, "qfa-iso", "Latn", } m["tix"] = { "Southern Tiwa", 7570552, "nai-kta", "Latn", } m["tiy"] = { "ตีรูไร", 7809425, "phi", "Latn", } m["tiz"] = { "Tai Hongjin", 3915716, "tai-swe", } m["tja"] = { "Tajuasohn", 3915326, "kro-wkr", } m["tjg"] = { "Tunjung", 3542117, "poz", "Latn", } m["tji"] = { "Northern Tujia", 12953229, "sit-tja", "Latn", } m["tjl"] = { "ไทแหล่ง", 7675773, "tai-swe", "Mymr", } m["tjm"] = { "Timucua", 638300, "qfa-iso", "Latn", } m["tjn"] = { 
"Tonjon", 3913372, "dmn-jje", } m["tjs"] = { "Southern Tujia", 12633994, "sit-tja", "Latn", } m["tju"] = { "Tjurruru", 3913834, "aus-nga", "Latn", } m["tjw"] = { "Chaap Wuurong", 5285187, "aus-pam", "Latn", } m["tka"] = { "Truká", 7847648, } m["tkb"] = { "Buksa", 20983638, "inc-eas", "Deva", } m["tkd"] = { "Tukudede", 36863, "poz-tim", "Latn", } m["tke"] = { "Takwane", 11030092, "bnt-mak", "Latn", ancestors = "vmw", } m["tkf"] = { "Tukumanféd", 42330115, "tup-gua", "Latn", } m["tkl"] = { "Tokelauan", 34097, "poz-pnp", "Latn", } m["tkm"] = { "Takelma", 56710, } m["tkn"] = { "โทกูโนชิมะ", 3530484, "jpx-nry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["tkp"] = { "Tikopia", 36682, "poz-pnp", "Latn", } m["tkq"] = { "Tee", 3075144, "nic-ogo", "Latn", } m["tkr"] = { "Tsakhur", 36853, "cau-wsm", "Cyrl, Latn, Arab", translit = "tkr-translit", override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, } m["tks"] = { "Ramandi", 25261947, "xme-ttc", "Arab", ancestors = "xme-ttc-sou", } m["tkt"] = { "Kathoriya Tharu", 22083822, "inc-tha", } m["tku"] = { "Upper Necaxa Totonac", 56343, "nai-ttn", "Latn", } m["tkv"] = { "Mur Pano", 16939373, "poz-ocw", "Latn", } m["tkw"] = { "Teanu", 3516731, "poz-tem", "Latn", } m["tkx"] = { "Tangko", 7682993, "ngf-okk", "Latn", } m["tkz"] = { "Takua", 7678544, "mkh", } m["tla"] = { "Southwestern Tepehuan", 3518245, "azc-pim", "Latn", } m["tlb"] = { "Tobelo", 1142333, "paa-nha", "Latn", } m["tlc"] = { "Misantla Totonac", 56460, "nai-ttn", "Latn", } m["tld"] = { "Talaud", 7678964, "phi", "Latn", } m["tlf"] = { "Telefol", 7696150, "ngf-okk", "Latn", } m["tlg"] = { "Tofanma", 4461493, "paa-pau", } m["tlh"] = { "Klingon", 10134, "art", "Latn", type = "appendix-constructed", } m["tli"] = { "Tlingit", 27792, "xnd", "Latn, 
Cyrl", } m["tlj"] = { "Talinga-Bwisi", 7679530, "bnt-haj", } m["tlk"] = { "Taloki", 3514563, "poz-btk", } m["tll"] = { "Tetela", 2613465, "bnt-tet", } m["tlm"] = { "Tolomako", 3130514, "poz-vnn", "Latn", } m["tln"] = { "Talondo'", 7680293, "poz-ssw", } m["tlo"] = { "Talodi", 36525, "alv-tal", } m["tlp"] = { "Filomena Mata-Coahuitlán Totonac", 5449202, "nai-ttn", "Latn", } m["tlq"] = { "Tai Loi", 7675784, "mkh-pal", } m["tlr"] = { "Talise", 3514510, "poz-sls", "Latn", } m["tls"] = { "Tambotalo", 7681065, "poz-vnn", "Latn", } m["tlt"] = { "Teluti", 12953194, "poz-cma", } m["tlu"] = { "Tulehu", 7852006, "poz-cma", } m["tlv"] = { "Taliabu", 3514498, "poz-cma", "Latn", } m["tlx"] = { "Khehek", 3196124, "poz-aay", } m["tly"] = { "Talysh", 34318, "xme-ttc", "Latn, Cyrl, fa-Arab", } m["tma"] = { "Tama (Chad)", 57001, "sdv-tmn", } m["tmb"] = { "Avava", 2157461, "poz-vnc", "Latn", } m["tmc"] = { "Tumak", 3121045, "cdc-est", } m["tmd"] = { "Haruai", 12632146, "paa-pia", "Latn", } m["tme"] = { "Tremembé", 5246937, } m["tmf"] = { "Toba-Maskoy", 3033544, "sai-mas", "Latn", } m["tmg"] = { "Ternateño", 7232597, } m["tmh"] = { "Tuareg", 34065, "ber", "Latn, Tfng, Arab", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}, }, } m["tmi"] = { "Tutuba", 7857052, "poz-vnn", "Latn", } m["tmj"] = { "Samarokena", 7408865, "paa-tkw", } m["tmk"] = { "Northwestern Tamang", 15616509, "sit-tam", "sit-tam-Tibt, Deva", translit = { Deva = "Deva-translit", }, -- sit-tam-Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: Formerly there was no sort_key or translit specified; I assume that's a mistake. 
} m["tml"] = { "Tamnim Citak", 12643315, "ngf-ask", "Latn", } m["tmm"] = { "Tai Thanh", 7675842, "tai-swe", } m["tmn"] = { "Taman (Indonesia)", 7680671, "poz", "Latn", } m["tmo"] = { "Temoq", 7698205, "mkh-asl", } m["tmq"] = { "Tumleo", 7852641, "poz-ocw", } m["tms"] = { "Tima", 36684, "nic-ktl", } m["tmt"] = { "Tasmate", 7687571, "poz-vnn", "Latn", } m["tmu"] = { "Iau", 56867, "paa-lkp", "Latn", } m["tmv"] = { "Motembo", 11013108, "bnt-bun", } m["tmy"] = { "Tami", 3514812, "poz-oce", } m["tmz"] = { "Tamanaku", 3441435, "sai-ven", "Latn", } m["tna"] = { "Tacana", 3182551, "sai-tac", "Latn", } m["tnb"] = { "Western Tunebo", 3181238, "cba", } m["tnc"] = { "Tanimuca-Retuarã", 36535, "sai-tuc", "Latn", } m["tnd"] = { "Angosturas Tunebo", 25559604, "cba", } m["tne"] = { "Tinoc Kallahan", 3192219, } m["tng"] = { "Tobanga", 3440501, "cdc-est", } m["tnh"] = { "Maiani", 6735243, "ngf-kau", "Latn", } m["tni"] = { "Tandia", 7682454, "poz-hce", "Latn", } m["tnk"] = { "Kwamera", 3200806, "poz-vns", "Latn", } m["tnl"] = { "Lenakel", 3229429, "poz-vns", "Latn", } m["tnm"] = { "Tabla", 7673105, "paa-sen", "Latn", } m["tnn"] = { "North Tanna", 957945, "poz-vns", "Latn", } m["tno"] = { "Toromono", 510544, "sai-tac", "Latn", } m["tnp"] = { "Whitesands", 3063761, "poz-vns", "Latn", } m["tnq"] = { "ตาอีโน", 5232952, "awd-taa", "Latn", } m["tnr"] = { "Bedik", 35096, "alv-ten", "Latn", } m["tns"] = { "Tenis", 7699870, "poz-stm", "Latn", } m["tnt"] = { "Tontemboan", 3531666, "phi", "Latn", } m["tnu"] = { "Tay Khang", 6362363, "tai", } m["tnv"] = { "Tanchangya", 7682361, "inc-bas", "Cakm", ancestors = "inc-obn", } m["tnw"] = { "Tonsawang", 3531660, "phi", "Latn", } m["tnx"] = { "Tanema", 2106984, "poz-tem", "Latn", } m["tny"] = { "Tongwe", 7821200, "bnt", } m["tnz"] = { "Ten'edn", 3073453, "mkh-asl", "Latn", } m["tob"] = { "Toba", 3113756, "sai-guc", "Latn", } m["toc"] = { "Coyutla Totonac", 15615591, "nai-ttn", "Latn", } m["tod"] = { "Toma", 11055484, "dmn-msw", "Latn, Loma" } m["tof"] = 
{ "Gizrra", 5565941, "paa-etf", "Latn", } m["tog"] = { "Tonga (Malawi)", 3847648, "bnt-nys", "Latn", } m["toh"] = { "Tonga (Mozambique)", 7820988, "bnt-bso", } m["toi"] = { "Tonga (Zambia)", 34101, "bnt-bot", "Latn", } m["toj"] = { "Tojolabal", 36762, "myn", "Latn", } m["tok"] = { "Toki Pona", 36846, "art", "Latn", type = "appendix-constructed", } m["tol"] = { "Tolowa", 20827, "ath-pco", "Latn", } m["tom"] = { "Tombulu", 3531199, "phi", "Latn", } m["too"] = { "Xicotepec de Juárez Totonac", 8044353, "nai-ttn", "Latn", } m["top"] = { "Papantla Totonac", 56329, "nai-ttn", "Latn", } m["toq"] = { "Toposa", 3033588, "sdv-ttu", } m["tor"] = { "Togbo-Vara Banda", 11002922, "bad-cnt", } m["tos"] = { "Highland Totonac", 13154149, "nai-ttn", "Latn", } m["tou"] = { "Tho", 22694631, "mkh-vie", "Latn", } m["tov"] = { "Upper Taromi", 12953183, "xme-ttc", ancestors = "xme-ttc-cen", } m["tow"] = { "Jemez", 3912876, "nai-kta", "Latn", } m["tox"] = { "Tobian", 34022, "poz-mic", } m["toy"] = { "Topoiyo", 7824977, "poz-kal", } m["toz"] = { "To", 7811216, "alv-mbm", } m["tpa"] = { "Taupota", 7688832, "poz-ocw", } m["tpc"] = { "Azoyú Me'phaa", 25559730, "omq", "Latn", } m["tpe"] = { "Tippera", 16115423, "tbq-bdg", } m["tpf"] = { "Tarpia", 12953185, "poz-ocw", "Latn", } m["tpg"] = { "Kula", 6442714, "paa-tap", "Latn", } m["tpi"] = { "ตอกปีซิน", 34159, "crp", "Latn", ancestors = "en", } m["tpj"] = { "Tapieté", 3121063, "gn", "Latn", } m["tpk"] = { "Tupinikin", 33924, "tup-gua", } m["tpl"] = { "Tlacoapa Me'phaa", 16115511, "omq", } m["tpm"] = { "Tampulma", 36590, "nic-gnw", } m["tpn"] = { "Tupinambá", 31528147, "tup-gua", "Latn", } m["tpo"] = { "Tai Pao", 7675795, "tai-nor", } m["tpp"] = { "Pisaflores Tepehua", 56349, "nai-ttn", } m["tpq"] = { "Tukpa", 12953230, "sit-las", } m["tpr"] = { "Tuparí", 3542217, "tup", "Latn", } m["tpt"] = { "Tlachichilco Tepehua", 56330, "nai-ttn", } m["tpu"] = { "Tampuan", 3514882, "mkh-ban", "Khmr", } m["tpv"] = { "Tanapag", 3397371, "poz-mic", } m["tpw"] = { 
"ตูปีเก่า", 56944, "tup-gua", "Latn", } m["tpx"] = { "Acatepec Me'phaa", 31157882, "omq", "Latn", } m["tpy"] = { "Trumai", 12294279, "qfa-iso", } m["tpz"] = { "Tinputz", 3529205, "poz-ocw", } m["tqb"] = { "Tembé", 10322157, "tup-gua", "Latn", } m["tql"] = { "Lehali", 3229119, "poz-vnn", "Latn", } m["tqm"] = { "Turumsa", 7856508, "paa-dot", "Latn", } m["tqn"] = { "Tenino", 15699255, "nai-shp", "Latn", ancestors = "nai-spt", } m["tqo"] = { "Toaripi", 7811403, "paa-eel", "Latn", } m["tqp"] = { "Tomoip", 3531388, "poz-ocw", } m["tqq"] = { "Tunni", 3514343, "cus-som", } m["tqr"] = { "Torona", 36679, "alv-tal", } m["tqt"] = { "Western Totonac", 7116691, "nai-ttn", "Latn", } m["tqu"] = { "Touo", 56750, } m["tqw"] = { "Tonkawa", 2454881, "qfa-iso", "Latn", } m["tra"] = { "Tirahi", 3812406, "inc-koh", } m["trb"] = { "Terebu", 7701797, "poz-ocw", } m["trc"] = { "Copala Triqui", 12953935, "omq-tri", "Latn", } m["trd"] = { "Turi", 7854914, "mun", } m["tre"] = { "East Tarangan", 18609750, "poz", } m["trf"] = { "Trinidadian Creole English", 7842493, "crp", "Latn", ancestors = "en", } m["trg"] = { "Lishán Didán", 56473, "sem-nna", } m["trh"] = { "Turaka", 12953237, "ngf-dag", "Latn", } m["tri"] = { "Trió", 56885, "sai-tar", "Latn", } m["trj"] = { "Toram", 3441225, "cdc-est", } m["trl"] = { "Traveller Scottish", 3915671, "qfa-mix", "Latn", ancestors = "rom, sco", } m["trm"] = { "Tregami", 34081, "nur-sou", } m["trn"] = { "Trinitario", 3539279, "awd", } m["tro"] = { "Tarao", 3515603, "tbq-kuk", "Latn", } m["trp"] = { "กอกบอรอก", 35947, "tbq-bdg", "Beng, Latn" -- WP lists 2 more } m["trq"] = { "San Martín Itunyoso Triqui", 12953934, "omq-tri", "Latn", } m["trr"] = { "Taushiro", 1957508, nil, "Latn", } m["trs"] = { "Chicahuaxtla Triqui", 3539587, "omq-tri", "Latn", } m["trt"] = { "Tunggare", 615071, "paa-egb", "Latn", } m["tru"] = { "Turoyo", 34040, "sem-cna", "Syrc, Latn", translit = { Syrc = "tru-translit", }, strip_diacritics = { Syrc = "Syrc-stripdiacritics", }, } m["trv"] = { 
"Taroko", 716686, "map-ata", "Latn", } m["trw"] = { "Torwali", 2665246, "inc-koh", "ur-Arab", } m["trx"] = { "Tringgus", 7842365, "day", } m["try"] = { "ตุรุง", 7856514, "tai-swe", "as-Beng", } m["trz"] = { "Torá", 7827518, "sai-cpc", } m["tsa"] = { "Tsaangi", 36675, "bnt-nze", } m["tsb"] = { "Tsamai", 2371358, "cus-eas", } m["tsc"] = { "Tswa", 2085051, "bnt-tsr", } m["tsd"] = { "Tsakonian", 220607, "grk", "Grek", ancestors = "grc-dor", translit = "el-translit", -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["tse"] = { "Tunisian Sign Language", 7853191, "sgn", } m["tsf"] = { "Southwestern Tamang", 12953176, "sit-tam", } m["tsg"] = { "ซูก", 34142, "phi", "Latn, Arab", } m["tsh"] = { "Tsuvan", 3502326, "cdc-cbm", } m["tsi"] = { "Tsimshian", 20085721, "nai-tsi", "Latn", } m["tsj"] = { "Tshangla", 36840, "sit-tsk", "Tibt, Latn, Deva", translit = { Deva = "Deva-translit", }, override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["tsl"] = { "Ts'ün-Lao", 3446675, "tai", } m["tsm"] = { "Turkish Sign Language", 36885, "sgn", } m["tsp"] = { "Northern Toussian", 11155635, "alv-sav", } m["tsq"] = { "มือไทย", 7709156, "sgn", "Sgnw", } m["tsr"] = { "Akei", 2828964, "poz-vnn", "Latn", } m["tss"] = { "Taiwan Sign Language", 34019, "sgn-jsl", } m["tsu"] = { "โจว", 716681, "map", "Latn", } m["tsv"] = { "Tsogo", 36674, "bnt-tso", } m["tsw"] = { "Tsishingini", 13123571, "nic-kam", } m["tsx"] = { "Mubami", 6930815, "paa-ani", "Latn", } m["tsy"] = { "Tebul Sign Language", 7692090, "sgn", } m["tta"] = { "Tutelo", 2311602, "sio-ohv", "Latn", } m["ttb"] = { "Gaa", 3438361, "nic-dak", "Latn", } m["ttc"] = { "Tektiteko", 36686, "myn", "Latn", } m["ttd"] = { "Tauade", 7688634, "qfa-dis", -- Papuan; isolate per Glottolog; Glottolog says "A Goilalan family uniting Kunimaipan, Tauade and Fuyug -- is often posited based on the lexicostatistical figures reported in Tom E. 
Dutton 1975: 631-632" but -- goes on to say the data "is clearly insufficient, as the lexical links so far proposed are few and -- show irregular one-consonant correspondences". "Latn", } m["tte"] = { "Bwanabwana", 5003667, "poz-ocw", "Latn", } m["ttf"] = { "Tuotomb", 7853459, "nic-mbw", "Latn", } m["ttg"] = { "Tutong", 3507990, "poz-swa", "Latn", } m["tth"] = { "Upper Ta'oih", 3512660, "mkh-kat", } m["tti"] = { "Tobati", 7811556, "poz-ocw", "Latn", } m["ttj"] = { "Tooro", 7824218, "bnt-nyg", "Latn", } m["ttk"] = { "Totoro", 3532756, "sai-bar", "Latn", } m["ttl"] = { "Totela", 10962316, "bnt-bot", "Latn", } m["ttm"] = { "Northern Tutchone", 20822, "ath-nor", "Latn", } m["ttn"] = { "Towei", 7829606, "paa-pau", } m["tto"] = { "Lower Ta'oih", 25559539, "mkh-kat", } m["ttp"] = { "Tombelala", 6799663, "poz-kal", } m["ttr"] = { "Tera", 56267, "cdc-cbm", } m["tts"] = { "อีสาน", 33417, "tai-swe", "Thai", -- also Tai Noi/Lao Buhan script --sort_key = "Thai-sortkey", } m["ttt"] = { "Tat", 56489, "ira-swi", "Cyrl, Latn, Armn, fa-Arab", -- Armn translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission) ancestors = "fa", } m["ttu"] = { "Torau", 3532208, "poz-ocw", } m["ttv"] = { "Titan", 3445811, "poz-aay", "Latn", } m["ttw"] = { "Long Wat", 7856961, "poz-swa", } m["tty"] = { "Sikaritai", 7513600, "paa-lkp", "Latn", } m["ttz"] = { "Tsum", 12953223, "sit-kyk", } m["tua"] = { "Wiarumus", 7998045, "paa-tor", "Latn", } m["tub"] = { "Tübatulabal", 56704, "azc", "Latn", } m["tuc"] = { "Mutu", 3331003, "poz-ocw", "Latn", } m["tud"] = { "Tuxá", 7857217, } m["tue"] = { "Tuyuca", 2520538, "sai-tuc", "Latn", } m["tuf"] = { "Central Tunebo", 12953942, "cba", "Latn", } m["tug"] = { "Tunia", 863721, "alv-bua", } m["tuh"] = { "Taulil", 3516141, } m["tui"] = { "Tupuri", 36646, "alv-mbm", "Latn", } m["tuj"] = { "Tugutil", 12953228, "paa-nha", "Latn", } m["tul"] = { "Tula", 3914907, "alv-wjk", } m["tum"] = { "Tumbuka", 34138, "bnt-nys", "Latn", } m["tun"] = 
{ "Tunica", 56619, "qfa-iso", "Latn", } m["tuo"] = { "Tucano", 3541834, "sai-tuc", "Latn", } m["tuq"] = { "Tedaga", 36639, "ssa-sah", "Latn", } m["tus"] = { "Tuscarora", 36944, "iro-nor", "Latn", } m["tuu"] = { "Tututni", 20627, "ath-pco", "Latn", } m["tuv"] = { "Turkana", 36958, "sdv-ttu", "Latn", } m["tux"] = { "Tuxináwa", 7857204, "sai-pan", "Latn", } m["tuy"] = { "Tugen", 3541935, "sdv-nma", } m["tuz"] = { "Turka", 36643, "nic-gur", "Latn", } m["tva"] = { "Vaghua", 3553248, "poz-ocw", "Latn", } m["tvd"] = { "Tsuvadi", 3914936, "nic-kam", } m["tve"] = { "Te'un", 7690709, "poz-cet", "Latn", } m["tvk"] = { "Southeast Ambrym", 252411, "poz-vnc", "Latn", } m["tvl"] = { "ตูวาลู", 34055, "poz-pnp", "Latn", } m["tvm"] = { "Tela-Masbuar", 7695666, "poz-tim", } m["tvn"] = { "Tavoyan", 7689158, "tbq-brm", "Mymr", ancestors = "obr", } m["tvo"] = { "ตีโดเร", 3528199, "paa-nha", "Latn, Arab", } m["tvs"] = { "Taveta", 15632387, "bnt-par", "Latn", } m["tvt"] = { "Tutsa Naga", 7856987, "sit-tno", } m["tvu"] = { "Tunen", 36632, "nic-mbw", } m["tvw"] = { "Sedoa", 7445362, "poz-kal", } m["tvx"] = { "Taivoan", 1975271, "map", "Latn", } m["tvy"] = { "Timor Pidgin", 4904029, "crp", ancestors = "pt", } m["twa"] = { "Twana", 7857412, "sal", "Latn", } m["twb"] = { "Western Tawbuid", 12953912, "phi", } m["twc"] = { "Teshenawa", 3436597, "phi", } m["twe"] = { "Teiwa", 3519302, "paa-tap", "Latn", } m["twf"] = { "เทาส์", 7684320, "nai-kta", "Latn", } m["twg"] = { "Tereweng", 12953200, "paa-tap", "Latn", } m["twh"] = { "ไทขาว", 7675751, "tai-swe", "Tavt", --translit = "Tavt-translit", sort_key = { from = {"[꪿ꫀ꫁ꫂ]", "([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])"}, to = {"", "%2%1"} }, } m["twm"] = { "Tawang Monpa", 36586, "sit-ebo", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["twn"] = { "Twendi", 7857682, "nic-mmb", } m["two"] = { "Tswapong", 3446241, "bnt-sts", } m["twp"] = { "Ere", 3056045, "poz-aay", "Latn", } m["twq"] = { "Tasawaq", 
36564, "son", } m["twr"] = { "Southwestern Tarahumara", 12953909, "azc-trc", "Latn", } m["twt"] = { "Turiwára", 3542307, "tup-gua", "Latn", } m["twu"] = { "Termanu", 7702572, "poz-tim", "Latn", } m["tww"] = { "Tuwari", 7857159, "paa-wal", "Latn", } m["twy"] = { "Tawoyan", 3513542, "poz-bre", "Latn", } m["txa"] = { "Tombonuo", 7818692, "poz-san", "Latn", } m["txb"] = { "โทแคเรียนบี", 3199353, "ine-toc", "Latn", standard_chars = "AaÄäĀāCcEeIiKkLlMmṂṃNnṄṅÑñOoPpRrSsŚśṢṣTtUuWwYy" .. c.punc, } m["txc"] = { "Tsetsaut", 20829, "ath-nor", "Latn", } m["txe"] = { "Totoli", 7828387, "poz-tot", "Latn", } m["txg"] = { "ตังกุต", 2727930, "ero", "Tang", -- Tang translit in [[Module:scripts/data]] } m["txh"] = { "Thracian", 36793, "ine", "Latn, Polyt", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["txi"] = { "Ikpeng", 9344891, "sai-pek", "Latn", } m["txj"] = { "Tarjumo", 24906088, "ssa-sah", "Latn, Arab", } m["txm"] = { "Tomini", 7818911, "poz", "Latn", } m["txn"] = { "West Tarangan", 3515594, "poz", "Latn", } m["txo"] = { "Toto", 36709, "sit-dhi", "Beng, Toto" } m["txq"] = { "Tii", 7801784, "poz-tim", } m["txr"] = { "Tartessian", 36795, "qfa-unc", -- extinct, no consensus on classification } m["txs"] = { "Tonsea", 3531659, "phi", "Latn", } m["txt"] = { "Citak", 3447279, "ngf-ask", "Latn", } m["txu"] = { "Kayapó", 3101212, "sai-nje", "Latn", } m["txx"] = { "Tatana", 18643518, "poz-san", "Latn", } m["tya"] = { "Tauya", 7688978, "ngf-rai", "Latn", } m["tye"] = { "Kyenga", 3913304, "dmn-bbu", "Latn", } m["tyh"] = { "O'du", 3347428, "mkh", } m["tyi"] = { "Teke-Tsaayi", 33123613, "bnt-nze", } m["tyj"] = { "ไทแมน", -- Thai word for this 7675746, "tai-nor", -- Chamberlain (1991), but Pittayaporn (2009) suggests tai-swe "Latn, Tayo", -- Vietnam } m["tyl"] = { "Thu Lao", 12953921, "tai-cen", } m["tyn"] = { "Kombai", 6428241, "ngf-gaw", "Latn", } m["typ"] = { "Kuku-Thaypan", 3915693, "aus-pmn", "Latn", } m["tyr"] = { "ไทแดง", 3915207, "tai-swe", 
"Tavt", } m["tys"] = { "ซาปา", 3446668, "tai-sap", "Latn", } m["tyt"] = { "Tày Tac", 7862029, "tai-swe", } m["tyu"] = { "Kua", 3832933, "khi-kal", } m["tyv"] = { "ตูวา", 34119, "trk-ssb", "Cyrl", translit = "tyv-translit", override_translit = true, sort_key = "tyv-sortkey", } m["tyx"] = { "Teke-Tyee", 36634, "bnt-nze", "Latn", } m["tyz"] = { "ตั่ย", -- This does not mean its family "Tai" languages. 2511476, "tai-tay", "Latn, Hani", sort_key = { Hani = "Hani-sortkey" }, } m["tza"] = { "Tanzanian Sign Language", 7684177, "sgn", } m["tzh"] = { "Tzeltal", 36808, "myn", "Latn", } m["tzj"] = { "Tz'utujil", 36941, "myn", "Latn", } m["tzl"] = { "Talossan", 1063911, "art", "Latn", type = "appendix-constructed", sort_key = "tzl-sortkey", } m["tzm"] = { "Central Atlas Tamazight", 49741, "ber", "Tfng, Arab, Latn", translit = { Tfng = "Tfng-translit", }, } m["tzn"] = { "Tugun", 12953225, "poz-tim", "Latn", } m["tzo"] = { "โซตซิล", 36809, "myn", "Latn", } m["tzx"] = { "Tabriak", 56872, "paa-lsp", "Latn", } return require("Module:languages").finalizeData(m, "language") asseyub07jy456q835v2ykghmnlaha4 มอดูล:languages/data/3/s 828 36368 5720765 5719153 2026-04-21T07:01:14Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720765 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["saa"] = { "Saba", 3914885, "cdc-est", "Latn", } m["sab"] = { "Buglere", 3368506, "cba", "Latn", } m["sac"] = { "Fox", 12714767, "alg-sfk", "Latn", } m["sad"] = { "Sandawe", 34016, "qfa-iso", "Latn", } m["sae"] = { "Sabanê", 3460478, "sai-nmk", "Latn", } m["saf"] = { "Safaliba", 36432, "nic-mre", "Latn", } m["sah"] = { "ซาคา", 34299, "trk-nsb", "Cyrl", translit = "sah-translit", override_translit = true, } m["saj"] = { "Sahu", 7399757, "paa-nha", "Latn", } m["sak"] = { "Sake", 36425, "bnt-kel", "Latn", } m["sam"] = { "Samaritan Aramaic", 56612, "sem-arw", "Samr", translit = "Samr-translit", -- Samr strip_diacritics, sort_key in [[Module:scripts/data]] } m["sao"] = { "Sause", 4409155, "paa-tkw", "Latn", } m["saq"] = { "Samburu", 56536, "sdv-lma", } m["sar"] = { "Saraveca", 3450556, "awd", "Latn", } m["sas"] = { "Sasak", 1294047, "poz-bss", "Latn, Bali, Java", } m["sat"] = { "สันถาลี", 33965, "mun", "Olck", translit = "Olck-translit", override_translit = true, } m["sau"] = { "Saleman", 7404262, "poz-cet", } m["sav"] = { "Saafi-Saafi", 36308, "alv-cng", "Arab, Latn", } m["saw"] = { "Sawi", 677064, "ngf-gaw", "Latn", } m["sax"] = { "Sa", 3460352, "poz-vnn", "Latn", } m["say"] = { "Saya", 3914431, "cdc-wst", "Latn", } m["saz"] = { "Saurashtra", 13292, "inc-wes", "Saur, Latn, Taml, Deva", translit = { Taml = "Taml-translit", Deva = "Deva-translit", }, ancestors = "inc-ogu", } m["sba"] = { "Ngambay", 2372207, "csu-sar", "Latn", } m["sbb"] = { "Simbo", 3484101, "poz-ocw", } m["sbc"] = { "Gele'", 3194847, "poz-aay", "Latn", } m["sbd"] = { "Southern Samo", 33122730, "dmn-sam", "Latn", } m["sbe"] = { "Saliba (New Guinea)", 3469737, "poz-ocw", } m["sbf"] = { "Shabo", 36342, "ssa", "Latn", } m["sbg"] = { "Seget", 7446237, "paa-wbh", "Latn", } m["sbh"] = { "Sori-Harengan", 36515, "poz-aay", "Latn", } m["sbi"] = { "Seti", 7456682, "paa-tor", "Latn", } m["sbj"] = { "Surbakhal", 
759995, } m["sbk"] = { "Safwa", 4121160, "bnt-mby", "Latn", } m["sbl"] = { "Botolan Sambal", 4095195, "phi", "Latn", } m["sbm"] = { "Sagala", 11732610, "bnt-ruv", "Latn", } m["sbn"] = { "Sindhi Bhil", 25559289, "inc-snd", "Arab, Deva, Sind, Guru", translit = { Deva = "Deva-translit", Sind = "Sind-translit", Guru = "Guru-translit", }, ancestors = "sd", } m["sbo"] = { "Sabüm", 7396535, "mkh-asl", } m["sbp"] = { "Sangu (Tanzania)", 7418149, "bnt-bki", "Latn", } m["sbq"] = { "Sileibi", 7514337, "ngf-nso", "Latn", } m["sbr"] = { "Sembakung Murut", 7449148, "poz-san", } m["sbs"] = { "Subiya", 6442073, "bnt-bot", "Latn", } m["sbt"] = { "Kimki", 6410160, "paa-pau", } m["sbu"] = { "Stod Bhoti", 15622700, "sit-las", } m["sbv"] = { "Sabine", 65455885, "itc-sbl", "Latn", display_text = s["itc-Latn-displaytext"], strip_diacritics = s["itc-Latn-stripdiacritics"], sort_key = s["itc-Latn-sortkey"], } m["sbw"] = { "Simba", 36430, "bnt-tso", "Latn", } m["sbx"] = { "Seberuang", 12473470, "poz-mly", } m["sby"] = { "Soli", 7557754, "bnt-bot", "Latn", } m["sbz"] = { "Sara Kaba", 25559318, "csu-kab", "Latn", } m["scb"] = { "Chut", 2967709, "mkh-vie", "Latn", } m["sce"] = { "ตงเซียง", 32947, "xgn-shr", "Arab, Latn", } m["scf"] = { "San Miguel Creole French", 12953094, "crp", "Latn", ancestors = "gcf", sort_key = s["roa-oil-sortkey"], } m["scg"] = { "Sanggau", 12473466, "day", } m["sch"] = { "Sakachep", 37054, "tbq-kuk", } m["sci"] = { "Sri Lankan Creole Malay", 1089151, "crp", "Latn", ancestors = "ms", } m["sck"] = { "Sadri", 765922, "inc-bih", "Deva, Kthi", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, } m["scl"] = { "Shina", 1353320, "inc-shn", "ur-Arab, Deva", translit = { Deva = "Deva-translit", }, } m["scn"] = { "ซิซิลี", 33973, "roa-itr", "Latn", } m["sco"] = { "สกอต", 14549, "gmw-ang", "Latn", ancestors = "gmw-msc", } m["scp"] = { "Yolmo", 22662107, "sit-kyk", "Deva", translit = { Deva = "Deva-translit", }, } m["scq"] = { "สะโอจ", 6583617, "mkh-pea", } m["scs"] = 
{ "North Slavey", 20628, "den", "Latn", } m["scu"] = { "Shumcho", 22077739, "sit-kin", } m["scv"] = { "Sheni", 11015820, "nic-jer", "Latn", ancestors = "zir", } m["scw"] = { "Sha", 3438816, "cdc-wst", "Latn", } m["scx"] = { "Sicel", 36667, "itc", "Polyt", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["scz"] = { "Shetland", 3069598, "qfa-mix", "Latn", ancestors = "nrn, gmw-msc", standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZzØøÖöÜü0123456789" .. c.punc, } m["sda"] = { "Toraja-Sa'dan", 36673, "poz-ssw", "Latn", } m["sdb"] = { "Shabak", 3289596, "ira-zgr", ancestors = "hac", } m["sdc"] = { "ซัสซารี", 845441, "roa-itr", "Latn", } m["sde"] = { "Surubu", 3913336, "nic-kau", "Latn", } m["sdf"] = { "Sarli", 7424256, "ira-zgr", ancestors = "hac", } m["sdg"] = { "Savi", 3474654, "inc-dng", } m["sdh"] = { "Southern Kurdish", 1496597, "ku", "ku-Arab", translit = "sdh-translit", strip_diacritics = {remove_diacritics = c.kasra .. c.sukun}, } m["sdj"] = { "Suundi", 7650407, "bnt-kng", "Latn", } m["sdk"] = { "Sos Kundi", 7563811, "paa-ndu", "Latn", } m["sdl"] = { "Saudi Arabian Sign Language", 3504160, "sgn", } m["sdm"] = { "Semandang", 7449012, "day", } m["sdn"] = { "Gallurese", 612220, "roa-itr", "Latn", ancestors = "co", } m["sdo"] = { "Bukar-Sadung Bidayuh", 2927799, "day", "Latn", } m["sdp"] = { "Sherdukpen", 7494785, "sit-khm", } m["sdr"] = { "Oraon Sadri", 12953860, "inc-bih", } m["sds"] = { "เบอร์เบอร์แบบตูนีเซีย", 5329732, "ber", } m["sdu"] = { "Sarudu", 7424700, "poz-cet", } m["sdx"] = { "Sibu Melanau", 18642842, "poz-bnn", } m["sea"] = { "เซอไม", 3135426, "mkh-asl", "Latn", } -- seb is a duplicate code of spp m["sec"] = { "Sechelt", 7442898, "sal", "Latn", } m["sed"] = { "Sedang", 56448, "mkh-nbn", "Latn", } m["see"] = { "Seneca", 1185133, "iro-nor", "Latn", } m["sef"] = { "Cebaara Senoufo", 10975121, "alv-snr", "Latn", } m["seg"] = { "Segeju", 17584599, "bnt-mij", "Latn", } m["seh"] = { "Sena", 
2964008, "bnt-sna", "Latn", } m["sei"] = { "Seri", 36583, "qfa-iso", "Latn", } m["sej"] = { "Sene", 7450252, "ngf-huo", "Latn", } m["sek"] = { "Sekani", 28562, "ath-nor", "Latn", } m["sen"] = { "Nanerigé Sénoufo", 36002, "alv-sma", } m["seo"] = { "Suarmin", 7630513, "qfa-dis", -- Papuan; isolate or unclassified in Glottolog; Sepik language in Foley (2018) "Latn", } m["sep"] = { "Sìcìté Sénoufo", 56787, "alv-sma", } m["seq"] = { "Senara Sénoufo", 35210, "alv-snr", } m["ser"] = { "Serrano", 3479942, "azc-tak", "Latn", } m["ses"] = { "Koyraboro Senni", 35655, "son", "Latn", } m["set"] = { "Sentani", 3441672, "paa-sen", "Latn", } m["seu"] = { "Serui-Laut", 7455503, "poz-hce", "Latn", } m["sev"] = { "Nyarafolo Senoufo", 36306, "alv-snr", } m["sew"] = { "Sewa Bay", 7458126, "poz-ocw", } m["sey"] = { "Secoya", 3477218, "sai-tuc", "Latn", } m["sez"] = { "Senthang Chin", 7451223, "tbq-kuk", } m["sfb"] = { "French Belgian Sign Language", 3217332, "sgn", } m["sfe"] = { "Eastern Subanun", 63311321, "phi", "Latn", } m["sfm"] = { "Small Flowery Miao", 7542773, "hmn", } m["sfs"] = { "South African Sign Language", 3322093, "sgn", } m["sfw"] = { "Sehwi", 36593, "alv-ctn", "Latn", } m["sga"] = { "ไอริชเก่า", 35308, "cel-gae", "Latn, Ogam", strip_diacritics = {remove_diacritics = c.dotabove .. c.diaer .. "·"}, sort_key = "sga-sortkey", standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíLlMmNnOoÓóPpRrSsTtUuÚú0123456789ᚁᚂᚃᚄᚅᚆᚇᚈᚉᚊᚋᚌᚍᚎᚏᚐᚑᚒᚓᚔ" .. 
c.punc, } m["sgb"] = { "Mag-Anchi Ayta", 4356243, "phi", "Latn", } m["sgc"] = { "Kipsigis", 56339, "sdv-nma", } m["sgd"] = { "ซูรีเกาโนน", 34140, "phi", "Latn", } m["sge"] = { "Segai", 7446180, } m["sgg"] = { "Swiss-German Sign Language", 35150, "sgn", } m["sgh"] = { "Shughni", 34053, "ira-shr", "Latn, Cyrl", translit = "sgh-translit", override_translit = true, } m["sgi"] = { "Suga", 36475, "nic-mmb", "Latn", } m["sgk"] = { "Sangkong", 2945610, "tbq-bis", } m["sgm"] = { "Singa", 7522797, "bnt-lok", "Latn", } m["sgp"] = { "Singpho", 7524158, "sit-jnp", "Latn", } m["sgr"] = { "Sangisari", 3394363, "ira-kms", "Arab", } m["sgs"] = { "Samogitian", 213434, "bat-eas", "Latn", wikimedia_codes = "bat-smg", ancestors = "olt", display_text = "lt-common", strip_diacritics = "lt-common", sort_key = "lt-common", } m["sgt"] = { "Brokpake", 56603, "sit-tib", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["sgu"] = { "Salas", 7403694, "poz-cma", } m["sgw"] = { "Sebat Bet Gurage", 2707343, "sem-eth", "Ethi", } m["sgx"] = { "Sierra Leone Sign Language", 7511448, "sgn", } m["sgy"] = { "Sanglechi", 3472220, "ira-sgi", } m["sgz"] = { "Sursurunga", 36511, "poz-ocw", "Latn", } m["sha"] = { "Shall-Zwall", 3915355, "nic-beo", } m["shb"] = { "Ninam", 3436586, "sai-ynm", "Latn", } m["shc"] = { "Sonde", 7560881, "bnt-pen", "Latn", } m["shd"] = { "Kundal Shahi", 6444265, "inc-shn", "Arab", } m["she"] = { "Sheko", 3183355, "omv-diz", } m["shg"] = { "Shua", 3501092, "khi-kal", "Latn", } m["shh"] = { "Shoshone", 33811, "azc-num", "Latn", } m["shi"] = { "Tashelhit", 34152, "ber", "Latn, Arab, Tfng, Hebr", ancestors = "shi-med", translit = { Tfng = "Tfng-translit", }, strip_diacritics = { Arab = "ar-stripdiacritics", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["shj"] = { "Shatt", 56344, "sdv-daj", } m["shk"] = { "Shilluk", 36486, "sdv-lon", "Latn", } m["shl"] = { "Shendu", 22074616, 
"tbq-kuk", } m["shm"] = { "Shahrudi", 7462280, "xme-ttc", "fa-Arab, Latn", ancestors = "xme-ttc-cen", } m["shn"] = { "ไทใหญ่", 56482, "tai-swe", "Mymr", translit = "shn-translit", sort_key = { from = {"[ၢႃ]", "ဵ", "ႅ", "ႇ", "ႈ", "း", "ႉ", "ႊ"}, to = {"ာ", "ေ", "ႄ", "႒", "႓", "႔", "႕", "႖"} }, } m["sho"] = { "Shanga", 3913931, "dmn-bbu", "Latn", } m["shp"] = { "Shipibo-Conibo", 2671988, "sai-pan", "Latn", } m["shq"] = { "Sala", 10961665, "bnt-bot", "Latn", } m["shr"] = { "Shi", 3481999, "bnt-shh", "Latn", } m["shs"] = { "Shuswap", 3482685, "sal", "Latn", } m["sht"] = { "Shasta", 56396, "nai-shs", "Latn", } m["shu"] = { "Chadian Arabic", 56497, "sem-arb", "Arab", strip_diacritics = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {u(0x0671)}, to = {u(0x0627)} }, } m["shv"] = { "Shehri", 33445, "sem-sar", "Arab, Latn", } m["shw"] = { "Shwai", 36527, "alv-hei", } m["shx"] = { "She", 2605689, "hmn", } m["shy"] = { "Tachawit", 33274, "ber", "Tfng, Arab, Latn", translit = "Tfng-translit", } m["shz"] = { "Syenara Senoufo", 36316, "alv-snr", } m["sia"] = { "Akkala Sami", 35241, "smi", "Cyrl, Latn", translit = "sia-translit", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = "'ˈ"}, } m["sib"] = { "Sebop", 7442799, "poz-swa", "Latn", } m["sid"] = { "ซีดามา", 33786, "cus-hec", "Latn, Ethi", } m["sie"] = { "Simaa", 7517329, "bnt-kav", "Latn", } m["sif"] = { "Siamou", 36252, } m["sig"] = { "Paasaal", 36426, "nic-sis", "Latn", } m["sih"] = { "Sîshëë", 8072753, "poz-cln", "Latn", } m["sii"] = { "Shom Peng", 1039346, "aav", } m["sij"] = { "Numbami", 3346277, "poz-ocw", "Latn", } m["sik"] = { "Sikiana", 3443734, "sai-prk", "Latn", } m["sil"] = { "Tumulung Sisaala", 25383006, "nic-sis", "Latn", } m["sim"] = { "Seim", 7446815, "paa-spk", "Latn", } m["sip"] = { "สิกขิม", 35285, "sit-tib", "Tibt", ancestors = "xct", override_translit = true, -- Tibt 
translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["siq"] = { "Sonia", 7561770, "ngf-bos", "Latn", } m["sir"] = { "Siri", 3438729, "cdc-wst", "Latn", } m["sis"] = { "Siuslaw", 2315424, } m["siu"] = { "Sinagen", 7521655, "paa-tor", "Latn", } m["siv"] = { "Sumariup", 7636966, "paa-spk", "Latn", } m["siw"] = { "Siwai", 7532519, "paa-sbo", "Latn", } m["six"] = { "Sumau", 7637021, "ngf-pek", "Latn", } m["siy"] = { "Sivandi", 13269, "xme", "fa-Arab, Latn", ancestors = "xme-mid", } m["siz"] = { "Siwi", 36814, "ber", "Tfng, Arab, Latn", } m["sja"] = { "Epena", 3055682, "sai-chc", "Latn", } m["sjb"] = { "Sajau Basap", 4684353, "poz-bnn", } m["sjc"] = { "Shaojiang Min", 3431451, "zhx-inm", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["sjd"] = { "ซามีแบบกิลดิน", 33656, "smi", "Cyrl", translit = "sjd-translit", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = "'ˈ"}, } m["sje"] = { "ซามีแบบปีเต", 56314, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = c.macron .. 
"'ˈ"}, sort_key = "sje-sortkey", } m["sjg"] = { "Assangori", 3502255, "sdv-tmn", } m["sjk"] = { "Kemi Sami", 35871, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = "'ˈ"}, } m["sjl"] = { "Miji", 6845470, "sit-hrs", } m["sjm"] = { "Mapun", 3287253, "poz-sbj", "Latn", } m["sjn"] = { "ซินดาริน", 56437, "art", "Latn, Teng", type = "appendix-constructed", } m["sjo"] = { "Xibe", 13223, "tuw-jrc", "sjo-Mong", ancestors = "mnc", } m["sjp"] = { "Surjapuri", 7645351, "inc-krd", "Deva, as-Beng, Kthi", translit = { Deva = "Deva-translit", ["as-Beng"] = "Beng-translit", Kthi = "Kthi-translit", }, } m["sjr"] = { "Siar-Lak", 3482907, "poz-ocw", } m["sjs"] = { "Senhaja de Srair", 56744, "ber", "Latn, Tfng, Arab", strip_diacritics = { Arab = "ar-stripdiacritics", }, translit = { Tfng = "Tfng-translit", } } m["sjt"] = { "Ter Sami", 36656, "smi", "Cyrl, Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = "'ˈ"}, translit = "sjt-translit", } m["sju"] = { "Ume Sami", 56415, "smi", "Latn", strip_diacritics = {remove_diacritics = c.macron .. 
"'ˈ"}, display_text = { from = {"'"}, to = {"ˈ"} }, sort_key = "sju-sortkey", } m["sjw"] = { "Shawnee", 2669206, "alg", "Latn", } m["ska"] = { "Skagit", 25559652, "sal", "Latn", } m["skb"] = { "แสก", 36437, "tai-nor", "Thai", --sort_key = "Thai-sortkey", } m["skc"] = { "Ma Manda", 6720783, "ngf-fin", "Latn", } m["skd"] = { "Southern Sierra Miwok", 3492334, "nai-utn", "Latn", } m["ske"] = { "Ske", 7534244, "poz-vnn", "Latn", } m["skf"] = { "Mekéns", 3304806, "tup", "Latn", } m["skh"] = { "Sikule", 3121081, "poz-nws", } m["ski"] = { "Sika", 33960, "poz-cet", "Latn", } m["skj"] = { -- compare 'ths' "Seke", 30226846, "sit-tam", } m["skk"] = { "Sok", 12953887, "mkh-ban", } m["skm"] = { "Sakam", 6448517, "ngf-fin", "Latn", } m["skn"] = { "Kolibugan Subanon", 18755617, "phi", "Latn", } m["sko"] = { "Seko Tengah", 15613270, "poz", } m["skp"] = { "Sekapan", 7447132, "poz-bnn", } m["skq"] = { "Sininkere", 3914896, "dmn-man", "Latn", } m["skr"] = { "Saraiki", 33902, "inc-pan", "pa-Arab, Mult, Deva", ancestors = "lah", strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. 
c.nunghunna}, translit = { ["pa-Arab"] = "pa-Arab-translit", Deva = "Deva-translit", Mult = "Mult-translit", }, } m["sks"] = { "ไมยา", 12952760, "ngf-kau", "Latn", } m["skt"] = { "Sakata", 36691, "bnt-bnm", "Latn", } m["sku"] = { "Sakao", 3298421, "poz-vnn", "Latn", } m["skv"] = { "Skou", 3915200, "paa-msk", "Latn", } m["skw"] = { "Skepi Creole Dutch", 2522153, "crp", "Latn", ancestors = "nl", } m["skx"] = { "Seko Padang", 15613282, "poz-ssw", "Latn", } m["sky"] = { "ซีกายานา", 7439242, "poz-pnp", "Latn", } m["skz"] = { "Sekar", 7447136, "poz-cet", } m["slc"] = { "Saliba (Colombia)", 3441097, nil, "Latn", } m["sld"] = { "Sisaala", 11020264, "nic-sis", "Latn", } m["sle"] = { "Sholaga", 7500203, "dra-kan", "Knda", -- Knda translit in [[Module:scripts/data]] } m["slf"] = { "Swiss-Italian Sign Language", 12953479, "sgn", } m["slg"] = { "Selungai Murut", 7448844, "poz-san", } m["slh"] = { "Southern Puget Sound Salish", 12642471, "sal", "Latn", } -- "sli" "Silesian German" IS SUBSUMED INTO "gmw-ecg" "East Central German" m["slj"] = { "Salumá", 7406296, "sai-prk", "Latn", } m["sll"] = { "Salt-Yui", 7405785, "ngf-chw", "Latn", } m["slm"] = { "ซามาแบบปางูตารัน", 3362086, "poz-sbj", "Latn", } m["sln"] = { "Salinan", 1568938, "qfa-iso", "Latn", } m["slp"] = { "Lamaholot", 6480777, "poz-cet", "Latn", } m["slr"] = { "ซาลาร์", 33963, "trk-ogz", "Arab, Latn", ancestors = "trk-eog", } m["sls"] = { "Singapore Sign Language", 7512563, "sgn", } m["slt"] = { "Sila", 7514021, "tbq-sil", } m["slu"] = { "Selaru", 7447500, "poz-cet", "Latn", } m["slw"] = { "Sialum", 7506694, "ngf-huo", "Latn", } m["slx"] = { "Salampasu", 7403607, "bnt-lun", "Latn", } m["sly"] = { "Selayar", 7447520, "poz-ssw", } m["slz"] = { "Ma'ya", 2291492, "poz-hce", "Latn", } m["sma"] = { "ซามีใต้", 13293, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = "'ˈ"}, sort_key = "sma-sortkey", } m["smb"] = { "Simbari", 7517427, "ngf-ang", "Latn", } m["smc"] = { "Som", 
7559081, "ngf-fin", "Latn", } m["smd"] = { "Sama", 6407456, "bnt-kmb", "Latn", } m["smf"] = { "Auwe", 3502072, "paa-brd", "Latn", ancestors = "dnd", } m["smg"] = { "Simbali", 56692, "paa-bng", "Latn", } m["smh"] = { "Samei", 7409269, "tbq-axi", } m["smj"] = { "ซามีแบบลูเล", 56322, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = c.macron .. "'ˈ"}, sort_key = "smj-sortkey", } m["smk"] = { "โบลีเนา", 2669235, "phi", "Latn, Tglg", } m["sml"] = { "ซามาตอนกลาง", 3470593, "poz-sbj", "Latn", } m["smm"] = { "Musasa", 6940122, "inc-eas", ancestors = "bh", } m["smn"] = { "ซามีแบบอีนารี", 33462, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = c.dotbelow .. "'ˈ"}, sort_key = "smn-sortkey", } m["smp"] = { "Samaritan Hebrew", 56502, "sem-can", "Samr", translit = "Samr-translit", -- Samr strip_diacritics, sort_key in [[Module:scripts/data]] ancestors = "hbo", } m["smq"] = { "Samo", 7409884, "ngf-est", "Latn", } m["smr"] = { "Simeulue", 2992833, "poz-nws", "Latn", } m["sms"] = { "Skolt Sami", 13271, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = c.dotbelow .. 
"'ˈ"}, sort_key = "sms-sortkey", } m["smt"] = { "Simte", 7521268, "tbq-kuk", } m["smu"] = { "สมราย", 6583612, "mkh-pea", } m["smv"] = { "Samvedi", 6345632, "inc-sou", } m["smw"] = { "Sumbawa", 3182585, "poz-bss", "Latn", } m["smx"] = { "Samba", 11120157, "bnt-pen", "Latn", } m["smy"] = { "Semnani", 14531212, "xme", "fa-Arab, Latn", } m["smz"] = { "Simeku", 7517534, "paa-sbo", "Latn", } m["snb"] = { "Sebuyau", 7442836, "poz-mly", "Latn", } m["snc"] = { "Sinaugoro", 4170719, "poz-ocw", "Latn", } m["sne"] = { "Bau Bidayuh", 2891938, "day", "Latn", } m["snf"] = { "Noon", 36304, "alv-cng", "Latn", } m["sng"] = { "Sanga (Congo)", 3438316, "bnt-lub", "Latn", } m["sni"] = { "Sensi", 7451029, "sai-pan", "Latn", } m["snj"] = { "Riverain Sango", 25559751, "crp", "Latn", ancestors = "ngb", } m["snk"] = { "Soninke", 36660, "dmn-snb", "Latn", } m["snl"] = { "Sangil", 3472206, "phi", "Latn", } m["snm"] = { "Southern Ma'di", 15637273, "csu-mma", } m["snn"] = { "Siona", 3485116, "sai-tuc", "Latn", } m["sno"] = { "Snohomish", 25559662, "sal", "Latn", } m["snp"] = { "Siane", 7506812, "ngf-kag", "Latn", } m["snq"] = { "Sangu (Gabon)", 36609, "bnt-sir", "Latn", } m["snr"] = { "Sihan", 7513400, "ngf-gum", "Latn", } m["sns"] = { "Nahavaq", 2160435, "poz-vnc", "Latn", } m["snu"] = { "Senggi", 7929052, "paa-brd", "Latn", } m["snv"] = { "Sa'ban", 3474891, "poz-swa", "Latn", } m["snw"] = { "Selee", 36272, "alv-ntg", "Latn", } m["snx"] = { "Sam", 7408387, "ngf-min", "Latn", } m["sny"] = { "Saniyo-Hiyewe", 7418302, "paa-spk", "Latn", } m["snz"] = { "Kou", 7525035, -- also 4803639 "ngf-eva", "Latn", } m["soa"] = { "โซ่ง", 7709159, "tai-swe", "Tavt, Thai", --translit = "Tavt-translit", sort_key = { from = {"([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])", "([เแโใไ])([ก-ฮ])"}, to = {"%2%1", "%2%1"} }, } m["sob"] = { "Sobei", 3121035, "poz-ocw", "Latn", } m["soc"] = { "Soko", 7555138, "bnt-ske", "Latn", } m["sod"] = { "Songoora", 7561296, "bnt-lgb", "Latn", } m["soe"] = { "Songomeno", 5713543, "bnt-bsh", "Latn", } m["sog"] = { 
"Sogdian", 205979, "ira-sgc", "Sogd, Mani, Syrc, Sogo", translit = { Sogd = "Sogd-translit", -- Mani translit in [[Module:scripts/data]] Sogo = "Sogo-translit", }, } m["soh"] = { "Aka (Sudan)", 3450949, "sdv-eje", "Latn", } m["soi"] = { "Sonha", 12953890, "inc-eas", } m["sok"] = { "Sokoro", 3441303, "cdc-est", "Latn", } m["sol"] = { "Solos", 3489591, "poz-ocw", } m["soo"] = { "Nsong", 12953148, "bnt-bdz", "Latn", } m["sop"] = { "Songe", 3130911, "bnt-lbn", "Latn", } m["soq"] = { "Kanasi", 11732656, "ngf-dag", "Latn", } m["sor"] = { "Somrai", 3123566, "cdc-est", "Latn", } m["sos"] = { "Seenku", 36274, "dmn-smg", } m["sou"] = { "ปักษ์ใต้", 56508, "tai-swe", "Thai", --sort_key = "Thai-sortkey", } m["sov"] = { "Sonsorolese", 13281, "poz-mic", "Latn", } m["sow"] = { "Sowanda", 7571845, "paa-brd", "Latn", } m["sox"] = { "Swo", 36604, "bnt-mka", "Latn", } m["soy"] = { "Miyobe", 35913, "alv-sav", "Latn", } m["soz"] = { "Temi", 13278, "bnt-kka", "Latn", } m["spb"] = { "Sepa (Indonesia)", 18603687, "poz-cma", "Latn", } m["spc"] = { "Sapé", 2888158, nil, "Latn", } m["spd"] = { "Saep", 7398312, "ngf-yag", "Latn", } m["spe"] = { "Sepa (New Guinea)", 7451725, "poz-ocw", "Latn", } m["spg"] = { "Sian", 7506806, "poz-bnn", } m["spi"] = { "Saponi", 3915418, "paa-lkp", "Latn", } m["spk"] = { "Sengo", 7450584, "paa-ndu", "Latn", } m["spl"] = { "Selepet", 7447917, "ngf-huo", "Latn", } m["spm"] = { "Sepen", 4701931, "paa-ram", "Latn", } m["spn"] = { "Sanapaná", 3033556, "sai-mas", "Latn", } m["spo"] = { "Spokane", 3493704, "sal", } m["spp"] = { "Supyire", 56284, "alv-sma", "Latn", } m["spr"] = { "Saparua", 7420921, "poz-cma", "Latn", } m["sps"] = { "Saposa", 3473187, "poz-ocw", } m["spt"] = { "Spiti Bhoti", 22080879, "sit-las", } m["spu"] = { "Sapuan", 7421168, "mkh-ban", } m["spv"] = { "Sambalpuri", 6433240, "inc-eas", "Orya", translit = "or-translit", ancestors = "or", } m["spx"] = { "South Picene", 36688, "itc-sbl", "Ital, Latn", -- Ital translit in [[Module:scripts/data]] 
display_text = { Latn = s["itc-Latn-displaytext"] }, strip_diacritics = { Latn = s["itc-Latn-stripdiacritics"], }, sort_key = { Latn = s["itc-Latn-sortkey"], }, } m["spy"] = { "Sabaot", 7395896, "sdv-kln", } m["sqa"] = { "Shama-Sambuga", 3914392, "nic-kmk", "Latn", } m["sqh"] = { "Shau", 3913925, "nic-jer", "Latn", } m["sqk"] = { "Albanian Sign Language", 4709168, "sgn", } m["sqm"] = { "Suma", 11008431, "gba-wes", } m["sqn"] = { "Susquehannock", 3505736, "iro-nor", } m["sqo"] = { "Sorkhei", 3491964, "ira-kms", } m["sqq"] = { "Sou", 16979751, "mkh-ban", } m["sqr"] = { "Siculo-Arabic", 1069489, "sem-arb", "Arab", } m["sqs"] = { "Sri Lankan Sign Language", 3915466, "sgn", } m["sqt"] = { "Soqotri", 13283, "sem-sar", "Arab, Latn", } m["squ"] = { "Squamish", 2484579, "sal", "Latn", } m["sra"] = { "Saruga", 7424699, "ngf-han", "Latn", } m["srb"] = { "Sora", 13284, "mun", "Sora, Latn, Orya", } m["sre"] = { "Sara", 33957, "day", } m["srf"] = { "Nafi", 6958174, "poz-ocw", } m["srg"] = { "Sulod", 7636489, "phi", } m["srh"] = { "Sarikoli", 33873, "ira-shr", "Latn, ug-Arab, Cyrl", } m["sri"] = { "Siriano", 3485264, "sai-tuc", "Latn", } m["srk"] = { "Serudung Murut", 7455497, "poz-san", } m["srl"] = { "Isirawa", 4203802, "paa-tkw", } m["srm"] = { "Saramaccan", 33779, "crp", "Latn", ancestors = "en, pt", } m["srn"] = { "Sranan Tongo", 33989, "crp", "Latn", ancestors = "en", } m["srq"] = { "Sirionó", 3027953, "tup-gua", "Latn", } m["srr"] = { "Serer", 36284, "alv-fwo", "Latn", } m["srs"] = { "Tsuut'ina", 20825, "ath-nor", "Latn", } m["srt"] = { "Sauri", 7427547, "paa-egb", "Latn", } m["sru"] = { "Suruí", 7646993, "tup", "Latn", } m["srv"] = { "Waray Sorsogon", 18755610, "phi", "Latn", } m["srw"] = { "Serua", 14916905, "poz-cet", } m["srx"] = { "Sirmauri", 7530505, "him", } m["sry"] = { "Sera", 7452602, "poz-ocw", "Latn", } m["srz"] = { "Shahmirzadi", 12953126, "ira-msh", "fa-Arab", } m["ssb"] = { "Southern Sama", 3470594, "poz-sbj", "Latn", } m["ssc"] = { "Suba-Simbiti", 7630687, 
"bnt-lok", "Latn", } m["ssd"] = { "Siroi", 10771067, "ngf-rai", "Latn", } m["sse"] = { "Balangingi", 2880535, "poz-sbj", "Latn", } m["ssf"] = { "Thao", 676492, "map", "Latn", } m["ssg"] = { "Seimat", 3182581, "poz-aay", "Latn", } m["ssh"] = { "Shihhi Arabic", 56571, "sem-arb", "Arab", strip_diacritics = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {u(0x0671)}, to = {u(0x0627)} }, } m["ssi"] = { "Sansi", 3309366, "inc-nwe", } m["ssj"] = { "Sausi", 7427605, "ngf-eva", "Latn", } m["ssk"] = { "Sunam", 11002210, "sit-kin", } m["ssl"] = { "Western Sisaala", 11154776, "nic-sis", "Latn", } m["ssm"] = { "Semnam", 7449713, "mkh-asl", "Latn", } m["sso"] = { "Sissano", 7530937, "poz-ocw", "Latn", } m["ssp"] = { "Spanish Sign Language", 3100814, "sgn", } m["ssq"] = { "So'a", 7572120, "poz-cet", "Latn", } m["ssr"] = { "Swiss-French Sign Language", 12953483, "sgn", } m["sss"] = { "Sô", 3082037, "mkh-kat", } m["sst"] = { "Sinasina", 7521813, "ngf-chw", "Latn", } m["ssu"] = { "Susuami", 7649752, "ngf-ang", "Latn", } m["ssv"] = { "Shark Bay", 7489783, "poz-vnn", "Latn", } m["ssx"] = { "Samberigi", 7409020, "ngf-eng", "Latn", } m["ssy"] = { "ซาโฮ", 36353, "cus-eas", "Latn, Ethi, Arab", } m["ssz"] = { "Sengseng", 7450601, "poz-ocw", "Latn", } m["stb"] = { "Northern Subanen", 12953892, "phi", "Latn", } m["std"] = { "Sentinelese", 568377, "qfa-unc", -- presumed Ongan } m["ste"] = { "Liana-Seti", 6539924, "poz-cma", } m["stf"] = { "Seta", 7456326, "paa-tor", "Latn", } m["stg"] = { "Trieng", 22694648, "mkh-ban", } m["sth"] = { "Shelta", 36705, "qfa-mix", "Latn", ancestors = "ga, en", } m["sti"] = { "Bulo Stieng", 15771431, "mkh-ban", "Khmr, Latn", } m["stj"] = { "Matya Samo", 10974879, "dmn-sam", "Latn", } m["stk"] = { "Arammba", 3502094, "paa-yam", "Latn", } m["stm"] = { "Setaman", 7456333, "ngf-okk", "Latn", } m["stn"] = { "Owa", 1324132, "poz-sls", "Latn", } m["sto"] = { 
"Stoney", 3033570, "sio-dkt", "Latn", } m["stp"] = { "Southeastern Tepehuan", 12953917, "azc-pim", "Latn", } m["stq"] = { "ฟรีเชียแบบซาเทอร์ลันท์", 27154, "gmw-fri", "Latn", } m["str"] = { "Saanich", 36444, "sal", "Latn", } m["sts"] = { "Shumashti", 33777, "inc-kun", "Arab", } m["stt"] = { "Budeh Stieng", 12953891, "mkh-ban", } m["stu"] = { "Samtao", 25559550, "mkh-pal", } m["stv"] = { "Silt'e", 33880, "sem-eth", "Ethi", } m["stw"] = { "Satawalese", 28477, "poz-mic", "Latn", } m["sty"] = { "Siberian Tatar", 4418344, "trk-kno", "Cyrl", } m["sua"] = { "Sulka", 7636341, "qfa-iso", -- Papuan; isolate in Glottolog and Palmer (2018) "Latn", } m["sub"] = { "Suku", 12953160, "bnt-yak", "Latn", } m["suc"] = { "Western Subanon", 16113894, "phi", "Latn", } m["sue"] = { "Suena", 7634386, "paa-bin", "Latn", } m["sug"] = { "Suganga", 7634706, "ngf-okk", "Latn", } m["sui"] = { "Suki", 2089984, "ngf-gsu", "Latn", } m["suk"] = { "Sukuma", 2638144, "bnt-tkm", "Latn", } -- suo (Bouni, Papua New Guinea, called Bouni-Bobe in Glottolog): not yet accepted; in the Sko/Skou family m["suq"] = { "Suri", 5364172, "sdv", } m["sur"] = { "Mwaghavul", 3440486, "cdc-wst", "Latn", } m["sus"] = { "Susu", 33990, "dmn-sya", "Latn", } m["sut"] = { "Subtiaba", 3915405, "omq", "Latn", } m["suv"] = { "Puroik", 56408, "sit-khb", "Beng, Deva, Latn", ancestors = "sit-khp-pro", } m["suw"] = { "Sumbwa", 7637055, "bnt-glb", "Latn", } m["sux"] = { "ซูเมอร์", 36790, "qfa-iso", "Xsux, Latn", } m["suy"] = { "Suyá", 3505859, "sai-nje", "Latn", } m["suz"] = { "Sunwar", 56549, "sit-kiw", "Deva, Sunu", translit = { Deva = "Deva-translit", }, } m["sva"] = { "Svan", 34067, "ccs", "Geor, Cyrl", translit = { Geor = "sva-translit", }, override_translit = true, } m["svb"] = { "Ulau-Suain", 7878769, "poz-ocw", "Latn", } m["svc"] = { "Vincentian Creole English", 3501785, "crp", "Latn", ancestors = "en", } m["sve"] = { "Serili", 7454834, "poz-tim", } m["svk"] = { "Slovakian Sign Language", 7541557, "sgn", } m["svm"] = { 
"Slavomolisano", 36254, "zls", "Latn", ancestors = "sh", } m["svs"] = { "Savosavo", 3130296, "qfa-dis", -- Papuan; isolate in Glottolog; in the tentative Central Solomons family by Ross (2005) and Pedrós -- (2015) "Latn", } m["svx"] = { "Skalvian", 3486125, "bat-wes", "Latn", } m["swb"] = { "Maore Comorian", 34075, "bnt-com", "Latn", sort_key = "bnt-com-sortkey", } m["swf"] = { "Sere", 7453056, "nic-ser", "Latn", } m["swg"] = { "Swabian", 327274, "gmw-hgm", "Latn", ancestors = "gsw", } m["swi"] = { "สุ่ย", 3112388, "qfa-kms", "Latn, Shui, Hani", sort_key = {Hani = "Hani-sortkey"}, } m["swj"] = { "Sira", 36599, "bnt-sir", "Latn", } m["swl"] = { "Swedish Sign Language", 36558, "sgn", } m["swm"] = { "Samosa", 7410037, "ngf-nwh", "Latn", } m["swn"] = { "Sokna", 2988323, "ber", } m["swo"] = { "Shanenawa", 61974839, "sai-pan", "Latn", } m["swp"] = { "Suau", 3502368, "poz-ocw", } m["swq"] = { "Sharwa", 56791, "cdc-cbm", "Latn", } m["swr"] = { "Saweru", 3474649, "paa-yaw", "Latn", } m["sws"] = { "Seluwasan", 7448845, "poz-cet", } m["swt"] = { "Sawila", 7428639, "paa-tap", "Latn", } m["swu"] = { "Suwawa", 7650588, "phi", } m["sww"] = { "Sowa", 7571843, "poz-vnn", "Latn", } m["swx"] = { "Suruahá", 3114402, "auf", } m["swy"] = { "Sarua", 56261, "cdc-est", "Latn", } m["sxb"] = { "Suba", 33916, "bnt-lok", "Latn", } m["sxc"] = { "Sicanian", 36335, "qfa-unc", -- extinct, lack of data; only names deciphered "Polyt", } m["sxe"] = { "Sighu", 36431, "bnt-kel", "Latn", } m["sxg"] = { "Shixing", 56337, "sit-nax", "Latn", } m["sxk"] = { "Southern Kalapuya", 3192122, "nai-klp", } m["sxl"] = { "Selonian", 36491, "bat-eas", "Latn", } m["sxm"] = { "สำเร", 6583615, "mkh-pea", } m["sxn"] = { "Sangir", 25714758, "phi", "Latn", } m["sxo"] = { "Sorothaptic", 2762254, } m["sxr"] = { "Saaroa", 716599, "map", "Latn", } m["sxs"] = { "Sasaru", 3913384, "alv-yek", "Latn", } -- "sxu" "Upper Saxon" IS SUBSUMED INTO "gmw-ecg" "East Central German" m["sxw"] = { "Saxwe Gbe", 7428892, "alv-pph", "Latn", } 
m["sya"] = { "Siang", 3482903, } m["syb"] = { "Central Subanen", 12953893, "phi", "Latn", } m["syc"] = { "ซีรีแอกคลาสสิก", 33538, "sem-are", "Syrc", strip_diacritics = {remove_diacritics = c.macron .. c.diaer .. c.macronbelow .. u(0x0730) .. "-" .. u(0x0748)}, } m["syi"] = { "Seki", 36547, "bnt-kel", "Latn", } m["syk"] = { "Sukur", 56292, "cdc-cbm", "Latn", } m["syl"] = { "สิเลฏ", 2044560, "inc-bas", "Sylo, Beng", ancestors = "inc-obn", translit = "syl-translit", } m["sym"] = { "Maya Samo", 10950421, "dmn-sam", "Latn", } m["syn"] = { "Senaya", 33914, "sem-nna", } m["syo"] = { "Suoy", 7641864, "mkh-pea", } m["sys"] = { "Sinyar", 56840, "csu", "Latn", } m["syw"] = { "Kagate", 12952538, "sit-kyk", "Deva", translit = { Deva = "Deva-translit", }, } m["syx"] = { "Osamayi", 7408415, "bnt-kel", "Latn", } m["syy"] = { "Al-Sayyid Bedouin Sign Language", 2915457, "sgn", } m["sza"] = { "Semelai", 3111827, "mkh-asl", "Latn", } m["szb"] = { "Ngalum", 11732516, "ngf-okk", "Latn", } m["szc"] = { "Semaq Beri", 7449119, "mkh-asl", } m["szd"] = { "Seru", 7455488, "poz-bnn", "Latn", } m["sze"] = { "Seze", 373683, "omv-mao", "Latn", } m["szg"] = { "Sengele", 7450555, "bnt-mon", "Latn", } m["szl"] = { "ไซลีเซีย", 30319, "zlw-lch", "Latn", ancestors = "zlw-opl", } m["szn"] = { "Sula", 3503403, "poz-cma", "Latn", } m["szp"] = { "Suabo", 7630429, "ngf-sbh", "Latn", } m["szv"] = { "Isubu", 35431, "bnt-saw", "Latn", } m["szw"] = { "Sawai", 3447258, "poz-hce", "Latn", } m["szy"] = { "ซากีซายา", 718269, "map", "Latn", } return require("Module:languages").finalizeData(m, "language") mifenuct2b7k5jmhz3iypt0d3b2ocj6 มอดูล:languages/data/3/r 828 36369 5720764 5684165 2026-04-21T07:01:11Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720764 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["raa"] = { "Dungmali", 56871, "sit-kic", } m["rab"] = { "Chamling", 3436664, "sit-kic", "Deva", translit = { Deva = "Deva-translit", }, } m["rac"] = { "Rasawa", 56443, "paa-lkp", "Latn", } m["rad"] = { "Rade", 3429088, "cmc", "Latn", } m["raf"] = { "Western Meohang", 17442461, "sit-kie", } m["rag"] = { "Logooli", 6667767, "bnt-lok", "Latn", } m["rah"] = { "Rabha", 7278686, "tbq-bdg", "Beng, Latn", } m["rai"] = { "Ramoaaina", 3418509, "poz-ocw", "Latn", } m["rak"] = { "Tulu-Bohuai", 2908807, "poz-aay", "Latn", } m["ral"] = { "Ralte", 7288392, "tbq-kuk", "Latn", } m["ram"] = { "Canela", 2936334, "sai-nje", "Latn", } m["ran"] = { "Riantana", 7322169, "paa-kol", "Latn", } m["rao"] = { "Rao", 11732596, "paa-ram", "Latn", } m["rap"] = { "ราปานูอี", 36746, "poz-pep", "Latn", } m["raq"] = { "Saam", 7395644, "sit-kic", } m["rar"] = { "ราโรโตงา", 36745, "poz-pep", "Latn", } m["ras"] = { "Tegali", 36522, "nic-ras", "Latn", } m["rat"] = { "Razajerdi", 7299461, "xme-ttc", ancestors = "xme-ttc-eas", } m["rau"] = { "Raute", 7296262, "sit-gma", "Deva, Latn", translit = { Deva = "Deva-translit", }, } m["rav"] = { "Sampang", 3449115, "sit-kic", } m["raw"] = { "เรอหวั่ง", 542564, "sit-nng", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. 
c.macron}, } m["rax"] = { "Rang", 3913345, "alv-mum", } m["ray"] = { "Rapa", 36417, "poz-pep", "Latn", } m["raz"] = { "Rahambuu", 3417555, "poz-btk", } m["rbb"] = { "Rumai Palaung", 12953797, "mkh-pal", "Mymr", } m["rbk"] = { "Northern Bontoc", 63311016, "phi", "Latn", } m["rbl"] = { "Miraya Bikol", 18664557, "phi", "Latn", } m["rcf"] = { "Réunion Creole French", 13198, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["rdb"] = { "Rudbari", 12953072, "xme", ancestors = "xme-mid", } m["rea"] = { "Rerau", 7314883, "ngf-rai", -- placed in Nuru subfamily by Pawley-Hammarström "Latn", } m["reb"] = { "Rembong", 7311570, "poz-cet", "Latn", } m["ree"] = { "Rejang Kayan", 3423957, "poz", "Latn", } m["reg"] = { "Kara (Tanzania)", 6367567, "bnt-haj", } m["rei"] = { "Reli", 7310982, } m["rej"] = { "Rejang", 3056339, "poz", "Rjng, Latn", } m["rel"] = { "Rendille", 3447297, "cus-som", "Latn", } m["rem"] = { "Remo", 3501825, "sai-pan", "Latn", } m["ren"] = { "Rengao", 6583692, "mkh", } m["rer"] = { "Rer Bare", 12953857, "qfa-unc", -- extinct, might not exist } m["res"] = { "Reshe", 36258, "nic-knj", } m["ret"] = { "Retta", 7317113, "paa-tap", "Latn", } m["rey"] = { "Reyesano", 3111857, "sai-tac", "Latn", } m["rga"] = { "Roria", 7366825, "poz-vnn", "Latn", } m["rge"] = { "Romano-Greek", 3915435, "qfa-mix", "Latn", -- and/or Grek? 
ancestors = "rom, el", } m["rgk"] = { "Rangkas", 7292645, "sit-alm", } m["rgn"] = { "โรมัญญา", 1641543, "roa-emr", "Latn", wikimedia_codes = "eml", } m["rgr"] = { "Resígaro", 3450504, "awd", "Latn", } m["rgs"] = { "Southern Roglai", 12953069, "cmc", "Latn", } m["rgu"] = { "Ringgou", 7334886, "poz-tim", } m["rhg"] = { "โรฮีนจา", 3241177, "inc-bas", "Rohg, Arab, Mymr, Latn, Beng", ancestors = "inc-obn", translit = { Rohg = "Rohg-translit", }, } m["rhp"] = { "Yahang", 8046792, "paa-tor", "Latn", } m["ria"] = { "Reang", 12953063, "tbq-bdg", } m["rif"] = { "ริฟ", 34174, "ber", "Latn, Tfng, Arab", translit = { Tfng = "Tfng-translit" }, standard_chars = { Latn = "AaBbCcDdḌḍEeƐɛFfGgƔɣĞğHhḤḥIiJjKkLlMmNnPpQqRrŘřSsṢṣTtṬṭUuWwXxYyZzẒẓʷ", Tfng = "ⴰⴳⴷⴹⴼⵖⵉⴽⵍⵎⵏⵓⵔⵙⵛⵜⵡⵢⵣⵥⴱⵀⵅⵊⴳⵯⵕⵚⵟⵇⵃⵄⴻⴽⵯ", c.punc }, } m["ril"] = { "Riang", 2741615, "mkh-pal", } m["rim"] = { "Nyaturu", 7193418, "bnt-tkm", "Latn", } m["rin"] = { "Nungu", 3913350, "nic-nin", "Latn", } m["rir"] = { "Ribun", 7322443, "day", "Latn", } m["rit"] = { "Ritarungo", 7336730, "aus-yol", "Latn", } m["riu"] = { "Riung", 7336938, "poz-cet", "Latn", } m["rjg"] = { "Rajong", 7286370, "poz-cet", "Latn", } m["rji"] = { "Raji", 7286138, "sit-gma", } m["rjs"] = { "ราชพังสี", 12640969, "inc-krd", "Deva", translit = { Deva = "Deva-translit", }, } m["rka"] = { "Kraol", 3199593, "mkh-ban", "Khmr", -- also Latn? 
} m["rkb"] = { "Rikbaktsa", 2585357, "sai-mje", "Latn", } m["rkh"] = { "Rakahanga-Manihiki", 3119695, "poz-pep", "Latn", } m["rki"] = { "ยะไข่", 3450749, "tbq-brm", "Mymr", ancestors = "obr", } m["rkm"] = { "Marka", 36030, "dmn-wmn", "Latn", } m["rkt"] = { "Kamta", 3241618, "inc-krd", "as-Beng, Latn", translit = "as-translit", } m["rkw"] = { "Arakwal", 34295800, "aus-pam", "Latn", } m["rma"] = { "Rama", 3444486, "cba", } m["rmb"] = { "Rembarunga", 7311553, "aus-gun", "Latn", } m["rmc"] = { "Carpathian Romani", 5045611, "inc-rom", "Latn", } m["rmd"] = { "Traveller Danish", 12640779, "qfa-mix", "Latn", ancestors = "rom, da", } m["rme"] = { "Angloromani", 541279, "qfa-mix", "Latn", ancestors = "rom, en", } m["rmf"] = { "Kalo Finnish Romani", 2093214, "inc-rom", "Latn", } m["rmg"] = { "Traveller Norwegian", 3177352, "qfa-mix", "Latn", ancestors = "rom, no", } m["rmh"] = { "Murkim", 4308074, "paa-pau", } m["rmi"] = { "Lomavren", 2495696, "qfa-mix", "Latn, Armn", ancestors = "pra-sau, hy", -- Armn translit in [[Module:scripts/data]] override_translit = true, } m["rmk"] = { "Romkun", 7363236, "paa-ram", "Latn", } m["rml"] = { "Baltic Romani", 513736, "inc-rom", "Latn", } m["rmm"] = { "Roma", 4414831, } m["rmn"] = { "Balkan Romani", 1256701, "inc-rom", "Latn", } m["rmo"] = { "Sinte Romani", 1793299, "inc-rom", "Latn", } m["rmp"] = { "Rempi", 7312214, "ngf-han", "Latn", } m["rmq"] = { "Caló", 35466, "qfa-mix", "Latn", ancestors = "rom, osp, roa-opt", } m["rms"] = { "Romanian Sign Language", 7362575, "sgn", } m["rmt"] = { "Domari", 35394, "inc-cen", "Latn, Arab, Hebr", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["rmu"] = { "Tavringer Romani", 27808413, "qfa-mix", "Latn", ancestors = "rom, sv", } m["rmv"] = { "Romanova", 1298715, "art", type = "appendix-constructed", } m["rmw"] = { "Welsh Romani", 2097387, "inc-rom", "Latn", } m["rmx"] = { "Romam", 22694600, "mkh", } m["rmy"] = { "Vlax Romani", 2669199, "inc-rom", "Latn", } m["rmz"] = { 
"Marma", 21403256, "tbq-brm", "Mymr", ancestors = "obr", } m["rnd"] = { "Ruwund", 7383564, "bnt-lun", } m["rng"] = { "Ronga", 2520717, "bnt-tsr", "Latn", } m["rnl"] = { "Ranglong", 7292878, } m["rnn"] = { "Roon", 7366335, "poz-hce", "Latn", } m["rnp"] = { "Rongpo", 7365672, "sit-whm", } m["rnw"] = { "Rungwa", 7379873, "bnt-mwi", "Latn", } m["rob"] = { "Tae'", 12473476, "poz-ssw", "Latn", } m["roc"] = { "Cacgia Roglai", 2932485, "cmc", "Latn", } m["rod"] = { "Rogo", 3914894, "nic-kmk", } m["roe"] = { "Ronji", 3441763, "poz-ocw", } m["rof"] = { "Rombo", 33330, "bnt-chg", "Latn", } m["rog"] = { "Northern Roglai", 3439680, "cmc", "Latn", } m["rol"] = { "Romblomanon", 13202, "phi", "Latn", } m["rom"] = { "โรมานี", 13201, "inc-rom", "Latn, Cyrl", } m["roo"] = { "Rotokas", 13203, "paa-nbo", "Latn", } m["rop"] = { "Australian Kriol", 35671, "crp", "Latn", ancestors = "en", } m["ror"] = { "Rongga", 12473464, } m["rou"] = { "Runga", 56793, } m["row"] = { "Dela-Oenale", 5253046, "poz-tim", } m["rpn"] = { "Repanbitip", 7313900, "poz-vnc", "Latn", } m["rpt"] = { "Rapting", 7294362, "ngf-han", "Latn", } m["rri"] = { "Ririo", 2404190, "poz-ocw", } m["rro"] = { "Roro", 34197, "poz-ocw", "Latn", } m["rrt"] = { "Arritinngithigh", 4796002, nil, "Latn", } m["rsb"] = { "Romano-Serbian", 1268244, "qfa-mix", "Latn", -- and Cyrl? ancestors = "rom, sh", } m["rsl"] = { "มือรัสเซีย", 13210, "sgn", } m["rsk"] = { "รูซินแบบพันโนเนีย", 35660, "zlw", "Cyrl", ancestors = "zlw-osk", --translit = "rsk-translit", sort_key = { Cyrl = { from = {"ґ", "є", "ї", "ь"}, to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "я" .. p[1]} } }, standard_chars = "АаБбВвГ㥴ДдЕеЄєЖжЗзИиІіЇїЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЬьЮюЯя" .. c.punc:gsub("'", ""), -- Exclude apostrophe. 
} m["rsm"] = { "Miriwoong Sign Language", 24090240, "sgn", } m["rsn"] = { "Rwandan Sign Language", 25041935, "sgn", } m["rtc"] = { "Rungtu", 7379867, "tbq-kuk", } m["rth"] = { "Ratahan", 3420026, "phi", "Latn", } m["rtm"] = { "Rotuman", 36754, "poz-pcc", "Latn", } m["rtw"] = { "Rathawi", 12953854, "inc-bhi", } m["rub"] = { "Gungu", 11165235, "bnt-glb", "Latn", } m["ruc"] = { "Ruuli", 7383562, "bnt-nyg", } m["rue"] = { "รูซินแบบคาร์พาเทีย", 26245, "zle", "Cyrl", ancestors = "zle-ort", translit = "rue-translit", strip_diacritics = {remove_diacritics = c.grave .. c.acute}, sort_key = "rue-sortkey", } m["ruf"] = { "Luguru", 3437661, "bnt-ruv", "Latn", } m["rug"] = { "Roviana", 3445546, "poz-ocw", "Latn", } m["ruh"] = { "Ruga", 7378127, } m["rui"] = { "Rufiji", 7377946, "bnt-mbi", } m["ruk"] = { "Che", 3915445, "nic-nin", "Latn", } m["ruo"] = { "Istro-Romanian", 33622, "roa-eas", "Latn", } m["rup"] = { "Aromanian", 29316, "roa-eas", "Latn, Polyt", translit = { -- FIXME: formerly no translit specified for Polyt; unclear if the default [[Module:grc-translit]] is -- acceptable, so we disable it for now Polyt = false, }, sort_key = { Latn = { from = {"ã"}, to = {"a"..p[1]} }, }, -- Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]] wikimedia_codes = "roa-rup", } m["ruq"] = { "Megleno-Romanian", 13358, "roa-eas", "Latn", } m["rut"] = { "Rutul", 36757, "cau-wsm", "Cyrl, Latn", translit = "rut-translit", override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, } m["ruu"] = { "Lanas Lobu", 12953676, } m["ruy"] = { "Mala (Nigeria)", 3913381, "nic-kau", } m["ruz"] = { "Ruma", 3913326, "nic-kau", } m["rwa"] = { "Rawo", 3504269, "paa-msk", "Latn", } m["rwk"] = { "Rwa", 7985624, "bnt-chg", } m["rwm"] = { "Amba", 788423, "bnt-kbi", "Latn", } m["rwo"] = { "Rawa", 11732598, "ngf-fin", "Latn", } m["rxd"] = { "Ngardi", 7022063, } m["rxw"] = { 
"Karuwali", 6881575, } m["ryn"] = { "อามามิโอชิมะเหนือ", 2840988, "jpx-nry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["rys"] = { "ยาเอยามะ", 34203, "jpx-sry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["ryu"] = { "โอกินาวะ", 34233, "jpx-nry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["rzh"] = { "Razihi", 16911222, "sem-osa", "Arab", ancestors = "sem-srb", } return require("Module:languages").finalizeData(m, "language") trlsbvxrbif6e5qi3zud6fwywprrtkm มอดูล:languages/data/3/p 828 36371 5720763 5684164 2026-04-21T07:01:09Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720763 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["pab"] = { "Pareci", 3504312, "awd", "Latn", } m["pac"] = { "ปาโกะห์", 3441136, "mkh-kat", "Latn", } m["pad"] = { "Paumarí", 389827, "auf", "Latn", } m["pae"] = { "Pagibete", 7124357, "bnt-bta", "Latn", } m["paf"] = { "Paranawát", 12953806, "tup-gua", "Latn", } m["pag"] = { "Pangasinan", 33879, "phi", "Latn, Tglg", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. 
c.diaer}, }, } m["pah"] = { "Tenharim", 10266010, "tup-gua", "Latn", } m["pai"] = { "Pe", 3914871, "nic-tar", "Latn", } m["pak"] = { "Parakanã", 12953804, "tup-gua", "Latn", } m["pal"] = { "เปอร์เซียกลาง", 32063, "ira-swi", "Latn, Phli, pal-Avst, Mani, Phlp, Phlv", -- Latn for translit; Phlv not in Unicode translit = { Phli = "Phli-translit", ["pal-Avst"] = "Avst-translit", -- Mani translit in [[Module:scripts/data]] }, ancestors = "peo", } m["pam"] = { "กาปัมปางัน", 36121, "phi", "Latn, Kulit", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ} }, standard_chars = { Latn = "AaBbDdEeGgHhIiKkLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey" }, } m["pao"] = { "Northern Paiute", 3360656, "azc-num", "Latn", } m["pap"] = { "ปาเปียเมนตู", 33856, "crp", "Latn", ancestors = "pt", } m["paq"] = { "Parya", 1135134, "inc-cen", } m["par"] = { "Panamint", 33926, "azc-num", "Latn", } m["pas"] = { "Papasena", 7132508, "paa-lkp", "Latn", } m["pau"] = { "ปาเลา", 33776, "poz", "Latn, Kana", sort_key = { Kana = "Kana-sortkey" }, } m["pav"] = { "Wari'", 3027909, "sai-cpc", "Latn", } m["paw"] = { "Pawnee", 56751, "cdd", "Latn", strip_diacritics = {remove_diacritics = c.acute}, } m["pax"] = { "Pankararé", 25559779, nil, "Latn", } m["pay"] = { "Pech", 4898889, "cba", "Latn", } m["paz"] = { "Pankararú", 7131310, nil, "Latn", } m["pbb"] = { "Páez", 33677, nil, "Latn", } m["pbc"] = { "Patamona", 3915921, "sai-pem", "Latn", } m["pbe"] = { "Mezontla Popoloca", 42365630, "omq-pop", "Latn", } m["pbf"] = { "Coyotepec Popoloca", 5180100, "omq-pop", "Latn", } m["pbg"] = { "Paraujano", 3501747, "awd-taa", "Latn", } m["pbh"] = { "Panare", 56610, "sai-ven", "Latn", } m["pbi"] = { "Podoko", 3515096, "cdc-cbm", "Latn", } m["pbl"] = { "Mak (Nigeria)", 3915349, "alv-bwj", "Latn", } m["pbm"] = { "Puebla Mazatec", 31102530, "omq-maz", "Latn", } m["pbn"] = { "Kpasam", 3914902, "alv-mye", "Latn", } m["pbo"] = { "Papel", 36314, "alv-pap", "Latn", } m["pbp"] = { 
"Badyara", 35095, "alv-ten", "Latn", } m["pbr"] = { "Pangwa", 3847550, "bnt-bki", "Latn", } m["pbs"] = { "Central Pame", 3361763, "omq", "Latn", } m["pbv"] = { "ปนัร", 3501850, "aav-pkl", "Latn", } m["pby"] = { "Pyu (New Guinea)", 2567925, "qfa-dis", -- Papuan; isolate per Glottolog, in a putative Arai-Samaia family in Usher (2020) "Latn", } m["pca"] = { "Santa Inés Ahuatempan Popoloca", 42365276, "omq-pop", "Latn", } m["pcb"] = { "Pear", 6583669, "mkh-pea", "Khmr", } m["pcc"] = { "ปู้อี", 35100, "tai-nor", "Latn, Hani", sort_key = { Hani = "Hani-sortkey" }, } m["pcd"] = { "ปีการ์", 34024, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["pce"] = { "Ruching Palaung", 12953798, "mkh-pal", "Mymr", } m["pcf"] = { "Paliyan", 7127643, "dra-tam", } m["pcg"] = { "Paniya", 7131211, "dra-mal", } m["pch"] = { "Pardhan", 7133207, "dra-gon", } m["pci"] = { "Duruwa", 56753, "dra-pgd", "Deva, Orya", translit = { Deva = "Deva-translit", Orya = "Orya-translit", }, } m["pcj"] = { "Parenga", 3111396, "mun", } m["pck"] = { "Paite", 12952337, "tbq-kuk", } m["pcl"] = { "Pardhi", 7136554, "inc-bhi", } m["pcm"] = { "Nigerian Pidgin", 33655, "crp", "Latn", ancestors = "en", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron .. c.macronbelow}, sort_key = { remove_diacritics = c.tilde, from = {"ẹ", "gb", "kp", "ọ", "sh", "zh"}, to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. 
p[1]} }, } m["pcn"] = { "Piti", 3913375, "nic-kne", "Latn", } m["pcp"] = { "Pacahuara", 2591165, "sai-pan", "Latn", } m["pcw"] = { "Pyapun", 3438807, nil, "Latn", } m["pda"] = { "Anam", 3501930, "ngf-pom", "Latn", } m["pdc"] = { "เยอรมันแบบเพนซิลเวเนีย", 22711, "gmw-hgm", "Latn", ancestors = "gmw-rfr", } m["pdi"] = { "Pa Di", 3359940, nil, "Latn", } m["pdn"] = { "Fedan", 7206699, "poz-ocw", "Latn", } m["pdo"] = { "Padoe", 3360370, "poz-btk", "Latn", } m["pdt"] = { "เพลาท์ดิทช์", 1751432, "gmw-lgm", "Latn", ancestors = "nds-de", } m["pdu"] = { "กะยัน", 7123283, "kar", "Latn", } m["pea"] = { "Peranakan Indonesian", 653415, "crp", "Latn", ancestors = "ms", } m["peb"] = { "Eastern Pomo", 3396032, "nai-pom", "Latn", } m["ped"] = { "Mala (New Guinea)", 11732569, "ngf-kau", "Latn", } m["pee"] = { "Taje", 12953902, nil, "Latn", } m["pef"] = { "Northeastern Pomo", 3396018, "nai-pom", "Latn", } m["peg"] = { "Pengo", 56758, "dra-kki", "Orya", translit = "Orya-translit", } m["peh"] = { "Bonan", 32983, "xgn-shr", "Latn", } m["pei"] = { "Chichimeca-Jonaz", 3915427, "omq-otp", "Latn", } m["pej"] = { "Northern Pomo", 3396021, "nai-pom", "Latn", } m["pek"] = { "Penchal", 3374631, "poz-aay", "Latn", } m["pel"] = { "Pekal", 3241781, nil, "Latn", } m["pem"] = { "Phende", 7162372, "bnt-pen", "Latn", } m["peo"] = { "เปอร์เซียเก่า", 35225, "ira-swi", "Xpeo, Latn", --translit = "peo-translit", } m["pep"] = { "Kunja", 6444807, "paa-yam", "Latn", } m["peq"] = { "Southern Pomo", 3396023, "nai-pom", "Latn", } -- "pes" is treated as "fa" (or as etymology-only), see [[WT:LT]] m["pev"] = { "Pémono", 3439012, "sai-map", "Latn", } m["pex"] = { "Petats", 3376353, "poz-ocw", "Latn", } m["pey"] = { "Petjo", 940486, nil, "Latn", } m["pez"] = { "Eastern Penan", 18638342, "poz-swa", "Latn", } m["pfa"] = { "Pááfang", 3063517, "poz-mic", "Latn", } m["pfe"] = { "Peere", 36377, "alv-dur", "Latn", } m["pga"] = { "Juba Arabic", 1262143, "crp", "Latn", ancestors = "apd", } m["pgd"] = { "คานธาระ", 3124623, 
"inc-mid", "Deva, Khar", ancestors = "inc-ash", translit = { Deva = "Deva-translit", Khar = "Khar-translit", }, } m["pgg"] = { "ปังควาฬฺ", 13600429, "him", "Deva, Takr", translit = { Deva = "Deva-translit", Takr = "Takr-translit", }, } m["pgi"] = { "Pagi", 7124354, "paa-brd", "Latn", } m["pgk"] = { "Rerep", 586907, "poz-vnc", "Latn", } m["pgl"] = { "Primitive Irish", 3320030, "cel-gae", "Ogam, Latn", translit = "pgl-translit", } m["pgn"] = { "Paelignian", 65455883, "itc-sbl", "Ital, Latn", -- Ital translit in [[Module:scripts/data]] display_text = { Latn = s["itc-Latn-displaytext"] }, strip_diacritics = { Latn = s["itc-Latn-stripdiacritics"] }, sort_key = { Latn = s["itc-Latn-sortkey"] }, } m["pgs"] = { "Pangseng", 3914027, "alv-mum", "Latn", } m["pgu"] = { "Pagu", 7124462, "paa-nha", "Latn", } m["pgz"] = { "Papua New Guinean Sign Language", 25044405, "sgn", } m["pha"] = { "Pa-Hng", 2625410, "hmn", } m["phd"] = { "Phudagi", 7188289, } m["phg"] = { "Phuong", 7188376, "mkh-kat", } m["phh"] = { "Phukha", 7188298, "tbq-phw", } m["phk"] = { "พ่าเก", 7675798, "tai-swe", "Mymr", translit = "aio-phk-translit", display_text = s["phk-displaytext"], strip_diacritics = s["phk-stripdiacritics"], } m["phl"] = { "Palula", 2449549, "inc-dng", "Latn, ur-Arab", strip_diacritics = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. 
c.superalef }, } m["phm"] = { "Phimbi", 11007144, "bnt-sna", "Latn", } m["phn"] = { "ฟินิเชีย", 36734, "sem-can", "Phnx", -- Phnx translit in [[Module:scripts/data]] } m["pho"] = { "ผู้น้อย", 7188361, "tbq-bis", } m["phq"] = { "Phana'", 7180427, "tbq-sil", } m["phr"] = { "Pahari-Potwari", 33739, "inc-pan", "pa-Arab, Guru", ancestors = "lah", translit = { Guru = "Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, strip_diacritics = { ["pa-Arab"] = { remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna, from = {"ݨ", "ࣇ"}, to = {"ن", "ل"} }, } } m["pht"] = { "ผู้ไท", 3626597, "tai-swe", "Thai", } m["phu"] = { "พวน", 3915665, } m["phv"] = { "Pahlavani", 7124567, } m["phw"] = { "Phangduwali", 12953036, "sit-kie", ancestors = "ybh", } m["pia"] = { "Pima Bajo", 3388544, "azc-pim", "Latn", } m["pib"] = { "Yine", 3135432, "awd", "Latn", } m["pic"] = { "Pinji", 36296, "bnt-tso", "Latn", } m["pid"] = { "Piaroa", 3382207, nil, "Latn", } m["pie"] = { "Piro", 7198055, "nai-kta", "Latn", } m["pif"] = { "Pingelapese", 36421, "poz-mic", "Latn", } m["pig"] = { "Pisabo", 966883, "sai-pan", "Latn", } m["pih"] = { "Pitcairn-Norfolk", 36554, "crp", "Latn", ancestors = "en", } m["pii"] = { "Pini", 10631925, } m["pij"] = { "Pijao", 7193519, } m["pil"] = { "Yom", 36893, "nic-yon", } m["pim"] = { "Powhatan", 2270532, "alg-eas", "Latn", } m["pin"] = { "Piame", 7190042, "paa-spk", "Latn", } m["pio"] = { "Piapoco", 3382208, "awd-nwk", "Latn", } m["pip"] = { "Pero", 2411063, "cdc-wst", } m["pir"] = { "Piratapuyo", 3389119, "sai-tuc", "Latn", } m["pis"] = { "Pijin", 36699, "crp", "Latn", ancestors = "en", } m["pit"] = { "Pitta-Pitta", 6433116, "aus-kar", "Latn", } m["piu"] = { "Pintupi-Luritja", 2591175, "aus-pam", "Latn", } m["piv"] = { "Pileni", 2976736, "poz-pnp", "Latn", } m["piw"] = { "Pimbwe", 3894132, "bnt-mwi", } m["pix"] = { "Piu", 7199578, } m["piy"] = { "Piya-Kwonci", 3440492, } m["piz"] = { "Pije", 
3388339, "poz-cln", "Latn", } m["pjt"] = { "Pitjantjatjara", 2982063, "aus-pam", "pjt-Latn", } m["pkb"] = { "Kipfokomo", 7208693, "bnt-sab", "Latn", } m["pkc"] = { "แพ็กเจ", 4841264, "qfa-kor", "Hani, Kana", sort_key = { Hani = "Hani-sortkey", Kana = "Kana-sortkey" }, } m["pkg"] = { "Pak-Tong", 3360711, } m["pkh"] = { "Pankhu", 7130962, "tbq-kuk", } m["pkn"] = { "Pakanha", 954916, "aus-pmn", } m["pko"] = { "Pökoot", 36323, "sdv-kln", "Latn", } m["pkp"] = { "ปูกาปูกา", 36447, "poz-pnp", "Latn", } m["pkr"] = { "Attapady Kurumba", 16835180, "dra-imd", "Mlym", -- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["pks"] = { "Pakistan Sign Language", 22964057, "sgn", } m["pkt"] = { "Maleng", 6583562, "mkh-vie", } m["pku"] = { "Paku", 2932604, "poz-bre", "Latn", } m["pla"] = { "Miani", 12952844, "ngf-kau", "Latn", } m["plb"] = { "Polonombauk", 7225957, "poz-vnn", "Latn", } m["plc"] = { "ปาลาวาโนตอนกลาง", 12953795, "phi", "Latn", } m["ple"] = { "Palu'e", 2196866, "poz-cet", "Latn", } m["plg"] = { "Pilagá", 2748259, "sai-guc", "Latn", } m["plh"] = { "Paulohi", 7155331, "poz-cma", } m["plj"] = { "Polci", 3914383, } m["plk"] = { "Kohistani Shina", 12953882, "inc-shn", "ur-Arab, Latn", } m["pll"] = { "Shwe Palaung", 27941664, "mkh-pal", "Mymr", } m["pln"] = { "Palenquero", 36665, "crp", "Latn", ancestors = "es", } m["plo"] = { "Oluta Popoluca", 5908687, "nai-miz", "Latn", } m["plq"] = { "Palaic", 36582, "ine-ana", "Xsux", } m["plr"] = { "Palaka Senoufo", 36346, "alv-snf", "Latn", } m["pls"] = { "San Marcos Tlalcoyalco Popoloca", 12641692, "omq-pop", "Latn", } m["plu"] = { "Palikur", 3073448, "awd", "Latn", } m["plv"] = { "ปาลาวาโนตะวันตกเฉียงใต้", 15614922, "phi", "Latn", } m["plw"] = { "ปาลาวาโนแบบบรูกส์พอยต์", 12953796, "phi", "Latn", } m["ply"] = { "Bolyu", 3361723, "mkh-pkn", "Latn", } m["plz"] = { "Paluan", 7128795, nil, "Latn", } m["pma"] = { "Paamese", 3130286, "poz-vnc", "Latn", } m["pmb"] = { "Pambia", 36267, 
"znd", "Latn", } m["pmd"] = { "Pallanganmiddang", 7127734, "aus-pam", "Latn", } m["pme"] = { "Pwaamèi", 3411152, "poz-cln", "Latn", } m["pmf"] = { "Pamona", 3513320, "poz-kal", "Latn", } m["pmi"] = { "Northern Pumi", 3403245, "sit-qia", } m["pmj"] = { "Southern Pumi", 3403246, "sit-qia", } m["pmk"] = { "Pamlico", 111366045, "alg-eas", "Latn", } m["pml"] = { "Sabir", 636479, "crp", "Latn", ancestors = "lij, pro, vec", } m["pmm"] = { "Pol", 36408, "bnt-kak", "Latn", } m["pmn"] = { "Pam", 7129017, "alv-mbm", } m["pmo"] = { "Pom", 7227178, "poz-hce", "Latn", } m["pmq"] = { "Northern Pame", 3361762, "omq", "Latn", } m["pmr"] = { "Paynamar", 3450824, "ngf-sog", "Latn", } m["pms"] = { "ปีเยมอนเต", 15085, "roa-git", "Latn", } m["pmt"] = { "Tuamotuan", 36763, "poz-pep", "Latn", } m["pmu"] = { "Mirpur Panjabi", 6874480, } m["pmw"] = { "Plains Miwok", 3391031, "nai-utn", "Latn", } m["pmx"] = { "Poumei Naga", 12952910, "tbq-anp", } m["pmy"] = { "Papuan Malay", 12473446, "crp", "Latn", ancestors = "ms", } m["pmz"] = { "Southern Pame", 3361765, "omq", "Latn", } m["pna"] = { "Punan Bah-Biau", 4842201, "poz-bnn", "Latn", } m["pnc"] = { "Pannei", 7131391, } m["pnd"] = { "Mpinda", 63308194, "bnt-kmb", } m["pne"] = { "Western Penan", 12953808, "poz-swa", "Latn", } m["png"] = { "Pongu", 36282, "nic-shi", } m["pnh"] = { "Penrhyn", 3130301, "poz-pep", "Latn", } m["pni"] = { "Aoheng", 4778608, "poz", "Latn", } m["pnj"] = { "Pinjarup", 33103591, } m["pnk"] = { "Paunaka", 2064378, "awd", "Latn", } m["pnl"] = { "Paleni", 7127118, "alv-wan", "Latn", } m["pnm"] = { "Punan Batu", 7259892, } m["pnn"] = { "Pinai-Hagahai", 5638511, "paa-pia", "Latn", } m["pno"] = { "Panobo", 3141869, "sai-pan", "Latn", } m["pnp"] = { "Pancana", 7130204, } m["pnq"] = { "Pana (West Africa)", 7129739, "nic-gnn", "Latn", } m["pnr"] = { "Panim", 11732562, "ngf-gum", "Latn", } m["pns"] = { "Ponosakan", 7227956, "phi", "Latn", } m["pnt"] = { "Pontic Greek", 36748, "grk", "Grek, Latn, Cyrl", ancestors = "gkm", translit = 
{ Grek = "el-translit" }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["pnu"] = { "Jiongnai Bunu", 56325, "hmn", } m["pnv"] = { "Pinigura", 10631927, "aus-psw", "Latn", } m["pnw"] = { "Panyjima", 3913830, "aus-nga", "Latn", } m["pnx"] = { "Phong-Kniang", 3914627, "mkh", } m["pny"] = { "Pinyin", 36250, "nic-nge", "Latn", } m["pnz"] = { "Pana (Central Africa)", 36241, "alv-mbm", "Latn", } m["poc"] = { "Poqomam", 36416, "myn", "Latn", } m["poe"] = { "San Juan Atzingo Popoloca", 12953819, "omq-pop", "Latn", } m["pof"] = { "Poke", 7208577, "bnt-ske", } m["pog"] = { "Potiguára", 56722, "tup-gua", "Latn", } m["poh"] = { "Poqomchi'", 36414, "myn", "Latn", } m["poi"] = { "Highland Popoluca", 7511556, "nai-miz", "Latn", } m["pok"] = { "Pokangá", 25559704, "sai-tuc", "Latn", } m["pom"] = { "Southeastern Pomo", 3396025, "nai-pom", "Latn", } m["pon"] = { "Pohnpeian", 28422, "poz-mic", "Latn", } m["poo"] = { "Central Pomo", 3396020, "nai-pom", "Latn", } m["pop"] = { "Pwapwâ", 3411153, "poz-cln", "Latn", } m["poq"] = { "Texistepec Popoluca", 5908707, "nai-miz", "Latn", } m["pos"] = { "Sayula Popoluca", 5908722, "nai-miz", "Latn", } m["pot"] = { "Potawatomi", 56749, "alg", "Latn", } m["pov"] = { "ครีโอลกินี-บิสเซา", 33339, "crp", "Latn", ancestors = "pt", } m["pow"] = { "San Felipe Otlaltepec Popoloca", 25559598, "omq-pop", "Latn", } m["pox"] = { "Polabian", 36741, "zlw-lch", "Latn", } m["poy"] = { "Pogolo", 2429648, "bnt-kil", } m["ppa"] = { "Pao", 7132069, } m["ppe"] = { "Papi", 7132809, } m["ppi"] = { "Paipai", 56726, "nai-yuc", "Latn", } m["ppk"] = { "Uma", 7881036, "poz-kal", "Latn", } m["ppl"] = { "ปีปิล", -- This name is used because นาวัต (Nawat/Nahuat) is read the same as นาวัตล์ (Nahuatl) 1186896, "azc-nah", "Latn", strip_diacritics = {remove_diacritics = c.acute .. 
c.macron}, } m["ppm"] = { "Papuma", 7133239, "poz-hce", "Latn", } m["ppn"] = { "Papapana", 3362757, "poz-ocw", "Latn", } m["ppo"] = { "Folopa", 5464843, "paa-teb", "Latn", } m["ppq"] = { "Pei", 7160903, "paa-wal", "Latn", } m["pps"] = { "San Luís Temalacayuca Popoloca", 25559602, "omq-pop", "Latn", } m["ppt"] = { "Pa", 3504757, "paa-kae", "Latn", } m["ppu"] = { "Papora", 2094884, "map", "Latn", } m["pqa"] = { "Pa'a", 3441315, "cdc-wst", } m["pqm"] = { "Malecite-Passamaquoddy", 3183144, "alg-eas", "Latn", } m["pra"] = { "ปรากฤต", 192170, "inc-mid", "Brah, Deva, Gujr, Knda", ancestors = "inc-ash", translit = { -- Brah translit in [[Module:scripts/data]] Deva = "Deva-translit", Gujr = "Gujr-translit", Knda = "Knda-translit", }, strip_diacritics = { -- FIXME: separate by script from = {"ऎ", "ऒ", u(0x0946), u(0x094A), "य़", "ಯ಼", u(0x11071), u(0x11072), u(0x11073), u(0x11074)}, to = {"ए", "ओ", u(0x0947), u(0x094B), "य", "ಯ", "𑀏", "𑀑", u(0x11042), u(0x11044)} } , } m["prc"] = { "Parachi", 2640637, "ira-orp", "Arab", } -- "prd" is not included, see [[WT:LT]] m["pre"] = { "Principense", 36520, "crp", "Latn", ancestors = "pt", } m["prf"] = { "Paranan", 7135433, "phi", } m["prg"] = { "Old Prussian", 35501, "bat-wes", "Latn", } m["prh"] = { "Porohanon", 6583710, "phi", "Latn", } m["pri"] = { "Paicî", 732131, "poz-cln", "Latn", } m["prk"] = { "Parauk", 3363719, "mkh-pal", "Latn", } m["prl"] = { "Peruvian Sign Language", 3915508, "sgn", } m["prm"] = { "Kibiri", 56745, "qfa-iso", -- Papuan; isolate in Glottolog and Wurm; suggested grouping with Kiwaian languages by Ross based only on 1sg and 2sg pronouns "Latn", } m["prn"] = { "Prasuni", 32689, "nur-nor", } m["pro"] = { "อุตซิตาเก่า", 2779185, "roa-ocr", "Latn", sort_key = {remove_diacritics = c.cedilla}, } -- "prp" is not included, see [[WT:LT]] m["prq"] = { "Ashéninka Perené", 3450601, "awd", "Latn", } m["prr"] = { "Puri", 7261687, } -- "prs" is treated as "fa" (or as etymology-only), see [[WT:LT]] m["prt"] = { "Phai", 
7180184, "mkh", } m["pru"] = { "Puragi", 7260800, "ngf-sbh", "Latn", } m["prw"] = { "Parawen", 7136291, "ngf-num", "Latn", } m["prx"] = { "Purik", 567905, "sit-lab", } m["prz"] = { "Providencia Sign Language", 3322084, "sgn", } m["psa"] = { "Asue Awyu", 11266334, "ngf-gaw", "Latn", } m["psc"] = { "Persian Sign Language", 7170221, "sgn", } m["psd"] = { "Plains Indian Sign Language", 2380124, "sgn", } m["pse"] = { "มลายูตอนกลาง", -- This does not mean the central of Malaysia. It is spoken in Indonesia. 3367751, "poz-mly", "Latn, Rjng", } m["psg"] = { "Penang Sign Language", 4924925, "sgn", } m["psh"] = { "Southwest Pashayi", 16112270, "inc-pas", "fa-Arab", } m["psi"] = { "Southeast Pashayi", 23713536, "inc-pas", "fa-Arab", } m["psl"] = { "Puerto Rican Sign Language", 7258608, "sgn-fsl", } m["psm"] = { "Pauserna", 2912846, "tup-gua", "Latn", } m["psn"] = { "Panasuan", 7130113, "poz", } m["pso"] = { "Polish Sign Language", 3915194, "sgn-gsl", } m["psp"] = { "Philippine Sign Language", 3551357, "sgn-fsl", } m["psq"] = { "Pasi", 7142091, "paa-spk", "Latn", } m["psr"] = { "Portuguese Sign Language", 3915472, "sgn", } m["pss"] = { "Kaulong", 3194294, "poz-ocw", } m["psw"] = { "Port Sandwich", 3398324, "poz-vnc", "Latn", } m["psy"] = { "Piscataway", 3504233, "alg-eas", } m["pta"] = { "Pai Tavytera", 7124619, "tup-gua", "Latn", } m["pth"] = { "Pataxó Hã-Ha-Hãe", 7144304, } m["pti"] = { "Pintiini", 10632026, "aus-pam", } m["ptn"] = { "Patani", 7144242, "poz-hce", "Latn", } m["pto"] = { "Zo'é", 8073148, "tup-gua", "Latn", } m["ptp"] = { "Patep", 3368679, "poz-ocw", "Latn", } m["ptq"] = { "Pattapu", 60785085, "dra-tam", } m["ptr"] = { "Piamatsina", 7190040, "poz-vnn", "Latn", } m["ptt"] = { "Enrekang", 12953520, nil, "Latn", } m["ptu"] = { "Bambam", 4853321, "poz-ssw", "Latn", } m["ptv"] = { "Port Vato", 3398323, "poz-vnc", "Latn", } m["ptw"] = { "Pentlatch", 2069475, "sal", "Latn", } m["pty"] = { "Pathiya", 7144790, "dra-mal", } m["pua"] = { "Purepecha", 16114351, "qfa-iso", 
"Latn", sort_key = {remove_diacritics = c.acute}, } m["pub"] = { "Purum", 6400562, "tbq-kuk", "Latn", } m["puc"] = { "Punan Merap", 7259895, "poz", "Latn", } m["pud"] = { "Punan Aput", 4782333, "poz-swa", "Latn", } m["pue"] = { "Puelche", 33660, } m["puf"] = { "Punan Merah", 7259894, "poz-swa", "Latn", } m["pug"] = { "Phuie", 36375, "nic-gnw", } m["pui"] = { "Puinave", 3027918, nil, "Latn", } m["puj"] = { "Punan Tubu", 7259896, "poz-swa", "Latn", } m["pum"] = { "Puma", 33736, "sit-kic", } m["puo"] = { "Puoc", 6440803, "mkh", "Latn", } m["pup"] = { "Pulabu", 7259163, "ngf-rai", "Latn", } m["puq"] = { "Puquina", 1207739, } m["pur"] = { "Puruborá", 7261619, "tup", } m["put"] = { "Putoh", 12953832, "poz-swa", "Latn", } m["puu"] = { "Punu", 36401, "bnt-sir", "Latn", } m["puw"] = { "Puluwat", 36397, "poz-mic", "Latn", } m["pux"] = { "Puare", 3507983, "paa-msk", "Latn", } m["puy"] = { "Purisimeño", 2967638, "nai-chu", "Latn", } m["pwa"] = { "Pawaia", 7156099, "qfa-dis", -- Papuan; isolate in Glottolog; unclassified in Pawley and Hammarström (2018); sister to the Teberan -- languages by Usher (2020); tentatively TNG by Ross (2005) "Latn", } m["pwb"] = { "Panawa", 47385077, "nic-jer", "Latn", ancestors = "jer", } m["pwg"] = { "Gapapaiwa", 3095245, "poz-ocw", "Latn", } m["pwi"] = { "Patwin", 3370188, "nai-wtq", "Latn", } m["pwm"] = { "Molbog", 6895718, "poz-san", "Latn", } m["pwn"] = { "ไปวัน", 715755, "map", "Latn", } m["pwo"] = { "กะเหรี่ยงโปตะวันตก", 7988202, "kar", "Mymr", translit = "pwo-translit", } m["pwr"] = { "Powari", 12640277, "inc-hie", "Deva", translit = { Deva = "Deva-translit", }, } m["pww"] = { "กะเหรี่ยงโปเหนือ", 7058885, "kar", "Thai", } m["pxm"] = { "Quetzaltepec Mixe", 6842374, "nai-miz", "Latn", } m["pye"] = { "Pye Krumen", 11157382, "kro-grb", } m["pym"] = { "Fyam", 3914025, "nic-ple", "Latn", } m["pyn"] = { "Poyanáwa", 3401023, "sai-pan", } m["pys"] = { "Paraguayan Sign Language", 7134698, "sgn", } m["pyu"] = { "Puyuma", 716690, "map", "Latn", } 
m["pyx"] = {
	"ปยู",
	36259,
	"sit",
}

m["pyy"] = {
	"Pyen",
	7262966,
	"tbq-bis",
}

m["pzh"] = {
	"Pazeh",
	36435,
	"map",
	"Latn",
}

m["pzn"] = {
	"Para Naga",
	7133667,
	"sit-aao",
}

return require("Module:languages").finalizeData(m, "language")
fm7vg2iafotk4mfy34rrfto7i1wxyqt
มอดูล:languages/data/3/o 828 36372 5720762 5684163 2026-04-21T07:01:08Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720762 Scribunto text/plain
local m_langdata = require("Module:languages/data")

-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end

local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared

local m = {}

m["oaa"] = {
	"Orok",
	33928,
	"tuw-nan",
	"Cyrl, Latn",
	translit = "oaa-translit",
}

m["oac"] = {
	"Oroch",
	33650,
	"tuw-udg",
	"Latn, Cyrl",
}

m["oak"] = {
	"Noakhali",
	107548681,
	"inc-bas",
	"Beng",
}

m["oav"] = {
	"อะวาร์เก่า",
	65455879,
	"cau-ava",
	"Geor",
	-- Geor translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
}

m["obi"] = {
	"Obispeño",
	1288385,
	"nai-chu",
	"Latn",
}

m["obk"] = {
	"Southern Bontoc",
	63308144,
	"phi",
	"Latn",
}

m["obl"] = {
	"Oblo",
	36309,
}

m["obm"] = {
	"โมอับ",
	36385,
	"sem-can",
	-- Phnx translit in [[Module:scripts/data]]
}

m["obo"] = {
	"Obo Manobo",
	12953699,
	"mno",
	"Latn",
}

m["obr"] = {
	"พม่าเก่า",
	17006600,
	"tbq-brm",
	"Mymr, Latn", --and also Pallava
}

m["obt"] = {
	"เบรอตงเก่า",
	3558112,
	"cel-brs",
	"Latn",
}

m["obu"] = {
	"Obulom",
	3813403,
	"nic-cde",
	"Latn",
}

m["oca"] = {
	"Ocaina",
	3182577,
	"sai-wit",
	"Latn",
}

m["och"] = {
	"จีนเก่า",
	35137,
	"zhx",
	"Hant",
	translit = "zh-translit",
	sort_key = "Hani-sortkey",
}

m["oco"] = {
	"คอร์นวอลล์เก่า",
	48304520,
	"cel-brs",
	"Latn",
}

m["ocu"] = {
	"Tlahuica",
	10751739,
	"omq",
	"Latn",
}

m["oda"] = {
	"Odut",
	3915388,
	"nic-uce",
	"Latn",
	ancestors = "mfn",
}

m["odk"] = {
	"Od",
	7077191,
	"inc-wes",
	"Arab",
}

m["odt"] = {
	"ดัตช์เก่า",
443089, "gmw-frk", "Latn, Runr", strip_diacritics = {remove_diacritics = c.circ .. c.macron}, } m["odu"] = { "Odual", 3813392, "nic-cde", "Latn", } m["ofo"] = { "Ofo", 3349758, "sio-ohv", } m["ofs"] = { "ฟรีเชียเก่า", 35133, "gmw-fri", "Latn", strip_diacritics = {remove_diacritics = c.circ .. c.macron}, sort_key = { from = {"æ", "ð", "þ"}, to = {"ae", "t" .. p[1], "t" .. p[2]} }, } m["ofu"] = { "Efutop", 35297, "nic-eko", "Latn", } m["ogb"] = { "Ogbia", 3813400, "nic-cde", "Latn", } m["ogc"] = { "Ogbah", 36291, "alv-igb", "Latn", } m["oge"] = { "จอร์เจียเก่า", 34834, "ccs-gzn", "Geor, Geok", -- Geor, Geok translit in [[Module:scripts/data]] override_translit = true, strip_diacritics = {remove_diacritics = c.circ}, } m["ogg"] = { "Ogbogolo", 3813405, "nic-cde", "Latn", } m["ogo"] = { "Khana", 3914409, "nic-ogo", "Latn", } m["ogu"] = { "Ogbronuagum", 3914485, "nic-cde", "Latn", } m["ohu"] = { "ฮังการีเก่า", 65455880, "urj-ugr", "Latn, Hung", } m["oia"] = { "Oirata", 56738, "paa-tap", "Latn", } m["oin"] = { "Inebu One", 12953782, "paa-tor", "Latn", } m["ojb"] = { "Northwestern Ojibwa", 7060356, "alg", "Latn", ancestors = "oj", } m["ojc"] = { "Central Ojibwa", 5061548, "alg", "Latn", ancestors = "oj", } m["ojg"] = { "Eastern Ojibwa", 5330342, "alg", "Latn", ancestors = "oj", } m["ojp"] = { "ญี่ปุ่นเก่า", 5736700, "jpx", "Jpan", display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["ojs"] = { "Severn Ojibwa", 56494, "alg", "Latn", ancestors = "oj", } m["ojv"] = { "Ontong Java", 7095071, "poz-pnp", "Latn", } m["ojw"] = { "Western Ojibwa", 3474222, "alg", "Latn", ancestors = "oj", } m["oka"] = { "Okanagan", 2984602, "sal", "Latn", } m["okb"] = { "Okobo", 3813398, "nic-lcr", "Latn", } m["okd"] = { "Okodia", 36300, "ijo", "Latn", } m["oke"] = { "Okpe (Southwestern Edo)", 268924, "alv-swd", "Latn", } m["okg"] = { "Kok-Paponk", 55254102, "aus-pmn", "Latn", } m["okh"] = { "Koresh-e Rostam", 6432160, "xme-ttc", 
ancestors = "xme-ttc-cen", } m["oki"] = { "Okiek", 56367, "sdv-kln", "Latn", } m["okj"] = { "Oko-Juwoi", 3436832, "qfa-adc", } m["okk"] = { "Kwamtim One", 19830649, "paa-tor", "Latn", } m["okl"] = { "Old Kentish Sign Language", 7084319, "sgn", } m["okm"] = { "เกาหลีกลาง", 715339, "qfa-kor", "Kore, Latn", ancestors = "oko", translit = "okm-translit", sort_key = "okm-sortkey", -- Kore strip_diacritics in [[Module:scripts/data]] } m["okn"] = { "โอกิโนเอราบุ", 3350036, "jpx-nry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["oko"] = { "เกาหลีเก่า", 715364, "qfa-kor", "Kore", -- Kore strip_diacritics in [[Module:scripts/data]] } m["okr"] = { "Kirike", 11006763, "ijo", "Latn", } m["oks"] = { "Oko-Eni-Osayen", 36302, "alv-von", "Latn", } m["oku"] = { "Oku", 36289, "nic-rnc", "Latn", } m["okv"] = { "Orokaiva", 7103752, "paa-bin", "Latn", } m["okx"] = { "Okpe (Northwestern Edo)", 7082547, "alv-nwd", "Latn", } m["okz"] = { "เขมรเก่า", 9205, "mkh-kmr", "Latn, Khmr", --and also Khom, Pallava translit = { Khmr = "Khmr-translit", }, } m["old"] = { "Mochi", 12952852, "bnt-chg", "Latn", } m["ole"] = { "Olekha", 3695204, "sit", "Tibt, Latn", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["olm"] = { "Oloma", 3441166, "alv-nwd", "Latn", } m["olo"] = { "ลิววี", 36584, "urj-fin", "Latn", } m["olr"] = { "Olrat", 3351562, "poz-vnn", "Latn", } m["olt"] = { "ลิทัวเนียเก่า", 17417801, "bat-eas", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. 
c.tilde}, } m["olu"] = { "Kuvale", 6448765, "bnt-swb", "Latn", } m["oma"] = { "Omaha-Ponca", 2917968, "sio-dhe", "Latn", } m["omb"] = { "Omba", 2841471, "poz-vnn", "Latn", } m["omc"] = { "Mochica", 1951641, "qfa-iso", "Latn", } m["omg"] = { "Omagua", 33663, "tup-gua", "Latn", } m["omi"] = { "Omi", 56795, "csu-mma", } m["omk"] = { "Omok", 4334657, "qfa-yuk", "Cyrl", translit = "omk-translit", } m["oml"] = { "Ombo", 7089928, "bnt-tet", "Latn", } m["omn"] = { "ไมนอส", 1669994, "qfa-unc", -- undeciphered "Lina", } m["omo"] = { "Utarmbung", 7902577, "ngf-sad", "Latn", } m["omp"] = { "มณีปุระเก่า", 105953310, "sit", "Mtei", translit = "Mtei-translit", } m["omr"] = { "มราฐีเก่า", 65455881, "inc-sou", "Deva, Modi", translit = { Deva = "Deva-translit", Modi = "Modi-translit", }, } m["omt"] = { "Omotik", 36313, "sdv-nis", } m["omu"] = { "Omurano", 1957612, } m["omw"] = { "South Tairora", 20210553, "ngf-kag", "Latn", } m["omx"] = { "มอญเก่า", 111364697, "mkh-mnc", "Mymr, Latn", --and also Pallava } m["ona"] = { "Selk'nam", 2721227, "sai-cho", "Latn", } m["onb"] = { "เบ", 7093790, "qfa-onb", "Latn", } m["one"] = { "Oneida", 857858, "iro-nor", "Latn", } m["ong"] = { "Olo", 592162, "paa-tor", "Latn", } m["oni"] = { "Onin", 7093910, "poz-cet", "Latn", } m["onj"] = { "Onjob", 7093968, "ngf-dag", "Latn", } m["onk"] = { "Kabore One", 12953783, "paa-tor", "Latn", } m["onn"] = { "Onobasulu", 7094437, "ngf-bos", "Latn", } m["ono"] = { "Onondaga", 1077450, "iro-nor", "Latn", ancestors = "iro-oon", } m["onp"] = { "Sartang", 7424639, "sit-khm", "Latn, Deva", } m["onr"] = { "Northern One", 19830648, "paa-tor", "Latn", } m["ons"] = { "Ono", 11732548, "ngf-huo", "Latn", } m["ont"] = { "Ontenu", 3352827, } m["onu"] = { "Unua", 3552042, "poz-vnc", "Latn", } m["onw"] = { "นิวเบียเก่า", 2268, "nub", "Copt", translit = "Copt-translit", sort_key = "Copt-sortkey", } m["onx"] = { "Pidgin Onin", 12953788, "crp", "Latn", ancestors = "oni", } m["ood"] = { "O'odham", 2393095, "azc-pim", "Latn", } 
m["oog"] = { "Ong", 12953787, "mkh-kat", } m["oon"] = { "Önge", 2475551, "qfa-ong", "Latn", } m["oor"] = { "Oorlams", 2484337, } m["opa"] = { "Okpamheri", 3913331, "alv-nwd", "Latn", } m["opk"] = { "Kopkaka", 6431129, "ngf-okk", "Latn", } m["opm"] = { "Oksapmin", 1068097, "ngf", -- per Glottolog, in an Ok-Oksapmin family, under Awyu-Ok, under Asmat-Awyu-Ok, but we don't have these "Latn", } m["opo"] = { "Opao", 7095585, "paa-wel", "Latn", } m["opt"] = { "Opata", 2304583, "azc-trc", "Latn", } m["opy"] = { "Ofayé", 3446691, "sai-mje", "Latn", } m["ora"] = { "Oroha", 36298, "poz-sls", "Latn", } m["ore"] = { "Orejón", 3355834, "sai-tuc", "Latn", } m["org"] = { "Oring", 3915308, "nic-ucn", "Latn", } m["orh"] = { "Oroqen", 1367309, "tuw-ewe", "Latn", } m["oro"] = { "Orokolo", 7103758, "paa-wel", "Latn", } m["orr"] = { "Oruma", 36299, "ijo", "Latn", } m["ort"] = { "Adivasi Odia", 12953791, "inc-eas", "Orya", ancestors = "or", } m["oru"] = { "Ormuri", 33740, "ira-orp", "fa-Arab", } m["orv"] = { "สลาวิกตะวันออกเก่า", 35228, "zle", "Cyrs", translit = {Cyrs = "Cyrs-translit"}, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["orw"] = { "Oro Win", 3450423, "sai-cpc", "Latn", } m["orx"] = { "Oro", 3813396, "nic-lcr", "Latn", } m["orz"] = { "Ormu", 7103494, "poz-ocw", "Latn", } m["osa"] = { "Osage", 2600085, "sio-dhe", "Latn, Osge", } m["osc"] = { "Oscan", 36653, "itc-sbl", "Ital, Latn, Polyt", display_text = { Latn = s["itc-Latn-displaytext"], }, strip_diacritics = { Latn = s["itc-Latn-stripdiacritics"], }, sort_key = { Latn = s["itc-Latn-sortkey"], }, -- Ital translit in [[Module:scripts/data]] -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["osi"] = { "โอซิง", 2701322, "poz", "Latn", } m["osn"] = { "Old Sundanese", 56197074, "poz-msa", "Latn, Sund, Kawi", } m["oso"] = { "Ososo", 3913398, "alv-yek", "Latn", } m["osp"] = { "สเปนเก่า", 1088025, "roa-cas", "Latn", } m["ost"] = { "Osatu", 36243, "nic-grs", "Latn", } 
m["osu"] = { "Southern One", 12953785, "paa-tor", "Latn", } m["osx"] = { "แซกซันเก่า", 35219, "gmw-lgm", "Latn", strip_diacritics = {remove_diacritics = c.circ .. c.macron}, } m["ota"] = { "ตุรกีแบบออตโตมัน", 36730, "trk-ogz", "ota-Arab, Armn", ancestors = "trk-oat", strip_diacritics = { ["ota-Arab"] = { remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {"گ", "ڭ", "ۀ"}, to = {"ك", "ك", "ه"} }, Armn = { from = {"՚"}, to = {"’"} }, }, translit = {Armn = "ota-Armn-translit"}, standard_chars = { ["ota-Arab"] = "آاأبپتثجچحخدذرزژسشصضطظعغفقكلمنوؤهیئةءـ‌", c.punc }, } m["otb"] = { "ทิเบตเก่า", 7085214, "sit-tib", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["otd"] = { "Ot Danum", 3033781, "poz-brw", "Latn", } m["ote"] = { "Mezquital Otomi", 23755711, "oto-otm", "Latn", } m["oti"] = { "Oti", 3357881, } m["otk"] = { "เตอร์กิกเก่า", 34988, "trk-sib", "Orkh, Sogd", -- Orkh translit in [[Module:scripts/data]] } m["otl"] = { "Tilapa Otomi", 7802050, "oto-otm", "Latn", } m["otm"] = { "Eastern Highland Otomi", 13581718, "oto-otm", "Latn", } m["otn"] = { "Tenango Otomi", 25559589, "oto-otm", "Latn", } m["otq"] = { "Querétaro Otomi", 23755688, "oto-otm", "Latn", } m["otr"] = { "Otoro", 36328, "alv-hei", } m["ots"] = { "Estado de México Otomi", 7413841, "oto-otm", "Latn", } m["ott"] = { "Temoaya Otomi", 7698191, "oto-otm", "Latn", } m["otu"] = { "Otuke", 7110049, "sai-mje", "Latn", } m["otw"] = { "Ottawa", 133678, "alg", "Latn", ancestors = "oj", } m["otx"] = { "Texcatepec Otomi", 25559590, "oto-otm", "Latn", } m["oty"] = { "ทมิฬเก่า", 20987452, "dra-tam", "Brah", -- Brah translit in [[Module:scripts/data]] } m["otz"] = { "Ixtenco Otomi", 6101171, "oto-otm", "Latn", } m["oub"] = { "Glio-Oubi", 3914977, "kro-grb", } m["oue"] = { "Oune", 7110521, "paa-sbo", "Latn", } m["oui"] = { "อุยกูร์เก่า", 428299, "trk-ssb", "Ougr, 
Latn, Hani, Phag, Brah, Mani, Syrc, Orkh, Sogd, Arab, mnc-Mong, Sogo, Tibt",
	ancestors = "otk",
	translit = {
		Ougr = "Ougr-translit",
		-- Orkh translit in [[Module:scripts/data]]
		-- Mani translit in [[Module:scripts/data]]
		-- mnc-Mong translit in [[Module:scripts/data]] (NOTE: Formerly not present; I assume accidentally left out)
		-- Brah translit in [[Module:scripts/data]]
	},
	-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
	-- NOTE: Formerly there was only a sort_key for Tibetan. I assume the other three were left out accidentally.
	sort_key = {
		Hani = "Hani-sortkey",
	},
}

m["oum"] = {
	"Ouma",
	7110494,
	"poz-ocw",
	"Latn",
}

m["ovd"] = {
	"แอลฟ์ดาเลิน", --Älvdalen
	254950,
	"gmq-eas",
	"Latn, Runr",
}

m["owi"] = {
	"Owiniga",
	56454,
	"paa-lem",
	"Latn",
}

m["owl"] = {
	"เวลส์เก่า",
	2266723,
	"cel-brw",
	"Latn",
}

m["oyb"] = {
	"Oy",
	13593748,
	"mkh-ban",
}

m["oyd"] = {
	"Oyda",
	7116251,
	"omv-nom",
}

m["oym"] = {
	"Wayampi",
	7975842,
	"tup-gua",
	"Latn",
}

m["oyy"] = {
	"Oya'oya",
	7116243,
	"poz-ocw",
	"Latn",
}

m["ozm"] = {
	"Koonzime",
	35566,
	"bnt-ndb",
	"Latn",
}

return require("Module:languages").finalizeData(m, "language")
6rb04ncw5aniuger1eohv8hivxlbgv0
มอดูล:languages/data/3/m 828 36374 5720761 5684379 2026-04-21T07:01:05Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720761 Scribunto text/plain
local m_langdata = require("Module:languages/data")

-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["maa"] = { "San Jerónimo Tecóatl Mazatec", 7692927, "omq-maz", "Latn", } m["mab"] = { "Yutanduchi Mixtec", 12645448, "omq-mxt", "Latn", } m["mad"] = { "Madurese", 36213, "poz-msa", "Latn, Java", } m["mae"] = { "Bo-Rukul", 34967, "nic-ple", "Latn", } m["maf"] = { "Mafa", 35819, "cdc-cbm", "Latn", } m["mag"] = { "มคหะ", -- Not to be confused with Magadhi Prakrit (pra-mag) 33728, "inc-bih", "Deva, Kthi", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, } m["mai"] = { "ไมถิลี", 36109, "inc-bih", "Deva, Tirh, Kthi, Newa", translit = { Deva = "Deva-translit", Tirh = "Tirh-translit", Kthi = "Kthi-translit", Newa = "Newa-translit", }, } m["maj"] = { "Jalapa de Díaz Mazatec", 3915999, "omq-maz", "Latn", } m["mak"] = { "มากัซซาร์", 33643, "poz-ssw", "Latn, Bugi, Maka", } m["mam"] = { "Mam", 33467, "myn", "Latn", } m["man"] = { "Mandingo", 35772, "dmn-man", "Latn", } m["maq"] = { "Chiquihuitlán Mazatec", 5101757, "omq-maz", "Latn", } m["mas"] = { "มาไซ", 35787, "sdv-lma", "Latn", } m["mat"] = { "Matlatzinca", 12953704, "omq", "Latn", } m["mau"] = { "Huautla Mazatec", 36230, "omq-maz", "Latn", } m["mav"] = { "Sateré-Mawé", 6794475, "tup", "Latn", } m["maw"] = { "Mampruli", 35804, "nic-wov", "Latn", } m["max"] = { "North Moluccan Malay", 7056136, "crp", "Latn", ancestors = "ms", } m["maz"] = { "Central Mazahua", 36228, "oto", "Latn", } m["mba"] = { "Higaonon", 5753411, "mno", "Latn", } m["mbb"] = { "Western Bukidnon Manobo", 7987643, "mno", "Latn", } m["mbc"] = { "Macushi", 56633, "sai-pem", "Latn", } m["mbd"] = { "Dibabawon Manobo", 18755523, "mno", "Latn", } m["mbe"] = { "Molale", 3319444, "nai-plp", "Latn", } m["mbf"] = { "Baba Malay", 18642798, "crp", "Latn", ancestors = "ms", } m["mbh"] = { "Mangseng", 6749147, "poz-ocw", "Latn", } m["mbi"] = { "Ilianen Manobo", 14916911, "mno", "Latn", } m["mbj"] = { "Nadëb", 3335011, "sai-nad", "Latn", } m["mbk"] = { 
"Malol", 6744477, "poz-ocw", "Latn", } m["mbl"] = { "Maxakalí", 3029682, "sai-mje", "Latn", } m["mbm"] = { "Ombamba", 36407, "bnt-mbt", "Latn", } m["mbn"] = { "Macaguán", 3273980, "sai-guh", "Latn", } m["mbo"] = { -- is, like 'bqz', 'bsi' and 'bss', a dialect of Manenguba "Mbo (Cameroon)", 36011, "bnt-mne", "Latn", } m["mbp"] = { "Wiwa", 3012604, "cba", "Latn", } m["mbq"] = { "Maisin", 3448149, nil, "Latn", } m["mbr"] = { "Nukak Makú", 3346228, "sai-nad", "Latn", } m["mbs"] = { "Sarangani Manobo", 7423093, "mno", "Latn", } m["mbt"] = { "Matigsalug Manobo", 6787447, "mno", "Latn", } m["mbu"] = { "Mbula-Bwazza", 3913324, "nic-jrn", "Latn", } m["mbv"] = { "Mbulungish", 36003, "alv-nal", "Latn", } m["mbw"] = { "Maring", 3293280, "ngf-chw", "Latn", } m["mbx"] = { "Sepik Mari", 6760942, "paa-spk", "Latn", } m["mby"] = { "Memoni", 4180871, "inc-snd", "Gujr, ur-Arab", } m["mbz"] = { "Amoltepec Mixtec", 13583504, "omq-mxt", "Latn", } m["mca"] = { "Maca", 3281043, "sai-mtc", "Latn", } m["mcb"] = { "Machiguenga", 3915441, "awd", "Latn", } m["mcc"] = { "Bitur", 4919173, "paa-ani", "Latn", } m["mcd"] = { "Sharanahua", 12953881, "sai-pan", "Latn", } m["mce"] = { "Itundujia Mixtec", 12953727, "omq-mxt", "Latn", } m["mcf"] = { "Matsés", 2981620, "sai-pan", "Latn", } m["mcg"] = { "Mapoyo", 56946, "sai-map", "Latn", } m["mch"] = { "Ye'kwana", 3082027, "sai-car", "Latn", sort_key = { remove_diacritics = "%-%s", from = {"'", "ñ", "ö", "sh", "ü"}, to = {"’", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. 
p[1]} } } m["mci"] = { "Mese", 6821190, "ngf-huo", "Latn", } m["mcj"] = { "Mvanip", 3913281, "nic-mmb", "Latn", } m["mck"] = { "Mbunda", 34170, "bnt-clu", "Latn", } m["mcl"] = { "Macaguaje", 6722435, "sai-tuc", "Latn", } m["mcm"] = { "Kristang", 2669169, "crp", "Latn", ancestors = "pt", } m["mcn"] = { "Masana", 56668, "cdc-mas", } m["mco"] = { "Coatlán Mixe", 25559716, "nai-miz", "Latn", } m["mcp"] = { "Makaa", 35803, "bnt-mka", } m["mcq"] = { "Ese", 5397551, "ngf-koi", "Latn", } m["mcr"] = { "Menya", 11732444, "ngf-ang", "Latn", } m["mcs"] = { "Mambai", 6748872, "alv-mbm", } m["mcu"] = { "Cameroon Mambila", 19359039, "nic-mmb", "Latn", } -- mcv (Minanibai) merged into ffi (Foia Foia) per Glottolog m["mcw"] = { "Mawa", 3441333, "cdc-est", "Latn", } m["mcx"] = { "Mpiemo", 35908, "bnt-bek", } m["mcy"] = { "South Watut", 12953293, "poz-ocw", "Latn", } m["mcz"] = { "Mawan", 11732429, "ngf-han", "Latn", } m["mda"] = { "Mada (Nigeria)", 3915843, "nic-nin", "Latn", } m["mdb"] = { "Morigi", 6912195, "paa-kiw", "Latn", } m["mdc"] = { "Male", 6742927, "ngf-min", "Latn", } m["mdd"] = { "Mbum", 36170, "alv-mbm", } m["mde"] = { "Bura Mabang", 35860, "ssa", "Arab, Latn", } m["mdf"] = { "มอกชา", 13343, "urj-mdv", "Cyrl", translit = "mdf-translit", strip_diacritics = {remove_diacritics = c.acute}, override_translit = true, sort_key = "mdf-sortkey", } m["mdg"] = { "Massalat", 759984, } m["mdh"] = { "มากินดาเนา", 33717, "phi", "Latn, Arab", } m["mdi"] = { "Mamvu", 3033594, "csu-mle", } m["mdj"] = { "Mangbetu", 56327, "csu-maa", "Latn", } m["mdk"] = { "Mangbutu", 6748877, "csu-mle", } m["mdl"] = { "Maltese Sign Language", 6744816, "sgn", } m["mdm"] = { "Mayogo", 6797580, "nic-nke", "Latn", } m["mdn"] = { "Mbati", 36165, "bnt-ngn", } m["mdp"] = { "Mbala", 6799583, "bnt-pen", } m["mdq"] = { "Mbole", 6799727, "bnt-mbe", } m["mdr"] = { "Mandar", 35995, "poz-ssw", "Bugi, Latn", } m["mds"] = { "Maria", 3448673, "paa-man", "Latn", } m["mdt"] = { "Mbere", 36062, "bnt-mbt", } m["mdu"] = { 
"Mboko", 36058, "bnt-mbo", } m["mdv"] = { "Santa Lucía Monteverde Mixtec", 12953722, "omq-mxt", "Latn", } m["mdw"] = { "Mbosi", 36035, "bnt-mbo", } m["mdx"] = { "Dizin", 35313, "omv-diz", "Ethi, Latn", } m["mdy"] = { "Maale", 795327, "omv-ome", } m["mdz"] = { "Suruí Do Pará", 10322149, "tup-gua", "Latn", } m["mea"] = { "Menka", 36078, "nic-grs", "Latn", } m["meb"] = { "Ikobi-Mena", 11732241, "paa-tuk", "Latn", } m["mec"] = { "Mara", 6772774, } m["med"] = { "Melpa", 36166, "ngf-chw", "Latn", } m["mee"] = { "Mengen", 3305831, "poz-ocw", "Latn", } m["mef"] = { "Megam", 6808589, } m["meh"] = { "Southwestern Tlaxiaco Mixtec", 7070686, "omq-mxt", "Latn", } m["mei"] = { "Midob", 36007, "nub", "Latn", } m["mej"] = { "Meyah", 11732436, "paa-ebh", "Latn", } m["mek"] = { "Mekeo", 3304803, "poz-ocw", "Latn", } m["mel"] = { "Central Melanau", 18638319, "poz-swa", "Latn", } m["mem"] = { "Mangala", 6748664, } m["men"] = { "Mende", 1478672, "dmn-msw", "Latn, Mend", } m["meo"] = { "มลายูแบบเกอดะฮ์", 4925684, "poz-mly", "Latn, ms-Arab, Thai", strip_diacritics = { from = {u(0xF70F)}, to = {"ญ"} }, --sort_key = {Thai = "Thai-sortkey"}, } m["mep"] = { "Miriwung", 3111847, "aus-jar", "Latn", } m["meq"] = { "Merey", 3502314, "cdc-cbm", "Latn", } m["mer"] = { "Meru", 13313, "bnt-kka", "Latn", } m["mes"] = { "Masmaje", 3440448, } m["met"] = { "Mato", 3299190, "poz-ocw", "Latn", } m["meu"] = { "Motu", 33516, "poz-ocw", "Latn", } m["mev"] = { "Mano", 3913286, "dmn-mda", "Latn", } m["mew"] = { "Maaka", 3438764, "cdc-wst", "Latn", } m["mey"] = { "Hassaniya Arabic", 56231, "sem-arb", "Arab", } m["mez"] = { "Menominee", 13363, "alg", "Latn", sort_key = {remove_diacritics = "·"}, } m["mfa"] = { "มลายูแบบปัตตานี", 1199751, "poz-mly", "Latn, ms-Arab, Thai", strip_diacritics = { from = {u(0xF70F)}, to = {"ญ"} }, sort_key = {remove_diacritics = "'"}, -- only for thwikt } m["mfb"] = { "Bangka", 3258818, "poz-mly", "Latn, Arab", } m["mfc"] = { "Mba", 4286464, "nic-mbc", "Latn", } m["mfd"] = { 
"Mendankwe-Nkwen", 11129537, "nic-nge", "Latn", } m["mfe"] = { "ครีโอลมอริเชียส", 33661, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["mff"] = { "Naki", 36083, "nic-bbe", "Latn", } m["mfg"] = { "Mixifore", 3914478, "dmn-mok", } m["mfh"] = { "Matal", 3501751, "cdc-cbm", "Latn", } m["mfi"] = { "Wandala", 3441249, "cdc-cbm", "Latn", } m["mfj"] = { "Mefele", 3501871, "cdc-cbm", } m["mfk"] = { "North Mofu", 56303, "cdc-cbm", "Latn", } m["mfl"] = { "Putai", 56291, } m["mfm"] = { "Marghi South", 56248, } m["mfn"] = { "Cross River Mbembe", 3915395, "nic-uce", "Latn", } m["mfo"] = { "Mbe", 36075, "nic-eko", "Latn", } m["mfp"] = { "Makassar Malay", 12952776, "qfa-mix", "Latn", ancestors = "ms, mak" } m["mfq"] = { "Moba", 19921578, "nic-grm", "Latn", } m["mfr"] = { "Marrithiyel", 6773014, "aus-dal", "Latn", } m["mfs"] = { "Mexican Sign Language", 3915511, "sgn", "Latn", -- when documented } m["mft"] = { "Mokerang", 3319387, "poz-aay", "Latn", } m["mfu"] = { "Mbwela", 11004988, "bnt-clu", ancestors = "lch", } m["mfv"] = { "Mandjak", 35822, "alv-pap", } m["mfw"] = { "Mulaha", 6933720, "paa-kwa", "Latn", } m["mfx"] = { "Melo", 6813268, "omv-nom", } m["mfy"] = { "Mayo", 56729, "azc-trc", "Latn", sort_key = {remove_diacritics = c.acute}, } m["mfz"] = { "Mabaan", 20526385, "sdv", "Latn", } m["mga"] = { "ไอริชกลาง", 36116, "cel-gae", "Latn", ancestors = "sga", strip_diacritics = {remove_diacritics = c.dotabove .. c.diaer .. 
"·"}, sort_key = "mga-sortkey", } m["mgb"] = { "Mararit", 56359, "sdv-tmn", } m["mgc"] = { "Morokodo", 6913216, "csu-bbk", "Latn", } m["mgd"] = { "Moru", 6915014, "csu-mma", "Latn, Arab", } m["mge"] = { "Mango", 713659, "csu-sar", "Latn", } m["mgf"] = { "Maklew", 6739816, "paa-bul", "Latn", } m["mgg"] = { "Mpongmpong", 35924, "bnt-bek", } m["mgh"] = { "Makhuwa-Meetto", 33604, "bnt-mak", "Latn", ancestors = "vmw", } m["mgi"] = { "Jili", 3914497, "nic-pls", } m["mgj"] = { "Abureni", 3441256, "nic-cde", "Latn", } m["mgk"] = { "Mawes", 6794395, "qfa-dis", -- Papuan; isolate in Glottolog, Foley (2018) and Hammarström (2010); in the Tor-Kwerba languages per -- Usher (2020) "Latn", } m["mgl"] = { "Maleu-Kilenge", 3281884, } m["mgm"] = { "Mambae", 35774, "poz-tim", "Latn", } m["mgn"] = { "Mbangi", 11017443, "nic-ngd", "Latn", } m["mgo"] = { "Meta'", 36054, "nic-mom", "Latn", } m["mgp"] = { "Eastern Magar", 12952758, "sit-gma", "Deva, Latn", } m["mgq"] = { "Malila", 6743679, "bnt-mby", "Latn", } m["mgr"] = { "Mambwe-Lungu", 626210, "bnt-mwi", "Latn", } m["mgs"] = { "Manda (Tanzania)", 16939267, "bnt-bki", } m["mgt"] = { "Mongol", 11260674, "paa-wke", "Latn", } m["mgu"] = { "Mailu", 3278246, "paa-mal", "Latn", } m["mgv"] = { "Matengo", 6786446, "bnt-mbi", "Latn", } m["mgw"] = { "Matumbi", 6791974, "bnt-mbi", "Latn", } m["mgy"] = { "Mbunga", 6799817, "bnt-kil", } m["mgz"] = { "Mbugwe", 3426367, "bnt-mra", } m["mha"] = { "Manda (India)", 56760, "dra-kki", "Orya", translit = "Orya-translit", } m["mhb"] = { "Mahongwe", 35816, "bnt-kel", } m["mhc"] = { "Mocho", 1941682, "myn", } m["mhd"] = { "Mbugu", 36152, "qfa-mix", "Latn", ancestors = "asa", } m["mhe"] = { "Besisi", 2742262, "mkh-asl", "Latn", } m["mhf"] = { "Mamaa", 6745346, "ngf-fin", "Latn", } m["mhg"] = { "Marrgu", 6772812, } m["mhi"] = { "Ma'di", 56670, "csu-mma", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.grave .. c.tilde .. 
c.dotbelow}, } m["mhj"] = { "Mogholi", 13336, "xgn", "fa-Arab, Latn", translit = "fa-cls-translit", strip_diacritics = { ["fa-Arab"] = "ar-stripdiacritics", }, } m["mhk"] = { "Mungaka", 36068, "nic-nun", } m["mhl"] = { "Mauwake", 6794095, "ngf-kum", "Latn", } m["mhm"] = { "Makhuwa-Moniga", 6900145, "bnt-mak", } m["mhn"] = { "โมเชโน", 268130, "gmw-hgm", "Latn", ancestors = "bar", sort_key = {remove_diacritics = c.grave}, } m["mho"] = { "Mashi", 10962737, "bnt-kav", "Latn", } m["mhp"] = { "Balinese Malay", 12473441, "crp", "Latn, Bali, ms-Arab", } m["mhq"] = { "Mandan", 1957120, "sio", "Latn", } m["mhr"] = { "Eastern Mari", 3906614, "chm", "Cyrl", translit = "chm-translit", override_translit = true, strip_diacritics = {remove_diacritics = c.grave .. c.acute}, sort_key = { from = {"ё", "ҥ", "ӧ", "ӱ"}, to = {"е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]} } } m["mhs"] = { "Buru (Indonesia)", 2928650, "poz-cma", "Latn", } m["mht"] = { "Mandahuaca", 6747924, "awd-nwk", } m["mhu"] = { "Taraon", 56400, "sit-gsi", "Latn", } m["mhw"] = { "Mbukushu", 2691548, "bnt", "Latn", } m["mhx"] = { "Lhao Vo", 11149315, "tbq-brm", "Latn", } m["mhy"] = { "Ma'anyan", 2328761, "poz-bre", "Latn", } m["mhz"] = { "Mor (Austronesian)", 2122792, "poz-hce", "Latn", } m["mia"] = { "Miami", 56523, "alg", "Latn", } m["mib"] = { "Atatláhuca Mixtec", 32093046, "omq-mxt", "Latn", } m["mic"] = { "Mi'kmaq", 13321, "alg-eas", "Latn", } m["mid"] = { "Mandaic", 6991742, "sem-ase", "Mand", ancestors = "myz", translit = { Mand = "Mand-translit", }, strip_diacritics = { Mand = "Mand-stripdiacritics", } } m["mie"] = { "Ocotepec Mixtec", 25559575, "omq-mxt", "Latn", } m["mif"] = { "Mofu-Gudur", 1365132, "cdc-cbm", "Latn", } m["mig"] = { "San Miguel el Grande Mixtec", 12953719, "omq-mxt", "Latn", } m["mih"] = { "Chayuco Mixtec", 13583510, "omq-mxt", "Latn", } m["mii"] = { "Chigmecatitlán Mixtec", 12953724, "omq-mxt", "Latn", } m["mij"] = { "Mungbam", 34725, "nic-beb", "Latn", } m["mik"] = { "Mikasuki", 
13316, "nai-mus", "Latn", } m["mil"] = { "Peñoles Mixtec", 42411307, "omq-mxt", "Latn", } m["mim"] = { "Alacatlatzala Mixtec", 14697894, "omq-mxt", "Latn", } m["min"] = { "มีนังกาเบา", 13324, "poz-mly", "Latn, Arab", } m["mio"] = { "Pinotepa Nacional Mixtec", 7196415, "omq-mxt", "Latn", } m["mip"] = { "Apasco-Apoala Mixtec", 13583505, "omq-mxt", "Latn", } m["miq"] = { "Miskito", 1516803, "nai-min", "Latn", strip_diacritics = {remove_diacritics = c.circ}, } m["mir"] = { "Isthmus Mixe", 6088873, "nai-miz", "Latn", } m["mit"] = { "Southern Puebla Mixtec", 7570345, "omq-mxt", "Latn", } m["miu"] = { "Cacaloxtepec Mixtec", 12953723, "omq-mxt", "Latn", } m["miw"] = { "Akoye", 3327462, "ngf-ang", "Latn", } m["mix"] = { "Mixtepec Mixtec", 6884125, "omq-mxt", "Latn", } m["miy"] = { "Ayutla Mixtec", 13583508, "omq-mxt", "Latn", } m["miz"] = { "Coatzospan Mixtec", 3317290, "omq-mxt", "Latn", } m["mjb"] = { "Makalero", 35729, "paa-tap", "Latn", } m["mjc"] = { "San Juan Colorado Mixtec", 12953718, "omq-mxt", "Latn", } m["mjd"] = { "Northwest Maidu", 3198700, "nai-mdu", "Latn", } m["mje"] = { "Muskum", 3913334, } -- mjg "Monguor" is not recognized as a language, but it is a family code m["mji"] = { "Kim Mun", 1115317, "hmx-mie", "Latn", } m["mjj"] = { "Mawak", 11732427, "ngf-tib", "Latn", } m["mjk"] = { "Matukar", 6791963, "poz-ocw", "Latn", } m["mjl"] = { "Mandeali", 6747931, "him", "Deva, Takr", translit = { Deva = "Deva-translit", Takr = "Takr-translit", }, } m["mjm"] = { "Medebur", 6805227, "poz-ocw", "Latn", } m["mjn"] = { "Mebu", 6804364, "ngf-fin", "Latn", } m["mjo"] = { "Malankuravan", 14916887, "dra-mal", } m["mjp"] = { "Malapandaram", 10575729, "dra-tam", } m["mjq"] = { "Malaryan", 12952773, "dra-mal", } m["mjr"] = { "Malavedan", 12952775, "dra-mal", "Mlym", -- Mlym translit in [[Module:scripts/data]] } m["mjs"] = { "Miship", 3441264, "cdc-wst", "Latn", } m["mjt"] = { "Sawriya Paharia", 33907, "dra-mlo", "Beng, Deva", translit = { Beng = "Beng-translit", Deva = 
"Deva-translit", }, } m["mju"] = { "Manna-Dora", 10576453, "dra-tel", } m["mjv"] = { "Mannan", 3286037, "dra-tam", "Mlym, Taml", translit = { Taml = "Taml-translit", }, -- Mlym translit in [[Module:scripts/data]] } m["mjw"] = { "Karbi", 56591, "tbq-kuk", "Latn", } m["mjx"] = { "Mahali", 12953686, "mun", } m["mjy"] = { "Mahican", 3182562, "alg-eas", "Latn", } m["mjz"] = { "Majhi", 6737786, "inc-bih", } m["mka"] = { "Mbre", 3450154, "nic", --unclassified within niger-congo tho } m["mkb"] = { "Mal Paharia", 6583595, "inc-eas", "Deva", translit = { Deva = "Deva-translit", }, } m["mkc"] = { "Siliput", 7515090, "paa-tor", "Latn", } m["mke"] = { "Mawchi", 21403317, } m["mkf"] = { "Miya", 43328, "cdc-wst", "Latn", } m["mkg"] = { "Mak (China)", 3280623, "qfa-kms", } m["mki"] = { "Dhatki", 32480, "raj", "Deva, Mahj, Arab", } m["mkj"] = { "โมกิล", 2335528, "poz-mic", "Latn", } m["mkk"] = { "Byep", 35052, "bnt-mka", } m["mkl"] = { "Mokole", 36047, "alv-yor", "Latn", } m["mkm"] = { "Moklen", 3319380, } m["mkn"] = { "Kupang Malay", 18458203, "crp", "Latn", } m["mko"] = { "Mingang Doso", 3915382, "alv-bwj", } m["mkp"] = { "Moikodi", 6894594, "ngf-yar", "Latn", } m["mkq"] = { "Bay Miwok", 3460957, "nai-utn", "Latn", } m["mkr"] = { "Malas", 11732402, "ngf-nad", "Latn", } m["mks"] = { "Silacayoapan Mixtec", 7514027, "omq-mxt", "Latn", } m["mkt"] = { "Vamale", 14916907, "poz-cln", "Latn", } m["mku"] = { "Konyanka Maninka", 11163298, "dmn-mnk", } m["mkv"] = { "Mav̋ea", 3073532, "poz-vnn", "Latn", } m["mkx"] = { "Cinamiguin Manobo", 12953697, "mno", "Latn", } m["mky"] = { "Taba", 3512690, "poz-hce", "Latn", } m["mkz"] = { "Makasae", 35782, "paa-tap", "Latn", } m["mla"] = { "Tamambo", 1153276, "poz-vnn", "Latn", } m["mlb"] = { "Mbule", 35843, "nic-ymb", "Latn", } m["mlc"] = { "Caolan", 3446682, "tai-cho", "Latn, Hani", sort_key = {Hani = "Hani-sortkey"}, } m["mle"] = { "Manambu", 11732406, "paa-ndu", "Latn", } m["mlf"] = { "มัล", 3281057, "mkh-khm", } m["mlh"] = { "Mape", 6753787, 
"ngf-huo", "Latn", } m["mli"] = { "Malimpung", 12473435, } m["mlj"] = { "Miltu", 3441310, } m["mlk"] = { "Ilwana", 6001357, "bnt-sab", } m["mll"] = { "Malua Bay", 6744946, "poz-vnc", "Latn", } m["mlm"] = { "Mulam", 3092284, "qfa-kms", "Latn", } m["mln"] = { "Malango", 3281522, "poz-sls", "Latn", } m["mlo"] = { "Mlomp", 36009, "alv-bak", } m["mlp"] = { "Bargam", 4860543, "ngf-mad", "Latn", } m["mlq"] = { "Western Maninkakan", 11028033, "dmn-wmn", } m["mlr"] = { "Vame", 3515088, "cdc-cbm", "Latn", } m["mls"] = { "Masalit", 56557, "ssa", } m["mlu"] = { "To'abaita", 36645, "poz-sls", "Latn", } m["mlv"] = { "Mwotlap", 2475538, "poz-vnn", "Latn", } m["mlw"] = { "Moloko", 1965222, "cdc-cbm", "Latn", } m["mlx"] = { "Malfaxal", 2157421, "poz-vnc", "Latn", } m["mlz"] = { "Malaynon", 18755512, "phi", } m["mma"] = { "Mama", 3913963, "nic-jrn", } m["mmb"] = { "Momina", 6897297, } m["mmc"] = { "Michoacán Mazahua", 12953705, "oto", "Latn", } m["mmd"] = { "Maonan", 3092293, "qfa-kms", "Latn", } m["mme"] = { "Tirax", 3276286, "poz-vnc", "Latn", } m["mmf"] = { "Mundat", 56263, "cdc-wst", "Latn", } m["mmg"] = { "North Ambrym", 2842468, "poz-vnc", "Latn", } m["mmh"] = { "Mehináku", 3501838, "awd", "Latn", } m["mmi"] = { "Musar", 6940113, "ngf-tib", "Latn", } m["mmj"] = { "Majhwar", 6737795, } m["mmk"] = { "Mukha-Dora", 6933447, } m["mml"] = { "Man Met", 3194984, "mkh-pal", } m["mmm"] = { "Maii", 6735599, "poz-vnc", "Latn", } m["mmn"] = { "Mamanwa", 3206623, "phi", "Latn", } m["mmo"] = { "Mangga Buang", 12952294, "poz-ocw", "Latn", } m["mmp"] = { "Musan", 2605703, "paa-amu", "Latn", } m["mmq"] = { "Aisi", 6940074, "ngf-ais", "Latn", } m["mmr"] = { "Western Xiangxi Miao", 3307901, "hmn", "Latn", } m["mmt"] = { "Malalamai", 3281496, "poz-ocw", "Latn", } m["mmu"] = { "Mmaala", 13123461, "nic-ymb", "Latn", } m["mmv"] = { "Miriti", 6873567, "sai-tuc", "Latn", } m["mmw"] = { "Emae", 3051961, "poz-pnp", "Latn", } m["mmx"] = { "Madak", 3275205, "poz-ocw", "Latn", } m["mmy"] = { "Migaama", 
56259, "cdc-est", "Latn", } m["mmz"] = { "Mabaale", 11003249, "bnt-ngn", } m["mna"] = { "Mbula", 3303572, "poz-ocw", "Latn", } m["mnb"] = { "Muna", 6935584, "poz-mun", "Latn", } m["mnc"] = { "แมนจู", 33638, "tuw-jrc", "mnc-Mong, Latn", ancestors = "juc", -- mnc-Mong translit in [[Module:scripts/data]] } m["mnd"] = { "Mondé", 6898840, "tup", "Latn", } m["mne"] = { "Naba", 760732, "csu-bgr", } m["mnf"] = { "Mundani", 35839, "nic-mom", "Latn", } m["mng"] = { "Eastern Mnong", 12953747, "mkh-ban", "Latn, Khmr", } m["mnh"] = { "Mono (Congo)", 33501, "bad-cnt", "Latn", } m["mni"] = { "มณีปุระ", 33868, "sit", "Mtei, Beng", ancestors = "omp", translit = {Mtei = "Mtei-translit"}, } m["mnj"] = { "Munji", 33639, "ira-mny", "Arab", } m["mnk"] = { "Mandinka", 33678, "dmn-wmn", "Latn, Arab, Nkoo", } m["mnl"] = { "Tiale", 6744350, "poz-vnn", "Latn", } m["mnm"] = { "Mapena", 11732415, "ngf-dag", "Latn", } m["mnn"] = { "มนองใต้", 23857582, "mkh-ban", } m["mnp"] = { "หมิ่นเหนือ", 36457, "zhx-inm", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["mnq"] = { "Minriq", 2742268, "mkh-asl", "Latn", } m["mnr"] = { "Mono (California)", 33591, "azc-num", "Latn", } m["mnt"] = { "Maykulan", 3915696, "aus-pam", "Latn", } m["mnu"] = { "Mer", 6817854, "paa-mai", "Latn", } m["mnv"] = { "Rennellese", 3397346, "poz-pnp", "Latn", } m["mnw"] = { "มอญ", 13349, "mkh-mnc", "Mymr", ancestors = "mkh-mmn", translit = "mnw-translit", sort_key = { from = {"ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ"}, to = {"္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ"} }, } m["mnx"] = { "Manikion", 3507964, "paa-ebh", "Latn", } m["mny"] = { "Manyawa", 11002622, "bnt-mak", ancestors = "vmw", } m["mnz"] = { "Moni", 6899857, "ngf-pan", "Latn", } m["moa"] = { "Mwan", 3320111, "dmn-nbe", "Latn", } m["moc"] = { "Mocoví", 3027906, "sai-guc", "Latn", } m["mod"] = { "Mobilian", 13333, "crp", "Latn", ancestors = "cho, cic", } m["moe"] = { "มงตาแญ", 13351, "alg", "Latn", 
ancestors = "cr", strip_diacritics = {remove_diacritics = c.macron}, } m["mog"] = { "Mongondow", 3058458, "phi", "Latn", } m["moh"] = { "Mohawk", 13339, "iro-nor", "Latn", ancestors = "iro-omo", } m["moi"] = { "Mboi", 3914417, "alv-yun", } m["moj"] = { "Monzombo", 11154772, "nic-nkk", "Latn", } m["mok"] = { "Morori", 6913275, } m["mom"] = { "Monimbo", 56542, } m["moo"] = { "Monom", 6901726, "mkh-ban", } m["mop"] = { "Mopan Maya", 36183, "myn", "Latn", } m["moq"] = { "Mor (Papuan)", 11732468, "qfa-dis", -- Papuan; isolate in Glottolog and Palmer (2018); top-level TNG in Ross (2005), in Berau Gulf (under -- TNG) in Usher (2020) } m["mor"] = { "Moro", 36172, "alv-hei", "Latn", } m["mos"] = { "Moore", 36096, "nic-mre", "Latn", } m["mot"] = { "Barí", 2886281, "cba", "Latn", } m["mou"] = { "Mogum", 3440473, "cdc-est", "Latn", } m["mov"] = { "Mojave", 56510, "nai-yuc", "Latn", } m["mow"] = { "Moi (Congo)", 11124792, "bnt-bmo", "Latn", } m["mox"] = { "Molima", 3319495, "poz-ocw", "Latn", } m["moy"] = { "Shekkacho", 56827, "omv-gon", } m["moz"] = { "Mukulu", 3440403, "cdc-est", } m["mpa"] = { "Mpoto", 6928303, "bnt-mbi", "Latn", } m["mpb"] = { "Mullukmulluk", 6741120, } m["mpc"] = { "Mangarayi", 6748829, } m["mpd"] = { "Machinere", 12953681, "awd", "Latn", } m["mpe"] = { "Majang", 56724, "sdv", } m["mpg"] = { "Marba", 56614, "cdc-mas", } m["mph"] = { "Maung", 6792550, "aus-wdj", "Latn", } m["mpi"] = { "Mpade", 3280670, "cdc-cbm", "Latn", } m["mpj"] = { "Martu Wangka", 3295916, "aus-pam", "Latn", } m["mpk"] = { "Mbara (Chad)", 3912770, "cdc-cbm", } m["mpl"] = { "Middle Watut", 15887910, "poz-ocw", "Latn", } m["mpm"] = { "Yosondúa Mixtec", 12953741, "omq-mxt", "Latn", } m["mpn"] = { "Mindiri", 6863842, "poz-ocw", "Latn", } m["mpo"] = { "Miu", 6883668, "poz-ocw", "Latn", } m["mpp"] = { "Migabac", 11732448, "ngf-huo", "Latn", } m["mpq"] = { "Matís", 3299145, "sai-pan", "Latn", } m["mpr"] = { "Vangunu", 3554582, "poz-ocw", "Latn", } m["mps"] = { "Dadibi", 5208077, "paa-teb", 
"Latn", } m["mpt"] = { "Mian", 12952846, "ngf-okk", "Latn", } m["mpu"] = { "Makuráp", 3281037, "tup", "Latn", } m["mpv"] = { "Mungkip", 11732485, "ngf-fin", "Latn", } m["mpw"] = { "Mapidian", 6753812, "awd", "Latn", } m["mpx"] = { "Misima-Paneati", 6875666, "poz-ocw", "Latn", } m["mpy"] = { "Mapia", 3287224, "poz-mic", "Latn", } m["mpz"] = { "Mpi", 6928276, "tbq-bka", } m["mqa"] = { "Maba", 3273750, } m["mqb"] = { "Mbuko", 3502213, "cdc-cbm", "Latn", } m["mqc"] = { "Mangole", 6749097, "poz-cma", "Latn", } m["mqe"] = { "Matepi", 11732426, "ngf-han", "Latn", } m["mqf"] = { "Momuna", 6897518, } m["mqg"] = { "Kota Bangun Kutai Malay", 12952778, } m["mqh"] = { "Tlazoyaltepec Mixtec", 12953740, "omq-mxt", "Latn", } m["mqi"] = { "Mariri", 6765544, } m["mqj"] = { "Mamasa", 6745452, "poz-ssw", "Latn", } m["mqk"] = { "Rajah Kabunsuwan Manobo", 12953700, "mno", } m["mql"] = { "Mbelime", 4286473, "nic-eov", "Latn", } m["mqm"] = { "South Marquesan", 19694214, "poz-pep", "Latn", } m["mqn"] = { "Moronene", 642581, "poz-btk", "Latn", } m["mqo"] = { "Modole", 11732457, "paa-nha", "Latn", } m["mqp"] = { "Manipa", 6749799, "poz-cma", "Latn", } m["mqq"] = { "Minokok", 18642293, "poz-san", "Latn", } m["mqr"] = { "Mander", 6747979, "paa-tkw", } m["mqs"] = { "West Makian", 3033575, "paa-nha", "Latn", } m["mqt"] = { "Mok", 13018559, "mkh-pal", } m["mqu"] = { "Mandari", 3285426, "sdv-bri", } m["mqv"] = { "Mosimo", 11732478, "ngf-nwh", "Latn", } m["mqw"] = { "Murupi", 11732486, "ngf-nwh", "Latn", } m["mqx"] = { "Mamuju", 6746004, "poz-ssw", "Latn", } m["mqy"] = { "Manggarai", 3285748, "poz-cet", "Latn", } m["mqz"] = { "Malasanga", 14916889, "poz-ocw", "Latn", } m["mra"] = { "Mlabri", 3073465, "mkh", } m["mrb"] = { "Sungwadia", 3293299, "poz-vnn", "Latn", } m["mrc"] = { "Maricopa", 56386, "nai-yuc", "Latn", } m["mrd"] = { "Western Magar", 22303263, "sit-gma", "Deva", translit = { Deva = "Deva-translit", }, } m["mre"] = { "Martha's Vineyard Sign Language", 33494, "sgn", "Latn, Sgnw", } 
m["mrf"] = {
	"Elseng",
	3915667,
	"qfa-unc", -- "Border or language isolate"; unclassifiable due to paucity of data
	"Latn",
}
m["mrg"] = { "Mising", 3316328, "sit-tan", "Latn, Beng, Deva", ancestors = "adi", translit = { Beng = "Beng-translit", Deva = "Deva-translit", }, }
m["mrh"] = { "Mara Chin", 4175893, "tbq-kuk", "Latn", }
m["mrj"] = { "Western Mari", 1776032, "chm", "Cyrl", translit = "chm-translit", sort_key = "mrj-sortkey", }
m["mrk"] = { "Hmwaveke", 5873712, "poz-cln", "Latn", }
m["mrl"] = { "Mortlockese", 3324598, "poz-mic", "Latn", }
m["mrm"] = { "Mwerlap", 3331115, "poz-vnn", "Latn", }
m["mrn"] = { "Cheke Holo", 2962165, "poz-ocw", "Latn", }
m["mro"] = { "Mru", 1951521, "sit-mru", "Latn, Mroo", }
m["mrp"] = { "Morouas", 6913299, "poz-vnn", "Latn", }
m["mrq"] = { "North Marquesan", 2603808, "poz-pep", "Latn", }
m["mrr"] = { "Hill Maria", 27602, "dra-mdy", "Deva", }
m["mrs"] = { "Maragus", 6754640, "poz-vnc", "Latn", }
m["mrt"] = { "Margi", 56241, "cdc-cbm", "Latn", }
m["mru"] = { "Mono (Cameroon)", 11031964, "alv-mbm", "Latn", }
m["mrv"] = { "Mangarevan", 36237, "poz-pep", "Latn", }
m["mrw"] = { "มาราเนา", 33800, "phi", "Latn, Arab", }
m["mrx"] = { "Dineor", 5278044, "paa-tkw", "Latn", }
m["mry"] = { "Karaga Mandaya", 6747925, "phi", }
m["mrz"] = { "Marind", 6763970, "paa-ani", "Latn", }
m["msb"] = { "มัสบาเต", 33948, "phi", "Latn", }
m["msc"] = { "Sankaran Maninka", 11155812, "dmn-mnk", }
m["msd"] = {
	"Yucatec Maya Sign Language",
	34281,
	"sgn",
	"Latn", -- when documented
}
m["mse"] = { "Musey", 56328, "cdc-mas", }
m["msf"] = { "Mekwei", 4544752, "paa-nim", "Latn", }
m["msg"] = { "Moraid", 6909020, "paa-wbh", "Latn", }
m["msi"] = { "Sabah Malay", 10867404, "crp", "Latn, Arab", }
m["msj"] = { "Ma", 6720909, "nic-mbc", "Latn", }
m["msk"] = { "Mansaka", 12952800, "phi", "Latn", }
m["msl"] = { "Molof", 4300950, }
m["msm"] = { "Agusan Manobo", 12953696, "mno", "Latn", }
m["msn"] = { "Vurës", 3563857, "poz-vnn", "Latn", }
m["mso"] = { "Mombum", 6897079,
"ngf-mom", "Latn", } m["msp"] = { "Maritsauá", 6765915, "tup", "Latn", } m["msq"] = { "Caac", 2932212, "poz-cln", "Latn", } m["msr"] = { "Mongolian Sign Language", 3915499, "sgn", } m["mss"] = { "West Masela", 12952816, "poz-tim", } m["msu"] = { "Musom", 6943041, "poz-ocw", "Latn", } m["msv"] = { "Maslam", 3502273, } m["msw"] = { "Mansoanka", 35814, } m["msx"] = { "Moresada", 11732475, "ngf-pom", "Latn", } m["msy"] = { "Aruamu", 3501809, "paa-ram", "Latn", } m["msz"] = { "Momare", 6897030, "ngf-huo", "Latn", } m["mta"] = { "Cotabato Manobo", 12953698, "mno", "Latn", } m["mtb"] = { "Anyin Morofo", 3502338, "alv-ctn", "Latn", ancestors = "any", } m["mtc"] = { "Munit", 11732482, "ngf-kok", "Latn", } m["mtd"] = { "Mualang", 3073458, "poz-mly", "Latn", } m["mte"] = { "Alu", 33503, "poz-ocw", "Latn", } m["mtf"] = { "Murik (New Guinea)", 7050035, "paa-lsp", "Latn", } m["mtg"] = { "Una", 5580728, "ngf-mek", } m["mth"] = { "Munggui", 6936018, "poz-hce", "Latn", } m["mti"] = { "Maiwa (New Guinea)", 6737223, "ngf-dag", "Latn", } m["mtj"] = { "Moskona", 11288953, "paa-ebh", "Latn", } m["mtk"] = { "Mbe'", 10964025, "nic-nka", "Latn", } m["mtl"] = { "Montol", 3440457, "cdc-wst", "Latn", } m["mtm"] = { "Mator", 20669419, "syd", "Cyrl", } m["mtn"] = { "Matagalpa", 3490756, "nai-min", } m["mto"] = { "Totontepec Mixe", 7828400, "nai-miz", "Latn", } m["mtp"] = { "Wichí Lhamtés Nocten", 5908756, "sai-wic", "Latn", } m["mtq"] = { "เหมื่อง", 3236789, "mkh-vie", "Latn", sort_key = "vi-sortkey", } m["mtr"] = { "เมวาร์", 2992857, "raj", "Deva", translit = "Deva-translit", -- for now } m["mts"] = { "Yora", 3572572, "sai-pan", "Latn", } m["mtt"] = { "Mota", 3325052, "poz-vnn", "Latn", } m["mtu"] = { "Tututepec Mixtec", 7857069, "omq-mxt", "Latn", } m["mtv"] = { "Asaro'o", 3503684, "ngf-fin", "Latn", } m["mtw"] = { "Magahat", 6729600, "phi", } m["mtx"] = { "Tidaá Mixtec", 7800805, "omq-mxt", "Latn", } m["mty"] = { "Nabi", 6956858, "paa-tor", "Latn", } m["mua"] = { "Mundang", 36032, "alv-mbm", 
}
m["mub"] = { "Mubi", 3440518, "cdc-est", "Latn", }
m["muc"] = { "Mbu'", 35868, "nic-beb", "Latn", }
m["mud"] = { "Mednyj Aleut", 1977419, "qfa-mix", ancestors = "ale, ru" }
m["mue"] = { "Media Lengua", 36066, "qfa-mix", "Latn", ancestors = "es, qu", }
m["mug"] = { "Musgu", 3123545, "cdc-cbm", "Latn", }
m["muh"] = { "Mündü", 35981, "nic-nke", "Latn", }
m["mui"] = { "มูซี", 615660, "poz-mly", "Latn", }
m["muj"] = { "Mabire", 3440437, }
m["mul"] = {
	"ร่วม", -- "ภาษาร่วม" (shared language); used in place of "ข้ามภาษา" (translingual)
	7834564,
	"qfa-not",
	"All",
	-- NOTE: The following sort keys are used in process_page() in [[Module:headword/page]], which generates
	-- the default sort key for the page (corresponding to {{DEFAULTSORT:...}}) by generating a sort key for
	-- the pagename using `makeSortKey()` called on language object "mul". Currently this just handles
	-- Japanese sort keys.
	--
	-- FIXME: This should be smarter and use the language of the page if there's only one.
	sort_key = {
		Hani = "Hani-sortkey",
		Jpan = "Jpan-sortkey",
		Hrkt = "Hira-sortkey", -- Sort all kana as Hira.
		Hira = "Hira-sortkey",
		Kana = "Hira-sortkey",
	},
	standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz" ..
c.punc, } m["mum"] = { "Maiwala", 12952764, "poz-ocw", "Latn", } m["muo"] = { "Nyong", 36373, "alv-lek", } m["mup"] = { "Malvi", 33413, "raj", "Deva", translit = "Deva-translit" } m["muq"] = { "Eastern Xiangxi Miao", 27431376, "hmn", } m["mur"] = { "Murle", 56727, "sdv", } m["mus"] = { "Creek", 523014, "nai-mus", "Latn", } m["mut"] = { "Western Muria", 12952886, "dra-mur", } m["muu"] = { "Yaaku", 34222, "cus-eas", } m["muv"] = { "Muthuvan", 3327420, "dra-tam", } m["mux"] = { "Bo-Ung", 15831607, "ngf-chw", "Latn", } m["muy"] = { "Muyang", 3502301, "cdc-cbm", "Latn", } m["muz"] = { "Mursi", 36013, "sdv", } m["mva"] = { "Manam", 6746851, "poz-ocw", "Latn", } m["mvb"] = { "Mattole", 20824, "ath-pco", "Latn", } m["mvd"] = { "Mamboru", 578815, "poz", "Latn", } m["mvg"] = { "Yucuañe Mixtec", 25562736, "omq-mxt", "Latn", } m["mvh"] = { "Mire", 3441359, } m["mvi"] = { "มิยาโกะ", 36218, "jpx-sry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["mvk"] = { "Mekmek", 6810592, "paa-yua", "Latn", } m["mvl"] = { "Mbara (Australia)", 6799620, "aus-pam", } m["mvm"] = { "Muya", 2422759, "sit-qia", } m["mvn"] = { "Minaveha", 6863278, "poz-ocw", "Latn", } m["mvo"] = { "Marovo", 3294683, "poz-ocw", "Latn", } m["mvp"] = { "Duri", 3915414, "poz-ssw", "Latn", } m["mvq"] = { "Moere", 11732458, "ngf-kum", "Latn", } m["mvr"] = { "Marau", 6755069, "poz-hce", "Latn", } m["mvs"] = { "Massep", 3502895, "paa-tkw", } m["mvt"] = { "Mpotovoro", 6928305, "poz-vnc", "Latn", } m["mvu"] = { "Marfa", 713633, } m["mvv"] = { "Tagal Murut", 7675300, "poz-san", "Latn", } m["mvw"] = { "Machinga", 12952754, "bnt-rvm", } m["mvx"] = { "Meoswar", 6817777, "poz-hce", "Latn", } m["mvy"] = { "Indus Kohistani", 33399, "inc-koh", "Arab", } m["mvz"] = { "Mesqan", 6821677, "sem-eth", } m["mwa"] = { "Mwatebu", 14916896, "poz-ocw", "Latn", } m["mwb"] = { "Juwal", 6319103, "paa-tor", "Latn", } m["mwc"] = { "Are", 29277, 
"poz-ocw", "Latn", } m["mwe"] = { "Mwera", 6944725, "bnt-rvm", "Latn", } m["mwf"] = { "Murrinh-Patha", 2980398, "aus-dal", "Latn", } m["mwg"] = { "Aiklep", 3399652, "poz-ocw", "Latn", } m["mwh"] = { "Mouk-Aria", 3325498, "poz-ocw", "Latn", } m["mwi"] = { "Labo", 2157452, "poz-vnc", "Latn", } m["mwk"] = { "Kita Maninkakan", 3015523, "dmn-wmn", } m["mwl"] = { "มีรังดา", 13330, "roa-asl", "Latn", } m["mwm"] = { "Sar", 56850, "csu-sar", "Latn", } m["mwn"] = { "Nyamwanga", 6944666, "bnt-mwi", "Latn", } m["mwo"] = { "Sungwadaga", 3276435, "poz-vnn", "Latn", } m["mwp"] = { "Kala Lagaw Ya", 2591262, "aus-pam", "Latn", } m["mwq"] = { "Mün Chin", 331340, "tbq-kuk", } m["mwr"] = { "มาร์วาร์", 56312, "raj", "Deva, Mahj", translit = { Deva = "Deva-translit", -- for now Mahj = "Mahj-translit", }, } m["mws"] = { "Mwimbi-Muthambi", 15632357, "bnt-kka", "Latn", } m["mwt"] = { "Moken", 18648701, "poz", } m["mwu"] = { "Mittu", 6883573, "csu-bbk", "Latn", } m["mwv"] = { "Mentawai", 13365, "poz-nws", "Latn", } m["mww"] = { "ม้งขาว", 3138829, "hmn", "Latn, Hmng, Hmnp", } m["mwz"] = { "Moingi", 11011905, } m["mxa"] = { "Northwest Oaxaca Mixtec", 12953739, "omq-mxt", "Latn", } m["mxb"] = { "Tezoatlán Mixtec", 3317286, "omq-mxt", "Latn", } m["mxd"] = { "Modang", 6888037, "poz", "Latn", } m["mxe"] = { "Mele-Fila", 3305008, "poz-pnp", "Latn", } m["mxf"] = { "Malgbe", 3502224, } m["mxg"] = { "Mbangala", 6799612, "bnt-yak", } m["mxh"] = { "Mvuba", 6944591, "csu-mle", "Latn", } m["mxi"] = { "Mozarabic", 317044, "roa-ibe", "Arab, Hebr, Latn", translit = "mxi-translit", strip_diacritics = { Arab = "ar-stripdiacritics", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["mxj"] = { "Miju", 56332, "sit-mdz", "Latn, Deva", translit = { Deva = "Deva-translit", }, } m["mxk"] = { "Monumbo", 6906792, "paa-tor", } m["mxl"] = { "Maxi Gbe", 35770, "alv-gbe", } m["mxm"] = { "Meramera", 6817936, "poz-ocw", "Latn", } m["mxn"] = { "Moi (Indonesia)", 11732459, "paa-wbh", "Latn", } 
m["mxo"] = { "Mbowe", 10962309, "bnt-kav", } m["mxp"] = { "Tlahuitoltepec Mixe", 7810697, } m["mxq"] = { "Juquila Mixe", 25559721, } m["mxr"] = { "Murik (Malaysia)", 3328150, nil, "Latn", } m["mxs"] = { "Huitepec Mixtec", 12953729, "omq-mxt", "Latn", } m["mxt"] = { "Jamiltepec Mixtec", 12953730, "omq-mxt", "Latn", } m["mxu"] = { "Mada (Cameroon)", 3441206, "cdc-cbm", "Latn", } m["mxv"] = { "Metlatónoc Mixtec", 36363, "omq-mxt", "Latn", } m["mxw"] = { "Namo", 12952923, "paa-yam", "Latn", } m["mxx"] = { "Mahou", 11004334, "dmn-mnk", "Latn, Nkoo", } m["mxy"] = { "Southeastern Nochixtlán Mixtec", 7070684, "omq-mxt", "Latn", } m["mxz"] = { "Central Masela", 42575433, "poz-tim", "Latn", } m["myb"] = { "Mbay", 3033565, "csu-sar", "Latn", } m["myc"] = { "Mayeka", 11129517, "bnt-boa", } m["mye"] = { "Myene", 35832, "bnt-tso", "Latn", } m["myf"] = { "Bambassi", 56540, "omv-mao", "Latn", } m["myg"] = { "Manta", 35799, "nic-mom", "Latn", } m["myh"] = { "Makah", 3280640, "wak", "Latn", } m["myj"] = { "Mangayat", 35988, "nic-ser", } m["myk"] = { "Mamara Senoufo", 36187, "alv-sma", "Latn", } m["myl"] = { "Moma", 6897018, "poz", "Latn", } m["mym"] = { "Me'en", 3408516, "sdv", } m["myo"] = { "Anfillo", 34928, "omv-gon", } m["myp"] = { "Pirahã", 33825, "sai-mur", "Latn", } m["myr"] = { "Muniche", 3915654, } m["mys"] = { "Mesmes", 3508617, "sem-eth", } m["myu"] = { "Mundurukú", 746723, "tup", "Latn", } m["myv"] = { "เอร์เซีย", 29952, "urj-mdv", "Cyrl", translit = "myv-translit", override_translit = true, } m["myw"] = { "Muyuw", 3502878, "poz-ocw", "Latn", } m["myx"] = { "Masaba", 12952814, "bnt-msl", "Latn", } m["myy"] = { "Macuna", 3275059, "sai-tuc", "Latn", } m["myz"] = { "Classical Mandaic", 25559314, "sem-ase", "Mand", translit = { Mand = "Mand-translit", }, strip_diacritics = { Mand = "Mand-stripdiacritics", } } m["mza"] = { "Santa María Zacatepec Mixtec", 8063756, "omq-mxt", "Latn", } m["mzb"] = { "Northern Saharan Berber", 11156769, "ber", "Arab, Latn, Tfng", } m["mzc"] = { 
"Madagascar Sign Language", 12715020, "sgn", } m["mzd"] = { "Malimba", 35806, "bnt-saw", } m["mze"] = { "Morawa", 6909384, "paa-mal", "Latn", } m["mzg"] = { "Monastic Sign Language", 3217333, "sgn", } m["mzh"] = { "Wichí Lhamtés Güisnay", 7998197, "sai-wic", "Latn", } m["mzi"] = { "Ixcatlán Mazatec", 6101049, "omq-maz", "Latn", } m["mzj"] = { "Manya", 11006832, "dmn-mnk", } m["mzk"] = { "Nigeria Mambila", 11004163, "nic-mmb", "Latn", } m["mzl"] = { "Mazatlán Mixe", 25559728, } m["mzm"] = { "Mumuye", 36021, "alv-mum", "Latn", } m["mzn"] = { "มอแซนแดรอน", 13356, "ira-msh", "mzn-Arab", } m["mzo"] = { "Matipuhy", 6787588, "sai-kui", "Latn", } m["mzp"] = { "Movima", 1659701, "qfa-iso", "Latn", } m["mzq"] = { "Mori Atas", 3324070, "poz-btk", "Latn", } m["mzr"] = { "Marúbo", 3296011, "sai-pan", "Latn", } m["mzs"] = { "ครีโอลมาเก๊า", 35785, "crp", "Latn", ancestors = "pt", sort_key = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.cedilla}}, } m["mzt"] = { "Mintil", 6869641, "mkh-asl", } m["mzu"] = { "Inapang", 6013569, "paa-ram", "Latn", } m["mzv"] = { "Manza", 36038, "gba-eas", } m["mzw"] = { "Deg", 35183, "nic-gnw", "Latn", } m["mzx"] = { "Mawayana", 6794377, "awd", } m["mzy"] = { "Mozambican Sign Language", 6927809, "sgn", } m["mzz"] = { "Maiadomu", 6735234, "poz-ocw", "Latn", } return require("Module:languages").finalizeData(m, "language") 7qlw76ybprn9d4lm5euyznbcbx2a5ye มอดูล:languages/data/3/k 828 36376 5720760 5684159 2026-04-21T07:01:03Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720760 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end

local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared

local m = {}

m["kaa"] = {
	"การากัลปัก",
	33541,
	"trk-kno",
	"Latn, Cyrl, fa-Arab",
	dotted_dotless_i = true,
	strip_diacritics = {
		from = {"['’]"},
		to = {"ʼ"}
	},
	sort_key = {
		Latn = {
			from = {
				-- Sort the old orthography (using the apostrophe) after the new orthography (using the acute accent).
				"í", "iʼ", "i", -- Ensure "i" comes after "í", "iʼ", "ı".
				"sh", "ch", "á", "aʼ", "ǵ", "gʼ", "x", p[4], p[5], "ı", "q", "ń", "nʼ", "ó", "oʼ", "ú", "uʼ", "c"
			},
			to = {
				p[4], p[5], "i" .. p[3],
				"z" .. p[1], "z" .. p[3], "a" .. p[1], "a" .. p[2], "g" .. p[1], "g" .. p[2], "h" .. p[1], "i", "i" .. p[1], "i" .. p[2], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "u" .. p[1], "u" .. p[2], "z" .. p[2]
			}
		},
		Cyrl = {
			from = {"ә", "ғ", "ё", "қ", "ң", "ө", "ү", "ў", "ҳ"},
			to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "у" .. p[2], "х" .. p[1]}
		},
	},
}
m["kab"] = { "กะไบล์", 35853, "ber", "Latn, Arab, Tfng", }
m["kac"] = { "จิ่งเผาะ", 33332, "sit-jnp", "Latn, Mymr", }
m["kad"] = { "Kadara", 3914011, "nic-plc", "Latn", }
m["kae"] = { "Ketangalan", 2779411, "map", }
m["kaf"] = { "Katso", 246122, "tbq-kzh", }
m["kag"] = { "Kajaman", 6348863, "poz", "Latn", }
m["kah"] = { "Fer", 5443742, "csu-bgr", "Latn", }
m["kai"] = { "Karekare", 3438770, "cdc-wst", "Latn", }
m["kaj"] = { "Jju", 35401, "nic-plc", "Latn", }
m["kak"] = { "Kayapa Kallahan", 3192220, "phi", "Latn", }
m["kam"] = { "Kamba", 2574767, "bnt-kka", "Latn", }
m["kao"] = { "Kassonke", 36905, "dmn-wmn", "Latn", }
m["kap"] = { "Bezhta", 33054, "cau-ets", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, }
m["kaq"] = { "Capanahua", 2937196, "sai-pan", "Latn", }
m["kaw"] = {
	"ชวาเก่า",
	49341,
	"poz",
	"Latn, Java, Kawi",
	--translit = "jv-translit", --same as jv
}
m["kax"] = { "Kao", 3192799, "paa-nha", "Latn", } m["kay"] = { "Kamayurá", 3192336, "tup-gua", "Latn", } m["kba"] = { "Kalarko", 5517764, "aus-pam", "Latn", } m["kbb"] = { "Kaxuyana", 12953626, "sai-prk", "Latn", } m["kbc"] = { "Kadiwéu", 18168288, "sai-guc", "Latn", } m["kbd"] = { "คาบาร์เดีย", 33522, "cau-cir", "Cyrl, Latn, Arab", translit = { Cyrl = "cau-cir-translit", Arab = "ar-translit", }, override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = { "кхъу", "къӏу", -- 4 chars "гъу", "джу", "дзу", "жъу", "къу", "кхъ", "къӏ", "кӏу", "кӏь", "лъу", "лӏу", "пӏу", "сӏу", "тӏу", "фӏу", "хъу", "цӏу", "чъу", "чӏу", "шъу", "шӏу", "щӏу", -- 3 chars "гу", "гъ", "гь", "дж", "дз", "ё", "жъ", "жь", "ку", "къ", "кь", "кӏ", "лъ", "ль", "лӏ", "пӏ", "сӏ", "тӏ", "фӏ", "ху", "хъ", "хь", "цу", "цӏ", "чу", "чъ", "чӏ", "шъ", "шӏ", "щӏ", "ӏу", "ӏь", -- 2 chars "э" -- 1 char }, to = { "к" .. p[5], "к" .. p[7], "г" .. p[3], "д" .. p[2], "д" .. p[4], "ж" .. p[2], "к" .. p[3], "к" .. p[4], "к" .. p[6], "к" .. p[10], "к" .. p[11], "л" .. p[2], "л" .. p[5], "п" .. p[2], "с" .. p[2], "т" .. p[2], "ф" .. p[2], "х" .. p[3], "ц" .. p[3], "ч" .. p[3], "ч" .. p[5], "ш" .. p[2], "ш" .. p[4], "щ" .. p[2], "г" .. p[1], "г" .. p[2], "г" .. p[4], "д" .. p[1], "д" .. p[3], "е" .. p[1], "ж" .. p[1], "ж" .. p[3], "к" .. p[1], "к" .. p[2], "к" .. p[8], "к" .. p[9], "л" .. p[1], "л" .. p[3], "л" .. p[4], "п" .. p[1], "с" .. p[1], "т" .. p[1], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2], "ч" .. p[4], "ш" .. p[1], "ш" .. p[3], "щ" .. p[1], "ӏ" .. p[1], "ӏ" .. p[2], "а" .. 
p[1] } }, }, } m["kbe"] = { "Kanju", 10543322, "aus-pam", "Latn", } m["kbh"] = { "Camsá", 2842667, "qfa-iso", "Latn", } m["kbi"] = { "Kaptiau", 6367294, "poz-oce", "Latn", } m["kbj"] = { "Kari", 6370438, "bnt-boa", "Latn", } m["kbk"] = { "Grass Koiari", 12952642, "ngf-koi", "Latn", } m["kbm"] = { "Iwal", 3156391, "poz-ocw", "Latn", } m["kbn"] = { "Kare (Central Africa)", 35554, "alv-mbm", "Latn", } m["kbo"] = { "Keliko", 11275553, "csu-mma", } m["kbp"] = { "Kabiyé", 35475, "nic-gne", "Latn", } m["kbq"] = { "Kamano", 11732272, "ngf-kag", "Latn", } m["kbr"] = { "Kafa", 35481, "omv-gon", "Ethi, Latn", } m["kbs"] = { "Kande", 35556, "bnt-tso", "Latn", } m["kbt"] = { "Gabadi", 3291159, "poz-ocw", "Latn", } m["kbu"] = { "Kabutra", 10966761, "raj", } m["kbv"] = { "Kamberataro", 5261289, "paa-sng", "Latn", } m["kbw"] = { "Kaiep", 6347632, "poz-ocw", "Latn", } m["kbx"] = { "Ap Ma", 56298, "paa-eke", "Latn", } m["kbz"] = { "Duhwa", 56295, "cdc-wst", "Latn", } m["kcb"] = { "Kawacha", 11732302, "ngf-ang", "Latn", } m["kcc"] = { "Lubila", 3914381, "nic-uce", "Latn", } m["kcd"] = { "Ngkâlmpw Kanum", 12952566, "paa-yam", "Latn", } m["kce"] = { "Kaivi", 6348685, "nic-kau", } m["kcf"] = { "Ukaan", 36651, "nic-bco", } m["kcg"] = { "Tyap", 3912765, "nic-plc", "Latn", } m["kch"] = { "Vono", 3913920, "nic-kau", } m["kci"] = { "Kamantan", 3914019, "nic-plc", } m["kcj"] = { "Kobiana", 35609, "alv-nyn", } m["kck"] = { "Kalanga", 33672, "bnt-sho", "Latn", } m["kcl"] = { "Kala", 6349982, "poz-ocw", "Latn", } m["kcm"] = { "Tar Gula", 277963, "csu-bba", } m["kcn"] = { "Nubi", 36388, "crp", "Latn, Arab", ancestors = "apd", strip_diacritics = {remove_diacritics = c.acute}, } m["kco"] = { "Kinalakna", 11732320, "ngf-huo", "Latn", } m["kcp"] = { "Kanga", 6362384, "qfa-kad", "Latn", } m["kcq"] = { "Kamo", 3914879, "alv-wjk", } m["kcr"] = { "Katla", 35688, "nic-ktl", } m["kcs"] = { "Koenoem", 3438755, "cdc-wst", } m["kct"] = { "Kaian", 6347538, "paa-ram", "Latn", } m["kcu"] = { "Kikami", 3915212, 
"bnt-ruv", "Latn", } m["kcv"] = { "Kete", 3195598, "bnt-lub", } m["kcw"] = { "Kabwari", 6344539, "bnt-glb", } m["kcx"] = { "Kachama-Ganjule", 12634070, "omv-eom", } m["kcy"] = { "Korandje", 33427, "son", } m["kcz"] = { "Konongo", 11732345, "bnt-tkm", "Latn", } m["kda"] = { "Worimi", 3914062, "aus-pam", "Latn", } m["kdc"] = { "Kutu", 6448634, "bnt-ruv", } m["kdd"] = { "Yankunytjatjara", 34207, "aus-pam", "Latn", } m["kde"] = { "Makonde", 35172, "bnt-rvm", "Latn", } m["kdf"] = { "Mamusi", 6746036, "poz-ocw", "Latn", } m["kdg"] = { "Seba", 7442316, "bnt-sbi", "Latn", } m["kdh"] = { "Tem", 36531, "nic-gne", "Latn", } m["kdi"] = { "Kumam", 6443410, "sdv-los", } m["kdj"] = { "Karamojong", 56326, "sdv-ttu", "Latn", } m["kdk"] = { "Numèè", 3346774, "poz-cln", "Latn", } m["kdl"] = { "Tsikimba", 3914404, "nic-kam", } m["kdm"] = { "Kagoma", 3914420, "nic-plc", } m["kdn"] = { "Kunda", 4121130, "bnt-sna", "Latn", } m["kdp"] = { "Kaningdon-Nindem", 3914956, "nic-nin", } m["kdq"] = { "Koch", 56431, "tbq-bdg", } m["kdr"] = { "Karaim", 33725, "trk-kcu", "Cyrl, Latn, Hebr", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["kdt"] = { "กูย", 56310, "mkh-kat", "Thai, Khmr, Laoo", } m["kdu"] = { "Kadaru", 35441, "nub-hil", "Latn", } m["kdv"] = { "Kado", 7402721, "sit-luu", } m["kdw"] = { "Koneraw", 11732341, "ngf-mom", "Latn", } m["kdx"] = { "Kam", 36753, "alv-wjk", } m["kdy"] = { "Keder", 6383641, "paa-tkw", } m["kdz"] = { "Kwaja", 11128866, "nic-nka", "Latn", } m["kea"] = { "ครีโอลกาบูเวร์ดี", 35963, "crp", "Latn", ancestors = "pt", } m["keb"] = { "Kélé", 35559, "bnt-kel", } m["kec"] = { "Keiga", 3409311, "qfa-kad", "Latn", } m["ked"] = { "Kerewe", 6393846, "bnt-haj", } m["kee"] = { "Eastern Keres", 15649021, "nai-ker", "Latn", } m["kef"] = { "Kpessi", 35748, "alv-gbe", } m["keg"] = { "Tese", 16887296, "sdv", } m["keh"] = { "Keak", 6382110, "paa-ndu", "Latn", } m["kei"] = { "Kei", 2410352, "poz-cet", } m["kej"] = { "Kadar", 6345179, "dra-mal", } m["kek"] 
= { "Q'eqchi", 35536, "myn", "Latn", } m["kel"] = { "Kela-Yela", 6385426, "bnt-mon", "Latn", } m["kem"] = { "Kemak", 35549, "poz-tim", "Latn", } m["ken"] = { "Kenyang", 35650, "nic-mam", "Latn", } m["keo"] = { "Kakwa", 3033547, "sdv-bri", } m["kep"] = { "Kaikadi", 6347757, "dra-tam", } m["keq"] = { "Kamar", 14916877, "inc-hal", } m["ker"] = { "Kera", 56251, "cdc-est", "Latn", } m["kes"] = { "Kugbo", 3813394, "nic-cde", "Latn", } m["ket"] = { "Ket", 33485, "qfa-yke", "Cyrl", strip_diacritics = { from = {"['’]"}, to = {"ʼ"} }, sort_key = { from = {"ӷ", "ё", "ӄ", "ӈ", "ө", "ә", "ʼ"}, to = {"г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "ъ" .. p[1], "ь" .. p[1]} }, } m["keu"] = { "Akebu", 35026, "alv-ktg", "Latn", } m["kev"] = { "Kanikkaran", 6363201, "dra-mal", "Taml, Mlym", -- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["kew"] = { "Kewa", 12952619, "ngf-eng", "Latn", } m["kex"] = { "Kukna", 5031131, "inc-eas", ancestors = "bh", } m["key"] = { "Kupia", 6445354, "inc-eas", } m["kez"] = { "Kukele", 3915391, "nic-ucn", "Latn", } m["kfa"] = { "Kodava", 33531, "dra-kod", "Knda, Mlym", -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["kfb"] = { "Kolami", 33479, "dra-knk", "Deva, Telu", translit = { Deva = "Deva-translit", Telu = "Telu-translit", }, } m["kfc"] = { "Konda-Dora", 35679, "dra-kki", "Orya, Telu", translit = { Orya = "Orya-translit", Telu = "Telu-translit", }, } m["kfd"] = { "Korra Koraga", 12952655, "dra-kor", "Knda", -- Knda translit in [[Module:scripts/data]] } m["kfe"] = { "Kota (India)", 33483, "dra-tkt", "Taml", translit = "Taml-translit", } m["kff"] = { "Koya", 33471, "dra-gon", "Telu, Orya, Deva, Latn", } m["kfg"] = { "Kudiya", 12952667, "dra-tlk", } m["kfh"] = { "Kurichiya", 12952676, "dra-mal", "Mlym", -- Mlym translit in [[Module:scripts/data]] } m["kfi"] = { "Kannada Kurumba", 56589, "dra-sdo", } m["kfj"] = { "Kemiehua", 
27144776, "mkh-pal", } m["kfk"] = { "Kinnauri", 2383208, "sit-kin", "Takr, Deva, Latn", translit = { Takr = "Takr-translit", Deva = "Deva-translit", }, } m["kfl"] = { "Kung", 6444510, "nic-rnc", "Latn", } m["kfn"] = { "Kuk", 6442398, "nic-rnc", "Latn", } m["kfo"] = { "Koro (West Africa)", 11160588, "dmn-mnk", "Latn, Nkoo", } m["kfp"] = { "Korwa", 6432786, "mun", } m["kfq"] = { "Korku", 33715, "mun", "Deva", } m["kfr"] = { "กัจฉ์", 56487, "inc-snd", "Gujr, sd-Arab, Sind, Khoj", translit = { Gujr = "Gujr-translit", Sind = "Sind-translit", ["sd-Arab"] = "sd-Arab-translit", }, strip_diacritics = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {u(0x0671)}, to = {u(0x0627)} }, } m["kfs"] = { "Bilaspuri", 12953397, "him", "Deva, Takr", translit = { Deva = "Deva-translit", Takr = "Takr-translit", }, } m["kft"] = { "Kanjari", 12953610, "inc-pan", ancestors = "pa", } m["kfu"] = { "Katkari", 6377671, "inc-sou", } m["kfv"] = { "Kurmukar", 6446193, "inc-eas", } m["kfw"] = { "Kharam Naga", 12952906, "tbq-kuk", } m["kfx"] = { "Kullu Pahari", 6443148, "him", "Deva", translit = { Deva = "Deva-translit", }, } m["kfy"] = { "Kumaoni", 33529, "inc-pah", "Deva, Shrd, Takr", translit = { Deva = "Deva-translit", Takr = "Takr-translit", }, -- Shrd translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["kfz"] = { "Koromfé", 35701, "nic-gur", "Latn", } m["kga"] = { "Koyaga", 11155632, "dmn-mnk", } m["kgb"] = { "Kawe", 12952750, "poz-hce", "Latn", } m["kgd"] = { "Kataang", 12953622, "mkh", } m["kge"] = { "Komering", 49224, "poz-lgx", "Latn, Arab", } m["kgf"] = { "Kube", 11732359, "ngf-huo", "Latn", } m["kgg"] = { "Kusunda", 33630, "qfa-iso", -- central Nepal "Latn", } m["kgi"] = { "Selangor Sign Language", 33731, "sgn", } m["kgj"] = { "Gamale Kham", 22236996, "sit-kha", "Deva", translit = { Deva = "Deva-translit", }, } m["kgk"] = { "Kaiwá", 
3111883, "gn", "Latn", } m["kgl"] = { "Kunggari", 10550184, "aus-pam", } m["kgn"] = { "Karingani", 6371041, "xme-ttc", "fa-Arab, Latn", ancestors = "xme-ttc-nor", } m["kgo"] = { "Krongo", 6438927, "qfa-kad", "Latn", } m["kgp"] = { "Kaingang", 2665734, "sai-sje", "Latn", } m["kgq"] = { "Kamoro", 6359001, "ngf-ask", "Latn", } m["kgr"] = { "Abun", 56657, "qfa-iso", -- Papuan; isolate in Ethnologue, Glottolog and Palmer (2018); grouped with West Papuan by Ross (2005) "Latn", } m["kgs"] = { "Kumbainggar", 3915412, "aus-pam", } m["kgt"] = { "Somyev", 3913354, "nic-mmb", "Latn", } m["kgu"] = { "Kobol", 11732325, "ngf-omo", "Latn", } m["kgv"] = { "Karas", 6368621, "qfa-dis", -- Divergent Papuan language; grouped with Mbaham-Iha by Glottolog to form a (mainland) West Bomberai -- family, but with Mbaham-Iha and Timor-Alor-Pantar by Wikipedia (following Usher and Schapper 2022) -- into a (Greater) West Bomberai family. "Latn", } m["kgw"] = { "Karon Dori", 56817, "paa-mbr", "Latn", } m["kgx"] = { "Kamaru", 12953604, "poz", } m["kgy"] = { "Kyerung", 12952691, "sit-kyk", } m["kha"] = { "คาซี", 33584, "aav-pkl", "Latn, as-Beng", } m["khb"] = { "ไทลื้อ", 36948, "tai-swe", "Talu, Lana", translit = { Talu = "Talu-translit", Lana = "Lana-translit", }, strip_diacritics = {remove_diacritics = c.ZWNJ}, sort_key = "khb-sortkey", } m["khc"] = { "Tukang Besi North", 18611555, "poz", } m["khd"] = { "Bädi Kanum", 20888004, "paa-yam", "Latn", } m["khe"] = { "Korowai", 6432598, "ngf-gaw", "Latn", } m["khf"] = { "Khuen", 27144893, "mkh", } m["khh"] = { "Kehu", 10994953, } m["khj"] = { "Kuturmi", 3914490, "nic-plc", "Latn", } m["khl"] = { "Lusi", 3267788, "poz-ocw", "Latn", } m["khn"] = { "Khandeshi", 33726, "inc-sou", } m["kho"] = { "โคตาน", 6583551, "xsc-sak", "Brah, Khar", -- Brah translit in [[Module:scripts/data]] } m["khp"] = { "Kapauri", 3502575, "paa-tkw", } m["khq"] = { "Koyra Chiini", 33600, "son", "Latn, Arab", } m["khr"] = { "Kharia", 3915562, "mun", } m["khs"] = { "Kasua", 6374863, 
"ngf-bos", "Latn", } m["kht"] = { "คำตี้", 3915502, "tai-swe", "Mymr", display_text = s["kht-displaytext"], strip_diacritics = s["kht-stripdiacritics"], } m["khu"] = { "Nkhumbi", 11019169, "bnt-swb", } m["khv"] = { "Khvarshi", 56425, "cau-wts", "Cyrl", translit = "khv-translit", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["khw"] = { "Khowar", 938216, "inc-chi", "Arab", strip_diacritics = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, } m["khx"] = { "Kanu", 12952571, "bnt-lgb", } m["khy"] = { "Ekele", 6385549, "bnt-ske", "Latn", } m["khz"] = { "Keapara", 12952603, "poz-ocw", "Latn", } m["kia"] = { "Kim", 35685, "alv-kim", } m["kib"] = { "Koalib", 35859, "alv-hei", } m["kic"] = { "Kickapoo", 20162127, "alg-sfk", "Latn", } m["kid"] = { "Koshin", 35632, "nic-beb", "Latn", } m["kie"] = { "Kibet", 56893, } m["kif"] = { "Eastern Parbate Kham", 12953022, "sit-kha", "Deva", translit = { Deva = "Deva-translit", }, } m["kig"] = { "Kimaama", 11732321, "paa-kol", } m["kih"] = { "Kilmeri", 6408020, "paa-brd", "Latn", } m["kii"] = { "Kitsai", 56627, "cdd", "Latn", } m["kij"] = { "Kilivila", 3196601, "poz-ocw", "Latn", } m["kil"] = { "Kariya", 3438708, "cdc-wst", } m["kim"] = { "โตฟา", 36848, "trk-ssb", "Cyrl", } m["kio"] = { "Kiowa", 56631, "nai-kta", "Latn", } m["kip"] = { "Sheshi Kham", 12952622, "sit-kha", "Deva", translit = { Deva = "Deva-translit", }, } m["kiq"] = { "Kosadle", 6432994, "paa-kko", "Latn", } m["kis"] = { "Kis", 6416362, "poz-ocw", "Latn", } m["kit"] = { "Agob", 3332143, "paa-pht", "Latn", } m["kiv"] = { "Kimbu", 10997740, "bnt-tkm", } m["kiw"] = { "Northeast Kiwai", 11732324, "paa-kiw", "Latn", } m["kix"] = { "Khiamniungan Naga", 6401546, 
"sit-kch", "Latn", } m["kiy"] = { "Kirikiri", 6415159, "paa-lkp", "Latn", } m["kiz"] = { "Kisi", 3912772, "bnt-bki", } m["kja"] = { "Mlap", 6885683, "paa-nim", "Latn", } m["kjb"] = { "Q'anjob'al", 35551, "myn", "Latn", } m["kjc"] = { "Coastal Konjo", 3198689, "poz", "Latn", } m["kjd"] = { "Southern Kiwai", 11732322, "paa-kiw", "Latn", } m["kje"] = { "Kisar", 3197441, "poz", "Latn", } m["kjg"] = { "ขมุ", 33335, "mkh", "Laoo", translit = "Laoo-translit", --sort_key = "Laoo-sortkey", } m["kjh"] = { "คาคัส", 33575, "trk-ssb", "Cyrl", translit = "kjh-translit", override_translit = true, } m["kji"] = { "Zabana", 379130, "poz-ocw", "Latn", } m["kjj"] = { "Khinalug", 35278, "cau-nec", "Cyrl, Latn", translit = "kjj-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, } m["kjk"] = { "Highland Konjo", 3198688, "poz", } m["kjl"] = { "Western Parbate Kham", 22237017, "sit-kha", "Deva", translit = { Deva = "Deva-translit", }, } m["kjm"] = { "Kháng", 6403501, "mkh-pal", } m["kjn"] = { "Kunjen", 3200468, "aus-pmn", "Latn", } m["kjo"] = { "Harijan Kinnauri", 5657463, "him", "Takr, Deva", } m["kjp"] = { "กะเหรี่ยงโปตะวันออก", 5330390, "kar", "Mymr, Leke, Thai", translit = "kjp-translit", override_translit = true, } m["kjq"] = { "Western Keres", 12645568, "nai-ker", "Latn", } m["kjr"] = { "Kurudu", 12952678, "poz-hce", "Latn", } m["kjs"] = { "East Kewa", 20050949, "ngf-eng", "Latn", } m["kjt"] = { "กะเหรี่ยงโปแพร่", 7187991, "kar", "Thai", } m["kju"] = { "Kashaya", 3193689, "nai-pom", "Latn", } m["kjx"] = { "Ramopa", 56830, "paa-nbo", "Latn", } m["kjy"] = { "Erave", 12952416, "ngf-eng", "Latn", } m["kjz"] = { "Bumthangkha", 2786408, "sit-ebo", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["kka"] = { "Kakanda", 3915342, "alv-ngb", } m["kkb"] = { "Kwerisa", 56881, "paa-lkp", 
"Latn", } m["kkc"] = { "Odoodee", 12952987, "ngf-est", "Latn", } m["kkd"] = { "Kinuku", 6414422, "nic-kau", } m["kke"] = { "Kakabe", 3913966, "dmn-mok", "Latn", } m["kkf"] = { "Kalaktang Monpa", 63257089, "sit-tsk", "Tibt, Latn, Deva", translit = { Deva = "Deva-translit", }, override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["kkg"] = { "Mabaka Valley Kalinga", 18753304, "phi", } m["kkh"] = { "เขิน", 3545044, "tai-swe", "Lana, Thai", translit = { Lana = "Lana-translit", }, sort_key = "nod-sortkey", } m["kki"] = { "Kagulu", 12952537, "bnt-ruv", "Latn", } m["kkj"] = { "Kako", 35755, "bnt-kak", } m["kkk"] = { "Kokota", 3198399, "poz-ocw", "Latn", } m["kkl"] = { "Kosarek Yale", 6432995, "ngf-mek", "Latn", } m["kkm"] = { "Kiong", 6414512, "nic-ucr", "Latn", } m["kkn"] = { "Kon Keu", 6428686, "mkh-pal", } m["kko"] = { "Karko", 35529, "nub-hil", } m["kkp"] = { "Koko-Bera", 6426699, "aus-pmn", "Latn", } m["kkq"] = { "Kaiku", 6347840, "bnt-kbi", "Latn", } m["kkr"] = { "Kir-Balar", 3440527, "cdc-wst", "Latn", } m["kks"] = { "Kirfi", 56242, "cdc-wst", "Latn", } m["kkt"] = { "Koi", 6426194, "sit-kiw", } m["kku"] = { "Tumi", 3913934, "nic-kau", } m["kkv"] = { "Kangean", 2071325, "poz-msa", "Latn", } m["kkw"] = { "Teke-Kukuya", 36560, "bnt-tek", } m["kkx"] = { "Kohin", 6425997, "poz-brw", } m["kky"] = { "Guugu Yimidhirr", 56543, "aus-pam", "Latn", } m["kkz"] = { "Kaska", 20823, "ath-nor", "Latn", } m["kla"] = { "Klamath-Modoc", 2669248, "nai-plp", "Latn", } m["klb"] = { "Kiliwa", 3182593, "nai-yuc", "Latn", } m["klc"] = { "Kolbila", 6427122, "alv-lek", } m["kld"] = { "Gamilaraay", 3111818, "aus-cww", "Latn", } m["kle"] = { "Kulung", 6443304, "sit-kic", } m["klf"] = { "Kendeje", 56895, } m["klg"] = { "กาลากันแบบตากาเกาลู", 18756514, "phi", "Latn", } m["klh"] = { "Weliki", 7981017, "ngf-fin", "Latn", } m["kli"] = { "Kalumpang", 13561407, "poz", } m["klj"] = { "Khalaj", 33455, "trk", "fa-Arab, Latn", ancestors = 
"klj-arg", strip_diacritics = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun, } } m["klk"] = { "Kono (Nigeria)", 6429589, "nic-kau", "Latn", } m["kll"] = { "กาลากันแบบกากัน", 18748913, "phi", } m["klm"] = { "Kolom", 6844970, "ngf-rai", "Latn", } m["kln"] = { "Kalenjin", 637228, "sdv-nma", "Latn", } m["klo"] = { "Kapya", 6367410, "nic-ykb", } m["klp"] = { "Kamasa", 6356107, "ngf-ang", "Latn", } m["klq"] = { "Rumu", 7379420, "paa-tuk", "Latn", } m["klr"] = { "Khaling", 56381, "sit-kiw", "Deva", } m["kls"] = { "Kalasha", 33416, "inc-chi", "Latn, ks-Arab", } m["klt"] = { "Nukna", 7068874, "ngf-fin", "Latn", } m["klu"] = { "Klao", 3914866, "kro-wkr", } m["klv"] = { "Maskelynes", 3297282, "poz-vnc", "Latn", } m["klw"] = { "ลินดู", 18390055, "poz-kal", "Latn", } m["klx"] = { "Koluwawa", 6427954, "poz-ocw", "Latn", } m["kly"] = { "Kalao", 6350643, "poz", } m["klz"] = { "Kabola", 11732258, "paa-tap", "Latn", } m["kma"] = { "Konni", 35680, "nic-buk", } m["kmb"] = { "Kimbundu", 35891, "bnt-kmb", "Latn", } m["kmc"] = { "ต้งใต้", 35379, "qfa-kms", "Latn", } m["kmd"] = { "Madukayang Kalinga", 18753305, "phi", } m["kme"] = { "Bakole", 35068, "bnt-kpw", "Latn", } m["kmf"] = { "Kare (New Guinea)", 11732286, "ngf-mab", "Latn", } m["kmg"] = { "Kâte", 3201059, "ngf-huo", "Latn", } m["kmh"] = { "Kalam", 12952550, "ngf-kak", "Latn", } m["kmi"] = { "Kami", 3915372, "alv-ngb", "Latn", } m["kmj"] = { "Kumarbhag Paharia", 3130374, "dra-mlo", "Beng, Deva", translit = { Beng = "Beng-translit", Deva = "Deva-translit", }, } m["kmk"] = { "Limos Kalinga", 18753303, "phi", "Latn", } m["kml"] = { "Tanudan Kalinga", 18753307, "phi", "Latn", } m["kmm"] = { "Kom (India)", 12952647, "tbq-kuk", } m["kmn"] = { "Awtuw", 3504217, "paa-spk", "Latn", } m["kmo"] = { "Kwoma", 11732376, "paa-spk", "Latn", } m["kmp"] = { "Gimme", 11152236, "alv-dur", } m["kmq"] = { "Kwama", 2591184, "ssa-kom", } m["kmr"] = { "เคิร์ดเหนือ", 36163, 
"ku", "Latn, Cyrl, Armn, ku-Arab, Yezi", translit = { Cyrl = "kmr-translit", -- Armn translit in [[Module:scripts/data]] ["ku-Arab"] = "ckb-translit", }, strip_diacritics = { Latn = { remove_diacritics = "'’", from = {"r̄", "R̄", "ẍ", "Ẍ"}, to = {"rr", "Rr", "x", "X"} }, }, wikimedia_codes = "ku", } m["kms"] = { "Kamasau", 6356117, "paa-tor", "Latn", } m["kmt"] = { "Kemtuik", 6387179, "paa-nim", "Latn", } m["kmu"] = { "Kanite", 12952567, "ngf-kag", "Latn", } m["kmv"] = { "Karipúna Creole French", 2523999, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["kmw"] = { "Kumu", 6428450, "bnt-kbi", "Latn", } m["kmx"] = { "Waboda", 7958705, "paa-kiw", "Latn", } m["kmy"] = { "Koma", 35634, "alv-dur", } m["kmz"] = { "Khorasani Turkish", 35373, "trk-ogz", "Arab", ancestors = "trk-oat", } m["kna"] = { "Kanakuru", 56811, "cdc-wst", "Latn", } m["knb"] = { "Lubuagan Kalinga", 12953602, "phi", "Latn", } m["knd"] = { "Konda", 11732340, "ngf-sbh", "Latn", } m["kne"] = { "กันกานาอือ", 18753329, "phi", "Latn", strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer, } }, sort_key = { Latn = "tl-sortkey", }, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. 
c.punc, }, } m["knf"] = { "Mankanya", 35789, "alv-pap", "Latn", } m["kni"] = { "Kanufi", 3913297, "nic-nin", "Latn", } m["knj"] = { "Akatek", 34923, "myn", "Latn", } m["knk"] = { "Kuranko", 3198896, "dmn-mok", "Latn", } m["knl"] = { "Keninjal", 6389309, "poz-mly", "Latn", } m["knm"] = { -- two unrelated lects have this name; this is the Katukinian one "Kanamari", 3438373, "sai-ktk", "Latn", } m["kno"] = { "Kono (Sierra Leone)", 35675, "dmn-vak", "Latn", } m["knp"] = { "Kwanja", 35641, "nic-mmb", "Latn", } m["knq"] = { "Kintaq", 6414335, "mkh-asl", } m["knr"] = { "Kaningra", 6363253, "paa-spk", "Latn", } m["kns"] = { "Kensiu", 6391529, "mkh-asl", } m["knt"] = { "Katukina", 3194265, "sai-pan", "Latn", } m["knu"] = { -- a dialect of 'kpe' "Kono (Guinea)", 3198703, "dmn-msw", "Latn, Kpel", ancestors = "kpe", } m["knv"] = { "Tabo", 7959888, "aav", } m["knx"] = { "Kendayan", 6388963, "poz-mly", "Latn", } m["kny"] = { "Kanyok", 11110766, "bnt-lub", "Latn", } m["knz"] = { "Kalamsé", 3914000, "nic-gnn", } m["koa"] = { "Konomala", 3198732, "poz-ocw", "Latn", } m["koc"] = { "Kpati", 3913279, "nic-nge", "Latn", } m["kod"] = { "Kodi", 4577633, "poz-cet", "Latn", } m["koe"] = { "Kacipo-Balesi", 5364424, "sdv", } m["kof"] = { "Kubi", 3438718, "cdc-wst", "Latn", } m["kog"] = { "Cogui", 3198286, "cba", "Latn", } m["koh"] = { "Koyo", 35649, "bnt-mbo", "Latn", } m["koi"] = { "Komi-Permyak", 56318, "kv", "Cyrl", translit = "kv-translit", strip_diacritics = {remove_diacritics = c.acute}, override_translit = true, } m["kok"] = { "กงกัณ", 34239, "inc-sou", "Deva, Knda, Mlym, fa-Arab, Latn", translit = { Deva = "Deva-translit", }, -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] strip_diacritics = { -- FIXME: Separate out the scripts from = {"च़", "ज़", "झ़", "ಚ಼", "ಜ಼", "ಝ಼"}, to = {"च", "ज", "झ", "ಚ", "ಜ", "ಝ"} } , } m["kol"] = { "Kol (New Guinea)", 4227542, } m["koo"] = { "Konzo", 2361829, "bnt-glb", } m["kop"] = { "Waube", 11732373, "ngf-nur", 
"Latn", } m["koq"] = { "Kota (Gabon)", 35607, "bnt-kel", "Latn", } m["kos"] = { "Kosraean", 33464, "poz-mic", "Latn", } m["kot"] = { "Lagwan", 3502264, "cdc-cbm", "Latn", } m["kou"] = { "Koke", 797249, "alv-bua", } m["kov"] = { "Kudu-Camo", 3915850, "nic-jer", } m["kow"] = { "Kugama", 3913307, "alv-mye", } m["koy"] = { "Koyukon", 28304, "ath-nor", "Latn", } m["koz"] = { "Korak", 6431365, "ngf-kow", "Latn", } m["kpa"] = { "Kutto", 3437656, "cdc-wst", } m["kpb"] = { "Mullu Kurumba", 19573111, "dra-mal", } m["kpc"] = { "Curripaco", 2882543, "awd-nwk", "Latn", } m["kpd"] = { "Koba", 6424249, "poz", } m["kpe"] = { "Kpelle", 35673, "dmn-msw", "Latn, Kpel", } m["kpf"] = { "Komba", 6428239, "ngf-huo", "Latn", } m["kpg"] = { "Kapingamarangi", 35771, "poz-pnp", "Latn", } m["kph"] = { "Kplang", 35628, "alv-gng", } m["kpi"] = { "Kofei", 6425665, "paa-egb", "Latn", } m["kpj"] = { "Karajá", 10322066, "sai-mje", "Latn", } m["kpk"] = { "Kpan", 3915380, "nic-jkn", "Latn", } m["kpl"] = { "Kpala", 11154769, "nic-nkk", "Latn", } m["kpm"] = { "เกอฮอ", 3511919, "mkh-ban", "Latn", } m["kpn"] = { "Kepkiriwát", 3195366, "tup", "Latn", } m["kpo"] = { "Ikposo", 35029, "alv-ktg", "Latn", } m["kpq"] = { "Korupun-Sela", 6432769, "ngf-mek", } m["kpr"] = { "Korafe-Yegha", 11732347, "paa-bin", "Latn", } m["kps"] = { "Tehit", 7694851, "paa-wbh", "Latn", } m["kpt"] = { "Karata", 56636, "cau-and", "Cyrl", translit = "kpt-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["kpu"] = { "Kafoa", 6346151, "paa-tap", "Latn", } m["kpv"] = { "Komi-Zyrian", 34114, "kv", "Cyrl", translit = "kv-translit", override_translit = true, wikimedia_codes = "kv", } m["kpw"] = { "Kobon", 11732326, "ngf-kak", "Latn", } m["kpx"] = { "Mountain Koiari", 6925030, "ngf-koi", "Latn", } m["kpy"] = { "Koryak", 36199, "qfa-ckn", "Cyrl", strip_diacritics = { from = {"['’]"}, to = {"ʼ"} }, sort_key = { from = {"вʼ", "гʼ", "ё", "ӄ", 
"ӈ"}, to = {"в" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1]} }, translit = "kpy-translit", } m["kpz"] = { "Kupsabiny", 56445, "sdv-kln", } m["kqa"] = { "Mum", 6935252, "ngf-nso", "Latn", } m["kqb"] = { "Kovai", 6434822, "ngf-huo", "Latn", } m["kqc"] = { "Doromu-Koki", 5298175, "paa-man", "Latn", } m["kqd"] = { "Koy Sanjaq Surat", 33463, "sem-nna", } m["kqe"] = { "กาลากัน", 18748906, "phi", "Latn", } m["kqf"] = { "Kakabai", 6349119, "poz-ocw", "Latn", } m["kqg"] = { "Khe", 3914015, "nic-gur", } m["kqh"] = { "Kisankasa", 6416409, "sdv", } m["kqi"] = { "Koitabu", 6426363, "ngf-koi", "Latn", } m["kqj"] = { "Koromira", 6432520, "paa-sbo", "Latn", } m["kqk"] = { "Kotafon Gbe", 12952447, "alv-pph", } m["kql"] = { "Kyenele", 11732453, "paa-yua", "Latn", } m["kqm"] = { "Khisa", 3913955, "nic-gur", } m["kqn"] = { "Kaonde", 33601, "bnt-lub", "Latn", } m["kqo"] = { "Eastern Krahn", 3915374, "kro-wee", } m["kqp"] = { "Kimré", 3441210, "cdc-est", } m["kqq"] = { "Krenak", 6436747, "sai-cer", } m["kqr"] = { "Kimaragang", 3196845, "poz-san", "Latn", } m["kqs"] = { "Northern Kissi", 19921576, "alv-kis", } m["kqt"] = { "Klias River Kadazan", 12953594, "poz-san", } m["kqu"] = { "Seroa", 33127766, "khi-tuu", } m["kqv"] = { "Okolod", 7082487, "poz-san", } m["kqw"] = { "Kandas", 3192590, "poz-ocw", "Latn", } m["kqx"] = { "Mser", 3502347, "cdc-cbm", } m["kqy"] = { "Koorete", 6430753, "omv-eom", "Ethi, Latn", } m["kqz"] = { "Korana", 2756709, "khi-khk", "Latn", } m["kra"] = { "Kumhali", 13580783, "inc-eas", ancestors = "bh", } m["krb"] = { "Karkin", 3193345, "nai-utn", "Latn", } m["krc"] = { "Karachay-Balkar", 33714, "trk-kcu", "Cyrl", translit = "krc-translit", sort_key = { from = {"гъ", "дж", "ё", "къ", "нг"}, to = {"г" .. p[1], "д" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. 
p[1]} }, } m["krd"] = { "Kairui-Midiki", 12953277, "poz-tim", } m["kre"] = { "Panará", 3361895, "sai-cer", "Latn", } m["krf"] = { "Koro (Vanuatu)", 3198995, "poz-vnn", "Latn", } m["krh"] = { "Kurama", 35593, "nic-kau", } m["kri"] = { "Krio", 35744, "crp", "Latn", ancestors = "en", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ}, sort_key = { from = {"ɛ", "gb", "kp", "ɔ"}, to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1]} }, } m["krj"] = { "Kinaray-a", 33720, "phi", "Latn", } m["krk"] = { "Kerek", 332792, "qfa-ckn", "Cyrl", } m["krl"] = { "คาเรเลีย", 33557, "urj-fin", "Latn", sort_key = { from = { "č", "š", "ž", "ü", "ä", "ö", -- 2 chars "z", "'" -- 1 char }, to = { "c" .. p[1], "s" .. p[1], "s" .. p[3], "y" .. p[1], "y" .. p[2], "y" .. p[3], "s" .. p[2], "y" .. p[4], } }, } m["krm"] = { "Krim", 35713, "alv", } m["krn"] = { "Sapo", 3915386, "kro-wee", } m["krp"] = { "Korop", 35626, "nic-ucr", "Latn", } m["krr"] = { "Kru'ng", 12953650, "mkh-ban", } m["krs"] = { "Kresh", 56674, "csu-bkr", } m["kru"] = { "กุรุข", 33492, "dra-kml", "Deva, Tols", translit = { Deva = "Deva-translit", }, } m["krv"] = { "Kavet", 12953649, "sai-ktk", "Latn", } m["krw"] = { "Western Krahn", 10975611, "kro-wee", } m["krx"] = { "Karon", 35704, "alv-jol", } m["kry"] = { "Kryts", 35861, "cau-ssm", "Latn, Cyrl", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = { Latn = s["cau-Latn-stripdiacritics"], Cyrl = s["cau-Cyrl-stripdiacritics"], }, } m["krz"] = { "Sota Kanum", 12952568, "paa-yam", "Latn", } m["ksa"] = { "Shuwa-Zamani", 3913929, "nic-kau", } m["ksb"] = { "Shambala", 3788739, "bnt-seu", "Latn", } m["ksc"] = { "Southern Kalinga", 18753301, "phi", } m["ksd"] = { "Tolai", 35870, "poz-ocw", "Latn", } m["kse"] = { "Kuni", 6444619, "poz-ocw", "Latn", } m["ksf"] = { "Bafia", 34930, "bnt-baf", "Latn", } m["ksg"] = { "Kusaghe", 3200638, "poz-ocw", "Latn", } m["ksi"] = { "Krisa", 841704, "paa-msk", "Latn", } m["ksj"] = { "Uare", 6450052, "paa-kwa", 
"Latn", } m["ksk"] = { "Kansa", 3192772, "sio-dhe", "Latn", } m["ksl"] = { "Kumalu", 17584381, "poz-ocw", "Latn", } m["ksm"] = { "Kumba", 3913972, "alv-mye", } m["ksn"] = { "Kasiguranin", 6374525, "phi", } m["kso"] = { "Kofa", 56278, "cdc-cbm", } m["ksp"] = { "Kaba", 3915316, "csu-sar", } m["ksq"] = { "Kwaami", 3440525, "cdc-wst", } m["ksr"] = { "Borong", 4946263, "ngf-huo", "Latn", } m["kss"] = { "Southern Kissi", 11028974, "alv-kis", } m["kst"] = { "Winyé", 3913360, "nic-gnw", } m["ksu"] = { "Khamyang", 6583541, "tai-swe", } m["ksv"] = { "Kusu", 6448199, "bnt-tet", } m["ksw"] = { "กะเหรี่ยงสะกอ", 56410, "kar", "Mymr", translit = "ksw-translit", } m["ksx"] = { "Kedang", 6382520, "poz", "Latn", } m["ksy"] = { "Kharia Thar", 6400661, "inc-eas", } m["ksz"] = { "Kodaku", 21179986, "mun", } m["kta"] = { "Katua", 6378404, "mkh-ban", } m["ktb"] = { "Kambaata", 35664, "cus-hec", "Latn", } m["ktc"] = { "Kholok", 3440464, "cdc-wst", } m["ktd"] = { "Kokata", 10547021, "aus-pam", } m["ktf"] = { "Kwami", 12952687, "bnt-lgb", } m["ktg"] = { "Kalkatungu", 3914057, "aus-pam", "Latn", } m["kth"] = { "Karanga", 713643, } m["kti"] = { "North Muyu", 20857698, "ngf-okk", "Latn", } m["ktj"] = { "Plapo Krumen", 10975356, "kro-grb", } m["ktk"] = { "Kaniet", 3399050, "poz-aay", "Latn", } m["ktl"] = { "Koroshi", 3775265, "ira-nwi", ancestors = "bal", } m["ktm"] = { "Kurti", 3200615, "poz-aay", "Latn", } m["ktn"] = { "Karitiâna", 3112184, "tup", "Latn", } m["kto"] = { "Kuot", 56537, } m["ktp"] = { "Kaduo", 769809, "tbq-bka", } m["ktq"] = { "Katabaga", 3193895, } m["ktr"] = { "Kota Marudu Tinagas", 18642280, } m["kts"] = { "South Muyu", 42308820, "ngf-okk", "Latn", } m["ktt"] = { "Ketum", 12952616, "ngf-gaw", "Latn", } m["ktu"] = { "Kituba", 35746, "crp", "Latn", ancestors = "kg", } m["ktv"] = { "กะตูตะวันออก", 22808951, "mkh-kat", "Latn", } m["ktw"] = { "Kato", 20831, "ath-pco", "Latn", } m["ktx"] = { "Kaxararí", 6380124, "sai-pan", "Latn", } m["kty"] = { "Kango", 6362818, "bnt-bta", 
"Latn", } m["ktz"] = { "Juǀ'hoan", 1192295, "khi-kxa", "Latn", } m["kub"] = { "Kutep", 35645, "nic-jkn", } m["kuc"] = { "Kwinsu", 6450460, "paa-tkw", } m["kud"] = { "Auhelawa", 5166, "poz-ocw", "Latn", } m["kue"] = { "Kuman", 137525, "ngf-chw", "Latn", } m["kuf"] = { "กะตูตะวันตก", 6378400, "mkh-kat", "Laoo, Tale, Latn", } m["kug"] = { "Kupa", 3915336, "alv-ngb", } m["kuh"] = { "Kushi", 3438747, "cdc-wst", } m["kui"] = { "Kuikúro", 3915522, "sai-kui", "Latn", } m["kuj"] = { "Kuria", 6445968, "bnt-lok", "Latn", } m["kuk"] = { "Kepo'", 6393217, "poz", } m["kul"] = { "Kulere", 3440506, "cdc-wst", } m["kum"] = { "คูมุก", 36209, "trk-kcu", "Cyrl", translit = "kum-translit", sort_key = { from = {"гъ", "гь", "ё", "къ", "нг", "оь", "уь"}, to = {"г" .. p[1], "г" .. p[2], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]} }, } m["kun"] = { "Kunama", 36041, } m["kuo"] = { "Kumukio", 11732362, "ngf-huo", "Latn", } m["kup"] = { "Kunimaipa", 6444696, "paa-kun", "Latn", } m["kuq"] = { "Karipuna", 6371071, "tup-gua", "Latn", } m["kus"] = { "Kusaal", 35708, "nic-dag", "Latn", } m["kut"] = { "Ktunaxa", 33434, "qfa-iso", "Latn", } m["kuu"] = { "Upper Kuskokwim", 28062, "ath-nor", "Latn", } m["kuv"] = { "Kur", 12635082, "poz-cma", "Latn", } m["kuw"] = { "Kpagua", 11137573, "bad-cnt", } m["kux"] = { "Kukatja", 10549839, "aus-pam", } m["kuy"] = { "Kuuku-Ya'u", 10550697, "aus-pmn", } m["kuz"] = { "Kunza", 2669181, "qfa-iso", "Latn", } m["kva"] = { "Bagvalal", 56638, "cau-and", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["kvb"] = { "Kubu", 6441341, "poz-mly", } m["kvc"] = { "Kove", 3199402, "poz-ocw", "Latn", } m["kvd"] = { "Kui (Indonesia)", 6442230, "paa-tap", "Latn", } m["kve"] = { "Kalabakan", 6350003, "poz-san", "Latn", } m["kvf"] = { "Kabalai", 3440427, "cdc-est", } m["kvg"] = { "Kuni-Boazi", 2907551, "paa-ani", "Latn", } m["kvh"] = { 
"Komodo", 3198565, "poz-cet", "Latn", } m["kvi"] = { "Kwang", 3440398, "cdc-est", "Latn", } m["kvj"] = { "Psikye", 56304, "cdc-cbm", } m["kvk"] = { "Korean Sign Language", 3073428, "sgn-jsl", } m["kvl"] = { "Brek Karen", 12952577, "kar", } m["kvm"] = { "Kendem", 35751, "nic-mam", "Latn", } m["kvn"] = { "Border Kuna", 31777873, "cba", } m["kvo"] = { "Dobel", 5286559, "poz", "Latn", } m["kvp"] = { "Kompane", 18343041, "poz", } m["kvq"] = { "Geba Karen", 12952581, "kar", "Latn, Mymr", } m["kvr"] = { "Kerinci", 3195442, "poz-mly", "Latn, Arab", -- Also Incung, which we don't have } m["kvt"] = { "Lahta Karen", 12952582, "kar", } m["kvu"] = { "Yinbaw Karen", 14426328, "kar", } m["kvv"] = { "Kola", 6426967, "poz", "Latn", } m["kvw"] = { "Wersing", 7983599, "paa-tap", "Latn", } m["kvx"] = { "Parkari Koli", 3244176, "inc-wes", } m["kvy"] = { "Yintale Karen", 14426329, "kar", } m["kvz"] = { "Tsakwambo", 7849438, "ngf-gaw", "Latn", } m["kwa"] = { "Dâw", 3042278, "sai-nad", "Latn", } m["kwb"] = { "Baa", 34842, "alv-ada", } m["kwc"] = { "Likwala", 35597, "bnt-mbo", } m["kwd"] = { "Kwaio", 3200796, "poz-sls", "Latn", } m["kwe"] = { "Kwerba", 6450328, "paa-tkw", } m["kwf"] = { "Kwara'ae", 3200829, "poz-sls", "Latn", } m["kwg"] = { "Sara Kaba Deme", 3915384, "csu-kab", } m["kwh"] = { "Kowiai", 6435028, "poz", "Latn", } m["kwi"] = { "Awa-Cuaiquer", 2603103, "sai-bar", "Latn", } m["kwj"] = { "Kwanga", 3438383, "paa-spk", "Latn", } m["kwk"] = { "Kwak'wala", 2640628, "wak", "Latn", } m["kwl"] = { "Kofyar", 3441382, "cdc-wst", "Latn", } m["kwm"] = { "Kwambi", 3487165, "bnt-ova", } m["kwn"] = { "Kwangali", 36334, "bnt-kav", "Latn", } m["kwo"] = { "Kwomtari", 3508116, "paa-kwm", "Latn", } m["kwp"] = { "Kodia", 3914867, "kro-ekr", } m["kwq"] = { "Kwak", 11014183, "nic-nka", ancestors = "yam", } m["kwr"] = { "Kwer", 12635137, "ngf-okk", "Latn", } m["kws"] = { "Kwese", 3200846, "bnt-pen", } m["kwt"] = { "Kwesten", 6450354, "paa-tkw", } m["kwu"] = { "Kwakum", 35624, "bnt-kak", } m["kwv"] = { 
"Sara Kaba Náà", 3915361, "csu-kab", "Latn", } m["kww"] = { "Kwinti", 721182, "crp", "Latn", ancestors = "en" } m["kwx"] = { "Khirwar", 12976968, "dra", } m["kwz"] = { "Kwadi", 2364661, "khi-kkw", "Latn", } m["kxa"] = { "Kairiru", 3398785, "poz-ocw", "Latn", } m["kxb"] = { "Krobu", 35586, "alv-ptn", "Latn", } m["kxc"] = { "Konso", 56624, "cus-eas", "Ethi, Latn", } m["kxd"] = { "มลายูแบบบรูไน", 3182878, "poz-mly", "Latn, ms-Arab", } m["kxe"] = { "Kakihum", 3914433, "nic-kam", ancestors = "tvd", } m["kxf"] = { "Manumanaw Karen", 12952592, "kar", "Mymr, Latn", } m["kxh"] = { "Karo", 3447116, "omv-aro", } m["kxi"] = { "Keningau Murut", 6389308, "poz-san", "Latn", } m["kxj"] = { "Kulfa", 713654, "csu-kab", } m["kxk"] = { "Zayein Karen", 14352960, "kar", } m["kxl"] = { "Nepali Kurux", 3200624, "dra-kml", "Deva", ancestors = "kru", translit = { Deva = "Deva-translit", }, } m["kxm"] = { "เขมรเหนือ", 3502234, "mkh-kmr", "Thai, Khmr", ancestors = "xhm", sort_key = { from = {"[%pๆ]", "[็-๎]", "([เแโใไ])([ก-ฮ])"}, to = {"", "", "%2%1"} }, } m["kxn"] = { "Kanowit", 6364300, "poz-bnn", "Latn", } m["kxo"] = { "Kanoé", 4356223, "qfa-iso", "Latn", } m["kxp"] = { "Wadiyara Koli", 12953645, "inc-wes", } m["kxq"] = { "Smärky Kanum", 12952569, "paa-yam", "Latn", } m["kxr"] = { "Manus Koro", 3198994, "poz-aay", "Latn", } m["kxs"] = { "Kangjia", 3182570, "xgn-shr", "Latn", } m["kxt"] = { "Koiwat", 6426388, "paa-ndu", "Latn", } m["kxu"] = { "Kui (India)", 33919, "dra-kki", "Orya", translit = "Orya-translit", strip_diacritics = { remove_diacritics = "୕", from = {"ଆଆ", "ଇଇ", "ଉଉ", "ଏଏ", "ଓଓ", "ିଇ", "ୁଉ", "େଏ", "ୋଓ"}, to = {"ଆ", "ଈ", "ଊ", "ଏ", "ଓ", "ୀ", "ୂ", "େ", "ୋ"}, }, } m["kxv"] = { "Kuvi", 3200721, "dra-kki", "Orya", translit = "Orya-translit", strip_diacritics = { remove_diacritics = "୕", from = {"ଆଆ", "ଇଇ", "ଉଉ", "ଏଏ", "ଓଓ", "([କ-ହ])ଆ", "ିଇ", "ୁଉ", "େଏ", "ୋଓ"}, to = {"ଆ", "ଈ", "ଊ", "ଏ", "ଓ", "%1ା", "ୀ", "ୂ", "େ", "ୋ"}, }, } m["kxw"] = { "Konai", 11732339, "ngf-est", "Latn", } m["kxx"] 
= { "Likuba", 35646, "bnt-bmo", } m["kxy"] = { "Kayong", 6380673, "mkh", } m["kxz"] = { "Kerewo", 6393847, "paa-kiw", "Latn", } m["kya"] = { "Kwaya", 6450276, "bnt-haj", "Latn", } m["kyb"] = { "Butbut Kalinga", 18753300, "phi", "Latn", } m["kyc"] = { "Kyaka", 12952690, "ngf-eng", "Latn", } m["kyd"] = { "Karey", 6370196, "poz", } m["kye"] = { "Krache", 35658, "alv-gng", } m["kyf"] = { "Kouya", 35595, "kro-bet", } m["kyg"] = { "Keyagana", 6398208, "ngf-kag", "Latn", } m["kyh"] = { "Karok", 1288440, "qfa-iso", -- or Hokan? "Latn", } m["kyi"] = { "Kiput", 3038653, "poz-swa", "Latn", } m["kyj"] = { "กาเรา", 3192950, "phi", "Latn", } m["kyk"] = { "Kamayo", 3192339, "phi", "Latn", } m["kyl"] = { "Kalapuya", 3192120, "nai-klp", } m["kym"] = { "Kpatili", 3913982, "znd", } m["kyn"] = { "Karolanos", 6373093, "phi", } m["kyo"] = { "Kelon", 6386414, "paa-tap", "Latn", } m["kyp"] = { "Kang", 25559558, "tai", } m["kyq"] = { "Kenga", 35707, "csu-bgr", } m["kyr"] = { "Kuruáya", 3200633, "tup", "Latn", } m["kys"] = { "Baram Kayan", 2883794, "poz", "Latn", } m["kyt"] = { "Kayagar", 6380394, "paa-kay", "Latn", } m["kyu"] = { "กะยาตะวันตก", 12952596, "kar", "Kali, Mymr, Latn", translit = {Kali = "Kali-translit"}, } m["kyv"] = { "Kayort", 6380675, "inc-krd", "Deva", translit = { Deva = "Deva-translit", }, } m["kyw"] = { "Kudmali", 6446173, "inc-bih", "Deva, as-Beng, Orya, Chis", translit = { Deva = "Deva-translit", ["as-Beng"] = "Beng-translit", Orya = "Orya-translit", }, } m["kyx"] = { "Rapoisi", 7294279, "paa-nbo", "Latn", } m["kyy"] = { "Kambaira", 6356254, "ngf-kag", "Latn", } m["kyz"] = { "Kayabí", 6380372, "tup-gua", "Latn", } m["kza"] = { "Western Karaboro", 36601, "alv-krb", } m["kzb"] = { "Kaibobo", 6347565, "poz-cma", } m["kzc"] = { "Bondoukou Kulango", 11031321, "alv-kul", "Latn", } m["kzd"] = { "Kadai", 7679471, "poz-cma", "Latn", } --kze (Kosena) made an etym-only child of auy (Auyana) per [[Wiktionary:Language_treatment_requests#merge_Kosena_[kze]_into_Auyana_[auy]]] 
m["kzf"] = {
	"Da'a Kaili",
	33103997,
	"poz-kal",
	"Latn",
}

m["kzg"] = {
	"Kikai",
	3196527,
	"jpx-nry",
	"Jpan",
	translit = s["jpx-translit"],
	display_text = s["jpx-displaytext"],
	strip_diacritics = s["jpx-stripdiacritics"],
	sort_key = s["jpx-sortkey"],
}

m["kzh"] = {
	"Dongolawi",
	5295991,
	"nub",
	"Latn",
}

m["kzi"] = {
	"Kelabit",
	6385445,
	"poz-swa",
	"Latn",
}

m["kzj"] = {
	"กาดาซันชายฝั่ง",
	3307195,
	"poz-san",
	"Latn",
}

m["kzk"] = {
	"Kazukuru",
	1089069,
	"poz-ocw",
}

m["kzl"] = {
	"Kayeli",
	4207444,
	"poz-cma",
	"Latn",
}

m["kzm"] = {
	"Kais",
	6348319,
	"ngf-sbh",
	"Latn",
}

m["kzn"] = {
	"Kokola",
	11128329,
	"bnt-mak",
	"Latn",
	ancestors = "vmw",
}

m["kzo"] = {
	"Kaningi",
	35683,
	"bnt-mbt",
}

m["kzp"] = {
	"Kaidipang",
	6347611,
	"phi",
	"Latn",
}

m["kzq"] = {
	"Kaike",
	10951226,
	"sit-tam",
}

m["kzr"] = {
	"Karang",
	35681,
	"alv-mbm",
	"Latn",
}

m["kzs"] = {
	"Sugut Dusun",
	12953510,
	"poz-san",
	"Latn",
}

m["kzt"] = {
	"Tambunan Dusun",
	12953514,
	"poz-san",
	"Latn",
}

m["kzu"] = {
	"Kayupulau",
	6380723,
	"poz-ocw",
}

m["kzv"] = {
	"Komyandaret",
	6428671,
	"ngf-gaw",
	"Latn",
}

m["kzw"] = { -- contrast xoo, sai-kat, sai-xoc, the last of which the ISO conflated into this code
	"Kariri",
	12953620,
	"sai-mje",
	"Latn",
}

m["kzx"] = {
	"Kamarian",
	6356040,
	"poz-cma",
	"Latn",
}

m["kzy"] = {
	"Kango-Sua",
	11008360,
	"bnt-kbi",
	"Latn",
	ancestors = "bip",
}

m["kzz"] = {
	"Kalabra",
	6350038,
	"paa-wbh",
	"Latn",
}

return require("Module:languages").finalizeData(m, "language")
nt7c1900gwwpyvpyc4ho1fh5pyoe3ut
มอดูล:languages/data/3/i 828 36378 5720759 5684157 2026-04-21T07:01:01Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720759 Scribunto text/plain
local m_langdata = require("Module:languages/data")

-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
	u = require("Module:string utilities").char
	return u(...)
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["iai"] = { "Iaai", 282888, "poz-cln", "Latn", } m["ian"] = { "Iatmul", 5983460, "paa-ndu", "Latn", } m["iar"] = { "Purari", 3499934, "qfa-dis", -- Papuan; isolate in Glottolog; unclassified by Pawley and Hammarström; proposed to be in a putative Binanderean-Goilalan family by Usher } m["iba"] = { "อีบัน", 33424, "poz-mly", "Latn", } m["ibb"] = { "Ibibio", 33792, "nic-ief", "Latn", } m["ibd"] = { "Iwaidja", 1977429, "aus-wdj", "Latn", } m["ibe"] = { "Akpes", 35457, "alv-von", "Latn", } m["ibg"] = { "Ibanag", 1775596, "phi", "Latn", } m["ibh"] = { "Bih", 51955140, "cmc", "Latn", } m["ibl"] = { "Ibaloi", 3147383, "phi", "Latn", } m["ibm"] = { "Agoi", 34727, "nic-ucr", "Latn", } m["ibn"] = { "Ibino", 3813281, "nic-lcr", "Latn", } m["ibr"] = { "Ibuoro", 3813306, "nic-ief", "Latn", } m["ibu"] = { "Ibu", 11732235, "paa-nha", "Latn", } m["iby"] = { "Ibani", 11280479, "ijo", "Latn", } m["ica"] = { "Ede Ica", 12952405, "alv-ede", "Latn", } m["ich"] = { "Etkywan", 3914462, "nic-jkn", "Latn", } m["icl"] = { "Icelandic Sign Language", 3436654, "sgn", "Latn", -- when documented } m["icr"] = { "Islander Creole English", 2044587, "crp", "Latn", ancestors = "en", } m["ida"] = { "Idakho-Isukha-Tiriki", 12952512, "bnt-lok", "Latn", } m["idb"] = { "Indo-Portuguese", 6025550, "crp", "Latn", ancestors = "pt", } m["idc"] = { "Idon", 3913366, "nic-plc", } m["idd"] = { "Ede Idaca", 13123376, "alv-ede", "Latn", } m["ide"] = { "Idere", 3813288, "nic-ief", } m["idi"] = { "Idi", 5988630, "paa-pht", "Latn", } m["idr"] = { "Indri", 35662, "nic-ser", } m["ids"] = { "Idesa", 3913979, "alv-swd", "Latn", ancestors = "oke", } m["idt"] = { "Idaté", 12952511, "poz-tim", "Latn", } m["idu"] = { "Idoma", 35478, "alv-ido", "Latn", } m["ifa"] = { "Amganad Ifugao", 18748222, "phi", "Latn", } m["ifb"] = { "Batad Ifugao", 12953578, "phi", "Latn", } m["ife"] = { "Ifè", 33606, "alv-ede", "Latn", 
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron}, sort_key = { remove_diacritics = c.tilde, from = {"ɖ", "dz", "ɛ", "gb", "kp", "ny", "ŋ", "ɔ", "ts"}, to = {"d" .. p[1], "d" .. p[2], "e" .. p[1], "g" .. p[1], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "t" .. p[1]} }, } m["iff"] = { "Ifo", 7902545, "poz-vns", "Latn", } m["ifk"] = { "Tuwali Ifugao", 7857158, "phi", "Latn", } m["ifm"] = { "Teke-Fuumu", 36603, "bnt-tek", } m["ifu"] = { "Mayoyao Ifugao", 12953579, "phi", "Latn", } m["ify"] = { "Keley-I Kallahan", 3192221, "phi", "Latn", } m["igb"] = { "Ebira", 35363, "alv-nup", "Latn", } m["ige"] = { "Igede", 35420, "alv-ido", "Latn", } m["igg"] = { "Igana", 5991454, "paa-ram", "Latn", } m["igl"] = { "Igala", 35513, "alv-yrd", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.dotabove .. c.caron .. c.lineabove}, sort_key = { from = { "ñm", "ñw", -- 3 chars "ch", "ẹ", "gb", "gw", "kp", "kw", "ny", "ñ", "ọ" -- 2 chars }, to = { "n" .. p[3], "n" .. p[4], "c" .. p[1], "e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "k" .. p[2], "n" .. p[1], "n" .. p[2], "o" .. 
p[1] } }, } m["igm"] = { "Kanggape", 6362743, "paa-ram", "Latn", } m["ign"] = { "Ignaciano", 3148190, "awd", "Latn", } m["igo"] = { "Isebe", 11732248, "ngf-gum", "Latn", } m["igs"] = { "Glosa", 1138529, "art", "Latn", type = "appendix-constructed", } m["igw"] = { "Igwe", 3913985, "alv-yek", "Latn", } m["ihb"] = { "Pidgin Iha", 12639686, "crp", ancestors = "ihp", } m["ihi"] = { "Ihievbe", 3441193, "alv-eeo", "Latn", ancestors = "ema", } m["ihp"] = { "Iha", 5994495, "paa-mbi", "Latn", } m["ijc"] = { "Izon", 35483, "ijo", "Latn", } m["ije"] = { "Biseni", 35010, "ijo", "Latn", } m["ijj"] = { "Ede Ije", 12952406, "alv-ede", "Latn", } m["ijn"] = { "Kalabari", 35697, "ijo", "Latn", } m["ijs"] = { "Southeast Ijo", 3915854, "ijo", "Latn", } m["ike"] = { "Eastern Canadian Inuktitut", 4126517, "esx-inu", "Cans, Latn", } m["iki"] = { "Iko", 3813290, "nic-lcr", "Latn", } m["ikk"] = { "Ika", 35406, "alv-igb", "Latn", } m["ikl"] = { "Ikulu", 425973, "nic-plc", "Latn", } m["iko"] = { "Olulumo-Ikom", 3914402, "nic-uce", "Latn", } m["ikp"] = { "Ikpeshi", 3912777, "alv-yek", "Latn", } m["ikr"] = { "Ikaranggal", 5995402, "aus-pam", } m["iks"] = { "Inuit Sign Language", 13360244, "sgn", "Latn", -- when documented } m["ikt"] = { "Inuvialuktun", 27990, "esx-inu", "Cans, Latn", } m["ikv"] = { "Iku-Gora-Ankwa", 3913940, "nic-plc", } m["ikw"] = { "Ikwere", 35399, "alv-igb", "Latn", } m["ikx"] = { "Ik", 35472, "ssa-klk", "Latn", } m["ikz"] = { "Ikizu", 10977626, "bnt-lok", "Latn", } m["ila"] = { "Ile Ape", 12473380, "poz-cet", } m["ilb"] = { "Ila", 10962725, "bnt-bot", "Latn", } m["ilg"] = { "Ilgar", 5997810, "aus-wdj", "Latn", } m["ili"] = { "Ili Turki", 33627, "trk-kar", "Cyrl", } m["ilk"] = { "Ilongot", 3148787, "phi", "Latn", } m["ill"] = { "อีรานุน", 12953581, "phi", "Latn, Arab", } m["ilo"] = { "อีโลกาโน", 35936, "phi", "Latn, Tglg", translit = { Tglg = "ilo-translit", }, override_translit = true, strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ .. 
c.diaer, } }, sort_key = { Latn = "tl-sortkey", }, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc, }, } m["ils"] = { "International Sign", 35754, "sgn", } m["ilu"] = { "Ili'uun", 12632888, "poz-tim", "Latn", } m["ilv"] = { "Ilue", 3813301, "nic-lcr", "Latn", } m["ima"] = { "Mala Malasar", 6740693, "dra-tam", } m["imi"] = { "Anamgura", 3501881, "ngf-pom", "Latn", } m["iml"] = { "Miluk", 3314550, "nai-coo", "Latn", } m["imn"] = { "Imonda", 6005721, "paa-brd", "Latn", } m["imo"] = { "Imbongu", 12632895, "ngf-chw", "Latn", } m["imr"] = { "Imroing", 6008394, "poz-tim", } m["ims"] = { "Marsian", 1265446, "itc-sbl", "Ital, Latn", -- Ital translit in [[Module:scripts/data]] display_text = { Latn = s["itc-Latn-displaytext"] }, strip_diacritics = { Latn = s["itc-Latn-stripdiacritics"] }, sort_key = { Latn = s["itc-Latn-sortkey"] }, } m["imy"] = { "Milyan", 3832946, "ine-luw", "Lyci", } m["inb"] = { "Inga", 35491, "qwe", ancestors = "qwe-kch", } m["ing"] = { "Deg Xinag", 27782, "ath-nor", "Latn", } m["inh"] = { "อิงกุช", 33509, "cau-vay", "Cyrl, Latn, Arab", translit = { Cyrl = "cau-nec-translit", Arab = "ar-translit", }, override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = {"аь", "гӏ", "ё", "кх", "къ", "кӏ", "пӏ", "тӏ", "хь", "хӏ", "цӏ", "чӏ", "яь"}, to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "п" .. p[1], "т" .. p[1], "х" .. p[1], "х" .. p[2], "ц" .. p[1], "ч" .. p[1], "я" .. 
p[1]} }, }, } m["inj"] = { "Jungle Inga", 16115012, "qwe", ancestors = "qwe-kch", } m["inl"] = { "มืออินโดนีเซีย", 3915477, "sgn", "Latn", -- when documented } m["inm"] = { "Minaean", 737784, "sem-osa", "Sarb", -- Sarb translit in [[Module:scripts/data]] } m["inn"] = { "Isinai", 6081098, "phi", "Latn", } m["ino"] = { "Inoke-Yate", 6036531, "ngf-kag", "Latn", } m["inp"] = { "Iñapari", 15338035, "awd", "Latn", } m["ins"] = { "มืออินเดีย", 12953486, "sgn", } m["int"] = { "Intha", 6057507, "tbq-brm", ancestors = "obr", } m["inz"] = { "Ineseño", 35443, "nai-chu", "Latn", } m["ior"] = { "Inor", 35763, "sem-eth", "Ethi", } m["iou"] = { "Tuma-Irumu", 7852460, "ngf-fin", "Latn", } m["iow"] = { "Chiwere", 56737, "sio-msv", "Latn", } m["ipi"] = { "Ipili", 6065141, "ngf-eng", "Latn", } m["ipo"] = { "Ipiko", 10566515, "paa-ani", "Latn", } m["iqu"] = { "Iquito", 2669184, "sai-zap", "Latn", } m["iqw"] = { "Ikwo", 11926474, "alv-igb", "Latn", ancestors = "izi", } m["ire"] = { "Iresim", 6069398, "poz-hce", "Latn", } m["irh"] = { "Irarutu", 3027928, "poz-cet", "Latn", } m["iri"] = { "Rigwe", 3912756, "nic-plc", "Latn", } m["irk"] = { "Iraqw", 33595, "cus-sou", "Latn", } m["irn"] = { "Irantxe", 3409301, nil, "Latn", } m["irr"] = { "Ir", 3071880, "mkh-kat", } m["iru"] = { "Irula", 33363, "dra-imd", "Taml", translit = "Taml-translit" } m["irx"] = { -- Wikipedia and Glottolog say that North and South Kamberau are different languages but ISO 639-3 has not (yet?) -- split them. 
"Kamberau", 6356317, "ngf-ask", "Latn", } m["iry"] = { "Iraya", 6068356, "phi", "Latn", } m["isa"] = { "Isabi", 11732247, "ngf-kag", "Latn", } m["isc"] = { "Isconahua", 3052971, "sai-pan", "Latn", } m["isd"] = { "Isnag", 6085162, "phi", "Latn", } m["ise"] = { "Italian Sign Language", 375619, "sgn", "Latn", -- when documented } m["isg"] = { "Irish Sign Language", 14183, "sgn", "Latn", -- when documented } m["ish"] = { "Esan", 35268, "alv-eeo", "Latn", } m["isi"] = { "Nkem-Nkum", 36261, "nic-eko", "Latn", } m["isk"] = { "Ishkashimi", 33419, "ira-sgi", "Cyrl, Arab", } m["ism"] = { "Masimasi", 6783273, "poz-ocw", "Latn", } m["isn"] = { "Isanzu", 6078891, "bnt-tkm", "Latn", } m["iso"] = { "Isoko", 35414, "alv-swd", "Latn", } m["isr"] = { "Israeli Sign Language", 2911863, "sgn", "Sgnw", } m["ist"] = { "อิสเตรีย", 35845, "roa-dal", "Latn", } m["isu"] = { "Isu", 6089423, "nic-rnw", "Latn", } m["isv"] = { "Interslavic", 148971, "art", "Latn, Cyrl", type = "appendix-constructed", ancestors = "sla-pro", } m["itb"] = { "Binongan Itneg", 12953584, "phi", "Latn", } m["itd"] = { "Southern Tidung", 63214959, "poz-san", "Latn", } m["ite"] = { "Itene", 3038640, "sai-cpc", "Latn", } m["iti"] = { "Inlaod Itneg", 12953585, "phi", } m["itk"] = { "Judeo-Italian", 1145414, "roa-itd", "Hebr, Latn", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["itl"] = { "Itelmen", 33624, "qfa-cka", "Cyrl, Latn", strip_diacritics = { Cyrl = { from = {"['’]", "[ӅԮ]", "[ӆԯ]", "Ҳ", "ҳ"}, to = {"ʼ", "Ԓ", "ԓ", "Ӽ", "ӽ"} }, }, sort_key = { Cyrl = { from = { "ӑ", "ё", "кʼ", "ӄʼ", "о̆", "пʼ", "тʼ", "ў", "чʼ", -- 2 chars "ӄ", "љ", "ԓ", "њ", "ӈ", "ӽ", "ә" -- 1 char }, to = { "а" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "ч" .. p[1], "к" .. p[2], "л" .. p[1], "л" .. p[2], "н" .. p[1], "н" .. p[2], "х" .. p[1], "ь" .. 
p[1] } }, }, } m["itm"] = { "Itu Mbon Uzo", 10977737, "nic-ief", "Latn", ancestors = "ibr", } m["ito"] = { "Itonama", 950585, "qfa-iso", } m["itr"] = { "Iteri", 2083185, "paa-lem", "Latn", } m["its"] = { "Itsekiri", 36045, "alv-edk", "Latn", strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.macron}}, sort_key = { remove_diacritics = c.tilde, from = {"ẹ", "gb", "gh", "kp", "ọ", "ts", "ṣ"}, to = {"e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "o" .. p[1], "t" .. p[1], "t" .. p[1]} }, } m["itt"] = { "Maeng Itneg", 18748761, "phi", } m["itv"] = { "Itawit", 3915527, "phi", "Latn", } m["itw"] = { "Ito", 11128810, "nic-ief", ancestors = "ibr", } m["itx"] = { "Itik", 6094713, "paa-tkw", } m["ity"] = { "Moyadan Itneg", 12953583, "phi", } m["itz"] = { "Itza'", 35537, "myn", "Latn", } m["ium"] = { "เมี่ยน", 2498808, "hmx-mie", "Latn", } m["ivb"] = { "Ibatan", 18748212, "phi", "Latn", } m["ivv"] = { "Ivatan", 3547080, "phi", "Latn", } m["iwk"] = { "I-Wak", 12632789, "phi", } m["iwm"] = { "Iwam", 3915215, "paa-iwm", "Latn", } m["iwo"] = { "Iwur", 6101006, "ngf-okk", "Latn", } m["iws"] = { "Sepik Iwam", 16893603, "paa-iwm", "Latn", } m["ixc"] = { "Ixcatec", 56706, "omq", } m["ixl"] = { "Ixil", 35528, "myn", "Latn", } m["iya"] = { "Iyayu", 3913390, "alv-nwd", "Latn", } m["iyo"] = { "Mesaka", 36080, "nic-tiv", "Latn", } m["iyx"] = { "Yaa", 36909, "bnt-nze", "Latn", } m["izh"] = { "อิงเกรีย", 33559, "urj-fin", "Latn", sort_key = { from = { "š", "ž", }, to = { "s" .. p[1], "z" .. 
p[1], } }, } m["izi"] = { "Izi-Ezaa-Ikwo-Mgbo", 11927027, "alv-igb", } m["izr"] = { "Izere", 6101921, "nic-plc", "Latn", } m["izz"] = { "Izi", 3914387, "alv-igb", "Latn", ancestors = "izi", } return require("Module:languages").finalizeData(m, "language") j9de089eppkcf5v3dfhi0hp4mj41cy6 มอดูล:languages/data/3/h 828 36379 5720758 5684156 2026-04-21T07:00:59Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720758 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["haa"] = { "Hän", 28272, "ath-nor", "Latn", } m["hab"] = { "Hanoi Sign Language", 12632107, "sgn", "Latn", -- when documented } m["hac"] = { "Gurani", 33733, "ira-zgr", "ku-Arab", translit = "ckb-translit", } m["had"] = { "Hatam", 56825, "qfa-iso", -- Would form paa-ham with [[w:Mansim language]] if that language were added "Latn", } m["haf"] = { "Haiphong Sign Language", 39868240, "sgn", } m["hag"] = { "Hanga", 35426, "nic-dag", "Latn", } m["hah"] = { "Hahon", 3125730, "poz-ocw", "Latn", } m["hai"] = { "Haida", 33303, "qfa-iso", "Latn", } m["haj"] = { "ฮาชอง", 3350576, "qfa-mix", "as-Beng, Latn", ancestors = "tbq-pro, inc-oas, inc-obn", } m["hak"] = { "แคะ", 33375, "zhx", "Hants", ancestors = "ltc", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["hal"] = { "Halang", 56307, "mkh", "Latn", } m["ham"] = { "Hewa", 5748345, "paa-spk", "Latn", } m["hao"] = { "Hakö", 3125871, "poz-ocw", "Latn", } m["hap"] = { "Hupla", 5946223, "ngf-dan", "Latn", } m["har"] = { "Harari", 33626, "sem-eth", "Ethi", translit = "Ethi-translit", } m["has"] = { "Haisla", 3107399, "wak", "Latn", } m["hav"] = { "Havu", 5684097, "bnt-shh", "Latn", } m["haw"] = { "ฮาวาย", 33569, "poz-pep", "Latn", display_text = { from = 
{"‘"}, to = {"ʻ"} }, sort_key = {remove_diacritics = c.macron}, standard_chars = "AaĀāEeĒēIiĪīOoŌōUuŪūHhKkLlMmNnPpWwʻ" .. c.punc, } m["hax"] = { "Southern Haida", 12953543, "qfa-iso", "Latn", ancestors = "hai", } m["hay"] = { "Haya", 35756, "bnt-haj", "Latn", } m["hba"] = { "Hamba", 11028905, "bnt-tet", "Latn", } m["hbb"] = { "Huba", 56290, "cdc-cbm", "Latn", } m["hbn"] = { "Heiban", 35523, "alv-hei", } m["hbu"] = { "Habu", 1567033, "poz-cet", "Latn", } m["hca"] = { "Andaman Creole Hindi", 7599417, "crp", ancestors = "hi, bn, ta", } m["hch"] = { "Huichol", 35575, "azc", "Latn", } m["hdn"] = { "Northern Haida", 20054484, "qfa-iso", "Latn", ancestors = "hai", } m["hds"] = { "Honduras Sign Language", 3915496, "sgn", "Latn", -- when documented } m["hdy"] = { "Hadiyya", 56613, "cus-hec", "Latn, Ethi", } m["hea"] = { "Northern Qiandong Miao", 3138832, "hmn", "Latn, Bopo", } m["hed"] = { "Herdé", 56253, "cdc-mas", "Latn", } m["heg"] = { "Helong", 35432, "poz-tim", "Latn", } m["heh"] = { "Hehe", 3129390, "bnt-bki", "Latn", } m["hei"] = { "Heiltsuk", 5699507, "wak", "Latn", } m["hem"] = { "Hemba", 5711209, "bnt-lbn", } m["hgm"] = { "Haiǁom", 4494781, "khi-khk", "Latn", } m["hgw"] = { "Haigwai", 5639108, "poz-ocw", "Latn", } m["hhi"] = { "Hoia Hoia", 5877767, "paa-ani", } m["hhr"] = { "Kerak", 11010783, "alv-jfe", } m["hhy"] = { "Hoyahoya", 15633149, "paa-ani", "Latn", } m["hia"] = { "Lamang", 35700, "cdc-cbm", "Latn", } m["hib"] = { "Hibito", 3135164, "qfa-unc", -- poorly attested; possibly in a Hibito-Cholon or Cholonan family } m["hid"] = { "Hidatsa", 3135234, "sio-mor", "Latn", } m["hif"] = { "ฮินดีแบบฟีจี", 46728, "inc-hie", "Latn", ancestors = "awa", } m["hig"] = { "Kamwe", 56271, "cdc-cbm", } m["hih"] = { "Pamosu", 12953011, "ngf-tib", "Latn", } m["hii"] = { "Hinduri", 5766763, "him", "Deva, Takr", } m["hij"] = { "Hijuk", 35274, "bnt-bsa", } m["hik"] = { "Seit-Kaitetu", 7446989, "poz-cma", } m["hil"] = { "ฮีลีไกโนน", 35978, "phi", "Latn", strip_diacritics = {Latn = 
{remove_diacritics = c.grave .. c.acute .. c.circ}}, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey" }, } m["hio"] = { "Tshwa", 963636, "khi-kal", } m["hir"] = { "Himarimã", 5765127, "qfa-unc", -- language of uncontacted group; word list lost; believed Arawan } m["hit"] = { "Hittite", 35668, "ine-ana", "Xsux, Latn", } m["hiw"] = { "Hiw", 3138713, "poz-vnn", "Latn", } m["hix"] = { "Hixkaryana", 56522, "sai-prk", "Latn", } m["hji"] = { "Haji", 5639933, "poz-mly", "Latn", } m["hka"] = { "Kahe", 3892562, "bnt-chg", "Latn", } m["hke"] = { "Hunde", 3065432, "bnt-shh", "Latn", } m["hkh"] = { "Pogali", 105198619, "inc-kas", } m["hkk"] = { "Hunjara-Kaina Ke", 63213931, "paa-bin", "Latn", } m["hkn"] = { "Mel-Khaonh", 19059577, "mkh-ban", } m["hks"] = { "Hong Kong Sign Language", 17038844, "sgn", } m["hla"] = { "Halia", 3125959, "poz-ocw", "Latn", } m["hlb"] = { "Halbi", 3695692, "inc-hal", "Deva, Orya", translit = { Deva = "Deva-translit", Orya = "Orya-translit", }, } m["hld"] = { "Halang Doan", 3914632, "mkh-ban", } m["hle"] = { "Hlersu", 5873537, "tbq-llo", } m["hlt"] = { "Nga La", 12952942, "tbq-kuk", "Latn", } m["hma"] = { "Southern Mashan Hmong", 12953560, "hmn", "Latn", } m["hmb"] = { "Humburi Senni", 35486, "son", "Latn, Arab", } m["hmc"] = { "Central Huishui Hmong", 12953558, "hmn", } m["hmd"] = { "A-Hmao", 1108934, "hmn", "Latn, Plrd", } m["hme"] = { "Eastern Huishui Hmong", 12953559, "hmn", } m["hmf"] = { "Hmong Don", 22911602, "hmn", } m["hmg"] = { "Southwestern Guiyang Hmong", 27478542, "hmn", } m["hmh"] = { "Southwestern Huishui Hmong", 12953565, "hmn", } m["hmi"] = { "Northern Huishui Hmong", 27434946, "hmn", } m["hmj"] = { "Ge", 11251864, "hmn", "Bopo", } m["hmk"] = { "Yemaek", 8050724, "qfa-kor", "Hani", sort_key = "Hani-sortkey", } m["hml"] = { "Luopohe Hmong", 14468943, "hmn", } m["hmm"] = { "Central Mashan Hmong", 12953561, "hmn", } m["hmp"] = { "Northern Mashan Hmong", 12953564, "hmn", } 
m["hmq"] = { "Eastern Qiandong Miao", 27431369, "hmn", } m["hmr"] = { "Hmar", 2992841, "tbq-kuk", "Latn", ancestors = "lus", } m["hms"] = { "Southern Qiandong Miao", 12953562, "hmn", } m["hmt"] = { "Hamtai", 5646436, "ngf-ang", "Latn", } m["hmu"] = { "Hamap", 12952484, "paa-tap", "Latn", } m["hmv"] = { "Hmong Dô", 22911598, "hmn", "Latn", -- Probably also Hmng } m["hmw"] = { "Western Mashan Hmong", 12953563, "hmn", } m["hmy"] = { "Southern Guiyang Hmong", 12953553, "hmn", } m["hmz"] = { "Hmong Shua", 25559603, "hmn", } m["hna"] = { "Mina", 56532, "cdc-cbm", } m["hnd"] = { "Southern Hindko", 382273, "inc-pan", "pa-Arab", ancestors = "lah", } m["hne"] = { "Chhattisgarhi", 33158, "inc-hie", "Deva", ancestors = "inc-oaw", translit = { Deva = "Deva-translit", }, } m["hnh"] = { "ǁAni", 3832982, "khi-kal", "Latn", } m["hni"] = { "Hani", 56516, "tbq-han", "Latn", } m["hnj"] = { "ม้งเขียว", 3138831, "hmn", "Latn, Hmng, Hmnp", } m["hnm"] = { "ไหหลำ", 934541, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["hnn"] = { "ฮานูโนโอ", 35435, "phi", "Hano, Latn", translit = {Hano = "hnn-translit"}, override_translit = true, strip_diacritics = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}}, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey", }, } m["hno"] = { "Northern Hindko", 6346358, "inc-pan", "Arab", ancestors = "lah", } m["hns"] = { "ฮินดูสตานีแบบแคริบเบียน", 1843468, "inc", -- "crp"? 
"Arab, Deva, Kthi, Latn", ancestors = "bho, awa", } m["hnu"] = { "Hung", 12632753, "mkh-vie", } m["hoa"] = { "Hoava", 3138887, "poz-ocw", "Latn", } m["hob"] = { "Austronesian Mari", 6760941, "poz-ocw", "Latn", } m["hoc"] = { "Ho", 33270, "mun", "Wara, Orya, Deva, Beng, Latn", translit = { Orya = "Orya-translit", Deva = "Deva-translit", Beng = "Beng-translit", }, } m["hod"] = { "Holma", 56331, "cdc-cbm", "Latn", } m["hoe"] = { "Horom", 3914008, "nic-ple", "Latn", } m["hoh"] = { "Hobyót", 33299, "sem-sar", "Arab, Latn", } m["hoi"] = { "Holikachuk", 28508, "ath-nor", "Latn", } m["hoj"] = { "Hadoti", 33227, "raj", "Deva", translit = "Deva-translit", } m["hol"] = { "Holu", 4121133, "bnt-pen", "Latn", } m["hom"] = { "Homa", 3449953, "bnt-boa", "Latn", } m["hoo"] = { "Holoholo", 3139484, "bnt-tkm", "Latn", } m["hop"] = { "Hopi", 56421, "azc", "Latn", } m["hor"] = { "Horo", 641748, "csu-sar", } m["hos"] = { "Ho Chi Minh City Sign Language", 16111971, "sgn", "Latn", -- when documented } m["hot"] = { "Hote", 12632404, "poz-ocw", "Latn", } m["hov"] = { "Hovongan", 5917269, "poz", "Latn", } m["how"] = { "Honi", 56842, "tbq-han", } m["hoy"] = { "Holiya", 5880707, "dra-kan", } m["hoz"] = { "Hozo", 5923010, "omv-mao", } m["hpo"] = { "Hpon", 5923277, "tbq-brm", "Latn", } m["hps"] = { "Hawai'i Pidgin Sign Language", 33358, "sgn", "Latn", -- when documented } m["hra"] = { "Hrangkhol", 5923435, "tbq-kuk", "Latn", } m["hrc"] = { "Niwer Mil", 30323994, "poz-oce", "Latn", } m["hre"] = { "Hrê", 3915794, "mkh-nbn", "Latn", } m["hrk"] = { "Haruku", 5675762, "poz-cma", } m["hrm"] = { "Horned Miao", 63213949, "hmn", } m["hro"] = { "Haroi", 3127568, "cmc", "Latn", } m["hrp"] = { "Nhirrpi", 32571318, "aus-kar", } m["hrt"] = { "Hértevin", 33290, "sem-nna", "Latn", } m["hru"] = { "Hruso", 5923933, "sit-hrs", "Latn", } m["hrw"] = { "Warwar Feni", 56704265, "poz-oce", "Latn", } m["hrx"] = { "ฮุนสริก", 304049, "gmw-hgm", "Latn", ancestors = "gmw-cfr", } m["hrz"] = { "Harzani", 56464, "xme-ttc", 
"fa-Arab, Latn", ancestors = "xme-ttc-nor", } m["hsb"] = { "ซอร์บตอนบน", 13248, "wen", "Latn", sort_key = s["wen-sortkey"], standard_chars = "AaBbCcČčĆćDdEeĚěFfGgHhIiJjKkŁłLlMmNnŃńOoÓóPpRrŘřSsŠšTtUuWwYyZzŽžŹź" .. c.punc, } m["hsh"] = { "Hungarian Sign Language", 13636869, "sgn", "Latn", -- when documented } m["hsl"] = { "Hausa Sign Language", 3915462, "sgn", "Latn", -- when documented } m["hsn"] = { "เซียง", 13220, "zhx", "Hants", ancestors = "ltc", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["hss"] = { "Harsusi", 33423, "sem-sar", "Arab, Latn", } m["hti"] = { "Hoti", 5912372, "poz-cma", "Latn", } m["hto"] = { "Minica Huitoto", 948514, "sai-wit", "Latn", } m["hts"] = { "Hadza", 33411, "qfa-iso", "Latn", } m["htu"] = { "Hitu", 5872700, "poz-cma", "Latn", } m["hub"] = { "Huambisa", 1526037, "sai-jiv", "Latn", } m["huc"] = { "ǂHoan", 2053913, "khi-kxa", "Latn", } m["hud"] = { "Huaulu", 12952504, "poz-cma", "Latn", } m["huf"] = { "Humene", 11732231, "paa-kwa", "Latn", } m["hug"] = { "Huachipaeri", 3446617, "sai-har", "Latn", } m["huh"] = { "Huilliche", 35531, "sai-ara", "Latn", } m["hui"] = { "Huli", 3125121, "ngf-eng", "Latn", } m["huj"] = { "Northern Guiyang Hmong", 12953554, "hmn", } m["huk"] = { "Hulung", 12952505, "poz-cet", } m["hul"] = { "Hula", 6382179, "poz-ocw", "Latn", } m["hum"] = { "Hungana", 10975396, "bnt-yak", } m["huo"] = { "Hu", 3141783, "mkh-pal", } m["hup"] = { "Hupa", 28058, "ath-pco", "Latn", } m["huq"] = { "Tsat", 34133, "cmc", } m["hur"] = { "Halkomelem", 35388, "sal", "Latn", } m["hus"] = { "Wastek", 35573, "myn", "Latn", } m["huu"] = { "Murui Huitoto", 2640935, "sai-wit", "Latn", } m["huv"] = { "Huave", 12954031, "qfa-iso", "Latn", } m["huw"] = { "Hukumina", 3142988, "poz-cma", "Latn", } m["hux"] = { "Nüpode Huitoto", 56333, "sai-wit", "Latn", } m["huy"] = { "Hulaulá", 33426, "sem-nna", "Hebr", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["huz"] = { "Hunzib", 
56564, "cau-ets", "Cyrl", translit = "huz-translit", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["hvc"] = { "Haitian Vodoun Culture Language", 3504239, "crp", "Latn", } m["hvk"] = { "Haveke", 5683513, "poz-cln", "Latn", } m["hvn"] = { "Sabu", 3128792, "poz-cet", "Latn", } m["hwa"] = { "Wané", 3914887, "kro-ekr", "Latn", } m["hwc"] = { "Hawaiian Creole", 35602, "crp", "Latn", } m["hwo"] = { "Hwana", 56498, "cdc-cbm", "Latn", } m["hya"] = { "Hya", 56798, "cdc-cbm", "Latn", } return require("Module:languages").finalizeData(m, "language") qhhob2c9rjmybektliq0fbeijwr90s9 มอดูล:languages/data/3/g 828 36380 5720757 5719157 2026-04-21T07:00:57Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720757 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["gaa"] = { "Ga", 33287, "alv-gda", "Latn", } m["gab"] = { "Gabri", 3441237, "cdc-est", "Latn", } m["gac"] = { "Mixed Great Andamanese", 56329630, "qfa-adn", "Latn", } m["gad"] = { -- not to be confused with gdk, gdg "Gaddang", 3438830, "phi", "Latn", } m["gae"] = { "Warekena", 1091095, "awd-nwk", "Latn", } m["gaf"] = { "Gende", 3100425, "ngf-kag", "Latn", } m["gag"] = { "กากาอุซ", 33457, "trk-ogz", "Latn, Cyrl", ancestors = "trk-oat", dotted_dotless_i = true, sort_key = { Latn = { from = { "i", -- Ensure "i" comes after "ı". "ä", "ç", "ê", "ı", "ö", "ş", "ţ", "ü" }, to = { "i" .. p[1], "a" .. p[1], "c" .. p[1], "e" .. p[1], "i", "o" .. p[1], "s" .. p[1], "t" .. p[1], "u" .. 
p[1] } }, }, } m["gah"] = { "Alekano", 3441595, "ngf-kag", "Latn", } m["gai"] = { "Borei", 6799756, "paa-ram", "Latn", } m["gaj"] = { "Gadsup", 5516467, "ngf-kag", "Latn", } m["gak"] = { "Gamkonora", 5520226, "paa-nha", "Latn", } m["gal"] = { "Galoli", 35322, "poz-tim", "Latn", } m["gam"] = { "Kandawo", 6361369, "ngf-chw", "Latn", } m["gan"] = { "กั้น", 33475, "zhx", "Hants", ancestors = "ltc", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["gao"] = { "Gants", 5521529, "ngf-eso", "Latn", } m["gap"] = { "Gal", 5517742, "ngf-han", "Latn", } m["gaq"] = { "Gata'", 3501920, "mun", "Orya", } m["gar"] = { "Galeya", 5518509, "poz-ocw", "Latn", } m["gas"] = { "Adiwasi Garasia", 12953522, "inc-bhi", "Deva, Gujr", ancestors = "bhb", } m["gat"] = { "Kenati", 4219330, "ngf-kag", "Latn", } m["gau"] = { "Kondekor", 12952433, "dra-pgd", "Telu", } m["gaw"] = { "Nobonob", 11732205, "ngf-han", "Latn", } m["gay"] = { "Gayo", 33286, "poz-nws", "Latn", } m["gbb"] = { "Kaytetye", 6380709, "aus-rnd", "Latn", } m["gbd"] = { "Karadjeri", 3913837, "aus-pam", "Latn", } m["gbe"] = { "Niksek", 56375, "paa-spk", "Latn", } m["gbf"] = { "Gaikundi", 5517032, "paa-ndu", "Latn", } m["gbg"] = { "Gbanziri", 35306, "nic-nkg", "Latn", } m["gbh"] = { "Defi Gbe", 12952446, "alv-gbe", "Latn", } m["gbi"] = { "Galela", 3094570, "paa-nha", "Latn", } m["gbj"] = { "Bodo Gadaba", 3347070, "mun", "Orya", } m["gbk"] = { "Gaddi", 17455500, "him", "Deva, Takr", translit = { Deva = "Deva-translit", }, } m["gbl"] = { "Gamit", 2731717, "inc-bhi", "Deva, Gujr", translit = { Deva = "Deva-translit", Gujr = "Gujr-translit", }, } m["gbm"] = { "Garhwali", 33459, "inc-pah", "Deva", translit = { Deva = "Deva-translit", }, } m["gbn"] = { "Mo'da", 12755683, "csu-bbk", "Latn", } m["gbo"] = { "Northern Grebo", 11157042, "grb", "Latn", } m["gbp"] = { "Gbaya-Bossangoa", 11011295, "gba-wes", "Latn", } m["gbq"] = { "Gbaya-Bozoum", 4952879, "gba-wes", "Latn", } m["gbr"] = { "Gbagyi", 
11015105, "alv-ngb", "Latn", } m["gbs"] = { "Gbesi Gbe", 12952448, "alv-pph", "Latn", } m["gbu"] = { "Gagadu", 35677, "aus-arn", "Latn", } m["gbv"] = { "Gbanu", 3914945, "gba-eas", "Latn", } m["gbw"] = { "Gabi", 5515391, "aus-pam", "Latn", } m["gbx"] = { "Eastern Xwla Gbe", 18379975, "alv-pph", "Latn", } m["gby"] = { "Gbari", 3915451, "alv-ngb", "Latn", } m["gcc"] = { "Mali", 6743338, "paa-bng", "Latn", } m["gcd"] = { "Ganggalida", 3913765, "aus-tnk", "Latn", } m["gce"] = { "Galice", 20711, "ath-pco", "Latn", } m["gcf"] = { "Antillean Creole", 3006280, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["gcl"] = { "Grenadian Creole English", 4252500, "crp", "Latn", ancestors = "en", } m["gcn"] = { "Gaina", 11732195, "paa-bin", "Latn", } m["gcr"] = { "Guianese Creole", 1363072, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["gct"] = { "Colonia Tovar German", 1138351, "gmw-hgm", "Latn", ancestors = "gsw", } m["gdb"] = { "Ollari", 33906, "dra-pgd", "Orya, Telu", translit = { Orya = "Orya-translit", Telu = "Telu-translit" }, } m["gdc"] = { "Gugu Badhun", 10510360, "aus-pam", "Latn", } m["gdd"] = { "Gedaged", 35292, "poz-ocw", "Latn", } m["gde"] = { "Gude", 3441230, "cdc-cbm", "Latn", } m["gdf"] = { "Guduf-Gava", 3441350, "cdc-cbm", "Latn", } m["gdg"] = { -- not to be confused with gad, gdk "Ga'dang", 5515189, "phi", "Latn", } m["gdh"] = { "Gadjerawang", 3913817, "aus-jar", "Latn", } m["gdi"] = { "Gundi", 11137851, "nic-nkb", "Latn", } m["gdj"] = { "Kurtjar", 5619931, "aus-pmn", "Latn", } m["gdk"] = { -- not to be confused with gad, gdg "Gadang", 56256, "cdc-est", "Latn", } m["gdl"] = { "Dirasha", 56809, "cus-eas", "Ethi", } m["gdm"] = { "Laal", 33436, "qfa-dis", -- Chad; unclassified, isolate or grouped with Adamawa or Chadic languages "Latn", } m["gdn"] = { "Umanakaina", 7881084, "ngf-dag", "Latn", } m["gdo"] = { "Godoberi", 56515, "cau-and", "Cyrl", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = 
s["cau-Cyrl-stripdiacritics"]}, } m["gdq"] = { "Mehri", 13361, "sem-sar", "Arab, Latn", } m["gdr"] = { "Wipi", 8026711, "paa-etf", "Latn", } m["gds"] = { "Ghandruk Sign Language", 15971577, "sgn", } m["gdt"] = { "Kungardutyi", 6444517, "aus-kar", "Latn", } m["gdu"] = { "Gudu", 3441172, "cdc-cbm", "Latn", } m["gdx"] = { "Godwari", 3540922, "raj", "Deva", translit = { Deva = "Deva-translit", }, } m["gea"] = { "Geruma", 3438789, "cdc-wst", "Latn", } m["geb"] = { "Kire", 11129733, "paa-ram", "Latn", } m["gec"] = { "Gboloo Grebo", 11019342, "grb", "Latn", } m["ged"] = { "Gade", 3914459, "alv-nup", "Latn", } m["geg"] = { "Gengle", 3438345, "alv-mye", "Latn", ancestors = "kow", } m["geh"] = { "Hutterisch", 33385, "gmw-hgm", "Latn", ancestors = "bar", } m["gei"] = { "Gebe", 3100032, "poz-hce", "Latn", } m["gej"] = { "Gen", 33450, "alv-gbe", "Latn", } m["gek"] = { "Gerka", 3441277, "cdc-wst", "Latn", } m["gel"] = { "Fakkanci", 36627, "nic-knn", "Latn", } m["geq"] = { "Geme", 3915851, "znd", "Latn", } m["ges"] = { "Geser-Gorom", 5553579, "poz-cma", "Latn", } m["gev"] = { "Viya", 7937974, "bnt-tso", "Latn", } m["gew"] = { "Gera", 3438725, "cdc-wst", "Latn", } m["gex"] = { "Garre", 56618, "cus-som", "Latn", } m["gey"] = { "Enya", 5381452, "bnt-mbe", "Latn", } m["gez"] = { "กืออึซ", 35667, "sem-eth", "Ethi", translit = "Ethi-translit", } m["gfk"] = { "Patpatar", 3368846, "poz-ocw", "Latn", } m["gft"] = { "Gafat", 56910, "sem-eth", "Ethi, Latn", } m["gga"] = { "Gao", 3095228, "poz-ocw", "Latn", } m["ggb"] = { "Gbii", 3914390, "kro-wkr", "Latn", } m["ggd"] = { "Gugadj", 5615186, "aus-pmn", "Latn", } m["gge"] = { "Guragone", 5619801, "aus-arn", "Latn", } m["ggg"] = { "Gurgula", 5620032, "raj", "Arab", } m["ggk"] = { "Kungarakany", 6444516, "aus-arn", "Latn", } m["ggl"] = { "Ganglau", 5521140, "ngf-yag", "Latn", } m["ggn"] = { "Eastern Gurung", 12952472, "sit-tam", "Gukh, Deva", translit = { Deva = "Deva-translit", }, } m["ggt"] = { "Gitua", 3107865, "poz-ocw", "Latn", } m["ggu"] = 
{ "Gban", 3913317, "dmn-nbe", "Latn", } m["ggw"] = { "Gogodala", 3512161, "ngf-gsu", "Latn", } m["gha"] = { "Ghadames", 56747, "ber", "Latn", -- and other scripts? } m["ghc"] = { "แกลิกคลาสสิก", 5128278, "cel-gae", "Latn, Latg", ancestors = "mga", } m["ghe"] = { "Southern Ghale", 12952453, "sit-tam", "Deva", translit = { Deva = "Deva-translit", }, } m["ghh"] = { "Northern Ghale", 22662104, "sit-tam", "Deva", translit = { Deva = "Deva-translit", }, } m["ghk"] = { "Geko Karen", 5530317, "kar", } m["ghl"] = { "Ghulfan", 16885737, "nub-hil", "Latn", -- and others? } m["ghn"] = { "Ghanongga", 3104772, "poz-ocw", "Latn", } m["gho"] = { "Ghomara", 35315, "ber", "Tfng, Latn", translit = {Tfng = "Tfng-translit"}, } m["ghr"] = { "Ghera", 22808992, "inc-hiw", } m["ghs"] = { "Guhu-Samane", 11732219, "paa-gbi", "Latn", } m["ght"] = { "Kutang Ghale", 6448337, "sit-tam", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["gia"] = { "Kitja", 1284877, "aus-jar", "Latn", } m["gib"] = { "Gibanawa", 12953530, "crp", "Latn", ancestors = "ha", } m["gid"] = { "Gidar", 35265, "cdc-cbm", "Latn", } m["gie"] = { "Guébie", 63140714, "kro-did", "Latn", } m["gig"] = { "Goaria", 33269, "raj", "Arab", } m["gih"] = { "Githabul", 48987680, "aus-pam", "Latn", } m["gii"] = { "Girirra", 5564288, "cus-som", } m["gil"] = { "กิลเบิร์ต", 30898, "poz-mic", "Latn", } m["gim"] = { "Gimi (Papuan)", 11732209, "ngf-kag", "Latn", } m["gin"] = { "Hinukh", 33283, "cau-wts", "Cyrl", translit = "gin-translit", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["gip"] = { "Gimi (Austronesian)", 12952457, "poz-ocw", } m["giq"] = { "Green Gelao", 12953525, "gio", "Latn", } m["gir"] = { "Red Gelao", 3100264, "gio", } m["gis"] = { "North Giziga", 3515084, "cdc-cbm", } m["git"] = { "Gitxsan", 3107862, "nai-tsi", "Latn", } m["giu"] = { "Mulao", 11092831, "gio", } m["giw"] = { "White 
Gelao", 8843040, "gio", } m["gix"] = { "Gilima", 10977716, "nic-nkm", "Latn", } m["giy"] = { "Giyug", 5565906, } m["giz"] = { "South Giziga", 3502232, "cdc-cbm", } m["gji"] = { "Geji", 3914890, "cdc-wst", "Latn", } m["gjk"] = { "Kachi Koli", 12953646, "inc-wes", } m["gjm"] = { "Gunditjmara", 6448731, "aus-pam", "Latn", } m["gjn"] = { "Gonja", 35267, "alv-gng", "Latn", } m["gjr"] = { "Gurindji Kriol", 5620091, "qfa-mix", "Latn", ancestors = "gue, rop" } m["gju"] = { "Gojri", 3241731, "raj", "ur-Arab, Deva, Takr", strip_diacritics = { ["ur-Arab"] = { remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.smallv, from = {"ڵ", "ݩ"}, to = {"ل", "ن"} }, }, translit = { --["ur-Arab"] = "ur-translit", Deva = "Deva-translit", }, } m["gka"] = { "Guya", 11732221, "ngf-fin", "Latn", } m["gkd"] = { "Magɨ", 55621742, "ngf-ais", "Latn", } m["gke"] = { "Ndai", 6983667, "alv-mbm", } m["gkn"] = { "Gokana", 3075137, "nic-ogo", "Latn", } m["gko"] = { "Kok-Nar", 6426526, "aus-pmn", "Latn", } m["gkp"] = { "Guinea Kpelle", 11052867, "dmn-msw", "Latn, Kpel", ancestors = "kpe", } m["glc"] = { "Bon Gula", 289816, "alv-bua", } m["gld"] = { "Nanai", 13303, "tuw-nan", "Cyrl", translit = "gld-translit", strip_diacritics = {remove_diacritics = c.macron}, sort_key = { from = {"ё", "ӈ"}, to = {"е" .. p[1], "н" .. 
p[1]} }, } m["glh"] = { "Northwest Pashayi", 23713532, "inc-pas", "fa-Arab", } m["glj"] = { "Kulaal", 33360, "alv-bua", } m["glk"] = { "Gilaki", 33657, "ira-csp", "fa-Arab", } m["glo"] = { "Galambu", 2598797, "cdc-wst", "Latn", } m["glr"] = { "Glaro-Twabo", 3915313, "kro-wee", } m["glu"] = { "Gula", 5617176, "csu-bgr", "Latn", } m["glw"] = { "Glavda", 3441285, "cdc-cbm", "Latn", } m["gly"] = { "Gule", 3120736, "ssa-kom", } m["gma"] = { "Gambera", 10502327, "aus-wor", "Latn", } m["gmb"] = { "Gula'alaa", 3120733, "poz-sls", "Latn", } m["gmd"] = { "Mághdì", 3914475, "alv-bwj", } m["gmg"] = { "Magiyi", 16926155, "ngf-sog", "Latn", } m["gmh"] = { "เยอรมันสูงกลาง", 837985, "gmw-hgm", "Latn", strip_diacritics = { remove_diacritics = c.circ .. c.macron, from = {"Ë", "ë", "[ƷȤ]", "[ʒȥ]"}, to = {"E", "e", "Z", "z"} }, } m["gml"] = { "เยอรมันต่ำกลาง", 505674, "gmw-lgm", "Latn", strip_diacritics = {remove_diacritics = c.circ .. c.macron .. c.diaer}, } m["gmm"] = { "Gbaya-Mbodomo", 6799713, "gba-eas", "Latn", } m["gmn"] = { "Gimnime", 11016905, "alv-dur", "Latn", } m["gmr"] = { "Mirning", 6873793, "aus-pam", "Latn", } m["gmu"] = { "Gumalu", 5618027, "ngf-gum", "Latn", } m["gmv"] = { "กาโม", 16116386, "omv-nom", "Latn, Ethi", } m["gmx"] = { "Magoma", 16939552, "bnt-bki", } m["gmy"] = { "กรีกแบบไมซีนี", 668366, "grk", "Linb", translit = "Linb-translit", } m["gmz"] = { "Mgbo", 6826835, "alv-igb", ancestors = "izi", } m["gna"] = { "Kaansa", 56802, "nic-gur", } m["gnb"] = { "Gangte", 12952442, "tbq-kuk", } m["gnc"] = { "Guanche", 35762, "ber", } m["gnd"] = { "Zulgo-Gemzek", 56800, "cdc-cbm", "Latn", } m["gne"] = { "Ganang", 63163361, "nic-plc", ancestors = "izr", } m["gng"] = { "Ngangam", 35888, "nic-grm", } m["gnh"] = { "Lere", 3915319, "nic-jer", } m["gni"] = { "กูนียันดี", 2669219, "aus-bub", "Latn", } m["gnj"] = { "Ngen of Djonkro", 63170838, "dmn-nbe", "Latn", } m["gnk"] = { "ǁGana", 1975199, "khi-kal", "Latn", } m["gnl"] = { "Gangulu", 4916329, "aus-pam", "Latn", } m["gnm"] = 
{ "Ginuman", 11732210, "ngf-dag", "Latn", } m["gnn"] = { "Gumatj", 10510745, "aus-yol", "Latn", } m["gnq"] = { "Gana", 5520523, "poz-san", "Latn", } m["gnr"] = { "Gureng Gureng", 5619998, "aus-pam", "Latn", } m["gnt"] = { "Guntai", 12952475, "paa-yam", "Latn", } m["gnu"] = { "Gnau", 3915810, "paa-tor", "Latn", } m["gnw"] = { "Western Bolivian Guarani", 3775037, "gn", "Latn", } m["gnz"] = { "Ganzi", 11137942, "nic-nkb", "Latn", } m["goa"] = { "Guro", 35251, "dmn-mda", "Latn", } m["gob"] = { "Playero", 3027923, "sai-guh", } m["goc"] = { "Gorakor", 12952463, "poz-ocw", "Latn", } m["god"] = { "Godié", 3914412, "kro-bet", } m["goe"] = { "Gongduk", 2669221, "sit", } m["gof"] = { "โกฟา", 12631584, "omv-nom", "Latn, Ethi", } m["gog"] = { "Gogo", 3272630, "bnt-ruv", "Latn", } m["goh"] = { "เยอรมันสูงเก่า", 35218, "gmw-hgm", "Latn, Runr", strip_diacritics = { remove_diacritics = c.circ .. c.macron .. c.diaer, from = {"[ƷȤ]", "[ʒȥ]"}, to = {"Z", "z"} }, translit = { Runr = "Runr-translit", }, } m["goi"] = { "Gobasi", 5575414, "ngf-est", "Latn", } m["goj"] = { "Gowlan", 12953532, "inc-sou", } -- gok is a spurious language, see [[w:Spurious languages]] m["gol"] = { "Gola", 35482, "alv", "Latn, Vaii", } m["gon"] = { "โคณฑี", 1775361, "dra-gon", "Telu, Gonm, Gong, Deva, Orya", translit = { Telu = "Telu-translit", Gong = "gon-Gong-translit", Gonm = "gon-Gonm-translit", }, } m["goo"] = { "Gone Dau", 3110470, "poz-pcc", "Latn", } m["gop"] = { "Yeretuar", 8052565, "poz-hce", "Latn", } m["goq"] = { "Gorap", 3110816, "crp", "Latn", ancestors = "ms", } m["gor"] = { "Gorontalo", 2501174, "phi", "Latn", } m["got"] = { "กอท", 35722, "gme", "Goth, Runr, Latn", translit = {Goth = "Goth-translit"}, link_tr = true, strip_diacritics = {Latn = {remove_diacritics = c.macron}}, } m["gou"] = { "Gavar", 3441180, "cdc-cbm", } m["gov"] = { "Goo", 16927208, "dmn", "Latn", } m["gow"] = { "Gorwaa", 3437626, "cus-sou", "Latn", } m["gox"] = { "Gobu", 7194986, "bad-cnt", } m["goy"] = { "Goundo", 317636, 
"alv-kim", } m["goz"] = { "Gozarkhani", 5590235, "xme-ttc", ancestors = "xme-ttc-eas", } m["gpa"] = { "Gupa-Abawa", 3915352, "alv-ngb", "Latn", } m["gpn"] = { "Taiap", 56237, "qfa-dis", -- Papuan; isolate in Glottolog; relationship with Torricelli proposed by Usher "Latn", } m["gqa"] = { "Ga'anda", 56245, "cdc-cbm", "Latn", } m["gqi"] = { "Guiqiong", 3120647, "sit-qia", } m["gqn"] = { -- a variety of 'ter' "Kinikinao", 53386731, "awd", "Latn", } m["gqr"] = { "Gor", 759992, "csu-sar", "Latn", } m["gqu"] = { "Qau", 17284874, "gio", } m["gra"] = { "Rajput Garasia", 21041529, "inc-bhi", "Deva, Gujr", ancestors = "bhb", translit = { Deva = "Deva-translit", Gujr = "Gujr-translit", }, } m["grc"] = { "กรีกโบราณ", 35497, "grk", "Polyt, Cprt", translit = { Cprt = "Cprt-translit", }, override_translit = true, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] standard_chars = { Polyt = "ΑΆἈἉἊἋἌἍἎἏᾈᾉᾊᾋᾌᾍᾎᾏᾸᾹᾺᾼΒΓΔΕΈἘἙἚἛἜἝῈΖΗΉἨἩἪἫἬἭἮἯᾘᾙᾚᾛᾜᾝᾞᾟῊῌΘΙΊΪἸἹἺἻἼἽἾἿῘῙῚΚΛΜΝΞΟΌὈὉὊὋὌὍΠΡῬΡ̓ΣΤΥΎΫὙὛὝὟῨῩῪΦΧΨΩΏὨὩὪὫὬὭὮὯᾨᾩᾪᾫᾬᾭᾮᾯῸῺῼαάἀἁἂἃἄἅἆἇὰᾀᾁᾂᾃᾄᾅᾆᾇᾰᾱᾲᾳᾴᾶᾷβγδεέἐἑἒἓἔἕὲζηήἠἡἢἣἤἥἦἧὴᾐᾑᾒᾓᾔᾕᾖᾗῂῃῄῆῇθιίϊΐἰἱἲἳἴἵἶἷὶῐῑῒῖῗκλμνξοόὀὁὂὃὄὅὸπρῤῥςστυύϋΰὐὑὒὓὔὕὖὗὺῠῡῢῦῧφχψωώὠὡὢὣὤὥὦὧὼᾠᾡᾢᾣᾤᾥᾦᾧῲῳῴῶῷ·ͺ΄΅᾽᾿῀῁῍῎῏῝῞῟῭`´῾", Cprt = "𐠀𐠁𐠂𐠃𐠄𐠅𐠈𐠊𐠋𐠌𐠍𐠎𐠏𐠐𐠑𐠒𐠓𐠔𐠕𐠖𐠗𐠘𐠙𐠚𐠛𐠜𐠝𐠞𐠟𐠠𐠡𐠢𐠣𐠤𐠥𐠦𐠧𐠨𐠩𐠪𐠫𐠬𐠭𐠮𐠯𐠰𐠱𐠲𐠳𐠴𐠵𐠷𐠸𐠼𐠿", c.punc }, } m["grd"] = { "Guruntum", 3441272, "cdc-wst", "Latn", } m["grg"] = { "Madi", 6727664, "ngf-fin", "Latn", } m["grh"] = { "Gbiri-Niragu", 3913936, "nic-kau", "Latn", } m["gri"] = { "Ghari", 3104782, "poz-sls", "Latn", } m["grj"] = { "Southern Grebo", 3914444, "grb", "Latn", } m["grm"] = { "Kota Marudu Talantang", 6433808, "poz-san", "Latn", } m["gro"] = { "Groma", 56551, "sit-tib", } m["grq"] = { "Gorovu", 56355, "paa-ram", "Latn", } m["grs"] = { "Gresi", 5607612, "paa-nim", "Latn", } m["grt"] = { "กาโร", 36137, "tbq-bdg", "Latn, Beng, Brai", } m["gru"] = { "Kistane", 13273, "sem-eth", "Latn, Ethi", } m["grv"] = { "Central Grebo", 18385114, "grb", "Latn", } m["grw"] = { "Gweda", 5623387, 
"poz-ocw", "Latn", } m["grx"] = { "Guriaso", 12631954, "qfa-unc", -- no consensus; may be Kwomtari per Baron (1983) and Usher (2020), but no connections accepted by Glottolog. "Latn", } m["gry"] = { "Barclayville Grebo", 11157342, "grb", "Latn", } m["grz"] = { "Guramalum", 3120935, "poz-ocw", "Latn", } m["gse"] = { "Ghanaian Sign Language", 35289, "sgn", "Latn", -- when documented } m["gsg"] = { "German Sign Language", 33282, "sgn-gsl", "Sgnw", } m["gsl"] = { "Gusilay", 35439, "alv-jol", "Latn", } m["gsm"] = { "Guatemalan Sign Language", 2886781, "sgn", "Latn", -- when documented } m["gsn"] = { "Gusan", 11732224, "ngf-fin", "Latn", } m["gso"] = { "Southwest Gbaya", 4919322, "gba-sou", "Latn", } m["gsp"] = { "Wasembo", 7971402, "ngf-mad", -- placed under Rai Coast by Glottolog (under Greater Yaganon) and Pawley-Hammarström "Latn", } m["gss"] = { "Greek Sign Language", 3565084, "sgn", } m["gsw"] = { "เยอรมันแบบอลามันเนีย", 131339, "gmw-hgm", "Latn", wikimedia_codes = "als", ancestors = "gmh", } m["gta"] = { "Guató", 3027940, "qfa-dis", -- isolate or Macro-Jê "Latn", } m["gtu"] = { "Aghu Tharrnggala", 16825981, "aus-pmn", "Latn", } m["gua"] = { "Shiki", 3913946, "nic-jrn", "Latn", } m["gub"] = { "Guajajára", 7699720, "tup-gua", "Latn", } m["guc"] = { "Wayuu", 891085, "awd-taa", "Latn", } m["gud"] = { "Yocoboué Dida", 21074781, "kro-did", "Latn", } m["gue"] = { "Gurindji", 10511016, "aus-pam", "Latn", } m["guf"] = { "Gupapuyngu", 10511004, "aus-yol", "Latn", } m["gug"] = { "กัวรานีแบบปารากวัย", 17478066, "gn", "Latn", wikimedia_codes = "gn", ancestors = "gn-cls", } m["guh"] = { "Guahibo", 2669193, "sai-guh", "Latn", } m["gui"] = { "Eastern Bolivian Guarani", 2963912, "gn", "Latn", } m["guk"] = { "Gumuz", 2396970, "ssa", "Latn, Ethi", } m["gul"] = { "Gullah", 33395, "crp", "Latn", ancestors = "en", } m["gum"] = { "Guambiano", 2744745, "sai-bar", "Latn", } m["gun"] = { "Mbya Guarani", 3915584, "gn", "Latn", } m["guo"] = { "Guayabero", 2980375, "sai-guh", "Latn", }
m["gup"] = { "Gunwinggu", 1406574, "aus-gun", "Latn", } m["guq"] = { "Aché", 383701, "tup", "Latn", } m["gur"] = { "Farefare", 35331, "nic-mre", "Latn", } m["gus"] = { "Guinean Sign Language", 15983937, "sgn", "Latn", -- when documented } m["gut"] = { "Maléku Jaíka", 3915782, "cba", "Latn", } m["guu"] = { "Yanomamö", 8048928, "sai-ynm", "Latn", } m["guv"] = { "Gey", 11137816, "alv-sav", "Latn", } m["guw"] = { "Gun", 3111668, "alv-gbe", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron}, } m["gux"] = { "Gourmanchéma", 35474, "nic-grm", "Latn", } m["guz"] = { "Gusii", 33603, "bnt-lok", "Latn", } m["gva"] = { "Kaskihá", 3033534, "sai-mas", "Latn", } m["gvc"] = { "Guanano", 3566001, "sai-tuc", "Latn", } m["gve"] = { "Duwet", 5317647, "poz-ocw", "Latn", } m["gvf"] = { "Golin", 3110291, "ngf-chw", "Latn", } m["gvj"] = { "Guajá", 3915506, "tup", "Latn", } m["gvl"] = { "Gulay", 641737, "csu-sar", "Latn", } m["gvm"] = { "Gurmana", 3913363, "nic-shi", "Latn", } m["gvn"] = { "Kuku-Yalanji", 5621973, "aus-pam", "Latn", } m["gvo"] = { "Gavião do Jiparaná", 5528335, "tup", "Latn", } m["gvp"] = { "Pará Gavião", 3365443, "sai-nje", "Latn", } m["gvr"] = { "Western Gurung", 2392342, "sit-tam", "Gukh, Deva", translit = { Deva = "Deva-translit", }, } m["gvs"] = { "Gumawana", 5618041, "poz-ocw", "Latn", } m["gvy"] = { "Guyani", 10511230, "aus-pam", "Latn", } m["gwa"] = { "Mbato", 3914941, "alv-ptn", "Latn", } m["gwb"] = { "Gwa", 5623219, "nic-jrn", "Latn", } m["gwc"] = { "Kalami", 1675961, "inc-koh", "Arab", strip_diacritics = { ["Arab"] = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef .. 
u(0x065e) }, }, } m["gwd"] = { "Gawwada", 3032135, "cus-eas", "Latn, Ethi", } m["gwe"] = { "Gweno", 3358211, "bnt-chg", "Latn", } m["gwf"] = { "Gowro", 3812403, "inc-koh", "Arab", } m["gwg"] = { "Moo", 6907057, "alv-bwj", "Latn", } m["gwi"] = { "Gwich'in", 21057, "ath-nor", "Latn", } m["gwj"] = { "Gcwi", 12631978, "khi-kal", "Latn", } m["gwm"] = { "Awngthim", 4830109, "aus-pmn", "Latn", } m["gwn"] = { "Gwandara", 56521, "cdc-wst", "Latn", } m["gwr"] = { "Gwere", 5623559, "bnt-nyg", "Latn", } m["gwt"] = { "Gawar-Bati", 33894, "inc-kun", "Arab", } m["gwu"] = { "Guwamu", 10511225, "aus-pam", "Latn", } m["gww"] = { "Kwini", 10551249, "aus-wor", "Latn", } m["gwx"] = { "Gua", 35422, "alv-gng", "Latn", } m["gxx"] = { "Wè Southern", 19921582, "kro-wee", "Latn", } m["gya"] = { "Northwest Gbaya", 36594, "gba-wes", "Latn", } m["gyb"] = { "Garus", 5524492, "ngf-han", "Latn", } m["gyd"] = { "Kayardild", 3913770, "aus-tnk", "Latn", } m["gye"] = { "Gyem", 5624046, "nic-jer", "Latn", } m["gyf"] = { "Gungabula", 10510783, "aus-pam", "Latn", } m["gyg"] = { "Gbayi", 11137618, "nic-ngd", "Latn", } m["gyi"] = { "Gyele", 35434, "bnt-mnj", "Latn", } m["gyl"] = { "Gayil", 5528771, "omv-aro", "Latn", } m["gym"] = { "Ngäbere", 3915581, "cba", "Latn", } m["gyn"] = { "Guyanese Creole English", 3305477, "crp", "Latn", ancestors = "en", } m["gyo"] = { "Gyalsumdo", 53575940, "sit-kyk", } m["gyr"] = { "Guarayu", 3118779, "tup-gua", "Latn", } m["gyy"] = { "Gunya", 10511001, "aus-pam", "Latn", } m["gza"] = { "Ganza", 5521556, "omv-mao", "Latn", } m["gzn"] = { "Gane", 3095108, "poz-hce", "Latn", } return require("Module:languages").finalizeData(m, "language") 99nrkp1ua81hq4ilwcpa9k8h969l4hz มอดูล:languages/data/3/e 828 36382 5720756 5684153 2026-04-21T07:00:55Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720756 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). 
local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["ebg"] = { "Ebughu", 35294, "nic-lcr", "Latn", } m["ebk"] = { "Eastern Bontoc", 62664215, "phi", "Latn", } m["ebr"] = { "Ebrié", 36644, "alv-ptn", "Latn", } m["ebu"] = { "Embu", 35318, "bnt-kka", "Latn", } m["ecr"] = { "Eteocretan", 35461, "qfa-unc", "Polyt", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ecs"] = { "Ecuadorian Sign Language", 3436769, "sgn", "Latn", -- when documented } m["ecy"] = { "Eteocypriot", 35309, "qfa-unc", "Cprt", } m["eee"] = { "E", 35386, "qfa-mix", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["efa"] = { "Efai", 3813297, "nic-ief", "Latn", } m["efe"] = { "Efe", 56354, "csu-mle", "Latn", } m["efi"] = { "Efik", 35377, "nic-ief", "Latn", } m["ega"] = { "Ega", 3914927, "alv", "Latn", } m["egl"] = { "เอมีเลีย", 1057898, "roa-emr", "Latn", wikimedia_codes = "eml", } m["ego"] = { "Eggon", 35300, "nic-pls", "Latn", } m["egy"] = { "อียิปต์", 50868, "egx", "Latn, Egyp, Egyh", sort_key = { remove_diacritics = "'%-%s", from = {"ꜣ", "j", "y", "ꜥ", "w", "b", "p", "f", "m", "n", "r", "ḥ", "ḫ", "ẖ", "h", "z", "š", "s", "q", "k", "g", "ṯ", "t", "ḏ", "d", "%."}, to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[13], p[14], p[15], p[12], p[16], p[18], p[17], p[19], p[20], p[21], p[23], p[22], p[25], p[24], p[26]} }, } m["ehu"] = { "Ehueun", 3441392, "alv-nwd", "Latn", } m["eip"] = { "Eipomek", 5349839, "ngf-mek", "Latn", } m["eit"] = { "Eitiep", 5350030, "paa-tor", "Latn", } m["eiv"] = { "Askopan", 56324, "paa-nbo", "Latn", } m["eja"] = { "Ejamat", 6269820, "alv-jfe", "Latn", } m["eka"] = { "Ekajuk", 35250, "nic-eko", "Latn", } m["eke"] = { "Ekit", 3509628, "nic-ief", "Latn", } m["ekg"] = { "Ekari", 5350305, "ngf-pan", "Latn", } m["eki"] = { "Eki", 5350418, "nic-ief", "Latn", } m["ekl"] = { 
"Kolhe", 6426945, "mun", "Latn", } m["ekm"] = { "Elip", 12952414, "nic-ymb", "Latn", } m["eko"] = { "Koti", 29930, "bnt-mak", "Latn", } m["ekp"] = { "Ekpeye", 35254, "alv-igb", "Latn", } m["ekr"] = { "Yace", 36901, "alv-ido", "Latn", } m["eky"] = { "กะยาตะวันออก", 25559417, "kar", "Kali", } m["ele"] = { "Elepi", 5359444, "paa-tor", "Latn", } m["elh"] = { "El Hugeirat", 5351410, "nub-hil", "Latn", } m["eli"] = { "Nding", 36176, "alv-tal", "Latn", } m["elk"] = { "Elkei", 5364210, "paa-tor", "Latn", } m["elm"] = { "Eleme", 3914427, "nic-ogo", "Latn", } m["elo"] = { "El Molo", 56719, "cus-eas", "Latn", } m["elu"] = { "Elu", 3364594, "poz-aay", "Latn", } m["elx"] = { "Elamite", 35470, "qfa-iso", --ancient language of Iran "Xsux", } m["ema"] = { "Emai", 35428, "alv-eeo", "Latn", } m["emb"] = { "Embaloh", 5369424, "poz", "Latn", } m["eme"] = { "Emerillon", 3588942, "tup-gua", "Latn", } m["emg"] = { "Eastern Meohang", 12952840, "sit-kie", "Deva", translit = { Deva = "Deva-translit", }, } m["emi"] = { "Mussau-Emira", 6943093, "poz-stm", "Latn", } m["emk"] = { "Eastern Maninkakan", 11002130, "dmn-mnk", "Latn, Arab, Nkoo", } m["emm"] = { "Mamulique", 3285082, "nai-pak", "Latn", } m["emn"] = { "Eman", 5368975, "nic-tvc", "Latn", } m["emp"] = { "Northern Emberá", 2391297, "sai-chc", "Latn", } m["ems"] = { "อาลูตีก", 27992, "ypk", "Latn", } m["emu"] = { "Eastern Muria", 12952883, "dra-mur", "Deva", translit = { Deva = "Deva-translit", }, } m["emw"] = { "Emplawas", 5374265, "poz-tim", "Latn", } m["emx"] = { "Erromintxela", 1122188, "qfa-mix", "Latn", ancestors = "rom, eu", } m["emy"] = { "Epigraphic Mayan", 301355, "myn", "Latn, Maya", } m["ena"] = { "Apali", 3504201, "ngf-sog", "Latn", } m["enb"] = { "Markweeta", 56874, "sdv-nma", "Latn", } m["enc"] = { "En", 3504110, "qfa-buy", "Latn", } m["end"] = { "Ende", 2067656, "poz-cet", "Latn", } m["enf"] = { "Forest Enets", 30249597, "syd-ene", "Cyrl", } m["enh"] = { "Tundra Enets", 25559411, "syd-ene", "Cyrl", } m["enl"] = { "Enlhet", 
15462671, "sai-mas", "Latn", } m["enm"] = { "อังกฤษกลาง", 36395, "gmw-ang", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dacute .. c.dotbelow .. c.tacute}, sort_key = { remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dacute .. c.dotbelow .. c.tacute, from = {"[ꟓꟕ]", "[æðᵹꟑȝœẜþꟃƿ]"}, to = { { ["ꟓ"] = "þþ", ["ꟕ"] = "ƿƿ", -- finalized by the next substitution }, { ["æ"] = "ae", ["ð"] = "d" .. p[1], ["ᵹ"] = "g", ["ꟑ"] = "g", ["ȝ"] = "g" .. p[1], ["œ"] = "oe", ["ẜ"] = "s", ["þ"] = "t" .. p[1], ["ꟃ"] = "w", ["ƿ"] = "w", }, }, }, standard_chars = { Latn = "AaÆæBbCcDdÐðEeFfGgȜȝHhIiJjKkLlMmNnOoPpQqRrSsTtÞþUuVvWwXxYyZz", c.punc, }, } m["enn"] = { "Engenni", 3915365, "alv-dlt", "Latn", } m["eno"] = { "Enggano", 2669164, "poz", "Latn", } m["enq"] = { "Enga", 1143040, "ngf-eng", "Latn", } m["enr"] = { "Emem", 5370369, "paa-pau", } m["enu"] = { "Enu", 5380858, "tbq-bka", } m["env"] = { "Enwan", 3438334, "alv-yek", "Latn", } m["enw"] = { "Enwang", 11134434, "nic-lcr", "Latn", } m["enx"] = { "Enxet", 15462609, "sai-mas", "Latn", } m["eot"] = { "Eotile", 3915347, "alv-ptn", "Latn", } m["epi"] = { "Epie", 35291, "alv-dlt", "Latn", } m["era"] = { "Eravallan", 5385061, "dra-tam", "Taml", } m["erg"] = { "Sie", 426254, "poz-vns", "Latn", } m["erh"] = { "Eruwa", 3441244, "alv-swd", "Latn", } m["eri"] = { "Ogea", 7079984, "ngf-nur", "Latn", } m["erk"] = { "South Efate", 3449070, "poz-vnc", "Latn", } m["err"] = { "Erre", 10488401, "qfa-iso", -- Evans (1997) put it in an Arnhem Land family "Latn", } m["ers"] = { "Ersu", 12952417, "sit-ers", "Latn", -- also Ersu Shaba } m["ert"] = { "Eritai", 56376, "paa-lkp", "Latn", } m["erw"] = { "Erokwanas", 5395296, "poz-hce", "Latn", } m["ese"] = { "Ese Ejja", 2980381, "sai-tac", "Latn", } m["esh"] = { "เอชเตฮาร์ด", 12952418, "xme-ttc", "fa-Arab, Latn", ancestors = "xme-ttc-sou", } -- "esi" and "esk" moved to etymology-only per [[WT:LT]] 
and [[Wiktionary:Beer_parlour/2023/August#Issues_regarding_the_Inuit_languages]] m["esl"] = { "Egyptian Sign Language", 5348443, "sgn", } m["esm"] = { "Esuma", 16927555, "alv-kwa", "Latn", } m["esn"] = { "Salvadoran Sign Language", 7406492, "sgn", "Latn", -- when documented } m["eso"] = { "Estonian Sign Language", 3196221, "sgn", "Latn", -- when documented } m["esq"] = { "Esselen", 1294243, "qfa-dis", -- isolate or Hokan "Latn", } m["ess"] = { "Central Siberian Yupik", 27993, "ypk", "Cyrl, Latn", } m["esu"] = { "ยุปปิก", 21117, "ypk", "Latn", } m["esy"] = { "Eskayan", 867086, "art", "Latn", -- also its own native script } m["etb"] = { "Etebi", 11002851, "nic-ief", "Latn", } m["etc"] = { "Etchemin", 5402493, "alg-eas", "Latn", } m["eth"] = { "Ethiopian Sign Language", 3501903, "sgn", } m["etn"] = { "Eton (Vanuatu)", 3059362, "poz-vnc", "Latn", } m["eto"] = { "Eton (Cameroon)", 35317, "bnt-btb", "Latn", } m["etr"] = { "Edolo", 5340184, "ngf-bos", "Latn", } m["ets"] = { "Yekhee", 3915848, "alv-yek", "Latn", } m["ett"] = { "อีทรัสคัน", 35726, "qfa-tyn", "Ital", -- Ital translit in [[Module:scripts/data]] } m["etu"] = { "Ejagham", 35296, "nic-eko", "Latn", } m["etx"] = { "Eten", 3915392, "nic-beo", "Latn", } m["etz"] = { "Semimi", 10950308, "paa-mai", "Latn", } m["eve"] = { "เอเว็น", 29960, "tuw-ewe", "Cyrl, Latn", translit = {Cyrl = "eve-translit"}, strip_diacritics = {remove_diacritics = c.macron .. c.dotabove .. c.dotbelow}, sort_key = { Cyrl = { from = { "ӫ", -- 2 chars "ё", "ӈ", "ө" -- 1 char }, to = { "о" .. p[2], "е" .. p[1], "н" .. p[1], "о" .. p[1] }, }, }, } m["evh"] = { "Uvbie", 3441344, "alv-swd", "Latn", } m["evn"] = { "เอเวนค์", 30004, "tuw-ewe", "Cyrl", translit = "evn-translit", strip_diacritics = {remove_diacritics = c.macron .. c.dotabove .. c.dotbelow}, sort_key = { from = {"ё", "ӈ"}, to = {"е" .. p[1], "н" .. 
p[1]} }, } m["ewo"] = { "Ewondo", 35459, "bnt-btb", "Latn", } m["ext"] = { "เอซเตรมาดูรา", 30007, "roa-asl", "Latn", } m["eya"] = { "Eyak", 27480, "xnd", "Latn", } m["eyo"] = { "Keiyo", 56856, "sdv-nma", "Latn", } m["eza"] = { "Ezaa", 11921436, "alv-igb", "Latn", ancestors = "izi", } m["eze"] = { "Uzekwe", 3502244, "nic-ucn", "Latn", } return require("Module:languages").finalizeData(m, "language") 5u6epxcieeit1v0aothwfcow04mnqtd มอดูล:languages/data/3/d 828 36383 5720755 5684152 2026-04-21T07:00:53Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720755 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["daa"] = { "Dangaléat", 942591, "cdc-est", "Latn", } m["dac"] = { "Dambi", 12629491, "poz-ocw", "Latn", } m["dad"] = { "Marik", 6763404, "poz-ocw", "Latn", } m["dae"] = { "Duupa", 35263, "alv-dur", "Latn", } m["dag"] = { "Dagbani", 32238, "nic-dag", "Latn", } m["dah"] = { "Gwahatike", 5623246, "ngf-fin", "Latn", } m["dai"] = { "Day", 35163, "alv-mbd", "Latn", } m["daj"] = { "Dar Fur Daju", 56370, "sdv-daj", "Latn", } m["dak"] = { "ดาโคตา", 530384, "sio-dkt", "Latn", } m["dal"] = { "Dahalo", 35143, "cus", "Latn", } m["dam"] = { "Damakawa", 1158134, "nic-knn", "Latn", } m["dao"] = { "Daai Chin", 860029, "tbq-kuk", "Latn", } m["daq"] = { "Dandami Maria", 12952805, "dra-mdy", "Deva", } m["dar"] = { "Dargwa", 32332, "cau-drg", "Cyrl, Latn, Arab", translit = {Cyrl = "dar-translit"}, override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = { "къкъ", "хьхь", -- 4 chars "гъ", "гь", "гӏ", "ё", "къ", "кь", "кӏ", "пп", "пӏ", "сс", 
"тт", "тӏ", "хх", "хъ", "хь", "хӏ", "цц", "цӏ", "чч", "чӏ" -- 2 chars }, to = { "к" .. p[2], "х" .. p[4], "г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[3], "к" .. p[4], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[1], "т" .. p[2], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[5], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2] } }, }, } m["das"] = { "Daho-Doo", 3915369, "kro-wee", "Latn", } m["dau"] = { "Dar Sila Daju", 7514020, "sdv-daj", "Latn", } m["dav"] = { "Taita", 2387274, "bnt-cht", "Latn", } m["daw"] = { "Davawenyo", 5228174, "phi", "Latn", } m["dax"] = { "Dayi", 10467281, "aus-yol", "Latn", } m["daz"] = { "Dao", 5221513, "ngf-pan", "Latn", } m["dba"] = { "Bangime", 1982696, "qfa-iso", -- southern Mali "Latn", } m["dbb"] = { "Deno", 56275, "cdc-wst", "Latn", } m["dbd"] = { "Dadiya", 3914436, "alv-wjk", "Latn", } m["dbe"] = { "Dabe", 5207451, "paa-tkw", "Latn", } m["dbf"] = { "Edopi", 12953516, "paa-lkp", "Latn", } m["dbg"] = { "Dogul Dom", 3912880, "nic-npd", "Latn", } m["dbi"] = { "Doka", 3913293, "nic-plc", "Latn", } m["dbj"] = { "อีดาอัน", 3041552, "poz-san", "Latn", } m["dbl"] = { "Dyirbal", 35465, "aus-dyb", "Latn", } m["dbm"] = { "Duguri", 7194057, "nic-jrw", "Latn", } m["dbn"] = { "Duriankere", 5316627, "ngf-sbh", "Latn", } m["dbo"] = { "Dulbu", 5313310, "nic-jrn", "Latn", } m["dbp"] = { "Duwai", 56301, "cdc-wst", "Latn", } m["dbq"] = { "Daba", 3913342, "cdc-cbm", "Latn", } m["dbr"] = { "Dabarre", 3447286, "cus-som", } m["dbt"] = { "Ben Tey", 4886561, "nic-nwa", "Latn", } m["dbu"] = { "Bondum Dom Dogon", 3912758, "nic-npd", "Latn", } m["dbv"] = { "Dungu", 5315230, "nic-kau", "Latn", } m["dbw"] = { "Bankan Tey Dogon", 4856243, "nic-nwa", "Latn", } m["dby"] = { "Dibiyaso", 5272268, "qfa-dis", -- Papuan; isolate per Glottolog, unclassified per Pawley and Hammarström (2018), sometimes classified with Bosavi languages "Latn", } m["dcc"] = { "Deccani", 669431, "inc-hnd", "ur-Arab", ancestors = "ur", } m["dcr"] = { 
"เนเกอร์ฮอลันดส์", 1815830, "crp", "Latn", ancestors = "nl", } m["dda"] = { "Dadi Dadi", 50207890, "aus-pam", "Latn", } m["ddd"] = { "Dongotono", 56676, "sdv-lma", "Latn", } m["dde"] = { "Doondo", 11003401, "bnt-kng", "Latn", } m["ddg"] = { "Fataluku", 35353, "paa-tap", "Latn", } m["ddi"] = { "Diodio", 3028668, "poz-ocw", "Latn", } m["ddj"] = { "Jaru", 3162806, "aus-pam", "Latn", } m["ddn"] = { "Dendi", 35164, "son", "Latn", } m["ddo"] = { "Tsez", 34033, "cau-wts", "Cyrl", translit = "ddo-translit", display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["ddr"] = { "Dhudhuroa", 5269842, "aus-pam", "Latn", } m["dds"] = { "Donno So Dogon", 1234776, "nic-dge", "Latn", } m["ddw"] = { "Dawera-Daweloor", 5242304, "poz-tim", "Latn", } m["dec"] = { "Dagik", 35125, "alv-tal", "Latn", } m["ded"] = { "Dedua", 5249850, "ngf-huo", "Latn", } m["dee"] = { "Dewoin", 3914892, "kro-wkr", "Latn", } m["def"] = { "Dezfuli", 4115412, "ira-swi", "Arab", } m["deg"] = { "Degema", 35182, "alv-dlt", "Latn", } m["deh"] = { "Dehwari", 5704314, "ira-swi", "fa-Arab", ancestors = "fa", } m["dei"] = { "Demisa", 56380, "paa-egb", "Latn", } -- "dek" is no longer an ISO code; spurious m["dem"] = { "Dem", 5254989, "qfa-dis", -- Papuan; isolate in Glottolog; unclassified in Palmer (2018); grouped with Amung by Usher, ultimately in TNG "Latn", } m["dep"] = { "Pidgin Delaware", 1183938, "crp", "Latn", ancestors = "unm", } -- deq is not included, see [[WT:LT]] m["der"] = { "Deori", 56478, "tbq-bdg", "Beng, Latn", } m["des"] = { "Desano", 962392, "sai-tuc", "Latn", } m["dev"] = { "Domung", 5291378, "ngf-fin", "Latn", } m["dez"] = { "Dengese", 2909984, "bnt-tet", "Latn", } m["dga"] = { "Southern Dagaare", 35159, "nic-mre", "Latn", } m["dgb"] = { "Bunoge", 4985178, "nic-dgw", "Latn", } m["dgc"] = { "Casiguran Dumagat Agta", 5313599, "phi", "Latn", } m["dgd"] = { "Dagaari Dioula", 11153465, "nic-mre", "Latn", } m["dge"] = { "Degenan", 5251770, 
"ngf-fin", "Latn", } m["dgg"] = { "Doga", 3033726, "poz-ocw", "Latn", } m["dgh"] = { "Dghwede", 56293, "cdc-cbm", "Latn", } m["dgi"] = { "Northern Dagara", 11004218, "nic-mre", "Latn", } m["dgk"] = { "Dagba", 12952357, "csu-sar", "Latn", } m["dgn"] = { "Dagoman", 10465931, "aus-yng", "Latn", } m["dgo"] = { "Hindi Dogri", nil, "him", "Deva, Arab, Takr", ancestors = "doi", translit = { Deva = "Deva-translit", }, } m["dgr"] = { "Dogrib", 20979, "ath-nor", "Latn", } m["dgs"] = { "Dogoso", 35343, "nic-gur", } m["dgt"] = { "Ntra'ngith", 6983809, "aus-pam", "Latn", } -- dgu is not a language; see [[w:Dhekaru]] m["dgw"] = { "Daungwurrung", 5228050, "aus-pam", "Latn", } m["dgx"] = { "Doghoro", 12952392, "paa-bin", "Latn", } m["dgz"] = { "Daga", 5208442, "ngf-dag", "Latn", } m["dhg"] = { "Dhangu", 5268960, "aus-yol", "Latn", } m["dhd"] = { "Dhundhari", 633359, "raj", "Deva", translit = { Deva = "Deva-translit", }, } m["dhi"] = { "Dhimal", 35229, "sit-dhi", "Deva", translit = { Deva = "Deva-translit", }, } m["dhl"] = { "Dhalandji", 5268787, "aus-psw", "Latn", } m["dhm"] = { "Zemba", 3502283, "bnt-swb", "Latn", ancestors = "hz", } m["dhn"] = { "Dhanki", 5268992, "inc-bhi", } m["dho"] = { "Dhodia", 5269658, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["dhr"] = { "Tharrgari", 10470289, "aus-psw", "Latn", } m["dhs"] = { "Dhaiso", 11001788, "bnt-kka", "Latn", } m["dhu"] = { "Dhurga", 1285318, "aus-yuk", "Latn", } m["dhv"] = { "Drehu", 3039319, "poz-cln", "Latn", } m["dhw"] = { "Danuwar", 3522797, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["dhx"] = { "Dhungaloo", 16960599, "aus-pam", "Latn", } m["dia"] = { "Dia", 3446591, "paa-tor", "Latn", } m["dib"] = { "South Central Dinka", 35154, "sdv-dnu", "Latn", ancestors = "din", } m["dic"] = { "Lakota Dida", 11001730, "kro-did", "Latn", } m["did"] = { "Didinga", 56365, "sdv", "Latn", } m["dif"] = { "Dieri", 25559563, "aus-kar", "Latn", } m["dig"] = { "Digo", 3362072, "bnt-mij", "Latn", } -- "dih" is 
split into nai-ipa, nai-kum, nai-tip, see [[WT:LT]] m["dii"] = { "Dimbong", 35196, "bnt-baf", "Latn", } m["dij"] = { "Dai", 5209056, "poz-tim", "Latn", } m["dik"] = { "Southwestern Dinka", 36540, "sdv-dnu", "Latn", ancestors = "din", } m["dil"] = { "Dilling", 35152, "nub-hil", "Latn", } m["dim"] = { "Dime", 35311, "omv-aro", } m["din"] = { "Dinka", 56466, "sdv-dnu", "Latn", } m["dio"] = { "Dibo", 3914891, "alv-ngb", "Latn", } m["dip"] = { "Northeastern Dinka", 36246, "sdv-dnu", "Latn", ancestors = "din", } m["dir"] = { "Dirim", 11130804, "nic-dak", "Latn", } m["dis"] = { "Dimasa", 56664, "tbq-bdg", "Latn, Beng", } m["diu"] = { "Gciriku", 3780954, "bnt-kav", "Latn", } m["diw"] = { "Northwestern Dinka", 36249, "sdv-dnu", "Latn", ancestors = "din", } m["dix"] = { "Dixon Reef", 5284967, "poz-vnc", "Latn", } m["diy"] = { "Diuwe", 5283765, "ngf-ask", "Latn", } m["diz"] = { "Ding", 35202, "bnt-bdz", "Latn", } m["dja"] = { "Djadjawurrung", 5285190, "aus-pam", "Latn", } m["djb"] = { "Djinba", 5285351, "aus-yol", "Latn", } m["djc"] = { "Dar Daju Daju", 5209890, "sdv-daj", "Latn", } m["djd"] = { "Jaminjung", 6147825, "aus-mir", "Latn", } m["dje"] = { "Zarma", 36990, "son", "Latn, Arab, Brai", } m["djf"] = { "Djangun", 10474818, "aus-pmn", "Latn", } m["dji"] = { "Djinang", 5285350, "aus-yol", "Latn", } m["djj"] = { "Ndjébbana", 5285274, "aus-arn", "Latn", } m["djk"] = { "Aukan", 2659044, "crp", "Latn, Afak", ancestors = "en", } m["djl"] = { "Djiwarli", 2669569, "aus-psw", "Latn", } m["djm"] = { "Jamsay", 3913290, "nic-pld", "Latn", } m["djn"] = { "Djauan", 13553748, "aus-gun", "Latn", } m["djo"] = { "Jangkang", 12952388, "day", } m["djr"] = { "Djambarrpuyngu", 3915679, "aus-yol", "Latn", } m["dju"] = { "Kapriman", 6367199, "paa-spk", "Latn", } m["djw"] = { "Djawi", 3913844, "aus-nyu", "Latn", ancestors = "bcj", } m["dka"] = { "Dakpa", 3695189, "sit-ebo", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } 
m["dkk"] = { "Dakka", 5209962, "poz-ssw", } m["dkr"] = { "Kuijau", 13580777, "poz-bnn", } m["dks"] = { "Southeastern Dinka", 36538, "sdv-dnu", "Latn", ancestors = "din", } m["dkx"] = { "Mazagway", 6798209, "cdc-cbm", "Latn", } m["dlg"] = { "Dolgan", 32878, "trk-nsb", "Cyrl", sort_key = { from = {"ё", "һ", "ӈ", "ө", "ү"}, to = {"е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1]} }, } m["dlk"] = { "Dahalik", 32260, "sem-eth", "Ethi", translit = "Ethi-translit", } m["dlm"] = { "แดลเมเชีย", 35527, "roa-dal", "Latn", } m["dln"] = { "Darlong", 5224029, "tbq-kuk", "Latn", } m["dma"] = { "Duma", 35319, "bnt-nze", "Latn", } m["dmb"] = { "Mombo Dogon", 6897074, "nic-dgw", "Latn", } m["dmc"] = { "Gavak", 5277406, "ngf-nad", "Latn", } m["dmd"] = { "Madhi Madhi", 6727353, "aus-pam", "Latn", } m["dme"] = { "Dugwor", 56313, "cdc-cbm", "Latn", } m["dmf"] = { "Medefaidrin", 1519764, "art", "Medf", type = "appendix-constructed", } m["dmg"] = { "กีนาบาตางันตอนบน", 16109975, "poz-san", "Latn", } m["dmk"] = { "Domaaki", 32900, "inc-wes", "Arab", } m["dml"] = { "Dameli", 32288, "inc-kun", } m["dmm"] = { "Dama (Nigeria)", 5211865, "alv-mbm", "Latn", } m["dmo"] = { "Kemezung", 35562, "nic-bbe", "Latn", } m["dmr"] = { "East Damar", 5328200, "poz-cet", "Latn", } m["dms"] = { "Dampelas", 5212928, "poz-tot", "Latn", } m["dmu"] = { "Dubu", 7692059, "paa-pau", "Latn", } m["dmv"] = { "Dumpas", 12953512, "poz-san", "Latn", } m["dmw"] = { "Mudburra", 6931573, "aus-pam", "Latn", } m["dmx"] = { "Dema", 3553423, "bnt-sho", "Latn", } m["dmy"] = { "Demta", 14466283, "paa-sen", "Latn", } m["dna"] = { "Upper Grand Valley Dani", 12952361, "ngf-dan", "Latn", } m["dnd"] = { "Daonda", 5221528, "paa-brd", "Latn", } m["dne"] = { "Ndendeule", 6983725, "bnt-mbi", "Latn", } m["dng"] = { "ดุงกาน", 33050, "zhx-man", "Cyrl, Hants, Arab", generate_forms = "zh-generateforms", translit = {Cyrl = "dng-translit"}, sort_key = { Cyrl = { from = {"ё", "ә", "җ", "ң", "ў", "ү"}, to = {"е" .. p[1], "е" .. 
p[2], "ж" .. p[1], "н" .. p[1], "у" .. p[1], "у" .. p[2]} }, Hani = "Hani-sortkey", }, } m["dni"] = { "Lower Grand Valley Dani", 12635807, "ngf-dan", "Latn", } m["dnj"] = { "Dan", 1158971, "dmn-mda", "Latn", } m["dnk"] = { "Dengka", 5256954, "poz-tim", "Latn", } m["dnn"] = { "Dzuun", 10973260, "dmn-smg", "Latn", } m["dno"] = { "Ndrulo", 60785094, "csu-lnd", } m["dnr"] = { "Danaru", 5214932, "ngf-pek", "Latn", } m["dnt"] = { "Mid Grand Valley Dani", 12952359, "ngf-dan", "Latn", } m["dnu"] = { "Danau", 5013745, "mkh-pal", "Mymr", } m["dnv"] = { "Danu", 5221251, "tbq-brm", "Mymr", ancestors = "obr", } m["dnw"] = { "Western Dani", 7987774, "ngf-dan", "Latn", } m["dny"] = { "Dení", 56562, "auf", "Latn", } m["doa"] = { "Dom", 5289770, "ngf-chw", "Latn", } m["dob"] = { "Dobu", 952133, "poz-ocw", "Latn", } m["doc"] = { "ต้งเหนือ", 17195499, "qfa-tak", "Latn", } m["doe"] = { "Doe", 5288055, "bnt-ruv", "Latn", } m["dof"] = { "Domu", 5291375, "paa-mal", "Latn", } m["doh"] = { "Dong", 3438405, "nic-dak", "Latn", } m["doi"] = { "Dogri", 32730, "him", "Deva, Takr, fa-Arab, Dogr", translit = { Deva = "Deva-translit", Dogr = "Dogr-translit", }, } m["dok"] = { "Dondo", 5295571, "poz-tot", "Latn", } m["dol"] = { "Doso", 4167202, "paa-dot", "Latn", } m["don"] = { "Doura", 7829037, "poz-ocw", "Latn", } m["doo"] = { "Dongo", 35303, "nic-mbc", "Latn", } m["dop"] = { "Lukpa", 3258739, "nic-gne", "Latn", } m["doq"] = { "Dominican Sign Language", 5290820, "sgn", "Latn", -- when documented } m["dor"] = { "Dori'o", 3037084, "poz-sls", "Latn", } m["dos"] = { "Dogosé", 3913314, "nic-gur", "Latn", } m["dot"] = { "Dass", 3441293, "cdc-wst", "Latn", } m["dov"] = { "Toka-Leya", 11001779, "bnt-bot", "Latn", ancestors = "toi", } m["dow"] = { "Doyayo", 35299, "alv-dur", "Latn", } m["dox"] = { "Bussa", 35123, "cus-eas", "Latn", } m["doy"] = { "Dompo", 35270, "alv-gng", "Latn", } m["doz"] = { "Dorze", 56336, "omv-nom", "Latn", } m["dpp"] = { "Papar", 7132487, "poz-san", "Latn", } m["drb"] = { "Dair", 
12952360, "nub-hil", "Latn", } m["drc"] = { "Minderico", 6863806, "roa-gap", "Latn", ancestors = "pt", } m["drd"] = { "Darmiya", 5224058, "sit-alm", } m["drg"] = { "Rungus", 6897407, "poz-san", "Latn", } m["dri"] = { "Lela", 3914004, "nic-knn", "Latn", } m["drl"] = { "Baagandji", 5223941, "aus-pam", "Latn", } m["drn"] = { "West Damar", 3450459, "poz-tim", "Latn", } m["dro"] = { "Daro-Matu Melanau", 5224156, "poz-bnn", "Latn", } m["drq"] = { "Dura", 3449842, "sit-gma", "Deva", } m["drs"] = { "Gedeo", 56622, "cus-hec", "Ethi", } m["dru"] = { "Rukai", 49232, "map", "Latn", ancestors = "dru-pro", } m["dry"] = { "Darai", 46995026, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["dsb"] = { "ซอร์บตอนล่าง", 13286, "wen", "Latn", sort_key = s["wen-sortkey"], standard_chars = "AaBbCcČčĆćDdEeĚěFfGgHhIiJjKkŁłLlMmNnŃńOoÓóPpRrŔŕSsŠšŚśTtUuWwYyZzŽžŹź" .. c.punc, } m["dse"] = { "Dutch Sign Language", 2201099, "sgn", "Latn", -- when documented } m["dsh"] = { "Daasanach", 56637, "cus-eas", "Latn", } m["dsi"] = { "Disa", 3914455, "csu-bgr", "Latn", } m["dsl"] = { "Danish Sign Language", 2605298, "sgn", "Latn", -- when documented } m["dsn"] = { "Dusner", 5316948, "poz-hce", "Latn", } m["dso"] = { "Desiya", 12629755, "inc-eas", "Orya", ancestors = "or", } m["dsq"] = { "Tadaksahak", 36568, "son", "Arab, Latn", } m["dta"] = { "Daur", 32430, "xgn", "Latn, Hani, Cyrl, Mong", ancestors = "xng", -- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]] sort_key = {Hani = "Hani-sortkey"}, } m["dtb"] = { "Labuk-Kinabatangan Kadazan", 5330240, "poz-san", "Latn", } m["dtd"] = { "Ditidaht", 13728042, "wak", "Latn", } m["dth"] = { -- contrast 'rrt' "Adithinngithigh", 4683034, "aus-pmn", "Latn", } m["dti"] = { "Ana Tinga Dogon", 4750346, "qfa-dgn", "Latn", } m["dtk"] = { "Tene Kan Dogon", 11018863, "nic-pld", "Latn", } m["dtm"] = { "Tomo Kan Dogon", 11137719, "nic-pld", "Latn", } m["dto"] = { "Tommo So", 47012992, "nic-dge", "Latn", } m["dtp"] = { "ดูซุนตอนกลาง", 
5317225, "poz-san", "Latn", } m["dtr"] = { "Lotud", 6685078, "poz-san", "Latn", } m["dts"] = { "Toro So Dogon", 11003311, "nic-dge", "Latn", } m["dtt"] = { "Toro Tegu Dogon", 3913924, "nic-pld", "Latn", } m["dtu"] = { "Tebul Ure Dogon", 7692089, "qfa-dgn", "Latn", } m["dty"] = { "Doteli", 18415595, "inc-pah", "Deva", translit = { Deva = "Deva-translit", }, } m["dua"] = { "Duala", 33013, "bnt-saw", "Latn", } m["dub"] = { "Dubli", 5310792, "inc-bhi", } m["duc"] = { "Duna", 5314039, "qfa-dis", -- Papuan; isolate in Glottolog; tentatively grouped with Bogaya into a Duna-Pogaya [sic] family, -- ultimately in TNG "Latn", } m["due"] = { "Umiray Dumaget Agta", 7881585, "phi", "Latn", } m["duf"] = { "Dumbea", 6983819, "poz-cln", "Latn", } m["dug"] = { "Chiduruma", 35614, "bnt-mij", "Latn", } m["duh"] = { "Dungra Bhil", 12953513, "inc-bhi", "Deva, Gujr", translit = { Deva = "Deva-translit", Gujr = "Gujr-translit", }, } m["dui"] = { "Dumun", 5314004, "ngf-yag", "Latn", } m["duk"] = { "Uyajitaya", 7904085, "ngf-nur", "Latn", } m["dul"] = { "Alabat Island Agta", 3399709, "phi", "Latn", } m["dum"] = { "ดัตช์กลาง", 178806, "gmw-frk", "Latn", ancestors = "odt", strip_diacritics = {remove_diacritics = c.circ .. c.macron .. 
c.diaer}, } m["dun"] = { "Dusun Deyah", 2784033, "poz-bre", "Latn", } m["duo"] = { "Dupaningan Agta", 5315912, "phi", "Latn", } m["dup"] = { "Duano", 3040468, "poz-mly", "Latn", } m["duq"] = { "Dusun Malang", 3041711, "poz-bre", "Latn", } m["dur"] = { "Dii", 35208, "alv-dur", "Latn", } m["dus"] = { "Dumi", 56315, "sit-kiw", "Deva", translit = { Deva = "Deva-translit", }, } m["duu"] = { "Drung", 56406, "sit-nng", "Latn", } m["duv"] = { "Duvle", 56364, "paa-lkp", "Latn", } m["duw"] = { "Dusun Witu", 2381310, "poz-bre", "Latn", } m["dux"] = { "Duun", 3914880, "dmn-smg", "Latn", } m["duy"] = { "Dicamay Agta", 5272321, "phi", "Latn", } m["duz"] = { "Duli", 5313405, "alv-ada", "Latn", } m["dva"] = { "Duau", 5310448, "poz-ocw", "Latn", } m["dwa"] = { "Diri", 56286, "cdc-wst", "Latn", } m["dwr"] = { "เดาโร", 12629647, "omv-nom", "Ethi, Latn", } m["dwu"] = { "Dhuwal", 3120791, "aus-yol", "Latn", } m["dww"] = { "Dawawa", 5242286, "poz-ocw", "Latn", } m["dwy"] = { "Dhuwaya", 63348560, "aus-yol", "Latn", } m["dwz"] = { "Dewas Rai", 62663667, "inc-bhi", } m["dya"] = { "Dyan", 35340, "nic-gur", "Latn", } m["dyb"] = { "Dyaberdyaber", 5285185, "aus-nyu", "Latn", } m["dyd"] = { "Dyugun", 3913785, "aus-nyu", "Latn", } m["dyg"] = { "Villa Viciosa Agta", 12626611, "phi", "Latn", } m["dyi"] = { "Djimini", 35336, "alv-tdj", "Latn", } m["dym"] = { "Yanda Dogon", 8048316, "qfa-dgn", "Latn", } m["dyn"] = { "Dyangadi", 3913820, "aus-cww", "Latn", } m["dyo"] = { "Jola-Fonyi", 3507832, "alv-jol", "Latn", } m["dyu"] = { "Dyula", 32706, "dmn-man", "Latn", } m["dyy"] = { "Dyaabugay", 2591320, "aus-pmn", "Latn", } m["dza"] = { "Tunzu", 3915845, "nic-jer", "Latn", } m["dzg"] = { "Dazaga", 35244, "ssa-sah", "Latn", } m["dzl"] = { "Dzala", 56607, "sit-ebo", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["dzn"] = { "Dzando", 5319622, "bnt-bun", "Latn", } return require("Module:languages").finalizeData(m, "language") 
ryuqo5rd4w08so4wvmqtbfz9z1tn892 มอดูล:languages/data/3/c 828 36384 5720754 5684151 2026-04-21T07:00:52Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720754 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["caa"] = { "Ch'orti'", 35177, "myn", "Latn", } m["cab"] = { "Garifuna", 35490, "awd-taa", "Latn", ancestors = "crb", } m["cac"] = { "Chuj", 35233, "myn", "Latn", } m["cad"] = { "Caddo", 56756, "cdd", "Latn", } m["cae"] = { "Laalaa", 35564, "alv-cng", "Latn", } m["caf"] = { "Southern Carrier", 12953426, "ath-nor", "Latn", } m["cag"] = { "Nivaclé", 3182557, "sai-mtc", "Latn", } m["cah"] = { "Cahuarano", 2933175, "sai-zap", "Latn", } m["caj"] = { "Chané", 56721, "awd", "Latn", } m["cak"] = { "Kaqchikel", 35115, "myn", "Latn", } m["cal"] = { "Carolinian", 28427, "poz-mic", "Latn", } m["cam"] = { "Cèmuhî", 3009690, "poz-cln", "Latn", } m["can"] = { "Chambri", 5069707, "paa-lsp", "Latn", } m["cao"] = { "Chácobo", 2591202, "sai-pan", "Latn", } m["cap"] = { "Chipaya", 35235, "sai-ucp", "Latn", } m["caq"] = { "คาร์นิโคบาร์", 35156, "aav-nic", "Latn, Deva", } m["car"] = { "Kari'na", 56611, "sai-gui", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. "`" .. 
"'%-%s"}, strip_diacritics = { remove_diacritics = c.acute, from = {"â", "ê", "î", "ô", "û", "ŷ"}, to = {"à", "è", "ì", "ò", "ù", "ỳ"} }, } m["cas"] = { "Tsimané", 35950, "qfa-iso", "Latn", } m["cav"] = { "Cavineña", 524102, "sai-tac", "Latn", } m["caw"] = { "Kallawaya", 266417, "qfa-mix", "Latn", } m["cax"] = { "Chiquitano", 1844993, "qfa-iso", "Latn", } m["cay"] = { "Cayuga", 32967, "iro-nor", "Latn", } m["caz"] = { "Canichana", 2936374, "qfa-iso", "Latn", } m["cbb"] = { "Cabiyarí", 3450660, "awd-nwk", "Latn", } m["cbc"] = { "Carapana", 924405, "sai-tuc", "Latn", } m["cbd"] = { "Carijona", 3446655, "sai-tar", "Latn", } m["cbg"] = { "Chimila", 2963680, "cba", "Latn", } m["cbi"] = { "Chachi", 2591329, "sai-bar", "Latn", } m["cbj"] = { "Ede Cabe", 33112829, "alv-ede", "Latn", } m["cbk"] = { "ชาบากาโน", 33281, "crp", "Latn", ancestors = "es", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer}, sort_key = { from = {"ch", "ll", "ñ", "r"}, to = {"c" .. p[1], "l" .. p[1], "n" .. p[1], "r" .. p[1]} }, standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnÑñOoPpQqRrSsTtUuVvWwXxYyZz" .. 
c.punc, } m["cbl"] = { "Bualkhaw Chin", 9229830, "tbq-kuk", "Latn", } m["cbn"] = { "ญัฮกุร", 116849, "mkh-mnc", "Thai", ancestors = "omx", --sort_key = "Thai-sortkey", } m["cbo"] = { "Izora", 3915454, "nic-jer", "Latn", } m["cbq"] = { "Tsucuba", 62603062, "nic-knj", "Latn", } m["cbr"] = { "Cashibo-Cacataibo", 5359560, "sai-pan", "Latn", } m["cbs"] = { "Cashinahua", 2591230, "sai-pan", "Latn", } m["cbt"] = { "Chayahuita", 1526525, "sai-cah", "Latn", } m["cbu"] = { "Candoshi-Shapra", 642843, "qfa-iso", "Latn", } m["cbv"] = { "Cacua", 3192052, "sai-nad", "Latn", ancestors = "mbr", } m["cbw"] = { "Kinabalian", 6410324, "phi", "Latn", } m["cby"] = { "Carabayo", 3441762, "sai-tyu", "Latn", } m["cca"] = { "Cauca", 5054242, "sai-chc", "Latn", } m["ccc"] = { "Chamicuro", 2155119, "awd", "Latn", } m["ccd"] = { "Cafundó", 3331506, "roa-gap", "Latn", ancestors = "pt", } m["cce"] = { "Chopi", 3437616, "bnt-bso", "Latn", } m["ccg"] = { "Chamba Daka", 33120805, "nic-dak", "Latn", } m["cch"] = { "Atsam", 34794, "nic-kne", "Latn", } m["ccj"] = { "Kasanga", 35542, "alv-nyn", "Latn", } m["ccl"] = { "Cutchi-Swahili", 5196729, "crp", "Latn", ancestors = "sw", } m["ccm"] = { "Malaccan Creole Malay", 12636092, "crp", "Latn", ancestors = "ms", } m["cco"] = { "Comaltepec Chinantec", 2963735, "omq-chi", "Latn", } m["ccp"] = { "จักมา", 32952, "inc-bas", "Cakm, Beng, Latn", ancestors = "inc-obn", translit = { Cakm = "Cakm-translit", Beng = "Beng-translit", }, } m["ccr"] = { "Cacaopera", 3438338, "nai-min", "Latn", } m["cda"] = { "Choni", 2964447, "sit-tib", } m["cde"] = { "Chenchu", 32981, "dra-tel", "Telu", } m["cdf"] = { "Chiru", 5102016, "tbq-kuk", "Latn, Beng", } m["cdh"] = { "Chambeali", 12953424, "him", "Deva, Takr", translit = { Deva = "Deva-translit", }, } m["cdi"] = { "Chodri", 5103788, "inc-bhi", "Gujr", } m["cdj"] = { "Churahi", 12629039, "him", "Deva, Takr", translit = { Deva = "Deva-translit", }, } m["cdm"] = { "Chepang", 5091700, "sit-gma", "Deva", translit = { Deva = 
"Deva-translit", }, } m["cdn"] = { "Chaudangsi", 5088056, "sit-alm", } m["cdo"] = { "หมิ่นตะวันออก", 36455, "zhx-com", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["cdr"] = { "Cinda-Regi-Tiyal", 35596, "nic-kmk", "Latn", } m["cds"] = { "Chadian Sign Language", 10322099, "sgn", "Latn", -- when documented } m["cdy"] = { "Chadong", 926742, "qfa-kms", } m["cdz"] = { "Koda", 6425038, "mun", "Beng", } m["cea"] = { "Lower Chehalis", 6693377, "sal", "Latn", } m["ceb"] = { "เซบัวโน", 33239, "phi", "Latn, Tglg", translit = { Tglg = "ceb-translit" }, override_translit = true, strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ } }, sort_key = { Latn = "tl-sortkey", }, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy", c.punc }, } m["ceg"] = { "Chamacoco", 3436637, "sai-zam", "Latn", } m["cen"] = { "Cen", 12628777, "nic-plc", "Latn", ancestors = "izr", } m["cet"] = { "Centúúm", 33608, "qfa-iso", -- northeastern Nigeria "Latn", } m["cfa"] = { "Dijim-Bwilim", 3438350, "alv-wjk", "Latn", } m["cfd"] = { "Cara", 35048, "nic-beo", "Latn", } m["cfg"] = { "Como Karim", 35304, "nic-jkn", "Latn", } m["cfm"] = { "Falam Chin", 56815, "tbq-kuk", "Beng, Latn", } m["cga"] = { "Changriwa", 5072105, "paa-yua", "Latn", } m["cgc"] = { "Kagayanen", 6346422, "mno", "Latn", } m["cgg"] = { "Rukiga", 3270727, "bnt-nyg", "Latn", } m["cgk"] = { "Chocangaca", 56604, "sit-tib", "Tibt", ancestors = "xct", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["chb"] = { "Chibcha", 2356431, "cba", "Latn", } m["chc"] = { "Catawba", 5051602, "nai-cat", "Latn", } m["chd"] = { "Highland Oaxaca Chontal", 2964457, "nai-tqn", "Latn", } m["chf"] = { "Chontal Maya", 35175, "myn", "Latn", } m["chg"] = { "ชากาทาย", 36831, "trk-kar", "Arab, Ougr", ancestors = "zkh", strip_diacritics = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. 
c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {u(0x0671)}, to = {u(0x0627)} }, translit = { Arab = "chg-translit", Ougr = "Ougr-translit", }, } m["chh"] = { "Chinook", 6693380, "nai-ckn", "Latn", } m["chj"] = { "Ojitlán Chinantec", 5100110, "omq-chi", "Latn", } m["chk"] = { "Chuukese", 33161, "poz-mic", "Latn", } m["chl"] = { "Cahuilla", 56438, "azc-cup", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.macron}, } -- chm "Mari" is not recognized as a language, but it is a family code m["chn"] = { "Chinook Jargon", 35173, "crp", "Latn, Dupl", ancestors = "chh, nuk", } m["cho"] = { "Choctaw", 32979, "nai-mus", "Latn", sort_key = {remove_diacritics = c.macronbelow .. "-"}, strip_diacritics = {remove_diacritics = c.acute .. c.dotbelow}, } m["chp"] = { "Chipewyan", 27692, "ath-nor", "Latn, Cans", } m["chq"] = { "Quiotepec Chinantec", 5758709, "omq-chi", "Latn", } m["chr"] = { "เชโรกี", 33388, "iro", "Cher", translit = "Cher-translit", } m["cht"] = { "Cholón", 2591243, "qfa-unc", -- poorly attested; possibly in a Hibito-Cholon or Cholonan family "Latn", } m["chw"] = { "Chuabo", 5118412, "bnt-mak", "Latn", } m["chx"] = { "Chantyal", 4926344, "sit-tam", "Deva", translit = { Deva = "Deva-translit", }, } m["chy"] = { "เชเยนน์", 33265, "alg", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.dotabove .. "-"}, standard_chars = "AaÁáÀàĀāȦȧEeÉéÈèĒēĖėHhKkMmNnOoÓóÒòŌōȮȯPpSsŠšTtVvXx" .. 
c.punc, --umlaut and circumflex not allowed } m["chz"] = { "Ozumacín Chinantec", 5100111, "omq-chi", "Latn", } m["cia"] = { "Cia-Cia", 35284, "poz-mun", "Hang, Latn, Arab", } m["cib"] = { "Ci Gbe", 12952445, "alv-gbe", "Latn", } m["cic"] = { "Chickasaw", 33192, "nai-mus", "Latn", } m["cid"] = { "Chimariko", 1294251, "qfa-iso", -- possibly Hokan "Latn", } m["cie"] = { "Cineni", 56243, "cdc-cbm", "Latn", } m["cih"] = { "Chinali", 11855245, "inc", "Deva", ancestors = "sa", translit = { Deva = "Deva-translit", }, } m["cik"] = { "Chitkuli Kinnauri", 15615982, "sit-kin", } m["cim"] = { "Cimbrian", 37053, "gmw-hgm", "Latn", ancestors = "bar", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove .. c.caron}, } m["cin"] = { "Cinta Larga", 5121095, "tup", "Latn", } m["cip"] = { "Chiapanec", 3364475, "omq", "Latn", } m["cir"] = { "Tinrin", 7862281, "poz-cln", "Latn", } m["ciy"] = { "Chaima", 12628867, "sai-ven", "Latn", } m["cja"] = { "จามตะวันตก", 12645578, "cmc", "Latn, Arab, Khmr, Cham", -- Western Cham script is not yet available. Also, Arabic script is missing some glyphs. 
} m["cje"] = { "Chru", 2967321, "cmc", "Latn", } m["cjh"] = { "Upper Chehalis", 2962074, "sal", "Latn", } m["cji"] = { "Chamalal", 56567, "cau-and", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], } m["cjk"] = { "Chokwe", 2422065, "bnt-clu", "Latn", } m["cjm"] = { "จามตะวันออก", 2948019, "cmc", "Latn, Cham", } m["cjn"] = { "Chenapian", 5091044, "paa-spk", "Latn", } m["cjo"] = { "Ashéninka Pajonal", 3450481, "awd", "Latn", } m["cjp"] = { "Cabécar", 27878, "cba", "Latn", } m["cjs"] = { "โชร์", 34139, "trk-ssb", "Cyrl", } m["cjv"] = { "Chuave", 5115226, "ngf-chw", "Latn", } m["cjy"] = { "จิ้น", 56479, "zhx", "Hants", ancestors = "ltc", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["ckb"] = { "เคิร์ดตอนกลาง", 36811, "ku", "ku-Arab", translit = "ckb-translit", strip_diacritics = {remove_diacritics = c.kasra .. c.sukun}, } m["ckh"] = { "Chak", 12628870, "sit-luu", "Latn", ancestors = "kdv", } m["ckl"] = { "Cibak", 56279, "cdc-cbm", "Latn", } m["ckn"] = { "Kaang Chin", 6343432, "tbq-kuk", "Latn", } m["cko"] = { "Anufo", 34845, "alv-ctn", "Latn", } m["ckq"] = { "Kajakse", 3440422, "cdc-est", "Latn", } m["ckr"] = { "Kairak", 3503002, "paa-bng", "Latn", } m["cks"] = { "Tayo", 1133089, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["ckt"] = { "ชุกชี", 33170, "qfa-ckn", "Cyrl, Latn", -- Latn is obsolete strip_diacritics = { from = {"['’]"}, to = {"ʼ"} }, sort_key = { from = {"ё", "ӄ", "ԓ", "ӈ"}, to = {"е" .. p[1], "к" .. p[1], "л" .. p[1], "н" .. 
p[1]} }, } m["cku"] = { "Koasati", 35162, "nai-mus", "Latn", } m["ckv"] = { "กบาลัน", 716627, "map", "Latn", } m["ckx"] = { "Caka", 5018037, "nic-tvc", "Latn", } m["cky"] = { "Cakfem-Mushere", 3441199, "cdc-wst", "Latn", } m["ckz"] = { "Kaqchikel-K'iche' Mixed Language", 5054550, "qfa-mix", "Latn", ancestors = "cak, quc" } m["cla"] = { "Ron", 3440432, "cdc-wst", "Latn", } m["clc"] = { "Chilcotin", 28535, "ath-nor", "Latn", } m["cld"] = { "Chaldean Neo-Aramaic", 33236, "sem-are", "Syrc", strip_diacritics = "Syrc-stripdiacritics", } m["cle"] = { "Lealao Chinantec", 6509365, "omq-chi", "Latn", } m["clh"] = { "Chilisso", 3250629, "inc-koh", "ur-Arab", } m["cli"] = { "Chakali", 35206, "nic-gnw", "Latn", } m["clj"] = { "Laitu Chin", 6474196, "tbq-kuk", } m["clk"] = { "Idu", 56412, "sit-gsi", "Tibt, Deva", translit = { Deva = "Deva-translit", }, override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cll"] = { "Chala", 35190, "nic-gne", "Latn", } m["clm"] = { "Klallam", 33404, "sal", "Latn", } m["clo"] = { "Lowland Oaxaca Chontal", 2964450, "nai-tqn", "Latn", } m["clt"] = { "Lutuv", 6502107, "tbq-kuk", "Latn", } m["clu"] = { "Caluyanun", 32964, "phi", "Latn", } m["clw"] = { "Chulym", 33125, "trk-ssb", "Latn, Cyrl", } m["cly"] = { "Eastern Highland Chatino", 12642078, "omq-cha", "Latn", } m["cma"] = { "หมะ", 12953680, "mkh-ban", "Latn", } m["cme"] = { "Cerma", 35074, "nic-gur", "Latn", } m["cmg"] = { "มองโกเลียคลาสสิก", 5128303, "xgn-cen", "Mong, Soyo, Zanb", -- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]] } m["cmi"] = { "Emberá-Chamí", 3052042, "sai-chc", "Latn", } m["cml"] = { "Campalagian", 5027893, "poz-ssw", "Latn", } m["cmm"] = { "Michigamea", 12636809, "sio-msv", "Latn", } m["cmn"] = { "จีนกลาง", 9192, "zhx-man", "Hants, Latn, Bopo, Brai", wikimedia_codes = "zh", generate_forms = "zh-generateforms", translit = { Hani = "zh-translit", Bopo = "zh-translit", }, sort_key = { Hani 
= "Hani-sortkey", Latn = { from = { -- Sort terms with tone numbers immediately after equivalent terms with diacritics. "[aeiouv][" .. c.circ .. c.diaer .. "]?[nr]?g?[0-5]", -- Add temporary breaks between syllables. "([aeiouvmn][" .. c.circ .. c.diaer .. "]?[" .. c.macron .. c.acute .. c.caron .. c.grave .. "]?n?ŋ?g?r?)([bpmfdtnlgkhjqxzcsywrv']h?[aeiouvmn ])", p[1] .. "([ngr])$", p[1] .. "([ngr][%s%-'" .. p[1] .. "])", -- Substitute diacritics for syllable-final tone numbers, and add tone 0 where necessary. c.macron, c.acute, c.caron, c.grave, "([1-4])([^%s%p" .. p[1] .. "]+)", "([^0-5])%f[%z%s%p" .. p[1] .. "]", -- Substitute "v" shorthand for "ü" for a temporary placeholder, so that the (very rare) "v" initial is not affected by the later shorthand substitutions. "([^ " .. p[1] .. "])v", -- Remove temporary breaks. p[1], -- Substitute shorthands for full forms, and sort them immediately after equivalent terms. "%S*[csz]" .. c.circ .. "%S*", "%S*[ŋ" .. p[2] .. "]%S*", "ĉ", "ŝ", "ŋ", p[2], "ẑ", -- "ê" comes after "e", "ü" comes after "u" and apostrophes are removed (as their function is replaced by tone numbers). "[" .. c.circ .. c.diaer .. "]", "'", -- Sort numbered tone 5 after tone 0. "5!" }, to = { "%0!", "%1" .. p[1] .. "%2", "%1", "%1", "1", "2", "3", "4", "%2%1", "%10", "%1" .. p[2], "", "%0\"", "%0\"", "ch", "sh", "ng", "ü", "zh", p[1], "", "0!!" 
} }, }, } m["cmo"] = { "มนองตอนกลาง", 33369881, "mkh-ban", "Khmr, Latn", } m["cmr"] = { "Mro Chin", 16889978, "tbq-kuk", } m["cms"] = { "Messapic", 36383, "ine", "Ital, Latn, Polyt", -- Ital translit in [[Module:scripts/data]] -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cmt"] = { "Camtho", 10441336, "crp", "Latn", ancestors = "fly, zu" } m["cna"] = { "Changthang", 12952322, "sit-lab", "Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cnb"] = { "Chinbon Chin", 12952327, "tbq-kuk", "Latn", } m["cnc"] = { "Cốông", 5202780, "tbq-bis", "Latn", } m["cng"] = { "Northern Qiang", 56559, "sit-qia", "Latn", } m["cnh"] = { "Lai", 3250286, "tbq-kuk", "Latn, Mymr", } m["cni"] = { "Asháninka", 3437230, "awd", "Latn", } m["cnk"] = { "Khumi Chin", 56308, "tbq-kuk", "Latn", } m["cnl"] = { "Lalana Chinantec", 12953437, "omq-chi", "Latn", } m["cno"] = { "Con", 3440883, "mkh-pal", } m["cnp"] = { "ผิงเหนือ", 84302463, "zhx-pin", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["cns"] = { "Central Asmat", 11732048, "ngf-ask", "Latn", } m["cnt"] = { "Tepetotutla Chinantec", 5100113, "omq-chi", "Latn", } m["cnu"] = { "Chenoua", 33276, "ber", "Latn", } m["cnw"] = { "Ngawn Chin", 6583675, "tbq-kuk", } m["cnx"] = { "Middle Cornish", 12642603, "cel-brs", "Latn", ancestors = "oco", } m["coa"] = { "Cocos Islands Malay", 3441699, "crp", "Latn", ancestors = "ms", } m["cob"] = { "Chicomuceltec", 3307204, "myn", "Latn", } m["coc"] = { "Cocopa", 33044, "nai-yuc", "Latn", } m["cod"] = { "Cocama", 33317, "tup", "Latn", } m["coe"] = { "Koreguaje", 3198924, "sai-tuc", "Latn", } m["cof"] = { "Tsafiki", 2567055, "sai-bar", "Latn", } m["cog"] = { "ชอง", 3914630, "mkh-pea", "Thai, Khmr", translit = { Khmr = "Khmr-translit", }, --sort_key = { -- Thai = "Thai-sortkey" --}, } m["coh"] = { "Chichonyi-Chidzihana-Chikauma", 12629011, "bnt-mij", "Latn", } m["coj"] = 
{ "Cochimi", 3915551, "nai-yuc", "Latn", } m["cok"] = { "Santa Teresa Cora", 12641754, "azc", "Latn", } m["col"] = { "Columbia-Wenatchi", 3324744, "sal", "Latn", } m["com"] = { "Comanche", 32972, "azc-num", "Latn", } m["con"] = { "Cofán", 2669254, "qfa-iso", "Latn", } m["coo"] = { "Comox", 13583746, "sal", "Latn", } m["cop"] = { "คอปติก", 36155, "egx", "Copt", translit = "Copt-translit", ancestors = "egx-dem", strip_diacritics = {remove_diacritics = c.grave .. c.macron .. c.overline .. c.diaer .. "ˋ"}, sort_key = "Copt-sortkey", } m["coq"] = { "Coquille", 12953452, "ath-pco", "Latn", } m["cot"] = { "Caquinte", 3915557, "awd", "Latn", } m["cou"] = { "Wamey", 36935, "alv-ten", "Latn", } m["cov"] = { "เฉ่าเหมียว", 2936935, "qfa-tak", } m["cow"] = { "Cowlitz", 3001877, "sal", "Latn", } m["cox"] = { "Nanti", 15342275, "awd", "Latn", } m["coy"] = { "Coyaima", 56450, "sai-car", "Latn", } m["coz"] = { "Chochotec", 2964262, "omq-pop", "Latn", } m["cpa"] = { "Palantla Chinantec", 5100112, "omq-chi", "Latn", } m["cpb"] = { "Ucayali-Yurúa Ashéninka", 3501858, "awd", "Latn", } m["cpc"] = { "Ajyíninka Apurucayali", 3327405, "awd", "Latn", } m["cpg"] = { "Cappadocian Greek", 853414, "grk", "Grek, fa-Arab", ancestors = "gkm", translit = { Grek = "el-translit", }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cpi"] = { "Chinese Pidgin English", 3435078, "crp", "Latn, Hant", ancestors = "en", sort_key = { Hant = "Hani-sortkey" }, } m["cpn"] = { "Cherepon", 35181, "alv-gng", "Latn", } m["cpo"] = { "Kpee", 6435722, "dmn-jje", } m["cps"] = { "Capiznon", 2937525, "phi", "Latn", } m["cpu"] = { "Pichis Ashéninka", 7190661, "awd", "Latn", } m["cpx"] = { "ผูเซียน", 56583, "zhx-com", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["cpy"] = { "South Ucayali Ashéninka", 3501868, "awd", "Latn", } m["cqd"] = { "Chuanqiandian Cluster Miao", 121627627, "hmn", "Latn, Plrd", } m["cra"] = { "Chara", 5073694, "omv", "Latn", } m["crb"] = { 
"Kalinago", 3450735, "awd-taa", "Latn", } m["crc"] = { "Lonwolwol", 3259216, "poz-vnc", "Latn", } m["crd"] = { "Coeur d'Alene", 32915, "sal", "Latn", } m["crf"] = { "Caramanta", 3504195, "sai-chc", "Latn", } m["crg"] = { "Michif", 13315, "qfa-mix", "Latn", ancestors = "cr, fr", } m["crh"] = { "ตาตาร์แบบไครเมีย", 33357, "trk-kcu", "Latn, Cyrl", dotted_dotless_i = true, sort_key = { Latn = { from = { "[ıi]" .. c.breve, -- Convert ĭ into PUA so that the decomposed form does not get caught by the next step. Also cover decomposed forms with ı and i, as decomposed Ĭ is converted to ı + ̆ due to the dotted dotless I logic). "i", -- Ensure "i" comes after "ı". "â", "ç", "ğ", "ı", p[3], "ñ", "ö", "ş", "ü" }, to = { p[3], "i" .. p[1], "a", "c" .. p[1], "g" .. p[1], "i", "i" .. p[2], "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1], } }, Cyrl = { from = {"гъ", "ё", "къ", "нъ", "дж"}, to = {"г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "ч" .. p[1]} }, }, } m["cri"] = { "Sãotomense", 36536, "crp", "Latn", ancestors = "pt", } m["crj"] = { "Southern East Cree", 12953464, "alg", "Latn, Cans", ancestors = "cr", translit = { Cans = "cr-translit" }, } m["crk"] = { "Plains Cree", 56699, "alg", "Latn, Cans", ancestors = "cr", } m["crl"] = { "Northern East Cree", 12642195, "alg", "Latn, Cans", ancestors = "cr", translit = { Cans = "cr-translit" }, } m["crm"] = { "Moose Cree", 3446671, "alg", "Latn, Cans", ancestors = "cr", } m["crn"] = { "Cora", 12953454, "azc", "Latn", } m["cro"] = { "Crow", 1207611, "sio-mor", "Latn", } m["crq"] = { "Iyo'wujwa Chorote", 3540927, "sai-mtc", "Latn", } m["crr"] = { "Carolina Algonquian", 16113723, "alg-eas", "Latn", } m["crs"] = { "Seychellois Creole", 34015, "crp", "Latn", ancestors = "fr", sort_key = s["roa-oil-sortkey"], } m["crt"] = { "Iyojwa'ja Chorote", 3504118, "sai-mtc", "Latn", } m["crv"] = { "Chaura", 2605680, "aav-nic", "Latn", } m["crw"] = { "Chrau", 5105629, "mkh-ban", "Latn", } m["crx"] = { "Carrier", 12953431, "ath-nor", "Latn, 
Cans", } m["cry"] = { "Cori", 35204, "nic-plc", "Latn", } m["crz"] = { "Cruzeño", 2967636, "nai-chu", "Latn", } m["csa"] = { "Chiltepec Chinantec", 12953435, "omq-chi", "Latn", } m["csb"] = { "คาชุบ", 33690, "zlw-pom", "Latn", } m["csc"] = { "Catalan Sign Language", 35768, "sgn", "Latn", -- when documented } m["csd"] = { "Chiangmai Sign Language", 5095211, "sgn", } m["cse"] = { "Czech Sign Language", 5201809, "sgn", "Latn", -- when documented } m["csf"] = { "Cuban Sign Language", 5192046, "sgn", "Latn", -- when documented } m["csg"] = { "Chilean Sign Language", 3322112, "sgn", "Latn", -- when documented } m["csh"] = { "Asho Chin", 12627282, "tbq-kuk", "Latn, Mymr", } m["csi"] = { "Coast Miwok", 2981109, "nai-utn", "Latn", } m["csj"] = { "Songlai Chin", 7561280, "tbq-kuk", } m["csk"] = { "Jola-Kasa", 3446622, "alv-jol", "Latn", } m["csl"] = { "Chinese Sign Language", 1094190, "sgn", } m["csm"] = { "Central Sierra Miwok", 2944443, "nai-utn", "Latn", } m["csn"] = { "Colombian Sign Language", 2748229, "sgn", "Latn", -- when documented } m["cso"] = { "Sochiapam Chinantec", 7550388, "omq-chi", "Latn", } m["csp"] = { "ผิงใต้", 84302019, "zhx-pin", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["csq"] = { "Croatian Sign Language", 3507506, "sgn", } m["csr"] = { "Costa Rican Sign Language", 5174901, "sgn", "Latn", -- when documented } m["css"] = { "Southern Ohlone", 25559664, "nai-utn", "Latn", } m["cst"] = { "Northern Ohlone", 25559666, "nai-utn", "Latn", } m["csv"] = { "Sumtu Chin", 7638087, "tbq-kuk", } m["csw"] = { "Swampy Cree", 56696, "alg", "Latn, Cans", ancestors = "cr", } m["csx"] = { "Cambodian Sign Language", 50934287, "sgn", } m["csy"] = { "Siyin Chin", 7533375, "tbq-kuk", } m["csz"] = { "Coos", 3126783, "nai-coo", "Latn", } m["cta"] = { "Tataltepec Chatino", 7687853, "omq-cha", "Latn", } m["ctc"] = { "Chetco-Tolowa", 12628946, "ath-pco", "Latn", } m["ctd"] = { "Tedim Chin", 56357, "tbq-kuk", "Latn, Pauc", 
} m["cte"] = { "Tepinapa Chinantec", 12953443, "omq-chi", "Latn", } m["ctg"] = { "Chittagonian", 33173, "inc-bas", "Beng", ancestors = "inc-obn", } m["cth"] = { "Thaiphum Chin", 16912048, "tbq-kuk", } m["ctl"] = { "Tlacoatzintepec Chinantec", 12643657, "omq-chi", "Latn", } m["ctm"] = { "Chitimacha", 1294227, "qfa-iso", -- recently proposed to be in the Totozoquean family "Latn", } m["ctn"] = { "Chhintange", 32994, "sit-kie", "Deva", translit = { Deva = "Deva-translit", }, } m["cto"] = { "Emberá-Catío", 3052039, "sai-chc", "Latn", } m["ctp"] = { "Western Highland Chatino", 32861734, "omq-cha", "Latn", strip_diacritics = {remove_diacritics = "¹²³⁴⁵"}, sort_key = {remove_diacritics = c.acute}, } m["cts"] = { "Northern Catanduanes Bicolano", 7130477, "phi", "Latn", } m["ctt"] = { "Wayanad Chetti", 7975850, "dra-mal", "Taml", } m["ctu"] = { "Chol", 35179, "myn", "Latn", } m["ctz"] = { "Zacatepec Chatino", 8063754, "omq-cha", "Latn", } m["cua"] = { "Cua", 3441115, "mkh-ban", "Latn", } m["cub"] = { "Cubeo", 3006705, "sai-tuc", "Latn", } m["cuc"] = { "Usila Chinantec", 7901979, "omq-chi", "Latn", } m["cug"] = { "Cung", 35194, "nic-bbe", "Latn", } m["cuh"] = { "Chuka", 12952344, "bnt-kka", "Latn", } m["cui"] = { "Cuiba", 2980421, "sai-guh", "Latn", } m["cuj"] = { "Mashco Piro", 3446596, "awd", "Latn", } m["cuk"] = { "Kuna", 12953659, "cba", "Latn", } m["cul"] = { "Culina", 2475442, "auf", "Latn", } m["cuo"] = { "Cumanagoto", 5193784, "sai-cpc", "Latn", } m["cup"] = { "Cupeño", 143130, "azc-cup", "Latn", } m["cuq"] = { "จุน", 2475478, "qfa-lic", "Latn", } m["cur"] = { "Chhulung", 5116126, "sit-kie", "Deva", translit = { Deva = "Deva-translit", }, } m["cut"] = { "Teutila Cuicatec", 12953453, "omq-cui", "Latn", } m["cuu"] = { "Tai Ya", 3441122, "qfa-tak", "Latn", } m["cuv"] = { "Cuvok", 3515056, "cdc-cbm", "Latn", } m["cuw"] = { "Chukwa", 12629033, "sit-kic", } m["cux"] = { "Tepeuxila Cuicatec", 20527242, "omq-cui", "Latn", } m["cuy"] = { "Cuitlatec", 2030998, "qfa-iso", 
"Latn", } m["cvg"] = { "Chug", 47683644, "sit-khc", "Tibt, Latn", -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- (NOTE: formerly not present, probably an accidental omission) } m["cvn"] = { "Valle Nacional Chinantec", 12953442, "omq-chi", "Latn", } m["cwa"] = { "Kabwa", 6344537, "bnt-lok", "Latn", } m["cwb"] = { "Maindo", 11002891, "bnt-mak", "Latn", ancestors = "chw", } m["cwd"] = { "Woods Cree", 56305, "alg", "Latn, Cans", ancestors = "cr", } m["cwe"] = { "Kwere", 779632, "bnt-ruv", "Latn", } m["cwg"] = { "Chewong", 646718, "mkh-asl", "Latn", } m["cwt"] = { "Kuwaataay", 35699, "alv-jol", "Latn", } m["cya"] = { "Nopala Chatino", 15616302, "omq-cha", "Latn", } m["cyb"] = { "Cayubaba", 3183382, "qfa-iso", "Latn", } m["cyo"] = { "Cuyunon", 33153, "phi", "Latn", } m["czh"] = { "Huizhou", 56546, "zhx", "Hants", -- ? ancestors = "ltc", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["czk"] = { "Knaanic", 56384, "zlw", "Hebr", ancestors = "zlw-ocs", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["czn"] = { "Zenzontepec Chatino", 603106, "omq-cha", "Latn", } m["czo"] = { "หมิ่นตอนกลาง", 56435, "zhx-inm", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["czt"] = { "Zotung Chin", 8074599, "tbq-kuk", "Latn", } return require("Module:languages").finalizeData(m, "language") p2zltpvocfy604hk72zjsoomdfi3qb3 มอดูล:languages/data/3/b 828 36385 5720753 5684150 2026-04-21T07:00:50Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720753 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["baa"] = { "Babatana", 2877785, "poz-ocw", "Latn", } m["bab"] = { "Bainouk-Gunyuño", 35508, "alv-bny", "Latn", } m["bac"] = { "Baduy", 3449885, "poz-msa", "Latn, Sund", ancestors = "osn", translit = { Sund = "Sund-translit" }, } m["bae"] = { "Baré", 3504087, "awd", "Latn", } m["baf"] = { "Nubaca", 36270, "nic-ymb", "Latn", } m["bag"] = { "Tuki", 36621, "nic-mba", "Latn", } m["bah"] = { "Bahamian Creole", 2669229, "crp", "Latn", ancestors = "en", } m["baj"] = { "Barakai", 3502030, "poz-cet", "Latn", } m["bal"] = { "บาโลจ", 33049, "ira-nwi", "fa-Arab", } m["ban"] = { "บาหลี", 33070, "poz-bss", "Latn, Bali", } m["bao"] = { "Waimaha", 2883738, "sai-tuc", "Latn", } m["bap"] = { "Bantawa", 56500, "sit-kic", "Krai, Deva", translit = { Deva = "Deva-translit", }, } m["bar"] = { "บาวาเรีย", 29540, "gmw-hgm", "Latn", ancestors = "gmh", } m["bas"] = { "Basaa", 33093, "bnt-bsa", "Latn", } m["bau"] = { "Badanchi", 11001650, "nic-jrw", "Latn", } m["bav"] = { "Babungo", 34885, "nic-rnn", "Latn", } m["baw"] = { "Bambili-Bambui", 34880, "nic-nge", "Latn", } m["bax"] = { "Bamum", 35280, "nic-nun", "Latn, Bamu", } m["bay"] = { "Batuley", 8828787, "poz", "Latn", } m["bba"] = { "Baatonum", 34889, "alv-sav", "Latn", } m["bbb"] = { "Barai", 4858206, "ngf-koi", "Latn", } m["bbc"] = { "Toba Batak", 33017, "btk", "Latn, Batk", } m["bbd"] = { "Bau", 4873415, "ngf-gum", "Latn", } m["bbe"] = { "Bangba", 34895, "nic-nke", "Latn", } m["bbf"] = { "Baibai", 56902, "paa-fas", "Latn", } m["bbg"] = { "Barama", 34884, "bnt-sir", "Latn", } m["bbh"] = { "Bugan", 3033554, "mkh-pkn", "Latn", } m["bbi"] = { "Barombi", 34985, "bnt-bsa", "Latn", } m["bbj"] = { "Ghomala'", 35271, "bai", "Latn", } m["bbk"] = { "Babanki", 34790, "nic-rnc", "Latn", } m["bbl"] = { "บัตส์", 33259, "cau-nkh", "Geor", -- Geor translit in [[Module:scripts/data]] override_translit = true, strip_diacritics = { remove_diacritics = 
c.tilde .. c.macron .. c.breve, from = {"<sup>ნ</sup>"}, to = {"ნ"} }, } m["bbm"] = { -- name includes prefix "Babango", 34819, "bnt-bta", "Latn", } m["bbn"] = { "Uneapa", 7884126, "poz-ocw", "Latn", } m["bbo"] = { "Konabéré", 35371, "dmn-snb", "Latn", } m["bbp"] = { "West Central Banda", 7984377, "bad", "Latn", } m["bbq"] = { "Bamali", 34901, "nic-nun", "Latn", } m["bbr"] = { "Girawa", 5564185, "ngf-kok", "Latn", } m["bbs"] = { "Bakpinka", 3515061, "nic-ucr", "Latn", } m["bbt"] = { "Mburku", 3441324, "cdc-wst", "Latn", } m["bbu"] = { "Bakulung", 35580, "nic-jrn", "Latn", } m["bbv"] = { "Karnai", 6372803, "poz-ocw", "Latn", } m["bbw"] = { "Baba", 34822, "nic-nun", "Latn", } m["bbx"] = { -- cf bvb "Bubia", 34953, "nic-bds", "Latn", ancestors = "bvb", } m["bby"] = { "Befang", 34960, "nic-bds", "Latn", } m["bca"] = { "Central Bai", 12628803, "sit-bai", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["bcb"] = { "Bainouk-Samik", 36390, "alv-bny", "Latn", } m["bcd"] = { "North Babar", 7054041, "poz-tim", "Latn", } m["bce"] = { "Bamenyam", 34968, "nic-nun", "Latn", } m["bcf"] = { "Bamu", 3503788, "paa-kiw", "Latn", } m["bcg"] = { "Baga Pokur", 31172660, "alv-nal", "Latn", } m["bch"] = { "Bariai", 2884502, "poz-ocw", "Latn", } m["bci"] = { "Baoule", 35107, "alv-ctn", "Latn", } m["bcj"] = { "Bardi", 3913852, "aus-nyu", "Latn", } m["bck"] = { "Bunaba", 580923, "aus-bub", "Latn", } m["bcl"] = { "บีโคลตอนกลาง", 33284, "phi", "Latn, Tglg", translit = { Tglg = "bcl-translit", }, override_translit = true, strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ, } }, sort_key = { Latn = "tl-sortkey", }, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. 
c.punc, }, } m["bcm"] = { "Banoni", 2882857, "poz-ocw", "Latn", } m["bcn"] = { "Bibaali", 34892, "alv-mye", "Latn", } m["bco"] = { "Kaluli", 6354586, "ngf-bos", "Latn", } m["bcp"] = { "Bali", 3515074, "bnt-kbi", "Latn", } m["bcq"] = { "Bench", 35108, "omv", "Latn", } m["bcr"] = { "Babine-Witsuwit'en", 27864, "ath-nor", "Latn", } m["bcs"] = { "Kohumono", 35590, "nic-ucn", "Latn", } m["bct"] = { "Bendi", 8836662, "csu-mle", "Latn", } m["bcu"] = { "Biliau", 2874658, "poz-ocw", "Latn", } m["bcv"] = { "Shoo-Minda-Nye", 36548, "nic-jkn", "Latn", } m["bcw"] = { "Bana", 56272, "cdc-cbm", "Latn", } m["bcy"] = { "Bacama", 56274, "cdc-cbm", "Latn", } m["bcz"] = { "Bainouk-Gunyaamolo", 35506, "alv-bny", "Latn", } m["bda"] = { "Bayot", 35019, "alv-jol", "Latn", } m["bdb"] = { "Basap", 3504208, "poz-bnn", "Latn", } m["bdc"] = { "Emberá-Baudó", 11173166, "sai-chc", "Latn", } m["bdd"] = { "Bunama", 4997416, "poz-ocw", "Latn", } m["bde"] = { "Bade", 56239, "cdc-wst", "Latn", } m["bdf"] = { "Biage", 48037487, "ngf-koi", "Latn", } m["bdg"] = { "Bonggi", 2910053, "poz-bnn", "Latn", } m["bdh"] = { "Tara Baka", 2880165, "csu-bbk", "Latn", } m["bdi"] = { "Burun", 35040, "sdv-niw", "Latn", } m["bdj"] = { "Bai (South Sudan)", 34894, "nic-ser", "Latn", } m["bdk"] = { "Budukh", 35397, "cau-ssm", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["bdl"] = { "บาเจาแบบอินโดนีเซีย", 2880038, "poz", "Latn", } m["bdm"] = { "Buduma", 56287, "cdc-cbm", "Latn", } m["bdn"] = { "Baldemu", 56280, "cdc-cbm", "Latn", } m["bdo"] = { "Morom", 759770, "csu-bgr", "Latn", } m["bdp"] = { "Bende", 8836490, "bnt", "Latn", } m["bdq"] = { "บะห์นัร", 32924, "mkh-ban", "Latn", } m["bdr"] = { "บาเจาแบบเวสต์โคสต์", 2880037, "poz-sbj", "Latn", } m["bds"] = { "Burunge", 56617, "cus-sou", "Latn", } m["bdt"] = { "Bokoto", 4938812, "gba-wes", "Latn", } m["bdu"] = { "Oroko", 36278, "bnt-saw", 
"Latn", } m["bdv"] = { "Bodo Parja", 8845881, "inc-eas", "Orya", } m["bdw"] = { "Baham", 3513309, "paa-mbi", "Latn", } m["bdx"] = { "Budong-Budong", 4985158, "poz-ssw", "Latn", } m["bdy"] = { "Bandjalang", 2980386, "aus-pam", "Latn", } m["bdz"] = { "Badeshi", 33028, "iir", "Arab, Latn", } m["bea"] = { "Beaver", 20826, "ath-nor", "Latn", } m["beb"] = { "Bebele", 34976, "bnt-btb", "Latn", } m["bec"] = { "Iceve-Maci", 35449, "nic-tvc", "Latn", } m["bed"] = { "Bedoanas", 4879330, "poz-hce", "Latn", } m["bee"] = { "Byangsi", 56904, "sit-alm", "Deva", translit = { Deva = "Deva-translit", }, } m["bef"] = { "Benabena", 2895638, "ngf-kag", "Latn", } m["beg"] = { "Belait", 2894198, "poz-swa", "Latn", } m["beh"] = { "Biali", 34961, "nic-eov", "Latn", } m["bei"] = { "Bekati'", 3441683, "day", "Latn", } m["bej"] = { "Beja", 33025, "cus", "Arab, Latn", strip_diacritics = { Latn = { remove_diacritics = c.acute, } }, } m["bek"] = { "Bebeli", 4878430, "poz-ocw", "Latn", } m["bem"] = { "Bemba", 33052, "bnt-sbi", "Latn", } m["beo"] = { "Beami", 3504079, "ngf-bos", "Latn", } m["bep"] = { "Besoa", 8840465, "poz-kal", "Latn", } m["beq"] = { "Beembe", 3196320, "bnt-kng", "Latn", } m["bes"] = { "Besme", 289832, "alv-kim", "Latn", } m["bet"] = { "Guiberoua Bété", 11019185, "kro-bet", "Latn", } m["beu"] = { "บลาการ์", 4923846, "paa-tap", "Latn", } m["bev"] = { "Daloa Bété", 11155819, "kro-bet", "Latn", } m["bew"] = { "เบอตาวี", 33014, "crp", "Latn", ancestors = "ms", } m["bex"] = { "Jur Modo", 56682, "csu-bbk", "Latn", } m["bey"] = { "Akuwagel", 3504170, "paa-tor", "Latn", } m["bez"] = { "Kibena", 2502949, "bnt-bki", "Latn", } m["bfa"] = { "Bari", 35042, "sdv-bri", "Latn", } m["bfb"] = { "Pauri Bareli", 7155462, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["bfc"] = { "Panyi Bai", 12642165, "sit-nba", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["bfd"] = { "Bafut", 34888, "nic-nge", "Latn", } m["bfe"] = { "Betaf", 4897329, "paa-tkw", "Latn", } m["bff"] = { "Bofi", 
34914, "gba-eas", "Latn", } m["bfg"] = { "Busang Kayan", 9231909, "poz", "Latn", } m["bfh"] = { "Blafe", 12628007, "paa-yam", "Latn", } m["bfi"] = { "British Sign Language", 33000, "sgn", "Latn", -- when documented } m["bfj"] = { "Bafanji", 34890, "nic-nun", "Latn", } m["bfk"] = { "Ban Khor Sign Language", 3441103, "sgn", } m["bfl"] = { "Banda-Ndélé", 34850, "bad-cnt", "Latn", } m["bfm"] = { "Mmen", 36132, "nic-rnc", "Latn", } m["bfn"] = { "Bunak", 35101, "paa-tap", "Latn", } m["bfo"] = { "Malba Birifor", 11150710, "nic-mre", "Latn", } m["bfp"] = { "Beba", 35050, "nic-nge", "Latn", } m["bfq"] = { "พทคะ", 33205, "dra-kan", "Taml, Knda, Mlym", translit = { Taml = "Taml-translit", }, -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["bfr"] = { "Bazigar", 8829558, "inc", } m["bfs"] = { "Southern Bai", 12952250, "sit-bai", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["bft"] = { "บัลติ", 33086, "sit-lab", "fa-Arab, Deva, Tibt", translit = { Deva = "Deva-translit", }, override_translit = "Tibt", -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] strip_diacritics = { ["fa-Arab"] = { from = {"هٔ", "ٱ"}, to = {"ه", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.kashida .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. 
c.superalef, }, }, } m["bfu"] = { "Gahri", 5516952, "sit-whm", "Takr, Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["bfw"] = { "Bondo", 2567942, "mun", "Orya", } m["bfx"] = { "Bantayanon", 16837866, "phi", "Latn", } m["bfy"] = { "Bagheli", 2356364, "inc-hie", "Deva", ancestors = "inc-oaw", translit = { Deva = "Deva-translit", }, } m["bfz"] = { "Mahasu Pahari", 6733460, "him", "Deva, Takr", translit = { Deva = "Deva-translit", }, } m["bga"] = { "Gwamhi-Wuri", 6707102, "nic-knn", "Latn", } m["bgb"] = { "Bobongko", 4935896, "poz-slb", "Latn", } m["bgc"] = { "Haryanvi", 33410, "inc-hiw", "Deva", translit = { Deva = "Deva-translit", }, } m["bgd"] = { "Rathwi Bareli", 7295692, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["bge"] = { "Bauria", 4873579, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["bgf"] = { "Bangandu", 34938, "gba-sou", "Latn", } m["bgg"] = { "Bugun", 3514220, "sit-khb", "Latn", } m["bgi"] = { "Giangan", 4842057, "phi", "Latn", } m["bgj"] = { "Bangolan", 34862, "nic-nun", "Latn", } m["bgk"] = { "Bit", 2904868, "mkh-pal", "Latn", -- also Hani? 
} m["bgl"] = { "Bo", 8845514, "mkh-vie", } m["bgo"] = { "Baga Koga", 35695, "alv-bag", "Latn", } m["bgq"] = { "Bagri", 2426319, "raj", "Deva", translit = { Deva = "Deva-translit", }, } m["bgr"] = { "Bawm Chin", 56765, "tbq-kuk", "Latn", } m["bgs"] = { "Tagabawa", 7675121, "mno", "Latn", } m["bgt"] = { "Bughotu", 2927723, "poz-sls", "Latn", } m["bgu"] = { "Mbongno", 36141, "nic-mmb", "Latn", } m["bgv"] = { "Warkay-Bipim", 4915439, "paa-ani", "Latn", } m["bgw"] = { "Bhatri", 8841054, "inc-eas", "Deva", translit = { Deva = "Deva-translit", }, } m["bgx"] = { "Balkan Gagauz Turkish", 2360396, "trk-ogz", "Latn", ancestors = "trk-oat", } m["bgy"] = { "Benggoi", 4887742, "poz-cma", "Latn", } m["bgz"] = { "Banggai", 3441692, "poz-slb", "Latn", } m["bha"] = { "Bharia", 4901287, "inc", "Deva", translit = { Deva = "Deva-translit", }, } m["bhb"] = { "Bhili", 33229, "inc-bhi", "Deva, Gujr", translit = { Deva = "Deva-translit", Gujr = "Gujr-translit", }, } m["bhc"] = { "Biga", 2902375, "poz-hce", "Latn", } m["bhd"] = { "Bhadrawahi", 4900565, "him", "Arab, Deva", translit = { Deva = "Deva-translit", }, } m["bhe"] = { "Bhaya", 8841168, "raj", } m["bhf"] = { "Odiai", 56690, "qfa-dis", -- Papuan; no consensus; may be in the Kwomtari family, an isolate and/or distantly related to the -- Torricelli family. 
"Latn", } m["bhg"] = { "Binandere", 3503802, "paa-bin", "Latn", } m["bhh"] = { "Bukhari", 56469, "ira-swi", "Cyrl, Hebr, Latn, fa-Arab", ancestors = "tg", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["bhi"] = { "Bhilali", 4901729, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["bhj"] = { "Bahing", 56442, "sit-kiw", "Deva, Latn", translit = { Deva = "Deva-translit", }, } m["bhl"] = { "Bimin", 4913743, "ngf-okk", "Latn", } m["bhm"] = { "Bathari", 2586893, "sem-sar", "Arab, Latn", } m["bhn"] = { "Bohtan Neo-Aramaic", 33230, "sem-nna", "Syrc", } m["bho"] = { "โภชปุระ", 33268, "inc-bih", "Deva, Kthi", wikimedia_codes = "bh", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, } m["bhp"] = { "Bima", 2796873, "poz-cet", "Latn", } m["bhq"] = { "Tukang Besi South", 12643975, "poz-mun", "Latn", } m["bhs"] = { "Buwal", 3515065, "cdc-cbm", "Latn", } m["bht"] = { "Bhattiyali", 4901452, "him", "Deva", translit = { Deva = "Deva-translit", }, } m["bhu"] = { "Bhunjia", 8841766, "inc-hal", "Deva, Orya", translit = { Deva = "Deva-translit", Orya = "Orya-translit", }, } m["bhv"] = { "Bahau", 3502039, "poz", "Latn", } m["bhw"] = { "Biak", 1961488, "poz-hce", "Latn", } m["bhx"] = { -- spurious? "Bhalay", 8840773, "inc", } m["bhy"] = { "Bhele", 4901671, "bnt-kbi", "Latn", } m["bhz"] = { "Bada", 4840520, "poz-kal", "Latn", } m["bia"] = { "Badimaya", 3442745, "aus-psw", "Latn", } m["bib"] = { "Bissa", 32934, "dmn-bbu", "Latn", } m["bic"] = { "Bikaru", 56342, "ngf-eng", "Latn", } m["bid"] = { "Bidiyo", 56258, "cdc-est", "Latn", } m["bie"] = { "Bepour", 4890914, "ngf-kum", "Latn", } m["bif"] = { "Biafada", 35099, "alv-ten", "Latn", } m["big"] = { "Biangai", 8842027, "paa-kun", "Latn", } m["bij"] = { "Kwanka", 35598, "nic-tar", "Latn", } m["bil"] = { "Bile", 34987, "nic-jrn", "Latn", } m["bim"] = { "Bimoba", 34971, "nic-grm", "Latn", } m["bin"] = { "Edo", 35375, "alv-eeo", "Latn", strip_diacritics = {remove_diacritics = c.acute .. 
c.grave .. c.macron .. c.dgrave}, sort_key = { from = {"ẹ", "gb", "gh", "kh", "kp", "mw", "nw", "ny", "ọ", "rh", "rr", "vb"}, to = {"e" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "k" .. p[2], "m" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "r" .. p[1], "r" .. p[2], "v" .. p[1]} }, } m["bio"] = { "Nai", 3508074, "paa-kwm", "Latn", } m["bip"] = { "Bila", 2902626, "bnt-kbi", "Latn", } m["biq"] = { "Bipi", 2904312, "poz-aay", "Latn", } m["bir"] = { "Bisorio", 8844749, "ngf-eng", "Latn", } m["bit"] = { "Berinomo", 56447, "paa-spk", "Latn", } m["biu"] = { "Biete", 4904687, "tbq-kuk", "Latn", } m["biv"] = { "Southern Birifor", 32859745, "nic-mre", "Latn", } m["biw"] = { "Kol (Cameroon)", 35582, "bnt-mka", "Latn", } m["bix"] = { "Bijori", 3450686, "mun", "Deva", translit = { Deva = "Deva-translit", }, } m["biy"] = { "Birhor", 3450469, "mun", "Deva", translit = { Deva = "Deva-translit", }, } m["biz"] = { "Baloi", 3450590, "bnt-ngn", "Latn", } m["bja"] = { "Budza", 3046889, "bnt-bun", "Latn", } m["bjb"] = { "Barngarla", 3439071, "aus-pam", "Latn", } m["bjc"] = { "Bariji", 4690919, "ngf-yar", "Latn", } m["bje"] = { "Biao-Jiao Mien", 3503800, "hmx-mie", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["bjf"] = { "Barzani Jewish Neo-Aramaic", 33234, "sem-nna", "Hebr", -- maybe others -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["bjg"] = { "Bidyogo", 35365, "alv-bak", "Latn", } m["bjh"] = { "Bahinemo", 56361, "paa-spk", "Latn", } m["bji"] = { "Burji", 34999, "cus-hec", "Latn, Ethi", } m["bjj"] = { "Kannauji", 2726867, "inc-hiw", "Deva", translit = { Deva = "Deva-translit", }, } m["bjk"] = { "Barok", 2884743, "poz-ocw", "Latn", } m["bjl"] = { "Bulu (New Guinea)", 4997162, "poz-ocw", "Latn", } m["bjm"] = { "Bajelani", 4848866, "ira-zgr", "Latn, Arab", ancestors = "hac", } m["bjn"] = { "บันจาร์", 33151, "poz-mly", "Latn, Arab", } m["bjo"] = { "Mid-Southern Banda", 42303990, "bad-cnt", "Latn", } m["bjp"] = { "Fanamaket", 56704263, 
"poz-oce", "Latn", } m["bjr"] = { "Binumarien", 538364, "ngf-kag", "Latn", } m["bjs"] = { "Bajan", 2524014, "crp", "Latn", ancestors = "en", } m["bjt"] = { "Balanta-Ganja", 19359034, "alv-bak", "Arab, Latn", } m["bju"] = { "Busuu", 35046, "nic-fru", "Latn", } m["bjv"] = { "Bedjond", 8829831, "csu-sar", "Latn", } m["bjw"] = { "Bakwé", 34899, "kro-ekr", "Latn", } m["bjx"] = { "Banao Itneg", 12627559, "phi", "Latn", } m["bjy"] = { "Bayali", 4874263, "aus-pam", "Latn", } m["bjz"] = { "Baruga", 2886189, "paa-bin", "Latn", } m["bka"] = { "Kyak", 35653, "alv-bwj", "Latn", } m["bkc"] = { "Baka", 34905, "nic-nkb", "Latn", } m["bkd"] = { "บีนูกิด", 4914553, "mno", "Latn", } m["bkf"] = { "Beeke", 3441375, "bnt-kbi", "Latn", } m["bkg"] = { "Buraka", 35066, "nic-nkg", "Latn", } m["bkh"] = { "Bakoko", 34866, "bnt-bsa", "Latn", } m["bki"] = { "Baki", 11024697, "poz-vnc", "Latn", } m["bkj"] = { "Pande", 36263, "bnt-ngn", "Latn", } m["bkk"] = { -- written in Balti script "Brokskat", 2925988, "inc-shn", "Tibt, Arab", -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- (NOTE: formerly not present, probably an accidental omission) } m["bkl"] = { "Berik", 378743, "paa-tkw", "Latn", } m["bkm"] = { "Kom (Cameroon)", 1656595, "nic-rnc", "Latn", } m["bkn"] = { "Bukitan", 3446774, "poz-bnn", "Latn", } m["bko"] = { "Kwa'", 35567, "bai", "Latn", } m["bkp"] = { "Iboko", 35089, "bnt-ngn", "Latn", } m["bkq"] = { "Bakairí", 56846, "sai-pek", "Latn", } m["bkr"] = { "Bakumpai", 3436626, "poz-brw", "Latn", } m["bks"] = { "Masbate Sorsogon", 16113356, "phi", "Latn", } m["bkt"] = { "Boloki", 4144560, "bnt-zbi", "Latn", ancestors = "lse", } m["bku"] = { "Buhid", 1002956, "phi", "Latn, Buhd", translit = { Buhd = "bku-translit", }, override_translit = true, strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ, } }, sort_key = { Latn = "tl-sortkey", }, standard_chars = { Latn = "AaBbKkDdEeFfGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. 
c.punc, }, } m["bkv"] = { "Bekwarra", 34954, "nic-ben", "Latn", } m["bkw"] = { "Bekwel", 34950, "bnt-bek", "Latn", } m["bkx"] = { "Baikeno", 11200640, "poz-tim", "Latn", } m["bky"] = { "Bokyi", 35087, "nic-ben", "Latn", } m["bkz"] = { "Bungku", 2928207, "poz-btk", "Latn", } m["bla"] = { "Blackfoot", 33060, "alg", "Latn, Cans", } m["blb"] = { "Bilua", 35003, "qfa-dis", -- Papuan; isolate per Glottolog, Central Solomon per Ross (2005) and Pedrós (2015) "Latn", } m["blc"] = { "Bella Coola", 977808, "sal", "Latn", } m["bld"] = { "Bolango", 3450578, "phi", "Latn", } m["ble"] = { "Balanta-Kentohe", 56789, "alv-bak", "Latn", } m["blf"] = { "Buol", 2928278, "phi", "Latn", } m["blg"] = { "Balau", 4850134, "poz-mly", "Latn", } m["blh"] = { "Kuwaa", 35579, "kro", "Latn", } m["bli"] = { "Bolia", 34910, "bnt-mon", "Latn", } m["blj"] = { "Bulungan", 9229310, "poz", "Latn", } m["blk"] = { "กะเหรี่ยงปะโอ", 7121294, "kar", "Mymr", } m["bll"] = { "Biloxi", 2903780, "sio-ohv", "Latn", } m["blm"] = { "Beli", 56821, "csu-bbk", "Latn", } m["bln"] = { "Southern Catanduanes Bicolano", 7569754, "phi", "Latn", } m["blo"] = { "Anii", 34838, "alv-ntg", "Latn", } m["blp"] = { "Blablanga", 2905245, "poz-ocw", "Latn", } m["blq"] = { "Baluan-Pam", 2881675, "poz-aay", "Latn", } m["blr"] = { "Blang", 4925096, "mkh-pal", "Latn, Tale, Lana, Thai", sort_key = { -- FIXME: This needs to be converted into the current standardized format. 
from = {"[%pᪧๆ]", "[᩠ᩳ-᩿]", "ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ", "[็-๎]", "([เแโใไ])([ก-ฮ])"}, to = {"", "", "ᩈᩈ", "ᩁ", "ᩃ", "ᨦ", "%1ᨮ", "%1ᨻ", "ᩣ", "", "%2%1"} }, } m["bls"] = { "Balaesang", 4849796, "poz", "Latn", } m["blt"] = { "ไทดำ", 56407, "tai-swe", "Tavt, Latn", translit = "Tavt-translit", sort_key = { Tavt = { from = {"[꪿ꫀ꫁ꫂ]", "([ꪵꪶꪹꪻꪼ])([ꪀ-ꪯ])"}, to = {"", "%2%1"} }, }, } m["blv"] = { "Kibala", 4939959, "bnt-kmb", "Latn", } m["blw"] = { "Balangao", 4850033, "phi", "Latn", } m["blx"] = { "Mag-Indi Ayta", 1931221, "phi", "Latn", } m["bly"] = { "Notre", 11009194, "nic-wov", "Latn", } m["blz"] = { "Balantak", 4850053, "poz-slb", "Latn", } m["bma"] = { "Lame", 3913997, "nic-jrn", "Latn", } m["bmb"] = { "Bembe", 4885023, "bnt-lgb", "Latn", } m["bmc"] = { "Biem", 4904523, "poz-ocw", "Latn", } m["bmd"] = { "Baga Manduri", 35815, "alv-bag", "Latn", } m["bme"] = { "Limassa", 11004666, "nic-nkb", "Latn", } m["bmf"] = { "Bom", 35088, "alv-mel", "Latn", } m["bmg"] = { "Bamwe", 34867, "bnt-bun", "Latn", } m["bmh"] = { "Kein", 6383764, "ngf-kok", "Latn", } m["bmi"] = { "Bagirmi", 34903, "csu-bgr", "Latn", } m["bmj"] = { "Bote-Majhi", 9229570, "inc-eas", "Deva", ancestors = "bh", translit = { Deva = "Deva-translit", }, } m["bmk"] = { "Ghayavi", 5555976, "poz-ocw", "Latn", } m["bml"] = { "Bomboli", 35055, "bnt-ngn", "Latn", } m["bmn"] = { "Bina", 8843664, "poz-ocw", "Latn", } m["bmo"] = { "Bambalang", 34868, "nic-nun", "Latn", } m["bmp"] = { "Bulgebi", 4996380, "ngf-fin", "Latn", } m["bmq"] = { "Bomu", 35065, "nic-bwa", "Latn", } m["bmr"] = { "Muinane", 3027894, "sai-bor", "Latn", } m["bmt"] = { "Biao Mon", 8842159, "hmx-mie", } m["bmu"] = { "Somba-Siawari", 5000983, "ngf-huo", "Latn", } m["bmv"] = { "Bum", 35058, "nic-rnc", "Latn", } m["bmw"] = { "Bomwali", 34984, "bnt-ndb", "Latn", } m["bmx"] = { "Baimak", 3450546, "ngf-han", "Latn", } m["bmz"] = { "Baramu", 4858315, "paa-ani", "Latn", } m["bna"] = { "Bonerate", 4941729, "poz-mun", "Latn", } m["bnb"] = { 
"Bookan", 4943150, "poz-san", "Latn", } m["bnd"] = { "Banda", 3504147, "poz-cma", "Latn", } m["bne"] = { "Bintauna", 4914533, "phi", "Latn", } m["bnf"] = { "Masiwang", 6783305, "poz-cma", "Latn", } m["bng"] = { "Benga", 34952, "bnt-saw", "Latn", } m["bni"] = { "Bangi", 34936, "bnt-bmo", "Latn", } m["bnj"] = { "Eastern Tawbuid", 18757427, "phi", "Latn", } m["bnk"] = { "Bierebo", 2902029, "poz-vnc", "Latn", } m["bnl"] = { "Boon", 56616, "cus-eas", "Latn", } m["bnm"] = { "Batanga", 34979, "bnt-saw", "Latn", } m["bnn"] = { "Bunun", 56505, "map", "Latn", } m["bno"] = { "อาซี", 29490, "phi", "Latn", } m["bnp"] = { "Bola", 4938876, "poz-ocw", "Latn", } m["bnq"] = { "Bantik", 2883521, "poz", "Latn", } m["bnr"] = { "Butmas-Tur", 2928942, "poz-vnn", "Latn", } m["bns"] = { "Bundeli", 56399, "inc-hiw", "Deva", translit = { Deva = "Deva-translit", }, } m["bnu"] = { "Bentong", 4890644, "poz-ssw", "Latn", } m["bnv"] = { "Beneraf", 4941733, "paa-tkw", "Latn", } m["bnw"] = { "Bisis", 56356, "paa-spk", "Latn", } m["bnx"] = { "Bangubangu", 3438330, "bnt-lbn", "Latn", } m["bny"] = { "Bintulu", 3450775, "poz-swa", "Latn", } m["bnz"] = { "Beezen", 35083, "nic-ykb", "Latn", } m["boa"] = { "Bora", 2375468, "sai-bor", "Latn", } m["bob"] = { "Aweer", 56526, "cus-som", "Latn", } m["boe"] = { "Mundabli", 36127, "nic-beb", "Latn", } m["bof"] = { "Bolon", 3913301, "dmn-emn", "Latn", } m["bog"] = { "Bamako Sign Language", 4853284, "sgn", } m["boh"] = { "North Boma", 35080, "bnt-bdz", "Latn", } m["boi"] = { "Barbareño", 56391, "nai-chu", "Latn", } m["boj"] = { "Anjam", 3504136, "ngf-min", "Latn", } m["bok"] = { "Bonjo", 34942, "alv", "Latn", } m["bol"] = { "โบล", 3436680, "cdc-wst", "Latn", } m["bom"] = { "Berom", 35013, "nic-beo", "Latn", } m["bon"] = { "Bine", 4914077, "paa-etf", "Latn", } m["boo"] = { "Tiemacèwè Bozo", 12643582, "dmn-snb", "Latn", -- and others? 
} m["bop"] = { "Bonkiman", 4942134, "ngf-fin", "Latn", } m["boq"] = { "Bogaya", 7207578, "qfa-dis", -- Papuan; isolate per Glottolog, grouped in Duna-Pogaya family by Voorhoeve (1975), Ross (2005) and Usher (2018) "Latn", } m["bor"] = { "Borôro", 32986, "sai-mje", "Latn", } m["bot"] = { "Bongo", 2910067, "csu-bbk", "Latn", } m["bou"] = { "Bondei", 4941378, "bnt-seu", "Latn", } m["bov"] = { "Tuwuli", 36974, "alv-ktg", "Latn", } m["bow"] = { "Rema", 7311502, "paa-yam", "Latn", } m["box"] = { "Buamu", 35157, "nic-bwa", "Latn", } m["boy"] = { "Bodo (Central Africa)", 4936715, "bnt-leb", "Latn", } m["boz"] = { "Tiéyaxo Bozo", 32860401, "dmn-snb", "Latn", } m["bpa"] = { "Daakaka", 1157729, "poz-vnc", "Latn", } m["bpd"] = { "Banda-Banda", 3450674, "bad-cnt", "Latn", } -- bpe (Bauni, Papua New Guinea): not yet accepted; in the Sko/Skou family m["bpg"] = { "Bonggo", 4941860, "poz-ocw", "Latn", } m["bph"] = { "Botlikh", 56560, "cau-and", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, strip_diacritics = {Cyrl = s["cau-Cyrl-stripdiacritics"]}, } m["bpi"] = { "Bagupi", 3450697, "ngf-han", "Latn", } m["bpj"] = { "Binji", 4914403, "bnt-lbn", "Latn", } m["bpk"] = { "Orowe", 7103905, "poz-cln", "Latn", } m["bpl"] = { "Broome Pearling Lugger Pidgin", 4975277, "crp", "Latn", ancestors = "ms", } m["bpm"] = { "Biyom", 4919327, "ngf-rai", "Latn", } m["bpn"] = { "Dzao Min", 3042189, "hmx-mie", } m["bpo"] = { "Anasi", 11207813, "paa-egb", "Latn", } m["bpp"] = { "Kaure", 20526532, "paa-kko", "Latn", } m["bpq"] = { "Banda Malay", 12473442, "crp", "Latn", ancestors = "ms", } m["bpr"] = { "Koronadal Blaan", 16115430, "phi", "Latn", } m["bps"] = { "Sarangani Blaan", 16117272, "phi", "Latn", } m["bpt"] = { "Barrow Point", 2567916, "aus-pmn", "Latn", } m["bpu"] = { "Bongu", 4941930, "ngf-min", "Latn", } m["bpv"] = { "Bian Marind", 8841889, "paa-ani", "Latn", } -- bpw: Bo (Papua New Guinea): pending acceptance; per Wikipedia: 
"It is essentially undocumented, and its status as a separate language is unconfirmed." m["bpx"] = { "Palya Bareli", 7128872, "inc-bhi", "Deva", translit = { Deva = "Deva-translit", }, } m["bpy"] = { "Bishnupriya Manipuri", 37059, "inc-bas", "Beng", ancestors = "inc-obn", } m["bpz"] = { "Bilba", 8843362, "poz-tim", "Latn", } m["bqa"] = { "Tchumbuli", 11008162, "alv-ctn", "Latn", ancestors = "ak", } m["bqb"] = { "Bagusa", 4842178, "paa-tkw", "Latn", } m["bqc"] = { "Boko", 34983, "dmn-bbu", "Latn", } m["bqd"] = { "Bung", 3436612, "nic-bdn", "Latn", } m["bqf"] = { "Baga Kaloum", 3502293, "alv-bag", "Latn", } m["bqg"] = { "Bago-Kusuntu", 34878, "nic-gne", } m["bqh"] = { "Baima", 674990, "sit-qia", } m["bqi"] = { "Bakhtiari", 257829, "ira-swi", "fa-Arab", ancestors = "pal", } m["bqj"] = { "Bandial", 34872, "alv-jol", "Latn", } m["bqk"] = { "Banda-Mbrès", 3450724, "bad-cnt", "Latn", } m["bql"] = { "Bilakura", 4907504, "ngf-num", "Latn", } m["bqm"] = { "Wumboko", 37051, "bnt-kpw", "Latn", } m["bqn"] = { "Bulgarian Sign Language", 3438325, "sgn", } m["bqo"] = { "Balo", 34865, "nic-grs", "Latn", } m["bqp"] = { "Busa", 35185, "dmn-bbu", "Latn", } m["bqq"] = { "Biritai", 56382, "paa-lkp", "Latn", } m["bqr"] = { "Burusu", 5001028, "poz-san", "Latn", } m["bqs"] = { "Bosngun", 56838, "paa-ram", "Latn", } m["bqt"] = { "Bamukumbit", 35078, "nic-nge", "Latn", } m["bqu"] = { "Boguru", 3438444, "bnt-boa", "Latn", } m["bqv"] = { "Begbere-Ejar", 7194098, "nic-plc", "Latn", } m["bqw"] = { "Buru (Nigeria)", 1017152, "nic-bds", "Latn", } m["bqx"] = { "Baangi", 3450648, "nic-kam", "Latn", } m["bqy"] = { "Bengkala Sign Language", 3322119, "sgn", } m["bqz"] = { "Bakaka", 34855, "bnt-mne", "Latn", } m["bra"] = { "พรัช", 35243, "inc-hiw", "Deva", translit = { Deva = "Deva-translit", }, } m["brb"] = { "Lave", 4957737, "mkh-ban", } m["brc"] = { "Berbice Creole Dutch", 35215, "crp", "Latn", ancestors = "nl", } m["brd"] = { "Baraamu", 56804, "sit-new", "Deva", translit = { Deva = "Deva-translit", 
}, } m["brf"] = { "Bera", 2896850, "bnt-kbi", "Latn", } m["brg"] = { "Baure", 2839722, "awd", "Latn", } m["brh"] = { "บราฮุอี", 33202, "dra-nor", "ur-Arab, Latn", translit = {["ur-Arab"] = "ur-translit"}, strip_diacritics = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, } m["bri"] = { "Mokpwe", 36428, "bnt-kpw", "Latn", } m["brj"] = { "Bieria", 4904607, "poz-vnc", "Latn", } m["brk"] = { "Birgid", 56823, "nub", "Latn", } m["brl"] = { "Birwa", 3501019, "bnt-sts", "Latn", } m["brm"] = { "Barambu", 34893, "znd", "Latn", } m["brn"] = { "Boruca", 4946773, "cba", "Latn", } m["bro"] = { "Brokkat", 56605, "sit-tib", "Tibt, Latn", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["brp"] = { "Barapasi", 56995, "paa-egb", "Latn", } m["brq"] = { "Breri", 4961835, "paa-ram", "Latn", } m["brr"] = { "Birao", 2904383, "poz-sls", "Latn", } m["brs"] = { "Baras", 8827053, "poz", "Latn", } m["brt"] = { "Bitare", 34946, "nic-tvn", "Latn", } m["bru"] = { "บรูตะวันออก", 16115463, "mkh-kat", "Latn, Laoo, Thai", --sort_key = { -- Laoo = "Laoo-sortkey", -- Thai = "Thai-sortkey", --}, } m["brv"] = { "บรูตะวันตก", 13018531, "mkh-kat", "Latn, Laoo, Thai", --sort_key = { -- Laoo = "Laoo-sortkey", -- Thai = "Thai-sortkey", --}, } m["brw"] = { "Bellari", 4883496, "dra-tlk", "Knda, Mlym", -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["brx"] = { "โบโด", 33223, "tbq-bdg", "Deva, Latn", translit = { Deva = "Deva-translit", }, } m["bry"] = { "Burui", 5000976, "paa-ndu", "Latn", } m["brz"] = { "Bilbil", 4907473, "poz-ocw", "Latn", } m["bsa"] = { "Abinomn", 56648, "qfa-iso", -- Papuan "Latn", } m["bsb"] = { "Brunei Bisaya", 3450611, "poz-san", 
"Latn", } m["bsc"] = { "Bassari", 35098, "alv-ten", "Latn", } m["bse"] = { "Wushi", 36973, "nic-rnn", "Latn", } m["bsf"] = { "Bauchi", 34974, "nic-shi", "Latn", } m["bsg"] = { "Bashkardi", 33030, "ira-swi", "fa-Arab, Latn", } m["bsh"] = { "Kamkata-viri", 2605045, "nur-nor", "Latn, Arab", } m["bsi"] = { "Bassossi", 34940, "bnt-mne", "Latn", } m["bsj"] = { "Bangwinji", 3446631, "alv-wjk", "Latn", } m["bsk"] = { "Burushaski", 216286, "qfa-iso", "Arab", strip_diacritics = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, } m["bsl"] = { "Basa-Gumna", 4866150, "nic-bas", "Latn", } m["bsm"] = { "Busami", 5001255, "poz-hce", "Latn", } m["bsn"] = { "Barasana", 2883843, "sai-tuc", "Latn", } m["bso"] = { "Buso", 3441370, "cdc-est", "Latn", } m["bsp"] = { "Baga Sitemu", 36466, "alv-bag", "Latn", } m["bsq"] = { "Bassa", 34949, "kro-wkr", "Latn, Bass", } m["bsr"] = { "Bassa-Kontagora", 4866152, "nic-bas", "Latn", } m["bss"] = { "Akoose", 34806, "bnt-mne", "Latn", } m["bst"] = { "Basketo", 56531, "omv-ome", "Ethi", } m["bsu"] = { "Bahonsuai", 2879298, "poz-btk", "Latn", } m["bsv"] = { "Baga Sobané", 3450433, "alv-bag", "Latn", } m["bsw"] = { "Baiso", 56615, "cus-som", "Latn", } m["bsx"] = { "Yangkam", 36922, "nic-tar", "Latn", } m["bsy"] = { "Sabah Bisaya", 12641557, "poz-san", "Latn", } m["bta"] = { "Bata", 56254, "cdc-cbm", "Latn", } m["btc"] = { "Bati (Cameroon)", 34944, "nic-mbw", "Latn", } m["btd"] = { "Dairi Batak", 2891045, "btk", "Latn, Batk", } m["bte"] = { "Gamo-Ningi", 5520366, "nic-jer", "Latn", } m["btf"] = { "Birgit", 56302, "cdc-est", "Latn", } m["btg"] = { "Gagnoa Bété", 5005069, "kro-bet", "Latn", } m["bth"] = { "Biatah Bidayuh", 2900881, "day", "Latn", } m["bti"] = { "Burate", 56900, "paa-egb", "Latn", } m["btj"] = 
{ "Bacanese Malay", 8828608, "poz-mly", "Latn", } m["btm"] = { "Mandailing Batak", 2891049, "btk", "Latn, Batk", } m["btn"] = { "Ratagnon", 13197, "phi", "Latn", } m["bto"] = { "Iriga Bicolano", 12633026, "phi", "Latn", } m["btp"] = { "Budibud", 4985086, "poz-ocw", "Latn", } m["btq"] = { "Batek", 860315, "mkh-asl", "Latn", } m["btr"] = { "Baetora", 2878874, "poz-vnn", "Latn", } m["bts"] = { "Simalungun Batak", 2891054, "btk", "Latn, Batk", } m["btt"] = { "Bete-Bendi", 4887064, "nic-ben", "Latn", } m["btu"] = { "Batu", 34964, "nic-tvn", "Latn", } m["btv"] = { "Bateri", 3812564, "inc-koh", "Deva", translit = { Deva = "Deva-translit", }, } m["btw"] = { "Butuanon", 5003156, "phi", "Latn", } m["btx"] = { "Karo Batak", 33012, "btk", "Latn, Batk", } m["bty"] = { "Bobot", 3446788, "poz-cma", "Latn", } m["btz"] = { "Alas-Kluet Batak", 2891042, "btk", "Latn, Batk", } m["bua"] = { "บูร์ยัต", 33120, "xgn-cen", "Cyrl, Mong, Latn", wikimedia_codes = "bxr", ancestors = "cmg", translit = { --Cyrl = "bua-translit", -- Mong translit in [[Module:scripts/data]] }, override_translit = true, -- Mong display_text and strip_diacritics in [[Module:scripts/data]] strip_diacritics = { Cyrl = {remove_diacritics = c.grave .. c.acute}, }, sort_key = { Cyrl = { from = {"ё", "ө", "ү", "һ"}, to = {"е" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. 
p[1]} }, }, } m["bub"] = { "Bua", 32928, "alv-bua", "Latn", } m["bud"] = { "Ntcham", 36266, "nic-grm", "Latn", } m["bue"] = { "Beothuk", 56234, "qfa-unc", -- extinct since 1829, poorly attested; possibly a divergent Algonquian language "Latn", } m["buf"] = { "Bushoong", 3449964, "bnt-bsh", "Latn", } m["bug"] = { "บูกิส", 33190, "poz-ssw", "Bugi, Latn", } m["buh"] = { "Younuo Bunu", 56299, "hmn", "Latn", } m["bui"] = { "Bongili", 35084, "bnt-ngn", "Latn", } m["buj"] = { "Basa-Gurmana", 6432515, "nic-bas", "Latn", } m["buk"] = { "Bukawa", 35043, "poz-ocw", "Latn", } m["bum"] = { "Bulu (Cameroon)", 35028, "bnt-btb", "Latn", } m["bun"] = { "Sherbro", 36339, "alv-mel", "Latn", } m["buo"] = { "Terei", 56831, "paa-sbo", "Latn", } m["bup"] = { "Busoa", 5002001, "poz", "Latn", } m["buq"] = { "Brem", 4960502, "ngf-nad", "Latn", } m["bus"] = { "Bokobaru", 9228931, "dmn-bbu", "Latn", } m["but"] = { "Bungain", 3450623, "paa-tor", "Latn", } m["buu"] = { "Budu", 3450207, "bnt-nya", "Latn", } m["buv"] = { "Bun", 56351, "paa-yua", "Latn", } m["buw"] = { "Bubi", 35017, "bnt-tso", "Latn", } m["bux"] = { "Boghom", 3440412, "cdc-wst", "Latn", } m["buy"] = { "Mmani", 35061, "alv-mel", "Latn", } m["bva"] = { "Barein", 56285, "cdc-est", "Latn", } m["bvb"] = { "Bube", 35110, "nic-bds", "Latn", } m["bvc"] = { "Baelelea", 2878833, "poz-sls", "Latn", } m["bvd"] = { "Baeggu", 2878850, "poz-sls", "Latn", } m["bve"] = { "Berau Malay", 3915770, "poz-mly", "Latn", } m["bvf"] = { "Boor", 56250, "cdc-est", "Latn", } m["bvg"] = { "Bonkeng", 34958, "bnt-bbo", "Latn", } m["bvh"] = { "Bure", 56294, "cdc-wst", "Latn", } m["bvi"] = { "Belanda Viri", 35247, "nic-ser", "Latn", } m["bvj"] = { "Baan", 3515067, "nic-ogo", "Latn", } m["bvk"] = { "Bukat", 4986814, "poz-bnn", "Latn", } m["bvl"] = { "Bolivian Sign Language", 1783590, "sgn", "Latn", -- when documented } m["bvm"] = { "Bamunka", 34882, "nic-rnn", "Latn", } m["bvn"] = { "Buna", 3450516, "paa-tor", "Latn", } m["bvo"] = { "Bolgo", 35038, "alv-bua", 
"Latn", } m["bvp"] = { "Bumang", 4997235, "mkh-pal", } m["bvq"] = { "Birri", 56514, "csu-bkr", "Latn", } m["bvr"] = { "Burarra", 4998124, "aus-arn", "Latn", } m["bvt"] = { "Bati (Indonesia)", 4869253, "poz-cma", "Latn", } m["bvu"] = { "Bukit Malay", 9230148, "poz-mly", "Latn", } m["bvv"] = { "Baniva", 3515198, "awd", "Latn", } m["bvw"] = { "Boga", 56262, "cdc-cbm", "Latn", } m["bvx"] = { "Babole", 35180, "bnt-ngn", "Latn", } m["bvy"] = { "Baybayanon", 16839275, "phi", "Latn", } m["bvz"] = { "Bauzi", 56360, "paa-egb", "Latn", } m["bwa"] = { "Bwatoo", 9232446, "poz-cln", "Latn", } m["bwb"] = { "Namosi-Naitasiri-Serua", 3130290, "poz-pcc", "Latn", } m["bwc"] = { "Bwile", 3447440, "bnt-sbi", "Latn", } m["bwd"] = { "Bwaidoka", 2929111, "poz-ocw", "Latn", } m["bwe"] = { "Bwe Karen", 56994, "kar", "Mymr, Latn", } m["bwf"] = { "Boselewa", 4947229, "poz-ocw", "Latn", } m["bwg"] = { "Barwe", 8826802, "bnt-sna", "Latn", } m["bwh"] = { "Bishuo", 34973, "nic-fru", "Latn", } m["bwi"] = { "Baniwa", 3501735, "awd-nwk", "Latn", } m["bwj"] = { "Láá Láá Bwamu", 11017275, "nic-bwa", "Latn", } m["bwk"] = { "Bauwaki", 4873607, "paa-mal", "Latn", } m["bwl"] = { "Bwela", 5003678, "bnt-bun", "Latn", } m["bwm"] = { "Biwat", 56352, "paa-yua", "Latn", } m["bwn"] = { "Wunai Bunu", 56452, "hmn", } m["bwo"] = { "Shinasha", 56260, "omv-gon", "Latn", } m["bwp"] = { "Mandobo Bawah", 12636155, "ngf-gaw", "Latn", } m["bwq"] = { "Southern Bobo", 11001714, "dmn-snb", "Latn", } m["bwr"] = { "Bura", 56552, "cdc-cbm", "Latn", } m["bws"] = { "Bomboma", 9229429, "bnt-bun", "Latn", } m["bwt"] = { "Bafaw", 34853, "bnt-bbo", "Latn", } m["bwu"] = { "Buli (Ghana)", 35085, "nic-buk", "Latn", } m["bww"] = { "Bwa", 3515058, "bnt-bta", "Latn", } m["bwx"] = { "Bu-Nao Bunu", 56411, "hmn", "Latn", } m["bwy"] = { "Cwi Bwamu", 11150714, "nic-bwa", "Latn", } m["bwz"] = { "Bwisi", 35067, "bnt-sir", "Latn", } m["bxa"] = { "Bauro", 2892068, "poz-sls", "Latn", } m["bxb"] = { "Belanda Bor", 56678, "sdv-lon", "Latn", } m["bxc"] 
= { "Molengue", 13345, "bnt-kel", "Latn", } m["bxd"] = { "Pela", 57000, "tbq-brm", } m["bxe"] = { "Ongota", 36344, "qfa-unc", -- moribund, no academic consensus on classification; might be an isolate "Latn", } m["bxf"] = { "Bilur", 2903788, "poz-ocw", "Latn", } m["bxg"] = { "Bangala", 34989, "bnt-bmo", "Latn", } m["bxh"] = { "Buhutu", 4986329, "poz-ocw", "Latn", } m["bxi"] = { "Pirlatapa", 10632195, "aus-kar", "Latn", } m["bxj"] = { "Bayungu", 10427485, "aus-psw", "Latn", } m["bxk"] = { "Bukusu", 32930, "bnt-msl", "Latn", } m["bxl"] = { "Jalkunan", 11009787, "dmn-jje", "Latn", } m["bxn"] = { "Burduna", 4998313, "aus-psw", "Latn", } m["bxo"] = { "Barikanchi", 3450802, "crp", "Latn", ancestors = "ha", } m["bxp"] = { "Bebil", 34941, "bnt-btb", "Latn", } m["bxq"] = { "Beele", 56238, "cdc-wst", "Latn", } m["bxs"] = { "Busam", 35189, "nic-grs", "Latn", } m["bxv"] = { "Berakou", 56796, "csu-bgr", "Latn", } m["bxw"] = { "Banka", 3438402, "dmn-smg", "Latn", } m["bxz"] = { "Binahari", 4913840, "paa-mal", "Latn", } m["bya"] = { "Palawan Batak", 3450443, "phi", "Tagb", } m["byb"] = { "Bikya", 33257, "nic-fru", "Latn", } m["byc"] = { "Ubaghara", 36625, "nic-ucn", "Latn", } m["byd"] = { "Benyadu'", 11173588, "day", "Latn", } m["bye"] = { "Pouye", 7235814, "paa-spk", "Latn", } m["byf"] = { "Bete", 32932, "nic-ykb", "Latn", } m["byg"] = { "Baygo", 56836, "sdv-daj", "Latn", } m["byh"] = { "Bujhyal", 56317, "sit-gma", "Deva", translit = { Deva = "Deva-translit", }, } m["byi"] = { "Buyu", 5003401, "bnt-nyb", "Latn", } m["byj"] = { "Binawa", 4913807, "nic-kau", "Latn", } m["byk"] = { "Biao", 4902547, "qfa-tak", "Latn", -- also Hani? 
} m["byl"] = { "Bayono", 3503856, "paa-baw", "Latn", } m["bym"] = { "Bidyara", 8842355, "aus-pam", "Latn", } m["byn"] = { "Blin", 56491, "cus-cen", "Ethi, Latn", translit = {Ethi = "Ethi-translit"}, } m["byo"] = { "Biyo", 56848, "tbq-bka", "Latn, Hani", sort_key = {Hani = "Hani-sortkey"}, } m["byp"] = { "Bumaji", 4997234, "nic-ben", "Latn", } m["byq"] = { "Basay", 716647, "map", "Latn", } m["byr"] = { "Baruya", 3450812, "ngf-ang", "Latn", } m["bys"] = { "Burak", 4998097, "alv-bwj", "Latn", } m["byt"] = { "Berti", 35008, "ssa-sah", "Latn", } m["byv"] = { "Medumba", 36019, "bai", "Latn", } m["byw"] = { "Belhariya", 32961, "sit-kie", "Deva", translit = { Deva = "Deva-translit", }, } m["byx"] = { "Qaqet", 3503009, "paa-bng", "Latn", } m["byz"] = { "Banaro", 56858, "paa-ram", "Latn", } m["bza"] = { "Bandi", 34912, "dmn-msw", "Latn", } m["bzb"] = { "Andio", 4754487, "poz-slb", "Latn", } m["bzd"] = { "Bribri", 28400, "cba", "Latn", } m["bze"] = { "Jenaama Bozo", 10950633, "dmn-snb", "Latn", } m["bzf"] = { "Boikin", 56829, "paa-ndu", "Latn", } m["bzg"] = { "Babuza", 716615, "map", "Latn", } m["bzh"] = { "Mapos Buang", 2927370, "poz-ocw", "Latn", } m["bzi"] = { "บีซู", 56852, "tbq-bis", "Latn, Thai", --sort_key = {Thai = "Thai-sortkey"}, } m["bzj"] = { "Belizean Creole", 1363055, "crp", "Latn", ancestors = "en", } m["bzk"] = { "Nicaraguan Creole", 3504097, "crp", "Latn", ancestors = "en", } m["bzl"] = { -- supposedly also called "Bolano", but I can find no evidence of that "Boano (Sulawesi)", 4931258, "poz", "Latn", } m["bzm"] = { "Bolondo", 35071, "bnt-bun", "Latn", } m["bzn"] = { "Boano (Maluku)", 4931255, "poz-cma", "Latn", } m["bzo"] = { "Bozaba", 4952785, "bnt-ngn", "Latn", } m["bzp"] = { "Kemberano", 12634399, "ngf-sbh", "Latn", } m["bzq"] = { "Buli (Indonesia)", 2927952, "poz-hce", "Latn", } m["bzr"] = { "Biri", 4087011, "aus-pam", "Latn", } m["bzs"] = { "Brazilian Sign Language", 3436689, "sgn", "Latn", } m["bzu"] = { "Burmeso", 56746, "qfa-dis", -- isolate in 
Glottolog, Wurm and Foley; in East Bird's Head-Sentani family by Ross "Latn", } m["bzv"] = { "Bebe", 34977, "nic-bbe", "Latn", } m["bzw"] = { "Basa", 34898, "nic-bas", "Latn", } m["bzx"] = { "Hainyaxo Bozo", 11159536, "dmn-snb", "Latn", } m["bzy"] = { "Obanliku", 36276, "nic-ben", "Latn", } m["bzz"] = { "Evant", 35259, "nic-tvc", "Latn", } return require("Module:languages").finalizeData(m, "language") ti53ede14bo2w134dbnx4bjhvzo90qy มอดูล:languages/data/3/a 828 36386 5720752 5684149 2026-04-21T07:00:47Z OctraBot 3198 Bot: automatic text replacement (-standardChars +standard_chars) 5720752 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["aaa"] = { "Ghotuo", 35463, "alv-yek", "Latn", } m["aab"] = { "Alumu-Tesu", 35034, "nic-alu", "Latn", } m["aac"] = { "Ari", 1811224, "ngf-gsu", "Latn", } m["aad"] = { "Amal", 56708, "paa-spk", "Latn", } -- "aae" is treated as "sq", see [[WT:LT]] m["aaf"] = { "Aranadan", 3507928, "dra-mal", "Mlym", -- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["aag"] = { "Ambrak", 4741706, "paa-tor", "Latn", } m["aah"] = { "Abu'", 4670715, "paa-tor", "Latn", } m["aai"] = { "Arifama-Miniafia", 4790560, "poz-ocw", "Latn", } m["aak"] = { "Ankave", 3446690, "ngf-ang", "Latn", } m["aal"] = { "Afade", 56434, "cdc-cbm", "Latn", } m["aan"] = { "Anambé", 3507873, "tup-gua", "Latn", } m["aap"] = { "Arára (Pará)", 56807, "sai-pek", "Latn", } m["aaq"] = { "Penobscot", 3515185, "alg-abp", "Latn", } m["aas"] = { "Aasax", 56620, "cus-sou", "Latn", } -- "aat" is treated as "sq", see [[WT:LT]] m["aau"] = { "Abau", 3073568, "paa-spk", "Latn", } m["aaw"] = { "Solong", 7558834, "poz-ocw", "Latn", } m["aax"] = { "Mandobo Atas", 
12636156, "ngf-gaw", "Latn", } m["aaz"] = { "Amarasi", 4740192, "poz-tim", "Latn", } m["aba"] = { "อาเบ", 34833, "alv-lag", "Latn", } m["abb"] = { "Bankon", 34860, "bnt-bsa", "Latn", } m["abc"] = { "Ambala Ayta", 3448896, "phi", "Latn", } m["abd"] = { "Camarines Norte Agta", 3399682, "phi", "Latn", } m["abe"] = { "Abenaki", 17502788, "alg-abp", "Latn", } m["abf"] = { "Abai Sungai", 4663287, "poz-san", "Latn", } m["abg"] = { "Abaga", 3507954, "ngf-kag", "Latn", } m["abh"] = { "อาหรับแบบทาจิกิสถาน", 56833, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["abi"] = { "Abidji", 34781, "alv-lag", "Latn", } m["abj"] = { "Aka-Bea", 2356391, "qfa-ads", "Latn", } m["abl"] = { "Abung", 49215, "poz-lgx", "Latn", } m["abm"] = { "Abanyom", 7502, "nic-eko", "Latn", } m["abn"] = { "Abua", 34835, "nic-cde", "Latn", } m["abo"] = { "Abon", 35121, "nic-tvn", "Latn", } m["abp"] = { "Abenlen Ayta", 3436621, "phi", "Latn", } m["abq"] = { "อาบาซา", 27567, "cau-abz", "Cyrl, Latn", translit = { Cyrl = "abq-translit" }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = { "гъв", "гъь", "гӏв", "джв", "джь", "къв", "къь", "кӏв", "кӏь", "хъв", "хӏв", "чӏв", -- 3 chars "гв", "гъ", "гь", "гӏ", "дж", "дз", "ё", "жв", "жь", "кв", "къ", "кь", "кӏ", "ль", "лӏ", "пӏ", "тл", "тш", "тӏ", "фӏ", "хв", "хъ", "хь", "хӏ", "цӏ", "чв", "чӏ", "шв", "шӏ" -- 2 chars }, to = { "г" .. p[3], "г" .. p[4], "г" .. p[7], "д" .. p[2], "д" .. p[3], "к" .. p[3], "к" .. p[4], "к" .. p[7], "к" .. p[8], "х" .. p[3], "х" .. p[6], "ч" .. p[3], "г" .. p[1], "г" .. p[2], "г" .. p[5], "г" .. p[6], "д" .. p[1], "д" .. p[4], "е" .. p[1], "ж" .. p[1], "ж" .. p[2], "к" .. p[1], "к" .. p[2], "к" .. p[5], "к" .. p[6], "л" .. p[1], "л" .. p[2], "п" .. p[1], "т" .. p[1], "т" .. p[2], "т" .. p[3], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "х" .. p[5], "ц" .. 
p[1], "ч" .. p[1], "ч" .. p[2], "ш" .. p[1], "ш" .. p[2] } }, }, } -- "abr" Abron is treated as "ak" Akan, see [[WT:LT]] m["abs"] = { "Ambonese Malay", 3124354, "crp", "Latn", ancestors = "ms", } m["abt"] = { "Ambulas", 3508015, "paa-ndu", "Latn", } m["abu"] = { "Abure", 34767, "alv-ptn", "Latn", } m["abv"] = { "อาหรับแบบบาห์เรน", 56576, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["abw"] = { "Pal", 7126121, "ngf-mad", "Latn", } m["abx"] = { "Inabaknon", 2820163, "poz-sbj", "Latn", } m["aby"] = { "Aneme Wake", 3508107, "ngf-yar", "Latn", } m["abz"] = { "Abui", 2822110, "paa-tap", "Latn", } m["aca"] = { "Achagua", 2822982, "awd", "Latn", } m["acb"] = { "Áncá", 11130787, "nic-mom", "Latn", } m["acd"] = { "Gikyode", 35256, "alv-gng", "Latn", } m["ace"] = { "อาเจะฮ์", 27683, "cmc", "Latn, ms-Arab", standard_chars = { Latn = "AaBbCcDdEeÉéÈèËëFfGgHhIiJjKkLlMmNnOoÔôÖöPpQqRrSsTtUuVvWwXxYyZz", -- current orthography (Arab not yet added) c.punc }, } m["ach"] = { "Acholi", 34926, "sdv-los", "Latn", } m["aci"] = { "Aka-Cari", 2670418, "qfa-adn", "Latn", } m["ack"] = { "Aka-Kora", 3433680, "qfa-adn", "Latn", } m["acl"] = { "Akar-Bale", 3436825, "qfa-ads", "Latn", } m["acm"] = { "อาหรับแบบอิรัก", 56232, "sem-arb", "Arab, Hebr", strip_diacritics = { Arab = "ar-stripdiacritics", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["acn"] = { "Achang", 56582, "tbq-brm", "Latn", } m["acp"] = { "Eastern Acipa", 5329945, "nic-kmk", "Latn", } m["acr"] = { "Achi", 34774, "myn", "Latn", } m["acs"] = { "Acroá", 2829146, "sai-cje", "Latn", } m["acu"] = { "Achuar", 2823170, "sai-jiv", "Latn", } m["acv"] = { "Achumawi", 56661, "nai-pal", "Latn", } m["acw"] = { "อาหรับแบบฮิญาซ", 56608, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["acx"] = { "อาหรับแบบโอมาน", 56630, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["acy"] = { "อาหรับแบบไซปรัส", 56416, "sem-arb", "Latn, Grek", ancestors = "acm", strip_diacritics = { 
Latn = {remove_diacritics = c.grave .. c.acute .. c.breve}, }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] standard_chars = { Latn = "AaBbCcDdΔδEeFfGgĠġĊċIiJjKkLlMmNnOoPpΘθRrSsTtUuVvWwXxYyZzŞş", c.punc }, } m["acz"] = { "Acheron", 34769, "alv-tal", "Latn", } m["ada"] = { "Adangme", 35141, "alv-gda", "Latn", } m["adb"] = { "Atauran", 125421255, "poz-cet", "Latn", } m["add"] = { "Dzodinka", 35266, "nic-nka", "Latn", } m["ade"] = { "Adele", 27740, "alv-ntg", "Latn", } m["adf"] = { "Dhofari Arabic", 56565, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["adg"] = { "Andegerebinha", 3508123, "aus-pam", "Latn", } m["adh"] = { "Adhola", 1971400, "sdv-los", "Latn", } m["adi"] = { "Adi", 56440, "sit-tan", "Latn", } m["adj"] = { "Adioukrou", 34738, "alv-lag", "Latn", } m["adl"] = { "Galo", 2857892, "sit-tan", "Latn", } m["adn"] = { "Adang", 3398276, "paa-tap", "Latn", } m["ado"] = { "Abu", 56659, "paa-ram", "Latn", } m["adp"] = { "Adap", 3512402, "sit-tib", "Tibt", ancestors = "dz", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["adq"] = { "Adangbe", 34730, "alv-gda", "Latn", ancestors = "ada", } m["adr"] = { "Adonara", 4684505, "poz-cet", "Latn", } m["ads"] = { "Adamorobe Sign Language", 27709, "sgn", "Latn", -- when documented } m["adt"] = { "Adnyamathanha", 2225391, "aus-psw", "Latn", } m["adu"] = { "Aduge", 34734, "alv-nwd", "Latn", ancestors = "opa", } m["adw"] = { "Amondawa", 12626847, "tup-gua", "Latn", } m["ady"] = { "อะดีเกยา", 27776, "cau-cir", "Cyrl, Latn, Arab", translit = { Cyrl = "cau-cir-translit", Arab = "ar-translit", }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = { "кхъу", "къӏу", -- 4 chars "гъу", "джу", "дзу", "жъу", "къу", "кхъ", "къӏ", "кӏу", "кӏь", "лъу", "лӏу", "пӏу", "сӏу", 
"тӏу", "фӏу", "хъу", "цӏу", "чъу", "чӏу", "шъу", "шӏу", "щӏу", -- 3 chars "гу", "гъ", "гь", "дж", "дз", "ё", "жъ", "жь", "ку", "къ", "кь", "кӏ", "лъ", "ль", "лӏ", "пӏ", "сӏ", "тӏ", "фӏ", "ху", "хъ", "хь", "цу", "цӏ", "чу", "чъ", "чӏ", "шъ", "шӏ", "щӏ", "ӏу", "ӏь" -- 2 chars }, to = { "к" .. p[5], "к" .. p[7], "г" .. p[3], "д" .. p[2], "д" .. p[4], "ж" .. p[2], "к" .. p[3], "к" .. p[4], "к" .. p[6], "к" .. p[10], "к" .. p[11], "л" .. p[2], "л" .. p[5], "п" .. p[2], "с" .. p[2], "т" .. p[2], "ф" .. p[2], "х" .. p[3], "ц" .. p[3], "ч" .. p[3], "ч" .. p[5], "ш" .. p[2], "ш" .. p[4], "щ" .. p[2], "г" .. p[1], "г" .. p[2], "г" .. p[4], "д" .. p[1], "д" .. p[3], "е" .. p[1], "ж" .. p[1], "ж" .. p[3], "к" .. p[1], "к" .. p[2], "к" .. p[8], "к" .. p[9], "л" .. p[1], "л" .. p[3], "л" .. p[4], "п" .. p[1], "с" .. p[1], "т" .. p[1], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[4], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2], "ч" .. p[4], "ш" .. p[1], "ш" .. p[3], "щ" .. p[1], "ӏ" .. p[1], "ӏ" .. 
p[2] } }, }, } m["adz"] = { "Adzera", 3327445, "poz-ocw", "Latn", } m["aea"] = { "Areba", 3509129, "aus-pam", "Latn", } m["aeb"] = { "อาหรับแบบตูนิเซีย", 56240, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["aed"] = { "Argentine Sign Language", 3322073, "sgn", "Latn", -- when documented } m["aee"] = { "Northeast Pashayi", 12642198, "inc-pas", "fa-Arab, Latn", } m["aek"] = { "Haeke", 5638166, "poz-cln", "Latn", } m["ael"] = { "Ambele", 34818, "nic-grf", "Latn", } m["aem"] = { "Arem", 3507920, "mkh-vie", "Latn", } m["aen"] = { "Armenian Sign Language", 3446604, "sgn", } m["aeq"] = { "Aer", 3246741, "inc-wes", "Arab", } m["aer"] = { "Eastern Arrernte", 10728232, "aus-pam", "Latn", } m["aes"] = { "Alsea", 2395641, nil, "Latn", } m["aeu"] = { "Akeu", 4700657, "tbq-sil", "Latn", } m["aew"] = { "Ambakich", 56642, "paa-eke", "Latn", } m["aey"] = { "Amele", 3508025, "ngf-gum", "Latn", } m["aez"] = { "Aeka", 16110528, "paa-bin", "Latn", } m["afb"] = { "Gulf Arabic", 56385, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["afd"] = { "Andai", 4753480, "paa-arf", "Latn", } m["afe"] = { "Putukwam", 3914930, "nic-ben", "Latn", } m["afg"] = { "Afghan Sign Language", 4689093, "sgn", } m["afh"] = { "Afrihili", 384707, "art", "Latn", type = "appendix-constructed", } m["afi"] = { "Akrukay", 57003, "paa-ram", "Latn", } m["afk"] = { "Nanubae", 6964416, "paa-arf", "Latn", } m["afn"] = { "Defaka", 35174, "nic", "Latn", } m["afo"] = { "Eloyi", 3914066, "nic-plt", "Latn", } m["afp"] = { "Tapei", 16887371, "paa-arf", "Latn", } m["afs"] = { "Afro-Seminole Creole", 27867, "crp", "Latn", ancestors = "en", } m["aft"] = { "Afitti", 3400829, "sdv-nyi", "Latn", } m["afu"] = { "Awutu", 34847, "alv-gng", "Latn", } m["afz"] = { "Obokuitai", 7075258, "paa-lkp", "Latn", } m["aga"] = { "Aguano", 3331203, nil, "Latn", } m["agb"] = { "Legbo", 35584, "nic-uce", "Latn", } m["agc"] = { "Agatu", 34732, "alv-ido", "Latn", } m["agd"] = { "Agarabi", 3399642, "ngf-kag", "Latn", } 
m["age"] = { "Angal", 10951553, "ngf-eng", "Latn", } m["agf"] = { "Arguni", 12473346, "poz-cet", "Latn", } m["agg"] = { "Angor", 3508100, "paa-sng", "Latn", } m["agh"] = { "Ngelima", 7022266, "bnt-bta", "Latn", } m["agi"] = { "Agariya", 663586, "mun", "Deva", translit = "Deva-translit", } m["agj"] = { "Argobba", 29292, "sem-eth", "Ethi", } m["agk"] = { "Isarog Agta", 6078982, "phi", "Latn", } m["agl"] = { "Fembe", 372927, "ngf-est", "Latn", } m["agm"] = { "Angaataha", 3508001, "ngf-ang", "Latn", } m["agn"] = { "อากูตายา", 3399717, "phi-kal", "Latn", } m["ago"] = { "Tainae", 7676186, "ngf-ang", "Latn", } m["agq"] = { "Aghem", 34737, "nic-rnw", "Latn", } m["agr"] = { "Aguaruna", 1526530, "sai-jiv", "Latn", } m["ags"] = { "Esimbi", 35260, "nic-bds", "Latn", } m["agt"] = { "Central Cagayan Agta", 5017296, "phi", "Latn", } m["agu"] = { "Aguacateca", 35091, "myn", "Latn", } m["agv"] = { "Remontado Agta", 3508085, "phi", "Latn", } m["agw"] = { "Kahua", 3191906, "poz-sls", "Latn", } m["agx"] = { "Aghul", 36498, "cau-esm", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], sort_key = { from = {"аь", "гъ", "гь", "гӏ", "дж", "ё", "къ", "кь", "кӏ", "оь", "пӏ", "тӏ", "уь", "хъ", "хь", "хӏ", "цӏ", "чӏ"}, to = {"а" .. p[1], "г" .. p[1], "г" .. p[2], "г" .. p[3], "д" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "ц" .. p[1], "ч" .. 
p[1]} }, } m["agy"] = { "Southern Alta", 7569611, "phi", "Latn", } m["agz"] = { "Mount Iriga Agta", 6921432, "phi", "Latn", } m["aha"] = { "Ahanta", 34729, "alv-ctn", "Latn", } m["ahb"] = { "Axamb", 2874710, "poz-vnc", "Latn", } m["ahg"] = { "Qimant", 35663, "cus-cen", "Latn", } m["ahh"] = { "Aghu", 3436645, "ngf-gaw", "Latn", } m["ahi"] = { "Tiagba", 3400073, "kro-aiz", "Latn", } m["ahk"] = { "อาข่า", 56643, "tbq-han", "Latn, Mymr, Thai", sort_key = { Thai = { from = {"[%pๆ]", "[็-๎]", "([เแโใไ])([ก-ฮ])"}, to = {"", "", "%2%1"} }, }, } m["ahl"] = { "Igo", 35412, "alv-ktg", "Latn", } m["ahm"] = { "Mobu", 35967, "kro-aiz", "Latn", } m["ahn"] = { "Àhàn", 34723, "alv-aah", "Latn", } m["aho"] = { "อาหม", 34778, "tai-swe", "Ahom", translit = "Ahom-translit", } m["ahp"] = { "Apro", 34810, "alv-kwa", "Latn", } m["ahr"] = { "Ahirani", 15549890, "raj", "Deva", translit = "Deva-translit", } m["ahs"] = { "Ashe", 34823, "nic-plc", "Latn", } m["aht"] = { "Ahtna", 21058, "ath-nor", "Latn", } m["aia"] = { "Arosi", 2863483, "poz-sls", "Latn", } m["aib"] = { "Äynu", 27927, "qfa-mix", "Arab, Latn", ancestors = "ug, fa" } m["aic"] = { "Ainbai", 3332149, "paa-brd", "Latn", } m["aid"] = { "Alngith", 3279409, "aus-pmn", "Latn", } m["aie"] = { "Amara", 2841180, "poz-ocw", "Latn", } m["aif"] = { "Agi", 3331491, "paa-tor", "Latn", } m["aig"] = { "Antigua and Barbuda Creole English", 3244184, "crp", "Latn", ancestors = "en", } m["aih"] = { "Ai-Cham", 2827749, "qfa-kms", "Latn, Hani", sort_key = { Hani = "Hani-sortkey" }, } m["aii"] = { "Assyrian Neo-Aramaic", 29440, "sem-nna", "Syrc", translit = "aii-translit", strip_diacritics = "Syrc-stripdiacritics", } m["aij"] = { "Lishanid Noshan", 3436467, "sem-nna", "Hebr", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["aik"] = { "Ake", 34808, "nic-pls", "Latn", } m["ail"] = { "Aimele", 3327418, "ngf-bos", "Latn", } m["aim"] = { "Aimol", 4697175, "tbq-kuk", "Latn, Beng", } m["ain"] = { "ไอนุ", 27969, "qfa-ain", 
"Kana, Latn, Cyrl", sort_key = { Kana = "Kana-sortkey" }, } m["aio"] = { "อ่ายตน", 3399725, "tai-swe", "Mymr", translit = "aio-phk-translit", display_text = s["aio-displaytext"], strip_diacritics = s["aio-stripdiacritics"], } m["aip"] = { "Burumakok", 5000984, "ngf-okk", "Latn", } m["air"] = { "Airoran", 3321131, "paa-tkw", "Latn", } m["ait"] = { "Arikem", 3446679, "tup", "Latn", } m["aiw"] = { "Aari", 7495, "omv-aro", "Latn", } m["aix"] = { "Aighon", 3504287, "poz-ocw", "Latn", } m["aiy"] = { "Ali", 34814, "gba-eas", "Latn", } m["aja"] = { "Aja", 3237491, "csu-bkr", "Latn", } m["ajg"] = { "Adja", 35035, "alv-gbe", "Latn", } m["aji"] = { "Ajië", 2828867, "poz-cln", "Latn", } m["ajn"] = { "Andajin", 16111302, "aus-wor", "Latn", } m["ajp"] = { "อาหรับแบบลิแวนต์ใต้", 55633582, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["ajw"] = { "Ajawa", 56645, "cdc-wst", "Latn", } m["ajz"] = { "Amri Karbi", 3508092, "tbq-kuk", "Latn", ancestors = "mjw", } m["akb"] = { "Angkola Batak", 2640686, "btk", "Latn, Batk", } m["akc"] = { "Mpur", 3327139, "qfa-iso", -- Papuan; based on Palmer (2018), Ethnologue and Glottolog "Latn", } m["akd"] = { "Ukpet-Ehom", 36618, "nic-ucr", "Latn", } m["ake"] = { "Akawaio", 28059, "sai-pem", "Latn", } m["akf"] = { "Akpa", 34801, "alv-ido", "Latn", } m["akg"] = { "Anakalangu", 4750964, "poz-cet", "Latn", } m["akh"] = { "Angal Heneng", 10950354, "ngf-eng", "Latn", } m["aki"] = { "Aiome", 56735, "paa-ram", "Latn", } m["akj"] = { "Jeru", 2919121, "qfa-adn", "Latn, Deva", translit = { Deva = "Deva-translit", }, } m["akk"] = { "แอกแคด", 35518, "sem-eas", "Xsux, Latn", } m["akl"] = { "อักลัน", 8773, "phi", "Latn", } m["akm"] = { "Aka-Bo", 35361, "qfa-adn", "Latn", } m["ako"] = { "Akurio", 56650, "sai-tar", "Latn", } m["akp"] = { "Siwu", 36470, "alv-ntg", "Latn", } m["akq"] = { "Ak", 56654, "paa-spk", "Latn", } m["akr"] = { "Araki", 2699882, "poz-vnn", "Latn", } m["aks"] = { "Akaselem", 34817, "nic-grm", "Latn", } m["akt"] = { "Akolet", 
3330162, "poz-ocw", "Latn", } m["aku"] = { "Akum", 34799, "nic-ykb", "Latn", } m["akv"] = { "Akhvakh", 56423, "cau-and", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], } m["akw"] = { "Akwa", 34802, "bnt-mbo", "Latn", } m["akx"] = { "Aka-Kede", 3436816, "qfa-adc", "Latn", } m["aky"] = { "Aka-Kol", 3436784, "qfa-adc", "Latn", } m["akz"] = { "แอละแบมา", 1815020, "nai-mus", "Latn", } m["ala"] = { "Alago", 34813, "alv-ido", "Latn", } m["alc"] = { "Kawésqar", 56544, "aqa", "Latn", } m["ald"] = { "Alladian", 34837, "alv-lag", "Latn", } m["ale"] = { "Aleut", 27210, "esx", "Latn, Cyrl", } m["alf"] = { "Alege", 34815, "nic-ben", "Latn", } m["alh"] = { "Alawa", 2147917, "aus-gun", "Latn", } m["ali"] = { "Amaimon", 3327427, "ngf-mad", "Latn", } m["alj"] = { "Alangan", 3327423, "phi", "Latn", } m["alk"] = { "Alak", 2714690, "mkh", "Latn", } m["all"] = { "Allar", 3393634, "dra-mal", "Mlym", -- Mlym translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } -- "aln" is treated as "sq", see [[WT:LT]] m["alm"] = { "Amblong", 11022615, "poz-vnn", "Latn", } m["alo"] = { "Larike-Wakasihu", 3217929, "poz-cma", "Latn", } m["alp"] = { "Alune", 3327367, "poz-cet", "Latn", } m["alq"] = { "Algonquin", 28092, "alg", "Latn, Cans", ancestors = "oj", } m["alr"] = { "Alutor", 28213, "qfa-ckn", "Cyrl", strip_diacritics = { from = {"['’]"}, to = {"ʼ"} }, sort_key = { from = {"вʼ", "гʼ", "ғ", "ә", "ё", "ӄ", "ӈ"}, to = {"в" .. p[1], "г" .. p[1], "г" .. p[2], "е" .. p[1], "е" .. p[2], "к" .. p[1], "н" .. p[1]} }, } m["alt"] = { "อัลไตใต้", 1991779, "trk-kkp", "Cyrl", translit = "Altai-translit", sort_key = { from = {"ј", "ё", "ҥ", "ӧ", "ӱ"}, to = {"д" .. p[1], "е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. 
p[1]} }, } m["alu"] = { "'Are'are", 5160, "poz-sls", "Latn", } m["alw"] = { "Alaba", 56652, "cus-hec", "Latn", } m["alx"] = { "Amol", 3504260, "paa-tor", "Latn", } m["aly"] = { "Alyawarr", 3327389, "aus-pam", "Latn", } m["alz"] = { "Alur", 56507, "sdv-los", "Latn", } m["ama"] = { "Amanayé", 3508053, "tup-gua", "Latn", } m["amb"] = { "Ambo", 3450142, "nic-tvn", "Latn", } m["amc"] = { "Amahuaca", 2669150, "sai-pan", "Latn", } m["ame"] = { "Yanesha'", 3088540, "awd", "Latn", } m["amf"] = { "Hamer-Banna", 35764, "omv-aro", "Latn, Ethi", sort_key = "amf-utilities" } m["amg"] = { "Amurdag", 3360016, "aus-wdj", "Latn", } m["ami"] = { "Amis", 35132, "map", "Latn", } m["amj"] = { "Amdang", 28335, "ssa-fur", "Latn", } m["amk"] = { "Ambai", 1875885, "poz-hce", "Latn", } m["aml"] = { "War-Jaintia", 56321, "aav-khs", "Latn", } m["amm"] = { "Ama", 3446626, "paa-lem", "Latn", } m["amn"] = { "Amanab", 3327399, "paa-brd", "Latn", } m["amo"] = { "Amo", 34826, "nic-kne", "Latn", } m["amp"] = { "Alamblak", 56688, "paa-spk", "Latn", } m["amq"] = { "Amahai", 3327384, "poz-cma", "Latn", } m["amr"] = { "Amarakaeri", 35128, "sai-har", "Latn", } m["ams"] = { "อามามิโอชิมะใต้", 2840986, "jpx-nry", "Jpan", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["amt"] = { "Amto", 56517, "paa-amu", "Latn", } m["amu"] = { "Guerrero Amuzgo", 3501942, "omq", "Latn", } m["amv"] = { "Ambelau", 2669214, "poz-cma", "Latn", } m["amw"] = { "Western Neo-Aramaic", 34226, "sem-arw", "Armi, Syrc, Latn", strip_diacritics = { Syrc = "Syrc-stripdiacritics" }, } m["amx"] = { "Anmatyerre", 10412317, "aus-pam", "Latn", } m["amy"] = { "Ami", 10408315, "aus-dal", "Latn", } m["amz"] = { "Atampaya", 3446651, "aus-pam", "Latn", } m["ana"] = { "Andaqui", 2846078, nil, "Latn", } m["anb"] = { "Andoa", 2846171, "sai-zap", "Latn", } m["anc"] = { "Ngas", 35999, "cdc-wst", "Latn", } m["and"] = { "Ansus", 3513300, "poz-hce", "Latn", } 
m["ane"] = { "คังรังชือ", 3571097, "poz-cln", "Latn", } m["anf"] = { "Animere", 34783, "alv-ktg", "Latn", } m["ang"] = { "อังกฤษเก่า", 42365, "gmw-ang", "Latn, Runr", translit = { Runr = "Runr-translit" }, strip_diacritics = { Latn = { remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dotbelow, from = {"[Ƿƿ]"}, to = {{ ["Ƿ"] = "W", ["ƿ"] = "w", }}, }, }, sort_key = { Latn = { remove_diacritics = c.acute .. c.circ .. c.macron .. c.breve .. c.dotabove .. c.diaer .. c.dotbelow, from = {"[æƀꝺðꝼᵹȝłœꞃꞅꞇþꝥꝧƿ]"}, to = {{ ["æ"] = "ae", ["ƀ"] = "b", ["ꝺ"] = "d", ["ð"] = "d" .. p[1], ["ꝼ"] = "f", ["ᵹ"] = "g", ["ȝ"] = "g" .. p[1], ["ł"] = "l", ["œ"] = "oe", ["ꞃ"] = "r", ["ꞅ"] = "s", ["ꞇ"] = "t", ["þ"] = "t" .. p[1], ["ꝥ"] = "t" .. p[1], ["ꝧ"] = "t" .. p[1], ["ƿ"] = "w", }}, }, }, standard_chars = { Latn = "AaÆæBbCcDdÐðEeFfGgHhIiLlMmNnOoŒœPpRrSsTtÞþUuWwXxYy", c.punc, }, } m["anh"] = { "Nend", 6991554, "ngf-wso", "Latn", } m["ani"] = { "Andi", 34849, "cau-and", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], } m["anj"] = { "Anor", 56458, "paa-ram", "Latn", } m["ank"] = { "Goemai", 35272, "cdc-wst", "Latn", } m["anl"] = { "Anu", 4777679, "sit-mru", "Latn", } m["anm"] = { "Anāl", 56235, "tbq-kuk", "Latn", } m["ann"] = { "Obolo", 36614, "nic-lcr", "Latn", } m["ano"] = { "Andoque", 2669225, "qfa-iso", "Latn", } m["anp"] = { "Angika", 28378, "inc-bih", "Deva, Kthi", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, } m["anq"] = { "Jarawa", 2475526, "qfa-ong", "Latn", } m["anr"] = { "Andh", 4754314, "inc-sou", "Deva", translit = "Deva-translit", } m["ans"] = { "Anserma", 3446613, "sai-chc", "Latn", } m["ant"] = { "Antakarinya", 921304, "aus-psw", "Latn", } m["anu"] = { "Anuak", 56677, "sdv-lon", "Latn", } m["anv"] = { "Denya", 35187, "nic-mam", "Latn", } m["anw"] = { "Anaang", 2845320, "nic-ief", "Latn", } m["anx"] = { 
"Andra-Hus", 2846195, "poz-aay", "Latn", } m["any"] = { "Anyi", 28395, "alv-ctn", "Latn", } m["anz"] = { "Anem", 56512, "qfa-dis", -- Papuan; might be an isolate or in a putative West New Britain family "Latn", } m["aoa"] = { "Angolar", 34994, "crp", "Latn", ancestors = "pt", } m["aob"] = { "Abom", 3446647, "paa-ani", "Latn", } m["aoc"] = { "Pemon", 10729616, "sai-pem", "Latn", } m["aod"] = { "Andarum", 3507888, "paa-ram", "Latn", } m["aoe"] = { "Angal Enen", 10951638, "ngf-eng", "Latn", } m["aof"] = { "Bragat", 3507977, "paa-tor", "Latn", } m["aog"] = { "Angoram", 56366, -- cf 6754745 for merged dialect "paa-lsp", "Latn", } m["aoi"] = { "Anindilyakwa", 2714654, "aus-arn", "Latn", } m["aoj"] = { "Mufian", 3507881, "paa-tor", "Latn", } m["aok"] = { "Arhö", 4790086, "poz-cln", "Latn", } m["aol"] = { "Alorese", 3332062, "poz", "Latn", } m["aom"] = { "Ömie", 8078975, "ngf-koi", "Latn", } m["aon"] = { "Bumbita Arapesh", 3508044, "paa-tor", "Latn", } m["aor"] = { "Aore", 12627129, "poz-vnn", "Latn", } m["aos"] = { "Taikat", 7676018, "paa-brd", "Latn", } m["aot"] = { "อะตง", --actual pronounciation 5646, "tbq-bdg", "Latn, Beng", } m["aou"] = { "A'ou", 16109994, "gio", "Latn", -- also Hani? 
} m["aox"] = { "Atorada", 3507932, "awd", "Latn", } m["aoz"] = { "Uab Meto", 3441962, "poz-tim", "Latn", } m["apb"] = { "Sa'a", 36294, "poz-sls", "Latn", } m["apc"] = { "North Levantine Arabic", 22809485, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["apd"] = { "อาหรับแบบซูดาน", 56573, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["ape"] = { "Bukiyip", 3507895, "paa-tor", "Latn", } m["apf"] = { "Pahanan Agta", 7135432, "phi", "Latn", } m["apg"] = { "Ampanang", 4748035, "poz", "Latn", } m["aph"] = { "Athpare", 3449126, "sit-kie", "Deva, Latn", translit = { Deva = "Deva-translit", }, } m["api"] = { "Apiaká", 3507941, "tup-gua", "Latn", } m["apj"] = { "Jicarilla", 28277, "apa", "Latn", } m["apk"] = { "Plains Apache", 27861, "apa", "Latn", } m["apl"] = { "Lipan", 28269, "apa", "Latn", } m["apm"] = { "Chiricahua", 13368, "apa", "Latn", } m["apn"] = { "Apinayé", 2858311, "sai-nje", "Latn", } m["apo"] = { "Ambul", 12627135, "poz-ocw", "Latn", } m["app"] = { "Apma", 2669188, "poz-vnn", "Latn", } m["apq"] = { "A-Pucikwar", 28466, "qfa-adc", "Latn", } m["apr"] = { "Arop-Lokep", 2863482, "poz-ocw", "Latn", } m["aps"] = { "Arop-Sissano", 12627242, "poz-ocw", "Latn", } m["apt"] = { "Apatani", 56306, "sit-tan", "Latn", } m["apu"] = { "Apurinã", 2859081, "awd", "Latn", } m["apv"] = { "Alapmunte", 16110782, "sai-nmk", "Latn", } m["apw"] = { "อะแพชีตะวันตก", 28060, "apa", "Latn", } m["apx"] = { "Aputai", 12473343, "poz-tim", "Latn", } m["apy"] = { "Apalaí", 2736980, "sai-gui", "Latn", } m["apz"] = { "Safeyoka", 7398693, "ngf-ang", "Latn", } m["aqc"] = { "Archi", 34915, "cau-lzg", "Cyrl", translit = "cau-nec-translit", override_translit = true, display_text = s["cau-Cyrl-displaytext"], strip_diacritics = s["cau-Cyrl-stripdiacritics"], sort_key = { from = { "ккъӏв", "ххьӏв", -- 5 chars "гъӏв", "ёоӏ", "ккъӏ", "ккъв", "къӏв", "ллъв", "ххьӏ", "хъӏв", "хьӏв", "ццӏв", "ччӏв", -- 4 chars "ааӏ", "гӏв", "гъӏ", "гъв", "гьв", "ееӏ", "ёӏ", "ёо", "ииӏ", "кӏв", 
"ккв", "ккъ", "къӏ", "къв", "кьв", "лӏв", "ллъ", "лъв", "льв", "ооӏ", "пӏв", "ппв", "ссв", "тӏв", "ттв", "ууӏ", "хӏв", "ххв", "хъӏ", "хъв", "хьӏ", "цӏв", "ццӏ", "ццв", "чӏв", "ччӏ", "ээӏ", "юуӏ", "яаӏ", -- 3 chars "аӏ", "аа", "гӏ", "гв", "гъ", "гь", "дв", "еӏ", "ее", "ё", "жв", "зв", "иӏ", "ии", "кӏ", "кв", "кк", "къ", "кь", "лӏ", "лв", "лъ", "ль", "оӏ", "оо", "пӏ", "пв", "пп", "св", "сс", "тӏ", "тв", "тт", "уӏ", "уу", "фв", "хӏ", "хв", "хх", "хъ", "цӏ", "цв", "цц", "чӏ", "чв", "шв", "щв", "эӏ", "ээ", "юӏ", "юу", "яӏ", "яа" -- 2 chars }, to = { "к" .. p[8], "х" .. p[7], "г" .. p[6], "е" .. p[7], "к" .. p[7], "к" .. p[9], "к" .. p[12], "л" .. p[5], "х" .. p[6], "х" .. p[10], "х" .. p[13], "ц" .. p[6], "ч" .. p[5], "а" .. p[3], "г" .. p[2], "г" .. p[5], "г" .. p[7], "г" .. p[9], "е" .. p[3], "е" .. p[5], "е" .. p[6], "и" .. p[3], "к" .. p[2], "к" .. p[5], "к" .. p[6], "к" .. p[11], "к" .. p[13], "к" .. p[15], "л" .. p[2], "л" .. p[4], "л" .. p[7], "л" .. p[9], "о" .. p[3], "п" .. p[2], "п" .. p[5], "с" .. p[3], "т" .. p[2], "т" .. p[5], "у" .. p[3], "х" .. p[2], "х" .. p[5], "х" .. p[9], "х" .. p[11], "х" .. p[12], "ц" .. p[2], "ц" .. p[5], "ц" .. p[7], "ч" .. p[2], "ч" .. p[4], "э" .. p[3], "ю" .. p[3], "я" .. p[3], "а" .. p[1], "а" .. p[2], "г" .. p[1], "г" .. p[3], "г" .. p[4], "г" .. p[8], "д" .. p[1], "е" .. p[1], "е" .. p[2], "е" .. p[4], "ж" .. p[1], "з" .. p[1], "и" .. p[1], "и" .. p[2], "к" .. p[1], "к" .. p[3], "к" .. p[4], "к" .. p[10], "к" .. p[14], "л" .. p[1], "л" .. p[3], "л" .. p[6], "л" .. p[8], "о" .. p[1], "о" .. p[2], "п" .. p[1], "п" .. p[3], "п" .. p[4], "с" .. p[1], "с" .. p[2], "т" .. p[1], "т" .. p[3], "т" .. p[4], "у" .. p[1], "у" .. p[2], "ф" .. p[1], "х" .. p[1], "х" .. p[3], "х" .. p[4], "х" .. p[8], "ц" .. p[1], "ц" .. p[3], "ц" .. p[4], "ч" .. p[1], "ч" .. p[3], "ш" .. p[1], "щ" .. p[1], "э" .. p[1], "э" .. p[2], "ю" .. p[1], "ю" .. p[2], "я" .. p[1], "я" .. 
p[2] } }, } m["aqd"] = { "Ampari Dogon", 4748057, "nic-dgw", "Latn", } m["aqg"] = { "Arigidi", 34829, "alv-von", "Latn", } m["aqm"] = { "Atohwaim", 11732297, "paa-kay", "Latn", } m["aqn"] = { "Northern Alta", 7058116, "phi", "Latn", } m["aqp"] = { "Atakapa", 10975683, "qfa-iso", "Latn", } m["aqr"] = { "Arhâ", 4790085, "poz-cln", "Latn", } m["aqt"] = { "Angaité", 15736037, "sai-mas", "Latn", } m["aqz"] = { "Akuntsu", 4701960, "tup", "Latn", } m["arc"] = { "อารามายา", -- used instead of แอราเมอิก because that name is already taken by the language family 28602, "sem-ara", "Hebr, Armi, Syrc, Palm, Nbat, Phnx, Mand, Samr, Hatr, Elym", translit = { Armi = "Armi-translit", Palm = "Palm-translit", }, strip_diacritics = { -- The first three were added by [[User:Wikitiki89]] in 2015 for use with Syriac, which has diacritics that look -- like a diaeresis (syāmē) and macrons above and below (mṭalqānā); see Wikipedia [[w:Syriac alphabet]]. But -- I don't know if they are actually represented using these diacritics. Syrc = {remove_diacritics = c.macron .. c.diaer .. c.macronbelow .. u(0x0730) .. "-" .. 
u(0x0748)}, }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- Samr strip_diacritics, sort_key in [[Module:scripts/data]]; previously no sort_key for Samr, presumably a mistake -- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["ard"] = { "Arabana", 3507959, "aus-kar", "Latn", } m["are"] = { "Western Arrernte", 12645549, "aus-pam", "Latn", } m["arh"] = { "Arhuaco", 2640621, "cba", "Latn", } m["ari"] = { "Arikara", 56539, "cdd", "Latn", strip_diacritics = {remove_diacritics = c.acute}, } m["arj"] = { "Arapaso", 9627356, "sai-tuc", "Latn", } m["ark"] = { "Arikapú", 3446640, "sai-mje", "Latn", } m["arl"] = { "Arabela", 2591221, "sai-zap", "Latn", } m["arn"] = { "Mapudungun", 33730, "sai-ara", "Latn", } m["aro"] = { "Araona", 958414, "sai-tac", "Latn", } m["arp"] = { "Arapaho", 56417, "alg-ara", "Latn", } m["arq"] = { "อาหรับแบบแอลจีเรีย", 56499, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["arr"] = { "Arara-Karo", 35539, "tup", "Latn", } m["ars"] = { "Najdi Arabic", 56574, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["aru"] = { "Arua", 2746221, "auf", "Latn", } m["arv"] = { "Arbore", 56883, "cus-eas", "Latn", } m["arw"] = { "โลโกโน", 2655664, "awd-taa", "Latn", } m["arx"] = { "Aruá", 3507907, "tup", "Latn", } m["ary"] = { "อาหรับแบบโมร็อกโก", 56426, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["arz"] = { "อาหรับแบบอียิปต์", 29919, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["asa"] = { "Pare", 36403, "bnt-par", "Latn", } m["asb"] = { "Assiniboine", 2591288, "sio-dkt", "Latn", } m["asc"] = { "Casuarina Coast Asmat", 11732046, "ngf-ask", "Latn", } m["ase"] = { "มืออเมริกัน", 14759, "sgn", "Sgnw", } m["asf"] = { "Auslan", 29525, "sgn", "Latn", -- when documented } m["asg"] = { "Cishingini", 35199, "nic-kam", "Latn", } m["ash"] = { "Abishira", 2871740, "qfa-dis", -- extinct, poorly documented; isolate or in a 
proposed Tequiraca-Canichana family by Kaufman (1994) "Latn", } m["asi"] = { "Buruwai", 5001031, "ngf-ask", "Latn", } m["asj"] = { "Nsari", 36418, "nic-bbe", "Latn", } m["ask"] = { "Ashkun", 29379, "nur-sou", "Arab, Latn", } m["asl"] = { "Asilulu", 12473347, "poz-cma", "Latn", } m["asn"] = { "Xingú Asuriní", 8044571, "tup-gua", "Latn", } m["aso"] = { "Dano", 5220979, "ngf-kag", "Latn", } m["asp"] = { "Algerian Sign Language", 3135421, "sgn", } m["asq"] = { "Austrian Sign Language", 36668, "sgn", "Latn", -- when documented } m["asr"] = { "Asuri", 3504321, "mun", "Latn", -- when documented } m["ass"] = { "Ipulo", 35408, "nic-tvc", "Latn", } m["ast"] = { "อัสตูเรียส", 29507, "roa-asl", "Latn", } m["asu"] = { "Tocantins Asurini", 32041490, "tup-gua", "Latn", } m["asv"] = { "Asoa", 56296, "csu-maa", "Latn", } m["asw"] = { "Australian Aboriginal Sign Language", 955216, "sgn", "Latn", -- when documented } m["asx"] = { "Muratayak", 11732766, "ngf-fin", "Latn", } m["asy"] = { "Yaosakor Asmat", 16113158, "ngf-ask", "Latn", } m["asz"] = { "As", 2866218, "poz-hce", "Latn", } m["ata"] = { "Pele-Ata", 56511, "qfa-dis", -- Papuan; possibly in a putative West New Britain family, or an isolate "Latn", } m["atb"] = { "Zaiwa", 56594, "tbq-brm", "Latn, Lisu", -- also Hani? 
-- Lisu translit, sort_key in [[Module:scripts/data]] } m["atc"] = { "Atsahuaca", 4817730, "sai-pan", "Latn", } m["atd"] = { "Ata Manobo", 12627315, "mno", "Latn", } m["ate"] = { "Atemble", 4813055, "ngf-wso", "Latn", } m["atg"] = { "Okpela", 7082551, "alv-yek", "Latn", } m["ati"] = { "Attié", 34844, "alv-lag", "Latn", } m["atj"] = { "Atikamekw", 56590, "alg", "Latn", ancestors = "cr", } m["atk"] = { "Ati", 3217458, "phi", "Latn", } m["atl"] = { "Mount Iraya Agta", 6921430, "phi", "Latn", } m["atm"] = { "Ata", 4812603, "phi", "Latn", } m["ato"] = { "Atong (Cameroon)", 34824, "nic-grs", "Latn", } m["atp"] = { "Pudtol Atta", 12640726, "phi", "Latn", } m["atq"] = { "Aralle-Tabulahan", 4783889, "poz-ssw", "Latn", } m["atr"] = { "Waimiri-Atroari", 56865, "sai-car", "Latn", } m["ats"] = { "Gros Ventre", 56628, "alg-ara", "Latn", } m["att"] = { "Pamplona Atta", 12639245, "phi", "Latn", } m["atu"] = { "Reel", 7306882, "sdv-dnu", "Latn", } m["atv"] = { "อัลไตเหนือ", 2640863, "trk-ssb", "Cyrl", translit = "Altai-translit", } m["atw"] = { "Atsugewi", 56718, "nai-pal", "Latn", } m["atx"] = { "Arutani", 56609, nil, "Latn", } m["aty"] = { "อาเนตยูม", 2379113, "poz-vns", "Latn", } m["atz"] = { "Arta", 3508067, "phi", "Latn", } m["aua"] = { "Asumboa", 4811870, "poz-tem", "Latn", } m["aub"] = { "Alugu", 12626798, "tbq-urp", "Latn", -- also Hani? 
} m["auc"] = { "Huaorani", 758570, "qfa-iso", "Latn", } m["aud"] = { "Anuta", 35326, "poz-pnp", "Latn", } m["aug"] = { "Aguna", 34733, "alv-gbe", "Latn", } m["auh"] = { "Aushi", 2872082, "bnt-sbi", "Latn", } m["aui"] = { "Anuki", 3508132, "poz-ocw", "Latn", } m["auj"] = { "Awjila", 56398, "ber", "Latn, Arab, Tfng", } m["auk"] = { "Heyo", 3504295, "paa-tor", "Latn", } m["aul"] = { "Aulua", 427300, "poz-vnc", "Latn", } m["aum"] = { "Asu", 34798, "alv-ngb", "Latn", } m["aun"] = { "Molmo One", 12637224, "paa-tor", "Latn", } m["auo"] = { "Auyokawa", 56247, "cdc-wst", "Latn", } m["aup"] = { "Makayam", 6738863, "paa-ani", "Latn", } m["auq"] = { "Anus", 23855, "poz-ocw", "Latn", } m["aur"] = { "Aruek", 3504279, "paa-tor", "Latn", } m["aut"] = { "Austral", 2669261, "poz-pep", "Latn", } m["auu"] = { "Auye", 4827334, "ngf-pan", "Latn", } m["auw"] = { "Awyi", 3513326, "paa-brd", "Latn", } m["aux"] = { "Aurá", 3507995, "tup-gua", "Latn", } m["auy"] = { "Auyana", 2873211, "ngf-kag", "Latn", } m["auz"] = { "อาหรับแบบอุซเบกิสถาน", 3399507, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["avb"] = { "Avau", 12627412, "poz-ocw", "Latn", } m["avd"] = { "Alviri-Vidari", 3327357, "xme", "fa-Arab", ancestors = "xme-mid", } m["avi"] = { "Avikam", 34840, "alv-lag", "Latn", } m["avk"] = { "โคทาวา", 1377116, "art", "Latn", type = "appendix-constructed", } m["avm"] = { "Angkamuthi", 62603022, "aus-pmn", "Latn", } m["avn"] = { "Avatime", 34796, "alv-ktg", "Latn", } m["avo"] = { "Agavotaguerra", 3508007, "awd", "Latn", } m["avs"] = { "Aushiri", 3409318, "sai-zap", "Latn", } m["avt"] = { "Au", 3446608, "paa-tor", "Latn", } m["avu"] = { "Avokaya", 56685, "csu-mma", "Latn", } m["avv"] = { "Avá-Canoeiro", 4829584, "tup-gua", "Latn", } m["awa"] = { "อวัธ", 29579, "inc-hie", "Deva, Kthi, fa-Arab", ancestors = "inc-oaw", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", }, } m["awb"] = { "Awa (New Guinea)", 2874650, "ngf-kag", "Latn", } m["awc"] = { "Cicipu", 35193, 
"nic-kam", "Latn", } m["awe"] = { "Awetí", 4830038, "tup", "Latn", } m["awg"] = { "Anguthimri", 4764288, "aus-pam", "Latn", } m["awh"] = { "Awbono", 3446684, "paa-baw", "Latn", } m["awi"] = { "Aekyom", 3399691, "paa-kae", "Latn", } m["awk"] = { "Awabakal", 3449138, "aus-pam", "Latn", } m["awm"] = { "Arawum", 4784537, "ngf-rai", "Latn", } m["awn"] = { "Awngi", 34934, "cus-cen", "Ethi", } m["awo"] = { "Awak", 3446643, "alv-wjk", "Latn", } m["awr"] = { "Awera", 56379, "paa-lkp", "Latn", } m["aws"] = { "South Awyu", 12633986, "ngf-gaw", "Latn", } m["awt"] = { "Araweté", 4784535, "tup-gua", "Latn", } m["awu"] = { "Central Awyu", 12628801, "ngf-gaw", "Latn", } m["awv"] = { "Jair Awyu", 16110177, "ngf-gaw", "Latn", } m["aww"] = { "Awun", 56369, "paa-spk", "Latn", } m["awx"] = { "Awara", 2874670, "ngf-fin", "Latn", } m["awy"] = { "Edera Awyu", 12630425, "ngf-gaw", "Latn", } m["axb"] = { "Abipón", 11252539, "sai-guc", "Latn", } m["axe"] = { "Ayerrerenge", 16112737, "aus-pam", "Latn", } m["axg"] = { "Arára (Mato Grosso)", 3446660, nil, "Latn", } m["axk"] = { "Aka (Central Africa)", 11010149, "bnt-ngn", "Latn", } m["axl"] = { "Lower Southern Aranda", 6693295, "aus-pam", "Latn", } m["axm"] = { "อาร์มีเนียกลาง", 4438498, "hyx", "Armn", ancestors = "xcl", -- Armn translit in [[Module:scripts/data]] override_translit = true, strip_diacritics = { remove_diacritics = "՞՜՛՟", from = {"եւ", "ՙ", "՚"}, to = {"և", "ʻ", "’"} } } m["axx"] = { "Xârâgurè", 8045635, "poz-cln", "Latn", } m["aya"] = { "Awar", 56876, "paa-ram", "Latn", } m["ayb"] = { "Ayizo", 34841, "alv-pph", "Latn", } m["ayd"] = { "Ayabadhu", 3509164, "aus-pmn", "Latn", } m["aye"] = { "Ayere", 34788, "alv-aah", "Latn", } m["ayg"] = { "Nyanga (Togo)", 35446, "alv-gng", "Latn", } m["ayi"] = { "Leyigha", 3914492, "nic-uce", "Latn", } m["ayk"] = { "Akuku", 3450179, "alv-nwd", "Latn", } m["ayl"] = { "อาหรับแบบลิเบีย", 56503, "sem-arb", "Arab", strip_diacritics = "ar-stripdiacritics", } m["ayn"] = { "อาหรับแบบเยเมน", 1686766, 
"sem-arb", "Arab, Hebr", strip_diacritics = { Arab = "ar-stripdiacritics", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ayo"] = { "Ayoreo", 56634, "sai-zam", "Latn", } m["ayp"] = { "อาหรับแบบเมโสโปเตเมียเหนือ", 56577, "sem-arb", "Arab", ancestors = "acm", strip_diacritics = "ar-stripdiacritics", } m["ayq"] = { "Ayi", 56449, "paa-spk", "Latn", } m["ays"] = { "Sorsogon Ayta", 7563752, "phi", "Latn", } m["ayt"] = { "Bataan Ayta", 4921648, "phi", "Latn", } m["ayu"] = { "Ayu", 34786, "alv", "Latn", } m["ayy"] = { "Tayabas Ayta", 7689745, "phi", "Latn", } m["ayz"] = { "Maybrat", 4830892, "paa-mbr", -- either an isolate; grouped with Abun and the West Bird's Head family; or in the putative West Papuan family "Latn", } m["aza"] = { "Azha", 4832486, "tbq-axi", "Latn", } m["azd"] = { "Eastern Durango Nahuatl", 16115449, "azc-dur", "Latn", } m["azg"] = { "San Pedro Amuzgos Amuzgo", 35092, "omq", "Latn", } m["azm"] = { "Ipalapa Amuzgo", 12633013, "omq", "Latn", } m["azn"] = { "Western Durango Nahuatl", 12645553, "azc-dur", "Latn", } m["azo"] = { "Awing", 34856, "nic-nge", "Latn", } m["azt"] = { "Faire Atta", 12630884, "phi", "Latn", } m["azz"] = { "Highland Puebla Nahuatl", 12953754, "azc-nah", "Latn", } return require("Module:languages").finalizeData(m, "language") 7h8k7ctcb8t8byj5fd8ajlapo2lzllf มอดูล:languages/data/2 828 36387 5720751 5719180 2026-04-21T07:00:45Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720751 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared -- Ideally, we want to move these into [[Module:languages/data]], but because (a) it's necessary to use require on that module, and (b) they're only used in this data module, it's less memory-efficient to do that at the moment. If it becomes possible to use mw.loadData, then these should be moved there. s["de-Latn-sortkey"] = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove, from = {"æ", "œ", "ß"}, to = {"ae", "oe", "ss"} } s["de-Latn-standardchars"] = "AaÄäBbCcDdEeFfGgHhIiJjKkLlMmNnOoÖöPpQqRrSsẞßTtUuÜüVvWwXxYyZz" s["ka-stripdiacritics"] = {remove_diacritics = c.circ} s["no-sortkey"] = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla, remove_exceptions = {"å"}, from = {"æ", "ø", "å"}, to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]} } s["no-standardchars"] = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÆæØøÅå" .. c.punc s["tg-stripdiacritics"] = {remove_diacritics = c.grave .. 
c.acute} s["tk-stripdiacritics"] = {remove_diacritics = c.macron} local m = {} m["aa"] = { "อาฟาร์", 27811, "cus-eas", "Latn, Ethi", strip_diacritics = { Latn = {remove_diacritics = c.acute}, }, } m["ab"] = { "อับคาเซีย", 5111, "cau-abz", "Cyrl, Geor, Latn", translit = { Cyrl = "ab-translit", -- Geor translit in [[Module:scripts/data]] }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = { remove_diacritics = c.acute, from = {"^а%-"}, to = {"а"}, }, Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = { "х'ә", -- 3 chars "гь", "гә", "ӷь", "ҕь", "ӷә", "ҕә", "дә", "ё", "жь", "жә", "ҙә", "ӡә", "ӡ'", "кь", "кә", "қь", "қә", "ҟь", "ҟә", "ҫә", "тә", "ҭә", "ф'", "хь", "хә", "х'", "ҳә", "ць", "цә", "ц'", "ҵә", "ҵ'", "шь", "шә", "џь", -- 2 chars "ӷ", "ҕ", "ҙ", "ӡ", "қ", "ҟ", "ԥ", "ҧ", "ҫ", "ҭ", "ҳ", "ҵ", "ҷ", "ҽ", "ҿ", "ҩ", "џ", "ә", -- 1 char "^а", }, to = { "х" .. p[4], "г" .. p[1], "г" .. p[2], "г" .. p[5], "г" .. p[6], "г" .. p[7], "г" .. p[8], "д" .. p[1], "е" .. p[1], "ж" .. p[1], "ж" .. p[2], "з" .. p[2], "з" .. p[4], "з" .. p[5], "к" .. p[1], "к" .. p[2], "к" .. p[4], "к" .. p[5], "к" .. p[7], "к" .. p[8], "с" .. p[2], "т" .. p[1], "т" .. p[3], "ф" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[6], "ц" .. p[1], "ц" .. p[2], "ц" .. p[3], "ц" .. p[5], "ц" .. p[6], "ш" .. p[1], "ш" .. p[2], "ы" .. p[3], "г" .. p[3], "г" .. p[4], "з" .. p[1], "з" .. p[3], "к" .. p[3], "к" .. p[6], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[2], "х" .. p[5], "ц" .. p[4], "ч" .. p[1], "ч" .. p[2], "ч" .. p[3], "ы" .. p[1], "ы" .. p[2], "ь" .. p[1], "", } }, }, } m["ae"] = { "อเวสตะ", 29572, "ira-cen", "Avst, Gujr", translit = { Avst = "Avst-translit" }, } m["af"] = { "อาฟรีกานส์", 14196, "gmw-frk", "Latn, Arab", ancestors = "nl", sort_key = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'", from = {"['ʼ]n"}, to = {"n" .. 
p[1]} } }, } m["ak"] = { "อาคัน", 28026, "alv-ctn", "Latn", } m["am"] = { "อัมฮารา", 28244, "sem-eth", "Ethi", translit = "Ethi-translit", } m["an"] = { "อารากอน", 8765, "roa-nar", "Latn", } m["ar"] = { "อาหรับ", 13955, "sem-arb", "Arab, Hebr, Syrc, Brai, Nbat", translit = { Arab = "ar-translit" }, strip_diacritics = { Arab = "ar-stripdiacritics", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["as"] = { "อัสสัม", 29401, "inc-bas", "as-Beng", ancestors = "inc-mas", translit = "Beng-translit", } m["av"] = { "อะวาร์", 29561, "cau-ava", "Cyrl, Latn, Arab", ancestors = "oav", translit = { Cyrl = "cau-nec-translit", Arab = "ar-translit", }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"], }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = {"гъ", "гь", "гӏ", "ё", "кк", "къ", "кь", "кӏ", "лъ", "лӏ", "тӏ", "хх", "хъ", "хь", "хӏ", "цӏ", "чӏ"}, to = {"г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "к" .. p[4], "л" .. p[1], "л" .. p[2], "т" .. p[1], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[4], "ц" .. p[1], "ч" .. p[1]} }, }, } m["ay"] = { "ไอย์มารา", 4627, "sai-aym", "Latn", } m["az"] = { "อาเซอร์ไบจาน", 9292, "trk-ogz", "Latn, Cyrl, fa-Arab", ancestors = "trk-oat", dotted_dotless_i = true, strip_diacritics = { Latn = { from = {"ʼ"}, to = {"'"}, }, ["fa-Arab"] = { module = "ar-stripdiacritics", ["from"] = { "ۆ", "ۇ", "وْ", "ڲ", "ؽ", }, ["to"] = { "و", "و", "و", "گ", "ی", }, }, }, display_text = { Latn = { from = {"'"}, to = {"ʼ"} } }, sort_key = { Latn = { from = { "i", -- Ensure "i" comes after "ı". "ç", "ə", "ğ", "x", "ı", "q", "ö", "ş", "ü", "w" }, to = { "i" .. p[1], "c" .. p[1], "e" .. p[1], "g" .. p[1], "h" .. p[1], "i", "k" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1], "z" .. p[1] } }, Cyrl = { from = {"ғ", "ә", "ы", "ј", "ҝ", "ө", "ү", "һ", "ҹ"}, to = {"г" .. p[1], "е" .. 
p[1], "и" .. p[1], "и" .. p[2], "к" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. p[1], "ч" .. p[1]} }, }, } m["ba"] = { "แบชเคียร์", 13389, "trk-kbu", "Cyrl", translit = "ba-translit", override_translit = true, sort_key = { from = {"ғ", "ҙ", "ё", "ҡ", "ң", "ө", "ҫ", "ү", "һ", "ә"}, to = {"г" .. p[1], "д" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "с" .. p[1], "у" .. p[1], "х" .. p[1], "э" .. p[1]} }, } m["be"] = { "เบลารุส", 9091, "zle", "Cyrl, Latn", ancestors = "zle-mbe", translit = { Cyrl = "be-translit-Thai", }, strip_diacritics = { Cyrl = { remove_diacritics = c.grave .. c.acute, }, Latn = { remove_diacritics = c.grave .. c.acute, remove_exceptions = {"Ć", "ć", "Ń", "ń", "Ś", "ś", "Ź", "ź"}, }, }, sort_key = { Cyrl = { remove_diacritics = c.grave .. c.acute, from = {"ґ", "ё", "і", "ў"}, to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "у" .. p[1]} }, Latn = { remove_diacritics = c.grave .. c.acute, remove_exceptions = {"Ć", "ć", "Ń", "ń", "Ś", "ś", "Ź", "ź"}, from = {"ć", "č", "dz", "dź", "dž", "ch", "ł", "ń", "ś", "š", "ŭ", "ź", "ž"}, to = {"c" .. p[1], "c" .. p[2], "d" .. p[1], "d" .. p[2], "d" .. p[3], "h" .. p[1], "l" .. p[1], "n" .. p[1], "s" .. p[1], "s" .. p[2], "u" .. p[1], "z" .. p[1], "z" .. p[2]} }, }, standard_chars = { Cyrl = "АаБбВвГгДдЕеЁёЖжЗзІіЙйКкЛлМмНнОоПпРрСсТтУуЎўФфХхЦцЧчШшЫыЬьЭэЮюЯя", Latn = "AaBbCcĆćČčDdEeFfGgHhIiJjKkLlŁłMmNnŃńOoPpRrSsŚśŠšTtUuŬŭVvYyZzŹźŽž", (c.punc:gsub("'", "")) -- Exclude apostrophe. }, } m["bg"] = { "บัลแกเรีย", 7918, "zls", "Cyrl", ancestors = "cu-bgm", translit = "bg-translit", strip_diacritics = { remove_diacritics = c.grave .. c.acute, remove_exceptions = {"%f[^%z%s]ѝ%f[%z%s]"}, }, sort_key = { remove_diacritics = c.grave .. c.acute, remove_exceptions = {"%f[^%z%s]ѝ%f[%z%s]"}, }, standard_chars = "АаБбВвГгДдЕеЖжЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЬьЮюЯя" .. 
c.punc, } m["bh"] = { "พิหาร", 135305, "inc-eas", "Deva", translit = "Deva-translit", } m["bi"] = { "บิสลามา", 35452, "crp", "Latn", ancestors = "en", } m["bm"] = { "บัมบารา", 33243, "dmn-emn", "Latn, Nkoo", sort_key = { Latn = { from = {"ɛ", "ɲ", "ŋ", "ɔ"}, to = {"e" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1]} }, }, } m["bn"] = { "เบงกอล", 9610, "inc-bas", "Beng, Newa", ancestors = "inc-mbn", translit = { Beng = "Beng-translit" }, } m["bo"] = { "ทิเบต", 34271, "sit-tib", "Tibt", -- sometimes Deva? ancestors = "xct", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["br"] = { "เบรอตง", 12107, "cel-brs", "Latn", ancestors = "xbm", sort_key = { from = {"ch", "c['ʼ’]h"}, to = {"c" .. p[1], "c" .. p[2]} }, } m["ca"] = { "กาตาลา", 7026, "roa-ocr", "Latn", ancestors = "roa-oca", sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"}, standard_chars = "AaÀàBbCcÇçDdEeÉéÈèFfGgHhIiÍíÏïJjLlMmNnOoÓóÒòPpQqRrSsTtUuÚúÜüVvXxYyZz·" .. c.punc, } m["ce"] = { "เชเชน", 33350, "cau-vay", "Cyrl, Latn, Arab", translit = { Cyrl = "cau-nec-translit", Arab = "ar-translit", }, override_translit = true, display_text = { Cyrl = s["cau-Cyrl-displaytext"] }, strip_diacritics = { Cyrl = s["cau-Cyrl-stripdiacritics"], Latn = s["cau-Latn-stripdiacritics"], }, sort_key = { Cyrl = { from = {"аь", "гӏ", "ё", "кх", "къ", "кӏ", "оь", "пӏ", "тӏ", "уь", "хь", "хӏ", "цӏ", "чӏ", "юь", "яь"}, to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "к" .. p[2], "к" .. p[3], "о" .. p[1], "п" .. p[1], "т" .. p[1], "у" .. p[1], "х" .. p[1], "х" .. p[2], "ц" .. p[1], "ч" .. p[1], "ю" .. p[1], "я" .. p[1]} }, }, } m["ch"] = { "ชามอร์โร", 33262, "poz", "Latn", sort_key = { remove_diacritics = "'", from = {"å", "ch", "ñ", "ng"}, to = {"a" .. p[1], "c" .. p[1], "n" .. p[1], "n" .. p[2]} }, } m["co"] = { "คอร์ซิกา", 33111, "roa-itr", "Latn", sort_key = { from = {"chj", "ghj", "sc", "sg"}, to = {"c" .. p[1], "g" .. 
p[1], "s" .. p[1], "s" .. p[2]} }, standard_chars = "AaÀàBbCcDdEeÈèFfGgHhIiÌìÏïJjLlMmNnOoÒòPpQqRrSsTtUuÙùÜüVvZz" .. c.punc, } m["cr"] = { "ครี", 33390, "alg", "Latn, Cans", translit = { Cans = "cr-translit" }, } m["cs"] = { "เช็ก", 9056, "zlw", "Latn", ancestors = "cs-ear", sort_key = { from = {"á", "č", "ď", "é", "ě", "ch", "í", "ň", "ó", "ř", "š", "ť", "ú", "ů", "ý", "ž"}, to = {"a" .. p[1], "c" .. p[1], "d" .. p[1], "e" .. p[1], "e" .. p[2], "h" .. p[1], "i" .. p[1], "n" .. p[1], "o" .. p[1], "r" .. p[1], "s" .. p[1], "t" .. p[1], "u" .. p[1], "u" .. p[2], "y" .. p[1], "z" .. p[1]} }, standard_chars = "AaÁáBbCcČčDdĎďEeÉéĚěFfGgHhIiÍíJjKkLlMmNnŇňOoÓóPpRrŘřSsŠšTtŤťUuÚúŮůVvYyÝýZzŽž" .. c.punc, } m["cu"] = { "สลาวอนิกคริสตจักรเก่า", 35499, "zls", "Cyrs, Glag, Zname", translit = { Cyrs = "Cyrs-translit", Glag = "Glag-translit" }, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["cv"] = { "ชูวัช", 33348, "trk-ogr", "Cyrl", ancestors = "cv-mid", translit = "cv-translit", override_translit = true, sort_key = { from = {"ӑ", "ё", "ӗ", "ҫ", "ӳ"}, to = {"а" .. p[1], "е" .. p[1], "е" .. p[2], "с" .. p[1], "у" .. p[1]} }, } m["cy"] = { "เวลส์", 9309, "cel-brw", "Latn", ancestors = "wlm", sort_key = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. "'", from = {"ch", "dd", "ff", "ng", "ll", "ph", "rh", "th"}, to = {"c" .. p[1], "d" .. p[1], "f" .. p[1], "g" .. p[1], "l" .. p[1], "p" .. p[1], "r" .. p[1], "t" .. p[1]} }, standard_chars = "ÂâAaBbCcDdEeÊêFfGgHhIiÎîLlMmNnOoÔôPpRrSsTtUuÛûWwŴŵYyŶŷ" .. c.punc, } m["da"] = { "เดนมาร์ก", 9035, "gmq-eas", "Latn", ancestors = "gmq-oda", sort_key = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla, remove_exceptions = {"å"}, from = {"æ", "ø", "å"}, to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]} }, standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÆæØøÅå" .. 
c.punc, } m["de"] = { "เยอรมัน", 188, "gmw-hgm", "Latn, Latf, Brai", ancestors = "de-ear", sort_key = { Latn = s["de-Latn-sortkey"], Latf = s["de-Latn-sortkey"], }, standard_chars = { Latn = s["de-Latn-standardchars"], Latf = s["de-Latn-standardchars"], Brai = c.braille, c.punc } } m["dv"] = { "มัลดีฟส์", 32656, "inc-ins", "Thaa, Diak", translit = { Thaa = "Thaa-translit", Diak = "Diak-translit", }, override_translit = true, } m["dz"] = { "ซองคา", 33081, "sit-tib", "Tibt", ancestors = "xct", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ee"] = { "เอเว", 30005, "alv-gbe", "Latn", sort_key = { remove_diacritics = c.tilde, from = {"ɖ", "dz", "ɛ", "ƒ", "gb", "ɣ", "kp", "ny", "ŋ", "ɔ", "ts", "ʋ"}, to = {"d" .. p[1], "d" .. p[2], "e" .. p[1], "f" .. p[1], "g" .. p[1], "g" .. p[2], "k" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "t" .. p[1], "v" .. p[1]} }, } m["el"] = { "กรีก", 9129, "grk", "Grek, Polyt, Brai", ancestors = "el-kth", translit = "el-translit", override_translit = true, -- Grek and Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]] standard_chars = { Grek = "΅·ͺ΄ΑαΆάΒβΓγΔδΕεέΈΖζΗηΉήΘθΙιΊίΪϊΐΚκΛλΜμΝνΞξΟοΌόΠπΡρΣσςΤτΥυΎύΫϋΰΦφΧχΨψΩωΏώ", Brai = c.braille, c.punc }, } m["en"] = { "อังกฤษ", 1860, "gmw-ang", "Latn, Brai, Shaw, Dsrt", -- entries in Shaw or Dsrt might require prior discussion wikimedia_codes = "en, simple", ancestors = "en-ear", sort_key = { Latn = { -- Many of these are needed for sorting language names. remove_diacritics = "'\"%-%.,%s·ʻʼ" .. c.diacritics, -- These are found in pagenames. 
from = {"[ɒæ🅱¢©ᴄðđəǝɜɡħʜıɨłŋɲøɔœꝑꝓꝕßʋ]"}, to = {{ ["ɒ"] = "a", ["æ"] = "ae", ["🅱"] = "b", ["¢"] = "c", ["©"] = "c", ["ᴄ"] = "c", ["ð"] = "d", ["đ"] = "d", ["ə"] = "e", ["ǝ"] = "e", ["ɜ"] = "e", ["ɡ"] = "g", ["ħ"] = "h", ["ʜ"] = "h", ["ı"] = "i", ["ɨ"] = "i", ["ł"] = "l", ["ŋ"] = "n", ["ɲ"] = "n", ["ø"] = "o", ["ɔ"] = "o", ["œ"] = "oe", ["ꝑ"] = "p", ["ꝓ"] = "p", ["ꝕ"] = "p", ["ß"] = "ss", ["ʋ"] = "v", }}, }, }, standard_chars = { Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz", Brai = c.braille, c.punc }, } m["eo"] = { "เอสเปรันโต", 143, "art", "Latn", --translit = "eo-translit", -- already handled in Module:headword & Module:links sort_key = { remove_diacritics = c.grave .. c.acute, from = {"ĉ", "ĝ", "ĥ", "ĵ", "ŝ", "ŭ"}, to = {"c" .. p[1], "g" .. p[1], "h" .. p[1], "j" .. p[1], "s" .. p[1], "u" .. p[1]} }, standard_chars = "AaBbCcĈĉDdEeFfGgĜĝHhĤĥIiJjĴĵKkLlMmNnOoPpRrSsŜŝTtUuŬŭVvZz" .. c.punc, } m["es"] = { "สเปน", 1321, "roa-cas", "Latn, Brai", ancestors = "es-ear", --translit = "es-translit", -- already handled in Module:headword & Module:links sort_key = { Latn = { remove_exceptions = {"ñ"}, remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.diaer .. c.cedilla, from = {"ª", "æ", "ñ", "º", "œ"}, to = {"a", "ae", "n" .. p[1], "o", "oe"} }, }, standard_chars = { Latn = "AaÁáBbCcDdEeÉéFfGgHhIiÍíJjLlMmNnÑñOoÓóPpQqRrSsTtUuÚúÜüVvXxYyZz", Brai = c.braille, c.punc }, } m["et"] = { "เอสโตเนีย", 9072, "urj-fin", "Latn", sort_key = { from = { "š", "ž", "õ", "ä", "ö", "ü", -- 2 chars "z" -- 1 char }, to = { "s" .. p[1], "s" .. p[3], "w" .. p[1], "w" .. p[2], "w" .. p[3], "w" .. p[4], "s" .. p[2] } }, standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvÕõÄäÖöÜü" .. c.punc, } m["eu"] = { "บาสก์", 8752, "euq", "Latn", sort_key = { from = {"ç", "ñ"}, to = {"c" .. p[1], "n" .. p[1]} }, standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnÑñOoPpRrSsTtUuXxZz" .. 
c.punc, } m["fa"] = { "เปอร์เซีย", 9168, "ira-swi", "fa-Arab, Hebr", ancestors = "fa-cls", strip_diacritics = { ["fa-Arab"] = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ٱ"}, -- character "ۂ" code U+06C2 to "ه"; hamzatu l-waṣli to a regular alif to = {"ه", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, }, }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ff"] = { "ฟูลา", 33454, "alv-fwo", "Latn, Adlm", } m["fi"] = { "ฟินแลนด์", 1412, "urj-fin", "Latn", display_text = { from = {"'"}, to = {"’"} }, strip_diacritics = { -- used to indicate gemination of the next consonant remove_diacritics = "ˣ", from = {"’"}, to = {"'"}, }, sort_key = { -- [[Appendix:Finnish alphabet#Collation]] + "aͤ" and "oͤ" as historical variants of "ä" and "ö". remove_diacritics = "'’:" .. c.diacritics, remove_exceptions = { "a[" .. c.ringabove .. c.diaer .. c.small_e .. "]", -- åäaͤ "o[" .. c.diaer .. c.tilde .. c.dacute .. c.small_e .. "]", -- öõőoͤ "u[" .. c.diaer .. c.dacute .. "]" -- üű }, from = {"æ", "[ðđ]", "ł", "ŋ", "œ", "ß", "þ", "u[" .. c.diaer .. c.dacute .. "]", "å", "aͤ", "o[" .. c.tilde .. c.dacute .. c.small_e .. "]", "ø", "(.)['%-]"}, to = {"ae", "d", "l", "n", "oe", "ss", "th", "y", "z" .. p[1], "ä", "ö", "ö", "%1"} }, standard_chars = "AaBbDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvYyÄäÖö" .. c.punc, } m["fj"] = { "ฟีจี", 33295, "poz-pcc", "Latn", } m["fo"] = { "แฟโร", 25258, "gmq-ins", "Latn", sort_key = { from = {"á", "ð", "í", "ó", "ú", "ý", "æ", "ø"}, to = {"a" .. p[1], "d" .. p[1], "i" .. p[1], "o" .. p[1], "u" .. p[1], "y" .. p[1], "z" .. p[1], "z" .. p[2]} }, standard_chars = "AaÁáBbDdÐðEeFfGgHhIiÍíJjKkLlMmNnOoÓóPpRrSsTtUuÚúVvYyÝýÆæØø" .. 
c.punc, } m["fr"] = { "ฝรั่งเศส", 150, "roa-oil", "Latn, Brai", ancestors = "frm", sort_key = { Latn = s["roa-oil-sortkey"] }, standard_chars = { Latn = "AaÀàÂâBbCcÇçDdEeÉéÈèÊêËëFfGgHhIiÎîÏïJjLlMmNnOoÔôŒœPpQqRrSsTtUuÙùÛûÜüVvXxYyZz", Brai = c.braille, c.punc }, } m["fy"] = { "ฟรีเชียตะวันตก", 27175, "gmw-fri", "Latn", sort_key = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer, from = {"y"}, to = {"i"} }, standard_chars = "AaâäàÆæBbCcDdEeéêëèFfGgHhIiïìYyỳJjKkLlMmNnOoôöòPpRrSsTtUuúûüùVvWwZz" .. c.punc, } m["ga"] = { "ไอริช", 9142, "cel-gae", "Latn, Latg", ancestors = "mga", sort_key = { remove_diacritics = c.acute, from = {"ḃ", "ċ", "ḋ", "ḟ", "ġ", "ṁ", "ṗ", "ṡ", "ṫ"}, to = {"bh", "ch", "dh", "fh", "gh", "mh", "ph", "sh", "th"} }, standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíLlMmNnOoÓóPpRrSsTtUuÚúVv" .. c.punc, } m["gd"] = { "แกลิกแบบสกอตแลนด์", 9314, "cel-gae", "Latn, Latg", ancestors = "mga", sort_key = {remove_diacritics = c.grave .. c.acute}, standard_chars = "AaÀàBbCcDdEeÈèFfGgHhIiÌìLlMmNnOoÒòPpRrSsTtUuÙù" .. c.punc, } m["gl"] = { "กาลิเซีย", 9307, "roa-gap", "Latn", sort_key = { remove_diacritics = c.acute, from = {"ñ"}, to = {"n" .. p[1]} }, standard_chars = "AaÁáBbCcDdEeÉéFfGgHhIiÍíÏïLlMmNnÑñOoÓóPpQqRrSsTtUuÚúÜüVvXxZz" .. c.punc, } m["gu"] = { "คุชราต", 5137, "inc-wes", "Arab, Gujr", ancestors = "inc-mgu", translit = { Gujr = "Gujr-translit", }, strip_diacritics = { Arab = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun}, Gujr = {remove_diacritics = "઼"}, }, } m["gv"] = { "แมงซ์", 12175, "cel-gae", "Latn", ancestors = "mga", sort_key = {remove_diacritics = c.cedilla .. "-"}, standard_chars = "AaBbCcÇçDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwYy" .. c.punc, } m["ha"] = { "เฮาซา", 56475, "cdc-wst", "Latn, Arab", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. 
c.macron} }, sort_key = { Latn = { from = {"ɓ", "b'", "ɗ", "d'", "ƙ", "k'", "sh", "ƴ", "'y"}, to = {"b" .. p[1], "b" .. p[2], "d" .. p[1], "d" .. p[2], "k" .. p[1], "k" .. p[2], "s" .. p[1], "y" .. p[1], "y" .. p[2]} }, }, } m["he"] = { "ฮีบรู", 9288, "sem-can", "Hebr, Phnx, Brai, Samr", ancestors = "he-med", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- Samr strip_diacritics, sort_key in [[Module:scripts/data]] -- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["hi"] = { "ฮินดี", 1568, "inc-hnd", "Deva, Kthi, Newa", translit = { Deva = "Deva-translit", Kthi = "Kthi-translit", Newa = "Newa-translit", }, standard_chars = { Deva = "अआइईउऊएऐओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसहत्रज्ञक्षक़ख़ग़ज़झ़ड़ढ़फ़काखागाघाङाचाछाजाझाञाटाठाडाढाणाताथादाधानापाफाबाभामायारालावाशाषासाहात्राज्ञाक्षाक़ाख़ाग़ाज़ाझ़ाड़ाढ़ाफ़ाकिखिगिघिङिचिछिजिझिञिटिठिडिढिणितिथिदिधिनिपिफिबिभिमियिरिलिविशिषिसिहित्रिज्ञिक्षिक़िख़िग़िज़िझ़िड़िढ़िफ़िकीखीगीघीङीचीछीजीझीञीटीठीडीढीणीतीथीदीधीनीपीफीबीभीमीयीरीलीवीशीषीसीहीत्रीज्ञीक्षीक़ीख़ीग़ीज़ीझ़ीड़ीढ़ीफ़ीकुखुगुघुङुचुछुजुझुञुटुठुडुढुणुतुथुदुधुनुपुफुबुभुमुयुरुलुवुशुषुसुहुत्रुज्ञुक्षुक़ुख़ुग़ुज़ुझ़ुड़ुढ़ुफ़ुकूखूगूघूङूचूछूजूझूञूटूठूडूढूणूतूथूदूधूनूपूफूबूभूमूयूरूलूवूशूषूसूहूत्रूज्ञूक्षूक़ूख़ूग़ूज़ूझ़ूड़ूढ़ूफ़ूकेखेगेघेङेचेछेजेझेञेटेठेडेढेणेतेथेदेधेनेपेफेबेभेमेयेरेलेवेशेषेसेहेत्रेज्ञेक्षेक़ेख़ेग़ेज़ेझ़ेड़ेढ़ेफ़ेकैखैगैघैङैचैछैजैझैञैटैठैडैढैणैतैथैदैधैनैपैफैबैभैमैयैरैलैवैशैषैसैहैत्रैज्ञैक्षैक़ैख़ैग़ैज़ैझ़ैड़ैढ़ैफ़ैकोखोगोघोङोचोछोजोझोञोटोठोडोढोणोतोथोदोधोनोपोफोबोभोमोयोरोलोवोशोषोसोहोत्रोज्ञोक्षोक़ोख़ोग़ोज़ोझ़ोड़ोढ़ोफ़ोकौखौगौघौङौचौछौजौझौञौटौठौडौढौणौतौथौदौधौनौपौफौबौभौमौयौरौलौवौशौषौसौहौत्रौज्ञौक्षौक़ौख़ौग़ौज़ौझ़ौड़ौढ़ौफ़ौक्ख्ग्घ्ङ्च्छ्ज्झ्ञ्ट्ठ्ड्ढ्ण्त्थ्द्ध्न्प्फ्ब्भ्म्य्र्ल्व्श्ष्स्ह्त्र्ज्ञ्क्ष्क़्ख़्ग़्ज़्झ़्ड़्ढ़्फ़्।॥०१२३४५६७८९॰", c.punc }, } m["ho"] = { "ฮีรีโมตู", 33617, "crp", "Latn", ancestors = "meu", } m["ht"] = { "ครีโอลเฮติ", 33491, "crp", "Latn", ancestors = "ht-sdm", sort_key = { 
from = { "oun", -- 3 chars "an", "ch", "è", "en", "ng", "ò", "on", "ou", "ui" -- 2 chars }, to = { "o" .. p[4], "a" .. p[1], "c" .. p[1], "e" .. p[1], "e" .. p[2], "n" .. p[1], "o" .. p[1], "o" .. p[2], "o" .. p[3], "u" .. p[1] } }, } m["hu"] = { "ฮังการี", 9067, "urj-ugr", "Latn, Hung", ancestors = "ohu", sort_key = { Latn = { from = { "dzs", -- 3 chars "á", "cs", "dz", "é", "gy", "í", "ly", "ny", "ó", "ö", "ő", "sz", "ty", "ú", "ü", "ű", "zs", -- 2 chars }, to = { "d" .. p[2], "a" .. p[1], "c" .. p[1], "d" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "l" .. p[1], "n" .. p[1], "o" .. p[1], "o" .. p[2], "o" .. p[3], "s" .. p[1], "t" .. p[1], "u" .. p[1], "u" .. p[2], "u" .. p[3], "z" .. p[1], } }, }, standard_chars = { Latn = "AaÁáBbCcDdEeÉéFfGgHhIiÍíJjKkLlMmNnOoÓóÖöŐőPpQqRrSsTtUuÚúÜüŰűVvWwXxYyZz", c.punc }, } m["hy"] = { "อาร์มีเนีย", 8785, "hyx", "Armn, Brai", ancestors = "axm", -- Armn translit in [[Module:scripts/data]] override_translit = true, strip_diacritics = { Armn = { remove_diacritics = "՛՜՞՟", from = {"եւ", "<sup>յ</sup>", "<sup>ի</sup>", "<sup>է</sup>", "յ̵", "ՙ", "՚"}, to = {"և", "յ", "ի", "է", "ֈ", "ʻ", "’"} }, }, sort_key = { Armn = { from = { "ու", "եւ", -- 2 chars "և" -- 1 char }, to = { "ւ", "եվ", "եվ" } }, }, } m["hz"] = { "เฮเรโร", 33315, "bnt-swb", "Latn", } m["ia"] = { "อินเทอร์ลิงกวา", 35934, "art", "Latn", } m["id"] = { "อินโดนีเซีย", 9240, "poz-mly", "Latn", ancestors = "ms", standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz" .. c.punc, } m["ie"] = { "อินเทร์ลิงเกว", 35850, "art", "Latn", type = "appendix-constructed", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ}, } m["ig"] = { "อิกโบ", 33578, "alv-igb", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron}, sort_key = { from = {"gb", "gh", "gw", "ị", "kp", "kw", "ṅ", "nw", "ny", "ọ", "sh", "ụ"}, to = {"g" .. p[1], "g" .. p[2], "g" .. p[3], "i" .. p[1], "k" .. p[1], "k" .. p[2], "n" .. p[1], "n" .. p[2], "n" .. 
p[3], "o" .. p[1], "s" .. p[1], "u" .. p[1]} }, } m["ii"] = { "นอซู", 34235, "tbq-nlo", "Yiii", translit = "ii-translit", } m["ik"] = { "Inupiaq", 27183, "esx-inu", "Latn", sort_key = { from = { "ch", "ġ", "dj", "ḷ", "ł̣", "ñ", "ng", "r̂", "sr", "zr", -- 2 chars "ł", "ŋ", "ʼ" -- 1 char }, to = { "c" .. p[1], "g" .. p[1], "h" .. p[1], "l" .. p[1], "l" .. p[3], "n" .. p[1], "n" .. p[2], "r" .. p[1], "s" .. p[1], "z" .. p[1], "l" .. p[2], "n" .. p[2], "z" .. p[2] } }, } m["io"] = { "อีโด", 35224, "art", "Latn", } m["is"] = { "ไอซ์แลนด์", 294, "gmq-ins", "Latn", sort_key = { from = {"á", "ð", "é", "í", "ó", "ú", "ý", "þ", "æ", "ö"}, to = {"a" .. p[1], "d" .. p[1], "e" .. p[1], "i" .. p[1], "o" .. p[1], "u" .. p[1], "y" .. p[1], "z" .. p[1], "z" .. p[2], "z" .. p[3]} }, standard_chars = "AaÁáBbDdÐðEeÉéFfGgHhIiÍíJjKkLlMmNnOoÓóPpRrSsTtUuÚúVvXxYyÝýÞþÆæÖö" .. c.punc, } m["it"] = { "อิตาลี", 652, "roa-itr", "Latn", ancestors = "roa-oit", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove}, standard_chars = "AaÀàBbCcDdEeÈèÉéFfGgHhIiÌìLlMmNnOoÒòPpQqRrSsTtUuÙùVvZz" .. c.punc, } m["iu"] = { "อินุกติตุต", 29921, "esx-inu", "Cans, Latn", translit = { Cans = "cr-translit" }, override_translit = true, } m["ja"] = { "ญี่ปุ่น", 5287, "jpx", "Jpan, Latn, Brai", ancestors = "ja-ear", translit = s["jpx-translit"], link_tr = true, display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["jv"] = { "ชวา", 33549, "poz", "Latn, Java, Arab", ancestors = "kaw", translit = { Java = "Java-translit" }, link_tr = true, strip_diacritics = { Latn = {remove_diacritics = c.circ} -- Modern jv doesn't use ê }, sort_key = { Latn = { from = {"å", "dh", "é", "è", "ng", "ny", "th"}, to = {"a" .. p[1], "d" .. p[1], "e" .. p[1], "e" .. p[2], "n" .. p[1], "n" .. p[2], "t" ..
p[1]} }, }, } m["ka"] = { "จอร์เจีย", 8108, "ccs-gzn", "Geor, Geok, Hebr", -- Hebr is used to write Judeo-Georgian ancestors = "ka-mid", -- Geor, Geok translit in [[Module:scripts/data]] override_translit = true, strip_diacritics = { Geor = s["ka-stripdiacritics"], Geok = s["ka-stripdiacritics"], }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["kg"] = { "คองโก", 33702, "bnt-kng", "Latn", } m["ki"] = { "เกโกโย", 33587, "bnt-kka", "Latn", } m["kj"] = { "Kwanyama", 1405077, "bnt-ova", "Latn", } m["kk"] = { "คาซัค", 9252, "trk-kno", "Cyrl, Latn, kk-Arab", translit = { Cyrl = { from = { "Ё", "ё", "Й", "й", "Нг", "нг", "Ӯ", "ӯ", -- 2 chars; are "Ӯ" and "ӯ" actually used? "А", "а", "Ә", "ә", "Б", "б", "В", "в", "Г", "г", "Ғ", "ғ", "Д", "д", "Е", "е", "Ж", "ж", "З", "з", "И", "и", "К", "к", "Қ", "қ", "Л", "л", "М", "м", "Н", "н", "Ң", "ң", "О", "о", "Ө", "ө", "П", "п", "Р", "р", "С", "с", "Т", "т", "У", "у", "Ұ", "ұ", "Ү", "ү", "Ф", "ф", "Х", "х", "Һ", "һ", "Ц", "ц", "Ч", "ч", "Ш", "ш", "Щ", "щ", "Ъ", "ъ", "Ы", "ы", "І", "і", "Ь", "ь", "Э", "э", "Ю", "ю", "Я", "я", -- 1 char }, to = { "E", "e", "İ", "i", "Ñ", "ñ", "U", "u", "A", "a", "Ä", "ä", "B", "b", "V", "v", "G", "g", "Ğ", "ğ", "D", "d", "E", "e", "J", "j", "Z", "z", "İ", "i", "K", "k", "Q", "q", "L", "l", "M", "m", "N", "n", "Ñ", "ñ", "O", "o", "Ö", "ö", "P", "p", "R", "r", "S", "s", "T", "t", "U", "u", "Ū", "ū", "Ü", "ü", "F", "f", "X", "x", "H", "h", "S", "s", "Ç", "ç", "Ş", "ş", "Ş", "ş", "", "", "Y", "y", "I", "ı", "", "", "É", "é", "Ü", "ü", "Ä", "ä", } } }, -- override_translit = true, sort_key = { Cyrl = { from = {"ә", "ғ", "ё", "қ", "ң", "ө", "ұ", "ү", "һ", "і"}, to = {"а" .. p[1], "г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "у" .. p[2], "х" .. p[1], "ы" .. 
p[1]} }, }, standard_chars = { Cyrl = "АаӘәБбВвГгҒғДдЕеЁёЖжЗзИиЙйКкҚқЛлМмНнҢңОоӨөПпРрСсТтУуҰұҮүФфХхҺһЦцЧчШшЩщЪъЫыІіЬьЭэЮюЯя", c.punc }, } m["kl"] = { "กรีนแลนด์", 25355, "esx-inu", "Latn", sort_key = { from = {"æ", "ø", "å"}, to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]} } } m["km"] = { "เขมร", 9205, "mkh-kmr", "Khmr", ancestors = "xhm", translit = "Khmr-translit", } m["kn"] = { "กันนฑะ", 33673, "dra-kan", "Knda, Tutg", ancestors = "dra-mkn", -- Knda translit in [[Module:scripts/data]] } m["ko"] = { "เกาหลี", 9176, "qfa-kor", "Kore, Brai", ancestors = "ko-ear", translit = { Kore = "ko-translit", }, -- Kore strip_diacritics in [[Module:scripts/data]] } m["kr"] = { "กานูรี", 36094, "ssa-sah", "Latn, Arab", -- the sortkey and strip_diacritics are only for standard Kanuri; when dialectal entries get added, someone will have to work out how the dialects should be represented orthographically strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.breve} }, sort_key = { Latn = { from = {"ǝ", "ny", "ɍ", "sh"}, to = {"e" .. p[1], "n" .. p[1], "r" .. p[1], "s" .. p[1]} }, }, } m["ks"] = { "แคชเมียร์", 33552, "inc-kas", "ks-Arab, Deva, Shrd, Latn", translit = { ["ks-Arab"] = "ks-Arab-translit", Deva = "Deva-translit", -- Shrd translit in [[Module:scripts/data]] }, } -- "kv" is treated as "koi", "kpv", see [[WT:LT]] m["kw"] = { "คอร์นวอลล์", 25289, "cel-brs", "Latn", ancestors = "cnx", sort_key = { from = {"ch"}, to = {"c" .. p[1]} }, } m["ky"] = { "คีร์กีซ", 9255, "trk-kkp", "Cyrl, Latn, Arab", translit = { Cyrl = "ky-translit" }, override_translit = true, sort_key = { Cyrl = { from = {"ё", "ң", "ө", "ү"}, to = {"е" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. 
p[1]} }, }, } m["la"] = { "ละติน", 397, "itc-laf", "Latn, Ital", ancestors = "itc-ola", --translit = "la-translit", -- already handled in Module:headword & Module:links -- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission) display_text = { Latn = s["itc-Latn-displaytext"] }, strip_diacritics = { Latn = s["itc-Latn-stripdiacritics"] }, sort_key = { Latn = s["itc-Latn-sortkey"] }, standard_chars = { Latn = "AaBbCcDdEeFfGgHhIiLlMmNnOoPpQqRrSsTtUuVvXx", c.punc }, } m["lb"] = { "ลักเซมเบิร์ก", 9051, "gmw-hgm", "Latn, Brai", ancestors = "gmw-cfr", sort_key = { Latn = { from = {"ä", "ë", "é"}, to = {"z" .. p[1], "z" .. p[2], "z" .. p[3]} }, }, } m["lg"] = { "ลูกันดา", 33368, "bnt-nyg", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.circ}, sort_key = { from = {"ŋ"}, to = {"n" .. p[1]} }, } m["li"] = { "ลิมเบิร์ก", 102172, "gmw-frk", "Latn", ancestors = "dum", } m["ln"] = { "ลิงกาลา", 36217, "bnt-bmo", "Latn", sort_key = { remove_diacritics = c.acute .. c.circ .. c.caron, from = {"ɛ", "gb", "mb", "mp", "nd", "ng", "nk", "ns", "nt", "ny", "nz", "ɔ"}, to = {"e" .. p[1], "g" .. p[1], "m" .. p[1], "m" .. p[2], "n" .. p[1], "n" .. p[2], "n" .. p[3], "n" .. p[4], "n" .. p[5], "n" .. p[6], "n" .. p[7], "o" .. p[1]} }, } m["lo"] = { "ลาว", 9211, "tai-swe", "Laoo", -- also Tai Noi/Lao Buhan script translit = "Laoo-translit", --sort_key = "Laoo-sortkey", standard_chars = "0-9ກຂຄງຈຊຍດຕຖທນບປຜຝພຟມຢຣລວສຫອຮຯ-ໝ" .. c.punc, } m["lt"] = { "ลิทัวเนีย", 9083, "bat-eas", "Latn", ancestors = "olt", display_text = "lt-common", strip_diacritics = "lt-common", sort_key = "lt-common", standard_chars = "AaĄąBbCcČčDdEeĘęĖėFfGgHhIiĮįYyJjKkLlMmNnOoPpRrSsŠšTtUuŲųŪūVvZzŽž" .. c.punc, } m["lu"] = { "Luba-Katanga", 36157, "bnt-lub", "Latn", } m["lv"] = { "ลัตเวีย", 9078, "bat-eas", "Latn", strip_diacritics = { -- This attempts to convert vowels with tone marks to vowels either with or without macrons. 
Specifically, there should be no macrons if the vowel is part of a diphthong (including resonant diphthongs such as pìrksts -> pirksts not #pīrksts). What we do is first convert the vowel + tone mark to a vowel + tilde in a decomposed fashion, then remove the tilde in diphthongs, then convert the remaining vowel + tilde sequences to macroned vowels, then delete any other tilde. We leave already-macroned vowels alone: Both e.g. ar and ār occur before consonants. FIXME: This still might not be sufficient. from = {"([Ee])" .. c.cedilla, "[" .. c.grave .. c.circ .. c.tilde .."]", "([aAeEiIoOuU])" .. c.tilde .."?([lrnmuiLRNMUI])" .. c.tilde .. "?([^aAeEiIoOuU])", "([aAeEiIoOuU])" .. c.tilde .."?([lrnmuiLRNMUI])" .. c.tilde .."?$", "([iI])" .. c.tilde .. "?([eE])" .. c.tilde .. "?", "([aAeEiIuU])" .. c.tilde, c.tilde}, to = {"%1", c.tilde, "%1%2%3", "%1%2", "%1%2", "%1" .. c.macron} }, sort_key = { from = {"ā", "č", "ē", "ģ", "ī", "ķ", "ļ", "ņ", "š", "ū", "ž"}, to = {"a" .. p[1], "c" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "k" .. p[1], "l" .. p[1], "n" .. p[1], "s" .. p[1], "u" .. p[1], "z" .. p[1]} }, standard_chars = "AaĀāBbCcČčDdEeĒēFfGgĢģHhIiĪīJjKkĶķLlĻļMmNnŅņOoPpRrSsŠšTtUuŪūVvZzŽž" .. c.punc, } m["mg"] = { "มาลากาซี", 7930, "poz-bre", "Latn, Arab", } m["mh"] = { "มาร์แชลล์", 36280, "poz-mic", "Latn", sort_key = { from = {"ā", "ļ", "m̧", "ņ", "n̄", "o̧", "ō", "ū"}, to = {"a" .. p[1], "l" .. p[1], "m" .. p[1], "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "u" .. p[1]} }, } m["mi"] = { "มาวรี", 36451, "poz-pep", "Latn", sort_key = { remove_diacritics = c.macron, from = {"ng", "wh"}, to = {"n" .. p[1], "w" ..
p[1]} }, } m["mk"] = { "มาซิโดเนีย", 9296, "zls", "Cyrl, Polyt", ancestors = "cu", translit = { Cyrl = "mk-translit", -- FIXME: formerly no translit specified for Polyt; unclear if the default [[Module:grc-translit]] is -- acceptable, so we disable it for now Polyt = false, }, strip_diacritics = { Cyrl = { remove_diacritics = c.acute, remove_exceptions = {"Ѓ", "ѓ", "Ќ", "ќ"} }, }, sort_key = { Cyrl = { remove_diacritics = c.grave, remove_exceptions = {"ѓ", "ќ"}, from = {"ѓ", "ѕ", "ј", "љ", "њ", "ќ", "џ"}, to = {"д" .. p[1], "з" .. p[1], "и" .. p[1], "л" .. p[1], "н" .. p[1], "т" .. p[1], "ч" .. p[1]} }, }, -- Polyt display_text, strip_diacritics, sort_key in [[Module:scripts/data]] standard_chars = { Cyrl = "АаБбВвГгДдЃѓЕеЖжЗзЅѕИиЈјКкЛлЉљМмНнЊњОоПпРрСсТтЌќУуФфХхЦцЧчЏџШш", c.punc }, } m["ml"] = { "มลยาฬัม", 36236, "dra-mal", "Mlym", override_translit = true, -- Mlym translit in [[Module:scripts/data]] } m["mn"] = { "มองโกเลีย", 9246, "xgn-cen", "Cyrl, Mong, Latn, Brai", ancestors = "cmg", translit = { Cyrl = "mn-translit", -- Mong translit in [[Module:scripts/data]] }, override_translit = true, -- Mong display_text and strip_diacritics in [[Module:scripts/data]] strip_diacritics = { Cyrl = {remove_diacritics = c.grave .. c.acute}, }, sort_key = { Cyrl = { remove_diacritics = c.grave, from = {"ё", "ө", "ү"}, to = {"е" .. p[1], "о" .. p[1], "у" .. 
p[1]} }, }, standard_chars = { Cyrl = "АаБбВвГгДдЕеЁёЖжЗзИиЙйЛлМмНнОоӨөРрСсТтУуҮүХхЦцЧчШшЫыЬьЭэЮюЯя—", Brai = c.braille, c.punc }, } -- "mo" is treated as "ro", see [[WT:LT]] m["mr"] = { "มราฐี", 1571, "inc-sou", "Deva, Modi", ancestors = "omr", translit = { Deva = "Deva-translit", Modi = "Modi-translit", }, strip_diacritics = { Deva = { from = {"च़", "ज़", "झ़"}, to = {"च", "ज", "झ"} }, }, } m["ms"] = { "มาเลเซีย", 9237, "poz-mly", "Latn, ms-Arab", ancestors = "ms-cla", standard_chars = { Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz", c.punc }, } m["mt"] = { "มอลตา", 9166, "sem-arb", "Latn", display_text = { from = {"'"}, to = {"’"} }, strip_diacritics = { from = {"’"}, to = {"'"}, }, ancestors = "sqr", sort_key = { from = { "ċ", "ġ", "ż", -- Convert into PUA so that decomposed form does not get caught by the next step. "([cgz])", -- Ensure "c" comes after "ċ", "g" comes after "ġ" and "z" comes after "ż". "g" .. p[1] .. "ħ", -- "għ" after initial conversion of "g". p[3], p[4], "ħ", "ie", p[5] -- Convert "ċ", "ġ", "ħ", "ie", "ż" into final output. }, to = { p[3], p[4], p[5], "%1" .. p[1], "g" .. p[2], "c", "g", "h" .. p[1], "i" .. p[1], "z" } }, } m["my"] = { "พม่า", 9228, "tbq-brm", "Mymr", ancestors = "obr", translit = "my-translit", override_translit = true, sort_key = { from = {"ျ", "ြ", "ွ", "ှ", "ဿ"}, to = {"္ယ", "္ရ", "္ဝ", "္ဟ", "သ္သ"} }, } m["na"] = { "นาอูรู", 13307, "poz-mic", "Latn", } m["nb"] = { "นอร์เวย์แบบบุ๊กมอล", 25167, "gmq", "Latn", wikimedia_codes = "no", ancestors = "gmq-mno, da", -- da as an (but not the) ancestor of nb was agreed on - do not change without discussion sort_key = s["no-sortkey"], standard_chars = s["no-standardchars"], } m["nd"] = { "Northern Ndebele", 35613, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. 
c.caron}, } m["ne"] = { "เนปาล", 33823, "inc-pah", "Deva, Newa", translit = { Deva = "Deva-translit", Newa = "Newa-translit", }, } m["ng"] = { "Ndonga", 33900, "bnt-ova", "Latn", } m["nl"] = { "ดัตช์", 7411, "gmw-frk", "Latn, Brai", ancestors = "dum", sort_key = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'"}, }, standard_chars = { Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZzÄäËëÏïÖöÜü", Brai = c.braille, c.punc }, } m["nn"] = { "นอร์เวย์แบบนือนอสก์", 25164, "gmq-wes", "Latn", ancestors = "gmq-mno", strip_diacritics = { remove_diacritics = c.grave .. c.acute, }, sort_key = s["no-sortkey"], standard_chars = s["no-standardchars"], } m["no"] = { "นอร์เวย์", 9043, "gmq-wes", "Latn", ancestors = "gmq-mno", sort_key = s["no-sortkey"], standard_chars = s["no-standardchars"], } m["nr"] = { "Southern Ndebele", 36785, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron}, } m["nv"] = { "นาวาโฮ", 13310, "apa", "Latn, Brai", sort_key = { remove_diacritics = c.acute .. c.ogonek, from = { "chʼ", "tłʼ", "tsʼ", -- 3 chars "ch", "dl", "dz", "gh", "hw", "kʼ", "kw", "sh", "tł", "ts", "zh", -- 2 chars "ł", "ʼ" -- 1 char }, to = { "c" .. p[2], "t" .. p[2], "t" .. p[4], "c" .. p[1], "d" .. p[1], "d" .. p[2], "g" .. p[1], "h" .. p[1], "k" .. p[1], "k" .. p[2], "s" .. p[1], "t" .. p[1], "t" .. p[3], "z" .. p[1], "l" .. p[1], "z" .. p[2] } }, } m["ny"] = { "เจวา", 33273, "bnt-nys", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.circ}, sort_key = { from = {"ng'"}, to = {"ng"} }, } m["oc"] = { "อุตซิตา", 14185, "roa-ocr", "Latn, Hebr", ancestors = "pro", sort_key = { Latn = { remove_diacritics = c.grave .. c.acute .. c.diaer .. 
c.cedilla, from = {"([lns])·h"}, to = {"%1h"} }, }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["oj"] = { "โอจิบเว", 33875, "alg", "Cans, Latn", sort_key = { Latn = { from = {"aa", "ʼ", "ii", "oo", "sh", "zh"}, to = {"a" .. p[1], "h" .. p[1], "i" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1]} }, }, } m["om"] = { "ออโรโม", 33864, "cus-eas", "Latn, Ethi", } m["or"] = { "โอริยา", 33810, "inc-eas", "Orya", ancestors = "inc-mor", translit = "Orya-translit", } m["os"] = { "ออสซีเซีย", 33968, "xsc-sar", "Cyrl, Geor, Latn", ancestors = "oos", translit = { Cyrl = "os-translit", -- Geor translit in [[Module:scripts/data]] }, override_translit = true, display_text = { Cyrl = { from = {"æ"}, to = {"ӕ"} }, Latn = { from = {"ӕ"}, to = {"æ"} }, }, strip_diacritics = { Cyrl = { remove_diacritics = c.grave .. c.acute, from = {"æ"}, to = {"ӕ"} }, Latn = { from = {"ӕ"}, to = {"æ"} }, }, sort_key = { Cyrl = { from = {"ӕ", "гъ", "дж", "дз", "ё", "къ", "пъ", "тъ", "хъ", "цъ", "чъ"}, to = {"а" .. p[1], "г" .. p[1], "д" .. p[1], "д" .. p[2], "е" .. p[1], "к" .. p[1], "п" .. p[1], "т" .. p[1], "х" .. p[1], "ц" .. p[1], "ч" .. p[1]} }, }, } m["pa"] = { "ปัญจาบ", 58635, "inc-pan", "Guru, pa-Arab", ancestors = "inc-opa", translit = { Guru = "Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, strip_diacritics = { ["pa-Arab"] = { remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. 
c.nunghunna, from = {"ݨ", "ࣇ"}, to = {"ن", "ل"} }, }, } m["pi"] = { "บาลี", 36727, "inc-mid", "Latn, Brah, Deva, Beng, Sinh, Mymr, Thai, Lana, Laoo, Khmr, Cakm", --and also Khom ancestors = "sa", translit = { -- Brah translit in [[Module:scripts/data]] Deva = "Deva-translit", Beng = "Beng-translit", Sinh = "Sinh-translit", Mymr = "Mymr-translit", --Thai = "pi-translit", Lana = "Lana-translit", Laoo = "Laoo-translit", Khmr = "Khmr-translit", Cakm = "Cakm-translit", }, strip_diacritics = { Thai = { from = {"ึ", u(0xF700), u(0xF70F)}, -- FIXME: Not clear what's going on with the PUA characters here. to = {"ิํ", "ฐ", "ญ"} }, Mymr = { remove_diacritics = c.VS01, }, }, sort_key = { -- FIXME: This needs to be converted into the current standardized format. from = {"ā", "ī", "ū", "ḍ", "ḷ", "m[" .. c.dotabove .. c.dotbelow .. "]", "ṅ", "ñ", "ṇ", "ṭ", "([เโ])([ก-ฮ])", "([ເໂ])([ກ-ຮ])", "ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ", u(0xFE00), u(0x200D)}, to = {"a~", "i~", "u~", "d~", "l~", "m~", "n~", "n~~", "n~~~", "t~", "%2%1", "%2%1", "ᩈ᩠ᩈ", "᩠ᩁ", "᩠ᩃ", "ᨦ᩠", "%1᩠ᨮ", "%1᩠ᨻ", "ᩣ"} }, } m["pl"] = { "โปแลนด์", 809, "zlw-lch", "Latn", ancestors = "zlw-mpl", sort_key = { from = {"ą", "ć", "ę", "ł", "ń", "ó", "ś", "ź", "ż"}, to = {"a" .. p[1], "c" .. p[1], "e" .. p[1], "l" .. p[1], "n" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1], "z" .. p[2]} }, standard_chars = "AaĄąBbCcĆćDdEeĘęFfGgHhIiJjKkLlŁłMmNnŃńOoÓóPpRrSsŚśTtUuWwYyZzŹźŻż" .. c.punc, } m["ps"] = { "ปาทาน", 58680, "ira-pat", "ps-Arab", strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.zwarakay .. c.superalef}, } m["pt"] = { "โปรตุเกส", 5146, "roa-gap", "Latn, Brai", sort_key = { Latn = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.diaer .. 
c.cedilla, from = {"ª", "æ", "º", "œ"}, to = {"a", "ae", "o", "oe"} }, }, standard_chars = { Latn = "AaÁáÂâÃãBbCcÇçDdEeÉéÊêFfGgHhIiÍíJjLlMmNnOoÓóÔôÕõPpQqRrSsTtUuÚúVvXxZz", Brai = c.braille, c.punc }, } m["qu"] = { "เกชัว", 5218, "qwe", "Latn", } m["rm"] = { "โรมานช์", 13199, "roa-rhe", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.small_e}, } m["ro"] = { "โรมาเนีย", 7913, "roa-eas", "Latn, Cyrl, Cyrs", translit = { Cyrl = "ro-translit" }, sort_key = { Latn = { remove_diacritics = c.grave .. c.acute, from = {"ă", "â", "î", "ș", "ț"}, to = {"a" .. p[1], "a" .. p[2], "i" .. p[1], "s" .. p[1], "t" .. p[1]} }, Cyrl = { from = {"ӂ"}, to = {"ж" .. p[1]} }, }, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; presumably not present standard_chars = { Latn = "AaĂăÂâBbCcDdEeFfGgHhIiÎîJjLlMmNnOoPpRrSsȘșTtȚțUuVvXxZz", Cyrl = "АаБбВвГгДдЕеЖжӁӂЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЫыЬьЭэЮюЯя", c.punc }, } m["ru"] = { "รัสเซีย", 7737, "zle", "Cyrl, Brai", ancestors = "zle-mru", translit = { Cyrl = "ru-translit-Thai" }, display_text = { Cyrl = { from = {"'"}, to = {"’"} }, }, strip_diacritics = { Cyrl = { remove_diacritics = c.grave .. c.acute .. c.diaer, remove_exceptions = {"Ё", "ё", "Ѣ̈", "ѣ̈", "Я̈", "я̈"}, from = {"’"}, to = {"'"}, }, }, sort_key = { Cyrl = { remove_diacritics = c.grave .. c.acute .. c.diaer, from = { "і", "ѣ", "ѳ", "ѵ" }, to = { "и" .. p[1], "ь" .. p[1], "я" .. p[2], "я" .. p[3] } }, }, standard_chars = { Cyrl = "АаБбВвГгДдЕеЁёЖжЗзИиЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЫыЬьЭэЮюЯя—", Brai = c.braille, (c.punc:gsub("'", "")) -- Exclude apostrophe. }, } m["rw"] = { "รวันดา-รุนดี", 3217514, "bnt-glb", "Latn", strip_diacritics = {remove_diacritics = c.acute .. c.circ .. c.macron .. 
c.caron}, } m["sa"] = { "สันสกฤต", 11059, "inc", "as-Beng, Bali, Beng, Bhks, Brah, Mymr, xwo-Mong, Deva, Gujr, Guru, Gran, Hani, Java, Kthi, Knda, Kawi, Khar, Khmr, Laoo, Mlym, mnc-Mong, Marc, Modi, Mong, Nand, Newa, Orya, Phag, Ranj, Saur, Shrd, Sidd, Sinh, Soyo, Lana, Takr, Taml, Tang, Telu, Thai, Tibt, Tutg, Tirh, Zanb", --and also Khom; script codes sorted by canonical name rather than code for [[MOD:sa-convert]] translit = { Beng = "Beng-translit", ["as-Beng"] = "Beng-translit", -- Brah translit in [[Module:scripts/data]] Deva = "Deva-translit", Gujr = "Gujr-translit", Guru = "Guru-translit", Java = "Java-translit", Kthi = "Kthi-translit", Khmr = "Khmr-translit", Knda = "Knda-translit", Lana = "Lana-translit", Laoo = "Laoo-translit", Mlym = "Mlym-translit", Modi = "Modi-translit", -- Mong, mnc-Mong, xwo-Mong translit in [[Module:scripts/data]] -- NOTE: Formerly used xal-translit for transliterating xwo-Mong but that only handles Cyrillic; it has -- code to transliterate xwo-Mong but it's broken so I've replaced it with the default xwo-translit. Mymr = "Mymr-translit", Orya = "Orya-translit", -- Shrd translit in [[Module:scripts/data]] -- Sidd translit in [[Module:scripts/data]] Sinh = "Sinh-translit", --Thai = "pi-translit", Taml = "Taml-translit", Telu = "Telu-translit", -- Tibt translit in [[Module:scripts/data]] }, -- Mong display_text and strip_diacritics in [[Module:scripts/data]] -- Tibt display_text, strip_diacritics, sort_key in [[Module:scripts/data]] strip_diacritics = { Thai = { from = {"ึ", u(0xF700), u(0xF70F)}, -- FIXME: Not clear what's going on with the PUA characters here. to = {"ิํ", "ฐ", "ญ"} }, Mymr = { remove_diacritics = c.VS01, }, Deva = { remove_diacritics = c.udatta .. c.anudatta, }, }, sort_key = { Latn = { from = {"ā", "ī", "ū", "ḍ", "ḷ", "ḹ", "m[" .. c.dotabove .. c.dotbelow .. 
"]", "ṅ", "ñ", "ṇ", "ṛ", "ṝ", "ś", "ṣ", "ṭ"}, to = {"a~", "i~", "u~", "d~", "l~", "l~~", "m~", "n~", "n~~", "n~~~", "r~", "r~~", "s~", "s~~", "t~"}, }, --Thai = "Thai-sortkey", --Laoo = "Laoo-sortkey", Lana = { -- Tai Tham from = {"ᩔ", "ᩕ", "ᩖ", "ᩘ", "([ᨭ-ᨱ])ᩛ", "([ᨷ-ᨾ])ᩛ", "ᩤ"}, to = {"ᩈ᩠ᩈ", "᩠ᩁ", "᩠ᩃ", "ᨦ᩠", "%1᩠ᨮ", "%1᩠ᨻ", "ᩣ"}, }, Mymr = { remove_diacritics = c.VS01, }, -- FIXME: The previous sort key which mixed all scripts removed ZWJ; I don't know which script(s) this was -- intended for and there are no other languages which remove it in the sort key AFAIK. If it needs to be -- removed, specify the script(s) it needs to be removed under or add handling for the "all" script that applies -- regardless of script. --all = { -- remove_diacritics = c.ZWJ, --}, }, } m["sc"] = { "ซาร์ดิเนีย", 33976, "roa-sou", "Latn", } m["sd"] = { "สินธ์", 33997, "inc-snd", "sd-Arab, Deva, Sind, Khoj", translit = { Deva = "Deva-translit", Sind = "Sind-translit", ["sd-Arab"] = "sd-Arab-translit" }, strip_diacritics = { ["sd-Arab"] = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {"ٱ"}, to = {"ا"} }, }, } m["se"] = { "ซามีเหนือ", 33947, "smi", "Latn", display_text = { from = {"'"}, to = {"ˈ"} }, strip_diacritics = {remove_diacritics = c.macron .. c.dotbelow .. "'ˈ"}, sort_key = { from = {"á", "č", "đ", "ŋ", "š", "ŧ", "ž"}, to = {"a" .. p[1], "c" .. p[1], "d" .. p[1], "n" .. p[1], "s" .. p[1], "t" .. p[1], "z" .. p[1]} }, standard_chars = "AaÁáBbCcČčDdĐđEeFfGgHhIiJjKkLlMmNnŊŋOoPpRrSsŠšTtŦŧUuVvZzŽž" .. c.punc, } m["sg"] = { "ซังโก", 33954, "crp", "Latn", ancestors = "ngb", } m["sh"] = { "เซอร์โบ-โครเอเชีย", 9301, "zls", "Latn, Cyrl, Glag, Arab", ietf_subtag = "hbs", -- ISO 639-3 code, since "sh" is deprecated from ISO 639-1 wikimedia_codes = "sh, bs, hr, sr", strip_diacritics = { Latn = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. 
c.invbreve, remove_exceptions = {"Ć", "ć", "Ś", "ś", "Ź", "ź"} }, Cyrl = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve, remove_exceptions = {"З́", "з́", "С́", "с́"} }, }, sort_key = { Latn = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve, remove_exceptions = {"ć", "ś", "ź"}, from = {"č", "ć", "dž", "đ", "lj", "nj", "š", "ś", "ž", "ź"}, to = {"c" .. p[1], "c" .. p[2], "d" .. p[1], "d" .. p[2], "l" .. p[1], "n" .. p[1], "s" .. p[1], "s" .. p[2], "z" .. p[1], "z" .. p[2]} }, Cyrl = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve, remove_exceptions = {"з́", "с́"}, from = {"ђ", "з́", "ј", "љ", "њ", "с́", "ћ", "џ"}, to = {"д" .. p[1], "з" .. p[1], "и" .. p[1], "л" .. p[1], "н" .. p[1], "с" .. p[1], "т" .. p[1], "ч" .. p[1]} }, }, standard_chars = { Latn = "AaBbCcČčĆćDdĐđEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšTtUuVvZzŽž", Cyrl = "АаБбВвГгДдЂђЕеЖжЗзИиЈјКкЛлЉљМмНнЊњОоПпРрСсТтЋћУуФфХхЦцЧчЏџШш", c.punc }, } m["si"] = { "สิงหล", 13267, "inc-ins", "Sinh", translit = "Sinh-translit", override_translit = true, } m["sk"] = { "สโลวัก", 9058, "zlw", "Latn", ancestors = "zlw-osk", sort_key = {remove_diacritics = c.acute .. c.circ .. c.diaer .. c.caron}, standard_chars = "AaÁáÄäBbCcČčDdĎďEeÉéFfGgHhIiÍíJjKkLlĹĺĽľMmNnŇňOoÓóÔôPpRrŔŕSsŠšTtŤťUuÚúVvYyÝýZzŽž" .. c.punc, } m["sl"] = { "สโลวีเนีย", 9063, "zls", "Latn", strip_diacritics = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.dgrave .. c.invbreve .. c.dotbelow, remove_exceptions = {"Ć", "ć", "Ǵ", "ǵ", "Ś", "ś", "Ź", "ź"}, from = {"Ə", "ə", "Ł", "ł"}, to = {"E", "e", "L", "l"}, }, sort_key = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dotabove .. c.ringabove .. c.dgrave .. c.invbreve .. c.dotbelow .. c.ringbelow .. c.ogonek, remove_exceptions = {"ć", "ǵ", "ś", "ź"}, from = {"ä", "č", "ć", "đ", "ə", "ë", "ǧ", "ǵ", "ï", "ł", "ö", "š", "ś", "ü", "ž", "ź"}, to = {"a" ..
p[1], "c" .. p[1], "c" .. p[2], "d" .. p[1], "e", "e" .. p[1], "g" .. p[1], "g" .. p[2], "i" .. p[1], "l", "o" .. p[1], "s" .. p[1], "s" .. p[2], "u" .. p[1], "z" .. p[1], "z" .. p[2]}, }, standard_chars = "AaBbCcČčDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsŠšTtUuVvZzŽž" .. c.punc, } m["sm"] = { "ซามัว", 34011, "poz-pnp", "Latn", } m["sn"] = { "โชนา", 34004, "bnt-sho", "Latn", strip_diacritics = {remove_diacritics = c.acute}, } m["so"] = { "โซมาลี", 13275, "cus-som", "Latn, Arab, Osma", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ} }, } m["sq"] = { "แอลเบเนีย", 8748, "sqj", "Latn, Grek, ota-Arab, Elba, Todr, Vith", translit = { Elba = "Elba-translit", }, -- Grek display_text, sort_key in [[Module:scripts/data]] strip_diacritics = { Latn = { remove_diacritics = c.acute .. c.circ, from = {'^[ie] (%w)', '^të (%w)'}, to = {'%1', '%1'}, }, Grek = { -- Diacritic removal from Grek-stripdiacritics excluded. from = m_langdata.chars_substitutions["Grek-stripdiacritics"].from, to = m_langdata.chars_substitutions["Grek-stripdiacritics"].to, }, }, sort_key = { Latn = { remove_diacritics = c.acute .. c.circ .. c.tilde .. c.breve .. c.caron, from = {'^[ie] (%w)', '^të (%w)', 'ç', 'dh', 'ë', 'gj', 'll', 'nj', 'rr', 'sh', 'th', 'xh', 'zh'}, to = {'%1', '%1', 'c'..p[1], 'd'..p[1], 'e'..p[1], 'g'..p[1], 'l'..p[1], 'n'..p[1], 'r'..p[1], 's'..p[1], 't'..p[1], 'x'..p[1], 'z'..p[1]}, } -- TODO: Grek if the default sort key is unsuitable }, standard_chars = { Latn = "AaBbCcÇçDdEeËëFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvXxYyZz", c.punc }, } m["ss"] = { "Swazi", 34014, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron}, } m["st"] = { "ซูทู", 34340, "bnt-sts", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. 
c.caron}, } m["su"] = { "ซุนดา", 34002, "poz-msa", "Latn, Sund, Arab", ancestors = "osn", translit = { Sund = "Sund-translit" }, } m["sv"] = { "สวีเดน", 9027, "gmq-eas", "Latn", ancestors = "gmq-osw-lat", sort_key = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron .. c.dacute .. c.caron .. c.cedilla .. "':", remove_exceptions = {"å"}, from = {"ø", "æ", "œ", "ß", "å", "aͤ", "oͤ"}, to = {"o", "ae", "oe", "ss", "z" .. p[1], "ä", "ö"} }, standard_chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpRrSsTtUuVvXxYyÅåÄäÖö" .. c.punc, } m["sw"] = { "สวาฮีลี", 7838, "bnt-swh", "Latn, Arab", sort_key = { Latn = { from = {"ng'"}, to = {"ng" .. p[1]} }, }, } m["ta"] = { "ทมิฬ", 5885, "dra-tam", "Taml", ancestors = "ta-mid", translit = "Taml-translit", override_translit = true, } m["te"] = { "เตลูกู", 8097, "dra-tel", "Telu", translit = "Telu-translit", override_translit = true, } m["tg"] = { "ทาจิก", 9260, "ira-swi", "Cyrl, fa-Arab, Latn", ancestors = "fa-cls", translit = { Cyrl = "tg-translit" }, override_translit = true, strip_diacritics = { Cyrl = s["tg-stripdiacritics"], Latn = s["tg-stripdiacritics"], }, sort_key = { Cyrl = { from = {"ғ", "ё", "ӣ", "қ", "ӯ", "ҳ", "ҷ"}, to = {"г" .. p[1], "е" .. p[1], "и" .. p[1], "к" .. p[1], "у" .. p[1], "х" .. p[1], "ч" .. p[1]} }, }, } m["th"] = { "ไทย", 9217, "tai-swe", "Thai, Khomt, Brai", --translit = { -- Thai = "th-translit" --}, --sort_key = { -- Thai = "Thai-sortkey" --}, } m["ti"] = { "ทือกรึญญา", 34124, "sem-eth", "Ethi", translit = "Ethi-translit", } m["tk"] = { "เติร์กเมน", 9267, "trk-ogz", "Latn, Cyrl, Arab", strip_diacritics = { Latn = s["tk-stripdiacritics"], Cyrl = s["tk-stripdiacritics"], }, sort_key = { Latn = { from = {"ç", "ä", "ž", "ň", "ö", "ş", "ü", "ý"}, to = {"c" .. p[1], "e" .. p[1], "j" .. p[1], "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1], "y" .. p[1]} }, Cyrl = { from = {"ё", "җ", "ң", "ө", "ү", "ә"}, to = {"е" .. p[1], "ж" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "э" .. 
p[1]} }, }, ancestors = "trk-eog", } m["tl"] = { "ตากาล็อก", 34057, "phi", "Latn, Tglg", translit = { Tglg = "tl-translit" }, override_translit = true, strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.circ} }, standard_chars = { Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey", }, } m["tn"] = { "สวานา", 34137, "bnt-sts", "Latn", } m["to"] = { "ตองงา", 34094, "poz-ton", "Latn", strip_diacritics = {remove_diacritics = c.acute}, sort_key = {remove_diacritics = c.macron}, } m["tr"] = { "ตุรกี", 256, "trk-ogz", "Latn", ancestors = "ota", dotted_dotless_i = true, sort_key = { from = { -- Ignore circumflex, but account for capital Î wrongly becoming ı + circ due to dotted dotless I logic. "ı" .. c.circ, c.circ, "i", -- Ensure "i" comes after "ı". "ç", "ğ", "ı", "ö", "ş", "ü" }, to = { "i", "", "i" .. p[1], "c" .. p[1], "g" .. p[1], "i", "o" .. p[1], "s" .. p[1], "u" .. p[1] } }, standard_chars = "AaÂâBbCcÇçDdEeFfGgĞğHhIıİiÎîJjKkLlMmNnOoÖöPpRrSsŞşTtUuÛûÜüVvYyZz" .. c.punc, } m["ts"] = { "Tsonga", 34327, "bnt-tsr", "Latn", } m["tt"] = { "ตาตาร์", 25285, "trk-kbu", "Cyrl, Latn, tt-Arab", translit = { Cyrl = "tt-translit", ["tt-Arab"] = "tt-translit" }, --override_translit = true, -- enable override until Module code can detect Russian loans such as [[аэропорт]] dotted_dotless_i = true, sort_key = { Cyrl = { from = {"ә", "ў", "ғ", "ё", "җ", "қ", "ң", "ө", "ү", "һ"}, to = {"а" .. p[1], "в" .. p[1], "г" .. p[1], "е" .. p[1], "ж" .. p[1], "к" .. p[1], "н" .. p[1], "о" .. p[1], "у" .. p[1], "х" .. p[1]} }, Latn = { from = { "i", -- Ensure "i" comes after "ı". "ä", "ə", "ç", "ğ", "ı", "ñ", "ŋ", "ö", "ɵ", "ş", "ü" }, to = { "i" .. p[1], "a" .. p[1], "a" .. p[2], "c" .. p[1], "g" .. p[1], "i", "n" .. p[1], "n" .. p[2], "o" .. p[1], "o" .. p[2], "s" .. p[1], "u" .. 
p[1] } }, }, } -- "tw" is treated as "ak", see [[WT:LT]] m["ty"] = { "ตาฮีตี", 34128, "poz-pep", "Latn", } m["ug"] = { "อุยกูร์", 13263, "trk-kar", "ug-Arab, Latn, Cyrl", ancestors = "chg", translit = { ["ug-Arab"] = "ug-translit-Thai", --Cyrl = "ug-translit", }, override_translit = true, } m["uk"] = { "ยูเครน", 8798, "zle", "Cyrl", ancestors = "zle-muk", translit = "uk-translit-Thai", strip_diacritics = {remove_diacritics = c.grave .. c.acute}, sort_key = { remove_diacritics = c.grave .. c.acute, from = { "ї", -- 2 chars "ґ", "є", "і" -- 1 char }, to = { "и" .. p[2], "г" .. p[1], "е" .. p[1], "и" .. p[1] } }, standard_chars = "АаБбВвГгДдЕеЄєЖжЗзИиІіЇїЙйКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЬьЮюЯя" .. c.punc:gsub("'", ""), -- Exclude apostrophe. } m["ur"] = { "อูรดู", 1617, "inc-hnd", "ur-Arab, Hebr", translit = { ["ur-Arab"] = "ur-translit" }, strip_diacritics = { ["ur-Arab"] = { -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"; hamzatu l-waṣli to a regular alif from = {"هٔ", "ۂ", "ٱ"}, to = {"ہ", "ہ", "ا"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] standard_chars = { ["ur-Arab"] = "ایببپتثجچحخدذرزژسشصضطظعغفقکگلࣇڷمنݨوؤہھئٹڈڑآے", c.punc, }, } m["uz"] = { "อุซเบก", 9264, "trk-kar", "Latn, Cyrl, fa-Arab", ancestors = "chg", translit = { Cyrl = "uz-translit" }, sort_key = { Latn = { from = {"oʻ", "gʻ", "sh", "ch", "ng"}, to = {"z" .. p[1], "z" .. p[2], "z" .. p[3], "z" .. p[4], "z" .. p[5]} }, Cyrl = { from = {"ё", "ў", "қ", "ғ", "ҳ"}, to = {"е" .. p[1], "я" .. p[1], "я" .. p[2], "я" .. p[3], "я" .. 
p[4]} }, }, strip_diacritics = { ["fa-Arab"] = "ar-stripdiacritics", }, } m["ve"] = { "เวนดา", 32704, "bnt-bso", "Latn", } m["vi"] = { "เวียดนาม", 9199, "mkh-vie", "Latn, Hani", ancestors = "mkh-mvi", --translit = {Latn = "vi-translit"}, -- already handled in Module:headword & Module:links sort_key = { Latn = "vi-sortkey", Hani = "Hani-sortkey", }, } m["vo"] = { "โวลาปุก", 36986, "art", "Latn", } m["wa"] = { "วัลลูน", 34219, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["wo"] = { "โวลอฟ", 34257, "alv-fwo", "Latn, Arab, Gara", } m["xh"] = { "โคซา", 13218, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. c.caron}, } m["yi"] = { "ยิดดิช", 8641, "gmw-hgm", "Hebr, Latn", ancestors = "gmh", translit = { Hebr = "yi-translit", }, -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["yo"] = { "โยรูบา", 34311, "alv-yor", "Latn, Arab", strip_diacritics = { Latn = {remove_diacritics = c.grave .. c.acute .. c.macron} }, sort_key = { Latn = { from = {"ẹ", "ɛ", "gb", "ị", "kp", "ọ", "ɔ", "ṣ", "sh", "ụ"}, to = {"e" .. p[1], "e" .. p[1], "g" .. p[1], "i" .. p[1], "k" .. p[1], "o" .. p[1], "o" .. p[1], "s" .. p[1], "s" .. p[1], "u" .. p[1]} }, }, } m["za"] = { "จ้วง", 13216, "tai", "Latn, Hani", sort_key = { Latn = "za-sortkey", Hani = "Hani-sortkey", }, } m["zh"] = { "จีน", 7850, "zhx", "Hants, Latn, Bopo, Nshu, Brai", ancestors = "ltc", generate_forms = "zh-generateforms", translit = { Hani = "zh-translit", Bopo = "zh-translit", }, sort_key = { Hani = "Hani-sortkey" }, } m["zu"] = { "ซูลู", 10179, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.macron .. 
c.caron}, } return require("Module:languages").finalizeData(m, "language") 8aik8yyt7fo2dzxsbmj7ih4oxgzah6t มอดูล:languages 828 36388 5720749 5676884 2026-04-21T07:00:44Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720749 Scribunto text/plain --[==[ intro: This module implements fetching of language-specific information and processing text in a given language. ===Types of languages=== There are two types of languages: full languages and etymology-only languages. The essential difference is that only full languages appear in L2 headings in vocabulary entries, and hence categories like [[:Category:French nouns]] exist only for full languages. Etymology-only languages have either a full language or another etymology-only language as their parent (in the parent-child inheritance sense), and for etymology-only languages with another etymology-only language as their parent, a full language can always be derived by following the parent links upwards. For example, "Canadian French", code `fr-CA`, is an etymology-only language whose parent is the full language "French", code `fr`. An example of an etymology-only language with another etymology-only parent is "Northumbrian Old English", code `ang-nor`, which has "Anglian Old English", code `ang-ang` as its parent; this is an etymology-only language whose parent is "Old English", code `ang`, which is a full language. (This is because Northumbrian Old English is considered a variety of Anglian Old English.) Sometimes the parent is the "Undetermined" language, code `und`; this is the case, for example, for "substrate" languages such as "Pre-Greek", code `qsb-grc`, and "the BMAC substrate", code `qsb-bma`. It is important to distinguish language ''parents'' from language ''ancestors''. The parent-child relationship is one of containment, i.e. if X is a child of Y, X is considered a variety of Y. On the other hand, the ancestor-descendant relationship is one of descent in time. 
For example, "Classical Latin", code `la-cla`, and "Late Latin", code `la-lat`, are both etymology-only languages with "Latin", code `la`, as their parents, because both of the former are varieties of Latin. However, Late Latin does *NOT* have Classical Latin as its parent because Late Latin is *not* a variety of Classical Latin; rather, it is a descendant. There is in fact a separate `ancestors` field that is used to express the ancestor-descendant relationship, and Late Latin's ancestor is given as Classical Latin. It is also important to note that sometimes an etymology-only language is actually the conceptual ancestor of its parent language. This happens, for example, with "Old Italian" (code `roa-oit`), which is an etymology-only variant of full language "Italian" (code `it`), and with "Old Latin" (code `itc-ola`), which is an etymology-only variant of Latin. In both cases, the full language has the etymology-only variant listed as an ancestor. This allows a Latin term to inherit from Old Latin using the {{tl|inh}} template (where in this template, "inheritance" refers to ancestral inheritance, i.e. inheritance in time, rather than in the parent-child sense); likewise for Italian and Old Italian. Full languages come in three subtypes: * {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the main namespace. There may also be reconstructed terms for the language, which are placed in the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük, among others) are also allowed in the mainspace and considered regular languages. * {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the {Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with *. 
Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in the Appendix namespace, but they are not reconstructed and therefore should not have * prefixed in links. Most constructed languages are of this subtype.

Both full languages and etymology-only languages have a {Language} object associated with them, which is fetched using the {getByCode} function in [[Module:languages]] to convert a language code to a {Language} object. Depending on the options supplied to this function, etymology-only languages may or may not be accepted, and family codes may be accepted (returning a {Family} object as described in [[Module:families]]). There are also separate {getByCanonicalName} functions in [[Module:languages]] and [[Module:etymology languages]] to convert a language's canonical name to a {Language} object (depending on whether the canonical name refers to a full or etymology-only language).

===Textual representations===

Textual strings belonging to a given language come in several different ''text variants'':
# The ''input text'' is what the user supplies in wikitext, in the parameters to {{tl|m}}, {{tl|l}}, {{tl|ux}}, {{tl|t}}, {{tl|lang}} and the like.
# The ''corrected input text'' is the input text with some corrections and/or normalizations applied, such as bad-character replacements for certain languages, like replacing `l` or `1` with [[palochka]] in some languages written in Cyrillic. (FIXME: This currently goes under the name ''display text'' but that will be repurposed below. Also, [[User:Surjection]] suggests renaming this to ''normalized input text'', but "normalized" is used in a different sense in [[Module:usex]].)
# The ''display text'' is the text in the form as it will be displayed to the user.
This is what appears in headwords, in usexes, in displayed internal links, etc. This can include accent marks that are removed to form the stripped display text (see below), as well as embedded bracketed links that are variously processed further. The display text is generated from the corrected input text by applying language-specific transformations; for most languages, there will be no such transformations. The general reason for having a difference between input and display text is to allow for extra information in the input text that is not displayed to the user but is sent to the transliteration module. Note that having different display and input text is only supported currently through special-casing but will be generalized. Examples of transformations are: (1) Removing the {{cd|^}} that is used in certain East Asian (and possibly other unicameral) languages to indicate capitalization of the transliteration (which is currently special-cased); (2) for Korean, removing or otherwise processing hyphens (which is currently special-cased); (3) for Arabic, removing a ''sukūn'' diacritic placed over a ''tāʔ marbūṭa'' (like this: ةْ) to indicate that the ''tāʔ marbūṭa'' is pronounced and transliterated as /t/ instead of being silent [NOTE, NOT IMPLEMENTED YET]; (4) for Thai and Khmer, converting space-separated words to bracketed words and resolving respelling substitutions such as `[กรีน/กฺรีน]`, which indicate how to transliterate given words [NOTE, NOT IMPLEMENTED YET except in language-specific templates like {{tl|th-usex}}]. ## The ''right-resolved display text'' is the result of removing brackets around one-part embedded links and resolving two-part embedded links into their right-hand components (i.e. converting two-part links into the displayed form). The process of right-resolution is what happens when you call {{cd|remove_links()}} in [[Module:links]] on some text. 
When applied to the display text, it produces exactly what the user sees, without any link markup.
# The ''stripped display text'' is the result of applying diacritic-stripping to the display text.
## The ''left-resolved stripped display text'' [NEED BETTER NAME] is the result of applying left-resolution to the stripped display text, i.e. similar to right-resolution but resolving two-part embedded links into their left-hand components (i.e. the linked-to page). If the display text refers to a single page, applying diacritic stripping and left-resolution produces the ''logical pagename''.
# The ''physical pagename text'' is the result of converting the stripped display text into physical page links. If the stripped display text contains embedded links, the left side of those links is converted into physical page links; otherwise, the entire text is considered a pagename and converted in the same fashion. The conversion does three things: (1) converts characters not allowed in pagenames into their "unsupported title" representation, e.g. {{cd|Unsupported titles/`gt`}} in place of the logical name {{cd|>}}; (2) handles certain special-cased unsupported-title logical pagenames, such as {{cd|Unsupported titles/Space}} in place of {{cd|[space]}} and {{cd|Unsupported titles/Ancient Greek dish}} in place of a very long Greek name for a gourmet dish as found in Aristophanes; (3) converts "mammoth" pagenames such as [[a]] into their appropriate split component, e.g. [[a/languages A to L]].
# The ''source translit text'' is the text as supplied to the language-specific {{cd|transliterate()}} method. The form of the source translit text may need to be language-specific, e.g. Thai and Khmer will need the corrected input text, whereas other languages may need to work off the display text. [FIXME: It's still unclear to me how embedded bracketed links are handled in the existing code.]
In general, embedded links need to be right-resolved (see above), but when this happens is unclear to me [FIXME]. Some languages have a chop-up-and-paste-together scheme that sends parts of the text through the transliterate mechanism, while others (those listed with "cont" in {{cd|substitution}} in [[Module:languages/data]]) receive the full input text, but preprocessed in certain ways. (The wisdom of this is still unclear to me.)
# The ''transliterated text'' (or ''transliteration'') is the result of transliterating the source translit text. Unlike for all the other text variants except the transcribed text, it is always in the Latin script.
# The ''transcribed text'' (or ''transcription'') is the result of transcribing the source translit text, where "transcription" here means a close approximation to the phonetic form of the language in languages (e.g. Akkadian, Sumerian, Ancient Egyptian, maybe Tibetan) that have a wide difference between the written letters and spoken form. Unlike for all the other text variants other than the transliterated text, it is always in the Latin script. Currently, the transcribed text is always supplied manually by the user; there is no such thing as a {{cd|transcribe()}} method on language objects.
# The ''sort key'' is the text used in sort keys for determining the placing of pages in categories they belong to. The sort key is generated from the pagename or a specified ''sort base'' by lowercasing, doing language-specific transformations and then uppercasing the result. If the sort base is supplied and is generated from input text, it needs to be converted to display text, have embedded links removed through right-resolution and have diacritic-stripping applied.
# There are other text variants that occur in usexes (specifically, there are normalized variants of several of the above text variants), but we can skip them for now.
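
As a rough illustration of right-resolution and left-resolution as described in the list above, the following standalone sketch handles only simple `[[target]]` and `[[target|display]]` links using plain Lua patterns. The names `right_resolve` and `left_resolve` are hypothetical and are not functions of [[Module:links]]; the real {{cd|remove_links()}} handles many more cases (categories, nested markup, unsupported titles and so on).

```lua
-- Hypothetical sketch, not the implementation used by Module:links.
-- Only plain one-part and two-part links are handled.

-- Right-resolution: keep what the reader sees.
local function right_resolve(text)
	-- Two-part links: [[target|display]] becomes "display".
	text = text:gsub("%[%[[^%[%]|]-|([^%[%]]-)%]%]", "%1")
	-- One-part links: [[target]] becomes "target".
	text = text:gsub("%[%[([^%[%]|]-)%]%]", "%1")
	return text
end

-- Left-resolution: keep the linked-to page.
local function left_resolve(text)
	-- Two-part links: [[target|display]] becomes "target".
	text = text:gsub("%[%[([^%[%]|]-)|[^%[%]]-%]%]", "%1")
	-- One-part links: [[target]] becomes "target".
	text = text:gsub("%[%[([^%[%]|]-)%]%]", "%1")
	return text
end
```

With an input such as `the [[dog|dogs]] and [[cat]]`, right-resolution keeps the displayed halves of the links while left-resolution keeps the page names, which is why the latter is the basis for the logical pagename.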
The following methods exist on {Language} objects to convert between different text variants: # {correctInputText} (currently called {makeDisplayText}): This converts input text to corrected input text. # {stripDiacritics}: This converts to stripped display text. [FIXME: This needs some rethinking. In particular, {stripDiacritics} is sometimes called on input text, corrected input text or display text (in various paths inside of [[Module:links]], and, in the case of input text, usually from other modules). We need to make sure we don't try to convert input text to display text twice, but at the same time we need to support calling it directly on input text since so many modules do this. This means we need to add a parameter indicating whether the passed-in text is input, corrected input, or display text; if the former two, we call {correctInputText} ourselves.] # {logicalToPhysical}: This converts logical pagenames to physical pagenames. # {transliterate}: This appears to convert input text with embedded brackets removed into a transliteration. [FIXME: This needs some rethinking. In particular, it calls {processDisplayText} on its input, which won't work for Thai and Khmer, so we may need language-specific flags indicating whether to pass the input text directly to the language transliterate method. In addition, I'm not sure how embedded links are handled in the existing translit code; a lot of callers remove the links themselves before calling {transliterate()}, which I assume is wrong.] # {makeSortKey}: This converts display text (?) to a sort key. [FIXME: Clarify this.] 
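
The sort key generation described above (lowercasing, language-specific substitutions, then uppercasing) can be sketched roughly as follows. This is a simplified, ASCII-only illustration under stated assumptions: `make_sort_key_sketch`, `escape_pattern` and `P1` are hypothetical names, `P1` merely stands in for the placeholder written `p[1]` in [[Module:languages/data]] (its actual value is not shown here), and the real {makeSortKey} additionally deals with scripts, Unicode normalization and diacritic removal.

```lua
-- Hypothetical sketch of a from/to sort key, loosely modelled on entries such as
-- sort_key = { from = {"ng'"}, to = {"ng" .. p[1]} } in the data module.

-- ASSUMPTION: stand-in for p[1]; chosen only because "~" sorts after the ASCII letters.
local P1 = "~"

-- Escape punctuation so that `from` strings are matched literally.
local function escape_pattern(s)
	return (s:gsub("%p", "%%%0"))
end

local function make_sort_key_sketch(text, data)
	-- Step 1: lowercase (ASCII only in this sketch).
	text = text:lower()
	-- Step 2: apply each from/to pair in order.
	for i, from in ipairs(data.from) do
		-- Escape "%" in the replacement; the local keeps only gsub's first return value.
		local repl = (data.to[i] or ""):gsub("%%", "%%%%")
		text = text:gsub(escape_pattern(from), repl)
	end
	-- Step 3: uppercase the result.
	return text:upper()
end

local swahili_like = { from = {"ng'"}, to = {"ng" .. P1} }
```

Because the placeholder follows the base letters, a word containing `ng'` ends up after words containing plain `ng` when the resulting keys are compared bytewise, which is the effect the `p[1]` suffixes in the data appear to be aiming for.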
]==] local export = {} local debug_track_module = "Module:debug/track" local etymology_languages_data_module = "Module:etymology languages/data" local families_module = "Module:families" local headword_page_module = "Module:headword/page" local json_module = "Module:JSON" local language_like_module = "Module:language-like" local languages_data_module = "Module:languages/data" local languages_data_patterns_module = "Module:languages/data/patterns" local links_data_module = "Module:links/data" local load_module = "Module:load" local scripts_module = "Module:scripts" local scripts_data_module = "Module:scripts/data" local string_encode_entities_module = "Module:string/encode entities" local string_pattern_escape_module = "Module:string/patternEscape" local string_replacement_escape_module = "Module:string/replacementEscape" local string_utilities_module = "Module:string utilities" local table_module = "Module:table" local utilities_module = "Module:utilities" local wikimedia_languages_module = "Module:wikimedia languages" local mw = mw local string = string local table = table local char = string.char local concat = table.concat local find = string.find local floor = math.floor local get_by_code -- Defined below. local get_data_module_name -- Defined below. local get_extra_data_module_name -- Defined below. local getmetatable = getmetatable local gmatch = string.gmatch local gsub = string.gsub local insert = table.insert local ipairs = ipairs local is_known_language_tag = mw.language.isKnownLanguageTag local make_object -- Defined below. local match = string.match local next = next local pairs = pairs local remove = table.remove local require = require local select = select local setmetatable = setmetatable local sub = string.sub local type = type local unstrip = mw.text.unstrip -- Loaded as needed by findBestScript. local Hans_chars local Hant_chars local function check_object(...) check_object = require(utilities_module).check_object return check_object(...) 
end local function debug_track(...) debug_track = require(debug_track_module) return debug_track(...) end local function decode_entities(...) decode_entities = require(string_utilities_module).decode_entities return decode_entities(...) end local function decode_uri(...) decode_uri = require(string_utilities_module).decode_uri return decode_uri(...) end local function deep_copy(...) deep_copy = require(table_module).deepCopy return deep_copy(...) end local function encode_entities(...) encode_entities = require(string_encode_entities_module) return encode_entities(...) end local function get_L2_sort_key(...) get_L2_sort_key = require(headword_page_module).get_L2_sort_key return get_L2_sort_key(...) end local function get_script(...) get_script = require(scripts_module).getByCode return get_script(...) end local function find_best_script_without_lang(...) find_best_script_without_lang = require(scripts_module).findBestScriptWithoutLang return find_best_script_without_lang(...) end local function get_family(...) get_family = require(families_module).getByCode return get_family(...) end local function get_plaintext(...) get_plaintext = require(utilities_module).get_plaintext return get_plaintext(...) end local function get_wikimedia_lang(...) get_wikimedia_lang = require(wikimedia_languages_module).getByCode return get_wikimedia_lang(...) end local function keys_to_list(...) keys_to_list = require(table_module).keysToList return keys_to_list(...) end local function list_to_set(...) list_to_set = require(table_module).listToSet return list_to_set(...) end local function load_data(...) load_data = require(load_module).load_data return load_data(...) end local function make_family_object(...) make_family_object = require(families_module).makeObject return make_family_object(...) end local function pattern_escape(...) pattern_escape = require(string_pattern_escape_module) return pattern_escape(...) end local function replacement_escape(...) 
replacement_escape = require(string_replacement_escape_module) return replacement_escape(...) end local function safe_require(...) safe_require = require(load_module).safe_require return safe_require(...) end local function shallow_copy(...) shallow_copy = require(table_module).shallowCopy return shallow_copy(...) end local function split(...) split = require(string_utilities_module).split return split(...) end local function to_json(...) to_json = require(json_module).toJSON return to_json(...) end local function u(...) u = require(string_utilities_module).char return u(...) end local function ugsub(...) ugsub = require(string_utilities_module).gsub return ugsub(...) end local function ulen(...) ulen = require(string_utilities_module).len return ulen(...) end local function ulower(...) ulower = require(string_utilities_module).lower return ulower(...) end local function umatch(...) umatch = require(string_utilities_module).match return umatch(...) end local function uupper(...) uupper = require(string_utilities_module).upper return uupper(...) end local function track(page) debug_track("languages/" .. page) return true end local function normalize_code(code) return load_data(languages_data_module).aliases[code] or code end local function check_inputs(self, check, default, ...) local n = select("#", ...) if n == 0 then return false end local ret = check(self, (...)) if ret ~= nil then return ret elseif n > 1 then local inputs = {...} for i = 2, n do ret = check(self, inputs[i]) if ret ~= nil then return ret end end end return default end local function make_link(self, target, display) local prefix, main if self:getFamilyCode() == "qfa-sub" then prefix, main = display:match("^(the )(.*)") if not prefix then prefix, main = display:match("^(a )(.*)") end end return (prefix or "") .. "[[" .. target .. "|" .. (main or display) .. "]]" end -- Convert risky characters to HTML entities, which minimizes interference once returned (e.g. for "sms:a", "<!-- -->" etc.). 
local function escape_risky_characters(text)
	-- Spacing characters in isolation generally need to be escaped in order to be properly processed by the MediaWiki software.
	if umatch(text, "^%s*$") then
		return encode_entities(text, text)
	end
	return encode_entities(text, "!#%&*+/:;<=>?@[\\]_{|}")
end

-- Temporarily convert various formatting characters to PUA to prevent them from being disrupted by the substitution process.
local function doTempSubstitutions(text, subbedChars, keepCarets, noTrim)
	-- Clone so that we don't insert any extra patterns into the table in package.loaded. For some reason, using require seems to keep memory use down; probably because the table is always cloned.
	local patterns = shallow_copy(require(languages_data_patterns_module))
	if keepCarets then
		insert(patterns, "((\\+)%^)")
		insert(patterns, "((%^))")
	end
	-- Ensure any whitespace at the beginning and end is temp substituted, to prevent it from being accidentally trimmed. We only want to trim any final spaces added during the substitution process (e.g. by a module), which means we only do this during the first round of temp substitutions.
	if not noTrim then
		insert(patterns, "^([\128-\191\244]*(%s+))")
		insert(patterns, "((%s+)[\128-\191\244]*)$")
	end
	-- Pre-substitution of "[[" and "]]", which makes pattern matching more accurate.
	text = gsub(text, "%f[%[]%[%[", "\1"):gsub("%f[%]]%]%]", "\2")
	local i = #subbedChars
	for _, pattern in ipairs(patterns) do
		-- Patterns ending in \0 are for things like "[[" or "]]", so the inserted PUA are treated as breaks between terms by modules that scrape info from pages.
		local term_divider
		pattern = gsub(pattern, "%z$", function(divider)
			term_divider = divider == "\0"
			return ""
		end)
		text = gsub(text, pattern, function(...)
local m = {...} local m1New = m[1] for k = 2, #m do local n = i + k - 1 subbedChars[n] = m[k] local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136) local byte3 = floor(n / 64) % 64 + 128 local byte4 = n % 64 + 128 m1New = gsub(m1New, pattern_escape(m[k]), "\244" .. char(byte2) .. char(byte3) .. char(byte4), 1) end i = i + #m - 1 return m1New end) end text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]") return text, subbedChars end -- Reinsert any formatting that was temporarily substituted. local function undoTempSubstitutions(text, subbedChars) for i = 1, #subbedChars do local byte2 = floor(i / 4096) % 64 + 128 local byte3 = floor(i / 64) % 64 + 128 local byte4 = i % 64 + 128 text = gsub(text, "\244[" .. char(byte2) .. char(byte2+8) .. "]" .. char(byte3) .. char(byte4), replacement_escape(subbedChars[i])) end text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]") return text end -- Check if the raw text is an unsupported title, and if so return that. Otherwise, remove HTML entities. We do the pre-conversion to avoid loading the unsupported title list unnecessarily. local function checkNoEntities(self, text) local textNoEnc = decode_entities(text) if textNoEnc ~= text and load_data(links_data_module).unsupported_titles[text] then return text else return textNoEnc end end -- If no script object is provided (or if it's invalid or None), get one. local function checkScript(text, self, sc) if not check_object("script", true, sc) or sc:getCode() == "None" then return self:findBestScript(text) end return sc end local function normalize(text, sc) text = sc:fixDiscouragedSequences(text) return sc:toFixedNFD(text) end -- Subfunction of iterateSectionSubstitutions(). Process an individual chunk of text according to the specifications in -- `substitution_data`. 
The input parameters are all as in the documentation of iterateSectionSubstitutions() except for -- `recursed`, which is set to true if we called ourselves recursively to process a script-specific setting or -- script-wide fallback. Returns two values: the processed text and the actual substitution data used to do the -- substitutions (same as the `actual_substitution_data` return value to iterateSectionSubstitutions()). local function doSubstitutions(self, text, sc, substitution_data, data_field, function_name, recursed) -- BE CAREFUL in this function because the value at any level can be `false`, which causes no processing to be done -- and blocks any further fallback processing. local actual_substitution_data = substitution_data -- If there are language-specific substitutes given in the data module, use those. if type(substitution_data) == "table" then -- If a script is specified, run this function with the script-specific data before continuing. local sc_code = sc:getCode() local has_substitution_data = false if substitution_data[sc_code] ~= nil then has_substitution_data = true if substitution_data[sc_code] then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[sc_code], data_field, function_name, true) end -- Hant, Hans and Hani are usually treated the same, so add a special case to avoid having to specify each one -- separately. elseif sc_code:match("^Han") and substitution_data.Hani ~= nil then has_substitution_data = true if substitution_data.Hani then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data.Hani, data_field, function_name, true) end -- Substitution data with key 1 in the outer table may be given as a fallback. 
elseif substitution_data[1] ~= nil then has_substitution_data = true if substitution_data[1] then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[1], data_field, function_name, true) end end -- Iterate over all strings in the "from" subtable, and gsub with the corresponding string in "to". We work with -- the NFD decomposed forms, as this simplifies many substitutions. if substitution_data.from then has_substitution_data = true for i, from in ipairs(substitution_data.from) do -- Normalize each loop, to ensure multi-stage substitutions work correctly. text = sc:toFixedNFD(text) text = ugsub(text, sc:toFixedNFD(from), substitution_data.to[i] or "") end end if substitution_data.remove_diacritics then has_substitution_data = true text = sc:toFixedNFD(text) -- Convert exceptions to PUA. local remove_exceptions, substitutes = substitution_data.remove_exceptions if remove_exceptions then substitutes = {} local i = 0 for _, exception in ipairs(remove_exceptions) do exception = sc:toFixedNFD(exception) text = ugsub(text, exception, function(m) i = i + 1 local subst = u(0x80000 + i) substitutes[subst] = m return subst end) end end -- Strip diacritics. text = ugsub(text, "[" .. substitution_data.remove_diacritics .. "]", "") -- Convert exceptions back. if remove_exceptions then text = text:gsub("\242[\128-\191]*", substitutes) end end if not has_substitution_data and sc._data[data_field] then -- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.). text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field, function_name, true) end elseif type(substitution_data) == "string" then -- If there is a dedicated function module, use that. local module = safe_require("Module:" .. substitution_data) if module then -- TODO: translit functions should take objects, not codes. -- TODO: translit functions should be called with form NFD. 
if function_name == "tr" then if not module[function_name] then error(("Internal error: Module [[%s]] has no function named 'tr'"):format(substitution_data)) end text = module[function_name](text, self._code, sc:getCode()) elseif function_name == "stripDiacritics" then -- FIXME, get rid of this arm after renaming makeEntryName -> stripDiacritics. if module[function_name] then text = module[function_name](sc:toFixedNFD(text), self, sc) elseif module.makeEntryName then text = module.makeEntryName(sc:toFixedNFD(text), self, sc) else error(("Internal error: Module [[%s]] has no function named 'stripDiacritics' or 'makeEntryName'" ):format(substitution_data)) end else if not module[function_name] then error(("Internal error: Module [[%s]] has no function named '%s'"):format( substitution_data, function_name)) end text = module[function_name](sc:toFixedNFD(text), self, sc) end else error("Substitution data '" .. substitution_data .. "' does not match an existing module.") end elseif substitution_data == nil and sc._data[data_field] then -- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.). text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field, function_name, true) end -- Don't normalize to NFC if this is the inner loop or if a module returned nil. if recursed or not text then return text, actual_substitution_data end -- Fix any discouraged sequences created during the substitution process, and normalize into the final form. return sc:toFixedNFC(sc:fixDiscouragedSequences(text)), actual_substitution_data end -- Split the text into sections, based on the presence of temporarily substituted formatting characters, then iterate -- over each section to apply substitutions (e.g. transliteration or diacritic stripping). This avoids putting PUA -- characters through language-specific modules, which may be unequipped for them. 
-- This function is passed the following values:
-- * `self` (the Language object);
-- * `text` (the text to process);
-- * `sc` (the script of the text, which must be specified; callers should call checkScript() as needed to autodetect the
--   script of the text if not given explicitly by the user);
-- * `subbedChars` (an array of the same length as the text, indicating which characters have been substituted and by
--   what, or {nil} if no substitutions are to happen);
-- * `keepCarets` (if set, carets (`^`), along with any backslash escapes preceding them, are also temporarily
--   substituted during the second round of temporary substitutions; see doTempSubstitutions());
-- * `substitution_data` (the data indicating which substitutions to apply, taken directly from `data_field` in the
--   language's data structure in a submodule of [[Module:languages/data]]);
-- * `data_field` (the data field from which `substitution_data` was fetched, such as "sort_key" or "strip_diacritics");
-- * `function_name` (the name of the function to call to do the substitution, in case `substitution_data` specifies a
--   module to do the substitution);
-- * `notrim` (don't trim whitespace at the edges of `text`; set when computing the sort key, because whitespace at the
--   beginning of a sort key is significant and causes the resulting page to be sorted at the beginning of the category
--   it's in).
-- Returns three values:
-- (1) the processed text;
-- (2) the value of `subbedChars` that was passed in, possibly modified with additional character substitutions; will be
--     {nil} if {nil} was passed in;
-- (3) the actual substitution data that was used to apply substitutions to `text`; this may be different from the value
--     of `substitution_data` passed in if that value recursively specified script-specific substitutions or if no
--     substitution data could be found in the language-specific data (e.g. {nil} was passed in or a structure was passed
--     in that had no setting for the script given in `sc`), but a script-wide fallback value was set; currently it is
--     only used by makeSortKey().
local function iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, substitution_data, data_field, function_name, notrim) local sections -- See [[Module:languages/data]]. if not find(text, "\244") or load_data(languages_data_module).substitution[self._code] == "cont" then sections = {text} else sections = split(text, "\244[\128-\143][\128-\191]*", true) end local actual_substitution_data for _, section in ipairs(sections) do -- Don't bother processing empty strings or whitespace (which may also not be handled well by dedicated -- modules). if gsub(section, "%s+", "") ~= "" then local sub, this_actual_substitution_data = doSubstitutions(self, section, sc, substitution_data, data_field, function_name) actual_substitution_data = this_actual_substitution_data -- Second round of temporary substitutions, in case any formatting was added by the main substitution -- process. However, don't do this if the section contains formatting already (as it would have had to have -- been escaped to reach this stage, and therefore should be given as raw text). if sub and subbedChars then local noSub for _, pattern in ipairs(require(languages_data_patterns_module)) do if match(section, pattern .. "%z?") then noSub = true end end if not noSub then sub, subbedChars = doTempSubstitutions(sub, subbedChars, keepCarets, true) end end if not sub then text = sub break end text = sub and gsub(text, pattern_escape(section), replacement_escape(sub), 1) or text end end if not notrim then -- Trim, unless there are only spacing characters, while ignoring any final formatting characters. -- Do not trim sort keys because spaces at the beginning are significant. text = text and text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2"):gsub("(%S)%s+([\128-\191\244]*)$", "%1%2") or nil end return text, subbedChars, actual_substitution_data end -- Process carets (and any escapes). Default to simple removal, if no pattern/replacement is given. 
local function processCarets(text, pattern, repl) local rep repeat text, rep = gsub(text, "\\\\(\\*^)", "\3%1") until rep == 0 return (text:gsub("\\^", "\4") :gsub(pattern or "%^", repl or "") :gsub("\3", "\\") :gsub("\4", "^")) end -- Remove carets if they are used to capitalize parts of transliterations (unless they have been escaped). local function removeCarets(text, sc) if not sc:hasCapitalization() and sc:isTransliterated() and text:find("^", 1, true) then return processCarets(text) else return text end end local Language = {} --[==[Returns the language code of the language. Example: {{code|lua|"fr"}} for French.]==] function Language:getCode() return self._code end --[==[Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: {{code|lua|"French"}} for French.]==] function Language:getCanonicalName() local name = self._name if name == nil then name = self._data[1] self._name = name end return name end --[==[ Return the display form of the language. The display form of a language, family or script is the form it takes when appearing as the <code><var>source</var></code> in categories such as <code>English terms derived from <var>source</var></code> or <code>English given names from <var>source</var></code>, and is also the displayed text in {makeCategoryLink()} links. For full and etymology-only languages, this is the same as the canonical name, but for families, it reads <code>"<var>name</var> languages"</code> (e.g. {"Indo-Iranian languages"}), and for scripts, it reads <code>"<var>name</var> script"</code> (e.g. {"Arabic script"}). ]==] function Language:getDisplayForm() local form = self._displayForm if form == nil then form = self:getCanonicalName() -- Add article and " substrate" to substrates that lack them. if self:getFamilyCode() == "qfa-sub" then if not (sub(form, 1, 4) == "the " or sub(form, 1, 2) == "a ") then form = "a " .. 
form end if not match(form, " [Ss]ubstrate") then form = form .. " substrate" end end self._displayForm = form end return form end --[==[Returns the value which should be used in the HTML lang= attribute for tagged text in the language.]==] function Language:getHTMLAttribute(sc, region) local code = self._code if not find(code, "-", 1, true) then return code .. "-" .. sc:getCode() .. (region and "-" .. region or "") end local parent = self:getParent() region = region or match(code, "%f[%u][%u-]+%f[%U]") if parent then return parent:getHTMLAttribute(sc, region) end -- TODO: ISO family codes can also be used. return "mis-" .. sc:getCode() .. (region and "-" .. region or "") end --[==[Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: {{code|lua|{"High German", "New High German", "Deutsch"} }} for [[:Category:German language|German]].]==] function Language:getAliases() self:loadInExtraData() return require(language_like_module).getAliases(self) end --[==[ Return a table of the known subvarieties of a given language, excluding subvarieties that have been given explicit etymology-only language codes. The names are not guaranteed to be unique, in that sometimes a given name refers to a subvariety of more than one language. Example: {{code|lua|{"Southern Aymara", "Central Aymara"} }} for [[:Category:Aymara language|Aymara]]. Note that the returned value can have nested tables in it, when a subvariety goes by more than one name. Example: {{code|lua|{"North Azerbaijani", "South Azerbaijani", {"Afshar", "Afshari", "Afshar Azerbaijani", "Afchar"}, {"Qashqa'i", "Qashqai", "Kashkay"}, "Sonqor"} }} for [[:Category:Azerbaijani language|Azerbaijani]]. 
Here, for example, Afshar, Afshari, Afshar Azerbaijani and Afchar all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value with nested tables in it, specify a non-{{code|lua|nil}} value for the <code>flatten</code> parameter; in that case, the return value would be {{code|lua|{"North Azerbaijani", "South Azerbaijani", "Afshar", "Afshari", "Afshar Azerbaijani", "Afchar", "Qashqa'i", "Qashqai", "Kashkay", "Sonqor"} }}. ]==] function Language:getVarieties(flatten) self:loadInExtraData() return require(language_like_module).getVarieties(self, flatten) end --[==[Returns a table of the "other names" that the language is known by, which are listed in the <code>otherNames</code> field. It should be noted that the <code>otherNames</code> field itself is deprecated, and entries listed there should eventually be moved to either <code>aliases</code> or <code>varieties</code>.]==] function Language:getOtherNames() -- To be eventually removed, once there are no more uses of the `otherNames` field. self:loadInExtraData() return require(language_like_module).getOtherNames(self) end --[==[ Return a combined table of the canonical name, aliases, varieties and other names of a given language.]==] function Language:getAllNames() self:loadInExtraData() return require(language_like_module).getAllNames(self) end --[==[Returns a table of types as a lookup table (with the types as keys). The possible types are * {language}: This is a language, either full or etymology-only. * {full}: This is a "full" (not etymology-only) language, i.e. the union of {regular}, {reconstructed} and {appendix-constructed}. Note that the types {full} and {etymology-only} also exist for families, so if you want to check specifically for a full language and you have an object that might be a family, you should use {{lua|hasType("language", "full")}} and not simply {{lua|hasType("full")}}. 
* {etymology-only}: This is an etymology-only (not full) language, whose parent is another etymology-only language or a full language. Note that the types {full} and {etymology-only} also exist for families, so if you want to check specifically for an etymology-only language and you have an object that might be a family, you should use {{lua|hasType("language", "etymology-only")}} and not simply {{lua|hasType("etymology-only")}}. * {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the main namespace. There may also be reconstructed terms for the language, which are placed in the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük, among others) are also allowed in the mainspace and considered regular languages. * {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the {Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category. * {appendix-constructed}: This language is attested but does not meet the additional requirements set out for constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in the Appendix namespace, but they are not reconstructed and therefore should not have * prefixed in links. ]==] function Language:getTypes() local types = self._types if types == nil then types = {language = true} if self:getFullCode() == self._code then types.full = true else types["etymology-only"] = true end for t in gmatch(self._data.type, "[^,]+") do types[t] = true end self._types = types end return types end --[==[Given a list of types as strings, returns true if the language has all of them.]==] function Language:hasType(...) 
Language.hasType = require(language_like_module).hasType return self:hasType(...) end --[==[Returns a table containing <code>WikimediaLanguage</code> objects (see [[Module:wikimedia languages]]), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code <code>sh</code> (Serbo-Croatian) maps to four Wikimedia codes: <code>sh</code> (Serbo-Croatian), <code>bs</code> (Bosnian), <code>hr</code> (Croatian) and <code>sr</code> (Serbian). The code for the Wikimedia language is retrieved from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.]==] function Language:getWikimediaLanguages() local wm_langs = self._wikimediaLanguageObjects if wm_langs == nil then local codes = self:getWikimediaLanguageCodes() wm_langs = {} for i = 1, #codes do wm_langs[i] = get_wikimedia_lang(codes[i]) end self._wikimediaLanguageObjects = wm_langs end return wm_langs end function Language:getWikimediaLanguageCodes() local wm_langs = self._wikimediaLanguageCodes if wm_langs == nil then wm_langs = self._data.wikimedia_codes if wm_langs then wm_langs = split(wm_langs, ",", true, true) else local code = self._code if is_known_language_tag(code) then wm_langs = {code} else -- Inherit, but only if no codes are specified in the data *and* -- the language code isn't a valid Wikimedia language code. local parent = self:getParent() wm_langs = parent and parent:getWikimediaLanguageCodes() or {} end end self._wikimediaLanguageCodes = wm_langs end return wm_langs end --[==[ Returns the name of the Wikipedia article for the language. 
`project` specifies the language and project to retrieve the article from, defaulting to {"enwiki"} for the English Wikipedia. Normally if specified it should be the project code for a specific-language Wikipedia e.g. "zhwiki" for the Chinese Wikipedia, but it can be any project, including non-Wikipedia ones. If the project is the English Wikipedia and the property {wikipedia_article} is present in the data module it will be used first. In all other cases, a sitelink will be generated from {:getWikidataItem} (if set). The resulting value (or lack of value) is cached so that subsequent calls are fast. If no value could be determined, and `noCategoryFallback` is {false}, {:getCategoryName} is used as fallback; otherwise, {nil} is returned. Note that if `noCategoryFallback` is {nil} or omitted, it defaults to {false} if the project is the English Wikipedia, otherwise to {true}. In other words, under normal circumstances, if the English Wikipedia article couldn't be retrieved, the return value will fall back to a link to the language's category, but this won't normally happen for any other project. ]==] function Language:getWikipediaArticle(noCategoryFallback, project) Language.getWikipediaArticle = require(language_like_module).getWikipediaArticle return self:getWikipediaArticle(noCategoryFallback, project) end function Language:makeWikipediaLink() return make_link(self, "w:" .. self:getWikipediaArticle(), self:getCanonicalName()) end --[==[Returns the name of the Wikimedia Commons category page for the language.]==] function Language:getCommonsCategory() Language.getCommonsCategory = require(language_like_module).getCommonsCategory return self:getCommonsCategory() end --[==[Returns the Wikidata item id for the language or <code>nil</code>. 
This corresponds to the second field in the data modules.]==]
function Language:getWikidataItem()
	Language.getWikidataItem = require(language_like_module).getWikidataItem
	return self:getWikidataItem()
end

--[==[Returns a table of <code>Script</code> objects for all scripts that the language is written in. See [[Module:scripts]].]==]
function Language:getScripts()
	local scripts = self._scriptObjects
	if scripts == nil then
		local codes = self:getScriptCodes()
		if codes[1] == "All" then
			scripts = load_data(scripts_data_module)
		else
			scripts = {}
			for i = 1, #codes do
				scripts[i] = get_script(codes[i])
			end
		end
		self._scriptObjects = scripts
	end
	return scripts
end

--[==[Returns the table of script codes in the language's data file.]==]
function Language:getScriptCodes()
	local scripts = self._scriptCodes
	if scripts == nil then
		scripts = self._data[4]
		if scripts then
			local codes, n = {}, 0
			for code in gmatch(scripts, "[^,]+") do
				n = n + 1
				-- Special handling of "Hants", which represents "Hani", "Hant" and "Hans" collectively.
				if code == "Hants" then
					codes[n] = "Hani"
					codes[n + 1] = "Hant"
					codes[n + 2] = "Hans"
					n = n + 2
				else
					codes[n] = code
				end
			end
			scripts = codes
		else
			scripts = {"None"}
		end
		self._scriptCodes = scripts
	end
	return scripts
end

--[==[Given some text, this function iterates through the scripts of a given language and tries to find the script that best matches the text. It returns a {{code|lua|Script}} object representing the script. If no match is found at all, it returns the {{code|lua|None}} script object.]==]
function Language:findBestScript(text, forceDetect)
	if not text or text == "" or text == "-" then
		return get_script("None")
	end
	-- Differs from table returned by getScriptCodes, as Hants is not normalized into its constituents.
local codes = self._bestScriptCodes if codes == nil then codes = self._data[4] codes = codes and split(codes, ",", true, true) or {"None"} self._bestScriptCodes = codes end local first_sc = codes[1] if first_sc == "All" then return find_best_script_without_lang(text) end local codes_len = #codes if not (forceDetect or first_sc == "Hants" or codes_len > 1) then first_sc = get_script(first_sc) local charset = first_sc.characters return charset and umatch(text, "[" .. charset .. "]") and first_sc or get_script("None") end -- Remove all formatting characters. text = get_plaintext(text) -- Remove all spaces and any ASCII punctuation. Some non-ASCII punctuation is script-specific, so can't be removed. text = ugsub(text, "[%s!\"#%%&'()*,%-./:;?@[\\%]_{}]+", "") if #text == 0 then return get_script("None") end -- Try to match every script against the text, -- and return the one with the most matching characters. local bestcount, bestscript, length = 0 for i = 1, codes_len do local sc = codes[i] -- Special case for "Hants", which is a special code that represents whichever of "Hant" or "Hans" best matches, or "Hani" if they match equally. This avoids having to list all three. In addition, "Hants" will be treated as the best match if there is at least one matching character, under the assumption that a Han script is desirable in terms that contain a mix of Han and other scripts (not counting those which use Jpan or Kore). if sc == "Hants" then local Hani = get_script("Hani") if not Hant_chars then Hant_chars = load_data("Module:zh/data/ts") Hans_chars = load_data("Module:zh/data/st") end local t, s, found = 0, 0 -- This is faster than using mw.ustring.gmatch directly. for ch in gmatch((ugsub(text, "[" .. Hani.characters .. 
"]", "\255%0")), "\255(.[\128-\191]*)") do found = true if Hant_chars[ch] then t = t + 1 if Hans_chars[ch] then s = s + 1 end elseif Hans_chars[ch] then s = s + 1 else t, s = t + 1, s + 1 end end if found then if t == s then return Hani end return get_script(t > s and "Hant" or "Hans") end else sc = get_script(sc) if not length then length = ulen(text) end -- Count characters by removing everything in the script's charset and comparing to the original length. local charset = sc.characters local count = charset and length - ulen((ugsub(text, "[" .. charset .. "]+", ""))) or 0 if count >= length then return sc elseif count > bestcount then bestcount = count bestscript = sc end end end -- Return best matching script, or otherwise None. return bestscript or get_script("None") end --[==[Returns a <code>Family</code> object for the language family that the language belongs to. See [[Module:families]].]==] function Language:getFamily() local family = self._familyObject if family == nil then family = self:getFamilyCode() -- If the value is nil, it's cached as false. family = family and get_family(family) or false self._familyObject = family end return family or nil end --[==[Returns the family code in the language's data file.]==] function Language:getFamilyCode() local family = self._familyCode if family == nil then -- If the value is nil, it's cached as false. family = self._data[3] or false self._familyCode = family end return family or nil end function Language:getFamilyName() local family = self._familyName if family == nil then family = self:getFamily() -- If the value is nil, it's cached as false. 
family = family and family:getCanonicalName() or false self._familyName = family end return family or nil end do local function check_family(self, family) if type(family) == "table" then family = family:getCode() end if self:getFamilyCode() == family then return true end local self_family = self:getFamily() if self_family:inFamily(family) then return true -- If the family isn't a real family (e.g. creoles) check any ancestors. elseif self_family:inFamily("qfa-not") then local ancestors = self:getAncestors() for _, ancestor in ipairs(ancestors) do if ancestor:inFamily(family) then return true end end end end --[==[Check whether the language belongs to `family` (which can be a family code or object). A list of objects can be given in place of `family`; in that case, return true if the language belongs to any of the specified families. Note that some languages (in particular, certain creoles) can have multiple immediate ancestors potentially belonging to different families; in that case, return true if the language belongs to any of the specified families.]==] function Language:inFamily(...) if self:getFamilyCode() == nil then return false end return check_inputs(self, check_family, false, ...) end end function Language:getParent() local parent = self._parentObject if parent == nil then parent = self:getParentCode() -- If the value is nil, it's cached as false. parent = parent and get_by_code(parent, nil, true, true) or false self._parentObject = parent end return parent or nil end function Language:getParentCode() local parent = self._parentCode if parent == nil then -- If the value is nil, it's cached as false. parent = self._data.parent or false self._parentCode = parent end return parent or nil end function Language:getParentName() local parent = self._parentName if parent == nil then parent = self:getParent() -- If the value is nil, it's cached as false. 
parent = parent and parent:getCanonicalName() or false self._parentName = parent end return parent or nil end function Language:getParentChain() local chain = self._parentChain if chain == nil then chain = {} local parent, n = self:getParent(), 0 while parent do n = n + 1 chain[n] = parent parent = parent:getParent() end self._parentChain = chain end return chain end do local function check_lang(self, lang) for _, parent in ipairs(self:getParentChain()) do if (type(lang) == "string" and lang or lang:getCode()) == parent:getCode() then return true end end end function Language:hasParent(...) return check_inputs(self, check_lang, false, ...) end end --[==[ If the language is etymology-only, this iterates through parents until a full language or family is found, and the corresponding object is returned. If the language is a full language, then it simply returns itself. ]==] function Language:getFull() local full = self._fullObject if full == nil then full = self:getFullCode() full = full == self._code and self or get_by_code(full) self._fullObject = full end return full end --[==[ If the language is an etymology-only language, this iterates through parents until a full language or family is found, and the corresponding code is returned. If the language is a full language, then it simply returns the language code. ]==] function Language:getFullCode() return self._fullCode or self._code end --[==[ If the language is an etymology-only language, this iterates through parents until a full language or family is found, and the corresponding canonical name is returned. If the language is a full language, then it simply returns the canonical name of the language. ]==] function Language:getFullName() local full = self._fullName if full == nil then full = self:getFull():getCanonicalName() self._fullName = full end return full end --[==[Returns a table of <code class="nf">Language</code> objects for all languages that this language is directly descended from. 
Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==] function Language:getAncestors() local ancestors = self._ancestorObjects if ancestors == nil then ancestors = {} local ancestor_codes = self:getAncestorCodes() if #ancestor_codes > 0 then for _, ancestor in ipairs(ancestor_codes) do insert(ancestors, get_by_code(ancestor, nil, true)) end else local fam = self:getFamily() local protoLang = fam and fam:getProtoLanguage() or nil -- For the cases where the current language is the proto-language -- of its family, or an etymology-only language that is ancestral to that -- proto-language, we need to step up a level higher right from the -- start. if protoLang and ( protoLang:getCode() == self._code or (self:hasType("etymology-only") and protoLang:hasAncestor(self)) ) then fam = fam:getFamily() protoLang = fam and fam:getProtoLanguage() or nil end while not protoLang and not (not fam or fam:getCode() == "qfa-not") do fam = fam:getFamily() protoLang = fam and fam:getProtoLanguage() or nil end insert(ancestors, protoLang) end self._ancestorObjects = ancestors end return ancestors end do -- Avoid a language being its own ancestor via class inheritance. We only need to check for this if the language has inherited an ancestor table from its parent, because we never want to drop ancestors that have been explicitly set in the data. -- Recursively iterate over ancestors until we either find self or run out. If self is found, return true. local function check_ancestor(self, lang) local codes = lang:getAncestorCodes() if not codes then return nil end for i = 1, #codes do local code = codes[i] if code == self._code then return true end local anc = get_by_code(code, nil, true) if check_ancestor(self, anc) then return true end end end --[==[Returns a table of <code class="nf">Language</code> codes for all languages that this language is directly descended from. 
Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==] function Language:getAncestorCodes() if self._ancestorCodes then return self._ancestorCodes end local data = self._data local codes = data.ancestors if codes == nil then codes = {} self._ancestorCodes = codes return codes end codes = split(codes, ",", true, true) self._ancestorCodes = codes -- If there are no codes or the ancestors weren't inherited data, there's nothing left to check. if #codes == 0 or self:getData(false, "raw").ancestors ~= nil then return codes end local i, code = 1 while i <= #codes do code = codes[i] if check_ancestor(self, self) then remove(codes, i) else i = i + 1 end end return codes end end --[==[Given a list of language objects or codes, returns true if at least one of them is an ancestor. This includes any etymology-only children of that ancestor. If the language's ancestor(s) are etymology-only languages, it will also return true for those language parent(s) (e.g. if Vulgar Latin is the ancestor, it will also return true for its parent, Latin). However, a parent is excluded from this if the ancestor is also ancestral to that parent (e.g. if Classical Persian is the ancestor, Persian would return false, because Classical Persian is also ancestral to Persian).]==] function Language:hasAncestor(...) local function iterateOverAncestorTree(node, func, parent_check) local ancestors = node:getAncestors() local ancestorsParents = {} for _, ancestor in ipairs(ancestors) do -- When checking the parents of the other language, and the ancestor is also a parent, skip to the next ancestor, so that we exclude any etymology-only children of that parent that are not directly related (see below). local ret = (parent_check or not node:hasParent(ancestor)) and func(ancestor) or iterateOverAncestorTree(ancestor, func, parent_check) if ret then return ret end end -- Check the parents of any ancestors. 
		-- We don't do this if checking the parents of the other language, so that we exclude any etymology-only
		-- children of those parents that are not directly related (e.g. if the ancestor is Vulgar Latin and we are
		-- checking New Latin, we want it to return false because they are on different ancestral branches. As such,
		-- if we're already checking the parent of New Latin (Latin) we don't want to compare it to the parent of the
		-- ancestor (Latin), as this would be a false positive; it should be one or the other).
		if not parent_check then
			return nil
		end
		for _, ancestor in ipairs(ancestors) do
			local ancestorParents = ancestor:getParentChain()
			for _, ancestorParent in ipairs(ancestorParents) do
				if ancestorParent:getCode() == self._code or ancestorParent:hasAncestor(ancestor) then
					break
				else
					insert(ancestorsParents, ancestorParent)
				end
			end
		end
		for _, ancestorParent in ipairs(ancestorsParents) do
			local ret = func(ancestorParent)
			if ret then
				return ret
			end
		end
	end

	local function do_iteration(otherlang, parent_check)
		-- otherlang can't be self
		if (type(otherlang) == "string" and otherlang or otherlang:getCode()) == self._code then
			return false
		end
		repeat
			if iterateOverAncestorTree(
				self,
				function(ancestor)
					return ancestor:getCode() == (type(otherlang) == "string" and otherlang or otherlang:getCode())
				end,
				parent_check
			) then
				return true
			elseif type(otherlang) == "string" then
				otherlang = get_by_code(otherlang, nil, true)
			end
			otherlang = otherlang:getParent()
			parent_check = false
		until not otherlang
	end

	local parent_check = true

	for _, otherlang in ipairs{...} do
		local ret = do_iteration(otherlang, parent_check)
		if ret then
			return true
		end
	end
	return false
end

do
	local function construct_node(lang, memo)
		local branch, ancestors = {lang = lang:getCode()}
		memo[lang:getCode()] = branch
		for _, ancestor in ipairs(lang:getAncestors()) do
			if ancestors == nil then
				ancestors = {}
			end
			insert(ancestors, memo[ancestor:getCode()] or construct_node(ancestor, memo))
		end
		branch.ancestors = ancestors
return branch end function Language:getAncestorChain() local chain = self._ancestorChain if chain == nil then chain = construct_node(self, {}) self._ancestorChain = chain end return chain end end function Language:getAncestorChainOld() local chain = self._ancestorChain if chain == nil then chain = {} local step = self while true do local ancestors = step:getAncestors() step = #ancestors == 1 and ancestors[1] or nil if not step then break end insert(chain, step) end self._ancestorChain = chain end return chain end local function fetch_descendants(self, fmt) local descendants, family = {}, self:getFamily() -- Iterate over all three datasets. for _, data in ipairs{ require("Module:languages/code to canonical name"), require("Module:etymology languages/code to canonical name"), require("Module:families/code to canonical name"), } do for code in pairs(data) do local lang = get_by_code(code, nil, true, true) -- Test for a descendant. Earlier tests weed out most candidates, while the more intensive tests are only used sparingly. if ( code ~= self._code and -- Not self. lang:inFamily(family) and -- In the same family. ( family:getProtoLanguageCode() == self._code or -- Self is the protolanguage. self:hasDescendant(lang) or -- Full hasDescendant check. (lang:getFullCode() == self._code and not self:hasAncestor(lang)) -- Etymology-only child which isn't an ancestor. 
) ) then if fmt == "object" then insert(descendants, lang) elseif fmt == "code" then insert(descendants, code) elseif fmt == "name" then insert(descendants, lang:getCanonicalName()) end end end end return descendants end function Language:getDescendants() local descendants = self._descendantObjects if descendants == nil then descendants = fetch_descendants(self, "object") self._descendantObjects = descendants end return descendants end function Language:getDescendantCodes() local descendants = self._descendantCodes if descendants == nil then descendants = fetch_descendants(self, "code") self._descendantCodes = descendants end return descendants end function Language:getDescendantNames() local descendants = self._descendantNames if descendants == nil then descendants = fetch_descendants(self, "name") self._descendantNames = descendants end return descendants end do local function check_lang(self, lang) if type(lang) == "string" then lang = get_by_code(lang, nil, true) end if lang:hasAncestor(self) then return true end end function Language:hasDescendant(...) return check_inputs(self, check_lang, false, ...) 
end end local function fetch_children(self, fmt) local m_etym_data = require(etymology_languages_data_module) local self_code, children = self._code, {} for code, lang in pairs(m_etym_data) do local _lang = lang repeat local parent = _lang.parent if parent == self_code then if fmt == "object" then insert(children, get_by_code(code, nil, true)) elseif fmt == "code" then insert(children, code) elseif fmt == "name" then insert(children, lang[1]) end break end _lang = m_etym_data[parent] until not _lang end return children end function Language:getChildren() local children = self._childObjects if children == nil then children = fetch_children(self, "object") self._childObjects = children end return children end function Language:getChildrenCodes() local children = self._childCodes if children == nil then children = fetch_children(self, "code") self._childCodes = children end return children end function Language:getChildrenNames() local children = self._childNames if children == nil then children = fetch_children(self, "name") self._childNames = children end return children end function Language:hasChild(...) local lang = ... if not lang then return false elseif type(lang) == "string" then lang = get_by_code(lang, nil, true) end if lang:hasParent(self) then return true end return self:hasChild(select(2, ...)) end --[==[Returns the name of the main category of that language. Example: {{code|lua|"French language"}} for French, whose category is at [[:Category:French language]]. Unless optional argument <code>nocap</code> is given, the language name at the beginning of the returned value will be capitalized. This capitalization is correct for category names, but not if the language name is lowercase and the returned value of this function is used in the middle of a sentence.]==] function Language:getCategoryName(nocap) local name = self._categoryName if name == nil then name = self:getCanonicalName() -- If a substrate, omit any leading article. 
if self:getFamilyCode() == "qfa-sub" then name = name:gsub("^the ", ""):gsub("^a ", "") end -- Only add " language" if a full language. if self:hasType("full") then -- Unless the canonical name already ends with "language", "lect" or their derivatives, add " language". --if not (match(name, "[Ll]anguage$") or match(name, "[Ll]ect$")) then if not (match(name, "^ภาษา") or match(name, "^ภาษณ์")) then name = "ภาษา" .. name end end self._categoryName = name end if nocap then return name end return mw.getContentLanguage():ucfirst(name) end --[==[Creates a link to the category; the link text is the canonical name.]==] function Language:makeCategoryLink() return make_link(self, ":Category:" .. self:getCategoryName(), self:getDisplayForm()) end function Language:getStandardCharacters(sc) local standard_chars = self._data.standard_chars if type(standard_chars) ~= "table" then return standard_chars elseif sc and type(sc) ~= "string" then check_object("script", nil, sc) sc = sc:getCode() end if (not sc) or sc == "None" then local scripts = {} for _, script in pairs(standard_chars) do insert(scripts, script) end return concat(scripts) end if standard_chars[sc] then return standard_chars[sc] .. (standard_chars[1] or "") end end --[==[ Strip diacritics from display text `text` (in a language-specific fashion), which is in the script `sc`. If `sc` is omitted or {nil}, the script is autodetected. This also strips certain punctuation characters from the end and (in the case of Spanish upside-down question mark and exclamation points) from the beginning; strips any whitespace at the end of the text or between the text and final stripped punctuation characters; and applies some language-specific Unicode normalizations to replace discouraged characters with their prescribed alternatives. Return the stripped text. 
]==] function Language:stripDiacritics(text, sc) if (not text) or text == "" then return text end sc = checkScript(text, self, sc) text = normalize(text, sc) -- FIXME, rename makeEntryName to stripDiacritics and get rid of second and third return values -- everywhere text, _, _ = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.strip_diacritics or self._data.entry_name, "strip_diacritics", "stripDiacritics") text = umatch(text, "^[¿¡]?(.-[^%s%p].-)%s*[؟?!;՛՜ ՞ ՟?!︖︕।॥။၊་།]?$") or text return text end --[==[ Convert a ''logical'' pagename (the pagename as it appears to the user, after diacritics and punctuation have been stripped) to a ''physical'' pagename (the pagename as it appears in the MediaWiki database). Reasons for a difference between the two are (a) unsupported titles such as `[ ]` (with square brackets in them), `#` (pound/hash sign) and `¯\_(ツ)_/¯` (with underscores), as well as overly long titles of various sorts; (b) "mammoth" pages that are split into parts (e.g. `a`, which is split into physical pagenames `a/languages A to L` and `a/languages M to Z`). For almost all purposes, you should work with logical and not physical pagenames. But there are certain use cases that require physical pagenames, such as checking the existence of a page or retrieving a page's contents. `pagename` is the logical pagename to be converted. `is_reconstructed_or_appendix` indicates whether the page is in the `Reconstruction` or `Appendix` namespaces. If it is omitted or has the value {nil}, the pagename is checked for an initial asterisk, and if found, the page is assumed to be a `Reconstruction` page. Setting a value of `false` or `true` to `is_reconstructed_or_appendix` disables this check and allows for mainspace pagenames that begin with an asterisk. ]==] function Language:logicalToPhysical(pagename, is_reconstructed_or_appendix) -- FIXME: This probably shouldn't happen but it happens when makeEntryName() receives nil. 
if pagename == nil then track("nil-passed-to-logicalToPhysical") return nil end local initial_asterisk if is_reconstructed_or_appendix == nil then local pagename_minus_initial_asterisk initial_asterisk, pagename_minus_initial_asterisk = pagename:match("^(%*)(.*)$") if pagename_minus_initial_asterisk then is_reconstructed_or_appendix = true pagename = pagename_minus_initial_asterisk elseif self:hasType("appendix-constructed") then is_reconstructed_or_appendix = true end end if not is_reconstructed_or_appendix then -- Check if the pagename is a listed unsupported title. local unsupportedTitles = load_data(links_data_module).unsupported_titles if unsupportedTitles[pagename] then return "Unsupported titles/" .. unsupportedTitles[pagename] end end -- Set `unsupported` as true if certain conditions are met. local unsupported -- Check if there's an unsupported character. \239\191\189 is the replacement character U+FFFD, which can't be typed -- directly here due to an abuse filter. Unix-style dot-slash notation is also unsupported, as it is used for -- relative paths in links, as are 3 or more consecutive tildes. Note: match is faster with magic -- characters/charsets; find is faster with plaintext. if ( match(pagename, "[#<>%[%]_{|}]") or find(pagename, "\239\191\189") or match(pagename, "%f[^%z/]%.%.?%f[%z/]") or find(pagename, "~~~") ) then unsupported = true -- If it looks like an interwiki link. elseif find(pagename, ":") then local prefix = gsub(pagename, "^:*(.-):.*", ulower) if ( load_data("Module:data/namespaces")[prefix] or load_data("Module:data/interwikis")[prefix] ) then unsupported = true end end -- Escape unsupported characters so they can be used in titles. ` is used as a delimiter for this, so a raw use of -- it in an unsupported title is also escaped here to prevent interference; this is only done with unsupported -- titles, though, so inclusion won't in itself mean a title is treated as unsupported (which is why it's excluded -- from the earlier test). 
if unsupported then -- FIXME: This conversion needs to be different for reconstructed pages with unsupported characters. There -- aren't any currently, but if there ever are, we need to fix this e.g. to put them in something like -- Reconstruction:Proto-Indo-European/Unsupported titles/`lowbar``num`. local unsupported_characters = load_data(links_data_module).unsupported_characters pagename = pagename:gsub("[#<>%[%]_`{|}\239]\191?\189?", unsupported_characters) :gsub("%f[^%z/]%.%.?%f[%z/]", function(m) return (gsub(m, "%.", "`period`")) end) :gsub("~~~+", function(m) return (gsub(m, "~", "`tilde`")) end) pagename = "ชื่อไม่รองรับ/" .. pagename elseif not is_reconstructed_or_appendix then -- Check if this is a mammoth page. If so, which subpage should we link to? local m_links_data = load_data(links_data_module) local mammoth_page_type = m_links_data.mammoth_pages[pagename] if mammoth_page_type then local canonical_name = self:getFullName() if canonical_name ~= "ร่วม" and canonical_name ~= "อังกฤษ" then local this_subpage local L2_sort_key = get_L2_sort_key(canonical_name) for _, subpage_spec in ipairs(m_links_data.mammoth_page_subpage_types[mammoth_page_type]) do -- unpack() fails utterly on data loaded using mw.loadData() even if offsets are given local subpage, pattern = subpage_spec[1], subpage_spec[2] if pattern == true or L2_sort_key:match(pattern) then this_subpage = subpage break end end if not this_subpage then error(("Internal error: Bad data in mammoth_page_subpage_pages in [[Module:links/data]] for mammoth page %s, type %s; last entry didn't have 'true' in it"):format( pagename, mammoth_page_type)) end pagename = pagename .. "/" .. this_subpage end end end return (initial_asterisk or "") .. pagename end --[==[ Strip the diacritics from a display pagename and convert the resulting logical pagename into a physical pagename. This allows you, for example, to retrieve the contents of the page or check its existence. 
WARNING: This is deprecated and will be going away. It is a simple composition of `self:stripDiacritics` and `self:logicalToPhysical`; most callers only want the former, and if you need both, call them both yourself. `text` and `sc` are as in `self:stripDiacritics`, and `is_reconstructed_or_appendix` is as in `self:logicalToPhysical`. ]==] function Language:makeEntryName(text, sc, is_reconstructed_or_appendix) return self:logicalToPhysical(self:stripDiacritics(text, sc), is_reconstructed_or_appendix) end --[==[Generates alternative forms using a specified method, and returns them as a table. If no method is specified, returns a table containing only the input term.]==] function Language:generateForms(text, sc) local generate_forms = self._data.generate_forms if generate_forms == nil then return {text} end sc = checkScript(text, self, sc) return require("Module:" .. self._data.generate_forms).generateForms(text, self, sc) end --[==[Creates a sort key for the given stripped text, following the rules appropriate for the language. This removes diacritical marks from the stripped text if they are not considered significant for sorting, and may perform some other changes. Any initial hyphen is also removed, and anything in parentheses is removed as well. The <code>sort_key</code> setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the stripped text and returns a sortkey.]==] function Language:makeSortKey(text, sc) if (not text) or text == "" then return text end if match(text, "<[^<>]+>") then track("track HTML tag") end -- Remove directional characters, bold, italics, soft hyphens, strip markers and HTML tags. -- FIXME: Partly duplicated with remove_formatting() in [[Module:links]]. 
text = ugsub(text, "[\194\173\226\128\170-\226\128\174\226\129\166-\226\129\169]", "") text = text:gsub("('*)'''(.-'*)'''", "%1%2"):gsub("('*)''(.-'*)''", "%1%2") text = gsub(unstrip(text), "<[^<>]+>", "") text = decode_uri(text, "PATH") text = checkNoEntities(self, text) -- Remove initial hyphens and * unless the term only consists of spacing + punctuation characters. text = ugsub(text, "^([􀀀-􏿽]*)[-־ـ᠊*]+([􀀀-􏿽]*)(.*[^%s%p].*)", "%1%2%3") sc = checkScript(text, self, sc) text = normalize(text, sc) text = removeCarets(text, sc) -- For languages with dotted dotless i, ensure that "İ" is sorted as "i", and "I" is sorted as "ı". if self:hasDottedDotlessI() then text = gsub(text, "I\204\135", "i") -- decomposed "İ" :gsub("I", "ı") text = sc:toFixedNFD(text) end -- Convert to lowercase, make the sortkey, then convert to uppercase. Where the language has dotted dotless i, it is -- usually not necessary to convert "i" to "İ" and "ı" to "I" first, because "I" will always be interpreted as -- conventional "I" (not dotless "İ") by any sorting algorithms, which will have been taken into account by the -- sortkey substitutions themselves. However, if no sortkey substitutions have been specified, then conversion is -- necessary so as to prevent "i" and "ı" both being sorted as "I". -- -- An exception is made for scripts that (sometimes) sort by scraping page content, as that means they are sensitive -- to changes in capitalization (as it changes the target page). if not sc:sortByScraping() then text = ulower(text) end local actual_substitution_data -- Don't trim whitespace here because it's significant at the beginning of a sort key or sort base. 
	text, _, actual_substitution_data = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.sort_key, "sort_key", "makeSortKey", "notrim")

	if not sc:sortByScraping() then
		if self:hasDottedDotlessI() and not actual_substitution_data then
			text = text:gsub("ı", "I"):gsub("i", "İ")
			text = sc:toFixedNFC(text)
		end
		text = uupper(text)
	end

	-- Remove parentheses, as long as they are either preceded or followed by something.
	text = gsub(text, "(.)[()]+", "%1"):gsub("[()]+(.)", "%1")

	text = escape_risky_characters(text)
	return text
end

--[==[Create the form used as a basis for display text and transliteration. FIXME: Rename to correctInputText().]==]
local function processDisplayText(text, self, sc, keepCarets, keepPrefixes)
	local subbedChars = {}
	text, subbedChars = doTempSubstitutions(text, subbedChars, keepCarets)

	text = decode_uri(text, "PATH")
	text = checkNoEntities(self, text)

	sc = checkScript(text, self, sc)
	text = normalize(text, sc)
	text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, self._data.display_text, "display_text", "makeDisplayText")

	text = removeCarets(text, sc)

	-- Remove any interwiki link prefixes (unless they have been escaped or this has been disabled).
	if find(text, ":") and not keepPrefixes then
		local rep
		repeat
			text, rep = gsub(text, "\\\\(\\*:)", "\3%1")
		until rep == 0
		text = gsub(text, "\\:", "\4")
		while true do
			local prefix = gsub(text, "^(.-):.+", function(m1)
				return (gsub(m1, "\244[\128-\191]*", ""))
			end)
			-- Check if the prefix is an interwiki, though ignore capitalised Wiktionary:, which is a namespace.
			if not prefix or prefix == text or prefix == "Wiktionary"
				or not (load_data("Module:data/interwikis")[ulower(prefix)] or prefix == "") then
				break
			end
			text = gsub(text, "^(.-):(.*)", function(m1, m2)
				local ret = {}
				for subbedChar in gmatch(m1, "\244[\128-\191]*") do
					insert(ret, subbedChar)
				end
				return concat(ret) ..
m2 end) end text = gsub(text, "\3", "\\"):gsub("\4", ":") end return text, subbedChars end --[==[Make the display text (i.e. what is displayed on the page).]==] function Language:makeDisplayText(text, sc, keepPrefixes) if not text or text == "" then return text end local subbedChars text, subbedChars = processDisplayText(text, self, sc, nil, keepPrefixes) text = escape_risky_characters(text) return undoTempSubstitutions(text, subbedChars) end --[==[Transliterates the text from the given script into the Latin script (see [[Wiktionary:Transliteration and romanization]]). The language must have the <code>translit</code> property for this to work; if it is not present, {{code|lua|nil}} is returned. The <code>sc</code> parameter is handled by the transliteration module, and how it is handled is specific to that module. Some transliteration modules may tolerate {{code|lua|nil}} as the script, others require it to be one of the possible scripts that the module can transliterate, and will throw an error if it's not one of them. For this reason, the <code>sc</code> parameter should always be provided when writing non-language-specific code. The <code>module_override</code> parameter is used to override the default module that is used to provide the transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no default module yet, or you want to demonstrate an alternative version of a transliteration module before making it official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked by [[Wiktionary:Tracking/languages/module_override]]. '''Known bugs''': * This function assumes {tr(s1) .. tr(s2) == tr(s1 .. s2)}. When this assertion fails, wikitext markups like <nowiki>'''</nowiki> can cause wrong transliterations. * HTML entities like <code>&amp;apos;</code>, often used to escape wikitext markups, do not work. 
]==]
function Language:transliterate(text, sc, module_override)
	-- If there is no text, or the text is empty or "-", return it unchanged.
	if not text or text == "" or text == "-" then
		return text
	end
	-- If the script is not transliteratable (and no override is given), return nil.
	sc = checkScript(text, self, sc)
	if not (sc:isTransliterated() or module_override) then
		-- temporary tracking to see if/when this gets triggered
		track("non-transliterable")
		track("non-transliterable/" .. self._code)
		track("non-transliterable/" .. sc:getCode())
		track("non-transliterable/" .. sc:getCode() .. "/" .. self._code)
		return nil
	end

	-- Remove any strip markers.
	text = unstrip(text)

	-- Do not process the formatting into PUA characters for certain languages.
	local processed = load_data(languages_data_module).substitution[self._code] ~= "none"

	-- Get the display text with the keepCarets flag set.
	local subbedChars
	if processed then
		text, subbedChars = processDisplayText(text, self, sc, true)
	end

	-- Transliterate (using the module override if applicable).
	text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, true, module_override or self._data.translit, "translit", "tr")

	if not text then
		return nil
	end

	-- Incomplete transliterations return nil.
	local charset = sc.characters
	if charset and umatch(text, "[" .. charset .. "]") then
		-- Remove any characters in Latin, which includes Latin characters also included in other scripts (as these are
		-- false positives), as well as any PUA substitutions. Anything remaining should only be script code "None"
		-- (e.g. numerals).
		local check_text = ugsub(text, "[" .. get_script("Latn").characters .. "􀀀-􏿽]+", "")
		-- Set none_is_last_resort_only flag, so that any non-None chars will cause a script other than "None" to be
		-- returned.
if find_best_script_without_lang(check_text, true):getCode() ~= "None" then return nil end end if processed then text = escape_risky_characters(text) text = undoTempSubstitutions(text, subbedChars) end -- If the script does not use capitalization, then capitalize any letters of the transliteration which are -- immediately preceded by a caret (and remove the caret). if text and not sc:hasCapitalization() and text:find("^", 1, true) then text = processCarets(text, "%^([\128-\191\244]*%*?)([^\128-\191\244][\128-\191]*)", function(m1, m2) return m1 .. uupper(m2) end) end -- Track module overrides. if module_override ~= nil then track("module_override") end return text end do local function handle_language_spec(self, spec, sc) local ret = self["_" .. spec] if ret == nil then ret = self._data[spec] if type(ret) == "string" then ret = list_to_set(split(ret, ",", true, true)) end self["_" .. spec] = ret end if type(ret) == "table" then ret = ret[sc:getCode()] end return not not ret end function Language:overrideManualTranslit(sc) return handle_language_spec(self, "override_translit", sc) end function Language:link_tr(sc) return handle_language_spec(self, "link_tr", sc) end end --[==[Returns {{code|lua|true}} if the language has a transliteration module, or {{code|lua|false}} if it doesn't.]==] function Language:hasTranslit() return not not self._data.translit end --[==[Returns {{code|lua|true}} if the language uses the letters I/ı and İ/i, or {{code|lua|false}} if it doesn't.]==] function Language:hasDottedDotlessI() return not not self._data.dotted_dotless_i end function Language:toJSON(opts) local strip_diacritics, strip_diacritics_patterns, strip_diacritics_remove_diacritics = self._data.strip_diacritics if strip_diacritics then if strip_diacritics.from then strip_diacritics_patterns = {} for i, from in ipairs(strip_diacritics.from) do insert(strip_diacritics_patterns, {from = from, to = strip_diacritics.to[i] or ""}) end end strip_diacritics_remove_diacritics = 
			strip_diacritics.remove_diacritics
	end
	-- mainCode should only end up non-nil if dontCanonicalizeAliases is passed to make_object().
	-- props should either contain zero-argument functions to compute the value, or the value itself.
	local props = {
		ancestors = function() return self:getAncestorCodes() end,
		canonicalName = function() return self:getCanonicalName() end,
		categoryName = function() return self:getCategoryName("nocap") end,
		code = self._code,
		mainCode = self._mainCode,
		parent = function() return self:getParentCode() end,
		full = function() return self:getFullCode() end,
		stripDiacriticsPatterns = strip_diacritics_patterns,
		stripDiacriticsRemoveDiacritics = strip_diacritics_remove_diacritics,
		family = function() return self:getFamilyCode() end,
		aliases = function() return self:getAliases() end,
		varieties = function() return self:getVarieties() end,
		otherNames = function() return self:getOtherNames() end,
		scripts = function() return self:getScriptCodes() end,
		type = function() return keys_to_list(self:getTypes()) end,
		wikimediaLanguages = function() return self:getWikimediaLanguageCodes() end,
		wikidataItem = function() return self:getWikidataItem() end,
		wikipediaArticle = function() return self:getWikipediaArticle(true) end,
	}
	local ret = {}
	-- `opts` may be nil (the return statement below already allows for that), so guard the `skip_fields` lookup.
	local skip_fields = opts and opts.skip_fields
	for prop, val in pairs(props) do
		if not (skip_fields and skip_fields[prop]) then
			if type(val) == "function" then
				ret[prop] = val()
			else
				ret[prop] = val
			end
		end
	end
	-- Use `deep_copy` when returning a table, so that there are no editing restrictions imposed by `mw.loadData`.
	return opts and opts.lua_table and deep_copy(ret) or to_json(ret, opts)
end

function export.getDataModuleName(code)
	local letter = match(code, "^(%l)%l%l?$")
	return "Module:" .. (
		letter == nil and "languages/data/exceptional" or
		#code == 2 and "languages/data/2" or
		"languages/data/3/" .. letter
	)
end
get_data_module_name = export.getDataModuleName

function export.getExtraDataModuleName(code)
	return get_data_module_name(code) ..
"/extra" end get_extra_data_module_name = export.getExtraDataModuleName do local function make_stack(data) local key_types = { [2] = "unique", aliases = "unique", otherNames = "unique", type = "append", varieties = "unique", wikipedia_article = "unique", wikimedia_codes = "unique" } local function __index(self, k) local stack, key_type = getmetatable(self), key_types[k] -- Data that isn't inherited from the parent. if key_type == "unique" then local v = stack[stack[make_stack]][k] if v == nil then local layer = stack[0] if layer then -- Could be false if there's no extra data. v = layer[k] end end return v -- Data that is appended by each generation. elseif key_type == "append" then local parts, offset, n = {}, 0, stack[make_stack] for i = 1, n do local part = stack[i][k] if part == nil then offset = offset + 1 else parts[i - offset] = part end end return offset ~= n and concat(parts, ",") or nil end local n = stack[make_stack] while true do local layer = stack[n] if not layer then -- Could be false if there's no extra data. return nil end local v = layer[k] if v ~= nil then return v end n = n - 1 end end local function __newindex() error("table is read-only") end local function __pairs(self) -- Iterate down the stack, caching keys to avoid duplicate returns. local stack, seen = getmetatable(self), {} local n = stack[make_stack] local iter, state, k, v = pairs(stack[n]) return function() repeat repeat k = iter(state, k) if k == nil then n = n - 1 local layer = stack[n] if not layer then -- Could be false if there's no extra data. return nil end iter, state, k = pairs(layer) end until not (k == nil or seen[k]) -- Get the value via a lookup, as the one returned by the -- iterator will be the raw value from the current layer, -- which may not be the one __index will return for that -- key. Also memoize the key in `seen` (even if the lookup -- returns nil) so that it doesn't get looked up again. 
-- TODO: store values in `self`, avoiding the need to create -- the `seen` table. The iterator will need to iterate over -- `self` with `next` first to find these on future loops. v, seen[k] = self[k], true until v ~= nil return k, v end end local __ipairs = require(table_module).indexIpairs function make_stack(data) local stack = { data, [make_stack] = 1, -- stores the length and acts as a sentinel to confirm a given metatable is a stack. __index = __index, __newindex = __newindex, __pairs = __pairs, __ipairs = __ipairs, } stack.__metatable = stack return setmetatable({}, stack), stack end return make_stack(data) end local function get_stack(data) local stack = getmetatable(data) return stack and type(stack) == "table" and stack[make_stack] and stack or nil end --[==[ <span style="color: var(--wikt-palette-red,#BA0000)">This function is not for use in entries or other content pages.</span> Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes. If `extra` is set, any extra data in the relevant `/extra` module will be included. (Note that it will be included anyway if it has already been loaded into the language object.) If `raw` is set, then the returned data will not contain any data inherited from parent objects. -- Do NOT use these methods! -- All uses should be pre-approved on the talk page! ]==] function Language:getData(extra, raw) if extra then self:loadInExtraData() end local data = self._data -- If raw is not set, just return the data. if not raw then return data end local stack = get_stack(data) -- If there isn't a stack or its length is 1, return the data. Extra data (if any) will be included, as it's stored at key 0 and doesn't affect the reported length. 
	if stack == nil then
		return data
	end
	local n = stack[make_stack]
	if n == 1 then
		return data
	end
	local extra = stack[0]
	-- If there isn't any extra data, return the top layer of the stack.
	if extra == nil then
		return stack[n]
	end
	-- If there is, return a new stack which has the top layer at key 1 and the extra data at key 0.
	data, stack = make_stack(stack[n])
	stack[0] = extra
	return data
end

function Language:loadInExtraData()
	-- Only full languages have extra data.
	if not self:hasType("language", "full") then
		return
	end
	local data = self._data
	-- If there's no stack, create one.
	local stack = get_stack(self._data)
	if stack == nil then
		data, stack = make_stack(data)
	-- If already loaded, return.
	elseif stack[0] ~= nil then
		return
	end
	self._data = data
	-- Load extra data from the relevant module and add it to the stack at key 0, so that the __index and __pairs metamethods will pick it up, since they iterate down the stack until they run out of layers.
	local code = self._code
	local modulename = get_extra_data_module_name(code)
	-- No data cached as false.
	stack[0] = modulename and load_data(modulename)[code] or false
end

--[==[Returns the name of the module containing the language's data. For etymology-only languages this is the etymology languages data module; otherwise it is [[Module:languages/data/2]], `Module:languages/data/3/` followed by the first letter of the code, or [[Module:languages/data/exceptional]], depending on the form of the code.]==]
function Language:getDataModuleName()
	local name = self._dataModuleName
	if name == nil then
		name = self:hasType("etymology-only") and etymology_languages_data_module or
			get_data_module_name(self._mainCode or self._code)
		self._dataModuleName = name
	end
	return name
end

--[==[Returns the name of the module containing the language's extra data, i.e. the `/extra` subpage of its data module, or {nil} for etymology-only languages, which have no extra data module.]==]
function Language:getExtraDataModuleName()
	local name = self._extraDataModuleName
	if name == nil then
		name = not self:hasType("etymology-only") and get_extra_data_module_name(self._mainCode or self._code) or false
		self._extraDataModuleName = name
	end
	return name or nil
end

function export.makeObject(code, data, dontCanonicalizeAliases)
	local data_type = type(data)
	if data_type ~= "table" then
		error(("bad argument #2 to 'makeObject' (table expected, got %s)"):format(data_type))
	end

	-- Convert any aliases.
	local input_code = code
	code = normalize_code(code)
	input_code = dontCanonicalizeAliases and input_code or code

	local parent
	if data.parent then
		parent = get_by_code(data.parent, nil, true, true)
	else
		parent = Language
	end
	parent.__index = parent

	local lang = {_code = input_code}
	-- This can only happen if dontCanonicalizeAliases is passed to make_object().
	if code ~= input_code then
		lang._mainCode = code
	end

	local parent_data = parent._data
	if parent_data == nil then
		-- Full code is the same as the code.
		lang._fullCode = parent._code or code
	else
		-- Copy full code.
		lang._fullCode = parent._fullCode
		local stack = get_stack(parent_data)
		if stack == nil then
			parent_data, stack = make_stack(parent_data)
		end
		-- Insert the input data as the new top layer of the stack.
		local n = stack[make_stack] + 1
		data, stack[n], stack[make_stack] = parent_data, data, n
	end
	lang._data = data

	return setmetatable(lang, parent)
end
make_object = export.makeObject
end

--[==[Finds the language whose code matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated.
If <code class="n">paramForError</code> is {{code|lua|true}}, a generic error message mentioning the bad code is generated; otherwise <code class="n">paramForError</code> should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.]==] function export.getByCode(code, paramForError, allowEtymLang, allowFamily) -- Track uses of paramForError, ultimately so it can be removed, as error-handling should be done by [[Module:parameters]], not here. if paramForError ~= nil then track("paramForError") end if type(code) ~= "string" then local typ if not code then typ = "nil" elseif check_object("language", true, code) then typ = "a language object" elseif check_object("family", true, code) then typ = "a family object" else typ = "a " .. type(code) end error("The function getByCode expects a string as its first argument, but received " .. typ .. ".") end local m_data = load_data(languages_data_module) if m_data.aliases[code] or m_data.track[code] then track(code) end local norm_code = normalize_code(code) -- Get the data, checking for etymology-only languages if allowEtymLang is set. local data = load_data(get_data_module_name(norm_code))[norm_code] or allowEtymLang and load_data(etymology_languages_data_module)[norm_code] -- If no data was found and allowFamily is set, check the family data. If the main family data was found, make the object with [[Module:families]] instead, as family objects have different methods. However, if it's an etymology-only family, use make_object in this module (which handles object inheritance), and the family-specific methods will be inherited from the parent object. 
if data == nil and allowFamily then data = load_data("Module:families/data")[norm_code] if data ~= nil then if data.parent == nil then return make_family_object(norm_code, data) elseif not allowEtymLang then data = nil end end end local retval = code and data and make_object(code, data) if not retval and paramForError then require("Module:languages/errorGetBy").code(code, paramForError, allowEtymLang, allowFamily) end return retval end get_by_code = export.getByCode --[==[Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes. The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result. This function is powered by [[Module:languages/canonical names]], which contains a pre-generated mapping of full-language canonical names to codes. It is generated by going through the [[:Category:Language data modules]] for full languages. 
When <code class="n">allowEtymLang</code> is specified for the above function, [[Module:etymology languages/canonical names]] may also be used, and when <code class="n">allowFamily</code> is specified for the above function, [[Module:families/canonical names]] may also be used.]==] function export.getByCanonicalName(name, errorIfInvalid, allowEtymLang, allowFamily) local byName = load_data("Module:languages/canonical names") local code = byName and byName[name] if not code and allowEtymLang then byName = load_data("Module:etymology languages/canonical names") code = byName and byName[name] or byName[gsub(name, " [Ss]ubstrate$", "")] or byName[gsub(name, "^a ", "")] or byName[gsub(name, "^a ", ""):gsub(" [Ss]ubstrate$", "")] or -- For etymology families like "ira-pro". -- FIXME: This is not ideal, as it allows " languages" to be appended to any etymology-only language, too. byName[match(name, "^กลุ่มภาษา(.*)$")] end if not code and allowFamily then byName = load_data("Module:families/canonical names") code = byName[name] or byName[match(name, "^กลุ่มภาษา(.*)$")] end local retval = code and get_by_code(code, errorIfInvalid, allowEtymLang, allowFamily) if not retval and errorIfInvalid then require("Module:languages/errorGetBy").canonicalName(name, allowEtymLang, allowFamily) end return retval end --[==[Used by [[Module:languages/data/2]] (et al.) and [[Module:etymology languages/data]], [[Module:families/data]], [[Module:scripts/data]] and [[Module:writing systems/data]] to finalize the data into the format that is actually returned.]==] function export.finalizeData(data, main_type, variety) local fields = {"type"} if main_type == "language" then insert(fields, 4) -- script codes insert(fields, "ancestors") insert(fields, "link_tr") insert(fields, "override_translit") insert(fields, "wikimedia_codes") elseif main_type == "script" then insert(fields, 3) -- writing system codes end -- Families and writing systems have no extra fields to process. 
local fields_len = #fields for _, entity in next, data do if variety then -- Move parent from 3 to "parent" and family from "family" to 3. These are different for the sake of convenience, since very few varieties have the family specified, whereas all of them have a parent. entity.parent, entity[3], entity.family = entity[3], entity.family -- Give the type "regular" iff not a variety and no other types are assigned. elseif not (entity.type or entity.parent) then entity.type = "regular" end for i = 1, fields_len do local key = fields[i] local field = entity[key] if field and type(field) == "string" then entity[key] = gsub(field, "%s*,%s*", ",") end end end return data end --[==[For backwards compatibility only; modules should require the error themselves.]==] function export.err(lang_code, param, code_desc, template_tag, not_real_lang) return require("Module:languages/error")(lang_code, param, code_desc, template_tag, not_real_lang) end return export s12b02rmox9ex4qh204ad7z74n6xdk3 ท่อง 0 36466 5720722 1895413 2026-04-21T04:21:18Z Apisite 10648 5720722 wikitext text/x-wiki {{also/auto}} == ภาษาไทย == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|lo|ທ່ອງ}} === การออกเสียง === {{th-pron|ท็่อง}} === คำกริยา === {{th-verb}} # [[เดิน]][[ก้าว]][[ไป]][[ใน]][[น้ำ]] #: {{ux|th|ท่องน้ำ}} # [[ว่า]][[ซ้ำ]] ๆ [[ให้]][[จำ]][[ได้]] #: {{ux|th|ท่องหนังสือ}} {{topics|th|การศึกษา}} 4c8wahj1eyr4tfqxwm5u7lgspssc77j rhino 0 43557 5720711 2018970 2026-04-21T02:09:34Z OctraBot 3198 เก็บกวาด 5720711 wikitext text/x-wiki == ภาษาอังกฤษ == === การออกเสียง === * {{IPA|en|/ˈɹaɪ.nəʊ/|a=UK}} * {{enPR|rīʹnō|a=US}}, {{IPA|en|/ˈɹaɪ.noʊ/}} * {{audio|en|En-au-rhino.ogg|a=AU}} * {{rhymes|en|aɪnəʊ|s=2}} * {{homophones|en|RINO}} === รากศัพท์ 1 === {{unk|en}} ==== รูปแบบอื่น ==== * {{alt|en|rino}} ==== คำนาม ==== {{en-noun|-}} # {{lb|en|slang|now|rare}} [[เงิน]] {{defdate|from 17th c.}} #* {{quote-text|en|year=1792|author=w:Thomas Holcroft|title=Anne St. Ives|section=vol. 
III.52 |passage=When so be as a man has no money, why then, a savin and exceptin your onnur's reverence, a's but a poor dog. But when so be as a man as got the '''rhino''', why then a may begin to hold up his head.}} #* {{quote-text|en|year=1835|author=w:Frederick Marryat|title=The Pacha of Many Tales |passage=There I fell in with Betsy, and as she proved a regular out and outer, I spliced her, and a famous wedding we had of it, as long as the '''rhino''' lasted.}} #* {{RQ:Joyce Ulysses|chapter=Episode 12: The Cyclops|passage=—Here you are, says Alf, chucking out the '''rhino'''. Talking about hanging, I'll show you something you never saw}} === รากศัพท์ 2 === {{clipping|en|rhinoceros}} ==== คำนาม ==== {{en-noun}} # {{lb|en|colloquial}} [[แรด]] {{defdate|from 19th c.}} #* {{quote-book|en|year=1932|year_published=1965|author=w:Delos W. Lovelace|title=[[w:King Kong (1933 film)|King Kong]]|page=24|passage=‘We were getting a grand shot of a charging '''rhino''' when the cameraman got scared and bolted. The fathead!’}} #* {{quote-journal|en|year=1961|month=October|title=Talking of Trains: B.R. 
exile at work?|journal=Trains Illustrated|page=586|text=This cutting from an East African newspaper caught our eye last month: "The up mail train from Mombasa was held up for an hour at Kibwezi by an angry '''rhino''' on Monday night."}} ===== ลูกคำ ===== {{col2|en|black rhino|Indian rhino|Javan rhino|Sumatran rhino|white rhino|woolly rhino|Merck's rhino|narrow-nosed rhino|rhinolike|rhinoless|rhino beetle|rhino ferry}} == ภาษาฝรั่งเศส == === รากศัพท์ === {{clipping|fr|rhinocéros}} === การออกเสียง === * {{fr-IPA}} * {{audio|fr|LL-Q150 (fra)-Bananax47-rhino.wav|a=<<France>> (<<Agen>>)}} === คำนาม === {{fr-noun|m}} # {{lb|fr|informal}} [[แรด]] {{C|fr|แรด}} 4lnkp9hkr6o3zdzxgclu33607m5mgam 5720712 5720711 2026-04-21T02:09:43Z OctraBot 3198 เรียงลำดับหัวเรื่องภาษา 5720712 wikitext text/x-wiki == ภาษาฝรั่งเศส == === รากศัพท์ === {{clipping|fr|rhinocéros}} === การออกเสียง === * {{fr-IPA}} * {{audio|fr|LL-Q150 (fra)-Bananax47-rhino.wav|a=<<France>> (<<Agen>>)}} === คำนาม === {{fr-noun|m}} # {{lb|fr|informal}} [[แรด]] {{C|fr|แรด}} == ภาษาอังกฤษ == === การออกเสียง === * {{IPA|en|/ˈɹaɪ.nəʊ/|a=UK}} * {{enPR|rīʹnō|a=US}}, {{IPA|en|/ˈɹaɪ.noʊ/}} * {{audio|en|En-au-rhino.ogg|a=AU}} * {{rhymes|en|aɪnəʊ|s=2}} * {{homophones|en|RINO}} === รากศัพท์ 1 === {{unk|en}} ==== รูปแบบอื่น ==== * {{alt|en|rino}} ==== คำนาม ==== {{en-noun|-}} # {{lb|en|slang|now|rare}} [[เงิน]] {{defdate|from 17th c.}} #* {{quote-text|en|year=1792|author=w:Thomas Holcroft|title=Anne St. Ives|section=vol. III.52 |passage=When so be as a man has no money, why then, a savin and exceptin your onnur's reverence, a's but a poor dog. 
But when so be as a man as got the '''rhino''', why then a may begin to hold up his head.}} #* {{quote-text|en|year=1835|author=w:Frederick Marryat|title=The Pacha of Many Tales |passage=There I fell in with Betsy, and as she proved a regular out and outer, I spliced her, and a famous wedding we had of it, as long as the '''rhino''' lasted.}} #* {{RQ:Joyce Ulysses|chapter=Episode 12: The Cyclops|passage=—Here you are, says Alf, chucking out the '''rhino'''. Talking about hanging, I'll show you something you never saw}} === รากศัพท์ 2 === {{clipping|en|rhinoceros}} ==== คำนาม ==== {{en-noun}} # {{lb|en|colloquial}} [[แรด]] {{defdate|from 19th c.}} #* {{quote-book|en|year=1932|year_published=1965|author=w:Delos W. Lovelace|title=[[w:King Kong (1933 film)|King Kong]]|page=24|passage=‘We were getting a grand shot of a charging '''rhino''' when the cameraman got scared and bolted. The fathead!’}} #* {{quote-journal|en|year=1961|month=October|title=Talking of Trains: B.R. exile at work?|journal=Trains Illustrated|page=586|text=This cutting from an East African newspaper caught our eye last month: "The up mail train from Mombasa was held up for an hour at Kibwezi by an angry '''rhino''' on Monday night."}} ===== ลูกคำ ===== {{col2|en|black rhino|Indian rhino|Javan rhino|Sumatran rhino|white rhino|woolly rhino|Merck's rhino|narrow-nosed rhino|rhinolike|rhinoless|rhino beetle|rhino ferry}} ehgbnyi3w2j91nbrcyodp4zt3lchj6y ท่องเที่ยว 0 43721 5720720 1510353 2026-04-21T03:51:45Z Ai Ku Karng 17824 /* ภาษาไทย */ 5720720 wikitext text/x-wiki == ภาษาไทย == {{wp}} === รากศัพท์ === {{com|th|ท่อง|เที่ยว}}; ร่วมเชื้อสายกับ{{cog|lo|ທ່ອງທ່ຽວ}} === การออกเสียง === {{th-pron|ท็่อง-เที่ยว}} === คำกริยา === {{th-verb}} # เที่ยว[[ไป]] # {{lb|th|กฎ}} [[เดินทาง]][[จาก]][[ท้องที่]][[อัน]][[เป็น]][[ถิ่น]][[ที่อยู่]][[โดย]][[ปรกติ]][[ของ]][[ตน]]ไป[[ยัง]]ท้องที่[[อื่น]]เป็น[[การ]][[ชั่วคราว]][[ด้วย]][[ความ]][[สมัครใจ]] 
[[และ]]ด้วย[[วัตถุประสงค์]]อัน[[มิ]][[ใช่]][[เพื่อ]]ไป[[ประกอบ]][[อาชีพ]][[หรือ]][[หา]][[รายได้]] lva0rodmnzzi4ibbttycivya3bxyiux มอดูล:category tree/หัวข้อ/สถานที่ 828 44394 5720685 5688585 2026-04-21T01:19:42Z OctraBot 3198 5720685 Scribunto text/plain local labels = {} local handlers = {} local m_table = require("Module:table") local en_utilities_module = "Module:en-utilities" local string_utilities_module = "Module:string utilities" local m_locations = require("Module:place/locations") local m_placetypes = require("Module:place/placetypes") local placetype_data = m_placetypes.placetype_data local internal_error = m_locations.internal_error local dump = mw.dumpObject local insert = table.insert local concat = table.concat local is_callable = require("Module:fun").is_callable --[==[ intro: This module is part of the category tree code and contains code to generate the descriptions of place-related categories such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]], [[Category:pt:Municipalities of Tocantins, Brazil]], etc.). Note that this module doesn't actually create the categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This process should automatically happen periodically for non-empty categories, because they will appear in [[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.) There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table, keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list, which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that label on-the-fly. 
See [[Module:place]] for an introduction to the terminology associated with places and a list of all the relevant
modules, as well as more specific information on types of toponyms and placetypes and how their categorization works.
]==]

local function lcfirst(label)
	return mw.getContentLanguage():lcfirst(label)
end

local function gsub_literally(str, from, to)
	local m_strutils = require(string_utilities_module)
	return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end

-- Do not translate the class keys.
local class_to_bare_category_parent = {
	["polity"] = "องค์การทางการเมือง",
	["subpolity"] = "political divisions",
	["settlement"] = "การตั้งถิ่นฐาน",
	["non-admin settlement"] = "การตั้งถิ่นฐาน",
	["capital"] = "capital cities",
	["natural feature"] = "natural features",
	["man-made structure"] = "man-made structures",
	["geographic region"] = "geographic and cultural areas",
}

-- Do not translate the class keys.
local class_is_political_division = {
	["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
	["subpolity"] = true,
	["settlement"] = true,
	["non-admin settlement"] = false,
	["capital"] = true,
	["natural feature"] = false,
	["man-made structure"] = false,
	["geographic region"] = false,
	["generic place"] = false,
}

local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
	capital_cat_to_placetype[capital_cat] = placetype
end

-- Handler for bare categories for all types of capitals. This needs to precede the handler for bare placetype
-- categories as some of the types of capitals exist as placetypes as well.
insert(handlers, function(label) label = lcfirst(label) local capital_placetype = capital_cat_to_placetype[label] if capital_placetype then local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype) local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level") if linkdesc == nil then internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label) end if linkdesc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end return { type = "name", topic = label, description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".", parents = {"capital cities"}, } end end) -- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various -- so-called "generic" placetypes, but sometimes the categories were wrong. insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full") if ptdesc then local from_category_props = { from_category = true, no_split_qualifiers = true, } local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent") if bare_category_parent then return bare_category_parent end local class = m_placetypes.get_placetype_prop(pt, "class") if class then if class_to_bare_category_parent[class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", class, canon_label) end return class_to_bare_category_parent[class] end end, from_category_props) if not bare_category_parent then internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " .. 
"directly or through a fallback", canon_label) end local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents") end, from_category_props) local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb") end, from_category_props) if type(bare_category_parent) == "string" and bare_category_breadcrumb then bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb} end local parents = {bare_category_parent} if addl_bare_category_parents then m_table.extend(parents, addl_bare_category_parents) end return { type = "name", topic = canon_label, description = "{{{langname}}} " .. ptdesc .. ".", breadcrumb = bare_category_breadcrumb, parents = parents, } elseif ptdesc == false then mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label))) end end end) local function fetch_primary_placetype(key, spec) local placetype = spec.placetype if type(placetype) == "table" then placetype = placetype[1] end if not placetype then internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec) end return placetype end --[==[ Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "`` if appropriate. Specifically: Fetch the full and elliptical_placenames. If they are the same, just link to the placename directly. Otherwise, check if the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full placename. 
That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries, but if only one exists, we link to that one rather than have a red link. ]==] local function construct_linked_location(group, key, spec) local full_placename, elliptical_placename = m_locations.key_to_placename(group, key) local linked_placename if elliptical_placename ~= full_placename then local full_placename_title = mw.title.new(full_placename) if full_placename_title and full_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, full_placename) else local elliptical_placename_title = mw.title.new(elliptical_placename) if elliptical_placename_title and elliptical_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename) end end end return linked_placename or m_locations.construct_linked_placename(spec, full_placename) end --[==[ Construct the description of a location, including its container trail either to the end or until we encounter a `no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read `"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a [[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the [[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in [[Europe]]"`. 
]==] local function construct_location_desc(group, key, spec) local parts = {} local function ins(txt) insert(parts, txt) end ins(construct_linked_location(group, key, spec)) local iteration = 0 local need_closing_paren = false local containers = {{group = group, key = key, spec = spec}} local container_iterator = m_locations.iterate_containers(group, key, spec) while true do iteration = iteration + 1 local include_container_in_desc = false for _, container in ipairs(containers) do if not container.spec.no_include_container_in_desc then include_container_in_desc = true break end end if not include_container_in_desc then break end local next_containers = container_iterator() if not next_containers then break end local is_former = nil for _, container in ipairs(containers) do local this_is_former = container.spec.is_former_place if is_former == nil then is_former = this_is_former elseif is_former ~= this_is_former then internal_error("When processing container trail of key %s, found a mixture of former and non-former " .. 
"containers: %s", key, containers) end end if #containers > 1 then local placetypes = {} local prepositions = {} for _, container in ipairs(containers) do local container_type = fetch_primary_placetype(container.key, container.spec) m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type)) m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type)) end if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which are ") need_closing_paren = true else ins(", which are ") end if is_former then ins("former ") end ins(m_table.serialCommaJoin(placetypes)) ins(" ") ins(concat(prepositions, "/")) else if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which is ") need_closing_paren = true else ins(", which is ") end local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec) if is_former then ins("a former ") else ins(m_placetypes.get_placetype_article(container_type)) ins(" ") end ins(container_type) ins(" ") ins(m_placetypes.get_placetype_entry_preposition(container_type)) end ins(" ") first_container = false containers = next_containers local container_locations = {} for _, container in ipairs(containers) do insert(container_locations, construct_linked_location(container.group, container.key, container.spec)) end ins(m_table.serialCommaJoin(container_locations)) end if need_closing_paren then ins(")") end return concat(parts) end -- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified, -- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which -- mentions the placename corresponding to the key, its placetype and container, and repeats the description up -- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc` -- setting is found (which is set on all continents and continent-level regions). 
local function fetch_or_construct_location_desc(group, key, spec)
	local val = spec.keydesc
	if is_callable(val) then
		val = val(group, key, spec)
		spec.keydesc = val
	end
	val = val or "+++"
	if val:find("%+%+%+") then
		val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
	end
	return val
end

local function normalize_cat_as(cat_as, div)
	if type(cat_as) ~= "table" or cat_as.type then
		cat_as = {cat_as}
	end
	local ret_cat_as = {}
	for _, pt_cat_as in ipairs(cat_as) do
		if type(pt_cat_as) == "string" then
			pt_cat_as = {type = pt_cat_as}
		end
		insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
	end
	return ret_cat_as
end

-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
	if divs then
		if type(divs) ~= "table" then
			divs = {divs}
		end
		for _, div in ipairs(divs) do
			if type(div) == "string" then
				div = {type = div}
			end
			if div.type == pl_placetype then
				local cat_as = div.cat_as or div.type
				return normalize_cat_as(cat_as, div)
			end
		end
	end
	return nil
end

-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label) for _, canon_label in ipairs { label, lcfirst(label) } do local group, spec = m_locations.find_canonical_key(canon_label) if group then -- wp= defaults to true (Wikipedia article matches location's full placename) local wp = spec.wp if wp == nil then wp = true end -- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category -- generally follow) local wpcat = spec.wpcat if wpcat == nil then wpcat = wp end -- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally -- follows) local commonscat = spec.commonscat if commonscat == nil then commonscat = wpcat end local parents = {} local bare_label_parents = spec.overriding_bare_label_parents local container_iterator = m_locations.iterate_containers(group, canon_label, spec) local containers = container_iterator() if not bare_label_parents then bare_label_parents = {"+++"} end local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label) local full_container_placename if containers then full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key) end local inserted_containers = false for _, parent in ipairs(bare_label_parents) do if parent == "+++" then parent = "PL_PLACETYPEPREPCONTAINER" --th not use spaces end if parent:find("CONTAINER") then if not containers then internal_error("Parent category %s needs the container of %s but no containers specified: %s", parent, canon_label, spec) end local location_type = fetch_primary_placetype(canon_label, spec) local pl_location_type = m_placetypes.pluralize_placetype(location_type) for _, container in ipairs(containers) do local per_container_parent = parent local cat_as_list if per_container_parent:find("PL_PLACETYPE") then if spec.bare_category_parent_type then cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec) else cat_as_list = 
find_placetype_cat_as(container.spec.divs, pl_location_type) or find_placetype_cat_as(container.spec.addl_divs, pl_location_type) end end if not cat_as_list then local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category") if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then internal_error("Unable to locate plural location type %s among the divs or addl_divs " .. "for container key %s spec %s, and the location type is either not in placetype_data or " .. "not identified as a generic placetype", pl_location_type, container.key, container.spec) end cat_as_list = {{type = pl_location_type, prep = m_placetypes.get_placetype_entry_preposition(location_type)}} end local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec) per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key) for _, cat_as in ipairs(cat_as_list) do local per_container_per_placetype_parent = per_container_parent per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE", cat_as.type) per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP", cat_as.prep) m_table.insertIfNot(parents, per_container_per_placetype_parent) end end inserted_containers = true else m_table.insertIfNot(parents, parent) end end if not inserted_containers and containers then -- If we didn't insert the containers above in some form, insert them now as bare categories. Note that -- this may be different categories from the container categories inserted above. 
for _, container in ipairs(containers) do m_table.insertIfNot(parents, container.key) end end if spec.addl_parents then for _, parent in ipairs(spec.addl_parents) do m_table.insertIfNot(parents, parent) end end local function format_boxval(val, specname) if val == true then val = "%l" end if type(val) == "string" then val = gsub_literally(val, "%l", full_location_placename) val = gsub_literally(val, "%e", elliptical_location_placename) if val:find("%%c") then if not full_container_placename then internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " .. "containers: %s", specname, val, canon_label, spec) end val = gsub_literally(val, "%c", full_container_placename) end end return val end local description = spec.fulldesc or ( "{{{langname}}} terms related to the people, culture, or territory of " .. fetch_or_construct_location_desc(group, canon_label, spec) .. ".") local full_placename, _ = m_locations.key_to_placename(group, canon_label) return { type = "topic", description = description, breadcrumb = full_placename, parents = parents, wp = format_boxval(wp, "wp"), wpcat = format_boxval(wpcat, "wpcat"), commonscat = format_boxval(commonscat, "commonscat"), } end end end) local function find_canonical_key_from_place(place, canon_label) local has_the = false local key if place:find("^the ") then key = place:gsub("^the ", "") has_the = true else key = place end local group, spec = m_locations.find_canonical_key(key) if group then local requires_the = spec.the or false if has_the ~= requires_the then if has_the then mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format( canon_label)) else mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"): format(canon_label)) end return nil end return group, key, spec end return nil end -- Handler for generic placetypes (those whose categories are added through category generation handlers or through -- explicit category specs in 
the placetype data) for known locations in [[Module:place/locations]]. All such -- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or -- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are -- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations) -- "neighbourhoods of Hong Kong" or "places in Melbourne". insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th end if placetype then local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category") if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then -- Check whether the location uses British spelling, but also check all containers, because -- it's too hard to keep in sync the `british_spelling` setting for locations at all different -- levels (e.g. cities of various countries, first and second level administrative division, etc.), -- so we just set it at top level on the country. 
					local uses_british_spelling = spec.british_spelling
					if uses_british_spelling == nil then
						for containers in m_locations.iterate_containers(group, key, spec) do
							local must_outer_break = false
							for _, container in ipairs(containers) do
								if container.spec.british_spelling ~= nil then
									uses_british_spelling = container.spec.british_spelling
									must_outer_break = true
									break
								end
							end
							if must_outer_break then
								break
							end
						end
					end
					local allow_cat = true
					if placetype == "neighborhoods" and uses_british_spelling or
						placetype == "neighbourhoods" and not uses_british_spelling then
						mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
							placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
						allow_cat = false
					end
					if spec.is_former_place and placetype ~= "สถานที่" then
						allow_cat = false
					end
					local expected_prep
					if spec.is_city then
						expected_prep = ptdata.generic_before_cities
					else
						expected_prep = ptdata.generic_before_non_cities
					end
					if not expected_prep then
						allow_cat = false
					end
					if allow_cat then
						if expected_prep ~= in_of then
							mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
								canon_label, in_of, expected_prep))
							return nil
						end
						local linkdesc = m_placetypes.get_placetype_display_form(placetype,
							spec.is_city and "city" or "noncity", "return full")
						if linkdesc == false then
							mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
							return nil
						end
						if not linkdesc then
							internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
								placetype, key, spec, canon_label)
						end
						-- Declared local so the description does not leak into the global environment.
						local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
						desc = "{{{langname}}} " .. desc .. "."
local parents = {} insert(parents, key) if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = normalized_placetype, sort = key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype, function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, { from_category = true, no_split_qualifiers = true, }) if not category_class then internal_error("Saw placetype %s that is either unknown or has no `class` " .. "setting in `placetype_data`", normalized_placetype) end if class_is_political_division[category_class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", category_class, normalized_placetype) end if class_is_political_division[category_class] then insert(parents, "political divisions of specific countries") end end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do local container_prep if container.spec.is_city then container_prep = ptdata.generic_before_cities else container_prep = ptdata.generic_before_non_cities end if not container_prep then internal_error("For container key %s spec %s defines is_city = %s but " .. "there is no corresponding `generic_before_*` setting in the " .. "placedata for placetype %s", container.key, container.spec, container.spec.is_city, placetype) end insert(parents, { name = placetype .. container_prep .. 
											m_placetypes.get_prefixed_key(container.key, container.spec), --th
										sort = key
									})
								end
							else
								-- unrecognized countries or the like
								insert(parents, {name = normalized_placetype, sort = key})
							end
						end
						return {
							type = "name",
							topic = canon_label,
							description = desc,
							breadcrumb = placetype,
							parents = parents,
						}
					end
				end
			end
		end
	end
end)

-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
	label = lcfirst(label)
	local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
	-- Make sure we recognize the type of capital.
	if place and capital_cat_to_placetype[capital_cat] then
		local placetype = capital_cat_to_placetype[capital_cat]
		local pl_placetype = m_placetypes.pluralize_placetype(placetype)
		-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
		-- type of capital is among the list. Pass `label` for logging; `canon_label` is not defined in this handler.
		local group, key, spec = find_canonical_key_from_place(place, label)
		if group and (spec.divs or spec.addl_divs) then
			local saw_match = false
			local variant_matches = {}
			local divlists = {}
			if spec.divs then
				insert(divlists, spec.divs)
			end
			if spec.addl_divs then
				insert(divlists, spec.addl_divs)
			end
			for _, divlist in ipairs(divlists) do
				if type(divlist) ~= "table" then
					divlist = {divlist}
				end
				for _, div in ipairs(divlist) do
					if type(div) == "string" then
						div = {type = div}
					end
					-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
					-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'.
When encountering a -- political division like 'autonomous region' or 'union territory', chop off everything up -- through a space to make things match. To make this clearer, we record all such -- "variant match" cases, and down below we insert a note into the category text indicating that -- such "variant matches" are included among the category. if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then saw_match = true if pl_placetype ~= div.type then insert(variant_matches, div.type) end end end end if saw_match then -- Everything checks out, construct the category description. local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype, placetype.is_city and "city" or "noncity") if placetype_desc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end if not placetype_desc then internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " .. "was found as the placetype of capital placetype %s in label %s", pl_placetype, placetype, capital_cat, label) end local variant_match_text = "" if variant_matches[1] then local real_variant_match_descs = {} for i, variant_match in ipairs(variant_matches) do local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match, placetype.is_city and "city" or "noncity") if variant_match_desc == nil then internal_error("Unrecognized variant match plural placetype %s, coming from " .. "place key %s, data %s in label %s", variant_match, key, spec, label) end if variant_match_desc then -- skip those for which the description is `false`, like `ABBREVIATION_OF states` -- in the United States divs. insert(real_variant_match_descs, variant_match_desc) end end if real_variant_match_descs[1] then variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs) .. ")" end end local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. 
variant_match_text .. " of " .. fetch_or_construct_location_desc(group, key, spec) .. "." local full_placename, _ = m_locations.key_to_placename(group, key) local parents = {} if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = capital_cat, sort = key}) else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = capital_cat, sort = key}) end end insert(parents, key) return { type = "name", topic = label, description = desc, breadcrumb = full_placename, parents = parents, } end end end end) local overriding_category_descriptions = { ["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]", ["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]", ["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]", ["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s", } -- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.), -- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil", -- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of -- locations, which are handled by different handlers above. insert(handlers, function(label) -- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial -- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]]. 
for _, canon_label in ipairs { label, lcfirst(label) } do for _, minimal_placetype in ipairs { true, false } do local match_quantifier = minimal_placetype and "-" or "+" -- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy -- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`) -- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype -- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the -- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently -- only `abbreviations of states` occurs with a following location). local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$") if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. 
")(ใน)(.*)$") end if placetype then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then local function find_placetype(divs) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end if placetype == pt_cat_as.type then local div_parent = pt_cat_as.container_parent_type if div_parent == nil then -- allow false div_parent = div.container_parent_type end if div_parent == nil then div_parent = placetype end return div_parent, pt_cat_as.prep or div.prep or "ของ" end end end end return nil end local div_parent, div_prep = find_placetype(spec.divs) if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs) end if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization) end if div_parent ~= nil then if div_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, div_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = overriding_category_descriptions[canon_label] if not desc then desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th end desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if div_parent then -- div_parent may be `false` if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = placetype, sort = " " .. key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th insert(parents, "political divisions of specific countries") end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = placetype, sort = " " .. key}) end end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) labels["exonyms"] = { type = "name", -- special-cased description description = "{{{langname}}} [[exonym]]s.", parents = {"สถานที่"}, } labels["political divisions of specific countries"] = { type = "grouping", description = "{{{langname}}} categories for political divisions of specific countries.", parents = {"สถานที่"}, } -- Misc. FIXME: Remove the need for this. labels["nomes of Ancient Egypt"] = { type = "name", -- special-cased description description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].", breadcrumb = "nomes", parents = {"อียิปต์โบราณ"}, } -- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed. 
labels["มหาสมุทรแอตแลนติก"] = { type = "related-to", description = "default with the", parents = {"โลก"}, } labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"] labels["British Isles"] = { type = "related-to", description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands", parents = {"ยุโรป", "เกาะ"}, } labels["สหภาพยุโรป"] = { type = "related-to", description = "default with the", parents = {"ยุโรป"}, } labels["European Union"] = labels["สหภาพยุโรป"] labels["Gascony"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Indian subcontinent"] = { type = "related-to", description = "default with the", parents = {"เอเชียใต้"}, } labels["Bengal"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].", parents = {"Indian subcontinent"}, } labels["Kashmir"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].", parents = {"Indian subcontinent"}, } labels["Kashmir, India"] = { type = "related-to", description = "{{{langname}}} names of places in {{w|Kashmir, India}}.", parents = {"อินเดีย", "Kashmir"}, } labels["เกาหลี"] = { type = "related-to", description = "=the people, culture, or territory of [[Korea]]", parents = {"เอเชีย"}, } labels["Korea"] = labels["เกาหลี"] labels["Languedoc"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Lapland"] = { type = "related-to", description = "=[[Lapland]], a region in northernmost Europe", parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"}, } labels["ตะวันออกกลาง"] = { type = "related-to", description = "default with the", parents = {"แอฟริกา", "เอเชีย"}, } labels["Middle East"] = labels["ตะวันออกกลาง"] labels["Netherlands Antilles"] = { type = "related-to", description = "=the people, culture, or territory of the [[Netherlands Antilles]]", parents = 
{"เนเธอร์แลนด์", "อเมริกาเหนือ"}, } labels["Provence"] = { type = "related-to", description = "default", parents = {"Provence-Alpes-Côte d'Azur, France"}, } labels["เอเชียใต้"] = { type = "related-to", description = "default", parents = {"ยูเรเชีย", "เอเชีย"}, } labels["South Asia"] = labels["เอเชียใต้"] return {LABELS = labels, HANDLERS = handlers} 9fu8gl0wabsmxyswonlqjr6p0dt4ghr 5720687 5720685 2026-04-21T01:23:06Z OctraBot 3198 5720687 Scribunto text/plain local labels = {} local handlers = {} local m_table = require("Module:table") local en_utilities_module = "Module:en-utilities" local string_utilities_module = "Module:string utilities" local m_locations = require("Module:place/locations") local m_placetypes = require("Module:place/placetypes") local placetype_data = m_placetypes.placetype_data local internal_error = m_locations.internal_error local dump = mw.dumpObject local insert = table.insert local concat = table.concat local is_callable = require("Module:fun").is_callable --[==[ intro: This module is part of the category tree code and contains code to generate the descriptions of place-related categories such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]], [[Category:pt:Municipalities of Tocantins, Brazil]], etc. Note that this module doesn't actually create the categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This process should automatically happen periodically for non-empty categories, because they will appear in [[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.) 
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table, keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list, which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that label on the fly. See [[Module:place]] for an introduction to the terminology associated with places along with a list of all the relevant modules, along with for more specific information on types of toponyms and placetypes and how their categorization works. ]==] local function lcfirst(label) return mw.getContentLanguage():lcfirst(label) end local function gsub_literally(str, from, to) local m_strutils = require(string_utilities_module) return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to))) end -- do not translate class local class_to_bare_category_parent = { ["polity"] = "องค์การทางการเมือง", ["subpolity"] = "political divisions", ["settlement"] = "การตั้งถิ่นฐาน", ["non-admin settlement"] = "การตั้งถิ่นฐาน", ["capital"] = "เมืองหลวง", ["natural feature"] = "natural features", ["man-made structure"] = "man-made structures", ["geographic region"] = "geographic and cultural areas", } -- do not translate class local class_is_political_division = { ["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity ["subpolity"] = true, ["settlement"] = true, ["non-admin settlement"] = false, ["capital"] = true, ["natural feature"] = false, ["man-made structure"] = false, ["geographic region"] = false, ["generic place"] = false, } local capital_cat_to_placetype = {} for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do capital_cat_to_placetype[capital_cat] = placetype end -- Handler for bare categories for all types of capitals. 
This needs to precede the handler for bare placetype -- categories as some of the types of capitals exist as placetypes as well. insert(handlers, function(label) label = lcfirst(label) local capital_placetype = capital_cat_to_placetype[label] if capital_placetype then local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype) local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level") if linkdesc == nil then internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label) end if linkdesc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end return { type = "name", topic = label, description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".", parents = {"เมืองหลวง"}, } end end) -- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various -- so-called "generic" placetypes, but sometimes the categories were wrong. insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full") if ptdesc then local from_category_props = { from_category = true, no_split_qualifiers = true, } local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent") if bare_category_parent then return bare_category_parent end local class = m_placetypes.get_placetype_prop(pt, "class") if class then if class_to_bare_category_parent[class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", class, canon_label) end return class_to_bare_category_parent[class] end end, from_category_props) if not bare_category_parent then internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " .. 
"directly or through a fallback", canon_label) end local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents") end, from_category_props) local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb") end, from_category_props) if type(bare_category_parent) == "string" and bare_category_breadcrumb then bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb} end local parents = {bare_category_parent} if addl_bare_category_parents then m_table.extend(parents, addl_bare_category_parents) end return { type = "name", topic = canon_label, description = "{{{langname}}} " .. ptdesc .. ".", breadcrumb = bare_category_breadcrumb, parents = parents, } elseif ptdesc == false then mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label))) end end end) local function fetch_primary_placetype(key, spec) local placetype = spec.placetype if type(placetype) == "table" then placetype = placetype[1] end if not placetype then internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec) end return placetype end --[==[ Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "`` if appropriate. Specifically: Fetch the full and elliptical_placenames. If they are the same, just link to the placename directly. Otherwise, check if the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full placename. 
That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries, but if only one exists, we link to that one rather than have a red link. ]==] local function construct_linked_location(group, key, spec) local full_placename, elliptical_placename = m_locations.key_to_placename(group, key) local linked_placename if elliptical_placename ~= full_placename then local full_placename_title = mw.title.new(full_placename) if full_placename_title and full_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, full_placename) else local elliptical_placename_title = mw.title.new(elliptical_placename) if elliptical_placename_title and elliptical_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename) end end end return linked_placename or m_locations.construct_linked_placename(spec, full_placename) end --[==[ Construct the description of a location, including its container trail either to the end or until we encounter a `no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read `"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a [[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the [[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in [[Europe]]"`. 
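As a rough illustration of the format described above (not this module's actual code), the punctuation scheme can be sketched in standalone Lua. The function `describe_trail` and its `steps` table are hypothetical stand-ins for the real container iteration, each step is assumed to have exactly one container, and placetypes are left unlinked for brevity:

```lua
-- Hypothetical sketch: `steps[i]` holds the article, placetype and preposition
-- of the current place, plus the already-linked name of its container.
local function describe_trail(linked_name, steps)
	local parts = {linked_name}
	local need_closing_paren = false
	for i, step in ipairs(steps) do
		if i == 1 then
			-- The first container is attached with a plain comma.
			parts[#parts + 1] = ", "
		elseif i == 2 then
			-- The second container opens the parenthetical.
			parts[#parts + 1] = " (which is "
			need_closing_paren = true
		else
			-- Later containers continue inside the parenthetical.
			parts[#parts + 1] = ", which is "
		end
		parts[#parts + 1] = step.article .. " " .. step.placetype .. " " ..
			step.prep .. " " .. step.container
	end
	if need_closing_paren then
		parts[#parts + 1] = ")"
	end
	return table.concat(parts)
end

local desc = describe_trail("[[Birmingham]]", {
	{article = "a", placetype = "city", prep = "in", container = "the [[West Midlands]]"},
	{article = "a", placetype = "county", prep = "of", container = "[[England]]"},
	{article = "a", placetype = "constituent country", prep = "of", container = "the [[United Kingdom]]"},
	{article = "a", placetype = "country", prep = "in", container = "[[Europe]]"},
})
print(desc)
```

With a single step no parenthesis is opened, so the result is just the placename followed by a comma clause.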
]==] local function construct_location_desc(group, key, spec) local parts = {} local function ins(txt) insert(parts, txt) end ins(construct_linked_location(group, key, spec)) local iteration = 0 local need_closing_paren = false local containers = {{group = group, key = key, spec = spec}} local container_iterator = m_locations.iterate_containers(group, key, spec) while true do iteration = iteration + 1 local include_container_in_desc = false for _, container in ipairs(containers) do if not container.spec.no_include_container_in_desc then include_container_in_desc = true break end end if not include_container_in_desc then break end local next_containers = container_iterator() if not next_containers then break end local is_former = nil for _, container in ipairs(containers) do local this_is_former = container.spec.is_former_place if is_former == nil then is_former = this_is_former elseif is_former ~= this_is_former then internal_error("When processing container trail of key %s, found a mixture of former and non-former " .. 
"containers: %s", key, containers) end end if #containers > 1 then local placetypes = {} local prepositions = {} for _, container in ipairs(containers) do local container_type = fetch_primary_placetype(container.key, container.spec) m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type)) m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type)) end if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which are ") need_closing_paren = true else ins(", which are ") end if is_former then ins("former ") end ins(m_table.serialCommaJoin(placetypes)) ins(" ") ins(concat(prepositions, "/")) else if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which is ") need_closing_paren = true else ins(", which is ") end local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec) if is_former then ins("a former ") else ins(m_placetypes.get_placetype_article(container_type)) ins(" ") end ins(container_type) ins(" ") ins(m_placetypes.get_placetype_entry_preposition(container_type)) end ins(" ") first_container = false containers = next_containers local container_locations = {} for _, container in ipairs(containers) do insert(container_locations, construct_linked_location(container.group, container.key, container.spec)) end ins(m_table.serialCommaJoin(container_locations)) end if need_closing_paren then ins(")") end return concat(parts) end -- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified, -- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which -- mentions the placename corresponding to the key, its placetype and container, and repeats the description up -- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc` -- setting is found (which is set on all continents and continent-level regions). 
local function fetch_or_construct_location_desc(group, key, spec) local val = spec.keydesc if is_callable(val) then val = val(group, key, spec) spec.keydesc = val end val = val or "+++" if val:find("%+%+%+") then val = gsub_literally(val, "+++", construct_location_desc(group, key, spec)) end return val end local function normalize_cat_as(cat_as, div) if type(cat_as) ~= "table" or cat_as.type then cat_as = {cat_as} end local ret_cat_as = {} for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"}) end return ret_cat_as end -- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where -- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when -- categorizing and the preposition to follow. local function find_placetype_cat_as(divs, pl_placetype) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end if div.type == pl_placetype then local cat_as = div.cat_as or div.type return normalize_cat_as(cat_as, div) end end end return nil end -- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]]. 
insert(handlers, function(label) for _, canon_label in ipairs { label, lcfirst(label) } do local group, spec = m_locations.find_canonical_key(canon_label) if group then -- wp= defaults to true (Wikipedia article matches location's full placename) local wp = spec.wp if wp == nil then wp = true end -- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category -- generally follow) local wpcat = spec.wpcat if wpcat == nil then wpcat = wp end -- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally -- follows) local commonscat = spec.commonscat if commonscat == nil then commonscat = wpcat end local parents = {} local bare_label_parents = spec.overriding_bare_label_parents local container_iterator = m_locations.iterate_containers(group, canon_label, spec) local containers = container_iterator() if not bare_label_parents then bare_label_parents = {"+++"} end local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label) local full_container_placename if containers then full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key) end local inserted_containers = false for _, parent in ipairs(bare_label_parents) do if parent == "+++" then parent = "PL_PLACETYPEPREPCONTAINER" --th not use spaces end if parent:find("CONTAINER") then if not containers then internal_error("Parent category %s needs the container of %s but no containers specified: %s", parent, canon_label, spec) end local location_type = fetch_primary_placetype(canon_label, spec) local pl_location_type = m_placetypes.pluralize_placetype(location_type) for _, container in ipairs(containers) do local per_container_parent = parent local cat_as_list if per_container_parent:find("PL_PLACETYPE") then if spec.bare_category_parent_type then cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec) else cat_as_list = 
find_placetype_cat_as(container.spec.divs, pl_location_type) or find_placetype_cat_as(container.spec.addl_divs, pl_location_type) end end if not cat_as_list then local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category") if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then internal_error("Unable to locate plural location type %s among the divs or addl_divs " .. "for container key %s spec %s, and the location type is either not in placetype_data or " .. "not identified as a generic placetype", pl_location_type, container.key, container.spec) end cat_as_list = {{type = pl_location_type, prep = m_placetypes.get_placetype_entry_preposition(location_type)}} end local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec) per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key) for _, cat_as in ipairs(cat_as_list) do local per_container_per_placetype_parent = per_container_parent per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE", cat_as.type) per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP", cat_as.prep) m_table.insertIfNot(parents, per_container_per_placetype_parent) end end inserted_containers = true else m_table.insertIfNot(parents, parent) end end if not inserted_containers and containers then -- If we didn't insert the containers above in some form, insert them now as bare categories. Note that -- this may be different categories from the container categories inserted above. 
for _, container in ipairs(containers) do m_table.insertIfNot(parents, container.key) end end if spec.addl_parents then for _, parent in ipairs(spec.addl_parents) do m_table.insertIfNot(parents, parent) end end local function format_boxval(val, specname) if val == true then val = "%l" end if type(val) == "string" then val = gsub_literally(val, "%l", full_location_placename) val = gsub_literally(val, "%e", elliptical_location_placename) if val:find("%%c") then if not full_container_placename then internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " .. "containers: %s", specname, val, canon_label, spec) end val = gsub_literally(val, "%c", full_container_placename) end end return val end local description = spec.fulldesc or ( "{{{langname}}} terms related to the people, culture, or territory of " .. fetch_or_construct_location_desc(group, canon_label, spec) .. ".") local full_placename, _ = m_locations.key_to_placename(group, canon_label) return { type = "topic", description = description, breadcrumb = full_placename, parents = parents, wp = format_boxval(wp, "wp"), wpcat = format_boxval(wpcat, "wpcat"), commonscat = format_boxval(commonscat, "commonscat"), } end end end) local function find_canonical_key_from_place(place, canon_label) local has_the = false local key if place:find("^the ") then key = place:gsub("^the ", "") has_the = true else key = place end local group, spec = m_locations.find_canonical_key(key) if group then local requires_the = spec.the or false if has_the ~= requires_the then if has_the then mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format( canon_label)) else mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"): format(canon_label)) end return nil end return group, key, spec end return nil end -- Handler for generic placetypes (those whose categories are added through category generation handlers or through -- explicit category specs in 
the placetype data) for known locations in [[Module:place/locations]]. All such -- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or -- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are -- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations) -- "neighbourhoods of Hong Kong" or "places in Melbourne". insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th end if placetype then local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category") if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then -- Check whether the location uses British spelling, but also check all containers, because -- it's too hard to keep in sync the `british_spelling` setting for locations at all different -- levels (e.g. cities of various countries, first and second level administrative division, etc.), -- so we just set it at top level on the country. 
local uses_british_spelling = spec.british_spelling if uses_british_spelling == nil then for containers in m_locations.iterate_containers(group, key, spec) do local must_outer_break = false for _, container in ipairs(containers) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end local allow_cat = true if placetype == "neighborhoods" and uses_british_spelling or placetype == "neighbourhoods" and not uses_british_spelling then mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format( placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods")) allow_cat = false end if spec.is_former_place and placetype ~= "สถานที่" then allow_cat = false end local expected_prep if spec.is_city then expected_prep = ptdata.generic_before_cities else expected_prep = ptdata.generic_before_non_cities end if not expected_prep then allow_cat = false end if allow_cat then if expected_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, expected_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec) desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = normalized_placetype, sort = key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype, function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, { from_category = true, no_split_qualifiers = true, }) if not category_class then internal_error("Saw placetype %s that is either unknown or has no `class` " .. "setting in `placetype_data`", normalized_placetype) end if class_is_political_division[category_class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", category_class, normalized_placetype) end if class_is_political_division[category_class] then insert(parents, "political divisions of specific countries") end end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do local container_prep if container.spec.is_city then container_prep = ptdata.generic_before_cities else container_prep = ptdata.generic_before_non_cities end if not container_prep then internal_error("For container key %s spec %s defines is_city = %s but " .. "there is no corresponding `generic_before_*` setting in the " .. "placedata for placetype %s", container.key, container.spec, container.spec.is_city, placetype) end insert(parents, { name = placetype .. container_prep .. 
m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = normalized_placetype, sort = key}) end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) -- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next -- handler for specific political and misc (non-political) divisions of polities and subpolities, such as -- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so -- will trigger an error if that handler runs before this one. insert(handlers, function(label) label = lcfirst(label) local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$") -- Make sure we recognize the type of capital. if place and capital_cat_to_placetype[capital_cat] then local placetype = capital_cat_to_placetype[capital_cat] local pl_placetype = m_placetypes.pluralize_placetype(placetype) -- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the -- type of capital is among the list. local group, key, spec = find_canonical_key_from_place(place, label) if group and (spec.divs or spec.addl_divs) then local saw_match = false local variant_matches = {} local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end -- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region' -- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. 
When encountering a -- political division like 'autonomous region' or 'union territory', chop off everything up -- through a space to make things match. To make this clearer, we record all such -- "variant match" cases, and down below we insert a note into the category text indicating that -- such "variant matches" are included among the category. if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then saw_match = true if pl_placetype ~= div.type then insert(variant_matches, div.type) end end end end if saw_match then -- Everything checks out, construct the category description. local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype, placetype.is_city and "city" or "noncity") if placetype_desc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end if not placetype_desc then internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " .. "was found as the placetype of capital placetype %s in label %s", pl_placetype, placetype, capital_cat, label) end local variant_match_text = "" if variant_matches[1] then local real_variant_match_descs = {} for i, variant_match in ipairs(variant_matches) do local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match, placetype.is_city and "city" or "noncity") if variant_match_desc == nil then internal_error("Unrecognized variant match plural placetype %s, coming from " .. "place key %s, data %s in label %s", variant_match, key, spec, label) end if variant_match_desc then -- skip those for which the description is `false`, like `ABBREVIATION_OF states` -- in the United States divs. insert(real_variant_match_descs, variant_match_desc) end end if real_variant_match_descs[1] then variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs) .. ")" end end local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. 
variant_match_text .. " of " .. fetch_or_construct_location_desc(group, key, spec) .. "." local full_placename, _ = m_locations.key_to_placename(group, key) local parents = {} if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = capital_cat, sort = key}) else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = capital_cat, sort = key}) end end insert(parents, key) return { type = "name", topic = label, description = desc, breadcrumb = full_placename, parents = parents, } end end end end) local overriding_category_descriptions = { ["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]", ["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]", ["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]", ["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s", } -- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.), -- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil", -- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of -- locations, which are handled by different handlers above. insert(handlers, function(label) -- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial -- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]]. 
for _, canon_label in ipairs { label, lcfirst(label) } do for _, minimal_placetype in ipairs { true, false } do local match_quantifier = minimal_placetype and "-" or "+" -- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy -- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`) -- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype -- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the -- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently -- only `abbreviations of states` occurs with a following location). local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$") if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. 
")(ใน)(.*)$") end if placetype then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then local function find_placetype(divs) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end if placetype == pt_cat_as.type then local div_parent = pt_cat_as.container_parent_type if div_parent == nil then -- allow false div_parent = div.container_parent_type end if div_parent == nil then div_parent = placetype end return div_parent, pt_cat_as.prep or div.prep or "ของ" end end end end return nil end local div_parent, div_prep = find_placetype(spec.divs) if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs) end if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization) end if div_parent ~= nil then if div_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, div_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = overriding_category_descriptions[canon_label] if not desc then desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th end desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if div_parent then -- div_parent may be `false` if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = placetype, sort = " " .. key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th insert(parents, "political divisions of specific countries") end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = placetype, sort = " " .. key}) end end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) labels["exonyms"] = { type = "name", -- special-cased description description = "{{{langname}}} [[exonym]]s.", parents = {"สถานที่"}, } labels["political divisions of specific countries"] = { type = "grouping", description = "{{{langname}}} categories for political divisions of specific countries.", parents = {"สถานที่"}, } -- Misc. FIXME: Remove the need for this. labels["nomes of Ancient Egypt"] = { type = "name", -- special-cased description description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].", breadcrumb = "nomes", parents = {"อียิปต์โบราณ"}, } -- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed. 
labels["มหาสมุทรแอตแลนติก"] = { type = "related-to", description = "default with the", parents = {"โลก"}, } labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"] labels["British Isles"] = { type = "related-to", description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands", parents = {"ยุโรป", "เกาะ"}, } labels["สหภาพยุโรป"] = { type = "related-to", description = "default with the", parents = {"ยุโรป"}, } labels["European Union"] = labels["สหภาพยุโรป"] labels["Gascony"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Indian subcontinent"] = { type = "related-to", description = "default with the", parents = {"เอเชียใต้"}, } labels["Bengal"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].", parents = {"Indian subcontinent"}, } labels["Kashmir"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].", parents = {"Indian subcontinent"}, } labels["Kashmir, India"] = { type = "related-to", description = "{{{langname}}} names of places in {{w|Kashmir, India}}.", parents = {"อินเดีย", "Kashmir"}, } labels["เกาหลี"] = { type = "related-to", description = "=the people, culture, or territory of [[Korea]]", parents = {"เอเชีย"}, } labels["Korea"] = labels["เกาหลี"] labels["Languedoc"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Lapland"] = { type = "related-to", description = "=[[Lapland]], a region in northernmost Europe", parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"}, } labels["ตะวันออกกลาง"] = { type = "related-to", description = "default with the", parents = {"แอฟริกา", "เอเชีย"}, } labels["Middle East"] = labels["ตะวันออกกลาง"] labels["Netherlands Antilles"] = { type = "related-to", description = "=the people, culture, or territory of the [[Netherlands Antilles]]", parents = 
{"เนเธอร์แลนด์", "อเมริกาเหนือ"}, } labels["Provence"] = { type = "related-to", description = "default", parents = {"Provence-Alpes-Côte d'Azur, France"}, } labels["เอเชียใต้"] = { type = "related-to", description = "default", parents = {"ยูเรเชีย", "เอเชีย"}, } labels["South Asia"] = labels["เอเชียใต้"] return {LABELS = labels, HANDLERS = handlers} e5bhvs95rhqhsqvpd8l0ytl1hzaz8zt 5720702 5720687 2026-04-21T01:53:18Z OctraBot 3198 5720702 Scribunto text/plain local labels = {} local handlers = {} local m_table = require("Module:table") local en_utilities_module = "Module:en-utilities" local string_utilities_module = "Module:string utilities" local m_locations = require("Module:place/locations") local m_placetypes = require("Module:place/placetypes") local placetype_data = m_placetypes.placetype_data local internal_error = m_locations.internal_error local dump = mw.dumpObject local insert = table.insert local concat = table.concat local is_callable = require("Module:fun").is_callable --[==[ intro: This module is part of the category tree code and contains code to generate the descriptions of place-related categories such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]], [[Category:pt:Municipalities of Tocantins, Brazil]], etc. Note that this module doesn't actually create the categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This process should automatically happen periodically for non-empty categories, because they will appear in [[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.) 
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table, keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list, which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that label on the fly. See [[Module:place]] for an introduction to the terminology associated with places, a list of all the relevant modules, and more specific information on types of toponyms and placetypes and how their categorization works. ]==] local function lcfirst(label) return mw.getContentLanguage():lcfirst(label) end local function gsub_literally(str, from, to) local m_strutils = require(string_utilities_module) return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to))) end -- Do not translate class local class_to_bare_category_parent = { ["polity"] = "องค์การทางการเมือง", ["subpolity"] = "political divisions", ["settlement"] = "การตั้งถิ่นฐาน", ["non-admin settlement"] = "การตั้งถิ่นฐาน", ["capital"] = "เมืองหลวง", ["natural feature"] = "natural features", ["man-made structure"] = "man-made structures", ["geographic region"] = "geographic and cultural areas", } -- Do not translate class local class_is_political_division = { ["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity ["subpolity"] = true, ["settlement"] = true, ["non-admin settlement"] = false, ["capital"] = true, ["natural feature"] = false, ["man-made structure"] = false, ["geographic region"] = false, ["generic place"] = false, } local capital_cat_to_placetype = {} for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do capital_cat_to_placetype[capital_cat] = placetype end -- Handler for bare categories for all types of capitals. 
This needs to precede the handler for bare placetype -- categories as some of the types of capitals exist as placetypes as well. insert(handlers, function(label) label = lcfirst(label) local capital_placetype = capital_cat_to_placetype[label] if capital_placetype then local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype) local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level") if linkdesc == nil then internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label) end if linkdesc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end return { type = "name", topic = label, description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".", parents = {"เมืองหลวง"}, } end end) -- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various -- so-called "generic" placetypes, but sometimes the categories were wrong. insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full") if ptdesc then local from_category_props = { from_category = true, no_split_qualifiers = true, } local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent") if bare_category_parent then return bare_category_parent end local class = m_placetypes.get_placetype_prop(pt, "class") if class then if class_to_bare_category_parent[class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", class, canon_label) end return class_to_bare_category_parent[class] end end, from_category_props) if not bare_category_parent then internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " .. 
"directly or through a fallback", canon_label) end local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents") end, from_category_props) local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb") end, from_category_props) if type(bare_category_parent) == "string" and bare_category_breadcrumb then bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb} end local parents = {bare_category_parent} if addl_bare_category_parents then m_table.extend(parents, addl_bare_category_parents) end return { type = "name", topic = canon_label, description = "{{{langname}}} " .. ptdesc .. ".", breadcrumb = bare_category_breadcrumb, parents = parents, } elseif ptdesc == false then mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label))) end end end) local function fetch_primary_placetype(key, spec) local placetype = spec.placetype if type(placetype) == "table" then placetype = placetype[1] end if not placetype then internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec) end return placetype end --[==[ Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "` if appropriate. Specifically: Fetch the full and elliptical placenames. If they are the same, just link to the placename directly. Otherwise, check if the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full placename. 
That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries, but if only one exists, we link to that one rather than have a red link. ]==] local function construct_linked_location(group, key, spec) local full_placename, elliptical_placename = m_locations.key_to_placename(group, key) local linked_placename if elliptical_placename ~= full_placename then local full_placename_title = mw.title.new(full_placename) if full_placename_title and full_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, full_placename) else local elliptical_placename_title = mw.title.new(elliptical_placename) if elliptical_placename_title and elliptical_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename) end end end return linked_placename or m_locations.construct_linked_placename(spec, full_placename) end --[==[ Construct the description of a location, including its container trail either to the end or until we encounter a `no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read `"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a [[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the [[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in [[Europe]]"`. 
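As a rough illustration of the description format above, the following standalone sketch assembles the same kind of sentence from a hand-written container trail. The `describe` function and the `trail` table with its `name`, `article`, `placetype` and `prep` fields are hypothetical and exist only for this example; the real function reads this data from [[Module:place/locations]] and [[Module:place/placetypes]], adds links, and also handles former places and multiple containers per level.

```lua
-- Simplified sketch, NOT the module's implementation: every field below is
-- assumed, hand-written data rather than a spec taken from the location tables.
local function describe(trail)
	local parts = {trail[1].name}
	local need_closing_paren = false
	for i = 1, #trail - 1 do
		local here, container = trail[i], trail[i + 1]
		if i == 1 then
			-- The location itself is followed by a plain comma.
			parts[#parts + 1] = ", "
		elseif i == 2 then
			-- The second level opens the parenthesis.
			parts[#parts + 1] = " (which is "
			need_closing_paren = true
		else
			parts[#parts + 1] = ", which is "
		end
		-- "a city in the West Midlands", "a county of England", ...
		parts[#parts + 1] = here.article .. " " .. here.placetype .. " " ..
			here.prep .. " " .. container.name
	end
	if need_closing_paren then
		parts[#parts + 1] = ")"
	end
	return table.concat(parts)
end

local trail = {
	{name = "Birmingham", article = "a", placetype = "city", prep = "in"},
	{name = "the West Midlands", article = "a", placetype = "county", prep = "of"},
	{name = "England", article = "a", placetype = "constituent country", prep = "of"},
	{name = "the United Kingdom", article = "a", placetype = "country", prep = "in"},
	{name = "Europe"}, -- last container: only its name is used
}

print(describe(trail))
```

Calling `describe(trail)` on the data above yields the Birmingham description quoted earlier, without the links.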
]==] local function construct_location_desc(group, key, spec) local parts = {} local function ins(txt) insert(parts, txt) end ins(construct_linked_location(group, key, spec)) local iteration = 0 local need_closing_paren = false local containers = {{group = group, key = key, spec = spec}} local container_iterator = m_locations.iterate_containers(group, key, spec) while true do iteration = iteration + 1 local include_container_in_desc = false for _, container in ipairs(containers) do if not container.spec.no_include_container_in_desc then include_container_in_desc = true break end end if not include_container_in_desc then break end local next_containers = container_iterator() if not next_containers then break end local is_former = nil for _, container in ipairs(containers) do local this_is_former = container.spec.is_former_place if is_former == nil then is_former = this_is_former elseif is_former ~= this_is_former then internal_error("When processing container trail of key %s, found a mixture of former and non-former " .. 
"containers: %s", key, containers) end end if #containers > 1 then local placetypes = {} local prepositions = {} for _, container in ipairs(containers) do local container_type = fetch_primary_placetype(container.key, container.spec) m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type)) m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type)) end if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which are ") need_closing_paren = true else ins(", which are ") end if is_former then ins("former ") end ins(m_table.serialCommaJoin(placetypes)) ins(" ") ins(concat(prepositions, "/")) else if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which is ") need_closing_paren = true else ins(", which is ") end local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec) if is_former then ins("a former ") else ins(m_placetypes.get_placetype_article(container_type)) ins(" ") end ins(container_type) ins(" ") ins(m_placetypes.get_placetype_entry_preposition(container_type)) end ins(" ") first_container = false containers = next_containers local container_locations = {} for _, container in ipairs(containers) do insert(container_locations, construct_linked_location(container.group, container.key, container.spec)) end ins(m_table.serialCommaJoin(container_locations)) end if need_closing_paren then ins(")") end return concat(parts) end -- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified, -- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which -- mentions the placename corresponding to the key, its placetype and container, and repeats the description up -- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc` -- setting is found (which is set on all continents and continent-level regions). 
local function fetch_or_construct_location_desc(group, key, spec) local val = spec.keydesc if is_callable(val) then val = val(group, key, spec) spec.keydesc = val end val = val or "+++" if val:find("%+%+%+") then val = gsub_literally(val, "+++", construct_location_desc(group, key, spec)) end return val end local function normalize_cat_as(cat_as, div) if type(cat_as) ~= "table" or cat_as.type then cat_as = {cat_as} end local ret_cat_as = {} for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"}) end return ret_cat_as end -- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where -- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when -- categorizing and the preposition to follow. local function find_placetype_cat_as(divs, pl_placetype) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end if div.type == pl_placetype then local cat_as = div.cat_as or div.type return normalize_cat_as(cat_as, div) end end end return nil end -- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]]. 
insert(handlers, function(label) for _, canon_label in ipairs { label, lcfirst(label) } do local group, spec = m_locations.find_canonical_key(canon_label) if group then -- wp= defaults to true (Wikipedia article matches location's full placename) local wp = spec.wp if wp == nil then wp = true end -- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category -- generally follow) local wpcat = spec.wpcat if wpcat == nil then wpcat = wp end -- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally -- follows) local commonscat = spec.commonscat if commonscat == nil then commonscat = wpcat end local parents = {} local bare_label_parents = spec.overriding_bare_label_parents local container_iterator = m_locations.iterate_containers(group, canon_label, spec) local containers = container_iterator() if not bare_label_parents then bare_label_parents = {"+++"} end local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label) local full_container_placename if containers then full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key) end local inserted_containers = false for _, parent in ipairs(bare_label_parents) do if parent == "+++" then parent = "PL_PLACETYPEPREPCONTAINER" --th not use spaces end if parent:find("CONTAINER") then if not containers then internal_error("Parent category %s needs the container of %s but no containers specified: %s", parent, canon_label, spec) end local location_type = fetch_primary_placetype(canon_label, spec) local pl_location_type = m_placetypes.pluralize_placetype(location_type) for _, container in ipairs(containers) do local per_container_parent = parent local cat_as_list if per_container_parent:find("PL_PLACETYPE") then if spec.bare_category_parent_type then cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec) else cat_as_list = 
find_placetype_cat_as(container.spec.divs, pl_location_type) or find_placetype_cat_as(container.spec.addl_divs, pl_location_type) end end if not cat_as_list then local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category") if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then internal_error("Unable to locate plural location type %s among the divs or addl_divs " .. "for container key %s spec %s, and the location type is either not in placetype_data or " .. "not identified as a generic placetype", pl_location_type, container.key, container.spec) end cat_as_list = {{type = pl_location_type, prep = m_placetypes.get_placetype_entry_preposition(location_type)}} end local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec) per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key) for _, cat_as in ipairs(cat_as_list) do local per_container_per_placetype_parent = per_container_parent per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE", cat_as.type) per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP", cat_as.prep) m_table.insertIfNot(parents, per_container_per_placetype_parent) end end inserted_containers = true else m_table.insertIfNot(parents, parent) end end if not inserted_containers and containers then -- If we didn't insert the containers above in some form, insert them now as bare categories. Note that -- this may be different categories from the container categories inserted above. 
for _, container in ipairs(containers) do m_table.insertIfNot(parents, container.key) end end if spec.addl_parents then for _, parent in ipairs(spec.addl_parents) do m_table.insertIfNot(parents, parent) end end local function format_boxval(val, specname) if val == true then val = "%l" end if type(val) == "string" then val = gsub_literally(val, "%l", full_location_placename) val = gsub_literally(val, "%e", elliptical_location_placename) if val:find("%%c") then if not full_container_placename then internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " .. "containers: %s", specname, val, canon_label, spec) end val = gsub_literally(val, "%c", full_container_placename) end end return val end local description = spec.fulldesc or ( "{{{langname}}} terms related to the people, culture, or territory of " .. fetch_or_construct_location_desc(group, canon_label, spec) .. ".") local full_placename, _ = m_locations.key_to_placename(group, canon_label) return { type = "topic", description = description, breadcrumb = full_placename, parents = parents, wp = format_boxval(wp, "wp"), wpcat = format_boxval(wpcat, "wpcat"), commonscat = format_boxval(commonscat, "commonscat"), } end end end) local function find_canonical_key_from_place(place, canon_label) local has_the = false local key if place:find("^the ") then key = place:gsub("^the ", "") has_the = true else key = place end local group, spec = m_locations.find_canonical_key(key) if group then local requires_the = spec.the or false if has_the ~= requires_the then if has_the then mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format( canon_label)) else mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"): format(canon_label)) end return nil end return group, key, spec end return nil end -- Handler for generic placetypes (those whose categories are added through category generation handlers or through -- explicit category specs in 
the placetype data) for known locations in [[Module:place/locations]]. All such -- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or -- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are -- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations) -- "neighbourhoods of Hong Kong" or "places in Melbourne". insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th end if placetype then local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category") if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then -- Check whether the location uses British spelling, but also check all containers, because -- it's too hard to keep in sync the `british_spelling` setting for locations at all different -- levels (e.g. cities of various countries, first and second level administrative division, etc.), -- so we just set it at top level on the country. 
					local uses_british_spelling = spec.british_spelling
					if uses_british_spelling == nil then
						for containers in m_locations.iterate_containers(group, key, spec) do
							local must_outer_break = false
							for _, container in ipairs(containers) do
								if container.spec.british_spelling ~= nil then
									uses_british_spelling = container.spec.british_spelling
									must_outer_break = true
									break
								end
							end
							if must_outer_break then
								break
							end
						end
					end
					local allow_cat = true
					if placetype == "neighborhoods" and uses_british_spelling or
						placetype == "neighbourhoods" and not uses_british_spelling then
						mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format(
							placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods"))
						allow_cat = false
					end
					if spec.is_former_place and placetype ~= "สถานที่" then
						allow_cat = false
					end
					local expected_prep
					if spec.is_city then
						expected_prep = ptdata.generic_before_cities
					else
						expected_prep = ptdata.generic_before_non_cities
					end
					if not expected_prep then
						allow_cat = false
					end
					if allow_cat then
						if expected_prep ~= in_of then
							mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format(
								canon_label, in_of, expected_prep))
							return nil
						end
						local linkdesc = m_placetypes.get_placetype_display_form(placetype,
							spec.is_city and "city" or "noncity", "return full")
						if linkdesc == false then
							mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype)))
							return nil
						end
						if not linkdesc then
							internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s",
								placetype, key, spec, canon_label)
						end
						local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec)
						desc = "{{{langname}}} " .. desc .. "."
local parents = {} insert(parents, key) if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = normalized_placetype, sort = key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype, function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, { from_category = true, no_split_qualifiers = true, }) if not category_class then internal_error("Saw placetype %s that is either unknown or has no `class` " .. "setting in `placetype_data`", normalized_placetype) end if class_is_political_division[category_class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", category_class, normalized_placetype) end if class_is_political_division[category_class] then insert(parents, "political divisions of specific countries") end end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do local container_prep if container.spec.is_city then container_prep = ptdata.generic_before_cities else container_prep = ptdata.generic_before_non_cities end if not container_prep then internal_error("For container key %s spec %s defines is_city = %s but " .. "there is no corresponding `generic_before_*` setting in the " .. "placedata for placetype %s", container.key, container.spec, container.spec.is_city, placetype) end insert(parents, { name = placetype .. container_prep .. 
										m_placetypes.get_prefixed_key(container.key, container.spec), --th
									sort = key
								})
							end
						else
							-- unrecognized countries or the like
							insert(parents, {name = normalized_placetype, sort = key})
						end
					end

					return {
						type = "name",
						topic = canon_label,
						description = desc,
						breadcrumb = placetype,
						parents = parents,
					}
				end
			end
		end
	end
end
end)

-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next
-- handler for specific political and misc (non-political) divisions of polities and subpolities, such as
-- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so
-- will trigger an error if that handler runs before this one.
insert(handlers, function(label)
	label = lcfirst(label)
	--local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
	local capital_cat, place = mw.ustring.match(label, "^(เมืองหลวงของ[a-zก-๛%- ]-)ของ(.*)$")
	-- Make sure we recognize the type of capital.
	if place and capital_cat_to_placetype[capital_cat] then
		local placetype = capital_cat_to_placetype[capital_cat]
		local pl_placetype = m_placetypes.pluralize_placetype(placetype)
		-- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the
		-- type of capital is among the list.
		local group, key, spec = find_canonical_key_from_place(place, label)
		if group and (spec.divs or spec.addl_divs) then
			local saw_match = false
			local variant_matches = {}
			local divlists = {}
			if spec.divs then
				insert(divlists, spec.divs)
			end
			if spec.addl_divs then
				insert(divlists, spec.addl_divs)
			end
			for _, divlist in ipairs(divlists) do
				if type(divlist) ~= "table" then
					divlist = {divlist}
				end
				for _, div in ipairs(divlist) do
					if type(div) == "string" then
						div = {type = div}
					end
					-- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region'
					-- -> 'regional capitals' and 'union territory' -> 'territorial capitals'.
When encountering a -- political division like 'autonomous region' or 'union territory', chop off everything up -- through a space to make things match. To make this clearer, we record all such -- "variant match" cases, and down below we insert a note into the category text indicating that -- such "variant matches" are included among the category. if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then saw_match = true if pl_placetype ~= div.type then insert(variant_matches, div.type) end end end end if saw_match then -- Everything checks out, construct the category description. local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype, placetype.is_city and "city" or "noncity") if placetype_desc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end if not placetype_desc then internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " .. "was found as the placetype of capital placetype %s in label %s", pl_placetype, placetype, capital_cat, label) end local variant_match_text = "" if variant_matches[1] then local real_variant_match_descs = {} for i, variant_match in ipairs(variant_matches) do local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match, placetype.is_city and "city" or "noncity") if variant_match_desc == nil then internal_error("Unrecognized variant match plural placetype %s, coming from " .. "place key %s, data %s in label %s", variant_match, key, spec, label) end if variant_match_desc then -- skip those for which the description is `false`, like `ABBREVIATION_OF states` -- in the United States divs. insert(real_variant_match_descs, variant_match_desc) end end if real_variant_match_descs[1] then variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs) .. ")" end end local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. 
variant_match_text .. " of " .. fetch_or_construct_location_desc(group, key, spec) .. "." local full_placename, _ = m_locations.key_to_placename(group, key) local parents = {} if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = capital_cat, sort = key}) else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = capital_cat, sort = key}) end end insert(parents, key) return { type = "name", topic = label, description = desc, breadcrumb = full_placename, parents = parents, } end end end end) local overriding_category_descriptions = { ["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]", ["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]", ["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]", ["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s", } -- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.), -- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil", -- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of -- locations, which are handled by different handlers above. insert(handlers, function(label) -- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial -- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]]. 
for _, canon_label in ipairs { label, lcfirst(label) } do for _, minimal_placetype in ipairs { true, false } do local match_quantifier = minimal_placetype and "-" or "+" -- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy -- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`) -- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype -- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the -- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently -- only `abbreviations of states` occurs with a following location). local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$") if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. 
")(ใน)(.*)$") end if placetype then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then local function find_placetype(divs) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end if placetype == pt_cat_as.type then local div_parent = pt_cat_as.container_parent_type if div_parent == nil then -- allow false div_parent = div.container_parent_type end if div_parent == nil then div_parent = placetype end return div_parent, pt_cat_as.prep or div.prep or "ของ" end end end end return nil end local div_parent, div_prep = find_placetype(spec.divs) if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs) end if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization) end if div_parent ~= nil then if div_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, div_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = overriding_category_descriptions[canon_label] if not desc then desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th end desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if div_parent then -- div_parent may be `false` if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = placetype, sort = " " .. key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th insert(parents, "political divisions of specific countries") end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = placetype, sort = " " .. key}) end end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) labels["exonyms"] = { type = "name", -- special-cased description description = "{{{langname}}} [[exonym]]s.", parents = {"สถานที่"}, } labels["political divisions of specific countries"] = { type = "grouping", description = "{{{langname}}} categories for political divisions of specific countries.", parents = {"สถานที่"}, } -- Misc. FIXME: Remove the need for this. labels["nomes of Ancient Egypt"] = { type = "name", -- special-cased description description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].", breadcrumb = "nomes", parents = {"อียิปต์โบราณ"}, } -- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed. 
labels["มหาสมุทรแอตแลนติก"] = { type = "related-to", description = "default with the", parents = {"โลก"}, } labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"] labels["British Isles"] = { type = "related-to", description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands", parents = {"ยุโรป", "เกาะ"}, } labels["สหภาพยุโรป"] = { type = "related-to", description = "default with the", parents = {"ยุโรป"}, } labels["European Union"] = labels["สหภาพยุโรป"] labels["Gascony"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Indian subcontinent"] = { type = "related-to", description = "default with the", parents = {"เอเชียใต้"}, } labels["Bengal"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].", parents = {"Indian subcontinent"}, } labels["Kashmir"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].", parents = {"Indian subcontinent"}, } labels["Kashmir, India"] = { type = "related-to", description = "{{{langname}}} names of places in {{w|Kashmir, India}}.", parents = {"อินเดีย", "Kashmir"}, } labels["เกาหลี"] = { type = "related-to", description = "=the people, culture, or territory of [[Korea]]", parents = {"เอเชีย"}, } labels["Korea"] = labels["เกาหลี"] labels["Languedoc"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Lapland"] = { type = "related-to", description = "=[[Lapland]], a region in northernmost Europe", parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"}, } labels["ตะวันออกกลาง"] = { type = "related-to", description = "default with the", parents = {"แอฟริกา", "เอเชีย"}, } labels["Middle East"] = labels["ตะวันออกกลาง"] labels["Netherlands Antilles"] = { type = "related-to", description = "=the people, culture, or territory of the [[Netherlands Antilles]]", parents = 
{"เนเธอร์แลนด์", "อเมริกาเหนือ"}, } labels["Provence"] = { type = "related-to", description = "default", parents = {"Provence-Alpes-Côte d'Azur, France"}, } labels["เอเชียใต้"] = { type = "related-to", description = "default", parents = {"ยูเรเชีย", "เอเชีย"}, } labels["South Asia"] = labels["เอเชียใต้"] return {LABELS = labels, HANDLERS = handlers} 2linlys6mypmnvk6ldw4bjnug2h7ne0 5720703 5720702 2026-04-21T01:54:09Z OctraBot 3198 5720703 Scribunto text/plain local labels = {} local handlers = {} local m_table = require("Module:table") local en_utilities_module = "Module:en-utilities" local string_utilities_module = "Module:string utilities" local m_locations = require("Module:place/locations") local m_placetypes = require("Module:place/placetypes") local placetype_data = m_placetypes.placetype_data local internal_error = m_locations.internal_error local dump = mw.dumpObject local insert = table.insert local concat = table.concat local is_callable = require("Module:fun").is_callable --[==[ intro: This module is part of the category tree code and contains code to generate the descriptions of place-related categories such as [[Category:de:Hokkaido Prefecture, Japan]], [[Category:es:Cities in France]], [[Category:pt:Municipalities of Tocantins, Brazil]], etc.). Note that this module doesn't actually create the categories; that must be done separately, with the text "{{tl|auto cat}}" as the definition of the category. (This process should automatically happen periodically for non-empty categories, because they will appear in [[Special:WantedCategories]] and a bot will periodically examine that list and create any needed category.) 
There are two ways that category descriptions are specified: (1) by manually adding an entry to the `labels` table,
keyed by the label (the category minus the language code) with a value consisting of a Lua table specifying the
description text and the category's parents; (2) through handlers (pieces of Lua code) added to the `handlers` list,
which recognize labels of a specific type (e.g. `Cities in France`) and generate the appropriate specification for that
label on the fly.

See [[Module:place]] for an introduction to the terminology associated with places and a list of all the relevant
modules, as well as more specific information on types of toponyms and placetypes and how their categorization works.
]==]

local function lcfirst(label)
	return mw.getContentLanguage():lcfirst(label)
end

local function gsub_literally(str, from, to)
	local m_strutils = require(string_utilities_module)
	return (str:gsub(m_strutils.pattern_escape(from), m_strutils.replacement_escape(to)))
end

-- Do not translate the `class` keys.
local class_to_bare_category_parent = {
	["polity"] = "องค์การทางการเมือง",
	["subpolity"] = "political divisions",
	["settlement"] = "การตั้งถิ่นฐาน",
	["non-admin settlement"] = "การตั้งถิ่นฐาน",
	["capital"] = "เมืองหลวง",
	["natural feature"] = "natural features",
	["man-made structure"] = "man-made structures",
	["geographic region"] = "geographic and cultural areas",
}

-- Do not translate the `class` keys.
local class_is_political_division = {
	["polity"] = true, -- strictly false but there are placetypes ambiguous between polity and subpolity
	["subpolity"] = true,
	["settlement"] = true,
	["non-admin settlement"] = false,
	["capital"] = true,
	["natural feature"] = false,
	["man-made structure"] = false,
	["geographic region"] = false,
	["generic place"] = false,
}

local capital_cat_to_placetype = {}
for placetype, capital_cat in pairs(m_placetypes.placetype_to_capital_cat) do
	capital_cat_to_placetype[capital_cat] = placetype
end

-- Handler for bare categories for all types of capitals.
This needs to precede the handler for bare placetype -- categories as some of the types of capitals exist as placetypes as well. insert(handlers, function(label) label = lcfirst(label) local capital_placetype = capital_cat_to_placetype[label] if capital_placetype then local pl_placetype = m_placetypes.pluralize_placetype(capital_placetype) local linkdesc = m_placetypes.get_placetype_display_form(pl_placetype, "top-level") if linkdesc == nil then internal_error("Unrecognized placetype %s when processing label %s", capital_placetype, label) end if linkdesc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end return { type = "name", topic = label, description = "{{{langname}}} names of [[capital]]s of " .. linkdesc .. ".", parents = {"เมืองหลวง"}, } end end) -- Handler for bare placetype categories. FIXME: Add wpcat= and commonscat= info. Previously we had it for various -- so-called "generic" placetypes, but sometimes the categories were wrong. insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local ptdesc, ptdata = m_placetypes.get_placetype_display_form(canon_label, "top-level", "return full") if ptdesc then local from_category_props = { from_category = true, no_split_qualifiers = true, } local bare_category_parent = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) local bare_category_parent = m_placetypes.get_placetype_prop(pt, "bare_category_parent") if bare_category_parent then return bare_category_parent end local class = m_placetypes.get_placetype_prop(pt, "class") if class then if class_to_bare_category_parent[class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", class, canon_label) end return class_to_bare_category_parent[class] end end, from_category_props) if not bare_category_parent then internal_error("Saw placetype %s without a `class` or `bare_category_parent` setting, either " .. 
"directly or through a fallback", canon_label) end local addl_bare_category_parents = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "addl_bare_category_parents") end, from_category_props) local bare_category_breadcrumb = m_placetypes.get_equiv_placetype_prop(canon_label, function(pt) return m_placetypes.get_placetype_prop(pt, "bare_category_breadcrumb") end, from_category_props) if type(bare_category_parent) == "string" and bare_category_breadcrumb then bare_category_parent = {name = bare_category_parent, sort = bare_category_breadcrumb} end local parents = {bare_category_parent} if addl_bare_category_parents then m_table.extend(parents, addl_bare_category_parents) end return { type = "name", topic = canon_label, description = "{{{langname}}} " .. ptdesc .. ".", breadcrumb = bare_category_breadcrumb, parents = parents, } elseif ptdesc == false then mw.log(("Display form for canon_label %s is false, can't categorize"):format(dump(canon_label))) end end end) local function fetch_primary_placetype(key, spec) local placetype = spec.placetype if type(placetype) == "table" then placetype = placetype[1] end if not placetype then internal_error("No placetype specified or defaulted for key %s, spec %s", key, spec) end return placetype end --[==[ Construct an appropriately linked location based on the full or elliptical placename, preceded by `"the "`` if appropriate. Specifically: Fetch the full and elliptical_placenames. If they are the same, just link to the placename directly. Otherwise, check if the full placename exists; if so link to it. Otherwise, if the elliptical placename exists, link to it but display it as the full placename. Finally, if neither full placename nor elliptical placename exists, fall back to linking to the full placename. 
That way, we prefer full placenames to elliptical placenames if both or neither exist as Wiktionary entries, but if only one exists, we link to that one rather than have a red link. ]==] local function construct_linked_location(group, key, spec) local full_placename, elliptical_placename = m_locations.key_to_placename(group, key) local linked_placename if elliptical_placename ~= full_placename then local full_placename_title = mw.title.new(full_placename) if full_placename_title and full_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, full_placename) else local elliptical_placename_title = mw.title.new(elliptical_placename) if elliptical_placename_title and elliptical_placename_title.exists then linked_placename = m_locations.construct_linked_placename(spec, elliptical_placename, full_placename) end end end return linked_placename or m_locations.construct_linked_placename(spec, full_placename) end --[==[ Construct the description of a location, including its container trail either to the end or until we encounter a `no_include_container_in_desc` setting. For example, for the city of [[Birmingham]], the description will read `"[[Birmingham]], a [[city]] in the [[West Midlands]] (which is a [[county]] of [[England]], which is a [[constituent country]] of the [[United Kingdom]], which is a [[country]] in [[Europe]])"`. FIXME: Possibly we should adopt the way city descriptions used to read, which was similar to `"the city of [[Birmingham]], in the county of the [[West Midlands]], in the [[constituent country]] of [[England]], in the [[country]] of the [[United Kingdom]], in [[Europe]]"`. 
]==] local function construct_location_desc(group, key, spec) local parts = {} local function ins(txt) insert(parts, txt) end ins(construct_linked_location(group, key, spec)) local iteration = 0 local need_closing_paren = false local containers = {{group = group, key = key, spec = spec}} local container_iterator = m_locations.iterate_containers(group, key, spec) while true do iteration = iteration + 1 local include_container_in_desc = false for _, container in ipairs(containers) do if not container.spec.no_include_container_in_desc then include_container_in_desc = true break end end if not include_container_in_desc then break end local next_containers = container_iterator() if not next_containers then break end local is_former = nil for _, container in ipairs(containers) do local this_is_former = container.spec.is_former_place if is_former == nil then is_former = this_is_former elseif is_former ~= this_is_former then internal_error("When processing container trail of key %s, found a mixture of former and non-former " .. 
"containers: %s", key, containers) end end if #containers > 1 then local placetypes = {} local prepositions = {} for _, container in ipairs(containers) do local container_type = fetch_primary_placetype(container.key, container.spec) m_table.insertIfNot(placetypes, m_placetypes.pluralize_placetype(container_type)) m_table.insertIfNot(prepositions, m_placetypes.get_placetype_entry_preposition(container_type)) end if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which are ") need_closing_paren = true else ins(", which are ") end if is_former then ins("former ") end ins(m_table.serialCommaJoin(placetypes)) ins(" ") ins(concat(prepositions, "/")) else if iteration == 1 then ins(", ") elseif iteration == 2 then ins(" (which is ") need_closing_paren = true else ins(", which is ") end local container_type = fetch_primary_placetype(containers[1].key, containers[1].spec) if is_former then ins("a former ") else ins(m_placetypes.get_placetype_article(container_type)) ins(" ") end ins(container_type) ins(" ") ins(m_placetypes.get_placetype_entry_preposition(container_type)) end ins(" ") first_container = false containers = next_containers local container_locations = {} for _, container in ipairs(containers) do insert(container_locations, construct_linked_location(container.group, container.key, container.spec)) end ins(m_table.serialCommaJoin(container_locations)) end if need_closing_paren then ins(")") end return concat(parts) end -- Fetch or construct the description of the location specified by `key`. If the `keydesc` property is specified, -- use it directly but substitute any occurrence of `+++` with the auto-constructed location description, which -- mentions the placename corresponding to the key, its placetype and container, and repeats the description up -- the container trail until either there are no more containers or (more usually) the `no_include_container_in_desc` -- setting is found (which is set on all continents and continent-level regions). 
local function fetch_or_construct_location_desc(group, key, spec)
	local val = spec.keydesc
	if is_callable(val) then
		val = val(group, key, spec)
		spec.keydesc = val
	end
	val = val or "+++"
	if val:find("%+%+%+") then
		val = gsub_literally(val, "+++", construct_location_desc(group, key, spec))
	end
	return val
end

local function normalize_cat_as(cat_as, div)
	if type(cat_as) ~= "table" or cat_as.type then
		cat_as = {cat_as}
	end
	local ret_cat_as = {}
	for _, pt_cat_as in ipairs(cat_as) do
		if type(pt_cat_as) == "string" then
			pt_cat_as = {type = pt_cat_as}
		end
		insert(ret_cat_as, {type = pt_cat_as.type, prep = pt_cat_as.prep or div.prep or "ของ"})
	end
	return ret_cat_as
end

-- Find the specified plural placetype among the divs for a given known location. Return a list of cat_as specs, where
-- each spec is of the form {type = "PLURAL_PLACETYPE", prep = "PREP"} indicating the plural placetype to use when
-- categorizing and the preposition to follow.
local function find_placetype_cat_as(divs, pl_placetype)
	if divs then
		if type(divs) ~= "table" then
			divs = {divs}
		end
		for _, div in ipairs(divs) do
			if type(div) == "string" then
				div = {type = div}
			end
			if div.type == pl_placetype then
				local cat_as = div.cat_as or div.type
				return normalize_cat_as(cat_as, div)
			end
		end
	end
	return nil
end

-- Handler for bare placename categories for known locations in `locations` in [[Module:place/locations]].
insert(handlers, function(label) for _, canon_label in ipairs { label, lcfirst(label) } do local group, spec = m_locations.find_canonical_key(canon_label) if group then -- wp= defaults to true (Wikipedia article matches location's full placename) local wp = spec.wp if wp == nil then wp = true end -- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category -- generally follow) local wpcat = spec.wpcat if wpcat == nil then wpcat = wp end -- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally -- follows) local commonscat = spec.commonscat if commonscat == nil then commonscat = wpcat end local parents = {} local bare_label_parents = spec.overriding_bare_label_parents local container_iterator = m_locations.iterate_containers(group, canon_label, spec) local containers = container_iterator() if not bare_label_parents then bare_label_parents = {"+++"} end local full_location_placename, elliptical_location_placename = m_locations.key_to_placename(group, canon_label) local full_container_placename if containers then full_container_placename, _ = m_locations.key_to_placename(containers[1].group, containers[1].key) end local inserted_containers = false for _, parent in ipairs(bare_label_parents) do if parent == "+++" then parent = "PL_PLACETYPEPREPCONTAINER" --th not use spaces end if parent:find("CONTAINER") then if not containers then internal_error("Parent category %s needs the container of %s but no containers specified: %s", parent, canon_label, spec) end local location_type = fetch_primary_placetype(canon_label, spec) local pl_location_type = m_placetypes.pluralize_placetype(location_type) for _, container in ipairs(containers) do local per_container_parent = parent local cat_as_list if per_container_parent:find("PL_PLACETYPE") then if spec.bare_category_parent_type then cat_as_list = normalize_cat_as(spec.bare_category_parent_type, spec) else cat_as_list = 
find_placetype_cat_as(container.spec.divs, pl_location_type) or find_placetype_cat_as(container.spec.addl_divs, pl_location_type) end end if not cat_as_list then local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(location_type, "from category") if not canon_placetype or not (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then internal_error("Unable to locate plural location type %s among the divs or addl_divs " .. "for container key %s spec %s, and the location type is either not in placetype_data or " .. "not identified as a generic placetype", pl_location_type, container.key, container.spec) end cat_as_list = {{type = pl_location_type, prep = m_placetypes.get_placetype_entry_preposition(location_type)}} end local prefixed_key = m_placetypes.get_prefixed_key(container.key, container.spec) per_container_parent = gsub_literally(per_container_parent, "CONTAINER", prefixed_key) for _, cat_as in ipairs(cat_as_list) do local per_container_per_placetype_parent = per_container_parent per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PL_PLACETYPE", cat_as.type) per_container_per_placetype_parent = gsub_literally(per_container_per_placetype_parent, "PREP", cat_as.prep) m_table.insertIfNot(parents, per_container_per_placetype_parent) end end inserted_containers = true else m_table.insertIfNot(parents, parent) end end if not inserted_containers and containers then -- If we didn't insert the containers above in some form, insert them now as bare categories. Note that -- this may be different categories from the container categories inserted above. 
for _, container in ipairs(containers) do m_table.insertIfNot(parents, container.key) end end if spec.addl_parents then for _, parent in ipairs(spec.addl_parents) do m_table.insertIfNot(parents, parent) end end local function format_boxval(val, specname) if val == true then val = "%l" end if type(val) == "string" then val = gsub_literally(val, "%l", full_location_placename) val = gsub_literally(val, "%e", elliptical_location_placename) if val:find("%%c") then if not full_container_placename then internal_error("Wikipedia/Commons spec %s = %s has %%c in it but key %s has no " .. "containers: %s", specname, val, canon_label, spec) end val = gsub_literally(val, "%c", full_container_placename) end end return val end local description = spec.fulldesc or ( "{{{langname}}} terms related to the people, culture, or territory of " .. fetch_or_construct_location_desc(group, canon_label, spec) .. ".") local full_placename, _ = m_locations.key_to_placename(group, canon_label) return { type = "topic", description = description, breadcrumb = full_placename, parents = parents, wp = format_boxval(wp, "wp"), wpcat = format_boxval(wpcat, "wpcat"), commonscat = format_boxval(commonscat, "commonscat"), } end end end) local function find_canonical_key_from_place(place, canon_label) local has_the = false local key if place:find("^the ") then key = place:gsub("^the ", "") has_the = true else key = place end local group, spec = m_locations.find_canonical_key(key) if group then local requires_the = spec.the or false if has_the ~= requires_the then if has_the then mw.log(("Mismatch in category name '%s', has 'the' in the category when it should not"):format( canon_label)) else mw.log(("Mismatch in category name '%s', should have 'the' in the category but does not"): format(canon_label)) end return nil end return group, key, spec end return nil end -- Handler for generic placetypes (those whose categories are added through category generation handlers or through -- explicit category specs in 
the placetype data) for known locations in [[Module:place/locations]]. All such -- placetypes have either a `generic_before_non_cities` setting (meaning they can occur before non-city locations) or -- `generic_before_cities` setting (meaning they can occur before cities), or both. Examples of such categories are -- "cities in the Bahamas" or "rivers in Western Australia, Australia", or (for city locations) -- "neighbourhoods of Hong Kong" or "places in Melbourne". insert(handlers, function(label) for _, canon_label in ipairs { lcfirst(label), label } do local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ใน)(.*)$") --th if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]-)(ของ)(.*)$") --th end if placetype then local normalized_placetype = placetype == "neighbourhoods" and "neighborhoods" or placetype local canon_placetype, ptdata, ptmatch = m_placetypes.get_placetype_data(normalized_placetype, "from category") if canon_placetype and (ptdata.generic_before_non_cities or ptdata.generic_before_cities) then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then -- Check whether the location uses British spelling, but also check all containers, because -- it's too hard to keep in sync the `british_spelling` setting for locations at all different -- levels (e.g. cities of various countries, first and second level administrative division, etc.), -- so we just set it at top level on the country. 
local uses_british_spelling = spec.british_spelling if uses_british_spelling == nil then for containers in m_locations.iterate_containers(group, key, spec) do local must_outer_break = false for _, container in ipairs(containers) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end local allow_cat = true if placetype == "neighborhoods" and uses_british_spelling or placetype == "neighbourhoods" and not uses_british_spelling then mw.log(("Mismatch in spelling of placetype '%s' in category '%s', should be '%s'"):format( placetype, canon_label, uses_british_spelling and "neighbourhoods" or "neighborhoods")) allow_cat = false end if spec.is_former_place and placetype ~= "สถานที่" then allow_cat = false end local expected_prep if spec.is_city then expected_prep = ptdata.generic_before_cities else expected_prep = ptdata.generic_before_non_cities end if not expected_prep then allow_cat = false end if allow_cat then if expected_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, expected_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = linkdesc .. " " .. in_of .. " " .. fetch_or_construct_location_desc(group, key, spec) desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = normalized_placetype, sort = key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then local category_class = m_placetypes.get_equiv_placetype_prop(normalized_placetype, function(pt) return m_placetypes.get_placetype_prop(pt, "class") end, { from_category = true, no_split_qualifiers = true, }) if not category_class then internal_error("Saw placetype %s that is either unknown or has no `class` " .. "setting in `placetype_data`", normalized_placetype) end if class_is_political_division[category_class] == nil then internal_error("Saw unknown category class %s derived from placetype %s", category_class, normalized_placetype) end if class_is_political_division[category_class] then insert(parents, "political divisions of specific countries") end end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do local container_prep if container.spec.is_city then container_prep = ptdata.generic_before_cities else container_prep = ptdata.generic_before_non_cities end if not container_prep then internal_error("For container key %s spec %s defines is_city = %s but " .. "there is no corresponding `generic_before_*` setting in the " .. "placedata for placetype %s", container.key, container.spec, container.spec.is_city, placetype) end insert(parents, { name = placetype .. container_prep .. 
m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = normalized_placetype, sort = key}) end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) -- Handler for "state capitals of the United States", "provincial capitals of Canada", etc. This must precede the next -- handler for specific political and misc (non-political) divisions of polities and subpolities, such as -- "provinces of the Philippines", because "departmental capitals" is listed in cat_as for French prefectures and so -- will trigger an error if that handler runs before this one. insert(handlers, function(label) label = lcfirst(label) --local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$") local capital_cat, place = mw.ustring.match(label, "^(เมืองหลวงของ[a-zก-๛%- ]-)ของ(.*)$") -- Make sure we recognize the type of capital. if place and capital_cat_to_placetype[capital_cat] then local placetype = capital_cat_to_placetype[capital_cat] local pl_placetype = m_placetypes.pluralize_placetype(placetype) -- Locate the container, fetch its known political divisions, and make sure the placetype corresponding to the -- type of capital is among the list. local group, key, spec = find_canonical_key_from_place(place, label) if group and (spec.divs or spec.addl_divs) then local saw_match = false local variant_matches = {} local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end -- HACK. Currently if we don't find a match for the placetype, we map e.g. 'autonomous region' -- -> 'regional capitals' and 'union territory' -> 'territorial capitals'. 
When encountering a -- political division like 'autonomous region' or 'union territory', chop off everything up -- through a space to make things match. To make this clearer, we record all such -- "variant match" cases, and down below we insert a note into the category text indicating that -- such "variant matches" are included among the category. if pl_placetype == div.type or pl_placetype == div.type:gsub("^.* ", "") then saw_match = true if pl_placetype ~= div.type then insert(variant_matches, div.type) end end end end if saw_match then -- Everything checks out, construct the category description. local placetype_desc = m_placetypes.get_placetype_display_form(pl_placetype, placetype.is_city and "city" or "noncity") if placetype_desc == false then mw.log(("Display form for pl_placetype %s is false, can't categorize"):format(dump(pl_placetype))) return nil end if not placetype_desc then internal_error("Unrecognized plural placetype %s, generated as the plural of %s, which " .. "was found as the placetype of capital placetype %s in label %s", pl_placetype, placetype, capital_cat, label) end local variant_match_text = "" if variant_matches[1] then local real_variant_match_descs = {} for i, variant_match in ipairs(variant_matches) do local variant_match_desc = m_placetypes.get_placetype_display_form(variant_match, placetype.is_city and "city" or "noncity") if variant_match_desc == nil then internal_error("Unrecognized variant match plural placetype %s, coming from " .. "place key %s, data %s in label %s", variant_match, key, spec, label) end if variant_match_desc then -- skip those for which the description is `false`, like `ABBREVIATION_OF states` -- in the United States divs. insert(real_variant_match_descs, variant_match_desc) end end if real_variant_match_descs[1] then variant_match_text = " (including " .. m_table.serialCommaJoin(real_variant_match_descs) .. ")" end end local desc = "{{{langname}}} names of [[capital]]s of " .. placetype_desc .. 
variant_match_text .. " of " .. fetch_or_construct_location_desc(group, key, spec) .. "." local full_placename, _ = m_locations.key_to_placename(group, key) local parents = {} if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = capital_cat, sort = key}) else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = capital_cat .. "ของ" .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = capital_cat, sort = key}) end end insert(parents, key) return { type = "name", topic = label, description = desc, breadcrumb = full_placename, parents = parents, } end end end end) local overriding_category_descriptions = { ["autonomous cities of Spain"] = "the [[w:Autonomous communities of Spain#Autonomous_cities|autonomous cities of Spain]]", ["regions of Greece"] = "the regions ([[periphery|peripheries]]) of [[Greece]]", ["regions of North Macedonia"] = "the regions ([[periphery|peripheries]]) of [[North Macedonia]]", ["subprefectures of Japan"] = "[[subprefecture]]s of [[Japan]]ese [[prefecture]]s", } -- Handler for specific political and misc (non-political) divisions of locations (polities, subpolities, cities, etc.), -- such as "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil", -- "boroughs of New York City", etc. This does not handle categories for generic placetypes (cities, rivers, etc.) of -- locations, which are handled by different handlers above. insert(handlers, function(label) -- The label comes with an initial capitalization but we have to check both lowercase-initial and capital-initial -- versions of the placetype to handle e.g. [[:Category:en:Indian reserves of Canada]]. 
for _, canon_label in ipairs { label, lcfirst(label) } do for _, minimal_placetype in ipairs { true, false } do local match_quantifier = minimal_placetype and "-" or "+" -- Some categories have two "of"s in them, and depending on the category, it's correct to do either a greedy -- ([[:Category:en:Abbreviations of states of the United States]], with placetype `abbreviations of states`) -- or non-greedy ([[:Category:en:Provinces of the Democratic Republic of the Congo]], with placetype -- `provinces`) match. We can't know in advance which is correct so we try both possibilities, doing the -- non-greedy one first as it seems more common (there are many locations with "of" in them, but currently -- only `abbreviations of states` occurs with a following location). local placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. ")(ของ)(.*)$") if not placetype then placetype, in_of, place = mw.ustring.match(canon_label, "^([A-Za-zก-๛%- ]" .. match_quantifier .. 
")(ใน)(.*)$") end if placetype then local group, key, spec = find_canonical_key_from_place(place, canon_label) if group then local function find_placetype(divs) if divs then if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) == "string" then div = {type = div} end local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end for _, pt_cat_as in ipairs(cat_as) do if type(pt_cat_as) == "string" then pt_cat_as = {type = pt_cat_as} end if placetype == pt_cat_as.type then local div_parent = pt_cat_as.container_parent_type if div_parent == nil then -- allow false div_parent = div.container_parent_type end if div_parent == nil then div_parent = placetype end return div_parent, pt_cat_as.prep or div.prep or "ของ" end end end end return nil end local div_parent, div_prep = find_placetype(spec.divs) if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs) end if div_parent == nil then -- allow false div_parent, div_prep = find_placetype(spec.addl_divs_for_categorization) end if div_parent ~= nil then if div_prep ~= in_of then mw.log(("Mismatch in category name '%s', has '%s' when it should have '%s'"):format( canon_label, in_of, div_prep)) return nil end local linkdesc = m_placetypes.get_placetype_display_form(placetype, spec.is_city and "city" or "noncity", "return full") if linkdesc == false then mw.log(("Display form for placetype %s is false, can't categorize"):format(dump(placetype))) return nil end if not linkdesc then internal_error("Unrecognized placetype %s when processing key %s, data %s, label %s", placetype, key, spec, canon_label) end local desc = overriding_category_descriptions[canon_label] if not desc then desc = linkdesc .. in_of .. fetch_or_construct_location_desc(group, key, spec) --th end desc = "{{{langname}}} " .. desc .. "." 
local parents = {} insert(parents, key) if div_parent then -- div_parent may be `false` if spec.no_container_parent then -- top-level country, constituent country, continent or the like insert(parents, {name = placetype, sort = " " .. key}) if spec.placetype == "ประเทศ" or m_table.contains(spec.placetype, "ประเทศ") then --th insert(parents, "political divisions of specific countries") end else local container_iterator = m_locations.iterate_containers(group, key, spec) local next_containers = container_iterator() if next_containers then for _, container in ipairs(next_containers) do insert(parents, { name = div_parent .. in_of .. m_placetypes.get_prefixed_key(container.key, container.spec), --th sort = key }) end else -- unrecognized countries or the like insert(parents, {name = placetype, sort = " " .. key}) end end end return { type = "name", topic = canon_label, description = desc, breadcrumb = placetype, parents = parents, } end end end end end end) labels["exonyms"] = { type = "name", -- special-cased description description = "{{{langname}}} [[exonym]]s.", parents = {"สถานที่"}, } labels["political divisions of specific countries"] = { type = "grouping", description = "{{{langname}}} categories for political divisions of specific countries.", parents = {"สถานที่"}, } -- Misc. FIXME: Remove the need for this. labels["nomes of Ancient Egypt"] = { type = "name", -- special-cased description description = "{{{langname}}} names of the [[nome]]s of [[Ancient Egypt]].", breadcrumb = "nomes", parents = {"อียิปต์โบราณ"}, } -- FIXME: Everything here has been moved from [[Module:category tree/topic/Earth]]. Most should be removed. 
labels["มหาสมุทรแอตแลนติก"] = { type = "related-to", description = "default with the", parents = {"โลก"}, } labels["Atlantic Ocean"] = labels["มหาสมุทรแอตแลนติก"] labels["British Isles"] = { type = "related-to", description = "=the people, culture, or territory of [[Great Britain]], [[Ireland]], and other nearby islands", parents = {"ยุโรป", "เกาะ"}, } labels["สหภาพยุโรป"] = { type = "related-to", description = "default with the", parents = {"ยุโรป"}, } labels["European Union"] = labels["สหภาพยุโรป"] labels["Gascony"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Indian subcontinent"] = { type = "related-to", description = "default with the", parents = {"เอเชียใต้"}, } labels["Bengal"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Bengal]].", parents = {"Indian subcontinent"}, } labels["Kashmir"] = { type = "related-to", description = "{{{langname}}} terms related to the people, culture, or territory of [[Kashmir]].", parents = {"Indian subcontinent"}, } labels["Kashmir, India"] = { type = "related-to", description = "{{{langname}}} names of places in {{w|Kashmir, India}}.", parents = {"อินเดีย", "Kashmir"}, } labels["เกาหลี"] = { type = "related-to", description = "=the people, culture, or territory of [[Korea]]", parents = {"เอเชีย"}, } labels["Korea"] = labels["เกาหลี"] labels["Languedoc"] = { type = "related-to", description = "default", parents = {"Occitania, France"}, } labels["Lapland"] = { type = "related-to", description = "=[[Lapland]], a region in northernmost Europe", parents = {"ยุโรป", "ฟินแลนด์", "นอร์เวย์", "รัสเซีย", "สวีเดน"}, } labels["ตะวันออกกลาง"] = { type = "related-to", description = "default with the", parents = {"แอฟริกา", "เอเชีย"}, } labels["Middle East"] = labels["ตะวันออกกลาง"] labels["Netherlands Antilles"] = { type = "related-to", description = "=the people, culture, or territory of the [[Netherlands Antilles]]", parents = 
{"เนเธอร์แลนด์", "อเมริกาเหนือ"}, } labels["Provence"] = { type = "related-to", description = "default", parents = {"Provence-Alpes-Côte d'Azur, France"}, } labels["เอเชียใต้"] = { type = "related-to", description = "default", parents = {"ยูเรเชีย", "เอเชีย"}, } labels["South Asia"] = labels["เอเชียใต้"] return {LABELS = labels, HANDLERS = handlers} 8mj1wiag254u54rhuemz518zgwxyobs มอดูล:category tree/หัวข้อ/สังคม 828 44518 5720684 5720603 2026-04-21T01:19:32Z OctraBot 3198 5720684 Scribunto text/plain local labels = {} local unpack = unpack or table.unpack -- Lua 5.2 compatibility labels["สังคม"] = { type = "related-to", description = "default", parents = {"หัวข้อทั้งหมด"}, } labels["society"] = labels["สังคม"] labels["ปริญญา"] = { type = "name", description = "default", parents = {"การศึกษา"}, } labels["academic degrees"] = labels["ปริญญา"] labels["ชั้นเรียน"] = { type = "set", description = "default", parents = {"การศึกษา"}, } labels["academic grades"] = labels["ชั้นเรียน"] labels["การบัญชี"] = { type = "related-to", description = "default", parents = {"การเงิน"}, } labels["accounting"] = labels["การบัญชี"] labels["administrative divisions"] = { type = "set", description = "default", parents = {"รัฐบาลและการปกครอง"}, } labels["การโฆษณา"] = { type = "related-to", description = "default", parents = {"ธุรกิจ", "การตลาด"}, } labels["advertising"] = labels["การโฆษณา"] labels["alt-right"] = { type = "related-to", description = "=the [[alt-right]], a loosely connected [[far-right]], [[white nationalist]] movement", parents = {"อนุรักษนิยม", "ลัทธิฟาสซิสต์", "คตินิยม", "white supremacist ideology"}, } labels["อนาธิปไตย"] = { type = "related-to", description = "default", parents = {"คตินิยม", "ฝ่ายซ้าย"}, } labels["anarchism"] = labels["อนาธิปไตย"] labels["anti-Semitism"] = { type = "related-to", description = "default", parents = {"forms of discrimination"}, } labels["รางวัล"] = { type = "name,type", description = "default", parents = {"สังคม"}, } labels["awards"] = 
labels["รางวัล"] labels["การธนาคาร"] = { type = "related-to", description = "default", parents = {"การเงิน", "อุตสาหกรรม"}, } labels["banking"] = labels["การธนาคาร"] labels["bars"] = { type = "type", description = "default", parents = {"กิจการ", "การดื่ม"}, } labels["Basque nationalism"] = { type = "related-to", description = "default", parents = {"Basque Country, Spain", "ชาตินิยม"}, } labels["เครื่องนอน"] = { type = "related-to", description = "default", parents = {"บ้าน"}, } labels["bedding"] = labels["เครื่องนอน"] labels["blacksmithing"] = { type = "related-to", description = "default", parents = {"โลหกรรม"}, } labels["bond market"] = { type = "related-to", description = "default with the", parents = {"การเงิน"}, } labels["bookbinding"] = { type = "related-to", description = "default", parents = {"publishing"}, } labels["book sizes"] = { type = "name", description = "default", parents = {"bookbinding"}, } labels["Brexit"] = { type = "related-to", description = "={{w|Brexit}}, i.e. the withdrawal of the {{w|United Kingdom}} from the {{w|European Union}}", parents = {"ชาตินิยม", "การเมืองยุโรป", "การเมืองสหราชอาณาจักร"}, } labels["burial"] = { type = "related-to", description = "default", parents = {"สังคม", "ความตาย"}, } labels["ธุรกิจ"] = { type = "related-to", description = "default", parents = {"เศรษฐศาสตร์", "สังคม"}, } labels["business"] = labels["ธุรกิจ"] labels["กิจการ"] = { type = "type", description = "=[[business]]es (specific commercial enterprises or establishments)", parents = {"ธุรกิจ"}, } labels["businesses"] = labels["กิจการ"] labels["ทุนนิยม"] = { type = "related-to", description = "default", parents = {"เศรษฐศาสตร์", "คตินิยม"}, } labels["capitalism"] = labels["ทุนนิยม"] labels["chairs"] = { type = "related-to", description = "default", parents = {"เครื่องเรือน", "การนั่ง"}, } labels["child abuse"] = { type = "related-to", description = "default", parents = {"อาชญากรรม", "เด็ก", "ความรุนแรง"}, } labels["Chinese restaurants"] = { type = 
"related-to", description = "default", breadcrumb = "Chinese", parents = {"ร้านอาหาร", "จีน"}, } labels["cleaning"] = { type = "related-to", description = "default", parents = {"บ้าน"}, } labels["เหรียญ"] = { type = "set,related-to", description = "default", parents = {"เงิน (ตัวกลาง)"}, } labels["coins"] = labels["เหรียญ"] labels["อนุรักษนิยม"] = { type = "related-to", description = "=[[conservatism]] or [[traditionalist]] beliefs", parents = {"คตินิยม"}, } labels["conservatism"] = labels["อนุรักษนิยม"] labels["commerce"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["commercial documents"] = { type = "set", description = "default", parents = {"commerce"}, } labels["commercial law"] = { type = "related-to", description = "default", breadcrumb = "commercial", parents = {"กฎหมาย", "commerce"}, } labels["competition law"] = { type = "related-to", description = "default", breadcrumb = "competition", parents = {"กฎหมาย"}, } labels["antitrust law"] = { description = "default", breadcrumb = "antitrust", parents = {"competition law"}, } labels["law of unfair competition"] = { description = "default with the", breadcrumb = "unfair", parents = {{name = "competition law", sort = "unfair"}}, } labels["ลัทธิคอมมิวนิสต์"] = { type = "related-to", description = "default", parents = {"คตินิยม", "สังคมนิยม", "ฝ่ายซ้าย"}, } labels["communism"] = labels["ลัทธิคอมมิวนิสต์"] labels["constitutional law"] = { type = "related-to", description = "default", breadcrumb = "constitutional", parents = {"กฎหมาย"}, } labels["ลิขสิทธิ์"] = { type = "related-to", description = "default", parents = {"ทรัพย์สินทางปัญญา"}, } labels["copyright"] = labels["ลิขสิทธิ์"] labels["copyright licenses"] = { type = "name", description = "=[[license]]s of [[copyright]]", breadcrumb_and_first_sort_base = "licenses", parents = {"ลิขสิทธิ์"}, } labels["corporate law"] = { type = "related-to", description = "default", breadcrumb = "corporate", parents = {"กฎหมาย"}, } 
labels["corruption"] = { type = "related-to", description = "default", parents = {"อาชญากรรม", "การเมือง"}, } labels["งานฝีมือ"] = { type = "type", description = "default", parents = {"สังคม"}, } labels["crafts"] = labels["งานฝีมือ"] labels["อาชญากรรม"] = { type = "related-to", description = "default", parents = {"สังคม", "กฎหมายอาญา"}, } labels["crime"] = labels["อาชญากรรม"] labels["crime prevention"] = { type = "related-to", description = "default", parents = {"public safety", "อาชญากรรม"}, } labels["กฎหมายอาญา"] = { type = "related-to", description = "default", breadcrumb = "criminal", parents = {"กฎหมาย"}, } labels["criminal law"] = labels["กฎหมายอาญา"] labels["crochet"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["cryptocurrency"] = { type = "related-to", description = "default", parents = {"เงินตรา", "วิทยาการรหัสลับ", "เทคโนโลยี"}, } -- currencies = the various kinds of currency labels["สกุลเงิน"] = { type = "set", description = "default", parents = {"เงิน (ตัวกลาง)", "เงินตรา"}, } labels["currencies"] = labels["สกุลเงิน"] -- currency = money established by law, bearing the state's stamp labels["เงินตรา"] = { type = "related-to", description = "default", parents = {"เงิน (ตัวกลาง)"}, } labels["currency"] = labels["เงินตรา"] labels["dairy farming"] = { type = "related-to", description = "default", parents = {"เกษตรกรรม", "อุตสาหกรรม"}, } labels["ประชาธิปไตย"] = { type = "related-to", description = "default", parents = {"ระบอบการปกครอง"}, } labels["democracy"] = labels["ประชาธิปไตย"] labels["diplomacy"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["discrimination"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["drug trafficking"] = { type = "related-to", description = "default", parents = {"อาชญากรรม", "ยา"}, } labels["การศึกษา"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["education"] = labels["การศึกษา"] labels["emergency services"] = { type = 
"related-to", description = "default", parents = {"public safety"}, } labels["employment"] = { type = "related-to", description = "default", parents = {"ธุรกิจ", "งาน"}, } labels["espionage"] = { type = "related-to", description = "default", parents = {"security", "deception", "secrecy"}, } labels["evil"] = { type = "related-to", description = "default", parents = {"จริยศาสตร์", "ศาสนา"}, } labels["fame"] = { type = "related-to", description = "default", parents = {"สังคม", "ความรู้"}, } labels["family law"] = { type = "related-to", description = "default", breadcrumb = "family", parents = {"กฎหมาย"}, } labels["ลัทธิฟาสซิสต์"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["fascism"] = labels["ลัทธิฟาสซิสต์"] labels["farriery"] = { type = "related-to", description = "default", parents = {"blacksmithing", "ม้า"}, } -- AKA คตินิยมสิทธิสตรี, สตรีสิทธินิยม labels["สตรีนิยม"] = { type = "related-to", description = "default", parents = {"สถานะเพศ", "เพศหญิง", "คตินิยม", "สังคม", "สังคมวิทยา"}, } labels["feminism"] = labels["สตรีนิยม"] labels["feudalism"] = { type = "related-to", description = "default", parents = {"ระบอบการปกครอง"}, } labels["การเงิน"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["finance"] = labels["การเงิน"] labels["firefighting"] = { type = "related-to", description = "default", parents = {"emergency services", "ไฟ"}, } labels["forms of discrimination"] = { type = "type", description = "{{{langname}}} terms for [[form]]s of [[discrimination]].", additional = "{{also|หมวดหมู่:{{{langcode}}}:อคติ|หมวดหมู่:{{{langcode}}}:ทฤษฎีสมคบคิด|หมวดหมู่:{{{langcode}}}:คตินิยม}}", breadcrumb = "forms", parents = {"discrimination"}, } labels["ระบอบการปกครอง"] = { type = "type", description = "{{{langname}}} terms for [[form]]s of [[government]].", breadcrumb = "forms", parents = {"รัฐบาลและการปกครอง"}, } labels["forms of government"] = labels["ระบอบการปกครอง"] labels["อิสรภาพ"] = { type = 
"related-to", description = "default", parents = {"สังคม"}, } labels["freedom"] = labels["อิสรภาพ"] labels["freedom of speech"] = { type = "related-to", description = "default", breadcrumb = "speech", parents = {{name = "อิสรภาพ", sort = "speech"}, "กฎหมาย"}, } labels["freemasonry"] = { type = "related-to", description = "default", parents = {"องค์การ"}, } labels["funeral"] = { type = "related-to", description = "default", parents = {"สังคม", "ความตาย", "อุตสาหกรรม"}, } labels["เครื่องเรือน"] = { type = "related-to", description = "default", parents = {"บ้าน"}, commonscat = true, wpcat = true, } labels["furniture"] = labels["เครื่องเรือน"] labels["gender-critical feminism"] = { type = "related-to", description = "default", breadcrumb = "gender-critical", parents = {"สตรีนิยม", "สถานะเพศ", "transphobia"}, } labels["glassblowing"] = { type = "related-to", description = "default", parents = {"งานฝีมือ", "glass"}, } labels["good"] = { type = "related-to", description = "default", parents = {"จริยศาสตร์", "ศาสนา"}, } labels["รัฐบาลและการปกครอง"] = { type = "related-to", description = "default", parents = {"สังคม", "การเมือง"}, } labels["government"] = labels["รัฐบาลและการปกครอง"] labels["hairdressing"] = { type = "related-to", description = "default", parents = {"ผมและขน", "งานฝีมือ"}, } labels["สังคมชั้นสูง"] = { type = "related-to", description = "=royalty and nobility", parents = {"สังคม"}, } labels["high society"] = labels["สังคมชั้นสูง"] labels["Hindutva"] = { type = "related-to", description = "=[[Hindutva]] or {{w|Hindu nationalism}}", parents = {"อนุรักษนิยม", "ศาสนาฮินดู", "คตินิยม", "การเมืองอินเดีย", "ชาตินิยม", "เทวาธิปไตย"}, } labels["สกุลเงินในอดีต"] = { type = "set", description = "default", breadcrumb = "historical", parents = {"สกุลเงิน"}, } labels["historical currencies"] = labels["สกุลเงินในอดีต"] labels["บ้าน"] = { type = "related-to", description = "default with the", parents = {"สังคม"}, } labels["home"] = labels["บ้าน"] labels["homophobia"] = { 
type = "related-to", description = "default", parents = {"queerphobia"}, } labels["hospitality"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["host industry"] = { type = "related-to", description = "default", parents = {"hospitality", "กิจการ"}, } labels["โรงแรม"] = { type = "type", description = "default", parents = {"กิจการ", "การท่องเที่ยว", "hospitality"}, } labels["hotels"] = labels["โรงแรม"] labels["ครัวเรือน"] = { type = "related-to", description = "default", parents = {"บ้าน"}, } labels["household"] = labels["ครัวเรือน"] labels["housing"] = { type = "related-to", description = "default", parents = {"บ้าน", "อาคาร"}, } labels["ทรัพยากรมนุษย์"] = { type = "related-to", description = "default no singularize", parents = {"ธุรกิจ", "สังคมวิทยา"}, } labels["human resources"] = labels["ทรัพยากรมนุษย์"] labels["คตินิยม"] = { type = "related-to", description = "default", parents = {"สังคม", "การเมือง"}, } labels["ideologies"] = labels["คตินิยม"] labels["imperialism"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["import/export"] = { type = "related-to", description = "=[[import]]s and [[export]]s", parents = {"การค้า", "การคมนาคม"}, } labels["incel community"] = { type = "related-to", description = "=the [[incel]] community", parents = {"masculism", "เพศ"}, } labels["incoterms"] = { type = "related-to", description = "=[[Incoterm]]s", parents = {"ธุรกิจ", "import/export"}, } labels["อุตสาหกรรม"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["industries"] = labels["อุตสาหกรรม"] labels["inheritance law"] = { type = "related-to", description = "default", breadcrumb = "inheritance", parents = {"กฎหมาย"}, } labels["insurance"] = { type = "related-to", description = "default", parents = {"การเงิน", "อุตสาหกรรม"}, } labels["ทรัพย์สินทางปัญญา"] = { type = "related-to", description = "=[[intellectual property]] [[law]]", parents = {"กฎหมาย"}, } labels["intellectual 
property"] = labels["ทรัพย์สินทางปัญญา"] labels["กฎหมายระหว่างประเทศ"] = { type = "related-to", description = "default", breadcrumb = "international", parents = {"กฎหมาย"}, } labels["international law"] = labels["กฎหมายระหว่างประเทศ"] labels["international relations"] = { type = "related-to", description = "default wikify", parents = {"การเมือง", "โลก"}, } labels["การเงินอิสลาม"] = { type = "related-to", description = "default wikify", breadcrumb = "Islamic", parents = {"การเงิน", "การธนาคาร", "ศาสนาอิสลาม"}, } labels["Islamic finance"] = labels["การเงินอิสลาม"] labels["กฎหมายอิสลาม"] = { type = "related-to", description = "default wikify", breadcrumb = "กฎหมาย", parents = {{name = "ศาสนาอิสลาม", sort = "กฎหมาย"}, "กฎหมาย"}, } labels["Islamic law"] = labels["กฎหมายอิสลาม"] labels["Islamism"] = { type = "related-to", description = "default", parents = {"คตินิยม", "อนุรักษนิยม", "ศาสนาอิสลาม", "เทวาธิปไตย"}, } labels["Juche"] = { type = "related-to", description = "default", parents = {"เกาหลีเหนือ", "communism", "ชาตินิยม"}, } labels["justice"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["เคเอฟซี"] = { type = "related-to", description = "=the {{w|Kentucky Fried Chicken}} [[chain]] of [[fast-food]] [[restaurant]]s", parents = {"ร้านอาหาร"}, } labels["Kentucky Fried Chicken"] = labels["เคเอฟซี"] labels["knitting"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["Ku Klux Klan"] = { type = "related-to", description = "default with the", parents = {"องค์การ", "white supremacist ideology"}, } labels["kyabakura industry"] = { type = "related-to", description = "default", parents = {"hospitality", "กิจการ"}, } labels["labour"] = { type = "related-to", description = "=[[labour]] or the {{w|labour movement}}", parents = {"งาน", "ฝ่ายซ้าย"}, } labels["laundry"] = { type = "related-to", description = "default", parents = {"cleaning"}, } labels["กฎหมาย"] = { type = "related-to", description = "=the 
[[science]] and [[practice]] of [[law]]", parents = {"justice"}, } labels["law"] = labels["กฎหมาย"] labels["การบังคับใช้กฎหมาย"] = { type = "related-to", description = "default", parents = {"crime prevention", "emergency services", "กฎหมาย"}, } labels["law enforcement"] = labels["การบังคับใช้กฎหมาย"] labels["law of obligations"] = { type = "related-to", description = "default with the no singularize", breadcrumb = "obligations", parents = {"กฎหมาย"}, } labels["leatherworking"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["ฝ่ายซ้าย"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["leftism"] = labels["ฝ่ายซ้าย"] labels["เสรีนิยม"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["liberalism"] = labels["เสรีนิยม"] labels["อิสรนิยม"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["libertarianism"] = labels["อิสรนิยม"] labels["logistics"] = { type = "related-to", description = "default no singularize", parents = {"operations"}, } labels["การจัดการ"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["management"] = labels["การจัดการ"] labels["ลัทธิเหมา"] = { type = "related-to", description = "default", parents = {"คตินิยม", "ลัทธิคอมมิวนิสต์", "ลัทธิมากซ์"}, } labels["Maoism"] = labels["ลัทธิเหมา"] labels["การตลาด"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["marketing"] = labels["การตลาด"] labels["ลัทธิมากซ์"] = { type = "related-to", description = "default", parents = {"คตินิยม", "สังคมนิยม"}, } labels["Marxism"] = labels["ลัทธิมากซ์"] labels["masculism"] = { type = "related-to", description = "default", parents = {"คตินิยม", "เพศชาย"}, } labels["metalworking"] = { type = "related-to", description = "default", parents = {"งานฝีมือ", "โลหกรรม"}, } labels["McDonald's"] = { type = "related-to", description = "=the {{w|McDonald's}} [[chain]] of [[fast-food]] 
[[restaurant]]s", parents = {"ร้านอาหาร"}, } labels["micronationalism"] = { type = "related-to", description = "default", parents = {"ระบอบการปกครอง", "คตินิยม"}, } labels["การทหาร"] = { type = "related-to", description = "default with the", parents = {"สังคม"}, } labels["military"] = labels["การทหาร"] labels["military units"] = { type = "related-to", description = "default", parents = {"การทหาร", "อาชีพ"}, } labels["mining"] = { type = "related-to", description = "default", parents = {"อุตสาหกรรม"}, } labels["กษัตริย์นิยม"] = { type = "related-to", description = "default", parents = {"คตินิยม", "ราชาธิปไตย"}, } labels["monarchism"] = labels["กษัตริย์นิยม"] labels["ราชาธิปไตย"] = { type = "related-to", description = "default", parents = {"ระบอบการปกครอง", "สังคมชั้นสูง"}, } labels["monarchy"] = labels["ราชาธิปไตย"] -- money คือรวมทั้งเงินตราและไม่ใช่เงินตรา labels["เงิน (ตัวกลาง)"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["money"] = labels["เงิน (ตัวกลาง)"] labels["museums"] = { type = "related-to", description = "default", parents = {"กิจการ", "การท่องเที่ยว", "ศิลปะ"}, } labels["ชาตินิยม"] = { type = "related-to", description = "default", parents = {"คตินิยม"}, } labels["nationalism"] = labels["ชาตินิยม"] labels["ลัทธินาซี"] = { type = "related-to", description = "default", parents = {"ลัทธิฟาสซิสต์", "white supremacist ideology", "คตินิยม"}, } labels["Nazism"] = labels["ลัทธินาซี"] labels["ลัทธินาซีใหม่"] = { -- Adjacent to Nazism, but not quite the same thing. 
type = "related-to", description = "default", parents = {"ลัทธินาซี", "ลัทธิฟาสซิสต์", "white supremacist ideology", "คตินิยม"}, } labels["neo-Nazism"] = labels["ลัทธินาซีใหม่"] labels["Nobel Prize"] = { type = "related-to", description = "default with the", parents = {"รางวัล"}, } labels["Objectivism"] = { type = "related-to", description = "=the political philosophy of {{w|Objectivism}} developed by {{w|Ayn Rand}}", parents = {"คตินิยม", "อิสรนิยม"}, } labels["offices"] = { type = "type", description = "=offices, in the sense \"position of responsibility of some authority within an organisation\"", parents = {"รัฐบาลและการปกครอง"}, } labels["อุตสาหกรรมน้ำมัน"] = { type = "related-to", description = "default with the", breadcrumb = "oil", parents = {"อุตสาหกรรม", "ปิโตรเลียม"}, } labels["oil industry"] = labels["อุตสาหกรรมน้ำมัน"] labels["operations"] = { type = "related-to", description = "{{{langname}}} terms covering all operational matters in [[production]], [[logistics]], or [[services]].", parents = {"การจัดการ", "systems theory"}, } labels["องค์การ"] = { type = "name", description = "default", parents = {"สังคม"}, } labels["organizations"] = labels["องค์การ"] labels["papermaking"] = { type = "related-to", description = "default", parents = {"งานฝีมือ", "อุตสาหกรรม"}, } labels["กฎหมายสิทธิบัตร"] = { type = "related-to", description = "default", breadcrumb = "patent", parents = {"กฎหมาย"}, } labels["patent law"] = labels["กฎหมายสิทธิบัตร"] labels["peace"] = { type = "related-to", description = "default", parents = {"security"}, } labels["pensions"] = { type = "related-to", description = "default", parents = {"การเงิน"}, } labels["philanthropy"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["Philmont Scout Ranch"] = { type = "related-to", description = "={{w|Philmont Scout Ranch}}, a Scouting ranch in the United States", parents = {"Scouting"}, } labels["piracy"] = { type = "related-to", description = "default", parents = 
{"อาชญากรรม", "การเดินเรือ"}, } labels["การเมือง"] = { type = "related-to", description = "default no singularize", parents = {"สังคม"}, } labels["politics"] = labels["การเมือง"] labels["poverty"] = { type = "related-to", description = "default", parents = {"wealth"}, } --ภาษาไทยไม่ต้องผันรูป for _, country_demonym in ipairs { {"อาร์เจนตินา"}, {"ออสเตรเลีย"}, {"บังกลาเทศ"}, {"บราซิล"}, {"แคนาดา"}, {"ชิลี"}, {"จีน"}, {"ยุโรป"}, {"สหภาพยุโรป", nil, nil, "การเมืองยุโรป"}, {"ฝรั่งเศส", nil, nil, "การเมืองยุโรป"}, {"เยอรมนี", nil, nil, "การเมืองยุโรป"}, {"ฮ่องกง"}, {"ฮังการี", nil, nil, "การเมืองยุโรป"}, {"อินเดีย"}, {"อินโดนีเซีย"}, {"ไอร์แลนด์", nil, nil, "การเมืองยุโรป"}, {"ญี่ปุ่น"}, {"มาเลเซีย"}, {"เม็กซิโก"}, {"นิวซีแลนด์"}, {"ไนจีเรีย"}, {"ปากีสถาน"}, {"ปาเลสไตน์"}, {"เปรู"}, {"ฟิลิปปินส์"}, {"โปแลนด์", nil, nil, "การเมืองยุโรป"}, {"โปรตุเกส", nil, nil, "การเมืองยุโรป"}, {"รัสเซีย"}, {"สิงคโปร์"}, {"เซาท์แอฟริกา"}, {"เกาหลีใต้"}, {"สเปน", nil, nil, "การเมืองยุโรป"}, {"สวิตเซอร์แลนด์", nil, nil, "การเมืองยุโรป"}, {"ไต้หวัน"}, {"ยูเครน"}, {"สหราชอาณาจักร"}, {"สหรัฐอเมริกา"}, {"เวเนซุเอลา"}, {"เวียดนาม"}, } do local country, demonym, full_country, parent = unpack(country_demonym) labels["การเมือง" .. 
country] = { -- ภาษาไทยใช้คำเดียวกันหมด type = "related-to", description = ("=the {{w|politics of %s}}"):format(full_country or country), parents = {parent or "การเมือง", country}, } end labels["การพิมพ์"] = { type = "related-to", description = "default", parents = {"อุตสาหกรรม"}, } labels["printing"] = labels["การพิมพ์"] labels["prison"] = { type = "related-to", description = "default", parents = {"การบังคับใช้กฎหมาย", "อาคาร"}, } labels["กฎหมายวิธีพิจารณาความ"] = { type = "related-to", description = "default", breadcrumb = "procedural", parents = {"กฎหมาย"}, } labels["procedural law"] = labels["กฎหมายวิธีพิจารณาความ"] labels["property law"] = { type = "related-to", description = "default", breadcrumb = "property", parents = {"กฎหมาย"}, } labels["public administration"] = { type = "related-to", description = "=the field of [[public]] [[administration]]", parents = {"รัฐบาลและการปกครอง"}, } labels["public safety"] = { type = "related-to", description = "=the field of [[public]] [[safety]]", parents = {"public administration", "security"}, } labels["publishing"] = { type = "related-to", description = "default", parents = {"อุตสาหกรรม", "สื่อมวลชน"}, } labels["QAnon"] = { type = "related-to", description = "=the [[QAnon]] movement", parents = {"alt-right", "ทฤษฎีสมคบคิด", "Donald Trump", "pedophilia"}, } labels["queerphobia"] = { type = "related-to", description = "default", parents = {"forms of discrimination", "แอลจีบีทีคิว"}, } labels["เชื้อชาตินิยม"] = { type = "related-to", description = "default", parents = {"forms of discrimination"}, } labels["racism"] = labels["เชื้อชาตินิยม"] labels["rape"] = { type = "related-to", description = "=the field of [[sexual violence]]", parents = {"เพศ", "อาชญากรรม", "ความรุนแรง"}, } labels["อสังหาริมทรัพย์"] = { type = "related-to", description = "default", parents = {"อุตสาหกรรม", "housing"}, } labels["real estate"] = labels["อสังหาริมทรัพย์"] labels["ร้านอาหาร"] = { type = "related-to", description = "=[[restaurant]]s 
(including [[pub]]s, [[café]]s etc.)", parents = {"กิจการ", "อาหารและเครื่องดื่ม"}, } labels["restaurants"] = labels["ร้านอาหาร"] labels["royal residences"] = { type = "related-to", description = "default", parents = {"housing", "ราชาธิปไตย"}, } labels["โรงเรียน"] = { type = "related-to", description = "default", parents = {"การศึกษา", "อาคาร"}, } labels["schools"] = labels["โรงเรียน"]
-- Note: this is the usual term, not "Scottish law".
labels["Scots law"] = { type = "related-to", description = "default", breadcrumb = "Scots", parents = {"กฎหมาย", "สกอตแลนด์"}, } labels["Scouting"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["security"] = { type = "related-to", description = "default", parents = {"สังคม"}, } labels["sexism"] = { type = "related-to", description = "default", parents = {"forms of discrimination", "สถานะเพศ"}, } labels["sewing"] = { type = "related-to", description = "=[[sewing]], sewing tools, sewing [[technique]]s and so on", parents = {"งานฝีมือ"}, } labels["shoemaking"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["slavery"] = { type = "related-to", description = "default", parents = {"สังคม", "งาน"}, } labels["สังคมนิยม"] = { type = "related-to", description = "default", parents = {"เศรษฐศาสตร์", "คตินิยม", "ฝ่ายซ้าย"}, } labels["socialism"] = labels["สังคมนิยม"] labels["social justice"] = { type = "related-to", description = "default", parents = {"การเมือง", "สังคม", "สังคมวิทยา", "ฝ่ายซ้าย"}, } labels["social security"] = { type = "related-to", description = "default", parents = {"รัฐบาลและการปกครอง", "กฎหมาย", "เงิน (ตัวกลาง)"}, } labels["spinning"] = { type = "related-to", description = "=[[spinning]], the process of making [[yarn]] or [[string]] from raw [[fiber]]", parents = {"งานฝีมือ"}, } labels["square dancing"] = { type = "related-to", description = "default", parents = {"dance"}, } labels["standards of identity"] = { type = "related-to", description = "default", 
parents = {"กฎหมาย", "อาหารและเครื่องดื่ม"}, } labels["ตลาดหลักทรัพย์"] = { type = "related-to", description = "default with the", parents = {"การเงิน"}, } labels["stock market"] = labels["ตลาดหลักทรัพย์"] labels["stock symbols for companies"] = { type = "name", description = "=[[stock symbol]]s for [[company|companies]]", parents = {"การค้า"}, } labels["supply chain"] = { type = "related-to", description = "default no singularize", parents = {"operations"}, } labels["ภาษีอากร"] = { type = "related-to", description = "default", parents = {"รัฐบาลและการปกครอง", "กฎหมาย", "เงิน (ตัวกลาง)"}, } labels["taxation"] = labels["ภาษีอากร"] labels["theft"] = { type = "related-to", description = "default", parents = {"อาชญากรรม"}, } labels["เทวาธิปไตย"] = { type = "related-to", description = "default", parents = {"คตินิยม", "ศาสนา"}, } labels["theocracy"] = labels["เทวาธิปไตย"] labels["อุตสาหกรรมป่าไม้"] = { type = "related-to", description = "default with the", breadcrumb = "timber", parents = {"อุตสาหกรรม"}, } labels["timber industry"] = labels["อุตสาหกรรมป่าไม้"] labels["เครื่องหมายการค้า"] = { type = "related-to", description = "=[[trademark]] [[law]]", parents = {"ทรัพย์สินทางปัญญา"}, } labels["trademark"] = labels["เครื่องหมายการค้า"] labels["การค้า"] = { type = "related-to", description = "default", parents = {"ธุรกิจ"}, } labels["transphobia"] = { type = "related-to", description = "default", parents = {"forms of discrimination", "transgender"}, } labels["trust"] = { type = "related-to", description = "default", parents = {"security"}, } labels["types of settlements"] = { type = "type", topic = "การตั้งถิ่นฐาน", description = "=[[การตั้งถิ่นฐาน]]", parents = {"รัฐบาลและการปกครอง"}, } labels["สหประชาชาติ"] = { type = "related-to", description = "=the [[United Nations Organization]]", parents = {"องค์การ"}, } labels["United Nations"] = labels["สหประชาชาติ"] labels["มหาวิทยาลัย"] = { type = "related-to", description = "default", parents = {"โรงเรียน"}, } 
labels["universities"] = labels["มหาวิทยาลัย"] labels["voting systems"] = { type = "related-to", description = "default", parents = {"ประชาธิปไตย", "ระบบ"}, } labels["wealth"] = { type = "related-to", description = "default", parents = {"เศรษฐศาสตร์"}, } labels["weaving"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["white supremacist ideology"] = { type = "related-to", description = "default", parents = {"เชื้อชาตินิยม", "anti-Semitism", "คตินิยม"}, } labels["woodworking"] = { type = "related-to", description = "default", parents = {"งานฝีมือ"}, } labels["Zionism"] = { type = "related-to", description = "default", parents = {"คตินิยม", "ศาสนายูดาย", "อิสราเอล", "ชาตินิยม"}, } return labels h1sxr4lcx65b9zhj1tsjntxprzxubfi ciudad 0 45949 5720706 2188019 2026-04-21T01:57:48Z OctraBot 3198 5720706 wikitext text/x-wiki == ภาษาชาบากาโน == === รากศัพท์ === {{inh+|cbk|es|ciudad}} === คำนาม === {{head|cbk|คำนาม}} # [[นคร]], [[เมืองใหญ่]] == ภาษานาวัตล์คลาสสิก == === รากศัพท์ === {{bor+|nci|es|ciudad}} === คำนาม === {{head|nci|คำนาม|head=ciudād}} # [[นคร]], [[เมืองใหญ่]] === อ้างอิง === * Lockhart, James. (2001) ''Nahuatl as Written'', Stanford University Press, page 215. 
== ภาษาสเปน == === รูปแบบอื่น === * {{alter|es|cibdad||โบราณ}} === รากศัพท์ === {{inh+|es|osp|cibdat}}, {{m|osp|cibdad|çibdad}} (เทียบ{{cog|lad|sivdad}}), จาก{{inh|es|la|cīvitātem}}, กรรมการกเอกพจน์ของ {{m|la|cīvitās||เมืองใหญ่}} (เทียบ{{cog|pt|cidade}}, {{cog|gl|cidade}}) === การออกเสียง === {{es-pr|+<audio:Es-am-lat-ciudad.ogg;Audio (Latin America)>}} === คำนาม === {{es-noun|f}} # [[นคร]], [[เมืองใหญ่]] #: {{usex|es|[[vivir|Viven]] en la '''ciudad'''.|They live in the '''city'''.}} #: {{usex|es|¡Qué '''ciudad''' tan grande y bonita!|What a large and beautiful '''city'''!}} ==== ลูกคำ ==== {{col-auto|es | ciudad estado | ciudad federal | ciudadano | ciudad fantasma | ciudad universitaria }} ==== คำเกี่ยวข้อง ==== {{col-auto|es|ciudadela|civil}} ==== คำสืบทอด ==== * {{desc|cbk|ciudad}} * {{desc|bcl|siyudad|bor=1}} * {{desc|ceb|siyudad|bor=1}} * {{desc|ilo|siudad|bor=1}} * {{desc|tl|siyudad|bor=1}} ==== ดูเพิ่ม ==== * {{l|es|aldea}} * {{l|es|pueblo}} === อ่านเพิ่ม === * {{R:es:DRAE}} frgjhc2o5ycdcdmlz74npf5jd4k1qt3 หมวดหมู่:term cleanup 14 46060 5720718 224816 2026-04-21T02:17:34Z OctraBot 3198 5720718 wikitext text/x-wiki {{delete}} 35r2j9t4ectnt1cmb7mlgcqvwz6h5k6 ผู้ใช้:Octahedron80/อักษรไทธรรม 2 138742 5720678 4933435 2026-04-20T15:48:13Z OctraBot 3198 5720678 wikitext text/x-wiki <div class="Lana" lang="nod"> ; พยัญชนะ # ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก # ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น # หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง) #* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ) #* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ) #* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ) # สะใหญ่ ᩔ (สฺส) พบได้ในคำบาลี # พยัญชนะสะกด ปกติจะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้ #* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน #* 
ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง) ; สระ # ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน # ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u> #* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง # สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ) หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ {| |- | {| class="wikitable" |- | 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ |- style="background-color:lightgreen" | 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ |- style="background-color:lightgreen" | 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ |- | 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ |- | 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ |- | 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ |- | 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ |- style="background-color:moccasin" | 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ |- | 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ |- | 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ |- | 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ |- | 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ |- | 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ |- | 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ |- | 15. || ᨠᩥ (อิ) || ᨠ + ᩥ |- | 16. || ᨠᩦ (อี) || ᨠ + ᩦ |- | 17. || ᨠᩧ (อึ) || ᨠ + ᩧ |- | 18. || ᨠᩨ (อือ) || ᨠ + ᩨ |- | 19. || ᨠᩩ (อุ) || ᨠ + ᩩ |- | 20. || ᨠᩪ (อู) || ᨠ + ᩪ |} | {| class="wikitable" |- | 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ |- | 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ |- | 23. || ᨠᩮ (เอ) || ᨠ + ᩮ |- | 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ |- | 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ |- | 26. || ᨠᩯ (แอ) || ᨠ + ᩯ |- | 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ |- | 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ |- | 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ |- style="background-color:pink" | 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ |- style="background-color:pink" | 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ |- | 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ |- | 33. 
|| ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ |- style="background-color:lightgreen" | 34. || ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ |- style="background-color:lightgreen" | 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ |- style="background-color:pink" | 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ |- style="background-color:lightblue" | 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ |- | 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ |- | 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ |- | 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ |} | {| class="wikitable" |- | 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ |- | 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ |- | 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ |- | 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ |- | 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ |- | 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ |- | 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ |- | 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ |- style="background-color:moccasin" | 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ |- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);" | 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ |- | 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ |- | 52. 
|| ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ |} |} [1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่) ; วรรณยุกต์ # ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว # ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที ; สัญลักษณ์อื่น ๆ # ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่ #* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก #* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป #* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว # ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป # ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ # การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺ ; อื่น ๆ # ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง </div> qm6x5xllgzi2ozmiktw8i3zmkq0zimy 5720679 5720678 2026-04-20T15:48:51Z OctraBot 3198 5720679 wikitext text/x-wiki <div class="Lana" lang="nod"> ; พยัญชนะ # ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก # ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น # หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง) #* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ) #* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ) #* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ) # สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง) # พยัญชนะสะกด ปกติจะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้ #* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน #* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง) ; สระ # ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ 
ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน # ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u> #* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง # สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ) หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ {| |- | {| class="wikitable" |- | 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ |- style="background-color:lightgreen" | 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ |- style="background-color:lightgreen" | 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ |- | 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ |- | 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ |- | 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ |- | 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ |- style="background-color:moccasin" | 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ |- | 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ |- | 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ |- | 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ |- | 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ |- | 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ |- | 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ |- | 15. || ᨠᩥ (อิ) || ᨠ + ᩥ |- | 16. || ᨠᩦ (อี) || ᨠ + ᩦ |- | 17. || ᨠᩧ (อึ) || ᨠ + ᩧ |- | 18. || ᨠᩨ (อือ) || ᨠ + ᩨ |- | 19. || ᨠᩩ (อุ) || ᨠ + ᩩ |- | 20. || ᨠᩪ (อู) || ᨠ + ᩪ |} | {| class="wikitable" |- | 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ |- | 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ |- | 23. || ᨠᩮ (เอ) || ᨠ + ᩮ |- | 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ |- | 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ |- | 26. || ᨠᩯ (แอ) || ᨠ + ᩯ |- | 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ |- | 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ |- | 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ |- style="background-color:pink" | 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ |- style="background-color:pink" | 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ |- | 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ |- | 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ |- style="background-color:lightgreen" | 34. 
|| ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ |- style="background-color:lightgreen" | 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ |- style="background-color:pink" | 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ |- style="background-color:lightblue" | 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ |- | 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ |- | 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ |- | 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ |} | {| class="wikitable" |- | 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ |- | 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ |- | 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ |- | 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ |- | 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ |- | 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ |- | 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ |- | 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ |- style="background-color:moccasin" | 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ |- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);" | 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ |- | 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ |- | 52. 
|| ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ |} |} [1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่) ; วรรณยุกต์ # ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว # ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที ; สัญลักษณ์อื่น ๆ # ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่ #* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก #* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป #* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว # ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป # ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ # การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺ ; อื่น ๆ # ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง </div> tj9kb6j76jx44mh20iy6f09od6zyny1 5720680 5720679 2026-04-20T15:52:52Z OctraBot 3198 5720680 wikitext text/x-wiki <div class="Lana" lang="nod"> ; พยัญชนะ # ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก # ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น # หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง) #* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ) #* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ) #* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ) # สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง) # พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้ #* เว้นแต่ ตัวอักษรก่อนหน้าเป็นสระล่าง หรือพยัญชนะเชิง จะเขียนเป็นตัวเต็มแทน #* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง) ; สระ # ใส่สระหลังจาก (กลุ่ม) พยัญชนะต้นเสมอ 
ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน # ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u> #* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง # สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ) หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ {| |- | {| class="wikitable" |- | 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ |- style="background-color:lightgreen" | 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ |- style="background-color:lightgreen" | 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ |- | 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ |- | 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ |- | 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ |- | 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ |- style="background-color:moccasin" | 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ |- | 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ |- | 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ |- | 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ |- | 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ |- | 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ |- | 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ |- | 15. || ᨠᩥ (อิ) || ᨠ + ᩥ |- | 16. || ᨠᩦ (อี) || ᨠ + ᩦ |- | 17. || ᨠᩧ (อึ) || ᨠ + ᩧ |- | 18. || ᨠᩨ (อือ) || ᨠ + ᩨ |- | 19. || ᨠᩩ (อุ) || ᨠ + ᩩ |- | 20. || ᨠᩪ (อู) || ᨠ + ᩪ |} | {| class="wikitable" |- | 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ |- | 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ |- | 23. || ᨠᩮ (เอ) || ᨠ + ᩮ |- | 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ |- | 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ |- | 26. || ᨠᩯ (แอ) || ᨠ + ᩯ |- | 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ |- | 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ |- | 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ |- style="background-color:pink" | 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ |- style="background-color:pink" | 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ |- | 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ |- | 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ |- style="background-color:lightgreen" | 34. 
|| ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ |- style="background-color:lightgreen" | 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ |- style="background-color:pink" | 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ |- style="background-color:lightblue" | 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ |- | 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ |- | 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ |- | 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ |} | {| class="wikitable" |- | 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ |- | 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ |- | 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ |- | 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ |- | 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ |- | 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ |- | 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ |- | 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ |- style="background-color:moccasin" | 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ |- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);" | 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ |- | 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ |- | 52. 
|| ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ |} |} [1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่) ; วรรณยุกต์ # ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว # ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที ; สัญลักษณ์อื่น ๆ # ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่ #* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก #* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป #* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว # ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป # ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ # การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺ ; อื่น ๆ # ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง </div> lbvclqxminzaqfe5a5o1fdhj3kxlhqu 5720681 5720680 2026-04-20T15:53:45Z OctraBot 3198 5720681 wikitext text/x-wiki <div class="Lana" lang="nod"> ; พยัญชนะ # ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก # ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น # หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง) #* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ) #* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ) #* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ) # สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง) # พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ได้ #* หากตัวอักษรก่อนหน้าเป็นสระล่าง หรือมีพยัญชนะเชิงอยู่แล้ว จะเขียนเป็นตัวเต็มแทน #* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง) ; สระ # ใส่สระหลังจาก (กลุ่ม) 
พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน # ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u> #* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง # สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ) หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ {| |- | {| class="wikitable" |- | 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ |- style="background-color:lightgreen" | 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ |- style="background-color:lightgreen" | 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ |- | 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ |- | 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ |- | 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ |- | 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ |- style="background-color:moccasin" | 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ |- | 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ |- | 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ |- | 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ |- | 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ |- | 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ |- | 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ |- | 15. || ᨠᩥ (อิ) || ᨠ + ᩥ |- | 16. || ᨠᩦ (อี) || ᨠ + ᩦ |- | 17. || ᨠᩧ (อึ) || ᨠ + ᩧ |- | 18. || ᨠᩨ (อือ) || ᨠ + ᩨ |- | 19. || ᨠᩩ (อุ) || ᨠ + ᩩ |- | 20. || ᨠᩪ (อู) || ᨠ + ᩪ |} | {| class="wikitable" |- | 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ |- | 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ |- | 23. || ᨠᩮ (เอ) || ᨠ + ᩮ |- | 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ |- | 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ |- | 26. || ᨠᩯ (แอ) || ᨠ + ᩯ |- | 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ |- | 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ |- | 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ |- style="background-color:pink" | 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ |- style="background-color:pink" | 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ |- | 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ |- | 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ |- style="background-color:lightgreen" | 34. 
|| ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ |- style="background-color:lightgreen" | 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ |- style="background-color:pink" | 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ |- style="background-color:lightblue" | 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ |- | 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ |- | 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ |- | 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ |} | {| class="wikitable" |- | 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ |- | 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ |- | 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ |- | 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ |- | 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ |- | 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ |- | 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ |- | 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ |- style="background-color:moccasin" | 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ |- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);" | 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ |- | 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ |- | 52. 
|| ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ |} |} [1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่) ; วรรณยุกต์ # ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว # ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที ; สัญลักษณ์อื่น ๆ # ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่ #* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก #* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป #* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว # ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป # ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ # การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺ ; อื่น ๆ # ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง </div> 30akd007a6l88syyev9wkgmvlkl5wyf 5720682 5720681 2026-04-20T15:55:24Z OctraBot 3198 5720682 wikitext text/x-wiki <div class="Lana" lang="nod"> ; พยัญชนะ # ละ ที่เป็นอักษรตามหรืออักษรควบกล้ำ สามารถใช้ ᩖ หรือ ᩠ + ᩃ ก็ได้ วิกิพจนานุกรมกำหนดให้ ᩖ เป็นรายการหลัก # ละตังหลาย ᩗ ใช้ใน [[ᨴᩗᩘᩣ]] (ตังหลาย) เท่านั้น # หาง ᩛ ใช้ตามหลังพยัญชนะบางตัว ในคำที่ยืมมาจากบาลี/สันสกฤต (และบาลี/สันสกฤตโดยตรง) #* หากตามหลังพยัญชนะ ᨭ (ฏ) ᨮ (ฐ) ᨯ (ฑ) ᨰ (ฒ) ᨱ (ณ) จะเท่ากับ ᩠ + ᨮ (+ฐ) #* หากตามหลังพยัญชนะ ᨲ (ต) ᨳ (ถ) ᨴ (ท) ᨵ (ธ) ᨶ (น) จะเท่ากับ ᩠ + ᨳ (+ถ) #* หากตามหลังพยัญชนะ ᨷ (ป) ᨹ (ผ) ᨻ (พ) ᨽ (ภ) ᨾ (ม) จะเท่ากับ ᩠ + ᨻ (+พ) # สะใหญ่ ᩔ (สฺส) พบได้ในคำที่ยืมมาจากบาลี (และบาลีโดยตรง) # พยัญชนะสะกด <u>โดยปกติ</u>จะเขียนไว้เป็นเชิงของตัวอักษรก่อนหน้า ซึ่งอาจเป็นพยัญชนะหรือสระก็ตาม #* หากตัวอักษรก่อนหน้าเป็นสระล่าง หรือมีพยัญชนะเชิงอยู่แล้ว จะเขียนเป็นตัวเต็มแทน #* ตัวสะกด ᩠ + ᨿ (+ย) เขียนเป็นพยัญชนะเชิงเสมอ (สำหรับคำเมือง) ; สระ # ใส่สระหลังจาก (กลุ่ม) 
พยัญชนะต้นเสมอ ตัว ᨿ, ᩅ, ᩋ ที่ปรากฏในรูปสระ ก็ใช้หลักเดียวกัน # ถ้ามีสระหลายรูปประกอบกัน ให้ใส่สระหน้า สระล่าง สระบน และสระหลัง <u>ตามลำดับ</u> #* ตัวเชิง ᨿ, ᩅ เป็นสระล่าง แต่สระออยของเขิน ᩭ เป็นสระหลัง # สระอำเขียนต่างจากภาษาไทยคือ เขียนลากข้างก่อน แล้วตามด้วยนิคหิต ส่วนวรรณยุกต์อยู่บนพยัญชนะ (ก่อนสระ) หมายเหตุ: ใช้ ᨠ เป็นพยัญชนะสำหรับเกาะ {| |- | {| class="wikitable" |- | 1. || ᨠᩫ (โอะมีตัวสะกด) || ᨠ + ᩫ |- style="background-color:lightgreen" | 2. || ᨠᩴ (อัง และนิคหิตของบาลี) || ᨠ + ᩴ |- style="background-color:lightgreen" | 3. || ᨠᩘ (ใช้ในตังหลาย และ งฺ ของบาลี) || ᨠ + ᩘ |- | 4. || ᨠᩢ (อะมีตัวสะกด) || ᨠ + ᩢ |- | 5. || ᨠ᩠ᩅᩫᩡ (อัวะ) || ᨠ + ᩠ + ᩅ + ᩫ + ᩡ |- | 6. || ᨠ᩠ᩅᩫ (อัวไม่มีตัวสะกด) || ᨠ + ᩠ + ᩅ + ᩫ |- | 7. || ᨠ᩠ᩅ (อัวมีตัวสะกด) || ᨠ + ᩠ + ᩅ |- style="background-color:moccasin" | 8. || ᨠᩬᩴ (ออไม่มีตัวสะกดของคำเมือง)<sup>[1]</sup> || ᨠ + ᩬ + ᩴ |- | 9. || ᨠᩬ (ออมีตัวสะกด) || ᨠ + ᩬ |- | 10. || ᨠᩡ (อะไม่มีตัวสะกด/ไม่เติมก็ได้) || ᨠ + ᩡ |- | 11. || ᨠᩣ (อาต่ำ) || ᨠ + ᩣ |- | 12. || ᨠᩤ (อาสูง) || ᨠ + ᩤ |- | 13. || ᨠᩣᩴ (อำต่ำ) || ᨠ + ᩣ + ᩴ |- | 14. || ᨠᩤᩴ (อำสูง) || ᨠ + ᩤ + ᩴ |- | 15. || ᨠᩥ (อิ) || ᨠ + ᩥ |- | 16. || ᨠᩦ (อี) || ᨠ + ᩦ |- | 17. || ᨠᩧ (อึ) || ᨠ + ᩧ |- | 18. || ᨠᩨ (อือ) || ᨠ + ᩨ |- | 19. || ᨠᩩ (อุ) || ᨠ + ᩩ |- | 20. || ᨠᩪ (อู) || ᨠ + ᩪ |} | {| class="wikitable" |- | 21. || ᨠᩮᩡ (เอะไม่มีตัวสะกด) || ᨠ + ᩮ + ᩡ |- | 22. || ᨠᩮᩢ (เอะมีตัวสะกด) || ᨠ + ᩮ + ᩢ |- | 23. || ᨠᩮ (เอ) || ᨠ + ᩮ |- | 24. || ᨠᩯᩡ (แอะไม่มีตัวสะกด) || ᨠ + ᩯ + ᩡ |- | 25. || ᨠᩯᩢ (แอะมีตัวสะกด) || ᨠ + ᩯ + ᩢ |- | 26. || ᨠᩯ (แอ) || ᨠ + ᩯ |- | 27. || ᨠᩮᩬᩥᩡ (เออะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩡ |- | 28. || ᨠᩮᩬᩥ (เออไม่มีตัวสะกด หรือเอือมีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ |- | 29. || ᨠᩮᩥ (เออมีตัวสะกด) || ᨠ + ᩮ + ᩥ |- style="background-color:pink" | 30. || ᨠᩮᩬᩨᩡ (เออะของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ + ᩡ |- style="background-color:pink" | 31. || ᨠᩮᩬᩨ (เออของเขิน) || ᨠ + ᩮ + ᩬ + ᩨ |- | 32. || ᨠᩮᩢᩣ (เอาต่ำ) || ᨠ + ᩮ + ᩢ + ᩣ |- | 33. || ᨠᩮᩢᩤ (เอาสูง) || ᨠ + ᩮ + ᩢ + ᩤ |- style="background-color:lightgreen" | 34. 
|| ᨠᩮᩣ (โอต่ำของบาลี) || ᨠ + ᩮ + ᩣ |- style="background-color:lightgreen" | 35. || ᨠᩮᩤ (โอสูงของบาลี) || ᨠ + ᩮ + ᩤ |- style="background-color:pink" | 36. || ᨠᩳ (ออไม่มีตัวสะกดของเขิน) || ᨠ + ᩳ |- style="background-color:lightblue" | 37. || ᨠᩬᩳ (ออไม่มีตัวสะกดของลื้อ/ยอง) || ᨠ + ᩬ + ᩳ |- | 38. || ᨠ᩠ᨿᩮᩡ (เอียะ) || ᨠ + ᩠ + ᨿ + ᩮ + ᩡ |- | 39. || ᨠ᩠ᨿᩮ (เอียไม่มีตัวสะกด) || ᨠ + ᩠ + ᨿ + ᩮ |- | 40. || ᨠ᩠ᨿ (เอียมีตัวสะกด) || ᨠ + ᩠ + ᨿ |} | {| class="wikitable" |- | 41. || ᨠᩮᩬᩥᩋᩡ (เอือะ) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ + ᩡ |- | 42. || ᨠᩮᩬᩥᩋ (เอือไม่มีตัวสะกด) || ᨠ + ᩮ + ᩬ + ᩥ + ᩋ |- | 43. || ᨠᩰᩡ (โอะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩡ |- | 44. || ᨠᩰ (โอไม่มีตัวสะกด) || ᨠ + ᩰ |- | 45. || ᨠᩰᩫ (โอมีตัวสะกด) || ᨠ + ᩰ + ᩫ |- | 46. || ᨠᩰᩬᩡ (เอาะไม่มีตัวสะกด) || ᨠ + ᩰ + ᩬ + ᩡ |- | 47. || ᨠᩬᩢ (เอาะมีตัวสะกด) || ᨠ + ᩬ + ᩢ |- | 48. || ᨠᩱ (ไอ) || ᨠ + ᩱ |- style="background-color:moccasin" | 49. || ᨠᩲ (ใอของคำเมือง) || ᨠ + ᩲ |- style="background:linear-gradient(to bottom, pink 0%, lightblue 100%);" | 50. || ᨠᩭ (ออยของเขิน/ลื้อ/ยอง) || ᨠ + ᩭ |- | 51. || ᨠᩙ (อังอีกแบบหนึ่ง) || ᨠ + ᩙ |- | 52. 
|| ᨠᩥᩴ (อิงอีกแบบหนึ่ง) || ᨠ + ᩥ + ᩴ |} |} [1] ภาษาคำเมือง: คำพิเศษที่ไม่ต้องสะกดตามนี้ ได้แก่ [[ᨣᩴ᩵]] (ก็) และ [[ᨷᩴ᩵]] (บ่, ไม่) ; วรรณยุกต์ # ถ้ามีรูปสระหน้า สระล่าง สระบน ให้ใส่วรรณยุกต์หลังจากสระเหล่านี้ครบแล้ว # ถ้าไม่มีสระ หรือมีแต่สระหลัง สามารถใส่วรรณยุกต์หลังจาก (กลุ่ม) พยัญชนะต้นได้ทันที ; สัญลักษณ์อื่น ๆ # ไม้ซ้ำ ᩻ ใช้งานได้สามอย่าง ได้แก่ #* คำซ้ำ ให้ใส่ไม้ซ้ำที่ท้ายพยางค์หรือคำที่สะกดสำเร็จแล้ว เหมือนไม้ยมก #* อักษรนำและอักษรตาม (อย่างคำเขมร) ให้ใส่ไม้ซ้ำหลังจากอักษรตาม (พยัญชนะตัวที่สอง) ทันที แล้วจึงตามด้วยสระ/วรรณยุกต์ต่อไป #* ใช้เป็นตัวแก้ความกำกวมว่า พยัญชนะสองตัวที่ติดกัน คืออักษรนำและอักษรตาม มิใช่ตัวสะกด (ในกรณีที่สะกดตามปกติแล้วรูปเหมือนกัน) ตำแหน่งที่ใส่เหมือนข้อที่แล้ว # ไม้กั๋งไหล ᩘ ใช้แทน งฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป # ระห้าม ᩺ ใช้แทน รฺ ในคำบาลี/สันสกฤต โดยวางบนพยัญชนะตัวถัดไป หรือใช้แทนทัณฑฆาตที่อยู่ท้ายคำ # การันต์ ᩼ ใช้แทนทัณฑฆาตของเขินและลื้อ ส่วนคำเมืองใช้ ระห้าม ᩺ ; อื่น ๆ # ภาษาเขิน, ภาษาลื้อ, ภาษายอง ที่เขียนด้วยอักษรไทธรรม ใช้อักขรวิธีเดียวกับภาษาคำเมือง เว้นแต่จะกำหนดไว้ในตาราง </div> pwrtbkr9taduu8la2loswbh8m59w7yx ဢဝ် 0 166994 5720726 1893283 2026-04-21T04:49:55Z Ai Ku Karng 17824 /* ภาษาไทใหญ่ */ 5720726 wikitext text/x-wiki {{also/auto}} == ภาษาไทใหญ่ == === รากศัพท์ === {{inh+|shn|tai-pro|*ʔawᴬ}}; ร่วมเชื้อสายกับ{{cog|th|เอา}}, {{cog|nod|ᩐᩣ}}, {{cog|lo|ເອົາ}}, {{cog|khb|ᦀᧁ}}, {{cog|blt|ꪹꪮꪱ}}, {{cog|aho|𑜒𑜧}} หรือ {{m|aho|𑜒𑜧𑜈𑜫}} หรือ {{m|aho|𑜒𑜨𑜧}}, {{cog|za|aeu}}, {{cog|tdd|ᥟᥝ}} === การออกเสียง === {{shn-pron}} === คำกริยา === {{shn-verb}} # [[เอา]] # [[ทำให้]] #: {{ux|shn|'''ဢဝ်'''[[ႁၢႆ]]|'''ทำ'''หาย}} #: {{ux|shn|[[မၼ်း]]'''ဢဝ်'''[[ငိုၼ်း]][[ႁၢႆ]]|มัน'''ทำ'''เงินหาย}} erxsga6hkzdokiirhkhjiewu59h2hoh Chiang Mai 0 169809 5720704 1610303 2026-04-21T01:55:51Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่ในไทย\}\} +|นครในไทย}}) 5720704 wikitext text/x-wiki == ภาษาอังกฤษ == {{wikipedia|lang=en}} === รากศัพท์ === {{bor|en|th|เชียงใหม่}} === คำวิสามานยนาม === {{en-proper noun|head=Chiang Mai}} # 
[[เชียงใหม่]] (ทั้งจังหวัดและเมือง) {{topics|en|จังหวัดในไทย|นครในไทย}} q2p828hw75to4dkl8x3x42gvznvqw6m ᥓᥣᥭᥰ 0 174184 5720738 1647641 2026-04-21T06:27:17Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720738 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-pro|*ʑaːjᴬ||เพศชาย}}; ร่วมเชื้อสายกับ{{cog|th|ชาย}}, {{cog|nod|ᨩᩣ᩠ᨿ}}, {{cog|lo|ຊາຍ}}, {{cog|khb|ᦋᦻ}}, {{cog|shn|ၸၢႆး}}, {{cog|blt|ꪋꪱꪥ}}, {{cog|aho|𑜋𑜩}}, {{cog|za|sai}} === การออกเสียง === * {{IPA|tdd|/t͡saːj˥˧/}} === คำนาม === {{tdd-noun}} # [[ผู้ชาย]], [[ชาย]] oefzwoe88g93w4ws2v5em4oz398ekjj ᥐᥣ 0 174198 5720747 1643221 2026-04-21T06:53:32Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720747 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/kaː˧˧/}} === รากศัพท์ 1 === ==== คำนาม ==== {{tdd-noun}} # [[กา]] (นก) === รากศัพท์ 2 === {{bor+|tdd|ltc|-}} {{ltc-l|價}}; ร่วมเชื้อสายกับ{{cog|th|ค่า}}, {{cog|lo|ຄ່າ}}, {{cog|tts|ค่า}}, {{cog|nod|ᨣ᩵ᩤ}}, {{cog|kkh|ᨣ᩵ᩤ}}, {{cog|khb|ᦅᦱᧈ}}, {{cog|shn|ၵႃႈ}}, {{cog|blt|ꪁ꪿ꪱ}}, {{cog|aho|𑜀𑜠}} ==== คำนาม ==== {{tdd-noun}} # [[ราคา]], [[ค่า]] === รากศัพท์ 3 === แผลงมาจาก {{m|tdd|ᥐᥣᥳ}} ==== คำกริยา ==== {{tdd-verb}} # {{lb|tdd|สกรรม}} [[ค้า]], [[ทำ]][[การค้า]] hcitzqq0njtmo0r3iy9gdl8btxkrfdj ᥛᥤᥰ 0 174210 5720723 1647678 2026-04-21T04:40:16Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720723 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-pro|*miːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|มี}}, {{cog|nod|ᨾᩦ}}, {{cog|kkh|ᨾᩦ}}, {{cog|lo|ມີ}}, {{cog|khb|ᦙᦲ}}, {{cog|blt|ꪣꪲ}}, {{cog|shn|မီး}}, {{cog|za|miz}} === การออกเสียง === * {{IPA|tdd|/miː˥˧/}} === คำกริยา === {{tdd-verb}} # [[มี]] rtufcetbbjts85zkrz89f1wwfubxunz 5720776 5720723 2026-04-21T07:14:18Z Ai Ku Karng 17824 5720776 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-pro|*miːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|มี}}, {{cog|nod|ᨾᩦ}}, {{cog|kkh|ᨾᩦ}}, {{cog|lo|ມີ}}, {{cog|khb|ᦙᦲ}}, {{cog|blt|ꪣꪲ}}, {{cog|shn|မီး}}, {{cog|za|miz}} === การออกเสียง === * {{IPA|tdd|/mi˥˧/}} === คำกริยา === 
{{tdd-verb}} # [[มี]] a2p0d5twwvtvsj9v6x9zwj6qp82t8ee มอดูล:data consistency check 828 211720 5720745 2691044 2026-04-21T06:47:28Z OctraBot 3198 5720745 Scribunto text/plain -- TODO: -- ietf_subtag field used with a 2/3-letter language/family code except qaa-qtz, or a 4-letter script code. -- Check against files containing up-to-date ISO data, to cross-check validity. local export = {} local mw = mw local require = require local string = string local Array = require("Module:array") local m_en_utilities = require("Module:en-utilities") local m_etym_languages_canonical_names = require("Module:etymology languages/canonical names") local m_etym_languages_codes = require("Module:etymology languages/code to canonical name") local m_etym_languages_data = require("Module:etymology languages/data") local m_families = require("Module:families") local m_families_canonical_names = require("Module:families/canonical names") local m_families_codes = require("Module:families/code to canonical name") local m_families_data = require("Module:families/data") local m_languages = require("Module:languages") local m_languages_canonical_names = require("Module:languages/canonical names") local m_languages_codes = require("Module:languages/code to canonical name") local m_languages_data_all = require("Module:languages/data/all") local m_load = require("Module:load") local m_scripts = require("Module:scripts") local m_scripts_canonical_names = require("Module:scripts/canonical names") local m_scripts_codes = require("Module:scripts/code to canonical name") local m_scripts_data = require("Module:scripts/data") local m_str_utils = require("Module:string utilities") local m_table = require("Module:table") local add_indefinite_article = m_en_utilities.add_indefinite_article local codepoint = m_str_utils.codepoint local concat = table.concat local dump = mw.dumpObject local format = string.format local gcodepoint = m_str_utils.gcodepoint local get_data_module_name = 
m_languages.getDataModuleName local get_family_by_code = m_families.getByCode local get_family_by_canonical_name = m_families.getByCanonicalName local get_indefinite_article = m_en_utilities.get_indefinite_article local get_language_by_code = m_languages.getByCode local get_language_by_canonical_name = m_languages.getByCanonicalName local get_script_by_code = m_scripts.getByCode local get_script_by_canonical_name = m_scripts.getByCanonicalName local gmatch = string.gmatch local gsub = string.gsub local insert = table.insert local ipairs = ipairs local is_callable = require("Module:fun").is_callable local is_positive_integer = require("Module:math").is_positive_integer local is_known_language_tag = mw.language.isKnownLanguageTag local isutf8 = mw.ustring.isutf8 local json_decode = mw.text.jsonDecode local language_link = require("Module:links").language_link local list_to_set = m_table.listToSet local list_to_text = mw.text.listToText local load_data = m_load.load_data local log = mw.log local main_loader = package.loaders[2] local make_family = m_families.makeObject local make_lang = m_languages.makeObject local make_script = m_scripts.makeObject local match = string.match local new_title = mw.title.new local next = next local pairs = pairs local pcall = pcall local remove_comments = require("Module:string/removeComments") local safe_require = m_load.safe_require local sorted_pairs = m_table.sortedPairs local split = m_str_utils.split local sub = string.sub local table_len = m_table.length local tag_text = require("Module:script utilities").tag_text local type = type local umatch = m_str_utils.match local unpack = unpack or table.unpack -- Lua 5.2 compatibility local aliases = require("Module:languages/data").aliases local messages local function discrepancy(modname, ...) local success, result = pcall(function(...) messages[modname]:insert(format(...)) end, ...) if not success then log(result, ...) 
end end local messages_mt = {} function messages_mt:__index(k) local val = Array() self[k] = val return val end local all_codes = {} local language_names = {} local etym_language_names = {} local family_names = {} local script_names = {} local nonempty_families = {} local allowed_empty_families = {tbq = true} local nonempty_scripts = {} local function link(obj, code_first) return type(obj) == "string" and obj or code_first and format("<code>%s</code> (%s)", obj:getCode(), obj:makeCategoryLink()) or format("%s (<code>%s</code>)", obj:makeCategoryLink(), obj:getCode()) end local function check_data_keys(...) local valid_keys = Array(...):toSet() return function (modname, obj, data) local invalid_keys for k in pairs(data) do if not valid_keys[k] then if not invalid_keys then invalid_keys = Array(k) else invalid_keys:insert(k) end end end if invalid_keys == nil then return end local plural = #invalid_keys ~= 1 discrepancy(modname, "The data key%s %s for %s %s invalid.", plural and "s" or "", invalid_keys:map(function(key) return "<code>" .. key .. "</code>" end):concat(", "), link(obj), plural and "are" or "is" ) end end -- Modification of isArray in [[Module:table]]. -- This assumes all keys are either integers or non-numbers. -- If there are fractional numbers, the results might be incorrect. -- For instance, find_gap{"a", "b", [0.5] = true} evaluates to 3, but there -- isn't a gap at 3 in the sense of there being an integer key greater than 3. 
local function find_gap(t, can_contain_non_number_keys) local i = 0 for k in pairs(t) do if not (can_contain_non_number_keys and type(k) ~= "number") then i = i + 1 if t[i] == nil then return i end end end end local function check_true_or_string_or_nil(modname, obj, data, key) local field = data[key] if not (field == nil or field == true or type(field) == "string") then discrepancy(modname, "%s has %s <code>%s</code> value that is not <code>nil</code>, <code>true</code> or a string: <code>%s</code>", link(obj), get_indefinite_article(key), key, dump(data[key]) ) end end local function check_array(modname, obj, data, array_name, parent_array_name, can_contain_non_number_keys) local parent_table = data if parent_array_name then parent_table = assert(data[parent_array_name], parent_array_name) parent_array_name = "the <code>" .. parent_array_name .. "</code> field in " else parent_array_name = "" end local array_type = type(parent_table[array_name]) if array_type == "table" then local gap = find_gap(parent_table[array_name], can_contain_non_number_keys) if gap then discrepancy(modname, "The <code>%s</code> array in %sthe data table for %s has a gap at index %d.", array_name, parent_array_name, link(obj), gap ) else return true end else discrepancy(modname, "The <code>%s</code> field in %sthe data table for %s should be an array (table) but is %s.", array_name, parent_array_name, link(obj), array_type == "nil" and "nil" or "a " .. array_type ) end end local function check_no_alias_codes(modname, mod_data) local lookup, discrepancies = {}, {} for k, v in pairs(mod_data) do local check = lookup[v] if check then discrepancies[check] = discrepancies[check] or {"<code>" .. check .. "</code>"} insert(discrepancies[check], "<code>" .. k .. "</code>") else lookup[v] = k end end for _, v in pairs(discrepancies) do discrepancy(modname, "The codes %s are currently alias codes. 
Only one code should be used in the data.", list_to_text(v, ", ", " and ") ) end end local function check_wikidata_item(modname, obj, data, key) local data_item = data[key] if data_item == nil or is_positive_integer(data_item) then return end discrepancy(modname, "%s has a Wikidata item ID that is not a positive integer: <code>%s</code>", link(obj), dump(data_item) ) end local function check_name_field(modname, obj, data, canonical_name, data_key, allow_nested, allow_canonical_name_in_table) local array = data[data_key] if not array then return end check_array(modname, obj, data, data_key, nil, true) local names = {} local function check_other_name(other_name) if not allow_canonical_name_in_table and other_name == canonical_name then discrepancy(modname, "%s has its canonical name (<code>%s</code>) repeated in the table of <code>%s</code>.", link(obj), dump(canonical_name), data_key ) end if names[other_name] then discrepancy(modname, "The name %s is found twice or more in the list of <code>%s</code> for %s.", other_name, data_key, link(obj) ) end names[other_name] = true end for _, other_name in ipairs(array) do if type(other_name) == "table" then if not allow_nested then discrepancy(modname, "A nested table is found in the list of <code>%s</code> for %s, but isn't allowed.", data_key, link(obj) ) else for _, on in ipairs(other_name) do check_other_name(on) end end else check_other_name(other_name) end end end local function check_other_names_aliases_varieties(modname, obj, data, canonical_name) if data.other_names then check_name_field(modname, obj, data, canonical_name, "other_names") end if data.aliases then check_name_field(modname, obj, data, canonical_name, "aliases") end if data.varieties then -- Sometimes a variety legitimately has the same name as the language as a whole, so allow that. 
check_name_field(modname, obj, data, canonical_name, "varieties", "allow_nested", "allow_canonical_name_in_table") end end local function validate_pattern(pattern, modname, obj, standard_chars) if type(pattern) ~= "string" then return discrepancy(modname, "\"%s\", the %spattern for %s, is not a string.", pattern, standard_chars and "standard character " or "", link(obj) ) elseif not isutf8(pattern) then return discrepancy(modname, "%s specifies a pattern for %scharacter detection which is not valid UTF-8: <code>%s</code>", link(obj), standard_chars and "standard " or "", dump(pattern) ) end local ranges for lower, higher in gmatch(pattern, "(.[\128-\191]*)%-%%?(.[\128-\191]*)") do if codepoint(lower) >= codepoint(higher) then ranges = ranges or Array() insert(ranges, { lower, higher }) end end if ranges and ranges[1] then local plural = #ranges ~= 1 and "s" or "" discrepancy(modname, "%s specifies an invalid pattern " .. "for %scharacter detection: <code>%s</code>. The first codepoint%s " .. "in the range%s %s %s must be less than or equal to the second.", link(obj), standard_chars and "standard " or "", dump(pattern), plural, plural, ranges:map(function(range) return format(range[1] .. "-" .. range[2] .. " (U+%X, U+%X)", codepoint(range[1]), codepoint(range[2])) end):concat(", "), #ranges ~= 1 and "are" or "is" ) end local success, result = pcall(umatch, "", "[" .. pattern .. "]") if not success then discrepancy(modname, "%s specifies an invalid pattern for %scharacter detection: <code>%s</code> (%s)", link(obj), standard_chars and "standard " or "", dump(pattern), result ) end end local remove_exceptions_addition = 0xF0000 local maximum_code_point = 0x10FFFF local remove_exceptions_maximum_code_point = maximum_code_point - remove_exceptions_addition -- TODO: check modules exist. -- TODO: validate script codes and check inner tables. 
local function check_replacement_data(modname, obj, data, key, func_name) local replacements = data[key] if replacements == nil then return end local replacements_type = type(replacements) if replacements_type == "string" then local mod = main_loader("Module:" .. replacements) if not mod then discrepancy(modname, "The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which does not exist.", key, link(obj), replacements ) else mod = mod() if not (type(mod) == "table" and is_callable(mod[func_name])) then discrepancy(modname, "The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which exists, but does not contain the expected function <code>%s()</code>.", key, link(obj), replacements, func_name ) end end return elseif replacements_type ~= "table" then discrepancy(modname, "The <code>%s</code> field in the data table for %s must be a string or table, not a %s.", key, link(obj), replacements_type ) return end local from, to = replacements.from, replacements.to if (from ~= nil) ~= (to ~= nil) then discrepancy(modname, "The <code>from</code> and <code>to</code> arrays in the <code>%s</code> table for %s are not both defined or both undefined.", key, link(obj) ) elseif from then for _, k in ipairs {"from", "to"} do check_array(modname, obj, data, k, key) end end local remove_diacritics = replacements.remove_diacritics if not (remove_diacritics == nil or type(remove_diacritics) == "string") then discrepancy(modname, "The <code>remove_diacritics</code> field in the <code>%s</code> table for %s must be a string.", key, link(obj) ) end local remove_exceptions = replacements.remove_exceptions if remove_exceptions then if check_array(modname, obj, data, "remove_exceptions", key) then for sequence_i, sequence in ipairs(remove_exceptions) do local code_point_i = 0 for code_point in gcodepoint(sequence) do code_point_i = code_point_i + 1 if code_point > remove_exceptions_maximum_code_point then 
discrepancy(modname, "Code point #%d (0x%04X) in field #%d of the <code>remove_exceptions</code> array for %s is over U+%04X.", code_point_i, code_point, sequence_i, link(obj), remove_exceptions_maximum_code_point ) end end end end end if from and to and table_len(to) > table_len(from) then discrepancy(modname, "The <code>to</code> array in the <code>%s</code> table for %s must be shorter than or the same length as the <code>from</code> array.", key, link(obj) ) end end local function check_replacements_data(modname, obj, data) for _, replacement_spec in ipairs{ {"translit", "tr"}, {"display_text", "makeDisplayText"}, {"strip_diacritics", "stripDiacritics"}, {"sort_key", "makeSortKey"}, } do check_replacement_data(modname, obj, data, unpack(replacement_spec)) end end local function has_ancestor(lang, code) for _, anc in ipairs(lang:getAncestors()) do if code == anc:getCode() or has_ancestor(anc, code) then return true end end end local function get_default_ancestors(lang) if lang:hasType("language", "etymology-only") then local parent = lang:getParent() if not has_ancestor(parent, lang:getCode()) then return parent:getAncestorCodes() end end local fam_code, def_anc = lang:getFamilyCode() while fam_code and fam_code ~= "qfa-not" do local fam = m_families_data[fam_code] def_anc = fam.protoLanguage or m_languages_data_all[fam_code .. "-pro"] and fam_code .. "-pro" or m_etym_languages_data[fam_code .. "-pro"] and fam_code .. 
"-pro" if def_anc and def_anc ~= lang:getCode() then return {def_anc} end fam_code = fam[3] end end local function iterate_ancestor(obj, modname, anc_code) local anc = get_language_by_code(anc_code, nil, true) if not anc then discrepancy(modname, "%s lists the invalid language code <code>%s</code> as its ancestor.", link(obj), dump(anc_code) ) return end local anc_fam = anc:getFamily() if not anc_fam then discrepancy(modname, "%s has no family.", link(anc) ) return end local anc_fam_code = anc_fam:getCode() local def_ancs = get_default_ancestors(obj) if def_ancs then for _, def_anc in ipairs(def_ancs) do def_anc = get_language_by_code(def_anc, nil, true) if def_anc and ( anc_code == def_anc:getCode() or has_ancestor(def_anc, anc_code) or def_anc:hasParent(anc_code) and not has_ancestor(anc, def_anc:getCode()) ) then discrepancy(modname, "%s has the ancestor %s listed in its ancestor field, which is redundant, since it is determined to be ancestral automatically.", link(obj), link(anc) ) end end end if not obj:inFamily(anc_fam_code) then discrepancy(modname, "%s has %s set as an ancestor, but is not in the %s.", link(obj), link(anc), link(anc_fam) ) end local fam, proto = obj repeat fam = fam:getFamily() proto = fam and fam:getProtoLanguage() until proto or not fam or fam:getCode() == "qfa-not" if proto and not ( proto:getCode() == anc:getCode() or proto:hasAncestor(anc:getCode()) or anc:hasAncestor(proto:getCode()) ) then local fam = obj:getFamily() discrepancy(modname, "%s is in the %s and has %s set as an ancestor, but it is not possible to form an ancestral chain between them.", link(obj), link(fam), link(anc) ) end end local function check_ancestors(modname, obj, data) local ancestors = data.ancestors if ancestors == nil then return end local ancestors_type = type(ancestors) if ancestors_type == "string" then ancestors = split(ancestors, ",", true, true) elseif ancestors_type ~= "table" then discrepancy(modname, "The <code>ancestors</code> field in the data 
table for %s must be a string or table, not a %s.", link(obj), ancestors_type ) return end for _, anc in ipairs(ancestors) do iterate_ancestor(obj, modname, anc) end end local function check_wikimedia_codes(modname, obj, data) local wikimedia_codes = data.wikimedia_codes if wikimedia_codes == nil then return end local wikimedia_codes_type = type(wikimedia_codes) if wikimedia_codes_type == "string" then wikimedia_codes = split(wikimedia_codes, ",", true, true) elseif wikimedia_codes_type ~= "table" then discrepancy(modname, "The <code>wikimedia_codes</code> field in the data table for %s must be a string or table, not a %s.", link(obj), wikimedia_codes_type ) return end for _, code in ipairs(wikimedia_codes) do if not is_known_language_tag(code) then discrepancy(modname, "%s lists the invalid Wikimedia code <code>%s</code> in the <code>wikimedia_codes</code> field.", link(obj), dump(code) ) end end end local function check_code_to_name_and_name_to_code_maps( source_module_type, source_module_description, code_to_module_map, name_to_code_map, code_to_name_modname, code_to_name_module, name_to_code_modname, name_to_code_module ) local function check_code_and_name(modname, code, canonical_name) -- Check the code is in code_to_module_map and that it didn't originate from the wrong data module. local check_mod = code_to_module_map[code] or code_to_module_map[aliases[code]] if not (check_mod and match(check_mod, "^" .. source_module_type .. "/data")) then if not name_to_code_map[canonical_name] then discrepancy(modname, "The code <code>%s</code> and the canonical name %s should be removed; they are not found in %s.", code, canonical_name, source_module_description ) else discrepancy(modname, "<code>%s</code>, the code for the canonical name %s, is wrong; it should be <code>%s</code>.", code, canonical_name, name_to_code_map[canonical_name] ) end elseif not name_to_code_map[canonical_name] then local data_table = require("Module:" .. 
code_to_module_map[code])[code] discrepancy(modname, "%s, the canonical name for the code <code>%s</code>, is wrong; it should be %s.", canonical_name, code, data_table[1] ) end end for code, canonical_name in pairs(code_to_name_module) do check_code_and_name(code_to_name_modname, code, canonical_name) end for canonical_name, code in pairs(name_to_code_module) do check_code_and_name(name_to_code_modname, code, canonical_name) end end local function check_extraneous_extra_data( data_modname, data_module, extra_data_modname, extra_data_module) for code, _ in pairs(extra_data_module) do if not data_module[code] then discrepancy(extra_data_modname, "The code <code>%s</code> is not found in [[Module:%s]], and should be removed from [[Module:%s]].", code, data_modname, extra_data_modname ) end end end -- TODO: add collision check between the canonical names "X" and "X [Ll]anguage". local function check_languages(frame) local check_language_data_keys = check_data_keys( 1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts "display_text", "generate_forms", "strip_diacritics", "sort_key", "other_names", "aliases", "varieties", "ietf_subtag", "type", "ancestors", "pseudo_families", "wikimedia_codes", "wikipedia_article", "standard_chars", "translit", "override_translit", "link_tr", "dotted_dotless_i" ) local function check_language(modname, code, data, extra_modname, extra_data) local obj, code_modname, canonical_name = make_lang(code, data, true), get_data_module_name(code), data[1] -- FIXME: this module should use the prefixed module name throughout. 
code_modname = code_modname:gsub("^Module:", "") if code_modname ~= modname then if code_modname == "languages/data/2" then discrepancy(modname, "%s is a two-letter code, so should be moved to [[Module:%s]].", link(obj), code_modname ) elseif code_modname == "languages/data/exceptional" then discrepancy(modname, "%s is an exceptional code, as it does not consist of two or three lowercase letters, so should be moved to [[Module:%s]].", link(obj), code_modname ) else discrepancy(modname, "%s is a three-letter code beginning with '%s', so should be moved to [[Module:%s]].", link(obj), sub(code, 1, 1), code_modname ) end end check_language_data_keys(modname, obj, data) if all_codes[code] then discrepancy(modname, "The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].", code, all_codes[code] ) else if not m_languages_codes[code] then discrepancy("languages/code to canonical name", "The code %s is missing.", link(obj, true) ) end all_codes[code] = modname end -- TODO: these checks should be consolidated with the proto-language checks in the family data, -- since bad settings there affect the warnings here (e.g. xxx-pro assigned to yyy when xxx also -- doesn't exist - a warning that xxx has "no family" would be misleading). if sub(code, -4) == "-pro" then local fam_code = sub(code, 1, -5) local fam = get_language_by_code(fam_code, nil, true, true) if not fam then discrepancy(modname, "'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, which doesn't exist.", link(obj), dump(fam_code) ) elseif not fam:hasType("family") then discrepancy(modname, "'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, but %s is not a family.", link(obj), dump(fam_code), link(fam) ) else -- Reinstate this as low-priority once message priorities have been implemented. -- local expected_name = "Proto-" .. 
fam:getCanonicalName() -- if canonical_name ~= expected_name then -- discrepancy(modname, -- "%s does not have the expected name \"%s\", even though it is the proto-language of the %s.", -- link(obj), expected_name, link(fam) -- ) -- end end end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif language_names[canonical_name] then local canonical_lang = get_language_by_canonical_name(canonical_name) if not canonical_lang then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_lang:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), language_names[canonical_name] ) end else if not m_languages_canonical_names[canonical_name] then discrepancy("languages/canonical names", "The canonical name %s is missing.", link(obj) ) end language_names[canonical_name] = code end check_wikidata_item(modname, obj, data, 2) if extra_data then check_other_names_aliases_varieties(modname, obj, extra_data, canonical_name) end local lang_type = data.type if lang_type and not (lang_type == "regular" or lang_type == "reconstructed" or lang_type == "appendix-constructed") then discrepancy(modname, "%s is of the invalid type <code>%s</code>.", link(obj), lang_type ) end if data.aliases then discrepancy(modname, "%s has an <code>aliases</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if data.varieties then discrepancy(modname, "%s has the <code>varieties</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if data.other_names then discrepancy(modname, "%s has the <code>other_names</code> key in [[Module:%s]]. 
This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if not extra_data then discrepancy(extra_modname, "%s has data in [[Module:%s]], but does not have corresponding data in [[Module:%s]].", link(obj), modname, extra_modname ) --[[elseif extra_data.other_names then discrepancy(extra_modname, "%s has <code>other_names</code> key, but these should be changed to either <code>aliases</code> or <code>varieties</code>.", link(obj) )]] end local sc = data[4] if sc then if type(sc) == "string" then sc = split(sc, "%s*,%s*", true) end if type(sc) == "table" then if not sc[1] then discrepancy(modname, "%s has no scripts listed.", link(obj) ) else for _, sccode in ipairs(sc) do local cur_sc = m_scripts_data[sccode] if not (cur_sc or sccode == "All" or sccode == "Hants") then discrepancy(modname, "%s lists the invalid script code <code>%s</code>.", link(obj), dump(sccode) ) --[[elseif not cur_sc.characters then discrepancy(modname, "%s lists the %s, which does not have any characters.", link(obj), link(get_script_by_code(sccode)) )]] end nonempty_scripts[sccode] = true end end else discrepancy(modname, "The %s field for %s must be a table or string.", 4, link(obj) ) end end if data.ancestors then check_ancestors(modname, obj, data) end if data.wikimedia_codes then check_wikimedia_codes(modname, obj, data) end if data[3] then local family = data[3] if not m_families_data[family] then discrepancy(modname, "%s has the invalid family code <code>%s</code>.", link(obj), dump(family) ) end nonempty_families[family] = true end check_replacements_data(modname, obj, data) if data.standard_chars then if type(data.standard_chars) == "table" then local sccodes = {} for _, sccode in ipairs(sc) do sccodes[sccode] = true end for sccode in pairs(data.standard_chars) do if not (sccodes[sccode] or sccode == 1) then discrepancy(modname, "The field %s in the <code>standard_chars</code> table for %s does not match any script for that language.", sccode, link(obj) ) end end 
elseif data.standard_chars and type(data.standard_chars) ~= "string" then discrepancy(modname, "The <code>standard_chars</code> field in the data table for %s must be a string or table.", link(obj) ) end end check_true_or_string_or_nil(modname, obj, data, "override_translit") check_true_or_string_or_nil(modname, obj, data, "link_tr") -- This doesn't apply any more since scripts can be script-wide translit methods. -- if data.override_translit and not data.translit then -- discrepancy(modname, -- "%s has the <code>override_translit</code> field set, but no transliteration module", -- link(obj) -- ) -- end end local function check_module(modname) local mod_data = load_data("Module:" .. modname) local extra_modname = modname .. "/extra" local extra_mod_data = load_data("Module:" .. extra_modname) for code, data in pairs(mod_data) do check_language(modname, code, data, extra_modname, extra_mod_data[code]) end check_no_alias_codes(modname, mod_data) check_no_alias_codes(extra_modname, extra_mod_data) check_extraneous_extra_data(modname, mod_data, extra_modname, extra_mod_data) end -- Check two-letter codes check_module( "languages/data/2" ) -- Check three-letter codes for i = 0x61, 0x7A do -- a to z check_module( format("languages/data/3/%c", i) ) end -- Check exceptional codes check_module( "languages/data/exceptional" ) -- These checks must be done while all_codes only contains language codes: -- that is, after language data modules have been processed, but before -- etymology languages, families, and scripts have. 
check_code_to_name_and_name_to_code_maps( "languages", "a submodule of [[Module:languages]]", all_codes, language_names, "languages/code to canonical name", m_languages_codes, "languages/canonical names", m_languages_canonical_names ) -- Check [[Template:langname-lite]] local modname = "Template:langname-lite" for code, name in gmatch(remove_comments(new_title(modname):getContent()), "\n\t*|#*([^\n]+)=([^\n]*)") do if #code > 1 and code ~= "default" then for _, code in pairs(split(code, "|", true)) do local lang = get_language_by_code(code, nil, true, true) if match(name, "etymcode") then local nonEtym_name = frame:preprocess(name) local nonEtym_real_name = lang:getFullName() if nonEtym_name ~= nonEtym_real_name then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. Expected name: %s.", code, nonEtym_name, nonEtym_real_name ) end name = frame:preprocess(gsub(name, "{{{allow etym|}}}", "1")) elseif match(name, "familycode") then name = match(name, "familycode|(.-)|") else name = name end if not lang then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. Language not present in data.", code, name ) else local real_name = lang:getCanonicalName() if name ~= real_name then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. 
Expected name: %s.", code, name, real_name ) end end end end end end local function check_etym_languages() local modname = "etymology languages/data" local check_etymology_language_data_keys = check_data_keys( 1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts "parent", "display_text", "generate_forms", "strip_diacritics", "sort_key", "other_names", "aliases", "varieties", "ietf_subtag", "type", "main_code", "ancestors", "pseudo_families", "wikimedia_codes", "wikipedia_article", "standard_chars", "translit", "override_translit", "link_tr", "dotted_dotless_i" ) local checked = {} for code, data in pairs(m_etym_languages_data) do local obj, canonical_name, parent = make_lang(code, data, true), data[1], data.parent check_etymology_language_data_keys(modname, obj, data) if all_codes[code] then discrepancy(modname, "The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].", code, all_codes[code] ) else if not m_etym_languages_codes[code] then discrepancy("etymology languages/code to canonical name", "The code %s is missing.", link(obj, true) ) end all_codes[code] = modname end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif language_names[canonical_name] then local canonical_lang = get_language_by_canonical_name(canonical_name, nil, true) if not canonical_lang then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_lang:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), language_names[canonical_name] ) end else if not m_etym_languages_canonical_names[canonical_name] then discrepancy("etymology languages/canonical names", "The canonical name %s is missing.", link(obj) ) end etym_language_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if parent then if 
type(parent) ~= "string" then discrepancy(modname, "%s has a parent code that is %s rather than a string.", link(obj), parent == nil and "nil" or "a " .. type(parent) ) elseif not (m_languages_data_all[parent] or m_etym_languages_data[parent]) then discrepancy(modname, "%s has the invalid parent code <code>%s</code>%s.", link(obj), dump(parent), m_families_data[parent] and " (a family code)" or "" ) end nonempty_families[parent] = true else discrepancy(modname, "%s has no parent code.", link(obj) ) end if data.ancestors then check_ancestors(modname, obj, data) end if data.wikimedia_codes then check_wikimedia_codes(modname, obj, data) end if data[3] then local family = data[3] if not m_families_data[family] then discrepancy(modname, "%s has the invalid family code <code>%s</code>.", link(obj), dump(family)) end nonempty_families[family] = true end check_replacements_data(modname, obj, data) check_wikidata_item(modname, obj, data, 2) local stack = {} while data do if checked[code] then break elseif stack[code] then local parent = data.parent discrepancy(modname, "%s has a cyclic parental relationship to %s", link(make_lang(code, data, true)), link(get_language_by_code(parent, nil, true)) ) break end stack[code] = true code = data.parent data = m_etym_languages_data[code] end for code in pairs(stack) do checked[code] = true end end check_no_alias_codes(modname, m_etym_languages_data) check_code_to_name_and_name_to_code_maps( "etymology languages", "[[Module:etymology languages/data]]", all_codes, etym_language_names, "etymology languages/code to canonical name", m_etym_languages_codes, "etymology languages/canonical names", m_etym_languages_canonical_names) end -- TODO: add collision check between the canonical names "X" and "X [Ll]anguages". 
local function check_families() local modname = "families/data" local check_family_data_keys = check_data_keys( 1, 2, 3, -- canonical name, Wikidata item, (parent) family "type", "ietf_subtag", "protoLanguage", "other_names", "aliases", "varieties", "pseudo_families", "categoryName" ) local checked, double_check_if_empty = {["qfa-not"] = true}, {} for code, data in pairs(m_families_data) do local obj, canonical_name, family, protolang = make_family(code, data), data[1], data[3], data.protoLanguage check_family_data_keys(modname, obj, data) if all_codes[code] then discrepancy(modname, "The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].", code, all_codes[code] ) else if not m_families_codes[code] then discrepancy("families/code to canonical name", "The code %s is missing.", link(obj, true) ) end all_codes[code] = modname end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif family_names[canonical_name] then local canonical_family = get_family_by_canonical_name(canonical_name) if not canonical_family then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_family:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), family_names[canonical_name] ) end else if not m_families_canonical_names[canonical_name] then discrepancy("families/canonical names", "The canonical name %s is missing.", link(obj) ) end family_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if family then if family == code and code ~= "qfa-not" then discrepancy(modname, "%s has itself as its family.", link(obj) ) elseif not m_families_data[family] then discrepancy(modname, "%s has the invalid parent family code <code>%s</code>.", link(obj), dump(family) ) end nonempty_families[family] = true end if 
protolang then local protolang_obj = get_language_by_code(protolang, nil, true) if not protolang_obj then discrepancy(modname, "%s has the invalid proto-language code <code>%s</code>.", link(obj), dump(protolang) ) elseif protolang == code .. "-pro" then discrepancy(modname, "%s has %s listed as its proto-language, which is redundant, since it is determined to be the proto-language automatically.", link(obj), link(protolang_obj) ) elseif sub(protolang, -4) == "-pro" then discrepancy(modname, "%s has %s listed as its proto-language, which is supposed to be the proto-language for the family <code>%s</code>.", link(obj), link(protolang_obj), sub(protolang, 1, -5) ) end end check_wikidata_item(modname, obj, data, 2) -- Could be a false-positive if a child family occurs on a later -- iteration, so set aside any that fail for a second check. This avoids -- having to iterate through the whole list of families once -- nonempty_families has been fully populated. if not (nonempty_families[code] or allowed_empty_families[code]) then double_check_if_empty[code] = obj end local stack = {} while data do if checked[code] then break elseif stack[code] then local parent = data[3] discrepancy(modname, "%s has a cyclic familial relationship to %s", link(make_family(code, data)), link(get_family_by_code(parent)) ) break end stack[code] = true code = data[3] data = m_families_data[code] end for code in pairs(stack) do checked[code] = true end end -- Any languages set aside as candidates for having no children are checked -- again, now that nonempty_families is definitely complete. 
for code, obj in next, double_check_if_empty do if not (nonempty_families[code] or allowed_empty_families[code]) then discrepancy(modname, "%s has no child families or languages.", link(obj) ) end end check_no_alias_codes(modname, m_families_data) check_code_to_name_and_name_to_code_maps( "families", "[[Module:families/data]]", all_codes, family_names, "families/code to canonical name", m_families_codes, "families/canonical names", m_families_canonical_names) end -- TODO: add collision check between the canonical names "X" and "X [Ss]cript". local function check_scripts() local modname = "scripts/data" local check_script_data_keys = check_data_keys( 1, 2, 3, -- canonical name, Wikidata item, writing systems "other_names", "aliases", "varieties", "parent", "ietf_subtag", "type", "wikipedia_article", "ranges", "characters", "spaces", "capitalized", "translit", "direction", "character_category", "normalizationFixes", "sort_by_scraping", "display_text", "sort_key", "strip_diacritics" ) -- Just to satisfy requirements of check_code_to_name_and_name_to_code_maps. 
local script_code_to_module_map = {} for code, data in pairs(m_scripts_data) do local obj, canonical_name = make_script(code, data), data[1] if not m_scripts_codes[code] and #code == 4 then discrepancy("scripts/code to canonical name", "The code %s is missing", link(obj, true) ) end check_script_data_keys(modname, obj, data) if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif script_names[canonical_name] then local canonical_script = get_script_by_canonical_name(canonical_name) if not canonical_script then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) --[[elseif data.main_code ~= canonical_script:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), script_names[canonical_name] )]] end else if not m_scripts_canonical_names[canonical_name] and #code == 4 then discrepancy("scripts/canonical names", "The canonical name %s is missing.", link(obj) ) end script_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if not nonempty_scripts[code] then discrepancy(modname, "%s is not used by any language%s.", link(obj), data.characters and "" or " and has no characters listed for auto-detection") --[[elseif not data.characters then discrepancy(modname, "%s has no characters listed for auto-detection.", link(obj) )--]] end if data.characters then validate_pattern(data.characters, modname, obj, false) end check_wikidata_item(modname, obj, data, 2) script_code_to_module_map[code] = modname end check_no_alias_codes(modname, m_scripts_data) check_code_to_name_and_name_to_code_maps( "scripts", "a submodule of [[Module:scripts]]", script_code_to_module_map, script_names, "scripts/code to canonical name", m_scripts_codes, "scripts/canonical names", m_scripts_canonical_names) end -- FIXME: this is quite messy. 
local function check_wikidata_languages() local data = json_decode(new_title("Module:languages/data/wikidata.json"):getContent()) local seen = {{}, {}, {}, [5] = {}} for _, item in ipairs(data) do local id = item.id for k, v in pairs(item) do if k ~= "id" then local _seen = seen[k] for _, code in ipairs(v) do local _code = code[1] local _type = type(_seen[_code]) if _type == "table" then insert(_seen[_code], id) elseif _type == "string" then _seen[_code] = {_seen[_code], id} else _seen[_code] = id end end end end end local modname = "languages/data/wikidata.json" for k, v in pairs(seen) do for code, ids in pairs(v) do if type(ids) == "table" then local t = {} for i, id in ipairs(ids) do t[i] = format("<code>[[d:%s|%s]]</code>", id, id) end discrepancy(modname, "<code>%s</code> is set as an ISO 639-%d code on multiple items: %s.", code, k, list_to_text(t) ) end end end end local function check_labels() local check_label_data_keys = check_data_keys( "display", "Wikipedia", "glossary", "plain_categories", "topical_categories", "pos_categories", "regional_categories", "sense_categories", "omit_preComma", "omit_postComma", "omit_preSpace", "deprecated", "track" ) local function check_label(modname, code, data) local _type = type(data) if _type == "table" then check_label_data_keys(modname, code, data) elseif _type ~= "string" then discrepancy(modname, "The data for the label <code>%s</code> is %s; only tables and strings are allowed.", code, add_indefinite_article(_type) ) end end for _, module in ipairs{"", "/regional", "/topical"} do local modname = "Module:labels/data" .. module module = require(modname) for label, data in pairs(module) do check_label(modname, label, data) end end for code in pairs(m_languages_codes) do local modname = "Module:labels/data/lang/" .. 
code local module = safe_require(modname) if module then for label, data in pairs(module) do check_label(modname, label, data) end end end end local function check_zh_trad_simp() local m_ts = require("Module:zh/data/ts") local m_st = require("Module:zh/data/st") local ruby = require("Module:ja-ruby").ruby_auto local lang = get_language_by_code("zh") local Hant = get_script_by_code("Hant") local Hans = get_script_by_code("Hans") local data = {[0] = m_st, m_ts} local mod = {[0] = "st", "ts"} local var = {[0] = "Simp.", "Trad."} local sc = {[0] = Hans, Hant} local function find_stable_loop(chars, other, j) local display = ruby({["markup"] = "[" .. other .. "](" .. var[(j+1)%2] .. ")"}) display = language_link{term = other, alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"} insert(chars, display) if data[(j+1)%2][other] == other then insert(chars, other) return chars, 1 elseif not data[(j+1)%2][other] then insert(chars, "not found") return chars, 2 elseif data[j%2][data[(j+1)%2][other]] ~= other then return find_stable_loop(chars, data[(j+1)%2][other], j + 1) else local display = ruby({["markup"] = "[" .. data[(j+1)%2][other] .. "](" .. var[j%2] .. ")"}) display = language_link{term = data[(j+1)%2][other], alt = display, lang = lang, sc = sc[j%2], tr = "-"} insert(chars, display .. " (") display = ruby({["markup"] = "[" .. data[j%2][data[(j+1)%2][other]] .. "](" .. var[(j+1)%2] .. ")"}) display = language_link{term = data[j%2][data[(j+1)%2][other]], alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"} insert(chars, display .. " etc.)") return chars, 3 end return chars end for i = 0, 1, 1 do for ch, other_ch in pairs(data[i]) do if data[(i+1)%2][other_ch] ~= ch then local chars, issue = {} local display = ruby({["markup"] = "[" .. ch .. "](" .. var[i] .. 
")"}) display = language_link{term = ch, alt = display, lang = lang, sc = sc[i], tr = "-"} insert(chars, display) chars, issue = find_stable_loop(chars, other_ch, i) if issue == 1 or issue == 2 then local sc_this, mod_this, j = {} if match(chars[#chars-1], var[(i+1)%2]) then j = 1 else j = 0 end mod_this = mod[(i+j)%2] sc_this = {[0] = sc[(i+j)%2], sc[(i+j+1)%2]} for k, ch in ipairs(chars) do chars[k] = tag_text(ch, lang, sc_this[k%2], "term") end local modname = "zh/data/" .. mod_this if issue == 1 then discrepancy(modname, "character references itself: %s", concat(chars, " → ") ) elseif issue == 2 then discrepancy(modname, "missing character: %s", concat(chars, " → ") ) end elseif issue == 3 then for j, ch in ipairs(chars) do chars[j] = tag_text(ch, lang, sc[(i+j)%2], "term") end discrepancy("zh/data/" .. mod[i], "possible mismatched character: %s", concat(chars, " → ") ) end end end end end local function check_serialization(modname) local serializers = { ["Hani-sortkey/data/serialized"] = "Hani-sortkey/serializer", } if not serializers[modname] then return nil end local serializer = serializers[modname] local current_data = require("Module:" .. serializer).main(true) local stored_data = require("Module:" .. modname) if current_data ~= stored_data then discrepancy(modname, "<strong><u>Important!</u> Serialized data is out of sync. Use [[Module:%s]] to update it. 
If you have made any changes to the underlying data, the serialized data <u>must</u> be updated before these changes will take effect.</strong>", serializer ) end end local find_code = require("Module:memoize")(function(message) return match(message, "<code>([^<]+)</code>") end) local function compare_messages(message1, message2) local code1, code2 = find_code(message1), find_code(message2) if code1 and code2 then return code1 < code2 else return message1 < message2 end end -- Warning: cannot be called twice in the same module invocation because -- some module-global variables are not reset between calls. local function do_checks(frame, modules) messages = setmetatable({}, messages_mt) if modules["zh/data/ts"] or modules["zh/data/st"] then check_zh_trad_simp() end check_languages(frame) check_etym_languages() -- families and scripts must be checked AFTER languages; languages checks fill out -- the nonempty_families and nonempty_scripts tables, used for testing if a family/script -- is ever used in the data check_families() check_scripts() check_wikidata_languages() if modules["labels/data"] then check_labels() end for module in pairs(modules) do check_serialization(module) end setmetatable(messages, nil) for _, msglist in pairs(messages) do msglist:sort(compare_messages) end local ret = messages messages = nil return ret end local function format_message(modname, msglist) local header; if match(modname, "^Module:") or match(modname, "^Template:") then header = "===[[" .. modname .. "]]===" else header = "===[[Module:" .. modname .. "]]===" end return header .. msglist:map(function(msg) return "\n* " .. 
msg end):concat() end function export.check_modules_t(frame) local args = frame.args local modules = list_to_set(args) local ret = Array() local messages = do_checks(frame, modules) for _, module in ipairs(args) do local msglist = messages[module] if msglist then ret:insert(format_message(module, msglist)) end end return ret:concat("\n") end function export.perform(frame) local messages = do_checks(frame, {}) -- Format the messages local ret = Array() for modname, msglist in sorted_pairs(messages) do ret:insert(format_message(modname, msglist)) end -- Are there any messages? -- TODO: check how many messages there are. if false then --if i == 1 then return "<b class=\"success\">Glory to Arstotzka.</b>" else ret:insert(1, "<b class=\"warning\">Discrepancies detected:</b>") return ret:concat("\n") end end return export gl7cr7quegesqg20o8ga2k3wf0lzrq0 5720748 5720745 2026-04-21T06:53:40Z OctraBot 3198 5720748 Scribunto text/plain -- TODO: -- ietf_subtag field used with a 2/3-letter language/family code except qaa-qtz, or a 4-letter script code. -- Check against files containing up-to-date ISO data, to cross-check validity. 
local export = {} local mw = mw local require = require local string = string local Array = require("Module:array") local m_en_utilities = require("Module:en-utilities") local m_etym_languages_canonical_names = require("Module:etymology languages/canonical names") local m_etym_languages_codes = require("Module:etymology languages/code to canonical name") local m_etym_languages_data = require("Module:etymology languages/data") local m_families = require("Module:families") local m_families_canonical_names = require("Module:families/canonical names") local m_families_codes = require("Module:families/code to canonical name") local m_families_data = require("Module:families/data") local m_languages = require("Module:languages") local m_languages_canonical_names = require("Module:languages/canonical names") local m_languages_codes = require("Module:languages/code to canonical name") local m_languages_data_all = require("Module:languages/data/all") local m_load = require("Module:load") local m_scripts = require("Module:scripts") local m_scripts_canonical_names = require("Module:scripts/canonical names") local m_scripts_codes = require("Module:scripts/code to canonical name") local m_scripts_data = require("Module:scripts/data") local m_str_utils = require("Module:string utilities") local m_table = require("Module:table") local add_indefinite_article = m_en_utilities.add_indefinite_article local codepoint = m_str_utils.codepoint local concat = table.concat local dump = mw.dumpObject local format = string.format local gcodepoint = m_str_utils.gcodepoint local get_data_module_name = m_languages.getDataModuleName local get_family_by_code = m_families.getByCode local get_family_by_canonical_name = m_families.getByCanonicalName local get_indefinite_article = m_en_utilities.get_indefinite_article local get_language_by_code = m_languages.getByCode local get_language_by_canonical_name = m_languages.getByCanonicalName local get_script_by_code = m_scripts.getByCode local 
get_script_by_canonical_name = m_scripts.getByCanonicalName local gmatch = string.gmatch local gsub = string.gsub local insert = table.insert local ipairs = ipairs local is_callable = require("Module:fun").is_callable local is_positive_integer = require("Module:math").is_positive_integer local is_known_language_tag = mw.language.isKnownLanguageTag local isutf8 = mw.ustring.isutf8 local json_decode = mw.text.jsonDecode local language_link = require("Module:links").language_link local list_to_set = m_table.listToSet local list_to_text = mw.text.listToText local load_data = m_load.load_data local log = mw.log local main_loader = package.loaders[2] local make_family = m_families.makeObject local make_lang = m_languages.makeObject local make_script = m_scripts.makeObject local match = string.match local new_title = mw.title.new local next = next local pairs = pairs local pcall = pcall local remove_comments = require("Module:string/removeComments") local safe_require = m_load.safe_require local sorted_pairs = m_table.sortedPairs local split = m_str_utils.split local sub = string.sub local table_len = m_table.length local tag_text = require("Module:script utilities").tag_text local type = type local umatch = m_str_utils.match local unpack = unpack or table.unpack -- Lua 5.2 compatibility local aliases = require("Module:languages/data").aliases local messages local function discrepancy(modname, ...) local success, result = pcall(function(...) messages[modname]:insert(format(...)) end, ...) if not success then log(result, ...) 
end end local messages_mt = {} function messages_mt:__index(k) local val = Array() self[k] = val return val end local all_codes = {} local language_names = {} local etym_language_names = {} local family_names = {} local script_names = {} local nonempty_families = {} local allowed_empty_families = {tbq = true} local nonempty_scripts = {} local function link(obj, code_first) return type(obj) == "string" and obj or code_first and format("<code>%s</code> (%s)", obj:getCode(), obj:makeCategoryLink()) or format("%s (<code>%s</code>)", obj:makeCategoryLink(), obj:getCode()) end local function check_data_keys(...) local valid_keys = Array(...):toSet() return function (modname, obj, data) local invalid_keys for k in pairs(data) do if not valid_keys[k] then if not invalid_keys then invalid_keys = Array(k) else invalid_keys:insert(k) end end end if invalid_keys == nil then return end local plural = #invalid_keys ~= 1 discrepancy(modname, "The data key%s %s for %s %s invalid.", plural and "s" or "", invalid_keys:map(function(key) return "<code>" .. key .. "</code>" end):concat(", "), link(obj), plural and "are" or "is" ) end end -- Modification of isArray in [[Module:table]]. -- This assumes all keys are either integers or non-numbers. -- If there are fractional numbers, the results might be incorrect. -- For instance, find_gap{"a", "b", [0.5] = true} evaluates to 3, but there -- isn't a gap at 3 in the sense of there being an integer key greater than 3. 
local function find_gap(t, can_contain_non_number_keys) local i = 0 for k in pairs(t) do if not (can_contain_non_number_keys and type(k) ~= "number") then i = i + 1 if t[i] == nil then return i end end end end local function check_true_or_string_or_nil(modname, obj, data, key) local field = data[key] if not (field == nil or field == true or type(field) == "string") then discrepancy(modname, "%s has %s <code>%s</code> value that is not <code>nil</code>, <code>true</code> or a string: <code>%s</code>", link(obj), get_indefinite_article(key), key, dump(data[key]) ) end end local function check_array(modname, obj, data, array_name, parent_array_name, can_contain_non_number_keys) local parent_table = data if parent_array_name then parent_table = assert(data[parent_array_name], parent_array_name) parent_array_name = "the <code>" .. parent_array_name .. "</code> field in " else parent_array_name = "" end local array_type = type(parent_table[array_name]) if array_type == "table" then local gap = find_gap(parent_table[array_name], can_contain_non_number_keys) if gap then discrepancy(modname, "The <code>%s</code> array in %sthe data table for %s has a gap at index %d.", array_name, parent_array_name, link(obj), gap ) else return true end else discrepancy(modname, "The <code>%s</code> field in %sthe data table for %s should be an array (table) but is %s.", array_name, parent_array_name, link(obj), array_type == "nil" and "nil" or "a " .. array_type ) end end local function check_no_alias_codes(modname, mod_data) local lookup, discrepancies = {}, {} for k, v in pairs(mod_data) do local check = lookup[v] if check then discrepancies[check] = discrepancies[check] or {"<code>" .. check .. "</code>"} insert(discrepancies[check], "<code>" .. k .. "</code>") else lookup[v] = k end end for _, v in pairs(discrepancies) do discrepancy(modname, "The codes %s are currently alias codes. 
Only one code should be used in the data.", list_to_text(v, ", ", " and ") ) end end local function check_wikidata_item(modname, obj, data, key) local data_item = data[key] if data_item == nil or is_positive_integer(data_item) then return end discrepancy(modname, "%s has a Wikidata item ID that is not a positive integer: <code>%s</code>", link(obj), dump(data_item) ) end local function check_name_field(modname, obj, data, canonical_name, data_key, allow_nested, allow_canonical_name_in_table) local array = data[data_key] if not array then return end check_array(modname, obj, data, data_key, nil, true) local names = {} local function check_other_name(other_name) if not allow_canonical_name_in_table and other_name == canonical_name then discrepancy(modname, "%s has its canonical name (<code>%s</code>) repeated in the table of <code>%s</code>.", link(obj), dump(canonical_name), data_key ) end if names[other_name] then discrepancy(modname, "The name %s is found twice or more in the list of <code>%s</code> for %s.", other_name, data_key, link(obj) ) end names[other_name] = true end for _, other_name in ipairs(array) do if type(other_name) == "table" then if not allow_nested then discrepancy(modname, "A nested table is found in the list of <code>%s</code> for %s, but isn't allowed.", data_key, link(obj) ) else for _, on in ipairs(other_name) do check_other_name(on) end end else check_other_name(other_name) end end end local function check_other_names_aliases_varieties(modname, obj, data, canonical_name) if data.other_names then check_name_field(modname, obj, data, canonical_name, "other_names") end if data.aliases then check_name_field(modname, obj, data, canonical_name, "aliases") end if data.varieties then -- Sometimes a variety legitimately has the same name as the language as a whole, so allow that. 
		check_name_field(modname, obj, data, canonical_name, "varieties", "allow_nested", "allow_canonical_name_in_table")
	end
end

local function validate_pattern(pattern, modname, obj, standard_chars)
	if type(pattern) ~= "string" then
		return discrepancy(modname,
			"\"%s\", the %spattern for %s, is not a string.",
			pattern, standard_chars and "standard character " or "", link(obj)
		)
	elseif not isutf8(pattern) then
		return discrepancy(modname,
			"%s specifies a pattern for %scharacter detection which is not valid UTF-8: <code>%s</code>",
			link(obj), standard_chars and "standard " or "", dump(pattern)
		)
	end
	-- Collect every range whose lower bound is not strictly below its upper bound.
	local ranges
	for lower, higher in gmatch(pattern, "(.[\128-\191]*)%-%%?(.[\128-\191]*)") do
		if codepoint(lower) >= codepoint(higher) then
			ranges = ranges or Array()
			insert(ranges, { lower, higher })
		end
	end
	if ranges and ranges[1] then
		local plural = #ranges ~= 1 and "s" or ""
		discrepancy(modname,
			"%s specifies an invalid pattern " ..
			"for %scharacter detection: <code>%s</code>. The first codepoint%s " ..
			"in the range%s %s must be less than the second.",
			link(obj), standard_chars and "standard " or "", dump(pattern), plural, plural,
			ranges:map(function(range)
				return format(range[1] .. "-" .. range[2] .. " (U+%X, U+%X)", codepoint(range[1]), codepoint(range[2]))
			end):concat(", ")
		)
	end
	-- Finally, check that the pattern compiles as a character class.
	local success, result = pcall(umatch, "", "[" .. pattern .. "]")
	if not success then
		discrepancy(modname,
			"%s specifies an invalid pattern for %scharacter detection: <code>%s</code> (%s)",
			link(obj), standard_chars and "standard " or "", dump(pattern), result
		)
	end
end

local remove_exceptions_addition = 0xF0000
local maximum_code_point = 0x10FFFF
local remove_exceptions_maximum_code_point = maximum_code_point - remove_exceptions_addition

-- TODO: check modules exist.
-- TODO: validate script codes and check inner tables.
local function check_replacement_data(modname, obj, data, key, func_name) local replacements = data[key] if replacements == nil then return end local replacements_type = type(replacements) if replacements_type == "string" then local mod = main_loader("Module:" .. replacements) if not mod then discrepancy(modname, "The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which does not exist.", key, link(obj), replacements ) else mod = mod() if not (type(mod) == "table" and is_callable(mod[func_name])) then discrepancy(modname, "The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which exists, but does not contain the expected function <code>%s()</code>.", key, link(obj), replacements, func_name ) end end return elseif replacements_type ~= "table" then discrepancy(modname, "The <code>%s</code> field in the data table for %s must be a string or table, not a %s.", key, link(obj), replacements_type ) return end local from, to = replacements.from, replacements.to if (from ~= nil) ~= (to ~= nil) then discrepancy(modname, "The <code>from</code> and <code>to</code> arrays in the <code>%s</code> table for %s are not both defined or both undefined.", key, link(obj) ) elseif from then for _, k in ipairs {"from", "to"} do check_array(modname, obj, data, k, key) end end local remove_diacritics = replacements.remove_diacritics if not (remove_diacritics == nil or type(remove_diacritics) == "string") then discrepancy(modname, "The <code>remove_diacritics</code> field in the <code>%s</code> table for %s table must be a string.", key, link(obj) ) end local remove_exceptions = replacements.remove_exceptions if remove_exceptions then if check_array(modname, obj, data, "remove_exceptions", key) then for sequence_i, sequence in ipairs(remove_exceptions) do local code_point_i = 0 for code_point in gcodepoint(sequence) do code_point_i = code_point_i + 1 if code_point > remove_exceptions_maximum_code_point then 
						discrepancy(modname,
							"Code point #%d (0x%04X) in field #%d of the <code>remove_exceptions</code> array for %s is over U+%04X.",
							code_point_i, code_point, sequence_i, link(obj), remove_exceptions_maximum_code_point
						)
					end
				end
			end
		end
	end
	-- A missing <code>to</code> entry is tolerated, but a <code>to</code> entry with no matching <code>from</code> is not.
	if from and to and table_len(to) > table_len(from) then
		discrepancy(modname,
			"The <code>to</code> array in the <code>%s</code> table for %s must be shorter than or the same length as the <code>from</code> array.",
			key, link(obj)
		)
	end
end

local function check_replacements_data(modname, obj, data)
	for _, replacement_spec in ipairs{
		{"translit", "tr"},
		{"display_text", "makeDisplayText"},
		{"strip_diacritics", "stripDiacritics"},
		{"sort_key", "makeSortKey"},
	} do
		check_replacement_data(modname, obj, data, unpack(replacement_spec))
	end
end

local function has_ancestor(lang, code)
	for _, anc in ipairs(lang:getAncestors()) do
		if code == anc:getCode() or has_ancestor(anc, code) then
			return true
		end
	end
end

local function get_default_ancestors(lang)
	if lang:hasType("language", "etymology-only") then
		local parent = lang:getParent()
		if not has_ancestor(parent, lang:getCode()) then
			return parent:getAncestorCodes()
		end
	end
	local fam_code, def_anc = lang:getFamilyCode()
	while fam_code and fam_code ~= "qfa-not" do
		local fam = m_families_data[fam_code]
		def_anc = fam.protoLanguage or
			m_languages_data_all[fam_code .. "-pro"] and fam_code .. "-pro" or
			m_etym_languages_data[fam_code .. "-pro"] and fam_code ..
"-pro" if def_anc and def_anc ~= lang:getCode() then return {def_anc} end fam_code = fam[3] end end local function iterate_ancestor(obj, modname, anc_code) local anc = get_language_by_code(anc_code, nil, true) if not anc then discrepancy(modname, "%s lists the invalid language code <code>%s</code> as its ancestor.", link(obj), dump(anc_code) ) return end local anc_fam = anc:getFamily() if not anc_fam then discrepancy(modname, "%s has no family.", link(anc) ) return end local anc_fam_code = anc_fam:getCode() local def_ancs = get_default_ancestors(obj) if def_ancs then for _, def_anc in ipairs(def_ancs) do def_anc = get_language_by_code(def_anc, nil, true) if def_anc and ( anc_code == def_anc:getCode() or has_ancestor(def_anc, anc_code) or def_anc:hasParent(anc_code) and not has_ancestor(anc, def_anc:getCode()) ) then discrepancy(modname, "%s has the ancestor %s listed in its ancestor field, which is redundant, since it is determined to be ancestral automatically.", link(obj), link(anc) ) end end end if not obj:inFamily(anc_fam_code) then discrepancy(modname, "%s has %s set as an ancestor, but is not in the %s.", link(obj), link(anc), link(anc_fam) ) end local fam, proto = obj repeat fam = fam:getFamily() proto = fam and fam:getProtoLanguage() until proto or not fam or fam:getCode() == "qfa-not" if proto and not ( proto:getCode() == anc:getCode() or proto:hasAncestor(anc:getCode()) or anc:hasAncestor(proto:getCode()) ) then local fam = obj:getFamily() discrepancy(modname, "%s is in the %s and has %s set as an ancestor, but it is not possible to form an ancestral chain between them.", link(obj), link(fam), link(anc) ) end end local function check_ancestors(modname, obj, data) local ancestors = data.ancestors if ancestors == nil then return end local ancestors_type = type(ancestors) if ancestors_type == "string" then ancestors = split(ancestors, ",", true, true) elseif ancestors_type ~= "table" then discrepancy(modname, "The <code>ancestors</code> field in the data 
table for %s must be a string or table, not a %s.",
			link(obj), ancestors_type
		)
		-- Nothing iterable to check, so stop here rather than pass a non-table to ipairs.
		return
	end
	for _, anc in ipairs(ancestors) do
		iterate_ancestor(obj, modname, anc)
	end
end

local function check_wikimedia_codes(modname, obj, data)
	local wikimedia_codes = data.wikimedia_codes
	if wikimedia_codes == nil then
		return
	end
	local wikimedia_codes_type = type(wikimedia_codes)
	if wikimedia_codes_type == "string" then
		wikimedia_codes = split(wikimedia_codes, ",", true, true)
	elseif wikimedia_codes_type ~= "table" then
		discrepancy(modname,
			"The <code>wikimedia_codes</code> field in the data table for %s must be a string or table, not a %s.",
			link(obj), wikimedia_codes_type
		)
		-- Nothing iterable to check, so stop here rather than pass a non-table to ipairs.
		return
	end
	for _, code in ipairs(wikimedia_codes) do
		if not is_known_language_tag(code) then
			discrepancy(modname,
				"%s lists the invalid Wikimedia code <code>%s</code> in the <code>wikimedia_codes</code> field.",
				link(obj), dump(code)
			)
		end
	end
end

local function check_code_to_name_and_name_to_code_maps(
	source_module_type, source_module_description,
	code_to_module_map, name_to_code_map,
	code_to_name_modname, code_to_name_module,
	name_to_code_modname, name_to_code_module
)
	local function check_code_and_name(modname, code, canonical_name)
		-- Check the code is in code_to_module_map and that it didn't originate from the wrong data module.
		local check_mod = code_to_module_map[code] or code_to_module_map[aliases[code]]
		if not (check_mod and match(check_mod, "^" .. source_module_type .. "/data")) then
			if not name_to_code_map[canonical_name] then
				discrepancy(modname,
					"The code <code>%s</code> and the canonical name %s should be removed; they are not found in %s.",
					code, canonical_name, source_module_description
				)
			else
				discrepancy(modname,
					"<code>%s</code>, the code for the canonical name %s, is wrong; it should be <code>%s</code>.",
					code, canonical_name, name_to_code_map[canonical_name]
				)
			end
		elseif not name_to_code_map[canonical_name] then
			local data_table = require("Module:" ..
code_to_module_map[code])[code] discrepancy(modname, "%s, the canonical name for the code <code>%s</code>, is wrong; it should be %s.", canonical_name, code, data_table[1] ) end end for code, canonical_name in pairs(code_to_name_module) do check_code_and_name(code_to_name_modname, code, canonical_name) end for canonical_name, code in pairs(name_to_code_module) do check_code_and_name(name_to_code_modname, code, canonical_name) end end local function check_extraneous_extra_data( data_modname, data_module, extra_data_modname, extra_data_module) for code, _ in pairs(extra_data_module) do if not data_module[code] then discrepancy(extra_data_modname, "The code <code>%s</code> is not found in [[Module:%s]], and should be removed from [[Module:%s]].", code, data_modname, extra_data_modname ) end end end -- TODO: add collision check between the canonical names "X" and "X [Ll]anguage". local function check_languages(frame) local check_language_data_keys = check_data_keys( 1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts "display_text", "generate_forms", "strip_diacritics", "sort_key", "other_names", "aliases", "varieties", "ietf_subtag", "type", "ancestors", "pseudo_families", "wikimedia_codes", "wikipedia_article", "standard_chars", "translit", "override_translit", "link_tr", "dotted_dotless_i" ) local function check_language(modname, code, data, extra_modname, extra_data) local obj, code_modname, canonical_name = make_lang(code, data, true), get_data_module_name(code), data[1] -- FIXME: this module should use the prefixed module name throughout. 
		code_modname = code_modname:gsub("^Module:", "")
		if code_modname ~= modname then
			if code_modname == "languages/data/2" then
				discrepancy(modname,
					"%s is a two-letter code, so should be moved to [[Module:%s]].",
					link(obj), code_modname
				)
			elseif code_modname == "languages/data/exceptional" then
				discrepancy(modname,
					"%s is an exceptional code, as it does not consist of two or three lowercase letters, so should be moved to [[Module:%s]].",
					link(obj), code_modname
				)
			else
				discrepancy(modname,
					"%s is a three-letter code beginning with '%s', so should be moved to [[Module:%s]].",
					link(obj), sub(code, 1, 1), code_modname
				)
			end
		end

		check_language_data_keys(modname, obj, data)

		if all_codes[code] then
			discrepancy(modname,
				"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
				code, all_codes[code]
			)
		else
			if not m_languages_codes[code] then
				discrepancy("languages/code to canonical name",
					"The code %s is missing.",
					link(obj, true)
				)
			end
			all_codes[code] = modname
		end

		-- TODO: these checks should be consolidated with the proto-language checks in the family data,
		-- since bad settings there affect the warnings here (e.g. xxx-pro assigned to yyy when xxx also
		-- doesn't exist - a warning that xxx has "no family" would be misleading).
		if sub(code, -4) == "-pro" then
			local fam_code = sub(code, 1, -5)
			local fam = get_language_by_code(fam_code, nil, true, true)
			if not fam then
				discrepancy(modname,
					"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, which doesn't exist.",
					link(obj), dump(fam_code)
				)
			elseif not fam:hasType("family") then
				discrepancy(modname,
					"'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, but %s is not a family.",
					link(obj), dump(fam_code), link(fam)
				)
			else
				-- Reinstate this as low-priority once message priorities have been implemented.
				-- local expected_name = "Proto-" ..
fam:getCanonicalName() -- if canonical_name ~= expected_name then -- discrepancy(modname, -- "%s does not have the expected name \"%s\", even though it is the proto-language of the %s.", -- link(obj), expected_name, link(fam) -- ) -- end end end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif language_names[canonical_name] then local canonical_lang = get_language_by_canonical_name(canonical_name) if not canonical_lang then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_lang:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), language_names[canonical_name] ) end else if not m_languages_canonical_names[canonical_name] then discrepancy("languages/canonical names", "The canonical name %s is missing.", link(obj) ) end language_names[canonical_name] = code end check_wikidata_item(modname, obj, data, 2) if extra_data then check_other_names_aliases_varieties(modname, obj, extra_data, canonical_name) end local lang_type = data.type if lang_type and not (lang_type == "regular" or lang_type == "reconstructed" or lang_type == "appendix-constructed") then discrepancy(modname, "%s is of the invalid type <code>%s</code>.", link(obj), lang_type ) end if data.aliases then discrepancy(modname, "%s has an <code>aliases</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if data.varieties then discrepancy(modname, "%s has the <code>varieties</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if data.other_names then discrepancy(modname, "%s has the <code>other_names</code> key in [[Module:%s]]. 
This must be moved to [[Module:%s]].", link(obj), modname, extra_modname ) end if not extra_data then discrepancy(extra_modname, "%s has data in [[Module:%s]], but does not have corresponding data in [[Module:%s]].", link(obj), modname, extra_modname ) --[[elseif extra_data.other_names then discrepancy(extra_modname, "%s has <code>other_names</code> key, but these should be changed to either <code>aliases</code> or <code>varieties</code>.", link(obj) )]] end local sc = data[4] if sc then if type(sc) == "string" then sc = split(sc, "%s*,%s*", true) end if type(sc) == "table" then if not sc[1] then discrepancy(modname, "%s has no scripts listed.", link(obj) ) else for _, sccode in ipairs(sc) do local cur_sc = m_scripts_data[sccode] if not (cur_sc or sccode == "All" or sccode == "Hants") then discrepancy(modname, "%s lists the invalid script code <code>%s</code>.", link(obj), dump(sccode) ) --[[elseif not cur_sc.characters then discrepancy(modname, "%s lists the %s, which does not have any characters.", link(obj), link(get_script_by_code(sccode)) )]] end nonempty_scripts[sccode] = true end end else discrepancy(modname, "The %s field for %s must be a table or string.", 4, link(obj) ) end end if data.ancestors then check_ancestors(modname, obj, data) end if data.wikimedia_codes then check_wikimedia_codes(modname, obj, data) end if data[3] then local family = data[3] if not m_families_data[family] then discrepancy(modname, "%s has the invalid family code <code>%s</code>.", link(obj), dump(family) ) end nonempty_families[family] = true end check_replacements_data(modname, obj, data) if data.standard_chars then if type(data.standard_chars) == "table" then local sccodes = {} for _, sccode in ipairs(sc) do sccodes[sccode] = true end for sccode in pairs(data.standard_chars) do if not (sccodes[sccode] or sccode == 1) then discrepancy(modname, "The field %s in the <code>standard_chars</code> table for %s does not match any script for that language.", sccode, link(obj) ) end end 
elseif data.standard_chars and type(data.standard_chars) ~= "string" then discrepancy(modname, "The <code>standard_chars</code> field in the data table for %s must be a string or table.", link(obj) ) end end check_true_or_string_or_nil(modname, obj, data, "override_translit") check_true_or_string_or_nil(modname, obj, data, "link_tr") -- This doesn't apply any more since scripts can be script-wide translit methods. -- if data.override_translit and not data.translit then -- discrepancy(modname, -- "%s has the <code>override_translit</code> field set, but no transliteration module", -- link(obj) -- ) -- end end local function check_module(modname) local mod_data = load_data("Module:" .. modname) local extra_modname = modname .. "/extra" local extra_mod_data = load_data("Module:" .. extra_modname) for code, data in pairs(mod_data) do check_language(modname, code, data, extra_modname, extra_mod_data[code]) end check_no_alias_codes(modname, mod_data) check_no_alias_codes(extra_modname, extra_mod_data) check_extraneous_extra_data(modname, mod_data, extra_modname, extra_mod_data) end -- Check two-letter codes check_module( "languages/data/2" ) -- Check three-letter codes for i = 0x61, 0x7A do -- a to z check_module( format("languages/data/3/%c", i) ) end -- Check exceptional codes check_module( "languages/data/exceptional" ) -- These checks must be done while all_codes only contains language codes: -- that is, after language data modules have been processed, but before -- etymology languages, families, and scripts have. 
check_code_to_name_and_name_to_code_maps( "languages", "a submodule of [[Module:languages]]", all_codes, language_names, "languages/code to canonical name", m_languages_codes, "languages/canonical names", m_languages_canonical_names ) --[===[ not to check langname-lite because we don't use it -- Check [[Template:langname-lite]] local modname = "Template:langname-lite" for code, name in gmatch(remove_comments(new_title(modname):getContent()), "\n\t*|#*([^\n]+)=([^\n]*)") do if #code > 1 and code ~= "default" then for _, code in pairs(split(code, "|", true)) do local lang = get_language_by_code(code, nil, true, true) if match(name, "etymcode") then local nonEtym_name = frame:preprocess(name) local nonEtym_real_name = lang:getFullName() if nonEtym_name ~= nonEtym_real_name then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. Expected name: %s.", code, nonEtym_name, nonEtym_real_name ) end name = frame:preprocess(gsub(name, "{{{allow etym|}}}", "1")) elseif match(name, "familycode") then name = match(name, "familycode|(.-)|") else name = name end if not lang then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. Language not present in data.", code, name ) else local real_name = lang:getCanonicalName() if name ~= real_name then discrepancy(modname, "Code: <code>%s</code>. Saw name: %s. 
Expected name: %s.", code, name, real_name ) end end end end end --]===] end local function check_etym_languages() local modname = "etymology languages/data" local check_etymology_language_data_keys = check_data_keys( 1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts "parent", "display_text", "generate_forms", "strip_diacritics", "sort_key", "other_names", "aliases", "varieties", "ietf_subtag", "type", "main_code", "ancestors", "pseudo_families", "wikimedia_codes", "wikipedia_article", "standard_chars", "translit", "override_translit", "link_tr", "dotted_dotless_i" ) local checked = {} for code, data in pairs(m_etym_languages_data) do local obj, canonical_name, parent = make_lang(code, data, true), data[1], data.parent check_etymology_language_data_keys(modname, obj, data) if all_codes[code] then discrepancy(modname, "The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].", code, all_codes[code] ) else if not m_etym_languages_codes[code] then discrepancy("etymology languages/code to canonical name", "The code %s is missing.", link(obj, true) ) end all_codes[code] = modname end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif language_names[canonical_name] then local canonical_lang = get_language_by_canonical_name(canonical_name, nil, true) if not canonical_lang then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_lang:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), language_names[canonical_name] ) end else if not m_etym_languages_canonical_names[canonical_name] then discrepancy("etymology languages/canonical names", "The canonical name %s is missing.", link(obj) ) end etym_language_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if parent 
then if type(parent) ~= "string" then discrepancy(modname, "%s has a parent code that is %s rather than a string.", link(obj), parent == nil and "nil" or "a " .. type(parent) ) elseif not (m_languages_data_all[parent] or m_etym_languages_data[parent]) then discrepancy(modname, "%s has the invalid parent code <code>%s</code>%s.", link(obj), dump(parent), m_families_data[parent] and " (a family code)" or "" ) end nonempty_families[parent] = true else discrepancy(modname, "%s has no parent code.", link(obj) ) end if data.ancestors then check_ancestors(modname, obj, data) end if data.wikimedia_codes then check_wikimedia_codes(modname, obj, data) end if data[3] then local family = data[3] if not m_families_data[family] then discrepancy(modname, "%s has the invalid family code <code>%s</code>.", link(obj), dump(family)) end nonempty_families[family] = true end check_replacements_data(modname, obj, data) check_wikidata_item(modname, obj, data, 2) local stack = {} while data do if checked[code] then break elseif stack[code] then local parent = data.parent discrepancy(modname, "%s has a cyclic parental relationship to %s", link(make_lang(code, data, true)), link(get_language_by_code(parent, nil, true)) ) break end stack[code] = true code = data.parent data = m_etym_languages_data[code] end for code in pairs(stack) do checked[code] = true end end check_no_alias_codes(modname, m_etym_languages_data) check_code_to_name_and_name_to_code_maps( "etymology languages", "[[Module:etymology languages/data]]", all_codes, etym_language_names, "etymology languages/code to canonical name", m_etym_languages_codes, "etymology languages/canonical names", m_etym_languages_canonical_names) end -- TODO: add collision check between the canonical names "X" and "X [Ll]anguages". 
local function check_families() local modname = "families/data" local check_family_data_keys = check_data_keys( 1, 2, 3, -- canonical name, Wikidata item, (parent) family "type", "ietf_subtag", "protoLanguage", "other_names", "aliases", "varieties", "pseudo_families", "categoryName" ) local checked, double_check_if_empty = {["qfa-not"] = true}, {} for code, data in pairs(m_families_data) do local obj, canonical_name, family, protolang = make_family(code, data), data[1], data[3], data.protoLanguage check_family_data_keys(modname, obj, data) if all_codes[code] then discrepancy(modname, "The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].", code, all_codes[code] ) else if not m_families_codes[code] then discrepancy("families/code to canonical name", "The code %s is missing.", link(obj, true) ) end all_codes[code] = modname end if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif family_names[canonical_name] then local canonical_family = get_family_by_canonical_name(canonical_name) if not canonical_family then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseif data.main_code ~= canonical_family:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), family_names[canonical_name] ) end else if not m_families_canonical_names[canonical_name] then discrepancy("families/canonical names", "The canonical name %s is missing.", link(obj) ) end family_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if family then if family == code and code ~= "qfa-not" then discrepancy(modname, "%s has itself as its family.", link(obj) ) elseif not m_families_data[family] then discrepancy(modname, "%s has the invalid parent family code <code>%s</code>.", link(obj), dump(family) ) end nonempty_families[family] = true end if 
protolang then local protolang_obj = get_language_by_code(protolang, nil, true) if not protolang_obj then discrepancy(modname, "%s has the invalid proto-language code <code>%s</code>.", link(obj), dump(protolang) ) elseif protolang == code .. "-pro" then discrepancy(modname, "%s has %s listed as its proto-language, which is redundant, since it is determined to be the proto-language automatically.", link(obj), link(protolang_obj) ) elseif sub(protolang, -4) == "-pro" then discrepancy(modname, "%s has %s listed as its proto-language, which is supposed to be the proto-language for the family <code>%s</code>.", link(obj), link(protolang_obj), sub(protolang, 1, -5) ) end end check_wikidata_item(modname, obj, data, 2) -- Could be a false-positive if a child family occurs on a later -- iteration, so set aside any that fail for a second check. This avoids -- having to iterate through the whole list of families once -- nonempty_families has been fully populated. if not (nonempty_families[code] or allowed_empty_families[code]) then double_check_if_empty[code] = obj end local stack = {} while data do if checked[code] then break elseif stack[code] then local parent = data[3] discrepancy(modname, "%s has a cyclic familial relationship to %s", link(make_family(code, data)), link(get_family_by_code(parent)) ) break end stack[code] = true code = data[3] data = m_families_data[code] end for code in pairs(stack) do checked[code] = true end end -- Any languages set aside as candidates for having no children are checked -- again, now that nonempty_families is definitely complete. 
for code, obj in next, double_check_if_empty do if not (nonempty_families[code] or allowed_empty_families[code]) then discrepancy(modname, "%s has no child families or languages.", link(obj) ) end end check_no_alias_codes(modname, m_families_data) check_code_to_name_and_name_to_code_maps( "families", "[[Module:families/data]]", all_codes, family_names, "families/code to canonical name", m_families_codes, "families/canonical names", m_families_canonical_names) end -- TODO: add collision check between the canonical names "X" and "X [Ss]cript". local function check_scripts() local modname = "scripts/data" local check_script_data_keys = check_data_keys( 1, 2, 3, -- canonical name, Wikidata item, writing systems "other_names", "aliases", "varieties", "parent", "ietf_subtag", "type", "wikipedia_article", "ranges", "characters", "spaces", "capitalized", "translit", "direction", "character_category", "normalizationFixes", "sort_by_scraping", "display_text", "sort_key", "strip_diacritics" ) -- Just to satisfy requirements of check_code_to_name_and_name_to_code_maps. 
local script_code_to_module_map = {} for code, data in pairs(m_scripts_data) do local obj, canonical_name = make_script(code, data), data[1] if not m_scripts_codes[code] and #code == 4 then discrepancy("scripts/code to canonical name", "The code %s is missing", link(obj, true) ) end check_script_data_keys(modname, obj, data) if not canonical_name then discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseif script_names[canonical_name] then local canonical_script = get_script_by_canonical_name(canonical_name) if not canonical_script then discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) --[[elseif data.main_code ~= canonical_script:getCode() then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj), script_names[canonical_name] )]] end else if not m_scripts_canonical_names[canonical_name] and #code == 4 then discrepancy("scripts/canonical names", "The canonical name %s is missing.", link(obj) ) end script_names[canonical_name] = code end check_other_names_aliases_varieties(modname, obj, data, canonical_name) if not nonempty_scripts[code] then discrepancy(modname, "%s is not used by any language%s.", link(obj), data.characters and "" or " and has no characters listed for auto-detection") --[[elseif not data.characters then discrepancy(modname, "%s has no characters listed for auto-detection.", link(obj) )--]] end if data.characters then validate_pattern(data.characters, modname, obj, false) end check_wikidata_item(modname, obj, data, 2) script_code_to_module_map[code] = modname end check_no_alias_codes(modname, m_scripts_data) check_code_to_name_and_name_to_code_maps( "scripts", "a submodule of [[Module:scripts]]", script_code_to_module_map, script_names, "scripts/code to canonical name", m_scripts_codes, "scripts/canonical names", m_scripts_canonical_names) end -- FIXME: this is quite messy. 
local function check_wikidata_languages() local data = json_decode(new_title("Module:languages/data/wikidata.json"):getContent()) local seen = {{}, {}, {}, [5] = {}} for _, item in ipairs(data) do local id = item.id for k, v in pairs(item) do if k ~= "id" then local _seen = seen[k] for _, code in ipairs(v) do local _code = code[1] local _type = type(_seen[_code]) if _type == "table" then insert(_seen[_code], id) elseif _type == "string" then _seen[_code] = {_seen[_code], id} else _seen[_code] = id end end end end end local modname = "languages/data/wikidata.json" for k, v in pairs(seen) do for code, ids in pairs(v) do if type(ids) == "table" then local t = {} for i, id in ipairs(ids) do t[i] = format("<code>[[d:%s|%s]]</code>", id, id) end discrepancy(modname, "<code>%s</code> is set as an ISO 639-%d code on multiple items: %s.", code, k, list_to_text(t) ) end end end end local function check_labels() local check_label_data_keys = check_data_keys( "display", "Wikipedia", "glossary", "plain_categories", "topical_categories", "pos_categories", "regional_categories", "sense_categories", "omit_preComma", "omit_postComma", "omit_preSpace", "deprecated", "track" ) local function check_label(modname, code, data) local _type = type(data) if _type == "table" then check_label_data_keys(modname, code, data) elseif _type ~= "string" then discrepancy(modname, "The data for the label <code>%s</code> is %s %s; only tables and strings are allowed.", code, add_indefinite_article(_type) ) end end for _, module in ipairs{"", "/regional", "/topical"} do local modname = "Module:labels/data" .. module module = require(modname) for label, data in pairs(module) do check_label(modname, label, data) end end for code in pairs(m_languages_codes) do local modname = "Module:labels/data/lang/" .. 
code local module = safe_require(modname) if module then for label, data in pairs(module) do check_label(modname, label, data) end end end end local function check_zh_trad_simp() local m_ts = require("Module:zh/data/ts") local m_st = require("Module:zh/data/st") local ruby = require("Module:ja-ruby").ruby_auto local lang = get_language_by_code("zh") local Hant = get_script_by_code("Hant") local Hans = get_script_by_code("Hans") local data = {[0] = m_st, m_ts} local mod = {[0] = "st", "ts"} local var = {[0] = "Simp.", "Trad."} local sc = {[0] = Hans, Hant} local function find_stable_loop(chars, other, j) local display = ruby({["markup"] = "[" .. other .. "](" .. var[(j+1)%2] .. ")"}) display = language_link{term = other, alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"} insert(chars, display) if data[(j+1)%2][other] == other then insert(chars, other) return chars, 1 elseif not data[(j+1)%2][other] then insert(chars, "not found") return chars, 2 elseif data[j%2][data[(j+1)%2][other]] ~= other then return find_stable_loop(chars, data[(j+1)%2][other], j + 1) else local display = ruby({["markup"] = "[" .. data[(j+1)%2][other] .. "](" .. var[j%2] .. ")"}) display = language_link{term = data[(j+1)%2][other], alt = display, lang = lang, sc = sc[j%2], tr = "-"} insert(chars, display .. " (") display = ruby({["markup"] = "[" .. data[j%2][data[(j+1)%2][other]] .. "](" .. var[(j+1)%2] .. ")"}) display = language_link{term = data[j%2][data[(j+1)%2][other]], alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"} insert(chars, display .. " etc.)") return chars, 3 end return chars end for i = 0, 1, 1 do for ch, other_ch in pairs(data[i]) do if data[(i+1)%2][other_ch] ~= ch then local chars, issue = {} local display = ruby({["markup"] = "[" .. ch .. "](" .. var[i] .. 
")"}) display = language_link{term = ch, alt = display, lang = lang, sc = sc[i], tr = "-"} insert(chars, display) chars, issue = find_stable_loop(chars, other_ch, i) if issue == 1 or issue == 2 then local sc_this, mod_this, j = {} if match(chars[#chars-1], var[(i+1)%2]) then j = 1 else j = 0 end mod_this = mod[(i+j)%2] sc_this = {[0] = sc[(i+j)%2], sc[(i+j+1)%2]} for k, ch in ipairs(chars) do chars[k] = tag_text(ch, lang, sc_this[k%2], "term") end local modname = "zh/data/" .. mod_this if issue == 1 then discrepancy(modname, "character references itself: %s", concat(chars, " → ") ) elseif issue == 2 then discrepancy(modname, "missing character: %s", concat(chars, " → ") ) end elseif issue == 3 then for j, ch in ipairs(chars) do chars[j] = tag_text(ch, lang, sc[(i+j)%2], "term") end discrepancy("zh/data/" .. mod[i], "possible mismatched character: %s", concat(chars, " → ") ) end end end end end local function check_serialization(modname) local serializers = { ["Hani-sortkey/data/serialized"] = "Hani-sortkey/serializer", } if not serializers[modname] then return nil end local serializer = serializers[modname] local current_data = require("Module:" .. serializer).main(true) local stored_data = require("Module:" .. modname) if current_data ~= stored_data then discrepancy(modname, "<strong><u>Important!</u> Serialized data is out of sync. Use [[Module:%s]] to update it. 
If you have made any changes to the underlying data, the serialized data <u>must</u> be updated before these changes will take effect.</strong>", serializer ) end end local find_code = require("Module:memoize")(function(message) return match(message, "<code>([^<]+)</code>") end) local function compare_messages(message1, message2) local code1, code2 = find_code(message1), find_code(message2) if code1 and code2 then return code1 < code2 else return message1 < message2 end end -- Warning: cannot be called twice in the same module invocation because -- some module-global variables are not reset between calls. local function do_checks(frame, modules) messages = setmetatable({}, messages_mt) if modules["zh/data/ts"] or modules["zh/data/st"] then check_zh_trad_simp() end check_languages(frame) check_etym_languages() -- families and scripts must be checked AFTER languages; languages checks fill out -- the nonempty_families and nonempty_scripts tables, used for testing if a family/script -- is ever used in the data check_families() check_scripts() check_wikidata_languages() if modules["labels/data"] then check_labels() end for module in pairs(modules) do check_serialization(module) end setmetatable(messages, nil) for _, msglist in pairs(messages) do msglist:sort(compare_messages) end local ret = messages messages = nil return ret end local function format_message(modname, msglist) local header; if match(modname, "^Module:") or match(modname, "^Template:") then header = "===[[" .. modname .. "]]===" else header = "===[[Module:" .. modname .. "]]===" end return header .. msglist:map(function(msg) return "\n* " .. 
msg end):concat() end function export.check_modules_t(frame) local args = frame.args local modules = list_to_set(args) local ret = Array() local messages = do_checks(frame, modules) for _, module in ipairs(args) do local msglist = messages[module] if msglist then ret:insert(format_message(module, msglist)) end end return ret:concat("\n") end function export.perform(frame) local messages = do_checks(frame, {}) -- Format the messages local ret = Array() for modname, msglist in sorted_pairs(messages) do ret:insert(format_message(modname, msglist)) end -- Are there any messages? -- TODO: check how many messages there are. if false then --if i == 1 then return "<b class=\"success\">Glory to Arstotzka.</b>" else ret:insert(1, "<b class=\"warning\">Discrepancies detected:</b>") return ret:concat("\n") end end return export c42m7m0j905eljkvxvftn433l22ax1o ເຂັນ 0 243506 5720683 1890237 2026-04-20T21:01:08Z Alifshinobi 397 5720683 wikitext text/x-wiki {{also/auto}} == ภาษาลาว == === การออกเสียง === {{lo-pron}} === รากศัพท์ 1 === ร่วมเชื้อสายกับ{{cog|th|เข็น}} ==== คำกริยา ==== {{lo-verb}} # [[เข็น]] #: {{syn|lo|ຍູ້|ດັນ}} === รากศัพท์ 2 === ร่วมเชื้อสายกับ{{cog|th|เข็ญ}}, {{cog|nod|ᨡᩮ᩠ᨶ}} หรือ {{m|nod|ᨡᩮᩢ᩠ᨶ}}, {{cog|kkh|ᨡᩮ᩠ᨶ}}, {{cog|khb|ᦃᦲᧃ}} หรือ {{m|khb|ᦵᦃᧃ}}, {{cog|shn|ၶဵၼ်}}, {{cog|aho|𑜁𑜢𑜃𑜫}} === คำคุณศัพท์ === {{lo-adj}} # [[เข็ญ]], [[โชค]][[ร้าย]], [[อยู่]][[ใน]][[ภัย]][[อันตราย]] === คำนาม === {{lo-noun}} # [[ความเข็ญ]], [[ความ]][[โชค]][[ร้าย]], [[ภัย]][[อันตราย]] === รากศัพท์ 3 === ร่วมเชื้อสายกับ{{cog|tts|เข็น}} ==== คำกริยา ==== {{lo-verb}} # {{lb|lo|สกรรม}} [[ปั่น]] (ใช้แก่ฝ้ายหรือไหมเป็นต้น) 7hw8lrv983x516wkh3a0vvcthp01av2 ᥕᥒ 0 270949 5720744 5652919 2026-04-21T06:46:22Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720744 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/jaŋ˧˧/}} === คำกริยาวิเศษณ์ === {{tdd-adv}} # [[ไม่]], [[ยัง]]ไม่ ==== คำพ้องความ ==== * {{l|tdd|ᥟᥛᥱ}} 8he202zaum4ulqw4awvvaflafmxf7a0 ᥕᥝᥳ 0 270953 5720729 5715180 
2026-04-21T05:18:37Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720729 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|shn|ယဝ်ႉ}} === การออกเสียง === * {{IPA|tdd|/jaw˦˧/}} === คำอนุภาค === {{tdd-part}} # [[แล้ว]] (ใช้แสดงการกระทำที่เสร็จสิ้นไปแล้ว) #: {{syn|tdd|ᥞᥝᥳ}} i307aas1wnw1zg56zmhk0gs6w4c8eez ᥞᥝᥰ 0 270976 5720727 1422750 2026-04-21T05:06:52Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720727 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|th|tai-pro|*rawᴬ}}; ร่วมเชื้อสายกับ{{cog|shn|ႁဝ်း}}, {{cog|lo|ເຮົາ}}, {{cog|nod|ᩁᩮᩢᩣ}}, {{cog|khb|ᦣᧁ}}, {{cog|blt|ꪹꪭꪱ}}, {{cog|tts|เฮา}}, {{cog|aho|𑜍𑜧}}, {{m|aho|𑜍𑜈𑜫}} หรือ {{m|aho|𑜍𑜧𑜈𑜫}}, {{cog|pcc|rauz}}, {{cog|za|raeuz}} === การออกเสียง === * {{IPA|tdd|/haw˥˧/}} === คำสรรพนาม === {{tdd-pronoun}} # [[เรา]], [[พวก]]เรา (รวมผู้ฟัง) 45lo2z4od1pvg745l6cd48xyqoxmn0b ᥟᥛᥱ 0 270978 5720741 5652920 2026-04-21T06:33:47Z Ai Ku Karng 17824 5720741 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|shn|ဢမ်ႇ}} === การออกเสียง === * {{IPA|tdd|/ʔam˩˩/}} === คำกริยาวิเศษณ์ === {{tdd-adv}} # [[ไม่]] ==== คำพ้องความ ==== * {{l|tdd|ᥕᥒ}} g94cxhj7vfvgu5f8794q9kn1ii30b91 ᥑᥤᥲ 0 271120 5720787 1650905 2026-04-21T07:20:24Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720787 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/xi˧˩/}} === รากศัพท์ 1 === {{inh+|tdd|tai-pro|*C̬.qɯjꟲ}}; ร่วมเชื้อสายกับ{{cog|th|ขี้}}, {{cog|nod|ᨡᩦ᩶}}, {{cog|lo|ຂີ້}}, {{cog|khb|ᦃᦲᧉ}}, {{cog|shn|ၶီႈ}}, {{cog|aho|𑜁𑜣}}, {{cog|za|haex}}, {{cog|skb|ไกฺ}} ==== คำนาม ==== {{tdd-noun}} # [[ขี้]] ==== คำกริยา ==== {{tdd-verb}} # [[ขี้]] === รากศัพท์ 2 === ==== รูปแบบอื่น ==== * {{l|tdd|ᥔᥤᥲ}} ==== คำนาม ==== {{tdd-noun}} # [[ซี่]] 6yayhtp9yrreee4genw7bns8p9xvoq4 ᥑᥤᥳ 0 271122 5720783 1422484 2026-04-21T07:17:21Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720783 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/xi˦˧/}} === คำนาม === {{tdd-noun}} # [[ธง]] 7tn1c2brwaxgp7nvhrv11c717e82nrv ᥐᥣᥓᥣᥒᥲ 0 
274125 5720821 1422433 2026-04-21T08:06:08Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720821 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{com|tdd|ᥐᥣ|ᥓᥣᥒᥲ|t1=ค่า|t2=จ้าง}} === การออกเสียง === * {{IPA|tdd|/kaː˧˧.t͡saːŋ˧˩/}} === คำนาม === {{tdd-noun}} # [[ค่าจ้าง]] 6la21mkeifmqt0p75pirwgzafohrp2d ᥐᥣᥑᥢᥴ 0 274126 5720771 1422432 2026-04-21T07:03:29Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720771 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{com|tdd|ᥐᥣ|ᥑᥢᥴ|t1=ค่า|t2=ค่า}} === การออกเสียง === * {{IPA|tdd|/kaː˧˧.xan˨˦/}} === คำนาม === {{tdd-noun}} # [[ค่า]], [[ราคา]] pr9800nei8bmf3yx9n9oyo43uxyv9gd มอดูล:place 828 283922 5720701 5715280 2026-04-21T01:50:08Z OctraBot 3198 5720701 Scribunto text/plain local export = {} local force_cat = false -- set to true for testing local m_placetypes = require("Module:place/placetypes") local m_links = require("Module:links") local memoize = require("Module:memoize") local m_strutils = require("Module:string utilities") local m_table = require("Module:table") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local form_of_module = "Module:form of" local languages_module = "Module:languages" local parse_interface_module = "Module:parse interface" local parse_utilities_module = "Module:parse utilities" local parameter_utilities_module = "Module:parameter utilities" local utilities_module = "Module:utilities" local enlang = require(languages_module).getByCode("en") local rmatch = m_strutils.match local rfind = m_strutils.find local ulen = m_strutils.len local split = m_strutils.split local dump = mw.dumpObject local insert = table.insert local concat = table.concat local pluralize = require(en_utilities_module).pluralize local extend = m_table.extend local unpack = unpack or table.unpack -- Lua 5.2 compatibility local internal_error = m_placetypes.internal_error local process_error = m_placetypes.process_error local placetype_data = m_placetypes.placetype_data --[==[ intro: 
===Introduction=== This module implements {{tl|place}}, which is a template for standardizing the description and categorization of toponyms (terms that refer to locations such as cities, countries, rivers, etc.). The following modules support this template: * [[Module:place]]: The main module. * [[Module:place/placetypes]]: A module containing data on placetypes, as well as utilities for working with placetypes; category generation handlers for adding categories based on placetypes; and display handlers for displaying holonyms (i.e. containing locations) of a specific type. FIXME: Maybe split out the code from the data. * [[Module:place/locations]]: A module containing data on known locations, as well as utilities for working with such locations. FIXME: Maybe split out the code from the data. * [[Module:category tree/topic/Places]]: A category tree module for generating the descriptions of all categories generated by {{tl|place}}. * [[Module:place doc]]: A module that generates documentation tables describing known placetypes and locations. ===Basic terminology=== The basic terminology used in this and associated {{tl|place}} modules is: * A ''location'' (or equivalently, a ''place'') is any geographic feature (either natural or geopolitical), either on the surface of the Earth or elsewhere. Examples of types of natural places are rivers, mountains, seas and moons; examples of types of geopolitical places are cities, countries, neighborhoods and roads. A ''known location'' is specifically a location whose properties are specified in the {{tl|place}} modules; more on them below. * Specific places are identified by names, referred to as ''toponyms'' or ''placenames''. A given place will often have multiple names, and a given toponym may be ambiguous, referring to multiple possible locations. Specifically: ** There may be names including different amounts of disambiguating information (`Tucson` vs. `Tucson, Arizona` vs. `Tucson, Arizona, USA` or `New York` vs. 
`New York City` vs. `New York, New York`); abbreviations (`NYC` for `New York City`, `USA` for `United States of America`); ''official'' vs. ''short'' names (e.g. `Union of Soviet Socialist Republics` vs. `Soviet Union`); spelling variations (`Cracow` vs. `Krakow` vs. `Kraków`); current vs. former names (`Saint Petersburg` vs. `Leningrad` vs. `Petrograd`); [[exonym]]s vs. [[endonym]]s (e.g. `Tavastia Proper` vs. `Kanta-Häme`, both referring to the same administrative region in Finland); alternative names not due to any of the above reasons (`Bashkiria` vs. `Bashkortostan`); etc. In addition, each language that has an opportunity to refer to the place will have its own name, with the same sorts of variations as exist in English. ** Examples of ambiguous toponyms are `New York` (either a city or a state); `Georgia` (either a state of the US or an independent country in the Caucasus Mountains); `Paris` (either the capital of France or various small cities and towns in the US); `Mexico` (either a country, a state of that country, or the capital city of that country); and `San Antonio` (besides being a major city in Texas, it is the name of dozens of settlements of all sorts throughout the US and Latin America, and at least 181 distinct [[barangay]]s in the Philippines). * A ''placetype'' is the (or a) type that a location belongs to (e.g. `city`, `state`, `river`, `administrative region`, `[[regional county municipality]]`, etc.). ** It is common for locations to be described using multiple placetypes, and even sometimes known locations have multiple placetypes that they may be identified by (e.g. American Samoa can be identified either as an `unincorporated territory`, an `overseas territory` or just a `territory`). Both the {{tl|place}} template and the known location data allow a given location to be identified by multiple placetypes. When in doubt as to the correct placetype or placetypes for a given location, generally follow how Wikipedia describes the place. 
** Some placetypes themselves are ambiguous; e.g. an ''area'' can variously refer to a top-level administrative division (specifically of Kuwait); a geographic region, generally without unambiguously defined borders; or a section of a city, similar to a neighborhood. The term ''district'' is similarly ambiguous. A ''[[prefecture]]'' in the context of Japan is similar to a province, but a prefecture in France is the capital of a ''[[department]]'' (which is similar to a county). Some of this ambiguity is currently handled automatically; e.g. the ambiguity of areas and districts is handled by looking at the ''holonyms'', or containing locations, specified for a given place. But sometimes it is necessary to use a qualifier before the placetype to disambiguate; for example to refer to a French prefecture, use the placetype `French prefecture` instead of just `prefecture`. (FIXME: Handle this automatically.) * A ''holonym'', in the context of a description of a place, is a placename that refers to a larger-sized entity that contains the location being described. For example, `Arizona` and `United States` are holonyms of `Tucson`, and `United States` is a holonym of `Arizona`. * A ''place invocation'' consists of the invocation of {{tl|place}}, including all its parameters. Place invocations may contain one or more ''place descriptions'', each of which provides a description of the location, including its placetype or types, any holonyms, and any additional raw text needed to properly explain the place in context. Place invocations may also contain named parameters specifying zero or more English ''glosses'' or translations (for foreign-language toponyms) and any attached ''extra information'' such as the capital, largest city, official name, modern name or full name. 
Multiple place descriptions in a single invocation are separated by a numbered parameter starting with a semicolon, and are used when it is necessary to provide two or more definitions of a single location for proper categorization. For example, [[Vatican City]] is defined both as a city-state in Southern Europe and as an enclave within the city of Rome, as follows: : {{tl|place|en|city-state|r/Southern Europe|;,|an <<enclave>> within the city of [[Rome]], [[Italy]]|cat=Places in Rome|official=Vatican City State}}. Similar things need to be done for places like [[Crimea]] that are claimed by two different countries with different definitions and administrative structures. ** There are two types of place descriptions, ''new-style'' and ''old-style''. (The use of the terms "new" and "old" indicates chronological precedence in the development of {{tl|place}}, but is not meant to pass any value judgments on the two types, and does not indicate any intent to deprecate old-style descriptions. Both types of descriptions are useful; for example, old-style descriptions are generally more succinct but less flexible.) The above invocation shows both types: an old-style description followed by a new-style description. Old-style descriptions use multiple numbered parameters, where the first parameter (after the language code) specifies the placetype or types, and following parameters specify either holonyms (which are always of the form ` ``placetype``/``placename`` `) or raw text (which is identifiable by not having a slash in it). New-style descriptions use a single parameter, where both placetypes and holonyms are surrounded by double angle brackets, and all remaining text is raw (displayed as-is). In both types of descriptions, holonyms include a slash in them to separate the placetype (which is mandatory and often abbreviated) from the placename. ** In the context of a place description, there are two types of placetypes. 
The ''entry placetypes'' are the placetypes of the place being described, while the ''holonym placetypes'' are the placetypes of the holonyms that the place being described is located within. Currently, a given place can have multiple placetypes specified (e.g. [[Normandy]] is specified using the ''compound placetype'' `administrative region/former province/and/medieval kingdom`) while a given holonym can have only one placetype associated with it. Holonym placetypes are frequently abbreviated (e.g. `r` for `region`, `s` for `state`, `co` for `county`, etc.), while stylistically it is preferred to spell out the entry placetype (except for some long placetypes with well-known abbreviations, such as `CDP` or `cdp` for `[[census-designated place]]`). ** All holonyms in place descriptions are automatically linked as if surrounded by {{tl|l|en|...}}; i.e. if double brackets do not occur in the holonym, the entire holonym will be linked to the corresponding Wiktionary article. For this reason, the holonym should generally be in the same format as the canonical Wiktionary article describing the location (see below). * A ''known location'' is a location whose properties are specifically defined in the {{tl|place}} modules. Generally each such location has an associated category, and known locations exist in a containment hierarchy, where the immediately containing known location is known as the ''container'' of the location and the chain of successive containing locations is known as the ''container trail''. Generally the location's container corresponds to the first parent of its category. Note that some known locations belong to more than one immediate container; for example, Russia belongs to both Europe and Asia. ===More about placetypes=== # The following general categories of placetypes exist: ## ''Natural features'' such as lakes, mountains, mountain ranges, islands, archipelagoes, moons, stars, asteroids, etc. 
## ''Continents'', ''supercontinents'' (groupings of continents where it makes sense, such as `America` and `Eurasia`) and ''continent-level regions'' (grouping of countries in a given continent, such as `Central America` and `Polynesia`). ## ''Political entities'', which are generally classified as either ''polities'' (top-level entities such as countries), ''subpolities'' or ''political divisions'' (non-sovereign divisions, often specifically ''administrative divisions'', of a polity, where an administrative division has a governmental or statistical function and almost always has unambiguously defined boundaries), or ''settlements'' (e.g. cities; towns; villages; and divisions of a city such as neighborhoods, wards, [[barrio]]s and [[barangay]]s, which may or may not be formal administrative divisions and may or may not have unambiguous boundaries). ## ''Geographic regions'', which refer to recognized areas of the Earth (either with a natural geographic, political or cultural significance, often of a historical nature). Such regions can be of greatly varying size, may exist either within a single country or spanning multiple countries or (more often) parts of multiple countries, and may not have well-defined boundaries. They should be distinguished from ''administrative regions'', which exist within a single country and have well-defined boundaries and a political or administrative function. Geographic regions are categorized using the generic term ''geographic and cultural areas'' to emphasize that (a) they have no administrative significance; (b) they may vary greatly in size; and (c) their cohesion is due either to natural geographic boundaries, such as rivers or mountain ranges, or to sharing some cultural characteristics. ## ''Man-made structures'' below the level of a settlement or neighborhood, such as airports, roads, individual buildings, and the like. 
(Note that such structures, even if named, often do not meet the [[WT:CFI]] criteria; this is particularly the case for roads.) # Placetypes support aliases, and the mapping to canonical form happens early on in the processing. For example, `state` can be abbreviated as `s`; `administrative region` as `adr`; `regional county municipality` as `rcomun`; etc. Some placetype aliases handle alternative spellings rather than abbreviations. For example, `departmental capital` maps to `department capital`, and `home-rule city` maps to `home rule city`. Placetype abbreviations are particularly useful in holonym specs, because every holonym must be accompanied by its placetype, for disambiguation purposes. # A ''placetype qualifier'' is an adjective prepended to the placetype to give additional information about the place being described. For example, a given place may be described as a `small city`; logically this is still a city, but the qualifier `small` gives additional information about the place. Multiple qualifiers can be stacked, e.g. `small affluent beachfront unincorporated community`, where `unincorporated community` is a recognized placetype and `small`, `affluent` and `beachfront` are qualifiers. (As shown here, it may not always be obvious where the qualifiers end and the placetype begins.) For the most part, placetype qualifiers do not affect categorization; a `small city` is still a city and an `affluent beachfront unincorporated community` is still an unincorporated community, and both should still be categorized as such. But some qualifiers do change the categorization. In particular, a `former province` is no longer a province and should not be categorized in e.g. [[:Category:Provinces of Italy]], but instead in a different set of categories, e.g. [[:Category:Historical political subdivisions]]. There are several terms treated as equivalent for this purpose: `abandoned`, `ancient`, `extinct`, `historic(al)`, `medi(a)eval` and `traditional`. 
Another set of qualifiers that change categorization are `fictional` and `mythological`, which cause any term using the qualifier to be categorized respectively into [[:Category:Fictional locations]] and [[:Category:Mythological locations]]. ===More about toponyms=== # Toponyms may be: ## ''simple'' (not including any containing location in its name, such as `Tucson`) or ''multipart'' (including one or more containing locations, such as `Tucson, Arizona` or `Tucson, USA` or even `Tucson, Arizona, USA`); ## ''bare'' (not including the word `the` if the location normally requires this article when following a preposition, such as `United States`, `Gambia` or `Community of Madrid`) or ''prefixed'' (including the word `the` as needed, such as `the United States`, `the Gambia` or `the Community of Madrid`); ## ''elliptical'' (just the placename without any disambiguating placetype, such as `Durham`, `New York` or `Mexico`) or ''full'' (containing a disambiguating placetype or similar identifier if one is commonly included, such as the city of `Durham` (in England) vs. its containing county `County Durham`; the US city `New York City` vs. its containing state `New York`; or the three-way distinction between `Mexico` (the country), `Mexico City` (the capital of this country) and `(the) State of Mexico` (one of the states of the country Mexico, mostly surrounding but not including Mexico City)). # The ''canonical Wiktionary article'' is the main article on Wiktionary where a location is described. Canonical articles, per the above terminology, are generally ''simple'' and ''bare'', but may be either ''full'' or ''elliptical''. The fact that a given article is canonical is often identifiable by the fact that translations are housed there and not somewhere else. For example, most counties of the US and Canada include the word `County` in their canonical article name, but most counties elsewhere do not. 
`Washington, D.C.` is one of the few cases where a non-simple toponym is used as the canonical article; this is based on common usage, especially by residents of the city in question (who commonly refer to it as "D.C." but rarely just as "Washington"). ===More about known locations=== # The following types of known locations are defined in this module: ## Continents, supercontinents and continent-level regions, into which countries are grouped. Specifically: ### At the top level below `Earth` are the supercontinents `America` and `Eurasia` and the continents `Africa`, `Oceania` and `Antarctica`. ### `America` is further broken down into the continents `North America` (in turn containing the continental regions `Central America` and `Caribbean`, with the United States, Canada and Mexico directly under North America) and `South America`. ### `Eurasia` is further broken down into the continents `Europe` and `Asia`. ### `Oceania` is further broken down into the continental regions `Melanesia`, `Micronesia` and `Polynesia`, with `Australia` directly under `Oceania`. ### Under the above-specified divisions are countries. Some countries are placed in more than one continent or continent-level region, either because they actually span two continents (e.g. Russia, Turkey, Kazakhstan, Egypt) or because they are politically considered to belong to a continent different from the one they are geographically in (Cyprus, Georgia, Armenia, etc.). ## Political entities, including: ### Top-level political entities, which includes: #### Countries, with a fairly liberal definition, notably including all UN-recognized countries plus some others that are commonly considered countries, even if not all other countries recognize them as such or consider them completely independent (notably, Kosovo, Palestine, Taiwan, Western Sahara, Niue and the Cook Islands). 
#### Pseudo-countries, which include areas calling themselves countries that are de-facto not under the control of the country that they are internationally considered part of (e.g. Abkhazia, South Ossetia, Transnistria); dependent/external/etc. territories of countries (e.g. American Samoa [US], Bermuda [UK], Christmas Island [Australia], Easter Island [Chile]); constituent countries, autonomous territories and the like (Aruba, Curaçao and Sint Maarten of the Netherlands; Greenland and the Faroe Islands of Denmark; etc.; but notably not including England, Scotland, Northern Ireland and Wales, which are treated as regular countries); and a grab bag of other entities that have a semi-independent existence, such as Hong Kong, Macau, Guadeloupe, Martinique and the like. Currently, the actual distinction in treatment between "countries" and "country-like entities" is minimal, but in the future we might restrict the sorts of subcategories of country-like entities more than regular countries. #### Former countries, e.g. the Soviet Union, Yugoslavia, West Germany and the Roman Empire. These are much more limited in the sorts of subcategories allowed, because generally locations, especially cities, should be described from the perspective of which political entity they are currently located in (e.g. "an ancient Roman town in modern Syria") and categorized as such. ### Subpolities. Generally we only list top-level administrative divisions of countries (and only fairly major countries are usually included), but sometimes we list second-level administrative divisions, as in the case of the United Kingdom (where the top-level administrative divisions of the four constituent countries are listed) and China (where major prefecture-level cities are listed, and are considered administrative divisions rather than cities). ### Cities. 
Only major cities get categories, with the definition of "major" varying by country but often including those where the city population itself (sometimes the metro area) is >= 1,000,000 people. # A distinction should be made in the {{tl|place}} modules between ''keys'' and ''placenames''. Placenames are as the location appears in a holonym, and are generally in the same format as the canonical Wiktionary article describing the location so that when formatted as a link, the link goes to the right article; i.e. they are simple and bare, and may be full or elliptical according to Wiktionary conventions. The ''canonical key'' of a location is how the location's category is named, and always uniquely identifies the location from among the known locations in this module (but not necessarily among all possible locations). In particular, subpolities usually have multipart keys that include the containing location, such as `Anhui, China` (not just `Anhui`); `Arizona, USA` (not just `Arizona`, and also not `Arizona, United States`); and `Herefordshire, England` (not just `Herefordshire`, and also in this case not `Herefordshire, UK` or `Herefordshire, England, UK` or any other possible variation). Cities are normally simple, but some cities are multipart for disambiguation purposes (e.g. `Newcastle, New South Wales` for the city in Australia vs. `Newcastle upon Tyne` for the identically-named city in England). Canonical keys may have ''key aliases'', other ways of referring to the location that are not necessarily unique (e.g. `Newcastle` is a key alias for both of the above-mentioned cities), and city keys with diacritics generally have diacriticless aliases, such as canonical key `Düsseldorf` vs. key alias `Dusseldorf`, or canonical key `Łódź` vs. key alias `Lodz`. 
# Known locations are gathered into ''groups'' with similar properties, such as all the states of the United States; all the (ceremonial) counties of England (see below); and all the "sufficiently major" prefecture-level cities in China (where a prefecture-level city is a prefecture surrounding a major city with a unified government and is more like a prefecture, i.e. a major administrative division just underneath a province, than like a city, and where "sufficiently major" is defined according to the population of either the total prefecture or the urban area of the city). Note that there are multiple types of counties in England, with overlapping but non-identical names and boundaries; there are, in particular, ''ceremonial counties'', ''local government counties'' and ''historic counties''; ''ceremonial counties'' have only ceremonial administrative functionality but unlike local government counties (a) don't frequently change their boundaries or nature, (b) correspond more closely to historic county boundaries and names, and (c) are what Englanders usually identify themselves with, and so they are used as top-level divisions rather than local government counties. # Some known locations have ''aliases'' defined, which are of two types. ''Display aliases'' map holonyms to their canonical form near the beginning of processing (in particular before the displayed output is formatted). For example, `US`, `U.S.`, `USA`, `U.S.A.` and `United States of America` are all canonicalized to `United States` (if identified as a country), and display as `United States`. Similarly, the foreign forms `Occitanie` (as a region or administrative region) and `Noord-Brabant` (as a province) are mapped to `Occitania` and `North Brabant` for display purposes. There are also ''category aliases'', so that if e.g. `Republic of Macedonia` is encountered, it will display as such but categorize as `North Macedonia`. 
(This is because, among other reasons, `Republic of Macedonia` is normally preceded by `"the"` while `North Macedonia` is not, so a call {{tl|place|en|a <<city>> in the <<c/Republic of Macedonia>>}} would look wrong if `Republic of Macedonia` were converted to `North Macedonia` during display, as the result would be `a city in the North Macedonia`. There are also frequently political connotations to different category aliases, e.g. `Burma` vs. `Myanmar`.) All of these aliases are sensitive to the placetype specified. For example, `Mexico` as a state is categorized under `State of Mexico, Mexico` but `Mexico` the country is categorized as just `Mexico`.

===Categories===
There are two main types of categories:
# Categories for known locations, divided into:
## Top-level polity categories (e.g. [[:Category:United States]], [[:Category:Taiwan]], [[:Category:South Ossetia]], [[:Category:Bermuda]], [[:Category:Soviet Union]], [[:Category:West Germany]]).
## Subpolity categories ([[:Category:Arizona, USA]], [[:Category:Hunan]], [[:Category:Kagoshima Prefecture]], [[:Category:Cluj County, Romania]]). For historical reasons, different formats are used for the subpolities of different polities. Increasingly, we are moving towards always including the polity name in the subpolity category, but whether the subpolity type is included and where it is included (cf. [[:Category:Cluj County, Romania]] vs. [[:Category:County Cork, Ireland]]) is still inconsistent and will probably remain that way, based on how the subpolity is normally referred to.
## City categories ([[:Category:Tokyo]], [[:Category:New York City]], [[:Category:Jaipur]]). Normally these do not include the containing subpolity, but may do so in order to disambiguate.
# Categories for placetypes, divided into:
## "Immediate" political and non-political division categories ([[:Category:States of the United States]], [[:Category:Municipalities of Tocantins, Brazil]], [[:Category:Ghost towns in Arizona, USA]]).
These are name categories, whose purpose is to contain locations of the specified type. "Immediate" here refers to the fact that the location in the category name is the immediately-containing polity. Usually these categories use the preposition "of", but sometimes "in". (Specifically, "of" typically implies that the placetype in question has an official or semi-official status, whereas "in" implies there is no such official status, but common usage may override this.) The form of the toponym appearing in these categories is always the same as that of the corresponding toponym category except that the word "the" may appear (e.g. [[:Category:States of the United States]]), whereas it doesn't appear in the toponym category itself ([[:Category:United States]], no "the").
## "Skip-polity" categories for second-level political and non-political divisions of a country or other top-level polity (e.g. [[:Category:Counties of the United States]], [[:Category:Municipalities of Brazil]] and [[:Category:Subprefectures of Japan]]). These have several purposes:
* They group the immediate division categories mentioned previously.
* They categorize "straggler" toponyms that (often improperly) mention only the top-level polity and not the subpolity they belong to.
* If categories do not exist for the first-level divisions of a country (and sometimes even when they do), they group all toponyms of the specified type for the specified country. For example, Lithuania is divided into first-level counties and second-level municipalities, but since we don't currently have categories for Lithuanian counties, all municipalities go under [[:Category:Municipalities of Lithuania]] rather than under a category for a specific county. In addition, even though we do have categories for Japanese prefectures (a first-level division), all subprefectures (a second-level division) go under [[:Category:Subprefectures of Japan]] because there aren't very many of them (see below).
## "Generic placetype" categories, both of the immediate and skip-polity type (immediate [[:Category:Cities in California, USA]] and [[:Category:Neighborhoods of the Bronx]]; skip-polity [[:Category:Villages in Ivory Coast]], [[:Category:Geographic and cultural areas of England]], [[:Category:Rivers in Egypt]] and [[:Category:Places in the Philippines]]). As mentioned above, "generic" placetypes occur in every polity (although the set of generic placetypes allowed for cities is a subset of those allowed for top-level polities and subpolities). Usually these categories use the preposition "in", but sometimes "of". As above, skip-polity categories group immediate categories, and in addition there are various reasons a toponym entry is categorized into a skip-polity category. (For example, as a general rule, geographic and cultural areas only categorize at the country level, not the subpolity level, both because there often aren't very many in a given country and because they often span multiple subpolities.) The parent categories of a given category depend on its type. Generally, location categories have placetype categories as their first parent, and vice-versa. Specifically: # Top-level country categories have as their parent e.g. [[:Category:Countries in Europe]], [[:Category:Countries in Central America]] or [[:Category:Countries in Polynesia]], using the most specific continental-level region the country is contained in. # Pseudo-countries are under [[:Category:Country-like entities]] as a neutral designation. There aren't enough of them to subcategorize under continent-level regions. # Former countries are under [[:Category:Former countries and country-like entities]]. # Subpolity categories are usually under a placetype category whose placetype is the canonical (first-listed) placetype of the subpolity and whose toponym is the immediately containing polity, but there are exceptions. 
Specifically, sometimes if a polity has multiple types of subpolities, they are combined (e.g. [[:Category:States and territories of Australia]], [[:Category:Federal subjects of Russia]]). In addition, sometimes a less specific but more identifiable placetype is used instead of the canonical one (e.g. [[:Category:Regions of France]] when the canonical placetype is "administrative region"). The same rules and exceptions generally apply when categorizing subpolities themselves; e.g. both the Australian state of Queensland and territory of Northern Territory go under [[:Category:en:States and territories of Australia]] rather than separately under [[:Category:en:States of Australia]] and [[:Category:en:Territories of Australia]]. In addition, sometimes subpolities may "skip a level" if there aren't very many. For example, there are only 26 subprefectures of Japan (14 under Hokkaido and 12 more scattered under five other prefectures). Rather than have e.g. [[:Category:en:Subprefectures of Kagoshima Prefecture]] containing at most two entries and [[:Category:en:Subprefectures of Miyazaki Prefecture]] containing at most one, they are all grouped under the so-called "skip-subpolity category" [[:Category:en:Subprefectures of Japan]]. # City categories are always under e.g. [[:Category:Cities in the United States]] (e.g. [[:Category:New York City]] is so-placed, even though [[:Category:Cities in New York, USA]] exists). However, they may have a second, more-specific parent (e.g. [[:Category:Cities in New York, USA]] in the case of New York City). The city entries themselves will go under the more specific parent if it exists. # Immediate placetype categories for second-level divisions of a country generally have, respectively, a "toponym parent" that is the toponym mentioned in the category and a "skip-polity parent" that groups all subpolity placetype categories of a specific type and containing polity. 
For example, [[:Category:Counties of Arizona, USA]] has toponym parent [[:Category:en:Arizona, USA]] and skip-polity parent [[:Category:en:Counties of the United States]]. Sometimes the default skip-polity parent is overridden or disabled entirely. For example, in the US, most states are divided into counties but Louisiana is divided into parishes and Alaska into boroughs. It would make no sense to put [[:Category:Parishes of Louisiana, USA]] under [[:Category:Parishes of the United States]] (which would only have one subcategory), so we include them under [[:Category:Counties of the United States]]. An alternative would be to name the skip-polity category to explicitly include parishes and boroughs; this would get awkward here but is done in some cases. Similarly, [[:Category:Regional county municipalities of Quebec]] is placed under [[:Category:Regional municipalities of Canada]] since that name is used in other provinces. Meanwhile, [[:Category:Regional districts of British Columbia]] disables its skip-polity category since no other province or territory of Canada has regional districts or comparable subpolities under a different name (an alternative would be to place them under [[:Category:Counties of Canada]], since they are sort of comparable to counties). # Placetype categories for first-level divisions of a country similarly (e.g. [[:Category:States of the United States]]) have a toponym parent (in this case [[:Category:United States]]), but in place of the skip-polity parent they have two other parents: a "bare placetype" parent (in this case [[:Category:States]]) and the "generic" parent [[:Category:Political divisions of specific countries]]. (There is also a bare [[:Category:Political divisions]] that groups "bare placetype" categories.) Skip-polity placetype categories for second-level divisions of a country (e.g. [[:Category:Counties of the United States]]) work the same. 
Placetype categories for countries work likewise except they are missing the generic parent.

===Place descriptions===
A given place description is defined internally in a table of the following form:
```{
	placetypes = {"``placetype``", "``placetype``", ...},
	holonyms = {
		{ -- holonym object; see below
			placetype = "``placetype``" or nil,
			display_placename = "``placename``",
			unlinked_placename = "``placename``",
			langcode = "``langcode``" or nil,
			no_display = BOOLEAN,
			needs_article = BOOLEAN,
			force_the = BOOLEAN,
			affix_type = "``affix_type``" or nil,
			pluralize_affix = BOOLEAN,
			suppress_affix = BOOLEAN,
			continue_cat_loop = BOOLEAN,
		},
		...
	},
	order = { ``order_item``, ``order_item``, ... }, -- (only for new-style place descriptions)
	joiner = "``joiner_string``" or nil,
	holonyms_by_placetype = {
		``holonym_placetype`` = {"``placename``", "``placename``", ...},
		``holonym_placetype`` = {"``placename``", "``placename``", ...},
		...
	},
}```
Holonym objects have the following fields:
* `placetype`: The canonicalized placetype if specified as e.g. `c/Australia`; nil if no slash is present (in which case the placename in `display_placename` refers to raw text).
* `display_placename`: The placename or raw text, in the format to be displayed. Placename display aliases have already been resolved. It is raw text if `placetype` is nil.
* `unlinked_placename`: Same as `display_placename` but with links and HTML removed.
* `langcode`: The language code prefix if specified as e.g. `c/fr:Australie`; otherwise nil.
* `no_display`: If true (holonym prefixed with !), don't display the holonym but use it for categorization.
* `needs_article`: If true, prepend an article if the placename needs one (e.g. `United States`).
* `force_the`: If true, always prepend the article `the`. Example use: holonym 'city:pref:the/Gold Coast', which gets formatted as `(the) city of the [[Gold Coast]]`.
* `affix_type`: Type of affix to prepend (values `pref` or `Pref`) or append (values `suf` or `Suf`).
The actual affix added is the placetype (capitalized if values `Pref` or `Suf` are given), or its plural if `pluralize_affix` is given. Note that some placetypes (e.g. `district` and `department`) have inherent affixes displayed after (or sometimes before) them.
* `pluralize_affix`: Pluralize any displayed affix. Used for holonyms like `c:pref/Canada,US`, which displays as `the countries of Canada and the United States`.
* `suppress_affix`: Don't display any affix even if the placetype has an inherent affix. Used for the non-last placenames when there are multiple and a suffix is present, and for the non-first placenames when there are multiple and a prefix is present.
* `continue_cat_loop`: If true (holonym marked with `:also`), continue producing categories starting with this holonym when preceding holonyms generated categories.

Note that new-style place descs (those specified as a single argument using <<...>> to denote placetypes, placetype qualifiers and holonyms) have an additional `order` field to properly capture the raw text surrounding the items denoted in double angle brackets. The ``order_item`` items in the `order` field are objects of the following form:
```{
	type = "``order_type``",
	value = "STRING" or INDEX,
}```
Here, the ``order_type`` is one of `"raw"`, `"qualifier"`, `"placetype"` or `"holonym"`:
* `"raw"` is used for raw text surrounding `<<...>>` specs.
* `"qualifier"` is used for `<<...>>` specs without slashes in them that consist only of qualifiers (e.g. the spec `<<former>>` in `<<former>> French <<colony>>`).
* `"placetype"` is used for `<<...>>` specs without slashes that do not consist only of qualifiers.
* `"holonym"` is used for holonyms, i.e. `<<...>>` specs with a slash in them.

For all types but `"holonym"`, the value is a string, specifying the text in question. For `"holonym"`, the value is a numeric index into the `holonyms` field.
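The way the `order` field ties raw text, placetypes and holonyms together can be illustrated with a minimal sketch. The helper `render_order` below is hypothetical and not part of the module; it simply concatenates the items and ignores articles, affixes and linking, all of which the real display code handles. The sample description is an assumption about how a spec such as `a <<city>> in <<c/Australia>>` might be split.

```lua
-- Hypothetical helper (not part of Module:place): flatten the `order` list
-- of a new-style place description back into plain text.
local function render_order(desc)
	local parts = {}
	for _, item in ipairs(desc.order) do
		if item.type == "holonym" then
			-- For holonyms, `value` is a numeric index into `desc.holonyms`.
			local holonym = desc.holonyms[item.value]
			table.insert(parts, holonym.display_placename)
		else
			-- "raw", "qualifier" and "placetype" items carry their text directly.
			table.insert(parts, item.value)
		end
	end
	return table.concat(parts)
end

-- Assumed shape of the parsed description; the split of raw text is illustrative.
local desc = {
	placetypes = {"city"},
	holonyms = {
		{ placetype = "country", display_placename = "Australia", unlinked_placename = "Australia" },
	},
	order = {
		{ type = "raw", value = "a " },
		{ type = "placetype", value = "city" },
		{ type = "raw", value = " in " },
		{ type = "holonym", value = 1 },
	},
}

print(render_order(desc))
```

Storing an index rather than a copy of the holonym means that later changes to the `holonyms` list entries are visible when the description is displayed.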
It should be noted that placetypes and placenames occurring inside the holonyms structure are canonicalized, but placetypes inside the placetypes structure are as specified by the user. Stripping off of qualifiers and canonicalization of qualifiers and bare placetypes happens later.

The information under `holonyms_by_placetype` is redundant to the information in holonyms but makes categorization easier. The holonym placenames listed here already have category aliases applied.

For example, the call {{tl|place|en|city|s/Pennsylvania|c/US}} will result in the return value
```{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania" },
		{ placetype = "country", display_placename = "United States", unlinked_placename = "United States" },
	},
	holonyms_by_placetype = {
		state = {"Pennsylvania"},
		country = {"United States"},
	},
}```
Here, the placetype aliases `s` and `c` have been expanded into `state` and `country` respectively, and the placename display alias `US` has been expanded into `United States`. `placetypes` is a list because there may be more than one. For example, the call {{tl|place|en|city/and/municipality|p/[[Kwango]] Province|c/Congo}} will result in the return value
```{
	placetypes = {"city", "และ", "municipality"},
	holonyms = {
		{ placetype = "province", display_placename = "[[Kwango]] Province", unlinked_placename = "Kwango Province" },
		{ placetype = "country", display_placename = "Congo", unlinked_placename = "Congo" },
	},
	holonyms_by_placetype = {
		country = {"Congo"},
	},
}```
Here, the `unlinked_placename` field has removed links from `display_placename`. The value in the key/value pairs is likewise a list; e.g. 
the call {{tl|place|en|city|s/Kansas|and|s/Missouri}} will return
```
{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
		{ display_placename = "และ", unlinked_placename = "และ" },
		{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
	},
	holonyms_by_placetype = {
		state = {"Kansas", "Missouri"},
	},
}
```
Note that in `get_cats()` (which runs after the display form has been generated), further changes to the holonym structure are made to aid in categorization. For example, after `handle_category_implications()` and `augment_holonyms_with_container()` are called, the above structure will look more like
```
{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
		{ placetype = "country", unlinked_placename = "United States" },
		{ display_placename = "และ", unlinked_placename = "และ" },
		{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
		{ placetype = "country", unlinked_placename = "United States" },
	},
	holonyms_by_placetype = {
		state = {"Kansas", "Missouri"},
		country = {"United States"}
	},
}
```

===Overall place specs===
The overall place spec parsed by `parse_overall_place_spec` has the following fields:
* `lang`: The language object (from {{para|1}}).
* `args`: The parsed arguments from the {{tl|place}} call.
* `directives`: List of form-of directives (starting with `@`) parsed from the numeric args beginning with {{para|2}}. Each directive contains fields `directive` (the directive as specified by the user, e.g. 
`"former name of"`); `terms` (list of term objects for the terms specified by the user); `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}); `spec` (the corresponding directive spec from `all_form_of_directives`); `pretext` (the text to display directly before the directive); `posttext` (the text to display directly after the directive; {nil} except for the last directive). * `descs`: List of one or more place description objects parsed from the numeric args beginning with {{para|2}}, as described above. * `extra_info`: List of extra-info objects for extra info specified using arguments such as {{para|capital}}, {{para|modern}}, etc. Objects are in the order they should be displayed, and each object contains fields `spec` (the spec for the type of extra info, taken from `export.extra_info_args`), `terms` (list of term objects for the terms specified by the user); and `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}). ===Category determination=== The algorithm to find the categories to which a given place belongs works off of a place description (which specifies the entry placetype(s) and holonym(s); see above). If there are multiple place descriptions, each is processed independently to generate categories. Likewise, if there are multiple entry placetypes in a given place description, each is processed independently with all the holonyms of the description to generate categories. Furthermore, before the category-generation algorithm runs, earlier steps have modified the holonyms of the place description (inserting containing polities whenever possible; see the description above of `handle_category_implications()` and `augment_holonyms_with_container()`). Given a single entry placetype and a place description, the algorithm to generate categories processes holonyms from left to right until it finds one that "matches" in that it produces one or more categories. 
At that point it attempts to generate categories for all other holonyms in the place description of the same placetype. Normally, it then stops processing holonyms, but if a holonym is marked using the `:also` modifier, the category generation process starts over starting with that holonym (or the leftmost such remaining holonym, if there is more than one marked with `:also`). This makes it possible, for example, to specify the description of a river that passes through two different types of political divisions (e.g. Alberta and the Northwest Territories), or categorize a geographic region at both the continent and country level, such as this: <pre> {{place|en|historical region|r/Eastern Europe|located in southeastern|c:also/Poland|*and western|c/Ukraine}} </pre> Here, `r/Eastern Europe` has a category implication that adds `cont/Europe` as a holonym directly after it, which causes the page to be categorized into [[:Category:en:Geographic and cultural areas of Europe]]. The category generation process would normally stop at this point, but the presence of `:also` causes it to restart with `c/Poland` and generate the category [[:Category:en:Geographic and cultural areas of Poland]]. After doing this, it looks for other holonyms of the same placetype as `c/Poland` (i.e. other countries), which causes it to process `c/Ukraine` and generate the category [[:Category:en:Geographic and cultural areas of Ukraine]]. The category generation process works off of the `placetype_data` table, which specifies various properties for placetypes, such as how to display a holonym of that placetype as well as how to categorize certain pages where the {{tl|place}} call contains the specified placetype as an entry placetype. 
For example, the entry for `city-state` in [[Module:place/placetypes]] might look like
```
["city-state"] = {
	link = true,
	category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
	has_neighborhoods = true,
	class = "settlement",
	["continent/*"] = {"City-states", "นคร", "ประเทศ", "ประเทศใน+++", "เมืองหลวงของประเทศ"},
	default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"},
},
```
Here, the keys specify, respectively:
# If `city-state` occurs as an entry placetype, link it to the corresponding Wiktionary entry (that is what `true` means in `link = true`).
# Use the specified `category_link` text for categories such as [[:Category:City-states]].
# City-states are "city-like", i.e. they have neighborhoods; this controls the handling of entry placetypes such as `neighborhood`, `district`, `area`, etc.
# City-states should be treated as settlements for determining how to handle the placetype `former city-state` and for categorizing the bare category [[:Category:City-states]] and language-specific equivalents such as [[:Category:en:City-states]].
# When the entry placetype `city-state` occurs along with a continent holonym, categorize into the specified categories under `continent/*`. Here, `+++` stands for the holonym in question.
# When the entry placetype `city-state` occurs in any other context, categorize into the specified categories under `default`.

It's important to realize that the only categorization keys under a given placetype entry that are specified explicitly in [[Module:place/placetypes]] are certain wildcard keys such as `continent/*` above (i.e. containing a slash followed by `*`) and the key `default`. All the remaining categorization happens through category handlers, based on the information on known locations in [[Module:place/locations]].
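The interplay of a wildcard key, the `+++` placeholder and the `default` key can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the module's code: the function `wildcard_cats`, the cut-down `placetype_data` table and the English category name `Countries in +++` are all hypothetical, and category handlers, `:also` and any article handling are left out.

```lua
-- Hypothetical, cut-down placetype data; the category names are illustrative only.
local placetype_data = {
	["city-state"] = {
		["continent/*"] = {"City-states", "Countries in +++"},
		default = {"City-states"},
	},
}

-- Hypothetical helper: return the category specs for the first holonym whose
-- placetype has a wildcard key, or the `default` specs if no holonym matches.
local function wildcard_cats(entry_placetype, holonyms)
	local data = placetype_data[entry_placetype]
	if not data then
		return {}
	end
	for _, holonym in ipairs(holonyms) do
		local specs = holonym.placetype and data[holonym.placetype .. "/*"]
		if specs then
			local cats = {}
			for _, spec in ipairs(specs) do
				-- `+` is a magic character in Lua patterns, so it must be escaped.
				-- A function replacement avoids special handling of `%` in placenames.
				local cat = spec:gsub("%+%+%+", function()
					return holonym.unlinked_placename
				end)
				table.insert(cats, cat)
			end
			return cats
		end
	end
	return data.default or {}
end

local cats = wildcard_cats("city-state", {
	{ placetype = "continent", unlinked_placename = "Asia" },
})
print(table.concat(cats, "; "))
```

Everything beyond such wildcard and `default` keys is driven by the known-location data in [[Module:place/locations]].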
For example, [[Module:place/locations]] has an "England group" specified similarly to the following:
```
export.england_group = {
	default_container = {key = "England", placetype = "constituent country"},
	default_placetype = "county",
	default_divs = {
		"districts",
		{type = "local government districts", cat_as = "districts"},
		{
			type = "local government districts with borough status",
			cat_as = {"districts", "boroughs"},
		},
		{type = "boroughs", cat_as = {"districts", "boroughs"}},
		"civil parishes",
	},
	default_british_spelling = true,
	data = export.england_counties,
}
```
The `default_divs` key here specifies the divisions that exist for each of the counties listed under the `data` key (unless the key overrides them). Here, the entry `{type = "boroughs", cat_as = {"districts", "boroughs"}}` directs the category handler `political_division_cat_handler` in [[Module:place/placetypes]] (which is one of two category handlers that run for all entry placetypes, along with `generic_place_cat_handler`) to categorize boroughs specified under any of the counties listed under `data` as both districts and boroughs.

Now, the categorization process proceeds as follows, given an entry placetype and place description, which specifies a set of holonyms (the code to do this is in `get_placetype_cats()`):
# First, look up the entry placetype and any equivalent placetypes in `placetype_data`, which is defined in [[Module:place/placetypes]]. Note that the entry in `placetype_data` that specifies the placetype information that is used to determine the category or categories may not directly correspond to the entry placetype as specified in the place description. For example, if the entry placetype is `small town`, the placetype whose data is fetched will be `town` since `small` is a recognized qualifier and there is no entry in `placetype_data` for `small town`.
As another example, if the entry placetype is `administrative capital`, the code will first look up `administrative capital` and then look up `capital city`, which is where the category handler is found, because `administrative capital` specifies `capital city` as its fallback. # Then, iterate over holonyms from left to right, as described above. For each holonym, we proceed as follows: ## First, call `political_division_cat_handler` to check if the entry placetype and holonym match a division in the `locations` data in [[Module:place/locations]], as in the example above. Note that when doing this, holonyms are canonicalized so that e.g. `co/Bedfordshire` gets mapped to `county/Bedfordshire` (because there is an entry in `placetype_aliases` in [[Module:place/placetypes]] that maps `co` to `county`) and `c/USA` gets mapped to `country/United States` (because there is an entry in the location data for the list of countries that maps `country/USA` to `country/United States` for both display and categorization purposes). This category handler, as with all such handlers, is passed the entry placetype and holonym being processed, but is also passed the entire place description, so it can look at other specified holonyms (particularly those that follow). It either returns {nil} or a list of category specs (which are the actual categories minus the preceding language code). ## If `political_division_cat_handler` doesn't generate any categories, check if there is a category handler defined using the `cat_handler` key for the entry placetype. If so, call it to generate the categories (if any). ## If the category handler returns {nil}, or there is no category handler, look for a ''wildcard key'' of the format e.g. `country/*`, which matches any holonym of placetype `country`. If found, the value is a list of category specs, which are processed as above. ## If we get this far without generating any categories, move to the next holonym. 
## If we do generate any categories, process all other holonyms of the same placetype. For example, if the user says {{tl|place|en|city|s/Kansas|and|s/Missouri}}, when we get to the holonym `s/Kansas`, we generate the category [[:Category:en:Cities in Kansas, USA]]. This causes us to look for other holonyms of the same placetype `state`, and process them accordingly, generating a category [[:Category:en:Cities in Missouri, USA]] as well. The same thing happens in an invocation like {{tl|place|pl|river|c/Poland,Ukraine,Belarus}}. # Once we generate categories for a holonym and any other holonyms of the same placetype, we normally stop processing holonyms. But if a holonym has the `:also` modifier, we restart the left-to-right loop at that holonym. For example, in the invocation {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr/Northwest Territories}}, we will generate a category [[:Category:en:Rivers in Alberta, Canada]] as well as [[:Category:en:Rivers in British Columbia, Canada]] (because British Columbia is of the same placetype as Alberta); but no category will be generated for the Northwest Territories, which is of a different placetype. To fix this, write {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr:also/Northwest Territories}}. The use of `:also` will cause holonym processing to resume at `Northwest Territories` after `Alberta` is processed, leading to an additional category [[:Category:en:Rivers in the Northwest Territories, Canada]]. (The presence of `the` in this last category is because `Northwest Territories` is a known location with a spec indicating that it should be preceded by `the`; it has nothing to do with the raw text `and the` in the invocation.) # Finally, if we process all holonyms and don't end up producing any categories, we check the entry placetype's data for a `default` key. If found, it lists category specs, which are processed to generate categories. 
This is used, for example, in the placetype `city-state`, as described above. # It should be noted that the above process runs independently for each combination of entry placetype and place description. Thus, for example, an invocation {{tl|place|en|city/and/county|s/Kansas,Missouri|c/USA}} will generate categories for both cities and counties in both Kansas and Missouri. # Two additional sources of categories are ''bare location'' categories and ''generic place'' categories. These categories are added by appropriate calls in the outer function `get_cats`, which iterates over placetypes and place descriptions, calling `get_placetype_cats` on each combination. ## Bare location categories are categories like [[:Category:Arizona, USA]] that are related-to categories containing terms related to the specified location. The bare location code, for example, adds the term [[Arizona]], and its equivalents in other languages, to [[:Category:Arizona, USA]]. When looking for terms to consider, it checks the pagename, the glosses specified using {{para|t}}, and the terms specified using {{para|modern}}, {{para|short}} and {{para|full}}. 
It looks to see if any of these parameters match any known locations, but only adds them to a bare location category if (a) the specified entry placetype matches, so that for example Russian `[[Джорджия]]` goes into [[:Category:Georgia, USA]] while `[[Грузия]]` goes into [[:Category:Georgia]] (the country), even though both have a gloss `Georgia`; and (b) there are no conflicting holonyms, so that for example the Old English term [[Munucceaster]] if defined similarly to {{tl|place|ang|city|in modern|cc/England|t=Newcastle}} won't get added to [[:Category:Newcastle, New South Wales]] (even though it is also a city) because the latter city is known to be in Australia, which conflicts with the country `United Kingdom` (added internally to the Old English place description through the holonym augmentation process, based on the holonym `cc/England`). ## Generic place categories are categories like [[:Category:Places in Kansas, USA]] and [[:Category:Places in England]] that contain places of arbitrary placetype. These are added through a special category handler that operates like other category handlers but is run for all placetypes, rather than only for the specified one(s). ]==] --[=[ TODO/FIXME: 1. [DONE] Neighborhoods should categorize at the city level. Categories like [[:Category:Places in Los Angeles]] exist but not [[:Category:Neighborhoods in Los Angeles]]; we can refactor the code in generic_cat_handler() to support this use case. 2. Display handlers should be smarter. For example, 'co/Travis' as a holonym should display as 'Travis County' in the United States, but (I think) display handlers don't currently have the full context of holonyms passed in to allow this to happen. 3. Connected to this, we have various display handlers that add the name of the holonym after or (sometimes) before the placename if it's not already there. 
An example is the county_display_handler() in [[Module:place/placetypes]], which adds "County" before Ireland and Northern Ireland counties and after Taiwan and Romania counties. This should be integrated into the polity group for these respective polities through a setting rather than requiring a separate handler that has special casing for various polities.
4. Placetypes for toponyms should also have display handlers rather than just fixed text. This should allow us to dispense with the need for special types for "fpref" = "French prefecture" (which displays as "prefecture" but links to the appropriate Wikipedia article on French prefectures, which are completely different from the more general concept of prefecture). Similarly for "Polish colony" and "Welsh community". ("Israeli settlement" should probably stay as-is because it displays as "Israeli settlement" not just "settlement".)
5. [DONE] Currently, categories for e.g. states and territories of Australia go into [[:Category:States and territories of Australia]] but terms for states and territories of Australia go into (respectively) [[:Category:States of Australia]] and [[:Category:Territories of Australia]]. We should fix this; maybe this is as easy as setting cat_as in the respective divs definitions.
6. Probably cat_as should support raw categories as well as category types; raw categories would be indicated by being prefixed with "Category:".
7. [MOSTLY DONE] Update documentation.
8. [DONE] Rename remaining political division categories to include name of country in them.
9. [DONE] Add Pakistan provinces and territories.
10. [DONE] Add a polity group for continents and continent-level regions instead of special-casing. This should make it possible e.g. to have Jerusalem as a city under "Asia".
11. [DONE] Add better handling of cities that are their own states, like Mexico City.
12. [DONE] Breadcrumb for e.g. [[Category:Aguascalientes, Mexico]] is "Aguascalientes, Mexico" instead of just "Aguascalientes".
13. [DONE] Unify aliasing system; cities have a completely different mechanism (alias_of) vs. polities/subpolities (which use `placename_cat_aliases` and `placename_display_aliases` in [[Module:place/placetypes]]).
14. [DONE] More generally, cities should be unified into the polity grouping system to the extent possible; this would allow for divs of cities (see #17 below).
15. [DONE] We have `no_containing_polity_cat` set for Lebanon, Malta and Saudi Arabia to prevent country-level implications from being added due to generically-named divisions like "North Governorate", "Central Region" and "Eastern Province" but (a) this setting seems to do multiple things and should be split, (b) it should be possible to set this at the division level instead of the country level.
16. Split out the data from the handlers so we can use loadData() on the data because it's becoming very big.
17. [DONE] Cities like Tokyo have special wards; "prefecture-level cities" like Wuhan (which aren't really cities but we treat them as such) have districts, subdistricts, etc. We need to support divs for cities and even named divisions of cities (such as we already have for boroughs of New York City).
18. [DONE] It should be allowed to set 'true' to any qualifier (which links it) and have it work correctly; qualifier lookup in [[Module:place]] needs to remove links first.
19. [DONE] Categories 'Historical polities' and 'Historical political subdivisions' should be renamed 'Former ...' since "historic(al)" is ambiguous (cf. "historic counties" in England which are not former, but still have a legal definition).
20. [PARTLY DONE; SUPPORT IS THERE BUT FORMER PROVINCES NOT YET CATEGORIZED] It should be possible to categorize former subpolities of certain polities; cf. [[:Category:ja:Provinces of Japan]], which contains former provinces.
21.
[DONE] In subpolity_keydesc(), we need to generate the correct indefinite article and have a huge hack to check specifically for "union territory", which is the only placetype that shows up in this function where the default indefinite article generating function fails. To fix this properly, we need to separate out the non-category placetype data from `cat_data` in [[Module:place/placetypes]] and move it to [[Module:place/locations]], because we don't have access to the data in [[Module:place/placetypes]], and that data indicates the correct article for placetypes like "union territory". 22. [DONE] Simplify the specs in `cat_data`, eliminating the distinction between "inner" and "outer" matching. There should not be two levels, just one. For example, in "district", instead of ["country/Portugal"] = { ["itself"] = {"Districts and autonomous regions of +++"}, } we should just have ["country/Portugal"] = {"Districts and autonomous regions of +++"}, And in "dependent territory", instead of ["default"] = { ["itself"] = {true}, ["country"] = {true}, }, we should just have ["itself"] = {true}, ["country/*"] = {true}, It appears the only remaining spec that can't be easily converted in this fashion is for "subdistrict": ["country/Indonesia"] = { ["municipality"] = {true}, }, This seems to be specifically for Jakarta and doesn't seem to work anyway, as the two entries in [[:Category:en:Subdistricts of Jakarta]] and the one entry in [[:Category:id:Subdistricts of Jakarta]] are manually categorized. 23. [DONE] Consolidate the remaining stuff in [[Module:category tree/topic cat/data/Earth]] into [[Module:category tree/topic cat/data/Places]]. 24. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that are in different polities from the specified containing polity/polities of the city, but doesn't do the same for larger-level divisions. Likewise for the `city_type_cat_handler`. 
There are some sufficiently generically-named divisions that this issue can occur; for example, [[Koforidua]], the capital city of Eastern Region, Ghana, is incorrectly categorized under [[:Category:en:Cities in Eastern Region, Malta]] and [[:Category:en:Places in Eastern Region, Malta]]. Note that the function `augment_holonyms_with_container` ''DOES'' do such checks, so we should be able to refactor the code out of that function and use it elsewhere. 25. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that are in different polities from the specified containing polity/polities of the city; but how smart is it? It will successfully avoid categorizing a neighborhood in e.g. [[Columbus]], [[Georgia]] that doesn't explicitly mention the US (only `s/Georgia`) into [[:Category:en:Places in Columbus]], which is for Columbus, Ohio, but will it do the same for a hypothetical neighborhood of Columbus in say Merseyside, England? This should be investigated. It will probably work for a hypothetical Columbus in [[Canada]] because `augment_holonyms_with_container` would auto-add Canada as an additional holonym once say `p/Ontario` is mentioned, but I think there's a setting preventing this augmentation from happening for the UK. (This relates to FIXME #15. `no_containing_polity_cat` is set on England, Scotland, etc. to prevent the toponyms from being added to [[:Category:en:Places in the United Kingdom]], but this same setting is used to prevent augmentation, which it should not be; there should be different settings.) 26. [DONE] The `generic_cat_handler` (or more specifically `find_holonym_keys_for_categorization`) checks for city holonyms by looking specifically for holonym type `city`. But some cities (particularly those in China) can be specified using different holonym types, e.g. `prefecture-level city`, `subprovincial city`, etc. 
We should allow these when appropriate (which means the cities in China need to have a `placetype` set that indicates their regional-level status as well as just `city`). I'm not sure if cities support specifying a custom `placetype` at the moment; this relates to FIXME #14 above concerning unifying cities and political divisions internally. 27. [DONE] The bare category handler (`get_bare_categories` in [[Module:place/placetypes]]) is not smart enough to avoid overcategorizing cities or other divisions that are of the right placetype but in the wrong containing polity. For example, Asturian [[Llión]] "León (city in Spain)" gets put in [[:Category:ast:León]] even though the latter is supposed to refer to a city in Mexico. We can borrow the check-containing-polity code from `generic_cat_handler`. 28. [DONE] Redo handling of singular and plural to respect overrides specified in placetype_data. Check more carefully for things that may not singularize correctly, e.g. 'passes' -> 'passe'? Definitely 'headquarters' and variants. 29. [DONE] Combine placetype_equivs and other placetype data into `placetype_data`. Figure out if we need the distinction between `placetype_equivs` and `fallback`. 30. `has_neighborhoods` may need to be a function that can look at the containing holonyms to determine whether the entity in question is city-like. 31. [DONE] Bare placenames as they appear in holonyms (e.g. `Riau Islands`) instead of category keys (e.g. `the Riau Islands, Indonesia`) should appear in the polity data tables. As a first pass, the word "the" should not appear but should instead be a property of the polity. 32. [DONE] `capital_city_cat_handler` should use `get_holonyms_to_check()`. 33. [PARTLY DONE] The code to generate and parse the correct preposition ("in" or "of") is very convoluted, and the actual preposition used is specified in various locations with various defaults, sometimes hardcoded. This should be simplified. 
It is made more difficult by the fact that the in/of distinction occurs in several places: (a) when generating the {{place}} text in old-style descriptions where the preposition isn't explicitly given, which uses the `preposition` setting in placetype_data, defaulting to "in"; (b) when generating categories based on explicit category specs in placetype_data (which are gradually being deprecated), which likewise uses the `preposition` setting in placetype_data, defaulting to "in"; (c) when generating categories based on political_division_cat_handler, originating in the `divs` placetypes for specific known locations in [[Module:place/locations]], which uses the `prep` setting embedded in the `divs` specifications, defaulting to "of"; (d) when generating categories based on category handlers specified using the `cat_handler` property of entries in placetype_data, which tend to hardcode "in" or "of" depending on the specific category handler; (e) when generating category descriptions in [[Module:category tree/topic/Places]] for `divs` categories generated in (c), which (correctly) uses the same `prep` setting embedded in the `divs` settings that is used when generating the categories themselves; (f) when generating category descriptions for categories generated in (b) and (d) above, which relies on the `generic_before_non_cities` and `generic_before_cities` settings in placetype_data, which need to match the corresponding prepositions hardcoded in the category generation handlers. Instead of the hardcoding, the category generation handler should respect the `generic_before_*` settings. 34. [[Krakow]] defined as {{place|en|A <<city>> on the [[Vistula]] River, the <<capital>> of the <<voi/Lesser Poland Voivodeship>> in southern <<c/Poland>>}} categorizes under [[:Category:Voivodeship capitals]] when it should probably instead be under [[:Category:Voivodeship capitals of Poland]]. 
Possibly this is because the various voivodeships haven't yet been entered as known locations, but this should happen regardless of that.
35. {{tcl}} bugs:
   a. [DONE] Lowercase initial letter in new-style {{place}} descriptions in {{tcl}}. Maybe we can have a setting tcl_nolc=1 to prevent this from happening.
   b. [DONE] tcl= and probably new-style {{place}} descriptions in general should recognize ;; to separate distinct {{place}} descriptions, and similarly ;;and as the equivalent of regular `;and`, etc.
   c. [DONE] The value supplied in `modern=` should be displayed in {{tcl}} descriptions regardless of the setting that normally disables this, so that e.g. the foreign-language equivalent of [[British Honduras]] doesn't just say it's a former British colony in Central America but specifically identifies it as modern Belize. If the user gives place_modern= in {{tcl}}, that should override the modern= value and still display.
   d. [DONE] The page supplied to {{tcl}} should be used for generating bare categories even if t= is supplied and overrides the English term displayed.
   e. [DONE] If text follows {{place}} and begins with a semicolon, the semicolon isn't copied into {{tcl}}.
36. County boroughs used as holonyms currently display 'borough county borough' because there's an affix setting for 'county borough' and a fallback display handler for 'borough'. We need to rethink this; maybe merge the affix setting and display handlers.
37. Implement known-location groups and specs in a more standard object-oriented way using metatables.
38. Implement caching of known location lookup in the holonym. This may have to be keyed by placetype, but we can have a special field for when the lookup placetype is the same as the user-specified placetype of the holonym. Use this known location in place of looking up known locations and store the appropriate known location there in `augment_holonyms_with_container()` instead of calling `key_to_placename`.
39.
Bug fixes with 'the':
   (a) [DONE] [[Kazaň]] defined as {{place|cs|caplc|rep:Pref/Tatarstan|c/Russia|t1=Kazan}} displays as "Republic of the Tatarstan".
   (b) [[Valday]] defined as {{place|en|town/administrative center|dist:Suf/Valdaysky|obl/Novgorod|c/Russia}} displays as "a town, the administrative center of the Valdaysky District". Changing to `dist:suf/Valdaysky` displays as "... of Valdaysky district".
40. [DONE] Bug fix with 'the': [[Verkhoyansk]] defined as {{place|en|town|rep/Sakha|c/Russia}} displays as "a town in the Sakha".
41. [DONE] [[Category:Cities in Asia]] has [[Category:Cities in Eurasia]] as a parent, which in turn has [[Category:Cities in the Earth]] as a parent. Continents should not have the second parent like this.
42. [DONE] When checking `british_spelling`, it should check all containers as well; otherwise it's too hard to keep this in sync across cities, administrative divisions and countries.
43. [DONE] `skip_polity_parent_type` should be renamed to container_parent_type or similar.
44. There should be a flag to allow e.g. departments of France that are currently categorized as departments of their region to also be categorized as departments of France.
45. [DONE] Aliases are causing iterate_matching_holonym_location() to fail, e.g. if [[براق]] "Prague" is specified as {{place|acw|capital city|c/Czechia|t1=Prague}}, this fails to add a bare category [[Category:acw:Prague]] because the code in iterate_matching_holonym_location() isn't resolving aliases when comparing the known container 'Czech Republic'. Probably we want to build an alias table to speed up these sorts of lookups.
46. [DONE; DUE TO TYPO IN HANDLER] The district cat handler is failing to work right, e.g. in [[Saint-Gaudérique]] defined as {{place|fr|district|city/Perpignan|in|dept/Pyrénées-Orientales|r/Occitania|c/France|t=Saint-Gaudérique}}, only the 'Places in ...' categories are getting triggered.
47.
Suburbs of a given city aren't generally in the city and may not even be in the same country or country division, so they should not categorize as "Places in ..." based on the city and specified country and division. Same goes for "enclave" (within somewhere) and "exclave". 48. When converting display aliases, we should automatically convert full placenames to full placenames and elliptical placenames to elliptical placenames instead of always either doing elliptical or full placenames depending on the value of `display_as_full`. 49. `@obsolete form of` and `@archaic form of` should automatically trigger nocat=1. 50. The handler that adds bare categories should pick up values in <eq:...>. ]=] --[==[ var: List specifying the allowed form-of directives, used for former names, official names, abbreviations, etc. of places. The key is the form-of directive and the value is an object with the following properties: * `text`: The actual text displayed before the terms. If the value is `+`, the key is used as the text. If the value is a function, it is passed a single argument, the overall place spec (see comment at top of file) and should return the text to be displayed. * `type_prefix`: The prefix used to generate the placetype for looking up the appropriate category or categories in the placetype data structure. Can be omitted if there are no categories associated with the directive. * `conjunction`: The conjunction used to join multiple terms, defaulting to `and`. * `cat`: Additional category or categories to add the term to, whenever this particular directive is used. Normally the value is a topic-style category minus the langcode prefix, but if prefixed with `cln:`, it is a langname-style category. For example, the value `"Abbreviations"` would correspond to a category [[:Category:en:Abbreviations]] (assuming the language of the {{tl|place}} call is English), while the value `"cln:abbreviations"` corresponds to a category [[:Category:English abbreviations]]. 
Use a list of such specs for multiple categories. * `default_foreign`: If specified, the default language of terms given along with this directive is the language in {{para|1}}; otherwise it is English. ]==] export.all_form_of_directives = { ["former name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"}, ["fmr of"] = {alias_of = "former name of"}, ["ancient name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"}, ["official name of"] = {text = "+", type_prefix = "OFFICIAL_NAME_OF"}, ["former official name of"] = {text = "+", type_prefix = "FORMER_OFFICIAL_NAME_OF"}, ["long form of"] = {text = "+", type_prefix = "LONG_FORM_OF"}, ["former long form of"] = {text = "+", type_prefix = "FORMER_LONG_FORM_OF"}, ["nickname for"] = {text = "+", type_prefix = "NICKNAME_FOR"}, ["official nickname for"] = {text = "+", type_prefix = "OFFICIAL_NICKNAME_FOR"}, ["former nickname for"] = {text = "+", type_prefix = "FORMER_NICKNAME_FOR"}, ["derogatory name for"] = {text = "[[Appendix:Glossary#derogatory|derogatory]] name for", type_prefix = "DEROGATORY_NAME_FOR"}, ["synonym of"] = {text = "+"}, ["syn of"] = {alias_of = "synonym of"}, ["abbreviation of"] = {text = "[[Appendix:Glossary#abbreviation|abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:abbreviations", default_foreign = true}, ["abbr of"] = {alias_of = "abbreviation of"}, ["abbrev of"] = {alias_of = "abbreviation of"}, ["initialism of"] = {text = "[[Appendix:Glossary#initialism|initialism]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:initialisms", default_foreign = true}, ["init of"] = {alias_of = "initialism of"}, ["acronym of"] = {text = "[[Appendix:Glossary#acronym|acronym]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:acronyms", default_foreign = true}, ["syllabic abbreviation of"] = {text = "[[Appendix:Glossary#syllabic abbreviation|syllabic abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:syllabic abbreviations", default_foreign = true}, ["sylabbr of"] = {alias_of = "syllabic 
abbreviation of"}, ["sylabbrev of"] = {alias_of = "syllabic abbreviation of"}, ["ellipsis of"] = {text = "[[Appendix:Glossary#ellipsis|ellipsis]] of", type_prefix = "ELLIPSIS_OF", cat = "cln:ellipses", default_foreign = true}, ["ellip of"] = {alias_of = "ellipsis of"}, ["clipping of"] = {text = "[[Appendix:Glossary#clipping|clipping]] of", type_prefix = "CLIPPING_OF", cat = "cln:clippings", default_foreign = true}, ["clip of"] = {alias_of = "clipping of"}, ["alternative form of"] = {text = "+", default_foreign = true}, ["alt form"] = {alias_of = "alternative form of"}, ["alternative spelling of"] = {text = "+", default_foreign = true}, ["alt spell"] = {alias_of = "alternative spelling of"}, ["alt sp"] = {alias_of = "alternative spelling of"}, ["dated form of"] = {text = "[[Appendix:Glossary#dated|dated]] form of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms", default_foreign = true}, ["dated form"] = {alias_of = "dated form of"}, ["dated spelling of"] = {text = "[[Appendix:Glossary#dated|dated]] spelling of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms", default_foreign = true}, ["dated spell"] = {alias_of = "dated spelling of"}, ["dated sp"] = {alias_of = "dated spelling of"}, ["archaic form of"] = {text = "[[Appendix:Glossary#archaic|archaic]] form of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms", default_foreign = true}, ["arch form"] = {alias_of = "archaic form of"}, ["archaic spelling of"] = {text = "[[Appendix:Glossary#archaic|archaic]] spelling of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms", default_foreign = true}, ["arch spell"] = {alias_of = "archaic spelling of"}, ["arch sp"] = {alias_of = "archaic spelling of"}, ["obsolete form of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] form of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms", default_foreign = true}, ["obs form"] = {alias_of = "obsolete form of"}, ["obsolete spelling of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] 
spelling of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms", default_foreign = true}, ["obs spell"] = {alias_of = "obsolete spelling of"}, ["obs sp"] = {alias_of = "obsolete spelling of"}, } local function get_seat_text(overall_place_spec) local placetype = overall_place_spec.descs[1].placetypes[1] if placetype == "county" or placetype == "counties" then return "county seat" elseif placetype == "parish" or placetype == "parishes" then return "parish seat" elseif placetype == "borough" or placetype == "boroughs" then return "borough seat" else return "seat" end end --[==[ var: List specifying the allowed arguments containing extra information that is sometimes added to a definition, such as the capital, largest city, modern name, official name, etc., along with associated properties; displayed in the order given. Each element is an object with the following properties: * `arg`: The argument name. * `text`: The actual text displayed before the terms. If the value is `+`, the argument name is used as the text. If the value is a function, it is passed a single argument, the overall place spec (see the comment at the top of the file) and should return the text to be displayed. * `conjunction`: The conjunction used to join multiple terms, defaulting to `and`. * `display_even_when_dropped`: Display this piece of extra info even when it would normally be dropped (e.g. in {{tl|tcl}} when the language is other than English). * `match_sentence_style`: If true, the text will be capitalized and preceded by a period when ''sentence style'' is in effect (essentially, when the language is English and there is no translation specified using {{para|t}} or similar parameter); otherwise, the text will be displayed as-is and preceded by a semicolon. If false, the semicolon style will always be used. * `auto_plural`: If true, pluralize the text when there is more than one term. * `with_colon`: If true, follow the text with a colon. 
(This colon cannot easily be included in the text itself because if pluralized, the pluralized text goes before the colon.) ]==] export.extra_info_args = { {arg = "modern", text = "+", conjunction = "หรือ", display_even_when_dropped = true}, {arg = "now", text = "now,", conjunction = "หรือ", display_even_when_dropped = true}, {arg = "full", text = "in full,", conjunction = "หรือ", display_even_when_dropped = true}, {arg = "short", text = "short form", conjunction = "หรือ"}, {arg = "abbr", text = "abbreviation", conjunction = "หรือ"}, {arg = "former", text = "formerly,"}, {arg = "official", text = "ชื่อทางการ", match_sentence_style = true, auto_plural = true, with_colon = true}, {arg = "capital", text = "เมืองหลวง", match_sentence_style = true, auto_plural = true, with_colon = true}, {arg = "largest city", text = "นครใหญ่สุด", match_sentence_style = true, auto_plural = true, with_colon = true}, {arg = "caplc", text = "เมืองหลวงและนครใหญ่สุด", match_sentence_style = true, auto_plural = false, with_colon = true}, {arg = "seat", text = get_seat_text, match_sentence_style = true, auto_plural = true, with_colon = true}, {arg = "shire town", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true}, {arg = "headquarters", text = "+", match_sentence_style = true, auto_plural = false, with_colon = true}, {arg = "center", text = "administrative center", match_sentence_style = true, auto_plural = false, with_colon = true}, {arg = "centre", text = "administrative centre", match_sentence_style = true, auto_plural = false, with_colon = true}, } export.extra_info_arg_map = {} for _, spec in ipairs(export.extra_info_args) do export.extra_info_arg_map[spec.arg] = spec end ----------- Wikicode utility functions -- Return a wikilink link {{l|language|text}} local function link(text, langcode, id) if not langcode then return text end return m_links.full_link( {term = text, lang = require(languages_module).getByCode(langcode, true, "allow etym"), id = id}, nil, 
"allow self link" ) end ---------- Basic utility functions -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. page) return true end local function ucfirst_all(text) if text:find(" ") then local parts = split(text, " ", true) for i, part in ipairs(parts) do parts[i] = m_strutils.ucfirst(part) end return concat(parts, " ") else return m_strutils.ucfirst(text) end end local function lc(text) return mw.getContentLanguage():lc(text) end ---------- Argument parsing functions and utilities -- Split an argument on comma, but not comma followed by whitespace. local function split_on_comma(val) if val:find(",") then return require(parse_interface_module).split_on_comma(val) else return {val} end end -- Split an argument on slash, but not slash occurring inside of HTML tags like </span> or <br />. local function split_on_slash(arg) if arg:find("<") then local m_parse_utilities = require(parse_utilities_module) -- We implement this by parsing balanced segment runs involving <...>, and splitting on slash in the remainder. -- The result is a list of lists, so we have to rejoin the inner lists by concatenating. local segments = m_parse_utilities.parse_balanced_segment_run(arg, "<", ">") local slash_separated_groups = m_parse_utilities.split_alternating_runs(segments, "/") for i, group in ipairs(slash_separated_groups) do slash_separated_groups[i] = concat(group) end return slash_separated_groups else return split(arg, "/", true) end end -- Implement "implications", i.e. where the presence of a given holonym causes additional holonym(s) to be added. -- Implications apply only to categorization. There used to be support for "general implications" that applied to both -- display and categorization, but there ended up not being any such implications, so we've removed the support. 
-- It is a bad idea in any case to have such implications; the user might purposely leave out a higher-level polity to avoid
-- redundancy in several successive definitions, and we wouldn't want to override that. Note that in practice the
-- mechanism implemented by this function is used specifically for non-administrative geographic regions such as
-- Eastern Europe and the West Bank; there is a similar mechanism for administrative regions handled by
-- `augment_holonyms_with_containing_polity` in [[Module:place/placetypes]].
--
-- `place_descriptions` is a list of place descriptions (see top of file, collectively describing the data passed to
-- {{place}}). `implication_data` is the data used to implement the implications, i.e. a table indexed by holonym
-- placetype, each value of which is a table indexed by holonym placename, each value of which is a list of
-- "PLACETYPE/PLACENAME" holonyms to be added to the end of the list of holonyms.
local function handle_category_implications(place_descriptions, implication_data)
	for i, desc in ipairs(place_descriptions) do
		if desc.holonyms then
			local new_holonyms = {}
			for _, holonym in ipairs(desc.holonyms) do
				insert(new_holonyms, holonym)
				local imp_data = m_placetypes.get_equiv_placetype_prop(holonym.placetype, function(pt)
					local implication = implication_data[pt] and implication_data[pt][holonym.unlinked_placename]
					if implication then
						return implication
					end
				end)
				if imp_data then
					for _, holonym_to_add in ipairs(imp_data) do
						local split_holonym = split_on_slash(holonym_to_add)
						if #split_holonym ~= 2 then
							internal_error("Invalid holonym in implications: %s", holonym_to_add)
						end
						local holonym_placetype, holonym_placename = unpack(split_holonym, 1, 2)
						local new_holonym = {
							-- By the time we run, the display has already been generated so we don't need to set
							-- display_placename.
							placetype = holonym_placetype,
							unlinked_placename = holonym_placename
						}
						insert(new_holonyms, new_holonym)
						m_placetypes.key_holonym_into_place_desc(desc, new_holonym)
					end
				end
			end
			desc.holonyms = new_holonyms
		end
	end
end


-- Split a holonym (e.g. "continent/Europe" or "country/en:Italy" or "in southern" or "r:suf/O'Higgins" or
-- "c/Austria,Germany,Czech Republic") into its components. Return a list of holonym objects (see top of file). Note
-- that if there isn't a slash in the holonym (e.g. "in southern"), the `placetype` field of the holonym will be nil.
-- Placetype aliases (e.g. "r" for "region") and placename aliases (e.g. "US" or "USA" for "United States") will be
-- expanded.
local function split_holonym(raw)
	local no_display, combined_holonym = raw:match("^(!)(.*)$")
	no_display = not not no_display
	combined_holonym = combined_holonym or raw
	local suppress_comma, combined_holonym_without_comma = combined_holonym:match("^(%*)(.*)$")
	suppress_comma = not not suppress_comma
	combined_holonym = combined_holonym_without_comma or combined_holonym
	local holonym_parts = split_on_slash(combined_holonym)
	if #holonym_parts == 1 then
		-- `unlinked_placename` should not be used.
		return {{display_placename = combined_holonym, no_display = no_display, suppress_comma = suppress_comma}}
	end

	-- Rejoin further slashes in case of slash in holonym placename, e.g. Admaston/Bromley.
	local placetype = holonym_parts[1]
	local placename = concat(holonym_parts, "/", 2)

	-- Check for modifiers after the holonym placetype.
	local split_holonym_placetype = split(placetype, ":", true)
	placetype = split_holonym_placetype[1]
	local affix_type
	local saw_also
	local saw_the
	for i = 2, #split_holonym_placetype do
		local modifier = split_holonym_placetype[i]
		if modifier == "also" then
			if saw_also then
				error(("Modifier ':also' occurs twice in holonym '%s'"):format(combined_holonym))
			end
			saw_also = true
		elseif modifier == "the" then
			if saw_the then
				error(("Modifier ':the' occurs twice in holonym '%s'"):format(combined_holonym))
			end
			saw_the = true
		elseif modifier == "pref" or modifier == "Pref" or modifier == "suf" or modifier == "Suf" or
			modifier == "noaff" then
			if affix_type then
				error(("Affix-type modifier ':%s' occurs twice in holonym '%s'"):format(modifier, combined_holonym))
			end
			affix_type = modifier
		else
			error(("Unrecognized holonym placetype modifier '%s', should be one of " ..
				"'pref', 'Pref', 'suf', 'Suf', 'noaff', 'also' or 'the'"):format(modifier))
		end
	end

	placetype = m_placetypes.resolve_placetype_aliases(placetype)
	local holonyms = split_on_comma(placename)
	local pluralize_affix = #holonyms > 1
	local affix_holonym_index = (affix_type == "pref" or affix_type == "Pref") and 1 or affix_type == "noaff" and 0 or
		#holonyms
	for i, placename in ipairs(holonyms) do
		-- Check for langcode before the holonym placename, but don't get tripped up by Wikipedia links, which begin
		-- "[[w:...]]" or "[[wikipedia:]]".
local langcode, placename_without_langcode = rmatch(placename, "^([^%[%]]-):(.*)$") if langcode then placename = placename_without_langcode end placename = m_placetypes.resolve_placename_display_aliases(placetype, placename) holonyms[i] = { placetype = placetype, display_placename = placename, unlinked_placename = m_placetypes.remove_links_and_html(placename), langcode = langcode, affix_type = i == affix_holonym_index and affix_type or nil, pluralize_affix = i == affix_holonym_index and pluralize_affix, suppress_affix = i ~= affix_holonym_index, no_display = no_display, suppress_comma = suppress_comma, continue_cat_loop = saw_also, force_the = i == 1 and saw_the, } end return holonyms end local get_param_mods = memoize(function() local m_param_utils = require(parameter_utilities_module) return m_param_utils.construct_param_mods { {group = {"link", "q", "l", "ref"}}, {param = "eq"}, -- FIXME: Finish [[Module:format utilities]]. --{param = "conj", set = require(format_utilities_module).allowed_conjs_for_join_segments, overall = true}, {param = "conj", set = {["and"] = true, ["or"] = true, ["and/or"] = true, ["และ"] = true, ["หรือ"] = true, ["และ/หรือ"] = true}, overall = true}, } end) local function parse_term_with_inline_modifiers(term, paramname, default_lang) -- FIXME: Finish changes to [[Module:parameter utilities]] and [[Module:parse utilities]] that support continuations -- and new-format generate_obj(). 
--local function generate_obj(data) -- local m_param_utils = require(parameter_utilities_module) -- data.parse_lang_prefix = true -- data.special_continuations = m_param_utils.default_special_continuations -- data.default_lang = default_lang -- return m_param_utils.generate_obj_maybe_parsing_lang_prefix(data) --end local function generate_obj(raw_term, parse_err) local obj = require(parameter_utilities_module).generate_obj_maybe_parsing_lang_prefix { term = raw_term, parse_err = parse_err, parse_lang_prefix = true, } obj.lang = obj.lang or default_lang return obj end return require(parse_interface_module).parse_inline_modifiers(term, { paramname = paramname, param_mods = get_param_mods(), generate_obj = generate_obj, -- FIXME: See above. --generate_obj_new_format = true, splitchar = ",", outer_container = {}, }) end local function parse_form_of_directive(arg, lang, form_of_overridden_args) local form_of_directive, raw_terms = arg:match("^@([a-z -]+):(.*)$") if not form_of_directive then error("Misformatted @-directive: " .. dump(arg)) end if not export.all_form_of_directives[form_of_directive] then local known_directives = {} for k, _ in pairs(export.all_form_of_directives) do insert(known_directives, '"' .. k .. '"') end table.sort(known_directives) error(("Unrecognized form-of directive %s in @-directive %s; recognized directives are %s"):format( dump(form_of_directive), dump(arg), concat(known_directives, ", "))) end local spec = export.all_form_of_directives[form_of_directive] local canonical_directive = form_of_directive if spec.alias_of then canonical_directive = spec.alias_of spec = export.all_form_of_directives[canonical_directive] if not spec then internal_error("Form-of directive alias %s points to %s, which is not a directive", "@" .. form_of_directive, canonical_directive) elseif spec.alias_of then internal_error("Form-of directive alias %s points to %s, which is also an alias", "@" .. 
form_of_directive, canonical_directive) end end local default_foreign = spec.default_foreign local directive_param = "@" .. form_of_directive if form_of_overridden_args and form_of_overridden_args[canonical_directive] then raw_terms = form_of_overridden_args[canonical_directive].new_value local new_directive = form_of_overridden_args[canonical_directive].new_directive local new_spec = export.all_form_of_directives[new_directive] if not new_spec then error(("Internal error: [[Module:transclude]] passed in unrecognized replacement directive '@%s'"): format(new_directive)) end if new_spec.alias_of then error(("Internal error: [[Module:transclude]] passed in replacement directive alias '@%s', " .. "should be canonical"):format(new_directive)) end if new_directive ~= canonical_directive then directive_param = directive_param .. (" (replaced with @%s)"):format(new_directive) canonical_directive = new_directive spec = new_spec end default_foreign = true end local terms = parse_term_with_inline_modifiers(raw_terms, directive_param, default_foreign and lang or enlang) return { directive = canonical_directive, terms = terms.terms, conj = terms.conj, spec = spec, } end -- Parse an argument containing extra information that is sometimes added to a definition, such as the capital, largest -- city, modern name, official name, etc. `args` is the value from the parsed argument structure and can be either nil, -- a string or a list (depending on whether it was declared as a single parameter or a list). `spec` is the extra info -- spec corresponding to the type of extra info. Each value in `args` can be a comma-separated list of terms with inline -- modifiers attached. [FIXME: we should switch to always using the comma-separated format and disallow list parameters -- such as |capital=, |capital2=, etc.] 
The return value is a structure containing fields `terms` (a list of term -- objects, each of which is in the format expected by full_link() in [[Module:links]]), `conj` (an explicit -- conjunction to join multiple terms, or nil if no explicit conjunction was given) and `spec` (the passed-in spec). local function parse_extra_info_arg(args, spec, default_lang) if not args then return nil end if type(args) ~= "table" then args = {args} end if not args[1] then return nil end local terms = nil local conj for i, arg in ipairs(args) do local this_terms = parse_term_with_inline_modifiers(arg, spec.arg .. (i == 1 and "" or i), default_lang) local thisconj = this_terms.conj if not conj then conj = thisconj elseif thisconj and conj ~= thisconj then error(("Two different conjunctions '%s' and '%s' specified for |%s=; you only need to specify the " .. "conjunction once"):format(conj, thisconj)) end if not terms then terms = this_terms.terms else m_table.extend(terms, this_terms.terms) end end return { spec = spec, terms = terms, conj = conj, } end --[==[ Parse a "new-style" place description, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text. Return value is a place description object as documented at the top of the file. Exported for use by [[Module:demonyms]]. 
]==] function export.parse_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args) local placetypes = {} local segments = split(text, "<<(.-)>>") local retval = {holonyms = {}, order = {}} local form_of_directives_already_present = form_of_directives and not not form_of_directives[1] for i, segment in ipairs(segments) do if i % 2 == 1 then insert(retval.order, {type = "raw", value = segment}) elseif segment:find("@") then if not form_of_directives then error(("Form-of directive '%s' not allowed in this context"):format(segment)) elseif form_of_directives_already_present then error(("Saw form-of directive '%s' in new-style place desc followed by direct (separate-parameter) form-of directives; not allowed"):format( segment)) elseif placetypes[1] or retval.holonyms[1] then error(("Form-of directive '%s' must come first, before placetypes and holonyms"):format(segment)) else local form_of_directive = parse_form_of_directive(segment, lang, form_of_overridden_args) if not retval.order[1] or retval.order[1].type ~= "raw" or retval.order[2] then internal_error("`retval.order` should have a single raw element: %s", retval.order) end form_of_directive.pretext = retval.order[1].value retval.order[1] = nil insert(form_of_directives, form_of_directive) end elseif segment:find("/") then local holonyms = split_holonym(segment) for j, holonym in ipairs(holonyms) do if j > 1 then if not holonym.no_display then if j == #holonyms then insert(retval.order, {type = "raw", value = " and "}) else insert(retval.order, {type = "raw", value = ", "}) end end -- All but the first in a multi-holonym need an article. For the first one, the article is -- specified in the raw text if needed. (Currently, needs_article is only used when displaying the -- holonym, so it wouldn't matter when no_display is set, but we set it anyway in case we need it -- for something else.) 
holonym.needs_article = true end insert(retval.holonyms, holonym) if not holonym.no_display then insert(retval.order, {type = "holonym", value = #retval.holonyms}) end m_placetypes.key_holonym_into_place_desc(retval, holonym) end else local treat_as, display = segment:match("^(..-):(.+)$") if treat_as then segment = treat_as else display = segment end -- see if the placetype segment is just qualifiers local only_qualifiers = true local split_segments = split(segment, " ", true) for _, split_segment in ipairs(split_segments) do if m_placetypes.placetype_qualifiers[split_segment] == nil then only_qualifiers = false break end end insert(placetypes, {placetype = segment, only_qualifiers = only_qualifiers}) if only_qualifiers then insert(retval.order, {type = "qualifier", value = display}) else insert(retval.order, {type = "placetype", value = display}) end end end if not form_of_directives_already_present and form_of_directives and form_of_directives[1] then form_of_directives[#form_of_directives].posttext = "" end local final_placetypes = {} for i, placetype in ipairs(placetypes) do if i > 1 and placetypes[i - 1].only_qualifiers then final_placetypes[#final_placetypes] = final_placetypes[#final_placetypes] .. " " .. placetypes[i].placetype else insert(final_placetypes, placetypes[i].placetype) end end retval.placetypes = final_placetypes return retval end --[==[ Parse one or more "new-style" place descriptions, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text. Multiple descriptions are separated by two semicolons in a row. Return value is a list of place description objects as documented at the top of the file. 
]==]
local function parse_conjoined_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args)
	local separate_specs = split(text, ";(;[^ ]*)")
	local descs = {}
	for i = 1, #separate_specs do
		if i % 2 == 1 then
			insert(descs, export.parse_new_style_place_desc(separate_specs[i], lang, form_of_directives,
				form_of_overridden_args))
			form_of_directives = nil
		else
			descs[#descs].separator = separate_specs[i]
		end
	end
	return descs
end

--[=[
Process numeric and "extra info" arguments into an overall place spec, as described at the top of the file. `data` is
an object with the following fields:
* `args`: The parsed arguments of {{tl|place}}.
* `from_tcl`: True if we're being invoked from {{tl|tcl}}.
* `extra_info_overridden_set`, `form_of_overridden_args`: Same as the corresponding fields in the `data` object passed
  to `export.format`.
]=]
local function parse_overall_place_spec(data)
	local args, from_tcl, extra_info_overridden_set, form_of_overridden_args =
		data.args, data.from_tcl, data.extra_info_overridden_set, data.form_of_overridden_args
	local descs = {}
	local this_desc
	-- Index of separate (semicolon-separated) place descriptions within `descs`.
	local desc_index = 1
	-- Index of separate holonyms within a place description. 0 means we've seen no holonyms and have yet to process
	-- the placetypes that precede the holonyms. 1 means we've seen no holonyms but have already processed the
	-- placetypes.
	local holonym_index = 0
	local in_place_desc = false
	local form_of_directives = {}

	-- Set the joiner on `desc` (the place description preceding the separator) based on `separator`.
	local function set_desc_joiner(desc, separator)
		if separator == ";" then
			desc.joiner = "; "
			desc.include_following_article = true
		elseif separator == ";;" then
			desc.joiner = " "
		else
			local joiner = separator:sub(2)
			if rfind(joiner, "^%a") then
				desc.joiner = " " .. joiner .. " "
			else
				desc.joiner = joiner ..
" " end end end for _, arg in ipairs(args[2]) do if arg:find("^@") then if not (desc_index == 1 and holonym_index == 0) then error("@-directives cannot follow place descriptions") end local form_of_directive = parse_form_of_directive(arg, args[1], form_of_overridden_args) if form_of_directives[1] then form_of_directive.pretext = ", " else form_of_directive.pretext = "" end insert(form_of_directives, form_of_directive) elseif arg == ";" or arg:find("^;[^ ]") then if not this_desc then error("Saw semicolon joiner without preceding place description") end set_desc_joiner(this_desc, arg) desc_index = desc_index + 1 holonym_index = 0 in_place_desc = false else if arg:find("<<") then if in_place_desc then error("New-style place description must come first or following a separator (semicolon or similar), not directly following another description") end in_place_desc = true local this_descs = parse_conjoined_new_style_place_desc(arg, args[1], form_of_directives, form_of_overridden_args) for j, desc in ipairs(this_descs) do this_desc = desc if holonym_index > 0 then desc_index = desc_index + 1 holonym_index = 0 end if j < #this_descs then set_desc_joiner(this_desc, this_desc.separator) end descs[desc_index] = this_desc last_was_new_style = true holonym_index = #this_desc.holonyms + 1 end else -- Old-style arguments can directly follow a new-style argument; they become additional holonyms -- tacked onto the end of the holonym list, and are displayed old-style except that there is no -- prefix before the first one following the new-style argument. in_place_desc = true if holonym_index == 0 then local entry_placetypes = split_on_slash(arg) this_desc = {placetypes = entry_placetypes, holonyms = {}} descs[desc_index] = this_desc holonym_index = holonym_index + 1 else local holonyms = split_holonym(arg) for j, holonym in ipairs(holonyms) do if j > 1 then -- All but the first in a multi-holonym need an article. Not for the first one because e.g. 
-- {{place|en|city|s/Arizona|c/United States}} should not display as "a city in Arizona, the -- United States". The overall first holonym in the place description gets an article if -- needed regardless of our setting here. holonym.needs_article = true -- Insert "และ" before the last holonym. if j == #holonyms then this_desc.holonyms[holonym_index] = { -- Use the no_display value from the first holonym; it should be the same for all -- holonyms. `unlinked_placename` should not be used. display_placename = "และ", no_display = holonyms[1].no_display } holonym_index = holonym_index + 1 end end this_desc.holonyms[holonym_index] = holonym m_placetypes.key_holonym_into_place_desc(this_desc, this_desc.holonyms[holonym_index]) holonym_index = holonym_index + 1 end end end end end if form_of_directives[1] and not form_of_directives[#form_of_directives].posttext then form_of_directives[#form_of_directives].posttext = (args.def and args.def ~= "-" or not args.def and descs[1]) and ": " or "" end -- Tracking code. This does nothing but add tracking for seen placetypes and qualifiers. The place will be linked to -- [[Wiktionary:Tracking/place/entry-placetype/PLACETYPE]] for all entry placetypes seen; in addition, if PLACETYPE -- has qualifiers (e.g. 'small city'), there will be links for the bare placetype minus qualifiers and separately -- for the qualifiers themselves: -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/BARE_PLACETYPE]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/QUALIFIER]] -- Note that if there are multiple qualifiers, there will be links for each possible split. 
	-- For example, for 'small maritime city', there will be the following links:
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/small maritime city]]
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/maritime city]]
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/city]]
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/small]]
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/maritime]]
	-- Finally, there are also links for holonym placetypes, e.g. if the holonym 'c/Italy' occurs, there will be the
	-- following link:
	-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/holonym-placetype/country]]
	for _, desc in ipairs(descs) do
		for _, entry_placetype in ipairs(desc.placetypes) do
			local splits = m_placetypes.split_qualifiers_from_placetype(entry_placetype, "no canon qualifiers")
			for _, split in ipairs(splits) do
				local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
				track("entry-placetype/" .. bare_placetype)
				if this_qualifier then
					track("entry-qualifier/" .. this_qualifier)
				end
			end
		end
		for _, holonym in ipairs(desc.holonyms) do
			if holonym.placetype then
				track("holonym-placetype/" .. holonym.placetype)
			end
		end
	end

	local extra_info = {}
	for _, extra_info_spec in ipairs(export.extra_info_args) do
		local extra_info_terms = parse_extra_info_arg(args[extra_info_spec.arg], extra_info_spec,
			-- If called from {{tcl}} and extra info argument was set by {{tcl}}, interpret the argument
			-- according to the language in 1=; otherwise interpret as English. To override this, prefix
			-- with the appropriate language.
from_tcl and extra_info_overridden_set and extra_info_overridden_set[extra_info_spec.arg] and args[1] or enlang) if extra_info_terms then insert(extra_info, extra_info_terms) end end return { lang = args[1], args = args, directives = form_of_directives, descs = descs, extra_info = extra_info, } end -------- Definition-generating functions -- Return a string with the wikilinks to the English translations of the word. local function get_translations(transl, ids) local ret = {} for i, t in ipairs(transl) do local arg_transls = split_on_comma(t) local arg_ids = ids[i] if arg_ids then arg_ids = split_on_comma(arg_ids) if #arg_transls ~= #arg_ids then error(("Saw %s translation%s in t%s=%s but %s ID%s in tid%s=%s"):format( #arg_transls, #arg_transls > 1 and "s" or "", i == 1 and "" or i, t, #arg_ids, #arg_ids > 1 and "'s" or "", i == 1 and "" or i, ids[i])) end end for j, arg_transl in ipairs(arg_transls) do insert(ret, link(arg_transl, "en", arg_ids and arg_ids[j] or nil)) end end return concat(ret, ", ") end -- Return the article (currently always `"the"`) to be prepended to the given placename, or nil. `decorated_placename` -- is the placename as specified by the user along with any affix added to it. `placename` is the raw unlinked -- placename, defaulting to the unlinked version of `decorated_placename` if not given. `placetypes` is a placetype or -- list of placetypes for the placename. `suppress_holonym_use_the_check` suppresses checking the placetypes for -- `holonym_use_the`. 
local function get_placename_article(decorated_placename, placetypes, placename, suppress_holonym_use_the_check) local unlinked_decorated_placename = m_placetypes.remove_links_and_html(decorated_placename) if unlinked_decorated_placename:find("^the ") then return nil end placename = placename or unlinked_decorated_placename if type(placetypes) == "string" then placetypes = {placetypes} end for _, placetype in ipairs(placetypes) do local art = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) local art = m_placetypes.placename_article[pt] and m_placetypes.placename_article[pt][placename] if art then return art end end) if art then return art end end -- Get equivalent placetypes of the specified placetype so that e.g. -- {{place|en|@official name of:Bahamas|island country|r/Caribbean}} put 'the' before Bahamas ("Bahamas" is just -- specified as a country but "island country" falls back to "country"). local all_equiv_placetypes = {} for _, placetype in ipairs(placetypes) do local this_equiv_placetypes = m_placetypes.get_placetype_equivs(placetype) for _, this_equiv_placetype in ipairs(this_equiv_placetypes) do insert(all_equiv_placetypes, this_equiv_placetype.placetype) end end -- Look for a known location. We should be using find_matching_holonym_location() but that function doesn't -- currently work without alias resolution. Instead we check if any matching location has `the = true` set. -- In practice there aren't any cases where a given placename matches two locations, only one of which has -- `the = true` set. for group, key, spec in m_placetypes.iterate_matching_location { placetypes = all_equiv_placetypes, placename = placename, alias_resolution = "none", } do -- `iterate_holonym_location` doesn't initialize the spec if alias resolution is turned off, so check both -- the spec and group. Be careful in case `the = false` is explicitly given by the spec. 
if spec.the ~= nil then if spec.the then return "the" end elseif group.default_the then return "the" end end if not suppress_holonym_use_the_check then -- See if the placetype requests an article to be placed before the placename. This occurs e.g. with 'sea'. But -- if the user specifies e.g. "sea:pref/Cortez", we'll wrongly get "the sea of the Cortez", so in that case we -- need to ignore the holonym article specified along with the placetype. for _, placetype in ipairs(placetypes) do local holonym_use_the = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) return placetype_data[pt] and placetype_data[pt].holonym_use_the end) if holonym_use_the then return "the" end end end local universal_res = m_placetypes.placename_the_re["*"] for _, re in ipairs(universal_res) do if unlinked_decorated_placename:find(re) then return "the" end end for _, placetype in ipairs(placetypes) do local matched = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) local res = m_placetypes.placename_the_re[pt] if not res then return nil end for _, re in ipairs(res) do if unlinked_decorated_placename:find(re) then return true end end return nil end) if matched then return "the" end end return nil end -- Prepend the appropriate article if needed to `decorated_placename` (the user-specified placename with any affix -- added), where the underlying holonym object that generated `linked_placename` can be found at `holonym_index` in the -- holonyms in `place_desc`. local function get_holonym_article(decorated_placename, place_desc, holonym_index) local holonym = place_desc.holonyms[holonym_index] local holonym_placetype = holonym.placetype if not holonym_placetype then return nil end return get_placename_article(decorated_placename, holonym_placetype, holonym.unlinked_placename, not not holonym.affix_type) end -- Convert a holonym into display format. This adds wikilinks to holonyms and passes them through any display handlers, -- which may (e.g.) 
-- add the placetype to the holonym. If `needs_article` is true, prepend the article `"the"` if the
-- holonym requires it (e.g. if the holonym is `United States`). `needs_article` is set to true if we are processing the
-- first specified holonym in an old-style place description (i.e. the holonym directly following the entry placetype,
-- with no raw-text holonym in between).
--
-- Examples:
-- ({placetype = "country", display_placename = "United States", unlinked_placename = "United States"}, true) returns
-- the template-expanded equivalent of "the {{l|en|United States}}".
-- ({placetype = "region", display_placename = "O'Higgins", unlinked_placename = "O'Higgins", affix_type = "suf"}, false)
-- returns the template-expanded equivalent of "{{l|en|O'Higgins}} region".
-- ({display_placename = "in the southern"}, false) returns "in the southern" (without wikilinking because .placetype
-- and .langcode are both nil).
local function format_holonym(place_desc, holonym_index, needs_article)
	local holonym = place_desc.holonyms[holonym_index]
	if holonym.no_display then
		return ""
	end

	local orig_needs_article = needs_article
	needs_article = needs_article or holonym.needs_article or holonym.force_the

	local output = holonym.display_placename
	local placetype = holonym.placetype
	local affix_type_pt_data, affix_type, affix_is_prefix, affix, prefix, suffix, no_affix_strings
	local pt_equiv_for_affix_type, already_seen_affix, need_affix

	-- Implement display handlers.
	local display_handler = m_placetypes.get_equiv_placetype_prop(placetype,
		function(pt) return placetype_data[pt] and placetype_data[pt].display_handler end)
	if display_handler then
		output = display_handler(placetype, output)
	end
	if not holonym.suppress_affix then
		-- Implement adding an affix (prefix or suffix) based on the holonym's placetype. The affix will be
		-- added either if the placetype's placetype_data spec says so (by setting 'affix_type'), or if the
		-- user explicitly called for this (e.g.
by using 'r:suf/O'Higgins'). Before adding the affix, -- however, we check to see if the affix is already present (e.g. the placetype is "district" -- and the placename is "Mission District"). The placetype can override the affix to add (by setting -- `prefix`, `suffix` or `affix`) and/or override the strings used for checking if the affix is already -- present (by setting 'no_affix_strings', which defaults to the affix explicitly given through `prefix`, -- `suffix` or `affix` if any are given). `prefix` and `suffix` take precedence over `affix` if both are -- set, but only when the appropriate type of affix is requested. -- Search through equivalent placetypes for a setting of `affix_type`, `affix`, `prefix` or `suffix`. If we -- find any, use them. If `affix_type` is given, it is overridden by the user's explicitly specified affix -- type. If either an `affix_type` is found or the user explicitly specified an affix type, the affix is -- displayed according to the following: -- 1. If `prefix`, `suffix` or `affix` is given by the placetype or equivalent placetypes, use it (e.g. -- placetype `administrative region` requests suffix "region" but doesn't set affix type; if the user -- explicitly specifies `administrative region` as the placetype for a holonym and specifies a suffixal -- affix type, use "region"). In this search, we stop looking if we find an explicit `affix_type` -- setting; if this is found without an associated affix setting, the assumption is the associated -- placetype was intended as the affix, not some explicit affix setting associated with a fallback -- placetype. -- 2. Otherwise, if the user explicitly requested an affix type, use the actual placetype (principle of -- least surprise). -- 3. Finally, fall back to the placetype associated with an explicit `affix_type` setting (which will -- always exist if we get this far). 
		affix_type_pt_data, pt_equiv_for_affix_type = m_placetypes.get_equiv_placetype_prop(placetype,
			function(pt)
				local cdpt = placetype_data[pt]
				return cdpt and cdpt.affix_type and cdpt or nil
			end
		)
		local affix_pt_data, pt_equiv_for_affix = m_placetypes.get_equiv_placetype_prop(placetype,
			function(pt)
				local cdpt = placetype_data[pt]
				return cdpt and (cdpt.affix_type or cdpt.affix or cdpt.prefix or cdpt.suffix) and cdpt or nil
			end
		)
		if affix_type_pt_data then
			affix_type = affix_type_pt_data.affix_type
			need_affix = true
		end
		if affix_pt_data then
			prefix = affix_pt_data.prefix or affix_pt_data.affix
			suffix = affix_pt_data.suffix or affix_pt_data.affix
			need_affix = true
		end
		no_affix_strings = affix_pt_data and affix_pt_data.no_affix_strings or
			affix_type_pt_data and affix_type_pt_data.no_affix_strings
		if holonym.affix_type and placetype then
			affix_type = holonym.affix_type
			prefix = prefix or placetype
			suffix = suffix or placetype
			need_affix = true
		end
		if need_affix then
			-- At this point the affix_type has been determined and can't change any more, so we can figure out
			-- whether we need the calculated prefix or suffix.
			affix_is_prefix = affix_type == "pref" or affix_type == "Pref"
			if affix_is_prefix then
				affix = prefix
			else
				affix = suffix
			end
			if not affix then
				if not pt_equiv_for_affix_type then
					internal_error("Something wrong, `pt_equiv_for_affix_type` not set processing holonym: %s",
						holonym)
				end
				affix = pt_equiv_for_affix_type.placetype
				if not affix then
					internal_error("Something wrong, no affix could be located in `pt_equiv_for_affix_type` for " ..
"holonym %s: %s", holonym, pt_equiv_for_affix_type) end end no_affix_strings = no_affix_strings or lc(affix) if holonym.pluralize_affix then affix = m_placetypes.pluralize_placetype(affix) end already_seen_affix = m_placetypes.check_already_seen_string(output, no_affix_strings) end end output = link(output, holonym.langcode or placetype and "en" or nil) if need_affix and not affix_is_prefix and not already_seen_affix then output = output .. " " .. (affix_type == "Suf" and ucfirst_all(affix) or affix) end if needs_article then local article = holonym.force_the and "the" or get_holonym_article(output, place_desc, holonym_index) if article then output = article .. " " .. output end end if affix_is_prefix and not already_seen_affix then output = (affix_type == "Pref" and ucfirst_all(affix) or affix) .. " of " .. output if orig_needs_article then -- Put the article before the added affix if we're the first holonym in the place description. This is -- distinct from the article added above for the holonym itself; cf. "c:pref/United States,Canada" -> -- "the countries of the United States and Canada". We need to use the value of `needs_article` passed -- in from the function, which indicates whether we're processing the first holonym. output = "the " .. output end end return output end -- Format a holonym for display, taking into account the entry's placetype (specifically, the last placetype if there -- are more than one, excluding conjunctions and parenthetical items); the holonym's index among the holonyms in the -- template (which specifies what the previous holonym is and whether it is the first holonym); and the full place -- description (which helps resolve ambiguities in holonyms when looking up known locations). This may involve putting a -- preposition ("in" or "of") before the formatted holonym, particularly if it is the first one, and may involve -- prepending a comma. 
If `holonym_no_prefix` is specified, nothing except a space is put before the holonym; used -- when formatting mixed new/old-style descriptions. local function format_holonym_in_context(entry_placetype, place_desc, holonym_index, holonym_no_prefix) local desc = "" -- If holonym.placetype is nil, the holonym is just raw text, e.g. 'in southern'. if holonym_no_prefix then desc = " " else local holonym = place_desc.holonyms[holonym_index] if not holonym.no_display then -- First compute the initial delimiter. if holonym_index == 1 then if holonym.placetype then desc = desc .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " " elseif not holonym.display_placename:find("^,") then desc = desc .. " " end else local prev_holonym = place_desc.holonyms[holonym_index - 1] if prev_holonym.placetype and not holonym.suppress_comma then local dname = holonym.display_placename if dname ~= "and" and dname ~= "in" and dname ~= "and the" and dname ~= "in the" and dname ~= "และ" and dname ~= "ใน" then desc = desc .. "," end end if holonym.placetype or not holonym.display_placename:find("^,") then desc = desc .. " " end end end end return desc .. format_holonym(place_desc, holonym_index, not holonym_no_prefix and holonym_index == 1) end -- Return the linked description of a placetype. This splits off any qualifiers and displays them separately. local function get_placetype_description(placetype) local splits = m_placetypes.split_qualifiers_from_placetype(placetype) local prefix = "" for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3) if this_qualifier then prefix = (prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier) .. " " else prefix = "" end local display_form = m_placetypes.get_placetype_display_form(bare_placetype) if display_form then return prefix .. display_form end placetype = bare_placetype end return prefix .. 
placetype end -- Return the linked description of a qualifier (which may be multiple words). local function get_qualifier_description(qualifier) local splits = m_placetypes.split_qualifiers_from_placetype(qualifier .. " foo") local split = splits[#splits] local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3) return prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier end -- Format a set of form-of directive terms. local function format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl) local formatted_terms = {} local placetypes if not overall_place_spec.descs[2] then placetypes = overall_place_spec.descs[1].placetypes else placetypes = {} for _, desc in ipairs(overall_place_spec.descs) do m_table.extend(placetypes, desc.placetypes) end end for _, termobj in ipairs(directive_terms.terms) do local placename_article if not termobj.alt and termobj.term and not termobj.term:find("%[%[") then placename_article = get_placename_article(termobj.term, placetypes) end local linked_term = m_links.full_link(termobj, "term", nil, "show qualifiers") linked_term = "<span class='form-of-definition-link'>" .. linked_term .. "</span>" if termobj.eq then linked_term = linked_term .. " (= " .. m_links.full_link {term = termobj.eq, lang = enlang} .. ")" end if placename_article then linked_term = placename_article .. " " .. linked_term end insert(formatted_terms, linked_term) end local spec = directive_terms.spec local text = spec.text if type(text) == "function" then text = text(overall_place_spec) end if text == "+" then text = directive_terms.directive end if ucfirst then text = m_strutils.ucfirst(text) end if not from_tcl then local tracking_prefix = "form-of/" .. directive_terms.directive track(tracking_prefix) local langcode = overall_place_spec.lang:getCode() local full_langcode = overall_place_spec.lang:getFullCode() track(tracking_prefix .. "/" .. 
langcode) if full_langcode ~= langcode then track(tracking_prefix .. "/" .. full_langcode) end if full_langcode ~= "en" then track(tracking_prefix .. "/non-english") end end return (require(form_of_module).format_form_of { text = text, lemmas = m_table.serialCommaJoin(formatted_terms, {conj = directive_terms.conj or spec.conjunction or "และ"}), lemma_classes = false, -- text_classes = "place-text", }) end -- Format a set of extra-info terms for extra information that is sometimes added to a definition, such as the capital, -- largest city, modern name, official name, etc. `overall_place_spec` is the overall parsed {{tl|place}} spec (see -- comment at top of file); `extra_info_terms` is the terms spec for this type of extra-info (as returned by -- `parse_extra_info_arg`); and `sentence_style` indicates whether we're generating a sentence-style definition (as -- suitable for an English-language term without a translation specified using t=). local function format_extra_info(overall_place_spec, extra_info_terms, sentence_style) local formatted_terms = {} for _, termobj in ipairs(extra_info_terms.terms) do insert(formatted_terms, m_links.full_link(termobj, nil, nil, "show qualifiers")) end local spec = extra_info_terms.spec local text = spec.text if type(text) == "function" then text = text(overall_place_spec) end if text == "+" then text = spec.arg end if spec.auto_plural and formatted_terms[2] then text = pluralize(text) end if spec.with_colon then text = text .. ":" end if sentence_style and spec.match_sentence_style then text = ". " .. m_strutils.ucfirst(text) else text = "; " .. text end -- FIXME: Use joinSegments when available. -- return text .. " " .. -- m_table.joinSegments(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "และ"}) return text .. " " .. 
m_table.serialCommaJoin(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "และ"}) end -- Format an old-style place description (with separate arguments for the placetype and each holonym) for display and -- return the resulting string. local function format_old_style_place_desc_for_display(args, place_desc, desc_index, with_article, ucfirst) -- The placetype used to determine whether "in" or "of" follows is the last placetype if there are -- multiple slash-separated placetypes, but ignoring "และ", "or" and parenthesized notes -- such as "(one of 254)". local entry_placetype = nil local placetypes = place_desc.placetypes local function is_and_or(item) return item == "และ" or item == "หรือ" end local parts = {} local function ins(txt) insert(parts, txt) end local function ins_space() if #parts > 0 then ins(" ") end end local and_or_pos for i, placetype in ipairs(placetypes) do if is_and_or(placetype) then and_or_pos = i -- no break here; we want the last in case of more than one end end local remaining_placetype_index if and_or_pos then track("multiple-placetypes-with-and") if and_or_pos == #placetypes then error("Conjunctions 'and' and 'or' cannot occur last in a set of slash-separated placetypes: " .. concat(placetypes, "/")) end local items = {} for i = 1, and_or_pos + 1 do local pt = placetypes[i] if is_and_or(pt) then -- skip elseif i > 1 and pt:find("^%(") then -- append placetypes beginning with a paren to previous item items[#items] = items[#items] .. " " .. pt else entry_placetype = pt insert(items, get_placetype_description(pt)) end end ins(m_table.serialCommaJoin(items, {conj = placetypes[and_or_pos]})) remaining_placetype_index = and_or_pos + 2 else remaining_placetype_index = 1 end for i = remaining_placetype_index, #placetypes do local pt = placetypes[i] -- Check for and, or and placetypes beginning with a paren (so that things like -- "{{place|en|county/(one of 254)|s/Texas}}" work). 
if m_placetypes.placetype_is_ignorable(pt) then ins_space() ins(pt) else entry_placetype = pt -- Join multiple placetypes with comma unless placetypes are already -- joined with "และ". We allow "the" to precede the second placetype -- if they're not joined with "และ" (so we get "city and county seat of ..." -- but "city, the county seat of ..."). if i > 1 then ins(", ") local article = m_placetypes.get_placetype_article(pt) if article ~= "the" and i > remaining_placetype_index then -- Track cases where we are comma-separating multiple placetypes without the second one starting -- with "the", as they may be mistakes. The occurrence of "the" is usually intentional, e.g. -- {{place|zh|municipality/state capital|s/Rio de Janeiro|c/Brazil|t1=Rio de Janeiro}} -- for the city of [[Rio de Janeiro]], which displays as "a municipality, the state capital of ...". track("multiple-placetypes-without-and-or-the") end if article then ins(article) ins(" ") end end ins(get_placetype_description(pt)) end end if place_desc.holonyms then for holonym_index, _ in ipairs(place_desc.holonyms) do ins(format_holonym_in_context(entry_placetype, place_desc, holonym_index)) end end local gloss = concat(parts) if with_article then local article if desc_index == 1 then article = args.a else if not place_desc.holonyms then -- there isn't a following holonym; the place type given might be raw text as well, so don't add -- an article. with_article = false else local saw_placetype_holonym = false for _, holonym in ipairs(place_desc.holonyms) do if holonym.placetype then saw_placetype_holonym = true break end end if not saw_placetype_holonym then -- following holonym(s) is/are just raw text; the place type given might be raw text as well, -- so don't add an article. 
with_article = false end end if with_article then track("second-or-higher-description-with-added-article") else track("second-or-higher-description-suppressed-article") end end if with_article then article = article or m_placetypes.get_placetype_article(place_desc.placetypes[1], ucfirst) if article then gloss = article .. " " .. gloss elseif ucfirst then gloss = m_strutils.ucfirst(gloss) end end end return gloss end --[==[ Get the full gloss (English description) of a new-style place description. New-style place descriptions are specified with a single string containing raw text interspersed with placetypes and holonyms surrounded by `<<...>>`. Exported for use by [[Module:demonyms]]. ]==] function export.format_new_style_place_desc_for_display(args, place_desc, with_article) local parts = {} local function ins(txt) insert(parts, txt) end if with_article and args.a then ins(args.a .. " ") end local max_holonym = 0 for _, order in ipairs(place_desc.order) do local segment_type, segment = order.type, order.value if segment_type == "raw" then ins(segment) elseif segment_type == "placetype" then ins(get_placetype_description(segment)) elseif segment_type == "qualifier" then ins(get_qualifier_description(segment)) elseif segment_type == "holonym" then ins(format_holonym(place_desc, segment, false)) if segment > max_holonym then max_holonym = segment end else internal_error("Unrecognized segment type %s", segment_type) end end if place_desc.holonyms and max_holonym < #place_desc.holonyms then local holonym_no_prefix = true for holonym_index = max_holonym + 1, #place_desc.holonyms do ins(format_holonym_in_context(nil, place_desc, holonym_index, holonym_no_prefix)) holonym_no_prefix = false end end return concat(parts) end -- Return a string with the gloss (the description of the place itself, as opposed to translations). If `ucfirst` is -- given, the gloss's first letter is made upper case. 
If `sentence_style` is given, the "extra info" (modern name, -- capital, largest city, etc.) is displayed as separated sentences; otherwise, it is displayed separated from the main -- definition by semicolons. local function get_display_form(data) local overall_place_spec, ucfirst, sentence_style, drop_extra_info, extra_info_overridden_set, from_tcl = data.overall_place_spec, data.ucfirst, data.sentence_style, data.drop_extra_info, data.extra_info_overridden_set, data.from_tcl local args = overall_place_spec.args local parts = {} local function ins(txt) table.insert(parts, txt) end if overall_place_spec.directives and overall_place_spec.directives[1] then for i, directive_terms in ipairs(overall_place_spec.directives) do ins(directive_terms.pretext) if directive_terms.pretext ~= "" then ucfirst = false end if not args.def or args.def == "-" then ins(format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl)) ucfirst = false if i == #overall_place_spec.directives and directive_terms.posttext then ins(directive_terms.posttext) end end end end if args.def == "-" then return concat(parts) end if args.def then if args.def:find("<<") then local def_desc = export.parse_new_style_place_desc(args.def, args[1]) ins(export.format_new_style_place_desc_for_display({}, def_desc, false)) else ins(args.def) end else local include_article = true for n, desc in ipairs(overall_place_spec.descs) do if desc.order then ins(export.format_new_style_place_desc_for_display(args, desc, n == 1)) else ins(format_old_style_place_desc_for_display(args, desc, n, include_article, ucfirst)) end if desc.joiner then ins(desc.joiner) end include_article = desc.include_following_article ucfirst = false end end local addl = args.addl if addl then posttext = posttext or "" if addl:find("^[;:]") then ins(addl) elseif addl:find("^_") then ins(" " .. addl:sub(2)) else ins(", " .. 
addl) end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do -- Include a given extra info term either when -- (1) drop_extra_info not set (it's set by {{tcl}}), or -- (2) the extra info term is marked as "display even when dropped" (e.g. modern= or full=, to help understand -- the term's sense), or -- (3) the term was overridden by a `place_*=` setting in {{tcl}}. if not drop_extra_info or extra_info_terms.spec.display_even_when_dropped or extra_info_overridden_set and extra_info_overridden_set[extra_info_terms.spec.arg] then ins(format_extra_info(overall_place_spec, extra_info_terms, sentence_style)) end end return concat(parts) end -- Return the definition line. local function get_def(data) local overall_place_spec, from_tcl, drop_extra_info, extra_info_overridden_set, translation_follows = data.overall_place_spec, data.from_tcl, data.drop_extra_info, data.extra_info_overridden_set, data.translation_follows local args = overall_place_spec.args local sentence_style = overall_place_spec.lang:getCode() == "en" local ucfirst = sentence_style and not args.nocap if #args.t > 0 then local gloss = get_display_form { overall_place_spec = overall_place_spec, ucfirst = false, sentence_style = false, drop_extra_info = drop_extra_info, extra_info_overridden_set = extra_info_overridden_set, from_tcl = from_tcl, } if from_tcl and not args.tcl_nolc then gloss = m_strutils.lcfirst(gloss) end if translation_follows then return (gloss == "" and "" or gloss .. ": ") .. get_translations(args.t, args.tid) else return get_translations(args.t, args.tid) .. (gloss == "" and "" or " (" .. gloss .. 
")") end else return get_display_form { overall_place_spec = overall_place_spec, ucfirst = ucfirst, sentence_style = sentence_style, drop_extra_info = drop_extra_info, extra_info_overridden_set = extra_info_overridden_set, from_tcl = from_tcl, } end end ---------- Functions for the category wikicode -- The code in this section finds the categories to which a given place belongs. See comment at top of file. --[=[ Find the appropriate category specs for a given place description and placetype. For example, for the template invocation {{tl|place|en|city/and/county|s/Pennsylvania|c/US}}, which results in the place description ``` { placetypes = {"city", "และ", "county"}, holonyms = { {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, {placetype = "country", display_placename = "United States", unlinked_placename = "United States"}, }, holonyms_by_placetype = { state = {"Pennsylvania"}, country = {"United States"}, }, } ``` the call ``` find_placetype_cat_specs { entry_placetype = "city", place_desc = { placetypes = {"city", "และ", "county"}, holonyms = { {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, {placetype = "country", display_placename = "United States", unlinked_placename = "United States"}, }, holonyms_by_placetype = { state = {"Pennsylvania"}, country = {"United States"}, }, }, } ``` might produce the return value ``` { entry_placetype = "city", cat_specs = {"Cities in Pennsylvania, USA"}, triggering_holonym = {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, triggering_holonym_index = 1, } ``` See the comment at the top of the section for a description of category specs and the overall algorithm. 
On entry, `data` is an object with the following fields: * `entry_placetype`: the entry placetype (or equivalent) used to look up the category data in placetype_data, which must have already been resolved to a placetype with an entry in `placetype_data`; * `place_desc`: the full place description as documented at the top of the file (used only for its holonyms); * `first_holonym_index`: the index of the first holonym to consider when iterating through the holonyms (used to implement the `:also` holonym placetype modifier); * `overriding_holonym`: an optional overriding holonym to use, in place of iterating through the holonyms (used to implement categorizing other holonyms of the same type as the triggering holonym, so that e.g. {{tl|place|en|river|s/Kansas,Nebraska}}, or equivalently {{tl|place|en|river|s/Kansas|and|s/Nebraska}}, works); * `from_demonym`: we are called from {{tl|demonym-noun}} or {{tl|demonym-adj}} instead of {{tl|place}}, and should generate categories appropriate to those templates. * `form_of_directive`: A form-of directive prefix such as `FORMER_NAME_OF`. If specified, use that type prefix to generate categories appropriate to the form-of directive (in addition to the regular categories generated for the {{tl|place}} invocation, which happens in a separate call). 
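As a sketch of the `overriding_holonym` case (the values are illustrative assumptions, not output from a real run): for {{tl|place|en|river|s/Kansas,Nebraska}}, once `s/Kansas` has been found as the triggering holonym, a second call along the following lines could be made to categorize Nebraska as well:
```
find_placetype_cat_specs {
	entry_placetype = "river",
	place_desc = place_desc, -- the same place description as in the first call
	-- hypothetical holonym built from the triggering holonym's placetype and the other placename
	overriding_holonym = {placetype = "state", unlinked_placename = "Nebraska"},
}
```
In this mode the holonyms in `place_desc` are not iterated; any returned object has `triggering_holonym` set to the overriding holonym and carries no `triggering_holonym_index`.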
The return value is {nil} if no category specs could be located, otherwise an object with the following fields: * `entry_placetype`: the placetype that should be used to construct categories when `true` is one of the returned category specs (normally the same as the `entry_placetype` passed in, but will be different when a "fallback" key exists and is used); * `cat_specs`: list of category specs as described above; * `triggering_holonym`: the triggering holonym (see the comment at the top of the section), or nil if there was no triggering holonym; * `triggering_holonym_index`: the index of the triggering holonym in the list of holonyms in `place_desc`, or nil if an overriding holonym was passed in or there was no triggering holonym. ]=] local function find_placetype_cat_specs(data) local entry_placetype, place_desc, first_holonym_index, overriding_holonym, from_demonym = data.entry_placetype, data.place_desc, data.first_holonym_index, data.overriding_holonym, data.from_demonym local form_of_directive = data.form_of_directive local function fetch_cat_specs(holonym_to_match, index, no_fallback) local holonym_placetype = holonym_to_match.placetype if not holonym_placetype then -- raw text in place of holonym return nil end local holonym_placename = holonym_to_match.unlinked_placename if not holonym_placename then internal_error("Missing unlinked_placename in holonym (index %s): %s", index, holonym_to_match) end local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) return m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return m_placetypes.political_division_cat_handler { entry_placetype = equiv_entry_pt, holonym_placetype = equiv_holonym_pt, holonym_placename = holonym_placename, holonym_index = index, place_desc = place_desc, from_demonym = from_demonym, } end) end, {no_fallback = no_fallback, form_of_directive = form_of_directive} ) if cat_specs and 
cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end local cat_handler, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt] if entry_placetype_data and entry_placetype_data.cat_handler then return entry_placetype_data.cat_handler end end, {no_fallback = no_fallback, form_of_directive = form_of_directive} ) if cat_handler then local cat_specs = m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return cat_handler { entry_placetype = equiv_entry_placetype_and_qualifier.placetype, holonym_placetype = equiv_holonym_pt, holonym_placename = holonym_placename, holonym_index = index, place_desc = place_desc, from_demonym = from_demonym, } end) if cat_specs and cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end end if not no_fallback then local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt] if entry_placetype_data then return m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return entry_placetype_data[equiv_holonym_pt .. 
"/*"] end) end end, {form_of_directive = form_of_directive} ) if cat_specs and cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end end return nil end if overriding_holonym then -- FIXME, change the algorithm to eliminate overriding_holonym local cat_specs, fetched_entry_placetype = fetch_cat_specs(overriding_holonym, nil) if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = overriding_holonym, -- no triggering_holonym_index } end else -- We loop twice over holonyms, the first time setting `no_fallback` so that we process only category specs for -- the specifically given entry placetype (possibly with preceding qualifiers). The reason for this is to -- correctly handle cases like [[Poblacion IX]]: -- {{place|en|barangay|mun/Roxas|p/Capiz|c/Philippines}}. -- "barangay" falls back to "neighborhood", and without the `no_fallback` loop, the neighborhood cat handler run -- on the mun/Roxas holonym will take precedence over the barangay-specific setting for p/Capiz because we -- check, for each holonym in turn, first for a matching spec through political_division_cat_handler, then a cat -- handler, then a wildcard spec like country/*. During the first no-fallback loop, we disable checking for -- wildcard specs because it seems a fallback matching exactly or through a cat handler on an earlier holonym -- would be better than a wildcard match for the exact entry placetype at a later holonym. (FIXME: But I don't -- know for sure; maybe we should check wildcard holonyms on the exact entry placetype first, or contrariwise -- maybe we should check only exact-match holonyms through political_division_cat_handler on the exact entry -- placetype first, not even checking other cat handlers.) 
for i, holonym in ipairs(place_desc.holonyms) do if first_holonym_index and i < first_holonym_index then -- continue else local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i, "no_fallback") if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = holonym, triggering_holonym_index = i, } end end end for i, holonym in ipairs(place_desc.holonyms) do if first_holonym_index and i < first_holonym_index then -- continue else local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i) if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = holonym, triggering_holonym_index = i, } end end end end return nil end -- Turn a list of category specs (see comment at section top) into the corresponding categories (minus the language -- code prefix). The function is given the following arguments: -- (1) the category specs retrieved using find_placetype_cat_specs(); -- (2) the entry placetype used to fetch the entry in `placetype_data` -- (3) the triggering holonym (a holonym object; see comment at top of file) used to fetch the category specs -- (see top-of-section comment); or nil if no triggering holonym. -- The return value is constructed as described in the top-of-section comment. local function cat_specs_to_categories(place_desc, cat_data) local all_cats = {} local cat_specs, entry_placetype, triggering_holonym, triggering_holonym_index = cat_data.cat_specs, cat_data.entry_placetype, cat_data.triggering_holonym, cat_data.triggering_holonym_index if triggering_holonym then for _, cat_spec in ipairs(cat_specs) do local cat if cat_spec == true then cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. 
" +++" else cat = cat_spec end if cat:find("%+%+%+") then local group, key, spec, container_trail = m_placetypes.find_matching_holonym_location { holonym_placetype = triggering_holonym.placetype, holonym_placename = triggering_holonym.unlinked_placename, holonym_index = triggering_holonym_index, place_desc = place_desc, } if group then cat = cat:gsub("%+%+%+", m_strutils.replacement_escape(m_placetypes.get_prefixed_key(key, spec))) insert(all_cats, cat) else mw.log(("Unable to insert category for cat spec '%s' because holonym '%s/%s' did not match a " .. "known location"):format(cat, triggering_holonym.placetype, triggering_holonym.unlinked_placename)) track("cant-match-holonym-for-category-spec") end else insert(all_cats, cat) end end else for _, cat_spec in ipairs(cat_specs) do local cat if cat_spec == true then cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") else cat = cat_spec if cat:find("%+%+%+") then internal_error("Category %s contains +++ but there is no holonym to substitute", cat) end end insert(all_cats, cat) end end return all_cats end -- Return the categories (without initial lang code) that should be added to the entry, given the place description -- (which specifies the entry placetype(s) and holonym(s); see top of file) and a particular entry placetype (e.g. -- "city"). Note that only the holonyms from the place description are looked at, not the entry placetypes in the place -- description. local function get_placetype_cats(place_desc, entry_placetype, from_demonym, form_of_directive) local cats = {} local first_holonym_index = 1 while first_holonym_index <= #place_desc.holonyms do -- Find the category specs (see top-of-file comment) corresponding to the holonym(s) in the place description. 
local cat_data = find_placetype_cat_specs { entry_placetype = entry_placetype, place_desc = place_desc, first_holonym_index = first_holonym_index, from_demonym = from_demonym, form_of_directive = form_of_directive, } -- Check if no category spec could be found. if not cat_data then break end local triggering_holonym = cat_data.triggering_holonym if not triggering_holonym then internal_error("find_placetype_cat_specs should have returned a triggering holonym: %s", cat_data) end -- Generate categories for the category specs found. extend(cats, cat_specs_to_categories(place_desc, cat_data)) -- Also generate categories for other holonyms of the same placetype, so that e.g. -- {{place|en|city|s/Kansas|and|s/Missouri|c/USA}} generates both [[:Category:en:Cities in Kansas, USA]] and -- [[:Category:en:Cities in Missouri, USA]]. first_holonym_index = cat_data.triggering_holonym_index -- Loop over non-fallback equivalent placetypes to the triggering holonym's placetype, in case it is -- non-canonical (e.g. `cities/San Francisco`). This matches the loop over equivalent places in -- key_holonym_into_place_desc(). 
local equiv_triggering_placetypes = m_placetypes.get_placetype_equivs(triggering_holonym.placetype, {no_fallback = true}) for _, equiv in ipairs(equiv_triggering_placetypes) do local other_holonyms_of_same_type = place_desc.holonyms_by_placetype[equiv.placetype] if other_holonyms_of_same_type then for _, other_placename_of_same_type in ipairs(other_holonyms_of_same_type) do if other_placename_of_same_type ~= triggering_holonym.unlinked_placename then local overriding_holonym = { placetype = triggering_holonym.placetype, unlinked_placename = other_placename_of_same_type, } local other_cat_data = find_placetype_cat_specs { entry_placetype = entry_placetype, place_desc = place_desc, overriding_holonym = overriding_holonym, from_demonym = from_demonym, form_of_directive = form_of_directive, } if other_cat_data then extend(cats, cat_specs_to_categories(place_desc, other_cat_data)) end end end end end -- If there are any later-specified holonyms that had the modifier :also, try to produce categories for them -- as well. first_holonym_index = first_holonym_index + 1 while first_holonym_index <= #place_desc.holonyms do if place_desc.holonyms[first_holonym_index].continue_cat_loop then break end first_holonym_index = first_holonym_index + 1 end end if cats[1] then return cats end local entry_pt_default, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(pt) return m_placetypes.placetype_data[pt] and m_placetypes.placetype_data[pt].default end, {form_of_directive = form_of_directive}) if entry_pt_default then return cat_specs_to_categories(place_desc, { cat_specs = entry_pt_default, entry_placetype = equiv_entry_placetype_and_qualifier.placetype, -- no triggering holonym }) end return {} end --[==[ Iterate through each type of place and return a list of the categories that need to be added to the entry. 
The returned categories need to be formatted using `format_cats`, as they can be either topic-style categories (by default) or langname-style categories (if prefixed with `cln:`). The function is passed the overall place spec, which contains all the parsed info on the {{tl|place}} call (see comment at top of file), the parsed arguments (needed for arguments not parsed by `parse_overall_place_spec` and used primarily to add "bare categories" corresponding to toponyms for known locations), and `from_demonym`, which is true if we're being called from {{tl|demonym-noun}} or {{tl|demonym-adj}} (in this case, we only want certain categories added, specifically bare categories corresponding to the specified holonym(s)). ]==] function export.get_cats(args, overall_place_spec, from_demonym) local cats = {} local place_descriptions = overall_place_spec.descs handle_category_implications(place_descriptions, m_placetypes.cat_implications) m_placetypes.augment_holonyms_with_container(place_descriptions) if overall_place_spec.directives then -- not necessarily when called from [[Module:demonym]] for _, directive_terms in ipairs(overall_place_spec.directives) do local spec_cats = directive_terms.spec.cat if spec_cats then if type(spec_cats) == "string" then spec_cats = {spec_cats} end for _, spec_cat in ipairs(spec_cats) do insert(cats, spec_cat) end end if directive_terms.spec.type_prefix then for _, place_desc in ipairs(place_descriptions) do for _, placetype in ipairs(place_desc.placetypes) do if not m_placetypes.placetype_is_ignorable(placetype) then extend(cats, get_placetype_cats(place_desc, placetype, from_demonym, directive_terms.spec.type_prefix)) end end end end end end if not from_demonym then local bare_categories = m_placetypes.get_bare_categories(args, overall_place_spec) extend(cats, bare_categories) end for _, place_desc in ipairs(place_descriptions) do if not from_demonym then for _, placetype in ipairs(place_desc.placetypes) do if not 
m_placetypes.placetype_is_ignorable(placetype) then extend(cats, get_placetype_cats(place_desc, placetype)) end end end -- Also add generic place categories for the holonyms listed (e.g. a category like -- [[Category:Places in Merseyside, England]]). This is handled through the special placetype "*". extend(cats, get_placetype_cats(place_desc, "*", from_demonym)) end if args.cat then -- not necessarily when called from [[Module:demonym]] for _, cat in ipairs(args.cat) do local split_cats = split_on_comma(cat) extend(cats, split_cats) end end return cats end -- Return the category link for a category, given the language code and the name of the category. local function format_cats(lang, cats, sort_key) local full_cats = {} local langcode = lang:getFullCode() for _, cat in ipairs(cats) do -- 'cln' corresponds to {{cln}}, which generates lang-name categories like [[:Category:English abbreviations]] -- (as opposed to topic categories like [[:Category:en:Abbreviations of states of the United States]]). local cln_cat = cat:match("^cln:(.*)$") if cln_cat then insert(full_cats, lang:getFullName() .. " " .. cln_cat) else insert(full_cats, langcode .. ":" .. cat) end end return require(utilities_module).format_categories(full_cats, lang, sort_key, nil, force_cat or m_placetypes.get_force_cat()) end ----------- Main entry point --[==[ Implementation of {{tl|place}}. Meant to be callable from another module (specifically, [[Module:transclude]]). The single argument `data` is an object with the following fields: * `template_args`: Raw arguments specified by {{tl|place}}, possibly modified by {{tl|tcl}}. * `from_tcl`: True if we're being invoked from {{tl|tcl}}. * `drop_extra_info`: True if we should drop most of the "extra info" specified using extra info arguments (capital, largest city, etc.). Usually true when invoked from {{tl|tcl}}. Note that some extra info is still displayed even when `drop_extra_info` is set in order to establish the context (e.g. 
{{para|full}} and {{para|modern}}), and any extra info overridden at the {{tl|tcl}} level is displayed regardless. * `extra_info_overridden_set`: Set of booleans specifying, for each extra info arg, whether it was overridden at the {{tl|tcl}} level. This means, for example, that the values are interpreted according to the language in {{para|1}} instead of always defaulting to English, as is the case when {{tl|place}} is called directly. * `form_of_overridden_args`: Set of objects of the form `{new_directive = ``directive``, new_value = ``value``}` for overriding a given form-of directive (the key) with new directive ``directive`` and new unparsed value ``value``. Both the key and the replacing directive should be canonical. ``value`` will be parsed in the same way as a regular form-of directive except that all specified terms are interpreted in the language specified in {{para|1}}, never in English. This is present so that {{tl|tcl}} can be used on abbreviations like [[GDR]] and [[FYROM]], whose equivalents in a foreign language have language-specific expansions but where the rest of the call should stay the same. * `translation_follows`: If true, any translation specified using t= should follow the definition, after a colon, rather than preceding, with the definition in parens. ]==] function export.format(data) local template_args = data.template_args local list_param = {list = true} local boolean_param = {type = "boolean"} local params = { [1] = {required = true, type = "language", default = "und"}, [2] = {required = true, list = true}, ["t"] = list_param, ["tid"] = {list = true, allow_holes = true}, ["cat"] = list_param, ["nocat"] = boolean_param, ["nocap"] = boolean_param, ["sort"] = true, ["pagename"] = true, -- for testing or documentation purposes ["a"] = true, ["addl"] = true, ["def"] = true, -- params that are only used when transcluding using {{tcl}}/{{transclude}}, to transmit information to {{tcl}}. 
["tcl"] = true, ["tcl_t"] = list_param, ["tcl_tid"] = list_param, ["tcl_nolb"] = true, ["tcl_nolc"] = boolean_param, ["tcl_noextratext"] = boolean_param, } -- add "extra info" parameters for _, extra_arg_spec in ipairs(export.extra_info_args) do params[extra_arg_spec.arg] = list_param end -- FIXME, once we've flushed out any uses, delete the following clause. That will cause def= to be ignored. if template_args.def == "" then error("Cannot currently pass def= as an empty parameter; use def=- if you want to suppress the definition display") end local args = require("Module:parameters").process(template_args, params) if args.a then track("a") if args.a:find("^[Aa]n?$") or args.a:find("^[Tt]he$") then track("a/article") else error("a= can only be used to specify a definite or indefinite article (and preferably use |nocap=1 instead to get the initial letter lowercase); see especially the documentation on the [[Template:place#Mixed format|mixed format]], which can be used to add arbitrary text before the placetype") end end data.args = args local overall_place_spec = parse_overall_place_spec(data) data.overall_place_spec = overall_place_spec return get_def(data) .. ( args.nocat and "" or format_cats(args[1], export.get_cats(args, overall_place_spec), args.sort)) end --[==[ Actual entry point of {{tl|place}}. 
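As a rough illustration (the module name and the argument values below are assumptions made for the example): a page containing {{tl|place|en|city/and/county|s/Pennsylvania|c/US}} hands its arguments to this function through the parent frame, so the result should correspond to a direct call of `export.format` from another module, along these lines:
```
-- hypothetical caller; "Module:place" is an assumed name for this module
local m_place = require("Module:place")
local text = m_place.format {
	-- positional template arguments: language code, placetype(s), then holonyms
	template_args = {"en", "city/and/county", "s/Pennsylvania", "c/US"},
}
```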
]==] function export.show(frame) return export.format { template_args = frame:getParent().args, } end return export 7oy34n5np1xetyvtopk0kg7ib5dxhdr ᥖᥣᥭ 0 289654 5720822 1651642 2026-04-21T08:19:40Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720822 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-swe-pro|*taːjᴬ²}}, จาก{{inh|tdd|tai-pro|*p.taːjᴬ}}; ร่วมเชื้อสายกับ{{cog|th|ตาย}}, {{cog|nod|ᨲᩣ᩠ᨿ}}, {{cog|lo|ຕາຍ}}, {{cog|khb|ᦎᦻ}}, {{cog|blt|ꪔꪱꪥ}}, {{cog|shn|တၢႆ}}, {{cog|kht|တၢဲႈ}}, {{cog|phk|တႝ}}, {{cog|aho|𑜄𑜩}}, {{cog|za|dai}} === การออกเสียง === * {{IPA|tdd|/taːj˧˧/}} === คำกริยา === {{tdd-verb}} # [[ตาย]] mg0eapmp57gwvgvvanek0y93vxdzuom ᥐᥨᥝ 0 296722 5720734 1651996 2026-04-21T05:46:45Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720734 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/ko˧˧/}} === รากศัพท์ 1 === ร่วมเชื้อสายกับ{{cog|th|กลัว}}, {{cog|lo|ກົວ}}, {{cog|tts|กัว}}, {{cog|nod|ᨠᩖ᩠ᩅᩫ}}, {{cog|kkh|ᨠ᩠ᩅᩫ}}, {{cog|khb|ᦷᦂ}}, {{cog|shn|ၵူဝ်}}, {{cog|blt|ꪀꪺ}}, {{cog|aho|𑜀𑜥}} ==== คำกริยา ==== {{tdd-verb}} # {{lb|tdd|สกรรม}} [[กลัว]] === รากศัพท์ 2 === {{inh+|tdd|tai-pro|*koːᴬ}}; ร่วมเชื้อสายกับ{{cog|th|กอ}}, {{cog|lo|ກໍ}}, {{cog|tts|กอ}}, {{cog|khb|ᦂᦸ}}, {{cog|shn|ၵေႃ}}, {{cog|blt|ꪀꪷ}}, {{cog|aho|𑜀𑜦𑜡}}, {{cog|za|go}} ==== คำนาม ==== {{tdd-noun}} # [[กอ]] {{gloss|กลุ่มพืช}} {{topics|tdd|ความกลัว}} 63chloxyp7jt444qjyukvwicsgru1od ᥐᥝᥲ 0 300442 5720737 1652138 2026-04-21T06:21:46Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720737 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-pro|*kɤwꟲ}}, จาก{{der|tdd|ltc|-}} {{ltc-l|九}}, จาก{{der|tdd|och|-}} {{och-l|九}}, จาก{{der|tdd|sit-pro|*d/s-kəw}}; ร่วมเชื้อสายกับ{{cog|th|เก้า}}, {{cog|nod|ᨠᩮᩢ᩶ᩣ}}, {{cog|lo|ເກົ້າ}}, {{cog|khb|ᦂᧁᧉ}}, {{cog|blt|ꪹꪀ꫁ꪱ}}, {{cog|shn|ၵဝ်ႈ}}, {{cog|aho|𑜀𑜧}}, {{cog|pcc|guz}}, {{cog|za|gouj}}, {{cog|skb|กู̂}} === การออกเสียง === * {{IPA|tdd|/kaw˧˩/}} === เลข === {{tdd-num}} # [[เก้า]] qb1ym41jysogtb6b7nsin76ujkpzuor ᥟᥧᥱ 0 301006 5720736 1513076 
2026-04-21T06:07:28Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720736 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|shn|ဢူႇ}}, {{cog|nod|ᩋᩪ᩵}}, {{cog|tts|อู่}}, {{cog|lo|ອູ່}}, {{cog|kkh|ᩋᩪ᩵}}, {{cog|khb|ᦀᦴᧈ}}, {{cog|blt|ꪮꪴ꪿}}, {{cog|za|uq|tr=อู่}}, {{cog|zzj|wq|tr=อื่อ|t=อู่}} === การออกเสียง === * {{IPA|tdd|/ʔu˩˩/}} === คำนาม === {{tdd-noun}} # [[เปล]], [[อู่]] {{gloss|เปล}} p2f4ldhxubgusqgk78q4b0cpogvvc74 ᥕᥣᥒ 0 303770 5720743 4614495 2026-04-21T06:42:03Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720743 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-swe-pro|*jaːŋᴮ²}}; ร่วมเชื้อสายกับ{{cog|th|ย่าง}}, {{cog|tts|ญ่าง}} หรือ {{m|tts|ย่าง}}, {{cog|lo|ຍ່າງ}}, {{cog|nyw|ญ่าง}}, {{cog|khb|ᦍᦱᧂᧈ}}, {{cog|blt|ꪑ꪿ꪱꪉ}}, {{cog|shn|ယၢင်ႈ}}, {{cog|zzj|yangz}}, {{cog|za|yangz}} === การออกเสียง === * {{IPA|tdd|/jaːŋ˧˧/}} === คำกริยา === {{tdd-verb}} # [[เดิน]] 76fm6aom7613neane6fjnga72s9js58 ᥕᥣᥒᥲ 0 303772 5720742 1652270 2026-04-21T06:38:53Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720742 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-swe-pro|*ˀjaːŋꟲ¹}}, จาก{{inh|tdd|tai-pro|*ˀjɯəŋꟲ}}; ร่วมเชื้อสายกับ{{cog|th|ย่าง}}, {{cog|lo|ຢ້າງ}}, {{cog|khb|ᦊᦱᧂᧉ}}, {{cog|shn|ယၢင်ႈ}} === การออกเสียง === * {{IPA|tdd|/jaːŋ˧˩/}} === คำกริยา === {{tdd-verb}} # [[ย่าง]], [[ปิ้ง]] 8k0oqlj2sso2z5y1ftmllastjn5m5a2 ဝေင်ꩻ 0 311609 5720708 1581456 2026-04-21T01:59:10Z OctraBot 3198 /* ภาษากะเหรี่ยงปะโอ */ 5720708 wikitext text/x-wiki == ภาษากะเหรี่ยงปะโอ == === คำนาม === {{head|blk|คำนาม}} # [[นคร]], [[เมืองใหญ่]], [[เวียง]] 1pb90fnn57evf5mgezznrm9ldnv04b9 राजधानी 0 314314 5720696 1605653 2026-04-21T01:38:56Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-\|เมืองใหญ่\}\} +|นคร}}) 5720696 wikitext text/x-wiki == ภาษากงกัณ == === รากศัพท์ === {{lbor|kok|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}} === คำนาม === {{kok-pos|n|razdhani|ರಾಜ್ಧಾನಿ}} # [[เมืองหลวง]], [[เมืองเอก]] {{topics|kok|นคร}} == ภาษาเนปาล == === รากศัพท์ === 
{{lbor|ne|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}} === การออกเสียง === * {{ne-IPA|rājdhānī}} === คำนาม === {{ne-noun}} # [[เมืองหลวง]], [[เมืองเอก]] {{topics|ne|นคร}} == ภาษาฮินดี == === รากศัพท์ === {{lbor|hi|sa|राजधानी}}, จาก{{com|sa|राज|धानी|gloss2=บ้าน, เรือน|nocat=1}} === การออกเสียง === * {{hi-IPA}} === คำนาม === {{hi-noun|g=f}} # [[เมืองหลวง]], [[เมืองเอก]] #: {{syn|hi|दारुलहुकूमत}} ==== การผันรูป ==== {{hi-ndecl|<F>}} {{topics|hi|นคร}} d8hgdimotuve1y6zlqunuslq209wfvr หมวดหมู่:th:นครในไทย 14 316295 5720693 1909146 2026-04-21T01:34:30Z OctraBot 3198 OctraBot ย้ายหน้า [[หมวดหมู่:th:เมืองใหญ่ในไทย]] ไปยัง [[หมวดหมู่:th:นครในไทย]] โดยไม่สร้างหน้าเปลี่ยนทางตามมา 1610099 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx แม่แบบ:langname-lite 10 324689 5720746 1626212 2026-04-21T06:48:49Z OctraBot 3198 5720746 wikitext text/x-wiki <includeonly>{{#switch:{{str left|{{{1<noinclude>|en</noinclude>}}}|1}} |a={{#switch:{{{1|}}} |aa=Afar |aag=Ambrak |aak=Ankave |aan=Anambé |aau=Abau |aav={{langname-lite/familycode|Austroasiatic|{{{is family|}}}|{{{allow family|}}}}} |aav-pro=Proto-Austroasiatic |aav-khs-pro=Proto-Khasian |aaz=Amarasi |ab=Abkhaz |abc=Ambala Ayta |abe=Abenaki |abp=Abenlen Ayta |abs=Ambonese Malay |abx=Inabaknon |ace=Acehnese |acv=Achumawi |acw=Hijazi Arabic |acy=Cypriot Arabic |acz=Acheron |ada=Adangme |adl=Galo |adw=Amondawa |ady=West Circassian |adz=Adzera |ae=Avestan |aeb=Tunisian Arabic |aek=Haeke |aem=Arem |aey=Amele |af=Afrikaans |afa-pro=Proto-Afroasiatic |afb=Gulf Arabic |agn=Agutaynen |agv=Remontado Agta |aho=Ahom |aht=Ahtna |aii=Assyrian Neo-Aramaic |ain=Ainu |aio=Aiton |ajg=Aja (West Africa) |aji=Ajië |ajp=South Levantine Arabic |ak=Akan |akk=Akkadian |akl=Aklanon |akr=Araki |alc=Kawésqar |ale=Aleut |ali=Amaimon |alj=Alangan |als={{langname-lite/etymcode|Tosk Albanian|Albanian|{{{allow etym|}}}}} |alt=Southern Altai |alu='Are'are |alv-gbe-pro=Proto-Gbe |ami=Amis |amm=Ama |ams=Southern Amami Ōshima 
|amu=Guerrero Amuzgo |an=Aragonese |ane=Xârâcùù |ang=Old English |anm=Anāl |anq=Jarawa |anw=Anaang |aoa=Angolar |aot=Atong (India) |apc=North Levantine Arabic |apl=Lipan |apt=Apatani |apw=Western Apache |aqd=Ampari Dogon |aqg=Arigidi |ar=Arabic |arc=Aramaic |ark=Arikapú |arn=Mapudungun |arx=Aruá |ary=Moroccan Arabic |arz=Egyptian Arabic |as=Assamese |asb=Assiniboine |ask=Ashkun |ast=Asturian |atc=Atsahuaca |atd=Ata Manobo |ath-pro=Proto-Athabaskan |att=Pamplona Atta |atz=Arta |aui=Anuki |avu=Avokaya |awb=Awa (New Guinea) |awg=Anguthimri |awt=Araweté |awx=Awara |ay=Aymara |ayl=Libyan Arabic |az=Azerbaijani |azc-nah-pro=Proto-Nahuan |azc-pro=Proto-Uto-Aztecan |azd=Eastern Durango Nahuatl |azg=San Pedro Amuzgos Amuzgo |#default={{langname-lite/unknowncode|{{{1}}}}}}} |b={{#switch:{{{1|}}} |ba=Bashkir |ban=Balinese |bar=Bavarian |bat={{langname-lite/familycode|Baltic|{{{is family|}}}|{{{allow family|}}}}} |bat-pro={{langname-lite/etymcode|Proto-Baltic|Proto-Balto-Slavic|{{{allow etym|}}}}} |bay=Batuley |bbb=Barai |bbd=Bau |bbn=Uneapa |bbr=Girawa |bca=Central Bai |bch=Bariai |bcl=Central Bikol |bdq=Bahnar |be=Belarusian |bej=Beja |bem=Bemba |ber-pro=Proto-Berber |beu=Blagar |bew=Betawi |bew-kot={{langname-lite/etymcode|Betawi Kota|Betawi|{{{allow etym|}}}}} |bfa=Bari |bfs=Southern Bai |bft=Balti |bg=Bulgarian |bgs=Tagabawa |bgt=Bughotu |bhg=Binandere |bi=Bislama |bji=Burji |bjn=Banjarese |bkd=Binukid |bkl=Berik |bks=Masbate Sorsogon |bla=Blackfoot |ble=Balanta-Kentohe |bll=Biloxi |bln=Southern Catanduanes Bikol |blr=Blang |blt=Tai Dam |blx=Mag-Indi Ayta |bm=Bambara |bmh=Kein |bmi=Bagirmi |bmr=Muinane |bmu=Somba-Siawari |bmx=Baimak |bn=Bengali |bnn=Bunun |bno=Asi |bnq=Bantik |bnt-lal=Lala (South Africa) |bnt-phu=Phuthi |bnt-pro=Proto-Bantu |bo=Tibetan |bor=Borôro |bpg=Bonggo |bpi=Bagupi |bps=Sarangani Blaan |bqb=Bagusa |bqc=Boko |bqp=Busa |br=Breton |brg=Baure |brh=Brahui |brx=Bodo (India) |bsa=Abinomn |bsh=Kamkata-viri |bsk=Burushaski |bsq=Bassa |btn=Ratagnon 
|bto=Rinconada Bikol |btw=Butuanon |bug=Buginese |byn=Blin |byt=Berti |bzj=Belizean Creole |#default={{langname-lite/unknowncode|{{{1}}}}}}} |c={{#switch:{{{1|}}} |ca=Catalan |caa=Ch'orti' |cab=Garifuna |cal=Carolinian |car=Kari'na |cav=Cavineña |cba-nut=Nutabe |cbi=Chachi |cbk=Chavacano |ccs-pro=Proto-Kartvelian |cdc-cbm-pro=Proto-Central Chadic |cdc-pro=Proto-Chadic |cdm=Chepang |cdo=Eastern Min |ce=Chechen |ceb=Cebuano |cel={{langname-lite/familycode|Celtic|{{{is family|}}}|{{{allow family|}}}}} |cel-bry-pro=Proto-Brythonic |cel-gau=Gaulish |cel-pro=Proto-Celtic |cgc=Kagayanen |ch=Chamorro |chb=Chibcha |chg=Chagatai |chk=Chuukese |chl=Cahuilla |chn=Chinook Jargon |cho=Choctaw |chp=Chipewyan |chy=Cheyenne |cia=Cia-Cia |cic=Chickasaw |cim=Cimbrian |cja=Western Cham |cjm=Eastern Cham |cjo=Pajonal Ashéninka |cjs=Shor |ckb=Central Kurdish |ckv=Kavalan |clc=Chilcotin |clw=Chulym |cmc-pro=Proto-Chamic |cmn=Mandarin |cmn-ear={{langname-lite/etymcode|Early Mandarin|Mandarin|{{{allow etym|}}}}} |cng=Northern Qiang |cnk=Khumi Chin |cnx=Middle Cornish |co=Corsican |cof=Tsafiki |com=Comanche |con=Cofán |coo=Comox |cps=Capiznon |crg=Michif |crh=Crimean Tatar |cro=Crow |crs=Seychellois Creole |crw=Chrau |cs=Czech |csb=Kashubian |ctd=Tedim Chin |cts=Northern Catanduanes Bikol |cu=Old Church Slavonic |cus-pro=Proto-Cushitic |cv=Chuvash |cy=Welsh |cyo=Cuyunon |#default={{langname-lite/unknowncode|{{{1}}}}}}} |d={{#switch:{{{1|}}} |da=Danish |dag=Dagbani |dak=Dakota |dcr=Negerhollands |de=German |dgc=Casiguran Dumagat Agta |dgr=Dogrib |dhv=Drehu |dif=Dieri |din=Dinka |dis=Dimasa |dje=Zarma |djk=Aukan |dlm=Dalmatian |dmn-dam=Dama (Sierra Leone) |dng=Dungan |dni=Lower Grand Valley Dani |doz=Dorze |dra-okn=Old Kannada |dsb=Lower Sorbian |dtp=Central Dusun |duf=Dumbea |dum=Middle Dutch |duo=Dupaningan Agta |duu=Drung |dux=Duun |dv=Dhivehi |dz=Dzongkha |#default={{langname-lite/unknowncode|{{{1}}}}}}} |E={{#switch:{{{1|}}} |EL.={{langname-lite/etymcode|Ecclesiastical 
Latin|Latin|{{{allow etym|}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}}}} |e={{#switch:{{{1<noinclude>|en</noinclude>}}} |ebk=Eastern Bontoc |ee=Ewe |eee=E |efi=Efik |egl=Emilian |egy=Egyptian |el=Greek |emb=Embaloh |emi=Mussau-Emira |en=English |enm=Middle English |eo=Esperanto |es=Spanish |esx-esk-pro=Proto-Eskimo |esx-inu-pro=Proto-Inuit |et=Estonian |ett=Etruscan |eu=Basque |euq-pro=Proto-Basque |evn=Evenki |ext=Extremaduran |eya=Eyak |#default={{langname-lite/unknowncode|{{{1}}}}}}} |f={{#switch:{{{1|}}} |fa=Persian |fab=Annobonese |fad=Wagi |fax=Fala |fbl=West Miraya Bikol |ff=Fula |fi=Finnish |fit=Meänkieli |fiu-pro={{langname-lite/etymcode|Proto-Finno-Ugric|Proto-Uralic|{{{allow etym|}}}}} |fj=Fijian |fkv=Kven |fmp=Fe'fe' |fng=Fanagalo |fo=Faroese |foi=Foi |fon=Fon |fos=Siraya |fr=French |fr-CA={{langname-lite/etymcode|Canadian French|French|{{{allow etym|}}}}} |frd=Fordata |frk={{langname-lite/etymcode|Frankish|Proto-West Germanic|{{{allow etym|}}}}} |frm=Middle French |fro=Old French |fro-nor={{langname-lite/etymcode|Old Northern French|Old French|{{{allow etym|}}}}} |frp=Franco-Provençal |frr=North Frisian |fud=East Futuna |fur=Friulian |fut=Futuna-Aniwa |fwa=Fwâi |fy=West Frisian |#default={{langname-lite/unknowncode|{{{1}}}}}}} |g={{#switch:{{{1|}}} |ga=Irish |gaa=Ga |gad=Gaddang |gag=Gagauz |gah=Alekano |gal=Galoli |gap=Gal |gaw=Nobonob |gbf=Gaikundi |gce=Galice |gcf=Antillean Creole |gd=Scottish Gaelic |gem={{langname-lite/familycode|Germanic|{{{is family|}}}|{{{allow family|}}}}} |gem-pro=Proto-Germanic |ges=Geser-Gorom |gil=Gilbertese |gim=Gimi (Papuan) |gkm={{langname-lite/etymcode|Byzantine Greek|Ancient Greek|{{{allow etym|}}}}} |gl=Galician |gmh=Middle High German |gml=Middle Low German |gmq={{langname-lite/familycode|North Germanic|{{{is family|}}}|{{{allow family|}}}}} |gmq-mno=Middle Norwegian |gmq-oda=Old Danish |gmq-osw=Old Swedish |gmq-pro=Proto-Norse |gmu=Gumalu |gmw-cfr=Central Franconian |gmw-ecg=East Central German 
|gmw-jdt=Jersey Dutch |gmw-pro=Proto-West Germanic |gmw-stm=Sathmar Swabian |gmy=Mycenaean Greek |goh=Old High German |gor=Gorontalo |got=Gothic |grc=Ancient Greek |grh=Gbiri-Niragu |grk-mar=Mariupol Greek |grk-pro=Proto-Hellenic |grt=Garo |gsw=Alemannic German |gtu=Aghu Tharrnggala |gu=Gujarati |gug=Paraguayan Guarani |gul=Gullah |gun=Mbya Guarani |gur=Farefare |guw=Gun |gv=Manx |gwi=Gwich'in |gyb=Garus |#default={{langname-lite/unknowncode|{{{1}}}}}}} |h={{#switch:{{{1|}}} |ha=Hausa |haa=Hän |hai=Haida |hak=Hakka |hal=Halang |haw=Hawaiian |hch=Huichol |hdy=Hadiyya |he=Hebrew |hi=Hindi |hid=Hidatsa |hil=Hiligaynon |hit=Hittite |hmn-pro=Proto-Hmongic |hmx-pro=Proto-Hmong-Mien |ho=Hiri Motu |hop=Hopi |hro=Haroi |hrx=Hunsrik |hsb=Upper Sorbian |ht=Haitian Creole |hts=Hadza |hu=Hungarian |hup=Hupa |huq=Tsat |hur=Halkomelem |huu=Murui Huitoto |hvk=Haveke |hwc=Hawaiian Creole |hy=Armenian |#default={{langname-lite/unknowncode|{{{1}}}}}}} |i={{#switch:{{{1|}}} |ia=Interlingua |iba=Iban |ibg=Ibanag |ibl=Ibaloi |id=Indonesian |idb=Indo-Portuguese |idi=Idi |ie=Interlingue |ifb=Batad Ifugao |ifu=Mayoyao Ifugao |ig=Igbo |igl=Igala |igo=Isebe |ii=Nuosu |iir-pro=Proto-Indo-Iranian |ijj=Ede Ije |ik=Inupiaq |ilk=Ilongot |ilo=Ilocano |imn=Imonda |inc-ash=Ashokan Prakrit |inc-kho=Kholosi |inc-oas=Early Assamese |pra=Prakrit |ine-bsl-pro=Proto-Balto-Slavic |ine-pro=Proto-Indo-European |ine-toc-pro=Proto-Tocharian |ing=Deg Xinag |inn=Isinai |io=Ido |iow=Chiwere |ira-pro=Proto-Iranian |iry=Iraya |is=Icelandic |isd=Isnag |ish=Esan |ist=Istriot |it=Italian |itc-ola={{langname-lite/etymcode|Old Latin|Latin|{{{allow etym|}}}}} |itc-pro=Proto-Italic |itl=Itelmen |its=Itsekiri |itv=Itawit |iu=Inuktitut |ium=Iu Mien |ivb=Ibatan |ivv=Ivatan |izh=Ingrian |#default={{langname-lite/unknowncode|{{{1}}}}}}} |j={{#switch:{{{1|}}} |ja=Japanese |jam=Jamaican Creole |jaz=Jawe |jct=Krymchak |jje=Jeju |jkr=Koro (India) |jpx-pro=Proto-Japonic |jpx-ryu-pro=Proto-Ryukyuan |jra=Jarai |juc=Jurchen |juh=Hone 
|jv=Javanese |#default={{langname-lite/unknowncode|{{{1}}}}}}} |k={{#switch:{{{1|}}} |ka=Georgian |kaa=Karakalpak |kab=Kabyle |kac=Jingpho |kak=Kayapa Kallahan |kam=Kamba |kar-pro=Proto-Karen |kaw=Old Javanese |kay=Kamayurá |kbd=East Circassian |kbk=Grass Koiari |kbq=Kamano |kcg=Tyap |kdr=Karaim |kea=Kabuverdianu |kek=Q'eqchi |ket=Ket |kgp=Kaingang |kha=Khasi |khb=Lü |khi-kun=ǃKung |khl=Lusi |kht=Khamti |ki=Kikuyu |kij=Kilivila |kim=Tofa |kiy=Kirikiri |kjh=Khakas |kju=Kashaya |kk=Kazakh |kky=Guugu Yimidhirr |kl=Greenlandic |klg=Tagakaulu Kalagan |klq=Rumu |kls=Kalasha |klu=Klao |klv=Maskelynes |klw=Lindu |km=Khmer |kmb=Kimbundu |kmc=Southern Kam |kmf=Kare (New Guinea) |kmk=Limos Kalinga |kmr=Northern Kurdish |knb=Lubuagan Kalinga |kne=Kankanaey |knf=Mankanya |ko=Korean |kok=Konkani |kos=Kosraean |koy=Koyukon |kpg=Kapingamarangi |kpm=Koho |kpv=Komi-Zyrian |kpw=Kobon |kpx=Mountain Koiari |kqf=Kakabai |kqi=Koitabu |kr=Kanuri |kri=Krio |krj=Kinaray-a |krl=Karelian |ks=Kashmiri |ksd=Tolai |ksi=Krisa |ksk=Kansa |ksw=S'gaw Karen |ksx=Kedang |ktb=Kambaata |ktz=Juǀ'hoan |kud=Auhelawa |kum=Kumyk |kus=Kusaal |kuu=Upper Kuskokwim |kw=Cornish |kwa=Dâw |kwe=Kwerba |kwk=Kwak'wala |kxd=Brunei Malay |kxo=Kanoé |kxs=Kangjia |ky=Kyrgyz |kzg=Kikai |#default={{langname-lite/unknowncode|{{{1}}}}}}} |L={{#switch:{{{1|}}} |LL.={{langname-lite/etymcode|Late Latin|Latin|{{{allow etym|}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}}}} |l={{#switch:{{{1|}}} |la=Latin |la-ecc={{langname-lite/etymcode|Ecclesiastical Latin|Latin|{{{allow etym|}}}}} |la-lat={{langname-lite/etymcode|Late Latin|Latin|{{{allow etym|}}}}} |la-med={{langname-lite/etymcode|Medieval Latin|Latin|{{{allow etym|}}}}} |la-vul={{langname-lite/etymcode|Vulgar Latin|Latin|{{{allow etym|}}}}} |lac=Lacandon |lad=Ladino |lay=Lama Bai |lb=Luxembourgish |lbk=Central Bontoc |lbl=Libon Bikol |lbn=Lamet |lew=Ledo Kaili |lg=Luganda |lhu=Lahu |li=Limburgish |lic=Hlai |lif=Limbu |lij=Ligurian |liv=Livonian |lkt=Lakota |lld=Ladin 
|llu=Lau |lml=Raga |lmo=Lombard |lmy=Laboya |ln=Lingala |lng={{langname-lite/etymcode|Lombardic|Old High German|{{{allow etym|}}}}} |lo=Lao |loc=Inonhan |loj=Lou |los=Loniu |lou=Louisiana Creole |lsi=Lashi |lt=Lithuanian |ltc=Middle Chinese |ltg=Latgalian |lud=Ludian |luo=Luo |lus=Mizo |lut=Lushootseed |lv=Latvian |lwh=White Lachi |lzz=Laz |#default={{langname-lite/unknowncode|{{{1}}}}}}} |M={{#switch:{{{1|}}} |ML.={{langname-lite/etymcode|Medieval Latin|Latin|{{{allow etym|}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}}}} |m={{#switch:{{{1|}}} |mad=Madurese |mag=Magahi |mak=Makasar |mam=Man |map={{langname-lite/familycode|Austronesian|{{{is family|}}}|{{{allow family|}}}}} |map-ata-pro=Proto-Atayalic |map-pro=Proto-Austronesian |maw=Mampruli |maz=Central Mazahua |mba=Higaonon |mbb=Western Bukidnon Manobo |mbd=Dibabawon Manobo |mbi=Ilianen Manobo |mbj=Nadëb |mch=Ye'kwana |mcz=Mawan |mdf=Moksha |mdh=Maguindanao |mee=Mengen |mel=Central Melanau |men=Mende (Sierra Leone) |meo=Kedah Malay |mfe=Mauritian Creole |mfh=Matal |mg=Malagasy |mga=Middle Irish |mh=Marshallese |mhn=Mòcheno |mhr=Eastern Mari |mhx=Lhao Vo |mi=Māori |mih=Chayuco Mixtec |min=Minangkabau |miq=Miskito |mis-phi=Philistine |mk=Macedonian |mkh-ban-pro=Proto-Bahnaric |mkh-pro=Proto-Mon-Khmer |mkh-vie-pro=Proto-Vietic |mkj=Mokilese |mkt=Vamale |ml=Malayalam |mlp=Bargam |mlu=To'abaita |mmg=North Ambrym |mmn=Mamanwa |mmr=Western Xiangxi Miao |mn=Mongolian |mnc=Manchu |mnd=Mondé |mnk=Mandinka |mnp=Northern Min |mnw=Mon |moa=Mwan |mog=Mongondow |moh=Mohawk |mop=Mopan Maya |mos=Moore |mpg=Marba |mps=Dadibi |mqe=Matepi |mpj=Martu Wangka |mqs=West Makian |mqv=Mosimo |mqw=Murupi |mr=Marathi |mrc=Maricopa |mrk=Hmwaveke |mro=Mru |mrw=Maranao |ms=Malay |ms-cla={{langname-lite/etymcode|Classical Malay|Malay|{{{allow etym|}}}}} |ms-old={{langname-lite/etymcode|Old Malay|Malay|{{{allow etym|}}}}} |msb=Masbatenyo |msk=Mansaka |msm=Agusan Manobo |msn=Vurës |msq=Caac |mt=Maltese |mtc=Munit |mte=Alu |mtq=Muong 
|mtv=Asaro'o |mul=Translingual |muz=Mursi |mva=Manam |mvd=Mamboru |mvi=Miyako |mwl=Mirandese |mww=White Hmong |my=Burmese |myv=Erzya |mzp=Movima |mzw=Deg |#default={{langname-lite/unknowncode|{{{1}}}}}}} |n={{#switch:{{{1|}}} |na=Nauruan |nag=Naga Pidgin |nah=Nahuatl |nai-tap=Tapachultec |nak=Nakanai |nan=Min Nan |nan-hbl=Hokkien |nap=Neapolitan |naz=Coatepec Nahuatl |nb=Norwegian Bokmål |nbk=Nake |nce=Yale |ncf=Notsi |ncg=Nisga'a |nch=Central Huasteca Nahuatl |nci=Classical Nahuatl |ncj=Northern Puebla Nahuatl |nd=Northern Ndebele |nds=Low German |nds-de=German Low German |nds-nl=Dutch Low Saxon |ne=Nepali |nec=Nedebang |nef=Nefamese |nem=Nemi |nev=Nyaheun |nfl=Äiwoo |ngf-pro=Proto-Trans-New Guinea |nhe=Eastern Huasteca Nahuatl |nhn=Central Nahuatl |nht=Ometepec Nahuatl |nhx=Mecayapan Nahuatl |nia=Nias |nic-pro=Proto-Niger-Congo |nio=Nganasan |niu=Niuean |niv=Nivkh |niz=Ningil |njm=Angami |njo=Ao |njz=Nyishi |nkp=Niuatoputapu |nkr=Nukuoro |nl=Dutch |nlc=Nalca |nlg=Gela |nmb=Big Nambas |nn=Norwegian Nynorsk |no=Norwegian |nod=Northern Thai |nog=Nogai |non=Old Norse |non-oen={{langname-lite/etymcode|Old East Norse|Old Norse|{{{allow etym|}}}}} |nr=Southern Ndebele |nrf=Norman |nrl=Ngarluma |nrn=Norn |nso=Northern Sotho |ntp=Northern Tepehuan |nua=Yuanga |nuk=Nootka |nup=Nupe |nus=Nuer |nut=Nùng |nv=Navajo |nxq=Naxi |ny=Chichewa |nys=Nyunga |nza=Tigon Mbembe |nzd=Nzadi |#default={{langname-lite/unknowncode|{{{1}}}}}}} |o={{#switch:{{{1|}}} |obr=Old Burmese |obt=Old Breton |oc=Occitan |och=Old Chinese |oco=Old Cornish |odt=Old Dutch |ofs=Old Frisian |oge=Old Georgian |ohu=Old Hungarian |oj=Ojibwe |ojp=Old Japanese |oka=Okanagan |okm=Middle Korean |okn=Okinoerabu |oko=Old Korean |okz=Old Khmer |okz-ang={{langname-lite/etymcode|Angkorian Old Khmer|Old Khmer|{{{allow etym|}}}}} |olo=Livvi |om=Oromo |oma=Omaha-Ponca |omq-otp-pro=Proto-Oto-Pamean |omq-pro=Proto-Oto-Manguean |omx=Old Mon |ono=Onondaga |ood=O'odham |oon=Önge |opo=Opao |oro=Orokolo |orv=Old East Slavic 
|os=Ossetian |osc=Oscan |osp=Old Spanish |osx=Old Saxon |ota=Ottoman Turkish |ote=Mezquital Otomi |otk=Old Turkic |oto-otm-pro=Proto-Otomi |oto-pro=Proto-Otomian |otw=Ottawa |oui=Old Uyghur |ovd=Elfdalian |owl=Old Welsh |#default={{langname-lite/unknowncode|{{{1}}}}}}} |p={{#switch:{{{1|}}} |pa=Punjabi |paa-nha-pro=Proto-North Halmahera |pac=Pacoh |pag=Pangasinan |pal=Middle Persian |pam=Kapampangan |pap=Papiamentu |pau=Palauan |pbv=Pnar |pcc=Bouyei |pcm=Nigerian Pidgin |pdc=Pennsylvania German |pdt=Plautdietsch |pdu=Kayan |peh=Bonan |peo=Old Persian |phi-pro=Proto-Philippine |phk=Phake |phl=Palula |phn=Phoenician |pi=Pali |pis=Pijin |piz=Pije |pjt=Pitjantjatjara |pkc=Baekje |pkp=Pukapukan |pl=Polish |ple=Palu'e |plg=Pilagá |pln=Palenquero |plu=Palikur |plv=Southwest Palawano |plw=Brooke's Point Palawano |ply=Bolyu |pml=Sabir |pms=Piedmontese |pnr=Panim |pns=Ponosakan |pnw=Panyjima |pon=Pohnpeian |poo=Central Pomo |pov=Guinea-Bissau Creole |pox=Polabian |poz-cet-pro=Proto-Central-Eastern Malayo-Polynesian |poz-mcm-pro=Proto-Malayo-Chamic |poz-mly-pro=Proto-Malayic |poz-msa-pro=Proto-Malayo-Sumbawan |poz-oce-pro=Proto-Oceanic |poz-pep-pro=Proto-Eastern Polynesian |poz-pnp-pro=Proto-Nuclear Polynesian |poz-pol={{langname-lite/familycode|Polynesian|{{{is family|}}}|{{{allow family|}}}}} |poz-pol-pro=Proto-Polynesian |poz-pro=Proto-Malayo-Polynesian |ppk=Uma |ppl=Pipil |ppu=Papora |pqe-pro=Proto-Eastern Malayo-Polynesian |prc=Parachi |prg=Old Prussian |pri=Paicî |pro=Old Occitan |ps=Pashto |pt=Portuguese |pwn=Paiwan |#default={{langname-lite/unknowncode|{{{1}}}}}}} |q={{#switch:{{{1|}}} |qfa-kms-pro=Proto-Kam-Sui |qfa-lic-pro=Proto-Hlai |qfa-sub={{langname-lite/familycode|substrate|{{{is family|}}}|{{{allow family|}}}}} |qfa-tak={{langname-lite/familycode|Kra-Dai|{{{is family|}}}|{{{allow family|}}}}} |qfa-yen-pro=Proto-Yeniseian |qsb-ibe={{langname-lite/etymcode|Paleo-Hispanic|Undetermined|{{{allow etym|}}}}} |qu=Quechua |qua=Quapaw |quc=K'iche' 
|#default={{langname-lite/unknowncode|{{{1}}}}}}} |r={{#switch:{{{1|}}} |rad=Rade |rah=Rabha |ran=Riantana |rap=Rapa Nui |raw=Rawang |ray=Rapa |rbl=East Miraya Bikol |rel=Rendille |rgn=Romagnol |rhg=Rohingya |ril=Riang |rki=Rakhine |rm=Romansh |rme=Angloromani |rmf=Kalo Finnish Romani |rmg=Traveller Norwegian |rmn=Balkan Romani |rmo=Sinte Romani |rmp=Rempi |rmq=Caló |rmt=Domari |rmw=Welsh Romani |rng=Ronga |ro=Romanian |roa={{langname-lite/familycode|Romance|{{{is family|}}}|{{{allow family|}}}}} |roa-brg=Bourguignon |roa-fcm=Franc-Comtois |roa-gal=Gallo |roa-leo=Leonese |roa-oca=Old Catalan |roa-ole=Old Leonese |roa-opt=Old Galician-Portuguese |roa-tar=Tarantino |rog=Northern Roglai |rol=Romblomanon |rom=Romani |roo=Rotokas |rop=Australian Kriol |rpt=Rapting |rth=Ratahan |rtm=Rotuman |ru=Russian |rue=Carpathian Rusyn |rug=Roviana |ruo=Istro-Romanian |rup=Aromanian |ruq=Megleno-Romanian |rw=Rwanda-Rundi |rwo=Rawa |ryn=Northern Amami Ōshima |rys=Yaeyama |ryu=Okinawan |#default={{langname-lite/unknowncode|{{{1}}}}}}} |s={{#switch:{{{1|}}} |sa=Sanskrit |sah=Yakut |sai-ayo=Ayomán |sai-men=Menien |sai-nje-pro=Proto-Northern Jê |sai-tap=Tapayuna |sat=Santali |sav=Saafi-Saafi |sbf=Shabo |sbl=Botolan Sambal |sc=Sardinian |sce=Dongxiang |scn=Sicilian |sco=Scots |sd=Sindhi |sdc=Sassarese |sdg=Savi |sdn=Gallurese |se=Northern Sami |sea=Semai |sed=Sedang |sei=Seri |sel=Selkup |sem-pro=Proto-Semitic |ses=Koyraboro Senni |sg=Sango |sga=Old Irish |sgb=Mag-Anchi Ayta |sgd=Surigaonon |sgs=Samogitian |sh=Serbo-Croatian |shh=Shoshone |shk=Shilluk |shn=Shan |si=Sinhalese |sid=Sidamo |sio-pro=Proto-Siouan |sip=Sikkimese |sit={{langname-lite/familycode|Sino-Tibetan|{{{is family|}}}|{{{allow family|}}}}} |sit-jap=Japhug |sit-pro=Proto-Sino-Tibetan |sit-sit=Situ |sit-tan-pro=Proto-Tani |sjd=Kildin Sami |sje=Pite Sami |sjm=Mapun |sjt=Ter Sami |sju=Ume Sami |sk=Slovak |skb=Saek |sky=Sikaiana |sl=Slovene |sla={{langname-lite/familycode|Slavic|{{{is family|}}}|{{{allow family|}}}}} 
|sla-pro=Proto-Slavic |slm=Pangutaran Sama |slr=Salar |slu=Selaru |sm=Samoan |sma=Southern Sami |smi-pro=Proto-Samic |smj=Lule Sami |smk=Bolinao |smn=Inari Sami |smr=Simeulue |sms=Skolt Sami |sn=Shona |snf=Noon |snp=Siane |snr=Sihan |snu=Senggi |so=Somali |sog=Sogdian |sou=Southern Thai |sq=Albanian |sqj-pro=Proto-Albanian |squ=Squamish |sra=Saruga |srn=Sranan Tongo |srq=Sirionó |srr=Serer |srv=Waray Sorsogon |ss=Swazi |ssf=Thao |ssl=Western Sisaala |ssq=So'a |ssy=Saho |stf=Seta |stp=Southeastern Tepehuan |stq=Saterland Frisian |str=Saanich |stw=Satawalese |suq=Suri |sux=Sumerian |sv=Swedish |sw=Swahili |swb=Maore Comorian |swg=Swabian |swi=Sui |swm=Samosa |sxn=Sangir |sxw=Saxwe Gbe |syc=Classical Syriac |szl=Silesian |szy=Sakizaya |#default={{langname-lite/unknowncode|{{{1}}}}}}} |t={{#switch:{{{1|}}} |ta=Tamil |taa=Lower Tanana |tad=Tause |tai={{langname-lite/familycode|Tai|{{{is family|}}}|{{{allow family|}}}}} |tai-pro=Proto-Tai |tao=Yami |tay=Atayal |tbc=Takia |tbl=Tboli |tbp=Taworta |tbq={{langname-lite/familycode|Tibeto-Burman|{{{is family|}}}|{{{allow family|}}}}} |tbq-bdg-pro=Proto-Bodo-Garo |tbq-blg=Bailang |tbq-kuk-pro=Proto-Kuki-Chin |tbq-lob-pro=Proto-Lolo-Burmese |tbq-lol-pro=Proto-Loloish |tbw=Aborlan Tagbanwa |tby=Tabaru |tcb=Tanacross |tcs=Torres Strait Creole |tdd=Tai Nüa |tdy=Tadyawan |te=Telugu |tet=Tetum |tew=Tewa |tfn=Dena'ina |tft=Ternate |tg=Tajik |th=Thai |ti=Tigrinya |tim=Timbe |tio=Teop |tiy=Tiruray |tk=Turkmen |tkl=Tokelauan |tkw=Teanu |tl=Tagalog |tli=Tlingit |tmh=Tuareg |tmu=Iau |tnq=Taíno |to=Tongan |tpf=Tarpia |tpi=Tok Pisin |tpn=Tupinambá |tpw=Old Tupi |tqo=Toaripi |tqw=Tonkawa |tr=Turkish |trk={{langname-lite/familycode|Turkic|{{{is family|}}}|{{{allow family|}}}}} |trk-cmn-pro={{langname-lite/etymcode|Proto-Common Turkic|Proto-Turkic|{{{allow etym|}}}}} |trk-oat=Old Anatolian Turkish |trk-pro=Proto-Turkic |trv=Taroko |ts=Tsonga |tsg=Tausug |tt=Tatar |tts=Isan |ttt=Tat |tum=Tumbuka |tuw-pro=Proto-Tungusic |tuw-sol=Solon 
|tvl=Tuvaluan |tvn=Tavoyan |tvo=Tidore |txb=Tocharian B |txg=Tangut |ty=Tahitian |typ=Kuku-Thaypan |tys=Sapa |tyv=Tuvan |tyz=Tày |tzj=Tz'utujil |tzm=Central Atlas Tamazight |tzo=Tzotzil |#default={{langname-lite/unknowncode|{{{1}}}}}}} |u={{#switch:{{{1|}}} |uar=Tairuma |ubl=Buhi'non Bikol |uby=Ubykh |ude=Udihe |udi=Udi |ug=Uyghur |ugo=Gong |uk=Ukrainian |ulb=Olukumi |ulk=Meriam |umo=Umotína |umu=Munsee |und=Undetermined |unm=Unami |ur=Urdu |urj-fin-pro=Proto-Finnic |urj-pro=Proto-Uralic |urk=Urak Lawoi' |ush=Ushojo |utu=Utu |uur=Ura (Vanuatu) |uz=Uzbek |#default={{langname-lite/unknowncode|{{{1}}}}}}} |V={{#switch:{{{1|}}} |VL.={{langname-lite/etymcode|Vulgar Latin|Latin|{{{allow etym|}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}}}} |v={{#switch:{{{1|}}} |vai=Vai |vam=Vanimo |ve=Venda |vec=Venetan |vep=Veps |vi=Vietnamese |vil=Vilela |vma=Martuthunira |vo=Volapük |vot=Votic |vro=Võro |vsn={{langname-lite/etymcode|Vedic Sanskrit|Sanskrit|{{{allow etym|}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}}}} |w={{#switch:{{{1|}}} |wa=Walloon |wam=Massachusett |war=Waray-Waray |wba=Warao |wbl=Wakhi |wes=Cameroon Pidgin |wim=Wik-Mungkan |win=Winnebago |wiv=Muduapa |wlm=Middle Welsh |wmc=Wamas |wmw=Mwani |wno=Wano |wo=Wolof |woe=Woleaian |wrh=Wiradjuri |wrs=Waris |wsk=Waskia |wuh=Wutunhua |wul=Silimo |wuu=Wu |wya=Wyandot |wym=Vilamovian |#default={{langname-lite/unknowncode|{{{1}}}}}}} |x={{#switch:{{{1|}}} |xaa=Andalusian Arabic |xag=Aghwan |xbm=Middle Breton |xbr=Kambera |xcl=Old Armenian |xdc=Dacian |xeu=Keoru-Ahia |xgn={{langname-lite/familycode|Mongolic|{{{is family|}}}|{{{allow family|}}}}} |xgn-pro=Proto-Mongolic |xh=Xhosa |xib=Iberian |xil=Illyrian |xnb=Kanakanabu |xno={{langname-lite/etymcode|Anglo-Norman|Old French|{{{allow etym|}}}}} |xok=Xokleng |xpm=Pumpokol |xpo=Pochutec |xpq=Mohegan-Pequot |xpr=Parthian |xqa=Karakhanid |xrn=Arin |xsb=Sambali |xsl=South Slavey |xsm=Kasem |xsp=Silopi |xss=Assan |xsv=Sudovian |xto=Tocharian A |xug=Kunigami 
|xve=Venetic |#default={{langname-lite/unknowncode|{{{1}}}}}}} |y={{#switch:{{{1|}}} |yag=Yámana |yai=Yaghnobi |ybe=Western Yugur |ycl=Lolopo |ydk=Yoidik |yee=Yimas |yha=Baha |yi=Yiddish |yka=Yakan |yle=Yele |yll=Yil |yly=Nyelâyu |yo=Yoruba |yog=Yogad |yoi=Yonaguni |yol=Yola |yox=Yoron |yrk-tun=Tundra Nenets |yrl=Nheengatu |yua=Yucatec Maya |yue=Cantonese |yuf=Havasupai-Walapai-Yavapai |yuq=Yuqui |yuy=East Yugur |#default={{langname-lite/unknowncode|{{{1}}}}}}} |z={{#switch:{{{1|}}} |za=Zhuang |zag=Zaghawa |zai=Isthmus Zapotec |zav=Yatzachi Zapotec |zca=Coatecas Altas Zapotec |zea=Zealandic |zh=Chinese |zhn=Nong Zhuang |zia=Zia |zkg=Goguryeo |zko=Kott |zkt=Khitan |zle-mbe={{langname-lite/etymcode|Middle Belarusian|Old Ruthenian|{{{allow etym|}}}}} |zle-ono=Old Novgorodian |zle-ort=Old Ruthenian |zls={{langname-lite/familycode|South Slavic|{{{is family|}}}|{{{allow family|}}}}} |zlw-ocs=Old Czech |zlw-opl=Old Polish |zlw-slv=Slovincian |zmo=Molo |zne=Zande |zom=Zou |zpq=Zoogocho Zapotec |ztn=Santa Catarina Albarradas Zapotec |ztt=Tejalapan Zapotec |zu=Zulu |zza=Zazaki |#default={{langname-lite/unknowncode|{{{1}}}}}}} |#default={{langname-lite/unknowncode|{{{1}}}}} }}</includeonly><noinclude>{{documentation}}[[Category:Lua-free templates]]</noinclude> 938bfiv6mjyq9pw9vtx23b7rrdo8uee ᨷᩢ 0 331979 5720819 5713608 2026-04-21T07:57:45Z Ai Ku Karng 17824 /* ภาษาเขิน */ 5720819 wikitext text/x-wiki {{also/auto}} == ภาษาเขิน == === รากศัพท์ === {{inh+|th|tai-pro|*ɓawᴮ||ไม่}}; ร่วมเชื้อสายกับ{{cog|nod|ᨷᩮᩢ᩵ᩤ}}, {{cog|lo|ເບົ່າ}}, {{cog|khb|ᦢᧁᧈ}}, {{cog|blt|ꪹꪚ꪿ꪱ}}, {{cog|shn|မဝ်ႇ}} === การออกเสียง === * {{IPA|kkh|/baw˨˨/|a=เชียงตุง}} * {{คำอ่านไทย|เบ่า}} === คำกริยาวิเศษณ์ === {{kkh-adv|-}} # [[ไม่]], [[บ่]] === อ้างอิง === {{รายการอ้างอิง}} * ᨩᩣ᩠ᨿᨪᩮᨩᩮ᩠ᨾ. (n.d.). ''ᩋᨽᩥᨵᩤᨶᩈᩢ᩠ᨷᩅᩰᩉᩣ᩠ᩁᨸᩖᩯᨽᩣᩈᩣᨡᩨ᩠ᨶ''. 
== ภาษาไทลื้อ == === คำนาม === {{khb-noun}} # {{alternative form of|khb|ᦢᧁᧈ}} jub11464sn4bzq6gvzdjhhbndz1dm1a 5720820 5720819 2026-04-21T07:58:45Z Ai Ku Karng 17824 5720820 wikitext text/x-wiki {{also/auto}} == ภาษาเขิน == === รากศัพท์ === {{inh+|th|tai-pro|*ɓawᴮ||ไม่}}; ร่วมเชื้อสายกับ{{cog|nod|ᨷᩮᩢ᩵ᩤ}}, {{cog|lo|ເບົ່າ}}, {{cog|khb|ᦢᧁᧈ}}, {{cog|blt|ꪹꪚ꪿ꪱ}}, {{cog|shn|မဝ်ႇ}} === การออกเสียง === * {{IPA|kkh|/baw˨˨/|a=เชียงตุง}} * {{คำอ่านไทย|เบ่า}} === คำกริยาวิเศษณ์ === {{kkh-adv|-}} # [[ไม่]], [[บ่]] === อ้างอิง === {{รายการอ้างอิง}} * ᨩᩣ᩠ᨿᨪᩮᨩᩮ᩠ᨾ. (n.d.). ''ᩋᨽᩥᨵᩤᨶᩈᩢ᩠ᨷᩅᩰᩉᩣ᩠ᩁᨸᩖᩯᨽᩣᩈᩣᨡᩨ᩠ᨶ''. == ภาษาไทลื้อ == === คำกริยาวิเศษณ์ === {{khb-noun}} # {{alternative form of|khb|ᦢᧁᧈ}} f4rq0lwjobk49ip0jwtsc2ij3d5m1yq มอดูล:cop-sortkey 828 335759 5720774 1899755 2026-04-21T07:11:51Z OctraBot 3198 แทนที่เนื้อหาด้วย "--[[หมวดหมู่:หน้าที่ถูกแจ้งลบ|เปลี่ยนชื่อ]]" 5720774 Scribunto text/plain --[[หมวดหมู่:หน้าที่ถูกแจ้งลบ|เปลี่ยนชื่อ]] qizf2rdzmcbn6ioiwx591u5zqnh1lvx มอดูล:list of languages, csv format 828 336684 5720770 1902914 2026-04-21T07:01:23Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720770 Scribunto text/plain local languages = require("Module:languages/data/all") local families = require("Module:families/data") -- based on Module:list_of_languages local export = {} local filters = {} function export.show(frame) local args = frame.args local filter = filters[args[1]] local ids = args["ids"]; if not ids or ids == "" then ids = false else ids = true end local rows = {} -- Get a list of all language codes local codes = {} for code, _ in pairs(languages) do table.insert(codes, code) end -- Sort the list table.sort(codes) local sep = ";" local minor_sep = "," local function shallowcopy(array) local new_array = {} if type(array) == "string" then array = {array} end for i, v in ipairs(array) do new_array[i] = v end return new_array end -- Now go over each code, and create table rows for those that are selected local column_names = { "line", 
"code", "canonical name", "category", "type", "family code", "family", "sortkey?", "autodetect?", "exceptional?", "script codes", "other names", "standard characters" } for line, code in ipairs(codes) do local data = languages[code] local row = {} local sc = data[4] if type(sc) == "string" then sc = mw.text.split(sc, "%s*,%s*") end -- data[1]: canonical name; data[3]: family code table.insert(row, line) table.insert(row, code) table.insert(row, data[1]) table.insert(row, (data[1]:find("^ภาษา") and "" or "ภาษา") .. data[1]) table.insert(row, data.type or "") table.insert(row, data[3] or "") table.insert(row, data[3] and (families[data[3]] and families[data[3]][1] or error(data[3] .. " is not a valid family code (family of " .. code .. ")"))) table.insert(row, data.sort_key and "sortkey" or "") table.insert(row, data.entry_name and "autodetect" or "") table.insert(row, code:find("-") and "exceptional" or "") table.insert(row, sc and table.concat(sc, minor_sep) or "") table.insert(row, data.otherNames and table.concat(data.otherNames, minor_sep) or "") table.insert(row, data.standard_chars and "standard characters" or "") table.insert(rows, table.concat(row, sep)) end return "<pre>\n" .. table.concat(column_names, sep) .. "\n" .. table.concat(rows, "\n") .. 
"</pre>" end return export 08e6ih7ibhxiygvp4avpglqw2klgbme 5720772 5720770 2026-04-21T07:04:24Z OctraBot 3198 5720772 Scribunto text/plain local languages = require("Module:languages/data/all") local families = require("Module:families/data") -- based on Module:list_of_languages local export = {} local filters = {} function export.show(frame) local args = frame.args local filter = filters[args[1]] local ids = args["ids"]; if not ids or ids == "" then ids = false else ids = true end local rows = {} -- Get a list of all language codes local codes = {} for code, _ in pairs(languages) do table.insert(codes, code) end -- Sort the list table.sort(codes) local sep = ";" local minor_sep = "," local function shallowcopy(array) local new_array = {} if type(array) == "string" then array = {array} end for i, v in ipairs(array) do new_array[i] = v end return new_array end -- Now go over each code, and create table rows for those that are selected local column_names = { "line", "code", "canonical name", "category", "type", "family code", "family", "sortkey?", "autodetect?", "exceptional?", "script codes", "other names", "standard characters" } for line, code in ipairs(codes) do local data = languages[code] local row = {} local sc = data[4] if type(sc) == "string" then sc = mw.text.split(sc, "%s*,%s*") end -- data[1]: canonical name; data[3]: family code table.insert(row, line) table.insert(row, code) table.insert(row, data[1]) table.insert(row, (data[1]:find("^ภาษา") and "" or "ภาษา") .. data[1]) table.insert(row, data.type or "") table.insert(row, data[3] or "") table.insert(row, data[3] and (families[data[3]] and families[data[3]][1] or error(data[3] .. " is not a valid family code (family of " .. code .. 
")"))) table.insert(row, data.sort_key and "sortkey" or "") table.insert(row, data.entry_name and "autodetect" or "") table.insert(row, code:find("-") and "exceptional" or "") table.insert(row, sc and table.concat(sc, minor_sep) or "") table.insert(row, data.other_names and table.concat(data.other_names, minor_sep) or "") table.insert(row, data.standard_chars and "standard characters" or "") table.insert(rows, table.concat(row, sep)) end return "<pre>\n" .. table.concat(column_names, sep) .. "\n" .. table.concat(rows, "\n") .. "</pre>" end return export oyytuonuvu6j0q9q77zzvthz9a5lnvs หมวดหมู่:zh:นครในไทย 14 338947 5720705 1914805 2026-04-21T01:56:27Z OctraBot 3198 OctraBot ย้ายหน้า [[หมวดหมู่:zh:เมืองใหญ่ในไทย]] ไปยัง [[หมวดหมู่:zh:นครในไทย]] โดยไม่สร้างหน้าเปลี่ยนทางตามมา 1914805 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx มอดูล:lt-common 828 1714667 5720786 4733484 2026-04-21T07:20:08Z OctraBot 3198 5720786 Scribunto text/plain local export = {} local m_str_utils = require("Module:string utilities") local u = m_str_utils.char local ugsub = m_str_utils.gsub local ulower = m_str_utils.lower local uupper = m_str_utils.upper local ufind = m_str_utils.find local ulen = m_str_utils.len local ucodepoint = m_str_utils.codepoint -- Keep native Unicode normalization functions (no replacement available) local toNFC = mw.ustring.toNFC local toNFD = mw.ustring.toNFD -- ============================================================================= -- Unicode constants -- ============================================================================= local GRAVE = u(0x0300) -- combining grave accent local ACUTE = u(0x0301) -- combining acute accent local TILDE = u(0x0303) -- combining tilde local MACRON = u(0x0304) -- combining macron local DOTABOVE = u(0x0307) -- combining dot above local CARON = u(0x030C) -- combining caron local OGONEK = u(0x0328) -- combining ogonek local ANY_ACCENT = "[" .. GRAVE .. ACUTE .. TILDE .. 
"]" -- Legacy aliases for backward compatibility local grave = GRAVE local acute = ACUTE local tilde = TILDE local macron = MACRON local dotabove = DOTABOVE local caron = CARON local ogonek = OGONEK local accents = ANY_ACCENT -- ============================================================================= -- Internal helper functions -- ============================================================================= local dotless_to_dotted = { ["ı"] = "i", ["ȷ"] = "j", } local function char_to_dotted_form(base, below) return (dotless_to_dotted[base] or base) .. below end local function normalize_dotted_chars(text) -- Remove any dots above, and convert dotless forms to dotted. -- On entry, text must be in NFD form. return ugsub(text, "([iıjȷ])(" .. ogonek .. "?)" .. dotabove, char_to_dotted_form) end local function char_to_accent_form(base, below) -- Add a 'dot above' after the base. if base == "i" or base == "j" then return base .. below .. dotabove end -- Convert any dotless chars combining with accents to the dotted form, -- so that they normalize properly. This shouldn't happen, but just in case. return char_to_dotted_form(base, below) end local function stripped_text_form(text) -- Remove accents. text = ugsub(toNFD(text), accents .. "+", "") -- Normalize dotless characters and dot-above diacritics. return normalize_dotted_chars(text) end -- ============================================================================= -- Input validation -- ============================================================================= -- Reject Private Use Area characters (U+E000–U+F8FF). function export.reject_pua(s) if not s then return end for i = 1, ulen(s) do local cp = ucodepoint(s, i) if cp >= 0xE000 and cp <= 0xF8FF then error(string.format( "lt-common: private use area character U+%04X detected in \"%s\". " .. 
"Please use a standard Unicode character instead.", cp, s)) end end end -- ============================================================================= -- Input normalization -- ============================================================================= -- Detect nonstandard encoding patterns in the input. -- Returns: dotless_flag (found ı/ȷ), precomp_i_flag (found precomposed í/ì/ĩ) function export.detect_nonstandard(s) if not s then return false, false end local nfd_s = toNFD(s) local dotless_flag = ufind(nfd_s, "[ıȷ]") ~= nil local precomp_i_flag = ufind(nfd_s, "[íìĩ]") ~= nil return dotless_flag, precomp_i_flag end -- Normalize input to clean canonical NFC. -- Handles dotless i/j (ı, ȷ) and stray dot-above combinations. function export.canonicalize_input(s) if not s then return s end s = toNFD(s) -- Remove stray dot-above after i/j (with or without ogonek) s = ugsub(s, "([iıjȷ])(" .. OGONEK .. "?)" .. DOTABOVE, function(base, below) base = (base == "ı") and "i" or (base == "ȷ") and "j" or base return base .. below end) -- Convert any remaining dotless i/j to standard forms s = ugsub(s, "ı", "i") s = ugsub(s, "ȷ", "j") return toNFC(s) end -- ============================================================================= -- Partial NFD conversion (stem_ac representation) -- ============================================================================= -- Convert canonical NFC to partial NFD (stem_ac). -- Applies full NFD, then recomposes non-accent diacritics. -- Only grave/acute/tilde remain as combining characters. function export.to_stem_ac(s) if not s then return s end s = toNFD(s) -- Recompose ogonek vowels s = ugsub(s, "a" .. OGONEK, "ą") s = ugsub(s, "e" .. OGONEK, "ę") s = ugsub(s, "i" .. OGONEK, "į") s = ugsub(s, "u" .. OGONEK, "ų") -- Recompose macron vowel s = ugsub(s, "u" .. MACRON, "ū") -- Recompose dot-above e s = ugsub(s, "e" .. DOTABOVE, "ė") -- Recompose caron consonants s = ugsub(s, "c" .. CARON, "č") s = ugsub(s, "s" .. 
CARON, "š") s = ugsub(s, "z" .. CARON, "ž") return s end -- ============================================================================= -- Accent manipulation -- ============================================================================= -- Strip all accent marks (grave/acute/tilde) from partial NFD text. function export.to_stem_bare(stem_ac) if not stem_ac then return stem_ac end return ugsub(stem_ac, ANY_ACCENT, "") end -- Check if partial NFD text contains any accent marks. function export.has_accent(stem_ac) return ufind(stem_ac, ANY_ACCENT) ~= nil end -- ============================================================================= -- Complete input pipeline -- ============================================================================= -- Process raw user input through the complete normalization pipeline. -- Returns: stem_bare, stem_ac, dotless_flag, precomp_flag function export.process_input(raw) if not raw then return raw, raw, false, false end export.reject_pua(raw) local dotless_flag, precomp_flag = export.detect_nonstandard(raw) local canon = export.canonicalize_input(raw) local stem_ac = export.to_stem_ac(canon) local stem_bare = export.to_stem_bare(stem_ac) return stem_bare, stem_ac, dotless_flag, precomp_flag end -- ============================================================================= -- Display and text processing -- ============================================================================= function export.makeDisplayText(text, lang, sc) if not text then return text end -- Normalize dotless characters and dot-above diacritics (while retaining accents). text = normalize_dotted_chars(toNFD(text)) -- Add a 'dot above' between "i" or "j" and an accent. text = ugsub(text, "([iıjȷ])(" .. ogonek .. "?)%f" .. accents, char_to_accent_form) return toNFC(text) end -- Called from [[Module:languages]] since [[Module:lt-common]] is set as the stripDiacritics handler in -- [[Module:languages/data/2]]. 
function export.stripDiacritics(text, lang, sc) if not text then return text end return toNFC(stripped_text_form(text)) end local sortkey_substitutes = { [ogonek] = u(0xF000), [caron] = u(0xF001), [macron] = u(0xF002), [dotabove] = u(0xF003), ["y"] = "i" .. u(0xF004), } function export.makeSortKey(text, lang, sc) if not text then return text end -- Normalize to the stripped-text form and convert diacritics to Private Use -- Area characters so they sort after all other characters. text = stripped_text_form(ulower(text)) :gsub(".[\128-\191]*", sortkey_substitutes) return toNFC(uupper(text)) end return export goom2dkjrwlrqdqdxdvrmutp68b77go มอดูล:place/locations 828 2297279 5720697 5715284 2026-04-21T01:40:16Z OctraBot 3198 5720697 Scribunto text/plain local export = {} export.force_cat = false -- set to true to force category generation even on non-mainspace pages local m_table = require("Module:table") local string_utilities_module = "Module:string utilities" local en_utilities_module = "Module:en-utilities" local insert = table.insert local concat = table.concat local dump = mw.dumpObject local unpack = unpack or table.unpack -- Lua 5.2 compatibility --[==[ intro: This module contains data on all known locations, along with some lower-level code to process them (higher-level known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using mw.loadData(). ===Location data=== '''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]], especially the section `More about known locations`.''' The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are states of the United States and cities in Brazil). 
Each location group is associated with two tables, a ''data table'' that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given location is generally described by three values: (a) the group metadata table for the group the location is part of; (b) the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()` function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the arguments to many functions. In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases for a given location and the alias keys only need to be unique within a particular group data table, not across all groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations, canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias key for canonical `Birmingham, Alabama` in the group of US cities.)
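The distinction between canonical keys and alias keys can be illustrated with a minimal sketch. The two group tables below are heavily simplified stand-ins for the real ones, and the `alias_of` field and the `resolve_key` function are hypothetical names chosen for illustration only; they are not taken from this module.

```lua
-- Hypothetical, simplified group metadata tables. The real groups in this
-- module carry many more fields; `alias_of` is an assumed field name.
local england_cities = {
	default_placetype = "city",
	data = {
		["Newcastle upon Tyne"] = {},
		["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
		["Birmingham"] = {},
	},
}
local us_cities = {
	default_placetype = "city",
	data = {
		["Birmingham, Alabama"] = {},
		["Birmingham"] = {alias_of = "Birmingham, Alabama"},
	},
}

-- Resolve a key within a single group: follow an alias if there is one,
-- otherwise treat the key as canonical. Returns the canonical key and its
-- (uninitialized) spec, or nil if the group does not know the key.
local function resolve_key(group, key)
	local spec = group.data[key]
	if not spec then
		return nil
	end
	if spec.alias_of then
		return spec.alias_of, group.data[spec.alias_of]
	end
	return key, spec
end
```

Under these assumptions, `resolve_key(england_cities, "Birmingham")` yields the canonical key `Birmingham`, while `resolve_key(us_cities, "Birmingham")` follows the alias to `Birmingham, Alabama`. The same string plays a different role in each group, which is why alias keys only need to be unique within a group.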
The corresponding value objects are different for canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have per-group defaults, but only global defaults. The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys: * Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories) and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and placenames, which is critical to understand when working with location data.) This also applies to constituent countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena, Ascension and Tristan da Cunha). 
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above. Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`, `Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name in Spain, even though none of those cities are large enough to be included as known locations in this module. (The cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.) * Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent territories, use a qualified key that contains the name of the country or constituent country in it, e.g. `Normandy, ฝรั่งเศส` (a region), `Calvados, ฝรั่งเศส` (a department in the region of Normandy), `Herefordshire, England` (a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, ฟินแลนด์` (a region), `Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, ไอร์แลนด์` (a county) and `New York, USA` (a state). 
As shown in these various examples, (a) first and second-level divisions are sometimes both included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States or United States of America); (c) the word `the` is not normally included in the key even if the location is normally preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this), except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates an apparent redundancy, as with `Central Finland, ฟินแลนด์`; and (e) sometimes the placetype is included in the key, as with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County" preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article naming scheme for a given administrative division is a strong clue as to how the division is normally referred to, and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. 
North and South Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.) As mentioned above, associated with canonical keys in the group data table are location specs, which are objects containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''. Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a given location property. (The initialization process also does more transformations in a few cases, noted below.) Note that the default value of a given property is stored under a key in the group metadata table that is preceded by the string `default_`; for example, the default value corresponding to the `placetype` property of a given location is specified in the `default_placetype` key in the group metadata table. The following are the properties of the location spec. * `placetype`: String specifying the placetype of the location (e.g. "ประเทศ", "รัฐ", "province"). This can also be a table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the group level, or an error occurs. 
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the immediate ''container'' (or containers) of the given location. A container is another location which this location is considered to be directly part of, either politically or (above the country level) geographically. Some locations belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]]) of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed the ''container trail'', and some functions compute and return this trail as part of their operation. When a location spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a list of canonicalized container structures, each of which is of the form `{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the placetype from the container structure.) The list of canonicalized container structures is stored into the `.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The canonicalization process is described in more detail below under [[#Container spec canonicalization]]. * `divs`: List of recognized political divisions; e.g. 
for the Netherlands, a specification of the form `divs = {"จังหวัด", "เทศบาล"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]] and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the same format as `divs`. This is intended to be used in the situation where some division types are shared among all locations in the group and others differ from location to location. An example where this is used is the United States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties` and `county seats` are specified in the group-level `default_divs` because not all states have counties and county seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property associated with the division type), any division type specified on a sub-country-level location must also be specified on all containers up through the country. For example, since French departments specify `communes` and `municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for France itself. 
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is normal for locations) that computes the location description can also be given. This is used, for example, for Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the keydesc is replaced with the default value of the location description, which specifies the location's placename, placetype, and the corresponding values for each container in the container trail, generally up through (but not beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct the full description of various categories, such as bare location categories, whose description generally reads `"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the specified or auto-constructed location description. * `fulldesc`: String overriding the full description for the bare location category (but not for any other category). This is currently used only for the location `Earth`, at the very top of the tree (because the standard `people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent inhabitants). FIXME: This should be renamed `bare_category_fulldesc`. * `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories generated based on the immediate container(s). 
For example, `Hawaii, USA` specifies `Polynesia` as an additional parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category) as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME: This should be renamed `bare_category_addl_parents`. * `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase `province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category pages, are shown in the upper right of bare category pages. 
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`. It rarely needs to be specified because the category page and the article page almost always follow the same format. * `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and `wpcat` and defaults to `wpcat`, which is usually (but not always) correct. * `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in category names such as [[:Category:Cities in the Northern Territory, ออสเตรเลีย]] and in old-style place descriptions when the location occurs as the first holonym, such as the city [[Darwin]] described using {{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean properties is {nil}, which amounts to the same as {false}. * `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as [[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. 
The general principle used in setting this is that all countries in Europe, all dependent territories of any such country, all former British colonies, and any dependent territories of these former colonies, are assumed to use British spelling, while all other countries and associated dependent territories are assumed to use American spelling. This can potentially be modified on a case-by-case basis. * `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire, Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and (through a group-level setting) New York boroughs. In addition, it is set to `true` in initialize_spec() whenever the group-level `default_placetype == "นคร"`, so that all cities get it set without explicitly needing to add a group-level setting for this. Note that the condition `default_placetype == "นคร"` intentionally excludes Chinese prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods, but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to categories like [[:Category:Rivers in Osaka, ญี่ปุ่น]] and [[:Category:Cities in Wuhan]] for holonyms that are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like [[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities; (c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location. (Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those that can occur with non-cities have a `generic_before_non_cities` setting.) 
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`. * `overriding_bare_label_parents`: Document me! * `bare_category_parent_type`: Document me! * `no_container_cat`: Document me! * `no_container_parent`: Document me! * `no_generic_place_cat`: Document me! * `no_check_holonym_mismatch`: Document me! * `no_auto_augment_container`: Document me! * `no_include_container_in_desc`: Document me! ====Location divisions==== The `divs` field of a location describes the recognized political division types of that location. Specifying a given division type causes any place defined as being of that division type, with the location as a holonym, to be categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United States has `"รัฐ"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under [[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for "generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a `generic_before_cities` field if the location is a city); this includes things like cities, towns, villages, neighbo(u)rhoods and rivers. 
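The ` ``placetypes`` in/of ``location`` ` naming pattern can be sketched as follows. This is only an illustration of the pattern: the function name `division_category` and its `opts` argument are hypothetical, and the real module derives these pieces from the location spec and placetype data rather than taking them as plain arguments.

```lua
-- Hypothetical helper illustrating the category naming pattern.
-- `opts.prep` is assumed to default to "of"; `opts.the` mirrors the
-- location-spec property `the` described in the property list.
local function division_category(langcode, plural_placetype, location, opts)
	opts = opts or {}
	local prep = opts.prep or "of"
	local name = (opts.the and "the " or "") .. location
	-- Capitalize the first letter of the plural placetype (ASCII only).
	local label = plural_placetype:gsub("^%l", string.upper)
	return ("Category:%s:%s %s %s"):format(langcode, label, prep, name)
end
```

With these assumptions, `division_category("fr", "states", "United States", {the = true})` produces `Category:fr:States of the United States`, matching the example category named in the text.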
A given element in the `divs` list is usually a string naming a plural placetype; the placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field (if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which gets most things correctly but has problems with `passes` and `fortresses`, which are singularized to `passe` and `fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with the placetype. An example of this is the `divs` list for Canada: { ["แคนาดา"] = {divs = { {type = "รัฐ", cat_as = "รัฐและดินแดน"}, {type = "ดินแดน", cat_as = "รัฐและดินแดน"}, "เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities", "rural municipalities", "parishes", "Indian reserves", "census divisions", {type = "townships", prep = "ใน"}, }, ...}, } Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and territories. Similar things are done for other countries that have more than one type of first-level administrative division (e.g. Australia, จีน, อินเดีย and Pakistan). Note that any placetype listed under `cat_as` must exist in the table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for use in categories and won't be recognized as the placetype field in a {{tl|place}} description). 
In addition, townships are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be [[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat related to whether a given placetype is an official administrative or statistical division of the location in question and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities and towns.) Another more complex example is the divisions given for Quebec: { ["Quebec, Canada"] = {divs = { "เทศมณฑล", {type = "regional county municipalities", container_parent_type = "regional municipalities"}, {type = "ภูมิภาค", container_parent_type = false}, {type = "townships", prep = "ใน"}, {type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}}, {type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}}, {type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}}, }, ...}, } Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the `container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be [[:Category:Regional municipalities of Canada]]. 
Likewise, `regions` as administrative divisions (as opposed to mere geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and `village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize `parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties, just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "เทศมณฑล"}` means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly, `township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not'' [[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].

====Container spec canonicalization====

A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'', each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The `placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and removes the spec from `.container`.
It works as follows: # If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place. For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies `default_container = "บราซิล"`. # A single string or canonicalized container object is allowed and made into a one-element list. # If a list element is a string that did ''not'' come from `default_container`, and there is a group-level `canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get a canonicalized container object. # Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to `"ประเทศ"`. ====Alias keys==== Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec structure from canonical keys. This structure does not, in general, have defaults at the group level and is not initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location spec: * `alias_of`: The canonical key of which this key is an alias. Required. * `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the` but does not pay attention to the value of `the` for the corresponding canonical key. * `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise, the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display canonicalizing. * `placetype`: The placetype of the alias. 
Rarely needs to be specified as it defaults to the canonical key's placetype, and if that is unspecified, to the group-level default placetype. ====Location group metadata tables==== As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only keys, which are mostly functions. The following are the possible group-only keys: * `data`: This points to the group data table for the group, as described above. * `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias) into the full and elliptical placenames. The difference between full and elliptical placenames is described in the documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g. `Phuket Province, Thailand` or `County Mayo, ไอร์แลนด์`), in which case the full placename includes the placetype and the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or `Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is `Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as `State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs. 
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key, and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to chop off anything starting with a comma and return the result as both full and elliptical placename, and if specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be defined, it is best to use the helper function `make_key_to_placename`, if possible (or `make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default implementation and such) rather than directly calling the function in the `key_to_placename` field. * `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this (generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or `make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to `key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged as the key. Otherwise, the default algorithm works as follows: *# If the group-level `default_placetype == "นคร"`, use the placename unchanged as the key. *# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma + space and use the result as the key. 
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and `placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field to the placename after a comma + space and use the result as the key. *# Otherwise, use the placename unchanged as the key. * `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string, to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own. * `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the location. See [[#Location divisions]] for more details. ]==] ----------------------------------------------------------------------------------- -- Helper functions -- ----------------------------------------------------------------------------------- --[==[ Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like this). ]==] function export.process_error(fmt, ...) local args = {...} for i = 1, select("#", ...) do args[i] = dump(args[i]) end return error(string.format(fmt, unpack(args))) end --[==[ Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user error triggered by bad input or a system error due to something like running out of memory or hitting a time limit). 
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. ]==] function export.internal_error(fmt, ...) export.process_error("Internal error: " .. fmt, ...) end local internal_error = export.internal_error -- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If -- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item` -- equals `list_or_element`. local function list_or_element_contains(list_or_element, item) if type(list_or_element) == "table" then return m_table.contains(list_or_element, item) and true or false end return list_or_element == item end --[==[ Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full `"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical placenames are computed by chopping off anything starting with a comma. ]==] function export.key_to_placename(group, key) if group.key_to_placename == false then return key, key end if group.key_to_placename then local full_placename, elliptical_placename = group.key_to_placename(key) if type(full_placename) ~= "string" then internal_error("Key %s returned a non-string full placename: %s", key, full_placename) end if type(elliptical_placename) ~= "string" then internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename) end return full_placename, elliptical_placename end key = key:gsub(",.*", "") return key, key end --[==[ Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames) and return the result. 
If `placename_to_key` exists with the value `false`, return the placename unchanged. If the group does not define `placename_to_key`, and it defines a `default_container` whose placetype is either `country` or `constituent country`, the container name is appended to the placename after a comma and a space. Otherwise the placename is returned unchanged. ]==] function export.placename_to_key(group, placename) if group.placename_to_key == false then return placename elseif group.placename_to_key then local key = group.placename_to_key(placename) if type(key) ~= "string" then internal_error("Placename %s returned a non-string key: %s", placename, key) end return key elseif group.default_placetype == "นคร" then return placename else local defcon = group.default_container if not defcon then return placename elseif type(defcon) == "string" then return placename .. ", " .. defcon elseif type(defcon) == "table" and (defcon.placetype == "ประเทศ" or defcon.placetype == "constituent country") then return placename .. ", " .. defcon.key else return placename end end end --[==[ Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and `placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more than one. Containers should be carefully distinguished from category parents. Generally the container is the first category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents, which indicate some sort of relation between the category parent and the location but not necessarily one of containment.) This function is idempotent in that nothing happens if called more than once on the same spec. 
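
As a rough illustration of the container canonicalization performed by this function, the following standalone sketch applies the same steps to plain tables. It is only a sketch under stated assumptions: `canonicalize_containers`, the sample groups and the English placetype strings `"country"` and `"province"` are hypothetical names chosen for readability, not values used by this module.

```lua
-- Hypothetical standalone sketch; not part of this module.
local function canonicalize_containers(group, spec)
	local container = spec.container
	local from_default = false
	if not container then
		-- Fall back to the group-level default.
		container = group.default_container
		from_default = true
	end
	if not container then
		return nil
	end
	-- A single string or a single {key = ..., placetype = ...} object becomes a one-element list.
	if type(container) == "string" or container.key then
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				cont = group.canonicalize_key_container(cont)
			else
				-- Remaining bare strings are assumed to name countries.
				cont = {key = cont, placetype = "country"}
			end
		end
		table.insert(containers, cont)
	end
	return containers
end

-- A group that only supplies a default container.
local from_group = canonicalize_containers({default_container = "Brazil"}, {})

-- A group with a canonicalizing function and a spec that names its container as a string.
local canada_group = {
	canonicalize_key_container = function(name)
		return {key = name .. ", Canada", placetype = "province"}
	end,
}
local from_spec = canonicalize_containers(canada_group, {container = "Ontario"})

print(from_group[1].key, from_group[1].placetype)
print(from_spec[1].key, from_spec[1].placetype)
```

The sketch returns the list instead of storing it in `spec.containers`, which keeps it free of side effects; the real function also clears `spec.container` and fills in the remaining defaults.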
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables. ]==] function export.initialize_spec(group, key, spec) if spec.initialized then return end local container = spec.container local containers local container_from_default if not container then container = group.default_container container_from_default = true end if container then if type(container) == "string" or container.key then container = {container} end containers = {} for _, cont in ipairs(container) do if type(cont) == "string" then if group.canonicalize_key_container and not container_from_default then cont = group.canonicalize_key_container(cont) else cont = {key = cont, placetype = "ประเทศ"} end end insert(containers, cont) end end spec.containers = containers spec.container = nil local function value_with_default(val, default_val) if val == nil then return default_val else return val end end local function set_or_default(prop) spec[prop] = value_with_default(spec[prop], group["default_" .. prop]) end set_or_default("placetype") if not spec.placetype then internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec) end set_or_default("divs") spec.addl_divs = group.addl_divs for _, prop in ipairs { "keydesc", "fulldesc", "addl_parents", "overriding_bare_label_parents", "bare_category_parent_type", "wp", "wpcat", "commonscat", "british_spelling", "the", "no_container_cat", "no_container_parent", "no_generic_place_cat", "no_check_holonym_mismatch", "no_auto_augment_container", "no_include_container_in_desc", "is_city", "is_former_place", } do set_or_default(prop) end -- `default_placetype == "นคร"` is correct; if `default_placetype` has something else like `prefecture-level city` -- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as -- is_city. 
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "นคร") spec.initialized = true end --[=[ Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values: the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object, which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the property in question). `alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"} except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key, and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string. 
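
The effect of `alias_resolution` on the returned key can be sketched as follows. This is an illustrative standalone example, not code from this module: `returned_key_for_alias` and the sample keys are hypothetical, and the sketch covers only the returned key for a key that is known to be an alias (it ignores the placetype check and which spec is returned).

```lua
-- Hypothetical helper, for illustration only. `alias_key` is the key that was looked up
-- and `alias_spec` is its alias spec, which always has an `alias_of` field.
local function returned_key_for_alias(alias_key, alias_spec, alias_resolution)
	if alias_resolution == "none" then
		-- No alias following at all.
		return alias_key
	elseif alias_resolution == "display" then
		if alias_spec.display == true then
			-- Display alias: canonicalize to the canonical key.
			return alias_spec.alias_of
		elseif alias_spec.display then
			-- Display alias with an explicit target key.
			return alias_spec.display
		end
		-- Category-only alias: left alone in "display" mode.
		return alias_key
	end
	-- Unspecified (nil) or "all": every alias is followed.
	return alias_spec.alias_of
end

-- Made-up sample data.
local category_alias = {alias_of = "Canonical Place"}
local display_alias = {alias_of = "Canonical Place", display = true}
local redirected_alias = {alias_of = "Canonical Place", display = "Other Place"}

print(returned_key_for_alias("Old Name", category_alias, nil))          --> Canonical Place
print(returned_key_for_alias("Old Name", category_alias, "display"))    --> Old Name
print(returned_key_for_alias("Old Name", display_alias, "display"))     --> Canonical Place
print(returned_key_for_alias("Old Name", redirected_alias, "display"))  --> Other Place
print(returned_key_for_alias("Old Name", display_alias, "none"))        --> Old Name
```
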
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known). ]=] local function find_matching_key_in_group(group, placetypes, key, alias_resolution) if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and alias_resolution ~= "all" then internal_error("Bad value for 'alias_resolution': %s", alias_resolution) end local spec = group.data[key] if not spec then return nil end local function check_correct_placetype(placetype) if type(placetype) == "table" then for _, pt in ipairs(placetype) do if list_or_element_contains(placetypes, pt) then return true end end return false else return list_or_element_contains(placetypes, placetype) end end if spec.alias_of then local resolved_key = spec.alias_of local resolved_spec = group.data[resolved_key] if not resolved_spec then internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key) elseif resolved_spec.alias_of then internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed", key, resolved_key) end if alias_resolution == "none" or alias_resolution == "display" then -- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group. local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype if not placetype then internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " .. 
					"`default_placetype`", key, spec, resolved_spec)
			end
			if not check_correct_placetype(placetype) then
				return nil
			end
			if alias_resolution == "display" then
				if spec.display == true then
					key = resolved_key
				elseif spec.display then
					key = spec.display
				end
			end
			return key, spec
		end
		key = resolved_key
		spec = resolved_spec
	end
	-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
	local placetype = spec.placetype or group.default_placetype
	if not placetype then
		internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
	end
	if not check_correct_placetype(placetype) then
		return nil
	end
	export.initialize_spec(group, key, spec)
	return key, spec
end

--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys. `alias_resolution` is as in `find_matching_key_in_group()`.

This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
	local key = export.placename_to_key(group, placename)
	return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end

--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec. If no such key exists, return {nil}.
This throws an internal error if two locations with the same key are found. ]==] function export.find_canonical_key(key) local found_locations = {} for _, group in ipairs(export.locations) do local spec = group.data[key] if not spec then -- do nothing elseif spec.alias_of then mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of)) else insert(found_locations, {group, spec}) end end if not found_locations[1] then return nil elseif found_locations[2] then internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations) else local group, spec = unpack(found_locations[1]) export.initialize_spec(group, key, spec) return group, spec end end --[==[ Iterator that returns all locations matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator returns three values at each iteration: the location group, canonical key by which the location is known and the spec object describing the location. `data` contains the following possible fields: * `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string specifying a placetype, which must match one of the location's placetypes. This must be specified. * `placename`: The placename of the location. Either this or `key` must be specified. * `key`: The key of the location. Either this or `placename` must be specified. * `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`. The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if `alias_resolution` is given and the specified key or placename is an alias; see the documentation for `find_matching_key_in_group`). 
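
The iterator is meant to be used in a generic `for` loop. The following standalone sketch shows the same closure-based pattern on made-up data; `sample_groups` and `iterate_matching` are hypothetical stand-ins for `export.locations` and this function, and the matching is reduced to an exact key and placetype comparison.

```lua
-- Hypothetical sample data standing in for the real location groups.
local sample_groups = {
	{name = "countries", data = {["France"] = {placetype = "country"}}},
	{name = "cities", data = {["Paris"] = {placetype = "city"}}},
}

-- Return a stateful iterator yielding (group, key, spec) for each group containing a match.
local function iterate_matching(key, placetype)
	local i = 0
	return function()
		while i < #sample_groups do
			i = i + 1
			local group = sample_groups[i]
			local spec = group.data[key]
			if spec and spec.placetype == placetype then
				return group, key, spec
			end
		end
		-- Falling through returns nothing, which ends the generic `for` loop.
	end
end

local found = {}
for group, key, spec in iterate_matching("Paris", "city") do
	table.insert(found, group.name .. ":" .. key .. ":" .. spec.placetype)
end
print(table.concat(found, "; "))  --> cities:Paris:city
```
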
]==]
function export.iterate_matching_location(data)
	local i = 0
	local n = #export.locations
	return function()
		while true do
			i = i + 1
			if i > n then
				break
			end
			local group = export.locations[i]
			local key, spec
			if data.placename then
				key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename, data.alias_resolution)
			else
				if not data.key then
					internal_error("'.placename' or '.key' must be defined: %s", data)
				end
				key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
			end
			if key then
				return group, key, spec
			end
		end
	end
end

--[==[
Return the location matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if there is not exactly one location found; as such, it is for use with internally specified locations (such as the containers of known locations) rather than externally specified locations, which may not match a known location and in some cases may match multiple known locations. For finding an externally specified location, consider using `find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g. {{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark`, which is in New Jersey, not Delaware).
]==] function export.get_matching_location(data) local all_found = {} for group, key, spec in export.iterate_matching_location(data) do insert(all_found, {group, key, spec}) end if not all_found[1] then internal_error("Couldn't find matching location for data %s", data) elseif all_found[2] then internal_error("Found multiple matching locations for data %s: %s", data, all_found) else return unpack(all_found[1]) end end --[==[ Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An internal error happens if a container loop is detected. The return value is a list of location objects, each of which contains `group`, `key` and `spec` fields. 
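
The level-by-level traversal with de-duplication can be sketched on a small made-up containment table. This is an illustrative standalone example: `containers_of` and `iterate_levels` are hypothetical, and plain strings stand in for the `group`/`key`/`spec` location objects returned by the real function.

```lua
-- Hypothetical containment data, following the Russia example above.
local containers_of = {
	Russia = {"Europe", "Asia"},
	Europe = {"Eurasia"},
	Asia = {"Eurasia"},
	Eurasia = {"Earth"},
}

-- Return an iterator that yields one list of not-yet-seen containers per level.
local function iterate_levels(start_key)
	local seen = {[start_key] = true}
	local last_level = {start_key}
	return function()
		local next_level = {}
		for _, key in ipairs(last_level) do
			for _, parent in ipairs(containers_of[key] or {}) do
				if not seen[parent] then
					seen[parent] = true
					table.insert(next_level, parent)
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		last_level = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_levels("Russia") do
	table.insert(levels, table.concat(level, ","))
end
print(table.concat(levels, " | "))  --> Europe,Asia | Eurasia | Earth
```

Note that `Eurasia` appears only once even though both `Europe` and `Asia` list it as their container.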
]==] function export.iterate_containers(group, key, spec) local keys_seen = {} keys_seen[key] = true local iterations = 0 local last_iteration_containers = {{group = group, key = key, spec = spec}} return function() iterations = iterations + 1 if iterations > 10 then internal_error("Probable loop in containers when processing key %s", key) end local next_iteration_containers = {} for _, location in ipairs(last_iteration_containers) do local containers = location.spec.containers if containers then for _, container in ipairs(containers) do local container_group, container_key, container_spec = export.get_matching_location { placetypes = container.placetype, key = container.key, } if not keys_seen[container_key] then insert(next_iteration_containers, { group = container_group, key = container_key, spec = container_spec }) keys_seen[container_key] = true end end end end if not next_iteration_containers[1] then return nil end last_iteration_containers = next_iteration_containers return next_iteration_containers end end --[==[ Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add `"the "` to the beginning if called for in `spec`. ]==] function export.construct_linked_placename(spec, placename, display_form) local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename, display_form) or ("[[%s]]"):format(placename) if spec.the then linked_placename = "the " .. linked_placename end return linked_placename end --[=[ This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical placenames.) 
`container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full placename.

If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain countries (such as South Korean and North Korean counties, which include the word "เทศมณฑล" in the key). The resulting chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped and the full and elliptical placenames are the same.

Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
	if type(container_patterns) == "string" then
		container_patterns = {container_patterns}
	end
	if type(divtype_patterns) == "string" then
		divtype_patterns = {divtype_patterns}
	end
	return function(key)
		local full_placename = key
		if container_patterns then
			for _, container_pattern in ipairs(container_patterns) do
				local nsubs
				full_placename, nsubs = full_placename:gsub(container_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		local elliptical_placename = full_placename
		if divtype_patterns then
			for _, divtype_pattern in ipairs(divtype_patterns) do
				local nsubs
				elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		return full_placename, elliptical_placename
	end
end

--[=[
This is typically used to define `placename_to_key`.
It generates a function that appends a string to the end of a given placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have special meaning in Lua patterns) to be appended first to the placename; if already present at the end, it is not appended. `container_suffix` is then added in the same fashion if given.

Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`) or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).

Note: in this Thai version the affixes are prepended to the placename rather than appended (see the lines marked `--th` in the code), so the examples above describe the original suffixing behavior.
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
	return function(placename)
		local key = placename
		if divtype_suffix then
			if not key:find("^" .. divtype_suffix) then --th: changed to prepend instead
				key = divtype_suffix .. key --th
			end
		end
		if container_suffix then
			key = container_suffix .. key --th
		end
		return key
	end
end

--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location data into the canonical form containing both the full container key and its placetype. It generates a function to do the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil} or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left as-is.

Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "จังหวัด"}`.
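
The behavior described above can be sketched as a small standalone snippet. This is only an illustration, not part of the module's interface: the factory is re-declared so the snippet runs on its own (assuming it matches the definition that follows this comment), and the names `canonicalize`, `result` and `prebuilt` are hypothetical.

```lua
-- Re-declared here only so the sketch is self-contained.
local function make_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container .. (suffix or ""), placetype = placetype}
		end
		return container
	end
end

local canonicalize = make_canonicalize_key_container(", Canada", "จังหวัด")

-- A bare string gets the suffix appended and the placetype attached.
local result = canonicalize("Ontario")
assert(result.key == "Ontario, Canada")
assert(result.placetype == "จังหวัด")

-- A container that is already a table is passed through unchanged.
local prebuilt = {key = "Ontario, Canada", placetype = "จังหวัด"}
assert(canonicalize(prebuilt) == prebuilt)
```

Passing `nil` as the suffix leaves the string key unchanged and only attaches the placetype.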
]=] local function make_canonicalize_key_container(suffix, placetype) return function(container) if type(container) == "string" then return {key = container .. (suffix or ""), placetype = placetype} else return container end end end ----------------------------------------------------------------------------------- -- Top-level tables -- ----------------------------------------------------------------------------------- export.continents = { ["โลก"] = {the = true, placetype = "ดาวเคราะห์", addl_parents = {"ธรรมชาติ"}, fulldesc = "=the planet [[Earth]] and the features found on it"}, ["แอฟริกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}}, ["อเมริกา"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"}, keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined", wp = "Americas"}, ["อเมริกาส์"] = {alias_of = "อเมริกา", the = true}, ["อเมริกาเหนือ"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}}, ["แคริบเบียน"] = {the = true, placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}}, ["อเมริกากลาง"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}}, ["อเมริกาใต้"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}}, ["แอนตาร์กติกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}, fulldesc = "=the territory of [[Antarctica]]"}, ["ยูเรเชีย"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"}, keydesc = "[[Eurasia]], i.e. 
[[Europe]] and [[Asia]] together"},
	["เอเชีย"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
	["ยุโรป"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
	["โอเชียเนีย"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
	["เมลานีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
	["ไมโครนีเชีย (ภูมิภาค)"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}}, -- name clash: region vs. federation
	["พอลินีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
}

export.continents_group = {
	default_overriding_bare_label_parents = {}, -- container parents should be used
	default_divs = {{type = "ประเทศ", prep = "ใน"}},
	-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
	-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
	default_no_include_container_in_desc = true,
	default_no_container_cat = true,
	default_no_container_parent = true,
	default_no_auto_augment_container = true,
	default_no_generic_place_cat = true,
	-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
	-- this level. We also run into problems with supercontinents, which have "ทวีป" as the fallback and cause
	-- mismatches.
	default_no_check_holonym_mismatch = true,
	data = export.continents,
}

-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = { ["อัฟกานิสถาน"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["แอลเบเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "communes", {type = "administrative units", cat_as = "communes"}, }, british_spelling = true}, ["แอลจีเรีย"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes", "อำเภอ", "เทศบาล"}}, ["อันดอร์รา"] = {container = "ยุโรป", divs = {"parishes"}, british_spelling = true}, ["แองโกลา"] = {container = "แอฟริกา", divs = {"จังหวัด", "เทศบาล"}}, ["แอนทีกาและบาร์บิวดา"] = {container = "แคริบเบียน", divs = {"จังหวัด"}, british_spelling = true}, ["อาร์เจนตินา"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}}, ["อาร์มีเนีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["สาธารณรัฐอาร์มีเนีย"] = {alias_of = "อาร์มีเนีย", the = true}, -- differs in "the" -- Both a country and continent ["ออสเตรเลีย"] = {container = "โอเชียเนีย", divs = { {type = "รัฐ", cat_as = "states and territories"}, {type = "ดินแดน", cat_as = "states and territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"}, "local government areas", "dependent territories", }, british_spelling = true}, ["ออสเตรีย"] = {container = "ยุโรป", divs = {"รัฐ", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["อาเซอร์ไบจาน"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ", "เทศบาล"}, british_spelling = true}, ["บาฮามาส"] = {the = true, container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true, wp = "The %l"}, ["บาห์เรน"] = {container = "เอเชีย", divs = {"governorates"}}, ["บังกลาเทศ"] = {container = "เอเชีย", divs = {"divisions", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["บาร์เบโดส"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["เบลารุส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = 
true}, ["เบลเยียม"] = {container = "ยุโรป", divs = {"ภูมิภาค", "จังหวัด", "เทศบาล"}, british_spelling = true}, ["เบลีซ"] = {container = "อเมริกากลาง", divs = {"อำเภอ"}, british_spelling = true}, ["เบนิน"] = {container = "แอฟริกา", divs = {"departments", "communes"}}, ["ภูฏาน"] = {container = "เอเชีย", divs = {"อำเภอ", "gewogs"}}, ["โบลิเวีย"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}}, ["บอสเนียและเฮอร์เซโกวีนา"] = {container = "ยุโรป", divs = {"entities", "cantons", "เทศบาล"}, british_spelling = true}, --["Bosnia and Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอสเนีย-เฮอร์เซโกวีนา"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, --["Bosnia-Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอสเนีย"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอตสวานา"] = {container = "แอฟริกา", divs = {"อำเภอ", "ตำบล"}, british_spelling = true}, ["บราซิล"] = {container = "อเมริกาใต้", divs = { "รัฐ", "เทศบาล", "macroregions", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["บรูไน"] = {container = "เอเชีย", divs = {"อำเภอ", "mukims"}, british_spelling = true}, ["บัลแกเรีย"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศบาล"}, british_spelling = true}, ["บูร์กินาฟาโซ"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments", "จังหวัด"}}, ["บุรุนดี"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes"}}, ["กัมพูชา"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["แคเมอรูน"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["แคนาดา"] = {container = "อเมริกาเหนือ", divs = { {type = "รัฐ", cat_as = "รัฐและดินแดน"}, --ตาม thwiki {type = "ดินแดน", cat_as = "รัฐและดินแดน"}, {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of รัฐและดินแดน"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of รัฐและดินแดน"}, "เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities", "rural municipalities", 
"parishes", -- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless -- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is -- still at [[w:Indian reserves]]). "Indian reserves", "census divisions", {type = "townships", prep = "ใน"}, }, british_spelling = true}, ["กาบูเวร์ดี"] = {container = "แอฟริกา", divs = {"เทศบาล", "parishes"}}, ["เคปเวิร์ด"] = {alias_of = "กาบูเวร์ดี", display = true}, ["สาธารณรัฐแอฟริกากลาง"] = {the = true, container = "แอฟริกา", divs = {"prefectures", "subprefectures"}}, ["CAR"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true}, ["C.A.R"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true}, ["ชาด"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["ชิลี"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "communes"}}, ["จีน"] = {container = "เอเชีย", divs = { {type = "มณฑล", cat_as = "provinces and autonomous regions"}, --ตาม thwiki {type = "autonomous regions", cat_as = "provinces and autonomous regions"}, {type = "FORMER provinces", cat_as = "former provinces"}, "special administrative regions", "จังหวัด", --ตาม thwiki {type = "FORMER prefectures", cat_as = "former prefectures"}, "prefecture-level cities", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, {type = "FORMER counties", cat_as = "former counties and county-level cities"}, {type = "FORMER county-level cities", cat_as = "former counties and county-level cities"}, -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities. 
"อำเภอ", {type = "FORMER districts", cat_as = "former districts"}, "ตำบล", "townships", "เทศบาล", {type = "direct-administered municipalities", cat_as = "เทศบาล"}, }}, ["สาธารณรัฐประชาชนจีน"] = {alias_of = "จีน", the = true}, -- differs in "the" ["โคลอมเบีย"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}}, ["คอโมโรส"] = {the = true, container = "แอฟริกา", divs = {"autonomous islands"}}, ["คอสตาริกา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "cantons"}}, ["โครเอเชีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["คิวบา"] = {container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}}, ["ไซปรัส"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, british_spelling = true}, ["สาธารณรัฐเช็ก"] = {the = true, container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["เช็กเกีย"] = {alias_of = "สาธารณรัฐเช็ก"}, -- differs in "the" ["สาธารณรัฐประชาธิปไตยคองโก"] = {the = true, container = "แอฟริกา", divs = {"จังหวัด", "ดินแดน"}}, ["คองโก"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["DRC"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["D.R.C"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["เดนมาร์ก"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "dependent territories"}, british_spelling = true, -- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country) }, ["จิบูตี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["ดอมินีกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["สาธารณรัฐโดมินิกัน"] = {the = true, container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}, keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"}, ["ติมอร์-เลสเต"] = {container = "เอเชีย", divs = {"เทศบาล"}, wp = "ติมอร์-เลสเต"}, ["ติมอร์ตะวันออก"] = {alias_of = "ติมอร์-เลสเต", display = true}, 
["เอกวาดอร์"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "cantons"}}, ["อียิปต์"] = {container = "แอฟริกา", divs = {"governorates", "ภูมิภาค"}, british_spelling = true}, ["เอลซัลวาดอร์"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["อิเควทอเรียลกินี"] = {container = "แอฟริกา", divs = {"จังหวัด"}}, ["เอริเทรีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "subregions"}}, ["เอสโตเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["เอสวาตินี"] = {container = "แอฟริกา", british_spelling = true}, ["สวาซีแลนด์"] = {alias_of = "เอสวาตินี", display = true}, ["เอธิโอเปีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "zones"}}, ["สหพันธรัฐไมโครนีเชีย"] = {the = true, container = "ไมโครนีเชีย", divs = {"รัฐ"}}, ["ไมโครนีเชีย"] = {alias_of = "สหพันธรัฐไมโครนีเชีย"}, --ชื่อซ้ำกัน: ภูมิภาค/สหพันธรัฐ ["ฟีจี"] = {container = "เมลานีเชีย", divs = {"divisions", "จังหวัด"}, british_spelling = true}, ["ฟินแลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["ฝรั่งเศส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "cantons", "collectivities", "communes", {type = "เทศบาล", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, "dependent territories", "ดินแดน", "จังหวัด", }, british_spelling = true}, ["กาบอง"] = {container = "แอฟริกา", divs = {"จังหวัด", "departments"}}, ["แกมเบีย"] = {the = true, container = "แอฟริกา", divs = {"divisions", "อำเภอ"}, british_spelling = true, wp = "The %l"}, ["จอร์เจีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"ภูมิภาค", "อำเภอ"}, keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"}, ["เยอรมนี"] = {container = "ยุโรป", divs = { "รัฐ", -- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but -- there aren't really 
enough of them to categorize per state. "ภูมิภาค", "เทศบาล", "อำเภอ"}, british_spelling = true}, ["กานา"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["กรีซ"] = {container = "ยุโรป", divs = {"ภูมิภาค", "regional units", "เทศบาล", {type = "peripheries", cat_as = {"ภูมิภาค"}}, }, british_spelling = true}, ["กรีเนดา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["กัวเตมาลา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "เทศบาล"}}, ["กินี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures"}}, ["กินี-บิสเซา"] = {container = "แอฟริกา", divs = {"ภูมิภาค"}}, ["กายอานา"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค"}, british_spelling = true}, ["เฮติ"] = {container = "แคริบเบียน", divs = {"departments", "arrondissements"}}, ["ฮอนดูรัส"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["ฮังการี"] = {container = "ยุโรป", divs = {"เทศมณฑล", "อำเภอ"}, british_spelling = true}, ["ไอซ์แลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "เทศมณฑล"}, british_spelling = true}, ["อินเดีย"] = {container = "เอเชีย", divs = { {type = "รัฐ", cat_as = "states and union territories"}, {type = "union territories", cat_as = "states and union territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"}, {type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"}, "divisions", "อำเภอ", "เทศบาล", }, british_spelling = true}, ["อินโดนีเซีย"] = {container = "เอเชีย", divs = {"regencies", "จังหวัด", {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"}, }}, ["อิหร่าน"] = {container = "เอเชีย", divs = {"จังหวัด", "เทศมณฑล"}}, ["อิรัก"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["ไอร์แลนด์"] = {container = "ยุโรป", addl_parents = {"British Isles"}, divs = {"เทศมณฑล", "อำเภอ", "จังหวัด"}, british_spelling = true, wp = "Republic of %l"}, ["สาธารณรัฐไอร์แลนด์"] = 
{alias_of = "ไอร์แลนด์", the = true}, -- differs in "the" ["อิสราเอล"] = {container = "เอเชีย", divs = {"อำเภอ"}}, ["อิตาลี"] = {container = "ยุโรป", divs = { "ภูมิภาค", "จังหวัด", "metropolitan cities", "เทศบาล", {type = "autonomous regions", cat_as = "ภูมิภาค"}, }, british_spelling = true}, ["โกตดิวัวร์"] = {container = "แอฟริกา", divs = {"อำเภอ", "ภูมิภาค"}}, -- We should really be using Ivory Coast (common name) but there are political ramifications to the use of -- Côte d'Ivoire so don't make it a display alias. ["ไอวอรีโคสต์"] = {alias_of = "โกตดิวัวร์"}, ["จาเมกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["ญี่ปุ่น"] = {container = "เอเชีย", divs = {"จังหวัด", "กิ่งจังหวัด", "เทศบาล"}}, ["จอร์แดน"] = {container = "เอเชีย", divs = {"governorates"}}, ["คาซัคสถาน"] = {container = {"เอเชีย", "ยุโรป"}, divs = {"ภูมิภาค", "อำเภอ"}}, ["เคนยา"] = {container = "แอฟริกา", divs = {"เทศมณฑล"}, british_spelling = true}, ["Kiribati"] = {container = "ไมโครนีเชีย", british_spelling = true}, ["Kosovo"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล"}, british_spelling = true}, ["Kuwait"] = {container = "เอเชีย", divs = {"governorates", "areas"}}, ["Kyrgyzstan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Laos"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["Latvia"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["Lebanon"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["Lesotho"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Liberia"] = {container = "แอฟริกา", divs = {"เทศมณฑล", "อำเภอ"}}, ["Libya"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศบาล"}}, ["Liechtenstein"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["Lithuania"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["Luxembourg"] = {container = "ยุโรป", divs = {"cantons", "อำเภอ"}, british_spelling = true}, ["Madagascar"] = {container = 
"แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["Malawi"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["Malaysia"] = {container = "เอเชีย", divs = {"รัฐ", "federal territories", "อำเภอ"}, british_spelling = true}, ["Maldives"] = {the = true, container = "เอเชีย", divs = {"จังหวัด", "administrative atolls"}, british_spelling = true}, ["Mali"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "cercles"}}, ["Malta"] = {container = "ยุโรป", divs = {"ภูมิภาค", "local councils"}, british_spelling = true}, ["Marshall Islands"] = {the = true, container = "ไมโครนีเชีย", divs = {"เทศบาล"}}, ["Mauritania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Mauritius"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Mexico"] = {container = "อเมริกาเหนือ", addl_parents = {"อเมริกากลาง"}, divs = { "รัฐ", "เทศบาล", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Moldova"] = {container = "ยุโรป", divs = { {type = "อำเภอ", cat_as = "districts and autonomous territorial units"}, {type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"}, "communes", "เทศบาล", }, british_spelling = true}, ["Monaco"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป", -- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we -- want its parent to be "countries in Europe". 
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, is_city = true, british_spelling = true}, ["Mongolia"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["Montenegro"] = {container = "ยุโรป", divs = {"เทศบาล"}}, ["Morocco"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures", "จังหวัด"}}, ["Mozambique"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}}, ["Myanmar"] = {container = "เอเชีย", divs = {"ภูมิภาค", "รัฐ", "union territories", {type = "self-administered zones", cat_as = "self-administered areas"}, {type = "self-administered divisions", cat_as = "self-administered areas"}, "อำเภอ"}}, ["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations ["Namibia"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "constituencies"}, british_spelling = true}, ["Nauru"] = {container = "ไมโครนีเชีย", divs = {"อำเภอ"}, british_spelling = true}, ["Nepal"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["เนเธอร์แลนด์"] = {the = true, placetype = {"ประเทศ", "constituent country"}, container = "ยุโรป", divs = {"จังหวัด", "เทศบาล", {type = "FORMER municipalities", cat_as = "former municipalities"}, "dependent territories", "constituent countries"}, british_spelling = true, -- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]] -- (country) }, ["New Zealand"] = {container = "พอลินีเชีย", divs = { "ภูมิภาค", "dependent territories", "territorial authorities", {type = "อำเภอ", cat_as = "territorial authorities"}, }, british_spelling = true}, ["Nicaragua"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["Niger"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Nigeria"] = {container = "แอฟริกา", divs = { "รัฐ", -- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize -- everything under 'states and territories' but that seems a bit pointless. 
{type = "federal territories", cat_as = "รัฐ"}, "local government areas", }, british_spelling = true}, ["North Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล"}}, ["North Macedonia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["Macedonia"] = {alias_of = "North Macedonia", display = true}, ["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Norway"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "dependent territories", "อำเภอ", "unincorporated areas"}, british_spelling = true}, ["Oman"] = {container = "เอเชีย", divs = {"governorates", "จังหวัด"}}, ["Pakistan"] = {container = "เอเชีย", divs = { {type = "จังหวัด", cat_as = "provinces and territories"}, {type = "administrative territories", cat_as = "provinces and territories"}, {type = "federal territories", cat_as = "provinces and territories"}, {type = "ดินแดน", cat_as = "provinces and territories"}, "divisions", "อำเภอ", }, british_spelling = true}, ["Palau"] = {container = "ไมโครนีเชีย", divs = {"รัฐ"}}, ["Palestine"] = {container = "เอเชีย", divs = {"governorates"}}, ["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the" ["Panama"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "อำเภอ"}}, ["Papua New Guinea"] = {container = "เมลานีเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Paraguay"] = {container = "อเมริกาใต้", divs = {"departments", "อำเภอ"}}, ["Peru"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ"}}, ["Philippines"] = {the = true, container = "เอเชีย", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ", "เทศบาล", "barangays"}}, ["Poland"] = {divs = {"voivodeships", "เทศมณฑล", {type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}}, }, container = "ยุโรป", british_spelling = true}, ["Portugal"] = {container = "ยุโรป", 
divs = { {type = "autonomous regions", cat_as = "districts and autonomous regions"}, {type = "อำเภอ", cat_as = "districts and autonomous regions"}, "จังหวัด", "เทศบาล"}, british_spelling = true}, ["Qatar"] = {container = "เอเชีย", divs = {"เทศบาล", "zones"}}, ["Republic of the Congo"] = {the = true, container = "แอฟริกา", divs = {"departments", "อำเภอ"}}, ["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true}, ["Romania"] = {container = "ยุโรป", divs = { "ภูมิภาค", "เทศมณฑล", "communes", {type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"}, }, british_spelling = true}, ["Russia"] = {container = {"ยุโรป", "เอเชีย"}, divs = { "federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities", "อำเภอ", "federal districts"}, british_spelling = true}, ["Rwanda"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}}, ["Saint Kitts and Nevis"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["Saint Kitts"] = {alias_of = "Saint Kitts and Nevis", display = true}, ["Saint Lucia"] = {container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true}, ["Saint Vincent and the Grenadines"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["Saint Vincent"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["SVG"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["S.V.G"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["Samoa"] = {container = "พอลินีเชีย", divs = {"อำเภอ"}, british_spelling = true}, ["San Marino"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["São Tomé and Príncipe"] = {container = "แอฟริกา", divs = {"อำเภอ"}}, ["São Tome and Principe"] = {alias_of = "São Tomé and Príncipe", display = true}, ["São Tomé"] = {alias_of = "São Tomé and Príncipe", display = true}, ["São Tome"] = {alias_of = "São Tomé and Príncipe", display = true}, 
["Saudi Arabia"] = {container = "เอเชีย", divs = {"จังหวัด", "governorates"}}, ["Senegal"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Serbia"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล", "autonomous provinces"}}, ["Seychelles"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Sierra Leone"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Singapore"] = {container = "เอเชีย", divs = {"อำเภอ", "ภูมิภาค"}, british_spelling = true}, ["Slovakia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["Slovenia"] = {container = "ยุโรป", divs = {"statistical regions", "เทศบาล"}, british_spelling = true}, -- Note: While the official name does not include "the" at the beginning, -- it sounds strange in English to leave it out and it's commonly included. ["Solomon Islands"] = {the = true, container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true}, ["โซมาเลีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["South Africa"] = {container = "แอฟริกา", divs = { "จังหวัด", "อำเภอ", {type = "district municipalities", cat_as = "อำเภอ"}, {type = "metropolitan municipalities", cat_as = "อำเภอ"}, "เทศบาล", }, british_spelling = true}, ["South Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล", "อำเภอ"}}, ["South Sudan"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "รัฐ", "เทศมณฑล"}, british_spelling = true}, ["Spain"] = {container = "ยุโรป", divs = {"autonomous communities", "จังหวัด", "เทศบาล", "comarcas", "autonomous cities"}, british_spelling = true}, ["Sri Lanka"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Sudan"] = {container = "แอฟริกา", divs = {"รัฐ", "อำเภอ"}, british_spelling = true}, ["Suriname"] = {container = "อเมริกาใต้", divs = {"อำเภอ"}}, ["Sweden"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["Switzerland"] = {container 
= "ยุโรป", divs = {"cantons", "เทศบาล", "อำเภอ"}, british_spelling = true}, ["Syria"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["ไต้หวัน"] = {container = "เอเชีย", divs = {"เทศมณฑล", "อำเภอ", "townships", "special municipalities"}}, ["สาธารณรัฐจีน"] = {alias_of = "ไต้หวัน", the = true}, -- differs in "the", different political connotations ["Tajikistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Tanzania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["ไทย"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "ตำบล"}}, ["Togo"] = {container = "แอฟริกา", divs = {"จังหวัด", "prefectures"}}, ["Tonga"] = {container = "พอลินีเชีย", divs = {"divisions"}, british_spelling = true}, ["Trinidad and Tobago"] = {container = "แคริบเบียน", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["Tunisia"] = {container = "แอฟริกา", divs = {"governorates", "delegations"}}, ["Turkey"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ"}}, -- Foreign names generally get display-canonicalized. ["Türkiye"] = {alias_of = "Turkey", display = true}, ["Turkmenistan"] = {container = "เอเชีย", divs = { -- The 5 regions are often also called provinces "ภูมิภาค", {type = "จังหวัด", cat_as = "ภูมิภาค"}, "อำเภอ"}, }, ["Tuvalu"] = {container = "พอลินีเชีย", divs = {"atolls"}, british_spelling = true}, ["Uganda"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศมณฑล"}, british_spelling = true}, ["Ukraine"] = {container = "ยุโรป", divs = { {type = "oblasts", cat_as = "oblasts and autonomous republics"}, {type = "autonomous republics", cat_as = "oblasts and autonomous republics"}, "raions", "hromadas", }, british_spelling = true}, ["United Arab Emirates"] = {the = true, container = "เอเชีย", divs = {"emirates"}}, -- Abbreviations get display-canonicalized. 
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true}, ["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true}, ["สหราชอาณาจักร"] = {the = true, container = "ยุโรป", addl_parents = {"British Isles"}, divs = {"constituent countries", "เทศมณฑล", "อำเภอ", "boroughs", "ดินแดน", "dependent territories", "traditional counties"}, keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true}, -- Abbreviations get display-canonicalized. ["UK"] = {alias_of = "สหราชอาณาจักร", display = true, the = true}, ["U.K."] = {alias_of = "สหราชอาณาจักร", display = true, the = true}, ["สหรัฐอเมริกา"] = {the = true, container = "อเมริกาเหนือ", divs = {"เทศมณฑล", "county seats", "รัฐ", "ดินแดน", "dependent territories", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, {type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"}, {type = "NICKNAME_FOR states", cat_as = "nicknames for states"}, {type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"}, {type = "boroughs", prep = "ใน"}, -- exist in Pennsylvania and New Jersey "เทศบาล", -- these exist politically at least in Colorado and Connecticut {type = "census-designated places", prep = "ใน"}, {type = "unincorporated communities", prep = "ใน"}, -- Don't change the following to something more politically correct until/unless the US government makes a -- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at -- [[w:Indian reservations]]). "Indian reservations", }}, -- Abbreviations and long forms (when possible) get display-canonicalized. 
["US"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["U.S."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["USA"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["U.S.A."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["สหรัฐ"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["Uruguay"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}}, ["Uzbekistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Vanuatu"] = {container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true}, ["Vatican City"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป", -- First placetype should be 'city-state' for to shown up in its description, -- Its parent should still be "countries in Europe". bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, addl_parents = {"Rome"}, is_city = true, british_spelling = true}, ["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the" ["Venezuela"] = {container = "อเมริกาใต้", divs = {"รัฐ", "เทศบาล"}}, ["เวียดนาม"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "เทศบาล"}}, ["Western Sahara"] = {placetype = {"ดินแดน", "ประเทศ"}, container = "แอฟริกา", bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, }, -- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara ["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true}, ["SADR"] = {alias_of = "Sahrawi Arab Democratic Republic", display = true, the = true}, ["Yemen"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["Zambia"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Zimbabwe"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, } local function canonicalize_continent_container(key) if type(key) ~= "string" then return key end if export.continents[key] then return {key = key, placetype = 
			export.continents[key].placetype}
	end
	internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end

export.countries_group = {
	canonicalize_key_container = canonicalize_continent_container,
	default_overriding_bare_label_parents = {"+++", "ประเทศ"},
	default_placetype = "ประเทศ",
	default_no_container_cat = true,
	default_no_container_parent = true,
	-- No need to augment country holonyms with continents; not needed for disambiguation.
	default_no_auto_augment_container = true,
	data = export.countries,
}

-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
	-- British Overseas Territory
	["Akrotiri and Dhekelia"] = {
		placetype = {"overseas territory", "ดินแดน"},
		container = "สหราชอาณาจักร",
		addl_parents = {"ไซปรัส", "ยุโรป", "เอเชีย"},
		british_spelling = true,
	},
	-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
	-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States ["American Samoa"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"พอลินีเชีย"}, }, -- British Overseas Territory ["Anguilla"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["Abkhazia"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Georgia", "ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- Australian external territory ["Ashmore and Cartier Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, }, -- constituent country of the Netherlands ["Aruba"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- British Overseas Territory ["Bermuda"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"อเมริกาเหนือ"}, british_spelling = true, }, -- special municipality of the Netherlands ["Bonaire"] = { placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, is_city = true, british_spelling = true, }, -- British Overseas Territory ["British Indian Ocean Territory"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"เอเชีย"}, british_spelling = true, }, -- British Overseas Territory ["British Virgin Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- Norwegian dependent territory 
["Bouvet Island"] = { placetype = {"dependent territory", "ดินแดน"}, container = "Norway", addl_parents = {"แอฟริกา"}, british_spelling = true, }, -- British Overseas Territory ["Cayman Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- Australian external territory ["Christmas Island"] = { placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, british_spelling = true, }, -- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the -- French Southern and Antarctic Lands. ["Clipperton Island"] = { placetype = {"overseas territory", "ดินแดน"}, container = "ฝรั่งเศส", addl_parents = {"อเมริกาเหนือ"}, }, -- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands ["Cocos Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, wp = "Cocos (Keeling) Islands", british_spelling = true, }, ["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, ["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, -- self-governing but in free association with New Zealand ["Cook Islands"] = { the = true, placetype = {"ประเทศ"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- constituent country of the Netherlands ["Curaçao"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- special territory of Chile ["Easter Island"] = { placetype = {"special territory", "ดินแดน"}, container = "ชิลี", addl_parents = {"พอลินีเชีย"}, }, -- British Overseas Territory ["Falkland Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"อเมริกาใต้"}, 
british_spelling = true, }, -- autonomous territory of Denmark ["Faroe Islands"] = { the = true, placetype = {"autonomous territory", "ดินแดน"}, container = "เดนมาร์ก", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- overseas department and region of France ["French Guiana"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"อเมริกาใต้"}, british_spelling = true, }, -- overseas collectivity of France ["French Polynesia"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- French overseas territory ["French Southern and Antarctic Lands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "ฝรั่งเศส", addl_parents = {"แอฟริกา"}, }, -- British Overseas Territory ["Gibraltar"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"ยุโรป"}, is_city = true, british_spelling = true, }, -- autonomous territory of Denmark ["Greenland"] = { placetype = {"autonomous territory", "ดินแดน"}, container = "เดนมาร์ก", addl_parents = {"อเมริกาเหนือ"}, divs = {"เทศบาล"}, british_spelling = true, }, -- overseas department and region of France ["Guadeloupe"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", addl_parents = {"แคริบเบียน"}, divs = {"communes"}, british_spelling = true, }, -- unincorporated territory of the United States ["Guam"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- self-governing British Crown dependency; technically called the Bailiwick of Guernsey ["Guernsey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, 
british_spelling = true, wp = "Bailiwick of %l", }, ["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true}, -- Australian external territory ["Heard Island and McDonald Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"แอฟริกา"}, }, -- special administrative region of China ["Hong Kong"] = { placetype = {"special administrative region", "นคร"}, container = "จีน", is_city = true, british_spelling = true, }, -- self-governing British Crown dependency ["Isle of Man"] = { the = true, placetype = {"crown dependency", "dependency", "dependent territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, british_spelling = true, }, -- Norwegian unincorporated area ["Jan Mayen"] = { placetype = {"unincorporated area", "dependent territory", "ดินแดน", "เกาะ"}, container = "Norway", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- self-governing British Crown dependency; technically called the Bailiwick of Jersey ["Jersey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, british_spelling = true, }, ["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true}, -- special administrative region of China ["Macau"] = { placetype = {"special administrative region", "นคร"}, container = "จีน", is_city = true, british_spelling = true, }, -- overseas department and region of France ["Martinique"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- overseas department and region of France ["Mayotte"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"แอฟริกา"}, british_spelling = true, }, -- British Overseas 
Territory ["Montserrat"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- special collectivity of France ["New Caledonia"] = { placetype = {"special collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"เมลานีเชีย"}, british_spelling = true, }, -- dependent territory of New Zealand ["New Zealand Subantarctic Islands"] = { the = true, placetype = {"dependent territory", "ดินแดน"}, container = "New Zealand", addl_parents = {"แอนตาร์กติกา"}, british_spelling = true, }, -- self-governing but in free association with New Zealand ["Niue"] = { placetype = {"ประเทศ"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- Australian external territory ["Norfolk Island"] = { placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Cyprus ["Northern Cyprus"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"ไซปรัส", "Turkey", "ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]", british_spelling = true, }, -- commonwealth, unincorporated territory of the United States ["Northern Mariana Islands"] = { the = true, placetype = {"commonwealth", "unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- British Overseas Territory ["Pitcairn Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- commonwealth of the United States ["Puerto Rico"] = { placetype = {"commonwealth", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"แคริบเบียน"}, divs = {"เทศบาล"}, }, -- 
	-- overseas department and region of France
	["Réunion"] = {
		placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"},
		container = "ฝรั่งเศส",
		divs = {"communes"},
		addl_parents = {"แอฟริกา"},
		british_spelling = true,
	},
	-- special municipality of the Netherlands
	["Saba"] = {
		placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"},
		container = "เนเธอร์แลนด์",
		addl_parents = {"แคริบเบียน"},
		is_city = true,
		british_spelling = true,
	},
	-- overseas collectivity of France
	["Saint Barthélemy"] = {
		placetype = {"overseas collectivity", "collectivity"},
		container = "ฝรั่งเศส",
		addl_parents = {"แคริบเบียน"},
		british_spelling = true,
	},
	-- British Overseas Territory
	["Saint Helena, Ascension and Tristan da Cunha"] = {
		placetype = {"overseas territory", "ดินแดน"},
		container = "สหราชอาณาจักร",
		divs = {{type = "constituent parts", container_parent_type = false}},
		addl_parents = {"มหาสมุทรแอตแลนติก", "แอฟริกา"},
		british_spelling = true,
	},
	-- constituent parts of the combined overseas territory
	["Ascension Island"] = {
		placetype = {"constituent part", "ดินแดน", "เกาะ"},
		container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
		addl_parents = {"มหาสมุทรแอตแลนติก"},
		overriding_bare_label_parents = {},
		no_container_cat = false,
		no_container_parent = false,
		no_auto_augment_container = false,
	},
	["Saint Helena"] = {
		placetype = {"constituent part", "ดินแดน", "เกาะ"},
		container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
		addl_parents = {"มหาสมุทรแอตแลนติก"},
		overriding_bare_label_parents = {},
		no_container_cat = false,
		no_container_parent = false,
		no_auto_augment_container = false,
	},
	["Tristan da Cunha"] = {
		placetype = {"constituent part", "ดินแดน", "archipelago"},
		container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
		addl_parents = {"มหาสมุทรแอตแลนติก"},
		overriding_bare_label_parents = {},
no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, -- overseas collectivity of France ["Saint Martin"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- overseas collectivity of France ["Saint Pierre and Miquelon"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"อเมริกาเหนือ"}, british_spelling = true, }, -- special municipality of the Netherlands ["Sint Eustatius"] = { placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, is_city = true, british_spelling = true, }, -- constituent country of the Netherlands ["Sint Maarten"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Somalia ["Somaliland"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"โซมาเลีย", "แอฟริกา"}, keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]", british_spelling = true, }, -- British Overseas Territory -- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for -- "Saint Helena, Ascension and Tristan da Cunha". 
["South Georgia"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"มหาสมุทรแอตแลนติก"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["South Ossetia"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Georgia", "ยุโรป", "เอเชีย"}, keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- British Overseas Territory ["South Sandwich Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"มหาสมุทรแอตแลนติก"}, wp = true, wpcat = "South Georgia and the South Sandwich Islands", british_spelling = true, }, -- Norwegian unincorporated area ["Svalbard"] = { placetype = {"unincorporated area", "dependent territory", "ดินแดน", "archipelago"}, container = "Norway", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- dependent territory of New Zealand ["Tokelau"] = { placetype = {"dependent territory", "ดินแดน"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Moldova ["Transnistria"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Moldova", "ยุโรป"}, keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]", british_spelling = true, }, -- British Overseas Territory ["Turks and Caicos Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- unincorporated territory of the United States ["United States Minor Outlying Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"เกาะ", "ไมโครนีเชีย", "พอลินีเชีย", 
"แคริบเบียน"}, }, -- FIXME: We should add entries for the other minor outlying islands. -- Baker Island (Oceania) -- Howland Island (Oceania) -- Jarvis Island (Oceania) -- Johnston Atoll (Oceania) -- Kingman Reef (Oceania) -- Midway Atoll (Oceania) -- Navassa Island (Caribbean) -- Palmyra Atoll (Oceania) -- Wake Island (Oceania) ["Wake Island"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- unincorporated territory of the United States ["United States Virgin Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"แคริบเบียน"}, }, ["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, ["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, -- overseas collectivity of France ["Wallis and Futuna"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, } export.country_like_entities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Saint Helena, Ascension and Tristan da Cunha". key_to_placename = false, placename_to_key = false, canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"), default_overriding_bare_label_parents = {"country-like entities"}, default_no_container_cat = true, default_no_container_parent = true, -- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas -- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village -- in Europe. default_no_auto_augment_container = true, data = export.country_like_entities, } -- Former countries and such; we don't create "Cities in ..." 
categories because they don't exist anymore export.former_countries = { -- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan -- (also known as Nagorno-Karabakh) -- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out. ["Artsakh"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"อาเซอร์ไบจาน", "ยุโรป", "เอเชีย"}, keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]", british_spelling = true, }, ["Nagorno-Karabakh"] = {alias_of = "Artsakh"}, ["Czechoslovakia"] = {container = "ยุโรป", british_spelling = true}, ["East Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true}, ["เวียดนามเหนือ"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}}, ["เปอร์เซีย"] = {placetype = {"จักรวรรดิ", "ประเทศ"}, container = "เอเชีย", divs = {"จังหวัด"}}, ["Byzantine Empire"] = { the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Ancient Europe", "Ancient Near East"}, divs = { "จังหวัด", "themes", }}, ["Roman Empire"] = { the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Rome"}, divs = { "จังหวัด", {type = "FORMER provinces", cat_as = "จังหวัด"}, }}, ["เวียดนามใต้"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}}, ["Soviet Union"] = { the = true, container = {"ยุโรป", "เอเชีย"}, divs = {"republics", "autonomous republics"}, british_spelling = true}, ["West Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true}, ["Yugoslavia"] = {container = "ยุโรป", divs = {"อำเภอ"}, keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true}, } export.former_countries_group = { canonicalize_key_container = canonicalize_continent_container, 
default_overriding_bare_label_parents = {"former countries and country-like entities"}, default_is_former_place = true, default_placetype = "ประเทศ", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.former_countries, } ----------------------------------------------------------------------------------- -- Subpolity tables -- ----------------------------------------------------------------------------------- export.australia_states_and_territories = { ["Australian Capital Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["Jervis Bay Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["New South Wales, ออสเตรเลีย"] = {}, ["Northern Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["Queensland, ออสเตรเลีย"] = {}, ["South Australia, ออสเตรเลีย"] = {}, ["Tasmania, ออสเตรเลีย"] = {}, ["Victoria, ออสเตรเลีย"] = {}, ["Western Australia, ออสเตรเลีย"] = {}, } -- states and territories of Australia export.australia_group = { default_container = "ออสเตรเลีย", default_placetype = "รัฐ", default_divs = "local government areas", data = export.australia_states_and_territories, } export.austria_states = { ["Vienna, ออสเตรีย"] = {}, ["Lower Austria, ออสเตรีย"] = {}, ["Upper Austria, ออสเตรีย"] = {}, ["Styria, ออสเตรีย"] = {}, ["Tyrol, ออสเตรีย"] = {wp = "Tyrol (รัฐ)"}, ["Carinthia, ออสเตรีย"] = {}, ["Salzburg, ออสเตรีย"] = {wp = "Salzburg (รัฐ)"}, ["Vorarlberg, ออสเตรีย"] = {}, ["Burgenland, ออสเตรีย"] = {}, } -- states of Austria export.austria_group = { default_container = "ออสเตรีย", default_placetype = "รัฐ", default_divs = "เทศบาล", data = export.austria_states, } export.bangladesh_divisions = { ["Barisal Division, บังกลาเทศ"] = {}, ["Chittagong Division, บังกลาเทศ"] = {}, ["Dhaka Division, บังกลาเทศ"] = {}, ["Khulna Division, บังกลาเทศ"] = {}, ["Mymensingh Division, บังกลาเทศ"] = 
		{},
	["Rajshahi Division, บังกลาเทศ"] = {},
	["Rangpur Division, บังกลาเทศ"] = {},
	["Sylhet Division, บังกลาเทศ"] = {},
}

-- divisions of Bangladesh
export.bangladesh_group = {
	key_to_placename = make_key_to_placename(", บังกลาเทศ$", " Division$"),
	placename_to_key = make_placename_to_key(", บังกลาเทศ", " Division"),
	default_container = "บังกลาเทศ",
	default_placetype = "division",
	default_divs = "อำเภอ",
	data = export.bangladesh_divisions,
}

export.brazil_states = {
	["Acre, บราซิล"] = {wp = "%l (รัฐ)"},
	["Alagoas, บราซิล"] = {},
	["Amapá, บราซิล"] = {},
	["Amazonas, บราซิล"] = {wp = "%l (Brazilian state)"},
	["Bahia, บราซิล"] = {},
	["Ceará, บราซิล"] = {},
	["Distrito Federal, บราซิล"] = {wp = "Federal District (Brazil)"},
	["Espírito Santo, บราซิล"] = {},
	["Goiás, บราซิล"] = {},
	["Maranhão, บราซิล"] = {},
	["Mato Grosso, บราซิล"] = {},
	["Mato Grosso do Sul, บราซิล"] = {},
	["Minas Gerais, บราซิล"] = {},
	["Pará, บราซิล"] = {},
	["Paraíba, บราซิล"] = {},
	["Paraná, บราซิล"] = {wp = "%l (รัฐ)"},
	["Pernambuco, บราซิล"] = {},
	["Piauí, บราซิล"] = {},
	["Rio de Janeiro, บราซิล"] = {wp = "%l (รัฐ)"},
	["Rio Grande do Norte, บราซิล"] = {},
	["Rio Grande do Sul, บราซิล"] = {},
	["Rondônia, บราซิล"] = {},
	["Roraima, บราซิล"] = {},
	["Santa Catarina, บราซิล"] = {wp = "%l (รัฐ)"},
	["São Paulo, บราซิล"] = {wp = "%l (รัฐ)"},
	["Sergipe, บราซิล"] = {},
	["Tocantins, บราซิล"] = {},
}

-- states of Brazil
export.brazil_group = {
	default_container = "บราซิล",
	default_placetype = "รัฐ",
	default_divs = "เทศบาล",
	data = export.brazil_states,
}

export.canada_provinces_and_territories = {
	["Alberta, แคนาดา"] = {divs = {
		{type = "municipal districts", container_parent_type = "rural municipalities"},
	}},
	-- Both division specs must sit inside the `divs` list; otherwise "regional municipalities" becomes a
	-- stray positional element of the entry itself.
	["British Columbia, แคนาดา"] = {divs = {
		{type = "regional districts", container_parent_type = false},
		"regional municipalities",
	}},
	["Manitoba, แคนาดา"] = {divs = {"rural municipalities"}},
	["New Brunswick, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
	["Newfoundland and Labrador, แคนาดา"] = {},
	["Northwest Territories, แคนาดา"] = {the = true, placetype = "ดินแดน"},
	["Nova Scotia, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities"}},
	["Nunavut, แคนาดา"] = {placetype = "ดินแดน"},
	["Ontario, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities", {type = "townships", prep = "ใน"}}},
	["Prince Edward Island, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", "rural municipalities"}},
	["Saskatchewan, แคนาดา"] = {divs = {"rural municipalities"}},
	["Quebec, แคนาดา"] = {divs = {
		"เทศมณฑล",
		{type = "regional county municipalities", container_parent_type = "regional municipalities"},
		-- administrative regions have an official (but non-governmental) function but there don't appear to be any
		-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
		{type = "ภูมิภาค", container_parent_type = false},
		{type = "townships", prep = "ใน"},
		{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
		{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
		{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
	}},
	["Yukon, แคนาดา"] = {placetype = "ดินแดน"},
	-- The alias target must match the localized key above.
	["Yukon Territory, แคนาดา"] = {alias_of = "Yukon, แคนาดา", the = true},
}

-- provinces and territories of Canada
export.canada_group = {
	default_container = "แคนาดา",
	default_placetype = "รัฐ", -- following thwiki
	data = export.canada_provinces_and_territories,
}

export.china_provinces_and_autonomous_regions = {
	-- direct-administered municipalities are not here but below under prefecture-level cities
	["Anhui, จีน"] = {},
	["Fujian, จีน"] = {},
	["Fuchien, จีน"] = {alias_of = "Fujian, จีน", display = true},
	["Gansu, จีน"] = {},
	["Guangdong, จีน"] = {},
	["Guangxi, จีน"] = {placetype = "autonomous region"},
	["Guizhou, จีน"] = {},
	["Hainan, จีน"] = {},
	["Hebei, จีน"] = {},
	["Heilongjiang, จีน"] = {},
	["Henan, จีน"] = {},
	["Hubei, จีน"] = {},
	["Hunan, จีน"] = {},
	["Inner Mongolia, จีน"] = {placetype = "autonomous region"},
	["Jiangsu, จีน"] = {},
	["Jiangxi, จีน"] = {},
	["Jilin, จีน"] = {},
	["Liaoning, จีน"] = {},
	["Ningxia, จีน"] = {placetype = "autonomous region"},
	["Qinghai, จีน"] = {},
	["Shaanxi, จีน"] = {},
	["Shandong, จีน"] = {},
	["Shanxi, จีน"] = {},
	["Sichuan, จีน"] = {},
	["Tibet, จีน"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
	["Xinjiang, จีน"] = {placetype = "autonomous region"},
	["Yunnan, จีน"] = {},
	["Zhejiang, จีน"] = {},
}

-- provinces and autonomous regions of China
export.china_group = {
	default_container = "จีน",
	default_placetype = "มณฑล",
	default_divs = {
		"จังหวัด",
		"prefecture-level cities",
		"อำเภอ",
		"ตำบล",
		"townships",
		{type = "เทศมณฑล", cat_as = "counties and county-level cities"},
		{type = "county-level cities", cat_as = "counties and county-level cities"},
	},
	data = export.china_provinces_and_autonomous_regions,
}

export.china_prefecture_level_cities = {
	-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
	-- administrative unit smaller than a province but bigger than a county, which is administratively controlled by
	-- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior
	-- to the mid-1980's, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the
	-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
	-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
	-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
	-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears
	-- the same name as the county-level city).
-- -- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the -- most populous so we can separately categorize districts and counties under them instead of lumping them at the -- province level. -- -- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are -- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm -- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes -- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the -- metro area separated by suburban/exurban or rural land. -- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at -- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total -- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level -- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia -- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off -- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces -- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes -- a lot of obscure cities. -- -- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was -- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate -- was >= 1 million. 
	-- These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
	-- "administrative area (urban population)" type), which sometimes groups nearby cities into a single
	-- agglomeration. The most notable case is the Pearl River Delta, grouped under Guangzhou with an agglomeration
	-- population of 72,700,000 and including a large number of nearby large cities in the agglomeration (although for
	-- some reason not Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de
	-- includes divisions under a prefecture-level city if they are city-like and have an agglomeration population of at
	-- least 1 million; this includes several county-level cities, one county and one district (Wanzhou, a "district" of
	-- Chongqing despite being 142 miles away). None of the county-level cities or counties have districts under them,
	-- only subdistricts, towns and townships.
	["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
	["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Zhongshan"] = {container =
"Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shanghai"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de ["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: Not to be confused with Cangzhou in Hebei ["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants ["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Beijing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de ["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de ["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de ["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per 
citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de ["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration ["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de ["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration ["Chongqing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de ["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de ["Tianjin"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de ["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de -- Changsha County -- 1.024 urban per citypopulation.de ["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration ["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de ["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de ["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de ["Fushun"] = {container = "Liaoning"}, -- 1.229 
urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration ["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de ["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de ["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de ["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de ["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration ["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration -- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria ["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de -- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core). 
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration ["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de -- includes Láiwú city ["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de -- includes Xīnjí city ["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de ["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de ["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de ["Nanning"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de ["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de ["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de ["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de ["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) 
urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de ["Ürümqi"] = {container = {key = "Xinjiang, จีน", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de ["Urumqi"] = {alias_of = "Ürümqi", display = true}, ["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de ["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de ["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de ["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de ["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de ["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de ["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de ["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de ["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures ["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de ["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de ["Hohhot"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de ["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de ["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de 
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de ["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de ["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de ["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de ["Taizhou"] = {alias_of = "Taizhou, Zhejiang"}, ["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de ["Yinchuan"] = {container = {key = "Ningxia, จีน", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de ["Liuzhou"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de ["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de ["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de ["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de ["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de -- includes Dìngzhōu city and Xióngān Xīnqū ["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de ["Baotou"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de ["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de ["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de ["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per 
citypopulation.de ["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de ["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de ["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de ["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de ["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de ["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de ["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de ["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de ["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de ["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de ["Guilin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de ["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de ["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de ["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de ["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de ["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de ["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de ["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de ["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban 
(1.680 adm-urb) per citypopulation.de ["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de ["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de ["Jilin"] = {alias_of = "Jilin City"}, ["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de ["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de ["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de ["Yulin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de ["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de ["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de -- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash ["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de ["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de ["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de ["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de ["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de ["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de ["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de ["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de ["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per 
citypopulation.de ["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de ["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de ["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de ["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de ["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de ["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de ["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de ["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de ["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de ["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de ["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de ["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de ["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de ["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de ["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de ["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de ["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de -- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper. 
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"ตำบล", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de ["Ulanhad"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de ["Chifeng"] = {alias_of = "Ulanhad"}, ["Ulankhad"] = {alias_of = "Ulanhad", display = true}, ["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de ["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de ["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de ["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de -- Shuyang is a "เทศมณฑล" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core). -- The county itself is 37 miles by 34 miles. ["Shuyang"] = {placetype = "เทศมณฑล", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de -- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core). 
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de ["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de ["Beihai"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de ["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de ["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de ["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de ["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de ["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de ["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de ["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de ["Guigang"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de -- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core). 
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de -- NOTE: Not to be confused with Changzhou in Jiangsu ["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de ["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de ["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de ["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de ["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de -- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core). ["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de -- 3 extra that got added in earlier incarnations and aren't found in the "major agglomerations of the world" page https://citypopulation.de/en/world/agglomerations/ reference date 2025-01-01 ["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de ["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de ["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de } export.china_prefecture_level_cities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Zhejiang" or "Suzhou, Anhui". 
key_to_placename = false, placename_to_key = false, -- don't add ", จีน" to make the key default_container = "จีน", canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"), -- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people -- don't understand how Chinese administrative divisions work. default_placetype = {"prefecture-level city", "นคร"}, default_divs = { -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities, -- and prefecture-level cities (as well as county-level cities) are considered non-cities. "อำเภอ", "ตำบล", "townships", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_prefecture_level_cities, } -- Needed to avoid problems with two cities called Taizhou and Suzhou. export.china_prefecture_level_cities_2 = { -- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang. ["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census ["Taizhou"] = {alias_of = "Taizhou, Jiangsu"}, -- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu. ["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census -- hopefully this will work because we also have Suzhou as a key by itself for the larger, more-well-known Suzhou in Jiangsu ["Suzhou"] = {alias_of = "Suzhou, Anhui"}, } export.china_prefecture_level_cities_group_2 = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Jiangsu". 
placename_to_key = false, -- don't add ", จีน" to make the key default_container = "จีน", canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"), -- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people -- don't understand how Chinese administrative divisions work. default_placetype = {"prefecture-level city", "นคร"}, default_divs = { -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities, -- and prefecture-level cities (as well as county-level cities) are considered non-cities. "อำเภอ", "ตำบล", "townships", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_prefecture_level_cities_2, } export.finland_regions = { ["Lapland, ฟินแลนด์"] = {wp = "%l (%c)"}, ["North Ostrobothnia, ฟินแลนด์"] = {}, ["Northern Ostrobothnia, ฟินแลนด์"] = {alias_of = "North Ostrobothnia, ฟินแลนด์", display = true}, ["Kainuu, ฟินแลนด์"] = {}, ["North Karelia, ฟินแลนด์"] = {}, ["Northern Savonia, ฟินแลนด์"] = {}, ["North Savo, ฟินแลนด์"] = {alias_of = "Northern Savonia, ฟินแลนด์", display = true}, ["Southern Savonia, ฟินแลนด์"] = {}, ["South Savo, ฟินแลนด์"] = {alias_of = "Southern Savonia, ฟินแลนด์", display = true}, ["South Karelia, ฟินแลนด์"] = {}, ["Central Finland, ฟินแลนด์"] = {}, ["South Ostrobothnia, ฟินแลนด์"] = {}, ["Southern Ostrobothnia, ฟินแลนด์"] = {alias_of = "South Ostrobothnia, ฟินแลนด์", display = true}, ["Ostrobothnia, ฟินแลนด์"] = {wp = "%l (ภูมิภาค)"}, ["Central Ostrobothnia, ฟินแลนด์"] = {}, ["Pirkanmaa, ฟินแลนด์"] = {}, ["Satakunta, ฟินแลนด์"] = {}, ["Päijänne Tavastia, ฟินแลนด์"] = {}, ["Päijät-Häme, ฟินแลนด์"] = {alias_of = "Päijänne Tavastia, ฟินแลนด์", display = true}, ["Tavastia Proper, ฟินแลนด์"] = {}, ["Kanta-Häme, ฟินแลนด์"] = {alias_of = "Tavastia Proper, ฟินแลนด์", display = true}, ["Kymenlaakso, ฟินแลนด์"] = {}, ["Uusimaa, 
ฟินแลนด์"] = {}, ["Southwest Finland, ฟินแลนด์"] = {}, ["Åland Islands, ฟินแลนด์"] = {the = true, wp = "Åland"}, ["Åland, ฟินแลนด์"] = {alias_of = "Åland Islands, ฟินแลนด์"}, -- differs in "the" } -- regions of Finland export.finland_group = { default_container = "ฟินแลนด์", default_placetype = "ภูมิภาค", default_divs = "เทศบาล", data = export.finland_regions, } export.france_administrative_regions = { ["Auvergne-Rhône-Alpes, ฝรั่งเศส"] = {}, ["Bourgogne-Franche-Comté, ฝรั่งเศส"] = {}, ["Brittany, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Centre-Val de Loire, ฝรั่งเศส"] = {}, ["Corsica, ฝรั่งเศส"] = {}, -- overseas departments are handled in `export.country_like_entities` -- ["French Guiana"] = {}, ["Grand Est, ฝรั่งเศส"] = {}, -- ["Guadeloupe"] = {}, ["Hauts-de-France, ฝรั่งเศส"] = {}, ["Île-de-France, ฝรั่งเศส"] = {}, -- ["Martinique"] = {}, -- ["Mayotte"] = {}, ["Normandy, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Nouvelle-Aquitaine, ฝรั่งเศส"] = {}, ["Occitania, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Occitanie, ฝรั่งเศส"] = {alias_of = "Occitania, ฝรั่งเศส", display = true}, ["Pays de la Loire, ฝรั่งเศส"] = {}, ["Provence-Alpes-Côte d'Azur, ฝรั่งเศส"] = {}, -- ["Réunion"] = {}, } -- administrative regions of France export.france_group = { default_container = "ฝรั่งเศส", -- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back -- to 'region'). 
default_placetype = "ภูมิภาค", default_divs = { "communes", {type = "เทศบาล", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, }, data = export.france_administrative_regions, } export.france_departments = { ["Ain, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 01 ["Aisne, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 02 ["Allier, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 03 ["Alpes-de-Haute-Provence, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04 ["Hautes-Alpes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05 ["Alpes-Maritimes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06 ["Ardèche, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 07 ["Ardennes, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 08 ["Ariège, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 09 ["Aube, ฝรั่งเศส"] = {container = "Grand Est"}, -- 10 ["Aude, ฝรั่งเศส"] = {container = "Occitania"}, -- 11 ["Aveyron, ฝรั่งเศส"] = {container = "Occitania"}, -- 12 ["Bouches-du-Rhône, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13 ["Calvados, ฝรั่งเศส"] = {container = "Normandy", wp = "%l (department)"}, -- 14 ["Cantal, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 15 ["Charente, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 16 ["Charente-Maritime, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 17 ["Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18 ["Corrèze, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 19 ["Corse-du-Sud, ฝรั่งเศส"] = {container = "Corsica"}, -- 2A ["Haute-Corse, ฝรั่งเศส"] = {container = "Corsica"}, -- 2B ["Côte-d'Or, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 21 ["Côte d'Or, ฝรั่งเศส"] = {alias_of = "Côte-d'Or, ฝรั่งเศส", display = true}, 
["Côtes-d'Armor, ฝรั่งเศส"] = {container = "Brittany"}, -- 22 ["Côtes d'Armor, ฝรั่งเศส"] = {alias_of = "Côtes-d'Armor, ฝรั่งเศส", display = true}, ["Creuse, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 23 ["Dordogne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 24 ["Doubs, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 25 ["Drôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 26 ["Eure, ฝรั่งเศส"] = {container = "Normandy"}, -- 27 ["Eure-et-Loir, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 28 ["Finistère, ฝรั่งเศส"] = {container = "Brittany"}, -- 29 ["Gard, ฝรั่งเศส"] = {container = "Occitania"}, -- 30 ["Haute-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 31 ["Gers, ฝรั่งเศส"] = {container = "Occitania"}, -- 32 ["Gironde, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 33 ["Hérault, ฝรั่งเศส"] = {container = "Occitania"}, -- 34 ["Ille-et-Vilaine, ฝรั่งเศส"] = {container = "Brittany"}, -- 35 ["Indre, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 36 ["Indre-et-Loire, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 37 ["Isère, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 38 ["Jura, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39 ["Landes, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40 ["Loir-et-Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 41 ["Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42 ["Haute-Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 43 ["Loire-Atlantique, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 44 ["Loiret, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 45 ["Lot, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 46 ["Lot-et-Garonne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 47 ["Lozère, ฝรั่งเศส"] = {container = "Occitania"}, -- 48 ["Maine-et-Loire, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 49 
["Manche, ฝรั่งเศส"] = {container = "Normandy"}, -- 50 ["Marne, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 51 ["Haute-Marne, ฝรั่งเศส"] = {container = "Grand Est"}, -- 52 ["Mayenne, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 53 ["Meurthe-et-Moselle, ฝรั่งเศส"] = {container = "Grand Est"}, -- 54 ["Meuse, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 55 ["Morbihan, ฝรั่งเศส"] = {container = "Brittany"}, -- 56 ["Moselle, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 57 ["Nièvre, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 58 ["Nord, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59 ["Oise, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 60 ["Orne, ฝรั่งเศส"] = {container = "Normandy"}, -- 61 ["Pas-de-Calais, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 62 ["Puy-de-Dôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 63 ["Pyrénées-Atlantiques, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 64 ["Hautes-Pyrénées, ฝรั่งเศส"] = {container = "Occitania"}, -- 65 ["Pyrénées-Orientales, ฝรั่งเศส"] = {container = "Occitania"}, -- 66 ["Bas-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 67 ["Haut-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 68 ["Rhône, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D ["Metropolis of Lyon, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M ["Lyon Metropolis, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"}, ["Lyon, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"}, ["Haute-Saône, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 70 ["Saône-et-Loire, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 71 ["Sarthe, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 72 ["Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 73 ["Haute-Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 74 ["Paris, ฝรั่งเศส"] = 
{container = "Île-de-France"}, -- 75 ["Seine-Maritime, ฝรั่งเศส"] = {container = "Normandy"}, -- 76 ["Seine-et-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 77 ["Yvelines, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 78 ["Deux-Sèvres, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 79 ["Somme, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80 ["Tarn, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 81 ["Tarn-et-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 82 ["Var, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83 ["Vaucluse, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84 ["Vendée, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 85 ["Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86 ["Haute-Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 87 ["Vosges, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 88 ["Yonne, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 89 ["Territoire de Belfort, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 90 ["Essonne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 91 ["Hauts-de-Seine, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 92 ["Seine-Saint-Denis, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 93 ["Val-de-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 94 ["Val-d'Oise, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 95 --["Guadeloupe"] = {container = "Guadeloupe"}, -- 971 --["Martinique"] = {container = "Martinique"}, -- 972 --["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973 --["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974 --["Mayotte"] = {container = "Mayotte"}, -- 976 } export.france_departments_group = { placename_to_key = make_placename_to_key(", ฝรั่งเศส"), canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"), default_placetype = "department", 
	default_divs = {
		"communes",
		{type = "เทศบาล", cat_as = "communes"},
	},
	data = export.france_departments,
}

export.germany_states = {
	["Baden-Württemberg, เยอรมนี"] = {},
	["Bavaria, เยอรมนี"] = {},
	-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
	-- the default_divs setting. Better not to include them at all since they're included as cities down below.
	-- ["Berlin"] = {divs = {}},
	["Brandenburg, เยอรมนี"] = {},
	-- ["Bremen"] = {divs = {}},
	-- ["Hamburg"] = {divs = {}},
	["Hesse, เยอรมนี"] = {},
	["Lower Saxony, เยอรมนี"] = {},
	["Mecklenburg-Vorpommern, เยอรมนี"] = {},
	["Mecklenburg-Western Pomerania, เยอรมนี"] = {alias_of = "Mecklenburg-Vorpommern, เยอรมนี", display = true},
	["North Rhine-Westphalia, เยอรมนี"] = {},
	["Rhineland-Palatinate, เยอรมนี"] = {},
	["Saarland, เยอรมนี"] = {},
	["Saxony, เยอรมนี"] = {},
	["Saxony-Anhalt, เยอรมนี"] = {},
	["Schleswig-Holstein, เยอรมนี"] = {},
	["Thuringia, เยอรมนี"] = {},
}

-- states of Germany
export.germany_group = {
	default_container = "เยอรมนี",
	default_placetype = "รัฐ",
	default_divs = {"อำเภอ", "เทศบาล"},
	data = export.germany_states,
}

export.greece_regions = {
	["Attica, กรีซ"] = {wp = "%l (ภูมิภาค)"},
	["Central Greece, กรีซ"] = {wp = "%l (administrative region)"},
	["Central Macedonia, กรีซ"] = {},
	["Crete, กรีซ"] = {},
	["Eastern Macedonia and Thrace, กรีซ"] = {},
	["Epirus, กรีซ"] = {wp = "%l (ภูมิภาค)"},
	["Ionian Islands, กรีซ"] = {the = true, wp = "%l (ภูมิภาค)"},
	["North Aegean, กรีซ"] = {the = true},
	-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
	-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
	-- has "the" in it.
["Peloponnese, กรีซ"] = {wp = "%l (ภูมิภาค)"}, ["South Aegean, กรีซ"] = {the = true}, ["Thessaly, กรีซ"] = {}, ["Western Greece, กรีซ"] = {}, ["Western Macedonia, กรีซ"] = {}, ["Mount Athos, กรีซ"] = {placetype = {"autonomous region", "ภูมิภาค"}, wp = "Monastic community of Mount Athos"}, } -- regions of Greece export.greece_group = { default_container = "กรีซ", default_placetype = "ภูมิภาค", data = export.greece_regions, } local india_polity_with_divisions = {"divisions", "อำเภอ"} local india_polity_without_divisions = {"อำเภอ"} -- States and union territories of India. Only some of them are divided into divisions. export.india_states_and_union_territories = { ["Andaman and Nicobar Islands, อินเดีย"] = {the = true, placetype = "union territory", divs = india_polity_without_divisions}, ["Andhra Pradesh, อินเดีย"] = {divs = india_polity_without_divisions}, ["Arunachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Assam, อินเดีย"] = {divs = india_polity_with_divisions}, ["Bihar, อินเดีย"] = {divs = india_polity_with_divisions}, ["Chandigarh, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Chhattisgarh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Dadra and Nagar Haveli and Daman and Diu, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Delhi, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions}, ["Goa, อินเดีย"] = {divs = india_polity_without_divisions}, ["Gujarat, อินเดีย"] = {divs = india_polity_without_divisions}, ["Haryana, อินเดีย"] = {divs = india_polity_with_divisions}, ["Himachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Jammu and Kashmir, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions, wp = "%l (union territory)"}, ["Jharkhand, อินเดีย"] = {divs = india_polity_with_divisions}, ["Karnataka, อินเดีย"] = {divs = india_polity_with_divisions}, ["Kerala, อินเดีย"] = {divs = 
india_polity_without_divisions}, ["Ladakh, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions}, ["Lakshadweep, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Madhya Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Maharashtra, อินเดีย"] = {divs = india_polity_with_divisions}, ["Manipur, อินเดีย"] = {divs = india_polity_without_divisions}, ["Meghalaya, อินเดีย"] = {divs = india_polity_with_divisions}, ["Mizoram, อินเดีย"] = {divs = india_polity_without_divisions}, ["Nagaland, อินเดีย"] = {divs = india_polity_with_divisions}, ["Odisha, อินเดีย"] = {divs = india_polity_with_divisions}, ["Puducherry, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions, wp = "%l (union territory)"}, ["Pondicherry, อินเดีย"] = {alias_of = "Puducherry, อินเดีย", display = true}, ["Punjab, อินเดีย"] = {divs = india_polity_with_divisions, wp = "%l, %c"}, ["Rajasthan, อินเดีย"] = {divs = india_polity_with_divisions}, ["Sikkim, อินเดีย"] = {divs = india_polity_without_divisions}, ["Tamil Nadu, อินเดีย"] = {divs = india_polity_without_divisions}, ["Telangana, อินเดีย"] = {divs = india_polity_without_divisions}, ["Tripura, อินเดีย"] = {divs = india_polity_without_divisions}, ["Uttar Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Uttarakhand, อินเดีย"] = {divs = india_polity_with_divisions}, ["West Bengal, อินเดีย"] = {divs = india_polity_with_divisions}, } -- states and union territories of India export.india_group = { default_container = "อินเดีย", default_placetype = "รัฐ", data = export.india_states_and_union_territories, } export.indonesia_provinces = { ["Aceh, อินโดนีเซีย"] = {}, ["Bali, อินโดนีเซีย"] = {}, ["Bangka Belitung Islands, อินโดนีเซีย"] = {the = true}, ["Banten, อินโดนีเซีย"] = {}, ["Bengkulu, อินโดนีเซีย"] = {}, ["Central Java, อินโดนีเซีย"] = {}, ["Central Kalimantan, อินโดนีเซีย"] = {}, ["Central Papua, อินโดนีเซีย"] = {}, ["Central Sulawesi, 
อินโดนีเซีย"] = {},
	["East Java, อินโดนีเซีย"] = {},
	["East Kalimantan, อินโดนีเซีย"] = {},
	["East Nusa Tenggara, อินโดนีเซีย"] = {},
	["Gorontalo, อินโดนีเซีย"] = {},
	["Highland Papua, อินโดนีเซีย"] = {wp = "%l"},
	["Special Capital Region of Jakarta, อินโดนีเซีย"] = {the = true, wp = "Jakarta"},
	["Jakarta, อินโดนีเซีย"] = {alias_of = "Special Capital Region of Jakarta, อินโดนีเซีย"},
	["Jambi, อินโดนีเซีย"] = {},
	["Lampung, อินโดนีเซีย"] = {},
	["Maluku, อินโดนีเซีย"] = {},
	["North Kalimantan, อินโดนีเซีย"] = {},
	["North Maluku, อินโดนีเซีย"] = {},
	["North Sulawesi, อินโดนีเซีย"] = {},
	["North Papua, อินโดนีเซีย"] = {},
	["North Sumatra, อินโดนีเซีย"] = {},
	["Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
	["Riau, อินโดนีเซีย"] = {},
	["Riau Islands, อินโดนีเซีย"] = {the = true},
	["Southeast Sulawesi, อินโดนีเซีย"] = {},
	["South Kalimantan, อินโดนีเซีย"] = {},
	["South Papua, อินโดนีเซีย"] = {},
	["South Sulawesi, อินโดนีเซีย"] = {},
	["South Sumatra, อินโดนีเซีย"] = {},
	["Southwest Papua, อินโดนีเซีย"] = {},
	["West Java, อินโดนีเซีย"] = {},
	["West Kalimantan, อินโดนีเซีย"] = {},
	["West Nusa Tenggara, อินโดนีเซีย"] = {},
	["West Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
	["West Sulawesi, อินโดนีเซีย"] = {},
	["West Sumatra, อินโดนีเซีย"] = {},
	["Special Region of Yogyakarta, อินโดนีเซีย"] = {the = true},
	["Yogyakarta, อินโดนีเซีย"] = {alias_of = "Special Region of Yogyakarta, อินโดนีเซีย"},
}

-- provinces of Indonesia
export.indonesia_group = {
	default_container = "อินโดนีเซีย",
	default_placetype = "จังหวัด",
	-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
	-- spellings.
data = export.indonesia_provinces, } export.iran_provinces = { ["Alborz, อิหร่าน"] = {}, -- abbreviation AL, capital [[w:Karaj]] ["Ardabil, อิหร่าน"] = {}, -- abbreviation AR, capital [[w:Ardabil]] ["Bushehr, อิหร่าน"] = {}, -- abbreviation BU, capital [[w:Bushehr]] ["Chaharmahal and Bakhtiari, อิหร่าน"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]] ["East Azerbaijan, อิหร่าน"] = {}, -- abbreviation EA, capital [[w:Tabriz]] ["Fars, อิหร่าน"] = {}, -- abbreviation FA, capital [[w:Shiraz]] ["Pars, อิหร่าน"] = {alias_of = "Fars, อิหร่าน", display = true}, ["Gilan, อิหร่าน"] = {}, -- abbreviation GN, capital [[w:Rasht]] ["Golestan, อิหร่าน"] = {}, -- abbreviation GO, capital [[w:Gorgan]] ["Hamadan, อิหร่าน"] = {}, -- abbreviation HA, capital [[w:Hamadan]] ["Hormozgan, อิหร่าน"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]] ["Ilam, อิหร่าน"] = {}, -- abbreviation IL, capital [[w:Ilam, อิหร่าน|Ilam]] ["Isfahan, อิหร่าน"] = {}, -- abbreviation IS, capital [[w:Isfahan]] ["Kerman, อิหร่าน"] = {}, -- abbreviation KN, capital [[w:Kerman]] ["Kermanshah, อิหร่าน"] = {}, -- abbreviation KE, capital [[w:Kermanshah]] ["Khuzestan, อิหร่าน"] = {}, -- abbreviation KH, capital [[w:Ahvaz]] ["Kohgiluyeh and Boyer-Ahmad, อิหร่าน"] = {}, -- abbreviation KB, capital [[w:Yasuj]] ["Kurdistan, อิหร่าน"] = {}, -- abbreviation KU, capital [[w:Sanandaj]] ["Lorestan, อิหร่าน"] = {}, -- abbreviation LO, capital [[w:Khorramabad]] ["Markazi, อิหร่าน"] = {}, -- abbreviation MA, capital [[w:Arak, อิหร่าน|Arak]] ["Mazandaran, อิหร่าน"] = {}, -- abbreviation MN, capital [[w:Sari, อิหร่าน|Sari]] ["North Khorasan, อิหร่าน"] = {}, -- abbreviation NK, capital [[w:Bojnord]] ["Qazvin, อิหร่าน"] = {}, -- abbreviation QA, capital [[w:Qazvin]] ["Qom, อิหร่าน"] = {}, -- abbreviation QM, capital [[w:Qom]] ["Razavi Khorasan, อิหร่าน"] = {}, -- abbreviation RK, capital [[w:Mashhad]] ["Semnan, อิหร่าน"] = {}, -- abbreviation SE, capital [[w:Semnan, อิหร่าน|Semnan]] ["Sistan and Baluchestan, อิหร่าน"] = 
{}, -- abbreviation SB, capital [[w:Zahedan]] ["South Khorasan, อิหร่าน"] = {}, -- abbreviation SK, capital [[w:Birjand]] ["Tehran, อิหร่าน"] = {}, -- abbreviation TE, capital [[w:Tehran]] ["West Azerbaijan, อิหร่าน"] = {}, -- abbreviation WA, capital [[w:Urmia]] ["Yazd, อิหร่าน"] = {}, -- abbreviation YA, capital [[w:Yazd]] ["Zanjan, อิหร่าน"] = {}, -- abbreviation ZA, capital [[w:Zanjan, อิหร่าน|Zanjan]] } -- provinces of Iran export.iran_group = { key_to_placename = make_key_to_placename(", อิหร่าน$"), placename_to_key = make_placename_to_key(", อิหร่าน"), default_container = "อิหร่าน", default_placetype = "จังหวัด", -- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them -- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]], -- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].) -- default_divs = "เทศมณฑล", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "จังหวัด%e", data = export.iran_provinces, } export.ireland_counties = { ["County Carlow, ไอร์แลนด์"] = {}, ["County Cavan, ไอร์แลนด์"] = {}, ["County Clare, ไอร์แลนด์"] = {}, ["County Cork, ไอร์แลนด์"] = {}, ["County Donegal, ไอร์แลนด์"] = {}, ["County Dublin, ไอร์แลนด์"] = {}, ["County Galway, ไอร์แลนด์"] = {}, ["County Kerry, ไอร์แลนด์"] = {}, ["County Kildare, ไอร์แลนด์"] = {}, ["County Kilkenny, ไอร์แลนด์"] = {}, ["County Laois, ไอร์แลนด์"] = {}, ["County Leitrim, ไอร์แลนด์"] = {}, ["County Limerick, ไอร์แลนด์"] = {}, ["County Longford, ไอร์แลนด์"] = {}, ["County Louth, ไอร์แลนด์"] = {}, ["County Mayo, ไอร์แลนด์"] = {}, ["County Meath, ไอร์แลนด์"] = {}, ["County Monaghan, ไอร์แลนด์"] = {}, ["County Offaly, ไอร์แลนด์"] = {}, ["County Roscommon, ไอร์แลนด์"] = {}, ["County Sligo, ไอร์แลนด์"] = {}, ["County Tipperary, ไอร์แลนด์"] = {}, ["County Waterford, ไอร์แลนด์"] = {}, ["County Westmeath, ไอร์แลนด์"] = {}, ["County 
Wexford, ไอร์แลนด์"] = {},
	["County Wicklow, ไอร์แลนด์"] = {},
}

-- Return a function that converts a key such as "County Cork, ไอร์แลนด์" into two placenames: the full form with the
-- container stripped ("County Cork") and the elliptical form with the leading "County " also stripped ("Cork").
local function make_irish_type_key_to_placename(container_pattern)
	return function(key)
		key = key:gsub(container_pattern, "")
		local elliptical_key = key:gsub("^County ", "")
		return key, elliptical_key
	end
end

-- Return a function that converts a placename into a key: "County " is prepended unless the placename already begins
-- with "County " or "City ", and then the container suffix is appended ("Cork" -> "County Cork, ไอร์แลนด์").
local function make_irish_type_placename_to_key(container_suffix)
	return function(placename)
		if not placename:find("^County ") and not placename:find("^City ") then
			placename = "County " .. placename
		end
		return placename .. container_suffix
	end
end

-- counties of Ireland
export.ireland_group = {
	key_to_placename = make_irish_type_key_to_placename(", ไอร์แลนด์$"),
	placename_to_key = make_irish_type_placename_to_key(", ไอร์แลนด์"),
	default_container = "ไอร์แลนด์",
	default_placetype = "เทศมณฑล",
	data = export.ireland_counties,
}

export.italy_administrative_regions = {
	["Abruzzo, Italy"] = {},
	["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Apulia, Italy"] = {},
	["Basilicata, Italy"] = {},
	["Calabria, Italy"] = {},
	["Campania, Italy"] = {},
	["Emilia-Romagna, Italy"] = {},
	["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Lazio, Italy"] = {},
	["Liguria, Italy"] = {},
	["Lombardy, Italy"] = {},
	["Marche, Italy"] = {},
	["Molise, Italy"] = {},
	["Piedmont, Italy"] = {},
	["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Tuscany, Italy"] = {},
	["Umbria, Italy"] = {},
	["Veneto, Italy"] = {},
}

-- administrative regions of Italy
export.italy_group = {
	default_container = "อิตาลี",
	default_placetype = "ภูมิภาค",
	data = export.italy_administrative_regions,
}

-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = { ["ไอจิ, ญี่ปุ่น"] = {}, ["อากิตะ, ญี่ปุ่น"] = {}, ["อาโอโมริ, ญี่ปุ่น"] = {}, ["จิบะ, ญี่ปุ่น"] = {}, ["เอฮิเมะ, ญี่ปุ่น"] = {}, ["ฟูกูอิ, ญี่ปุ่น"] = {}, ["ฟูกูโอกะ, ญี่ปุ่น"] = {}, ["ฟูกูชิมะ, ญี่ปุ่น"] = {}, ["กิฟุ, ญี่ปุ่น"] = {}, ["กุมมะ, ญี่ปุ่น"] = {}, ["ฮิโรชิมะ, ญี่ปุ่น"] = {}, ["ฮกไกโด, ญี่ปุ่น"] = {divs = "กิ่งจังหวัด", wp = "ฮกไกโด"}, ["เฮียวโงะ, ญี่ปุ่น"] = {}, --["Hyogo, ญี่ปุ่น"] = {alias_of = "เฮียวโงะ, ญี่ปุ่น", display = true}, ["อิบารากิ, ญี่ปุ่น"] = {}, ["อิชิกาวะ, ญี่ปุ่น"] = {}, ["อิวาเตะ, ญี่ปุ่น"] = {}, ["คางาวะ, ญี่ปุ่น"] = {}, ["คาโงชิมะ, ญี่ปุ่น"] = {}, ["คานางาวะ, ญี่ปุ่น"] = {}, ["โคจิ, ญี่ปุ่น"] = {}, --["Kochi, ญี่ปุ่น"] = {alias_of = "โคจิ, ญี่ปุ่น", display = true}, ["คูมาโมโตะ, ญี่ปุ่น"] = {}, ["เกียวโต, ญี่ปุ่น"] = {}, ["มิเอะ, ญี่ปุ่น"] = {}, ["มิยางิ, ญี่ปุ่น"] = {}, ["มิยาซากิ, ญี่ปุ่น"] = {}, ["นางาโนะ, ญี่ปุ่น"] = {}, ["นางาซากิ, ญี่ปุ่น"] = {}, ["นาระ, ญี่ปุ่น"] = {}, ["นีงาตะ, ญี่ปุ่น"] = {}, ["โออิตะ, ญี่ปุ่น"] = {}, --["Oita, ญี่ปุ่น"] = {alias_of = "โออิตะ, ญี่ปุ่น", display = true}, ["โอกายามะ, ญี่ปุ่น"] = {}, ["โอกินาวะ, ญี่ปุ่น"] = {}, ["โอซากะ, ญี่ปุ่น"] = {}, ["ซางะ, ญี่ปุ่น"] = {}, ["ไซตามะ, ญี่ปุ่น"] = {}, ["ชิงะ, ญี่ปุ่น"] = {}, ["ชิมาเนะ, ญี่ปุ่น"] = {}, ["ชิซูโอกะ, ญี่ปุ่น"] = {}, ["โทจิงิ, ญี่ปุ่น"] = {}, ["โทกูชิมะ, ญี่ปุ่น"] = {}, ["ทตโตริ, ญี่ปุ่น"] = {}, ["โทยามะ, ญี่ปุ่น"] = {}, ["วากายามะ, ญี่ปุ่น"] = {}, ["ยามางาตะ, ญี่ปุ่น"] = {}, ["ยามางูจิ, ญี่ปุ่น"] = {}, ["ยามานาชิ, ญี่ปุ่น"] = {}, } -- prefectures of Japan export.japan_group = { key_to_placename = make_key_to_placename(", ญี่ปุ่น$"), placename_to_key = make_placename_to_key(", ญี่ปุ่น"), default_container = "ญี่ปุ่น", default_placetype = "จังหวัด", default_wp = "จังหวัด%e", data = export.japan_prefectures, } export.laos_provinces = { ["Attapeu Province, Laos"] = {}, ["Bokeo Province, Laos"] = {}, ["Bolikhamxai Province, Laos"] = {}, ["Champasak Province, Laos"] = {}, ["Houaphanh Province, Laos"] = {}, ["Khammouane 
Province, Laos"] = {},
	["Luang Namtha Province, Laos"] = {},
	["Luang Prabang Province, Laos"] = {},
	["Oudomxay Province, Laos"] = {},
	["Phongsaly Province, Laos"] = {},
	["Salavan Province, Laos"] = {},
	["Savannakhet Province, Laos"] = {},
	["Vientiane Province, Laos"] = {},
	["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
	["Sainyabuli Province, Laos"] = {},
	["Sekong Province, Laos"] = {},
	["Xaisomboun Province, Laos"] = {},
	["Xiangkhouang Province, Laos"] = {},
}

-- Convert a placename to a key: "Vientiane Prefecture" and names already ending in " Province" only get the ", Laos"
-- suffix; any other name has " Province, Laos" appended.
local function laos_placename_to_key(placename)
	if placename == "Vientiane Prefecture" then
		return placename .. ", Laos"
	end
	if placename:find(" Province$") then
		return placename .. ", Laos"
	end
	return placename .. " Province, Laos"
end

-- provinces of Laos
export.laos_group = {
	key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
	placename_to_key = laos_placename_to_key,
	default_container = "Laos",
	default_placetype = "จังหวัด",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "%e province",
	data = export.laos_provinces,
}

export.lebanon_governorates = {
	["Akkar Governorate, Lebanon"] = {},
	["Baalbek-Hermel Governorate, Lebanon"] = {},
	["Beirut Governorate, Lebanon"] = {},
	["Beqaa Governorate, Lebanon"] = {},
	["Keserwan-Jbeil Governorate, Lebanon"] = {},
	["Mount Lebanon Governorate, Lebanon"] = {},
	["Nabatieh Governorate, Lebanon"] = {},
	-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
	-- `gov/South Governorate` with `c/Lebanon`.
	["North Governorate, Lebanon"] = {no_auto_augment_container = true},
	["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}

-- governorates of Lebanon
export.lebanon_group = {
	key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
	placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
	default_container = "Lebanon",
	default_placetype = "governorate",
	data = export.lebanon_governorates,
}

export.malaysia_states = {
	["Johor, Malaysia"] = {},
	["Kedah, Malaysia"] = {},
	["Kelantan, Malaysia"] = {},
	["Malacca, Malaysia"] = {},
	["Negeri Sembilan, Malaysia"] = {},
	["Pahang, Malaysia"] = {},
	["Penang, Malaysia"] = {},
	["Perak, Malaysia"] = {},
	["Perlis, Malaysia"] = {},
	["Sabah, Malaysia"] = {},
	["Sarawak, Malaysia"] = {},
	["Selangor, Malaysia"] = {},
	["Terengganu, Malaysia"] = {},
}

-- states of Malaysia
export.malaysia_group = {
	default_container = "Malaysia",
	default_placetype = "รัฐ",
	default_wp = "%l, %c",
	data = export.malaysia_states,
}

export.malta_regions = {
	-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
	-- `r/Northern Region` with `c/Malta`. In particular:
	-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
	--   El Salvador;
	-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
	-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
	--   Serbia and Uganda;
	-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true}, ["Gozo Region, Malta"] = {wp = "%l"}, ["Northern Region, Malta"] = {no_auto_augment_container = true}, ["Port Region, Malta"] = {}, ["Southern Region, Malta"] = {no_auto_augment_container = true}, ["Western Region, Malta"] = {no_auto_augment_container = true}, } -- regions of Malta export.malta_group = { key_to_placename = make_key_to_placename(", Malta$", " Region"), placename_to_key = make_placename_to_key(", Malta", " Region"), default_container = "Malta", default_placetype = "ภูมิภาค", default_wp = "%l, %c", default_the = true, data = export.malta_regions, } export.mexico_states = { ["Aguascalientes, Mexico"] = {}, ["Baja California, Mexico"] = {}, -- not display-canonicalizing because the "Norte" could be for emphasis ["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"}, ["Baja California Sur, Mexico"] = {}, ["Campeche, Mexico"] = {}, ["Chiapas, Mexico"] = {}, ["Chihuahua, Mexico"] = {wp = "%l (รัฐ)"}, ["Coahuila, Mexico"] = {}, ["Colima, Mexico"] = {}, ["Durango, Mexico"] = {}, ["Guanajuato, Mexico"] = {}, ["Guerrero, Mexico"] = {}, ["Hidalgo, Mexico"] = {wp = "%l (รัฐ)"}, ["Jalisco, Mexico"] = {}, ["State of Mexico, Mexico"] = {the = true}, ["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the" -- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city ["Michoacán, Mexico"] = {}, ["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true}, ["Morelos, Mexico"] = {}, ["Nayarit, Mexico"] = {}, ["Nuevo León, Mexico"] = {}, ["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true}, ["Oaxaca, Mexico"] = {}, ["Puebla, Mexico"] = {}, ["Querétaro, Mexico"] = {}, ["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true}, ["Quintana Roo, Mexico"] = {}, ["San Luis Potosí, Mexico"] = {}, ["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true}, ["Sinaloa, Mexico"] = {}, 
["Sonora, Mexico"] = {}, ["Tabasco, Mexico"] = {}, ["Tamaulipas, Mexico"] = {}, ["Tlaxcala, Mexico"] = {}, ["Veracruz, Mexico"] = {}, ["Yucatán, Mexico"] = {}, ["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true}, ["Zacatecas, Mexico"] = {}, } -- Mexican states export.mexico_group = { default_container = "Mexico", default_placetype = "รัฐ", data = export.mexico_states, } export.moldova_districts_and_autonomous_territorial_units = { ["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]] ["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]] ["Briceni District, Moldova"] = {}, -- capital [[Briceni]] ["Cahul District, Moldova"] = {}, -- capital [[Cahul]] ["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]] ["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]] ["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]] ["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]] ["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]] ["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]] ["Drochia District, Moldova"] = {}, -- capital [[Drochia]] ["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]] ["Edineț District, Moldova"] = {}, -- capital [[Edineț]] ["Fălești District, Moldova"] = {}, -- capital [[Fălești]] ["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]] ["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]] ["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]] ["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]] ["Leova District, Moldova"] = {}, -- capital [[Leova]] ["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]] ["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]] ["Orhei District, Moldova"] = {}, -- capital [[Orhei]] ["Rezina District, Moldova"] = {}, -- capital [[Rezina]] ["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]] ["Sîngerei District, Moldova"] = {}, -- capital 
[[Sîngerei]]
	["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
	["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
	["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
	["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
	["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
	["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
	["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
	["Chișinău, Moldova"] = {placetype = "เทศบาล"},
	["Bălți, Moldova"] = {placetype = "เทศบาล"},
	["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Comrat]]
	-- the remainder are under the de-facto control of the unrecognized state of Transnistria
	["Bender, Moldova"] = {placetype = "เทศบาล"},
	["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
	["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Tiraspol]]
	["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
	["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}

-- Convert a placename to a key. If appending ", Moldova" already yields an entry in the table above (e.g. the
-- municipalities and autonomous territorial units, which have no " District" suffix), use that key directly;
-- otherwise add " District" unless the placename already ends in it.
local function moldova_placename_to_key(placename)
	local elliptical_key = placename .. ", Moldova"
	if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
		return elliptical_key
	end
	if placename:find(" District$") then
		return placename .. ", Moldova"
	end
	return placename .. " District, Moldova"
end

-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
	key_to_placename = make_key_to_placename(", Moldova$", " District"),
	placename_to_key = moldova_placename_to_key,
	default_container = "Moldova",
	default_placetype = {"district", "raion"},
	default_divs = "communes",
	data = export.moldova_districts_and_autonomous_territorial_units,
}

export.morocco_regions = {
	["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
	["Oriental, Morocco"] = {wp = "%l (%c)"},
	["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
	["Fez-Meknes, Morocco"] = {},
	["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
	["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
	["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
	["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
	["Casablanca-Settat, Morocco"] = {},
	["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
	["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
	["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
	["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
	["Souss-Massa, Morocco"] = {},
	["Guelmim-Oued Noun, Morocco"] = {
		keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
	},
	["Laayoune-Sakia El Hamra, Morocco"] = {
		wp = "Laâyoune-Sakia El Hamra",
		keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
	},
	["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
	["Dakhla-Oued Ed-Dahab, Morocco"] = {
		keydesc = "+++.
'''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]", }, } -- regions of Morocco export.morocco_group = { default_container = "Morocco", default_placetype = "ภูมิภาค", data = export.morocco_regions, } export.egypt_governorates = { ["Cairo Governorate, Egypt"] = {}, ["Giza Governorate, Egypt"] = {}, ["Sharqia Governorate, Egypt"] = {}, ["Dakahlia Governorate, Egypt"] = {}, ["Beheira Governorate, Egypt"] = {}, ["Minya Governorate, Egypt"] = {}, ["Qalyubia Governorate, Egypt"] = {}, ["Sohag Governorate, Egypt"] = {}, ["Alexandria Governorate, Egypt"] = {}, ["Gharbia Governorate, Egypt"] = {}, ["Asyut Governorate, Egypt"] = {}, ["Monufia Governorate, Egypt"] = {}, ["Faiyum Governorate, Egypt"] = {}, ["Kafr El Sheikh Governorate, Egypt"] = {}, ["Qena Governorate, Egypt"] = {}, ["Beni Suef Governorate, Egypt"] = {}, ["Damietta Governorate, Egypt"] = {}, ["Aswan Governorate, Egypt"] = {}, ["Ismailia Governorate, Egypt"] = {}, ["Luxor Governorate, Egypt"] = {}, ["Suez Governorate, Egypt"] = {}, ["Port Said Governorate, Egypt"] = {}, ["Matrouh Governorate, Egypt"] = {}, ["North Sinai Governorate, Egypt"] = {}, ["Red Sea Governorate, Egypt"] = {}, ["New Valley Governorate, Egypt"] = {}, ["South Sinai Governorate, Egypt"] = {}, } -- governorates of Egypt export.egypt_group = { key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"), placename_to_key = make_placename_to_key(", Egypt", " Governorate"), default_container = "อียิปต์", default_placetype = "governorate", data = export.egypt_governorates, } export.netherlands_provinces = { ["Drenthe, Netherlands"] = {}, ["Flevoland, Netherlands"] = {}, ["Friesland, Netherlands"] = {}, ["Gelderland, Netherlands"] = {}, ["Groningen, Netherlands"] = {wp = "%l (จังหวัด)"}, ["Limburg, Netherlands"] = {wp = "%l (%c)"}, ["North Brabant, Netherlands"] = {}, -- Foreign forms get display-canonicalized. 
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true}, ["North Holland, Netherlands"] = {}, ["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true}, ["Overijssel, Netherlands"] = {}, ["South Holland, Netherlands"] = {}, ["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true}, ["Utrecht, Netherlands"] = {wp = "%l (จังหวัด)"}, ["Zeeland, Netherlands"] = {}, } -- provinces of the Netherlands export.netherlands_group = { default_container = "เนเธอร์แลนด์", default_placetype = "จังหวัด", default_divs = "เทศบาล", data = export.netherlands_provinces, } export.new_zealand_regions = { -- North Island regions ["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]] ["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]] ["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]] ["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]] ["Gisborne, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]] ["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]] ["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]] ["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]] ["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]] -- South Island regions 
["Tasman, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]] ["Nelson, New Zealand"] = {placetype = {"ภูมิภาค", "นคร"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]] ["Marlborough, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]] ["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]] ["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]] ["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]] ["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]] } -- regions of New Zealand export.new_zealand_group = { default_container = "New Zealand", default_placetype = "ภูมิภาค", data = export.new_zealand_regions, } export.nigeria_states = { ["Abia State, Nigeria"] = {}, ["Adamawa State, Nigeria"] = {}, ["Akwa Ibom State, Nigeria"] = {}, ["Anambra State, Nigeria"] = {}, ["Bauchi State, Nigeria"] = {}, ["Bayelsa State, Nigeria"] = {}, ["Benue State, Nigeria"] = {}, ["Borno State, Nigeria"] = {}, ["Cross River State, Nigeria"] = {}, ["Delta State, Nigeria"] = {}, ["Ebonyi State, Nigeria"] = {}, ["Edo State, Nigeria"] = {}, ["Ekiti State, Nigeria"] = {}, ["Enugu State, Nigeria"] = {}, ["Federal Capital Territory, Nigeria"] = { -- not a state but allow it to be referenced as one in holonyms placetype = {"federal territory", "ดินแดน", "รัฐ"}, the = true, wp = "%l (%c)", }, ["Gombe State, Nigeria"] = {}, ["Imo State, Nigeria"] = {}, ["Jigawa State, Nigeria"] = {}, ["Kaduna State, Nigeria"] = {}, ["Kano State, Nigeria"] = {}, ["Katsina State, Nigeria"] = {}, ["Kebbi State, Nigeria"] = {}, ["Kogi State, 
Nigeria"] = {}, ["Kwara State, Nigeria"] = {}, ["Lagos State, Nigeria"] = {}, ["Nasarawa State, Nigeria"] = {}, ["Niger State, Nigeria"] = {}, ["Ogun State, Nigeria"] = {}, ["Ondo State, Nigeria"] = {}, ["Osun State, Nigeria"] = {}, ["Oyo State, Nigeria"] = {}, ["Plateau State, Nigeria"] = {}, ["Rivers State, Nigeria"] = {}, ["Sokoto State, Nigeria"] = {}, ["Taraba State, Nigeria"] = {}, ["Yobe State, Nigeria"] = {}, ["Zamfara State, Nigeria"] = {}, } -- states of Nigeria export.nigeria_group = { key_to_placename = make_key_to_placename(", Nigeria$", " State$"), placename_to_key = make_placename_to_key(", Nigeria", " State"), default_container = "Nigeria", default_placetype = "รัฐ", data = export.nigeria_states, } export.north_korea_provinces = { ["Chagang Province, North Korea"] = {}, ["North Hamgyong Province, North Korea"] = {}, ["South Hamgyong Province, North Korea"] = {}, ["North Hwanghae Province, North Korea"] = {}, ["South Hwanghae Province, North Korea"] = {}, ["Kangwon Province, North Korea"] = {wp = "%l (%c)"}, ["North Pyongan Province, North Korea"] = {}, ["South Pyongan Province, North Korea"] = {}, ["Ryanggang Province, North Korea"] = {}, } -- provinces of North Korea export.north_korea_group = { key_to_placename = make_key_to_placename(", North Korea$", " Province$"), placename_to_key = make_placename_to_key(", North Korea", " Province"), default_container = "North Korea", default_placetype = "จังหวัด", data = export.north_korea_provinces, } export.norwegian_counties = { ["Oslo, Norway"] = {}, ["Rogaland, Norway"] = {}, ["Møre og Romsdal, Norway"] = {}, ["Nordland, Norway"] = {}, ["Østfold, Norway"] = {}, ["Akershus, Norway"] = {}, ["Buskerud, Norway"] = {}, -- the following two were merged into Innlandet -- ["Hedmark, Norway"] = {}, -- ["Oppland, Norway"] = {}, ["Innlandet, Norway"] = {}, ["Vestfold, Norway"] = {}, ["Telemark, Norway"] = {}, -- the following two were merged into Agder -- ["Aust-Agder, Norway"] = {}, -- ["Vest-Agder, Norway"] = {}, 
["Agder, Norway"] = {}, -- the following two were merged into Vestland -- ["Hordaland, Norway"] = {}, -- ["Sogn og Fjordane, Norway"] = {}, ["Vestland, Norway"] = {}, ["Trøndelag, Norway"] = {}, ["Troms, Norway"] = {}, ["Finnmark, Norway"] = {}, } -- counties of Norway export.norway_group = { default_container = "Norway", default_placetype = "เทศมณฑล", data = export.norwegian_counties, } export.pakistan_provinces_and_territories = { ["Azad Kashmir, Pakistan"] = { placetype = {"administrative territory", "autonomous territory", "ดินแดน"}, }, ["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true}, ["Balochistan, Pakistan"] = {wp = "%l, %c"}, ["Gilgit-Baltistan, Pakistan"] = { placetype = {"administrative territory", "ดินแดน"}, }, ["Islamabad Capital Territory, Pakistan"] = { the = true, divs = {}, -- no divisions placetype = {"federal territory", "administrative territory", "ดินแดน"}, }, -- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes ["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"}, ["Khyber Pakhtunkhwa, Pakistan"] = {}, ["Punjab, Pakistan"] = {wp = "%l, %c"}, ["Sindh, Pakistan"] = {}, } -- provinces and territories of Pakistan export.pakistan_group = { default_container = "Pakistan", default_placetype = "จังหวัด", default_divs = "divisions", data = export.pakistan_provinces_and_territories, } export.philippines_provinces = { ["Abra, Philippines"] = {wp = "%l (จังหวัด)"}, ["Agusan del Norte, Philippines"] = {}, ["Agusan del Sur, Philippines"] = {}, ["Aklan, Philippines"] = {}, ["Albay, Philippines"] = {}, ["Antique, Philippines"] = {wp = "%l (จังหวัด)"}, ["Apayao, Philippines"] = {}, ["Aurora, Philippines"] = {wp = "%l (จังหวัด)"}, ["Basilan, Philippines"] = {}, ["Bataan, Philippines"] = {}, ["Batanes, Philippines"] = {}, ["Batangas, Philippines"] = {}, ["Benguet, Philippines"] = {}, ["Biliran, Philippines"] = {}, ["Bohol, Philippines"] = {}, ["Bukidnon, 
Philippines"] = {}, ["Bulacan, Philippines"] = {}, ["Cagayan, Philippines"] = {}, ["Camarines Norte, Philippines"] = {}, ["Camarines Sur, Philippines"] = {}, ["Camiguin, Philippines"] = {}, ["Capiz, Philippines"] = {}, ["Catanduanes, Philippines"] = {}, ["Cavite, Philippines"] = {}, ["Cebu, Philippines"] = {}, ["Cotabato, Philippines"] = {}, ["Davao de Oro, Philippines"] = {}, ["Davao del Norte, Philippines"] = {}, ["Davao del Sur, Philippines"] = {}, ["Davao Occidental, Philippines"] = {}, ["Davao Oriental, Philippines"] = {}, ["Dinagat Islands, Philippines"] = {the = true}, ["Eastern Samar, Philippines"] = {}, ["Guimaras, Philippines"] = {}, ["Ifugao, Philippines"] = {}, ["Ilocos Norte, Philippines"] = {}, ["Ilocos Sur, Philippines"] = {}, ["Iloilo, Philippines"] = {}, ["Isabela, Philippines"] = {wp = "%l (จังหวัด)"}, ["Kalinga, Philippines"] = {wp = "%l (จังหวัด)"}, ["La Union, Philippines"] = {}, ["Laguna, Philippines"] = {wp = "%l (จังหวัด)"}, ["Lanao del Norte, Philippines"] = {}, ["Lanao del Sur, Philippines"] = {}, ["Leyte, Philippines"] = {wp = "%l (จังหวัด)"}, ["Maguindanao del Norte, Philippines"] = {}, ["Maguindanao del Sur, Philippines"] = {}, ["Marinduque, Philippines"] = {}, ["Masbate, Philippines"] = {}, ["Misamis Occidental, Philippines"] = {}, ["Misamis Oriental, Philippines"] = {}, ["Mountain Province, Philippines"] = {}, ["Negros Occidental, Philippines"] = {}, ["Negros Oriental, Philippines"] = {}, ["Northern Samar, Philippines"] = {}, ["Nueva Ecija, Philippines"] = {}, ["Nueva Vizcaya, Philippines"] = {}, ["Occidental Mindoro, Philippines"] = {}, ["Oriental Mindoro, Philippines"] = {}, ["Palawan, Philippines"] = {}, ["Pampanga, Philippines"] = {}, ["Pangasinan, Philippines"] = {}, ["Quezon, Philippines"] = {}, ["Quirino, Philippines"] = {}, ["Rizal, Philippines"] = {wp = "%l (จังหวัด)"}, ["Romblon, Philippines"] = {}, ["Samar, Philippines"] = {wp = "%l (จังหวัด)"}, ["Sarangani, Philippines"] = {}, ["Siquijor, Philippines"] = {}, ["Sorsogon, 
Philippines"] = {}, ["South Cotabato, Philippines"] = {}, ["Southern Leyte, Philippines"] = {}, ["Sultan Kudarat, Philippines"] = {}, ["Sulu, Philippines"] = {}, ["Surigao del Norte, Philippines"] = {}, ["Surigao del Sur, Philippines"] = {}, ["Tarlac, Philippines"] = {}, ["Tawi-Tawi, Philippines"] = {}, ["Zambales, Philippines"] = {}, ["Zamboanga del Norte, Philippines"] = {}, ["Zamboanga del Sur, Philippines"] = {}, ["Zamboanga Sibugay, Philippines"] = {}, -- not a province but treated as one; allow it to be referred to as a province in holonyms ["Metro Manila, Philippines"] = {placetype = {"ภูมิภาค", "จังหวัด"}}, } -- provinces of the Philippines export.philippines_group = { default_container = "Philippines", default_placetype = "จังหวัด", default_divs = {"เทศบาล", "barangays"}, data = export.philippines_provinces, } export.poland_voivodeships = { ["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław ["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal) ["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin ["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal) ["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź ["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true}, ["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków ["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw ["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole ["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów ["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok ["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk ["Silesian Voivodeship, 
Poland"] = {}, -- abbr SL, code 24, capital Katowice
	["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
	["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
	["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
	["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
	["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}

-- voivodeships of Poland
export.poland_group = {
	key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
	placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
	default_container = "Poland",
	default_placetype = "voivodeship",
	default_divs = {
		-- "เทศมณฑล", -- not enough of them currently
		{type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}},
	},
	data = export.poland_voivodeships,
}

export.portugal_districts_and_autonomous_regions = {
	["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "ภูมิภาค"}},
	["Aveiro District, Portugal"] = {},
	["Beja District, Portugal"] = {},
	["Braga District, Portugal"] = {},
	["Bragança District, Portugal"] = {},
	["Castelo Branco District, Portugal"] = {},
	["Coimbra District, Portugal"] = {},
	["Évora District, Portugal"] = {},
	["Faro District, Portugal"] = {},
	["Guarda District, Portugal"] = {},
	["Leiria District, Portugal"] = {},
	["Lisbon District, Portugal"] = {},
	["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
	["Madeira, Portugal"] = {placetype = {"autonomous region", "ภูมิภาค"}},
	["Portalegre District, Portugal"] = {},
	["Porto District, Portugal"] = {},
	["Santarém District, Portugal"] = {},
	["Setúbal District, Portugal"] = {},
	["Viana do Castelo District, Portugal"] = {},
	["Vila Real District, Portugal"] = {},
	["Viseu District, Portugal"] = {},
}

-- Convert a placename to its key. The two autonomous regions take no " District" suffix; any other placename
-- gets " District" appended unless it already ends in it:
--   "Azores"        -> "Azores, Portugal"
--   "Faro District" -> "Faro District, Portugal"
--   "Faro"          -> "Faro District, Portugal"
local function portugal_placename_to_key(placename)
	if placename == "Azores" or placename == "Madeira" then
		return placename .. ", Portugal"
	end
	if placename:find(" District$") then
		return placename .. ", Portugal"
	end
	return placename .. " District, Portugal"
end

-- districts and autonomous regions of Portugal
export.portugal_group = {
	key_to_placename = make_key_to_placename(", Portugal$", " District$"),
	placename_to_key = portugal_placename_to_key,
	default_container = "Portugal",
	default_placetype = "district",
	default_divs = "เทศบาล",
	data = export.portugal_districts_and_autonomous_regions,
}

export.romania_counties = {
	["Alba County, Romania"] = {},
	["Arad County, Romania"] = {},
	["Argeș County, Romania"] = {},
	["Bacău County, Romania"] = {},
	["Bihor County, Romania"] = {},
	["Bistrița-Năsăud County, Romania"] = {},
	["Botoșani County, Romania"] = {},
	["Brașov County, Romania"] = {},
	["Brăila County, Romania"] = {},
	-- Bucharest: not in a county
	["Buzău County, Romania"] = {},
	["Caraș-Severin County, Romania"] = {},
	["Cluj County, Romania"] = {},
	["Constanța County, Romania"] = {},
	["Covasna County, Romania"] = {},
	["Călărași County, Romania"] = {},
	["Dolj County, Romania"] = {},
	["Dâmbovița County, Romania"] = {},
	["Galați County, Romania"] = {},
	["Giurgiu County, Romania"] = {},
	["Gorj County, Romania"] = {},
	["Harghita County, Romania"] = {},
	["Hunedoara County, Romania"] = {},
	["Ialomița County, Romania"] = {},
	["Iași County, Romania"] = {},
	["Ilfov County, Romania"] = {},
	["Maramureș County, Romania"] = {},
	["Mehedinți County, Romania"] = {},
	["Mureș County, Romania"] = {},
	["Neamț County, Romania"] = {},
	["Olt County, Romania"] = {},
	["Prahova County, Romania"] = {},
	["Satu Mare County, Romania"] = {},
	["Sibiu County, Romania"] = {},
	["Suceava County, Romania"] = {},
	["Sălaj County, Romania"] = {},
	["Teleorman County, Romania"] = {},
	["Timiș County, Romania"] = {},
	["Tulcea County, Romania"] = {},
	["Vaslui County, Romania"] = {},
	["Vrancea County, Romania"] = {},
	["Vâlcea County, Romania"] = {},
}

-- counties of Romania
export.romania_group = {
	key_to_placename = make_key_to_placename(", Romania$", " County$"),
	placename_to_key = make_placename_to_key(", Romania", " County"),
	default_container = "Romania",
	default_placetype = "เทศมณฑล",
	default_divs = "communes",
	data = export.romania_counties,
}

-- Build the spec for a federal subject whose placetype is `spectype`. `use_the` is coerced to a boolean, and the
-- type-specific category parent is formed by appending "s" to `spectype`. `wp` is passed through unchanged.
local function make_russia_federal_subject_spec(spectype, use_the, wp)
	return {
		placetype = spectype,
		the = not not use_the,
		bare_category_parent_type = {"federal subjects", spectype .. "s"},
		wp = wp,
	}
end

local russia_autonomous_okrug_no_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}, the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")

export.russia_federal_subjects = {
	-- autonomous oblasts
	["Jewish Autonomous Oblast, Russia"] = {the = true, placetype = {"autonomous oblast", "oblast"}, bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
	-- autonomous okrugs
	["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
	["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"},
	["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
	["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
	["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
	["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
	["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
	["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"},
	["Yamalo-Nenets Autonomous Okrug, Russia"] =
russia_autonomous_okrug_the, ["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"}, -- krais ["Altai Krai, Russia"] = russia_krai, ["Kamchatka Krai, Russia"] = russia_krai, ["Khabarovsk Krai, Russia"] = russia_krai, ["Krasnodar Krai, Russia"] = russia_krai, ["Krasnoyarsk Krai, Russia"] = russia_krai, ["Perm Krai, Russia"] = russia_krai, ["Primorsky Krai, Russia"] = russia_krai, ["Stavropol Krai, Russia"] = russia_krai, ["Zabaykalsky Krai, Russia"] = russia_krai, -- oblasts ["Amur Oblast, Russia"] = russia_oblast, ["Arkhangelsk Oblast, Russia"] = russia_oblast, ["Astrakhan Oblast, Russia"] = russia_oblast, ["Belgorod Oblast, Russia"] = russia_oblast, ["Bryansk Oblast, Russia"] = russia_oblast, ["Chelyabinsk Oblast, Russia"] = russia_oblast, ["Irkutsk Oblast, Russia"] = russia_oblast, ["Ivanovo Oblast, Russia"] = russia_oblast, ["Kaliningrad Oblast, Russia"] = russia_oblast, ["Kaluga Oblast, Russia"] = russia_oblast, ["Kemerovo Oblast, Russia"] = russia_oblast, ["Kirov Oblast, Russia"] = russia_oblast, ["Kostroma Oblast, Russia"] = russia_oblast, ["Kurgan Oblast, Russia"] = russia_oblast, ["Kursk Oblast, Russia"] = russia_oblast, ["Leningrad Oblast, Russia"] = russia_oblast, ["Lipetsk Oblast, Russia"] = russia_oblast, ["Magadan Oblast, Russia"] = russia_oblast, ["Moscow Oblast, Russia"] = russia_oblast, ["Murmansk Oblast, Russia"] = russia_oblast, ["Nizhny Novgorod Oblast, Russia"] = russia_oblast, ["Novgorod Oblast, Russia"] = russia_oblast, ["Novosibirsk Oblast, Russia"] = russia_oblast, ["Omsk Oblast, Russia"] = russia_oblast, ["Orenburg Oblast, Russia"] = russia_oblast, ["Oryol Oblast, Russia"] = russia_oblast, ["Penza Oblast, Russia"] = russia_oblast, ["Pskov Oblast, Russia"] = russia_oblast, ["Rostov Oblast, Russia"] = russia_oblast, ["Ryazan Oblast, Russia"] = russia_oblast, ["Sakhalin Oblast, Russia"] = russia_oblast, ["Samara Oblast, Russia"] = russia_oblast, ["Saratov Oblast, Russia"] = russia_oblast, ["Smolensk Oblast, Russia"] = 
russia_oblast, ["Sverdlovsk Oblast, Russia"] = russia_oblast, ["Tambov Oblast, Russia"] = russia_oblast, ["Tomsk Oblast, Russia"] = russia_oblast, ["Tula Oblast, Russia"] = russia_oblast, ["Tver Oblast, Russia"] = russia_oblast, ["Tyumen Oblast, Russia"] = russia_oblast, ["Ulyanovsk Oblast, Russia"] = russia_oblast, ["Vladimir Oblast, Russia"] = russia_oblast, ["Volgograd Oblast, Russia"] = russia_oblast, ["Vologda Oblast, Russia"] = russia_oblast, ["Voronezh Oblast, Russia"] = russia_oblast, ["Yaroslavl Oblast, Russia"] = russia_oblast, -- republics -- -- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where -- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by -- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence -- of "the". ["Adygea, Russia"] = russia_republic_no_the, ["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true}, ["Bashkortostan, Russia"] = russia_republic_no_the, ["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true}, ["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"}, ["Buryatia, Russia"] = russia_republic_no_the, ["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true}, ["Dagestan, Russia"] = russia_republic_no_the, ["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true}, ["Ingushetia, Russia"] = russia_republic_no_the, ["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true}, ["Kalmykia, Russia"] = russia_republic_no_the, ["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true}, ["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"), ["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true}, ["Khakassia, Russia"] = russia_republic_no_the, ["Republic of 
Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true}, ["Mordovia, Russia"] = russia_republic_no_the, ["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true}, ["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash ["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true}, ["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Tatarstan, Russia"] = russia_republic_no_the, ["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true}, ["Altai Republic, Russia"] = russia_republic_the, ["Chechnya, Russia"] = russia_republic_no_the, ["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true}, ["Chuvashia, Russia"] = russia_republic_no_the, ["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true}, ["Kabardino-Balkaria, Russia"] = russia_republic_no_the, ["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true}, ["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true}, ["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = "Kabardino-Balkarian Republic, Russia", the = true}, ["Karachay-Cherkessia, Russia"] = russia_republic_no_the, ["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia"}, ["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"), ["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true}, ["Mari El, Russia"] = russia_republic_no_the, ["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true}, ["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"), ["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true}, 
	["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
	["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
	["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia", the = true},
	["Tuva, Russia"] = russia_republic_no_the,
	["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
	["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
	["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
	["Udmurtia, Russia"] = russia_republic_no_the,
	["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
	-- Not included due to being unrecognized and only partly controlled:
	-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
	-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
	-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
	-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
	-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
	-- There are also federal cities (not included because they're cities):
	-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}

-- Return the full and the elliptical placename for a key. A trailing " Krai" or " Oblast" is dropped from the
-- elliptical form, except for the Jewish Autonomous Oblast, whose two forms are identical.
local function russia_key_to_placename(key)
	key = key:gsub(",.*", "")
	local full_placename = key
	if key == "Jewish Autonomous Oblast" then
		return full_placename, full_placename
	end
	local elliptical_placename
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		elliptical_placename = key:match("^(.*) " .. suffix .. "$")
		if elliptical_placename then
			return full_placename, elliptical_placename
		end
	end
	return full_placename, full_placename
end

-- Convert a placename to its key. The placename is first looked up as given, then with " Krai" and " Oblast"
-- appended in turn; a placename that matches nothing still yields a key ending in ", Russia".
local function russia_placename_to_key(placename)
	local key = placename .. ", Russia"
	if export.russia_federal_subjects[key] then
		return key
	end
	-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		local suffixed_key = placename .. " " .. suffix .. ", Russia"
		if export.russia_federal_subjects[suffixed_key] then
			return suffixed_key
		end
	end
	return placename .. ", Russia"
end

-- Construct the description of a federal subject, using the first placetype when several are listed.
local function construct_russia_federal_subject_keydesc(group, key, spec)
	local placename = key:gsub(",.*", "")
	local linked_placename = export.construct_linked_placename(spec, placename)
	local placetype = spec.placetype
	if type(placetype) == "table" then
		placetype = placetype[1]
	end
	if placetype == "oblast" then
		-- Hack: Oblasts generally don't have entries under "Foo Oblast"
		-- but just under "Foo", so fix the linked key appropriately;
		-- doesn't apply to the Jewish Autonomous Oblast
		linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
	end
	return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end

-- federal subjects of Russia
export.russia_group = {
	key_to_placename = russia_key_to_placename,
	placename_to_key = russia_placename_to_key,
	default_container = "Russia",
	default_keydesc = construct_russia_federal_subject_keydesc,
	default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
	data = export.russia_federal_subjects,
}

export.saudi_arabia_provinces = {
	["Riyadh Province, Saudi Arabia"] = {},
	["Mecca Province, Saudi Arabia"] = {},
	-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"}, ["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"}, ["Aseer Province, Saudi Arabia"] = {wp = "Asir"}, ["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true}, ["Jazan Province, Saudi Arabia"] = {}, ["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"}, ["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true}, ["Tabuk Province, Saudi Arabia"] = {}, ["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"}, ["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"}, ["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true}, ["Najran Province, Saudi Arabia"] = {}, ["Northern Borders Province, Saudi Arabia"] = {}, ["Al-Bahah Province, Saudi Arabia"] = {}, } -- provinces of Saudi Arabia export.saudi_arabia_group = { key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"), placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"), default_container = "Saudi Arabia", default_placetype = "จังหวัด", data = export.saudi_arabia_provinces, } export.south_africa_provinces = { ["Eastern Cape, South Africa"] = {the = true}, ["Free State, South Africa"] = {the = true, wp = "%l (จังหวัด)"}, ["Gauteng, South Africa"] = {}, ["KwaZulu-Natal, South Africa"] = {}, ["Limpopo, South Africa"] = {}, ["Mpumalanga, South Africa"] = {}, -- per Wikipedia and other sources, `North West` doesn't normally have `the` before it ["North West, South Africa"] = {wp = "%l (South African province)"}, ["Northern Cape, South Africa"] = {the = true}, ["Western Cape, South Africa"] = {the = true}, } -- provinces of South Africa export.south_africa_group = { 
default_container = "South Africa", default_placetype = "จังหวัด", default_divs = "เทศบาล", data = export.south_africa_provinces, } export.south_korea_provinces = { ["North Chungcheong Province, South Korea"] = {}, ["South Chungcheong Province, South Korea"] = {}, ["Gangwon Province, South Korea"] = {wp = "%l, %c"}, ["Gyeonggi Province, South Korea"] = {}, ["North Gyeongsang Province, South Korea"] = {}, ["South Gyeongsang Province, South Korea"] = {}, ["North Jeolla Province, South Korea"] = {}, ["South Jeolla Province, South Korea"] = {}, ["Jeju Province, South Korea"] = {}, } -- provinces of South Korea export.south_korea_group = { key_to_placename = make_key_to_placename(", South Korea$", " Province$"), placename_to_key = make_placename_to_key(", South Korea", " Province"), default_container = "South Korea", default_placetype = "จังหวัด", data = export.south_korea_provinces, } export.spain_autonomous_communities = { ["Andalusia, Spain"] = {}, ["Aragon, Spain"] = {}, ["Asturias, Spain"] = {}, ["Balearic Islands, Spain"] = {the = true}, ["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"}, ["Canary Islands, Spain"] = {the = true}, ["Cantabria, Spain"] = {}, ["Castile and León, Spain"] = {}, ["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash ["Catalonia, Spain"] = {}, ["Community of Madrid, Spain"] = {the = true}, ["Extremadura, Spain"] = {}, ["Galicia, Spain"] = {wp = "%l (Spain)"}, ["La Rioja, Spain"] = {}, ["Murcia, Spain"] = {wp = "Region of %l"}, ["Navarre, Spain"] = {}, ["Valencia, Spain"] = {wp = "Valencian Community"}, ["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true}, } -- autonomous communities of Spain export.spain_group = { default_container = "Spain", default_placetype = "autonomous community", default_divs = {"เทศบาล", "comarcas"}, data = export.spain_autonomous_communities, } export.taiwan_counties = { ["จางฮว่า, ไต้หวัน"] = {}, ["เจียอี้, ไต้หวัน"] = {}, ["ซินจู๋, ไต้หวัน"] = 
{},
	["ฮวาเหลียน, ไต้หวัน"] = {},
	["จินเหมิน, ไต้หวัน"] = {wp = "หมู่เกาะจินเหมิน"},
	["เหลียนเจียง, ไต้หวัน"] = {wp = "หมู่เกาะหมาจู่"},
	["เหมียวลี่, ไต้หวัน"] = {},
	["หนานโถว, ไต้หวัน"] = {},
	["เผิงหู, ไต้หวัน"] = {wp = "เผิงหู"},
	["ผิงตง, ไต้หวัน"] = {},
	["ไถตง, ไต้หวัน"] = {},
	["อี๋หลาน, ไต้หวัน"] = {wp = "%l, %c"},
	["ยฺหวินหลิน, ไต้หวัน"] = {},
}

-- counties of Taiwan
export.taiwan_group = {
	key_to_placename = make_key_to_placename(", ไต้หวัน$"),
	placename_to_key = make_placename_to_key(", ไต้หวัน"),
	default_container = "ไต้หวัน",
	default_placetype = "เทศมณฑล",
	default_divs = {"อำเภอ", "townships"},
	data = export.taiwan_counties,
}

export.thailand_provinces = {
	-- no need to add "จังหวัด" ("province") to the keys
	-- กรุงเทพมหานคร (Bangkok - special administrative area)
	["อำนาจเจริญ, ไทย"] = {},
	["อ่างทอง, ไทย"] = {},
	["บึงกาฬ, ไทย"] = {},
	["บุรีรัมย์, ไทย"] = {},
	["ฉะเชิงเทรา, ไทย"] = {},
	["ชัยนาท, ไทย"] = {},
	["ชัยภูมิ, ไทย"] = {},
	["จันทบุรี, ไทย"] = {},
	["เชียงใหม่, ไทย"] = {},
	["เชียงราย, ไทย"] = {},
	["ชลบุรี, ไทย"] = {},
	["ชุมพร, ไทย"] = {},
	["กาฬสินธุ์, ไทย"] = {},
	["กำแพงเพชร, ไทย"] = {},
	["กาญจนบุรี, ไทย"] = {},
	["ขอนแก่น, ไทย"] = {},
	["กระบี่, ไทย"] = {},
	["ลำปาง, ไทย"] = {},
	["ลำพูน, ไทย"] = {},
	["เลย, ไทย"] = {},
	["ลพบุรี, ไทย"] = {},
	["แม่ฮ่องสอน, ไทย"] = {},
	["มหาสารคาม, ไทย"] = {},
	["มุกดาหาร, ไทย"] = {},
	["นครนายก, ไทย"] = {},
	["นครปฐม, ไทย"] = {},
	["นครพนม, ไทย"] = {},
	["นครราชสีมา, ไทย"] = {},
	["นครสวรรค์, ไทย"] = {},
	["นครศรีธรรมราช, ไทย"] = {},
	["น่าน, ไทย"] = {},
	["นราธิวาส, ไทย"] = {},
	["หนองบัวลำภู, ไทย"] = {},
	["หนองคาย, ไทย"] = {},
	["นนทบุรี, ไทย"] = {},
	["ปทุมธานี, ไทย"] = {},
	["ปัตตานี, ไทย"] = {},
	["พังงา, ไทย"] = {},
	["พัทลุง, ไทย"] = {},
	["พะเยา, ไทย"] = {},
	["เพชรบูรณ์, ไทย"] = {},
	["เพชรบุรี, ไทย"] = {},
	["พิจิตร, ไทย"] = {},
	["พิษณุโลก, ไทย"] = {},
	["พระนครศรีอยุธยา, ไทย"] = {},
	["แพร่, ไทย"] = {},
	["ภูเก็ต, ไทย"] = {},
	["ปราจีนบุรี, ไทย"] = {},
	["ประจวบคีรีขันธ์, ไทย"] = {},
	["ระนอง, ไทย"] = {},
	["ราชบุรี, ไทย"] = {},
	["ระยอง, ไทย"] = {},
	["ร้อยเอ็ด, ไทย"] = {},
	["สระแก้ว, ไทย"] = {},
	["สกลนคร, ไทย"] = {},
	["สมุทรปราการ, ไทย"] = {},
	["สมุทรสาคร, ไทย"] = {},
	["สมุทรสงคราม, ไทย"] = {},
	["สระบุรี, ไทย"] = {},
	["สตูล, ไทย"] = {},
	["สิงห์บุรี, ไทย"] = {},
	["ศรีสะเกษ, ไทย"] = {},
	["สงขลา, ไทย"] = {},
	["สุโขทัย, ไทย"] = {},
	["สุพรรณบุรี, ไทย"] = {},
	["สุราษฎร์ธานี, ไทย"] = {},
	["สุรินทร์, ไทย"] = {},
	["ตาก, ไทย"] = {},
	["ตรัง, ไทย"] = {},
	["ตราด, ไทย"] = {},
	["อุบลราชธานี, ไทย"] = {},
	["อุดรธานี, ไทย"] = {},
	["อุทัยธานี, ไทย"] = {},
	["อุตรดิตถ์, ไทย"] = {},
	["ยะลา, ไทย"] = {},
	["ยโสธร, ไทย"] = {},
}

-- provinces of Thailand
export.thailand_group = {
	key_to_placename = make_key_to_placename(", ไทย$"), -- no need to add "จังหวัด"
	placename_to_key = make_placename_to_key(", ไทย"),
	default_container = "ไทย",
	default_placetype = "จังหวัด",
	default_divs = "อำเภอ",
	-- Wikipedia article titles consist of "จังหวัด" followed by the province name.
	default_wp = "จังหวัด%e",
	data = export.thailand_provinces,
}

export.turkey_provinces = {
	["Adana Province, Turkey"] = {}, -- code 01
	["Adıyaman Province, Turkey"] = {}, -- code 02
	["Afyonkarahisar Province, Turkey"] = {}, -- code 03
	["Ağrı Province, Turkey"] = {}, -- code 04
	["Amasya Province, Turkey"] = {}, -- code 05
	["Ankara Province, Turkey"] = {}, -- code 06
	["Antalya Province, Turkey"] = {}, -- code 07
	["Artvin Province, Turkey"] = {}, -- code 08
	["Aydın Province, Turkey"] = {}, -- code 09
	["Balıkesir Province, Turkey"] = {}, -- code 10
	["Bilecik Province, Turkey"] = {}, -- code 11
	["Bingöl Province, Turkey"] = {}, -- code 12
	["Bitlis Province, Turkey"] = {}, -- code 13
	["Bolu Province, Turkey"] = {}, -- code 14
	["Burdur Province, Turkey"] = {}, -- code 15
	["Bursa Province, Turkey"] = {}, -- code 16
	["Çanakkale Province, Turkey"] = {}, -- code 17
	["Çankırı Province, Turkey"] = {}, -- code 18
	["Çorum Province, Turkey"] = {}, -- code 19
	["Denizli Province, Turkey"] = {}, -- code 20
	["Diyarbakır Province, Turkey"] = {}, -- code 21
	["Edirne Province, Turkey"] = {}, -- code 22
["Elazığ Province, Turkey"] = {}, -- code 23 ["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true}, ["Erzincan Province, Turkey"] = {}, -- code 24 ["Erzurum Province, Turkey"] = {}, -- code 25 ["Eskişehir Province, Turkey"] = {}, -- code 26 ["Gaziantep Province, Turkey"] = {}, -- code 27 ["Giresun Province, Turkey"] = {}, -- code 28 ["Gümüşhane Province, Turkey"] = {}, -- code 29 ["Hakkâri Province, Turkey"] = {}, -- code 30 ["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true}, ["Hatay Province, Turkey"] = {}, -- code 31 ["Isparta Province, Turkey"] = {}, -- code 32 ["Mersin Province, Turkey"] = {}, -- code 33 -- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself ["İzmir Province, Turkey"] = {}, -- code 35 ["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true}, ["Kars Province, Turkey"] = {}, -- code 36 ["Kastamonu Province, Turkey"] = {}, -- code 37 ["Kayseri Province, Turkey"] = {}, -- code 38 ["Kırklareli Province, Turkey"] = {}, -- code 39 ["Kırşehir Province, Turkey"] = {}, -- code 40 ["Kocaeli Province, Turkey"] = {}, -- code 41 ["Konya Province, Turkey"] = {}, -- code 42 ["Kütahya Province, Turkey"] = {}, -- code 43 ["Malatya Province, Turkey"] = {}, -- code 44 ["Manisa Province, Turkey"] = {}, -- code 45 ["Kahramanmaraş Province, Turkey"] = {}, -- code 46 ["Mardin Province, Turkey"] = {}, -- code 47 ["Muğla Province, Turkey"] = {}, -- code 48 ["Muş Province, Turkey"] = {}, -- code 49 ["Nevşehir Province, Turkey"] = {}, -- code 50 ["Niğde Province, Turkey"] = {}, -- code 51 ["Ordu Province, Turkey"] = {}, -- code 52 ["Rize Province, Turkey"] = {}, -- code 53 ["Sakarya Province, Turkey"] = {}, -- code 54 ["Samsun Province, Turkey"] = {}, -- code 55 ["Siirt Province, Turkey"] = {}, -- code 56 ["Sinop Province, Turkey"] = {}, -- code 57 ["Sivas Province, Turkey"] = {}, -- code 58 ["Tekirdağ Province, Turkey"] = {}, -- code 59 
["Tokat Province, Turkey"] = {}, -- code 60 ["Trabzon Province, Turkey"] = {}, -- code 61 ["Tunceli Province, Turkey"] = {}, -- code 62 ["Şanlıurfa Province, Turkey"] = {}, -- code 63 ["Uşak Province, Turkey"] = {}, -- code 64 ["Van Province, Turkey"] = {}, -- code 65 ["Yozgat Province, Turkey"] = {}, -- code 66 ["Zonguldak Province, Turkey"] = {}, -- code 67 ["Aksaray Province, Turkey"] = {}, -- code 68 ["Bayburt Province, Turkey"] = {}, -- code 69 ["Karaman Province, Turkey"] = {}, -- code 70 ["Kırıkkale Province, Turkey"] = {}, -- code 71 ["Batman Province, Turkey"] = {}, -- code 72 ["Şırnak Province, Turkey"] = {}, -- code 73 ["Bartın Province, Turkey"] = {}, -- code 74 ["Ardahan Province, Turkey"] = {}, -- code 75 ["Iğdır Province, Turkey"] = {}, -- code 76 ["Yalova Province, Turkey"] = {}, -- code 77 ["Karabük Province, Turkey"] = {}, -- code 78 ["Kilis Province, Turkey"] = {}, -- code 79 ["Osmaniye Province, Turkey"] = {}, -- code 80 ["Düzce Province, Turkey"] = {}, -- code 81 } -- provinces of Turkey export.turkey_group = { key_to_placename = make_key_to_placename(", Turkey$", " Province$"), placename_to_key = make_placename_to_key(", Turkey", " Province"), default_container = "Turkey", default_placetype = "จังหวัด", default_divs = "อำเภอ", data = export.turkey_provinces, } export.ukraine_oblasts = { ["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA ["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB ["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE -- apparently will be renamed to 'Dnipro Oblast' ["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE ["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH ["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT ["Kharkiv Oblast, Ukraine"] = {}, -- capital 
[[Kharkiv]], license plate prefix AX, KX ["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT'' ["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX -- apparently will be renamed to 'Kropyvnytskyi Oblast' ["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA ["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI ["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true}, ["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB ["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC ["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE ["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH ["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true}, ["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI ["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK ["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM ["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO ["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB ["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC ["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO ["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP ["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true}, ["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM } -- oblasts of Ukraine export.ukraine_group = { key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"), 
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"), default_container = "Ukraine", default_placetype = "oblast", default_divs = {"raions", "hromadas"}, data = export.ukraine_oblasts, } export.united_kingdom_constituent_countries = { ["England"] = {divs = { "เทศมณฑล", "อำเภอ", {type = "local government districts", cat_as = "อำเภอ"}, { type = "local government districts with borough status", cat_as = {"อำเภอ", "boroughs"}, }, {type = "boroughs", cat_as = {"อำเภอ", "boroughs"}}, {type = "civil parishes", container_parent_type = false}, }}, ["Northern Ireland"] = { placetype = {"constituent country", "จังหวัด", "ประเทศ"}, divs = {"เทศมณฑล", "อำเภอ"}, }, ["Scotland"] = {divs = { {type = "council areas", container_parent_type = false}, "อำเภอ", }}, ["Wales"] = {divs = { "เทศมณฑล", {type = "county boroughs", container_parent_type = false}, {type = "communities", container_parent_type = false}, {type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}}, }}, } -- constituent countries and provinces of the United Kingdom export.united_kingdom_group = { placename_to_key = false, default_container = "สหราชอาณาจักร", default_placetype = {"constituent country", "ประเทศ"}, addl_divs = { "traditional counties", {type = "historical counties", cat_as = "traditional counties"}, }, -- Don't create categories like 'Category:en:Towns in the United Kingdom' -- or 'Category:en:Places in the United Kingdom'. default_no_container_cat = true, data = export.united_kingdom_constituent_countries, } export.england_counties = { -- NOTE: We used to have various other "no longer" counties commented out, which seems to refer to counties that -- existed officially at some point between 1889 and 1974, which I have removed. I have only kept the three -- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those -- still considered "historic counties" per [[w:Historic counties of England]]. 
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Bedfordshire, England"] = {}, ["Berkshire, England"] = {}, -- ["Brighton and Hove, England"] = {}, -- city -- ["Bristol, England"] = {}, -- city ["Buckinghamshire, England"] = {}, ["Cambridgeshire, England"] = {}, ["Cheshire, England"] = {}, -- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Cornwall, England"] = {}, -- ["Cumberland, England"] = {}, -- no longer (historic county) ["Cumbria, England"] = {}, ["Derbyshire, England"] = {}, ["Devon, England"] = {}, ["Dorset, England"] = {}, ["County Durham, England"] = {}, ["East Sussex, England"] = {}, ["Essex, England"] = {}, ["Gloucestershire, England"] = {}, ["Greater London, England"] = {}, ["Greater Manchester, England"] = {}, ["Hampshire, England"] = {}, ["Herefordshire, England"] = {}, ["Hertfordshire, England"] = {}, -- ["Humberside, England"] = {}, -- no longer (1974 to 1996) -- ["Huntingdonshire, England"] = {}, -- no longer (historic county) ["Isle of Wight, England"] = {the = true}, ["Kent, England"] = {}, ["Lancashire, England"] = {}, ["Leicestershire, England"] = {}, ["Lincolnshire, England"] = {}, ["Merseyside, England"] = {}, -- ["Middlesex, England"] = {}, -- no longer (historic county) ["Norfolk, England"] = {}, ["Northamptonshire, England"] = {}, ["Northumberland, England"] = {}, ["North Yorkshire, England"] = {}, ["Nottinghamshire, England"] = {}, ["Oxfordshire, England"] = {}, ["Rutland, England"] = {}, ["Shropshire, England"] = {}, ["Somerset, England"] = {}, ["South Humberside, England"] = {}, ["South Yorkshire, England"] = {}, ["Staffordshire, England"] = {}, ["Suffolk, England"] = {}, ["Surrey, England"] = {}, -- ["Sussex, England"] = {}, -- no longer (historic county) ["Tyne and Wear, England"] = {}, ["Warwickshire, England"] = {}, ["West Midlands, England"] = {the = true, wp = "%l (county)"}, -- ["Westmorland, England"] = {}, -- no longer (historic county) ["West Sussex, England"] = {}, 
["West Yorkshire, England"] = {}, ["Wiltshire, England"] = {}, ["Worcestershire, England"] = {}, -- ["Yorkshire, England"] = {}, -- no longer (historic county) ["East Riding of Yorkshire, England"] = {the = true}, } -- counties of England export.england_group = { default_container = {key = "England", placetype = "constituent country"}, default_placetype = "เทศมณฑล", default_divs = { "อำเภอ", {type = "local government districts", cat_as = "อำเภอ"}, { type = "local government districts with borough status", cat_as = {"อำเภอ", "boroughs"}, }, {type = "boroughs", cat_as = {"อำเภอ", "boroughs"}}, "civil parishes", }, data = export.england_counties, } export.northern_ireland_counties = { ["County Antrim, Northern Ireland"] = {}, ["County Armagh, Northern Ireland"] = {}, ["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"}, ["County Down, Northern Ireland"] = {}, ["County Fermanagh, Northern Ireland"] = {}, ["County Londonderry, Northern Ireland"] = {}, ["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"}, ["County Tyrone, Northern Ireland"] = {}, } -- counties of Northern Ireland export.northern_ireland_group = { key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"), placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"), default_container = {key = "Northern Ireland", placetype = "constituent country"}, default_placetype = "เทศมณฑล", data = export.northern_ireland_counties, } export.scotland_council_areas = { ["Aberdeenshire, Scotland"] = {}, ["Angus, Scotland"] = {wp = "%l, %c"}, ["Argyll and Bute, Scotland"] = {}, ["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"}, ["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"}, ["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"}, ["City of Dundee, Scotland"] = {the = true, wp = "Dundee"}, ["Dundee"] = {alias_of = "City of Dundee, Scotland"}, ["Dundee City"] = {alias_of = "City of Dundee, Scotland"}, 
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"}, ["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"}, ["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"}, ["Glasgow"] = {alias_of = "City of Glasgow, Scotland"}, ["Clackmannanshire, Scotland"] = {}, ["Dumfries and Galloway, Scotland"] = {}, ["East Ayrshire, Scotland"] = {}, ["East Dunbartonshire, Scotland"] = {}, ["East Lothian, Scotland"] = {}, ["East Renfrewshire, Scotland"] = {}, ["Falkirk, Scotland"] = {wp = "%l council area"}, ["Fife, Scotland"] = {}, ["Highland, Scotland"] = {wp = "%l council area"}, ["Inverclyde, Scotland"] = {}, ["Midlothian, Scotland"] = {}, ["Moray, Scotland"] = {}, ["North Ayrshire, Scotland"] = {}, ["North Lanarkshire, Scotland"] = {}, ["Orkney Islands, Scotland"] = {the = true}, ["Perth and Kinross, Scotland"] = {}, ["Renfrewshire, Scotland"] = {}, ["Scottish Borders, Scotland"] = {the = true}, ["Shetland Islands, Scotland"] = {the = true}, ["South Ayrshire, Scotland"] = {}, ["South Lanarkshire, Scotland"] = {}, ["Stirling, Scotland"] = {wp = "%l council area"}, ["West Dunbartonshire, Scotland"] = {}, ["West Lothian, Scotland"] = {}, ["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"}, ["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"}, } -- council areas of Scotland export.scotland_group = { default_container = {key = "Scotland", placetype = "constituent country"}, default_placetype = "council area", data = export.scotland_council_areas, } export.wales_principal_areas = { ["Blaenau Gwent, Wales"] = {}, ["Bridgend, Wales"] = {wp = "%l County Borough"}, ["Caerphilly, Wales"] = {wp = "%l County Borough"}, -- ["Cardiff, Wales"] = {placetype = "นคร"}, ["Carmarthenshire, Wales"] = {placetype = "เทศมณฑล"}, ["Ceredigion, Wales"] = {placetype = "เทศมณฑล"}, ["Conwy, Wales"] = {wp = "%l County Borough"}, ["Denbighshire, Wales"] = {placetype = "เทศมณฑล"}, ["Flintshire, Wales"] = {placetype = "เทศมณฑล"}, ["Gwynedd, Wales"] = 
{placetype = "เทศมณฑล"}, ["Isle of Anglesey, Wales"] = {the = true, placetype = "เทศมณฑล"}, ["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the" ["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"}, ["Monmouthshire, Wales"] = {placetype = "เทศมณฑล"}, ["Neath Port Talbot, Wales"] = {}, -- ["Newport, Wales"] = {placetype = "นคร", wp = "%l, %c"}, ["Pembrokeshire, Wales"] = {placetype = "เทศมณฑล"}, ["Powys, Wales"] = {placetype = "เทศมณฑล"}, ["Rhondda Cynon Taf, Wales"] = {}, -- ["Swansea, Wales"] = {placetype = "นคร"}, ["Torfaen, Wales"] = {}, ["Vale of Glamorgan, Wales"] = {the = true}, ["Wrexham, Wales"] = {wp = "%l County Borough"}, } -- principal areas (cities, counties and county boroughs) of Wales export.wales_group = { default_container = {key = "Wales", placetype = "constituent country"}, default_placetype = "county borough", data = export.wales_principal_areas, } export.united_states_states = { ["Alabama, USA"] = {}, ["Alaska, USA"] = {divs = { {type = "boroughs", container_parent_type = "เทศมณฑล"}, {type = "borough seats", container_parent_type = "county seats"}, }}, ["Arizona, USA"] = {}, ["Arkansas, USA"] = {}, ["California, USA"] = {}, ["Colorado, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}}, ["Connecticut, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}}, ["Delaware, USA"] = {}, ["Florida, USA"] = {}, ["Georgia, USA"] = {wp = "%l (U.S. 
state)"}, ["Hawaii, USA"] = {addl_parents = {"พอลินีเชีย"}}, ["Idaho, USA"] = {}, ["Illinois, USA"] = {}, ["Indiana, USA"] = {}, ["Iowa, USA"] = {}, ["Kansas, USA"] = {}, ["Kentucky, USA"] = {}, ["Louisiana, USA"] = {divs = { {type = "parishes", container_parent_type = "เทศมณฑล"}, {type = "parish seats", container_parent_type = "county seats"}, }}, ["Maine, USA"] = {}, ["Maryland, USA"] = {}, ["Massachusetts, USA"] = {}, ["Michigan, USA"] = {}, ["Minnesota, USA"] = {}, ["Mississippi, USA"] = {}, ["Missouri, USA"] = {}, ["Montana, USA"] = {}, ["Nebraska, USA"] = {}, ["Nevada, USA"] = {}, ["New Hampshire, USA"] = {}, ["New Jersey, USA"] = {divs = { "เทศมณฑล", "county seats", {type = "boroughs", prep = "ใน"}, }}, ["New Mexico, USA"] = {}, ["New York, USA"] = {wp = "%l (รัฐ)"}, ["North Carolina, USA"] = {}, ["North Dakota, USA"] = {}, ["Ohio, USA"] = {}, ["Oklahoma, USA"] = {}, ["Oregon, USA"] = {}, ["Pennsylvania, USA"] = {divs = { "เทศมณฑล", "county seats", {type = "boroughs", prep = "ใน"}, }}, ["Rhode Island, USA"] = {}, ["South Carolina, USA"] = {}, ["South Dakota, USA"] = {}, ["Tennessee, USA"] = {}, ["Texas, USA"] = {}, ["Utah, USA"] = {}, ["Vermont, USA"] = {}, ["Virginia, USA"] = {}, ["Washington, USA"] = {wp = "%l (รัฐ)"}, ["West Virginia, USA"] = {}, ["Wisconsin, USA"] = {}, ["Wyoming, USA"] = {}, } -- states of the United States export.united_states_group = { placename_to_key = make_placename_to_key(", USA"), default_container = "สหรัฐอเมริกา", default_placetype = "รัฐ", default_divs = {"เทศมณฑล", "county seats"}, addl_divs = { {type = "census-designated places", prep = "ใน"}, {type = "unincorporated communities", prep = "ใน"}, }, data = export.united_states_states, } export.vietnam_provinces = { -- [[Northeast (Vietnam)|Northeast]] region ["Bắc Giang, เวียดนาม"] = {}, -- capital [[Bắc Giang]] ["Bắc Kạn, เวียดนาม"] = {}, -- capital [[Bắc Kạn]] ["Cao Bằng, เวียดนาม"] = {}, -- capital [[Cao Bằng]] ["Hà Giang, เวียดนาม"] = {}, -- capital [[Hà Giang]] ["Lạng 
Sơn, เวียดนาม"] = {}, -- capital [[Lạng Sơn]] ["Phú Thọ, เวียดนาม"] = {}, -- capital [[Việt Trì]] ["Quảng Ninh, เวียดนาม"] = {}, -- capital [[Hạ Long]] ["Thái Nguyên, เวียดนาม"] = {}, -- capital [[Thái Nguyên]] ["Tuyên Quang, เวียดนาม"] = {}, -- capital [[Tuyên Quang]] -- [[Northwest (Vietnam)|Northwest]] region ["Lào Cai, เวียดนาม"] = {}, -- capital [[Lào Cai]] ["Yên Bái, เวียดนาม"] = {}, -- capital [[Yên Bái]] ["Điện Biên, เวียดนาม"] = {}, -- capital [[Điện Biên Phủ]] ["Hoà Bình, เวียดนาม"] = {}, -- capital [[Hoà Bình City|Hoà Bình]] ["Hòa Bình, เวียดนาม"] = {alias_of = "Hoà Bình, เวียดนาม", display = true}, ["Lai Châu, เวียดนาม"] = {}, -- capital [[Lai Châu]] ["Sơn La, เวียดนาม"] = {}, -- capital [[Sơn La]] -- [[Red River Delta]] region ["Bắc Ninh, เวียดนาม"] = {}, -- capital [[Bắc Ninh]] ["Hà Nam, เวียดนาม"] = {}, -- capital [[Phủ Lý]] ["Hải Dương, เวียดนาม"] = {}, -- capital [[Hải Dương]] ["Hưng Yên, เวียดนาม"] = {}, -- capital [[Hưng Yên]] ["Nam Định, เวียดนาม"] = {}, -- capital [[Nam Định]] ["Ninh Bình, เวียดนาม"] = {}, -- capital [[Ninh Bình|Hoa Lư]] ["Thái Bình, เวียดนาม"] = {}, -- capital [[Thái Bình]] ["Vĩnh Phúc, เวียดนาม"] = {}, -- capital [[Vĩnh Yên]] -- ["Hanoi"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hoàn Kiếm district]] -- ["Haiphong"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hồng Bàng district]] -- [[North Central Coast]] region ["Hà Tĩnh, เวียดนาม"] = {}, -- capital [[Hà Tĩnh]] ["Nghệ An, เวียดนาม"] = {}, -- capital [[Vinh]] ["Quảng Bình, เวียดนาม"] = {}, -- capital [[Đồng Hới]] ["Quảng Trị, เวียดนาม"] = {}, -- capital [[Đông Hà]] ["Thanh Hoá, เวียดนาม"] = {}, -- capital [[Thanh Hoá]] ["Thanh Hóa, เวียดนาม"] = {alias_of = "Thanh Hoá, เวียดนาม", display = true}, -- ["Hue"] = {placetype = {"เทศบาล", "นคร"}, wp = "Huế"}, -- capital [[Thuận Hoá district]] -- [[Central Highlands (Vietnam)|Central Highlands]] region ["Đắk Lắk, เวียดนาม"] = {}, -- capital [[Buôn Ma Thuột]] ["Đăk Nông, เวียดนาม"] = {}, -- capital [[Gia Nghĩa]] ["Gia Lai, 
เวียดนาม"] = {}, -- capital [[Pleiku]] ["Kon Tum, เวียดนาม"] = {}, -- capital [[Kon Tum]] ["Lâm Đồng, เวียดนาม"] = {}, -- capital [[Đà Lạt]] -- [[South Central Coast]] region ["Bình Định, เวียดนาม"] = {}, -- capital [[Quy Nhon]] ["Bình Thuận, เวียดนาม"] = {}, -- capital [[Phan Thiết]] ["Khánh Hoà, เวียดนาม"] = {}, -- capital [[Nha Trang]] ["Khánh Hòa, เวียดนาม"] = {alias_of = "Khánh Hoà, เวียดนาม", display = true}, ["Ninh Thuận, เวียดนาม"] = {}, -- capital [[Phan Rang–Tháp Chàm]] ["Phú Yên, เวียดนาม"] = {}, -- capital [[Tuy Hoà]] ["Quảng Nam, เวียดนาม"] = {}, -- capital [[Tam Kỳ]] ["Quảng Ngãi, เวียดนาม"] = {}, -- capital [[Quảng Ngãi]] -- ["Da Nang"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hải Châu district]] -- [[Southeast (Vietnam)|Southeast]] region ["Bà Rịa–Vũng Tàu, เวียดนาม"] = {}, -- capital [[Bà Rịa]] ["Bình Dương, เวียดนาม"] = {}, -- capital [[Thủ Dầu Một]] ["Bình Phước, เวียดนาม"] = {}, -- capital [[Đồng Xoài]] ["Đồng Nai, เวียดนาม"] = {}, -- capital [[Biên Hoà]] ["Tây Ninh, เวียดนาม"] = {}, -- capital [[Tây Ninh]] -- ["Ho Chi Minh City"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']] -- [[Mekong Delta]] region ["An Giang, เวียดนาม"] = {}, -- capital [[Long Xuyên]] ["Bạc Liêu, เวียดนาม"] = {}, -- capital [[Bạc Liêu]] ["Bến Tre, เวียดนาม"] = {}, -- capital [[Bến Tre]] ["Cà Mau, เวียดนาม"] = {}, -- capital [[Cà Mau]] ["Đồng Tháp, เวียดนาม"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]] ["Hậu Giang, เวียดนาม"] = {}, -- capital [[Vị Thanh]] ["Kiên Giang, เวียดนาม"] = {}, -- capital [[Rạch Giá]] ["Long An, เวียดนาม"] = {}, -- capital [[Tân An]] ["Sóc Trăng, เวียดนาม"] = {}, -- capital [[Sóc Trăng]] ["Tiền Giang, เวียดนาม"] = {}, -- capital [[Mỹ Tho]] ["Trà Vinh, เวียดนาม"] = {}, -- capital [[Trà Vinh]] ["Vĩnh Long, เวียดนาม"] = {}, -- capital [[Vĩnh Long]] -- ["Can Tho"] = {placetype = {"เทศบาล", "นคร"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]] } -- provinces of Vietnam export.vietnam_group = { 
	key_to_placename = make_key_to_placename(", เวียดนาม$"),
	placename_to_key = make_placename_to_key(", เวียดนาม"),
	default_container = "เวียดนาม",
	default_placetype = "จังหวัด",
	-- There may not be enough districts to subcategorize like this.
	-- default_divs = "อำเภอ",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "จังหวัด%e",
	data = export.vietnam_provinces,
}

-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------

export.australia_cities = {
	["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration)
	["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate])
	["Canberra"] = {container = {key = "Australian Capital Territory, ออสเตรเลีย", placetype = "ดินแดน"}}, -- 510,641 (2024 estimate)
	["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration)
	["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
	["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
	["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
	["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}

export.australia_cities_group = {
	canonicalize_key_container = make_canonicalize_key_container(", ออสเตรเลีย", "รัฐ"),
	default_placetype = "นคร",
	data = export.australia_cities,
}

export.brazil_cities = {
	-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos) ["Sao Paulo"] = {alias_of = "São Paulo", display = true}, ["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area) ["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000 ["Recife"] = {container = "Pernambuco"}, -- 4,100,000 ["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area) ["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000 ["Brasilia"] = {alias_of = "Brasília", display = true}, ["Fortaleza"] = {container = "Ceará"}, -- 3,825,000 ["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000 ["Curitiba"] = {container = "Paraná"}, -- 3,375,000 ["Campinas"] = {container = "São Paulo"}, -- 3,250,000 ["Goiânia"] = {container = "Goiás"}, -- 2,525,000 ["Goiania"] = {alias_of = "Goiânia", display = true}, ["Manaus"] = {container = "Amazonas"}, -- 2,275,000 ["Belém"] = {container = "Pará"}, -- 2,200,000 ["Belem"] = {alias_of = "Belém", display = true}, ["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000 ["Vitoria"] = {alias_of = "Vitória", display = true}, ["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000 ["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000 ["Sao Luis"] = {alias_of = "São Luís", display = true}, ["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000 ["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000 ["Florianopolis"] = {alias_of = "Florianópolis", display = true}, ["Maceió"] = {container = "Alagoas"}, -- 1,220,000 ["Maceio"] = {alias_of = "Maceió", display = true}, ["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000 ["Joao Pessoa"] = {alias_of = "João Pessoa", display = true}, ["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000 ["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true}, 
["Londrina"] = {container = "Paraná"}, -- 1,050,000 ["Teresina"] = {container = "Piauí"}, -- 1,040,000 } export.brazil_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", บราซิล", "รัฐ"), default_placetype = "นคร", data = export.brazil_cities, } export.canada_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton) ["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area) ["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area) ["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area) ["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area) ["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area) ["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census) ["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census) ["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census) ["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census) } export.canada_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด"), default_placetype = "นคร", data = export.canada_cities, } export.france_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration) ["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration) ["Lyons"] = {alias_of = "Lyon", display = true}, ["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration) ["Marseilles"] = {alias_of = "Marseille", display = true}, ["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration) ["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration) ["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration) ["Nice"] = {container = "Provence-Alpes-Côte d'Azur"}, ["Nantes"] = {container = "Pays de la Loire"}, ["Strasbourg"] = {container = "Grand Est"}, ["Rennes"] = {container = "Brittany"}, } export.france_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"), default_placetype = "นคร", data = export.france_cities, } export.germany_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
	-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
	["Cologne"] = {container = "North Rhine-Westphalia"},
	["Köln"] = {alias_of = "Cologne", display = true},
	["Düsseldorf"] = {container = "North Rhine-Westphalia"},
	["Dusseldorf"] = {alias_of = "Düsseldorf", display = true},
	["Dortmund"] = {container = "North Rhine-Westphalia"},
	["Essen"] = {container = "North Rhine-Westphalia"},
	["Duisburg"] = {container = "North Rhine-Westphalia"},
	["Berlin"] = {}, -- 4,700,000
	["Frankfurt"] = {container = "Hesse"}, -- 3,225,000
	["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer
	["Hamburg"] = {}, -- 2,900,000
	["Munich"] = {container = "Bavaria"}, -- 2,300,000
	["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000
	["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000
	["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000
	["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000
	["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000
	["Leipzig"] = {container = "Saxony"}, -- 1,080,000
	["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000
	["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias
	["Bremen"] = {},
}

export.germany_cities_group = {
	default_container = "เยอรมนี",
	canonicalize_key_container = make_canonicalize_key_container(", เยอรมนี", "รัฐ"),
	default_placetype = "นคร",
	data = export.germany_cities,
}

export.india_cities = {
	-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
	-- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was
	-- conducted in 2011, and the results are not accurate any more.
["Delhi"] = {container = {key = "Delhi, อินเดีย", placetype = "union territory"}}, -- 31,190,000 ["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000 ["Kolkata"] = {container = "West Bengal"}, -- 21,747,000 ["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000 ["Bengaluru"] = {alias_of = "Bangalore"}, ["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000 ["Hyderabad"] = {container = "Telangana"}, -- 9,797,000 ["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000 ["Pune"] = {container = "Maharashtra"}, -- 6,819,000 ["Surat"] = {container = "Gujarat"}, -- 6,601,000 ["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000 ["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000 ["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000 ["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000 ["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000 ["Patna"] = {container = "Bihar"}, -- 3,331,000 ["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000 ["Kozhikode"] = {container = "Kerala"}, -- 3,049,000 ["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000 ["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000 ["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000 ["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000 ["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000 ["Prayagraj"] = {alias_of = "Allahabad"}, ["Kochi"] = {container = "Kerala"}, -- 2,381,000 ["Ludhiana"] = {container = "Punjab"}, -- 2,205,000 ["Vadodara"] = {container = "Gujarat"}, -- 2,182,000 ["Chandigarh"] = {container = {key = "Chandigarh, อินเดีย", placetype = "union territory"}}, -- 2,168,000 ["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000 ["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000 ["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000 ["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000 ["Malappuram"] = {container = "Kerala"}, -- 1,868,000 ["Nashik"] = {container = "Maharashtra"}, -- 1,810,000 
["Asansol"] = {container = "West Bengal"}, -- 1,720,000 ["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000 ["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000 ["Thrissur"] = {container = "Kerala"}, -- 1,578,000 ["Kollam"] = {container = "Kerala"}, -- 1,576,000 ["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000 ["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000 ["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000 ["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000 ["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"}, ["Rajkot"] = {container = "Gujarat"}, -- 1,487,000 ["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000 ["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000 ["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000 ["Kannur"] = {container = "Kerala"}, -- 1,360,000 ["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000 ["Guwahati"] = {container = "Assam"}, -- 1,355,000 ["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000 ["Amritsar"] = {container = "Punjab"}, -- 1,313,000 ["Mysore"] = {container = "Karnataka"}, -- 1,296,000 ["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000 ["Durg-Bhilainagar"] = {alias_of = "Bhilai"}, ["Durg-Bhilai"] = {alias_of = "Bhilai"}, ["Durg"] = {alias_of = "Bhilai"}, ["Bhilainagar"] = {alias_of = "Bhilai"}, ["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000 ["Srinagar"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,212,000 ["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000 ["Kota"] = {container = "Rajasthan"}, -- 1,172,000 ["Jalandhar"] = {container = "Punjab"}, -- 1,165,000 ["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000 ["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000 ["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000 ["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000 ["Jammu"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, 
-- 1,103,000 ["Solapur"] = {container = "Maharashtra"}, -- 1,082,000 ["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash ["Hubli"] = {alias_of = "Hubli-Dharwad"}, ["Dharwad"] = {alias_of = "Hubli-Dharwad"}, ["Puducherry"] = {container = {key = "Puducherry, อินเดีย", placetype = "union territory"}}, -- 1,024,000 ["Pondicherry"] = {alias_of = "Puducherry", display = true}, -- satellite/secondary cities of metro area (none in citypopulation.de) ["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area ["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area ["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true}, ["Kalyan"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivli"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivali"] = {alias_of = "Kalyan-Dombivli"}, ["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area ["Vasai"] = {alias_of = "Vasai-Virar"}, ["Virar"] = {alias_of = "Vasai-Virar"}, ["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area ["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area ["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area ["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true}, } export.india_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", อินเดีย", "รัฐ"), default_placetype = "นคร", data = export.india_cities, } 
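
-- Illustrative sketch only: this helper is not part of the original module and is not necessarily how the
-- consuming code reads these tables. It shows one plausible way to resolve the `alias_of` entries used in
-- the city tables above.
-- Assumptions: the name `resolve_city_alias` is hypothetical; an alias is taken to point directly at a
-- canonical key in the same table; and `display = true` is read as "also show the canonical name to the
-- reader", which is inferred from comments such as "not a display alias as it's longer".
local function resolve_city_alias(data, key)
	local spec = data[key]
	if spec == nil then
		-- Unknown key: nothing to resolve.
		return nil
	end
	if spec.alias_of ~= nil then
		-- Return the canonical key, its spec, and whether the alias should be display-canonicalized.
		return spec.alias_of, data[spec.alias_of], spec.display == true
	end
	-- The key is already canonical.
	return key, spec, false
end

-- Under the same assumptions, resolve_city_alias(export.india_cities, "Pondicherry") would yield "Puducherry",
-- the Puducherry spec and true, while "Bengaluru" would resolve to "Bangalore" with a display flag of false.
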
export.indonesia_cities = { -- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate ["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = { {type = "ตำบล", container_parent_type = false}, }}, ["Surabaya"] = {container = "East Java"}, ["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area ["Bandung"] = {container = "West Java"}, ["Medan"] = {container = "North Sumatra"}, ["Depok"] = {container = "West Java"}, -- part of Jakarta metro area ["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Palembang"] = {container = "South Sumatra"}, ["Semarang"] = {container = "Central Java"}, ["Makassar"] = {container = "South Sulawesi"}, ["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Batam"] = {container = "Riau Islands"}, ["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area ["Pekanbaru"] = {container = "Riau"}, ["Bandar Lampung"] = {container = "Lampung"}, -- other metro areas over 1,000,000 people ["Padang"] = {container = "West Sumatra"}, ["Samarinda"] = {container = "East Kalimantan"}, ["Malang"] = {container = "East Java"}, ["Yogyakarta"] = {container = "Special Region of Yogyakarta"}, ["Denpasar"] = {container = "Bali"}, ["Cirebon"] = {container = "West Java"}, ["Surakarta"] = {container = "Central Java"}, ["Banjarmasin"] = {container = "South Kalimantan"}, ["Tasikmalaya"] = {container = "West Java"}, } export.indonesia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", อินโดนีเซีย", "จังหวัด"), default_placetype = "นคร", data = export.indonesia_cities, } export.italy_cities = { -- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used -- here, only gives estimates as of Jan 1, 2014. 
["Milan"] = {container = "Lombardy"}, -- 6,623,798 ["Naples"] = {container = "Campania"}, -- 5,294,546 ["Rome"] = {container = "Lazio"}, -- 4,447,881 ["Turin"] = {container = "Piedmont"}, -- 1,865,284 ["Venice"] = {container = "Veneto"}, -- 1,645,900 ["Florence"] = {container = "Tuscany"}, -- 1,485,030 ["Bari"] = {container = "Apulia"}, -- 1,257,459 ["Palermo"] = {container = "Sicily"}, -- 1,183,084 -- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition). ["Catania"] = {container = "Sicily"}, -- 988,240 ["Brescia"] = {container = "Lombardy"}, -- 924,090 ["Genoa"] = {container = "Liguria"}, -- 861,318 } export.italy_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Italy", "ภูมิภาค"), default_placetype = "นคร", data = export.italy_cities, } export.japan_cities = { -- Population figures from [[w:List of cities in Japan]]. Metro areas from -- [[w:List of metropolitan areas in Japan]]. ["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])", placetype = {"นคร", "prefecture"}, divs = { {type = "special wards", container_parent_type = false}, {type = "cities", prep = "ใน"}, }, }, ["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894 ["Osaka"] = {container = "Osaka"}, -- 2,668,586 ["Nagoya"] = {container = "Aichi"}, -- 2,283,289 -- FIXME, Hokkaido is handled specially. 
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096 ["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527 ["Kobe"] = {container = "Hyōgo"}, -- 1,530,847 ["Kyoto"] = {container = "Kyoto"}, -- 1,474,570 ["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630 ["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418 ["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806 ["Sendai"] = {container = "Miyagi"}, -- 1,029,552 -- the remaining cities are considered "central cities" in a 1,000,000+ metro area -- (sometimes there is more than one central city in the area). ["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998 ["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695 ["Sakai"] = {container = "Osaka"}, -- 835,333 ["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053 ["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431 ["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944 ["Sagamihara"] = {container = "Kanagawa"}, -- 706,342 ["Okayama"] = {container = "Okayama"}, -- 701,293 ["Kumamoto"] = {container = "Kumamoto"}, -- 670,348 ["Kagoshima"] = {container = "Kagoshima"}, -- 605,196 -- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka) -- with population in the range 509k - 587k because not central cities in any -- 1,000,000+ metro area. 
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833 } export.japan_cities_group = { default_container = "ญี่ปุ่น", canonicalize_key_container = make_canonicalize_key_container(", ญี่ปุ่น", "prefecture"), default_placetype = "นคร", data = export.japan_cities, } export.mexico_cities = { ["Mexico City"] = {}, -- its own state ["Monterrey"] = {container = "Nuevo León"}, ["Guadalajara"] = {container = "Jalisco"}, ["Puebla"] = {container = "Puebla", wp = "%l (city)"}, ["Toluca"] = {container = "State of Mexico"}, ["Tijuana"] = {container = "Baja California"}, -- Include the state in the category for León due to possible confusion with León, Spain. ["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"}, ["León"] = {alias_of = "León, Guanajuato"}, ["Leon"] = {alias_of = "León, Guanajuato", display = true}, ["Querétaro"] = {container = "Querétaro", wp = "%l (city)"}, ["Queretaro"] = {alias_of = "Querétaro", display = true}, ["Ciudad Juárez"] = {container = "Chihuahua"}, ["Juárez"] = {alias_of = "Ciudad Juárez"}, ["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"}, ["Torreón"] = {container = "Coahuila"}, ["Torreon"] = {alias_of = "Torreón", display = true}, -- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or -- Mérida, Venezuela. 
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"}, ["Mérida"] = {alias_of = "Mérida, Yucatán"}, ["Merida"] = {alias_of = "Mérida, Yucatán", display = true}, ["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"}, ["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true}, ["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"}, ["Mexicali"] = {container = "Baja California"}, } export.mexico_cities_group = { default_container = "Mexico", canonicalize_key_container = make_canonicalize_key_container(", Mexico", "รัฐ"), default_placetype = "นคร", data = export.mexico_cities, } export.nigeria_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. ["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability) ["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability) ["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability) ["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "federal territory"}}, -- 3,050,000 (unindicated; population of low reliability) ["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability) ["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability) ["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability) ["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability) ["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability) ["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability) ["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability) ["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; 
population of low reliability) ["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability) ["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability) ["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability) } export.nigeria_cities_group = { default_container = "Nigeria", canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "รัฐ"), default_placetype = "นคร", data = export.nigeria_cities, } export.pakistan_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. ["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area) ["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area) ["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad) ["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "federal territory"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi) ["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area) ["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area) -- there is also Hyderabad in India (very confusing) ["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area) ["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"}, ["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area) ["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area) ["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area) ["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area) ["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area) } export.pakistan_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "จังหวัด"), default_placetype = "นคร", data = 
export.pakistan_cities, } export.philippines_cities = { -- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts. -- Other cities outside Metro Manila skipped as not central city in their urban area. ["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, -- Don't display-canonicalize Foo to Foo City as it may make the display weird. ["Quezon"] = {alias_of = "Quezon City"}, ["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, ["Davao City"] = {container = "Davao del Sur"}, ["Davao"] = {alias_of = "Davao City"}, ["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, ["Zamboanga City"] = {container = "Zamboanga del Sur"}, ["Zamboanga"] = {alias_of = "Zamboanga City"}, ["Cebu City"] = {container = "Cebu"}, ["Cebu"] = {alias_of = "Cebu City"}, ["Antipolo"] = {container = "Rizal"}, ["Cagayan de Oro"] = {container = "Misamis Oriental"}, ["Dasmariñas"] = {container = "Cavite"}, ["Dasmarinas"] = {alias_of = "Dasmariñas", display = true}, ["General Santos"] = {container = "South Cotabato"}, ["San Jose del Monte"] = {container = "Bulacan"}, ["Bacolod"] = {container = "Negros Occidental"}, ["Calamba"] = {container = "Laguna", wp = "%l, %c"}, ["Angeles"] = {container = "Pampanga", wp = "Angeles City"}, ["Angeles City"] = {alias_of = "Angeles"}, ["Iloilo City"] = {container = "Iloilo"}, ["Iloilo"] = {alias_of = "Iloilo City"}, } export.philippines_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Philippines", "จังหวัด"), default_placetype = "นคร", data = export.philippines_cities, } export.russia_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. 
["Moscow"] = {}, -- 18,800,000 (Agglomeration) ["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration) ["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration) ["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration) ["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration) ["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration) ["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration) ["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration) ["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true}, ["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration) ["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration) ["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration) ["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration) ["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration) ["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration) ["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration) ["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration) ["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration) } export.russia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"), default_container = "Russia", default_placetype = "นคร", data = export.russia_cities, } export.saudi_arabia_cities = { -- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are -- metro, urban or city proper figures. 
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jedda"] = {alias_of = "Jeddah", display = true}, ["Jiddah"] = {alias_of = "Jeddah", display = true}, ["Jidda"] = {alias_of = "Jeddah", display = true}, ["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Makkah"] = {alias_of = "Mecca", display = true}, ["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City) ["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true}, } export.saudi_arabia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "จังหวัด"), default_placetype = "นคร", data = export.saudi_arabia_cities, } export.south_korea_cities = { -- All cities listed are not associated with any county. 
["Seoul"] = {}, ["Busan"] = {}, ["Incheon"] = {}, ["Daegu"] = {}, ["Daejeon"] = {}, ["Gwangju"] = {}, ["Ulsan"] = {}, } export.south_korea_cities_group = { default_container = "South Korea", canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "จังหวัด"), default_placetype = "นคร", data = export.south_korea_cities, } export.spain_cities = { ["Madrid"] = {container = "Community of Madrid"}, ["Barcelona"] = {container = "Catalonia"}, ["Valencia"] = {container = "Valencia"}, ["Seville"] = {container = "Andalusia"}, ["Bilbao"] = {container = "Basque Country"}, } export.spain_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"), default_placetype = "นคร", data = export.spain_cities, } export.taiwan_cities = { ["New Taipei City"] = {}, ["New Taipei"] = {alias_of = "New Taipei City", display = true}, ["Taichung"] = {}, ["Kaohsiung"] = {wp = "%l, ไต้หวัน"}, ["Taipei"] = {}, ["Taoyuan"] = {}, ["Tainan"] = {}, -- these last three are not special municipalities ["Chiayi"] = {placetype = "นคร"}, ["Hsinchu"] = {placetype = "นคร"}, ["Keelung"] = {placetype = "นคร"}, } export.taiwan_cities_group = { placename_to_key = false, -- don't add ", ไต้หวัน" to make the key canonicalize_key_container = make_canonicalize_key_container(", ไต้หวัน", "เทศมณฑล"), default_container = "ไต้หวัน", default_placetype = {"special municipality", "เทศบาล", "นคร"}, default_is_city = true, default_divs = {"อำเภอ"}, data = export.taiwan_cities, } -- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct, -- everything else will be figured out. 
export.united_kingdom_cities = { ["London"] = {container = "Greater London"}, ["Manchester"] = {container = "Greater Manchester"}, ["Birmingham"] = {container = "West Midlands"}, ["Liverpool"] = {container = "Merseyside"}, ["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}}, ["Leeds"] = {container = "West Yorkshire"}, ["Newcastle upon Tyne"] = {container = "Tyne and Wear"}, ["Newcastle"] = {alias_of = "Newcastle upon Tyne"}, ["Bristol"] = {container = {key = "England", placetype = "constituent country"}}, ["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Portsmouth"] = {container = "Hampshire"}, ["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}}, -- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]] ["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"}, } export.united_kingdom_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", England", "เทศมณฑล"), default_placetype = "นคร", data = export.united_kingdom_cities, } export.united_states_cities = { -- top 50 CSA's by population, with the top and sometimes 2nd or 3rd city listed ["New York City"] = {container = "New York", wp = "%l", divs = { {type = "boroughs", container_parent_type = false}, }}, -- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York). 
["New York"] = {alias_of = "New York City"}, ["Newark"] = {container = "New Jersey"}, ["Los Angeles"] = {container = "California", wp = "%l"}, ["Long Beach"] = {container = "California"}, ["Riverside"] = {container = "California"}, ["Chicago"] = {container = "Illinois", wp = "%l"}, ["Washington, D.C."] = {wp = "%l"}, ["Washington, DC"] = {alias_of = "Washington, D.C.", display = true}, ["Washington D.C."] = {alias_of = "Washington, D.C.", display = true}, ["Washington DC"] = {alias_of = "Washington, D.C.", display = true}, -- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of -- Columbia holonym). ["Washington"] = {alias_of = "Washington, D.C."}, ["Baltimore"] = {container = "Maryland", wp = "%l"}, -- to avoid conflict with San Jose in Costa Rica ["San Jose, California"] = {container = "California"}, ["San Jose"] = {alias_of = "San Jose, California"}, ["San Francisco"] = {container = "California", wp = "%l"}, ["Oakland"] = {container = "California"}, ["Boston"] = {container = "Massachusetts", wp = "%l"}, ["Providence"] = {container = "Rhode Island"}, ["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Fort Worth"] = {container = "Texas"}, ["Philadelphia"] = {container = "Pennsylvania", wp = "%l"}, ["Houston"] = {container = "Texas", wp = "%l"}, ["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"}, ["Atlanta"] = {container = "Georgia", wp = "%l"}, ["Detroit"] = {container = "Michigan", wp = "%l"}, ["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"}, ["Mesa"] = {container = "Arizona"}, ["Seattle"] = {container = "Washington", wp = "%l"}, ["Orlando"] = {container = "Florida"}, ["Minneapolis"] = {container = "Minnesota", wp = "%l"}, ["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"}, ["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"}, 
["Portland"] = {container = "Oregon"}, ["Tampa"] = {container = "Florida"}, ["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"}, ["Saint Louis"] = {alias_of = "St. Louis", display = true}, ["Charlotte"] = {container = "North Carolina"}, ["Sacramento"] = {container = "California"}, ["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"}, ["Salt Lake City"] = {container = "Utah", wp = "%l"}, ["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Columbus"] = {container = "Ohio"}, ["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"}, ["Indianapolis"] = {container = "Indiana", wp = "%l"}, ["Las Vegas"] = {container = "Nevada", wp = "%l"}, ["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Austin"] = {container = "Texas"}, ["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"}, ["Raleigh"] = {container = "North Carolina"}, ["Nashville"] = {container = "Tennessee"}, ["Virginia Beach"] = {container = "Virginia"}, ["Norfolk"] = {container = "Virginia"}, ["Greensboro"] = {container = "North Carolina"}, ["Winston-Salem"] = {container = "North Carolina"}, ["Jacksonville"] = {container = "Florida"}, ["New Orleans"] = {container = "Louisiana", wp = "%l"}, ["Louisville"] = {container = "Kentucky"}, ["Greenville"] = {container = "South Carolina"}, ["Hartford"] = {container = "Connecticut"}, ["Oklahoma City"] = {container = "Oklahoma", wp = "%l"}, ["Grand Rapids"] = {container = "Michigan"}, ["Memphis"] = {container = "Tennessee"}, ["Birmingham, Alabama"] = {container = "Alabama"}, ["Birmingham"] = {alias_of = "Birmingham, Alabama"}, ["Fresno"] = {container = "California"}, ["Richmond"] = {container = "Virginia"}, ["Harrisburg"] = {container = "Pennsylvania"}, -- any major city of top 50 MSA's that's missed by previous ["Buffalo"] = {container = "New York"}, -- any of the top 50 city by city population that's missed by previous ["El Paso"] = 
{container = "Texas"}, ["Albuquerque"] = {container = "New Mexico"}, ["Tucson"] = {container = "Arizona"}, ["Colorado Springs"] = {container = "Colorado"}, ["Omaha"] = {container = "Nebraska"}, ["Tulsa"] = {container = "Oklahoma"}, -- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia } export.united_states_cities_group = { default_container = "สหรัฐอเมริกา", canonicalize_key_container = make_canonicalize_key_container(", USA", "รัฐ"), default_placetype = "นคร", default_wp = "%l, %c", data = export.united_states_cities, } export.new_york_boroughs = { ["Bronx"] = {the = true, wp = "The Bronx"}, ["Brooklyn"] = {}, ["Manhattan"] = {}, ["Queens"] = {}, ["Staten Island"] = {}, } export.new_york_boroughs_group = { default_container = {key = "New York City", placetype = "นคร"}, default_placetype = "borough", default_is_city = true, data = export.new_york_boroughs, } export.vietnam_cities = { -- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. ["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa) ["Saigon"] = {alias_of = "Ho Chi Minh City"}, ["Hanoi"] = {}, -- 7,350,000 (Agglomeration) ["Da Nang"] = {}, -- 1,500,000 (Agglomeration) ["Danang"] = {alias_of = "Da Nang", display = true}, ["Haiphong"] = {}, -- 1,450,000 (Agglomeration) ["Hai Phong"] = {alias_of = "Haiphong", display = true}, -- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city" -- meaning it is directly under its province as opposed to being contained in a district. 
["Bien Hoa"] = {placetype = "นคร", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia) ["Biên Hòa"] = {alias_of = "Bien Hoa", display = true}, ["Biên Hoà"] = {alias_of = "Bien Hoa", display = true}, -- These two not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are -- both province-level municipalities and close to the 1,000,000 mark. ["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]] ["Cần Thơ"] = {alias_of = "Can Tho", display = true}, ["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); -- capital [[Thuận Hóa district]] ["Huế"] = {alias_of = "Hue", display = true}, } export.vietnam_cities_group = { placename_to_key = false, -- don't add ", เวียดนาม" to make the key default_container = "เวียดนาม", canonicalize_key_container = make_canonicalize_key_container(", เวียดนาม", "จังหวัด"), -- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of -- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct -- known locations. default_placetype = {"เทศบาล", "นคร"}, default_is_city = true, -- There may not be enough districts to subcategorize like this. -- default_divs = "อำเภอ", data = export.vietnam_cities, } export.misc_cities = { ------------------ Africa ------------------- -- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from -- [[w:List of urban areas in Africa by population]]. 
["Algiers"] = {container = "แอลจีเรีย"}, -- 4,325,000 (Consolidated Urban Area) ["Oran"] = {container = "แอลจีเรีย"}, -- 1,640,000 (Consolidated Urban Area) ["Luanda"] = {container = "แองโกลา"}, -- 9,650,000 (Urban Area) ["Benguela"] = {container = "แองโกลา"}, -- 1,420,000 (Urban Area) ["Cotonou"] = {container = "เบนิน"}, -- 2,150,000 (Agglomeration) ["Ouagadougou"] = {container = "บูร์กินาฟาโซ"}, -- 3,425,000 (Agglomeration) ["Bobo-Dioulasso"] = {container = "บูร์กินาฟาโซ"}, -- 1,100,000 (Agglomeration) ["Bujumbura"] = {container = "บุรุนดี"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia) ["Yaoundé"] = {container = "แคเมอรูน"}, -- 3,975,000 (City) ["Yaounde"] = {alias_of = "Yaoundé", display = true}, ["Douala"] = {container = "แคเมอรูน"}, -- 3,900,000 (City) ["Bangui"] = {container = "สาธารณรัฐแอฟริกากลาง"}, -- 1,680,000 (Agglomeration) ["N'Djamena"] = {container = "ชาด"}, -- 1,950,000 (City) ["Ndjamena"] = {alias_of = "N'Djamena", display = true}, ["Kinshasa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 16,300,000 (City; population of low reliability) ["Lubumbashi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,875,000 (City; population of low reliability) ["Mbuji-Mayi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,500,000 (City; population of low reliability) ["Kananga"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,370,000 (City; population of low reliability) ["Kisangani"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,300,000 (City; population of low reliability) ["Bukavu"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,100,000 (City; population of low reliability) ["Goma"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,010,000 (City; population of low reliability) ["Tshikapa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de) ["Cairo"] = {container = "อียิปต์"}, -- 22,800,000 
(Agglomeration, including Giza and Shubra El Kheima) ["Alexandria"] = {container = "อียิปต์"}, -- 6,250,000 (Agglomeration) ["Giza"] = {container = "อียิปต์"}, -- 4,458,135 (2023 from citypopulation.de) ["Shubra El Kheima"] = {container = "อียิปต์"}, -- 1,240,239 (2021 from citypopulation.de) ["Asmara"] = {container = "เอริเทรีย"}, -- 1,090,000 (City; population of low reliability) ["Asmera"] = {alias_of = "Asmara", display = true}, ["Addis Ababa"] = {container = "เอธิโอเปีย"}, -- 4,825,000 (Agglomeration) ["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration) ["Accra"] = {container = "กานา"}, -- 6,800,000 (Agglomeration) ["Kumasi"] = {container = "กานา"}, -- 2,900,000 (Agglomeration) ["Conakry"] = {container = "กินี"}, -- 2,975,000 (Consolidated Urban Area) ["Abidjan"] = {container = "โกตดิวัวร์"}, -- 7,050,000 (Agglomeration) ["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated) ["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City) ["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area) ["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated) ["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration) ["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City) ["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration) ["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City) ["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "ภูมิภาค"}}, -- 4,450,000 (Municipality (urban population)) ["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "ภูมิภาค"}}, -- 2,125,000 (Municipality (urban population)) ["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "ภูมิภาค"}}, -- 1,410,000 (Municipality (urban population)) ["Tanger"] = {alias_of = "Tangier", display = true}, ["Tangiers"] = {alias_of = "Tangier", display = true}, ["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "ภูมิภาค"}, wp = "%l, 
Morocco"}, -- 1,310,000 (Municipality (urban population)) ["Fes"] = {alias_of = "Fez", display = true}, ["Fès"] = {alias_of = "Fez", display = true}, ["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "ภูมิภาค"}}, -- 1,270,000 (Municipality (urban population)) ["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "ภูมิภาค"}}, -- 1,140,000 (Municipality (urban population)) ["Marrakech"] = {alias_of = "Marrakesh", display = true}, ["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration) ["Niamey"] = {container = "Niger"}, -- 1,530,000 (City) ["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration) ["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City) ["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population)) ["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration) ["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration) ["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration) ["Mogadishu"] = {container = "โซมาเลีย"}, -- 2,250,000 (unindicated; population of low reliability) ["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.) 
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "จังหวัด"}}, -- 5,100,000 (Consolidated Urban Area) ["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "จังหวัด"}}, -- 3,900,000 (Consolidated Urban Area) ["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 2,921,488 (2011 census) ["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "จังหวัด"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area) ["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias ["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability) ["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration) ["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration) ["Mwanza City"] = {alias_of = "Mwanza", display = true}, ["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration) ["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration) ["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated) ["Lome"] = {alias_of = "Lomé", display = true}, ["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population)) ["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population)) ["Soussa"] = {alias_of = "Sousse", display = true}, ["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated) ["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area) ["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration) ------------------ Asia ------------------- -- sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. 
["Kabul"] = {container = "อัฟกานิสถาน"}, -- 5,250,000 (Agglomeration) ["Baku"] = {container = "อาเซอร์ไบจาน"}, -- 3,725,000 (Administrative Area (urban population)) ["Manama"] = {container = "บาห์เรน"}, -- 1,560,000 (unindicated) ["Dhaka"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 23,100,000 (Agglomeration) ["Dacca"] = {alias_of = "Dhaka", display = true}, ["Chittagong"] = {container = {key = "Chittagong Division, บังกลาเทศ", placetype = "division"}}, -- 5,050,000 (Agglomeration) ["Gazipur"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area) ["Khulna"] = {container = {key = "Khulna Division, บังกลาเทศ", placetype = "division"}}, -- 1,210,000 (Agglomeration) ["Phnom Penh"] = {container = "กัมพูชา"}, -- 2,925,000 (Agglomeration) ["Tehran"] = {container = {key = "Tehran, อิหร่าน", placetype = "จังหวัด"}}, -- 16,800,000 (Agglomeration) ["Teheran"] = {alias_of = "Tehran", display = true}, ["Mashhad"] = {container = {key = "Razavi Khorasan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,475,000 (Agglomeration) ["Mashad"] = {alias_of = "Mashhad", display = true}, ["Meshhed"] = {alias_of = "Mashhad", display = true}, ["Meshed"] = {alias_of = "Mashhad", display = true}, ["Isfahan"] = {container = {key = "Isfahan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,425,000 (Agglomeration) ["Esfahan"] = {alias_of = "Isfahan", display = true}, ["Tabriz"] = {container = {key = "East Azerbaijan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,970,000 (Agglomeration) ["Shiraz"] = {container = {key = "Fars, อิหร่าน", placetype = "จังหวัด"}}, -- 1,950,000 (Agglomeration) ["Ahvaz"] = {container = {key = "Khuzestan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,550,000 (Agglomeration) ["Qom"] = {container = {key = "Qom, อิหร่าน", placetype = "จังหวัด"}}, -- 1,450,000 (City) ["Kermanshah"] = {container = {key = "Kermanshah, อิหร่าน", placetype = "จังหวัด"}}, -- 
1,130,000 (City) ["Baghdad"] = {container = "อิรัก"}, -- 7,800,000 (Administrative Area (urban population)) ["Basra"] = {container = "อิรัก"}, -- 1,710,000 (Administrative Area (urban population)) ["Mosul"] = {container = "อิรัก"}, -- 1,550,000 (Administrative Area (urban population)) ["Erbil"] = {container = "อิรัก"}, -- 1,220,000 (Administrative Area (urban population)) ["Kirkuk"] = {container = "อิรัก"}, -- 1,160,000 (Administrative Area (urban population)) ["Najaf"] = {container = "อิรัก"}, -- 1,050,000 (Administrative Area (urban population)) ["Tel Aviv"] = {container = "อิสราเอล"}, -- 3,000,000 (Agglomeration) -- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a -- [[w:corpus separatum]], so put the container as "เอเชีย" and list Israel and Palestine as additional parents for -- categorization purposes. ["Jerusalem"] = {container = {key = "เอเชีย", placetype = "ทวีป"}, addl_parents = {"อิสราเอล", "Palestine"}}, -- 1,080,000 (Agglomeration) ["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated) ["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated) ["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration) ["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize ["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration) ["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration) ["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration) ["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration) ["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability) -- Kuala Lumpur is a federal capital city, not in any state ["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration) -- there are various George Towns and Georgetowns ["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "รัฐ"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration) 
["George Town"] = {alias_of = "George Town, Malaysia"}, ["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City) ["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true}, ["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population)) ["Rangoon"] = {alias_of = "Yangon", display = true}, ["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population)) ["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration) -- Pyongyang is a directly governed city, not in any province ["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population)) ["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration) ["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated) ["Gaza City"] = {alias_of = "Gaza"}, ["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration) ["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated) ["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability) ["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability) ["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City) ["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration) -- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia -- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]] ["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "จังหวัด"}}, ["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "จังหวัด"}}, -- 1,570,000 (Agglomeration; including Pattaya) -- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021; -- second source is citypopulation.de reference date 2025-01-01. 
["Istanbul"] = {placetype = {"นคร", "จังหวัด"}, divs = {"อำเภอ"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration) ["İstanbul"] = {alias_of = "Istanbul", display = true}, ["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "จังหวัด"}}, -- 5.15 million; 5,200,000 (Agglomeration) ["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "จังหวัด"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration) ["İzmir"] = {alias_of = "Izmir", display = true}, ["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "จังหวัด"}}, -- 2.02 million; 2,200,000 (Agglomeration) ["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "จังหวัด"}}, -- 1.77 million; 1,780,000 (Agglomeration) ["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "จังหวัด"}}, -- 1.71 million; 1,750,000 (Agglomeration) ["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "จังหวัด"}}, -- 1.3 million; 1,400,000 (Agglomeration) ["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "จังหวัด"}}, -- 1.35 million; 1,390,000 (Agglomeration) ["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "จังหวัด"}}, -- 1.07 million; 1,100,000 (Agglomeration) -- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not -- display-canonicalize to the Turkish form Diyarbakır. 
["Diyarbakir"] = {alias_of = "Diyarbakır"}, ["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "จังหวัด"}}, -- 1.03 million; 1,060,000 (Agglomeration) ["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration) ["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah) ["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City) ["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai) ["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated) ["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability) ["Sana'a"] = {alias_of = "Sanaa", display = true}, ["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia) ------------------ Europe or Europe-like (Caucasus etc.) --------------------- ["Yerevan"] = {container = "อาร์มีเนีย"}, -- 1,520,000 (Agglomeration) ["Vienna"] = {container = "ออสเตรีย"}, -- 2,375,000 (Agglomeration) ["Minsk"] = {container = "เบลารุส"}, -- 2,100,000 (unindicated) ["Brussels"] = {container = "เบลเยียม"}, -- 2,800,000 (Consolidated Urban Area) ["Antwerp"] = {container = "เบลเยียม"}, -- 1,270,000 (Consolidated Urban Area) ["Sofia"] = {container = "บัลแกเรีย"}, -- 1,260,000 (Agglomeration) ["Zagreb"] = {container = "โครเอเชีย"}, ["Prague"] = {container = "สาธารณรัฐเช็ก"}, -- 1,470,000 (Agglomeration) ["Brno"] = {container = "สาธารณรัฐเช็ก"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office) ["Olomouc"] = {container = "สาธารณรัฐเช็ก"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms) ["Copenhagen"] = {container = "เดนมาร์ก"}, -- 1,800,000 (Consolidated Urban Area) ["Helsinki"] = {container = {key = "Uusimaa, ฟินแลนด์", placetype = "ภูมิภาค"}}, -- 1,560,000 (Consolidated Urban Area) ["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 
(Agglomeration) ["Athens"] = {container = "กรีซ"}, ["Thessaloniki"] = {container = "กรีซ"}, ["Budapest"] = {container = "ฮังการี"}, -- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region" ["Dublin"] = {container = {key = "County Dublin, ไอร์แลนด์", placetype = "เทศมณฑล"}}, ["Riga"] = {container = "Latvia"}, ["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "จังหวัด"}}, ["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}}, ["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}}, -- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it. ["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "ภูมิภาค"}}, ["Oslo"] = {container = {key = "Oslo, Norway", placetype = "เทศมณฑล"}}, ["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}}, ["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Krakow" without accent. ["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"}, ["Kraków"] = {alias_of = "Krakow", display = true}, ["Cracow"] = {alias_of = "Krakow", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent. ["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}}, ["Gdansk"] = {alias_of = "Gdańsk", display = true}, ["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}}, ["Poznan"] = {alias_of = "Poznań", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Lodz" without accents. 
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"}, ["Łódź"] = {alias_of = "Lodz", display = true}, ["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}}, ["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}}, ["Oporto"] = {alias_of = "Porto", display = true}, ["Bucharest"] = {container = "Romania"}, ["Belgrade"] = {container = "Serbia"}, ["Stockholm"] = {container = "Sweden"}, ["Zurich"] = {container = "Switzerland"}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Zurich" without umlaut. --- Even Wikipedia uses the form without umlaut. ["Zürich"] = {alias_of = "Zurich", display = true}, ["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast -- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common. ["Kiev"] = {alias_of = "Kyiv"}, ["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}}, ["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"}, -- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement. ["Odesa"] = {alias_of = "Odessa"}, ------------------ North America, South America --------------------- -- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01); -- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data; -- Wikipedia city limits figures from [[w:List of largest cities in the Americas]]. 
["Buenos Aires"] = {container = "อาร์เจนตินา"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia) ["Córdoba, Argentina"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia) -- to avoid confusion with Córdoba in Spain ["Córdoba"] = {alias_of = "Córdoba, Argentina"}, ["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"}, ["Rosario"] = {container = "อาร์เจนตินา", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia) ["Mendoza"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area) ["San Miguel de Tucumán"] = {container = "อาร์เจนตินา"}, -- 1,110,000 (Consolidated Urban Area) ["Tucumán"] = {alias_of = "San Miguel de Tucumán"}, ["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"}, ["Santa Cruz de la Sierra"] = {container = "โบลิเวีย"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia) ["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"}, ["La Paz"] = {container = "โบลิเวีย"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz) ["El Alto"] = {container = "โบลิเวีย"}, ["Cochabamba"] = {container = "โบลิเวีย"}, -- 1,280,000 (Consolidated Urban Area) ["Santiago"] = {container = "ชิลี"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? 
per Wikipedia) ["Valparaíso"] = {container = "ชิลี"}, -- 1,060,000 (Consolidated Urban Area) ["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area) ["Bogotá"] = {container = "โคลอมเบีย"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia) ["Bogota"] = {alias_of = "Bogotá", display = true}, ["Medellín"] = {container = "โคลอมเบีย"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia) ["Medellin"] = {alias_of = "Medellín", display = true}, ["Cali"] = {container = "โคลอมเบีย"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia) ["Barranquilla"] = {container = "โคลอมเบีย"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia) ["Bucaramanga"] = {container = "โคลอมเบีย"}, -- 1,380,000 (Agglomeration) ["Cartagena, Colombia"] = {container = "โคลอมเบีย", wp = "%l, %c"}, -- 1,250,000 (Agglomeration) -- to avoid confusion with Cartagena, Spain ["Cartagena"] = {alias_of = "Cartagena, Colombia"}, ["Cúcuta"] = {container = "โคลอมเบีย"}, -- 1,130,000 (Agglomeration) ["Cucuta"] = {alias_of = "Cúcuta", display = true}, -- to avoid conflict with San Jose, California ["San José, Costa Rica"] = {container = "คอสตาริกา", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia) ["San José"] = {alias_of = "San José, Costa Rica"}, ["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME ["Havana"] = {container = "คิวบา"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia) ["Santo Domingo"] = {container = "สาธารณรัฐโดมินิกัน"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia) ["Guayaquil"] = {container = "เอกวาดอร์"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia) ["Quito"] = {container = "เอกวาดอร์"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? 
per Wikipedia) ["San Salvador"] = {container = "เอลซัลวาดอร์"}, -- 1,580,000 (Municipality (urban population)) ["Guatemala City"] = {container = "กัวเตมาลา"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia) ["Port-au-Prince"] = {container = "เฮติ"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia) ["San Pedro Sula"] = {container = "ฮอนดูรัส"}, -- 1,330,000 (Consolidated Urban Area) ["Tegucigalpa"] = {container = "ฮอนดูรัส"}, -- 1,220,000 (Urban Area) ["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area) ["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area) ["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population)) ["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia) ["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration) ["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area) ["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia) ["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia) ["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? 
per Wikipedia) -- to avoid confusion with Valencia (city and autonomous community of Spain) ["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area) ["Valencia"] = {alias_of = "Valencia, Venezuela"}, ["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area) ["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area) } export.misc_cities_group = { canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"), default_placetype = "นคร", data = export.misc_cities, } --[==[ var: List of all known locations, in groups. The first group lists continents and continental regions, followed by three groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities (administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the hundreds). 
]==] export.locations = { export.continents_group, export.countries_group, export.country_like_entities_group, export.former_countries_group, export.australia_group, export.austria_group, export.bangladesh_group, export.brazil_group, export.canada_group, export.china_group, export.china_prefecture_level_cities_group, export.china_prefecture_level_cities_group_2, export.egypt_group, export.finland_group, export.france_group, export.france_departments_group, export.germany_group, export.greece_group, export.india_group, export.indonesia_group, export.iran_group, export.ireland_group, export.italy_group, export.japan_group, export.laos_group, export.lebanon_group, export.malaysia_group, export.malta_group, export.mexico_group, export.moldova_group, export.morocco_group, export.netherlands_group, export.new_zealand_group, export.nigeria_group, export.north_korea_group, export.norway_group, export.pakistan_group, export.philippines_group, export.poland_group, export.portugal_group, export.romania_group, export.russia_group, export.saudi_arabia_group, export.south_africa_group, export.south_korea_group, export.spain_group, export.taiwan_group, export.thailand_group, export.turkey_group, export.ukraine_group, export.united_kingdom_group, export.united_states_group, export.england_group, export.northern_ireland_group, export.scotland_group, export.wales_group, export.vietnam_group, export.australia_cities_group, export.brazil_cities_group, export.canada_cities_group, export.france_cities_group, export.germany_cities_group, export.india_cities_group, export.indonesia_cities_group, export.italy_cities_group, export.japan_cities_group, export.mexico_cities_group, export.nigeria_cities_group, export.pakistan_cities_group, export.philippines_cities_group, export.russia_cities_group, export.saudi_arabia_cities_group, export.south_korea_cities_group, export.spain_cities_group, export.taiwan_cities_group, export.united_kingdom_cities_group, export.united_states_cities_group, 
export.new_york_boroughs_group, export.vietnam_cities_group, export.misc_cities_group, } return export ofhmtq58gfdveqagct5z7fna2kem0zu 5720698 5720697 2026-04-21T01:41:27Z OctraBot 3198 5720698 Scribunto text/plain local export = {} export.force_cat = false -- set to true to force category generation even on non-mainspace pages local m_table = require("Module:table") local string_utilities_module = "Module:string utilities" local en_utilities_module = "Module:en-utilities" local insert = table.insert local concat = table.concat local dump = mw.dumpObject local unpack = unpack or table.unpack -- Lua 5.2 compatibility --[==[ intro: This module contains data on all known locations, along with some lower-level code to process them (higher-level known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using mw.loadData(). ===Location data=== '''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]], especially the section `More about known locations`.''' The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table'' that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. 
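As a rough illustration of the data-table/metadata-table pairing just described, the following minimal sketch builds one tiny group. The names `example_cities`, `example_cities_group`, `locations` (as a local) and `count_entries` are hypothetical and exist only for this example; the two city entries and the `default_placetype` value are modeled on `misc_cities` and `misc_cities_group`.

```lua
-- Hypothetical miniature of a location group; illustrative only.
-- Data table: lists the locations and their individual properties.
local example_cities = {
	["Kigali"] = {container = "Rwanda"},
	["Dakar"] = {container = "Senegal"},
}

-- Metadata table: group-level defaults, plus a pointer to the data table
-- in its `data` field.
local example_cities_group = {
	default_placetype = "นคร",
	data = example_cities,
}

-- Stand-in for the global list of all group metadata tables.
local locations = {example_cities_group}

-- Hypothetical helper: count the entries reachable through the group list.
local function count_entries(groups)
	local n = 0
	for _, group in ipairs(groups) do
		for _ in pairs(group.data) do
			n = n + 1
		end
	end
	return n
end
```

Because each metadata table holds its data table directly, code that walks the list of groups never needs a separate lookup to find a group's locations.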
A given location is generally described by three values: (a) the group metadata table for the group the location is part of; (b) the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()` function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the arguments to many functions. In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases for a given location and the alias keys only need to be unique within a particular group data table, not across all groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations, canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have per-group defaults, but only global defaults. 
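The distinction between canonical keys and alias keys can be sketched as follows. This is a minimal sketch, not the module's actual lookup code: `resolve_key` is a hypothetical helper, and the two data tables and their container values are placeholders invented for illustration. Only the `alias_of` field and the `Newcastle` example come from the description above.

```lua
-- Illustrative data only; the container values are placeholders.
local england_cities = {
	["Newcastle upon Tyne"] = {container = "England"},
	["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
}
local australia_cities = {
	["Newcastle, New South Wales"] = {container = "New South Wales, Australia"},
	["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
}

-- Hypothetical helper (not part of the module): given one group's data table
-- and a key, return the canonical key and its location spec, following
-- `alias_of` when the key is an alias key. Returns nil for unknown keys.
local function resolve_key(data, key)
	local spec = data[key]
	if spec == nil then
		return nil
	end
	if spec.alias_of then
		return spec.alias_of, data[spec.alias_of]
	end
	return key, spec
end
```

The same alias string resolves to a different canonical key depending on which group's data table is consulted, which is why alias keys only need to be unique within a single group.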
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys: * Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories) and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and placenames, which is critical to understand when working with location data.) This also applies to constituent countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena, Ascension and Tristan da Cunha). * Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above. 
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`, `Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name in Spain, even though none of those cities are large enough to be included as known locations in this module. (The cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.) * Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent territories, use a qualified key that contains the name of the country or constituent country in it, e.g. `Normandy, ฝรั่งเศส` (a region), `Calvados, ฝรั่งเศส` (a department in the region of Normandy), `Herefordshire, England` (a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, ฟินแลนด์` (a region), `Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, ไอร์แลนด์` (a county) and `New York, USA` (a state). 
As shown in these various examples, (a) first- and second-level divisions are sometimes both included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States or United States of America); (c) the word `the` is not normally included in the key even if the location is normally preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this), except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates an apparent redundancy, as with `Central Finland, ฟินแลนด์`; and (e) sometimes the placetype is included in the key, as with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County" preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article naming scheme for a given administrative division is a strong clue as to how the division is normally referred to, and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. 
North and South Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.) As mentioned above, associated with canonical keys in the group data table are location specs, which are objects containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''. Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a given location property. (The initialization process also does more transformations in a few cases, noted below.) Note that the default value of a given property is stored under a key in the group metadata table that is preceded by the string `default_`; for example, the default value corresponding to the `placetype` property of a given location is specified in the `default_placetype` key in the group metadata table. The following are the properties of the location spec. * `placetype`: String specifying the placetype of the location (e.g. "ประเทศ", "รัฐ", "province"). This can also be a table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the group level, or an error occurs. 
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the immediate ''container'' (or containers) of the given location. A container is another location which this location is considered to be directly part of, either politically or (above the country level) geographically. Some locations belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]]) of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed the ''container trail'', and some functions compute and return this trail as part of their operation. When a location spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a list of canonicalized container structures, each of which is of the form `{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the placetype from the container structure.) The list of canonicalized container structures is stored into the `.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The canonicalization process is described in more detail below under [[#Container spec canonicalization]]. * `divs`: List of recognized political divisions; e.g. 
for the Netherlands, a specification of the form `divs = {"จังหวัด", "เทศบาล"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]] and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the same format as `divs`. This is intended to be used in the situation where some division types are shared among all locations in the group and others differ from location to location. An example where this is used is the United States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties` and `county seats` are specified in the group-level `default_divs` because not all states have counties and county seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property associated with the division type), any division type specified on a sub-country-level location must also be specified on all containers up through the country. For example, since French departments specify `communes` and `municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for France itself. 
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is normal for locations) that computes the location description can also be given. This is used, for example, for Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the keydesc is replaced with the default value of the location description, which specifies the location's placename, placetype, and the corresponding values for each container in the container trail, generally up through (but not beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct the full description of various categories, such as bare location categories, whose description generally reads `"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the specified or auto-constructed location description. * `fulldesc`: String overriding the full description for the bare location category (but not for any other category). This is currently used only for the location `Earth`, at the very top of the tree (because the standard `people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent inhabitants). FIXME: This should be renamed `bare_category_fulldesc`. * `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories generated based on the immediate container(s). 
For example, `Hawaii, USA` specifies `Polynesia` as an additional parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category) as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME: This should be renamed `bare_category_addl_parents`. * `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase `province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category pages, are shown in the upper right of bare category pages. 
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`. It rarely needs to be specified because the category page and the article page almost always follow the same format. * `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and `wpcat` and defaults to `wpcat`, which is usually (but not always) correct. * `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in category names such as [[:Category:Cities in the Northern Territory, ออสเตรเลีย]] and in old-style place descriptions when the location occurs as the first holonym, such as the city [[Darwin]] described using {{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean properties is {nil}, which amounts to the same as {false}. * `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as [[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. 
The general principle used in setting this is that all countries in Europe, all dependent territories of any such country, all former British colonies, and any dependent territories of these former colonies, are assumed to use British spelling, while all other countries and associated dependent territories are assumed to use American spelling. This can potentially be modified on a case-by-case basis. * `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire, Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and (through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever the group-level `default_placetype == "นคร"`, so that all cities get it set without explicitly needing to add a group-level setting for this. Note that the condition `default_placetype == "นคร"` intentionally excludes Chinese prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods, but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to categories like [[:Category:Rivers in Osaka, ญี่ปุ่น]] and [[:Category:Cities in Wuhan]] for holonyms that are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like [[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities; (c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location. (Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those that can occur with non-cities have a `generic_before_non_cities` setting.) 
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`. * `overriding_bare_label_parents`: Document me! * `bare_category_parent_type`: Document me! * `no_container_cat`: Document me! * `no_container_parent`: Document me! * `no_generic_place_cat`: Document me! * `no_check_holonym_mismatch`: Document me! * `no_auto_augment_container`: Document me! * `no_include_container_in_desc`: Document me! ====Location divisions==== The `divs` field of a location describes the recognized political division types of that location. Specifying a given division type will cause any place defined as being of that division type, and with the location as a holonym, to be categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United States has `"รัฐ"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under [[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for "generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a `generic_before_cities` field if the location is a city); this includes things like cities, towns, villages, neighbo(u)rhoods and rivers. 
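As a rough illustration of the naming pattern described above, the following Lua sketch builds the category name for a plain string entry in `divs`. The helper `division_category` and its parameters are hypothetical and are not part of this module; the sketch assumes an English plural placetype and the default preposition `of`, and it ignores object-valued entries.

```lua
-- Hypothetical helper (not part of this module): build the category name
-- implied by a plain string entry in a location's `divs` list.
-- Assumes the default preposition "of"; object-valued entries are ignored.
local function division_category(langcode, div, placename, use_the)
	-- Capitalize the first letter of the plural placetype, e.g. "states" -> "States".
	-- gsub returns two values; only the first (the new string) is kept.
	local label = div:gsub("^%l", string.upper)
	-- `use_the` plays the role of the location-level `the` property.
	local article = use_the and "the " or ""
	return ("Category:%s:%s of %s%s"):format(langcode, label, article, placename)
end

print(division_category("fr", "states", "United States", true))
-- prints "Category:fr:States of the United States"
```

The real module derives the preposition, article and category label from the location spec and the placetype data rather than from explicit arguments, so this should be read only as a picture of the resulting name.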
A given element in the `divs` list is usually a string naming a plural placetype; the placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field (if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which gets most things right but has problems with `passes` and `fortresses`, which are singularized to `passe` and `fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with the placetype. An example of this is the `divs` list for Canada: { ["แคนาดา"] = {divs = { {type = "รัฐ", cat_as = "รัฐและดินแดน"}, {type = "ดินแดน", cat_as = "รัฐและดินแดน"}, "เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities", "rural municipalities", "parishes", "Indian reserves", "census divisions", {type = "townships", prep = "ใน"}, }, ...}, } Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and territories. Similar things are done for other countries that have more than one type of first-level administrative division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for use in categories and won't be recognized as the placetype field in a {{tl|place}} description). 
In addition, townships are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be [[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat related to whether a given placetype is an official administrative or statistical division of the location in question and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities and towns.) Another more complex example is the divisions given for Quebec: { ["Quebec, Canada"] = {divs = { "เทศมณฑล", {type = "regional county municipalities", container_parent_type = "regional municipalities"}, {type = "ภูมิภาค", container_parent_type = false}, {type = "townships", prep = "ใน"}, {type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}}, {type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}}, {type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}}, }, ...}, } Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the `container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be [[:Category:Regional municipalities of Canada]]. 
Likewise, `regions` as administrative divisions (as opposed to mere geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and `village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize `parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties, just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "เทศมณฑล"}` means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly, `township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not'' [[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]]. ====Container spec canonicalization==== A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'', each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The `placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and removes the spec from `.container`. 
It works as follows: # If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place. For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies `default_container = "บราซิล"`. # A single string or canonicalized container object is allowed and made into a one-element list. # If a list element is a string that did ''not'' come from `default_container`, and there is a group-level `canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get a canonicalized container object. # Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to `"ประเทศ"`. ====Alias keys==== Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec structure from canonical keys. This structure does not, in general, have defaults at the group level and is not initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location spec: * `alias_of`: The canonical key of which this key is an alias. Required. * `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the` but does not pay attention to the value of `the` for the corresponding canonical key. * `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise, the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display canonicalizing. * `placetype`: The placetype of the alias. 
Rarely needs to be specified as it defaults to the canonical key's placetype, and if that is unspecified, to the group-level default placetype. ====Location group metadata tables==== As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only keys, which are mostly functions. The following are the possible group-only keys: * `data`: This points to the group data table for the group, as described above. * `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias) into the full and elliptical placenames. The difference between full and elliptical placenames is described in the documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g. `Phuket Province, Thailand` or `County Mayo, ไอร์แลนด์`), in which case the full placename includes the placetype and the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or `Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is `Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as `State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs. 
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key, and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to chop off anything starting with a comma and return the result as both full and elliptical placename, and if specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be defined, it is best to use the helper function `make_key_to_placename`, if possible (or `make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default implementation and such) rather than directly calling the function in the `key_to_placename` field. * `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this (generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or `make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to `key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged as the key. Otherwise, the default algorithm works as follows: *# If the group-level `default_placetype == "นคร"`, use the placename unchanged as the key. *# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma + space and use the result as the key. 
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and `placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field to the placename after a comma + space and use the result as the key. *# Otherwise, use the placename unchanged as the key. * `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string, to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own. * `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the location. See [[#Location divisions]] for more details. ]==] ----------------------------------------------------------------------------------- -- Helper functions -- ----------------------------------------------------------------------------------- --[==[ Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like this). ]==] function export.process_error(fmt, ...) local args = {...} for i = 1, select("#", ...) do args[i] = dump(args[i]) end return error(string.format(fmt, unpack(args))) end --[==[ Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user error triggered by bad input or a system error due to something like running out of memory or hitting a time limit). 
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. ]==] function export.internal_error(fmt, ...) export.process_error("Internal error: " .. fmt, ...) end local internal_error = export.internal_error -- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If -- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item` -- equals `list_or_element`. local function list_or_element_contains(list_or_element, item) if type(list_or_element) == "table" then return m_table.contains(list_or_element, item) and true or false end return list_or_element == item end --[==[ Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full `"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical placenames are computed by chopping off anything starting with a comma. ]==] function export.key_to_placename(group, key) if group.key_to_placename == false then return key, key end if group.key_to_placename then local full_placename, elliptical_placename = group.key_to_placename(key) if type(full_placename) ~= "string" then internal_error("Key %s returned a non-string full placename: %s", key, full_placename) end if type(elliptical_placename) ~= "string" then internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename) end return full_placename, elliptical_placename end key = key:gsub(",.*", "") return key, key end --[==[ Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames) and return the result. 
If `placename_to_key` exists with the value `false`, return the placename unchanged. If the group does not define `placename_to_key`, and it defines a `default_container` whose placetype is either `country` or `constituent country`, the container name is appended to the placename after a comma and a space. Otherwise the placename is returned unchanged. ]==] function export.placename_to_key(group, placename) if group.placename_to_key == false then return placename elseif group.placename_to_key then local key = group.placename_to_key(placename) if type(key) ~= "string" then internal_error("Placename %s returned a non-string key: %s", placename, key) end return key elseif group.default_placetype == "นคร" then return placename else local defcon = group.default_container if not defcon then return placename elseif type(defcon) == "string" then return placename .. ", " .. defcon elseif type(defcon) == "table" and (defcon.placetype == "ประเทศ" or defcon.placetype == "constituent country") then return placename .. ", " .. defcon.key else return placename end end end --[==[ Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and `placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more than one. Containers should be carefully distinguished from category parents. Generally the container is the first category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents, which indicate some sort of relation between the category parent and the location but not necessarily one of containment.) This function is idempotent in that nothing happens if called more than once on the same spec. 
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables. ]==] function export.initialize_spec(group, key, spec) if spec.initialized then return end local container = spec.container local containers local container_from_default if not container then container = group.default_container container_from_default = true end if container then if type(container) == "string" or container.key then container = {container} end containers = {} for _, cont in ipairs(container) do if type(cont) == "string" then if group.canonicalize_key_container and not container_from_default then cont = group.canonicalize_key_container(cont) else cont = {key = cont, placetype = "ประเทศ"} end end insert(containers, cont) end end spec.containers = containers spec.container = nil local function value_with_default(val, default_val) if val == nil then return default_val else return val end end local function set_or_default(prop) spec[prop] = value_with_default(spec[prop], group["default_" .. prop]) end set_or_default("placetype") if not spec.placetype then internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec) end set_or_default("divs") spec.addl_divs = group.addl_divs for _, prop in ipairs { "keydesc", "fulldesc", "addl_parents", "overriding_bare_label_parents", "bare_category_parent_type", "wp", "wpcat", "commonscat", "british_spelling", "the", "no_container_cat", "no_container_parent", "no_generic_place_cat", "no_check_holonym_mismatch", "no_auto_augment_container", "no_include_container_in_desc", "is_city", "is_former_place", } do set_or_default(prop) end -- `default_placetype == "นคร"` is correct; if `default_placetype` has something else like `prefecture-level city` -- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as -- is_city. 
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "นคร") spec.initialized = true end --[=[ Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values: the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object, which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the property in question). `alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"} except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key, and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string. 

This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
	if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
		alias_resolution ~= "all" then
		internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
	end
	local spec = group.data[key]
	if not spec then
		return nil
	end
	local function check_correct_placetype(placetype)
		if type(placetype) == "table" then
			for _, pt in ipairs(placetype) do
				if list_or_element_contains(placetypes, pt) then
					return true
				end
			end
			return false
		else
			return list_or_element_contains(placetypes, placetype)
		end
	end
	if spec.alias_of then
		local resolved_key = spec.alias_of
		local resolved_spec = group.data[resolved_key]
		if not resolved_spec then
			internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
		elseif resolved_spec.alias_of then
			internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
				key, resolved_key)
		end
		if alias_resolution == "none" or alias_resolution == "display" then
			-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
			local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
			if not placetype then
				internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
					"`default_placetype`", key, spec, resolved_spec)
			end
			if not check_correct_placetype(placetype) then
				return nil
			end
			if alias_resolution == "display" then
				if spec.display == true then
					key = resolved_key
				elseif spec.display then
					key = spec.display
				end
			end
			return key, spec
		end
		key = resolved_key
		spec = resolved_spec
	end
	-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
	local placetype = spec.placetype or group.default_placetype
	if not placetype then
		internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
	end
	if not check_correct_placetype(placetype) then
		return nil
	end
	export.initialize_spec(group, key, spec)
	return key, spec
end

--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists
in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one
of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.

This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
	local key = export.placename_to_key(group, placename)
	return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end

--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec. If
no such key exists, return {nil}.
This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
	local found_locations = {}
	for _, group in ipairs(export.locations) do
		local spec = group.data[key]
		if not spec then
			-- do nothing
		elseif spec.alias_of then
			mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
		else
			insert(found_locations, {group, spec})
		end
	end
	if not found_locations[1] then
		return nil
	elseif found_locations[2] then
		internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
	else
		local group, spec = unpack(found_locations[1])
		export.initialize_spec(group, key, spec)
		return group, spec
	end
end

--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator returns
three values at each iteration: the location group, canonical key by which the location is known and the spec object
describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
  specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
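
The iteration pattern documented here (a closure that walks the groups and yields one match per group) can be sketched
in isolation. The `locations` list, the group names and the helpers `contains` and `iterate_matching` below are
hypothetical stand-ins, and alias handling and placename conversion are omitted.

```lua
-- Hypothetical stand-ins for the location groups; only the fields used here are modeled.
local locations = {
	{name = "countries", default_placetype = "country", data = {["Georgia"] = {}}},
	{name = "us_states", default_placetype = "state", data = {["Georgia"] = {}, ["Ohio"] = {}}},
}

local function contains(list, item)
	for _, v in ipairs(list) do
		if v == item then
			return true
		end
	end
	return false
end

-- A closure that advances through the groups and returns (group, key, spec) for each
-- group containing a match; falling off the end returns nothing, which ends the loop.
local function iterate_matching(data)
	local i = 0
	return function()
		while i < #locations do
			i = i + 1
			local group = locations[i]
			local spec = group.data[data.key]
			if spec and contains(data.placetypes, spec.placetype or group.default_placetype) then
				return group, data.key, spec
			end
		end
	end
end

local found = {}
for group, key in iterate_matching {key = "Georgia", placetypes = {"state", "country"}} do
	found[#found + 1] = group.name .. ":" .. key
end
print(table.concat(found, ", ")) -- countries:Georgia, us_states:Georgia
```

A key such as "Georgia" matching in more than one group is the situation in which a caller that expects exactly one
location has to decide how to disambiguate.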
]==]
function export.iterate_matching_location(data)
	local i = 0
	local n = #export.locations
	return function()
		while true do
			i = i + 1
			if i > n then
				break
			end
			local group = export.locations[i]
			local key, spec
			if data.placename then
				key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
					data.alias_resolution)
			else
				if not data.key then
					internal_error("'.placename' or '.key' must be defined: %s", data)
				end
				key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
			end
			if key then
				return group, key, spec
			end
		end
	end
end

--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark`, which is in New Jersey, not
Delaware).
]==]
function export.get_matching_location(data)
	local all_found = {}
	for group, key, spec in export.iterate_matching_location(data) do
		insert(all_found, {group, key, spec})
	end
	if not all_found[1] then
		internal_error("Couldn't find matching location for data %s", data)
	elseif all_found[2] then
		internal_error("Found multiple matching locations for data %s: %s", data, all_found)
	else
		return unpack(all_found[1])
	end
end

--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An internal
error is thrown if a probable container loop is detected, i.e. if the iterator is called more than 10 times. Each call
to the returned iterator yields the containers one level further up as a list of location objects, each of which
contains `group`, `key` and `spec` fields; the iterator returns {nil} once no unseen containers remain.
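
The level-by-level traversal with de-duplication can be sketched on a toy container graph. The `containers_of` table
and the helper `iterate_levels` are hypothetical and work on bare keys; the real iterator resolves each container to a
group, key and spec.

```lua
-- Hypothetical container graph mirroring the Russia example: each key lists its direct containers.
local containers_of = {
	["Russia"] = {"Europe", "Asia"},
	["Europe"] = {"Eurasia"},
	["Asia"] = {"Eurasia"},
	["Eurasia"] = {"Earth"},
	["Earth"] = {},
}

-- Each call returns the list of not-yet-seen containers one level further up, or nil when done.
local function iterate_levels(start_key)
	local seen = {}
	seen[start_key] = true
	local last_level = {start_key}
	local iterations = 0
	return function()
		iterations = iterations + 1
		if iterations > 10 then
			error("Probable loop in containers when processing key " .. start_key)
		end
		local next_level = {}
		for _, key in ipairs(last_level) do
			for _, container in ipairs(containers_of[key] or {}) do
				if not seen[container] then
					seen[container] = true
					next_level[#next_level + 1] = container
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		last_level = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_levels("Russia") do
	levels[#levels + 1] = table.concat(level, "+")
end
print(table.concat(levels, " > ")) -- Europe+Asia > Eurasia > Earth
```

Eurasia appears only once even though both Europe and Asia list it, because the `seen` table is shared across levels.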
]==]
function export.iterate_containers(group, key, spec)
	local keys_seen = {}
	keys_seen[key] = true
	local iterations = 0
	local last_iteration_containers = {{group = group, key = key, spec = spec}}
	return function()
		iterations = iterations + 1
		if iterations > 10 then
			internal_error("Probable loop in containers when processing key %s", key)
		end
		local next_iteration_containers = {}
		for _, location in ipairs(last_iteration_containers) do
			local containers = location.spec.containers
			if containers then
				for _, container in ipairs(containers) do
					local container_group, container_key, container_spec = export.get_matching_location {
						placetypes = container.placetype,
						key = container.key,
					}
					if not keys_seen[container_key] then
						insert(next_iteration_containers, {
							group = container_group, key = container_key, spec = container_spec
						})
						keys_seen[container_key] = true
					end
				end
			end
		end
		if not next_iteration_containers[1] then
			return nil
		end
		last_iteration_containers = next_iteration_containers
		return next_iteration_containers
	end
end

--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
	local linked_placename = display_form and placename ~= display_form and
		("[[%s|%s]]"):format(placename, display_form) or ("[[%s]]"):format(placename)
	if spec.the then
		linked_placename = "the " .. linked_placename
	end
	return linked_placename
end

--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.)

`container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of the key, which
will be used to remove that container. If multiple patterns are specified, each one is tried until one matches. If
`container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full placename.

If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match and remove
the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain countries
(such as South Korean and North Korean counties, which include the word "เทศมณฑล" in the key). The resulting chopped
string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped and the
full and elliptical placenames are the same.

Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
	if type(container_patterns) == "string" then
		container_patterns = {container_patterns}
	end
	if type(divtype_patterns) == "string" then
		divtype_patterns = {divtype_patterns}
	end
	return function(key)
		local full_placename = key
		if container_patterns then
			for _, container_pattern in ipairs(container_patterns) do
				local nsubs
				full_placename, nsubs = full_placename:gsub(container_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		local elliptical_placename = full_placename
		if divtype_patterns then
			for _, divtype_pattern in ipairs(divtype_patterns) do
				local nsubs
				elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		return full_placename, elliptical_placename
	end
end

--[=[
This is typically used to define `placename_to_key`.
It generates a function that appends a string to the end of a given placename to get the key (see the definition of
`placename_to_key` above in the documentation under "Location group tables"). Optional `divtype_suffix` is a raw string
(which should not contain hyphens or other characters that have special meaning in Lua patterns) to be appended first to
the placename; if already present at the end, it is not appended. `container_suffix` is then added in the same fashion
if given. Note that the lines marked `--th` in the code below have been changed to prepend `divtype_suffix` and
`container_suffix` to the placename instead of appending them, so the examples here illustrate the suffixing behavior
only.

Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`) or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
	return function(placename)
		local key = placename
		if divtype_suffix then
			if not key:find("^" .. divtype_suffix) then --th; changed to prepend instead
				key = divtype_suffix .. key --th
			end
		end
		if container_suffix then
			key = container_suffix .. key --th
		end
		return key
	end
end

--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do the
canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil} or
{""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left as-is.

Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "จังหวัด"}`.
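
Both branches (string container versus already-canonical table) can be exercised with a standalone copy of the
pattern. The name `make_canonicalizer` and the English placetypes below are hypothetical and used only for
illustration.

```lua
-- Standalone illustration of the factory pattern: strings are expanded, tables pass through.
local function make_canonicalizer(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container .. (suffix or ""), placetype = placetype}
		end
		return container
	end
end

local canonicalize = make_canonicalizer(", Canada", "province")

local expanded = canonicalize("Ontario")
print(expanded.key, expanded.placetype)

-- A container that is already a table is returned unchanged (the very same table).
local already = {key = "Earth", placetype = "planet"}
print(canonicalize(already) == already)

-- With a nil suffix, only the placetype is attached.
local bare = make_canonicalizer(nil, "continent")("Asia")
print(bare.key, bare.placetype)
```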
]=]
local function make_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container .. (suffix or ""), placetype = placetype}
		else
			return container
		end
	end
end


-----------------------------------------------------------------------------------
--                              Top-level tables                                 --
-----------------------------------------------------------------------------------

export.continents = {
	["โลก"] = {the = true, placetype = "ดาวเคราะห์", addl_parents = {"ธรรมชาติ"},
		fulldesc = "=the planet [[Earth]] and the features found on it"},
	["แอฟริกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
	["อเมริกา"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
		keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined", wp = "Americas"},
	["อเมริกาส์"] = {alias_of = "อเมริกา", the = true},
	["อเมริกาเหนือ"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
	["แคริบเบียน"] = {the = true, placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
	["อเมริกากลาง"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "อเมริกาเหนือ", placetype = "ทวีป"}},
	["อเมริกาใต้"] = {placetype = "ทวีป", container = {key = "อเมริกา", placetype = "มหาทวีป"}},
	["แอนตาร์กติกา"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"},
		fulldesc = "=the territory of [[Antarctica]]"},
	["ยูเรเชีย"] = {placetype = {"มหาทวีป", "ทวีป"}, container = {key = "โลก", placetype = "ดาวเคราะห์"},
		keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
	["เอเชีย"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
	["ยุโรป"] = {placetype = "ทวีป", container = {key = "ยูเรเชีย", placetype = "มหาทวีป"}},
	["โอเชียเนีย"] = {placetype = "ทวีป", container = {key = "โลก", placetype = "ดาวเคราะห์"}},
	["เมลานีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
	["ไมโครนีเชีย (ภูมิภาค)"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}}, -- duplicate name: region/federation
	["พอลินีเชีย"] = {placetype = {"continental region", "ภูมิภาค"}, container = {key = "โอเชียเนีย", placetype = "ทวีป"}},
}

export.continents_group = {
	default_overriding_bare_label_parents = {}, -- container parents should be used
	default_divs = {{type = "ประเทศ", prep = "ใน"}},
	-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
	-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
	default_no_include_container_in_desc = true,
	default_no_container_cat = true,
	default_no_container_parent = true,
	default_no_auto_augment_container = true,
	default_no_generic_place_cat = true,
	-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
	-- this level. We also run into problems with supercontinents, which have "ทวีป" as the fallback and cause
	-- mismatches.
	default_no_check_holonym_mismatch = true,
	data = export.continents,
}

-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = { ["อัฟกานิสถาน"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["แอลเบเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "communes", {type = "administrative units", cat_as = "communes"}, }, british_spelling = true}, ["แอลจีเรีย"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes", "อำเภอ", "เทศบาล"}}, ["อันดอร์รา"] = {container = "ยุโรป", divs = {"parishes"}, british_spelling = true}, ["แองโกลา"] = {container = "แอฟริกา", divs = {"จังหวัด", "เทศบาล"}}, ["แอนทีกาและบาร์บิวดา"] = {container = "แคริบเบียน", divs = {"จังหวัด"}, british_spelling = true}, ["อาร์เจนตินา"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}}, ["อาร์มีเนีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["สาธารณรัฐอาร์มีเนีย"] = {alias_of = "อาร์มีเนีย", the = true}, -- differs in "the" -- Both a country and continent ["ออสเตรเลีย"] = {container = "โอเชียเนีย", divs = { {type = "รัฐ", cat_as = "states and territories"}, {type = "ดินแดน", cat_as = "states and territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"}, "local government areas", "dependent territories", }, british_spelling = true}, ["ออสเตรีย"] = {container = "ยุโรป", divs = {"รัฐ", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["อาเซอร์ไบจาน"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ", "เทศบาล"}, british_spelling = true}, ["บาฮามาส"] = {the = true, container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true, wp = "The %l"}, ["บาห์เรน"] = {container = "เอเชีย", divs = {"governorates"}}, ["บังกลาเทศ"] = {container = "เอเชีย", divs = {"divisions", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["บาร์เบโดส"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["เบลารุส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = 
true}, ["เบลเยียม"] = {container = "ยุโรป", divs = {"ภูมิภาค", "จังหวัด", "เทศบาล"}, british_spelling = true}, ["เบลีซ"] = {container = "อเมริกากลาง", divs = {"อำเภอ"}, british_spelling = true}, ["เบนิน"] = {container = "แอฟริกา", divs = {"departments", "communes"}}, ["ภูฏาน"] = {container = "เอเชีย", divs = {"อำเภอ", "gewogs"}}, ["โบลิเวีย"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "departments", "เทศบาล"}}, ["บอสเนียและเฮอร์เซโกวีนา"] = {container = "ยุโรป", divs = {"entities", "cantons", "เทศบาล"}, british_spelling = true}, --["Bosnia and Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอสเนีย-เฮอร์เซโกวีนา"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, --["Bosnia-Hercegovina"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอสเนีย"] = {alias_of = "บอสเนียและเฮอร์เซโกวีนา", display = true}, ["บอตสวานา"] = {container = "แอฟริกา", divs = {"อำเภอ", "ตำบล"}, british_spelling = true}, ["บราซิล"] = {container = "อเมริกาใต้", divs = { "รัฐ", "เทศบาล", "macroregions", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["บรูไน"] = {container = "เอเชีย", divs = {"อำเภอ", "mukims"}, british_spelling = true}, ["บัลแกเรีย"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศบาล"}, british_spelling = true}, ["บูร์กินาฟาโซ"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments", "จังหวัด"}}, ["บุรุนดี"] = {container = "แอฟริกา", divs = {"จังหวัด", "communes"}}, ["กัมพูชา"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["แคเมอรูน"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["แคนาดา"] = {container = "อเมริกาเหนือ", divs = { {type = "รัฐ", cat_as = "รัฐและดินแดน"}, --ตาม thwiki {type = "ดินแดน", cat_as = "รัฐและดินแดน"}, {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of รัฐและดินแดน"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of รัฐและดินแดน"}, "เทศมณฑล", "อำเภอ", "เทศบาล", "regional municipalities", "rural municipalities", 
"parishes", -- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless -- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is -- still at [[w:Indian reserves]]). "Indian reserves", "census divisions", {type = "townships", prep = "ใน"}, }, british_spelling = true}, ["กาบูเวร์ดี"] = {container = "แอฟริกา", divs = {"เทศบาล", "parishes"}}, ["เคปเวิร์ด"] = {alias_of = "กาบูเวร์ดี", display = true}, ["สาธารณรัฐแอฟริกากลาง"] = {the = true, container = "แอฟริกา", divs = {"prefectures", "subprefectures"}}, ["CAR"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true}, ["C.A.R"] = {alias_of = "สาธารณรัฐแอฟริกากลาง", display = true, the = true}, ["ชาด"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["ชิลี"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "communes"}}, ["จีน"] = {container = "เอเชีย", divs = { {type = "มณฑล", cat_as = "provinces and autonomous regions"}, --ตาม thwiki {type = "autonomous regions", cat_as = "provinces and autonomous regions"}, {type = "FORMER provinces", cat_as = "former provinces"}, "special administrative regions", "จังหวัด", --ตาม thwiki {type = "FORMER prefectures", cat_as = "former prefectures"}, "prefecture-level cities", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, {type = "FORMER counties", cat_as = "former counties and county-level cities"}, {type = "FORMER county-level cities", cat_as = "former counties and county-level cities"}, -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities. 
"อำเภอ", {type = "FORMER districts", cat_as = "former districts"}, "ตำบล", "townships", "เทศบาล", {type = "direct-administered municipalities", cat_as = "เทศบาล"}, }}, ["สาธารณรัฐประชาชนจีน"] = {alias_of = "จีน", the = true}, -- differs in "the" ["โคลอมเบีย"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}}, ["คอโมโรส"] = {the = true, container = "แอฟริกา", divs = {"autonomous islands"}}, ["คอสตาริกา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "cantons"}}, ["โครเอเชีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["คิวบา"] = {container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}}, ["ไซปรัส"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, british_spelling = true}, ["สาธารณรัฐเช็ก"] = {the = true, container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ", "เทศบาล"}, british_spelling = true}, ["เช็กเกีย"] = {alias_of = "สาธารณรัฐเช็ก"}, -- differs in "the" ["สาธารณรัฐประชาธิปไตยคองโก"] = {the = true, container = "แอฟริกา", divs = {"จังหวัด", "ดินแดน"}}, ["คองโก"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["DRC"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["D.R.C"] = {alias_of = "สาธารณรัฐประชาธิปไตยคองโก", display = true, the = true}, ["เดนมาร์ก"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "dependent territories"}, british_spelling = true, -- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country) }, ["จิบูตี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["ดอมินีกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["สาธารณรัฐโดมินิกัน"] = {the = true, container = "แคริบเบียน", divs = {"จังหวัด", "เทศบาล"}, keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"}, ["ติมอร์-เลสเต"] = {container = "เอเชีย", divs = {"เทศบาล"}, wp = "ติมอร์-เลสเต"}, ["ติมอร์ตะวันออก"] = {alias_of = "ติมอร์-เลสเต", display = true}, 
["เอกวาดอร์"] = {container = "อเมริกาใต้", divs = {"จังหวัด", "cantons"}}, ["อียิปต์"] = {container = "แอฟริกา", divs = {"governorates", "ภูมิภาค"}, british_spelling = true}, ["เอลซัลวาดอร์"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["อิเควทอเรียลกินี"] = {container = "แอฟริกา", divs = {"จังหวัด"}}, ["เอริเทรีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "subregions"}}, ["เอสโตเนีย"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["เอสวาตินี"] = {container = "แอฟริกา", british_spelling = true}, ["สวาซีแลนด์"] = {alias_of = "เอสวาตินี", display = true}, ["เอธิโอเปีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "zones"}}, ["สหพันธรัฐไมโครนีเชีย"] = {the = true, container = "ไมโครนีเชีย", divs = {"รัฐ"}}, ["ไมโครนีเชีย"] = {alias_of = "สหพันธรัฐไมโครนีเชีย"}, --ชื่อซ้ำกัน: ภูมิภาค/สหพันธรัฐ ["ฟีจี"] = {container = "เมลานีเชีย", divs = {"divisions", "จังหวัด"}, british_spelling = true}, ["ฟินแลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["ฝรั่งเศส"] = {container = "ยุโรป", divs = {"ภูมิภาค", "cantons", "collectivities", "communes", {type = "เทศบาล", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, "dependent territories", "ดินแดน", "จังหวัด", }, british_spelling = true}, ["กาบอง"] = {container = "แอฟริกา", divs = {"จังหวัด", "departments"}}, ["แกมเบีย"] = {the = true, container = "แอฟริกา", divs = {"divisions", "อำเภอ"}, british_spelling = true, wp = "The %l"}, ["จอร์เจีย"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"ภูมิภาค", "อำเภอ"}, keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"}, ["เยอรมนี"] = {container = "ยุโรป", divs = { "รัฐ", -- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but -- there aren't really 
enough of them to categorize per state. "ภูมิภาค", "เทศบาล", "อำเภอ"}, british_spelling = true}, ["กานา"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["กรีซ"] = {container = "ยุโรป", divs = {"ภูมิภาค", "regional units", "เทศบาล", {type = "peripheries", cat_as = {"ภูมิภาค"}}, }, british_spelling = true}, ["กรีเนดา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["กัวเตมาลา"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "เทศบาล"}}, ["กินี"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures"}}, ["กินี-บิสเซา"] = {container = "แอฟริกา", divs = {"ภูมิภาค"}}, ["กายอานา"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค"}, british_spelling = true}, ["เฮติ"] = {container = "แคริบเบียน", divs = {"departments", "arrondissements"}}, ["ฮอนดูรัส"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["ฮังการี"] = {container = "ยุโรป", divs = {"เทศมณฑล", "อำเภอ"}, british_spelling = true}, ["ไอซ์แลนด์"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล", "เทศมณฑล"}, british_spelling = true}, ["อินเดีย"] = {container = "เอเชีย", divs = { {type = "รัฐ", cat_as = "states and union territories"}, {type = "union territories", cat_as = "states and union territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"}, {type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"}, "divisions", "อำเภอ", "เทศบาล", }, british_spelling = true}, ["อินโดนีเซีย"] = {container = "เอเชีย", divs = {"regencies", "จังหวัด", {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"}, }}, ["อิหร่าน"] = {container = "เอเชีย", divs = {"จังหวัด", "เทศมณฑล"}}, ["อิรัก"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["ไอร์แลนด์"] = {container = "ยุโรป", addl_parents = {"British Isles"}, divs = {"เทศมณฑล", "อำเภอ", "จังหวัด"}, british_spelling = true, wp = "Republic of %l"}, ["สาธารณรัฐไอร์แลนด์"] = 
{alias_of = "ไอร์แลนด์", the = true}, -- differs in "the" ["อิสราเอล"] = {container = "เอเชีย", divs = {"อำเภอ"}}, ["อิตาลี"] = {container = "ยุโรป", divs = { "ภูมิภาค", "จังหวัด", "metropolitan cities", "เทศบาล", {type = "autonomous regions", cat_as = "ภูมิภาค"}, }, british_spelling = true}, ["โกตดิวัวร์"] = {container = "แอฟริกา", divs = {"อำเภอ", "ภูมิภาค"}}, -- We should really be using Ivory Coast (common name) but there are political ramifications to the use of -- Côte d'Ivoire so don't make it a display alias. ["ไอวอรีโคสต์"] = {alias_of = "โกตดิวัวร์"}, ["จาเมกา"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["ญี่ปุ่น"] = {container = "เอเชีย", divs = {"จังหวัด", "กิ่งจังหวัด", "เทศบาล"}}, ["จอร์แดน"] = {container = "เอเชีย", divs = {"governorates"}}, ["คาซัคสถาน"] = {container = {"เอเชีย", "ยุโรป"}, divs = {"ภูมิภาค", "อำเภอ"}}, ["เคนยา"] = {container = "แอฟริกา", divs = {"เทศมณฑล"}, british_spelling = true}, ["Kiribati"] = {container = "ไมโครนีเชีย", british_spelling = true}, ["Kosovo"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล"}, british_spelling = true}, ["Kuwait"] = {container = "เอเชีย", divs = {"governorates", "areas"}}, ["Kyrgyzstan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Laos"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["Latvia"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["Lebanon"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["Lesotho"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Liberia"] = {container = "แอฟริกา", divs = {"เทศมณฑล", "อำเภอ"}}, ["Libya"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศบาล"}}, ["Liechtenstein"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["Lithuania"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["Luxembourg"] = {container = "ยุโรป", divs = {"cantons", "อำเภอ"}, british_spelling = true}, ["Madagascar"] = {container = 
"แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["Malawi"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["Malaysia"] = {container = "เอเชีย", divs = {"รัฐ", "federal territories", "อำเภอ"}, british_spelling = true}, ["Maldives"] = {the = true, container = "เอเชีย", divs = {"จังหวัด", "administrative atolls"}, british_spelling = true}, ["Mali"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "cercles"}}, ["Malta"] = {container = "ยุโรป", divs = {"ภูมิภาค", "local councils"}, british_spelling = true}, ["Marshall Islands"] = {the = true, container = "ไมโครนีเชีย", divs = {"เทศบาล"}}, ["Mauritania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Mauritius"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Mexico"] = {container = "อเมริกาเหนือ", addl_parents = {"อเมริกากลาง"}, divs = { "รัฐ", "เทศบาล", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Moldova"] = {container = "ยุโรป", divs = { {type = "อำเภอ", cat_as = "districts and autonomous territorial units"}, {type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"}, "communes", "เทศบาล", }, british_spelling = true}, ["Monaco"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป", -- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we -- want its parent to be "countries in Europe". 
bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, is_city = true, british_spelling = true}, ["Mongolia"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["Montenegro"] = {container = "ยุโรป", divs = {"เทศบาล"}}, ["Morocco"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "prefectures", "จังหวัด"}}, ["Mozambique"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}}, ["Myanmar"] = {container = "เอเชีย", divs = {"ภูมิภาค", "รัฐ", "union territories", {type = "self-administered zones", cat_as = "self-administered areas"}, {type = "self-administered divisions", cat_as = "self-administered areas"}, "อำเภอ"}}, ["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations ["Namibia"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "constituencies"}, british_spelling = true}, ["Nauru"] = {container = "ไมโครนีเชีย", divs = {"อำเภอ"}, british_spelling = true}, ["Nepal"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}}, ["เนเธอร์แลนด์"] = {the = true, placetype = {"ประเทศ", "constituent country"}, container = "ยุโรป", divs = {"จังหวัด", "เทศบาล", {type = "FORMER municipalities", cat_as = "former municipalities"}, "dependent territories", "constituent countries"}, british_spelling = true, -- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]] -- (country) }, ["New Zealand"] = {container = "พอลินีเชีย", divs = { "ภูมิภาค", "dependent territories", "territorial authorities", {type = "อำเภอ", cat_as = "territorial authorities"}, }, british_spelling = true}, ["Nicaragua"] = {container = "อเมริกากลาง", divs = {"departments", "เทศบาล"}}, ["Niger"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Nigeria"] = {container = "แอฟริกา", divs = { "รัฐ", -- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize -- everything under 'states and territories' but that seems a bit pointless. 
{type = "federal territories", cat_as = "รัฐ"}, "local government areas", }, british_spelling = true}, ["North Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล"}}, ["North Macedonia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["Macedonia"] = {alias_of = "North Macedonia", display = true}, ["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Norway"] = {container = "ยุโรป", divs = {"เทศมณฑล", "เทศบาล", "dependent territories", "อำเภอ", "unincorporated areas"}, british_spelling = true}, ["Oman"] = {container = "เอเชีย", divs = {"governorates", "จังหวัด"}}, ["Pakistan"] = {container = "เอเชีย", divs = { {type = "จังหวัด", cat_as = "provinces and territories"}, {type = "administrative territories", cat_as = "provinces and territories"}, {type = "federal territories", cat_as = "provinces and territories"}, {type = "ดินแดน", cat_as = "provinces and territories"}, "divisions", "อำเภอ", }, british_spelling = true}, ["Palau"] = {container = "ไมโครนีเชีย", divs = {"รัฐ"}}, ["Palestine"] = {container = "เอเชีย", divs = {"governorates"}}, ["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the" ["Panama"] = {container = "อเมริกากลาง", divs = {"จังหวัด", "อำเภอ"}}, ["Papua New Guinea"] = {container = "เมลานีเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Paraguay"] = {container = "อเมริกาใต้", divs = {"departments", "อำเภอ"}}, ["Peru"] = {container = "อเมริกาใต้", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ"}}, ["Philippines"] = {the = true, container = "เอเชีย", divs = {"ภูมิภาค", "จังหวัด", "อำเภอ", "เทศบาล", "barangays"}}, ["Poland"] = {divs = {"voivodeships", "เทศมณฑล", {type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}}, }, container = "ยุโรป", british_spelling = true}, ["Portugal"] = {container = "ยุโรป", 
divs = { {type = "autonomous regions", cat_as = "districts and autonomous regions"}, {type = "อำเภอ", cat_as = "districts and autonomous regions"}, "จังหวัด", "เทศบาล"}, british_spelling = true}, ["Qatar"] = {container = "เอเชีย", divs = {"เทศบาล", "zones"}}, ["Republic of the Congo"] = {the = true, container = "แอฟริกา", divs = {"departments", "อำเภอ"}}, ["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true}, ["Romania"] = {container = "ยุโรป", divs = { "ภูมิภาค", "เทศมณฑล", "communes", {type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"}, }, british_spelling = true}, ["Russia"] = {container = {"ยุโรป", "เอเชีย"}, divs = { "federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities", "อำเภอ", "federal districts"}, british_spelling = true}, ["Rwanda"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}}, ["Saint Kitts and Nevis"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["Saint Kitts"] = {alias_of = "Saint Kitts and Nevis", display = true}, ["Saint Lucia"] = {container = "แคริบเบียน", divs = {"อำเภอ"}, british_spelling = true}, ["Saint Vincent and the Grenadines"] = {container = "แคริบเบียน", divs = {"parishes"}, british_spelling = true}, ["Saint Vincent"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["SVG"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["S.V.G"] = {alias_of = "Saint Vincent and the Grenadines", display = true}, ["Samoa"] = {container = "พอลินีเชีย", divs = {"อำเภอ"}, british_spelling = true}, ["San Marino"] = {container = "ยุโรป", divs = {"เทศบาล"}, british_spelling = true}, ["São Tomé and Príncipe"] = {container = "แอฟริกา", divs = {"อำเภอ"}}, ["São Tome and Principe"] = {alias_of = "São Tomé and Príncipe", display = true}, ["São Tomé"] = {alias_of = "São Tomé and Príncipe", display = true}, ["São Tome"] = {alias_of = "São Tomé and Príncipe", display = true}, 
["Saudi Arabia"] = {container = "เอเชีย", divs = {"จังหวัด", "governorates"}}, ["Senegal"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "departments"}}, ["Serbia"] = {container = "ยุโรป", divs = {"อำเภอ", "เทศบาล", "autonomous provinces"}}, ["Seychelles"] = {container = "แอฟริกา", divs = {"อำเภอ"}, british_spelling = true}, ["Sierra Leone"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Singapore"] = {container = "เอเชีย", divs = {"อำเภอ", "ภูมิภาค"}, british_spelling = true}, ["Slovakia"] = {container = "ยุโรป", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["Slovenia"] = {container = "ยุโรป", divs = {"statistical regions", "เทศบาล"}, british_spelling = true}, -- Note: While the official name does not include "the" at the beginning, -- it sounds strange in English to leave it out and it's commonly included. ["Solomon Islands"] = {the = true, container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true}, ["โซมาเลีย"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}}, ["South Africa"] = {container = "แอฟริกา", divs = { "จังหวัด", "อำเภอ", {type = "district municipalities", cat_as = "อำเภอ"}, {type = "metropolitan municipalities", cat_as = "อำเภอ"}, "เทศบาล", }, british_spelling = true}, ["South Korea"] = {container = "เอเชีย", addl_parents = {"Korea"}, divs = {"จังหวัด", "เทศมณฑล", "อำเภอ"}}, ["South Sudan"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "รัฐ", "เทศมณฑล"}, british_spelling = true}, ["Spain"] = {container = "ยุโรป", divs = {"autonomous communities", "จังหวัด", "เทศบาล", "comarcas", "autonomous cities"}, british_spelling = true}, ["Sri Lanka"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Sudan"] = {container = "แอฟริกา", divs = {"รัฐ", "อำเภอ"}, british_spelling = true}, ["Suriname"] = {container = "อเมริกาใต้", divs = {"อำเภอ"}}, ["Sweden"] = {container = "ยุโรป", divs = {"จังหวัด", "เทศมณฑล", "เทศบาล"}, british_spelling = true}, ["Switzerland"] = {container 
= "ยุโรป", divs = {"cantons", "เทศบาล", "อำเภอ"}, british_spelling = true}, ["Syria"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["ไต้หวัน"] = {container = "เอเชีย", divs = {"เทศมณฑล", "อำเภอ", "townships", "special municipalities"}}, ["สาธารณรัฐจีน"] = {alias_of = "ไต้หวัน", the = true}, -- differs in "the", different political connotations ["Tajikistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Tanzania"] = {container = "แอฟริกา", divs = {"ภูมิภาค", "อำเภอ"}, british_spelling = true}, ["ไทย"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "ตำบล"}}, ["Togo"] = {container = "แอฟริกา", divs = {"จังหวัด", "prefectures"}}, ["Tonga"] = {container = "พอลินีเชีย", divs = {"divisions"}, british_spelling = true}, ["Trinidad and Tobago"] = {container = "แคริบเบียน", divs = {"ภูมิภาค", "เทศบาล"}, british_spelling = true}, ["Tunisia"] = {container = "แอฟริกา", divs = {"governorates", "delegations"}}, ["Turkey"] = {container = {"ยุโรป", "เอเชีย"}, divs = {"จังหวัด", "อำเภอ"}}, -- Foreign names generally get display-canonicalized. ["Türkiye"] = {alias_of = "Turkey", display = true}, ["Turkmenistan"] = {container = "เอเชีย", divs = { -- The 5 regions are often also called provinces "ภูมิภาค", {type = "จังหวัด", cat_as = "ภูมิภาค"}, "อำเภอ"}, }, ["Tuvalu"] = {container = "พอลินีเชีย", divs = {"atolls"}, british_spelling = true}, ["Uganda"] = {container = "แอฟริกา", divs = {"อำเภอ", "เทศมณฑล"}, british_spelling = true}, ["Ukraine"] = {container = "ยุโรป", divs = { {type = "oblasts", cat_as = "oblasts and autonomous republics"}, {type = "autonomous republics", cat_as = "oblasts and autonomous republics"}, "raions", "hromadas", }, british_spelling = true}, ["United Arab Emirates"] = {the = true, container = "เอเชีย", divs = {"emirates"}}, -- Abbreviations get display-canonicalized. 
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true}, ["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true}, ["สหราชอาณาจักร"] = {the = true, container = "ยุโรป", addl_parents = {"British Isles"}, divs = {"constituent countries", "เทศมณฑล", "อำเภอ", "boroughs", "ดินแดน", "dependent territories", "traditional counties"}, keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true}, -- Abbreviations get display-canonicalized. ["UK"] = {alias_of = "สหราชอาณาจักร", display = true, the = true}, ["U.K."] = {alias_of = "สหราชอาณาจักร", display = true, the = true}, ["สหรัฐอเมริกา"] = {the = true, container = "อเมริกาเหนือ", divs = {"เทศมณฑล", "county seats", "รัฐ", "ดินแดน", "dependent territories", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, {type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"}, {type = "NICKNAME_FOR states", cat_as = "nicknames for states"}, {type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"}, {type = "boroughs", prep = "ใน"}, -- exist in Pennsylvania and New Jersey "เทศบาล", -- these exist politically at least in Colorado and Connecticut {type = "census-designated places", prep = "ใน"}, {type = "unincorporated communities", prep = "ใน"}, -- Don't change the following to something more politically correct until/unless the US government makes a -- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at -- [[w:Indian reservations]]). "Indian reservations", }}, -- Abbreviations and long forms (when possible) get display-canonicalized. 
["US"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["U.S."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["USA"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["U.S.A."] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["สหรัฐ"] = {alias_of = "สหรัฐอเมริกา", display = true, the = true}, ["Uruguay"] = {container = "อเมริกาใต้", divs = {"departments", "เทศบาล"}}, ["Uzbekistan"] = {container = "เอเชีย", divs = {"ภูมิภาค", "อำเภอ"}}, ["Vanuatu"] = {container = "เมลานีเชีย", divs = {"จังหวัด"}, british_spelling = true}, ["Vatican City"] = {placetype = {"city-state", "ประเทศ"}, container = "ยุโรป", -- The first placetype should be 'city-state' so that it shows up in the description; -- its parent should still be "countries in Europe". bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, addl_parents = {"Rome"}, is_city = true, british_spelling = true}, ["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the" ["Venezuela"] = {container = "อเมริกาใต้", divs = {"รัฐ", "เทศบาล"}}, ["เวียดนาม"] = {container = "เอเชีย", divs = {"จังหวัด", "อำเภอ", "เทศบาล"}}, ["Western Sahara"] = {placetype = {"ดินแดน", "ประเทศ"}, container = "แอฟริกา", bare_category_parent_type = {type = "ประเทศ", prep = "ใน"}, }, -- Not display-canonicalizable, due both to differences in 'the' and to the sovereignty dispute over Western Sahara ["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true}, ["SADR"] = {alias_of = "Sahrawi Arab Democratic Republic", display = true, the = true}, ["Yemen"] = {container = "เอเชีย", divs = {"governorates", "อำเภอ"}}, ["Zambia"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, ["Zimbabwe"] = {container = "แอฟริกา", divs = {"จังหวัด", "อำเภอ"}, british_spelling = true}, } local function canonicalize_continent_container(key) if type(key) ~= "string" then return key end if export.continents[key] then return {key = key, placetype = 
export.continents[key].placetype} end internal_error("Unrecognized key %s in `canonicalize_continent_container`", key) end export.countries_group = { canonicalize_key_container = canonicalize_continent_container, default_overriding_bare_label_parents = {"+++", "ประเทศ"}, default_placetype = "ประเทศ", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.countries, } -- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases -- are not internationally recognized as sovereign nations but which we treat similarly to countries. export.country_like_entities = { -- British Overseas Territory ["Akrotiri and Dhekelia"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"ไซปรัส", "ยุโรป", "เอเชีย"}, british_spelling = true, }, -- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in -- [[w:List of sovereign states and dependent territories by continent]]. 
-- unincorporated territory of the United States ["American Samoa"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"พอลินีเชีย"}, }, -- British Overseas Territory ["Anguilla"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["Abkhazia"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Georgia", "ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- Australian external territory ["Ashmore and Cartier Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, }, -- constituent country of the Netherlands ["Aruba"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- British Overseas Territory ["Bermuda"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"อเมริกาเหนือ"}, british_spelling = true, }, -- special municipality of the Netherlands ["Bonaire"] = { placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, is_city = true, british_spelling = true, }, -- British Overseas Territory ["British Indian Ocean Territory"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"เอเชีย"}, british_spelling = true, }, -- British Overseas Territory ["British Virgin Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- Norwegian dependent territory 
["Bouvet Island"] = { placetype = {"dependent territory", "ดินแดน"}, container = "Norway", addl_parents = {"แอฟริกา"}, british_spelling = true, }, -- British Overseas Territory ["Cayman Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- Australian external territory ["Christmas Island"] = { placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, british_spelling = true, }, -- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the -- French Southern and Antarctic Lands. ["Clipperton Island"] = { placetype = {"overseas territory", "ดินแดน"}, container = "ฝรั่งเศส", addl_parents = {"อเมริกาเหนือ"}, }, -- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands ["Cocos Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"เอเชีย"}, wp = "Cocos (Keeling) Islands", british_spelling = true, }, ["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, ["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, -- self-governing but in free association with New Zealand ["Cook Islands"] = { the = true, placetype = {"ประเทศ"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- constituent country of the Netherlands ["Curaçao"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- special territory of Chile ["Easter Island"] = { placetype = {"special territory", "ดินแดน"}, container = "ชิลี", addl_parents = {"พอลินีเชีย"}, }, -- British Overseas Territory ["Falkland Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"อเมริกาใต้"}, 
british_spelling = true, }, -- autonomous territory of Denmark ["Faroe Islands"] = { the = true, placetype = {"autonomous territory", "ดินแดน"}, container = "เดนมาร์ก", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- overseas department and region of France ["French Guiana"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"อเมริกาใต้"}, british_spelling = true, }, -- overseas collectivity of France ["French Polynesia"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- French overseas territory ["French Southern and Antarctic Lands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "ฝรั่งเศส", addl_parents = {"แอฟริกา"}, }, -- British Overseas Territory ["Gibraltar"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"ยุโรป"}, is_city = true, british_spelling = true, }, -- autonomous territory of Denmark ["Greenland"] = { placetype = {"autonomous territory", "ดินแดน"}, container = "เดนมาร์ก", addl_parents = {"อเมริกาเหนือ"}, divs = {"เทศบาล"}, british_spelling = true, }, -- overseas department and region of France ["Guadeloupe"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", addl_parents = {"แคริบเบียน"}, divs = {"communes"}, british_spelling = true, }, -- unincorporated territory of the United States ["Guam"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- self-governing British Crown dependency; technically called the Bailiwick of Guernsey ["Guernsey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, 
british_spelling = true, wp = "Bailiwick of %l", }, ["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true}, -- Australian external territory ["Heard Island and McDonald Islands"] = { the = true, placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"แอฟริกา"}, }, -- special administrative region of China ["Hong Kong"] = { placetype = {"special administrative region", "นคร"}, container = "จีน", is_city = true, british_spelling = true, }, -- self-governing British Crown dependency ["Isle of Man"] = { the = true, placetype = {"crown dependency", "dependency", "dependent territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, british_spelling = true, }, -- Norwegian unincorporated area ["Jan Mayen"] = { placetype = {"unincorporated area", "dependent territory", "ดินแดน", "เกาะ"}, container = "Norway", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- self-governing British Crown dependency; technically called the Bailiwick of Jersey ["Jersey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"British Isles", "ยุโรป"}, british_spelling = true, }, ["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true}, -- special administrative region of China ["Macau"] = { placetype = {"special administrative region", "นคร"}, container = "จีน", is_city = true, british_spelling = true, }, -- overseas department and region of France ["Martinique"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- overseas department and region of France ["Mayotte"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"แอฟริกา"}, british_spelling = true, }, -- British Overseas 
Territory ["Montserrat"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- special collectivity of France ["New Caledonia"] = { placetype = {"special collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"เมลานีเชีย"}, british_spelling = true, }, -- dependent territory of New Zealand ["New Zealand Subantarctic Islands"] = { the = true, placetype = {"dependent territory", "ดินแดน"}, container = "New Zealand", addl_parents = {"แอนตาร์กติกา"}, british_spelling = true, }, -- self-governing but in free association with New Zealand ["Niue"] = { placetype = {"ประเทศ"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- Australian external territory ["Norfolk Island"] = { placetype = {"external territory", "ดินแดน"}, container = "ออสเตรเลีย", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Cyprus ["Northern Cyprus"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"ไซปรัส", "Turkey", "ยุโรป", "เอเชีย"}, divs = {"อำเภอ"}, keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]", british_spelling = true, }, -- commonwealth, unincorporated territory of the United States ["Northern Mariana Islands"] = { the = true, placetype = {"commonwealth", "unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- British Overseas Territory ["Pitcairn Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- commonwealth of the United States ["Puerto Rico"] = { placetype = {"commonwealth", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"แคริบเบียน"}, divs = {"เทศบาล"}, }, -- 
overseas department and region of France ["Réunion"] = { placetype = {"overseas department", "department", "administrative region", "ภูมิภาค"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"แอฟริกา"}, british_spelling = true, }, -- special municipality of the Netherlands ["Saba"] = { placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, is_city = true, british_spelling = true, }, -- overseas collectivity of France ["Saint Barthélemy"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- British Overseas Territory ["Saint Helena, Ascension and Tristan da Cunha"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", divs = {{type = "constituent parts", container_parent_type = false}}, addl_parents = {"มหาสมุทรแอตแลนติก", "แอฟริกา"}, british_spelling = true, }, -- constituent parts of the combined overseas territory ["Ascension Island"] = { placetype = {"constituent part", "ดินแดน", "เกาะ"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"มหาสมุทรแอตแลนติก"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Saint Helena"] = { placetype = {"constituent part", "ดินแดน", "เกาะ"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"มหาสมุทรแอตแลนติก"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Tristan da Cunha"] = { placetype = {"constituent part", "ดินแดน", "archipelago"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"มหาสมุทรแอตแลนติก"}, overriding_bare_label_parents = {}, 
no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, -- overseas collectivity of France ["Saint Martin"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- overseas collectivity of France ["Saint Pierre and Miquelon"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", divs = {"communes"}, addl_parents = {"อเมริกาเหนือ"}, british_spelling = true, }, -- special municipality of the Netherlands ["Sint Eustatius"] = { placetype = {"special municipality", "เทศบาล", "overseas territory", "ดินแดน"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, is_city = true, british_spelling = true, }, -- constituent country of the Netherlands ["Sint Maarten"] = { placetype = {"constituent country", "ประเทศ"}, container = "เนเธอร์แลนด์", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Somalia ["Somaliland"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"โซมาเลีย", "แอฟริกา"}, keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]", british_spelling = true, }, -- British Overseas Territory -- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for -- "Saint Helena, Ascension and Tristan da Cunha". 
["South Georgia"] = { placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"มหาสมุทรแอตแลนติก"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["South Ossetia"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Georgia", "ยุโรป", "เอเชีย"}, keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- British Overseas Territory ["South Sandwich Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"มหาสมุทรแอตแลนติก"}, wp = true, wpcat = "South Georgia and the South Sandwich Islands", british_spelling = true, }, -- Norwegian unincorporated area ["Svalbard"] = { placetype = {"unincorporated area", "dependent territory", "ดินแดน", "archipelago"}, container = "Norway", addl_parents = {"ยุโรป"}, british_spelling = true, }, -- dependent territory of New Zealand ["Tokelau"] = { placetype = {"dependent territory", "ดินแดน"}, container = "New Zealand", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Moldova ["Transnistria"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"Moldova", "ยุโรป"}, keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]", british_spelling = true, }, -- British Overseas Territory ["Turks and Caicos Islands"] = { the = true, placetype = {"overseas territory", "ดินแดน"}, container = "สหราชอาณาจักร", addl_parents = {"แคริบเบียน"}, british_spelling = true, }, -- unincorporated territory of the United States ["United States Minor Outlying Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"เกาะ", "ไมโครนีเชีย", "พอลินีเชีย", 
"แคริบเบียน"}, }, -- FIXME: We should add entries for the other minor outlying islands. -- Baker Island (Oceania) -- Howland Island (Oceania) -- Jarvis Island (Oceania) -- Johnston Atoll (Oceania) -- Kingman Reef (Oceania) -- Midway Atoll (Oceania) -- Navassa Island (Caribbean) -- Palmyra Atoll (Oceania) -- Wake Island (Oceania) ["Wake Island"] = { placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"ไมโครนีเชีย"}, }, -- unincorporated territory of the United States ["United States Virgin Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "ดินแดน"}, container = "สหรัฐอเมริกา", addl_parents = {"แคริบเบียน"}, }, ["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, ["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, -- overseas collectivity of France ["Wallis and Futuna"] = { placetype = {"overseas collectivity", "collectivity"}, container = "ฝรั่งเศส", addl_parents = {"พอลินีเชีย"}, british_spelling = true, }, } export.country_like_entities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Saint Helena, Ascension and Tristan da Cunha". key_to_placename = false, placename_to_key = false, canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"), default_overriding_bare_label_parents = {"country-like entities"}, default_no_container_cat = true, default_no_container_parent = true, -- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas -- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village -- in Europe. default_no_auto_augment_container = true, data = export.country_like_entities, } -- Former countries and such; we don't create "Cities in ..." 
categories because they don't exist anymore export.former_countries = { -- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan -- (also known as Nagorno-Karabakh) -- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out. ["Artsakh"] = { placetype = {"unrecognized country", "ประเทศ"}, addl_parents = {"อาเซอร์ไบจาน", "ยุโรป", "เอเชีย"}, keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]", british_spelling = true, }, ["Nagorno-Karabakh"] = {alias_of = "Artsakh"}, ["Czechoslovakia"] = {container = "ยุโรป", british_spelling = true}, ["East Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true}, ["เวียดนามเหนือ"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}}, ["เปอร์เซีย"] = {placetype = {"จักรวรรดิ", "ประเทศ"}, container = "เอเชีย", divs = {"จังหวัด"}}, ["Byzantine Empire"] = { the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Ancient Europe", "Ancient Near East"}, divs = { "จังหวัด", "themes", }}, ["Roman Empire"] = { the = true, placetype = {"จักรวรรดิ", "ประเทศ"}, container = {"ยุโรป", "แอฟริกา", "เอเชีย"}, addl_parents = {"Rome"}, divs = { "จังหวัด", {type = "FORMER provinces", cat_as = "จังหวัด"}, }}, ["เวียดนามใต้"] = {container = "เอเชีย", addl_parents = {"เวียดนาม"}}, ["Soviet Union"] = { the = true, container = {"ยุโรป", "เอเชีย"}, divs = {"republics", "autonomous republics"}, british_spelling = true}, ["West Germany"] = {container = "ยุโรป", addl_parents = {"เยอรมนี"}, british_spelling = true}, ["Yugoslavia"] = {container = "ยุโรป", divs = {"อำเภอ"}, keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true}, } export.former_countries_group = { canonicalize_key_container = canonicalize_continent_container, 
default_overriding_bare_label_parents = {"former countries and country-like entities"}, default_is_former_place = true, default_placetype = "ประเทศ", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.former_countries, } ----------------------------------------------------------------------------------- -- Subpolity tables -- ----------------------------------------------------------------------------------- export.australia_states_and_territories = { ["Australian Capital Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["Jervis Bay Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["New South Wales, ออสเตรเลีย"] = {}, ["Northern Territory, ออสเตรเลีย"] = {the = true, placetype = "ดินแดน"}, ["Queensland, ออสเตรเลีย"] = {}, ["South Australia, ออสเตรเลีย"] = {}, ["Tasmania, ออสเตรเลีย"] = {}, ["Victoria, ออสเตรเลีย"] = {}, ["Western Australia, ออสเตรเลีย"] = {}, } -- states and territories of Australia export.australia_group = { default_container = "ออสเตรเลีย", default_placetype = "รัฐ", default_divs = "local government areas", data = export.australia_states_and_territories, } export.austria_states = { ["Vienna, ออสเตรีย"] = {}, ["Lower Austria, ออสเตรีย"] = {}, ["Upper Austria, ออสเตรีย"] = {}, ["Styria, ออสเตรีย"] = {}, ["Tyrol, ออสเตรีย"] = {wp = "Tyrol (รัฐ)"}, ["Carinthia, ออสเตรีย"] = {}, ["Salzburg, ออสเตรีย"] = {wp = "Salzburg (รัฐ)"}, ["Vorarlberg, ออสเตรีย"] = {}, ["Burgenland, ออสเตรีย"] = {}, } -- states of Austria export.austria_group = { default_container = "ออสเตรีย", default_placetype = "รัฐ", default_divs = "เทศบาล", data = export.austria_states, } export.bangladesh_divisions = { ["Barisal Division, บังกลาเทศ"] = {}, ["Chittagong Division, บังกลาเทศ"] = {}, ["Dhaka Division, บังกลาเทศ"] = {}, ["Khulna Division, บังกลาเทศ"] = {}, ["Mymensingh Division, บังกลาเทศ"] = 
{},
	["Rajshahi Division, บังกลาเทศ"] = {},
	["Rangpur Division, บังกลาเทศ"] = {},
	["Sylhet Division, บังกลาเทศ"] = {},
}

-- divisions of Bangladesh
export.bangladesh_group = {
	key_to_placename = make_key_to_placename(", บังกลาเทศ$", " Division$"),
	placename_to_key = make_placename_to_key(", บังกลาเทศ", " Division"),
	default_container = "บังกลาเทศ",
	default_placetype = "division",
	default_divs = "อำเภอ",
	data = export.bangladesh_divisions,
}

export.brazil_states = {
	["Acre, บราซิล"] = {wp = "%l (รัฐ)"},
	["Alagoas, บราซิล"] = {},
	["Amapá, บราซิล"] = {},
	["Amazonas, บราซิล"] = {wp = "%l (Brazilian state)"},
	["Bahia, บราซิล"] = {},
	["Ceará, บราซิล"] = {},
	["Distrito Federal, บราซิล"] = {wp = "Federal District (Brazil)"},
	["Espírito Santo, บราซิล"] = {},
	["Goiás, บราซิล"] = {},
	["Maranhão, บราซิล"] = {},
	["Mato Grosso, บราซิล"] = {},
	["Mato Grosso do Sul, บราซิล"] = {},
	["Minas Gerais, บราซิล"] = {},
	["Pará, บราซิล"] = {},
	["Paraíba, บราซิล"] = {},
	["Paraná, บราซิล"] = {wp = "%l (รัฐ)"},
	["Pernambuco, บราซิล"] = {},
	["Piauí, บราซิล"] = {},
	["Rio de Janeiro, บราซิล"] = {wp = "%l (รัฐ)"},
	["Rio Grande do Norte, บราซิล"] = {},
	["Rio Grande do Sul, บราซิล"] = {},
	["Rondônia, บราซิล"] = {},
	["Roraima, บราซิล"] = {},
	["Santa Catarina, บราซิล"] = {wp = "%l (รัฐ)"},
	["São Paulo, บราซิล"] = {wp = "%l (รัฐ)"},
	["Sergipe, บราซิล"] = {},
	["Tocantins, บราซิล"] = {},
}

-- states of Brazil
export.brazil_group = {
	default_container = "บราซิล",
	default_placetype = "รัฐ",
	default_divs = "เทศบาล",
	data = export.brazil_states,
}

export.canada_provinces_and_territories = {
	["Alberta, แคนาดา"] = {divs = {
		{type = "municipal districts", container_parent_type = "rural municipalities"},
	}},
	-- Both division specs must sit inside the `divs` list; otherwise "regional municipalities" becomes a stray
	-- positional element of the entry itself.
	["British Columbia, แคนาดา"] = {divs = {
		{type = "regional districts", container_parent_type = false},
		"regional municipalities",
	}},
	["Manitoba, แคนาดา"] = {divs = {"rural municipalities"}},
	["New Brunswick, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
	["Newfoundland and Labrador, แคนาดา"] = {},
	["Northwest Territories, แคนาดา"] = {the = true, placetype = "ดินแดน"},
	["Nova Scotia, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities"}},
	["Nunavut, แคนาดา"] = {placetype = "ดินแดน"},
	["Ontario, แคนาดา"] = {divs = {"เทศมณฑล", "regional municipalities", {type = "townships", prep = "ใน"}}},
	["Prince Edward Island, แคนาดา"] = {divs = {"เทศมณฑล", "parishes", "rural municipalities"}},
	["Saskatchewan, แคนาดา"] = {divs = {"rural municipalities"}},
	["Quebec, แคนาดา"] = {divs = {
		"เทศมณฑล",
		{type = "regional county municipalities", container_parent_type = "regional municipalities"},
		-- administrative regions have an official (but non-governmental) function but there don't appear to be any
		-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
		{type = "ภูมิภาค", container_parent_type = false},
		{type = "townships", prep = "ใน"},
		{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "เทศมณฑล"}, "เทศบาล"}},
		{type = "township municipalities", cat_as = {{type = "townships", prep = "ใน"}, "เทศบาล"}},
		{type = "village municipalities", cat_as = {{type = "villages", prep = "ใน"}, "เทศบาล"}},
	}},
	["Yukon, แคนาดา"] = {placetype = "ดินแดน"},
	-- The alias target must match the localized key "Yukon, แคนาดา" defined above.
	["Yukon Territory, แคนาดา"] = {alias_of = "Yukon, แคนาดา", the = true},
}

-- provinces and territories of Canada
export.canada_group = {
	default_container = "แคนาดา",
	default_placetype = "รัฐ", -- following thwiki
	data = export.canada_provinces_and_territories,
}

export.china_provinces_and_autonomous_regions = {
	-- direct-administered municipalities are not here but below under prefecture-level cities
	["Anhui, จีน"] = {},
	["Fujian, จีน"] = {},
	["Fuchien, จีน"] = {alias_of = "Fujian, จีน", display = true},
	["Gansu, จีน"] = {},
	["Guangdong, จีน"] = {},
	["Guangxi, จีน"] = {placetype = "autonomous region"},
	["Guizhou, จีน"] = {},
	["Hainan, จีน"] = {},
	["Hebei, จีน"] = {},
	["Heilongjiang, จีน"] = {},
	["Henan, จีน"] = {},
	["Hubei, 
จีน"] = {}, ["Hunan, จีน"] = {}, ["Inner Mongolia, จีน"] = {placetype = "autonomous region"}, ["Jiangsu, จีน"] = {}, ["Jiangxi, จีน"] = {}, ["Jilin, จีน"] = {}, ["Liaoning, จีน"] = {}, ["Ningxia, จีน"] = {placetype = "autonomous region"}, ["Qinghai, จีน"] = {}, ["Shaanxi, จีน"] = {}, ["Shandong, จีน"] = {}, ["Shanxi, จีน"] = {}, ["Sichuan, จีน"] = {}, ["Tibet, จีน"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"}, ["Xinjiang, จีน"] = {placetype = "autonomous region"}, ["Yunnan, จีน"] = {}, ["Zhejiang, จีน"] = {}, } -- provinces and autonomous regions of China export.china_group = { default_container = "จีน", default_placetype = "มณฑล", default_divs = { "จังหวัด", "prefecture-level cities", "อำเภอ", "ตำบล", "townships", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_provinces_and_autonomous_regions, } export.china_prefecture_level_cities = { -- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an -- administrative unit smaller than a province but bigger than a county, which is administratively controlled by -- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior -- to the mid-1980's, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the -- western portion of China) have not yet been converted. Generally a given province is entirely tiled by -- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se. -- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much -- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears -- the same name as the county-level city). 
	--
	-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
	-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
	-- province level.
	--
	-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
	-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
	-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
	-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
	-- metro area separated by suburban/exurban or rural land.

	-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
	-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
	-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
	-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
	-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
	-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
	-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
	-- a lot of obscure cities.
	--
	-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
	-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
	-- was >= 1 million. The entries below are sorted by the urban agglomeration value, which is generally of the
	-- "adm-urb" = "administrative area (urban population)" type and sometimes groups nearby cities into a single
	-- agglomeration. The most notable case is the Pearl River Delta, grouped under Guangzhou with an agglomeration
	-- population of 72,700,000, which includes a large number of nearby large cities (although for some reason not
	-- Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de includes divisions
	-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
	-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
	-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
	-- subdistricts, towns and townships.
	["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
	["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
	["Zhongshan"] = {container = 
"Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shanghai"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de ["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: Not to be confused with Cangzhou in Hebei ["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants ["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Beijing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de ["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de ["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de ["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per 
citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de ["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration ["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de ["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration ["Chongqing"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de ["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de ["Tianjin"] = {placetype = {"direct-administered municipality", "เทศบาล", "นคร"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de ["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de -- Changsha County -- 1.024 urban per citypopulation.de ["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration ["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de ["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de ["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de ["Fushun"] = {container = "Liaoning"}, -- 1.229 
urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration ["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de ["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de ["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de ["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de ["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration ["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration -- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria ["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de -- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core). 
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration ["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de -- includes Láiwú city ["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de -- includes Xīnjí city ["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de ["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de ["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de ["Nanning"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de ["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de ["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de ["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de ["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) 
urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de ["Ürümqi"] = {container = {key = "Xinjiang, จีน", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de ["Urumqi"] = {alias_of = "Ürümqi", display = true}, ["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de ["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de ["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de ["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de ["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de ["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) citypopulation.de ["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de ["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de ["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures ["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de ["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de ["Hohhot"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de ["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de ["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de 
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de ["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de ["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de ["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de ["Taizhou"] = {alias_of = "Taizhou, Zhejiang"}, ["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de ["Yinchuan"] = {container = {key = "Ningxia, จีน", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de ["Liuzhou"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de ["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de ["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de ["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de ["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de -- includes Dìngzhōu city and Xióngān Xīnqū ["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de ["Baotou"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de ["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de ["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de ["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per 
citypopulation.de ["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de ["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de ["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de ["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de ["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de ["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de ["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de ["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de ["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de ["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de ["Guilin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de ["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de ["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de ["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de ["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de ["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de ["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de ["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de ["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban 
(1.680 adm-urb) per citypopulation.de ["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de ["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de ["Jilin"] = {alias_of = "Jilin City"}, ["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de ["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de ["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de ["Yulin"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de ["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de ["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de -- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash ["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de ["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de ["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de ["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de ["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de ["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de ["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de ["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de ["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per 
citypopulation.de ["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de ["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de ["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de ["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de ["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de ["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de ["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de ["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de ["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de ["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de ["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de ["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de ["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de ["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de ["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de ["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de ["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de -- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper. 
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"ตำบล", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de ["Ulanhad"] = {container = {key = "Inner Mongolia, จีน", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de ["Chifeng"] = {alias_of = "Ulanhad"}, ["Ulankhad"] = {alias_of = "Ulanhad", display = true}, ["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de ["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de ["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de ["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de -- Shuyang is a "เทศมณฑล" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core). -- The county itself is 37 miles by 34 miles. ["Shuyang"] = {placetype = "เทศมณฑล", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de -- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core). 
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de ["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de ["Beihai"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de ["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de ["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de ["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de ["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de ["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de ["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de ["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de ["Guigang"] = {container = {key = "Guangxi, จีน", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de -- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core). 
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de -- NOTE: Not to be confused with Changzhou in Jiangsu ["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de ["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de ["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de ["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de ["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de -- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core). ["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"ตำบล", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de -- 3 extra that got added in earlier incarnations and aren't found in the "major agglomerations of the world" page https://citypopulation.de/en/world/agglomerations/ reference date 2025-01-01 ["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de ["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de ["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de } export.china_prefecture_level_cities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Zhejiang" or "Suzhou, Anhui". 
key_to_placename = false, placename_to_key = false, -- don't add ", จีน" to make the key default_container = "จีน", canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"), -- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people -- don't understand how Chinese administrative divisions work. default_placetype = {"prefecture-level city", "นคร"}, default_divs = { -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities, -- and prefecture-level cities (as well as county-level cities) are considered non-cities. "อำเภอ", "ตำบล", "townships", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_prefecture_level_cities, } -- Needed to avoid problems with two cities called Taizhou and Suzhou. export.china_prefecture_level_cities_2 = { -- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang. ["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census ["Taizhou"] = {alias_of = "Taizhou, Jiangsu"}, -- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu. ["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census -- hopefully this will work because we also have Suzhou as a key by itself for the larger, more-well-known Suzhou in Jiangsu ["Suzhou"] = {alias_of = "Suzhou, Anhui"}, } export.china_prefecture_level_cities_group_2 = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Jiangsu". 
placename_to_key = false, -- don't add ", จีน" to make the key default_container = "จีน", canonicalize_key_container = make_canonicalize_key_container(", จีน", "จังหวัด"), -- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people -- don't understand how Chinese administrative divisions work. default_placetype = {"prefecture-level city", "นคร"}, default_divs = { -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities, -- and prefecture-level cities (as well as county-level cities) are considered non-cities. "อำเภอ", "ตำบล", "townships", {type = "เทศมณฑล", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_prefecture_level_cities_2, } export.finland_regions = { ["Lapland, ฟินแลนด์"] = {wp = "%l (%c)"}, ["North Ostrobothnia, ฟินแลนด์"] = {}, ["Northern Ostrobothnia, ฟินแลนด์"] = {alias_of = "North Ostrobothnia, ฟินแลนด์", display = true}, ["Kainuu, ฟินแลนด์"] = {}, ["North Karelia, ฟินแลนด์"] = {}, ["Northern Savonia, ฟินแลนด์"] = {}, ["North Savo, ฟินแลนด์"] = {alias_of = "Northern Savonia, ฟินแลนด์", display = true}, ["Southern Savonia, ฟินแลนด์"] = {}, ["South Savo, ฟินแลนด์"] = {alias_of = "Southern Savonia, ฟินแลนด์", display = true}, ["South Karelia, ฟินแลนด์"] = {}, ["Central Finland, ฟินแลนด์"] = {}, ["South Ostrobothnia, ฟินแลนด์"] = {}, ["Southern Ostrobothnia, ฟินแลนด์"] = {alias_of = "South Ostrobothnia, ฟินแลนด์", display = true}, ["Ostrobothnia, ฟินแลนด์"] = {wp = "%l (ภูมิภาค)"}, ["Central Ostrobothnia, ฟินแลนด์"] = {}, ["Pirkanmaa, ฟินแลนด์"] = {}, ["Satakunta, ฟินแลนด์"] = {}, ["Päijänne Tavastia, ฟินแลนด์"] = {}, ["Päijät-Häme, ฟินแลนด์"] = {alias_of = "Päijänne Tavastia, ฟินแลนด์", display = true}, ["Tavastia Proper, ฟินแลนด์"] = {}, ["Kanta-Häme, ฟินแลนด์"] = {alias_of = "Tavastia Proper, ฟินแลนด์", display = true}, ["Kymenlaakso, ฟินแลนด์"] = {}, ["Uusimaa, 
ฟินแลนด์"] = {}, ["Southwest Finland, ฟินแลนด์"] = {}, ["Åland Islands, ฟินแลนด์"] = {the = true, wp = "Åland"}, ["Åland, ฟินแลนด์"] = {alias_of = "Åland Islands, ฟินแลนด์"}, -- differs in "the" } -- regions of Finland export.finland_group = { default_container = "ฟินแลนด์", default_placetype = "ภูมิภาค", default_divs = "เทศบาล", data = export.finland_regions, } export.france_administrative_regions = { ["Auvergne-Rhône-Alpes, ฝรั่งเศส"] = {}, ["Bourgogne-Franche-Comté, ฝรั่งเศส"] = {}, ["Brittany, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Centre-Val de Loire, ฝรั่งเศส"] = {}, ["Corsica, ฝรั่งเศส"] = {}, -- overseas departments are handled in `export.country_like_entities` -- ["French Guiana"] = {}, ["Grand Est, ฝรั่งเศส"] = {}, -- ["Guadeloupe"] = {}, ["Hauts-de-France, ฝรั่งเศส"] = {}, ["Île-de-France, ฝรั่งเศส"] = {}, -- ["Martinique"] = {}, -- ["Mayotte"] = {}, ["Normandy, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Nouvelle-Aquitaine, ฝรั่งเศส"] = {}, ["Occitania, ฝรั่งเศส"] = {wp = "%l (administrative region)"}, ["Occitanie, ฝรั่งเศส"] = {alias_of = "Occitania, ฝรั่งเศส", display = true}, ["Pays de la Loire, ฝรั่งเศส"] = {}, ["Provence-Alpes-Côte d'Azur, ฝรั่งเศส"] = {}, -- ["Réunion"] = {}, } -- administrative regions of France export.france_group = { default_container = "ฝรั่งเศส", -- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back -- to 'region'). 
default_placetype = "ภูมิภาค", default_divs = { "communes", {type = "เทศบาล", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, }, data = export.france_administrative_regions, } export.france_departments = { ["Ain, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 01 ["Aisne, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 02 ["Allier, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 03 ["Alpes-de-Haute-Provence, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04 ["Hautes-Alpes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05 ["Alpes-Maritimes, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06 ["Ardèche, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 07 ["Ardennes, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 08 ["Ariège, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 09 ["Aube, ฝรั่งเศส"] = {container = "Grand Est"}, -- 10 ["Aude, ฝรั่งเศส"] = {container = "Occitania"}, -- 11 ["Aveyron, ฝรั่งเศส"] = {container = "Occitania"}, -- 12 ["Bouches-du-Rhône, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13 ["Calvados, ฝรั่งเศส"] = {container = "Normandy", wp = "%l (department)"}, -- 14 ["Cantal, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 15 ["Charente, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 16 ["Charente-Maritime, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 17 ["Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18 ["Corrèze, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 19 ["Corse-du-Sud, ฝรั่งเศส"] = {container = "Corsica"}, -- 2A ["Haute-Corse, ฝรั่งเศส"] = {container = "Corsica"}, -- 2B ["Côte-d'Or, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 21 ["Côte d'Or, ฝรั่งเศส"] = {alias_of = "Côte-d'Or, ฝรั่งเศส", display = true}, 
["Côtes-d'Armor, ฝรั่งเศส"] = {container = "Brittany"}, -- 22 ["Côtes d'Armor, ฝรั่งเศส"] = {alias_of = "Côtes-d'Armor, ฝรั่งเศส", display = true}, ["Creuse, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 23 ["Dordogne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 24 ["Doubs, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 25 ["Drôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 26 ["Eure, ฝรั่งเศส"] = {container = "Normandy"}, -- 27 ["Eure-et-Loir, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 28 ["Finistère, ฝรั่งเศส"] = {container = "Brittany"}, -- 29 ["Gard, ฝรั่งเศส"] = {container = "Occitania"}, -- 30 ["Haute-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 31 ["Gers, ฝรั่งเศส"] = {container = "Occitania"}, -- 32 ["Gironde, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 33 ["Hérault, ฝรั่งเศส"] = {container = "Occitania"}, -- 34 ["Ille-et-Vilaine, ฝรั่งเศส"] = {container = "Brittany"}, -- 35 ["Indre, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 36 ["Indre-et-Loire, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 37 ["Isère, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 38 ["Jura, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39 ["Landes, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40 ["Loir-et-Cher, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 41 ["Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42 ["Haute-Loire, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 43 ["Loire-Atlantique, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 44 ["Loiret, ฝรั่งเศส"] = {container = "Centre-Val de Loire"}, -- 45 ["Lot, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 46 ["Lot-et-Garonne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 47 ["Lozère, ฝรั่งเศส"] = {container = "Occitania"}, -- 48 ["Maine-et-Loire, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 49 
["Manche, ฝรั่งเศส"] = {container = "Normandy"}, -- 50 ["Marne, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 51 ["Haute-Marne, ฝรั่งเศส"] = {container = "Grand Est"}, -- 52 ["Mayenne, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 53 ["Meurthe-et-Moselle, ฝรั่งเศส"] = {container = "Grand Est"}, -- 54 ["Meuse, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 55 ["Morbihan, ฝรั่งเศส"] = {container = "Brittany"}, -- 56 ["Moselle, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 57 ["Nièvre, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 58 ["Nord, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59 ["Oise, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 60 ["Orne, ฝรั่งเศส"] = {container = "Normandy"}, -- 61 ["Pas-de-Calais, ฝรั่งเศส"] = {container = "Hauts-de-France"}, -- 62 ["Puy-de-Dôme, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 63 ["Pyrénées-Atlantiques, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 64 ["Hautes-Pyrénées, ฝรั่งเศส"] = {container = "Occitania"}, -- 65 ["Pyrénées-Orientales, ฝรั่งเศส"] = {container = "Occitania"}, -- 66 ["Bas-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 67 ["Haut-Rhin, ฝรั่งเศส"] = {container = "Grand Est"}, -- 68 ["Rhône, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D ["Metropolis of Lyon, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M ["Lyon Metropolis, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"}, ["Lyon, ฝรั่งเศส"] = {alias_of = "Metropolis of Lyon, ฝรั่งเศส"}, ["Haute-Saône, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 70 ["Saône-et-Loire, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 71 ["Sarthe, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 72 ["Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 73 ["Haute-Savoie, ฝรั่งเศส"] = {container = "Auvergne-Rhône-Alpes"}, -- 74 ["Paris, ฝรั่งเศส"] = 
{container = "Île-de-France"}, -- 75 ["Seine-Maritime, ฝรั่งเศส"] = {container = "Normandy"}, -- 76 ["Seine-et-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 77 ["Yvelines, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 78 ["Deux-Sèvres, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 79 ["Somme, ฝรั่งเศส"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80 ["Tarn, ฝรั่งเศส"] = {container = "Occitania", wp = "%l (department)"}, -- 81 ["Tarn-et-Garonne, ฝรั่งเศส"] = {container = "Occitania"}, -- 82 ["Var, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83 ["Vaucluse, ฝรั่งเศส"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84 ["Vendée, ฝรั่งเศส"] = {container = "Pays de la Loire"}, -- 85 ["Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86 ["Haute-Vienne, ฝรั่งเศส"] = {container = "Nouvelle-Aquitaine"}, -- 87 ["Vosges, ฝรั่งเศส"] = {container = "Grand Est", wp = "%l (department)"}, -- 88 ["Yonne, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 89 ["Territoire de Belfort, ฝรั่งเศส"] = {container = "Bourgogne-Franche-Comté"}, -- 90 ["Essonne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 91 ["Hauts-de-Seine, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 92 ["Seine-Saint-Denis, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 93 ["Val-de-Marne, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 94 ["Val-d'Oise, ฝรั่งเศส"] = {container = "Île-de-France"}, -- 95 --["Guadeloupe"] = {container = "Guadeloupe"}, -- 971 --["Martinique"] = {container = "Martinique"}, -- 972 --["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973 --["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974 --["Mayotte"] = {container = "Mayotte"}, -- 976 } export.france_departments_group = { placename_to_key = make_placename_to_key(", ฝรั่งเศส"), canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"), default_placetype = "department", 
	default_divs = {
		"communes",
		{type = "เทศบาล", cat_as = "communes"},
	},
	data = export.france_departments,
}

export.germany_states = {
	["Baden-Württemberg, เยอรมนี"] = {},
	["Bavaria, เยอรมนี"] = {},
	-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
	-- the default_divs setting. Better not to include them at all since they're included as cities down below.
	-- ["Berlin"] = {divs = {}},
	["Brandenburg, เยอรมนี"] = {},
	-- ["Bremen"] = {divs = {}},
	-- ["Hamburg"] = {divs = {}},
	["Hesse, เยอรมนี"] = {},
	["Lower Saxony, เยอรมนี"] = {},
	["Mecklenburg-Vorpommern, เยอรมนี"] = {},
	["Mecklenburg-Western Pomerania, เยอรมนี"] = {alias_of = "Mecklenburg-Vorpommern, เยอรมนี", display = true},
	["North Rhine-Westphalia, เยอรมนี"] = {},
	["Rhineland-Palatinate, เยอรมนี"] = {},
	["Saarland, เยอรมนี"] = {},
	["Saxony, เยอรมนี"] = {},
	["Saxony-Anhalt, เยอรมนี"] = {},
	["Schleswig-Holstein, เยอรมนี"] = {},
	["Thuringia, เยอรมนี"] = {},
}

-- states of Germany
export.germany_group = {
	default_container = "เยอรมนี",
	default_placetype = "รัฐ",
	default_divs = {"อำเภอ", "เทศบาล"},
	data = export.germany_states,
}

export.greece_regions = {
	["Attica, กรีซ"] = {wp = "%l (ภูมิภาค)"},
	["Central Greece, กรีซ"] = {wp = "%l (administrative region)"},
	["Central Macedonia, กรีซ"] = {},
	["Crete, กรีซ"] = {},
	["Eastern Macedonia and Thrace, กรีซ"] = {},
	["Epirus, กรีซ"] = {wp = "%l (ภูมิภาค)"},
	["Ionian Islands, กรีซ"] = {the = true, wp = "%l (ภูมิภาค)"},
	["North Aegean, กรีซ"] = {the = true},
	-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
	-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
	-- has "the" in it.
["Peloponnese, กรีซ"] = {wp = "%l (ภูมิภาค)"}, ["South Aegean, กรีซ"] = {the = true}, ["Thessaly, กรีซ"] = {}, ["Western Greece, กรีซ"] = {}, ["Western Macedonia, กรีซ"] = {}, ["Mount Athos, กรีซ"] = {placetype = {"autonomous region", "ภูมิภาค"}, wp = "Monastic community of Mount Athos"}, } -- regions of Greece export.greece_group = { default_container = "กรีซ", default_placetype = "ภูมิภาค", data = export.greece_regions, } local india_polity_with_divisions = {"divisions", "อำเภอ"} local india_polity_without_divisions = {"อำเภอ"} -- States and union territories of India. Only some of them are divided into divisions. export.india_states_and_union_territories = { ["Andaman and Nicobar Islands, อินเดีย"] = {the = true, placetype = "union territory", divs = india_polity_without_divisions}, ["Andhra Pradesh, อินเดีย"] = {divs = india_polity_without_divisions}, ["Arunachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Assam, อินเดีย"] = {divs = india_polity_with_divisions}, ["Bihar, อินเดีย"] = {divs = india_polity_with_divisions}, ["Chandigarh, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Chhattisgarh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Dadra and Nagar Haveli and Daman and Diu, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Delhi, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions}, ["Goa, อินเดีย"] = {divs = india_polity_without_divisions}, ["Gujarat, อินเดีย"] = {divs = india_polity_without_divisions}, ["Haryana, อินเดีย"] = {divs = india_polity_with_divisions}, ["Himachal Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Jammu and Kashmir, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions, wp = "%l (union territory)"}, ["Jharkhand, อินเดีย"] = {divs = india_polity_with_divisions}, ["Karnataka, อินเดีย"] = {divs = india_polity_with_divisions}, ["Kerala, อินเดีย"] = {divs = 
india_polity_without_divisions}, ["Ladakh, อินเดีย"] = {placetype = "union territory", divs = india_polity_with_divisions}, ["Lakshadweep, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Madhya Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Maharashtra, อินเดีย"] = {divs = india_polity_with_divisions}, ["Manipur, อินเดีย"] = {divs = india_polity_without_divisions}, ["Meghalaya, อินเดีย"] = {divs = india_polity_with_divisions}, ["Mizoram, อินเดีย"] = {divs = india_polity_without_divisions}, ["Nagaland, อินเดีย"] = {divs = india_polity_with_divisions}, ["Odisha, อินเดีย"] = {divs = india_polity_with_divisions}, ["Puducherry, อินเดีย"] = {placetype = "union territory", divs = india_polity_without_divisions, wp = "%l (union territory)"}, ["Pondicherry, อินเดีย"] = {alias_of = "Puducherry, อินเดีย", display = true}, ["Punjab, อินเดีย"] = {divs = india_polity_with_divisions, wp = "%l, %c"}, ["Rajasthan, อินเดีย"] = {divs = india_polity_with_divisions}, ["Sikkim, อินเดีย"] = {divs = india_polity_without_divisions}, ["Tamil Nadu, อินเดีย"] = {divs = india_polity_without_divisions}, ["Telangana, อินเดีย"] = {divs = india_polity_without_divisions}, ["Tripura, อินเดีย"] = {divs = india_polity_without_divisions}, ["Uttar Pradesh, อินเดีย"] = {divs = india_polity_with_divisions}, ["Uttarakhand, อินเดีย"] = {divs = india_polity_with_divisions}, ["West Bengal, อินเดีย"] = {divs = india_polity_with_divisions}, } -- states and union territories of India export.india_group = { default_container = "อินเดีย", default_placetype = "รัฐ", data = export.india_states_and_union_territories, } export.indonesia_provinces = { ["Aceh, อินโดนีเซีย"] = {}, ["Bali, อินโดนีเซีย"] = {}, ["Bangka Belitung Islands, อินโดนีเซีย"] = {the = true}, ["Banten, อินโดนีเซีย"] = {}, ["Bengkulu, อินโดนีเซีย"] = {}, ["Central Java, อินโดนีเซีย"] = {}, ["Central Kalimantan, อินโดนีเซีย"] = {}, ["Central Papua, อินโดนีเซีย"] = {}, ["Central Sulawesi, 
อินโดนีเซีย"] = {},
	["East Java, อินโดนีเซีย"] = {},
	["East Kalimantan, อินโดนีเซีย"] = {},
	["East Nusa Tenggara, อินโดนีเซีย"] = {},
	["Gorontalo, อินโดนีเซีย"] = {},
	["Highland Papua, อินโดนีเซีย"] = {wp = "%l"},
	["Special Capital Region of Jakarta, อินโดนีเซีย"] = {the = true, wp = "Jakarta"},
	["Jakarta, อินโดนีเซีย"] = {alias_of = "Special Capital Region of Jakarta, อินโดนีเซีย"},
	["Jambi, อินโดนีเซีย"] = {},
	["Lampung, อินโดนีเซีย"] = {},
	["Maluku, อินโดนีเซีย"] = {},
	["North Kalimantan, อินโดนีเซีย"] = {},
	["North Maluku, อินโดนีเซีย"] = {},
	["North Sulawesi, อินโดนีเซีย"] = {},
	["North Papua, อินโดนีเซีย"] = {},
	["North Sumatra, อินโดนีเซีย"] = {},
	["Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
	["Riau, อินโดนีเซีย"] = {},
	["Riau Islands, อินโดนีเซีย"] = {the = true},
	["Southeast Sulawesi, อินโดนีเซีย"] = {},
	["South Kalimantan, อินโดนีเซีย"] = {},
	["South Papua, อินโดนีเซีย"] = {},
	["South Sulawesi, อินโดนีเซีย"] = {},
	["South Sumatra, อินโดนีเซีย"] = {},
	["Southwest Papua, อินโดนีเซีย"] = {},
	["West Java, อินโดนีเซีย"] = {},
	["West Kalimantan, อินโดนีเซีย"] = {},
	["West Nusa Tenggara, อินโดนีเซีย"] = {},
	["West Papua, อินโดนีเซีย"] = {wp = "%l (จังหวัด)"},
	["West Sulawesi, อินโดนีเซีย"] = {},
	["West Sumatra, อินโดนีเซีย"] = {},
	["Special Region of Yogyakarta, อินโดนีเซีย"] = {the = true},
	["Yogyakarta, อินโดนีเซีย"] = {alias_of = "Special Region of Yogyakarta, อินโดนีเซีย"},
}

-- provinces of Indonesia
export.indonesia_group = {
	default_container = "อินโดนีเซีย",
	default_placetype = "จังหวัด",
	-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
	-- spellings.
data = export.indonesia_provinces, } export.iran_provinces = { ["Alborz, อิหร่าน"] = {}, -- abbreviation AL, capital [[w:Karaj]] ["Ardabil, อิหร่าน"] = {}, -- abbreviation AR, capital [[w:Ardabil]] ["Bushehr, อิหร่าน"] = {}, -- abbreviation BU, capital [[w:Bushehr]] ["Chaharmahal and Bakhtiari, อิหร่าน"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]] ["East Azerbaijan, อิหร่าน"] = {}, -- abbreviation EA, capital [[w:Tabriz]] ["Fars, อิหร่าน"] = {}, -- abbreviation FA, capital [[w:Shiraz]] ["Pars, อิหร่าน"] = {alias_of = "Fars, อิหร่าน", display = true}, ["Gilan, อิหร่าน"] = {}, -- abbreviation GN, capital [[w:Rasht]] ["Golestan, อิหร่าน"] = {}, -- abbreviation GO, capital [[w:Gorgan]] ["Hamadan, อิหร่าน"] = {}, -- abbreviation HA, capital [[w:Hamadan]] ["Hormozgan, อิหร่าน"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]] ["Ilam, อิหร่าน"] = {}, -- abbreviation IL, capital [[w:Ilam, อิหร่าน|Ilam]] ["Isfahan, อิหร่าน"] = {}, -- abbreviation IS, capital [[w:Isfahan]] ["Kerman, อิหร่าน"] = {}, -- abbreviation KN, capital [[w:Kerman]] ["Kermanshah, อิหร่าน"] = {}, -- abbreviation KE, capital [[w:Kermanshah]] ["Khuzestan, อิหร่าน"] = {}, -- abbreviation KH, capital [[w:Ahvaz]] ["Kohgiluyeh and Boyer-Ahmad, อิหร่าน"] = {}, -- abbreviation KB, capital [[w:Yasuj]] ["Kurdistan, อิหร่าน"] = {}, -- abbreviation KU, capital [[w:Sanandaj]] ["Lorestan, อิหร่าน"] = {}, -- abbreviation LO, capital [[w:Khorramabad]] ["Markazi, อิหร่าน"] = {}, -- abbreviation MA, capital [[w:Arak, อิหร่าน|Arak]] ["Mazandaran, อิหร่าน"] = {}, -- abbreviation MN, capital [[w:Sari, อิหร่าน|Sari]] ["North Khorasan, อิหร่าน"] = {}, -- abbreviation NK, capital [[w:Bojnord]] ["Qazvin, อิหร่าน"] = {}, -- abbreviation QA, capital [[w:Qazvin]] ["Qom, อิหร่าน"] = {}, -- abbreviation QM, capital [[w:Qom]] ["Razavi Khorasan, อิหร่าน"] = {}, -- abbreviation RK, capital [[w:Mashhad]] ["Semnan, อิหร่าน"] = {}, -- abbreviation SE, capital [[w:Semnan, อิหร่าน|Semnan]] ["Sistan and Baluchestan, อิหร่าน"] = 
{}, -- abbreviation SB, capital [[w:Zahedan]] ["South Khorasan, อิหร่าน"] = {}, -- abbreviation SK, capital [[w:Birjand]] ["Tehran, อิหร่าน"] = {}, -- abbreviation TE, capital [[w:Tehran]] ["West Azerbaijan, อิหร่าน"] = {}, -- abbreviation WA, capital [[w:Urmia]] ["Yazd, อิหร่าน"] = {}, -- abbreviation YA, capital [[w:Yazd]] ["Zanjan, อิหร่าน"] = {}, -- abbreviation ZA, capital [[w:Zanjan, อิหร่าน|Zanjan]] } -- provinces of Iran export.iran_group = { key_to_placename = make_key_to_placename(", อิหร่าน$"), placename_to_key = make_placename_to_key(", อิหร่าน"), default_container = "อิหร่าน", default_placetype = "จังหวัด", -- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them -- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]], -- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].) -- default_divs = "เทศมณฑล", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "จังหวัด%e", data = export.iran_provinces, } export.ireland_counties = { ["County Carlow, ไอร์แลนด์"] = {}, ["County Cavan, ไอร์แลนด์"] = {}, ["County Clare, ไอร์แลนด์"] = {}, ["County Cork, ไอร์แลนด์"] = {}, ["County Donegal, ไอร์แลนด์"] = {}, ["County Dublin, ไอร์แลนด์"] = {}, ["County Galway, ไอร์แลนด์"] = {}, ["County Kerry, ไอร์แลนด์"] = {}, ["County Kildare, ไอร์แลนด์"] = {}, ["County Kilkenny, ไอร์แลนด์"] = {}, ["County Laois, ไอร์แลนด์"] = {}, ["County Leitrim, ไอร์แลนด์"] = {}, ["County Limerick, ไอร์แลนด์"] = {}, ["County Longford, ไอร์แลนด์"] = {}, ["County Louth, ไอร์แลนด์"] = {}, ["County Mayo, ไอร์แลนด์"] = {}, ["County Meath, ไอร์แลนด์"] = {}, ["County Monaghan, ไอร์แลนด์"] = {}, ["County Offaly, ไอร์แลนด์"] = {}, ["County Roscommon, ไอร์แลนด์"] = {}, ["County Sligo, ไอร์แลนด์"] = {}, ["County Tipperary, ไอร์แลนด์"] = {}, ["County Waterford, ไอร์แลนด์"] = {}, ["County Westmeath, ไอร์แลนด์"] = {}, ["County 
Wexford, ไอร์แลนด์"] = {},
	["County Wicklow, ไอร์แลนด์"] = {},
}

-- Return a function that converts a key such as "County Cork, ไอร์แลนด์" into two values: the full placename with
-- the container removed ("County Cork") and the elliptical placename with the leading "County " also removed
-- ("Cork"). `container_pattern` is a Lua pattern matching the container suffix of the key.
local function make_irish_type_key_to_placename(container_pattern)
	return function(key)
		key = key:gsub(container_pattern, "")
		local elliptical_key = key:gsub("^County ", "")
		return key, elliptical_key
	end
end

-- Return a function that converts a placename into a key: "County " is prefixed unless the placename already begins
-- with "County " or "City ", and then `container_suffix` is appended.
local function make_irish_type_placename_to_key(container_suffix)
	return function(placename)
		if not placename:find("^County ") and not placename:find("^City ") then
			placename = "County " .. placename
		end
		return placename .. container_suffix
	end
end

-- counties of Ireland
export.ireland_group = {
	key_to_placename = make_irish_type_key_to_placename(", ไอร์แลนด์$"),
	placename_to_key = make_irish_type_placename_to_key(", ไอร์แลนด์"),
	default_container = "ไอร์แลนด์",
	default_placetype = "เทศมณฑล",
	data = export.ireland_counties,
}

export.italy_administrative_regions = {
	["Abruzzo, Italy"] = {},
	["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Apulia, Italy"] = {},
	["Basilicata, Italy"] = {},
	["Calabria, Italy"] = {},
	["Campania, Italy"] = {},
	["Emilia-Romagna, Italy"] = {},
	["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Lazio, Italy"] = {},
	["Liguria, Italy"] = {},
	["Lombardy, Italy"] = {},
	["Marche, Italy"] = {},
	["Molise, Italy"] = {},
	["Piedmont, Italy"] = {},
	["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "ภูมิภาค"}},
	["Tuscany, Italy"] = {},
	["Umbria, Italy"] = {},
	["Veneto, Italy"] = {},
}

-- administrative regions of Italy
export.italy_group = {
	default_container = "อิตาลี",
	default_placetype = "ภูมิภาค",
	data = export.italy_administrative_regions,
}

-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
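-- Illustrative sketch only; it sits inside a long comment, so the module never executes it. The keys in the table
-- below take the form "<prefecture>, ญี่ปุ่น". The sketch ASSUMES that the `make_key_to_placename` and
-- `make_placename_to_key` helpers (defined elsewhere in this module) amount to stripping and appending that suffix;
-- `strip_container` and `add_container` are hypothetical stand-ins, not the module's own functions.
--[==[
local function strip_container(key, container_pattern)
	-- gsub() returns the new string plus a match count; the parentheses keep only the string
	return (key:gsub(container_pattern, ""))
end

local function add_container(placename, container_suffix)
	return placename .. container_suffix
end

assert(strip_container("ไอจิ, ญี่ปุ่น", ", ญี่ปุ่น$") == "ไอจิ")
assert(add_container("ไอจิ", ", ญี่ปุ่น") == "ไอจิ, ญี่ปุ่น")
]==]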
export.japan_prefectures = { ["ไอจิ, ญี่ปุ่น"] = {}, ["อากิตะ, ญี่ปุ่น"] = {}, ["อาโอโมริ, ญี่ปุ่น"] = {}, ["จิบะ, ญี่ปุ่น"] = {}, ["เอฮิเมะ, ญี่ปุ่น"] = {}, ["ฟูกูอิ, ญี่ปุ่น"] = {}, ["ฟูกูโอกะ, ญี่ปุ่น"] = {}, ["ฟูกูชิมะ, ญี่ปุ่น"] = {}, ["กิฟุ, ญี่ปุ่น"] = {}, ["กุมมะ, ญี่ปุ่น"] = {}, ["ฮิโรชิมะ, ญี่ปุ่น"] = {}, ["ฮกไกโด, ญี่ปุ่น"] = {divs = "กิ่งจังหวัด", wp = "ฮกไกโด"}, ["เฮียวโงะ, ญี่ปุ่น"] = {}, --["Hyogo, ญี่ปุ่น"] = {alias_of = "เฮียวโงะ, ญี่ปุ่น", display = true}, ["อิบารากิ, ญี่ปุ่น"] = {}, ["อิชิกาวะ, ญี่ปุ่น"] = {}, ["อิวาเตะ, ญี่ปุ่น"] = {}, ["คางาวะ, ญี่ปุ่น"] = {}, ["คาโงชิมะ, ญี่ปุ่น"] = {}, ["คานางาวะ, ญี่ปุ่น"] = {}, ["โคจิ, ญี่ปุ่น"] = {}, --["Kochi, ญี่ปุ่น"] = {alias_of = "โคจิ, ญี่ปุ่น", display = true}, ["คูมาโมโตะ, ญี่ปุ่น"] = {}, ["เกียวโต, ญี่ปุ่น"] = {}, ["มิเอะ, ญี่ปุ่น"] = {}, ["มิยางิ, ญี่ปุ่น"] = {}, ["มิยาซากิ, ญี่ปุ่น"] = {}, ["นางาโนะ, ญี่ปุ่น"] = {}, ["นางาซากิ, ญี่ปุ่น"] = {}, ["นาระ, ญี่ปุ่น"] = {}, ["นีงาตะ, ญี่ปุ่น"] = {}, ["โออิตะ, ญี่ปุ่น"] = {}, --["Oita, ญี่ปุ่น"] = {alias_of = "โออิตะ, ญี่ปุ่น", display = true}, ["โอกายามะ, ญี่ปุ่น"] = {}, ["โอกินาวะ, ญี่ปุ่น"] = {}, ["โอซากะ, ญี่ปุ่น"] = {}, ["ซางะ, ญี่ปุ่น"] = {}, ["ไซตามะ, ญี่ปุ่น"] = {}, ["ชิงะ, ญี่ปุ่น"] = {}, ["ชิมาเนะ, ญี่ปุ่น"] = {}, ["ชิซูโอกะ, ญี่ปุ่น"] = {}, ["โทจิงิ, ญี่ปุ่น"] = {}, ["โทกูชิมะ, ญี่ปุ่น"] = {}, ["ทตโตริ, ญี่ปุ่น"] = {}, ["โทยามะ, ญี่ปุ่น"] = {}, ["วากายามะ, ญี่ปุ่น"] = {}, ["ยามางาตะ, ญี่ปุ่น"] = {}, ["ยามางูจิ, ญี่ปุ่น"] = {}, ["ยามานาชิ, ญี่ปุ่น"] = {}, } -- prefectures of Japan export.japan_group = { key_to_placename = make_key_to_placename(", ญี่ปุ่น$"), placename_to_key = make_placename_to_key(", ญี่ปุ่น"), default_container = "ญี่ปุ่น", default_placetype = "จังหวัด", default_wp = "จังหวัด%e", data = export.japan_prefectures, } export.laos_provinces = { ["Attapeu Province, Laos"] = {}, ["Bokeo Province, Laos"] = {}, ["Bolikhamxai Province, Laos"] = {}, ["Champasak Province, Laos"] = {}, ["Houaphanh Province, Laos"] = {}, ["Khammouane 
Province, Laos"] = {},
	["Luang Namtha Province, Laos"] = {},
	["Luang Prabang Province, Laos"] = {},
	["Oudomxay Province, Laos"] = {},
	["Phongsaly Province, Laos"] = {},
	["Salavan Province, Laos"] = {},
	["Savannakhet Province, Laos"] = {},
	["Vientiane Province, Laos"] = {},
	["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
	["Sainyabuli Province, Laos"] = {},
	["Sekong Province, Laos"] = {},
	["Xaisomboun Province, Laos"] = {},
	["Xiangkhouang Province, Laos"] = {},
}

-- Convert a placename into a key. "Vientiane Prefecture" and names already ending in " Province" only get
-- ", Laos" appended; any other name gets " Province, Laos" appended.
local function laos_placename_to_key(placename)
	if placename == "Vientiane Prefecture" then
		return placename .. ", Laos"
	end
	if placename:find(" Province$") then
		return placename .. ", Laos"
	end
	return placename .. " Province, Laos"
end

-- provinces of Laos
export.laos_group = {
	key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
	placename_to_key = laos_placename_to_key,
	default_container = "Laos",
	default_placetype = "จังหวัด",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "%e province",
	data = export.laos_provinces,
}

export.lebanon_governorates = {
	["Akkar Governorate, Lebanon"] = {},
	["Baalbek-Hermel Governorate, Lebanon"] = {},
	["Beirut Governorate, Lebanon"] = {},
	["Beqaa Governorate, Lebanon"] = {},
	["Keserwan-Jbeil Governorate, Lebanon"] = {},
	["Mount Lebanon Governorate, Lebanon"] = {},
	["Nabatieh Governorate, Lebanon"] = {},
	-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
	-- `gov/South Governorate` with `c/Lebanon`.
	["North Governorate, Lebanon"] = {no_auto_augment_container = true},
	["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}

-- governorates of Lebanon
export.lebanon_group = {
	key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
	placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
	default_container = "Lebanon",
	default_placetype = "governorate",
	data = export.lebanon_governorates,
}

export.malaysia_states = {
	["Johor, Malaysia"] = {},
	["Kedah, Malaysia"] = {},
	["Kelantan, Malaysia"] = {},
	["Malacca, Malaysia"] = {},
	["Negeri Sembilan, Malaysia"] = {},
	["Pahang, Malaysia"] = {},
	["Penang, Malaysia"] = {},
	["Perak, Malaysia"] = {},
	["Perlis, Malaysia"] = {},
	["Sabah, Malaysia"] = {},
	["Sarawak, Malaysia"] = {},
	["Selangor, Malaysia"] = {},
	["Terengganu, Malaysia"] = {},
}

-- states of Malaysia
export.malaysia_group = {
	default_container = "Malaysia",
	default_placetype = "รัฐ",
	default_wp = "%l, %c",
	data = export.malaysia_states,
}

export.malta_regions = {
	-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
	-- `r/Northern Region` with `c/Malta`. In particular:
	-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
	--   El Salvador;
	-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
	-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
	--   Serbia and Uganda;
	-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true}, ["Gozo Region, Malta"] = {wp = "%l"}, ["Northern Region, Malta"] = {no_auto_augment_container = true}, ["Port Region, Malta"] = {}, ["Southern Region, Malta"] = {no_auto_augment_container = true}, ["Western Region, Malta"] = {no_auto_augment_container = true}, } -- regions of Malta export.malta_group = { key_to_placename = make_key_to_placename(", Malta$", " Region"), placename_to_key = make_placename_to_key(", Malta", " Region"), default_container = "Malta", default_placetype = "ภูมิภาค", default_wp = "%l, %c", default_the = true, data = export.malta_regions, } export.mexico_states = { ["Aguascalientes, Mexico"] = {}, ["Baja California, Mexico"] = {}, -- not display-canonicalizing because the "Norte" could be for emphasis ["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"}, ["Baja California Sur, Mexico"] = {}, ["Campeche, Mexico"] = {}, ["Chiapas, Mexico"] = {}, ["Chihuahua, Mexico"] = {wp = "%l (รัฐ)"}, ["Coahuila, Mexico"] = {}, ["Colima, Mexico"] = {}, ["Durango, Mexico"] = {}, ["Guanajuato, Mexico"] = {}, ["Guerrero, Mexico"] = {}, ["Hidalgo, Mexico"] = {wp = "%l (รัฐ)"}, ["Jalisco, Mexico"] = {}, ["State of Mexico, Mexico"] = {the = true}, ["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the" -- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city ["Michoacán, Mexico"] = {}, ["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true}, ["Morelos, Mexico"] = {}, ["Nayarit, Mexico"] = {}, ["Nuevo León, Mexico"] = {}, ["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true}, ["Oaxaca, Mexico"] = {}, ["Puebla, Mexico"] = {}, ["Querétaro, Mexico"] = {}, ["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true}, ["Quintana Roo, Mexico"] = {}, ["San Luis Potosí, Mexico"] = {}, ["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true}, ["Sinaloa, Mexico"] = {}, 
["Sonora, Mexico"] = {}, ["Tabasco, Mexico"] = {}, ["Tamaulipas, Mexico"] = {}, ["Tlaxcala, Mexico"] = {}, ["Veracruz, Mexico"] = {}, ["Yucatán, Mexico"] = {}, ["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true}, ["Zacatecas, Mexico"] = {}, } -- Mexican states export.mexico_group = { default_container = "Mexico", default_placetype = "รัฐ", data = export.mexico_states, } export.moldova_districts_and_autonomous_territorial_units = { ["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]] ["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]] ["Briceni District, Moldova"] = {}, -- capital [[Briceni]] ["Cahul District, Moldova"] = {}, -- capital [[Cahul]] ["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]] ["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]] ["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]] ["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]] ["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]] ["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]] ["Drochia District, Moldova"] = {}, -- capital [[Drochia]] ["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]] ["Edineț District, Moldova"] = {}, -- capital [[Edineț]] ["Fălești District, Moldova"] = {}, -- capital [[Fălești]] ["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]] ["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]] ["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]] ["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]] ["Leova District, Moldova"] = {}, -- capital [[Leova]] ["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]] ["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]] ["Orhei District, Moldova"] = {}, -- capital [[Orhei]] ["Rezina District, Moldova"] = {}, -- capital [[Rezina]] ["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]] ["Sîngerei District, Moldova"] = {}, -- capital 
[[Sîngerei]] ["Soroca District, Moldova"] = {}, -- capital [[Soroca]] ["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]] ["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]] ["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]] ["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]] ["Telenești District, Moldova"] = {}, -- capital [[Telenești]] ["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]] ["Chișinău, Moldova"] = {placetype = "เทศบาล"}, ["Bălți, Moldova"] = {placetype = "เทศบาล"}, ["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Comrat]] -- the remainder are under the de-facto control of the unrecognized state of Transnistria ["Bender, Moldova"] = {placetype = "เทศบาล"}, ["Tighina, Moldova"] = {alias_of = "Bender, Moldova"}, ["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "ภูมิภาค"}}, -- capital [[Tiraspol]] ["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true}, ["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true}, } local function moldova_placename_to_key(placename) local elliptical_key = placename .. ", Moldova" if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then return elliptical_key end if placename:find(" District$") then return placename .. ", Moldova" end return placename .. 
" District, Moldova" end -- Moldovan districts (raions) and autonomous territorial units export.moldova_group = { key_to_placename = make_key_to_placename(", Moldova$", " District"), placename_to_key = moldova_placename_to_key, default_container = "Moldova", default_placetype = {"district", "raion"}, default_divs = "communes", data = export.moldova_districts_and_autonomous_territorial_units, } export.morocco_regions = { ["Tangier-Tetouan-Al Hoceima, Morocco"] = {}, ["Oriental, Morocco"] = {wp = "%l (%c)"}, ["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true}, ["Fez-Meknes, Morocco"] = {}, ["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"}, ["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true}, ["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"}, ["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true}, ["Casablanca-Settat, Morocco"] = {}, ["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash ["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true}, ["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"}, ["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true}, ["Souss-Massa, Morocco"] = {}, ["Guelmim-Oued Noun, Morocco"] = { keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]" }, ["Laayoune-Sakia El Hamra, Morocco"] = { wp = "Laâyoune-Sakia El Hamra", keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]", }, ["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true}, ["Dakhla-Oued Ed-Dahab, Morocco"] = { keydesc = "+++. 
'''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]", }, } -- regions of Morocco export.morocco_group = { default_container = "Morocco", default_placetype = "ภูมิภาค", data = export.morocco_regions, } export.egypt_governorates = { ["Cairo Governorate, Egypt"] = {}, ["Giza Governorate, Egypt"] = {}, ["Sharqia Governorate, Egypt"] = {}, ["Dakahlia Governorate, Egypt"] = {}, ["Beheira Governorate, Egypt"] = {}, ["Minya Governorate, Egypt"] = {}, ["Qalyubia Governorate, Egypt"] = {}, ["Sohag Governorate, Egypt"] = {}, ["Alexandria Governorate, Egypt"] = {}, ["Gharbia Governorate, Egypt"] = {}, ["Asyut Governorate, Egypt"] = {}, ["Monufia Governorate, Egypt"] = {}, ["Faiyum Governorate, Egypt"] = {}, ["Kafr El Sheikh Governorate, Egypt"] = {}, ["Qena Governorate, Egypt"] = {}, ["Beni Suef Governorate, Egypt"] = {}, ["Damietta Governorate, Egypt"] = {}, ["Aswan Governorate, Egypt"] = {}, ["Ismailia Governorate, Egypt"] = {}, ["Luxor Governorate, Egypt"] = {}, ["Suez Governorate, Egypt"] = {}, ["Port Said Governorate, Egypt"] = {}, ["Matrouh Governorate, Egypt"] = {}, ["North Sinai Governorate, Egypt"] = {}, ["Red Sea Governorate, Egypt"] = {}, ["New Valley Governorate, Egypt"] = {}, ["South Sinai Governorate, Egypt"] = {}, } -- governorates of Egypt export.egypt_group = { key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"), placename_to_key = make_placename_to_key(", Egypt", " Governorate"), default_container = "อียิปต์", default_placetype = "governorate", data = export.egypt_governorates, } export.netherlands_provinces = { ["Drenthe, Netherlands"] = {}, ["Flevoland, Netherlands"] = {}, ["Friesland, Netherlands"] = {}, ["Gelderland, Netherlands"] = {}, ["Groningen, Netherlands"] = {wp = "%l (จังหวัด)"}, ["Limburg, Netherlands"] = {wp = "%l (%c)"}, ["North Brabant, Netherlands"] = {}, -- Foreign forms get display-canonicalized. 
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true}, ["North Holland, Netherlands"] = {}, ["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true}, ["Overijssel, Netherlands"] = {}, ["South Holland, Netherlands"] = {}, ["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true}, ["Utrecht, Netherlands"] = {wp = "%l (จังหวัด)"}, ["Zeeland, Netherlands"] = {}, } -- provinces of the Netherlands export.netherlands_group = { default_container = "เนเธอร์แลนด์", default_placetype = "จังหวัด", default_divs = "เทศบาล", data = export.netherlands_provinces, } export.new_zealand_regions = { -- North Island regions ["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]] ["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]] ["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]] ["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]] ["Gisborne, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]] ["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]] ["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]] ["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]] ["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]] -- South Island regions 
["Tasman, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]] ["Nelson, New Zealand"] = {placetype = {"ภูมิภาค", "นคร"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]] ["Marlborough, New Zealand"] = {placetype = {"ภูมิภาค", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]] ["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]] ["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]] ["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]] ["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]] } -- regions of New Zealand export.new_zealand_group = { default_container = "New Zealand", default_placetype = "ภูมิภาค", data = export.new_zealand_regions, } export.nigeria_states = { ["Abia State, Nigeria"] = {}, ["Adamawa State, Nigeria"] = {}, ["Akwa Ibom State, Nigeria"] = {}, ["Anambra State, Nigeria"] = {}, ["Bauchi State, Nigeria"] = {}, ["Bayelsa State, Nigeria"] = {}, ["Benue State, Nigeria"] = {}, ["Borno State, Nigeria"] = {}, ["Cross River State, Nigeria"] = {}, ["Delta State, Nigeria"] = {}, ["Ebonyi State, Nigeria"] = {}, ["Edo State, Nigeria"] = {}, ["Ekiti State, Nigeria"] = {}, ["Enugu State, Nigeria"] = {}, ["Federal Capital Territory, Nigeria"] = { -- not a state but allow it to be referenced as one in holonyms placetype = {"federal territory", "ดินแดน", "รัฐ"}, the = true, wp = "%l (%c)", }, ["Gombe State, Nigeria"] = {}, ["Imo State, Nigeria"] = {}, ["Jigawa State, Nigeria"] = {}, ["Kaduna State, Nigeria"] = {}, ["Kano State, Nigeria"] = {}, ["Katsina State, Nigeria"] = {}, ["Kebbi State, Nigeria"] = {}, ["Kogi State, 
Nigeria"] = {}, ["Kwara State, Nigeria"] = {}, ["Lagos State, Nigeria"] = {}, ["Nasarawa State, Nigeria"] = {}, ["Niger State, Nigeria"] = {}, ["Ogun State, Nigeria"] = {}, ["Ondo State, Nigeria"] = {}, ["Osun State, Nigeria"] = {}, ["Oyo State, Nigeria"] = {}, ["Plateau State, Nigeria"] = {}, ["Rivers State, Nigeria"] = {}, ["Sokoto State, Nigeria"] = {}, ["Taraba State, Nigeria"] = {}, ["Yobe State, Nigeria"] = {}, ["Zamfara State, Nigeria"] = {}, } -- states of Nigeria export.nigeria_group = { key_to_placename = make_key_to_placename(", Nigeria$", " State$"), placename_to_key = make_placename_to_key(", Nigeria", " State"), default_container = "Nigeria", default_placetype = "รัฐ", data = export.nigeria_states, } export.north_korea_provinces = { ["Chagang Province, North Korea"] = {}, ["North Hamgyong Province, North Korea"] = {}, ["South Hamgyong Province, North Korea"] = {}, ["North Hwanghae Province, North Korea"] = {}, ["South Hwanghae Province, North Korea"] = {}, ["Kangwon Province, North Korea"] = {wp = "%l (%c)"}, ["North Pyongan Province, North Korea"] = {}, ["South Pyongan Province, North Korea"] = {}, ["Ryanggang Province, North Korea"] = {}, } -- provinces of North Korea export.north_korea_group = { key_to_placename = make_key_to_placename(", North Korea$", " Province$"), placename_to_key = make_placename_to_key(", North Korea", " Province"), default_container = "North Korea", default_placetype = "จังหวัด", data = export.north_korea_provinces, } export.norwegian_counties = { ["Oslo, Norway"] = {}, ["Rogaland, Norway"] = {}, ["Møre og Romsdal, Norway"] = {}, ["Nordland, Norway"] = {}, ["Østfold, Norway"] = {}, ["Akershus, Norway"] = {}, ["Buskerud, Norway"] = {}, -- the following two were merged into Innlandet -- ["Hedmark, Norway"] = {}, -- ["Oppland, Norway"] = {}, ["Innlandet, Norway"] = {}, ["Vestfold, Norway"] = {}, ["Telemark, Norway"] = {}, -- the following two were merged into Agder -- ["Aust-Agder, Norway"] = {}, -- ["Vest-Agder, Norway"] = {}, 
["Agder, Norway"] = {}, -- the following two were merged into Vestland -- ["Hordaland, Norway"] = {}, -- ["Sogn og Fjordane, Norway"] = {}, ["Vestland, Norway"] = {}, ["Trøndelag, Norway"] = {}, ["Troms, Norway"] = {}, ["Finnmark, Norway"] = {}, } -- counties of Norway export.norway_group = { default_container = "Norway", default_placetype = "เทศมณฑล", data = export.norwegian_counties, } export.pakistan_provinces_and_territories = { ["Azad Kashmir, Pakistan"] = { placetype = {"administrative territory", "autonomous territory", "ดินแดน"}, }, ["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true}, ["Balochistan, Pakistan"] = {wp = "%l, %c"}, ["Gilgit-Baltistan, Pakistan"] = { placetype = {"administrative territory", "ดินแดน"}, }, ["Islamabad Capital Territory, Pakistan"] = { the = true, divs = {}, -- no divisions placetype = {"federal territory", "administrative territory", "ดินแดน"}, }, -- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes ["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"}, ["Khyber Pakhtunkhwa, Pakistan"] = {}, ["Punjab, Pakistan"] = {wp = "%l, %c"}, ["Sindh, Pakistan"] = {}, } -- provinces and territories of Pakistan export.pakistan_group = { default_container = "Pakistan", default_placetype = "จังหวัด", default_divs = "divisions", data = export.pakistan_provinces_and_territories, } export.philippines_provinces = { ["Abra, Philippines"] = {wp = "%l (จังหวัด)"}, ["Agusan del Norte, Philippines"] = {}, ["Agusan del Sur, Philippines"] = {}, ["Aklan, Philippines"] = {}, ["Albay, Philippines"] = {}, ["Antique, Philippines"] = {wp = "%l (จังหวัด)"}, ["Apayao, Philippines"] = {}, ["Aurora, Philippines"] = {wp = "%l (จังหวัด)"}, ["Basilan, Philippines"] = {}, ["Bataan, Philippines"] = {}, ["Batanes, Philippines"] = {}, ["Batangas, Philippines"] = {}, ["Benguet, Philippines"] = {}, ["Biliran, Philippines"] = {}, ["Bohol, Philippines"] = {}, ["Bukidnon, 
Philippines"] = {}, ["Bulacan, Philippines"] = {}, ["Cagayan, Philippines"] = {}, ["Camarines Norte, Philippines"] = {}, ["Camarines Sur, Philippines"] = {}, ["Camiguin, Philippines"] = {}, ["Capiz, Philippines"] = {}, ["Catanduanes, Philippines"] = {}, ["Cavite, Philippines"] = {}, ["Cebu, Philippines"] = {}, ["Cotabato, Philippines"] = {}, ["Davao de Oro, Philippines"] = {}, ["Davao del Norte, Philippines"] = {}, ["Davao del Sur, Philippines"] = {}, ["Davao Occidental, Philippines"] = {}, ["Davao Oriental, Philippines"] = {}, ["Dinagat Islands, Philippines"] = {the = true}, ["Eastern Samar, Philippines"] = {}, ["Guimaras, Philippines"] = {}, ["Ifugao, Philippines"] = {}, ["Ilocos Norte, Philippines"] = {}, ["Ilocos Sur, Philippines"] = {}, ["Iloilo, Philippines"] = {}, ["Isabela, Philippines"] = {wp = "%l (จังหวัด)"}, ["Kalinga, Philippines"] = {wp = "%l (จังหวัด)"}, ["La Union, Philippines"] = {}, ["Laguna, Philippines"] = {wp = "%l (จังหวัด)"}, ["Lanao del Norte, Philippines"] = {}, ["Lanao del Sur, Philippines"] = {}, ["Leyte, Philippines"] = {wp = "%l (จังหวัด)"}, ["Maguindanao del Norte, Philippines"] = {}, ["Maguindanao del Sur, Philippines"] = {}, ["Marinduque, Philippines"] = {}, ["Masbate, Philippines"] = {}, ["Misamis Occidental, Philippines"] = {}, ["Misamis Oriental, Philippines"] = {}, ["Mountain Province, Philippines"] = {}, ["Negros Occidental, Philippines"] = {}, ["Negros Oriental, Philippines"] = {}, ["Northern Samar, Philippines"] = {}, ["Nueva Ecija, Philippines"] = {}, ["Nueva Vizcaya, Philippines"] = {}, ["Occidental Mindoro, Philippines"] = {}, ["Oriental Mindoro, Philippines"] = {}, ["Palawan, Philippines"] = {}, ["Pampanga, Philippines"] = {}, ["Pangasinan, Philippines"] = {}, ["Quezon, Philippines"] = {}, ["Quirino, Philippines"] = {}, ["Rizal, Philippines"] = {wp = "%l (จังหวัด)"}, ["Romblon, Philippines"] = {}, ["Samar, Philippines"] = {wp = "%l (จังหวัด)"}, ["Sarangani, Philippines"] = {}, ["Siquijor, Philippines"] = {}, ["Sorsogon, 
Philippines"] = {}, ["South Cotabato, Philippines"] = {}, ["Southern Leyte, Philippines"] = {}, ["Sultan Kudarat, Philippines"] = {}, ["Sulu, Philippines"] = {}, ["Surigao del Norte, Philippines"] = {}, ["Surigao del Sur, Philippines"] = {}, ["Tarlac, Philippines"] = {}, ["Tawi-Tawi, Philippines"] = {}, ["Zambales, Philippines"] = {}, ["Zamboanga del Norte, Philippines"] = {}, ["Zamboanga del Sur, Philippines"] = {}, ["Zamboanga Sibugay, Philippines"] = {}, -- not a province but treated as one; allow it to be referred to as a province in holonyms ["Metro Manila, Philippines"] = {placetype = {"ภูมิภาค", "จังหวัด"}}, } -- provinces of the Philippines export.philippines_group = { default_container = "Philippines", default_placetype = "จังหวัด", default_divs = {"เทศบาล", "barangays"}, data = export.philippines_provinces, } export.poland_voivodeships = { ["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław ["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal) ["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin ["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal) ["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź ["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true}, ["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków ["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw ["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole ["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów ["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok ["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk ["Silesian Voivodeship, 
Poland"] = {}, -- abbr SL, code 24, capital Katowice ["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce ["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true}, ["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn ["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań ["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin } -- voivodeships of Poland export.poland_group = { key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"), placename_to_key = make_placename_to_key(", Poland", " Voivodeship"), default_container = "Poland", default_placetype = "voivodeship", default_divs = { -- "เทศมณฑล", -- not enough of them currently {type = "Polish colonies", cat_as = {{type = "villages", prep = "ใน"}}}, }, data = export.poland_voivodeships, } export.portugal_districts_and_autonomous_regions = { ["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "ภูมิภาค"}}, ["Aveiro District, Portugal"] = {}, ["Beja District, Portugal"] = {}, ["Braga District, Portugal"] = {}, ["Bragança District, Portugal"] = {}, ["Castelo Branco District, Portugal"] = {}, ["Coimbra District, Portugal"] = {}, ["Évora District, Portugal"] = {}, ["Faro District, Portugal"] = {}, ["Guarda District, Portugal"] = {}, ["Leiria District, Portugal"] = {}, ["Lisbon District, Portugal"] = {}, ["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true}, ["Madeira, Portugal"] = {placetype = {"autonomous region", "ภูมิภาค"}}, ["Portalegre District, Portugal"] = {}, ["Porto District, Portugal"] = {}, ["Santarém District, Portugal"] = {}, ["Setúbal District, Portugal"] = {}, ["Viana do Castelo District, Portugal"] = {}, ["Vila Real District, Portugal"] = {}, ["Viseu District, Portugal"] = {}, } local function portugal_placename_to_key(placename) 
if placename == "Azores" or placename == "Madeira" then return placename .. ", Portugal" end if placename:find(" District$") then return placename .. ", Portugal" end return placename .. " District, Portugal" end -- districts and autonomous regions of Portugal export.portugal_group = { key_to_placename = make_key_to_placename(", Portugal$", " District$"), placename_to_key = portugal_placename_to_key, default_container = "Portugal", default_placetype = "district", default_divs = "เทศบาล", data = export.portugal_districts_and_autonomous_regions, } export.romania_counties = { ["Alba County, Romania"] = {}, ["Arad County, Romania"] = {}, ["Argeș County, Romania"] = {}, ["Bacău County, Romania"] = {}, ["Bihor County, Romania"] = {}, ["Bistrița-Năsăud County, Romania"] = {}, ["Botoșani County, Romania"] = {}, ["Brașov County, Romania"] = {}, ["Brăila County, Romania"] = {}, -- Bucharest: not in a county ["Buzău County, Romania"] = {}, ["Caraș-Severin County, Romania"] = {}, ["Cluj County, Romania"] = {}, ["Constanța County, Romania"] = {}, ["Covasna County, Romania"] = {}, ["Călărași County, Romania"] = {}, ["Dolj County, Romania"] = {}, ["Dâmbovița County, Romania"] = {}, ["Galați County, Romania"] = {}, ["Giurgiu County, Romania"] = {}, ["Gorj County, Romania"] = {}, ["Harghita County, Romania"] = {}, ["Hunedoara County, Romania"] = {}, ["Ialomița County, Romania"] = {}, ["Iași County, Romania"] = {}, ["Ilfov County, Romania"] = {}, ["Maramureș County, Romania"] = {}, ["Mehedinți County, Romania"] = {}, ["Mureș County, Romania"] = {}, ["Neamț County, Romania"] = {}, ["Olt County, Romania"] = {}, ["Prahova County, Romania"] = {}, ["Satu Mare County, Romania"] = {}, ["Sibiu County, Romania"] = {}, ["Suceava County, Romania"] = {}, ["Sălaj County, Romania"] = {}, ["Teleorman County, Romania"] = {}, ["Timiș County, Romania"] = {}, ["Tulcea County, Romania"] = {}, ["Vaslui County, Romania"] = {}, ["Vrancea County, Romania"] = {}, ["Vâlcea County, Romania"] = {}, } -- 
counties of Romania export.romania_group = { key_to_placename = make_key_to_placename(", Romania$", " County$"), placename_to_key = make_placename_to_key(", Romania", " County"), default_container = "Romania", default_placetype = "เทศมณฑล", default_divs = "communes", data = export.romania_counties, } local function make_russia_federal_subject_spec(spectype, use_the, wp) return { placetype = spectype, the = not not use_the, bare_category_parent_type = {"federal subjects", spectype .. "s"}, wp = wp, } end local russia_autonomous_okrug_no_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}} local russia_autonomous_okrug_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}, the = true} local russia_krai = make_russia_federal_subject_spec("krai") local russia_oblast = make_russia_federal_subject_spec("oblast") local russia_republic_the = make_russia_federal_subject_spec("republic", "use the") local russia_republic_no_the = make_russia_federal_subject_spec("republic") export.russia_federal_subjects = { -- autonomous oblasts ["Jewish Autonomous Oblast, Russia"] = {the = true, placetype = {"autonomous oblast", "oblast"}, bare_category_parent_type = {"federal subjects", "autonomous oblasts"}}, -- autonomous okrugs ["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"}, ["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"}, ["Yamalo-Nenets Autonomous Okrug, Russia"] = 
russia_autonomous_okrug_the, ["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"}, -- krais ["Altai Krai, Russia"] = russia_krai, ["Kamchatka Krai, Russia"] = russia_krai, ["Khabarovsk Krai, Russia"] = russia_krai, ["Krasnodar Krai, Russia"] = russia_krai, ["Krasnoyarsk Krai, Russia"] = russia_krai, ["Perm Krai, Russia"] = russia_krai, ["Primorsky Krai, Russia"] = russia_krai, ["Stavropol Krai, Russia"] = russia_krai, ["Zabaykalsky Krai, Russia"] = russia_krai, -- oblasts ["Amur Oblast, Russia"] = russia_oblast, ["Arkhangelsk Oblast, Russia"] = russia_oblast, ["Astrakhan Oblast, Russia"] = russia_oblast, ["Belgorod Oblast, Russia"] = russia_oblast, ["Bryansk Oblast, Russia"] = russia_oblast, ["Chelyabinsk Oblast, Russia"] = russia_oblast, ["Irkutsk Oblast, Russia"] = russia_oblast, ["Ivanovo Oblast, Russia"] = russia_oblast, ["Kaliningrad Oblast, Russia"] = russia_oblast, ["Kaluga Oblast, Russia"] = russia_oblast, ["Kemerovo Oblast, Russia"] = russia_oblast, ["Kirov Oblast, Russia"] = russia_oblast, ["Kostroma Oblast, Russia"] = russia_oblast, ["Kurgan Oblast, Russia"] = russia_oblast, ["Kursk Oblast, Russia"] = russia_oblast, ["Leningrad Oblast, Russia"] = russia_oblast, ["Lipetsk Oblast, Russia"] = russia_oblast, ["Magadan Oblast, Russia"] = russia_oblast, ["Moscow Oblast, Russia"] = russia_oblast, ["Murmansk Oblast, Russia"] = russia_oblast, ["Nizhny Novgorod Oblast, Russia"] = russia_oblast, ["Novgorod Oblast, Russia"] = russia_oblast, ["Novosibirsk Oblast, Russia"] = russia_oblast, ["Omsk Oblast, Russia"] = russia_oblast, ["Orenburg Oblast, Russia"] = russia_oblast, ["Oryol Oblast, Russia"] = russia_oblast, ["Penza Oblast, Russia"] = russia_oblast, ["Pskov Oblast, Russia"] = russia_oblast, ["Rostov Oblast, Russia"] = russia_oblast, ["Ryazan Oblast, Russia"] = russia_oblast, ["Sakhalin Oblast, Russia"] = russia_oblast, ["Samara Oblast, Russia"] = russia_oblast, ["Saratov Oblast, Russia"] = russia_oblast, ["Smolensk Oblast, Russia"] = 
russia_oblast, ["Sverdlovsk Oblast, Russia"] = russia_oblast, ["Tambov Oblast, Russia"] = russia_oblast, ["Tomsk Oblast, Russia"] = russia_oblast, ["Tula Oblast, Russia"] = russia_oblast, ["Tver Oblast, Russia"] = russia_oblast, ["Tyumen Oblast, Russia"] = russia_oblast, ["Ulyanovsk Oblast, Russia"] = russia_oblast, ["Vladimir Oblast, Russia"] = russia_oblast, ["Volgograd Oblast, Russia"] = russia_oblast, ["Vologda Oblast, Russia"] = russia_oblast, ["Voronezh Oblast, Russia"] = russia_oblast, ["Yaroslavl Oblast, Russia"] = russia_oblast, -- republics -- -- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where -- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by -- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence -- of "the". ["Adygea, Russia"] = russia_republic_no_the, ["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true}, ["Bashkortostan, Russia"] = russia_republic_no_the, ["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true}, ["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"}, ["Buryatia, Russia"] = russia_republic_no_the, ["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true}, ["Dagestan, Russia"] = russia_republic_no_the, ["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true}, ["Ingushetia, Russia"] = russia_republic_no_the, ["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true}, ["Kalmykia, Russia"] = russia_republic_no_the, ["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true}, ["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"), ["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true}, ["Khakassia, Russia"] = russia_republic_no_the, ["Republic of 
Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true}, ["Mordovia, Russia"] = russia_republic_no_the, ["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true}, ["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash ["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true}, ["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Tatarstan, Russia"] = russia_republic_no_the, ["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true}, ["Altai Republic, Russia"] = russia_republic_the, ["Chechnya, Russia"] = russia_republic_no_the, ["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true}, ["Chuvashia, Russia"] = russia_republic_no_the, ["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true}, ["Kabardino-Balkaria, Russia"] = russia_republic_no_the, ["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true}, ["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true}, ["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = "Kabardino-Balkarian Republic, Russia", the = true}, ["Karachay-Cherkessia, Russia"] = russia_republic_no_the, ["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia"}, ["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"), ["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true}, ["Mari El, Russia"] = russia_republic_no_the, ["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true}, ["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"), ["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true}, 
["Yakutia, Russia"] = {alias_of = "Sakha, Russia"}, ["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"}, ["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia", the = true}, ["Tuva, Russia"] = russia_republic_no_the, ["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true}, ["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true}, ["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display= "Tuva Republic, Russia", the = true}, ["Udmurtia, Russia"] = russia_republic_no_the, ["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true}, -- Not included due to being unrecognized and only partly controlled: -- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)") -- ["Donetsk People's Republic, Russia"] = russia_republic_the, -- ["Luhansk People's Republic, Russia"] = russia_republic_the, -- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"), -- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"), -- There are also federal cities (not included because they're cities): -- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above) } local function russia_key_to_placename(key) key = key:gsub(",.*", "") local full_placename = key if key == "Jewish Autonomous Oblast" then return full_placename, full_placename end local elliptical_placename for _, suffix in ipairs({"Krai", "Oblast"}) do elliptical_placename = key:match("^(.*) " .. suffix .. "$") if elliptical_placename then return full_placename, elliptical_placename end end return full_placename, full_placename end local function russia_placename_to_key(placename) local key = placename .. ", Russia" if export.russia_federal_subjects[key] then return key end -- We allow the user to say e.g. 
"obl/Samara" in place of "obl/Samara Oblast". for _, suffix in ipairs({"Krai", "Oblast"}) do local suffixed_key = placename .. " " .. suffix .. ", Russia" if export.russia_federal_subjects[suffixed_key] then return suffixed_key end end return placename .. ", Russia" end local function construct_russia_federal_subject_keydesc(group, key, spec) local placename = key:gsub(",.*", "") local linked_placename = export.construct_linked_placename(spec, placename) local placetype = spec.placetype if type(placetype) == "table" then placetype = placetype[1] end if placetype == "oblast" then -- Hack: Oblasts generally don't have entries under "Foo Oblast" -- but just under "Foo", so fix the linked key appropriately; -- doesn't apply to the Jewish Autonomous Oblast linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast") end return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]" end -- federal subjects of Russia export.russia_group = { key_to_placename = russia_key_to_placename, placename_to_key = russia_placename_to_key, default_container = "Russia", default_keydesc = construct_russia_federal_subject_keydesc, default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"}, data = export.russia_federal_subjects, } export.saudi_arabia_provinces = { ["Riyadh Province, Saudi Arabia"] = {}, ["Mecca Province, Saudi Arabia"] = {}, -- Name is too generic to assume it's in Saudi Arabia if not specified. 
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"}, ["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"}, ["Aseer Province, Saudi Arabia"] = {wp = "Asir"}, ["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true}, ["Jazan Province, Saudi Arabia"] = {}, ["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"}, ["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true}, ["Tabuk Province, Saudi Arabia"] = {}, ["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"}, ["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"}, ["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true}, ["Najran Province, Saudi Arabia"] = {}, ["Northern Borders Province, Saudi Arabia"] = {}, ["Al-Bahah Province, Saudi Arabia"] = {}, } -- provinces of Saudi Arabia export.saudi_arabia_group = { key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"), placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"), default_container = "Saudi Arabia", default_placetype = "จังหวัด", data = export.saudi_arabia_provinces, } export.south_africa_provinces = { ["Eastern Cape, South Africa"] = {the = true}, ["Free State, South Africa"] = {the = true, wp = "%l (จังหวัด)"}, ["Gauteng, South Africa"] = {}, ["KwaZulu-Natal, South Africa"] = {}, ["Limpopo, South Africa"] = {}, ["Mpumalanga, South Africa"] = {}, -- per Wikipedia and other sources, `North West` doesn't normally have `the` before it ["North West, South Africa"] = {wp = "%l (South African province)"}, ["Northern Cape, South Africa"] = {the = true}, ["Western Cape, South Africa"] = {the = true}, } -- provinces of South Africa export.south_africa_group = { 
default_container = "South Africa", default_placetype = "จังหวัด", default_divs = "เทศบาล", data = export.south_africa_provinces, } export.south_korea_provinces = { ["North Chungcheong Province, South Korea"] = {}, ["South Chungcheong Province, South Korea"] = {}, ["Gangwon Province, South Korea"] = {wp = "%l, %c"}, ["Gyeonggi Province, South Korea"] = {}, ["North Gyeongsang Province, South Korea"] = {}, ["South Gyeongsang Province, South Korea"] = {}, ["North Jeolla Province, South Korea"] = {}, ["South Jeolla Province, South Korea"] = {}, ["Jeju Province, South Korea"] = {}, } -- provinces of South Korea export.south_korea_group = { key_to_placename = make_key_to_placename(", South Korea$", " Province$"), placename_to_key = make_placename_to_key(", South Korea", " Province"), default_container = "South Korea", default_placetype = "จังหวัด", data = export.south_korea_provinces, } export.spain_autonomous_communities = { ["Andalusia, Spain"] = {}, ["Aragon, Spain"] = {}, ["Asturias, Spain"] = {}, ["Balearic Islands, Spain"] = {the = true}, ["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"}, ["Canary Islands, Spain"] = {the = true}, ["Cantabria, Spain"] = {}, ["Castile and León, Spain"] = {}, ["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash ["Catalonia, Spain"] = {}, ["Community of Madrid, Spain"] = {the = true}, ["Extremadura, Spain"] = {}, ["Galicia, Spain"] = {wp = "%l (Spain)"}, ["La Rioja, Spain"] = {}, ["Murcia, Spain"] = {wp = "Region of %l"}, ["Navarre, Spain"] = {}, ["Valencia, Spain"] = {wp = "Valencian Community"}, ["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true}, } -- autonomous communities of Spain export.spain_group = { default_container = "Spain", default_placetype = "autonomous community", default_divs = {"เทศบาล", "comarcas"}, data = export.spain_autonomous_communities, } export.taiwan_counties = { ["จางฮว่า, ไต้หวัน"] = {}, ["เจียอี้, ไต้หวัน"] = {}, ["ซินจู๋, ไต้หวัน"] = 
{}, ["ฮวาเหลียน, ไต้หวัน"] = {}, ["จินเหมิน, ไต้หวัน"] = {wp = "หมู่เกาะจินเหมิน"}, ["เหลียนเจียง, ไต้หวัน"] = {wp = "หมู่เกาะหมาจู่"}, ["เหมียวลี่, ไต้หวัน"] = {}, ["หนานโถว, ไต้หวัน"] = {}, ["เผิงหู, ไต้หวัน"] = {wp = "เผิงหู"}, ["ผิงตง, ไต้หวัน"] = {}, ["ไถตง, ไต้หวัน"] = {}, ["อี๋หลาน, ไต้หวัน"] = {wp = "%l, %c"}, ["ยฺหวินหลิน, ไต้หวัน"] = {}, } -- counties of Taiwan export.taiwan_group = { key_to_placename = make_key_to_placename(", ไต้หวัน$"), placename_to_key = make_placename_to_key(", ไต้หวัน"), default_container = "ไต้หวัน", default_placetype = "เทศมณฑล", default_divs = {"อำเภอ", "townships"}, data = export.taiwan_counties, } export.thailand_provinces = { --ไม่ต้องเติม จังหวัด -- กรุงเทพมหานคร (Bangkok - special administrative area) ["อำนาจเจริญ, ไทย"] = {}, ["อ่างทอง, ไทย"] = {}, ["บึงกาฬ, ไทย"] = {}, ["บุรีรัมย์, ไทย"] = {}, ["ฉะเชิงเทรา, ไทย"] = {}, ["ชัยนาท, ไทย"] = {}, ["ชัยภูมิ, ไทย"] = {}, ["จันทบุรี, ไทย"] = {}, ["เชียงใหม่, ไทย"] = {}, ["เชียงราย, ไทย"] = {}, ["ชลบุรี, ไทย"] = {}, ["ชุมพร, ไทย"] = {}, ["กาฬสินธุ์, ไทย"] = {}, ["กำแพงเพชร, ไทย"] = {}, ["กาญจนบุรี, ไทย"] = {}, ["ขอนแก่น, ไทย"] = {}, ["กระบี่, ไทย"] = {}, ["ลำปาง, ไทย"] = {}, ["ลำพูน, ไทย"] = {}, ["เลย, ไทย"] = {}, ["ลพบุรี, ไทย"] = {}, ["แม่ฮ่องสอน, ไทย"] = {}, ["มหาสารคาม, ไทย"] = {}, ["มุกดาหาร, ไทย"] = {}, ["นครนายก, ไทย"] = {}, ["นครปฐม, ไทย"] = {}, ["นครพนม, ไทย"] = {}, ["นครราชสีมา, ไทย"] = {}, ["นครสวรรค์, ไทย"] = {}, ["นครศรีธรรมราช, ไทย"] = {}, ["น่าน, ไทย"] = {}, ["นราธิวาส, ไทย"] = {}, ["หนองบัวลำภู, ไทย"] = {}, ["หนองคาย, ไทย"] = {}, ["นนทบุรี, ไทย"] = {}, ["ปทุมธานี, ไทย"] = {}, ["ปัตตานี, ไทย"] = {}, ["พังงา, ไทย"] = {}, ["พัทลุง, ไทย"] = {}, ["พะเยา, ไทย"] = {}, ["เพชรบูรณ์, ไทย"] = {}, ["เพชรบุรี, ไทย"] = {}, ["พิจิตร, ไทย"] = {}, ["พิษณุโลก, ไทย"] = {}, ["พระนครศรีอยุธยา, ไทย"] = {}, ["แพร่, ไทย"] = {}, ["ภูเก็ต, ไทย"] = {}, ["ปราจีนบุรี, ไทย"] = {}, ["ประจวบคีรีขันธ์, ไทย"] = {}, ["ระนอง, ไทย"] = {}, ["ราชบุรี, ไทย"] = {}, ["ระยอง, ไทย"] = {}, ["ร้อยเอ็ด, ไทย"] = 
{}, ["สระแก้ว, ไทย"] = {}, ["สกลนคร, ไทย"] = {}, ["สมุทรปราการ, ไทย"] = {}, ["สมุทรสาคร, ไทย"] = {}, ["สมุทรสงคราม, ไทย"] = {}, ["สระบุรี, ไทย"] = {}, ["สตูล, ไทย"] = {}, ["สิงห์บุรี, ไทย"] = {}, ["ศรีสะเกษ, ไทย"] = {}, ["สงขลา, ไทย"] = {}, ["สุโขทัย, ไทย"] = {}, ["สุพรรณบุรี, ไทย"] = {}, ["สุราษฎร์ธานี, ไทย"] = {}, ["สุรินทร์, ไทย"] = {}, ["ตาก, ไทย"] = {}, ["ตรัง, ไทย"] = {}, ["ตราด, ไทย"] = {}, ["อุบลราชธานี, ไทย"] = {}, ["อุดรธานี, ไทย"] = {}, ["อุทัยธานี, ไทย"] = {}, ["อุตรดิตถ์, ไทย"] = {}, ["ยะลา, ไทย"] = {}, ["ยโสธร, ไทย"] = {}, } -- provinces of Thailand export.thailand_group = { key_to_placename = make_key_to_placename(", ไทย$"), --ไม่ต้องเติม จังหวัด placename_to_key = make_placename_to_key(", ไทย"), default_container = "ไทย", default_placetype = "จังหวัด", default_divs = "อำเภอ", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "จังหวัด%e", data = export.thailand_provinces, } export.turkey_provinces = { ["Adana Province, Turkey"] = {}, -- code 01 ["Adıyaman Province, Turkey"] = {}, -- code 02 ["Afyonkarahisar Province, Turkey"] = {}, -- code 03 ["Ağrı Province, Turkey"] = {}, -- code 04 ["Amasya Province, Turkey"] = {}, -- code 05 ["Ankara Province, Turkey"] = {}, -- code 06 ["Antalya Province, Turkey"] = {}, -- code 07 ["Artvin Province, Turkey"] = {}, -- code 08 ["Aydın Province, Turkey"] = {}, -- code 09 ["Balıkesir Province, Turkey"] = {}, -- code 10 ["Bilecik Province, Turkey"] = {}, -- code 11 ["Bingöl Province, Turkey"] = {}, -- code 12 ["Bitlis Province, Turkey"] = {}, -- code 13 ["Bolu Province, Turkey"] = {}, -- code 14 ["Burdur Province, Turkey"] = {}, -- code 15 ["Bursa Province, Turkey"] = {}, -- code 16 ["Çanakkale Province, Turkey"] = {}, -- code 17 ["Çankırı Province, Turkey"] = {}, -- code 18 ["Çorum Province, Turkey"] = {}, -- code 19 ["Denizli Province, Turkey"] = {}, -- code 20 ["Diyarbakır Province, Turkey"] = {}, -- code 21 ["Edirne Province, Turkey"] = {}, -- code 22 
["Elazığ Province, Turkey"] = {}, -- code 23 ["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true}, ["Erzincan Province, Turkey"] = {}, -- code 24 ["Erzurum Province, Turkey"] = {}, -- code 25 ["Eskişehir Province, Turkey"] = {}, -- code 26 ["Gaziantep Province, Turkey"] = {}, -- code 27 ["Giresun Province, Turkey"] = {}, -- code 28 ["Gümüşhane Province, Turkey"] = {}, -- code 29 ["Hakkâri Province, Turkey"] = {}, -- code 30 ["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true}, ["Hatay Province, Turkey"] = {}, -- code 31 ["Isparta Province, Turkey"] = {}, -- code 32 ["Mersin Province, Turkey"] = {}, -- code 33 -- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself ["İzmir Province, Turkey"] = {}, -- code 35 ["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true}, ["Kars Province, Turkey"] = {}, -- code 36 ["Kastamonu Province, Turkey"] = {}, -- code 37 ["Kayseri Province, Turkey"] = {}, -- code 38 ["Kırklareli Province, Turkey"] = {}, -- code 39 ["Kırşehir Province, Turkey"] = {}, -- code 40 ["Kocaeli Province, Turkey"] = {}, -- code 41 ["Konya Province, Turkey"] = {}, -- code 42 ["Kütahya Province, Turkey"] = {}, -- code 43 ["Malatya Province, Turkey"] = {}, -- code 44 ["Manisa Province, Turkey"] = {}, -- code 45 ["Kahramanmaraş Province, Turkey"] = {}, -- code 46 ["Mardin Province, Turkey"] = {}, -- code 47 ["Muğla Province, Turkey"] = {}, -- code 48 ["Muş Province, Turkey"] = {}, -- code 49 ["Nevşehir Province, Turkey"] = {}, -- code 50 ["Niğde Province, Turkey"] = {}, -- code 51 ["Ordu Province, Turkey"] = {}, -- code 52 ["Rize Province, Turkey"] = {}, -- code 53 ["Sakarya Province, Turkey"] = {}, -- code 54 ["Samsun Province, Turkey"] = {}, -- code 55 ["Siirt Province, Turkey"] = {}, -- code 56 ["Sinop Province, Turkey"] = {}, -- code 57 ["Sivas Province, Turkey"] = {}, -- code 58 ["Tekirdağ Province, Turkey"] = {}, -- code 59 
["Tokat Province, Turkey"] = {}, -- code 60 ["Trabzon Province, Turkey"] = {}, -- code 61 ["Tunceli Province, Turkey"] = {}, -- code 62 ["Şanlıurfa Province, Turkey"] = {}, -- code 63 ["Uşak Province, Turkey"] = {}, -- code 64 ["Van Province, Turkey"] = {}, -- code 65 ["Yozgat Province, Turkey"] = {}, -- code 66 ["Zonguldak Province, Turkey"] = {}, -- code 67 ["Aksaray Province, Turkey"] = {}, -- code 68 ["Bayburt Province, Turkey"] = {}, -- code 69 ["Karaman Province, Turkey"] = {}, -- code 70 ["Kırıkkale Province, Turkey"] = {}, -- code 71 ["Batman Province, Turkey"] = {}, -- code 72 ["Şırnak Province, Turkey"] = {}, -- code 73 ["Bartın Province, Turkey"] = {}, -- code 74 ["Ardahan Province, Turkey"] = {}, -- code 75 ["Iğdır Province, Turkey"] = {}, -- code 76 ["Yalova Province, Turkey"] = {}, -- code 77 ["Karabük Province, Turkey"] = {}, -- code 78 ["Kilis Province, Turkey"] = {}, -- code 79 ["Osmaniye Province, Turkey"] = {}, -- code 80 ["Düzce Province, Turkey"] = {}, -- code 81 } -- provinces of Turkey export.turkey_group = { key_to_placename = make_key_to_placename(", Turkey$", " Province$"), placename_to_key = make_placename_to_key(", Turkey", " Province"), default_container = "Turkey", default_placetype = "จังหวัด", default_divs = "อำเภอ", data = export.turkey_provinces, } export.ukraine_oblasts = { ["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA ["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB ["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE -- apparently will be renamed to 'Dnipro Oblast' ["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE ["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH ["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT ["Kharkiv Oblast, Ukraine"] = {}, -- capital 
[[Kharkiv]], license plate prefix AX, KX ["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT'' ["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX -- apparently will be renamed to 'Kropyvnytskyi Oblast' ["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA ["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI ["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true}, ["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB ["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC ["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE ["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH ["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true}, ["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI ["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK ["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM ["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO ["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB ["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC ["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO ["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP ["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true}, ["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM } -- oblasts of Ukraine export.ukraine_group = { key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"), 
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"), default_container = "Ukraine", default_placetype = "oblast", default_divs = {"raions", "hromadas"}, data = export.ukraine_oblasts, } export.united_kingdom_constituent_countries = { ["England"] = {divs = { "เทศมณฑล", "อำเภอ", {type = "local government districts", cat_as = "อำเภอ"}, { type = "local government districts with borough status", cat_as = {"อำเภอ", "boroughs"}, }, {type = "boroughs", cat_as = {"อำเภอ", "boroughs"}}, {type = "civil parishes", container_parent_type = false}, }}, ["Northern Ireland"] = { placetype = {"constituent country", "จังหวัด", "ประเทศ"}, divs = {"เทศมณฑล", "อำเภอ"}, }, ["Scotland"] = {divs = { {type = "council areas", container_parent_type = false}, "อำเภอ", }}, ["Wales"] = {divs = { "เทศมณฑล", {type = "county boroughs", container_parent_type = false}, {type = "communities", container_parent_type = false}, {type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}}, }}, } -- constituent countries and provinces of the United Kingdom export.united_kingdom_group = { placename_to_key = false, default_container = "สหราชอาณาจักร", default_placetype = {"constituent country", "ประเทศ"}, addl_divs = { "traditional counties", {type = "historical counties", cat_as = "traditional counties"}, }, -- Don't create categories like 'Category:en:Towns in the United Kingdom' -- or 'Category:en:Places in the United Kingdom'. default_no_container_cat = true, data = export.united_kingdom_constituent_countries, } export.england_counties = { -- NOTE: We used to have various other "no longer" counties commented out, which seems to refer to counties that -- existed officially at some point between 1889 and 1974, which I have removed. I have only kept the three -- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those -- still considered "historic counties" per [[w:Historic counties of England]]. 
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Bedfordshire, England"] = {}, ["Berkshire, England"] = {}, -- ["Brighton and Hove, England"] = {}, -- city -- ["Bristol, England"] = {}, -- city ["Buckinghamshire, England"] = {}, ["Cambridgeshire, England"] = {}, ["Cheshire, England"] = {}, -- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Cornwall, England"] = {}, -- ["Cumberland, England"] = {}, -- no longer (historic county) ["Cumbria, England"] = {}, ["Derbyshire, England"] = {}, ["Devon, England"] = {}, ["Dorset, England"] = {}, ["County Durham, England"] = {}, ["East Sussex, England"] = {}, ["Essex, England"] = {}, ["Gloucestershire, England"] = {}, ["Greater London, England"] = {}, ["Greater Manchester, England"] = {}, ["Hampshire, England"] = {}, ["Herefordshire, England"] = {}, ["Hertfordshire, England"] = {}, -- ["Humberside, England"] = {}, -- no longer (1974 to 1996) -- ["Huntingdonshire, England"] = {}, -- no longer (historic county) ["Isle of Wight, England"] = {the = true}, ["Kent, England"] = {}, ["Lancashire, England"] = {}, ["Leicestershire, England"] = {}, ["Lincolnshire, England"] = {}, ["Merseyside, England"] = {}, -- ["Middlesex, England"] = {}, -- no longer (historic county) ["Norfolk, England"] = {}, ["Northamptonshire, England"] = {}, ["Northumberland, England"] = {}, ["North Yorkshire, England"] = {}, ["Nottinghamshire, England"] = {}, ["Oxfordshire, England"] = {}, ["Rutland, England"] = {}, ["Shropshire, England"] = {}, ["Somerset, England"] = {}, ["South Humberside, England"] = {}, ["South Yorkshire, England"] = {}, ["Staffordshire, England"] = {}, ["Suffolk, England"] = {}, ["Surrey, England"] = {}, -- ["Sussex, England"] = {}, -- no longer (historic county) ["Tyne and Wear, England"] = {}, ["Warwickshire, England"] = {}, ["West Midlands, England"] = {the = true, wp = "%l (county)"}, -- ["Westmorland, England"] = {}, -- no longer (historic county) ["West Sussex, England"] = {}, 
["West Yorkshire, England"] = {}, ["Wiltshire, England"] = {}, ["Worcestershire, England"] = {}, -- ["Yorkshire, England"] = {}, -- no longer (historic county) ["East Riding of Yorkshire, England"] = {the = true}, } -- counties of England export.england_group = { default_container = {key = "England", placetype = "constituent country"}, default_placetype = "เทศมณฑล", default_divs = { "อำเภอ", {type = "local government districts", cat_as = "อำเภอ"}, { type = "local government districts with borough status", cat_as = {"อำเภอ", "boroughs"}, }, {type = "boroughs", cat_as = {"อำเภอ", "boroughs"}}, "civil parishes", }, data = export.england_counties, } export.northern_ireland_counties = { ["County Antrim, Northern Ireland"] = {}, ["County Armagh, Northern Ireland"] = {}, ["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"}, ["County Down, Northern Ireland"] = {}, ["County Fermanagh, Northern Ireland"] = {}, ["County Londonderry, Northern Ireland"] = {}, ["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"}, ["County Tyrone, Northern Ireland"] = {}, } -- counties of Northern Ireland export.northern_ireland_group = { key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"), placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"), default_container = {key = "Northern Ireland", placetype = "constituent country"}, default_placetype = "เทศมณฑล", data = export.northern_ireland_counties, } export.scotland_council_areas = { ["Aberdeenshire, Scotland"] = {}, ["Angus, Scotland"] = {wp = "%l, %c"}, ["Argyll and Bute, Scotland"] = {}, ["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"}, ["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"}, ["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"}, ["City of Dundee, Scotland"] = {the = true, wp = "Dundee"}, ["Dundee"] = {alias_of = "City of Dundee, Scotland"}, ["Dundee City"] = {alias_of = "City of Dundee, Scotland"}, 
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"}, ["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"}, ["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"}, ["Glasgow"] = {alias_of = "City of Glasgow, Scotland"}, ["Clackmannanshire, Scotland"] = {}, ["Dumfries and Galloway, Scotland"] = {}, ["East Ayrshire, Scotland"] = {}, ["East Dunbartonshire, Scotland"] = {}, ["East Lothian, Scotland"] = {}, ["East Renfrewshire, Scotland"] = {}, ["Falkirk, Scotland"] = {wp = "%l council area"}, ["Fife, Scotland"] = {}, ["Highland, Scotland"] = {wp = "%l council area"}, ["Inverclyde, Scotland"] = {}, ["Midlothian, Scotland"] = {}, ["Moray, Scotland"] = {}, ["North Ayrshire, Scotland"] = {}, ["North Lanarkshire, Scotland"] = {}, ["Orkney Islands, Scotland"] = {the = true}, ["Perth and Kinross, Scotland"] = {}, ["Renfrewshire, Scotland"] = {}, ["Scottish Borders, Scotland"] = {the = true}, ["Shetland Islands, Scotland"] = {the = true}, ["South Ayrshire, Scotland"] = {}, ["South Lanarkshire, Scotland"] = {}, ["Stirling, Scotland"] = {wp = "%l council area"}, ["West Dunbartonshire, Scotland"] = {}, ["West Lothian, Scotland"] = {}, ["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"}, ["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"}, } -- council areas of Scotland export.scotland_group = { default_container = {key = "Scotland", placetype = "constituent country"}, default_placetype = "council area", data = export.scotland_council_areas, } export.wales_principal_areas = { ["Blaenau Gwent, Wales"] = {}, ["Bridgend, Wales"] = {wp = "%l County Borough"}, ["Caerphilly, Wales"] = {wp = "%l County Borough"}, -- ["Cardiff, Wales"] = {placetype = "นคร"}, ["Carmarthenshire, Wales"] = {placetype = "เทศมณฑล"}, ["Ceredigion, Wales"] = {placetype = "เทศมณฑล"}, ["Conwy, Wales"] = {wp = "%l County Borough"}, ["Denbighshire, Wales"] = {placetype = "เทศมณฑล"}, ["Flintshire, Wales"] = {placetype = "เทศมณฑล"}, ["Gwynedd, Wales"] = 
{placetype = "เทศมณฑล"}, ["Isle of Anglesey, Wales"] = {the = true, placetype = "เทศมณฑล"}, ["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the" ["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"}, ["Monmouthshire, Wales"] = {placetype = "เทศมณฑล"}, ["Neath Port Talbot, Wales"] = {}, -- ["Newport, Wales"] = {placetype = "นคร", wp = "%l, %c"}, ["Pembrokeshire, Wales"] = {placetype = "เทศมณฑล"}, ["Powys, Wales"] = {placetype = "เทศมณฑล"}, ["Rhondda Cynon Taf, Wales"] = {}, -- ["Swansea, Wales"] = {placetype = "นคร"}, ["Torfaen, Wales"] = {}, ["Vale of Glamorgan, Wales"] = {the = true}, ["Wrexham, Wales"] = {wp = "%l County Borough"}, } -- principal areas (cities, counties and county boroughs) of Wales export.wales_group = { default_container = {key = "Wales", placetype = "constituent country"}, default_placetype = "county borough", data = export.wales_principal_areas, } export.united_states_states = { ["Alabama, USA"] = {}, ["Alaska, USA"] = {divs = { {type = "boroughs", container_parent_type = "เทศมณฑล"}, {type = "borough seats", container_parent_type = "county seats"}, }}, ["Arizona, USA"] = {}, ["Arkansas, USA"] = {}, ["California, USA"] = {}, ["Colorado, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}}, ["Connecticut, USA"] = {divs = {"เทศมณฑล", "county seats", "เทศบาล"}}, ["Delaware, USA"] = {}, ["Florida, USA"] = {}, ["Georgia, USA"] = {wp = "%l (U.S. 
state)"}, ["Hawaii, USA"] = {addl_parents = {"พอลินีเชีย"}}, ["Idaho, USA"] = {}, ["Illinois, USA"] = {}, ["Indiana, USA"] = {}, ["Iowa, USA"] = {}, ["Kansas, USA"] = {}, ["Kentucky, USA"] = {}, ["Louisiana, USA"] = {divs = { {type = "parishes", container_parent_type = "เทศมณฑล"}, {type = "parish seats", container_parent_type = "county seats"}, }}, ["Maine, USA"] = {}, ["Maryland, USA"] = {}, ["Massachusetts, USA"] = {}, ["Michigan, USA"] = {}, ["Minnesota, USA"] = {}, ["Mississippi, USA"] = {}, ["Missouri, USA"] = {}, ["Montana, USA"] = {}, ["Nebraska, USA"] = {}, ["Nevada, USA"] = {}, ["New Hampshire, USA"] = {}, ["New Jersey, USA"] = {divs = { "เทศมณฑล", "county seats", {type = "boroughs", prep = "ใน"}, }}, ["New Mexico, USA"] = {}, ["New York, USA"] = {wp = "%l (รัฐ)"}, ["North Carolina, USA"] = {}, ["North Dakota, USA"] = {}, ["Ohio, USA"] = {}, ["Oklahoma, USA"] = {}, ["Oregon, USA"] = {}, ["Pennsylvania, USA"] = {divs = { "เทศมณฑล", "county seats", {type = "boroughs", prep = "ใน"}, }}, ["Rhode Island, USA"] = {}, ["South Carolina, USA"] = {}, ["South Dakota, USA"] = {}, ["Tennessee, USA"] = {}, ["Texas, USA"] = {}, ["Utah, USA"] = {}, ["Vermont, USA"] = {}, ["Virginia, USA"] = {}, ["Washington, USA"] = {wp = "%l (รัฐ)"}, ["West Virginia, USA"] = {}, ["Wisconsin, USA"] = {}, ["Wyoming, USA"] = {}, } -- states of the United States export.united_states_group = { placename_to_key = make_placename_to_key(", USA"), default_container = "สหรัฐอเมริกา", default_placetype = "รัฐ", default_divs = {"เทศมณฑล", "county seats"}, addl_divs = { {type = "census-designated places", prep = "ใน"}, {type = "unincorporated communities", prep = "ใน"}, }, data = export.united_states_states, } export.vietnam_provinces = { -- [[Northeast (Vietnam)|Northeast]] region ["Bắc Giang, เวียดนาม"] = {}, -- capital [[Bắc Giang]] ["Bắc Kạn, เวียดนาม"] = {}, -- capital [[Bắc Kạn]] ["Cao Bằng, เวียดนาม"] = {}, -- capital [[Cao Bằng]] ["Hà Giang, เวียดนาม"] = {}, -- capital [[Hà Giang]] ["Lạng 
Sơn, เวียดนาม"] = {}, -- capital [[Lạng Sơn]] ["Phú Thọ, เวียดนาม"] = {}, -- capital [[Việt Trì]] ["Quảng Ninh, เวียดนาม"] = {}, -- capital [[Hạ Long]] ["Thái Nguyên, เวียดนาม"] = {}, -- capital [[Thái Nguyên]] ["Tuyên Quang, เวียดนาม"] = {}, -- capital [[Tuyên Quang]] -- [[Northwest (Vietnam)|Northwest]] region ["Lào Cai, เวียดนาม"] = {}, -- capital [[Lào Cai]] ["Yên Bái, เวียดนาม"] = {}, -- capital [[Yên Bái]] ["Điện Biên, เวียดนาม"] = {}, -- capital [[Điện Biên Phủ]] ["Hoà Bình, เวียดนาม"] = {}, -- capital [[Hoà Bình City|Hoà Bình]] ["Hòa Bình, เวียดนาม"] = {alias_of = "Hoà Bình, เวียดนาม", display = true}, ["Lai Châu, เวียดนาม"] = {}, -- capital [[Lai Châu]] ["Sơn La, เวียดนาม"] = {}, -- capital [[Sơn La]] -- [[Red River Delta]] region ["Bắc Ninh, เวียดนาม"] = {}, -- capital [[Bắc Ninh]] ["Hà Nam, เวียดนาม"] = {}, -- capital [[Phủ Lý]] ["Hải Dương, เวียดนาม"] = {}, -- capital [[Hải Dương]] ["Hưng Yên, เวียดนาม"] = {}, -- capital [[Hưng Yên]] ["Nam Định, เวียดนาม"] = {}, -- capital [[Nam Định]] ["Ninh Bình, เวียดนาม"] = {}, -- capital [[Ninh Bình|Hoa Lư]] ["Thái Bình, เวียดนาม"] = {}, -- capital [[Thái Bình]] ["Vĩnh Phúc, เวียดนาม"] = {}, -- capital [[Vĩnh Yên]] -- ["Hanoi"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hoàn Kiếm district]] -- ["Haiphong"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hồng Bàng district]] -- [[North Central Coast]] region ["Hà Tĩnh, เวียดนาม"] = {}, -- capital [[Hà Tĩnh]] ["Nghệ An, เวียดนาม"] = {}, -- capital [[Vinh]] ["Quảng Bình, เวียดนาม"] = {}, -- capital [[Đồng Hới]] ["Quảng Trị, เวียดนาม"] = {}, -- capital [[Đông Hà]] ["Thanh Hoá, เวียดนาม"] = {}, -- capital [[Thanh Hoá]] ["Thanh Hóa, เวียดนาม"] = {alias_of = "Thanh Hoá, เวียดนาม", display = true}, -- ["Hue"] = {placetype = {"เทศบาล", "นคร"}, wp = "Huế"}, -- capital [[Thuận Hoá district]] -- [[Central Highlands (Vietnam)|Central Highlands]] region ["Đắk Lắk, เวียดนาม"] = {}, -- capital [[Buôn Ma Thuột]] ["Đăk Nông, เวียดนาม"] = {}, -- capital [[Gia Nghĩa]] ["Gia Lai, 
เวียดนาม"] = {}, -- capital [[Pleiku]] ["Kon Tum, เวียดนาม"] = {}, -- capital [[Kon Tum]] ["Lâm Đồng, เวียดนาม"] = {}, -- capital [[Đà Lạt]] -- [[South Central Coast]] region ["Bình Định, เวียดนาม"] = {}, -- capital [[Quy Nhon]] ["Bình Thuận, เวียดนาม"] = {}, -- capital [[Phan Thiết]] ["Khánh Hoà, เวียดนาม"] = {}, -- capital [[Nha Trang]] ["Khánh Hòa, เวียดนาม"] = {alias_of = "Khánh Hoà, เวียดนาม", display = true}, ["Ninh Thuận, เวียดนาม"] = {}, -- capital [[Phan Rang–Tháp Chàm]] ["Phú Yên, เวียดนาม"] = {}, -- capital [[Tuy Hoà]] ["Quảng Nam, เวียดนาม"] = {}, -- capital [[Tam Kỳ]] ["Quảng Ngãi, เวียดนาม"] = {}, -- capital [[Quảng Ngãi]] -- ["Da Nang"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[Hải Châu district]] -- [[Southeast (Vietnam)|Southeast]] region ["Bà Rịa–Vũng Tàu, เวียดนาม"] = {}, -- capital [[Bà Rịa]] ["Bình Dương, เวียดนาม"] = {}, -- capital [[Thủ Dầu Một]] ["Bình Phước, เวียดนาม"] = {}, -- capital [[Đồng Xoài]] ["Đồng Nai, เวียดนาม"] = {}, -- capital [[Biên Hoà]] ["Tây Ninh, เวียดนาม"] = {}, -- capital [[Tây Ninh]] -- ["Ho Chi Minh City"] = {placetype = {"เทศบาล", "นคร"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']] -- [[Mekong Delta]] region ["An Giang, เวียดนาม"] = {}, -- capital [[Long Xuyên]] ["Bạc Liêu, เวียดนาม"] = {}, -- capital [[Bạc Liêu]] ["Bến Tre, เวียดนาม"] = {}, -- capital [[Bến Tre]] ["Cà Mau, เวียดนาม"] = {}, -- capital [[Cà Mau]] ["Đồng Tháp, เวียดนาม"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]] ["Hậu Giang, เวียดนาม"] = {}, -- capital [[Vị Thanh]] ["Kiên Giang, เวียดนาม"] = {}, -- capital [[Rạch Giá]] ["Long An, เวียดนาม"] = {}, -- capital [[Tân An]] ["Sóc Trăng, เวียดนาม"] = {}, -- capital [[Sóc Trăng]] ["Tiền Giang, เวียดนาม"] = {}, -- capital [[Mỹ Tho]] ["Trà Vinh, เวียดนาม"] = {}, -- capital [[Trà Vinh]] ["Vĩnh Long, เวียดนาม"] = {}, -- capital [[Vĩnh Long]] -- ["Can Tho"] = {placetype = {"เทศบาล", "นคร"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]] } -- provinces of Vietnam export.vietnam_group = { 
key_to_placename = make_key_to_placename(", เวียดนาม$"), placename_to_key = make_placename_to_key(", เวียดนาม"), default_container = "เวียดนาม", default_placetype = "จังหวัด", -- There may not be enough districts to subcategorize like this. -- default_divs = "อำเภอ", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "จังหวัด%e", data = export.vietnam_provinces, } ----------------------------------------------------------------------------------- -- City data -- ----------------------------------------------------------------------------------- export.australia_cities = { ["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration) ["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate]) ["Canberra"] = {container = {key = "Australian Capital Territory, ออสเตรเลีย", placetype = "ดินแดน"}}, -- 510,641 (2024 estimate) ["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration) ["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate) ["Newcastle"] = {alias_of = "Newcastle, New South Wales"}, ["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration) ["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration) } export.australia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", ออสเตรเลีย", "รัฐ"), default_placetype = "นคร", data = export.australia_cities, } export.brazil_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. 
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos) ["Sao Paulo"] = {alias_of = "São Paulo", display = true}, ["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area) ["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000 ["Recife"] = {container = "Pernambuco"}, -- 4,100,000 ["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area) ["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000 ["Brasilia"] = {alias_of = "Brasília", display = true}, ["Fortaleza"] = {container = "Ceará"}, -- 3,825,000 ["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000 ["Curitiba"] = {container = "Paraná"}, -- 3,375,000 ["Campinas"] = {container = "São Paulo"}, -- 3,250,000 ["Goiânia"] = {container = "Goiás"}, -- 2,525,000 ["Goiania"] = {alias_of = "Goiânia", display = true}, ["Manaus"] = {container = "Amazonas"}, -- 2,275,000 ["Belém"] = {container = "Pará"}, -- 2,200,000 ["Belem"] = {alias_of = "Belém", display = true}, ["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000 ["Vitoria"] = {alias_of = "Vitória", display = true}, ["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000 ["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000 ["Sao Luis"] = {alias_of = "São Luís", display = true}, ["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000 ["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000 ["Florianopolis"] = {alias_of = "Florianópolis", display = true}, ["Maceió"] = {container = "Alagoas"}, -- 1,220,000 ["Maceio"] = {alias_of = "Maceió", display = true}, ["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000 ["Joao Pessoa"] = {alias_of = "João Pessoa", display = true}, ["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000 ["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true}, 
["Londrina"] = {container = "Paraná"}, -- 1,050,000 ["Teresina"] = {container = "Piauí"}, -- 1,040,000 } export.brazil_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", บราซิล", "รัฐ"), default_placetype = "นคร", data = export.brazil_cities, } export.canada_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton) ["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area) ["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area) ["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area) ["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area) ["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area) ["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census) ["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census) ["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census) ["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census) } export.canada_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Canada", "จังหวัด"), default_placetype = "นคร", data = export.canada_cities, } export.france_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration) ["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration) ["Lyons"] = {alias_of = "Lyon", display = true}, ["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration) ["Marseilles"] = {alias_of = "Marseille", display = true}, ["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration) ["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration) ["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration) ["Nice"] = {container = "Provence-Alpes-Côte d'Azur"}, ["Nantes"] = {container = "Pays de la Loire"}, ["Strasbourg"] = {container = "Grand Est"}, ["Rennes"] = {container = "Brittany"}, } export.france_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", ฝรั่งเศส", "ภูมิภาค"), default_placetype = "นคร", data = export.france_cities, } export.germany_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area) ["Cologne"] = {container = "North Rhine-Westphalia"}, ["Köln"] = {alias_of = "Cologne", display = true}, ["Düsseldorf"] = {container = "North Rhine-Westphalia"}, ["Dusseldorf"] = {alias_of = "Düsseldorf", display = true}, ["Dortmund"] = {container = "North Rhine-Westphalia"}, ["Essen"] = {container = "North Rhine-Westphalia"}, ["Duisburg"] = {container = "North Rhine-Westphalia"}, ["Berlin"] = {}, -- 4,700,000 ["Frankfurt"] = {container = "Hesse"}, -- 3,225,000 ["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer ["Hamburg"] = {}, -- 2,900,000 ["Munich"] = {container = "Bavaria"}, -- 2,300,000 ["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000 ["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000 ["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000 ["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000 ["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000 ["Leipzig"] = {container = "Saxony"}, -- 1,080,000 ["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000 ["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias ["Bremen"] = {}, } export.germany_cities_group = { default_container = "เยอรมนี", canonicalize_key_container = make_canonicalize_key_container(", เยอรมนี", "รัฐ"), default_placetype = "นคร", data = export.germany_cities, } export.india_cities = { -- This lists the 65 metro areas per Demographia's 2023 estimates, as found in -- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was -- conducted in 2011, and the results are not accurate any more. 
["Delhi"] = {container = {key = "Delhi, อินเดีย", placetype = "union territory"}}, -- 31,190,000 ["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000 ["Kolkata"] = {container = "West Bengal"}, -- 21,747,000 ["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000 ["Bengaluru"] = {alias_of = "Bangalore"}, ["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000 ["Hyderabad"] = {container = "Telangana"}, -- 9,797,000 ["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000 ["Pune"] = {container = "Maharashtra"}, -- 6,819,000 ["Surat"] = {container = "Gujarat"}, -- 6,601,000 ["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000 ["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000 ["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000 ["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000 ["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000 ["Patna"] = {container = "Bihar"}, -- 3,331,000 ["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000 ["Kozhikode"] = {container = "Kerala"}, -- 3,049,000 ["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000 ["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000 ["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000 ["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000 ["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000 ["Prayagraj"] = {alias_of = "Allahabad"}, ["Kochi"] = {container = "Kerala"}, -- 2,381,000 ["Ludhiana"] = {container = "Punjab"}, -- 2,205,000 ["Vadodara"] = {container = "Gujarat"}, -- 2,182,000 ["Chandigarh"] = {container = {key = "Chandigarh, อินเดีย", placetype = "union territory"}}, -- 2,168,000 ["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000 ["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000 ["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000 ["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000 ["Malappuram"] = {container = "Kerala"}, -- 1,868,000 ["Nashik"] = {container = "Maharashtra"}, -- 1,810,000 
["Asansol"] = {container = "West Bengal"}, -- 1,720,000 ["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000 ["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000 ["Thrissur"] = {container = "Kerala"}, -- 1,578,000 ["Kollam"] = {container = "Kerala"}, -- 1,576,000 ["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000 ["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000 ["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000 ["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000 ["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"}, ["Rajkot"] = {container = "Gujarat"}, -- 1,487,000 ["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000 ["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000 ["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000 ["Kannur"] = {container = "Kerala"}, -- 1,360,000 ["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000 ["Guwahati"] = {container = "Assam"}, -- 1,355,000 ["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000 ["Amritsar"] = {container = "Punjab"}, -- 1,313,000 ["Mysore"] = {container = "Karnataka"}, -- 1,296,000 ["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000 ["Durg-Bhilainagar"] = {alias_of = "Bhilai"}, ["Durg-Bhilai"] = {alias_of = "Bhilai"}, ["Durg"] = {alias_of = "Bhilai"}, ["Bhilainagar"] = {alias_of = "Bhilai"}, ["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000 ["Srinagar"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, -- 1,212,000 ["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000 ["Kota"] = {container = "Rajasthan"}, -- 1,172,000 ["Jalandhar"] = {container = "Punjab"}, -- 1,165,000 ["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000 ["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000 ["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000 ["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000 ["Jammu"] = {container = {key = "Jammu and Kashmir, อินเดีย", placetype = "union territory"}}, 
-- 1,103,000 ["Solapur"] = {container = "Maharashtra"}, -- 1,082,000 ["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash ["Hubli"] = {alias_of = "Hubli-Dharwad"}, ["Dharwad"] = {alias_of = "Hubli-Dharwad"}, ["Puducherry"] = {container = {key = "Puducherry, อินเดีย", placetype = "union territory"}}, -- 1,024,000 ["Pondicherry"] = {alias_of = "Puducherry", display = true}, -- satellite/secondary cities of metro area (none in citypopulation.de) ["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area ["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area ["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true}, ["Kalyan"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivli"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivali"] = {alias_of = "Kalyan-Dombivli"}, ["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area ["Vasai"] = {alias_of = "Vasai-Virar"}, ["Virar"] = {alias_of = "Vasai-Virar"}, ["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area ["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area ["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area ["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true}, } export.india_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", อินเดีย", "รัฐ"), default_placetype = "นคร", data = export.india_cities, } 
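-- Illustrative sketch only (not part of the upstream data): one way code that consumes the city tables above
-- might resolve an `alias_of` entry to its canonical key. The names `resolve_alias_sketch` and
-- `sample_cities_sketch` are hypothetical, and the real lookup logic lives outside this data and may differ.
-- The sketch assumes that `display = true` means "show the canonical name instead of the alias" and that a
-- string `display` means "show that string", which is what the comments in these tables suggest.
local function resolve_alias_sketch(data, placename)
	local spec = data[placename]
	if not spec then
		-- Unknown placename: nothing to resolve.
		return nil
	end
	if not spec.alias_of then
		-- Already a canonical key; it is displayed unchanged.
		return placename, placename, spec
	end
	local canonical = spec.alias_of
	local shown = placename
	if spec.display == true then
		shown = canonical
	elseif type(spec.display) == "string" then
		shown = spec.display
	end
	return canonical, shown, data[canonical]
end

-- Usage on a small hypothetical table shaped like the city tables above.
local sample_cities_sketch = {
	["Puducherry"] = {container = "Puducherry"},
	["Pondicherry"] = {alias_of = "Puducherry", display = true},
	["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"},
	["Bengaluru"] = {alias_of = "Bangalore"},
}
-- "Pondicherry" is a display alias, so both the key and the shown name become "Puducherry".
-- "Bengaluru" is a plain alias, so the key becomes "Bangalore" but the shown name stays "Bengaluru".
local sketch_key, sketch_shown = resolve_alias_sketch(sample_cities_sketch, "Pondicherry")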
export.indonesia_cities = { -- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate ["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = { {type = "ตำบล", container_parent_type = false}, }}, ["Surabaya"] = {container = "East Java"}, ["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area ["Bandung"] = {container = "West Java"}, ["Medan"] = {container = "North Sumatra"}, ["Depok"] = {container = "West Java"}, -- part of Jakarta metro area ["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Palembang"] = {container = "South Sumatra"}, ["Semarang"] = {container = "Central Java"}, ["Makassar"] = {container = "South Sulawesi"}, ["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Batam"] = {container = "Riau Islands"}, ["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area ["Pekanbaru"] = {container = "Riau"}, ["Bandar Lampung"] = {container = "Lampung"}, -- other metro areas over 1,000,000 people ["Padang"] = {container = "West Sumatra"}, ["Samarinda"] = {container = "East Kalimantan"}, ["Malang"] = {container = "East Java"}, ["Yogyakarta"] = {container = "Special Region of Yogyakarta"}, ["Denpasar"] = {container = "Bali"}, ["Cirebon"] = {container = "West Java"}, ["Surakarta"] = {container = "Central Java"}, ["Banjarmasin"] = {container = "South Kalimantan"}, ["Tasikmalaya"] = {container = "West Java"}, } export.indonesia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", อินโดนีเซีย", "จังหวัด"), default_placetype = "นคร", data = export.indonesia_cities, } export.italy_cities = { -- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used -- here, only gives estimates as of Jan 1, 2014. 
["Milan"] = {container = "Lombardy"}, -- 6,623,798 ["Naples"] = {container = "Campania"}, -- 5,294,546 ["Rome"] = {container = "Lazio"}, -- 4,447,881 ["Turin"] = {container = "Piedmont"}, -- 1,865,284 ["Venice"] = {container = "Veneto"}, -- 1,645,900 ["Florence"] = {container = "Tuscany"}, -- 1,485,030 ["Bari"] = {container = "Apulia"}, -- 1,257,459 ["Palermo"] = {container = "Sicily"}, -- 1,183,084 -- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition). ["Catania"] = {container = "Sicily"}, -- 988,240 ["Brescia"] = {container = "Lombardy"}, -- 924,090 ["Genoa"] = {container = "Liguria"}, -- 861,318 } export.italy_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Italy", "ภูมิภาค"), default_placetype = "นคร", data = export.italy_cities, } export.japan_cities = { -- Population figures from [[w:List of cities in Japan]]. Metro areas from -- [[w:List of metropolitan areas in Japan]]. ["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])", placetype = {"นคร", "จังหวัด"}, divs = { {type = "special wards", container_parent_type = false}, {type = "นคร", prep = "ใน"}, }, }, ["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894 ["Osaka"] = {container = "Osaka"}, -- 2,668,586 ["Nagoya"] = {container = "Aichi"}, -- 2,283,289 -- FIXME, Hokkaido is handled specially. 
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096 ["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527 ["Kobe"] = {container = "Hyōgo"}, -- 1,530,847 ["Kyoto"] = {container = "Kyoto"}, -- 1,474,570 ["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630 ["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418 ["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806 ["Sendai"] = {container = "Miyagi"}, -- 1,029,552 -- the remaining cities are considered "central cities" in a 1,000,000+ metro area -- (sometimes there is more than one central city in the area). ["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998 ["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695 ["Sakai"] = {container = "Osaka"}, -- 835,333 ["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053 ["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431 ["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944 ["Sagamihara"] = {container = "Kanagawa"}, -- 706,342 ["Okayama"] = {container = "Okayama"}, -- 701,293 ["Kumamoto"] = {container = "Kumamoto"}, -- 670,348 ["Kagoshima"] = {container = "Kagoshima"}, -- 605,196 -- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka) -- with population in the range 509k - 587k because not central cities in any -- 1,000,000+ metro area. 
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833 } export.japan_cities_group = { default_container = "ญี่ปุ่น", canonicalize_key_container = make_canonicalize_key_container(", ญี่ปุ่น", "จังหวัด"), default_placetype = "นคร", data = export.japan_cities, } export.mexico_cities = { ["Mexico City"] = {}, -- its own state ["Monterrey"] = {container = "Nuevo León"}, ["Guadalajara"] = {container = "Jalisco"}, ["Puebla"] = {container = "Puebla", wp = "%l (city)"}, ["Toluca"] = {container = "State of Mexico"}, ["Tijuana"] = {container = "Baja California"}, -- Include the state in the category for León due to possible confusion with León, Spain. ["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"}, ["León"] = {alias_of = "León, Guanajuato"}, ["Leon"] = {alias_of = "León, Guanajuato", display = true}, ["Querétaro"] = {container = "Querétaro", wp = "%l (city)"}, ["Queretaro"] = {alias_of = "Querétaro", display = true}, ["Ciudad Juárez"] = {container = "Chihuahua"}, ["Juárez"] = {alias_of = "Ciudad Juárez"}, ["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"}, ["Torreón"] = {container = "Coahuila"}, ["Torreon"] = {alias_of = "Torreón", display = true}, -- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or -- Mérida, Venezuela. 
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"}, ["Mérida"] = {alias_of = "Mérida, Yucatán"}, ["Merida"] = {alias_of = "Mérida, Yucatán", display = true}, ["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"}, ["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true}, ["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"}, ["Mexicali"] = {container = "Baja California"}, } export.mexico_cities_group = { default_container = "Mexico", canonicalize_key_container = make_canonicalize_key_container(", Mexico", "รัฐ"), default_placetype = "นคร", data = export.mexico_cities, } export.nigeria_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. ["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability) ["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability) ["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability) ["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "federal territory"}}, -- 3,050,000 (unindicated; population of low reliability) ["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability) ["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability) ["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability) ["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability) ["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability) ["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability) ["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability) ["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; 
population of low reliability) ["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability) ["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability) ["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability) } export.nigeria_cities_group = { default_container = "Nigeria", canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "รัฐ"), default_placetype = "นคร", data = export.nigeria_cities, } export.pakistan_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. ["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area) ["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area) ["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad) ["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "federal territory"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi) ["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area) ["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area) -- there is also Hyderabad in India (very confusing) ["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area) ["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"}, ["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area) ["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area) ["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area) ["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area) ["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area) } export.pakistan_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "จังหวัด"), default_placetype = "นคร", data = 
export.pakistan_cities, } export.philippines_cities = { -- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts. -- Other cities outside Metro Manila skipped as not central city in their urban area. ["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, -- Don't display-canonicalize Foo to Foo City as it may make the display weird. ["Quezon"] = {alias_of = "Quezon City"}, ["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, ["Davao City"] = {container = "Davao del Sur"}, ["Davao"] = {alias_of = "Davao City"}, ["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "ภูมิภาค"}}, ["Zamboanga City"] = {container = "Zamboanga del Sur"}, ["Zamboanga"] = {alias_of = "Zamboanga City"}, ["Cebu City"] = {container = "Cebu"}, ["Cebu"] = {alias_of = "Cebu City"}, ["Antipolo"] = {container = "Rizal"}, ["Cagayan de Oro"] = {container = "Misamis Oriental"}, ["Dasmariñas"] = {container = "Cavite"}, ["Dasmarinas"] = {alias_of = "Dasmariñas", display = true}, ["General Santos"] = {container = "South Cotabato"}, ["San Jose del Monte"] = {container = "Bulacan"}, ["Bacolod"] = {container = "Negros Occidental"}, ["Calamba"] = {container = "Laguna", wp = "%l, %c"}, ["Angeles"] = {container = "Pampanga", wp = "Angeles City"}, ["Angeles City"] = {alias_of = "Angeles"}, ["Iloilo City"] = {container = "Iloilo"}, ["Iloilo"] = {alias_of = "Iloilo City"}, } export.philippines_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Philippines", "จังหวัด"), default_placetype = "นคร", data = export.philippines_cities, } export.russia_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. 
["Moscow"] = {}, -- 18,800,000 (Agglomeration) ["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration) ["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration) ["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration) ["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration) ["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration) ["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration) ["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration) ["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true}, ["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration) ["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration) ["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration) ["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration) ["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration) ["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration) ["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration) ["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration) ["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration) } export.russia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"), default_container = "Russia", default_placetype = "นคร", data = export.russia_cities, } export.saudi_arabia_cities = { -- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are -- metro, urban or city proper figures. 
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jedda"] = {alias_of = "Jeddah", display = true}, ["Jiddah"] = {alias_of = "Jeddah", display = true}, ["Jidda"] = {alias_of = "Jeddah", display = true}, ["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Makkah"] = {alias_of = "Mecca", display = true}, ["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City) ["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true}, } export.saudi_arabia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "จังหวัด"), default_placetype = "นคร", data = export.saudi_arabia_cities, } export.south_korea_cities = { -- All cities listed are not associated with any county. 
["Seoul"] = {}, ["Busan"] = {}, ["Incheon"] = {}, ["Daegu"] = {}, ["Daejeon"] = {}, ["Gwangju"] = {}, ["Ulsan"] = {}, } export.south_korea_cities_group = { default_container = "South Korea", canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "จังหวัด"), default_placetype = "นคร", data = export.south_korea_cities, } export.spain_cities = { ["Madrid"] = {container = "Community of Madrid"}, ["Barcelona"] = {container = "Catalonia"}, ["Valencia"] = {container = "Valencia"}, ["Seville"] = {container = "Andalusia"}, ["Bilbao"] = {container = "Basque Country"}, } export.spain_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"), default_placetype = "นคร", data = export.spain_cities, } export.taiwan_cities = { ["New Taipei City"] = {}, ["New Taipei"] = {alias_of = "New Taipei City", display = true}, ["Taichung"] = {}, ["Kaohsiung"] = {wp = "%l, ไต้หวัน"}, ["Taipei"] = {}, ["Taoyuan"] = {}, ["Tainan"] = {}, -- these last three are not special municipalities ["Chiayi"] = {placetype = "นคร"}, ["Hsinchu"] = {placetype = "นคร"}, ["Keelung"] = {placetype = "นคร"}, } export.taiwan_cities_group = { placename_to_key = false, -- don't add ", ไต้หวัน" to make the key canonicalize_key_container = make_canonicalize_key_container(", ไต้หวัน", "เทศมณฑล"), default_container = "ไต้หวัน", default_placetype = {"special municipality", "เทศบาล", "นคร"}, default_is_city = true, default_divs = {"อำเภอ"}, data = export.taiwan_cities, } -- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct, -- everything else will be figured out. 
export.united_kingdom_cities = { ["London"] = {container = "Greater London"}, ["Manchester"] = {container = "Greater Manchester"}, ["Birmingham"] = {container = "West Midlands"}, ["Liverpool"] = {container = "Merseyside"}, ["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}}, ["Leeds"] = {container = "West Yorkshire"}, ["Newcastle upon Tyne"] = {container = "Tyne and Wear"}, ["Newcastle"] = {alias_of = "Newcastle upon Tyne"}, ["Bristol"] = {container = {key = "England", placetype = "constituent country"}}, ["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Portsmouth"] = {container = "Hampshire"}, ["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}}, -- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]] ["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"}, } export.united_kingdom_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", England", "เทศมณฑล"), default_placetype = "นคร", data = export.united_kingdom_cities, } export.united_states_cities = { -- top 50 CSA's by population, with the top and sometimes 2nd or 3rd city listed ["New York City"] = {container = "New York", wp = "%l", divs = { {type = "boroughs", container_parent_type = false}, }}, -- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York). 
["New York"] = {alias_of = "New York City"}, ["Newark"] = {container = "New Jersey"}, ["Los Angeles"] = {container = "California", wp = "%l"}, ["Long Beach"] = {container = "California"}, ["Riverside"] = {container = "California"}, ["Chicago"] = {container = "Illinois", wp = "%l"}, ["Washington, D.C."] = {wp = "%l"}, ["Washington, DC"] = {alias_of = "Washington, D.C.", display = true}, ["Washington D.C."] = {alias_of = "Washington, D.C.", display = true}, ["Washington DC"] = {alias_of = "Washington, D.C.", display = true}, -- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of -- Columbia holonym). ["Washington"] = {alias_of = "Washington, D.C."}, ["Baltimore"] = {container = "Maryland", wp = "%l"}, -- to avoid conflict with San Jose in Costa Rica ["San Jose, California"] = {container = "California"}, ["San Jose"] = {alias_of = "San Jose, California"}, ["San Francisco"] = {container = "California", wp = "%l"}, ["Oakland"] = {container = "California"}, ["Boston"] = {container = "Massachusetts", wp = "%l"}, ["Providence"] = {container = "Rhode Island"}, ["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Fort Worth"] = {container = "Texas"}, ["Philadelphia"] = {container = "Pennsylvania", wp = "%l"}, ["Houston"] = {container = "Texas", wp = "%l"}, ["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"}, ["Atlanta"] = {container = "Georgia", wp = "%l"}, ["Detroit"] = {container = "Michigan", wp = "%l"}, ["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"}, ["Mesa"] = {container = "Arizona"}, ["Seattle"] = {container = "Washington", wp = "%l"}, ["Orlando"] = {container = "Florida"}, ["Minneapolis"] = {container = "Minnesota", wp = "%l"}, ["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"}, ["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"}, 
["Portland"] = {container = "Oregon"}, ["Tampa"] = {container = "Florida"}, ["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"}, ["Saint Louis"] = {alias_of = "St. Louis", display = true}, ["Charlotte"] = {container = "North Carolina"}, ["Sacramento"] = {container = "California"}, ["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"}, ["Salt Lake City"] = {container = "Utah", wp = "%l"}, ["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Columbus"] = {container = "Ohio"}, ["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"}, ["Indianapolis"] = {container = "Indiana", wp = "%l"}, ["Las Vegas"] = {container = "Nevada", wp = "%l"}, ["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Austin"] = {container = "Texas"}, ["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"}, ["Raleigh"] = {container = "North Carolina"}, ["Nashville"] = {container = "Tennessee"}, ["Virginia Beach"] = {container = "Virginia"}, ["Norfolk"] = {container = "Virginia"}, ["Greensboro"] = {container = "North Carolina"}, ["Winston-Salem"] = {container = "North Carolina"}, ["Jacksonville"] = {container = "Florida"}, ["New Orleans"] = {container = "Louisiana", wp = "%l"}, ["Louisville"] = {container = "Kentucky"}, ["Greenville"] = {container = "South Carolina"}, ["Hartford"] = {container = "Connecticut"}, ["Oklahoma City"] = {container = "Oklahoma", wp = "%l"}, ["Grand Rapids"] = {container = "Michigan"}, ["Memphis"] = {container = "Tennessee"}, ["Birmingham, Alabama"] = {container = "Alabama"}, ["Birmingham"] = {alias_of = "Birmingham, Alabama"}, ["Fresno"] = {container = "California"}, ["Richmond"] = {container = "Virginia"}, ["Harrisburg"] = {container = "Pennsylvania"}, -- any major city of top 50 MSA's that's missed by previous ["Buffalo"] = {container = "New York"}, -- any of the top 50 city by city population that's missed by previous ["El Paso"] = 
	{container = "Texas"},
	["Albuquerque"] = {container = "New Mexico"},
	["Tucson"] = {container = "Arizona"},
	["Colorado Springs"] = {container = "Colorado"},
	["Omaha"] = {container = "Nebraska"},
	["Tulsa"] = {container = "Oklahoma"},
	-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}

export.united_states_cities_group = {
	default_container = "สหรัฐอเมริกา",
	canonicalize_key_container = make_canonicalize_key_container(", USA", "รัฐ"),
	default_placetype = "นคร",
	default_wp = "%l, %c",
	data = export.united_states_cities,
}

export.new_york_boroughs = {
	["Bronx"] = {the = true, wp = "The Bronx"},
	["Brooklyn"] = {},
	["Manhattan"] = {},
	["Queens"] = {},
	["Staten Island"] = {},
}

export.new_york_boroughs_group = {
	default_container = {key = "New York City", placetype = "นคร"},
	default_placetype = "borough",
	default_is_city = true,
	data = export.new_york_boroughs,
}

export.vietnam_cities = {
	-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
	["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
	["Saigon"] = {alias_of = "Ho Chi Minh City"},
	["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
	["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
	["Danang"] = {alias_of = "Da Nang", display = true},
	["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
	["Hai Phong"] = {alias_of = "Haiphong", display = true},
	-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
	-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "นคร", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia) ["Biên Hòa"] = {alias_of = "Bien Hoa", display = true}, ["Biên Hoà"] = {alias_of = "Bien Hoa", display = true}, -- These two not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are -- both province-level municipalities and close to the 1,000,000 mark. ["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]] ["Cần Thơ"] = {alias_of = "Can Tho", display = true}, ["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); -- capital [[Thuận Hóa district]] ["Huế"] = {alias_of = "Hue", display = true}, } export.vietnam_cities_group = { placename_to_key = false, -- don't add ", เวียดนาม" to make the key default_container = "เวียดนาม", canonicalize_key_container = make_canonicalize_key_container(", เวียดนาม", "จังหวัด"), -- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of -- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct -- known locations. default_placetype = {"เทศบาล", "นคร"}, default_is_city = true, -- There may not be enough districts to subcategorize like this. -- default_divs = "อำเภอ", data = export.vietnam_cities, } export.misc_cities = { ------------------ Africa ------------------- -- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from -- [[w:List of urban areas in Africa by population]]. 
["Algiers"] = {container = "แอลจีเรีย"}, -- 4,325,000 (Consolidated Urban Area) ["Oran"] = {container = "แอลจีเรีย"}, -- 1,640,000 (Consolidated Urban Area) ["Luanda"] = {container = "แองโกลา"}, -- 9,650,000 (Urban Area) ["Benguela"] = {container = "แองโกลา"}, -- 1,420,000 (Urban Area) ["Cotonou"] = {container = "เบนิน"}, -- 2,150,000 (Agglomeration) ["Ouagadougou"] = {container = "บูร์กินาฟาโซ"}, -- 3,425,000 (Agglomeration) ["Bobo-Dioulasso"] = {container = "บูร์กินาฟาโซ"}, -- 1,100,000 (Agglomeration) ["Bujumbura"] = {container = "บุรุนดี"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia) ["Yaoundé"] = {container = "แคเมอรูน"}, -- 3,975,000 (City) ["Yaounde"] = {alias_of = "Yaoundé", display = true}, ["Douala"] = {container = "แคเมอรูน"}, -- 3,900,000 (City) ["Bangui"] = {container = "สาธารณรัฐแอฟริกากลาง"}, -- 1,680,000 (Agglomeration) ["N'Djamena"] = {container = "ชาด"}, -- 1,950,000 (City) ["Ndjamena"] = {alias_of = "N'Djamena", display = true}, ["Kinshasa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 16,300,000 (City; population of low reliability) ["Lubumbashi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,875,000 (City; population of low reliability) ["Mbuji-Mayi"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 2,500,000 (City; population of low reliability) ["Kananga"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,370,000 (City; population of low reliability) ["Kisangani"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,300,000 (City; population of low reliability) ["Bukavu"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,100,000 (City; population of low reliability) ["Goma"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,010,000 (City; population of low reliability) ["Tshikapa"] = {container = "สาธารณรัฐประชาธิปไตยคองโก"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de) ["Cairo"] = {container = "อียิปต์"}, -- 22,800,000 
(Agglomeration, including Giza and Shubra El Kheima)
	["Alexandria"] = {container = "อียิปต์"}, -- 6,250,000 (Agglomeration)
	["Giza"] = {container = "อียิปต์"}, -- 4,458,135 (2023 from citypopulation.de)
	["Shubra El Kheima"] = {container = "อียิปต์"}, -- 1,240,239 (2021 from citypopulation.de)
	["Asmara"] = {container = "เอริเทรีย"}, -- 1,090,000 (City; population of low reliability)
	["Asmera"] = {alias_of = "Asmara", display = true},
	["Addis Ababa"] = {container = "เอธิโอเปีย"}, -- 4,825,000 (Agglomeration)
	["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
	["Accra"] = {container = "กานา"}, -- 6,800,000 (Agglomeration)
	["Kumasi"] = {container = "กานา"}, -- 2,900,000 (Agglomeration)
	["Conakry"] = {container = "กินี"}, -- 2,975,000 (Consolidated Urban Area)
	["Abidjan"] = {container = "โกตดิวัวร์"}, -- 7,050,000 (Agglomeration)
	["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
	["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
	["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
	["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
	["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
	["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
	["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
	["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
	["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "ภูมิภาค"}}, -- 4,450,000 (Municipality (urban population))
	["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "ภูมิภาค"}}, -- 2,125,000 (Municipality (urban population))
	["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "ภูมิภาค"}}, -- 1,410,000 (Municipality (urban population))
	["Tanger"] = {alias_of = "Tangier", display = true},
	["Tangiers"] = {alias_of = "Tangier", display = true},
	["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "ภูมิภาค"}, wp = "%l,
Morocco"}, -- 1,310,000 (Municipality (urban population)) ["Fes"] = {alias_of = "Fez", display = true}, ["Fès"] = {alias_of = "Fez", display = true}, ["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "ภูมิภาค"}}, -- 1,270,000 (Municipality (urban population)) ["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "ภูมิภาค"}}, -- 1,140,000 (Municipality (urban population)) ["Marrakech"] = {alias_of = "Marrakesh", display = true}, ["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration) ["Niamey"] = {container = "Niger"}, -- 1,530,000 (City) ["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration) ["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City) ["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population)) ["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration) ["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration) ["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration) ["Mogadishu"] = {container = "โซมาเลีย"}, -- 2,250,000 (unindicated; population of low reliability) ["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.) 
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "จังหวัด"}}, -- 5,100,000 (Consolidated Urban Area) ["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "จังหวัด"}}, -- 3,900,000 (Consolidated Urban Area) ["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "จังหวัด"}}, -- 2,921,488 (2011 census) ["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "จังหวัด"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area) ["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias ["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability) ["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration) ["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration) ["Mwanza City"] = {alias_of = "Mwanza", display = true}, ["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration) ["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration) ["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated) ["Lome"] = {alias_of = "Lomé", display = true}, ["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population)) ["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population)) ["Soussa"] = {alias_of = "Sousse", display = true}, ["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated) ["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area) ["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration) ------------------ Asia ------------------- -- sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. 
	["Kabul"] = {container = "อัฟกานิสถาน"}, -- 5,250,000 (Agglomeration)
	["Baku"] = {container = "อาเซอร์ไบจาน"}, -- 3,725,000 (Administrative Area (urban population))
	["Manama"] = {container = "บาห์เรน"}, -- 1,560,000 (unindicated)
	["Dhaka"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 23,100,000 (Agglomeration)
	["Dacca"] = {alias_of = "Dhaka", display = true},
	["Chittagong"] = {container = {key = "Chittagong Division, บังกลาเทศ", placetype = "division"}}, -- 5,050,000 (Agglomeration)
	["Gazipur"] = {container = {key = "Dhaka Division, บังกลาเทศ", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
	["Khulna"] = {container = {key = "Khulna Division, บังกลาเทศ", placetype = "division"}}, -- 1,210,000 (Agglomeration)
	["Phnom Penh"] = {container = "กัมพูชา"}, -- 2,925,000 (Agglomeration)
	["Tehran"] = {container = {key = "Tehran, อิหร่าน", placetype = "จังหวัด"}}, -- 16,800,000 (Agglomeration)
	["Teheran"] = {alias_of = "Tehran", display = true},
	["Mashhad"] = {container = {key = "Razavi Khorasan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,475,000 (Agglomeration)
	["Mashad"] = {alias_of = "Mashhad", display = true},
	["Meshhed"] = {alias_of = "Mashhad", display = true},
	["Meshed"] = {alias_of = "Mashhad", display = true},
	["Isfahan"] = {container = {key = "Isfahan, อิหร่าน", placetype = "จังหวัด"}}, -- 3,425,000 (Agglomeration)
	["Esfahan"] = {alias_of = "Isfahan", display = true},
	["Tabriz"] = {container = {key = "East Azerbaijan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,970,000 (Agglomeration)
	["Shiraz"] = {container = {key = "Fars, อิหร่าน", placetype = "จังหวัด"}}, -- 1,950,000 (Agglomeration)
	["Ahvaz"] = {container = {key = "Khuzestan, อิหร่าน", placetype = "จังหวัด"}}, -- 1,550,000 (Agglomeration)
	["Qom"] = {container = {key = "Qom, อิหร่าน", placetype = "จังหวัด"}}, -- 1,450,000 (City)
	["Kermanshah"] = {container = {key = "Kermanshah, อิหร่าน", placetype = "จังหวัด"}}, --
1,130,000 (City) ["Baghdad"] = {container = "อิรัก"}, -- 7,800,000 (Administrative Area (urban population)) ["Basra"] = {container = "อิรัก"}, -- 1,710,000 (Administrative Area (urban population)) ["Mosul"] = {container = "อิรัก"}, -- 1,550,000 (Administrative Area (urban population)) ["Erbil"] = {container = "อิรัก"}, -- 1,220,000 (Administrative Area (urban population)) ["Kirkuk"] = {container = "อิรัก"}, -- 1,160,000 (Administrative Area (urban population)) ["Najaf"] = {container = "อิรัก"}, -- 1,050,000 (Administrative Area (urban population)) ["Tel Aviv"] = {container = "อิสราเอล"}, -- 3,000,000 (Agglomeration) -- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a -- [[w:corpus separatum]], so put the container as "เอเชีย" and list Israel and Palestine as additional parents for -- categorization purposes. ["Jerusalem"] = {container = {key = "เอเชีย", placetype = "ทวีป"}, addl_parents = {"อิสราเอล", "Palestine"}}, -- 1,080,000 (Agglomeration) ["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated) ["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated) ["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration) ["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize ["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration) ["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration) ["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration) ["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration) ["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability) -- Kuala Lumpur is a federal capital city, not in any state ["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration) -- there are various George Towns and Georgetowns ["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "รัฐ"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration) 
["George Town"] = {alias_of = "George Town, Malaysia"}, ["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City) ["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true}, ["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population)) ["Rangoon"] = {alias_of = "Yangon", display = true}, ["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population)) ["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration) -- Pyongyang is a directly governed city, not in any province ["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population)) ["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration) ["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated) ["Gaza City"] = {alias_of = "Gaza"}, ["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration) ["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated) ["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability) ["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability) ["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City) ["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration) -- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia -- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]] ["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "จังหวัด"}}, ["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "จังหวัด"}}, -- 1,570,000 (Agglomeration; including Pattaya) -- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021; -- second source is citypopulation.de reference date 2025-01-01. 
["Istanbul"] = {placetype = {"นคร", "จังหวัด"}, divs = {"อำเภอ"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration) ["İstanbul"] = {alias_of = "Istanbul", display = true}, ["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "จังหวัด"}}, -- 5.15 million; 5,200,000 (Agglomeration) ["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "จังหวัด"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration) ["İzmir"] = {alias_of = "Izmir", display = true}, ["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "จังหวัด"}}, -- 2.02 million; 2,200,000 (Agglomeration) ["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "จังหวัด"}}, -- 1.77 million; 1,780,000 (Agglomeration) ["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "จังหวัด"}}, -- 1.71 million; 1,750,000 (Agglomeration) ["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "จังหวัด"}}, -- 1.3 million; 1,400,000 (Agglomeration) ["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "จังหวัด"}}, -- 1.35 million; 1,390,000 (Agglomeration) ["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "จังหวัด"}}, -- 1.07 million; 1,100,000 (Agglomeration) -- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not -- display-canonicalize to the Turkish form Diyarbakır. 
["Diyarbakir"] = {alias_of = "Diyarbakır"}, ["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "จังหวัด"}}, -- 1.03 million; 1,060,000 (Agglomeration) ["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration) ["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah) ["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City) ["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai) ["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated) ["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability) ["Sana'a"] = {alias_of = "Sanaa", display = true}, ["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia) ------------------ Europe or Europe-like (Caucasus etc.) --------------------- ["Yerevan"] = {container = "อาร์มีเนีย"}, -- 1,520,000 (Agglomeration) ["Vienna"] = {container = "ออสเตรีย"}, -- 2,375,000 (Agglomeration) ["Minsk"] = {container = "เบลารุส"}, -- 2,100,000 (unindicated) ["Brussels"] = {container = "เบลเยียม"}, -- 2,800,000 (Consolidated Urban Area) ["Antwerp"] = {container = "เบลเยียม"}, -- 1,270,000 (Consolidated Urban Area) ["Sofia"] = {container = "บัลแกเรีย"}, -- 1,260,000 (Agglomeration) ["Zagreb"] = {container = "โครเอเชีย"}, ["Prague"] = {container = "สาธารณรัฐเช็ก"}, -- 1,470,000 (Agglomeration) ["Brno"] = {container = "สาธารณรัฐเช็ก"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office) ["Olomouc"] = {container = "สาธารณรัฐเช็ก"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms) ["Copenhagen"] = {container = "เดนมาร์ก"}, -- 1,800,000 (Consolidated Urban Area) ["Helsinki"] = {container = {key = "Uusimaa, ฟินแลนด์", placetype = "ภูมิภาค"}}, -- 1,560,000 (Consolidated Urban Area) ["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 
(Agglomeration) ["Athens"] = {container = "กรีซ"}, ["Thessaloniki"] = {container = "กรีซ"}, ["Budapest"] = {container = "ฮังการี"}, -- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region" ["Dublin"] = {container = {key = "County Dublin, ไอร์แลนด์", placetype = "เทศมณฑล"}}, ["Riga"] = {container = "Latvia"}, ["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "จังหวัด"}}, ["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}}, ["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "จังหวัด"}}, -- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it. ["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "ภูมิภาค"}}, ["Oslo"] = {container = {key = "Oslo, Norway", placetype = "เทศมณฑล"}}, ["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}}, ["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Krakow" without accent. ["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"}, ["Kraków"] = {alias_of = "Krakow", display = true}, ["Cracow"] = {alias_of = "Krakow", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent. ["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}}, ["Gdansk"] = {alias_of = "Gdańsk", display = true}, ["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}}, ["Poznan"] = {alias_of = "Poznań", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Lodz" without accents. 
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"}, ["Łódź"] = {alias_of = "Lodz", display = true}, ["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}}, ["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}}, ["Oporto"] = {alias_of = "Porto", display = true}, ["Bucharest"] = {container = "Romania"}, ["Belgrade"] = {container = "Serbia"}, ["Stockholm"] = {container = "Sweden"}, ["Zurich"] = {container = "Switzerland"}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Zurich" without umlaut. --- Even Wikipedia uses the form without umlaut. ["Zürich"] = {alias_of = "Zurich", display = true}, ["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast -- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common. ["Kiev"] = {alias_of = "Kyiv"}, ["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}}, ["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"}, -- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement. ["Odesa"] = {alias_of = "Odessa"}, ------------------ North America, South America --------------------- -- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01); -- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data; -- Wikipedia city limits figures from [[w:List of largest cities in the Americas]]. 
["Buenos Aires"] = {container = "อาร์เจนตินา"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia) ["Córdoba, Argentina"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia) -- to avoid confusion with Córdoba in Spain ["Córdoba"] = {alias_of = "Córdoba, Argentina"}, ["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"}, ["Rosario"] = {container = "อาร์เจนตินา", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia) ["Mendoza"] = {container = "อาร์เจนตินา", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area) ["San Miguel de Tucumán"] = {container = "อาร์เจนตินา"}, -- 1,110,000 (Consolidated Urban Area) ["Tucumán"] = {alias_of = "San Miguel de Tucumán"}, ["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"}, ["Santa Cruz de la Sierra"] = {container = "โบลิเวีย"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia) ["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"}, ["La Paz"] = {container = "โบลิเวีย"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz) ["El Alto"] = {container = "โบลิเวีย"}, ["Cochabamba"] = {container = "โบลิเวีย"}, -- 1,280,000 (Consolidated Urban Area) ["Santiago"] = {container = "ชิลี"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? 
per Wikipedia) ["Valparaíso"] = {container = "ชิลี"}, -- 1,060,000 (Consolidated Urban Area) ["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area) ["Bogotá"] = {container = "โคลอมเบีย"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia) ["Bogota"] = {alias_of = "Bogotá", display = true}, ["Medellín"] = {container = "โคลอมเบีย"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia) ["Medellin"] = {alias_of = "Medellín", display = true}, ["Cali"] = {container = "โคลอมเบีย"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia) ["Barranquilla"] = {container = "โคลอมเบีย"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia) ["Bucaramanga"] = {container = "โคลอมเบีย"}, -- 1,380,000 (Agglomeration) ["Cartagena, Colombia"] = {container = "โคลอมเบีย", wp = "%l, %c"}, -- 1,250,000 (Agglomeration) -- to avoid confusion with Cartagena, Spain ["Cartagena"] = {alias_of = "Cartagena, Colombia"}, ["Cúcuta"] = {container = "โคลอมเบีย"}, -- 1,130,000 (Agglomeration) ["Cucuta"] = {alias_of = "Cúcuta", display = true}, -- to avoid conflict with San Jose, California ["San José, Costa Rica"] = {container = "คอสตาริกา", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia) ["San José"] = {alias_of = "San José, Costa Rica"}, ["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME ["Havana"] = {container = "คิวบา"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia) ["Santo Domingo"] = {container = "สาธารณรัฐโดมินิกัน"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia) ["Guayaquil"] = {container = "เอกวาดอร์"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia) ["Quito"] = {container = "เอกวาดอร์"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? 
per Wikipedia) ["San Salvador"] = {container = "เอลซัลวาดอร์"}, -- 1,580,000 (Municipality (urban population)) ["Guatemala City"] = {container = "กัวเตมาลา"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia) ["Port-au-Prince"] = {container = "เฮติ"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia) ["San Pedro Sula"] = {container = "ฮอนดูรัส"}, -- 1,330,000 (Consolidated Urban Area) ["Tegucigalpa"] = {container = "ฮอนดูรัส"}, -- 1,220,000 (Urban Area) ["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area) ["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area) ["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population)) ["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia) ["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration) ["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area) ["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia) ["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia) ["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? 
per Wikipedia)
	-- to avoid confusion with Valencia (city and autonomous community of Spain)
	["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
	["Valencia"] = {alias_of = "Valencia, Venezuela"},
	["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
	["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}

export.misc_cities_group = {
	canonicalize_key_container = make_canonicalize_key_container(nil, "ประเทศ"),
	default_placetype = "นคร",
	data = export.misc_cities,
}

--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the
hundreds).
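
As a rough illustration of how such an ordered list of groups might be consumed, the sketch below walks a miniature,
made-up list and resolves `alias_of` entries through each group's `data` table. The sample groups and the helper name
`find_location` are assumptions for this example only; the actual lookup code lives in [[Module:place/placetypes]] and
does considerably more (for instance, it checks containers).

```lua
-- Hypothetical miniature of the structure described above; not the real data.
local sample_locations = {
	{ -- a "countries"-style group
		default_placetype = "country",
		data = {
			["Vietnam"] = {},
		},
	},
	{ -- a "cities"-style group
		default_placetype = "city",
		data = {
			["Ho Chi Minh City"] = {},
			["Saigon"] = {alias_of = "Ho Chi Minh City"},
		},
	},
}

-- Return the canonical key, its spec and the owning group for `name`, or nil if no group lists it.
-- In this sketch the groups are searched in list order, so an earlier group wins on a name clash.
local function find_location(locations, name)
	for _, group in ipairs(locations) do
		local spec = group.data[name]
		if spec then
			local key = name
			if spec.alias_of then
				-- Follow the alias to the canonical entry in the same group.
				key = spec.alias_of
				spec = group.data[key]
			end
			return key, spec, group
		end
	end
	return nil
end

local key, _, group = find_location(sample_locations, "Saigon")
print(key, group.default_placetype) -- canonical key and the group's default placetype
```

Keeping aliases in the same `data` table as the canonical entries means a single table lookup finds either form.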
]==] export.locations = { export.continents_group, export.countries_group, export.country_like_entities_group, export.former_countries_group, export.australia_group, export.austria_group, export.bangladesh_group, export.brazil_group, export.canada_group, export.china_group, export.china_prefecture_level_cities_group, export.china_prefecture_level_cities_group_2, export.egypt_group, export.finland_group, export.france_group, export.france_departments_group, export.germany_group, export.greece_group, export.india_group, export.indonesia_group, export.iran_group, export.ireland_group, export.italy_group, export.japan_group, export.laos_group, export.lebanon_group, export.malaysia_group, export.malta_group, export.mexico_group, export.moldova_group, export.morocco_group, export.netherlands_group, export.new_zealand_group, export.nigeria_group, export.north_korea_group, export.norway_group, export.pakistan_group, export.philippines_group, export.poland_group, export.portugal_group, export.romania_group, export.russia_group, export.saudi_arabia_group, export.south_africa_group, export.south_korea_group, export.spain_group, export.taiwan_group, export.thailand_group, export.turkey_group, export.ukraine_group, export.united_kingdom_group, export.united_states_group, export.england_group, export.northern_ireland_group, export.scotland_group, export.wales_group, export.vietnam_group, export.australia_cities_group, export.brazil_cities_group, export.canada_cities_group, export.france_cities_group, export.germany_cities_group, export.india_cities_group, export.indonesia_cities_group, export.italy_cities_group, export.japan_cities_group, export.mexico_cities_group, export.nigeria_cities_group, export.pakistan_cities_group, export.philippines_cities_group, export.russia_cities_group, export.saudi_arabia_cities_group, export.south_korea_cities_group, export.spain_cities_group, export.taiwan_cities_group, export.united_kingdom_cities_group, export.united_states_cities_group, 
export.new_york_boroughs_group, export.vietnam_cities_group, export.misc_cities_group, } return export pcb3s52s1ts3o7kiu4uflgy2c7ek2pw มอดูล:place/placetypes 828 2297280 5720686 5715287 2026-04-21T01:22:10Z OctraBot 3198 5720686 Scribunto text/plain local export = {} export.force_cat = false -- set to true for testing local m_locations = require("Module:place/locations") local m_links = require("Module:links") local m_table = require("Module:table") local m_strutils = require("Module:string utilities") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local dump = mw.dumpObject local insert = table.insert local concat = table.concat local internal_error = m_locations.internal_error export.internal_error = internal_error local process_error = m_locations.process_error export.process_error = process_error local unpack = unpack or table.unpack -- Lua 5.2 compatibility local ucfirst = m_strutils.ucfirst local ulower = m_strutils.lower local rmatch = m_strutils.match local split = m_strutils.split --[==[ intro: This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to [[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}. In particular, it contains two fundamental and tricky functions: # `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising" operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and fallbacks. 
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the containers of the known location being considered. This is done to prevent overcategorizing when either there are two known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico). Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a result are candidates for memoization to speed up the operation of {{tl|place}}. ]==] ------------------------------------------------------------------------------------------ -- Basic utilities -- ------------------------------------------------------------------------------------------ --[==[ Return true if `force_cat` is set either in this module or in [[Module:place/locations]]. ]==] function export.get_force_cat() return export.force_cat or m_locations.force_cat end -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. page) return true end function export.remove_links_and_html(text) text = m_links.remove_links(text) return text:gsub("<.->", "") end --[==[ Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values specified in `placetype_data`. 
If a special plural value is not found, the regular singularization algorithm in [[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is returned. ]==] function export.maybe_singularize_placetype(placetype) if not placetype then return nil end if export.plural_placetype_to_singular[placetype] then return export.plural_placetype_to_singular[placetype] end local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype if retval == placetype then return nil end return retval end -- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first -- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost -- always correct. function export.pluralize_placetype(placetype, do_ucfirst) local ptdata = export.placetype_data[placetype] if ptdata and ptdata.plural then placetype = ptdata.plural else placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype end if do_ucfirst then return ucfirst(placetype) else return placetype end end --[==[ Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified, we also look for category-only placetypes (generally plural) followed by `!`. Return three values: (a) the placetype under which the data can be looked up (i.e. 
in its singular form if the passed-in `placetype` is plural and did not match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version of the plural passed-in `placetype`. ]==] function export.get_placetype_data(placetype, from_category) local ptdata = export.placetype_data[placetype] if ptdata then return placetype, ptdata, "direct" end if from_category then ptdata = export.placetype_data[placetype .. "!"] if ptdata then return placetype .. "!", ptdata, "direct-category" end end local sg_placetype = export.maybe_singularize_placetype(placetype) if sg_placetype then ptdata = export.placetype_data[sg_placetype] if ptdata then return sg_placetype, ptdata, "plural" end end return nil end --[==[ Check for special pseudo-placetypes that should be ignored for categorization purposes. ]==] function export.placetype_is_ignorable(placetype) return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(") end function export.resolve_placetype_aliases(placetype) return export.placetype_aliases[placetype] or placetype end --[==[ Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the key isn't found in the placetype's entry in `placetype_data`, return nil. ]==] function export.get_placetype_prop(placetype, key) -- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype -- aliases have been resolved, but sometimes not, e.g. when fetching the indefinite article in -- get_placetype_article(). 
`resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice. placetype = export.resolve_placetype_aliases(placetype) if export.placetype_data[placetype] then return export.placetype_data[placetype][key] else return nil end end --[==[ Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list { {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e. # the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are zero such qualifiers, the value will be nil); # a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil); # the "reduced placetype" on the right. Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified. For example, given the placetype `"small beachside unincorporated community"`, the return value will be { { {nil, nil, "small beachside unincorporated community"}, {nil, "small", "beachside unincorporated community"}, {"small", "[[beachfront]]", "unincorporated community"}, {"small [[beachfront]]", "[[unincorporated]]", "community"}, }} Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to `"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`. 
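The splitting loop behind this example can be sketched in isolation as follows. This is only an illustrative sketch: `sketch_qualifiers` and `sketch_split` are hypothetical names, the table is an assumed stand-in for `placetype_qualifiers` (`false` = leave the word as-is, `true` = link the word itself, a string = the canonical form), and alias resolution via `placetype_aliases` is left out.

```lua
-- Assumed stand-in for `placetype_qualifiers`; not the real table.
local sketch_qualifiers = {
	small = false,
	former = false,
	beachside = "[[beachfront]]",
	unincorporated = true,
}

-- Hypothetical helper mirroring the left-to-right split described above.
local function sketch_split(placetype)
	-- The first split always holds the full placetype with no qualifiers.
	local splits = {{nil, nil, placetype}}
	local prev_qualifier = nil
	while true do
		local qualifier, reduced = placetype:match("^(.-) (.*)$")
		if not qualifier then break end
		local canon = sketch_qualifiers[qualifier]
		-- Stop at the first word that is not a recognized qualifier.
		if canon == nil then break end
		local new_qualifier = qualifier
		if canon == true then
			new_qualifier = "[[" .. qualifier .. "]]"
		elseif canon then
			new_qualifier = canon
		end
		table.insert(splits, {prev_qualifier, new_qualifier, reduced})
		prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
		placetype = reduced
	end
	return splits
end

local splits = sketch_split("small beachside unincorporated community")
print(splits[4][1], splits[4][2], splits[4][3])
```

Under these assumptions the fourth split carries the accumulated, canonicalized qualifiers on the left and the bare `community` on the right, as in the list above.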
On the other hand, if given `"small former haunted community"`, the return value will be { { {nil, nil, "small former haunted community"}, {nil, "small", "former haunted community"}, {"small", "former", "haunted community"}, }} because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers. Finally, if given `"former adr"`, the return value will be { { {nil, nil, "former adr"}, {nil, "former", "administrative region"}, }} because `"adr"` is a recognized placetype alias for `"administrative region"`. ]==] function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers) local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} local prev_qualifier = nil while true do local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if canon == nil then break end local new_qualifier = qualifier if type(canon) == "table" then canon = canon.link end if not no_canon_qualifiers and canon ~= false then if canon == true then new_qualifier = "[[" .. qualifier .. "]]" else new_qualifier = canon end end insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)}) prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier placetype = reduced_placetype else break end end return splits end --[==[ Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the words in the passed-in placetype (always including the rightmost word in the placetype, i.e. 
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`. '''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`. For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn: ``` {qualifier = nil, placetype="left tributary"} {qualifier = "left", placetype="tributary"} {qualifier = "left", placetype="แม่น้ำ"} ``` and the return value will be { { {qualifier = "left", placetype="tributary"}, {qualifier = "left", placetype="แม่น้ำ"}, }} The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it). 
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because it has a fallback `แม่น้ำ` ("river"), which exists in `placetype_data`, the fallback is entered next. Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are: ``` {qualifier = nil, placetype="small rural fraziones"} {qualifier = nil, placetype="small rural frazione"} {qualifier = "small", placetype="rural fraziones"} {qualifier = "small", placetype="rural frazione"} {qualifier = "small [[rural]]", placetype="fraziones"} {qualifier = "small [[rural]]", placetype="frazione"} {qualifier = "small [[rural]]", placetype="hamlet"} {qualifier = "small [[rural]]", placetype="village"} ``` The return value ends up as { { {qualifier = "small [[rural]]", placetype="frazione"}, {qualifier = "small [[rural]]", placetype="hamlet"}, {qualifier = "small [[rural]]", placetype="village"}, }} Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up being returned. `no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if ptdata then insert_equiv(canon_placetype) if no_fallback then return end local first_placetype = #equivs + 1 local prev_placetype = nil while true do local pt_value = export.placetype_data[canon_placetype] if not pt_value then internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`", canon_placetype, prev_placetype) end if pt_value.fallback then insert_equiv(pt_value.fallback) local last_placetype = #equivs if last_placetype - first_placetype >= 10 then local fallback_loop = {} for i = first_placetype, last_placetype do insert(fallback_loop, equivs[i].placetype) end internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> ")) end prev_placetype = canon_placetype canon_placetype = pt_value.fallback else break end end end end -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a -- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no -- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that -- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for -- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a -- `former`-type directive); these backups live outside this function because we want them done once, late, rather -- than in each invocation of `process_and_insert_placetype()`. local function process_and_insert_placetype(qualifier, reduced_placetype) if form_of_directive then -- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of -- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks.
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier process_and_insert_placetype(qualifier, reduced_placetype) -- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced -- placetypes to avoid the "overseas territory treated as a territory" issue described above. if no_fallback then local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category) if canon_placetype then break end end end -- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g. -- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype -- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g. -- [[Category:en:Former names of places]] in an invocation like -- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}}; -- the `used from 1971–1997` gets treated as a placetype and we're called on it. if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then insert_placetype_and_fallbacks(nil, form_of_directive .. " place") end return equivs end function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only) for _, equiv in ipairs(equivs) do local retval = fun(equiv.placetype) if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then return retval, equiv end end return nil, nil end --[==[ Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false}); but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value. ]==] function export.get_equiv_placetype_prop(placetype, fun, props) if not placetype then return fun(nil), nil end return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun, props and props.continue_on_nil_only) end --[==[ Return the article that is used with an entry placetype. We proceed as follows: # See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article). This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`. # Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that `"the"` should be used. # Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from the beginning) has an entry in `placetype_data` that specifies the indefinite article using `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`). # Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with a vowel and `"a"` otherwise (in this module that call is currently commented out, so an empty string is used instead). If `ucfirst` is true, the first letter of the article is made upper-case.
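The order of precedence in the steps above can be sketched in isolation as follows. This is a simplified illustration, not the module's code: `sketch_qualifiers`, `sketch_data` and `sketch_article` are hypothetical names, the two tables are assumed stand-ins for `placetype_qualifiers` and `placetype_data`, the equivalent-placetype lookup of step 2 is reduced to "the placetype itself, then the placetype minus its first word", and step 4 uses a plain vowel test in place of [[Module:en-utilities]].

```lua
-- Assumed stand-ins for the real tables; the entries are illustrative only.
local sketch_qualifiers = { various = {article = false} }
local sketch_data = {
	capital = {entry_placetype_use_the = true},
	["union territory"] = {entry_placetype_indefinite_article = "a"},
}

-- Hypothetical helper following steps 1-4 in order.
local function sketch_article(placetype)
	-- Step 1: a leading qualifier may dictate the article (`false` = none).
	local qualifier, reduced = placetype:match("^(.-) (.*)$")
	local qdata = qualifier and sketch_qualifiers[qualifier]
	if type(qdata) == "table" and qdata.article ~= nil then
		return qdata.article
	end
	-- Step 2 (simplified): "the" if the placetype or its reduced form asks for it.
	for _, pt in ipairs({placetype, reduced}) do
		if sketch_data[pt] and sketch_data[pt].entry_placetype_use_the then
			return "the"
		end
	end
	-- Step 3: an explicit indefinite article on the placetype itself.
	local ptdata = sketch_data[placetype]
	if ptdata and ptdata.entry_placetype_indefinite_article then
		return ptdata.entry_placetype_indefinite_article
	end
	-- Step 4 (simplified): vowel-initial words take "an", everything else "a".
	return placetype:find("^[aeiouAEIOU]") and "an" or "a"
end

print(sketch_article("various capitals"), sketch_article("union territory"))
```

The `union territory` entry shows why step 3 exists: a plain vowel test would otherwise choose `"an"` for a word that starts with a consonant sound.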
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
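
How a `link` setting turns into wikitext can be sketched as below. This is a simplified illustration under assumptions: `sketch_link` is a hypothetical helper covering only the `true`, `"w"` and `+target` settings handled by `make_placetype_link`, and the `+borough` value is invented for the example.

```lua
local function sketch_link(link, sg_placetype, orig_placetype)
	if link == true then
		-- Link to the singular entry; display the original (possibly plural) form.
		if orig_placetype then
			return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
		end
		return ("[[%s]]"):format(sg_placetype)
	elseif link == "w" then
		-- Link to Wikipedia instead of a dictionary entry.
		return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
	elseif type(link) == "string" and link:find("^%+") then
		-- A leading + names a different link target; the display text is unchanged.
		return ("[[%s|%s]]"):format(link:sub(2), orig_placetype or sg_placetype)
	end
	-- Anything else is passed through as an already formatted string.
	return link
end

local plural_display = sketch_link(true, "city", "cities")
local wikipedia_display = sketch_link("w", "census-designated place")
local retargeted_display = sketch_link("+borough", "London borough")
```

In each case a plural placetype keeps its plural display text while the link target stays singular.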
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
			if category_type == "top-level" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
			elseif category_type == "noncity" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
			elseif category_type == "city" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
			else
				internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
					category_type)
			end
			if type(raw_link) == "string" then
				return maybe_prefix(raw_link), ptdata
			elseif raw_link ~= nil then
				return raw_link, ptdata
			end
			raw_link, fetched_full = fetch_maybe_full("category_link")
			if raw_link == false then
				return raw_link, ptdata
			end
			if is_linked_string(raw_link) then
				return maybe_prefix(raw_link), ptdata
			end
			if ptmatch == "plural" then
				raw_link, fetched_full = fetch_maybe_full("plural_link")
				if raw_link == false then
					return raw_link, ptdata
				end
				if is_linked_string(raw_link) then
					return maybe_prefix(raw_link), ptdata
				end
			end
			if raw_link == nil then
				raw_link, fetched_full = fetch_maybe_full("link")
			end
			if raw_link == false then
				return raw_link, ptdata
			end
			return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
				placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
		else
			if ptmatch == "plural" then
				raw_link = ptdata.plural_link
				if raw_link == false then
					process_error("Placetype %s cannot appear plural", placetype)
				end
				if is_linked_string(raw_link) then
					return raw_link, ptdata
				end
			end
			if raw_link == nil then
				raw_link = ptdata.link
			end
			return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil,
				ptdata, from_category, noerror), ptdata
		end
	end
	return nil
end

local function resolve_unlinked_placename_display_aliases(placetype, placename)
	local equiv_placetypes = export.get_placetype_equivs(placetype)
	for i, equiv in ipairs(equiv_placetypes) do
		equiv_placetypes[i] = equiv.placetype
	end
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
											this_holonym_matches = true
											break
										end
									end
									if this_holonym_matches then
										-- Alias matched above when resolved.
										holonym_matches_at_level = true
										break
									else
										-- Not an alias, or doesn't match when resolved. We have a true mismatch.
										holonym_exists_with_same_placetype = true
									end
								end
							end
						end
						if holonym_matches_at_level then
							match_at_level = true
							break
						end
						if holonym_exists_with_same_placetype then
							mismatch_at_level = true
						end
					end
				end
				if not match_at_level and mismatch_at_level then
					containers_mismatch = true
					break
				end
			end
			if not containers_mismatch then
				return group, key, spec, container_trail
			end
		end
	end
end

--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return
the corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
	local all_found = {}
	for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
		insert(all_found, {group, key, spec, container_trail})
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		local holonym_placetype = data.holonym_placetype
		if type(holonym_placetype) == "table" then
			holonym_placetype = concat(holonym_placetype, ",")
		end
		local found_keys = {}
		for _, found in ipairs(all_found) do
			local _, key, _, _ = unpack(found)
			insert(found_keys, key)
		end
		error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "metropolitan city", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan 
county"] = "non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true,
	["chalk"] = true,
	["karst"] = true,
	["limestone"] = true,
	["mountainous"] = true,
	["mountaintop"] = true,
	["alpine"] = true,
	["volcanic"] = true, -- for an island
	-- political status qualifiers
	["autonomous"] = true,
	["incorporated"] = true,
	["special"] = true,
	["unincorporated"] = true,
	["coterminous"] = true,
	-- monetary status/etc. qualifiers
	["fashionable"] = true,
	["wealthy"] = true,
	["affluent"] = true,
	["declining"] = true,
	-- city vs. rural qualifiers
	["urban"] = true,
	["suburban"] = true,
	["exurban"] = true,
	["outlying"] = true,
	["remote"] = true,
	["rural"] = true,
	["outback"] = true,
	["inner"] = false,
	["inner-city"] = true,
	["central"] = false,
	["outer"] = false,
	-- land use qualifiers
	["residential"] = true,
	["agricultural"] = true,
	["business"] = true,
	["commercial"] = true,
	["industrial"] = true,
	-- business use qualifiers
	["railroad"] = true,
	["railway"] = true,
	["farming"] = true,
	["fishing"] = true,
	["mining"] = true,
	["logging"] = true,
	["cattle"] = true,
	-- tourism use qualifiers
	["resort"] = true, -- note, we also have 'resort city' and 'resort town', that take precedence
	["spa"] = true, -- note, we also have 'spa city' and 'spa town', that take precedence
	["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', that take precedence
	-- religious qualifiers
	["holy"] = true,
	["sacred"] = true,
	["religious"] = true,
	["secular"] = true,
	-- qualifiers for nonexistent places
	["claimed"] = false,
	["fictional"] = true,
	["legendary"] = true,
	["mythical"] = true,
	["mythological"] = true,
	-- directional qualifiers
	["northern"] = false,
	["southern"] = false,
	["eastern"] = false,
	["western"] = false,
	["north"] = false,
	["south"] = false,
	["east"] = false,
	["west"] = false,
	["northeastern"] = false,
	["southeastern"] = false,
	["northwestern"] = false,
	["southwestern"] = false,
	["northeast"] = false,
	["southeast"] = false,
	["northwest"] = false,
	["southwest"] = false,
	-- seasonal qualifiers
	["summer"] = true, -- e.g.
for 'summer capital' ["winter"] = true, -- legal status qualifiers -- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]]. ["official"] = true, ["unofficial"] = true, ["de facto"] = true, -- 'de facto capital' ["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link ["de jure"] = true, -- 'de jure capital' ["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link -- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state' -- misc. qualifiers ["planned"] = true, ["chartered"] = true, ["landlocked"] = true, ["uninhabited"] = true, -- superlative qualifiers ["first"] = no_link_def_article, ["second"] = no_link_def_article, -- for "second largest" etc. ["third"] = no_link_def_article, ["fourth"] = no_link_def_article, ["last"] = no_link_def_article, ["only"] = no_link_def_article, ["sole"] = no_link_def_article, ["main"] = no_link_def_article, ["largest"] = no_link_def_article, ["biggest"] = no_link_def_article, ["smallest"] = no_link_def_article, ["shortest"] = no_link_def_article, ["longest"] = no_link_def_article, ["tallest"] = no_link_def_article, ["highest"] = no_link_def_article, ["lowest"] = no_link_def_article, ["leftmost"] = no_link_def_article, ["rightmost"] = no_link_def_article, ["innermost"] = no_link_def_article, ["outermost"] = no_link_def_article, ["northernmost"] = no_link_def_article, ["southernmost"] = no_link_def_article, ["westernmost"] = no_link_def_article, ["easternmost"] = no_link_def_article, ["northwesternmost"] = no_link_def_article, ["southwesternmost"] = no_link_def_article, ["northeasternmost"] = no_link_def_article, ["southeasternmost"] = no_link_def_article, -- several/various ["several"] = no_link_no_article, ["various"] = no_link_no_article, ["numerous"] = no_link_no_article, ["multiple"] = no_link_no_article, ["many"] = no_link_no_article, ["other"] = no_link_no_article, } --[==[ var: In this table, the key qualifiers should 
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and
`qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
	["abandoned"] = {"FORMER"},
	["ancient"] = {"ANCIENT", "FORMER"},
	["former"] = {"FORMER"},
	["extinct"] = {"FORMER"},
	["historic"] = {"FORMER"},
	["historical"] = {"FORMER"},
	["medieval"] = {"ANCIENT", "FORMER"},
	["mediaeval"] = {"ANCIENT", "FORMER"},
	["ruined"] = {"ANCIENT", "FORMER"},
	["traditional"] = {"FORMER"},
}

--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
	["fictional"] = "fictional location",
	["legendary"] = "mythological location",
	["mythical"] = "mythological location",
	["mythological"] = "mythological location",
	-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
	-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
	-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
	["claimed"] = "claimed political division",
}

--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
	["autonomous community"] = "autonomous community capitals",
	["canton"] = "cantonal capitals",
	["comarca"] = "comarca capitals",
	["ประเทศ"] = "national capitals",
	-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals", ["department"] = "departmental capitals", ["อำเภอ"] = "district capitals", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["metropolitan city"] = "metropolitan city capitals", ["เทศบาล"] = "municipal capitals", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "provincial capitals", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "regional capitals", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "state capitals", ["ดินแดน"] = "territorial capitals", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/Europe"}, ["Central Europe"] = {"continent/Europe"}, ["Western Europe"] = {"continent/Europe"}, ["South Europe"] = {"continent/Europe"}, ["Southern Europe"] = {"continent/Europe"}, ["Northern Europe"] = {"continent/Europe"}, ["Northeast Europe"] = {"continent/Europe"}, ["Northeastern Europe"] = {"continent/Europe"}, ["Southeast Europe"] = {"continent/Europe"}, ["Southeastern Europe"] = {"continent/Europe"}, ["North Caucasus"] = {"continent/Europe"}, ["South Caucasus"] = {"continent/Asia"}, ["South Asia"] = {"continent/Asia"}, ["Southern Asia"] = {"continent/Asia"}, ["East Asia"] = {"continent/Asia"}, ["Eastern Asia"] = {"continent/Asia"}, ["Central Asia"] = {"continent/Asia"}, ["West Asia"] = {"continent/Asia"}, ["Western Asia"] = {"continent/Asia"}, ["Southeast Asia"] = {"continent/Asia"}, ["North Asia"] = {"continent/Asia"}, ["Northern Asia"] = {"continent/Asia"}, ["Anatolia"] = {"continent/Asia"}, ["Asia Minor"] = {"continent/Asia"}, ["Mesopotamia"] = {"continent/Asia"}, ["North Africa"] = {"continent/Africa"}, ["Central Africa"] = {"continent/Africa"}, ["West Africa"] = {"continent/Africa"}, ["East Africa"] = {"continent/Africa"}, ["Southern Africa"] = {"continent/Africa"}, ["Central America"] = {"continent/Central America"}, ["Caribbean"] = {"continent/North America"}, ["Polynesia"] = {"continent/Oceania"}, ["Micronesia"] = {"continent/Oceania"}, ["Melanesia"] = {"continent/Oceania"}, ["Siberia"] = {"country/Russia", "continent/Asia"}, ["Russian Far East"] = {"country/Russia", "continent/Asia"}, ["South Wales"] = {"constituent country/Wales", "continent/Europe"}, ["Balkans"] = {"continent/Europe"}, ["West Bank"] = {"country/Palestine", "continent/Asia"}, ["Gaza"] = {"country/Palestine", "continent/Asia"}, ["Gaza Strip"] = {"country/Palestine", "continent/Asia"}, } } ------------------------------------------------------------------------------------------ -- Category and display 
handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. 
local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
local function insert_specific_variant_if_possible(key, spec) return export.get_equiv_placetype_prop(holonym_placetype, function(pt) local plural_holonym_placetype = export.pluralize_placetype(pt) local saw_matching_div if spec.divs then local divs = spec.divs if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) ~= "table" then div = {type = div} end if plural_holonym_placetype == div.type then saw_matching_div = true break end end end if saw_matching_div then insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec))) return true end return false end) end if insert_specific_variant_if_possible(matching_key, matching_spec) then inserted_specific_variant_cat = true elseif not matching_spec.no_container_cat then for _, containers in ipairs(matching_container_trail) do local saw_no_container_cat = false for _, container in ipairs(containers) do if insert_specific_variant_if_possible(container.key, container.spec) then inserted_specific_variant_cat = true break end saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat end if inserted_specific_variant_cat or saw_no_container_cat then break end end end end else -- This happens when in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for -- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing. -- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to -- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab. -- Possibly we can just skip this case entirely. 
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "Capital cities") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
	for _, place_desc in ipairs(place_descs) do
		if place_desc.holonyms then
			-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
			-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
			-- modifier, which causes category processing to restart at that point after generating categories for a
			-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
			-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
			-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
			-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
			-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
			-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
			-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
			-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
			-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
			-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
			-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
									-- This is to correctly handle a situation like
									-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
									-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
									-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
									-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
									-- `Bouches-du-Rhône`, `Vaucluse` gets augmented correctly but `Gard` fails to match
									-- in find_matching_holonym_location() because of the mismatch between augmented
									-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
									-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
									-- `Ardèche`) against any known location. To deal with this, we mark augmented
									-- holonyms as being augmented due to a source holonym, and when processing a given
									-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
									-- The restriction to the same placetype is so that `Birmingham` still gets
									-- correctly disambiguated to Birmingham, England in the example given above near
									-- the top of this function, using the augmented holonym `c/United Kingdom` added by
									-- the specified `cc/England` (whose placetype `constituent country` differs from
									-- the placetype `city` of Birmingham).
									augmented_from_holonym = holonym,
								}
								insert(augmented_holonyms, new_holonym)
								-- But it is safe to modify other parts of the place_desc.
								export.key_holonym_into_place_desc(place_desc, new_holonym)
							end
							if any_no_auto_augment_container then
								break
							end
						end
						for j = i + 1, #place_desc.holonyms do
							insert(augmented_holonyms, place_desc.holonyms[j])
						end
						place_desc.holonyms = augmented_holonyms
					end
				end
			end
		end
	end
end

-- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions.
-- We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
--     the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
--     categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
--     using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
--     [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
--     If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
--     we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
--     town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" ("in") not
--     "ของ" ("of"), and note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
--     called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
--     (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
--     categorization.
local function district_neighborhood_cat_handler(data)
	local function get_plural_entry_placetype(location_spec, container_trail)
		if data.entry_placetype == "suburb" then
			return "Suburbs"
		else
			-- Check for `british_spelling` setting on the spec itself or any container.
			local uses_british_spelling = location_spec.british_spelling
			if uses_british_spelling == nil and container_trail then
				for _, container_set in ipairs(container_trail) do
					local must_outer_break = false
					for _, container in ipairs(container_set) do
						if container.spec.british_spelling ~= nil then
							uses_british_spelling = container.spec.british_spelling
							must_outer_break = true
							break
						end
					end
					if must_outer_break then
						break
					end
				end
			end
			return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
		end
	end

	-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
	-- etc.)
	local group, key, spec, container_trail = export.find_matching_holonym_location(data)
	if group and not spec.is_former_place and spec.is_city then
		return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
	end

	-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
	-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
	local has_neighborhoods
	local entry_placetype = data.entry_placetype
	if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
		has_neighborhoods = true
	else
		-- Otherwise, make sure the current holonym is city-like.
		has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
			return export.get_placetype_prop(pt, "has_neighborhoods")
		end, {continue_on_nil_only = true})
	end

	if has_neighborhoods then
		-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
		-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
		-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV".
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
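-- Illustrative sketch only, traced by hand from the code of prefix_display_handler() just below; it is not part of
-- the module. It assumes that m_links.remove_links() strips `[[...]]` link markup and that ulower() lowercases its
-- argument. The placename `Wigan` is a hypothetical input chosen for illustration.
--   prefix_display_handler("Metropolitan Borough of", "Wigan")      --> "Metropolitan Borough of [[Wigan]]"
--   prefix_display_handler("Metropolitan Borough of", "[[Wigan]]")  --> "Metropolitan Borough of [[Wigan]]"
--   prefix_display_handler("Metropolitan Borough of", "Metropolitan Borough of Wigan")
--                                                                    --> "Metropolitan Borough of Wigan" (unchanged)
-- As written, the code joins the prefix and the placename with a single space and leaves the prefix outside the link.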
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then return holonym_placename end if holonym_placename:find("%[%[") then return prefix .. " " .. holonym_placename end return prefix .. " [[" .. holonym_placename .. "]]" end -- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms. -- Works identically to prefix_display_handler but for suffixes instead of prefixes. local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then return holonym_placename end if holonym_placename:find("%[%[") then return holonym_placename .. " " .. suffix end if include_suffix_in_link then return "[[" .. holonym_placename .. " " .. suffix .. "]]" else return "[[" .. holonym_placename .. "]] " .. suffix end end -- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed -- with "borough". local function borough_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.new_york_boroughs[unlinked_placename] then -- Hack: don't display "borough" after the names of NYC boroughs return holonym_placename end return suffix_display_handler("borough", holonym_placename) end local function county_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) -- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]". if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or m_locations.northern_ireland_counties["County " .. unlinked_placename .. 
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are display as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in the placetype data; everything else is handled through category handlers, either per-placetype or special (such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is described at the top of [[Module:place]]. There are several recognized property keys, of various types: 1. The following link-related property keys are recognized: * `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in categories). The possible values are: *# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a two-part raw link, e.g. `<nowiki>[[city|cities]]</nowiki>`. *# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g. `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the placetype is given in the plural. *# `"+..."`: Create a two-part link to the entry following the `+` sign. 
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`. * `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is. * `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category. * `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category. * `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. 
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
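The interaction of `affix_type`, `suffix`/`prefix`/`affix` and `no_affix_strings` described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the module's actual code: the helper names (`escape_pattern`, `contains_whole_word`, `capitalize_words`, `apply_affix`) are hypothetical, matching is ASCII-only, the optional `the` before a prefix is omitted, the joining word `of` is the English one, and capitalizing a custom `suffix` or `prefix` value for `"Suf"`/`"Pref"` is an assumption.

```lua
-- Illustrative sketch only; none of these helpers exist in the module under these names.

-- Escape Lua pattern magic characters so a string can be matched literally.
local function escape_pattern(s)
	return (s:gsub("%p", "%%%0"))
end

-- True if `word` occurs in `text` as a whole word, ignoring case (ASCII-only sketch).
local function contains_whole_word(text, word)
	local pattern = "%f[%w]" .. escape_pattern(word:lower()) .. "%f[%W]"
	return text:lower():find(pattern) ~= nil
end

-- Capitalize the first letter of each word, as described for "Pref" and "Suf".
local function capitalize_words(s)
	return (s:gsub("(%a)(%a*)", function(first, rest)
		return first:upper() .. rest
	end))
end

-- Apply the affix requested by `spec.affix_type` to `holonym`.
local function apply_affix(spec, placetype, holonym)
	-- `no_affix_strings` defaults to the placetype itself.
	local no_affix = spec.no_affix_strings or placetype
	if type(no_affix) ~= "table" then
		no_affix = {no_affix}
	end
	for _, str in ipairs(no_affix) do
		if contains_whole_word(holonym, str) then
			return holonym -- affix already present; display as given
		end
	end
	local affix_type = spec.affix_type
	if affix_type == "pref" or affix_type == "Pref" then
		-- `prefix` takes precedence over `affix`, which takes precedence over the placetype.
		local prefix = spec.prefix or spec.affix or placetype
		if affix_type == "Pref" then
			prefix = capitalize_words(prefix)
		end
		return prefix .. " of " .. holonym
	elseif affix_type == "suf" or affix_type == "Suf" then
		local suffix = spec.suffix or spec.affix or placetype
		if affix_type == "Suf" then
			suffix = capitalize_words(suffix)
		end
		return holonym .. " " .. suffix
	end
	return holonym
end

local okrug = {affix_type = "Suf", no_affix_strings = "okrug"}
print(apply_affix(okrug, "autonomous okrug", "Nenets"))
print(apply_affix(okrug, "autonomous okrug", "Nenets Okrug"))
```

With the `autonomous okrug` settings quoted above, the first call appends the capitalized placetype, while the second leaves the holonym untouched because `okrug` already occurs in it as a whole word.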
'''NOTE:''' # The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified. # Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype. # It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs. ]==] export.placetype_data = { --[=[ If you need to sort the following, do this (using Vim): 1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries. 2. Make sure the table uses tabs everywhere for indent, and not spaces. 3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence: :'a,.s/\n/\\n/g :s/\\n\(\t\[\)/\r\1/g The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.) 4. Execute the following to sort: :'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? 
@@@ //' Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station" before "hill" and "county borough" before "เทศมณฑล" because the space after e.g. "hill station" sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern). 5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing :'a,.s/\\n/\r/g Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but to insert a newline on the right side of a replacement you must use \r. ]=] ["*"] = { link = false, cat_handler = generic_place_cat_handler, }, ["administrative atoll"] = { -- Maldives link = "+w:administrative divisions of the Maldives", preposition = "ของ", class = "subpolity", }, ["administrative capital"] = { link = "w", fallback = "capital city", }, ["administrative center"] = { link = "w", fallback = "non-city capital", }, ["administrative centre"] = { link = "w", fallback = "administrative center", }, ["administrative county"] = { link = "w", fallback = "เทศมณฑล", }, ["administrative district"] = { link = "w", fallback = "อำเภอ", }, ["administrative headquarters"] = { link = "separately", fallback = "administrative centre", }, ["administrative region"] = { link = true, preposition = "ของ", suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)" fallback = "ภูมิภาค", class = "subpolity", }, ["administrative seat"] = { link = "w", fallback = "administrative centre", }, ["administrative territory"] = { link = "separately", preposition = "ของ", suffix = "ดินแดน", -- but prefix is still "administrative territory (of)" fallback = "ดินแดน", class = "subpolity", }, ["administrative unit"] = { -- Grrr, it's difficult to generalize about "administrative units". 
In Albania, "administrative unit" is an -- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term -- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types -- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad -- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need -- to include this so that it can be used as a placetype for Albania, categorizing as communes. link = "w", class = "subpolity", }, ["administrative village"] = { link = "w", preposition = "ของ", has_neighborhoods = true, class = "settlement", }, ["aimag"] = { -- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province; -- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district. link = "w", fallback = "prefecture", }, ["airport"] = { link = true, class = "man-made structure", default = {true}, }, ["alliance"] = { link = true, fallback = "confederation", }, ["archipelago"] = { link = true, fallback = "เกาะ", }, ["area"] = { link = true, preposition = "ของ", fallback = "geographic and cultural area", -- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former -- when categorizing 'Areas' but the latter when handling e.g. 'historical area'. class = "subpolity", former_type = "geographic region", cat_handler = district_neighborhood_cat_handler, }, ["arm"] = { link = true, preposition = "ของ", class = "natural feature", default = {"ทะเล"}, }, ["arrondissement"] = { link = true, preposition = "ของ", -- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions -- of departments or provinces. Need to conditionalize on the country for both of the following. 
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "capital city", }, ["capital city"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "capital city", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"}, default = {"City-states", "นคร", "ประเทศ", "National capitals"}, }, ["civil parish"] = 
{ -- Mostly England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "capital city", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
category_link = "[[county|counties]] and [[county-level city|county-level cities]]", class = "subpolity", }, ["continent"] = { link = true, category_link = false, -- can't occur as a bare category class = "natural feature", default = {"Continents and continental regions"}, }, ["continental region"] = { link = "separately", category_link = false, -- can't occur as a bare category class = "geographic region", fallback = "continent", }, ["continents and continental regions!"] = { category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])", class = "geographic region", }, ["council area"] = { link = true, -- in Scotland; similar to a county preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["ประเทศ"] = { link = true, class = "polity", -- do not translate class ["continent/*"] = {true, "ประเทศ"}, default = {true}, }, ["country-like entities!"] = { category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]", class = "polity", -- do not translate class }, ["เทศมณฑล"] = { link = true, preposition = "ของ", display_handler = county_display_handler, class = "subpolity", }, ["county borough"] = { link = true, -- in Wales; similar to a county preposition = "ของ", affix_type = "suf", fallback = "borough", class = "subpolity", }, ["county seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["county town"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", fallback = "เมือง", has_neighborhoods = true, class = "capital", }, ["county-administered city"] = { -- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city. -- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city. 
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "capital city", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "capital city", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "capital city", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "capital city", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "capital city", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["metropolitan city"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"metropolitan", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are province-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "capital city", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["non-city capital"] = { link = "[[capital]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", -- do not translate `class` bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "capital city", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "capital city", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "capital city", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "capital city", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", --ห้ามแปล class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "capitals", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nicknames]] for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
the United States full_category_link = "[[official]] [[nicknames]] for [[state]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for states", addl_bare_category_parents = {"รัฐ"}, }, ["OFFICIAL_NICKNAME_FOR place"] = { link = false, default = {"Official nicknames for places"}, }, ["OFFICIAL_NICKNAME_FOR state"] = { link = false, default = {"Official nicknames for states"}, }, } export.plural_placetype_to_singular = {} for sg_placetype, spec in pairs(export.placetype_data) do if spec.plural then export.plural_placetype_to_singular[spec.plural] = sg_placetype end end return export c2rgy85gws88642kx3n64pxzd5hvu9l 5720688 5720686 2026-04-21T01:23:31Z OctraBot 3198 5720688 Scribunto text/plain local export = {} export.force_cat = false -- set to true for testing local m_locations = require("Module:place/locations") local m_links = require("Module:links") local m_table = require("Module:table") local m_strutils = require("Module:string utilities") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local dump = mw.dumpObject local insert = table.insert local concat = table.concat local internal_error = m_locations.internal_error export.internal_error = internal_error local process_error = m_locations.process_error export.process_error = process_error local unpack = unpack or table.unpack -- Lua 5.2 compatibility local ucfirst = m_strutils.ucfirst local ulower = m_strutils.lower local rmatch = m_strutils.match local split = m_strutils.split --[==[ intro: This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to [[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}. 
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
  the process correctly handles placetypes with qualifiers (including qualifiers that act similarly to "type-raising"
  operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
  fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
  checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
  containers of the known location being considered. This is done to prevent overcategorizing when either there are two
  known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
  two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
  non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).

Both of these functions are invoked repeatedly, probably several times on the same inputs, and as a result are
candidates for memoization to speed up the operation of {{tl|place}}.
]==]

------------------------------------------------------------------------------------------
--                                    Basic utilities                                   --
------------------------------------------------------------------------------------------

--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
	return export.force_cat or m_locations.force_cat
end

-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
	require(debug_track_module)("place/" ..
		page)
	return true
end

function export.remove_links_and_html(text)
	text = m_links.remove_links(text)
	return text:gsub("<.->", "")
end

--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is invoked, which reverses the y -> ies change after consonants and the 'es' addition after
sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME:
consider changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil
is returned.

Note that in this copy the call into [[Module:en-utilities]] is commented out, so only the table lookup takes effect
and any placetype not found there yields nil.
]==]
function export.maybe_singularize_placetype(placetype)
	if not placetype then
		return nil
	end
	if export.plural_placetype_to_singular[placetype] then
		return export.plural_placetype_to_singular[placetype]
	end
	local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
	if retval == placetype then
		return nil
	end
	return retval
end

-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct. In this copy that fallback is commented out, so a placetype without an explicit `plural` is
-- returned unchanged.
function export.pluralize_placetype(placetype, do_ucfirst)
	local ptdata = export.placetype_data[placetype]
	if ptdata and ptdata.plural then
		placetype = ptdata.plural
	else
		placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
	end
	if do_ucfirst then
		return ucfirst(placetype)
	else
		return placetype
	end
end

--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`.
Return three values:
(a) the placetype under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is
    plural and did not match a category-only placetype followed by `!`);
(b) the placetype data structure;
(c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the
    passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if
    the `ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the
    `ptdata` was looked up under the singularized version of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
	local ptdata = export.placetype_data[placetype]
	if ptdata then
		return placetype, ptdata, "direct"
	end
	if from_category then
		ptdata = export.placetype_data[placetype .. "!"]
		if ptdata then
			return placetype .. "!", ptdata, "direct-category"
		end
	end
	local sg_placetype = export.maybe_singularize_placetype(placetype)
	if sg_placetype then
		ptdata = export.placetype_data[sg_placetype]
		if ptdata then
			return sg_placetype, ptdata, "plural"
		end
	end
	return nil
end

--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
	return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(")
end

function export.resolve_placetype_aliases(placetype)
	return export.placetype_aliases[placetype] or placetype
end

--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or
the key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
	-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
	-- aliases have been resolved, but sometimes not, e.g.
	-- when fetching the indefinite article in get_placetype_article(). `resolve_placetype_aliases` is just a simple
	-- lookup and it doesn't hurt to do it twice.
	placetype = export.resolve_placetype_aliases(placetype)
	if export.placetype_data[placetype] then
		return export.placetype_data[placetype][key]
	else
		return nil
	end
end

--[==[
Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list
{ {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e.
# the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are
  zero such qualifiers, the value will be nil);
# a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil);
# the "reduced placetype" on the right.
Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding
from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases
in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized.
Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified.

For example, given the placetype `"small beachside unincorporated community"`, the return value will be
{ {
	{nil, nil, "small beachside unincorporated community"},
	{nil, "small", "beachside unincorporated community"},
	{"small", "[[beachfront]]", "unincorporated community"},
	{"small [[beachfront]]", "[[unincorporated]]", "community"},
}}
Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to
`"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`.
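
The splitting loop described above can be tried in isolation with the following minimal sketch. It is not the module's
own code path: the `qualifiers` and `aliases` tables are hypothetical stand-ins for `placetype_qualifiers` and
`placetype_aliases`, their entries are illustrative assumptions, and `split_qualifiers` is a local helper name chosen
for this example only.

```lua
-- Hypothetical stand-ins for the module's `placetype_qualifiers` and `placetype_aliases` (illustrative entries only).
local qualifiers = {
	small = false,                 -- recognized qualifier, left unlinked
	beachside = "[[beachfront]]",  -- recognized qualifier, canonicalized to a different linked term
	unincorporated = true,         -- recognized qualifier, linked to itself
}
local aliases = { adr = "administrative region" }

local function resolve(placetype)
	return aliases[placetype] or placetype
end

-- Return a list of {prev_qualifiers, this_qualifier, reduced_placetype} splits.
local function split_qualifiers(placetype)
	local splits = {{nil, nil, resolve(placetype)}}
	local prev = nil
	while true do
		-- Peel off the leftmost word; stop when there is no space left.
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		if not qualifier then
			break
		end
		local canon = qualifiers[qualifier]
		if canon == nil then
			break -- not a recognized qualifier
		end
		local this = qualifier
		if canon == true then
			this = "[[" .. qualifier .. "]]"
		elseif canon ~= false then
			this = canon
		end
		splits[#splits + 1] = {prev, this, resolve(rest)}
		prev = prev and (prev .. " " .. this) or this
		placetype = rest
	end
	return splits
end

local splits = split_qualifiers("small beachside unincorporated community")
for _, s in ipairs(splits) do
	print(tostring(s[1]), tostring(s[2]), s[3])
end
```

Under these assumed tables the loop yields four splits, the last of which pairs the accumulated qualifiers
`small [[beachfront]]` with the reduced placetype `community`.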

On the other hand, if given `"small former haunted community"`, the return value will be
{ {
	{nil, nil, "small former haunted community"},
	{nil, "small", "former haunted community"},
	{"small", "former", "haunted community"},
}}
because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers.

Finally, if given `"former adr"`, the return value will be
{ {
	{nil, nil, "former adr"},
	{nil, "former", "administrative region"},
}}
because `"adr"` is a recognized placetype alias for `"administrative region"`.
]==]
function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers)
	local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}}
	local prev_qualifier = nil
	while true do
		local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$")
		if qualifier then
			local canon = export.placetype_qualifiers[qualifier]
			if canon == nil then
				break
			end
			local new_qualifier = qualifier
			if type(canon) == "table" then
				canon = canon.link
			end
			if not no_canon_qualifiers and canon ~= false then
				if canon == true then
					new_qualifier = "[[" .. qualifier .. "]]"
				else
					new_qualifier = canon
				end
			end
			insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)})
			prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier
			placetype = reduced_placetype
		else
			break
		end
	end
	return splits
end

--[==[
Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find
the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered
list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a
placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the
words in the passed-in placetype (always including the rightmost word in the placetype, i.e.
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`. '''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`. For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn: ``` {qualifier = nil, placetype="left tributary"} {qualifier = "left", placetype="tributary"} {qualifier = "left", placetype="แม่น้ำ"} ``` and the return value will be { { {qualifier = "left", placetype="tributary"}, {qualifier = "left", placetype="แม่น้ำ"}, }} The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it). 
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next. Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are: ``` {qualifier = nil, placetype="small rural fraziones"} {qualifier = nil, placetype="small rural frazione"} {qualifier = "small", placetype="rural fraziones"} {qualifier = "small", placetype="rural frazione"} {qualifier = "small [[rural]]", placetype="fraziones"} {qualifier = "small [[rural]]", placetype="frazione"} {qualifier = "small [[rural]]", placetype="hamlet"} {qualifier = "small [[rural]]", placetype="village"} ``` The return value ends up as { { {qualifier = "small [[rural]]", placetype="frazione"}, {qualifier = "small [[rural]]", placetype="hamlet"}, {qualifier = "small [[rural]]", placetype="village"}, }} Here, because singularizing `fraziones` yields a value different from the placetype itself, that singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up being returned. `no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym. 
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if ptdata then insert_equiv(canon_placetype) if no_fallback then return end local first_placetype = #equivs + 1 local prev_placetype = nil while true do local pt_value = export.placetype_data[canon_placetype] if not pt_value then internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`", canon_placetype, prev_placetype) end if pt_value.fallback then insert_equiv(pt_value.fallback) local last_placetype = #equivs if last_placetype - first_placetype >= 10 then local fallback_loop = {} for i = first_placetype, last_placetype do insert(fallback_loop, equivs[i].placetype) end internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> ")) end prev_placetype = canon_placetype canon_placetype = pt_value.fallback else break end end end end -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a -- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no -- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that -- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for -- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a -- `former`-type directive); these backups live outside this function because we want them done once, late, rather -- than in each invocation of `process_and_insert_placetype()`. local function process_and_insert_placetype(qualifier, reduced_placetype) if form_of_directive then -- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of -- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. 
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier process_and_insert_placetype(qualifier, reduced_placetype) -- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced -- placetypes to avoid the "overseas territory treated as a territory" issue described above. if no_fallback then local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category) if canon_placetype then break end end end -- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g. -- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype -- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g. -- [[Category:en:Former names of places]] in an invocation like -- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}}; -- the `used from 1971–1997` gets treated as a placetype and we're called on it. if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then insert_placetype_and_fallbacks(nil, form_of_directive .. " place") end return equivs end function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only) for _, equiv in ipairs(equivs) do local retval = fun(equiv.placetype) if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then return retval, equiv end end return nil, nil end --[==[ Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false}); but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value. 
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value. ]==] function export.get_equiv_placetype_prop(placetype, fun, props) if not placetype then return fun(nil), nil end return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun, props and props.continue_on_nil_only) end --[==[ Return the article that is used with an entry placetype. We proceed as follows: # See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article). This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`. # Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that `"the"` should be used. # Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from the beginning) has an entry in `placetype_data` that specifies the indefinite article using `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`). # Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with a vowel and `"a"` otherwise (in this copy of the module that call is commented out and the empty string is used instead). If `ucfirst` is true, the first letter of the article is made upper-case. 
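The precedence described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the module's own code: `qualifier_articles`, `use_the`, `indefinite_override` and `pick_article` are hypothetical stand-ins for `placetype_qualifiers`, the `entry_placetype_use_the` property, the `entry_placetype_indefinite_article` property and `get_placetype_article()`; the sample data is invented; and the vowel test only approximates what [[Module:en-utilities]] does.

```lua
-- Hypothetical stand-in data; the real tables live in this module and differ.
local qualifier_articles = { various = false }            -- qualifier -> article (false = no article)
local use_the = { capital = true }                        -- placetypes that take "the"
local indefinite_override = { ["union territory"] = "a" } -- explicit indefinite articles

local function pick_article(placetype)
	-- Step 1: a leading qualifier that specifies an article wins outright.
	local qualifier = placetype:match("^(.-) ")
	local art = qualifier and qualifier_articles[qualifier]
	if art ~= nil then
		return art
	end
	-- Step 2: "the" if the placetype is flagged. The real function also
	-- consults equivalent placetypes; this sketch checks only the placetype itself.
	if use_the[placetype] then
		return "the"
	end
	-- Step 3: an explicit indefinite article on the placetype itself.
	if indefinite_override[placetype] then
		return indefinite_override[placetype]
	end
	-- Step 4: simplified vowel test (an assumption, not the en-utilities algorithm).
	if placetype:find("^[aeiouAEIOU]") then
		return "an"
	end
	return "a"
end

print(pick_article("various capitals")) -- false
print(pick_article("capital"))          -- the
print(pick_article("union territory"))  -- a
print(pick_article("island"))           -- an
```

Returning early in step 1 is what lets `false` (no article) take precedence over a later `"the"`.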
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
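As an illustration of "displaying as plural but linked to the singular form", here is a minimal sketch of two of the `link` settings. The helper name `sketch_link` is hypothetical and covers only the `true` and `"w"` settings; the placetypes passed in are just sample strings.

```lua
-- Hypothetical helper covering only two `link` settings: `true` and "w".
local function sketch_link(link, sg_placetype, orig_placetype)
	if link == true then
		if orig_placetype then
			-- Plural input: link to the singular, display the plural.
			return ("[[%s|%s]]"):format(sg_placetype, orig_placetype)
		end
		return ("[[%s]]"):format(sg_placetype)
	elseif link == "w" then
		-- Link to Wikipedia rather than to a dictionary entry.
		return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype)
	end
	error("link setting not covered by this sketch")
end

print(sketch_link(true, "city", "cities")) -- [[city|cities]]
print(sketch_link(true, "city"))           -- [[city]]
print(sketch_link("w", "oblast"))          -- [[w:oblast|oblast]]
```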
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
if category_type == "top-level" then --do not translate raw_link, fetched_full = fetch_maybe_full("category_link_top_level") elseif category_type == "noncity" then --do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity") elseif category_type == "city" then --do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_city") else internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', --do not translate category_type) end if type(raw_link) == "string" then return maybe_prefix(raw_link), ptdata elseif raw_link ~= nil then return raw_link, ptdata end raw_link, fetched_full = fetch_maybe_full("category_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end if ptmatch == "plural" then raw_link, fetched_full = fetch_maybe_full("plural_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end end if raw_link == nil then raw_link, fetched_full = fetch_maybe_full("link") end if raw_link == false then return raw_link, ptdata end return maybe_prefix(make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata else if ptmatch == "plural" then raw_link = ptdata.plural_link if raw_link == false then process_error("Placetype %s cannot appear plural", placetype) end if is_linked_string(raw_link) then return raw_link, ptdata end end if raw_link == nil then raw_link = ptdata.link end return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata end end return nil end local function resolve_unlinked_placename_display_aliases(placetype, placename) local equiv_placetypes = export.get_placetype_equivs(placetype) for i, equiv in ipairs(equiv_placetypes) do equiv_placetypes[i] = equiv.placetype end 
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
											this_holonym_matches = true
											break
										end
									end
									if this_holonym_matches then
										-- Alias matched above when resolved.
										holonym_matches_at_level = true
										break
									else
										-- Not an alias, or doesn't match when resolved. We have a true mismatch.
										holonym_exists_with_same_placetype = true
									end
								end
							end
						end
						if holonym_matches_at_level then
							match_at_level = true
							break
						end
						if holonym_exists_with_same_placetype then
							mismatch_at_level = true
						end
					end
				end
				if not match_at_level and mismatch_at_level then
					containers_mismatch = true
					break
				end
			end
			if not containers_mismatch then
				return group, key, spec, container_trail
			end
		end
	end
end

--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return
the corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
	local all_found = {}
	for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
		insert(all_found, {group, key, spec, container_trail})
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		local holonym_placetype = data.holonym_placetype
		if type(holonym_placetype) == "table" then
			holonym_placetype = concat(holonym_placetype, ",")
		end
		local found_keys = {}
		for _, found in ipairs(all_found) do
			local _, key, _, _ = unpack(found)
			insert(found_keys, key)
		end
		error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
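
For illustration only, a minimal sketch of how a caller might canonicalize a placetype through this map. The helper
name `canonicalize_placetype` is hypothetical and is not defined in this module; the real lookup code may apply
further steps before or after consulting the map.
	local function canonicalize_placetype(placetype)
		-- Fall back to the placetype itself when it is not an alias.
		return export.placetype_aliases[placetype] or placetype
	end
	canonicalize_placetype("cdp")     --> "census-designated place"
	canonicalize_placetype("village") --> "village" (not an alias, returned unchanged)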
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "metropolitan city", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan 
county"] = "non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true,
	["chalk"] = true,
	["karst"] = true,
	["limestone"] = true,
	["mountainous"] = true,
	["mountaintop"] = true,
	["alpine"] = true,
	["volcanic"] = true, -- for an island
	-- political status qualifiers
	["autonomous"] = true,
	["incorporated"] = true,
	["special"] = true,
	["unincorporated"] = true,
	["coterminous"] = true,
	-- monetary status/etc. qualifiers
	["fashionable"] = true,
	["wealthy"] = true,
	["affluent"] = true,
	["declining"] = true,
	-- city vs. rural qualifiers
	["urban"] = true,
	["suburban"] = true,
	["exurban"] = true,
	["outlying"] = true,
	["remote"] = true,
	["rural"] = true,
	["outback"] = true,
	["inner"] = false,
	["inner-city"] = true,
	["central"] = false,
	["outer"] = false,
	-- land use qualifiers
	["residential"] = true,
	["agricultural"] = true,
	["business"] = true,
	["commercial"] = true,
	["industrial"] = true,
	-- business use qualifiers
	["railroad"] = true,
	["railway"] = true,
	["farming"] = true,
	["fishing"] = true,
	["mining"] = true,
	["logging"] = true,
	["cattle"] = true,
	-- tourism use qualifiers
	["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
	["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
	["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
	-- religious qualifiers
	["holy"] = true,
	["sacred"] = true,
	["religious"] = true,
	["secular"] = true,
	-- qualifiers for nonexistent places
	["claimed"] = false,
	["fictional"] = true,
	["legendary"] = true,
	["mythical"] = true,
	["mythological"] = true,
	-- directional qualifiers
	["northern"] = false,
	["southern"] = false,
	["eastern"] = false,
	["western"] = false,
	["north"] = false,
	["south"] = false,
	["east"] = false,
	["west"] = false,
	["northeastern"] = false,
	["southeastern"] = false,
	["northwestern"] = false,
	["southwestern"] = false,
	["northeast"] = false,
	["southeast"] = false,
	["northwest"] = false,
	["southwest"] = false,
	-- seasonal qualifiers
	["summer"] = true, -- e.g.
for 'summer capital' ["winter"] = true, -- legal status qualifiers -- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]]. ["official"] = true, ["unofficial"] = true, ["de facto"] = true, -- 'de facto capital' ["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link ["de jure"] = true, -- 'de jure capital' ["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link -- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state' -- misc. qualifiers ["planned"] = true, ["chartered"] = true, ["landlocked"] = true, ["uninhabited"] = true, -- superlative qualifiers ["first"] = no_link_def_article, ["second"] = no_link_def_article, -- for "second largest" etc. ["third"] = no_link_def_article, ["fourth"] = no_link_def_article, ["last"] = no_link_def_article, ["only"] = no_link_def_article, ["sole"] = no_link_def_article, ["main"] = no_link_def_article, ["largest"] = no_link_def_article, ["biggest"] = no_link_def_article, ["smallest"] = no_link_def_article, ["shortest"] = no_link_def_article, ["longest"] = no_link_def_article, ["tallest"] = no_link_def_article, ["highest"] = no_link_def_article, ["lowest"] = no_link_def_article, ["leftmost"] = no_link_def_article, ["rightmost"] = no_link_def_article, ["innermost"] = no_link_def_article, ["outermost"] = no_link_def_article, ["northernmost"] = no_link_def_article, ["southernmost"] = no_link_def_article, ["westernmost"] = no_link_def_article, ["easternmost"] = no_link_def_article, ["northwesternmost"] = no_link_def_article, ["southwesternmost"] = no_link_def_article, ["northeasternmost"] = no_link_def_article, ["southeasternmost"] = no_link_def_article, -- several/various ["several"] = no_link_no_article, ["various"] = no_link_no_article, ["numerous"] = no_link_no_article, ["multiple"] = no_link_no_article, ["many"] = no_link_no_article, ["other"] = no_link_no_article, } --[==[ var: In this table, the key qualifiers should 
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and
`qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
	["abandoned"] = {"FORMER"},
	["ancient"] = {"ANCIENT", "FORMER"},
	["former"] = {"FORMER"},
	["extinct"] = {"FORMER"},
	["historic"] = {"FORMER"},
	["historical"] = {"FORMER"},
	["medieval"] = {"ANCIENT", "FORMER"},
	["mediaeval"] = {"ANCIENT", "FORMER"},
	["ruined"] = {"ANCIENT", "FORMER"},
	["traditional"] = {"FORMER"},
}

--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
	["fictional"] = "fictional location",
	["legendary"] = "mythological location",
	["mythical"] = "mythological location",
	["mythological"] = "mythological location",
	-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
	-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
	-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
	["claimed"] = "claimed political division",
}

--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
	["autonomous community"] = "autonomous community capitals",
	["canton"] = "cantonal capitals",
	["comarca"] = "comarca capitals",
	["ประเทศ"] = "national capitals",
	-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals", ["department"] = "departmental capitals", ["อำเภอ"] = "district capitals", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["metropolitan city"] = "metropolitan city capitals", ["เทศบาล"] = "municipal capitals", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "provincial capitals", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "regional capitals", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "state capitals", ["ดินแดน"] = "territorial capitals", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
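
For illustration, a sketch of a direct lookup against this table. The code that appends the implied holonyms is not
part of this section, so how the result is consumed is an assumption; only the table access itself is shown.
	local implied = export.cat_implications["ภูมิภาค"]["Siberia"]
	-- implied[1] == "country/Russia", implied[2] == "continent/Asia"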
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/Europe"}, ["Central Europe"] = {"continent/Europe"}, ["Western Europe"] = {"continent/Europe"}, ["South Europe"] = {"continent/Europe"}, ["Southern Europe"] = {"continent/Europe"}, ["Northern Europe"] = {"continent/Europe"}, ["Northeast Europe"] = {"continent/Europe"}, ["Northeastern Europe"] = {"continent/Europe"}, ["Southeast Europe"] = {"continent/Europe"}, ["Southeastern Europe"] = {"continent/Europe"}, ["North Caucasus"] = {"continent/Europe"}, ["South Caucasus"] = {"continent/Asia"}, ["South Asia"] = {"continent/Asia"}, ["Southern Asia"] = {"continent/Asia"}, ["East Asia"] = {"continent/Asia"}, ["Eastern Asia"] = {"continent/Asia"}, ["Central Asia"] = {"continent/Asia"}, ["West Asia"] = {"continent/Asia"}, ["Western Asia"] = {"continent/Asia"}, ["Southeast Asia"] = {"continent/Asia"}, ["North Asia"] = {"continent/Asia"}, ["Northern Asia"] = {"continent/Asia"}, ["Anatolia"] = {"continent/Asia"}, ["Asia Minor"] = {"continent/Asia"}, ["Mesopotamia"] = {"continent/Asia"}, ["North Africa"] = {"continent/Africa"}, ["Central Africa"] = {"continent/Africa"}, ["West Africa"] = {"continent/Africa"}, ["East Africa"] = {"continent/Africa"}, ["Southern Africa"] = {"continent/Africa"}, ["Central America"] = {"continent/Central America"}, ["Caribbean"] = {"continent/North America"}, ["Polynesia"] = {"continent/Oceania"}, ["Micronesia"] = {"continent/Oceania"}, ["Melanesia"] = {"continent/Oceania"}, ["Siberia"] = {"country/Russia", "continent/Asia"}, ["Russian Far East"] = {"country/Russia", "continent/Asia"}, ["South Wales"] = {"constituent country/Wales", "continent/Europe"}, ["Balkans"] = {"continent/Europe"}, ["West Bank"] = {"country/Palestine", "continent/Asia"}, ["Gaza"] = {"country/Palestine", "continent/Asia"}, ["Gaza Strip"] = {"country/Palestine", "continent/Asia"}, } } ------------------------------------------------------------------------------------------ -- Category and display 
handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. 
local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
local function insert_specific_variant_if_possible(key, spec) return export.get_equiv_placetype_prop(holonym_placetype, function(pt) local plural_holonym_placetype = export.pluralize_placetype(pt) local saw_matching_div if spec.divs then local divs = spec.divs if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) ~= "table" then div = {type = div} end if plural_holonym_placetype == div.type then saw_matching_div = true break end end end if saw_matching_div then insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec))) return true end return false end) end if insert_specific_variant_if_possible(matching_key, matching_spec) then inserted_specific_variant_cat = true elseif not matching_spec.no_container_cat then for _, containers in ipairs(matching_container_trail) do local saw_no_container_cat = false for _, container in ipairs(containers) do if insert_specific_variant_if_possible(container.key, container.spec) then inserted_specific_variant_cat = true break end saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat end if inserted_specific_variant_cat or saw_no_container_cat then break end end end end else -- This happens when in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for -- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing. -- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to -- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab. -- Possibly we can just skip this case entirely. 
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "เมืองหลวง") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms.
]==]
function export.augment_holonyms_with_container(place_descs)
	for _, place_desc in ipairs(place_descs) do
		if place_desc.holonyms then
			-- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their
			-- appropriate position. We don't just put them at the end because some holonyms use the `:also`
			-- modifier, which causes category processing to restart at that point after generating categories for a
			-- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with
			-- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy
			-- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's
			-- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g.
			-- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the
			-- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If
			-- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude
			-- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`),
			-- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar
			-- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration
			-- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
									-- This is to correctly handle a situation like
									-- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here,
									-- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and
									-- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from
									-- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to
									-- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match
									-- in find_matching_holonym_location() because of the mismatch between augmented
									-- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later
									-- calls to find_matching_holonym_location() fail to match `Gard` (and likewise
									-- `Ardèche`) against any known location. To deal with this, we mark augmented
									-- holonyms as being augmented due to a source holonym, and when processing a given
									-- holonym, ignore augmented holonyms from other holonyms of the same placetype.
									-- The restriction to the same placetype is so that `Birmingham` still gets
									-- correctly disambiguated to Birmingham, England in the example given above near
									-- the top of this function, using the augmented holonym `c/United Kingdom` added by
									-- the specified `cc/England` (whose placetype `constituent country` differs from
									-- the placetype `city` of Birmingham).
									augmented_from_holonym = holonym,
								}
								insert(augmented_holonyms, new_holonym)
								-- But it is safe to modify other parts of the place_desc.
								export.key_holonym_into_place_desc(place_desc, new_holonym)
							end
							if any_no_auto_augment_container then
								break
							end
						end
						for j = i + 1, #place_desc.holonyms do
							insert(augmented_holonyms, place_desc.holonyms[j])
						end
						place_desc.holonyms = augmented_holonyms
					end
				end
			end
		end
	end
end

-- Cat handler for district, areas, neighborhoods and suburbs. Districts are tricky because they can either be political
-- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city
-- neighborhoods or larger geographical areas/regions.
-- We handle this as follows:
-- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if
--     the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that
--     categorizes into [[:Category:Districts of Maharashtra, India]].
-- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called
--     using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g.
--     [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.)
-- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set.
--     If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if
--     we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize
--     town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and
--     note the spelling "neighborhoods" because we're in the US.)
-- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're
--     called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or
--     (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no
--     categorization.
local function district_neighborhood_cat_handler(data)
	local function get_plural_entry_placetype(location_spec, container_trail)
		if data.entry_placetype == "suburb" then
			return "Suburbs"
		else
			-- Check for `british_spelling` setting on the spec itself or any container.
			local uses_british_spelling = location_spec.british_spelling
			if uses_british_spelling == nil and container_trail then
				for _, container_set in ipairs(container_trail) do
					local must_outer_break = false
					for _, container in ipairs(container_set) do
						if container.spec.british_spelling ~= nil then
							uses_british_spelling = container.spec.british_spelling
							must_outer_break = true
							break
						end
					end
					if must_outer_break then
						break
					end
				end
			end
			return uses_british_spelling and "Neighbourhoods" or "Neighborhoods"
		end
	end

	-- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire,
	-- etc.)
	local group, key, spec, container_trail = export.find_matching_holonym_location(data)
	if group and not spec.is_former_place and spec.is_city then
		return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)}
	end

	-- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like
	-- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.)
	local has_neighborhoods
	local entry_placetype = data.entry_placetype
	if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then
		has_neighborhoods = true
	else
		-- Otherwise, make sure the current holonym is city-like.
		has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt)
			return export.get_placetype_prop(pt, "has_neighborhoods")
		end, {continue_on_nil_only = true})
	end
	if has_neighborhoods then
		-- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written
		-- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}}
		-- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV".
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
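-- Illustrative sketch only (hypothetical inputs, traced by hand from the function below; assumes that
-- m_links.remove_links() strips the double brackets and that ulower() lowercases its argument):
--   prefix_display_handler("County", "Cork")         --> "County [[Cork]]" (raw name gets linked)
--   prefix_display_handler("County", "[[Cork]]")     --> "County [[Cork]]" (existing link is kept, prefix prepended)
--   prefix_display_handler("County", "County Cork")  --> "County Cork" (prefix already present, returned unchanged)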
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings)
	if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then
		return holonym_placename
	end
	if holonym_placename:find("%[%[") then
		return prefix .. " " .. holonym_placename
	end
	return prefix .. " [[" .. holonym_placename .. "]]"
end

-- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms.
-- Works identically to prefix_display_handler but for suffixes instead of prefixes.
local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link)
	if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then
		return holonym_placename
	end
	if holonym_placename:find("%[%[") then
		return holonym_placename .. " " .. suffix
	end
	if include_suffix_in_link then
		return "[[" .. holonym_placename .. " " .. suffix .. "]]"
	else
		return "[[" .. holonym_placename .. "]] " .. suffix
	end
end

-- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed
-- with "borough".
local function borough_display_handler(holonym_placetype, holonym_placename)
	local unlinked_placename = m_links.remove_links(holonym_placename)
	if m_locations.new_york_boroughs[unlinked_placename] then
		-- Hack: don't display "borough" after the names of NYC boroughs
		return holonym_placename
	end
	return suffix_display_handler("borough", holonym_placename)
end

local function county_display_handler(holonym_placetype, holonym_placename)
	local unlinked_placename = m_links.remove_links(holonym_placename)
	-- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]".
	if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or
		m_locations.northern_ireland_counties["County " .. unlinked_placename ..
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are display as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of
specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype
in question occurs as an entry placetype). Categorization keys are either the special value `default` or are wildcard
strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in
the placetype data; everything else is handled through category handlers, either per-placetype or special (such as
`political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is
described at the top of [[Module:place]].

There are several recognized property keys, of various types:

1. The following link-related property keys are recognized:
* `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the
  placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized
  placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in
  categories). The possible values are:
*# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is
   converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a
   two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`.
*# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g.
   `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the
   placetype is given plural.
*# `"+..."`: Create a two-part link to the entry following the `+` sign.
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city.
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
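
The affix properties in item 6 above can be illustrated with a minimal sketch. This is not the actual code in [[Module:place]]: the helper names `normalize`, `contains_whole_word`, `capitalize_words` and `format_holonym` are hypothetical, the two-entry `placetype_data` table is a cut-down stand-in, and the optional `the` before prefixed holonyms is left out. It uses the plain Lua string library, so it assumes ASCII-only placenames.

```lua
-- Hypothetical, simplified sketch of affix handling; NOT the real Module:place code.
local placetype_data = {
	["autonomous okrug"] = {affix_type = "Suf", no_affix_strings = "okrug"},
	["canton"] = {affix_type = "suf"},
}

-- Lowercase `s`, turn runs of non-alphanumerics into single spaces, and pad with spaces.
local function normalize(s)
	local spaced = s:lower():gsub("[^%w]+", " ")
	return " " .. spaced .. " "
end

-- Case-insensitive, whole-word search for `word` inside `text`.
local function contains_whole_word(text, word)
	return normalize(text):find(normalize(word), 1, true) ~= nil
end

-- Capitalize the first letter of each word.
local function capitalize_words(s)
	return (s:gsub("(%a)([%w']*)", function(first, rest)
		return first:upper() .. rest
	end))
end

-- Return the display form of a holonym, adding the affix requested by `affix_type`
-- unless one of the `no_affix_strings` already occurs in the placename.
local function format_holonym(placetype, placename)
	local props = placetype_data[placetype] or {}
	local affix_type = props.affix_type
	if not affix_type then
		return placename
	end
	-- `no_affix_strings` may be a string or a list; it defaults to the placetype itself.
	local no_affix = props.no_affix_strings or placetype
	if type(no_affix) == "string" then
		no_affix = {no_affix}
	end
	for _, s in ipairs(no_affix) do
		if contains_whole_word(placename, s) then
			return placename
		end
	end
	local is_suffix = affix_type:lower() == "suf"
	-- `suffix`/`prefix` take precedence over the generic `affix`.
	local affix
	if is_suffix then
		affix = props.suffix or props.affix or placetype
	else
		affix = props.prefix or props.affix or placetype
	end
	if affix_type == "Suf" or affix_type == "Pref" then
		affix = capitalize_words(affix)
	end
	if is_suffix then
		return placename .. " " .. affix
	end
	return affix .. " of " .. placename
end

print(format_holonym("autonomous okrug", "Nenets"))       --> Nenets Autonomous Okrug
print(format_holonym("autonomous okrug", "Nenets Okrug")) --> Nenets Okrug
print(format_holonym("canton", "Valais"))                 --> Valais canton
```

The whole-word check is done by padding both strings with spaces, so `okrug` suppresses the affix in `Nenets Okrug` but would not match inside a longer word.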
'''NOTE:''' # The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified. # Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype. # It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs. ]==] export.placetype_data = { --[=[ If you need to sort the following, do this (using Vim): 1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries. 2. Make sure the table uses tabs everywhere for indent, and not spaces. 3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence: :'a,.s/\n/\\n/g :s/\\n\(\t\[\)/\r\1/g The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.) 4. Execute the following to sort: :'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? 
@@@ //'
	Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station" before "hill" and "county borough" before "county" because the space after e.g. "hill station" sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern).
	5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
		:'a,.s/\\n/\r/g
	Note that for some reason, to match a newline on the left side of a replacement you must use \n, but to insert a newline on the right side of a replacement you must use \r.
	]=]
	["*"] = {
		link = false,
		cat_handler = generic_place_cat_handler,
	},
	["administrative atoll"] = {
		-- Maldives
		link = "+w:administrative divisions of the Maldives",
		preposition = "ของ",
		class = "subpolity",
	},
	["administrative capital"] = {
		link = "w",
		fallback = "capital city",
	},
	["administrative center"] = {
		link = "w",
		fallback = "non-city capital",
	},
	["administrative centre"] = {
		link = "w",
		fallback = "administrative center",
	},
	["administrative county"] = {
		link = "w",
		fallback = "เทศมณฑล",
	},
	["administrative district"] = {
		link = "w",
		fallback = "อำเภอ",
	},
	["administrative headquarters"] = {
		link = "separately",
		fallback = "administrative centre",
	},
	["administrative region"] = {
		link = true,
		preposition = "ของ",
		suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
		fallback = "ภูมิภาค",
		class = "subpolity",
	},
	["administrative seat"] = {
		link = "w",
		fallback = "administrative centre",
	},
	["administrative territory"] = {
		link = "separately",
		preposition = "ของ",
		suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
		fallback = "ดินแดน",
		class = "subpolity",
	},
	["administrative unit"] = {
		-- Grrr, it's difficult to generalize about "administrative units".
		-- In Albania, "administrative unit" is an
		-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
		-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
		-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
		-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
		-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
		link = "w",
		class = "subpolity",
	},
	["administrative village"] = {
		link = "w",
		preposition = "ของ",
		has_neighborhoods = true,
		class = "settlement",
	},
	["aimag"] = {
		-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
		-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
		link = "w",
		fallback = "prefecture",
	},
	["airport"] = {
		link = true,
		class = "man-made structure",
		default = {true},
	},
	["alliance"] = {
		link = true,
		fallback = "confederation",
	},
	["archipelago"] = {
		link = true,
		fallback = "เกาะ",
	},
	["area"] = {
		link = true,
		preposition = "ของ",
		fallback = "geographic and cultural area",
		-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
		-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
		class = "subpolity",
		former_type = "geographic region",
		cat_handler = district_neighborhood_cat_handler,
	},
	["arm"] = {
		link = true,
		preposition = "ของ",
		class = "natural feature",
		default = {"ทะเล"},
	},
	["arrondissement"] = {
		link = true,
		preposition = "ของ",
		-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
		-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "capital city", }, ["capital city"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "capital city", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"}, default = {"City-states", "นคร", "ประเทศ", "National capitals"}, }, ["civil parish"] = 
{ -- Mostly England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "capital city", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
category_link = "[[county|counties]] and [[county-level city|county-level cities]]", class = "subpolity", }, ["continent"] = { link = true, category_link = false, -- can't occur as a bare category class = "natural feature", default = {"Continents and continental regions"}, }, ["continental region"] = { link = "separately", category_link = false, -- can't occur as a bare category class = "geographic region", fallback = "continent", }, ["continents and continental regions!"] = { category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])", class = "geographic region", }, ["council area"] = { link = true, -- in Scotland; similar to a county preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["ประเทศ"] = { link = true, class = "polity", -- do not translate `class` ["continent/*"] = {true, "ประเทศ"}, default = {true}, }, ["country-like entities!"] = { category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]", class = "polity", -- do not translate `class` }, ["เทศมณฑล"] = { link = true, preposition = "ของ", display_handler = county_display_handler, class = "subpolity", }, ["county borough"] = { link = true, -- in Wales; similar to a county preposition = "ของ", affix_type = "suf", fallback = "borough", class = "subpolity", }, ["county seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["county town"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", fallback = "เมือง", has_neighborhoods = true, class = "capital", }, ["county-administered city"] = { -- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city. -- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city. 
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "capital city", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "capital city", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "capital city", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "capital city", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "capital city", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["metropolitan city"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"metropolitan", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are province-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "capital city", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["non-city capital"] = { link = "[[capital]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", -- do not translate `class` bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "capital city", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "capital city", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "capital city", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since สถานที่ในHong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "capital city", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", --ห้ามแปล class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "capitals", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nicknames]] for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
		-- the United States
		full_category_link = "[[official]] [[nicknames]] for [[state]]s",
		bare_category_breadcrumb = "official",
		bare_category_parent = "nicknames for states",
		addl_bare_category_parents = {"รัฐ"},
	},
	["OFFICIAL_NICKNAME_FOR place"] = {
		link = false,
		default = {"Official nicknames for places"},
	},
	["OFFICIAL_NICKNAME_FOR state"] = {
		link = false,
		default = {"Official nicknames for states"},
	},
}

export.plural_placetype_to_singular = {}
for sg_placetype, spec in pairs(export.placetype_data) do
	if spec.plural then
		export.plural_placetype_to_singular[spec.plural] = sg_placetype
	end
end

return export
mnlfks1x4rjaf9dt88kmmznb0uvx2cd 5720689 5720688 2026-04-21T01:29:08Z OctraBot 3198 5720689 Scribunto text/plain
local export = {}

export.force_cat = false -- set to true for testing

local m_locations = require("Module:place/locations")
local m_links = require("Module:links")
local m_table = require("Module:table")
local m_strutils = require("Module:string utilities")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"

local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local internal_error = m_locations.internal_error
export.internal_error = internal_error
local process_error = m_locations.process_error
export.process_error = process_error
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local ucfirst = m_strutils.ucfirst
local ulower = m_strutils.lower
local rmatch = m_strutils.match
local split = m_strutils.split

--[==[ intro:
This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code
to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to
[[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You
must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}.
In particular, it contains two fundamental and tricky functions:
# `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in
  the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising"
  operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and
  fallbacks.
# `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process
  checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the
  containers of the known location being considered. This is done to prevent overcategorizing when either there are two
  known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally
  two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing
  non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico).

Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a
result are candidates for memoization to speed up the operation of {{tl|place}}.
]==]

------------------------------------------------------------------------------------------
--                                   Basic utilities                                    --
------------------------------------------------------------------------------------------

--[==[
Return true if `force_cat` is set either in this module or in [[Module:place/locations]].
]==]
function export.get_force_cat()
	return export.force_cat or m_locations.force_cat
end

-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
	require(debug_track_module)("place/" ..
		page)
	return true
end

function export.remove_links_and_html(text)
	text = m_links.remove_links(text)
	-- Parenthesize so that only the string is returned, not gsub()'s second return value (the substitution count).
	return (text:gsub("<.->", ""))
end

--[==[
Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with
irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values
specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in
[[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x,
and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider
changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is
returned.
]==]
function export.maybe_singularize_placetype(placetype)
	if not placetype then
		return nil
	end
	if export.plural_placetype_to_singular[placetype] then
		return export.plural_placetype_to_singular[placetype]
	end
	-- NOTE: The call to singularize() in [[Module:en-utilities]] is commented out in this copy, so `retval` always
	-- equals `placetype` and nil is returned for anything not listed in `plural_placetype_to_singular`.
	local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype
	if retval == placetype then
		return nil
	end
	return retval
end

-- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first
-- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost
-- always correct.
function export.pluralize_placetype(placetype, do_ucfirst)
	local ptdata = export.placetype_data[placetype]
	if ptdata and ptdata.plural then
		placetype = ptdata.plural
	else
		-- NOTE: The call to pluralize() in [[Module:en-utilities]] is commented out in this copy, so a placetype
		-- without an explicit `plural` is returned unchanged.
		placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype
	end
	if do_ucfirst then
		return ucfirst(placetype)
	else
		return placetype
	end
end

--[==[
Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified,
we also look for category-only placetypes (generally plural) followed by `!`.
Return three values: (a) the placetype under which the data can be looked up (i.e. in its singular form if the passed-in
`placetype` is plural and did not match a category-only placetype followed by `!`); (b) the placetype data structure;
(c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the
passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if the
`ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata`
was looked up under the singularized version of the plural passed-in `placetype`.
]==]
function export.get_placetype_data(placetype, from_category)
	local ptdata = export.placetype_data[placetype]
	if ptdata then
		return placetype, ptdata, "direct"
	end
	if from_category then
		ptdata = export.placetype_data[placetype .. "!"]
		if ptdata then
			return placetype .. "!", ptdata, "direct-category"
		end
	end
	local sg_placetype = export.maybe_singularize_placetype(placetype)
	if sg_placetype then
		ptdata = export.placetype_data[sg_placetype]
		if ptdata then
			return sg_placetype, ptdata, "plural"
		end
	end
	return nil
end

--[==[
Check for special pseudo-placetypes that should be ignored for categorization purposes.
]==]
function export.placetype_is_ignorable(placetype)
	return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or
		placetype:find("^%(")
end

function export.resolve_placetype_aliases(placetype)
	return export.placetype_aliases[placetype] or placetype
end

--[==[
Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the
key isn't found in the placetype's entry in `placetype_data`, return nil.
]==]
function export.get_placetype_prop(placetype, key)
	-- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype
	-- aliases have been resolved, but sometimes not, e.g.
when fetching the indefinite article in -- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice. placetype = export.resolve_placetype_aliases(placetype) if export.placetype_data[placetype] then return export.placetype_data[placetype][key] else return nil end end --[==[ Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list { {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e. # the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are zero such qualifiers, the value will be nil); # a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil); # the "reduced placetype" on the right. Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified. For example, given the placetype `"small beachside unincorporated community"`, the return value will be { { {nil, nil, "small beachside unincorporated community"}, {nil, "small", "beachside unincorporated community"}, {"small", "[[beachfront]]", "unincorporated community"}, {"small [[beachfront]]", "[[unincorporated]]", "community"}, }} Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to `"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`. 
On the other hand, if given `"small former haunted community"`, the return value will be { { {nil, nil, "small former haunted community"}, {nil, "small", "former haunted community"}, {"small", "former", "haunted community"}, }} because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers. Finally, if given `"former adr"`, the return value will be { { {nil, nil, "former adr"}, {nil, "former", "administrative region"}, }} because `"adr"` is a recognized placetype alias for `"administrative region"`. ]==] function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers) local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} local prev_qualifier = nil while true do local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if canon == nil then break end local new_qualifier = qualifier if type(canon) == "table" then canon = canon.link end if not no_canon_qualifiers and canon ~= false then if canon == true then new_qualifier = "[[" .. qualifier .. "]]" else new_qualifier = canon end end insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)}) prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier placetype = reduced_placetype else break end end return splits end --[==[ Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the words in the passed-in placetype (always including the rightmost word in the placetype, i.e. 
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`. '''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`. For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn: ``` {qualifier = nil, placetype="left tributary"} {qualifier = "left", placetype="tributary"} {qualifier = "left", placetype="แม่น้ำ"} ``` and the return value will be { { {qualifier = "left", placetype="tributary"}, {qualifier = "left", placetype="แม่น้ำ"}, }} The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it). 
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into
the list because it is found in `placetype_data`. Then, because it has a fallback `river`, which exists in
`placetype_data`, the fallback is entered next.

Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or
municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as
{ {
  {qualifier = "small [[rural]]", placetype="frazione"},
  {qualifier = "small [[rural]]", placetype="hamlet"},
  {qualifier = "small [[rural]]", placetype="village"},
}}

Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that
singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they
are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`.
Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up
being returned.

`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is
used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes
such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
		local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
		if ptdata then
			insert_equiv(canon_placetype)
			if no_fallback then
				return
			end
			local first_placetype = #equivs + 1
			local prev_placetype = nil
			while true do
				local pt_value = export.placetype_data[canon_placetype]
				if not pt_value then
					internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
						canon_placetype, prev_placetype)
				end
				if pt_value.fallback then
					insert_equiv(pt_value.fallback)
					local last_placetype = #equivs
					if last_placetype - first_placetype >= 10 then
						local fallback_loop = {}
						for i = first_placetype, last_placetype do
							insert(fallback_loop, equivs[i].placetype)
						end
						internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
					end
					prev_placetype = canon_placetype
					canon_placetype = pt_value.fallback
				else
					break
				end
			end
		end
	end

	-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
	-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
	-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
	-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
	-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
	-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
	-- than in each invocation of `process_and_insert_placetype()`.
	local function process_and_insert_placetype(qualifier, reduced_placetype)
		if form_of_directive then
			-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
			-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks.
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
		local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
		process_and_insert_placetype(qualifier, reduced_placetype)

		-- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced
		-- placetypes to avoid the "overseas territory treated as a territory" issue described above.
		if no_fallback then
			local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category)
			if canon_placetype then
				break
			end
		end
	end

	-- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g.
	-- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype
	-- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g.
	-- [[Category:en:Former names of places]] in an invocation like
	-- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}};
	-- the `used from 1971–1997` gets treated as a placetype and we're called on it.
	if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then
		insert_placetype_and_fallbacks(nil, form_of_directive .. " place")
	end

	return equivs
end

function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only)
	for _, equiv in ipairs(equivs) do
		local retval = fun(equiv.placetype)
		if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then
			return retval, equiv
		end
	end
	return nil, nil
end

--[==[
Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent
placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false});
but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value.
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.)

When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by
`fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a
non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values.

If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with
{nil} for the second return value.
]==]
function export.get_equiv_placetype_prop(placetype, fun, props)
	if not placetype then
		return fun(nil), nil
	end
	return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun,
		props and props.continue_on_nil_only)
end

--[==[
Return the article that is used with an entry placetype. We proceed as follows:
# See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article).
  This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`.
# Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that
  `"the"` should be used.
# Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from
  the beginning) has an entry in `placetype_data` that specifies the indefinite article using
  `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`).
# Otherwise, the intended behavior is to use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"`
  for words beginning with a vowel and `"a"` otherwise. In this module that call is currently commented out, so the
  article falls back to the empty string.

If `ucfirst` is true, the first letter of the article is made upper-case.
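
The lookup order above can be illustrated with a minimal standalone sketch. Everything in it is an assumption made for
illustration: `sketch_get_article` and the two small tables are hypothetical stand-ins, not the module's real data; the
search over equivalent placetypes is omitted; and the last step uses a plain English vowel check in place of
[[Module:en-utilities]].

```lua
-- Hypothetical stand-ins for the module's tables; the real ones are far larger.
local placetype_qualifiers = {
	various = {article = false}, -- a qualifier that suppresses the article
}
local placetype_data = {
	["capital city"] = {entry_placetype_use_the = true},
	["union territory"] = {entry_placetype_indefinite_article = "a"},
}

-- Simplified article lookup (assumption: no equivalent-placetype search).
local function sketch_get_article(placetype, ucfirst)
	local art
	-- Step 1: a leading qualifier may dictate the article, including `false`.
	local qualifier = placetype:match("^(.-) .*$")
	local canon = qualifier and placetype_qualifiers[qualifier]
	if type(canon) == "table" then
		art = canon.article
	end
	if art == false then
		return false
	end
	if art == nil then
		local data = placetype_data[placetype] or {}
		if data.entry_placetype_use_the then
			-- Step 2: the placetype asks for "the".
			art = "the"
		else
			-- Step 3: explicit indefinite article; step 4: naive a/an fallback.
			art = data.entry_placetype_indefinite_article
				or (placetype:find("^[aeiouAEIOU]") and "an" or "a")
		end
	end
	if ucfirst then
		art = art:sub(1, 1):upper() .. art:sub(2)
	end
	return art
end

print(sketch_get_article("various capitals"))   -- false
print(sketch_get_article("capital city", true)) -- The
print(sketch_get_article("union territory"))    -- a
print(sketch_get_article("island"))             -- an
```

The `union territory` entry shows why step 3 exists: a vowel-letter check alone would produce `"an"`.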
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
if category_type == "top-level" then --ห้ามแปล raw_link, fetched_full = fetch_maybe_full("category_link_top_level") elseif category_type == "noncity" then --ห้ามแปล raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity") elseif category_type == "city" then --ห้ามแปล raw_link, fetched_full = fetch_maybe_full("category_link_before_city") else internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', --ห้ามแปล category_type) end if type(raw_link) == "string" then return maybe_prefix(raw_link), ptdata elseif raw_link ~= nil then return raw_link, ptdata end raw_link, fetched_full = fetch_maybe_full("category_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end if ptmatch == "plural" then raw_link, fetched_full = fetch_maybe_full("plural_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end end if raw_link == nil then raw_link, fetched_full = fetch_maybe_full("link") end if raw_link == false then return raw_link, ptdata end return maybe_prefix(make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata else if ptmatch == "plural" then raw_link = ptdata.plural_link if raw_link == false then process_error("Placetype %s cannot appear plural", placetype) end if is_linked_string(raw_link) then return raw_link, ptdata end end if raw_link == nil then raw_link = ptdata.link end return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata end end return nil end local function resolve_unlinked_placename_display_aliases(placetype, placename) local equiv_placetypes = export.get_placetype_equivs(placetype) for i, equiv in ipairs(equiv_placetypes) do equiv_placetypes[i] = equiv.placetype end 
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
											this_holonym_matches = true
											break
										end
									end
									if this_holonym_matches then
										-- Alias matched above when resolved.
										holonym_matches_at_level = true
										break
									else
										-- Not an alias, or doesn't match when resolved. We have a true mismatch.
										holonym_exists_with_same_placetype = true
									end
								end
							end
						end
						if holonym_matches_at_level then
							match_at_level = true
							break
						end
						if holonym_exists_with_same_placetype then
							mismatch_at_level = true
						end
					end
				end
				if not match_at_level and mismatch_at_level then
					containers_mismatch = true
					break
				end
			end
			if not containers_mismatch then
				return group, key, spec, container_trail
			end
		end
	end
end

--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return
the corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
	local all_found = {}
	for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
		insert(all_found, {group, key, spec, container_trail})
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		local holonym_placetype = data.holonym_placetype
		if type(holonym_placetype) == "table" then
			holonym_placetype = concat(holonym_placetype, ",")
		end
		local found_keys = {}
		for _, found in ipairs(all_found) do
			local _, key, _, _ = unpack(found)
			insert(found_keys, key)
		end
		error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "metropolitan city", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan 
county"] = "non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true,
	["chalk"] = true,
	["karst"] = true,
	["limestone"] = true,
	["mountainous"] = true,
	["mountaintop"] = true,
	["alpine"] = true,
	["volcanic"] = true, -- for an island
	-- political status qualifiers
	["autonomous"] = true,
	["incorporated"] = true,
	["special"] = true,
	["unincorporated"] = true,
	["coterminous"] = true,
	-- monetary status/etc. qualifiers
	["fashionable"] = true,
	["wealthy"] = true,
	["affluent"] = true,
	["declining"] = true,
	-- city vs. rural qualifiers
	["urban"] = true,
	["suburban"] = true,
	["exurban"] = true,
	["outlying"] = true,
	["remote"] = true,
	["rural"] = true,
	["outback"] = true,
	["inner"] = false,
	["inner-city"] = true,
	["central"] = false,
	["outer"] = false,
	-- land use qualifiers
	["residential"] = true,
	["agricultural"] = true,
	["business"] = true,
	["commercial"] = true,
	["industrial"] = true,
	-- business use qualifiers
	["railroad"] = true,
	["railway"] = true,
	["farming"] = true,
	["fishing"] = true,
	["mining"] = true,
	["logging"] = true,
	["cattle"] = true,
	-- tourism use qualifiers
	["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
	["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
	["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
	-- religious qualifiers
	["holy"] = true,
	["sacred"] = true,
	["religious"] = true,
	["secular"] = true,
	-- qualifiers for nonexistent places
	["claimed"] = false,
	["fictional"] = true,
	["legendary"] = true,
	["mythical"] = true,
	["mythological"] = true,
	-- directional qualifiers
	["northern"] = false,
	["southern"] = false,
	["eastern"] = false,
	["western"] = false,
	["north"] = false,
	["south"] = false,
	["east"] = false,
	["west"] = false,
	["northeastern"] = false,
	["southeastern"] = false,
	["northwestern"] = false,
	["southwestern"] = false,
	["northeast"] = false,
	["southeast"] = false,
	["northwest"] = false,
	["southwest"] = false,
	-- seasonal qualifiers
	["summer"] = true, -- e.g.
for 'summer capital' ["winter"] = true, -- legal status qualifiers -- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]]. ["official"] = true, ["unofficial"] = true, ["de facto"] = true, -- 'de facto capital' ["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link ["de jure"] = true, -- 'de jure capital' ["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link -- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state' -- misc. qualifiers ["planned"] = true, ["chartered"] = true, ["landlocked"] = true, ["uninhabited"] = true, -- superlative qualifiers ["first"] = no_link_def_article, ["second"] = no_link_def_article, -- for "second largest" etc. ["third"] = no_link_def_article, ["fourth"] = no_link_def_article, ["last"] = no_link_def_article, ["only"] = no_link_def_article, ["sole"] = no_link_def_article, ["main"] = no_link_def_article, ["largest"] = no_link_def_article, ["biggest"] = no_link_def_article, ["smallest"] = no_link_def_article, ["shortest"] = no_link_def_article, ["longest"] = no_link_def_article, ["tallest"] = no_link_def_article, ["highest"] = no_link_def_article, ["lowest"] = no_link_def_article, ["leftmost"] = no_link_def_article, ["rightmost"] = no_link_def_article, ["innermost"] = no_link_def_article, ["outermost"] = no_link_def_article, ["northernmost"] = no_link_def_article, ["southernmost"] = no_link_def_article, ["westernmost"] = no_link_def_article, ["easternmost"] = no_link_def_article, ["northwesternmost"] = no_link_def_article, ["southwesternmost"] = no_link_def_article, ["northeasternmost"] = no_link_def_article, ["southeasternmost"] = no_link_def_article, -- several/various ["several"] = no_link_no_article, ["various"] = no_link_no_article, ["numerous"] = no_link_no_article, ["multiple"] = no_link_no_article, ["many"] = no_link_no_article, ["other"] = no_link_no_article, } --[==[ var: In this table, the key qualifiers should 
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and
`qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
	["abandoned"] = {"FORMER"},
	["ancient"] = {"ANCIENT", "FORMER"},
	["former"] = {"FORMER"},
	["extinct"] = {"FORMER"},
	["historic"] = {"FORMER"},
	["historical"] = {"FORMER"},
	["medieval"] = {"ANCIENT", "FORMER"},
	["mediaeval"] = {"ANCIENT", "FORMER"},
	["ruined"] = {"ANCIENT", "FORMER"},
	["traditional"] = {"FORMER"},
}

--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
	["fictional"] = "fictional location",
	["legendary"] = "mythological location",
	["mythical"] = "mythological location",
	["mythological"] = "mythological location",
	-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
	-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
	-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
	["claimed"] = "claimed political division",
}

--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
	["autonomous community"] = "autonomous community capitals",
	["canton"] = "cantonal capitals",
	["comarca"] = "comarca capitals",
	["ประเทศ"] = "national capitals",
	-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "county capitals", ["department"] = "departmental capitals", ["อำเภอ"] = "district capitals", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["metropolitan city"] = "metropolitan city capitals", ["เทศบาล"] = "municipal capitals", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "provincial capitals", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "regional capitals", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "state capitals", ["ดินแดน"] = "territorial capitals", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
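A minimal sketch of the behaviour described above, not the module's actual implementation. The function name
`add_implied_holonyms`, the `implied` marker field and the shape of the holonym objects are assumptions made for
illustration; only the table layout (placetype, then placename, then a list of `placetype/placename` strings) is taken
from the table below.

```lua
-- Hypothetical excerpt with the same shape as the implications table.
local implications = {
	["ภูมิภาค"] = {
		["Siberia"] = {"country/Russia", "continent/Asia"},
	},
}

-- Return a copy of `holonyms` with any implied holonyms appended at the end.
-- The originals are left untouched, so display code would still see only what the user wrote.
local function add_implied_holonyms(holonyms, implication_table)
	local result = {}
	for _, holonym in ipairs(holonyms) do
		result[#result + 1] = holonym
	end
	for _, holonym in ipairs(holonyms) do
		local by_name = implication_table[holonym.placetype]
		local implied = by_name and by_name[holonym.placename]
		if implied then
			for _, spec in ipairs(implied) do
				-- Split "placetype/placename" at the first slash.
				local placetype, placename = spec:match("^(.-)/(.*)$")
				result[#result + 1] = {placetype = placetype, placename = placename, implied = true}
			end
		end
	end
	return result
end
```

Appending to a copy rather than to the original list is one way to keep the implied holonyms out of the displayed
text; the real module may achieve this differently.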
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/Europe"}, ["Central Europe"] = {"continent/Europe"}, ["Western Europe"] = {"continent/Europe"}, ["South Europe"] = {"continent/Europe"}, ["Southern Europe"] = {"continent/Europe"}, ["Northern Europe"] = {"continent/Europe"}, ["Northeast Europe"] = {"continent/Europe"}, ["Northeastern Europe"] = {"continent/Europe"}, ["Southeast Europe"] = {"continent/Europe"}, ["Southeastern Europe"] = {"continent/Europe"}, ["North Caucasus"] = {"continent/Europe"}, ["South Caucasus"] = {"continent/Asia"}, ["South Asia"] = {"continent/Asia"}, ["Southern Asia"] = {"continent/Asia"}, ["East Asia"] = {"continent/Asia"}, ["Eastern Asia"] = {"continent/Asia"}, ["Central Asia"] = {"continent/Asia"}, ["West Asia"] = {"continent/Asia"}, ["Western Asia"] = {"continent/Asia"}, ["Southeast Asia"] = {"continent/Asia"}, ["North Asia"] = {"continent/Asia"}, ["Northern Asia"] = {"continent/Asia"}, ["Anatolia"] = {"continent/Asia"}, ["Asia Minor"] = {"continent/Asia"}, ["Mesopotamia"] = {"continent/Asia"}, ["North Africa"] = {"continent/Africa"}, ["Central Africa"] = {"continent/Africa"}, ["West Africa"] = {"continent/Africa"}, ["East Africa"] = {"continent/Africa"}, ["Southern Africa"] = {"continent/Africa"}, ["Central America"] = {"continent/Central America"}, ["Caribbean"] = {"continent/North America"}, ["Polynesia"] = {"continent/Oceania"}, ["Micronesia"] = {"continent/Oceania"}, ["Melanesia"] = {"continent/Oceania"}, ["Siberia"] = {"country/Russia", "continent/Asia"}, ["Russian Far East"] = {"country/Russia", "continent/Asia"}, ["South Wales"] = {"constituent country/Wales", "continent/Europe"}, ["Balkans"] = {"continent/Europe"}, ["West Bank"] = {"country/Palestine", "continent/Asia"}, ["Gaza"] = {"country/Palestine", "continent/Asia"}, ["Gaza Strip"] = {"country/Palestine", "continent/Asia"}, } } ------------------------------------------------------------------------------------------ -- Category and display 
handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. 
local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
				local function insert_specific_variant_if_possible(key, spec)
					return export.get_equiv_placetype_prop(holonym_placetype, function(pt)
						local plural_holonym_placetype = export.pluralize_placetype(pt)
						local saw_matching_div
						if spec.divs then
							local divs = spec.divs
							if type(divs) ~= "table" then
								divs = {divs}
							end
							for _, div in ipairs(divs) do
								if type(div) ~= "table" then
									div = {type = div}
								end
								if plural_holonym_placetype == div.type then
									saw_matching_div = true
									break
								end
							end
						end
						if saw_matching_div then
							insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec)))
							return true
						end
						return false
					end)
				end
				if insert_specific_variant_if_possible(matching_key, matching_spec) then
					inserted_specific_variant_cat = true
				elseif not matching_spec.no_container_cat then
					for _, containers in ipairs(matching_container_trail) do
						local saw_no_container_cat = false
						for _, container in ipairs(containers) do
							if insert_specific_variant_if_possible(container.key, container.spec) then
								inserted_specific_variant_cat = true
								break
							end
							saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat
						end
						if inserted_specific_variant_cat or saw_no_container_cat then
							break
						end
					end
				end
			end
		else
			-- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for
			-- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing.
			-- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to
			-- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab.
			-- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "เมืองหลวง") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
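The two uses described above can be sketched as follows. This is an illustration of the documented behaviour only:
`generic_place_categories` and its arguments are hypothetical, the list of keys is assumed to be ordered from most
to least specific, and the real handler builds the name through `export.get_prefixed_key()` rather than plain
concatenation.

```lua
-- `keys` is assumed to run from the most specific location to the least specific one,
-- e.g. {"Merseyside, England", "England"}.
local function generic_place_categories(keys, from_demonym)
	local cats = {}
	for _, key in ipairs(keys) do
		if from_demonym then
			-- Demonyms go into the bare category of the most specific location only.
			cats[#cats + 1] = key
			break
		else
			-- Places go into a "places in ..." category at every level.
			cats[#cats + 1] = "สถานที่ใน" .. key
		end
	end
	return cats
end
```

Called with `{"Merseyside, England", "England"}`, this yields two "places in" categories for a place and a single
bare category for a demonym.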
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
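As a rough sketch of this augmentation step (not the function below, which also handles ambiguity and the `:also`
modifier): the lookup table `known_containers`, the function `augment_with_container` and the `augmented` field are
hypothetical names, and the containment data really lives in [[Module:place/locations]].

```lua
-- Hypothetical containment data keyed by "placetype/placename".
local known_containers = {
	["prefecture/Hokkaido"] = {placetype = "country", placename = "Japan"},
}

-- Return a copy of `holonyms` in which each recognized holonym is followed directly by its container.
local function augment_with_container(holonyms)
	local result = {}
	for _, holonym in ipairs(holonyms) do
		result[#result + 1] = holonym
		local container = known_containers[holonym.placetype .. "/" .. holonym.placename]
		if container then
			result[#result + 1] = {
				placetype = container.placetype,
				placename = container.placename,
				augmented = true, -- marks the holonym as auto-added rather than user-specified
			}
		end
	end
	return result
end
```

The container is inserted right after the holonym it was derived from instead of at the end of the list, so that it
stays associated with that holonym.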
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms. ]==] function export.augment_holonyms_with_container(place_descs) for _, place_desc in ipairs(place_descs) do if place_desc.holonyms then -- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their -- appropriate position. We don't just put them at the end because some holonyms use the `:also` -- modifier, which causes category processing to restart at that point after generating categories for a -- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with -- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy -- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's -- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g. -- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the -- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If -- we go left to right, we will throw an ambiguity error on `city/Birmingham` because the code to exclude -- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`), -- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar -- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration -- rather than modifying the place description once at the end. 
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
This is to correctly handle a situation like -- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here, -- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and -- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from -- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to -- `Bouches-du-Rhône`, `Vaucluse` gets augmented correctly but `Gard` fails to match -- in find_matching_holonym_location() because of the mismatch between augmented -- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later -- calls to find_matching_holonym_location() fail to match `Gard` (and likewise -- `Ardèche`) against any known location. To deal with this, we mark augmented -- holonyms as being augmented due to a source holonym, and when processing a given -- holonym, ignore augmented holonyms from other holonyms of the same placetype. -- The restriction to the same placetype is so that `Birmingham` still gets -- correctly disambiguated to Birmingham, England in the example given above near -- the top of this function, using the augmented holonym `c/United Kingdom` added by -- the specified `cc/England` (whose placetype `constituent country` differs from -- the placetype `city` of Birmingham). augmented_from_holonym = holonym, } insert(augmented_holonyms, new_holonym) -- But it is safe to modify other parts of the place_desc. export.key_holonym_into_place_desc(place_desc, new_holonym) end if any_no_auto_augment_container then break end end for j = i + 1, #place_desc.holonyms do insert(augmented_holonyms, place_desc.holonyms[j]) end place_desc.holonyms = augmented_holonyms end end end end end end -- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political -- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city -- neighborhoods or larger geographical areas/regions. 
We handle this as follows: -- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if -- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that -- categorizes into [[:Category:Districts of Maharashtra, India]]. -- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called -- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g. -- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.) -- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set. -- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if -- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize -- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" ("in"), not "ของ" ("of"), and -- note the spelling "neighborhoods" because we're in the US.) -- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're -- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or -- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no -- categorization. local function district_neighborhood_cat_handler(data) local function get_plural_entry_placetype(location_spec, container_trail) if data.entry_placetype == "suburb" then return "Suburbs" else -- Check for `british_spelling` setting on the spec itself or any container. 
local uses_british_spelling = location_spec.british_spelling if uses_british_spelling == nil and container_trail then for _, container_set in ipairs(container_trail) do local must_outer_break = false for _, container in ipairs(container_set) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end return uses_british_spelling and "Neighbourhoods" or "Neighborhoods" end end -- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire, -- etc.) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and spec.is_city then return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)} end -- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like -- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.) local has_neighborhoods local entry_placetype = data.entry_placetype if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then has_neighborhoods = true else -- Otherwise, make sure the current holonym is city-like. has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt) return export.get_placetype_prop(pt, "has_neighborhoods") end, {continue_on_nil_only = true}) end if has_neighborhoods then -- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written -- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}} -- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". 
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then return holonym_placename end if holonym_placename:find("%[%[") then return prefix .. " " .. holonym_placename end return prefix .. " [[" .. holonym_placename .. "]]" end -- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms. -- Works identically to prefix_display_handler but for suffixes instead of prefixes. local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then return holonym_placename end if holonym_placename:find("%[%[") then return holonym_placename .. " " .. suffix end if include_suffix_in_link then return "[[" .. holonym_placename .. " " .. suffix .. "]]" else return "[[" .. holonym_placename .. "]] " .. suffix end end -- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed -- with "borough". local function borough_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.new_york_boroughs[unlinked_placename] then -- Hack: don't display "borough" after the names of NYC boroughs return holonym_placename end return suffix_display_handler("borough", holonym_placename) end local function county_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) -- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]". if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or m_locations.northern_ireland_counties["County " .. unlinked_placename .. 
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are display as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in the placetype data; everything else is handled through category handlers, either per-placetype or special (such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is described at the top of [[Module:place]]. There are several recognized property keys, of various types: 1. The following link-related property keys are recognized: * `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in categories). The possible values are: *# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`. *# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g. `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the placetype is given plural. *# `"+..."`: Create a two-part link to the entry following the `+` sign. 
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`. * `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is. * `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category. * `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category. * `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. 
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods in POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as a holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
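The interaction between `affix_type = "Suf"` and `no_affix_strings` described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: `contains_whole_word` and `add_suffix` are hypothetical names, not functions of this module, and the whole-word check assumes that words in the holonym are separated by single spaces.

```lua
-- Hypothetical helper (not part of this module): case-insensitive, whole-word containment check.
-- Assumes words are separated by single spaces.
local function contains_whole_word(haystack, needle)
	local padded = " " .. haystack:lower() .. " "
	-- Plain (non-pattern) search, so special characters in `needle` are taken literally.
	return padded:find(" " .. needle:lower() .. " ", 1, true) ~= nil
end

-- Hypothetical helper (not part of this module): mimic `affix_type = "Suf"`.
-- The capitalized placetype is appended unless one of `no_affix_strings` already occurs in the placename.
local function add_suffix(placename, placetype, no_affix_strings)
	for _, str in ipairs(no_affix_strings) do
		if contains_whole_word(placename, str) then
			return placename
		end
	end
	-- Capitalize each word of the placetype; the parentheses discard gsub's second return value.
	local capitalized = (placetype:gsub("(%a)(%a*)", function(first, rest)
		return first:upper() .. rest
	end))
	return placename .. " " .. capitalized
end
```

Under these assumptions, `add_suffix("Nenets", "autonomous okrug", {"okrug"})` yields `Nenets Autonomous Okrug`, whereas `Nenets Okrug` and `Nenets Autonomous Okrug` are returned unchanged because `okrug` already occurs in them as a whole word.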
'''NOTE:''' # The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified. # Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype. # It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs. ]==] export.placetype_data = { --[=[ If you need to sort the following, do this (using Vim): 1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries. 2. Make sure the table uses tabs everywhere for indent, and not spaces. 3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence: :'a,.s/\n/\\n/g :s/\\n\(\t\[\)/\r\1/g The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.) 4. Execute the following to sort: :'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? 
@@@ //'

	Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort
	"hill station" before "hill" and "county borough" before "county" because the space after e.g. "hill station"
	sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key,
	prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate
	pattern).
	5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and
	   executing

	:'a,.s/\\n/\r/g

	Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but
	to insert a newline on the right side of a replacement you must use \r.
	]=]
	["*"] = {
		link = false,
		cat_handler = generic_place_cat_handler,
	},
	["administrative atoll"] = {
		-- Maldives
		link = "+w:administrative divisions of the Maldives",
		preposition = "ของ",
		class = "subpolity",
	},
	["administrative capital"] = {
		link = "w",
		fallback = "เมืองหลวง",
	},
	["administrative center"] = {
		link = "w",
		fallback = "เมืองหลวงที่ไม่ใช่นคร",
	},
	["administrative centre"] = {
		link = "w",
		fallback = "administrative center",
	},
	["administrative county"] = {
		link = "w",
		fallback = "เทศมณฑล",
	},
	["administrative district"] = {
		link = "w",
		fallback = "อำเภอ",
	},
	["administrative headquarters"] = {
		link = "separately",
		fallback = "administrative centre",
	},
	["administrative region"] = {
		link = true,
		preposition = "ของ",
		suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
		fallback = "ภูมิภาค",
		class = "subpolity",
	},
	["administrative seat"] = {
		link = "w",
		fallback = "administrative centre",
	},
	["administrative territory"] = {
		link = "separately",
		preposition = "ของ",
		suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
		fallback = "ดินแดน",
		class = "subpolity",
	},
	["administrative unit"] = {
		-- Grrr, it's difficult to generalize about "administrative units". In Albania, "administrative unit" is an
		-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
		-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
		-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
		-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
		-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
		link = "w",
		class = "subpolity",
	},
	["administrative village"] = {
		link = "w",
		preposition = "ของ",
		has_neighborhoods = true,
		class = "settlement",
	},
	["aimag"] = {
		-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
		-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
		link = "w",
		fallback = "prefecture",
	},
	["airport"] = {
		link = true,
		class = "man-made structure",
		default = {true},
	},
	["alliance"] = {
		link = true,
		fallback = "confederation",
	},
	["archipelago"] = {
		link = true,
		fallback = "เกาะ",
	},
	["area"] = {
		link = true,
		preposition = "ของ",
		fallback = "geographic and cultural area",
		-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
		-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
		class = "subpolity",
		former_type = "geographic region",
		cat_handler = district_neighborhood_cat_handler,
	},
	["arm"] = {
		link = true,
		preposition = "ของ",
		class = "natural feature",
		default = {"ทะเล"},
	},
	["arrondissement"] = {
		link = true,
		preposition = "ของ",
		-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
		-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "เมืองหลวง", }, ["เมืองหลวง"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "เมืองหลวง", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "Cities in +++", "Countries in +++", "National capitals"}, default = {"City-states", "นคร", "ประเทศ", "National capitals"}, }, ["civil parish"] = { 
-- Mostly England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "เมืองหลวง", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
		category_link = "[[county|counties]] and [[county-level city|county-level cities]]",
		class = "subpolity",
	},
	["continent"] = {
		link = true,
		category_link = false, -- can't occur as a bare category
		class = "natural feature",
		default = {"Continents and continental regions"},
	},
	["continental region"] = {
		link = "separately",
		category_link = false, -- can't occur as a bare category
		class = "geographic region",
		fallback = "continent",
	},
	["continents and continental regions!"] = {
		category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])",
		class = "geographic region",
	},
	["council area"] = {
		link = true,
		-- in Scotland; similar to a county
		preposition = "ของ",
		affix_type = "suf",
		class = "subpolity",
	},
	["ประเทศ"] = {
		link = true,
		class = "polity", -- do not translate the class value
		["continent/*"] = {true, "ประเทศ"},
		default = {true},
	},
	["country-like entities!"] = {
		category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]",
		class = "polity", -- do not translate the class value
	},
	["เทศมณฑล"] = {
		link = true,
		preposition = "ของ",
		display_handler = county_display_handler,
		class = "subpolity",
	},
	["county borough"] = {
		link = true,
		-- in Wales; similar to a county
		preposition = "ของ",
		affix_type = "suf",
		fallback = "borough",
		class = "subpolity",
	},
	["county seat"] = {
		link = true,
		entry_placetype_use_the = true,
		preposition = "ของ",
		has_neighborhoods = true,
		class = "capital",
	},
	["county town"] = {
		link = true,
		entry_placetype_use_the = true,
		preposition = "ของ",
		fallback = "เมือง",
		has_neighborhoods = true,
		class = "capital",
	},
	["county-administered city"] = {
		-- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city.
		-- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city.
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "เมืองหลวง", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["metropolitan city"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"metropolitan", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are provinces-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "เมืองหลวง", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["เมืองหลวงที่ไม่ใช่นคร"] = { link = "[[เมืองหลวง]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` control the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", --ห้ามแปล class bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "เมืองหลวง", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "เมืองหลวง", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since สถานที่ในHong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "เมืองหลวง", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", --ห้ามแปล class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "capitals", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nicknames]] for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
the United States full_category_link = "[[official]] [[nicknames]] for [[state]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for states", addl_bare_category_parents = {"รัฐ"}, }, ["OFFICIAL_NICKNAME_FOR place"] = { link = false, default = {"Official nicknames for places"}, }, ["OFFICIAL_NICKNAME_FOR state"] = { link = false, default = {"Official nicknames for states"}, }, } export.plural_placetype_to_singular = {} for sg_placetype, spec in pairs(export.placetype_data) do if spec.plural then export.plural_placetype_to_singular[spec.plural] = sg_placetype end end return export ppnv7vnj763b7rrbevfdvo0t7mfqorv 5720699 5720689 2026-04-21T01:47:14Z OctraBot 3198 5720699 Scribunto text/plain local export = {} export.force_cat = false -- set to true for testing local m_locations = require("Module:place/locations") local m_links = require("Module:links") local m_table = require("Module:table") local m_strutils = require("Module:string utilities") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local dump = mw.dumpObject local insert = table.insert local concat = table.concat local internal_error = m_locations.internal_error export.internal_error = internal_error local process_error = m_locations.process_error export.process_error = process_error local unpack = unpack or table.unpack -- Lua 5.2 compatibility local ucfirst = m_strutils.ucfirst local ulower = m_strutils.lower local rmatch = m_strutils.match local split = m_strutils.split --[==[ intro: This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to [[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}. 
In particular, it contains two fundamental and tricky functions: # `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising" operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and fallbacks. # `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the containers of the known location being considered. This is done to prevent overcategorizing when either there are two known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico). Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a result are candidates for memoization to speed up the operation of {{tl|place}}. ]==] ------------------------------------------------------------------------------------------ -- Basic utilities -- ------------------------------------------------------------------------------------------ --[==[ Return true if `force_cat` is set either in this module or in [[Module:place/locations]]. ]==] function export.get_force_cat() return export.force_cat or m_locations.force_cat end -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. 
page) return true end function export.remove_links_and_html(text) text = m_links.remove_links(text) return text:gsub("<.->", "") end --[==[ Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in [[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is returned. ]==] function export.maybe_singularize_placetype(placetype) if not placetype then return nil end if export.plural_placetype_to_singular[placetype] then return export.plural_placetype_to_singular[placetype] end local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype if retval == placetype then return nil end return retval end -- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first -- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost -- always correct. function export.pluralize_placetype(placetype, do_ucfirst) local ptdata = export.placetype_data[placetype] if ptdata and ptdata.plural then placetype = ptdata.plural else placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype end if do_ucfirst then return ucfirst(placetype) else return placetype end end --[==[ Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified, we also look for category-only placetypes (generally plural) followed by `!`. 
Return three values: (a) the placetype under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version of the plural passed-in `placetype`. ]==] function export.get_placetype_data(placetype, from_category) local ptdata = export.placetype_data[placetype] if ptdata then return placetype, ptdata, "direct" end if from_category then ptdata = export.placetype_data[placetype .. "!"] if ptdata then return placetype .. "!", ptdata, "direct-category" end end local sg_placetype = export.maybe_singularize_placetype(placetype) if sg_placetype then ptdata = export.placetype_data[sg_placetype] if ptdata then return sg_placetype, ptdata, "plural" end end return nil end --[==[ Check for special pseudo-placetypes that should be ignored for categorization purposes. ]==] function export.placetype_is_ignorable(placetype) return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(") end function export.resolve_placetype_aliases(placetype) return export.placetype_aliases[placetype] or placetype end --[==[ Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the key isn't found in the placetype's entry in `placetype_data`, return nil. ]==] function export.get_placetype_prop(placetype, key) -- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype -- aliases have been resolved, but sometimes not, e.g. 
when fetching the indefinite article in -- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice. placetype = export.resolve_placetype_aliases(placetype) if export.placetype_data[placetype] then return export.placetype_data[placetype][key] else return nil end end --[==[ Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list { {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e. # the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are zero such qualifiers, the value will be nil); # a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil); # the "reduced placetype" on the right. Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified. For example, given the placetype `"small beachside unincorporated community"`, the return value will be { { {nil, nil, "small beachside unincorporated community"}, {nil, "small", "beachside unincorporated community"}, {"small", "[[beachfront]]", "unincorporated community"}, {"small [[beachfront]]", "[[unincorporated]]", "community"}, }} Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to `"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`. 
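The splitting procedure described above can be sketched in standalone Lua. This is only an illustration under stated assumptions: the `qualifiers` and `aliases` tables below are hypothetical stand-ins for `placetype_qualifiers` and `placetype_aliases`, and the helper names are invented for the sketch.

```lua
-- Hypothetical stand-ins for `placetype_qualifiers` and `placetype_aliases`.
local qualifiers = {
	small = false,                 -- recognized, left unlinked
	beachside = "[[beachfront]]",  -- recognized, canonicalized to a different link
	unincorporated = true,         -- recognized, linked to itself
	former = false,                -- recognized, left unlinked
}
local aliases = {adr = "administrative region"}

local function resolve_alias(pt)
	return aliases[pt] or pt
end

-- Return a list of {prev_qualifiers, this_qualifier, reduced_placetype} splits.
local function split_qualifiers(placetype)
	local splits = {{nil, nil, resolve_alias(placetype)}}
	local prev
	while true do
		-- Split at the first space: leftmost word vs. the remainder.
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		if not qualifier then break end
		local canon = qualifiers[qualifier]
		if canon == nil then break end -- unrecognized word: stop splitting
		local shown = qualifier
		if canon == true then
			shown = "[[" .. qualifier .. "]]"
		elseif canon then
			shown = canon
		end
		splits[#splits + 1] = {prev, shown, resolve_alias(rest)}
		prev = prev and (prev .. " " .. shown) or shown
		placetype = rest
	end
	return splits
end

local splits = split_qualifiers("small beachside unincorporated community")
for _, s in ipairs(splits) do
	print(tostring(s[1]), tostring(s[2]), s[3])
end
```

With these assumed tables, the last split pairs the accumulated qualifiers `small [[beachfront]]` with the qualifier `[[unincorporated]]` and the reduced placetype `community`.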
On the other hand, if given `"small former haunted community"`, the return value will be { { {nil, nil, "small former haunted community"}, {nil, "small", "former haunted community"}, {"small", "former", "haunted community"}, }} because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers. Finally, if given `"former adr"`, the return value will be { { {nil, nil, "former adr"}, {nil, "former", "administrative region"}, }} because `"adr"` is a recognized placetype alias for `"administrative region"`. ]==] function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers) local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} local prev_qualifier = nil while true do local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if canon == nil then break end local new_qualifier = qualifier if type(canon) == "table" then canon = canon.link end if not no_canon_qualifiers and canon ~= false then if canon == true then new_qualifier = "[[" .. qualifier .. "]]" else new_qualifier = canon end end insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)}) prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier placetype = reduced_placetype else break end end return splits end --[==[ Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the words in the passed-in placetype (always including the rightmost word in the placetype, i.e. 
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`. '''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`. For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn: ``` {qualifier = nil, placetype="left tributary"} {qualifier = "left", placetype="tributary"} {qualifier = "left", placetype="แม่น้ำ"} ``` and the return value will be { { {qualifier = "left", placetype="tributary"}, {qualifier = "left", placetype="แม่น้ำ"}, }} The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it). 
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because it has a fallback `แม่น้ำ` (river), which exists in `placetype_data`, the fallback is entered next. Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are: ``` {qualifier = nil, placetype="small rural fraziones"} {qualifier = nil, placetype="small rural frazione"} {qualifier = "small", placetype="rural fraziones"} {qualifier = "small", placetype="rural frazione"} {qualifier = "small [[rural]]", placetype="fraziones"} {qualifier = "small [[rural]]", placetype="frazione"} {qualifier = "small [[rural]]", placetype="hamlet"} {qualifier = "small [[rural]]", placetype="village"} ``` The return value ends up as { { {qualifier = "small [[rural]]", placetype="frazione"}, {qualifier = "small [[rural]]", placetype="hamlet"}, {qualifier = "small [[rural]]", placetype="village"}, }} Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up being returned. `no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym. 
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if ptdata then insert_equiv(canon_placetype) if no_fallback then return end local first_placetype = #equivs + 1 local prev_placetype = nil while true do local pt_value = export.placetype_data[canon_placetype] if not pt_value then internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`", canon_placetype, prev_placetype) end if pt_value.fallback then insert_equiv(pt_value.fallback) local last_placetype = #equivs if last_placetype - first_placetype >= 10 then local fallback_loop = {} for i = first_placetype, last_placetype do insert(fallback_loop, equivs[i].placetype) end internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> ")) end prev_placetype = canon_placetype canon_placetype = pt_value.fallback else break end end end end -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a -- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no -- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that -- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for -- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a -- `former`-type directive); these backups live outside this function because we want them done once, late, rather -- than in each invocation of `process_and_insert_placetype()`. local function process_and_insert_placetype(qualifier, reduced_placetype) if form_of_directive then -- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of -- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. 
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier process_and_insert_placetype(qualifier, reduced_placetype) -- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced -- placetypes to avoid the "overseas territory treated as a territory" issue described above. if no_fallback then local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category) if canon_placetype then break end end end -- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g. -- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype -- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g. -- [[Category:en:Former names of places]] in an invocation like -- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}}; -- the `used from 1971–1997` gets treated as a placetype and we're called on it. if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then insert_placetype_and_fallbacks(nil, form_of_directive .. " place") end return equivs end function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only) for _, equiv in ipairs(equivs) do local retval = fun(equiv.placetype) if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then return retval, equiv end end return nil, nil end --[==[ Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false}); but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value. 
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value. ]==] function export.get_equiv_placetype_prop(placetype, fun, props) if not placetype then return fun(nil), nil end return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun, props and props.continue_on_nil_only) end --[==[ Return the article that is used with an entry placetype. We proceed as follows: # See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article). This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`. # Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that `"the"` should be used. # Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from the beginning) has an entry in `placetype_data` that specifies the indefinite article using `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`). # Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with a vowel and `"a"` otherwise. If `ucfirst` is true, the first letter of the article is made upper-case. 
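The order of precedence listed above can be sketched in standalone Lua. This is a simplified illustration, not the module's implementation: the `qualifiers` and `placetype_data` tables are hypothetical sample data, equivalent placetypes are not consulted, and a bare first-letter vowel test stands in for the algorithm in [[Module:en-utilities]].

```lua
-- Hypothetical sample data; only the property names come from this module.
local qualifiers = {various = {link = false, article = false}}
local placetype_data = {
	capital = {entry_placetype_use_the = true},
	["union territory"] = {entry_placetype_indefinite_article = "a"},
	island = {},
}

local function get_article(placetype, do_ucfirst)
	local art
	-- Step 1: a leading qualifier may dictate the article (or `false` for none).
	local qualifier = placetype:match("^(.-) ")
	local q = qualifier and qualifiers[qualifier]
	if type(q) == "table" then
		art = q.article
	end
	if art == false then
		return false
	end
	if art == nil then
		local data = placetype_data[placetype] or {}
		if data.entry_placetype_use_the then
			-- Step 2: the placetype asks for "the".
			art = "the"
		else
			-- Step 3: explicit indefinite article; step 4: simplified vowel rule.
			art = data.entry_placetype_indefinite_article
				or (placetype:find("^[aeiouAEIOU]") and "an" or "a")
		end
	end
	if do_ucfirst then
		art = art:sub(1, 1):upper() .. art:sub(2)
	end
	return art
end

print(get_article("various capitals"), get_article("capital"),
	get_article("union territory"), get_article("island", true))
```

The explicit `entry_placetype_indefinite_article` entry matters for `union territory` because a spelling-based vowel test would otherwise pick `"an"`.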
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
			if category_type == "top-level" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_top_level")
			elseif category_type == "noncity" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity")
			elseif category_type == "city" then -- do not translate
				raw_link, fetched_full = fetch_maybe_full("category_link_before_city")
			else
				internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate
					category_type)
			end
			if type(raw_link) == "string" then
				return maybe_prefix(raw_link), ptdata
			elseif raw_link ~= nil then
				return raw_link, ptdata
			end
			raw_link, fetched_full = fetch_maybe_full("category_link")
			if raw_link == false then
				return raw_link, ptdata
			end
			if is_linked_string(raw_link) then
				return maybe_prefix(raw_link), ptdata
			end
			if ptmatch == "plural" then
				raw_link, fetched_full = fetch_maybe_full("plural_link")
				if raw_link == false then
					return raw_link, ptdata
				end
				if is_linked_string(raw_link) then
					return maybe_prefix(raw_link), ptdata
				end
			end
			if raw_link == nil then
				raw_link, fetched_full = fetch_maybe_full("link")
			end
			if raw_link == false then
				return raw_link, ptdata
			end
			return maybe_prefix(make_placetype_link(raw_link, canon_placetype,
				placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata
		else
			if ptmatch == "plural" then
				raw_link = ptdata.plural_link
				if raw_link == false then
					process_error("Placetype %s cannot appear plural", placetype)
				end
				if is_linked_string(raw_link) then
					return raw_link, ptdata
				end
			end
			if raw_link == nil then
				raw_link = ptdata.link
			end
			return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil,
				ptdata, from_category, noerror), ptdata
		end
	end

	return nil
end

local function resolve_unlinked_placename_display_aliases(placetype, placename)
	local equiv_placetypes = export.get_placetype_equivs(placetype)
	for i, equiv in ipairs(equiv_placetypes) do
		equiv_placetypes[i] = equiv.placetype
	end
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
											this_holonym_matches = true
											break
										end
									end
									if this_holonym_matches then
										-- Alias matched above when resolved.
										holonym_matches_at_level = true
										break
									else
										-- Not an alias, or doesn't match when resolved. We have a true mismatch.
										holonym_exists_with_same_placetype = true
									end
								end
							end
						end
						if holonym_matches_at_level then
							match_at_level = true
							break
						end
						if holonym_exists_with_same_placetype then
							mismatch_at_level = true
						end
					end
				end
				if not match_at_level and mismatch_at_level then
					containers_mismatch = true
					break
				end
			end
			if not containers_mismatch then
				return group, key, spec, container_trail
			end
		end
	end
end

--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return
the corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
	local all_found = {}
	for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
		insert(all_found, {group, key, spec, container_trail})
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		local holonym_placetype = data.holonym_placetype
		if type(holonym_placetype) == "table" then
			holonym_placetype = concat(holonym_placetype, ",")
		end
		local found_keys = {}
		for _, found in ipairs(all_found) do
			local _, key, _, _ = unpack(found)
			insert(found_keys, key)
		end
		error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "มหานคร", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan county"] = 
"non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true,
	["chalk"] = true,
	["karst"] = true,
	["limestone"] = true,
	["mountainous"] = true,
	["mountaintop"] = true,
	["alpine"] = true,
	["volcanic"] = true, -- for an island
	-- political status qualifiers
	["autonomous"] = true,
	["incorporated"] = true,
	["special"] = true,
	["unincorporated"] = true,
	["coterminous"] = true,
	-- monetary status/etc. qualifiers
	["fashionable"] = true,
	["wealthy"] = true,
	["affluent"] = true,
	["declining"] = true,
	-- city vs. rural qualifiers
	["urban"] = true,
	["suburban"] = true,
	["exurban"] = true,
	["outlying"] = true,
	["remote"] = true,
	["rural"] = true,
	["outback"] = true,
	["inner"] = false,
	["inner-city"] = true,
	["central"] = false,
	["outer"] = false,
	-- land use qualifiers
	["residential"] = true,
	["agricultural"] = true,
	["business"] = true,
	["commercial"] = true,
	["industrial"] = true,
	-- business use qualifiers
	["railroad"] = true,
	["railway"] = true,
	["farming"] = true,
	["fishing"] = true,
	["mining"] = true,
	["logging"] = true,
	["cattle"] = true,
	-- tourism use qualifiers
	["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
	["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
	["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
	-- religious qualifiers
	["holy"] = true,
	["sacred"] = true,
	["religious"] = true,
	["secular"] = true,
	-- qualifiers for nonexistent places
	["claimed"] = false,
	["fictional"] = true,
	["legendary"] = true,
	["mythical"] = true,
	["mythological"] = true,
	-- directional qualifiers
	["northern"] = false,
	["southern"] = false,
	["eastern"] = false,
	["western"] = false,
	["north"] = false,
	["south"] = false,
	["east"] = false,
	["west"] = false,
	["northeastern"] = false,
	["southeastern"] = false,
	["northwestern"] = false,
	["southwestern"] = false,
	["northeast"] = false,
	["southeast"] = false,
	["northwest"] = false,
	["southwest"] = false,
	-- seasonal qualifiers
	["summer"] = true, -- e.g. for 'summer capital'
	["winter"] = true,
	-- legal status qualifiers
	-- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]].
	["official"] = true,
	["unofficial"] = true,
	["de facto"] = true, -- 'de facto capital'
	["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link
	["de jure"] = true, -- 'de jure capital'
	["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link
	-- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state'
	-- misc. qualifiers
	["planned"] = true,
	["chartered"] = true,
	["landlocked"] = true,
	["uninhabited"] = true,
	-- superlative qualifiers
	["first"] = no_link_def_article,
	["second"] = no_link_def_article, -- for "second largest" etc.
	["third"] = no_link_def_article,
	["fourth"] = no_link_def_article,
	["last"] = no_link_def_article,
	["only"] = no_link_def_article,
	["sole"] = no_link_def_article,
	["main"] = no_link_def_article,
	["largest"] = no_link_def_article,
	["biggest"] = no_link_def_article,
	["smallest"] = no_link_def_article,
	["shortest"] = no_link_def_article,
	["longest"] = no_link_def_article,
	["tallest"] = no_link_def_article,
	["highest"] = no_link_def_article,
	["lowest"] = no_link_def_article,
	["leftmost"] = no_link_def_article,
	["rightmost"] = no_link_def_article,
	["innermost"] = no_link_def_article,
	["outermost"] = no_link_def_article,
	["northernmost"] = no_link_def_article,
	["southernmost"] = no_link_def_article,
	["westernmost"] = no_link_def_article,
	["easternmost"] = no_link_def_article,
	["northwesternmost"] = no_link_def_article,
	["southwesternmost"] = no_link_def_article,
	["northeasternmost"] = no_link_def_article,
	["southeasternmost"] = no_link_def_article,
	-- several/various
	["several"] = no_link_no_article,
	["various"] = no_link_no_article,
	["numerous"] = no_link_no_article,
	["multiple"] = no_link_no_article,
	["many"] = no_link_no_article,
	["other"] = no_link_no_article,
}

--[==[ var:
In this table, the key qualifiers should
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and `qualifier_to_placetype_equivs`. ]==] export.former_qualifiers = { ["abandoned"] = {"FORMER"}, ["ancient"] = {"ANCIENT", "FORMER"}, ["former"] = {"FORMER"}, ["extinct"] = {"FORMER"}, ["historic"] = {"FORMER"}, ["historical"] = {"FORMER"}, ["medieval"] = {"ANCIENT", "FORMER"}, ["mediaeval"] = {"ANCIENT", "FORMER"}, ["ruined"] = {"ANCIENT", "FORMER"}, ["traditional"] = {"FORMER"}, } --[==[ var: In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`. ]==] export.qualifier_to_placetype_equivs = { ["fictional"] = "fictional location", ["legendary"] = "mythological location", ["mythical"] = "mythological location", ["mythological"] = "mythological location", -- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands -- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are -- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital. ["claimed"] = "claimed political division", } --[==[ var: Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse mapping also exists. ]==] export.placetype_to_capital_cat = { ["autonomous community"] = "autonomous community capitals", ["canton"] = "cantonal capitals", ["comarca"] = "comarca capitals", ["ประเทศ"] = "เมืองหลวงของประเทศ", -- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล", ["department"] = "departmental capitals", ["อำเภอ"] = "เมืองหลวงของอำเภอ", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["มหานคร"] = "เมืองหลวงของมหานคร", ["เทศบาล"] = "เมืองหลวงของเทศบาล", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "เมืองหลวงของจังหวัด", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "เมืองหลวงของภูมิภาค", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "เมืองหลวงของรัฐ", ["ดินแดน"] = "เมืองหลวงของดินแดน", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/Europe"}, ["Central Europe"] = {"continent/Europe"}, ["Western Europe"] = {"continent/Europe"}, ["South Europe"] = {"continent/Europe"}, ["Southern Europe"] = {"continent/Europe"}, ["Northern Europe"] = {"continent/Europe"}, ["Northeast Europe"] = {"continent/Europe"}, ["Northeastern Europe"] = {"continent/Europe"}, ["Southeast Europe"] = {"continent/Europe"}, ["Southeastern Europe"] = {"continent/Europe"}, ["North Caucasus"] = {"continent/Europe"}, ["South Caucasus"] = {"continent/Asia"}, ["South Asia"] = {"continent/Asia"}, ["Southern Asia"] = {"continent/Asia"}, ["East Asia"] = {"continent/Asia"}, ["Eastern Asia"] = {"continent/Asia"}, ["Central Asia"] = {"continent/Asia"}, ["West Asia"] = {"continent/Asia"}, ["Western Asia"] = {"continent/Asia"}, ["Southeast Asia"] = {"continent/Asia"}, ["North Asia"] = {"continent/Asia"}, ["Northern Asia"] = {"continent/Asia"}, ["Anatolia"] = {"continent/Asia"}, ["Asia Minor"] = {"continent/Asia"}, ["Mesopotamia"] = {"continent/Asia"}, ["North Africa"] = {"continent/Africa"}, ["Central Africa"] = {"continent/Africa"}, ["West Africa"] = {"continent/Africa"}, ["East Africa"] = {"continent/Africa"}, ["Southern Africa"] = {"continent/Africa"}, ["Central America"] = {"continent/Central America"}, ["Caribbean"] = {"continent/North America"}, ["Polynesia"] = {"continent/Oceania"}, ["Micronesia"] = {"continent/Oceania"}, ["Melanesia"] = {"continent/Oceania"}, ["Siberia"] = {"country/Russia", "continent/Asia"}, ["Russian Far East"] = {"country/Russia", "continent/Asia"}, ["South Wales"] = {"constituent country/Wales", "continent/Europe"}, ["Balkans"] = {"continent/Europe"}, ["West Bank"] = {"country/Palestine", "continent/Asia"}, ["Gaza"] = {"country/Palestine", "continent/Asia"}, ["Gaza Strip"] = {"country/Palestine", "continent/Asia"}, } } ------------------------------------------------------------------------------------------ -- Category and display 
handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. 
local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
local function insert_specific_variant_if_possible(key, spec) return export.get_equiv_placetype_prop(holonym_placetype, function(pt) local plural_holonym_placetype = export.pluralize_placetype(pt) local saw_matching_div if spec.divs then local divs = spec.divs if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) ~= "table" then div = {type = div} end if plural_holonym_placetype == div.type then saw_matching_div = true break end end end if saw_matching_div then insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec))) return true end return false end) end if insert_specific_variant_if_possible(matching_key, matching_spec) then inserted_specific_variant_cat = true elseif not matching_spec.no_container_cat then for _, containers in ipairs(matching_container_trail) do local saw_no_container_cat = false for _, container in ipairs(containers) do if insert_specific_variant_if_possible(container.key, container.spec) then inserted_specific_variant_cat = true break end saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat end if inserted_specific_variant_cat or saw_no_container_cat then break end end end end else -- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for -- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing. -- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to -- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab. -- Possibly we can just skip this case entirely.
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "เมืองหลวง") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms. ]==] function export.augment_holonyms_with_container(place_descs) for _, place_desc in ipairs(place_descs) do if place_desc.holonyms then -- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their -- appropriate position. We don't just put them at the end because some holonyms use the `:also` -- modifier, which causes category processing to restart at that point after generating categories for a -- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with -- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy -- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's -- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g. -- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the -- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If -- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude -- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`), -- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar -- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration -- rather than modifying the place description once at the end.
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
This is to correctly handle a situation like -- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here, -- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and -- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from -- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to -- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match -- in find_matching_holonym_location() because of the mismatch between augmented -- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later -- calls to find_matching_holonym_location() fail to match `Gard` (and likewise -- `Ardèche`) against any known location. To deal with this, we mark augmented -- holonyms as being augmented due to a source holonym, and when processing a given -- holonym, ignore augmented holonyms from other holonyms of the same placetype. -- The restriction to the same placetype is so that `Birmingham` still gets -- correctly disambiguated to Birmingham, England in the example given above near -- the top of this function, using the augmented holonym `c/United Kingdom` added by -- the specified `cc/England` (whose placetype `constituent country` differs from -- the placetype `city` of Birmingham). augmented_from_holonym = holonym, } insert(augmented_holonyms, new_holonym) -- But it is safe to modify other parts of the place_desc. export.key_holonym_into_place_desc(place_desc, new_holonym) end if any_no_auto_augment_container then break end end for j = i + 1, #place_desc.holonyms do insert(augmented_holonyms, place_desc.holonyms[j]) end place_desc.holonyms = augmented_holonyms end end end end end end -- Cat handler for district, areas, neighborhoods and suburbs. Districts are tricky because they can either be political -- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city -- neighborhoods or larger geographical areas/regions.
We handle this as follows: -- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if -- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that -- categorizes into [[:Category:Districts of Maharashtra, India]]. -- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called -- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g. -- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.) -- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set. -- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if -- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize -- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and -- note the spelling "neighborhoods" because we're in the US.) -- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're -- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or -- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no -- categorization. local function district_neighborhood_cat_handler(data) local function get_plural_entry_placetype(location_spec, container_trail) if data.entry_placetype == "suburb" then return "Suburbs" else -- Check for `british_spelling` setting on the spec itself or any container.
local uses_british_spelling = location_spec.british_spelling if uses_british_spelling == nil and container_trail then for _, container_set in ipairs(container_trail) do local must_outer_break = false for _, container in ipairs(container_set) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end return uses_british_spelling and "Neighbourhoods" or "Neighborhoods" end end -- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire, -- etc.) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and spec.is_city then return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)} end -- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like -- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.) local has_neighborhoods local entry_placetype = data.entry_placetype if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then has_neighborhoods = true else -- Otherwise, make sure the current holonym is city-like. has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt) return export.get_placetype_prop(pt, "has_neighborhoods") end, {continue_on_nil_only = true}) end if has_neighborhoods then -- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written -- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}} -- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV".
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then return holonym_placename end if holonym_placename:find("%[%[") then return prefix .. " " .. holonym_placename end return prefix .. " [[" .. holonym_placename .. "]]" end -- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms. -- Works identically to prefix_display_handler but for suffixes instead of prefixes. local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then return holonym_placename end if holonym_placename:find("%[%[") then return holonym_placename .. " " .. suffix end if include_suffix_in_link then return "[[" .. holonym_placename .. " " .. suffix .. "]]" else return "[[" .. holonym_placename .. "]] " .. suffix end end -- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed -- with "borough". local function borough_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.new_york_boroughs[unlinked_placename] then -- Hack: don't display "borough" after the names of NYC boroughs return holonym_placename end return suffix_display_handler("borough", holonym_placename) end local function county_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) -- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]". if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or m_locations.northern_ireland_counties["County " .. unlinked_placename .. 
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in the placetype data; everything else is handled through category handlers, either per-placetype or special (such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is described at the top of [[Module:place]]. There are several recognized property keys, of various types: 1. The following link-related property keys are recognized: * `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in categories). The possible values are: *# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`. *# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g. `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the placetype is given plural. *# `"+..."`: Create a two-part link to the entry following the `+` sign. 
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`. * `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is. * `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category. * `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category. * `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. 
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
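As a rough illustration of how several of the property keys above combine, here is a minimal sketch of two entries. The placetypes `example canton` and `example sub-canton` are hypothetical and do not exist in `placetype_data`; only the key names and value forms are taken from the descriptions above.

```lua
-- Hypothetical entries for illustration only; neither placetype exists in `placetype_data`.
local example_entries = {
	["example canton"] = {
		link = "w", -- link to the same-named Wikipedia entry
		preposition = "ของ", -- preposition used after the entry placetype, as in other entries in this table
		affix_type = "Suf", -- display holonyms as `Holonym Placetype` (capitalized suffix)
		no_affix_strings = "canton", -- suppress the suffix if the holonym already contains "canton"
		class = "subpolity", -- a non-sovereign division of a polity
	},
	["example sub-canton"] = {
		link = "separately", -- link each word separately
		fallback = "example canton", -- `class` is then taken from the fallback
	},
}
```

Because `example sub-canton` omits both `class` and `former_type`, those values would be taken from its fallback `example canton`.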
'''NOTE:''' # The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified. # Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype. # It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs. ]==] export.placetype_data = { --[=[ If you need to sort the following, do this (using Vim): 1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries. 2. Make sure the table uses tabs everywhere for indent, and not spaces. 3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence: :'a,.s/\n/\\n/g :s/\\n\(\t\[\)/\r\1/g The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.) 4. Execute the following to sort: :'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? 
@@@ //' Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station" before "hill" and "county borough" before "เทศมณฑล" because the space after e.g. "hill station" sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern). 5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing :'a,.s/\\n/\r/g Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but to insert a newline on the right side of a replacement you must use \r. ]=] ["*"] = { link = false, cat_handler = generic_place_cat_handler, }, ["administrative atoll"] = { -- Maldives link = "+w:administrative divisions of the Maldives", preposition = "ของ", class = "subpolity", }, ["administrative capital"] = { link = "w", fallback = "เมืองหลวง", }, ["administrative center"] = { link = "w", fallback = "เมืองหลวงที่ไม่ใช่นคร", }, ["administrative centre"] = { link = "w", fallback = "administrative center", }, ["administrative county"] = { link = "w", fallback = "เทศมณฑล", }, ["administrative district"] = { link = "w", fallback = "อำเภอ", }, ["administrative headquarters"] = { link = "separately", fallback = "administrative centre", }, ["administrative region"] = { link = true, preposition = "ของ", suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)" fallback = "ภูมิภาค", class = "subpolity", }, ["administrative seat"] = { link = "w", fallback = "administrative centre", }, ["administrative territory"] = { link = "separately", preposition = "ของ", suffix = "ดินแดน", -- but prefix is still "administrative territory (of)" fallback = "ดินแดน", class = "subpolity", }, ["administrative unit"] = { -- Grrr, it's difficult to generalize about "administrative units". 
In Albania, "administrative unit" is an -- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term -- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types -- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad -- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need -- to include this so that it can be used as a placetype for Albania, categorizing as communes. link = "w", class = "subpolity", }, ["administrative village"] = { link = "w", preposition = "ของ", has_neighborhoods = true, class = "settlement", }, ["aimag"] = { -- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province; -- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district. link = "w", fallback = "prefecture", }, ["airport"] = { link = true, class = "man-made structure", default = {true}, }, ["alliance"] = { link = true, fallback = "confederation", }, ["archipelago"] = { link = true, fallback = "เกาะ", }, ["area"] = { link = true, preposition = "ของ", fallback = "geographic and cultural area", -- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former -- when categorizing 'Areas' but the latter when handling e.g. 'historical area'. class = "subpolity", former_type = "geographic region", cat_handler = district_neighborhood_cat_handler, }, ["arm"] = { link = true, preposition = "ของ", class = "natural feature", default = {"ทะเล"}, }, ["arrondissement"] = { link = true, preposition = "ของ", -- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions -- of departments or provinces. Need to conditionalize on the country for both of the following. 
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "เมืองหลวง", }, ["เมืองหลวง"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "เมืองหลวง", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"}, default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"}, }, ["civil parish"] = { -- Mostly 
England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "เมืองหลวง", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
category_link = "[[county|counties]] and [[county-level city|county-level cities]]", class = "subpolity", }, ["continent"] = { link = true, category_link = false, -- can't occur as a bare category class = "natural feature", default = {"Continents and continental regions"}, }, ["continental region"] = { link = "separately", category_link = false, -- can't occur as a bare category class = "geographic region", fallback = "continent", }, ["continents and continental regions!"] = { category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])", class = "geographic region", }, ["council area"] = { link = true, -- in Scotland; similar to a county preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["ประเทศ"] = { link = true, class = "polity", --ห้ามแปล class ["continent/*"] = {true, "ประเทศ"}, default = {true}, }, ["country-like entities!"] = { category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]", class = "polity", --ห้ามแปล class }, ["เทศมณฑล"] = { link = true, preposition = "ของ", display_handler = county_display_handler, class = "subpolity", }, ["county borough"] = { link = true, -- in Wales; similar to a county preposition = "ของ", affix_type = "suf", fallback = "borough", class = "subpolity", }, ["county seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["county town"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", fallback = "เมือง", has_neighborhoods = true, class = "capital", }, ["county-administered city"] = { -- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city. -- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city. 
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "เมืองหลวง", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["มหานคร"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"มหานคร", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are province-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "เมืองหลวง", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["เมืองหลวงที่ไม่ใช่นคร"] = { link = "[[เมืองหลวง]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", --ห้ามแปล class bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "เมืองหลวง", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "เมืองหลวง", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since สถานที่ในHong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "เมืองหลวง", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", --ห้ามแปล class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "capitals", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nicknames]] for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
the United States full_category_link = "[[official]] [[nicknames]] for [[state]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for states", addl_bare_category_parents = {"รัฐ"}, }, ["OFFICIAL_NICKNAME_FOR place"] = { link = false, default = {"Official nicknames for places"}, }, ["OFFICIAL_NICKNAME_FOR state"] = { link = false, default = {"Official nicknames for states"}, }, } export.plural_placetype_to_singular = {} for sg_placetype, spec in pairs(export.placetype_data) do if spec.plural then export.plural_placetype_to_singular[spec.plural] = sg_placetype end end return export 3v98feuk221e1wze9d2owir8ih5i7wo 5720700 5720699 2026-04-21T01:48:04Z OctraBot 3198 5720700 Scribunto text/plain local export = {} export.force_cat = false -- set to true for testing local m_locations = require("Module:place/locations") local m_links = require("Module:links") local m_table = require("Module:table") local m_strutils = require("Module:string utilities") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local dump = mw.dumpObject local insert = table.insert local concat = table.concat local internal_error = m_locations.internal_error export.internal_error = internal_error local process_error = m_locations.process_error export.process_error = process_error local unpack = unpack or table.unpack -- Lua 5.2 compatibility local ucfirst = m_strutils.ucfirst local ulower = m_strutils.lower local rmatch = m_strutils.match local split = m_strutils.split --[==[ intro: This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to [[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}. 
In particular, it contains two fundamental and tricky functions: # `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising" operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and fallbacks. # `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the containers of the known location being considered. This is done to prevent overcategorizing when either there are two known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico). Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a result are candidates for memoization to speed up the operation of {{tl|place}}. ]==] ------------------------------------------------------------------------------------------ -- Basic utilities -- ------------------------------------------------------------------------------------------ --[==[ Return true if `force_cat` is set either in this module or in [[Module:place/locations]]. ]==] function export.get_force_cat() return export.force_cat or m_locations.force_cat end -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. 
page) return true end function export.remove_links_and_html(text) text = m_links.remove_links(text) return text:gsub("<.->", "") end --[==[ Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in [[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is returned. ]==] function export.maybe_singularize_placetype(placetype) if not placetype then return nil end if export.plural_placetype_to_singular[placetype] then return export.plural_placetype_to_singular[placetype] end local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype if retval == placetype then return nil end return retval end -- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first -- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost -- always correct. function export.pluralize_placetype(placetype, do_ucfirst) local ptdata = export.placetype_data[placetype] if ptdata and ptdata.plural then placetype = ptdata.plural else placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype end if do_ucfirst then return ucfirst(placetype) else return placetype end end --[==[ Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified, we also look for category-only placetypes (generally plural) followed by `!`. 
Return three values: (a) the placetype under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version of the plural passed-in `placetype`. ]==] function export.get_placetype_data(placetype, from_category) local ptdata = export.placetype_data[placetype] if ptdata then return placetype, ptdata, "direct" end if from_category then ptdata = export.placetype_data[placetype .. "!"] if ptdata then return placetype .. "!", ptdata, "direct-category" end end local sg_placetype = export.maybe_singularize_placetype(placetype) if sg_placetype then ptdata = export.placetype_data[sg_placetype] if ptdata then return sg_placetype, ptdata, "plural" end end return nil end --[==[ Check for special pseudo-placetypes that should be ignored for categorization purposes. ]==] function export.placetype_is_ignorable(placetype) return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(") end function export.resolve_placetype_aliases(placetype) return export.placetype_aliases[placetype] or placetype end --[==[ Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the key isn't found in the placetype's entry in `placetype_data`, return nil. ]==] function export.get_placetype_prop(placetype, key) -- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype -- aliases have been resolved, but sometimes not, e.g. 
when fetching the indefinite article in -- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice. placetype = export.resolve_placetype_aliases(placetype) if export.placetype_data[placetype] then return export.placetype_data[placetype][key] else return nil end end --[==[ Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list { {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e. # the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are zero such qualifiers, the value will be nil); # a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil); # the "reduced placetype" on the right. Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified. For example, given the placetype `"small beachside unincorporated community"`, the return value will be { { {nil, nil, "small beachside unincorporated community"}, {nil, "small", "beachside unincorporated community"}, {"small", "[[beachfront]]", "unincorporated community"}, {"small [[beachfront]]", "[[unincorporated]]", "community"}, }} Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to `"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`. 
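The left-to-right splitting shown in this first example can be tried in isolation with the following minimal sketch. It is only an illustration under stated assumptions: `toy_qualifiers` and `toy_split` are hypothetical names that are not part of this module, the toy table models just three qualifiers, and placetype aliases are not resolved.

```lua
-- Hypothetical miniature stand-in for `placetype_qualifiers` (assumption for illustration only):
-- `false` leaves the qualifier as-is, `true` wraps it in a link, a string is the canonical replacement.
local toy_qualifiers = {
	small = false,
	beachside = "[[beachfront]]",
	unincorporated = true,
}

-- Split recognized qualifiers off the left of `placetype`, one space at a time, stopping at the
-- first word that is not a recognized qualifier. Each split is {prev_qualifiers, this_qualifier, rest}.
local function toy_split(placetype)
	local splits = {{nil, nil, placetype}}
	local prev = nil
	while true do
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		if not qualifier then
			break
		end
		local canon = toy_qualifiers[qualifier]
		if canon == nil then
			break
		end
		local this = qualifier
		if canon == true then
			this = "[[" .. qualifier .. "]]"
		elseif canon ~= false then
			this = canon
		end
		table.insert(splits, {prev, this, rest})
		prev = prev and (prev .. " " .. this) or this
		placetype = rest
	end
	return splits
end

-- Print each split; nil entries are shown as "nil".
for _, split in ipairs(toy_split("small beachside unincorporated community")) do
	print(tostring(split[1]), tostring(split[2]), split[3])
end
```

The sketch yields four splits for this input, matching the list above; an unrecognized word such as `haunted` would simply end the loop early.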
On the other hand, if given `"small former haunted community"`, the return value will be { { {nil, nil, "small former haunted community"}, {nil, "small", "former haunted community"}, {"small", "former", "haunted community"}, }} because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers. Finally, if given `"former adr"`, the return value will be { { {nil, nil, "former adr"}, {nil, "former", "administrative region"}, }} because `"adr"` is a recognized placetype alias for `"administrative region"`. ]==] function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers) local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} local prev_qualifier = nil while true do local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if canon == nil then break end local new_qualifier = qualifier if type(canon) == "table" then canon = canon.link end if not no_canon_qualifiers and canon ~= false then if canon == true then new_qualifier = "[[" .. qualifier .. "]]" else new_qualifier = canon end end insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)}) prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier placetype = reduced_placetype else break end end return splits end --[==[ Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the words in the passed-in placetype (always including the rightmost word in the placetype, i.e. 
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`. '''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`. For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn: ``` {qualifier = nil, placetype="left tributary"} {qualifier = "left", placetype="tributary"} {qualifier = "left", placetype="แม่น้ำ"} ``` and the return value will be { { {qualifier = "left", placetype="tributary"}, {qualifier = "left", placetype="แม่น้ำ"}, }} The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it). 
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next. Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are: ``` {qualifier = nil, placetype="small rural fraziones"} {qualifier = nil, placetype="small rural frazione"} {qualifier = "small", placetype="rural fraziones"} {qualifier = "small", placetype="rural frazione"} {qualifier = "small [[rural]]", placetype="fraziones"} {qualifier = "small [[rural]]", placetype="frazione"} {qualifier = "small [[rural]]", placetype="hamlet"} {qualifier = "small [[rural]]", placetype="village"} ``` The return value ends up as { { {qualifier = "small [[rural]]", placetype="frazione"}, {qualifier = "small [[rural]]", placetype="hamlet"}, {qualifier = "small [[rural]]", placetype="village"}, }} Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up being returned. `no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym. 
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if ptdata then insert_equiv(canon_placetype) if no_fallback then return end local first_placetype = #equivs + 1 local prev_placetype = nil while true do local pt_value = export.placetype_data[canon_placetype] if not pt_value then internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`", canon_placetype, prev_placetype) end if pt_value.fallback then insert_equiv(pt_value.fallback) local last_placetype = #equivs if last_placetype - first_placetype >= 10 then local fallback_loop = {} for i = first_placetype, last_placetype do insert(fallback_loop, equivs[i].placetype) end internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> ")) end prev_placetype = canon_placetype canon_placetype = pt_value.fallback else break end end end end -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a -- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no -- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that -- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for -- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a -- `former`-type directive); these backups live outside this function because we want them done once, late, rather -- than in each invocation of `process_and_insert_placetype()`. local function process_and_insert_placetype(qualifier, reduced_placetype) if form_of_directive then -- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of -- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks. 
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier process_and_insert_placetype(qualifier, reduced_placetype) -- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced -- placetypes to avoid the "overseas territory treated as a territory" issue described above. if no_fallback then local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category) if canon_placetype then break end end end -- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g. -- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype -- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g. -- [[Category:en:Former names of places]] in an invocation like -- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}}; -- the `used from 1971–1997` gets treated as a placetype and we're called on it. if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then insert_placetype_and_fallbacks(nil, form_of_directive .. " place") end return equivs end function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only) for _, equiv in ipairs(equivs) do local retval = fun(equiv.placetype) if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then return retval, equiv end end return nil, nil end --[==[ Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false}); but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value. 
(FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers.) When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value. ]==] function export.get_equiv_placetype_prop(placetype, fun, props) if not placetype then return fun(nil), nil end return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun, props and props.continue_on_nil_only) end --[==[ Return the article that is used with an entry placetype. We proceed as follows: # See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article). This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`. # Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that `"the"` should be used. # Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from the beginning) has an entry in `placetype_data` that specifies the indefinite article using `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`). # Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with a vowel and `"a"` otherwise. If `ucfirst` is true, the first letter of the article is made upper-case. 
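The precedence described above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not this module's implementation: the tables `sketch_qualifiers` and `sketch_placetypes` and the function `sketch_get_article` are hypothetical stand-ins, the sample entries are assumed, the lookup ignores equivalent placetypes, and the vowel test is a naive substitute for [[Module:en-utilities]].

```lua
-- Hypothetical stand-in data, assumed for illustration only.
local sketch_qualifiers = {
	various = {article = false}, -- `false` means "use no article"
}
local sketch_placetypes = {
	["capital"] = {entry_placetype_use_the = true},
	["union territory"] = {entry_placetype_indefinite_article = "a"},
}

local function sketch_get_article(placetype)
	-- Step 1: a leading qualifier may dictate the article outright.
	local qualifier = placetype:match("^(.-) ")
	local qdata = qualifier and sketch_qualifiers[qualifier]
	if qdata and qdata.article ~= nil then
		return qdata.article
	end
	local ptdata = sketch_placetypes[placetype]
	-- Step 2: the placetype asks for "the" (equivalents are not searched here).
	if ptdata and ptdata.entry_placetype_use_the then
		return "the"
	end
	-- Step 3: the placetype itself names its indefinite article.
	if ptdata and ptdata.entry_placetype_indefinite_article then
		return ptdata.entry_placetype_indefinite_article
	end
	-- Step 4: naive fallback based on the first letter.
	if placetype:find("^[aeiouAEIOU]") then
		return "an"
	end
	return "a"
end
```

With this stand-in data, `various capitals` yields `false`, `capital` yields `"the"`, `union territory` yields `"a"` (where the naive vowel test alone would give `"an"`), and `island` yields `"an"`.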
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
if category_type == "top-level" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_top_level") elseif category_type == "noncity" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity") elseif category_type == "city" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_city") else internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate category_type) end if type(raw_link) == "string" then return maybe_prefix(raw_link), ptdata elseif raw_link ~= nil then return raw_link, ptdata end raw_link, fetched_full = fetch_maybe_full("category_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end if ptmatch == "plural" then raw_link, fetched_full = fetch_maybe_full("plural_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end end if raw_link == nil then raw_link, fetched_full = fetch_maybe_full("link") end if raw_link == false then return raw_link, ptdata end return maybe_prefix(make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata else if ptmatch == "plural" then raw_link = ptdata.plural_link if raw_link == false then process_error("Placetype %s cannot appear plural", placetype) end if is_linked_string(raw_link) then return raw_link, ptdata end end if raw_link == nil then raw_link = ptdata.link end return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata end end return nil end local function resolve_unlinked_placename_display_aliases(placetype, placename) local equiv_placetypes = export.get_placetype_equivs(placetype) for i, equiv in ipairs(equiv_placetypes) do equiv_placetypes[i] = equiv.placetype end 
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
											this_holonym_matches = true
											break
										end
									end
									if this_holonym_matches then
										-- Alias matched above when resolved.
										holonym_matches_at_level = true
										break
									else
										-- Not an alias, or doesn't match when resolved. We have a true mismatch.
										holonym_exists_with_same_placetype = true
									end
								end
							end
						end
						if holonym_matches_at_level then
							match_at_level = true
							break
						end
						if holonym_exists_with_same_placetype then
							mismatch_at_level = true
						end
					end
				end
				if not match_at_level and mismatch_at_level then
					containers_mismatch = true
					break
				end
			end
			if not containers_mismatch then
				return group, key, spec, container_trail
			end
		end
	end
end

--[==[
If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return
the corresponding key, spec and group as well as the trail of ancestral containers. This is like
`iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this
would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To
fix this, specify additional following disambiguating holonyms, e.g.
{{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.)
]==]
function export.find_matching_holonym_location(data)
	local all_found = {}
	for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do
		insert(all_found, {group, key, spec, container_trail})
	end
	if not all_found[1] then
		return nil
	elseif all_found[2] then
		local holonym_placetype = data.holonym_placetype
		if type(holonym_placetype) == "table" then
			holonym_placetype = concat(holonym_placetype, ",")
		end
		local found_keys = {}
		for _, found in ipairs(all_found) do
			local _, key, _, _ = unpack(found)
			insert(found_keys, key)
		end
		error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " ..
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
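As a rough illustration of the mapping described above, alias resolution can be pictured as a plain table lookup that
falls back to its input. This is only a sketch: `sample_aliases` and `canonicalize_placetype` are hypothetical names
used for illustration and are not defined in this module; the three entries are copied from the table below.

```lua
-- Hypothetical excerpt of the alias table, for illustration only.
local sample_aliases = {
	["c"] = "ประเทศ",
	["s"] = "รัฐ",
	["cdp"] = "census-designated place",
}

-- Hypothetical helper (an assumption, not part of this module): return the canonical placetype for
-- `placetype`, or `placetype` itself if it is not a known alias.
local function canonicalize_placetype(placetype)
	return sample_aliases[placetype] or placetype
end

print(canonicalize_placetype("cdp"))     -- an alias: mapped to its canonical form
print(canonicalize_placetype("village")) -- not an alias: returned unchanged
```

Because the lookup falls back to its argument, placetypes that are already canonical pass through untouched.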
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "มหานคร", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan county"] = 
"non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true,
	["chalk"] = true,
	["karst"] = true,
	["limestone"] = true,
	["mountainous"] = true,
	["mountaintop"] = true,
	["alpine"] = true,
	["volcanic"] = true, -- for an island
	-- political status qualifiers
	["autonomous"] = true,
	["incorporated"] = true,
	["special"] = true,
	["unincorporated"] = true,
	["coterminous"] = true,
	-- monetary status/etc. qualifiers
	["fashionable"] = true,
	["wealthy"] = true,
	["affluent"] = true,
	["declining"] = true,
	-- city vs. rural qualifiers
	["urban"] = true,
	["suburban"] = true,
	["exurban"] = true,
	["outlying"] = true,
	["remote"] = true,
	["rural"] = true,
	["outback"] = true,
	["inner"] = false,
	["inner-city"] = true,
	["central"] = false,
	["outer"] = false,
	-- land use qualifiers
	["residential"] = true,
	["agricultural"] = true,
	["business"] = true,
	["commercial"] = true,
	["industrial"] = true,
	-- business use qualifiers
	["railroad"] = true,
	["railway"] = true,
	["farming"] = true,
	["fishing"] = true,
	["mining"] = true,
	["logging"] = true,
	["cattle"] = true,
	-- tourism use qualifiers
	["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence
	["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence
	["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence
	-- religious qualifiers
	["holy"] = true,
	["sacred"] = true,
	["religious"] = true,
	["secular"] = true,
	-- qualifiers for nonexistent places
	["claimed"] = false,
	["fictional"] = true,
	["legendary"] = true,
	["mythical"] = true,
	["mythological"] = true,
	-- directional qualifiers
	["northern"] = false,
	["southern"] = false,
	["eastern"] = false,
	["western"] = false,
	["north"] = false,
	["south"] = false,
	["east"] = false,
	["west"] = false,
	["northeastern"] = false,
	["southeastern"] = false,
	["northwestern"] = false,
	["southwestern"] = false,
	["northeast"] = false,
	["southeast"] = false,
	["northwest"] = false,
	["southwest"] = false,
	-- seasonal qualifiers
	["summer"] = true, -- e.g.
for 'summer capital' ["winter"] = true, -- legal status qualifiers -- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]]. ["official"] = true, ["unofficial"] = true, ["de facto"] = true, -- 'de facto capital' ["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link ["de jure"] = true, -- 'de jure capital' ["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link -- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state' -- misc. qualifiers ["planned"] = true, ["chartered"] = true, ["landlocked"] = true, ["uninhabited"] = true, -- superlative qualifiers ["first"] = no_link_def_article, ["second"] = no_link_def_article, -- for "second largest" etc. ["third"] = no_link_def_article, ["fourth"] = no_link_def_article, ["last"] = no_link_def_article, ["only"] = no_link_def_article, ["sole"] = no_link_def_article, ["main"] = no_link_def_article, ["largest"] = no_link_def_article, ["biggest"] = no_link_def_article, ["smallest"] = no_link_def_article, ["shortest"] = no_link_def_article, ["longest"] = no_link_def_article, ["tallest"] = no_link_def_article, ["highest"] = no_link_def_article, ["lowest"] = no_link_def_article, ["leftmost"] = no_link_def_article, ["rightmost"] = no_link_def_article, ["innermost"] = no_link_def_article, ["outermost"] = no_link_def_article, ["northernmost"] = no_link_def_article, ["southernmost"] = no_link_def_article, ["westernmost"] = no_link_def_article, ["easternmost"] = no_link_def_article, ["northwesternmost"] = no_link_def_article, ["southwesternmost"] = no_link_def_article, ["northeasternmost"] = no_link_def_article, ["southeasternmost"] = no_link_def_article, -- several/various ["several"] = no_link_no_article, ["various"] = no_link_no_article, ["numerous"] = no_link_no_article, ["multiple"] = no_link_no_article, ["many"] = no_link_no_article, ["other"] = no_link_no_article, } --[==[ var: In this table, the key qualifiers should 
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and
`qualifier_to_placetype_equivs`.
]==]
export.former_qualifiers = {
	["abandoned"] = {"FORMER"},
	["ancient"] = {"ANCIENT", "FORMER"},
	["former"] = {"FORMER"},
	["extinct"] = {"FORMER"},
	["historic"] = {"FORMER"},
	["historical"] = {"FORMER"},
	["medieval"] = {"ANCIENT", "FORMER"},
	["mediaeval"] = {"ANCIENT", "FORMER"},
	["ruined"] = {"ANCIENT", "FORMER"},
	["traditional"] = {"FORMER"},
}

--[==[ var:
In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the
specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`.
]==]
export.qualifier_to_placetype_equivs = {
	["fictional"] = "fictional location",
	["legendary"] = "mythological location",
	["mythical"] = "mythological location",
	["mythological"] = "mythological location",
	-- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands
	-- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are
	-- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital.
	["claimed"] = "claimed political division",
}

--[==[ var:
Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse
mapping also exists.
]==]
export.placetype_to_capital_cat = {
	["autonomous community"] = "autonomous community capitals",
	["canton"] = "cantonal capitals",
	["comarca"] = "comarca capitals",
	["ประเทศ"] = "เมืองหลวงของประเทศ",
	-- The following are not obviously different from 'county seats' but the latter terminology is used in the US.
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล", ["department"] = "departmental capitals", ["อำเภอ"] = "เมืองหลวงของอำเภอ", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["มหานคร"] = "เมืองหลวงของมหานคร", ["เทศบาล"] = "เมืองหลวงของเทศบาล", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "เมืองหลวงของจังหวัด", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "เมืองหลวงของภูมิภาค", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "เมืองหลวงของรัฐ", ["ดินแดน"] = "เมืองหลวงของดินแดน", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
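A minimal sketch of how such implications might be applied, assuming holonyms are simple records with `placetype` and
`placename` fields. `sample_implications` and `add_implied_holonyms` are hypothetical names used only for illustration;
they are not functions of this module, and the actual categorization code may work differently.

```lua
-- Hypothetical excerpt of the implications table, for illustration only.
local sample_implications = {
	["ภูมิภาค"] = {
		["Siberia"] = {"country/Russia", "continent/Asia"},
	},
}

-- Hypothetical helper (an assumption, not part of this module): return a copy of `holonyms` with any
-- implied holonyms appended at the end, leaving the original list untouched.
local function add_implied_holonyms(holonyms)
	local result = {}
	for _, holonym in ipairs(holonyms) do
		result[#result + 1] = holonym
	end
	for _, holonym in ipairs(holonyms) do
		local by_placetype = sample_implications[holonym.placetype]
		local implied = by_placetype and by_placetype[holonym.placename]
		for _, spec in ipairs(implied or {}) do
			-- Each implication has the form `placetype/placename`.
			local placetype, placename = spec:match("^([^/]+)/(.+)$")
			result[#result + 1] = {placetype = placetype, placename = placename}
		end
	end
	return result
end

local input = { {placetype = "ภูมิภาค", placename = "Siberia"} }
for _, holonym in ipairs(add_implied_holonyms(input)) do
	print(holonym.placetype .. "/" .. holonym.placename)
end
```

Appending to a copy keeps the displayed holonym list separate from the list used for categorization, which matches the
"categorization (but not display)" behaviour described above.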
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/Europe"}, ["Central Europe"] = {"continent/Europe"}, ["Western Europe"] = {"continent/Europe"}, ["South Europe"] = {"continent/Europe"}, ["Southern Europe"] = {"continent/Europe"}, ["Northern Europe"] = {"continent/Europe"}, ["Northeast Europe"] = {"continent/Europe"}, ["Northeastern Europe"] = {"continent/Europe"}, ["Southeast Europe"] = {"continent/Europe"}, ["Southeastern Europe"] = {"continent/Europe"}, ["North Caucasus"] = {"continent/Europe"}, ["South Caucasus"] = {"continent/Asia"}, ["South Asia"] = {"continent/Asia"}, ["Southern Asia"] = {"continent/Asia"}, ["East Asia"] = {"continent/Asia"}, ["Eastern Asia"] = {"continent/Asia"}, ["Central Asia"] = {"continent/Asia"}, ["West Asia"] = {"continent/Asia"}, ["Western Asia"] = {"continent/Asia"}, ["Southeast Asia"] = {"continent/Asia"}, ["North Asia"] = {"continent/Asia"}, ["Northern Asia"] = {"continent/Asia"}, ["Anatolia"] = {"continent/Asia"}, ["Asia Minor"] = {"continent/Asia"}, ["Mesopotamia"] = {"continent/Asia"}, ["North Africa"] = {"continent/Africa"}, ["Central Africa"] = {"continent/Africa"}, ["West Africa"] = {"continent/Africa"}, ["East Africa"] = {"continent/Africa"}, ["Southern Africa"] = {"continent/Africa"}, ["Central America"] = {"continent/Central America"}, ["Caribbean"] = {"continent/North America"}, ["Polynesia"] = {"continent/Oceania"}, ["Micronesia"] = {"continent/Oceania"}, ["Melanesia"] = {"continent/Oceania"}, ["Siberia"] = {"country/Russia", "continent/Asia"}, ["Russian Far East"] = {"country/Russia", "continent/Asia"}, ["South Wales"] = {"constituent country/Wales", "continent/Europe"}, ["Balkans"] = {"continent/Europe"}, ["West Bank"] = {"country/Palestine", "continent/Asia"}, ["Gaza"] = {"country/Palestine", "continent/Asia"}, ["Gaza Strip"] = {"country/Palestine", "continent/Asia"}, } } ------------------------------------------------------------------------------------------ -- Category and display 
handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. 
local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
local function insert_specific_variant_if_possible(key, spec) return export.get_equiv_placetype_prop(holonym_placetype, function(pt) local plural_holonym_placetype = export.pluralize_placetype(pt) local saw_matching_div if spec.divs then local divs = spec.divs if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) ~= "table" then div = {type = div} end if plural_holonym_placetype == div.type then saw_matching_div = true break end end end if saw_matching_div then insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec))) return true end return false end) end if insert_specific_variant_if_possible(matching_key, matching_spec) then inserted_specific_variant_cat = true elseif not matching_spec.no_container_cat then for _, containers in ipairs(matching_container_trail) do local saw_no_container_cat = false for _, container in ipairs(containers) do if insert_specific_variant_if_possible(container.key, container.spec) then inserted_specific_variant_cat = true break end saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat end if inserted_specific_variant_cat or saw_no_container_cat then break end end end end else -- This happens when in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for -- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing. -- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to -- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab. -- Possibly we can just skip this case entirely. 
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "เมืองหลวง") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms. ]==] function export.augment_holonyms_with_container(place_descs) for _, place_desc in ipairs(place_descs) do if place_desc.holonyms then -- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their -- appropriate position. We don't just put them at the end because some holonyms use the `:also` -- modifier, which causes category processing to restart at that point after generating categories for a -- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with -- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy -- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's -- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g. -- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the -- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If -- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude -- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`), -- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar -- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration -- rather than modifying the place description once at the end. 
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
This is to correctly handle a situation like -- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here, -- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and -- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from -- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to -- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match -- in find_matching_holonym_location() because of the mismatch between augmented -- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later -- calls to find_matching_holonym_location() fail to match `Gard` (and likewise -- `Ardèche`) against any known location. To deal with this, we mark augmented -- holonyms as being augmented due to a source holonym, and when processing a given -- holonym, ignore augmented holonyms from other holonyms of the same placetype. -- The restriction to the same placetype is so that `Birmingham` still gets -- correctly disambiguated to Birmingham, England in the example given above near -- the top of this function, using the augmented holonym `c/United Kingdom` added by -- the specified `cc/England` (whose placetype `constituent country` differs from -- the placetype `city` of Birmingham). augmented_from_holonym = holonym, } insert(augmented_holonyms, new_holonym) -- But it is safe to modify other parts of the place_desc. export.key_holonym_into_place_desc(place_desc, new_holonym) end if any_no_auto_augment_container then break end end for j = i + 1, #place_desc.holonyms do insert(augmented_holonyms, place_desc.holonyms[j]) end place_desc.holonyms = augmented_holonyms end end end end end end -- Cat handler for district, areas, neighborhoods and suburbs. Districts are tricky because they can either be political -- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city -- neighborhoods or larger geographical areas/regions. 
We handle this as follows: -- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if -- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that -- categorizes into [[:Category:Districts of Maharashtra, India]]. -- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called -- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g. -- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.) -- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set. -- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if -- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize -- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" not "ของ", and -- note the spelling "neighborhoods" because we're in the US.) -- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're -- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or -- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no -- categorization. local function district_neighborhood_cat_handler(data) local function get_plural_entry_placetype(location_spec, container_trail) if data.entry_placetype == "suburb" then return "Suburbs" else -- Check for `british_spelling` setting on the spec itself or any container. 
local uses_british_spelling = location_spec.british_spelling if uses_british_spelling == nil and container_trail then for _, container_set in ipairs(container_trail) do local must_outer_break = false for _, container in ipairs(container_set) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end return uses_british_spelling and "Neighbourhoods" or "Neighborhoods" end end -- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire, -- etc.) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and spec.is_city then return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)} end -- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like -- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.) local has_neighborhoods local entry_placetype = data.entry_placetype if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then has_neighborhoods = true else -- Otherwise, make sure the current holonym is city-like. has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt) return export.get_placetype_prop(pt, "has_neighborhoods") end, {continue_on_nil_only = true}) end if has_neighborhoods then -- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written -- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}} -- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". 
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then return holonym_placename end if holonym_placename:find("%[%[") then return prefix .. " " .. holonym_placename end return prefix .. " [[" .. holonym_placename .. "]]" end -- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms. -- Works identically to prefix_display_handler but for suffixes instead of prefixes. local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then return holonym_placename end if holonym_placename:find("%[%[") then return holonym_placename .. " " .. suffix end if include_suffix_in_link then return "[[" .. holonym_placename .. " " .. suffix .. "]]" else return "[[" .. holonym_placename .. "]] " .. suffix end end -- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed -- with "borough". local function borough_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.new_york_boroughs[unlinked_placename] then -- Hack: don't display "borough" after the names of NYC boroughs return holonym_placename end return suffix_display_handler("borough", holonym_placename) end local function county_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) -- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]". if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or m_locations.northern_ireland_counties["County " .. unlinked_placename .. 
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are displayed as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
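As a rough illustration of the layout described above, a hand-written sketch follows. It is not taken from the actual table: the name `example_placetype_data` and the helper `is_category_only` are hypothetical, the `category_link` string is a placeholder, and the `link = true` value for `branch` is assumed for illustration. Only the `fallback` and `preposition` values of `branch` come from this documentation (the individual property keys are described below).

```lua
-- Hypothetical, trimmed-down stand-in for the real placetype table.
local example_placetype_data = {
	-- Ordinary entry placetype: singular key, with the required `link` property.
	["polity"] = {
		link = true,
	},
	-- `branch` categorizes like `river` but displays with its own preposition.
	["branch"] = {
		link = true, -- assumed value, for illustration only
		fallback = "river",
		preposition = "ของ",
	},
	-- Category-only placetype: plural key followed by `!`.
	["states and territories!"] = {
		category_link = "states and territories", -- placeholder description
	},
}

-- Hypothetical helper: category-only keys are those ending in `!`.
local function is_category_only(key)
	return key:sub(-1) == "!"
end
```

With a table of this shape, telling the two kinds of keys apart only needs a check of the final character of the key, which is why the sketch keeps that test in one small function.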
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in the placetype data; everything else is handled through category handlers, either per-placetype or special (such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is described at the top of [[Module:place]]. There are several recognized property keys, of various types: 1. The following link-related property keys are recognized: * `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized placetypes (which may occur in entry placetypes, esp. new-format ones, such as `two <<islands>>`, and may occur in categories). The possible values are: *# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`. *# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g. `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the placetype is given plural. *# `"+..."`: Create a two-part link to the entry following the `+` sign. 
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`.
* `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is.
* `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category.
* `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category.
* `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city.
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
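
As an illustration of how the property keys above combine, here is a minimal sketch of two entries. It is written from the descriptions in this comment and is not a copy of the live table. The `autonomous okrug` fields mirror the entry of that name in `placetype_data`, and `plural = "kibbutzim"` is the value quoted above. The table name `example_placetype_data` and the `link` and `class` values given for `kibbutz` are assumptions, included only so that the fragment is complete.

```lua
-- Hypothetical fragment for illustration; not part of the real `placetype_data` table.
local example_placetype_data = {
	["autonomous okrug"] = {
		link = true, -- raw link to the same-named entry
		preposition = "ของ",
		-- Append the capitalized placetype after the holonym, so that
		-- `aokr/Nenets` displays as "Nenets Autonomous Okrug" ...
		affix_type = "Suf",
		-- ... unless the holonym already contains the word "okrug".
		no_affix_strings = "okrug",
		class = "subpolity",
	},
	["kibbutz"] = {
		link = true, -- assumed for illustration
		-- Irregular plural; also lets `kibbutzim` be singularized to `kibbutz`.
		plural = "kibbutzim",
		class = "settlement", -- assumed for illustration
	},
}
```

Neither entry sets a `fallback`, so each has to carry its own `class`, as the note that follows requires.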
'''NOTE:'''
# The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified.
# Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype.
# It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs.
]==]

export.placetype_data = {
	--[=[
	If you need to sort the following, do this (using Vim):
	1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries.
	2. Make sure the table uses tabs everywhere for indent, and not spaces.
	3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence:
		:'a,.s/\n/\\n/g
		:s/\\n\(\t\[\)/\r\1/g
	   The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.)
	4. Execute the following to sort:
		:'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? @@@ //'
	   Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station" before "hill" and "county borough" before "เทศมณฑล" because the space after e.g. "hill station" sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern).
	5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing
		:'a,.s/\\n/\r/g
	   Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but to insert a newline on the right side of a replacement you must use \r.
	]=]
	["*"] = {
		link = false,
		cat_handler = generic_place_cat_handler,
	},
	["administrative atoll"] = {
		-- Maldives
		link = "+w:administrative divisions of the Maldives",
		preposition = "ของ",
		class = "subpolity",
	},
	["administrative capital"] = {
		link = "w",
		fallback = "เมืองหลวง",
	},
	["administrative center"] = {
		link = "w",
		fallback = "เมืองหลวงที่ไม่ใช่นคร",
	},
	["administrative centre"] = {
		link = "w",
		fallback = "administrative center",
	},
	["administrative county"] = {
		link = "w",
		fallback = "เทศมณฑล",
	},
	["administrative district"] = {
		link = "w",
		fallback = "อำเภอ",
	},
	["administrative headquarters"] = {
		link = "separately",
		fallback = "administrative centre",
	},
	["administrative region"] = {
		link = true,
		preposition = "ของ",
		suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)"
		fallback = "ภูมิภาค",
		class = "subpolity",
	},
	["administrative seat"] = {
		link = "w",
		fallback = "administrative centre",
	},
	["administrative territory"] = {
		link = "separately",
		preposition = "ของ",
		suffix = "ดินแดน", -- but prefix is still "administrative territory (of)"
		fallback = "ดินแดน",
		class = "subpolity",
	},
	["administrative unit"] = {
		-- Grrr, it's difficult to generalize about "administrative units".
		-- In Albania, "administrative unit" is an
		-- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term
		-- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types
		-- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad
		-- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need
		-- to include this so that it can be used as a placetype for Albania, categorizing as communes.
		link = "w",
		class = "subpolity",
	},
	["administrative village"] = {
		link = "w",
		preposition = "ของ",
		has_neighborhoods = true,
		class = "settlement",
	},
	["aimag"] = {
		-- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province;
		-- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district.
		link = "w",
		fallback = "prefecture",
	},
	["airport"] = {
		link = true,
		class = "man-made structure",
		default = {true},
	},
	["alliance"] = {
		link = true,
		fallback = "confederation",
	},
	["archipelago"] = {
		link = true,
		fallback = "เกาะ",
	},
	["area"] = {
		link = true,
		preposition = "ของ",
		fallback = "geographic and cultural area",
		-- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former
		-- when categorizing 'Areas' but the latter when handling e.g. 'historical area'.
		class = "subpolity",
		former_type = "geographic region",
		cat_handler = district_neighborhood_cat_handler,
	},
	["arm"] = {
		link = true,
		preposition = "ของ",
		class = "natural feature",
		default = {"ทะเล"},
	},
	["arrondissement"] = {
		link = true,
		preposition = "ของ",
		-- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions
		-- of departments or provinces. Need to conditionalize on the country for both of the following.
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "เมืองหลวง", }, ["เมืองหลวง"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "เมืองหลวง", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"}, default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"}, }, ["civil parish"] = { -- Mostly 
England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "เมืองหลวง", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
category_link = "[[county|counties]] and [[county-level city|county-level cities]]", class = "subpolity", }, ["continent"] = { link = true, category_link = false, -- can't occur as a bare category class = "natural feature", default = {"Continents and continental regions"}, }, ["continental region"] = { link = "separately", category_link = false, -- can't occur as a bare category class = "geographic region", fallback = "continent", }, ["continents and continental regions!"] = { category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])", class = "geographic region", }, ["council area"] = { link = true, -- in Scotland; similar to a county preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["ประเทศ"] = { link = true, class = "polity", -- do not translate class ["continent/*"] = {true, "ประเทศ"}, default = {true}, }, ["country-like entities!"] = { category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]", class = "polity", -- do not translate class }, ["เทศมณฑล"] = { link = true, preposition = "ของ", display_handler = county_display_handler, class = "subpolity", }, ["county borough"] = { link = true, -- in Wales; similar to a county preposition = "ของ", affix_type = "suf", fallback = "borough", class = "subpolity", }, ["county seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["county town"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", fallback = "เมือง", has_neighborhoods = true, class = "capital", }, ["county-administered city"] = { -- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city. -- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city. 
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "เมืองหลวง", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["มหานคร"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"มหานคร", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are province-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "เมืองหลวง", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["เมืองหลวงที่ไม่ใช่นคร"] = { link = "[[เมืองหลวง]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", -- do not translate class bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "เมืองหลวง", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "เมืองหลวง", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since places in Hong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "เมืองหลวง", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", -- do not translate class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", -- do not translate class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", -- do not translate class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "เมืองหลวง", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nicknames]] for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
the United States full_category_link = "[[official]] [[nicknames]] for [[state]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for states", addl_bare_category_parents = {"รัฐ"}, }, ["OFFICIAL_NICKNAME_FOR place"] = { link = false, default = {"Official nicknames for places"}, }, ["OFFICIAL_NICKNAME_FOR state"] = { link = false, default = {"Official nicknames for states"}, }, } export.plural_placetype_to_singular = {} for sg_placetype, spec in pairs(export.placetype_data) do if spec.plural then export.plural_placetype_to_singular[spec.plural] = sg_placetype end end return export 0xvw51pbnx3kitsu9mw03hxjjrzw4y6 5720709 5720700 2026-04-21T02:03:14Z OctraBot 3198 5720709 Scribunto text/plain local export = {} export.force_cat = false -- set to true for testing local m_locations = require("Module:place/locations") local m_links = require("Module:links") local m_table = require("Module:table") local m_strutils = require("Module:string utilities") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local dump = mw.dumpObject local insert = table.insert local concat = table.concat local internal_error = m_locations.internal_error export.internal_error = internal_error local process_error = m_locations.process_error export.process_error = process_error local unpack = unpack or table.unpack -- Lua 5.2 compatibility local ucfirst = m_strutils.ucfirst local ulower = m_strutils.lower local rmatch = m_strutils.match local split = m_strutils.split --[==[ intro: This module contains placetype data used by [[Module:place]] and {{tl|place}}, along with a significant amount of code to work with both placetypes and locations, as well as some placename-related info (FIXME: Consider moving it to [[Module:place/locations]]). See also [[Module:place/locations]], which has definitions of all known locations. You must currently load this module using {{cd|require()}}, not using {{cd|mw.loadData()}}. 
In particular, it contains two fundamental and tricky functions: # `get_placetype_equivs`, which finds the equivalent placetypes to look under in order to find a given property, and in the process correctly handles placetypes with qualifiers (including qualifiers that act similar to "type-raising" operators in that they do something non-trivial to the placetype to their right) as well as form-of directives and fallbacks. # `find_matching_holonym_location`, which looks up a holonym to find a matching known location, but in the process checks holonyms to the right to make sure there isn't a clash between the user-specified containing holonyms and the containers of the known location being considered. This is done to prevent overcategorizing when either there are two known locations with the same name (e.g. Birmingham in England and Birmingham, Alabama in the US), or more generally two locations with the same name, one of which is a known location but where the other is not (e.g. we're processing non-known-location Mérida, Spain and don't want it categorized like known location Mérida, Yucatán, Mexico). Both of these functions are invoked repeatedly, and probably are invoked several times on the same inputs and as a result are candidates for memoization to speed up the operation of {{tl|place}}. ]==] ------------------------------------------------------------------------------------------ -- Basic utilities -- ------------------------------------------------------------------------------------------ --[==[ Return true if `force_cat` is set either in this module or in [[Module:place/locations]]. ]==] function export.get_force_cat() return export.force_cat or m_locations.force_cat end -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. 
page) return true end function export.remove_links_and_html(text) text = m_links.remove_links(text) return text:gsub("<.->", "") end --[==[ Return the singular version of a maybe-plural placetype, or nil if not plural. This correctly handles placetypes with irregular plurals such as `kibbutzim` plural of `kibbutz` by looking up in a table constructed from the `plural` values specified in `placetype_data`. If a special plural value is not found, the regular singularization algorithm in [[Module:en-utilities]] is invoked, which reverses the y -> ies change after vowels and the 'es' addition after sh/ch/x, and otherwise just subtracts a final 's' (which will incorrectly generate 'passe' for plural 'passes'; FIXME: consider changing this for words ending in '-sses'). If the generated singular is the same as the passed-in value, nil is returned. ]==] function export.maybe_singularize_placetype(placetype) if not placetype then return nil end if export.plural_placetype_to_singular[placetype] then return export.plural_placetype_to_singular[placetype] end local retval = --[[require(en_utilities_module).singularize(placetype)]] placetype if retval == placetype then return nil end return retval end -- Return the correct plural of a placetype, and (if `do_ucfirst` is given) make the first letter uppercase. We first -- look up the plural in `placetype_data`, falling back to pluralize() in [[Module:en-utilities]], which is almost -- always correct. function export.pluralize_placetype(placetype, do_ucfirst) local ptdata = export.placetype_data[placetype] if ptdata and ptdata.plural then placetype = ptdata.plural else placetype = --[[require(en_utilities_module).pluralize(placetype)]] placetype end if do_ucfirst then return ucfirst(placetype) else return placetype end end --[==[ Get the data associated with a placetype, which may be in its singular or plural form. If `from_category` is specified, we also look for category-only placetypes (generally plural) followed by `!`. 
Return three values: (a) the placetype under which the data can be looked up (i.e. in its singular form if the passed-in `placetype` is plural and did not match a category-only placetype followed by `!`); (b) the placetype data structure; (c) the type of `placetype` match that occurred, one of `"direct"` if the canonical placetype is the same as the passed-in `placetype` and also the same as the key under which `ptdata` was looked up, or `"direct-category"` if the `ptdata` was looked up under a key formed from the passed-in `placetype` by adding `!`, or `"plural"` if the `ptdata` was looked up under the singularized version of the plural passed-in `placetype`. ]==] function export.get_placetype_data(placetype, from_category) local ptdata = export.placetype_data[placetype] if ptdata then return placetype, ptdata, "direct" end if from_category then ptdata = export.placetype_data[placetype .. "!"] if ptdata then return placetype .. "!", ptdata, "direct-category" end end local sg_placetype = export.maybe_singularize_placetype(placetype) if sg_placetype then ptdata = export.placetype_data[sg_placetype] if ptdata then return sg_placetype, ptdata, "plural" end end return nil end --[==[ Check for special pseudo-placetypes that should be ignored for categorization purposes. ]==] function export.placetype_is_ignorable(placetype) return placetype == "and" or placetype == "or" or placetype == "และ" or placetype == "หรือ" or placetype:find("^%(") end function export.resolve_placetype_aliases(placetype) return export.placetype_aliases[placetype] or placetype end --[==[ Return a property from `placetype_data` for a given placetype. If the placetype isn't found in `placetype_data`, or the key isn't found in the placetype's entry in `placetype_data`, return nil. ]==] function export.get_placetype_prop(placetype, key) -- Usually we are called on equivalent placetypes returned from `get_placetype_equivs`, in which case placetype -- aliases have been resolved, but sometimes not, e.g. 
when fetching the indefinite article in -- get_placetype_article(). `resolve_placetype_aliases` is just a simple lookup and it doesn't hurt to do it twice. placetype = export.resolve_placetype_aliases(placetype) if export.placetype_data[placetype] then return export.placetype_data[placetype][key] else return nil end end --[==[ Given a placetype, split the placetype into one or more potential ''splits'', each consisting of a three-element list { {``prev_qualifiers``, ``this_qualifier``, ``reduced_placetype``}}, i.e. # the concatenation of zero or more previously-recognized qualifiers on the left, normally canonicalized (if there are zero such qualifiers, the value will be nil); # a single recognized qualifier, normally canonicalized (if there is no qualifier, the value will be nil); # the "reduced placetype" on the right. Splitting between the qualifier in (2) and the reduced placetype in (3) happens at each space character, proceeding from left to right, and stops if a qualifier isn't recognized. All placetypes are canonicalized by checking for aliases in `placetype_aliases`, but no other checks are made as to whether the reduced placetype is recognized. Canonicalization of qualifiers does not happen if `no_canon_qualifiers` is specified. For example, given the placetype `"small beachside unincorporated community"`, the return value will be { { {nil, nil, "small beachside unincorporated community"}, {nil, "small", "beachside unincorporated community"}, {"small", "[[beachfront]]", "unincorporated community"}, {"small [[beachfront]]", "[[unincorporated]]", "community"}, }} Here, `"beachside"` is canonicalized to `"[[beachfront]]"` and `"unincorporated"` is canonicalized to `"[[unincorporated]]"`, in both cases according to the entry in `placetype_qualifiers`. 
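
The first example above can be reproduced with a minimal, self-contained sketch. The tables `qualifiers` and `aliases` are hypothetical stand-ins for `placetype_qualifiers` and `placetype_aliases`, with values assumed purely for illustration (`false` = recognized but left unlinked, `true` = link the qualifier as-is, string = canonical replacement). The table-valued qualifier entries and the `no_canon_qualifiers` flag are left out for brevity.

```lua
-- Hypothetical stand-ins for `placetype_qualifiers` and `placetype_aliases`; values assumed for illustration.
local qualifiers = {
	small = false,                 -- recognized, left unlinked
	beachside = "[[beachfront]]",  -- canonicalized to a different link
	unincorporated = true,         -- linked under its own name
}
local aliases = {adr = "administrative region"}

local function split_qualifiers(placetype)
	-- The first split always holds the whole placetype with no qualifiers.
	local splits = {{nil, nil, aliases[placetype] or placetype}}
	local prev
	while true do
		-- Split at the leftmost space: one candidate qualifier plus the remainder.
		local qualifier, rest = placetype:match("^(.-) (.*)$")
		local canon = qualifier and qualifiers[qualifier]
		if canon == nil then
			break -- no space left, or the word is not a recognized qualifier
		end
		local shown = qualifier
		if canon == true then
			shown = "[[" .. qualifier .. "]]"
		elseif canon then
			shown = canon
		end
		splits[#splits + 1] = {prev, shown, aliases[rest] or rest}
		prev = prev and (prev .. " " .. shown) or shown
		placetype = rest
	end
	return splits
end

local splits = split_qualifiers("small beachside unincorporated community")
for _, s in ipairs(splits) do
	print(tostring(s[1]), tostring(s[2]), s[3])
end
```

Under these assumed tables the loop yields four splits, matching the list shown above.
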
On the other hand, if given `"small former haunted community"`, the return value will be { { {nil, nil, "small former haunted community"}, {nil, "small", "former haunted community"}, {"small", "former", "haunted community"}, }} because `"small"` and `"former"` but not `"haunted"` are recognized as qualifiers. Finally, if given `"former adr"`, the return value will be { { {nil, nil, "former adr"}, {nil, "former", "administrative region"}, }} because `"adr"` is a recognized placetype alias for `"administrative region"`. ]==] function export.split_qualifiers_from_placetype(placetype, no_canon_qualifiers) local splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} local prev_qualifier = nil while true do local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if canon == nil then break end local new_qualifier = qualifier if type(canon) == "table" then canon = canon.link end if not no_canon_qualifiers and canon ~= false then if canon == true then new_qualifier = "[[" .. qualifier .. "]]" else new_qualifier = canon end end insert(splits, {prev_qualifier, new_qualifier, export.resolve_placetype_aliases(reduced_placetype)}) prev_qualifier = prev_qualifier and prev_qualifier .. " " .. new_qualifier or new_qualifier placetype = reduced_placetype else break end end return splits end --[==[ Given a `placetype` (which may be pluralized), return an ordered list of equivalent placetypes to look under to find the placetype's properties (such as the category or categories to be inserted). The return value is actually an ordered list of objects of the form `{qualifier=``qualifier``, placetype=``equiv_placetype``}` where ``equiv_placetype`` is a placetype whose properties to look up, derived from the passed-in placetype or from a contiguous subsequence of the words in the passed-in placetype (always including the rightmost word in the placetype, i.e. 
we successively chop off qualifier words from the left and use the remainder to find equivalent placetypes). ``qualifier`` is the remaining words not part of the subsequence used to find ``equiv_placetype``; or nil if all words in the passed-in placetype were used to find ``equiv_placetype``. (FIXME: This qualifier is not currently used anywhere.) Only placetypes for which there is an entry in `placetype_data` are included. The placetype passed in is always checked first, and will form the first entry if it exists in `placetype_data`.

'''NOTE:''' This is a tricky function as it implements handling of (a) qualifiers, (b) fallback logic, (c) "type-raising" qualifiers such as `former`/`ancient`/etc. as well as `fictional` and `mythological`, and (d) form-of directives, which act somewhat similarly to `former`, and allows interaction between more than one of these simultaneously (e.g. official names of former places, which have their own categorization). If {{tl|place}} gets too slow, one potential speedup is to memoize the results of this function, as it appears to be getting called more than once on the same inputs. Another similar potential speedup is to memoize the results of `iterate_matching_holonym_location()`.

For example, given the placetype `left tributary`, the following placetype/qualifier combinations are checked in turn:
```
{qualifier = nil, placetype="left tributary"}
{qualifier = "left", placetype="tributary"}
{qualifier = "left", placetype="แม่น้ำ"}
```
and the return value will be { {
	{qualifier = "left", placetype="tributary"},
	{qualifier = "left", placetype="แม่น้ำ"},
}}

The algorithm first enters the placetype itself into the list, then checks for `left tributary` as a recognized placetype in `placetype_data` and doesn't find it, so it doesn't enter it into the returned list (if it found it, it would add it as well as any fallbacks directly after it).
It then splits off the recognized qualifier `left` to form the ''reduced placetype'' `tributary`, which is entered into the list because it is found in `placetype_data`. Then, because it has a fallback `river`, which exists in `placetype_data`, the fallback is entered next.

Another example is `small rural fraziones` (where a ''frazione'' is a type of subdivision of a ''comune'' or municipality, often specifically an outlying hamlet). The placetype/qualifier combinations checked are:
```
{qualifier = nil, placetype="small rural fraziones"}
{qualifier = nil, placetype="small rural frazione"}
{qualifier = "small", placetype="rural fraziones"}
{qualifier = "small", placetype="rural frazione"}
{qualifier = "small [[rural]]", placetype="fraziones"}
{qualifier = "small [[rural]]", placetype="frazione"}
{qualifier = "small [[rural]]", placetype="hamlet"}
{qualifier = "small [[rural]]", placetype="village"}
```
The return value ends up as { {
	{qualifier = "small [[rural]]", placetype="frazione"},
	{qualifier = "small [[rural]]", placetype="hamlet"},
	{qualifier = "small [[rural]]", placetype="village"},
}}

Here, because the result of singularizing `fraziones` returns a different value from the placetype itself, that singularized value is checked after the original plural value. Also, in the process of splitting off qualifiers, they are canonicalized if the entry in `placetype_qualifiers` says to do so; in this case, links are placed around `rural`. Finally, `frazione` has `hamlet` as its fallback, which in turn has `village` as its fallback, so both fallbacks end up being returned.

`no_fallback`, if set, disables returning equivalent placetypes based on the `fallback` setting for a placetype. This is used in the first of two loops in find_placetype_cat_specs() in [[Module:place]] to prefer exact matches for placetypes such as barangays with later holonyms to matches based on a fallback such as `neighborhood` with an earlier holonym.
See the comment in that function in [[Module:place]] for a more detailed explanation of why this is needed. Only the placetype itself, and any reduced placetypes created by chopping off recognized qualifiers at the beginning, are returned; but we do not return reduced placetypes if a containing placetype exists in `placetype_data`. (For example, `"overseas territory"` has a fallback `"dependent territory"`, and `"overseas"` is also a recognized qualifier. When `no_fallback` is in place, without the above proviso, we would return `"overseas territory"` followed by `"ดินแดน"` with the incorrect effect of classifying an `"overseas territory"` of the United Kingdom such as `"Gibraltar"` under [[:Category:Territories of the United Kingdom]] instead of [[:Category:Dependent territories of the United Kingdom]].) As an exception, if `historical`, `ancient`, `former` or the like are found, they proceed ignoring `no_fallback`, because it seems tricky to handle them correctly in the presence of `no_fallback`, and historical/former placetypes rarely occur with exact match category specs anyway. `no_split_qualifiers` prevents splitting off recognized qualifiers and returning the remainder of the placetype as an equivalent placetype. Only the passed-in placetype, and any fallbacks, will be returned. This is used in [[Module:category tree/topic cat/data/Places]] when looking up placetypes found in categories. Such placetypes won't have qualifiers and so it doesn't make sense to try and look for them. `from_category`, if set, causes category-only placetypes (those ending in `!`) to also be checked. `form_of_directive`, if set, causes the specified form-of directive (e.g. `FORMER_NAME_OF`) to be prepended to checked placetypes, their directive-specific type (e.g. `FORMER_NAME_OF_type`), and their classes (`class`) to get the appropriate placetypes to check for form-of-directive categories. It falls back to the prepended generic `place` as a placetype, e.g. 
`FORMER_NAME_OF place`, if nothing else matches. `no_check_for_inherently_former` is used internally to prevent an infinite loop when checking for `inherently_former`. `register_former_as_non_former` is a major hack used in `get_bare_categories` to deal with the mismatch between e.g. known location `Yugoslavia` declaring itself a `country` but definitions of it declaring it a `former country`. It causes the non-former version of the specified placetype to be included in the returned equivalents along with the former placetypes. [FIXME: This should apply only to the entries in `former_countries` but it's tricky to do that now; fix this in the known-location refactor. -- The known-location refactor is already done but we haven't yet fixed this.] ]==] function export.get_placetype_equivs(placetype, props) local no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former local form_of_directive if props then no_fallback, no_split_qualifiers, no_check_for_inherently_former, from_category, register_former_as_non_former = props.no_fallback, props.no_split_qualifiers, props.no_check_for_inherently_former, props.from_category, props.register_former_as_non_former form_of_directive = props.form_of_directive end local equivs = {} -- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. `qualifier` is -- the preceding qualifier to insert into `equivs` along with the placetype (see comment at top of function). If -- `from_category` is given, we also check for a category-specific entry consisting of the placetype followed by -- `!`, and in all cases we also check to see if `placetype` is plural, and if so, insert the singularized version -- along with its fallbacks (if any) in `placetype_data`. `form_of_prefix` is a form-of prefix such as -- `OFFICIAL_NAME_OF`. If specified, we check the fallbacks of `placetype` without the prefix but then insert into -- `equivs` the prefixed placetype. 
This way, if the user says e.g. {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}}, -- we will correctly categorize into [[:Category:Official names of countries]], rather than only trying to look up -- `OFFICIAL_NAME_OF island country` and failing, falling back ultimately to [[:Category:Official names of places]]. local function insert_placetype_and_fallbacks(qualifier, placetype, form_of_prefix) local function insert_equiv(pt) if form_of_prefix then -- Let's say the user says {{tl|place|pt|@official name of:Cuba|island country|r/Caribbean}} and we have -- no entry for `OFFICIAL_NAME_OF island country` but we do for `OFFICIAL_NAME_OF country` (which we end -- up processing because `island country` falls back to `country`), and that entry in turn is defined -- using a fallback. We have to insert that fallback-of-fallback, and the easiest/cleanest way of -- handling this is by calling ourselves recursively. insert_placetype_and_fallbacks(qualifier, form_of_prefix .. " " .. pt) else insert(equivs, {qualifier=qualifier, placetype=pt}) end end -- Insert the placetype, along with any fallbacks. 
		local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category)
		if ptdata then
			insert_equiv(canon_placetype)
			if no_fallback then
				return
			end
			local first_placetype = #equivs + 1
			local prev_placetype = nil
			while true do
				local pt_value = export.placetype_data[canon_placetype]
				if not pt_value then
					internal_error("Fallback value %s specified for placetype %s but is not in `placetype_data`",
						canon_placetype, prev_placetype)
				end
				if pt_value.fallback then
					insert_equiv(pt_value.fallback)
					local last_placetype = #equivs
					-- Guard against cycles in the fallback chain.
					if last_placetype - first_placetype >= 10 then
						local fallback_loop = {}
						for i = first_placetype, last_placetype do
							insert(fallback_loop, equivs[i].placetype)
						end
						internal_error("Apparent loop in fallback chain: %s", table.concat(fallback_loop, " -> "))
					end
					prev_placetype = canon_placetype
					canon_placetype = pt_value.fallback
				else
					break
				end
			end
		end
	end

	-- Insert `placetype` into `equivs`, along with any fallback placetypes listed in `placetype_data`. This is a
	-- wrapper around the more basic `insert_placetype_and_fallbacks()` which handles form-of directives. If there is no
	-- form-of directive, this function directly calls `insert_placetype_and_fallbacks()`. We do things this way so that
	-- form-of directives correctly combine with `former`-type qualifiers. Note that we also have special backups for
	-- form-of directives that check `DIRECTIVE place` (and before that, `DIRECTIVE FORMER/ANCIENT place` if there's a
	-- `former`-type directive); these backups live outside this function because we want them done once, late, rather
	-- than in each invocation of `process_and_insert_placetype()`.
	local function process_and_insert_placetype(qualifier, reduced_placetype)
		if form_of_directive then
			-- First check for e.g. `OFFICIAL_NAME_OF island country` and its fallbacks; then we look for fallbacks of
			-- `island country` and check e.g. `OFFICIAL_NAME_OF country` and its fallbacks.
All of this is handled by -- `insert_placetype_and_fallbacks()` with appropriate parameters. After that, check the general class of -- the directive, e.g. `subpolity` if something like `district` is given. (Eventually, we check for -- `OFFICIAL_NAME_OF place` as a backup, but this happens at the end outside the loop over qualifiers.) insert_placetype_and_fallbacks(qualifier, reduced_placetype, form_of_directive) if not no_fallback then local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype) local directive_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, form_of_directive .. "_type") or export.get_placetype_prop(pt, "class") end ) if not directive_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s in conjunction with form-of directive %s, placetype data " .. 'located but directive-specific type property %s missing, and so is "class"; ' .. "placetypes searched are %s", reduced_placetype, form_of_directive, form_of_directive .. "_type", reduced_placetype_equivs) else -- This should be allowed, as we allow unrecognized placetypes in general. end elseif directive_type ~= "!" then insert_placetype_and_fallbacks(qualifier, directive_type, form_of_directive) end end else insert_placetype_and_fallbacks(qualifier, reduced_placetype) end end -- Successively split off recognized qualifiers and loop over successively greater sets of qualifiers from the left -- (unless `no_split_qualifiers` is specified, in which case we don't check for qualifiers). 
local splits if no_split_qualifiers then splits = {{nil, nil, export.resolve_placetype_aliases(placetype)}} else splits = export.split_qualifiers_from_placetype(placetype) end for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, reduced_placetype = unpack(split, 1, 3) -- If a special "former" qualifier like `former` or `historical` isn't present, and -- `no_check_for_inherently_former` is not given (this flag is used to avoid infinite loops), check for -- "inherently former" placetypes like `satrapy` and `treaty port` that always refer to no-longer-existing -- placetypes, and handle accordingly. local unlinked_this_qualifier if this_qualifier and this_qualifier:find("%[") then unlinked_this_qualifier = export.remove_links_and_html(this_qualifier) else unlinked_this_qualifier = this_qualifier end local former_qualifiers = this_qualifier and export.former_qualifiers[unlinked_this_qualifier] or nil if not former_qualifiers and not no_check_for_inherently_former then former_qualifiers = export.get_equiv_placetype_prop(reduced_placetype, function(pt) return export.get_placetype_prop(pt, "inherently_former") end, {no_check_for_inherently_former = true}) end -- If a special "former" qualifier like `former` or `historical` is present, map it to the appropriate internal -- qualifiers (`ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified -- qualifiers), fetch the `former_type` property, and treat the placetype as if a concatenation of the mapped -- qualifier(s) and the value of `former_type`. For example, if `medieval village` is given, we map `medieval` -- to `ANCIENT` and `FORMER`, and `village` to its `former_type` of `settlement`, and enter the placetypes -- `ANCIENT settlement` and `FORMER settlement` (in that order) into `equivs`. 
If the placetype following the -- "former" qualifier is recognized in `placetype_data` but has no `former_type` and no fallback with a -- `former_type` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like -- `former greenhouse` is specified and we don't have an entry for `greenhouse`), just track the occurrence and -- don't enter anything into `equivs`. if former_qualifiers then -- FIXME: Should we respect `no_fallback` here? My instinct says no. local reduced_placetype_equivs = export.get_placetype_equivs(reduced_placetype, { no_check_for_inherently_former = true }) local former_type = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.get_placetype_prop(pt, "former_type") or export.get_placetype_prop(pt, "class") end ) if not former_type then local pt_data = export.get_equiv_placetype_prop_from_equivs(reduced_placetype_equivs, function(pt) return export.placetype_data[pt] end ) if pt_data then internal_error("For placetype %s, placetype data located but `former_type` missing; " .. "placetypes searched are %s", reduced_placetype, reduced_placetype_equivs) else -- Enable error when we've verified there aren't any examples. track("bad-former-placetype") track("bad-former-placetype/" .. reduced_placetype) --process_error("For placetype '%s', unrecognized placetype following 'former'-type " .. -- "qualifier; searched placetype(s) %s", reduced_placetype, dump(reduced_placetype_equivs)) end elseif former_type ~= "!" then -- First check directly for `ANCIENT/FORMER` + the original following placetype. This makes it possible -- for (e.g.) former provinces of the Roman empire to be categorized specially. for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. reduced_placetype) end for _, former_qualifier in ipairs(former_qualifiers) do process_and_insert_placetype(prev_qualifier, former_qualifier .. " " .. 
former_type) end -- HACK! See explanation above for `register_former_as_non_former`. if register_former_as_non_former then process_and_insert_placetype(prev_qualifier, reduced_placetype) end -- If we're processing a form-of directive, after doing everything else we do -- `DIRECTIVE ANCIENT/FORMER place` e.g. `OFFICIAL_NAME_OF FORMER place` as a backup. if form_of_directive and not no_fallback then for _, former_qualifier in ipairs(former_qualifiers) do insert_placetype_and_fallbacks(prev_qualifier, form_of_directive .. " " .. former_qualifier .. " place") end end -- Don't continue processing equivs. The reason is probably the same as the `break` below for -- qualifier_to_placetype_equivs[]; categories for `former BLAH` are set using `default`, and -- non-former equivs will otherwise take precedence. break end end -- Then see if the rightmost split-off qualifier is in qualifier_to_placetype_equivs -- (e.g. 'fictional *' -> 'fictional location'). If so, add the mapping. if this_qualifier and export.qualifier_to_placetype_equivs[unlinked_this_qualifier] then insert(equivs, { qualifier=prev_qualifier, placetype=export.qualifier_to_placetype_equivs[unlinked_this_qualifier] }) -- Don't continue processing equivs; otherwise, if we specify 'mythological city', even though the -- equivalent entry for 'mythological location' gets inserted ahead of the entry for 'city', the -- latter ends up generating the category because the category for 'mythological location' is set as -- the default value, which is used only when no non-default category can be found. break end -- Finally, join the rightmost split-off qualifier to the previously split-off qualifiers to form a combined -- qualifier, and add it along with reduced_placetype and any mapping in placetype_data for reduced_placetype. -- NOTE: The first time through this loop, both `prev_qualifier` and `this_qualifier` are nil, and this inserts -- the full placetype into `equivs`. 
local qualifier = prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier process_and_insert_placetype(qualifier, reduced_placetype) -- If `no_fallback` and there's an entry in `placetype_data` for this placetype, don't include any reduced -- placetypes to avoid the "overseas territory treated as a territory" issue described above. if no_fallback then local canon_placetype, ptdata, ptmatch = export.get_placetype_data(reduced_placetype, from_category) if canon_placetype then break end end end -- If we're processing a form-of directive, after doing everything else we do `DIRECTIVE place` e.g. -- `OFFICIAL_NAME_OF place` as a backup; but only if either the placetype as a whole is recognized or the placetype -- begins with a recognized qualifier. This latter check is to avoid categorizing into e.g. -- [[Category:en:Former names of places]] in an invocation like -- {{place|en|@former name of:Democratic Republic of the Congo|country|r/Central Africa|;|used from 1971–1997}}; -- the `used from 1971–1997` gets treated as a placetype and we're called on it. if form_of_directive and not no_fallback and (splits[2] or export.get_placetype_data(placetype, from_category)) then insert_placetype_and_fallbacks(nil, form_of_directive .. " place") end return equivs end function export.get_equiv_placetype_prop_from_equivs(equivs, fun, continue_on_nil_only) for _, equiv in ipairs(equivs) do local retval = fun(equiv.placetype) if continue_on_nil_only and retval ~= nil or not continue_on_nil_only and retval then return retval, equiv end end return nil, nil end --[==[ Given a placetype `placetype` and a function `fun` of one argument, iteratively call the function on equivalent placetypes fetched from `get_placetype_equivs` until the function returns a non-falsy value (i.e. not {nil} or {false}); but if `continue_on_nil_only` is specified, the iterations continue until the function returns a non-{nil} value. 
FIXME: We should make `continue_on_nil_only` the default; but this requires changing some callers. When `fun` returns a non-falsy or non-{nil} value, `get_equiv_placetype_prop` returns two values: the value returned by `fun` and the equivalent placetype that triggered the non-falsy (or non-{nil}) return value. If `fun` never returns a non-falsy (or non-{nil}) value, `get_equiv_placetype_prop` returns {nil} for both return values. If `placetype` is passed in as {nil}, the return value is the result of calling `fun` on {nil} (whatever it is) with {nil} for the second return value. ]==] function export.get_equiv_placetype_prop(placetype, fun, props) if not placetype then return fun(nil), nil end return export.get_equiv_placetype_prop_from_equivs(export.get_placetype_equivs(placetype, props), fun, props and props.continue_on_nil_only) end --[==[ Return the article that is used with an entry placetype. We proceed as follows: # See if there is a recognized qualifier at the beginning that specifies an article (including `false` for no article). This takes precedence over anything else, so that e.g. `various capitals` gets no article rather than `"the"`. # Then check the placetype or any equivalent placetype for the `entry_placetype_use_the` property, indicating that `"the"` should be used. # Otherwise we look to see if the placetype itself (not any equivalents, even those involving deleting a qualifier from the beginning) has an entry in `placetype_data` that specifies the indefinite article using `entry_placetype_indefinite_article` (principally for use with placetypes like `union territory`). # Otherwise, we use [[Module:en-utilities]] to apply the standard algorithm to generate `"an"` for words beginning with a vowel and `"a"` otherwise. (In this version of the module that call is commented out, so an empty string is used instead.) If `ucfirst` is true, the first letter of the article is made upper-case. 
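As an illustration only, the precedence above can be reduced to the following self-contained sketch. The tables `qualifiers` and `data` and the function `sketch_article` are hypothetical stand-ins (not the module's real `placetype_qualifiers` or `placetype_data`), the sample entries are assumed for the example, equivalent placetypes are not consulted, and step 4 is simplified to a plain vowel check:

```lua
-- Hypothetical sample data; the real tables are much larger and may differ.
local qualifiers = {
	various = {article = false}, -- assumed: this qualifier suppresses the article
}
local data = {
	["capital"] = {entry_placetype_use_the = true}, -- assumed entry
	["union territory"] = {entry_placetype_indefinite_article = "a"}, -- assumed entry
}

local function sketch_article(placetype, ucfirst)
	local art
	-- Step 1: a leading recognized qualifier may dictate the article.
	local qualifier = placetype:match("^(.-) .*$")
	if qualifier and type(qualifiers[qualifier]) == "table" then
		art = qualifiers[qualifier].article
	end
	if art == false then
		return false -- explicitly no article
	end
	if art == nil then
		local entry = data[placetype] or {}
		if entry.entry_placetype_use_the then
			art = "the" -- step 2
		elseif entry.entry_placetype_indefinite_article then
			art = entry.entry_placetype_indefinite_article -- step 3
		else
			-- Step 4, simplified: vowel check instead of the en-utilities algorithm.
			art = placetype:find("^[aeiouAEIOU]") and "an" or "a"
		end
	end
	if ucfirst then
		art = art:sub(1, 1):upper() .. art:sub(2)
	end
	return art
end
```

With this sample data, `sketch_article("various capitals")` yields `false`, while `sketch_article("union territory")` yields `"a"` because the explicit property overrides the vowel check.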
]==] function export.get_placetype_article(placetype, ucfirst) local art local qualifier, reduced_placetype = placetype:match("^(.-) (.*)$") if qualifier then local canon = export.placetype_qualifiers[qualifier] if type(canon) == "table" then art = canon.article end end if art == false then return art end if art == nil then local placetype_use_the = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "entry_placetype_use_the") end) if placetype_use_the then art = "the" else art = export.get_placetype_prop(placetype, "entry_placetype_indefinite_article") if not art then art = --[[require(en_utilities_module).get_indefinite_article(placetype)]] "" end end end if ucfirst then art = m_strutils.ucfirst(art) end return art end --[==[ Return the preposition that should be used after `placetype` when occurring as an entry placetype or in categories (e.g. `city >in< France` but `country >of< South America`). The preposition defaults to `"ใน"` if not specified. ]==] function export.get_placetype_entry_preposition(placetype) local pt_prep = export.get_equiv_placetype_prop(placetype, function(pt) return export.get_placetype_prop(pt, "preposition") end ) return pt_prep or "ใน" end --[==[ Given a place desc (see top of file) and a holonym object (see top of file), add a key/value into the place desc's `holonyms_by_placetype` field corresponding to the placetype and placename of the holonym. For example, corresponding to the holonym "c/Italy", a key "ประเทศ" with the list value {"Italy"} will be added to the place desc's `holonyms_by_placetype` field. If there is already a key with that place type, the new placename will be added to the end of the value's list. ]==] function export.key_holonym_into_place_desc(place_desc, holonym) if not holonym.placetype then return end -- Key in equivalent placetypes, so that e.g. 
`cities/San Francisco` gets keyed under `city`; but don't do -- fallbacks, as it doesn't seem correct for the "do other holonyms of the same placetype" algorithm to do holonyms -- of different types just because they have the same fallback. local equiv_placetypes = export.get_placetype_equivs(holonym.placetype, {no_fallback = true}) local unlinked_placename = holonym.unlinked_placename for _, equiv in ipairs(equiv_placetypes) do local placetype = equiv.placetype if not place_desc.holonyms_by_placetype then place_desc.holonyms_by_placetype = {} end if not place_desc.holonyms_by_placetype[placetype] then place_desc.holonyms_by_placetype[placetype] = {unlinked_placename} else insert(place_desc.holonyms_by_placetype[placetype], unlinked_placename) end end end --[=[ Construct a formatted link from the raw link spec `link` given the canonical singular placetype `sg_placetype`. If the placetype was originally plural, `orig_placetype` should contain this plural value; otherwise it should be nil. This will construct the appropriate type of link that displays as `orig_placetype` (or otherwise `sg_placetype`) but links to whatever the `link` spec specifies (which may be `sg_placetype`, a Wikipedia article, etc.). `ptdata` is the placetype data structure for the placetype, and `from_category` indicates that we are generating the description of a category (otherwise we are generating the display form of an entry placetype). ]=] local function make_placetype_link(link, sg_placetype, orig_placetype, ptdata, from_category, noerror) if not from_category and ptdata.disallow_in_entries then if noerror then return "[not meant to be specified directly, with warning: " .. ptdata.disallow_in_entries .. "]" else process_error("Placetype %s is not meant to be specified directly: " .. 
ptdata.disallow_in_entries, sg_placetype) end end if link == nil then internal_error("Placetype data present for placetype %s but no link= setting given", sg_placetype) elseif link == true then if orig_placetype then return ("[[%s|%s]]"):format(sg_placetype, orig_placetype) else return ("[[%s]]"):format(sg_placetype) end elseif link == false then process_error("Placetype %s is not meant to be specified directly, but is only for internal use", sg_placetype) elseif link == "w" then return ("[[w:%s|%s]]"):format(sg_placetype, orig_placetype or sg_placetype) elseif link == "separately" then if orig_placetype then local sg_words = split(sg_placetype, " ") local orig_words = split(orig_placetype, " ") if #sg_words ~= #orig_words then internal_error("Can't construct 'separately' link for plural placetype %s as original placetype %s " .. "has different number of words", orig_placetype, sg_placetype) else for i = 1, #sg_words do if sg_words[i] == orig_words[i] then sg_words[i] = ("[[%s]]"):format(sg_words[i]) else sg_words[i] = ("[[%s|%s]]"):format(sg_words[i], orig_words[i]) end end return concat(sg_words, " ") end else return (sg_placetype:gsub("([^ ]+)", "[[%1]]")) end elseif link:find("^%+") then link = link:sub(2) -- discard initial + return ("[[%s|%s]]"):format(link, orig_placetype or sg_placetype) elseif not orig_placetype then return link else return --[[require(en_utilities_module).pluralize(link)]] link end end --[==[ Get the display form of a placetype by looking it up in `placetype_data`. If the placetype is recognized, or is the plural of a recognized placetype, the corresponding linked display form is returned (with plural placetypes displaying as plural but linked to the singular form of the placetype). Otherwise, return nil. 
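The `link` settings that feed into the display form can be pictured with the reduced sketch below. It mirrors the main branches of `make_placetype_link()` but omits error handling, plural `separately` links and pluralization; `sketch_link` is a hypothetical name used only here:

```lua
-- Reduced, hypothetical illustration of how a `link` spec becomes wikitext.
-- `sg` is the singular placetype; `orig` is the plural form as written, or nil.
local function sketch_link(link, sg, orig)
	if link == true then
		-- Link to the singular entry, displaying the plural if one was given.
		if orig then
			return ("[[%s|%s]]"):format(sg, orig)
		end
		return ("[[%s]]"):format(sg)
	elseif link == "w" then
		-- Link to Wikipedia.
		return ("[[w:%s|%s]]"):format(sg, orig or sg)
	elseif link == "separately" then
		-- Only the singular case is covered in this sketch.
		assert(not orig, "plural 'separately' links are not covered by this sketch")
		return (sg:gsub("([^ ]+)", "[[%1]]"))
	elseif type(link) == "string" and link:find("^%+") then
		-- "+target": link to `target`, displaying the placetype.
		return ("[[%s|%s]]"):format(link:sub(2), orig or sg)
	end
	-- Anything else is treated as an already-formatted link.
	return link
end
```

For example, `sketch_link(true, "city", "cities")` produces a piped link to `city` that displays as `cities`, and `sketch_link("separately", "union territory")` links each word on its own.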
If we're generating the description of a category, `category_type` should be set to one of `"top-level"` (for top-level categories like [[:Category:Neighborhoods]]), `"noncity"` (for non-city categories like [[:Category:Neighborhoods in Illinois, USA]]) or `"city"` (for city categories like [[:Category:Neighborhoods of Chicago]]). Otherwise, we're generating the description for use in formatting a {{tl|place}} call, and category-only placetypes ending in `!` will be ignored, along with special `category_link*` settings. `return_full` is used along with `category_type` and will preferably return the "full" variant of category link settings, i.e. `full_category_link*`; if they don't exist, the `category_link*` value is prepended with `"names of"`. `noerror` says to not throw an error when encountering entry placetypes that would be disallowed. ]==] function export.get_placetype_display_form(placetype, category_type, return_full, noerror) local from_category = not not category_type local canon_placetype, ptdata, ptmatch = export.get_placetype_data(placetype, from_category) if canon_placetype then local raw_link local function is_linked_string(str) return type(str) == "string" and str:find("%[%[") end if category_type then local fetched_full local function fetch_maybe_full(prop) local retval = ptdata["full_" .. prop] if retval ~= nil then if return_full then return retval, true else internal_error("Saw full_" .. prop .. "=%s but `return_full` not set, can't handle", retval) end end return ptdata[prop], false end local function maybe_prefix(str) if return_full and not fetched_full then return "names of " .. str else return str end end -- Careful with `false` as possible value. 
if category_type == "top-level" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_top_level") elseif category_type == "noncity" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_noncity") elseif category_type == "city" then -- do not translate raw_link, fetched_full = fetch_maybe_full("category_link_before_city") else internal_error('Unrecognized value for `category_type` %s, should be "top-level", "noncity" or "city"', -- do not translate category_type) end if type(raw_link) == "string" then return maybe_prefix(raw_link), ptdata elseif raw_link ~= nil then return raw_link, ptdata end raw_link, fetched_full = fetch_maybe_full("category_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end if ptmatch == "plural" then raw_link, fetched_full = fetch_maybe_full("plural_link") if raw_link == false then return raw_link, ptdata end if is_linked_string(raw_link) then return maybe_prefix(raw_link), ptdata end end if raw_link == nil then raw_link, fetched_full = fetch_maybe_full("link") end if raw_link == false then return raw_link, ptdata end return maybe_prefix(make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror)), ptdata else if ptmatch == "plural" then raw_link = ptdata.plural_link if raw_link == false then process_error("Placetype %s cannot appear plural", placetype) end if is_linked_string(raw_link) then return raw_link, ptdata end end if raw_link == nil then raw_link = ptdata.link end return make_placetype_link(raw_link, canon_placetype, placetype ~= canon_placetype and placetype or nil, ptdata, from_category, noerror), ptdata end end return nil end local function resolve_unlinked_placename_display_aliases(placetype, placename) local equiv_placetypes = export.get_placetype_equivs(placetype) for i, equiv in ipairs(equiv_placetypes) do equiv_placetypes[i] = equiv.placetype end 
local all_display_aliases_found = {} local all_others_found = {} for group, key, spec in m_locations.iterate_matching_location { placetypes = equiv_placetypes, placename = placename, alias_resolution = "display", } do if spec.alias_of and spec.display then insert(all_display_aliases_found, {group, key, spec, spec.display_as_full}) else insert(all_others_found, {group, key, spec}) end end if not all_display_aliases_found[1] then return placename elseif all_display_aliases_found[2] then internal_error("Found multiple matching display aliases for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) elseif all_others_found[1] then internal_error("Found a display alias along with other possible meanings for placename %s, placetype %s: " .. "all_display_aliases_found=%s, all_others_found=%s", placename, placetype, all_display_aliases_found, all_others_found) else local group, key, spec, as_full = unpack(all_display_aliases_found[1]) local full, elliptical = m_locations.key_to_placename(group, key) return as_full and full or elliptical end end --[==[ If `placename` of type `placetype` is a display alias, convert it to its canonical form; otherwise, return unchanged. Display aliases transform certain placenames into canonical displayed forms. For example, if any of `country/US`, `country/USA` or `country/United States of America` (or `c/US`, etc.) are given, the result will be displayed as `United States`. '''NOTE''': Display aliases change what is displayed from what the editor wrote in the Wikitext. As a result, they should (a) be non-political in nature, and (b) not involve a change where the word `the` needs to be added or removed. For example, normalizing `US` and `USA` to `United States` for display purposes is OK but normalizing `Burma` to `Myanmar` is not (instead a cat alias should be used) because the terms `Burma` and `Myanmar` have clear political connotations. 
Similarly, we have a display alias that maps the old name of `Macedonia` as a country (but not a region!) to `North Macedonia`, but `Republic of Macedonia` is mapped to `North Macedonia` only as a cat alias because the two terms differ in their use of `the`. (For example, if we had a display alias mapping `Republic of Macedonia` to `North Macedonia`, the call {{tl|place|en|the <<capital city>> of the <<c/Republic of Macedonia>>}} would wrongly display as `the [[capital city]] of the [[North Macedonia]]`.) Generally, display normalizations tend to involve alternative forms (e.g. abbreviations, ellipses, foreign spellings) where the normalization improves clarity and consistency. ]==] function export.resolve_placename_display_aliases(placetype, placename) -- If the placename is a link, apply the alias inside the link. -- This pattern matches both piped and unpiped links. If the link is not piped, the second capture (linktext) will -- be empty. local link, linktext = rmatch(placename, "^%[%[([^|%[%]]+)|?([^|%[%]]-)%]%]$") if link then if linktext ~= "" then local alias = resolve_unlinked_placename_display_aliases(placetype, linktext) return "[[" .. link .. "|" .. alias .. "]]" else local alias = resolve_unlinked_placename_display_aliases(placetype, link) return "[[" .. alias .. "]]" end else return resolve_unlinked_placename_display_aliases(placetype, placename) end end --[==[ Generate the "prefixed" version of a bare key, i.e. prefix it with `the` if correct for this key. ]==] function export.get_prefixed_key(key, spec) if spec.the then return "the " .. key else return key end end -- Necessary for use by [[Module:place]]. FIXME: Reorganize the modules so this isn't necessary. export.iterate_matching_location = m_locations.iterate_matching_location --[=[ Iterator that iterates over holonyms in `place_desc`. 
If `first_holonym_index` is given, start iterating at the specified holonym and stop either when there are no more holonyms or a holonym with modifier `:also` is found. If `first_holonym_index` is nil or omitted, iterate over all holonyms regardless. If `include_raw_text_holonyms` is specified, raw text holonyms (those not of the form `placetype/placename`) are returned as well; they can be identified by the fact that the `placetype` field in the holonym structure is nil. Two values are returned at each iteration, the holonym index and holonym structure, similar to `ipairs()`. ]=] function export.get_holonyms_to_check(place_desc, first_holonym_index, include_raw_text_holonyms) local stop_at_also = not not first_holonym_index return function(place_desc, index) while true do index = index + 1 local this_holonym = place_desc.holonyms[index] -- If we were passed in a starting holonym index, go up to but not including a holonym marked with `:also` -- (continue_cat_loop); the categorization code will then restart the loop at that holonym. That holonym -- will have `:also` marked on it, so make sure not to stop immediately if the first holonym is marked with -- `:also`. if not this_holonym or stop_at_also and index > first_holonym_index and this_holonym.continue_cat_loop then return nil end -- If not placetype, we're processing raw text, which we normally want to skip. if include_raw_text_holonyms or this_holonym.placetype then return index, this_holonym end end end, place_desc, first_holonym_index and first_holonym_index - 1 or 0 end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, iterate over all such known locations, returning for each location the corresponding key, spec and group as well as the trail of ancestral containers. 
Unlike `iterate_matching_location()`, this specifically checks that there is no mismatch between the location's containers at any level and any of the following holonyms in the {{tl|place}} spec. The fields in `data` are: * `holonym_placetype`: The placetype of the holonym. It can actually be a list of possible placetypes, as with `iterate_matching_location()`. * `holonym_placename`: The placename of the holonym. * `holonym_index`: The index of the holonym among the holonyms in `place_desc`, or nil if the holonym is not among the holonyms in `place_desc`. (If a holonym index is given, we check for container mismatches among the holonyms following the specified index, stopping either when encountering a holonym marked with modifier `:also` or, if none exist, when we run out of holonyms. If no holonym index is given, we check all holonyms for container mismatches.) * `place_desc`: Description of the place; used for the holonyms, to check for container mismatches. Returns four values: the location group, the canonical key by which the location is known, the spec object describing the location and the trail of ancestral containers for the location. The first three values are the same as for `iterate_matching_location`. ]==] function export.iterate_matching_holonym_location(data) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc local matching_location_iterator = m_locations.iterate_matching_location { placetypes = holonym_placetype, placename = holonym_placename, } return function() while true do local group, key, spec = matching_location_iterator() if not group then return nil end local container_trail = {} -- For each level of container, check that there are no mismatches (i.e. other location of the same -- placetype) mentioned. We allow a mismatch at a given level if there's also a match with the container -- at that level. 
For example, in the case of Kansas City, defined in [[Module:place/locations]] as a city -- in Missouri, if we define it as {{tl|place|city|s/Missouri,Kansas}}, we ignore the mismatching state of -- Kansas because the correct state of Missouri was also mentioned. But imagine we are defining Newark, -- Delaware as {{tl|place|city|s/Delaware|c/US}} and (as is the case) we have an entry for Newark, New -- Jersey in [[Module:place/locations]]. Just because the containing location `US` matches isn't enough, -- because Newark, NJ also has New Jersey as a containing location and there's a mismatch at that level. If -- there are no mismatches at any level we assume we're dealing with the right known location. -- -- If at a given level there are multiple containing locations, we count a match if any holonym matches any -- containing location, and a mismatch only if a holonym exists of the same placetype that doesn't match any -- containing location. local containers_mismatch = false for containers in m_locations.iterate_containers(group, key, spec) do insert(container_trail, containers) local match_at_level = false local mismatch_at_level = false for other_holonym_index, other_holonym in export.get_holonyms_to_check(place_desc, holonym_index and holonym_index + 1 or nil) do local other_source_holonym = other_holonym.augmented_from_holonym if other_source_holonym and other_source_holonym.placetype == holonym_placetype and other_source_holonym.unlinked_placename ~= holonym_placename then -- Ignore holonyms added during the augmentation process for other holonyms of the same -- placetype as the placetype of the holonym we're considering. See comment in -- augment_holonyms_with_container() for why we do this. 
-- continue; grrr, no 'continue' in Lua else local holonym_matches_at_level = false local holonym_exists_with_same_placetype = false for _, container in ipairs(containers) do if not container.spec.no_check_holonym_mismatch then local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) local placetypes = container.spec.placetype if type(placetypes) ~= "table" then placetypes = {placetypes} end local placetype_equivs = {} for _, pt in ipairs(placetypes) do m_table.extend(placetype_equivs, export.get_placetype_equivs(pt)) end local this_holonym_matches = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype and (other_holonym.unlinked_placename == full_container_placename or other_holonym.unlinked_placename == elliptical_container_placename) end ) if this_holonym_matches then holonym_matches_at_level = true break end local this_holonym_exists_with_same_placetype = export.get_equiv_placetype_prop_from_equivs( placetype_equivs, function(placetype) return other_holonym.placetype == placetype end ) if this_holonym_exists_with_same_placetype then -- We seem to have a mismatch at this level. But before we decide conclusively that this -- is the case, check to see whether the putative mismatch is an alias and matches when -- we resolve the alias. for oh_group, oh_key, oh_spec, oh_container_trail in export.iterate_matching_holonym_location { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = place_desc, } do local oh_full_placename, oh_elliptical_placename = m_locations.key_to_placename(oh_group, oh_key) if oh_full_placename == full_container_placename or oh_elliptical_placename == elliptical_container_placename then -- Alias matched when resolved. 
this_holonym_matches = true break end end if this_holonym_matches then -- Alias matched above when resolved. holonym_matches_at_level = true break else -- Not an alias, or doesn't match when resolved. We have a true mismatch. holonym_exists_with_same_placetype = true end end end end if holonym_matches_at_level then match_at_level = true break end if holonym_exists_with_same_placetype then mismatch_at_level = true end end end if not match_at_level and mismatch_at_level then containers_mismatch = true break end end if not containers_mismatch then return group, key, spec, container_trail end end end end --[==[ If the holonym in `data` (in the format as passed to a category handler) refers to a known location, find and return the corresponding key, spec and group as well as the trail of ancestral containers. This is like `iterate_matching_holonym_location()` but throws an error if more than one location matches. (An example where this would happen is {{tl|place|en|neighborhood|city/Newcastle}}, because there are two known locations named Newcastle. To fix this, specify additional following disambiguating holonyms, e.g. {{tl|place|en|neighborhood|city/Newcastle|s/New South Wales}}.) ]==] function export.find_matching_holonym_location(data) local all_found = {} for group, key, spec, container_trail in export.iterate_matching_holonym_location(data) do insert(all_found, {group, key, spec, container_trail}) end if not all_found[1] then return nil elseif all_found[2] then local holonym_placetype = data.holonym_placetype if type(holonym_placetype) == "table" then holonym_placetype = concat(holonym_placetype, ",") end local found_keys = {} for _, found in ipairs(all_found) do local _, key, _, _ = unpack(found) insert(found_keys, key) end error(("Found multiple matching locations for holonym '%s/%s'; specify disambiguating context in the " .. 
"containing holonyms: %s"):format(holonym_placetype, data.holonym_placename, dump(found_keys))) else return unpack(all_found[1]) end end ------------------------------------------------------------------------------------------ -- Placename and placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: This is a map from aliases to their canonical forms. Any placetypes appearing as keys here will be mapped to their canonical forms in all respects, including the display form. Contrast entries in 'placetype_data' with a fallback, which applies to categorization and other processes but not to display. The most important aliases are for holonym placetypes, particularly those that occur often such as "ประเทศ", "รัฐ", "จังหวัด" and the like. Particularly long placetypes that mostly occur as entry placetypes (e.g. "census-designated place") can be given abbreviations, but it is generally preferred to spell out the entry placetype. Note also that we purposely avoid certain abbreviations that would be ambiguous (e.g. "d", which could variously be interpreted as "department", "อำเภอ" or "division"). 
]==] export.placetype_aliases = { ["acomm"] = "autonomous community", ["adr"] = "administrative region", ["adterr"] = "administrative territory", -- Pakistan ["aobl"] = "autonomous oblast", ["aokr"] = "autonomous okrug", ["ap"] = "autonomous province", ["apref"] = "autonomous prefecture", ["aprov"] = "autonomous province", ["ar"] = "autonomous region", ["arch"] = "archipelago", ["arep"] = "autonomous republic", ["aterr"] = "autonomous territory", ["atu"] = "autonomous territorial unit", ["bor"] = "borough", ["c"] = "ประเทศ", ["can"] = "canton", ["carea"] = "council area", ["cc"] = "constituent country", ["cdblock"] = "community development block", ["cdep"] = "Crown dependency", ["CDP"] = "census-designated place", ["cdp"] = "census-designated place", ["clcity"] = "county-level city", ["co"] = "เทศมณฑล", ["cobor"] = "county borough", ["colcity"] = "county-level city", ["coll"] = "collectivity", ["comm"] = "community", ["cont"] = "ทวีป", ["contr"] = "continental region", ["contregion"] = "continental region", ["cpar"] = "civil parish", ["damun"] = "direct-administered municipality", ["dep"] = "dependency", ["department capital"] = "departmental capital", ["dept"] = "department", ["depterr"] = "dependent territory", ["dist"] = "อำเภอ", ["distmun"] = "district municipality", ["div"] = "division", ["emp"] = "จักรวรรดิ", ["fpref"] = "French prefecture", ["gov"] = "governorate", ["govnat"] = "governorate", ["home-rule city"] = "home rule city", ["home-rule municipality"] = "home rule municipality", ["inner-city area"] = "inner city area", ["ires"] = "Indian reservation", ["isl"] = "เกาะ", ["lbor"] = "London borough", ["lga"] = "local government area", ["lgarea"] = "local government area", ["lgd"] = "local government district", ["lgdist"] = "local government district", ["metbor"] = "metropolitan borough", ["metcity"] = "มหานคร", ["metmun"] = "metropolitan municipality", ["mtn"] = "ภูเขา", ["mun"] = "เทศบาล", ["mundist"] = "municipal district", ["nonmetropolitan county"] = 
"non-metropolitan county", ["obl"] = "oblast", ["okr"] = "okrug", ["p"] = "จังหวัด", ["par"] = "parish", ["parmun"] = "parish municipality", ["pen"] = "peninsula", ["plcity"] = "prefecture-level city", ["plcolony"] = "Polish colony", ["pref"] = "prefecture", ["prefcity"] = "prefecture-level city", ["preflcity"] = "prefecture-level city", ["prov"] = "จังหวัด", ["r"] = "ภูมิภาค", ["range"] = "เทือกเขา", ["rcm"] = "regional county municipality", ["rcomun"] = "regional county municipality", ["rdist"] = "regional district", ["rep"] = "republic", ["rhrom"] = "rural hromada", ["riv"] = "แม่น้ำ", ["rmun"] = "regional municipality", ["robor"] = "royal borough", ["romp"] = "Roman province", ["runit"] = "regional unit", ["rurmun"] = "rural municipality", ["s"] = "รัฐ", ["sar"] = "special administrative region", ["shrom"] = "settlement hromada", ["spref"] = "subprefecture", ["sprefcity"] = "sub-prefectural city", ["sprovcity"] = "subprovincial city", ["submet city"] = "sub-metropolitan city", ["submetropolitan city"] = "sub-metropolitan city", ["sub-prefecture-level city"] = "sub-prefectural city", ["sub-provincial city"] = "subprovincial city", ["sub-provincial district"] = "subprovincial district", ["terr"] = "ดินแดน", ["terrauth"] = "territorial authority", ["twp"] = "township", ["twpmun"] = "township municipality", ["uauth"] = "unitary authority", ["ucomm"] = "unincorporated community", ["udist"] = "unitary district", ["uhrom"] = "urban hromada", ["uterr"] = "union territory", ["utwpmun"] = "united township municipality", ["val"] = "valley", ["vdc"] = "village development committee", ["vil"] = "village", ["voi"] = "voivodeship", ["wcomm"] = "Welsh community", } local no_link_def_article = {link = false, article = "the"} local no_link_no_article = {link = false, article = false} --[==[ var: These qualifiers can be prepended onto any placetype and will be handled correctly. 
For example, the placetype `large city` will be displayed as `large <nowiki>[[city]]</nowiki>` and categorized as if `city` were specified. If the value in the following table is a string, the qualifier will display according to the string. If the value is `true`, the qualifier will be linked to its corresponding Wiktionary entry. If the value is `false`, the qualifier will not be linked but will appear as-is. Note that these qualifiers do not override placetypes with entries elsewhere that contain those same qualifiers. For example, the entry for `inland sea` in `placetype_data` will apply in preference to treating `inland sea` as equivalent to `sea`. ]==] export.placetype_qualifiers = { -- generic qualifiers ["huge"] = false, ["tiny"] = false, ["large"] = false, ["big"] = false, ["mid-size"] = false, ["mid-sized"] = false, ["small"] = false, ["sizable"] = false, ["important"] = false, ["long"] = false, ["short"] = false, ["major"] = false, ["minor"] = false, ["high"] = false, ["tall"] = false, ["low"] = false, ["left"] = false, -- left tributary ["right"] = false, -- right tributary ["modern"] = false, -- for use in opposition to "ancient" in another definition -- "former" qualifiers ["abandoned"] = true, ["ancient"] = true, ["deserted"] = true, ["extinct"] = true, ["former"] = false, ["historic"] = "historical", ["historical"] = true, ["medieval"] = true, ["mediaeval"] = true, ["ruined"] = true, ["traditional"] = true, -- sea qualifiers ["coastal"] = true, ["inland"] = true, -- note, we also have an entry in placetype_data for 'inland sea' to get a link to [[inland sea]] ["maritime"] = true, ["overseas"] = true, ["seaside"] = true, ["beachfront"] = true, ["beachside"] = true, ["riverside"] = true, -- lake qualifiers ["freshwater"] = true, ["saltwater"] = true, ["endorheic"] = true, ["oxbow"] = true, ["ox-bow"] = "[[oxbow]]", -- [[ox-bow]] is a red link ["tidal"] = true, -- land qualifiers ["hilltop"] = true, ["hilly"] = true, ["insular"] = true, ["peninsular"] = 
true, ["chalk"] = true, ["karst"] = true, ["limestone"] = true, ["mountainous"] = true, ["mountaintop"] = true, ["alpine"] = true, ["volcanic"] = true, -- for an island -- political status qualifiers ["autonomous"] = true, ["incorporated"] = true, ["special"] = true, ["unincorporated"] = true, ["coterminous"] = true, -- monetary status/etc. qualifiers ["fashionable"] = true, ["wealthy"] = true, ["affluent"] = true, ["declining"] = true, -- city vs. rural qualifiers ["urban"] = true, ["suburban"] = true, ["exurban"] = true, ["outlying"] = true, ["remote"] = true, ["rural"] = true, ["outback"] = true, ["inner"] = false, ["inner-city"] = true, ["central"] = false, ["outer"] = false, -- land use qualifiers ["residential"] = true, ["agricultural"] = true, ["business"] = true, ["commercial"] = true, ["industrial"] = true, -- business use qualifiers ["railroad"] = true, ["railway"] = true, ["farming"] = true, ["fishing"] = true, ["mining"] = true, ["logging"] = true, ["cattle"] = true, -- tourism use qualifiers ["resort"] = true, -- note, we also have 'resort city' and 'resort town', which take precedence ["spa"] = true, -- note, we also have 'spa city' and 'spa town', which take precedence ["ski"] = true, -- note, we also have 'ski resort city' and 'ski resort town', which take precedence -- religious qualifiers ["holy"] = true, ["sacred"] = true, ["religious"] = true, ["secular"] = true, -- qualifiers for nonexistent places ["claimed"] = false, ["fictional"] = true, ["legendary"] = true, ["mythical"] = true, ["mythological"] = true, -- directional qualifiers ["northern"] = false, ["southern"] = false, ["eastern"] = false, ["western"] = false, ["north"] = false, ["south"] = false, ["east"] = false, ["west"] = false, ["northeastern"] = false, ["southeastern"] = false, ["northwestern"] = false, ["southwestern"] = false, ["northeast"] = false, ["southeast"] = false, ["northwest"] = false, ["southwest"] = false, -- seasonal qualifiers ["summer"] = true, -- e.g. 
for 'summer capital' ["winter"] = true, -- legal status qualifiers -- FIXME: Two-word qualifiers don't work yet. But you can enter "de-facto" and it's canonicalized to [[de facto]]. ["official"] = true, ["unofficial"] = true, ["de facto"] = true, -- 'de facto capital' ["de-facto"] = "[[de facto]]", -- [[de-facto]] is a red link ["de jure"] = true, -- 'de jure capital' ["de-jure"] = "[[de jure]]", -- [[de-jure]] is a red link -- NOTE: 'unrecognized/unrecognised' are handled as placetypes 'unrecognized country', 'unrecognized state' -- misc. qualifiers ["planned"] = true, ["chartered"] = true, ["landlocked"] = true, ["uninhabited"] = true, -- superlative qualifiers ["first"] = no_link_def_article, ["second"] = no_link_def_article, -- for "second largest" etc. ["third"] = no_link_def_article, ["fourth"] = no_link_def_article, ["last"] = no_link_def_article, ["only"] = no_link_def_article, ["sole"] = no_link_def_article, ["main"] = no_link_def_article, ["largest"] = no_link_def_article, ["biggest"] = no_link_def_article, ["smallest"] = no_link_def_article, ["shortest"] = no_link_def_article, ["longest"] = no_link_def_article, ["tallest"] = no_link_def_article, ["highest"] = no_link_def_article, ["lowest"] = no_link_def_article, ["leftmost"] = no_link_def_article, ["rightmost"] = no_link_def_article, ["innermost"] = no_link_def_article, ["outermost"] = no_link_def_article, ["northernmost"] = no_link_def_article, ["southernmost"] = no_link_def_article, ["westernmost"] = no_link_def_article, ["easternmost"] = no_link_def_article, ["northwesternmost"] = no_link_def_article, ["southwesternmost"] = no_link_def_article, ["northeasternmost"] = no_link_def_article, ["southeasternmost"] = no_link_def_article, -- several/various ["several"] = no_link_no_article, ["various"] = no_link_no_article, ["numerous"] = no_link_no_article, ["multiple"] = no_link_no_article, ["many"] = no_link_no_article, ["other"] = no_link_no_article, } --[==[ var: In this table, the key qualifiers should 
be treated the same as the value qualifiers for categorization purposes. This is overridden by `placetype_data` and `qualifier_to_placetype_equivs`. ]==] export.former_qualifiers = { ["abandoned"] = {"FORMER"}, ["ancient"] = {"ANCIENT", "FORMER"}, ["former"] = {"FORMER"}, ["extinct"] = {"FORMER"}, ["historic"] = {"FORMER"}, ["historical"] = {"FORMER"}, ["medieval"] = {"ANCIENT", "FORMER"}, ["mediaeval"] = {"ANCIENT", "FORMER"}, ["ruined"] = {"ANCIENT", "FORMER"}, ["traditional"] = {"FORMER"}, } --[==[ var: In this table, any placetypes containing these qualifiers that do not occur in `placetype_data` should be mapped to the specified placetypes for categorization purposes. Entries here are overridden by `placetype_data`. ]==] export.qualifier_to_placetype_equivs = { ["fictional"] = "fictional location", ["legendary"] = "mythological location", ["mythical"] = "mythological location", ["mythological"] = "mythological location", -- For e.g. Taiwan as a "claimed province" of China; parts of Belize as claimed by Guatemala; various islands -- claimed by various parties in East Asia. FIXME: We should conditionalize on what is being claimed since there are -- also claimed capitals, e.g. Israel and Palestine claim Jerusalem as their capital. ["claimed"] = "claimed political division", } --[==[ var: Mapping from placetypes to the corresponding plural category-only placetype for a capital of that placetype. The reverse mapping also exists. ]==] export.placetype_to_capital_cat = { ["autonomous community"] = "autonomous community capitals", ["canton"] = "cantonal capitals", ["comarca"] = "comarca capitals", ["ประเทศ"] = "เมืองหลวงของประเทศ", -- The following are not obviously different from 'county seats' but the latter terminology is used in the US. 
["เทศมณฑล"] = "เมืองหลวงของเทศมณฑล", ["department"] = "departmental capitals", ["อำเภอ"] = "เมืองหลวงของอำเภอ", ["division"] = "division capitals", ["emirate"] = "emirate capitals", ["governorate"] = "governorate capitals", ["hromada"] = "hromada capitals", ["krai"] = "krai capitals", ["มหานคร"] = "เมืองหลวงของมหานคร", ["เทศบาล"] = "เมืองหลวงของเทศบาล", ["oblast"] = "oblast capitals", ["okrug"] = "okrug capitals", ["prefecture"] = "prefectural capitals", ["จังหวัด"] = "เมืองหลวงของจังหวัด", ["raion"] = "raion capitals", ["regency"] = "regency capitals", ["ภูมิภาค"] = "เมืองหลวงของภูมิภาค", ["regional unit"] = "regional unit capitals", ["republic"] = "republic capitals", ["รัฐ"] = "เมืองหลวงของรัฐ", ["ดินแดน"] = "เมืองหลวงของดินแดน", ["voivodeship"] = "voivodeship capitals", } --[==[ var: This contains placenames that should be preceded by an article (almost always "the"). '''NOTE''': There are multiple ways that placenames can come to be preceded by "the": # Listed here. # Given in [[Module:place/locations]] with an initial "the". All such placenames are added to this map by the code just below the map. # The placetype of the placename has `holonym_use_the = true` in its placetype_data. # A regex in placename_the_re matches the placename. Note that "the" is added only before the first holonym in a place description. ]==] export.placename_article = { -- This should only contain info that can't be inferred from [[Module:place/locations]]. 
["archipelago"] = { ["Cyclades"] = "the", ["Dodecanese"] = "the", }, ["ประเทศ"] = { ["Holy Roman Empire"] = "the", }, ["จักรวรรดิ"] = { ["Holy Roman Empire"] = "the", }, ["เกาะ"] = { ["North Island"] = "the", ["South Island"] = "the", }, ["ภูมิภาค"] = { ["Balkans"] = "the", ["Russian Far East"] = "the", ["Caribbean"] = "the", ["Caucasus"] = "the", ["Middle East"] = "the", ["New Territories"] = "the", ["North Caucasus"] = "the", ["South Caucasus"] = "the", ["West Bank"] = "the", ["Gaza Strip"] = "the", }, ["valley"] = { ["San Fernando Valley"] = "the", }, } --[==[ var: Regular expressions to apply to determine whether we need to put 'the' before a holonym. The key "*" applies to all holonyms, otherwise only the regexes for the holonym's placetype apply. ]==] export.placename_the_re = { -- We don't need entries for peninsulas, seas, oceans, gulfs or rivers -- because they have holonym_use_the = true. ["*"] = {"^Isle of ", " Islands$", " Mountains$", " Empire$", " Country$", " Region$", " District$", "^City of "}, ["bay"] = {"^Bay of "}, ["ทะเลสาบ"] = {"^Lake of "}, ["ประเทศ"] = {"^Republic of ", " Republic$"}, ["republic"] = {"^Republic of ", " Republic$"}, ["ภูมิภาค"] = {" [Rr]egion$"}, ["แม่น้ำ"] = {" River$"}, ["local government area"] = {"^Shire of "}, ["เทศมณฑล"] = {"^Shire of "}, ["Indian reservation"] = {" Reservation", " Nation"}, ["tribal jurisdictional area"] = {" Reservation", " Nation"}, } --[==[ var: If any of the following holonyms are present, the associated holonyms are automatically added to the end of the list of holonyms for categorization (but not display) purposes. 
]==] export.cat_implications = { ["ภูมิภาค"] = { ["Eastern Europe"] = {"continent/ยุโรป"}, ["Central Europe"] = {"continent/ยุโรป"}, ["Western Europe"] = {"continent/ยุโรป"}, ["South Europe"] = {"continent/ยุโรป"}, ["Southern Europe"] = {"continent/ยุโรป"}, ["Northern Europe"] = {"continent/ยุโรป"}, ["Northeast Europe"] = {"continent/ยุโรป"}, ["Northeastern Europe"] = {"continent/ยุโรป"}, ["Southeast Europe"] = {"continent/ยุโรป"}, ["Southeastern Europe"] = {"continent/ยุโรป"}, ["North Caucasus"] = {"continent/ยุโรป"}, ["South Caucasus"] = {"continent/เอเชีย"}, ["South Asia"] = {"continent/เอเชีย"}, ["Southern Asia"] = {"continent/เอเชีย"}, ["East Asia"] = {"continent/เอเชีย"}, ["Eastern Asia"] = {"continent/เอเชีย"}, ["Central Asia"] = {"continent/เอเชีย"}, ["West Asia"] = {"continent/เอเชีย"}, ["Western Asia"] = {"continent/เอเชีย"}, ["Southeast Asia"] = {"continent/เอเชีย"}, ["North Asia"] = {"continent/เอเชีย"}, ["Northern Asia"] = {"continent/เอเชีย"}, ["Anatolia"] = {"continent/เอเชีย"}, ["Asia Minor"] = {"continent/เอเชีย"}, ["Mesopotamia"] = {"continent/เอเชีย"}, ["North Africa"] = {"continent/แอฟริกา"}, ["Central Africa"] = {"continent/แอฟริกา"}, ["West Africa"] = {"continent/แอฟริกา"}, ["East Africa"] = {"continent/แอฟริกา"}, ["Southern Africa"] = {"continent/แอฟริกา"}, ["Central America"] = {"continent/อเมริกากลาง"}, ["Caribbean"] = {"continent/อเมริกาเหนือ"}, ["Polynesia"] = {"continent/โอเชียเนีย"}, ["Micronesia"] = {"continent/โอเชียเนีย"}, ["Melanesia"] = {"continent/โอเชียเนีย"}, ["Siberia"] = {"country/รัสเซีย", "continent/เอเชีย"}, ["Russian Far East"] = {"country/รัสเซีย", "continent/เอเชีย"}, ["South Wales"] = {"constituent country/เวลส์", "continent/ยุโรป"}, ["Balkans"] = {"continent/ยุโรป"}, ["West Bank"] = {"country/ปาเลสไตน์", "continent/เอเชีย"}, ["Gaza"] = {"country/ปาเลสไตน์", "continent/เอเชีย"}, ["Gaza Strip"] = {"country/ปาเลสไตน์", "continent/เอเชีย"}, } } 
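-- Illustrative sketch only, not part of the module's actual logic: one way a table shaped like
-- `cat_implications` above could be applied to a list of holonyms. The helper name
-- `example_apply_cat_implications`, the `implied` flag, and the assumption that holonyms are tables of the
-- form `{placetype = ..., unlinked_placename = ...}` are hypothetical; the real code path may differ.
local function example_apply_cat_implications(holonyms, implications)
	-- Copy the original holonyms so that the caller's list is not modified.
	local result = {}
	for _, holonym in ipairs(holonyms) do
		result[#result + 1] = holonym
	end
	for _, holonym in ipairs(holonyms) do
		local by_placename = implications[holonym.placetype]
		local implied = by_placename and by_placename[holonym.unlinked_placename]
		if implied then
			for _, implied_spec in ipairs(implied) do
				-- Each implied holonym is written as "placetype/placename"; split at the first slash.
				local placetype, placename = implied_spec:match("^(.-)/(.*)$")
				if placetype then
					-- Implied holonyms go at the end of the list, after all of the original holonyms.
					result[#result + 1] = {
						placetype = placetype,
						unlinked_placename = placename,
						-- Hypothetical marker: this holonym is for categorization only, not display.
						implied = true,
					}
				end
			end
		end
	end
	return result
end
-- Under these assumptions, a holonym list containing a region named "Siberia" would gain the implied
-- country and continent holonyms listed for it in the table, appended after the original entries.
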
------------------------------------------------------------------------------------------ -- Category and display handlers -- ------------------------------------------------------------------------------------------ local function city_type_cat_handler(data) local entry_placetype = data.entry_placetype local generic_before_non_cities = export.get_placetype_prop(entry_placetype, "generic_before_non_cities") if not generic_before_non_cities then internal_error("city_type_cat_handler called on placetype %s that doesn't have a `generic_before_non_cities`" .. " setting", entry_placetype) end local plural_entry_placetype = export.pluralize_placetype(entry_placetype) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and not spec.is_city then -- Categorize both in key, and in the larger polity that the key is part of, e.g. [[Hirakata]] goes in both -- "Cities in Osaka Prefecture" and "Cities in Japan". (But don't do the latter if no_container_cat is set.) local cap_plural_entry_placetype = ucfirst(plural_entry_placetype) local retcats = {("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(key, spec))} --th if container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%s%s%s"):format(cap_plural_entry_placetype, generic_before_non_cities, export.get_prefixed_key(container.key, container.spec))) --th end end return retcats end end local function capital_city_cat_handler(data, non_city) local holonym_placetype, holonym_placename, holonym_index, place_desc = data.holonym_placetype, data.holonym_placename, data.holonym_index, data.place_desc -- The first time we're called we want to return something; otherwise we will be called for later-mentioned -- holonyms, which can result in wrongly classifying into e.g. `National capitals`. 
Simulate the loop in -- find_placetype_cat_specs() over holonyms so we get the proper `Cities in ...` categories as well as the capital -- category/categories we add below. local retcats if not non_city and place_desc.holonyms then for h_index, holonym in export.get_holonyms_to_check(place_desc, holonym_index) do local h_placetype, h_placename = holonym.placetype, holonym.unlinked_placename retcats = city_type_cat_handler { entry_placetype = "นคร", holonym_placetype = h_placetype, holonym_placename = h_placename, holonym_index = h_index, place_desc = place_desc, } if retcats then break end end end if not retcats then retcats = {} end -- Now find the appropriate capital-type category for the placetype of the holonym, e.g. 'State capitals'. If we -- recognize the holonym among the known holonyms in [[Module:place/locations]], also add a category like 'State -- capitals of the United States'. Truncate e.g. 'autonomous region' to 'region', 'union territory' to 'territory' -- when looking up the type of capital category, if we can't find an entry for the holonym placetype itself (there's -- an entry for 'autonomous community'). local capital_cat = export.placetype_to_capital_cat[holonym_placetype] if not capital_cat then capital_cat = export.placetype_to_capital_cat[holonym_placetype:gsub("^.* ", "")] end if capital_cat then capital_cat = ucfirst(capital_cat) local inserted_specific_variant_cat = false if holonym_index then -- Now find the first recognized holonym location. We don't stop when :also is seen because of the common pattern -- where we use :also to specify that a given city is the capital at multiple surrounding levels. 
local matching_group, matching_key, matching_spec, matching_container_trail, matching_holonym_index for h_index = holonym_index, #place_desc.holonyms do if place_desc.holonyms[h_index].placetype then matching_group, matching_key, matching_spec, matching_container_trail = export.find_matching_holonym_location { holonym_placetype = place_desc.holonyms[h_index].placetype, holonym_placename = place_desc.holonyms[h_index].unlinked_placename, holonym_index = h_index, place_desc = place_desc, } if matching_group then matching_holonym_index = h_index break end end end if matching_holonym_index == holonym_index then if matching_container_trail[1] and not matching_spec.no_container_cat then for _, container in ipairs(matching_container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end elseif matching_holonym_index then -- Check to make sure that the holonym placetype we were called on is listed among the -- divtypes of the location we found. 
local function insert_specific_variant_if_possible(key, spec) return export.get_equiv_placetype_prop(holonym_placetype, function(pt) local plural_holonym_placetype = export.pluralize_placetype(pt) local saw_matching_div if spec.divs then local divs = spec.divs if type(divs) ~= "table" then divs = {divs} end for _, div in ipairs(divs) do if type(div) ~= "table" then div = {type = div} end if plural_holonym_placetype == div.type then saw_matching_div = true break end end end if saw_matching_div then insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(key, spec))) return true end return false end) end if insert_specific_variant_if_possible(matching_key, matching_spec) then inserted_specific_variant_cat = true elseif not matching_spec.no_container_cat then for _, containers in ipairs(matching_container_trail) do local saw_no_container_cat = false for _, container in ipairs(containers) do if insert_specific_variant_if_possible(container.key, container.spec) then inserted_specific_variant_cat = true break end saw_no_container_cat = saw_no_container_cat or container.spec.no_container_cat end if inserted_specific_variant_cat or saw_no_container_cat then break end end end end else -- This happens in an invocation like {{place|en|capital city|s/Haryana,Punjab}} for -- [[Chandigarh]]. We fall back to older code that doesn't depend on the holonym index existing. -- FIXME: This may not be necessary. In the example just given, when processing Haryana we add to -- [[:Category:en:State capitals of India]], and nothing extra gets added when processing Punjab. -- Possibly we can just skip this case entirely. 
local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and container_trail[1] and not spec.no_container_cat then for _, container in ipairs(container_trail[1]) do insert(retcats, ("%sของ%s"):format(capital_cat, export.get_prefixed_key(container.key, container.spec))) inserted_specific_variant_cat = true end end end if not inserted_specific_variant_cat then insert(retcats, capital_cat) end else -- We didn't recognize the holonym placetype; just put in 'Capital cities'. insert(retcats, "เมืองหลวง") end return retcats end --[=[ This is invoked specially for all placetypes (see the `*` placetype key at the bottom of `placetype_data`). This is used in two ways: # To add pages to generic holonym categories like [[:Category:en:สถานที่ในMerseyside, England]] (and [[:Category:en:สถานที่ในEngland]]) for any pages that have `co/Merseyside` as their holonym. # To categorize demonyms in bare placename categories like [[:Category:en:Merseyside, England]] if the demonym description mentions `co/Merseyside` and doesn't mention a more specific placename that also has a category. (In this case there are none, but we can have demonyms at multiple levels, e.g. in France for individual villages, departments, administrative regions, and for the entire country, and for example we only want to categorize a demonym into [[:Category:France]] if no more specific category applies.) Unlike when invoked from {{tl|place}}, a demonym invocation only adds the most specific holonym category and not the category of any containing polity (hence if we add [[:Category:en:Merseyside, England]] we won't also add [[:Category:England]]). This code also handles cities; e.g. for the first use case above, it would be used to add a page that has `city/Boston` as a holonym to [[:Category:en:สถานที่ในBoston]], along with [[:Category:en:สถานที่ในMassachusetts, USA]] and [[:Category:en:สถานที่ในthe United States]]. 
The city handler tries to deal with the possibility of multiple cities having the same name. For example, the code in [[Module:place/locations]] knows about the city of [[Columbus]], [[Ohio]], which has containing polities `Ohio` (a state) and `the United States` (a country). If either containing polity is mentioned, the handler proceeds to return the key `Columbus` (along with `Ohio, USA` and `the United States`). Otherwise, if any other state or country is mentioned, the handler returns nothing, and otherwise it assumes the mentioned city is the one we're considering and returns `Columbus` etc. This works correctly if the place only mentions Ohio and a holonym for a Columbus in a different country is encountered, because of the function `augment_holonyms_with_container`, which adds the US as a holonym when Ohio is encountered. The single parameter `data` is as in category handlers. The return value is a list of categories (without the preceding language code). ]=] local function generic_place_cat_handler(data) local from_demonym = data.from_demonym local retcats = {} local function insert_retkey(key, spec) if from_demonym then insert(retcats, key) else insert(retcats, ("สถานที่ใน%s"):format(export.get_prefixed_key(key, spec))) end end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then if not spec.no_generic_place_cat then -- This applies to continents and continental regions. insert_retkey(key, spec) end -- Categorize both in key, and in the larger location(s) that the key is part of, e.g. [[Hirakata]] goes in -- both [[Category:สถานที่ในOsaka Prefecture, Japan]] and [[Category:สถานที่ในJapan]]. But not when -- no_container_cat is set (e.g. for 'United Kingdom'). 
if not spec.no_container_cat then for _, container_set in ipairs(container_trail) do local stop_adding_containers = false for _, container in ipairs(container_set) do if not container.spec.no_generic_place_cat then insert_retkey(container.key, container.spec) end if container.spec.no_container_cat then stop_adding_containers = true end end if stop_adding_containers then break end end end return retcats end end --[==[ Special category handler run for all placetypes that checks for specified division placetypes of known locations and categorizes appropriately. ]==] function export.political_division_cat_handler(data) if data.from_demonym then return end local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group then local divlists = {} if spec.divs then insert(divlists, spec.divs) end if spec.addl_divs then insert(divlists, spec.addl_divs) end for _, divlist in ipairs(divlists) do if type(divlist) ~= "table" then divlist = {divlist} end for _, div in ipairs(divlist) do if type(div) == "string" then div = {type = div} end local sgdiv = export.maybe_singularize_placetype(div.type) or div.type local prep = div.prep or "ของ" local cat_as = div.cat_as or div.type if type(cat_as) ~= "table" then cat_as = {cat_as} end if not export.placetype_data[sgdiv] then internal_error("Placetype %s associated with known location key %s and data %s not found in " .. "`placetype_data`", sgdiv, key, spec) end if sgdiv == data.entry_placetype then local retcats = {} for _, pt_cat in ipairs(cat_as) do if type(pt_cat) == "string" then pt_cat = {type = pt_cat} end local pt_prep = pt_cat.prep or prep insert(retcats, ucfirst(pt_cat.type) .. pt_prep .. export.get_prefixed_key(key, spec)) --th end return retcats end end end end end --[==[ This is used to add pages to "bare" categories like [[:Category:en:Georgia, USA]] for `[[Georgia]]` and any foreign-language terms that are translations of the state of Georgia. 
We look at the page title (or its overridden value in {{para|pagename}}) as well as the glosses in {{para|t}}/{{para|t2}} etc., various extra-info values such as the modern names in {{para|modern}}, and any values specified using a form-of directive. We need to pay attention to the entry placetypes specified so we don't overcategorize; e.g. the US state of Georgia is `[[Джорджия]]` in Russian but the country of Georgia is `[[Грузия]]`, and if we just looked for matching names, we'd get both Russian terms categorized into both [[:Category:ru:Georgia, USA]] and [[:Category:ru:Georgia]]. We also need to check the containing holonyms to make sure there isn't a mismatch (so we don't e.g. categorize Newark, Delaware in [[:Category:en:Newark]], which is intended for Newark, New Jersey). ]==] function export.get_bare_categories(args, overall_place_spec) local bare_cats = {} local place_descs = overall_place_spec.descs local possible_placetypes_by_place_desc = {} for i, place_desc in ipairs(place_descs) do possible_placetypes_by_place_desc[i] = {} for _, placetype in ipairs(place_desc.placetypes) do if not export.placetype_is_ignorable(placetype) then local equivs = export.get_placetype_equivs(placetype, {register_former_as_non_former = true}) for _, equiv in ipairs(equivs) do insert(possible_placetypes_by_place_desc[i], equiv.placetype) end end end end local function check_term(term) -- Treat Wikipedia links like local ones. term = term:gsub("%[%[w:", "[["):gsub("%[%[wikipedia:", "[[") term = export.remove_links_and_html(term) term = term:gsub("^the ", "") for i, place_desc in ipairs(place_descs) do -- Iterate over all matching locations in case there are multiple, as with Delhi defined as -- {{place|en|megacity/and/union territory|c/India|containing the national capital [[New Delhi]]}}. 
for group, key, spec, container_trail in export.iterate_matching_holonym_location { holonym_placetype = possible_placetypes_by_place_desc[i], holonym_placename = term, place_desc = place_desc, } do insert(bare_cats, key) end end end -- FIXME: Should we only do the following if the language is English (requires that the lang is passed in)? -- We should always do it if `pagename` is given (as it is with {{tcl}}) but maybe not otherwise unless 1=en. There -- are cases like [[Ankara]] = English name for capital of Turkey, but also the name in various languages for the -- capital of Ghana (= English [[Accra]]). But this should get caught by mismatching the containing country. The -- advantage of checking when the language isn't English is we catch those places that fail to give an English -- translation but where the translation happens to be the same as the other-language spelling. However, I don't -- know how often this situation occurs. check_term(args.pagename or mw.title.getCurrentTitle().subpageText) for _, t in ipairs(args.t) do check_term(t) end local function check_termobj_list(terms) for _, term in ipairs(terms) do if term.eq then check_term(term.eq) end if term.alt or term.term then check_term(term.alt or term.term) end end end for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do local arg = extra_info_terms.arg if arg == "modern" or arg == "now" or arg == "full" or arg == "short" then check_termobj_list(extra_info_terms.terms) end end for _, directive in ipairs(overall_place_spec.directives) do check_termobj_list(directive.terms) end return bare_cats end --[==[ This is used to augment the holonyms associated with a place description with the containing polities. For example, given the following: `# {{tl|place|en|subprefecture|pref/Hokkaido}}.` We auto-add Japan as another holonym so that the term gets categorized into [[:Category:Subprefectures of Japan]]. 
To avoid over-categorizing we need to check to make sure no other countries are specified as holonyms. ]==] function export.augment_holonyms_with_container(place_descs) for _, place_desc in ipairs(place_descs) do if place_desc.holonyms then -- This ends up containing a copy of the original holonyms, with the augmented holonyms inserted in their -- appropriate position. We don't just put them at the end because some holonyms use the `:also` -- modifier, which causes category processing to restart at that point after generating categories for a -- preceding holonym, and we don't want the preceding holonym's augmented holonyms interfering with -- categorization of a later holonym. We proceed from right to left, and each time we augment, we copy -- the holonyms with the augmented holonym(s) inserted appropriately and replace the place description's -- holonyms with the augmented ones before the next iteration. The reason for this is so that e.g. -- {{place|neighborhood|city/Birmingham|co/West Midlands|cc/England}} doesn't throw an error during the -- augmentation process due to 'Birmingham' referring to two known locations (in England and Alabama). If -- we go left to right, we will throw an ambiguity error on `city/Birmingham` because code to exclude -- Birmingham, Alabama needs `c/United Kingdom` present (to cause a mismatch with `c/United States`), -- which isn't yet present as the augmentation code hasn't gotten to `cc/England` yet. For similar -- reasons, we need to include the augmented holonyms in the holonyms considered in the next iteration -- rather than modifying the place description once at the end. 
for i = #place_desc.holonyms, 1, -1 do local holonym = place_desc.holonyms[i] if holonym.placetype and not export.placetype_is_ignorable(holonym.placetype) then local group, key, spec, container_trail = export.find_matching_holonym_location { holonym_placetype = holonym.placetype, holonym_placename = holonym.unlinked_placename, holonym_index = i, place_desc = place_desc, } if group and container_trail[1] and not spec.no_auto_augment_container then local augmented_holonyms = {} for j = 1, i do insert(augmented_holonyms, place_desc.holonyms[j]) end for _, containers in ipairs(container_trail) do local any_no_auto_augment_container = false for _, container in ipairs(containers) do any_no_auto_augment_container = any_no_auto_augment_container or container.spec.no_auto_augment_container local containing_type = container.spec.placetype if type(containing_type) == "table" then -- If the containing type is a list, use the first element as the canonical variant. containing_type = containing_type[1] end local full_container_placename, elliptical_container_placename = m_locations.key_to_placename(container.group, container.key) -- Don't side-effect holonyms while processing them. local new_holonym = { -- By the time we run, the display has already been generated so we don't need to -- set display_placename. placetype = containing_type, -- placename_to_key() for the group should correctly handle both full and elliptical -- placenames, but the full placename seems less likely to be ambiguous. FIXME: We -- should just store the key directly and use it when available to avoid having to -- convert key to placename and back to key. unlinked_placename = full_container_placename, -- Indicate that this is an augmented holonym, and was derived from the specified -- holonym. In iterate_matching_holonym_location(), we ignore augmented holonyms -- derived from holonyms that are different from the holonym we're searching for but -- of the same placetype. 
This is to correctly handle a situation like -- {{place|river|dept/Ardèche,Gard,Vaucluse,Bouches-du-Rhône|c/France}}. Here, -- `Ardèche` is in `r/Auvergne-Rhône-Alpes`, while `Gard` is in `r/Occitania` and -- the other two are in `r/Provence-Alpes-Côte d'Azur`. Augmenting proceeds from -- right to left, so after it adds `r/Provence-Alpes-Côte d'Azur` to -- `Bouches-du-Rhône`, Vaucluse gets augmented correctly but `Gard` fails to match -- in find_matching_holonym_location() because of the mismatch between augmented -- `r/Provence-Alpes-Côte d'Azur` and actual `r/Occitania`. Similarly, all later -- calls to find_matching_holonym_location() fail to match `Gard` (and likewise -- `Ardèche`) against any known location. To deal with this, we mark augmented -- holonyms as being augmented due to a source holonym, and when processing a given -- holonym, ignore augmented holonyms from other holonyms of the same placetype. -- The restriction to the same placetype is so that `Birmingham` still gets -- correctly disambiguated to Birmingham, England in the example given above near -- the top of this function, using the augmented holonym `c/United Kingdom` added by -- the specified `cc/England` (whose placetype `constituent country` differs from -- the placetype `city` of Birmingham). augmented_from_holonym = holonym, } insert(augmented_holonyms, new_holonym) -- But it is safe to modify other parts of the place_desc. export.key_holonym_into_place_desc(place_desc, new_holonym) end if any_no_auto_augment_container then break end end for j = i + 1, #place_desc.holonyms do insert(augmented_holonyms, place_desc.holonyms[j]) end place_desc.holonyms = augmented_holonyms end end end end end end -- Cat handler for districts, areas, neighborhoods and suburbs. Districts are tricky because they can either be political -- divisions or city neighborhoods. Areas similarly can be political divisions (rarely; specifically, in Kuwait), city -- neighborhoods or larger geographical areas/regions. 
We handle this as follows: -- (1) `placetype_data` cat entries for specific countries or country divisions take precedence over cat_handlers, so if -- the user says {{tl|place|district|s/Maharashtra|c/India}}, we won't even be called because there is an entry that -- categorizes into [[:Category:Districts of Maharashtra, India]]. -- (2) If we're called, we check the holonym we're called on to see if it is a recognized city, e.g. if we're called -- using {{tl|place|district|city/Mumbai|s/Maharashtra|c/India}}. If so, we categorize under e.g. -- [[:Category:Neighbourhoods of Mumbai]]. (Choosing the spelling "neighbourhoods" because we're in India.) -- (3) If we're called and the holonym is not a recognized city, we check if the placetype has has_neighborhoods set. -- If so, it's "city-like" and we categorize under the first containing polity that we recognize. For example, if -- we're called using {{tl|place|district|town/Northampton|co/Hampshire|s/Massachusetts|c/US}}, we should recognize -- town as "city-like" and categorize under [[:Category:Neighborhoods in Massachusetts]]. (Note "ใน" ("in"), not "ของ" ("of"), and -- note the spelling "neighborhoods" because we're in the US.) -- (4) If the holonym is not city-like, we do nothing. If there's a city or city-like placetype farther up (e.g. we're -- called as {{tl|place|district|ward/Foo|mun/Bar|...}}), we will handle the city-like entity according to (2) or -- (3) when called on that holonym. Otherwise either the categorization in (1) takes place or there's no -- categorization. local function district_neighborhood_cat_handler(data) local function get_plural_entry_placetype(location_spec, container_trail) if data.entry_placetype == "suburb" then return "Suburbs" else -- Check for `british_spelling` setting on the spec itself or any container. 
local uses_british_spelling = location_spec.british_spelling if uses_british_spelling == nil and container_trail then for _, container_set in ipairs(container_trail) do local must_outer_break = false for _, container in ipairs(container_set) do if container.spec.british_spelling ~= nil then uses_british_spelling = container.spec.british_spelling must_outer_break = true break end end if must_outer_break then break end end end return uses_british_spelling and "Neighbourhoods" or "Neighborhoods" end end -- First check the immediate holonym to see if it's a city or a city-like top-level entity (Hong Kong, Bonaire, -- etc.) local group, key, spec, container_trail = export.find_matching_holonym_location(data) if group and not spec.is_former_place and spec.is_city then return {get_plural_entry_placetype(spec, container_trail) .. " of " .. export.get_prefixed_key(key, spec)} end -- If the entry placetype is neighbo(u)rhood, assume it is a neighborhood even if there isn't a city-like -- entity farther up the chain. (E.g. due to a mistaken use of m/ instead of mun/ for municipality.) local has_neighborhoods local entry_placetype = data.entry_placetype if entry_placetype == "neighborhood" or entry_placetype == "neighbourhood" or entry_placetype == "suburb" then has_neighborhoods = true else -- Otherwise, make sure the current holonym is city-like. has_neighborhoods = export.get_equiv_placetype_prop(data.holonym_placetype, function(pt) return export.get_placetype_prop(pt, "has_neighborhoods") end, {continue_on_nil_only = true}) end if has_neighborhoods then -- Loop up the holonyms, looking for city and city-like entities in case of e.g. [[Sepulveda]] written -- {{place|en|neighborhood|valley/San Fernando Valley|city/Los Angeles|s/California|c/USA}} -- but also look for a recognizable poldiv, and if so categorize as "Neighborhoods in POLDIV". 
We need -- to start with the current holonym, which is especially important for neighborhoods and suburbs that -- may have the first holonym be a recognizable province, etc. but can't hurt otherwise. (Previously -- we skipped the first/current holonym.) for other_holonym_index, other_holonym in export.get_holonyms_to_check(data.place_desc, data.holonym_index) do local other_holonym_data = { holonym_placetype = other_holonym.placetype, holonym_placename = other_holonym.unlinked_placename, holonym_index = other_holonym_index, place_desc = data.place_desc, } local group, key, spec, container_trail = export.find_matching_holonym_location(other_holonym_data) if group and not spec.is_former_place then return {get_plural_entry_placetype(spec, container_trail) .. (spec.is_city and "ของ" or "ใน") .. export.get_prefixed_key(key, spec)} end end end end function export.check_already_seen_string(holonym_placename, already_seen_strings) local canon_placename = ulower(m_links.remove_links(holonym_placename)) if type(already_seen_strings) ~= "table" then already_seen_strings = {already_seen_strings} end for _, already_seen_string in ipairs(already_seen_strings) do if canon_placename:find(already_seen_string) then return true end end return false end -- Prefix display handler that adds a prefix such as "Metropolitan Borough of " to the display -- form of holonyms. We make sure the holonym doesn't contain the prefix or some variant already. -- We do this by checking if any of the strings in ALREADY_SEEN_STRINGS, either a single string or -- a list of strings, or the prefix if ALREADY_SEEN_STRINGS is omitted, are found in the holonym -- placename, ignoring case and links. If the prefix isn't already present, we create a link that -- uses the raw form as the link destination but the prefixed form as the display form, unless the -- holonym already has a link in it, in which case we just add the prefix. 
local function prefix_display_handler(prefix, holonym_placename, already_seen_strings) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(prefix)) then return holonym_placename end if holonym_placename:find("%[%[") then return prefix .. " " .. holonym_placename end return prefix .. " [[" .. holonym_placename .. "]]" end -- Suffix display handler that adds a suffix such as " parish" to the display form of holonyms. -- Works identically to prefix_display_handler but for suffixes instead of prefixes. local function suffix_display_handler(suffix, holonym_placename, already_seen_strings, include_suffix_in_link) if export.check_already_seen_string(holonym_placename, already_seen_strings or ulower(suffix)) then return holonym_placename end if holonym_placename:find("%[%[") then return holonym_placename .. " " .. suffix end if include_suffix_in_link then return "[[" .. holonym_placename .. " " .. suffix .. "]]" else return "[[" .. holonym_placename .. "]] " .. suffix end end -- Display handler for boroughs. New York City boroughs are displayed as-is. Others are suffixed -- with "borough". local function borough_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.new_york_boroughs[unlinked_placename] then -- Hack: don't display "borough" after the names of NYC boroughs return holonym_placename end return suffix_display_handler("borough", holonym_placename) end local function county_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) -- Display handler for Irish counties. Irish counties are displayed as e.g. "County [[Cork]]". if m_locations.ireland_counties["County " .. unlinked_placename .. ", Ireland"] or m_locations.northern_ireland_counties["County " .. unlinked_placename .. 
", Northern Ireland"] then return prefix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Taiwanese counties. Taiwanese counties are displayed as e.g. "[[Chiayi]] County". if m_locations.taiwan_counties[unlinked_placename .. " County, Taiwan"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- Display handler for Romanian counties. Romanian counties are displayed as e.g. "[[Cluj]] County". if m_locations.romania_counties[unlinked_placename .. " County, Romania"] then return suffix_display_handler("เทศมณฑล", holonym_placename) end -- FIXME, we need the same for US counties but need to key off the country, not the specific county. -- Others are displayed as-is. return holonym_placename end -- Display handler for prefectures. Japanese prefectures are displayed as e.g. "[[Fukushima]] Prefecture". -- Others are displayed as e.g. "[[Fthiotida]] prefecture". local function prefecture_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) local suffix = m_locations.japan_prefectures[unlinked_placename .. " Prefecture, Japan"] and "Prefecture" or "prefecture" return suffix_display_handler(suffix, holonym_placename) end -- Display handler for provinces of Iran, Laos, North and South Korea, Thailand, Turkey and Vietnam. Recognized -- provinces are displayed as e.g. "[[Gyeonggi]] Province" or "[[Antalya]] Province". Others are displayed as-is. local function province_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.iran_provinces[unlinked_placename .. ", Iran"] or m_locations.laos_provinces[unlinked_placename .. ", Laos"] or m_locations.north_korea_provinces[unlinked_placename .. ", North Korea"] or m_locations.south_korea_provinces[unlinked_placename .. ", South Korea"] or m_locations.thailand_provinces[unlinked_placename .. 
", ไทย"] or m_locations.turkey_provinces[unlinked_placename .. ", Turkey"] or m_locations.vietnam_provinces[unlinked_placename .. ", เวียดนาม"] then return suffix_display_handler("จังหวัด", holonym_placename) end return holonym_placename end -- Display handler for Nigerian states. Nigerian states are display as "[[Kano]] State". Others are displayed as-is. local function state_display_handler(holonym_placetype, holonym_placename) local unlinked_placename = m_links.remove_links(holonym_placename) if m_locations.nigeria_states[unlinked_placename .. " State, Nigeria"] then return suffix_display_handler("รัฐ", holonym_placename) end return holonym_placename end -- Display handler for voivodeships. Display as e.g. [[Subcarpathian Voivodeship]]. local function voivodesip_display_handler(holonym_placetype, holonym_placename) return suffix_display_handler("Voivodeship", holonym_placename, nil, "include_suffix_in_link") end ------------------------------------------------------------------------------------------ -- Placetype data -- ------------------------------------------------------------------------------------------ --[==[ var: Main placetype data structure. This specifies, for each canonicalized placetype, various properties. The keys are placetypes (in the singular, except for category-only placetypes, which are plural and followed by `!`), and the value is a table of properties. The `"*"` key is special and is used for adding "generic" categories of the form `สถานที่ใน``location`` `; it runs for all entry placetypes. Keys in the form of plural placetypes followed by `!` are used only in [[Module:category tree/topic cat/data/Places]] for specifying the properties of categories containing the specified placetype, esp. bare categories like [[:Category:States and territories]] (rather than qualified categories like [[:Category:States and territories of Australia]]). 
Keys under the value table for a given placetype are of two types: ''property keys'' (which specify the value of specific properties) and ''categorization keys'' (which tell how to categorize certain sorts of holonyms if the placetype in question occurs as an entry placetype). Categorization keys are either the special value `default` or are wildcard strings with a slash in them, such as `"country/*"`. Note that only wildcard strings are currently allowed directly in the placetype data; everything else is handled through category handlers, either per-placetype or special (such as `political_division_cat_handler`). The algorithm for how category keys and handlers are used to generate categories is described at the top of [[Module:place]]. There are several recognized property keys, of various types: 1. The following link-related property keys are recognized: * `link`: '''Required''' except in category-only placetypes ending in `!`. Describes how to link and display the placetype in the formatted description when occurring as an entry placetype. Also used for formatting pluralized placetypes (which may occur in entry placetypes, esp. new-format ones such as `two <<islands>>`, and may occur in categories). The possible values are: *# `true`: Link to the same-named Wiktionary entry. This creates a raw link, e.g. `<nowiki>[[city]]</nowiki>`, which is converted to an English-specific link by JavaScript postprocessing. If the placetype is plural, this creates a two-part raw link e.g. `<nowiki>[[city|cities]]</nowiki>`. *# `"w"`: Link to the same-named Wikipedia entry. This creates a two-part link, e.g. `<nowiki>[[w:census town|census town]]</nowiki>`, or `<nowiki>[[w:census town|census towns]]</nowiki>` if the placetype is given plural. *# `"+..."`: Create a two-part link to the entry following the `+` sign. 
For example, if `cercle` specifies `"+w:cercles of Mali"`, a two-part link `<nowiki>[[w:cercles of Mali|cercle]]</nowiki>` will be generated, or `<nowiki>[[w:cercles of Mali|cercles]]</nowiki>` if plural `cercles` is specified. *# `"separately"`: Link each word separately. For example, if `administrative territory` specifies `"separately"`, it will be linked as `<nowiki>[[administrative]] [[territory]]</nowiki>`, or as `<nowiki>[[administrative]] [[territory|territories]]</nowiki>` if plural `administrative territories` is given. *# another string: Use that string directly. If the placetype is plural, `pluralize()` in [[Module:en-utilities]] is called on the string, which will correctly pluralize most strings, including those with links in them. (If there are multiple links, the display form of the last link is pluralized.) *# `false`: This placetype is not allowed as an entry placetype. An error will be thrown if this placetype is given as an entry placetype. This is specified for internal-use placetypes, especially placetypes used in conjunction with the qualifiers `former`, `ancient`, `historical` and such. * `plural_link`: If specified and the placetype is plural, use the value in place of generating a pluralized version of the link spec in `link`. Most commonly, this is either a string with links in it (which is used directly) or the value `false`, indicating that the placetype cannot occur plural. (This is used for example by `caplc`, which displays as `<nowiki>[[capital]] and [[large]]st [[city]]</nowiki>`, where a plural version doesn't make sense.) Generally if this is specified, `plural` also needs to be specified to give a special placetype plural; this situation occurs especially with multiword placetypes where something other than the last word is pluralized. An example is `town with bystatus`, whose plural is `towns with bystatus`, which needs to be explicitly given. 
This example uses `link = <nowiki>"[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>` ({{m|nb|bystatus}} is a Norwegian Bokmål word, and template calls aren't currently permitted in link strings), along with `plural_link = <nowiki>"[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]"</nowiki>`. * `category_link`: Spec indicating how to display the placetype when occurring in category descriptions. Defaults to the value of `link`, and in turn is overridden by more specific `category_link_*` keys; see below. Category-only placetypes (which are plural and end in `!`) usually use `category_link` in preference to `link`. The value of `category_link` can be any of the types of specs given above, but most commonly is a plural string with links in it, spelling out the description; in this case it is used directly. When both `category_link` and `link` are given, the value in `category_link` is typically longer and more descriptive. For example, `polity` uses `link = true`, which just generates a link `<nowiki>[[polity]]</nowiki>` or plural `<nowiki>[[polity|polities]]</nowiki>`, but specifies a separate `category_link = <nowiki>"[[independent]] or [[semi-]][[independent]] [[polity|polities]]"</nowiki>`, which clarifies in the category description what a polity is. * `category_link_top_level`: Spec indicating how to display top-level (bare/unqualified) categories, i.e. categories where the placetype is not followed by `in ``location`` ` or `of ``location`` `. If given, this overrides `category_link` for this type of category. * `category_link_before_noncity`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` does not refer to a city. If given, this overrides `category_link` for this type of category. * `category_link_before_city`: Spec indicating how to display qualified categories of the form ` ``placetypes`` in/of ``location`` ` where ``location`` refers to a city. 
If given, this overrides `category_link` for this type of category. An example where this is given is `neighborhood`, which uses the following specs:<ol> <li>`link = true`</li> <li>`category_link = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]"</nowiki>`</li> <li>`category_link_before_city = <nowiki>"[[neighborhood]]s, [[district]]s and other subportions"</nowiki>`</li> </ol> This has the effect of making the entry placetype `neighborhood` display as just `<nowiki>[[neighborhood]]</nowiki>`, while e.g. a category like `Neighborhoods of Chicago` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[Chicago]], ...</nowiki>` and a category like `Neighborhoods in Illinois, USA` displays as `<nowiki>[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]] in [[Illinois]], ...</nowiki>`. * `disallow_in_entries`: If specified, this placetype cannot occur as an entry placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. * `disallow_in_holonyms`: If specified, this placetype cannot occur as a holonym placetype, and the specified value (a message indicating what to use instead) is displayed in the error message. 2. There is currently one fallback-related property key recognized: * `fallback`: If specified, its value is a placetype which will be used for categorization purposes if no categories get added using the placetype itself. As an example, `branch` sets a fallback of `river` but also sets `preposition = "ของ"`, meaning that {{tl|place|en|branch|riv/Mississippi}} displays as `a branch of the Mississippi` (whereas `river` itself uses the preposition `in`), but otherwise categorizes the same as `river`. A more complex example is `area`, which sets a fallback of `geographic and cultural area` and also sets a category handler that checks for cities or city-like entities (e.g. 
boroughs) occurring as holonyms and categorizes the toponym under [[:Category:Neighborhoods of CITY]] (for recognized cities) or otherwise [[:Category:Neighborhoods of POLDIV]] (for the nearest containing recognized location). In addition, `area` is set as a political division of Kuwait, meaning if `c/Kuwait` occurs as holonym, the toponym is categorized under [[:Category:Areas of Kuwait]]. If none of these categories trigger, the fallback of `geographic and cultural area` will take effect, and the toponym will be categorized as e.g. [[:Category:Geographic and cultural areas of England]]. 3. There is currently one property to control irregular plurals of placetypes: * `plural`: If specified, its value is the plural of the placetype. Otherwise, the default pluralization algorithm in [[Module:en-utilities]] applies (which correctly pluralizes most words, including those ending in `-y`, `-ch`, `-sh`, `-x`, etc.). The value of `plural` is also used when converting a pluralized placetype into its singular equivalent; for example, since the placetype `kibbutz` has `plural = "kibbutzim"`, the placetype `kibbutzim` will be recognized as a plural and singularized to `kibbutz`. For this reason, it's occasionally necessary to specify a `plural` value even when the default pluralization algorithm works correctly, if the default singularization algorithm won't correctly reverse the pluralization (as with `pass` and other terms ending in `-ss`). 4. The following property keys relate to generating categories for entry placetypes and specifying the parents of those categories: * `class`: The general class of placetype. This is used for various purposes: (a) to categorize placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical` (note that these placetypes are not all treated alike); (b) to determine the parent category of bare placetype categories (e.g. 
[[:Category:Villages]] for placetype `village`); (c) to determine whether to add a parent category `political divisions of specific countries` to qualified placetype categories (e.g. [[:Category:Villages in Mali]]). The possible values are: *# `polity`: a more-or-less sovereign/independent polity, such as a country, kingdom or empire. *# `subpolity`: a non-sovereign division of a polity, above the level of an individual settlement. *# `settlement`: a city or smaller equivalent, such as a village. This also includes administrative divisions of a settlement, such as wards and barangays. *# `non-admin settlement`: similar to a settlement but without administrative or political significance, such as an unincorporated community, farm or neighborhood. *# `capital`: a settlement that is a capital. A former capital is generally still in existence, just not the capital any more. *# `natural feature`: any non-man-made feature, such as a lake, mountain, island, ocean, etc. *# `man-made structure`: a man-made feature below the level of a neighborhood, such as a house, airport, university, metro station, park or the like. *# `geographic region`: a geographic or cultural region or area that has no administrative significance. These may vary greatly in size but typically have some sort of cultural significance (possibly historical). The `former`, `ancient`, etc. qualifier has no effect on the category of these placetypes. *# `generic place`: a place that isn't further qualified into any specific subtype. * `former_type`: The class of placetype used for categorizing placetypes preceded by a qualifier such as `former`, `ancient`, `medieval` or `historical`. The possible values are the same as for `class` but with the addition of `dependent territory` (for colonies, protectorates and the like) and `!` (ignore the historical/former/ancient/etc. qualifier; used e.g. with `fictional location` and `mythological location`). If not specified, the value of `class` is used. 
When a qualifier such as `former`, `ancient`, `medieval` or `historical` is encountered (specifically, those in `former_qualifiers`), it is mapped using `former_qualifiers` to the appropriate internal qualifier or qualifiers (one or both of `ANCIENT` and/or `FORMER`, which are written in all-caps to distinguish them from user-specified qualifiers), which is prepended to the value of `former_type` or `class` to form a placetype whose properties are looked up to determine how to categorize the toponym in question. For example, if `medieval village` is given, we map `medieval` to `ANCIENT` and `FORMER`, and `village` to its `class` of `settlement`, and enter the placetypes `ANCIENT settlement` and `FORMER settlement` (in that order) into the list of equivalent placetypes returned by `get_placetype_equivs`. In this case, there is an entry in `placetype_data` for `ANCIENT settlement`, so its default category spec `Ancient settlements` is used as the category. If on the other hand `medieval kingdom` is given, where `kingdom` has a `class` value `polity`, we first look up `ANCIENT polity`, see there is no entry in `placetype_data` for it, and then look up `FORMER polity`, which exists and has a default category spec `Former polities`, which is used as the category. Note that if the placetype following the "former" qualifier is recognized in `placetype_data` but has no `former_type` or `class` and no fallback with a `former_type` or `class` specified, it is an internal error; but if the placetype isn't recognized (e.g. something like `former greenhouse` is specified and we don't have an entry for `greenhouse`), we just track the occurrence and end up not categorizing. * `bare_category_parent`: This specifies the first parent category of a bare placetype category named according to the placetype in question (e.g. [[:Category:Atolls]] for placetype `atoll`, or [[:Category:Named buildings]] for placetype `named buildings!`). 
If not specified, the first parent category is determined by the value of `class`, using the mapping `class_to_bare_category_parent` in [[Module:category tree/topic cat/data/Places]]. * `addl_bare_category_parents`: Extra parent categories to add a bare placetype category to (see `bare_category_parent` just above). * `bare_category_breadcrumb`: Breadcrumb for bare placetype categories. Also used as the sort key of `bare_category_parent` if it is a string. * `inherently_former`: If specified and the given placetype is used as an entry placetype, act as if `former` or `ancient` (depending on the value of `inherently_former`) were prefixed to the placetype. This is for placetypes that always refer to no-longer-existing entities, such as `satrapy` and `treaty port`. The value of `inherently_former` is a list of internal qualifiers (one or more of `ANCIENT` and/or `FORMER`), just as for `former_qualifiers`, and the implementation is the same. * `cat_handler`: Handler used to generate the categories to add a given toponym to, if its entry placetype is the placetype in question. Generally the `cat_handler` function checks the holonyms specified in order to determine which category or categories to generate. For example, `district_neighborhood_cat_handler` handles placetypes `district`, `neighborhood`, `subdivision`, `suburb` and the like, and either adds the toponym to a category like `Neighborhoods of ``city`` ` (if a recognized city is given as a holonym), or otherwise a category like `Neighborhoods in ``location`` ` (for the first recognized non-city location given as a holonym, if an unrecognized city or city-like entity is given before the recognized non-city). The algorithm that runs the category handlers iterates over holonyms from left to right, running the `cat_handler` function on each holonym in turn until one or more categories are returned; see below for more specifics. (Note that countries for which e.g. 
a `district` is a political division do not get the corresponding category added by the `district_neighborhood_cat_handler` function but by `political_division_cat_handler`.) `cat_handler` functions are called with one argument, `data`, describing the resolved entry placetype (i.e. after resolving placetype aliases and fallbacks) and the holonym being processed. The return value should be a list of category specs (categories minus the langcode prefix, with `+++` standing for the holonym key, or the value `true`, which stands for ` ``Placetypes`` in/of ``Holonym`` `, i.e. the pluralized placetype with the appropriate preposition as specified in `placetype_data`). `data` contains the following fields: ** `entry_placetype`: the resolved entry placetype for the entry placetype being processed (i.e. it will always have an entry in `placetype_data` but may not be the original placetype given by the user); ** `holonym_placetype` and `holonym_placename`: the holonym placetype and placename being processed; ** `holonym_index`: the index of the holonym being processed, or {nil} if we're handling an overriding holonym (FIXME: we will change the overriding holonym algorithm so there will be an index even when processing overriding holonyms); ** `place_desc`: a full description of the {{tl|place}} call, as specified at the top of [[Module:place]]; ** `from_demonym`: If set, we are called from [[Module:demonym]], triggered by {{tl|demonym-adj}} or {{tl|demonym-noun}}, instead of being triggered by {{tl|place}}. * `has_neighborhoods`: If `true`, the specified placetype is city-like. This is used in the `district_neighborhood_cat_handler` to determine whether to add a category such as `Neighborhoods in ``location`` `; see the section just above on `cat_handler`. 5. The following preposition-related property keys are recognized: * `preposition`: The preposition used after this placetype when it occurs as an entry placetype. Defaults to `"ใน"`. 
* `generic_before_non_cities`: If specified, the appropriate category description handler in [[Module:category tree/topic cat/data/Places]] will recognize categories of the form ` ``Placetype`` in/of ``location`` ` for the specified placetype and preposition, if ``location`` is a non-city. This is used to generate descriptions for categories added by category handlers and by explicit category specs in the placetype data. All placetypes that specify `generic_before_non_cities` or `generic_before_cities` *MUST* also specify a value for `class` so that the category tree code can determine whether it's a political or non-political division. * `generic_before_cities`: Like `generic_before_non_cities` but for locations referring to cities. 6. The following property keys control the auto-addition of affixes when formatting holonyms of a particular placetype: * `affix_type`: If specified, add the placetype as an affix before or after holonyms of this placetype. Possible values are: *# `"pref"` (the holonym will display as `(the) placetype of Holonym`, where `the` appears when the holonym directly follows an entry placetype); *# `"Pref"` (same as `"pref"` but the placetype is capitalized; each word is capitalized if there are multiple); *# `"suf"` (the holonym will display as `Holonym placetype`); *# `"Suf"` (the holonym will display as `Holonym Placetype`, i.e. same as `"suf"` but the placetype is capitalized). * `suffix`: String to use in place of the placetype itself when the placetype is displayed as a suffix after a holonym. Note that `suffix` can be used independently of `affix_type` because the user can also request a suffix explicitly using a syntax like `adr:suf/Occitania`, which will display as `Occitania region` because the placetype `administrative region` specifies `suffix = "ภูมิภาค"`. * `prefix`: Like `suffix` but for use when the placetype is displayed as a prefix before the holonym. 
* `affix`: Like `suffix` and `prefix` but for use when the placetype is displayed as an affix either before or after the holonym. If both `suffix` or `prefix` and `affix` are given for a single placetype, `suffix` or `prefix` take precedence. * `no_affix_strings`: String or list of strings that, if they occur in the holonym, suppress the addition of any affix requested using `affix_type`. Defaults to the placetype itself. For example, `autonomous okrug` specifies `affix_type = "Suf"` so that `aokr/Nenets` displays as `Nenets Autonomous Okrug`, but also specifies `no_affix_strings = "okrug"` so that `aokr/Nenets Okrug` or `aokr/Nenets Autonomous Okrug` displays as specified, without a redundant `Autonomous Okrug` added. Matching is case-insensitive but whole-word. * `display_handler`: A function of two arguments, `holonym_placetype` and `holonym_placename` (specifying a holonym). Its return value is a string specifying the display form of the holonym. 7. The following property keys control the indefinite and definite articles used before entry placetypes and/or holonyms of the specified placetype. * `entry_placetype_use_the`: Use `"the"` before this placetype when it occurs as an entry placetype. * `entry_placetype_indefinite_article`: Indefinite article used before this placetype when it occurs as an entry placetype (usually `"a"`, specifically for placetypes beginning with u- that don't take the indefinite article `"an"`). Defaults to the appropriate indefinite article (`"a"` or `"an"` depending on whether the placetype begins with a vowel). Overridden by `entry_placetype_use_the`, and unlike for most properties, does not apply to equivalent placetypes (i.e. fallbacks or those formed by removing a qualifier from the beginning); only to the exact placetype specified. * `holonym_use_the`: Use `"the"` before holonyms of this placetype. 
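The precedence among the article-related properties described above can be sketched as follows. This is only an illustrative sketch, not the module's actual implementation: the function name `entry_placetype_article` and the `union territory` example are hypothetical, and the vowel test is a simplification of whatever the real code does.

```lua
-- Hypothetical helper (not part of the module): pick the article shown before
-- an entry placetype, given that placetype's own properties.
local function entry_placetype_article(placetype, props)
	props = props or {}
	-- `entry_placetype_use_the` overrides any indefinite article.
	if props.entry_placetype_use_the then
		return "the"
	end
	-- An explicitly specified indefinite article comes next.
	if props.entry_placetype_indefinite_article then
		return props.entry_placetype_indefinite_article
	end
	-- Default: "an" before a vowel letter, otherwise "a".
	if placetype:find("^[AEIOUaeiou]") then
		return "an"
	end
	return "a"
end

-- `county seat` sets `entry_placetype_use_the = true` in the data below.
print(entry_placetype_article("county seat", {entry_placetype_use_the = true}))
-- Hypothetical u- placetype that needs an explicit "a".
print(entry_placetype_article("union territory", {entry_placetype_indefinite_article = "a"}))
print(entry_placetype_article("island"))
print(entry_placetype_article("village"))
```

Because the indefinite article does not apply to equivalent placetypes, a caller in this sketch would pass only the properties of the exact placetype specified, not those of its fallbacks.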
'''NOTE:''' # The `link` property must be specified on all placetypes, except those ending in `!` (category-only placetypes), which must have either `link` or `category_link` specified. # Either the `class` or `former_type` property must be specified on all placetypes not ending in `!` that do not have a fallback (if a placetype has a fallback and omits the `class` and `former_type` properties, they are taken from the fallback). An internal error will result if a placetype has no `class` or `former_type` property derivable either directly or through a fallback, if an attempt is made to categorize a former/ancient/historical/etc. entity of this placetype. # It is possible to have multiple levels of fallback (e.g. `frazione` falls back to `hamlet`, which falls back to `village`). Fallback loops will cause an internal error. All placetypes specified as fallbacks must exist in `placetype_data` or an internal error occurs. ]==] export.placetype_data = { --[=[ If you need to sort the following, do this (using Vim): 1. Make sure all full-line comments are within the { ... } table, or are moved after and on the same line as single-line entries. 2. Make sure the table uses tabs everywhere for indent, and not spaces. 3. Mark the top of the table with `ma`, go to the bottom and execute the following two lines in sequence: :'a,.s/\n/\\n/g :s/\\n\(\t\[\)/\r\1/g The first command converts every newline to a literal `\n` sequence, so the whole thing becomes a single line, while the second command restores the newlines before the beginning of each entry. The effect is to convert all entries to a single line while not losing any information. (Potentially a negative lookahead could be used to do it all in one command.) 4. Execute the following to sort: :'a,.!perl -pe 's/^(\t\[")(.*?)(".*)$/$2 @@@ $1$2$3/' | sort -f | perl -pe 's/.*? 
@@@ //' Note that a simple `sort -f` (where `-f` means case-insensitive) would almost work, but it would sort "hill station" before "hill" because the space after e.g. "hill station" sorts before the quotation mark after e.g. "hill". The above command deals with this by extracting the key, prepending it followed by ` @@@ `, sorting, and then removing the key (the classic decorate-sort-undecorate pattern). 5. Put the table back to multi-line format by marking the top of the table with `ma`, going to the bottom and executing :'a,.s/\\n/\r/g Note that for some reason, in order to match a newline on the left side of a replacement, you must use \n, but to insert a newline on the right side of a replacement you must use \r. ]=] ["*"] = { link = false, cat_handler = generic_place_cat_handler, }, ["administrative atoll"] = { -- Maldives link = "+w:administrative divisions of the Maldives", preposition = "ของ", class = "subpolity", }, ["administrative capital"] = { link = "w", fallback = "เมืองหลวง", }, ["administrative center"] = { link = "w", fallback = "เมืองหลวงที่ไม่ใช่นคร", }, ["administrative centre"] = { link = "w", fallback = "administrative center", }, ["administrative county"] = { link = "w", fallback = "เทศมณฑล", }, ["administrative district"] = { link = "w", fallback = "อำเภอ", }, ["administrative headquarters"] = { link = "separately", fallback = "administrative centre", }, ["administrative region"] = { link = true, preposition = "ของ", suffix = "ภูมิภาค", -- but prefix is still "administrative region (of)" fallback = "ภูมิภาค", class = "subpolity", }, ["administrative seat"] = { link = "w", fallback = "administrative centre", }, ["administrative territory"] = { link = "separately", preposition = "ของ", suffix = "ดินแดน", -- but prefix is still "administrative territory (of)" fallback = "ดินแดน", class = "subpolity", }, ["administrative unit"] = { -- Grrr, it's difficult to generalize about "administrative units". 
In Albania, "administrative unit" is an -- official term for a city-level division of municipalities; Wikipedia renders it using the more practical term -- "commune". In Pakistan, "administrative unit" is a collective term used to refer to all the different types -- of first-level divisions (four provinces, one federal territory, and two "disputed territories", i.e. Azad -- Kashmir and Gilgit-Baltistan, that are variously described). For this reason, we set no fallback, but we need -- to include this so that it can be used as a placetype for Albania, categorizing as communes. link = "w", class = "subpolity", }, ["administrative village"] = { link = "w", preposition = "ของ", has_neighborhoods = true, class = "settlement", }, ["aimag"] = { -- used in Mongolia, Russia and China (Inner Mongolia); in Mongolia, equivalent to a province; -- in China, equivalent to a prefecture (below a province); in Russia, equivalent to a municipal district. link = "w", fallback = "prefecture", }, ["airport"] = { link = true, class = "man-made structure", default = {true}, }, ["alliance"] = { link = true, fallback = "confederation", }, ["archipelago"] = { link = true, fallback = "เกาะ", }, ["area"] = { link = true, preposition = "ของ", fallback = "geographic and cultural area", -- Areas can either be administrative divisions (specifically of Kuwait) or geographic areas. Assume the former -- when categorizing 'Areas' but the latter when handling e.g. 'historical area'. class = "subpolity", former_type = "geographic region", cat_handler = district_neighborhood_cat_handler, }, ["arm"] = { link = true, preposition = "ของ", class = "natural feature", default = {"ทะเล"}, }, ["arrondissement"] = { link = true, preposition = "ของ", -- FIXME!!! Grrrrr!!! In some countries, arrondissements are divisions of cities; in others, they are divisions -- of departments or provinces. Need to conditionalize on the country for both of the following. 
class = "subpolity", has_neighborhoods = true, }, ["associated province"] = { link = "separately", fallback = "จังหวัด", }, ["atoll"] = { -- FIXME! Atolls are administrative divisions of the Maldives but natural features elsewhere. Need to -- conditionalize `class` on the country. See also `administrative atoll`. link = true, class = "natural feature", bare_category_parent = "เกาะ", default = {true}, }, ["autonomous city"] = { link = "w", preposition = "ของ", fallback = "นคร", has_neighborhoods = true, }, ["autonomous community"] = { -- Spain; refers to regional entities, not village-like entities, as might be expected from "community" link = true, preposition = "ของ", class = "subpolity", }, ["autonomous island"] = { -- Comoros; seems like an administrative atoll of the Maldives. link = "+w:autonomous islands of Comoros", preposition = "ของ", class = "subpolity", }, ["autonomous oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "oblast", class = "subpolity", }, ["autonomous okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", no_affix_strings = "okrug", class = "subpolity", }, ["autonomous prefecture"] = { link = true, fallback = "prefecture", }, ["autonomous province"] = { link = "w", fallback = "จังหวัด", }, ["autonomous region"] = { link = "w", preposition = "ของ", fallback = "administrative region", -- "administrative region" sets an affix of "ภูมิภาค" but we want to display as "Tibet Autonomous Region" -- if the user writes 'ar:Suf/Tibet'. affix = "autonomous region", }, ["autonomous republic"] = { link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territorial unit"] = { -- Moldova; only two of them, one for Gagauzia and one for Transnistria. link = "w", preposition = "ของ", class = "subpolity", }, ["autonomous territory"] = { link = "w", fallback = "dependent territory", }, ["bailiwick"] = { -- Jersey, etc. 
link = true, fallback = "องค์การทางการเมือง", }, ["barangay"] = { -- Philippines link = true, class = "settlement", -- Barangays are formal administrative divisions of a city rather than informal neighborhoods, but can use -- some of the properties of a neighborhood. fallback = "neighborhood", }, ["barrio"] = { -- Spanish-speaking countries; Philippines link = true, -- FIXME: Not completely correct, in some countries barrios are formal administrative divisions of a city. -- `class` will need to conditionalize on the country to be completely correct. fallback = "neighborhood", }, ["basin"] = { link = true, fallback = "ทะเลสาบ", }, ["bay"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["beach"] = { link = true, class = "natural feature", addl_bare_category_parents = {"water"}, default = {true}, }, ["beach resort"] = { link = "w", fallback = "resort town", }, ["bishopric"] = { link = true, fallback = "องค์การทางการเมือง", }, ["bodies of water!"] = { -- FIXME: This is (maybe?) a type category not a name category. There should be an option for this. We need to -- straighten out the type vs. name vs. related-to issue. category_link = "[[body of water|bodies of water]]", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems", "water"}, }, ["borough"] = { link = true, preposition = "ของ", display_handler = borough_display_handler, has_neighborhoods = true, -- "former borough" could be a former settlement or a former part of a city but seems more likely to -- be a former subpolity, particularly in England. FIXME, we really need a handler to take care of this -- properly. class = "subpolity", -- Grr, some boroughs are city-like but some (e.g. in Britain) may be larger. 
}, ["borough seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["branch"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["bridge"] = { link = true, class = "man-made structure", default = {"Named bridges"}, }, ["building"] = { link = true, class = "man-made structure", default = {"Named buildings"}, }, ["built-up area"] = { link = "w", fallback = "area", }, ["burgh"] = { link = true, fallback = "borough", }, ["business park"] = { link = true, fallback = "park", }, ["caliphate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["canton"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["cape"] = { link = true, fallback = "headland", }, ["capital"] = { link = true, fallback = "เมืองหลวง", }, ["เมืองหลวง"] = { link = true, category_link = "[[capital city|capital cities]]: the [[seat of government|seats of government]] for a country or [[political]] [[division]] of a country", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", bare_category_parent = "นคร", cat_handler = capital_city_cat_handler, default = {true}, -- The following is necessary so that e.g. [[Melbourne]] defined as {{place|en|capital city|s/Victoria|c/Australia}} -- gets categorized in the bare category [[Category:en:Melbourne]]; otherwise placetype 'capital city' wouldn't -- match against the placetype 'city' of Melbourne. 
fallback = "นคร", }, ["caplc"] = { link = "[[capital]] and [[large]]st [[city]]", plural_link = false, fallback = "เมืองหลวง", }, ["captaincy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["caravan city"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"ANCIENT", "FORMER"}, }, ["castle"] = { link = true, fallback = "building", }, ["cathedral city"] = { link = true, fallback = "นคร", }, ["cattle station"] = { -- Australia link = true, fallback = "farm", }, ["census area"] = { link = true, affix_type = "Suf", has_neighborhoods = true, class = "non-admin settlement", }, ["census-designated place"] = { -- United States link = true, class = "non-admin settlement", }, ["census division"] = { -- Canada link = "w", preposition = "ของ", class = "subpolity", }, ["census town"] = { link = "w", fallback = "เมือง", }, ["central business district"] = { link = true, fallback = "neighborhood", }, ["cercle"] = { -- Mali link = "+w:cercles of Mali", preposition = "ของ", class = "subpolity", }, ["ceremonial county"] = { link = true, fallback = "เทศมณฑล", }, ["chain of islands"] = { link = "[[chain]] of [[island]]s", plural = "chains of islands", plural_link = "[[chain]]s of [[island]]s", fallback = "เกาะ", }, ["channel"] = { link = true, fallback = "strait", }, ["charter community"] = { -- Northwest Territories, Canada link = "w", fallback = "village", }, ["นคร"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["city-state"] = { link = true, category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]", has_neighborhoods = true, class = "settlement", ["continent/*"] = {"City-states", "นครใน+++", "ประเทศใน+++", "เมืองหลวงของ"}, default = {"City-states", "นคร", "ประเทศ", "เมืองหลวงของประเทศ"}, }, ["civil parish"] = { -- Mostly 
England; similar to municipalities link = true, preposition = "ของ", affix_type = "suf", has_neighborhoods = true, class = "subpolity", }, ["claimed political division"] = { link = "[[claim]]ed [[political]] [[division]]", class = "subpolity", default = {true}, }, ["co-capital"] = { link = "[[co-]][[capital]]", fallback = "เมืองหลวง", }, ["coal city"] = { link = "+w:coal town", fallback = "นคร", }, ["coal town"] = { link = "w", fallback = "เมือง", }, ["collectivity"] = { link = "w", preposition = "ของ", -- No default; these are weird one-off governmental divisions in France (esp. for overseas collectivities) class = "subpolity", }, ["colony"] = { link = true, fallback = "dependent territory", }, ["comarca"] = { -- per Wikipedia: traditional region or local administrative division found in Portugal, Spain, and some of -- their former colonies, like Brazil, Nicaragua, and Panama. In the Valencian Community, for example, it -- sits between municipalities and provinces, something like a county or district. 
link = true, preposition = "ของ", class = "subpolity", }, ["commandery"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["commonwealth"] = { link = true, preposition = "ของ", -- No default; applies specifically to Puerto Rico class = "subpolity", }, ["commune"] = { link = true, fallback = "เทศบาล", }, ["community"] = { link = true, category_link = "[[community|communities]] of all sizes", fallback = "village", }, ["community development block"] = { -- in India; appears to be similar to a rural municipality; groups several villages, unclear if there will be -- neighborhoods so I'm not setting `has_neighborhoods` for now link = "w", affix_type = "suf", no_affix_strings = "block", class = "subpolity", }, ["comune"] = { -- Italy, Switzerland link = true, fallback = "เทศบาล", }, ["condominium"] = { link = true, fallback = "องค์การทางการเมือง", }, ["confederacy"] = { link = true, fallback = "confederation", }, ["confederation"] = { link = true, fallback = "องค์การทางการเมือง", }, ["constituency"] = { -- currently we have them as political divisions of Namibia but many countries have them link = true, preposition = "ของ", class = "subpolity", }, ["constituent country"] = { link = true, preposition = "ของ", class = "subpolity", }, ["constituent part"] = { link = "separately", preposition = "ของ", class = "subpolity", }, ["constituent republic"] = { -- Of Russia, Yugoslavia, etc. link = "separately", preposition = "ของ", class = "subpolity", }, ["counties and county-level cities!"] = { -- This is used when grouping counties and county-level cities under prefecture-level cities in China. 
category_link = "[[county|counties]] and [[county-level city|county-level cities]]", class = "subpolity", }, ["ทวีป"] = { link = true, category_link = false, -- can't occur as a bare category class = "natural feature", default = {"Continents and continental regions"}, }, ["continental region"] = { link = "separately", category_link = false, -- can't occur as a bare category class = "geographic region", fallback = "ทวีป", }, ["continents and continental regions!"] = { category_link = "[[continent]]s and [[continent]]-[[level]] [[region]]s (e.g. [[Polynesia]])", class = "geographic region", }, ["council area"] = { link = true, -- in Scotland; similar to a county preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["ประเทศ"] = { link = true, class = "polity", -- do not translate the `class` value ["continent/*"] = {true, "ประเทศ"}, default = {true}, }, ["country-like entities!"] = { category_link = "[[polity|polities]] not normally considered [[country|countries]] but treated similarly for categorization purposes; typically, [[unrecognized]] [[de-facto]] countries or [[w:dependent territory|dependent territories]]", class = "polity", -- do not translate the `class` value }, ["เทศมณฑล"] = { link = true, preposition = "ของ", display_handler = county_display_handler, class = "subpolity", }, ["county borough"] = { link = true, -- in Wales; similar to a county preposition = "ของ", affix_type = "suf", fallback = "borough", class = "subpolity", }, ["county seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["county town"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", fallback = "เมือง", has_neighborhoods = true, class = "capital", }, ["county-administered city"] = { -- In Taiwan, per Wikipedia similar to a Taiwanese township or district, which is a small city. -- NOT anything like a "county-level city" in PR China, which is a county masquerading as a city. 
link = "w", fallback = "นคร", has_neighborhoods = true, class = "settlement", }, ["county-controlled city"] = { -- Taiwan link = "w", fallback = "county-administered city", }, ["county-level city"] = { -- PR China link = "w", fallback = "prefecture-level city", }, ["crater lake"] = { link = true, fallback = "ทะเลสาบ", }, ["creek"] = { link = true, fallback = "stream", }, ["Crown colony"] = { link = "+crown colony", fallback = "crown colony", }, ["crown colony"] = { link = true, fallback = "colony", }, ["Crown dependency"] = { link = true, fallback = "dependent territory", }, ["crown dependency"] = { link = true, fallback = "dependent territory", }, ["cultural area"] = { link = "w", fallback = "geographic and cultural area", }, ["cultural region"] = { link = "w", fallback = "geographic and cultural area", }, ["delegation"] = { -- Tunisia link = "+w:delegations of Tunisia", preposition = "ของ", class = "subpolity", }, ["department"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["departmental capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dependency"] = { link = true, fallback = "dependent territory", }, ["dependent territory"] = { link = "w", preposition = "ของ", class = "subpolity", former_type = "dependent territory", bare_category_parent = "political divisions", ["country/*"] = {true}, default = {true}, }, ["desert"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems"}, default = {true}, }, ["deserted mediaeval village"] = { link = "w", fallback = "deserted medieval village", }, ["deserted medieval village"] = { link = "w", fallback = "ANCIENT settlement", }, ["direct-administered municipality"] = { -- China link = "+w:direct-administered municipalities of China", fallback = "เทศบาล", }, ["direct-controlled municipality"] = { -- several countries link = "w", fallback = "เทศบาล", }, ["distributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["อำเภอ"] = { 
link = true, preposition = "ของ", affix_type = "suf", -- Grrr! FIXME! Here is where we need handlers for `class`. Using similar logic to -- district_neighborhood_cat_handler, we need to check if we're below or above a city to determine if the class -- is "settlement" or "subpolity". class = "subpolity", cat_handler = district_neighborhood_cat_handler, -- No default. Countries for which districts are political divisions will get entries. }, ["districts and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Portugal. category_link = "[[district]]s and [[autonomous region]]s", class = "subpolity", }, ["districts and autonomous territorial units!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Moldova. category_link = "[[district]]s and [[w:autonomous territorial unit|autonomous territorial unit]]s", class = "subpolity", }, ["district capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["district headquarters"] = { link = "separately", fallback = "administrative centre", }, ["district municipality"] = { -- In Canada, a district municipality is equivalent to a rural municipality and won't have neighborhoods; in -- South Africa, district municipalities group local municipalities and hence won't have neighborhoods. 
link = "w", preposition = "ของ", affix_type = "suf", no_affix_strings = {"อำเภอ", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["division"] = { link = true, preposition = "ของ", class = "subpolity", }, ["division capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["dome"] = { link = true, fallback = "ภูเขา", }, ["dormant volcano"] = { link = true, fallback = "volcano", }, ["duchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["emirate"] = { link = true, preposition = "ของ", -- FIXME: Can be subpolities (of the United Arab Emirates). fallback = "องค์การทางการเมือง", }, ["จักรวรรดิ"] = { link = true, fallback = "องค์การทางการเมือง", }, ["enclave"] = { link = true, preposition = "ของ", -- Enclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["entity"] = { -- Bosnia and Herzegovina link = "+w:entities of Bosnia and Herzegovina", preposition = "ของ", class = "subpolity", }, ["escarpment"] = { link = true, fallback = "ภูเขา", }, ["ethnographic region"] = { -- used in Lithuania link = "+w:ethnographic regions of Lithuania", fallback = "geographic and cultural area", }, ["exclave"] = { link = true, preposition = "ของ", -- exclaves can theoretically be any size but assume a subpolity. class = "subpolity", }, ["external territory"] = { link = "separately", fallback = "dependent territory", }, ["farm"] = { link = true, class = "non-admin settlement", default = {"Farms and ranches"}, }, ["farms and ranches!"] = { category_link = "[[farm]]s and [[ranch]]es", class = "non-admin settlement", }, ["federal city"] = { link = "w", preposition = "ของ", fallback = "นคร", }, ["federal district"] = { link = true, preposition = "ของ", -- Might have neighborhoods as federal districts are often cities (e.g. 
Mexico City) has_neighborhoods = true, class = "settlement", }, ["federal subject"] = { -- In Russia; a generic term for first-level administrative divisions (republics, oblasts, okrugs, krais, -- autonomous okrugs and autonomous oblasts). link = "w", preposition = "ของ", class = "subpolity", }, ["federal territory"] = { link = "w", fallback = "ดินแดน", }, ["fictional location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["First Nations reserve"] = { -- Canada link = "[[First Nations]] [[w:Indian reserve|reserve]]", -- Wikipedia uses "Indian reserve"; presumably that is the legal term fallback = "Indian reserve", class = "subpolity", }, ["fjord"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["footpath"] = { link = true, fallback = "road", }, ["forest"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ecosystems", "forestry"}, default = {true}, }, ["fort"] = { link = true, fallback = "building", }, ["fortress"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- fortresses -> fortresse, so put an entry here to ensure we singularize correctly. plural = "fortresses", fallback = "building", }, ["frazione"] = { link = "w", fallback = "hamlet", }, ["freeway"] = { link = true, fallback = "road", }, ["French prefecture"] = { link = "[[w:prefectures in France|prefecture]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", }, ["geographic and cultural area"] = { link = "+w:cultural area", -- `generic_before_non_cities` is used when generating the category description of categories of the format -- `Geographic and cultural areas of PLACE`. 
`preposition` is used when generating {{place}} description and -- categories for any placetype that falls back to `geographic and cultural area`. generic_before_non_cities = "ของ", preposition = "ของ", class = "geographic region", bare_category_parent = "สถานที่", ["country/*"] = {true}, ["constituent country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["geographic area"] = { link = "+w:geographic region", fallback = "geographic and cultural area", }, ["geographic region"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical area"] = { link = "w", fallback = "geographic and cultural area", }, ["geographical region"] = { link = "w", fallback = "geographic and cultural area", }, ["geopolitical zone"] = { -- Nigeria link = true, preposition = "ของ", class = "subpolity", }, ["gewog"] = { -- Bhutan link = true, preposition = "ของ", class = "subpolity", }, ["ghost town"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", bare_category_parent = "former settlements", cat_handler = city_type_cat_handler, default = {true}, }, ["glen"] = { link = true, fallback = "valley", }, ["governorate"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["greater administrative region"] = { -- China (former division) link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["gromada"] = { -- Poland (former division) link = "w", preposition = "ของ", affix_type = "Pref", class = "subpolity", inherently_former = {"FORMER"}, }, ["group of islands"] = { link = "[[group]] of [[island]]s", plural = "groups of islands", plural_link = "[[group]]s of [[island]]s", fallback = "island group", }, ["gulf"] = { link = true, preposition = "ของ", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["hamlet"] = { link = true, fallback = "village", }, ["harbor city"] = { link = "separately", fallback = 
"นคร", }, ["harbor town"] = { link = "separately", fallback = "เมือง", }, ["harbour city"] = { link = "separately", fallback = "นคร", }, ["harbour town"] = { link = "separately", fallback = "เมือง", }, ["headland"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["headquarters"] = { link = "w", fallback = "administrative centre", }, ["heath"] = { link = true, fallback = "moor", }, ["hemisphere"] = { link = true, entry_placetype_use_the = true, fallback = "continental region", }, ["highway"] = { link = true, fallback = "road", }, ["hill"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["hill station"] = { link = "w", fallback = "เมือง", }, ["hill town"] = { link = "w", fallback = "เมือง", }, ["historic region"] = { -- provided only for the link link = "+w:historical region", fallback = "FORMER geographic region", }, ["historical county"] = { -- needed for historical counties of England/etc. link = "+w:historic county", fallback = "FORMER subpolity", }, ["historical region"] = { -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["home rule city"] = { link = "w", fallback = "นคร", }, ["home rule municipality"] = { link = "w", fallback = "เทศบาล", }, ["hot spring"] = { link = true, fallback = "spring", }, ["house"] = { link = true, fallback = "building", }, ["housing estate"] = { -- not the same as a housing project (i.e. 
public housing) link = true, -- not exactly the case but approximately fallback = "neighborhood", }, ["hromada"] = { -- Ukraine link = "w", disallow_in_entries = "Use placetype 'urban hromada', 'rural hromada' or 'settlement hromada' in place of bare 'hromada'", disallow_in_holonyms = "Use placetype 'urban hromada'/'uhrom', 'rural hromada'/'rhrom' or 'settlement hromada'/'shrom' in place of bare 'hromada'", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["inactive volcano"] = { link = "w", fallback = "dormant volcano", }, ["independent city"] = { link = true, fallback = "นคร", }, ["independent town"] = { link = "+independent city", fallback = "เมือง", }, ["Indian reservation"] = { link = "w", -- In the US. Also known as "Native American reservation" or "domestic dependent nation", and the reservations -- themselves often use the term "nation" in their official name (e.g. the "Navajo Nation"). But Wikipedia puts -- the article at [[w:Indian reservation]] and uses that term when describing e.g. what the Navajo Nation is, -- so this must still be the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["Indian reserve"] = { link = "w", -- In Canada. "First Nations reserve" sounds more modern/PC but Wikipedia uses "Indian reserve"; presumably that -- is still the legal term. preposition = "ของ", class = "subpolity", default = {true}, }, ["inland sea"] = { -- note, we also have 'inland' as a qualifier link = true, fallback = "ทะเล", }, ["inner city area"] = { link = "[[inner city]] [[area]]", fallback = "neighborhood", }, ["เกาะ"] = { link = true, preposition = "ของ", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["island country"] = { -- FIXME: The following should map to both 'island' and 'country'. 
link = "w", fallback = "ประเทศ", }, ["island group"] = { link = "separately", fallback = "เกาะ", }, ["island municipality"] = { link = "w", fallback = "เทศบาล", }, ["islet"] = { link = "w", fallback = "เกาะ", }, ["Israeli settlement"] = { link = "w", class = "settlement", default = {true}, }, ["judicial capital"] = { link = "w", fallback = "เมืองหลวง", }, ["khanate"] = { link = true, fallback = "องค์การทางการเมือง", }, ["kibbutz"] = { link = true, plural = "kibbutzim", class = "non-admin settlement", default = {true}, }, ["kingdom"] = { link = true, fallback = "monarchy", }, ["krai"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ทะเลสาบ"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["ธรณีสัณฐาน!"] = { category_link = "[[ธรณีสัณฐาน]]", bare_category_parent = "สถานที่", addl_bare_category_parents = {"โลก"}, }, ["largest city"] = { link = "[[large]]st [[city]]", entry_placetype_use_the = true, fallback = "นคร", has_neighborhoods = true, }, ["league"] = { link = true, fallback = "confederation", }, ["legislative capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["library"] = { link = true, fallback = "building", }, ["lieutenancy area"] = { -- used in the United Kingdom; per Wikipedia: -- In England, lieutenancy areas are colloquially known as the ceremonial counties, although this phrase does -- not appear in any legislation referring to them. The lieutenancy areas of Scotland are subdivisions of -- Scotland that are more or less based on the counties of Scotland, making use of the major cities as separate -- entities.[2] In Wales, the lieutenancy areas are known as the preserved counties of Wales and are based on -- those used for lieutenancy and local government between 1974 and 1996. 
The lieutenancy areas of Northern -- Ireland correspond to the six counties and two former county boroughs.[3] link = "w", fallback = "ceremonial county", }, ["local authority district"] = { link = "w", fallback = "local government district", }, ["local government area"] = { -- Australia link = "w", preposition = "ของ", class = "subpolity", }, ["local council"] = { -- Malta; similar to municipalities link = "+w:local councils of Malta", preposition = "ของ", fallback = "เทศบาล", }, ["local government district"] = { link = "w", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local government district with borough status"] = { link = "[[w:local government district|local government district]] with [[w:borough status|borough status]]", plural = "local government districts with borough status", plural_link = "[[w:local government district|local government districts]] with [[w:borough status|borough status]]", preposition = "ของ", affix_type = "suf", affix = "อำเภอ", class = "subpolity", }, ["local urban district"] = { link = "w", fallback = "unincorporated community", }, ["locality"] = { link = "+w:locality (settlement)", -- not necessarily true, but usually is the case fallback = "village", }, ["London borough"] = { link = "w", preposition = "ของ", affix_type = "pref", affix = "borough", fallback = "local government district with borough status", has_neighborhoods = true, }, ["macroregion"] = { link = true, fallback = "ภูมิภาค", }, ["man-made structures!"] = { category_link = "[[w:geographical feature#Engineered constructs|man-made structures]] such as [[airport]]s, [[university|universities]] and [[metro station]]s", bare_category_parent = "สถานที่", }, ["manor"] = { -- FIXME: or is this more like a farm? 
link = true, fallback = "building", }, ["marginal sea"] = { link = true, preposition = "ของ", fallback = "ทะเล", }, ["market city"] = { link = "+market town", fallback = "นคร", }, ["market town"] = { link = true, fallback = "เมือง", }, ["massif"] = { link = true, fallback = "ภูเขา", }, ["megacity"] = { link = true, fallback = "นคร", }, ["metro station"] = { link = true, class = "man-made structure", }, ["metropolitan borough"] = { link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"borough", "นคร"}, fallback = "local government district", has_neighborhoods = true, }, ["มหานคร"] = { -- These exist e.g. in Italy and are more like municipalities or even provinces than cities. link = true, preposition = "ของ", affix_type = "Pref", no_affix_strings = {"มหานคร", "นคร"}, class = "subpolity", }, ["metropolitan county"] = { link = true, fallback = "เทศมณฑล", }, ["metropolitan municipality"] = { -- In South Africa, metropolitan municipalities group local municipalities and are like districts, between -- provinces and municipalities. -- In Turkey, metropolitan municipalities are provinces-level. link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"metropolitan", "เทศบาล"}, fallback = "เทศบาล", class = "subpolity", }, ["microdistrict"] = { -- residential complex in post-Soviet states link = true, fallback = "neighborhood", }, ["micronations!"] = { -- FIXME, merge with microstate category_link = "[[micronation]]s", bare_category_parent = "ประเทศ", }, ["microstate"] = { link = true, fallback = "ประเทศ", }, ["military base"] = { link = "w", class = "settlement", -- or "man-made structure"? 
default = {true}, }, ["minster town"] = { -- England link = "separately", fallback = "เมือง", }, ["monarchy"] = { link = true, fallback = "องค์การทางการเมือง", }, ["moor"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "ecosystems"}, default = {true}, }, ["moorland"] = { link = true, fallback = "moor", }, ["motorway"] = { link = true, fallback = "road", }, ["ภูเขา"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["mountain indigenous district"] = { -- Taiwan link = "+w:district (Taiwan)", fallback = "อำเภอ", }, ["mountain indigenous township"] = { -- Taiwan link = "+w:township (Taiwan)", fallback = "township", }, ["mountain pass"] = { link = true, -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "mountain passes", class = "natural feature", addl_bare_category_parents = {"ภูเขา"}, default = {true}, }, ["เทือกเขา"] = { link = true, fallback = "ภูเขา", }, ["mountainous region"] = { link = "separately", fallback = "ภูมิภาค", }, ["mukim"] = { -- Malaysia, Brunei, Indonesia, Singapore link = true, preposition = "ของ", class = "subpolity", }, ["municipal district"] = { link = "w", -- meaning varies depending on the country; for now, assume no neighborhoods. -- FIXME: has_neighborhoods might have to be a function that looks at the containing holonyms. 
preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "เทศบาล", }, ["เทศบาล"] = { link = true, preposition = "ของ", has_neighborhoods = true, class = "subpolity", }, ["municipality with city status"] = { link = "[[municipality]] with [[w:city status|city status]]", plural = "municipalities with city status", plural_link = "[[municipality|municipalities]] with [[w:city status|city status]]", fallback = "เทศบาล", }, ["museum"] = { link = true, fallback = "building", }, ["mythological location"] = { link = "separately", former_type = "!", class = "hypothetical location", bare_category_parent = "สถานที่", default = {true}, }, ["named bridges!"] = { category_link = "notable [[bridge]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"bridges"}, }, ["named buildings!"] = { category_link = "notable [[house]]s, [[library|libraries]] and other [[building]]s", bare_category_parent = "man-made structures", addl_bare_category_parents = {"buildings"}, }, ["named roads!"] = { category_link = "notable [[road]]s, [[highway]]s, [[trail]]s and similar linear structures", bare_category_parent = "man-made structures", addl_bare_category_parents = {"roads"}, }, ["national capital"] = { link = "w", fallback = "เมืองหลวง", }, ["national park"] = { link = true, fallback = "park", }, ["natural features!"] = { category_link = "[[w:geographical feature#Natural features|natural features]] such as [[lake]]s, [[mountain]]s, [[island]]s and [[ocean]]s", bare_category_parent = "สถานที่", }, ["neighborhood"] = { -- The majority of the properties here apply to both `neighborhoods` and `neighbourhoods`; the choice of which -- one to use is made by district_neighborhood_cat_handler() based on the value of `british_spelling` for the -- location (city, political division, etc.) of the holonym that follows the word "neighbo(u)rhoods" in the -- category name. 
It does *NOT* depend on whether the {{place}} call uses "neighborhoods" or "neighbourhoods". -- (In general it can't, because other things like "urban areas", "อำเภอ", "subdivisions" and the like also -- categorize as neighbo(u)rhoods.) link = true, -- See below. These are used by category handlers in [[Module:category tree/topic cat/data/Places]]. generic_before_non_cities = "ใน", generic_before_cities = "ของ", -- The following text is suitable for the top-level description of a neighborhood as well as categories of the -- form `Neighborhoods in POLDIV` e.g. `Neighborhoods in Illinois, USA` but not for categories of the form -- `Neighborhoods of Chicago`, where we'd get "... and other subportions of [[city|cities]] of [[Chicago]]". category_link = "[[neighborhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighborhood]]s, [[district]]s and other subportions", -- NOTE: This setting is needed for administrative divisions like barangays that fall back to `neighborhood`, -- when set in [[Module:place/locations]] for a specific country (e.g. the Philippines). The above settings -- for `generic_before_non_cities` and `generic_before_cities` are used by category handlers in -- [[Module:category tree/topic cat/data/Places]] for `Neighborhoods in POLDIV` and `Neighborhoods of CITY` -- categories. In fact, district_neighborhood_cat_handler() does not currently pay attention to them, but -- generates "ของ" before cities and "ใน" before non-cities regardless. (FIXME: We should change that.) 
preposition = "ของ", class = "non-admin settlement", cat_handler = district_neighborhood_cat_handler, }, ["neighbourhood"] = { link = true, category_link = "[[neighbourhood]]s, [[district]]s and other subportions of [[city|cities]]", category_link_before_city = "[[neighbourhood]]s, [[district]]s and other subportions", fallback = "neighborhood", }, ["new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", preposition = "ใน", class = "subpolity", --? }, ["new town"] = { link = true, fallback = "เมือง", }, ["เมืองหลวงที่ไม่ใช่นคร"] = { link = "[[เมืองหลวง]]", entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", cat_handler = function(data) return capital_city_cat_handler(data, "non-city") end, -- FIXME, do we need the following? default = {true}, }, ["non-metropolitan county"] = { link = "w", fallback = "เทศมณฑล", }, ["non-metropolitan district"] = { link = "w", fallback = "local government district", }, ["non-sovereign kingdom"] = { -- especially in Africa and Asia link = "+w:non-sovereign monarchy", generic_before_non_cities = "ใน", class = "subpolity", ["country/*"] = {true}, ["continent/*"] = {true}, default = {true}, }, ["non-sovereign monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["oblast"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["oblasts and autonomous republics!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Ukraine. 
category_link = "[[oblast]]s and [[w:autonomous republic|autonomous republic]]s", class = "subpolity", }, ["มหาสมุทร"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"ทะเล", "bodies of water"}, default = {true}, }, ["okrug"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["overseas collectivity"] = { link = "w", fallback = "collectivity", }, ["overseas department"] = { link = "w", fallback = "department", }, ["overseas territory"] = { link = "w", fallback = "dependent territory", }, ["parish"] = { link = true, preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["parish municipality"] = { -- in Quebec, often similar to a rural village; the famous [[Saint-Louis-du-Ha! Ha!]] is one of them. link = "+w:parish municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, }, ["parish seat"] = { link = true, entry_placetype_use_the = true, preposition = "ของ", class = "capital", has_neighborhoods = true, }, ["park"] = { link = true, class = "man-made structure", default = {true}, }, ["pass"] = { link = "+mountain pass", -- The default plural algorithm gets this right but the singularization algorithm incorrectly converts -- passes -> passe, so put an entry here to ensure we singularize correctly. plural = "passes", fallback = "mountain pass", }, ["path"] = { link = true, fallback = "road", }, ["peak"] = { link = true, fallback = "ภูเขา", }, ["peninsula"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, }, ["periphery"] = { link = true, preposition = "ของ", class = "subpolity", }, ["สถานที่!"] = { generic_before_non_cities = "ใน", generic_before_cities = "ใน", class = "generic place", category_link = "[[place]]s of all sorts", -- `category_link_top_level` controls the description used in the top-level [[Category:Places]] and -- language-specific variants such as [[Category:en:Places]]. 
The actual text for a language-specific variant is -- "{{{langname}}} names of [[geographical]] [[place]]s of all sorts; [[toponym]]s." where the "names of" -- portion is automatically generated by the appropriate handler in -- [[Module:category tree/topic cat/data/Places]]. category_link_top_level = "[[geographical]] [[place]]s of all sorts; [[toponym]]s", bare_category_parent = "ชื่อ (หัวข้อ)", }, ["planned community"] = { -- Include this so we don't categorize 'planned community' into villages, as 'community' does. link = true, class = "settlement", has_neighborhoods = true, }, ["plateau"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true}, -- FIXME: Should generate both "Plateaus" and the appropriate 'geographic and cultural area' category }, ["Polish colony"] = { link = "[[w:colony (Poland)|colony]]", affix_type = "suf", affix = "colony", fallback = "village", has_neighborhoods = true, }, ["political divisions!"] = { category_link = "[[political]] [[division]]s and [[subdivision]]s, such as [[state]]s, [[province]]s, [[county|counties]] or [[district]]s", bare_category_parent = "สถานที่", }, ["องค์การทางการเมือง"] = { link = true, category_link = "[[independent]] or [[semi-]][[independent]] [[polity|polities]]", class = "polity", -- do not translate `class` bare_category_parent = "สถานที่", default = {true}, }, ["populated place"] = { link = "+w:populated place", -- not necessarily true, but usually is the case fallback = "village", }, ["port"] = { link = true, class = "man-made structure", default = {true}, }, ["port city"] = { -- FIXME: should categorize into "Ports" as well as "นคร" link = true, fallback = "นคร", }, ["port town"] = { -- FIXME: should categorize into "Ports" as well as "เมือง" link = "w", fallback = "เมือง", }, ["prefecture"] = { -- FIXME! `prefecture` is like a county in Japan and elsewhere but a department capital city in France. -- May need `has_neighborhoods` to be a function. 
link = true, preposition = "ของ", display_handler = prefecture_display_handler, class = "subpolity", }, ["prefecture-level city"] = { -- China; they are huge entities with a central city; not cities themselves. link = "w", preposition = "ของ", class = "subpolity", }, ["preserved county"] = { -- In Wales; they are former counties enshrined in law; there are 8 of them and each consists of one or more -- "principal areas" (styled as "เทศมณฑล" or "county boroughs"), of which there are 22. link = "w", preposition = "ของ", class = "subpolity", inherently_former = {"FORMER"}, }, ["primary area"] = { -- a grouping of "อำเภอ" (neighborhoods) in Gothenburg, Sweden link = "+w:sv:primärområde", fallback = "neighborhood", }, ["principality"] = { link = true, fallback = "monarchy", }, ["promontory"] = { link = true, fallback = "headland", }, ["protectorate"] = { link = true, fallback = "dependent territory", }, ["จังหวัด"] = { link = true, preposition = "ของ", display_handler = province_display_handler, class = "subpolity", }, ["provinces and autonomous regions!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case China. category_link = "[[province]]s and [[autonomous region]]s", class = "subpolity", }, ["provinces and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Canada and Pakistan. category_link = "[[province]]s and [[territory|territories]]", class = "subpolity", }, ["provincial capital"] = { link = true, fallback = "เมืองหลวง", }, ["raion"] = { link = true, preposition = "ของ", affix_type = "Suf", class = "subpolity", }, ["ranch"] = { link = true, fallback = "farm", }, ["range"] = { -- FIXME: Where is this used? Is it a mountain range? 
link = true, holonym_use_the = true, class = "natural feature", }, ["regency"] = { link = true, preposition = "ของ", class = "subpolity", }, ["ภูมิภาค"] = { link = true, preposition = "ของ", -- If 'region' isn't a specific administrative division, fall back to 'geographic and cultural area' fallback = "geographic and cultural area", -- "former region" is a subpolity but traditional/historic(al)/ancient/medieval/etc. is a geographic region class = "geographic region", }, ["regional capital"] = { link = "separately", fallback = "เมืองหลวง", }, ["regional county municipality"] = { -- Quebec link = "w", preposition = "ของ", affix_type = "Suf", no_affix_strings = {"เทศบาล", "เทศมณฑล"}, fallback = "เทศบาล", }, ["regional district"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "อำเภอ", fallback = "อำเภอ", }, ["regional municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", }, ["regional unit"] = { link = "w", preposition = "ของ", affix_type = "suf", class = "subpolity", }, ["registration county"] = { -- Used in Scotland for land registration purposes; formerly used in England, Wales and Ireland for statistical -- purposes (registration of births, deaths and marriages, and for the output of census information). link = "w", fallback = "เทศมณฑล", }, ["republic"] = { -- Of Russia, Yugoslavia, etc. "Republics" in general are sovereign but we use "ประเทศ" in that case. link = true, fallback = "constituent republic", }, ["research base"] = { link = "+w:research station", fallback = "research station", }, ["research station"] = { link = "w", class = "non-admin settlement", -- or "man-made structure"? 
default = {true}, }, ["reservoir"] = { link = true, fallback = "ทะเลสาบ", }, ["residential area"] = { link = "separately", fallback = "neighborhood", }, ["resort city"] = { link = "w", fallback = "นคร", }, ["resort town"] = { link = "w", fallback = "เมือง", }, ["แม่น้ำ"] = { link = true, generic_before_non_cities = "ใน", holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, cat_handler = city_type_cat_handler, ["continent/*"] = {true}, default = {true}, }, ["river island"] = { link = "w", fallback = "เกาะ", }, ["road"] = { link = true, class = "man-made structure", default = {"Named roads"}, }, ["Roman province"] = { -- FIXME! Eliminate this in favor of 'former province|emp/Roman Empire' link = "w", default = {"Provinces of the Roman Empire"}, class = "subpolity", }, ["royal borough"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = {"royal", "borough"}, fallback = "local government district with borough status", has_neighborhoods = true, }, ["royal burgh"] = { link = true, fallback = "borough", }, ["royal capital"] = { link = "w", fallback = "เมืองหลวง", }, ["rural committee"] = { -- Hong Kong; a group of villages link = "w", affix_type = "Suf", has_neighborhoods = true, class = "settlement", }, ["rural community"] = { -- New Brunswick link = "+w:list of municipalities in New_Brunswick#Rural communities", fallback = "เทศบาล", }, ["rural hromada"] = { link = "[[rural]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["rural municipality"] = { link = "w", preposition = "ของ", affix_type = "Pref", no_affix_strings = "เทศบาล", fallback = "เทศบาล", has_neighborhoods = true, --? 
}, ["rural township"] = { -- Taiwan link = "+w:rural township (Taiwan)", fallback = "township", }, ["sanctuary"] = { link = true, fallback = "temple", }, ["satrapy"] = { link = true, preposition = "ของ", class = "subpolity", inherently_former = {"ANCIENT", "FORMER"}, }, ["ทะเล"] = { link = true, holonym_use_the = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["seaport"] = { link = true, fallback = "port", }, ["seat"] = { link = true, fallback = "administrative centre", }, ["self-administered area"] = { -- Myanmar (groups self-administered divisions and zones) link = "+w:self-administered zone", preposition = "ของ", class = "subpolity", }, ["self-administered division"] = { -- Myanmar (only one of them: Wa Self-Administered Division) link = "w", fallback = "self-administered area", }, ["self-administered zone"] = { -- Myanmar (five of them) link = "w", fallback = "self-administered area", }, ["separatist state"] = { link = "separately", fallback = "unrecognized country", }, ["การตั้งถิ่นฐาน"] = { link = true, category_link = "[[settlement]]s such as [[city|cities]], [[village]]s and [[farm]]s", bare_category_parent = "สถานที่", -- not necessarily true, but usually is the case fallback = "village", }, ["settlement hromada"] = { link = "[[w:Populated places in Ukraine#Rural settlements|การตั้งถิ่นฐาน]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["sheading"] = { -- Isle of Man link = true, fallback = "อำเภอ", }, ["sheep station"] = { -- Australia link = true, fallback = "farm", }, ["shire"] = { link = true, fallback = "เทศมณฑล", }, ["shire county"] = { link = "w", fallback = "เทศมณฑล", }, ["shire town"] = { link = true, fallback = "county seat", }, ["ski resort city"] = { link = "[[ski resort]] [[city]]", fallback = "นคร", }, ["ski resort town"] = { link = "[[ski resort]] [[town]]", fallback = "เมือง", }, ["spa city"] = { link = "+w:spa town", fallback = "นคร", }, ["spa town"] = { link = 
"w", fallback = "เมือง", }, ["space station"] = { link = true, fallback = "research station", }, ["special administrative region"] = { -- in China; in practice they are city-like (Hong Kong, Macau); also [[Oecusse]] in East Timor is formally a -- "special administrative region"; North Korea had one such region planned (Sinuiju) but abandoned; Indonesia -- has similar "special regions" of Jakarta, Yogyakarta and Aceh; and South Sudan has three "special -- administrative areas" link = "+w:special administrative regions of China", preposition = "ของ", class = "subpolity", has_neighborhoods = true, --? -- no suffix since สถานที่ในHong Kong or Macau are listed without China, except Hong Kong and Macau themselves -- they also contain regions (or areas), e.g. [[Kowloon]], so it would be confusing suffix = "", }, ["special collectivity"] = { link = "w", fallback = "collectivity", }, ["special municipality"] = { -- formerly linked to the Taiwan article but there are also special municipalities of the Netherlands link = "w", fallback = "เทศบาล", }, ["special ward"] = { -- Tokyo link = true, fallback = "เทศบาล", }, ["spit"] = { link = true, fallback = "peninsula", }, ["spring"] = { link = true, class = "natural feature", default = {true}, }, ["star"] = { link = true, class = "natural feature", default = {true}, }, ["รัฐ"] = { link = true, preposition = "ของ", class = "subpolity", -- 'former/historical state' could refer either to a state of a country (a division) or a state = sovereign -- entity. The latter appears more common (e.g. in various "ancient states" of East Asia). former_type = "องค์การทางการเมือง", }, ["states and territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case Australia. 
category_link = "[[state]]s and [[territory|territories]]", class = "subpolity", }, ["states and union territories!"] = { -- This and other similar "combined placetypes" are for use in the plural when grouping first-level -- administrative regions of certain countries, in this case India. category_link = "[[state]]s and [[union territory|union territories]]", class = "subpolity", }, ["state capital"] = { link = true, fallback = "เมืองหลวง", }, ["state park"] = { link = true, fallback = "park", }, ["state-level new area"] = { -- China (type of economic development zone, varying greatly in size) link = "w", fallback = "new area", }, ["statistical region"] = { -- Slovenia link = true, fallback = "administrative region", }, ["statutory city"] = { link = "w", fallback = "นคร", }, ["statutory town"] = { link = "w", fallback = "เมือง", }, ["strait"] = { link = true, class = "natural feature", addl_bare_category_parents = {"bodies of water"}, default = {true}, }, ["stream"] = { link = true, fallback = "แม่น้ำ", }, ["street"] = { link = true, fallback = "road", }, ["strip"] = { link = true, fallback = "geographic region", }, ["strip of land"] = { link = "[[strip]] of [[land]]", plural = "strips of land", plural_link = "[[strip]]s of [[land]]", fallback = "geographic region", }, ["sub-metropolitan city"] = { link = "+w:List of cities in Nepal#Sub-metropolitan cities", fallback = "นคร", }, ["sub-prefectural city"] = { link = "w", fallback = "subprovincial city", }, ["ตำบล"] = { link = true, preposition = "ของ", has_neighborhoods = true, --? 
-- FIXME: subdistricts can be neighborhood-like (of Jakarta) or larger (in China); need a handler class = "subpolity", default = {true}, }, ["subdivision"] = { link = true, preposition = "ของ", affix_type = "suf", -- FIXME: subdivisions can be neighborhood-like or larger; need a handler class = "subpolity", cat_handler = district_neighborhood_cat_handler, }, ["submerged ghost town"] = { -- FIXME: Consider just having "submerged" as a qualifier. link = "[[submerged]] [[ghost town]]", fallback = "ghost town", }, ["subnational kingdom"] = { link = "+w:subnational monarchy", fallback = "non-sovereign kingdom", }, ["subnational monarchy"] = { link = "w", fallback = "non-sovereign kingdom", }, ["subprefecture"] = { link = true, affix_type = "suf", preposition = "ของ", class = "subpolity", }, ["subprovince"] = { link = true, preposition = "ของ", class = "subpolity", }, ["subprovincial city"] = { link = "w", -- China; special status given to certain prefecture-level cities fallback = "prefecture-level city", }, ["subprovincial district"] = { link = "w", -- China; special status given to Binhai New Area and Pudong New Area, which are county-level districts preposition = "ของ", class = "subpolity", }, ["subregion"] = { link = true, fallback = "geographic region", }, ["suburb"] = { link = true, -- The following text is suitable for the top-level description of a suburb as well as categories of the form -- 'Suburbs in POLDIV' e.g. 'Suburbs in Illinois, USA' but not for categories of the form 'Suburbs of Chicago', -- where we'd get "[[suburb]]s of [[city|cities]] of [[Chicago]]". category_link = "[[suburb]]s of [[city|cities]]", category_link_before_city = "[[suburb]]s", -- See comments under "neighborhood" for the following three settings. 
They are used by -- [[Module:category tree/topic cat/data/Places]] for generating the text of 'Suburbs in/of PLACE' categories -- but currently ignored by district_neighborhood_cat_handler (which actually generates the categories for a -- given page), which hardcodes "ใน" for non-cities and "ของ" for cities. (FIXME: Change this.) generic_before_non_cities = "ใน", generic_before_cities = "ของ", preposition = "ของ", has_neighborhoods = true, --? class = "non-admin settlement", --? cat_handler = district_neighborhood_cat_handler, }, ["suburban area"] = { link = "w", fallback = "suburb", }, ["subway station"] = { link = "w", fallback = "metro station", }, ["sum"] = { -- In China, Mongolia, Russia; something like a county in Mongolia but a township in China (Inner Mongolia), -- and equivalent to a [[selsoviet]] in the parts of Russia where it's in use (a rural council, below a raion). link = "+w:sum (administrative division)", -- This fallback is somewhat arbitrary. We could use "เทศมณฑล" but that has a display handler -- which we don't want to be active (FIXME: If the display handler would be active, that's a bug). 
fallback = "division", }, ["supercontinent"] = { link = true, fallback = "continent", }, ["tehsil"] = { link = true, affix_type = "suf", no_affix_strings = {"tehsil", "tahsil"}, class = "subpolity", }, ["temple"] = { link = true, fallback = "building", }, ["territorial authority"] = { link = "w", fallback = "อำเภอ", }, ["ดินแดน"] = { link = true, preposition = "ของ", class = "subpolity", }, ["theme"] = { link = "+w:theme (Byzantine district)", preposition = "ของ", class = "subpolity", }, ["เมือง"] = { link = true, generic_before_non_cities = "ใน", has_neighborhoods = true, class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["town with bystatus"] = { -- can't use templates in links currently link = "[[town]] with [[bystatus#Norwegian Bokmål|bystatus]]", plural = "towns with bystatus", plural_link = "[[town]]s with [[bystatus#Norwegian Bokmål|bystatus]]", fallback = "เมือง", }, ["township"] = { link = true, has_neighborhoods = true, class = "settlement", --? default = {true}, }, ["township municipality"] = { -- Quebec link = "+w:township municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["traditional county"] = { link = true, fallback = "เทศมณฑล", }, ["traditional region"] = { -- FIXME: Verify this works. Same for 'historic(al) region'. -- provided only for the link link = "w", fallback = "FORMER geographic region", }, ["trail"] = { link = true, fallback = "road", }, ["treaty port"] = { link = "w", fallback = "นคร", class = "settlement", inherently_former = {"FORMER"}, }, ["tributary"] = { link = true, preposition = "ของ", fallback = "แม่น้ำ", }, ["underground station"] = { link = "w", fallback = "metro station", }, ["unincorporated area"] = { link = "w", -- I don't know if this fallback makes sense everywhere. 
fallback = "unincorporated community", }, ["unincorporated community"] = { link = true, generic_before_non_cities = "ใน", class = "non-admin settlement", }, ["unincorporated territory"] = { link = "w", fallback = "ดินแดน", }, ["union territory"] = { -- India link = true, preposition = "ของ", entry_placetype_indefinite_article = "a", class = "subpolity", }, ["unitary authority"] = { -- UK, New Zealand link = true, entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["unitary district"] = { link = "w", entry_placetype_indefinite_article = "a", fallback = "local government district", }, ["united township municipality"] = { -- Quebec link = "+w:united township municipality (Quebec)", entry_placetype_indefinite_article = "a", fallback = "township municipality", has_neighborhoods = true, --? }, ["university"] = { link = true, entry_placetype_indefinite_article = "a", class = "man-made structure", default = {true}, }, ["unrecognised country"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized and nearly unrecognized countries!"] = { category_link = "[[de facto]] [[independent]] [[state]]s with little or no {{w|international recognition}}", bare_category_parent = "country-like entities", }, ["unrecognized country"] = { link = "w", class = "polity", --ห้ามแปล class default = {"Unrecognized and nearly unrecognized countries"}, }, ["unrecognised state"] = { link = "w", fallback = "unrecognized country", }, ["unrecognized state"] = { link = "w", fallback = "unrecognized country", }, ["urban area"] = { link = "separately", fallback = "neighborhood", }, ["urban hromada"] = { link = "[[urban]] [[w:hromada|hromada]]", affix_type = "suf", fallback = "hromada", }, ["urban service area"] = { -- A strange beast existing in Alberta; technically a type of hamlet but in practice used for much larger -- cities and treated equivalent to a city. (There are only two of them, [[Fort McMurray]] and [[Sherwood Park]]). 
link = "w", fallback = "นคร", }, ["urban township"] = { link = "w", fallback = "township", }, ["urban-type settlement"] = { -- appears to be a particular type of small urban settlement in post-Soviet states, -- had an administrative function. link = "w", fallback = "เมือง", }, ["valley"] = { link = true, class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน", "water"}, default = {true}, }, ["viceroyalty"] = { -- in essence, a type of colony link = true, fallback = "dependent territory", }, ["village"] = { link = true, generic_before_non_cities = "ใน", category_link = "[[village]]s, [[hamlet]]s, and other small [[community|communities]] and [[settlement]]s", class = "settlement", cat_handler = city_type_cat_handler, default = {true}, }, ["village development committee"] = { -- former administrative structure in Nepal; also exists in India but not as a formal unit link = "+w:village development committee (Nepal)", inherently_former = {"FORMER"}, fallback = "village", }, ["village municipality"] = { -- Quebec link = "+w:village municipality (Quebec)", preposition = "ของ", fallback = "เทศบาล", has_neighborhoods = true, --? }, ["voivodeship"] = { -- Poland link = true, display_handler = voivodeship_display_handler, preposition = "ของ", class = "subpolity", }, ["volcano"] = { link = true, plural = "volcanoes", class = "natural feature", addl_bare_category_parents = {"ธรณีสัณฐาน"}, default = {true, "ภูเขา"}, }, ["ward"] = { link = true, class = "settlement", -- Wards are formal administrative divisions of a city but have some properties of neighborhoods. 
fallback = "neighborhood", }, ["watercourse"] = { link = true, fallback = "channel", }, ["Welsh community"] = { -- Wales link = "[[w:community (Wales)|community]]", preposition = "ของ", affix_type = "suf", affix = "community", has_neighborhoods = true, class = "settlement", }, ["zone"] = { -- administrative division of Ethiopia, Qatar, Nepal, India link = "+w:zone#Place names", preposition = "ของ", class = "subpolity", }, ---------------------------------------------------------------------------------------------- -- Categories for former places -- ---------------------------------------------------------------------------------------------- ["ANCIENT capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", -- FIXME: Consider removing 'ancient settlements' here. Ancient capitals, like former capitals, often still -- exist but just aren't the capital any more. Maybe we should have an 'Ancient capitals' category. default = {"Ancient settlements", "Former capitals"}, }, ["ANCIENT non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "ANCIENT settlement", }, ["ANCIENT settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Ancient settlements"}, }, ["ancient settlements!"] = { category_link = "former [[city|cities]], [[town]]s and [[village]]s that existed in [[antiquity]]", bare_category_parent = "former settlements", }, ["FORMER capital"] = { link = false, entry_placetype_use_the = true, preposition = "ของ", has_neighborhoods = true, class = "capital", default = {"Former capitals"}, }, ["former capitals!"] = { category_link = "former [[capital]] [[city|cities]] and [[town]]s", bare_category_parent = "การตั้งถิ่นฐาน", }, ["former counties and county-level cities!"] = { -- For categorizing former counties and county-level cities of China category_link = "no-longer existing [[county|counties]] and [[county-level city|county-level 
cities]]", bare_category_breadcrumb = "counties and county-level cities", bare_category_parent = "former political divisions", }, ["FORMER county"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["FORMER county-level city"] = { -- For categorizing former counties and county-level cities of China link = false, fallback = "FORMER subpolity", }, ["former countries and country-like entities!"] = { category_link = "[[country|countries]] and similar [[polity|polities]] that no longer exist", bare_category_breadcrumb = "countries and country-like entities", bare_category_parent = "former polities", }, ["FORMER country"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former countries and country-like entities"}, }, ["former dependent territories!"] = { category_link = "[[w:dependent territory|dependent territories]] (colonies, dependencies, protectorates, etc.) that no longer exist", bare_category_breadcrumb = "dependent territories", bare_category_parent = "former political divisions", }, ["FORMER dependent territory"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former dependent territories"}, }, ["former districts!"] = { -- For categorizing former districts of China category_link = "no-longer-existing [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "former political divisions", }, ["FORMER district"] = { -- For categorizing former districts of China link = false, fallback = "FORMER subpolity", }, ["FORMER geographic region"] = { link = false, fallback = "geographic and cultural area", }, ["FORMER man-made structure"] = { link = false, class = "man-made structure", default = {"Former man-made structures"}, }, ["former man-made structures!"] = { category_link = "man-made structures such as [[airport]]s and [[park]]s that no longer exist", bare_category_breadcrumb = "man-made structures", bare_category_parent = "former places", }, 
["former municipalities!"] = { -- For categorizing former municipalities of the Netherlands category_link = "no-longer-existing [[municipality|municipalities]]", bare_category_breadcrumb = "เทศบาล", bare_category_parent = "former political divisions", }, ["FORMER municipality"] = { -- For categorizing former municipalities of the Netherlands link = false, fallback = "FORMER subpolity", }, ["FORMER natural feature"] = { link = false, class = "natural feature", default = {"Former natural features"}, }, ["former natural features!"] = { category_link = "natural features such as [[lake]]s, [[river]]s and [[island]]s that no longer exist", bare_category_breadcrumb = "natural features", bare_category_parent = "former places", }, ["FORMER non-admin settlement"] = { link = false, class = "non-admin settlement", fallback = "FORMER settlement", }, ["former places!"] = { category_link = "[[place]]s of all sorts that no longer exist", bare_category_breadcrumb = "former", bare_category_parent = "สถานที่", }, ["former political divisions!"] = { category_link = "[[political]] [[division]]s (states, provinces, counties, etc.) that no longer exist", bare_category_breadcrumb = "political divisions", bare_category_parent = "former places", }, ["former polities!"] = { category_link = "[[polity|polities]] (countries, kingdoms, empires, etc.) that no longer exist", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former places", }, ["FORMER polity"] = { link = false, class = "polity", --ห้ามแปล class default = {"Former polities"}, }, ["former prefectures!"] = { -- For categorizing former prefectures of China category_link = "no-longer-existing [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "former political divisions", }, ["FORMER prefecture"] = { -- For categorizing former prefectures of China link = false, fallback = "FORMER subpolity", }, ["former provinces!"] = { -- For categorizing former provinces of China, etc. 
category_link = "no-longer-existing [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "former political divisions", }, ["FORMER province"] = { -- For categorizing ancient/historical/former provinces of the Roman Empire link = false, fallback = "FORMER subpolity", }, ["former region"] = { -- A former region is considered a former political division, but not a 'historical/traditional/etc.' region. link = "separately", preposition = "ของ", inherently_former = {"FORMER"}, class = "subpolity", }, ["FORMER settlement"] = { link = false, has_neighborhoods = true, class = "settlement", default = {"Former settlements"}, }, ["former settlements!"] = { category_link = "[[city|cities]], [[town]]s and [[village]]s that no longer exist or have been merged or reclassified", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former political divisions", }, ["FORMER subpolity"] = { link = false, preposition = "ของ", class = "subpolity", default = {"Former political divisions"}, }, ---------------------------------------------------------------------------------------------- -- form-of categories -- ---------------------------------------------------------------------------------------------- ---------- Abbreviations ---------- ["abbreviations of counties!"] = { -- For categorizing abbreviations of counties of e.g. England full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[county|counties]]", bare_category_breadcrumb = "เทศมณฑล", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of places", }, ["abbreviations of departments!"] = { -- For categorizing abbreviations of departments of e.g. 
France full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[department]]s", bare_category_breadcrumb = "departments", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of districts!"] = { -- For categorizing abbreviations of districts of e.g. ??? full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[district]]s", bare_category_breadcrumb = "อำเภอ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of divisions!"] = { -- For categorizing abbreviations of divisions of e.g. Bangladesh full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[division]]s", bare_category_breadcrumb = "divisions", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of former countries!"] = { full_category_link = "{{glossary|abbreviation}}s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "abbreviations of former places", }, ["abbreviations of former places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "abbreviations", bare_category_parent = "former places", addl_bare_category_parents = {{name = "abbreviations of places", sort = "former"}}, }, ["abbreviations of places!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "abbreviations", bare_category_parent = "สถานที่", }, ["abbreviations of political divisions!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[political]] [[division]]s", bare_category_breadcrumb = "political divisions", bare_category_parent = "abbreviations of places", }, ["abbreviations of prefectures!"] = { -- For categorizing abbreviations of prefectures of e.g. 
Japan full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[prefecture]]s", bare_category_breadcrumb = "prefectures", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces!"] = { -- For categorizing abbreviations of provinces of e.g. Canada full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s", bare_category_breadcrumb = "จังหวัด", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of provinces and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[province]]s and [[territory|territories]]", bare_category_breadcrumb = "provinces and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of regions!"] = { -- For categorizing abbreviations of regions of e.g. Italy full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[administrative region]]s", bare_category_breadcrumb = "ภูมิภาค", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states!"] = { -- For categorizing abbreviations of states of e.g. 
the United States full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[territory|territories]]", bare_category_breadcrumb = "states and territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of states and union territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[state]]s and [[union territory|union territories]]", bare_category_breadcrumb = "states and union territories", bare_category_parent = "abbreviations of political divisions", }, ["abbreviations of territories!"] = { full_category_link = "{{glossary|abbreviation}}s of [[name]]s of [[territory|territories]]", bare_category_breadcrumb = "ดินแดน", bare_category_parent = "abbreviations of political divisions", }, ["ABBREVIATION_OF country"] = { link = false, default = {"Abbreviations of countries"}, }, ["ABBREVIATION_OF county"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF department"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF district"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF division"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF FORMER country"] = { link = false, default = {"Abbreviations of former countries"}, }, ["ABBREVIATION_OF FORMER place"] = { link = false, default = {"Abbreviations of former places"}, }, ["ABBREVIATION_OF place"] = { link = false, default = {"Abbreviations of places"}, }, ["ABBREVIATION_OF prefecture"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF province"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF region"] = { link = false, fallback = 
"ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF state"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF subpolity"] = { link = false, default = {"Abbreviations of political divisions"}, }, ["ABBREVIATION_OF territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ["ABBREVIATION_OF union territory"] = { link = false, fallback = "ABBREVIATION_OF subpolity", }, ---------- Archaic forms ---------- ["archaic forms of places!"] = { full_category_link = "{{glossary|archaic}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "archaic forms", bare_category_parent = "สถานที่", }, ["ARCHAIC_FORM_OF place"] = { link = false, default = {"Archaic forms of places"}, }, ---------- Clippings ---------- ["clippings of places!"] = { full_category_link = "{{glossary|clipping}}s of [[name]]s of [[place]]s", bare_category_breadcrumb = "clippings", bare_category_parent = "สถานที่", }, ["CLIPPING_OF place"] = { link = false, default = {"Clippings of places"}, }, ---------- Dated forms ---------- ["dated forms of places!"] = { full_category_link = "{{glossary|dated}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "dated forms", bare_category_parent = "สถานที่", }, ["DATED_FORM_OF place"] = { link = false, default = {"Dated forms of places"}, }, ---------- Derogatory names ---------- ["derogatory names for cities!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[city|cities]]", bare_category_breadcrumb = "นคร", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["derogatory names for continents!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for continents"}, }, ["derogatory names for countries!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for 
[[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for countries"}, }, ["derogatory names for places!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[place]]s", bare_category_breadcrumb = "derogatory names", bare_category_parent = "nicknames for places", }, ["derogatory names for states!"] = { full_category_link = "{{glossary|derogatory}} [[name]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "derogatory names for places", addl_bare_category_parents = {"nicknames for states"}, }, ["DEROGATORY_NAME_FOR capital"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR city"] = { link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR continent"] = { link = false, default = {"Derogatory names for continents"}, }, ["DEROGATORY_NAME_FOR country"] = { link = false, default = {"Derogatory names for countries"}, }, ["DEROGATORY_NAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR place"] = { link = false, default = {"Derogatory names for places"}, }, ["DEROGATORY_NAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Derogatory names for cities"}, }, ["DEROGATORY_NAME_FOR state"] = { link = false, default = {"Derogatory names for states"}, }, ["DEROGATORY_NAME_FOR town"] = { link = false, default = {"Derogatory names for cities"}, }, ---------- Ellipses ---------- ["ellipses of places!"] = { full_category_link = "{{glossary|ellipsis|ellipses}} of [[name]]s of [[place]]s", bare_category_breadcrumb = "ellipses", bare_category_parent = "สถานที่", }, ["ELLIPSIS_OF place"] = { link = false, default = 
{"Ellipses of places"}, }, ---------- Former long-form names ---------- ["former long-form names of countries!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former long-form names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "long-form"}}, }, ["former long-form names of places!"] = { full_category_link = "no-longer-[[use]]d [[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form", bare_category_parent = "former names of places", }, ["FORMER_LONG_FORM_OF country"] = { link = false, default = {"Former long-form names of countries"}, }, ["FORMER_LONG_FORM_OF place"] = { link = false, default = {"Former long-form names of places"}, }, ---------- Former names ---------- ["former names of capitals!"] = { full_category_link = "[[former]] [[name]]s of [[capital city|capital cities]] that generally still exist but under a different name", bare_category_breadcrumb = "เมืองหลวง", bare_category_parent = "former names of settlements", }, ["former names of countries!"] = { full_category_link = "[[former]] [[name]]s of [[country|countries]] that generally still exist but under a different name", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former names of places", }, ["former names of places!"] = { full_category_link = "[[former]] [[name]]s of [[place]]s that generally still exist but under a different name", bare_category_breadcrumb = "former names", bare_category_parent = "สถานที่", }, ["former names of political divisions!"] = { full_category_link = "[[former]] [[name]]s of [[political]] [[division]]s (states, provinces, counties, etc.) 
that generally still exist but under a different name", bare_category_breadcrumb = "political divisions", bare_category_parent = "former names of places", }, ["former names of polities!"] = { full_category_link = "[[former]] [[name]]s of [[polity|polities]] (e.g. [[country|countries]]) that generally still exist but under a different name", bare_category_breadcrumb = "องค์การทางการเมือง", bare_category_parent = "former names of places", }, ["former names of settlements!"] = { full_category_link = "[[former]] [[name]]s of [[city|cities]], [[town]]s, [[village]]s, etc. that generally still exist but under a different name", bare_category_breadcrumb = "การตั้งถิ่นฐาน", bare_category_parent = "former names of political divisions", }, ["FORMER_NAME_OF capital"] = { link = false, default = {"Former names of capitals"}, }, ["FORMER_NAME_OF country"] = { link = false, default = {"Former names of countries"}, }, ["FORMER_NAME_OF place"] = { link = false, default = {"Former names of places"}, }, ["FORMER_NAME_OF polity"] = { link = false, default = {"Former names of polities"}, }, ["FORMER_NAME_OF region"] = { link = false, fallback = "FORMER_NAME_OF subpolity", }, ["FORMER_NAME_OF settlement"] = { link = false, default = {"Former names of settlements"}, }, ["FORMER_NAME_OF subpolity"] = { link = false, default = {"Former names of political divisions"}, }, ---------- Former nicknames ---------- ["former nicknames for cities!"] = { full_category_link = "no-longer-used [[nickname]]s for [[city|cities]], e.g. 
the [[Eternal City]] for [[Kyoto]] during the {{w|Heian period}} ({{circa2|800–1100|short=yes}} {{AD}})", bare_category_breadcrumb = "นคร", bare_category_parent = "former nicknames for places", addl_bare_category_parents = {"nicknames for cities"}, }, ["former nicknames for places!"] = { full_category_link = "no-longer-used [[nickname]]s for [[place]]s", bare_category_breadcrumb = "former", bare_category_parent = "nicknames for places", addl_bare_category_parents = {{name = "former names of places", sort = "nicknames"}}, }, ["FORMER_NICKNAME_FOR capital"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR city"] = { link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR place"] = { link = false, default = {"Former nicknames for places"}, }, ["FORMER_NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Former nicknames for cities"}, }, ["FORMER_NICKNAME_FOR town"] = { link = false, default = {"Former nicknames for cities"}, }, ---------- Former official names ---------- ["former official names of countries!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "former official names of places", addl_bare_category_parents = {{name = "former names of countries", sort = "official"}}, }, ["former official names of places!"] = { full_category_link = "no-longer-[[use]]d [[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "former names of places", }, ["FORMER_OFFICIAL_NAME_OF country"] = { link = false, default = {"Former official names of 
countries"}, }, ["FORMER_OFFICIAL_NAME_OF place"] = { link = false, default = {"Former official names of places"}, }, ---------- Long-form names ---------- ["long-form names of countries!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "long-form names of places", }, ["long-form names of places!"] = { full_category_link = "[[long]]-[[form]] (but typically [[unofficial]]) [[name]]s of [[place]]s", bare_category_breadcrumb = "long-form names", bare_category_parent = "สถานที่", }, ["LONG_FORM_OF country"] = { link = false, default = {"Long-form names of countries"}, }, ["LONG_FORM_OF place"] = { link = false, default = {"Long-form names of places"}, }, ---------- Nicknames ---------- ["nicknames for cities!"] = { full_category_link = "[[nickname]]s for [[city|cities]], e.g. the [[Big Apple]] for [[New York City]]", bare_category_breadcrumb = "นคร", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"นคร"}, }, ["nicknames for continents!"] = { full_category_link = "[[nickname]]s for [[continent]]s", bare_category_breadcrumb = "ทวีป", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ทวีป"}, }, ["nicknames for countries!"] = { full_category_link = "[[nickname]]s for [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"ประเทศ"}, }, ["nicknames for places!"] = { full_category_link = "[[nickname]]s for [[place]]s", bare_category_breadcrumb = "สถานที่", bare_category_parent = "nicknames", addl_bare_category_parents = {"สถานที่"}, }, ["nicknames for states!"] = { -- For categorizing nicknames for states of e.g. 
the United States full_category_link = "[[nickname]]s for [[state]]s", bare_category_breadcrumb = "รัฐ", bare_category_parent = "nicknames for places", addl_bare_category_parents = {"รัฐ"}, }, ["NICKNAME_FOR capital"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR city"] = { link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR continent"] = { link = false, default = {"Nicknames for continents"}, }, ["NICKNAME_FOR country"] = { link = false, default = {"Nicknames for countries"}, }, ["NICKNAME_FOR metropolitan city"] = { -- "metropolitan city" doesn't fall back to "นคร" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR place"] = { link = false, default = {"Nicknames for places"}, }, ["NICKNAME_FOR prefecture-level city"] = { -- "prefecture-level city" doesn't fall back to "นคร" but things like "county-level city" and -- "subprovincial city" fall back to "prefecture-level city" link = false, default = {"Nicknames for cities"}, }, ["NICKNAME_FOR state"] = { link = false, default = {"Nicknames for states"}, }, ["NICKNAME_FOR town"] = { link = false, default = {"Nicknames for cities"}, }, ---------- Obsolete forms ---------- ["obsolete forms of places!"] = { full_category_link = "{{glossary|obsolete}} [[form]]s of [[name]]s of [[place]]s", bare_category_breadcrumb = "obsolete forms", bare_category_parent = "สถานที่", }, ["OBSOLETE_FORM_OF place"] = { link = false, default = {"Obsolete forms of places"}, }, ---------- Official names ---------- ["official names of countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of places", }, ["official names of former countries!"] = { full_category_link = "[[official]] [[name]]s of [[country|countries]] that no longer [[exist]]", bare_category_breadcrumb = "ประเทศ", bare_category_parent = "official names of former places", }, ["official names of former places!"] = { 
full_category_link = "[[official]] [[name]]s of [[place]]s that no longer [[exist]]", bare_category_breadcrumb = "official names", bare_category_parent = "former places", addl_bare_category_parents = {{name = "official names of places", sort = "former"}}, }, ["official names of places!"] = { full_category_link = "[[official]] [[name]]s of [[place]]s", bare_category_breadcrumb = "official names", bare_category_parent = "สถานที่", }, ["OFFICIAL_NAME_OF country"] = { link = false, default = {"Official names of countries"}, }, ["OFFICIAL_NAME_OF FORMER country"] = { link = false, default = {"Official names of former countries"}, }, ["OFFICIAL_NAME_OF FORMER place"] = { link = false, default = {"Official names of former places"}, }, ["OFFICIAL_NAME_OF place"] = { link = false, default = {"Official names of places"}, }, ---------- Official nicknames ---------- ["official nicknames for places!"] = { full_category_link = "[[official]] [[nickname]]s for [[place]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for places", }, ["official nicknames for states!"] = { -- For categorizing official nicknames for states of e.g. 
the United States full_category_link = "[[official]] [[nicknames]] for [[state]]s", bare_category_breadcrumb = "official", bare_category_parent = "nicknames for states", addl_bare_category_parents = {"รัฐ"}, }, ["OFFICIAL_NICKNAME_FOR place"] = { link = false, default = {"Official nicknames for places"}, }, ["OFFICIAL_NICKNAME_FOR state"] = { link = false, default = {"Official nicknames for states"}, }, } export.plural_placetype_to_singular = {} for sg_placetype, spec in pairs(export.placetype_data) do if spec.plural then export.plural_placetype_to_singular[spec.plural] = sg_placetype end end return export kzziwt1anaajgn8bqqnebvdl0xawlmt ᥟᥧᥲ 0 2300501 5720732 5713805 2026-04-21T05:30:31Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720732 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|shn|ဢူႈ}} === การออกเสียง === * {{IPA|tdd|/ʔuː˧˩/}} === คำนาม === {{tdd-verb}} # [[พ่อ]] #: {{syn|tdd|ᥙᥨᥝ}} mdabjkfmi6ti26qu1sgzxy6knh25v4c 5720735 5720732 2026-04-21T05:58:30Z Ai Ku Karng 17824 5720735 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === ร่วมเชื้อสายกับ{{cog|shn|ဢူႈ}} === การออกเสียง === * {{IPA|tdd|/ʔu˧˩/}} === คำนาม === {{tdd-verb}} # [[พ่อ]] #: {{syn|tdd|ᥙᥨᥝ}} 1x83efkxg1dx293z7xu4g0eruuc2lpn ᥙᥨᥝ 0 2300515 5720733 5652871 2026-04-21T05:39:40Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720733 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|tdd|tai-pro|*boːᴮ}}; ร่วมเชื้อสายกับ{{cog|th|พ่อ}}, {{cog|sou|ผอ}}, {{cog|tts|พ่อ}}, {{cog|lo|ພໍ່}}, {{cog|nod|ᨻᩬᩴ᩵}}, {{cog|kkh|ᨻᩳ᩵}}, {{cog|khb|ᦗᦸᧈ}}, {{cog|blt|ꪝꪷ꪿}}, {{cog|shn|ပေႃႈ}}, {{cog|aho|𑜆𑜦𑜡}} หรือ {{m|aho|𑜆𑜨𑜦𑜡}} === การออกเสียง === * {{IPA|tdd|/po˧˧/}} === คำนาม === {{tdd-num}} # [[พ่อ]] #: {{syn|tdd|ᥟᥧᥲ}} appmftusqpetu3m0heqn3i7l9hzl7f8 ᥞᥝᥳ 0 2300563 5720728 5653055 2026-04-21T05:16:09Z Ai Ku Karng 17824 /* ภาษาไทใต้คง */ 5720728 wikitext text/x-wiki == ภาษาไทใต้คง == === การออกเสียง === * {{IPA|tdd|/haw˦˧/}} === คำอนุภาค === {{tdd-part}} # [[แล้ว]] #: {{syn|tdd|ᥕᥝᥳ}} 
irl4pgwjt7czdbqaaapepbg03kyvyed ᥖᥭᥰ 0 2300567 5720730 5653063 2026-04-21T05:21:40Z Ai Ku Karng 17824 5720730 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|th|tai-swe-pro|*dajᴬ²}}, จาก{{inh|th|tai-pro|*ɗwɤːjᴬ}}; ร่วมเชื้อสายกับ{{cog| tts|ไท}}, {{cog|nod|ᨴᩱ}}, {{cog|lo|ໄທ}}, {{cog| nyw|ไท}}, {{cog|khb|ᦺᦑ}}, {{cog|blt|ꪼꪕ}}, {{cog|shn|တႆး}}, {{cog|aio|တႝ}}, {{cog|phk|တႝ}}, {{cog|aho|𑜄𑜩}} === คำนาม === {{tdd-verb}} # [[ไท]] l2o88cul6cvmyjsx9fidifogyiceh5p มอดูล:languages/chars 828 2323855 5720750 5683829 2026-04-21T07:00:44Z OctraBot 3198 บอต: แทนที่ข้อความโดยอัตโนมัติ (-standardChars +standard_chars) 5720750 Scribunto text/plain local export = {} local table = table local insert = table.insert local u = require("Module:string/char") -- UTF-8 encoded strings for some commonly-used diacritics. local c = { prime = u(0x02B9), grave = u(0x0300), acute = u(0x0301), circ = u(0x0302), -- circumflex tilde = u(0x0303), macron = u(0x0304), overline = u(0x0305), breve = u(0x0306), dotabove = u(0x0307), diaer = u(0x0308), -- diaeresis ringabove = u(0x030A), dacute = u(0x030B), -- double acute caron = u(0x030C), lineabove = u(0x030D), dgrave = u(0x030F), -- double grave invbreve = u(0x0311), -- inverted breve turnedcommaabove = u(0x0312), commaabove = u(0x0313), revcommaabove = u(0x0314), -- reversed comma above dotbelow = u(0x0323), diaerbelow = u(0x0324), -- diaeresis below ringbelow = u(0x0325), cedilla = u(0x0327), ogonek = u(0x0328), caronbelow = u(0x032C), brevebelow = u(0x032E), macronbelow = u(0x0331), perispomeni = u(0x0342), ypogegrammeni = u(0x0345), CGJ = u(0x034F), -- combining grapheme joiner zigzag = u(0x035B), dbrevebelow = u(0x035C), -- double breve below dmacron = u(0x035E), -- double macron dtilde = u(0x0360), -- double tilde dinvbreve = u(0x0361), -- double inverted breve small_a = u(0x0363), small_e = u(0x0364), small_i = u(0x0365), small_o = u(0x0366), small_u = u(0x0367), keraia = u(0x0374), lowerkeraia = u(0x0375), tonos = u(0x0384), 
palatalization = u(0x0484), dasiapneumata = u(0x0485), psilipneumata = u(0x0486), kashida = u(0x0640), fathatan = u(0x064B), dammatan = u(0x064C), kasratan = u(0x064D), fatha = u(0x064E), damma = u(0x064F), kasra = u(0x0650), shadda = u(0x0651), sukun = u(0x0652), hamzaabove = u(0x0654), nunghunna = u(0x0658), zwarakay = u(0x0659), smallv = u(0x065A), superalef = u(0x0670), udatta = u(0x0951), anudatta = u(0x0952), tacute = u(0x1ACB), -- triple acute dottedgrave = u(0x1DC0), dottedacute = u(0x1DC1), coronis = u(0x1FBD), psili = u(0x1FBF), dasia = u(0x1FEF), ZWNJ = u(0x200C), -- zero width non-joiner ZWJ = u(0x200D), -- zero width joiner RSQuo = u(0x2019), -- right single quote kavyka = u(0xA67C), VS01 = u(0xFE00), -- variation selector 1 -- Punctuation for the standard_chars field. -- Note: characters are literal (i.e. no magic characters). punc = " ',-‐‑‒–—…∅", -- Range covering all diacritics. diacritics = u(0x300) .. "-" .. u(0x34E) .. u(0x350) .. "-" .. u(0x36F) .. u(0x1AB0) .. "-" .. u(0x1ACE) .. u(0x1DC0) .. "-" .. u(0x1DFF) .. u(0x20D0) .. "-" .. u(0x20F0) .. u(0xFE20) .. "-" .. u(0xFE2F), } -- Braille characters for the standard_chars field. local braille = {} for i = 0x2800, 0x28FF do insert(braille, u(i)) end c.braille = table.concat(braille) export.chars = c -- PUA characters, generally used in sortkeys. -- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables). local p = {} for i = 1, 32 do p[i] = u(0xF000+i-1) end export.puaChars = p local cs = {} -- Used for the default display_text and strip_diacritics for Grek, but parts also used directly by Albanian (sq). cs["Grek-displaytext"] = { from = {"Þ", "þ", c.turnedcommaabove, "['ʼ" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos: used as the numeral sign in entries. to = {"Ϸ", "ϸ", c.revcommaabove, c.RSQuo} } cs["Grek-stripdiacritics"] = { remove_diacritics = c.caron .. c.diaerbelow .. 
c.brevebelow, from = cs["Grek-displaytext"].from, to = {"Ϸ", "ϸ", c.revcommaabove, "'"} } -- Used in the default strip_diacritics and sort_key for Cyrs, but also used directly by Old Ruthenian (zle-ort). cs["Cyrs_remove_diacritics"] = c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. c.kavyka export.chars_substitutions = cs return export 8n2w5fgofa7b3yf0owwxcxi9c45tihv คุยกับผู้ใช้:Bikkola 3 2330421 5720674 2026-04-20T12:57:58Z New user message 2698 เพิ่ม[[Template:Welcome|สารต้อนรับ]]ในหน้าคุยของผู้ใช้ใหม่ 5720674 wikitext text/x-wiki {{Template:Welcome|realName=|name=Bikkola}} -- [[ผู้ใช้:New user message|New user message]] ([[คุยกับผู้ใช้:New user message|คุย]]) 19:57, 20 เมษายน 2569 (+07) ni1cz40uu55lcesh51gcphcsjzdlvhi คุยกับผู้ใช้:Octahedron80/อักษรไทธรรม 3 2330422 5720675 2026-04-20T13:47:20Z Ai Ku Karng 17824 /* รูปแบบการเขียน */ ส่วนใหม่ 5720675 wikitext text/x-wiki == รูปแบบการเขียน == รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] ([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07) 2pdjquj3px57b6ktrq2q85ko4oqp0sk 5720676 5720675 2026-04-20T15:24:03Z OctraBot 3198 /* รูปแบบการเขียน */ 5720676 wikitext text/x-wiki == รูปแบบการเขียน == รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] ([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07) [https://drive.google.com/open?id=10c5lzGMittfoU-BvIHv0XJLkgaEPhuyR&usp=drive_fs] [https://drive.google.com/open?id=1x29dkAuhsbeM0Anp_Ok9x4yu_CMBvKJD&usp=drive_fs] --[[ผู้ใช้:OctraBot|OctraBot]] ([[คุยกับผู้ใช้:OctraBot|คุย]]) 22:24, 20 เมษายน 2569 (+07) alxn1fewhgpsh2hy6km2wgf2ihzct0w 5720677 5720676 2026-04-20T15:39:36Z OctraBot 3198 /* รูปแบบการเขียน */ 5720677 wikitext text/x-wiki == รูปแบบการเขียน == รูปแบบการเขียนอักษรธรรมล้านนาภาษาไทลื้อ เช่นᨠᩬᩳ อิงจากอะไรครับ [[ผู้ใช้:Ai Ku Karng|Ai Ku Karng]] 
([[คุยกับผู้ใช้:Ai Ku Karng|คุย]]) 20:47, 20 เมษายน 2569 (+07) [https://drive.google.com/open?id=10c5lzGMittfoU-BvIHv0XJLkgaEPhuyR&usp=drive_fs] [https://drive.google.com/open?id=1x29dkAuhsbeM0Anp_Ok9x4yu_CMBvKJD&usp=drive_fs] [https://wrdingham.co.uk/lanna/renderer_test.htm] ทั้งหมดนี้เป็นการประมวลผลรวมมาแล้ว --[[ผู้ใช้:OctraBot|OctraBot]] ([[คุยกับผู้ใช้:OctraBot|คุย]]) 22:24, 20 เมษายน 2569 (+07) 71hzou5b1vdifddm9zxh5gvssx60jxo ciudades 0 2330423 5720707 2026-04-21T01:58:34Z OctraBot 3198 นำเข้าจาก enwikt เก็บกวาด 5720707 wikitext text/x-wiki == ภาษาสเปน == === การออกเสียง === {{es-pr}} === คำนาม === {{head|es|รูปนาม|g=f-p}} # {{noun form of|es|ciudad||p}} 0mcxg8xtweb4gjim192ke67p9b1cpuk rhinos 0 2330424 5720710 2026-04-21T02:08:10Z OctraBot 3198 นำเข้าจาก enwikt เก็บกวาด เรียงลำดับหัวเรื่องภาษา 5720710 wikitext text/x-wiki == ภาษาฝรั่งเศส == === คำนาม === {{head|fr|รูปนาม|g=m-p}} # {{plural of|fr|rhino}} == ภาษาอังกฤษ == === คำนาม === {{head|en|รูปนาม}} # {{plural of|en|rhino}} === คำสลับอักษร === * {{anagrams|en|a=hinors|Rishon|Hirson|orhnis|rishon}} 1fj28dhzbm8h8impx8ese120vigtpu9 capital loss 0 2330425 5720716 2026-04-21T02:17:03Z OctraBot 3198 นำเข้าจาก enwikt เก็บกวาด 5720716 wikitext text/x-wiki == ภาษาอังกฤษ == === คำนาม === {{en-noun|~}} # {{lb|en|economics|business|finance}} [[ขาดทุนประเภททุน]]; การลดลงของมูลค่าสินทรัพย์ประเภททุน; จำนวนที่มูลค่าหรือรายได้จากการขายสินทรัพย์ประเภททุนโดยเจ้าของ น้อยกว่าต้นทุนของเจ้าของ #: {{ant|en|capital gain}} 4mh2njrh8av5ds99094tgihcaqksyo2 kühl 0 2330426 5720731 2026-04-21T05:29:49Z Ponpan 693 สร้างหน้าด้วย "== ภาษาเยอรมัน == === รากศัพท์ === {{root|de|ine-pro|*gel-}} จาก{{inh|de|gmh|küele}}, {{inh|de|goh|kuoli}}, จาก{{inh|de|gmw-pro|*kōl(ī)}}, จาก{{inh|de|gem-pro|*kōluz}}, {{m|gem-pro|*kōlaz}}, จาก{{der|de|ine-pro|*gel-}}; ร่วมเชื้อสายกับ{{cog|nl|koel}}, {{cog|en|cool}}; {{doublet|de|cool}} === การออกเสียง === * {{IPA|de|/kyːl/}} *..." 
5720731 wikitext text/x-wiki == ภาษาเยอรมัน == === รากศัพท์ === {{root|de|ine-pro|*gel-}} จาก{{inh|de|gmh|küele}}, {{inh|de|goh|kuoli}}, จาก{{inh|de|gmw-pro|*kōl(ī)}}, จาก{{inh|de|gem-pro|*kōluz}}, {{m|gem-pro|*kōlaz}}, จาก{{der|de|ine-pro|*gel-}}; ร่วมเชื้อสายกับ{{cog|nl|koel}}, {{cog|en|cool}}; {{doublet|de|cool}} === การออกเสียง === * {{IPA|de|/kyːl/}} * {{audio|de|De-kühl.ogg}} * {{audio|de|De-kühl2.ogg|a=<<Germany>> (<<Berlin>>)}} === คำคุณศัพท์ === {{de-adj|comp}} # [[เย็น]] #: {{ant|de|heiß|kalt|lau|warm}} #: {{coi|de|etwas '''kühl''' lagern|เก็บบางสิ่งในที่เย็น}} #: {{uxi|de|Es ist '''kühler''' geworden.|อากาศเย็นลง}} #: {{uxi|de|Das Wasser ist angenehm '''kühl'''.|น้ำเย็นสบาย}} # [[สงบ]], [[เยือกเย็น]] # [[เย็นชา]], [[ไร้]][[อารมณ์]], [[ไม่]][[สนใจ]]; ไม่[[ตอบสนอง]]ทาง[[เพศ]] #: {{uxi|de|Warum bist du so '''kühl''' mir gegenüber.|ทำไมเธอถึงเย็นชากับฉันอย่างนี้}} ==== การผันรูป ==== {{de-adecl|comp}} ==== คำเกี่ยวข้อง ==== {{col|de|kühlen|Kühle|gekühlt|kühlgemäßigt|die kühle Schulter zeigen}} === อ่านเพิ่ม === * {{R:de:Duden}} * {{R:de:DWDS}} k1yus2vfje245rhtriviczuzgdhnfma หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษากรีกโบราณ 14 2330427 5720739 2026-04-21T06:28:44Z OctraBot 3198 สร้างหน้าด้วย "{{auto cat}}" 5720739 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษากรีกโบราณ 14 2330428 5720740 2026-04-21T06:28:51Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720740 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx มอดูล:Copt-sortkey 828 2330429 5720773 2026-04-21T07:09:19Z OctraBot 3198 สร้างหน้าด้วย "export = {} local match = mw.ustring.match local str_gsub = string.gsub local function ugsub(text, regex, replacement) local out = mw.ustring.gsub(text, regex, replacement) return out end local alphabet = "ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱϣϥⳉϧϩϫϭw" local vowels = "ⲁⲉⲏⲓⲟⲩⲱ" local vowel = "[" .. vowels .. "]" local consonants = ugsub(alphabet, vowel, "") local consonant = "[" .. consona..." 
5720773 Scribunto text/plain local export = {} local match = mw.ustring.match local str_gsub = string.gsub local function ugsub(text, regex, replacement) local out = mw.ustring.gsub(text, regex, replacement) return out end local alphabet = "ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱϣϥⳉϧϩϫϭw" local vowels = "ⲁⲉⲏⲓⲟⲩⲱ" local vowel = "[" .. vowels .. "]" local consonants = ugsub(alphabet, vowel, "") local consonant = "[" .. consonants .. "]" local replacements = { ["ⲟⲩ"] = "ⲩ", ["ⳤ"] = "ⲕⲉ", ["ⲉⲓ"] = "ⲓ", ["ϯ"] = "ⲧⲓ", ["-"] = "", ["⸗"] = "", ["ˋ"] = "", } local CopticToGreek = { ["ⲁ"] = "α", ["ⲃ"] = "β", ["ⲅ"] = "γ", ["ⲇ"] = "δ", ["ⲉ"] = "ε", ["ⲍ"] = "ζ", ["ⲏ"] = "η", ["ⲑ"] = "θ", ["ⲓ"] = "ι", ["ⲕ"] = "κ", ["ⲗ"] = "λ", ["ⲙ"] = "μ", ["ⲛ"] = "ν", ["ⲝ"] = "ξ", ["ⲟ"] = "ο", ["ⲡ"] = "π", ["ⲣ"] = "ρ", ["ⲥ"] = "σ", ["ⲧ"] = "τ", ["ⲩ"] = "υ", ["ⲫ"] = "φ", ["ⲭ"] = "χ", ["ⲯ"] = "ψ", ["ⲱ"] = "ω", } function export.makeSortKey(text, lang, sc) text = mw.ustring.lower(text) for letter, replacement in pairs(replacements) do text = str_gsub(text, letter, replacement) end local origText = text text = ugsub(text, "ⲩ(" .. vowel .. ")", "w%1") text = ugsub(text, "(" .. vowel .. ")ⲩ", "%1w") -- mw.log(origText, text) local sort = {} for word in mw.ustring.gmatch(text, "%S+") do -- Add initial vowel (if any). table.insert(sort, match(word, "^" .. vowel) ) -- Add consonants (in order). table.insert(sort, ugsub(word, vowel .. "+", "")) --[[ Add the number "1" if word ends in consonant. "1" sorts before Greek–Coptic and Coptic Unicode blocks. ]] if mw.ustring.match(word, consonant .. "$") then table.insert(sort, "1") elseif mw.ustring.match(word, vowel .. "$") then table.insert(sort, "2") end -- Get non-initial vowels (in order) by removing initial vowel and all consonants. table.insert(sort, ugsub(ugsub(word, "^" .. vowel, ""), consonant, "")) table.insert(sort, " ") end sort = table.concat(sort) sort = str_gsub(sort, "w", "ⲩ") --[[ Convert Greek-derived Coptic characters to Greek ones. 
Otherwise, the uniquely Coptic letters would sort first, because they were added to Unicode earlier. ϣϥⳉϧϩϫϭ ⲁⲃⲅⲇⲉⲍⲏⲑⲓⲕⲗⲙⲛⲝⲟⲡⲣⲥⲧⲩⲫⲭⲯⲱ ⇓ αβγδεζηθικλμνξοπρστυφχψω ϣϥⳉϧϩϫϭ ]] sort = str_gsub(sort, "[\194-\244][\128-\191]+", CopticToGreek) return mw.ustring.upper(sort) end local lang = require("Module:languages").getByCode("cop") local sc = require("Module:scripts").getByCode("Copt") local function tag(text) return require("Module:script utilities").tag_text(text, lang, sc) end function export.showSorting(frame) local terms = {} for i, term in ipairs(frame.args) do table.insert(terms, term) end local function comp(term1, term2) return export.makeSortKey(term1) < export.makeSortKey(term2) end table.sort(terms, comp) for i, term in pairs(terms) do terms[i] = "\n* " .. tag(term) .. " (<code>" .. export.makeSortKey(term) .. "</code>)" end return table.concat(terms) end return export 8o899v7yx7qyh5uiyv6wmfsrsqo8b0t มอดูล:wlm-sortkey 828 2330430 5720775 2026-04-21T07:13:56Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0300) .. "-" .. u(0x0302) .. u(0x0308) .. "'" -- grave, acute, circumflex, diaeresis, apostrophe local oneChar = { ["k"] = "c" } local twoChars = { ["ch"] = "c" .. a, ["dd"] = "d" .. a, ["ff"] = "f" .. a, ["ll"] = "l" .. a, ["ph"] = "p" .. a, ["rh"] = "r" .. a, ["th"] = "t" .. a } local threeChars = { ["ngh"] = "g" .. a } function export.makeSortKey..." 5720775 Scribunto text/plain local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0300) .. "-" .. u(0x0302) .. u(0x0308) .. "'" -- grave, acute, circumflex, diaeresis, apostrophe local oneChar = { ["k"] = "c" } local twoChars = { ["ch"] = "c" .. a, ["dd"] = "d" .. a, ["ff"] = "f" .. a, ["ll"] = "l" .. a, ["ph"] = "p" .. a, ["rh"] = "r" .. a, ["th"] = "t" .. a } local threeChars = { ["ngh"] = "g" .. 
a } function export.makeSortKey(text, lang, sc) text = mw.ustring.lower(text) for from, to in pairs(threeChars) do text = mw.ustring.gsub(text, from, to) end for from, to in pairs(twoChars) do text = mw.ustring.gsub(text, from, to) end return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export ofbrvuy006obg7gppomrydjtvu7udcs มอดูล:mdf-sortkey 828 2330431 5720777 2026-04-21T07:15:02Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local oneChar = { ["ё"] = "е" .. a } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export" 5720777 Scribunto text/plain local export = {} local u = mw.ustring.char local a = u(0xF000) local oneChar = { ["ё"] = "е" .. a } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export fpynrgnyrcig23pqmf0ixa0z2wahmle มอดูล:gmw-pro-sortkey 828 2330432 5720778 2026-04-21T07:15:59Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0304) .. u(0x0328) -- macron, ogonek local oneChar = { ["ʀ"] = "r" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export" 5720778 Scribunto text/plain local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0304) .. 
u(0x0328) -- macron, ogonek local oneChar = { ["ʀ"] = "r" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export 7lc3wpilcm5n7sf864u24efw71z1off มอดูล:bnt-pro-sortkey 828 2330433 5720779 2026-04-21T07:16:14Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a, b = u(0xF000), u(0xF001) local remove_diacritics = u(0x0300) .. u(0x0301) -- grave, acute local oneChar = { ["ɪ"] = "i" .. a, ["ì"] = "i" .. b, ["í"] = "i" .. b, ["ʊ"] = "u" .. a, ["ù"] = "u" .. b, ["ú"] = "u" .. b } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar))..." 5720779 Scribunto text/plain local export = {} local u = mw.ustring.char local a, b = u(0xF000), u(0xF001) local remove_diacritics = u(0x0300) .. u(0x0301) -- grave, acute local oneChar = { ["ɪ"] = "i" .. a, ["ì"] = "i" .. b, ["í"] = "i" .. b, ["ʊ"] = "u" .. a, ["ù"] = "u" .. b, ["ú"] = "u" .. b } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export 0xagsm5z1meiajymzdjlpul8rs3bk3m มอดูล:cel-pro-sortkey 828 2330434 5720780 2026-04-21T07:16:28Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0304) -- macron local oneChar = { ["ɸ"] = "f", ["φ"] = "f", ["ʷ"] = "w" } function export.makeSortKey(text, lang, sc) text = mw.ustring.gsub(mw.ustring.lower(text), "w", "w" .. 
a) -- ensure "w" comes after "ʷ" return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics..." 5720780 Scribunto text/plain local export = {} local u = mw.ustring.char local a = u(0xF000) local remove_diacritics = u(0x0304) -- macron local oneChar = { ["ɸ"] = "f", ["φ"] = "f", ["ʷ"] = "w" } function export.makeSortKey(text, lang, sc) text = mw.ustring.gsub(mw.ustring.lower(text), "w", "w" .. a) -- ensure "w" comes after "ʷ" return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(text, ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export s4tpyk1s0a6cln7elxc6pbgn54mljlx มอดูล:gem-pro-sortkey 828 2330435 5720781 2026-04-21T07:16:41Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0302) .. u(0x0304) -- circumflex, macron local oneChar = { ["ą"] = "an", ["į"] = "in", ["ǫ"] = "on", ["ų"] = "un" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. "]", ""))) -- decompose, remove appropriate diacriti..." 5720781 Scribunto text/plain local export = {} local u = mw.ustring.char local remove_diacritics = u(0x0302) .. u(0x0304) -- circumflex, macron local oneChar = { ["ą"] = "an", ["į"] = "in", ["ǫ"] = "on", ["ų"] = "un" } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.toNFC(mw.ustring.gsub(mw.ustring.toNFD(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)), "[" .. remove_diacritics .. 
"]", ""))) -- decompose, remove appropriate diacritics, then recompose again end return export 4tap68mjd5qt3mdukwabrpdbjban8wb มอดูล:sma-sortkey 828 2330436 5720782 2026-04-21T07:17:09Z OctraBot 3198 สร้างหน้าด้วย "local export = {} local u = mw.ustring.char local a, b, c = u(0xF000), u(0xF001), u(0xF002) local oneChar = { ["ï"] = "i" .. a, ["æ"] = "z" .. a, ["ä"] = "z" .. a, ["ø"] = "z" .. b, ["ö"] = "z" .. b, ["å"] = "z" .. c } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export" 5720782 Scribunto text/plain local export = {} local u = mw.ustring.char local a, b, c = u(0xF000), u(0xF001), u(0xF002) local oneChar = { ["ï"] = "i" .. a, ["æ"] = "z" .. a, ["ä"] = "z" .. a, ["ø"] = "z" .. b, ["ö"] = "z" .. b, ["å"] = "z" .. c } function export.makeSortKey(text, lang, sc) return mw.ustring.upper(mw.ustring.gsub(mw.ustring.lower(text), ".", oneChar)) end return export 9uw1xjoz1haiv6ju835pesrtd8tkhmd หมวดหมู่:ภาษาซามีใต้ 14 2330437 5720784 2026-04-21T07:17:56Z OctraBot 3198 สร้างหน้าด้วย "{{auto cat|นอร์เวย์|สวีเดน|setwiki=Southern Sami}}" 5720784 wikitext text/x-wiki {{auto cat|นอร์เวย์|สวีเดน|setwiki=Southern Sami}} bl7hv18m0mrs85oml6ckkhx0zhj4yym 5720785 5720784 2026-04-21T07:18:35Z OctraBot 3198 5720785 wikitext text/x-wiki {{auto cat|นอร์เวย์|สวีเดน}} ez99jpf91za3w9tokozpw89n58i5gqo หมวดหมู่:คำนามนับได้ภาษาจอร์เจีย 14 2330438 5720788 2026-04-21T07:31:43Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720788 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx ᥞᥧᥳ 0 2330439 5720789 2026-04-21T07:31:56Z Ai Ku Karng 17824 สร้างหน้าด้วย "=== รากศัพท์ === {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} === การออกเสียง === * {{..." 
5720789 wikitext text/x-wiki === รากศัพท์ === {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} === การออกเสียง === * {{IPA|tdd|/hu˦˧/}} === คำกริยา === {{tdd-verb}} # [[รู้]] 4ji16hf2h971ihfjdezxy64dyvfsqgc 5720791 5720789 2026-04-21T07:32:19Z Ai Ku Karng 17824 /* การออกเสียง */ 5720791 wikitext text/x-wiki === รากศัพท์ === {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} == การออกเสียง == * {{IPA|tdd|/hu˦˧/}} === คำกริยา === {{tdd-verb}} # [[รู้]] 3ub1pyud7la5fr1alhayz36grmg6b97 5720792 5720791 2026-04-21T07:32:30Z Ai Ku Karng 17824 5720792 wikitext text/x-wiki == รากศัพท์ == {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} == การออกเสียง == * {{IPA|tdd|/hu˦˧/}} === คำกริยา === {{tdd-verb}} # [[รู้]] okdm7i6fjjz9lp76axlhnn9gbuktr02 5720793 5720792 2026-04-21T07:32:43Z Ai Ku Karng 17824 5720793 wikitext text/x-wiki == รากศัพท์ == {{inh+|lo|tai-pro|*rɯːwꟲ}}; ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} === การออกเสียง === * {{IPA|tdd|/hu˦˧/}} === คำกริยา === {{tdd-verb}} # [[รู้]] 5iigbz935uuhh0x5f2irzpsjcsspk9z 5720794 5720793 2026-04-21T07:33:40Z Apisite 10648 5720794 wikitext text/x-wiki == ภาษาไทใต้คง == === รากศัพท์ === {{inh+|lo|tai-pro|*rɯːwꟲ}}; 
ร่วมเชื้อสายกับ{{cog|th|รู้}}, {{cog|nod|ᩁᩪ᩶}}, {{cog|tts|ฮู้}}, {{cog|shn|ႁူႉ}}, {{cog|khb|ᦣᦴᧉ}}, {{cog|blt|ꪭꪴ꫁}}, {{cog|twh|ꪭꪴꫂ}}, {{cog|aho|𑜍𑜥}} หรือ {{m|aho|𑜍𑜤𑜈𑜫}}, {{cog|skb|รอ}}, {{cog|za|rox}}, {{cog|zzj|rux}}, {{cog|lo|ຮູ້}} === การออกเสียง === * {{IPA|tdd|/hu˦˧/}} === คำกริยา === {{tdd-verb}} # [[รู้]] imhds95aowa85w6kqheznjlgirxgcu7 หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาซีรีแอกคลาสสิก 14 2330440 5720790 2026-04-21T07:32:05Z OctraBot 3198 สร้างหน้าด้วย "{{auto cat}}" 5720790 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาซีรีแอกคลาสสิก 14 2330441 5720795 2026-04-21T07:37:42Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720795 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาซีรีแอกคลาสสิก 14 2330442 5720796 2026-04-21T07:37:47Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720796 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:mul:ประเทศ 14 2330443 5720797 2026-04-21T07:38:01Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720797 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:mul:รายชื่อหมวดหมู่ชื่อ 14 2330444 5720798 2026-04-21T07:38:09Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720798 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:mul:องค์การทางการเมือง 14 2330445 5720799 2026-04-21T07:38:10Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720799 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:mul:สถานที่ 14 2330446 5720800 2026-04-21T07:38:15Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720800 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:mul:ชื่อ (หัวข้อ) 14 2330447 5720801 2026-04-21T07:38:20Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720801 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:คำนามผันรูปไม่ได้ภาษาฝรั่งเศส 14 2330448 5720802 2026-04-21T07:38:42Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720802 wikitext text/x-wiki {{auto cat}} 
eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาอิงกุช 14 2330449 5720803 2026-04-21T07:40:00Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720803 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาอิงกุช 14 2330450 5720804 2026-04-21T07:40:06Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720804 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาอิงกุช 14 2330451 5720805 2026-04-21T07:40:12Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720805 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาแมนจู 14 2330452 5720806 2026-04-21T07:40:34Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720806 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาแมนจู 14 2330453 5720807 2026-04-21T07:40:39Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720807 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาแมนจู 14 2330454 5720808 2026-04-21T07:40:44Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720808 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาบัตส์ 14 2330455 5720809 2026-04-21T07:41:02Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720809 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาบัตส์ 14 2330456 5720810 2026-04-21T07:41:08Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720810 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาบัตส์ 14 2330457 5720811 2026-04-21T07:41:13Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720811 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:ศัพท์ภาษาโปรตุเกสที่สะกดด้วย Ù 14 2330458 5720812 2026-04-21T07:41:41Z OctraBot 3198 สร้างหน้าด้วย "{{auto cat}}" 5720812 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาเบลารุส 14 2330459 5720813 2026-04-21T07:42:02Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720813 wikitext 
text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบแหล่งอ้างอิงภาษาเอสเปรันโต 14 2330460 5720814 2026-04-21T07:42:03Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720814 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาเบลารุส 14 2330461 5720815 2026-04-21T07:42:08Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720815 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบลิงก์ภาษาเอสเปรันโต 14 2330462 5720816 2026-04-21T07:42:10Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720816 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาเบลารุส 14 2330463 5720817 2026-04-21T07:42:15Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720817 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx หมวดหมู่:แม่แบบภาษาเอสเปรันโต 14 2330464 5720818 2026-04-21T07:42:17Z OctraBot 3198 สร้างหมวดหมู่อัตโนมัติ 5720818 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx