Wikikamus
mswiktionary
https://ms.wiktionary.org/wiki/Wikikamus:Laman_Utama
bina
2026-04-21T17:45:52Z
Hakimi97
== Malay ==
===Definition===
{{ms-kn}}
# [[bangunan]] (a building).
===Etymology===
{{bor+|ms|ar|بنى|tr=banā||t=membina}}, {{m|ar|بناء|tr=binā||t=binaan, bangunan}}.
===Pronunciation===
{{dewan|bi|na}}
{{AFA|ms|/bina/|/bena/}}
===Jawi script===
{{ARchar|بينا}}
===Translations===
{{ter-atas|membina}}
* Arabic: {{ARchar|يبني}} (yabne)
* Armenian: {{Armn|կառուցել}} (kaṙuc'el), {{Armn|շինել}} (šinel), {{Armn|սարքել}} (sark'el)
* Dutch: bouwen
* Czech: stavět
* Danish: bygge
* Estonian: ehitama
* Ewe: tu
* Finnish: rakentaa
* Hebrew: {{Hebr|לבנות}}
* Ido: konstruktar
* Indonesian: {{t-|id|bangun|alt=membangun}}, {{t-|id|diri|alt=mendirikan}}
* English: build
* Irish: tóg
* Italian: costruire, edificare
* Japanese: 建てる (たてる, tateru), 建設する (けんせつする, kensetsu-suru)
* German: bauen
* Korean: 만들다 /mandulda/
* Kurdish: {{KUchar|دروستکردن}}
* Latin: mūniō, mūnīre, mūnīvī, mūnītus
* French: construire, édifier
* Polish: budować
* Portuguese: construir
* Romanian: clădi
* Russian: {{Cyrl|строить}} (stroit)
* Scots: build, big
* Spanish: construir, edificar
* Swahili: kujenga
* Swedish: anlägga, bygga, förfärdiga, uppföra, uppresa, upprätta
{{ter-bawah}}
===Derived terms===
* binaan:
** something that is built; a building; a structure;
** formation.
* membina:
** to make until built; to build; to erect;
** to help in the process of growing;
** to work to make more advanced; to develop;
** to bring into being; to form;
** to expand;
** to bring about good; to be beneficial.
* pembinaan:
** the act of building;
** development;
** matters relating to building.
* terbina:
** built; constructed; erected;
** formed from several elements.
===Thesaurus===
; Synonym: [[bangun]].
== Indonesian ==
* See the Malay definition.
Modul:languages
2026-04-21T12:20:35Z
Hakimi97
--[==[ intro:
This module implements fetching of language-specific information and processing text in a given language.
===Types of languages===
There are two types of languages: full languages and etymology-only languages. The essential difference is that only
full languages appear in L2 headings in vocabulary entries, and hence categories like [[:Category:French nouns]] exist
only for full languages. Etymology-only languages have either a full language or another etymology-only language as
their parent (in the parent-child inheritance sense), and for etymology-only languages with another etymology-only
language as their parent, a full language can always be derived by following the parent links upwards. For example,
"Canadian French", code `fr-CA`, is an etymology-only language whose parent is the full language "French", code `fr`.
An example of an etymology-only language with another etymology-only parent is "Northumbrian Old English", code
`ang-nor`, which has "Anglian Old English", code `ang-ang` as its parent; this is an etymology-only language whose
parent is "Old English", code `ang`, which is a full language. (This is because Northumbrian Old English is considered
a variety of Anglian Old English.) Sometimes the parent is the "Undetermined" language, code `und`; this is the case,
for example, for "substrate" languages such as "Pre-Greek", code `qsb-grc`, and "the BMAC substrate", code `qsb-bma`.
It is important to distinguish language ''parents'' from language ''ancestors''. The parent-child relationship is one
of containment, i.e. if X is a child of Y, X is considered a variety of Y. On the other hand, the ancestor-descendant
relationship is one of descent in time. For example, "Classical Latin", code `la-cla`, and "Late Latin", code `la-lat`,
are both etymology-only languages with "Latin", code `la`, as their parent, because both of the former are varieties
of Latin. However, Late Latin does *NOT* have Classical Latin as its parent because Late Latin is *not* a variety of
Classical Latin; rather, it is a descendant. There is in fact a separate `ancestors` field that is used to express the
ancestor-descendant relationship, and Late Latin's ancestor is given as Classical Latin. It is also important to note
that sometimes an etymology-only language is actually the conceptual ancestor of its parent language. This happens,
for example, with "Old Italian" (code `roa-oit`), which is an etymology-only variant of full language "Italian" (code
`it`), and with "Old Latin" (code `itc-ola`), which is an etymology-only variant of Latin. In both cases, the full
language has the etymology-only variant listed as an ancestor. This allows a Latin term to inherit from Old Latin
using the {{tl|inh}} template (where in this template, "inheritance" refers to ancestral inheritance, i.e. inheritance
in time, rather than in the parent-child sense); likewise for Italian and Old Italian.
Full languages come in three subtypes:
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the
main namespace. There may also be reconstructed terms for the language, which are placed in the
{Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages
are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük,
among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with
*. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in
the Appendix namespace, but they are not reconstructed and therefore should not have *
prefixed in links. Most constructed languages are of this subtype.
Both full languages and etymology-only languages have a {Language} object associated with them, which is fetched using
the {getByCode} function in [[Module:languages]] to convert a language code to a {Language} object. Depending on the
options supplied to this function, etymology-only languages may or may not be accepted, and family codes may be
accepted (returning a {Family} object as described in [[Module:families]]). There are also separate {getByCanonicalName}
functions in [[Module:languages]] and [[Module:etymology languages]] to convert a language's canonical name to a
{Language} object (depending on whether the canonical name refers to a full or etymology-only language).
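As a rough illustration of the parent-child relationship described above, the following standalone sketch walks parent
links upwards until it reaches a full language. The `parents` table and the `get_full_code` function are hypothetical
stand-ins invented for this example (the real data lives in the data modules and is accessed through {Language}
objects); only the language codes are taken from the examples above.

```lua
-- Hypothetical miniature parent table: etymology-only code -> parent code.
-- Codes that do not appear as keys are treated as full languages here.
local parents = {
	["fr-CA"] = "fr",
	["ang-nor"] = "ang-ang",
	["ang-ang"] = "ang",
	["la-cla"] = "la",
	["la-lat"] = "la",
}

-- Follow parent links upwards until a code with no parent is reached.
local function get_full_code(code)
	while parents[code] do
		code = parents[code]
	end
	return code
end

print(get_full_code("ang-nor")) -- ang
print(get_full_code("la-lat")) -- la (the parent, not the ancestor la-cla)
print(get_full_code("fr")) -- fr (already a full language)
```

Note that the walk only follows containment (parent) links; the ancestor relationship would need a separate table.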
===Textual representations===
Textual strings belonging to a given language come in several different ''text variants'':
# The ''input text'' is what the user supplies in wikitext, in the parameters to {{tl|m}}, {{tl|l}}, {{tl|ux}},
{{tl|t}}, {{tl|lang}} and the like.
# The ''corrected input text'' is the input text with some corrections and/or normalizations applied, such as
bad-character replacements for certain languages, like replacing `l` or `1` with [[palochka]] in some languages written
in Cyrillic. (FIXME: This currently goes under the name ''display text'' but that will be repurposed below. Also,
[[User:Surjection]] suggests renaming this to ''normalized input text'', but "normalized" is used in a different sense
in [[Module:usex]].)
# The ''display text'' is the text in the form as it will be displayed to the user. This is what appears in headwords,
in usexes, in displayed internal links, etc. This can include accent marks that are removed to form the stripped
display text (see below), as well as embedded bracketed links that are variously processed further. The display text
is generated from the corrected input text by applying language-specific transformations; for most languages, there
will be no such transformations. The general reason for having a difference between input and display text is to allow
for extra information in the input text that is not displayed to the user but is sent to the transliteration module.
Note that having different display and input text is only supported currently through special-casing but will be
generalized. Examples of transformations are: (1) Removing the {{cd|^}} that is used in certain East Asian (and
possibly other unicameral) languages to indicate capitalization of the transliteration (which is currently
special-cased); (2) for Korean, removing or otherwise processing hyphens (which is currently special-cased); (3) for
Arabic, removing a ''sukūn'' diacritic placed over a ''tāʔ marbūṭa'' (like this: ةْ) to indicate that the
''tāʔ marbūṭa'' is pronounced and transliterated as /t/ instead of being silent [NOTE, NOT IMPLEMENTED YET]; (4) for
Thai and Khmer, converting space-separated words to bracketed words and resolving respelling substitutions such as
`[กรีน/กฺรีน]`, which indicate how to transliterate given words [NOTE, NOT IMPLEMENTED YET except in language-specific
templates like {{tl|th-usex}}].
## The ''right-resolved display text'' is the result of removing brackets around one-part embedded links and resolving
two-part embedded links into their right-hand components (i.e. converting two-part links into the displayed form).
The process of right-resolution is what happens when you call {{cd|remove_links()}} in [[Module:links]] on some text.
When applied to the display text, it produces exactly what the user sees, without any link markup.
# The ''stripped display text'' is the result of applying diacritic-stripping to the display text.
## The ''left-resolved stripped display text'' [NEED BETTER NAME] is the result of applying left-resolution to the
stripped display text, i.e. similar to right-resolution but resolving two-part embedded links into their left-hand
components (i.e. the linked-to page). If the display text refers to a single page, applying diacritic stripping and
left-resolution produces the ''logical pagename''.
# The ''physical pagename text'' is the result of converting the stripped display text into physical page links. If the
stripped display text contains embedded links, the left side of those links is converted into physical page links;
otherwise, the entire text is considered a pagename and converted in the same fashion. The conversion does three
things: (1) converts characters not allowed in pagenames into their "unsupported title" representation, e.g.
{{cd|Unsupported titles/`gt`}} in place of the logical name {{cd|>}}; (2) handles certain special-cased
unsupported-title logical pagenames, such as {{cd|Unsupported titles/Space}} in place of {{cd|[space]}} and
{{cd|Unsupported titles/Ancient Greek dish}} in place of a very long Greek name for a gourmet dish as found in
Aristophanes; (3) converts "mammoth" pagenames such as [[a]] into their appropriate split component, e.g.
[[a/languages A to L]].
# The ''source translit text'' is the text as supplied to the language-specific {{cd|transliterate()}} method. The form
of the source translit text may need to be language-specific, e.g. Thai and Khmer will need the corrected input text,
whereas other languages may need to work off the display text. [FIXME: It's still unclear to me how embedded bracketed
links are handled in the existing code.] In general, embedded links need to be right-resolved (see above), but when
this happens is unclear to me [FIXME]. Some languages have a chop-up-and-paste-together scheme that sends parts of the
text through the transliterate mechanism, and for others (those listed with "cont" in {{cd|substitution}} in
[[Module:languages/data]]) they receive the full input text, but preprocessed in certain ways. (The wisdom of this is
still unclear to me.)
# The ''transliterated text'' (or ''transliteration'') is the result of transliterating the source translit text. Unlike
for all the other text variants except the transcribed text, it is always in the Latin script.
# The ''transcribed text'' (or ''transcription'') is the result of transcribing the source translit text, where
"transcription" here means a close approximation to the phonetic form of the language in languages (e.g. Akkadian,
Sumerian, Ancient Egyptian, maybe Tibetan) that have a wide difference between the written letters and spoken form.
Unlike for all the other text variants other than the transliterated text, it is always in the Latin script.
Currently, the transcribed text is always supplied manually by the user; there is no such thing as a
{{cd|transcribe()}} method on language objects.
# The ''sort key'' is the text used in sort keys for determining the placing of pages in categories they belong to. The
sort key is generated from the pagename or a specified ''sort base'' by lowercasing, doing language-specific
transformations and then uppercasing the result. If the sort base is supplied and is generated from input text, it
needs to be converted to display text, have embedded links removed through right-resolution and have
diacritic-stripping applied.
# There are other text variants that occur in usexes (specifically, there are normalized variants of several of the
above text variants), but we can skip them for now.
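The difference between right-resolution and left-resolution can be sketched in plain Lua as follows. This is a
deliberately simplified illustration: `resolve`, `right_resolve` and `left_resolve` are hypothetical names, and the
patterns only handle simple one-part and two-part embedded links, whereas {{cd|remove_links()}} in [[Module:links]]
deals with many more cases.

```lua
-- Resolve two-part links to one side, then unwrap one-part links.
-- `side` is 1 for the left-hand (target) part, 2 for the right-hand (display) part.
local function resolve(text, side)
	text = text:gsub("%[%[([^%[%]|]*)|([^%[%]]*)%]%]", "%" .. side)
	text = text:gsub("%[%[([^%[%]|]*)%]%]", "%1")
	return text
end

local function right_resolve(text)
	return resolve(text, 2)
end

local function left_resolve(text)
	return resolve(text, 1)
end

local display = "[[dog|dogs]] and [[cat]]"
print(right_resolve(display)) -- dogs and cat
print(left_resolve(display)) -- dog and cat
```

Right-resolution yields what the reader sees; left-resolution yields the pages being linked to.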
The following methods exist on {Language} objects to convert between different text variants:
# {correctInputText} (currently called {makeDisplayText}): This converts input text to corrected input text.
# {stripDiacritics}: This converts to stripped display text. [FIXME: This needs some rethinking. In particular,
{stripDiacritics} is sometimes called on input text, corrected input text or display text (in various paths inside of
[[Module:links]], and, in the case of input text, usually from other modules). We need to make sure we don't try to
convert input text to display text twice, but at the same time we need to support calling it directly on input text
since so many modules do this. This means we need to add a parameter indicating whether the passed-in text is input,
corrected input, or display text; if the former two, we call {correctInputText} ourselves.]
# {logicalToPhysical}: This converts logical pagenames to physical pagenames.
# {transliterate}: This appears to convert input text with embedded brackets removed into a transliteration.
[FIXME: This needs some rethinking. In particular, it calls {processDisplayText} on its input, which won't work
for Thai and Khmer, so we may need language-specific flags indicating whether to pass the input text directly to the
language transliterate method. In addition, I'm not sure how embedded links are handled in the existing translit code;
a lot of callers remove the links themselves before calling {transliterate()}, which I assume is wrong.]
# {makeSortKey}: This converts display text (?) to a sort key. [FIXME: Clarify this.]
]==]
local export = {}
local debug_track_module = "Modul:debug/track"
local etymology_languages_data_module = "Modul:etymology languages/data"
local families_module = "Modul:families"
local headword_page_module = "Modul:headword/page"
local json_module = "Modul:JSON"
local language_like_module = "Modul:language-like"
local languages_data_module = "Modul:languages/data"
local languages_data_patterns_module = "Modul:languages/data/patterns"
local links_data_module = "Modul:links/data"
local load_module = "Modul:load"
local scripts_module = "Modul:scripts"
local scripts_data_module = "Modul:scripts/data"
local string_encode_entities_module = "Modul:string/encode entities"
local string_pattern_escape_module = "Modul:string/patternEscape"
local string_replacement_escape_module = "Modul:string/replacementEscape"
local string_utilities_module = "Modul:string utilities"
local table_module = "Modul:table"
local utilities_module = "Modul:utilities"
local wikimedia_languages_module = "Modul:wikimedia languages"
local mw = mw
local string = string
local table = table
local char = string.char
local concat = table.concat
local find = string.find
local floor = math.floor
local get_by_code -- Defined below.
local get_data_module_name -- Defined below.
local get_extra_data_module_name -- Defined below.
local getmetatable = getmetatable
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_known_language_tag = mw.language.isKnownLanguageTag
local make_object -- Defined below.
local match = string.match
local next = next
local pairs = pairs
local remove = table.remove
local require = require
local select = select
local setmetatable = setmetatable
local sub = string.sub
local type = type
local unstrip = mw.text.unstrip
-- Loaded as needed by findBestScript.
local Hans_chars
local Hant_chars
local function check_object(...)
check_object = require(utilities_module).check_object
return check_object(...)
end
local function debug_track(...)
debug_track = require(debug_track_module)
return debug_track(...)
end
local function decode_entities(...)
decode_entities = require(string_utilities_module).decode_entities
return decode_entities(...)
end
local function decode_uri(...)
decode_uri = require(string_utilities_module).decode_uri
return decode_uri(...)
end
local function deep_copy(...)
deep_copy = require(table_module).deepCopy
return deep_copy(...)
end
local function encode_entities(...)
encode_entities = require(string_encode_entities_module)
return encode_entities(...)
end
local function get_L2_sort_key(...)
get_L2_sort_key = require(headword_page_module).get_L2_sort_key
return get_L2_sort_key(...)
end
local function get_script(...)
get_script = require(scripts_module).getByCode
return get_script(...)
end
local function find_best_script_without_lang(...)
find_best_script_without_lang = require(scripts_module).findBestScriptWithoutLang
return find_best_script_without_lang(...)
end
local function get_family(...)
get_family = require(families_module).getByCode
return get_family(...)
end
local function get_plaintext(...)
get_plaintext = require(utilities_module).get_plaintext
return get_plaintext(...)
end
local function get_wikimedia_lang(...)
get_wikimedia_lang = require(wikimedia_languages_module).getByCode
return get_wikimedia_lang(...)
end
local function keys_to_list(...)
keys_to_list = require(table_module).keysToList
return keys_to_list(...)
end
local function list_to_set(...)
list_to_set = require(table_module).listToSet
return list_to_set(...)
end
local function load_data(...)
load_data = require(load_module).load_data
return load_data(...)
end
local function make_family_object(...)
make_family_object = require(families_module).makeObject
return make_family_object(...)
end
local function pattern_escape(...)
pattern_escape = require(string_pattern_escape_module)
return pattern_escape(...)
end
local function replacement_escape(...)
replacement_escape = require(string_replacement_escape_module)
return replacement_escape(...)
end
local function safe_require(...)
safe_require = require(load_module).safe_require
return safe_require(...)
end
local function shallow_copy(...)
shallow_copy = require(table_module).shallowCopy
return shallow_copy(...)
end
local function split(...)
split = require(string_utilities_module).split
return split(...)
end
local function to_json(...)
to_json = require(json_module).toJSON
return to_json(...)
end
local function u(...)
u = require(string_utilities_module).char
return u(...)
end
local function ugsub(...)
ugsub = require(string_utilities_module).gsub
return ugsub(...)
end
local function ulen(...)
ulen = require(string_utilities_module).len
return ulen(...)
end
local function ulower(...)
ulower = require(string_utilities_module).lower
return ulower(...)
end
local function umatch(...)
umatch = require(string_utilities_module).match
return umatch(...)
end
local function uupper(...)
uupper = require(string_utilities_module).upper
return uupper(...)
end
local function track(page)
debug_track("languages/" .. page)
return true
end
local function normalize_code(code)
return load_data(languages_data_module).aliases[code] or code
end
local function check_inputs(self, check, default, ...)
local n = select("#", ...)
if n == 0 then
return false
end
local ret = check(self, (...))
if ret ~= nil then
return ret
elseif n > 1 then
local inputs = {...}
for i = 2, n do
ret = check(self, inputs[i])
if ret ~= nil then
return ret
end
end
end
return default
end
local function make_link(self, target, display)
local prefix, main
if self:getFamilyCode() == "qfa-sub" then
prefix, main = display:match("^(sebuah )(.*)")
if not prefix then
prefix, main = display:match("^(suatu )(.*)")
end
end
return (prefix or "") .. "[[" .. target .. "|" .. (main or display) .. "]]"
end
-- Convert risky characters to HTML entities, which minimizes interference once returned (e.g. for "sms:a", "<!-- -->" etc.).
local function escape_risky_characters(text)
-- Spacing characters in isolation generally need to be escaped in order to be properly processed by the MediaWiki software.
if umatch(text, "^%s*$") then
return encode_entities(text, text)
end
return encode_entities(text, "!#%&*+/:;<=>?@[\\]_{|}")
end
-- Temporarily convert various formatting characters to PUA to prevent them from being disrupted by the substitution process.
local function doTempSubstitutions(text, subbedChars, keepCarets, noTrim)
-- Clone so that we don't insert any extra patterns into the table in package.loaded. For some reason, using require seems to keep memory use down; probably because the table is always cloned.
local patterns = shallow_copy(require(languages_data_patterns_module))
if keepCarets then
insert(patterns, "((\\+)%^)")
insert(patterns, "((%^))")
end
-- Ensure any whitespace at the beginning and end is temp substituted, to prevent it from being accidentally trimmed. We only want to trim any final spaces added during the substitution process (e.g. by a module), which means we only do this during the first round of temp substitutions.
if not noTrim then
insert(patterns, "^([\128-\191\244]*(%s+))")
insert(patterns, "((%s+)[\128-\191\244]*)$")
end
-- Pre-substitute "[[" and "]]" with control characters, which makes pattern matching more accurate.
text = gsub(text, "%f[%[]%[%[", "\1"):gsub("%f[%]]%]%]", "\2")
local i = #subbedChars
for _, pattern in ipairs(patterns) do
-- Patterns ending in \0 are for things like "[[" or "]]", so the inserted PUA characters are treated as breaks between terms by modules that scrape info from pages.
local term_divider
pattern = gsub(pattern, "%z$", function(divider)
term_divider = divider == "\0"
return ""
end)
text = gsub(text, pattern, function(...)
local m = {...}
local m1New = m[1]
for k = 2, #m do
local n = i + k - 1
subbedChars[n] = m[k]
local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
local byte3 = floor(n / 64) % 64 + 128
local byte4 = n % 64 + 128
m1New = gsub(m1New, pattern_escape(m[k]), "\244" .. char(byte2) .. char(byte3) .. char(byte4), 1)
end
i = i + #m - 1
return m1New
end)
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text, subbedChars
end
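--[==[
The byte arithmetic in doTempSubstitutions() above can be checked in isolation. The standalone sketch below reproduces
the same formulas to build the 4-byte placeholder for substitution index `n` and to decode it again.
`encode_placeholder` and `decode_placeholder` are hypothetical helper names used only for this illustration; they are
not part of the module. The sketch assumes `n` is below 32768, so that the second byte stays within the range
\128-\143.

```lua
local floor, char, byte = math.floor, string.char, string.byte

-- Build the placeholder for substitution index n, mirroring the formulas used above.
local function encode_placeholder(n, term_divider)
	local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
	local byte3 = floor(n / 64) % 64 + 128
	local byte4 = n % 64 + 128
	return "\244" .. char(byte2) .. char(byte3) .. char(byte4)
end

-- Recover the index and the term-divider flag from a placeholder.
local function decode_placeholder(s)
	local _, b2, b3, b4 = byte(s, 1, 4)
	local n = (b2 % 8) * 4096 + (b3 - 128) * 64 + (b4 - 128)
	return n, b2 < 136
end

local p = encode_placeholder(65, false)
print(byte(p, 1, 4)) -- the four bytes 244, 136, 129, 129
print(decode_placeholder(p)) -- index 65, divider flag false
print(decode_placeholder(encode_placeholder(65, true))) -- index 65, divider flag true
```

Read as UTF-8, these placeholders are plane-16 private-use codepoints: U+100000 + n for term dividers and
U+108000 + n otherwise.
]==]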
-- Reinsert any formatting that was temporarily substituted.
local function undoTempSubstitutions(text, subbedChars)
for i = 1, #subbedChars do
local byte2 = floor(i / 4096) % 64 + 128
local byte3 = floor(i / 64) % 64 + 128
local byte4 = i % 64 + 128
text = gsub(text, "\244[" .. char(byte2) .. char(byte2+8) .. "]" .. char(byte3) .. char(byte4),
replacement_escape(subbedChars[i]))
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text
end
-- Check if the raw text is an unsupported title, and if so return that. Otherwise, remove HTML entities. We do the pre-conversion to avoid loading the unsupported title list unnecessarily.
local function checkNoEntities(self, text)
local textNoEnc = decode_entities(text)
if textNoEnc ~= text and load_data(links_data_module).unsupported_titles[text] then
return text
else
return textNoEnc
end
end
-- If no script object is provided (or if it's invalid or None), get one.
local function checkScript(text, self, sc)
if not check_object("script", true, sc) or sc:getCode() == "None" then
return self:findBestScript(text)
end
return sc
end
local function normalize(text, sc)
text = sc:fixDiscouragedSequences(text)
return sc:toFixedNFD(text)
end
-- Subfunction of iterateSectionSubstitutions(). Process an individual chunk of text according to the specifications in
-- `substitution_data`. The input parameters are all as in the documentation of iterateSectionSubstitutions() except for
-- `recursed`, which is set to true if we called ourselves recursively to process a script-specific setting or
-- script-wide fallback. Returns two values: the processed text and the actual substitution data used to do the
-- substitutions (same as the `actual_substitution_data` return value to iterateSectionSubstitutions()).
local function doSubstitutions(self, text, sc, substitution_data, data_field, function_name, recursed)
-- BE CAREFUL in this function because the value at any level can be `false`, which causes no processing to be done
-- and blocks any further fallback processing.
local actual_substitution_data = substitution_data
-- If there are language-specific substitutes given in the data module, use those.
if type(substitution_data) == "table" then
-- If a script is specified, run this function with the script-specific data before continuing.
local sc_code = sc:getCode()
local has_substitution_data = false
if substitution_data[sc_code] ~= nil then
has_substitution_data = true
if substitution_data[sc_code] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[sc_code], data_field,
function_name, true)
end
-- Hant, Hans and Hani are usually treated the same, so add a special case to avoid having to specify each one
-- separately.
elseif sc_code:match("^Han") and substitution_data.Hani ~= nil then
has_substitution_data = true
if substitution_data.Hani then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data.Hani, data_field,
function_name, true)
end
-- Substitution data with key 1 in the outer table may be given as a fallback.
elseif substitution_data[1] ~= nil then
has_substitution_data = true
if substitution_data[1] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[1], data_field,
function_name, true)
end
end
-- Iterate over all strings in the "from" subtable, and gsub with the corresponding string in "to". We work with
-- the NFD decomposed forms, as this simplifies many substitutions.
if substitution_data.from then
has_substitution_data = true
for i, from in ipairs(substitution_data.from) do
-- Normalize each loop, to ensure multi-stage substitutions work correctly.
text = sc:toFixedNFD(text)
text = ugsub(text, sc:toFixedNFD(from), substitution_data.to[i] or "")
end
end
if substitution_data.remove_diacritics then
has_substitution_data = true
text = sc:toFixedNFD(text)
-- Convert exceptions to PUA.
local remove_exceptions, substitutes = substitution_data.remove_exceptions
if remove_exceptions then
substitutes = {}
local i = 0
for _, exception in ipairs(remove_exceptions) do
exception = sc:toFixedNFD(exception)
text = ugsub(text, exception, function(m)
i = i + 1
local subst = u(0x80000 + i)
substitutes[subst] = m
return subst
end)
end
end
-- Strip diacritics.
text = ugsub(text, "[" .. substitution_data.remove_diacritics .. "]", "")
-- Convert exceptions back.
if remove_exceptions then
text = text:gsub("\242[\128-\191]*", substitutes)
end
end
if not has_substitution_data and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
elseif type(substitution_data) == "string" then
-- If there is a dedicated function module, use that.
local module = safe_require("Modul:" .. substitution_data)
if module then
-- TODO: translit functions should take objects, not codes.
-- TODO: translit functions should be called with form NFD.
if function_name == "tr" then
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named 'tr'"):format(substitution_data))
end
text = module[function_name](text, self._code, sc:getCode())
elseif function_name == "stripDiacritics" then
-- FIXME, get rid of this arm after renaming makeEntryName -> stripDiacritics.
if module[function_name] then
text = module[function_name](sc:toFixedNFD(text), self, sc)
elseif module.makeEntryName then
text = module.makeEntryName(sc:toFixedNFD(text), self, sc)
else
error(("Internal error: Module [[%s]] has no function named 'stripDiacritics' or 'makeEntryName'"
):format(substitution_data))
end
else
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named '%s'"):format(
substitution_data, function_name))
end
text = module[function_name](sc:toFixedNFD(text), self, sc)
end
else
error("Substitution data '" .. substitution_data .. "' does not match an existing module.")
end
elseif substitution_data == nil and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
-- Don't normalize to NFC if this is the inner loop or if a module returned nil.
if recursed or not text then
return text, actual_substitution_data
end
-- Fix any discouraged sequences created during the substitution process, and normalize into the final form.
return sc:toFixedNFC(sc:fixDiscouragedSequences(text)), actual_substitution_data
end
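--[==[
The `from`/`to` handling in doSubstitutions() can be illustrated with a much-reduced standalone sketch. It uses plain
string.gsub instead of the Unicode-aware ugsub(), skips NFD normalization and the script-specific fallbacks, and the
`example_data` table and `apply_from_to` function are made up for the illustration. It therefore shows only how each
`from` pattern is paired with its `to` replacement, and that an absent `to` entry deletes the match.

```lua
-- Made-up substitution data: the second pattern has no replacement, so matches are deleted.
local example_data = {
	from = {"ph", "x"},
	to = {"f"},
}

-- Apply each `from` pattern in order, replacing with the matching `to` entry or "".
local function apply_from_to(text, data)
	for i, from in ipairs(data.from) do
		text = text:gsub(from, data.to[i] or "")
	end
	return text
end

print(apply_from_to("phoenix", example_data)) -- foeni
```

Because the substitutions run in sequence, a later pattern sees the output of the earlier ones.
]==]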
-- Split the text into sections, based on the presence of temporarily substituted formatting characters, then iterate
-- over each section to apply substitutions (e.g. transliteration or diacritic stripping). This avoids putting PUA
-- characters through language-specific modules, which may be unequipped for them. This function is passed the following
-- values:
-- * `self` (the Language object);
-- * `text` (the text to process);
-- * `sc` (the script of the text, which must be specified; callers should call checkScript() as needed to autodetect the
-- script of the text if not given explicitly by the user);
-- * `subbedChars` (an array of the same length as the text, indicating which characters have been substituted and by
-- what, or {nil} if no substitutions are to happen);
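-- * `keepCarets` (if true, carets, along with any backslashes immediately preceding them, are also temporarily
--   substituted by doTempSubstitutions() so that they survive the substitution process);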
-- * `substitution_data` (the data indicating which substitutions to apply, taken directly from `data_field` in the
-- language's data structure in a submodule of [[Module:languages/data]]);
-- * `data_field` (the data field from which `substitution_data` was fetched, such as "sort_key" or "strip_diacritics");
-- * `function_name` (the name of the function to call to do the substitution, in case `substitution_data` specifies a
-- module to do the substitution);
-- * `notrim` (don't trim whitespace at the edges of `text`; set when computing the sort key, because whitespace at the
-- beginning of a sort key is significant and causes the resulting page to be sorted at the beginning of the category
-- it's in).
-- Returns three values:
-- (1) the processed text;
-- (2) the value of `subbedChars` that was passed in, possibly modified with additional character substitutions; will be
-- {nil} if {nil} was passed in;
-- (3) the actual substitution data that was used to apply substitutions to `text`; this may be different from the value
-- of `substitution_data` passed in if that value recursively specified script-specific substitutions or if no
-- substitution data could be found in the language-specific data (e.g. {nil} was passed in or a structure was passed
-- in that had no setting for the script given in `sc`), but a script-wide fallback value was set; currently it is
-- only used by makeSortKey().
local function iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, substitution_data, data_field,
function_name, notrim)
local sections
-- See [[Module:languages/data]].
if not find(text, "\244") or load_data(languages_data_module).substitution[self._code] == "cont" then
sections = {text}
else
sections = split(text, "\244[\128-\143][\128-\191]*", true)
end
local actual_substitution_data
for _, section in ipairs(sections) do
-- Don't bother processing empty strings or whitespace (which may also not be handled well by dedicated
-- modules).
if gsub(section, "%s+", "") ~= "" then
local sub, this_actual_substitution_data = doSubstitutions(self, section, sc, substitution_data, data_field,
function_name)
actual_substitution_data = this_actual_substitution_data
-- Second round of temporary substitutions, in case any formatting was added by the main substitution
-- process. However, don't do this if the section contains formatting already (as it would have had to have
-- been escaped to reach this stage, and therefore should be given as raw text).
if sub and subbedChars then
local noSub
for _, pattern in ipairs(require(languages_data_patterns_module)) do
if match(section, pattern .. "%z?") then
noSub = true
end
end
if not noSub then
sub, subbedChars = doTempSubstitutions(sub, subbedChars, keepCarets, true)
end
end
if not sub then
text = sub
break
end
text = sub and gsub(text, pattern_escape(section), replacement_escape(sub), 1) or text
end
end
if not notrim then
-- Trim, unless there are only spacing characters, while ignoring any final formatting characters.
-- Do not trim sort keys because spaces at the beginning are significant.
text = text and text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2"):gsub("(%S)%s+([\128-\191\244]*)$", "%1%2") or
nil
end
return text, subbedChars, actual_substitution_data
end
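--[=[
A minimal sketch of the edge-trimming step performed above when `notrim` is not set, using only the standard string
library. The name `trim_edges` is hypothetical and not part of this module; the two patterns are copied from the
function above.

```lua
-- Trim whitespace at the edges, but leave a whitespace-only string alone.
-- Bytes in the set [\128-\191\244] stand in for the temporarily substituted formatting characters.
local function trim_edges(text)
	return (text
		:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2")
		:gsub("(%S)%s+([\128-\191\244]*)$", "%1%2"))
end

local trimmed = trim_edges("  kata bina  ")
local blank = trim_edges("   ")
```

Both patterns require at least one non-space character, so a string made up only of spaces is returned unchanged.
]=]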
-- Process carets (and any escapes). Default to simple removal, if no pattern/replacement is given.
local function processCarets(text, pattern, repl)
local rep
repeat
text, rep = gsub(text, "\\\\(\\*^)", "\3%1")
until rep == 0
return (text:gsub("\\^", "\4")
:gsub(pattern or "%^", repl or "")
:gsub("\3", "\\")
:gsub("\4", "^"))
end
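--[=[
A standalone sketch of the caret handling above, using the standard `string.gsub` in place of the module-level
`gsub` upvalue. The name `process_carets` and the capitalizing call at the end are illustrative assumptions, not
calls taken from this module.

```lua
local gsub = string.gsub

local function process_carets(text, pattern, repl)
	local rep
	-- Protect escaped backslashes that precede a caret by swapping them for \3.
	repeat
		text, rep = gsub(text, "\\\\(\\*^)", "\3%1")
	until rep == 0
	-- Protect escaped carets as \4, apply the substitution, then restore both.
	return (text:gsub("\\^", "\4")
		:gsub(pattern or "%^", repl or "")
		:gsub("\3", "\\")
		:gsub("\4", "^"))
end

local removed = process_carets("^london")
local escaped = process_carets("a\\^b")
local capitalized = process_carets("^london", "%^(%l)", string.upper)
```

With no pattern given, an unescaped caret is simply dropped, while a caret escaped with a backslash comes out as a
literal caret.
]=]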
-- Remove carets if they are used to capitalize parts of transliterations (unless they have been escaped).
local function removeCarets(text, sc)
if not sc:hasCapitalization() and sc:isTransliterated() and text:find("^", 1, true) then
return processCarets(text)
else
return text
end
end
local Language = {}
--[==[Returns the language code of the language. Example: {{code|lua|"fr"}} for French.]==]
function Language:getCode()
return self._code
end
--[==[Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: {{code|lua|"French"}} for French.]==]
function Language:getCanonicalName()
local name = self._name
if name == nil then
name = self._data[1]
self._name = name
end
return name
end
--[==[
Return the display form of the language. The display form of a language, family or script is the form it takes when
appearing as the <code><var>source</var></code> in categories such as <code>English terms derived from
<var>source</var></code> or <code>English given names from <var>source</var></code>, and is also the displayed text
in {makeCategoryLink()} links. For full and etymology-only languages, this is the same as the canonical name, but
for families, it reads <code>"<var>name</var> languages"</code> (e.g. {"Indo-Iranian languages"}), and for scripts,
it reads <code>"<var>name</var> script"</code> (e.g. {"Arabic script"}).
]==]
function Language:getDisplayForm()
local form = self._displayForm
if form == nil then
form = self:getCanonicalName()
-- Add an article ("suatu ") and "substratum " to substrate names that lack them.
if self:getFamilyCode() == "qfa-sub" then
if not (sub(form, 1, 7) == "sebuah " or sub(form, 1, 6) == "suatu ") then
form = "suatu " .. form
end
if not match(form, "[Ss]ubstratum") then
form = "substratum " .. form
end
end
self._displayForm = form
end
return form
end
--[==[Returns the value which should be used in the HTML lang= attribute for tagged text in the language.]==]
function Language:getHTMLAttribute(sc, region)
local code = self._code
if not find(code, "-", 1, true) then
return code .. "-" .. sc:getCode() .. (region and "-" .. region or "")
end
local parent = self:getParent()
region = region or match(code, "%f[%u][%u-]+%f[%U]")
if parent then
return parent:getHTMLAttribute(sc, region)
end
-- TODO: ISO family codes can also be used.
return "mis-" .. sc:getCode() .. (region and "-" .. region or "")
end
--[==[Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: {{code|lua|{"High German", "New High German", "Deutsch"} }} for [[:Category:German language|German]].]==]
function Language:getAliases()
self:loadInExtraData()
return require(language_like_module).getAliases(self)
end
--[==[
Return a table of the known subvarieties of a given language, excluding subvarieties that have been given
explicit etymology-only language codes. The names are not guaranteed to be unique, in that sometimes a given name
refers to a subvariety of more than one language. Example: {{code|lua|{"Southern Aymara", "Central Aymara"} }} for
[[:Category:Aymara language|Aymara]]. Note that the returned value can have nested tables in it, when a subvariety
goes by more than one name. Example: {{code|lua|{"North Azerbaijani", "South Azerbaijani", {"Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar"}, {"Qashqa'i", "Qashqai", "Kashkay"}, "Sonqor"} }} for
[[:Category:Azerbaijani language|Azerbaijani]]. Here, for example, Afshar, Afshari, Afshar Azerbaijani and Afchar
all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value
with nested tables in it, specify a non-{{code|lua|nil}} value for the <code>flatten</code> parameter; in that case,
the return value would be {{code|lua|{"North Azerbaijani", "South Azerbaijani", "Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar", "Qashqa'i", "Qashqai", "Kashkay", "Sonqor"} }}.
]==]
function Language:getVarieties(flatten)
self:loadInExtraData()
return require(language_like_module).getVarieties(self, flatten)
end
--[==[Returns a table of the "other names" that the language is known by, which are listed in the <code>otherNames</code> field. It should be noted that the <code>otherNames</code> field itself is deprecated, and entries listed there should eventually be moved to either <code>aliases</code> or <code>varieties</code>.]==]
function Language:getOtherNames() -- To be eventually removed, once there are no more uses of the `otherNames` field.
self:loadInExtraData()
return require(language_like_module).getOtherNames(self)
end
--[==[
Return a combined table of the canonical name, aliases, varieties and other names of a given language.]==]
function Language:getAllNames()
self:loadInExtraData()
return require(language_like_module).getAllNames(self)
end
--[==[Returns a table of types as a lookup table (with the types as keys).
The possible types are
* {language}: This is a language, either full or etymology-only.
* {full}: This is a "full" (not etymology-only) language, i.e. the union of {regular}, {reconstructed} and
{appendix-constructed}. Note that the types {full} and {etymology-only} also exist for families, so if you
want to check specifically for a full language and you have an object that might be a family, you should
use {{lua|hasType("language", "full")}} and not simply {{lua|hasType("full")}}.
* {etymology-only}: This is an etymology-only (not full) language, whose parent is another etymology-only
language or a full language. Note that the types {full} and {etymology-only} also exist for
families, so if you want to check specifically for an etymology-only language and you have an
object that might be a family, you should use {{lua|hasType("language", "etymology-only")}}
and not simply {{lua|hasType("etymology-only")}}.
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted
in the main namespace. There may also be reconstructed terms for the language, which are placed in
the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full
languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto
and Volapük, among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed
with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore
be in the Appendix namespace, but they are not reconstructed and therefore should
not have * prefixed in links.
]==]
function Language:getTypes()
local types = self._types
if types == nil then
types = {language = true}
if self:getFullCode() == self._code then
types.full = true
else
types["etymology-only"] = true
end
for t in gmatch(self._data.type, "[^,]+") do
types[t] = true
end
self._types = types
end
return types
end
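--[=[
A minimal sketch of how the type lookup table above can be built and queried, written against plain values rather
than a Language object. `build_types` and `has_types` are hypothetical names; the "all requested types must be
present" behaviour is assumed from the documentation of hasType() and is not copied from its implementation.

```lua
local function build_types(type_field, is_full)
	local types = {language = true}
	if is_full then
		types.full = true
	else
		types["etymology-only"] = true
	end
	-- The data field is a comma-separated list such as "regular".
	for t in type_field:gmatch("[^,]+") do
		types[t] = true
	end
	return types
end

-- Assumed semantics of hasType(): true only if every requested type is present.
local function has_types(types, ...)
	for i = 1, select("#", ...) do
		if not types[(select(i, ...))] then
			return false
		end
	end
	return true
end

local types = build_types("regular", true)
```

Checking `has_types(types, "language", "full")` rather than just `"full"` mirrors the advice in the documentation
above for objects that might be families.
]=]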
--[==[Given a list of types as strings, returns true if the language has all of them.]==]
function Language:hasType(...)
Language.hasType = require(language_like_module).hasType
return self:hasType(...)
end
--[==[Returns a table containing <code>WikimediaLanguage</code> objects (see [[Module:wikimedia languages]]), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code <code>sh</code> (Serbo-Croatian) maps to four Wikimedia codes: <code>sh</code> (Serbo-Croatian), <code>bs</code> (Bosnian), <code>hr</code> (Croatian) and <code>sr</code> (Serbian).
The code for the Wikimedia language is retrieved from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.]==]
function Language:getWikimediaLanguages()
local wm_langs = self._wikimediaLanguageObjects
if wm_langs == nil then
local codes = self:getWikimediaLanguageCodes()
wm_langs = {}
for i = 1, #codes do
wm_langs[i] = get_wikimedia_lang(codes[i])
end
self._wikimediaLanguageObjects = wm_langs
end
return wm_langs
end
--[==[Returns a table of the Wikimedia language codes for the language. These are taken from the <code>wikimedia_codes</code> property in the data modules; if that property is not present, the code of the current language is used if it is a known language tag, and otherwise the codes of the parent (or an empty table if there is no parent).]==]
function Language:getWikimediaLanguageCodes()
local wm_langs = self._wikimediaLanguageCodes
if wm_langs == nil then
wm_langs = self._data.wikimedia_codes
if wm_langs then
wm_langs = split(wm_langs, ",", true, true)
else
local code = self._code
if is_known_language_tag(code) then
wm_langs = {code}
else
-- Inherit, but only if no codes are specified in the data *and*
-- the language code isn't a valid Wikimedia language code.
local parent = self:getParent()
wm_langs = parent and parent:getWikimediaLanguageCodes() or {}
end
end
self._wikimediaLanguageCodes = wm_langs
end
return wm_langs
end
--[==[
Returns the name of the Wikipedia article for the language. `project` specifies the language and project to retrieve
the article from, defaulting to {"enwiki"} for the English Wikipedia. Normally if specified it should be the project
code for a specific-language Wikipedia e.g. "zhwiki" for the Chinese Wikipedia, but it can be any project, including
non-Wikipedia ones. If the project is the English Wikipedia and the property {wikipedia_article} is present in the data
module it will be used first. In all other cases, a sitelink will be generated from {:getWikidataItem} (if set). The
resulting value (or lack of value) is cached so that subsequent calls are fast. If no value could be determined, and
`noCategoryFallback` is {false}, {:getCategoryName} is used as fallback; otherwise, {nil} is returned. Note that if
`noCategoryFallback` is {nil} or omitted, it defaults to {false} if the project is the English Wikipedia, otherwise
to {true}. In other words, under normal circumstances, if the English Wikipedia article couldn't be retrieved, the
return value will fall back to a link to the language's category, but this won't normally happen for any other project.
]==]
function Language:getWikipediaArticle(noCategoryFallback, project)
Language.getWikipediaArticle = require(language_like_module).getWikipediaArticle
return self:getWikipediaArticle(noCategoryFallback, project)
end
--[==[Creates a link to the Wikipedia article for the language; the link text is the canonical name.]==]
function Language:makeWikipediaLink()
return make_link(self, "w:" .. self:getWikipediaArticle(), self:getCanonicalName())
end
--[==[Returns the name of the Wikimedia Commons category page for the language.]==]
function Language:getCommonsCategory()
Language.getCommonsCategory = require(language_like_module).getCommonsCategory
return self:getCommonsCategory()
end
--[==[Returns the Wikidata item id for the language or <code>nil</code>. This corresponds to the second field in the data modules.]==]
function Language:getWikidataItem()
Language.getWikidataItem = require(language_like_module).getWikidataItem
return self:getWikidataItem()
end
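--[=[
Several methods above replace themselves on first use with an implementation fetched from another module. The
sketch below shows that pattern in isolation. `Lang`, `load_shared` and `describe` are hypothetical stand-ins, and
the use of an `__index` metatable is an assumption about how objects find their methods.

```lua
local loads = 0

-- Stand-in for requiring a shared module; counts how often it is consulted.
local function load_shared()
	loads = loads + 1
	return {
		describe = function(self)
			return "language " .. self.code
		end,
	}
end

local Lang = {}
Lang.__index = Lang

function Lang:describe()
	-- Overwrite this stub with the shared implementation, then call it.
	Lang.describe = load_shared().describe
	return self:describe()
end

local obj = setmetatable({code = "x-demo"}, Lang)
local first = obj:describe()
local second = obj:describe()
```

Only the first call goes through the stub, so the shared module is consulted once however many objects call the
method afterwards.
]=]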
--[==[Returns a table of <code>Script</code> objects for all scripts that the language is written in. See [[Module:scripts]].]==]
function Language:getScripts()
local scripts = self._scriptObjects
if scripts == nil then
local codes = self:getScriptCodes()
if codes[1] == "All" then
scripts = load_data(scripts_data_module)
else
scripts = {}
for i = 1, #codes do
scripts[i] = get_script(codes[i])
end
end
self._scriptObjects = scripts
end
return scripts
end
--[==[Returns the table of script codes in the language's data file.]==]
function Language:getScriptCodes()
local scripts = self._scriptCodes
if scripts == nil then
scripts = self._data[4]
if scripts then
local codes, n = {}, 0
for code in gmatch(scripts, "[^,]+") do
n = n + 1
-- Special handling of "Hants", which represents "Hani", "Hant" and "Hans" collectively.
if code == "Hants" then
codes[n] = "Hani"
codes[n + 1] = "Hant"
codes[n + 2] = "Hans"
n = n + 2
else
codes[n] = code
end
end
scripts = codes
else
scripts = {"None"}
end
self._scriptCodes = scripts
end
return scripts
end
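--[=[
A standalone sketch of the script-code expansion above, taking the comma-separated string directly instead of
reading it from `self._data[4]`. The name `expand_script_codes` is hypothetical.

```lua
local function expand_script_codes(scripts)
	if not scripts then
		return {"None"}
	end
	local codes, n = {}, 0
	for code in scripts:gmatch("[^,]+") do
		n = n + 1
		-- "Hants" stands for "Hani", "Hant" and "Hans" collectively.
		if code == "Hants" then
			codes[n] = "Hani"
			codes[n + 1] = "Hant"
			codes[n + 2] = "Hans"
			n = n + 2
		else
			codes[n] = code
		end
	end
	return codes
end

local expanded = table.concat(expand_script_codes("Latn,Hants,Arab"), ",")
```

The counter is advanced by two extra slots after the expansion so that later codes are appended after "Hans".
]=]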
--[==[Given some text, this function iterates through the scripts of a given language and tries to find the script that best matches the text. It returns a {{code|lua|Script}} object representing the script. If no match is found at all, it returns the {{code|lua|None}} script object.]==]
function Language:findBestScript(text, forceDetect)
if not text or text == "" or text == "-" then
return get_script("None")
end
-- Differs from table returned by getScriptCodes, as Hants is not normalized into its constituents.
local codes = self._bestScriptCodes
if codes == nil then
codes = self._data[4]
codes = codes and split(codes, ",", true, true) or {"None"}
self._bestScriptCodes = codes
end
local first_sc = codes[1]
if first_sc == "All" then
return find_best_script_without_lang(text)
end
local codes_len = #codes
if not (forceDetect or first_sc == "Hants" or codes_len > 1) then
first_sc = get_script(first_sc)
local charset = first_sc.characters
return charset and umatch(text, "[" .. charset .. "]") and first_sc or get_script("None")
end
-- Remove all formatting characters.
text = get_plaintext(text)
-- Remove all spaces and any ASCII punctuation. Some non-ASCII punctuation is script-specific, so can't be removed.
text = ugsub(text, "[%s!\"#%%&'()*,%-./:;?@[\\%]_{}]+", "")
if #text == 0 then
return get_script("None")
end
-- Try to match every script against the text,
-- and return the one with the most matching characters.
local bestcount, bestscript, length = 0
for i = 1, codes_len do
local sc = codes[i]
-- Special case for "Hants", which is a special code that represents whichever of "Hant" or "Hans" best matches, or "Hani" if they match equally. This avoids having to list all three. In addition, "Hants" will be treated as the best match if there is at least one matching character, under the assumption that a Han script is desirable in terms that contain a mix of Han and other scripts (not counting those which use Jpan or Kore).
if sc == "Hants" then
local Hani = get_script("Hani")
if not Hant_chars then
Hant_chars = load_data("Modul:zh/data/ts")
Hans_chars = load_data("Modul:zh/data/st")
end
local t, s, found = 0, 0
-- This is faster than using mw.ustring.gmatch directly.
for ch in gmatch((ugsub(text, "[" .. Hani.characters .. "]", "\255%0")), "\255(.[\128-\191]*)") do
found = true
if Hant_chars[ch] then
t = t + 1
if Hans_chars[ch] then
s = s + 1
end
elseif Hans_chars[ch] then
s = s + 1
else
t, s = t + 1, s + 1
end
end
if found then
if t == s then
return Hani
end
return get_script(t > s and "Hant" or "Hans")
end
else
sc = get_script(sc)
if not length then
length = ulen(text)
end
-- Count characters by removing everything in the script's charset and comparing to the original length.
local charset = sc.characters
local count = charset and length - ulen((ugsub(text, "[" .. charset .. "]+", ""))) or 0
if count >= length then
return sc
elseif count > bestcount then
bestcount = count
bestscript = sc
end
end
end
-- Return best matching script, or otherwise None.
return bestscript or get_script("None")
end
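--[=[
A simplified sketch of the counting strategy used above: delete every character that belongs to a script and
compare lengths. It assumes single-byte (ASCII) text so that the standard string library can replace `ugsub` and
`ulen`, and it leaves out the "Hants" special case. The script table and `best_script` are hypothetical.

```lua
-- Each entry is a hypothetical script: a name plus the body of a Lua character class.
local scripts = {
	{name = "Latin", characters = "A-Za-z"},
	{name = "Digits", characters = "0-9"},
}

local function best_script(text, candidates)
	local length = #text
	if length == 0 then
		return "None"
	end
	local bestcount, bestname = 0
	for _, sc in ipairs(candidates) do
		-- Count matches by deleting them and comparing lengths.
		local count = length - #(text:gsub("[" .. sc.characters .. "]+", ""))
		if count >= length then
			-- Every character matched, so no other script can do better.
			return sc.name
		elseif count > bestcount then
			bestcount, bestname = count, sc.name
		end
	end
	return bestname or "None"
end
```

A full match returns immediately, which is why the order of the candidate list matters when two scripts share
characters.
]=]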
--[==[Returns a <code>Family</code> object for the language family that the language belongs to. See [[Module:families]].]==]
function Language:getFamily()
local family = self._familyObject
if family == nil then
family = self:getFamilyCode()
-- If the value is nil, it's cached as false.
family = family and get_family(family) or false
self._familyObject = family
end
return family or nil
end
--[==[Returns the family code in the language's data file.]==]
function Language:getFamilyCode()
local family = self._familyCode
if family == nil then
-- If the value is nil, it's cached as false.
family = self._data[3] or false
self._familyCode = family
end
return family or nil
end
function Language:getFamilyName()
local family = self._familyName
if family == nil then
family = self:getFamily()
-- If the value is nil, it's cached as false.
family = family and family:getCanonicalName() or false
self._familyName = family
end
return family or nil
end
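--[=[
The getters above cache a missing value as `false`, so that "not yet computed" ({nil}) can be told apart from
"computed, but absent". A minimal sketch of that idiom follows; `obj` and the `calls` counter are illustrative
only.

```lua
local calls = 0
local obj = {_data = {}}

function obj:getFamilyCode()
	local family = self._familyCode
	if family == nil then
		calls = calls + 1
		-- If the value is nil, it's cached as false.
		family = self._data[3] or false
		self._familyCode = family
	end
	-- Callers still see nil rather than false.
	return family or nil
end

local first = obj:getFamilyCode()
local second = obj:getFamilyCode()
```

Caching {nil} directly would not work here, since assigning {nil} to a field removes it and the lookup would be
repeated on every call.
]=]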
do
local function check_family(self, family)
if type(family) == "table" then
family = family:getCode()
end
if self:getFamilyCode() == family then
return true
end
local self_family = self:getFamily()
if self_family:inFamily(family) then
return true
-- If the family isn't a real family (e.g. creoles) check any ancestors.
elseif self_family:inFamily("qfa-not") then
local ancestors = self:getAncestors()
for _, ancestor in ipairs(ancestors) do
if ancestor:inFamily(family) then
return true
end
end
end
end
--[==[Check whether the language belongs to `family` (which can be a family code or object). A list of objects can be given in place of `family`; in that case, return true if the language belongs to any of the specified families. Note that some languages (in particular, certain creoles) can have multiple immediate ancestors potentially belonging to different families; in that case, return true if the language belongs to any of the specified families.]==]
function Language:inFamily(...)
if self:getFamilyCode() == nil then
return false
end
return check_inputs(self, check_family, false, ...)
end
end
function Language:getParent()
local parent = self._parentObject
if parent == nil then
parent = self:getParentCode()
-- If the value is nil, it's cached as false.
parent = parent and get_by_code(parent, nil, true, true) or false
self._parentObject = parent
end
return parent or nil
end
function Language:getParentCode()
local parent = self._parentCode
if parent == nil then
-- If the value is nil, it's cached as false.
parent = self._data.parent or false
self._parentCode = parent
end
return parent or nil
end
function Language:getParentName()
local parent = self._parentName
if parent == nil then
parent = self:getParent()
-- If the value is nil, it's cached as false.
parent = parent and parent:getCanonicalName() or false
self._parentName = parent
end
return parent or nil
end
--[==[Returns a table of the language's parents, ordered from the immediate parent outwards. The table is empty if the language has no parent.]==]
function Language:getParentChain()
local chain = self._parentChain
if chain == nil then
chain = {}
local parent, n = self:getParent(), 0
while parent do
n = n + 1
chain[n] = parent
parent = parent:getParent()
end
self._parentChain = chain
end
return chain
end
do
local function check_lang(self, lang)
for _, parent in ipairs(self:getParentChain()) do
if (type(lang) == "string" and lang or lang:getCode()) == parent:getCode() then
return true
end
end
end
function Language:hasParent(...)
return check_inputs(self, check_lang, false, ...)
end
end
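--[=[
A sketch of the parent-chain walk and the hasParent() check above, using plain tables in place of Language
objects. The codes "x-full", "x-mid" and "x-leaf" are made up for illustration, as are the names `parent_chain` and
`has_parent`.

```lua
local function parent_chain(lang)
	local chain, n = {}, 0
	local parent = lang.parent
	while parent do
		n = n + 1
		chain[n] = parent
		parent = parent.parent
	end
	return chain
end

local function has_parent(lang, code)
	for _, parent in ipairs(parent_chain(lang)) do
		if parent.code == code then
			return true
		end
	end
	return false
end

local full = {code = "x-full"}
local mid = {code = "x-mid", parent = full}
local leaf = {code = "x-leaf", parent = mid}
```

The chain lists the immediate parent first, so `parent_chain(leaf)` holds `mid` and then `full`.
]=]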
--[==[
If the language is etymology-only, this iterates through parents until a full language or family is found, and the
corresponding object is returned. If the language is a full language, then it simply returns itself.
]==]
function Language:getFull()
local full = self._fullObject
if full == nil then
full = self:getFullCode()
full = full == self._code and self or get_by_code(full)
self._fullObject = full
end
return full
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding code is returned. If the language is a full language, then it simply returns the
language code.
]==]
function Language:getFullCode()
return self._fullCode or self._code
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding canonical name is returned. If the language is a full language, then it simply returns
the canonical name of the language.
]==]
function Language:getFullName()
local full = self._fullName
if full == nil then
full = self:getFull():getCanonicalName()
self._fullName = full
end
return full
end
--[==[Returns a table of <code class="nf">Language</code> objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestors()
local ancestors = self._ancestorObjects
if ancestors == nil then
ancestors = {}
local ancestor_codes = self:getAncestorCodes()
if #ancestor_codes > 0 then
for _, ancestor in ipairs(ancestor_codes) do
insert(ancestors, get_by_code(ancestor, nil, true))
end
else
local fam = self:getFamily()
local protoLang = fam and fam:getProtoLanguage() or nil
-- For the cases where the current language is the proto-language
-- of its family, or an etymology-only language that is ancestral to that
-- proto-language, we need to step up a level higher right from the
-- start.
if protoLang and (
protoLang:getCode() == self._code or
(self:hasType("etymology-only") and protoLang:hasAncestor(self))
) then
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
while not protoLang and not (not fam or fam:getCode() == "qfa-not") do
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
insert(ancestors, protoLang)
end
self._ancestorObjects = ancestors
end
return ancestors
end
do
-- Avoid a language being its own ancestor via class inheritance. We only need to check for this if the language has inherited an ancestor table from its parent, because we never want to drop ancestors that have been explicitly set in the data.
-- Recursively iterate over ancestors until we either find self or run out. If self is found, return true.
local function check_ancestor(self, lang)
local codes = lang:getAncestorCodes()
if not codes then
return nil
end
for i = 1, #codes do
local code = codes[i]
if code == self._code then
return true
end
local anc = get_by_code(code, nil, true)
if check_ancestor(self, anc) then
return true
end
end
end
--[==[Returns a table of <code class="nf">Language</code> codes for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestorCodes()
if self._ancestorCodes then
return self._ancestorCodes
end
local data = self._data
local codes = data.ancestors
if codes == nil then
codes = {}
self._ancestorCodes = codes
return codes
end
codes = split(codes, ",", true, true)
self._ancestorCodes = codes
-- If there are no codes or the ancestors weren't inherited data, there's nothing left to check.
if #codes == 0 or self:getData(false, "raw").ancestors ~= nil then
return codes
end
local i, code = 1
while i <= #codes do
code = codes[i]
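-- Drop `code` only if it is self, or if its own ancestors lead back to self.
if code == self._code or check_ancestor(self, get_by_code(code, nil, true)) then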
remove(codes, i)
else
i = i + 1
end
end
return codes
end
end
--[==[Given a list of language objects or codes, returns true if at least one of them is an ancestor. This includes any etymology-only children of that ancestor. If the language's ancestor(s) are etymology-only languages, it will also return true for the parent(s) of those languages (e.g. if Vulgar Latin is the ancestor, it will also return true for its parent, Latin). However, a parent is excluded from this if the ancestor is also ancestral to that parent (e.g. if Classical Persian is the ancestor, Persian would return false, because Classical Persian is also ancestral to Persian).]==]
function Language:hasAncestor(...)
local function iterateOverAncestorTree(node, func, parent_check)
local ancestors = node:getAncestors()
local ancestorsParents = {}
for _, ancestor in ipairs(ancestors) do
-- When checking the parents of the other language, and the ancestor is also a parent, skip to the next ancestor, so that we exclude any etymology-only children of that parent that are not directly related (see below).
local ret = (parent_check or not node:hasParent(ancestor)) and
func(ancestor) or iterateOverAncestorTree(ancestor, func, parent_check)
if ret then
return ret
end
end
-- Check the parents of any ancestors. We don't do this if checking the parents of the other language, so that we exclude any etymology-only children of those parents that are not directly related (e.g. if the ancestor is Vulgar Latin and we are checking New Latin, we want it to return false because they are on different ancestral branches. As such, if we're already checking the parent of New Latin (Latin) we don't want to compare it to the parent of the ancestor (Latin), as this would be a false positive; it should be one or the other).
if not parent_check then
return nil
end
for _, ancestor in ipairs(ancestors) do
local ancestorParents = ancestor:getParentChain()
for _, ancestorParent in ipairs(ancestorParents) do
if ancestorParent:getCode() == self._code or ancestorParent:hasAncestor(ancestor) then
break
else
insert(ancestorsParents, ancestorParent)
end
end
end
for _, ancestorParent in ipairs(ancestorsParents) do
local ret = func(ancestorParent)
if ret then
return ret
end
end
end
local function do_iteration(otherlang, parent_check)
-- otherlang can't be self
if (type(otherlang) == "string" and otherlang or otherlang:getCode()) == self._code then
return false
end
repeat
if iterateOverAncestorTree(
self,
function(ancestor)
return ancestor:getCode() == (type(otherlang) == "string" and otherlang or otherlang:getCode())
end,
parent_check
) then
return true
elseif type(otherlang) == "string" then
otherlang = get_by_code(otherlang, nil, true)
end
otherlang = otherlang:getParent()
parent_check = false
until not otherlang
end
local parent_check = true
for _, otherlang in ipairs{...} do
local ret = do_iteration(otherlang, parent_check)
if ret then
return true
end
end
return false
end
do
local function construct_node(lang, memo)
local branch, ancestors = {lang = lang:getCode()}
memo[lang:getCode()] = branch
for _, ancestor in ipairs(lang:getAncestors()) do
if ancestors == nil then
ancestors = {}
end
insert(ancestors, memo[ancestor:getCode()] or construct_node(ancestor, memo))
end
branch.ancestors = ancestors
return branch
end
function Language:getAncestorChain()
local chain = self._ancestorChain
if chain == nil then
chain = construct_node(self, {})
self._ancestorChain = chain
end
return chain
end
end
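--[=[
A sketch of the memoized tree construction used by getAncestorChain() above, driven by a plain lookup table
instead of Language objects. The codes and the `ancestors_of` table are invented for illustration.

```lua
local ancestors_of = {
	["x-creole"] = {"x-a", "x-b"},
	["x-a"] = {"x-proto"},
	["x-b"] = {"x-proto"},
}

local function construct_node(code, memo)
	local branch, ancestors = {lang = code}
	-- Register the node before recursing, so that shared ancestors are reused.
	memo[code] = branch
	for _, anc in ipairs(ancestors_of[code] or {}) do
		if ancestors == nil then
			ancestors = {}
		end
		table.insert(ancestors, memo[anc] or construct_node(anc, memo))
	end
	branch.ancestors = ancestors
	return branch
end

local tree = construct_node("x-creole", {})
```

Because of the memo table, both branches of the hypothetical creole point at the same node for "x-proto" rather
than at two copies of it.
]=]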
function Language:getAncestorChainOld()
-- Cached separately from getAncestorChain(), which stores a tree (not a flat list) in `_ancestorChain`.
local chain = self._ancestorChainOld
if chain == nil then
chain = {}
local step = self
while true do
local ancestors = step:getAncestors()
step = #ancestors == 1 and ancestors[1] or nil
if not step then
break
end
insert(chain, step)
end
self._ancestorChainOld = chain
end
return chain
end
local function fetch_descendants(self, fmt)
local descendants, family = {}, self:getFamily()
-- Iterate over all three datasets.
for _, data in ipairs{
require("Modul:languages/code to canonical name"),
require("Modul:etymology languages/code to canonical name"),
require("Modul:families/code to canonical name"),
} do
for code in pairs(data) do
local lang = get_by_code(code, nil, true, true)
-- Test for a descendant. Earlier tests weed out most candidates, while the more intensive tests are only used sparingly.
if (
code ~= self._code and -- Not self.
lang:inFamily(family) and -- In the same family.
(
family:getProtoLanguageCode() == self._code or -- Self is the protolanguage.
self:hasDescendant(lang) or -- Full hasDescendant check.
(lang:getFullCode() == self._code and not self:hasAncestor(lang)) -- Etymology-only child which isn't an ancestor.
)
) then
if fmt == "object" then
insert(descendants, lang)
elseif fmt == "code" then
insert(descendants, code)
elseif fmt == "name" then
insert(descendants, lang:getCanonicalName())
end
end
end
end
return descendants
end
function Language:getDescendants()
local descendants = self._descendantObjects
if descendants == nil then
descendants = fetch_descendants(self, "object")
self._descendantObjects = descendants
end
return descendants
end
function Language:getDescendantCodes()
local descendants = self._descendantCodes
if descendants == nil then
descendants = fetch_descendants(self, "code")
self._descendantCodes = descendants
end
return descendants
end
function Language:getDescendantNames()
local descendants = self._descendantNames
if descendants == nil then
descendants = fetch_descendants(self, "name")
self._descendantNames = descendants
end
return descendants
end
do
local function check_lang(self, lang)
if type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasAncestor(self) then
return true
end
end
function Language:hasDescendant(...)
return check_inputs(self, check_lang, false, ...)
end
end
local function fetch_children(self, fmt)
local m_etym_data = require(etymology_languages_data_module)
local self_code, children = self._code, {}
for code, lang in pairs(m_etym_data) do
local _lang = lang
repeat
local parent = _lang.parent
if parent == self_code then
if fmt == "object" then
insert(children, get_by_code(code, nil, true))
elseif fmt == "code" then
insert(children, code)
elseif fmt == "name" then
insert(children, lang[1])
end
break
end
_lang = m_etym_data[parent]
until not _lang
end
return children
end
function Language:getChildren()
local children = self._childObjects
if children == nil then
children = fetch_children(self, "object")
self._childObjects = children
end
return children
end
function Language:getChildrenCodes()
local children = self._childCodes
if children == nil then
children = fetch_children(self, "code")
self._childCodes = children
end
return children
end
function Language:getChildrenNames()
local children = self._childNames
if children == nil then
children = fetch_children(self, "name")
self._childNames = children
end
return children
end
function Language:hasChild(...)
local lang = ...
if not lang then
return false
elseif type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasParent(self) then
return true
end
return self:hasChild(select(2, ...))
end
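--[=[
Illustrative sketch only (not part of this module): `hasChild` above walks its variadic arguments by testing the first
one and then recursing on `select(2, ...)`. The same pattern is shown below in plain Lua. The helper `any_match` and
its predicate argument are hypothetical names introduced for this example.

```lua
-- Returns true if `pred` holds for any of the variadic arguments.
local function any_match(pred, ...)
	local first = ...
	if first == nil then
		-- No arguments left (or a nil argument): nothing matched.
		return false
	elseif pred(first) then
		return true
	end
	-- Drop the first argument and try the rest.
	return any_match(pred, select(2, ...))
end

local function is_b(x)
	return x == "b"
end

print(any_match(is_b, "a", "b")) --> true
print(any_match(is_b, "a", "c")) --> false
```

As in `hasChild`, a nil or missing argument ends the search, so later arguments after a nil are never inspected.
]=]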
--[==[Returns the name of the main category of that language. For a full language, the prefix {{code|lua|"Bahasa "}} is added unless the canonical name already begins with "Bahasa" or "Lek" (in either capitalization). For example, a full language whose canonical name is "Perancis" yields {{code|lua|"Bahasa Perancis"}}, whose category is at [[:Kategori:Bahasa Perancis]]. Unless optional argument <code>nocap</code> is given, the first letter of the returned value will be capitalized. This capitalization is correct for category names, but not if the name begins with a lowercase letter and the returned value of this function is used in the middle of a sentence.]==]
function Language:getCategoryName(nocap)
local name = self._categoryName
if name == nil then
name = self:getCanonicalName()
-- If a substrate, omit any leading article.
if self:getFamilyCode() == "qfa-sub" then
name = name:gsub("^sebuah ", ""):gsub("^suatu ", "")
end
-- Only add "Bahasa " prefix if a full language.
if self:hasType("full") then
-- Unless the canonical name already starts with "Bahasa", "bahasa", "Lek" or "lek", add "Bahasa " prefix.
if not (match(name, "^[Bb]ahasa") or match(name, "^[Ll]ek")) then
name = "Bahasa " .. name
end
end
self._categoryName = name
end
if nocap then
return name
end
return mw.getContentLanguage():ucfirst(name)
end
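--[=[
Illustrative sketch only (not part of this module): the prefix rule used by `getCategoryName` can be tried out in plain
Lua as below. The helper `category_name`, its `is_full` flag and the sample names are assumptions made for this
example; the real method also strips leading articles from substrate names, caches its result and capitalizes the
first letter.

```lua
-- Adds "Bahasa " to a full language's name unless it already starts with
-- "Bahasa"/"bahasa" or "Lek"/"lek".
local function category_name(canonical_name, is_full)
	local name = canonical_name
	if is_full and not (name:match("^[Bb]ahasa") or name:match("^[Ll]ek")) then
		name = "Bahasa " .. name
	end
	return name
end

print(category_name("Melayu", true)) --> Bahasa Melayu
print(category_name("Bahasa Isyarat Malaysia", true)) --> Bahasa Isyarat Malaysia
print(category_name("Latin Klasik", false)) --> Latin Klasik
```
]=]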
--[==[Creates a link to the category; the link text is the language's display form (see `getDisplayForm`).]==]
function Language:makeCategoryLink()
return make_link(self, ":Kategori:" .. self:getCategoryName(), self:getDisplayForm())
end
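--[==[Returns the standard characters of the language, as given by the <code>standard_chars</code> setting in its data. If the setting is not a table (i.e. it is a string or {{code|lua|nil}}), it is returned as-is. Otherwise, if <code>sc</code> (a script object or script code) is omitted or is {{code|lua|"None"}}, the values for all scripts are concatenated and returned; if <code>sc</code> is given, the characters for that script are returned, followed by any characters stored at key {{code|lua|1}}, or {{code|lua|nil}} if the script has no entry.]==]
function Language:getStandardCharacters(sc)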
local standard_chars = self._data.standard_chars
if type(standard_chars) ~= "table" then
return standard_chars
elseif sc and type(sc) ~= "string" then
check_object("script", nil, sc)
sc = sc:getCode()
end
if (not sc) or sc == "None" then
local scripts = {}
for _, script in pairs(standard_chars) do
insert(scripts, script)
end
return concat(scripts)
end
if standard_chars[sc] then
return standard_chars[sc] .. (standard_chars[1] or "")
end
end
--[==[
Strip diacritics from display text `text` (in a language-specific fashion), which is in the script `sc`. If `sc` is
omitted or {nil}, the script is autodetected. This also strips certain punctuation characters from the end and (in the
case of Spanish upside-down question mark and exclamation points) from the beginning; strips any whitespace at the
end of the text or between the text and final stripped punctuation characters; and applies some language-specific
Unicode normalizations to replace discouraged characters with their prescribed alternatives. Return the stripped text.
]==]
function Language:stripDiacritics(text, sc)
if (not text) or text == "" then
return text
end
sc = checkScript(text, self, sc)
text = normalize(text, sc)
-- FIXME, rename makeEntryName to stripDiacritics and get rid of second and third return values
-- everywhere
text, _, _ = iterateSectionSubstitutions(self, text, sc, nil, nil,
self._data.strip_diacritics or self._data.entry_name, "strip_diacritics", "stripDiacritics")
text = umatch(text, "^[¿¡]?(.-[^%s%p].-)%s*[؟?!;՛՜ ՞ ՟?!︖︕।॥။၊་།]?$") or text
return text
end
--[==[
Convert a ''logical'' pagename (the pagename as it appears to the user, after diacritics and punctuation have been
stripped) to a ''physical'' pagename (the pagename as it appears in the MediaWiki database). Reasons for a difference
between the two are (a) unsupported titles such as `[ ]` (with square brackets in them), `#` (pound/hash sign) and
`¯\_(ツ)_/¯` (with underscores), as well as overly long titles of various sorts; (b) "mammoth" pages that are split into
parts (e.g. `a`, which is split into physical pagenames `a/languages A to L` and `a/languages M to Z`). For almost all
purposes, you should work with logical and not physical pagenames. But there are certain use cases that require physical
pagenames, such as checking the existence of a page or retrieving a page's contents.
`pagename` is the logical pagename to be converted. `is_reconstructed_or_appendix` indicates whether the page is in the
`Reconstruction` or `Appendix` namespaces. If it is omitted or has the value {nil}, the pagename is checked for an
initial asterisk, and if found, the page is assumed to be a `Reconstruction` page. Setting a value of `false` or `true`
to `is_reconstructed_or_appendix` disables this check and allows for mainspace pagenames that begin with an asterisk.
]==]
function Language:logicalToPhysical(pagename, is_reconstructed_or_appendix)
-- FIXME: This probably shouldn't happen but it happens when makeEntryName() receives nil.
if pagename == nil then
track("nil-passed-to-logicalToPhysical")
return nil
end
local initial_asterisk
if is_reconstructed_or_appendix == nil then
local pagename_minus_initial_asterisk
initial_asterisk, pagename_minus_initial_asterisk = pagename:match("^(%*)(.*)$")
if pagename_minus_initial_asterisk then
is_reconstructed_or_appendix = true
pagename = pagename_minus_initial_asterisk
elseif self:hasType("appendix-constructed") then
is_reconstructed_or_appendix = true
end
end
if not is_reconstructed_or_appendix then
-- Check if the pagename is a listed unsupported title.
local unsupportedTitles = load_data(links_data_module).unsupported_titles
if unsupportedTitles[pagename] then
return "Unsupported titles/" .. unsupportedTitles[pagename]
end
end
-- Set `unsupported` as true if certain conditions are met.
local unsupported
-- Check if there's an unsupported character. \239\191\189 is the replacement character U+FFFD, which can't be typed
-- directly here due to an abuse filter. Unix-style dot-slash notation is also unsupported, as it is used for
-- relative paths in links, as are 3 or more consecutive tildes. Note: match is faster with magic
-- characters/charsets; find is faster with plaintext.
if (
match(pagename, "[#<>%[%]_{|}]") or
find(pagename, "\239\191\189") or
match(pagename, "%f[^%z/]%.%.?%f[%z/]") or
find(pagename, "~~~")
) then
unsupported = true
-- If it looks like an interwiki link.
elseif find(pagename, ":") then
local prefix = gsub(pagename, "^:*(.-):.*", ulower)
if (
load_data("Modul:data/namespaces")[prefix] or
load_data("Modul:data/interwikis")[prefix]
) then
unsupported = true
end
end
-- Escape unsupported characters so they can be used in titles. ` is used as a delimiter for this, so a raw use of
-- it in an unsupported title is also escaped here to prevent interference; this is only done with unsupported
-- titles, though, so inclusion won't in itself mean a title is treated as unsupported (which is why it's excluded
-- from the earlier test).
if unsupported then
-- FIXME: This conversion needs to be different for reconstructed pages with unsupported characters. There
-- aren't any currently, but if there ever are, we need to fix this e.g. to put them in something like
-- Reconstruction:Proto-Indo-European/Unsupported titles/`lowbar``num`.
local unsupported_characters = load_data(links_data_module).unsupported_characters
pagename = pagename:gsub("[#<>%[%]_`{|}\239]\191?\189?", unsupported_characters)
:gsub("%f[^%z/]%.%.?%f[%z/]", function(m)
return (gsub(m, "%.", "`period`"))
end)
:gsub("~~~+", function(m)
return (gsub(m, "~", "`tilde`"))
end)
pagename = "Unsupported titles/" .. pagename
elseif not is_reconstructed_or_appendix then
-- Check if this is a mammoth page. If so, which subpage should we link to?
local m_links_data = load_data(links_data_module)
local mammoth_page_type = m_links_data.mammoth_pages[pagename]
if mammoth_page_type then
local canonical_name = "Bahasa " .. self:getFullName()
if canonical_name ~= "Rentas bahasa" and canonical_name ~= "Bahasa Melayu" then
local this_subpage
local L2_sort_key = get_L2_sort_key(canonical_name)
for _, subpage_spec in ipairs(m_links_data.mammoth_page_subpage_types[mammoth_page_type]) do
-- unpack() fails utterly on data loaded using mw.loadData() even if offsets are given
local subpage, pattern = subpage_spec[1], subpage_spec[2]
if pattern == true or L2_sort_key:match(pattern) then
this_subpage = subpage
break
end
end
if not this_subpage then
error(("Internal error: Bad data in mammoth_page_subpage_types in [[Modul:links/data]] for mammoth page %s, type %s; last entry didn't have 'true' in it"):format(
pagename, mammoth_page_type))
end
pagename = pagename .. "/" .. this_subpage
end
end
end
return (initial_asterisk or "") .. pagename
end
--[==[
Strip the diacritics from a display pagename and convert the resulting logical pagename into a physical pagename.
This allows you, for example, to retrieve the contents of the page or check its existence. WARNING: This is deprecated
and will be going away. It is a simple composition of `self:stripDiacritics` and `self:logicalToPhysical`; most callers
only want the former, and if you need both, call them both yourself.
`text` and `sc` are as in `self:stripDiacritics`, and `is_reconstructed_or_appendix` is as in `self:logicalToPhysical`.
]==]
function Language:makeEntryName(text, sc, is_reconstructed_or_appendix)
return self:logicalToPhysical(self:stripDiacritics(text, sc), is_reconstructed_or_appendix)
end
--[==[Generates alternative forms using a specified method, and returns them as a table. If no method is specified, returns a table containing only the input term.]==]
function Language:generateForms(text, sc)
local generate_forms = self._data.generate_forms
if generate_forms == nil then
return {text}
end
sc = checkScript(text, self, sc)
return require("Modul:" .. generate_forms).generateForms(text, self, sc)
end
--[==[Creates a sort key for the given stripped text, following the rules appropriate for the language. This removes
diacritical marks from the stripped text if they are not considered significant for sorting, and may perform some other
changes. Any initial hyphen is also removed, and anything in parentheses is removed as well.
The <code>sort_key</code> setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the stripped text and returns a sortkey.]==]
function Language:makeSortKey(text, sc)
if (not text) or text == "" then
return text
end
if match(text, "<[^<>]+>") then
track("track HTML tag")
end
-- Remove directional characters, bold, italics, soft hyphens, strip markers and HTML tags.
-- FIXME: Partly duplicated with remove_formatting() in [[Module:links]].
text = ugsub(text, "[\194\173\226\128\170-\226\128\174\226\129\166-\226\129\169]", "")
text = text:gsub("('*)'''(.-'*)'''", "%1%2"):gsub("('*)''(.-'*)''", "%1%2")
text = gsub(unstrip(text), "<[^<>]+>", "")
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
-- Remove initial hyphens and * unless the term only consists of spacing + punctuation characters.
text = ugsub(text, "^([-]*)[-־ـ᠊*]+([-]*)(.*[^%s%p].*)", "%1%2%3")
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text = removeCarets(text, sc)
-- For languages with dotted dotless i, ensure that "İ" is sorted as "i", and "I" is sorted as "ı".
if self:hasDottedDotlessI() then
text = gsub(text, "I\204\135", "i") -- decomposed "İ"
:gsub("I", "ı")
text = sc:toFixedNFD(text)
end
-- Convert to lowercase, make the sortkey, then convert to uppercase. Where the language has dotted dotless i, it is
-- usually not necessary to convert "i" to "İ" and "ı" to "I" first, because "I" will always be interpreted as
-- conventional "I" (not dotless "I") by any sorting algorithms, which will have been taken into account by the
-- sortkey substitutions themselves. However, if no sortkey substitutions have been specified, then conversion is
-- necessary so as to prevent "i" and "ı" both being sorted as "I".
--
-- An exception is made for scripts that (sometimes) sort by scraping page content, as that means they are sensitive
-- to changes in capitalization (as it changes the target page).
if not sc:sortByScraping() then
text = ulower(text)
end
local actual_substitution_data
-- Don't trim whitespace here because it's significant at the beginning of a sort key or sort base.
text, _, actual_substitution_data = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.sort_key,
"sort_key", "makeSortKey", "notrim")
if not sc:sortByScraping() then
if self:hasDottedDotlessI() and not actual_substitution_data then
text = text:gsub("ı", "I"):gsub("i", "İ")
text = sc:toFixedNFC(text)
end
text = uupper(text)
end
-- Remove parentheses, as long as they are either preceded or followed by something.
text = gsub(text, "(.)[()]+", "%1"):gsub("[()]+(.)", "%1")
text = escape_risky_characters(text)
return text
end
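--[=[
Illustrative sketch only (not part of this module): the final parenthesis-stripping step of `makeSortKey` can be
reproduced in plain Lua as below. The helper name `strip_parens` is hypothetical, and the sketch uses the byte-based
`string.gsub` directly.

```lua
-- Removes runs of parentheses that are preceded or followed by another character.
local function strip_parens(text)
	-- Wrapping each call in parentheses discards gsub's second return value (the match count).
	text = (string.gsub(text, "(.)[()]+", "%1"))
	text = (string.gsub(text, "[()]+(.)", "%1"))
	return text
end

print(strip_parens("colo(u)r")) --> colour
print(strip_parens("(a)")) --> a
```

A string consisting only of parentheses is not emptied by these two patterns, because each pattern requires one
neighbouring character to anchor on.
]=]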
--[==[Create the form used as a basis for display text and transliteration. FIXME: Rename to correctInputText().]==]
local function processDisplayText(text, self, sc, keepCarets, keepPrefixes)
local subbedChars = {}
text, subbedChars = doTempSubstitutions(text, subbedChars, keepCarets)
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, self._data.display_text,
"display_text", "makeDisplayText")
text = removeCarets(text, sc)
-- Remove any interwiki link prefixes (unless they have been escaped or this has been disabled).
if find(text, ":") and not keepPrefixes then
local rep
repeat
text, rep = gsub(text, "\\\\(\\*:)", "\3%1")
until rep == 0
text = gsub(text, "\\:", "\4")
while true do
local prefix = gsub(text, "^(.-):.+", function(m1)
return (gsub(m1, "\244[\128-\191]*", ""))
end)
-- Check if the prefix is an interwiki, though ignore capitalised Wikikamus:, which is a namespace.
if not prefix or prefix == text or prefix == "Wikikamus"
or not (load_data("Modul:data/interwikis")[ulower(prefix)] or prefix == "") then
break
end
text = gsub(text, "^(.-):(.*)", function(m1, m2)
local ret = {}
for subbedChar in gmatch(m1, "\244[\128-\191]*") do
insert(ret, subbedChar)
end
return concat(ret) .. m2
end)
end
text = gsub(text, "\3", "\\"):gsub("\4", ":")
end
return text, subbedChars
end
--[==[Make the display text (i.e. what is displayed on the page).]==]
function Language:makeDisplayText(text, sc, keepPrefixes)
if not text or text == "" then
return text
end
local subbedChars
text, subbedChars = processDisplayText(text, self, sc, nil, keepPrefixes)
text = escape_risky_characters(text)
return undoTempSubstitutions(text, subbedChars)
end
--[==[Transliterates the text from the given script into the Latin script (see
[[Wiktionary:Transliteration and romanization]]). The language must have the <code>translit</code> property for this to
work; if it is not present, {{code|lua|nil}} is returned.
The <code>sc</code> parameter is handled by the transliteration module, and how it is handled is specific to that
module. Some transliteration modules may tolerate {{code|lua|nil}} as the script, others require it to be one of the
possible scripts that the module can transliterate, and will throw an error if it's not one of them. For this reason,
the <code>sc</code> parameter should always be provided when writing non-language-specific code.
The <code>module_override</code> parameter is used to override the default module that is used to provide the
transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no
default module yet, or you want to demonstrate an alternative version of a transliteration module before making it
official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked
by [[Wiktionary:Tracking/languages/module_override]].
'''Known bugs''':
* This function assumes {tr(s1) .. tr(s2) == tr(s1 .. s2)}. When this assertion fails, wikitext markups like <nowiki>'''</nowiki> can cause wrong transliterations.
* HTML entities like <code>&apos;</code>, often used to escape wikitext markups, do not work.
]==]
function Language:transliterate(text, sc, module_override)
-- If there is no text, or the text is just "-", return it unchanged.
if not text or text == "" or text == "-" then
return text
end
-- If the script is not transliteratable (and no override is given), return nil.
sc = checkScript(text, self, sc)
if not (sc:isTransliterated() or module_override) then
-- temporary tracking to see if/when this gets triggered
track("non-transliterable")
track("non-transliterable/" .. self._code)
track("non-transliterable/" .. sc:getCode())
track("non-transliterable/" .. sc:getCode() .. "/" .. self._code)
return nil
end
-- Remove any strip markers.
text = unstrip(text)
-- Do not process the formatting into PUA characters for certain languages.
local processed = load_data(languages_data_module).substitution[self._code] ~= "none"
-- Get the display text with the keepCarets flag set.
local subbedChars
if processed then
text, subbedChars = processDisplayText(text, self, sc, true)
end
-- Transliterate (using the module override if applicable).
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, true, module_override or
self._data.translit, "translit", "tr")
if not text then
return nil
end
-- Incomplete transliterations return nil.
local charset = sc.characters
if charset and umatch(text, "[" .. charset .. "]") then
-- Remove any characters in Latin, which includes Latin characters also included in other scripts (as these are
-- false positives), as well as any PUA substitutions. Anything remaining should only be script code "None"
-- (e.g. numerals).
local check_text = ugsub(text, "[" .. get_script("Latn").characters .. "-]+", "")
-- Set none_is_last_resort_only flag, so that any non-None chars will cause a script other than "None" to be
-- returned.
if find_best_script_without_lang(check_text, true):getCode() ~= "None" then
return nil
end
end
if processed then
text = escape_risky_characters(text)
text = undoTempSubstitutions(text, subbedChars)
end
-- If the script does not use capitalization, then capitalize any letters of the transliteration which are
-- immediately preceded by a caret (and remove the caret).
if text and not sc:hasCapitalization() and text:find("^", 1, true) then
text = processCarets(text, "%^([\128-\191\244]*%*?)([^\128-\191\244][\128-\191]*)", function(m1, m2)
return m1 .. uupper(m2)
end)
end
-- Track module overrides.
if module_override ~= nil then
track("module_override")
end
return text
end
do
local function handle_language_spec(self, spec, sc)
local ret = self["_" .. spec]
if ret == nil then
ret = self._data[spec]
if type(ret) == "string" then
ret = list_to_set(split(ret, ",", true, true))
end
self["_" .. spec] = ret
end
if type(ret) == "table" then
ret = ret[sc:getCode()]
end
return not not ret
end
function Language:overrideManualTranslit(sc)
return handle_language_spec(self, "override_translit", sc)
end
function Language:link_tr(sc)
return handle_language_spec(self, "link_tr", sc)
end
end
--[==[Returns {{code|lua|true}} if the language has a transliteration module, or {{code|lua|false}} if it doesn't.]==]
function Language:hasTranslit()
return not not self._data.translit
end
--[==[Returns {{code|lua|true}} if the language uses the letters I/ı and İ/i, or {{code|lua|false}} if it doesn't.]==]
function Language:hasDottedDotlessI()
return not not self._data.dotted_dotless_i
end
function Language:toJSON(opts)
local strip_diacritics, strip_diacritics_patterns, strip_diacritics_remove_diacritics = self._data.strip_diacritics
if strip_diacritics then
if strip_diacritics.from then
strip_diacritics_patterns = {}
for i, from in ipairs(strip_diacritics.from) do
insert(strip_diacritics_patterns, {from = from, to = strip_diacritics.to[i] or ""})
end
end
strip_diacritics_remove_diacritics = strip_diacritics.remove_diacritics
end
-- mainCode should only end up non-nil if dontCanonicalizeAliases is passed to make_object().
-- props should either contain zero-argument functions to compute the value, or the value itself.
local props = {
ancestors = function() return self:getAncestorCodes() end,
canonicalName = function() return self:getCanonicalName() end,
categoryName = function() return self:getCategoryName("nocap") end,
code = self._code,
mainCode = self._mainCode,
parent = function() return self:getParentCode() end,
full = function() return self:getFullCode() end,
stripDiacriticsPatterns = strip_diacritics_patterns,
stripDiacriticsRemoveDiacritics = strip_diacritics_remove_diacritics,
family = function() return self:getFamilyCode() end,
aliases = function() return self:getAliases() end,
varieties = function() return self:getVarieties() end,
otherNames = function() return self:getOtherNames() end,
scripts = function() return self:getScriptCodes() end,
type = function() return keys_to_list(self:getTypes()) end,
wikimediaLanguages = function() return self:getWikimediaLanguageCodes() end,
wikidataItem = function() return self:getWikidataItem() end,
wikipediaArticle = function() return self:getWikipediaArticle(true) end,
}
local ret = {}
for prop, val in pairs(props) do
if not (opts and opts.skip_fields and opts.skip_fields[prop]) then
if type(val) == "function" then
ret[prop] = val()
else
ret[prop] = val
end
end
end
-- Use `deep_copy` when returning a table, so that there are no editing restrictions imposed by `mw.loadData`.
return opts and opts.lua_table and deep_copy(ret) or to_json(ret, opts)
end
function export.getDataModuleName(code)
local letter = match(code, "^(%l)%l%l?$")
return "Modul:" .. (
letter == nil and "languages/data/exceptional" or
#code == 2 and "languages/data/2" or
"languages/data/3/" .. letter
)
end
get_data_module_name = export.getDataModuleName
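--[=[
Illustrative sketch only (not part of this module): the mapping from a language code to its data module, as done by
`export.getDataModuleName` above, restated as a standalone plain-Lua function so it can be tried outside Scribunto.
The name `data_module_name` and the sample codes are chosen for this example only.

```lua
-- Two-letter codes share one module, three-letter codes are split by first
-- letter, and anything else (e.g. codes containing a hyphen) is "exceptional".
local function data_module_name(code)
	local letter = string.match(code, "^(%l)%l%l?$")
	return "Modul:" .. (
		letter == nil and "languages/data/exceptional" or
		#code == 2 and "languages/data/2" or
		"languages/data/3/" .. letter
	)
end

print(data_module_name("ms")) --> Modul:languages/data/2
print(data_module_name("zsm")) --> Modul:languages/data/3/z
print(data_module_name("gem-pro")) --> Modul:languages/data/exceptional
```
]=]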
function export.getExtraDataModuleName(code)
return get_data_module_name(code) .. "/extra"
end
get_extra_data_module_name = export.getExtraDataModuleName
do
local function make_stack(data)
local key_types = {
[2] = "unique",
aliases = "unique",
otherNames = "unique",
type = "append",
varieties = "unique",
wikipedia_article = "unique",
wikimedia_codes = "unique"
}
local function __index(self, k)
local stack, key_type = getmetatable(self), key_types[k]
-- Data that isn't inherited from the parent.
if key_type == "unique" then
local v = stack[stack[make_stack]][k]
if v == nil then
local layer = stack[0]
if layer then -- Could be false if there's no extra data.
v = layer[k]
end
end
return v
-- Data that is appended by each generation.
elseif key_type == "append" then
local parts, offset, n = {}, 0, stack[make_stack]
for i = 1, n do
local part = stack[i][k]
if part == nil then
offset = offset + 1
else
parts[i - offset] = part
end
end
return offset ~= n and concat(parts, ",") or nil
end
local n = stack[make_stack]
while true do
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
local v = layer[k]
if v ~= nil then
return v
end
n = n - 1
end
end
local function __newindex()
error("table is read-only")
end
local function __pairs(self)
-- Iterate down the stack, caching keys to avoid duplicate returns.
local stack, seen = getmetatable(self), {}
local n = stack[make_stack]
local iter, state, k, v = pairs(stack[n])
return function()
repeat
repeat
k = iter(state, k)
if k == nil then
n = n - 1
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
iter, state, k = pairs(layer)
end
until not (k == nil or seen[k])
-- Get the value via a lookup, as the one returned by the
-- iterator will be the raw value from the current layer,
-- which may not be the one __index will return for that
-- key. Also memoize the key in `seen` (even if the lookup
-- returns nil) so that it doesn't get looked up again.
-- TODO: store values in `self`, avoiding the need to create
-- the `seen` table. The iterator will need to iterate over
-- `self` with `next` first to find these on future loops.
v, seen[k] = self[k], true
until v ~= nil
return k, v
end
end
local __ipairs = require(table_module).indexIpairs
function make_stack(data)
local stack = {
data,
[make_stack] = 1, -- stores the length and acts as a sentinel to confirm a given metatable is a stack.
__index = __index,
__newindex = __newindex,
__pairs = __pairs,
__ipairs = __ipairs,
}
stack.__metatable = stack
return setmetatable({}, stack), stack
end
return make_stack(data)
end
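--[=[
Illustrative sketch only (not part of this module): a much-simplified version of the layered lookup that `make_stack`
sets up, in which the newest layer wins and older layers act as fallbacks. The helper `layered` and the sample data
are hypothetical; the real stack additionally handles "unique" and "append" keys, extra data at key 0, and the
`__pairs`/`__ipairs` metamethods.

```lua
-- Builds a read-only proxy table that looks keys up from the last layer down to the first.
local function layered(...)
	local layers = {...}
	return setmetatable({}, {
		__index = function(_, k)
			for i = #layers, 1, -1 do
				local v = layers[i][k]
				-- Only nil falls through, so a child can override a value with false.
				if v ~= nil then
					return v
				end
			end
			return nil
		end,
		__newindex = function()
			error("table is read-only")
		end,
	})
end

local parent = {family = "example-family", translit = "example-translit"}
local child = {translit = false}
local data = layered(parent, child)

print(data.family) --> example-family
print(data.translit) --> false
```

Checking `v ~= nil` rather than truthiness matters here: it lets a child layer switch off an inherited setting by
storing `false`.
]=]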
local function get_stack(data)
local stack = getmetatable(data)
return stack and type(stack) == "table" and stack[make_stack] and stack or nil
end
--[==[
<span style="color: var(--wikt-palette-red,#BA0000)">This function is not for use in entries or other content pages.</span>
Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes. If `extra` is set, any extra data in the relevant `/extra` module will be included. (Note that it will be included anyway if it has already been loaded into the language object.) If `raw` is set, then the returned data will not contain any data inherited from parent objects.
-- Do NOT use these methods!
-- All uses should be pre-approved on the talk page!
]==]
function Language:getData(extra, raw)
if extra then
self:loadInExtraData()
end
local data = self._data
-- If raw is not set, just return the data.
if not raw then
return data
end
local stack = get_stack(data)
-- If there isn't a stack or its length is 1, return the data. Extra data (if any) will be included, as it's stored at key 0 and doesn't affect the reported length.
if stack == nil then
return data
end
local n = stack[make_stack]
if n == 1 then
return data
end
local extra = stack[0]
-- If there isn't any extra data, return the top layer of the stack.
if extra == nil then
return stack[n]
end
-- If there is, return a new stack which has the top layer at key 1 and the extra data at key 0.
data, stack = make_stack(stack[n])
stack[0] = extra
return data
end
function Language:loadInExtraData()
-- Only full languages have extra data.
if not self:hasType("language", "full") then
return
end
local data = self._data
-- If there's no stack, create one.
local stack = get_stack(self._data)
if stack == nil then
data, stack = make_stack(data)
-- If already loaded, return.
elseif stack[0] ~= nil then
return
end
self._data = data
-- Load extra data from the relevant module and add it to the stack at key 0, so that the __index and __pairs metamethods will pick it up, since they iterate down the stack until they run out of layers.
local code = self._code
local modulename = get_extra_data_module_name(code)
-- No data cached as false.
stack[0] = modulename and load_data(modulename)[code] or false
end
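--[==[Returns the name of the module containing the language's data. For etymology-only languages, this is the etymology languages data module; otherwise, it is the module determined from the language's code (for example, [[Modul:languages/data/2]] for two-letter codes).]==]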
function Language:getDataModuleName()
local name = self._dataModuleName
if name == nil then
name = self:hasType("etymology-only") and etymology_languages_data_module or
get_data_module_name(self._mainCode or self._code)
self._dataModuleName = name
end
return name
end
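--[==[Returns the name of the module containing the language's extra data, i.e. the <code>/extra</code> subpage of its data module. Returns {{code|lua|nil}} for etymology-only languages, which have no extra data module.]==]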
function Language:getExtraDataModuleName()
local name = self._extraDataModuleName
if name == nil then
name = not self:hasType("etymology-only") and get_extra_data_module_name(self._mainCode or self._code) or false
self._extraDataModuleName = name
end
return name or nil
end
function export.makeObject(code, data, dontCanonicalizeAliases)
local data_type = type(data)
if data_type ~= "table" then
error(("bad argument #2 to 'makeObject' (table expected, got %s)"):format(data_type))
end
-- Convert any aliases.
local input_code = code
code = normalize_code(code)
input_code = dontCanonicalizeAliases and input_code or code
local parent
if data.parent then
parent = get_by_code(data.parent, nil, true, true)
else
parent = Language
end
parent.__index = parent
local lang = {_code = input_code}
-- This can only happen if dontCanonicalizeAliases is passed to make_object().
if code ~= input_code then
lang._mainCode = code
end
local parent_data = parent._data
if parent_data == nil then
-- Full code is the same as the code.
lang._fullCode = parent._code or code
else
-- Copy full code.
lang._fullCode = parent._fullCode
local stack = get_stack(parent_data)
if stack == nil then
parent_data, stack = make_stack(parent_data)
end
-- Insert the input data as the new top layer of the stack.
local n = stack[make_stack] + 1
data, stack[n], stack[make_stack] = parent_data, data, n
end
lang._data = data
return setmetatable(lang, parent)
end
make_object = export.makeObject
end
--[==[Finds the language whose code matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">paramForError</code> is {{code|lua|true}}, a generic error message mentioning the bad code is generated; otherwise <code class="n">paramForError</code> should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.]==]
function export.getByCode(code, paramForError, allowEtymLang, allowFamily)
-- Track uses of paramForError, ultimately so it can be removed, as error-handling should be done by [[Module:parameters]], not here.
if paramForError ~= nil then
track("paramForError")
end
if type(code) ~= "string" then
local typ
if not code then
typ = "nil"
elseif check_object("language", true, code) then
typ = "a language object"
elseif check_object("family", true, code) then
typ = "a family object"
else
typ = "a " .. type(code)
end
error("The function getByCode expects a string as its first argument, but received " .. typ .. ".")
end
local m_data = load_data(languages_data_module)
if m_data.aliases[code] or m_data.track[code] then
track(code)
end
local norm_code = normalize_code(code)
-- Get the data, checking for etymology-only languages if allowEtymLang is set.
local data = load_data(get_data_module_name(norm_code))[norm_code] or
allowEtymLang and load_data(etymology_languages_data_module)[norm_code]
-- If no data was found and allowFamily is set, check the family data. If the main family data was found, make the object with [[Module:families]] instead, as family objects have different methods. However, if it's an etymology-only family, use make_object in this module (which handles object inheritance), and the family-specific methods will be inherited from the parent object.
if data == nil and allowFamily then
data = load_data("Modul:families/data")[norm_code]
if data ~= nil then
if data.parent == nil then
return make_family_object(norm_code, data)
elseif not allowEtymLang then
data = nil
end
end
end
local retval = code and data and make_object(code, data)
if not retval and paramForError then
require("Modul:languages/errorGetBy").code(code, paramForError, allowEtymLang, allowFamily)
end
return retval
end
get_by_code = export.getByCode
--[==[Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">errorIfInvalid</code> is given, in which case an error is generated. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.
The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result.
This function is powered by [[Module:languages/canonical names]], which contains a pre-generated mapping of full-language canonical names to codes. It is generated by going through the [[:Category:Language data modules]] for full languages. When <code class="n">allowEtymLang</code> is specified for this function, [[Module:etymology languages/canonical names]] may also be used, and when <code class="n">allowFamily</code> is specified for this function, [[Module:families/canonical names]] may also be used.]==]
function export.getByCanonicalName(name, errorIfInvalid, allowEtymLang, allowFamily)
local byName = load_data("Modul:languages/canonical names")
local code = byName and byName[name]
if not code and allowEtymLang then
byName = load_data("Modul:etymology languages/canonical names")
code = byName and byName[name] or
byName[gsub(name, "^[Ss]ubstratum ", "")] or
byName[gsub(name, "^suatu ", "")] or
byName[gsub(name, "^suatu ", ""):gsub("^[Ss]ubstratum ", "")] or
-- For etymology families like "ira-pro".
-- FIXME: This is not ideal, as it allows "Bahasa-bahasa " to be prepended to any etymology-only language, too.
byName[match(name, "^[Bb]ahasa%-bahasa (.*)$")]
end
if not code and allowFamily then
byName = load_data("Modul:families/canonical names")
code = byName[name] or byName[match(name, "^[Bb]ahasa%-bahasa (.*)$")]
end
local retval = code and get_by_code(code, errorIfInvalid, allowEtymLang, allowFamily)
if not retval and errorIfInvalid then
require("Modul:languages/errorGetBy").canonicalName(name, allowEtymLang, allowFamily)
end
return retval
end
--[==[Used by [[Module:languages/data/2]] (et al.) and [[Module:etymology languages/data]], [[Module:families/data]], [[Module:scripts/data]] and [[Module:writing systems/data]] to finalize the data into the format that is actually returned.]==]
function export.finalizeData(data, main_type, variety)
local fields = {"type"}
if main_type == "language" then
insert(fields, 4) -- script codes
insert(fields, "ancestors")
insert(fields, "link_tr")
insert(fields, "override_translit")
insert(fields, "wikimedia_codes")
elseif main_type == "script" then
insert(fields, 3) -- writing system codes
end -- Families and writing systems have no extra fields to process.
local fields_len = #fields
for _, entity in next, data do
if variety then
-- Move parent from 3 to "parent" and family from "family" to 3. These are different for the sake of convenience, since very few varieties have the family specified, whereas all of them have a parent.
entity.parent, entity[3], entity.family = entity[3], entity.family
-- Give the type "regular" iff not a variety and no other types are assigned.
elseif not (entity.type or entity.parent) then
entity.type = "regular"
end
for i = 1, fields_len do
local key = fields[i]
local field = entity[key]
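-- Strip whitespace around commas in list-like string fields, e.g. "Cyrl, Latn" becomes "Cyrl,Latn".
if field and type(field) == "string" then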
entity[key] = gsub(field, "%s*,%s*", ",")
end
end
end
return data
end
--[==[For backwards compatibility only; modules should require the error themselves.]==]
function export.err(lang_code, param, code_desc, template_tag, not_real_lang)
return require("Modul:languages/error")(lang_code, param, code_desc, template_tag, not_real_lang)
end
return export
jrp354oohz2vhqpfwsukxukfcx757jm
Modul:languages/data/3/o
828
9793
281275
280832
2026-04-21T14:04:08Z
Hakimi97
2668
281275
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["oaa"] = {
"Orok",
33928,
"tuw",
"Cyrl, Latn",
}
m["oac"] = {
"Oroch",
33650,
"tuw",
"Latn, Cyrl",
}
m["oav"] = {
"Avar Kuno",
nil,
"cau-ava",
"Geor",
}
m["obi"] = {
"Obispeño",
1288385,
"nai-chu",
"Latn",
}
m["obk"] = {
"Bontoc Selatan",
nil,
"phi",
"Latn",
}
m["obl"] = {
"Oblo",
36309,
}
m["obm"] = {
"Moabite",
36385,
"sem-can",
"Phnx",
translit = "Phnx-translit",
}
m["obo"] = {
"Obo Manobo",
12953699,
"mno",
"Latn",
}
m["obr"] = {
"Burma Kuno",
17006600,
"tbq-brm",
"Mymr, Latn", --and also Pallava
}
m["obt"] = {
"Breton Kuno",
3558112,
"cel-bry",
"Latn",
}
m["obu"] = {
"Obulom",
3813403,
"nic-cde",
"Latn",
}
m["oca"] = {
"Ocaina",
3182577,
"sai-wit",
"Latn",
}
m["och"] = {
"Cina Kuno",
35137,
"zhx",
"Hant",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["oco"] = {
"Cornwall Kuno",
48304520,
"cel-bry",
"Latn",
}
m["ocu"] = {
"Tlahuica",
10751739,
"omq",
"Latn",
}
m["oda"] = {
"Odut",
3915388,
"nic-uce",
"Latn",
ancestors = "mfn",
}
m["odk"] = {
"Od",
7077191,
"inc-wes",
"Arab",
}
m["odt"] = {
"Belanda Kuno",
443089,
"gmw",
"Latn, Runr",
ancestors = "frk",
entry_name = {remove_diacritics = c.circ .. c.macron},
}
m["odu"] = {
"Odual",
3813392,
"nic-cde",
"Latn",
}
m["ofo"] = {
"Ofo",
3349758,
"sio-ohv",
}
m["ofs"] = {
"Frisia Kuno",
35133,
"gmw-fri",
"Latn",
entry_name = {remove_diacritics = c.circ .. c.macron},
}
m["ofu"] = {
"Efutop",
35297,
"nic-eko",
"Latn",
}
m["ogb"] = {
"Ogbia",
3813400,
"nic-cde",
"Latn",
}
m["ogc"] = {
"Ogbah",
36291,
"alv-igb",
"Latn",
}
m["oge"] = {
"Georgia Kuno",
34834,
"ccs-gzn",
"Geor, Geok",
translit = {
Geor = "Geor-translit",
Geok = "Geok-translit",
},
override_translit = true,
entry_name = {remove_diacritics = c.circ},
}
m["ogg"] = {
"Ogbogolo",
3813405,
"nic-cde",
"Latn",
}
m["ogo"] = {
"Khana",
3914409,
"nic-ogo",
"Latn",
}
m["ogu"] = {
"Ogbronuagum",
3914485,
"nic-cde",
"Latn",
}
m["ohu"] = {
"Hungary Kuno",
nil,
"urj-ugr",
"Latn",
}
m["oia"] = {
"Oirata",
56738,
"ngf",
"Latn",
}
m["oin"] = {
"One Inebu",
12953782,
"qfa-tor",
}
m["ojb"] = {
"Ojibwa Barat Laut",
7060356,
"alg",
"Latn",
ancestors = "oj",
}
m["ojc"] = {
"Ojibwa Tengah",
5061548,
"alg",
"Latn",
ancestors = "oj",
}
m["ojg"] = {
"Ojibwa Timur",
5330342,
"alg",
"Latn",
ancestors = "oj",
}
m["ojp"] = {
"Jepun Kuno",
5736700,
"jpx",
"Jpan",
sort_key = s["Jpan-sortkey"],
}
m["ojs"] = {
"Severn Ojibwa",
56494,
"alg",
"Latn",
ancestors = "oj",
}
m["ojv"] = {
"Jawa Ontong",
7095071,
"poz-pnp",
"Latn",
}
m["ojw"] = {
"Ojibwa Barat",
3474222,
"alg",
"Latn",
ancestors = "oj",
}
m["oka"] = {
"Okanagan",
2984602,
"sal",
"Latn",
}
m["okb"] = {
"Okobo",
3813398,
"nic-lcr",
"Latn",
}
m["okd"] = {
"Okodia",
36300,
"ijo",
}
m["oke"] = {
"Okpe (Edo Barat Daya)",
268924,
"alv-swd",
"Latn",
}
m["okg"] = {
"Kok-Paponk",
nil,
"aus-pmn",
"Latn",
}
m["okh"] = {
"Koresh-e Rostam",
6432160,
"xme-ttc",
ancestors = "xme-ttc-cen",
}
m["oki"] = {
"Okiek",
56367,
"sdv-kln",
"Latn",
}
m["okj"] = {
"Oko-Juwoi",
3436832,
"qfa-adc",
}
m["okk"] = {
"Kwamtim One",
19830649,
"qfa-tor",
"Latn",
}
m["okl"] = {
"Bahasa Isyarat Kentish Kuno",
7084319,
"sgn",
}
m["okm"] = {
"Korea Pertengahan",
715339,
"qfa-kor",
"Kore",
ancestors = "oko",
translit = "okm-translit",
entry_name = s["Kore-entryname"],
}
m["okn"] = {
"Oki-No-Erabu",
3350036,
"jpx-ryu",
"Jpan",
translit = s["Jpan-translit"],
sort_key = s["Jpan-sortkey"],
}
m["oko"] = {
"Korea Kuno",
715364,
"qfa-kor",
"Kore",
entry_name = s["Kore-entryname"],
}
m["okr"] = {
"Kirike",
11006763,
"ijo",
}
m["oks"] = {
"Oko-Eni-Osayen",
36302,
"alv-von",
"Latn",
}
m["oku"] = {
"Oku",
36289,
"nic-rnc",
"Latn",
}
m["okv"] = {
"Orokaiva",
7103752,
"ngf",
"Latn",
}
m["okx"] = {
"Okpe (Edo Barat Laut)",
7082547,
"alv-nwd",
"Latn",
}
m["okz"] = {
"Khmer Kuno",
9205,
"mkh-kmr",
"Latn, Khmr", --and also Pallava
}
m["old"] = {
"Mochi",
12952852,
"bnt-chg",
"Latn",
}
m["ole"] = {
"Olekha",
3695204,
"sit-bdi",
"Tibt, Latn",
translit = {Tibt = "Tibt-translit"},
override_translit = true,
display_text = {Tibt = s["Tibt-displaytext"]},
entry_name = {Tibt = s["Tibt-entryname"]},
sort_key = {Tibt = "Tibt-sortkey"},
}
m["olm"] = {
"Oloma",
3441166,
"alv-nwd",
"Latn",
}
m["olo"] = {
"Livvi",
36584,
"urj-fin",
"Latn",
}
m["olr"] = {
"Olrat",
3351562,
"poz-vnc",
}
m["olt"] = {
"Lithuania Kuno",
17417801,
"bat",
"Latn",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.tilde},
}
m["olu"] = {
"Kuvale",
6448765,
"bnt-swb",
"Latn",
}
m["oma"] = {
"Omaha-Ponca",
2917968,
"sio-dhe",
"Latn",
}
m["omb"] = {
"Omba",
2841471,
"poz-vnc",
"Latn",
}
m["omc"] = {
"Mochica",
1951641,
}
m["omg"] = {
"Omagua",
33663,
"tup-gua",
"Latn",
}
m["omi"] = {
"Omi",
56795,
"csu-mma",
}
m["omk"] = {
"Omok",
4334657,
"qfa-yuk",
"Cyrl",
translit = "omk-translit",
}
m["oml"] = {
"Ombo",
7089928,
"bnt-tet",
"Latn",
}
m["omn"] = {
"Minoan",
1669994,
nil,
"Lina",
}
m["omo"] = {
"Utarmbung",
7902577,
"ngf",
"Latn",
}
m["omp"] = {
"Manipur Kuno",
nil,
"sit",
"Mtei",
translit = "Mtei-translit",
}
m["omr"] = {
"Marathi Kuno",
nil,
"inc-sou",
"Deva, Modi",
ancestors = "pmh",
translit = {
Deva = "sa-translit",
Modi = "Modi-translit",
},
}
m["omt"] = {
"Omotik",
36313,
"sdv-nis",
}
m["omu"] = {
"Omurano",
1957612,
}
m["omw"] = {
"Tairora Selatan",
20210553,
"paa-kag",
"Latn",
}
m["omx"] = {
"Mon Kuno",
nil,
"mkh-mnc",
"Mymr, Latn", --and also Pallava
}
m["ona"] = {
"Selk'nam",
2721227,
"sai-cho",
"Latn",
}
m["onb"] = {
"Lingao",
7093790,
"qfa-onb",
"Latn",
}
m["one"] = {
"Oneida",
857858,
"iro-nor",
"Latn",
}
m["ong"] = {
"Olo",
592162,
"qfa-tor",
"Latn",
}
m["oni"] = {
"Onin",
7093910,
"poz-cet",
"Latn",
}
m["onj"] = {
"Onjob",
7093968,
"ngf",
"Latn",
}
m["onk"] = {
"Kabore One",
12953783,
"qfa-tor",
"Latn",
}
m["onn"] = {
"Onobasulu",
7094437,
"ngf",
"Latn",
}
m["ono"] = {
"Onondaga",
1077450,
"iro-nor",
"Latn",
ancestors = "iro-oon",
}
m["onp"] = {
"Sartang",
7424639,
"sit-khb",
}
m["onr"] = {
"One Utara",
19830648,
"qfa-tor",
"Latn",
}
m["ons"] = {
"Ono",
11732548,
"ngf",
"Latn",
}
m["ont"] = {
"Ontenu",
3352827,
}
m["onu"] = {
"Unua",
3552042,
"poz-vnc",
"Latn",
}
m["onw"] = {
"Nubia Kuno",
2268,
"nub",
"Copt",
translit = "Copt-translit",
sort_key = "cop-sortkey",
}
m["onx"] = {
"Pidgin Onin",
12953788,
"crp",
"Latn",
ancestors = "oni",
}
m["ood"] = {
"O'odham",
2393095,
"azc",
"Latn",
}
m["oog"] = {
"Ong",
12953787,
"mkh-kat",
}
m["oon"] = {
"Önge",
2475551,
"qfa-ong",
}
m["oor"] = {
"Oorlams",
2484337,
}
m["oos"] = {
"Ossetia Kuno",
nil,
"xsc",
"Grek, Latn",
translit = "grc-translit",
ancestors = "os-pro",
}
m["opa"] = {
"Okpamheri",
3913331,
"alv-nwd",
"Latn",
}
m["opk"] = {
"Kopkaka",
6431129,
"ngf-okk",
"Latn",
}
m["opm"] = {
"Oksapmin",
1068097,
"ngf",
"Latn",
}
m["opo"] = {
"Opao",
7095585,
"ngf",
"Latn",
}
m["opt"] = {
"Opata",
2304583,
"azc-trc",
"Latn",
}
m["opy"] = {
"Ofayé",
3446691,
"sai-mje",
"Latn",
}
m["ora"] = {
"Oroha",
36298,
"poz-sls",
}
m["ore"] = {
"Orejón",
3355834,
"sai-tuc",
"Latn",
}
m["org"] = {
"Oring",
3915308,
"nic-ucn",
"Latn",
}
m["orh"] = {
"Oroqen",
1367309,
"tuw",
"Latn",
}
m["oro"] = {
"Orokolo",
7103758,
"ngf",
"Latn",
}
m["orr"] = {
"Oruma",
36299,
"ijo",
"Latn",
}
m["ort"] = {
"Oriya Adivasi",
12953791,
"inc-eas",
"Orya",
ancestors = "or",
}
m["ors"] = {
"Orang Seletar",
4208197,
"map",
"Latn",
ancestors = "ms",
}
m["oru"] = {
"Ormuri",
33740,
"ira-orp",
"fa-Arab",
}
m["orv"] = {
"Slav Timur Kuno",
35228,
"zle",
"Cyrs",
translit = {Cyrs = "Cyrs-translit"},
entry_name = s["Cyrs-entryname"],
sort_key = s["Cyrs-sortkey"],
}
m["orw"] = {
"Oro Win",
3450423,
"sai-cpc",
"Latn",
}
m["orx"] = {
"Oro",
3813396,
"nic-lcr",
"Latn",
}
m["orz"] = {
"Ormu",
7103494,
"poz-ocw",
"Latn",
}
m["osa"] = {
"Osage",
2600085,
"sio-dhe",
"Latn, Osge",
}
m["osc"] = {
"Osci",
36653,
"itc-sbl",
"Ital, Latn",
translit = "Ital-translit",
}
m["osi"] = {
"Osing",
2701322,
"poz-sus",
"Latn",
}
m["osn"] = {
"Sunda Kuno",
56197074,
"poz-msa",
"Latn, Sund, Kawi",
}
m["oso"] = {
"Ososo",
3913398,
"alv-yek",
"Latn",
}
m["osp"] = {
"Sepanyol Kuno",
1088025,
"roa-cas",
"Latn",
}
m["ost"] = {
"Osatu",
36243,
"nic-grs",
"Latn",
}
m["osu"] = {
"One Selatan",
12953785,
"qfa-tor",
"Latn",
}
m["osx"] = {
"Saxon Kuno",
35219,
"gmw",
"Latn",
entry_name = {remove_diacritics = c.circ .. c.macron},
}
m["ota"] = {
"Turki Usmaniyah",
36730,
"trk-ogz",
"ota-Arab, Armn",
ancestors = "trk-oat",
entry_name = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef},
translit = {Armn = "ota-Armn-translit"},
}
m["otb"] = {
"Tibet Kuno",
7085214,
"sit-tib",
"Tibt",
translit = "Tibt-translit",
override_translit = true,
display_text = s["Tibt-displaytext"],
entry_name = s["Tibt-entryname"],
sort_key = "Tibt-sortkey",
}
m["otd"] = {
"Ot Danum",
3033781,
"poz-brw",
"Latn",
}
m["ote"] = {
"Otomi Mezquital",
23755711,
"oto-otm",
"Latn",
}
m["oti"] = {
"Oti",
3357881,
}
m["otk"] = {
"Turk Kuno",
34988,
"trk",
"Orkh",
translit = "Orkh-translit",
}
m["otl"] = {
"Otomi Tilapa",
7802050,
"oto-otm",
"Latn",
}
m["otm"] = {
"Otomi Tanah Tinggi Timur",
13581718,
"oto-otm",
"Latn",
}
m["otn"] = {
"Otomi Tenango",
25559589,
"oto-otm",
"Latn",
}
m["otq"] = {
"Otomi Querétaro",
23755688,
"oto-otm",
"Latn",
}
m["otr"] = {
"Otoro",
36328,
"alv-hei",
}
m["ots"] = {
"Otomi Estado de México",
7413841,
"oto-otm",
"Latn",
}
m["ott"] = {
"Otomi Temoaya",
7698191,
"oto-otm",
"Latn",
}
m["otu"] = {
"Otuke",
7110049,
"sai-mje",
"Latn",
}
m["otw"] = {
"Ottawa",
133678,
"alg",
"Latn",
ancestors = "oj",
}
m["otx"] = {
"Otomi Texcatepec",
25559590,
"oto-otm",
"Latn",
}
m["oty"] = {
"Tamil Kuno",
20987452,
"dra",
"Brah",
translit = "Brah-translit",
}
m["otz"] = {
"Otomi Ixtenco",
6101171,
"oto-otm",
"Latn",
}
m["oub"] = {
"Glio-Oubi",
3914977,
"kro-grb",
}
m["oue"] = {
"Oune",
7110521,
"paa-sbo",
}
m["oui"] = {
"Uyghur Kuno",
nil,
"trk-sib",
"Ougr, Latn, Brah, Mani, Syrc, Phag",
}
m["oum"] = {
"Ouma",
7110494,
"poz-ocw",
"Latn",
}
m["ovd"] = {
"Älvdalen",
254950,
"gmq",
"Latn",
ancestors = "non",
}
m["owi"] = {
"Owiniga",
56454,
"qfa-mal",
"Latn",
}
m["owl"] = {
"Wales Kuno",
2266723,
"cel-bry",
"Latn",
}
m["oyb"] = {
"Oy",
13593748,
"mkh-ban",
}
m["oyd"] = {
"Oyda",
7116251,
"omv-nom",
}
m["oym"] = {
"Wayampi",
7975842,
"tup-gua",
"Latn",
}
m["oyy"] = {
"Oya'oya",
7116243,
"poz-ocw",
"Latn",
}
m["ozm"] = {
"Koonzime",
35566,
"bnt-ndb",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
7d637vy00f3zv6l8x23f5epevazae54
Modul:languages/data/3/p
828
9817
281247
273098
2026-04-21T13:32:19Z
Hakimi97
2668
281247
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["pab"] = {
"Pareci",
3504312,
"awd",
"Latn",
}
m["pac"] = {
"Pacoh",
3441136,
"mkh-kat",
"Latn",
}
m["pad"] = {
"Paumarí",
389827,
"auf",
"Latn",
}
m["pae"] = {
"Pagibete",
7124357,
"bnt-bta",
"Latn",
}
m["paf"] = {
"Paranawát",
12953806,
"tup-gua",
"Latn",
}
m["pag"] = {
"Pangasinan",
33879,
"phi",
"Latn, Tglg",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer},
}
m["pah"] = {
"Tenharim",
10266010,
"tup-gua",
"Latn",
}
m["pai"] = {
"Pe",
3914871,
"nic-tar",
"Latn",
}
m["pak"] = {
"Parakanã",
12953804,
"tup-gua",
"Latn",
}
m["pal"] = {
"Parsi Pertengahan",
32063,
"ira-swi",
"Latn, Phli, pal-Avst, Mani, Phlp, Phlv", -- Latn for translit; Phlv not in Unicode
translit = {
Phli = "Phli-translit",
["pal-Avst"] = "Avst-translit",
Mani = "Mani-translit",
},
ancestors = "peo",
}
m["pam"] = {
"Kapampangan",
36121,
"phi",
"Latn", --also Kulitan, which lacks a code
entry_name = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}},
standardChars = {
Latn = "AaBbDdEeGgHhIiKkLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {
Latn = "tl-sortkey"
},
}
m["pao"] = {
"Paiute Utara",
3360656,
"azc-num",
"Latn",
}
m["pap"] = {
"Papiamentu",
33856,
"crp",
"Latn",
ancestors = "pt",
}
m["paq"] = {
"Parya",
1135134,
"inc-cen",
ancestors = "psu",
}
m["par"] = {
"Panamint",
33926,
"azc-num",
"Latn",
}
m["pas"] = {
"Papasena",
7132508,
"paa-lkp",
"Latn",
}
m["pat"] = {
"Papitalai",
6528659,
"poz-aay",
"Latn",
}
m["pau"] = {
"Palau",
33776,
"poz",
"Latn, Kana",
sort_key = {
Kana = "Kana-sortkey"
},
}
m["pav"] = {
"Wari'",
3027909,
"sai-cpc",
"Latn",
}
m["paw"] = {
"Pawnee",
56751,
"cdd",
"Latn",
}
m["pax"] = {
"Pankararé",
25559779,
nil,
"Latn",
}
m["pay"] = {
"Pech",
4898889,
"cba",
"Latn",
}
m["paz"] = {
"Pankararú",
7131310,
nil,
"Latn",
}
m["pbb"] = {
"Páez",
33677,
nil,
"Latn",
}
m["pbc"] = {
"Patamona",
3915921,
"sai-pem",
"Latn",
}
m["pbe"] = {
"Mezontla Popoloca",
42365630,
"omq-pop",
"Latn",
}
m["pbf"] = {
"Coyotepec Popoloca",
5180100,
"omq-pop",
"Latn",
}
m["pbg"] = {
"Paraujano",
3501747,
"awd-taa",
"Latn",
}
m["pbh"] = {
"Panare",
56610,
"sai-ven",
"Latn",
}
m["pbi"] = {
"Podoko",
3515096,
"cdc-cbm",
"Latn",
}
m["pbl"] = {
"Mak (Nigeria)",
3915349,
"alv-bwj",
"Latn",
}
m["pbm"] = {
"Puebla Mazatec",
nil,
"omq-maz",
"Latn",
}
m["pbn"] = {
"Kpasam",
3914902,
"alv-mye",
"Latn",
}
m["pbo"] = {
"Papel",
36314,
"alv-pap",
"Latn",
}
m["pbp"] = {
"Badyara",
35095,
"alv-ten",
"Latn",
}
m["pbr"] = {
"Pangwa",
3847550,
"bnt-bki",
"Latn",
}
m["pbs"] = {
"Pame Tengah",
3361763,
"omq",
"Latn",
}
m["pbv"] = {
"Pnar",
3501850,
"aav-pkl",
"Latn",
}
m["pby"] = {
"Pyu",
2567925,
"paa-asa",
"Latn",
}
m["pca"] = {
"Santa Inés Ahuatempan Popoloca",
42365276,
"omq-pop",
"Latn",
}
m["pcb"] = {
"Pear",
6583669,
"mkh-pea",
"Khmr",
}
m["pcc"] = {
"Bouyei",
35100,
"tai-nor",
"Latn, Hani",
sort_key = {Hani = "Hani-sortkey"},
}
m["pcd"] = {
"Picard",
34024,
"roa-oil",
"Latn",
ancestors = "fro",
sort_key = s["roa-oil-sortkey"],
}
m["pce"] = {
"Ruching Palaung",
12953798,
"mkh-pal",
}
m["pcf"] = {
"Paliyan",
7127643,
"dra",
}
m["pcg"] = {
"Paniya",
7131211,
"dra",
}
m["pch"] = {
"Pardhan",
7133207,
"dra",
ancestors = "gon",
}
m["pci"] = {
"Duruwa",
56753,
"dra",
"Deva, Orya",
}
m["pcj"] = {
"Parenga",
3111396,
"mun",
}
m["pck"] = {
"Paite",
12952337,
"tbq-kuk",
}
m["pcl"] = {
"Pardhi",
7136554,
"inc-bhi",
}
m["pcm"] = {
"Pijin Nigeria",
33655,
"crp",
"Latn",
ancestors = "en",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron .. c.macronbelow},
sort_key = {
remove_diacritics = c.tilde,
from = {"ẹ", "gb", "kp", "ọ", "sh", "zh"},
to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1]}
},
}
m["pcn"] = {
"Piti",
3913375,
"nic-kne",
"Latn",
}
m["pcp"] = {
"Pacahuara",
2591165,
"sai-pan",
"Latn",
}
m["pcw"] = {
"Pyapun",
3438807,
nil,
"Latn",
}
m["pda"] = {
"Anam",
3501930,
"ngf-mad",
"Latn",
}
m["pdc"] = {
"Jerman Pennsylvania",
22711,
"gmw",
"Latn",
ancestors = "gmw-rfr",
}
m["pdi"] = {
"Pa Di",
3359940,
nil,
"Latn",
}
m["pdn"] = {
"Fedan",
7206699,
"poz-ocw",
"Latn",
}
m["pdo"] = {
"Padoe",
3360370,
"poz-btk",
"Latn",
}
m["pdt"] = {
"Plautdietsch",
1751432,
"gmw",
"Latn",
ancestors = "nds-de",
}
m["pdu"] = {
"Kayan",
7123283,
"kar",
"Latn",
}
m["pea"] = {
"Peranakan Indonesian",
653415,
nil,
"Latn",
}
m["peb"] = {
"Pomo Timur",
3396032,
"nai-pom",
"Latn",
}
m["ped"] = {
"Mala (New Guinea)",
11732569,
"ngf-mad",
"Latn",
}
m["pee"] = {
"Taje",
12953902,
nil,
"Latn",
}
m["pef"] = {
"Pomo Timur Laut",
3396018,
"nai-pom",
"Latn",
}
m["peg"] = {
"Pengo",
56758,
"dra",
"Orya",
translit = "kxv-translit",
}
m["peh"] = {
"Bonan",
32983,
"xgn-shr",
"Latn",
}
m["pei"] = {
"Chichimeca-Jonaz",
3915427,
"omq-otp",
"Latn",
}
m["pej"] = {
"Pomo Utara",
3396021,
"nai-pom",
"Latn",
}
m["pek"] = {
"Penchal",
3374631,
"poz-aay",
"Latn",
}
m["pel"] = {
"Pekal",
3241781,
nil,
"Latn",
}
m["pem"] = {
"Phende",
7162372,
"bnt-pen",
"Latn",
}
m["peo"] = {
"Parsi Kuno",
35225,
"ira-swi",
"Xpeo, Latn",
translit = "peo-translit",
}
m["pep"] = {
"Kunja",
6444807,
nil,
"Latn",
}
m["peq"] = {
"Pomo Selatan",
3396023,
"nai-pom",
"Latn",
}
-- "pes" IS TREATED AS "fa" (or as etymology-only), SEE WT:LT
m["pev"] = {
"Pémono",
3439012,
"sai-map",
"Latn",
}
m["pex"] = {
"Petats",
3376353,
"poz-ocw",
"Latn",
}
m["pey"] = {
"Petjo",
940486,
nil,
"Latn",
}
m["pez"] = {
"Penan Timur",
18638342,
"poz-swa",
"Latn",
}
m["pfa"] = {
"Pááfang",
3063517,
"poz-mic",
"Latn",
}
m["pfe"] = {
"Peere",
36377,
"alv-dur",
"Latn",
}
m["pga"] = {
"Arab Juba",
1262143,
"crp",
"Latn",
ancestors = "apd",
}
m["pgd"] = {
"Gandhari",
nil,
"inc-mid",
"Deva, Khar",
ancestors = "inc-ash",
translit = "Khar-translit",
}
m["pgg"] = {
"Pangwali",
13600429,
"him",
"Deva, Takr",
translit = "hi-translit",
}
m["pgi"] = {
"Pagi",
7124354,
"paa-brd",
"Latn",
}
m["pgk"] = {
"Rerep",
586907,
"poz-vnc",
"Latn",
}
m["pgl"] = {
"Primitive Irish",
3320030,
"cel-gae",
"Ogam",
translit = "pgl-translit",
}
m["pgn"] = {
"Paelignian",
nil,
"itc-sbl",
"Latn",
}
m["pgs"] = {
"Pangseng",
3914027,
"alv-mum",
"Latn",
}
m["pgu"] = {
"Pagu",
7124462,
"paa-nha",
"Latn",
}
m["pgz"] = {
"Bahasa Isyarat Papua New Guinea",
25044405,
"sgn",
}
m["pha"] = {
"Pa-Hng",
2625410,
"hmn",
}
m["phd"] = {
"Phudagi",
7188289,
}
m["phg"] = {
"Phuong",
7188376,
"mkh-kat",
}
m["phh"] = {
"Phukha",
7188298,
"tbq-lol",
}
m["phk"] = {
"Phake",
7675798,
"tai-swe",
"Mymr",
translit = "aio-phk-translit",
entry_name = {remove_diacritics = c.VS01},
}
m["phl"] = {
"Phalura",
2449549,
"inc-dar",
"Latn, ur-Arab",
}
m["phm"] = {
"Phimbi",
11007144,
"bnt-sna",
"Latn",
}
m["phn"] = {
"Phoenicia",
36734,
"sem-can",
"Phnx",
translit = "Phnx-translit",
}
m["pho"] = {
"Phunoi",
7188361,
"tbq-lol",
}
m["phq"] = {
"Phana'",
7180427,
"tbq-lol",
}
m["phr"] = {
"Pahari-Potwari",
33739,
"inc-pan",
"pa-Arab, Guru",
ancestors = "lah",
translit = {
Guru = "Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
entry_name = {
["pa-Arab"] = {
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna,
from = {"ݨ", "ࣇ"},
to = {"ن", "ل"}
},
},
}
m["pht"] = {
"Phu Thai",
3626597,
"tai-swe",
}
m["phu"] = {
"Phuan",
3915665,
}
m["phv"] = {
"Pahlavani",
7124567,
}
m["phw"] = {
"Phangduwali",
12953036,
"sit-kie",
ancestors = "ybh",
}
m["pia"] = {
"Pima Bajo",
3388544,
"azc",
"Latn",
}
m["pib"] = {
"Yine",
3135432,
"awd",
"Latn",
}
m["pic"] = {
"Pinji",
36296,
"bnt-tso",
"Latn",
}
m["pid"] = {
"Piaroa",
3382207,
nil,
"Latn",
}
m["pie"] = {
"Piro",
7198055,
"nai-kta",
"Latn",
}
m["pif"] = {
"Pingelapese",
36421,
"poz-mic",
"Latn",
}
m["pig"] = {
"Pisabo",
966883,
"sai-pan",
"Latn",
}
m["pih"] = {
"Pitcairn-Norfolk",
36554,
"crp",
"Latn",
ancestors = "en",
}
m["pii"] = {
"Pini",
10631925,
}
m["pij"] = {
"Pijao",
7193519,
}
m["pil"] = {
"Yom",
36893,
"nic-yon",
}
m["pim"] = {
"Powhatan",
2270532,
"alg-eas",
"Latn",
}
m["pin"] = {
"Piame",
7190042,
"paa-sep",
"Latn",
}
m["pio"] = {
"Piapoco",
3382208,
"awd-nwk",
"Latn",
}
m["pip"] = {
"Pero",
2411063,
"cdc-wst",
}
m["pir"] = {
"Piratapuyo",
3389119,
"sai-tuc",
"Latn",
}
m["pis"] = {
"Pijin",
36699,
"crp",
"Latn",
ancestors = "en",
}
m["pit"] = {
"Pitta-Pitta",
6433116,
"aus-kar",
"Latn",
}
m["piu"] = {
"Pintupi-Luritja",
2591175,
"aus-pam",
}
m["piv"] = {
"Pileni",
2976736,
"poz-pnp",
"Latn",
}
m["piw"] = {
"Pimbwe",
3894132,
"bnt-mwi",
}
m["pix"] = {
"Piu",
7199578,
}
m["piy"] = {
"Piya-Kwonci",
3440492,
}
m["piz"] = {
"Pije",
3388339,
"poz-cln",
"Latn",
}
m["pjt"] = {
"Pitjantjatjara",
2982063,
"aus-pam",
"pjt-Latn",
}
m["pkb"] = {
"Kipfokomo",
7208693,
"bnt-sab",
"Latn",
}
m["pkc"] = {
"Baekje",
4841264,
"qfa-kor",
"Hani, Kana",
sort_key = {
Hani = "Hani-sortkey",
Kana = "Kana-sortkey"
},
}
m["pkg"] = {
"Pak-Tong",
3360711,
}
m["pkh"] = {
"Pankhu",
7130962,
"tbq-kuk",
}
m["pkn"] = {
"Pakanha",
954916,
"aus-pmn",
}
m["pko"] = {
"Pökoot",
36323,
"sdv-kln",
}
m["pkp"] = {
"Pukapukan",
36447,
"poz-pnp",
"Latn",
}
m["pkr"] = {
"Attapady Kurumba",
16835180,
"dra",
}
m["pks"] = {
"Bahasa Isyarat Pakistan",
22964057,
"sgn",
}
m["pkt"] = {
"Maleng",
6583562,
"mkh-vie",
}
m["pku"] = {
"Paku",
2932604,
}
m["pla"] = {
"Miani",
12952844,
nil,
"Latn",
}
m["plb"] = {
"Polonombauk",
7225957,
"poz-vnc",
"Latn",
}
m["plc"] = {
"Central Palawano",
12953795,
"phi",
"Latn",
}
m["ple"] = {
"Palu'e",
2196866,
"poz-cet",
"Latn",
}
m["plg"] = {
"Pilagá",
2748259,
"sai-guc",
"Latn",
}
m["plh"] = {
"Paulohi",
7155331,
"poz-cma",
}
m["plj"] = {
"Polci",
3914383,
}
m["plk"] = {
"Kohistani Shina",
12953882,
"inc-dar",
}
m["pll"] = {
"Shwe Palaung",
27941664,
"mkh-pal",
}
m["pln"] = {
"Palenquero",
36665,
"crp",
"Latn",
ancestors = "es",
}
m["plo"] = {
"Oluta Popoluca",
5908687,
"nai-miz",
"Latn",
}
m["plq"] = {
"Palaic",
36582,
"ine-ana",
"Xsux",
}
m["plr"] = {
"Palaka Senoufo",
36346,
"alv-snf",
"Latn",
}
m["pls"] = {
"San Marcos Tlalcoyalco Popoloca",
12641692,
"omq-pop",
"Latn",
}
m["plu"] = {
"Palikur",
3073448,
"awd",
"Latn",
}
m["plv"] = {
"Palawano Barat Daya",
15614922,
"phi",
"Latn",
}
m["plw"] = {
"Palawano Brooke's Point",
12953796,
"phi",
"Latn",
}
m["ply"] = {
"Bolyu",
3361723,
"mkh-pkn",
"Latn",
}
m["plz"] = {
"Paluan",
7128795,
nil,
"Latn",
}
m["pma"] = {
"Paama",
3130286,
"poz-vnc",
"Latn",
}
m["pmb"] = {
"Pambia",
36267,
"znd",
"Latn",
}
m["pmd"] = {
"Pallanganmiddang",
7127734,
"aus-pam",
"Latn",
}
m["pme"] = {
"Pwaamei",
3411152,
"poz-cln",
"Latn",
}
m["pmf"] = {
"Pamona",
3513320,
"poz-kal",
"Latn",
}
m["pmi"] = {
"Pumi Utara",
3403245,
"sit-qia",
}
m["pmj"] = {
"Pumi Selatan",
3403246,
"sit-qia",
}
m["pmk"] = {
"Pamlico",
nil,
"alg-eas",
"Latn",
}
m["pml"] = {
"Sabir",
636479,
"crp",
"Latn",
ancestors = "lij, pro, vec",
}
m["pmm"] = {
"Pol",
36408,
"bnt-kak",
"Latn",
}
m["pmn"] = {
"Pam",
7129017,
"alv-mbm",
}
m["pmo"] = {
"Pom",
7227178,
"poz-hce",
"Latn",
}
m["pmq"] = {
"Pame Utara",
3361762,
"omq",
"Latn",
}
m["pmr"] = {
"Paynamar",
3450824,
}
m["pms"] = {
"Piemonte",
15085,
"roa-git",
"Latn",
}
m["pmt"] = {
"Tuamotuan",
36763,
"poz-pep",
"Latn",
}
m["pmu"] = {
"Mirpur Panjabi",
6874480,
}
m["pmw"] = {
"Plains Miwok",
3391031,
"nai-you",
"Latn",
}
m["pmx"] = {
"Poumei Naga",
12952910,
"tbq-anp",
}
m["pmy"] = {
"Papuan Malay",
12473446,
nil,
"Latn",
}
m["pmz"] = {
"Southern Pame",
3361765,
"omq",
"Latn",
}
m["pna"] = {
"Punan Bah-Biau",
4842201,
"poz-bnn",
"Latn",
}
m["pnb"] = {
"Punjabi Barat",
58635,
"inc-pan",
"pa-Arab",
ancestors = "pa",
}
m["pnc"] = {
"Pannei",
7131391,
}
m["pnd"] = {
"Mpinda",
63308194,
"bnt-kmb",
}
m["pne"] = {
"Penan Barat",
12953808,
"poz-swa",
"Latn",
}
m["png"] = {
"Pongu",
36282,
"nic-shi",
}
m["pnh"] = {
"Penrhyn",
3130301,
"poz-pep",
"Latn",
}
m["pni"] = {
"Aoheng",
4778608,
"poz",
"Latn",
}
m["pnj"] = {
"Pinjarup",
33103591,
}
m["pnk"] = {
"Paunaca",
2064378,
"awd",
"Latn",
}
m["pnl"] = {
"Paleni",
7127118,
"alv-wan",
"Latn",
}
m["pnm"] = {
"Punan Batu",
7259892,
}
m["pnn"] = {
"Pinai-Hagahai",
5638511,
}
m["pno"] = {
"Panobo",
3141869,
"sai-pan",
"Latn",
}
m["pnp"] = {
"Pancana",
7130204,
}
m["pnq"] = {
"Pana (Afrika Barat)",
7129739,
"nic-gnn",
"Latn",
}
m["pnr"] = {
"Panim",
11732562,
"ngf-mad",
}
m["pns"] = {
"Ponosakan",
7227956,
"phi",
"Latn",
}
m["pnt"] = {
"Yunani Pontus",
36748,
"grk",
"Grek, Latn, Cyrl",
ancestors = "gkm",
translit = "el-translit",
entry_name = {remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow},
sort_key = s["Grek-sortkey"],
}
m["pnu"] = {
"Jiongnai Bunu",
56325,
"hmn",
}
m["pnv"] = {
"Pinigura",
10631927,
"aus-psw",
"Latn",
}
m["pnw"] = {
"Panyjima",
3913830,
"aus-nga",
"Latn",
}
m["pnx"] = {
"Phong-Kniang",
3914627,
"mkh",
}
m["pny"] = {
"Pinyin",
36250,
"nic-nge",
"Latn",
}
m["pnz"] = {
"Pana (Afrika Tengah)",
36241,
"alv-mbm",
"Latn",
}
m["poc"] = {
"Poqomam",
36416,
"myn",
"Latn",
}
m["poe"] = {
"San Juan Atzingo Popoloca",
12953819,
"omq-pop",
"Latn",
}
m["pof"] = {
"Poke",
7208577,
"bnt-ske",
}
m["pog"] = {
"Potiguára",
56722,
"tup-gua",
"Latn",
}
m["poh"] = {
"Poqomchi'",
36414,
"myn",
"Latn",
}
m["poi"] = {
"Highland Popoluca",
7511556,
"nai-miz",
"Latn",
}
m["pok"] = {
"Pokangá",
25559704,
"sai-tuc",
"Latn",
}
m["pom"] = {
"Pomo Tenggara",
3396025,
"nai-pom",
"Latn",
}
m["pon"] = {
"Pohnpei",
28422,
"poz-mic",
"Latn",
}
m["poo"] = {
"Pomo Tengah",
3396020,
"nai-pom",
"Latn",
}
m["pop"] = {
"Pwapwa",
3411153,
"poz-cln",
"Latn",
}
m["poq"] = {
"Texistepec Popoluca",
5908707,
"nai-miz",
"Latn",
}
m["pos"] = {
"Sayula Popoluca",
5908722,
"nai-miz",
"Latn",
}
m["pot"] = {
"Potawatomi",
56749,
"alg",
"Latn",
}
m["pov"] = {
"Kreol Guinea-Bissau",
33339,
"crp",
"Latn",
ancestors = "pt",
}
m["pow"] = {
"San Felipe Otlaltepec Popoloca",
25559598,
"omq-pop",
"Latn",
}
m["pox"] = {
"Polabia",
36741,
"zlw-lch",
"Latn",
}
m["poy"] = {
"Pogolo",
2429648,
"bnt-kil",
}
m["ppa"] = {
"Pao",
7132069,
}
m["ppe"] = {
"Papi",
7132809,
}
m["ppi"] = {
"Paipai",
56726,
"nai-yuc",
"Latn",
}
m["ppk"] = {
"Uma",
7881036,
"poz-kal",
"Latn",
}
m["ppl"] = {
"Pipil",
1186896,
"azc-nah",
"Latn",
entry_name = {remove_diacritics = c.acute .. c.macron},
}
m["ppm"] = {
"Papuma",
7133239,
"poz-hce",
"Latn",
}
m["ppn"] = {
"Papapana",
3362757,
"poz-ocw",
"Latn",
}
m["ppo"] = {
"Folopa",
5464843,
"paa",
"Latn",
}
m["ppq"] = {
"Pei",
7160903,
}
m["pps"] = {
"San Luís Temalacayuca Popoloca",
25559602,
"omq-pop",
"Latn",
}
m["ppt"] = {
"Pa",
3504757,
"ngf",
"Latn",
}
m["ppu"] = {
"Papora",
2094884,
"map",
"Latn",
}
m["pqa"] = {
"Pa'a",
3441315,
"cdc-wst",
}
m["pqm"] = {
"Malecite-Passamaquoddy",
3183144,
"alg-eas",
"Latn",
}
m["pra"] = {
"Prakrit",
192170,
"inc-mid",
"Brah, Deva, Gujr, Knda",
ancestors = "inc-ash",
translit = {
Brah = "Brah-translit",
Deva = "pra-Deva-translit",
Gujr = "sa-Gujr-translit",
Knda = "pra-Knda-translit",
},
entry_name = {
from = {"ऎ", "ऒ", u(0x0946), u(0x094A), "य़", "ಯ಼", u(0x11071), u(0x11072), u(0x11073), u(0x11074)},
to = {"ए", "ओ", u(0x0947), u(0x094B), "य", "ಯ", "𑀏", "𑀑", u(0x11042), u(0x11044)}
},
}
m["prc"] = {
"Parachi",
2640637,
"ira-orp",
}
-- "prd" IS NOT INCLUDED, SEE WT:LT
m["pre"] = {
"Principe",
36520,
"crp",
"Latn",
ancestors = "pt",
}
m["prf"] = {
"Paranan",
7135433,
"phi",
}
m["prg"] = {
"Prusia Kuno",
35501,
"bat",
"Latn",
}
m["prh"] = {
"Porohanon",
6583710,
"phi",
}
m["pri"] = {
"Paicî",
732131,
"poz-cln",
"Latn",
}
m["prk"] = {
"Parauk",
3363719,
"mkh-pal",
}
m["prl"] = {
"Bahasa Isyarat Peru",
3915508,
"sgn",
}
m["prm"] = {
"Kibiri",
56745,
"paa",
}
m["prn"] = {
"Prasuni",
32689,
"nur-nor",
}
m["pro"] = {
"Occitan Kuno",
2779185,
"roa-ocr",
"Latn",
sort_key = {remove_diacritics = c.cedilla},
}
-- "prp" IS NOT INCLUDED, SEE WT:LT
m["prq"] = {
"Ashéninka Perené",
3450601,
"awd",
"Latn",
}
m["prr"] = {
"Puri",
7261687,
}
-- "prs" IS TREATED AS "fa" (or as etymology-only), SEE WT:LT
m["prt"] = {
"Phai",
7180184,
"mkh",
}
m["pru"] = {
"Puragi",
7260800,
"ngf-sbh",
}
m["prw"] = {
"Parawen",
7136291,
"ngf-mad",
}
m["prx"] = {
"Purik",
567905,
"sit-lab",
}
m["prz"] = {
"Bahasa Isyarat Providencia",
3322084,
"sgn",
}
m["psa"] = {
"Asue Awyu",
11266334,
}
m["psc"] = {
"Bahasa Isyarat Parsi",
7170221,
"sgn",
}
m["psd"] = {
"Plains Indian Sign Language",
2380124,
"sgn",
}
m["pse"] = {
"Melayu Barisan Selatan",
3367751,
"poz-mly",
"Latn",
}
m["psg"] = {
"Bahasa Isyarat Pulau Pinang",
4924925,
"sgn",
}
m["psh"] = {
"Pashayi Barat Daya",
16112270,
"inc-dar",
}
m["psi"] = {
"Pashayi Tenggara",
23713536,
"inc-dar",
"Arab",
}
m["psl"] = {
"Bahasa Isyarat Puerto Rico",
7258608,
"sgn-fsl",
}
m["psm"] = {
"Pauserna",
2912846,
"tup-gua",
"Latn",
}
m["psn"] = {
"Panasuan",
7130113,
"poz",
}
m["pso"] = {
"Bahasa Isyarat Poland",
3915194,
"sgn-gsl",
}
m["psp"] = {
"Bahasa Isyarat Filipina",
3551357,
"sgn-fsl",
}
m["psq"] = {
"Pasi",
7142091,
}
m["psr"] = {
"Bahasa Isyarat Portugis",
3915472,
"sgn",
}
m["pss"] = {
"Kaulong",
3194294,
"poz-ocw",
}
m["psw"] = {
"Port Sandwich",
3398324,
"poz-vnc",
"Latn",
}
m["psy"] = {
"Piscataway",
3504233,
"alg-eas",
}
m["pta"] = {
"Pai Tavytera",
7124619,
"tup-gua",
"Latn",
}
m["pth"] = {
"Pataxó Hã-Ha-Hãe",
7144304,
}
m["pti"] = {
"Pintiini",
10632026,
"aus-pam",
}
m["ptn"] = {
"Patani",
7144242,
"poz-hce",
"Latn",
}
m["pto"] = {
"Zo'é",
8073148,
"tup-gua",
"Latn",
}
m["ptp"] = {
"Patep",
3368679,
"poz-ocw",
"Latn",
}
m["ptq"] = {
"Pattapu",
nil,
"dra",
}
m["ptr"] = {
"Piamatsina",
7190040,
"poz-vnc",
"Latn",
}
m["ptt"] = {
"Enrekang",
12953520,
}
m["ptu"] = {
"Bambam",
4853321,
"poz-ssw",
"Latn",
}
m["ptv"] = {
"Port Vato",
3398323,
nil,
"Latn",
}
m["ptw"] = {
"Pentlatch",
2069475,
}
m["pty"] = {
"Pathiya",
7144790,
"dra",
}
m["pua"] = {
"Purepecha",
16114351,
"qfa-iso",
"Latn",
sort_key = {remove_diacritics = c.acute},
}
m["pub"] = {
"Purum",
6400562,
"tbq-kuk",
"Latn",
}
m["puc"] = {
"Punan Merap",
7259895,
}
m["pud"] = {
"Punan Aput",
4782333,
"poz-swa",
"Latn",
}
m["pue"] = {
"Puelche",
33660,
}
m["puf"] = {
"Punan Merah",
7259894,
}
m["pug"] = {
"Phuie",
36375,
"nic-gnw",
}
m["pui"] = {
"Puinave",
3027918,
}
m["puj"] = {
"Punan Tubu",
7259896,
"poz-swa",
"Latn",
}
m["pum"] = {
"Puma",
33736,
"sit-kic",
}
m["puo"] = {
"Puoc",
6440803,
"mkh",
}
m["pup"] = {
"Pulabu",
7259163,
"ngf-mad",
}
m["puq"] = {
"Puquina",
1207739,
}
m["pur"] = {
"Puruborá",
7261619,
"tup",
}
m["put"] = {
"Putoh",
12953832,
"poz-swa",
"Latn",
}
m["puu"] = {
"Punu",
36401,
"bnt-sir",
"Latn",
}
m["puw"] = {
"Puluwat",
36397,
"poz-mic",
"Latn",
}
m["pux"] = {
"Puare",
3507983,
}
m["puy"] = {
"Purisimeño",
2967638,
"nai-chu",
"Latn",
}
m["pwa"] = {
"Pawaia",
7156099,
"paa",
"Latn",
}
m["pwb"] = {
"Panawa",
47385077,
"nic-jer",
"Latn",
ancestors = "jer",
}
m["pwg"] = {
"Gapapaiwa",
3095245,
"poz-ocw",
"Latn",
}
m["pwi"] = {
"Patwin",
3370188,
"nai-wtq",
"Latn",
}
m["pwm"] = {
"Molbog",
6895718,
"poz-san",
"Latn",
}
m["pwn"] = {
"Paiwan",
715755,
"map",
"Latn",
}
m["pwo"] = {
"Pwo Barat",
7988202,
"kar",
"Mymr",
}
m["pwr"] = {
"Powari",
12640277,
"inc-hie",
"Deva",
}
m["pww"] = {
"Pwo Utara",
7058885,
"kar",
"Thai",
}
m["pxm"] = {
"Quetzaltepec Mixe",
6842374,
"nai-miz",
"Latn",
}
m["pye"] = {
"Pye Krumen",
11157382,
"kro-grb",
}
m["pym"] = {
"Fyam",
3914025,
"nic-ple",
"Latn",
}
m["pyn"] = {
"Poyanáwa",
3401023,
"sai-pan",
}
m["pys"] = {
"Bahasa Isyarat Paraguay",
7134698,
"sgn",
}
m["pyu"] = {
"Puyuma",
716690,
"map",
"Latn",
}
m["pyx"] = {
"Tircul",
36259,
"sit",
}
m["pyy"] = {
"Pyen",
7262966,
"tbq-lol",
}
m["pzh"] = {
"Pazeh",
36435,
"map",
"Latn",
}
m["pzn"] = {
"Para Naga",
7133667,
"sit-aao",
}
return require("Module:languages").finalizeData(m, "language")
fribkhuo11qo3woq888t6l2nhypqvc1
Modul:languages/data/3/x
828
9824
281274
276274
2026-04-21T14:03:17Z
Hakimi97
2668
281274
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["xaa"] = {
"Arab Andalusia",
1137945,
"sem-arb",
"Arab, Latn",
entry_name = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
}
m["xab"] = {
"Sambe",
36265,
"nic-alu",
"Latn",
}
m["xac"] = {
"Kachari",
3442442,
"tbq-bdg",
}
m["xad"] = {
"Adai",
346744,
}
m["xae"] = {
"Aequian",
930579,
"itc",
}
m["xag"] = {
"Aghwan",
34931,
"cau-esm",
"Aghb",
translit = "Aghb-translit",
override_translit = true,
}
m["xai"] = {
"Kaimbé",
6348017,
}
m["xaj"] = {
"Ararandewára",
nil,
"tup-gua",
"Latn",
}
m["xak"] = {
"Maku",
2032882,
nil,
"Latn",
}
m["xal"] = {
"Kalmyk",
33634,
"xgn-cen",
"Cyrl, xwo-Mong",
ancestors = "xwo",
translit = "xal-translit",
override_translit = true,
sort_key = "xal-sortkey",
}
m["xam"] = {
"ǀXam",
2086145,
"khi-tuu",
"Latn",
}
m["xan"] = {
"Xamtanga",
56527,
"cus-cen",
}
m["xao"] = {
"Khao",
3196077,
"mkh-pal",
}
m["xap"] = {
"Apalachee",
686501,
"nai-mus",
"Latn",
}
m["xaq"] = {
"Aquitanian",
500522,
"euq",
"Latn",
}
m["xar"] = {
"Karami",
11732281,
}
m["xas"] = {
"Kamassian",
35991,
"syd",
"Cyrl",
translit = "xas-translit",
}
m["xat"] = {
"Katawixi",
3440512,
"sai-ktk",
}
m["xau"] = {
"Kauwera",
6378983,
"paa-tkw",
}
m["xav"] = {
"Xavante",
36962,
"sai-cje",
"Latn",
}
m["xaw"] = {
"Kawaiisu",
56338,
"azc-num",
"Latn",
}
m["xay"] = {
"Kayan Mahakam",
25337171,
}
m["xbb"] = {
"Lower Burdekin",
6693353,
}
m["xbc"] = {
"Baktria",
756651,
"ira-sbc",
"Grek, Mani",
translit = "xbc-translit",
entry_name = {
from = {"Þ", "þ"},
to = {"Ϸ", "ϸ"}
},
}
m["xbd"] = {
"Bindal",
4913975,
}
m["xbe"] = {
"Bigambal",
16841801,
"aus-pam", --unclassified within
}
m["xbg"] = {
"Bunganditj",
4997615,
}
m["xbi"] = {
"Kombio",
6428259,
"qfa-tor",
"Latn",
}
m["xbj"] = {
"Birrpayi",
nil,
}
m["xbm"] = {
"Breton Pertengahan",
787610,
"cel-bry",
"Latn",
ancestors = "obt",
}
m["xbn"] = {
"Kenaboi",
6388752,
}
m["xbo"] = {
"Bulgar",
36880,
"trk-ogr",
"Arab, Grek",
}
m["xbp"] = {
"Bibbulman",
22918391,
}
m["xbr"] = {
"Kambera",
3053279,
"poz-cet",
"Latn",
}
m["xbw"] = {
"Kambiwá",
9006744,
}
m["xby"] = {
"Butchulla",
31752631,
}
m["xcb"] = {
"Cumbric",
35965,
"cel-bry",
}
m["xcc"] = {
"Camunic",
489011,
nil,
"Ital",
translit = "Ital-translit",
}
m["xce"] = {
"Celtiberian",
37012,
"cel",
"Latn",
}
m["xch"] = {
"Chemakum",
56397,
"chi",
"Latn",
}
m["xcl"] = {
"Armenia Kuno",
181074,
"hyx",
"Armn",
translit = "Armn-translit",
override_translit = true,
entry_name = {
remove_diacritics = "՞՜՛՟",
from = {"եւ"},
to = {"և"}
},
}
m["xcm"] = {
"Comecrudo",
609808,
"nai-pak",
}
m["xcn"] = {
"Cotoname",
56889,
"nai-pak",
}
m["xco"] = {
"Khwarezm",
33138,
"ira-sbc",
"Arab, Armi, Chrs, Phlv, Sogd",
translit = {Chrs = "Chrs-translit"},
}
m["xcr"] = {
"Carian",
35929,
"ine-ana",
"Cari",
}
m["xct"] = {
"Tibet Klasik",
5128314,
"sit-tib",
"Tibt, Hani, Marc, Mong, mnc-Mong, xwo-Mong, Phag, Tang, Zanb",
translit = {
Tibt = "Tibt-translit",
Mong = "Mong-translit",
["mnc-Mong"] = "mnc-translit",
["xwo-Mong"] = "xwo-translit",
Tang = "txg-translit",
},
override_translit = true,
display_text = {
Tibt = s["Tibt-displaytext"],
Mong = s["Mong-displaytext"],
},
entry_name = {
Tibt = s["Tibt-entryname"],
Mong = s["Mong-entryname"],
},
sort_key = {
Tibt = "Tibt-sortkey",
Hani = "Hani-sortkey",
},
}
m["xcu"] = {
"Curonian",
35857,
"bat",
"Latn",
}
m["xcv"] = {
"Chuva",
3516641,
"qfa-yuk",
"Cyrl",
translit = "xcv-translit",
}
m["xcw"] = {
"Coahuilteco",
2008062,
"nai-pak",
}
m["xcy"] = {
"Cayuse",
2472016,
}
m["xda"] = {
"Darkinjung",
5223660,
"aus-yuk",
"Latn",
}
m["xdc"] = {
"Dacian",
682547,
"ine",
"Latn",
}
m["xdk"] = {
"Dharug",
1166814,
"aus-yuk",
"Latn",
}
m["xdm"] = {
"Edom",
2363529,
"sem-can",
"Phnx",
translit = "Phnx-translit",
}
m["xdq"] = {
"Kaitag",
1990659,
"cau-drg",
"Cyrl",
translit = {Cyrl = "dar-translit"},
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
entry_name = {
Cyrl = s["cau-Cyrl-entryname"],
Latn = s["cau-Latn-entryname"],
},
sort_key = {
Cyrl = {
from = {
"къкъ", "хьхь", -- 4 chars
"гъ", "гь", "гӏ", "ё", "къ", "кь", "кӏ", "пп", "пӏ", "сс", "тт", "тӏ", "хх", "хъ", "хь", "хӏ", "цц", "цӏ", "чч", "чӏ" -- 2 chars
},
to = {
"к" .. p[2], "х" .. p[4],
"г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[3], "к" .. p[4], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[1], "т" .. p[2], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[5], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2]
}
},
},
}
m["xdy"] = {
"Malayic Dayak",
3514892,
}
m["xeb"] = {
"Ebla",
35345,
"sem-eas",
"Xsux",
}
m["xed"] = {
"Hdi",
56246,
"cdc-cbm",
"Latn",
}
m["xeg"] = {
"ǁXegwi",
3509732,
"khi-tuu",
"Latn",
}
m["xel"] = {
"Kelo",
6386412,
"sdv-eje",
}
m["xem"] = {
"Kembayan",
6386874,
}
m["xep"] = {
"Epi-Olmec",
nil,
}
m["xer"] = {
"Xerénte",
3073436,
"sai-cje",
"Latn",
}
m["xes"] = {
"Kesawai",
6394907,
"ngf-mad",
"Latn",
}
m["xet"] = {
"Xetá",
2980404,
"tup-gua",
"Latn",
}
m["xeu"] = {
"Keoru-Ahia",
11732313,
"ngf",
}
m["xfa"] = {
"Falisci",
35669,
"itc",
"Ital, Latn",
translit = "Ital-translit",
entry_name = {remove_diacritics = c.macron .. c.breve .. c.diaer},
}
m["xga"] = {
"Galatia",
27403,
"cel",
"Latn, Grek",
ancestors = "cel-gau",
}
m["xgb"] = {
"Gbin",
16934745,
"dmn-mse",
"Latn",
}
m["xgd"] = {
"Gudang",
5614528,
}
m["xgf"] = {
"Gabrielino-Fernandeño",
56387,
"azc-tak",
"Latn",
}
m["xgg"] = {
"Goreng",
nil,
}
m["xgi"] = {
"Garingbal",
nil,
}
m["xgl"] = {
"Galindan",
1190494,
"bat",
"Latn",
}
m["xgm"] = {
"Darumbal",
16954400,
}
m["xgr"] = {
"Garza",
3098656,
"nai-pak",
}
m["xgu"] = {
"Unggumi",
62000004,
"aus-wor",
"Latn",
}
m["xgw"] = {
"Guwa",
5621992,
}
m["xha"] = {
"Harami",
41506724,
nil,
"Sarb",
translit = "Sarb-translit",
}
m["xhc"] = {
"Hun",
35959,
}
m["xhd"] = {
"Hadrami",
1032453,
"sem-osa",
"Sarb",
translit = "Sarb-translit",
}
m["xhe"] = {
"Khetrani",
2614111,
"inc-pan",
ancestors = "lah",
}
m["xhm"] = {
"Khmer Pertengahan",
25226861,
"mkh-kmr",
"Latn, Khmr", --and also Pallava
ancestors = "okz",
}
m["xhr"] = {
"Hernican",
5908773,
"itc-sbl",
"Ital",
}
m["xht"] = {
"Hatti",
31107,
"qfa-iso",
"Xsux",
}
m["xhu"] = {
"Hurri",
35740,
"qfa-hur",
"Xsux, Ugar",
}
m["xhv"] = {
"Khua",
22970290,
"mkh-kat",
}
m["xib"] = {
"Iberia",
855215,
"qfa-iso",
"Latn, Ibrn",
}
m["xii"] = {
"Xiri",
36876,
}
m["xil"] = {
"Illyria",
35976,
"ine",
type = "reconstructed",
}
m["xin"] = {
"Xinca",
1546494,
"nai-xin",
"Latn",
}
m["xir"] = {
"Xiriâna",
2028772,
"awd",
"Latn",
}
m["xis"] = {
"Kisan",
nil,
}
m["xiv"] = {
"Bahasa Lembah Indus",
3428279,
nil,
"Inds",
}
m["xiy"] = {
"Xipaya",
13226,
"tup",
}
m["xjb"] = {
"Minjungbal",
nil,
"aus-pam",
"Latn",
}
m["xka"] = {
"Kalkoti",
3877551,
"inc-dar",
"xka-Arab",
}
m["xkb"] = {
"Manigri-Kambolé Ede Nago",
36042,
"alv-ede",
}
m["xkc"] = {
"Khoini",
6401919,
"xme-ttc",
ancestors = "xme-ttc-wes",
}
m["xkd"] = {
"Kayan Mendalam",
12952597,
}
m["xke"] = {
"Kereho",
6437086,
"poz",
"Latn",
}
m["xkf"] = {
"Khengkha",
3695207,
"sit-ebo",
"Tibt",
translit = "Tibt-translit",
override_translit = true,
display_text = s["Tibt-displaytext"],
entry_name = s["Tibt-entryname"],
sort_key = "Tibt-sortkey",
}
m["xkg"] = {
"Kagoro",
11159524,
"dmn-wmn",
}
m["xki"] = {
"Kenyan Sign Language",
6392859,
"sgn",
}
m["xkj"] = {
"Kajali",
14916876,
"xme-ttc",
ancestors = "xme-ttc-cen",
}
m["xkk"] = {
"Kaco'",
6344767,
"mkh",
}
m["xkl"] = {
"Bakung",
6736761,
"poz-swa",
"Latn",
}
m["xkn"] = {
"Kayan Sungai Kayan",
12473395,
"poz",
"Latn",
}
m["xko"] = {
"Kiorr",
6414519,
"mkh-pal",
}
m["xkp"] = {
"Kabatei",
34165,
"xme-ttc",
ancestors = "xme-ttc-cen",
}
m["xkq"] = {
"Koroni",
3199000,
"poz-btk",
}
m["xkr"] = {
"Xakriabá",
3073441,
"sai-cje",
"Latn",
}
m["xks"] = {
"Kumbewaha",
6443722,
}
m["xkt"] = {
"Kantosi",
35651,
"nic-dag",
}
m["xku"] = {
"Kaamba",
11042324,
"bnt-kng",
}
m["xkv"] = {
"Kgalagadi",
2088743,
"bnt-sts",
"Latn",
}
m["xkw"] = {
"Kembra",
12953627,
"paa-pau",
}
m["xkx"] = {
"Karore",
6373260,
"poz-ocw",
}
m["xky"] = {
"Uma' Lasan",
nil,
"poz-swa",
}
m["xkz"] = {
"Kurtöp",
3695193,
"sit-ebo",
"Tibt, Latn",
translit = {Tibt = "Tibt-translit"},
display_text = {Tibt = s["Tibt-displaytext"]},
entry_name = {Tibt = s["Tibt-entryname"]},
sort_key = {Tibt = "Tibt-sortkey"},
}
m["xla"] = {
"Kamula",
10957277,
"ngf",
}
m["xlb"] = {
"Loup B",
13108281,
"alg-eas",
"Latn",
}
m["xlc"] = {
"Lycia",
35969,
"ine-ana",
"Lyci",
translit = "Lyci-translit",
}
m["xld"] = {
"Lydia",
36095,
"ine-ana",
"Lydi",
translit = "Lydi-translit",
}
m["xle"] = {
"Lemnos",
36203,
"qfa-tyn",
"Ital",
translit = "Ital-translit",
}
m["xlg"] = {
"Liguria Purba",
36104,
"ine",
}
m["xli"] = {
"Liburni",
35835,
"ine",
}
--xln is etymology-only
m["xlo"] = {
"Loup A",
27921265,
"alg-eas",
"Latn",
}
m["xlp"] = {
"Lepontii",
35993,
"cel",
"Ital",
translit = "Ital-translit",
}
m["xls"] = {
"Lusitania",
35960,
"ine",
"Latn",
}
m["xlu"] = {
"Luwiya",
12634577,
"ine-ana",
"Xsux, Hluw",
}
m["xly"] = {
"Elymi",
35329,
nil,
"Grek",
}
m["xmb"] = {
"Mbonga",
36064,
"nic-jrn",
"Latn",
}
m["xmc"] = {
"Makhuwa-Marrevone",
11127231,
"bnt-mak",
ancestors = "vmw",
}
m["xmd"] = {
"Mbudum",
6799790,
"cdc-cbm",
"Latn",
}
m["xmf"] = {
"Mingrelia",
13359,
"ccs-zan",
"Geor",
translit = "Geor-translit",
override_translit = true,
}
m["xmg"] = {
"Mengaka",
36017,
"bai",
"Latn",
}
m["xmh"] = {
"Kugu-Muminh",
10549849,
"aus-pmn",
"Latn",
}
m["xmj"] = {
"Majera",
6737666,
"cdc-cbm",
"Latn",
}
m["xmk"] = {
"Macedonia Purba",
35974,
"grk",
"Polyt",
translit = "grc-translit",
entry_name = {remove_diacritics = c.macron .. c.breve},
sort_key = s["Grek-sortkey"],
}
m["xml"] = {
"Bahasa Isyarat Malaysia",
33420,
"sgn",
}
m["xmm"] = {
"Melayu Manado",
1068112,
"crp",
"Latn",
}
m["xmo"] = {
"Morerebi",
12953749,
"tup",
"Latn",
}
m["xmp"] = {
"Kuku-Mu'inh",
10549852,
nil,
"Latn",
}
m["xmq"] = {
"Kuku-Mangk",
10549851,
"aus-pam",
"Latn",
}
m["xmr"] = {
"Meroe",
13366,
"afa",
"Mero, Merc, Latn", -- we have entries in Latn
translit = "xmr-translit",
}
m["xms"] = {
"Bahasa Isyarat Maghribi",
6913107,
"sgn",
}
m["xmt"] = {
"Matbat",
6786187,
"poz-hce",
}
m["xmu"] = {
"Kamu",
6359779,
}
m["xmx"] = {
"Maden",
12952756,
"poz-hce",
}
m["xmy"] = {
"Mayaguduna",
3436736,
}
m["xmz"] = {
"Mori Bawah",
3324069,
"poz-btk",
"Latn",
}
m["xna"] = {
"Arab Utara Purba",
1472213,
"sem",
"Narb",
translit = "Narb-translit",
}
m["xnb"] = {
"Kanakanabu",
172244,
"map",
"Latn",
}
m["xng"] = {
"Mongol Pertengahan",
2582455,
"xgn",
"Mong, Phag, Hani, Arab, Armn",
translit = {Mong = "Mong-translit"},
display_text = {Mong = s["Mong-displaytext"]},
entry_name = {Mong = s["Mong-entryname"]},
sort_key = {Hani = "Hani-sortkey"},
}
m["xnh"] = {
"Kuanhua",
6441084,
"mkh-pal",
}
m["xni"] = {
"Ngarigu",
7022072,
"aus-yuk",
}
m["xnk"] = {
"Nganakarti",
33087049,
}
m["xnn"] = {
"Kankanay Utara",
12953609,
"phi",
}
-- "xno" IS TREATED AS "fro", SEE WT:LT
m["xnr"] = {
"Kangri",
2331560,
"him",
"Deva, Takr, fa-Arab",
ancestors = "doi",
translit = "hi-translit",
}
m["xns"] = {
"Kanashi",
6360672,
"sit-whm",
}
m["xnt"] = {
"Narragansett",
3336118,
"alg-eas",
"Latn",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron},
}
m["xnu"] = {
"Nukunul",
7068904,
}
m["xny"] = {
"Nyiyaparli",
16919427,
"aus-nga",
"Latn",
}
m["xoc"] = {
"O'chi'chi'",
3813833,
"nic-cde",
"Latn",
}
m["xod"] = {
"Kokoda",
6426734,
"ngf-sbh",
}
m["xog"] = {
"Soga",
33784,
"bnt-nyg",
"Latn",
}
m["xoi"] = {
"Kominimung",
6428352,
"paa",
"Latn",
}
m["xok"] = {
"Xokleng",
3027930,
"sai-sje",
}
m["xom"] = {
"Komo",
56681,
"ssa-kom",
}
m["xon"] = {
"Konkomba",
35674,
"nic-grm",
"Latn",
}
m["xoo"] = { -- contrast kzw, sai-kat, sai-xoc
"Xukurú",
9096758,
}
m["xop"] = {
"Kopar",
11732346,
}
m["xor"] = {
"Korubo",
3199022,
}
m["xow"] = {
"Kowaki",
6434920,
"ngf-mad",
}
m["xpa"] = {
"Pirriya",
16978087,
}
m["xpb"] = {
"Pyemmairre",
7262964,
nil,
"Latn",
}
m["xpc"] = {
"Pecheneg",
877881,
"trk",
}
m["xpd"] = {
"Paredarerme",
7136678,
nil,
"Latn",
}
m["xpe"] = {
"Liberia Kpelle",
20527226,
"dmn-msw",
ancestors = "kpe",
}
m["xpf"] = {
"Tasmania Tenggara",
7068421,
nil,
"Latn",
}
m["xpg"] = {
"Phrygia",
36751,
"ine",
"Grek",
translit = "grc-translit",
}
m["xph"] = {
"Tyerrernotepanner",
7859815,
nil,
"Latn",
}
m["xpi"] = {
"Pict",
856383,
"cel",
"Ogam, Latn",
}
m["xpj"] = {
"Mpalitjanh",
6928192,
"aus-pam",
}
m["xpk"] = {
"Kulina",
6443027,
"sai-pan",
}
m["xpl"] = {
"Port Sorell",
7230944,
nil,
"Latn",
}
m["xpm"] = {
"Pumpokol",
2991985,
"qfa-yen",
"Latn",
}
m["xpn"] = {
"Kapinawá",
6366667,
}
m["xpo"] = {
"Pochutec",
2427341,
"azc-nah",
"Latn",
}
m["xpp"] = {
"Puyo-Paekche",
nil,
}
m["xpq"] = {
"Mohegan-Pequot",
3319130,
"alg-eas",
"Latn",
}
m["xpr"] = {
"Parthia",
25953,
"ira-mpr",
"Prti, Mani, Phlv",
translit = {
Prti = "Prti-translit",
Mani = "Mani-translit",
},
}
m["xps"] = {
"Pisidia",
36580,
"ine-ana",
}
m["xpu"] = {
"Punik",
535958,
"sem-can",
"Phnx, Latn, Grek",
ancestors = "phn",
translit = {Phnx = "Phnx-translit"},
}
m["xpv"] = {
"Tommeginne",
7819095,
nil,
"Latn",
}
m["xpw"] = {
"Peerapper",
7160431,
nil,
"Latn",
}
m["xpx"] = {
"Toogee",
7824008,
nil,
"Latn",
}
m["xpy"] = {
"Buyeo",
5003359,
"qfa-kor",
"Hani",
sort_key = "Hani-sortkey",
}
m["xpz"] = {
"Pulau Bruny",
4979601,
nil,
"Latn",
}
m["xqa"] = {
"Karakhanid",
nil,
"trk-kar",
"Arab",
entry_name = "ar-entryname",
}
m["xqt"] = {
"Qatabanian",
384101,
"sem-osa",
"Sarb",
translit = "Sarb-translit",
}
m["xra"] = {
"Krahô",
3199549,
"sai-nje",
"Latn",
}
m["xrb"] = {
"Karaboro Timur",
35716,
"alv-krb",
}
m["xrd"] = {
"Gundungurra",
nil,
}
m["xre"] = {
"Kreye",
3199686,
"sai-nje",
}
m["xrg"] = {
"Minang",
22893424,
}
m["xri"] = {
"Krikati-Timbira",
3199710,
}
m["xrm"] = {
"Armazic",
7599646,
}
m["xrn"] = {
"Arin",
34088,
"qfa-yen",
"Latn",
}
m["xrq"] = {
"Karranga",
6373349,
nil,
"Latn",
}
m["xrr"] = {
"Raetic",
36689,
nil,
"Ital",
translit = "Ital-translit",
}
m["xrt"] = {
"Aranama-Tamique",
2859505,
}
m["xru"] = {
"Marriammu",
10577724,
"aus-dal",
}
m["xrw"] = {
"Karawa",
6368857,
"paa-spk",
}
m["xsa"] = {
"Sabaean",
1070391,
"sem-osa",
"Sarb",
translit = "Sarb-translit",
}
m["xsb"] = {
"Sambal",
2592378,
"phi",
"Latn",
}
m["xsd"] = {
"Sidetic",
36659,
"ine-ana",
}
m["xse"] = {
"Sempan",
3504358,
}
m["xsh"] = {
"Shamang",
3914876,
"nic-plc",
}
m["xsi"] = {
"Sio",
3485100,
"poz-ocw",
}
m["xsj"] = {
"Subi",
7631298,
"bnt-haj",
}
m["xsl"] = {
"Slavey Selatan",
28552,
"ath-nor",
"Latn",
}
m["xsm"] = {
"Kasem",
35552,
"nic-gnn",
}
m["xsn"] = {
"Sanga (Nigeria)",
3915334,
"nic-jer",
"Latn",
}
m["xso"] = {
"Solano",
2474492,
nil,
"Latn",
}
m["xsp"] = {
"Silopi",
7515533,
"ngf-mad",
}
m["xsq"] = {
"Makhuwa-Saka",
11008159,
"bnt-mak",
ancestors = "vmw",
}
m["xsr"] = {
"Sherpa",
36612,
"sit-tib",
"Tibt, Deva",
ancestors = "xct",
translit = {
Tibt = "Tibt-translit",
Deva = "xsr-Deva-translit",
},
override_translit = true,
display_text = {Tibt = s["Tibt-displaytext"]},
entry_name = {Tibt = s["Tibt-entryname"]},
sort_key = {Tibt = "Tibt-sortkey"},
}
m["xss"] = {
"Assan",
34089,
"qfa-yen",
"Latn",
}
m["xsu"] = {
"Sanumá",
251728,
"sai-ynm",
"Latn",
}
m["xsv"] = {
"Sudovian",
35603,
"bat",
"Latn",
}
m["xsy"] = {
"Saisiyat",
716695,
"map",
"Latn",
}
m["xta"] = {
"Alcozauca Mixtec",
25559587,
"omq-mxt",
"Latn",
}
m["xtb"] = {
"Chazumba Mixtec",
12182838,
"omq-mxt",
"Latn",
}
m["xtc"] = {
"Kadugli",
3407136,
"qfa-kad",
"Latn",
}
m["xtd"] = {
"Diuxi-Tilantongo Mixtec",
7802048,
"omq-mxt",
"Latn",
}
m["xte"] = {
"Ketengban",
10990152,
}
m["xth"] = {
"Yitha Yitha",
nil,
}
m["xti"] = {
"Sinicahua Mixtec",
12953733,
"omq-mxt",
"Latn",
}
m["xtj"] = {
"San Juan Teita Mixtec",
32093049,
"omq-mxt",
"Latn",
}
m["xtl"] = {
"Tijaltepec Mixtec",
12953738,
"omq-mxt",
"Latn",
}
m["xtm"] = {
"Mixtec Magdalena Peñasco",
7179700,
"omq-mxt",
"Latn",
}
m["xtn"] = {
"Mixtec Tlaxiaco Utara",
25559585,
"omq-mxt",
"Latn",
}
m["xto"] = {
"Tocharia A",
2827041,
"ine-toc",
"Latn",
wikipedia_article = "Tocharian languages", -- wikidata id has no associated article
}
m["xtp"] = {
"Mixtec San Miguel Piedras",
7414970,
"omq-mxt",
"Latn",
}
m["xtq"] = {
"Tumshuq",
nil,
"xsc-sak",
"Brah, Khar",
translit = "Brah-translit",
}
m["xtr"] = {
"Tripuri Awal",
nil,
}
m["xts"] = {
"Mixtec Sindihui",
13583581,
"omq-mxt",
"Latn",
}
m["xtt"] = {
"Mixtec Tacahua",
7673668,
"omq-mxt",
"Latn",
}
m["xtu"] = {
"Mixtec Cuyamecalco",
12953726,
"omq-mxt",
"Latn",
}
m["xtv"] = {
"Thawa",
7711494,
}
m["xtw"] = {
"Tawandê",
nil,
"sai-nmk",
"Latn",
}
m["xty"] = {
"Mixtec Yoloxochitl",
8054817,
"omq-mxt",
"Latn",
}
m["xtz"] = {
"Tasmania",
530739,
nil,
"Latn",
}
m["xua"] = {
"Kurumba Alu",
12952679,
"dra",
}
m["xub"] = {
"Kurumba Betta",
16841033,
"dra",
"Knda, Mlym, Taml",
}
m["xud"] = {
"Umiida",
61999874,
"aus-wor",
"Latn",
}
m["xug"] = {
"Kunigami",
56558,
"jpx-ryu",
"Jpan",
translit = s["Jpan-translit"],
sort_key = s["Jpan-sortkey"],
}
m["xuj"] = {
"Jennu Kurumba",
21282543,
"dra",
}
m["xul"] = {
"Ngunawal",
7022712,
"aus-yuk",
"Latn",
}
m["xum"] = {
"Umbri",
36957,
"itc-sbl",
"Ital, Latn",
translit = "Ital-translit",
}
m["xun"] = {
"Unggaranggu",
61999823,
"aus-wor",
"Latn",
}
m["xuo"] = {
"Kuo",
6445233,
"alv-mbm",
}
m["xup"] = {
"Upper Umpqua",
20607,
"ath-pco",
"Latn",
}
m["xur"] = {
"Urartian",
36934,
"qfa-hur",
"Xsux",
}
m["xut"] = {
"Kuthant",
6448417,
}
m["xuu"] = {
"Khwe",
28305,
"khi-kal",
"Latn",
}
m["xve"] = {
"Venetic",
36871,
"ine",
"Ital",
translit = "Ital-translit",
}
-- m["xvi"] = { "Kamviri", 1193495, "nur-nor", Arab } moved to etym-only code
m["xvn"] = {
"Vandalic",
36835,
"gme",
"Latn",
}
m["xvo"] = {
"Volscian",
622110,
"itc-sbl",
"Latn",
}
m["xvs"] = {
"Vestinian",
2576407,
"itc",
"Latn",
}
m["xwa"] = {
"Kwaza",
3200839,
}
m["xwc"] = {
"Woccon",
3569569,
"nai-cat",
"Latn",
}
m["xwd"] = {
"Wadi Wadi",
7959249,
}
m["xwe"] = {
"Xwela Gbe",
36887,
"alv-pph",
}
m["xwg"] = {
"Kwegu",
56723,
"sdv",
}
m["xwj"] = {
"Wajuk",
33110188,
}
m["xwk"] = {
"Wangkumara",
7967891,
"aus-pam",
"Latn",
}
m["xwl"] = {
"Gbe Xwla Barat",
36924,
"alv-pph",
"Latn",
}
m["xwo"] = {
"Oirat Bertulis",
56959,
"xgn-cen",
"xwo-Mong",
translit = "xwo-translit",
}
m["xwr"] = {
"Kwerba Mamberamo",
6450325,
"paa-tkw",
}
m["xww"] = {
"Wemba-Wemba",
18472819,
"aus-pam",
"Latn",
}
m["xxb"] = {
"Boro",
16844787,
nil,
"Latn",
}
m["xxk"] = {
"Ke'o",
3195346,
}
m["xxm"] = {
"Minkin",
6867836,
}
m["xxr"] = {
"Koropó",
6432560,
}
m["xxt"] = {
"Tambora",
36711,
"paa",
"Latn",
}
m["xya"] = {
"Yaygir",
8050525,
"aus-pam",
}
m["xyb"] = {
"Yandjibara",
nil,
nil,
"Latn",
}
m["xyl"] = {
"Yalakalore",
12645352,
"sai-nmk",
"Latn",
}
m["xyt"] = {
"Mayi-Thakurti",
47004719,
"aus-pam",
"Latn",
}
m["xyy"] = {
"Yorta Yorta",
8055849,
"aus-pam",
"Latn",
}
m["xzh"] = {
"Zhang-Zhung",
3437292,
"sit-alm",
"xzh-Tibt, Marc",
display_text = {["xzh-Tibt"] = s["Tibt-displaytext"]},
entry_name = {["xzh-Tibt"] = s["Tibt-entryname"]},
}
m["xzm"] = {
"Zemgalia",
47631,
"bat",
}
m["xzp"] = {
"Zapotec Purba",
nil,
}
return require("Module:languages").finalizeData(m, "language")
n7ptvgbjsx7mexi60i8l6fpi3knkhmi
Modul:kanjitab
828
9911
281363
276293
2026-04-22T06:49:27Z
Hakimi97
2668
281363
Scribunto
text/plain
local export = {}
local m_str_utils = require("Module:string utilities")
local m_utilities = require("Module:utilities")
local m_ja = require("Module:ja")
local show_labels = require("Module:labels").show_labels
--[=[
Other modules used: [[Module:parameters]]
]=]
local concat = table.concat
local convert_iteration_marks = require("Module:Hani").convert_iteration_marks
local find = string.find
local gsplit = m_str_utils.gsplit
local gsub = string.gsub
local kata_to_hira = m_ja.kata_to_hira
local insert = table.insert
local match = string.match
local remove = table.remove
local split = m_str_utils.split
local sub = string.sub
local ugsub = mw.ustring.gsub
local ulen = m_str_utils.len
local umatch = mw.ustring.match
local usub = m_str_utils.sub
local PAGENAME = mw.loadData("Module:headword/data").pagename
local NAMESPACE = mw.title.getCurrentTitle().nsText
local d_range = mw.loadData("Module:ja/data/range")
local yomi_data = mw.loadData("Module:kanjitab/data")
local kanji_grade_links = {
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 1]]",
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 2]]",
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 3]]",
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 4]]",
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 5]]",
"[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 6]]",
"[[Lampiran:Glosari_bahasa_Jepun#jōyō_kanji|Gred: S]]", -- 7
"[[Lampiran:Glosari_bahasa_Jepun#jinmeiyō_kanji|Jinmeiyō]]", -- 8
"[[Lampiran:Glosari_bahasa_Jepun#hyōgaiji|Hyōgaiji]]" -- 9
}
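-- Entry point called from templates. The language code is read from the #invoke arguments (frame.args[1]); all other parameters come from the calling template (the parent frame).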
function export.show(frame)
local args = require("Module:parameters").process(frame:getParent().args, {
[1] = { list = true, allow_holes = true },
k = { list = true, allow_holes = true },
o = { list = true, allow_holes = true },
r = {},
sort = {},
yomi = {},
ateji = {},
alt = {},
alt2 = {},
kyu = { list = true },
y = {alias_of = "yomi"},
clearright = {type = "boolean"},
pagename = {},
})
local lang_code = frame.args[1]
local lang = require("Module:languages").getByCode(lang_code)
local lang_name = lang:getCanonicalName()
if args.pagename and NAMESPACE == "" then
require("Module:debug/track")("kanjitab/pagename param in mainspace")
end
local pagename = args.pagename or PAGENAME
local categories = {}
local cells = {}
-- extract kanji and non-kanji
local kanji = {}
local non_kanji = {}
-- 々 and 〻
pagename = convert_iteration_marks(pagename)
local kanji_border = 1
ugsub(pagename, "()([" .. d_range.kanji .. "々〻])()", function(p1, w1, p2)
insert(non_kanji, usub(pagename, kanji_border, p1 - 1))
kanji_border = p2
insert(kanji, w1)
end)
insert(non_kanji, usub(pagename, kanji_border))
-- kyujitai
local kyu = args.kyu
if kyu[1] == "-" then
kyu = {}
elseif kyu[1] == nil then
local form_kyu = {non_kanji[1]}
local kyu_data = mw.loadData("Module:ja/data/kyu")
local has_kyu, has_kyu_nonsupple, has_shin = false, false, false
for i, v in ipairs(kanji) do
local v_kyu = match(kyu_data[1], v .. "(%S*)%s")
if v_kyu == nil then
insert(form_kyu, v)
elseif v_kyu == "" then
has_shin = true
break
elseif v_kyu:sub(1, 1) == "&" then
has_kyu = true
insert(form_kyu, v_kyu)
else
has_kyu, has_kyu_nonsupple = true, true
insert(form_kyu, v_kyu)
end
insert(form_kyu, non_kanji[i + 1])
end
if not has_shin and has_kyu then
kyu[1] = (has_kyu_nonsupple and "" or pagename .. "|") .. concat(form_kyu)
end
if find(pagename, "弁") then
require("Module:debug/track")("kanjitab/ambiguous kyujitai for 弁")
kyu[1] = "which 弁?"
end
end
local all_yomi, missing_yomi
if args.yomi then
all_yomi = {}
local keys = split(args.yomi, ",")
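for i, key in ipairs(keys) do
-- ipairs yields only (index, value), so the yomi type and length are parsed from each key here.
local yomi, len = match(key, "^(%l*)(%d*)$")
yomi = yomi and yomi_data[yomi] or error("The yomi type \"" .. (yomi or key) .. "\" in the input \"" .. args.yomi .. "\" is not recognized.")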
if len ~= "" then
-- Disallow length 0 or leading zeroes, as a sanity check.
len = match(len, "^[1-9]%d*$") and tonumber(len) or error("Cannot specify a length of " .. len .. " kanji.")
-- Only one yomi with no length given: apply to all kanji.
elseif i == 1 and #keys == 1 then
len = #kanji
else
len = 1
end
local yomi_type = yomi.type
-- Track readings given only as a generic "on" (rather than goon/kanon/toon/soon), as well as the compound types jūbakoyomi and yutōyomi.
if yomi_type == "on'yomi" then
require("Module:debug/track")("kanjitab/unspecified on")
elseif yomi_type == "jūbakoyomi" then
require("Module:debug/track")("kanjitab/jubakoyomi")
elseif yomi_type == "yutōyomi" then
require("Module:debug/track")("kanjitab/yutoyomi")
end
-- If the yomi requires a specific number of kanji (e.g. jūbakoyomi, yutōyomi).
local req_kanji = yomi.required_kanji
if req_kanji and #kanji ~= req_kanji then
error("The yomi type \"" .. yomi.type .. "\" is only applicable to terms with " .. req_kanji .. " kanji.")
elseif yomi.type == "none" then
missing_yomi = true
end
-- Insert yomi data for each applicable kanji. Wrap in a table first, as the range for this input yomi is determined by its identity, so that (e.g.) "kun,kun" is still treated as two separate inputs.
yomi = {data = yomi}
for _ = 1, len do
insert(all_yomi, yomi)
end
end
-- If there are any yomi slots left, handle them as empty.
if #all_yomi < #kanji then
missing_yomi = true
for _ = #all_yomi + 1, #kanji do
insert(all_yomi, {data = yomi_data.none})
end
end
elseif #kanji > 0 then
missing_yomi = true
end
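--[==[
Illustrative sketch only (not called by this module): one way to picture how a `yomi` value such as "kun,on2" expands into one slot per kanji in the block above. The name `expand_yomi` is hypothetical, the raw type strings stand in for the entries of Module:kanjitab/data, and string.gmatch stands in for the module's split.

local function expand_yomi(spec, kanji_count)
	local keys, slots = {}, {}
	for key in string.gmatch(spec, "[^,]+") do
		keys[#keys + 1] = key
	end
	for _, key in ipairs(keys) do
		local kind, len = string.match(key, "^(%l*)(%d*)$")
		-- An explicit count wins; a lone key covers every kanji; otherwise one kanji.
		len = len ~= "" and tonumber(len) or (#keys == 1 and kanji_count or 1)
		local slot = {data = kind} -- one shared table per input key
		for _ = 1, len do
			slots[#slots + 1] = slot
		end
	end
	return slots
end

Under these assumptions, expand_yomi("kun,on2", 3) returns three slots: the first holds its own table for "kun", and the last two share a single table for "on". Comparing slots by identity is what lets the display loop further down merge adjacent kanji into one colspan cell, while "kun,kun" still produces two separate cells.
]==]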
if missing_yomi then
insert(categories, "Perkataan kehilangan yomi bahasa " .. lang_name )
end
-- process readings
local readings = {}
local readings_actual = {}
local reading_length_total = 0
for i = 1, args[1].maxindex do
local reading_kana, reading_length = match(args[1][i] or "", "^(%D*)(%d*)$")
reading_kana = reading_kana ~= "" and reading_kana or nil
reading_length = reading_kana and tonumber(reading_length) or 1
insert(readings, {reading_kana, reading_length})
reading_length_total = reading_length_total + reading_length
end
if reading_length_total > #kanji then
error("Readings for " .. reading_length_total .. " kanji are given, but this word has only " .. #kanji .. " kanji.")
else
for _ = reading_length_total + 1, #kanji do
insert(readings, {nil, 1})
end
end
local table_head = [=[
{| class="wikitable kanji-table floatright" style="text-align: center; ]=] .. (args.clearright and " clear:right;" or "") .. [=["
! ]=] .. (#kanji > 1 and "colspan=\"" .. #kanji .. "\" " or "") .. [=[style="font-weight: normal;" | [[Lampiran:Glosari_bahasa_Jepun#kanji|Kanji]] dalam kata ini
|- lang="]=] .. lang_code .. [=[" class="Jpan" style="font-size: 2em; background: white; line-height: 1em;"
]=]
if args.k.maxindex and args.k.maxindex > args[1].maxindex then
error("kanjitab/too many k")
end
if args.o.maxindex and args.o.maxindex > args[1].maxindex then
error("kanjitab/too many o")
end
local is_ateji = {}
if args.ateji then
local ateji = args.ateji
local cat_ateji = false
if ateji == "y" then
for i = 1, #kanji do
is_ateji[i] = true
end
cat_ateji = true
else
for i in gsplit(ateji, ";") do
gsub(i, "^(%d+)$", function(a)
is_ateji[tonumber(a)] = true
cat_ateji = true
end)
gsub(i, "^(%d+),(%d+)$", function (a, b)
for j = tonumber(a), tonumber(b) do
is_ateji[j] = true
end
cat_ateji = true
end)
end
end
if cat_ateji then insert(categories, "Perkataan dieja dengan ateji bahasa " .. lang_name) end
end
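--[==[
Illustrative sketch only (not called by this module): the `ateji` parsing above restated as a standalone function. The name `parse_ateji` and its arguments are hypothetical, and string.gmatch stands in for the module's gsplit.

local function parse_ateji(spec, kanji_count)
	local is_ateji = {}
	if spec == "y" then
		-- "y" marks every kanji in the term as ateji.
		for i = 1, kanji_count do
			is_ateji[i] = true
		end
		return is_ateji
	end
	for item in string.gmatch(spec, "[^;]+") do
		-- Each item is either a range "a,b" or a single position "a".
		local a, b = string.match(item, "^(%d+),(%d+)$")
		a = a or string.match(item, "^(%d+)$")
		if a then
			for j = tonumber(a), tonumber(b or a) do
				is_ateji[j] = true
			end
		end
	end
	return is_ateji
end

With this sketch, parse_ateji("1;3,4", 4) marks positions 1, 3 and 4 and leaves position 2 unmarked. Items that match neither form are ignored, as in the code above.
]==]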
-- if hiragana readings were passed,
-- make the "spelled with ..." categories, the readings cells on the lower level and build the sort key
-- otherwise rely on the pagename to make the original kanjitab and categories
local cells_above = {}
local cells_below = {}
local kanji_pos = 1
for i, reading in ipairs(readings) do
local reading_kana, reading_length = reading[1], reading[2]
local cell = {}
if reading_length <= 1 then
insert(cell, "| rowspan=\"2\" | ")
else
insert(cell, "| colspan=\"" .. reading_length .. "\" | ")
end
-- display reading, actual reading and okurigana
if reading_kana then
if reading_kana ~= "" and reading_kana ~= "-" and umatch(reading_kana, "[^" .. d_range.kana .. "]") then
error("Please remove any non-kana characters from the reading input " .. reading_kana .. ".")
end
local actual_reading = args.k[i]
local okurigana = args.o[i]
local okurigana_text = okurigana and "(" .. okurigana .. ")" or ""
local actual_reading_text = actual_reading and " > " .. actual_reading .. okurigana_text or ""
local text = reading_kana .. okurigana_text .. actual_reading_text
readings_actual[i] = {(actual_reading or reading_kana) .. (okurigana or ""), reading_length}
insert(cell, "<span class=\"Jpan\" lang=\"" .. lang_code .. "\">" .. text .. "</span>")
if reading_length <= 1 then insert(cell, "<br/>") end
else
readings_actual[i] = {nil, 1}
end
-- display kanji grade, categorize
for j = kanji_pos, kanji_pos + reading_length - 1 do
local single_kanji = kanji[j]
local kanji_grade = m_ja.kanji_grade(single_kanji)
local ateji_text = is_ateji[j] and "<br/><small>([[Lampiran:Glosari bahasa Jepun#ateji|ateji]])</small>" or ""
local type, compound
if all_yomi then
local yomi = all_yomi[j].data
type, compound = yomi.type, yomi.compound_reading
end
if not reading_kana then
if type ~= "irregular" then
require("Module:debug/track")("kanjitab/no reading")
end
insert(categories, "Perkataan dieja dengan " .. single_kanji .. " bahasa " .. lang_name )
elseif reading_length ~= 1 or type == "irregular" then
insert(categories, "Perkataan dieja dengan " .. single_kanji .. " bahasa " .. lang_name )
elseif compound then
-- Re-enable once all bad jukujikun calls are fixed.
-- error("The yomi type \"" .. type .. "\" is only applicable to compound character readings, so cannot apply to " .. single_kanji .. " read as " .. reading_kana .. ". If this is intended as part of a " .. type .. " reading, please enter the whole reading as one, followed by the number of kanji it applies to.")
require("Module:debug/track")("kanjitab/single kanji with jukujikun")
else -- Subcategorize by reading.
insert(categories, "Perkataan dieja dengan " .. single_kanji .. " dibaca sebagai " .. kata_to_hira(reading_kana) .. " bahasa " .. lang_name )
end
if reading_length <= 1 then
insert(cell, "<small>" .. kanji_grade_links[kanji_grade] .. "</small>" .. ateji_text)
else
insert(cells_below, "| <small>" .. kanji_grade_links[kanji_grade] .. "</small>" .. ateji_text)
end
end
insert(cells_above, concat(cell))
kanji_pos = kanji_pos + reading_length
end
insert(cells, "|- style=\"background: white;\"")
if #cells_below > 0 then
insert(cells, concat(cells_above, "\n"))
insert(cells, "|- style=\"background: white;\"")
insert(cells, concat(cells_below, "\n"))
else
for i, v in ipairs(cells_above) do
cells_above[i] = gsub(v, "| rowspan=\"2\" | ", "| ")
end
insert(cells, concat(cells_above, "\n"))
end
local rendaku = args.r
if rendaku then
insert(categories, "Perkataan dengan rendaku bahasa " .. lang_name )
end
if all_yomi then
insert(cells, "|-")
local len, all_on, yomi_cat = 1, true
for i, yomi in ipairs(all_yomi) do
-- If the next kanji has the same yomi table, it's part of the same range.
if yomi == all_yomi[i + 1] then
len = len + 1
else
yomi = yomi.data
local yomi_type = yomi.type
local display = yomi.display or yomi_type
local appendix = yomi.appendix
insert(cells, "| colspan=\"" .. len .. "\" |" .. (
appendix == false and display or
"[[Lampiran:Glosari_bahasa_Jepun#" .. (appendix or yomi_type) .. "|" .. display .. "]]"
))
-- Categorise as irregular if any irregular yomi are found; otherwise, categorise if all yomi are of the same type. If yomi are of different types but are all on, on'yomi is used as a fallback.
if yomi_cat ~= "irregular" then
local cat_type = yomi_type
if cat_type == "irregular" or yomi_cat == nil then
yomi_cat = cat_type
elseif yomi_cat ~= cat_type then
yomi_cat = false
end
if not yomi.onyomi then
all_on = false
end
end
len = 1
end
end
if yomi_cat then
-- Check yomi_data first, in case cat_type is "irregular"; if no match, must be some other type, so get it from the first yomi in all_yomi, since not all yomi types are yomi_data keys.
yomi_cat = yomi_data[yomi_cat] or all_yomi[1].data
elseif all_on then
yomi_cat = yomi_data.on
elseif #all_yomi == 2 then
local y1, y2 = all_yomi[1].data, all_yomi[2].data
if ulen(pagename) == 2 then
if y1.onyomi and y2.type == "kun'yomi" then
yomi_cat = yomi_data.j -- jūbakoyomi
elseif y1.type == "kun'yomi" and y2.onyomi then
yomi_cat = yomi_data.y -- yutōyomi
end
end
end
if yomi_cat then
local category = yomi_cat.reading_category
if category ~= false then
insert(categories, "Perkataan dengan bacaan " .. (category or yomi_cat.type) .. " bahasa " .. lang_name )
end
end
end
local kanji_table
if #kanji > 0 then
kanji_table = table_head
for _, v in ipairs(kanji) do
kanji_table = kanji_table .. "| style=\"padding: 0.5em;\" | [[" .. v .. "#" .. lang_name .. "|" .. v .. "]]\n"
end
kanji_table = kanji_table .. concat(cells, "\n") .. "\n|}"
else
kanji_table = ""
end
local forms_table = ""
if args.alt == "" or args.alt == "-" then args.alt = nil end
if kyu[1] or args.alt then
local forms = {}
-- |kyu=
if kyu[1] == "which 弁?" then
insert(forms, "<strong class=\"error\" style=\"font-size:75%;\">Sila tentukan kyujitai yang betul untuk 弁 dengan parameter \"kyu\".</strong>[[Kategori:Permohonan untuk pembersihan dalam masukan bahasa " .. lang_name .. "]]")
remove(kyu, 1)
end
for _, form in ipairs(kyu) do
local form_linkto, form_display = match(form, "^(.+)|(.+)$")
if not form_linkto then form_linkto, form_display = form, form end
insert(forms, concat{
"<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-family:游ゴシック, HanaMinA, sans-serif; font-size:140%;\">[[",
form_linkto,
form_linkto == pagename and "|" or "#" .. lang_name .. "|",
form_display,
"]]</span> <small>",
show_labels {labels = {"kyūjitai"}, lang = lang, nocat = true },
"</small>",
})
end
-- |alt=
if args.alt then
for form in gsplit(args.alt, ",") do
local i_colon = find(form, ":")
if i_colon then
local altform = sub(form, 1, i_colon - 1)
local altlabels = split(sub(form, i_colon + 1), " ")
insert(forms, concat{
"<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-size:140%\">[[",
altform,
"#" .. lang_name .. "|",
altform,
"]]</span> <small>",
show_labels { labels = altlabels, lang = lang, nocat = true },
"</small>",
})
else
insert(forms, concat{
"<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-size:140%\">[[",
form,
"#" .. lang_name .. "|",
form,
"]]</span>"
})
end
end
end
forms_table = "\n" .. [[{| class="wikitable floatright"
! style="font-weight:normal" | Ejaan alternatif]] .. "\n" .. [[
|-
| style="text-align:center;font-size:108%" | ]] .. concat(forms, "<br>") .. "\n|}"
end
local forms_table2 = ""
if args.alt2 and args.alt2 ~= "" and args.alt2 ~= "-" then
local forms2 = {}
for form in gsplit(args.alt2, ",") do
insert(forms2, "<span class=\"Jpan\" lang=\"" .. lang_code .. "\">[[" .. form .. "#" .. lang_name .. "|" .. form .. "]]</span>")
end
forms_table2 = "\n" .. [[{| class="wikitable floatright"
! style="font-weight:normal" | Bentuk varian]] .. "\n" .. [[
| style="text-align:center;font-size:140%" | ]] .. concat(forms2, "<br>") .. "\n|}"
end
-- use user-provided sortkey if we got one, otherwise
-- use the sortkey we've already made by combining the
-- readings if provided, if we have neither then
-- default to empty string and don't sort
local sortkey
if args.sort then
sortkey = args.sort
else
sortkey = {non_kanji[1]}
local id = 1
for _, v in ipairs(readings_actual) do
id = id + v[2]
insert(sortkey, (v[1] or "") .. (non_kanji[id] or ""))
end
sortkey = concat(sortkey)
end
if sortkey == "" then
sortkey = nil
else
sortkey = lang:makeSortKey(sortkey)
end
if sortkey ~= lang:makeSortKey(PAGENAME) then
require("Module:debug/track"){"kanjitab/nonstandard sortkey", "kanjitab/nonstandard sortkey/" .. lang_code}
end
return kanji_table .. forms_table .. forms_table2 .. m_utilities.format_categories(categories, lang, sortkey)
end
return export
d7dyk2wp790fv7kaubmbw5u1s63ttw2
نام
0
10331
281300
268196
2026-04-21T15:41:05Z
Hakimi97
2668
/* Keturunan */
281300
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-kn}}
# [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain).
#: ''Dia mencatat nama orang yang menderma di dalam senarai itu.''
# {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]].
#: ''Kerana jasanya yang besar, dia diberi nama Datuk.''
# kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]].
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.''
=== Sebutan ===
* {{dewan|nã|mã}}
* {{rhymes|ms|ma|a}}
* {{penyempangan|ms|na|ma}}
=== Tulisan Rumi ===
[[nama]]
=== Terbitan ===
* {{l|ms|برنام}}
* {{l|ms|ترنام}}
* {{l|ms|دناماکن}}
* {{ARchar|دناما<sup>ء</sup>ي}}
* {{l|ms|ڤنام}}
* {{ARchar|ڤناما<sup>ء</sup>ن}}
* {{l|ms|مناماکن}}
* {{ARchar|مناما<sup>ء</sup>ي}}
=== Kata majmuk ===
* {{l|ms|براوليه نام}}
* {{l|ms|مڠمبيل نام}}
* {{l|ms|منچاري نام}}
* {{l|ms|منداڤت نام}}
* {{l|ms|نام باتڠ توبوه}}
* {{ARchar|نام با<sup>ء</sup>يق}}
* {{l|ms|نام بندا}}
* {{l|ms|نام تيمڠن}}
* {{l|ms|نام جولوقن}}
* {{l|ms|نام چمبوڠ}}
* {{l|ms|نام خاص}}
* {{l|ms|نام داݢيڠ}}
* {{l|ms|نام سامرن}}
* {{l|ms|نام عام}}
* {{l|ms|نام ڤيديڠن}}
* {{l|ms|نام ڤينا}}
* {{l|ms|نام کچيل}}
* {{l|ms|نام کلوارݢ}}
=== Keturunan ===
{{atas2}}
* {{etyl|ms|abs|nama}}
{{tengah2}}
* {{etyl|ms|id|nama}}
{{ter-bawah}}
=== Tesaurus ===
; Sinonim: {{l|ms|اسم}}
[[Kategori:Tulisan Jawi]]
==Bahasa Arab==
==== Kata kerja ====
{{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ar|ينام|يَنَامُ}}
# untuk [[tidur]]
# untuk pergi ke [[katil]]
# untuk pergi ke [[tidur]]
# untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]]
# untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]]
# untuk menjadi [[kebas]]
# untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]]
# untuk [[melupakan]]
# untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]]
# untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam
=== Etimologi ===
Daripada akar {{ar-akar|ن|و|م}}
=== Lihat juga ===
* {{l|ar|نوام}}
* {{l|ar|نوم}}
* {{l|ar|نومي}}
* {{l|ar|نومة}}
* {{l|ar|نؤوم}}
== Bahasa Baluchi ==
==== Kata nama ====
{{head|bal|Kata nama|tr=nám}}
# [[nama]]
# [[reputasi]]
# [[designasi]]
=== Etimologi ===
Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}.
==Bahasa Parsi==
{{wikipedia|lang=fa|sc=fa-Arab}}
==== Kata nama ====
{{fa-regional|نام|نام|ном}}
{{fa-kn|tr=nām||pl=نام ها|pltr=nām-hā}}
# [[nama]]
# [[reputasi]]
=== Etimologi ===
Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}).
=== Sebutan ===
* {{a|IR}} {{AFA|fa|[nɒːm]}}
* {{audio|fa|Fa-نام.ogg|audio}}
=== Fleksi ===
{{fa-decl-c|nâm|poss=+}}
=== Terbitan ===
* {{l|fa|sc=fa-Arab|نامی|tr=nâmi}}
* {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}}
* {{l|fa|sc=fa-Arab|بنام|tr=benâm}}
=== Lihat juga ===
* {{l|fa|sc=fa-Arab|اسم|tr=esm}}
== Bahasa Urdu ==
==== Kata nama ====
{{ur-kn|g=m|tr=nām|hi=नाम}}
# [[nama]]
=== Etimologi ===
Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}.
=== Deklensi ===
{{ur-noun-c-c|نام|nām}}
oegn7sp9uo704mlzu5ketjruq0vx31a
281301
281300
2026-04-21T15:41:28Z
Hakimi97
2668
/* Etimologi */
281301
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-kn}}
# [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain).
#: ''Dia mencatat nama orang yang menderma di dalam senarai itu.''
# {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]].
#: ''Kerana jasanya yang besar, dia diberi nama Datuk.''
# kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]].
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.''
=== Sebutan ===
* {{dewan|nã|mã}}
* {{rhymes|ms|ma|a}}
* {{penyempangan|ms|na|ma}}
=== Tulisan Rumi ===
[[nama]]
=== Terbitan ===
* {{l|ms|برنام}}
* {{l|ms|ترنام}}
* {{l|ms|دناماکن}}
* {{ARchar|دناما<sup>ء</sup>ي}}
* {{l|ms|ڤنام}}
* {{ARchar|ڤناما<sup>ء</sup>ن}}
* {{l|ms|مناماکن}}
* {{ARchar|مناما<sup>ء</sup>ي}}
=== Kata majmuk ===
* {{l|ms|براوليه نام}}
* {{l|ms|مڠمبيل نام}}
* {{l|ms|منچاري نام}}
* {{l|ms|منداڤت نام}}
* {{l|ms|نام باتڠ توبوه}}
* {{ARchar|نام با<sup>ء</sup>يق}}
* {{l|ms|نام بندا}}
* {{l|ms|نام تيمڠن}}
* {{l|ms|نام جولوقن}}
* {{l|ms|نام چمبوڠ}}
* {{l|ms|نام خاص}}
* {{l|ms|نام داݢيڠ}}
* {{l|ms|نام سامرن}}
* {{l|ms|نام عام}}
* {{l|ms|نام ڤيديڠن}}
* {{l|ms|نام ڤينا}}
* {{l|ms|نام کچيل}}
* {{l|ms|نام کلوارݢ}}
=== Keturunan ===
{{atas2}}
* {{etyl|ms|abs|nama}}
{{tengah2}}
* {{etyl|ms|id|nama}}
{{ter-bawah}}
=== Tesaurus ===
; Sinonim: {{l|ms|اسم}}
[[Kategori:Tulisan Jawi]]
==Bahasa Arab==
==== Kata kerja ====
{{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ar|ينام|يَنَامُ}}
# untuk [[tidur]]
# untuk pergi ke [[katil]]
# untuk pergi ke [[tidur]]
# untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]]
# untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]]
# untuk menjadi [[kebas]]
# untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]]
# untuk [[melupakan]]
# untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]]
# untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam
=== Etimologi ===
Daripada akar {{ar-akar|ن و م}}
=== Lihat juga ===
* {{l|ar|نوام}}
* {{l|ar|نوم}}
* {{l|ar|نومي}}
* {{l|ar|نومة}}
* {{l|ar|نؤوم}}
== Bahasa Baluchi ==
==== Kata nama ====
{{head|bal|Kata nama|tr=nám}}
# [[nama]]
# [[reputasi]]
# [[designasi]]
=== Etimologi ===
Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}.
==Bahasa Parsi==
{{wikipedia|lang=fa|sc=fa-Arab}}
==== Kata nama ====
{{fa-regional|نام|نام|ном}}
{{fa-kn|tr=nām||pl=نام ها|pltr=nām-hā}}
# [[nama]]
# [[reputasi]]
=== Etimologi ===
Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}).
=== Sebutan ===
* {{a|IR}} {{AFA|fa|[nɒːm]}}
* {{audio|fa|Fa-نام.ogg|audio}}
=== Fleksi ===
{{fa-decl-c|nâm|poss=+}}
=== Terbitan ===
* {{l|fa|sc=fa-Arab|نامی|tr=nâmi}}
* {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}}
* {{l|fa|sc=fa-Arab|بنام|tr=benâm}}
=== Lihat juga ===
* {{l|fa|sc=fa-Arab|اسم|tr=esm}}
== Bahasa Urdu ==
==== Kata nama ====
{{ur-kn|g=m|tr=nām|hi=नाम}}
# [[nama]]
=== Etimologi ===
Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}.
=== Deklensi ===
{{ur-noun-c-c|نام|nām}}
9y13su2738dl8zy9o01tolrc9kzopow
281302
281301
2026-04-21T15:42:01Z
Hakimi97
2668
/* Kata nama */
281302
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-kn}}
# [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain).
#: ''Dia mencatat nama orang yang menderma di dalam senarai itu.''
# {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]].
#: ''Kerana jasanya yang besar, dia diberi nama Datuk.''
# kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]].
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.''
=== Sebutan ===
* {{dewan|nã|mã}}
* {{rhymes|ms|ma|a}}
* {{penyempangan|ms|na|ma}}
=== Tulisan Rumi ===
[[nama]]
=== Terbitan ===
* {{l|ms|برنام}}
* {{l|ms|ترنام}}
* {{l|ms|دناماکن}}
* {{ARchar|دناما<sup>ء</sup>ي}}
* {{l|ms|ڤنام}}
* {{ARchar|ڤناما<sup>ء</sup>ن}}
* {{l|ms|مناماکن}}
* {{ARchar|مناما<sup>ء</sup>ي}}
=== Kata majmuk ===
* {{l|ms|براوليه نام}}
* {{l|ms|مڠمبيل نام}}
* {{l|ms|منچاري نام}}
* {{l|ms|منداڤت نام}}
* {{l|ms|نام باتڠ توبوه}}
* {{ARchar|نام با<sup>ء</sup>يق}}
* {{l|ms|نام بندا}}
* {{l|ms|نام تيمڠن}}
* {{l|ms|نام جولوقن}}
* {{l|ms|نام چمبوڠ}}
* {{l|ms|نام خاص}}
* {{l|ms|نام داݢيڠ}}
* {{l|ms|نام سامرن}}
* {{l|ms|نام عام}}
* {{l|ms|نام ڤيديڠن}}
* {{l|ms|نام ڤينا}}
* {{l|ms|نام کچيل}}
* {{l|ms|نام کلوارݢ}}
=== Keturunan ===
{{atas2}}
* {{etyl|ms|abs|nama}}
{{tengah2}}
* {{etyl|ms|id|nama}}
{{ter-bawah}}
=== Tesaurus ===
; Sinonim: {{l|ms|اسم}}
[[Kategori:Tulisan Jawi]]
==Bahasa Arab==
==== Kata kerja ====
{{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ar|ينام|يَنَامُ}}
# untuk [[tidur]]
# untuk pergi ke [[katil]]
# untuk pergi ke [[tidur]]
# untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]]
# untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]]
# untuk menjadi [[kebas]]
# untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]]
# untuk [[melupakan]]
# untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]]
# untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam
=== Etimologi ===
Daripada akar {{ar-akar|ن و م}}
=== Lihat juga ===
* {{l|ar|نوام}}
* {{l|ar|نوم}}
* {{l|ar|نومي}}
* {{l|ar|نومة}}
* {{l|ar|نؤوم}}
== Bahasa Baluchi ==
==== Kata nama ====
{{head|bal|Kata nama|tr=nám}}
# [[nama]]
# [[reputasi]]
# [[designasi]]
=== Etimologi ===
Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}.
==Bahasa Parsi==
{{wikipedia|lang=fa|sc=fa-Arab}}
==== Kata nama ====
{{fa-regional|نام|نام|ном}}
{{fa-kn|tr=nām||pl=نام ها|tr=nām-hā}}
# [[nama]]
# [[reputasi]]
=== Etimologi ===
Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}).
=== Sebutan ===
* {{a|IR}} {{AFA|fa|[nɒːm]}}
* {{audio|fa|Fa-نام.ogg|audio}}
=== Fleksi ===
{{fa-decl-c|nâm|poss=+}}
=== Terbitan ===
* {{l|fa|sc=fa-Arab|نامی|tr=nâmi}}
* {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}}
* {{l|fa|sc=fa-Arab|بنام|tr=benâm}}
=== Lihat juga ===
* {{l|fa|sc=fa-Arab|اسم|tr=esm}}
== Bahasa Urdu ==
==== Kata nama ====
{{ur-kn|g=m|tr=nām|hi=नाम}}
# [[nama]]
=== Etimologi ===
Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}.
=== Deklensi ===
{{ur-noun-c-c|نام|nām}}
861wup5q0ygbjyj08z6gf9zddv7yocj
281303
281302
2026-04-21T15:42:28Z
Hakimi97
2668
/* Kata nama */
281303
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-kn}}
# [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain).
#: ''Dia mencatat nama orang yang menderma di dalam senarai itu.''
# {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]].
#: ''Kerana jasanya yang besar, dia diberi nama Datuk.''
# kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]].
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.''
=== Sebutan ===
* {{dewan|nã|mã}}
* {{rhymes|ms|ma|a}}
* {{penyempangan|ms|na|ma}}
=== Tulisan Rumi ===
[[nama]]
=== Terbitan ===
* {{l|ms|برنام}}
* {{l|ms|ترنام}}
* {{l|ms|دناماکن}}
* {{ARchar|دناما<sup>ء</sup>ي}}
* {{l|ms|ڤنام}}
* {{ARchar|ڤناما<sup>ء</sup>ن}}
* {{l|ms|مناماکن}}
* {{ARchar|مناما<sup>ء</sup>ي}}
=== Kata majmuk ===
* {{l|ms|براوليه نام}}
* {{l|ms|مڠمبيل نام}}
* {{l|ms|منچاري نام}}
* {{l|ms|منداڤت نام}}
* {{l|ms|نام باتڠ توبوه}}
* {{ARchar|نام با<sup>ء</sup>يق}}
* {{l|ms|نام بندا}}
* {{l|ms|نام تيمڠن}}
* {{l|ms|نام جولوقن}}
* {{l|ms|نام چمبوڠ}}
* {{l|ms|نام خاص}}
* {{l|ms|نام داݢيڠ}}
* {{l|ms|نام سامرن}}
* {{l|ms|نام عام}}
* {{l|ms|نام ڤيديڠن}}
* {{l|ms|نام ڤينا}}
* {{l|ms|نام کچيل}}
* {{l|ms|نام کلوارݢ}}
=== Keturunan ===
{{atas2}}
* {{etyl|ms|abs|nama}}
{{tengah2}}
* {{etyl|ms|id|nama}}
{{ter-bawah}}
=== Tesaurus ===
; Sinonim: {{l|ms|اسم}}
[[Kategori:Tulisan Jawi]]
==Bahasa Arab==
==== Kata kerja ====
{{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ar|ينام|يَنَامُ}}
# untuk [[tidur]]
# untuk pergi ke [[katil]]
# untuk pergi ke [[tidur]]
# untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]]
# untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]]
# untuk menjadi [[kebas]]
# untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]]
# untuk [[melupakan]]
# untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]]
# untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam
=== Etimologi ===
Daripada akar {{ar-akar|ن و م}}
=== Lihat juga ===
* {{l|ar|نوام}}
* {{l|ar|نوم}}
* {{l|ar|نومي}}
* {{l|ar|نومة}}
* {{l|ar|نؤوم}}
== Bahasa Baluchi ==
==== Kata nama ====
{{head|bal|Kata nama|tr=nám}}
# [[nama]]
# [[reputasi]]
# [[designasi]]
=== Etimologi ===
Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}.
==Bahasa Parsi==
{{wikipedia|lang=fa|sc=fa-Arab}}
==== Kata nama ====
{{fa-regional|نام|نام|ном}}
{{fa-kn|tr=nām||pl=نام ها|tr=nām-hā}}
# [[nama]]
# [[reputasi]]
=== Etimologi ===
Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}).
=== Sebutan ===
* {{a|IR}} {{AFA|fa|[nɒːm]}}
* {{audio|fa|Fa-نام.ogg|audio}}
=== Fleksi ===
{{fa-decl-c|nâm|poss=+}}
=== Terbitan ===
* {{l|fa|sc=fa-Arab|نامی|tr=nâmi}}
* {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}}
* {{l|fa|sc=fa-Arab|بنام|tr=benâm}}
=== Lihat juga ===
* {{l|fa|sc=fa-Arab|اسم|tr=esm}}
== Bahasa Urdu ==
==== Kata nama ====
{{ur-kn|tr=nām|hi=नाम}}
# [[nama]]
=== Etimologi ===
Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}.
=== Deklensi ===
{{ur-noun-c-c|نام|nām}}
n67acfapspdl6uaede33p2rrghv4l0a
Templat:ar-akar
10
10334
281295
111416
2026-04-21T15:24:51Z
Hakimi97
2668
281295
wikitext
text/x-wiki
{{#invoke:sem-arb-utilities|root|lang=ar|plain=true}}<noinclude>{{documentation}}</noinclude>
iikovjqcrd38omyf9t2xqjiby9j55kt
Kategori:Bahasa Turki Usmaniyah
14
11029
281334
186546
2026-04-22T00:41:46Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Bahasa Turki Uthmaniyah]] ke [[Kategori:Bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
186546
wikitext
text/x-wiki
{{auto cat|Turki|extinct=1|setwikt=-}}
8inbrr93k67kuywhlw83a0joemasnmh
daun
0
11204
281246
242779
2026-04-21T13:32:01Z
Countryball mys123
9925
/* Bahasa Melayu */Tambah gambar
281246
wikitext
text/x-wiki
== Bahasa Melayu ==
=== Takrifan ===
[[File:Lisc lipy.jpg|210px|thumb|Sehelai daun]]
{{ms-kn|j=داءون}}
# Suatu [[helaian]] [[hidup]] pada [[tumbuhan]] yang bertanggungjawab memperoleh [[cahaya matahari]] untuk memperoleh tenaga bagi tumbuhan.
# Kepingan benda nipis.
=== Etimologi ===
Daripada {{inh|ms|poz-mly-pro|*daun}}, daripada {{inh|ms|poz-mcm-pro|*daun}}, daripada {{inh|ms|poz-msa-pro|*daun}}, daripada {{inh|ms|poz-pro|*dahun}}.
=== Sebutan ===
* {{dewan|daun}}
* {{a|Johor-Selangor}} {{IPA|ms|/daon/}}
* {{a|Riau-Lingga}} {{IPA|ms|/daʊn/}}
* {{rima|ms|aon|on}}
* {{audio|ms|Ms-MY-daun.ogg|Audio (MY)}}
===Tesaurus===
====Sinonim====
{{sinonim dialek|ms}}
=== Rujukan ===
* {{R:KD4}}
=== Pautan luar ===
* {{R:PRPM}}
{{C|ms|Botani}}
==Bahasa Iban==
===Takrifan===
====Kata nama====
{{inti|iba|kata nama}}
# daun
===Sebutan===
* {{AFA|iba|/daun/}}
===Rujukan===
{{R:KIMD2}}
==Bahasa Indonesia==
Lihat takrifan Bahasa Melayu
==Bahasa Melanau Tengah==
===Takrifan===
====Kata nama====
{{inti|mel|kata nama}}
# daun
===Etimologi===
{{inh+|mel|poz-swa-pro|*dahun}}, daripada {{inh|mel|poz-pro|*dahun}}.
===Sebutan===
* {{AFA|mel|/daun/}}
===Rujukan===
{{R:KMMD}}
dvmvjm3jnqsmxjdkfmw133jl6j60loh
Modul:category tree/topic/Communication
828
11523
281414
246758
2026-04-22T08:17:04Z
PeaceSeekers
3334
281414
Scribunto
text/plain
local labels = {}
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
-- FIXME: Lookup langs in the language list.
for _, lang_etc in ipairs {
"Arab", {"Cina", "Bahasa-bahasa Cina"}, "Inggeris", "Jerman", "Jepun", "Okinawa",
"Portugis", "Sepanyol", "Vietnam", {"Melayu", "Bahasa-bahasa Melayik"},
} do
if type(lang_etc) ~= "table" then
lang_etc = {lang_etc}
end
local lang, desc = unpack(lang_etc)
desc = desc or ("[[:Kategori:Bahasa %s|bahasa %s]]"):format(lang, lang)
labels["Bahasa " .. lang] = {
type = "berkenaan",
description = "=" .. desc,
parents = {"bahasa-bahasa"},
}
end
labels["komunikasi"] = {
type = "berkenaan",
description = "default",
parents = {"Semua topik"},
}
labels["huruf"] = {
type = "nama",
description = "default",
parents = {"sistem tulisan"},
}
labels["bahasa buatan"] = { -- distinguish from "cat:constructed languages" family category
type = "nama",
description = "={{w|constructed language}}s",
parents = {"bahasa-bahasa"},
}
labels["bahasa badan"] = {
type = "berkenaan",
description = "default",
parents = {"bahasa", "nonverbal communication"},
}
labels["penyiaran"] = {
type = "berkenaan",
description = "default",
parents = {"media", "telekomunikasi"},
}
labels["Komponen aksara Cina"] = {
type = "set",
description = "=[[komponen|Komponen]] [[aksara]] [[Cina]].",
parents = {"Huruf, simbol dan tanda baca"},
}
labels["diacritical marks"] = {
type = "set",
description = "default",
parents = {"Huruf, simbol dan tanda baca"},
}
labels["dialects"] = {
type = "set",
description = "default",
parents = {"bahasa"},
}
labels["dictation"] = {
type = "berkenaan",
description = "default",
parents = {"komunikasi"},
}
labels["bahasa pupus"] = {
type = "nama",
description = "default",
parents = {"bahasa-bahasa"},
}
labels["bahasa isyarat"] = {
type = "nama",
description = "default",
parents = {"bahasa-bahasa"},
}
labels["facial expressions"] = {
type = "set",
description = "default",
parents = {"nonverbal communication", "face"},
}
labels["kiasan"] = {
type = "set",
description = "=[[figure of speech|figures of speech]]",
parents = {"retorik"},
}
labels["bendera"] = {
type = "berkenaan,nama,jenis",
description = "default",
parents = {"komunikasi"},
}
labels["jargon"] = {
type = "berkenaan",
description = "default",
parents = {"bahasa"},
}
labels["aksara Han"] = {
type = "berkenaan",
description = "default",
parents = {"sistem tulisan"},
}
labels["bahasa"] = {
type = "berkenaan",
description = "default",
parents = {"komunikasi"},
}
labels["keluarga bahasa"] = {
type = "nama",
description = "Topik berkenaan [[keluarga bahasa]], termasuklah yang diterima dan yang bersifat kontroversi.",
parents = {"bahasa", "nama"},
}
labels["bahasa-bahasa"] = {
type = "nama",
description = "default",
parents = {"bahasa", "nama"},
}
labels["Huruf, simbol dan tanda baca"] = {
type = "set",
description = "=[[letter]]s, [[symbol]]s, and [[punctuation]]",
parents = {"ortografi"},
}
labels["logical fallacies"] = {
type = "set",
description = "=[[logical fallacy|logical fallacies]], clearly defined errors in reasoning used to support or refute an argument",
additional = "{{also|Kategori:{{{langcode}}}:biases}}",
parents = {"retorik", "logic"},
}
labels["media"] = {
type = "berkenaan",
description = "default",
parents = {"komunikasi"},
}
labels["telefon bimbit"] = {
type = "berkenaan,set",
description = "default",
parents = {"telefoni"},
}
labels["nonverbal communication"] = {
type = "berkenaan",
description = "default",
parents = {"komunikasi"},
}
labels["ortografi"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["palaeography"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["pos"] = {
type = "berkenaan",
description = "=[[post#Noun|post]] or [[mail#Noun|mail]]",
parents = {"komunikasi"},
}
labels["postal abbreviations"] = {
type = "nama",
description = "default",
parents = {"pos"},
}
labels["public relations"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"komunikasi"},
}
labels["tanda baca"] = {
type = "set",
description = "default",
parents = {"Huruf, simbol dan tanda baca"},
}
labels["radio"] = {
type = "berkenaan",
description = "default",
parents = {"telekomunikasi"},
}
labels["retorik"] = {
type = "berkenaan",
description = "default",
parents = {"bahasa"},
}
labels["signs"] = {
type = "berkenaan,nama,jenis",
description = "default",
parents = {"komunikasi"},
}
labels["sociolects"] = {
type = "nama",
description = "default",
parents = {"bahasa"},
}
labels["simbol"] = {
type = "set",
description = "=[[symbol]]s, especially [[mathematical]] and [[scientific]] symbols",
additional = "Most symbols have equivalent meanings in many languages and can therefore be found in [[:Category:Translingual symbols]].",
parents = {"Huruf, simbol dan tanda baca"},
}
labels["talking"] = {
type = "berkenaan",
description = "default",
parents = {"bahasa", "tingkah laku manusia"},
}
labels["telekomunikasi"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"komunikasi", "teknologi"},
}
labels["telegraphy"] = {
type = "berkenaan",
description = "default",
parents = {"telekomunikasi", "elektronik"},
wpcat = true,
commonscat = true,
}
labels["telefoni"] = {
type = "berkenaan",
description = "default",
parents = {"telekomunikasi", "elektronik"},
}
labels["texting"] = {
type = "berkenaan",
description = "default",
parents = {"telekomunikasi"},
}
labels["textual division"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["tipografi"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan", "percetakan"},
}
labels["penulisan"] = {
type = "berkenaan",
description = "default",
parents = {"bahasa", "tingkah laku manusia"},
}
labels["sistem tulisan"] = {
type = "set",
description = "default",
parents = {"penulisan"},
}
return labels
33yh5uf9t2ik66l1e99302swjc3kj3a
Modul:category tree/topic/Culture
828
11524
281339
281227
2026-04-22T01:08:46Z
PeaceSeekers
3334
281339
Scribunto
text/plain
local labels = {}
labels["budaya"] = {
type = "berkenaan",
description = "default",
parents = {"masyarakat"},
}
labels["A Christmas Carol"] = {
type = "berkenaan",
wikidata = 62879,
displaytitle = "''A Christmas Carol''",
description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.",
parents = {"British fiction", "Charles Dickens"},
}
labels["A Song of Ice and Fire"] = {
type = "berkenaan",
wikidata = 45875,
displaytitle = "''A Song of Ice and Fire''",
description = "{{{langname}}} terms used in context of the ''{{w|Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.",
parents = {"cereka Amerika", "fantasy", "kesusasteraan"},
}
labels["lakonan"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["alternate history"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "history"},
}
labels["cereka Amerika"] = {
type = "berkenaan",
description = "=works of American fiction",
parents = {"cereka", "Amerika Syarikat"},
}
labels["animasi"] = {
type = "berkenaan",
description = "default",
parents = {"media massa"},
}
labels["Arabic fiction"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Arabic]] origin",
parents = {"cereka"},
}
labels["Arabian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Arabian mythology"},
}
labels["Arabian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["Armenian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Armenia"},
}
labels["seni"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["Arthurian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "United Kingdom"},
}
labels["artistic works"] = {
type = "nama,jenis",
description = "default",
parents = {"seni"},
}
labels["astrobiology"] = {
type = "berkenaan",
description = "default",
parents = {"astronomy", "biology", "geology"},
}
labels["astrologi"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan", "pseudosains", "obsolete scientific theories"},
}
labels["Asturian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Asturias, Spain"},
}
labels["Avatar: The Last Airbender"] = {
type = "berkenaan",
wikidata = 11572,
displaytitle = "''Avatar: The Last Airbender''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Australian Aboriginal mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Australia"},
}
labels["ballet"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["Barbie"] = {
type = "berkenaan",
wikidata = 167447,
description = "=the {{w|Barbie}} fashion doll produced by Mattel",
parents = {"toys"},
}
labels["Batman"] = {
type = "berkenaan",
wikidata = 2695156,
description = "=the fictional [[superhero]] [[Batman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["bibliography"] = {
type = "berkenaan",
description = "default",
parents = {"buku"},
}
labels["Bilibili"] = {
type = "berkenaan",
wikidata = 3077586,
description = "=the video-sharing website {{w|bilibili}}",
parents = {"media sosial", "World Wide Web"},
}
labels["blogging"] = {
type = "berkenaan",
description = "default",
parents = {"media sosial"},
}
labels["Bluesky"] = {
type = "berkenaan",
wikidata = 78194383,
description = "=the microblogging and social networking service {{w|Bluesky}}",
parents = {"media sosial", "World Wide Web"},
}
labels["body art"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "fesyen"},
}
labels["Bollywood"] = {
type = "berkenaan",
wikidata = 93196,
description = "default",
parents = {"filem", "India"},
}
labels["buku"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["books of the Poetic Edda"] = {
type = "nama",
displaytitle = "books of the ''Poetic Edda''",
description = "=[[book]]s of the ''[[Poetic Edda]]''",
parents = {"Norse mythology"},
}
labels["Brazilian folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Brazil"},
}
labels["cereka British"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[British]] origin",
parents = {"cereka", "United Kingdom"},
}
labels["Buffy the Vampire Slayer"] = {
type = "berkenaan",
wikidata = 183513,
displaytitle = "''Buffy the Vampire Slayer''",
description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)",
parents = {"cereka Amerika", "televisyen", "vampires"},
}
labels["cereka Kanada"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Canada|Canadian]] origin",
parents = {"cereka", "Kanada"},
}
labels["seni khat"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "penulisan"},
}
labels["cartomancy"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["castells"] = {
type = "berkenaan",
description = "=[[castell]]s, the Catalan tradition of human tower building",
additional = "See {{w|castells}}.",
parents = {"budaya", "sports"},
}
labels["celestial inhabitants"] = {
type = "jenis",
description = "=inhabitants of known [[celestial body|celestial bodies]]",
parents = {"watak cereka", "cereka sains", "demonyms"},
}
labels["Celtic mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ireland", "Wales"},
}
labels["characters from folklore"] = {
type = "berkenaan",
description = "default",
parents = {"watak cereka", "folklore"},
}
labels["cheerleading"] = {
type = "berkenaan",
description = "default",
parents = {"tarian", "gymnastics", "sports"},
}
labels["Church of England"] = {
type = "berkenaan",
description = "default with the",
parents = {"Anglicanism", "England"},
}
labels["Chinese fiction"] = {
type = "berkenaan",
description = "=works of [[fiction]], including [[anime]]s, [[manhua]]s, [[novel]]s, [[series]] and [[video game]]s, of [[China|Chinese]] origin",
parents = {"cereka", "China"},
}
labels["Chinese mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "China"},
}
labels["cinematography"] = {
type = "berkenaan",
description = "default",
parents = {"filem"},
}
labels["circus"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"hiburan", "teater"},
}
labels["comedy"] = {
type = "berkenaan",
description = "default",
parents = {"drama"},
}
labels["komik"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"kesusasteraan"},
}
-- Confucianism: see [[Module:category tree/topic/Philosophy]]
labels["conlanging"] = {
type = "berkenaan",
description = "=[[conlanging]] (the making of [[constructed language]]s)",
parents = {"language", "budaya"},
}
labels["conspiracy theories"] = {
type = "berkenaan,set",
description = "=[[conspiracy theory|conspiracy theories]] and theorists",
parents = {"budaya"},
}
labels["constellations in the zodiac"] = {
type = "nama",
description = "=the ring of [[constellations]] that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year",
parents = {"constellations", "astrologi"},
}
labels["kosmetik"] = {
type = "berkenaan",
description = "default",
parents = {"toiletries", "fesyen"},
}
labels["cosplay"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["tarian"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "rekreasi"},
}
labels["dances"] = {
type = "jenis",
description = "default",
parents = {"tarian"},
}
labels["DC Comics"] = {
type = "berkenaan",
wikidata = 2924461,
description = "={{w|DC Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["demoscene"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "computing"},
}
labels["reka bentuk"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["dictionaries"] = {
type = "jenis,nama",
description = "default",
parents = {"reference works", "lexicography"},
}
labels["Disney"] = {
type = "berkenaan",
wikidata = 7414,
description = "=the properties of {{w|The Walt Disney Company}}",
additional = "This includes properties acquired jointly with or from other companies.",
parents = {"cereka Amerika", "komik", "filem", "televisyen"},
}
labels["penilikan"] = {
type = "jenis",
description = "default",
parents = {"okultisme"},
}
labels["Doctor Who"] = {
type = "berkenaan",
wikidata = 34316,
displaytitle = "''Doctor Who''",
description = "=the ''{{w|Doctor Who}}'' franchise",
parents = {"cereka British", "cereka sains", "televisyen"},
}
labels["Dracula"] = {
type = "berkenaan",
wikidata = 41542,
displaytitle = "''Dracula''",
description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}}, and its cultural derivations",
parents = {"fantasy", "kesusasteraan", "vampires"},
}
labels["naga"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures"},
}
labels["drama"] = {
type = "berkenaan",
description = "default",
parents = {"teater"},
}
labels["Egyptian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Egyptian mythology"},
}
labels["Egyptian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ancient Egypt"},
}
labels["hiburan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["erotic literature"] = {
type = "berkenaan",
description = "default",
parents = {"cereka", "genre kesusasteraan", "sex"},
}
labels["Etruscan mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Etruria"},
}
labels["European folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Europe"},
}
labels["fairy tale"] = {
type = "berkenaan",
description = "=[[fairy tale]]s",
parents = {"cereka"},
}
labels["fairy tale characters"] = {
type = "nama",
description = "=[[fairy tale]] [[character]]s",
parents = {"watak cereka", "fairy tale"},
}
labels["fairy tales"] = {
type = "nama",
description = "default",
parents = {"fairy tale"},
}
labels["fan fiction"] = {
type = "berkenaan",
description = "default",
parents = {"cereka", "fandom", "kesusasteraan"},
}
labels["fandom"] = {
type = "berkenaan",
description = "{{{langname}}} terms arising from [[fandom]] culture.",
parents = {"budaya"},
}
labels["fantasy"] = {
type = "berkenaan",
description = "=the [[genre]] of [[fantasy]]",
parents = {"cereka", "cereka spekulatif"},
}
labels["fesyen"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "clothing"},
}
labels["faster-than-light travel"] = {
type = "berkenaan",
description = "default",
parents = {"travel", "cereka sains", "astrophysics", "relativity"},
}
labels["Fediverse"] = {
type = "berkenaan",
wikidata = 30325419,
description = "=the decentralised social networking services collectively known as the {{w|Fediverse}}",
parents = {"media sosial", "World Wide Web"},
}
labels["cereka"] = {
type = "berkenaan",
description = "=specific works of [[fiction]]",
parents = {"artistic works"},
}
labels["fictional abilities"] = {
type = "berkenaan,jenis",
description = "=fictional [[ability|abilities]] and [[superpower]]s",
parents = {"cereka", "cereka spekulatif"},
}
labels["watak cereka"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional locations"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional planets"] = {
type = "nama",
description = "default",
parents = {"fictional locations"},
}
labels["fictional universes"] = {
type = "nama,jenis",
description = "default",
parents = {"fictional locations"},
}
labels["filem"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "hiburan"},
}
labels["F/F ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["film genres"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"filem", "genre"},
}
labels["film industries"] = {
type = "nama",
description = "default",
parents = {"filem"},
}
labels["Finnic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Finnic]] peoples",
additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.",
parents = {"mitologi", "Finland", "Estonia"},
}
labels["flamenco"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["folklore"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["furry fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom", "subbudaya"},
}
labels["Germanic deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Germanic mythology"},
}
labels["Germanic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Germanic]] peoples",
parents = {"mitologi"},
}
labels["genre"] = {
type = "jenis,berkenaan",
description = "=[[genre]]s and genre classifications",
parents = {"hiburan"},
wpcat = true,
}
labels["ghosts"] = {
type = "berkenaan",
description = "default",
parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasy", "horror", "mythological creatures", "okultisme"},
}
labels["Glee (TV series)"] = {
type = "berkenaan",
wikidata = 152178,
displaytitle = "''Glee'' (TV series)",
description = "=the television series ''[[w:Glee (TV series)|Glee]]'' (2009–2015)",
parents = {"cereka Amerika", "televisyen"},
}
labels["graphic design"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk"},
}
labels["Greek deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Greek mythology"},
}
labels["Greek mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of [[Ancient Greece]]",
parents = {"mitologi", "Ancient Greece"},
}
labels["Gulliver's Travels"] = {
type = "berkenaan",
wikidata = 181488,
displaytitle = "''Gulliver's Travels''",
description = "=''[[w:Gulliver's Travels|Gulliver's Travels]]''",
parents = {"kesusasteraan"},
}
labels["Harry Potter"] = {
type = "berkenaan",
wikidata = 8337,
displaytitle = "''Harry Potter''",
description = "{{{langname}}} terms used in context of the ''{{w|Harry Potter}}'' franchise.",
parents = {"cereka British", "fantasy", "kesusasteraan", "watak cereka"},
}
labels["Hawaiian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hawaii, USA"},
}
labels["F/M ships"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["Hindu deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Hindu mythology"},
}
labels["Hindu mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hinduism"},
}
labels["Homestuck"] = {
type = "berkenaan",
displaytitle = "''Homestuck''",
wikidata = 2618713,
description = "=the ''{{w|Homestuck}}'' multimedia fiction series",
parents = {"cereka Amerika", "komik"},
}
labels["Hopi culture"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "United States"},
}
labels["horror"] = {
type = "berkenaan",
description = "=the [[horror]] [[genre]]",
parents = {"kesusasteraan", "cereka spekulatif"},
}
labels["humanities"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"budaya"},
commonscat = true,
}
labels["incestuous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.",
parents = {"incest", "ships (fandom) by relationship type"},
}
labels["idol fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["Instagram"] = {
type = "berkenaan",
wikidata = 209330,
description = "=the photo sharing and social networking service [[Instagram]]",
parents = {"photography", "media sosial", "World Wide Web"},
}
labels["Iranian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Iran"},
}
labels["Irish mythology"] = {
type = "berkenaan",
description = "default",
parents = {"Celtic mythology", "Ireland"},
}
labels["James Bond"] = {
type = "berkenaan",
wikidata = 844,
displaytitle = "''James Bond''",
description = "=the ''[[James Bond]]'' franchise",
parents = {"cereka British", "filem"},
}
labels["dewa Jepun"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Jepun"},
}
labels["cereka Jepun"] = {
type = "berkenaan",
description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], [[novel]], [[siri]] dan [[permainan video]]",
parents = {"cereka", "Japan"},
}
labels["mitologi Jepun"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Jepun"},
}
labels["job titles in Romance of the Three Kingdoms"] = {
type = "jenis",
displaytitle = "job titles in ''Romance of the Three Kingdoms''",
description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms", "titles"},
}
labels["kewartawanan"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["Kachinas"] = {
type = "nama",
description = "default",
parents = {"Hopi culture"},
}
labels["Komi mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Komi, Russia"},
}
labels["Korean fiction"] = {
type = "berkenaan",
description = "=works of [[fiction]], including [[anime]]s, [[manhwa]]s, [[novel]]s, [[series]] and [[video game]]s, of [[Korea|Korean]] origin",
parents = {"cereka", "Korea"},
}
labels["Korean mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Korea"},
}
labels["genre kesusasteraan"] = {
type = "jenis",
description = "{{{langname}}} terms for [[literary]] [[genre]]s.",
parents = {"kesusasteraan", "cereka", "genre"},
}
labels["kesusasteraan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "hiburan", "penulisan"},
}
labels["Lost (TV series)"] = {
type = "berkenaan",
wikidata = 23567,
displaytitle = "''Lost'' (TV series)",
description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["Lovecraftian horror"] = {
type = "berkenaan",
wikidata = 2448865,
description = "=the [[literature|literary]] works of {{w|H. P. Lovecraft}}",
parents = {"horror", "kesusasteraan", "cereka", "supernatural"},
}
labels["magic"] = {
type = "berkenaan",
description = "default",
parents = {"supernatural"},
}
labels["magic words"] = {
type = "set",
wikidata = 1135882,
description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.",
parents = {"plot devices", "cereka"},
}
labels["genre manga"] = {
type = "jenis",
description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.",
parents = {"genre kesusasteraan"},
}
labels["perkahwinan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "keluarga"},
}
labels["Marvel Comics"] = {
type = "berkenaan",
wikidata = 173496,
description = "={{w|Marvel Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["media massa"] = {
type = "berkenaan",
description = "default",
parents = {"media", "budaya"},
}
labels["Meitei deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Meitei mythology"},
}
labels["Meitei mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Manipur, India"},
}
labels["merpeople"] = {
type = "berkenaan",
description = "default",
parents = {"mythological creatures"},
}
labels["Mesopotamian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Mesopotamian mythology"},
}
labels["Mesopotamian mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of ancient [[Mesopotamia]]",
parents = {"mitologi", "Ancient Near East"},
}
labels["M/M ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["modern art"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Mongolian tribes"] = {
type = "nama",
description = "{{{langname}}} names for Mongolian tribes.",
parents = {"ethnonyms", "Mongolia"},
}
labels["moustaches"] = {
type = "jenis",
description = "default",
parents = {"face", "fesyen", "hair"},
}
labels["My Hero Academia"] = {
type = "berkenaan",
wikidata = 18047903,
displaytitle = "''My Hero Academia''",
description = "=the ''{{w|My Hero Academia}}'' series",
parents = {"cereka Jepun", "animasi", "komik"},
}
labels["My Little Pony"] = {
type = "berkenaan",
wikidata = 1071312,
displaytitle = "''My Little Pony''",
description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom",
parents = {"cereka Amerika", "animasi", "toys"},
}
labels["mythological creatures"] = {
type = "jenis",
description = "default",
parents = {"mitologi", "fantasy"},
}
labels["mythological figures"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological locations"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological plants"] = {
type = "jenis,nama",
description = "default",
parents = {"mitologi", "plants"},
}
labels["mitologi"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["narratology"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "drama"},
}
labels["Navajo mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["newspapers"] = {
type = "nama",
description = "default",
parents = {"periodicals"},
}
labels["Niconico"] = {
type = "berkenaan",
wikidata = 697233,
description = "=the video-sharing website {{w|Niconico}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Norse deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Germanic deities", "Norse mythology"},
}
labels["Norse mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Germanic mythology"},
}
labels["okultisme"] = {
type = "berkenaan",
description = "default with the",
parents = {"supernatural", "paranormal"},
}
labels["omegaverse"] = {
type = "berkenaan",
wikidata = 96397374,
description = "=the [[omegaverse]] genre",
parents = {"erotic literature", "fan fiction", "cereka spekulatif"},
}
labels["Omori"] = {
type = "berkenaan",
wikidata = 105618699,
displaytitle = "''Omori''",
description = "=the ''{{w|Omori (video game)|Omori}}'' series",
parents = {"cereka Amerika", "video games"},
}
labels["Once Upon a Time"] = {
type = "berkenaan",
wikidata = 23673,
displaytitle = "''Once Upon a Time''",
description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)",
parents = {"cereka Amerika", "Disney", "televisyen"},
}
labels["painting"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["palmistry"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["parties"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"hiburan", "budaya"},
}
labels["people in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "people in ''Romance of the Three Kingdoms''",
description = "=people in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms"},
}
labels["perfumes"] = {
type = "jenis,set",
description = "default",
parents = {"fesyen", "scents", "perfumery"},
}
labels["periodicals"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["personifications"] = {
type = "nama",
description = "default",
parents = {"narratology"},
}
labels["places in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "places in ''Romance of the Three Kingdoms''",
description = "=places in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms", "China"},
}
labels["plot devices"] = {
type = "jenis",
description = "default",
parents = {"narratology", "cereka"},
}
labels["puisi"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "seni"},
}
labels["polyamorous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["Private Eye"] = {
type = "berkenaan",
displaytitle = "''Private Eye''",
description = "=the ''{{w|Private Eye}}'' franchise",
parents = {"cereka British"},
}
labels["Reddit"] = {
type = "berkenaan",
wikidata = 2195701,
description = "=the social news aggregation and discussion website {{w|Reddit}}",
parents = {"media sosial", "World Wide Web"},
}
labels["reference works"] = {
type = "jenis",
description = "default",
parents = {"buku"},
}
labels["Roman deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Roman mythology"},
}
labels["Roman mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ancient Rome"},
}
labels["romance fiction"] = {
type = "berkenaan",
description = "default",
parents = {"genre kesusasteraan", "love"},
}
labels["Romance of the Three Kingdoms"] = {
type = "berkenaan",
wikidata = 70806,
displaytitle = "''Romance of the Three Kingdoms''",
description = "=''{{w|Romance of the Three Kingdoms}}''",
parents = {"cereka", "kesusasteraan", "China"},
}
labels["RPF ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.",
additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].",
parents = {"ships (fandom) by relationship type"},
}
labels["cereka sains"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "cereka"},
}
labels["SCP Foundation"] = {
type = "berkenaan",
wikidata = 17439649,
description = "{{{langname}}} terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.",
parents = {"fantasy", "cereka", "horror", "cereka sains", "supernatural"},
}
labels["arca"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Shahnameh"] = {
type = "berkenaan",
wikidata = 8279,
displaytitle = "''Shahnameh''",
description = "=''Shahnameh''",
parents = {"cereka", "puisi", "kesusasteraan", "Persia"},
}
labels["Shahnameh characters"] = {
type = "nama",
description = "=characters in the [[Shahnameh]]",
parents = {"Shahnameh"},
}
labels["shapeshifters"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore"},
}
labels["Sherlock Holmes"] = {
type = "berkenaan",
wikidata = 2316684,
description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them",
parents = {"cereka British", "kesusasteraan"},
}
labels["Sherlock (TV series)"] = {
type = "berkenaan",
wikidata = 192837,
displaytitle = "''Sherlock'' (TV series)",
description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)",
parents = {"Sherlock Holmes", "televisyen"},
}
labels["shipping (fandom)"] = {
type = "berkenaan",
description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)",
parents = {"fandom", "romance fiction"},
}
labels["ships (fandom)"] = {
type = "kumpulan",
description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} (i.e., fictional relationships between two fictional characters or real people)",
parents = {"shipping (fandom)"},
}
labels["ships (fandom) by relationship type"] = {
type = "kumpulan",
description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g., [[heterosexual]], [[homosexual]], etc.)",
parents = {"ships (fandom)"},
}
labels["shippers (fandom)"] = {
type = "jenis",
description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)",
parents = {"shipping (fandom)"},
}
labels["Slavic deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Slavic mythology"},
}
labels["Slavic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Slav]]s",
parents = {"mitologi"},
}
labels["Smallville (TV series)"] = {
type = "berkenaan",
wikidata = 180228,
displaytitle = "''Smallville'' (TV series)",
description = "=the television series ''{{w|Smallville}}'' (2001–2011)",
parents = {"cereka Amerika", "Superman", "televisyen"},
}
labels["media sosial"] = {
type = "berkenaan",
wikidata = 202833,
description = "default",
parents = {"media massa", "Internet"},
}
labels["South Korean idol fandom"] = {
type = "berkenaan",
wikidata = 39086123,
description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]",
parents = {"idol fandom", "South Korea"},
}
labels["South Park"] = {
type = "berkenaan",
wikidata = 16538,
displaytitle = "''South Park''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Star Trek"] = {
type = "berkenaan",
wikidata = 1092,
displaytitle = "''Star Trek''",
description = "=the ''{{w|Star Trek}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"},
}
labels["Star Wars"] = {
type = "berkenaan",
wikidata = 462,
displaytitle = "''Star Wars''",
description = "=the ''{{w|Star Wars}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "Disney"},
}
labels["Steven Universe"] = {
type = "berkenaan",
wikidata = 7615342,
displaytitle = "''Steven Universe''",
description = "=the animated television series ''{{w|Steven Universe}}''",
parents = {"cereka Amerika", "animasi"},
}
labels["stock characters"] = {
type = "jenis",
wikidata = 636497,
description = "default",
parents = {"watak cereka"},
}
labels["cereka spekulatif"] = {
type = "berkenaan",
wikidata = 9326077,
description = "default",
parents = {"cereka", "genre"},
}
labels["spider fighting"] = {
type = "berkenaan",
wikidata = 7577058,
description = "={{w|spider fighting}}",
parents = {"spiders", "human activity"},
}
labels["subbudaya"] = {
type = "berkenaan",
description = "=[[subculture]]s",
parents = {"budaya"},
}
labels["adiwira"] = {
type = "nama",
wikidata = 188784,
description = "=[[superhero]]es",
parents = {"watak cereka"},
}
labels["Superman"] = {
type = "berkenaan",
wikidata = 79015,
description = "=the fictional [[superhero]] [[Superman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["supernatural"] = {
type = "berkenaan",
wikidata = 80837,
description = "default with the",
parents = {"folklore"},
}
labels["Supernatural (TV series)"] = {
type = "berkenaan",
wikidata = 130585,
displaytitle = "''Supernatural'' (TV series)",
description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)",
parents = {"cereka Amerika", "televisyen"},
}
labels["Tamil deities"] = {
type = "nama",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil religion]] for more.",
parents = {"gods", "Hindu deities", "Tamil mythology"},
}
labels["Tamil mythology"] = {
type = "berkenaan",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil religion]] for more.",
parents = {"mitologi", "Hindu mythology", "Tamil Nadu, India"},
}
labels["televisyen"] = {
type = "berkenaan",
wikidata = 289,
description = "default",
parents = {"media massa", "penyiaran"},
}
labels["The Handmaid's Tale"] = {
type = "berkenaan",
wikidata = 25207350,
displaytitle = "''The Handmaid's Tale''",
description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)",
parents = {"cereka Kanada", "utopian and dystopian fiction", "kesusasteraan"},
}
labels["The Hunger Games"] = {
type = "berkenaan",
wikidata = 11679,
displaytitle = "''The Hunger Games''",
description = "=''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations",
parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction", "kesusasteraan"},
}
labels["The Matrix"] = {
type = "berkenaan",
wikidata = 83495,
displaytitle = "''The Matrix''",
description = "=''{{w|The Matrix}}''",
parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction"},
}
labels["The Simpsons"] = {
type = "berkenaan",
wikidata = 886,
displaytitle = "''The Simpsons''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.",
parents = {"cereka Amerika", "animasi", "Disney"},
}
labels["The Walking Dead"] = {
type = "berkenaan",
wikidata = 232737,
displaytitle = "''The Walking Dead''",
description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted",
parents = {"cereka Amerika", "televisyen", "utopian and dystopian fiction", "zombies"},
}
labels["The Wizard of Oz"] = {
type = "berkenaan",
wikidata = 130295,
displaytitle = "''The Wizard of Oz''",
description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'' and subsequent books or films derived from it, such as the [[w:The Wizard of Oz (1939 film)|1939 film]]",
parents = {"cereka Amerika", "fantasy", "kesusasteraan"},
}
labels["The X-Files"] = {
type = "berkenaan",
wikidata = 2744,
displaytitle = "''The X-Files''",
description = "=the ''{{w|The X-Files}}'' franchise",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["teater"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "hiburan"},
}
labels["Thracian deities"] = {
type = "nama",
description = "default",
parents = {"gods"},
}
labels["TikTok"] = {
type = "berkenaan",
wikidata = 48938223,
description = "=the video-sharing and social-networking service {{w|TikTok}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Tupi mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Brazil"},
}
labels["Twilight (novel series)"] = {
type = "berkenaan",
wikidata = 44523,
displaytitle = "''Twilight'' (novel series)",
description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise",
parents = {"cereka Amerika", "fantasy", "kesusasteraan", "vampires"},
}
labels["Twitter"] = {
type = "berkenaan",
wikidata = 918,
description = "=the social networking and microblogging service {{w|Twitter}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Tumblr"] = {
type = "berkenaan",
wikidata = 384060,
description = "=the microblogging and social networking service {{w|Tumblr}}",
parents = {"media sosial", "World Wide Web"},
}
labels["utopian and dystopian fiction"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif"},
}
labels["vampires"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"},
}
labels["vampire lifestyle"] = {
type = "berkenaan",
description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)",
parents = {"subbudaya", "vampires"},
}
labels["Virtual YouTuber"] = {
type = "berkenaan",
wikidata = 55155641,
description = "=[[virtual YouTuber]]s ([[VTuber]]s)",
parents = {"YouTube", "hiburan"},
}
labels["web design"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk", "World Wide Web"},
}
labels["werewolves"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"},
}
labels["worldbuilding"] = {
type = "berkenaan",
description = "default",
parents = {"narratology", "cereka spekulatif"},
}
labels["Xena: Warrior Princess"] = {
type = "berkenaan",
wikidata = 38497,
displaytitle = "''Xena: Warrior Princess''",
description = "=the television series ''{{w|Xena: Warrior Princess}}'' (1995–2001)",
parents = {"cereka Amerika", "fantasy", "televisyen"},
}
labels["YouTube"] = {
type = "berkenaan",
wikidata = 866,
description = "=the video-sharing website {{w|YouTube}}",
parents = {"media sosial", "World Wide Web", "Google"},
}
labels["YouTube Poop"] = {
type = "berkenaan",
wikidata = 16927904,
description = "default",
parents = {"YouTube", "Internet memes"},
}
labels["zombies"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror"},
}
return labels
0cq1z1gik9bzog9quzjsbrm5g2oazso
281343
281339
2026-04-22T01:10:11Z
PeaceSeekers
3334
281343
Scribunto
text/plain
local labels = {}
labels["budaya"] = {
type = "berkenaan",
description = "default",
parents = {"masyarakat"},
}
labels["A Christmas Carol"] = {
type = "berkenaan",
wikidata = 62879,
displaytitle = "''A Christmas Carol''",
description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.",
	parents = {"cereka British", "Charles Dickens"},
}
labels["A Song of Ice and Fire"] = {
type = "berkenaan",
wikidata = 45875,
displaytitle = "''A Song of Ice and Fire''",
	description = "{{{langname}}} terms used in the context of the ''{{w|A Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.",
parents = {"cereka Amerika", "fantasy", "kesusasteraan"},
}
labels["lakonan"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["alternate history"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "history"},
}
labels["cereka Amerika"] = {
type = "berkenaan",
description = "=works of American fiction",
parents = {"cereka", "Amerika Syarikat"},
}
labels["animasi"] = {
type = "berkenaan",
description = "default",
parents = {"media massa"},
}
labels["Arabic fiction"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Arabic]] origin",
parents = {"cereka"},
}
labels["Arabian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Arabian mythology"},
}
labels["Arabian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["Armenian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Armenia"},
}
labels["seni"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["Arthurian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "United Kingdom"},
}
labels["artistic works"] = {
type = "nama,jenis",
description = "default",
parents = {"seni"},
}
labels["astrobiology"] = {
type = "berkenaan",
description = "default",
parents = {"astronomy", "biology", "geology"},
}
labels["astrologi"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan", "pseudosains", "obsolete scientific theories"},
}
labels["Asturian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Asturias, Spain"},
}
labels["Avatar: The Last Airbender"] = {
type = "berkenaan",
wikidata = 11572,
displaytitle = "''Avatar: The Last Airbender''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Australian Aboriginal mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Australia"},
}
labels["ballet"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["Barbie"] = {
type = "berkenaan",
wikidata = 167447,
description = "=the {{w|Barbie}} fashion doll produced by Mattel",
parents = {"toys"},
}
labels["Batman"] = {
type = "berkenaan",
wikidata = 2695156,
description = "=the fictional [[superhero]] [[Batman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["bibliography"] = {
type = "berkenaan",
description = "default",
parents = {"buku"},
}
labels["Bilibili"] = {
type = "berkenaan",
wikidata = 3077586,
description = "=the video-sharing website {{w|bilibili}}",
parents = {"media sosial", "World Wide Web"},
}
labels["blogging"] = {
type = "berkenaan",
description = "default",
parents = {"media sosial"},
}
labels["Bluesky"] = {
type = "berkenaan",
wikidata = 78194383,
description = "=the microblogging and social networking service {{w|Bluesky}}",
parents = {"media sosial", "World Wide Web"},
}
labels["body art"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "fesyen"},
}
labels["Bollywood"] = {
type = "berkenaan",
wikidata = 93196,
description = "default",
parents = {"filem", "India"},
}
labels["buku"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["books of the Poetic Edda"] = {
type = "nama",
displaytitle = "books of the ''Poetic Edda''",
description = "=[[book]]s of the ''[[Poetic Edda]]''",
parents = {"Norse mythology"},
}
labels["Brazilian folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Brazil"},
}
labels["cereka British"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[British]] origin",
parents = {"cereka", "United Kingdom"},
}
labels["Buffy the Vampire Slayer"] = {
type = "berkenaan",
wikidata = 183513,
displaytitle = "''Buffy the Vampire Slayer''",
description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)",
parents = {"cereka Amerika", "televisyen", "vampires"},
}
labels["cereka Kanada"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Canada|Canadian]] origin",
parents = {"cereka", "Kanada"},
}
labels["seni khat"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "penulisan"},
}
labels["cartomancy"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["castells"] = {
type = "berkenaan",
description = "=[[castell]]s, the Catalan tradition of human tower building",
additional = "See {{w|castells}}.",
parents = {"budaya", "sports"},
}
labels["celestial inhabitants"] = {
type = "jenis",
description = "=inhabitants of known [[celestial body|celestial bodies]]",
parents = {"watak cereka", "cereka sains", "demonyms"},
}
labels["Celtic mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ireland", "Wales"},
}
labels["characters from folklore"] = {
type = "berkenaan",
description = "default",
parents = {"watak cereka", "folklore"},
}
labels["cheerleading"] = {
type = "berkenaan",
description = "default",
parents = {"tarian", "gymnastics", "sports"},
}
labels["Church of England"] = {
type = "berkenaan",
description = "default with the",
parents = {"Anglicanism", "England"},
}
labels["Chinese fiction"] = {
type = "berkenaan",
	description = "=works of [[fiction]] of [[China|Chinese]] origin, including [[anime]], [[manhua]], [[novel]]s, [[series]] and [[video game]]s",
parents = {"cereka", "China"},
}
labels["Chinese mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "China"},
}
labels["cinematography"] = {
type = "berkenaan",
description = "default",
parents = {"filem"},
}
labels["circus"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"hiburan", "teater"},
}
labels["comedy"] = {
type = "berkenaan",
description = "default",
parents = {"drama"},
}
labels["komik"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"kesusasteraan"},
}
-- Confucianism: see [[Module:category tree/topic/Philosophy]]
labels["conlanging"] = {
type = "berkenaan",
description = "=[[conlanging]] (the making of [[constructed language]]s)",
parents = {"language", "budaya"},
}
labels["conspiracy theories"] = {
type = "berkenaan,set",
description = "=[[conspiracy theory|conspiracy theories]] and theorists",
parents = {"budaya"},
}
labels["constellations in the zodiac"] = {
type = "nama",
description = "=the ring of [[constellations]] that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year",
parents = {"constellations", "astrologi"},
}
labels["kosmetik"] = {
type = "berkenaan",
description = "default",
parents = {"toiletries", "fesyen"},
}
labels["cosplay"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["tarian"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "rekreasi"},
}
labels["dances"] = {
type = "jenis",
description = "default",
parents = {"tarian"},
}
labels["DC Comics"] = {
type = "berkenaan",
wikidata = 2924461,
description = "={{w|DC Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["demoscene"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "computing"},
}
labels["reka bentuk"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["dictionaries"] = {
type = "jenis,nama",
description = "default",
parents = {"reference works", "lexicography"},
}
labels["Disney"] = {
type = "berkenaan",
wikidata = 7414,
description = "=the properties of {{w|The Walt Disney Company}}",
additional = "This includes properties acquired jointly with or from other companies.",
parents = {"cereka Amerika", "komik", "filem", "televisyen"},
}
labels["penilikan"] = {
type = "jenis",
description = "default",
parents = {"okultisme"},
}
labels["Doctor Who"] = {
type = "berkenaan",
wikidata = 34316,
displaytitle = "''Doctor Who''",
description = "=the ''{{w|Doctor Who}}'' franchise",
	parents = {"cereka British", "cereka sains", "televisyen"},
}
labels["Dracula"] = {
type = "berkenaan",
wikidata = 41542,
displaytitle = "''Dracula''",
	description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}}, and its cultural derivations",
parents = {"fantasy", "kesusasteraan", "vampires"},
}
labels["naga"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures"},
}
labels["drama"] = {
type = "berkenaan",
description = "default",
parents = {"teater"},
}
labels["Egyptian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Egyptian mythology"},
}
labels["Egyptian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ancient Egypt"},
}
labels["hiburan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["erotic literature"] = {
type = "berkenaan",
description = "default",
	parents = {"cereka", "genre kesusasteraan", "sex"},
}
labels["Etruscan mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Etruria"},
}
labels["European folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Europe"},
}
labels["fairy tale"] = {
type = "berkenaan",
description = "=[[fairy tale]]s",
parents = {"cereka"},
}
labels["fairy tale characters"] = {
type = "nama",
description = "=[[fairy tale]] [[character]]s",
parents = {"watak cereka", "fairy tale"},
}
labels["fairy tales"] = {
type = "nama",
description = "default",
parents = {"fairy tale"},
}
labels["fan fiction"] = {
type = "berkenaan",
description = "default",
parents = {"cereka", "fandom", "kesusasteraan"},
}
labels["fandom"] = {
type = "berkenaan",
description = "{{{langname}}} terms arising from [[fandom]] culture.",
parents = {"budaya"},
}
labels["fantasy"] = {
type = "berkenaan",
description = "=the [[genre]] of [[fantasy]]",
parents = {"cereka", "cereka spekulatif"},
}
labels["fesyen"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "clothing"},
}
labels["faster-than-light travel"] = {
type = "berkenaan",
description = "default",
parents = {"travel", "cereka sains", "astrophysics", "relativity"},
}
labels["Fediverse"] = {
type = "berkenaan",
wikidata = 30325419,
description = "=the decentralised social networking services collectively known as the {{w|Fediverse}}",
parents = {"media sosial", "World Wide Web"},
}
labels["cereka"] = {
type = "berkenaan",
description = "=specific works of [[fiction]]",
parents = {"artistic works"},
}
labels["fictional abilities"] = {
type = "berkenaan,jenis",
description = "=fictional [[ability|abilities]] and [[superpower]]s",
parents = {"cereka", "cereka spekulatif"},
}
labels["watak cereka"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional locations"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional planets"] = {
type = "nama",
description = "default",
parents = {"fictional locations"},
}
labels["fictional universes"] = {
type = "nama,jenis",
description = "default",
parents = {"fictional locations"},
}
labels["filem"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "hiburan"},
}
labels["F/F ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["film genres"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"filem", "genre"},
}
labels["film industries"] = {
type = "nama",
description = "default",
parents = {"filem"},
}
labels["Finnic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Finnic]] peoples",
additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.",
parents = {"mitologi", "Finland", "Estonia"},
}
labels["flamenco"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["folklore"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["furry fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom", "subbudaya"},
}
labels["Germanic deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Germanic mythology"},
}
labels["Germanic mythology"] = {
	type = "berkenaan",
description = "=the [[mythology]] of the [[Germanic]] peoples",
parents = {"mitologi"},
}
labels["genre"] = {
type = "jenis,berkenaan",
description = "=[[genre]]s and genre classifications",
parents = {"hiburan"},
wpcat = true,
}
labels["ghosts"] = {
type = "berkenaan",
description = "default",
parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasy", "horror", "mythological creatures", "okultisme"},
}
labels["Glee (TV series)"] = {
type = "berkenaan",
wikidata = 152178,
displaytitle = "''Glee'' (TV series)",
description = "=the television series ''[[w:Glee (TV series)|Glee]]'' (2009–2015)",
parents = {"cereka Amerika", "televisyen"},
}
labels["graphic design"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk"},
}
labels["Greek deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Greek mythology"},
}
labels["Greek mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of [[Ancient Greece]]",
parents = {"mitologi", "Ancient Greece"},
}
labels["Gulliver's Travels"] = {
type = "berkenaan",
wikidata = 181488,
displaytitle = "''Gulliver's Travels''",
description = "=''[[w:Gulliver's Travels|Gulliver’s Travels]]''",
parents = {"kesusasteraan"},
}
labels["Harry Potter"] = {
type = "berkenaan",
wikidata = 8337,
displaytitle = "''Harry Potter''",
	description = "{{{langname}}} terms used in the context of the ''{{w|Harry Potter}}'' franchise.",
	parents = {"cereka British", "fantasy", "kesusasteraan", "watak cereka"},
}
labels["Hawaiian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hawaii, USA"},
}
labels["F/M ships"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["Hindu deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Hindu mythology"},
}
labels["Hindu mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hinduism"},
}
labels["Homestuck"] = {
type = "berkenaan",
	displaytitle = "''Homestuck''",
wikidata = 2618713,
description = "=the ''{{w|Homestuck}}'' multimedia fiction series",
parents = {"cereka Amerika", "komik"},
}
labels["Hopi culture"] = {
type = "berkenaan",
description = "default",
	parents = {"budaya", "Amerika Syarikat"},
}
labels["horror"] = {
type = "berkenaan",
description = "=the [[horror]] [[genre]]",
parents = {"kesusasteraan", "cereka spekulatif"},
}
labels["humanities"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"budaya"},
	commonscat = true,
}
labels["incestuous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.",
parents = {"incest", "ships (fandom) by relationship type"},
}
labels["idol fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["Instagram"] = {
type = "berkenaan",
wikidata = 209330,
description = "=the photo sharing and social networking service [[Instagram]]",
parents = {"photography", "media sosial", "World Wide Web"},
}
labels["Iranian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Iran"},
}
labels["Irish mythology"] = {
type = "berkenaan",
description = "default",
parents = {"Celtic mythology", "Ireland"},
}
labels["James Bond"] = {
type = "berkenaan",
wikidata = 844,
displaytitle = "''James Bond''",
description = "=the ''[[James Bond]]'' franchise",
	parents = {"cereka British", "filem"},
}
labels["dewa Jepun"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Jepun"},
}
labels["cereka Jepun"] = {
type = "berkenaan",
description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], [[novel]], [[siri]] dan [[permainan video]]",
	parents = {"cereka", "Jepun"},
}
labels["mitologi Jepun"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Jepun"},
}
labels["job titles in Romance of the Three Kingdoms"] = {
type = "jenis",
displaytitle = "job titles in ''Romance of the Three Kingdoms''",
description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms", "titles"},
}
labels["kewartawanan"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["Kachinas"] = {
type = "nama",
description = "default",
parents = {"Hopi culture"},
}
labels["Komi mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Komi, Russia"},
}
labels["Korean fiction"] = {
type = "berkenaan",
	description = "=works of [[fiction]] of [[Korea]]n origin, including [[anime]], [[manhwa]], [[novel]]s, [[series]] and [[video game]]s",
parents = {"cereka", "Korea"},
}
labels["Korean mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Korea"},
}
labels["genre kesusasteraan"] = {
type = "jenis",
description = "{{{langname}}} terms for [[literary]] [[genre]]s.",
parents = {"kesusasteraan", "cereka", "genre"},
}
labels["kesusasteraan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "hiburan", "penulisan"},
}
labels["Lost (TV series)"] = {
type = "berkenaan",
wikidata = 23567,
displaytitle = "''Lost'' (TV series)",
description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["Lovecraftian horror"] = {
type = "berkenaan",
wikidata = 2448865,
description = "=the [[literature|literary]] works of {{w|H. P. Lovecraft}}",
parents = {"horror", "kesusasteraan", "cereka", "supernatural"},
}
labels["magic"] = {
type = "berkenaan",
description = "default",
parents = {"supernatural"},
}
labels["magic words"] = {
type = "set",
wikidata = 1135882,
description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.",
parents = {"plot devices", "cereka"},
}
labels["genre manga"] = {
type = "jenis",
description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.",
parents = {"genre kesusasteraan"},
}
labels["perkahwinan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "keluarga"},
}
labels["Marvel Comics"] = {
type = "berkenaan",
wikidata = 173496,
description = "={{w|Marvel Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["media massa"] = {
type = "berkenaan",
description = "default",
parents = {"media", "budaya"},
}
labels["Meitei deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Meitei mythology"},
}
labels["Meitei mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Manipur, India"},
}
labels["merpeople"] = {
type = "berkenaan",
description = "default",
parents = {"mythological creatures"},
}
labels["Mesopotamian deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Mesopotamian mythology"},
}
labels["Mesopotamian mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of ancient [[Mesopotamia]]",
parents = {"mitologi", "Ancient Near East"},
}
labels["M/M ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["modern art"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Mongolian tribes"] = {
type = "nama",
description = "{{{langname}}} names for Mongolian tribes.",
parents = {"ethnonyms", "Mongolia"},
}
labels["moustaches"] = {
type = "jenis",
description = "default",
parents = {"face", "fesyen", "hair"},
}
labels["My Hero Academia"] = {
type = "berkenaan",
wikidata = 18047903,
	displaytitle = "''My Hero Academia''",
description = "=the ''{{w|My Hero Academia}}'' series",
	parents = {"cereka Jepun", "animasi", "komik"},
}
labels["My Little Pony"] = {
type = "berkenaan",
wikidata = 1071312,
displaytitle = "''My Little Pony''",
description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom",
parents = {"cereka Amerika", "animasi", "toys"},
}
labels["mythological creatures"] = {
type = "jenis",
description = "default",
parents = {"mitologi", "fantasy"},
}
labels["mythological figures"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological locations"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological plants"] = {
type = "jenis,nama",
description = "default",
parents = {"mitologi", "plants"},
}
labels["mitologi"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["narratology"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "drama"},
}
labels["Navajo mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["newspapers"] = {
type = "nama",
description = "default",
parents = {"periodicals"},
}
labels["Niconico"] = {
type = "berkenaan",
wikidata = 697233,
description = "=the video-sharing website {{w|Niconico}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Norse deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Germanic deities", "Norse mythology"},
}
labels["Norse mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Germanic mythology"},
}
labels["okultisme"] = {
type = "berkenaan",
description = "default with the",
parents = {"supernatural", "paranormal"},
}
labels["omegaverse"] = {
type = "berkenaan",
wikidata = 96397374,
description = "=the [[omegaverse]] genre",
parents = {"erotic literature", "fan fiction", "cereka spekulatif"},
}
labels["Omori"] = {
type = "berkenaan",
wikidata = 105618699,
	displaytitle = "''Omori''",
description = "=the ''{{w|Omori (video game)|Omori}}'' series",
parents = {"cereka Amerika", "video games"},
}
labels["Once Upon a Time"] = {
type = "berkenaan",
wikidata = 23673,
displaytitle = "''Once Upon a Time''",
description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)",
parents = {"cereka Amerika", "Disney", "televisyen"},
}
labels["painting"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["palmistry"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["parties"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"hiburan", "budaya"},
}
labels["people in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "people in ''Romance of the Three Kingdoms''",
description = "=people in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms"},
}
labels["perfumes"] = {
type = "jenis,set",
description = "default",
parents = {"fesyen", "scents", "perfumery"},
}
labels["periodicals"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["personifications"] = {
type = "nama",
description = "default",
parents = {"narratology"},
}
labels["places in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "places in ''Romance of the Three Kingdoms''",
description = "=places in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Romance of the Three Kingdoms", "China"},
}
labels["plot devices"] = {
type = "jenis",
description = "default",
parents = {"narratology", "cereka"},
}
labels["puisi"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "seni"},
}
labels["polyamorous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["Private Eye"] = {
type = "berkenaan",
displaytitle = "''Private Eye''",
description = "=the ''{{w|Private Eye}}'' franchise",
	parents = {"cereka British"},
}
labels["Reddit"] = {
type = "berkenaan",
wikidata = 2195701,
description = "=the social news aggregation and discussion website {{w|Reddit}}",
parents = {"media sosial", "World Wide Web"},
}
labels["reference works"] = {
type = "jenis",
description = "default",
parents = {"buku"},
}
labels["Roman deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Roman mythology"},
}
labels["Roman mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ancient Rome"},
}
labels["romance fiction"] = {
type = "berkenaan",
description = "default",
	parents = {"genre kesusasteraan", "love"},
}
labels["Romance of the Three Kingdoms"] = {
type = "berkenaan",
wikidata = 70806,
displaytitle = "''Romance of the Three Kingdoms''",
description = "=''{{w|Romance of the Three Kingdoms}}''",
parents = {"cereka", "kesusasteraan", "China"},
}
labels["RPF ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.",
additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].",
parents = {"ships (fandom) by relationship type"},
}
labels["cereka sains"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "cereka"},
}
labels["SCP Foundation"] = {
type = "berkenaan",
wikidata = 17439649,
	description = "{{{langname}}} terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.",
parents = {"fantasy", "cereka", "horror", "cereka sains", "supernatural"},
}
labels["arca"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Shahnameh"] = {
type = "berkenaan",
wikidata = 8279,
displaytitle = "''Shahnameh''",
description = "=''Shahnameh''",
parents = {"cereka", "puisi", "kesusasteraan", "Persia"},
}
labels["Shahnameh characters"] = {
type = "nama",
description = "=characters in the [[Shahnameh]]",
parents = {"Shahnameh"},
}
labels["shapeshifters"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore"},
}
labels["Sherlock Holmes"] = {
type = "berkenaan",
wikidata = 2316684,
description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them",
	parents = {"cereka British", "kesusasteraan"},
}
labels["Sherlock (TV series)"] = {
type = "berkenaan",
wikidata = 192837,
displaytitle = "''Sherlock'' (TV series)",
description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)",
parents = {"Sherlock Holmes", "televisyen"},
}
labels["shipping (fandom)"] = {
type = "berkenaan",
description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)",
parents = {"fandom", "romance fiction"},
}
labels["ships (fandom)"] = {
type = "kumpulan",
	description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} (i.e., a fictional relationship between two fictional characters or real people)",
parents = {"shipping (fandom)"},
}
labels["ships (fandom) by relationship type"] = {
type = "kumpulan",
	description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g., [[heterosexual]], [[homosexual]], etc.)",
parents = {"ships (fandom)"},
}
labels["shippers (fandom)"] = {
type = "jenis",
description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)",
parents = {"shipping (fandom)"},
}
labels["Slavic deities"] = {
type = "nama",
description = "default",
parents = {"gods", "Slavic mythology"},
}
labels["Slavic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Slav]]s",
parents = {"mitologi"},
}
labels["Smallville (TV series)"] = {
type = "berkenaan",
wikidata = 180228,
displaytitle = "''Smallville'' (TV series)",
description = "=the television series ''{{w|Smallville}}'' (2001–2011)",
parents = {"cereka Amerika", "Superman", "televisyen"},
}
labels["media sosial"] = {
type = "berkenaan",
wikidata = 202833,
description = "default",
parents = {"media massa", "Internet"},
}
labels["South Korean idol fandom"] = {
type = "berkenaan",
wikidata = 39086123,
description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]",
parents = {"idol fandom", "South Korea"},
}
labels["South Park"] = {
type = "berkenaan",
wikidata = 16538,
displaytitle = "''South Park''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Star Trek"] = {
type = "berkenaan",
wikidata = 1092,
displaytitle = "''Star Trek''",
description = "=the ''{{w|Star Trek}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"},
}
labels["Star Wars"] = {
type = "berkenaan",
wikidata = 462,
displaytitle = "''Star Wars''",
description = "=the ''{{w|Star Wars}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "Disney"},
}
labels["Steven Universe"] = {
type = "berkenaan",
wikidata = 7615342,
displaytitle = "''Steven Universe''",
description = "=the animated television series ''{{w|Steven Universe}}''",
parents = {"cereka Amerika", "animasi"},
}
labels["stock characters"] = {
type = "jenis",
wikidata = 636497,
description = "default",
parents = {"watak cereka"},
}
labels["cereka spekulatif"] = {
type = "berkenaan",
wikidata = 9326077,
description = "default",
parents = {"cereka", "genre"},
}
labels["spider fighting"] = {
type = "berkenaan",
wikidata = 7577058,
description = "={{w|spider fighting}}",
parents = {"spiders", "human activity"},
}
labels["subbudaya"] = {
type = "berkenaan",
description = "=[[subculture]]s",
parents = {"budaya"},
}
labels["adiwira"] = {
type = "nama",
wikidata = 188784,
description = "=[[superhero]]es",
parents = {"watak cereka"},
}
labels["Superman"] = {
type = "berkenaan",
wikidata = 79015,
description = "=the fictional [[superhero]] [[Superman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["supernatural"] = {
type = "berkenaan",
wikidata = 80837,
description = "default with the",
parents = {"folklore"},
}
labels["Supernatural (TV series)"] = {
type = "berkenaan",
wikidata = 130585,
displaytitle = "''Supernatural'' (TV series)",
description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)",
parents = {"cereka Amerika", "televisyen"},
}
labels["Tamil deities"] = {
type = "nama",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil religion]] for more.",
parents = {"gods", "Hindu deities", "Tamil mythology"},
}
labels["Tamil mythology"] = {
type = "nama",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil religion]] for more.",
parents = {"mitologi", "Hindu mythology", "Tamil Nadu, India"},
}
labels["televisyen"] = {
type = "berkenaan",
wikidata = 289,
description = "default",
parents = {"media massa", "penyiaran"},
}
labels["The Handmaid's Tale"] = {
type = "berkenaan",
wikidata = 25207350,
displaytitle = "''The Handmaid's Tale''",
description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)",
parents = {"Canadian fiction", "utopian and dystopian fiction", "kesusasteraan"},
}
labels["The Hunger Games"] = {
type = "berkenaan",
wikidata = 11679,
displaytitle = "''The Hunger Games''",
description = "=the ''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations",
parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction", "kesusasteraan"},
}
labels["The Matrix"] = {
type = "berkenaan",
wikidata = 83495,
displaytitle = "''The Matrix''",
description = "=''{{w|The Matrix}}''",
parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction"},
}
labels["The Simpsons"] = {
type = "berkenaan",
wikidata = 886,
displaytitle = "''The Simpsons''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.",
parents = {"cereka Amerika", "animasi", "Disney"},
}
labels["The Walking Dead"] = {
type = "berkenaan",
wikidata = 232737,
displaytitle = "''The Walking Dead''",
description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted",
parents = {"cereka Amerika", "televisyen", "utopian and dystopian fiction", "zombies"},
}
labels["The Wizard of Oz"] = {
type = "berkenaan",
wikidata = 130295,
displaytitle = "''The Wizard of Oz''",
description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'' and subsequent books or films derived from it, such as the ''[[w:The Wizard of Oz (1939 film)|1939 film]]''",
parents = {"cereka Amerika", "fantasy", "kesusasteraan"},
}
labels["The X-Files"] = {
type = "berkenaan",
wikidata = 2744,
displaytitle = "''The X-Files''",
description = "=the ''{{w|The X-Files}}'' franchise",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["teater"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "hiburan"},
}
labels["Thracian deities"] = {
type = "nama",
description = "default",
parents = {"gods"},
}
labels["TikTok"] = {
type = "berkenaan",
wikidata = 48938223,
description = "=the video-sharing and social-networking service {{w|TikTok}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Tupi mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Brazil"},
}
labels["Twilight (novel series)"] = {
type = "berkenaan",
wikidata = 44523,
displaytitle = "''Twilight'' (novel series)",
description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise",
parents = {"cereka Amerika", "fantasy", "kesusasteraan", "vampires"},
}
labels["Twitter"] = {
type = "berkenaan",
wikidata = 918,
description = "=the social networking and microblogging service {{w|Twitter}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Tumblr"] = {
type = "berkenaan",
wikidata = 384060,
description = "=the microblogging and social networking service {{w|Tumblr}}",
parents = {"media sosial", "World Wide Web"},
}
labels["utopian and dystopian fiction"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif"},
}
labels["vampires"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"},
}
labels["vampire lifestyle"] = {
type = "berkenaan",
description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)",
parents = {"subbudaya", "vampires"},
}
labels["Virtual YouTuber"] = {
type = "berkenaan",
wikidata = 55155641,
description = "=[[virtual YouTuber]]s ([[VTuber]]s)",
parents = {"YouTube", "hiburan"},
}
labels["web design"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk", "World Wide Web"},
}
labels["werewolves"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"},
}
labels["worldbuilding"] = {
type = "berkenaan",
description = "default",
parents = {"narratology", "cereka spekulatif"},
}
labels["Xena: Warrior Princess"] = {
type = "berkenaan",
wikidata = 38497,
displaytitle = "''Xena: Warrior Princess''",
description = "=the television series ''{{w|Xena: Warrior Princess}}'' (1995–2001)",
parents = {"cereka Amerika", "fantasy", "televisyen"},
}
labels["YouTube"] = {
type = "berkenaan",
wikidata = 866,
description = "=the video-sharing website {{w|YouTube}}",
parents = {"media sosial", "World Wide Web", "Google"},
}
labels["YouTube Poop"] = {
type = "berkenaan",
wikidata = 16927904,
description = "default",
parents = {"YouTube", "Internet memes"},
}
labels["zombies"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror"},
}
return labels
grubzadh94jb0mihfa023nbxsztq4rg
281344
281343
2026-04-22T01:25:21Z
PeaceSeekers
3334
281344
Scribunto
text/plain
local labels = {}
labels["budaya"] = {
type = "berkenaan",
description = "default",
parents = {"masyarakat"},
}
labels["A Christmas Carol"] = {
type = "berkenaan",
wikidata = 62879,
displaytitle = "''A Christmas Carol''",
description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.",
parents = {"cereka British", "Charles Dickens"},
}
labels["A Song of Ice and Fire"] = {
type = "berkenaan",
wikidata = 45875,
displaytitle = "''A Song of Ice and Fire''",
description = "{{{langname}}} terms used in the context of the ''{{w|A Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.",
parents = {"cereka Amerika", "fantasi", "kesusasteraan"},
}
labels["lakonan"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["alternate history"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "history"},
}
labels["cereka Amerika"] = {
type = "berkenaan",
description = "=works of American fiction",
parents = {"cereka", "Amerika Syarikat"},
}
labels["animasi"] = {
type = "berkenaan",
description = "default",
parents = {"media massa"},
}
labels["Arabic fiction"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Arabic]] origin",
parents = {"cereka"},
}
labels["dewa Arab"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Arab"},
}
labels["mitologi Arab"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["mitologi Armenia"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Armenia"},
}
labels["seni"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["Arthurian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "United Kingdom"},
}
labels["karya seni"] = {
type = "nama,jenis",
description = "default",
parents = {"seni"},
}
labels["astrobiology"] = {
type = "berkenaan",
description = "default",
parents = {"astronomy", "biology", "geology"},
}
labels["astrologi"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan", "pseudosains", "obsolete scientific theories"},
}
labels["Asturian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Asturias, Spain"},
}
labels["Avatar: The Last Airbender"] = {
type = "berkenaan",
wikidata = 11572,
displaytitle = "''Avatar: The Last Airbender''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Australian Aboriginal mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Australia"},
}
labels["ballet"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["Barbie"] = {
type = "berkenaan",
wikidata = 167447,
description = "=the {{w|Barbie}} fashion doll produced by Mattel",
parents = {"toys"},
}
labels["Batman"] = {
type = "berkenaan",
wikidata = 2695156,
description = "=the fictional [[superhero]] [[Batman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["bibliography"] = {
type = "berkenaan",
description = "default",
parents = {"buku"},
}
labels["Bilibili"] = {
type = "berkenaan",
wikidata = 3077586,
description = "=the video-sharing website {{w|bilibili}}",
parents = {"media sosial", "World Wide Web"},
}
labels["blogging"] = {
type = "berkenaan",
description = "default",
parents = {"media sosial"},
}
labels["Bluesky"] = {
type = "berkenaan",
wikidata = 78194383,
description = "=the microblogging and social networking service {{w|Bluesky}}",
parents = {"media sosial", "World Wide Web"},
}
labels["body art"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "fesyen"},
}
labels["Bollywood"] = {
type = "berkenaan",
wikidata = 93196,
description = "default",
parents = {"filem", "India"},
}
labels["buku"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["books of the Poetic Edda"] = {
type = "nama",
displaytitle = "books of the ''Poetic Edda''",
description = "=[[book]]s of the ''[[Poetic Edda]]''",
parents = {"mitologi Norse"},
}
labels["Brazilian folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Brazil"},
}
labels["cereka British"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[British]] origin",
parents = {"cereka", "United Kingdom"},
}
labels["Buffy the Vampire Slayer"] = {
type = "berkenaan",
wikidata = 183513,
displaytitle = "''Buffy the Vampire Slayer''",
description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)",
parents = {"cereka Amerika", "televisyen", "vampires"},
}
labels["cereka Kanada"] = {
type = "berkenaan",
description = "=works of [[fiction]] of [[Canada|Canadian]] origin",
parents = {"cereka", "Kanada"},
}
labels["seni khat"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "penulisan"},
}
labels["cartomancy"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["castells"] = {
type = "berkenaan",
description = "=[[castell]]s, the Catalan tradition of human tower building",
additional = "See {{w|castells}}.",
parents = {"budaya", "sports"},
}
labels["celestial inhabitants"] = {
type = "jenis",
description = "=inhabitants of known [[celestial body|celestial bodies]]",
parents = {"watak cereka", "cereka sains", "demonyms"},
}
labels["Celtic mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Ireland", "Wales"},
}
labels["characters from folklore"] = {
type = "berkenaan",
description = "default",
parents = {"watak cereka", "folklore"},
}
labels["cheerleading"] = {
type = "berkenaan",
description = "default",
parents = {"tarian", "gymnastics", "sports"},
}
labels["Church of England"] = {
type = "berkenaan",
description = "default with the",
parents = {"Anglicanism", "England"},
}
labels["cereka China"] = {
type = "berkenaan",
description = "=works of [[fiction]], including [[anime]], [[manhua]], [[novel]]s, [[series]] and [[video game]]s, originating in [[China]]",
parents = {"cereka", "China"},
}
labels["mitologi Cina"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "China"},
}
labels["sinematografi"] = {
type = "berkenaan",
description = "default",
parents = {"filem"},
}
labels["sarkas"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"hiburan", "teater"},
}
labels["komedi"] = {
type = "berkenaan",
description = "default",
parents = {"drama"},
}
labels["komik"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"kesusasteraan"},
}
-- Confucianism: see [[Module:category tree/topic/Philosophy]]
labels["conlanging"] = {
type = "berkenaan",
description = "=[[conlanging]] (the making of [[constructed language]]s)",
parents = {"language", "budaya"},
}
labels["teori konspirasi"] = {
type = "berkenaan,set",
description = "=[[conspiracy theory|conspiracy theories]] and theorists",
parents = {"budaya"},
}
labels["constellations in the zodiac"] = {
type = "nama",
description = "=the ring of [[constellation]]s that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year",
parents = {"constellations", "astrologi"},
}
labels["kosmetik"] = {
type = "berkenaan",
description = "default",
parents = {"toiletries", "fesyen"},
}
labels["cosplay"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["tarian"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "rekreasi"},
}
labels["dances"] = {
type = "jenis",
description = "default",
parents = {"tarian"},
}
labels["DC Comics"] = {
type = "berkenaan",
wikidata = 2924461,
description = "={{w|DC Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["demoscene"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "computing"},
}
labels["reka bentuk"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["dictionaries"] = {
type = "jenis,nama",
description = "default",
parents = {"reference works", "lexicography"},
}
labels["Disney"] = {
type = "berkenaan",
wikidata = 7414,
description = "=the properties of {{w|The Walt Disney Company}}",
additional = "This includes properties acquired jointly with or from other companies.",
parents = {"cereka Amerika", "komik", "filem", "televisyen"},
}
labels["penilikan"] = {
type = "jenis",
description = "default",
parents = {"okultisme"},
}
labels["Doctor Who"] = {
type = "berkenaan",
wikidata = 34316,
displaytitle = "''Doctor Who''",
description = "=the ''{{w|Doctor Who}}'' franchise",
parents = {"cereka British", "cereka sains", "televisyen"},
}
labels["Dracula"] = {
type = "berkenaan",
wikidata = 41542,
displaytitle = "''Dracula''",
description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}} and its cultural derivations",
parents = {"fantasi", "kesusasteraan", "vampires"},
}
labels["naga"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures"},
}
labels["drama"] = {
type = "berkenaan",
description = "default",
parents = {"teater"},
}
labels["dewa Mesir"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Mesir"},
}
labels["mitologi Mesir"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Mesir Purba"},
}
labels["hiburan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["erotic literature"] = {
type = "berkenaan",
description = "default",
parents = {"cereka", "genre kesusasteraan", "sex"},
}
labels["mitologi Etruria"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Etruria"},
}
labels["European folklore"] = {
type = "berkenaan",
description = "default",
parents = {"folklore", "Europe"},
}
labels["fairy tale"] = {
type = "berkenaan",
description = "=[[fairy tale]]s",
parents = {"cereka"},
}
labels["fairy tale characters"] = {
type = "nama",
description = "=[[fairy tale]] [[character]]s",
parents = {"watak cereka", "fairy tale"},
}
labels["fairy tales"] = {
type = "nama",
description = "default",
parents = {"fairy tale"},
}
labels["fan fiction"] = {
type = "berkenaan",
description = "default",
parents = {"cereka", "fandom", "kesusasteraan"},
}
labels["fandom"] = {
type = "berkenaan",
description = "{{{langname}}} terms arising from [[fandom]] culture.",
parents = {"budaya"},
}
labels["fantasi"] = {
type = "berkenaan",
description = "=the [[genre]] of [[fantasy]]",
parents = {"cereka", "cereka spekulatif"},
}
labels["fesyen"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "pakaian"},
}
labels["faster-than-light travel"] = {
type = "berkenaan",
description = "default",
parents = {"travel", "cereka sains", "astrofizik", "kerelatifan"},
}
labels["Fediverse"] = {
type = "berkenaan",
wikidata = 30325419,
description = "=the decentralised social networking services collectively known as the {{w|Fediverse}}",
parents = {"media sosial", "World Wide Web"},
}
labels["cereka"] = {
type = "berkenaan",
description = "=specific works of [[fiction]]",
parents = {"karya seni"},
}
labels["fictional abilities"] = {
type = "berkenaan,jenis",
description = "=fictional [[ability|abilities]] and [[superpower]]s",
parents = {"cereka", "cereka spekulatif"},
}
labels["watak cereka"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional locations"] = {
type = "nama,jenis",
description = "default",
parents = {"cereka"},
}
labels["fictional planets"] = {
type = "nama",
description = "default",
parents = {"fictional locations"},
}
labels["fictional universes"] = {
type = "nama,jenis",
description = "default",
parents = {"fictional locations"},
}
labels["filem"] = {
type = "berkenaan",
description = "default",
parents = {"media massa", "hiburan"},
}
labels["F/F ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["film genres"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"filem", "genre"},
}
labels["industri filem"] = {
type = "nama",
description = "default",
parents = {"filem"},
}
labels["Finnic mythology"] = {
type = "berkenaan",
description = "=the [[mythology]] of the [[Finnic]] peoples",
additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.",
parents = {"mitologi", "Finland", "Estonia"},
}
labels["flamenco"] = {
type = "berkenaan",
description = "default",
parents = {"tarian"},
}
labels["folklore"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["furry fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom", "subbudaya"},
}
labels["dewa Jermanik"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Jermanik"},
}
labels["mitologi Jermanik"] = {
type = "nama",
description = "=the [[mythology]] of the [[Germanic]] peoples",
parents = {"mitologi"},
}
labels["genre"] = {
type = "jenis,berkenaan",
description = "=[[genre]]s and genre classifications",
parents = {"hiburan"},
wpcat = true,
}
labels["hantu"] = {
type = "berkenaan",
description = "default",
parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasi", "horror", "mythological creatures", "okultisme"},
}
labels["Glee"] = {
type = "berkenaan",
wikidata = 152178,
description = "=siri televisyen, ''[[w:Glee (siri TV)|Glee]]'' (2009–2015)",
parents = {"cereka Amerika", "televisyen"},
}
labels["reka bentuk grafik"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk"},
}
labels["dewa Yunani"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Yunani"},
}
labels["mitologi Yunani"] = {
type = "berkenaan",
description = "=[[mitologi]] masyarakat [[Yunani Purba]]",
parents = {"mitologi", "Yunani Purba"},
}
labels["Gulliver's Travels"] = {
type = "berkenaan",
wikidata = 181488,
displaytitle = "''Gulliver's Travels''",
description = "=''[[w:Gulliver's Travels|Gulliver’s Travels]]''",
parents = {"kesusasteraan"},
}
labels["Harry Potter"] = {
type = "berkenaan",
wikidata = 8337,
displaytitle = "''Harry Potter''",
description = "{{{langname}}} terms used in the context of the ''{{w|Harry Potter}}'' franchise.",
parents = {"cereka British", "fantasi", "kesusasteraan", "watak cereka"},
}
labels["Hawaiian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hawaii, USA"},
}
labels["F/M ships"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["dewa Hindu"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Hindu"},
}
labels["mitologi Hindu"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Hinduisme"},
}
labels["Homestuck"] = {
type = "berkenaan",
displaytitle = "''Homestuck''",
wikidata = 2618713,
description = "=the ''{{w|Homestuck}}'' multimedia fiction series",
parents = {"cereka Amerika", "komik"},
}
labels["Hopi culture"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "Amerika Syarikat"},
}
labels["horror"] = {
type = "berkenaan",
description = "=the [[horror]] [[genre]]",
parents = {"kesusasteraan", "cereka spekulatif"},
}
labels["humanities"] = {
type = "berkenaan",
description = "default no singularize",
parents = {"budaya"},
commonscat = true,
}
labels["incestuous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.",
parents = {"incest", "ships (fandom) by relationship type"},
}
labels["idol fandom"] = {
type = "berkenaan",
description = "default",
parents = {"fandom"},
}
labels["Instagram"] = {
type = "berkenaan",
wikidata = 209330,
description = "=the photo sharing and social networking service [[Instagram]]",
parents = {"photography", "media sosial", "World Wide Web"},
}
labels["Iranian mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Iran"},
}
labels["Irish mythology"] = {
type = "berkenaan",
description = "default",
parents = {"Celtic mythology", "Ireland"},
}
labels["James Bond"] = {
type = "berkenaan",
wikidata = 844,
displaytitle = "''James Bond''",
description = "=the ''[[James Bond]]'' franchise",
parents = {"cereka British", "filem"},
}
labels["dewa Jepun"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Jepun"},
}
labels["cereka Jepun"] = {
type = "berkenaan",
description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], [[novel]], [[siri]] dan [[permainan video]]",
parents = {"cereka", "Jepun"},
}
labels["mitologi Jepun"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Jepun"},
}
labels["job titles in Romance of the Three Kingdoms"] = {
type = "jenis",
displaytitle = "job titles in ''Romance of the Three Kingdoms''",
description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Hikayat Tiga Kerajaan", "titles"},
}
labels["kewartawanan"] = {
type = "berkenaan",
description = "default",
parents = {"penulisan"},
}
labels["Kachinas"] = {
type = "nama",
description = "default",
parents = {"Hopi culture"},
}
labels["Komi mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Komi, Russia"},
}
labels["cereka Korea"] = {
type = "berkenaan",
description = "=works of [[fiction]], including [[anime]], [[manhwa]], [[novel]]s, [[series]] and [[video game]]s, originating in [[Korea]]",
parents = {"cereka", "Korea"},
}
labels["mitologi Korea"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Korea"},
}
labels["genre kesusasteraan"] = {
type = "jenis",
description = "{{{langname}}} terms for [[literary]] [[genre]]s.",
parents = {"kesusasteraan", "cereka", "genre"},
}
labels["kesusasteraan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "hiburan", "penulisan"},
}
labels["Lost (TV series)"] = {
type = "berkenaan",
wikidata = 23567,
displaytitle = "''Lost'' (TV series)",
description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["Lovecraftian horror"] = {
type = "berkenaan",
wikidata = 2448865,
description = "=the [[literature|literary]] works of {{w|H. P. Lovecraft}}",
parents = {"horror", "kesusasteraan", "cereka", "supernatural"},
}
labels["magic"] = {
type = "berkenaan",
description = "default",
parents = {"supernatural"},
}
labels["magic words"] = {
type = "set",
wikidata = 1135882,
description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.",
parents = {"plot devices", "cereka"},
}
labels["genre manga"] = {
type = "jenis",
description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.",
parents = {"genre kesusasteraan"},
}
labels["perkahwinan"] = {
type = "berkenaan",
description = "default",
parents = {"budaya", "keluarga"},
}
labels["Marvel Comics"] = {
type = "berkenaan",
wikidata = 173496,
description = "={{w|Marvel Comics}}",
parents = {"cereka Amerika", "komik"},
}
labels["media massa"] = {
type = "berkenaan",
description = "default",
parents = {"media", "budaya"},
}
labels["dewa Meitei"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Meitei"},
}
labels["mitologi Meitei"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Manipur, India"},
}
labels["merpeople"] = {
type = "berkenaan",
description = "default",
parents = {"mythological creatures"},
}
labels["dewa Mesopotamia"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Mesopotamia"},
}
labels["mitologi Mesopotamia"] = {
type = "berkenaan",
description = "=the [[mythology]] of ancient [[Mesopotamia]]",
parents = {"mitologi", "Timur Dekat Purba"},
}
labels["M/M ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.",
parents = {"LGBTQ", "ships (fandom) by relationship type"},
}
labels["seni moden"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Mongolian tribes"] = {
type = "nama",
description = "{{{langname}}} names for Mongolian tribes.",
parents = {"ethnonyms", "Mongolia"},
}
labels["misai"] = {
type = "jenis",
description = "default",
parents = {"muka", "fesyen", "rambut"},
}
labels["My Hero Academia"] = {
type = "berkenaan",
wikidata = 18047903,
displaytitle = "''My Hero Academia''",
description = "=the ''{{w|My Hero Academia}}'' series",
parents = {"cereka Jepun", "animasi", "komik"},
}
labels["My Little Pony"] = {
type = "berkenaan",
wikidata = 1071312,
displaytitle = "''My Little Pony''",
description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom",
parents = {"cereka Amerika", "animasi", "toys"},
}
labels["mythological creatures"] = {
type = "jenis",
description = "default",
parents = {"mitologi", "fantasi"},
}
labels["mythological figures"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological locations"] = {
type = "nama",
description = "default",
parents = {"mitologi"},
}
labels["mythological plants"] = {
type = "jenis,nama",
description = "default",
parents = {"mitologi", "plants"},
}
labels["mitologi"] = {
type = "berkenaan",
description = "default",
parents = {"budaya"},
}
labels["narratology"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "drama"},
}
labels["Navajo mythology"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi"},
}
labels["akhbar"] = {
type = "nama",
description = "default",
parents = {"terbitan berkala"},
}
labels["Niconico"] = {
type = "berkenaan",
wikidata = 697233,
description = "=the video-sharing website {{w|Niconico}}",
parents = {"media sosial", "World Wide Web"},
}
labels["dewa Norse"] = {
type = "nama",
description = "default",
parents = {"dewa", "dewa Jermanik", "mitologi Norse"},
}
labels["mitologi Norse"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "mitologi Jermanik"},
}
labels["okultisme"] = {
type = "berkenaan",
description = "default with the",
parents = {"supernatural", "paranormal"},
}
labels["omegaverse"] = {
type = "berkenaan",
wikidata = 96397374,
description = "=the [[omegaverse]] genre",
parents = {"erotic literature", "fan fiction", "cereka spekulatif"},
}
labels["Omori"] = {
type = "berkenaan",
wikidata = 105618699,
displaytitle = "''Omori''",
description = "=the ''{{w|Omori (video game)|Omori}}'' series",
parents = {"cereka Amerika", "permainan video"},
}
labels["Once Upon a Time"] = {
type = "berkenaan",
wikidata = 23673,
displaytitle = "''Once Upon a Time''",
description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)",
parents = {"cereka Amerika", "Disney", "televisyen"},
}
labels["painting"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["palmistry"] = {
type = "berkenaan",
description = "default",
parents = {"penilikan"},
}
labels["parti"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"hiburan", "budaya"},
}
labels["people in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "people in ''Romance of the Three Kingdoms''",
description = "=people in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Hikayat Tiga Kerajaan"},
}
labels["minyak wangi"] = {
type = "jenis,set",
description = "default",
parents = {"fesyen", "scents", "perfumery"},
}
labels["terbitan berkala"] = {
type = "jenis,berkenaan",
description = "default",
parents = {"media massa", "kesusasteraan"},
}
labels["personifications"] = {
type = "nama",
description = "default",
parents = {"narratology"},
}
labels["places in Romance of the Three Kingdoms"] = {
type = "nama",
displaytitle = "places in ''Romance of the Three Kingdoms''",
description = "=places in ''{{w|Romance of the Three Kingdoms}}''",
parents = {"Hikayat Tiga Kerajaan", "China"},
}
labels["plot devices"] = {
type = "jenis",
description = "default",
parents = {"narratology", "cereka"},
}
labels["puisi"] = {
type = "berkenaan",
description = "default",
parents = {"kesusasteraan", "seni"},
}
labels["polyamorous ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.",
parents = {"ships (fandom) by relationship type"},
}
labels["Private Eye"] = {
type = "berkenaan",
displaytitle = "''Private Eye''",
description = "=the ''{{w|Private Eye}}'' franchise",
parents = {"cereka British"},
}
labels["Reddit"] = {
type = "berkenaan",
wikidata = 2195701,
description = "=the social news aggregation and discussion website {{w|Reddit}}",
parents = {"media sosial", "World Wide Web"},
}
labels["reference works"] = {
type = "jenis",
description = "default",
parents = {"buku"},
}
labels["dewa Rom"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Rom"},
}
labels["mitologi Rom"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Rom Purba"},
}
labels["romance fiction"] = {
type = "berkenaan",
description = "default",
parents = {"genre kesusasteraan", "cinta"},
}
labels["Hikayat Tiga Kerajaan"] = {
type = "berkenaan",
wikidata = 70806,
displaytitle = "''Hikayat Tiga Kerajaan''",
description = "=''{{w|Hikayat Tiga Kerajaan}}''",
parents = {"cereka", "kesusasteraan", "China"},
}
labels["RPF ships (fandom)"] = {
type = "nama",
description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.",
additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].",
parents = {"ships (fandom) by relationship type"},
}
labels["cereka sains"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif", "cereka"},
}
labels["SCP Foundation"] = {
type = "berkenaan",
wikidata = 17439649,
description = "{{{langname}}} terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.",
parents = {"fantasi", "cereka", "horror", "cereka sains", "supernatural"},
}
labels["arca"] = {
type = "berkenaan",
description = "default",
parents = {"seni"},
}
labels["Shahnameh"] = {
type = "berkenaan",
wikidata = 8279,
displaytitle = "''Shahnameh''",
description = "=''Shahnameh''",
parents = {"cereka", "puisi", "kesusasteraan", "Parsi"},
}
labels["Shahnameh characters"] = {
type = "nama",
description = "=characters in the [[Shahnameh]]",
parents = {"Shahnameh"},
}
labels["shapeshifters"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore"},
}
labels["Sherlock Holmes"] = {
type = "berkenaan",
wikidata = 2316684,
description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them",
parents = {"cereka British", "kesusasteraan"},
}
labels["Sherlock (TV series)"] = {
type = "berkenaan",
wikidata = 192837,
displaytitle = "''Sherlock'' (TV series)",
description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)",
parents = {"Sherlock Holmes", "televisyen"},
}
labels["shipping (fandom)"] = {
type = "berkenaan",
description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)",
parents = {"fandom", "romance fiction"},
}
labels["ships (fandom)"] = {
type = "kumpulan",
description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} (i.e., a fictional relationship between two fictional characters or real people)",
parents = {"shipping (fandom)"},
}
labels["ships (fandom) by relationship type"] = {
type = "kumpulan",
description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g., [[heterosexual]], [[homosexual]], etc.)",
parents = {"ships (fandom)"},
}
labels["shippers (fandom)"] = {
type = "jenis",
description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)",
parents = {"shipping (fandom)"},
}
labels["dewa Slavik"] = {
type = "nama",
description = "default",
parents = {"dewa", "mitologi Slavik"},
}
labels["mitologi Slavik"] = {
type = "berkenaan",
description = "=[[mitologi]] masyarakat [[Slav]]",
parents = {"mitologi"},
}
labels["Smallville (TV series)"] = {
type = "berkenaan",
wikidata = 180228,
displaytitle = "''Smallville'' (TV series)",
description = "=the television series ''{{w|Smallville}}'' (2001–2011)",
parents = {"cereka Amerika", "Superman", "televisyen"},
}
labels["media sosial"] = {
type = "berkenaan",
wikidata = 202833,
description = "default",
parents = {"media massa", "Internet"},
}
labels["South Korean idol fandom"] = {
type = "berkenaan",
wikidata = 39086123,
description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]",
parents = {"idol fandom", "South Korea"},
}
labels["South Park"] = {
type = "berkenaan",
wikidata = 16538,
displaytitle = "''South Park''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.",
parents = {"cereka Amerika", "animasi"},
}
labels["Star Trek"] = {
type = "berkenaan",
wikidata = 1092,
displaytitle = "''Star Trek''",
description = "=the ''{{w|Star Trek}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"},
}
labels["Star Wars"] = {
type = "berkenaan",
wikidata = 462,
displaytitle = "''Star Wars''",
description = "=the ''{{w|Star Wars}}'' franchise",
parents = {"cereka Amerika", "filem", "cereka sains", "Disney"},
}
labels["Steven Universe"] = {
type = "berkenaan",
wikidata = 7615342,
displaytitle = "''Steven Universe''",
description = "=the animated television series ''{{w|Steven Universe}}''",
parents = {"cereka Amerika", "animasi"},
}
labels["stock characters"] = {
type = "jenis",
wikidata = 636497,
description = "default",
parents = {"watak cereka"},
}
labels["cereka spekulatif"] = {
type = "berkenaan",
wikidata = 9326077,
description = "default",
parents = {"cereka", "genre"},
}
labels["spider fighting"] = {
type = "berkenaan",
wikidata = 7577058,
description = "={{w|spider fighting}}",
parents = {"spiders", "human activity"},
}
labels["subbudaya"] = {
type = "berkenaan",
description = "=[[subculture]]s",
parents = {"budaya"},
}
labels["adiwira"] = {
type = "nama",
wikidata = 188784,
description = "=[[superhero]]es",
parents = {"watak cereka"},
}
labels["Superman"] = {
type = "berkenaan",
wikidata = 79015,
description = "=the fictional [[superhero]] [[Superman]]",
parents = {"DC Comics", "watak cereka"},
}
labels["supernatural"] = {
type = "berkenaan",
wikidata = 80837,
description = "default with the",
parents = {"folklore"},
}
labels["Supernatural (TV series)"] = {
type = "berkenaan",
wikidata = 130585,
displaytitle = "''Supernatural'' (TV series)",
description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)",
parents = {"cereka Amerika", "televisyen"},
}
labels["dewa Tamil"] = {
type = "nama",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.",
parents = {"dewa", "dewa Hindu", "mitologi Tamil"},
}
labels["mitologi Tamil"] = {
type = "nama",
description = "default",
additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.",
parents = {"mitologi", "mitologi Hindu", "Tamil Nadu, India"},
}
labels["televisyen"] = {
type = "berkenaan",
wikidata = 289,
description = "default",
parents = {"media massa", "penyiaran"},
}
labels["The Handmaid's Tale"] = {
type = "berkenaan",
wikidata = 25207350,
displaytitle = "''The Handmaid's Tale''",
description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)",
parents = {"Canadian fiction", "cereka utopia dan distopia", "kesusasteraan"},
}
labels["The Hunger Games"] = {
type = "berkenaan",
wikidata = 11679,
displaytitle = "''The Hunger Games''",
description = "=the ''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations",
parents = {"cereka Amerika", "cereka sains", "cereka utopia dan distopia", "kesusasteraan"},
}
labels["The Matrix"] = {
type = "berkenaan",
wikidata = 83495,
displaytitle = "''The Matrix''",
description = "=''{{w|The Matrix}}''",
parents = {"cereka Amerika", "cereka sains", "cereka utopia dan distopia"},
}
labels["The Simpsons"] = {
type = "berkenaan",
wikidata = 886,
displaytitle = "''The Simpsons''",
description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.",
parents = {"cereka Amerika", "animasi", "Disney"},
}
labels["The Walking Dead"] = {
type = "berkenaan",
wikidata = 232737,
displaytitle = "''The Walking Dead''",
description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted",
parents = {"cereka Amerika", "televisyen", "cereka utopia dan distopia", "zombi"},
}
labels["The Wizard of Oz"] = {
type = "berkenaan",
wikidata = 130295,
displaytitle = "''The Wizard of Oz''",
description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'' and subsequent books or films derived from it, such as the ''[[w:The Wizard of Oz (1939 film)|1939 film]]''",
parents = {"cereka Amerika", "fantasi", "kesusasteraan"},
}
labels["The X-Files"] = {
type = "berkenaan",
wikidata = 2744,
displaytitle = "''The X-Files''",
description = "=the ''{{w|The X-Files}}'' franchise",
parents = {"cereka Amerika", "cereka sains", "televisyen"},
}
labels["teater"] = {
type = "berkenaan",
description = "default",
parents = {"seni", "hiburan"},
}
labels["Thracian deities"] = {
type = "nama",
description = "default",
parents = {"dewa"},
}
labels["TikTok"] = {
type = "berkenaan",
wikidata = 48938223,
description = "=the video-sharing and social-networking service {{w|TikTok}}",
parents = {"media sosial", "World Wide Web"},
}
labels["mitologi Tupi"] = {
type = "berkenaan",
description = "default",
parents = {"mitologi", "Brazil"},
}
labels["Twilight (novel series)"] = {
type = "berkenaan",
wikidata = 44523,
displaytitle = "''Twilight'' (novel series)",
description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise",
parents = {"cereka Amerika", "fantasi", "kesusasteraan", "vampires"},
}
labels["Twitter"] = {
type = "berkenaan",
wikidata = 918,
description = "=the social networking and microblogging service {{w|Twitter}}",
parents = {"media sosial", "World Wide Web"},
}
labels["Tumblr"] = {
type = "berkenaan",
wikidata = 384060,
description = "=the microblogging and social networking service {{w|Tumblr}}",
parents = {"media sosial", "World Wide Web"},
}
labels["cereka utopia dan distopia"] = {
type = "berkenaan",
description = "default",
parents = {"cereka spekulatif"},
}
labels["vampires"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"},
}
labels["vampire lifestyle"] = {
type = "berkenaan",
description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)",
parents = {"subbudaya", "vampires"},
}
labels["Virtual YouTuber"] = {
type = "berkenaan",
wikidata = 55155641,
description = "=[[virtual YouTuber]]s ([[VTuber]]s)",
parents = {"YouTube", "hiburan"},
}
labels["web design"] = {
type = "berkenaan",
description = "default",
parents = {"reka bentuk", "World Wide Web"},
}
labels["werewolves"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"},
}
labels["worldbuilding"] = {
type = "berkenaan",
description = "default",
parents = {"narratology", "cereka spekulatif"},
}
labels["Xena: Warrior Princess"] = {
type = "berkenaan",
wikidata = 38497,
displaytitle = "''Xena: Warrior Princess''",
description = "=the television series ''{{w|Xena: Warrior Princess}}'' (1995–2001)",
parents = {"cereka Amerika", "fantasi", "televisyen"},
}
labels["YouTube"] = {
type = "berkenaan",
wikidata = 866,
description = "=the video-sharing website {{w|YouTube}}",
parents = {"media sosial", "World Wide Web", "Google"},
}
labels["YouTube Poop"] = {
type = "berkenaan",
wikidata = 16927904,
description = "default",
parents = {"YouTube", "Internet memes"},
}
labels["zombi"] = {
type = "berkenaan,jenis",
description = "default",
parents = {"mythological creatures", "characters from folklore", "death", "horror"},
}
return labels
tojb5f765nr8fk8gwqplc7snir4z6on
Modul:category tree/topic/Animals
828
11530
281325
279238
2026-04-22T00:37:22Z
PeaceSeekers
3334
281325
Scribunto
text/plain
local labels = {}
labels["haiwan"] = {
type = "set",
description = "default",
parents = {"makhluk"},
commonscat = "Animalia",
wpcat = true,
}
labels["ikan akanturoid"] = {
type = "set",
description = "=[[surgeonfish]], [[light-horseman]], [[louvar]]s, [[scat]]s, [[rabbitfish]], [[Moorish idol]]s and other fish in the [[perciform]] [[suborder]] [[Acanthuroidei]]",
parents = {"ikan"},
}
labels["accentors"] = {
type = "set",
description = "=birds in the [[family]] [[Prunellidae]]",
parents = {"burung tenggek"},
}
labels["accipiters"] = {
type = "set",
description = "=[[besra]]s, [[Cooper's hawk]]s, [[goshawk]]s, [[sharp-shinned hawk]]s, [[shikra]]s, [[sparrowhawk]]s, and other [[hawk]]s in the [[genus]] ''[[Accipiter]]''",
parents = {"burung pemangsa"},
}
labels["ikan asipenseriform"] = {
type = "set",
description = "=[[paddlefish]], [[sturgeon]]s and other fish in the [[order]] [[Acipenseriformes]]",
parents = {"ikan"},
}
labels["adephagan beetles"] = {
type = "set",
description = "=[[diving beetle]]s, [[ground beetle]]s (including [[bombardier beetle]]s and [[tiger beetle]]s), [[whirligig beetle]]s and other [[beetle]]s in the [[suborder]] [[Adephaga]]",
parents = {"beetles"},
}
labels["African insectivores"] = {
type = "set",
description = "=[[aardvark]]s, [[elephant shrew]]s, [[golden mole]]s, [[otter shrew]]s, [[tenrec]]s, and other [[mammal]]s in the [[clade]] [[Afroinsectiphilia]]",
parents = {"mamalia"},
}
labels["agamid lizards"] = {
type = "set",
description = "=[[agama]]s, [[bearded dragon]]s, [[flying dragon]]s, [[frilled lizard]]s, [[moloch]]s, [[spiny-tailed lizard]]s, [[stellion]]s and other [[lizard]]s in the [[family]] [[Agamidae]]",
parents = {"lizards"},
}
labels["alcelaphine antelopes"] = {
type = "set",
description = "=[[blesbuck]]s, [[bontebok]]s, [[bubal]]s, [[gnu]]s or [[wildebeest]], [[hartebeest]]s, [[hirola]], [[sassaby]]s, [[topi]]s, [[tetel]]s, and other [[antelope]]s in the [[subfamily]] [[Alcelaphinae]]",
parents = {"antelopes"},
}
labels["ammonites"] = {
type = "set",
description = "=[[extinct]] [[cephalopod]]s in the [[subclass]] [[Ammonoidea]]",
parents = {"sefalopod"},
}
labels["amfibia"] = {
type = "set",
description = "default",
parents = {"vertebrat"},
commonscat = "Amphibia",
wpcat = true,
}
labels["amphipods"] = {
type = "set",
description = "=[[beach flea]]s, [[lawn shrimp]], [[scud]]s, [[side swimmer]]s, [[skeleton shrimp]], [[whale louse|whale lice]], and other [[crustacean]]s in the [[order]] [[Amphipoda]]",
parents = {"krustasea"},
}
labels["anatid"] = {
type = "set",
description = "=[[anatid]]s ([[duck]]s, [[goose|geese]] and [[swan]]s)",
parents = {"burung air tawar"},
}
labels["annelids"] = {
type = "set",
description = "=[[earthworm]]s, [[leech]]es, [[ragworm]]s and many other [[segment]]ed [[worm]]s in the [[filum]] [[Annelida]]",
parents = {"cacing"},
}
labels["anglerfish"] = {
type = "set",
description = "=fish in the [[order]] [[Lophiiformes]]",
parents = {"ikan"},
}
labels["anguimorph lizards"] = {
type = "set",
description = "=[[alligator lizard]]s, [[beaded lizard]]s, [[blindworm]]s, [[crocodile monitor]]s, [[galliwasp]]s, [[Gila monster]]s, [[glass lizard]]s, [[goanna]]s, [[Komodo dragon]]s, [[legless lizard]]s, [[Nile monitor]]s, [[perentie]]s, [[sheltopusik]]s, [[water monitor]]s, and other [[lizard]]s in the [[suborder]] [[Anguimorpha]]",
parents = {"lizards"},
}
labels["anomurans"] = {
type = "set",
description = "=crablike [[crustacean]]s in the [[decapod]] [[infraorder]] [[Anomura]], which are closely related to the true [[crab]]s in the infraorder [[Brachyura]]",
parents = {"krustasea", "dekapod"},
}
labels["anteaters and sloths"] = {
type = "set",
description = "=[[mammal]]s in the [[order]] [[Pilosa]]",
parents = {"mamalia"},
}
labels["antelopes"] = {
type = "set",
description = "default",
parents = {"ungulat kuku genap"},
}
labels["antilopine antelopes"] = {
type = "set",
description = "=[[blackbuck]]s, [[chinkara]]s, [[dibatag]]s, [[dik-dik]]s, [[gazelle]]s, [[gerenuk]]s, [[grysbok]]s, [[klipspringer]]s, [[oribi]]s, [[royal antelope]]s, [[saiga]]s, [[springbok]]s, [[steenbok]]s, [[zeren]], and other [[antelope]]s in the [[bovid]] [[subfamily]] [[Antilopinae]]",
parents = {"antelopes"},
}
labels["ants"] = {
type = "set",
description = "default",
parents = {"Hymenoptera"},
}
labels["antshrikes"] = {
type = "set",
description = "default",
parents = {"suboscines", "burung tenggek"},
}
labels["anurans"] = {
type = "set",
description = "=[[amphibian]]s in the [[order]] [[Anura]], which are short-bodied and without tails, having long hind legs adapted for leaping that are typically folded at rest. Anurans are mostly known as [[frog]]s or [[toad]]s",
parents = {"amfibia"},
}
labels["aphids"] = {
type = "set",
description = "=[[insect]]s in the [[superfamily]] [[Aphidoidea]]",
parents = {"hemipterans"},
}
labels["apodiforms"] = {
type = "set",
description = "=[[hummingbird]]s, [[needletail]]s, [[spinetail]]s, [[swift]]s, [[swiftlet]]s, [[treeswift]]s, and other [[bird]]s in the [[order]] [[Apodiformes]]",
parents = {"burung"},
}
labels["araknid"] = {
type = "set",
description = "default",
parents = {"artropod"},
}
labels["lelabah araneoid"] = {
type = "set",
description = "=[[lelabah tinja burung]], [[cobweb spiders]] (including [[black widow]]s and [[redback]]s), [[orbweaver]]s (including [[cross spider]]s and [[writing spider]]s), [[long-jawed spider]]s, [[money spider]]s, [[nesticid]]s, [[pimoid]], [[pirate spider]]s, [[tetragnathid]]s and other [[spider]]s in the [[superfamily]] [[Araneoidea]]",
parents = {"lelabah"},
}
labels["ikan argentiniform"] = {
type = "set",
description = "=[[argentine]]s, [[barreleye]]s, [[blacksmelt]]s, [[smoothtongue]]s and other ikan in the [[order]] [[Argentiniformes]]",
parents = {"ikan"},
}
labels["armadillos"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["artropod"] = {
type = "set",
description = "default",
parents = {"haiwan"},
commonscat = "Arthropoda",
wpcat = true,
}
labels["aschizan flies"] = {
type = "set",
description = "=[[fly|flies]] in the [[dipteran]] [[section]] [[Aschiza]]",
parents = {"Diptera"},
}
labels["asilomorph flies"] = {
type = "set",
description = "=[[bee fly|bee flies]], [[dance fly|dance flies]], [[Mydas fly|Mydas flies]], [[robber fly|robber flies]], [[stiletto fly|stiletto flies]], [[window fly|window flies]] and other [[fly|flies]] in the [[dipteran]] [[infraorder]] [[Asilomorpha]]",
parents = {"Diptera"},
}
labels["assassin bugs"] = {
type = "set",
description = "=[[ambush bug]]s, [[assassin bug]]s, [[corsair]]s, [[feather-legged bug]]s, [[kissing bug]]s or [[conenose bug]]s, [[masked hunter]]s, [[wheel bug]]s, and other [[true bug]]s in the [[family]] [[Reduviidae]]",
parents = {"true bugs"},
}
labels["astacideans"] = {
type = "set",
description = "=[[crustacean]]s in the [[decapod]] [[infraorder]] [[Astacidea]], including the original [[species]] known as [[crayfish]] and [[lobster]]s, and their relatives",
parents = {"krustasea", "dekapod"},
}
labels["ikan ateriniform"] = {
type = "set",
description = "=[[blue-eye]]s, [[hardyhead]]s, [[grunion]], [[jacksmelt]], [[rainbowfish]], [[silverside]]s, [[zona]], and other ikan in the [[order]] [[Atheriniformes]]",
parents = {"ikan"},
}
labels["auks"] = {
type = "set",
description = "=[[auk]]s, [[guillemot]]s, [[murre]]s, [[puffin]]s, [[razorbill]]s, and other [[seabird]]s in the family [[Alcidae]]",
parents = {"burung laut"},
}
labels["ikan aulopiform"] = {
type = "set",
description = "=[[daggertooth]]s, [[lancetfish]], [[sergeant baker]]s, [[greeneye]]s, [[telescopefish]], [[lizardfish]] and other ikan in the [[order]] [[Aulopiformes]]",
parents = {"ikan"},
}
labels["Australasian robins"] = {
type = "set",
description = "=birds in the [[passerine]] [[family]] [[Petroicidae]], which are not closely related to the [[European robin]] (an [[Old World flycatcher]] in the family [[Muscicapidae]]), or the [[American robin]] (a [[thrush]] in the family [[Turdidae]])",
parents = {"burung tenggek"},
}
labels["anak haiwan"] = {
type = "set",
description = "default",
parents = {"haiwan"},
}
labels["bandicoots and bilbies"] = {
type = "set",
description = "=[[peramelid]]s, [[bandicoot]]s, [[marl]]s, [[quenda]]s, [[chaeropodid]]s, [[pig-footed bandicoot]]s, [[thylacomyid]]s, [[bilby|bilbies]], [[dalgite]]s, [[rabbit-eared bandicoot]]s, [[philander]]s, [[pinkie]]s, and other [[marsupial]]s in the [[order]] [[Peramelemorphia]]",
parents = {"marsupials"},
}
labels["barklice"] = {
type = "set",
description = "=non-[[parasitic]] [[insect]]s in the [[order]] [[Psocodea]]",
parents = {"serangga"},
}
labels["barnacles"] = {
type = "set",
description = "=[[crustacean]]s in the [[infraclass]] [[Cirripedia]], including the parasitic [[rhizocephalan]]s",
parents = {"krustasea"},
}
labels["kelawar"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["lebah"] = {
type = "set",
description = "default",
parents = {"Hymenoptera", "pemeliharaan lebah"},
}
labels["beetles"] = {
type = "set",
description = "default",
parents = {"serangga"},
}
labels["ikan beloniform"] = {
type = "set",
description = "=[[ballyhoo]], [[flying fish]], [[garfish]], [[halfbeak]]s, [[houndfish]], [[mackerel pike]]s, [[medaka]]s, [[needlefish]], [[ricefish]], [[saury|sauries]], [[silver gar]], and other ikan in the [[order]] [[Beloniformes]]",
parents = {"ikan"},
}
labels["bibionomorphs"] = {
type = "set",
description = "=[[March fly|March flies]], [[cecidomyiid]] [[gall midge]]s, [[keroplatid]] [[fungus gnat]]s, [[mycetophilid]]s, [[sciarid]]s and other [[fly|flies]], [[gnat]]s and [[midge]]s in the [[dipteran]] [[infraorder]] [[Bibionomorpha]]",
parents = {"Diptera"},
}
labels["burung"] = {
type = "set",
description = "default",
parents = {"vertebrat"},
commonscat = "Aves",
wpcat = true,
}
labels["burung pemangsa"] = {
type = "set",
description = "=birds that live by [[predatory]] hunting or by feeding on [[carrion]]",
parents = {"burung"},
}
labels["bivalvia"] = {
type = "set",
description = "=[[clam]]s, [[cockle]]s, [[mussel]]s, [[oyster]]s, [[scallop]]s and other [[mollusk]]s in the [[class]] [[Bivalvia]]",
parents = {"moluska"},
}
labels["blennies"] = {
type = "set",
description = "=[[blenny|blennies]], [[chaenopsid]]s, [[clinid]]s, [[dactyloscopid]]s, [[klipfish]], [[labrisomid]]s, [[triplefin]]s, [[weedfish]] and other ikan in the [[perciform]] [[suborder]] [[Blennioidei]]",
parents = {"ikan"},
}
labels["boas"] = {
type = "set",
description = "=[[snake]]s in the family [[Boidae]]",
parents = {"ular"},
}
labels["bostrichiform beetles"] = {
type = "set",
description = "=[[carpet beetle]]s, [[deathwatch beetle]]s, [[drugstore beetle]]s, [[museum beetle]]s, [[powder-post beetle]]s, and other [[anobiid]]s/[[ptinid]]s, [[bostrichid]]s, [[dermestid]]s, [[derodontid]]s, [[jacobsoniid]]s and [[nosodendrid]]s in the [[coleopteran]] [[infraorder]] [[Bostrichiformia]]",
parents = {"beetles"},
}
labels["bovines"] = {
type = "set",
description = "default",
parents = {"ungulat kuku genap"},
}
labels["brachiopods"] = {
type = "set",
description = "=[[animal]]s in the [[filum]] [[Brachiopoda]]. <u>Note</u>: not to be confused with [[branchiopod]]s, which are [[crustacean]]s",
parents = {"haiwan"},
}
labels["branchiopods"] = {
type = "set",
description = "=[[brine shrimp]], [[clam shrimp]], [[fairy shrimp]], [[tadpole shrimp]], [[water flea]]s, and other [[crustacean]]s in the [[class]] [[Branchiopoda]]. <u>Note</u>: not to be confused with [[brachiopod]]s, which are a separate [[filum]]",
parents = {"krustasea"},
}
labels["bryozoans"] = {
type = "set",
description = "=[[animal]]s in the [[filum]] [[Bryozoa]], also known as [[Ectoprocta]]",
parents = {"haiwan"},
}
labels["bulbuls"] = {
type = "set",
description = "=[[bulbul]]s, [[greenbul]]s, [[brownbul]]s, [[leaflove]]s, [[bristlebill]]s, and other birds in the [[passerine]] [[family]] [[Pycnonotidae]]",
parents = {"burung tenggek"},
}
labels["buteos"] = {
type = "set",
description = "=[[hawk]]s in the [[genus]] ''[[Buteo]]'', known as [[buzzard]]s in Europe",
parents = {"burung pemangsa"},
}
labels["butterflies"] = {
type = "set",
description = "default",
parents = {"serangga"},
}
labels["caddis flies"] = {
type = "set",
description = "=serangga in the order [[Trichoptera]], which are closely related to the [[butterfly|butterflies]] and [[moth]]s but with hairs on their wings instead of scales, and which have [[aquatic]] [[larvae]] that live in cases that they build around themselves",
parents = {"serangga"},
}
labels["caecilians"] = {
type = "set",
description = "=[[amphibian]]s in the [[order]] [[Gymnophiona]], which are legless and resemble [[earthworm]]s or [[snake]]s",
parents = {"amfibia"},
}
labels["camelids"] = {
type = "set",
description = "=[[camelid]]s ([[camel]]s, [[llama]]s, [[alpaca]]s, etc.)",
parents = {"mamalia", "ungulat kuku genap"},
}
labels["kanid"] = {
type = "set",
description = "default",
parents = {"karnivor"},
}
labels["caprines"] = {
type = "set",
description = "=[[sheep]], [[goat]]s, [[goat antelope]]s, [[chamois]], [[muskox]]en, [[bharal]], [[goral]], [[ibex]], [[mouflon]], [[serow]], [[tahr]], [[tur]], [[takin]] and other haiwan in the [[bovid]] [[subfamily]] [[Caprinae]], formerly known as the [[family]] [[Capridae]]",
parents = {"ungulat kuku genap"},
}
labels["caprimulgiforms"] = {
type = "set",
description = "=[[caprimulgiform]]s: birds in the taxonomic order [[Caprimulgiformes]], such as the [[nightjar]]s, [[oilbird]]s, [[frogmouth]]s and [[potoo]]s",
parents = {"burung"},
}
labels["carcharhiniform sharks"] = {
type = "set",
description = "=[[bull shark]]s, [[catshark]]s, [[gummy shark]]s, [[hammerhead]]s, [[leopard shark]]s, [[morgay]]s, [[requiem shark]]s, [[tiger shark]]s, [[tope]]s, [[whaler]]s, [[whitetip]]s and other sharks in the [[order]] [[Carcharhiniformes]]",
parents = {"jerung"},
}
labels["cardinalids"] = {
type = "set",
description = "=[[cardinal]]s, [[dickcissel]]s, [[indigo bunting]]s, [[pyrrhuloxia]]s, [[rose-breasted grosbeak]]s, [[scarlet tanager]]s, and other birds in the [[family]] [[Cardinalidae]]",
parents = {"burung tenggek"},
}
labels["caridean shrimp"] = {
type = "set",
description = "=[[crustacean]]s in the [[decapod]] [[infraorder]] [[Caridea]], mostly known as [[shrimp]] or [[prawn]]s",
parents = {"krustasea", "dekapod"},
}
labels["karnivor"] = {
type = "set",
description = "=[[bear]]s, [[cat]]s, [[civet]]s, [[dog]]s, [[fossa]]s, [[hyaena]]s, [[mongoose]]s, [[panda]]s, [[raccoon]]s, [[seal]]s, [[skunk]]s, [[weasel]]s and various other [[mammal]]s in the [[order]] [[Carnivora]]",
parents = {"mamalia"},
}
labels["carps"] = {
type = "set",
description = "=ikan in the [[subfamily]] [[Cyprininae]], the [[carps]] and [[goldfish]]",
parents = {"cyprinids"},
}
labels["catfish"] = {
type = "set",
description = "default",
parents = {"ikan", "ikan otosefalan"},
}
labels["kucing"] = {
type = "set",
description = "=[[cat]]s in the sense of members of the genus ''[[Felis]]''",
parents = {"felids"},
commonscat = "Felis silvestris catus",
wpcat = true,
}
labels["cattle"] = {
type = "set",
description = "default",
parents = {"bovines", "ternakan"},
}
labels["caviomorphs"] = {
type = "set",
description = "=[[agouti]]s, [[capybara]]s, [[chinchilla]]s, [[guinea pig]]s, [[New World porcupine]]s, [[nutria]]s, [[tuco-tuco]]s and other [[rodent]]s in the parvorder [[Caviomorpha]]",
parents = {"rodensia"},
}
labels["sefalopod"] = {
type = "set",
description = "default",
parents = {"moluska"},
}
labels["monyet serkopitesin"] = {
type = "set",
description = "=[[blue monkey]]s, [[Diana monkey]]s, [[guenon]]s, [[lesula]]s, [[malbrouck]]s, [[patas monkey]]s, [[talapoin]]s, [[vervet]]s, and other [[Old World monkey]]s in the [[cercopithecine]] [[tribe]] [[Cercopithecini]]",
parents = {"monyet dunia lama"},
}
labels["burung sertioid"] = {
type = "set",
description = "=birds in the [[passerine]] [[superfamily]] [[Certhioidea]], the [[treecreeper]]s, [[nuthatch]]es, [[gnatcatcher]]s and [[wren]]s",
parents = {"burung tenggek"},
}
labels["Cervidae"] = {
type = "set",
description = "default",
parents = {"ungulat kuku genap"},
}
labels["setasea"] = {
type = "set",
description = "=[[cetacean]]s ([[dolphin]]s, [[whale]]s and [[porpoise]]s)",
parents = {"ungulat kuku genap"},
}
labels["chalcidoid wasps"] = {
type = "set",
description = "=[[chalcidid]]s, [[encyrtid]]s, [[fig wasp]]s, [[jointworm]]s, [[mymarid]] [[fairyfly|fairyflies]], [[perilampid]]s, [[torymid]]s, [[trichogramma]]s, and other [[wasp]]s in the [[superfamily]] [[Chalcidoidea]]",
parents = {"Hymenoptera"},
}
labels["characins"] = {
type = "set",
description = "=fish in the order [[Characiformes]]",
parents = {"ikan", "ikan otosefalan"},
}
labels["ayam"] = {
type = "set",
description = "default",
parents = {"poltri", "unggas"},
}
labels["chimaeras (fish)"] = {
type = "set",
description = "=[[cartilaginous]] fish in the [[Chimaeriformes]], the only surviving [[order]] of the [[subclass]] [[Holocephali]], and separate from the [[shark]]s, [[ray]]s, [[skate]]s and [[sawfish]] of the subclass [[Elasmobranchii]]",
parents = {"ikan"},
}
labels["kordata"] = {
type = "set",
description = "=haiwan dalam [[filum]] [[Chordata]]",
parents = {"haiwan"},
}
labels["chrysomeloid beetles"] = {
type = "set",
description = "=[[cerambycid]]s or [[longhorn beetle]]s such as [[apple borer]]s, [[huhu beetle]]s, [[locust borer]]s and [[thunderbolt beetle]]s, as well as [[chrysomelid]]s or [[leaf beetle]]s such as [[asparagus beetle]]s, [[bean weevil]]s, [[Colorado beetle]]s, [[cucumber beetle]]s, [[flea beetle]]s, [[potato beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Chrysomeloidea]]",
parents = {"beetles"},
}
labels["cicadas"] = {
type = "set",
description = "=[[insect]]s in the [[superfamily]] [[Cicadoidea]]",
parents = {"hemipterans"},
}
labels["cichlids"] = {
type = "set",
description = "=fish in the family [[Cichlidae]]",
parents = {"ikan labroid"},
}
labels["clinids"] = {
type = "set",
description = "=fish in the family [[Clinidae]]",
parents = {"ikan"},
}
labels["knidaria"] = {
type = "set",
description = "=[[coral]]s, [[gorgonian]]s, [[hydra]]s, [[myxozoan]]s, [[Portuguese man-of-war]], [[sea anemone]]s, [[sea fir]]s, [[sea wasp]]s, and other haiwan in the [[filum]] [[Cnidaria]]",
parents = {"haiwan"},
}
labels["cockatoos"] = {
type = "set",
description = "=[[crested]] [[parrot]]s in the [[family]] [[Cacatuidae]]",
parents = {"parrots"},
}
labels["lipas"] = {
type = "set",
description = "default",
parents = {"serangga"},
}
labels["colobine monkeys"] = {
type = "set",
description = "=[[colobus]]es, [[douc]]s, [[langur]]s, [[guereza]]s, [[hanuman]]s, [[leaf monkey]]s, [[lutung]]s, [[proboscis monkey]]s, and other [[Old World monkey]]s in the [[subfamily]] [[Colobinae]]",
parents = {"monyet dunia lama"},
}
labels["ular kolubrid"] = {
type = "set",
description = "=[[snake]]s in the family [[Colubridae]]",
parents = {"ular"},
}
labels["colugos"] = {
type = "set",
description = "=the [[primate]]-like [[gliding]] [[mammal]]s in the [[order]] [[Dermoptera]], also known as [[flying lemur]]s",
parents = {"mamalia"},
}
labels["columbids"] = {
type = "set",
description = "=[[columbid]]s, i.e. [[pigeon]]s and [[dove]]s",
parents = {"burung"},
}
labels["copepods"] = {
type = "set",
description = "=[[crustacean]]s in the [[subclass]] [[Copepoda]]",
parents = {"krustasea"},
}
labels["coraciiforms"] = {
type = "set",
description = "=[[bee-eater]]s, [[ground rollers]], [[kingfisher]]s, [[motmot]]s, [[roller]]s, [[tody|todies]] and other birds in the taxonomic order [[Coraciiformes]]",
parents = {"burung"},
}
labels["corvids"] = {
type = "set",
description = "default",
parents = {"burung tenggek", "burung korvoid"},
}
labels["burung korvoid"] = {
type = "set",
description = "=[[apostlebird]]s, [[bird of paradise|birds of paradise]], [[crow]]s, [[drongo]]s, [[fantail]]s, [[grinder]]s, [[jackdaw]]s, [[jay]]s, [[magpie]]s, [[magpie-lark]]s, [[manucode]]s, [[monarchid]]s, [[nutcracker]]s, [[piwakawaka]]s, [[raven]]s, [[restless flycatcher]]s, [[riflebird]]s, [[shrike]]s, [[standard-wing]]s, and other birds in the [[superfamily]] [[Corvoidea]]",
parents = {"burung tenggek"},
}
labels["cotingas"] = {
type = "set",
description = "=birds in the [[suboscine]] [[family]] [[Cotingidae]]",
parents = {"suboscines"},
}
labels["ketam"] = {
type = "set",
description = "=[[crab]]s, [[decapod]] [[crustacean]]s in the [[infraorder]] [[Brachyura]]",
parents = {"krustasea", "dekapod"},
}
labels["cranes (birds)"] = {
type = "set",
description = "=[[crane]]s",
parents = {"gruiforms"},
}
labels["cricetids"] = {
type = "set",
description = "=[[cotton rat]]s, [[deer mouse|deer mice]], [[hamster]]s, [[harvest mouse|harvest mice]], [[lemming]]s, [[vole]]s, [[woodrat]]s, and other [[rodent]]s in the [[family]] [[Cricetidae]]",
parents = {"rodensia"},
}
labels["cengkerik dan belalang"] = {
type = "set",
description = "=[[cengkerik]], [[belalang]], [[katidid]], [[weta]] dan [[serangga]] lain dalam order [[Orthoptera]]",
parents = {"serangga"},
}
labels["croakers"] = {
type = "set",
description = "=[[croaker]]s, [[drum]]s, [[weakfish]]s and other fish in the family [[Sciaenidae]]",
parents = {"ikan perkoid"},
}
labels["Crocodilia"] = {
type = "set",
description = "=[[buaya]], [[aligator]], kayman dan [[reptilia]] lain dalam order [[Crocodilia]]",
parents = {"reptilia"},
}
labels["krustasea"] = {
type = "set",
description = "default",
parents = {"artropod"},
}
labels["cuckoos"] = {
type = "set",
description = "=[[cuckoo]]s and other birds in the [[family]] [[Cuculidae]]",
parents = {"otidimorph birds"},
}
labels["cuckooshrikes and minivets"] = {
type = "set",
description = "=birds in the [[family]] [[Campephagidae]]",
parents = {"burung tenggek"},
}
labels["cucujoid beetles"] = {
type = "set",
description = "=[[flower beetle]]s, [[fungus beetle]]s, [[grain beetle]]s, [[lady beetle]]s, [[lizard beetle]]s, [[Mexican bean beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Cucujoidea]]",
parents = {"beetles"},
}
labels["ctenophores"] = {
type = "set",
description = "=haiwan in the [[filum]] [[Ctenophora]], the [[comb jelly|comb jellies]]",
parents = {"haiwan"},
}
labels["Culicomorpha"] = {
type = "set",
description = "=[[biting midge]]s, [[blackfly|blackflies]], [[blood worm]]s, [[glassworm]]s, [[meniscus midge]]s, [[mosquito]]s, [[no-see-um]]s, [[non-biting midge]]s, [[phantom midge]]s and other [[insect]]s in the [[dipteran]] [[infraorder]] [[Culicomorpha]]",
parents = {"Diptera"},
}
labels["cyprinids"] = {
type = "set",
description = "=[[carp]], [[minnow]]s, [[chub]]s and other fish in the [[family]] [[Cyprinidae]]. In some classifications, this group is known as the [[superfamily]] [[Cyprinoidea]] or [[suborder]] [[Cyprinoidei]], with the [[cyprinid]] [[subfamily|subfamilies]] considered to be families",
parents = {"ikan", "ikan otosefalan"},
}
labels["dabbling ducks"] = {
type = "set",
description = "=[[gadwall]]s, [[garganey]]s, [[mallard]]s, [[mottled duck]]s, [[pintail]]s, [[shoveler]]s, [[teal]]s, [[wigeon]]s and other ducks in either the [[anatid]] [[tribe]] [[Anatini]] or [[subfamily]] [[Anatinae]], depending on the classification",
parents = {"itik"},
}
labels["damselflies"] = {
type = "set",
description = "=[[bluestreak]]s, [[bluetail]]s, [[demoiselle]]s, [[flatwing]]s, [[redtail]]s, [[riverdamsel]]s, [[rubyspot]]s, [[spreadwing]]s, [[threadtail]]s, [[whitetip]]s, and other serangga in the [[odonate]] [[suborder]] [[Zygoptera]]",
parents = {"dragonflies and damselflies"},
}
labels["danaine butterflies"] = {
type = "set",
description = "=[[clearwing]]s, [[crow]]s, [[milkweed]]s, [[monarch]]s, [[paper kite butterfly|paper kite butterflies]], [[tiger]]s, [[wanderer]]s and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Danainae]]",
parents = {"nymphalid butterflies"},
}
labels["dasyuromorphs"] = {
type = "set",
description = "=[[thylacine]]s, [[numbat]]s, [[dasyure]]s, [[antechinus]]es, [[dibbler]]s, [[dunnart]]s, [[mulgara]]s, [[phascogale]]s, [[planigale]]s, [[quoll]]s, [[Tasmanian devil]]s, and other [[marsupial]]s in the [[order]] [[Dasyuromorphia]]",
parents = {"marsupials"},
}
labels["dekapod"] = {
type = "set",
description = "=[[crab]]s, [[crayfish]], [[lobster]]s, [[prawn]]s, ([[caridean]]) [[shrimp]], and many other [[crustacean]]s in the [[order]] [[Decapoda]]",
parents = {"krustasea"},
}
labels["delphinids"] = {
type = "set",
description = "=(oceanic) [[dolphin]]s, [[grampus]]es, [[killer whale]]s/[[orca]]s, [[pilot whale]]s, and other [[cetacean]]s in the [[family]] [[Delphinidae]]",
additional = "Note: [[river dolphin]]s and [[porpoise]]s are in other families.",
parents = {"setasea"},
}
labels["designer dogs"] = {
type = "set",
description = "default",
parents = {"anjing"},
commonscat = true,
wpcat = true,
}
labels["dinosaur"] = {
type = "set",
description = "default",
parents = {"reptilia"},
}
labels["lelabah dionika"] = {
type = "set",
description = "=[[crab spider]]s, [[flattie]]s, [[ground spider]]s, [[huntsman spider]]s, [[jumping spider]]s, [[scorpion spider]]s, and other [[lelabah]] in the [[entelegyne]] [[clade]] [[Dionycha]]",
parents = {"lelabah"},
}
labels["Diptera"] = {
type = "set",
description = "=[[fly|flies]], [[gnat]]s, [[midge]]s, [[mosquito]]s and other [[insect]]s in the order [[Diptera]]",
parents = {"serangga"},
}
labels["anjing"] = {
type = "set",
description = "default",
parents = {"kanid"},
commonscat = true,
wpcat = true,
}
labels["domestic cats"] = {
type = "set",
description = "default",
parents = {"kucing"},
}
labels["dragonflies and damselflies"] = {
type = "set",
description = "=serangga in the order [[Odonata]]",
parents = {"serangga"},
}
labels["itik"] = {
type = "set",
description = "default",
parents = {"anatid", "poltri"},
}
labels["dugongs and manatees"] = {
type = "set",
description = "=[[mammal]]s in the order [[Sirenia]]",
parents = {"mamalia"},
}
labels["eagles"] = {
type = "set",
description = "default",
parents = {"burung pemangsa"},
}
labels["earthworms"] = {
type = "set",
description = "=worms in the [[annelid]] [[suborder]] [[Lumbricina]]",
parents = {"annelids"},
}
labels["earwigs"] = {
type = "set",
description = "=serangga in the order [[Dermaptera]]",
parents = {"serangga"},
}
labels["ekinoderma"] = {
type = "set",
description = "default",
parents = {"haiwan"},
commonscat = "Echinodermata",
wpcat = true,
}
labels["belut"] = {
type = "set",
description = "=[[eel]]s, elongated, snakelike fish in the order [[Anguilliformes]]",
parents = {"ikan elopomorf"},
}
labels["ular elapid"] = {
type = "set",
description = "=[[cobra]]s, [[coral snake]]s, [[krait]]s, [[mamba]]s, [[sea snake]]s, and other [[venomous]] ular in the family [[Elapidae]]",
parents = {"ular"},
}
labels["elateroid beetles"] = {
type = "set",
description = "=[[click beetle]]s/[[elaterid]]s, [[fire beetle]]s, [[firefly|fireflies]]/[[lampyrid]]s, [[glowworm]]s, [[net-winged beetle]]s/[[lycid]]s, [[railroad worm]]s/[[phengodid]]s, [[soldier beetle]]s/[[cantharid]]s, [[throscid]]s, [[wireworm]]s and other [[beetle]]s in the [[superfamily]] [[Elateroidea]]",
parents = {"beetles"},
}
labels["elephants"] = {
type = "set",
description = "default",
parents = {"mamalia"},
commonscat = "Elephantidae",
wpcat = true,
}
labels["ikan elopomorf"] = {
type = "set",
description = "=[[bonefish]], [[eel]]s, [[gulper eel]]s, [[halosaur]]s, [[ladyfish]], [[tarpon]] and other fish in the [[superorder]] [[Elopomorpha]]",
parents = {"ikan"},
}
labels["emberizids"] = {
type = "set",
description = "=[[bunting]]s, [[yellowhammer]]s and related birds in the [[passerine]] family [[Emberizidae]]",
additional = "<u>Note</u>: for New World species that were formerly classified in this family, see [[:Category:{{{langcode}}}:New World sparrows]].",
parents = {"burung tenggek"},
}
labels["emydid turtles"] = {
type = "set",
description = "=(North American) [[box turtle]]s, [[chicken turtle]]s, [[cooter]]s, [[ellachick]]s, [[pond turtle]]s, [[slider]]s, [[terrapin]]s, and other [[turtle]]s in the [[family]] [[Emydidae]]",
parents = {"turtles"},
}
labels["Equidae"] = {
type = "set",
description = "default",
parents = {"ungulat kuku ganjil"},
}
labels["erinaceids"] = {
type = "set",
description = "=[[erinaceid]]s – hedgehogs and relatives",
parents = {"mamalia"},
}
labels["euplerids"] = {
type = "set",
description = "=[[euplerid]]s — mongoose-like mammals found in Madagascar",
parents = {"karnivor"},
}
labels["ungulat kuku genap"] = {
type = "set",
description = "=[[mammal]]s in the [[order]] [[Artiodactyla]]",
parents = {"mamalia"},
}
labels["falconids"] = {
type = "set",
description = "=[[caracara]]s, [[falcon]]s, [[hobby|hobbies]], [[kestrel]]s, [[lanner]]s, [[merlin]]s, [[saker]]s, and other birds in the [[family]] [[Falconidae]]",
parents = {"burung pemangsa"},
}
labels["felids"] = {
type = "set",
description = "default",
parents = {"karnivor"},
}
labels["female haiwan"] = {
type = "set",
description = "default",
parents = {"haiwan", "female"},
}
labels["ikan"] = {
type = "set",
description = "default",
parents = {"vertebrat"},
commonscat = true,
wpcat = true,
}
labels["flamingos"] = {
type = "set",
description = "default",
parents = {"burung air tawar"},
}
labels["flatfish"] = {
type = "set",
description = "=[[sole]]s, [[flounder]]s, [[halibut]]s and other fish in the order [[Pleuronectiformes]]",
parents = {"ikan"},
}
labels["flatworms"] = {
type = "set",
description = "=[[fluke]]s, [[monogenean]]s, [[planarian]]s, [[polyclad]]s, [[tapeworm]]s, and other haiwan in the [[filum]] [[Platyhelminthes]]",
additional = "For terms related to the study of [[parasitic]] [[worm#Noun|worms]], see [[:Category:Helminthology]] and its subcategories.",
parents = {"cacing"},
}
labels["fleas"] = {
type = "set",
description = "default",
parents = {"serangga"},
}
labels["unggas"] = {
type = "set",
description = "=[[fowl]]s: land birds in the [[order]] [[Galliformes]]",
parents = {"burung"},
}
labels["foxes"] = {
type = "set",
description = "default",
parents = {"kanid"},
}
labels["burung air tawar"] = {
type = "set",
description = "=birds that live mainly in [[freshwater]] areas, including [[estuaries]]",
parents = {"burung"},
}
labels["freshwater whitefish"] = {
type = "set",
description = "=[[cisco]]s, [[houting]]s, [[inconnu]]s, [[lavaret]]s, [[marena]]s, [[omul]]s, [[Otsego bass]], [[peled]]s, [[pollan]]s, [[roundfish]], [[tullibee]]s, [[vendace]]s, [[whitefish]] and other fish in the [[salmonid]] [[subfamily]] [[Coregoninae]]",
parents = {"salmonids"},
}
labels["frogs"] = {
type = "set",
description = "default",
parents = {"anurans"},
}
labels["gadiforms"] = {
type = "set",
description = "=[[cod]], [[haddock]], [[hake]] and other fish in the [[order]] [[Gadiformes]]",
parents = {"ikan"},
}
labels["ikan gasterosteiform"] = {
type = "set",
description = "=[[stickleback]]s, [[hypoptychid]] [[sand eel]]s, [[tubesnout]]s and other fish in the [[order]] [[Gasterosteiformes]]",
additional = "Note: See [[:Category:Ikan singnatiform]] for a group formerly included within this order.",
parents = {"ikan"},
}
labels["gastropod"] = {
type = "set",
description = "default",
parents = {"moluska"},
}
labels["geckos"] = {
type = "set",
description = "=[[lizard]]s in the [[infraorder]] [[Gekkota]], except for the [[legless lizards]] or [[pygopod]]s",
parents = {"lizards"},
}
labels["angsa"] = {
type = "set",
description = "default",
parents = {"anatid", "poltri"},
}
labels["geometrid moths"] = {
type = "set",
description = "=[[carpet]]s, [[engrailed]]s, [[heath]]s, [[pug]]s, [[peppered moth]]s, [[streak]]s, [[wave]]s and other [[moth]]s in the [[family]] [[Geometridae]], most of which have [[caterpillar]]s known as [[inchworm]]s, [[looper]]s, [[measuring worm]]s or [[spanworm]]s",
parents = {"moths"},
}
labels["goats"] = {
type = "set",
description = "default",
parents = {"caprines", "ternakan"},
}
labels["gobies"] = {
type = "set",
description = "=[[goby|gobies]], [[dartfish]], [[mudskipper]]s, [[sea gudgeon]]s, [[sleeper]]s, [[wormfish]], and other [[fish]] in the [[perciform]] [[suborder]] [[Gobioidei]]",
parents = {"ikan"},
}
labels["gossamer-winged butterflies"] = {
type = "set",
description = "=[[blue]]s, [[copper]]s, [[elfin]]s, [[harvester]]s, [[hairstreak]]s, [[sunbeam]]s and other [[butterfly|butterflies]] in the [[family]] [[Lycaenidae]]",
parents = {"butterflies"},
}
labels["grebes"] = {
type = "set",
description = "default",
parents = {"burung air tawar"},
}
labels["grouse"] = {
type = "set",
description = "=[[blackcock]]s, [[capercaillie]]s, [[grouse]], [[moorcock]]s, [[prairie chicken]]s, [[ptarmigan]]s, [[sagehen]]s, and other birds in the [[phasianid]] [[subfamily]] [[Tetraoninae]]",
parents = {"unggas"},
}
labels["gruiforms"] = {
type = "set",
description = "=[[coot]]s, [[crake]]s, [[crane]]s, [[finfoot]]s, [[flufftail]]s, [[gallinule]]s, [[limpkin]]s, [[rail]]s, [[sungrebe]]s, [[trumpeter]]s, and other birds in the [[order]] [[Gruiformes]]",
parents = {"burung air tawar"},
}
labels["gulls"] = {
type = "set",
description = "=[[gull]]s, [[seabird]]s in the [[family]] [[Laridae]]",
parents = {"burung laut"},
}
labels["anjing pemburu"] = {
type = "set",
description = "default",
parents = {"hunting dogs"},
}
labels["hares"] = {
type = "set",
description = "default",
parents = {"lagomorphs"},
}
labels["hemipterans"] = {
type = "set",
description = "=[[aphid]]s, [[leafhopper]]s, [[scale insect]]s, [[true bug]]s, [[whitefly|whiteflies]], and other [[insect]]s in the order [[Hemiptera]]",
parents = {"serangga"},
}
labels["herding dogs"] = {
type = "set",
description = "default",
parents = {"pastoral dogs"},
}
labels["herons"] = {
type = "set",
description = "=[[heron]]s, [[bittern]]s and [[egret]]s",
parents = {"burung air tawar"},
}
labels["herpestids"] = {
type = "set",
description = "=[[herpestid]]s: mongooses, meerkats, and relatives",
parents = {"karnivor"},
}
labels["herrings"] = {
type = "set",
description = "=[[herring]]s, [[shad]]s, [[sardine]]s and other fish in the family [[Clupeidae]]",
parents = {"ikan", "ikan otosefalan"},
}
labels["ikan holostean"] = {
type = "set",
description = "=[[gar]]s and [[bowfin]]s, primitive fish in the [[infraclass]] [[Holostei]]",
parents = {"ikan"},
}
labels["hominid"] = {
type = "set",
description = "default",
parents = {"primat"},
}
labels["honeyeaters"] = {
type = "set",
description = "=Australian [[chat]]s, [[bellbird]]s, [[friarbird]]s, [[gibberbird]]s, [[honeyeater]]s, [[miner]]s, [[spinebill]]s, [[wattlebird]]s, and other birds in the [[family]] [[Meliphagidae]]",
parents = {"meliphagoid birds"},
}
labels["hoopoes and hornbills"] = {
type = "set",
description = "=[[hoopoe]]s, [[woodhoopoe]]s (including [[scimitarbill]]s), [[hornbill]]s, [[ground hornbill]]s, and other birds in the taxonomic order [[Bucerotiformes]]",
parents = {"burung"},
}
labels["horseflies"] = {
type = "set",
description = "=[[blind-fly|blind-flies]], [[breezefly|breezeflies]], [[cleg]]s, [[deerfly|deerflies]], [[forest fly|forest flies]], [[gadfly|gadflies]], [[horsefly|horseflies]], [[oxfly|oxflies]], [[zimb]]s, and other biting flies in the [[family]] [[Tabanidae]]",
parents = {"Diptera"},
}
labels["horse breeds"] = {
type = "set",
description = "default",
parents = {"kuda"},
commonscat = true,
wpcat = true,
}
labels["kuda"] = {
type = "set",
description = "default",
parents = {"Equidae", "ternakan"},
}
labels["hummingbirds"] = {
type = "set",
description = "default",
parents = {"apodiforms"},
}
labels["hunting dogs"] = {
type = "set",
description = "default",
parents = {"anjing"},
}
labels["hyaenids"] = {
type = "set",
description = "default",
parents = {"karnivor"},
}
labels["hydrozoans"] = {
type = "set",
description = "=[[bluebottle]]s, [[calycophoran]]s, [[filiferan]]s, [[hydra]]s, [[hydractinian]]s, [[leptothecate]]s, [[narcomedusa]]s, [[pandeid]]s, [[physonect]]s, [[plumularian]]s, [[Portuguese man-of-war]]s, [[siphonophore]]s, [[stylaster]]s, [[sea fir]]s, [[sea ginger]], [[trachylid]]s, [[trachymedusa]]s, and other haiwan in the [[cnidarian]] [[class]] [[Hydrozoa]]",
parents = {"knidaria"},
}
labels["Hymenoptera"] = {
type = "set",
description = "=[[semut]], [[lebah]], [[penyengat]] dan serangga lain dalam order [[Hymenoptera]]",
parents = {"serangga"},
}
labels["hyraxes"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["ibises and spoonbills"] = {
type = "set",
description = "=[[ibis]]es and [[spoonbill]]s",
parents = {"burung air tawar"},
}
labels["ichthyosauromorphs"] = {
type = "set",
description = "=[[ichthyosaurs]] and related groups of [[extinct]] [[aquatic]] [[reptile]]s in the [[clade]] [[Ichthyosauromorpha]]",
parents = {"reptilia"},
}
labels["icterids"] = {
type = "set",
description = "=birds in the [[New World]] [[passerine]] family [[Icteridae]]",
parents = {"burung tenggek"},
}
labels["iguanoid lizards"] = {
type = "set",
description = "=[[anole]]s, [[basilisk]]s, [[collared lizard]]s, [[chuckwalla]]s, [[fence lizard]]s, [[fringe-toed lizard]]s, [[horned lizard]]s, [[iguana]]s, [[leopard lizard]]s, [[side-blotched lizard]]s, [[zebra-tailed lizard]]s and other [[lizard]]s formerly included in the [[family]] [[Iguanidae]], and now mostly treated as comprising either the [[infraorder]] [[Pleurodonta]] or the [[superfamily]] [[Iguanoidea]]",
parents = {"lizards"},
}
labels["serangga"] = {
type = "set",
description = "default",
parents = {"artropod"},
}
labels["isopods"] = {
type = "set",
description = "=[[gribble]]s, [[pillbug]]s, [[salve bug]]s, [[slater]]s, [[sea slater]]s, [[sowbug]]s, [[woodlouse|woodlice]], and other [[crustacean]]s in the [[order]] [[Isopoda]]",
parents = {"krustasea"},
}
labels["jackfish"] = {
type = "set",
description = "=[[jack]]s, [[pompano]]s, [[jack mackerel]]s, [[scad]]s and other fish in the family [[Carangidae]]",
parents = {"ikan perkoid"},
}
labels["ikan tanpa rahang"] = {
type = "set",
description = "=[[lamprey]]s and [[hagfish]]: primitive eel-like fishes that have no jaws",
parents = {"ikan"},
}
labels["kingfishers"] = {
type = "set",
description = "default",
parents = {"coraciiforms"},
}
labels["kites (birds)"] = {
type = "set",
description = "=[[hawk]]s in the [[accipitrid]] [[subfamily|subfamilies]] [[Milvinae]] and [[Elaninae]], as well as some in the subfamily [[Perninae]]",
parents = {"burung pemangsa"},
}
labels["ikan kifosid"] = {
type = "set",
description = "=[[blackfish]], [[drummer]]s, [[footballer]]s, [[greenfish]], [[halfmoon]]s, [[luderick]]s, [[mado]]s, [[moonlighter]]s, [[nibbler]]s, [[opaleye]]s, [[sea chub]]s, [[stripey]]s, [[sweep]]s and other fish in the [[percoid]] [[family]] [[Kyphosidae]]",
parents = {"ikan perkoid"},
}
labels["ikan labroid"] = {
type = "set",
description = "=[[anemonefish]], [[cale]]s, [[cichlid]]s, [[clownfish]], [[damselfish]], [[parrotfish]], [[surfperch]], [[wrasse]]s, and other fish in the [[perciform]] [[suborder]] [[Labroidei]]",
parents = {"ikan"},
}
labels["ikan labirin"] = {
type = "set",
description = "=[[climbing perch]], [[gourami]]s, [[paradisefish]], [[Siamese fighting fish]] and other fish in the [[suborder]] [[Anabantoidei]]",
parents = {"ikan"},
}
labels["lacertoid lizards"] = {
type = "set",
description = "=[[amphisbaena]]s, [[caiman lizard]]s, [[green lizard]]s, [[ocellated lizard]]s, [[racerunner]]s, [[rock lizard]]s, [[tegu]]s, [[teiid]]s, [[thunderworm]]s, [[viviparous lizard]]s, [[wall lizard]]s, [[whiptail]]s, and other [[lizard]]s in the [[superfamily]] [[Lacertoidea]]",
parents = {"lizards"},
}
labels["lagomorphs"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["lamniform sharks"] = {
type = "set",
description = "=[[basking shark]]s, [[goblin shark]]s, [[great white shark]]s, [[mako shark]]s, [[megamouth shark]]s, [[porbeagle]]s, [[sand shark]]s, [[thresher shark]]s, and other [[shark]]s in the [[order]] [[Lamniformes]]",
parents = {"jerung"},
}
labels["ikan lampriform"] = {
type = "set",
description = "=[[crestfish]], [[oarfish]], [[opah]]s, [[ribbonfish]], [[velifer]]s and other fish in the [[order]] [[Lampridiformes]] (not to be confused with the unrelated [[lamprey]]s)",
parents = {"ikan"},
}
labels["larks"] = {
type = "set",
description = "default",
parents = {"burung tenggek"},
}
labels["laughingthrushes"] = {
type = "set",
description = "=birds in the [[family]] [[Leiothrichidae]]",
parents = {"burung tenggek"},
}
labels["leaf warblers"] = {
type = "set",
description = "=birds in the family [[Phylloscopidae]]",
parents = {"warblers"},
}
labels["kera kecil"] = {
type = "set",
description = "=[[gibbon]]s (including [[hoolock]]s, [[lar gibbon]]s, [[wow-wow]]s, etc.) and [[siamang]]s, comprising the [[family]] [[Hylobatidae]], which is closely related to the [[hominid]]s",
parents = {"primat"},
}
labels["ikan leusisin"] = {
type = "set",
description = "=[[bream]]s, [[chub]]s, [[dace]]s, [[ide]]s, many [[minnow]]s, [[nase]]s, [[roach]]es, [[shiner]]s, [[ziege]]s, and other fish in the [[cyprinid]] [[subfamily]] [[Leuciscinae]], sometimes treated as the [[family]] [[Leuciscidae]], or as the [[tribe]] [[Leuciscini]] within the [[subfamily]] [[Cyprininae]]",
parents = {"cyprinids"},
}
labels["libellulid dragonflies"] = {
type = "set",
description = "=[[amberwing]]s, [[basker]]s, [[darter]]s, [[dropwing]]s, [[duskhawk]]s, [[flutterer]]s, [[glider]]s, [[meadowhawk]]s, [[pennant]]s, [[percher]]s, [[skimmer]]s, [[slimwing]]s, [[swampdragon]]s, [[twister]]s, and other [[dragonfly|dragonflies]] in the [[family]] [[Libellulidae]]",
parents = {"dragonflies and damselflies"},
}
labels["lice"] = {
type = "set",
description = "=[[parasitic]] serangga in the [[order]] [[Psocodea]]",
parents = {"serangga"},
}
labels["limenitidine butterflies"] = {
type = "set",
description = "=[[admiral]]s, [[clipper]]s, [[count]]s, [[duke]]s, [[purple]]s, [[sister]]s, and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Limenitidinae]]",
parents = {"nymphalid butterflies"},
}
labels["littorinimorphs"] = {
type = "set",
description = "=[[boat shell]]s, [[carrier shell]]s, [[conch]]s, [[cowry|cowries]], [[flamingo tongue]]s, [[helmet shell]]s, [[moon snail]]s, [[pebblesnail]]s, [[trumpet shell]]s, [[velutinid]]s, [[winkle]]s, [[worm-shell]]s, and other [[gastropod]]s in the [[order]] [[Littorinimorpha]]",
parents = {"gastropod"},
}
labels["livestock guardian dogs"] = {
type = "set",
description = "default",
parents = {"pastoral dogs"},
}
labels["lizards"] = {
type = "set",
description = "default",
parents = {"reptilia"},
}
labels["loaches"] = {
type = "set",
description = "=fish in the [[cypriniform]] [[superfamily]] [[Cobitoidea]]",
parents = {"ikan", "ikan otosefalan"},
}
labels["ikan sirip lobus"] = {
type = "set",
description = "=[[coelacanth]]s, [[lungfish]] and other fishes in the [[subclass]] [[Sarcopterygii]] of the [[bony fish]]es",
additional = "<u>Please note</u>: although the [[tetrapod]]s (including all [[reptile]]s, [[amphibian]]s, [[bird]]s and [[mammal]]s) are descended from within this group, they are excluded from this category by not being fish.",
parents = {"ikan"},
}
labels["loons"] = {
type = "set",
description = "=[[loon]]s, birds known as [[diver]]s outside the US",
parents = {"burung air tawar"},
}
labels["macaques"] = {
type = "set",
description = "=[[Barbary ape]]s, [[bonnet monkey]]s, [[crab-eating macaque]]s, [[Japanese macaque]]s, [[moor macaque]]s, [[pigtail macaque]]s, [[rhesus monkey]]s, [[toque]]s, and other [[Old World monkey]]s in the [[genus]] ''[[Macaca]]''",
parents = {"monyet dunia lama"},
}
labels["macropods"] = {
type = "set",
description = "=[[bettong]]s, [[kangaroo]]s, [[pademelon]]s, [[potoroo]]s, [[quokka]]s, [[wallaby|wallabies]], and other [[marsupial]]s in the [[diprotodont]] [[suborder]] [[Macropodiformes]]",
parents = {"marsupials"},
}
labels["malaconotoid birds"] = {
type = "set",
description = "=[[Australian magpie]]s, [[bushshrike]]s, [[butcherbird]]s, [[boubou]]s, [[brubru]]s, [[currawong]]s, [[gonolek]]s, [[squeaker]]s, [[vanga]]s, and other birds in the [[passerine]] [[superfamily]] [[Malaconotoidea]]",
parents = {"burung tenggek"},
}
labels["male haiwan"] = {
type = "set",
description = "default",
parents = {"haiwan", "male"},
}
labels["mamalia"] = {
type = "set",
description = "default",
parents = {"vertebrat"},
}
labels["mantids"] = {
type = "set",
description = "=serangga in the [[order]] [[Mantodea]], often known as [[praying mantis]]es",
parents = {"serangga"},
}
labels["marsupials"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["mayflies"] = {
type = "set",
description = "=serangga in the [[order]] [[Ephemeroptera]]",
parents = {"serangga"},
}
labels["megalopterans"] = {
type = "set",
description = "=[[alderfly|alderflies]], [[dobsonfly|dobsonflies]], [[fishfly|fishflies]] and other serangga in the [[order]] [[Megaloptera]]",
parents = {"serangga"},
}
labels["meliphagoid birds"] = {
type = "set",
description = "=[[blue wren]]s, [[bristlebird]]s, [[emu-wren]]s, [[fairywren]]s, [[gerygone]]s, [[grasswren]]s, [[honeyeater]]s, [[pardalote]]s, [[pilotbird]]s, [[redthroat]]s, [[scrubwren]]s, [[thornbill]]s, [[weebill]]s, [[whiteface]]s, and other birds in the [[passerine]] [[superfamily]] [[Meliphagoidea]]",
parents = {"burung tenggek"},
}
labels["mephitids"] = {
type = "set",
description = "=[[mephitid]]s: skunks and stink badgers",
parents = {"karnivor"},
}
labels["mergansers"] = {
type = "set",
description = "=[[diving]] [[duck]]s in the [[genus]] ''[[Mergus]]'' and a few similar species",
parents = {"itik"},
}
labels["mimids"] = {
type = "set",
description = "=[[catbird]]s, [[mockingbird]]s, [[thrasher]]s and other birds in the [[passerine]] family [[Mimidae]]",
parents = {"burung tenggek"},
}
labels["mites and ticks"] = {
type = "set",
description = "=[[arachnid]]s in the [[subclass]] [[Acari]]",
parents = {"araknid"},
}
labels["moluska"] = {
type = "set",
description = "default",
parents = {"haiwan"},
commonscat = "Mollusca",
wpcat = "Molluscs",
}
labels["monyet"] = {
type = "set",
description = "default",
parents = {"primat"},
}
labels["monotremes"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["nyamuk"] = {
type = "set",
description = "=[[insect]]s in the [[dipteran]] [[family]] [[Culicidae]]",
parents = {"Culicomorpha"},
}
labels["moths"] = {
type = "set",
description = "default",
parents = {"serangga"},
}
labels["murids"] = {
type = "set",
description = "=a number of [[rats]], [[mice]], and other [[rodent]]s in the [[Old World]] [[family]] [[Muridae]]",
parents = {"rodensia"},
}
labels["muscicapids"] = {
type = "set",
description = "=birds in the [[passerine]] family [[Muscicapidae]]",
parents = {"burung tenggek"},
}
labels["muscoid flies"] = {
type = "set",
description = "=[[anthomyiid]]s such as [[root fly|root flies]], [[cabbage fly|cabbage flies]] and [[onion fly|onion flies]]; [[fanniid]]s; [[muscid]]s such as [[housefly|houseflies]], [[face fly|face flies]] and [[stable fly|stable flies]]; [[scathophagid]]s such as [[dungfly|dungflies]]; and other [[fly|flies]] in the [[dipteran]] [[superfamily]] [[Muscoidea]]",
parents = {"Diptera"},
}
labels["mustelids"] = {
type = "set",
description = "default",
parents = {"karnivor"},
}
labels["lelabah migalomorf"] = {
type = "set",
description = "=[[baboon spider]]s, [[barking spider]]s, [[bird spider]]s, [[purseweb spider]]s, [[tarantula]]s, [[trapdoor spider]]s, and other [[spider]]s in the [[infraorder]] [[Mygalomorphae]]",
parents = {"lelabah"},
}
labels["myriapods"] = {
type = "set",
description = "=[[centipede]]s, [[millipede]]s, [[pauropod]]s, [[symphylan]]s, and other [[arthropod]]s in the [[subfilum]] [[Myriapoda]]",
parents = {"artropod"},
}
labels["myrmicine ants"] = {
type = "set",
description = "=[[ant]]s in the [[subfamily]] [[Myrmicinae]]",
parents = {"ants"},
}
labels["nematodes"] = {
type = "set",
description = "=[[filaria]], [[gapeworm]]s, [[lungworm]]s, [[pinworm]]s, [[threadworm]]s, [[wheatworm]]s, [[whipworm]]s and other [[worm]]s in the [[filum]] [[Nematoda]]",
parents = {"cacing"},
}
labels["neogastropod"] = {
type = "set",
description = "=[[admiral shell]]s, [[cone snail]]s, [[harp shell]]s, [[murex]]es, [[olive]]s, [[rhombus]]es, [[spindle]]s, [[tulip shell]]s, [[turnip shell]]s, [[volute]]s, [[whelk]]s, [[winkle]]s and other [[gastropod]]s in the [[clade]] [[Neogastropoda]] (treated as an [[order]] in some classifications)",
parents = {"gastropod"},
}
labels["monyet dunia baharu"] = {
type = "set",
description = "=[[capuchin]]s, [[howler monkey]]s, [[marmoset]]s, [[night monkey]]s, [[saki]]s, [[spider monkey]]s, [[squirrel monkey]]s, [[tamarin]]s, [[titi]]s, [[uakari]]s, [[woolly monkey]]s, and other [[monkey]]s in the [[parvorder]] [[Platyrrhini]]",
parents = {"monyet"},
}
labels["New World quails"] = {
type = "set",
description = "=birds in the [[family]] [[Odontophoridae]], most of which live in the [[New World]] and are known as [[quail]]s, but the family also includes the African [[genus]] ''[[Ptilopachus]]'' and some [[species]] are known as partridges",
parents = {"unggas"},
}
labels["New World sparrows"] = {
type = "set",
description = "=[[sparrow]]- and [[finch]]-like birds in the [[passerine]] [[family]] [[Passerellidae]], until recently considered part of the family [[Emberizidae]]",
parents = {"burung tenggek"},
}
labels["New World warblers"] = {
type = "set",
description = "=birds in the family [[Parulidae]]",
parents = {"warblers"},
}
labels["neuropterans"] = {
type = "set",
description = "=[[antlion]]s, [[lacewing]]s, [[mantisfly|mantisflies]], [[owlfly|owlflies]] and other serangga in the [[order]] [[Neuroptera]]",
parents = {"serangga"},
}
labels["newts"] = {
type = "set",
description = "=[[terrestrial]] [[salamander]]s in the [[subfamily]] [[Pleurodelinae]]",
parents = {"salamanders"},
}
labels["noctuoid moths"] = {
type = "set",
description = "=[[armyworm]]s, [[cinnabar]]s, [[corn earworm]]s, [[cutworm]]s, [[gypsy moth]]s, [[owlet moth]]s, [[processionary|processionaries]], [[tiger moth]]s, [[underwing]]s, [[wainscot]]s, [[wooly bear]]s, and many other [[moth]]s (and [[caterpillar]]s) in the [[superfamily]] [[Noctuoidea]]",
parents = {"moths"},
}
labels["nudibranchs"] = {
type = "set",
description = "=[[sea slug]]s in the [[gastropod]] [[order]] [[Nudibranchia]]",
parents = {"gastropod"},
}
labels["nymphalid butterflies"] = {
type = "set",
description = "=[[admiral]]s, [[brown]]s, [[buckeye]]s, [[checkerspot]]s, [[emperor]]s, [[fritillary|fritillaries]], [[leafwing]]s, [[longwing]]s, [[monarch]]s, [[morpho]]s, [[painted lady|painted ladies]], [[ringlet]]s, [[satyr]]s, [[sister]]s, [[snout]]s, [[tortoiseshell]]s, and other butterflies in the [[family]] [[Nymphalidae]]",
parents = {"butterflies"},
}
labels["kurita"] = {
type = "set",
description = "default",
parents = {"sefalopod"},
}
labels["ungulat kuku ganjil"] = {
type = "set",
description = "=[[mammal]]s in the [[order]] [[Perissodactyla]], including the [[equid]]s, [[tapir]]s and [[rhinoceros]]es",
parents = {"mamalia"},
}
labels["oestroid flies"] = {
type = "set",
description = "=[[blowfly|blowflies]], [[bluebottle]]s, [[botfly|botflies]], [[flesh fly|flesh flies]], [[greenbottle]]s, [[mango fly|mango flies]], [[screwworm]]s, [[tachinid]]s, [[torsalo]]s, [[tumbu fly|tumbu flies]], [[warble fly|warble flies]], and other flies in the [[superfamily]] [[Oestroidea]]",
parents = {"Diptera"},
}
labels["monyet dunia lama"] = {
type = "set",
description = "=[[baboon]]s, [[colobus]], [[douc]]s, [[gelada]]s, [[green monkey]]s, [[grivet]]s, [[langur]]s, [[malbrouck]]s, [[mandrill]]s, [[mangabey]]s, [[patas monkey]]s, [[proboscis monkey]]s, [[talapoin]]s, [[vervet]]s, and other [[monkey]]s in the [[family]] [[Cercopithecidae]], the only [[member]]s of the [[parvorder]] [[Catarrhini]] aside from the greater/lesser apes and humans",
parents = {"monyet"},
}
labels["Old World orioles"] = {
type = "set",
description = "=[[perching bird]]s in the [[family]] [[Oriolidae]], which are not closely related to the New World orioles in the family [[Icteridae]]",
parents = {"burung tenggek"},
}
labels["ornithopods"] = {
type = "set",
description = "=[[camptosaurid]]s, [[hadrosaur]]s, [[iguanodontid]]s, [[lambeosaurid]]s, [[rhabdodontid]]s, [[saurolophid]]s, [[thescelosaurid]]s, [[trachodontid]]s, and other [[dinosaur]]s in the [[ornithischian]] [[clade]] [[Ornithopoda]]",
parents = {"dinosaur"},
}
labels["ikan osteoglosomorf"] = {
type = "set",
description = "=[[aba]]s, [[arapaima]]s, [[arowana]]s, [[butterfly fish]], [[elephantfish]], [[featherback]]s, [[mooneye]]s and other fish in the [[superorder]] [[Osteoglossomorpha]]",
parents = {"ikan"},
}
labels["otariid seals"] = {
type = "set",
description = "=[[mammal]]s in the [[family]] [[Otariidae]], including the [[fur seal]]s and [[sea lion]]s",
parents = {"pinnipeds"},
}
labels["burung otidimorf"] = {
type = "set",
description = "=[[bustard]]s in the [[family]] [[Otididae]] and [[order]] [[Otidiformes]]; [[turaco]]s or [[lourie]]s, [[go-away bird]]s, [[plantain-eater]]s, etc., in the [[family]] [[Musophagidae]] and [[order]] [[Musophagiformes]]; and [[cuckoo]]s in the [[family]] [[Cuculidae]] and [[order]] [[Cuculiformes]]; all in the [[clade]] [[Otidimorphae]]",
parents = {"burung"},
}
labels["ikan otosefala"] = {
type = "set",
description = "=[[anchovy|anchovies]], [[beaked salmon]], [[carp]], [[catfish]], [[characin]]s, [[electric eel]]s, [[ghost knifefish]], [[herring]]s, [[loach]]es, [[milkfish]], [[minnow]]s, [[mousefish]], [[slickhead]]s, [[sucker]]s, [[tubeshoulder]]s, and other fish in the [[clade]] [[Otocephala]]",
parents = {"ikan"},
}
labels["ovenbirds"] = {
type = "set",
description = "=birds in the [[suboscine]] family [[Furnariidae]], including the former family Dendrocolaptidae (now the [[subfamily]] [[Dendrocolaptinae]])",
parents = {"suboscines"},
}
labels["owls"] = {
type = "set",
description = "default",
parents = {"burung pemangsa"},
}
labels["pangolins"] = {
type = "set",
description = "=[[mammal]]s in the [[order]] [[Pholidota]]",
parents = {"mamalia"},
}
labels["panthers"] = {
type = "set",
description = "=[[panther]]s in the sense of members of the genus ''[[Panthera]]''",
parents = {"felids"},
}
labels["parrots"] = {
type = "set",
description = "default",
parents = {"burung"},
}
labels["pastoral dogs"] = {
type = "set",
description = "default",
parents = {"anjing"},
}
labels["penguins"] = {
type = "set",
description = "default",
parents = {"burung"},
}
labels["pentatomoid bugs"] = {
type = "set",
description = "=[[acanthosomatid]]s, [[burrowing bug]]s, [[jewel bug]]s, [[shield bug]]s, [[stinkbug]]s, [[thyreocorid]]s, and other [[true bug]]s in the [[superfamily]] [[Pentatomoidea]]",
parents = {"true bugs"},
}
labels["perch and darters"] = {
type = "set",
description = "=fish in the family [[Percidae]]",
parents = {"ikan perkoid"},
}
labels["burung tenggek"] = {
type = "set",
description = "=[[perching bird]]s: members of the [[order]] [[Passeriformes]]",
parents = {"burung"},
}
labels["ikan perkoid"] = {
type = "set",
description = "=[[archerfish]], [[bass]], [[bigeye]]s, [[bluefish]], [[butterflyfish]], [[cardinalfish]], [[cobia]], [[croaker]]s, [[flagtail]]s, [[goatfish]], [[grouper]]s, [[grunt]]s, [[horse mackerel]], [[jack]]s, [[jawfish]], [[leaffish]], [[mahi-mahi]], [[mojarra]], [[perch]], [[pomfret]]s, [[pompano]], [[ponyfish]], [[porgy|porgies]], [[remora]]s, [[roosterfish]], [[sea bass]], [[sea bream]], [[snapper]], [[sunfish]], [[sweeper]]s, [[threadfin]], [[tilefish]], [[wreckfish]], and other [[perciform]] fish in the [[superfamily]] [[Percoidea]]",
parents = {"ikan"},
}
labels["phiomorphs"] = {
type = "set",
description = "=[[blesmol]]s, [[sand mole]]s, [[mole rat]]s, [[dassie rat]]s or [[rock rat]]s, [[Old World porcupine]]s, [[cane rat]]s or [[grasscutter]]s and other [[rodent]]s in the parvorder [[Phiomorpha]], which is the Old World counterpart of the [[caviomorph]]s",
parents = {"rodensia"},
}
labels["phocid seals"] = {
type = "set",
description = "=[[mammal]]s in the [[family]] [[Phocidae]], including the [[earless seal]]s (also known as [[true seal]]s)",
parents = {"pinnipeds"},
}
labels["piciforms"] = {
type = "set",
description = "=[[woodpecker]]s, [[aracari]]s, [[coppersmith]]s, [[honeyguide]]s, [[jacamar]]s, [[nunlet]]s, [[puffbird]]s, [[toucan]]s, and other birds in the [[order]] [[Piciformes]]",
parents = {"burung"},
}
labels["pierid butterflies"] = {
type = "set",
description = "=[[brimstone]]s, [[orange tip]]s, [[sulfur]]s, [[white]]s and other [[butterfly|butterflies]] in the [[family]] [[Pieridae]]",
parents = {"butterflies"},
}
labels["babi"] = {
type = "set",
description = "default",
parents = {"ungulat kuku genap", "ternakan"},
commonscat = "Suidae",
wpcat = true,
}
labels["pikes (fish)"] = {
type = "set",
description = "=fish in the family [[Esocidae]]",
parents = {"ikan"},
}
labels["pinnipeds"] = {
type = "set",
description = "default",
parents = {"karnivor"},
}
labels["pipits and wagtails"] = {
type = "set",
description = "=birds in the [[passerine]] family [[Motacillidae]]",
parents = {"burung tenggek"},
}
labels["placoderms"] = {
type = "set",
description = "=[[extinct]] armored fish of the [[class]] [[Placodermi]] from the [[Silurian]] and [[Devonian]] [[geologic]] [[period]]s",
parents = {"ikan"},
}
labels["plovers and lapwings"] = {
type = "set",
description = "=birds in the [[charadriiform]] [[family]] [[Charadriidae]]",
parents = {"shorebirds"},
}
labels["pomfrets"] = {
type = "set",
description = "=fish in the family [[Bramidae]]",
parents = {"ikan perkoid"},
}
labels["primat"] = {
type = "set",
description = "default",
parents = {"mamalia"},
commonscat = true,
wpcat = true,
}
labels["procyonids"] = {
type = "set",
description = "=[[procyonid]]s: ([[raccoon]]s, [[coati]]s, [[kinkajou]]s, [[olingo]]s, [[ringtail]]s and [[cacomistle]]s)",
parents = {"karnivor"},
}
labels["prosimian"] = {
type = "set",
description = "default",
parents = {"primat"},
}
labels["pterosaurs"] = {
type = "set",
description = "default",
parents = {"reptilia"},
}
labels["pyraloid moths"] = {
type = "set",
description = "=[[bee moth]]s, [[flour moth]]s, [[leaf crumpler]]s, [[magpie moth]]s, [[melonworm]]s, [[mint moth]]s, [[orangeworm]]s, [[pantry moth]]s, [[pickleworm]]s, [[snout moth]]s, [[veneer moth]]s, [[wax moth]]s and other [[crambid]] and [[pyralid]] [[moth]]s in the [[superfamily]] [[Pyraloidea]]",
parents = {"moths"},
}
labels["rabbits"] = {
type = "set",
description = "default",
parents = {"lagomorphs"},
}
labels["rallids"] = {
type = "set",
description = "=[[rallid]]s: [[rail]]s and other birds in the family [[Rallidae]]",
parents = {"gruiforms"},
}
labels["ratites"] = {
type = "set",
description = "=[[ratite]]s: birds in the superorder [[Palaeognathae]], including large flightless birds such as [[ostrich]]es and [[emu]]s, as well as the smaller [[kiwi]]s and [[flighted]] [[tinamou]]s",
parents = {"burung"},
}
labels["rays and skates"] = {
type = "set",
description = "=[[fish]] in the superorder [[Batoidea]]",
parents = {"ikan"},
}
labels["reindeers"] = {
type = "set",
description = "default",
parents = {"cervids"},
}
labels["reptilia"] = {
type = "set",
description = "default",
parents = {"vertebrat"},
commonscat = "Reptilia",
wpcat = true,
}
labels["retrievers"] = {
type = "set",
description = "default",
parents = {"anjing pemburu"},
}
labels["rhinoceroses"] = {
type = "set",
description = "=[[rhinoceros]]es, [[mammal]]s in the [[perissodactylic]] [[family]] [[Rhinocerotidae]]",
parents = {"ungulat kuku ganjil"},
}
labels["rodensia"] = {
type = "set",
description = "default",
parents = {"mamalia"},
}
labels["salamanders"] = {
type = "set",
description = "=[[amphiuma]]s, [[axolotl]]s, [[hellbender]]s, [[mud puppy|mud puppies]], [[olm]]s, [[newt]]s, [[salamander]]s, [[siren]]s, and other [[amphibian]]s in the [[order]] [[Caudata]]",
parents = {"amfibia"},
}
labels["salmonids"] = {
type = "set",
description = "=[[salmon]]s, [[trout]], and other fish in the family [[Salmonidae]]",
parents = {"ikan"},
}
labels["saturniid moths"] = {
type = "set",
description = "=[[Atlas moth]]s, [[cecropia]]s, [[hickory horned devil]]s, [[io moth]]s, [[luna moth]]s, [[polyphemus moth]]s, and other [[moth]]s (and [[caterpillar]]s) in the [[family]] [[Saturniidae]]",
parents = {"moths"},
}
labels["satyrine butterflies"] = {
type = "set",
description = "=[[brown]]s, [[forester]]s, [[grayling]]s, [[heath]]s, [[palmfly|palmflies]], [[ringlet]]s, [[satyr]]s, and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Satyrinae]]",
parents = {"nymphalid butterflies"},
}
labels["sauropod"] = {
type = "set",
description = "=[[apatosaur]]s, [[brachiosaur]]s, [[brontosaur]]s, [[camarasaur]]s, [[cetiosaur]]s, [[diplodocus]]es, [[saltasaurid]]s, [[titanosaurian]]s, [[turiasaur]]s, [[vulcanodontid]]s, and other [[dinosaur]]s in the [[saurischian]] [[infraorder]] [[Sauropoda]]",
parents = {"dinosaur"},
}
labels["sauropterygians"] = {
type = "set",
description = "=[[elasmosaur]]s, [[placodont]]s, [[plesiosaur]]s, and other extinct aquatic [[reptile]]s in the [[superorder]] [[Sauropterygia]]",
parents = {"reptilia"},
}
labels["sawflies and wood wasps"] = {
type = "set",
description = "=[[horntail]]s, [[pigeon tremex]], [[rose slug]]s, [[sawfly|sawflies]], [[wood wasp]]s, and other primitive [[hymenopteran]]s in the [[suborder]] [[Symphyta]]",
parents = {"Hymenoptera"},
}
labels["serangga teritip"] = {
type = "set",
description = "=[[insect]]s in the [[superfamily]] [[Coccoidea]]",
parents = {"hemipterans"},
}
labels["scarabaeoids"] = {
type = "set",
description = "=[[cockchafer]]s, [[dor]]s, [[dung beetle]]s, [[June beetle]]s, [[rain beetle]]s, [[rose chafer]]s, [[scarab]]s, [[stag beetle]]s, and other beetles in the [[superfamily]] [[Scarabaeoidea]]",
parents = {"beetles"},
}
labels["scenthounds"] = {
type = "set",
description = "default",
parents = {"hunting dogs"},
}
labels["scincomorph lizards"] = {
type = "set",
description = "=[[blue-tongue lizard]]s, [[night lizard]]s, [[sandfish]], [[skink]]s, [[sungazer]]s, and other [[lizard]]s in the [[infraorder]] [[Scincomorpha]]",
parents = {"lizards"},
}
labels["scolopacids"] = {
type = "set",
description = "=[[curlew]]s, [[dunlin]]s, [[godwit]]s, [[knot]]s, [[redshank]]s, [[ruff]]s, [[sandpiper]]s, [[snipe]]s, [[stint]]s, [[turnstone]]s, [[tattler]]s, [[whimbrel]]s, [[woodcock]]s, [[yellowleg]]s, and other birds in the [[charadriiform]] [[family]] [[Scolopacidae]]",
parents = {"shorebirds"},
}
labels["scombroids"] = {
type = "set",
description = "=[[mackerel]]s, [[tuna]]s, [[barracuda]]s, [[swordfish]], and other fish in the suborder [[Scombroidei]]",
parents = {"ikan"},
}
labels["ikan skorpaeniform"] = {
type = "set",
description = "=[[bullhead]]s, [[cabezon]], [[golomyanka]], [[greenling]]s, [[gurnard]]s, [[Irish lord]], [[lionfish]], [[lumpsucker]]s, [[pigfish]], [[poacher]]s, [[sablefish]], [[scorpionfish]], [[sculpin]]s, [[sea raven]]s, [[sea toad]]s, [[skilfish]], [[snailfish]], [[stonefish]], [[wingfish]], and other fish in the [[order]] [[Scorpaeniformes]]",
parents = {"ikan"},
}
labels["scorpions"] = {
type = "set",
description = "=true [[scorpion]]s: [[arachnid]]s in the [[order]] [[Scorpiones]]",
parents = {"araknid"},
}
labels["screamers"] = {
type = "set",
description = "=[[screamer]]s: birds in the family [[Anhimidae]], related to [[duck]]s and [[geese]]",
parents = {"burung"},
}
labels["burung laut"] = {
type = "set",
description = "default",
parents = {"burung"},
}
labels["sea anemones"] = {
type = "set",
description = "=[[cnidarian]]s in the [[order]] [[Actiniaria]]",
parents = {"knidaria"},
}
labels["sea cucumbers"] = {
type = "set",
description = "=[[echinoderm]]s in the [[class]] [[Holothuroidea]]",
parents = {"ekinoderma"},
}
labels["sea urchins"] = {
type = "set",
description = "=[[echinoderm]]s in the [[class]] [[Echinoidea]], including the [[sand dollar]]s",
parents = {"ekinoderma"},
}
labels["sea turtles"] = {
type = "set",
description = "=[[flatback]]s, [[green turtle]]s, [[hawksbill]]s, [[leatherback]]s, [[loggerhead]]s, [[ridley]]s, and other [[turtle]]s in the [[superfamily]] [[Chelonioidea]]",
parents = {"turtles"},
}
labels["sebastids"] = {
type = "set",
description = "=fish in the family [[Sebastidae]]",
parents = {"ikan skorpaeniform"},
}
labels["serranids"] = {
type = "set",
description = "=[[sea bass]], [[grouper]]s, [[rockcod]]s, [[comber]]s and other fish in the family [[Serranidae]]",
parents = {"ikan perkoid"},
}
labels["jerung"] = {
type = "set",
description = "default",
parents = {"ikan"},
}
labels["kambing biri-biri"] = {
type = "set",
description = "default",
parents = {"caprines", "ternakan"},
}
labels["shorebirds"] = {
type = "set",
description = "default",
parents = {"burung"},
}
labels["shrikes"] = {
type = "set",
description = "default",
parents = {"burung tenggek", "burung korvoid"},
}
labels["sighthounds"] = {
type = "set",
description = "default",
parents = {"hunting dogs"},
}
labels["skippers"] = {
type = "set",
description = "=insects in the family [[Hesperiidae]]",
parents = {"butterflies"},
}
labels["smelts"] = {
type = "set",
description = "=fish in the [[order]] [[Osmeriformes]]",
parents = {"ikan"},
}
labels["snails"] = {
type = "set",
description = "default",
parents = {"gastropod"},
}
labels["ular"] = {
type = "set",
description = "default",
parents = {"reptilia"},
}
labels["snappers"] = {
type = "set",
description = "=fish in the [[family]] [[Lutjanidae]]",
parents = {"ikan perkoid"},
}
labels["soft corals"] = {
type = "set",
description = "=[[calcaxonian]]s, [[dead man's fingers]], [[fan coral]]s, [[gorgonian]]s, [[holaxonian]]s, [[scleraxonian]]s, [[sea feather]]s, [[sea willow]]s, [[stoloniferan]]s, [[whip coral]]s, and other marine animals in the [[cnidarian]] order [[Alcyonacea]]",
parents = {"knidaria"},
}
labels["soricomorphs"] = {
type = "set",
description = "=[[shrew]]s, [[mole]]s, [[solenodon]]s, and other [[mammal]]s in the [[order]] [[Soricomorpha]]",
parents = {"mamalia"},
}
labels["South American canids"] = {
type = "set",
description = "=fox-like [[canid]]s in the [[subtribe]] [[Cerdocyonina]], which are more closely related to the [[dog]]s and [[wolf|wolves]] than to the true [[fox]]es. Also known as [[zorro]]s",
parents = {"kanid"},
}
labels["spaniels"] = {
type = "set",
description = "default",
parents = {"anjing pemburu"},
}
labels["sparids"] = {
type = "set",
description = "=[[sea bream]]s, [[porgy|porgies]], [[scup]]s and other fish in the family [[Sparidae]]",
parents = {"ikan perkoid"},
}
labels["sphinx moths"] = {
type = "set",
description = "=[[hawkmoth]]s, [[hornworm]]s, [[hummingbird moth]]s, [[sphinx moth]]s, [[tomato worm]]s, and other [[moth]]s (and [[caterpillar]]s) in the [[family]] [[Sphingidae]]",
parents = {"moths"},
}
labels["lelabah"] = {
type = "set",
description = "default",
parents = {"araknid"},
}
labels["sponges"] = {
type = "set",
description = "=[[aquatic]] [[animal]]s in the [[phylum]] [[Porifera]]",
parents = {"haiwan"},
}
labels["squid"] = {
type = "set",
description = "default",
parents = {"sefalopod"},
}
labels["squirrels"] = {
type = "set",
description = "=[[squirrel]]s, [[chipmunk]]s, [[marmot]]s, [[prairie dog]]s, [[woodchuck]]s and other [[rodent]]s in the family [[Sciuridae]]",
parents = {"rodensia"},
}
labels["staphylinoid beetles"] = {
type = "set",
description = "=[[beetle]]s in the [[superfamily]] [[Staphylinoidea]]",
parents = {"beetles"},
}
labels["starlings"] = {
type = "set",
description = "=[[starling]]s, [[mynah]]s, and other birds in the [[passerine]] family [[Sturnidae]]",
parents = {"burung tenggek"},
}
labels["belalang ranting"] = {
type = "set",
description = "=[[insect]]s (including the [[leaf insect]]s) in the [[order]] known as either [[Phasmida]] or [[Phasmatodea]], which are noted for their extreme adaptations in form and color to look like parts of the plants they feed on",
parents = {"serangga"},
}
labels["stoneflies"] = {
type = "set",
description = "=[[freshwater]] [[aquatic]] [[insect]]s in the [[order]] [[Plecoptera]]",
parents = {"serangga"},
}
labels["stony corals"] = {
type = "set",
description = "=marine animals in the [[cnidarian]] order [[Scleractinia]]",
parents = {"knidaria"},
}
labels["storks"] = {
type = "set",
description = "default",
parents = {"burung air tawar"},
}
labels["ikan stromateoid"] = {
type = "set",
description = "=[[barrelfish]], [[blue eye cod]], [[dollarfish]], [[driftfish]], [[lafayette]], [[medusafish]], [[rudderfish]], [[squaretail]], [[warehou]], and other fish in the [[perciform]] [[suborder]] [[Stromateoidei]]",
parents = {"ikan"},
}
labels["sturgeons"] = {
type = "set",
description = "=fish in the family [[Acipenseridae]]",
parents = {"ikan"},
}
labels["suboscines"] = {
type = "set",
description = "=[[antpitta]]s, [[antshrike]]s, [[antthrush]]es, [[asity|asities]], [[broadbill]]s, [[cotinga]]s, [[crescentchest]]s, [[gnateater]]s, [[manakin]]s, [[ovenbird]]s, [[pitta]]s, [[sharpbill]]s, [[spadebill]]s, [[tapaculo]]s, [[tityra]]s, [[tyrant flycatcher]]s, [[woodcreeper]]s, and other birds in the [[passerine]] [[suborder]] [[Tyranni]]",
parents = {"burung tenggek"},
}
labels["suckers (ikan)"] = {
type = "set",
description = "=[[buffalo fish]], [[cuiui]], [[jumprock]]s, [[quillback]], [[redhorse]], [[sucker]]s, and other freshwater fish in the family [[Catostomidae]]",
parents = {"ikan", "ikan otosefala"},
}
labels["suliform birds"] = {
type = "set",
description = "=[[anhinga]]s, [[booby|boobies]], [[cormorant]]s, [[frigatebird]]s, [[gannet]]s, and other [[seabird]]s in the [[order]] [[Suliformes]]",
parents = {"burung laut"},
}
labels["sunfish"] = {
type = "set",
description = "=freshwater fish in the family [[Centrarchidae]]",
parents = {"ikan perkoid"},
}
labels["swallows"] = {
type = "set",
description = "default",
parents = {"burung tenggek"},
}
labels["swallowtails"] = {
type = "set",
description = "=[[apollo]]s, [[batwing]]s, [[birdwing]]s, [[clubtail]]s, [[festoon]]s, [[flying handkerchief]]s, [[Helen]]s, [[jay]]s, [[mime]]s, [[parnassian]]s, [[rose]]s, [[swallowtail]]s, [[swordtail]]s, [[triangle]]s, [[turnus]]es, [[windmill]]s, [[zebra]]s, and other [[butterfly|butterflies]] in the [[family]] [[Papilionidae]], notable for (mostly) having tail-like extensions on their [[hindwing]]s",
parents = {"butterflies"},
}
labels["swan"] = {
type = "set",
description = "default",
parents = {"anatid"},
}
labels["ikan singnatiform"] = {
type = "set",
description = "=[[bellowsfish]], [[cornetfish]], [[pipefish]], [[razorfish]], [[sea dragon]]s, [[sea horse]]s, [[snipefish]], [[trumpetfish]], and other fish in the [[order]] [[Syngnathiformes]]",
parents = {"ikan"},
}
labels["tanagers"] = {
type = "set",
description = "=[[bananaquit]]s, [[conebill]]s, [[dacnis]]es, [[Darwin's finch]]es, [[grassquit]]s, [[ground finch]]es, [[honeycreeper]]s, [[pardusco]]s, [[tanager]]s, and other [[passerine]] birds in the family [[Thraupidae]]",
parents = {"burung tenggek"},
}
labels["temnospondyls"] = {
type = "set",
description = "=[[extinct]] early [[amphibian]]s in the [[order]] [[Temnospondyli]]",
parents = {"amfibia"},
}
labels["tenebrionoid beetles"] = {
type = "set",
description = "=[[aderid]]s, [[anthicid]]s, [[blister beetle]]s, [[borid]]s, [[ciid]]s, [[flour beetle]]s, [[darkling beetle]]s, [[mealworm]]s, [[melandryid]]s, [[mordellid]]s, [[mycetophagid]]s, [[oedemerid]]s, [[pinacate beetle]]s, [[pyrochroid]]s, [[pythid]]s, [[ripiphorid]]s, [[salpingid]]s, [[toktokkie]]s, [[ulodid]]s, [[wharf borer]]s, [[zopherid]]s and other [[beetle]]s in the [[superfamily]] [[Tenebrionoidea]]",
parents = {"beetles"},
}
labels["tephritoid flies"] = {
type = "set",
description = "=[[cheese fly|cheese flies]], [[tephritid]] [[fruit fly|fruit flies]], [[picture-winged fly|picture-winged flies]] and other [[fly|flies]] in the [[dipteran]] [[superfamily]] [[Tephritoidea]]",
parents = {"Diptera"},
}
labels["termites"] = {
type = "set",
description = "=[[termite]]s, [[insect]]s in the former [[order]] [[Isoptera]], which is now considered a [[suborder]] or other group within the [[cockroach]]es in the order [[Blattodea]]",
parents = {"serangga", "cockroaches"},
}
labels["terns"] = {
type = "set",
description = "=[[tern]]s, [[seabird]]s in the [[family]] [[Sternidae]]",
parents = {"burung laut"},
}
labels["tetraodontiforms"] = {
type = "set",
description = "=[[pufferfish]], [[triggerfish]], [[boxfish]], [[ocean sunfish]] and other fish in the order [[Tetraodontiformes]]",
parents = {"ikan"},
}
labels["terriers"] = {
type = "set",
description = "default",
parents = {"hunting dogs"},
}
labels["theropods"] = {
type = "set",
description = "=[[dinosaur]]s in the [[clade]] [[Theropoda]]",
parents = {"dinosaur"},
}
labels["thrushes"] = {
type = "set",
description = "default",
parents = {"burung tenggek"},
}
labels["ticks"] = {
type = "set",
description = "=[[bloodsucking]] [[arachnid]]s in the [[order]] [[Ixodida]] (also known as [[Metastigmata]])",
parents = {"mites and ticks"},
}
labels["tinamous"] = {
type = "set",
description = "default",
parents = {"ratites"},
}
labels["tits"] = {
type = "set",
description = "=[[tit]]s, birds known as [[chickadee]]s in the US",
parents = {"burung tenggek"},
}
labels["toads"] = {
type = "set",
description = "default",
parents = {"anurans"},
}
labels["toothcarps"] = {
type = "set",
description = "=[[four-eyed fish]], [[guppy|guppies]], [[killifish]], [[molly|mollies]], [[mummichog]]s, [[platy|platies]], [[swordtail]]s, [[topminnow]]s and other fish in the [[order]] [[Cyprinodontiformes]]",
parents = {"ikan"},
}
labels["tortoises"] = {
type = "set",
description = "=[[terrestrial]] [[turtle]]s in the [[family]] [[Testudinidae]]",
parents = {"turtles"},
}
labels["tortricid moths"] = {
type = "set",
description = "=[[moth]]s (and [[caterpillar]]s) in the [[family]] [[Tortricidae]]",
parents = {"moths"},
}
labels["ikan trakinoid"] = {
type = "set",
description = "=[[black swallower]]s, [[blue cod]], [[duckbill]]s, [[gaper]]s, [[sand eel]]s, [[torrentfish]], [[weeverfish]] and other fish in the [[perciform]] [[suborder]] [[Trachinoidei]]",
parents = {"ikan"},
}
labels["toy dogs"] = {
type = "set",
description = "default",
parents = {"anjing"},
}
labels["trilobites"] = {
type = "set",
description = "default",
parents = {"artropod"},
}
labels["true bugs"] = {
type = "set",
description = "=[[insect]]s in the [[hemipteran]] suborder [[Heteroptera]]",
parents = {"hemipterans"},
}
labels["true finches"] = {
type = "set",
description = "=[[finch]]es in the [[passerine]] family [[Fringillidae]]",
parents = {"burung tenggek"},
}
labels["true jellyfish"] = {
type = "set",
description = "=[[cnidarian]]s in the [[class]] [[Scyphozoa]]",
parents = {"knidaria"},
}
labels["true sparrows"] = {
type = "set",
description = "=[[passerine]] birds in the family [[Passeridae]] (for other birds called sparrows, see the [[emberizid]]s)",
parents = {"burung tenggek"},
}
labels["tubenose birds"] = {
type = "set",
description = "=[[albatross]]es, [[fulmar]]s, [[petrel]]s, [[prion]]s, [[shearwater]]s, and other [[seabird]]s in the [[order]] [[Procellariiformes]]",
parents = {"burung laut"},
}
labels["tunicates"] = {
type = "set",
description = "default",
parents = {"haiwan"},
}
labels["turtles"] = {
type = "set",
description = "default",
parents = {"reptilia"},
}
labels["tyrant flycatchers"] = {
type = "set",
description = "=[[passerine]] birds in the family [[Tyrannidae]]",
parents = {"suboscines"},
}
labels["ursids"] = {
type = "set",
description = "=[[ursid]]s ([[bear]]s)",
parents = {"karnivor"},
}
labels["Venerida order mollusks"] = {
type = "set",
description = "=[[basket clam]]s, [[bean clam]]s, [[boring clam]]s, [[cockle]]s, [[duck clam]]s, [[giant clam]]s, [[hard clam]]s, [[lentil shell]]s, [[pipi]]s, [[pooquaw]]s, [[quahog]]s, [[surf clam]]s, [[trough-shell]]s, [[ugari]]s, [[Venus clam]]s, [[zebra mussel]]s, and other [[bivalve]]s in the [[order]] [[Venerida]]",
parents = {"bivalvia"},
}
labels["vertebrat"] = {
type = "set",
description = "default",
parents = {"kordata"},
}
labels["vespids"] = {
type = "set",
description = "=[[hornet]]s, [[paper wasp]]s, [[pollen wasp]]s, [[potter wasp]]s, [[yellow jacket]]s, and other [[wasp]]s in the [[family]] [[Vespidae]]",
parents = {"Hymenoptera"},
}
labels["vetigastropod"] = {
type = "set",
description = "=[[abalone]]s or [[ear shell]]s, [[duck's-bill limpet]]s, [[keyhole limpet]]s, [[rosary shell]]s, [[slit-shell]]s, [[topshell]]s, [[turban shell]]s, and other [[gastropod]]s in the [[clade]] [[Vetigastropoda]] (treated in some classifications as an [[order]], in others as [[subclass]])",
parents = {"gastropod"},
}
labels["vipers"] = {
type = "set",
description = "=[[adder]]s, [[asp]]s, [[rattlesnake]]s, [[viper]]s, [[water moccasin]]s and other [[venomous]] snakes in the family [[Viperidae]]",
parents = {"ular"},
}
labels["viverrids"] = {
type = "set",
description = "=[[viverrid]]s ([[civet]]s, [[genet]]s and relatives)",
parents = {"karnivor"},
}
labels["vombatiforms"] = {
type = "set",
description = "=[[diprotodontid]]s, [[diprotodon]]s, [[phascolarctid]]s, [[koala]]s, [[vombatid]]s, [[wombat]]s, [[phascolome]]s, [[ilariid]]s, [[maradid]]s, [[palorchestid]]s, [[thylacoleonid]]s, [[marsupial lion]]s, [[wynyardiid]]s and other [[marsupial]]s in the [[diprotodont]] [[suborder]] [[Vombatiformes]]",
parents = {"marsupials"},
}
labels["vultures"] = {
type = "set",
description = "=[[vulture]]s (both Old World and New World)",
parents = {"burung pemangsa"},
}
labels["warblers"] = {
type = "set",
description = "=[[warbler]]s, various small [[passerine]] songbirds, especially of the families Sylviidae (Old World warblers) and Parulidae (New World warblers)",
parents = {"burung tenggek"},
}
labels["warren hounds"] = {
type = "set",
description = "default",
parents = {"hunting dogs"},
}
labels["water dogs"] = {
type = "set",
description = "default",
parents = {"retrievers"},
}
labels["weaver finches"] = {
type = "set",
description = "=[[finch]]es in the family [[Estrildidae]]",
parents = {"burung tenggek"},
}
labels["weaverbirds"] = {
type = "set",
description = "=[[baya]]s, [[bishop]]s, [[fody|fodies]], [[malimbe]]s, [[quelea]]s, [[sakabula]]s, [[taha]]s, [[weaver]]s, and other birds in the [[family]] [[Ploceidae]]",
parents = {"burung tenggek"},
}
labels["weevils"] = {
type = "set",
description = "=[[bill-beetle]]s, [[curculio]]s, [[grugru worm]]s, [[snout beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Curculionoidea]]",
parents = {"beetles"},
}
labels["paus"] = {
type = "set",
description = "default",
parents = {"setasea"},
}
labels["wolves"] = {
type = "set",
description = "=[[wolf|wolves]]",
parents = {"kanid"},
}
labels["woodpeckers"] = {
type = "set",
description = "=[[flicker]]s, [[sapsucker]]s, [[wryneck]]s, and other birds in the [[family]] [[Picidae]]",
parents = {"piciforms"},
}
labels["working dogs"] = {
type = "set",
description = "default",
parents = {"anjing"},
}
labels["cacing"] = {
type = "set",
description = "default",
parents = {"haiwan"},
}
labels["wrasses"] = {
type = "set",
description = "=fish in the family [[Labridae]]",
parents = {"ikan labroid"},
}
labels["wrens"] = {
type = "set",
description = "default",
parents = {"burung sertioid"},
}
labels["ikan zoarkoid"] = {
type = "set",
description = "=[[butterfish]], [[eelpout]]s, [[guffer]]s, [[gunnel]]s, [[lumper]]s, [[prickleback]]s, [[prowfish]], [[wolf eel]]s and other fish in the [[perciform]] [[suborder]] [[Zoarcoidei]]",
parents = {"ikan"},
}
labels["zygaenoid moths"] = {
type = "set",
description = "=[[burnet moth]]s, [[forester]]s, [[hag moth]]s, [[limacodid]]s, [[megalopygid]]s, [[monkey slug]]s, [[puss moth]]s, [[saddleback caterpillar]]s, [[zygaenid]]s, and other [[moth]]s in the [[superfamily]] [[Zygaenoidea]]",
parents = {"moths"},
}
labels["plesiosaurs"] = {
type = "set",
description = "=[[plesiosaur]]s (order †[[Plesiosauria]])",
parents = {"sauropterygians"},
}
labels["tarantulas"] = {
type = "set",
description = "=[[tarantula]]s (family [[Theraphosidae]])",
parents = {"mygalomorph spiders"},
}
return labels
oyregq00fdgn0aur0muspidpkz7vntv
Modul:category tree/topic/Nature
828
11535
281323
263976
2026-04-22T00:33:34Z
PeaceSeekers
3334
281323
Scribunto
text/plain
local labels = {}
labels["alam semula jadi"] = {
type = "berkenaan",
description = "default",
parents = {"semua topik"},
}
labels["bentuk muka bumi"] = {
type = "set",
description = "=types of natural landforms",
parents = {"alam semula jadi"},
}
labels["asid"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["unsur kimia siri aktinid"] = {
type = "set",
description = "{{{langname}}} terms for those chemical elements in the {{w|f-block}} of the [[periodic table]] with [[atomic number]]s from 89 to 103.",
parents = {"unsur kimia", "logam", "keradioaktifan"},
}
labels["udara"] = {
type = "berkenaan",
description = "default",
parents = {"atmosfera"},
}
labels["logam alkali"] = {
type = "set",
description = "{{{langname}}} terms for [[alkali metal]]s, chemical elements in [[w:Group (periodic table)|group]] 1 of the [[periodic table]], which all have one [[valence electron]].",
parents = {"unsur kimia", "logam"},
}
labels["logam bumi beralkali"] = {
type = "set",
description = "{{{langname}}} terms for [[alkaline earth metal]]s, chemical elements in [[w:Group (periodic table)|group]] 2, which all have two [[valence electron]]s.",
parents = {"unsur kimia", "logam"},
}
labels["alkaloid"] = {
type = "set",
description = "default",
parents = {"sebatian organik"},
}
labels["aloi"] = {
type = "set",
description = "default",
parents = {"logam"},
}
labels["aluminium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan boron"},
}
labels["asid amino"] = {
type = "set",
description = "default",
parents = {"asid karboksilik"},
}
labels["bunyi haiwan"] = {
type = "set",
description = "default",
parents = {"bunyi-bunyi", "penyuaraan"},
}
labels["kebajikan haiwan"] = {
type = "berkenaan",
description = "{{{langname}}} terms closely associated with [[animal welfare]].",
parents = {"etika"},
}
labels["antijirim"] = {
type = "berkenaan",
description = "default",
parents = {"jirim"},
}
labels["antimoni"] = {
type = "berkenaan",
description = "default",
parents = {"pniktogen"},
}
labels["argon"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["arsenik"] = {
type = "berkenaan",
description = "default",
parents = {"pniktogen"},
}
labels["astatin"] = {
type = "berkenaan",
description = "default",
parents = {"halogen"},
}
labels["asteroid"] = {
type = "set",
description = "default",
parents = {"jasad cakerawala"},
}
labels["atmosfera"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["fenomena atmosfera"] = {
type = "set",
description = "default",
parents = {"atmosfera"},
}
labels["musim luruh"] = {
type = "berkenaan",
description = "default",
parents = {"musim"},
}
labels["barium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["barion"] = {
type = "set",
description = "default",
parents = {"hadrons"},
}
labels["berilium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["kelahiran"] = {
type = "berkenaan",
description = "default",
parents = {"pembiakan"},
}
labels["bismut"] = {
type = "berkenaan",
description = "default",
parents = {"pniktogen"},
}
labels["boron"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan boron"},
}
labels["unsur kumpulan boron"] = {
type = "set",
description = "{{{langname}}} terms for chemical elements in [[w:Group (periodic table)|group]] 13 of the [[periodic table]], which all have three [[valence electron]]s.",
parents = {"unsur kimia"},
}
labels["boson"] = {
type = "set",
description = "default",
parents = {"zarah subatom"},
}
labels["bromin"] = {
type = "berkenaan",
description = "default",
parents = {"halogen"},
}
labels["kadmium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["kalsium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["karbohidrat"] = {
type = "set",
description = "default",
parents = {"sebatian organik"},
}
labels["karbon"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan karbon"},
}
labels["unsur kumpulan karbon"] = {
type = "set",
description = "Perkataan bahasa {{{langname}}} bagi unsur-unsur kimia dalam [[w:Kumpulan (jadual berkala)|kumpulan]] 14 dalam [[jadual berkala]] yang memiliki empat [[elektron valens]].",
parents = {"unsur kimia"},
}
labels["asid karboksilik"] = {
type = "set",
description = "default",
parents = {"asid", "sebatian organik"},
}
labels["jasad cakerawala"] = {
type = "set",
description = "{{{langname}}} terms for various [[celestial body|celestial bodies]]; things found in outer space.",
parents = {"angkasa"},
}
labels["serium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["sesium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["kalkogen"] = {
type = "set",
description = "{{{langname}}} terms for chemical elements in [[w:Group (periodic table)|group]] 16 of the [[periodic table]], which all have 6 [[valence electron]]s.",
parents = {"unsur kimia"},
}
labels["unsur kimia"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["isomer kimia"] = {
type = "berkenaan",
description = "default",
parents = {"jirim", "kimia fizik", "bentuk"},
}
labels["proses kimia"] = {
type = "set",
description = "=[[chemical]] [[process]]es",
parents = {"alam semula jadi"},
}
labels["klorin"] = {
type = "berkenaan",
description = "default",
parents = {"halogen"},
}
labels["kromium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["planet klasik"] = {
type = "name",
description = "{{{langname}}} names for the [[classical planet]]s of our Solar System.",
parents = {"jasad cakerawala"},
}
labels["perubahan iklim"] = {
type = "berkenaan",
description = "=[[anthropogenic]] [[climate change]]",
parents = {"alam semula jadi"},
}
labels["awan"] = {
type = "set",
description = "default",
parents = {"fenomena atmosfera"},
}
labels["arang batu"] = {
type = "berkenaan",
description = "default",
parents = {"bahan api fosil"},
}
labels["kobalt"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["koenzim"] = {
type = "set",
description = "default",
parents = {"enzim"},
}
labels["warna"] = {
type = "set",
description = "default",
parents = {"cahaya", "penglihatan"},
}
for _, color_etc in ipairs {
{"hitam"},
{"biru"},
{"perang"},
{"hijau"},
{"kelabu"},
{"jingga"},
{"merah jambu"},
{"ungu"},
{"merah"},
{"putih"},
{"kuning"},
} do
local color, desc = unpack(color_etc)
desc = desc or ("[[%s]]"):format(color)
labels[color] = {
type = "set",
description = ("=shades of the [[color]] %s"):format(desc),
parents = {"warna"},
}
end
labels["warna pelangi"] = {
type = "set",
description = "=[[warna]] dalam [[pelangi]]",
parents = {"warna"},
}
labels["pembakaran"] = {
type = "berkenaan",
description = "default",
parents = {"proses kimia"},
}
labels["titik kompas"] = {
type = "set",
description = "default",
parents = {"arah", "navigasi"},
}
labels["copper"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["hablur"] = {
type = "berkenaan",
description = "default",
parents = {"jirim", "kimia fizik"},
}
labels["kegelapan"] = {
type = "berkenaan",
description = "default",
parents = {"cahaya"},
}
labels["arah"] = {
type = "set",
description = "default",
parents = {"alam semula jadi"},
}
labels["jarak"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["dadah"] = {
type = "set",
description = "default",
parents = {"jirim", "farmakologi"},
}
labels["kekeringan"] = {
type = "berkenaan",
description = "default",
parents = {"cecair"},
}
labels["planet kerdil Sistem Suria"] = {
type = "name",
description = "=[[planet kerdil]] di [[Sistem Suria]]",
parents = {"jasad cakerawala"},
}
labels["pewarna"] = {
type = "set",
description = "default",
parents = {"jirim", "pigmen"},
}
labels["tenaga"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["enzim"] = {
type = "set",
description = "default",
parents = {"protein", "pemangkinan"},
}
labels["europium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["bahan letupan"] = {
type = "set",
description = "default",
parents = {"jirim", "senjata"},
}
labels["warna mata"] = {
type = "set",
description = "=[[color]]s that are mostly or exclusively used of [[eye]]s",
parents = {"warna", "mata"},
}
labels["asid lemak"] = {
type = "set",
description = "default",
parents = {"asid karboksilik"},
}
labels["fermion"] = {
type = "set",
description = "default",
parents = {"zarah subatom"},
}
labels["api"] = {
type = "berkenaan",
description = "default",
parents = {"pembakaran", "sumber cahaya"},
wp = "Api",
}
labels["fluorin"] = {
type = "berkenaan",
description = "default",
parents = {"halogen"},
}
labels["kabus"] = {
type = "berkenaan",
description = "default",
parents = {"cuaca", "air"},
}
labels["bahan api fosil"] = {
type = "set",
description = "default",
parents = {"karbon", "tenaga", "sumber asli"},
}
labels["fransium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["gadolinium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["galaksi"] = {
type = "set",
description = "default",
parents = {"jasad cakerawala"},
}
labels["gallium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan boron"},
}
labels["gas"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["germanium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan karbon"},
}
labels["gold"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["hadrons"] = {
type = "set",
description = "default",
parents = {"zarah subatom"},
}
labels["warna rambut"] = {
type = "set",
description = "=[[color]]s that are mostly or exclusively used of [[hair]]",
parents = {"warna", "rambut"},
}
labels["halogen"] = {
type = "set",
description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 17 of the [[periodic table]], which all have 7 [[valence electron]]s",
parents = {"unsur kimia"},
}
labels["helium"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["heroin"] = {
type = "berkenaan",
description = "default",
parents = {"dadah rekreasi"},
}
labels["ketinggian"] = {
type = "berkenaan",
description = "default",
parents = {"jarak"},
}
labels["warna kuda"] = {
type = "set",
description = "=[[color]]s that are mostly or exclusively used of [[horse]]s",
parents = {"warna", "kuda"},
}
labels["hydrogen"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia"},
}
labels["ais"] = {
type = "berkenaan",
description = "default",
parents = {"air"},
}
labels["indium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan boron"},
}
labels["sebatian tak organik"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["iodine"] = {
type = "berkenaan",
description = "default",
parents = {"halogen"},
}
labels["ion"] = {
type = "set",
description = "default",
parents = {"jirim", "kimia", "keelektrikan"},
}
labels["iridium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["iron"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["isotop"] = {
type = "set",
description = "default",
parents = {"unsur kimia"},
}
labels["krypton"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["unsur kimia siri lantanid"] = {
type = "set",
description = "=[[chemical element]]s in the {{w|f-block}} of the [[periodic table]] with [[atomic number]]s from 57 to 71",
parents = {"unsur kimia"},
}
labels["lanthanum"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["lead"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan karbon"},
}
labels["panjang"] = {
type = "berkenaan",
description = "default",
parents = {"jarak"},
}
labels["lepton"] = {
type = "set",
description = "default",
parents = {"fermion"},
}
labels["kehidupan"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["cahaya"] = {
type = "berkenaan",
description = "default",
parents = {"tenaga"},
}
labels["sumber cahaya"] = {
type = "set",
description = "default",
parents = {"cahaya"},
}
labels["kilat"] = {
type = "berkenaan",
description = "default",
parents = {"cuaca", "keelektrikan"},
}
labels["cecair"] = {
type = "set",
description = "default", -- At what temperature?
parents = {"jirim"},
}
labels["lithium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["magnesium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["manganese"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["Marikh"] = {
type = "berkenaan",
description = "=planet [[Marikh]]",
parents = {"planet Sistem Suria"},
}
labels["marijuana"] = {
type = "berkenaan",
description = "default",
parents = {"hemp family plants", "dadah rekreasi"},
}
labels["jirim"] = {
type = "berkenaan",
description = "=physical [[matter]]",
parents = {"alam semula jadi", "kimia"},
}
labels["mercury (element)"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["mesons"] = {
type = "set",
description = "default",
parents = {"hadrons"},
}
labels["metaloid"] = {
type = "set",
description = "default",
parents = {"unsur kimia"},
}
labels["logam"] = {
type = "set",
description = "default",
parents = {"jirim", "metalurgi"},
}
labels["mineral"] = {
type = "set",
description = "default",
parents = {"jirim", "mineralogi"},
}
labels["molibdenum"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["Bulan"] = {
type = "berkenaan",
description = "=[[Bulan]], satelit semula jadi Bumi",
parents = {"alam semula jadi", "cahaya", "badan samawi", "satelit semula jadi"},
}
labels["satelit semula jadi"] = {
type = "berkenaan",
description = "default",
parents = {"badan samawi"},
}
for _, planet in ipairs {"Marikh", "Haumea", "Musytari", "Zuhal", "Neptun", "Uranus", "Pluto"} do
labels["bulan " .. planet] = {
type = "name",
description = ("=[[bulan]] yang mengelilingi orbit [[%s]]"):format(planet),
parents = {"satelit-satelit bulan"},
}
end
labels["produk semula jadi (kimia)"] = {
type = "name",
description = "=[[organic compound]]s produced by living [[organism]]s",
parents = {"sebatian organik"},
}
labels["sumber asli"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["neodimium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["neon"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["neurotoksin"] = {
type = "set",
description = "default",
parents = {"racun", "neurosains"},
}
labels["nickel"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["niobium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["nitrogen"] = {
type = "berkenaan",
description = "default",
parents = {"pniktogen"},
}
labels["gas adi"] = {
type = "set",
description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 18 of the [[periodic table]], which all have a full set of [[valence electron]]s: 2 for helium and 8 for the others",
parents = {"unsur kimia", "gas"},
}
labels["sebatian organik"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["osmium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["oxygen"] = {
type = "berkenaan",
description = "default",
parents = {"kalkogen"},
}
labels["palladium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["petroleum"] = {
type = "berkenaan",
description = "default",
parents = {"bahan api fosil", "cecair"},
}
labels["pharmaceutical drugs"] = {
type = "set",
description = "{{{langname}}} names for [[pharmaceutical#Adjective|pharmaceutical]] [[drug#Noun|drugs]].",
parents = {"dadah"},
}
labels["pharmaceutical effects"] = {
type = "set",
description = "{{{langname}}} names for [[pharmaceutical#Adjective|pharmaceutical]] [[effect#Noun|effects]].",
parents = {"farmakologi"},
}
labels["fosforus"] = {
type = "berkenaan",
description = "default",
parents = {"pniktogen"},
}
labels["pigmen"] = {
type = "set",
description = "default",
parents = {"warna"},
}
labels["planetoid"] = {
type = "set",
description = "default",
parents = {"jasad cakerawala"},
}
labels["planet"] = {
type = "set",
description = "default",
parents = {"jasad cakerawala"},
}
labels["planet Sistem Suria"] = {
type = "name",
description = "=[[planet]]s of our [[Solar System]]",
parents = {"planet"},
}
labels["platinum"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["Pluto"] = {
type = "berkenaan",
description = "=the dwarf planet [[Pluto]]",
parents = {"planet kerdil Sistem Suria"},
}
labels["pniktogen"] = {
type = "set",
description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 15 of the [[periodic table]], which all have 5 [[valence electron]]s",
parents = {"unsur kimia"},
}
labels["racun"] = {
type = "set",
description = "default",
parents = {"jirim"},
}
labels["kalium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["praseodymium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["promesium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["kuark"] = {
type = "set",
description = "default",
parents = {"fermion"},
}
labels["sinaran"] = {
type = "berkenaan",
description = "default",
parents = {"tenaga"},
}
labels["keradioaktifan"] = {
type = "berkenaan",
description = "default",
parents = {"sinaran", "fizik nuklear"},
}
labels["radium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["radon"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["hujan"] = {
type = "berkenaan",
description = "default",
parents = {"cuaca", "air"},
}
labels["dadah rekreasi"] = {
type = "set",
description = "default",
parents = {"dadah"},
}
labels["rodium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["rubidium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["rutenium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["samarium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri lantanid"},
}
labels["skandium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["selenium"] = {
type = "berkenaan",
description = "default",
parents = {"kalkogen"},
}
labels["bayang"] = {
type = "berkenaan",
description = "default",
parents = {"kegelapan"},
}
labels["senyap"] = {
type = "berkenaan",
description = "default",
parents = {"bunyi"},
}
labels["silikon"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan karbon"},
}
labels["silver"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["saiz"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["salji"] = {
type = "berkenaan",
description = "default",
parents = {"cuaca", "air"},
}
labels["natrium"] = {
type = "berkenaan",
description = "default",
parents = {"logam alkali"},
}
labels["bunyi"] = {
type = "berkenaan",
description = "default",
parents = {"tenaga"},
}
labels["bunyi-bunyi"] = {
type = "set",
description = "default",
parents = {"bunyi"},
}
labels["angkasa"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["musim bunga"] = {
type = "berkenaan",
description = "default",
parents = {"musim"},
}
labels["skluark"] = {
type = "set",
description = "default",
parents = {"fermion"},
}
labels["bintang"] = {
type = "set",
description = "{{{langname}}} names of individual [[star]]s, not including the [[Sun]].",
parents = {"jasad cakerawala"},
}
labels["steroid"] = {
type = "set",
description = "default",
parents = {"sebatian organik"},
}
labels["kekuatan"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi", "health"},
}
labels["strontium"] = {
type = "berkenaan",
description = "default",
parents = {"logam bumi beralkali"},
}
labels["zarah subatom"] = {
type = "set",
description = "default",
parents = {"jirim", "particle physics"},
}
labels["asid gula"] = {
type = "set",
description = "default",
parents = {"asid karboksilik", "karbohidrat"},
}
labels["gula"] = {
type = "set",
description = "default",
parents = {"karbohidrat"},
}
labels["sulfur"] = {
type = "berkenaan",
description = "default",
parents = {"kalkogen"},
}
labels["musim panas"] = {
type = "berkenaan",
description = "default",
parents = {"musim"},
}
labels["matahari"] = {
type = "berkenaan",
description = "=[[Matahari]]",
parents = {"alam semula jadi", "cahaya", "jasad cakerawala"},
}
labels["tantalum"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["teknesium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["telurium"] = {
type = "berkenaan",
description = "default",
parents = {"kalkogen"},
}
labels["suhu"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi", "cuaca"},
}
labels["teratogen"] = {
type = "set",
description = "default",
parents = {"racun"},
}
labels["talium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan boron"},
}
labels["torium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri aktinid"},
}
labels["timah"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kumpulan karbon"},
}
labels["titanium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["tembakau"] = {
type = "berkenaan",
description = "default",
parents = {"nightshades", "dadah rekreasi", "merokok"},
}
labels["logam peralihan"] = {
type = "set",
description = "{{{langname}}} terms for [[chemical element]]s in [[w:Group (periodic table)|group]]s 3 to 12 of the [[periodic table]], which are also in the {{w|d-block}}.",
parents = {"unsur kimia", "logam"},
}
labels["tungsten"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["jenis planet"] = {
type = "type",
topic = "planet",
description = "=[[planet]]",
parents = {"planet"},
}
labels["uranium"] = {
type = "berkenaan",
description = "default",
parents = {"unsur kimia siri aktinid"},
}
labels["vanadium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["penyuaraan"] = {
type = "set",
description = "default",
parents = {"bunyi-bunyi", "communication"},
}
labels["air"] = {
type = "berkenaan",
description = "default",
parents = {"cecair"},
}
labels["air terjun"] = {
type = "berkenaan",
description = "default",
parents = {"air"},
}
labels["cuaca"] = {
type = "berkenaan",
description = "default",
parents = {"atmosfera"},
}
labels["berat"] = {
type = "berkenaan",
description = "default",
parents = {"alam semula jadi"},
}
labels["kebasahan"] = {
type = "berkenaan",
description = "default",
parents = {"cecair"},
}
labels["angin"] = {
type = "berkenaan",
description = "default",
parents = {"cuaca"},
}
labels["musim sejuk"] = {
type = "berkenaan",
description = "default",
parents = {"musim"},
}
labels["xenon"] = {
type = "berkenaan",
description = "default",
parents = {"gas adi"},
}
labels["itrium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["zink"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
labels["zirkonium"] = {
type = "berkenaan",
description = "default",
parents = {"logam peralihan"},
}
return labels
mh4bqwiwqs0cj7btt0bm84te04937p3
Modul:headword/data
828
11806
281245
281234
2026-04-21T13:26:02Z
PeaceSeekers
3334
Protected "[[Modul:headword/data]]": it is finished ([Edit=Allow administrators only] (indefinite))
281234
Scribunto
text/plain
local headword_page_module = "Module:headword/page"
local list_to_set = require("Module:table").listToSet
local data = {}
------ 1. Lists which are converted into sets. ------
--[==[ var:
Large pages where we disable label tracking, red link checking and similar.
]==]
data.large_pages = list_to_set {
-- pages that consistently hit timeouts
"a",
-- pages that sometimes hit timeouts
"A",
"baba",
"de",
"e",
"i",
"lima",
"o",
"u",
"и",
"山",
"子",
"月",
"一",
"人",
}
--[==[ var:
Map from singular to plural, and from plural to itself, for recognized parts of speech with irregular plurals. Most of
these are invariable plurals, e.g. `kanji` is its own plural; but we also have `mora` plural `morae`.
]==]
data.irregular_plurals = list_to_set({
"cmavo",
"cmene",
"fu'ivla",
"gismu",
"Han tu",
"hanja",
"Hanzi",
"jyutping",
"kana",
"Kanji",
"lujvo",
"phrasebook",
"Pinyin",
"rafsi",
}, function(_, item)
return item
end)
local irregular_plurals = data.irregular_plurals
-- Irregular non-zero plurals AND any regular plurals where the singular ends in "s",
-- because the module assumes that inputs ending in "s" are plurals. The singular and
-- plural both need to be added, as the module will generate a default plural if
-- the input doesn't match a key in this table.
for sg, pl in next, {
mora = "morae"
} do
irregular_plurals[sg], irregular_plurals[pl] = pl, pl
end
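--[=[
Illustrative sketch (not used by the module): the shape of `irregular_plurals` after the loop above, assuming
`list_to_set` calls its second argument as `f(index, item)` and stores the result under the key `item`. The
`list_to_set` below is a simplified, hypothetical stand-in, not the implementation in [[Module:table]].

```lua
-- Hypothetical stand-in for Module:table's listToSet (assumption: the callback receives index and item).
local function list_to_set(list, value)
	local set = {}
	for i, item in ipairs(list) do
		if type(value) == "function" then
			set[item] = value(i, item)
		else
			set[item] = value == nil and true or value
		end
	end
	return set
end

-- Invariable plurals map to themselves.
local plurals = list_to_set({"kana", "rafsi"}, function(_, item)
	return item
end)

-- Irregular non-zero plurals: both singular and plural point at the plural.
for sg, pl in next, {mora = "morae"} do
	plurals[sg], plurals[pl] = pl, pl
end

assert(plurals["kana"] == "kana")
assert(plurals["mora"] == "morae")
assert(plurals["morae"] == "morae")
assert(plurals["noun"] == nil) -- not listed, so a caller would fall back to a default plural
```
]=]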
--[==[ var:
Recognized lemmas. If the part of speech in {{tl|head}} is set to one of these or its singular equivalent, the category
'LANG lemmas' will automatically be added. If the part of speech is not a singular or plural lemma or non-lemma form and
is not an abbreviation that expands to a recognized lemma or non-lemma form, the page will be added to various tracking
categories:
* [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos]]
* [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/LANG]]
* [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/pos/POS]]
* [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/pos/POS/LANG]]
]==]
data.lemmas = list_to_set{
"Kependekan",
"Akronim",
"Kata sifat",
"kata sifat",
"Kata adjektif", -- alias "kata sifat"
"kata adjektif", -- alias "kata sifat"
"adnominals",
"adpositions",
"adverba",
"Adverba",
"Kata keterangan",
"kata keterangan",
"Imbuhan",
"imbuhan",
"ambipositions",
"Kata sandang",
"kata sandang",
"Apitan",
"apitan",
"circumpositions",
"Penjodoh bilangan",
"penjodoh bilangan",
"cmavo",
"cmavo clusters",
"cmene",
"Bentuk gabungan",
"Kata hubung",
"kata hubung",
"counters",
"Penunjuk",
"Tanda diakritik",
"Digraf",
"equative adjectives",
"fu'ivla",
"gismu",
"Aksara Han",
"Han tu",
"hanja",
"hanzi",
"Hanzi",
"ideophones",
"Simpulan bahasa",
"Sisipan",
"sisipan",
"initialisms",
"Tanda lelaran",
"tanda lelaran",
"interfixes",
"Kata seru",
"kata seru",
"kana",
"kanji",
"Kanji",
"Huruf",
"ligatur",
"Logogram",
"lujvo",
"morae",
"Morfem",
"non-constituents",
"Kata nama",
"kata nama",
"Kata nama am", -- alias "kata nama"
"kata nama am", -- alias "kata nama"
"Nombor",
"nombor",
"Simbol angka",
"Kata bilangan",
"kata bilangan",
"Partikel",
"partikel",
"Frasa",
"frasa",
"kata dudi",
"Kata dudi",
"postpositional phrases",
"predicatives",
"Awalan",
"awalan",
"Frasa sendi nama",
"frasa sendi nama",
"Kata sendi nama",
"kata sendi nama",
"preverbs",
"pronominal adverbs",
"Kata ganti nama",
"kata ganti nama",
"Kata nama khas",
"kata nama khas",
"Peribahasa",
"peribahasa",
"Tanda baca",
"tanda baca",
"relatives",
"Akar",
"Kata dasar",
"kata dasar",
"Akhiran",
"akhiran",
"Suku kata",
"suku kata",
"Simbol",
"simbol",
"Kata kerja",
"kata kerja",
}
--[==[ var:
Recognized non-lemma forms. If the part of speech in {{tl|head}} is set to one of these or its singular equivalent, the
category 'LANG non-lemma forms' will automatically be added. If the part of speech is not a singular or plural lemma or
non-lemma form and is not an abbreviation that expands to a recognized lemma or non-lemma form, the page will be added
to various tracking categories; see the documentation of `data.lemmas`.
]==]
data.nonlemmas = list_to_set{
"active participle forms",
"active participles",
"adjectival participles",
"adjective case forms",
"Bentuk kata sifat",
"bentuk kata sifat",
"Bentuk kata adjektif", -- alias "bentuk kata sifat"
"bentuk kata adjektif", -- alias "bentuk kata sifat"
"Bentuk feminin kata sifat",
"Bentuk jamak kata sifat",
"Bentuk adverba",
"adverbial participles",
"agent participles",
"Bentuk artikel",
"Bentuk apitan",
"Bentuk gabungan",
"comparative adjective forms",
"comparative adjectives",
"comparative adverb forms",
"comparative adverbs",
"conjunction forms",
"contractions",
"converbs",
"determiner comparative forms",
"determiner forms",
"determiner superlative forms",
"diminutive nouns",
"elative adjectives",
"equative adjective forms",
"equative adjectives",
"future participles",
"gerund",
"infinitive forms",
"infinitives",
"interjection forms",
"jyutping",
"Kesalahan ejaan",
"negative participles",
"nominal participles",
"noun case forms",
"noun dual forms",
"Bentuk kata nama",
"bentuk kata nama",
"noun paucal forms",
"Bentuk jamak kata nama",
"noun possessive forms",
"noun singulative forms",
"numeral forms",
"partisipel",
"bentuk partisipel",
"particle forms",
"passive participles",
"past active participles",
"past participles",
"past participle forms",
"past passive participles",
"perfect active participles",
"perfect participles",
"perfect passive participles",
"Pinyin",
"Jamak",
"Bentuk kata dudi",
"Bentuk awalan",
"preposition contractions",
"preposition forms",
"prepositional pronouns",
"present active participles",
"present participles",
"present passive participles",
"Bentuk kata ganti nama",
"bentuk kata ganti nama",
"pronoun possessive forms",
"Bentuk kata nama khas",
"bentuk kata nama khas",
"Bentuk jamak kata nama khas",
"rafsi",
"Perumian",
"perumian",
"root forms",
"singulatives",
"Bentuk akhiran",
"superlative adjective forms",
"Kata sifat superlatif",
"superlative adverb forms",
"superlative adverbs",
"Bentuk kata kerja",
"bentuk kata kerja",
"verbal nouns",
}
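--[=[
Illustrative sketch (not used by the module): how a caller might consult the two sets above to pick a category.
The function name `classify_pos` and the small stand-in sets are hypothetical; the real lookup lives in
[[Module:headword]] and also handles singular/plural equivalents and abbreviations, which this sketch omits.

```lua
-- Stand-in sets with the same shape as data.lemmas / data.nonlemmas (each key maps to true).
local lemmas = {["kata nama"] = true, ["kata kerja"] = true}
local nonlemmas = {["bentuk kata nama"] = true, ["bentuk kata kerja"] = true}

-- Returns "lemma", "non-lemma", or nil when the part of speech is unrecognized
-- (the case that, per the documentation above, ends up on the tracking pages).
local function classify_pos(pos)
	if lemmas[pos] then
		return "lemma"
	elseif nonlemmas[pos] then
		return "non-lemma"
	end
	return nil
end

assert(classify_pos("kata nama") == "lemma")
assert(classify_pos("bentuk kata kerja") == "non-lemma")
assert(classify_pos("tidak dikenali") == nil)
```
]=]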
--[==[ var:
List of languages that will not have links to separate parts of the headword.
]==]
data.no_multiword_links = list_to_set{
"zh",
}
--[==[ var:
List of languages that will not have `LANG multiword terms` categories added. There are various reasons why languages
are in this list: (a) words are written without spaces between them; (b) syllables are written with spaces between them;
(c) variant reconstructions are notated with a tilde surrounded by spaces; (d) the language is a sign language, where
pagenames are multiword descriptions of the gesture(s) required to make an individual sign; (e) some other weirdnesses.
]==]
data.no_multiword_cat = list_to_set{
-------- Languages without spaces between words (sometimes spaces between phrases) --------
"blt", -- Tai Dam
"ja", -- Japanese
"khb", -- Lü
"km", -- Khmer
"lo", -- Lao
"mnw", -- Mon
"my", -- Burmese
"nan", -- Min Nan (some words in Latin script; hyphens between syllables)
"nan-hbl", -- Hokkien (some words in Latin script; hyphens between syllables)
"nod", -- Northern Thai
"ojp", -- Old Japanese
"shn", -- Shan
"sou", -- Southern Thai
"tdd", -- Tai Nüa
"th", -- Thai
"tts", -- Isan
"twh", -- Tai Dón
"txg", -- Tangut
"zh", -- Chinese (all varieties with Chinese characters)
"zkt", -- Khitan
-------- Languages with spaces between syllables --------
"ahk", -- Akha
"aou", -- A'ou
"atb", -- Zaiwa
"byk", -- Biao
"cdy", -- Chadong
--"duu", -- Drung; not sure
--"hmx-pro", -- Proto-Hmong-Mien
--"hnj", -- Green Hmong; not sure
"huq", -- Tsat
"ium", -- Iu Mien
--"lis", -- Lisu; not sure
"mtq", -- Muong
--"mww", -- White Hmong; not sure
"onb", -- Lingao
--"sit-gkh", -- Gokhy; not sure
--"swi", -- Sui; not sure
"tbq-lol-pro", -- Proto-Loloish
"tdh", -- Thulung
"ukk", -- Muak Sa-aak
"vi", -- Vietnamese
"yig", -- Wusa Nasu
"zng", -- Mang
-------- Languages with ~ with surrounding spaces used to separate variants --------
"mkh-ban-pro", -- Proto-Bahnaric
"sit-pro", -- Proto-Sino-Tibetan; listed above
-------- Other weirdnesses --------
"mul", -- Translingual; gestures, Morse code, etc.
"aot", -- Atong (India); bullet is a letter
-------- All sign languages --------
"ads",
"aed",
"aen",
"afg",
"ase",
"asf",
"asp",
"asq",
"asw",
"bfi",
"bfk",
"bog",
"bqn",
"bqy",
"bvl",
"bzs",
"cds",
"csc",
"csd",
"cse",
"csf",
"csg",
"csl",
"csn",
"csq",
"csr",
"doq",
"dse",
"dsl",
"ecs",
"esl",
"esn",
"eso",
"eth",
"fcs",
"fse",
"fsl",
"fss",
"gds",
"gse",
"gsg",
"gsm",
"gss",
"gus",
"hab",
"haf",
"hds",
"hks",
"hos",
"hps",
"hsh",
"hsl",
"icl",
"iks",
"ils",
"inl",
"ins",
"ise",
"isg",
"isr",
"jcs",
"jhs",
"jls",
"jos",
"jsl",
"jus",
"kgi",
"kvk",
"lbs",
"lls",
"lsl",
"lso",
"lsp",
"lst",
"lsy",
"lws",
"mdl",
"mfs",
"mre",
"msd",
"msr",
"mzc",
"mzg",
"mzy",
"nbs",
"ncs",
"nsi",
"nsl",
"nsp",
"nsr",
"nzs",
"okl",
"pgz",
"pks",
"prl",
"prz",
"psc",
"psd",
"psg",
"psl",
"pso",
"psp",
"psr",
"pys",
"rms",
"rsl",
"rsm",
"sdl",
"sfb",
"sfs",
"sgg",
"sgx",
"slf",
"sls",
"sqk",
"sqs",
"ssp",
"ssr",
"svk",
"swl",
"syy",
"tse",
"tsm",
"tsq",
"tss",
"tsy",
"tza",
"ugn",
"ugy",
"ukl",
"uks",
"vgt",
"vsi",
"vsl",
"vsv",
"xki",
"xml",
"xms",
"ygs",
"ysl",
"zib",
"zsl",
}
--[==[ var:
List of languages where a hyphen is not considered a word separator for the `LANG multiword terms` category. There are
numerous reasons why languages are in this list; by each language should be listed the reason for inclusion.
]==]
data.hyphen_not_multiword_sep = list_to_set{
"akk", -- Akkadian; hyphens between syllables
"akl", -- Aklanon; hyphens for mid-word glottal stops
"ber-pro", -- Proto-Berber; morphemes separated by hyphens
"ceb", -- Cebuano; hyphens for mid-word glottal stops
"cnk", -- Khumi Chin; hyphens used in single words
"cpi", -- Chinese Pidgin English; Chinese-derived words with hyphens between syllables
"de", -- German; too many false positives
"esx-esk-pro", -- hyphen used to separate morphemes
"fi", -- Finnish; hyphen used to separate components in compound words if the final and initial vowels match, respectively
"gd", -- Scottish Gaelic; too many false positives like [[a-chianaibh]], [[a-nìos]], [[an-dè]] and other adverbs in a- and an-
"hil", -- Hiligaynon; hyphens for mid-word glottal stops
"hnn", -- Hanunoo; too many false positives
"ilo", -- Ilocano; hyphens for mid-word glottal stops
"kne", -- Kankanaey; hyphens for mid-word glottal stops
"lcp", -- Western Lawa; dash as syllable joiner
"lwl", -- Eastern Lawa; dash as syllable joiner
"mfa", -- Pattani Malay in Thai script; dash as syllable joiner
"mkh-vie-pro", -- Proto-Vietic; morphemes separated by hyphens
"msb", -- Masbatenyo; too many false positives
"tl", -- Tagalog; too many false positives
"war", -- Waray-Waray; too many false positives
"yo", -- Yoruba; hyphens used to show lengthened nasal vowels
}
--[==[ var:
List of languages that will not have `LANG masculine nouns` and similar categories added. Generally, these languages are
lacking gender but use the gender field for other purposes. (This is a massive hack and should be changed.)
]==]
data.no_gender_cat = list_to_set{
-- Languages without gender but which use the gender field for other purposes
"ja",
"th",
}
--[==[ var:
List of languages where [[Module:headword]] should not attempt to generate a transliteration even if the term is written
in a non-Latin script. FIXME: Notate reasons why each language is in this list.
]==]
data.notranslit = list_to_set{
"ams",
"az",
"bbc",
"bug",
"cdo",
"cia",
"cjm",
"cjy",
"cmn",
"cnp",
"cpi",
"cpx",
"csp",
"czh",
"czo",
"gan",
"hak",
"hnm",
"hsn",
"ja",
"kzg",
"lad",
"ltc",
"luh",
"lzh",
"mnp",
"ms",
"mul",
"mvi",
"nan",
"nan-dat",
"nan-hbl",
"nan-hlh",
"nan-lnx",
"nan-tws",
"nan-zhe",
"nan-zsh",
"och",
"oj",
"okn",
"ryn",
"rys",
"ryu",
"sh",
"sjc",
"tgt",
"th",
"tkn",
"tly",
"txg",
"und",
"vi",
"wuu",
"xug",
"yoi",
"yox",
"yue",
"za",
"zh",
"zhx-sic",
"zhx-tai",
}
--[==[ var:
List of languages that will default to `sccat` being true, i.e. categories like `LANG POS in SCRIPT script` will
automatically be generated. This can be overridden using {{para|sccat|0}} in {{tl|head}} or setting `sccat` to
`false` in Lua.
]==]
data.default_sccat = list_to_set{
"inc-apa",
"inc-ash",
"kfr",
"ks",
"mr",
"mwr",
"inc-oaw",
"inc-ohi",
"omr",
"inc-opa",
"phr",
"pi",
"pra",
"sa",
"skr",
"sd",
}
--[==[ var:
List of script codes for which a script-tagged display title will be added.
]==]
data.toBeTagged = list_to_set{
"Ahom",
"Arab",
"fa-Arab",
"glk-Arab",
"kk-Arab",
"ks-Arab",
"ku-Arab",
"mzn-Arab",
"ms-Arab",
"ota-Arab",
"pa-Arab",
"ps-Arab",
"sd-Arab",
"tt-Arab",
"ug-Arab",
"ur-Arab",
"Armi",
"Armn",
"Avst",
"Bali",
"Bamu",
"Batk",
"Beng",
"as-Beng",
"Bopo",
"Brah",
"Brai",
"Bugi",
"Buhd",
"Cakm",
"Cans",
"Cari",
"Cham",
"Cher",
"Copt",
"Cprt",
"Cyrl",
"Cyrs",
"Deva",
"Dsrt",
"Egyd",
"Egyp",
"Ethi",
"Geok",
"Geor",
"Glag",
"Goth",
"Grek",
"Polyt",
"polytonic",
"Gujr",
"Guru",
"Hang",
"Hani",
"Hano",
"Hebr",
"Hira",
"Hluw",
"Ital",
"Java",
"Kali",
"Kana",
"Khar",
"Khmr",
"Knda",
"Kthi",
"Lana",
"Laoo",
"Latn",
"Latf",
"Latg",
"Latnx",
"Latinx",
"pjt-Latn",
"Lepc",
"Limb",
"Linb",
"Lisu",
"Lyci",
"Lydi",
"Mand",
"Mani",
"Marc",
"Merc",
"Mero",
"Mlym",
"Mong",
"mnc-Mong",
"sjo-Mong",
"xwo-Mong",
"Mtei",
"Mymr",
"Narb",
"Nkoo",
"Nshu",
"Ogam",
"Olck",
"Orkh",
"Orya",
"Osma",
"Ougr",
"Palm",
"Phag",
"Phli",
"Phlv",
"Phnx",
"Plrd",
"Prti",
"Rjng",
"Runr",
"Samr",
"Sarb",
"Saur",
"Sgnw",
"Shaw",
"Shrd",
"Sinh",
"Sora",
"Sund",
"Sylo",
"Syrc",
"Tagb",
"Tale",
"Talu",
"Taml",
"Tang",
"Tavt",
"Telu",
"Tfng",
"Tglg",
"Thaa",
"Thai",
"Tibt",
"Ugar",
"Vaii",
"Xpeo",
"Xsux",
"Yiii",
"Zmth",
"Zsym",
"Ipach",
"Music",
"Rumin",
}
--[==[ var:
Parts of speech which will not be categorised in categories like `English terms spelled with É` if the term is the
character in question (e.g. the letter entry for English [[é]]). This contrasts with entries like the French adjective
[[m̂]], which is a one-letter word spelled with the letter.
]==]
data.pos_not_spelled_with_self = list_to_set{
"Tanda diakritik",
"Aksara Han",
"Han tu",
"hanja",
"hanzi",
"Tanda lelaran",
"kana",
"kanji",
"Huruf",
"ligatur",
"Logogram",
"morae",
"Simbol angka",
"Kata bilangan",
"Tanda baca",
"Suku kata",
"Simbol",
}
------ 2. Lists not converted into sets. ------
--[==[ var:
Recognized aliases for parts of speech (param 2=). Key is the short form and value is the canonical singular (not
pluralized) form. It is singular so the same table can be used in [[Module:form of]] for the {{para|p}}/{{para|POS}}
param and [[Module:links]] for the pos= param. Note that any part of speech, abbreviated or not, can be suffixed with
`f` to generate the corresponding non-lemma form part of speech, such as `adjf`, `af` or `adjectivef` for
`adjective form`, and `nounf` or `nf` for `noun form`. This expansion happens even when it does not make sense for the
given part of speech (e.g. `pclf` expands to `particle form` and `symf` expands to `symbol form`), and currently also,
at least in [[Module:headword]] (but not [[Module:links]]), even if the part before the `f` is not a recognized part of
speech or abbreviation (hence `nerf` expands to `ner form`).
]==]
data.pos_aliases = {
a = "kata sifat",
adj = "kata sifat",
["Kata adjektif"] = "kata sifat", -- alias "kata sifat"
["kata adjektif"] = "kata sifat", -- alias "kata sifat"
["kata Adjektif"] = "kata sifat", -- alias "kata sifat"
["Kata Adjektif"] = "kata sifat", -- alias "kata sifat"
adv = "adverba",
aug = "augmentative",
art = "kata sandang",
cls = "penjodoh bilangan",
compadj = "comparative adjective",
compadv = "comparative adverb",
compdet = "comparative determiner",
comppron = "comparative pronoun",
cnum = "nombor kardinal",
conj = "conjunction",
contr = "contraction",
conv = "converb",
det = "penunjuk",
dim = "diminutive",
int = "kata seru",
interj = "kata seru",
intj = "kata seru",
n = "kata nama",
["Kata nama am"] = "kata nama", -- alias "kata nama"
["kata nama am"] = "kata nama", -- alias "kata nama"
["kata benda"] = "kata nama", -- alias "kata nama"
["Kata benda"] = "kata nama", -- alias "kata nama"
["Kata Benda"] = "kata nama", -- alias "kata nama"
["Kata am"] = "kata nama", -- alias "kata nama"
["kata am"] = "kata nama", -- alias "kata nama"
na = "animate noun",
ni = "inanimate noun",
num = "kata bilangan",
pastpart = "past participle",
part = "partisipel",
pcl = "partikel",
phr = "frasa",
pn = "kata nama khas",
postp = "kata dudi",
pref = "awalan",
pre = "kata depan",
prep = "kata depan",
prepphr = "prepositional phrase",
prespart = "present participle",
pro = "kata ganti nama",
pron = "kata ganti nama",
prop = "kata nama khas",
proper = "kata nama khas",
propn = "kata nama khas",
onum = "nombor ordinal",
romanisation = "perumian",
romanisations = "perumian",
suf = "akhiran",
supadj = "superlative adjective",
supadv = "superlative adverb",
supdet = "superlative determiner",
suppron = "superlative pronoun",
sym = "simbol",
v = "kata kerja",
vb = "kata kerja",
vi = "kata kerja tak transitif",
vm = "modal verb",
vt = "kata kerja transitif",
vii = "kata kerja tak transitif tidak bernyawa",
vai = "kata kerja tak transitif bernyawa",
vti = "kata kerja transitif tidak bernyawa",
vta = "kata kerja transitif bernyawa",
}
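--[=[
Illustrative sketch only, not part of this data module: one way the `f` suffix rule described in the comment on
`data.pos_aliases` could be applied. The small `aliases` table, its English values and the function name `expand_pos`
are assumptions made for the example; the actual lookup lives in [[Module:headword]] and may differ (in particular,
letting an exact alias match take priority over the suffix rule is an assumption).

```lua
-- Hypothetical stand-in for data.pos_aliases, with English values for readability.
local aliases = {
	a = "adjective",
	adj = "adjective",
	n = "noun",
	pcl = "particle",
	pre = "preposition",
	pref = "prefix",
}

-- Expand an abbreviation, treating a trailing `f` as "... form".
local function expand_pos(pos)
	-- Assumption: an exact alias wins, so `pref` is "prefix" and not `pre` + `f`.
	if aliases[pos] then
		return aliases[pos]
	end
	local stem = pos:match("^(.+)f$")
	if stem then
		-- Unrecognized stems pass through unchanged.
		return (aliases[stem] or stem) .. " form"
	end
	return pos
end
```

Under these assumptions `expand_pos("adjf")` and `expand_pos("af")` both give `adjective form`, and
`expand_pos("nerf")` gives `ner form`, as described above for stems that are not recognized.
]=]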
--[==[ var:
Map of parts of speech for which categories like `German masculine nouns` or `Russian imperfective verbs` will be
generated if the headword is of the appropriate gender/number. The map is used to canonicalize parts of speech for
categorization purposes; specifically, proper nouns are categorized like nouns.
]==]
data.pos_for_gender_number_cat = {
["Kata nama"] = "Kata nama",
["Kata nama khas"] = "Kata nama",
["Akhiran"] = "Akhiran",
-- We include verbs because impf and pf are valid "genders".
["Kata kerja"] = "Kata kerja",
}
--[==[ var:
Lower limit for a "long" word in a particular language. Used to categorize terms into e.g.
[[:Category:Long English words]] automatically. Languages with no mapping here do not get categorized.
]==]
data.long_word_thresholds = {
["af"] = 20,
["bg"] = 20,
["cy"] = 25,
["de"] = 20,
["en"] = 25,
["es"] = 20,
["fr"] = 20,
["ka"] = 20,
["sv"] = 20,
["tl"] = 25,
}
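--[=[
Illustrative sketch only, not part of this data module: how a consumer might use `data.long_word_thresholds`. The
function name `is_long_word`, the two-entry `thresholds` table and the choice to pass in a precomputed character count
are assumptions for the example; treating the "lower limit" as inclusive is also an assumption.

```lua
-- Hypothetical copy of two entries from data.long_word_thresholds.
local thresholds = { en = 25, de = 20 }

-- `length` is the term's length in characters, computed by the caller.
local function is_long_word(lang_code, length)
	local threshold = thresholds[lang_code]
	-- Languages with no mapping are never categorized.
	return threshold ~= nil and length >= threshold
end
```

Passing the length rather than the term keeps the sketch independent of how characters are counted for a given script.
]=]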
------ 3. Page-wide processing (so that it only needs to be done once per page). ------
data.page = require(headword_page_module).process_page()
-- Set some page properties directly on `data` for ease of use.
data.pagename = data.page.pagename
data.encoded_pagename = data.page.encoded_pagename
return data
e25nfyx4xb9ot1xz9cga0fu4e00xa54
Modul:ja-kanji-readings
828
11835
281350
256040
2026-04-22T05:36:35Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88680772|88680772]])
281350
Scribunto
text/plain
local export = {}
local m_ja = require("Module:ja")
local m_str_utils = require("Module:string utilities")
local concat = table.concat
local find = m_str_utils.find
local get_script = require("Module:scripts").getByCode
local hira_to_kata = m_ja.hira_to_kata
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local kata_to_hira = m_ja.kata_to_hira
local gmatch = m_str_utils.gmatch
local match = m_str_utils.match
local split = m_str_utils.split
local Jpan = get_script("Jpan")
-- local katakana_script = get_script("Kana")
local Hira = get_script("Hira")
local PAGENAME = mw.loadData("Module:headword/data").pagename
local NAMESPACE = mw.title.getCurrentTitle().nsText
-- Only used by commented-out code.
-- local data = mw.loadData("Module:ja/data")
local CONCAT_SEP = ', '
local labels = {
{
text = "Go-on",
text2 = "goon",
classification = "on",
},
{
text = "Kan-on",
text2 = "kan'on",
classification = "on",
},
{
text = "Sō-on",
text2 = "sōon",
classification = "on",
},
{
text = "Tō-on",
text2 = "tōon",
classification = "on",
},
{
text = "Kan’yō-on",
text2 = "kan'yōon",
classification = "on",
},
{
entry = "on'yomi",
text = "On",
text2 = "on",
classification = "on",
unclassified = " (tidak dikelaskan)",
},
{
entry = "kun'yomi",
text = "Kun",
text2 = "kun",
classification = "kun",
},
{
text = "Nanori",
text2 = "nanori",
classification = "nanori",
},
}
local function track(code)
require("Module:debug").track("ja-kanji-readings/" .. code)
end
local function plain_link(data)
data.term = data.term:gsub('[%.%- ]', '') -- 「かな-し.い」→「かなしい」, 「も-しく は」→「もしくは」
data.tr = data.tr and data.tr:gsub('[%.%-]', '') or '-'
data.sc = match(data.term:gsub('[%z\1-\127]', ''), '[^' .. Hira:getCharacters() .. ']') and Jpan or Hira
data.pos = data.pos ~= '' and data.pos or nil
data.respect_link_tr = true
return require("Module:links").full_link(data, "term") --"term" makes italic
end
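--[=[
Illustrative sketch only, not called anywhere in this module: how the `reading:match` pattern used in `export.show`
splits one reading into its modern, historical and oldest kana spellings, which are written separated by `<`. The
helper name `split_reading` and the sample readings are assumptions for the example; the pattern string itself is
copied from the code below.

```lua
-- Same pattern as in export.show: up to three `<`-separated spellings, plus any surplus.
local PATTERN = '^(.-)%f[<%z]<?(.-)%f[<%z]<?(.-)%f[<%z]<?(.*)$'

-- Returns modern, historical, oldest, surplus; missing parts come back as "".
local function split_reading(reading)
	return reading:match(PATTERN)
end

local modern, hist, oldest, surplus = split_reading("きょう<けふ")
-- modern is "きょう", hist is "けふ", oldest and surplus are empty strings.
```

A fourth `<`-separated part ends up in `surplus`, which is what `export.show` checks before raising its
"too many historical readings" error.
]=]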
--[=[
Copied from [[Module:ja]] on 2017/6/14.
Replaces the code in Template:ja-readings which accepted kanji readings,
and displayed them in a consistent format.
Substantial change in function was introduced in https://en.wiktionary.org/w/index.php?diff=46057625
]=]
function export.show(frame)
local args = require("Module:parameters").process(frame:getParent().args, {
["goon"] = {},
["kanon"] = {},
["soon"] = {},
["toon"] = {},
["on"] = {},
["kanyoon"] = {},
["kun"] = {},
["nanori"] = {},
["pagename"] = {},
})
local lang_code = frame.args[1] or 'ja'
local lang = require'Module:languages'.getByCode(lang_code)
local lang_name = lang:getCanonicalName()
if args.pagename and NAMESPACE == "" then
error("Parameter pagename tidak boleh digunakan dalam penyertaan, kerana hanya untuk ujian.")
end
local pagename = args.pagename or PAGENAME
local yomi_data = mw.loadData("Module:ja/data/jouyou-yomi").yomi
-- this holds the finished product composed of wikilinks to be displayed
-- in the Readings section under the Kanji section
local links, categories = {}, {}
local is_old_format = false
-- We need a separate kanji sortkey module.
local sortkey = (require("Module:Hani-sortkey").makeSortKey(pagename, lang_code, "Jpan"))
local function add_reading_category(reading, subtype, period)
reading = kata_to_hira(reading:gsub("[%. ]+", ""):gsub("%-$", ""):gsub("%-", "・"))
if subtype then
return insert(categories, '[[Kategori:Kanji dengan bacaan ' .. (period or '') .. ' ' ..
subtype .. ' ' .. reading ..
' bahasa ' .. lang_name .. '|' .. sortkey .. ']]')
else
return insert(categories, '[[Kategori:Kanji dibaca sebagai ' ..
reading .. ' bahasa ' .. lang_name .. '|' .. sortkey .. ']]')
end
end
local unclassified_on = {}
local classified_on = {}
local kun = {}
local kana = "[ぁ-ー]"
for _, label in ipairs(labels) do
local readings = args[label.text2:gsub('ō', 'o'):gsub('\'', '')]
if readings then
local unclassified = ""
if label.unclassified then
if not (args.goon or args.kanon or args.soon or args.toon or args.kanyoon) then
unclassified = label.unclassified
end
end
if find(readings, '%[%[' .. kana) then
is_old_format = true
if label.classification == 'on' then
for reading in gmatch(readings, kana .. '+') do
add_reading_category(reading)
end
end
readings = readings:gsub("%[%[([^%]|]+)%]%]", function(entry)
if find(entry, "^[" .. Jpan:getCharacters() .. "]+$") then
return plain_link{
lang = lang,
term = entry,
}
else
return "[[" .. entry .. "]]"
end
end)
else
readings = split(readings, "%s*[,、]%s*")
for i, reading in ipairs(readings) do
local is_jouyou = false
local pos, pos_hist, pos_oldest = { }, { '[[w:Ortografi kana bersejarah|bersejarah]]' }, { 'kuno' }
-- check for formatting indicating presence of historical kana spelling
local reading_mod, reading_hist, reading_oldest, reading_surplus = reading:match'^(.-)%f[<%z]<?(.-)%f[<%z]<?(.-)%f[<%z]<?(.*)$'
if reading_surplus ~= '' then
error("Bacaan " .. reading .. " mengandungi terlalu banyak bacaan bersejarah. Maksimum hanya 3: moden, lama, kuno.")
end
if label.text2 == "on" then
unclassified_on[reading_mod] = true
insert(unclassified_on, reading_mod)
elseif label.text2 == "kun" then
kun[reading_mod] = true
insert(kun, reading_mod)
elseif label.classification == "on" then
classified_on[reading_mod] = true
insert(classified_on, reading_mod)
end
-- test if reading contains katakana
if find(reading_mod .. reading_hist .. reading_oldest, '[ァ-ヺ]') then
insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. '|1]]') -- sometimes legit, like 「頁(ページ)」
end
if reading_hist ~= '' or reading_oldest ~= '' then
-- test if historical readings contain small kana (anachronistic)
if find(reading_hist .. reading_oldest, '[ぁぃぅぇぉゃゅょ]') then
insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. '|2]]') --
end
-- test if reading contains kun'yomi delimiter thing but historical readings don't
if reading_mod:find("-", 1, true) then
if reading_hist ~= '' and not reading_hist:find("-", 1, true) or reading_oldest ~= '' and not reading_oldest:find("-", 1, true) then
insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. '|3]]')
end
end
end
-- check if there is data indicating that our kanji is a jouyou kanji
if yomi_data[pagename] then
local reading = (label.classification == 'on' and hira_to_kata(reading_mod) or reading_mod)
reading = reading:gsub('%.', '') -- 「あたら-し.い」→「あたら-しい」
local yomi_type = yomi_data[pagename][reading]
if yomi_type then
is_jouyou = true
if yomi_type == 1 or yomi_type == 2 then
insert(pos, '[[w:Jōyō kanji|<abbr title="This reading is listed in the Jōyō kanji table. Click for the Wikipedia article about the Jōyō kanji.">Jōyō</abbr>]]')
elseif yomi_type == 3 or yomi_type == 4 then
insert(pos, '[[w:Jōyō kanji|<abbr title="This reading is listed in the Jōyō kanji table, but is marked as restricted or rare. Click for the Wikipedia article about the Jōyō kanji.">Jōyō <sup>†</sup></abbr>]]')
end
end
end
local subtype = label.text2
if reading_mod then
add_reading_category(reading_mod, subtype)
end
if reading_hist ~= '' then
add_reading_category(reading_hist, subtype, 'lama')
end
if reading_oldest ~= '' then
add_reading_category(reading_oldest, subtype, 'kuno')
end
-- process kun readings with okurigana, create kanji-okurigana links
if reading:find("-", 1, true) then
insert(pos, 1, plain_link{
lang = lang,
term = reading_mod:gsub('^.+%-', pagename),
})
if reading_hist ~= '' then
insert(pos_hist, 1, plain_link{
lang = lang,
term = reading_hist:gsub('^.+%-', pagename),
})
end
if reading_oldest ~= '' then
insert(pos_oldest, 1, plain_link{
lang = lang,
term = reading_oldest:gsub('^.+%-', pagename),
})
end
elseif label.classification == 'kun' then
insert(categories, '[[Kategori:Kanji ' .. lang_name .. ' dengan bacaan kun hilang penamaan okurigana|' .. sortkey .. ']]')
end
local rom = kana_to_romaji((reading_mod), lang_code):gsub('^(.+)(%-)', '<u>%1</u>')
local rom_hist = kana_to_romaji((reading_hist:gsub('^(.+)(%-)', '<u>%1</u>')), lang_code, nil, {hist = true})
local rom_oldest = kana_to_romaji((reading_oldest:gsub('^(.+)(%-)', '<u>%1</u>')), lang_code, nil, {hist = true})
local mod_link = plain_link{
lang = lang,
term = reading_mod,
tr = rom,
pos = concat(pos, CONCAT_SEP),
}
if is_jouyou then
mod_link = '<span class="jouyou-reading">' .. mod_link .. '</span>'
end
readings[i] = mod_link .. (reading_hist ~= '' and '<sup>←' .. plain_link{
lang = lang,
term = reading_hist,
tr = rom_hist,
pos = concat(pos_hist, CONCAT_SEP),
} .. '</sup>' or '') .. (reading_oldest ~= '' and '<sup>←' .. plain_link{
lang = lang,
term = reading_oldest,
tr = rom_oldest,
pos = concat(pos_oldest, CONCAT_SEP),
} .. '</sup>' or '')
end
readings = concat(readings, "、")
end
-- Add "on-yomi", "kun-yomi", or "nanori-yomi" class around list of
-- readings to allow JavaScript to locate them.
insert(links, "* '''[[Lampiran:Glosari bahasa Jepun#" .. (label.entry or label.text2) .. '|'.. label.text .. "]]'''" .. unclassified .. ': <span class="' .. label.classification .. '-yomi">' .. readings .. '</span>')
end
end
for _, reading in ipairs(unclassified_on) do
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/duplicate reading]]
if classified_on[reading] then
track("duplicate reading")
end
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading ja]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading ryu]] etc.
track("unclassified reading " .. lang_code) -- Track unclassified readings for later classification
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading]]
track("unclassified reading") -- Leave a version that is not profiled by lang code, in order to not break any hypothetical scripts relying on the old tracking category
end
if not next(classified_on) and not next(unclassified_on) then
if next(kun) then
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/kun only]]
track("kun only")
end
elseif not next(kun) then
-- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/on only]]
track("on only")
end
if is_old_format then
insert(categories, '[[Kategori:Kanji Jepun menggunakan format lama ja-bacaan|' .. sortkey .. ']]')
end
return concat(links, '\n') .. (NAMESPACE == '' and concat(categories) or '') .. require("Module:TemplateStyles")("Template:ja-readings/style.css")
end
return export
s0hx5a524m6rhbbjn217ccn8ku5nqnk
ثعبان
0
13164
281304
111415
2026-04-21T15:44:26Z
Hakimi97
2668
281304
wikitext
text/x-wiki
{{juga|تعبان}}
==Bahasa Arab==
===Takrifan===
{{ar-kn|ثُعْبَان|m,f|pl=ثَعَابِين}}
# [[ular]]
#* {{RQ:Quran|26|32}}
#*: {{quote|ar|فَأَلْقَى عَصَاهُ فَإِذَا هِيَ '''ثُعْبَان'''ٌ مُبِينٌ|Nabi Musa pun mencampakkan tongkatnya, maka tiba-tiba tongkatnya itu menjadi seekor ular yang jelas nyata.}}
# {{lb|ar|buruj}} (biasanya {{l|ar|الثُعْبَان}}) [[Thuban]]
===Etimologi===
Daripada akar {{ar-akar|ث ع ب}}.
===Sebutan===
* {{ar-AFA|ثُعْبَان}}
===Deklensi===
{{ar-dekl-kn|ثُعْبَان|pl=ثَعَابِين}}
[[Kategori:ar:Reptilia]]
qnfbfz9fuuua4opeimzqjayw8deeldr
澪標
0
13298
281348
279385
2026-04-22T05:22:29Z
Hakimi97
2668
/* Kata nama */ Cuba buang, nak semak mengapa ada penjanaan Kategori:Perkataan dieja dengan 標 dibaca sebagai つくし bahasa Jepun
281348
wikitext
text/x-wiki
==Bahasa Jepun==
<div style="float:right;">
{{wikipedia|lang=ja}}
{{wikipedia|Berup siang}}
{{wikipedia|Tiang tambat}}
[[File:Miotsukushi_in_Osaka.JPG|thumb|250px|{{lang|ja|澪標}} (''miotsukushi'', ''miozukushi'', ''miojirushi'', ''reihyō''): sebuah '''{{w|tiang tambat}}''' tradisional Jepun di Osaka semasa {{w|zaman Meiji}}.]]
</div>
===Etimologi 1===
{{ja-kanjitab|yomi=k,irr|sort=みおづくし|みお|つくし|k2=づくし}}
{{ja-kanjitab|yomi=k,irr|sort=みおつくし|みお|つくし}}
Kata majmuk bagi {{ja-compound|澪|みお|つ|つ|串|くし|t1=[[saluran]] [[air]]|pos2=partikel kata milik {{inh|ja|ojp|sort=みおつくし|-}}|t3=[[pencucuk]] (biasanya daging)}}.<ref name="DJS">{{R:Daijisen}}</ref>
Juga ditemui dengan bacaan ''miozukushi''. {{rendaku2|sort=みおづくし|tsukushi|zukushi}}
Perlu diperhatikan bahawa terbitan yang berbeza bagi teks sejarah yang sama kelihatan berselang-seli antara bacaan ''miotsukushi'' dan ''miojirushi'', mungkin disebabkan perbezaan sejarah atau dialek.
====Sebutan====
{{ja-pron|みおつくし|acc=0|acc_ref=DJR|acc2=4|acc2_ref=DJR|acc3=3|acc3_ref=DJR}}
====Kata nama====
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
#* {{RQ:Manyoshu|14|3429}}, teks di [https://web.archive.org/web/20200925204109/http://jti.lib.virginia.edu/japanese/manyoshu/Man14Yo.html#3429 sini]
#*: {{ja-usex|m=等保都安布美 伊奈佐保曽江乃 '''水乎都久思''' 安礼乎多能米弖 安佐麻之物能乎|m_kana=とほつあふみ いなさほそえの '''みをつくし''' あれをたのめて あさましものを
|遠%江%引%佐%細%江の'''みをつくし'''我を頼めてあさましものを
|^とほ-つ%-^あふみ% ^いな%さ%-ほそ%え の '''みをつくし''' あれ を たのめて あさまし もの を|rom=Tō-tsu-Ōmi Inasa-hosoe no '''miotsukushi''' are o tanomete asamashi mono o|Di Tōtsu Ōmi atas pada Sungai Inasa berdirinya '''palang saluran'''―anda boleh membuat saya mengikuti dan meninggalkan saya di tempat tinggi dan kering.<ref>{{cite-book|1998|Edwin A. Cranston|The Gem-Glistening Cup|page=734|publisher=Stanford University Press|isbn=0-8047-3157-8}}</ref>|sort=みおつくし}}
#: {{synonyms|ja|澪木|tr=miogi|澪杭|tr2=miokui|[[水尾坊木]], [[澪坊木]]|tr3=miobōgi}}
# [[menyentuh]] (secara tidak langsung) kepada {{m|ja|尽くし|tr=tsukushi||[[kepenatan]]}}
#* {{RQ:Manyoshu|12|3162}}, teks di [https://web.archive.org/web/20200918235602/http://jti.lib.virginia.edu/japanese/manyoshu/Man12Yo.html#3162 sini]
#*: {{ja-usex|m='''水咫衝%石''' 心%盡%而 念%鴨 此間%毛%本%名 夢%西%所見|m_kana='''みをつく%し'''こころ%つくし%て おもへ%かも ここに%も%もと%な いめ%にし%みゆる|'''みをつくし'''心%尽して思へかもここにももとな夢にし見ゆる|'''みをつくし'''こころ% つくして おもへ か も ここ に も もと な いめ に し みゆる|rom='''miotsukushi''' kokoro tsukushite omoe ka mo koko ni mo moto na ime ni shi miyuru}}
# salah satu daripada 60 [[pelbagai]] jenis [[kemenyan]] yang terkenal, yang terbuat dari [[kayu]] [[aromatik]] {{m|ja|伽羅|tr=kyara}} dengan [[bau]]an [[pahit]]
#: {{hyper|ja|六十一種名香|tr=rokujūichi shumeikō}}
=====Nota penggunaan=====
* Pada zaman ''Man'yōshū'', makna "tiang tambat" merujuk kepada penanda di {{w|Wilayah Tōtōmi}}; semasa {{w|zaman Heian}}, makna itu hanya merujuk kepada penanda di [[teluk]] Naniwa, kini [[Osaka]].
* Sejak zaman Heian, makna "tiang tambat" dapat digunakan sebagai {{m|ja|掛詞|tr={{w|kakekotoba}}}} untuk [[plesetan]]/[[pun]]/[[pan]] terhadap makna {{m|ja|身を尽くす|身を尽くし|tr=mi o tsukushi|pos=secara harfiah “[[kepenatan]] [[badan]] seseorang” → “dengan semua [[kekuatan]], dengan semua [[hati]] dan [[nyawa]]”}}:
** {{RQ:Gosenshu|13|860; also ''{{w|Ogura Hyakunin Isshu|Hyakunin Isshu}}'', puisi 20}}
**: {{ja-usex|わびぬれば今はた同じ難%波なる'''みをつくし'''ても逢はむとぞ思ふ|わびぬれば いま はた おなじ なに%は なる '''みをつくし'''て も あはむ と ぞ おもふ|rom=wabinureba ima hata onaji Naniwa naru '''mi o tsukushi'''te mo awan to zo omou|Sedih, kini, semuanya sama. '''Tanda saluran''' di Naniwa―walaupun ianya '''menggadai nyawa'''ku, Aku akan bertemu denganmu lagi!<ref>{{cite-book|1996|Joshua S. Mostow |Pictures of the Heart: The Hyakunin Isshu in Word and Image|edition=illustrated|publisher=University of Hawaii Press|isbn=0-8248-1705-2|page=201}}</ref>|sort=みおつくし}}
====Kata nama khas====
{{ja-pos|proper|みおつくし|hhira=みをつくし}}
# [[bab]] ke[[empat belas]] bagi ''{{w|Hikayat Genji}}''
===Etimologi 2===
{{ja-kanjitab|yomi=k|みお|しるし|k2=じるし}}
Kata majmuk bagi {{ja-compound|澪|みお|標|しるし|t1=[[saluran]] [[air]]|t2=[[tanda]], [[penanda]]}}. {{rendaku2|sort=みおじるし|shirushi|jirushi}}
Perlu diperhatikan bahawa terbitan yang berbeza bagi teks sejarah yang sama kelihatan berselang-seli antara bacaan ''miojirushi'' dan ''miotsukushi'', mungkin disebabkan perbezaan sejarah atau dialek.
====Sebutan====
{{ja-pron|みおじるし|acc=3|acc_ref=DJR}}
====Bentuk alternatif====
* {{ja-l|水脈標}}
====Kata nama====
{{ja-noun|みおじるし|hhira=みをじるし}}
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
#* '''Abad ke-12''', ''{{w|lang=ja|山家集|Sankashū}}'' (buku 1, puisi 217)
#*: {{ja-usex|広%瀬%川%渡りの沖の'''みをじるし'''[[水%嵩]]ぞ深き[[五月雨]]の頃|^ひろ%せ%-がは% わたり の おき の '''みをじるし''' み%かさ ぞ ふかき さみだれ の ころ|rom=Hirose-gawa watari no oki no '''miojirushi''' mikasa zo fukaki samidare no koro}}
===Etimologi 3===
{{ja-kanjitab|yomi=kanon2|れい|ひょう}}
{{IPAchar|/reiheu/}} → {{IPAchar|/reːhjoː/}}
Daripada {{bor|ja|ltc|sort=れいひょう|-}} {{ltc-l|澪標|id=1,1}}.
====Sebutan====
{{ja-pron|れいひょう}}
====Kata nama====
{{ja-noun|れいひょう|hhira=れいへう}}
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
===Rujukan===
<references/>
:* {{R:Kanjipedia Kotoba|0007265800|〈<sup>▲</sup>澪標〉}}
{{cln|ja|makurakotoba}}
{{C|ja|Nautika}}
lmjni4y5pqusernmvnauj25h5q5jpnc
281349
281348
2026-04-22T05:22:50Z
Hakimi97
2668
Membatalkan semakan [[Special:Diff/281348|281348]] oleh [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|bincang]])
281349
wikitext
text/x-wiki
==Bahasa Jepun==
<div style="float:right;">
{{wikipedia|lang=ja}}
{{wikipedia|Berup siang}}
{{wikipedia|Tiang tambat}}
[[File:Miotsukushi_in_Osaka.JPG|thumb|250px|{{lang|ja|澪標}} (''miotsukushi'', ''miozukushi'', ''miojirushi'', ''reihyō''): sebuah '''{{w|tiang tambat}}''' tradisional Jepun di Osaka semasa {{w|zaman Meiji}}.]]
</div>
===Etimologi 1===
{{ja-kanjitab|yomi=k,irr|sort=みおづくし|みお|つくし|k2=づくし}}
{{ja-kanjitab|yomi=k,irr|sort=みおつくし|みお|つくし}}
Kata majmuk bagi {{ja-compound|澪|みお|つ|つ|串|くし|t1=[[saluran]] [[air]]|pos2=partikel kata milik {{inh|ja|ojp|sort=みおつくし|-}}|t3=[[pencucuk]] (biasanya daging)}}.<ref name="DJS">{{R:Daijisen}}</ref>
Juga ditemui dengan bacaan ''miozukushi''. {{rendaku2|sort=みおづくし|tsukushi|zukushi}}
Perlu diperhatikan bahawa terbitan yang berbeza bagi teks sejarah yang sama kelihatan berselang-seli antara bacaan ''miotsukushi'' dan ''miojirushi'', mungkin disebabkan perbezaan sejarah atau dialek.
====Sebutan====
{{ja-pron|みおつくし|acc=0|acc_ref=DJR|acc2=4|acc2_ref=DJR|acc3=3|acc3_ref=DJR}}
====Kata nama====
{{ja-noun|みおつくし|hhira=みをつくし}}<br/>{{ja-altread|hira=みおづくし|hhira=みをづくし}}
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
#* {{RQ:Manyoshu|14|3429}}, teks di [https://web.archive.org/web/20200925204109/http://jti.lib.virginia.edu/japanese/manyoshu/Man14Yo.html#3429 sini]
#*: {{ja-usex|m=等保都安布美 伊奈佐保曽江乃 '''水乎都久思''' 安礼乎多能米弖 安佐麻之物能乎|m_kana=とほつあふみ いなさほそえの '''みをつくし''' あれをたのめて あさましものを
|遠%江%引%佐%細%江の'''みをつくし'''我を頼めてあさましものを
|^とほ-つ%-^あふみ% ^いな%さ%-ほそ%え の '''みをつくし''' あれ を たのめて あさまし もの を|rom=Tō-tsu-Ōmi Inasa-hosoe no '''miotsukushi''' are o tanomete asamashi mono o|Di Tōtsu Ōmi atas pada Sungai Inasa berdirinya '''palang saluran'''―anda boleh membuat saya mengikuti dan meninggalkan saya di tempat tinggi dan kering.<ref>{{cite-book|1998|Edwin A. Cranston|The Gem-Glistening Cup|page=734|publisher=Stanford University Press|isbn=0-8047-3157-8}}</ref>|sort=みおつくし}}
#: {{synonyms|ja|澪木|tr=miogi|澪杭|tr2=miokui|[[水尾坊木]], [[澪坊木]]|tr3=miobōgi}}
# [[menyentuh]] (secara tidak langsung) kepada {{m|ja|尽くし|tr=tsukushi||[[kepenatan]]}}
#* {{RQ:Manyoshu|12|3162}}, teks di [https://web.archive.org/web/20200918235602/http://jti.lib.virginia.edu/japanese/manyoshu/Man12Yo.html#3162 sini]
#*: {{ja-usex|m='''水咫衝%石''' 心%盡%而 念%鴨 此間%毛%本%名 夢%西%所見|m_kana='''みをつく%し'''こころ%つくし%て おもへ%かも ここに%も%もと%な いめ%にし%みゆる|'''みをつくし'''心%尽して思へかもここにももとな夢にし見ゆる|'''みをつくし'''こころ% つくして おもへ か も ここ に も もと な いめ に し みゆる|rom='''miotsukushi''' kokoro tsukushite omoe ka mo koko ni mo moto na ime ni shi miyuru}}
# salah satu daripada 60 [[pelbagai]] jenis [[kemenyan]] yang terkenal, yang terbuat dari [[kayu]] [[aromatik]] {{m|ja|伽羅|tr=kyara}} dengan [[bau]]an [[pahit]]
#: {{hyper|ja|六十一種名香|tr=rokujūichi shumeikō}}
=====Nota penggunaan=====
* Pada zaman ''Man'yōshū'', makna "tiang tambat" merujuk kepada penanda di {{w|Wilayah Tōtōmi}}; semasa {{w|zaman Heian}}, makna itu hanya merujuk kepada penanda di [[teluk]] Naniwa, kini [[Osaka]].
* Sejak zaman Heian, makna "tiang tambat" dapat digunakan sebagai {{m|ja|掛詞|tr={{w|kakekotoba}}}} untuk [[plesetan]]/[[pun]]/[[pan]] terhadap makna {{m|ja|身を尽くす|身を尽くし|tr=mi o tsukushi|pos=secara harfiah “[[kepenatan]] [[badan]] seseorang” → “dengan semua [[kekuatan]], dengan semua [[hati]] dan [[nyawa]]”}}:
** {{RQ:Gosenshu|13|860; also ''{{w|Ogura Hyakunin Isshu|Hyakunin Isshu}}'', puisi 20}}
**: {{ja-usex|わびぬれば今はた同じ難%波なる'''みをつくし'''ても逢はむとぞ思ふ|わびぬれば いま はた おなじ なに%は なる '''みをつくし'''て も あはむ と ぞ おもふ|rom=wabinureba ima hata onaji Naniwa naru '''mi o tsukushi'''te mo awan to zo omou|Sedih, kini, semuanya sama. '''Tanda saluran''' di Naniwa―walaupun ianya '''menggadai nyawa'''ku, Aku akan bertemu denganmu lagi!<ref>{{cite-book|1996|Joshua S. Mostow |Pictures of the Heart: The Hyakunin Isshu in Word and Image|edition=illustrated|publisher=University of Hawaii Press|isbn=0-8248-1705-2|page=201}}</ref>|sort=みおつくし}}
====Kata nama khas====
{{ja-pos|proper|みおつくし|hhira=みをつくし}}
# [[bab]] ke[[empat belas]] bagi ''{{w|Hikayat Genji}}''
===Etimologi 2===
{{ja-kanjitab|yomi=k|みお|しるし|k2=じるし}}
Kata majmuk bagi {{ja-compound|澪|みお|標|しるし|t1=[[saluran]] [[air]]|t2=[[tanda]], [[penanda]]}}. {{rendaku2|sort=みおじるし|shirushi|jirushi}}
Perlu diperhatikan bahawa terbitan yang berbeza bagi teks sejarah yang sama kelihatan berselang-seli antara bacaan ''miojirushi'' dan ''miotsukushi'', mungkin disebabkan perbezaan sejarah atau dialek.
====Sebutan====
{{ja-pron|みおじるし|acc=3|acc_ref=DJR}}
====Bentuk alternatif====
* {{ja-l|水脈標}}
====Kata nama====
{{ja-noun|みおじるし|hhira=みをじるし}}
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
#* '''Abad ke-12''', ''{{w|lang=ja|山家集|Sankashū}}'' (buku 1, puisi 217)
#*: {{ja-usex|広%瀬%川%渡りの沖の'''みをじるし'''[[水%嵩]]ぞ深き[[五月雨]]の頃|^ひろ%せ%-がは% わたり の おき の '''みをじるし''' み%かさ ぞ ふかき さみだれ の ころ|rom=Hirose-gawa watari no oki no '''miojirushi''' mikasa zo fukaki samidare no koro}}
===Etimologi 3===
{{ja-kanjitab|yomi=kanon2|れい|ひょう}}
{{IPAchar|/reiheu/}} → {{IPAchar|/reːhjoː/}}
Daripada {{bor|ja|ltc|sort=れいひょう|-}} {{ltc-l|澪標|id=1,1}}.
====Sebutan====
{{ja-pron|れいひょう}}
====Kata nama====
{{ja-noun|れいひょう|hhira=れいへう}}
# {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]]
===Rujukan===
<references/>
:* {{R:Kanjipedia Kotoba|0007265800|〈<sup>▲</sup>澪標〉}}
{{cln|ja|makurakotoba}}
{{C|ja|Nautika}}
ldnhaivb76s1ue9b9op1mtiu3t7mjup
Templat:en-peribahasa
10
13613
281240
112271
2026-04-21T12:48:03Z
Hakimi97
2668
281240
wikitext
text/x-wiki
{{#invoke:en-headword|show|proverbs}}<!--
--><noinclude>{{documentation}}</noinclude>
sz5x1ciegn3gdcxhbrlyurer0owujgm
281242
281240
2026-04-21T13:02:04Z
Hakimi97
2668
281242
wikitext
text/x-wiki
{{#invoke:en-headword|show|peribahasa}}<!--
--><noinclude>{{documentation}}</noinclude>
bo0q937xoaulshjyzaf22r0iz94n1if
يوم
0
14405
281305
113435
2026-04-21T15:44:52Z
Hakimi97
2668
/* Etimologi */
281305
wikitext
text/x-wiki
== Bahasa Arab ==
=== Takrifan ===
==== Kata nama ====
{{ar-kn|يَوْم|m|pl=أَيَّام}}
# [[hari]]
# [[siang]]
=== Etimologi ===
Daripada akar {{ar-root|ي و م}}, daripada {{inh|ar|sem-pro|*yawm-}}.
=== Sebutan ===
* {{ar-IPA|يَوْم}}
* {{audio|ar|Ar-يوم.ogg|Audio}}
p1ciiwrmwa5wgoa7q2cwxw1jh707yrp
buan
0
14613
281336
116333
2026-04-22T01:03:28Z
PeaceSeekers
3334
281336
wikitext
text/x-wiki
{{juga|buan-|bù'ān|Buan}}
==Bahasa Bajau Sama==
===Takrifan===
====Kata nama====
{{inti|bdr|kata nama}}
# {{lb|bdr|waktu}} [[bulan]]
===Etimologi===
Daripada {{inh|bdr|poz-pro|*bulan}}, daripada {{inh|bdr|map-pro|*bulaN}}.
===Sebutan===
* {{AFA|bdr|/ˈbu.wan/}}
* {{rima|bdr|an}}
* {{penyempangan|bdr|bu|an}}
{{C|bdr|Masa}}
ffvwbsf5lpfllvwf7dzmae4x7bcmidk
bini-bini
0
16670
281315
239503
2026-04-21T17:46:49Z
Hakimi97
2668
/* Takrifan */
281315
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-kn|pl=-}}
# [[perempuan]]; [[wanita]]
===Etimologi===
Daripada {{der|ms|kxd|bini-bini}}.
=== Sebutan ===
* {{AFA|ms|/bi.ni.bi.ni/}}
* {{rima|ms|i}}
* {{penyempangan|ms|bi|ni|bi|ni}}
=== Tulisan Jawi ===
{{ARchar|[[بيني٢]]}}
=== Rujukan ===
* {{R:KD4}}
* {{R:Kamus Bahasa Melayu Nusantara|2=344}}
=== Pautan luar ===
* {{R:PRPM}}
==Bahasa Melayu Brunei==
===Takrifan===
====Kata nama====
{{inti|kxd|kata nama}}
# [[perempuan]] atau [[wanita]]
====Kata sifat====
{{inti|kxd|kata sifat}}
# [[perempuan]]
===Etimologi===
{{penggandaan|kxd|bini}}
===Sebutan===
* {{AFA|kxd|/bi.ni.bi.ni/}}
===Tesaurus===
====Sinonim====
* {{l|kxd|perempuan}}
====Antonim====
* {{l|kxd|laki-laki}} atau {{l|kxd|lelaki}}
====Kata berkaitan====
* {{l|kxd|betina}}
3ahjz3zlriswbu0ma2k2epe374liyf9
pikin
0
16926
281337
117722
2026-04-22T01:03:56Z
PeaceSeekers
3334
281337
wikitext
text/x-wiki
==Bahasa Belait==
===Takrifan===
[[Fail:B-Mingteller.JPG|thumb|pikin]]
====Kata nama====
{{head|beg|kata nama}}
# [[pinggan]].
===Sebutan===
* {{AFA|beg|/pi.kin/}}
* {{rima|beg|in}}
* {{penyempangan|beg|pi|kin}}
===Rujukan===
* {{R:DL7D|2=226}}
{{C|beg|Alat dapur}}
pxqs8exibxsbsh40tiu7e2j318fn2zt
Islam
0
17732
281306
119183
2026-04-21T15:45:42Z
Hakimi97
2668
/* Etimologi */
281306
wikitext
text/x-wiki
{{Pautan Projek Wikimedia}}
{{also|İslam}}
== Bahasa Melayu ==
=== Takrifan ===
{{ms-knk|j=إسلام}}
# [[agama|Agama]] yang mempercayai [[Allah]] sebagai [[tuhan]] yang tunggal, dan [[Muhammad]] sebagai [[rasul]].
=== Etimologi ===
Daripada {{bor|ms|ar|إِسْلَام||}}, bentuk kata nama bekerja {{m|ar|أَسْلَمَ}}, daripada akar {{ar-root|س ل م|nocat=1}}.
=== Sebutan ===
* {{dewan|is|lam}}
* {{AFA|ms|/islam/}}
* {{rima|ms|lam|am}}
=== Rujukan ===
* {{R:KD4}}
=== Pautan luar ===
* {{R:PRPM}}
[[Kategori:ms:Islam| ]]
[[Kategori:ms:Agama]]
== Bahasa Indonesia ==
{{Wikipedia|lang=id}}
=== Takrifan ===
==== Kata nama khas ====
{{head|id|kata nama khas}}
# agama Islam
=== Etimologi ===
Daripada {{bor|id|ar|إِسْلَام||}}.
=== Pautan luar ===
* {{R:KBBI Daring}}
[[Kategori:id:Islam| ]]
[[Kategori:id:Agama]]
== Bahasa Inggeris ==
{{Wikipedia|lang=en}}
=== Takrifan ===
==== Kata nama khas ====
{{head|en|kata nama khas}}
# agama Islam
=== Etimologi ===
Daripada {{bor|en|ar|إِسْلَام||}}.
=== Sebutan ===
* {{IPA|en|/ɪsˈlɑːm/|/ɪzˈlɑːm/|/ˈɪs.lɑːm/|/ˈɪz.lɑːm/}}, atau dengan {{IPAchar|/-læːm/|lang=en}}
** {{audio|en|LL-Q1860 (eng)-Vealhurl-Islam.wav|Audio (UK)}}
* {{rhymes|en|ɑːm|æm}}
[[Kategori:en:Islam| ]]
[[Kategori:en:Agama]]
fr91z8yqimsy1g75yhsl84cqj6rq3u6
Kategori:Lema bahasa Turki Usmaniyah
14
18020
281328
224816
2026-04-22T00:39:26Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Lema bahasa Turki Uthmaniyah]] ke [[Kategori:Lema bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
224816
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
बारिश
0
21084
281340
124483
2026-04-22T01:09:12Z
PeaceSeekers
3334
281340
wikitext
text/x-wiki
== Bahasa Hindi ==
{{wikipedia|वर्षा|lang=hi}}
=== Takrifan ===
==== Kata nama ====
{{head|hi|kata nama}}
# [[hujan]]
#: {{syn|hi|बरसात|वर्षा|मेंह}}
=== Etimologi ===
Daripada {{bor|hi|fa-cls|بارش|tr=bāriš}}.
=== Sebutan ===
* {{audio|hi|LL-Q1568 (hin)-AryamanA-बारिश.wav|Audio}}
{{C|hi|Fenomena atmosfera}}
l4le1p5o73gzadm8vz3ftd6wyifnu9g
magnet
0
21555
281243
125052
2026-04-21T13:18:09Z
Countryball mys123
9925
/* Bahasa Melayu */Tambah gambar
281243
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}}
[[File:Bar magnet crop.jpg|thumb|Magnet]]
=== Takrifan ===
==== Kata nama ====
{{ms-kn|j=مݢنيت}}
# Suatu [[besi]] yang berupaya menarik besi lain ke arahnya.
# {{lb|ms|kiasan}} Suatu benda yang menarik perhatian atau perkara.
=== Etimologi ===
Pinjaman {{bor|ms|en|magnet}}.
=== Sebutan ===
* {{dewan|mag|nét}}
=== Pautan luar ===
* {{R:PRPM}}
{{C|ms|Keelektromagnetan}}
== Bahasa Inggeris ==
{{Wikipedia|lang=en}}
=== Takrifan ===
==== Kata nama ====
{{en-kn}}
# Suatu [[besi]] yang berupaya menarik besi lain ke arahnya.
# {{lb|en|kiasan}} Suatu benda yang menarik perhatian atau perkara.
=== Etimologi ===
Daripada {{inh|en|enm|magnete}} melalui {{der|en|fro|magnete}}, {{der|en|la|magnēs|magnēs, magnētem|t=}}, daripada {{der|en|grc||[[μαγνῆτις]] [λίθος]|t=Batu Magnesia}}, sama ada sempena kota Magnesia ad Sipylum (kini Manisa, [[Turki]]) atau bandar Yunani {{m|grc|Μαγνησία}}. Berkait dengan {{m|en|manganese}}, {{m|en|magnesia}} dan {{m|en|magnesium}}.
=== Sebutan ===
* {{a|GA}} {{IPA|en|/ˈmæɡnɪt/}}
* {{a|RP}} {{IPA|en|/ˈmæɡnət/}}
* {{audio|en|LL-Q1860 (eng)-Vealhurl-magnet.wav|Audio (UK)}}
* {{homophones|en|magnate}} {{qualifier|satu sebutan}}
* {{rhymes|en|ɪt|s=2}}
{{C|en|Keelektromagnetan}}
rgh93g2mmhp2ywnl7k1qw5zzupylkh1
حج
0
22365
281307
126771
2026-04-21T15:46:31Z
Hakimi97
2668
/* Kata kerja */
281307
wikitext
text/x-wiki
== Bahasa Arab ==
=== Takrifan ===
==== Kata nama ====
{{ar-noun|حَجّ|m|pl=-}}
# {{ar-verbal noun of|حَجَّ|form=I}}
# {{lb|ar|agama}} [[ziarah]]
## {{lb|ar|Islam}} [[haji]]
=== Kata kerja ===
{{ar-verb|I/a~u.pass.vn:حَجّ}}
# Membalas hujah dengan bukti dan sebagainya
# [[membuktikan]]; memberikan [[bukti]] tentang sesuatu.
# {{lb|ar|agama}} Melakukan ziarah
## {{lb|ar|Islam}} Melakukan haji
=== Etimologi ===
Daripada {{ar-root|ح|ج|ج|}}. Banding dengan {{cog|he|חַג|tr=ḥaḡ|t=hari menjamu}}, {{cog|syc|ܚܓܐ|tr=ḥaggā|t=jamuan}}, {{cog|syc|ܚܳܓ|tr=ḥāgg|t=mengelilingi}}, {{cog|gez|ሕግ|tr=ḥəgg|t=undang-undang}}, {{cog|gez|ሐገገ|tr=ḥaggaga|t=mewartakan (undang-undang)}}.
=== Sebutan ===
* {{ar-IPA|حَجّ}}
** {{a|Mesir}} {{IPA|arz|/ħaɡɡ/}}
** {{a|Maghribi}} {{IPA|ary|/ħaʒʒ/}}
** {{a|Levant Utara}} {{IPA|apc|/ħaʒʒ/}}
{{C|ar|Haji dan umrah}}
kni3b3nlybg2s9whjkp5macbm27lje8
281309
281307
2026-04-21T15:49:58Z
Hakimi97
2668
/* Etimologi */
281309
wikitext
text/x-wiki
== Bahasa Arab ==
=== Takrifan ===
==== Kata nama ====
{{ar-noun|حَجّ|m|pl=-}}
# {{ar-verbal noun of|حَجَّ|form=I}}
# {{lb|ar|agama}} [[ziarah]]
## {{lb|ar|Islam}} [[haji]]
=== Kata kerja ===
{{ar-verb|I/a~u.pass.vn:حَجّ}}
# Membalas hujah dengan bukti dan sebagainya
# [[membuktikan]]; memberikan [[bukti]] tentang sesuatu.
# {{lb|ar|agama}} Melakukan ziarah
## {{lb|ar|Islam}} Melakukan haji
=== Etimologi ===
Daripada {{ar-root|ح ج ج|}}. Banding dengan {{cog|he|חַג|tr=ḥaḡ|t=hari menjamu}}, {{cog|syc|ܚܓܐ|tr=ḥaggā|t=jamuan}}, {{cog|syc|ܚܳܓ|tr=ḥāgg|t=mengelilingi}}, {{cog|gez|ሕግ|tr=ḥəgg|t=undang-undang}}, {{cog|gez|ሐገገ|tr=ḥaggaga|t=mewartakan (undang-undang)}}.
=== Sebutan ===
* {{ar-IPA|حَجّ}}
** {{a|Mesir}} {{IPA|arz|/ħaɡɡ/}}
** {{a|Maghribi}} {{IPA|ary|/ħaʒʒ/}}
** {{a|Levant Utara}} {{IPA|apc|/ħaʒʒ/}}
{{C|ar|Haji dan umrah}}
om42qtdk0l324tyynokc9k0uiejicw8
Modul:ar-verb
828
22367
281308
266032
2026-04-21T15:48:34Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88683202|88683202]]) (perlu semakan semula untuk terjemahan label)
281308
Scribunto
text/plain
local export = {}
--[=[
This module implements {{ar-conj}} and provides the underlying conjugation functions for {{ar-verb}}
(whose actual formatting is done in [[Module:ar-headword]]).
Author: User:Benwing, from an early version (2013-2014) by User:Atitarev, User:ZxxZxxZ.
]=]
--[=[
TERMINOLOGY:
-- "slot" = A particular combination of tense/mood/person/number/etc.
Example slot names for verbs are "past_1s" (past tense first-person singular), "juss_pass_3fp" (non-past jussive
passive third-person feminine plural) "ap" (active participle). Each slot is filled with zero or more forms.
-- "form" = The conjugated Arabic form representing the value of a given slot.
-- "lemma" = The dictionary form of a given Arabic term. For Arabic, normally the third person masculine singular past,
although other forms may be used if this form is missing (e.g. in passive-only verbs or verbs lacking the past).
]=]
--[=[
FIXME:
1. Finish unimplemented conjugation types. Only IX-final-weak left (extremely rare, possibly only one verb اِعْمَايَ
(according to Haywood and Nahmad p. 244, who are very specific about the irregular occurrence of alif + yā instead
of expected اِعْمَيَّ with doubled yā). Not in Hans Wehr. NOTE: Not true about this, cf. form IX اِرْعَوَى "to desist,
to repent, to see the light". Also note form XII اِخْضَوْضَرَ = form IX اِخْضَرَّ "to be or become green".
[DONE except for اِعْمَايَ]
2. Implement irregular verbs as special cases and recognize them, e.g.
-- laysa "to not be"; only exists in the past tense, no non-past, no imperative, no participles, no passive, no
verbal noun. Irregular alternation las-/lays-. [IMPLEMENTABLE USING OVERRIDES]
-- istaḥā yastaḥī "be ashamed of" -- this is complex according to Hans Wehr because there are two verbs, regular
istaḥyā yastaḥyī "to spare (someone)'s life" and irregular istaḥyā yastaḥyī "to be ashamed to face (someone)",
which is irregular because it has the alternate irregular form istaḥā yastaḥī which only applies to this meaning.
Currently we follow Haywood and Nahmad in saying that both varieties can be spelled istaḥyā/istaḥā/istaḥḥā, but we
should instead use a variant= param similar to حَيَّ to distinguish the two possibilities, and maybe not include
istaḥḥā.
-- ʿayya/ʿayiya yaʿayyu/yaʿyā "to not find the right way, be incapable of, stammer, falter, fall ill". This appears
to be a mixture of a geminate and final-weak verb. Unclear what the whole paradigm looks like. Do the
consonant-ending parts in the past follow the final-weak paradigm? Is it the same in the non-past? Or can you
conjugate the non-past fully as either geminate or final-weak?
-- اِنْمَحَى inmaḥā or اِمَّحَى immaḥā "to be effaced, obliterated; to disappear, vanish" has irregular assimilation of inm-
to imm- as an alternative. inmalasa "to become smooth; to glide; to slip away; to escape" also has immalasa as an
alternative. The only other form VII verbs in Hans Wehr beginning with -m- are inmalaḵa "to be pulled out, torn
out, wrenched" and inmāʿa "to be melted, to melt, to dissolve", which are not listed with imm- alternatives, but
might have them; if so, we should handle this generally. [DONE]
-- يَرَعَ yaraʕa yariʕu "to be a coward, to be chickenhearted" as an alternative form of يَرِعَ yariʕa yayraʕu (as given in
Wehr). [IMPLEMENTABLE USING OVERRIDES]
3. Implement individual override parameters for each paradigm part. See Module:fro-verb for an example of how to do this
generally. Note that {{temp|ar-conj-I}} and other of the older templates already had such individual override params.
[DONE]
Irregular verbs already implemented:
-- [ḥayya/ḥayiya yaḥyā "live" -- behaves like a normal final-weak verb
(e.g. past first singular ḥayītu) except in the past-tense parts with
vowel-initial endings (all the third person except for the third feminine
plural). The normal singular and dual endings have -yiya- in them, which
compresses to -yya-, with the normal endings the less preferred ones.
In masculine third plural, expected ḥayū is replaced by ḥayyū by
analogy to the -yy- parts, and the regular form is not given as an
alternant in John Mace. Barron's 201 verbs appears to have the regular
ḥayū as the part, however. Note also that final -yā appears with tall
alif. This appears to be a spelling convention of Arabic, also applying
in ḥayyā (form II, "to keep (someone) alive") and 'aḥyā (form IV,
"to animate, revive, give birth to, give new life to").] -- implemented
-- [ittaxadha yattaxidhu "take"] -- implemented
-- [sa'ala yas'alu "ask" with alternative jussive/imperative yasal/sal] -- implemented
-- [ra'ā yarā "see"] -- implemented
-- ['arā yurī "show"] -- implemented
-- ['akala ya'kulu "eat" with imperative kul] -- implemented
-- ['axadha ya'xudhu "take" with imperative xudh] -- implemented
-- ['amara ya'muru "order" with imperative mur] -- implemented
--]=]
local force_cat = false -- set to true for debugging
-- if true, always maintain manual translit during processing, and compare against full translit at the end
local debug_translit = false
local lang = require("Module:languages").getByCode("ar")
local m_links = require("Module:links")
local m_string_utilities = require("Module:string utilities")
local m_table = require("Module:table")
local ar_utilities = require("Module:ar-utilities")
local ar_nominals = require("Module:ar-nominals")
local iut = require("Module:inflection utilities")
local put = require("Module:parse utilities")
local pron_qualifier_module = "Module:pron qualifier"
local list_to_text = mw.text.listToText
local rfind = m_string_utilities.find
local rsubn = m_string_utilities.gsub
local rmatch = m_string_utilities.match
local rsplit = m_string_utilities.split
local usub = m_string_utilities.sub
local ulen = m_string_utilities.len
local u = m_string_utilities.char
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local dump = mw.dumpObject
-- Within this module, conjugations are the functions that do the actual
-- conjugating by creating the parts of a basic verb.
-- They are defined further down.
local conjugations = {}
-- hamza variants
local HAMZA = u(0x0621) -- hamza on the line (stand-alone hamza) = ء
local HAMZA_ON_ALIF = u(0x0623)
local HAMZA_ON_W = u(0x0624)
local HAMZA_UNDER_ALIF = u(0x0625)
local HAMZA_ON_Y = u(0x0626)
local HAMZA_ANY = "[" .. HAMZA .. HAMZA_ON_ALIF .. HAMZA_UNDER_ALIF .. HAMZA_ON_W .. HAMZA_ON_Y .. "]"
local HAMZA_PH = u(0xFFF0) -- hamza placeholder
local BAD = u(0xFFF1)
local BORDER = u(0xFFF2)
-- diacritics
local A = u(0x064E) -- fatḥa
local AN = u(0x064B) -- fatḥatān (fatḥa tanwīn)
local U = u(0x064F) -- ḍamma
local UN = u(0x064C) -- ḍammatān (ḍamma tanwīn)
local I = u(0x0650) -- kasra
local IN = u(0x064D) -- kasratān (kasra tanwīn)
local SK = u(0x0652) -- sukūn = no vowel
local SH = u(0x0651) -- šadda = gemination of consonants
local DAGGER_ALIF = u(0x0670)
local DIACRITIC_ANY_BUT_SH = "[" .. A .. I .. U .. AN .. IN .. UN .. SK .. DAGGER_ALIF .. "]"
-- Pattern matching short vowels
local AIU = "[" .. A .. I .. U .. "]"
-- Pattern matching short vowels or sukūn
local AIUSK = "[" .. A .. I .. U .. SK .. "]"
-- Pattern matching any diacritics that may be on a consonant
local DIACRITIC = SH .. "?" .. DIACRITIC_ANY_BUT_SH
-- translit_patterns
local vowels = "aeiouāēīōū"
local NV = "[^" .. vowels .. "]"
local dia = {a = A, i = I, u = U}
local undia = {[A] = "a", [I] = "i", [U] = "u", ["-"] = "-"}
-- various letters and signs
local ALIF = u(0x0627) -- ʾalif = ا
local AMAQ = u(0x0649) -- ʾalif maqṣūra = ى
local AMAD = u(0x0622) -- ʾalif madda = آ
local TAM = u(0x0629) -- tāʾ marbūṭa = ة
local T = u(0x062A) -- tāʾ = ت
local HYPHEN = u(0x0640)
local N = u(0x0646) -- nūn = ن
local W = u(0x0648) -- wāw = و
local Y = u(0x064A) -- yāʾ = ي
local S = "س"
local M = "م"
local LRM = u(0x200e) -- left-to-right mark
-- common combinations
local AH = A .. TAM
local AT = A .. T
local AA = A .. ALIF
local AAMAQ = A .. AMAQ
local AAH = AA .. TAM
local AAT = AA .. T
local II = I .. Y
local UU = U .. W
local AY = A .. Y
local AW = A .. W
local AYSK = AY .. SK
local AWSK = AW .. SK
local NA = N .. A
local NI = N .. I
local AAN = AA .. N
local AANI = AA .. NI
local AYNI = AYSK .. NI
local AWNA = AWSK .. NA
local AYNA = AYSK .. NA
local AYAAT = AY .. AAT
local UNU = "[" .. UN .. U .. "]"
local MA = M .. A
local MU = M .. U
local TA = T .. A
local TU = T .. U
local _I = ALIF .. I
local _U = ALIF .. U
local translit_cache = {
-- hamza variants
[HAMZA] = "ʔ",
[HAMZA_ON_ALIF] = "ʔ",
[HAMZA_ON_W] = "ʔ",
[HAMZA_UNDER_ALIF] = "ʔ",
[HAMZA_ON_Y] = "ʔ",
[HAMZA_PH] = "ʔ",
-- diacritics
[A] = "a",
[AN] = "an",
[U] = "u",
[UN] = "un",
[I] = "i",
[IN] = "in",
[SK] = "",
[SH] = "*", -- handled specially
[DAGGER_ALIF] = "ā",
-- various letters and signs
[""] = "",
[ALIF] = BAD, -- we should never be transliterating ALIF by itself, as its translit in isolation is ambiguous
[AMAQ] = BAD,
[AMAD] = "ʔā",
[TAM] = "",
[T] = "t",
[N] = "n",
[W] = "w",
[Y] = "y",
[S] = "s",
[M] = "m",
[LRM] = "",
-- common combinations
[AH] = "a",
[AT] = "at",
[AA] = "ā",
[AAMAQ] = "ā",
[AAH] = "āh",
[AAT] = "āt",
[II] = "ī",
[UU] = "ū",
[AY] = "ay",
[AW] = "aw",
[AYSK] = "ay",
[AWSK] = "aw",
[NA] = "na",
[NI] = "ni",
[AAN] = "ān",
[AANI] = "āni",
[AYNI] = "ayni",
[AWNA] = "awna",
[AYNA] = "ayna",
[AYAAT] = "ayāt",
[MA] = "ma",
[MU] = "mu",
[TA] = "ta",
[TU] = "tu",
[_I] = "i",
[_U] = "u",
}
local function transliterate(text)
local cached = translit_cache[text]
if cached then
if cached == BAD then
error(("Internal error: Unable to transliterate %s because explicitly marked as BAD"):format(text))
end
return cached
end
local tr = (lang:transliterate(text))
if not tr then
error(("Internal error: Unable to transliterate: %s"):format(text))
end
translit_cache[text] = tr
return tr
end
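--[=[
Illustrative sketch only, not used by this module: the caching scheme of transliterate() above, reduced to plain Lua.
`slow_transliterate` is a hypothetical stand-in for lang:transliterate(), and the cache keys "x" and "" are made up
for the example. The sentinel plays the role of BAD: a key that must never be transliterated in isolation.

```lua
-- Hypothetical stand-in for lang:transliterate(); returns nil on failure.
local function slow_transliterate(text)
	local map = {b = "B", t = "T"}
	return map[text]
end

local BAD = {} -- unique sentinel value
local cache = {x = BAD, [""] = ""}

local function transliterate(text)
	local cached = cache[text]
	if cached ~= nil then
		if cached == BAD then
			error("explicitly marked as BAD: " .. text)
		end
		return cached
	end
	local tr = slow_transliterate(text)
	if not tr then
		error("unable to transliterate: " .. text)
	end
	cache[text] = tr -- later lookups skip slow_transliterate()
	return tr
end
```

After transliterate("b") has been called once, the result sits in `cache` and a second call returns it directly,
while transliterate("x") raises an error because "x" is marked with the sentinel.
]=]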
local all_person_number_list = {
"1s",
"2ms",
"2fs",
"3ms",
"3fs",
"2d",
"3md",
"3fd",
"1p",
"2mp",
"2fp",
"3mp",
"3fp"
}
local function make_person_number_slot_accel_list(list)
local slot_accel_list = {}
return slot_accel_list
end
local imp_person_number_list = {}
for _, pn in ipairs(all_person_number_list) do
if pn:find("^2") then
table.insert(imp_person_number_list, pn)
end
end
local passive_types = m_table.listToSet {
"pass", -- verb has both active and passive
"ipass", -- verb is active with impersonal passive
"nopass", -- verb is active-only
"onlypass", -- verb is passive-only
"onlypass-impers", -- verb itself is impersonal, meaning passive-only with impersonal passive
}
local indicator_flags = m_table.listToSet {
"nopast", "no_nonpast", "noimp",
"nocat", -- don't categorize or include annotations about this; useful in suppletive parts of verbs
"reduced", -- verb has assimilation/reduction of initial coronals
"altgem", -- form X with alternative past geminate forms with final-weak endings
}
export.potential_lemma_slots = {"past_3ms", "past_pass_3ms", "ind_3ms", "ind_pass_3ms", "imp_2ms"}
export.unsettable_slots = {}
for _, potential_lemma_slot in ipairs(export.potential_lemma_slots) do
table.insert(export.unsettable_slots, potential_lemma_slot .. "_linked")
end
-- We don't set the active participle directly for form I because we don't want stative verbs (with past vowel i or u)
-- to default to فَاعِل. Instead we set the special slot 'ap1' and later copy it to 'ap' for non-stative verbs. The user
-- meanwhile can explicitly request the فَاعِل form for active participles for stative verbs using `ap:+`.
table.insert(export.unsettable_slots, "ap1") -- primary default فَاعِل for form I active participles
table.insert(export.unsettable_slots, "ap2") -- secondary default فَعِيل for form I active participles (stative I)
table.insert(export.unsettable_slots, "ap3") -- secondary default فَعِل for form I active participles (stative II)
table.insert(export.unsettable_slots, "apcd") -- secondary default أَفْعَل for form I active participles (color/defect)
table.insert(export.unsettable_slots, "apan") -- secondary default فَعْلَان for form I active participles (in -ān)
table.insert(export.unsettable_slots, "pp2") -- secondary default فَعِيل for form I passive participles (same as ap2)
table.insert(export.unsettable_slots, "vn2") -- secondary default فِعَال for form III verbal nouns
export.unsettable_slots_set = m_table.listToSet(export.unsettable_slots)
local default_indicator_to_active_participle_slot = {
["+"] = "ap1",
["++"] = "ap2",
["+++"] = "ap3",
["+cd"] = "apcd",
["+an"] = "apan",
}
local slots_that_may_be_uncertain = {
vn = "verbal noun",
ap = "active participle",
}
-- Initialize all the slots for which we generate forms.
local function add_slots(alternant_multiword_spec)
alternant_multiword_spec.verb_slots = {
{"ap", "act|part"},
{"pp", "pass|part"},
{"vn", "vnoun"},
}
for _, unsettable_slot in ipairs(export.unsettable_slots) do
table.insert(alternant_multiword_spec.verb_slots, {unsettable_slot, "-"})
end
-- Add entries for a slot with person/number variants.
-- `slot_prefix` is the prefix of the slot, typically specifying the tense/aspect.
-- `tag_suffix` is a string listing the set of inflection tags to add after the person/number tags.
-- `person_number_list` is a list of the person/number slot suffixes to add to `slot_prefix`.
local function add_personal_slot(slot_prefix, tag_suffix, person_number_list)
for _, persnum in ipairs(person_number_list) do
local slot = slot_prefix .. "_" .. persnum
local accel = persnum:gsub("(.)", "%1|") .. tag_suffix
table.insert(alternant_multiword_spec.verb_slots, {slot, accel})
end
end
local tenses = {
{"past", "past|%s"},
{"ind", "non-past|%s|ind"},
{"sub", "non-past|%s|sub"},
{"juss", "non-past|%s|juss"},
}
for _, slot_accel in ipairs(tenses) do
local slot, accel = unpack(slot_accel)
for _, voice in ipairs {"act", "pass"} do
add_personal_slot(voice == "act" and slot or slot .. "_pass", accel:format(voice),
all_person_number_list)
end
end
add_personal_slot("imp", "imp", imp_person_number_list)
alternant_multiword_spec.verb_slots_map = {}
for _, slot_accel in ipairs(alternant_multiword_spec.verb_slots) do
local slot, accel = unpack(slot_accel)
alternant_multiword_spec.verb_slots_map[slot] = accel
end
end
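--[=[
Illustrative sketch only, not used by this module: how add_personal_slot() above appears to derive a slot name and
its accelerator tag string from a tense prefix, a voice and a person/number suffix. The helper names `make_slot`
and `make_accel` are hypothetical.

```lua
local function make_slot(prefix, voice, persnum)
	-- Active slots use the bare prefix; passive slots insert "_pass".
	return (voice == "act" and prefix or prefix .. "_pass") .. "_" .. persnum
end

local function make_accel(persnum, tag_template, voice)
	-- Each character of the person/number suffix becomes its own tag.
	-- The extra parentheses keep only the first return value of gsub().
	return (persnum:gsub("(.)", "%1|")) .. tag_template:format(voice)
end

local slot = make_slot("juss", "pass", "3fp")
local accel = make_accel("3fp", "non-past|%s|juss", "pass")
```

Here `slot` is "juss_pass_3fp", the slot name used as an example in the TERMINOLOGY comment at the top, and `accel`
is "3|f|p|non-past|pass|juss".
]=]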
local overridable_stems = {}
local slot_override_param_mods = {
footnote = {
item_dest = "footnotes",
store = "insert",
},
alt = {},
t = {
-- [[Module:links]] expects the gloss in "gloss".
item_dest = "gloss",
},
gloss = {},
g = {
-- [[Module:links]] expects the genders in "g". `sublist = true` automatically splits on comma (optionally
-- with surrounding whitespace).
item_dest = "genders",
sublist = true,
},
pos = {},
lit = {},
id = {},
-- Qualifiers and labels
q = {
type = "qualifier",
},
qq = {
type = "qualifier",
},
l = {
type = "labels",
},
ll = {
type = "labels",
},
}
local function generate_obj(formval, parse_err, prefix, is_slot_override)
local val, uncertain = formval:match("^(.*)(%?)$")
val = val or formval
uncertain = not not uncertain
local ar, translit = val:match("^(.*)//(.*)$")
if not ar then
ar = val
end
if ar == "" then
if uncertain then
ar = "?"
else
error(("Can't specify blank value for %s override '%s'"):format(
is_slot_override and "slot" or "stem", prefix))
end
end
return {form = ar, translit = translit, uncertain = uncertain}
end
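--[=[
Illustrative sketch only, not used by this module: the string parsing done by generate_obj() above, without the
error handling. An override value has the shape FORM, FORM//TRANSLIT, or either of these followed by "?" to mark
the form as uncertain. The function name `parse_override` and the sample values are hypothetical.

```lua
local function parse_override(formval)
	-- Strip a trailing "?" (uncertainty marker), if any.
	local val, mark = formval:match("^(.*)(%?)$")
	val = val or formval
	-- Split off manual transliteration after "//", if any.
	local form, translit = val:match("^(.*)//(.*)$")
	return {form = form or val, translit = translit, uncertain = mark ~= nil}
end

local plain = parse_override("abc")
local full = parse_override("abc//xyz?")
```

`plain` has form "abc", no translit and uncertain = false; `full` has form "abc", translit "xyz" and
uncertain = true.
]=]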
local function parse_inline_modifiers(comma_separated_group, parse_err, prefix, is_slot_override)
local function this_generate_obj(formval, parse_err)
return generate_obj(formval, parse_err, prefix, is_slot_override)
end
return put.parse_inline_modifiers_from_segments {
group = comma_separated_group,
props = {
param_mods = slot_override_param_mods,
parse_err = parse_err,
generate_obj = this_generate_obj,
pre_normalize_modifiers = function(data)
local modtext = data.modtext
modtext = modtext:match("^(%[.*%])$")
if modtext then
return ("<footnote:%s>"):format(modtext)
end
return data.modtext
end,
},
}
end
local function allow_multiple_values_for_override(comma_separated_groups, data, is_slot_override)
local retvals = {}
for _, comma_separated_group in ipairs(comma_separated_groups) do
local retval
if is_slot_override then
retval = parse_inline_modifiers(comma_separated_group, data.parse_err, data.prefix, is_slot_override)
else
retval = generate_obj(comma_separated_group[1], data.parse_err, data.prefix, is_slot_override)
retval.footnotes = data.fetch_footnotes(comma_separated_group)
end
table.insert(retvals, retval)
end
for _, form in ipairs(retvals) do
if form.form == "+" or default_indicator_to_active_participle_slot[form.form] then
if form.form ~= "+" and default_indicator_to_active_participle_slot[form.form] and not is_slot_override then
error(("Stem override '%s' cannot use %s to request a secondary default"):format(
data.prefix, form.form))
end
data.base.slot_override_uses_default[data.prefix] = true
end
end
for _, form in ipairs(retvals) do
if form.form == "-" then
data.base.slot_explicitly_missing[data.prefix] = true
break
end
end
if data.base.slot_explicitly_missing[data.prefix] then
for _, form in ipairs(retvals) do
if form.form ~= "-" then
data.parse_err(("For slot or stem '%s', saw both - and a value other than -, which isn't allowed"):
format(data.prefix))
end
end
return nil
end
return retvals
end
local function simple_choice(choices)
return function(separated_groups, data)
if #separated_groups > 1 then
data.parse_err("For spec '" .. data.prefix .. ":', only one value currently allowed")
end
if #separated_groups[1] > 1 then
data.parse_err("For spec '" .. data.prefix .. ":', no footnotes currently allowed")
end
local choice = separated_groups[1][1]
if not m_table.contains(choices, choice) then
data.parse_err("For spec '" .. data.prefix .. ":', saw value '" .. choice .. "' but expected one of '" ..
table.concat(choices, ",") .. "'")
end
return choice
end
end
for _, overridable_stem in ipairs {
"past",
"past_v",
"past_c",
"past_pass",
"past_pass_v",
"past_pass_c",
"nonpast",
"nonpast_v",
"nonpast_c",
"nonpast_pass",
"nonpast_pass_v",
"nonpast_pass_c",
"imp",
"imp_v",
"imp_c",
} do
overridable_stems[overridable_stem] = allow_multiple_values_for_override
end
overridable_stems.past_final_weak_vowel = simple_choice { "ay", "aw", "ī", "ū" }
overridable_stems.past_pass_final_weak_vowel = simple_choice { "ay", "aw", "ī", "ū" }
overridable_stems.nonpast_final_weak_vowel = simple_choice { "ā", "ī", "ū" }
overridable_stems.nonpast_pass_final_weak_vowel = simple_choice { "ā", "ī", "ū" }
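--[=[
Illustrative sketch only, not used by this module: the closure pattern behind simple_choice() above. A validator
is built once from a list of allowed values and then applied to each user-supplied value. This simplified version
takes the value directly instead of the separated groups and `data` object used above; the name
`make_choice_checker` is hypothetical.

```lua
local function make_choice_checker(choices)
	local allowed = {}
	for _, choice in ipairs(choices) do
		allowed[choice] = true
	end
	return function(value)
		if not allowed[value] then
			error("saw value '" .. value .. "' but expected one of '" ..
				table.concat(choices, ",") .. "'")
		end
		return value
	end
end

local check_vowel = make_choice_checker {"ay", "aw"}
```

check_vowel("ay") returns "ay", whereas check_vowel("uu") raises an error that lists the allowed values.
]=]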
-------------------------------------------------------------------------------
-- Utility functions --
-------------------------------------------------------------------------------
-- version of rsubn() that discards all but the first return value
local function rsub(term, foo, bar)
return (rsubn(term, foo, bar))
end
-- version of rsubn() that returns a second value: a boolean indicating whether a substitution was made.
local function rsubb(term, foo, bar)
local retval, nsubs = rsubn(term, foo, bar)
return retval, nsubs > 0
end
-- Concatenate one or more strings or form objects.
local function q(...)
local not_all_strings = debug_translit
local has_manual_translit = debug_translit
for i = 1, select("#", ...) do
local argt = select(i, ...)
if not argt then
error(("Internal error: Saw nil at index %s: %s"):format(i, dump({...})))
end
if type(argt) ~= "string" then
not_all_strings = true
if argt.translit then
has_manual_translit = true
break
end
end
end
if not not_all_strings then
-- just strings, concatenate directly
return table.concat({...})
end
local formvals = {}
local translit = has_manual_translit and {} or nil
local footnotes
for i = 1, select("#", ...) do
local argt = select(i, ...)
if type(argt) == "string" then
formvals[i] = argt
if has_manual_translit then
translit[i] = transliterate(argt)
end
else
formvals[i] = argt.form
if has_manual_translit then
translit[i] = argt.translit or transliterate(argt.form)
end
footnotes = iut.combine_footnotes(footnotes, argt.footnotes)
end
end
-- FIXME: Do we want to support other properties?
return {
form = table.concat(formvals),
translit = has_manual_translit and table.concat(translit) or nil,
footnotes = footnotes,
}
end
-- Return the formval associated with `rad` (a radical or past/non-past vowel, either a string or form object).
local function rget(rad)
if type(rad) == "string" then
return rad
elseif type(rad) == "table" then
return rad.form
else
error(("Internal error: Unexpected type for radical or past/non-past vowel: %s"):format(dump(rad)))
end
end
export.rget = rget -- for use in [[Module:ar-headword]]
-- Return the footnotes associated with `rad` (a radical or past/non-past vowel, either a string or form object).
local function rget_footnotes(rad)
if type(rad) == "string" then
return nil
elseif type(rad) == "table" then
return rad.footnotes
else
error(("Internal error: Unexpected type for radical or past/non-past vowel: %s"):format(dump(rad)))
end
end
-- Return true if the formval associated with `rad` (a radical or past/non-past vowel, either a string or form object)
-- is `val`.
local function req(rad, val)
return rget(rad) == val
end
-- Map `vow` (a past/non-past vowel, either a string or form object without translit) by passing the formval through
-- `fn`. Don't call this on radicals because they may have manual translit and it isn't clear how to handle that.
local function map_vowel(vow, fn)
if type(vow) == "string" then
return fn(vow)
elseif type(vow) == "table" then
return {form = fn(vow.form), footnotes = vow.footnotes}
else
error(("Internal error: Unexpected type for past/non-past vowel: %s"):format(dump(vow)))
end
end
local function get_radicals_3(vowel_spec)
return vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3, vowel_spec.past, vowel_spec.nonpast
end
local function get_radicals_4(vowel_spec)
return vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3, vowel_spec.rad4
end
local function is_final_weak(base, vowel_spec)
return vowel_spec.weakness == "final-weak" or base.form == "XV"
end
local function link_term(text, face, id)
return m_links.full_link({lang = lang, term = text, tr = "-", id = id}, face)
end
local function tag_text(text, tag, class)
return m_links.full_link({lang = lang, alt = text, tr = "-"})
end
local function track(page)
require("Module:debug/track")("ar-verb/" .. page)
return true
end
local function track_if_ar_conj(base, page)
if base.alternant_multiword_spec.source_template == "ar-conj" then
require("Module:debug/track")("ar-verb/" .. page)
end
return true
end
local function reorder_shadda(word)
-- shadda+short-vowel (including tanwīn vowels, i.e. -an -in -un) gets
-- replaced with short-vowel+shadda during NFC normalisation, which
-- MediaWiki does for all Unicode strings; however, it makes various
-- processes inconvenient, so undo it.
word = rsub(word, "(" .. DIACRITIC_ANY_BUT_SH .. ")" .. SH, SH .. "%1")
return word
end
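--[=[
Illustrative sketch only, not used by this module: the reordering done by reorder_shadda() above, written with
plain byte-oriented string.gsub instead of the Unicode-aware helper. It assumes UTF-8 input. All the diacritics
involved encode as the lead byte 0xD9 followed by one trail byte, so a byte-level class is enough here.

```lua
local SH = "\217\145" -- U+0651 shadda
local A = "\217\142"  -- U+064E fatha

local function reorder_shadda_bytes(word)
	-- Trail bytes 139-144 and 146 cover U+064B..U+0650 and U+0652;
	-- trail byte 176 covers U+0670 (dagger alif).
	return (word:gsub("(\217[\139-\144\146\176])" .. SH, SH .. "%1"))
end

local reordered = reorder_shadda_bytes("x" .. A .. SH)
```

`reordered` is "x" followed by shadda and then fatha; a word that already has shadda first is returned unchanged.
]=]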
-------------------------------------------------------------------------------
-- Basic functions to inflect tenses --
-------------------------------------------------------------------------------
local function skip_slot(base, slot, allow_overrides)
if base.slot_explicitly_missing[slot] then
return true
end
if not allow_overrides and base.slot_overrides[slot] and not base.slot_override_uses_default[slot] then
-- Skip any slots for which there are overrides, except those that request the default value using +, ++, etc.
return true
end
if base.passive == "nopass" and (slot == "pp" or slot:find("_pass")) then
return true
elseif base.passive == "onlypass" and slot ~= "pp" and slot ~= "vn" and not slot:find("_pass") then
return true
elseif base.passive == "ipass" and slot:find("_pass") and not slot:find("3ms") then
return true
elseif base.passive == "onlypass-impers" and slot ~= "pp" and slot ~= "vn" and (not slot:find("_pass") or
slot:find("_pass") and not slot:find("3ms")) then
return true
end
if base.nopast and slot:find("^past_") then
return true
end
if base.noimp and slot:find("^imp_") then
return true
end
if base.no_nonpast and (slot:find("^ind_") or slot:find("^sub_") or slot:find("^juss")) then
return true
end
return false
end
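--[=[
Illustrative sketch only, not used by this module: the voice-related part of skip_slot() above, isolated so the
effect of each passive type is easier to see. Overrides and the nopast/noimp/no_nonpast flags are left out, and
the function name `skipped_by_voice` is hypothetical.

```lua
local function skipped_by_voice(passive, slot)
	local is_pass = slot:find("_pass") ~= nil
	local is_3ms = slot:find("3ms") ~= nil
	if passive == "nopass" then
		-- Active-only verb: no passive participle, no passive forms.
		return slot == "pp" or is_pass
	elseif passive == "onlypass" then
		-- Passive-only verb: keep pp, vn and all passive forms.
		return slot ~= "pp" and slot ~= "vn" and not is_pass
	elseif passive == "ipass" then
		-- Impersonal passive: of the passive forms, keep only 3ms.
		return is_pass and not is_3ms
	elseif passive == "onlypass-impers" then
		-- Keep only pp, vn and the 3ms passive forms.
		return slot ~= "pp" and slot ~= "vn" and (not is_pass or not is_3ms)
	end
	return false
end
```

For example, with passive type "ipass" the slot "past_pass_1s" is skipped but "past_pass_3ms" is kept, and with
"pass" nothing is skipped on voice grounds.
]=]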
local function basic_combine_stem_ending(stem, ending)
return stem .. ending
end
local function basic_combine_stem_ending_tr(stem, ending)
return stem .. ending
end
-- Concatenate `prefixes`, `stems` and `endings` (any of which may be an abbreviated form list, i.e. strings, form
-- objects or lists of strings or form objects) and store into `slot`. If a user-supplied override exists for the slot,
-- nothing will happen unless `allow_overrides` is provided.
local function add3(base, slot, prefixes, stems, endings, allow_overrides)
if skip_slot(base, slot, allow_overrides) then
return
end
-- Optimization since the prefixes are almost always single strings.
if type(prefixes) == "string" then
local function do_combine_stem_ending(stem, ending)
return prefixes .. stem .. ending
end
local function do_combine_stem_ending_tr(stem, ending)
return transliterate(prefixes) .. stem .. ending
end
iut.add_forms(base.forms, slot, stems, endings, do_combine_stem_ending, transliterate,
do_combine_stem_ending_tr, base.form_footnotes)
else
iut.add_multiple_forms(base.forms, slot, {prefixes, stems, endings}, basic_combine_stem_ending, transliterate,
basic_combine_stem_ending_tr, base.form_footnotes)
end
end
-- Insert one or more forms in `form_or_forms` into `slot`. `form_or_forms` is an abbreviated form list (see comment at
-- top of [[Module:inflection utilities]]). If a user-supplied override exists for the slot, nothing will happen unless
-- `allow_overrides` is provided. BEWARE: One form object should never occur in two different slots, or twice in a given
-- slot; if taking a form object from an existing slot, make sure to shallowCopy() it.
local function insert_form_or_forms(base, slot, form_or_forms, allow_overrides, uncertain)
if not skip_slot(base, slot, allow_overrides) then
-- Some optimizations of the most common case of inserting a single string.
if type(form_or_forms) == "string" and not base.form_footnotes then
form_or_forms = {form = form_or_forms, uncertain = uncertain}
iut.insert_form(base.forms, slot, form_or_forms)
else
local list = iut.convert_to_general_list_form(form_or_forms, base.form_footnotes)
if uncertain then
for _, formobj in ipairs(list) do
formobj.uncertain = true
end
end
iut.insert_forms(base.forms, slot, list)
end
end
end
-- Insert `string_or_form` into both the ap2 and pp2 slots, shallowCopying a form object to make sure no form objects
-- occur in two slots.
local function insert_ap2_pp2(base, string_or_form)
insert_form_or_forms(base, "ap2", string_or_form)
if type(string_or_form) == "table" then
string_or_form = m_table.shallowCopy(string_or_form)
end
insert_form_or_forms(base, "pp2", string_or_form)
end
-- Convert `stemforms` (a string, a form object, or a list of strings and/or form objects) into "general form" (a list
-- of form objects) and map `fn` over the list of objects. `fn` is passed two arguments (form value and translit) and
-- should likewise return the new form value and translit. Footnotes will be preserved. FIXME: Preserve other metadata.
local function map_general(stemforms, fn)
return iut.map_forms(iut.convert_to_general_list_form(stemforms), fn)
end
-- Similar to map_general() except that `fn` should return a single value (one or more strings or form objects), instead
-- of two values (form value and translit), and the resulting value(s) from all calls to `fn` will be flattened to
-- construct the overall return value. Footnotes will be preserved. FIXME: Preserve other metadata.
local function flatmap_general(stemforms, fn)
return iut.flatmap_forms(iut.convert_to_general_list_form(stemforms), fn)
end
-- Given user-supplied stem overrides in `base`, construct any derived stem overrides (e.g. vowel-specific or
-- consonant-specific variants), and truncate initial y-/ي- in any non-past overrides.
local function construct_stems(base)
local stems = base.stem_overrides
stems.past_v = stems.past_v or stems.past
stems.past_c = stems.past_c or stems.past
stems.past_pass_v = stems.past_pass_v or stems.past_pass
stems.past_pass_c = stems.past_pass_c or stems.past_pass
stems.nonpast_v = stems.nonpast_v or stems.nonpast
stems.nonpast_c = stems.nonpast_c or stems.nonpast
stems.nonpast_pass_v = stems.nonpast_pass_v or stems.nonpast_pass
stems.nonpast_pass_c = stems.nonpast_pass_c or stems.nonpast_pass
stems.imp_v = stems.imp_v or stems.imp
stems.imp_c = stems.imp_c or stems.imp
local function truncate_nonpast_initial_cons(stem_type, form, translit)
if form == "+" then
return form, translit
end
if not form:find("^" .. Y) then
error(("Form value %s for stem type '%s' should begin with ي"):format(form, stem_type))
end
form = form:gsub("^" .. Y, "")
if translit then
if not translit:find("^y") then
error(("Translit value %s for stem type '%s' should begin with y"):format(translit, stem_type))
end
translit = translit:gsub("^y", "")
end
return form, translit
end
for _, nonpast_stem_type in ipairs { "nonpast_v", "nonpast_c", "nonpast_pass_v", "nonpast_pass_c" } do
if stems[nonpast_stem_type] then
stems[nonpast_stem_type] = map_general(stems[nonpast_stem_type], function(form, translit)
return truncate_nonpast_initial_cons(nonpast_stem_type, form, translit)
end)
end
end
end
-- Given user-specified overrides for stem `stemname`, return overrides with occurrences of + replaced by
-- `default_stem`. If no overrides, return `default_stem`, or {} if no default.
local function override_stem_if_needed(base, stemname, default_stem)
local overrides = base.stem_overrides[stemname]
if not overrides then
return default_stem or {}
end
return map_general(overrides, function(form, translit)
if form ~= "+" and default_indicator_to_active_participle_slot[form] then
error(("Stem overrides cannot use secondary default indicators but saw %s in stem override '%s'"):format(
form, stemname))
end
if form == "+" then
if translit then
error(("Cannot supply manual translit along with + for stem override '%s'"):format(stemname))
end
if not default_stem then
error(("Cannot use + for stem override '%s' because no default is available"):format(stemname))
end
if type(default_stem) ~= "string" then
error(("Internal error: Default stem for '%s' is not a string: %s"):format(stemname, dump(default_stem)))
end
return default_stem
end
return form, translit
end)
end
-------------------------------------------------------------------------------
-- Properties of different verbal forms --
-------------------------------------------------------------------------------
local allowed_vforms = {"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX",
"X", "XI", "XII", "XIII", "XIV", "XV", "Iq", "IIq", "IIIq", "IVq"}
local allowed_vforms_set = m_table.listToSet(allowed_vforms)
local allowed_vforms_with_weakness = m_table.shallowCopy(allowed_vforms)
-- The user needs to be able to explicitly specify that a form-I verb (specifically one whose initial radical is و) is
-- sound. Cf. wajiʕa yawjaʕu (not #yajaʕu) "to ache, to hurt". In general, i~a and u~u verbs whose initial radical is و
-- seem to not assimilate the first radical; cf. وقح "to be shameless", variously waqaḥa~yaqiḥu, waquḥa~yawquḥu and
-- waqiḥa~yawqaḥu, whereas a~i verbs (wafaḍa~yafiḍu "to rush"), i~i verbs (wafiqa~yafiqu "to be proper, to be suitable")
-- and a~a verbs (waḍaʕa~yaḍaʕu "to set down, to place") do assimilate. But there are naturally exceptions, e.g.
-- waṭiʔa~yaṭaʔu "to tread, to trample"; wasiʕa~yasaʕu "to be spacious; to be well-off"; waṯiʔa~yaṯaʔu "to get bruised,
-- to be sprained". Also beware of waniya~yawnā "to be faint; to languish", which is sound in the first radical and
-- final-weak in the last radical. Nonetheless, the regularity of the patterns mentioned above suggests we should provide
-- them as defaults.
-- Note that there are other cases of unexpectedly sound verbs, e.g. izdawaja~yazdawiju "to be in pairs", layisa~yalyasu
-- "to be valiant, to be brave", ʔaḥwaja~yuḥwiju "to need", istahwana~yastahwinu "to consider easy", sawisa~yaswasu "to
-- be or become moth-eaten or worm-eaten" (vs. sāsa~yasūsu "to govern, to rule" from the same radicals), ʕawira~yaʕwaru
-- "to be one-eyed", istajwaba~yastajwibu "to interrogate", etc. But in these cases there is no need for explicit user
-- specification as the lemma itself specifies the unexpected soundness.
for _, form_with_weakness in ipairs { "I-sound", "I-assimilated", "none-sound", "none-hollow", "none-geminate",
"none-final-weak" } do
table.insert(allowed_vforms_with_weakness, form_with_weakness)
end
local allowed_vforms_with_weakness_set = m_table.listToSet(allowed_vforms_with_weakness)
local function vform_supports_final_weak(vform)
return vform ~= "XI" and vform ~= "XV" and vform ~= "IVq"
end
local function vform_supports_geminate(vform)
return vform == "I" or vform == "III" or vform == "IV" or vform == "VI" or vform == "VII" or vform == "VIII" or
vform == "X"
end
local function vform_supports_hollow(vform)
return vform == "I" or vform == "IV" or vform == "VII" or vform == "VIII" or vform == "X"
end
local function vform_probably_impersonal_passive(vform, weakness, past_vowel, nonpast_vowel)
return vform == "I" and req(past_vowel, I) or vform == "V" or vform == "VI" or vform == "X" or vform == "IIq"
end
local function vform_probably_full_passive(vform)
return vform == "II" or vform == "III" or vform == "IV" or vform == "Iq"
end
local function vform_probably_no_passive(vform, weakness, past_vowel, nonpast_vowel)
return vform == "I" and req(past_vowel, U) or vform == "VII" or vform == "IX" or
vform == "XI" or vform == "XII" or vform == "XIII" or vform == "XIV" or vform == "XV" or
vform == "IIIq" or vform == "IVq"
end
-- Active vforms II, III, IV, Iq use non-past prefixes in -u- instead of -a-.
local function prefix_vowel_from_vform(vform)
if vform == "II" or vform == "III" or vform == "IV" or vform == "Iq" then
return "u"
else
return "a"
end
end
-- True if the active non-past takes a-vocalization rather than i-vocalization in its last syllable.
local function vform_nonpast_a_vowel(vform)
return vform == "V" or vform == "VI" or vform == "XV" or vform == "IIq"
end
-- True if the `passive` spec indicates a passive-only verb.
local function is_passive_only(passive)
return passive == "onlypass" or passive == "onlypass-impers"
end
export.is_passive_only = is_passive_only -- for use in [[Module:ar-headword]]
-------------------------------------------------------------------------------
-- Properties of specific sounds --
-------------------------------------------------------------------------------
-- Is radical wāw (و) or yāʾ (ي)?
local function is_waw_ya(rad)
return req(rad, W) or req(rad, Y)
end
-- Check that radical is wāw (و) or yāʾ (ي), error if not
local function check_waw_ya(rad)
if not is_waw_ya(rad) then
error("Expecting weak radical: '" .. rget(rad) .. "' should be " .. W .. " or " .. Y)
end
end
-- Form-I verb حيّ or حيي and form-X verb استحيا or استحى
local function hayy_radicals(rad1, rad2, rad3)
return req(rad1, "ح") and req(rad2, Y) and is_waw_ya(rad3)
end
-- Everything below is wrapped in a function to work around the Lua limit reported as "Lua error at line 1514: main
-- function has more than 200 local variables".
local function create_conjugations()
-------------------------------------------------------------------------------
-- Radicals associated with various irregular verbs --
-------------------------------------------------------------------------------
-- Form-I verb أخذ or form-VIII verb اتخذ
local function axadh_radicals(rad1, rad2, rad3)
return req(rad1, HAMZA) and req(rad2, "خ") and req(rad3, "ذ")
end
-- Form-I verb whose imperative has a reduced form: أكل and أخذ and أمر. Return "shortonly" if only
-- short-form imperatives exist (أكل and أخذ) or "shortlong" if long-form imperatives also exist (أمر);
-- the long forms are used after a clitic like فَ and وَ. Return a false value for any other verb.
local function reduced_imperative_verb(rad1, rad2, rad3)
return axadh_radicals(rad1, rad2, rad3) and "shortonly" or
req(rad1, HAMZA) and req(rad2, "ك") and req(rad3, "ل") and "shortonly" or
req(rad1, HAMZA) and req(rad2, "م") and req(rad3, "ر") and "shortlong"
end
-- Form-I verb رأى and form-IV verb أرى
local function raa_radicals(rad1, rad2, rad3)
return req(rad1, "ر") and req(rad2, HAMZA) and is_waw_ya(rad3)
end
-- Form-I verb سأل
local function saal_radicals(rad1, rad2, rad3)
return req(rad1, "س") and req(rad2, HAMZA) and req(rad3, "ل")
end
-- Form-I verb كان
local function kaan_radicals(rad1, rad2, rad3)
return req(rad1, "ك") and req(rad2, W) and req(rad3, N)
end
-------------------------------------------------------------------------------
-- Sets of past endings --
-------------------------------------------------------------------------------
-- The 13 endings of the sound/hollow/geminate past tense.
local past_endings = {
-- singular
SK .. TU, SK .. TA, SK .. "تِ", A, A .. "تْ",
--dual
SK .. "تُمَا", AA, A .. "تَا",
-- plural
SK .. "نَا", SK .. "تُمْ",
-- shadda + vowel diacritic ends up in the wrong order due to Unicode
-- bug, so keep them separate to avoid this
SK .. "تُن" .. SH .. A, UU .. ALIF, SK .. "نَ"
}
-- Make endings for final-weak past in -aytu or -awtu. AYAW is AY or AW as appropriate; `third_sg_masc` is the ending
-- used for the third-person masculine singular. Note that AA and AW are global variables.
local function make_past_endings_ay_aw(ayaw, third_sg_masc)
return {
-- singular
ayaw .. SK .. TU, ayaw .. SK .. TA, ayaw .. SK .. "تِ",
third_sg_masc, A .. "تْ",
--dual
ayaw .. SK .. "تُمَا", ayaw .. AA, A .. "تَا",
-- plural
ayaw .. SK .. "نَا", ayaw .. SK .. "تُمْ",
-- shadda + vowel diacritic ends up in the wrong order due to Unicode
-- bug, so keep them separate to avoid this
ayaw .. SK .. "تُن" .. SH .. A, AW .. SK .. ALIF, ayaw .. SK .. "نَ"
}
end
-- past final-weak -aytu endings
local past_endings_ay = make_past_endings_ay_aw(AY, AAMAQ)
-- past final-weak -awtu endings
local past_endings_aw = make_past_endings_ay_aw(AW, AA)
-- used for alternative endings for form-X geminate verbs like اِسْتَمَرَّ
local past_endings_ay_12_person_only = {
-- singular
AY .. SK .. TU, AY .. SK .. TA, AY .. SK .. "تِ",
{}, {},
--dual
AY .. SK .. "تُمَا", {}, {},
-- plural
AY .. SK .. "نَا", AY .. SK .. "تُمْ",
-- shadda + vowel diacritic ends up in the wrong order due to Unicode
-- bug, so keep them separate to avoid this
AY .. SK .. "تُن" .. SH .. A, {}, {},
}
-- Make endings for final-weak past in -ītu or -ūtu. IIUU is ī or ū as appropriate. Note that AA and UU are global
-- variables.
local function make_past_endings_ii_uu(iiuu)
return {
-- singular
iiuu .. TU, iiuu .. TA, iiuu .. "تِ", iiuu .. A, iiuu .. A .. "تْ",
--dual
iiuu .. "تُمَا", iiuu .. AA, iiuu .. A .. "تَا",
-- plural
iiuu .. "نَا", iiuu .. "تُمْ",
-- shadda + vowel diacritic ends up in the wrong order due to Unicode
-- bug, so keep them separate to avoid this
iiuu .. "تُن" .. SH .. A, UU .. ALIF, iiuu .. "نَ"
}
end
-- past final-weak -ītu endings
local past_endings_ii = make_past_endings_ii_uu(II)
-- past final-weak -ūtu endings
local past_endings_uu = make_past_endings_ii_uu(UU)
-------------------------------------------------------------------------------
-- Sets of non-past prefixes and endings --
-------------------------------------------------------------------------------
local nonpast_prefix_consonants = {
-- singular
HAMZA, T, T, Y, T,
-- dual
T, Y, T,
-- plural
N, T, T, Y, Y
}
-- There are only five distinct endings in all non-past verbs. Make any set of non-past endings given these five
-- distinct endings.
local function make_nonpast_endings(null, fem, dual, pl, fempl)
return {
-- singular
null, null, fem, null, null,
-- dual
dual, dual, dual,
-- plural
null, pl, fempl, pl, fempl
}
end
-- endings for non-past indicative
local ind_endings = make_nonpast_endings(
U,
II .. NA,
AANI,
UU .. NA,
SK .. NA
)
-- Make the endings for non-past subjunctive/jussive, given the vowel diacritic used in "null" endings
-- (1s/2ms/3ms/3fs/1p).
local function make_sub_juss_endings(dia_null)
return make_nonpast_endings(
dia_null,
II,
AA,
UU .. ALIF,
SK .. NA
)
end
-- endings for non-past subjunctive
local sub_endings = make_sub_juss_endings(A)
-- endings for non-past jussive
local juss_endings = make_sub_juss_endings(SK)
-- endings for alternative geminate non-past jussive in -a; same as subjunctive
local juss_endings_alt_a = sub_endings
-- endings for alternative geminate non-past jussive in -i
local juss_endings_alt_i = make_sub_juss_endings(I)
-- Endings for final-weak non-past indicative in -ā. Note that AY, AW and AAMAQ are global variables.
local ind_endings_aa = make_nonpast_endings(
AAMAQ,
AYSK .. NA,
AY .. AANI,
AWSK .. NA,
AYSK .. NA
)
-- Make endings for final-weak non-past indicative in -ī or -ū; IIUU is ī or ū as appropriate. Note that II and UU
-- are global variables.
local function make_ind_endings_ii_uu(iiuu)
return make_nonpast_endings(
iiuu,
II .. NA,
iiuu .. AANI,
UU .. NA,
iiuu .. NA
)
end
-- endings for final-weak non-past indicative in -ī
local ind_endings_ii = make_ind_endings_ii_uu(II)
-- endings for final-weak non-past indicative in -ū
local ind_endings_uu = make_ind_endings_ii_uu(UU)
-- Endings for final-weak non-past subjunctive in -ā. Note that AY, AW, ALIF, AAMAQ are global variables.
local sub_endings_aa = make_nonpast_endings(
AAMAQ,
AYSK,
AY .. AA,
AWSK .. ALIF,
AYSK .. NA
)
-- Make endings for final-weak non-past subjunctive in -ī or -ū. IIUU is ī or ū as appropriate. Note that AA, II,
-- UU, ALIF are global variables.
local function make_sub_endings_ii_uu(iiuu)
return make_nonpast_endings(
iiuu .. A,
II,
iiuu .. AA,
UU .. ALIF,
iiuu .. NA
)
end
-- endings for final-weak non-past subjunctive in -ī
local sub_endings_ii = make_sub_endings_ii_uu(II)
-- endings for final-weak non-past subjunctive in -ū
local sub_endings_uu = make_sub_endings_ii_uu(UU)
-- endings for final-weak non-past jussive in -ā
local juss_endings_aa = make_nonpast_endings(
A,
AYSK,
AY .. AA,
AWSK .. ALIF,
AYSK .. NA
)
-- Make endings for final-weak non-past jussive in -ī or -ū. IU is short i or u, IIUU is long ī or ū as appropriate.
-- Note that AA, II, UU, ALIF are global variables.
local function make_juss_endings_ii_uu(iu, iiuu)
return make_nonpast_endings(
iu,
II,
iiuu .. AA,
UU .. ALIF,
iiuu .. NA
)
end
-- endings for final-weak non-past jussive in -ī
local juss_endings_ii = make_juss_endings_ii_uu(I, II)
-- endings for final-weak non-past jussive in -ū
local juss_endings_uu = make_juss_endings_ii_uu(U, UU)
-------------------------------------------------------------------------------
-- Sets of imperative endings --
-------------------------------------------------------------------------------
-- Extract the second person jussive endings to get corresponding imperative endings.
local function imperative_endings_from_jussive(endings)
return {endings[2], endings[3], endings[6], endings[10], endings[11]}
end
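-- Illustration, assuming the 13 positions follow the order used by nonpast_prefix_consonants above
-- (1s, 2ms, 2fs, 3ms, 3fs, 2d, 3md, 3fd, 1p, 2mp, 2fp, 3mp, 3fp; these labels are descriptive and not necessarily
-- the actual slot suffixes). Positions 2, 3, 6, 10 and 11 are the second-person forms, so for the regular jussive
--   juss_endings = {SK, SK, II, SK, SK,  AA, AA, AA,  SK, UU .. ALIF, SK .. NA, UU .. ALIF, SK .. NA}
-- the function returns
--   {SK, II, AA, UU .. ALIF, SK .. NA}   -- 2ms, 2fs, 2d, 2mp, 2fp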
-- normal imperative endings
local imp_endings = imperative_endings_from_jussive(juss_endings)
-- alternative geminate imperative endings in -a
local imp_endings_alt_a = imperative_endings_from_jussive(juss_endings_alt_a)
-- alternative geminate imperative endings in -i
local imp_endings_alt_i = imperative_endings_from_jussive(juss_endings_alt_i)
-- final-weak imperative endings in -ā
local imp_endings_aa = imperative_endings_from_jussive(juss_endings_aa)
-- final-weak imperative endings in -ī
local imp_endings_ii = imperative_endings_from_jussive(juss_endings_ii)
-- final-weak imperative endings in -ū
local imp_endings_uu = imperative_endings_from_jussive(juss_endings_uu)
-------------------------------------------------------------------------------
-- Basic functions to inflect tenses --
-------------------------------------------------------------------------------
-- Add to `base` the inflections for the tense indicated by `tense` (the prefix in the slot names, e.g. 'past'
-- or 'juss_pass'), formed by combining the `prefixes`, `stems` and `endings`. Each of `prefixes`, `stems` and
-- `endings` is either a sequence of 5 (for the imperative) or 13 (for other tenses) abbreviated form lists (each of
-- which is either a string, a form object, or a list of strings and/or form objects; see
-- [[Module:inflection utilities]] for more info). Alternatively, any of `prefixes`, `stems` or `endings` can be a
-- single-element list containing an abbreviated form list, with an additional key `all_same` set to true, or (as a
-- special case) a single string; in the latter two cases, the same value is used for all 5 or 13 slots. If a slot
-- already holds inflections, the new ones are added alongside them rather than replacing them. `pnums` is the list of
-- person/number slot name suffixes, which must match up with the elements in `prefixes`, `stems` and `endings` (i.e. 5
-- for the imperative, 13 otherwise).
local function inflect_tense_1(base, tense, prefixes, stems, endings, pnums)
if not prefixes or not stems or not endings then
return
end
local function verify_affixes(affixname, affixes)
local function interr(msg)
error(("Internal error: For tense '%s', '%s' %s: %s"):format(tense, affixname, msg, dump(affixes)))
end
if type(affixes) == "string" then
-- do nothing
elseif type(affixes) ~= "table" then
interr("is not a table or string")
elseif affixes.all_same then
if #affixes ~= 1 then
interr(("with all_same = true should have length 1 but has length %s"):format(#affixes))
end
else
if #affixes ~= #pnums then
interr(("should have length %s but has length %s"):format(#pnums, #affixes))
end
end
end
verify_affixes("prefixes", prefixes)
verify_affixes("stems", stems)
verify_affixes("endings", endings)
local function get_affix(affixes, i)
if type(affixes) == "string" then
return affixes
elseif affixes.all_same then
return affixes[1]
else
return affixes[i]
end
end
for i, pnum in ipairs(pnums) do
local prefix = get_affix(prefixes, i)
local stem = get_affix(stems, i)
local ending = get_affix(endings, i)
local slot = tense .. "_" .. pnum
add3(base, slot, prefix, stem, ending)
end
end
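-- Illustration of the three affix shapes accepted by inflect_tense_1() (the values are placeholders, not real stems):
--   "foo"                      -- plain string: used for every person/number
--   {"foo", all_same = true}   -- one abbreviated form list shared by every person/number
--   {"f1", "f2", ..., "f13"}   -- one entry per element of `pnums` (5 entries for the imperative)
-- A positional list whose length differs from `#pnums`, or an `all_same` list whose length is not 1, raises an
-- internal error in verify_affixes().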
-- Add to `base` the inflections for the tense indicated by `tense` (the prefix in the slot names, e.g. 'past'
-- or 'juss_pass'), formed by combining the `prefixes`, `stems` and `endings`. This is a simple wrapper around
-- inflect_tense_1() that applies to all tenses other than the imperative; see inflect_tense_1() for more
-- information about the parameters.
local function inflect_tense(base, tense, prefixes, stems, endings)
inflect_tense_1(base, tense, prefixes, stems, endings, all_person_number_list)
end
-- Like inflect_tense() but for the imperative, which has only five parts instead of 13 and no prefixes.
local function inflect_tense_imp(base, stems, endings)
inflect_tense_1(base, "imp", "", stems, endings, imp_person_number_list)
end
-------------------------------------------------------------------------------
-- Functions to inflect the past tense --
-------------------------------------------------------------------------------
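-- Generate past verbs using specified vowel and consonant stems; works for sound, assimilated, hollow, and geminate
-- verbs, active and passive. If `footnote_12` is given, it is attached to the default consonant stem in the first- and
-- second-person forms only (the third-person feminine plural uses the plain consonant stem).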
local function past_2stem_conj(base, tense, v_stem, c_stem, footnote_12)
local passive = tense:find("_pass") and "_pass" or ""
-- Override stems with user-specified stems if available.
v_stem = override_stem_if_needed(base, "past" .. passive .. "_v", v_stem)
local c_stem_12 = c_stem
if footnote_12 then
c_stem_12 = iut.combine_form_and_footnotes(c_stem_12, footnote_12)
end
c_stem_12 = override_stem_if_needed(base, "past" .. passive .. "_c", c_stem_12)
local c_stem_3 = override_stem_if_needed(base, "past" .. passive .. "_c", c_stem)
inflect_tense(base, tense, "", {
-- singular
c_stem_12, c_stem_12, c_stem_12, v_stem, v_stem,
--dual
c_stem_12, v_stem, v_stem,
-- plural
c_stem_12, c_stem_12, c_stem_12, v_stem, c_stem_3
}, past_endings)
end
-- Generate past verbs using single specified stem; works for sound and assimilated verbs, active and passive.
local function past_1stem_conj(base, tense, stem)
past_2stem_conj(base, tense, stem, stem)
end
-------------------------------------------------------------------------------
-- Functions to inflect non-past tenses --
-------------------------------------------------------------------------------
-- Generate non-past conjugation, with two stems, for vowel-initial and consonant-initial endings, respectively.
-- Useful for active and passive; for all forms; for all weaknesses (sound, assimilated, hollow, final-weak and
-- geminate) and for all types of non-past (indicative, subjunctive, jussive) except for the imperative. (There is a
-- separate wrapper function below for geminate jussives because they have three alternants.) Both stems may be the
-- same, e.g. for sound verbs.
-- `prefix_vowel` will be either "a" or "u". `endings` should be an array of 13 items. If `endings` is nil or
-- omitted, infer the endings from the tense. If `jussive` is true, or `endings` is nil and `tense` indicates the
-- jussive, use the jussive pattern of vowel/consonant stems (different from the normal ones).
local function nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, endings, jussive)
local passive = tense:find("_pass") and "_pass" or ""
-- Override stems with user-specified stems if available.
v_stem = override_stem_if_needed(base, "nonpast" .. passive .. "_v",
v_stem and q(dia[prefix_vowel], v_stem) or nil)
c_stem = override_stem_if_needed(base, "nonpast" .. passive .. "_c",
c_stem and q(dia[prefix_vowel], c_stem) or nil)
if not endings then
if tense:find("^ind") then
endings = ind_endings
elseif tense:find("^sub") then
endings = sub_endings
elseif tense:find("^juss") then
jussive = true
endings = juss_endings
else
error("Internal error: Unrecognized tense '" .. tense .."'")
end
end
if not jussive then
inflect_tense(base, tense, nonpast_prefix_consonants, {
-- singular
v_stem, v_stem, v_stem, v_stem, v_stem,
--dual
v_stem, v_stem, v_stem,
-- plural
v_stem, v_stem, c_stem, v_stem, c_stem
}, endings)
else
inflect_tense(base, tense, nonpast_prefix_consonants, {
-- singular
-- 'adlul, tadlul, tadullī, yadlul, tadlul
c_stem, c_stem, v_stem, c_stem, c_stem,
--dual
-- tadullā, yadullā, tadullā
v_stem, v_stem, v_stem,
-- plural
-- nadlul, tadullū, tadlulna, yadullū, yadlulna
c_stem, v_stem, c_stem, v_stem, c_stem
}, endings)
end
end
-- Generate non-past conjugation with one stem (no distinct stems for vowel-initial and consonant-initial endings).
-- See nonpast_2stem_conj().
local function nonpast_1stem_conj(base, tense, prefix_vowel, stem, endings, jussive)
nonpast_2stem_conj(base, tense, prefix_vowel, stem, stem, endings, jussive)
end
-- Generate the active/passive jussive of a geminate verb. There are three alternants, two with terminations -a and -i
-- and one with a null termination and a distinct pattern of vowel/consonant stem usage. See nonpast_2stem_conj() for a
-- description of the arguments.
local function jussive_gem_conj(base, tense, prefix_vowel, v_stem, c_stem)
-- alternative in -a
nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings_alt_a)
-- alternative in -i
nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings_alt_i)
-- alternative in -null; requires different combination of v_stem and
-- c_stem since the null endings require the c_stem (e.g. "tadlul" here)
-- whereas the corresponding endings above in -a or -i require the v_stem
-- (e.g. "tadulla, tadulli" above)
nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings, "jussive")
end
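-- Illustration in transliteration, using the verb from the comments in nonpast_2stem_conj() (roughly v_stem "dull",
-- c_stem "dlul", prefix vowel "a"; the real stems are in Arabic script). The three calls above add to the 3ms
-- jussive slot, in this order: yadulla (v_stem + -a), yadulli (v_stem + -i) and yadlul (c_stem + null ending).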
-------------------------------------------------------------------------------
-- Functions to inflect the imperative --
-------------------------------------------------------------------------------
-- Generate imperative conjugation, with two stems, for vowel-initial and consonant-initial endings, respectively.
-- Useful for all forms, and for all weaknesses other than final-weak. Note that the two stems may be the same
-- (specifically for sound and assimilated verbs). If `endings` is nil or omitted, use `imp_endings`. If `alt_gem`
-- is specified, use the pattern of vowel and consonant stems appropriate for the alternative geminate imperatives
-- that end in -a or -i where the normal imperative has a null (empty) ending.
local function make_2stem_imperative(base, v_stem, c_stem, endings, alt_gem)
endings = endings or imp_endings
-- Override stems with user-specified stems if available.
v_stem = override_stem_if_needed(base, "imp_v", v_stem)
c_stem = override_stem_if_needed(base, "imp_c", c_stem)
if alt_gem then
inflect_tense_imp(base, {v_stem, v_stem, v_stem, v_stem, c_stem}, endings)
else
inflect_tense_imp(base, {c_stem, v_stem, v_stem, v_stem, c_stem}, endings)
end
end
-- Generate imperative parts for sound or assimilated verbs.
local function make_1stem_imperative(base, stem)
make_2stem_imperative(base, stem, stem)
end
-- Generate imperative parts for geminate verbs form I (also IV, VII, VIII, X).
local function make_gem_imperative(base, v_stem, c_stem)
make_2stem_imperative(base, v_stem, c_stem, imp_endings_alt_a, "alt gem")
make_2stem_imperative(base, v_stem, c_stem, imp_endings_alt_i, "alt gem")
make_2stem_imperative(base, v_stem, c_stem)
end
-------------------------------------------------------------------------------
-- Functions to inflect entire verbs --
-------------------------------------------------------------------------------
-- Generate finite parts of a sound verb (also works for assimilated verbs) from five stems (past and non-past,
-- active and passive, plus imperative) plus the prefix vowel in the active non-past ("a" or "u").
local function make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem,
prefix_vowel)
past_1stem_conj(base, "past", past_stem)
past_1stem_conj(base, "past_pass", past_pass_stem)
nonpast_1stem_conj(base, "ind", prefix_vowel, nonpast_stem)
nonpast_1stem_conj(base, "sub", prefix_vowel, nonpast_stem)
nonpast_1stem_conj(base, "juss", prefix_vowel, nonpast_stem)
nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem)
nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem)
nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem)
make_1stem_imperative(base, imp_stem)
end
local function past_final_weak_endings_from_vowel(vowel)
if vowel == "ay" then
return past_endings_ay
elseif vowel == "aw" then
return past_endings_aw
elseif vowel == "ī" then
return past_endings_ii
elseif vowel == "ū" then
return past_endings_uu
elseif not vowel then
return nil
else
error(("Internal error: Unrecognized past final-weak vowel spec '%s'"):format(vowel))
end
end
local function nonpast_final_weak_endings_from_vowel(vowel)
if vowel == "ā" then
return ind_endings_aa, sub_endings_aa, juss_endings_aa, imp_endings_aa
elseif vowel == "ī" then
return ind_endings_ii, sub_endings_ii, juss_endings_ii, imp_endings_ii
elseif vowel == "ū" then
return ind_endings_uu, sub_endings_uu, juss_endings_uu, imp_endings_uu
elseif not vowel then
return nil
else
error(("Internal error: Unrecognized non-past final-weak vowel spec '%s'"):format(vowel))
end
end
-- Generate finite parts of a final-weak verb from five stems (past and non-past, active and passive, plus
-- imperative), the past active ending vowel (ay, aw, ī or ū), the non-past active ending vowel (ā, ī or ū) and the
-- prefix vowel in the active non-past (a or u).
local function make_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem,
past_ending_vowel, nonpast_ending_vowel, prefix_vowel)
past_stem = override_stem_if_needed(base, "past", past_stem)
past_pass_stem = override_stem_if_needed(base, "past_pass", past_pass_stem)
-- Don't call override_stem_if_needed() here for non-past stems; it's called in nonpast_2stem_conj().
imp_stem = override_stem_if_needed(base, "imp", imp_stem)
-- + not supported for ending vowel overrides
past_ending_vowel = base.stem_overrides.past_final_weak_vowel or past_ending_vowel
local past_pass_ending_vowel = base.stem_overrides.past_pass_final_weak_vowel or "ī"
nonpast_ending_vowel = base.stem_overrides.nonpast_final_weak_vowel or nonpast_ending_vowel
local nonpast_pass_ending_vowel = base.stem_overrides.nonpast_pass_final_weak_vowel or "ā"
local past_endings = past_final_weak_endings_from_vowel(past_ending_vowel)
local past_pass_endings = past_final_weak_endings_from_vowel(past_pass_ending_vowel)
local ind_endings, sub_endings, juss_endings, imp_endings =
nonpast_final_weak_endings_from_vowel(nonpast_ending_vowel)
local ind_pass_endings, sub_pass_endings, juss_pass_endings =
nonpast_final_weak_endings_from_vowel(nonpast_pass_ending_vowel)
inflect_tense(base, "past", "", {past_stem, all_same = 1}, past_endings)
inflect_tense(base, "past_pass", "", {past_pass_stem, all_same = 1}, past_pass_endings)
nonpast_1stem_conj(base, "ind", prefix_vowel, nonpast_stem, ind_endings)
nonpast_1stem_conj(base, "sub", prefix_vowel, nonpast_stem, sub_endings)
nonpast_1stem_conj(base, "juss", prefix_vowel, nonpast_stem, juss_endings)
nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem, ind_pass_endings)
nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem, sub_pass_endings)
nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem, juss_pass_endings)
inflect_tense_imp(base, {imp_stem, all_same = 1}, imp_endings)
end
-- Generate finite parts of an augmented (form II+) final-weak verb from five stems (past and non-past, active and
-- passive, plus imperative) plus the prefix vowel in the active non-past ("a" or "u") and a flag indicating if it
-- behaves like a form V/VI verb in taking non-past endings in -ā instead of -ī.
local function make_augmented_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem,
imp_stem, prefix_vowel, form56)
make_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, "ay",
form56 and "ā" or "ī", prefix_vowel)
end
-- Generate finite parts of an augmented (form II+) sound or final-weak verb, given:
-- * `base` (conjugation data structure);
-- * `vowel_spec` (radicals, weakness);
-- * `past_stem_base` (active past stem minus last syllable (= -al or -ā));
-- * `nonpast_stem_base` (non-past stem minus last syllable (= -al/-il or -ā/-ī));
-- * `past_pass_stem_base` (passive past stem minus last syllable (= -il or -ī));
-- * `vn` (verbal noun).
local function make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base,
past_pass_stem_base, vn)
insert_form_or_forms(base, "vn", vn)
local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3
local final_weak = is_final_weak(base, vowel_spec)
local prefix_vowel = prefix_vowel_from_vform(base.verb_form)
local form56 = vform_nonpast_a_vowel(base.verb_form)
local a_base_suffix = final_weak and "" or q(A, lastrad)
local i_base_suffix = final_weak and "" or q(I, lastrad)
-- past and non-past stems, active and passive
local past_stem = q(past_stem_base, a_base_suffix)
-- In forms V and VI, the last stem vowel of the finite non-past is /a/ in both active and passive, but the
-- active participle has /i/ and the passive participle has /a/. Elsewhere, there is consistent /i/ in the
-- active non-past and participle and consistent /a/ in the passive non-past and participle. Hence, forms V
-- and VI differ from the other forms only in the finite non-past active (not in the active participle), so
-- we have to split the finite non-past stem from the active participle stem.
local nonpast_stem = q(nonpast_stem_base, form56 and a_base_suffix or i_base_suffix)
local ap_stem = q(nonpast_stem_base, i_base_suffix)
local past_pass_stem = q(past_pass_stem_base, i_base_suffix)
local nonpast_pass_stem = q(nonpast_stem_base, a_base_suffix)
-- imperative stem
local imp_stem = q(past_stem_base, form56 and a_base_suffix or i_base_suffix)
-- make parts
if final_weak then
make_augmented_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem,
prefix_vowel, form56)
else
make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, prefix_vowel)
end
-- active and passive participle
if final_weak then
insert_form_or_forms(base, "ap", q(MU, ap_stem, IN))
insert_form_or_forms(base, "pp", q(MU, nonpast_pass_stem, AN, AMAQ))
else
insert_form_or_forms(base, "ap", q(MU, ap_stem))
insert_form_or_forms(base, "pp", q(MU, nonpast_pass_stem))
end
end
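--[==[
Illustrative sketch, not part of the module: the stem-vowel split described in
make_augmented_sound_final_weak_verb() can be summarized as a small table. The helper name
`last_stem_vowels` is hypothetical, and the sketch assumes that only forms V and VI take /a/ in the
finite non-past active (the real check is vform_nonpast_a_vowel(), which may cover more forms).
Vowels are plain transliterated strings rather than the module's diacritic constants.

```lua
-- Hypothetical helper: last stem vowel of each part of a sound augmented verb.
-- Assumption: only forms V and VI take /a/ in the finite non-past active.
local function last_stem_vowels(verb_form)
	local form56 = verb_form == "V" or verb_form == "VI"
	return {
		past = "a",
		past_pass = "i",
		nonpast = form56 and "a" or "i",
		nonpast_pass = "a",
		imp = form56 and "a" or "i",
		ap = "i", -- active participle always has /i/
		pp = "a", -- passive participle always has /a/
	}
end

local ii = last_stem_vowels("II")
local v = last_stem_vowels("V")
print(ii.nonpast .. " " .. ii.ap) -- i i (compare yufaʿʿilu, mufaʿʿil)
print(v.nonpast .. " " .. v.ap)   -- a i (compare yatafaʿʿalu, mutafaʿʿil)
```

Only the `nonpast` and `imp` entries depend on the verb form, which is why the module keeps a separate
`ap_stem` alongside `nonpast_stem`.
]==]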
-- Generate finite parts of a hollow or geminate verb from ten stems (vowel and consonant stems for each of past and
-- non-past, active and passive, plus imperative) plus the prefix vowel in the active non-past ("a" or "u"), a flag
-- indicating if we are a geminate verb, and an optional footnote (`altgem_note`) for the active past.
local function make_hollow_geminate_verb(base, geminate, past_v_stem, past_c_stem, past_pass_v_stem,
past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem,
imp_c_stem, prefix_vowel, altgem_note)
past_2stem_conj(base, "past", past_v_stem, past_c_stem, altgem_note)
past_2stem_conj(base, "past_pass", past_pass_v_stem, past_pass_c_stem)
nonpast_2stem_conj(base, "ind", prefix_vowel, nonpast_v_stem, nonpast_c_stem)
nonpast_2stem_conj(base, "sub", prefix_vowel, nonpast_v_stem, nonpast_c_stem)
nonpast_2stem_conj(base, "ind_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem)
nonpast_2stem_conj(base, "sub_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem)
if geminate then
jussive_gem_conj(base, "juss", prefix_vowel, nonpast_v_stem, nonpast_c_stem)
jussive_gem_conj(base, "juss_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem)
make_gem_imperative(base, imp_v_stem, imp_c_stem)
else
nonpast_2stem_conj(base, "juss", prefix_vowel, nonpast_v_stem, nonpast_c_stem)
nonpast_2stem_conj(base, "juss_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem)
make_2stem_imperative(base, imp_v_stem, imp_c_stem)
end
end
-- Generate finite parts of an augmented (form II+) hollow verb, given:
-- * `base` (conjugation data structure);
-- * `vowel_spec` (radicals, weakness);
-- * `past_stem_base` (invariable part of active past stem);
-- * `nonpast_stem_base` (invariable part of nonpast stem);
-- * `past_pass_stem_base` (invariable part of passive past stem);
-- * `vn` (verbal noun).
local function make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
insert_form_or_forms(base, "vn", vn)
local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3
local form410 = base.verb_form == "IV" or base.verb_form == "X"
local prefix_vowel = prefix_vowel_from_vform(base.verb_form)
local a_base_suffix_v, a_base_suffix_c
local i_base_suffix_v, i_base_suffix_c
a_base_suffix_v = q(AA, lastrad) -- 'af-āl-a, inf-āl-a
a_base_suffix_c = q(A, lastrad) -- 'af-al-tu, inf-al-tu
i_base_suffix_v = q(II, lastrad) -- 'uf-īl-a, unf-īl-a
i_base_suffix_c = q(I, lastrad) -- 'uf-il-tu, unf-il-tu
-- past and non-past stems, active and passive, for vowel-initial and
-- consonant-initial endings
local past_v_stem = q(past_stem_base, a_base_suffix_v)
local past_c_stem = q(past_stem_base, a_base_suffix_c)
-- yu-f-īl-u, ya-staf-īl-u but yanf-āl-u, yaft-āl-u
local nonpast_v_stem = q(nonpast_stem_base, form410 and i_base_suffix_v or a_base_suffix_v)
local nonpast_c_stem = q(nonpast_stem_base, form410 and i_base_suffix_c or a_base_suffix_c)
local past_pass_v_stem = q(past_pass_stem_base, i_base_suffix_v)
local past_pass_c_stem = q(past_pass_stem_base, i_base_suffix_c)
local nonpast_pass_v_stem = q(nonpast_stem_base, a_base_suffix_v)
local nonpast_pass_c_stem = q(nonpast_stem_base, a_base_suffix_c)
-- imperative stem
local imp_v_stem = q(past_stem_base, form410 and i_base_suffix_v or a_base_suffix_v)
local imp_c_stem = q(past_stem_base, form410 and i_base_suffix_c or a_base_suffix_c)
-- make parts
make_hollow_geminate_verb(base, false, past_v_stem, past_c_stem, past_pass_v_stem,
past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem,
nonpast_pass_c_stem, imp_v_stem, imp_c_stem, prefix_vowel)
-- active participle
insert_form_or_forms(base, "ap", q(MU, nonpast_v_stem))
-- passive participle
insert_form_or_forms(base, "pp", q(MU, nonpast_pass_v_stem))
end
-- Generate finite parts of an augmented (form II+) geminate verb, given:
-- * `base` (conjugation data structure);
-- * `vowel_spec` (radicals, weakness);
-- * `past_stem_base` (invariable part of active past stem; this and the stem bases below will end with a consonant
-- for forms IV, X, IVq, and a short vowel for the others);
-- * `nonpast_stem_base` (invariable part of nonpast stem);
-- * `past_pass_stem_base` (invariable part of passive past stem);
-- * `vn` (verbal noun);
-- * `altgem_note` (footnote to add to active past 1/2-person forms, when alternative forms are supplied [form X]).
local function make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base,
past_pass_stem_base, vn, altgem_note)
insert_form_or_forms(base, "vn", vn)
local vform = base.verb_form
local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3
local prefix_vowel = prefix_vowel_from_vform(vform)
local a_base_suffix_v, a_base_suffix_c
local i_base_suffix_v, i_base_suffix_c
if vform == "IV" or vform == "X" or vform == "IVq" then
a_base_suffix_v = q(A, lastrad, SH) -- 'af-all
a_base_suffix_c = q(SK, lastrad, A, lastrad) -- 'af-lal
i_base_suffix_v = q(I, lastrad, SH) -- yuf-ill
i_base_suffix_c = q(SK, lastrad, I, lastrad) -- yuf-lil
else
a_base_suffix_v = q(lastrad, SH) -- fā-ll, infa-ll
a_base_suffix_c = q(lastrad, A, lastrad) -- fā-lal, infa-lal
i_base_suffix_v = q(lastrad, SH) -- yufā-ll, yanfa-ll
i_base_suffix_c = q(lastrad, I, lastrad) -- yufā-lil, yanfa-lil
end
-- past and non-past stems, active and passive, for vowel-initial and
-- consonant-initial endings
local past_v_stem = q(past_stem_base, a_base_suffix_v)
local past_c_stem = q(past_stem_base, a_base_suffix_c)
local nonpast_v_stem = q(nonpast_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_v or i_base_suffix_v)
local nonpast_c_stem = q(nonpast_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_c or i_base_suffix_c)
-- NOTE: Formerly had a comment that "vform III and VI passive past do not have contracted parts, only
-- uncontracted parts, which are added separately by those functions". This is based on Mace
-- "Arabic Verbs and Essential Grammar" (1999) entry 63 (continued), which shows passive ḥūjija but no ḥūjja;
-- but that is apparently a mistake, as (1) verb tables in other books do show contracted passive parts for
-- these forms; (2) there is no mention of such an exception on p. 99, which explains how geminate ("doubled")
-- verbs work (on the contrary, it says "The contracted and uncontracted pairs (see above) are found all
-- over Forms III and VI of the doubled verbs").
local past_pass_v_stem = q(past_pass_stem_base, i_base_suffix_v)
local past_pass_c_stem = q(past_pass_stem_base, i_base_suffix_c)
local nonpast_pass_v_stem = q(nonpast_stem_base, a_base_suffix_v)
local nonpast_pass_c_stem = q(nonpast_stem_base, a_base_suffix_c)
-- imperative stem
local imp_v_stem = q(past_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_v or i_base_suffix_v)
local imp_c_stem = q(past_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_c or i_base_suffix_c)
-- make parts
make_hollow_geminate_verb(base, "geminate", past_v_stem, past_c_stem, past_pass_v_stem,
past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem,
nonpast_pass_c_stem, imp_v_stem, imp_c_stem, prefix_vowel, altgem_note)
-- active participle
insert_form_or_forms(base, "ap", q(MU, nonpast_v_stem))
-- passive participle
insert_form_or_forms(base, "pp", q(MU, nonpast_pass_v_stem))
end
-------------------------------------------------------------------------------
-- Conjugation functions for specific conjugation types --
-------------------------------------------------------------------------------
local function form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1)
local imp_vowel = map_vowel(nonpast_vowel, function(vow)
if vow == A or vow == I then
return I
elseif vow == U then
return U
elseif not skip_slot(base, "imp_2ms") then
error(("Internal error: Non-past vowel %s isn't a, i, or u, should have been caught earlier"):format(
dump(nonpast_vowel)))
else
-- Passive-only; imperative won't ever be displayed so it doesn't matter.
return I
end
end)
-- Mace ("Arabic Verbs and Essential Grammar", p. 63: [https://archive.org/details/arabicverbsessen00john/page/62/mode/2up])
-- claims that initial hamza is assimilated/elided into a long vowel in the form-I imperative, but apparently
-- this isn't correct.
local vowel_on_alif = map_vowel(imp_vowel, function(vow)
return ALIF .. vow
end)
return q(vowel_on_alif, rad1, SK)
end
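--[==[
Illustrative sketch, not part of the module: form_i_imp_stem_through_rad1() picks the vowel of the
imperative's prosthetic alif from the non-past stem vowel (a or i gives i, u gives u). The helpers
below are hypothetical stand-ins that work on Latin transliteration instead of the module's
diacritic constants, and they ignore footnotes, alternants and the passive-only case.

```lua
-- Hypothetical stand-in for the vowel mapping, using transliterated vowels.
local function imp_prefix_vowel(nonpast_vowel)
	if nonpast_vowel == "a" or nonpast_vowel == "i" then
		return "i"
	elseif nonpast_vowel == "u" then
		return "u"
	end
	error("non-past vowel must be a, i or u, got " .. tostring(nonpast_vowel))
end

-- Transliterated sound form-I imperative: prefix vowel + rad1 + rad2 + non-past vowel + rad3.
local function form_i_imperative(rad1, rad2, rad3, nonpast_vowel)
	return imp_prefix_vowel(nonpast_vowel) .. rad1 .. rad2 .. nonpast_vowel .. rad3
end

print(form_i_imperative("k", "t", "b", "u")) -- uktub
print(form_i_imperative("f", "t", "ḥ", "a")) -- iftaḥ
print(form_i_imperative("j", "l", "s", "i")) -- ijlis
```
]==]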
-- Implement form-I sound or assimilated verb. ASSIMILATED is true for assimilated verbs.
local function make_form_i_sound_assimilated_verb(base, vowel_spec, assimilated)
local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)
-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
-- past and non-past stems, active and passive
local past_stem = q(rad1, A, rad2, past_vowel, rad3)
local nonpast_stem = assimilated and q(rad2, nonpast_vowel, rad3) or
q(rad1, SK, rad2, nonpast_vowel, rad3)
local past_pass_stem = q(rad1, U, rad2, I, rad3)
local nonpast_pass_stem = q(rad1, SK, rad2, A, rad3)
-- imperative stem
-- check for irregular verb with reduced imperative (أَخَذَ or أَكَلَ or أَمَرَ)
local reducedimp = reduced_imperative_verb(rad1, rad2, rad3)
if reducedimp then
base.irregular = true
end
local imp_stem_suffix = q(rad2, nonpast_vowel, rad3)
local long_imp_stem_base = form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1)
local short_imp_stem_base = ""
local imp_stem = q((assimilated or reducedimp) and short_imp_stem_base or long_imp_stem_base, imp_stem_suffix)
-- make parts
make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, "a")
if reducedimp == "shortlong" then
make_1stem_imperative(base, iut.combine_form_and_footnotes(q(long_imp_stem_base, imp_stem_suffix),
mw.getCurrentFrame():preprocess("[used especially with a clitic such as {{m|ar|فَ}} or {{m|ar|وَ}}]")))
end
-- Check for irregular verb سَأَلَ with alternative jussive and imperative. Calling this after make_sound_verb()
-- adds additional entries to the paradigm parts.
if saal_radicals(rad1, rad2, rad3) then
base.irregular = true
nonpast_1stem_conj(base, "juss", "a", "سَل")
nonpast_1stem_conj(base, "juss_pass", "u", "سَل")
make_1stem_imperative(base, "سَل")
end
-- Active participle.
insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, I, rad3))
-- Insert alternative active participle (stative type I) فَعِيل. Since not all verbs have this, we require that
-- verbs that do have it specify it explicitly; a shortcut ++ is provided to make this easier (e.g. <ap:++> to
-- indicate that the alternative form should be used for the active participle, <ap:+,++> to indicate that both
-- forms can be used, and <ap:-> to indicate that there is no active participle). The same form is used for
-- secondary default passive participle.
insert_ap2_pp2(base, q(rad1, A, rad2, II, rad3))
-- Active participle, stative type II فَعِل (+++).
insert_form_or_forms(base, "ap3", q(rad1, A, rad2, I, rad3))
-- Active participle, color/defect أَفْعَل (+cd).
insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, A, rad3))
-- Active participle, -ān فَعْلَان (+an).
insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN))
-- Passive participle.
insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, UU, rad3))
end
conjugations["I-sound"] = function(base, vowel_spec)
make_form_i_sound_assimilated_verb(base, vowel_spec, false)
end
conjugations["none-sound"] = function(base, vowel_spec)
-- All default stems are nil.
make_sound_verb(base)
end
conjugations["none-hollow"] = function(base, vowel_spec)
-- All default stems are nil.
make_hollow_geminate_verb(base, false)
end
conjugations["none-geminate"] = function(base, vowel_spec)
-- All default stems are nil.
make_hollow_geminate_verb(base, "geminate")
end
conjugations["none-final-weak"] = function(base, vowel_spec)
-- All default stems are nil.
make_final_weak_verb(base)
end
conjugations["I-assimilated"] = function(base, vowel_spec)
make_form_i_sound_assimilated_verb(base, vowel_spec, "assimilated")
end
local function make_form_i_hayy_verb(base, vowel_spec)
-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
base.irregular = true
-- past and non-past stems, active and passive, and imperative stem
local past_c_stem = "حَيِي"
local past_v_stem_long = past_c_stem
local past_v_stem_short = "حَيّ"
local past_pass_c_stem = "حُيِي"
local past_pass_v_stem_long = past_pass_c_stem
local past_pass_v_stem_short = "حُيّ"
local nonpast_stem = "حْي"
local nonpast_pass_stem = nonpast_stem
local imp_stem = _I .. nonpast_stem
-- make parts
past_2stem_conj(base, "past", {}, past_c_stem)
past_2stem_conj(base, "past_pass", {}, past_pass_c_stem)
local variant = vowel_spec.variant or "both"
if variant == "short" or variant == "both" then
past_2stem_conj(base, "past", past_v_stem_short, {})
past_2stem_conj(base, "past_pass", past_pass_v_stem_short, {})
end
local function inflect_long_variant(tense, long_stem, short_stem)
inflect_tense_1(base, tense, "",
{long_stem, long_stem, long_stem, long_stem, short_stem},
{past_endings[4], past_endings[5], past_endings[7], past_endings[8],
past_endings[12]},
{"3ms", "3fs", "3md", "3fd", "3mp"})
end
if variant == "long" or variant == "both" then
inflect_long_variant("past", past_v_stem_long, past_v_stem_short)
inflect_long_variant("past_pass", past_pass_v_stem_long, past_pass_v_stem_short)
end
nonpast_1stem_conj(base, "ind", "a", nonpast_stem, ind_endings_aa)
nonpast_1stem_conj(base, "sub", "a", nonpast_stem, sub_endings_aa)
nonpast_1stem_conj(base, "juss", "a", nonpast_stem, juss_endings_aa)
nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem, ind_endings_aa)
nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem, sub_endings_aa)
nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem, juss_endings_aa)
inflect_tense_imp(base, {imp_stem, all_same = 1}, imp_endings_aa)
-- active and passive participles apparently do not exist for this verb
end
-- Implement form-I final-weak or assimilated+final-weak verb. ASSIMILATED is true for assimilated verbs.
local function make_form_i_final_weak_verb(base, vowel_spec, assimilated)
local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)
-- حَيَّ or حَيِيَ is weird enough that we handle it as a separate function.
if hayy_radicals(rad1, rad2, rad3) then
make_form_i_hayy_verb(base, vowel_spec)
return
end
-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
-- Past and non-past stems, active and passive, and imperative stem.
local past_stem = q(rad1, A, rad2)
local past_pass_stem = q(rad1, U, rad2)
local nonpast_stem, nonpast_pass_stem, imp_stem
if raa_radicals(rad1, rad2, rad3) then
base.irregular = true
nonpast_stem = rad1
nonpast_pass_stem = rad1
imp_stem = rad1
else
nonpast_pass_stem = q(rad1, SK, rad2)
if assimilated then
nonpast_stem = rad2
imp_stem = rad2
else
nonpast_stem = nonpast_pass_stem
imp_stem = q(form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1), rad2)
end
end
-- Make parts.
local past_ending_vowel =
req(rad3, Y) and req(past_vowel, A) and "ay" or
req(rad3, W) and req(past_vowel, A) and "aw" or
req(past_vowel, I) and "ī" or "ū"
-- Try to preserve footnotes attached to the third radical and/or past and/or non-past vowels.
local past_footnotes = iut.combine_footnotes(rget_footnotes(rad3), rget_footnotes(past_vowel))
local nonpast_ending_vowel = req(nonpast_vowel, A) and "ā" or req(nonpast_vowel, I) and "ī" or "ū"
local nonpast_footnotes = iut.combine_footnotes(rget_footnotes(rad3), rget_footnotes(nonpast_vowel))
make_final_weak_verb(base,
iut.combine_form_and_footnotes(past_stem, past_footnotes),
iut.combine_form_and_footnotes(past_pass_stem, past_footnotes),
iut.combine_form_and_footnotes(nonpast_stem, nonpast_footnotes),
iut.combine_form_and_footnotes(nonpast_pass_stem, nonpast_footnotes),
iut.combine_form_and_footnotes(imp_stem, nonpast_footnotes),
past_ending_vowel, nonpast_ending_vowel, "a")
-- Active participle.
insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, IN))
-- Active participle, stative type I فَعِيّ (++). FIXME: Is this correct when rad3 is W?
insert_ap2_pp2(base, q(rad1, A, rad2, II, SH))
-- Active participle, stative type II فَعٍ (+++). FIXME: Any examples of this to verify it's correct?
insert_form_or_forms(base, "ap3", q(rad1, A, rad2, IN))
-- Active participle, color/defect أَفْعَى (+cd).
insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, AAMAQ))
-- Active participle, -ān فَعْيَان or فَعْوَان (+an).
-- FIXME: Any examples of this for both rad3 = W and rad3 = Y to verify it's correct?
insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN))
-- Passive participle.
insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, req(rad3, Y) and II or UU, SH))
end
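--[==[
Illustrative sketch, not part of the module: the ending-vowel selection in
make_form_i_final_weak_verb() can be read as two small decision functions. The helper names are
hypothetical and operate on transliterated strings; the real code goes through req() so that
radicals and vowels carrying footnotes still compare correctly.

```lua
-- Past: "ay"/"aw" only when the past vowel is a and rad3 is y/w; otherwise ī for i, else ū.
local function past_ending_vowel(rad3, past_vowel)
	if past_vowel == "a" and rad3 == "y" then
		return "ay"
	elseif past_vowel == "a" and rad3 == "w" then
		return "aw"
	elseif past_vowel == "i" then
		return "ī"
	end
	return "ū"
end

-- Non-past: simply the lengthened non-past vowel.
local function nonpast_ending_vowel(nonpast_vowel)
	if nonpast_vowel == "a" then
		return "ā"
	elseif nonpast_vowel == "i" then
		return "ī"
	end
	return "ū"
end

print(past_ending_vowel("y", "a") .. " " .. nonpast_ending_vowel("i")) -- ay ī (ramā ~ yarmī)
print(past_ending_vowel("w", "a") .. " " .. nonpast_ending_vowel("u")) -- aw ū (daʿā ~ yadʿū)
print(past_ending_vowel("y", "i") .. " " .. nonpast_ending_vowel("a")) -- ī ā (nasiya ~ yansā)
```
]==]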
conjugations["I-final-weak"] = function(base, vowel_spec)
make_form_i_final_weak_verb(base, vowel_spec, false)
end
conjugations["I-assimilated+final-weak"] = function(base, vowel_spec)
make_form_i_final_weak_verb(base, vowel_spec, "assimilated")
end
conjugations["I-hollow"] = function(base, vowel_spec)
local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)
-- In some sense, hollow vowels i~i and u~u are more "correct" than a~i and a~u, but the latter follow the
-- pattern of other form-I verbs, so we map i~i to a~i and u~u to a~u in infer_radicals(). Now however we have
-- to undo this to get the actual past vowel based on the non-past vowel.
if req(past_vowel, A) then
past_vowel = map_vowel(past_vowel, function(vow)
return req(nonpast_vowel, A) and I or rget(nonpast_vowel)
end)
end
local lengthened_nonpast = map_vowel(nonpast_vowel, function(vow)
return vow == U and UU or vow == I and II or AA
end)
-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
-- active past stems - vowel (v) and consonant (c)
local past_v_stem = q(rad1, AA, rad3)
local past_c_stem = q(rad1, past_vowel, rad3)
-- active non-past stems - vowel (v) and consonant (c)
local nonpast_v_stem = q(rad1, lengthened_nonpast, rad3)
local nonpast_c_stem = q(rad1, nonpast_vowel, rad3)
-- passive past stems - vowel (v) and consonant (c)
-- fīla, filtu
local past_pass_v_stem = q(rad1, II, rad3)
local past_pass_c_stem = q(rad1, I, rad3)
-- passive non-past stems - vowel (v) and consonant (c)
-- yufāla/yufalna
-- stem is built differently but conjugation is identical to sound verbs
local nonpast_pass_v_stem = q(rad1, AA, rad3)
local nonpast_pass_c_stem = q(rad1, A, rad3)
-- imperative stem
local imp_v_stem = nonpast_v_stem
local imp_c_stem = nonpast_c_stem
-- make parts
make_hollow_geminate_verb(base, false, past_v_stem, past_c_stem, past_pass_v_stem,
past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem,
nonpast_pass_c_stem, imp_v_stem, imp_c_stem, "a")
if kaan_radicals(rad1, rad2, rad3) then
local endings = make_nonpast_endings(U, {}, {}, {}, {})
inflect_tense(base, "juss", nonpast_prefix_consonants, q(A, rad1), endings)
base.irregular = true
end
-- Active participle.
insert_form_or_forms(base, "ap1", req(rad3, HAMZA) and q(rad1, AA, HAMZA, IN) or
q(rad1, AA, HAMZA, I, rad3))
-- Active participle, stative type I فَيِّد (++). FIXME: Any examples of this to verify it's correct?
insert_ap2_pp2(base, q(rad1, A, Y, SH, I, rad3))
-- Active participle, stative type II فَيِد (+++). FIXME: Any examples of this to verify it's correct?
insert_form_or_forms(base, "ap3", q(rad1, A, Y, I, rad3))
-- Active participle, color/defect أَفْيَد or أَفْوَد (+cd). FIXME: Any examples of this to verify it's correct?
insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, A, rad3))
-- Active participle, -ān فَيْدَان or فَوْدَان (+an). Example: جَاعَ "to be hungry", act part جَوْعَان
insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN))
-- Passive participle.
insert_form_or_forms(base, "pp", q(MA, rad1, req(rad2, Y) and II or UU, rad3))
end
conjugations["I-geminate"] = function(base, vowel_spec)
local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)
-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
-- active past stems - vowel (v) and consonant (c)
local past_v_stem = q(rad1, A, rad2, SH)
local past_c_stem = q(rad1, A, rad2, past_vowel, rad2)
-- active non-past stems - vowel (v) and consonant (c)
local nonpast_v_stem = q(rad1, nonpast_vowel, rad2, SH)
local nonpast_c_stem = q(rad1, SK, rad2, nonpast_vowel, rad2)
-- passive past stems - vowel (v) and consonant (c)
-- dulla/dulilta
local past_pass_v_stem = q(rad1, U, rad2, SH)
local past_pass_c_stem = q(rad1, U, rad2, I, rad2)
-- passive non-past stems - vowel (v) and consonant (c)
-- yudallu/yudlalna
-- stem is built differently but conjugation is identical to sound verbs
local nonpast_pass_v_stem = q(rad1, A, rad2, SH)
local nonpast_pass_c_stem = q(rad1, SK, rad2, A, rad2)
-- imperative stem
local imp_v_stem = q(rad1, nonpast_vowel, rad2, SH)
local imp_c_stem = q(form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1), rad2, nonpast_vowel, rad2)
-- make parts
make_hollow_geminate_verb(base, "geminate", past_v_stem, past_c_stem, past_pass_v_stem,
past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem,
nonpast_pass_c_stem, imp_v_stem, imp_c_stem, "a")
-- Active participle.
insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, SH))
-- Active participle, stative type I فَعِيع (++). FIXME: Any examples of this to verify it's correct?
insert_ap2_pp2(base, q(rad1, A, rad2, II, rad2))
-- Active participle, stative type II فَعّ (+++). Example: بَرَّ "to be pious", active participle بَرّ
insert_form_or_forms(base, "ap3", q(rad1, A, rad2, SH))
-- Active participle, color/defect أَفَعّ (+cd).
-- Example: لَصَّ "to be thievish, to steal repeatedly", active participle أَلَصّ.
insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, A, rad2, SH))
-- Active participle, -ān فَعَّان (+an). FIXME: Any examples of this to verify it's correct?
insert_form_or_forms(base, "apan", q(rad1, A, rad2, SH, AAN))
-- Passive participle.
insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, UU, rad2))
end
-- Return the ta- (active, past and non-past) and tu- (passive past) prefixes for a form II/III/V/VI verb.
-- Form V and VI verbs normally use ta- and tu-, but reduced (base.reduced) verbs use different prefixes. Form II
-- and III verbs have no prefix.
local function form_ii_iii_v_vi_ta_tu_prefix(base, rad1)
local vform = base.verb_form
if vform == "V" or vform == "VI" then
if base.reduced then
-- To simplify the code, we generate two rad1's with a sukūn between them, which is cleaned up in
-- postprocessing.
return q(_I, rad1, SK), q(rad1, SK), q(_U, rad1, SK)
else
return TA, TA, TU
end
else
return "", "", ""
end
end
-- Make form II or V sound or final-weak verb.
local function make_form_ii_v_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
local vform = base.verb_form
local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1)
local vn = vform == "V" and
q(ta_past_prefix, rad1, A, rad2, SH, final_weak and IN or q(U, rad3)) or
q(TA, rad1, SK, rad2, II, final_weak and AH or rad3)
-- various stem bases
local past_stem_base = q(ta_past_prefix, rad1, A, rad2, SH)
local nonpast_stem_base = q(ta_nonpast_prefix, rad1, A, rad2, SH)
local past_pass_stem_base = q(tu_past_prefix, rad1, U, rad2, SH)
-- make parts
make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
end
conjugations["II-sound"] = function(base, vowel_spec)
make_form_ii_v_sound_final_weak_verb(base, vowel_spec)
end
conjugations["II-final-weak"] = function(base, vowel_spec)
make_form_ii_v_sound_final_weak_verb(base, vowel_spec)
end
local function make_form_iii_alt_vn(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
-- Insert alternative verbal noun فِعَال. Since not all verbs have this, we require that verbs that do have it
-- specify it explicitly; a shortcut ++ is provided to make this easier (e.g. <vn:+,++> to indicate that
-- both the normal verbal noun مُفَاعَلَة and secondary verbal noun فِعَال are available).
insert_form_or_forms(base, "vn2", q(rad1, I, rad2, AA, final_weak and HAMZA or rad3))
end
-- Make form III or VI sound or final-weak verb.
local function make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
local vform = base.verb_form
local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1)
local vn = vform == "VI" and
q(ta_past_prefix, rad1, AA, rad2, final_weak and IN or q(U, rad3)) or
q(MU, rad1, AA, rad2, final_weak and AAH or q(A, rad3, AH))
-- various stem bases
local past_stem_base = q(ta_past_prefix, rad1, AA, rad2)
local nonpast_stem_base = q(ta_nonpast_prefix, rad1, AA, rad2)
local past_pass_stem_base = q(tu_past_prefix, rad1, UU, rad2)
-- make parts
make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
if vform == "III" then
make_form_iii_alt_vn(base, vowel_spec)
end
end
conjugations["III-sound"] = function(base, vowel_spec)
make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
end
conjugations["III-final-weak"] = function(base, vowel_spec)
make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
end
-- Make form III or VI geminate verb.
local function make_form_iii_vi_geminate_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vform = base.verb_form
local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1)
-- Alternative verbal noun فِعَال will be inserted when we add sound parts below.
local vn = vform == "VI" and q(ta_past_prefix, rad1, AA, rad2, SH) or q(MU, rad1, AA, rad2, SH, AH)
-- Various stem bases.
local past_stem_base = q(ta_past_prefix, rad1, AA)
local nonpast_stem_base = q(ta_nonpast_prefix, rad1, AA)
local past_pass_stem_base = q(tu_past_prefix, rad1, UU)
-- Make parts.
local variant = vowel_spec.variant or "short"
if variant == "short" or variant == "both" then
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
-- Also add alternative sound (non-compressed) parts. This will lead to some duplicate entries, but they are
-- removed during addition.
if variant == "long" or variant == "both" then
make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
elseif vform == "III" then
-- Still need to add the alternative form-III verbal noun.
make_form_iii_alt_vn(base, vowel_spec)
end
end
conjugations["III-geminate"] = function(base, vowel_spec)
make_form_iii_vi_geminate_verb(base, vowel_spec)
end
-- Make form IV sound or final-weak verb.
local function make_form_iv_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
-- core of stem base, minus stem prefixes
local stem_core
-- check for irregular verb أَرَى
local is_raa = raa_radicals(rad1, rad2, rad3)
if is_raa then
base.irregular = true
stem_core = rad1
else
stem_core = q(rad1, SK, rad2)
end
-- verbal noun
local vn = is_raa and
q(HAMZA, I, stem_core, AA, HAMZA, AH) or
q(HAMZA, I, stem_core, AA, final_weak and HAMZA or rad3)
-- various stem bases
local past_stem_base = q(HAMZA, A, stem_core)
local nonpast_stem_base = stem_core
local past_pass_stem_base = q(HAMZA, U, stem_core)
-- make parts
make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
end
conjugations["IV-sound"] = function(base, vowel_spec)
make_form_iv_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IV-final-weak"] = function(base, vowel_spec)
make_form_iv_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IV-hollow"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
-- verbal noun
local vn = q(HAMZA, I, rad1, AA, rad3, AH)
-- various stem bases
local past_stem_base = q(HAMZA, A, rad1)
local nonpast_stem_base = rad1
local past_pass_stem_base = q(HAMZA, U, rad1)
-- make parts
make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["IV-geminate"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = q(HAMZA, I, rad1, SK, rad2, AA, rad2)
-- various stem bases
local past_stem_base = q(HAMZA, A, rad1)
local nonpast_stem_base = rad1
local past_pass_stem_base = q(HAMZA, U, rad1)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["V-sound"] = function(base, vowel_spec)
make_form_ii_v_sound_final_weak_verb(base, vowel_spec)
end
conjugations["V-final-weak"] = function(base, vowel_spec)
make_form_ii_v_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VI-sound"] = function(base, vowel_spec)
make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VI-final-weak"] = function(base, vowel_spec)
make_form_iii_vi_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VI-geminate"] = function(base, vowel_spec)
make_form_iii_vi_geminate_verb(base, vowel_spec)
end
-- Make a verbal noun of the general form that applies to forms VII and above. RAD12 is the first consonant cluster
-- (after initial اِ) and RAD34 is the second consonant cluster. RAD5 is the final consonant.
local function high_form_verbal_noun(rad12, rad34, rad5)
return q(_I, rad12, I, rad34, AA, rad5)
end
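--[==[
Illustrative sketch only (kept inside a comment so that it has no effect on the module). It shows how the pattern in
high_form_verbal_noun() assembles a verbal noun by plain concatenation. The values given here for `_I`, `I` and `AA`
are assumptions (alif + kasra, kasra, and fatḥa + alif); the module's own constants and its q() helper are defined
elsewhere and may differ, e.g. q() appears to accept form objects as well as strings.

local _I, I, AA = "اِ", "ِ", "َا" -- assumed values, not taken from the module
local function vn_sketch(rad12, rad34, rad5)
	-- initial اِ + first cluster + i + second cluster + ā + final consonant
	return _I .. rad12 .. I .. rad34 .. AA .. rad5
end
-- Form VII of ك س ر: the first cluster is ن + sukūn + ك, as built by form_vii_nrad1().
local vn = vn_sketch("نْك", "س", "ر") -- transliterated: inkisār
]==]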
-- Populate a sound or final-weak verb for any of the various high-numbered augmented forms (form VII and up) that
-- have up to 5 consonants in two clusters in the stem and the same pattern of vowels between. Some of these
-- consonants in certain verb parts are w's, which leads to apparent anomalies in certain stems of these parts, but
-- these anomalies are handled automatically in postprocessing, where we resolve sequences of iwC -> īC, uwC -> ūC,
-- w + sukūn + w -> w + shadda.
-- RAD12 is the first consonant cluster (after initial اِ) and RAD34 is the second consonant cluster. RAD5 is the
-- final consonant.
local function make_high_form_sound_final_weak_verb(base, vowel_spec, rad12, rad34, rad5)
local final_weak = is_final_weak(base, vowel_spec)
local vn = high_form_verbal_noun(rad12, rad34, final_weak and HAMZA or rad5)
-- various stem bases
local nonpast_stem_base = q(rad12, A, rad34)
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, rad12, U, rad34)
-- make parts
make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
end
local function form_vii_nrad1(base, rad1)
if base.reduced then
if not req(rad1, M) then
error(("Internal error: Form VII first radical %s is not م but .reduced specified; should have been caught earlier"):
format(rget(rad1)))
end
return M .. SH
else
return q("نْ", rad1)
end
end
-- Make form VII sound or final-weak verb.
local function make_form_vii_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
make_high_form_sound_final_weak_verb(base, vowel_spec, form_vii_nrad1(base, rad1), rad2, rad3)
end
conjugations["VII-sound"] = function(base, vowel_spec)
make_form_vii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VII-final-weak"] = function(base, vowel_spec)
make_form_vii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VII-hollow"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local nrad1 = form_vii_nrad1(base, rad1)
local vn = high_form_verbal_noun(nrad1, Y, rad3)
-- various stem bases
local nonpast_stem_base = nrad1
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, nrad1)
-- make parts
make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["VII-geminate"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local nrad1 = form_vii_nrad1(base, rad1)
local vn = high_form_verbal_noun(nrad1, rad2, rad2)
-- various stem bases
local nonpast_stem_base = q(nrad1, A)
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, nrad1, U)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
-- Return Form VIII verbal noun.
local function form_viii_verbal_noun(base, vowel_spec, rad1, rad2, rad3)
local final_weak = is_final_weak(base, vowel_spec)
rad3 = final_weak and HAMZA or rad3
return {high_form_verbal_noun(vowel_spec.form_viii_assim, rad2, rad3)}
end
-- Make form VIII sound or final-weak verb.
local function make_form_viii_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
-- check for irregular verb اِتَّخَذَ
if axadh_radicals(rad1, rad2, rad3) then
base.irregular = true
rad1 = T
end
make_high_form_sound_final_weak_verb(base, vowel_spec, vowel_spec.form_viii_assim, rad2, rad3)
end
conjugations["VIII-sound"] = function(base, vowel_spec)
make_form_viii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VIII-final-weak"] = function(base, vowel_spec)
make_form_viii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["VIII-hollow"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = form_viii_verbal_noun(base, vowel_spec, rad1, Y, rad3)
-- various stem bases
local nonpast_stem_base = vowel_spec.form_viii_assim
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, nonpast_stem_base)
-- make parts
make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["VIII-geminate"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = form_viii_verbal_noun(base, vowel_spec, rad1, rad2, rad2)
-- various stem bases
local nonpast_stem_base = q(vowel_spec.form_viii_assim, A)
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, vowel_spec.form_viii_assim, U)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["IX-sound"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = q(_I, rad1, SK, rad2, I, rad3, AA, rad3)
-- various stem bases
local nonpast_stem_base = q(rad1, SK, rad2, A)
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, rad1, SK, rad2, U)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["IX-final-weak"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
make_high_form_sound_final_weak_verb(base, vowel_spec, q(rad1, SK, rad2), rad3, rad3)
end
-- Populate a sound or final-weak verb for any of the various high-numbered
-- augmented forms that have 5 consonants in the stem and the same pattern of
-- vowels. Some of these consonants in certain verb parts are w's, which leads to
-- apparent anomalies in certain stems of these parts, but these anomalies
-- are handled automatically in postprocessing, where we resolve sequences of
-- iwC -> īC, uwC -> ūC, w + sukūn + w -> w + shadda.
local function make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, rad3, rad4, rad5)
make_high_form_sound_final_weak_verb(base, vowel_spec, q(rad1, SK, rad2), q(rad3, SK, rad4), rad5)
end
-- Make form X sound or final-weak verb.
local function make_form_x_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
-- check for irregular verb اِسْتَحْيَا (also اِسْتَحَى)
local is_hayy = hayy_radicals(rad1, rad2, rad3)
local variant = vowel_spec.variant or "both"
if not is_hayy or variant == "long" or variant == "both" then
make_high5_form_sound_final_weak_verb(base, vowel_spec, S, T, rad1, rad2, rad3)
end
if is_hayy and (variant == "short" or variant == "both") then
base.irregular = true
-- Add alternative entries to the verbal paradigms. Any duplicates are removed during addition.
make_high_form_sound_final_weak_verb(base, vowel_spec, S .. SK .. T, rad1, rad3)
end
end
conjugations["X-sound"] = function(base, vowel_spec)
make_form_x_sound_final_weak_verb(base, vowel_spec)
end
conjugations["X-final-weak"] = function(base, vowel_spec)
make_form_x_sound_final_weak_verb(base, vowel_spec)
end
conjugations["X-hollow"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = q(base.reduced and "اِسْ" or "اِسْتِ", rad1, AA, rad3, AH)
-- various stem bases
local past_stem_base = q(base.reduced and "اِسْ" or "اِسْتَ", rad1)
local nonpast_stem_base = q(base.reduced and "سْ" or "سْتَ", rad1)
local past_pass_stem_base = q(base.reduced and "اُسْ" or "اُسْتُ", rad1)
-- make parts
make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
conjugations["X-geminate"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = q("اِسْتِ", rad1, SK, rad2, AA, rad2)
-- various stem bases
local past_stem_base = q("اِسْتَ", rad1)
local nonpast_stem_base = q("سْتَ", rad1)
local past_pass_stem_base = q("اُسْتُ", rad1)
-- make parts
if base.altgem then
inflect_tense(base, "past", "", {q(past_stem_base, A, rad2, SH), all_same = 1},
past_endings_ay_12_person_only)
end
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn,
base.altgem and "[uncommon]" or nil)
end
conjugations["XI-sound"] = function(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local vn = q(_I, rad1, SK, rad2, II, rad3, AA, rad3)
-- various stem bases
local nonpast_stem_base = q(rad1, SK, rad2, AA)
local past_stem_base = q(_I, nonpast_stem_base)
local past_pass_stem_base = q(_U, rad1, SK, rad2, UU)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
-- Probably no form XI final-weak, since already geminate in form; would behave as XI-sound.
-- Make form XII sound or final-weak verb.
local function make_form_xii_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, W, rad2, rad3)
end
conjugations["XII-sound"] = function(base, vowel_spec)
make_form_xii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["XII-final-weak"] = function(base, vowel_spec)
make_form_xii_sound_final_weak_verb(base, vowel_spec)
end
-- Make form XIII sound or final-weak verb.
local function make_form_xiii_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, W, W, rad3)
end
conjugations["XIII-sound"] = function(base, vowel_spec)
make_form_xiii_sound_final_weak_verb(base, vowel_spec)
end
conjugations["XIII-final-weak"] = function(base, vowel_spec)
make_form_xiii_sound_final_weak_verb(base, vowel_spec)
end
-- Make a form XIV or XV sound or final-weak verb. Last radical appears twice (if`anlala / yaf`anlilu) so if it were
-- w or y you'd get if`anwā / yaf`anwī or if`anyā / yaf`anyī, i.e. unlike for most augmented verbs, the identity of
-- the radical matters.
local function make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3 = get_radicals_3(vowel_spec)
local lastrad = base.verb_form == "XV" and Y or rad3
make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, N, rad3, lastrad)
end
conjugations["XIV-sound"] = function(base, vowel_spec)
make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec)
end
conjugations["XIV-final-weak"] = function(base, vowel_spec)
make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec)
end
conjugations["XV-sound"] = function(base, vowel_spec)
make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec)
end
-- Probably no form XV final-weak, since already final-weak in form; would behave as XV-sound.
-- Make form Iq or IIq sound or final-weak verb.
local function make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
local vform = base.verb_form
local vn = vform == "IIq" and
q(TA, rad1, A, rad2, SK, rad3, (final_weak and IN or q(U, rad4))) or
q(rad1, A, rad2, SK, rad3, (final_weak and AAH or q(A, rad4, AH)))
local ta_pref = vform == "IIq" and TA or ""
local tu_pref = vform == "IIq" and TU or ""
-- various stem bases
local past_stem_base = q(ta_pref, rad1, A, rad2, SK, rad3)
local nonpast_stem_base = past_stem_base
local past_pass_stem_base = q(tu_pref, rad1, U, rad2, SK, rad3)
-- make parts
make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base,
vn)
end
conjugations["Iq-sound"] = function(base, vowel_spec)
make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec)
end
conjugations["Iq-final-weak"] = function(base, vowel_spec)
make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IIq-sound"] = function(base, vowel_spec)
make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IIq-final-weak"] = function(base, vowel_spec)
make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec)
end
-- Make form IIIq sound or final-weak verb.
local function make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, N, rad3, rad4)
end
conjugations["IIIq-sound"] = function(base, vowel_spec)
make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IIIq-final-weak"] = function(base, vowel_spec)
make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
end
conjugations["IVq-sound"] = function(base, vowel_spec)
local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
local vn = q(_I, rad1, SK, rad2, I, rad3, SK, rad4, AA, rad4)
-- various stem bases
local past_stem_base = q(_I, rad1, SK, rad2, A, rad3)
local nonpast_stem_base = q(rad1, SK, rad2, A, rad3)
local past_pass_stem_base = q(_U, rad1, SK, rad2, U, rad3)
-- make parts
make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
end
-- Probably no form IVq final-weak, since already geminate in form; would behave as IVq-sound.
end
create_conjugations()
-------------------------------------------------------------------------------
-- Guts of main conjugation function --
-------------------------------------------------------------------------------
-- Given form, weakness and radicals, check to make sure the radicals present are allowable for the weakness. Hamzas on
-- alif/wāw/yāʾ seats are never allowed (should always appear as hamza-on-the-line), and various weaknesses have various
-- strictures on allowable consonants.
local function check_radicals(form, weakness, rad1, rad2, rad3, rad4)
local function hamza_check(index, rad)
if rad == HAMZA_ON_ALIF or rad == HAMZA_UNDER_ALIF or
rad == HAMZA_ON_W or rad == HAMZA_ON_Y then
error("Radical " .. index .. " is " .. rad .. " but should be ء (hamza on the line)")
end
end
local function check_waw_ya(index, rad)
if not is_waw_ya(rad) then
error("Radical " .. index .. " is " .. rad .. " but should be و or ي")
end
end
local function check_not_waw_ya(index, rad)
if is_waw_ya(rad) then
error("In a sound verb, radical " .. index .. " should not be و or ي")
end
end
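-- hamza_check() takes the radical's index first; without it the radical itself was never checked.
hamza_check(1, rad1)
hamza_check(2, rad2)
hamza_check(3, rad3)
hamza_check(4, rad4)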
if weakness == "assimilated" or weakness == "assimilated+final-weak" then
if rad1 ~= W then
error("Radical 1 is " .. rad1 .. " but should be و")
end
-- don't check that non-assimilated form I verbs don't have wāw as their
-- first radical because some form-I verbs exist where a first-radical wāw
-- behaves as sound, e.g. wajuha yawjuhu "to be distinguished".
end
if weakness == "final-weak" or weakness == "assimilated+final-weak" then
if rad4 then
check_waw_ya(4, rad4)
else
check_waw_ya(3, rad3)
end
elseif vform_supports_final_weak(form) then
-- non-final-weak verbs cannot have weak final radical if there's a corresponding
-- final-weak verb category. I think this is safe. We may have problems with
-- ḥayya/ḥayiya yaḥyā if we treat it as a geminate verb.
if rad4 then
check_not_waw_ya(4, rad4)
else
check_not_waw_ya(3, rad3)
end
end
if weakness == "hollow" then
check_waw_ya(2, rad2)
-- don't check that non-hollow verbs in forms that support hollow verbs
-- don't have wāw or yāʾ as their second radical because some verbs exist
-- where a middle-radical wāw/yāʾ behaves as sound, e.g. form-VIII izdawaja
-- "to be in pairs".
end
if weakness == "geminate" then
if rad4 then
error("Internal error: No geminate quadriliterals, should not be seen")
end
if rad2 ~= rad3 then
error("Weakness is geminate; radical 3 is " .. rad3 .. " but should be same as radical 2 " .. rad2)
end
elseif vform_supports_geminate(form) then
-- non-geminate verbs cannot have second and third radical same if there's
-- a corresponding geminate verb category. I think this is safe. We
-- don't fuss over double wāw or double yāʾ because this could legitimately
-- be a final-weak verb with middle wāw/yāʾ, treated as sound.
if rad4 then
error("Internal error: No quadriliterals should support geminate verbs")
end
if rad2 == rad3 and not is_waw_ya(rad2) then
error("Weakness is '" .. weakness .. "'; radical 2 and 3 are same at " .. rad2 .. " but should not be; consider making weakness 'geminate'")
end
end
end
-- array of substitutions; each element is a 2-entry array FROM, TO; do it
-- this way so the concatenations only get evaluated once
local postprocess_subs = {
-- reorder short-vowel + shadda -> shadda + short-vowel for easier processing
{"(" .. AIU .. ")" .. SH, SH .. "%1"},
----------same letter separated by sukūn should instead use shadda---------
------------happens e.g. in kun-nā "we were".-----------------
{"(.)" .. SK .. "%1", "%1" .. SH},
---------------------------- assimilated verbs ----------------------------
-- iw, iy -> ī (assimilated verbs)
{I .. W .. SK, II},
{I .. Y .. SK, II},
-- uw, uy -> ū (assimilated verbs)
{U .. W .. SK, UU},
{U .. Y .. SK, UU},
-------------- final -yā uses tall alif not alif maqṣūra ------------------
{"(" .. Y .. SH .. "?" .. A .. ")" .. AMAQ, "%1" .. ALIF},
----------------------- handle hamza assimilation -------------------------
-- initial hamza + short-vowel + hamza + sukūn -> hamza + long vowel
{HAMZA .. A .. HAMZA .. SK, HAMZA .. A .. ALIF},
{HAMZA .. I .. HAMZA .. SK, HAMZA .. I .. Y},
{HAMZA .. U .. HAMZA .. SK, HAMZA .. U .. W}
}
local postprocess_tr_subs = {
{"ī([" .. vowels .. "y*])", "iy%1"},
{"ū([" .. vowels .. "w*])", "uw%1"},
{"(.)%*", "%1%1"}, -- implement shadda
---------------------------- assimilated verbs ----------------------------
-- iw, iy -> ī (assimilated verbs)
{"iw([^" .. vowels .. "w])", "ī%1"},
{"iy([^" .. vowels .. "y])", "ī%1"},
-- uw, uy -> ū (assimilated verbs)
{"uw([^" .. vowels .. "w])", "ū%1"},
{"uy([^" .. vowels .. "y])", "ū%1"},
----------------------- handle hamza assimilation -------------------------
-- initial hamza + short-vowel + hamza + sukūn -> hamza + long vowel
{"ʔaʔ(" .. NV .. ")", "ʔā%1"},
{"ʔiʔ(" .. NV .. ")", "ʔī%1"},
{"ʔuʔ(" .. NV .. ")", "ʔū%1"},
}
-- Post-process verb parts to eliminate phonological anomalies. Many of the changes, particularly the tricky ones,
-- involve converting hamza to have the proper seat. The rules for this are complicated and are documented on the
-- [[w:Hamza]] Wikipedia page. In some cases there are alternatives allowed, and we handle them below by returning
-- multiple possibilities.
local function postprocess_term(term)
if term == "?" then
return "?"
end
-- Add BORDER at text boundaries.
term = BORDER .. term .. BORDER
-- Do the main post-processing, based on the pattern substitutions in postprocess_subs.
for _, sub in ipairs(postprocess_subs) do
term = rsub(term, sub[1], sub[2])
end
term = term:gsub(BORDER, "")
if not rfind(term, HAMZA) then
return term
end
term = term:gsub(HAMZA, HAMZA_PH)
term = ar_utilities.process_hamza(term)
if #term == 1 then
term = term[1]
end
return term
end
local function postprocess_translit(translit)
if translit == "?" then
return "?"
end
-- Add BORDER at text boundaries.
translit = BORDER .. translit .. BORDER
-- Do the main post-processing, based on the pattern substitutions in postprocess_tr_subs.
for _, sub in ipairs(postprocess_tr_subs) do
translit = rsub(translit, sub[1], sub[2])
end
translit = translit:gsub(BORDER, "")
return translit
end
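--[==[
Illustrative sketch only (inside a comment, so it does not run). It applies the "implement shadda" rule from
postprocess_tr_subs to an ASCII-only string using plain string.gsub; the module itself goes through rsub(), which is
assumed here to be a Unicode-aware equivalent. The sample input is hypothetical.

local function expand_shadda_sketch(translit)
	-- A character followed by * is doubled: "t*" -> "tt".
	-- The outer parentheses drop gsub's second return value (the match count).
	return (translit:gsub("(.)%*", "%1%1"))
end
local result = expand_shadda_sketch("kat*aba") -- "kattaba"
]==]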
local function postprocess_forms(base)
local converted_values = {}
for slot, forms in pairs(base.forms) do
local need_dedup = false
for i, form in ipairs(forms) do
local term = postprocess_term(form.form)
local translit = form.translit and postprocess_translit(form.translit) or nil
if term ~= form.form or translit ~= form.translit then
need_dedup = true
end
converted_values[i] = {term, translit}
end
if need_dedup then
local temp_dedup = {}
for i = 1, #forms do
local new_term, new_translit = unpack(converted_values[i])
if type(new_term) == "table" then
for _, nt in ipairs(new_term) do
local new_formobj = {
form = nt,
translit = new_translit,
footnotes = forms[i].footnotes,
}
iut.insert_form(temp_dedup, "temp", new_formobj)
end
else
local new_formobj = {
form = new_term,
translit = new_translit,
footnotes = forms[i].footnotes,
}
iut.insert_form(temp_dedup, "temp", new_formobj)
end
end
base.forms[slot] = temp_dedup.temp
end
end
end
local function process_slot_overrides(base)
for slot, forms in pairs(base.slot_overrides) do
local existing_values = base.forms[slot]
base.forms[slot] = nil
for _, form in ipairs(forms) do
-- + in active participle for form I requests slot ap1
if form.form == "+" and (base.verb_form ~= "I" or slot ~= "ap") then
if not existing_values then
error(("Slot '%s' requested the default value but no such value available"):format(slot))
end
-- We maintain an invariant that no two slots share a form object (although they may share the footnote
-- lists inside the form objects). However, there is no need to copy the form objects here because there
-- is a one-to-one correspondence between slots and slot overrides, i.e. you can't have a default value
-- go into two slots.
insert_form_or_forms(base, slot, existing_values, "allow overrides", form.uncertain)
elseif default_indicator_to_active_participle_slot[form.form] then
if form.form == "++" then
if slot ~= "vn" and slot ~= "ap" and slot ~= "pp" then
error(("Secondary default value request '++' only applicable to verbal nouns and participles, but found in slot '%s'"):
format(slot))
end
else
if slot ~= "ap" then
error(("Secondary default value request '%s' only applicable to active participles, but found in slot '%s'"):
format(form.form, slot))
end
end
local secondary_default_slot =
slot == "vn" and "vn2" or slot == "pp" and "pp2" or
default_indicator_to_active_participle_slot[form.form]
local existing_values = base.forms[secondary_default_slot]
if not existing_values then
error(("Slot '%s' requested a secondary default value using '%s' but no such value available"):
format(slot, form.form))
end
-- See comment above about the lack of need to copy the form objects.
insert_form_or_forms(base, slot, existing_values, "allow overrides", form.uncertain)
-- To make sure there aren't shared form objects.
base.forms[secondary_default_slot] = nil
else
insert_form_or_forms(base, slot, form, "allow overrides", form.uncertain)
end
end
end
-- Now, for non-stative form-I verbs, fill the active participle slot from ap1 unless it should be missing (e.g.
-- passive-only or user specified 'ap:-').
if base.verb_form == "I" and not base.forms.ap and base.forms.ap1 and not skip_slot(base, "ap") then
local saw_non_stative = false
for _, vowel_spec in ipairs(base.conj_vowels) do
if req(vowel_spec.past, A) then
saw_non_stative = true
break
end
end
if saw_non_stative then
base.forms.ap = base.forms.ap1
-- To make sure there aren't shared form objects.
base.forms.ap1 = nil
end
end
end
local function handle_lemma_linked(base)
-- Compute linked versions of potential lemma slots, for use in {{ar-verb}}. We substitute the original lemma
-- (before removing links) for forms that are the same as the lemma, if the original lemma has links.
for _, slot in ipairs(export.potential_lemma_slots) do
if base.forms[slot] then
insert_form_or_forms(base, slot .. "_linked", iut.map_forms(base.forms[slot], function(form)
if form == base.lemma and rfind(base.linked_lemma, "%[%[") then
return base.linked_lemma
else
return form
end
end))
end
end
end
-- Process specs given by the user using 'addnote[SLOTSPEC][FOOTNOTE][FOOTNOTE][...]'.
local function process_addnote_specs(base)
for _, spec in ipairs(base.addnote_specs) do
for _, slot_spec in ipairs(spec.slot_specs) do
slot_spec = "^" .. slot_spec .. "$"
for slot, forms in pairs(base.forms) do
if rfind(slot, slot_spec) then
-- To save on memory, side-effect the existing forms.
for _, form in ipairs(forms) do
form.footnotes = iut.combine_footnotes(form.footnotes, spec.footnotes)
end
end
end
end
end
end
local function add_missing_links_to_forms(base)
-- Any forms without links should get them now. Redundant ones will be stripped later.
for slot, forms in pairs(base.forms) do
for _, form in ipairs(forms) do
if not form.form:find("%[%[") then
form.form = "[[" .. form.form .. "]]"
end
end
end
end
local function conjugate_verb(base)
construct_stems(base)
for _, vowel_spec in ipairs(base.conj_vowels) do
-- Reconstruct conjugation type from verb form and (possibly inferred) weakness.
local conj_type = base.verb_form .. "-" .. vowel_spec.weakness
-- Check that the conjugation type is recognized.
if not conjugations[conj_type] then
error("Unknown conjugation type '" .. conj_type .. "'")
end
-- The way the conjugation functions work is they always add entries to the appropriate parts of the paradigm
-- (each of which is an array), rather than setting the values. This makes it possible to call more than one
-- conjugation function and essentially get a paradigm of the "either A or B" kind. Doing this may insert
-- duplicate entries into a particular paradigm part, but this is not a problem because we check for duplicate
-- entries when adding them, and don't insert in that case.
conjugations[conj_type](base, vowel_spec)
end
postprocess_forms(base)
process_slot_overrides(base)
-- This should happen before add_missing_links_to_forms() so that the comparison `form == base.lemma` in
-- handle_lemma_linked() works correctly and compares unlinked forms to unlinked forms.
handle_lemma_linked(base)
process_addnote_specs(base)
if not base.alternant_multiword_spec.args.noautolinkverb then
add_missing_links_to_forms(base)
end
end
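--[==[
Illustrative sketch only (inside a comment, so it does not run). It mirrors the lookup in conjugate_verb(): the key is
the verb form joined to the weakness with a hyphen, and every handler appends to the paradigm instead of overwriting
it. The table and handlers below are hypothetical stand-ins for `conjugations`.

local handlers = {
	["IV-sound"] = function(forms) table.insert(forms, "IV-sound forms") end,
	["IV-hollow"] = function(forms) table.insert(forms, "IV-hollow forms") end,
}
local function dispatch_sketch(verb_form, weaknesses)
	local forms = {}
	for _, weakness in ipairs(weaknesses) do
		local conj_type = verb_form .. "-" .. weakness
		if not handlers[conj_type] then
			error("Unknown conjugation type '" .. conj_type .. "'")
		end
		-- Handlers add entries rather than set them, so several can contribute to one paradigm.
		handlers[conj_type](forms)
	end
	return forms
end
local forms = dispatch_sketch("IV", {"sound", "hollow"}) -- two entries, one per handler
]==]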
local function parse_indicator_spec(angle_bracket_spec)
-- Store the original angle bracket spec so we can reconstruct the overall conj spec with the lemma(s) in them.
local base = {
angle_bracket_spec = angle_bracket_spec,
conj_vowels = {},
root_consonants = {},
user_stem_overrides = {},
user_slot_overrides = {},
slot_explicitly_missing = {},
slot_uncertain = {},
slot_override_uses_default = {},
addnote_specs = {},
}
local function parse_err(msg)
error(msg .. ": " .. angle_bracket_spec)
end
local function fetch_footnotes(separated_group)
local footnotes
for j = 2, #separated_group - 1, 2 do
if separated_group[j + 1] ~= "" then
parse_err("Extraneous text after bracketed footnotes: '" .. table.concat(separated_group) .. "'")
end
if not footnotes then
footnotes = {}
end
table.insert(footnotes, separated_group[j])
end
return footnotes
end
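--[==[
Illustrative sketch only (inside a comment, so it does not run). It restates fetch_footnotes() as a standalone
function to show the input shape it expects: an alternating run in which odd positions hold plain text and even
positions hold bracketed footnotes. The sample lists are assumptions about what the parsing utilities return for
input such as `a[note 1][note 2]`.

local function fetch_footnotes_sketch(group)
	local footnotes
	for j = 2, #group - 1, 2 do
		-- Only empty text may follow a bracketed footnote.
		if group[j + 1] ~= "" then
			error("Extraneous text after bracketed footnotes: '" .. table.concat(group) .. "'")
		end
		footnotes = footnotes or {}
		table.insert(footnotes, group[j])
	end
	return footnotes
end
local notes = fetch_footnotes_sketch({"a", "[note 1]", "", "[note 2]", ""})
-- notes[1] == "[note 1]" and notes[2] == "[note 2]"
local none = fetch_footnotes_sketch({"a"})
-- none == nil, because the loop body never runs for a one-element group
]==]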
local inside = angle_bracket_spec:match("^<(.*)>$")
assert(inside)
local segments = put.parse_multi_delimiter_balanced_segment_run(inside, {{"[", "]"}, {"<", ">"}})
local dot_separated_groups = put.split_alternating_runs_and_strip_spaces(segments, "%.")
-- The first dot-separated element must specify the verb form, e.g. IV or IIq. If the form is I, it needs to include
-- the past and non-past vowels, e.g. I/a~u for kataba ~ yaktubu. More than one vowel can be given,
-- comma-separated, and more than one past~non-past pair can be given, slash-separated, e.g. I/a,u~u/i~a for form I
-- كمل, which can be conjugated as kamala/kamula ~ yakmulu or kamila ~ yakmalu. An individual vowel spec must be one
-- of a, i or u and in general (a) at least one past~non-past pair must be given, and (b) both past and non-past
-- vowels must be given even though sometimes the vowel can be determined from the unvocalized form. An exception is
-- passive-only verbs, where the vowels can't in general be determined (except indirectly in some cases by looking
-- at an associated non-passive verb); in that case, the vowel~vowel spec can be left out.
local slash_separated_groups = put.split_alternating_runs_and_strip_spaces(dot_separated_groups[1], "/")
local form_spec = slash_separated_groups[1]
base.form_footnotes = fetch_footnotes(form_spec)
if form_spec[1] == "" then
parse_err("Missing verb form")
end
if not allowed_vforms_with_weakness_set[form_spec[1]] then
parse_err(("Unrecognized verb form '%s', should be one of %s"):format(
form_spec[1], list_to_text(allowed_vforms, nil, " or ")))
end
if form_spec[1]:find("%-") then
base.verb_form, base.explicit_weakness = form_spec[1]:match("^(.-)%-(.*)$")
else
base.verb_form = form_spec[1]
end
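--[==[
Illustrative sketch only (inside a comment, so it does not run). It shows what the pattern above yields for a form
spec that carries an explicit weakness. "I-hollow" is used as a sample value on the assumption that it belongs to
allowed_vforms_with_weakness_set.

local verb_form, explicit_weakness = ("I-hollow"):match("^(.-)%-(.*)$")
-- verb_form == "I" and explicit_weakness == "hollow": the lazy (.-) stops at the first hyphen.
local bare = ("IV"):match("^(.-)%-(.*)$")
-- bare == nil, which is why the code first tests for a hyphen with find("%-").
]==]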
if #slash_separated_groups > 1 then
if base.verb_form ~= "I" then
parse_err(("Past~non-past vowels can only be specified when verb form is I, but saw form '%s'"):format(
base.verb_form))
end
for i = 2, #slash_separated_groups do
local slash_separated_group = slash_separated_groups[i]
local tilde_separated_groups = put.split_alternating_runs_and_strip_spaces(slash_separated_group, "~")
if #tilde_separated_groups ~= 2 then
parse_err(("Expected two tilde-separated vowel specs: %s"):format(table.concat(slash_separated_group)))
end
local function parse_conj_vowels(tilde_separated_group, vtype)
local conj_vowel_objects = {}
local comma_separated_groups = put.split_alternating_runs_and_strip_spaces(tilde_separated_group, ",")
for _, comma_separated_group in ipairs(comma_separated_groups) do
local conj_vowel = comma_separated_group[1]
if conj_vowel ~= "a" and conj_vowel ~= "i" and conj_vowel ~= "u" then
parse_err(("Expected %s conjugation vowel '%s' to be one of a, i or u in %s"):format(
vtype, conj_vowel, table.concat(slash_separated_group)))
end
conj_vowel = dia[conj_vowel]
local conj_vowel_footnotes = fetch_footnotes(comma_separated_group)
-- Try to use strings when possible as it makes q() significantly more efficient.
if conj_vowel_footnotes then
table.insert(conj_vowel_objects, {form = conj_vowel, footnotes = conj_vowel_footnotes})
else
table.insert(conj_vowel_objects, conj_vowel)
end
end
return conj_vowel_objects
end
local conj_vowel_spec = {
past = parse_conj_vowels(tilde_separated_groups[1], "past"),
nonpast = parse_conj_vowels(tilde_separated_groups[2], "non-past"),
}
table.insert(base.conj_vowels, conj_vowel_spec)
end
end
for i = 2, #dot_separated_groups do
local dot_separated_group = dot_separated_groups[i]
local first_element = dot_separated_group[1]
if first_element == "addnote" then
local spec_and_footnotes = fetch_footnotes(dot_separated_group)
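-- fetch_footnotes() returns nil when no bracketed groups are present, so guard before taking the length.
if not spec_and_footnotes or #spec_and_footnotes < 2 then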
parse_err("Spec with 'addnote' should be of the form 'addnote[SLOTSPEC][FOOTNOTE][FOOTNOTE][...]'")
end
local slot_spec = table.remove(spec_and_footnotes, 1)
local slot_spec_inside = rmatch(slot_spec, "^%[(.*)%]$")
if not slot_spec_inside then
parse_err("Internal error: slot_spec " .. slot_spec .. " should be surrounded with brackets")
end
local slot_specs = rsplit(slot_spec_inside, ",")
-- FIXME: Here, [[Module:it-verb]] called strip_spaces(). Generally we don't do this. Should we?
table.insert(base.addnote_specs, {slot_specs = slot_specs, footnotes = spec_and_footnotes})
elseif first_element:find("^var:") then
if #dot_separated_group > 1 then
parse_err(("Can't attach footnotes to 'var:' spec '%s'"):format(first_element))
end
base.variant = first_element:match("^var:(.*)$")
elseif first_element:find("^I+V?:") then
local root_cons, root_cons_value = first_element:match("^(I+V?):(.*)$")
local root_index
if root_cons == "I" then
root_index = 1
elseif root_cons == "II" then
root_index = 2
elseif root_cons == "III" then
root_index = 3
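elseif root_cons == "IV" then
root_index = 4
if not base.verb_form:find("q$") then
parse_err(("Can't specify root consonant IV for non-quadriliteral verb form '%s': %s"):format(
base.verb_form, first_element))
end
else
-- The pattern "^I+V?:" also admits strings such as "IIII" or "IIV"; reject them here rather than
-- indexing base.root_consonants with nil below.
parse_err(("Unrecognized root consonant index '%s' in spec '%s'; expected I, II, III or IV"):format(
root_cons, first_element))
end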
local cons, translit = root_cons_value:match("^(.*)//(.*)$")
if not cons then
cons = root_cons_value
end
local root_footnotes = fetch_footnotes(dot_separated_group)
if not translit and not root_footnotes then
base.root_consonants[root_index] = cons
else
base.root_consonants[root_index] = {form = cons, translit = translit, footnotes = root_footnotes}
end
elseif first_element:find("^[a-z][a-z0-9_]*:") then
local slot_or_stem, remainder = first_element:match("^(.-):(.*)$")
dot_separated_group[1] = remainder
local comma_separated_groups = put.split_alternating_runs_and_strip_spaces(dot_separated_group, "[,،]")
if overridable_stems[slot_or_stem] then
if base.user_stem_overrides[slot_or_stem] then
parse_err("Overridable stem '" .. slot_or_stem .. "' specified twice")
end
base.user_stem_overrides[slot_or_stem] = overridable_stems[slot_or_stem](comma_separated_groups,
{prefix = slot_or_stem, base = base, parse_err = parse_err, fetch_footnotes = fetch_footnotes})
else -- assume a form override; we validate further later when the possible slots are available
if base.user_slot_overrides[slot_or_stem] then
parse_err("Form override '" .. slot_or_stem .. "' specified twice")
end
base.user_slot_overrides[slot_or_stem] = allow_multiple_values_for_override(comma_separated_groups,
{prefix = slot_or_stem, base = base, parse_err = parse_err, fetch_footnotes = fetch_footnotes},
"is form override")
end
elseif indicator_flags[first_element] then
if #dot_separated_group > 1 then
parse_err("No footnotes allowed with '" .. first_element .. "' spec")
end
if base[first_element] then
parse_err("Spec '" .. first_element .. "' specified twice")
end
base[first_element] = true
else
local passive, uncertain = first_element:match("^(.*)(%?)$")
passive = passive or first_element
uncertain = not not uncertain
if passive_types[passive] then
if #dot_separated_group > 1 then
parse_err("No footnotes allowed with '" .. passive .. "' spec")
end
if base.passive then
parse_err("Value for passive type specified twice")
end
base.passive = passive
base.passive_uncertain = uncertain
else
parse_err("Unrecognized spec '" .. first_element .. "'")
end
end
end
return base
end
-- Normalize all lemmas, substituting the pagename for blank lemmas and adding links to multiword lemmas.
local function normalize_all_lemmas(alternant_multiword_spec, head)
-- (1) Add links to all before and after text. Remember the original text so we can reconstruct the verb spec later.
if not alternant_multiword_spec.args.noautolinktext then
iut.add_links_to_before_and_after_text(alternant_multiword_spec, "remember original")
end
-- (2) Remove any links from the lemma, but remember the original form so we can use it below in the 'lemma_linked'
-- form.
iut.map_word_specs(alternant_multiword_spec, function(base)
if base.lemma == "" then
base.lemma = head
end
base.user_specified_lemma = base.lemma
base.lemma = m_links.remove_links(base.lemma)
base.user_specified_verb = base.lemma
base.verb = base.user_specified_verb
local linked_lemma
if alternant_multiword_spec.args.noautolinkverb or base.user_specified_lemma:find("%[%[") then
linked_lemma = base.user_specified_lemma
else
-- Add links to the lemma so the user doesn't specifically need to, since we preserve
-- links in multiword lemmas and include links in non-lemma forms rather than allowing
-- the entire form to be a link.
linked_lemma = iut.add_links(base.user_specified_lemma)
end
base.linked_lemma = linked_lemma
end)
end
-- Determine weakness from radicals. Used when root given in place of lemma (e.g. for {{ar-verb forms}}).
local function weakness_from_radicals(form, rad1, rad2, rad3, rad4)
local weakness = nil
local quadlit = form:find("q$")
-- If weakness unspecified, derive from radicals.
if not quadlit then
if is_waw_ya(rad3) and rad1 == W and form == "I" then
weakness = "assimilated+final-weak"
elseif is_waw_ya(rad3) and vform_supports_final_weak(form) then
weakness = "final-weak"
elseif rad2 == rad3 and vform_supports_geminate(form) then
weakness = "geminate"
elseif is_waw_ya(rad2) and vform_supports_hollow(form) then
weakness = "hollow"
elseif rad1 == W and form == "I" then
weakness = "assimilated"
else
weakness = "sound"
end
else
if is_waw_ya(rad4) then
weakness = "final-weak"
else
weakness = "sound"
end
end
return weakness
end
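-- Worked examples for weakness_from_radicals(), read off the branches above. This is a sketch only: it assumes W and Y
-- are the module's wāw and yāʔ constants, that is_waw_ya() is true for exactly those two letters, and it avoids cases
-- that depend on the vform_supports_*() helpers (defined elsewhere):
--   weakness_from_radicals("I", W, "ق", Y)            --> "assimilated+final-weak" (first branch)
--   weakness_from_radicals("Iq", "ت", "ر", "ج", "م")  --> "sound" (quadriliteral, fourth radical not weak)
--   weakness_from_radicals("Iq", "ت", "ر", "ج", Y)    --> "final-weak" (quadriliteral, fourth radical weak)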
-- Join the infixed tāʔ (ت) to the first radical in form VIII verbs. This may cause assimilation of the tāʔ to the
-- radical or in some cases the radical to the tāʔ. Used when a root is supplied instead of a lemma (which already has
-- the appropriate assimilation in it).
local function form_viii_join_ta(rad)
if rad == W or rad == Y or rad == "ت" then return "تّ"
elseif rad == "د" then return "دّ"
elseif rad == "ث" then return "ثّ"
elseif rad == "ذ" then return "ذّ"
elseif rad == "ز" then return "زْد"
elseif rad == "ص" then return "صْط"
elseif rad == "ض" then return "ضْط"
elseif rad == "ط" then return "طّ"
elseif rad == "ظ" then return "ظّ"
else return rad .. SK .. "ت"
end
end
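-- Worked examples for form_viii_join_ta(), taken directly from the branches above (SK is assumed to be the sukūn
-- constant defined elsewhere in this module):
--   form_viii_join_ta(W)    --> "تّ"  (weak first radical assimilates to the infixed tāʔ)
--   form_viii_join_ta("ز")  --> "زْد"  (infixed tāʔ becomes dāl after zāy)
--   form_viii_join_ta("ص")  --> "صْط"  (infixed tāʔ becomes ṭāʔ after ṣād)
--   form_viii_join_ta("ك")  --> "ك" .. SK .. "ت"  (default: no assimilation)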
local function detect_indicator_spec(base)
base.forms = {}
base.stem_overrides = {}
base.slot_overrides = {}
if not base.conj_vowels[1] then
-- These may be converted to inferred vowels. If not, we throw an error if form I and not passive-only.
base.conj_vowels = {{
past = "-",
nonpast = "-",
}}
else
-- If multiple vowels are specified for a given vowel type (e.g. a,u~u), expand so that each spec in
-- `base.conj_vowels` has exactly one past vowel and one non-past vowel.
local expansion = {}
for _, spec in ipairs(base.conj_vowels) do
for _, past in ipairs(spec.past) do
for _, nonpast in ipairs(spec.nonpast) do
table.insert(expansion, {past = past, nonpast = nonpast})
end
end
end
base.conj_vowels = expansion
end
local vform = base.verb_form
-- check for quadriliteral form (Iq, IIq, IIIq, IVq)
base.quadlit = not not vform:find("q$")
-- Infer radicals as necessary. We infer a separate set of radicals for each past~non-past vowel combination because
-- they may be different (particularly with form-I hollow verbs).
for _, vowel_spec in ipairs(base.conj_vowels) do
-- NOTE: rad1, rad2, etc. refer to user-specified radicals, which are formobj tables that optionally specify an
-- explicit manual translit, whereas ir1, ir2, etc. refer to inferred radicals, which are either strings or
-- lists of possible radicals.
local rads = base.root_consonants
local rad1, rad2, rad3, rad4 = rads[1], rads[2], rads[3], rads[4]
-- Default any unspecified radicals to radicals determined from the headword. The returned radicals may be
-- lists of possible radicals, where the first radical should be chosen if the user didn't explicitly specify a
-- radical but all are allowed. If `ambig = true` is set in the table, the radical is considered ambiguous and
-- categories won't be created for weak radicals.
local weakness, ir1, ir2, ir3, ir4
if vform ~= "none" then
ir1, ir2, ir3 = rmatch(base.lemma, "^([^_])_([^_])_([^_])$")
if not ir1 then
ir1, ir2, ir3, ir4 = rmatch(base.lemma, "^([^_])_([^_])_([^_])_([^_])$")
end
if ir1 then
-- root given instead of lemma
weakness = weakness_from_radicals(vform, ir1, ir2, ir3, ir4)
if vform == "VIII" then
vowel_spec.form_viii_assim = form_viii_join_ta(ir1)
end
else
local ret = export.infer_radicals {
headword = base.lemma,
vform = vform,
passive = base.passive,
past_vowel = vowel_spec.past,
nonpast_vowel = vowel_spec.nonpast,
is_reduced = base.reduced,
}
weakness, ir1, ir2, ir3, ir4 = ret.weakness, ret.rad1, ret.rad2, ret.rad3, ret.rad4
vowel_spec.form_viii_assim = ret.form_viii_assim
vowel_spec.past = ret.past_vowel
vowel_spec.nonpast = ret.nonpast_vowel
vowel_spec.variant = base.variant or ret.variant
end
end
-- For most ambiguous radicals, the choice of radical doesn't matter because it doesn't affect the conjugation
-- one way or another. For form I hollow verbs, however, it definitely does. In fact, the choice of radical is
-- critical even beyond the past and non-past vowels because it affects the form of the passive participle. So,
-- check for this and signal an error if the radical could not be inferred and is not given explicitly.
if vform == "I" and type(ir2) == "table" and ir2.need_radical and not rad2 then
error("Unable to guess middle radical of hollow form I verb; need to specify radical explicitly")
end
if vform == "I" and not is_passive_only(base.passive) and (
rget(vowel_spec.past) == "-" or rget(vowel_spec.nonpast) == "-") then
error("Form I verb that isn't passive-only or final-weak must have past~non-past vowels specified")
end
-- Convert ambiguous radicals.
local function regularize_inferred_radical(rad)
if type(rad) == "table" then
if rad.ambig then
return {form = rad[1], ambig = true}
else
return rad[1]
end
else
return rad
end
end
-- Return the appropriate radical at index `index` (1 through 4), based either on the user-specified radical
-- `user_radical` or (if unspecified) `inferred_radical`, inferred from the unvocalized lemma. Two values are
-- returned, the "regularized" version of the radical (where ambiguous inferred radicals are converted to their
-- most likely actual radical) and the non-regularized version. The returned values are form objects rather than
-- strings.
local function fetch_radical(user_radical, inferred_radical, index)
if not user_radical then
return regularize_inferred_radical(inferred_radical), inferred_radical
else
local rad_formval = rget(user_radical)
if type(inferred_radical) == "table" then
local allowed_radical_set = m_table.listToSet(inferred_radical)
if not allowed_radical_set[rad_formval] then
error(("For lemma %s, radical %s ambiguously inferred as %s but user radical incompatibly given as %s"):
format(base.lemma, index,
list_to_text(inferred_radical, nil, " or "), rad_formval))
end
elseif rad_formval ~= inferred_radical then
error(("For lemma %s, radical %s inferred as %s but user radical incompatibly given as %s"):
format(base.lemma, index, inferred_radical, rad_formval))
end
return user_radical, user_radical
end
end
if vform ~= "none" then
vowel_spec.rad1, vowel_spec.unreg_rad1 = fetch_radical(rad1, ir1, 1)
vowel_spec.rad2, vowel_spec.unreg_rad2 = fetch_radical(rad2, ir2, 2)
vowel_spec.rad3, vowel_spec.unreg_rad3 = fetch_radical(rad3, ir3, 3)
if base.quadlit then
vowel_spec.rad4, vowel_spec.unreg_rad4 = fetch_radical(rad4, ir4, 4)
end
end
if vform == "I" then
-- If explicit weakness given using 'I-sound' or 'I-assimilated', we may need to adjust the inferred weakness.
if base.explicit_weakness == "sound" then
if weakness == "assimilated" then
weakness = "sound"
elseif weakness == "assimilated+final-weak" then
-- Verbs like waniya~yawnā "to be faint; to languish" (although the defaults should handle this
-- correctly)
weakness = "final-weak"
else
error(("Can't specify form 'I-sound' when inferred weakness is '%s' for lemma %s"):format(
weakness, base.lemma))
end
elseif base.explicit_weakness == "assimilated" then
if weakness == "sound" then
-- i~a verbs like waṭiʔa~yaṭaʔu "to tread, to trample"; wasiʕa~yasaʕu "to be spacious; to be well-off";
-- waṯiʔa~yaṯaʔu "to get bruised, to be sprained", which would default to sound.
weakness = "assimilated"
elseif weakness == "final-weak" then
-- For completeness; not clear if any verbs occur where this is needed. (There are plenty of
-- assimilated+final-weak verbs but the defaults should take care of them.)
weakness = "assimilated+final-weak"
else
error(("Can't specify form 'I-assimilated' when inferred weakness is '%s' for lemma %s"):format(
weakness, base.lemma))
end
elseif base.explicit_weakness then
error(("Internal error: Unrecognized value '%s' for base.explicit_weakness"):format(base.explicit_weakness))
end
elseif vform == "none" then
weakness = base.explicit_weakness
elseif base.explicit_weakness then
error(("Internal error: Explicit weakness should not be specifiable except with forms I and none, but saw explicit weakness '%s' with verb form '%s'"):
format(base.explicit_weakness, vform))
end
vowel_spec.weakness = weakness
if vform ~= "none" then
-- Error if radicals are wrong given the weakness. More likely to happen if the weakness is explicitly given
-- rather than inferred. Will also happen if certain incorrect letters are included as radicals e.g. hamza on
-- top of various letters, alif maqṣūra, tā' marbūṭa.
check_radicals(vform, weakness, rget(vowel_spec.rad1), rget(vowel_spec.rad2), rget(vowel_spec.rad3),
base.quadlit and rget(vowel_spec.rad4) or nil)
end
-- Check the variant value.
local form_iii_vi_geminate = (vform == "III" or vform == "VI") and rget(vowel_spec.rad2) == rget(vowel_spec.rad3) and
not req(vowel_spec.rad2, Y)
local hayy_i_x = hayy_radicals(vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3) and (vform == "I" or vform == "X")
if form_iii_vi_geminate or hayy_i_x then
if vowel_spec.variant and vowel_spec.variant ~= "long" and vowel_spec.variant ~= "short" and vowel_spec.variant ~= "both" then
error(("For form-III/VI geminate verb or form-I/X verb with ح-ي-ي radicals, saw unrecognized 'var:%s' value; should be 'var:long', 'var:short' or 'var:both'"):format(
vowel_spec.variant))
end
elseif vowel_spec.variant then
error(("Variant value 'var:%s' not allowed in this context"):format(vowel_spec.variant))
end
end
-- If form I, regroup expanded vowels for display purposes.
if vform == "I" then
local group_by_past = {}
for _, vowel_spec in ipairs(base.conj_vowels) do
m_table.insertIfNot(group_by_past, {
past = undia[rget(vowel_spec.past)],
nonpasts = {undia[rget(vowel_spec.nonpast)]},
}, {
key = function(obj) return obj.past end,
combine = function(obj1, obj2)
for _, nonpast in ipairs(obj2.nonpasts) do
m_table.insertIfNot(obj1.nonpasts, nonpast)
end
end,
})
end
local group_by_nonpast = {}
for _, vowel_spec in ipairs(group_by_past) do
m_table.insertIfNot(group_by_nonpast, {
pasts = {vowel_spec.past},
nonpasts = vowel_spec.nonpasts,
}, {
key = function(obj) return obj.nonpasts end,
combine = function(obj1, obj2)
for _, past in ipairs(obj2.pasts) do
m_table.insertIfNot(obj1.pasts, past)
end
end,
})
end
base.grouped_conj_vowels = group_by_nonpast
end
-- Set value of passive. If not specified, default is yes for forms II, III, IV and Iq; no but uncertainly for
-- forms VII, IX, XI - XV and IIIq - IVq, as well as form I with past vowel u; impersonal but uncertainly for form
-- V, VI, X and IIq, as well as form I with past vowel i; and yes but uncertainly for the remainder (form I with
-- past vowel only a and form VIII).
if not base.passive then
base.passive_defaulted = true
-- Temporary tracking for defaulted passives by verb form, weakness and (for form I) past/non-past vowels.
track_if_ar_conj(base, "passive-defaulted/" .. vform)
for _, vowel_spec in ipairs(base.conj_vowels) do
track_if_ar_conj(base, "passive-defaulted/" .. vform .. "/" .. vowel_spec.weakness)
if vform == "I" then
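-- Use rget() since the vowels may be form objects (with footnotes) rather than plain strings.
local past_nonpast = ("%s~%s"):format(undia[rget(vowel_spec.past)], undia[rget(vowel_spec.nonpast)])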
track_if_ar_conj(base, "passive-defaulted/I/" .. past_nonpast)
track_if_ar_conj(base, "passive-defaulted/I/" .. vowel_spec.weakness .. "/" .. past_nonpast)
end
end
if vform_probably_full_passive(vform) then
base.passive = "pass"
else
base.passive_uncertain = true
for _, vowel_spec in ipairs(base.conj_vowels) do
if vform_probably_no_passive(vform, vowel_spec.weakness, vowel_spec.past, vowel_spec.nonpast) then
base.passive = "nopass"
break
elseif vform_probably_impersonal_passive(vform, vowel_spec.weakness, vowel_spec.past,
vowel_spec.nonpast) then
base.passive = "ipass"
break
end
end
base.passive = base.passive or "pass"
end
end
-- NOTE: Currently there are no built-in stems or form overrides for Arabic; this code is inherited from
-- [[Module:ca-verb]], where such things do exist, and is kept for generality in case we decide in the future to
-- implement such things.
-- Override built-in verb stems and overrides with user-specified ones.
for stem, values in pairs(base.user_stem_overrides) do
base.stem_overrides[stem] = values
end
for slot, values in pairs(base.user_slot_overrides) do
if not base.alternant_multiword_spec.verb_slots_map[slot] then
error("Unrecognized override slot '" .. slot .. "': " .. base.angle_bracket_spec)
end
if export.unsettable_slots_set[slot] then
error("Slot '" .. slot .. "' cannot be set using an override: " .. base.angle_bracket_spec)
end
if skip_slot(base, slot, "allow overrides") then
error("Override slot '" .. slot ..
"' would be skipped based on the passive, 'noimp' and/or 'no_nonpast' settings: " ..
base.angle_bracket_spec)
end
base.slot_overrides[slot] = values
end
if base.verb_form == "none-final-weak" then
for _, stem_type in ipairs { "past", "past_pass", "nonpast", "nonpast_pass" } do
if base.stem_overrides[stem_type .. "_c"] or base.stem_overrides[stem_type .. "_v"] then
error(("Specify %s stem for verb type 'none-final-weak' using '%s:...' not '%s_c:...' or '%s_v:...'"):
format(stem_type, stem_type, stem_type, stem_type))
end
end
for _, stem_type in ipairs { "past", "nonpast" } do
if base.stem_overrides[stem_type] and not base.stem_overrides[stem_type .. "_final_weak_vowel"] then
error(("For verb type 'none-final-weak', if '%s:...' specified, so must '%s_final_weak_vowel:...'"):
format(stem_type, stem_type))
end
end
end
end
local function detect_all_indicator_specs(alternant_multiword_spec)
add_slots(alternant_multiword_spec)
alternant_multiword_spec.verb_forms = {}
-- This means at least one individual base had the slot marked as explicitly missing. Another base (e.g. when
-- there are multiple alternants) might have a value for the slot. In practice, we only respect this when there are
-- no overall values in the slot and `slot_uncertain` isn't set; in this case, we display "no ..." for the slot
-- instead of simply not displaying anything for the slot.
alternant_multiword_spec.slot_explicitly_missing = {}
-- This means at least one individual base had no values for the slot and the slot marked as explicitly uncertain.
-- Note that this is different from a value being present but marked as uncertain (e.g. if an override was given
-- with a ? after it); this causes the form object for the value to have `uncertain = true` set. If there are no
-- overall values in the slot and `slot_uncertain` is set, we display this in the headword.
alternant_multiword_spec.slot_uncertain = {}
iut.map_word_specs(alternant_multiword_spec, function(base)
-- So arguments, etc. can be accessed. WARNING: Creates circular reference.
base.alternant_multiword_spec = alternant_multiword_spec
detect_indicator_spec(base)
if not base.nocat then
m_table.insertIfNot(alternant_multiword_spec.verb_forms, base.verb_form)
end
if base.passive_uncertain then
alternant_multiword_spec.passive_uncertain = true
end
for slot, _ in pairs(base.slot_explicitly_missing) do
alternant_multiword_spec.slot_explicitly_missing[slot] = true
end
end)
end
local function determine_slot_uncertainty_from_forms(alternant_multiword_spec)
iut.map_word_specs(alternant_multiword_spec, function(base)
-- If no verbal noun and verb form is not 'none' (manually-specified stems) — which currently only happens for
-- form I — and the verbal noun wasn't explicitly indicated as missing using <vn:->, we assume it's just
-- unknown/unspecified rather than missing. Same with active participles.
for uncertain_slot, _ in pairs(slots_that_may_be_uncertain) do
if not base.forms[uncertain_slot] and base.verb_form ~= "none" and not skip_slot(base, uncertain_slot) then
base.slot_uncertain[uncertain_slot] = true
end
end
-- Propagate slot uncertainty up. Currently only the verbal noun can have this set but we write the code
-- generally.
for slot, _ in pairs(base.slot_uncertain) do
alternant_multiword_spec.slot_uncertain[slot] = true
end
end)
-- If slot is uncertain and has no value, explicitly set its value to "?".
for uncertain_slot, _ in pairs(slots_that_may_be_uncertain) do
if not alternant_multiword_spec.forms[uncertain_slot] and
alternant_multiword_spec.slot_uncertain[uncertain_slot] then
alternant_multiword_spec.forms[uncertain_slot] = {{form = "?"}}
end
end
end
-- Determine certain properties of the verb from the overall forms, such as whether the verb is active-only or
-- passive-only, is impersonal, lacks an imperative, etc.
local function determine_verb_properties_from_forms(alternant_multiword_spec)
alternant_multiword_spec.has_active = false
alternant_multiword_spec.has_passive = false
alternant_multiword_spec.has_non_impers_active = false
alternant_multiword_spec.has_non_impers_passive = false
alternant_multiword_spec.has_imp = false
alternant_multiword_spec.has_past = false
alternant_multiword_spec.has_nonpast = false
for slot, _ in pairs(alternant_multiword_spec.forms) do
if slot == "ap" or slot:find("[123]") and not slot:find("_pass") then
alternant_multiword_spec.has_active = true
end
if slot == "pp" or slot:find("[123]") and slot:find("_pass") then
alternant_multiword_spec.has_passive = true
end
if slot:find("[123]") and not slot:find("pass_[123]") and not slot:find("3ms") then
alternant_multiword_spec.has_non_impers_active = true
end
if slot:find("pass_[123]") and not slot:find("3ms") then
alternant_multiword_spec.has_non_impers_passive = true
end
if slot:find("^imp_") then
alternant_multiword_spec.has_imp = true
end
if slot:find("^past_") then
alternant_multiword_spec.has_past = true
end
if slot:find("^ind_") or slot:find("^sub_") or slot:find("^juss_") then
alternant_multiword_spec.has_nonpast = true
end
end
end
local function add_categories_and_annotation(alternant_multiword_spec, base, multiword_lemma, insert_ann, insert_cat)
-- Useful e.g. in constructing suppletive verbs out of parts. For a verb like جاء or أتى whose imperative comes
-- from the unrelated verb تعالى, we don't want the latter verb showing up in categories or annotations.
if base.nocat then
return
end
local vform = base.verb_form
if vform ~= "none" then
insert_ann("form", vform)
insert_cat("form-" .. vform .. " verbs")
end
if base.reduced then
insert_ann("reduced", "reduced")
if vform ~= "none" then
insert_cat("form-" .. vform .. " reduced verbs")
end
end
if base.quadlit then
insert_cat("verbs with quadriliteral roots")
end
if base.passive_defaulted then
insert_cat("verbs with defaulted passive")
end
for _, vowel_spec in ipairs(base.conj_vowels) do
local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
local final_weak = is_final_weak(base, vowel_spec)
local weakness = vowel_spec.weakness
-- We have to distinguish weakness by form and weakness by conjugation. Weakness by form merely indicates the
-- presence of weak letters in certain positions in the radicals. Weakness by conjugation is related to how the
-- verbs are conjugated. For example, form-II verbs that are "hollow by form" (middle radical is wāw or yāʾ) are
-- conjugated as sound verbs. Another example: form-I verbs with initial wāw are "assimilated by form" and most
-- are assimilated by conjugation as well, but a few are sound by conjugation, e.g. wajuha yawjuhu "to be
-- distinguished" (rather than wajuha yajuhu); similarly for some hollow-by-form verbs in various forms, e.g.
-- form VIII izdawaja yazdawiju "to be in pairs" (rather than izdāja yazdāju). Categories referring to weakness
-- always refer to weakness by conjugation; weakness by form is distinguished only by categories such as
-- [[:Category:Arabic form-III verbs with و as second radical]].
insert_ann("weakness", weakness)
if vform ~= "none" then
insert_cat(("%s form-%s verbs"):format(weakness, vform))
end
local function radical_is_ambiguous(rad)
return type(rad) == "table" and rad.ambig
end
local function radical_is_unambiguous_weak(rad)
return not radical_is_ambiguous(rad) and (is_waw_ya(rad) or req(rad, HAMZA))
end
if vform ~= "none" then
local ur1, ur2, ur3, ur4 =
vowel_spec.unreg_rad1, vowel_spec.unreg_rad2, vowel_spec.unreg_rad3, vowel_spec.unreg_rad4
-- Create headword categories based on the radicals. Do the following before
-- converting the Latin radicals into Arabic ones so we distinguish
-- between ambiguous and non-ambiguous radicals.
if radical_is_ambiguous(ur1) or radical_is_ambiguous(ur2) or radical_is_ambiguous(ur3) or
ur4 and radical_is_ambiguous(ur4) then
insert_cat("verbs with ambiguous radicals")
end
if radical_is_unambiguous_weak(ur1) then
insert_cat("form-" .. vform .. " verbs with " .. rget(ur1) .. " as first radical")
end
if radical_is_unambiguous_weak(ur2) then
insert_cat("form-" .. vform .. " verbs with " .. rget(ur2) .. " as second radical")
end
if radical_is_unambiguous_weak(ur3) then
insert_cat("form-" .. vform .. " verbs with " .. rget(ur3) .. " as third radical")
end
if ur4 and radical_is_unambiguous_weak(ur4) then
insert_cat("form-" .. vform .. " verbs with " .. rget(ur4) .. " as fourth radical")
end
end
end
if vform == "I" and not is_passive_only(base.passive) then
for _, vowel_spec in ipairs(base.grouped_conj_vowels) do
insert_ann("vowels",
("%s ~ %s"):format(table.concat(vowel_spec.pasts, "/"), table.concat(vowel_spec.nonpasts, "/")))
for _, past in ipairs(vowel_spec.pasts) do
for _, nonpast in ipairs(vowel_spec.nonpasts) do
if past == "-" or nonpast == "-" then
error(("Internal error: Saw form I past vowel %s and non-past vowel %s but - in place of vowel should have triggered an error earlier"):format(
past, nonpast))
end
insert_cat(("form-I verbs with past vowel %s and non-past vowel %s"):format(past, nonpast))
end
end
end
end
for slot, name in pairs(slots_that_may_be_uncertain) do
if base.slot_uncertain[slot] then
-- An unspecified and non-defaulted verbal noun (form I) is considered uncertain rather than explicitly
-- missing. Use <vn:-> to explicitly indicate the lack of verbal noun. Same for form-I stative active
-- participles.
insert_cat(("verbs with unknown or uncertain %ss"):format(name))
end
end
if base.irregular then
insert_ann("irreg", "irregular")
insert_cat("irregular verbs")
end
end
-- Compute the categories to add the verb to, as well as the annotation to display in the conjugation title bar. We
-- combine the code to do these functions as both categories and title bar contain similar information.
local function compute_categories_and_annotation(alternant_multiword_spec)
alternant_multiword_spec.categories = {}
local ann = {}
alternant_multiword_spec.annotation = ann
ann.form = {}
ann.weakness = {}
ann.vowels = {}
ann.passive = nil
ann.reduced = {}
ann.irreg = {}
ann.defective = {}
local multiword_lemma = false
for _, slot in ipairs(export.potential_lemma_slots) do
if alternant_multiword_spec.forms[slot] then
for _, formobj in ipairs(alternant_multiword_spec.forms[slot]) do
if formobj.form:find(" ") then
multiword_lemma = true
break
end
end
break
end
end
local function insert_ann(anntype, value)
m_table.insertIfNot(alternant_multiword_spec.annotation[anntype], value)
end
local function insert_cat(cat, also_when_multiword)
-- Don't place multiword terms in categories like 'Arabic form-II verbs' to avoid spamming the categories with
-- such terms.
if also_when_multiword or not multiword_lemma then
m_table.insertIfNot(alternant_multiword_spec.categories, "Arabic " .. cat)
end
end
iut.map_word_specs(alternant_multiword_spec, function(base)
add_categories_and_annotation(alternant_multiword_spec, base, multiword_lemma, insert_ann, insert_cat)
end)
for slot, name in pairs(slots_that_may_be_uncertain) do
if alternant_multiword_spec.forms[slot] then
for _, form in ipairs(alternant_multiword_spec.forms[slot]) do
if form.uncertain then
if form.form == "?" then
insert_cat(("verbs with explicitly unknown %ss"):format(name))
else
insert_cat(("verbs needing %s checked"):format(name))
end
break
end
end
end
end
if alternant_multiword_spec.has_active then
if alternant_multiword_spec.has_passive and alternant_multiword_spec.has_non_impers_passive then
insert_cat("verbs with full passive")
ann.passive = "full passive"
elseif alternant_multiword_spec.has_passive then
insert_cat("verbs with impersonal passive")
ann.passive = "impersonal passive"
else
insert_cat("verbs lacking passive forms")
ann.passive = "no passive"
end
else
if alternant_multiword_spec.has_non_impers_passive then
insert_cat("passive verbs")
insert_cat("verbs with full passive")
ann.passive = "passive-only"
else
insert_cat("passive verbs")
insert_cat("impersonal verbs")
insert_cat("verbs with impersonal passive")
ann.passive = "impersonal (passive-only)"
end
end
if alternant_multiword_spec.passive_uncertain then
insert_cat("verbs needing passive checked")
ann.passive = ann.passive .. ' <abbr title="passive status uncertain">(?)</abbr>'
end
if alternant_multiword_spec.has_active and not alternant_multiword_spec.has_imp then
insert_ann("defective", "no imperative")
insert_cat("verbs lacking imperative forms")
end
if not alternant_multiword_spec.has_past then
insert_ann("defective", "no past")
insert_cat("verbs lacking past forms")
end
if not alternant_multiword_spec.has_nonpast then
insert_ann("defective", "no non-past")
insert_cat("verbs lacking non-past forms")
end
local ann_parts = {}
local function insert_ann_part(part, conj)
local val = table.concat(ann[part], conj or " or ")
if val ~= "" and val ~= "regular" then
table.insert(ann_parts, val)
end
end
insert_ann_part("form")
insert_ann_part("weakness")
insert_ann_part("reduced")
insert_ann_part("vowels")
if ann.passive then
table.insert(ann_parts, ann.passive)
end
insert_ann_part("irreg")
insert_ann_part("defective", ", ")
alternant_multiword_spec.annotation = table.concat(ann_parts, ", ")
end
local function show_forms(alternant_multiword_spec)
local lemmas = {}
for _, slot in ipairs(export.potential_lemma_slots) do
if alternant_multiword_spec.forms[slot] then
for _, formobj in ipairs(alternant_multiword_spec.forms[slot]) do
table.insert(lemmas, formobj)
end
break
end
end
alternant_multiword_spec.lemmas = lemmas -- save for later use in make_table()
alternant_multiword_spec.vn = alternant_multiword_spec.forms.vn -- save for later use in make_table()
-- Reconstruct the original verb spec without overrides for verbal nouns and participles, since those specific slots
-- are ignored by {{ar-verb form}}. Compute this once beforehand; `transform_accel_obj` is called repeatedly on each
-- form and we don't want to compute this repeatedly.
local reconstructed_verb_spec = iut.reconstruct_original_spec(alternant_multiword_spec, {
preprocess_angle_bracket_spec = function(spec)
spec = spec:match("^<(.*)>$")
assert(spec)
local segments = put.parse_multi_delimiter_balanced_segment_run(spec, {{"[", "]"}, {"<", ">"}})
local dot_separated_groups = put.split_alternating_runs_and_strip_spaces(segments, "%.")
-- Rejoin each dot-separated group into a single string, since we aren't actually going to do any parsing
-- of bracket-bounded textual runs; then filter out overrides for verbal nouns and participles.
local filtered_indicators = {}
for _, dot_separated_group in ipairs(dot_separated_groups) do
local indicator = table.concat(dot_separated_group)
-- FIXME: Do we want to filter out any other indicators?
if not (indicator:find("^vn:") or indicator:find("^[ap]p:")) then
table.insert(filtered_indicators, indicator)
end
end
return ("<%s>"):format(table.concat(filtered_indicators, "."))
end,
})
-- If we're dealing with a single word, no alternants and a single verb form, use the auto-conjugation-fetching
-- variant.
local reconstructed_lemma, inside = reconstructed_verb_spec:match("^([^ <>()]+)(%b<>)$")
if inside and alternant_multiword_spec.verb_forms[1] and not alternant_multiword_spec.verb_forms[2] then
reconstructed_verb_spec = ("+%s<%s>"):format(reconstructed_lemma, alternant_multiword_spec.verb_forms[1])
end
local function transform_accel_obj(slot, formobj, accel_obj)
if not accel_obj then
return accel_obj
end
if slot == "ap" or slot == "pp" or slot == "vn" then
-- FIXME: [[Module:accel]] can't correctly handle more than one verb form for participles and verbal nouns
accel_obj.form = slot .. "-" .. table.concat(alternant_multiword_spec.verb_forms, ",")
else
accel_obj.form = "verb-form-" .. reconstructed_verb_spec
end
return accel_obj
end
local function generate_link(data)
local form = data.form
local term = form.formval_for_link
local alt = form.alt
if term == "?" then
term = nil
alt = "?"
end
local link = m_links.full_link {
lang = lang, term = term, tr = "-", accel = form.accel_obj,
alt = alt, gloss = form.gloss, genders = form.genders, pos = form.pos, lit = form.lit, id = form.id,
} .. iut.get_footnote_text(form.footnotes, data.footnote_obj)
if form.q and form.q[1] or form.qq and form.qq[1] or form.l and form.l[1] or form.ll and form.ll[1] then
link = require(pron_qualifier_module).format_qualifiers {
lang = lang,
text = link,
q = form.q,
qq = form.qq,
l = form.l,
ll = form.ll,
}
end
return link
end
local props = {
lang = lang,
lemmas = lemmas,
transform_accel_obj = transform_accel_obj,
generate_link = generate_link,
slot_list = alternant_multiword_spec.verb_slots,
include_translit = true,
}
iut.show_forms(alternant_multiword_spec.forms, props)
end
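--[==[
Illustrative sketch, not part of the module's logic: the `preprocess_angle_bracket_spec` callback in show_forms() above
removes `vn:`, `ap:` and `pp:` overrides from a verb spec before the spec is used for accelerated entry creation. The
helper name `strip_nominal_overrides` is hypothetical, and unlike the real code it splits on every "." without any
bracket-aware parsing, so it is only meaningful for specs that have no dots nested inside brackets.

```lua
-- Hypothetical stand-alone helper; assumes no "." occurs inside nested brackets.
local function strip_nominal_overrides(spec)
	local inner = spec:match("^<(.*)>$")
	assert(inner, "spec must be wrapped in angle brackets")
	local kept = {}
	-- Append a trailing "." so that every indicator is terminated by a dot.
	for indicator in (inner .. "."):gmatch("(.-)%.") do
		-- Same filter as the module: drop verbal-noun and participle overrides.
		if not (indicator:find("^vn:") or indicator:find("^[ap]p:")) then
			table.insert(kept, indicator)
		end
	end
	return ("<%s>"):format(table.concat(kept, "."))
end

assert(strip_nominal_overrides("<I/a~u.pass.vn:x>") == "<I/a~u.pass>")
assert(strip_nominal_overrides("<II.ap:y.pp:z>") == "<II>")
```

The override values `x`, `y` and `z` are placeholders rather than real Arabic forms.
]==]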
-------------------------------------------------------------------------------
-- Functions to create inflection tables --
-------------------------------------------------------------------------------
-- Make the conjugation table. Called from export.show().
local function make_table(alternant_multiword_spec)
local text = mw.getCurrentFrame():expandTemplate{
title = 'inflection-table-top',
args = {
title = 'Conjugation of {title}',
tall = 'yes',
palette = "green",
category = 'conjugation',
class = 'tr-alongside', -- temp hack to prevent extra line break
}
}
text = text .. [=[
! colspan="6" | verbal noun<br /><<الْمَصْدَر>>
| colspan="7" | {vn}
]=]
if alternant_multiword_spec.has_active then
text = text .. [=[
|-
! colspan="6" | active participle<br /><<اِسْم الْفَاعِل>>
| colspan="7" | {ap}
]=]
end
if alternant_multiword_spec.has_passive then
text = text .. [=[
|-
! colspan="6" | passive participle<br /><<اِسْم الْمَفْعُول>>
| colspan="7" | {pp}
]=]
end
text = text .. [=[
|-
! colspan="999" class="separator" |
]=]
if alternant_multiword_spec.has_active then
text = text .. [=[
|-
! colspan="12" class="outer" | active voice<br /><<الْفِعْل الْمَعْلُوم>>
|-
! colspan="2" |
! colspan="3" | singular<br /><<الْمُفْرَد>>
! rowspan="12" class="separator" |
! colspan="2" | dual<br /><<الْمُثَنَّى>>
! rowspan="12" class="separator" |
! colspan="3"| plural<br /><<الْجَمْع>>
|-
! colspan="2"|
! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
|-
! rowspan="2" | past (perfect) indicative<br /><<الْمَاضِي>>
! class="secondary" | m
| rowspan="2" | {past_1s}
| {past_2ms}
| {past_3ms}
| rowspan="2" | {past_2d}
| {past_3md}
| rowspan="2" | {past_1p}
| {past_2mp}
| {past_3mp}
|-
! class="secondary" | f
| {past_2fs}
| {past_3fs}
| {past_3fd}
| {past_2fp}
| {past_3fp}
|-
! rowspan="2" | non-past (imperfect) indicative<br /><<الْمُضَارِع الْمَرْفُوع>>
! class="secondary" | m
| rowspan="2" | {ind_1s}
| {ind_2ms}
| {ind_3ms}
| rowspan="2" | {ind_2d}
| {ind_3md}
| rowspan="2" | {ind_1p}
| {ind_2mp}
| {ind_3mp}
|-
! class="secondary" | f
| {ind_2fs}
| {ind_3fs}
| {ind_3fd}
| {ind_2fp}
| {ind_3fp}
|-
! rowspan="2" | subjunctive<br /><<الْمُضَارِع الْمَنْصُوب>>
! class="secondary" | m
| rowspan="2" | {sub_1s}
| {sub_2ms}
| {sub_3ms}
| rowspan="2" | {sub_2d}
| {sub_3md}
| rowspan="2" | {sub_1p}
| {sub_2mp}
| {sub_3mp}
|-
! class="secondary" | f
| {sub_2fs}
| {sub_3fs}
| {sub_3fd}
| {sub_2fp}
| {sub_3fp}
|-
! rowspan="2" | jussive<br /><<الْمُضَارِع الْمَجْزُوم>>
! class="secondary" | m
| rowspan="2" | {juss_1s}
| {juss_2ms}
| {juss_3ms}
| rowspan="2" | {juss_2d}
| {juss_3md}
| rowspan="2" | {juss_1p}
| {juss_2mp}
| {juss_3mp}
|-
! class="secondary" | f
| {juss_2fs}
| {juss_3fs}
| {juss_3fd}
| {juss_2fp}
| {juss_3fp}
|-
! rowspan="2" | imperative<br /><<الْأَمْر>>
! class="secondary" | m
| rowspan="2" |
| {imp_2ms}
| rowspan="2" |
| rowspan="2" | {imp_2d}
| rowspan="2" |
| rowspan="2" |
| {imp_2mp}
| rowspan="2" |
|-
! class="secondary" | f
| {imp_2fs}
| {imp_2fp}
]=]
end
if alternant_multiword_spec.has_passive then
text = text .. [=[
|-
! colspan="999" class="separator" |
|-
! colspan="12" class="outer" | passive voice<br /><<الْفِعْل الْمَجْهُول>>
|-
! colspan="2" |
! colspan="3" | singular<br /><<الْمُفْرَد>>
! rowspan="10" class="separator" |
! colspan="2" | dual<br /><<الْمُثَنَّى>>
! rowspan="10" class="separator" |
! colspan="3" | plural<br /><<الْجَمْع>>
|-
! colspan="2" |
! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>>
! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>>
! 3<sup>rd</sup> person<br /><<الْغَائِب>>
|-
! rowspan="2" | past (perfect) indicative<br /><<الْمَاضِي>>
! class="secondary" | m
| rowspan="2" | {past_pass_1s}
| {past_pass_2ms}
| {past_pass_3ms}
| rowspan="2" | {past_pass_2d}
| {past_pass_3md}
| rowspan="2" | {past_pass_1p}
| {past_pass_2mp}
| {past_pass_3mp}
|-
! class="secondary" | f
| {past_pass_2fs}
| {past_pass_3fs}
| {past_pass_3fd}
| {past_pass_2fp}
| {past_pass_3fp}
|-
! rowspan="2" | non-past (imperfect) indicative<br /><<الْمُضَارِع الْمَرْفُوع>>
! class="secondary" | m
| rowspan="2" | {ind_pass_1s}
| {ind_pass_2ms}
| {ind_pass_3ms}
| rowspan="2" | {ind_pass_2d}
| {ind_pass_3md}
| rowspan="2" | {ind_pass_1p}
| {ind_pass_2mp}
| {ind_pass_3mp}
|-
! class="secondary" | f
| {ind_pass_2fs}
| {ind_pass_3fs}
| {ind_pass_3fd}
| {ind_pass_2fp}
| {ind_pass_3fp}
|-
! rowspan="2" | subjunctive<br /><<الْمُضَارِع الْمَنْصُوب>>
! class="secondary" | m
| rowspan="2" | {sub_pass_1s}
| {sub_pass_2ms}
| {sub_pass_3ms}
| rowspan="2" | {sub_pass_2d}
| {sub_pass_3md}
| rowspan="2" | {sub_pass_1p}
| {sub_pass_2mp}
| {sub_pass_3mp}
|-
! class="secondary" | f
| {sub_pass_2fs}
| {sub_pass_3fs}
| {sub_pass_3fd}
| {sub_pass_2fp}
| {sub_pass_3fp}
|-
! rowspan="2" | jussive<br /><<الْمُضَارِع الْمَجْزُوم>>
! class="secondary" | m
| rowspan="2" | {juss_pass_1s}
| {juss_pass_2ms}
| {juss_pass_3ms}
| rowspan="2" | {juss_pass_2d}
| {juss_pass_3md}
| rowspan="2" | {juss_pass_1p}
| {juss_pass_2mp}
| {juss_pass_3mp}
|-
! class="secondary" | f
| {juss_pass_2fs}
| {juss_pass_3fs}
| {juss_pass_3fd}
| {juss_pass_2fp}
| {juss_pass_3fp}
]=]
end
text = text .. mw.getCurrentFrame():expandTemplate{
title = 'inflection-table-bottom',
args = {
notes = '{footnote}',
}
}
local forms = alternant_multiword_spec.forms
if not alternant_multiword_spec.lemmas then
forms.title = "—"
else
local linked_lemmas = {}
for _, form in ipairs(alternant_multiword_spec.lemmas) do
table.insert(linked_lemmas, link_term(form.form, "term"))
end
forms.title = table.concat(linked_lemmas, ", ")
end
local ann_parts = {}
if alternant_multiword_spec.annotation ~= "" then
table.insert(ann_parts, alternant_multiword_spec.annotation)
end
if alternant_multiword_spec.vn then
local linked_vns = {}
for _, form in ipairs(alternant_multiword_spec.vn) do
table.insert(linked_vns, link_term(form.form, "term"))
end
table.insert(ann_parts, (#linked_vns > 1 and "verbal nouns" or "verbal noun") .. " " ..
table.concat(linked_vns, ", "))
end
local annotation = table.concat(ann_parts, ", ")
if annotation ~= "" then
forms.title = forms.title .. " (" .. annotation .. ")"
end
-- Format the table.
local tagged_table = rsub(text, "<<(.-)>>", tag_text)
return m_string_utilities.format(tagged_table, forms)
end
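--[==[
Illustrative sketch, not part of the module's logic: make_table() above finishes in two passes, first replacing each
`<<...>>` run with tagged Arabic text and then substituting `{slot}` placeholders from the forms table. The functions
below are hypothetical stand-ins for `tag_text` and `m_string_utilities.format`; in particular, leaving unknown
placeholders untouched is an assumption of this sketch, not a statement about the real formatter.

```lua
-- Hypothetical stand-in for the module's tag_text(); the real markup may differ.
local function tag_text(text)
	return '<span lang="ar">' .. text .. "</span>"
end

-- Pass 1: wrap every <<...>> run. Pass 2: fill {slot} placeholders from `forms`.
local function fill_template(template, forms)
	local tagged = template:gsub("<<(.-)>>", tag_text)
	local filled = tagged:gsub("{([%w_]+)}", function(slot)
		-- Returning nil keeps the original placeholder text.
		return forms[slot]
	end)
	return filled
end

assert(fill_template("! <<x>>\n| {vn} {pp}", {vn = "k"}) == '! <span lang="ar">x</span>\n| k {pp}')
```
]==]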
-------------------------------------------------------------------------------
-- External entry points --
-------------------------------------------------------------------------------
-- Append two lists `l1` and `l2`, removing duplicates. If either is {nil}, just return the other.
local function combine_lists(l1, l2)
-- combine_footnotes() does exactly what we want.
return iut.combine_footnotes(l1, l2)
end
local function combine_metadata(data)
local src1 = data.form1
local src2 = data.form2
local dest = data.dest_form
dest.uncertain = src1.uncertain or src2.uncertain
if src1.genders and src2.genders and not m_table.deepEquals(src1.genders, src2.genders) then
-- do nothing
else
dest.genders = src1.genders or src2.genders
end
if src1.pos and src2.pos and src1.pos ~= src2.pos then
-- do nothing
else
dest.pos = src1.pos or src2.pos
end
-- Don't copy .alt, .gloss, .lit, .id, which describe a single term and don't extend to multiword terms.
dest.q = combine_lists(src1.q, src2.q)
dest.qq = combine_lists(src1.qq, src2.qq)
dest.l = combine_lists(src1.l, src2.l)
dest.ll = combine_lists(src1.ll, src2.ll)
end
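--[==[
Illustrative sketch, not part of the module's logic: combine_metadata() above copies `pos` (and, using a deep
comparison, `genders`) to the combined form only when the two source forms do not disagree. The helper name
`merge_if_compatible` is hypothetical and handles scalar values only.

```lua
-- Return the shared value of `a` and `b`, or nil when both are set and differ.
local function merge_if_compatible(a, b)
	if a and b and a ~= b then
		return nil
	end
	return a or b
end

assert(merge_if_compatible("noun", nil) == "noun")
assert(merge_if_compatible(nil, "noun") == "noun")
assert(merge_if_compatible("noun", "noun") == "noun")
assert(merge_if_compatible("noun", "verb") == nil)
```
]==]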
-- Externally callable function to parse and conjugate a verb given user-specified arguments.
-- Return value is ALTERNANT_MULTIWORD_SPEC, an object where the conjugated forms are in
-- `ALTERNANT_MULTIWORD_SPEC.forms` for each slot. If there are no values for a slot, the slot key will be missing.
-- The value for a given slot is a list of objects {form=FORM, footnotes=FOOTNOTES}. If `args.json` is set and
-- `source_template` is "ar-conj", the return value is instead a JSON string serializing that object.
function export.do_generate_forms(args, source_template, headword_head)
local PAGENAME = mw.loadData("Module:headword/data").pagename
local function in_template_space()
return mw.title.getCurrentTitle().nsText == "Template"
end
-- Determine the verb spec we're being asked to generate the conjugation of. This may be taken from the current page
-- title or the value of |pagename=; but not when called from {{ar-verb form}}, where the page title is a
-- non-lemma form. Note that the verb spec may omit the lemma; e.g. it may be "<II>". For this reason, `head` (the
-- explicit headword head if given, else the `pagename` computed here) is passed to normalize_all_lemmas() below.
local pagename = source_template ~= "ar-verb form" and args.pagename or PAGENAME
local head = headword_head or pagename
local arg1 = args[1]
if not arg1 then
if (pagename == "ar-conj" or pagename == "ar-verb" or pagename == "ar-verb form") and in_template_space() then
arg1 = "كتب<I/a~u.pass>"
else
arg1 = "<>"
end
end
-- When called from {{ar-verb form}}, determine the non-lemma form whose inflections we're being asked to
-- determine. This normally comes from the page title or the value of |pagename=.
local verb_form_of_form
if source_template == "ar-verb form" then
verb_form_of_form = args.pagename
if not verb_form_of_form then
if PAGENAME == "ar-verb form" and in_template_space() then
verb_form_of_form = "كتبت"
else
verb_form_of_form = PAGENAME
end
end
end
local incorporated_headword_head_into_lemma = false
if arg1:find("^<.*>$") then -- missing lemma
if head:find(" ") then
-- If multiword lemma, try to add arg spec after the first word.
-- Try to preserve the brackets in the part after the verb, but don't do it
-- if there aren't the same number of left and right brackets in the verb
-- (which means the verb was linked as part of a larger expression).
local first_word, post = rmatch(head, "^(.-)( .*)$")
local left_brackets = rsub(first_word, "[^%[]", "")
local right_brackets = rsub(first_word, "[^%]]", "")
if #left_brackets == #right_brackets then
arg1 = iut.remove_redundant_links(first_word) .. arg1 .. post
incorporated_headword_head_into_lemma = true
else
-- Try again using the form without links.
local linkless_head = m_links.remove_links(head)
if linkless_head:find(" ") then
first_word, post = rmatch(linkless_head, "^(.-)( .*)$")
arg1 = first_word .. arg1 .. post
else
error("Unable to incorporate <...> spec into explicit head due to a multiword linked verb or " ..
"unbalanced brackets; please include <> explicitly: " .. arg1)
end
end
else
-- Will be incorporated through `head` below in the call to normalize_all_lemmas().
incorporated_headword_head_into_lemma = true
end
end
local parse_props = {
parse_indicator_spec = parse_indicator_spec,
angle_brackets_omittable = true,
allow_blank_lemma = true,
}
local alternant_multiword_spec = iut.parse_inflected_text(arg1, parse_props)
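-- NOTE: `pos` is not a parameter or local of this function; unless it is defined at module level, it is nil and
-- the part of speech is always "verbs".
alternant_multiword_spec.pos = pos or "verbs"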
alternant_multiword_spec.args = args
alternant_multiword_spec.source_template = source_template
alternant_multiword_spec.verb_form_of_form = verb_form_of_form
alternant_multiword_spec.incorporated_headword_head_into_lemma = incorporated_headword_head_into_lemma
normalize_all_lemmas(alternant_multiword_spec, head)
detect_all_indicator_specs(alternant_multiword_spec)
local inflect_props = {
lang = lang,
slot_list = alternant_multiword_spec.verb_slots,
inflect_word_spec = conjugate_verb,
combine_metadata = combine_metadata,
-- We add links around the generated verbal forms rather than allow the entire multiword
-- expression to be a link, so ensure that user-specified links get included as well.
include_user_specified_links = true,
}
iut.inflect_multiword_or_alternant_multiword_spec(alternant_multiword_spec, inflect_props)
if debug_translit then
for slot, forms in pairs(alternant_multiword_spec.forms) do
for _, form in ipairs(forms) do
if form.translit then
local full_form_translit = (lang:transliterate(m_links.remove_links(form.form)))
if full_form_translit ~= form.translit then
error(("Internal error: For slot '%s', form '%s' incremental translit '%s' not same as full translit '%s'"):
format(slot, form.form, form.translit, full_form_translit))
end
end
form.form = iut.remove_redundant_links(form.form)
end
end
end
-- Remove redundant brackets around entire forms.
for slot, forms in pairs(alternant_multiword_spec.forms) do
for _, form in ipairs(forms) do
form.form = iut.remove_redundant_links(form.form)
end
end
determine_slot_uncertainty_from_forms(alternant_multiword_spec)
determine_verb_properties_from_forms(alternant_multiword_spec)
compute_categories_and_annotation(alternant_multiword_spec)
if args.json and source_template == "ar-conj" then
-- There is a circular reference in `base.alternant_multiword_spec`, which points back to top level.
iut.map_word_specs(alternant_multiword_spec, function(base)
base.alternant_multiword_spec = nil
end)
return require("Module:JSON").toJSON(alternant_multiword_spec)
end
return alternant_multiword_spec
end
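--[==[
Illustrative sketch, not part of the module's logic: when the spec lacks a lemma (e.g. "<II>") and the head is
multiword, do_generate_forms() above splices the spec after the first word, provided that word has balanced link
brackets. The helper name `splice_spec` is hypothetical. It uses byte-oriented string functions instead of the
module's `rmatch`/`rsub`, skips the `iut.remove_redundant_links` step and the retry on the link-free head, and for a
single-word head simply appends the spec, which the real code leaves to normalize_all_lemmas().

```lua
-- Returns the spliced spec, or nil if the first word has unbalanced [ ] brackets.
local function splice_spec(head, spec)
	local first_word, post = head:match("^(.-)( .*)$")
	if not first_word then
		-- Single-word head (simplification of this sketch).
		return head .. spec
	end
	local _, n_left = first_word:gsub("%[", "")
	local _, n_right = first_word:gsub("%]", "")
	if n_left ~= n_right then
		-- The verb was linked as part of a larger expression.
		return nil
	end
	return first_word .. spec .. post
end

assert(splice_spec("[[foo]] bar", "<II>") == "[[foo]]<II> bar")
assert(splice_spec("[[foo bar]]", "<II>") == nil)
```
]==]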
-- Entry point for {{ar-conj}}. Template-callable function to parse and conjugate a verb given
-- user-specified arguments and generate a displayable table of the conjugated forms.
function export.show(frame)
local parent_args = frame:getParent().args
local params = {
[1] = {},
["noautolinktext"] = {type = "boolean"},
["noautolinkverb"] = {type = "boolean"},
["t"] = {}, -- for use by {{ar-verb form}}; otherwise ignored
["id"] = {}, -- for use by {{ar-verb form}}; otherwise ignored
["pagename"] = {}, -- for testing/documentation pages
["json"] = {type = "boolean"}, -- for bot use
}
local args = require("Module:parameters").process(parent_args, params)
local alternant_multiword_spec = export.do_generate_forms(args, "ar-conj")
if type(alternant_multiword_spec) == "string" then
-- JSON return value
return alternant_multiword_spec
end
show_forms(alternant_multiword_spec)
return make_table(alternant_multiword_spec) ..
require("Module:utilities").format_categories(alternant_multiword_spec.categories, lang, nil, nil, force_cat)
end
function export.verb_forms(frame)
local parargs = frame:getParent().args
local params = {
[1] = {},
[2] = {},
[3] = {},
[4] = {},
[5] = {},
pagename = {},
}
for _, form in ipairs(allowed_vforms) do
-- FIXME: We go up to 5 here. The code supports unlimited variants but it's unlikely we will ever see more than
-- 2.
for index = 1, 5 do
local prefix = index == 1 and form or form .. index
params[prefix .. "-pv"] = {}
for _, extn in ipairs { "", "-vn", "-ap", "-pp" } do
params[prefix .. extn] = {}
params[prefix .. extn .. "-head"] = {}
-- FIXME: No -tr?
params[prefix .. extn .. "-gloss"] = {}
end
end
end
local args = require("Module:parameters").process(parargs, params)
local i = 1
local past_vowel_re = "^[aui,]*$"
local combined_root = nil
if not args[i] or rfind(args[i], past_vowel_re) then
combined_root = args.pagename or mw.loadData("Module:headword/data").pagename
if not rfind(combined_root, "^([^ ]) ([^ ]) ([^ ])$") and not
rfind(combined_root, "^([^ ]) ([^ ]) ([^ ]) ([^ ])$") then
error("When inferring roots from page title, need three or four space-separated radicals: " .. combined_root)
end
elseif rfind(args[i], " ") then
combined_root = args[i]
i = i + 1
else
local separate_roots = {}
while args[i] and not rfind(args[i], past_vowel_re) do
table.insert(separate_roots, args[i])
i = i + 1
end
combined_root = table.concat(separate_roots, " ")
end
local past_vowel = args[i]
i = i + 1
if past_vowel and not rfind(past_vowel, past_vowel_re) then
error("Unrecognized past vowel, should be 'a', 'i', 'u', 'a,u', etc. or empty: " .. past_vowel)
end
-- Spaces interfere with parsing as a unit in [[Module:inflection utilities]], so replace with underscore.
combined_root = combined_root:gsub(" ", "_")
local split_root = rsplit(combined_root, "_")
-- Map from verb forms (I, II, etc.) to a table of verb properties,
-- which has entries e.g. for "verb" (either true to autogenerate the verb
-- head, or an explicitly specified verb head using e.g. argument "I-head"),
-- and for "verb-gloss" (which comes from e.g. the argument "I" or "I-gloss"),
-- and for "vn" and "vn-gloss", "ap" and "ap-gloss", "pp" and "pp-gloss".
local verb_properties = {}
for _, form in ipairs(allowed_vforms) do
local formpropslist = {}
local derivs = {{"verb", ""}, {"vn", "-vn"}, {"ap", "-ap"}, {"pp", "-pp"}}
local index = 1
while true do
local formprops = {}
local prefix = index == 1 and form or form .. index
if prefix == "I" then
formprops.pv = past_vowel
end
if args[prefix .. "-pv"] then
formprops.pv = args[prefix .. "-pv"]
end
for _, deriv in ipairs(derivs) do
local prop = deriv[1]
local extn = deriv[2]
if args[prefix .. extn] == "+" then
formprops[prop] = true
elseif args[prefix .. extn] == "-" then
formprops[prop] = false
elseif args[prefix .. extn] then
formprops[prop] = true
formprops[prop .. "-gloss"] = args[prefix .. extn]
end
if args[prefix .. extn .. "-head"] then
-- An explicit head overrides any "+", "-" or gloss setting made above.
formprops[prop] = args[prefix .. extn .. "-head"]
end
if args[prefix .. extn .. "-gloss"] then
if formprops[prop] == nil then
formprops[prop] = true
end
formprops[prop .. "-gloss"] = args[prefix .. extn .. "-gloss"]
end
end
if formprops.verb then
-- If a verb form specified, also turn on vn (unless form I, with
-- unpredictable vn) and ap, and maybe pp, according to form,
-- weakness and past vowel. But don't turn these on if there's
-- an explicit on/off specification for them (e.g. I-pp=-).
if form ~= "I" and formprops.vn == nil then
formprops.vn = true
end
if formprops.ap == nil then
formprops.ap = true
end
local weakness = weakness_from_radicals(form, split_root[1], split_root[2], split_root[3],
split_root[4])
if formprops.pp == nil and not vform_probably_no_passive(form,
weakness, rsplit(formprops.pv or "", ","), {}) then
formprops.pp = true
end
if formprops.verb == true or formprops.vn == true or formprops.ap == true or formprops.pp == true then
formprops.need_autogen = true
end
table.insert(formpropslist, formprops)
index = index + 1
else
break
end
end
table.insert(verb_properties, {form, formpropslist})
end
-- Go through and create the verb form derivations as necessary, when they haven't been explicitly given.
for _, vplist in ipairs(verb_properties) do
local vform = vplist[1]
for _, props in ipairs(vplist[2]) do
if props.need_autogen then
local form_with_vowels
if vform == "I" then
local pv = props.pv
if not pv then
-- No past vowel given; make up a likely past~non-past vowel pair based on weakness and actual radical.
if split_root[3] == W then -- final-weak
form_with_vowels = "I/a~u"
elseif split_root[3] == Y then
form_with_vowels = "I/a~i"
elseif split_root[2] == W then --hollow
form_with_vowels = "I/u~u"
elseif split_root[2] == Y then
form_with_vowels = "I/i~i"
else
-- most common; doesn't matter so much since we're not displaying the non-past
form_with_vowels = "I/a~u"
end
else
local pvs = rsplit(pv, ",")
local vowel_sufs = {}
for _, pv in ipairs(pvs) do
local vowel_spec
if pv == "a" then
-- Past vowel is a; make up a likely non-past vowel based on weakness and actual radical.
if split_root[3] == W then -- final-weak
vowel_spec = "a~u"
elseif split_root[3] == Y then
vowel_spec = "a~i"
elseif split_root[2] == W then --hollow
vowel_spec = "a~u"
elseif split_root[2] == Y then
vowel_spec = "a~i"
else
-- most common; doesn't matter so much since we're not displaying the non-past
vowel_spec = "a~u"
end
elseif pv == "i" then
-- most common; doesn't matter so much since we're not displaying the non-past
vowel_spec = "i~a"
elseif pv == "u" then
-- most common; doesn't matter so much since we're not displaying the non-past
vowel_spec = "u~u"
else
error(("Internal error: Bad past vowel '%s' in {{ar-verb forms}}"):format(pv))
end
table.insert(vowel_sufs, vowel_spec)
end
form_with_vowels = "I/" .. table.concat(vowel_sufs, "/")
end
else
form_with_vowels = vform
end
local angle_bracket_spec = ("%s<%s.pass>"):format(combined_root, form_with_vowels)
local alternant_multiword_spec = export.do_generate_forms({angle_bracket_spec}, "ar-verb forms")
local function format_forms(forms)
if not forms then
return "-" -- FIXME: Throw an error?
end
local formatted = {}
for _, form in ipairs(forms) do
if form.translit then
table.insert(formatted, ("%s//%s"):format(form.form, form.translit))
else
table.insert(formatted, form.form)
end
end
return table.concat(formatted, ",")
end
if props.verb == true then
props.verb = format_forms(alternant_multiword_spec.forms.past_3ms)
end
for _, deriv in ipairs({"vn", "ap", "pp"}) do
if props[deriv] == true then
props[deriv] = format_forms(alternant_multiword_spec.forms[deriv])
end
end
end
end
end
-- Go through and output the result
local formtextarr = {}
for _, vplist in ipairs(verb_properties) do
local form = vplist[1]
for _, props in ipairs(vplist[2]) do
local textarr = {}
if props.verb then
local text = "* '''[[Appendix:Arabic verbs#Form " .. form .. "|Form " .. form .. "]]''': "
local linktext = {}
local splitheads = rsplit(props.verb, "[,،]")
for _, head in ipairs(splitheads) do
table.insert(linktext, m_links.full_link({lang = lang, term = head, gloss = props["verb-gloss"]}))
end
text = text .. table.concat(linktext, ", ")
table.insert(textarr, text)
for _, derivengl in ipairs({{"vn", "Verbal noun"}, {"ap", "Active participle"}, {"pp", "Passive participle"}}) do
local deriv = derivengl[1]
local engl = derivengl[2]
if props[deriv] then
local text = "** " .. engl .. ": "
local linktext = {}
local splitheads = rsplit(props[deriv], "[,،]")
for _, head in ipairs(splitheads) do
local ar, translit = head:match("^(.*)//(.-)$")
if not ar then
ar = head
end
table.insert(linktext, m_links.full_link {lang = lang, term = ar, tr = translit,
gloss = props[deriv .. "-gloss"]} )
end
text = text .. table.concat(linktext, ", ")
table.insert(textarr, text)
end
end
table.insert(formtextarr, table.concat(textarr, "\n"))
end
end
end
return table.concat(formtextarr, "\n")
end
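--[==[
Illustrative sketch, not part of the module's logic: export.verb_forms() above accepts the root as separate
positional radicals followed by an optional past-vowel argument that matches `^[aui,]*$`. The helper name
`split_root_and_vowel` is hypothetical and covers only that separate-radical case (not the space-joined root or the
root inferred from the page title). Latin letters stand in for Arabic radicals.

```lua
local PAST_VOWEL_RE = "^[aui,]*$"

-- Collect leading radicals until an argument that looks like a past-vowel spec.
-- Returns the underscore-joined root and the past-vowel argument (possibly nil).
local function split_root_and_vowel(args)
	local i = 1
	local radicals = {}
	while args[i] and not args[i]:find(PAST_VOWEL_RE) do
		table.insert(radicals, args[i])
		i = i + 1
	end
	return table.concat(radicals, "_"), args[i]
end

local root, past_vowel = split_root_and_vowel({"k", "t", "b", "a,u"})
assert(root == "k_t_b" and past_vowel == "a,u")
```
]==]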
-- Infer radicals from lemma headword (i.e. 3rd masculine singular past) and verb form (I, II, etc.). Throw an error if
-- headword is malformed. A given returned radical may actually be a list of possible radicals, where the first one
-- should be used if the user didn't explicitly give the radical. If the list contains a field `ambig = true`, the
-- radical is considered ambiguous and should not be categorized. `is_reduced` indicates that the user specified
-- `.reduced` to indicate that the verb form is reduced by assimilation and/or haplology (typically archaic Koranic
-- forms such as اِدَّارَأَ instead of تَدَارَأَ; or اِسْطَاعَ instead of اِسْتِطَاعَ; etc.).
function export.infer_radicals(data)
local headword, vform, passive, past_vowel, nonpast_vowel, is_reduced =
data.headword, data.vform, data.passive, data.past_vowel, data.nonpast_vowel, data.is_reduced
past_vowel = past_vowel or "-"
nonpast_vowel = nonpast_vowel or "-"
local function verify_vowel(vowel, param)
if vowel ~= A and vowel ~= I and vowel ~= U and vowel ~= "-" then
error(("Internal error: Bad value for %s: %s (should be Arabic diacritic vowel or '-')"):format(
param, vowel))
end
end
verify_vowel(past_vowel, "past_vowel")
verify_vowel(nonpast_vowel, "nonpast_vowel")
local ch = {}
local form_viii_assim, variant
-- sub out alif-madda for easier processing
headword = rsub(headword, AMAD, HAMZA .. ALIF)
local function infer_err(msg, noann)
local anns = {}
local nohead, novform
if noann == "nohead" then
nohead = true
elseif noann == "novform" then
novform = true
elseif noann == "nohead-vform" then
nohead = true
novform = true
elseif noann then
error(("Internal error: Unrecognized value for 'noann': %s"):format(dump(noann)))
end
if not nohead then
table.insert(anns, ("headword=%s"):format(data.headword))
end
if not novform then
table.insert(anns, ("verb form=%s"):format(data.vform))
end
anns = table.concat(anns, ", ")
if anns ~= "" then
anns = ": " .. anns
end
error(msg .. anns)
end
local len = ulen(headword)
local expected_length
-- extract the headword letters into an array
for i = 1, len do
table.insert(ch, usub(headword, i, i))
end
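--[==[
Illustrative sketch, not part of the module's logic: the loop above splits the headword into one entry per Unicode
character using `ulen` and `usub`. Outside Scribunto the same split can be approximated with the standard string
library alone. The helper name `utf8_chars` is hypothetical and assumes well-formed UTF-8 with no NUL bytes.

```lua
-- Split a UTF-8 string into a list of characters (one entry per code point).
local function utf8_chars(s)
	local chars = {}
	-- A lead byte (ASCII or 0xC2..0xF4) followed by any continuation bytes.
	for c in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
		table.insert(chars, c)
	end
	return chars
end

local letters = utf8_chars("كتب")
assert(#letters == 3 and letters[1] == "ك" and letters[3] == "ب")
```

Diacritics are separate code points, so a vocalized headword would yield them as entries of their own.
]==]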
-- check that the letter at the given index is the given string, or
-- is one of the members of the given array
local function check(index, must)
local letter = ch[index]
-- Check for a missing letter up front so that the list case below cannot fail on concatenating nil.
if not letter then
infer_err("Letter " .. index .. " is nil")
end
if type(must) == "string" then
if letter ~= must then
infer_err(("For verb form %s, letter %s must be %s, not %s"):format(vform, index, must, letter),
"novform")
end
elseif not m_table.contains(must, letter) then
infer_err("For verb form " .. vform .. ", letter " .. index ..
" must be one of " .. table.concat(must, " ") .. ", not " .. letter, "novform")
end
end
-- Check that length of headword is within [min, max]
local function check_len(min, max)
if min and len < min then
infer_err(("Not enough letters for verb form %s, expected at least %s"):format(vform, min), "novform")
end
if max and len > max then
infer_err(("Too many letters for verb form %s, expected at most %s"):format(vform, max), "novform")
end
end
-- If the vowels are i~a or u~u, a form I verb beginning with w- normally keeps the w in the non-past. Otherwise it
-- loses it (i.e. it is "assimilated").
local function form_I_w_non_assimilated()
return req(past_vowel, I) and req(nonpast_vowel, A) or req(past_vowel, U) and req(nonpast_vowel, U)
end
-- Convert radicals to canonical form (handle various hamza varieties and check for misplaced alif or alif maqṣūra;
-- legitimate cases of these letters are handled above).
local function convert(rad, index)
if type(rad) == "table" then
for i, r in ipairs(rad) do
rad[i] = convert(r, index)
end
return rad
elseif rad == HAMZA_ON_ALIF or rad == HAMZA_UNDER_ALIF or
rad == HAMZA_ON_W or rad == HAMZA_ON_Y then
return HAMZA
elseif rad == AMAQ then
infer_err("Radical " .. index .. " must not be alif maqṣūra")
elseif rad == ALIF then
infer_err("Radical " .. index .. " must not be alif")
else
return rad
end
end
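--[==[
Illustrative sketch, not part of the module's logic: convert() above maps every seated hamza to the bare hamza and
recurses into lists of candidate radicals. The sketch below repeats that normalization with literal characters; the
helper name `canonical_radical` and the literal letters are assumptions standing in for the module's HAMZA_*
constants, and the alif / alif maqṣūra error checks are omitted.

```lua
local HAMZA = "ء"
-- Hamza written on alif, under alif, on waw and on ya.
local HAMZA_SEATS = { ["أ"] = true, ["إ"] = true, ["ؤ"] = true, ["ئ"] = true }

-- Normalize one radical, or each member of a list of candidate radicals (in place).
local function canonical_radical(rad)
	if type(rad) == "table" then
		for i, r in ipairs(rad) do
			rad[i] = canonical_radical(r)
		end
		return rad
	end
	return HAMZA_SEATS[rad] and HAMZA or rad
end

assert(canonical_radical("أ") == "ء")
assert(canonical_radical("ك") == "ك")
local candidates = canonical_radical({"ؤ", "و", ambig = true})
assert(candidates[1] == "ء" and candidates[2] == "و" and candidates.ambig == true)
```
]==]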
local quadlit = vform:find("q$")
-- find first radical, start of second/third radicals, check for
-- required letters
local radstart, rad1, rad2, rad3, rad4
local weakness
if vform == "I" or vform == "II" then
rad1 = ch[1]
radstart = 2
elseif vform == "III" then
rad1 = ch[1]
check(2, {ALIF, W}) -- W occurs in passive-only verbs
radstart = 3
elseif vform == "IV" then
-- this would be alif-madda but we replaced it with hamza-alif above.
if ch[1] == HAMZA and ch[2] == ALIF then
rad1 = HAMZA
else
check(1, HAMZA_ON_ALIF)
rad1 = ch[2]
end
radstart = 3
elseif vform == "V" then
check(1, is_reduced and ALIF or T)
rad1 = ch[2]
radstart = 3
elseif vform == "VI" then
check(1, is_reduced and ALIF or T)
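-- NOTE: Alif-madda was replaced with hamza + alif above, so this branch is never taken; such headwords go
-- through the `else` branch, with `rad1` set to HAMZA.
if ch[2] == AMAD then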
rad1 = HAMZA
radstart = 3
else
rad1 = ch[2]
check(3, {ALIF, W}) -- W occurs in passive-only verbs
radstart = 4
end
elseif vform == "VII" then
check(1, ALIF)
if is_reduced then
check(2, M)
rad1 = M
radstart = 3
else
check(2, N)
rad1 = ch[3]
radstart = 4
end
elseif vform == "VIII" then
check(1, ALIF)
rad1 = ch[2]
if rad1 == "د" then
rad1 = {"د", "ذ"} -- not considered ambiguous since it's usually د
radstart = 3
form_viii_assim = "دّ"
elseif rad1 == "ظ" and ch[3] == "ط" and len >= 5 then
-- [[اظطلم]], variant of [[اظلم]]
radstart = 4
form_viii_assim = "ظْط"
elseif rad1 == "ذ" and ch[3] == "د" and len >= 5 then
-- [[اذدكر]], variant of [[اذكر]]
radstart = 4
form_viii_assim = "ذْد"
elseif rad1 == T or rad1 == "ث" or rad1 == "ذ" or rad1 == "ط" or rad1 == "ظ" then
radstart = 3
form_viii_assim = rad1 .. SH
elseif rad1 == "ز" then
check(3, "د")
radstart = 4
form_viii_assim = "زْد"
elseif rad1 == "ص" or rad1 == "ض" then
check(3, "ط")
radstart = 4
form_viii_assim = rad1 .. SK .. "ط"
else
check(3, T)
radstart = 4
rad1 = convert(rad1, 1)
form_viii_assim = rad1 .. SK .. "ت"
end
if rad1 == T then
-- Radical is ambiguous, might be ت or و or ي but doesn't affect conjugation. Note that there are no
-- form-VIII verbs with initial radical ي given in Hans Wehr but Lane mentions at least:
-- - (page 2973) اِتَّأَسَ, with assimilation of the ي to ت, from root ي ء س;
-- - (page 2975) اِتَّبَسَ non-past يَتَّبِسُ and alternative اِيتَبَسَ non-past يَاتَبِسُ from the root ي ب س;
-- - (page 2976) اِتَّسَرَ non-past يَتَّسِرُ or alternatively يَأْتَسِرُ with hamza preserved from the root ي س ر.
-- These alternative forms seem very rare and probably not worth worrying about, but if we want to handle
-- them, we can do it when the time comes.
rad1 = {T, W, Y, ambig = true}
-- اِتَّخَذَ irregularly has hamza as the radical but assimilates like و
if ch[3] == "خ" and ch[4] == "ذ" then
rad1[4] = HAMZA
end
end
elseif vform == "IX" then
check(1, ALIF)
rad1 = ch[2]
radstart = 3
elseif vform == "X" then
check(1, ALIF)
check(2, S)
if is_reduced then
rad1 = ch[3]
radstart = 4
else
check(3, T)
rad1 = ch[4]
radstart = 5
end
elseif vform == "Iq" then
rad1 = ch[1]
rad2 = ch[2]
radstart = 3
elseif vform == "IIq" then
check(1, T)
rad1 = ch[2]
rad2 = ch[3]
radstart = 4
elseif vform == "IIIq" then
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
check(4, N)
radstart = 5
elseif vform == "IVq" then
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
radstart = 4
elseif vform == "XI" then
check_len(5, 5)
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
check(4, ALIF)
rad3 = ch[5]
weakness = "sound"
elseif vform == "XII" then
check(1, ALIF)
rad1 = ch[2]
if ch[3] ~= ch[5] then
infer_err("For verb form XII, letters 3 and 5 should be the same", "novform")
end
check(4, W)
radstart = 5
elseif vform == "XIII" then
check_len(5, 5)
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
check(4, W)
rad3 = ch[5]
if rad3 == AMAQ then
weakness = "final-weak"
else
weakness = "sound"
end
elseif vform == "XIV" then
check_len(6, 6)
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
check(4, N)
rad3 = ch[5]
if ch[6] == AMAQ then
check_waw_ya(rad3)
weakness = "final-weak"
else
if ch[5] ~= ch[6] then
infer_err("For verb form XIV, letters 5 and 6 should be the same", "novform")
end
weakness = "sound"
end
elseif vform == "XV" then
check_len(6, 6)
check(1, ALIF)
rad1 = ch[2]
rad2 = ch[3]
check(4, N)
rad3 = ch[5]
if rad3 == Y then
check(6, ALIF)
else
check(6, AMAQ)
end
weakness = "sound"
else
error("Internal error: Unrecognized verb form " .. vform)
end
-- Process the last two radicals. RADSTART is the index of the first of the two. If it's nil then all radicals have
-- already been processed above, and we don't do anything.
if radstart then
-- There must (normally) be one or two letters left.
if len == radstart then
if vform == "I" and ch[len] == Y then
-- short form حَيَّ
weakness = "final-weak"
rad2 = Y
rad3 = Y
variant = "short"
elseif vform == "IV" and rad1 == "ر" and ch[len] == AMAQ then
-- irregular verb أَرَى
weakness = "final-weak"
rad2 = HAMZA
rad3 = Y
elseif vform == "X" and rad1 == "ح" and ch[len] == AMAQ then
-- irregular verb اِسْتَحَى
weakness = "final-weak"
rad2 = Y
rad3 = Y
variant = "short"
else
-- If one letter left, then it's a geminate verb. If the letter is alif or alif maqṣūra, it will trigger
-- an error down the line.
if vform_supports_geminate(vform) then
weakness = "geminate"
rad2 = ch[len]
rad3 = ch[len]
if vform == "III" or vform == "VI" then
variant = "short"
end
else
infer_err("Apparent geminate verb, but geminate verbs not allowed for this verb form")
end
end
elseif quadlit then
-- Process last two radicals of a quadriliteral verb form.
rad3 = ch[radstart]
rad4 = ch[radstart + 1]
expected_length = radstart + 1
check_len(expected_length)
if rad4 == AMAQ or (rad4 == ALIF and rad3 == Y) or rad4 == Y then
-- rad4 can be Y in passive-only verbs.
if vform_supports_final_weak(vform) then
weakness = "final-weak"
-- Ambiguous radical; arbitrarily pick wāw as radical (but avoid two wāws in a row); it could be wāw or
-- yāʾ, but the choice doesn't affect the conjugation.
rad4 = rad3 == W and {Y, W, ambig = true} or {W, Y, ambig = true}
else
infer_err("Last radical is " .. rad4 .. " but verb form " .. vform ..
" doesn't support final-weak verbs", "novform")
end
else
weakness = "sound"
end
else
-- Process last two radicals of a triliteral verb form.
rad2 = ch[radstart]
rad3 = ch[radstart + 1]
expected_length = radstart + 1
check_len(expected_length)
if vform == "I" and (is_waw_ya(rad3) or rad3 == ALIF or rad3 == AMAQ) then
local inferred_past_vowel, inferred_nonpast_vowel
-- Check for final-weak form I verb. It can end in tall alif (rad3 = wāw) or alif maqṣūra (rad3 = yāʾ)
-- or a wāw or yāʾ (with a past vowel of i or u, e.g. nasiya/yansā "forget" or with a passive-only
-- verb).
if rad1 == W and not form_I_w_non_assimilated() then
weakness = "assimilated+final-weak"
else
weakness = "final-weak"
end
if rad3 == ALIF then
rad3 = W
inferred_past_vowel = A
inferred_nonpast_vowel = U
if is_passive_only(passive) then
infer_err("Final-weak form-I passive verbs should end in yāʔ (ي), not tall alif (ا)", "novform")
end
elseif rad3 == AMAQ then
rad3 = Y
inferred_past_vowel = A
inferred_nonpast_vowel = I
if is_passive_only(passive) then
infer_err("Final-weak form-I passive verbs should end in yāʔ (ي), not alif maqṣūra (ى)",
"novform")
end
elseif rad1 == "ح" and rad2 == Y and rad3 == Y then
-- Long variant حَيِيَ.
inferred_past_vowel = I
inferred_nonpast_vowel = A
variant = "long"
else
if not is_passive_only(passive) then
-- does a non-passive final-weak verb in -uwa ever happen? (YES: e.g. [[رجو]] "to be slack")
inferred_past_vowel = rad3 == Y and I or U
inferred_nonpast_vowel = A
end
-- Ambiguous radical; arbitrarily pick wāw as radical (but avoid two wāws); it could be wāw or yāʾ, but
-- the choice doesn't affect the conjugation.
rad3 = (rad1 == W or rad2 == W) and {Y, W, ambig = true} or {W, Y, ambig = true} -- ambiguous
end
if inferred_past_vowel then
local raw_past_vowel = rget(past_vowel)
local raw_nonpast_vowel = rget(nonpast_vowel)
if raw_past_vowel ~= "-" then
if raw_past_vowel ~= inferred_past_vowel then
infer_err(("Final-weak form-I verb inferred past vowel %s, which disagrees with " ..
"explicitly specified %s"):format(undia[inferred_past_vowel], undia[raw_past_vowel]), "novform")
else
-- in case of footnote in past_vowel
inferred_past_vowel = past_vowel
end
end
if raw_nonpast_vowel ~= "-" and raw_nonpast_vowel ~= A and inferred_nonpast_vowel == U then
-- if inferred as I or A, the reality can be the reverse; form-I final-weak verbs with a~a and
-- i~i exist, e.g. سَعَى/يَسْعَى, وَلِيَ/يَلِي. Weird verb [[صها]] (also written [[صهى]]) has non-past
-- يصهى so we can't throw an error in this situation.
if raw_nonpast_vowel ~= inferred_nonpast_vowel then
infer_err(("Final-weak form-I verb inferred non-past vowel %s, which disagrees with " ..
"explicitly specified %s"):format(undia[inferred_nonpast_vowel], undia[raw_nonpast_vowel]), "novform")
else
-- in case of footnote in nonpast_vowel
inferred_nonpast_vowel = nonpast_vowel
end
end
end
if not is_passive_only(passive) then
if rget(past_vowel) == "-" then
past_vowel = inferred_past_vowel
end
if rget(nonpast_vowel) == "-" then
nonpast_vowel = inferred_nonpast_vowel
end
end
elseif vform == "IX" and is_waw_ya(rad3) and len == radstart + 2 and ch[len] == AMAQ then
-- Final-weak form IX verbs like اِرْعَوَى "to desist, to repent, to see the light".
weakness = "final-weak"
expected_length = radstart + 2
elseif vform == "X" and rad1 == "ح" and rad2 == Y and rad3 == ALIF then
-- Long variant اِسْتَحْيَا.
weakness = "final-weak"
rad3 = Y
variant = "long"
elseif rad3 == AMAQ or (rad2 == Y and rad3 == ALIF) or rad3 == Y then
-- rad3 == Y happens in passive-only verbs.
if vform_supports_final_weak(vform) then
weakness = "final-weak"
else
infer_err("Last radical is " .. rad3 .. " but verb form doesn't support final-weak verbs")
end
-- Ambiguous radical; arbitrarily pick wāw as radical (but avoid two wāws); it could be wāw or yāʾ, but
-- the choice doesn't affect the conjugation.
rad3 = (rad1 == W or rad2 == W) and {Y, W, ambig = true} or {W, Y, ambig = true}
elseif rad2 == ALIF then
if vform_supports_hollow(vform) then
weakness = "hollow"
local function set_past_to_a()
if req(past_vowel, A) then
-- already set
elseif req(past_vowel, "-") or req(past_vowel, rget(nonpast_vowel)) then
past_vowel = A
else
infer_err(("Form I hollow verb with nonpast vowel set to '%s' must have past vowel set to 'a' or the same value, not %s"):
format(undia[rget(nonpast_vowel)], undia[rget(past_vowel)]), "novform")
end
end
if vform == "I" and req(nonpast_vowel, U) then
rad2 = W
set_past_to_a()
elseif vform == "I" and req(nonpast_vowel, I) then
rad2 = Y
set_past_to_a()
else
if req(nonpast_vowel, A) and not req(past_vowel, I) then
infer_err(("Form I hollow verb with nonpast vowel set to 'a' must have past vowel set to 'i', not %s"):
format(undia[rget(past_vowel)]), "novform")
end
-- Ambiguous radical; could be wāw or yāʾ; if verb form I, it's critical to get this right, and
-- the caller checks for this situation and throws an error if non-past vowel is "a" and second
-- radical isn't explicitly given.
rad2 = {W, Y, ambig = true, need_radical = true}
end
else
infer_err("Second radical is alif but verb form doesn't support hollow verbs")
end
elseif vform == "I" and rad1 == W and not form_I_w_non_assimilated() then
weakness = "assimilated"
elseif rad2 == rad3 and (vform == "III" or vform == "VI") then
weakness = "geminate"
variant = "long"
else
weakness = "sound"
end
end
if expected_length then
check_len(expected_length, expected_length)
end
end
rad1 = convert(rad1, 1)
rad2 = convert(rad2, 2)
rad3 = convert(rad3, 3)
rad4 = convert(rad4, 4)
if not weakness then
error("Internal error: Returned weakness from infer_radicals() is nil")
end
return {
weakness = weakness,
rad1 = rad1,
rad2 = rad2,
rad3 = rad3,
rad4 = rad4,
past_vowel = past_vowel,
nonpast_vowel = nonpast_vowel,
form_viii_assim = form_viii_assim,
variant = variant,
}
end
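--[==[
Illustrative sketch only, not part of the module's logic. It restates, in isolation, how the form-I final-weak branch
above maps the last written letter to a third radical and a default vowel pair. The function name
`infer_form_I_final_weak` and the plain-string vowels "a", "i", "u" are assumptions made for this example; the real
code uses diacritic constants, handles passive-only verbs and the long variant حَيِيَ, and validates user-specified
vowels, none of which is reproduced here.

```lua
-- Hypothetical stand-ins for the module's letter constants (UTF-8 byte escapes).
local ALIF = "\216\167" -- ا (U+0627)
local AMAQ = "\217\137" -- ى (U+0649)
local W = "\217\136"    -- و (U+0648)
local Y = "\217\138"    -- ي (U+064A)

-- Given the first two radicals and the last written letter of a form-I final-weak verb,
-- return the inferred third radical and the default past/non-past vowels.
local function infer_form_I_final_weak(rad1, rad2, last)
	if last == ALIF then
		-- Tall alif: third radical is wāw, vowels a~u.
		return {rad3 = W, past = "a", nonpast = "u"}
	elseif last == AMAQ then
		-- Alif maqṣūra: third radical is yāʾ, vowels a~i.
		return {rad3 = Y, past = "a", nonpast = "i"}
	end
	-- Written wāw or yāʾ: the radical is ambiguous. Prefer wāw unless that would
	-- put two wāws next to each other.
	local rad3 = (rad1 == W or rad2 == W) and {Y, W, ambig = true} or {W, Y, ambig = true}
	return {rad3 = rad3, past = last == Y and "i" or "u", nonpast = "a"}
end
```

The ambiguous case returns a table whose first element is the preferred guess, mirroring the `{W, Y, ambig = true}`
shape used above.
]==]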
-- bot interface to infer_radicals()
function export.infer_radicals_json(frame)
local iparams = {
headword = {},
vform = {},
passive = {},
past_vowel = {},
nonpast_vowel = {},
is_reduced = {type = "boolean"},
}
local iargs = require("Module:parameters").process(frame.args, iparams)
return require("Module:JSON").toJSON(export.infer_radicals(iargs))
end
-- Infer vocalization from participle headword (active or passive), verb form (I, II, etc.), weakness (may be nil) and
-- whether the headword is active or passive. Throw an error if headword is malformed. Returns the fully vocalized
-- participle as a string.
function export.infer_participle_vocalization(headword, vform, weakness, is_active)
local chars = {}
local orig_headword = headword
-- Sub out alif-madda for easier processing.
headword = rsub(headword, AMAD, HAMZA .. ALIF)
local len = ulen(headword)
-- Extract the headword letters into an array.
for i = 1, len do
table.insert(chars, usub(headword, i, i))
end
local function form_intro_error_msg()
return ("For verb form %s %s%s participle %s, "):format(vform, orig_headword ~= headword and "normalized " or
"", is_active and "active" or "passive", headword)
end
local function err(msg)
error(form_intro_error_msg() .. msg, 1)
end
-- Check that length of headword is within [min, max].
local function check_len(min, max)
if min and len < min then
err(("expected at least %s letters but saw %s"):format(min, len))
elseif max and len > max then
err(("expected at most %s letters but saw %s"):format(max, len))
end
end
-- Get the character at `ind`, making sure it exists.
local function c(ind)
check_len(ind)
return chars[ind]
end
-- Check that the letter at the given index is the given string, or is one of the members of the given array.
local function check(index, must)
local letter = chars[index]
local function make_possible_values()
if type(must) == "string" then
return must
else
return list_to_text(must, nil, " or ")
end
end
if not letter then
err(("expected a letter (specifically %s) at position %s, but participle is too short"):format(
make_possible_values(), index))
end
local matches
if type(must) == "string" then
matches = letter == must
else
matches = m_table.contains(must, letter)
end
if not matches then
err(("letter %s at index %s must be %s"):format(letter, index, make_possible_values()))
end
end
local function check_weakness(values, allow_missing, invert_condition)
local function make_possible_weaknesses()
-- Build a quoted copy so the caller's `values` array is not modified.
local quoted = {}
for i, val in ipairs(values) do
quoted[i] = "'" .. val .. "'"
end
return list_to_text(quoted, nil, " or ")
end
if allow_missing and invert_condition then
error("Internal error: Can't specify both allow_missing and invert_condition")
end
if not weakness then
if allow_missing or invert_condition then
return
else
err(("weakness is unspecified but must be %s"):format(make_possible_weaknesses()))
end
else
local matches = m_table.contains(values, weakness)
if invert_condition and matches then
err(("weakness '%s' must not be %s"):format(weakness, make_possible_weaknesses()))
elseif not invert_condition and not matches then
err(("weakness '%s' must be %s"):format(weakness, make_possible_weaknesses()))
end
end
end
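--[==[
Illustrative sketch only. The three calling modes of check_weakness() above (plain, "allow missing" and
"invert condition") can be summarised as a pure predicate. The names `contains` and `weakness_ok` are hypothetical
and exist only in this example; the real function throws an error via err() instead of returning false.

```lua
-- Return true if `item` occurs in the array `list`.
local function contains(list, item)
	for _, v in ipairs(list) do
		if v == item then
			return true
		end
	end
	return false
end

-- Return true where check_weakness() would accept silently, false where it would raise an error.
local function weakness_ok(weakness, values, allow_missing, invert_condition)
	if weakness == nil then
		-- An unspecified weakness is accepted only in the two relaxed modes.
		return not not (allow_missing or invert_condition)
	end
	local matches = contains(values, weakness)
	if invert_condition then
		return not matches
	end
	return matches
end
```

For instance, `weakness_ok("sound", {"final-weak"}, nil, "invert condition")` is accepted, because the inverted mode
only rejects the listed weaknesses.
]==]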
local vocalized
local function handle_possibly_final_weak(sound_prefix, expected_length)
check_len(expected_length, expected_length)
if c(expected_length) == AMAQ then
-- passive final-weak
if is_active then
err("participle in -ِى only allowed for passive participles")
end
check_weakness({"final-weak", "assimilated+final-weak"}, "allow missing")
vocalized = sound_prefix .. AN .. AMAQ
else
-- all others behave as if sound
check_weakness({"final-weak", "assimilated+final-weak"}, nil, "invert condition")
vocalized = sound_prefix .. (is_active and I or A) .. c(expected_length)
end
end
if not (vform == "I" and is_active) then
-- all participles except verb form I active begin in م-.
check(1, M)
end
if vform == "I" then
if is_active then
check(2, ALIF)
local sound_prefix = c(1) .. AA .. c(3)
if len == 3 then
if c(3) == HAMZA then
-- Either hollow with hamzated third radical, e.g. [[شاء]] active participle 'شَاءٍ', or final-weak
-- with hamzated second radical, e.g. [[رأى]] active participle 'رَاءٍ'. Theoretically (?), also
-- geminate with hamzated second/third radical, but I don't know if any such verbs exist.
if weakness == "geminate" then
vocalized = sound_prefix .. SH
else
check_weakness({"hollow", "final-weak"}, "allow missing")
vocalized = sound_prefix .. IN
end
else
check_weakness({"final-weak", "geminate"})
if weakness == "geminate" then
vocalized = sound_prefix .. SH
else
vocalized = sound_prefix .. IN
end
end
else
check_len(4, 4)
-- we will convert back to alif maqṣūra below as needed
vocalized = sound_prefix .. I .. c(4)
end
else
-- assimilated verbs: regular, e.g. مَوْزُون "weighed"
-- geminate verbs: regular, e.g. مَبْلُول "moistened"
-- third-hamzated verbs: مَبْرُوء
-- hollow verbs: مَقُود "led, driven"; مَزِيد "added, increased"
-- hollow first-hamzated verbs: مَئِيض "returned, reverted"; مَأْيُوس "despaired" (NOTE: formation is sound);
-- مَأُود or مَؤُود "bent; depleted"
-- hollow third-hamzated verbs: مَشِيء "willed, intended", مَضُوء "glittered?"
-- final-weak: مَلْقِيّ "found, encountered"; مَصْغُوّ "inclined"
-- hollow + final-weak: مَشْوِيّ "fried, grilled", مَهْوِيّ "loved"
-- first-hamzated + hollow + final-weak: مَأْوِيّ "received hospitably"
local sound_prefix = MA .. c(2) .. SK .. c(3)
if len == 5 then
-- sound, assimilated or geminate
check(4, W)
vocalized = sound_prefix .. UU .. c(5)
else
check_len(4, 4)
if c(4) == W then
-- final-weak third-wāw
vocalized = sound_prefix .. U .. W .. SH
elseif c(4) == Y then
-- final-weak third-yāʾ
vocalized = sound_prefix .. I .. Y .. SH
else
-- hollow
check(3, {W, Y})
if c(3) == W then
vocalized = MA .. c(2) .. UU .. c(4)
else
vocalized = MA .. c(2) .. II .. c(4)
end
end
end
end
elseif vform == "II" or vform == "V" or vform == "XII" or vform == "XIII" or vform == "Iq" or vform == "IIq" or
vform == "IIIq" then
local sound_prefix, expected_length
if vform == "II" then
sound_prefix = MU .. c(2) .. A .. c(3) .. SH
expected_length = 4
elseif vform == "V" then
check(2, T)
sound_prefix = MU .. T .. A .. c(3) .. A .. c(4) .. SH
expected_length = 5
elseif vform == "XII" then
-- e.g. [[احدودب]] "to be or become convex or humpbacked", مُحْدَوْدِب (active);
-- [[اثنونى]] "to be bent; to be doubled up", مُثْنَوْنٍ (active)
check(4, W)
if c(3) ~= c(5) then
err(("third letter %s should be the same as the fifth letter %s"):format(c(3), c(5)))
end
sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. W .. SK .. c(5)
expected_length = 6
elseif vform == "XIII" then
-- e.g. [[اخروط]] "to get entangled; to extend", مُخْرَوِّط (active), مُخْرَوَّط (passive)
check(4, W)
sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. W .. SH
expected_length = 5
elseif vform == "Iq" then
sound_prefix = MU .. c(2) .. A .. c(3) .. SK .. c(4)
expected_length = 5
elseif vform == "IIq" then
check(2, T)
sound_prefix = MU .. T .. A .. c(3) .. A .. c(4) .. SK .. c(5)
expected_length = 6
elseif vform == "IIIq" then
-- e.g. [[اخرنطم]] "to be proud and angry"
check(4, N)
sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. N .. SK .. c(5)
expected_length = 6
else
error("Internal error: Unhandled verb form " .. vform)
end
if len == expected_length - 1 then
-- active final-weak
if not is_active then
err(("length-%s participle only allowed for active participles"):format(len))
end
check_weakness({"final-weak", "assimilated+final-weak"}, "allow missing")
vocalized = sound_prefix .. IN
else
handle_possibly_final_weak(sound_prefix, expected_length)
end
elseif vform == "III" or vform == "VI" then
local sound_prefix, expected_length
if vform == "VI" then
check(2, T)
check(4, ALIF)
sound_prefix = MU .. T .. A .. c(3) .. AA .. c(5)
expected_length = 6
else
sound_prefix = MU .. c(2) .. AA .. c(4)
expected_length = 5
end
if len == expected_length - 1 then
-- active final-weak or active or passive geminate
if is_active then
check_weakness({"geminate", "final-weak", "assimilated+final-weak"})
if weakness == "geminate" then
vocalized = sound_prefix .. SH
else
vocalized = sound_prefix .. IN
end
else
check_weakness({"geminate"}, "allow missing")
vocalized = sound_prefix .. SH
end
else
handle_possibly_final_weak(sound_prefix, expected_length)
end
elseif vform == "IV" or vform == "X" then
-- form IV:
-- sound: مُرْسِخ (active, "entrenching"), مُرْسَخ (passive, "entrenched")
-- first-hamzated (like sound): مُؤْيِس (active, "causing to despair"), مُؤْيَس (passive, "caused to despair")
-- final-weak: مُكْرٍ (active, "renting out"), مُكْرًى (passive, "rented out")
-- assimilated: مُورِد (active, "transferring"), مُورَد (passive, "transferred"); same when first-Y, e.g.
-- أَيْقَنَ "to be certain of": مُوقِن (active), مُوقَن (passive)
-- assimilated + final-weak: مُورٍ (active, "setting fire, kindling"), مُورًى (passive, "set fire, kindled")
-- geminate: مُمِدّ (active, "granting, helping"), مُمَدّ (passive, "granted, helped")
-- hollow: مُزِيل (active, "eliminating"), مُزَال (passive, "eliminated")
-- hollow + final-weak: مُعْيٍ (active, "tiring"), مُعْيًى (passive, "tired")
local sound_prefix, expected_length
if vform == "X" then
check(2, S)
check(3, T)
sound_prefix = MU .. S .. SK .. T .. A .. c(4)
expected_length = 6
else
sound_prefix = MU .. c(2)
expected_length = 4
end
if len == expected_length and c(len - 1) == Y and c(len) ~= AMAQ then
-- active hollow
if not is_active then
err("this shape only allowed for active participles")
end
check_weakness({"hollow"}, "allow missing")
vocalized = sound_prefix .. II .. c(len)
elseif len == expected_length and c(len - 1) == ALIF then
-- passive hollow
if is_active then
err("this shape only allowed for passive participles")
end
check_weakness({"hollow"}, "allow missing")
vocalized = sound_prefix .. AA .. c(len)
elseif len == expected_length - 1 then
-- active final-weak or active or passive geminate
if is_active then
check_weakness({"geminate", "final-weak", "assimilated+final-weak"})
if weakness == "geminate" then
vocalized = sound_prefix .. I .. c(len) .. SH
elseif vform == "IV" and c(2) == W then
-- assimilated final-weak
vocalized = sound_prefix .. c(len) .. IN
else
vocalized = sound_prefix .. SK .. c(len) .. IN
end
else
check_weakness({"geminate"}, "allow missing")
vocalized = sound_prefix .. A .. c(len) .. SH
end
else
if vform == "IV" and c(2) == W then
-- assimilated, possibly final-weak
sound_prefix = sound_prefix .. c(expected_length - 1)
else
sound_prefix = sound_prefix .. SK .. c(expected_length - 1)
end
handle_possibly_final_weak(sound_prefix, expected_length)
end
elseif vform == "VII" or vform == "VIII" then
-- form VII (passive participles are fairly rare but do exist):
-- sound: مُنْكَتِب (active "subscribing"), مُنْكَتَب (passive "subscribed")
-- geminate: مُنْضَمّ (both active "joining, containing" and passive "joined, contained")
-- final-weak: مُنْطَلٍ (active "fooling (someone)"), مُنْطَلًى (passive "fooled")
-- final-weak with medial wāw: مُنْطَوٍ (active "involving"), مُنْطَوًى (passive "involved")
-- hollow: مُنْقَاد (both active "complying with" and passive "complied with")
--
-- for form VIII, the same variants exist but things are complicated by assimilations involving the template T.
-- sound third-hamzated no assimilation: مُبْتَدِئ (active "beginning"), مُبْتَدَأ (passive "begun")
-- geminate no assimilation: مُبْتَزّ (both active "robbing" and passive "robbed")
-- final-weak no assimilation: مُبْتَنٍ (active "building"), مُبْتَنًى (passive "built")
-- final-weak with medial wāw no assimilation: مُحْتَوٍ (active "containing"), مُحْتَوًى (passive "contained")
-- hollow no assimilation: مُخْتَار (both active "choosing" and passive "chosen")
--
-- sound with total assimilation: مُتَّبِع (active "following"), مُتَّبَع (passive "followed")
-- sound with total assimilation, assimilating wāw: مُتَّعِد (active "threatening"), مُتَّعَد (passive "threatened")
-- sound with total assimilation, irregularly assimilating hamza: مُتَّخِذ (active "taking"), مُتَّخَذ (passive "taken")
-- sound with total assimilation (to ḏāl, producing dāl): مُدَّخِر (active "reserving"), مُدَّخَر (passive "reserved")
-- sound with total assimilation (to ḏāl): مُذَّكِر (active "remembering"), مُذَّكَر (passive "remembered")
-- sound with total assimilation (to ṭāʔ): مُطَّرِح (active "discarding"), مُطَّرَح (passive "discarded")
-- sound with total assimilation (to ẓāʔ): مُظَّلِم (active "tolerating"), مُظَّلَم (passive "tolerated")
-- final-weak with total assimilation, assimilating wāw: مُتَّقٍ (active "guarding against"), مُتَّقًى (passive "guarded against")
-- final-weak with total assimilation (to ṯāʔ): مُثَّنٍ (active "undulating"), مُثَّنًى (passive "undulated")
-- final-weak with total assimilation (to dāl): مُدَّعٍ (active "claiming"), مُدَّعًى (passive "claimed")
-- sound with partial assimilation (to zayn): مُزْدَهِر (active "thriving"), مُزْدَهَر (passive "thrived")
-- sound with medial wāw with partial assimilation (to zayn): مُزْدَوِج (active "appearing twice")
-- sound with partial assimilation (to ṣād): مُصْطَبِح (active "illuminating"), مُصْطَبَح (passive, "illuminated")
-- sound with partial assimilation (to ḍād): مُضْطَرِب (active "to be disturbed"; no passive)
-- geminate with partial assimilation (to ṣād): مُصْطَبّ (both active "effusing" and passive "effused")
-- geminate with partial assimilation (to ḍād): مُضْطَرّ (both active "forcing" and passive "forced")
-- final-weak with partial assimilation (to ṣād): مُصْطَلٍ (active "warming"), مُصْطَلًى (passive "warmed")
-- hollow with partial assimilation (to zayn): مُزْدَاد (both active "increasing" and passive "increased")
-- hollow with partial assimilation (to ṣād): مُصْطَاد (both active "hunting" and passive "hunted")
local sound_prefix, sufind
if vform == "VII" then
check(2, N)
sound_prefix = MU .. N .. SK .. c(3)
sufind = 4
else
local c2 = c(2)
if c2 == T or c2 == "د" or c2 == "ث" or c2 == "ذ" or c2 == "ط" or c2 == "ظ" then
-- full assimilation
sound_prefix = MU .. c2 .. SH
sufind = 3
else
-- partial or no assimilation
if c2 == "ز" then
check(3, "د")
elseif c2 == "ص" or c2 == "ض" then
check(3, "ط")
else
check(3, T)
end
sound_prefix = MU .. c2 .. SK .. c(3)
sufind = 4
end
end
if c(sufind) == ALIF then
-- hollow, active or passive
check_len(sufind + 1, sufind + 1)
check_weakness({"hollow"}, "allow missing")
vocalized = sound_prefix .. AA .. c(sufind + 1)
elseif len == sufind then
-- active final-weak or active or passive geminate
if is_active then
check_weakness({"geminate", "final-weak", "assimilated+final-weak"})
if weakness == "geminate" then
vocalized = sound_prefix .. A .. c(len) .. SH
else
vocalized = sound_prefix .. A .. c(len) .. IN
end
else
check_weakness({"geminate"}, "allow missing")
vocalized = sound_prefix .. A .. c(len) .. SH
end
else
sound_prefix = sound_prefix .. A .. c(sufind)
handle_possibly_final_weak(sound_prefix, sufind + 1)
end
elseif vform == "IX" then
check_len(4, 4)
vocalized = MU .. c(2) .. SK .. c(3) .. A .. c(4) .. SH
elseif vform == "IVq" then
-- e.g. [[اذلعب]] "to scamper away", مُذْلَعِبّ (active), مُذْلَعَبّ (passive);
-- [[اطمأن]] "to remain quietly; to be certain", مُطْمَئِنّ (active), مُطْمَأَنّ (passive)
check_len(5, 5)
local sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. c(4)
if is_active then
vocalized = sound_prefix .. I .. c(5) .. SH
else
vocalized = sound_prefix .. A .. c(5) .. SH
end
elseif vform == "XI" then
-- e.g. [[احمار]] "to turn red, to blush", مُحْمَارّ (active)
check_len(5, 5)
check(4, ALIF)
vocalized = MU .. c(2) .. SK .. c(3) .. AA .. c(5) .. SH
elseif vform == "XIV" or vform == "XV" then
-- FIXME: Implement. No examples in Wiktionary currently; need to look up in a grammar.
error("Support for verb form " .. vform .. " not implemented yet")
else
error("Don't recognize verb form " .. vform)
end
vocalized = rsub(vocalized, HAMZA .. AA, AMAD)
local reconstructed_headword = lang:stripDiacritics(vocalized)
if reconstructed_headword ~= orig_headword then
error(("Internal error: Vocalized participle %s doesn't match original participle %s"):format(
vocalized, orig_headword))
end
return vocalized
end
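--[==[
Illustrative sketch only. It walks through the sound form-II active case of the function above for a single
unvocalized headword, then repeats the final round-trip check (strip the diacritics and compare with the input).
Everything here is a simplified stand-in and an assumption of this example: the real module uses mw.ustring helpers
and lang:stripDiacritics(), whereas this sketch relies on plain Lua byte patterns and a hand-written diacritic list.

```lua
-- Hypothetical stand-ins for the module's constants (UTF-8 byte escapes).
local A  = "\217\142" -- fatḥa (U+064E)
local U  = "\217\143" -- ḍamma (U+064F)
local I  = "\217\144" -- kasra (U+0650)
local SH = "\217\145" -- shadda (U+0651)
local M  = "\217\133" -- م (U+0645)

-- Split a UTF-8 string into an array of characters.
local function split_utf8(s)
	local chars = {}
	for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
		chars[#chars + 1] = ch
	end
	return chars
end

-- Remove the four diacritics used in this sketch.
local function strip_diacritics(s)
	for _, d in ipairs({A, U, I, SH}) do
		s = s:gsub(d, "")
	end
	return s
end

-- Vocalize a sound form-II active participle on the pattern mu-C-a-CC-i-C.
local function vocalize_form_II_active(headword)
	local c = split_utf8(headword)
	if #c ~= 4 or c[1] ~= M then
		error("expected four letters beginning with mīm")
	end
	local vocalized = M .. U .. c[2] .. A .. c[3] .. SH .. I .. c[4]
	-- Round-trip check, as at the end of the real function.
	if strip_diacritics(vocalized) ~= headword then
		error("vocalized participle doesn't match original participle")
	end
	return vocalized
end
```

Called with the four letters of مؤذن, the sketch yields a form with ḍamma on the mīm, fatḥa on the hamza seat, and
shadda plus kasra on the ḏāl.
]==]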
function export.infer_participle_vocalization_json(frame)
local iparams = {
[1] = {required = true},
[2] = {required = true},
["weakness"] = {},
["passive"] = {type = "boolean"}
}
local iargs = require("Module:parameters").process(frame.args, iparams)
return export.infer_participle_vocalization(iargs[1], iargs[2], iargs.weakness, not iargs.passive)
end
return export
tdgd3u6jfiyf70dpiezhrk7sexvygv6
مؤذن
0
22396
281310
126849
2026-04-21T15:51:00Z
Hakimi97
2668
/* Etimologi */
281310
wikitext
text/x-wiki
== Bahasa Arab ==
{{Wikipedia|lang=ar}}
=== Takrifan ===
==== Kata ====
{{ar-noun|مُؤَذِّن|m|pl=مُؤَذِّنُون}}
# [[bilal]], [[muazin]]
===== Deklensi =====
{{ar-decl-noun|مُؤَذِّن|pl=مُؤَذِّنُون}}
=== Etimologi ===
Daripada {{m|ar|أَذَّنَ||[[panggil]]}}, daripada akar {{ar-root|ء ذ ن}}.
{{C|ar|Solat|Agamawan Islam}}
7nbt9iek917cs7r356bzx9aoddlyiy8
اعتقاد
0
24132
281311
144426
2026-04-21T15:52:12Z
Hakimi97
2668
/* Kata nama */
281311
wikitext
text/x-wiki
==Bahasa Melayu==
===Takrifan===
====Kata nama====
{{ms-noun|pl=-}}
# {{ms-jawi|iktikad}}
=== Pautan luar ===
* {{R:PRPM}}
== Bahasa Arab ==
=== Takrifan ===
==== Kata nama ====
{{ar-noun|اِعْتِقَاد|m|pl=اِعْتِقَادَات}}
# [[kepercayaan]], [[pegangan]], [[akidah]]
===Etimologi===
Daripada dasar {{ar-root|ع ق د}}.
===Sebutan===
* {{ar-IPA|اِعْتِقَاد}}
== Bahasa Parsi ==
=== Takrifan ===
==== Kata nama ====
{{fa-kn|tr=e'teqâd}}
# [[kepercayaan]], [[pegangan]]
# [[pendapat]]
===Etimologi===
Pinjaman {{bor|fa|ar|اِعْتِقَاد}}.
===Sebutan===
{{fa-AFA|i'ti`qād}}
rfjvg6r9qfrz78kt0aqzkqlr4z24ky3
buli
0
24834
281244
130760
2026-04-21T13:23:13Z
Countryball mys123
9925
/* Bahasa Melayu */Tambah gambar
281244
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}} <!-- Kalau ada -->
[[File:Bullying Prevention in the United States.jpg|thumb|Gambaran buli]]
=== Takrifan ===
==== Kata nama ====
{{ms-kn|j=بولي}}
# Perbuatan mengganggu, memaksa dan merendah-rendahkan seseorang secara melampau-lampau, terutamanya yang berdarjat lebih rendah.
==== Kata kerja ====
{{ms-kk|j=بولي}}
# Melakukan perbuatan buli.
=== Sebutan ===
* {{dewan|bu|li}}
=== Pautan luar ===
* {{R:PRPM}}
91ladtnkpvdj4zahc8o095098qvmdli
gonob
0
25875
281413
178454
2026-04-22T08:06:54Z
~2026-24499-96
10668
281413
wikitext
text/x-wiki
==Bahasa Kadazandusun==
===Takrifan===
====Kata nama====
{{inti|dtp|kata nama}}
# [[sarung]]
# [[kain basahan]]
#: {{cp|dtp| Nopupuan ku no '''gonob''' di odu.| Saya sudah mencuci '''kain sarung''' nenek.}}
===Sebutan===
* {{IPA|dtp|/ɡɔ.nɔɓ/}}
* {{rima|dtp|nɔɓ|ɔɓ}}
* {{penyempangan|dtp|go|nob}}
===Terbitan===
* {{l|dtp|mononggonob}}
* {{l|dtp|kigonob}}
===Tesaurus===
; Sinonim: [[tapi]], [[sarung]].
===Rujukan===
Mongulud Boros Dusun Kadazan (1994). Komoiboros Dusunkadazan. Kota Kinabalu: MBDK.
3bbn1ddwywy3r3kxvynju8qonbj1ssb
281420
281413
2026-04-22T09:10:07Z
Hakimi97
2668
Membatalkan semakan [[Special:Diff/281413|281413]] oleh [[Special:Contributions/~2026-24499-96|~2026-24499-96]] ([[User talk:~2026-24499-96|bincang]])
281420
wikitext
text/x-wiki
==Bahasa Kadazandusun==
===Takrifan===
====Kata nama====
{{inti|dtp|kata nama}}
# [[sarung]]
# [[kain basahan]]
#: {{cp|dtp| Nopupuan ku no '''gonob''' di odu.| Saya sudah mencuci '''kain sarung''' nenek.}}
===Sebutan===
* {{IPA|dtp|/ɡɔ.nɔɓ/}}
* {{rima|dtp|nɔɓ|ɔɓ}}
* {{penyempangan|dtp|go|nob}}
===Terbitan===
* {{l|dtp|mononggonob}}
* {{l|dtp|kigonob}}
===Tesaurus===
; Sinonim: [[tapi]], [[sarung]].
===Rujukan===
Mongulud Boros Dusun Kadazan (1994). Komoiboros Dusunkadazan. Kota Kinabalu: MBDK.
nt90ms78lqr09yrx5kcxoae60rfot3w
ليل
0
27195
281297
134057
2026-04-21T15:33:25Z
Hakimi97
2668
281297
wikitext
text/x-wiki
== Bahasa Arab ==
=== Takrifan ===
==== Kata nama ====
{{ar-noun|لَيْل|m|pl=-}}
# [[malam]]
#: {{ant|ar|نَهَار}}
=== Etimologi ===
Daripada {{ar-root|ل ي ل}}, daripada {{inh|ar|sem-pro|*layl-}}.
=== Sebutan ===
* {{ar-IPA|لَيْل}}
* {{audio|ar|Ar-ليل.ogg|Audio}}
=== Lihat juga ===
* {{l|ar|لَيْلَة}}
6jsf56ix68w0efrtv1xolvytcvhbe5x
Maghribi
0
27340
281312
149953
2026-04-21T15:52:43Z
Hakimi97
2668
/* Etimologi */
281312
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}} <!-- Kalau ada -->
[[Image:MAR orthographic.svg|upright=1.13|thumb|right|Peta Maghribi.]]
=== Takrifan ===
==== Kata nama khas ====
{{ms-knk|j=مغربي}}
# Sebuah negara di barat laut [[Afrika]].
=== Etimologi ===
Daripada {{bor|ms|ar|الْمَغْرِب}}; daripada {{ar-root|غ ر ب}}; {{m|ar|غَرْب||[[barat]]}}. Lihat juga ''[[maghrib]]''.
=== Sebutan ===
* {{dewan|Magh|ri|bi}}
=== Lihat juga ===
* {{senarai:negara di Afrika/ms}}
=== Pautan luar ===
* {{R:PRPM}}
4v6mdxzdxzqs4q56i80f1zy007kcp03
Modul:languages/data
828
33717
281318
223008
2026-04-21T19:40:16Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/89499015|89499015]])
281318
Scribunto
text/plain
local export = {}
-- We can't use mw.loadData() on [[Module:languages/chars]] because [[Module:languages/data]] itself is sometimes loaded
-- using mw.loadData(), and calling mw.loadData() on [[Module:languages/chars]] will insert metatables into the
-- character tables, which the second mw.loadData() will choke on.
local m_chars = require("Module:languages/chars")
local u = require("Module:string/char")
local c = m_chars.chars
export.chars = c
local p = m_chars.puaChars
export.puaChars = p
local cs = m_chars.chars_substitutions
export.chars_substitutions = cs
-- FIXME! Many of the script-specific values below can be moved to [[Module:scripts/data]] to serve as script-wide
-- fallback values instead of specifying them for each language using the script.
local s = {}
-- These values are placed here to make it possible to synchronise a group of languages without the need for a dedicated function module.
-- cau
do
local cau_remove_diacritics = c.grave .. c.acute .. c.macron
local cau_from = {"[IlΙІӀᴴ]"}
local cau_to = {{
["l"] = "ӏ",
["Ι"] = "ӏ",
["І"] = "ӏ",
["Ӏ"] = "ӏ",
["ᴴ"] = "ᵸ",
}}
s["cau-Cyrl-displaytext"] = {
from = cau_from,
to = cau_to,
}
s["cau-Cyrl-stripdiacritics"] = {
remove_diacritics = cau_remove_diacritics,
from = cau_from,
to = cau_to,
}
s["cau-Latn-stripdiacritics"] = {remove_diacritics = cau_remove_diacritics}
end
s["itc-Latn-displaytext"] = {
from = {c.caron},
to = {c.breve},
}
s["itc-Latn-stripdiacritics"] = {remove_diacritics = c.macron .. c.breve .. c.diaer .. c.caron .. c.dinvbreve}
s["itc-Latn-sortkey"] = {
remove_diacritics = c.circ .. c.tilde .. c.macron .. c.breve .. c.diaer .. c.caron .. c.zigzag .. c.dmacron .. c.dtilde .. c.dinvbreve .. c.small_a .. c.small_e .. c.small_i .. c.small_o .. c.small_u, -- Chiefly medieval abbreviations.
from = {"ᵃ", "æ", "[đꝱꟈ]", "ᵉ", "ⁱ", "ꝁ", "[ƚꝉꝲ]", "ꝳ", "ꝴ", "[ꝋᵒ]", "œ", "[ꝑꝓꝕ]", "[ꝗꝙ]", "[ꝛꝵꝶꝝ]", "[ꟊˢ]", "[ꝷᵗ]", "ᵘ", "ꝟ", "⁊"},
to = {"a", "ae", "d", "e", "i", "k", "l", "m", "n", "o", "oe", "p", "q", "r", "s", "t", "u", "v", "&"}
}
s["Jpan-standardchars"] = -- exclude ぢづヂヅ
"ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちっつてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろん" ..
"ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチッツテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロン"
local jpx_displaytext = {
from = {"~", "="},
to = {"〜", "゠"}
}
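--[==[
Illustrative sketch only. A `from`/`to` pair such as the one above describes an ordered list of pattern
replacements. The helper `apply_substitutions` below is hypothetical and is not how [[Module:languages]] necessarily
applies these tables (the real code is Unicode-aware, which matters for character classes such as those in the cau
data); it only shows the pairing of `from[i]` with `to[i]` using plain string.gsub.

```lua
local jpx_displaytext = {
	from = {"~", "="},
	to = {"〜", "゠"},
}

-- Apply each pattern in data.from in order, replacing matches with the entry at the same
-- index in data.to. string.gsub accepts either a string or a lookup table as the replacement.
local function apply_substitutions(text, data)
	for i, pattern in ipairs(data.from) do
		text = text:gsub(pattern, data.to[i] or "")
	end
	return text
end
```

With this table, the ASCII tilde and equals sign in the input are replaced by the wave dash and the double hyphen
respectively, while all other characters pass through unchanged.
]==]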
s["jpx-displaytext"] = {
Jpan = jpx_displaytext,
Hani = jpx_displaytext,
Hrkt = jpx_displaytext,
Hira = jpx_displaytext,
Kana = jpx_displaytext
-- not Latn or Brai
}
s["jpx-stripdiacritics"] = s["jpx-displaytext"]
s["jpx-sortkey"] = {
Jpan = "Jpan-sortkey",
Hani = "Hani-sortkey",
Hrkt = "Hira-sortkey", -- sort general kana by normalizing to Hira
Hira = "Hira-sortkey",
Kana = "Kana-sortkey",
Latn = {remove_diacritics = c.tilde .. c.macron .. c.diaer}
}
s["jpx-translit"] = {
Hrkt = "Hrkt-translit",
Hira = "Hrkt-translit",
Kana = "Hrkt-translit"
}
s["roa-oil-sortkey"] = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'",
from = {"æ", "œ", "·"},
to = {"ae", "oe", " "}
}
s["wen-sortkey"] = {
from = {"ch", "[lłßꞩẜ]", "dz[" .. c.caron .. c.acute .. "]", "[bcefmnoprswz][" .. c.caron .. c.acute .. c.dotabove .. "]"},
to = {
"h" .. p[1],
{
["l"] = "l" .. p[1], ["ł"] = "l", ["ß"] = "s", ["ꞩ"] = "š", ["ẜ"] = "š",
},
{
["dz" .. c.caron] = "d" .. p[1], ["dz" .. c.acute] = "d" .. p[2]
},
{
["b" .. c.acute] = "b" .. p[1],
["c" .. c.caron] = "c" .. p[1], ["c" .. c.acute] = "c" .. p[2],
["e" .. c.caron] = "e" .. p[1], ["e" .. c.dotabove] = "e" .. p[1],
["f" .. c.acute] = "f" .. p[1],
["m" .. c.acute] = "m" .. p[1],
["n" .. c.acute] = "n" .. p[1],
["o" .. c.acute] = "o" .. p[1],
["p" .. c.acute] = "p" .. p[1],
["r" .. c.caron] = "r" .. p[1], ["r" .. c.acute] = "r" .. p[2],
["s" .. c.caron] = "s" .. p[1], ["s" .. c.acute] = "s" .. p[2],
["w" .. c.acute] = "w" .. p[1],
["z" .. c.caron] = "z" .. p[1], ["z" .. c.acute] = "z" .. p[2],
}
}
}
-- Myanmar dotted form : https://www.unicode.org/Public/UNIDATA/StandardizedVariants.txt
s["aio-displaytext"] = {
from = {"([ကဂငတထပမယလဝဢေၵၸၺႀꩠꩡꩢꩣꩤꩥꩦꩫꩬꩯꩺ])"},
to = {"%1" .. c.VS01}
}
s["aio-stripdiacritics"] = {
remove_diacritics = c.VS01,
}
s["phk-displaytext"] = s["aio-displaytext"]
s["phk-stripdiacritics"] = s["aio-stripdiacritics"]
s["kht-displaytext"] = s["aio-displaytext"]
s["kht-stripdiacritics"] = s["aio-stripdiacritics"]
export.shared = s
--[==[ var:
Short-term solution to override the standard substitution process, by forcing the module to substitute the entire
text in one pass, if "cont" is given. This results in any PUA characters that are used as stand-ins for formatting being
handled by the language-specific substitution process, which is usually undesirable. If the value is "none" then the
formatting tags do not get turned into PUA characters in the first place. This override is provided for languages which
use formatting between strings of text which might need to interact with each other (e.g. Korean 값이 transliterates as
"gaps-i", but [[값]] has the formatting '''값'''[[-이]]; the normal process would split the text at the second ''').
]==]
export.substitution = {
["gmy"] = "none",
["ja"] = "cont",
["jje"] = "cont",
["ko"] = "cont",
["ko-ear"] = "cont",
["ru"] = "cont",
["th-new"] = "cont",
["sa"] = "cont",
["zkt"] = "cont",
}
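-- Illustrative sketch only (an assumption, not part of this module): a consumer could read the override for a
-- language code as shown here. `get_substitution_mode` is a hypothetical helper name; codes without an entry
-- yield nil, i.e. the standard substitution process applies.
--
--	local function get_substitution_mode(code)
--		return export.substitution[code]
--	end
--	get_substitution_mode("ko")  -- "cont": the entire text is substituted in one pass
--	get_substitution_mode("gmy") -- "none": formatting tags are never turned into PUA characters
--	get_substitution_mode("ms")  -- nil: no override is listed for this code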
--[==[ var:
Code aliases. The left side is the alias and the right side is the canonical code. NOTE: These are gradually
being deprecated, so should not be added to on a permanent basis. Temporary additions are permitted under reasonable
circumstances (e.g. to facilitate changing a language's code). When an alias is no longer used, it should be removed.
Aliases in this table are tracked at [[Wiktionary:Tracking/languages/LANG]]; see e.g.
[[Special:WhatLinksHere/Wiktionary:Tracking/languages/VL.]] for the `VL.` alias.
]==]
export.aliases = {
["EL."] = "la-ecc",
["LL."] = "la-lat",
["ML."] = "la-med",
["NL."] = "la-new",
["VL."] = "la-vul",
["nds-DE"] = "nds-de",
["nds-NL"] = "nds-nl",
["roa-oan"] = "roa-ona",
["sa-cls"] = "cls",
["sa-ved"] = "vsn",
}
--[==[ var:
Codes which are tracked. Note that all aliases listed above are also tracked, so should not be duplicated here.
Tracking uses the same mechanism described above in the comment above `export.aliases`.
]==]
export.track = {
-- Codes duplicated between full and etymology-only languages.
["lzh-lit"] = true,
["lzh"] = true,
-- Languages actively being converted to families.
["bh"] = true, -- inc-bih
["nan"] = true, -- zhx-nan
}
return export
b89ykupfotcto2w975o7485wv63kf7f
281320
281318
2026-04-21T19:43:53Z
Hakimi97
2668
Undid revision [[Special:Diff/281318|281318]] by [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|talk]])
281320
Scribunto
text/plain
local m_scripts = require("Module:scripts")
local table = table
local insert = table.insert
local u = require("Module:string/char")
local export = {}
-- UTF-8 encoded strings for some commonly-used diacritics.
local c = {
prime = u(0x02B9),
grave = u(0x0300),
acute = u(0x0301),
circ = u(0x0302),
tilde = u(0x0303),
macron = u(0x0304),
overline = u(0x0305),
breve = u(0x0306),
dotabove = u(0x0307),
diaer = u(0x0308),
ringabove = u(0x030A),
dacute = u(0x030B),
caron = u(0x030C),
lineabove = u(0x030D),
dgrave = u(0x030F),
invbreve = u(0x0311),
commaabove = u(0x0313),
revcommaabove = u(0x0314),
dotbelow = u(0x0323),
diaerbelow = u(0x0324),
ringbelow = u(0x0325),
cedilla = u(0x0327),
ogonek = u(0x0328),
brevebelow = u(0x032E),
macronbelow = u(0x0331),
perispomeni = u(0x0342),
ypogegrammeni = u(0x0345),
CGJ = u(0x034F), -- combining grapheme joiner
zigzag = u(0x035B),
dbrevebelow = u(0x035C),
dmacron = u(0x035E),
dtilde = u(0x0360),
dinvbreve = u(0x0361),
small_a = u(0x0363),
small_e = u(0x0364),
small_i = u(0x0365),
small_o = u(0x0366),
small_u = u(0x0367),
keraia = u(0x0374),
lowerkeraia = u(0x0375),
tonos = u(0x0384),
palatalization = u(0x0484),
dasiapneumata = u(0x0485),
psilipneumata = u(0x0486),
kashida = u(0x0640),
fathatan = u(0x064B),
dammatan = u(0x064C),
kasratan = u(0x064D),
fatha = u(0x064E),
damma = u(0x064F),
kasra = u(0x0650),
shadda = u(0x0651),
sukun = u(0x0652),
hamzaabove = u(0x0654),
nunghunna = u(0x0658),
zwarakay = u(0x0659),
smallv = u(0x065A),
superalef = u(0x0670),
udatta = u(0x0951),
anudatta = u(0x0952),
dottedgrave = u(0x1DC0),
dottedacute = u(0x1DC1),
coronis = u(0x1FBD),
psili = u(0x1FBF),
dasia = u(0x1FEF),
ZWNJ = u(0x200C), -- zero width non-joiner
ZWJ = u(0x200D), -- zero width joiner
RSQuo = u(0x2019), -- right single quote
kavyka = u(0xA67C),
VS01 = u(0xFE00), -- variation selector 1
-- Punctuation for the standardChars field.
-- Note: characters are literal (i.e. no magic characters).
punc = " ',-‐‑‒–—…∅",
-- Range covering all diacritics.
diacritics = u(0x300) .. "-" .. u(0x34E) ..
u(0x350) .. "-" .. u(0x36F) ..
u(0x1AB0) .. "-" .. u(0x1ACE) ..
u(0x1DC0) .. "-" .. u(0x1DFF) ..
u(0x20D0) .. "-" .. u(0x20F0) ..
u(0xFE20) .. "-" .. u(0xFE2F),
}
-- Braille characters for the standardChars field.
local braille = {}
for i = 0x2800, 0x28FF do
insert(braille, u(i))
end
c.braille = table.concat(braille)
export.chars = c
-- PUA characters, generally used in sortkeys.
-- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables).
local p = {}
for i = 1, 32 do
p[i] = u(0xF000+i-1)
end
export.puaChars = p
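-- Illustrative sketch only (an assumption about usage, inferred from the sortkey tables in this module): appending
-- a PUA character to a base letter gives a key that sorts after that base letter followed by any ordinary letter,
-- but before the next base letter, so the modified letter behaves like a separate letter of the alphabet.
-- Assuming plain byte-wise comparison of UTF-8 strings (e.g. the C locale):
--
--	local c_caron, c_acute = "c" .. p[1], "c" .. p[2] -- the replacements used in s["wen-sortkey"]
--	assert("cz" < c_caron and c_caron < c_acute and c_acute < "d")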
local s = {}
-- These values are placed here to make it possible to synchronise a group of languages without the need for a dedicated function module.
-- cau
do
local cau_remove_diacritics = c.grave .. c.acute .. c.macron
local cau_from = {"[IlΙІӀᴴ]"}
local cau_to = {{
["l"] = "ӏ",
["Ι"] = "ӏ",
["І"] = "ӏ",
["Ӏ"] = "ӏ",
["ᴴ"] = "ᵸ",
}}
s["cau-Cyrl-displaytext"] = {
from = cau_from,
to = cau_to,
}
s["cau-Cyrl-entryname"] = {
remove_diacritics = cau_remove_diacritics,
from = cau_from,
to = cau_to,
}
s["cau-Latn-entryname"] = {remove_diacritics = cau_remove_diacritics}
end
-- Cyrs
do
local Cyrs_remove_diacritics = c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. c.kavyka
s["Cyrs-entryname"] = {remove_diacritics = Cyrs_remove_diacritics}
s["Cyrs-sortkey"] = {
remove_diacritics = Cyrs_remove_diacritics,
from = {
"ї", "оу", -- 2 chars
"[ґꙣєѕꙃꙅꙁіꙇђꙉѻꙩꙫꙭꙮꚙꚛꙋѡѿꙍѽꙑѣꙗѥꙕѧꙙѩꙝꙛѫѭѯѱѳѵҁ]"
},
to = {
"и" .. p[1], "у", {
["ґ"] = "г" .. p[1], ["ꙣ"] = "д" .. p[1], ["є"] = "е", ["ѕ"] = "ж" .. p[1], ["ꙃ"] = "ж" .. p[1],
["ꙅ"] = "ж" .. p[1], ["ꙁ"] = "з", ["і"] = "и" .. p[1], ["ꙇ"] = "и" .. p[1], ["ђ"] = "и" .. p[2],
["ꙉ"] = "и" .. p[2], ["ѻ"] = "о", ["ꙩ"] = "о", ["ꙫ"] = "о", ["ꙭ"] = "о",
["ꙮ"] = "о", ["ꚙ"] = "о", ["ꚛ"] = "о", ["ꙋ"] = "у", ["ѡ"] = "х" .. p[1],
["ѿ"] = "х" .. p[1], ["ꙍ"] = "х" .. p[1], ["ѽ"] = "х" .. p[1], ["ꙑ"] = "ы", ["ѣ"] = "ь" .. p[1],
["ꙗ"] = "ь" .. p[2], ["ѥ"] = "ь" .. p[3], ["ꙕ"] = "ю", ["ѧ"] = "я", ["ꙙ"] = "я",
["ѩ"] = "я" .. p[1], ["ꙝ"] = "я" .. p[1], ["ꙛ"] = "я" .. p[2], ["ѫ"] = "я" .. p[3], ["ѭ"] = "я" .. p[4],
["ѯ"] = "я" .. p[5], ["ѱ"] = "я" .. p[6], ["ѳ"] = "я" .. p[7], ["ѵ"] = "я" .. p[8], ["ҁ"] = "я" .. p[9],
}
},
}
end
s["Grek-displaytext"] = {
from = {"Þ", "þ", "['" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos, used as the numeral sign in entries.
to = {"Ϸ", "ϸ", c.RSQuo}
}
s["Grek-entryname"] = {
remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow,
from = s["Grek-displaytext"].from,
to = {"Ϸ", "ϸ", "'"}
}
s["Grek-sortkey"] = {
remove_diacritics = "';·`¨´῀" .. c.grave .. c.acute .. c.diaer .. c.caron .. c.commaabove .. c.revcommaabove .. c.macron .. c.breve .. c.diaerbelow .. c.brevebelow .. c.perispomeni .. c.ypogegrammeni .. c.RSQuo .. c.prime .. c.keraia .. c.lowerkeraia .. c.tonos .. c.coronis .. c.psili .. c.dasia,
from = {"ϝ", "ͷ", "ϛ", "ͱ", "ͺ", "ϳ", "ϻ", "[ϟϙ]", "[ςϲ]", "ͳ"},
to = {"ε" .. p[1], "ε" .. p[2], "ε" .. p[3], "ζ" .. p[1], "ι", "ι" .. p[1], "π" .. p[1], "π" .. p[2], "σ", "ϡ"}
}
s["itc-Latn-displaytext"] = {
from = {c.caron},
to = {c.breve},
}
s["itc-Latn-entryname"] = {remove_diacritics = c.macron .. c.breve .. c.diaer .. c.caron .. c.dinvbreve}
s["itc-Latn-sortkey"] = {
remove_diacritics = c.circ .. c.tilde .. c.macron .. c.breve .. c.diaer .. c.caron .. c.zigzag .. c.dmacron .. c.dtilde .. c.dinvbreve .. c.small_a .. c.small_e .. c.small_i .. c.small_o .. c.small_u, -- Chiefly medieval abbreviations.
from = {"ᵃ", "æ", "[đꝱꟈ]", "ᵉ", "ⁱ", "ꝁ", "[ƚꝉꝲ]", "ꝳ", "ꝴ", "[ꝋᵒ]", "œ", "[ꝑꝓꝕ]", "[ꝗꝙ]", "[ꝛꝵꝶꝝ]", "[ꟊˢ]", "[ꝷᵗ]", "ᵘ", "ꝟ", "⁊"},
to = {"a", "ae", "d", "e", "i", "k", "l", "m", "n", "o", "oe", "p", "q", "r", "s", "t", "u", "v", "&"}
}
s["Jpan-standardchars"] = -- exclude ぢづヂヅ
"ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちっつてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろん" ..
"ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチッツテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロン"
local jpx_displaytext = {
from = {"~", "="},
to = {"〜", "゠"}
}
s["jpx-displaytext"] = {
Jpan = jpx_displaytext,
Hani = jpx_displaytext,
Hrkt = jpx_displaytext,
Hira = jpx_displaytext,
Kana = jpx_displaytext
-- not Latn or Brai
}
s["jpx-entryname"] = s["jpx-displaytext"]
s["jpx-sortkey"] = {
Jpan = "Jpan-sortkey",
Hani = "Hani-sortkey",
Hrkt = "Hira-sortkey", -- sort general kana by normalizing to Hira
Hira = "Hira-sortkey",
Kana = "Kana-sortkey",
Latn = {remove_diacritics = c.tilde .. c.macron .. c.diaer}
}
s["jpx-translit"] = {
Hrkt = "Hrkt-translit",
Hira = "Hrkt-translit",
Kana = "Hrkt-translit"
}
local HaniChars = m_scripts.getByCode("Hani"):getCharacters()
-- `漢字(한자)`→`漢字`
-- `가-나-다`→`가나다`, `가--나--다`→`가-나-다`
-- `온돌(溫突/溫堗)`→`온돌` ([[ondol]])
s["Kore-entryname"] = {
remove_diacritics = u(0x302E) .. u(0x302F),
from = {"([" .. HaniChars .. "])%(.-%)", "^%-", "%-$", "%-(%-?)", "\1", "%([" .. HaniChars .. "/]+%)"},
to = {"%1", "\1", "\1", "%1", "-"}
}
s["Lisu-sortkey"] = {
from = {"𑾰"},
to = {"ꓬ" .. p[1]}
}
s["Mong-displaytext"] = {
from = {"([ᠨ-ᡂᡸ])ᠶ([ᠨ-ᡂᡸ])", "([ᠠ-ᡂᡸ])ᠸ([^᠋ᠠ-ᠧ])", "([ᠠ-ᡂᡸ])ᠸ$"},
to = {"%1ᠢ%2", "%1ᠧ%2", "%1ᠧ"}
}
s["Mong-entryname"] = s["Mong-displaytext"]
s["Polyt-displaytext"] = s["Grek-displaytext"]
s["Polyt-entryname"] = {
remove_diacritics = c.macron .. c.breve .. c.dbrevebelow,
from = s["Grek-entryname"].from,
to = s["Grek-entryname"].to
}
s["Polyt-sortkey"] = s["Grek-sortkey"]
-- Samr
do
s["Samr-entryname"] = {
remove_diacritics = c.CGJ .. u(0x0816) .. "-" .. u(0x082D),
}
s["Samr-sortkey"] = s["Samr-entryname"]
end
s["roa-oil-sortkey"] = {
remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'",
from = {"æ", "œ", "·"},
to = {"ae", "oe", " "}
}
s["Tibt-displaytext"] = {
from = {"ༀ", "༌", "།།", "༚༚", "༚༝", "༝༚", "༝༝", "ཷ", "ཹ", "ེེ", "ོོ"},
to = {"ཨོཾ", "་", "༎", "༛", "༟", "࿎", "༞", "ྲཱྀ", "ླཱྀ", "ཻ", "ཽ"}
}
s["Tibt-entryname"] = s["Tibt-displaytext"]
s["wen-sortkey"] = {
from = {"ch", "[lłßꞩẜ]", "dz[" .. c.caron .. c.acute .. "]", "[bcefmnoprswz][" .. c.caron .. c.acute .. c.dotabove .. "]"},
to = {
"h" .. p[1],
{
["l"] = "l" .. p[1], ["ł"] = "l", ["ß"] = "s", ["ꞩ"] = "š", ["ẜ"] = "š",
},
{
["dz" .. c.caron] = "d" .. p[1], ["dz" .. c.acute] = "d" .. p[2]
},
{
["b" .. c.acute] = "b" .. p[1],
["c" .. c.caron] = "c" .. p[1], ["c" .. c.acute] = "c" .. p[2],
["e" .. c.caron] = "e" .. p[1], ["e" .. c.dotabove] = "e" .. p[1],
["f" .. c.acute] = "f" .. p[1],
["m" .. c.acute] = "m" .. p[1],
["n" .. c.acute] = "n" .. p[1],
["o" .. c.acute] = "o" .. p[1],
["p" .. c.acute] = "p" .. p[1],
["r" .. c.caron] = "r" .. p[1], ["r" .. c.acute] = "r" .. p[2],
["s" .. c.caron] = "s" .. p[1], ["s" .. c.acute] = "s" .. p[2],
["w" .. c.acute] = "w" .. p[1],
["z" .. c.caron] = "z" .. p[1], ["z" .. c.acute] = "z" .. p[2],
}
}
}
export.shared = s
-- Short-term solution to override the standard substitution process, by forcing the module to substitute the entire
-- text in one pass, if "cont" is given. This results in any PUA characters that are used as stand-ins for formatting
-- being handled by the language-specific substitution process, which is usually undesirable. If the value is "none"
-- then the formatting tags do not get turned into PUA characters in the first place. This override is provided for
-- languages which use formatting between strings of text which might need to interact with each other (e.g. Korean
-- 값이 transliterates as "gaps-i", but [[값]] has the formatting '''값'''[[-이]]; the normal process would split the
-- text at the second ''').
export.substitution = {
["gmy"] = "none",
["ja"] = "cont",
["jje"] = "cont",
["ko"] = "cont",
["ko-ear"] = "cont",
["ru"] = "cont",
["th-new"] = "cont",
["sa"] = "cont",
["zkt"] = "cont",
}
-- Code aliases. The left side is the alias and the right side is the canonical code. NOTE: These are gradually
-- being deprecated, so should not be added to on a permanent basis. Temporary additions are permitted under reasonable
-- circumstances (e.g. to facilitate changing a language's code). When an alias is no longer used, it should be removed.
-- Aliases in this table are tracked at [[Wiktionary:Tracking/languages/LANG]]; see e.g.
-- [[Special:WhatLinksHere/Wiktionary:Tracking/languages/VL.]] for the `VL.` alias.
export.aliases = {
["EL."] = "la-ecc",
["LL."] = "la-lat",
["ML."] = "la-med",
["NL."] = "la-new",
["VL."] = "la-vul",
["nds-DE"] = "nds-de",
["nds-NL"] = "nds-nl",
["roa-oan"] = "roa-ona",
}
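-- Illustrative sketch only (an assumption, not part of this module): resolving a possibly-aliased code to its
-- canonical form. `resolve_alias` is a hypothetical helper name.
--
--	local function resolve_alias(code)
--		return export.aliases[code] or code
--	end
--	resolve_alias("VL.") -- "la-vul"
--	resolve_alias("la")  -- "la" (not an alias, so returned unchanged)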
-- Codes which are tracked. Note that all aliases listed above are also tracked, so should not be duplicated here.
-- Tracking uses the same mechanism described above in the comment above `export.aliases`.
export.track = {
-- Codes duplicated between full and etymology-only languages.
["lzh-lit"] = true,
-- Languages actively being converted to families.
["bh"] = true, -- inc-bih
["nan"] = true, -- zhx-nan
}
return export
e070ni97phn8zzfz8rxgqlnnoivur06
Modul:languages/data/exceptional
828
33718
281316
276272
2026-04-21T19:33:44Z
Hakimi97
2668
Updated to match the English Wiktionary (revision [[en:Special:Diff/89762531|89762531]])
281316
Scribunto
text/plain
local m_langdata = require("Module:languages/data")
-- Loaded on demand, as it may not be needed (depending on the data).
local function u(...)
u = require("Module:string utilities").char
return u(...)
end
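-- Illustrative sketch only (hypothetical names, not part of this module) of the self-replacing loader used above:
-- the first call swaps the local for the real function, so the expensive load happens at most once and never
-- happens if the function is not called. Here `load_char` stands in for the require() call.
--
--	local loads = 0
--	local function load_char()
--		loads = loads + 1
--		return string.char
--	end
--	local function ch(...)
--		ch = load_char() -- rebinds the local `ch`, so later calls go straight to string.char
--		return ch(...)
--	end
--	assert(ch(65) == "A")
--	assert(ch(66) == "B")
--	assert(loads == 1)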
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["aav-khs-pro"] = {
"Khasi Purba",
116773216,
"aav-khs",
"Latn",
type = "reconstructed",
}
m["aav-nic-pro"] = {
"Nicobar Purba",
116773793,
"aav-nic",
"Latn",
type = "reconstructed",
}
m["aav-pkl-pro"] = {
"Pnar-Khasi-Lyngngam Purba",
116773259,
"aav-pkl",
"Latn",
type = "reconstructed",
}
m["aav-pro"] = { -- mkh-pro will merge into this
"Austroasia Purba",
116773186,
"aav",
"Latn",
type = "reconstructed",
}
m["afa-pro"] = {
"Afroasia Purba",
269125,
"afa",
"Latn",
type = "reconstructed",
}
m["alg-aga"] = {
"Agawam",
nil,
"alg-eas",
"Latn",
}
m["alg-pro"] = {
"Algonquin Purba",
7251834,
"alg",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["alv-ama"] = {
"Amasi",
4740400,
"nic-grs",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron},
}
m["alv-bgu"] = {
"Baïnounk Gubëeher",
17002646,
"alv-bny",
"Latn",
}
m["alv-bua-pro"] = {
"Bua Purba",
116773723,
"alv-bua",
"Latn",
type = "reconstructed",
}
m["alv-cng-pro"] = {
"Cangin Purba",
116773726,
"alv-cng",
"Latn",
type = "reconstructed",
}
m["alv-edo-pro"] = {
"Edoid Purba",
116773206,
"alv-edo",
"Latn",
type = "reconstructed",
}
m["alv-fli-pro"] = {
"Fali Purba",
116773754,
"alv-fli",
"Latn",
type = "reconstructed",
}
m["alv-gbe-pro"] = {
"Gbe Purba",
116773208,
"alv-gbe",
"Latn",
type = "reconstructed",
}
m["alv-gng-pro"] = {
"Guang Purba",
116773757,
"alv-gng",
"Latn",
type = "reconstructed",
}
m["alv-gtm-pro"] = {
"Togo Tengah Purba",
116773732,
"alv-gtm",
"Latn",
type = "reconstructed",
}
m["alv-gwa"] = {
"Gwara",
16945580,
"nic-pla",
"Latn",
}
m["alv-hei-pro"] = {
"Heiban Purba",
116773760,
"alv-hei",
"Latn",
type = "reconstructed",
}
m["alv-ido-pro"] = {
"Idomoid Purba",
116773764,
"alv-ido",
"Latn",
type = "reconstructed",
}
m["alv-igb-pro"] = {
"Igboid Purba",
116773765,
"alv-igb",
"Latn",
type = "reconstructed",
}
m["alv-kwa-pro"] = {
"Kwa Purba",
116773780,
"alv-kwa",
"Latn",
type = "reconstructed",
}
m["alv-mum-pro"] = {
"Mumuye Purba",
116773791,
"alv-mum",
"Latn",
type = "reconstructed",
}
m["alv-nup-pro"] = {
"Nupoid Purba",
116773795,
"alv-nup",
"Latn",
type = "reconstructed",
}
m["alv-pro"] = {
"Atlantik-Congo Purba",
116732838,
"alv",
"Latn",
type = "reconstructed",
}
m["alv-edk-pro"] = {
"Edekiri Purba",
nil,
"alv-edk",
"Latn",
type = "reconstructed",
}
m["alv-yor-pro"] = {
"Yoruba Purba",
nil,
"alv-yor",
"Latn",
type = "reconstructed",
}
m["alv-yrd-pro"] = {
"Yoruboid Purba",
116773824,
"alv-yrd",
"Latn",
type = "reconstructed",
}
m["alv-von-pro"] = {
"Volta-Niger Purba",
116773820,
"alv-von",
"Latn",
type = "reconstructed",
}
m["apa-pro"] = {
"Apache Purba",
116773135,
"apa",
"Latn",
type = "reconstructed",
}
m["aql-pro"] = {
"Algik Purba",
18389588,
"aql",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["art-adu"] = {
"Adûni",
1232159,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bel"] = {
"Kreol Belter",
108055510,
"art",
"Latn",
type = "appendix-constructed",
sort_key = {
remove_diacritics = c.acute,
from = {"ɒ"},
to = {"a"},
},
}
m["art-blk"] = {
"Bolak",
2909283,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bsp"] = {
"Black Speech",
686210,
"art",
"Latn, Teng",
type = "appendix-constructed",
}
m["art-com"] = {
"Communicationssprache",
35227,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-dtk"] = {
"Dothraki",
2914733,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-elo"] = {
"Eloi",
nil,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-gld"] = {
"Goa'uld",
19823,
"art",
"Latn, Egyp, Mero",
type = "appendix-constructed",
}
m["art-lap"] = {
"Lapine",
6488195,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-man"] = {
"Mandalorian",
54289,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-mun"] = {
"Mundolinco",
851355,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-nav"] = {
"Na'vi",
316939,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-vlh"] = {
"High Valyrian",
64483808,
"art",
"Latn",
type = "appendix-constructed",
}
m["ath-nic"] = {
"Nicola",
20609,
"ath-nor",
"Latn",
}
m["ath-pro"] = {
"Athabaska Purba",
104841722,
"ath",
"Latn",
type = "reconstructed",
}
m["auf-pro"] = {
"Arawa Purba",
116773706,
"auf",
"Latn",
type = "reconstructed",
}
m["aus-alu"] = {
"Alungul",
16827670,
"aus-pmn",
"Latn",
}
m["aus-and"] = {
"Andjingith",
4754509,
"aus-pmn",
"Latn",
}
m["aus-ang"] = {
"Angkula",
16828520,
"aus-pmn",
"Latn",
}
m["aus-arn-pro"] = {
"Arnhem Purba",
116773720,
"aus-arn",
"Latn",
type = "reconstructed",
}
m["aus-bra"] = {
"Barranbinya",
4863220,
"aus-pmn",
"Latn",
}
m["aus-brm"] = {
"Barunggam",
4865914,
"aus-pmn",
"Latn",
}
m["aus-cww-pro"] = {
"New South Wales Tengah Purba",
116773199,
"aus-cww",
"Latn",
type = "reconstructed",
}
m["aus-dal-pro"] = {
"Daly Purba",
116773743,
"aus-dal",
"Latn",
type = "reconstructed",
}
m["aus-guw"] = {
"Guwar",
6652138,
"aus-pam",
"Latn",
}
m["aus-lsw"] = {
"Little Swanport",
6652138,
"qfa-unc",
"Latn",
}
m["aus-mbi"] = {
"Mbiywom",
6799701,
"aus-pmn",
"Latn",
}
m["aus-ngk"] = {
"Ngkoth",
7022405,
"aus-pmn",
"Latn",
}
m["aus-nyu-pro"] = {
"Nyulnyulan Purba",
116773797,
"aus-nyu",
"Latn",
type = "reconstructed",
}
m["aus-pam-pro"] = {
"Pama-Nyunga Purba",
33942,
"aus-pam",
"Latn",
type = "reconstructed",
}
m["aus-tul"] = {
"Tulua",
16938541,
"aus-pam",
"Latn",
}
m["aus-uwi"] = {
"Uwinymil",
7903995,
"aus-arn",
"Latn",
}
m["aus-wdj-pro"] = {
"Iwaidjan Purba",
116773767,
"aus-wdj",
"Latn",
type = "reconstructed",
}
m["aus-won"] = {
"Wong-gie",
nil,
"aus-pam",
"Latn",
}
m["aus-wul"] = {
"Wulguru",
8039196,
"aus-dyb",
"Latn",
}
m["aus-ynk"] = { -- contrast nny
"Yangkaal",
3913770,
"aus-tnk",
"Latn",
}
m["awd-amc-pro"] = {
"Amuesha-Chamicuro Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-kmp-pro"] = {
"Kampa Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-prw-pro"] = {
"Paresi-Waura Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-ama"] = {
"Amarizana",
16827787,
"awd",
"Latn",
}
m["awd-ana"] = {
"Anauyá",
16828252,
"awd",
"Latn",
}
m["awd-apo"] = {
"Apolista",
16916645,
"awd",
"Latn",
}
m["awd-cab"] = {
"Cabre",
16850160,
"awd",
"Latn",
}
m["awd-gnu"] = {
"Guinau",
3504087,
"awd",
"Latn",
}
m["awd-kar"] = {
"Cariay",
16920253,
"awd",
"Latn",
}
m["awd-kaw"] = {
"Kawishana",
6379993,
"awd-nwk",
"Latn",
}
m["awd-kus"] = {
"Kustenau",
5196293,
"awd",
"Latn",
}
m["awd-man"] = {
"Manao",
6746920,
"awd",
"Latn",
}
m["awd-mar"] = {
"Marawan",
6755108,
"awd",
"Latn",
}
m["awd-mpr"] = {
"Maipure",
6736872,
"awd",
"Latn",
}
m["awd-mrt"] = {
"Mariaté",
16910017,
"awd-nwk",
"Latn",
}
m["awd-nwk-pro"] = {
"Nawiki Purba",
116773234,
"awd-nwk",
"Latn",
type = "reconstructed",
}
m["awd-pai"] = {
"Paikoneka",
128807835,
"awd",
"Latn",
}
m["awd-pas"] = {
"Pasé",
7143168,
"awd-nwk",
"Latn",
}
m["awd-pro"] = {
"Arawak Purba",
97573478,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-she"] = {
"Shebayo",
7492248,
"awd",
"Latn",
}
m["awd-taa-pro"] = {
"Ta-Arawak Purba",
116773282,
"awd-taa",
"Latn",
type = "reconstructed",
}
m["awd-wai"] = {
"Wainumá",
16910017,
"awd-nwk",
"Latn",
}
m["awd-yum"] = {
"Yumana",
8061062,
"awd-nwk",
"Latn",
}
m["azc-caz"] = {
"Cazcan",
5055514,
"azc",
"Latn",
}
m["azc-cup-pro"] = {
"Cupan Purba",
116773738,
"azc-cup",
"Latn",
type = "reconstructed",
}
m["azc-ktn"] = {
"Kitanemuk",
3197558,
"azc-tak",
"Latn",
}
m["azc-nah-pro"] = {
"Nahua Purba",
7251860,
"azc-nah",
"Latn",
type = "reconstructed",
}
m["azc-num-pro"] = {
"Numi Purba",
116773247,
"azc-num",
"Latn",
type = "reconstructed",
}
m["azc-pro"] = {
"Uto-Aztek Purba",
96400333,
"azc",
"Latn",
type = "reconstructed",
}
m["azc-tak-pro"] = {
"Takik Purba",
116773283,
"azc-tak",
"Latn",
type = "reconstructed",
}
m["azc-tat"] = {
"Tataviam",
743736,
"azc",
"Latn",
}
m["ber-pro"] = {
"Barbar Purba",
2855698,
"ber",
"Latn",
type = "reconstructed",
}
m["ber-fog"] = {
"Fogaha",
107610173,
"ber",
"Latn",
}
m["ber-zuw"] = {
"Zuwara",
4117169,
"ber",
"Latn",
}
m["bnt-bal"] = {
"Balong",
93935237,
"bnt-bbo",
"Latn",
}
m["bnt-bon"] = {
"Boma Nkuu",
nil,
"bnt",
"Latn",
}
m["bnt-boy"] = {
"Boma Yumu",
nil,
"bnt",
"Latn",
}
m["bnt-bwa"] = {
"Bwala",
128810345,
"bnt-tek",
"Latn",
}
m["bnt-cmw"] = {
"Chimwiini",
4958328,
"bnt-swh",
"Latn",
}
m["bnt-ind"] = {
"Indanga",
51412803,
"bnt",
"Latn",
}
m["bnt-lal"] = {
"Lala (Afrika Selatan)",
6480154,
"bnt-ngu",
"Latn",
}
m["bnt-mpi"] = {
"Mpiin",
93937013,
"bnt-bdz",
"Latn",
}
m["bnt-mpu"] = {
"Mpuono", -- not to be confused with Mbuun zmp
36056,
"bnt",
"Latn",
}
m["bnt-ngu-pro"] = {
"Nguni Purba",
961559,
"bnt-ngu",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron},
}
m["bnt-phu"] = {
"Phuthi",
33796,
"bnt-ngu",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute},
}
m["bnt-pro"] = {
"Bantu Purba",
3408025,
"bnt",
"Latn",
type = "reconstructed",
sort_key = "bnt-pro-sortkey",
}
m["bnt-sab-pro"] = {
"Sabaki Purba",
nil, -- Q2209395 is the code for the Sabaki family
"bnt-sab",
"Latn",
type = "reconstructed",
}
m["bnt-sbo"] = {
"Boma Selatan",
nil,
"bnt",
"Latn",
}
m["bnt-sts-pro"] = {
"Sotho-Tswana Purba",
116773278,
"bnt-sts",
"Latn",
type = "reconstructed",
}
m["btk-pro"] = {
"Batak Purba",
116773191,
"btk",
"Latn",
type = "reconstructed",
}
m["cau-abz-pro"] = {
"Abkhaz-Abaza Purba",
7251831,
"cau-abz",
"Latn",
type = "reconstructed",
}
m["cau-and-pro"] = {
"Andi Purba",
nil,
"cau-and",
"Latn",
type = "reconstructed",
}
m["cau-ava-pro"] = {
"Avar-Andi Purba",
116773187,
"cau-ava",
"Latn",
type = "reconstructed",
}
m["cau-cir-pro"] = {
"Circassia Purba",
7251838,
"cau-cir",
"Latn",
type = "reconstructed",
}
m["cau-drg-pro"] = {
"Dargwa Purba",
116773205,
"cau-drg",
"Latn",
type = "reconstructed",
}
m["cau-lzg-pro"] = {
"Lezgi Purba",
116773223,
"cau-lzg",
"Latn",
type = "reconstructed",
}
m["cau-nec-pro"] = {
"Kaukasus Timur Laut Purba",
116773244,
"cau-nec",
"Latn",
type = "reconstructed",
}
m["cau-nkh-pro"] = {
"Nakh Purba",
108032840,
"cau-nkh",
"Latn",
type = "reconstructed",
}
m["cau-nwc-pro"] = {
"Kaukasus Barat Laut Purba",
7251861,
"cau-nwc",
"Latn",
type = "reconstructed",
}
m["cau-tsz-pro"] = {
"Tsez Purba",
116773287,
"cau-tsz",
"Latn",
type = "reconstructed",
}
m["cba-ata"] = {
"Atanques",
4812783,
"cba",
"Latn",
}
m["cba-cat"] = {
"Catío Chibcha",
7083619,
"cba",
"Latn",
}
m["cba-dor"] = {
"Dorasque",
5297532,
"cba",
"Latn",
}
m["cba-dui"] = {
"Duit",
3041061,
"cba",
"Latn",
}
m["cba-hue"] = {
"Huetar",
35514,
"cba",
"Latn",
}
m["cba-nut"] = {
"Nutabe",
7070405,
"cba",
"Latn",
}
m["cba-pro"] = {
"Chibchan Purba",
116773203,
"cba",
"Latn",
type = "reconstructed",
}
m["ccs-pro"] = {
"Kartvelia Purba",
2608203,
"ccs",
"Latn",
type = "reconstructed",
strip_diacritics = {
from = {"q̣", "p̣", "ʓ", "ċ"},
to = {"q̇", "ṗ", "ʒ", "c̣"}
},
}
m["ccs-gzn-pro"] = {
"Georgia-Zan Purba",
23808119,
"ccs-gzn",
"Latn",
type = "reconstructed",
strip_diacritics = {
from = {"q̣", "p̣", "ʓ", "ċ"},
to = {"q̇", "ṗ", "ʒ", "c̣"}
},
}
m["cdc-cbm-pro"] = {
"Chad Tengah Purba",
116773197,
"cdc-cbm",
"Latn",
type = "reconstructed",
}
m["cdc-mas-pro"] = {
"Masa Purba",
116773789,
"cdc-mas",
"Latn",
type = "reconstructed",
}
m["cdc-pro"] = {
"Chad Purba",
116773201,
"cdc",
"Latn",
type = "reconstructed",
}
m["cdd-pro"] = {
"Caddoan Purba",
116773725,
"cdd",
"Latn",
type = "reconstructed",
}
m["cel-bry-pro"] = {
"Briton Purba",
1248800,
"cel-bry",
"Latn, Polyt",
sort_key = {
Latn = "cel-bry-pro-sortkey",
},
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cel-gal"] = {
"Gallaecia",
3094789,
"cel-his",
}
m["cel-gau"] = {
"Gallia",
29977,
"cel",
"Latn, Polyt, Ital",
strip_diacritics = {
Latn = {remove_diacritics = c.macron .. c.breve .. c.diaer},
},
sort_key = {
Latn = "cel-bry-pro-sortkey",
},
-- Ital translit in [[Module:scripts/data]]
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["cel-pro"] = {
"Keltik Purba",
653649,
"cel",
"Latn",
type = "reconstructed",
sort_key = "cel-pro-sortkey",
}
m["chi-pro"] = {
"Chimakuan Purba",
116773734,
"chi",
"Latn",
type = "reconstructed",
}
m["chm-pro"] = {
"Mari Purba",
116773788,
"chm",
"Latn",
type = "reconstructed",
}
m["cmc-pro"] = {
"Chamik Purba",
114793834,
"cmc",
"Latn",
type = "reconstructed",
}
m["crp-bip"] = {
"Pijin Basque-Iceland",
810378,
"crp",
"Latn",
ancestors = "eu",
}
m["crp-kia"] = {
"Pijin Jerman Kiautschou",
108314615,
"crp",
"Latn",
ancestors = "de",
}
m["crp-gep"] = {
"Pijin Greenland Barat",
17036301,
"crp",
"Latn",
ancestors = "kl",
}
m["crp-mar"] = {
"Maroon Spirit Language",
1093206,
"crp",
"Latn",
ancestors = "en",
}
m["crp-mpp"] = {
"Portugis Pijin Macau",
128804537,
"crp",
"Hant, Latn",
ancestors = "pt",
sort_key = {Hant = "Hani-sortkey"},
}
m["crp-rsn"] = {
"Russenorsk",
505125,
"crp",
"Cyrl, Latn",
ancestors = "nn, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-spp"] = {
"Samoan Plantation Pidgin",
7409948,
"crp",
"Latn",
ancestors = "en",
}
m["crp-slb"] = {
"Inggeris Solombala",
7558525,
"crp",
"Cyrl, Latn",
ancestors = "en, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-tpr"] = {
"Rusia Pijin Taimyr",
16930506,
"crp",
"Cyrl",
ancestors = "ru",
translit = "ru-translit",
}
m["csu-bba-pro"] = {
"Bongo-Bagirmi Purba",
116773722,
"csu-bba",
"Latn",
type = "reconstructed",
}
m["csu-maa-pro"] = {
"Mangbetu Purba",
116773786,
"csu-maa",
"Latn",
type = "reconstructed",
}
m["csu-pro"] = {
"Sudan Tengah Purba",
116773730,
"csu",
"Latn",
type = "reconstructed",
}
m["csu-sar-pro"] = {
"Sara Purba",
116773809,
"csu-sar",
"Latn",
type = "reconstructed",
}
m["cus-ash"] = {
"Ashraaf",
4805855,
"cus-som",
"Latn",
}
m["cus-hec-pro"] = {
"Kusyi Timur Tanah Tinggi Purba",
116773761,
"cus-hec",
"Latn",
type = "reconstructed",
}
m["cus-som-pro"] = {
"Somaloid Purba",
nil,
"cus-som",
"Latn",
type = "reconstructed",
}
m["cus-sou-pro"] = {
"Kusyi Selatan Purba",
126081567,
"cus-sou",
"Latn",
type = "reconstructed",
}
m["cus-pro"] = {
"Kusyi Purba",
116773204,
"cus",
"Latn",
type = "reconstructed",
}
m["dmn-dam"] = {
"Dama (Sierra Leone)",
19601574,
"dmn",
"Latn",
}
m["dra-bry"] = {
"Beary",
1089116,
"qfa-mix",
"Mlym, Knda",
ancestors = "ml, tcy",
-- Knda translit in [[Module:scripts/data]]
-- Mlym translit in [[Module:scripts/data]]
}
m["dra-cen-pro"] = {
"Dravidia Tengah Purba",
nil,
"dra-cen",
"Latn",
type = "reconstructed",
}
m["dra-mkn"] = {
"Kannada Pertengahan",
128810572,
"dra-kan",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["dra-nor-pro"] = {
"Dravidia Utara Purba",
124433593,
"dra-nor",
"Latn",
type = "reconstructed",
}
m["dra-okn"] = {
"Kannada Kuno",
15723156,
"dra-kan",
"Knda",
-- Knda translit in [[Module:scripts/data]]
}
m["dra-ote"] = {
"Telugu Kuno",
126720868,
"dra-tel",
"Telu",
translit = "te-translit",
}
m["dra-pro"] = {
"Dravidia Purba",
1702853,
"dra",
"Latn",
type = "reconstructed",
}
m["dra-sdo-pro"] = {
"Dravidia Selatan I Purba",
104847952, -- Wikipedia's "Dravidia Selatan Purba" is Dravidia Selatan I Purba in this scheme.
"dra-sdo",
"Latn",
type = "reconstructed",
}
m["dra-sdt-pro"] = {
"Dravidia Selatan II Purba",
128885257,
"dra-sdt",
"Latn",
type = "reconstructed",
}
m["dra-sou-pro"] = {
"Dravidia Selatan Purba",
128886121,
"dra-sou",
"Latn",
type = "reconstructed",
}
m["egx-dem"] = {
"Demotik",
36765,
"egx",
"Latn, Egyd, Polyt",
sort_key = {
Latn = {
remove_diacritics = "'%-%s",
from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"},
to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]}
},
},
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["dmn-pro"] = {
"Mande Purba",
116773785,
"dmn",
"Latn",
type = "reconstructed",
}
m["dmn-mdw-pro"] = {
"Mande Barat Purba",
116773822,
"dmn-mdw",
"Latn",
type = "reconstructed",
}
m["dru-pro"] = {
"Rukai Purba",
116773807,
"map",
"Latn",
type = "reconstructed",
}
m["ero-gsz"] = {
"Geshiza",
nil,
"ero",
"Latn",
}
m["ero-nya"] = {
"Nyagrong Minyag",
nil,
"ero",
"Latn",
}
m["ero-tau"] = {
"Stau",
nil,
"ero",
"Latn",
}
m["esx-esk-pro"] = {
"Eskimo Purba",
7251842,
"esx-esk",
"Latn",
type = "reconstructed",
}
m["esx-ink"] = {
"Inuktun",
1671647,
"esx-inu",
"Latn",
}
m["esx-inq"] = {
"Inuinnaqtun",
28070,
"esx-inu",
"Latn",
}
m["esx-inu-pro"] = {
"Inuit Purba",
60785588,
"esx-inu",
"Latn",
type = "reconstructed",
}
m["esx-pro"] = {
"Eskimo-Aleut Purba",
7251843,
"esx",
"Latn",
type = "reconstructed",
}
m["esx-tut"] = {
"Tunumiisut",
15665389,
"esx-inu",
"Latn",
}
m["euq-pro"] = {
"Vascon Purba",
938011,
"euq",
"Latn",
type = "reconstructed",
}
m["gba-pro"] = {
"Gbaya Purba",
nil,
"gba",
"Latn",
type = "reconstructed",
}
m["gem-pro"] = {
"Jermanik Purba",
669623,
"gem",
"Latn",
type = "reconstructed",
sort_key = "gem-pro-sortkey",
}
m["gme-bur"] = {
"Burgundians",
47625,
"gme",
"Latn",
}
m["gme-cgo"] = {
"Goth Crimea",
36211,
"gme",
"Latn",
}
m["gmq-gut"] = {
"Gutnish",
1256646,
"gmq",
"Latn",
ancestors = "gmq-ogt",
}
m["gmq-jmk"] = {
"Jamtish",
35512,
"gmq-eas",
"Latn",
}
m["gmq-mno"] = {
"Norway Pertengahan",
3417070,
"gmq-wes",
"Latn",
}
m["gmq-oda"] = {
"Denmark Kuno",
12330003,
"gmq-eas",
"Latn, Runr",
strip_diacritics = {remove_diacritics = c.macron},
}
m["gmq-ogt"] = {
"Gutnish Kuno",
1133488,
"gmq",
"Latn, Runr",
ancestors = "non",
}
m["gmq-osw"] = {
"Sweden Kuno",
2417210,
"gmq-eas",
"Latn, Runr",
strip_diacritics = {remove_diacritics = c.macron},
}
m["gmq-pro"] = {
"Norse Purba",
1671294,
"gmq",
"Runr",
translit = "Runr-translit",
}
m["gmq-scy"] = {
"Scanian",
768017,
"gmq-eas",
"Latn",
}
m["gmw-bgh"] = {
"Bergish",
329030,
"gmw-frk",
"Latn",
}
m["gmw-cfr"] = {
"Franconia Tengah",
572197,
"gmw-hgm",
"Latn",
ancestors = "gmh",
wikimedia_codes = "ksh",
}
m["gmw-ecg"] = {
"Jerman Tengah Timur",
499344, -- subsumes Q699284, Q152965
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-fin"] = {
"Fingallian",
3072588,
"gmw-ian",
"Latn",
}
m["gmw-gts"] = {
"Gottscheerish",
533109,
"gmw-hgm",
"Latn",
ancestors = "bar",
}
m["gmw-jdt"] = {
"Belanda Jersey",
1687911,
"gmw-frk",
"Latn",
ancestors = "nl",
}
m["gmw-msc"] = {
"Scots Pertengahan",
3327000,
"gmw-ang",
"Latn",
ancestors = "enm-esc",
}
m["gmw-pro"] = {
"Jermanik Barat Purba",
78079021,
"gmw",
"Latn, Runr",
-- type = "reconstructed",
-- largely but not entirely reconstructed (like Proto-Norse); see April '24 BP, set back to reconstructed (?) if 'anti-asterisk' is added
sort_key = "gmw-pro-sortkey",
}
m["gmw-rfr"] = {
"Franconia Rhine",
707007,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-stm"] = {
"Sathmar Swabian",
2223059,
"gmw-hgm",
"Latn",
ancestors = "swg",
}
m["gmw-tsx"] = {
"Transylvanian Saxon",
260942,
"gmw-hgm",
"Latn",
ancestors = "gmw-cfr",
}
m["gmw-vog"] = {
"Jerman Volga",
312574,
"gmw-hgm",
"Latn",
ancestors = "gmw-rfr",
}
m["gmw-zps"] = {
"Jerman Zipser",
205548,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gn-cls"] = {
"Guarani Klasik",
17478065,
"gn",
"Latn",
}
m["grk-cal"] = {
"Yunani Calabria",
1146398,
"grk",
"Latn, Grek",
ancestors = "grk-ita",
translit = {
Grek = "el-translit",
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-ita"] = {
"Yunani Itali",
19720507,
"grk",
"Latn, Polyt",
ancestors = "gkm",
translit = {
Grek = "el-translit",
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-mar"] = {
"Yunani Mariupol",
4400023,
"grk",
"Cyrl, Latn, Polyt",
ancestors = "gkm",
translit = {
Cyrl = "grk-mar-translit",
Polyt = "grk-mar-translit",
},
override_translit = true,
strip_diacritics = {
Cyrl = {remove_diacritics = c.acute},
},
-- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["grk-pro"] = {
"Hellenik Purba",
1231805,
"grk",
"Latn, Polyt",
type = "reconstructed",
sort_key = {Latn = {
from = {"ʰ", "ʷ"},
to = {"h", "w"},
remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. c.caron
}},
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
-- NOTE: formerly no translit specified for Polyt; presumably an accidental omission; if not, set Polyt = false in
-- the translit section
}
m["hmn-pro"] = {
"Hmong Purba",
116773210,
"hmn",
"Latn",
type = "reconstructed",
}
m["hmx-mie-pro"] = {
"Mien Purba",
116773229,
"hmx-mie",
"Latn",
type = "reconstructed",
}
m["hmx-pro"] = {
"Hmong-Mien Purba",
7251846,
"hmx",
"Latn",
type = "reconstructed",
}
m["hyx-pro"] = {
"Armenia Purba",
3848498,
"hyx",
"Latn",
type = "reconstructed",
}
m["iir-nur-pro"] = {
"Nuristani Purba",
116773248,
"iir-nur",
"Latn",
type = "reconstructed",
}
m["iir-pro"] = {
"Indo-Iran Purba",
966439,
"iir",
"Latn",
type = "reconstructed",
}
m["ijo-pro"] = {
"Ijoid Purba",
116773766,
"ijo",
"Latn",
type = "reconstructed",
}
m["inc-apa"] = {
"Apabhramsa",
616419,
"inc-mid",
"Deva, Shrd, Sidd",
ancestors = "pra",
translit = {
Deva = "sa-translit",
-- Shrd translit in [[Module:scripts/data]]
-- Sidd translit in [[Module:scripts/data]]
},
}
m["inc-ash"] = {
"Prakrit Ashoka",
104854379,
"inc-mid",
"Brah, Khar",
ancestors = "sa",
translit = {
-- Brah translit in [[Module:scripts/data]]
Khar = "Khar-translit",
},
}
m["inc-dng-pro"] = {
"Dangari Purba",
nil,
"inc-dng",
"Latn",
type = "reconstructed",
}
m["inc-kam"] = {
"Prakrit Kamarupi",
6356097,
"inc-bas",
"Brah, Sidd",
-- Brah, Sidd translit in [[Module:scripts/data]]
}
m["inc-kho"] = {
"Kholosi",
24952008,
"inc-snd",
"Latn",
}
m["inc-krd-pro"] = {
"Kamta Purba",
128816843,
"inc-bas",
"Latn",
ancestors = "inc-kam",
type = "reconstructed",
}
m["inc-mas"] = {
"Assam Pertengahan",
128806836,
"inc-bas",
"as-Beng",
ancestors = "inc-oas",
translit = "inc-mas-translit",
}
m["inc-mbn"] = {
"Benggali Pertengahan",
113559927,
"inc-bas",
"Beng",
ancestors = "inc-obn",
translit = "inc-mbn-translit",
}
m["inc-mgu"] = {
"Gujarat Pertengahan",
24907429,
"inc-wes",
"Deva",
ancestors = "inc-ogu",
}
m["inc-mor"] = {
"Odia Pertengahan",
128810882,
"inc-eas",
"Orya",
ancestors = "inc-oor",
}
m["inc-oas"] = {
"Assam Awal",
85758237,
"inc-bas",
"as-Beng",
ancestors = "inc-kam",
translit = "inc-oas-translit",
}
m["inc-oaw"] = {
"Awadhi Kuno",
nil,
"inc-hie",
"Deva, Kthi, ur-Arab",
strip_diacritics = {
from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "sa-translit",
Kthi = "sa-Kthi-translit",
["ur-Arab"] = "inc-ohi-translit",
},
}
m["inc-obn"] = {
"Benggali Kuno",
113559926,
"inc-bas",
"Beng",
}
m["inc-ogu"] = {
"Gujarati Kuno",
24907427,
"inc-wes",
"Deva",
translit = "sa-translit",
}
m["inc-ohi"] = {
"Hindi Kuno",
48767781,
"inc-hiw",
"Deva, ur-Arab",
strip_diacritics = {
from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ" (U+0647 + U+0654) to "ه"
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "sa-translit",
["ur-Arab"] = "inc-ohi-translit",
},
}
m["inc-oor"] = {
"Odia Kuno",
128807801,
"inc-eas",
"Orya",
}
m["inc-opa"] = {
"Punjabi Kuno",
115270971,
"inc-pan",
"Guru, pa-Arab",
translit = {
Guru = "inc-opa-Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun},
}
m["inc-pro"] = {
"Indo-Arya Purba",
23808344,
"inc",
"Latn",
type = "reconstructed",
}
m["ine-ana-pro"] = {
"Anatolia Purba",
7251833,
"ine-ana",
"Latn",
type = "reconstructed",
}
m["ine-bsl-pro"] = {
"Balto-Slavik Purba",
1703347,
"ine-bsl",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"},
to = {"a", "e", "i", "o", "u"}
},
}
m["ine-kal"] = {
"Kalašma",
122770439,
"ine-ana",
"Xsux",
}
m["ine-pae"] = {
"Paeonian",
2705672,
"ine",
"Polyt",
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["ine-pro"] = {
"Indo-Eropah Purba",
37178,
"ine",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron},
to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"}
},
}
m["ine-toc-pro"] = {
"Tocharia Purba",
104841462,
"ine-toc",
"Latn",
type = "reconstructed",
}
m["xme-old"] = {
"Medes Kuno",
36461,
"xme",
"Polyt, Latn",
}
m["xme-mid"] = {
"Medes Pertengahan",
12836150,
"xme",
"Latn",
}
m["xme-ker"] = {
"Kerman",
129850,
"xme",
"fa-Arab, Latn, Hebr",
ancestors = "xme-mid",
-- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["xme-taf"] = {
"Tafreshi",
nil,
"xme",
"fa-Arab, Latn",
ancestors = "xme-mid",
}
m["xme-ttc-pro"] = {
"Tat Purba",
122973870,
"xme-ttc",
"Latn",
ancestors = "xme-mid",
}
m["xme-kls"] = {
"Kalasuri",
nil,
"xme-ttc",
ancestors = "xme-ttc-nor",
}
m["xme-klt"] = {
"Kilit",
3612452,
"xme-ttc",
"Cyrl", -- and fa-Arab?
}
m["xme-ott"] = {
"Tati Kuno",
434697,
"xme-ttc",
"fa-Arab, Latn",
}
m["ira-kms-pro"] = {
"Komisenian Purba",
116773777,
"ira-kms",
"Latn",
type = "reconstructed",
}
m["ira-mpr-pro"] = {
"Medo-Parthia Purba",
116773227,
"ira-mpr",
"Latn",
type = "reconstructed",
}
m["ira-pat-pro"] = {
"Pathan Purba",
116773255,
"ira-pat",
"Latn",
type = "reconstructed",
}
m["ira-pro"] = {
"Iran Purba",
4167865,
"ira",
"Latn",
type = "reconstructed",
}
m["ira-zgr-pro"] = {
"Zaza-Gorani Purba",
116775031,
"ira-zgr",
"Latn",
type = "reconstructed",
}
m["xsc-pro"] = {
"Scythia Purba",
116773273,
"xsc",
"Latn",
type = "reconstructed",
}
m["xsc-sar-pro"] = {
"Sarmatia Purba",
116773249,
"xsc-sar",
"Latn",
type = "reconstructed",
}
m["xsc-skw-pro"] = {
"Saka-Wakhi Purba",
116773267,
"xsc-skw",
"Latn",
type = "reconstructed",
}
m["xsc-sak-pro"] = {
"Saka Purba",
116773264,
"xsc-sak",
"Latn",
type = "reconstructed",
}
m["ira-sym-pro"] = {
"Shughni-Yazghulami-Munji Purba",
116773813,
"ira-sym",
"Latn",
type = "reconstructed",
}
m["ira-sgi-pro"] = {
"Sanglechi-Ishkashimi Purba",
116773808,
"ira-sgi",
"Latn",
type = "reconstructed",
}
m["ira-mny-pro"] = {
"Munji-Yidgha Purba",
116773792,
"ira-mny",
"Latn",
type = "reconstructed",
}
m["ira-shy-pro"] = {
"Shughni-Yazghulami Purba",
116773812,
"ira-shy",
"Latn",
type = "reconstructed",
}
m["ira-shr-pro"] = {
"Shughni-Roshani Purba",
116773811,
"ira-shr",
"Latn",
type = "reconstructed",
}
m["ira-sgc-pro"] = {
"Sogdia Purba",
116773276,
"ira-sgc",
"Latn",
type = "reconstructed",
}
m["ira-wnj"] = {
"Vanji",
3398419,
"ira-shy",
"Latn",
}
m["iro-ere"] = {
"Erie",
5388365,
"iro-nor",
"Latn",
}
m["iro-min"] = {
"Mingo",
128531,
"iro-nor",
"Latn",
ietf_subtag = "i-mingo", -- grandfathered IETF tag
}
m["iro-nor-pro"] = {
"Iroquois Utara Purba",
116773242,
"iro-nor",
"Latn",
type = "reconstructed",
}
m["iro-pro"] = {
"Iroquois Purba",
7251852,
"iro",
"Latn",
type = "reconstructed",
}
m["itc-pro"] = {
"Italik Purba",
17102720,
"itc",
"Latn",
type = "reconstructed",
}
m["itc-psa"] = {
"Pra-Samnita",
7239186,
"itc-sbl",
"Ital, Polyt, Latn",
-- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission)
-- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["jpx-hcj"] = {
"Hachijō",
5637049,
"jpx",
"Jpan",
ancestors = "ojp-eas",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
strip_diacritics = s["jpx-stripdiacritics"],
sort_key = s["jpx-sortkey"],
}
m["jpx-pro"] = {
"Jepunik Purba",
3924309,
"jpx",
"Latn",
type = "reconstructed",
}
m["jpx-ryu-pro"] = {
"Ryukyu Purba",
56349069,
"jpx-ryu",
"Latn",
type = "reconstructed",
}
m["kar-pro"] = {
"Karen Purba",
85794783,
"kar",
"Latn",
type = "reconstructed",
}
m["kca-eas"] = {
"Khanty Timur",
30304622,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
-- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side)
sort_key = { Cyrl = { from = {""}, to = {""} } },
}
m["kca-nor"] = {
"Khanty Utara",
30304527,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
-- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side)
sort_key = { Cyrl = { from = {""}, to = {""} } },
}
m["kca-pro"] = {
"Khanty Purba",
127505171,
"kca",
"Latn",
type = "reconstructed",
}
m["kca-sou"] = {
"Khanty Selatan",
30304618,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
}
m["khi-kho-pro"] = {
"Khoe Purba",
116773218,
"khi-kho",
"Latn",
type = "reconstructed",
}
m["khi-kun"] = {
"ǃKung",
32904,
"khi-kxa",
"Latn",
}
m["ko-ear"] = {
"Korea Moden Awal",
756014,
"qfa-kor",
"Kore",
ancestors = "okm",
translit = "okm-translit",
-- Kore strip_diacritics in [[Module:scripts/data]]
}
m["kro-pro"] = {
"Kru Purba",
116773778,
"kro",
"Latn",
type = "reconstructed",
}
m["ku-pro"] = {
"Kurdi Purba",
116773221,
"ku",
"Latn",
type = "reconstructed",
}
m["map-ata-pro"] = {
"Atayal Purba",
116773151,
"map-ata",
"Latn",
type = "reconstructed",
}
m["map-bms"] = {
"Banyumasan",
33219,
"map",
"Latn, Java",
}
m["map-pro"] = {
"Austronesia Purba",
49230,
"map",
"Latn",
type = "reconstructed",
}
m["mis-hkl"] = {
"Hokkien Kelantan Peranakan",
108794818,
"qfa-mix",
ancestors = "nan-hbl, sou, mfa",
}
m["mis-idn"] = {
"Idiom Neutral",
35847,
"art",
"Latn",
type = "appendix-constructed",
}
m["mis-isa"] = {
"Isaurian",
16956868,
nil,
-- "Xsux, Hluw, Latn",
}
m["mis-jie"] = {
"Jie",
124424186,
nil,
"Hani",
sort_key = "Hani-sortkey",
}
m["mis-jzh"] = {
"Jizhao",
45242758,
"qfa-bej",
"Latn",
}
m["mis-kas"] = {
"Kassite",
35612,
nil,
"Xsux",
}
m["mis-mmd"] = {
"Mimi of Decorse",
6862206,
nil,
"Latn",
}
m["mis-mmn"] = {
"Mimi of Nachtigal",
6862207,
nil,
"Latn",
}
m["mis-phi"] = {
"Philistine",
2230924,
nil,
"Phnx",
-- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission)
}
m["mis-rou"] = {
"Rouran",
48816637,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tdl"] = {
"Turdulian",
133176492,
}
m["mis-tdt"] = {
"Turdetanian",
133176461,
}
m["mis-tnw"] = {
"Tangwang",
7683179,
"qfa-mix",
"Latn",
ancestors = "cmn, sce",
}
m["mis-tuh"] = {
"Tuyuhun",
48816625,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tuo"] = {
"Tuoba",
48816629,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-wuh"] = {
"Wuhuan",
118976867,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-xbi"] = {
"Xianbei",
4448647,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-xnu"] = {
"Xiongnu",
10901674,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mjg-mgl"] = {
"Mongghul",
53765528,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mjg-mgr"] = {
"Mangghuer",
56285392,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mkh-asl-pro"] = {
"Asli Purba",
55630680,
"mkh-asl",
"Latn",
type = "reconstructed",
}
m["mkh-ban-pro"] = {
"Bahnar Purba",
116773189,
"mkh-ban",
"Latn",
type = "reconstructed",
}
m["mkh-kat-pro"] = {
"Katu Purba",
116773772,
"mkh-kat",
"Latn",
type = "reconstructed",
}
m["mkh-khm-pro"] = {
"Khmu Purba",
116773774,
"mkh-khm",
"Latn",
type = "reconstructed",
}
m["mkh-kmr-pro"] = {
"Khmer Purba",
55630684,
"mkh-kmr",
"Latn",
type = "reconstructed",
}
m["mkh-mmn"] = {
"Mon Pertengahan",
121337926,
"mkh-mnc",
"Latn, Mymr", -- and also Pallava
ancestors = "omx",
}
m["mkh-mnc-pro"] = {
"Mon Purba",
116773231,
"mkh-mnc",
"Latn",
type = "reconstructed",
}
m["mkh-mvi"] = {
"Vietnam Pertengahan",
9199,
"mkh-vie",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mkh-pal-pro"] = {
"Palaung Purba",
104847372,
"mkh-pal",
"Latn",
type = "reconstructed",
}
m["mkh-pea-pro"] = {
"Pear Purba",
116773804,
"mkh-pea",
"Latn",
type = "reconstructed",
}
m["mkh-pkn-pro"] = {
"Pakan Purba",
116773803,
"mkh-pkn",
"Latn",
type = "reconstructed",
}
m["mkh-pro"] = { -- This will be merged into 2015 aav-pro.
"Mon-Khmer Purba",
7251859,
"mkh",
"Latn",
type = "reconstructed",
}
m["mnw-tha"] = { -- To be removed.
"Thai Mon",
nil,
"mkh-mnc",
"Mymr, Thai",
ancestors = "mkh-mmn",
sort_key = {
from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"},
to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"}
},
}
m["mkh-vie-pro"] = {
"Viet Purba",
109432616,
"mkh-vie",
"Latn",
type = "reconstructed",
}
m["mns-cen"] = {
"Mansi Tengah",
128810384,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-nor"] = {
"Mansi Utara",
30304537,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-pro"] = {
"Mansi Purba",
128883093,
"mns",
"Latn",
type = "reconstructed",
}
m["mns-sou"] = {
"Mansi Selatan",
30304629,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mun-pro"] = {
"Munda Purba",
105102373,
"mun",
"Latn",
type = "reconstructed",
}
m["myn-chl"] = { -- the stage after ''emy''
"Ch'olti'",
873995,
"myn",
"Latn",
}
m["myn-pro"] = {
"Maya Purba",
3321532,
"myn",
"Latn",
type = "reconstructed",
}
m["nai-ala"] = {
"Alazapa",
128810233,
nil,
"Latn",
}
m["nai-bay"] = {
"Bayogoula",
1563704,
nil,
"Latn",
}
m["nai-cal"] = {
"Calusa",
51782,
nil,
"Latn",
}
m["nai-chi"] = {
"Chiquimulilla",
25339627,
"nai-xin",
"Latn",
}
m["nai-chu-pro"] = {
"Chumash Purba",
116773736,
"nai-chu",
"Latn",
type = "reconstructed",
}
m["nai-cig"] = {
"Ciguayo",
20741700,
nil,
"Latn",
}
m["nai-ckn-pro"] = {
"Chinook Purba",
116773735,
"nai-ckn",
"Latn",
type = "reconstructed",
}
m["nai-guz"] = {
"Guazacapán",
19572028,
"nai-xin",
"Latn",
}
m["nai-hit"] = {
"Hitchiti",
1542882,
"nai-mus",
"Latn",
}
m["nai-ipa"] = {
"Ipai",
3027474,
"nai-yuc",
"Latn",
}
m["nai-jtp"] = {
"Jutiapa",
nil,
"nai-xin",
"Latn",
}
m["nai-jum"] = {
"Jumaytepeque",
25339626,
"nai-xin",
"Latn",
}
m["nai-kat"] = {
"Kathlamet",
6376639,
"nai-ckn",
"Latn",
}
m["nai-klp-pro"] = {
"Kalapuyan Purba",
116773771,
"nai-klp",
"Latn",
type = "reconstructed",
}
m["nai-knm"] = {
"Konomihu",
3198734,
"nai-shs",
"Latn",
}
m["nai-kum"] = {
"Kumeyaay",
4910139,
"nai-yuc",
"Latn",
}
m["nai-mac"] = {
"Macoris",
21070851,
nil,
"Latn",
}
m["nai-mdu-pro"] = {
"Maidun Purba",
116773784,
"nai-mdu",
"Latn",
type = "reconstructed",
}
m["nai-miz-pro"] = {
"Mixe-Zoque Purba",
7251858,
"nai-miz",
"Latn",
type = "reconstructed",
}
m["nai-mus-pro"] = {
"Muscogee Purba",
116775368,
"nai-mus",
"Latn",
type = "reconstructed",
}
m["nai-nao"] = {
"Naolan",
6964594,
nil,
"Latn",
}
m["nai-nrs"] = {
"New River Shasta",
7011254,
"nai-shs",
"Latn",
}
m["nai-okw"] = {
"Okwanuchu",
3350126,
"nai-shs",
"Latn",
}
m["nai-per"] = {
"Pericú",
3375369,
nil,
"Latn",
}
m["nai-pic"] = {
"Picuris",
7191257,
"nai-kta",
"Latn",
}
m["nai-plp-pro"] = {
"Penuti Penara Purba",
116773806,
"nai-plp",
"Latn",
type = "reconstructed",
}
m["nai-pom-pro"] = {
"Pomo Purba",
116773262,
"nai-pom",
"Latn",
type = "reconstructed",
}
m["nai-qng"] = {
"Quinigua",
36360,
nil,
"Latn",
}
m["nai-sca-pro"] = { -- NB: distinct from 'sio-pro' ("Proto-Siouan"), which is Proto-Western Siouan
"Sioux-Catawba Purba",
116773275,
"nai-sca",
"Latn",
type = "reconstructed",
}
m["nai-sin"] = {
"Sinacantán",
24190249,
"nai-xin",
"Latn",
}
m["nai-sln"] = {
"Salvadoran Lenca",
3229434,
"nai-len",
"Latn",
}
m["nai-spt"] = {
"Sahaptin",
3833015,
"nai-shp",
"Latn",
}
m["nai-tap"] = {
"Tapachultec",
7684401,
"nai-miz",
"Latn",
}
m["nai-taw"] = {
"Tawasa",
7689233,
nil,
"Latn",
}
m["nai-teq"] = {
"Tequistlatec",
2964454,
"nai-tqn",
"Latn",
}
m["nai-tip"] = {
"Tipai",
3027471,
"nai-yuc",
"Latn",
}
m["nai-tot-pro"] = {
"Totozoque Purba",
116773285,
"nai-tot",
"Latn",
type = "reconstructed",
}
m["nai-tsi-pro"] = {
"Tsimshian Purba",
nil,
"nai-tsi",
"Latn",
type = "reconstructed",
}
m["nai-utn-pro"] = {
"Uti Purba",
116773290,
"nai-utn",
"Latn",
type = "reconstructed",
}
m["nai-wai"] = {
"Waikuri",
3118702,
nil,
"Latn",
}
m["nai-wji"] = {
"Jicaque Barat",
3178610,
"nai-jcq",
"Latn",
}
m["nai-yup"] = {
"Yupiltepeque",
25339628,
"nai-xin",
"Latn",
}
m["nan-dat"] = {
"Datian Min",
19855572,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-hbl"] = {
"Hokkien",
1624231,
"zhx-nan",
"Hants, Latn, Bopo, Kana",
wikimedia_codes = "zh-min-nan",
generate_forms = "zh-generateforms",
sort_key = {
Hani = "Hani-sortkey",
Kana = "Kana-sortkey"
},
}
m["nan-hlh"] = {
"Min Hailufeng",
120755728,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-lnx"] = {
"Min Longyan",
6674568,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-tws"] = {
"Teochew",
36759,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["nan-zhe"] = {
"Min Zhenan",
3846710,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-zsh"] = {
"Min Sanxiang",
7420769,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["ngf-bin-pro"] = {
"Binanderean Purba",
137881672,
"ngf-bin",
"Latn",
type = "reconstructed",
}
m["ngf-pro"] = {
"Trans-New Guinea Purba",
85794785,
"ngf",
"Latn",
type = "reconstructed",
}
m["nic-bco-pro"] = {
"Benue-Congo Purba",
116773194,
"nic-bco",
"Latn",
type = "reconstructed",
}
m["nic-bod-pro"] = {
"Bantoid Purba",
116773190,
"nic-bod",
"Latn",
type = "reconstructed",
}
m["nic-eov-pro"] = {
"Oti-Volta Timur Purba",
116773753,
"nic-eov",
"Latn",
type = "reconstructed",
}
m["nic-gns-pro"] = {
"Gurunsi Purba",
116773759,
"nic-gns",
"Latn",
type = "reconstructed",
}
m["nic-grf-pro"] = {
"Grassfields Purba",
116773755,
"nic-grf",
"Latn",
type = "reconstructed",
}
m["nic-gur-pro"] = {
"Gur Purba",
116773758,
"nic-gur",
"Latn",
type = "reconstructed",
}
m["nic-jkn-pro"] = {
"Jukunoid Purba",
116773769,
"nic-jkn",
"Latn",
type = "reconstructed",
}
m["nic-lcr-pro"] = {
"Lower Cross River Purba",
116773782,
"nic-lcr",
"Latn",
type = "reconstructed",
}
m["nic-ogo-pro"] = {
"Ogoni Purba",
116773799,
"nic-ogo",
"Latn",
type = "reconstructed",
}
m["nic-ovo-pro"] = {
"Oti-Volta Purba",
116773802,
"nic-ovo",
"Latn",
type = "reconstructed",
}
m["nic-plt-pro"] = {
"Plateau Purba",
116773805,
"nic-plt",
"Latn",
type = "reconstructed",
}
m["nic-pro"] = {
"Niger-Congo Purba",
108000748,
"nic",
"Latn",
type = "reconstructed",
}
m["nic-ubg-pro"] = {
"Ubangi Purba",
116773818,
"nic-ubg",
"Latn",
type = "reconstructed",
}
m["nic-ucr-pro"] = {
"Upper Cross River Purba",
116773819,
"nic-ucr",
"Latn",
type = "reconstructed",
}
m["nic-vco-pro"] = {
"Volta-Congo Purba",
116773293,
"nic-vco",
"Latn",
type = "reconstructed",
}
m["njo-jgl"] = {
"Chungli Ao",
55607615,
"sit-aao",
"Latn",
}
m["nub-har"] = {
"Haraza",
19572059,
"nub",
"Arab, Latn",
}
m["nub-pro"] = {
"Nubia Purba",
116773246,
"nub",
"Latn",
type = "reconstructed",
}
m["omq-cha-pro"] = {
"Chatino Purba",
116773202,
"omq-cha",
"Latn",
type = "reconstructed",
}
m["omq-maz-pro"] = {
"Mazatec Purba",
116773790,
"omq-maz",
"Latn",
type = "reconstructed",
}
m["omq-mix-pro"] = {
"Mixtecan Purba",
21573423,
"omq-mix",
"Latn",
type = "reconstructed",
}
m["omq-mxt-pro"] = {
"Mixtec Purba",
21573424,
"omq-mxt",
"Latn",
type = "reconstructed",
}
m["omq-otp-pro"] = {
"Oto-Pamean Purba",
116773251,
"omq-otp",
"Latn",
type = "reconstructed",
}
m["omq-pro"] = {
"Oto-Manguean Purba",
33669,
"omq",
"Latn",
type = "reconstructed",
}
m["omq-sjq"] = {
"San Juan Quiahije Chatino",
138330751,
"omq-cha",
"Latn",
}
m["omq-tel"] = {
"Teposcolula Mixtec",
nil,
"omq-mxt",
"Latn",
}
m["omq-teo"] = {
"Teojomulco Chatino",
25340451,
"omq-cha",
"Latn",
}
m["omq-tri-pro"] = {
"Trique Purba",
116773817,
"omq-tri",
"Latn",
type = "reconstructed",
}
m["omq-zap-pro"] = {
"Zapotecan Purba",
116773297,
"omq-zap",
"Latn",
type = "reconstructed",
}
m["omq-zpc-pro"] = {
"Zapotec Purba",
116773296,
"omq-zpc",
"Latn",
type = "reconstructed",
}
m["omv-aro-pro"] = {
"Aroid Purba",
116773721,
"omv-aro",
"Latn",
type = "reconstructed",
}
m["omv-diz-pro"] = {
"Dizoid Purba",
116773750,
"omv-diz",
"Latn",
type = "reconstructed",
}
m["omv-pro"] = {
"Omo Purba",
116773800,
"omv",
"Latn",
type = "reconstructed",
}
m["oto-otm-pro"] = {
"Otomi Purba",
5908710,
"oto-otm",
"Latn",
type = "reconstructed",
}
m["oto-pro"] = {
"Otomian Purba",
116773252,
"oto",
"Latn",
type = "reconstructed",
}
m["paa-kmn"] = {
"Kómnzo",
18344310,
"paa-wko",
"Latn",
}
m["paa-kwn"] = {
"Kuwani",
6449056,
"qfa-unc", -- poorly attested, possibly the same as or related to Kalabra
"Latn",
}
m["paa-lei"] = {
"Leitre",
85776228,
"paa-isk",
}
m["paa-nha-pro"] = {
"Halmahera Utara Purba",
116773241,
"paa-nha",
"Latn",
type = "reconstructed",
}
m["paa-nun"] = {
"Nungon",
128807788,
"ngf-ynu",
"Latn",
}
m["phi-din"] = {
"Dinapigue Agta",
16945774,
"phi",
"Latn",
}
m["phi-kal-pro"] = {
"Kalamian Purba",
116773213,
"phi-kal",
"Latn",
type = "reconstructed",
}
m["phi-nag"] = {
"Nagtipunan Agta",
16966111,
"phi",
"Latn",
}
m["phi-pro"] = {
"Filipina Purba",
18204898,
"phi",
"Latn",
type = "reconstructed",
}
m["poz-abi"] = {
"Abai",
19570729,
"poz-san",
"Latn",
}
m["poz-bal"] = {
"Baliledo",
4850912,
"poz",
"Latn",
}
m["poz-btk-pro"] = {
"Bungku-Tolaki Purba",
116773724,
"poz-btk",
"Latn",
type = "reconstructed",
}
m["poz-cet-pro"] = {
"Melayu-Polinesia Tengah Timur Purba",
2269883,
"poz-cet",
"Latn",
type = "reconstructed",
}
m["poz-hce-pro"] = {
"Halmahera Cenderawasih Purba",
116773209,
"poz-hce",
"Latn",
type = "reconstructed",
}
m["poz-lgx-pro"] = {
"Lampung Purba",
116773222,
"poz-lgx",
"Latn",
type = "reconstructed",
}
m["poz-mcm-pro"] = {
"Melayu-Chamik Purba",
116773225,
"poz-mcm",
"Latn",
type = "reconstructed",
}
m["poz-mic-pro"] = {
"Mikronesia Purba",
111939079,
"poz-mic",
"Latn",
type = "reconstructed",
}
m["poz-mly-pro"] = {
"Melayik Purba",
98057728,
"poz-mly",
"Latn",
type = "reconstructed",
}
m["poz-msa-pro"] = {
"Melayu-Sumbawa Purba",
116773226,
"poz-msa",
"Latn",
type = "reconstructed",
}
m["poz-oce-pro"] = {
"Oceania Purba",
141741,
"poz-oce",
"Latn",
type = "reconstructed",
}
m["poz-pep-pro"] = {
"Polinesia Timur Purba",
113988745,
"poz-pep",
"Latn",
type = "reconstructed",
}
m["poz-pnp-pro"] = {
"Polinesia Teras Purba",
113988746,
"poz-pnp",
"Latn",
type = "reconstructed",
}
m["poz-pol-pro"] = {
"Polinesia Purba",
1658709,
"poz-pol",
"Latn",
type = "reconstructed",
}
m["poz-pro"] = {
"Melayu-Polinesia Purba",
3832960,
"poz",
"Latn",
type = "reconstructed",
}
m["poz-sml"] = {
"Melayu Sarawak",
4251702,
"poz-mly",
"Latn, ms-Arab",
}
m["poz-ssw-pro"] = {
"Sulawesi Selatan Purba",
116773279,
"poz-ssw",
"Latn",
type = "reconstructed",
}
m["poz-swa-pro"] = {
"Sarawak Utara Purba",
116773243,
"poz-swa",
"Latn",
type = "reconstructed",
}
m["poz-ter"] = {
"Melayu Terengganu",
4207412,
"poz-mly",
"Latn, ms-Arab",
}
m["pqe-pro"] = {
"Melayu-Polinesia Timur Purba",
2269883,
"pqe",
"Latn",
type = "reconstructed",
}
m["pra-niy"] = {
"Prakrit Niya",
11991601,
"inc-mid",
"Khar",
ancestors = "inc-ash",
translit = "Khar-translit",
}
m["qfa-adm-pro"] = {
"Andaman Raya Purba",
116773756,
"qfa-adm",
"Latn",
type = "reconstructed",
}
m["qfa-bet-pro"] = {
"Be-Tai Purba",
116773193,
"qfa-bet",
"Latn",
type = "reconstructed",
}
m["qfa-cka-pro"] = {
"Chukotko-Kamchatka Purba",
7251837,
"qfa-cka",
"Latn",
type = "reconstructed",
}
m["qfa-hur-pro"] = {
"Hurro-Urartu Purba",
116773211,
"qfa-hur",
"Latn",
type = "reconstructed",
}
m["qfa-kad-pro"] = {
"Kadu Purba",
116773770,
"qfa-kad",
"Latn",
type = "reconstructed",
}
m["qfa-kms-pro"] = {
"Kam-Sui Purba",
55630682,
"qfa-kms",
"Latn",
type = "reconstructed",
}
m["qfa-kor-pro"] = {
"Korea Purba",
467883,
"qfa-kor",
"Latn",
type = "reconstructed",
}
m["qfa-kra-pro"] = {
"Kra Purba",
7251854,
"qfa-kra",
"Latn",
type = "reconstructed",
}
m["qfa-lic-pro"] = {
"Hlai Purba",
7251845,
"qfa-lic",
"Latn",
type = "reconstructed",
}
m["qfa-onb-pro"] = {
"Be Purba",
116773192,
"qfa-onb",
"Latn",
type = "reconstructed",
}
m["qfa-ong-pro"] = {
"Ongan Purba",
116773801,
"qfa-ong",
"Latn",
type = "reconstructed",
}
m["qfa-tak-pro"] = {
"Kra-Dai Purba",
104901616,
"qfa-tak",
"Latn",
type = "reconstructed",
}
m["qfa-yen-pro"] = {
"Yenisei Purba",
27639,
"qfa-yen",
"Latn",
type = "reconstructed",
}
m["qfa-yuk-pro"] = {
"Yukaghir Purba",
116773294,
"qfa-yuk",
"Latn",
type = "reconstructed",
}
m["qwe-kch"] = {
"Kichwa",
1740805,
"qwe",
"Latn",
ancestors = "qu",
}
m["qwe-pro"] = {
"Quechua Purba",
5575757,
"qwe",
"Latn",
type = "reconstructed",
}
m["roa-ang"] = {
"Angevin",
56782,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-bbn"] = {
"Bourbonnais-Berrichon",
2899128,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-brg"] = {
"Bourguignon",
508332,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-can"] = {
"Cantabria",
917021,
"roa-asl",
"Latn",
}
m["roa-cha"] = {
"Champenois",
430018,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-fcm"] = {
"Franc-Comtois",
510561,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gal"] = {
"Gallo",
37300,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gib"] = {
"Gallo-Italic of Basilicata",
3094838,
"roa-git",
"Latn",
ancestors = "pms-old",
}
m["roa-gis"] = {
"Gallo-Italic of Sicily",
2629019,
"roa-git",
"Latn",
ancestors = "pms-old",
}
m["roa-leo"] = {
"Leon",
34108,
"roa-asl",
"Latn",
}
m["roa-lor"] = {
"Lorrain",
671198,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-oca"] = {
"Catalonia Kuno",
15478520,
"roa-ocr",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"},
}
m["roa-ole"] = {
"Leon Kuno",
125977465,
"roa-asl",
"Latn",
}
m["roa-ona"] = {
"Navarro-Aragon Kuno",
2736184,
"roa-nar",
"Latn",
}
m["roa-opt"] = {
"Galicia-Portugis Kuno",
1072111,
"roa-gap",
"Latn",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ},
}
m["roa-orl"] = {
"Orléanais",
28497058,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-poi"] = {
"Poitevin-Saintongeais",
514123,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-tar"] = {
"Tarantino",
695526,
"roa-itr",
"Latn",
wikimedia_codes = "roa-tara",
}
m["sai-all"] = {
"Allentiac",
19570789,
"sai-hrp",
"Latn",
}
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
"Andoquero",
16828359,
"sai-wit",
"Latn",
}
m["sai-ayo"] = {
"Ayomán",
16937754,
"sai-jir",
"Latn",
}
m["sai-bae"] = {
"Baenan",
3401998,
"qfa-unc", -- extinct, poorly attested; only known through 9 words
"Latn",
}
m["sai-bag"] = {
"Bagua",
5390321,
"qfa-unc", -- extinct, poorly attested; possibly Cariban
"Latn",
}
m["sai-bet"] = {
"Betoi",
926551,
"qfa-iso",
"Latn",
}
m["sai-bor-pro"] = {
"Boran Purba",
nil,
"sai-bor",
"Latn",
type = "reconstructed",
}
m["sai-cac"] = {
"Cacán",
945482,
"qfa-unc", -- extinct, poorly attested; no consensus on classification
"Latn",
}
m["sai-caq"] = {
"Caranqui",
2937753,
"sai-bar",
"Latn",
}
m["sai-car-pro"] = {
"Cariban Purba",
116773196,
"sai-car",
"Latn",
type = "reconstructed",
}
m["sai-cat"] = {
"Catacao",
5051136,
"sai-ctc",
"Latn",
}
m["sai-cer-pro"] = {
"Cerrado Purba",
116773200,
"sai-cer",
"Latn",
type = "reconstructed",
}
m["sai-chi"] = {
"Chirino",
5390321,
"qfa-unc", -- extinct, only four words known; possibly related to Candoshi-Shapra (cbu)
"Latn",
}
m["sai-chn"] = {
"Chaná",
5072718,
"sai-crn",
"Latn",
}
m["sai-chp"] = {
"Chapacura",
5072884,
"sai-cpc",
"Latn",
}
m["sai-chr"] = {
"Charrua",
5086680,
"sai-crn",
"Latn",
}
m["sai-chu"] = {
"Churuya",
5118339,
"sai-guh",
"Latn",
}
m["sai-cje-pro"] = {
"Jê Tengah Purba",
116773198,
"sai-cje",
"Latn",
type = "reconstructed",
}
m["sai-cmg"] = {
"Comechingon",
6644203,
"qfa-unc", -- extinct, poorly attested; no consensus on classification
"Latn",
}
m["sai-cno"] = {
"Chono",
5104704,
"qfa-unc", -- extinct, poorly attested; no consensus on classification, possibly spurious
"Latn",
}
m["sai-cnr"] = {
"Cañari",
5055572,
"qfa-unc", -- extinct, poorly attested; possibly Chimuan or Barbacoan
"Latn",
}
m["sai-coe"] = {
"Coeruna",
6425639,
"sai-wit",
"Latn",
}
m["sai-col"] = {
"Colán",
5141893,
"sai-ctc",
"Latn",
}
m["sai-cop"] = {
"Copallén",
5390321,
"qfa-unc", -- extinct, only four words attested; possibly Cholonan
"Latn",
}
m["sai-crd"] = {
"Coroado Puri",
24191321,
"sai-mje",
"Latn",
}
m["sai-ctq"] = {
"Catuquinaru",
16858455,
"qfa-unc", -- extinct, poorly attested; vocabulary does not resemble other languages
"Latn",
}
m["sai-cul"] = {
"Culli",
2879660,
"qfa-unc", -- extinct, poorly attested; often considered an isolate
"Latn",
}
m["sai-cva"] = {
"Cueva",
5192644,
"qfa-unc", -- extinct, poorly attested; possibly Chocoan
"Latn",
}
m["sai-esm"] = {
"Esmeralda",
3058083,
"qfa-unc", -- extinct, poorly attested; possibly related to Yaruro
"Latn",
}
m["sai-ewa"] = {
"Ewarhuyana",
16898104,
nil,
"Latn",
}
m["sai-gam"] = {
"Gamela",
5403661,
"qfa-unc", -- extinct, poorly attested; possibly an isolate
"Latn",
}
m["sai-gay"] = {
"Gayón",
5528902,
"sai-jir",
"Latn",
}
m["sai-gmo"] = {
"Guamo",
5613495,
"qfa-unc", -- extinct; "Kaufman (1990) finds a connection with the Chapacuran languages convincing." [Wikipedia] Considered an isolate by Campbell (2024).
"Latn",
}
m["sai-gua"] = {
"Guachí",
5613172,
"sai-guc",
"Latn",
}
m["sai-gue"] = {
"Güenoa",
5626799,
"sai-crn",
"Latn",
}
m["sai-hau"] = {
"Haush",
3128376,
"sai-cho",
"Latn",
}
m["sai-jee-pro"] = {
"Jê Purba",
116773212,
"sai-jee",
"Latn",
type = "reconstructed",
}
m["sai-jko"] = {
"Jeikó",
6176527,
"sai-mje",
"Latn",
}
m["sai-jrj"] = {
"Jirajara",
6202966,
"sai-jir",
"Latn",
}
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
"Katembri",
6375925,
"qfa-unc", -- extinct, poorly attested; "Kaufman (1990) has linked it with the nearly extinct Taruma, although this has not been accepted by other scholars." [Wikipedia]
"Latn",
}
m["sai-mal"] = {
"Malalí",
6741212,
"sai-mje", -- considered the most divergent Maxakalían language (a subdivision of Macro-Jê), for which we have no entry
"Latn",
}
m["sai-mar"] = {
"Maratino",
6755055,
"qfa-unc", -- extinct, poorly attested; possibly Uto-Aztecan
"Latn",
}
m["sai-mat"] = {
"Matanawi",
6786047,
"qfa-unc", -- extinct; either an isolate or distantly related to the Muran languages; Campbell (2024) lists it as an isolate, Glottolog gives it as unclassified
"Latn",
}
m["sai-mcn"] = {
"Mocana",
3402048,
"qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade)
"Latn",
}
m["sai-men"] = {
"Menien",
16890110,
"sai-mje",
"Latn",
}
m["sai-mil"] = {
"Millcayac",
19573012,
"sai-hrp",
"Latn",
}
m["sai-mlb"] = {
"Malibu",
3402048,
"qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade)
"Latn",
}
m["sai-msk"] = {
"Masakará",
6782426,
"sai-mje",
"Latn",
}
m["sai-muc"] = {
"Mucuchí",
6931290,
nil, -- generally considered Timotean, for which we have no entry
"Latn",
}
m["sai-mue"] = {
"Muellama",
16886936,
"sai-bar",
"Latn",
}
m["sai-muz"] = {
"Muzo",
6644203,
"qfa-unc", -- extinct language of Colombia, poorly attested; may be Pijao (Cariban)
"Latn",
}
m["sai-mys"] = {
"Maynas",
16919393,
"sai-cah", -- per Campbell (2024); formerly considered unclassified
"Latn",
}
m["sai-nat"] = {
"Natú",
9006749,
"qfa-unc", -- extinct, poorly attested; "only Greenberg dares to classify [it]". [Wikipedia, quoting Moseley, Christopher; Asher, R. E.; Tait, Mary (1994), Atlas of the world's languages]
"Latn",
}
m["sai-nje-pro"] = {
"Jê Utara Purba",
116773245,
"sai-nje",
"Latn",
type = "reconstructed",
}
m["sai-opo"] = {
"Opón",
7099152,
"sai-car",
"Latn",
}
m["sai-oto"] = {
"Otomaco",
16879234,
"sai-otm",
"Latn",
}
m["sai-pal"] = {
"Palta",
3042978,
"qfa-unc", -- extinct, unclassified; possibly Chicham
"Latn",
}
m["sai-pam"] = {
"Pamigua",
5908689,
"sai-otm",
"Latn",
}
m["sai-par"] = {
"Paratió",
16890038,
"qfa-unc", -- extinct, poorly attested; possibly Xukuruan
"Latn",
}
m["sai-peb"] = {
"Peba",
3373890,
"sai-pey",
"Latn",
}
m["sai-pnz"] = {
"Panzaleo",
3123275,
"qfa-unc", -- extinct, unclassified; possibly Paezan
"Latn",
}
m["sai-prh"] = {
"Puruhá",
3410994,
"qfa-unc", -- extinct, poorly attested; possibly in a family with Cañari
"Latn",
}
m["sai-ptg"] = {
"Patagón",
128807870,
"sai-tar", -- extinct, only known from 4 words, which suggest Cariban lineage (Campbell 2024)
"Latn",
}
m["sai-pur"] = {
"Purukotó",
7261622,
"sai-pem",
"Latn",
}
m["sai-pyg"] = {
"Payaguá",
7156643,
"sai-guc",
"Latn",
}
m["sai-pyk"] = {
"Pykobjê",
98113977,
"sai-nje",
"Latn",
}
m["sai-qmb"] = {
"Quimbaya",
7272043,
"qfa-unc", -- extinct, might not exist; few known words
"Latn",
}
m["sai-qtm"] = {
"Quitemo",
7272651,
"sai-cpc",
"Latn",
}
m["sai-rab"] = {
"Rabona",
6644203,
"qfa-unc", -- extinct, poorly attested, mostly plant names; possibly Candoshi-Shapra
"Latn",
}
m["sai-ram"] = {
"Ramanos",
16902824,
"qfa-unc", -- extinct, poorly attested, possibly an isolate; per Glottolog: "the minuscule wordlist ... shows no convincing resemblances to surrounding languages"
"Latn",
}
m["sai-sac"] = {
"Sácata",
5390321,
"qfa-unc", -- extinct, only 3 words known; possibly Candoshí or Arawakan
"Latn",
}
m["sai-san"] = {
"Sanaviron",
16895999,
"qfa-unc", -- extinct, unclassified; no consensus on classification
"Latn",
}
m["sai-sap"] = {
"Sapará",
7420922,
"sai-car",
"Latn",
}
m["sai-sec"] = {
"Sechura",
7442912,
"qfa-unc", -- extinct, poorly attested; possibly Catacaoan
"Latn",
}
m["sai-sin"] = {
"Sinúfana",
7525275,
"qfa-unc", -- moribund, poorly attested; possibly Chocoan
"Latn",
}
m["sai-sje-pro"] = {
"Jê Selatan Purba",
116773814,
"sai-sje",
"Latn",
type = "reconstructed",
}
m["sai-tab"] = {
"Tabancale",
5390321,
"qfa-unc", -- extinct, only 5 words known; no obvious connections, might be an isolate
"Latn",
}
m["sai-tal"] = {
"Tallán",
16910468,
"qfa-unc", -- extinct, poorly attested; might be Catacaoan
"Latn",
}
m["sai-tap"] = {
"Tapayuna",
30719984,
"sai-nje",
"Latn",
}
m["sai-tar-pro"] = {
"Taranoan Purba",
116773816,
"sai-tar",
"Latn",
type = "reconstructed",
}
m["sai-teu"] = {
"Teushen",
3519243,
"qfa-unc", -- probably extinct by the 1950s; possibly Chonan
"Latn",
}
m["sai-tim"] = {
"Timote",
7806995,
nil, -- possibly in a small Timotean family
"Latn",
}
m["sai-tpr"] = {
"Taparita",
7684460,
"sai-otm",
"Latn",
}
m["sai-trr"] = {
"Tarairiú",
7685313,
"qfa-unc", -- extinct, too poorly attested to classify
"Latn",
}
m["sai-wai"] = {
"Waitaká",
16918610,
"qfa-unc", -- extinct, possibly Purian
"Latn",
}
m["sai-way"] = {
"Wayumara",
7960726,
"sai-car",
"Latn",
}
m["sai-wit-pro"] = {
"Witotoan Purba",
116773823,
"sai-wit",
"Latn",
type = "reconstructed",
}
m["sai-wnm"] = {
"Wanham",
16879440,
"sai-cpc",
"Latn",
}
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
"Xocó",
12953620,
"qfa-unc", -- extinct and poorly attested; not clear if one or three languages
"Latn",
}
m["sai-yao"] = {
"Yao (Amerika Selatan)",
16979655,
"sai-ven",
"Latn",
}
m["sai-yar"] = { -- not the same family as 'suy'
"Yarumá",
3505859,
"sai-pek",
"Latn",
}
m["sai-yri"] = {
"Yuri",
2669157,
"sai-tyu",
"Latn",
}
m["sai-yup"] = {
"Yupua",
8061430,
"sai-tuc",
"Latn",
}
m["sai-yur"] = {
"Yurumanguí",
1281291,
"qfa-unc", -- extinct, too poorly attested to classify
"Latn",
}
m["sal-pro"] = {
"Salish Purba",
116773269,
"sal",
"Latn",
type = "reconstructed",
}
m["sdv-daj-pro"] = {
"Daju Purba",
116773739,
"sdv-daj",
"Latn",
type = "reconstructed",
}
m["sdv-eje-pro"] = {
"Jabal Timur Purba",
116773751,
"sdv-eje",
"Latn",
type = "reconstructed",
}
m["sdv-nil-pro"] = {
"Nil Purba",
116773794,
"sdv-nil",
"Latn",
type = "reconstructed",
}
m["sdv-nyi-pro"] = {
"Nyima Purba",
116773796,
"sdv-nyi",
"Latn",
type = "reconstructed",
}
m["sdv-tmn-pro"] = {
"Taman Purba",
116773815,
"sdv-tmn",
"Latn",
type = "reconstructed",
}
m["sel-nor"] = {
"Selkup Utara",
30304565,
"sel",
"Cyrl",
translit = "sel-nor-translit",
}
m["sel-pro"] = {
"Selkup Purba",
128884235,
"sel",
"Latn",
type = "reconstructed",
}
m["sel-sou"] = {
"Selkup Selatan",
30304639,
"sel",
"Cyrl",
translit = "sel-sou-translit",
}
m["sem-amm"] = {
"Ammun",
279181,
"sem-can",
"Phnx",
-- Phnx translit in [[Module:scripts/data]]
}
m["sem-amo"] = {
"Amorit",
35941,
"sem-nwe",
"Xsux, Latn",
}
m["sem-cha"] = {
"Chaha",
35543,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["sem-dad"] = {
"Dadanitic",
21838040,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-dum"] = {
"Dumaitic",
128810397,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-has"] = {
"Hasaitic",
3541433,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-his"] = {
"Hismaic",
22948260,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-mhr"] = {
"Muher",
33743,
"sem-eth",
"Latn",
}
m["sem-pro"] = {
"Samiah Purba",
1658554,
"sem",
"Latn",
type = "reconstructed",
}
m["sem-saf"] = {
"Safaitic",
472586,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-sam"] = {
"Samalia",
85847147,
"sem-nwe",
"Phnx",
-- Phnx translit in [[Module:scripts/data]]
}
m["sem-srb"] = {
"Arab Selatan Kuno",
35025,
"sem-osa",
"Sarb",
-- Sarb translit in [[Module:scripts/data]]
}
m["sem-tay"] = {
"Taymanitic",
24912301,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-tha"] = {
"Thamudic",
843030,
"sem-cen",
"Narb",
-- Narb translit in [[Module:scripts/data]]
}
m["sem-wes-pro"] = {
"Samiah Barat Purba",
98021726,
"sem-wes",
"Latn",
type = "reconstructed",
}
m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro'
"Sioux Purba",
34181,
"sio",
"Latn",
type = "reconstructed",
}
m["sit-aao-pro"] = {
"Naga Tengah Purba",
nil,
"sit-aao",
"Latn",
type = "reconstructed",
}
m["sit-bai-pro"] = {
"Bai Purba",
nil,
"sit-bai",
"Latn",
type = "reconstructed",
}
m["sit-ban"] = {
"Bangru",
56071779,
"sit-hrs",
"Latn",
}
m["sit-bdi-pro"] = {
"Bodish Purba",
nil,
"sit-bdi",
"Latn",
type = "reconstructed",
}
m["sit-bok"] = {
"Bokar",
4938727,
"sit-tan",
"Latn, Tibt",
override_translit = true,
-- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["sit-cai"] = {
"Caijia",
5017528,
"sit-cln",
"Latn",
}
m["sit-cha"] = {
"Chairel",
5068066,
"sit-luu",
"Latn",
}
m["sit-ers-pro"] = {
"Ersuic Purba",
nil,
"sit-ers",
"Latn",
type = "reconstructed",
}
m["sit-hrs-pro"] = {
"Hrusish Purba",
116773762,
"sit-hrs",
"Latn",
type = "reconstructed",
}
m["sit-jap"] = {
"Japhug",
3162245,
"sit-egy",
"Latn",
}
m["sit-kha-pro"] = {
"Kham Purba",
116773773,
"sit-kha",
"Latn",
type = "reconstructed",
}
m["sit-khb-pro"] = {
"Kho-Bwa Purba",
nil,
"sit-khb",
"Latn",
type = "reconstructed",
}
m["sit-khp-pro"] = {
"Puroik Purba",
nil,
"sit-khb",
"Latn",
type = "reconstructed",
}
m["sit-khw-pro"] = {
"Kho-Bwa Barat Purba",
nil,
"sit-khw",
"Latn",
type = "reconstructed",
}
m["sit-kon-pro"] = {
"Naga Utara Purba",
nil,
"sit-kon",
"Latn",
type = "reconstructed",
}
m["sit-liz"] = {
"Lizu",
6660653,
"sit-ers",
"Latn", -- and Ersu Shaba
}
m["sit-lnj"] = {
"Longjia",
17096251,
"sit-cln",
"Latn",
}
m["sit-lrn"] = {
"Luren",
16946370,
"sit-cln",
"Latn",
}
m["sit-luu-pro"] = {
"Luish Purba",
116773783,
"sit-luu",
"Latn",
type = "reconstructed",
}
m["sit-nas-pro"] = {
"Naish Purba",
nil,
"sit-nas",
"Latn",
type = "reconstructed",
}
m["sit-prn"] = {
"Puiron",
7259048,
"sit-zem",
}
m["sit-pro"] = {
"Sino-Tibet Purba",
24839178,
"sit",
"Latn",
type = "reconstructed",
}
m["sit-sit"] = {
"Situ",
19840830,
"sit-egy",
"Latn",
}
m["sit-tam-pro"] = {
"Tamang Purba",
117469295,
"sit-tam",
"Latn",
type = "reconstructed",
}
m["sit-tan-pro"] = {
"Tani Purba",
116773284,
"sit-tan",
"Latn", -- needs verification
type = "reconstructed",
}
m["sit-tgm"] = {
"Tangam",
17041370,
"sit-tan",
"Latn",
}
m["sit-tng-pro"] = {
"Tangkhulic Purba",
nil,
"sit-tng",
"Latn",
type = "reconstructed",
}
m["sit-tos"] = {
"Tosu",
7827899,
"sit-ers",
"Latn", -- also Ersu Shaba
}
m["sit-tsh"] = {
"Tshobdun",
19840950,
"sit-egy",
"Latn",
}
m["sit-zbu"] = {
"Zbu",
19841106,
"sit-egy",
"Latn",
}
m["sla-pro"] = {
"Slavik Purba",
747537,
"sla",
"Latn",
type = "reconstructed",
strip_diacritics = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {'ś'},
},
sort_key = {
from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"},
to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"},
}
}
m["smi-pro"] = {
"Sami Purba",
7251862,
"smi",
"Latn",
type = "reconstructed",
sort_key = {
from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"},
to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"}
},
}
m["son-pro"] = {
"Songhay Purba",
116773277,
"son",
"Latn",
type = "reconstructed",
}
m["sqj-pro"] = {
"Albania Purba",
18210846,
"sqj",
"Latn",
type = "reconstructed",
}
m["ssa-klk-pro"] = {
"Kuliak Purba",
116773779,
"ssa-klk",
"Latn",
type = "reconstructed",
}
m["ssa-kom-pro"] = {
"Koman Purba",
116773775,
"ssa-kom",
"Latn",
type = "reconstructed",
}
m["ssa-pro"] = {
"Nilo-Sahara Purba",
116773236,
"ssa",
"Latn",
type = "reconstructed",
}
m["syd-pro"] = {
"Samoyed Purba",
7251863,
"syd",
"Latn",
type = "reconstructed",
}
m["tai-pro"] = {
"Tai Purba",
6583709,
"tai",
"Latn",
type = "reconstructed",
}
m["tai-swe-pro"] = {
"Tai Barat Daya Purba",
116773280,
"tai-swe",
"Latn",
type = "reconstructed",
}
m["tbq-bdg-pro"] = {
"Bodo-Garo Purba",
116773195,
"tbq-bdg",
"Latn",
type = "reconstructed",
}
m["tbq-blg"] = {
"Bailang",
2879843,
"tbq-lob",
"Hani",
sort_key = "Hani-sortkey",
}
m["tbq-brm-pro"] = {
"Burma Purba",
nil,
"tbq-brm",
"Latn",
type = "reconstructed",
}
m["tbq-gkh"] = {
"Gokhy",
5578069,
"tbq-sil",
"Latn",
}
m["tbq-kuk-pro"] = {
"Kukish Purba",
116773220,
"tbq-kuk",
"Latn",
type = "reconstructed",
}
m["tbq-lal-pro"] = {
"Lalo Purba",
116773781,
"tbq-lal",
"Latn",
type = "reconstructed",
}
m["tbq-laz"] = {
"Laze",
17007626,
"sit-nas",
"Latn",
}
m["tbq-lob-pro"] = {
"Lolo-Burma Purba",
116773224,
"tbq-lob",
"Latn",
type = "reconstructed",
}
m["tbq-lol-pro"] = {
"Lolo Purba",
7251855,
"tbq-lol",
"Latn",
type = "reconstructed",
}
m["tbq-mil"] = {
"Milang",
6850761,
"sit-gsi",
"Deva, Latn",
}
m["tbq-mor"] = {
"Moran",
6909216,
"tbq-bdg",
"Latn",
}
m["tbq-ngo"] = {
"Ngochang",
56582,
"tbq-brm",
"Latn",
}
-- tbq-pro is now etymology-only
m["trk-dkh"] = {
"Dukhan",
12809273,
"trk-ssb",
"Latn, Cyrl, Mong",
-- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]]
}
-- As described in Mahmud al-Kashgari's 11th-century ''Dīwān Lughāt al-Turk''.
m["trk-eog"] = {
"Oghuz Kuno Awal",
nil,
"trk-ogz",
"ota-Arab",
strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"},
}
m["trk-oat"] = {
"Turki Anatolia Kuno",
7083390,
"trk-ogz",
"ota-Arab",
strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"},
}
m["trk-pro"] = {
"Turk Purba",
3657773,
"trk",
"Latn",
type = "reconstructed",
standard_chars = {
Latn = " ()-abdegiklmnoprstuxyzïöüāčēīĺŋōŕšūǖȫẹ" .. c.macron,
}
}
m["tup-gua-pro"] = {
"Tupi-Guarani Purba",
116773288,
"tup-gua",
"Latn",
type = "reconstructed",
}
m["tup-kab"] = {
"Kabishiana",
15302988,
"tup",
"Latn",
}
m["tup-pro"] = {
"Tupi Purba",
10354700,
"tup",
"Latn",
type = "reconstructed",
}
m["tuw-alk"] = {
"Alchuka",
113553616,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-bal"] = {
"Bala",
86730632,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kkl"] = {
"Kyakala",
118875708,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kli"] = {
"Kili",
6406892,
"tuw-ewe",
"Cyrl",
}
m["tuw-pro"] = {
"Tungus Purba",
85872335,
"tuw",
"Latn",
type = "reconstructed",
}
m["tuw-sol"] = {
"Solon",
30004,
"tuw-ewe",
}
m["urj-fin-pro"] = {
"Finnik Purba",
11883720,
"urj-fin",
"Latn",
type = "reconstructed",
}
m["urj-koo"] = {
"Komi Kuno",
86679962,
"kv",
"Perm, Cyrs",
translit = "urj-koo-translit",
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; previously, Cyrs strip_diacritics not present
}
m["urj-kuk"] = {
"Kukkuzi",
107410460,
"urj-fin",
"Latn",
ancestors = "vot",
}
m["urj-kya"] = {
"Komi-Yazva",
2365210,
"kv",
"Cyrl",
translit = "kv-translit",
override_translit = true,
strip_diacritics = {remove_diacritics = c.acute},
}
m["urj-mdv-pro"] = {
"Mordvin Purba",
116773232,
"urj-mdv",
"Latn",
type = "reconstructed",
}
m["urj-prm-pro"] = {
"Perm Purba",
116773257,
"urj-prm",
"Latn",
type = "reconstructed",
}
m["urj-pro"] = {
"Ural Purba",
288765,
"urj",
"Latn",
type = "reconstructed",
}
m["urj-ugr-pro"] = {
"Ugri Purba",
156631,
"urj-ugr",
"Latn",
type = "reconstructed",
}
m["xnd-pro"] = {
"Na-Dene Purba",
116773233,
"xnd",
"Latn",
type = "reconstructed",
}
m["xgn-pro"] = {
"Mongol Purba",
2493677,
"xgn",
"Latn",
type = "reconstructed",
sort_key = {
from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"},
to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]},
},
}
m["yok-bvy"] = {
"Yokuts Buena Vista",
4985474,
"yok",
"Latn",
}
m["yok-dly"] = {
"Yokuts Delta",
70923266,
"yok",
"Latn",
}
m["yok-gsy"] = {
"Gashowu",
3098708,
"yok",
"Latn",
}
m["yok-kry"] = {
"Yokuts Sungai Kings",
6413014,
"yok",
"Latn",
}
m["yok-nvy"] = {
"Yokuts Lembah Utara",
85789777,
"yok",
"Latn",
}
m["yok-ply"] = {
"Yokuts Palewyami",
2387391,
"yok",
"Latn",
}
m["yok-svy"] = {
"Yokuts Lembah Selatan",
12642473,
"yok",
"Latn",
}
m["yok-tky"] = {
"Yokuts Tule-Kaweah",
7851988,
"yok",
"Latn",
}
m["ypk-pro"] = {
"Yupik Purba",
116773295,
"ypk",
"Latn",
type = "reconstructed",
}
m["yrk-for"] = {
"Forest Nenets",
1295107,
"yrk",
"Cyrl",
translit = "yrk-for-translit",
strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. c.dotabove},
}
m["yrk-tun"] = {
"Tundra Nenets",
36452,
"yrk",
"Cyrl",
strip_diacritics = {
from = {"ӑ", "а̄", "э̇", "ӣ", "ы̄", "ӯ", "ю̄", "я̆", "я̄"},
to = {"а", "а", "э", "и", "ы", "у", "ю", "я", "я"},
},
translit = "yrk-tun-translit",
}
m["zhx-min-pro"] = {
"Min Purba",
19646347,
"zhx-min",
"Latn",
type = "reconstructed",
}
m["zhx-sht"] = {
"Shaozhou Tuhua",
1920769,
"zhx",
"Nshu, Hants",
generate_forms = "zh-generateforms",
sort_key = {Hani = "Hani-sortkey"},
}
m["zhx-sic"] = {
"Sichuan",
2278732,
"zhx-man",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zhx-tai"] = {
"Taishan",
2208940,
"zhx-yue",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zle-ono"] = {
"Novgorodia Kuno",
162013,
"zle",
"Cyrs, Glag",
translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["zle-ort"] = {
"Ruthenia Kuno",
13211,
"zle",
"Arab, Cyrs, Latn",
ancestors = "orv",
translit = {
Cyrs = "zle-ort-translit",
Arab = "zle-ort-Arab-translit",
},
strip_diacritics = {
Cyrs = {
remove_diacritics = m_langdata.chars_substitutions["Cyrs_remove_diacritics"],
remove_exceptions = {"Ї", "ї"},
},
Arab = "ar-stripdiacritics",
},
-- Cyrs sort_key in [[Module:scripts/data]]
}
m["zls-chs"] = {
"Slav Gereja",
33251,
"zls",
"Cyrs, Glag, Latn",
ancestors = "cu",
translit = {
Cyrs = "Cyrs-translit",
Glag = "Glag-translit"
},
-- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]
}
m["zlw-ocs"] = {
"Czech Kuno",
593096,
"zlw",
"Latn",
}
m["zlw-opl"] = {
"Poland Kuno",
149838,
"zlw-lch",
"Latn",
strip_diacritics = {remove_diacritics = c.ringabove},
}
m["zlw-osk"] = {
"Slovak Kuno",
12776676,
"zlw",
"Latn",
}
m["zlw-slv"] = {
"Slovincia",
36822,
"zlw-pom",
"Latn",
strip_diacritics = {remove_diacritics = c.macron .. c.breve},
}
m["zlm-coa"] = {
"Melayu Terengganu Pesisir",
4207412,
"poz-mly",
"Latn, ms-Arab",
}
m["zlm-pah"] = {
"Melayu Pahang",
7310370,
"poz-mly",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
b5nkainh67h0t8own0oi8tskhrtcnqc
281322
281316
2026-04-21T19:44:55Z
Hakimi97
2668
Undid revision [[Special:Diff/281316|281316]] by [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|talk]])
281322
Scribunto
text/plain
local m_lang = require("Module:languages")
local m_langdata = require("Module:languages/data")
local u = require("Module:string utilities").char
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["aav-khs-pro"] = {
"Khasi Purba",
116773216,
"aav-khs",
"Latn",
type = "reconstructed",
}
m["aav-nic-pro"] = {
"Nicobar Purba",
116773793,
"aav-nic",
"Latn",
type = "reconstructed",
}
m["aav-pkl-pro"] = {
"Pnar-Khasi-Lyngngam Purba",
116773259,
"aav-pkl",
"Latn",
type = "reconstructed",
}
m["aav-pro"] = { -- mkh-pro will merge into this
"Austroasia Purba",
116773186,
"aav",
"Latn",
type = "reconstructed",
}
m["afa-pro"] = {
"Afroasia Purba",
269125,
"afa",
"Latn",
type = "reconstructed",
}
m["alg-aga"] = {
"Agawam",
nil,
"alg-eas",
"Latn",
}
m["alg-pro"] = {
"Algonquin Purba",
7251834,
"alg",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["alv-ama"] = {
"Amasi",
4740400,
"nic-grs",
"Latn",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron},
}
m["alv-bgu"] = {
"Baïnounk Gubëeher",
17002646,
"alv-bny",
"Latn",
}
m["alv-bua-pro"] = {
"Bua Purba",
116773723,
"alv-bua",
"Latn",
type = "reconstructed",
}
m["alv-cng-pro"] = {
"Cangin Purba",
116773726,
"alv-cng",
"Latn",
type = "reconstructed",
}
m["alv-edo-pro"] = {
"Edoid Purba",
116773206,
"alv-edo",
"Latn",
type = "reconstructed",
}
m["alv-fli-pro"] = {
"Fali Purba",
116773754,
"alv-fli",
"Latn",
type = "reconstructed",
}
m["alv-gbe-pro"] = {
"Gbe Purba",
116773208,
"alv-gbe",
"Latn",
type = "reconstructed",
}
m["alv-gng-pro"] = {
"Guang Purba",
116773757,
"alv-gng",
"Latn",
type = "reconstructed",
}
m["alv-gtm-pro"] = {
"Togo Tengah Purba",
116773732,
"alv-gtm",
"Latn",
type = "reconstructed",
}
m["alv-gwa"] = {
"Gwara",
16945580,
"nic-pla",
"Latn",
}
m["alv-hei-pro"] = {
"Heiban Purba",
116773760,
"alv-hei",
"Latn",
type = "reconstructed",
}
m["alv-ido-pro"] = {
"Idomoid Purba",
116773764,
"alv-ido",
"Latn",
type = "reconstructed",
}
m["alv-igb-pro"] = {
"Igboid Purba",
116773765,
"alv-igb",
"Latn",
type = "reconstructed",
}
m["alv-kwa-pro"] = {
"Kwa Purba",
116773780,
"alv-kwa",
"Latn",
type = "reconstructed",
}
m["alv-mum-pro"] = {
"Mumuye Purba",
116773791,
"alv-mum",
"Latn",
type = "reconstructed",
}
m["alv-nup-pro"] = {
"Nupoid Purba",
116773795,
"alv-nup",
"Latn",
type = "reconstructed",
}
m["alv-pro"] = {
"Atlantik-Congo Purba",
116732838,
"alv",
"Latn",
type = "reconstructed",
}
m["alv-edk-pro"] = {
"Edekiri Purba",
nil,
"alv-edk",
"Latn",
type = "reconstructed",
}
m["alv-yor-pro"] = {
"Yoruba Purba",
nil,
"alv-yor",
"Latn",
type = "reconstructed",
}
m["alv-yrd-pro"] = {
"Yoruboid Purba",
116773824,
"alv-yrd",
"Latn",
type = "reconstructed",
}
m["alv-von-pro"] = {
"Volta-Niger Purba",
116773820,
"alv-von",
"Latn",
type = "reconstructed",
}
m["apa-pro"] = {
"Apache Purba",
116773135,
"apa",
"Latn",
type = "reconstructed",
}
m["aql-pro"] = {
"Algik Purba",
18389588,
"aql",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = "·"},
}
m["art-adu"] = {
"Adûni",
1232159,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bel"] = {
"Kreol Belter",
108055510,
"art",
"Latn",
type = "appendix-constructed",
sort_key = {
remove_diacritics = c.acute,
from = {"ɒ"},
to = {"a"},
},
}
m["art-blk"] = {
"Bolak",
2909283,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-bsp"] = {
"Black Speech",
686210,
"art",
"Latn, Teng",
type = "appendix-constructed",
}
m["art-com"] = {
"Communicationssprache",
35227,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-dtk"] = {
"Dothraki",
2914733,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-elo"] = {
"Eloi",
nil,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-gld"] = {
"Goa'uld",
19823,
"art",
"Latn, Egyp, Mero",
type = "appendix-constructed",
}
m["art-lap"] = {
"Lapine",
6488195,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-man"] = {
"Mandalorian",
54289,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-mun"] = {
"Mundolinco",
851355,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-nav"] = {
"Na'vi",
316939,
"art",
"Latn",
type = "appendix-constructed",
}
m["art-vlh"] = {
"High Valyrian",
64483808,
"art",
"Latn",
type = "appendix-constructed",
}
m["ath-nic"] = {
"Nicola",
20609,
"ath-nor",
"Latn",
}
m["ath-pro"] = {
"Athabaska Purba",
104841722,
"ath",
"Latn",
type = "reconstructed",
}
m["auf-pro"] = {
"Arawa Purba",
116773706,
"auf",
"Latn",
type = "reconstructed",
}
m["aus-alu"] = {
"Alungul",
16827670,
"aus-pmn",
"Latn",
}
m["aus-and"] = {
"Andjingith",
4754509,
"aus-pmn",
"Latn",
}
m["aus-ang"] = {
"Angkula",
16828520,
"aus-pmn",
"Latn",
}
m["aus-arn-pro"] = {
"Arnhem Purba",
116773720,
"aus-arn",
"Latn",
type = "reconstructed",
}
m["aus-bra"] = {
"Barranbinya",
4863220,
"aus-pmn",
"Latn",
}
m["aus-brm"] = {
"Barunggam",
4865914,
"aus-pmn",
"Latn",
}
m["aus-cww-pro"] = {
"New South Wales Tengah Purba",
116773199,
"aus-cww",
"Latn",
type = "reconstructed",
}
m["aus-dal-pro"] = {
"Daly Purba",
116773743,
"aus-dal",
"Latn",
type = "reconstructed",
}
m["aus-guw"] = {
"Guwar",
6652138,
"aus-pam",
"Latn",
}
m["aus-lsw"] = {
"Little Swanport",
6652138,
nil,
"Latn",
}
m["aus-mbi"] = {
"Mbiywom",
6799701,
"aus-pmn",
"Latn",
}
m["aus-ngk"] = {
"Ngkoth",
7022405,
"aus-pmn",
"Latn",
}
m["aus-nyu-pro"] = {
"Nyulnyulan Purba",
116773797,
"aus-nyu",
"Latn",
type = "reconstructed",
}
m["aus-pam-pro"] = {
"Pama-Nyunga Purba",
33942,
"aus-pam",
"Latn",
type = "reconstructed",
}
m["aus-tul"] = {
"Tulua",
16938541,
"aus-pam",
"Latn",
}
m["aus-uwi"] = {
"Uwinymil",
7903995,
"aus-arn",
"Latn",
}
m["aus-wdj-pro"] = {
"Iwaidjan Purba",
116773767,
"aus-wdj",
"Latn",
type = "reconstructed",
}
m["aus-won"] = {
"Wong-gie",
nil,
"aus-pam",
"Latn",
}
m["aus-wul"] = {
"Wulguru",
8039196,
"aus-dyb",
"Latn",
}
m["aus-ynk"] = { -- contrast nny
"Yangkaal",
3913770,
"aus-tnk",
"Latn",
}
m["awd-amc-pro"] = {
"Amuesha-Chamicuro Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-kmp-pro"] = {
"Kampa Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-prw-pro"] = {
"Paresi-Waura Purba",
nil,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-ama"] = {
"Amarizana",
16827787,
"awd",
"Latn",
}
m["awd-ana"] = {
"Anauyá",
16828252,
"awd",
"Latn",
}
m["awd-apo"] = {
"Apolista",
16916645,
"awd",
"Latn",
}
m["awd-cab"] = {
"Cabre",
16850160,
"awd",
"Latn",
}
m["awd-gnu"] = {
"Guinau",
3504087,
"awd",
"Latn",
}
m["awd-kar"] = {
"Cariay",
16920253,
"awd",
"Latn",
}
m["awd-kaw"] = {
"Kawishana",
6379993,
"awd-nwk",
"Latn",
}
m["awd-kus"] = {
"Kustenau",
5196293,
"awd",
"Latn",
}
m["awd-man"] = {
"Manao",
6746920,
"awd",
"Latn",
}
m["awd-mar"] = {
"Marawan",
6755108,
"awd",
"Latn",
}
m["awd-mpr"] = {
"Maipure",
6736872,
"awd",
"Latn",
}
m["awd-mrt"] = {
"Mariaté",
16910017,
"awd-nwk",
"Latn",
}
m["awd-nwk-pro"] = {
"Nawiki Purba",
116773234,
"awd-nwk",
"Latn",
type = "reconstructed",
}
m["awd-pai"] = {
"Paikoneka",
128807835,
"awd",
"Latn",
}
m["awd-pas"] = {
"Pasé",
7143168,
"awd-nwk",
"Latn",
}
m["awd-pro"] = {
"Arawak Purba",
97573478,
"awd",
"Latn",
type = "reconstructed",
}
m["awd-she"] = {
"Shebayo",
7492248,
"awd",
"Latn",
}
m["awd-taa-pro"] = {
"Ta-Arawak Purba",
116773282,
"awd-taa",
"Latn",
type = "reconstructed",
}
m["awd-wai"] = {
"Wainumá",
16910017,
"awd-nwk",
"Latn",
}
m["awd-yum"] = {
"Yumana",
8061062,
"awd-nwk",
"Latn",
}
m["azc-caz"] = {
"Cazcan",
5055514,
"azc",
"Latn",
}
m["azc-cup-pro"] = {
"Cupan Purba",
116773738,
"azc-cup",
"Latn",
type = "reconstructed",
}
m["azc-ktn"] = {
"Kitanemuk",
3197558,
"azc-tak",
"Latn",
}
m["azc-nah-pro"] = {
"Nahua Purba",
7251860,
"azc-nah",
"Latn",
type = "reconstructed",
}
m["azc-num-pro"] = {
"Numi Purba",
116773247,
"azc-num",
"Latn",
type = "reconstructed",
}
m["azc-pro"] = {
"Uto-Aztek Purba",
96400333,
"azc",
"Latn",
type = "reconstructed",
}
m["azc-tak-pro"] = {
"Takik Purba",
116773283,
"azc-tak",
"Latn",
type = "reconstructed",
}
m["azc-tat"] = {
"Tataviam",
743736,
"azc",
"Latn",
}
m["ber-pro"] = {
"Barbar Purba",
2855698,
"ber",
"Latn",
type = "reconstructed",
}
m["ber-fog"] = {
"Fogaha",
107610173,
"ber",
"Latn",
}
m["ber-zuw"] = {
"Zuwara",
4117169,
"ber",
"Latn",
}
m["bnt-bal"] = {
"Balong",
93935237,
"bnt-bbo",
"Latn",
}
m["bnt-bon"] = {
"Boma Nkuu",
nil,
"bnt",
"Latn",
}
m["bnt-boy"] = {
"Boma Yumu",
nil,
"bnt",
"Latn",
}
m["bnt-bwa"] = {
"Bwala",
128810345,
"bnt-tek",
"Latn",
}
m["bnt-cmw"] = {
"Chimwiini",
4958328,
"bnt-swh",
"Latn",
}
m["bnt-ind"] = {
"Indanga",
51412803,
"bnt",
"Latn",
}
m["bnt-lal"] = {
"Lala (Afrika Selatan)",
6480154,
"bnt-ngu",
"Latn",
}
m["bnt-mpi"] = {
"Mpiin",
93937013,
"bnt-bdz",
"Latn",
}
m["bnt-mpu"] = {
"Mpuono", -- not to be confused with Mbuun zmp
36056,
"bnt",
"Latn",
}
m["bnt-ngu-pro"] = {
"Nguni Purba",
961559,
"bnt-ngu",
"Latn",
type = "reconstructed",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron},
}
m["bnt-phu"] = {
"Phuthi",
33796,
"bnt-ngu",
"Latn",
entry_name = {remove_diacritics = c.grave .. c.acute},
}
m["bnt-pro"] = {
"Bantu Purba",
3408025,
"bnt",
"Latn",
type = "reconstructed",
sort_key = "bnt-pro-sortkey",
}
m["bnt-sbo"] = {
"Boma Selatan",
nil,
"bnt",
"Latn",
}
m["bnt-sts-pro"] = {
"Sotho-Tswana Purba",
116773278,
"bnt-sts",
"Latn",
type = "reconstructed",
}
m["btk-pro"] = {
"Batak Purba",
116773191,
"btk",
"Latn",
type = "reconstructed",
}
m["cau-abz-pro"] = {
"Abkhaz-Abaza Purba",
7251831,
"cau-abz",
"Latn",
type = "reconstructed",
}
m["cau-and-pro"] = {
"Andi Purba",
nil,
"cau-and",
"Latn",
type = "reconstructed",
}
m["cau-ava-pro"] = {
"Avar-Andi Purba",
116773187,
"cau-ava",
"Latn",
type = "reconstructed",
}
m["cau-cir-pro"] = {
"Circassia Purba",
7251838,
"cau-cir",
"Latn",
type = "reconstructed",
}
m["cau-drg-pro"] = {
"Dargwa Purba",
116773205,
"cau-drg",
"Latn",
type = "reconstructed",
}
m["cau-lzg-pro"] = {
"Lezgi Purba",
116773223,
"cau-lzg",
"Latn",
type = "reconstructed",
}
m["cau-nec-pro"] = {
"Kaukasus Timur Laut Purba",
116773244,
"cau-nec",
"Latn",
type = "reconstructed",
}
m["cau-nkh-pro"] = {
"Nakh Purba",
108032840,
"cau-nkh",
"Latn",
type = "reconstructed",
}
m["cau-nwc-pro"] = {
"Kaukasus Barat Laut Purba",
7251861,
"cau-nwc",
"Latn",
type = "reconstructed",
}
m["cau-tsz-pro"] = {
"Tsez Purba",
116773287,
"cau-tsz",
"Latn",
type = "reconstructed",
}
m["cba-ata"] = {
"Atanques",
4812783,
"cba",
"Latn",
}
m["cba-cat"] = {
"Catío Chibcha",
7083619,
"cba",
"Latn",
}
m["cba-dor"] = {
"Dorasque",
5297532,
"cba",
"Latn",
}
m["cba-dui"] = {
"Duit",
3041061,
"cba",
"Latn",
}
m["cba-hue"] = {
"Huetar",
35514,
"cba",
"Latn",
}
m["cba-nut"] = {
"Nutabe",
7070405,
"cba",
"Latn",
}
m["cba-pro"] = {
"Chibchan Purba",
116773203,
"cba",
"Latn",
type = "reconstructed",
}
m["ccn-pro"] = {
"Kaukasus Utara Purba",
116773237,
"ccn",
"Latn",
type = "reconstructed",
}
m["ccs-pro"] = {
"Kartvelia Purba",
2608203,
"ccs",
"Latn",
type = "reconstructed",
entry_name = {
from = {"q̣", "p̣", "ʓ", "ċ"},
to = {"q̇", "ṗ", "ʒ", "c̣"}
},
}
m["ccs-gzn-pro"] = {
"Georgia-Zan Purba",
23808119,
"ccs-gzn",
"Latn",
type = "reconstructed",
entry_name = {
from = {"q̣", "p̣", "ʓ", "ċ"},
to = {"q̇", "ṗ", "ʒ", "c̣"}
},
}
m["cdc-cbm-pro"] = {
"Chad Tengah Purba",
116773197,
"cdc-cbm",
"Latn",
type = "reconstructed",
}
m["cdc-mas-pro"] = {
"Masa Purba",
116773789,
"cdc-mas",
"Latn",
type = "reconstructed",
}
m["cdc-pro"] = {
"Chad Purba",
116773201,
"cdc",
"Latn",
type = "reconstructed",
}
m["cdd-pro"] = {
"Caddoan Purba",
116773725,
"cdd",
"Latn",
type = "reconstructed",
}
m["cel-bry-pro"] = {
"Briton Purba",
1248800,
"cel-bry",
"Latn, Grek",
sort_key = "cel-bry-pro-sortkey",
}
m["cel-gal"] = {
"Gallaecia",
3094789,
"cel-his",
}
m["cel-gau"] = {
"Gallia",
29977,
"cel",
"Latn, Grek, Ital",
entry_name = {remove_diacritics = c.macron .. c.breve .. c.diaer},
}
m["cel-pro"] = {
"Keltik Purba",
653649,
"cel",
"Latn",
type = "reconstructed",
sort_key = "cel-pro-sortkey",
}
m["chi-pro"] = {
"Chimakuan Purba",
116773734,
"chi",
"Latn",
type = "reconstructed",
}
m["chm-pro"] = {
"Mari Purba",
116773788,
"chm",
"Latn",
type = "reconstructed",
}
m["cmc-pro"] = {
"Chamik Purba",
114793834,
"cmc",
"Latn",
type = "reconstructed",
}
m["crp-bip"] = {
"Pijin Basque-Iceland",
810378,
"crp",
"Latn",
ancestors = "eu",
}
m["crp-gep"] = {
"Pijin Greenland Barat",
17036301,
"crp",
"Latn",
ancestors = "kl",
}
m["crp-mar"] = {
"Maroon Spirit Language",
1093206,
"crp",
"Latn",
ancestors = "en",
}
m["crp-mpp"] = {
"Portugis Pijin Macau",
128804537,
"crp",
"Hant, Latn",
ancestors = "pt",
sort_key = {Hant = "Hani-sortkey"},
}
m["crp-rsn"] = {
"Russenorsk",
505125,
"crp",
"Cyrl, Latn",
ancestors = "nn, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-spp"] = {
"Samoan Plantation Pidgin",
7409948,
"crp",
"Latn",
ancestors = "en",
}
m["crp-slb"] = {
"Inggeris Solombala",
7558525,
"crp",
"Cyrl, Latn",
ancestors = "en, ru",
translit = {Cyrl = "ru-translit"},
}
m["crp-tpr"] = {
"Rusia Pijin Taimyr",
16930506,
"crp",
"Cyrl",
ancestors = "ru",
translit = "ru-translit",
}
m["csu-bba-pro"] = {
"Bongo-Bagirmi Purba",
116773722,
"csu-bba",
"Latn",
type = "reconstructed",
}
m["csu-maa-pro"] = {
"Mangbetu Purba",
116773786,
"csu-maa",
"Latn",
type = "reconstructed",
}
m["csu-pro"] = {
"Sudan Tengah Purba",
116773730,
"csu",
"Latn",
type = "reconstructed",
}
m["csu-sar-pro"] = {
"Sara Purba",
116773809,
"csu-sar",
"Latn",
type = "reconstructed",
}
m["cus-ash"] = {
"Ashraaf",
4805855,
"cus-som",
"Latn",
}
m["cus-hec-pro"] = {
"Kusyi Timur Tanah Tinggi Purba",
116773761,
"cus-hec",
"Latn",
type = "reconstructed",
}
m["cus-som-pro"] = {
"Somaloid Purba",
nil,
"cus-som",
"Latn",
type = "reconstructed",
}
m["cus-sou-pro"] = {
"Kusyi Selatan Purba",
126081567,
"cus-sou",
"Latn",
type = "reconstructed",
}
m["cus-pro"] = {
"Kusyi Purba",
116773204,
"cus",
"Latn",
type = "reconstructed",
}
m["dmn-dam"] = {
"Dama (Sierra Leone)",
19601574,
"dmn",
"Latn",
}
m["dra-bry"] = {
"Beary",
1089116,
"qfa-mix",
"Mlym, Knda",
ancestors = "ml, tcy",
translit = {
Mlym = "ml-translit",
Knda = "kn-translit",
},
}
m["dra-cen-pro"] = {
"Dravidia Tengah Purba",
nil,
"dra-cen",
"Latn",
type = "reconstructed",
}
m["dra-mkn"] = {
"Kannada Pertengahan",
128810572,
"dra-kan",
"Knda",
translit = "kn-translit",
}
m["dra-nor-pro"] = {
"Dravidia Utara Purba",
124433593,
"dra-nor",
"Latn",
type = "reconstructed",
}
m["dra-okn"] = {
"Kannada Kuno",
15723156,
"dra-kan",
"Knda",
translit = "kn-translit",
}
m["dra-ote"] = {
"Telugu Kuno",
126720868,
"dra-tel",
"Telu",
translit = "te-translit",
}
m["dra-pro"] = {
"Dravidia Purba",
1702853,
"dra",
"Latn",
type = "reconstructed",
}
m["dra-sdo-pro"] = {
"Dravidia Selatan I Purba",
104847952, -- Wikipedia's "Dravidia Selatan Purba" is "Dravidia Selatan I Purba" in this scheme.
"dra-sdo",
"Latn",
type = "reconstructed",
}
m["dra-sdt-pro"] = {
"Dravidia Selatan II Purba",
128885257,
"dra-sdt",
"Latn",
type = "reconstructed",
}
m["dra-sou-pro"] = {
"Dravidia Selatan Purba",
128886121,
"dra-sou",
"Latn",
type = "reconstructed",
}
m["egx-dem"] = {
"Demotik",
36765,
"egx",
"Latn, Egyd, Polyt",
translit = {
Polyt = "grc-translit",
},
entry_name = {
Polyt = s["Polyt-entryname"],
},
sort_key = {
Latn = {
remove_diacritics = "'%-%s",
from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"},
to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]}
},
Polyt = s["Grek-sortkey"],
},
}
m["dmn-pro"] = {
"Mande Purba",
116773785,
"dmn",
"Latn",
type = "reconstructed",
}
m["dmn-mdw-pro"] = {
"Mande Barat Purba",
116773822,
"dmn-mdw",
"Latn",
type = "reconstructed",
}
m["dru-pro"] = {
"Rukai Purba",
116773807,
"map",
"Latn",
type = "reconstructed",
}
m["esx-esk-pro"] = {
"Eskimo Purba",
7251842,
"esx-esk",
"Latn",
type = "reconstructed",
}
m["esx-ink"] = {
"Inuktun",
1671647,
"esx-inu",
"Latn",
}
m["esx-inq"] = {
"Inuinnaqtun",
28070,
"esx-inu",
"Latn",
}
m["esx-inu-pro"] = {
"Inuit Purba",
60785588,
"esx-inu",
"Latn",
type = "reconstructed",
}
m["esx-pro"] = {
"Eskimo-Aleut Purba",
7251843,
"esx",
"Latn",
type = "reconstructed",
}
m["esx-tut"] = {
"Tunumiisut",
15665389,
"esx-inu",
"Latn",
}
m["euq-pro"] = {
"Vascon Purba",
938011,
"euq",
"Latn",
type = "reconstructed",
}
m["gba-pro"] = {
"Gbaya Purba",
nil,
"gba",
"Latn",
type = "reconstructed",
}
m["gem-pro"] = {
"Jermanik Purba",
669623,
"gem",
"Latn",
type = "reconstructed",
sort_key = "gem-pro-sortkey",
}
m["gme-bur"] = {
"Burgundians",
47625,
"gme",
"Latn",
}
m["gme-cgo"] = {
"Goth Crimea",
36211,
"gme",
"Latn",
}
m["gmq-gut"] = {
"Gutnish",
1256646,
"gmq",
"Latn",
ancestors = "gmq-ogt",
}
m["gmq-jmk"] = {
"Jamtish",
35512,
"gmq-eas",
"Latn",
}
m["gmq-mno"] = {
"Norway Pertengahan",
3417070,
"gmq-wes",
"Latn",
}
m["gmq-oda"] = {
"Denmark Kuno",
12330003,
"gmq-eas",
"Latn, Runr",
entry_name = {remove_diacritics = c.macron},
}
m["gmq-ogt"] = {
"Gutnish Kuno",
1133488,
"gmq",
"Latn",
ancestors = "non",
}
m["gmq-osw"] = {
"Sweden Kuno",
2417210,
"gmq-eas",
"Latn, Runr",
entry_name = {remove_diacritics = c.macron},
}
m["gmq-pro"] = {
"Norse Purba",
1671294,
"gmq",
"Runr",
translit = "Runr-translit",
}
m["gmq-scy"] = {
"Scanian",
768017,
"gmq-eas",
"Latn",
}
m["gmw-bgh"] = {
"Bergish",
329030,
"gmw-frk",
"Latn",
}
m["gmw-cfr"] = {
"Franconia Tengah",
572197,
"gmw-hgm",
"Latn",
ancestors = "gmh",
wikimedia_codes = "ksh",
}
m["gmw-ecg"] = {
"Jerman Tengah Timur",
499344, -- subsumes Q699284, Q152965
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-fin"] = {
"Fingallian",
3072588,
"gmw-ian",
"Latn",
}
m["gmw-gts"] = {
"Gottscheerish",
533109,
"gmw-hgm",
"Latn",
ancestors = "bar",
}
m["gmw-jdt"] = {
"Belanda Jersey",
1687911,
"gmw-frk",
"Latn",
ancestors = "nl",
}
m["gmw-msc"] = {
"Scots Pertengahan",
3327000,
"gmw-ang",
"Latn",
ancestors = "enm-esc",
}
m["gmw-pro"] = {
"Jermanik Barat Purba",
78079021,
"gmw",
"Latn",
-- type = "reconstructed",
-- largely but not entirely reconstructed (like Norse); see April '24 BP, set back to reconstructed (?) if 'anti-asterisk' is added
sort_key = "gmw-pro-sortkey",
}
m["gmw-rfr"] = {
"Franconia Rhine",
707007,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gmw-stm"] = {
"Sathmar Swabian",
2223059,
"gmw-hgm",
"Latn",
ancestors = "swg",
}
m["gmw-tsx"] = {
"Transylvanian Saxon",
260942,
"gmw-hgm",
"Latn",
ancestors = "gmw-cfr",
}
m["gmw-vog"] = {
"Jerman Volga",
312574,
"gmw-hgm",
"Latn",
ancestors = "gmw-rfr",
}
m["gmw-zps"] = {
"Jerman Zipser",
205548,
"gmw-hgm",
"Latn",
ancestors = "gmh",
}
m["gn-cls"] = {
"Guaraní Klasik",
17478065,
"tup-gua",
"Latn",
ancestors = "gn",
}
m["grk-cal"] = {
"Yunani Calabria",
1146398,
"grk",
"Latn",
ancestors = "grk-ita",
}
m["grk-ita"] = {
"Yunani Itali",
19720507,
"grk",
"Latn, Grek",
ancestors = "gkm",
entry_name = {remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow},
sort_key = s["Grek-sortkey"],
}
m["grk-mar"] = {
"Yunani Mariupol",
4400023,
"grk",
"Cyrl, Latn, Grek",
ancestors = "gkm",
translit = {
Cyrl = "grk-mar-translit",
Grek = "grk-mar-translit",
},
override_translit = true,
display_text = {
Grek = s["Grek-displaytext"],
},
entry_name = {
Cyrl = {remove_diacritics = c.acute},
Grek = s["Grek-entryname"],
},
sort_key = {
Grek = s["Grek-sortkey"],
},
}
m["grk-pro"] = {
"Hellenik Purba",
1231805,
"grk",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ď", "ľ", "ň", "ř", "ʰ", "ʷ", c.acute, c.macron},
to = {"a", "e", "i", "o", "u", "d", "l", "n", "r", "¯h", "¯w"}
},
}
m["hmn-pro"] = {
"Hmong Purba",
116773210,
"hmn",
"Latn",
type = "reconstructed",
}
m["hmx-mie-pro"] = {
"Mien Purba",
116773229,
"hmx-mie",
"Latn",
type = "reconstructed",
}
m["hmx-pro"] = {
"Hmong-Mien Purba",
7251846,
"hmx",
"Latn",
type = "reconstructed",
}
m["hyx-pro"] = {
"Armenia Purba",
3848498,
"hyx",
"Latn",
type = "reconstructed",
}
m["iir-nur-pro"] = {
"Nuristani Purba",
116773248,
"iir-nur",
"Latn",
type = "reconstructed",
}
m["iir-pro"] = {
"Indo-Iran Purba",
966439,
"iir",
"Latn",
type = "reconstructed",
}
m["ijo-pro"] = {
"Ijoid Purba",
116773766,
"ijo",
"Latn",
type = "reconstructed",
}
m["inc-apa"] = {
"Apabhramsa",
616419,
"inc-mid",
"Deva, Shrd, Sidd",
ancestors = "pra",
translit = {
Deva = "sa-translit",
Shrd = "Shrd-translit",
Sidd = "Sidd-translit",
},
}
m["inc-ash"] = {
"Prakrit Ashoka",
104854379,
"inc-mid",
"Brah, Khar",
ancestors = "sa",
translit = {
Brah = "Brah-translit",
Khar = "Khar-translit",
},
}
m["inc-kam"] = {
"Prakrit Kamarupi",
6356097,
"inc-eas",
"Brah, Sidd",
translit = {
Brah = "Brah-translit",
Sidd = "Sidd-translit",
},
}
m["inc-kho"] = {
"Kholosi",
24952008,
"inc-snd",
"Latn",
}
m["inc-krn-pro"] = {
"KRDS lects Purba",
128816843,
"inc-eas",
"Latn",
ancestors = "inc-kam",
type = "reconstructed",
}
m["inc-mas"] = {
"Assam Pertengahan",
128806836,
"inc-eas",
"as-Beng",
ancestors = "inc-oas",
translit = "inc-mas-translit",
}
m["inc-mbn"] = {
"Benggali Pertengahan",
113559927,
"inc-eas",
"Beng",
ancestors = "inc-obn",
translit = "inc-mbn-translit",
}
m["inc-mgu"] = {
"Gujarat Pertengahan",
24907429,
"inc-wes",
"Deva",
ancestors = "inc-ogu",
}
m["inc-mor"] = {
"Odia Pertengahan",
128810882,
"inc-eas",
"Orya",
ancestors = "inc-oor",
}
m["inc-oas"] = {
"Assam Awal",
85758237,
"inc-eas",
"as-Beng",
ancestors = "inc-kam",
translit = "inc-oas-translit",
}
m["inc-oaw"] = {
"Awadhi Kuno",
nil,
"inc-hie",
"Deva, Kthi, ur-Arab",
entry_name = {
from = {"هٔ", "ۂ"}, -- maps "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) to "ہ" (U+06C1)
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "sa-translit",
Kthi = "sa-Kthi-translit",
["ur-Arab"] = "inc-ohi-translit",
},
}
m["inc-obn"] = {
"Benggali Kuno",
113559926,
"inc-eas",
"Beng",
}
m["inc-ogu"] = {
"Gujarat Kuno",
24907427,
"inc-wes",
"Deva",
translit = "sa-translit",
}
m["inc-ohi"] = {
"Hindi Kuno",
48767781,
"inc-hiw",
"Deva, ur-Arab",
entry_name = {
from = {"هٔ", "ۂ"}, -- maps "هٔ" (U+0647 + U+0654) and "ۂ" (U+06C2) to "ہ" (U+06C1)
to = {"ہ", "ہ"},
remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef
},
translit = {
Deva = "sa-translit",
["ur-Arab"] = "inc-ohi-translit",
},
}
m["inc-oor"] = {
"Odia Kuno",
128807801,
"inc-eas",
"Orya",
}
m["inc-opa"] = {
"Punjabi Kuno",
115270971,
"inc-pan",
"Guru, pa-Arab",
translit = {
Guru = "inc-opa-Guru-translit",
["pa-Arab"] = "pa-Arab-translit",
},
entry_name = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun},
}
m["inc-pro"] = {
"Indo-Arya Purba",
23808344,
"inc",
"Latn",
type = "reconstructed",
}
m["ine-ana-pro"] = {
"Anatolia Purba",
7251833,
"ine-ana",
"Latn",
type = "reconstructed",
}
m["ine-bsl-pro"] = {
"Balto-Slavik Purba",
1703347,
"ine-bsl",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"},
to = {"a", "e", "i", "o", "u"}
},
}
m["ine-kal"] = {
"Kalašma",
122770439,
"ine-ana",
"Xsux",
}
m["ine-pae"] = {
"Paeonian",
2705672,
"ine",
"Polyt",
translit = "grc-translit",
entry_name = s["Polyt-entryname"],
sort_key = s["Grek-sortkey"],
}
m["ine-pro"] = {
"Indo-Eropah Purba",
37178,
"ine",
"Latn",
type = "reconstructed",
sort_key = {
from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron},
to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"}
},
}
m["ine-toc-pro"] = {
"Tocharia Purba",
37029,
"ine-toc",
"Latn",
type = "reconstructed",
}
m["xme-old"] = {
"Medes Kuno",
36461,
"xme",
"Grek, Latn",
}
m["xme-mid"] = {
"Medes Pertengahan",
12836150,
"xme",
"Latn",
}
m["xme-ker"] = {
"Kerman",
129850,
"xme",
"fa-Arab, Latn",
ancestors = "xme-mid",
}
m["xme-taf"] = {
"Tafreshi",
nil,
"xme",
"fa-Arab, Latn",
ancestors = "xme-mid",
}
m["xme-ttc-pro"] = {
"Tat Purba",
122973870,
"xme-ttc",
"Latn",
ancestors = "xme-mid",
}
m["xme-kls"] = {
"Kalasuri",
nil,
"xme-ttc",
ancestors = "xme-ttc-nor",
}
m["xme-klt"] = {
"Kilit",
3612452,
"xme-ttc",
"Cyrl", -- and fa-Arab?
}
m["xme-ott"] = {
"Tati Kuno",
434697,
"xme-ttc",
"fa-Arab, Latn",
}
m["ira-kms-pro"] = {
"Komisenian Purba",
116773777,
"ira-kms",
"Latn",
type = "reconstructed",
}
m["ira-mpr-pro"] = {
"Medo-Parthia Purba",
116773227,
"ira-mpr",
"Latn",
type = "reconstructed",
}
m["ira-pat-pro"] = {
"Pathan Purba",
116773255,
"ira-pat",
"Latn",
type = "reconstructed",
}
m["ira-pro"] = {
"Iran Purba",
4167865,
"ira",
"Latn",
type = "reconstructed",
}
m["ira-zgr-pro"] = {
"Zaza-Gorani Purba",
116775031,
"ira-zgr",
"Latn",
type = "reconstructed",
}
m["os-pro"] = {
"Ossetia Purba",
116773249,
"xsc",
"Latn",
type = "reconstructed",
}
m["xsc-pro"] = {
"Scythia Purba",
116773273,
"xsc",
"Latn",
type = "reconstructed",
}
m["xsc-skw-pro"] = {
"Saka-Wakhi Purba",
116773267,
"xsc-skw",
"Latn",
type = "reconstructed",
}
m["xsc-sak-pro"] = {
"Saka Purba",
116773264,
"xsc-sak",
"Latn",
type = "reconstructed",
}
m["ira-sym-pro"] = {
"Shughni-Yazghulami-Munji Purba",
116773813,
"ira-sym",
"Latn",
type = "reconstructed",
}
m["ira-sgi-pro"] = {
"Sanglechi-Ishkashimi Purba",
116773808,
"ira-sgi",
"Latn",
type = "reconstructed",
}
m["ira-mny-pro"] = {
"Munji-Yidgha Purba",
116773792,
"ira-mny",
"Latn",
type = "reconstructed",
}
m["ira-shy-pro"] = {
"Shughni-Yazghulami Purba",
116773812,
"ira-shy",
"Latn",
type = "reconstructed",
}
m["ira-shr-pro"] = {
"Shughni-Roshani Purba",
116773811,
"ira-shr",
"Latn",
type = "reconstructed",
}
m["ira-sgc-pro"] = {
"Sogdia Purba",
116773276,
"ira-sgc",
"Latn",
type = "reconstructed",
}
m["ira-wnj"] = {
"Vanji",
3398419,
"ira-shy",
"Latn",
}
m["iro-ere"] = {
"Erie",
5388365,
"iro-nor",
"Latn",
}
m["iro-min"] = {
"Mingo",
128531,
"iro-nor",
"Latn",
ietf_subtag = "i-mingo", -- grandfathered IETF tag
}
m["iro-nor-pro"] = {
"Iroquois Utara Purba",
116773242,
"iro-nor",
"Latn",
type = "reconstructed",
}
m["iro-pro"] = {
"Iroquois Purba",
7251852,
"iro",
"Latn",
type = "reconstructed",
}
m["itc-pro"] = {
"Italik Purba",
17102720,
"itc",
"Latn",
type = "reconstructed",
}
m["jpx-hcj"] = {
"Hachijō",
5637049,
"jpx",
"Jpan",
ancestors = "ojp-eas",
translit = s["jpx-translit"],
display_text = s["jpx-displaytext"],
entry_name = s["jpx-entryname"],
sort_key = s["jpx-sortkey"],
}
m["jpx-pro"] = {
"Jepunik Purba",
3924309,
"jpx",
"Latn",
type = "reconstructed",
}
m["jpx-ryu-pro"] = {
"Ryukyu Purba",
56349069,
"jpx-ryu",
"Latn",
type = "reconstructed",
}
m["kar-pro"] = {
"Karen Purba",
85794783,
"kar",
"Latn",
type = "reconstructed",
}
m["kca-eas"] = {
"Khanty Timur",
30304622,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
}
m["kca-nor"] = {
"Khanty Utara",
30304527,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
}
m["kca-pro"] = {
"Khanty Purba",
127505171,
"kca",
"Latn",
type = "reconstructed",
}
m["kca-sou"] = {
"Khanty Selatan",
30304618,
"kca",
"Cyrl",
translit = "kca-translit",
override_translit = true,
}
m["khi-kho-pro"] = {
"Khoe Purba",
116773218,
"khi-kho",
"Latn",
type = "reconstructed",
}
m["khi-kun"] = {
"ǃKung",
32904,
"khi-kxa",
"Latn",
}
m["ko-ear"] = {
"Korea Moden Awal",
756014,
"qfa-kor",
"Kore",
ancestors = "okm",
translit = "okm-translit",
entry_name = s["Kore-entryname"],
}
m["kro-pro"] = {
"Kru Purba",
116773778,
"kro",
"Latn",
type = "reconstructed",
}
m["ku-pro"] = {
"Kurdi Purba",
116773221,
"ku",
"Latn",
type = "reconstructed",
}
m["map-ata-pro"] = {
"Atayal Purba",
116773151,
"map-ata",
"Latn",
type = "reconstructed",
}
m["map-bms"] = {
"Banyumasan",
33219,
"map",
"Latn, Java",
}
m["map-pro"] = {
"Austronesia Purba",
49230,
"map",
"Latn",
type = "reconstructed",
}
m["mis-hkl"] = {
"Kelantan Peranakan",
108794818,
"qfa-mix",
ancestors = "nan-hbl, sou, mfa",
}
m["mis-isa"] = {
"Isaurian",
16956868,
nil,
-- "Xsux, Hluw, Latn",
}
m["mis-jie"] = {
"Jie",
124424186,
nil,
"Hani",
sort_key = "Hani-sortkey",
}
m["mis-jzh"] = {
"Jizhao",
45242758,
"qfa-bej",
"Latn",
}
m["mis-kas"] = {
"Kassite",
35612,
nil,
"Xsux",
}
m["mis-mmd"] = {
"Mimi of Decorse",
6862206,
nil,
"Latn",
}
m["mis-mmn"] = {
"Mimi of Nachtigal",
6862207,
nil,
"Latn",
}
m["mis-phi"] = {
"Philistine",
2230924,
nil,
"Phnx",
}
m["mis-rou"] = {
"Rouran",
48816637,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tnw"] = {
"Tangwang",
7683179,
"qfa-mix",
"Latn",
ancestors = "cmn, sce",
}
m["mis-tuh"] = {
"Tuyuhun",
48816625,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-tuo"] = {
"Tuoba",
48816629,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-wuh"] = {
"Wuhuan",
118976867,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mis-xbi"] = {
"Xianbei",
4448647,
"qfa-xgx",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mjg-mgl"] = {
"Mongghul",
53765528,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mjg-mgr"] = {
"Mangghuer",
56285392,
"mjg",
"Latn", -- also Mong, Cyrl ?
}
m["mkh-asl-pro"] = {
"Asli Purba",
55630680,
"mkh-asl",
"Latn",
type = "reconstructed",
}
m["mkh-ban-pro"] = {
"Bahnar Purba",
116773189,
"mkh-ban",
"Latn",
type = "reconstructed",
}
m["mkh-kat-pro"] = {
"Katu Purba",
116773772,
"mkh-kat",
"Latn",
type = "reconstructed",
}
m["mkh-khm-pro"] = {
"Khmu Purba",
116773774,
"mkh-khm",
"Latn",
type = "reconstructed",
}
m["mkh-kmr-pro"] = {
"Khmer Purba",
55630684,
"mkh-kmr",
"Latn",
type = "reconstructed",
}
m["mkh-mmn"] = {
"Mon Pertengahan",
121337926,
"mkh-mnc",
"Latn, Mymr", --and also Pallava
ancestors = "omx",
}
m["mkh-mnc-pro"] = {
"Mon Purba",
116773231,
"mkh-mnc",
"Latn",
type = "reconstructed",
}
m["mkh-mvi"] = {
"Vietnam Pertengahan",
9199,
"mkh-vie",
"Hani, Latn",
sort_key = {Hani = "Hani-sortkey"},
}
m["mkh-pal-pro"] = {
"Palaung Purba",
104847372,
"mkh-pal",
"Latn",
type = "reconstructed",
}
m["mkh-pea-pro"] = {
"Pear Purba",
116773804,
"mkh-pea",
"Latn",
type = "reconstructed",
}
m["mkh-pkn-pro"] = {
"Pakan Purba",
116773803,
"mkh-pkn",
"Latn",
type = "reconstructed",
}
m["mkh-pro"] = { --This will be merged into 2015 aav-pro.
"Mon-Khmer Purba",
7251859,
"mkh",
"Latn",
type = "reconstructed",
}
m["mnw-tha"] = { -- To be removed.
"Thai Mon",
nil,
"mkh-mnc",
"Mymr, Thai",
ancestors = "mkh-mmn",
sort_key = {
from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"},
to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"}
},
}
m["mkh-vie-pro"] = {
"Viet Purba",
109432616,
"mkh-vie",
"Latn",
type = "reconstructed",
}
m["mns-cen"] = {
"Mansi Tengah",
128810384,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-nor"] = {
"Mansi Utara",
30304537,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mns-pro"] = {
"Mansi Purba",
128883093,
"mns",
"Latn",
type = "reconstructed",
}
m["mns-sou"] = {
"Mansi Selatan",
30304629,
"mns",
"Cyrl",
translit = "mns-translit",
override_translit = true,
}
m["mun-pro"] = {
"Munda Purba",
105102373,
"mun",
"Latn",
type = "reconstructed",
}
m["myn-chl"] = { -- the stage after ''emy''
"Ch'olti'",
873995,
"myn",
"Latn",
}
m["myn-pro"] = {
"Maya Purba",
3321532,
"myn",
"Latn",
type = "reconstructed",
}
m["nai-ala"] = {
"Alazapa",
128810233,
nil,
"Latn",
}
m["nai-bay"] = {
"Bayogoula",
1563704,
nil,
"Latn",
}
m["nai-cal"] = {
"Calusa",
51782,
nil,
"Latn",
}
m["nai-chi"] = {
"Chiquimulilla",
25339627,
"nai-xin",
"Latn",
}
m["nai-chu-pro"] = {
"Chumash Purba",
116773736,
"nai-chu",
"Latn",
type = "reconstructed",
}
m["nai-cig"] = {
"Ciguayo",
20741700,
nil,
"Latn",
}
m["nai-ckn-pro"] = {
"Chinook Purba",
116773735,
"nai-ckn",
"Latn",
type = "reconstructed",
}
m["nai-guz"] = {
"Guazacapán",
19572028,
"nai-xin",
"Latn",
}
m["nai-hit"] = {
"Hitchiti",
1542882,
"nai-mus",
"Latn",
}
m["nai-ipa"] = {
"Ipai",
3027474,
"nai-yuc",
"Latn",
}
m["nai-jtp"] = {
"Jutiapa",
nil,
"nai-xin",
"Latn",
}
m["nai-jum"] = {
"Jumaytepeque",
25339626,
"nai-xin",
"Latn",
}
m["nai-kat"] = {
"Kathlamet",
6376639,
"nai-ckn",
"Latn",
}
m["nai-klp-pro"] = {
"Kalapuyan Purba",
116773771,
"nai-klp",
"Latn",
type = "reconstructed",
}
m["nai-knm"] = {
"Konomihu",
3198734,
"nai-shs",
"Latn",
}
m["nai-kum"] = {
"Kumeyaay",
4910139,
"nai-yuc",
"Latn",
}
m["nai-mac"] = {
"Macoris",
21070851,
nil,
"Latn",
}
m["nai-mdu-pro"] = {
"Maidun Purba",
116773784,
"nai-mdu",
"Latn",
type = "reconstructed",
}
m["nai-miz-pro"] = {
"Mixe-Zoque Purba",
7251858,
"nai-miz",
"Latn",
type = "reconstructed",
}
m["nai-mus-pro"] = {
"Muscogee Purba",
116775368,
"nai-mus",
"Latn",
type = "reconstructed",
}
m["nai-nao"] = {
"Naolan",
6964594,
nil,
"Latn",
}
m["nai-nrs"] = {
"New River Shasta",
7011254,
"nai-shs",
"Latn",
}
m["nai-okw"] = {
"Okwanuchu",
3350126,
"nai-shs",
"Latn",
}
m["nai-per"] = {
"Pericú",
3375369,
nil,
"Latn",
}
m["nai-pic"] = {
"Picuris",
7191257,
"nai-kta",
"Latn",
}
m["nai-plp-pro"] = {
"Penuti Penara Purba",
116773806,
"nai-plp",
"Latn",
type = "reconstructed",
}
m["nai-pom-pro"] = {
"Pomo Purba",
116773262,
"nai-pom",
"Latn",
type = "reconstructed",
}
m["nai-qng"] = {
"Quinigua",
36360,
nil,
"Latn",
}
m["nai-sca-pro"] = { -- NB 'sio-pro' "Siouan" which is Western Siouan
"Sioux-Catawba Purba",
116773275,
"nai-sca",
"Latn",
type = "reconstructed",
}
m["nai-sin"] = {
"Sinacantán",
24190249,
"nai-xin",
"Latn",
}
m["nai-sln"] = {
"Salvadoran Lenca",
3229434,
"nai-len",
"Latn",
}
m["nai-spt"] = {
"Sahaptin",
3833015,
"nai-shp",
"Latn",
}
m["nai-tap"] = {
"Tapachultec",
7684401,
"nai-miz",
"Latn",
}
m["nai-taw"] = {
"Tawasa",
7689233,
nil,
"Latn",
}
m["nai-teq"] = {
"Tequistlatec",
2964454,
"nai-tqn",
"Latn",
}
m["nai-tip"] = {
"Tipai",
3027471,
"nai-yuc",
"Latn",
}
m["nai-tot-pro"] = {
"Totozoque Purba",
116773285,
"nai-tot",
"Latn",
type = "reconstructed",
}
m["nai-tsi-pro"] = {
"Tsimshian Purba",
nil,
"nai-tsi",
"Latn",
type = "reconstructed",
}
m["nai-utn-pro"] = {
"Uti Purba",
116773290,
"nai-utn",
"Latn",
type = "reconstructed",
}
m["nai-wai"] = {
"Waikuri",
3118702,
nil,
"Latn",
}
m["nai-wji"] = {
"Jicaque Barat",
3178610,
"nai-jcq",
"Latn",
}
m["nai-yup"] = {
"Yupiltepeque",
25339628,
"nai-xin",
"Latn",
}
m["nan-dat"] = {
"Datian Min",
19855572,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-hbl"] = {
"Hokkien",
1624231,
"zhx-nan",
"Hants, Latn, Bopo, Kana",
wikimedia_codes = "zh-min-nan",
generate_forms = "zh-generateforms",
sort_key = {
Hani = "Hani-sortkey",
Kana = "Kana-sortkey"
},
}
m["nan-hlh"] = {
"Min Hailufeng",
120755728,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-hnm"] = {
"Hainan",
934541,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-lnx"] = {
"Min Longyan",
6674568,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-luh"] = {
"Min Leizhou",
1988433,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-tws"] = {
"Teochew",
36759,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["nan-zhe"] = {
"Min Zhenan",
3846710,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nan-zsh"] = {
"Min Sanxiang",
7420769,
"zhx-nan",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["nds-de"] = {
"German Low German",
25433,
"gmw-lgm",
"Latn",
ancestors = "nds",
ietf_subtag = "nds-DE", -- should we make this the actual code?
wikimedia_codes = "nds",
}
m["nds-nl"] = {
"Dutch Low Saxon",
516137,
"gmw-lgm",
"Latn",
ancestors = "nds",
ietf_subtag = "nds-NL", -- should we make this the actual code?
}
m["ngf-pro"] = {
"Trans-New Guinea Purba",
85794785,
"ngf",
"Latn",
type = "reconstructed",
}
m["nic-bco-pro"] = {
"Benue-Congo Purba",
116773194,
"nic-bco",
"Latn",
type = "reconstructed",
}
m["nic-bod-pro"] = {
"Bantoid Purba",
116773190,
"nic-bod",
"Latn",
type = "reconstructed",
}
m["nic-eov-pro"] = {
"Oti-Volta Timur Purba",
116773753,
"nic-eov",
"Latn",
type = "reconstructed",
}
m["nic-gns-pro"] = {
"Gurunsi Purba",
116773759,
"nic-gns",
"Latn",
type = "reconstructed",
}
m["nic-grf-pro"] = {
"Grassfields Purba",
116773755,
"nic-grf",
"Latn",
type = "reconstructed",
}
m["nic-gur-pro"] = {
"Gur Purba",
116773758,
"nic-gur",
"Latn",
type = "reconstructed",
}
m["nic-jkn-pro"] = {
"Jukunoid Purba",
116773769,
"nic-jkn",
"Latn",
type = "reconstructed",
}
m["nic-lcr-pro"] = {
"Lower Cross River Purba",
116773782,
"nic-lcr",
"Latn",
type = "reconstructed",
}
m["nic-ogo-pro"] = {
"Ogoni Purba",
116773799,
"nic-ogo",
"Latn",
type = "reconstructed",
}
m["nic-ovo-pro"] = {
"Oti-Volta Purba",
116773802,
"nic-ovo",
"Latn",
type = "reconstructed",
}
m["nic-plt-pro"] = {
"Plateau Purba",
116773805,
"nic-plt",
"Latn",
type = "reconstructed",
}
m["nic-pro"] = {
"Niger-Congo Purba",
108000748,
"nic",
"Latn",
type = "reconstructed",
}
m["nic-ubg-pro"] = {
"Ubangi Purba",
116773818,
"nic-ubg",
"Latn",
type = "reconstructed",
}
m["nic-ucr-pro"] = {
"Upper Cross River Purba",
116773819,
"nic-ucr",
"Latn",
type = "reconstructed",
}
m["nic-vco-pro"] = {
"Volta-Congo Purba",
116773293,
"nic-vco",
"Latn",
type = "reconstructed",
}
m["nub-har"] = {
"Haraza",
19572059,
"nub",
"Arab, Latn",
}
m["nub-pro"] = {
"Nubia Purba",
116773246,
"nub",
"Latn",
type = "reconstructed",
}
m["omq-cha-pro"] = {
"Chatino Purba",
116773202,
"omq-cha",
"Latn",
type = "reconstructed",
}
m["omq-maz-pro"] = {
"Mazatec Purba",
116773790,
"omq-maz",
"Latn",
type = "reconstructed",
}
m["omq-mix-pro"] = {
"Mixtecan Purba",
21573423,
"omq-mix",
"Latn",
type = "reconstructed",
}
m["omq-mxt-pro"] = {
"Mixtec Purba",
21573424,
"omq-mxt",
"Latn",
type = "reconstructed",
}
m["omq-otp-pro"] = {
"Oto-Pamean Purba",
116773251,
"omq-otp",
"Latn",
type = "reconstructed",
}
m["omq-pro"] = {
"Oto-Manguean Purba",
33669,
"omq",
"Latn",
type = "reconstructed",
}
m["omq-sjq"] = {
"San Juan Quiahije Chatino",
17003130,
"omq-cha",
"Latn",
}
m["omq-tel"] = {
"Teposcolula Mixtec",
nil,
"omq-mxt",
"Latn",
}
m["omq-teo"] = {
"Teojomulco Chatino",
25340451,
"omq-cha",
"Latn",
}
m["omq-tri-pro"] = {
"Trique Purba",
116773817,
"omq-tri",
"Latn",
type = "reconstructed",
}
m["omq-zap-pro"] = {
"Zapotecan Purba",
116773297,
"omq-zap",
"Latn",
type = "reconstructed",
}
m["omq-zpc-pro"] = {
"Zapotec Purba",
116773296,
"omq-zpc",
"Latn",
type = "reconstructed",
}
m["omv-aro-pro"] = {
"Aroid Purba",
116773721,
"omv-aro",
"Latn",
type = "reconstructed",
}
m["omv-diz-pro"] = {
"Dizoid Purba",
116773750,
"omv-diz",
"Latn",
type = "reconstructed",
}
m["omv-pro"] = {
"Omo Purba",
116773800,
"omv",
"Latn",
type = "reconstructed",
}
m["oto-otm-pro"] = {
"Otomi Purba",
5908710,
"oto-otm",
"Latn",
type = "reconstructed",
}
m["oto-pro"] = {
"Otomian Purba",
116773252,
"oto",
"Latn",
type = "reconstructed",
}
m["paa-kom"] = {
"Kómnzo",
18344310,
"paa-yam",
"Latn",
}
m["paa-kwn"] = {
"Kuwani",
6449056,
"paa",
"Latn",
}
m["paa-nha-pro"] = {
"Halmahera Utara Purba",
116773241,
"paa-nha",
"Latn",
type = "reconstructed",
}
m["paa-nun"] = {
"Nungon",
128807788,
"paa",
"Latn",
}
m["phi-din"] = {
"Dinapigue Agta",
16945774,
"phi",
"Latn",
}
m["phi-kal-pro"] = {
"Kalamian Purba",
116773213,
"phi-kal",
"Latn",
type = "reconstructed",
}
m["phi-nag"] = {
"Nagtipunan Agta",
16966111,
"phi",
"Latn",
}
m["phi-pro"] = {
"Filipina Purba",
18204898,
"phi",
"Latn",
type = "reconstructed",
}
m["poz-abi"] = {
"Abai",
19570729,
"poz-san",
"Latn",
}
m["poz-bal"] = {
"Baliledo",
4850912,
"poz",
"Latn",
}
m["poz-btk-pro"] = {
"Bungku-Tolaki Purba",
116773724,
"poz-btk",
"Latn",
type = "reconstructed",
}
m["poz-cet-pro"] = {
"Melayu-Polinesia Tengah Timur Purba",
2269883,
"poz-cet",
"Latn",
type = "reconstructed",
}
m["poz-hce-pro"] = {
"Halmahera Cenderawasih Purba",
116773209,
"poz-hce",
"Latn",
type = "reconstructed",
}
m["poz-lgx-pro"] = {
"Lampung Purba",
116773222,
"poz-lgx",
"Latn",
type = "reconstructed",
}
m["poz-mcm-pro"] = {
"Melayu-Chamik Purba",
116773225,
"poz-mcm",
"Latn",
type = "reconstructed",
}
m["poz-mic-pro"] = {
"Mikronesia Purba",
111939079,
"poz-mic",
"Latn",
type = "reconstructed",
}
m["poz-mly-pro"] = {
"Melayik Purba",
98057728,
"poz-mly",
"Latn",
type = "reconstructed",
}
m["poz-msa-pro"] = {
"Melayu-Sumbawa Purba",
116773226,
"poz-msa",
"Latn",
type = "reconstructed",
}
m["poz-oce-pro"] = {
"Oceania Purba",
141741,
"poz-oce",
"Latn",
type = "reconstructed",
}
m["poz-pep-pro"] = {
"Polinesia Timur Purba",
113988745,
"poz-pep",
"Latn",
type = "reconstructed",
}
m["poz-pnp-pro"] = {
"Polinesia Teras Purba",
113988746,
"poz-pnp",
"Latn",
type = "reconstructed",
}
m["poz-pol-pro"] = {
"Polinesia Purba",
1658709,
"poz-pol",
"Latn",
type = "reconstructed",
}
m["poz-pro"] = {
"Melayu-Polinesia Purba",
3832960,
"poz",
"Latn",
type = "reconstructed",
}
m["poz-sml"] = {
"Melayu Sarawak",
4251702,
"poz-mly",
"Latn, ms-Arab",
}
m["poz-ssw-pro"] = {
"Sulawesi Selatan Purba",
116773279,
"poz-ssw",
"Latn",
type = "reconstructed",
}
m["poz-sus-pro"] = {
"Sunda-Sulawesi Purba",
116773281,
"poz-sus",
"Latn",
type = "reconstructed",
}
m["poz-swa-pro"] = {
"Sarawak Utara Purba",
116773243,
"poz-swa",
"Latn",
type = "reconstructed",
}
m["poz-ter"] = {
"Melayu Terengganu",
4207412,
"poz-mly",
"Latn, ms-Arab",
}
m["pqe-pro"] = {
"Melayu-Polinesia Timur Purba",
2269883,
"pqe",
"Latn",
type = "reconstructed",
}
m["pra-niy"] = {
"Prakrit Niya",
11991601,
"inc-mid",
"Khar",
ancestors = "inc-ash",
translit = "Khar-translit",
}
m["qfa-adm-pro"] = {
"Andaman Raya Purba",
116773756,
"qfa-adm",
"Latn",
type = "reconstructed",
}
m["qfa-bet-pro"] = {
"Be-Tai Purba",
116773193,
"qfa-bet",
"Latn",
type = "reconstructed",
}
m["qfa-cka-pro"] = {
"Chukotko-Kamchatka Purba",
7251837,
"qfa-cka",
"Latn",
type = "reconstructed",
}
m["qfa-hur-pro"] = {
"Hurro-Urartu Purba",
116773211,
"qfa-hur",
"Latn",
type = "reconstructed",
}
m["qfa-kad-pro"] = {
"Kadu Purba",
116773770,
"qfa-kad",
"Latn",
type = "reconstructed",
}
m["qfa-kms-pro"] = {
"Kam-Sui Purba",
55630682,
"qfa-kms",
"Latn",
type = "reconstructed",
}
m["qfa-kor-pro"] = {
"Korea Purba",
467883,
"qfa-kor",
"Latn",
type = "reconstructed",
}
m["qfa-kra-pro"] = {
"Kra Purba",
7251854,
"qfa-kra",
"Latn",
type = "reconstructed",
}
m["qfa-lic-pro"] = {
"Hlai Purba",
7251845,
"qfa-lic",
"Latn",
type = "reconstructed",
}
m["qfa-onb-pro"] = {
"Be Purba",
116773192,
"qfa-onb",
"Latn",
type = "reconstructed",
}
m["qfa-ong-pro"] = {
"Ongan Purba",
116773801,
"qfa-ong",
"Latn",
type = "reconstructed",
}
m["qfa-tak-pro"] = {
"Kra-Dai Purba",
104901616,
"qfa-tak",
"Latn",
type = "reconstructed",
}
m["qfa-yen-pro"] = {
"Yenisei Purba",
27639,
"qfa-yen",
"Latn",
type = "reconstructed",
}
m["qfa-yuk-pro"] = {
"Yukaghir Purba",
116773294,
"qfa-yuk",
"Latn",
type = "reconstructed",
}
m["qwe-kch"] = {
"Kichwa",
1740805,
"qwe",
"Latn",
ancestors = "qu",
}
m["qwe-pro"] = {
"Quechua Purba",
5575757,
"qwe",
"Latn",
type = "reconstructed",
}
m["roa-ang"] = {
"Angevin",
56782,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-bbn"] = {
"Bourbonnais-Berrichon",
2899128,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-brg"] = {
"Bourguignon",
508332,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-cha"] = {
"Champenois",
430018,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-fcm"] = {
"Franc-Comtois",
510561,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gal"] = {
"Gallo",
37300,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-gib"] = {
"Gallo-Italic of Basilicata",
3094838,
"roa-git",
"Latn",
}
m["roa-gis"] = {
"Gallo-Italic of Sicily",
2629019,
"roa-git",
"Latn",
}
m["roa-leo"] = {
"Leon",
34108,
"roa-ibe",
"Latn",
ancestors = "roa-ole",
}
m["roa-lor"] = {
"Lorrain",
671198,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-oan"] = {
"Navarro-Aragon",
2736184,
"roa-ibe",
"Latn",
}
m["roa-oca"] = {
"Catalonia Kuno",
15478520,
"roa-ocr",
"Latn",
sort_key = {
from = {"à", "[èé]", "[íï]", "[òó]", "[úü]", "ç", "·"},
to = {"a", "e", "i", "o", "u", "c"}
},
}
m["roa-ole"] = {
"Leon Kuno",
125977465,
"roa-ibe",
"Latn",
}
m["roa-opt"] = {
"Galicia-Portugis Kuno",
1072111,
"roa-ibe",
"Latn",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ},
}
m["roa-orl"] = {
"Orléanais",
28497058,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-poi"] = {
"Poitevin-Saintongeais",
514123,
"roa-oil",
"Latn",
sort_key = s["roa-oil-sortkey"],
}
m["roa-tar"] = {
"Tarantino",
695526,
"roa-itd",
"Latn",
ancestors = "nap",
wikimedia_codes = "roa-tara",
}
m["sai-all"] = {
"Allentiac",
19570789,
"sai-hrp",
"Latn",
}
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
"Andoquero",
16828359,
"sai-wit",
"Latn",
}
m["sai-ayo"] = {
"Ayomán",
16937754,
"sai-jir",
"Latn",
}
m["sai-bae"] = {
"Baenan",
3401998,
nil,
"Latn",
}
m["sai-bag"] = {
"Bagua",
5390321,
nil,
"Latn",
}
m["sai-bet"] = {
"Betoi",
926551,
"qfa-iso",
"Latn",
}
m["sai-bor-pro"] = {
"Boran Purba",
nil,
"sai-bor",
"Latn",
type = "reconstructed",
}
m["sai-cac"] = {
"Cacán",
945482,
nil,
"Latn",
}
m["sai-caq"] = {
"Caranqui",
2937753,
"sai-bar",
"Latn",
}
m["sai-car-pro"] = {
"Cariban Purba",
116773196,
"sai-car",
"Latn",
type = "reconstructed",
}
m["sai-cat"] = {
"Catacao",
5051136,
"sai-ctc",
"Latn",
}
m["sai-cer-pro"] = {
"Cerrado Purba",
116773200,
"sai-cer",
"Latn",
type = "reconstructed",
}
m["sai-chi"] = {
"Chirino",
5390321,
nil,
"Latn",
}
m["sai-chn"] = {
"Chaná",
5072718,
"sai-crn",
"Latn",
}
m["sai-chp"] = {
"Chapacura",
5072884,
"sai-cpc",
"Latn",
}
m["sai-chr"] = {
"Charrua",
5086680,
"sai-crn",
"Latn",
}
m["sai-chu"] = {
"Churuya",
5118339,
"sai-guh",
"Latn",
}
m["sai-cje-pro"] = {
"Jê Tengah Purba",
116773198,
"sai-cje",
"Latn",
type = "reconstructed",
}
m["sai-cmg"] = {
"Comechingon",
6644203,
nil,
"Latn",
}
m["sai-cno"] = {
"Chono",
5104704,
nil,
"Latn",
}
m["sai-cnr"] = {
"Cañari",
5055572,
nil,
"Latn",
}
m["sai-coe"] = {
"Coeruna",
6425639,
"sai-wit",
"Latn",
}
m["sai-col"] = {
"Colán",
5141893,
"sai-ctc",
"Latn",
}
m["sai-cop"] = {
"Copallén",
5390321,
nil,
"Latn",
}
m["sai-crd"] = {
"Coroado Puri",
24191321,
"sai-mje",
"Latn",
}
m["sai-ctq"] = {
"Catuquinaru",
16858455,
nil,
"Latn",
}
m["sai-cul"] = {
"Culli",
2879660,
nil,
"Latn",
}
m["sai-cva"] = {
"Cueva",
5192644,
nil,
"Latn",
}
m["sai-esm"] = {
"Esmeralda",
3058083,
nil,
"Latn",
}
m["sai-ewa"] = {
"Ewarhuyana",
16898104,
nil,
"Latn",
}
m["sai-gam"] = {
"Gamela",
5403661,
nil,
"Latn",
}
m["sai-gay"] = {
"Gayón",
5528902,
"sai-jir",
"Latn",
}
m["sai-gmo"] = {
"Guamo",
5613495,
nil,
"Latn",
}
m["sai-gue"] = {
"Güenoa",
5626799,
"sai-crn",
"Latn",
}
m["sai-hau"] = {
"Haush",
3128376,
"sai-cho",
"Latn",
}
m["sai-jee-pro"] = {
"Jê Purba",
116773212,
"sai-jee",
"Latn",
type = "reconstructed",
}
m["sai-jko"] = {
"Jeikó",
6176527,
"sai-mje",
"Latn",
}
m["sai-jrj"] = {
"Jirajara",
6202966,
"sai-jir",
"Latn",
}
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
"Katembri",
6375925,
nil,
"Latn",
}
m["sai-mal"] = {
"Malalí",
6741212,
nil,
"Latn",
}
m["sai-mar"] = {
"Maratino",
6755055,
nil,
"Latn",
}
m["sai-mat"] = {
"Matanawi",
6786047,
nil,
"Latn",
}
m["sai-mcn"] = {
"Mocana",
3402048,
nil,
"Latn",
}
m["sai-men"] = {
"Menien",
16890110,
"sai-mje",
"Latn",
}
m["sai-mil"] = {
"Millcayac",
19573012,
"sai-hrp",
"Latn",
}
m["sai-mlb"] = {
"Malibu",
3402048,
nil,
"Latn",
}
m["sai-msk"] = {
"Masakará",
6782426,
"sai-mje",
"Latn",
}
m["sai-muc"] = {
"Mucuchí",
6931290,
nil,
"Latn",
}
m["sai-mue"] = {
"Muellama",
16886936,
"sai-bar",
"Latn",
}
m["sai-muz"] = {
"Muzo",
6644203,
nil,
"Latn",
}
m["sai-mys"] = {
"Maynas",
16919393,
nil,
"Latn",
}
m["sai-nat"] = {
"Natú",
9006749,
nil,
"Latn",
}
m["sai-nje-pro"] = {
"Jê Utara Purba",
116773245,
"sai-nje",
"Latn",
type = "reconstructed",
}
m["sai-opo"] = {
"Opón",
7099152,
"sai-car",
"Latn",
}
m["sai-oto"] = {
"Otomaco",
16879234,
"sai-otm",
"Latn",
}
m["sai-pal"] = {
"Palta",
3042978,
nil,
"Latn",
}
m["sai-pam"] = {
"Pamigua",
5908689,
"sai-otm",
"Latn",
}
m["sai-par"] = {
"Paratió",
16890038,
nil,
"Latn",
}
m["sai-pnz"] = {
"Panzaleo",
3123275,
nil,
"Latn",
}
m["sai-prh"] = {
"Puruhá",
3410994,
nil,
"Latn",
}
m["sai-ptg"] = {
"Patagón",
128807870,
nil,
"Latn",
}
m["sai-pur"] = {
"Purukotó",
7261622,
"sai-pem",
"Latn",
}
m["sai-pyg"] = {
"Payaguá",
7156643,
"sai-guc",
"Latn",
}
m["sai-pyk"] = {
"Pykobjê",
98113977,
"sai-nje",
"Latn",
}
m["sai-qmb"] = {
"Quimbaya",
7272043,
nil,
"Latn",
}
m["sai-qtm"] = {
"Quitemo",
7272651,
"sai-cpc",
"Latn",
}
m["sai-rab"] = {
"Rabona",
6644203,
nil,
"Latn",
}
m["sai-ram"] = {
"Ramanos",
16902824,
nil,
"Latn",
}
m["sai-sac"] = {
"Sácata",
5390321,
nil,
"Latn",
}
m["sai-san"] = {
"Sanaviron",
16895999,
nil,
"Latn",
}
m["sai-sap"] = {
"Sapará",
7420922,
"sai-car",
"Latn",
}
m["sai-sec"] = {
"Sechura",
7442912,
nil,
"Latn",
}
m["sai-sin"] = {
"Sinúfana",
7525275,
nil,
"Latn",
}
m["sai-sje-pro"] = {
"Jê Selatan Purba",
116773814,
"sai-sje",
"Latn",
type = "reconstructed",
}
m["sai-tab"] = {
"Tabancale",
5390321,
nil,
"Latn",
}
m["sai-tal"] = {
"Tallán",
16910468,
nil,
"Latn",
}
m["sai-tap"] = {
"Tapayuna",
30719984,
"sai-nje",
"Latn",
}
m["sai-tar-pro"] = {
"Taranoan Purba",
116773816,
"sai-tar",
"Latn",
type = "reconstructed",
}
m["sai-teu"] = {
"Teushen",
3519243,
nil,
"Latn",
}
m["sai-tim"] = {
"Timote",
7806995,
nil,
"Latn",
}
m["sai-tpr"] = {
"Taparita",
7684460,
"sai-otm",
"Latn",
}
m["sai-trr"] = {
"Tarairiú",
7685313,
nil,
"Latn",
}
m["sai-wai"] = {
"Waitaká",
16918610,
nil,
"Latn",
}
m["sai-way"] = {
"Wayumara",
7960726,
"sai-car",
"Latn",
}
m["sai-wit-pro"] = {
"Witotoan Purba",
116773823,
"sai-wit",
"Latn",
type = "reconstructed",
}
m["sai-wnm"] = {
"Wanham",
16879440,
"sai-cpc",
"Latn",
}
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
"Xocó",
12953620,
nil,
"Latn",
}
m["sai-yao"] = {
"Yao (Amerika Selatan)",
16979655,
"sai-ven",
"Latn",
}
m["sai-yar"] = { -- not the same family as 'suy'
"Yarumá",
3505859,
"sai-pek",
"Latn",
}
m["sai-yri"] = {
"Yuri",
2669157,
"sai-tyu",
"Latn",
}
m["sai-yup"] = {
"Yupua",
8061430,
"sai-tuc",
"Latn",
}
m["sai-yur"] = {
"Yurumanguí",
1281291,
nil,
"Latn",
}
m["sal-pro"] = {
"Salish Purba",
116773269,
"sal",
"Latn",
type = "reconstructed",
}
m["sdv-daj-pro"] = {
"Daju Purba",
116773739,
"sdv-daj",
"Latn",
type = "reconstructed",
}
m["sdv-eje-pro"] = {
"Jabal Timur Purba",
116773751,
"sdv-eje",
"Latn",
type = "reconstructed",
}
m["sdv-nil-pro"] = {
"Nil Purba",
116773794,
"sdv-nil",
"Latn",
type = "reconstructed",
}
m["sdv-nyi-pro"] = {
"Nyima Purba",
116773796,
"sdv-nyi",
"Latn",
type = "reconstructed",
}
m["sdv-tmn-pro"] = {
"Taman Purba",
116773815,
"sdv-tmn",
"Latn",
type = "reconstructed",
}
m["sel-nor"] = {
"Selkup Utara",
30304565,
"sel",
"Cyrl",
translit = "sel-nor-translit",
}
m["sel-pro"] = {
"Selkup Purba",
128884235,
"sel",
"Latn",
type = "reconstructed",
}
m["sel-sou"] = {
"Selkup Selatan",
30304639,
"sel",
"Cyrl",
}
m["sem-amm"] = {
"Ammun",
279181,
"sem-can",
"Phnx",
translit = "Phnx-translit",
}
m["sem-amo"] = {
"Amorit",
35941,
"sem-nwe",
"Xsux, Latn",
}
m["sem-cha"] = {
"Chaha",
35543,
"sem-eth",
"Ethi",
translit = "Ethi-translit",
}
m["sem-dad"] = {
"Dadanitic",
21838040,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-dum"] = {
"Dumaitic",
128810397,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-has"] = {
"Hasaitic",
3541433,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-his"] = {
"Hismaic",
22948260,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-mhr"] = {
"Muher",
33743,
"sem-eth",
"Latn",
}
m["sem-pro"] = {
"Samiah Purba",
1658554,
"sem",
"Latn",
type = "reconstructed",
}
m["sem-saf"] = {
"Safaitic",
472586,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-srb"] = {
"Arab Selatan Kuno",
35025,
"sem-osa",
"Sarb",
translit = "Sarb-translit",
}
m["sem-tay"] = {
"Taymanitic",
24912301,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-tha"] = {
"Thamudic",
843030,
"sem-cen",
"Narb",
translit = "Narb-translit",
}
m["sem-wes-pro"] = {
"Samiah Barat Purba",
98021726,
"sem-wes",
"Latn",
type = "reconstructed",
}
m["sio-pro"] = { -- NB this is not Siouan-Catawban 'nai-sca-pro'
"Sioux Purba",
34181,
"sio",
"Latn",
type = "reconstructed",
}
m["sit-bai-pro"] = {
"Bai Purba",
nil,
"sit-bai",
"Latn",
type = "reconstructed",
}
m["sit-bok"] = {
"Bokar",
4938727,
"sit-tan",
"Latn, Tibt",
translit = {Tibt = "Tibt-translit"},
override_translit = true,
display_text = {Tibt = s["Tibt-displaytext"]},
entry_name = {Tibt = s["Tibt-entryname"]},
sort_key = {Tibt = "Tibt-sortkey"},
}
m["sit-cai"] = {
"Caijia",
5017528,
"sit-cln",
"Latn"
}
m["sit-cha"] = {
"Chairel",
5068066,
"sit-luu",
"Latn",
}
m["sit-hrs-pro"] = {
"Hrusish Purba",
116773762,
"sit-hrs",
"Latn",
type = "reconstructed",
}
m["sit-jap"] = {
"Japhug",
3162245,
"sit-rgy",
"Latn",
}
m["sit-kha-pro"] = {
"Kham Purba",
116773773,
"sit-kha",
"Latn",
type = "reconstructed",
}
m["sit-liz"] = {
"Lizu",
6660653,
"sit-qia",
"Latn", -- and Ersu Shaba
}
m["sit-lnj"] = {
"Longjia",
17096251,
"sit-cln",
"Latn"
}
m["sit-lrn"] = {
"Luren",
16946370,
"sit-cln",
"Latn"
}
m["sit-luu-pro"] = {
"Luish Purba",
116773783,
"sit-luu",
"Latn",
type = "reconstructed",
}
m["sit-prn"] = {
"Puiron",
7259048,
"sit-zem",
}
m["sit-pro"] = {
"Sino-Tibet Purba",
45961,
"sit",
"Latn",
type = "reconstructed",
}
m["sit-sit"] = {
"Situ",
19840830,
"sit-rgy",
"Latn",
}
m["sit-tam-pro"] = {
"Tamang Purba",
117469295,
"sit-tam",
"Latn",
type = "reconstructed",
}
m["sit-tan-pro"] = {
"Tani Purba",
116773284,
"sit-tan",
"Latn", -- needs verification
type = "reconstructed",
}
m["sit-tgm"] = {
"Tangam",
17041370,
"sit-tan",
"Latn",
}
m["sit-tos"] = {
"Tosu",
7827899,
"sit-qia",
"Latn", -- also Ersu Shaba
}
m["sit-tsh"] = {
"Tshobdun",
19840950,
"sit-rgy",
"Latn",
}
m["sit-zbu"] = {
"Zbu",
19841106,
"sit-rgy",
"Latn",
}
m["sla-pro"] = {
"Slavik Purba",
747537,
"sla",
"Latn",
type = "reconstructed",
entry_name = {
remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve,
remove_exceptions = {'ś'},
},
sort_key = {
from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"},
to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"},
}
}
m["smi-pro"] = {
"Sami Purba",
7251862,
"smi",
"Latn",
type = "reconstructed",
sort_key = {
from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"},
to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"}
},
}
m["son-pro"] = {
"Songhay Purba",
116773277,
"son",
"Latn",
type = "reconstructed",
}
m["sqj-pro"] = {
"Albania Purba",
18210846,
"sqj",
"Latn",
type = "reconstructed",
}
m["ssa-klk-pro"] = {
"Kuliak Purba",
116773779,
"ssa-klk",
"Latn",
type = "reconstructed",
}
m["ssa-kom-pro"] = {
"Koman Purba",
116773775,
"ssa-kom",
"Latn",
type = "reconstructed",
}
m["ssa-pro"] = {
"Nilo-Sahara Purba",
116773236,
"ssa",
"Latn",
type = "reconstructed",
}
m["syd-fne"] = {
"Forest Nenets",
1295107,
"syd",
"Cyrl",
translit = "syd-fne-translit",
entry_name = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. c.dotabove},
}
m["syd-pro"] = {
"Samoyed Purba",
7251863,
"syd",
"Latn",
type = "reconstructed",
}
m["tai-pro"] = {
"Tai Purba",
6583709,
"tai",
"Latn",
type = "reconstructed",
}
m["tai-swe-pro"] = {
"Tai Barat Daya Purba",
116773280,
"tai-swe",
"Latn",
type = "reconstructed",
}
m["tbq-bdg-pro"] = {
"Bodo-Garo Purba",
116773195,
"tbq-bdg",
"Latn",
type = "reconstructed",
}
m["tbq-blg"] = {
"Bailang",
2879843,
"tbq-lob",
"Hani",
sort_key = "Hani-sortkey",
}
m["tbq-gkh"] = {
"Gokhy",
5578069,
"tbq-sil",
"Latn",
}
m["tbq-kuk-pro"] = {
"Kukish Purba",
116773220,
"tbq-kuk",
"Latn",
type = "reconstructed",
}
m["tbq-lal-pro"] = {
"Lalo Purba",
116773781,
"tbq-lal",
"Latn",
type = "reconstructed",
}
m["tbq-laz"] = {
"Laze",
17007626,
"sit-nas",
"Latn",
}
m["tbq-lob-pro"] = {
"Lolo-Burma Purba",
116773224,
"tbq-lob",
"Latn",
type = "reconstructed",
}
m["tbq-lol-pro"] = {
"Lolo Purba",
7251855,
"tbq-lol",
"Latn",
type = "reconstructed",
}
m["tbq-mil"] = {
"Milang",
6850761,
"sit-gsi",
"Deva, Latn",
}
m["tbq-mor"] = {
"Moran",
6909216,
"tbq-bdg",
"Latn",
}
m["tbq-ngo"] = {
"Ngochang",
56582,
"tbq-brm",
"Latn",
}
-- tbq-pro is now etymology-only
m["trk-dkh"] = {
"Dukhan",
12809273,
"trk-ssb",
"Latn, Cyrl, Mong",
translit = {Mong = "Mong-translit"},
display_text = {Mong = s["Mong-displaytext"]},
entry_name = {Mong = s["Mong-entryname"]},
}
m["trk-oat"] = {
"Turki Anatolia Kuno",
7083390,
"trk-ogz",
"ota-Arab",
entry_name = {["ota-Arab"] = "ar-entryname"},
}
m["trk-pro"] = {
"Turk Purba",
3657773,
"trk",
"Latn",
type = "reconstructed",
}
m["tup-gua-pro"] = {
"Tupi-Guarani Purba",
116773288,
"tup-gua",
"Latn",
type = "reconstructed",
}
m["tup-kab"] = {
"Kabishiana",
15302988,
"tup",
"Latn",
}
m["tup-pro"] = {
"Tupi Purba",
10354700,
"tup",
"Latn",
type = "reconstructed",
}
m["tuw-alk"] = {
"Alchuka",
113553616,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-bal"] = {
"Bala",
86730632,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kkl"] = {
"Kyakala",
118875708,
"tuw-jrc",
"Latn, Hans",
sort_key = {Hans = "Hani-sortkey"},
}
m["tuw-kli"] = {
"Kili",
6406892,
"tuw-ewe",
"Cyrl",
}
m["tuw-pro"] = {
"Tungus Purba",
85872335,
"tuw",
"Latn",
type = "reconstructed",
}
m["tuw-sol"] = {
"Solon",
30004,
"tuw-ewe",
}
m["urj-fin-pro"] = {
"Finnik Purba",
11883720,
"urj-fin",
"Latn",
type = "reconstructed",
}
m["urj-koo"] = {
"Komi Kuno",
86679962,
"urj-prm",
"Perm, Cyrs",
translit = "urj-koo-translit",
sort_key = {Cyrs = s["Cyrs-sortkey"]},
}
m["urj-kuk"] = {
"Kukkuzi",
107410460,
"urj-fin",
"Latn",
ancestors = "vot",
}
m["urj-kya"] = {
"Komi-Yazva",
2365210,
"urj-prm",
"Cyrl",
translit = "kv-translit",
override_translit = true,
entry_name = {remove_diacritics = c.acute},
}
m["urj-mdv-pro"] = {
"Mordvin Purba",
116773232,
"urj-mdv",
"Latn",
type = "reconstructed",
}
m["urj-prm-pro"] = {
"Perm Purba",
116773257,
"urj-prm",
"Latn",
type = "reconstructed",
}
m["urj-pro"] = {
"Ural Purba",
288765,
"urj",
"Latn",
type = "reconstructed",
}
m["urj-ugr-pro"] = {
"Ugri Purba",
156631,
"urj-ugr",
"Latn",
type = "reconstructed",
}
m["xnd-pro"] = {
"Na-Dene Purba",
116773233,
"xnd",
"Latn",
type = "reconstructed",
}
m["xgn-pro"] = {
"Mongol Purba",
2493677,
"xgn",
"Latn",
type = "reconstructed",
sort_key = {
from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"},
to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]},
},
}
m["yok-bvy"] = {
"Yokuts Buena Vista",
4985474,
"yok",
"Latn",
}
m["yok-dly"] = {
"Yokuts Delta",
70923266,
"yok",
"Latn",
}
m["yok-gsy"] = {
"Gashowu",
3098708,
"yok",
"Latn",
}
m["yok-kry"] = {
"Yokuts Sungai Kings",
6413014,
"yok",
"Latn",
}
m["yok-nvy"] = {
"Yokuts Lembah Utara",
85789777,
"yok",
"Latn",
}
m["yok-ply"] = {
"Palewyami",
2387391,
"yok",
"Latn",
}
m["yok-svy"] = {
"Yokuts Lembah Selatan",
12642473,
"yok",
"Latn",
}
m["yok-tky"] = {
"Yokuts Tule-Kaweah",
7851988,
"yok",
"Latn",
}
m["ypk-pro"] = {
"Yupik Purba",
116773295,
"ypk",
"Latn",
type = "reconstructed",
}
m["zhx-min-pro"] = {
"Min Purba",
19646347,
"zhx-min",
"Latn",
type = "reconstructed",
}
m["zhx-sht"] = {
"Shaozhou Tuhua",
1920769,
"zhx",
"Nshu, Hants",
generate_forms = "zh-generateforms",
sort_key = {Hani = "Hani-sortkey"},
}
m["zhx-sic"] = {
"Sichuan",
2278732,
"zhx-man",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zhx-tai"] = {
"Taishan",
2208940,
"zhx-yue",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["zlw-mas"] = {
"Masurian",
489691,
"zlw-lch",
"Latn",
ancestors = "zlw-opl",
}
m["zle-ono"] = {
"Novgorodia Kuno",
162013,
"zle",
"Cyrs, Glag",
translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"},
entry_name = {Cyrs = s["Cyrs-entryname"]},
sort_key = {Cyrs = s["Cyrs-sortkey"]},
}
m["zle-ort"] = {
"Ruthenia Kuno",
13211,
"zle",
"Cyrs",
ancestors = "orv",
translit = "zle-ort-translit",
entry_name = {
remove_diacritics = s["Cyrs-entryname"].remove_diacritics,
remove_exceptions = {"Ї", "ї"}
},
sort_key = s["Cyrs-sortkey"],
}
m["zlw-ocs"] = {
"Czech Kuno",
593096,
"zlw",
"Latn",
}
m["zlw-opl"] = {
"Poland Kuno",
149838,
"zlw-lch",
"Latn",
entry_name = {remove_diacritics = c.ringabove},
}
m["zlw-osk"] = {
"Slovak Kuno",
12776676,
"zlw",
"Latn",
}
m["zlw-slv"] = {
"Slovincia",
36822,
"zlw-pom",
"Latn",
entry_name = "zlw-slv-entryname"
}
m["zlm-coa"] = {
"Melayu Terengganu Pesisir",
4207412,
"poz-mly",
"Latn, ms-Arab",
}
m["zlm-pah"] = {
"Melayu Pahang",
7310370,
"poz-mly",
"Latn",
}
return require("Module:languages").finalizeData(m, "language")
p5xmcz5s172d3fd3gtp8bq8eg2xvkws
Modul:languages/data/exceptional/extra
828
33778
281317
245803
2026-04-21T19:37:20Z
Hakimi97
2668
Updated to match the English Wiktionary counterpart (revision [[en:Special:Diff/89762527|89762527]]) (needs re-review)
281317
Scribunto
text/plain
local m = {}
m["aav-khs-pro"] = {
aliases = {"Proto-Khasic"},
}
m["aav-nic-pro"] = {
}
m["aav-pkl-pro"] = {
}
m["aav-pro"] = { -- mkh-pro will merge into this.
}
m["afa-pro"] = {
aliases = {"Proto-Afro-Asiatic", "Hamito-Semitic"},
}
m["alg-aga"] = {
aliases = {"Agwam", "Agaam"},
}
m["alg-pro"] = {
}
m["alv-ama"] = {
}
m["alv-bgu"] = {
aliases = {"Baïnounk Gubëeher", -- Wikipedia's name
"Gubeeher-Gufangor-Gubelor", -- Glottolog's name,
"Gubëeher", "Nyun Gubëeher", "Nun Gubëeher"}, -- N(y)un appears to be the family name
varieties = {"Gubeeher", "Gufangor", "Gubelor"},
}
m["alv-bua-pro"] = {
}
m["alv-cng-pro"] = {
}
m["alv-edk-pro"] = {
}
m["alv-edo-pro"] = {
}
m["alv-fli-pro"] = {
}
m["alv-gbe-pro"] = {
}
m["alv-gng-pro"] = {
}
m["alv-gtm-pro"] = {
aliases = {"Proto-Ghana-Togo Mountain"},
}
m["alv-gwa"] = {
}
m["alv-hei-pro"] = {
}
m["alv-ido-pro"] = {
}
m["alv-igb-pro"] = {
}
m["alv-kwa-pro"] = {
}
m["alv-mum-pro"] = {
}
m["alv-nup-pro"] = {
}
m["alv-pro"] = {
}
m["alv-von-pro"] = {
}
m["alv-yor-pro"] = {
}
m["alv-yrd-pro"] = {
}
m["apa-pro"] = {
aliases = {"Proto-Apache", "Proto-Southern Athabaskan"},
}
m["aql-pro"] = {
}
m["art-adu"] = {
aliases = {"Westron"},
}
m["art-bel"] = {
}
m["art-blk"] = {
}
m["art-bsp"] = {
}
m["art-com"] = {
}
m["art-dtk"] = {
}
m["art-elo"] = {
}
m["art-gld"] = {
}
m["art-lap"] = {
}
m["art-man"] = {
}
m["art-mun"] = {
}
m["art-nav"] = {
}
m["art-vlh"] = {
}
m["ath-nic"] = {
}
m["ath-pro"] = {
}
m["auf-pro"] = {
aliases = {"Proto-Arawan", "Proto-Arauan"},
}
m["aus-alu"] = {
other_names = {"Ogh-Alungul", "Alngula"},
}
m["aus-and"] = {
aliases = {"Adithinngithigh"},
}
m["aus-ang"] = {
other_names = {"Ogh-Anggula", "Anggula", "Ogh-Anggul", "Anggul"},
}
m["aus-arn-pro"] = {
}
m["aus-bra"] = {
aliases = {"Barranbinja", "Baranbinya", "Burranbinya", "Burrumbiniya", "Burrunbinya", "Barrumbinya", "Barren-binya", "Parran-binye"},
}
m["aus-brm"] = {
}
m["aus-cww-pro"] = {
}
m["aus-dal-pro"] = {
}
m["aus-guw"] = {
other_names = {"Gowar", "Goowar", "Gooar", "Guar", "Gowr-burra", "Ngugi", "Mugee", "Wogee", "Gnoogee", "Chunchiburri", "Booroo-geen-merrie"},
}
m["aus-lsw"] = {
aliases = {"Little Swanport Tasmanian"},
}
m["aus-mbi"] = {
other_names = {"Mbeiwum"},
}
m["aus-ngk"] = {
other_names = {"Ngkot", "Nggoth"},
}
m["aus-nyu-pro"] = {
}
m["aus-pam-pro"] = {
}
m["aus-tul"] = {
other_names = {"Dappil", "Dapil", "Toolooa", "Dulua", "Narung", "Dandan"},
}
m["aus-uwi"] = {
other_names = {"Uwinjmil"},
}
m["aus-wdj-pro"] = {
}
m["aus-won"] = {
}
m["aus-wul"] = {
other_names = {"Manbara", "Wulgurugaba", "Wulgurukaba", "Nhawalgaba"},
}
m["aus-ynk"] = { -- contrast nny
}
m["awd-amc-pro"] = {
other_names = {"Western Maipuran"},
}
m["awd-kmp-pro"] = {
other_names = {"Campa", "Kampan", "Campan", "Pre-Andine Maipurean"},
}
m["awd-prw-pro"] = {
other_names = {"Paresí-Waurá", "Parecí–Xingú", "Paresí–Xingu", "Central Arawak", "Central Maipurean"},
}
m["awd-ama"] = {
}
m["awd-ana"] = {
aliases = {"Anauya"},
}
m["awd-apo"] = {
other_names = {"Lapachu"},
}
m["awd-cab"] = {
aliases = {"Cabere", "Cávere", "Cavere"},
}
m["awd-gnu"] = {
other_names = {"Guinao", "Inao", "Guniare", "Quinhau", "Guiano"},
}
m["awd-kar"] = {
aliases = {"Kariaí", "Kariai", "Cariyai", "Carihiahy"},
}
m["awd-kaw"] = {
aliases = {"Cawishana", "Cayuishana", "Kaishana", "Cauixana"},
}
m["awd-kus"] = {
aliases = {"Kustenaú", "Custenau", "Kutenabu"},
}
m["awd-man"] = {
}
m["awd-mar"] = {
aliases = {"Marawán"},
}
m["awd-mpr"] = {
aliases = {"Maypure", "Mejepure"},
}
m["awd-mrt"] = {
aliases = {"Mariate"},
}
m["awd-nwk-pro"] = {
aliases = {"Proto-Newiki"},
}
m["awd-pai"] = {
aliases = {"Paiconeca", "Paikone", "Paicone"},
}
m["awd-pas"] = {
aliases = {"Passé", "Pazé"},
}
m["awd-pro"] = {
other_names = {"Proto-Arawakan", "Proto-Maipurean", "Proto-Maipuran"},
}
m["awd-she"] = {
aliases = {"Shebaya", "Shebaye"},
}
m["awd-taa-pro"] = {
other_names = {"Proto-Ta-Arawakan", "Proto-Caribbean Northern Arawak"},
}
m["awd-wai"] = {
other_names = {"Wainuma", "Wai", "Waima", "Wainumi", "Wainambí", "Waiwana", "Waipi", "Yanuma"},
}
m["awd-yum"] = {
aliases = {"Jumana"},
}
m["azc-caz"] = {
aliases = {"Caxcan", "Kaskán"},
}
m["azc-cup-pro"] = {
}
m["azc-ktn"] = {
aliases = {"Gitanemuk"},
}
m["azc-nah-pro"] = {
}
m["azc-num-pro"] = {
}
m["azc-pro"] = {
}
m["azc-tak-pro"] = {
}
m["azc-tat"] = {
}
m["ber-fog"] = {
other_names = {"El-Fogaha", "El-Foqaha", "Foqaha", "Fuqaha"},
}
m["ber-pro"] = {
}
m["ber-zuw"] = {
}
m["bnt-bal"] = {
}
m["bnt-bon"] = {
}
m["bnt-boy"] = {
}
m["bnt-bwa"] = {
}
m["bnt-cmw"] = {
other_names = {"Bravanese", "Mwiini", "Mwini", "Chimwini", "Chimini", "Brava"},
}
m["bnt-ind"] = {
other_names = {"Kɔlɔmɔnyi", "Kɔlɛ", "Kasaï Oriental"},
}
m["bnt-lal"] = {
}
m["bnt-mpi"] = {
}
m["bnt-mpu"] = {
}
m["bnt-ngu-pro"] = {
}
m["bnt-phu"] = {
aliases = {"Siphuthi"},
}
m["bnt-pro"] = {
}
m["bnt-sab-pro"] = {
}
m["bnt-sbo"] = {
}
m["bnt-sts-pro"] = {
}
m["btk-pro"] = {
}
m["cau-abz-pro"] = {
other_names = {"Proto-Abazgi", "Proto-Abkhaz-Tapanta"},
}
m["cau-and-pro"] = {
aliases = {"Proto-Andi", "Proto-Andic"},
}
m["cau-ava-pro"] = {
aliases = {"Proto-Avar-Andian", "Proto-Avar-Andi", "Proto-Avar-Andic"},
}
m["cau-cir-pro"] = {
other_names = {"Proto-Adyghe-Kabardian", "Proto-Adyghe-Circassian"},
}
m["cau-drg-pro"] = {
other_names = {"Proto-Dargin"},
}
m["cau-lzg-pro"] = {
aliases = {"Proto-Lezgi", "Proto-Lezgian", "Proto-Lezgic"},
}
m["cau-nec-pro"] = {
}
m["cau-nkh-pro"] = {
}
m["cau-nwc-pro"] = {
}
m["cau-tsz-pro"] = {
other_names = {"Proto-Tsezic", "Proto-Didoic"},
}
m["cba-ata"] = {
other_names = {"Atanque", "Cancuamo", "Kankuamo", "Kankwe", "Kankuí", "Atanke"},
}
m["cba-cat"] = {
other_names = {"Catio Chibcha", "Old Catio"},
}
m["cba-dor"] = {
other_names = {"Chumulu", "Changuena", "Changuina", "Chánguena", "Gualaca"},
}
m["cba-dui"] = {
}
m["cba-hue"] = {
other_names = {"Güetar", "Guetar", "Brusela"},
}
m["cba-nut"] = {
other_names = {"Nutabane"},
}
m["cba-pro"] = {
}
m["ccs-pro"] = {
}
m["ccs-gzn-pro"] = {
aliases = {"Proto-Karto-Zan"},
}
m["cdc-cbm-pro"] = {
aliases = {"Proto-Central-Chadic", "Proto-Biu-Mandara"},
}
m["cdc-mas-pro"] = {
}
m["cdc-pro"] = {
}
m["cdd-pro"] = {
}
m["cel-bry-pro"] = {
aliases = {"Proto-Brittonic", "Common Brythonic", "Common Brittonic"},
}
m["cel-gal"] = {
}
m["cel-gau"] = {
}
m["cel-pro"] = {
}
m["chi-pro"] = {
}
m["chm-pro"] = {
}
m["cmc-pro"] = {
}
m["crp-bip"] = {
}
m["crp-gep"] = {
aliases = {"Greenlandic Pidgin", "Greenlandic Eskimo Pidgin"},
}
m["crp-kia"] = {
aliases = {"Kiautschou Pidgin German"},
}
m["crp-mar"] = {
other_names = {"Jamaican Maroon Spirit Possession Language"},
}
m["crp-mpp"] = {
aliases = {"Macao Pidgin Portuguese"},
}
m["crp-rsn"] = {
}
m["crp-slb"] = {
other_names = {"Solombala-English", "Solombala English-Russian Pidgin"},
}
m["crp-spp"] = {
}
m["crp-tpr"] = {
}
m["csu-bba-pro"] = {
}
m["csu-maa-pro"] = {
}
m["csu-pro"] = {
}
m["csu-sar-pro"] = {
}
m["cus-ash"] = {
other_names = {"Ashraf", "Af-Ashraaf"},
varieties = { {"Marka, Lower Shabelle"}, "Shingani"},
}
m["cus-hec-pro"] = {
}
m["cus-som-pro"] = {
aliases = {"Proto-Sam", "Proto-Macro-Somali"},
}
m["cus-sou-pro"] = {
other_names = {"Proto-Rift"},
}
m["cus-pro"] = {
}
m["dmn-dam"] = {
}
m["dra-bry"] = {
aliases = {"Byari"},
}
m["dra-cen-pro"] = {
}
m["dra-mkn"] = {
aliases = {"Nadugannada"},
}
m["dra-nor-pro"] = {
}
m["dra-okn"] = {
aliases = {"Halegannada"},
}
m["dra-ote"] = {
}
m["dra-pro"] = {
}
m["dra-sdo-pro"] = {
aliases = {"Proto-South Dravidian"},
}
m["dra-sdt-pro"] = {
aliases = {"Proto-South-Central Dravidian"},
}
m["dra-sou-pro"] = {
aliases = {"Proto-Southern Dravidian"},
}
m["egx-dem"] = {
aliases = {"Demotic", "Enchorial"},
}
m["dmn-pro"] = {
}
m["dmn-mdw-pro"] = {
}
m["dru-pro"] = {
}
m["ero-gsz"] = {
}
m["ero-nya"] = {
}
m["ero-tau"] = {
other_names = {"Rtau"},
}
m["esx-esk-pro"] = {
}
m["esx-ink"] = {
}
m["esx-inq"] = {
}
m["esx-inu-pro"] = {
}
m["esx-pro"] = {
}
m["esx-tut"] = {
}
m["euq-pro"] = {
aliases = {"Proto-Vasconic"},
}
m["gba-pro"] = {
}
m["gem-pro"] = {
aliases = {"Common Germanic"},
}
m["gme-bur"] = {
aliases = {"Burgundish", "Burgundic"},
}
m["gme-cgo"] = {
}
m["gmq-gut"] = {
}
m["gmq-jmk"] = {
aliases = {"Jamtlandic"},
}
m["gmq-mno"] = {
}
m["gmq-oda"] = {
}
m["gmq-ogt"] = {
aliases = {"Old Gotlandic"},
}
m["gmq-osw"] = {
}
m["gmq-pro"] = {
aliases = {"Proto-Scandinavian", "Primitive Norse", "Proto-Nordic",
"Ancient Nordic", "Ancient Scandinavian", "Old Nordic", "Old Scandinavian",
"Proto-North Germanic", "North Proto-Germanic", "Common Scandinavian"},
}
m["gmq-scy"] = {
}
m["gmw-bgh"] = {
}
m["gmw-cfr"] = {
varieties = {"Mittelfränkisch", "Ripuarian", "Moselle Franconian", "Colognian", "Kölsch"},
}
m["gmw-ecg"] = {
varieties = {"Thuringian", "Thüringisch", "Upper Saxon", "Upper Saxon German", "Obersächsisch", "Lusatian", "Erzgebirgisch", "Silesian", "Silesian German", "High Prussian"},
}
m["gmw-fin"] = {
aliases = {"Fingal"},
}
m["gmw-gts"] = {
aliases = {"Gottscheerisch"},
}
m["gmw-jdt"] = {
}
m["gmw-msc"] = {
}
m["gmw-pro"] = {
}
m["gmw-rfr"] = {
aliases = {"Rheinfränkisch", "Rhenish Franconian"},
varieties = {"Hessian", "Lorraine Franconian", "Lorrainian", "Lothringisch", "Palatine German", "Pfälzisch", "Pälzisch", "Palatinate German"},
}
m["gmw-stm"] = {
aliases = {"Satu Mare Swabian", "Sathmarschwäbisch", "Sathmarisch"},
}
m["gmw-tsx"] = {
aliases = {"Siebenbürger Saxon"},
}
m["gmw-vog"] = {
}
m["gmw-zps"] = {
aliases = {"Zipser", "Zipserisch", "Outzäpsersch"},
}
m["gn-cls"] = {
}
m["grk-cal"] = {
aliases = {"Italian Greek", "Bova"},
}
m["grk-ita"] = {
aliases = {"Griko", "Grico", "Grecanic"},
}
m["grk-mar"] = {
aliases = {"Mariupolitan Greek", "Rumeíka", "Rumeika"},
}
m["grk-pro"] = {
aliases = {"Proto-Greek"},
}
m["hmn-pro"] = {
}
m["hmx-mie-pro"] = {
}
m["hmx-pro"] = {
}
m["hyx-pro"] = {
}
m["iir-nur-pro"] = {
}
m["iir-pro"] = {
}
m["ijo-pro"] = {
aliases = {"Proto-Ijaw"},
}
m["inc-apa"] = {
aliases = {"Apabhraṃśa"},
}
m["inc-ash"] = {
aliases = {"Asokan Prakrit", "Aśokan Prakrit"},
}
m["inc-dng-pro"] = {
}
m["inc-kam"] = {
}
m["inc-kho"] = {
}
m["inc-krd-pro"] = {
aliases = {"Proto-Kamata"},
}
m["inc-mas"] = {
}
m["inc-mbn"] = {
}
m["inc-mgu"] = {
}
m["inc-mor"] = {
aliases = {"Middle Oriya"},
}
m["inc-oas"] = {
}
m["inc-oaw"] = {
aliases = {"Early Awadhi"},
}
m["inc-obn"] = {
}
m["inc-ogu"] = {
aliases = {"Old Western Rajasthani"},
}
m["inc-ohi"] = {
aliases = {"Dehlavi"},
}
m["inc-oor"] = {
aliases = {"Old Oriya"},
}
m["inc-opa"] = {
}
m["inc-pro"] = {
}
m["ine-ana-pro"] = {
}
m["ine-bsl-pro"] = {
}
m["ine-kal"] = {
aliases = {"Kalasma", "Kalashma", "Kalašmaic", "Kalasmaic", "Kalašmian", "Kalasmian"},
}
m["ine-pae"] = {
}
m["ine-pro"] = {
}
m["ine-toc-pro"] = {
}
m["xme-old"] = {
}
m["xme-mid"] = {
aliases = {"Atropatenian"},
}
m["xme-ker"] = {
other_names = {"Kermanian", "Central Iranian Dialects", "Central Plateau Dialects", "Central Iranian", "South Median", "Gazi", "Soi", "Sohi", "Abuzeydabadi", "Abyanehi", "Farizandi", "Jowshaqani", "Nashalji", "Qohrudi", "Yarandi", "Tari", "Sedehi", "Ardestani", "Zefrehi", "Isfahani", "Kafroni", "Varzenehi", "Khuri", "Nayini", "Anaraki", "Zoroastrian Dari", "Behdināni", "Behdinani", "Gabri", "Gavrŭni", "Gavruni", "Gabrōni", "Gabroni", "Kermani", "Yazdi", "Bidhandi", "Bijagani", "Chimehi", "Hanjani", "Komjani", "Naraqi", "Qalhari", "Varani", "Zori"},
}
m["xme-taf"] = {
}
m["xme-ttc-pro"] = {
}
m["xme-kls"] = {
aliases = {"Kalāsuri", "Kalasur", "Kalāsur"},
}
m["xme-klt"] = {
}
m["xme-ott"] = {
other_names = {"Old Tatic", "Old Azeri", "Azari", "Azeri", "Āḏarī", "Adari", "Adhari"},
}
m["ira-kms-pro"] = {
}
m["ira-mpr-pro"] = {
}
m["ira-pat-pro"] = {
}
m["ira-pro"] = {
}
m["ira-zgr-pro"] = {
}
m["xsc-pro"] = {
}
m["xsc-sar-pro"] = {
}
m["xsc-skw-pro"] = {
}
m["xsc-sak-pro"] = {
aliases = {"Proto-Sakan", "Proto-Tumshuqese-Khotanese"},
}
m["ira-sym-pro"] = {
}
m["ira-sgi-pro"] = {
}
m["ira-mny-pro"] = {
}
m["ira-shy-pro"] = {
}
m["ira-shr-pro"] = {
}
m["ira-sgc-pro"] = {
aliases = {"Proto-Sogdian"},
}
m["ira-wnj"] = {
aliases = {"Old Vanji", "Vanchi", "Vanži", "Wanji"},
}
m["iro-ere"] = {
}
m["iro-min"] = {
}
m["iro-nor-pro"] = {
}
m["iro-pro"] = {
}
m["itc-pro"] = {
}
m["itc-psa"] = {
}
m["jpx-hcj"] = {
aliases = {"Hachijo"},
}
m["jpx-pro"] = {
}
m["jpx-ryu-pro"] = {
}
m["kar-pro"] = {
}
m["kca-eas"] = {
}
m["kca-nor"] = {
}
m["kca-pro"] = {
}
m["kca-sou"] = {
}
m["khi-kho-pro"] = {
}
m["khi-kun"] = {
other_names = {"ǃOǃKung", "ǃ'OǃKung", "Kung", "Ekoka ǃKung", "Ekoka Kung", "Sekele"},
}
m["ko-ear"] = {
}
m["kro-pro"] = {
}
m["ku-pro"] = {
}
m["map-ata-pro"] = {
}
m["map-bms"] = {
}
m["map-pro"] = {
}
m["mis-hkl"] = {
aliases = {"Kelantan Peranakan Chinese", "Hokkien Kelantan", "Kelantan Local Hokkien"}
}
m["mis-idn"] = {
}
m["mis-isa"] = {
}
m["mis-jie"] = {
aliases = {"Chieh", "Kjet"},
}
m["mis-jzh"] = {
aliases = {"Haihua"},
}
m["mis-kas"] = {
aliases = {"Cassite", "Kassitic", "Kaššite"},
}
m["mis-mmd"] = {
other_names = {"Mimi of Gaudefroy-Demombynes", "Mimi-D"},
}
m["mis-mmn"] = {
other_names = {"Mimi-N"},
}
m["mis-phi"] = {
aliases = {"Philistian", "Philistinian"},
}
m["mis-rou"] = {
aliases = {"Ruanruan", "Ruan-ruan", "Juan-juan"},
}
m["mis-tdl"] = {
aliases = {"Turduli"},
}
m["mis-tdt"] = {
aliases = {"Turdetani"},
}
m["mis-tnw"] = {
aliases = {"Tangwanghua"},
}
m["mis-tuh"] = {
aliases = {"'Azha"},
}
m["mis-tuo"] = {
aliases = {"Tabghach", "Taghbach"},
}
m["mis-wuh"] = {
aliases = {"Wuwan", "Awar"},
}
m["mis-xbi"] = {
aliases = {"Serbi", "Shirwi"},
}
m["mis-xnu"] = {
aliases = {"Hsiung-nu", "Hiong-nu"},
}
m["mjg-mgl"] = {
aliases = {"Huzhu", "Huzhu Monguor"},
}
m["mjg-mgr"] = {
aliases = {"Minhe", "Minhe Monguor"},
}
m["mkh-asl-pro"] = {
}
m["mkh-ban-pro"] = {
}
m["mkh-kat-pro"] = {
}
m["mkh-khm-pro"] = {
}
m["mkh-kmr-pro"] = {
}
m["mkh-mmn"] = {
}
m["mkh-mnc-pro"] = {
}
m["mkh-mvi"] = {
}
m["mkh-pal-pro"] = {
}
m["mkh-pea-pro"] = {
}
m["mkh-pkn-pro"] = {
}
m["mkh-pro"] = { --This will be merged into 2015 aav-pro.
}
m["mnw-tha"] = {
aliases = {"Raman", "Thai Raman", "Siamese Mon"},
}
m["mkh-vie-pro"] = {
}
m["mns-cen"] = {
}
m["mns-nor"] = {
}
m["mns-pro"] = {
}
m["mns-sou"] = {
}
m["mun-pro"] = {
aliases = {"Proto-Mundan"},
}
m["myn-chl"] = { -- the stage after ''emy''
other_names = {"Cholti", "Colonial Ch'olti'", "Colonial Cholti"},
}
m["myn-pro"] = {
aliases = {"Proto-Maya"},
}
m["nai-ala"] = {
other_names = {"Alasapa", "Pinto"},
}
m["nai-bay"] = {
other_names = {"Bayougoula", "Bayou Goula", "Ischenoca"}, -- tribe merged with "Mougulasha", "Mongoulacha", "Mugulasha", "Mougulasha", "Muglahsa", "Muglasha", "Muguasha", "Imongolosha", "Houma", "Acolapissa"
}
m["nai-cal"] = {
}
m["nai-chi"] = {
}
m["nai-chu-pro"] = {
aliases = {"Proto-Chumashan"},
}
m["nai-cig"] = {
}
m["nai-ckn-pro"] = {
aliases = {"Proto-Chinook"},
}
m["nai-guz"] = {
aliases = {"Guazacapan"},
}
m["nai-hit"] = {
other_names = {"Atcik-hata", "At-pasha-shliha"},
}
m["nai-ipa"] = {
other_names = {"'Iipay 'aa", "Northern Diegueño", "Diegueño"},
}
m["nai-jtp"] = {
other_names = {"Xutiapa", "Jalapa", "Xalapa"},
}
m["nai-jum"] = {
aliases = {"Jumaitepeque", "Jumaytepec"},
}
m["nai-kat"] = {
other_names = {"Kathlamet Chinook"},
}
m["nai-klp-pro"] = {
}
m["nai-knm"] = {
}
m["nai-kum"] = {
other_names = {"Kumiai", "Central Diegueño", "Diegueño"},
}
m["nai-mac"] = {
aliases = {"Macorís", "Macorix", "Mazorij", "Mazorig", "Mazoriges"},
}
m["nai-mdu-pro"] = {
aliases = {"Proto-Maiduan"},
}
m["nai-miz-pro"] = {
aliases = {"Proto-Mixe-Zoquean"},
}
m["nai-mus-pro"] = {
aliases = {"Proto-Muskhogean", "Proto-Muskogee"},
}
m["nai-nao"] = {
}
m["nai-nrs"] = {
}
m["nai-okw"] = {
}
m["nai-per"] = {
}
m["nai-pic"] = {
}
m["nai-plp-pro"] = {
}
m["nai-pom-pro"] = {
aliases = {"Proto-Pomoan"},
}
m["nai-qng"] = {
}
m["nai-sca-pro"] = { -- NB: not to be confused with 'sio-pro' "Proto-Siouan", which is Proto-Western Siouan
}
m["nai-sin"] = {
aliases = {"Sinacantan", "Zinacantán", "Zinacantan"},
}
m["nai-sln"] = {
}
m["nai-spt"] = {
aliases = {"Shahaptin"},
}
m["nai-tap"] = {
other_names = {"Tapachulteca", "Tapachulteco", "Tapachula"},
}
m["nai-taw"] = {
}
m["nai-teq"] = {
other_names = {"Tequistlateco", "Tequistlateca", "Chontal", "Chontol of Oaxaca", "Oaxaca Chontal", "Oaxacan Chontal"},
}
m["nai-tip"] = {
other_names = {"Tipay", "Tiipai", "Tiipay", "Jamul Tiipay", "Southern Diegueño", "Diegueño"},
}
m["nai-tot-pro"] = {
}
m["nai-tsi-pro"] = {
}
m["nai-utn-pro"] = {
other_names = {"Proto-Miwok-Costanoan"},
}
m["nai-wai"] = {
aliases = {"Guaycura", "Waicura"},
}
m["nai-wji"] = {
other_names = {"Jicaque of El Palmar", "Sula"},
}
m["nai-yup"] = {
aliases = {"Jupiltepeque", "Yupiltepec", "Jupiltepec", "Xupiltepec"},
}
m["nan-dat"] = {
aliases = {"Datian"},
}
m["nan-hbl"] = {
aliases = {"Hokkienese", "Quanzhang", "Fukien", "Banlam", "Banlamese", "Ban-lam"},
}
m["nan-hlh"] = {
aliases = {"Hailufeng", "Hoklo Min", "Hai Lok Hong"},
}
m["nan-lnx"] = {
aliases = {"Longyan", "Liongna"},
}
m["nan-tws"] = {
aliases = {"Teochew Min", "Chiuchow", "Teo-Swa", "Teo-Swa Min", "Tio-Sua"},
}
m["nan-zhe"] = {
aliases = {"Zhenan"},
}
m["nan-zsh"] = {
aliases = {"Sanxiang", "Samheung", "Sahiu"},
}
m["ngf-pro"] = {
}
m["nic-bco-pro"] = {
}
m["nic-bod-pro"] = {
}
m["nic-eov-pro"] = {
}
m["nic-gns-pro"] = {
}
m["nic-grf-pro"] = {
}
m["nic-gur-pro"] = {
}
m["nic-jkn-pro"] = {
}
m["nic-lcr-pro"] = {
}
m["nic-ogo-pro"] = {
}
m["nic-ovo-pro"] = {
}
m["nic-plt-pro"] = {
}
m["nic-pro"] = {
}
m["nic-ubg-pro"] = {
}
m["nic-ucr-pro"] = {
}
m["nic-vco-pro"] = {
}
m["njo-jgl"] = {
}
m["nub-har"] = {
aliases = {"Ḥarāza"},
}
m["nub-pro"] = {
}
m["omq-cha-pro"] = {
}
m["omq-maz-pro"] = {
aliases = {"Proto-Mazatecan"},
}
m["omq-mix-pro"] = {
}
m["omq-mxt-pro"] = {
}
m["omq-otp-pro"] = {
}
m["omq-pro"] = {
aliases = {"Proto-Otomanguean", "Proto-Oto-Mangue"},
}
m["omq-sjq"] = {
}
m["omq-tel"] = {
}
m["omq-teo"] = {
}
m["omq-tri-pro"] = {
aliases = {"Proto-Trique"},
}
m["omq-zap-pro"] = {
}
m["omq-zpc-pro"] = {
}
m["omv-aro-pro"] = {
}
m["omv-diz-pro"] = {
aliases = {"Proto-Maji"},
}
m["omv-pro"] = {
}
m["oto-otm-pro"] = {
}
m["oto-pro"] = {
}
m["ngf-bin-pro"] = {
}
m["paa-kmn"] = {
aliases = {"Komnzo", "Kómnjo", "Komnjo", "Kamundjo", "Rouku"},
}
m["paa-kwn"] = {
}
m["paa-lei"] = {
}
m["paa-nha-pro"] = {
}
m["paa-nun"] = {
}
m["phi-din"] = {
}
m["phi-kal-pro"] = {
aliases = {"Proto-Calamian"},
}
m["phi-nag"] = {
}
m["phi-pro"] = {
}
m["poz-abi"] = {
other_names = {"Sembuak", "Tubu"},
}
m["poz-bal"] = {
}
m["poz-btk-pro"] = {
}
m["poz-cet-pro"] = {
}
m["poz-hce-pro"] = {
other_names = {"Proto-South Halmahera - West New Guinea"},
}
m["poz-lgx-pro"] = {
}
m["poz-mcm-pro"] = {
}
m["poz-mic-pro"] = {
}
m["poz-mly-pro"] = {
}
m["poz-msa-pro"] = {
}
m["poz-oce-pro"] = {
}
m["poz-pep-pro"] = {
aliases = {"Proto-Eastern-Polynesian", "Proto-East Polynesian", "Proto-East-Polynesian"},
}
m["poz-pnp-pro"] = {
}
m["poz-pol-pro"] = {
}
m["poz-pro"] = {
other_names = {"Proto-Western Malayo-Polynesian"}, -- Western is subsumed into general Proto-MP
}
m["poz-sml"] = {
aliases = {"Sarawak"},
}
m["poz-ssw-pro"] = {
}
m["poz-swa-pro"] = {
}
m["poz-ter"] = {
aliases = {"Terengganu"},
}
m["pqe-pro"] = {
}
m["pra-niy"] = {
}
m["qfa-adm-pro"] = {
}
m["qfa-bet-pro"] = {
aliases = {"Proto-Tai-Be"},
}
m["qfa-cka-pro"] = {
}
m["qfa-hur-pro"] = {
}
m["qfa-kad-pro"] = {
}
m["qfa-kms-pro"] = {
}
m["qfa-kor-pro"] = {
}
m["qfa-kra-pro"] = {
}
m["qfa-lic-pro"] = {
}
m["qfa-onb-pro"] = {
aliases = {"Proto-Ong-Be", "Proto-Bê"},
}
m["qfa-ong-pro"] = {
}
m["qfa-tak-pro"] = {
aliases = {"Proto-Tai-Kadai"},
}
m["qfa-yen-pro"] = {
}
m["qfa-yuk-pro"] = {
}
m["qwe-kch"] = {
other_names = {"Kichwa shimi", "Runashimi", "Runa", "Quichua", "Quecha", "Inga", "Chimborazo", "Imbabura Highland Kichwa", "Cañar Highland Quecha", "Quechua"},
}
m["qwe-pro"] = {
}
m["roa-ang"] = {
other_names = {"Craonnais", "Baugeois", "Saumurois"},
}
m["roa-bbn"] = {
other_names = {"Bourbonnais", "Berrichon", "Moulins", "Allier", "Nivernais", "Haut-Berrichon", "Bas-Berrichon"},
}
m["roa-brg"] = {
other_names = {"Burgundian", "Bregognon", "Dijonnais", "Morvandiau", "Morvandeau", "Morvan", "Bourguignon-Morvandiau", "Mâconnais", "Brionnais", "Brionnais-Charolais", "Auxerrois", "Beaunois", "Langrois", "Valsaônois", "Verduno-Chalonnais", "Sédelocien"},
}
m["roa-can"] = {
}
m["roa-cha"] = {
other_names = {"Bassignot", "Langrois", "Sennonais", "Vallage", "Troyen", "Briard", "Der", "Perthois", "Rémois", "Argonnais", "Porcien", "Ardennais", "Sugny"},
}
m["roa-fcm"] = {
other_names = {"Frainc-Comtou", "Comtois", "Jurassien", "Ajoulot", "Vâdais", "Taignon", "Bisontin", "Bousbot"},
}
m["roa-gal"] = {
}
m["roa-gib"] = {
}
m["roa-gis"] = {
}
m["roa-leo"] = {
}
m["roa-lor"] = {
other_names = {"Gaumais", "Vosgien", "Welche", "Argonnais", "Longovicien", "Messin", "Nancéien", "Spinalien", "Déodatien"},
}
m["roa-oca"] = {
aliases = {"Medieval Catalan"},
}
m["roa-ole"] = {
aliases = {"Medieval Leonese"},
}
m["roa-ona"] = {
aliases = {"Navarro-Aragonese", "Medieval Navarro-Aragonese", "Old Aragonese", "Medieval Aragonese"},
}
m["roa-opt"] = {
aliases = {"Old Galician Portuguese", "Old Galician–Portuguese", "Old Galician", "Old Portuguese", "Galician-Portuguese", "Galician Portuguese", "Galician–Portuguese", "Medieval Galician-Portuguese", "Medieval Galician Portuguese", "Medieval Galician–Portuguese", "Medieval Galician", "Medieval Portuguese", "Galaic-Portuguese"},
}
m["roa-orl"] = {
other_names = {"Beauceron", "Solognot", "Gâtinais", "Blaisois", "Vendômois"},
}
m["roa-poi"] = {
other_names = {"Poitevin", "Saintongeais", "Maraîchin"},
}
m["roa-tar"] = {
}
m["sai-all"] = {
other_names = {"Alyentiyak", "Huarpe", "Warpe"},
}
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
other_names = {"Miranya", "Miranha", "Miranha Carapana-Tapuya", "Miraña-Carapana-Tapuyo", "Andokero", "Miranya-Karapana-Tapuyo", "Miraña", "Carapana"},
}
m["sai-ayo"] = {
aliases = {"Ayoman", "Ayamán", "Ayaman"},
}
m["sai-bae"] = {
aliases = {"Baenã", "Baenán", "Baena"},
}
m["sai-bag"] = {
other_names = {"Patagón de Bagua"},
}
m["sai-bet"] = {
other_names = {"Betoy", "Betoya", "Betoye", "Betoi-Jirara", "Jirara"},
}
m["sai-bor-pro"] = {
other_names = {"Proto-Bora-Muinane", "Proto-Bora-Muiname"},
}
m["sai-cac"] = {
other_names = {"Kakán", "Diaguita", "Cacan", "Kakan", "Calchaquí", "Chaka", "Kaka", "Kaká", "Caca", "Caca-Diaguita", "Catamarcano", "Capayán", "Capayana", "Yacampis"},
}
m["sai-caq"] = {
other_names = {"Cara", "Kara"},
}
m["sai-car-pro"] = {
}
m["sai-cat"] = {
}
m["sai-cer-pro"] = {
other_names = {"Proto-Amazonian Jê"},
}
m["sai-chi"] = {
}
m["sai-chn"] = {
aliases = {"Chana"},
}
m["sai-chp"] = {
aliases = {"Txapacura", "Xapacura", "Guapore", "Šapakura", "Txapakura", "Txapakúra", "Xapakúra"},
}
m["sai-chr"] = {
aliases = {"Charrúa", "Charruá"},
}
m["sai-chu"] = {
aliases = {"Churoya"},
}
m["sai-cje-pro"] = {
other_names = {"Proto-Akuwẽ"},
}
m["sai-cmg"] = {
aliases = {"Comechingón", "Comechingona", "Comechingone"},
}
m["sai-cno"] = {
other_names = {"Chonos", "Caucau"},
}
m["sai-cnr"] = {
aliases = {"Cañar"},
}
m["sai-coe"] = {
aliases = {"Koeruna"},
}
m["sai-col"] = {
aliases = {"Colan"},
}
m["sai-cop"] = {
}
m["sai-crd"] = {
other_names = {"Coroado"},
}
m["sai-ctq"] = {
aliases = {"Catuquinarú", "Katukinaru"},
}
m["sai-cul"] = {
other_names = {"Culle", "Kulyi", "Ilinga", "Linga"},
}
m["sai-cva"] = {
}
m["sai-esm"] = {
other_names = {"Esmeraldeño", "Atacame", "Takame"},
}
m["sai-ewa"] = {
}
m["sai-gam"] = {
aliases = {"Gamella", "Acobu", "Curinsi", "Barbados"},
}
m["sai-gay"] = {
aliases = {"Gayon"},
}
m["sai-gmo"] = {
other_names = {"Wamo", "Santa Rosa", "San Jose", "Barinas", "Guamotey", "Guama"},
}
m["sai-gua"] = {
aliases = {"Guachi", "Wachí", "Wachi"},
}
m["sai-gue"] = {
aliases = {"Guenoa"},
}
m["sai-hau"] = {
other_names = {"Manek'enk"},
}
m["sai-jee-pro"] = {
other_names = {"Proto-Gê", "Proto-Jean", "Proto-Gean", "Proto-Jê-Kaingang", "Proto-Ye"},
}
m["sai-jko"] = {
aliases = {"Geicó", "Jeicó", "Jaikó", "Geikó", "Yeikó", "Jeiko", "Geico", "Jeico", "Jaiko", "Geiko", "Yeiko", "Eyco"},
}
m["sai-jrj"] = {
}
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
other_names = {"Catrimbi", "Catembri", "Kariri de Mirandela", "Mirandela", "Kariri", "Kiriri"},
}
m["sai-mal"] = {
aliases = {"Malali"},
}
m["sai-mar"] = {
}
m["sai-mat"] = {
other_names = {"Matanauí", "Matanaui", "Matanawü", "Mitandua", "Moutoniway"},
}
m["sai-mcn"] = {
aliases = {"Mokana"},
}
m["sai-men"] = {
aliases = {"Menién"},
}
m["sai-mil"] = {
other_names = {"Milykayak", "Huarpe", "Warpe"},
}
m["sai-mlb"] = {
aliases = {"Malibú", "Malebú"},
}
m["sai-msk"] = {
aliases = {"Masakara", "Masacará", "Masacara"},
}
m["sai-muc"] = {
other_names = {"Mucuchi", "Mokochi", "Mocochí", "Mirripú", "Maripú", "Mucuchí-Maripú"},
}
m["sai-mue"] = {
aliases = {"Muellamués"},
}
m["sai-muz"] = {
}
m["sai-mys"] = {
other_names = {"Mayna", "Maina", "Rimachu"},
}
m["sai-nat"] = {
other_names = {"Natu", "Peagaxinan"},
}
m["sai-nje-pro"] = {
other_names = {"Proto-Core Jê"},
}
m["sai-opo"] = {
other_names = {"Opon", "Opón-Karare", "Opón-Carare", "Carare", "Carare-Opón"},
}
m["sai-oto"] = {
aliases = {"Otomako", "Otomacan", "Otomac", "Otomak"},
}
m["sai-pal"] = {
}
m["sai-pam"] = {
aliases = {"Pamiwa"},
}
m["sai-par"] = {
aliases = {"Paratio", "Prarto"},
}
m["sai-peb"] = {
aliases = {"Peva"},
varieties = {"Cauwachi", "Caumari", "Pacaya"}, -- per Wikipedia, according to the American anthropologist and linguist John Alden Mason (1950)
}
m["sai-pnz"] = {
aliases = {"Pansaleo"},
}
m["sai-prh"] = {
}
m["sai-ptg"] = {
other_names = {"Patagón de Perico"},
}
m["sai-pur"] = {
aliases = {"Purukoto", "Purucotó", "Purucoto"},
}
m["sai-pyg"] = {
aliases = {"Payawá", "Payagua"},
}
m["sai-pyk"] = {
aliases = {"Gavião-Pykobjê", "Pykobjê-Gavião", "Gavião", "Pyhcopji", "Gavião-Pyhcopji"},
}
m["sai-qmb"] = {
other_names = {"Kimbaya", "Quindío", "Quindio", "Quindo"},
}
m["sai-qtm"] = {
aliases = {"Quitemoca"},
}
m["sai-rab"] = {
}
m["sai-ram"] = {
}
m["sai-sac"] = {
other_names = {"Sacata", "Zácata", "Chillao"},
}
m["sai-san"] = {
aliases = {"Sanavirón", "Sanabirón", "Sanabiron", "Sanavirona", "Zanavirona"},
}
m["sai-sap"] = {
aliases = {"Zapará", "Zapara"},
}
m["sai-sec"] = {
other_names = {"Sek", "Sec"},
}
m["sai-sin"] = {
other_names = {"Cenúfana", "Zenúfana", "Cinifaná", "Sinufana", "Sinú", "Cenú", "Zenú", "Finzenú", "Fincenú", "Pancenú", "Sutagao"},
}
m["sai-sje-pro"] = {
}
m["sai-tab"] = {
other_names = {"Aconipa"},
}
m["sai-tal"] = {
other_names = {"Atalán", "Tallan", "Tallanca", "Atalan", "Sek"},
}
m["sai-tap"] = {
other_names = {"Tapayúna", "Kajkwakhrattxi"},
}
m["sai-tar-pro"] = {
}
m["sai-teu"] = {
aliases = {"Tehues", "Teuéx"},
}
m["sai-tim"] = {
other_names = {"Cuica", "Timote-Cuica"},
}
m["sai-tpr"] = {
aliases = {"Taparito"},
}
m["sai-trr"] = {
other_names = {"Caratiú"},
}
m["sai-wai"] = {
aliases = {"Waitaka", "Waitacá", "Waitaca", "Goytacá", "Goitacá", "Guaitacá", "Guiatacá", "Guiatacás", "Goiatacá", "Goiatacás", "Guaiatacá", "Goytacaz", "Goitacaz", "Goyataca", "Aitacaz", "Uetacaz", "Uetacá", "Outacá", "Ouetacá", "Eutacá", "Itacaz", "Vaitacá"},
}
m["sai-way"] = {
aliases = {"Wajumará", "Wajumara", "Wayumará", "Azumara", "Guimara"},
}
m["sai-wit-pro"] = {
other_names = {"Proto-Huitotoan", "Proto-Uitotoan"},
}
m["sai-wnm"] = {
other_names = {"Wañam", "Wanyam", "Huanyam", "Uanham", "Abitana"},
}
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
other_names = {"Xoco", "Chocó", "Shokó", "Shoko", "Shocó", "Shoco", "Choco", "Chocaz", "Kariri-Xocó", "Kariri-Xoco", "Kariri-Shoko", "Cariri-Chocó", "Xukuru-Kariri", "Xucuru-Kariri", "Xucuru-Cariri", "Xukurú-Kirirí"},
}
m["sai-yao"] = {
aliases = {"Yao", "Jaoi", "Yaoi", "Yaio", "Anacaioury"},
}
m["sai-yar"] = { -- not the same family as 'suy'
aliases = {"Yaruma"},
}
m["sai-yri"] = {
aliases = {"Jurí"},
}
m["sai-yup"] = {
other_names = {"Yupuá", "Yupúa", "Jupua", "Jupuá", "Jupúa", "Hiupiá", "Yupuá-Duriña", "Duriña"},
}
m["sai-yur"] = {
aliases = {"Yurumangui", "Yurimangí", "Yurimangi", "Yurimanguí", "Yurimangui"},
}
m["sal-pro"] = {
aliases = {"Proto-Salishan"},
}
m["sdv-daj-pro"] = {
}
m["sdv-eje-pro"] = {
}
m["sdv-nil-pro"] = {
}
m["sdv-nyi-pro"] = {
}
m["sdv-tmn-pro"] = {
}
m["sel-nor"] = {
aliases = {"Taz Selkup"},
}
m["sel-pro"] = {
}
m["sel-sou"] = {
}
m["sem-amm"] = {
}
m["sem-amo"] = {
aliases = {"Amoritic"},
}
m["sem-cha"] = {
aliases = {"Cheha", "Čäha", "Čäxa"},
}
m["sem-dad"] = {
other_names = {"Dadanite", "Lihyanite", "Lihyanitic"},
}
m["sem-dum"] = {
}
m["sem-has"] = {
}
m["sem-his"] = {
other_names = {"Thamudic E"},
}
m["sem-mhr"] = {
other_names = {"Muher Gurage", "Muxar", "Muxər", "Muhər", "Muḫər"},
}
m["sem-pro"] = {
}
m["sem-saf"] = {
}
m["sem-sam"] = {
other_names = {"Sam'alian"},
}
m["sem-srb"] = {
}
m["sem-tay"] = {
other_names = {"Taymanite", "Thamudic A"},
}
m["sem-tha"] = {
}
m["sem-wes-pro"] = {
}
m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro'
}
m["sit-aao-pro"] = {
}
m["sit-bok"] = {
other_names = {"Ramo", "Pailibo"},
}
m["sit-bai-pro"] = {
}
m["sit-ban"] = {
}
m["sit-bdi-pro"] = {
}
m["sit-cai"] = {
}
m["sit-cha"] = {
}
m["sit-ers-pro"] = {
}
m["sit-hrs-pro"] = {
}
m["sit-jap"] = {
other_names = {"Chabao", "Kuru"},
}
m["sit-kha-pro"] = {
}
m["sit-khb-pro"] = {
}
m["sit-khp-pro"] = {
}
m["sit-khw-pro"] = {
}
m["sit-kon-pro"] = {
}
m["sit-liz"] = {
}
m["sit-lnj"] = {
}
m["sit-lrn"] = {
}
m["sit-luu-pro"] = {
}
m["sit-nas-pro"] = {
}
m["sit-prn"] = {
}
m["sit-pro"] = {
}
m["sit-sit"] = {
other_names = {"Eastern rGyalrong", "rGyalrong", "Rgyalrong", "rGyalrongic", "Gyalrong", "Gyarong", "rGyarong", "Gyarung", "Jiarong", "Jiarongyu", "Jyarong", "Jyarung", "Yelong", "Kuru"},
}
m["sit-tam-pro"] = {
aliases = {"Proto-Tamang"},
}
m["sit-tan-pro"] = {
}
m["sit-tgm"] = {
}
m["sit-tng-pro"] = {
}
m["sit-tos"] = {
}
m["sit-tsh"] = {
other_names = {"Caodeng", "Sidaba", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"},
}
m["sit-zbu"] = {
other_names = {"Ribu", "Rdzong'bur", "Rdzongmbur", "Showu", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"},
}
m["sla-pro"] = {
aliases = {"Common Slavic"},
}
m["smi-pro"] = {
aliases = {"Proto-Sami"},
}
m["son-pro"] = {
aliases = {"Proto-Songhai"},
}
m["sqj-pro"] = {
}
m["ssa-klk-pro"] = {
aliases = {"Proto-Rub"},
}
m["ssa-kom-pro"] = {
}
m["ssa-pro"] = {
}
m["syd-pro"] = {
}
m["tai-pro"] = {
}
m["tai-swe-pro"] = {
}
m["tbq-bdg-pro"] = {
}
m["tbq-blg"] = {
aliases = {"Pai-lang", "Pailang"},
}
m["tbq-brm-pro"] = {
}
m["tbq-gkh"] = {
aliases = {"Gɔkhý", "Gɔkhy", "Gouke"},
}
m["tbq-kuk-pro"] = {
other_names = {"Proto-Kukish"},
}
m["tbq-lal-pro"] = {
}
m["tbq-laz"] = {
other_names = {"Lare", "Shuitianhua"},
}
m["tbq-lob-pro"] = {
}
m["tbq-lol-pro"] = {
aliases = {"Proto-Yi", "Proto-Ngwi", "Proto-Nisoic"},
}
m["tbq-mil"] = {
}
m["tbq-mor"] = {
aliases = {"Morān"},
}
m["tbq-ngo"] = {
other_names = {"Ngachang", "Achang"},
}
-- tbq-pro is now etymology-only
m["trk-dkh"] = {
aliases = {"Dukha"},
}
m["trk-eog"] = {
}
m["trk-oat"] = {
}
m["trk-pro"] = {
}
m["tup-gua-pro"] = {
}
m["tup-kab"] = {
aliases = {"Kabixiana", "Cabixiana", "Cabishiana", "Kapishana", "Capishana", "Kapišana", "Cabichiana", "Capichana", "Capixana"},
}
m["tuw-alk"] = {
aliases = {"Alechuka"},
}
m["tuw-bal"] = {
}
m["tuw-kkl"] = {
aliases = {"Chinese Kyakala"},
}
m["tuw-kli"] = {
aliases = {"Kilen", "Kirin", "Kila", "Hezhe", "Qile'en"},
}
m["tup-pro"] = {
}
m["tuw-pro"] = {
}
m["tuw-sol"] = {
}
m["urj-fin-pro"] = {
}
m["urj-koo"] = {
aliases = {"Old Permian"},
}
m["urj-kuk"] = {
aliases = {"Kukkuzi Votic", "Kukkuzi Ingrian", "Kukkusi"},
}
m["urj-kya"] = {
}
m["urj-mdv-pro"] = {
}
m["urj-prm-pro"] = {
}
m["urj-pro"] = {
other_names = {"Proto-Finno-Ugric", "Proto-Finno-Permic"}, -- PFU and PFP are subsumed into PU per [[Wiktionary:Beer parlour/2015/January#Merging Finno-Volgaic, Finno-Samic, Finno-Permic and Finno-Ugric into Uralic]]
}
m["urj-ugr-pro"] = {
}
m["xgn-pro"] = {
}
m["xnd-pro"] = {
other_names = {"Proto-Na-Dené", "Proto-Athabaskan-Eyak-Tlingit"},
}
m["yok-bvy"] = {
other_names = {"Tulamni-Hometwoli", "Tulamni", "Tulamne", "Tuolumne", "Tawitchi", "Hometwoli", "Taneshach"},
}
m["yok-dly"] = {
other_names = {"Far Northern Valley Yokuts", "Yachikumne", "Yachikumni", "Chulamni", "Lower San Joaquin", "Lakisamni", "Tawalimni"},
}
m["yok-gsy"] = {
}
m["yok-kry"] = {
other_names = {"Choinimni", "Choynimni", "Ayticha", "Kocheyali", "Ayitcha", "Michahay", "Chukaymina", "Chukaimina"},
}
m["yok-nvy"] = {
other_names = {"Chukchansi", "Kechayi", "Dumna", "Chawchila", "Noptinte", "Nopṭinṭe", "Nopthrinthre", "Nopchinchi", "Takin"},
}
m["yok-ply"] = {
other_names = {"Paleuyami", "Altinin", "Poso Creek", "Poso Creek Yokuts"},
}
m["yok-svy"] = {
other_names = {"Yawelmani", "Tachi", "Koyeti", "Nutunutu", "Chunut", "Wo'lasi", "Choynok", "Choinok", "Wechihit"},
}
m["yok-tky"] = {
other_names = {"Wikchamni", "Wukchamni", "Wukchumni", "Yawdanchi"},
}
m["ypk-pro"] = {
}
m["yrk-for"] = {
}
m["yrk-tun"] = {
other_names = {"Yurak"},
varieties = {
{ "Western Nenets" },
{ "Eastern Nenets" },
}
}
m["zhx-min-pro"] = {
}
m["zhx-sht"] = {
other_names = {"Xiangnan Tuhua", "Yuebei Tuhua", "Shipo", "Shina"},
}
m["zhx-sic"] = {
aliases = {"Sichuanese Mandarin"},
}
m["zhx-tai"] = {
aliases = {"Toishanese"},
}
m["zle-ono"] = {
}
m["zle-ort"] = {
}
m["zls-chs"] = {
}
m["zlw-ocs"] = {
}
m["zlw-opl"] = {
}
m["zlw-osk"] = {
}
m["zlw-slv"] = {
}
m["zlm-coa"] = {
}
m["zlm-pah"] = {
}
return m
h2a8ycwfcvvyuj273xn3e0ftvyjv8gi
281321
281317
2026-04-21T19:44:28Z
Hakimi97
2668
Undid revision [[Special:Diff/281317|281317]] by [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|talk]])
281321
Scribunto
text/plain
local m = {}
m["aav-khs-pro"] = {
aliases = {"Proto-Khasic"},
}
m["aav-nic-pro"] = {
}
m["aav-pkl-pro"] = {
}
m["aav-pro"] = { -- mkh-pro will merge into this.
}
m["afa-pro"] = {
aliases = {"Proto-Afro-Asiatic", "Hamito-Semitic"},
}
m["alg-aga"] = {
aliases = {"Agwam", "Agaam"},
}
m["alg-pro"] = {
}
m["alv-ama"] = {
}
m["alv-bgu"] = {
otherNames = {"Gubëeher", "Nyun Gubëeher", "Nun Gubëeher"},
}
m["alv-bua-pro"] = {
}
m["alv-cng-pro"] = {
}
m["alv-edk-pro"] = {
}
m["alv-edo-pro"] = {
}
m["alv-fli-pro"] = {
}
m["alv-gbe-pro"] = {
}
m["alv-gng-pro"] = {
}
m["alv-gtm-pro"] = {
aliases = {"Proto-Ghana-Togo Mountain"},
}
m["alv-gwa"] = {
}
m["alv-hei-pro"] = {
}
m["alv-ido-pro"] = {
}
m["alv-igb-pro"] = {
}
m["alv-kwa-pro"] = {
}
m["alv-mum-pro"] = {
}
m["alv-nup-pro"] = {
}
m["alv-pro"] = {
}
m["alv-von-pro"] = {
}
m["alv-yor-pro"] = {
}
m["alv-yrd-pro"] = {
}
m["apa-pro"] = {
aliases = {"Proto-Apache", "Proto-Southern Athabaskan"},
}
m["aql-pro"] = {
}
m["art-adu"] = {
aliases = {"Westron"},
}
m["art-bel"] = {
}
m["art-blk"] = {
}
m["art-bsp"] = {
}
m["art-com"] = {
}
m["art-dtk"] = {
}
m["art-elo"] = {
}
m["art-gld"] = {
}
m["art-lap"] = {
}
m["art-man"] = {
}
m["art-mun"] = {
}
m["art-nav"] = {
}
m["art-vlh"] = {
}
m["ath-nic"] = {
}
m["ath-pro"] = {
}
m["auf-pro"] = {
aliases = {"Proto-Arawan", "Proto-Arauan"},
}
m["aus-alu"] = {
otherNames = {"Ogh-Alungul", "Alngula"},
}
m["aus-and"] = {
aliases = {"Adithinngithigh"},
}
m["aus-ang"] = {
otherNames = {"Ogh-Anggula", "Anggula", "Ogh-Anggul", "Anggul"},
}
m["aus-arn-pro"] = {
}
m["aus-bra"] = {
aliases = {"Barranbinja", "Baranbinya", "Burranbinya", "Burrumbiniya", "Burrunbinya", "Barrumbinya", "Barren-binya", "Parran-binye"},
}
m["aus-brm"] = {
}
m["aus-cww-pro"] = {
}
m["aus-dal-pro"] = {
}
m["aus-guw"] = {
otherNames = {"Gowar", "Goowar", "Gooar", "Guar", "Gowr-burra", "Ngugi", "Mugee", "Wogee", "Gnoogee", "Chunchiburri", "Booroo-geen-merrie"},
}
m["aus-lsw"] = {
aliases = {"Little Swanport Tasmanian"},
}
m["aus-mbi"] = {
otherNames = {"Mbeiwum"},
}
m["aus-ngk"] = {
otherNames = {"Ngkot", "Nggoth"},
}
m["aus-nyu-pro"] = {
}
m["aus-pam-pro"] = {
}
m["aus-tul"] = {
otherNames = {"Dappil", "Dapil", "Toolooa", "Dulua", "Narung", "Dandan"},
}
m["aus-uwi"] = {
otherNames = {"Uwinjmil"},
}
m["aus-wdj-pro"] = {
}
m["aus-won"] = {
}
m["aus-wul"] = {
otherNames = {"Manbara", "Wulgurugaba", "Wulgurukaba", "Nhawalgaba"},
}
m["aus-ynk"] = { -- contrast nny
}
m["awd-amc-pro"] = {
otherNames = {"Western Maipuran"},
}
m["awd-kmp-pro"] = {
otherNames = {"Campa", "Kampan", "Campan", "Pre-Andine Maipurean"},
}
m["awd-prw-pro"] = {
otherNames = {"Paresí-Waurá", "Parecí–Xingú", "Paresí–Xingu", "Central Arawak", "Central Maipurean"},
}
m["awd-ama"] = {
}
m["awd-ana"] = {
aliases = {"Anauya"},
}
m["awd-apo"] = {
otherNames = {"Lapachu"},
}
m["awd-cab"] = {
aliases = {"Cabere", "Cávere", "Cavere"},
}
m["awd-gnu"] = {
otherNames = {"Guinao", "Inao", "Guniare", "Quinhau", "Guiano"},
}
m["awd-kar"] = {
aliases = {"Kariaí", "Kariai", "Cariyai", "Carihiahy"},
}
m["awd-kaw"] = {
aliases = {"Cawishana", "Cayuishana", "Kaishana", "Cauixana"},
}
m["awd-kus"] = {
aliases = {"Kustenaú", "Custenau", "Kutenabu"},
}
m["awd-man"] = {
}
m["awd-mar"] = {
aliases = {"Marawán"},
}
m["awd-mpr"] = {
aliases = {"Maypure", "Mejepure"},
}
m["awd-mrt"] = {
aliases = {"Mariate"},
}
m["awd-nwk-pro"] = {
aliases = {"Proto-Newiki"},
}
m["awd-pai"] = {
aliases = {"Paiconeca", "Paikone", "Paicone"},
}
m["awd-pas"] = {
aliases = {"Passé", "Pazé"},
}
m["awd-pro"] = {
otherNames = {"Proto-Arawakan", "Proto-Maipurean", "Proto-Maipuran"},
}
m["awd-she"] = {
aliases = {"Shebaya", "Shebaye"},
}
m["awd-taa-pro"] = {
otherNames = {"Proto-Ta-Arawakan", "Proto-Caribbean Northern Arawak"},
}
m["awd-wai"] = {
otherNames = {"Wainuma", "Wai", "Waima", "Wainumi", "Wainambí", "Waiwana", "Waipi", "Yanuma"},
}
m["awd-yum"] = {
aliases = {"Jumana"},
}
m["azc-caz"] = {
aliases = {"Caxcan", "Kaskán"},
}
m["azc-cup-pro"] = {
}
m["azc-ktn"] = {
aliases = {"Gitanemuk"},
}
m["azc-nah-pro"] = {
}
m["azc-num-pro"] = {
}
m["azc-pro"] = {
}
m["azc-tak-pro"] = {
}
m["azc-tat"] = {
}
m["ber-fog"] = {
otherNames = {"El-Fogaha", "El-Foqaha", "Foqaha", "Fuqaha"},
}
m["ber-pro"] = {
}
m["ber-zuw"] = {
}
m["bnt-bal"] = {
}
m["bnt-bon"] = {
}
m["bnt-boy"] = {
}
m["bnt-bwa"] = {
}
m["bnt-cmw"] = {
otherNames = {"Bravanese", "Mwiini", "Mwini", "Chimwini", "Chimini", "Brava"},
}
m["bnt-ind"] = {
otherNames = {"Kɔlɔmɔnyi", "Kɔlɛ", "Kasaï Oriental"},
}
m["bnt-lal"] = {
}
m["bnt-mpi"] = {
}
m["bnt-mpu"] = {
}
m["bnt-ngu-pro"] = {
}
m["bnt-phu"] = {
aliases = {"Siphuthi"},
}
m["bnt-pro"] = {
}
m["bnt-sbo"] = {
}
m["bnt-sts-pro"] = {
}
m["btk-pro"] = {
}
m["cau-abz-pro"] = {
otherNames = {"Proto-Abazgi", "Proto-Abkhaz-Tapanta"},
}
m["cau-and-pro"] = {
aliases = {"Proto-Andi", "Proto-Andic"},
}
m["cau-ava-pro"] = {
aliases = {"Proto-Avar-Andian", "Proto-Avar-Andi", "Proto-Avar-Andic"},
}
m["cau-cir-pro"] = {
otherNames = {"Proto-Adyghe-Kabardian", "Proto-Adyghe-Circassian"},
}
m["cau-drg-pro"] = {
otherNames = {"Proto-Dargin"},
}
m["cau-lzg-pro"] = {
aliases = {"Proto-Lezgi", "Proto-Lezgian", "Proto-Lezgic"},
}
m["cau-nec-pro"] = {
}
m["cau-nkh-pro"] = {
}
m["cau-nwc-pro"] = {
}
m["cau-tsz-pro"] = {
otherNames = {"Proto-Tsezic", "Proto-Didoic"},
}
m["cba-ata"] = {
otherNames = {"Atanque", "Cancuamo", "Kankuamo", "Kankwe", "Kankuí", "Atanke"},
}
m["cba-cat"] = {
otherNames = {"Catio Chibcha", "Old Catio"},
}
m["cba-dor"] = {
otherNames = {"Chumulu", "Changuena", "Changuina", "Chánguena", "Gualaca"},
}
m["cba-dui"] = {
}
m["cba-hue"] = {
otherNames = {"Güetar", "Guetar", "Brusela"},
}
m["cba-nut"] = {
otherNames = {"Nutabane"},
}
m["cba-pro"] = {
}
m["ccn-pro"] = {
}
m["ccs-pro"] = {
}
m["ccs-gzn-pro"] = {
aliases = {"Proto-Karto-Zan"},
}
m["cdc-cbm-pro"] = {
otherNames = {"Proto-Central-Chadic", "Proto-Biu-Mandara"},
}
m["cdc-mas-pro"] = {
}
m["cdc-pro"] = {
}
m["cdd-pro"] = {
}
m["cel-bry-pro"] = {
aliases = {"Proto-Brittonic", "Common Brythonic", "Common Brittonic"},
}
m["cel-gal"] = {
}
m["cel-gau"] = {
}
m["cel-pro"] = {
}
m["chi-pro"] = {
}
m["chm-pro"] = {
}
m["cmc-pro"] = {
}
m["crp-bip"] = {
}
m["crp-gep"] = {
aliases = {"Greenlandic Pidgin", "Greenlandic Eskimo Pidgin"},
}
m["crp-mar"] = {
otherNames = {"Jamaican Maroon Spirit Possession Language"},
}
m["crp-mpp"] = {
aliases = {"Macao Pidgin Portuguese"},
}
m["crp-rsn"] = {
}
m["crp-slb"] = {
otherNames = {"Solombala-English", "Solombala English-Russian Pidgin"},
}
m["crp-spp"] = {
}
m["crp-tpr"] = {
}
m["csu-bba-pro"] = {
}
m["csu-maa-pro"] = {
}
m["csu-pro"] = {
}
m["csu-sar-pro"] = {
}
m["cus-ash"] = {
otherNames = {"Ashraf", "Af-Ashraaf"},
varieties = { {"Marka, Lower Shabelle"}, "Shingani"},
}
m["cus-hec-pro"] = {
}
m["cus-som-pro"] = {
otherNames = {"Proto-Sam", "Proto-Macro-Somali"},
}
m["cus-sou-pro"] = {
otherNames = {"Proto-Rift"},
}
m["cus-pro"] = {
}
m["dmn-dam"] = {
}
m["dra-bry"] = {
aliases = {"Byari"},
}
m["dra-cen-pro"] = {
}
m["dra-mkn"] = {
aliases = {"Nadugannada"},
}
m["dra-nor-pro"] = {
}
m["dra-okn"] = {
aliases = {"Halegannada"},
}
m["dra-ote"] = {
}
m["dra-pro"] = {
}
m["dra-sdo-pro"] = {
aliases = {"Proto-South Dravidian"},
}
m["dra-sdt-pro"] = {
aliases = {"Proto-South-Central Dravidian"},
}
m["dra-sou-pro"] = {
aliases = {"Proto-Southern Dravidian"},
}
m["egx-dem"] = {
aliases = {"Demotic Egyptian", "Enchorial"},
}
m["dmn-pro"] = {
}
m["dmn-mdw-pro"] = {
}
m["dru-pro"] = {
}
m["esx-esk-pro"] = {
}
m["esx-ink"] = {
}
m["esx-inq"] = {
}
m["esx-inu-pro"] = {
}
m["esx-pro"] = {
}
m["esx-tut"] = {
}
m["euq-pro"] = {
aliases = {"Proto-Vasconic"},
}
m["gba-pro"] = {
}
m["gem-pro"] = {
aliases = {"Common Germanic"},
}
m["gme-bur"] = {
aliases = {"Burgundish", "Burgundic"},
}
m["gme-cgo"] = {
}
m["gmq-gut"] = {
}
m["gmq-jmk"] = {
aliases = {"Jamtlandic"},
}
m["gmq-mno"] = {
}
m["gmq-oda"] = {
}
m["gmq-ogt"] = {
aliases = {"Old Gotlandic"},
}
m["gmq-osw"] = {
}
m["gmq-pro"] = {
aliases = {"Proto-Scandinavian", "Primitive Norse", "Proto-Nordic",
"Ancient Nordic", "Ancient Scandinavian", "Old Nordic", "Old Scandinavian",
"Proto-North Germanic", "North Proto-Germanic", "Common Scandinavian"},
}
m["gmq-scy"] = {
}
m["gmw-bgh"] = {
}
m["gmw-cfr"] = {
varieties = {"Mittelfränkisch", "Ripuarian", "Moselle Franconian", "Colognian", "Kölsch"},
}
m["gmw-ecg"] = {
varieties = {"Thuringian", "Thüringisch", "Upper Saxon", "Upper Saxon German", "Obersächsisch", "Lusatian", "Erzgebirgisch", "Silesian", "Silesian German", "High Prussian"},
}
m["gmw-fin"] = {
aliases = {"Fingal"},
}
m["gmw-gts"] = {
aliases = {"Gottscheerisch"},
}
m["gmw-jdt"] = {
}
m["gmw-msc"] = {
}
m["gmw-pro"] = {
}
m["gmw-rfr"] = {
aliases = {"Rheinfränkisch", "Rhenish Franconian"},
varieties = {"Hessian", "Lorraine Franconian", "Lorrainian", "Lothringisch", "Palatine German", "Pfälzisch", "Pälzisch", "Palatinate German"},
}
m["gmw-stm"] = {
aliases = {"Satu Mare Swabian", "Sathmarschwäbisch", "Sathmarisch"},
}
m["gmw-tsx"] = {
aliases = {"Siebenbürger Saxon"},
}
m["gmw-vog"] = {
}
m["gmw-zps"] = {
aliases = {"Zipser", "Zipserisch", "Outzäpsersch"},
}
m["gn-cls"] = {
}
m["grk-cal"] = {
aliases = {"Italian Greek", "Bova"},
}
m["grk-ita"] = {
aliases = {"Griko", "Grico", "Grecanic"},
}
m["grk-mar"] = {
aliases = {"Mariupolitan Greek", "Rumeíka", "Rumeika"},
}
m["grk-pro"] = {
aliases = {"Proto-Greek"},
}
m["hmn-pro"] = {
}
m["hmx-mie-pro"] = {
}
m["hmx-pro"] = {
}
m["hyx-pro"] = {
}
m["iir-nur-pro"] = {
}
m["iir-pro"] = {
}
m["ijo-pro"] = {
aliases = {"Proto-Ijaw"},
}
m["inc-apa"] = {
aliases = {"Apabhraṃśa"},
}
m["inc-ash"] = {
aliases = {"Asokan Prakrit", "Aśokan Prakrit"},
}
m["inc-kam"] = {
}
m["inc-kho"] = {
}
m["inc-krn-pro"] = {
aliases = {"Proto Kamta", "Proto-Kamata", "Proto Kamata"},
}
m["inc-mas"] = {
}
m["inc-mbn"] = {
}
m["inc-mgu"] = {
}
m["inc-mor"] = {
aliases = {"Middle Oriya"},
}
m["inc-oas"] = {
}
m["inc-oaw"] = {
aliases = {"Early Awadhi"},
}
m["inc-obn"] = {
}
m["inc-ogu"] = {
otherNames = {"Old Western Rajasthani"},
}
m["inc-ohi"] = {
aliases = {"Dehlavi"},
}
m["inc-oor"] = {
aliases = {"Old Oriya"},
}
m["inc-opa"] = {
}
m["inc-pro"] = {
}
m["ine-ana-pro"] = {
}
m["ine-bsl-pro"] = {
}
m["ine-kal"] = {
aliases = {"Kalašmaic", "Kalasmaic"},
}
m["ine-pae"] = {
}
m["ine-pro"] = {
}
m["ine-toc-pro"] = {
}
m["xme-old"] = {
}
m["xme-mid"] = {
aliases = {"Atropatenian"},
}
m["xme-ker"] = {
otherNames = {"Kermanian", "Central Iranian Dialects", "Central Plateau Dialects", "Central Iranian", "South Median", "Gazi", "Soi", "Sohi", "Abuzeydabadi", "Abyanehi", "Farizandi", "Jowshaqani", "Nashalji", "Qohrudi", "Yarandi", "Tari", "Sedehi", "Ardestani", "Zefrehi", "Isfahani", "Kafroni", "Varzenehi", "Khuri", "Nayini", "Anaraki", "Zoroastrian Dari", "Behdināni", "Behdinani", "Gabri", "Gavrŭni", "Gavruni", "Gabrōni", "Gabroni", "Kermani", "Yazdi", "Bidhandi", "Bijagani", "Chimehi", "Hanjani", "Komjani", "Naraqi", "Qalhari", "Varani", "Zori"},
}
m["xme-taf"] = {
}
m["xme-ttc-pro"] = {
}
m["xme-kls"] = {
aliases = {"Kalāsuri", "Kalasur", "Kalāsur"},
}
m["xme-klt"] = {
}
m["xme-ott"] = {
otherNames = {"Old Tatic", "Old Azeri", "Azari", "Azeri", "Āḏarī", "Adari", "Adhari"},
}
m["ira-kms-pro"] = {
}
m["ira-mpr-pro"] = {
}
m["ira-pat-pro"] = {
}
m["ira-pro"] = {
}
m["ira-zgr-pro"] = {
}
m["os-pro"] = {
otherNames = {"Sarmatian"},
}
m["xsc-pro"] = {
}
m["xsc-skw-pro"] = {
}
m["xsc-sak-pro"] = {
aliases = {"Proto-Sakan"},
}
m["ira-sym-pro"] = {
}
m["ira-sgi-pro"] = {
}
m["ira-mny-pro"] = {
}
m["ira-shy-pro"] = {
}
m["ira-shr-pro"] = {
}
m["ira-sgc-pro"] = {
aliases = {"Proto-Sogdian"},
}
m["ira-wnj"] = {
aliases = {"Old Vanji", "Vanchi", "Vanži", "Wanji"},
}
m["iro-ere"] = {
}
m["iro-min"] = {
}
m["iro-nor-pro"] = {
}
m["iro-pro"] = {
}
m["itc-pro"] = {
}
m["jpx-hcj"] = {
aliases = {"Hachijo"},
}
m["jpx-pro"] = {
}
m["jpx-ryu-pro"] = {
}
m["kar-pro"] = {
}
m["kca-eas"] = {
}
m["kca-nor"] = {
}
m["kca-pro"] = {
}
m["kca-sou"] = {
}
m["khi-kho-pro"] = {
}
m["khi-kun"] = {
otherNames = {"ǃOǃKung", "ǃ'OǃKung", "Kung", "Ekoka ǃKung", "Ekoka Kung", "Sekele"},
}
m["ko-ear"] = {
}
m["kro-pro"] = {
}
m["ku-pro"] = {
}
m["map-ata-pro"] = {
}
m["map-bms"] = {
}
m["map-pro"] = {
}
m["mis-hkl"] = {
aliases = {"Kelantan Peranakan Chinese", "Kelantan Peranakan Hokkien", "Hokkien Kelantan", "Kelantan Local Hokkien"},
}
m["mis-isa"] = {
}
m["mis-jie"] = {
aliases = {"Chieh", "Kjet"},
}
m["mis-jzh"] = {
aliases = {"Haihua"},
}
m["mis-kas"] = {
aliases = {"Cassite", "Kassitic", "Kaššite"},
}
m["mis-mmd"] = {
otherNames = {"Mimi of Gaudefroy-Demombynes", "Mimi-D"},
}
m["mis-mmn"] = {
otherNames = {"Mimi-N"},
}
m["mis-phi"] = {
aliases = {"Philistian", "Philistinian"},
}
m["mis-rou"] = {
aliases = {"Ruanruan", "Ruan-ruan", "Juan-juan"},
}
m["mis-tnw"] = {
aliases = {"Tangwanghua"},
}
m["mis-tuh"] = {
aliases = {"'Azha"},
}
m["mis-tuo"] = {
aliases = {"Tabghach", "Taghbach"},
}
m["mis-wuh"] = {
aliases = {"Wuwan", "Awar"},
}
m["mis-xbi"] = {
aliases = {"Serbi", "Shirwi"},
}
m["mjg-mgl"] = {
aliases = {"Huzhu", "Huzhu Monguor"},
}
m["mjg-mgr"] = {
aliases = {"Minhe", "Minhe Monguor"},
}
m["mkh-asl-pro"] = {
}
m["mkh-ban-pro"] = {
}
m["mkh-kat-pro"] = {
}
m["mkh-khm-pro"] = {
}
m["mkh-kmr-pro"] = {
}
m["mkh-mmn"] = {
}
m["mkh-mnc-pro"] = {
}
m["mkh-mvi"] = {
}
m["mkh-pal-pro"] = {
}
m["mkh-pea-pro"] = {
}
m["mkh-pkn-pro"] = {
}
m["mkh-pro"] = { --This will be merged into 2015 aav-pro.
}
m["mnw-tha"] = {
aliases = {"Raman", "Thai Raman", "Siamese Mon"},
}
m["mkh-vie-pro"] = {
}
m["mns-cen"] = {
}
m["mns-nor"] = {
}
m["mns-pro"] = {
}
m["mns-sou"] = {
}
m["mun-pro"] = {
aliases = {"Proto-Mundan"},
}
m["myn-chl"] = { -- the stage after ''emy''
otherNames = {"Cholti", "Colonial Ch'olti'", "Colonial Cholti"},
}
m["myn-pro"] = {
aliases = {"Proto-Maya"},
}
m["nai-ala"] = {
otherNames = {"Alasapa", "Pinto"},
}
m["nai-bay"] = {
otherNames = {"Bayougoula", "Bayou Goula", "Ischenoca"}, -- tribe merged with "Mougulasha", "Mongoulacha", "Mugulasha", "Mougulasha", "Muglahsa", "Muglasha", "Muguasha", "Imongolosha", "Houma", "Acolapissa"
}
m["nai-cal"] = {
}
m["nai-chi"] = {
}
m["nai-chu-pro"] = {
aliases = {"Proto-Chumashan"},
}
m["nai-cig"] = {
}
m["nai-ckn-pro"] = {
aliases = {"Proto-Chinook"},
}
m["nai-guz"] = {
aliases = {"Guazacapan"},
}
m["nai-hit"] = {
otherNames = {"Atcik-hata", "At-pasha-shliha"},
}
m["nai-ipa"] = {
otherNames = {"'Iipay 'aa", "Northern Diegueño", "Diegueño"},
}
m["nai-jtp"] = {
otherNames = {"Xutiapa", "Jalapa", "Xalapa"},
}
m["nai-jum"] = {
aliases = {"Jumaitepeque", "Jumaytepec"},
}
m["nai-kat"] = {
otherNames = {"Kathlamet Chinook"},
}
m["nai-klp-pro"] = {
}
m["nai-knm"] = {
}
m["nai-kum"] = {
otherNames = {"Kumiai", "Central Diegueño", "Diegueño"},
}
m["nai-mac"] = {
aliases = {"Macorís", "Macorix", "Mazorij", "Mazorig", "Mazoriges"},
}
m["nai-mdu-pro"] = {
aliases = {"Proto-Maiduan"},
}
m["nai-miz-pro"] = {
aliases = {"Proto-Mixe-Zoquean"},
}
m["nai-mus-pro"] = {
aliases = {"Proto-Muskhogean", "Proto-Muskogee"},
}
m["nai-nao"] = {
}
m["nai-nrs"] = {
}
m["nai-okw"] = {
}
m["nai-per"] = {
}
m["nai-pic"] = {
}
m["nai-plp-pro"] = {
}
m["nai-pom-pro"] = {
aliases = {"Proto-Pomoan"},
}
m["nai-qng"] = {
}
m["nai-sca-pro"] = { -- NB: not to be confused with 'sio-pro' "Proto-Siouan", which is Proto-Western Siouan
}
m["nai-sin"] = {
aliases = {"Sinacantan", "Zinacantán", "Zinacantan"},
}
m["nai-sln"] = {
}
m["nai-spt"] = {
aliases = {"Shahaptin"},
}
m["nai-tap"] = {
otherNames = {"Tapachulteca", "Tapachulteco", "Tapachula"},
}
m["nai-taw"] = {
}
m["nai-teq"] = {
otherNames = {"Tequistlateco", "Tequistlateca", "Chontal", "Chontol of Oaxaca", "Oaxaca Chontal", "Oaxacan Chontal"},
}
m["nai-tip"] = {
otherNames = {"Tipay", "Tiipai", "Tiipay", "Jamul Tiipay", "Southern Diegueño", "Diegueño"},
}
m["nai-tot-pro"] = {
}
m["nai-tsi-pro"] = {
}
m["nai-utn-pro"] = {
otherNames = {"Proto-Miwok-Costanoan"},
}
m["nai-wai"] = {
aliases = {"Guaycura", "Waicura"},
}
m["nai-wji"] = {
otherNames = {"Jicaque of El Palmar", "Sula"},
}
m["nai-yup"] = {
aliases = {"Jupiltepeque", "Yupiltepec", "Jupiltepec", "Xupiltepec"},
}
m["nan-dat"] = {
aliases = {"Datian"},
}
m["nan-hbl"] = {
aliases = {"Hokkienese", "Quanzhang", "Fukien", "Banlam", "Banlamese", "Ban-lam"},
}
m["nan-hlh"] = {
aliases = {"Hailufeng", "Hoklo Min", "Hai Lok Hong"},
}
m["nan-hnm"] = {
aliases = {"Hainamese", "Hailamese", "Hainam", "Hainan Min", "Hainam Min"},
}
m["nan-lnx"] = {
aliases = {"Longyan", "Liongna"},
}
m["nan-luh"] = {
aliases = {"Leizhou", "Luichew", "Luichew Min"},
}
m["nan-tws"] = {
aliases = {"Teochew Min", "Chiuchow", "Teo-Swa", "Teo-Swa Min", "Tio-Sua"},
}
m["nan-zhe"] = {
aliases = {"Zhenan"},
}
m["nan-zsh"] = {
aliases = {"Sanxiang", "Samheung", "Sahiu"},
}
m["nds-de"] = {
}
m["nds-nl"] = {
varieties = {"Achterhoeks", "Drents", "Gronings", "Sallands", "Stellingwerfs", "Twents", "Veluws"},
}
m["ngf-pro"] = {
}
m["nic-bco-pro"] = {
}
m["nic-bod-pro"] = {
}
m["nic-eov-pro"] = {
}
m["nic-gns-pro"] = {
}
m["nic-grf-pro"] = {
}
m["nic-gur-pro"] = {
}
m["nic-jkn-pro"] = {
}
m["nic-lcr-pro"] = {
}
m["nic-ogo-pro"] = {
}
m["nic-ovo-pro"] = {
}
m["nic-plt-pro"] = {
}
m["nic-pro"] = {
}
m["nic-ubg-pro"] = {
}
m["nic-ucr-pro"] = {
}
m["nic-vco-pro"] = {
}
m["nub-har"] = {
aliases = {"Ḥarāza"},
}
m["nub-pro"] = {
}
m["omq-cha-pro"] = {
}
m["omq-maz-pro"] = {
aliases = {"Proto-Mazatecan"},
}
m["omq-mix-pro"] = {
}
m["omq-mxt-pro"] = {
}
m["omq-otp-pro"] = {
}
m["omq-pro"] = {
aliases = {"Proto-Otomanguean", "Proto-Oto-Mangue"},
}
m["omq-sjq"] = {
aliases = {"Chatino Sign Language", "San Juan Quiahije Chatino Sign Language"},
}
m["omq-tel"] = {
}
m["omq-teo"] = {
}
m["omq-tri-pro"] = {
}
m["omq-zap-pro"] = {
}
m["omq-zpc-pro"] = {
}
m["omv-aro-pro"] = {
}
m["omv-diz-pro"] = {
aliases = {"Proto-Maji"},
}
m["omv-pro"] = {
}
m["oto-otm-pro"] = {
}
m["oto-pro"] = {
}
m["paa-kom"] = {
aliases = {"Komnzo", "Kómnjo", "Komnjo", "Kamundjo"},
}
m["paa-kwn"] = {
}
m["paa-nha-pro"] = {
}
m["paa-nun"] = {
}
m["phi-din"] = {
}
m["phi-kal-pro"] = {
aliases = {"Proto-Calamian"},
}
m["phi-nag"] = {
}
m["phi-pro"] = {
}
m["poz-abi"] = {
otherNames = {"Sembuak", "Tubu"},
}
m["poz-bal"] = {
}
m["poz-btk-pro"] = {
}
m["poz-cet-pro"] = {
}
m["poz-hce-pro"] = {
otherNames = {"Proto-South Halmahera - West New Guinea"},
}
m["poz-lgx-pro"] = {
}
m["poz-mcm-pro"] = {
}
m["poz-mic-pro"] = {
}
m["poz-mly-pro"] = {
}
m["poz-msa-pro"] = {
}
m["poz-oce-pro"] = {
}
m["poz-pep-pro"] = {
aliases = {"Proto-Eastern-Polynesian", "Proto-East Polynesian", "Proto-East-Polynesian"},
}
m["poz-pnp-pro"] = {
}
m["poz-pol-pro"] = {
}
m["poz-pro"] = {
otherNames = {"Proto-Western Malayo-Polynesian"}, -- Western is subsumed into general Proto-MP
}
m["poz-sml"] = {
aliases = {"Sarawak"},
}
m["poz-ssw-pro"] = {
}
m["poz-sus-pro"] = {
}
m["poz-swa-pro"] = {
}
m["poz-ter"] = {
aliases = {"Terengganu"},
}
m["pqe-pro"] = {
}
m["pra-niy"] = {
}
m["qfa-adm-pro"] = {
}
m["qfa-bet-pro"] = {
aliases = {"Proto-Tai-Be"},
}
m["qfa-cka-pro"] = {
}
m["qfa-hur-pro"] = {
}
m["qfa-kad-pro"] = {
}
m["qfa-kms-pro"] = {
}
m["qfa-kor-pro"] = {
}
m["qfa-kra-pro"] = {
}
m["qfa-lic-pro"] = {
}
m["qfa-onb-pro"] = {
aliases = {"Proto-Ong-Be", "Proto-Bê"},
}
m["qfa-ong-pro"] = {
}
m["qfa-tak-pro"] = {
aliases = {"Proto-Tai-Kadai"},
}
m["qfa-yen-pro"] = {
}
m["qfa-yuk-pro"] = {
}
m["qwe-kch"] = {
otherNames = {"Kichwa shimi", "Runashimi", "Runa", "Quichua", "Quecha", "Inga", "Chimborazo", "Imbabura Highland Kichwa", "Cañar Highland Quecha", "Quechua"},
}
m["qwe-pro"] = {
}
m["roa-ang"] = {
otherNames = {"Craonnais", "Baugeois", "Saumurois"},
}
m["roa-bbn"] = {
otherNames = {"Bourbonnais", "Berrichon", "Moulins", "Allier", "Nivernais", "Haut-Berrichon", "Bas-Berrichon"},
}
m["roa-brg"] = {
otherNames = {"Burgundian", "Bregognon", "Dijonnais", "Morvandiau", "Morvandeau", "Morvan", "Bourguignon-Morvandiau", "Mâconnais", "Brionnais", "Brionnais-Charolais", "Auxerrois", "Beaunois", "Langrois", "Valsaônois", "Verduno-Chalonnais", "Sédelocien"},
}
m["roa-cha"] = {
otherNames = {"Bassignot", "Langrois", "Sennonais", "Vallage", "Troyen", "Briard", "Der", "Perthois", "Rémois", "Argonnais", "Porcien", "Ardennais", "Sugny"},
}
m["roa-fcm"] = {
otherNames = {"Frainc-Comtou", "Comtois", "Jurassien", "Ajoulot", "Vâdais", "Taignon", "Bisontin", "Bousbot"},
}
m["roa-gal"] = {
}
m["roa-gib"] = {
}
m["roa-gis"] = {
}
m["roa-leo"] = {
}
m["roa-lor"] = {
otherNames = {"Gaumais", "Vosgien", "Welche", "Argonnais", "Longovicien", "Messin", "Nancéien", "Spinalien", "Déodatien"},
}
m["roa-oan"] = {
aliases = {"Old Aragonese"},
}
m["roa-oca"] = {
}
m["roa-ole"] = {
}
m["roa-opt"] = {
aliases = {"Galician-Portuguese", "Galician Portuguese", "Medieval Galician", "Medieval Portuguese", "Old Galician", "Old Portuguese"},
}
m["roa-orl"] = {
otherNames = {"Beauceron", "Solognot", "Gâtinais", "Blaisois", "Vendômois"},
}
m["roa-poi"] = {
otherNames = {"Poitevin", "Saintongeais", "Maraîchin"},
}
m["roa-tar"] = {
}
m["sai-all"] = {
otherNames = {"Alyentiyak", "Huarpe", "Warpe"},
}
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
otherNames = {"Miranya", "Miranha", "Miranha Carapana-Tapuya", "Miraña-Carapana-Tapuyo", "Andokero", "Miranya-Karapana-Tapuyo", "Miraña", "Carapana"},
}
m["sai-ayo"] = {
aliases = {"Ayoman", "Ayamán", "Ayaman"},
}
m["sai-bae"] = {
aliases = {"Baenã", "Baenán", "Baena"},
}
m["sai-bag"] = {
otherNames = {"Patagón de Bagua"},
}
m["sai-bet"] = {
otherNames = {"Betoy", "Betoya", "Betoye", "Betoi-Jirara", "Jirara"},
}
m["sai-bor-pro"] = {
otherNames = {"Proto-Bora-Muinane", "Proto-Bora-Muiname"},
}
m["sai-cac"] = {
otherNames = {"Kakán", "Diaguita", "Cacan", "Kakan", "Calchaquí", "Chaka", "Kaka", "Kaká", "Caca", "Caca-Diaguita", "Catamarcano", "Capayán", "Capayana", "Yacampis"},
}
m["sai-caq"] = {
otherNames = {"Cara", "Kara"},
}
m["sai-car-pro"] = {
}
m["sai-cat"] = {
}
m["sai-cer-pro"] = {
otherNames = {"Proto-Amazonian Jê"},
}
m["sai-chi"] = {
}
m["sai-chn"] = {
aliases = {"Chana"},
}
m["sai-chp"] = {
aliases = {"Txapacura", "Xapacura", "Guapore", "Šapakura", "Txapakura", "Txapakúra", "Xapakúra"},
}
m["sai-chr"] = {
aliases = {"Charrúa", "Charruá"},
}
m["sai-chu"] = {
aliases = {"Churoya"},
}
m["sai-cje-pro"] = {
otherNames = {"Proto-Akuwẽ"},
}
m["sai-cmg"] = {
aliases = {"Comechingón", "Comechingona", "Comechingone"},
}
m["sai-cno"] = {
otherNames = {"Chonos", "Caucau"},
}
m["sai-cnr"] = {
aliases = {"Cañar"},
}
m["sai-coe"] = {
aliases = {"Koeruna"},
}
m["sai-col"] = {
aliases = {"Colan"},
}
m["sai-cop"] = {
}
m["sai-crd"] = {
otherNames = {"Coroado"},
}
m["sai-ctq"] = {
aliases = {"Catuquinarú", "Katukinaru"},
}
m["sai-cul"] = {
otherNames = {"Culle", "Kulyi", "Ilinga", "Linga"},
}
m["sai-cva"] = {
}
m["sai-esm"] = {
otherNames = {"Esmeraldeño", "Atacame", "Takame"},
}
m["sai-ewa"] = {
}
m["sai-gam"] = {
aliases = {"Gamella", "Acobu", "Curinsi", "Barbados"},
}
m["sai-gay"] = {
aliases = {"Gayon"},
}
m["sai-gmo"] = {
otherNames = {"Wamo", "Santa Rosa", "San Jose", "Barinas", "Guamotey", "Guama"},
}
m["sai-gue"] = {
aliases = {"Guenoa"},
}
m["sai-hau"] = {
otherNames = {"Manek'enk"},
}
m["sai-jee-pro"] = {
otherNames = {"Proto-Gê", "Proto-Jean", "Proto-Gean", "Proto-Jê-Kaingang", "Proto-Ye"},
}
m["sai-jko"] = {
aliases = {"Geicó", "Jeicó", "Jaikó", "Geikó", "Yeikó", "Jeiko", "Geico", "Jeico", "Jaiko", "Geiko", "Yeiko", "Eyco"},
}
m["sai-jrj"] = {
}
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
otherNames = {"Catrimbi", "Catembri", "Kariri de Mirandela", "Mirandela", "Kariri", "Kiriri"},
}
m["sai-mal"] = {
aliases = {"Malali"},
}
m["sai-mar"] = {
}
m["sai-mat"] = {
otherNames = {"Matanauí", "Matanaui", "Matanawü", "Mitandua", "Moutoniway"},
}
m["sai-mcn"] = {
aliases = {"Mokana"},
}
m["sai-men"] = {
aliases = {"Menién"},
}
m["sai-mil"] = {
otherNames = {"Milykayak", "Huarpe", "Warpe"},
}
m["sai-mlb"] = {
aliases = {"Malibú", "Malebú"},
}
m["sai-msk"] = {
aliases = {"Masakara", "Masacará", "Masacara"},
}
m["sai-muc"] = {
otherNames = {"Mucuchi", "Mokochi", "Mocochí", "Mirripú", "Maripú", "Mucuchí-Maripú"},
}
m["sai-mue"] = {
aliases = {"Muellamués"},
}
m["sai-muz"] = {
}
m["sai-mys"] = {
otherNames = {"Mayna", "Maina", "Rimachu"},
}
m["sai-nat"] = {
otherNames = {"Natu", "Peagaxinan"},
}
m["sai-nje-pro"] = {
otherNames = {"Proto-Core Jê"},
}
m["sai-opo"] = {
otherNames = {"Opon", "Opón-Karare", "Opón-Carare", "Carare", "Carare-Opón"},
}
m["sai-oto"] = {
aliases = {"Otomako", "Otomacan", "Otomac", "Otomak"},
}
m["sai-pal"] = {
}
m["sai-pam"] = {
aliases = {"Pamiwa"},
}
m["sai-par"] = {
aliases = {"Paratio", "Prarto"},
}
m["sai-pnz"] = {
aliases = {"Pansaleo"},
}
m["sai-prh"] = {
}
m["sai-ptg"] = {
otherNames = {"Patagón de Perico"},
}
m["sai-pur"] = {
aliases = {"Purukoto", "Purucotó", "Purucoto"},
}
m["sai-pyg"] = {
aliases = {"Payawá", "Payagua"},
}
m["sai-pyk"] = {
aliases = {"Gavião-Pykobjê", "Pykobjê-Gavião", "Gavião", "Pyhcopji", "Gavião-Pyhcopji"},
}
m["sai-qmb"] = {
otherNames = {"Kimbaya", "Quindío", "Quindio", "Quindo"},
}
m["sai-qtm"] = {
aliases = {"Quitemoca"},
}
m["sai-rab"] = {
}
m["sai-ram"] = {
}
m["sai-sac"] = {
otherNames = {"Sacata", "Zácata", "Chillao"},
}
m["sai-san"] = {
aliases = {"Sanavirón", "Sanabirón", "Sanabiron", "Sanavirona", "Zanavirona"},
}
m["sai-sap"] = {
aliases = {"Zapará", "Zapara"},
}
m["sai-sec"] = {
otherNames = {"Sek", "Sec"},
}
m["sai-sin"] = {
otherNames = {"Cenúfana", "Zenúfana", "Cinifaná", "Sinufana", "Sinú", "Cenú", "Zenú", "Finzenú", "Fincenú", "Pancenú", "Sutagao"},
}
m["sai-sje-pro"] = {
}
m["sai-tab"] = {
otherNames = {"Aconipa"},
}
m["sai-tal"] = {
otherNames = {"Atalán", "Tallan", "Tallanca", "Atalan", "Sek"},
}
m["sai-tap"] = {
otherNames = {"Tapayúna", "Kajkwakhrattxi"},
}
m["sai-tar-pro"] = {
}
m["sai-teu"] = {
aliases = {"Tehues", "Teuéx"},
}
m["sai-tim"] = {
otherNames = {"Cuica", "Timote-Cuica"},
}
m["sai-tpr"] = {
aliases = {"Taparito"},
}
m["sai-trr"] = {
otherNames = {"Caratiú"},
}
m["sai-wai"] = {
aliases = {"Waitaka", "Waitacá", "Waitaca", "Goytacá", "Goitacá", "Guaitacá", "Guiatacá", "Guiatacás", "Goiatacá", "Goiatacás", "Guaiatacá", "Goytacaz", "Goitacaz", "Goyataca", "Aitacaz", "Uetacaz", "Uetacá", "Outacá", "Ouetacá", "Eutacá", "Itacaz", "Vaitacá"},
}
m["sai-way"] = {
aliases = {"Wajumará", "Wajumara", "Wayumará", "Azumara", "Guimara"},
}
m["sai-wit-pro"] = {
otherNames = {"Proto-Huitotoan", "Proto-Uitotoan"},
}
m["sai-wnm"] = {
otherNames = {"Wañam", "Wanyam", "Huanyam", "Uanham", "Abitana"},
}
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
otherNames = {"Xoco", "Chocó", "Shokó", "Shoko", "Shocó", "Shoco", "Choco", "Chocaz", "Kariri-Xocó", "Kariri-Xoco", "Kariri-Shoko", "Cariri-Chocó", "Xukuru-Kariri", "Xucuru-Kariri", "Xucuru-Cariri", "Xukurú-Kirirí"},
}
m["sai-yao"] = {
aliases = {"Yao", "Jaoi", "Yaoi", "Yaio", "Anacaioury"},
}
m["sai-yar"] = { -- not the same family as 'suy'
aliases = {"Yaruma"},
}
m["sai-yri"] = {
aliases = {"Jurí"},
}
m["sai-yup"] = {
otherNames = {"Yupuá", "Yupúa", "Jupua", "Jupuá", "Jupúa", "Hiupiá", "Yupuá-Duriña", "Duriña"},
}
m["sai-yur"] = {
aliases = {"Yurumangui", "Yurimangí", "Yurimangi", "Yurimanguí", "Yurimangui"},
}
m["sal-pro"] = {
aliases = {"Proto-Salishan"},
}
m["sdv-daj-pro"] = {
}
m["sdv-eje-pro"] = {
}
m["sdv-nil-pro"] = {
}
m["sdv-nyi-pro"] = {
}
m["sdv-tmn-pro"] = {
}
m["sel-nor"] = {
aliases = {"Taz Selkup"},
}
m["sel-pro"] = {
}
m["sel-sou"] = {
}
m["sem-amm"] = {
}
m["sem-amo"] = {
aliases = {"Amoritic"},
}
m["sem-cha"] = {
aliases = {"Cheha", "Čäha", "Čäxa"},
}
m["sem-dad"] = {
otherNames = {"Dadanite", "Lihyanite", "Lihyanitic"},
}
m["sem-dum"] = {
}
m["sem-has"] = {
}
m["sem-his"] = {
otherNames = {"Thamudic E"},
}
m["sem-mhr"] = {
otherNames = {"Muher Gurage", "Muxar", "Muxər", "Muhər", "Muḫər"},
}
m["sem-pro"] = {
}
m["sem-saf"] = {
}
m["sem-srb"] = {
}
m["sem-tay"] = {
otherNames = {"Taymanite", "Thamudic A"},
}
m["sem-tha"] = {
}
m["sem-wes-pro"] = {
}
m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro'
}
m["sit-bok"] = {
otherNames = {"Ramo", "Pailibo"},
}
m["sit-bai-pro"] = {
}
m["sit-cai"] = {
}
m["sit-cha"] = {
}
m["sit-hrs-pro"] = {
}
m["sit-jap"] = {
otherNames = {"Chabao", "Kuru"},
}
m["sit-kha-pro"] = {
}
m["sit-liz"] = {
}
m["sit-lnj"] = {
}
m["sit-lrn"] = {
}
m["sit-luu-pro"] = {
}
m["sit-prn"] = {
}
m["sit-pro"] = {
}
m["sit-sit"] = {
otherNames = {"Eastern rGyalrong", "rGyalrong", "Rgyalrong", "rGyalrongic", "Gyalrong", "Gyarong", "rGyarong", "Gyarung", "Jiarong", "Jiarongyu", "Jyarong", "Jyarung", "Yelong", "Kuru"},
}
m["sit-tam-pro"] = {
aliases = {"Proto-Tamang"},
}
m["sit-tan-pro"] = {
}
m["sit-tgm"] = {
}
m["sit-tos"] = {
}
m["sit-tsh"] = {
otherNames = {"Caodeng", "Sidaba", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"},
}
m["sit-zbu"] = {
otherNames = {"Ribu", "Rdzong'bur", "Rdzongmbur", "Showu", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"},
}
m["sla-pro"] = {
aliases = {"Common Slavic"},
}
m["smi-pro"] = {
aliases = {"Proto-Sami"},
}
m["son-pro"] = {
aliases = {"Proto-Songhai"},
}
m["sqj-pro"] = {
}
m["ssa-klk-pro"] = {
aliases = {"Proto-Rub"},
}
m["ssa-kom-pro"] = {
}
m["ssa-pro"] = {
}
m["syd-fne"] = {
}
m["syd-pro"] = {
}
m["tai-pro"] = {
}
m["tai-swe-pro"] = {
}
m["tbq-bdg-pro"] = {
}
m["tbq-blg"] = {
aliases = {"Pai-lang", "Pailang"},
}
m["tbq-gkh"] = {
aliases = {"Gɔkhý", "Gɔkhy", "Gouke"},
}
m["tbq-kuk-pro"] = {
otherNames = {"Proto-Kukish"},
}
m["tbq-lal-pro"] = {
}
m["tbq-laz"] = {
otherNames = {"Lare", "Shuitianhua"},
}
m["tbq-lob-pro"] = {
}
m["tbq-lol-pro"] = {
otherNames = {"Proto-Yi", "Proto-Ngwi", "Proto-Nisoic"},
}
m["tbq-mil"] = {
}
m["tbq-mor"] = {
aliases = {"Morān"},
}
m["tbq-ngo"] = {
otherNames = {"Ngachang", "Achang"},
}
-- tbq-pro is now etymology-only
m["trk-dkh"] = {
aliases = {"Dukha"},
}
m["trk-oat"] = {
}
m["trk-pro"] = {
}
m["tup-gua-pro"] = {
}
m["tup-kab"] = {
aliases = {"Kabixiana", "Cabixiana", "Cabishiana", "Kapishana", "Capishana", "Kapišana", "Cabichiana", "Capichana", "Capixana"},
}
m["tuw-alk"] = {
aliases = {"Alechuka"},
}
m["tuw-bal"] = {
}
m["tuw-kkl"] = {
aliases = {"Chinese Kyakala"},
}
m["tuw-kli"] = {
aliases = {"Kilen", "Kirin", "Kila", "Hezhe", "Qile'en"},
}
m["tup-pro"] = {
}
m["tuw-pro"] = {
}
m["tuw-sol"] = {
}
m["urj-fin-pro"] = {
}
m["urj-koo"] = {
aliases = {"Old Permian"},
}
m["urj-kuk"] = {
aliases = {"Kukkuzi Votic", "Kukkuzi Ingrian", "Kukkusi"},
}
m["urj-kya"] = {
}
m["urj-mdv-pro"] = {
}
m["urj-prm-pro"] = {
}
m["urj-pro"] = {
otherNames = {"Proto-Finno-Ugric", "Proto-Finno-Permic"}, -- PFU and PFP are subsumed into PU per [[Wiktionary:Beer parlour/2015/January#Merging Finno-Volgaic, Finno-Samic, Finno-Permic and Finno-Ugric into Uralic]]
}
m["urj-ugr-pro"] = {
}
m["xgn-pro"] = {
}
m["xnd-pro"] = {
otherNames = {"Proto-Na-Dené", "Proto-Athabaskan-Eyak-Tlingit"},
}
m["yok-bvy"] = {
otherNames = {"Tulamni-Hometwoli", "Tulamni", "Tulamne", "Tuolumne", "Tawitchi", "Hometwoli", "Taneshach"},
}
m["yok-dly"] = {
otherNames = {"Far Northern Valley Yokuts", "Yachikumne", "Yachikumni", "Chulamni", "Lower San Joaquin", "Lakisamni", "Tawalimni"},
}
m["yok-gsy"] = {
}
m["yok-kry"] = {
otherNames = {"Choinimni", "Choynimni", "Ayticha", "Kocheyali", "Ayitcha", "Michahay", "Chukaymina", "Chukaimina"},
}
m["yok-nvy"] = {
otherNames = {"Chukchansi", "Kechayi", "Dumna", "Chawchila", "Noptinte", "Nopṭinṭe", "Nopthrinthre", "Nopchinchi", "Takin"},
}
m["yok-ply"] = {
otherNames = {"Paleuyami", "Altinin", "Poso Creek", "Poso Creek Yokuts"},
}
m["yok-svy"] = {
otherNames = {"Yawelmani", "Tachi", "Koyeti", "Nutunutu", "Chunut", "Wo'lasi", "Choynok", "Choinok", "Wechihit"},
}
m["yok-tky"] = {
otherNames = {"Wikchamni", "Wukchamni", "Wukchumni", "Yawdanchi"},
}
m["ypk-pro"] = {
}
m["zhx-min-pro"] = {
}
m["zhx-sht"] = {
otherNames = {"Xiangnan Tuhua", "Yuebei Tuhua", "Shipo", "Shina"},
}
m["zhx-sic"] = {
otherNames = {"Sichuanese Mandarin"},
}
m["zhx-tai"] = {
aliases = {"Toishanese"},
}
m["zlw-mas"] = {
aliases = {"Mazurian"},
}
m["zle-ono"] = {
}
m["zle-ort"] = {
}
m["zlw-ocs"] = {
}
m["zlw-opl"] = {
}
m["zlw-osk"] = {
}
m["zlw-slv"] = {
}
m["zlm-coa"] = {
}
m["zlm-pah"] = {
}
return m
c44ahyqdiyqwf2wdft7nfmybeiuymem
Modul:Jpan-headword
828
34824
281360
245830
2026-04-22T06:28:21Z
PeaceSeekers
3334
281360
Scribunto
text/plain
local m_ja = require("Module:ja")
local m_ja_ruby = require("Module:ja-ruby")
local m_str_utils = require("Module:string utilities")
local byteoffset = mw.ustring.byteoffset
local concat = table.concat
local gsplit = m_str_utils.gsplit
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local max_index = require("Module:table").maxIndex
local moraify = m_ja.moraify
local remove = table.remove
local ugmatch = mw.ustring.gmatch
local ugsub = m_str_utils.gsub
local ulen = m_str_utils.len
local ulower = m_str_utils.lower
local umatch = mw.ustring.match
local usub = m_str_utils.sub
local export = {}
local pos_functions = {}
local range = mw.loadData('Module:ja/data/range')
local Jpan = require("Module:scripts").getByCode("Jpan")
-- Strips wikilink markup, keeping only the display text: [[a|b]] becomes b and [[a]] becomes a.
local function remove_links(text)
return (text:gsub("%[%[[^|%]]-|", "")
:gsub("%[%[", "")
:gsub("%]%]", ""))
end
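-- Uses the readings in the page's kanjitab template to split `head` and `kana` into
-- matching segments joined by "%", so that ruby can be attached kanji by kanji.
-- Returns `head` and `kana` unchanged when the page has no content or when the
-- kanjitab readings cannot be matched against `kana`.
local function assign_kana_to_kanji(head, kana, pagename, template_name)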
-- TODO: uses deprecated module
local m_tu = require'Module:template utilities'
local kanji_pos = {[0] = { nil, 0}}
local head_nolink = {}
local link_border = 0
local function insert_kanji_pos(substr)
insert(head_nolink, substr)
for p1, w1 in ugmatch(substr, '()([々' .. range.kanji .. '])') do
p1 = byteoffset(substr, p1) + link_border
insert(kanji_pos, { p1, p1 + w1:len() - 1 })
end
end
for p1, p2, w1 in m_tu.gfind_bracket(head, {['%[%['] = ']]'}) do
insert_kanji_pos(head:sub(link_border + 1, p1 - 1))
local p_pipe = w1:find'|' or 2
link_border = p1 + p_pipe - 1
insert_kanji_pos(w1:sub(p_pipe + 1, -3))
link_border = p2
end
insert_kanji_pos(head:sub(link_border + 1))
head_nolink = concat(head_nolink)
local pagetext = mw.title.new(pagename):getContent()
if not pagetext then return head, kana end
local non_kanji = {}
local last_kanji = 1
for p1 in ugmatch(head_nolink, '[々' .. range.kanji .. ']()') do
insert(non_kanji, usub(head_nolink, last_kanji, p1 - 2))
last_kanji = p1
end
insert(non_kanji, usub(head_nolink, last_kanji))
for kanjitab in pagetext:gmatch('(){{%s*' .. template_name) do
kanjitab = select(3, m_tu.find_bracket(pagetext, m_tu.brackets_temp, kanjitab))
if not kanjitab then error('ill-formed [[t:' .. template_name:gsub('%%', '') .. ']] syntax') end
kanjitab = m_tu.parse_temp(kanjitab)
local readings = {}
local readings_len = {}
for i = 1, max_index(kanjitab.args) do
local r_i = kanjitab.args[i] or ''
local r_o = kanjitab.args['o' .. i] or ''
if kanjitab.args['k' .. i] then
readings[i] = kanjitab.args['k' .. i] .. r_o
readings_len[i] = tonumber(r_i:match'^%s*%D*(%d*)%s*$') or 1
else
local r_kana, r_len = r_i:match'^%s*(%D*)(%d*)%s*$'
readings[i] = r_kana .. r_o
readings_len[i] = tonumber(r_len) or 1
end
end
local kana_decom = {}
local reading_id = 1
local reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(kana_decom, non_kanji[i])
insert(kana_decom, readings[reading_id])
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(kana_decom, non_kanji[#non_kanji])
local function strip_nonkana(str, repl)
return ugsub(str, '[^' .. range.kana .. ']+', repl) or nil
end
local xeno_reading = {strip_nonkana(kana, ''):match('^' .. strip_nonkana(concat(kana_decom), '(.-)') .. '$')}
if #xeno_reading > 0 then
local head_decom = {}
reading_id = 1
reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(head_decom, head:sub(kanji_pos[i - 1][2] + 1, kanji_pos[i][1] - 1))
insert(head_decom, head:sub(kanji_pos[i][1], kanji_pos[i + reading_len - 1][2]))
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(head_decom, head:sub(kanji_pos[#non_kanji - 1][2] + 1))
if #head_decom ~= #kana_decom then error('number of parameters in [[t:' .. template_name:gsub('%%', '') .. ']] is incorrect') end
local n_xeno_reading = 0
for i = 1, #kana_decom, 2 do
kana_decom[i] = ugsub(kana_decom[i], '[^' .. range.kana .. ']+', function()
n_xeno_reading = n_xeno_reading + 1
if xeno_reading[n_xeno_reading] == '' then return nil
else return xeno_reading[n_xeno_reading] end
end)
end
return concat(head_decom, '%'), concat(kana_decom, '%')
end
end
return head, kana
end
local en_grades = {
"gred pertama", "gred kedua", "gred ketiga",
"gred keempat", "gred kelima", "gred keenam",
"sekolah menengah", "jinmeiyō", "hyōgai"
}
local aliases = {
['transitive']='tr', ['trans']='tr',
['intransitive']='in', ['intrans']='in', ['intr']='in',
['godan']='1', ['ichidan']='2', ['irregular']='irr'
}
local adverbs_optional_tag = 'optionally '
local adverbs_optional_aliases = {
['to']='と', ['と']='と', ['ト']='と',
['ni']='に', ['に']='に', ['ニ']='に',
}
local adverbs_optional_links = {
['と']='[[と#Japanese:_adverbs|と]]',
['に']='[[に]]',
}
local function formatting_adjustments(rom, kana, pos_category)
-- hyphens for prefixes, suffixes, and counters (classifiers)
if pos_category == "Awalan" then
rom = rom:gsub('%-?$', '-')
elseif pos_category == "Akhiran" or pos_category == "Bentuk akhiran" or pos_category == "counters" or pos_category == "classifiers" then
rom = rom:gsub('^%-?', '-')
elseif pos_category == "Kata nama khas" and not kana:match'%^' then -- automatic caps for proper nouns, if not already specified
rom = ugsub(ugsub(rom, '%f[^%s%c%p]%l', string.uupper), "%w'%u", ulower) -- no caps after medial apostrophes
end
return rom
end
local function kana_to_romaji_with_pos_format(kana, data, args)
if data.headword.pos_category == "Bentuk gabungan" or data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" then
return "-"
end
local rom = remove_links(kana_to_romaji(kana, data.lang_code))
-- make adjustments for -u verbs and -i adjectives
if args['infl'] == '1' or args['infl'] == '1s' or args['infl'] == 'godan' then
rom = rom:gsub('ō$', 'ou'):gsub('ū$', 'uu')
elseif args['infl'] == 'i' or args['infl'] == 'is' or args['infl'] == 'い' then
rom = rom:gsub('ī$', 'ii')
end
return formatting_adjustments(rom, kana, data.headword.pos_category)
end
local function iterate_rare_chars(text)
local ch, i
return function()
repeat
ch, i = umatch(text, "([" .. range.kana .. range.kana_graph .. "!-/:-@%[\\-`×△○◎。-〠〶〷〻-〽・·゠=~][゙゚]*)()", i)
until not (ch and umatch(ch, "^[ぁ-ちっつて-ろんァ-チッツテ-ロンヲ-゚]$"))
return ch
end
end
local function historical_kana(data, hist_kana, modern_kana)
-- Disallow historical kana for kana and morae, as there's no one-to-one correspondence.
local pos = data.headword.pos_category
if pos == "syllables" or pos == "kana" or pos == "morae" then
error(("Cannot specify historical kana for %s."):format(pos))
end
local hist_kana_no_formatting = hist_kana:gsub("[%^%-%. %%]+", "")
local rare_chars, lang_name, hc = {}, data.lang_name, data.headword.categories
for ch in iterate_rare_chars(hist_kana_no_formatting) do
-- Plain find: ch can be a pattern magic character such as "(" or "[".
if not (modern_kana and modern_kana:find(ch, 1, true)) then
rare_chars[ch] = true
end
end
for _, mora in ipairs(moraify((ugsub(hist_kana_no_formatting, "[^" .. range.kana .. "]+", " ")))) do
if not (mora:gsub(" +", ""):match("^.?[\128-\191]*$") or (modern_kana and modern_kana:find(mora))) then
rare_chars[mora] = true
end
end
for ch in pairs(rare_chars) do
insert(hc, "Perkataan mengikut sejarah dieja dengan " .. ch .. " bahasa " .. lang_name)
end
insert(data.info_hist, require("Module:ja-link").link({
lang = data.headword.lang,
lemma = hist_kana,
tr = formatting_adjustments(
remove_links(kana_to_romaji(hist_kana, data.lang_code, nil, {hist = true})),
hist_kana,
pos
)
}, {
face = "head",
disableSelfLink = true,
}))
end
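-- Returns 'hira', 'kata' or 'both' when the page title is written only in hiragana,
-- only in katakana, or in a mix of the two; returns nil otherwise. Punctuation,
-- whitespace and control characters are ignored, except "&" and "@", which count as non-kana.
local function detect_pagename_kana(data, digraphs)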
local pagename = data.pagename
-- Exclude "&" and "@", which are part of %p (e.g. リズム&ブルース).
local function remove_kana(m)
return m:match("[&@]") or ""
end
if ugsub(pagename, '[%p%s%c' .. range.hiragana .. (digraphs and "ゟ" or "") .. ']', remove_kana) == "" then
return 'hira'
elseif ugsub(pagename, '[%p%s%c' .. range.katakana .. (digraphs and "ヿ" or "") .. ']', remove_kana) == "" then
return 'kata'
elseif ugsub(pagename, '[%p%s%c' .. range.kana .. (digraphs and "ゟヿ" or "") .. ']', remove_kana) == "" then
return 'both'
end
end
-- go through args and build inflections by finding whatever kanas were given to us
local function format_headword(args, data)
local pagename, kanas, lang_name = data.pagename, data.kanas, data.lang_name
data.pagename_kana = detect_pagename_kana(data)
if args[1][1] and not args[1][1]:match'[\128-\255]' then
-- filter out POS designations
remove(args[1], 1)
end
local linked_translit = data.headword.lang:link_tr(Jpan)
local suru_ending, rom_suru_ending
if data.headword.pos_category == "kata kerja suru" then
suru_ending = "[[する]]"
rom_suru_ending = linked_translit and " [[suru]]" or " suru"
else
suru_ending, rom_suru_ending = "", ""
end
if data.pagename_kana then -- pure-kana-title entry
if #args.head > 0 or args.head.default then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
-- {{ja-xxx}} vs {{ja-xxx|こ.うし}} vs {{ja-xxx|コウシ}} in [[こうし]]
if not args[1][1] then
args[1][1] = pagename
elseif remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) ~= pagename then
insert(args[1], 1, pagename)
end
for i, k in ipairs(args[1]) do
insert(data.headword.heads, {
term = k:gsub("[%^%-%. %%]+", "") .. suru_ending,
tr = '-',
l = args.label[i] and {args.label[i]} or nil,
})
end
for i = 1, math.max(args.rom.maxindex, 1) do
local rom = args.rom[i] or args.rom.default or kana_to_romaji_with_pos_format(args[1][1], data, args)
if not data.headword.heads[i] then
data.headword.heads[i] = {term = data.headword.heads[i-1].term}
end
if rom == "-" then
data.headword.heads[i].tr = "-"
elseif linked_translit then
data.headword.heads[i].tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
data.headword.heads[i].tr = rom .. rom_suru_ending
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) .. suru_ending
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
kanas[1] = pagename
if args.hist[1] then
historical_kana(data, args.hist[1], args[1][1])
end
else -- non-pure-kana-title entry
if #args[1] == 0 and not (data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" or data.headword.pos_category == "Simbol") then
error("Kana form is required.")
end
if args.head.default == pagename then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
local rom_repetition_final = {}
for i, k in ipairs(args[1]) do
local rom_auto = kana_to_romaji_with_pos_format(k, data, args)
local head = args.head[i] or args.head.default or pagename
if args.head[i] == pagename then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
local head_for_ruby, kana_for_ruby
if ulen(head) > 1 and head:match'%%' == nil and k:match'%%' == nil then
head_for_ruby, kana_for_ruby = assign_kana_to_kanji(head, k, pagename, data.lang_code .. '%-kanjitab')
else
head_for_ruby, kana_for_ruby = head, k
end
local format_table = m_ja_ruby.parse_text(head_for_ruby, kana_for_ruby, {
try = 'force',
try_force_limit = 10000
})
local kana_bare = remove_links(k:gsub("[%^%-%. %%]+", ""))
local rom = args.rom[i] or args.rom.default or rom_auto
head = {
term = m_ja_ruby.to_wiki(format_table, {
break_link = true,
}):gsub('<rt>(..-)</rt>', "<rt>[[" .. kana_bare .."|%1]]</rt>") .. suru_ending,
l = args.label[i] and {args.label[i]} or nil,
}
if rom == "-" or rom_repetition_final[rom] then
head.tr = "-"
elseif linked_translit then
head.tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
head.tr = rom .. rom_suru_ending
end
insert(data.headword.heads, head)
rom_repetition_final[rom] = true
insert(kanas, kana_bare)
if args.hist[i] then
historical_kana(data, args.hist[i], k)
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(m_ja_ruby.to_markup(format_table)) .. suru_ending
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
local first_reading, multiple = kanas[1]
if not first_reading then
return
end
first_reading = ulower(kana_to_romaji(first_reading, data.lang_code)):gsub("%%", "")
for i = 2, #kanas do
if ulower(kana_to_romaji(kanas[i], data.lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
break
end
end
if not multiple then
local lang_code = data.lang_code
-- getContent() returns nil when the page does not exist yet; fall back to an empty string.
local content = mw.title.getCurrentTitle():getContent() or ""
local loc1, loc2 = content:find("%f[^%z%s]==%s*" .. lang_name:gsub("%-", "%%%-") .. "%s*==()")
loc2 = content:find("%f[^%z%s]==[^\n=]+==", loc2)
if loc1 then
content = content:sub(loc1, loc2)
for template in require("Module:template parser").find_templates(content) do
local name, reading = template:get_name()
if (
name == lang_code .. "-head" or
name == lang_code .. "-pos"
) then
reading = template:get_arguments()[2]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif (
name == lang_code .. "-noun" or
name == lang_code .. "-verb" or
name == lang_code .. "-adj" or
name == lang_code .. "-phrase" or
name == lang_code .. "-verb form" or
name == lang_code .. "-verb-suru"
) then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif name == lang_code .. "-see" then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
-- if umatch(reading, "[^" .. range.kana .. "]") then
-- TODO: check linked page
-- end
end
if reading and ulower(kana_to_romaji(reading, lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
end
end
end
end
if multiple then
insert(data.headword.categories, "Perkataan dengan pelbagai bacaan bahasa " .. lang_name)
end
end
end
local function add_transitivity(data, tr)
local categories, lang_name = data.headword.categories, data.lang_name
tr = aliases[tr] or tr
if tr == "tr" then
insert(data.info_mid, 'transitive')
insert(categories, "Kata kerja transitif bahasa " .. lang_name)
elseif tr == "in" then
insert(data.info_mid, 'intransitive')
insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
elseif tr == "both" then
insert(data.info_mid, 'transitive or intransitive')
insert(categories, "Kata kerja transitif bahasa " .. lang_name)
insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
else
insert(categories, "Kata kerja tanpa ketransitifan bahasa " .. lang_name)
end
end
local function get_final(lemma, data)
return kana_to_romaji(remove(moraify(m_ja_ruby.to_ruby(m_ja_ruby.parse_markup(lemma)))), data.lang_code)
end
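-- Adds a "#lang_name" section anchor to each [[link]] in the entries of `t`, in place.
-- Links that already contain "#" are left unchanged.
local function add_language_fragment(t, lang_name)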
for k, v in ipairs(t) do
t[k] = v:gsub("%[%[([^]#]*)%]%]", "[[%1#%%s|%1]]"):format(lang_name)
end
end
local function add_inflections(data, inflection_type, cat_suffix)
local lang_name = data.lang_name
local lemma = data.inflection_base.form
local romaji = data.inflection_base.romaji
inflection_type = aliases[inflection_type] or inflection_type
local function replace_suffix(lemma_from, lemma_to, romaji_from, romaji_to)
-- e.g. 持って来る, lemma = "[持](も)って来(く)る"
-- lemma_from = "くる", lemma_to = {"き","きた"}
add_language_fragment(lemma_to, lang_name)
add_language_fragment(romaji_to, lang_name)
local result = {}
local pattern_from, n_from = lemma_from:gsub('.[\128-\191]*', function(c)
return '[' .. c .. m_ja.hira_to_kata(c) .. ']([^' .. range.kana .. ']*)'
end)
pattern_from = pattern_from .. '$'
-- "[くク]([^kana range]*)[るル]([^kana range]*)$"
for i_lemma_to, s_lemma_to in ipairs(lemma_to) do
local n_to = 0
local pattern_to = s_lemma_to:gsub('.[\128-\191]*', function(c)
if n_to < n_from then
n_to = n_to + 1
return c .. "%" .. n_to
else
return c
end
end)
for i = n_to + 1, n_from do
pattern_to = pattern_to .. "%" .. i
end
-- "き%1%2", "き%1た%2"
local lemma_inflected, success = ugsub(lemma, pattern_from, pattern_to)
if success == 0 then
return
end
local romaji_inflected
romaji_inflected, success = romaji:gsub(romaji_from .. "$", romaji_to[i_lemma_to])
if success == 0 then
romaji_inflected, success = romaji:gsub("%[%[" .. romaji_from .. "%]%]$", "[[" .. romaji_to[i_lemma_to] .. "]]")
if success == 0 then
return
end
end
insert(result, {lemma = lemma_inflected, romaji = romaji_inflected})
end
return result -- {{lemma="[持](も)って来(き)",romaji="motteki"},{lemma="[持](も)って来(き)た",romaji="mottekita"}}
end
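--[==[
A standalone sketch of how replace_suffix above builds its replacement template. It walks the target suffix one UTF-8 character at a time (a lead byte followed by continuation bytes \128-\191) and interleaves numbered captures, so that any non-kana material captured between the kana of the old suffix is carried over. The helper name `build_replacement` is hypothetical; the logic mirrors the loop in the function.

```lua
-- s_to:   the inflected suffix in kana, e.g. "きた"
-- n_from: number of characters (and therefore captures) in the old suffix
local function build_replacement(s_to, n_from)
	local n_to = 0
	local repl = s_to:gsub(".[\128-\191]*", function(c)
		if n_to < n_from then
			n_to = n_to + 1
			return c .. "%" .. n_to
		end
		return c
	end)
	-- Append any captures not yet referenced, so nothing captured is lost.
	for i = n_to + 1, n_from do
		repl = repl .. "%" .. i
	end
	return repl
end

print(build_replacement("き", 2))
print(build_replacement("きた", 2))
```

For the two-character suffix くる this yields the templates quoted in the comment above ("き%1%2" and "き%1た%2").
]==]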
local function insert_form(label, ...)
-- label = "stem" or "past" etc.
-- ... = {lemma=...,romaji=...},{lemma=...,romaji=...}
local labeled_forms = {label = label}
for _, v in ipairs{...} do
local table_form = m_ja_ruby.parse_markup(v.lemma)
local form_term = m_ja_ruby.to_wiki(table_form)
if not form_term:find'%[%[.+%]%]' then
form_term = '[[' .. m_ja_ruby.to_text(table_form) .. '#' .. lang_name .. '|' .. form_term .. ']]'
end
insert(labeled_forms, {
term = form_term,
tr = v.romaji,
})
end
insert(data.headword.inflections, labeled_forms)
end
local inflected_forms
if data.lang_code == 'ja' then
if inflection_type == '1' or inflection_type == '1s' then
insert(data.info_mid, '<abbr title="godan (group 1) conjugation">godan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " godan bahasa " .. lang_name)
local romaji = data.inflection_base.romaji
if cat_suffix == "Kata kerja" then
local final = get_final(lemma, data)
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -" .. final .. " bahasa " .. lang_name)
if final == "ru" then
if umatch(romaji, "[iIīĪ]ru$") then
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -iru bahasa " .. lang_name)
elseif umatch(romaji, "[eEēĒ]ru$") then
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -eru bahasa " .. lang_name)
end
end
end
end
if inflection_type == '1' then
inflected_forms =
replace_suffix('く', {'き', 'いた'}, 'ku', {'ki', 'ita'}) or
replace_suffix('ぐ', {'ぎ', 'いだ'}, 'gu', {'gi', 'ida'}) or
replace_suffix('す', {'し', 'した'}, 'su', {'shi', 'shita'}) or
replace_suffix('つ', {'ち', 'った'}, 'tsu', {'chi', 'tta'}) or
replace_suffix('ぬ', {'に', 'んだ'}, 'nu', {'ni', 'nda'}) or
replace_suffix('ぶ', {'び', 'んだ'}, 'bu', {'bi', 'nda'}) or
replace_suffix('む', {'み', 'んだ'}, 'mu', {'mi', 'nda'}) or
replace_suffix('る', {'り', 'った'}, 'ru', {'ri', 'tta'}) or
replace_suffix('う', {'い', 'った'}, 'u', {'i', 'tta'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
else
inflected_forms =
replace_suffix('る', {'り', 'った', 'い'}, 'ru', {'ri', 'tta', 'i'}) or --くださる
replace_suffix('いく', {'いき', 'いった'}, 'iku', {'iki', 'itta'}) or --行く
replace_suffix('う', {'い', 'うた'}, 'ou', {'oi', 'ōta'}) --問う
if inflected_forms then
insert_form('dasar', inflected_forms[1], inflected_forms[3])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
end
elseif inflection_type == '2' then
insert(data.info_mid, '<abbr title="ichidan (group 2) conjugation">ichidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " ichidan bahasa " .. lang_name)
local romaji = data.inflection_base.romaji
if umatch(romaji, "[iIīĪ]ru$") then
insert(data.headword.categories, cat_suffix .. " kami ichidan bahasa " .. lang_name)
elseif umatch(romaji, "[eEēĒ]ru$") then
insert(data.headword.categories, cat_suffix .. " shimo ichidan bahasa " .. lang_name)
else
insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name)
end
end
inflected_forms = replace_suffix('る', {'', 'た'}, 'ru', {'', 'ta'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'suru' then
insert(data.info_mid, '<abbr title="suru (group 3) conjugation">suru</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " suru bahasa " .. lang_name)
end
inflected_forms =
replace_suffix('する', {'し', 'した'}, 'suru', {'shi', 'shita'}) or
replace_suffix('ずる', {'じ', 'じた'}, 'zuru', {'ji', 'jita'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'kuru' then
insert(data.info_mid, '<abbr title="kuru (group 3) conjugation">kuru</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " kuru bahasa " .. lang_name)
end
inflected_forms = replace_suffix('くる', {'き', 'きた'}, 'kuru', {'ki', 'kita'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'i' or inflection_type == 'い' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name)
end
inflected_forms = replace_suffix('い', {'く'}, 'i', {'ku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'is' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name)
end
inflected_forms = replace_suffix('いい', {'よく'}, 'ii', {'yoku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'na' or inflection_type == 'な' then
insert(data.info_mid, '<abbr title="-na (type II) inflection">-na</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " な-na bahasa " .. lang_name)
end
inflected_forms = replace_suffix('', {'[[な]]', '[[に]]'}, '', {' [[na]]', ' [[ni]]'})
insert_form('adnominal', inflected_forms[1])
insert_form('adverbial', inflected_forms[2])
elseif inflection_type == "yo" then
insert(data.info_mid, '<abbr title="yodan conjugation (classical)"><sup><small>†</small></sup>yodan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " yodan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " yodan berakhir dengan -" .. get_final(lemma, data) .. " bahasa ".. lang_name)
end
elseif inflection_type == "kami ni" then
insert(data.info_mid, '<abbr title="kami nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " kami nidan bahasa " .. lang_name)
end
elseif inflection_type == "shimo ni" then
insert(data.info_mid, '<abbr title="shimo nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " shimo nidan bahasa " .. lang_name)
end
elseif inflection_type == "rahen" then
insert(data.info_mid, '<abbr title="r-special conjugation (classical)"><sup><small>†</small></sup>-ri</abbr>')
elseif inflection_type == "sahen" then
insert(data.info_mid, '<abbr title="s-special conjugation (classical)"><sup><small>†</small></sup>-se</abbr>')
elseif inflection_type == "kahen" then
insert(data.info_mid, '<abbr title="k-special conjugation (classical)"><sup><small>†</small></sup>-ko</abbr>')
elseif inflection_type == "nahen" then
insert(data.info_mid, '<abbr title="n-special conjugation (classical)"><sup><small>†</small></sup>-n</abbr>')
elseif inflection_type == "nari" or inflection_type == "なり" then
insert(data.info_mid, '<abbr title="-nari inflection (classical)"><sup><small>†</small></sup>-nari</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " なり-nari bahasa " .. lang_name)
end
elseif inflection_type == 'tari' or inflection_type == 'たり' then
insert(data.info_mid, '<abbr title="-tari inflection (classical)"><sup><small>†</small></sup>-tari</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " たり-tari bahasa " .. lang_name)
end
inflected_forms = replace_suffix('', {'[[とした]]', '[[たる]]', '[[と]]', '[[として]]'}, '', {' [[to shita]]', ' [[taru]]', ' [[to]]', ' [[to shite]]'})
insert_form('adnominal', inflected_forms[1], inflected_forms[2])
insert_form('adverbial', inflected_forms[3], inflected_forms[4])
elseif inflection_type == "ku" or inflection_type == "く" then
insert(data.info_mid, '<abbr title="-ku inflection (classical)"><sup><small>†</small></sup>-ku</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " く-ku bahasa " .. lang_name)
end
elseif inflection_type == "shiku" or inflection_type == "しく" then
insert(data.info_mid, '<abbr title="-shiku inflection (classical)"><sup><small>†</small></sup>-shiku</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " しく-shiku bahasa " .. lang_name)
end
elseif inflection_type == "ka" or inflection_type == "か" then
insert(data.info_mid, '<abbr title="-ka inflection (dialectal)"><sup><small>†</small></sup>-ka</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " か-ka bahasa " .. lang_name)
end
elseif inflection_type and inflection_type:len() > adverbs_optional_tag:len() and inflection_type:sub(1, adverbs_optional_tag:len()) == adverbs_optional_tag then
local adverbs_optional_list = inflection_type:sub(adverbs_optional_tag:len() + 1)
for option in gsplit(adverbs_optional_list, ':') do
local normalized_option = adverbs_optional_aliases[option]
if not normalized_option then
error('unrecognized adverb opt= argument: "' .. option .. '"')
end
local normalized_option_romaji = kana_to_romaji(normalized_option, data.lang_code)
local normalized_option_link = adverbs_optional_links[normalized_option]
inflected_forms = replace_suffix('', {normalized_option_link}, '', {' [[' .. normalized_option_romaji .. ']]'})
insert_form('optionally as', inflected_forms[1])
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " secara pilihan mengambil " .. normalized_option .. "-" .. normalized_option_romaji .. " bahasa " .. lang_name)
end
end
elseif inflection_type == 'irr' then
insert(data.info_mid, 'irregular')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name)
end
elseif inflection_type == '-' or inflection_type == 'un' then
insert(data.info_mid, 'uninflectable')
end
--elseif data.lang_code == 'ryu' then ...
end
end
local function add_categories(data)
local lang_name = data.lang_name
local pagename = data.pagename
local tc = data.headword.categories
-- adds category [langname] terms spelled with jōyō kanji or [langname] terms spelled with non-jōyō kanji
-- (if it contains any kanji)
local number_of_kanji = 0
for c in ugmatch(pagename, "[" .. range.kanji .. "々〻]") do
number_of_kanji = number_of_kanji + 1
if c ~= "々" and c ~= "〻" then -- Not a kanji for the purposes of categorisation.
insert(tc, ("Perkataan dieja dengan kanji %s bahasa " .. lang_name):format(en_grades[m_ja.kanji_grade(c)]))
end
end
-- categorize by number of kanji
if number_of_kanji ~= 0 then
insert(tc, ("Perkataan dengan %s aksara kanji bahasa " .. lang_name):format(number_of_kanji))
-- single-kanji terms
if ulen(pagename) == 1 then
insert(tc, "Perkataan dieja dengan " .. pagename .. " bahasa " .. lang_name)
insert(tc, "Perkataan kanji tunggal bahasa " .. lang_name)
end
end
-- categorize by the script of the pagename or specific characters contained in it
-- if pagename is hiragana or katakana
if detect_pagename_kana(data, true) == 'hira' then insert(tc, "Hiragana bahasa " .. lang_name) end
if detect_pagename_kana(data, true) == 'kata' then insert(data.katakana_category, "Katakana bahasa " .. lang_name) end
local p, n = ugsub(pagename, '[' .. range.kana .. range.kanji .. range.ideograph .. range.kana_graph .. range.punctuation .. ']+', '')
if p ~= '' and n > 0 then insert(tc, "Perkataan ditulis dalam pelbagai tulisan bahasa " .. lang_name) end
local pos = data.headword.pos_category
local rare_chars = {}
for ch in iterate_rare_chars(pagename) do
rare_chars[ch] = true
end
-- Categorise yōon, but exclude kana and mora entries, since they can't be spelled with themselves.
-- FIXME: allow kana categories for morae.
if not (pos == "syllables" or pos == "kana" or pos == "morae") then
for _, mora in ipairs(moraify((ugsub(pagename, "[^" .. range.kana .. "]+", " ")))) do
if not mora:gsub(" +", ""):match("^.?[\128-\191]*$") then
rare_chars[mora] = true
end
end
end
for ch in pairs(rare_chars) do
insert(tc, "Perkataan dieja dengan " .. ch .. " bahasa " .. lang_name)
end
if (
pos ~= "proverbs" and
pos ~= "phrases" and
umatch(ugsub(pagename, "[" .. range.katakana .. "]+", ""), "[" .. range.hiragana .. "]") and
umatch(ugsub(pagename, "[" .. range.hiragana .. "]+", ""), "[" .. range.katakana .. "]")
) then
insert(tc, "Perkataan dieja dengan campuran kana bahasa " .. lang_name)
end
end
pos_functions["Kata kerja"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, args["infl"], 'verbs')
end
pos_functions["Akhiran"] = function(args, data)
add_inflections(data, args["infl"])
end
pos_functions["Kata kerja bantu"] = function(args, data)
insert(data.headword.categories, "Kata kerja bantu bahasa " .. data.lang_name)
add_inflections(data, args["infl"])
data.headword.pos_category = "Kata kerja"
end
pos_functions["Kata kerja suru"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, 'suru', 'verbs')
data.headword.pos_category = "Kata kerja"
end
pos_functions["kata sifat"] = function(args, data)
add_inflections(data, args["infl"], 'adjectives')
end
pos_functions["kata nama"] = function(args, data)
-- the counter (classifier) parameter, only relevant for nouns
local counter = args["count"] or ""
if counter == "-" then
insert(data.headword.inflections, {label = "uncountable"})
elseif counter ~= "" then
insert(data.headword.inflections, {label = "counter", counter})
end
end
pos_functions["Adverba"] = function(args, data)
local opt = args["opt"]
if opt then
opt = adverbs_optional_tag .. opt
end
add_inflections(data, opt, 'adverbs')
end
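--[==[
A standalone sketch of how the opt= value of an adverb travels through the "optionally " prefix and is split into normalised particles. The names `optional_tag`, `optional_aliases` and `parse_optional` are hypothetical stand-ins; the sketch uses string.gmatch where the module uses gsplit from [[Module:string utilities]], so unlike the module it silently skips empty fields.

```lua
local optional_tag = "optionally "
local optional_aliases = {
	["to"] = "と", ["と"] = "と", ["ト"] = "と",
	["ni"] = "に", ["に"] = "に", ["ニ"] = "に",
}

-- Returns a list of normalised particles, or nil if the tag prefix is absent.
local function parse_optional(inflection_type)
	if inflection_type:sub(1, #optional_tag) ~= optional_tag then
		return nil
	end
	local result = {}
	local list = inflection_type:sub(#optional_tag + 1)
	for option in list:gmatch("[^:]+") do
		local normalized = optional_aliases[option]
		if not normalized then
			error('unrecognized adverb opt= argument: "' .. option .. '"')
		end
		result[#result + 1] = normalized
	end
	return result
end

-- What pos_functions["Adverba"] would pass on for opt=to:ニ
local parsed = parse_optional(optional_tag .. "to:ニ")
print(table.concat(parsed, ", "))
```
]==]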
--[==[
Generate categories from the pagename and, optionally, from the part of speech.
Also used by soft-redirect pages ([[Module:ja-see]]).
No sort key is provided.
data = {
pagename = ..., -- (required)
lang = ..., -- (required) language object
categories = {}, -- (required) receive categories
katakana_category = {}, -- (required) receive katakana-sorted categories
pos = ..., -- (optional) "noun", "verb", etc.; no POS categories are added if omitted
}
]==]
function export.cat(data)
data.lang_name = data.lang:getCanonicalName()
data.pagename_kana = detect_pagename_kana(data)
if data.pos then
local pos = data.pos:gsub('x$', 'xe') .. ''
insert(data.categories, pos .. ' bahasa ' .. data.lang_name)
insert(data.categories, require'Module:headword'.pos_lemma_or_nonlemma(pos, true) .. ' bahasa ' .. data.lang_name)
end
data.headword = { categories = data.categories }
add_categories(data)
end
--[==[
The main entry point.
This is the only function that can be invoked from a template.
]==]
function export.show(frame)
local poscat = frame.args[2] or frame.args[1] or error("Part of speech has not been specified. Please pass parameter 1 to the module invocation.")
local alias_of_hist = {alias_of = 'hist', list = false}
local alias_of_infl = {alias_of = "infl"}
local list = {list = true}
local list_allow_holes_separate_no_index = {list = true, allow_holes = true, separate_no_index = true}
local params = {
[1] = list,
['rom'] = list_allow_holes_separate_no_index,
['head'] = list_allow_holes_separate_no_index,
['label'] = {list = true, allow_holes = true},
['hist'] = list, ['hhira'] = alias_of_hist, ['hkata'] = alias_of_hist,
['tr'] = true,
['infl'] = true, ['type'] = alias_of_infl, ['decl'] = alias_of_infl,
['opt'] = true,
['count'] = true,
['sort'] = true,
['pagename'] = true,
}
-- For backwards compatibility with uses of {{ja-syllable}} with the script parameter.
if poscat == "syllables" then
params["sc"] = true
end
local args = require('Module:parameters').process(frame:getParent().args, params)
local data = {
headword = {
pos_category = poscat,
categories = {},
heads = {},
no_redundant_head_cat = true,
inflections = {},
genders = {'m'}, -- placeholder
nogendercat = true
},
--custom info
pagename = args.pagename or mw.loadData("Module:headword/data").pagename,
pagename_kana = nil, -- "hira" "kata" "both", nil
lang_code = frame.args[1],
lang_name = nil, -- "Japanese", "Okinawan" ...
katakana_category = {},
info_mid = {}, -- "godan", "intransitive" ...
info_hist = {}, -- historical kana
inflection_base = {}, -- base of inflections
kanas = {}, -- kana id
}
data.headword.lang = require("Module:languages").getByCode(data.lang_code)
data.lang_name = data.headword.lang:getCanonicalName()
-- sort out all the kanas and do the romanization business
format_headword(args, data)
-- add certain inflections and categories for adjectives, verbs, nouns, or adverbs
if pos_functions[poscat] then
pos_functions[poscat](args, data)
end
-- categories
add_categories(data)
local sort_base = args.sort or data.kanas[1] or data.pagename
data.headword.sort_key = data.headword.lang:makeSortKey(sort_base)
local katakana_category = #data.katakana_category > 0 and
require("Module:utilities").format_categories(
data.katakana_category,
data.headword.lang,
nil,
sort_base,
nil,
require("Module:scripts").getByCode("Kana")
) or ""
-- output
local i_kanas = 0
return katakana_category .. require('Module:headword').full_headword(data.headword):gsub('<span class="gender">.-</span>', function()
return (#data.info_hist > 0 and '<sup>←' .. concat(data.info_hist, ' or ') .. '<sup>[[w:Historical kana orthography|?]]</sup></sup>' or '') .. ('<i>' .. concat(data.info_mid, ' ') .. '</i>')
end):gsub('<strong .->.-</strong>', function(m0)
i_kanas = i_kanas + 1
if data.kanas[i_kanas] then
return m0
end
end):gsub('<span class="headword%-tr tr" dir="ltr"><span class="Latn" lang="ja">', '<span lang="ja-Latn" class="headword-tr tr Latn" dir="ltr">'):gsub('</span></span>', '</span>')
end
return export
egwsrby262fhe6hpnwknt7huiegcpys
281361
281360
2026-04-22T06:29:02Z
PeaceSeekers
3334
Undid revision [[Special:Diff/281360|281360]] by [[Special:Contributions/PeaceSeekers|PeaceSeekers]] ([[User talk:PeaceSeekers|talk]])
281361
Scribunto
text/plain
local m_ja = require("Module:ja")
local m_ja_ruby = require("Module:ja-ruby")
local m_str_utils = require("Module:string utilities")
local byteoffset = mw.ustring.byteoffset
local concat = table.concat
local gsplit = m_str_utils.gsplit
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local max_index = require("Module:table").maxIndex
local moraify = m_ja.moraify
local remove = table.remove
local ugmatch = mw.ustring.gmatch
local ugsub = m_str_utils.gsub
local ulen = m_str_utils.len
local ulower = m_str_utils.lower
local umatch = mw.ustring.match
local usub = m_str_utils.sub
local export = {}
local pos_functions = {}
local range = mw.loadData('Module:ja/data/range')
local Jpan = require("Module:scripts").getByCode("Jpan")
local function remove_links(text)
return (text:gsub("%[%[[^|%]]-|", "")
:gsub("%[%[", "")
:gsub("%]%]", ""))
end
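--[==[
A quick standalone check of remove_links above. The function is repeated unchanged so that the snippet runs without Scribunto; the sample strings are assumptions chosen for illustration.

```lua
-- Drop the target of piped links first, then any remaining link brackets.
-- The outer parentheses discard gsub's second return value (the match count).
local function remove_links(text)
	return (text:gsub("%[%[[^|%]]-|", "")
		:gsub("%[%[", "")
		:gsub("%]%]", ""))
end

print(remove_links("[[食べる|たべる]]"))
print(remove_links("[[持つ|も]]って[[来る]]"))
```

For a piped link only the display text survives; for a plain link the target itself is kept.
]==]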
local function assign_kana_to_kanji(head, kana, pagename, template_name)
-- TODO: uses deprecated module
local m_tu = require'Module:template utilities'
local kanji_pos = {[0] = { nil, 0}}
local head_nolink = {}
local link_border = 0
local function insert_kanji_pos(substr)
insert(head_nolink, substr)
for p1, w1 in ugmatch(substr, '()([々' .. range.kanji .. '])') do
p1 = byteoffset(substr, p1) + link_border
insert(kanji_pos, { p1, p1 + w1:len() - 1 })
end
end
for p1, p2, w1 in m_tu.gfind_bracket(head, {['%[%['] = ']]'}) do
insert_kanji_pos(head:sub(link_border + 1, p1 - 1))
local p_pipe = w1:find'|' or 2
link_border = p1 + p_pipe - 1
insert_kanji_pos(w1:sub(p_pipe + 1, -3))
link_border = p2
end
insert_kanji_pos(head:sub(link_border + 1))
head_nolink = concat(head_nolink)
local pagetext = mw.title.new(pagename):getContent()
if not pagetext then return head, kana end
local non_kanji = {}
local last_kanji = 1
for p1 in ugmatch(head_nolink, '[々' .. range.kanji .. ']()') do
insert(non_kanji, usub(head_nolink, last_kanji, p1 - 2))
last_kanji = p1
end
insert(non_kanji, usub(head_nolink, last_kanji))
for kanjitab in pagetext:gmatch('(){{%s*' .. template_name) do
kanjitab = select(3, m_tu.find_bracket(pagetext, m_tu.brackets_temp, kanjitab))
if not kanjitab then error('ill-formed [[t:' .. template_name:gsub('%%', '') .. ']] syntax') end
kanjitab = m_tu.parse_temp(kanjitab)
local readings = {}
local readings_len = {}
for i = 1, max_index(kanjitab.args) do
local r_i = kanjitab.args[i] or ''
local r_o = kanjitab.args['o' .. i] or ''
if kanjitab.args['k' .. i] then
readings[i] = kanjitab.args['k' .. i] .. r_o
readings_len[i] = tonumber(r_i:match'^%s*%D*(%d*)%s*$') or 1
else
local r_kana, r_len = r_i:match'^%s*(%D*)(%d*)%s*$'
readings[i] = r_kana .. r_o
readings_len[i] = tonumber(r_len) or 1
end
end
local kana_decom = {}
local reading_id = 1
local reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(kana_decom, non_kanji[i])
insert(kana_decom, readings[reading_id])
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(kana_decom, non_kanji[#non_kanji])
local function strip_nonkana(str, repl)
return ugsub(str, '[^' .. range.kana .. ']+', repl) or nil
end
local xeno_reading = {strip_nonkana(kana, ''):match('^' .. strip_nonkana(concat(kana_decom), '(.-)') .. '$')}
if #xeno_reading > 0 then
local head_decom = {}
reading_id = 1
reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(head_decom, head:sub(kanji_pos[i - 1][2] + 1, kanji_pos[i][1] - 1))
insert(head_decom, head:sub(kanji_pos[i][1], kanji_pos[i + reading_len - 1][2]))
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(head_decom, head:sub(kanji_pos[#non_kanji - 1][2] + 1))
if #head_decom ~= #kana_decom then error('number of parameters in [[t:' .. template_name:gsub('%%', '') .. ']] is incorrect') end
local n_xeno_reading = 0
for i = 1, #kana_decom, 2 do
kana_decom[i] = ugsub(kana_decom[i], '[^' .. range.kana .. ']+', function()
n_xeno_reading = n_xeno_reading + 1
if xeno_reading[n_xeno_reading] == '' then return nil
else return xeno_reading[n_xeno_reading] end
end)
end
return concat(head_decom, '%'), concat(kana_decom, '%')
end
end
return head, kana
end
local en_grades = {
"gred pertama", "gred kedua", "gred ketiga",
"gred keempat", "gred kelima", "gred keenam",
"sekolah menengah", "jinmeiyō", "hyōgai"
}
local aliases = {
['transitive']='tr', ['trans']='tr',
['intransitive']='in', ['intrans']='in', ['intr']='in',
['godan']='1', ['ichidan']='2', ['irregular']='irr'
}
local adverbs_optional_tag = 'optionally '
local adverbs_optional_aliases = {
['to']='と', ['と']='と', ['ト']='と',
['ni']='に', ['に']='に', ['ニ']='に',
}
local adverbs_optional_links = {
['と']='[[と#Japanese:_adverbs|と]]',
['に']='[[に]]',
}
local function formatting_adjustments(rom, kana, pos_category)
-- hyphens for prefixes, suffixes, and counters (classifiers)
if pos_category == "Awalan" then
rom = rom:gsub('%-?$', '-')
elseif pos_category == "Akhiran" or pos_category == "Bentuk akhiran" or pos_category == "counters" or pos_category == "classifiers" then
rom = rom:gsub('^%-?', '-')
elseif pos_category == "Kata nama khas" and not kana:match'%^' then -- automatic caps for proper nouns, if not already specified
rom = ugsub(ugsub(rom, '%f[^%s%c%p]%l', string.uupper), "%w'%u", ulower) -- no caps after medial apostrophes
end
return rom
end
local function kana_to_romaji_with_pos_format(kana, data, args)
if data.headword.pos_category == "Bentuk gabungan" or data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" then
return "-"
end
local rom = remove_links(kana_to_romaji(kana, data.lang_code))
-- make adjustments for -u verbs and -i adjectives
if args['infl'] == '1' or args['infl'] == '1s' or args['infl'] == 'godan' then
rom = rom:gsub('ō$', 'ou'):gsub('ū$', 'uu')
elseif args['infl'] == 'i' or args['infl'] == 'is' or args['infl'] == 'い' then
rom = rom:gsub('ī$', 'ii')
end
return formatting_adjustments(rom, kana, data.headword.pos_category)
end
local function iterate_rare_chars(text)
local ch, i
return function()
repeat
ch, i = umatch(text, "([" .. range.kana .. range.kana_graph .. "!-/:-@%[\\-`×△○◎。-〠〶〷〻-〽・·゠=~][゙゚]*)()", i)
until not (ch and umatch(ch, "^[ぁ-ちっつて-ろんァ-チッツテ-ロンヲ-゚]$"))
return ch
end
end
local function historical_kana(data, hist_kana, modern_kana)
-- Disallow historical kana for kana and morae, as there's no one-to-one correspondence.
local pos = data.headword.pos_category
if pos == "syllables" or pos == "kana" or pos == "morae" then
error(("Cannot specify historical kana for %s."):format(pos))
end
local hist_kana_no_formatting = hist_kana:gsub("[%^%-%. %%]+", "")
local rare_chars, lang_name, hc = {}, data.lang_name, data.headword.categories
for ch in iterate_rare_chars(hist_kana_no_formatting) do
if not (modern_kana and modern_kana:find(ch)) then
rare_chars[ch] = true
end
end
for _, mora in ipairs(moraify((ugsub(hist_kana_no_formatting, "[^" .. range.kana .. "]+", " ")))) do
if not (mora:gsub(" +", ""):match("^.?[\128-\191]*$") or (modern_kana and modern_kana:find(mora))) then
rare_chars[mora] = true
end
end
for ch in pairs(rare_chars) do
insert(hc, "Perkataan mengikut sejarah dieja dengan " .. ch .. " bahasa " .. lang_name)
end
insert(data.info_hist, require("Module:ja-link").link({
lang = data.headword.lang,
lemma = hist_kana,
tr = formatting_adjustments(
remove_links(kana_to_romaji(hist_kana, data.lang_code, nil, {hist = true})),
hist_kana,
pos
)
}, {
face = "head",
disableSelfLink = true,
}))
end
local function detect_pagename_kana(data, digraphs)
local pagename = data.pagename
-- Exclude "&" and "@", which are part of %p (e.g. リズム&ブルース).
local function remove_kana(m)
return m:match("[&@]") or ""
end
if ugsub(pagename, '[%p%s%c' .. range.hiragana .. (digraphs and "ゟ" or "") .. ']', remove_kana) == "" then
return 'hira'
elseif ugsub(pagename, '[%p%s%c' .. range.katakana .. (digraphs and "ヿ" or "") .. ']', remove_kana) == "" then
return 'kata'
elseif ugsub(pagename, '[%p%s%c' .. range.kana .. (digraphs and "ゟヿ" or "") .. ']', remove_kana) == "" then
return 'both'
end
end
-- go through args and build inflections by finding whatever kanas were given to us
local function format_headword(args, data)
local pagename, kanas, lang_name = data.pagename, data.kanas, data.lang_name
data.pagename_kana = detect_pagename_kana(data)
if args[1][1] and not args[1][1]:match'[\128-\255]' then
-- filter out POS designations
remove(args[1], 1)
end
local linked_translit = data.headword.lang:link_tr(Jpan)
local suru_ending, rom_suru_ending
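if data.headword.pos_category == "Kata kerja suru" or data.headword.pos_category == "kata kerja suru" then -- accept the capitalised key used in pos_functions as well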
suru_ending = "[[する]]"
rom_suru_ending = linked_translit and " [[suru]]" or " suru"
else
suru_ending, rom_suru_ending = "", ""
end
if data.pagename_kana then -- pure-kana-title entry
if #args.head > 0 or args.head.default then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
-- {{ja-xxx}} vs {{ja-xxx|こ.うし}} vs {{ja-xxx|コウシ}} in [[こうし]]
if not args[1][1] then
args[1][1] = pagename
elseif remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) ~= pagename then
insert(args[1], 1, pagename)
end
for i, k in ipairs(args[1]) do
insert(data.headword.heads, {
term = k:gsub("[%^%-%. %%]+", "") .. suru_ending,
tr = '-',
l = args.label[i] and {args.label[i]} or nil,
})
end
for i = 1, math.max(args.rom.maxindex, 1) do
local rom = args.rom[i] or args.rom.default or kana_to_romaji_with_pos_format(args[1][1], data, args)
if not data.headword.heads[i] then
data.headword.heads[i] = {term = data.headword.heads[i-1].term}
end
if rom == "-" then
data.headword.heads[i].tr = "-"
elseif linked_translit then
data.headword.heads[i].tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
data.headword.heads[i].tr = rom .. rom_suru_ending
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(args[i][1]:gsub("[%^%-%. %%]+", "")) .. suru_ending
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
kanas[1] = pagename
if args.hist[1] then
historical_kana(data, args.hist[1], args[1][1])
end
else -- non-pure-kana-title entry
if #args[1] == 0 and not (data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" or data.headword.pos_category == "Simbol") then
error("Kana form is required.")
end
if args.head.default == pagename then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
local rom_repetition_final = {}
for i, k in ipairs(args[1]) do
local rom_auto = kana_to_romaji_with_pos_format(k, data, args)
local head = args.head[i] or args.head.default or pagename
if args.head[i] == pagename then
insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
end
local head_for_ruby, kana_for_ruby
if ulen(head) > 1 and head:match'%%' == nil and k:match'%%' == nil then
head_for_ruby, kana_for_ruby = assign_kana_to_kanji(head, k, pagename, data.lang_code .. '%-kanjitab')
else
head_for_ruby, kana_for_ruby = head, k
end
local format_table = m_ja_ruby.parse_text(head_for_ruby, kana_for_ruby, {
try = 'force',
try_force_limit = 10000
})
local kana_bare = remove_links(k:gsub("[%^%-%. %%]+", ""))
local rom = args.rom[i] or args.rom.default or rom_auto
head = {
term = m_ja_ruby.to_wiki(format_table, {
break_link = true,
}):gsub('<rt>(..-)</rt>', "<rt>[[" .. kana_bare .."|%1]]</rt>") .. suru_ending,
l = args.label[i] and {args.label[i]} or nil,
}
if rom == "-" or rom_repetition_final[rom] then
head.tr = "-"
elseif linked_translit then
head.tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
head.tr = rom .. rom_suru_ending
end
insert(data.headword.heads, head)
rom_repetition_final[rom] = true
insert(kanas, kana_bare)
if args.hist[i] then
historical_kana(data, args.hist[i], k)
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(m_ja_ruby.to_markup(format_table)) .. suru_ending
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
local first_reading, multiple = kanas[1]
if not first_reading then
return
end
first_reading = ulower(kana_to_romaji(first_reading, data.lang_code)):gsub("%%", "")
for i = 2, #kanas do
if ulower(kana_to_romaji(kanas[i], data.lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
break
end
end
if not multiple then
local lang_code = data.lang_code
local content = mw.title.getCurrentTitle():getContent()
local loc1, loc2 = content:find("%f[^%z%s]==%s*" .. lang_name:gsub("%-", "%%%-") .. "%s*==()")
loc2 = content:find("%f[^%z%s]==[^\n=]+==", loc2)
if loc1 then
content = content:sub(loc1, loc2)
for template in require("Module:template parser").find_templates(content) do
local name, reading = template:get_name()
if (
name == lang_code .. "-head" or
name == lang_code .. "-pos"
) then
reading = template:get_arguments()[2]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif (
name == lang_code .. "-noun" or
name == lang_code .. "-verb" or
name == lang_code .. "-adj" or
name == lang_code .. "-phrase" or
name == lang_code .. "-verb form" or
name == lang_code .. "-verb-suru"
) then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif name == lang_code .. "-see" then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
-- if umatch(reading, "[^" .. range.kana .. "]") then
-- TODO: check linked page
-- end
end
if reading and ulower(kana_to_romaji(reading, lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
end
end
end
end
if multiple then
insert(data.headword.categories, "Perkataan dengan pelbagai bacaan bahasa " .. lang_name)
end
end
end
local function add_transitivity(data, tr)
local categories, lang_name = data.headword.categories, data.lang_name
tr = aliases[tr] or tr
if tr == "tr" then
insert(data.info_mid, 'transitive')
insert(categories, "Kata kerja transitif bahasa " .. lang_name)
elseif tr == "in" then
insert(data.info_mid, 'intransitive')
insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
elseif tr == "both" then
insert(data.info_mid, 'transitive or intransitive')
insert(categories, "Kata kerja transitif bahasa " .. lang_name)
insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
else
insert(categories, "Kata kerja tanpa ketransitifan bahasa " .. lang_name)
end
end
local function get_final(lemma, data)
return kana_to_romaji(remove(moraify(m_ja_ruby.to_ruby(m_ja_ruby.parse_markup(lemma)))), data.lang_code)
end
local function add_language_fragment(t, lang_name)
for k, v in ipairs(t) do
t[k] = v:gsub("%[%[([^]#]*)%]%]", "[[%1#%%s|%1]]"):format(lang_name)
end
end
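-- Illustrative sketch only (not part of the module's logic): the effect of
-- add_language_fragment on a list of forms. "Jepun" is an assumed example value
-- for lang_name; bare links gain a "#language" fragment, other text is untouched.
--   local forms = {"[[な]]", " [[ni]]", "き"}
--   add_language_fragment(forms, "Jepun")
--   -- forms[1] == "[[な#Jepun|な]]"
--   -- forms[2] == " [[ni#Jepun|ni]]"
--   -- forms[3] == "き"   (contains no bare link, so it is left as-is)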
local function add_inflections(data, inflection_type, cat_suffix)
local lang_name = data.lang_name
local lemma = data.inflection_base.form
local romaji = data.inflection_base.romaji
inflection_type = aliases[inflection_type] or inflection_type
local function replace_suffix(lemma_from, lemma_to, romaji_from, romaji_to)
-- e.g. 持って来る, lemma = "[持](も)って来(く)る"
-- lemma_from = "くる", lemma_to = {"き","きた"}
add_language_fragment(lemma_to, lang_name)
add_language_fragment(romaji_to, lang_name)
local result = {}
local pattern_from, n_from = lemma_from:gsub('.[\128-\191]*', function(c)
return '[' .. c .. m_ja.hira_to_kata(c) .. ']([^' .. range.kana .. ']*)'
end)
pattern_from = pattern_from .. '$'
-- "[くク]([^kana range]*)[るル]([^kana range]*)$"
for i_lemma_to, s_lemma_to in ipairs(lemma_to) do
local n_to = 0
local pattern_to = s_lemma_to:gsub('.[\128-\191]*', function(c)
if n_to < n_from then
n_to = n_to + 1
return c .. "%" .. n_to
else
return c
end
end)
for i = n_to + 1, n_from do
pattern_to = pattern_to .. "%" .. i
end
-- "き%1%2", "き%1た%2"
local lemma_inflected, success = ugsub(lemma, pattern_from, pattern_to)
if success == 0 then
return
end
local romaji_inflected
romaji_inflected, success = romaji:gsub(romaji_from .. "$", romaji_to[i_lemma_to])
if success == 0 then
romaji_inflected, success = romaji:gsub("%[%[" .. romaji_from .. "%]%]$", "[[" .. romaji_to[i_lemma_to] .. "]]")
if success == 0 then
return
end
end
insert(result, {lemma = lemma_inflected, romaji = romaji_inflected})
end
return result -- {{lemma="[持](も)って来(き)",romaji="motteki"},{lemma="[持](も)って来(き)た",romaji="mottekita"}}
end
local function insert_form(label, ...)
-- label = "stem" or "past" etc.
-- ... = {lemma=...,romaji=...},{lemma=...,romaji=...}
local labeled_forms = {label = label}
for _, v in ipairs{...} do
local table_form = m_ja_ruby.parse_markup(v.lemma)
local form_term = m_ja_ruby.to_wiki(table_form)
if not form_term:find'%[%[.+%]%]' then
form_term = '[[' .. m_ja_ruby.to_text(table_form) .. '#' .. lang_name .. '|' .. form_term .. ']]'
end
insert(labeled_forms, {
term = form_term,
tr = v.romaji,
})
end
insert(data.headword.inflections, labeled_forms)
end
local inflected_forms
if data.lang_code == 'ja' then
if inflection_type == '1' or inflection_type == '1s' then
insert(data.info_mid, '<abbr title="godan (group 1) conjugation">godan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " godan bahasa " .. lang_name)
local romaji = data.inflection_base.romaji
if cat_suffix == "Kata kerja" then
local final = get_final(lemma, data)
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -" .. final .. " bahasa " .. lang_name)
if final == "ru" then
if umatch(romaji, "[iIīĪ]ru$") then
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -iru bahasa " .. lang_name)
elseif umatch(romaji, "[eEēĒ]ru$") then
insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -eru bahasa " .. lang_name)
end
end
end
end
if inflection_type == '1' then
inflected_forms =
replace_suffix('く', {'き', 'いた'}, 'ku', {'ki', 'ita'}) or
replace_suffix('ぐ', {'ぎ', 'いだ'}, 'gu', {'gi', 'ida'}) or
replace_suffix('す', {'し', 'した'}, 'su', {'shi', 'shita'}) or
replace_suffix('つ', {'ち', 'った'}, 'tsu', {'chi', 'tta'}) or
replace_suffix('ぬ', {'に', 'んだ'}, 'nu', {'ni', 'nda'}) or
replace_suffix('ぶ', {'び', 'んだ'}, 'bu', {'bi', 'nda'}) or
replace_suffix('む', {'み', 'んだ'}, 'mu', {'mi', 'nda'}) or
replace_suffix('る', {'り', 'った'}, 'ru', {'ri', 'tta'}) or
replace_suffix('う', {'い', 'った'}, 'u', {'i', 'tta'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
else
inflected_forms =
replace_suffix('る', {'り', 'った', 'い'}, 'ru', {'ri', 'tta', 'i'}) or --くださる
replace_suffix('いく', {'いき', 'いった'}, 'iku', {'iki', 'itta'}) or --行く
replace_suffix('う', {'い', 'うた'}, 'ou', {'oi', 'ōta'}) --問う
if inflected_forms then
insert_form('dasar', inflected_forms[1], inflected_forms[3])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
end
elseif inflection_type == '2' then
insert(data.info_mid, '<abbr title="ichidan (group 2) conjugation">ichidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " ichidan bahasa " .. lang_name)
local romaji = data.inflection_base.romaji
if umatch(romaji, "[iIīĪ]ru$") then
insert(data.headword.categories, cat_suffix .. " kami ichidan bahasa " .. lang_name)
elseif umatch(romaji, "[eEēĒ]ru$") then
insert(data.headword.categories, cat_suffix .. " shimo ichidan bahasa " .. lang_name)
else
insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name)
end
end
inflected_forms = replace_suffix('る', {'', 'た'}, 'ru', {'', 'ta'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'suru' then
insert(data.info_mid, '<abbr title="suru (group 3) conjugation">suru</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " suru bahasa " .. lang_name)
end
inflected_forms =
replace_suffix('する', {'し', 'した'}, 'suru', {'shi', 'shita'}) or
replace_suffix('ずる', {'じ', 'じた'}, 'zuru', {'ji', 'jita'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'kuru' then
insert(data.info_mid, '<abbr title="kuru (group 3) conjugation">kuru</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " kuru bahasa " .. lang_name)
end
inflected_forms = replace_suffix('くる', {'き', 'きた'}, 'kuru', {'ki', 'kita'})
if inflected_forms then
insert_form('dasar', inflected_forms[1])
insert_form('lampau', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'i' or inflection_type == 'い' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name)
end
inflected_forms = replace_suffix('い', {'く'}, 'i', {'ku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'is' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name)
end
inflected_forms = replace_suffix('いい', {'よく'}, 'ii', {'yoku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'na' or inflection_type == 'な' then
insert(data.info_mid, '<abbr title="-na (type II) inflection">-na</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " な-na bahasa " .. lang_name)
end
inflected_forms = replace_suffix('', {'[[な]]', '[[に]]'}, '', {' [[na]]', ' [[ni]]'})
insert_form('adnominal', inflected_forms[1])
insert_form('adverbial', inflected_forms[2])
elseif inflection_type == "yo" then
insert(data.info_mid, '<abbr title="yodan conjugation (classical)"><sup><small>†</small></sup>yodan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " yodan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " yodan berakhir dengan -" .. get_final(lemma, data) .. " bahasa ".. lang_name)
end
elseif inflection_type == "kami ni" then
insert(data.info_mid, '<abbr title="kami nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " kami nidan bahasa " .. lang_name)
end
elseif inflection_type == "shimo ni" then
insert(data.info_mid, '<abbr title="shimo nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name)
insert(data.headword.categories, cat_suffix .. " shimo nidan bahasa " .. lang_name)
end
elseif inflection_type == "rahen" then
insert(data.info_mid, '<abbr title="r-special conjugation (classical)"><sup><small>†</small></sup>-ri</abbr>')
elseif inflection_type == "sahen" then
insert(data.info_mid, '<abbr title="s-special conjugation (classical)"><sup><small>†</small></sup>-se</abbr>')
elseif inflection_type == "kahen" then
insert(data.info_mid, '<abbr title="k-special conjugation (classical)"><sup><small>†</small></sup>-ko</abbr>')
elseif inflection_type == "nahen" then
insert(data.info_mid, '<abbr title="n-special conjugation (classical)"><sup><small>†</small></sup>-n</abbr>')
elseif inflection_type == "nari" or inflection_type == "なり" then
insert(data.info_mid, '<abbr title="-nari inflection (classical)"><sup><small>†</small></sup>-nari</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " なり-nari bahasa " .. lang_name)
end
elseif inflection_type == 'tari' or inflection_type == 'たり' then
insert(data.info_mid, '<abbr title="-tari inflection (classical)"><sup><small>†</small></sup>-tari</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " たり-tari bahasa " .. lang_name)
end
inflected_forms = replace_suffix('', {'[[とした]]', '[[たる]]', '[[と]]', '[[として]]'}, '', {' [[to shita]]', ' [[taru]]', ' [[to]]', ' [[to shite]]'})
insert_form('adnominal', inflected_forms[1], inflected_forms[2])
insert_form('adverbial', inflected_forms[3], inflected_forms[4])
elseif inflection_type == "ku" or inflection_type == "く" then
insert(data.info_mid, '<abbr title="-ku inflection (classical)"><sup><small>†</small></sup>-ku</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " く-ku bahasa " .. lang_name)
end
elseif inflection_type == "shiku" or inflection_type == "しく" then
insert(data.info_mid, '<abbr title="-shiku inflection (classical)"><sup><small>†</small></sup>-shiku</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " しく-shiku bahasa " .. lang_name)
end
elseif inflection_type == "ka" or inflection_type == "か" then
insert(data.info_mid, '<abbr title="-ka inflection (dialectal)"><sup><small>†</small></sup>-ka</abbr>')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " か-ka bahasa " .. lang_name)
end
elseif inflection_type and inflection_type:len() > adverbs_optional_tag:len() and inflection_type:sub(1, adverbs_optional_tag:len()) == adverbs_optional_tag then
local adverbs_optional_list = inflection_type:sub(adverbs_optional_tag:len() + 1)
for option in gsplit(adverbs_optional_list, ':') do
local normalized_option = adverbs_optional_aliases[option]
if not normalized_option then
error('unrecognized adverb opt= argument: "' .. option .. '"')
end
local normalized_option_romaji = kana_to_romaji(normalized_option, data.lang_code)
local normalized_option_link = adverbs_optional_links[normalized_option]
inflected_forms = replace_suffix('', {normalized_option_link}, '', {' [[' .. normalized_option_romaji .. ']]'})
insert_form('optionally as', inflected_forms[1])
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " secara pilihan mengambil " .. normalized_option .. "-" .. normalized_option_romaji .. " bahasa " .. lang_name)
end
end
elseif inflection_type == 'irr' then
insert(data.info_mid, 'irregular')
if cat_suffix then
insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name)
end
elseif inflection_type == '-' or inflection_type == 'un' then
insert(data.info_mid, 'uninflectable')
end
--elseif data.lang_code == 'ryu' then ...
end
end
local function add_categories(data)
local lang_name = data.lang_name
local pagename = data.pagename
local tc = data.headword.categories
-- adds category [langname] terms spelled with jōyō kanji or [langname] terms spelled with non-jōyō kanji
-- (if it contains any kanji)
local number_of_kanji = 0
for c in ugmatch(pagename, "[" .. range.kanji .. "々〻]") do
number_of_kanji = number_of_kanji + 1
if c ~= "々" and c ~= "〻" then -- Not a kanji for the purposes of categorisation.
insert(tc, ("Perkataan dieja dengan kanji %s bahasa " .. lang_name):format(en_grades[m_ja.kanji_grade(c)]))
end
end
-- categorize by number of kanji
if number_of_kanji ~= 0 then
insert(tc, ("Perkataan dengan %s aksara kanji bahasa " .. lang_name):format(number_of_kanji))
-- single-kanji terms
if ulen(pagename) == 1 then
insert(tc, "Perkataan dieja dengan " .. pagename .. " bahasa " .. lang_name)
insert(tc, "Perkataan kanji tunggal bahasa " .. lang_name)
end
end
-- categorize by the script of the pagename or specific characters contained in it
-- if pagename is hiragana or katakana
if detect_pagename_kana(data, true) == 'hira' then insert(tc, "Hiragana bahasa " .. lang_name) end
if detect_pagename_kana(data, true) == 'kata' then insert(data.katakana_category, "Katakana bahasa " .. lang_name) end
local p, n = ugsub(pagename, '[' .. range.kana .. range.kanji .. range.ideograph .. range.kana_graph .. range.punctuation .. ']+', '')
if p ~= '' and n > 0 then insert(tc, "Perkataan ditulis dalam pelbagai tulisan bahasa " .. lang_name) end
local pos = data.headword.pos_category
local rare_chars = {}
for ch in iterate_rare_chars(pagename) do
rare_chars[ch] = true
end
-- Categorise yōon, but exclude kana and mora entries, since they can't be spelled with themselves.
-- FIXME: allow kana categories for morae.
if not (pos == "syllables" or pos == "kana" or pos == "morae") then
for _, mora in ipairs(moraify((ugsub(pagename, "[^" .. range.kana .. "]+", " ")))) do
if not mora:gsub(" +", ""):match("^.?[\128-\191]*$") then
rare_chars[mora] = true
end
end
end
for ch in pairs(rare_chars) do
insert(tc, "Perkataan dieja dengan " .. ch .. " bahasa " .. lang_name)
end
if (
pos ~= "proverbs" and
pos ~= "phrases" and
umatch(ugsub(pagename, "[" .. range.katakana .. "]+", ""), "[" .. range.hiragana .. "]") and
umatch(ugsub(pagename, "[" .. range.hiragana .. "]+", ""), "[" .. range.katakana .. "]")
) then
insert(tc, "Perkataan dieja dengan campuran kana bahasa " .. lang_name)
end
end
pos_functions["Kata kerja"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, args["infl"], 'verbs')
end
pos_functions["Akhiran"] = function(args, data)
add_inflections(data, args["infl"])
end
pos_functions["Kata kerja bantu"] = function(args, data)
insert(data.headword.categories, "Kata kerja bantu bahasa " .. data.lang_name)
add_inflections(data, args["infl"])
data.headword.pos_category = "Kata kerja"
end
pos_functions["Kata kerja suru"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, 'suru', 'verbs')
data.headword.pos_category = "Kata kerja"
end
pos_functions["Kata sifat"] = function(args, data)
add_inflections(data, args["infl"], 'adjectives')
end
pos_functions["Kata nama"] = function(args, data)
-- the counter (classifier) parameter, only relevant for nouns
local counter = args["count"] or ""
if counter == "-" then
insert(data.headword.inflections, {label = "uncountable"})
elseif counter ~= "" then
insert(data.headword.inflections, {label = "counter", counter})
end
end
pos_functions["Adverba"] = function(args, data)
local opt = args["opt"]
if opt then
opt = adverbs_optional_tag .. opt
end
add_inflections(data, opt, 'adverbs')
end
--[==[
Generate categories by pagename, also optionally by POS
Also for use in soft redirect pages ([[Module:ja-see]]).
Sortkey is not provided.
data = {
pagename = ..., -- (required)
lang = ..., -- (required) language object
categories = {}, -- (required) receive categories
katakana_category = {}, -- (required) receive katakana-sorted categories
pos = ..., -- (optional) "noun", "verb", etc.; no POS categories are added if omitted
}
]==]
function export.cat(data)
data.lang_name = data.lang:getCanonicalName()
data.pagename_kana = detect_pagename_kana(data)
if data.pos then
local pos = data.pos:gsub('x$', 'xe') .. ''
insert(data.categories, pos .. ' bahasa ' .. data.lang_name)
insert(data.categories, require'Module:headword'.pos_lemma_or_nonlemma(pos, true) .. ' bahasa ' .. data.lang_name)
end
data.headword = { categories = data.categories }
add_categories(data)
end
--[==[
The main entry point.
This is the only function that can be invoked from a template.
]==]
function export.show(frame)
local poscat = frame.args[2] or frame.args[1] or error("Part of speech has not been specified. Please pass parameter 1 to the module invocation.")
local alias_of_hist = {alias_of = 'hist', list = false}
local alias_of_infl = {alias_of = "infl"}
local list = {list = true}
local list_allow_holes_separate_no_index = {list = true, allow_holes = true, separate_no_index = true}
local params = {
[1] = list,
['rom'] = list_allow_holes_separate_no_index,
['head'] = list_allow_holes_separate_no_index,
['label'] = {list = true, allow_holes = true},
['hist'] = list, ['hhira'] = alias_of_hist, ['hkata'] = alias_of_hist,
['tr'] = true,
['infl'] = true, ['type'] = alias_of_infl, ['decl'] = alias_of_infl,
['opt'] = true,
['count'] = true,
['sort'] = true,
['pagename'] = true,
}
-- For backwards compatibility with uses of {{ja-syllable}} with the script parameter.
if poscat == "syllables" then
params["sc"] = true
end
local args = require('Module:parameters').process(frame:getParent().args, params)
local data = {
headword = {
pos_category = poscat,
categories = {},
heads = {},
no_redundant_head_cat = true,
inflections = {},
genders = {'m'}, -- placeholder
nogendercat = true
},
--custom info
pagename = args.pagename or mw.loadData("Module:headword/data").pagename,
pagename_kana = nil, -- "hira" "kata" "both", nil
lang_code = frame.args[1],
lang_name = nil, -- "Japanese", "Okinawan" ...
katakana_category = {},
info_mid = {}, -- "godan", "intransitive" ...
info_hist = {}, -- historical kana
inflection_base = {}, -- base of inflections
kanas = {}, -- kana id
}
data.headword.lang = require("Module:languages").getByCode(data.lang_code)
data.lang_name = data.headword.lang:getCanonicalName()
-- sort out all the kanas and do the romanization business
format_headword(args, data)
-- add certain inflections and categories for adjectives, verbs, nouns, or adverbs
if pos_functions[poscat] then
pos_functions[poscat](args, data)
end
-- categories
add_categories(data)
local sort_base = args.sort or data.kanas[1] or data.pagename
data.headword.sort_key = data.headword.lang:makeSortKey(sort_base)
local katakana_category = #data.katakana_category > 0 and
require("Module:utilities").format_categories(
data.katakana_category,
data.headword.lang,
nil,
sort_base,
nil,
require("Module:scripts").getByCode("Kana")
) or ""
-- output
local i_kanas = 0
return katakana_category .. require('Module:headword').full_headword(data.headword):gsub('<span class="gender">.-</span>', function()
return (#data.info_hist > 0 and '<sup>←' .. concat(data.info_hist, ' or ') .. '<sup>[[w:Historical kana orthography|?]]</sup></sup>' or '') .. ('<i>' .. concat(data.info_mid, ' ') .. '</i>')
end):gsub('<strong .->.-</strong>', function(m0)
i_kanas = i_kanas + 1
if data.kanas[i_kanas] then
return m0
end
end):gsub('<span class="headword%-tr tr" dir="ltr"><span class="Latn" lang="ja">', '<span lang="ja-Latn" class="headword-tr tr Latn" dir="ltr">'):gsub('</span></span>', '</span>')
end
return export
laelznpt61s3a3oafnsfj3trmwx3ys1
Modul:script tag link
828
34900
281276
146325
2026-04-21T14:05:10Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/90159474|90159474]])
281276
Scribunto
text/plain
local export = {}
local codepoint_to_script = require("Module:scripts").charToScript
-- FIXME: Temporary hack for script renames.
local alias_mapping = {
polytonic = "Polyt",
Latinx = "Latn",
Latnx = "Latn",
}
-- If there are characters in both scripts (the key and value), the value should be used.
local overridden_by = {
Grek = "Polyt",
Cyrl = "Cyrs",
}
local function get_script(text)
local sc, curr_sc
for codepoint in mw.ustring.gcodepoint(text) do
curr_sc = codepoint_to_script(codepoint)
curr_sc = alias_mapping[curr_sc] or curr_sc
if curr_sc ~= "None" then
if sc == nil then
sc = curr_sc
elseif curr_sc ~= sc then
-- For instance, Grek -> Polyt.
if overridden_by[sc] == curr_sc then
sc = curr_sc
-- For instance, Grek and Latn.
elseif overridden_by[curr_sc] ~= sc then
require("Module:debug").track("also/no sc detected")
mw.log("Two scripts found in " .. tostring(text) .. ": "
.. tostring(sc) .. " and " .. tostring(curr_sc) .. ".")
sc = nil
break
end
end
end
end
return sc
end
local function link(text, sc)
return '<span class="' .. sc .. '">[[' .. text .. ']]</span>'
end
function export.tag_link(link_innards, text)
local sc = get_script(text or link_innards) or "None"
return link(link_innards, sc)
end
function export.tag_links(str)
str = str:gsub('%[%[(.-)%]%]', function (innards)
local sc
-- The actual displayed text, whose script we need to detect,
-- if different from link innards.
local text
if innards:find("|") then
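-- `innards` never contains the closing "]]" (the outer pattern excludes it),
-- so match the display text up to the end of the string.
text = innards:match("|(.+)$")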
if not text then
return
end
end
return export.tag_link(innards, text)
end)
return str
end
function export.tag_links_frame(frame)
local args = {}
for k, v in pairs(frame:getParent().args) do
if k == 1 then
args[k] = v
else
error("The parameter " .. k .. " is not used by this template.")
end
end
local text = args[1]
if text then
text = mw.text.trim(text)
return export.tag_links(text)
end
end
function export.link(frame)
local args = {}
for k, v in pairs(frame:getParent().args) do
if k == 1 then
args[k] = v
else
error("The parameter " .. k .. " is not used by this template.")
end
end
local text = args[1]
if text then
return export.tag_link(text)
end
end
return export
b6041o91hdoysgx74sd5awl3zxoddru
Templat:Jpan-pos/format
10
35002
281365
247886
2026-04-22T06:58:03Z
PeaceSeekers
3334
281365
wikitext
text/x-wiki
<includeonly>{{#switch:{{{1|}}}
|acronym=Akronim
|adj|adjective=Kata sifat
|adjective form=Bentuk kata sifat
|adnominal=Adnominal
|adv|adverb|adverba=Adverba
|adverb form=Bentuk adverba
|affix=Imbuhan
|aux=auxiliary verbs
|classifier=Penjodoh bilangan
|combining form=combining forms
|conjunction=Kata hubung
|counter=counters
|ideophonic root=ideophonic roots
|idiom=Peribahasa
|infix=Sisipan
|interjection=Kata seru
|iteration mark=iteration marks
|kana=kana
|mora=mora
|noun=Kata nama
|noun form=Bentuk kata nama
|numeral=Kata bilangan
|numeral symbol=numeral symbols
|particle=Partikel
|phrase=Frasa
|postposition=postpositions
|prefix=Awalan
|pronoun=Kata ganti nama
|pronoun form=Bentuk kata ganti nama
|proper|proper noun=Kata nama khas
|proverb=proverbs
|punctuation mark=Tanda baca
|suffix=Akhiran
|suffix form=Bentuk akhiran
|syllable=Suku kata
|symbol=Simbol
|verb=Kata kerja
|verb suru=Kata kerja suru
|verb form=Bentuk kata kerja
|#default = {{error|Invalid part of speech.}}}}</includeonly><noinclude>{{documentation}}</noinclude>
7m2fl6wd09xntr0ch2lo18n5og3o88y
281366
281365
2026-04-22T06:59:59Z
PeaceSeekers
3334
281366
wikitext
text/x-wiki
<includeonly>{{#switch:{{{1|}}}
|acronym|akronim=Akronim
|adj|adjective|kata sifat|kata adjektif=Kata sifat
|adjective form|bentuk kata sifat=Bentuk kata sifat
|adnominal=Adnominal
|adv|adverb|adverba=Adverba
|adverb form|bentuk adverba=Bentuk adverba
|affix|imbuhan=Imbuhan
|aux=auxiliary verbs
|classifier|penjodoh bilangan=Penjodoh bilangan
|combining form=combining forms
|conjunction|kata hubung=Kata hubung
|counter=counters
|ideophonic root=ideophonic roots
|idiom|peribahasa=Peribahasa
|infix|sisipan=Sisipan
|interjection|kata seru=Kata seru
|iteration mark=iteration marks
|kana=kana
|mora=mora
|noun|kata nama=Kata nama
|noun form|bentuk kata nama=Bentuk kata nama
|numeral|kata bilangan=Kata bilangan
|numeral symbol=numeral symbols
|particle|partikel=Partikel
|phrase|frasa=Frasa
|postposition=postpositions
|prefix|awalan=Awalan
|pronoun|kata ganti nama=Kata ganti nama
|pronoun form|bentuk kata ganti nama=Bentuk kata ganti nama
|proper|proper noun|kata nama khas=Kata nama khas
|proverb=proverbs
|punctuation mark=Tanda baca
|suffix|akhiran=Akhiran
|suffix form|bentuk akhiran=Bentuk akhiran
|syllable|suku kata=Suku kata
|symbol|simbol=Simbol
|verb|kata kerja=Kata kerja
|verb suru=Kata kerja suru
|verb form|bentuk kata kerja=Bentuk kata kerja
|#default = {{error|Invalid part of speech.}}}}</includeonly><noinclude>{{documentation}}</noinclude>
kpwp9i93fva5eldt0kbwe07te81rz24
Zambia
0
37655
281386
150447
2026-04-22T07:26:32Z
PeaceSeekers
3334
281386
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}} <!-- Kalau ada -->
=== Takrifan ===
==== Kata nama khas ====
{{ms-knk|j=زمبيا}}
# {{place|ms|negara|r/selatan Afrika|official=Republik Zambia}}.
=== Sebutan ===
* {{dewan|Zam|bia}}
=== Lihat juga ===
* {{senarai:negara di Afrika/ms}}
=== Pautan luar ===
* {{R:PRPM}}
== Bahasa Indonesia ==
{{Wikipedia|lang=id}} <!-- Kalau ada -->
=== Takrifan ===
==== Kata nama khas ====
{{id-knk}}
# Zambia; negara di selatan Afrika
=== Sebutan ===
* {{penyempangan|id|Zam|bia}}
=== Lihat juga ===
* {{senarai:negara di Afrika/id}}
=== Pautan luar ===
* {{R:KBBI Daring}}
== Bahasa Inggeris ==
{{Wikipedia|lang=en}} <!-- Kalau ada -->
=== Takrifan ===
==== Kata nama khas ====
{{en-knk}}
# Zambia; negara di selatan Afrika
=== Sebutan ===
* {{IPA|en|/ˈzæmbiə/}}
* {{audio|en|LL-Q1860 (eng)-Vealhurl-Zambia.wav|Audio (England Selatan)}}
* {{audio|en|Zambia.wav|Audio (Zambia)}}
=== Lihat juga ===
* {{senarai:negara di Afrika/en}}
s1f36bg2l179bats43flaprnm0vh4ji
hadiah
0
44008
281248
159303
2026-04-21T13:37:48Z
Countryball mys123
9925
/* Bahasa Melayu */Tambah gambar
281248
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}} <!-- Kalau ada -->
[[File:Gift packing.jpg|thumb|Bungkusan hadiah]]
=== Takrifan ===
==== Kata nama ====
{{ms-kn|j=هديه}}
# Suatu [[pemberian]] yang diberi untuk menghargai seseorang atau sesuatu.
=== Etimologi ===
Daripada {{der|ms|ar|هَدِيَّة}}.
=== Sebutan ===
* {{dewan|ha|diah}}
* {{IPA|ms|/ha.di.(j)ah/}}
=== Pautan luar ===
* {{R:PRPM}}
== Bahasa Indonesia ==
{{Wikipedia|lang=id}} <!-- Kalau ada -->
=== Takrifan ===
==== Kata nama ====
{{id-kn}}
# Suatu [[pemberian]] yang diberi untuk menghargai seseorang atau sesuatu.
=== Etimologi ===
Daripada {{der|id|ar|هَدِيَّة}}.
=== Sebutan ===
* {{IPA|id|/haˈdi(j)ah/}}
* {{rhymes|id|jah|ah|h|s=3}}
* {{hyphenation|id|ha|di|ah}}
=== Pautan luar ===
* {{R:KBBI Daring}}
b9fblfedhxc33r1kvxck3t6t0wawoep
خزن
0
47969
281313
165692
2026-04-21T15:53:20Z
Hakimi97
2668
/* Etimologi */
281313
wikitext
text/x-wiki
== Bahasa Arab ==
=== Takrifan ===
==== Kata kerja ====
{{head|ar|kata kerja}}
# [[simpan]], [[kumpul]]
# [[kandung]]
# menyimpan [[rahsia]]
=== Etimologi ===
Daripada akar {{ar-root|خ ز ن}}.
84fjqh8jj4gww5l4jz3c03odqq6gyo3
Modul:headword/page
828
51771
281241
265774
2026-04-21T13:00:15Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88725393|88725393]])
281241
Scribunto
text/plain
local export = {}
local languages_module = "Module:languages"
local maintenance_category_module = "Module:maintenance category"
local pages_module = "Module:pages"
local string_compare_module = "Module:string/compare"
local string_decode_entities_module = "Module:string/decodeEntities"
local string_remove_comments_module = "Module:string/removeComments"
local string_utilities_module = "Module:string utilities"
local table_module = "Module:table"
local template_parser_module = "Module:template parser"
local mw = mw
local string = string
local table = table
local ustring = mw.ustring
local concat = table.concat
local find = string.find
local format = string.format
local gsub = string.gsub
local insert = table.insert
local load_data = mw.loadData
local match = string.match
local new_title = mw.title.new
local pairs = pairs
local require = require
local sub = string.sub
local toNFC = ustring.toNFC
local toNFD = ustring.toNFD
local ugsub = ustring.gsub
local function class_else_type(...)
class_else_type = require(template_parser_module).class_else_type
return class_else_type(...)
end
local function decode_entities(...)
decode_entities = require(string_decode_entities_module)
return decode_entities(...)
end
local function encode_entities(...)
encode_entities = require(string_utilities_module).encode_entities
return encode_entities(...)
end
local function get_category(...)
get_category = require(maintenance_category_module).get_category
return get_category(...)
end
local function get_lang(...)
get_lang = require(languages_module).getByCode
return get_lang(...)
end
local function list_to_set(...)
list_to_set = require(table_module).listToSet
return list_to_set(...)
end
local function parse(...)
parse = require(template_parser_module).parse
return parse(...)
end
local function remove_comments(...)
remove_comments = require(string_remove_comments_module)
return remove_comments(...)
end
local function physical_to_logical_pagename_if_mammoth(...)
physical_to_logical_pagename_if_mammoth = require(pages_module).physical_to_logical_pagename_if_mammoth
return physical_to_logical_pagename_if_mammoth(...)
end
local function split(...)
split = require(string_utilities_module).split
return split(...)
end
local function string_compare(...)
string_compare = require(string_compare_module)
return string_compare(...)
end
local function uupper(...)
uupper = require(string_utilities_module).upper
return uupper(...)
end
--[==[
Loaders for objects, which load data (or some other object) into some variable, which can then be accessed as "foo or get_foo()", where the function get_foo sets the object to "foo" and then returns it. This ensures they are only loaded when needed, and avoids the need to check for the existence of the object each time, since once "foo" has been set, "get_foo" will not be called again.]==]
local langnames
local function get_langnames()
langnames, get_langnames = load_data("Module:languages/canonical names"), nil
return langnames
end
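-- A usage sketch of the object-loader pattern above (illustrative only; `entry` and `name` are hypothetical locals):
--   local entry = (langnames or get_langnames())[name]
-- The first evaluation calls get_langnames(), which fills `langnames` and sets the loader itself to nil. Every later
-- evaluation short-circuits on `langnames`, so the (now nil) loader is never called again.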
-- Combining character data used when categorising unusual characters. These resolve into two patterns, used to find
-- single combining characters (i.e. character + diacritic(s)) or double combining characters (i.e. character +
-- diacritic(s) + character).
-- Charsets are in the format used by Unicode's UnicodeSet tool: https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp.
-- Single combining characters.
-- Charset: [[:M:]&[:^Canonical_Combining_Class=/^Double_/:]&[:^subhead=Grapheme joiner:]&[:^Variation_Selector=Yes:]]
-- Note: concatenating hundreds of lines at once gives an error, so () are used every 150 lines to break it up into chunks.
local comb_chars_single =
("\204\128-\205\142" .. -- U+0300-U+034E
"\205\144-\205\155" .. -- U+0350-U+035B
"\205\163-\205\175" .. -- U+0363-U+036F
"\210\131-\210\137" .. -- U+0483-U+0489
"\214\145-\214\189" .. -- U+0591-U+05BD
"\214\191" .. -- U+05BF
"\215\129" .. -- U+05C1
"\215\130" .. -- U+05C2
"\215\132" .. -- U+05C4
"\215\133" .. -- U+05C5
"\215\135" .. -- U+05C7
"\216\144-\216\154" .. -- U+0610-U+061A
"\217\139-\217\159" .. -- U+064B-U+065F
"\217\176" .. -- U+0670
"\219\150-\219\156" .. -- U+06D6-U+06DC
"\219\159-\219\164" .. -- U+06DF-U+06E4
"\219\167" .. -- U+06E7
"\219\168" .. -- U+06E8
"\219\170-\219\173" .. -- U+06EA-U+06ED
"\220\145" .. -- U+0711
"\220\176-\221\138" .. -- U+0730-U+074A
"\222\166-\222\176" .. -- U+07A6-U+07B0
"\223\171-\223\179" .. -- U+07EB-U+07F3
"\223\189" .. -- U+07FD
"\224\160\150-\224\160\153" .. -- U+0816-U+0819
"\224\160\155-\224\160\163" .. -- U+081B-U+0823
"\224\160\165-\224\160\167" .. -- U+0825-U+0827
"\224\160\169-\224\160\173" .. -- U+0829-U+082D
"\224\161\153-\224\161\155" .. -- U+0859-U+085B
"\224\162\151-\224\162\159" .. -- U+0897-U+089F
"\224\163\138-\224\163\161" .. -- U+08CA-U+08E1
"\224\163\163-\224\164\131" .. -- U+08E3-U+0903
"\224\164\186-\224\164\188" .. -- U+093A-U+093C
"\224\164\190-\224\165\143" .. -- U+093E-U+094F
"\224\165\145-\224\165\151" .. -- U+0951-U+0957
"\224\165\162" .. -- U+0962
"\224\165\163" .. -- U+0963
"\224\166\129-\224\166\131" .. -- U+0981-U+0983
"\224\166\188" .. -- U+09BC
"\224\166\190-\224\167\132" .. -- U+09BE-U+09C4
"\224\167\135" .. -- U+09C7
"\224\167\136" .. -- U+09C8
"\224\167\139-\224\167\141" .. -- U+09CB-U+09CD
"\224\167\151" .. -- U+09D7
"\224\167\162" .. -- U+09E2
"\224\167\163" .. -- U+09E3
"\224\167\190" .. -- U+09FE
"\224\168\129-\224\168\131" .. -- U+0A01-U+0A03
"\224\168\188" .. -- U+0A3C
"\224\168\190-\224\169\130" .. -- U+0A3E-U+0A42
"\224\169\135" .. -- U+0A47
"\224\169\136" .. -- U+0A48
"\224\169\139-\224\169\141" .. -- U+0A4B-U+0A4D
"\224\169\145" .. -- U+0A51
"\224\169\176" .. -- U+0A70
"\224\169\177" .. -- U+0A71
"\224\169\181" .. -- U+0A75
"\224\170\129-\224\170\131" .. -- U+0A81-U+0A83
"\224\170\188" .. -- U+0ABC
"\224\170\190-\224\171\133" .. -- U+0ABE-U+0AC5
"\224\171\135-\224\171\137" .. -- U+0AC7-U+0AC9
"\224\171\139-\224\171\141" .. -- U+0ACB-U+0ACD
"\224\171\162" .. -- U+0AE2
"\224\171\163" .. -- U+0AE3
"\224\171\186-\224\171\191" .. -- U+0AFA-U+0AFF
"\224\172\129-\224\172\131" .. -- U+0B01-U+0B03
"\224\172\188" .. -- U+0B3C
"\224\172\190-\224\173\132" .. -- U+0B3E-U+0B44
"\224\173\135" .. -- U+0B47
"\224\173\136" .. -- U+0B48
"\224\173\139-\224\173\141" .. -- U+0B4B-U+0B4D
"\224\173\149-\224\173\151" .. -- U+0B55-U+0B57
"\224\173\162" .. -- U+0B62
"\224\173\163" .. -- U+0B63
"\224\174\130" .. -- U+0B82
"\224\174\190-\224\175\130" .. -- U+0BBE-U+0BC2
"\224\175\134-\224\175\136" .. -- U+0BC6-U+0BC8
"\224\175\138-\224\175\141" .. -- U+0BCA-U+0BCD
"\224\175\151" .. -- U+0BD7
"\224\176\128-\224\176\132" .. -- U+0C00-U+0C04
"\224\176\188" .. -- U+0C3C
"\224\176\190-\224\177\132" .. -- U+0C3E-U+0C44
"\224\177\134-\224\177\136" .. -- U+0C46-U+0C48
"\224\177\138-\224\177\141" .. -- U+0C4A-U+0C4D
"\224\177\149" .. -- U+0C55
"\224\177\150" .. -- U+0C56
"\224\177\162" .. -- U+0C62
"\224\177\163" .. -- U+0C63
"\224\178\129-\224\178\131" .. -- U+0C81-U+0C83
"\224\178\188" .. -- U+0CBC
"\224\178\190-\224\179\132" .. -- U+0CBE-U+0CC4
"\224\179\134-\224\179\136" .. -- U+0CC6-U+0CC8
"\224\179\138-\224\179\141" .. -- U+0CCA-U+0CCD
"\224\179\149" .. -- U+0CD5
"\224\179\150" .. -- U+0CD6
"\224\179\162" .. -- U+0CE2
"\224\179\163" .. -- U+0CE3
"\224\179\179" .. -- U+0CF3
"\224\180\128-\224\180\131" .. -- U+0D00-U+0D03
"\224\180\187" .. -- U+0D3B
"\224\180\188" .. -- U+0D3C
"\224\180\190-\224\181\132" .. -- U+0D3E-U+0D44
"\224\181\134-\224\181\136" .. -- U+0D46-U+0D48
"\224\181\138-\224\181\141" .. -- U+0D4A-U+0D4D
"\224\181\151" .. -- U+0D57
"\224\181\162" .. -- U+0D62
"\224\181\163" .. -- U+0D63
"\224\182\129-\224\182\131" .. -- U+0D81-U+0D83
"\224\183\138" .. -- U+0DCA
"\224\183\143-\224\183\148" .. -- U+0DCF-U+0DD4
"\224\183\150" .. -- U+0DD6
"\224\183\152-\224\183\159" .. -- U+0DD8-U+0DDF
"\224\183\178" .. -- U+0DF2
"\224\183\179" .. -- U+0DF3
"\224\184\177" .. -- U+0E31
"\224\184\180-\224\184\186" .. -- U+0E34-U+0E3A
"\224\185\135-\224\185\142" .. -- U+0E47-U+0E4E
"\224\186\177" .. -- U+0EB1
"\224\186\180-\224\186\188" .. -- U+0EB4-U+0EBC
"\224\187\136-\224\187\142" .. -- U+0EC8-U+0ECE
"\224\188\152" .. -- U+0F18
"\224\188\153" .. -- U+0F19
"\224\188\181" .. -- U+0F35
"\224\188\183" .. -- U+0F37
"\224\188\185" .. -- U+0F39
"\224\188\190" .. -- U+0F3E
"\224\188\191" .. -- U+0F3F
"\224\189\177-\224\190\132" .. -- U+0F71-U+0F84
"\224\190\134" .. -- U+0F86
"\224\190\135" .. -- U+0F87
"\224\190\141-\224\190\151" .. -- U+0F8D-U+0F97
"\224\190\153-\224\190\188" .. -- U+0F99-U+0FBC
"\224\191\134" .. -- U+0FC6
"\225\128\171-\225\128\190" .. -- U+102B-U+103E
"\225\129\150-\225\129\153" .. -- U+1056-U+1059
"\225\129\158-\225\129\160" .. -- U+105E-U+1060
"\225\129\162-\225\129\164" .. -- U+1062-U+1064
"\225\129\167-\225\129\173" .. -- U+1067-U+106D
"\225\129\177-\225\129\180" .. -- U+1071-U+1074
"\225\130\130-\225\130\141" .. -- U+1082-U+108D
"\225\130\143" .. -- U+108F
"\225\130\154-\225\130\157" .. -- U+109A-U+109D
"\225\141\157-\225\141\159" .. -- U+135D-U+135F
"\225\156\146-\225\156\149" .. -- U+1712-U+1715
"\225\156\178-\225\156\180" .. -- U+1732-U+1734
"\225\157\146" .. -- U+1752
"\225\157\147" .. -- U+1753
"\225\157\178" .. -- U+1772
"\225\157\179" .. -- U+1773
"\225\158\180-\225\159\147") .. -- U+17B4-U+17D3
("\225\159\157" .. -- U+17DD
"\225\162\133" .. -- U+1885
"\225\162\134" .. -- U+1886
"\225\162\169" .. -- U+18A9
"\225\164\160-\225\164\171" .. -- U+1920-U+192B
"\225\164\176-\225\164\187" .. -- U+1930-U+193B
"\225\168\151-\225\168\155" .. -- U+1A17-U+1A1B
"\225\169\149-\225\169\158" .. -- U+1A55-U+1A5E
"\225\169\160-\225\169\188" .. -- U+1A60-U+1A7C
"\225\169\191" .. -- U+1A7F
"\225\170\176-\225\171\142" .. -- U+1AB0-U+1ACE
"\225\172\128-\225\172\132" .. -- U+1B00-U+1B04
"\225\172\180-\225\173\132" .. -- U+1B34-U+1B44
"\225\173\171-\225\173\179" .. -- U+1B6B-U+1B73
"\225\174\128-\225\174\130" .. -- U+1B80-U+1B82
"\225\174\161-\225\174\173" .. -- U+1BA1-U+1BAD
"\225\175\166-\225\175\179" .. -- U+1BE6-U+1BF3
"\225\176\164-\225\176\183" .. -- U+1C24-U+1C37
"\225\179\144-\225\179\146" .. -- U+1CD0-U+1CD2
"\225\179\148-\225\179\168" .. -- U+1CD4-U+1CE8
"\225\179\173" .. -- U+1CED
"\225\179\180" .. -- U+1CF4
"\225\179\183-\225\179\185" .. -- U+1CF7-U+1CF9
"\225\183\128-\225\183\140" .. -- U+1DC0-U+1DCC
"\225\183\142-\225\183\187" .. -- U+1DCE-U+1DFB
"\225\183\189-\225\183\191" .. -- U+1DFD-U+1DFF
"\226\131\144-\226\131\176" .. -- U+20D0-U+20F0
"\226\179\175-\226\179\177" .. -- U+2CEF-U+2CF1
"\226\181\191" .. -- U+2D7F
"\226\183\160-\226\183\191" .. -- U+2DE0-U+2DFF
"\227\128\170-\227\128\175" .. -- U+302A-U+302F
"\227\130\153" .. -- U+3099
"\227\130\154" .. -- U+309A
"\234\153\175-\234\153\178" .. -- U+A66F-U+A672
"\234\153\180-\234\153\189" .. -- U+A674-U+A67D
"\234\154\158" .. -- U+A69E
"\234\154\159" .. -- U+A69F
"\234\155\176" .. -- U+A6F0
"\234\155\177" .. -- U+A6F1
"\234\160\130" .. -- U+A802
"\234\160\134" .. -- U+A806
"\234\160\139" .. -- U+A80B
"\234\160\163-\234\160\167" .. -- U+A823-U+A827
"\234\160\172" .. -- U+A82C
"\234\162\128" .. -- U+A880
"\234\162\129" .. -- U+A881
"\234\162\180-\234\163\133" .. -- U+A8B4-U+A8C5
"\234\163\160-\234\163\177" .. -- U+A8E0-U+A8F1
"\234\163\191" .. -- U+A8FF
"\234\164\166-\234\164\173" .. -- U+A926-U+A92D
"\234\165\135-\234\165\147" .. -- U+A947-U+A953
"\234\166\128-\234\166\131" .. -- U+A980-U+A983
"\234\166\179-\234\167\128" .. -- U+A9B3-U+A9C0
"\234\167\165" .. -- U+A9E5
"\234\168\169-\234\168\182" .. -- U+AA29-U+AA36
"\234\169\131" .. -- U+AA43
"\234\169\140" .. -- U+AA4C
"\234\169\141" .. -- U+AA4D
"\234\169\187-\234\169\189" .. -- U+AA7B-U+AA7D
"\234\170\176" .. -- U+AAB0
"\234\170\178-\234\170\180" .. -- U+AAB2-U+AAB4
"\234\170\183" .. -- U+AAB7
"\234\170\184" .. -- U+AAB8
"\234\170\190" .. -- U+AABE
"\234\170\191" .. -- U+AABF
"\234\171\129" .. -- U+AAC1
"\234\171\171-\234\171\175" .. -- U+AAEB-U+AAEF
"\234\171\181" .. -- U+AAF5
"\234\171\182" .. -- U+AAF6
"\234\175\163-\234\175\170" .. -- U+ABE3-U+ABEA
"\234\175\172" .. -- U+ABEC
"\234\175\173" .. -- U+ABED
"\239\172\158" .. -- U+FB1E
"\239\184\160-\239\184\175" .. -- U+FE20-U+FE2F
"\240\144\135\189" .. -- U+101FD
"\240\144\139\160" .. -- U+102E0
"\240\144\141\182-\240\144\141\186" .. -- U+10376-U+1037A
"\240\144\168\129-\240\144\168\131" .. -- U+10A01-U+10A03
"\240\144\168\133" .. -- U+10A05
"\240\144\168\134" .. -- U+10A06
"\240\144\168\140-\240\144\168\143" .. -- U+10A0C-U+10A0F
"\240\144\168\184-\240\144\168\186" .. -- U+10A38-U+10A3A
"\240\144\168\191" .. -- U+10A3F
"\240\144\171\165" .. -- U+10AE5
"\240\144\171\166" .. -- U+10AE6
"\240\144\180\164-\240\144\180\167" .. -- U+10D24-U+10D27
"\240\144\181\169-\240\144\181\173" .. -- U+10D69-U+10D6D
"\240\144\186\171" .. -- U+10EAB
"\240\144\186\172" .. -- U+10EAC
"\240\144\187\188-\240\144\187\191" .. -- U+10EFC-U+10EFF
"\240\144\189\134-\240\144\189\144" .. -- U+10F46-U+10F50
"\240\144\190\130-\240\144\190\133" .. -- U+10F82-U+10F85
"\240\145\128\128-\240\145\128\130" .. -- U+11000-U+11002
"\240\145\128\184-\240\145\129\134" .. -- U+11038-U+11046
"\240\145\129\176" .. -- U+11070
"\240\145\129\179" .. -- U+11073
"\240\145\129\180" .. -- U+11074
"\240\145\129\191-\240\145\130\130" .. -- U+1107F-U+11082
"\240\145\130\176-\240\145\130\186" .. -- U+110B0-U+110BA
"\240\145\131\130" .. -- U+110C2
"\240\145\132\128-\240\145\132\130" .. -- U+11100-U+11102
"\240\145\132\167-\240\145\132\180" .. -- U+11127-U+11134
"\240\145\133\133" .. -- U+11145
"\240\145\133\134" .. -- U+11146
"\240\145\133\179" .. -- U+11173
"\240\145\134\128-\240\145\134\130" .. -- U+11180-U+11182
"\240\145\134\179-\240\145\135\128" .. -- U+111B3-U+111C0
"\240\145\135\137-\240\145\135\140" .. -- U+111C9-U+111CC
"\240\145\135\142" .. -- U+111CE
"\240\145\135\143" .. -- U+111CF
"\240\145\136\172-\240\145\136\183" .. -- U+1122C-U+11237
"\240\145\136\190" .. -- U+1123E
"\240\145\137\129" .. -- U+11241
"\240\145\139\159-\240\145\139\170" .. -- U+112DF-U+112EA
"\240\145\140\128-\240\145\140\131" .. -- U+11300-U+11303
"\240\145\140\187" .. -- U+1133B
"\240\145\140\188" .. -- U+1133C
"\240\145\140\190-\240\145\141\132" .. -- U+1133E-U+11344
"\240\145\141\135" .. -- U+11347
"\240\145\141\136" .. -- U+11348
"\240\145\141\139-\240\145\141\141" .. -- U+1134B-U+1134D
"\240\145\141\151" .. -- U+11357
"\240\145\141\162" .. -- U+11362
"\240\145\141\163" .. -- U+11363
"\240\145\141\166-\240\145\141\172" .. -- U+11366-U+1136C
"\240\145\141\176-\240\145\141\180" .. -- U+11370-U+11374
"\240\145\142\184-\240\145\143\128" .. -- U+113B8-U+113C0
"\240\145\143\130" .. -- U+113C2
"\240\145\143\133" .. -- U+113C5
"\240\145\143\135-\240\145\143\138" .. -- U+113C7-U+113CA
"\240\145\143\140-\240\145\143\144" .. -- U+113CC-U+113D0
"\240\145\143\146" .. -- U+113D2
"\240\145\143\161" .. -- U+113E1
"\240\145\143\162" .. -- U+113E2
"\240\145\144\181-\240\145\145\134" .. -- U+11435-U+11446
"\240\145\145\158" .. -- U+1145E
"\240\145\146\176-\240\145\147\131" .. -- U+114B0-U+114C3
"\240\145\150\175-\240\145\150\181" .. -- U+115AF-U+115B5
"\240\145\150\184-\240\145\151\128" .. -- U+115B8-U+115C0
"\240\145\151\156" .. -- U+115DC
"\240\145\151\157" .. -- U+115DD
"\240\145\152\176-\240\145\153\128" .. -- U+11630-U+11640
"\240\145\154\171-\240\145\154\183" .. -- U+116AB-U+116B7
"\240\145\156\157-\240\145\156\171" .. -- U+1171D-U+1172B
"\240\145\160\172-\240\145\160\186" .. -- U+1182C-U+1183A
"\240\145\164\176-\240\145\164\181" .. -- U+11930-U+11935
"\240\145\164\183" .. -- U+11937
"\240\145\164\184" .. -- U+11938
"\240\145\164\187-\240\145\164\190" .. -- U+1193B-U+1193E
"\240\145\165\128") .. -- U+11940
("\240\145\165\130" .. -- U+11942
"\240\145\165\131" .. -- U+11943
"\240\145\167\145-\240\145\167\151" .. -- U+119D1-U+119D7
"\240\145\167\154-\240\145\167\160" .. -- U+119DA-U+119E0
"\240\145\167\164" .. -- U+119E4
"\240\145\168\129-\240\145\168\138" .. -- U+11A01-U+11A0A
"\240\145\168\179-\240\145\168\185" .. -- U+11A33-U+11A39
"\240\145\168\187-\240\145\168\190" .. -- U+11A3B-U+11A3E
"\240\145\169\135" .. -- U+11A47
"\240\145\169\145-\240\145\169\155" .. -- U+11A51-U+11A5B
"\240\145\170\138-\240\145\170\153" .. -- U+11A8A-U+11A99
"\240\145\176\175-\240\145\176\182" .. -- U+11C2F-U+11C36
"\240\145\176\184-\240\145\176\191" .. -- U+11C38-U+11C3F
"\240\145\178\146-\240\145\178\167" .. -- U+11C92-U+11CA7
"\240\145\178\169-\240\145\178\182" .. -- U+11CA9-U+11CB6
"\240\145\180\177-\240\145\180\182" .. -- U+11D31-U+11D36
"\240\145\180\186" .. -- U+11D3A
"\240\145\180\188" .. -- U+11D3C
"\240\145\180\189" .. -- U+11D3D
"\240\145\180\191-\240\145\181\133" .. -- U+11D3F-U+11D45
"\240\145\181\135" .. -- U+11D47
"\240\145\182\138-\240\145\182\142" .. -- U+11D8A-U+11D8E
"\240\145\182\144" .. -- U+11D90
"\240\145\182\145" .. -- U+11D91
"\240\145\182\147-\240\145\182\151" .. -- U+11D93-U+11D97
"\240\145\187\179-\240\145\187\182" .. -- U+11EF3-U+11EF6
"\240\145\188\128" .. -- U+11F00
"\240\145\188\129" .. -- U+11F01
"\240\145\188\131" .. -- U+11F03
"\240\145\188\180-\240\145\188\186" .. -- U+11F34-U+11F3A
"\240\145\188\190-\240\145\189\130" .. -- U+11F3E-U+11F42
"\240\145\189\154" .. -- U+11F5A
"\240\147\145\128" .. -- U+13440
"\240\147\145\135-\240\147\145\149" .. -- U+13447-U+13455
"\240\150\132\158-\240\150\132\175" .. -- U+1611E-U+1612F
"\240\150\171\176-\240\150\171\180" .. -- U+16AF0-U+16AF4
"\240\150\172\176-\240\150\172\182" .. -- U+16B30-U+16B36
"\240\150\189\143" .. -- U+16F4F
"\240\150\189\145-\240\150\190\135" .. -- U+16F51-U+16F87
"\240\150\190\143-\240\150\190\146" .. -- U+16F8F-U+16F92
"\240\150\191\164" .. -- U+16FE4
"\240\150\191\176" .. -- U+16FF0
"\240\150\191\177" .. -- U+16FF1
"\240\155\178\157" .. -- U+1BC9D
"\240\155\178\158" .. -- U+1BC9E
"\240\156\188\128-\240\156\188\173" .. -- U+1CF00-U+1CF2D
"\240\156\188\176-\240\156\189\134" .. -- U+1CF30-U+1CF46
"\240\157\133\165-\240\157\133\169" .. -- U+1D165-U+1D169
"\240\157\133\173-\240\157\133\178" .. -- U+1D16D-U+1D172
"\240\157\133\187-\240\157\134\130" .. -- U+1D17B-U+1D182
"\240\157\134\133-\240\157\134\139" .. -- U+1D185-U+1D18B
"\240\157\134\170-\240\157\134\173" .. -- U+1D1AA-U+1D1AD
"\240\157\137\130-\240\157\137\132" .. -- U+1D242-U+1D244
"\240\157\168\128-\240\157\168\182" .. -- U+1DA00-U+1DA36
"\240\157\168\187-\240\157\169\172" .. -- U+1DA3B-U+1DA6C
"\240\157\169\181" .. -- U+1DA75
"\240\157\170\132" .. -- U+1DA84
"\240\157\170\155-\240\157\170\159" .. -- U+1DA9B-U+1DA9F
"\240\157\170\161-\240\157\170\175" .. -- U+1DAA1-U+1DAAF
"\240\158\128\128-\240\158\128\134" .. -- U+1E000-U+1E006
"\240\158\128\136-\240\158\128\152" .. -- U+1E008-U+1E018
"\240\158\128\155-\240\158\128\161" .. -- U+1E01B-U+1E021
"\240\158\128\163" .. -- U+1E023
"\240\158\128\164" .. -- U+1E024
"\240\158\128\166-\240\158\128\170" .. -- U+1E026-U+1E02A
"\240\158\130\143" .. -- U+1E08F
"\240\158\132\176-\240\158\132\182" .. -- U+1E130-U+1E136
"\240\158\138\174" .. -- U+1E2AE
"\240\158\139\172-\240\158\139\175" .. -- U+1E2EC-U+1E2EF
"\240\158\147\172-\240\158\147\175" .. -- U+1E4EC-U+1E4EF
"\240\158\151\174" .. -- U+1E5EE
"\240\158\151\175" .. -- U+1E5EF
"\240\158\163\144-\240\158\163\150" .. -- U+1E8D0-U+1E8D6
"\240\158\165\132-\240\158\165\138") -- U+1E944-U+1E94A
-- Double combining characters.
-- Charset: [[:M:]&[:Canonical_Combining_Class=/^Double_/:]&[:^subhead=Grapheme joiner:]&[:^Variation_Selector=Yes:]]
local comb_chars_double =
"\205\156-\205\162" .. -- U+035C-U+0362
"\225\183\141" .. -- U+1DCD
"\225\183\188" -- U+1DFC
-- Variation selectors etc.; separated out so that we don't get categories for them.
-- Charset: [[:M:]&[[:subhead=Grapheme joiner:][:Variation_Selector=Yes:]]].
local comb_chars_other =
"\205\143" .. -- U+034F
"\225\160\139-\225\160\141" .. -- U+180B-U+180D
"\225\160\143" .. -- U+180F
"\239\184\128-\239\184\143" .. -- U+FE00-U+FE0F
"\243\160\132\128-\243\160\135\175" -- U+E0100-U+E01EF
local comb_chars_all = comb_chars_single .. comb_chars_double .. comb_chars_other
local comb_chars = {
combined_single = "[^" .. comb_chars_all .. "][" .. comb_chars_single .. comb_chars_other .. "]+%f[^" .. comb_chars_all .. "]",
combined_double = "[^" .. comb_chars_all .. "][" .. comb_chars_single .. comb_chars_other .. "]*[" .. comb_chars_double .. "]+[" .. comb_chars_all .. "]*.[" .. comb_chars_single .. comb_chars_other .. "]*",
diacritics_single = "[" .. comb_chars_single .. "]",
diacritics_double = "[" .. comb_chars_double .. "]",
diacritics_all = "[" .. comb_chars_all .. "]"
}
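-- A usage sketch for the patterns above (illustrative only; the sample text is an assumption, not data from this
-- module). The sample is "e" followed by U+0301 (combining acute accent), which lies in the U+0300-U+034E range
-- listed under the single combining characters:
--   local sample = "e\204\129"
--   ustring.find(sample, comb_chars.diacritics_single) -- finds the accent on its own
--   ustring.find(sample, comb_chars.combined_single)   -- finds the base letter together with its accent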
-- Somewhat curated list from https://unicode.org/Public/emoji/16.0/emoji-sequences.txt.
-- NOTE: There are lots more emoji sequences involving non-emoji Plane 0 symbols followed by 0xFE0F, which we don't
-- (yet?) handle.
local emoji_chars =
"\226\140\154" .. -- U+231A (⌚)
"\226\140\155" .. -- U+231B (⌛)
"\226\140\168" .. -- U+2328 (⌨)
"\226\143\143" .. -- U+23CF (⏏)
"\226\143\169-\226\143\179" .. -- U+23E9-U+23F3 (⏩-⏳)
"\226\143\184-\226\143\186" .. -- U+23F8-U+23FA (⏸-⏺)
"\226\150\170" .. -- U+25AA (▪)
"\226\150\171" .. -- U+25AB (▫)
"\226\150\182" .. -- U+25B6 (▶)
"\226\151\128" .. -- U+25C0 (◀)
"\226\151\187-\226\151\190" .. -- U+25FB-U+25FE (◻-◾)
"\226\152\128-\226\152\132" .. -- U+2600-U+2604 (☀-☄)
"\226\152\142" .. -- U+260E (☎)
"\226\152\145" .. -- U+2611 (☑)
"\226\152\148" .. -- U+2614 (☔)
"\226\152\149" .. -- U+2615 (☕)
"\226\152\152" .. -- U+2618 (☘)
"\226\152\157" .. -- U+261D (☝)
"\226\152\160" .. -- U+2620 (☠)
"\226\152\162" .. -- U+2622 (☢)
"\226\152\163" .. -- U+2623 (☣)
"\226\152\166" .. -- U+2626 (☦)
"\226\152\170" .. -- U+262A (☪)
"\226\152\174" .. -- U+262E (☮)
"\226\152\175" .. -- U+262F (☯)
"\226\152\184-\226\152\186" .. -- U+2638-U+263A (☸-☺)
"\226\153\136-\226\153\147" .. -- U+2648-U+2653 (♈-♓)
"\226\153\159" .. -- U+265F (♟)
"\226\153\160" .. -- U+2660 (♠)
"\226\153\163" .. -- U+2663 (♣)
"\226\153\165" .. -- U+2665 (♥)
"\226\153\166" .. -- U+2666 (♦)
"\226\153\168" .. -- U+2668 (♨)
"\226\153\187" .. -- U+267B (♻)
"\226\153\190" .. -- U+267E (♾)
"\226\153\191" .. -- U+267F (♿)
"\226\154\146-\226\154\151" .. -- U+2692-U+2697 (⚒-⚗)
"\226\154\153" .. -- U+2699 (⚙)
"\226\154\155" .. -- U+269B (⚛)
"\226\154\156" .. -- U+269C (⚜)
"\226\154\160" .. -- U+26A0 (⚠)
"\226\154\161" .. -- U+26A1 (⚡)
"\226\154\170" .. -- U+26AA (⚪)
"\226\154\171" .. -- U+26AB (⚫)
"\226\154\176" .. -- U+26B0 (⚰)
"\226\154\177" .. -- U+26B1 (⚱)
"\226\154\189" .. -- U+26BD (⚽)
"\226\154\190" .. -- U+26BE (⚾)
"\226\155\132" .. -- U+26C4 (⛄)
"\226\155\133" .. -- U+26C5 (⛅)
"\226\155\136" .. -- U+26C8 (⛈)
"\226\155\142" .. -- U+26CE (⛎)
"\226\155\143" .. -- U+26CF (⛏)
"\226\155\145" .. -- U+26D1 (⛑)
"\226\155\147" .. -- U+26D3 (⛓)
"\226\155\148" .. -- U+26D4 (⛔)
"\226\155\169" .. -- U+26E9 (⛩)
"\226\155\170" .. -- U+26EA (⛪)
"\226\155\176-\226\155\181" .. -- U+26F0-U+26F5 (⛰-⛵)
"\226\155\183-\226\155\186" .. -- U+26F7-U+26FA (⛷-⛺)
"\226\155\189" .. -- U+26FD (⛽)
"\226\156\130" .. -- U+2702 (✂)
"\226\156\133" .. -- U+2705 (✅)
"\226\156\136-\226\156\141" .. -- U+2708-U+270D (✈-✍)
"\226\156\143" .. -- U+270F (✏)
"\226\156\146" .. -- U+2712 (✒)
"\226\156\148" .. -- U+2714 (✔)
"\226\156\150" .. -- U+2716 (✖)
"\226\156\157" .. -- U+271D (✝)
"\226\156\161" .. -- U+2721 (✡)
"\226\156\168" .. -- U+2728 (✨)
"\226\156\179" .. -- U+2733 (✳)
"\226\156\180" .. -- U+2734 (✴)
"\226\157\132" .. -- U+2744 (❄)
"\226\157\135" .. -- U+2747 (❇)
"\226\157\140" .. -- U+274C (❌)
"\226\157\142" .. -- U+274E (❎)
"\226\157\147-\226\157\149" .. -- U+2753-U+2755 (❓-❕)
"\226\157\151" .. -- U+2757 (❗)
"\226\157\163" .. -- U+2763 (❣)
"\226\157\164" .. -- U+2764 (❤)
"\226\158\149-\226\158\151" .. -- U+2795-U+2797 (➕-➗)
"\226\158\161" .. -- U+27A1 (➡)
"\226\158\176" .. -- U+27B0 (➰)
"\226\158\191" .. -- U+27BF (➿)
"\226\164\180" .. -- U+2934 (⤴)
"\226\164\181" .. -- U+2935 (⤵)
"\226\172\133-\226\172\135" .. -- U+2B05-U+2B07 (⬅-⬇)
"\226\172\155" .. -- U+2B1B (⬛)
"\226\172\156" .. -- U+2B1C (⬜)
"\226\173\144" .. -- U+2B50 (⭐)
"\226\173\149" .. -- U+2B55 (⭕)
"\227\128\176" .. -- U+3030 (〰)
"\227\128\189" .. -- U+303D (〽)
"\227\138\151" .. -- U+3297 (㊗)
"\227\138\153" .. -- U+3299 (㊙)
"\240\159\128\132" .. -- U+1F004 (🀄)
"\240\159\131\143" .. -- U+1F0CF (🃏)
"\240\159\133\176" .. -- U+1F170 (🅰)
"\240\159\133\177" .. -- U+1F171 (🅱)
"\240\159\133\190" .. -- U+1F17E (🅾)
"\240\159\133\191" .. -- U+1F17F (🅿)
"\240\159\134\142" .. -- U+1F18E (🆎)
"\240\159\134\145-\240\159\134\154" .. -- U+1F191-U+1F19A (🆑-🆚)
"\240\159\136\129" .. -- U+1F201 (🈁)
"\240\159\136\130" .. -- U+1F202 (🈂)
"\240\159\136\154" .. -- U+1F21A (🈚)
"\240\159\136\175" .. -- U+1F22F (🈯)
"\240\159\136\178-\240\159\136\186" .. -- U+1F232-U+1F23A (🈲-🈺)
"\240\159\137\144" .. -- U+1F250 (🉐)
"\240\159\137\145" .. -- U+1F251 (🉑)
"\240\159\140\128-\240\159\153\143" .. -- U+1F300-U+1F64F (🌀-🙏)
"\240\159\154\128-\240\159\155\151" .. -- U+1F680-U+1F6D7 (🚀-🛗)
"\240\159\155\156-\240\159\155\172" .. -- U+1F6DC-U+1F6EC (🛜-🛬)
"\240\159\155\176-\240\159\155\188" .. -- U+1F6F0-U+1F6FC (🛰-🛼)
"\240\159\159\160-\240\159\159\171" .. -- U+1F7E0-U+1F7EB (🟠-🟫)
"\240\159\159\176" .. -- U+1F7F0 (🟰)
"\240\159\164\140-\240\159\169\147" .. -- U+1F90C-U+1FA53 (🤌-🩓)
"\240\159\169\160-\240\159\169\173" .. -- U+1FA60-U+1FA6D (🩠-🩭)
"\240\159\169\176-\240\159\169\188" .. -- U+1FA70-U+1FA7C (🩰-🩼)
"\240\159\170\128-\240\159\170\137" .. -- U+1FA80-U+1FA89 (🪀-)
"\240\159\170\143-\240\159\171\134" .. -- U+1FA8F-U+1FAC6 (-)
"\240\159\171\142-\240\159\171\156" .. -- U+1FACE-U+1FADC (🫎-)
"\240\159\171\159-\240\159\171\169" .. -- U+1FADF-U+1FAE9 (-)
"\240\159\171\176-\240\159\171\184" -- U+1FAF0-U+1FAF8 (🫰-🫸)
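-- Load the unsupported-character escapes and invert them (so the keys are escape sequences such as "`grave`" and the
-- values are the characters they stand for).
local unsupported_characters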
local function get_unsupported_characters()
unsupported_characters, get_unsupported_characters = {}, nil
for k, v in pairs(load_data("Module:links/data").unsupported_characters) do
unsupported_characters[v] = k
end
return unsupported_characters
end
-- Load the list of unsupported titles and invert it (so the keys are pagenames and the values are canonical titles).
local unsupported_titles
local function get_unsupported_titles()
unsupported_titles, get_unsupported_titles = {}, nil
for k, v in pairs(load_data("Module:links/data").unsupported_titles) do
unsupported_titles[v] = k
end
return unsupported_titles
end
-- To save on memory, we only cache names with either non-ASCII characters in them or ASCII characters to be removed or
-- transformed (apostrophe, double quote, hyphen).
local L2_sort_key_cache = {}
function export.get_L2_sort_key(L2)
if L2 == "Rentas bahasa" then
return "\1"
elseif L2 == "Bahasa Melayu" then
return "\2"
elseif match(L2, "^[%z\1-\b\14-!#-&(-,.-\127]+$") then
return L2
end
local sort_key = L2_sort_key_cache[L2]
if sort_key then
return sort_key
end
sort_key = toNFC(ugsub(ugsub(toNFD(L2), "[" .. comb_chars_all .. "'\"ʻʼ]+", ""), "[%s%-]+", " "))
L2_sort_key_cache[L2] = sort_key
return sort_key
end
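-- Illustrative results, traced from the branches above (the last two heading names are sample inputs chosen for
-- illustration, not names taken from the language data):
--   export.get_L2_sort_key("Rentas bahasa")            --> "\1", so it sorts first
--   export.get_L2_sort_key("Bahasa Melayu")            --> "\2", so it sorts second
--   export.get_L2_sort_key("Bahasa Inggeris")          --> "Bahasa Inggeris" (plain ASCII, returned unchanged, not cached)
--   export.get_L2_sort_key("Bahasa Proto-Indo-Eropah") --> "Bahasa Proto Indo Eropah" (hyphens become spaces; cached)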
--[==[
Given a pagename (or {nil} for the current page), create and return a data structure describing the page. The returned
object includes the following fields:
* `comb_chars`: A table containing various Lua patterns for different types of combined characters
(those that decompose into multiple characters in the NFD decomposition). The patterns are meant to be used with
{mw.ustring.find()}. The keys are:
** `diacritics_single`: Character class (with surrounding brackets) matching single combining characters, i.e.
diacritics that attach to one character;
** `diacritics_double`: Character class (with surrounding brackets) matching double combining characters, i.e.
diacritics that span two characters;
** `diacritics_all`: Character class (with surrounding brackets) matching any combining character, including variation
selectors;
** `combined_single`: Lua pattern for matching a spacing character followed by one or more single combining characters;
** `combined_double`: Lua pattern for matching a combination of two spacing characters separated by one or more double
combining characters, possibly also with single combining characters;
* `emoji_pattern`: A Lua character class pattern (including surrounding brackets) that matches emojis. Meant to be used
with {mw.ustring.find()}.
* `L2_list`: Ordered list of L2 headings on the page, with the extra key `n` that gives the length of the list.
* `L2_sections`: Lookup table of L2 headings on the page, where the key is the section number assigned by the preprocessor, and the value is the L2 heading name. Once an invocation has got its actual section number from get_current_L2 in [[Module:pages]], it can use this table to determine its parent L2. TODO: We could expand this to include subsections, to check POS headings are correct etc.
* `unsupported_titles`: Map from pagenames to canonical titles for unsupported-title pages.
* `namespace`: Namespace of the pagename.
* `ns`: Namespace table for the page from mw.site.namespaces (TODO: merge with `namespace` above).
* `full_raw_pagename`: Full version of the '''RAW''' pagename (i.e. unsupported-title pages aren't canonicalized);
including the namespace and the base (portion before the slash).
* `pagename`: Canonicalized subpage portion of the pagename (unsupported-title pages are canonicalized).
* `pagename_with_base`: Same as `pagename` in the main namespace; otherwise, the whole pagename without the namespace.
* `decompose_pagename`: Equivalent of `pagename` in NFD decomposition.
* `pagename_len`: Length of `pagename` in Unicode chars, where combinations of spacing character + decomposed diacritic
are treated as single characters.
* `explode_pagename`: Set of characters found in `pagename`. The keys are characters (where combinations of spacing
character + decomposed diacritic are treated as single characters).
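* `encoded_pagename`: `pagename` after being passed through `encode_entities`.
* `pagename_defaultsort`: Sort key generated from `encoded_pagename` by the `makeSortKey` method of the `mul` language
object; this is the value passed to DEFAULTSORT unless `no_fetch_content` is set.
* `raw_defaultsort`: Uppercased text of the raw title (i.e. the raw pagename without the namespace).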
* `wikitext_topic_cat`: FIXME: Document me.
* `wikitext_langname_cat`: FIXME: Document me.
`no_fetch_content` says to not fetch and parse the content or set a DEFAULTSORT sort key, in order to save time on
test and documentation pages that have lots of template invocations that set `|pagename=`. It turns out nearly all the
time of this function is contained in the line `frame:callParserFunction("DEFAULTSORT", data.pagename_defaultsort)`,
so we skip it on test and documentation pages where it accomplishes nothing in any case.
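A minimal calling sketch, assuming the module has been loaded into a hypothetical local `page_module` (that name and
the sample pagename are illustrative only):
local data = page_module.process_page("bina", true) -- skip the content fetch and the DEFAULTSORT call
local n = data.pagename_len -- length of the pagename in characters
local has_b = data.explode_pagename["b"] -- truthy if "b" occurs in the pagename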
]==]
function export.process_page(pagename, no_fetch_content)
local data = {
comb_chars = comb_chars,
emoji_pattern = "[" .. emoji_chars .. "]",
unsupported_titles = unsupported_titles or get_unsupported_titles()
}
local cats = {}
data.cats = cats
-- We cannot store `raw_title` in `data` because it contains a metatable.
local raw_title
local function bad_pagename()
if not pagename then
error("Internal error: `data.pagename` was not specified, but the current title contains illegal characters")
else
error(format("Bad value for `data.pagename`: '%s', which must not contain illegal characters", pagename))
end
end
if pagename then -- for testing, doc pages, etc.
raw_title = new_title(pagename)
if not raw_title then
bad_pagename()
end
else
raw_title = mw.title.getCurrentTitle()
end
local nsText = raw_title.nsText
local namespace_is_reconstruction = nsText == "Rekonstruksi"
data.namespace = nsText
data.ns = mw.site.namespaces[raw_title.namespace]
local full_raw_pagename = raw_title.fullText
data.full_raw_pagename = full_raw_pagename
local frame = mw.getCurrentFrame()
-- WARNING: `content` may be nil, e.g. if we're substing a template like {{ja-new}} on a not-yet-created page
-- or if the module specifies the subpage as `data.pagename` (which many modules do) and we're in an Appendix
-- or other non-mainspace page. We used to make the latter an error but there are too many modules that do it,
-- and substing on a nonexistent page is totally legit, and we don't actually need to be able to access the
-- content of the page.
local content = not no_fetch_content and raw_title:getContent() or nil
-- Get the pagename.
pagename = physical_to_logical_pagename_if_mammoth(raw_title)
pagename = gsub(pagename, "^Unsupported titles/(.+)", function(m)
insert(cats, "Tajuk tidak disokong")
local title = (unsupported_titles or get_unsupported_titles())[m]
if title then
return title
end
-- Substitute pairs of "`". Those not used for escaping should be escaped as "`grave`", but might not be,
-- so if a pair don't form a match, the closing "`" should become the opening "`" of the next match attempt.
-- This has to be done manually, instead of using gsub.
local open_pos = find(m, "`")
if not open_pos then
return m
end
title = {sub(m, 1, open_pos - 1)}
while true do
local close_pos = find(m, "`", open_pos + 1)
if not close_pos then
-- Add "`" plus any remaining characters.
insert(title, sub(m, open_pos))
break
end
local escape = sub(m, open_pos, close_pos)
local ch = (unsupported_characters or get_unsupported_characters())[escape]
-- Match found, so substitute the character and move to the first "`" after the match if found, or
-- otherwise return.
if ch then
insert(title, ch)
local nxt_pos = close_pos + 1
open_pos = find(m, "`", nxt_pos)
-- Add any characters between the match and the next "`" or end.
if open_pos then
insert(title, sub(m, nxt_pos, open_pos - 1))
else
insert(title, sub(m, nxt_pos))
break
end
-- Match not found, so make the closing "`" the opening "`" of the next attempt.
else
-- Add the failed match, except for the closing "`".
insert(title, sub(m, open_pos, close_pos - 1))
open_pos = close_pos
end
end
return concat(title)
end)
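-- Worked examples of the escape substitution above. They assume that the inverted `unsupported_characters` table maps
-- "`num`" to "#" (an assumption about the contents of Module:links/data) and that neither title has an entry in
-- `unsupported_titles`:
--   "Unsupported titles/C`num`"   --> "C#"   ("`num`" is a known escape, so it is substituted)
--   "Unsupported titles/a`b`num`" --> "a`b#" ("`b`" is not an escape, so its closing "`" is reused as the opening
--                                             "`" of the next attempt, which matches "`num`")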
-- Save pagename, as the local variable will be destructively modified.
data.pagename = pagename
if nsText == "" then
data.pagename_with_base = pagename
else
data.pagename_with_base = raw_title.text
end
-- Decompose the pagename in Unicode normalization form D.
data.decompose_pagename = toNFD(pagename)
-- Explode the current page name into a character table, taking decomposed combining characters into account.
local explode_pagename = {}
local pagename_len = 0
local function explode(char)
explode_pagename[char] = true
pagename_len = pagename_len + 1
return ""
end
pagename = ugsub(pagename, comb_chars.combined_double, explode)
pagename = gsub(ugsub(pagename, comb_chars.combined_single, explode), ".[\128-\191]*", explode)
data.explode_pagename = explode_pagename
data.pagename_len = pagename_len
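-- Worked example, traced from the code above (the sample pagenames are illustrative): for "bina", `explode_pagename`
-- has the four keys "b", "i", "n" and "a", and `pagename_len` is 4. For "ada", `explode_pagename` has only the two
-- keys "a" and "d", but `pagename_len` is still 3, because every character occurrence is counted.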
-- Generate DEFAULTSORT.
data.encoded_pagename = encode_entities(data.pagename)
data.pagename_defaultsort = get_lang("mul"):makeSortKey(data.encoded_pagename)
if not no_fetch_content then
frame:callParserFunction("DEFAULTSORT", data.pagename_defaultsort)
end
data.raw_defaultsort = uupper(raw_title.text)
-- Make `L2_list` and `L2_sections`, note raw wikitext use of {{DEFAULTSORT:}} and {{DISPLAYTITLE:}}, then add categories if any unwanted L1 headings are found, the L2 headings are in the wrong order, or they don't match a canonical language name.
-- Note: HTML comments shouldn't be removed from `content` until after this step, as they can affect the result.
do
local L2_list, L2_list_len, L2_sections = {}, 0, {}
local prev, rc
local new_cats, L2_wrong_order = {}
local function handle_heading(heading)
local level = heading.level
if level > 2 then
return
end
local name = heading:get_name()
-- heading:get_name() will return nil if there are any newline characters in the preprocessed heading name (e.g. from an expanded template). In such cases, the preprocessor section count still increments (since it's calculated pre-expansion), but the heading will fail, so the L2 count shouldn't be incremented.
if name == nil then
return
end
L2_list_len = L2_list_len + 1
L2_list[L2_list_len] = name
L2_sections[heading.section] = name
-- Also add any L1s, since they terminate the preceding L2, but add a maintenance category since it's probably a mistake.
if level == 1 then
new_cats["Laman dengan pengepala L1 tidak dikehendaki"] = true
end
-- Check the heading is in the right order.
-- FIXME: we need a more sophisticated sorting method which handles non-diacritic special characters (e.g. Magɨ).
if prev and not (
L2_wrong_order or
string_compare(export.get_L2_sort_key(prev), export.get_L2_sort_key(name))
) then
new_cats["Laman dengan pengepala bahasa dalam susunan salah"] = true
L2_wrong_order = true
end
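-- Check it's a canonical language name. L2 headings on this wiki take the form "Bahasa X", so the prefix is
-- stripped before the lookup; this assumes `langnames` is keyed by the bare name (e.g. "Melayu").
if not (langnames or get_langnames())[(gsub(name, "^Bahasa ", ""))] then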
new_cats["Laman dengan pengepala bahasa tidak piawai"] = true
end
prev = name
end
local function handle_template(template)
-- Turn off redirect checking except in the Reconstruction namespace because the rc flag is only
-- used in the Reconstruction namespace and the other names are parser functions, which AFAIK can't
-- be redirected to.
local name = template:get_name(nil, not namespace_is_reconstruction and "no_redirect" or nil)
if name == "DEFAULTSORT:" then
new_cats["Laman dengan percanggahan DEFAULTSORT"] = true
elseif name == "DISPLAYTITLE:" then
new_cats["Laman dengan percanggahan DISPLAYTITLE"] = true
elseif name == "reconstructed" then
rc = true
end
end
if content then
for node in parse(content):iterate_nodes() do
local node_class = class_else_type(node)
if node_class == "heading" then
handle_heading(node)
elseif node_class == "template" then
handle_template(node)
elseif node_class == "parameter" then
new_cats["Laman dengan parameter templat bertanda kurung dakap ganda tiga"] = true
end
end
end
L2_list.n = L2_list_len
data.L2_list = L2_list
data.L2_sections = L2_sections
insert(cats, get_category("Laman dengan entri"))
insert(cats, get_category(format("Laman dengan %s entri", L2_list_len)))
for cat in pairs(new_cats) do
insert(cats, get_category(cat))
end
if namespace_is_reconstruction and not rc then
local langname = match(full_raw_pagename, "^Rekonstruksi:([^/]+)/.")
if langname then
insert(cats, get_category("Entri bahasa " .. langname .. " kehilangan Templat:reconstructed"))
end
end
end
------ 4. Parse page for maintenance categories. ------
-- Use of tab characters.
if content and find(content, "\t", 1, true) then
insert(cats, get_category("Laman dengan aksara tab"))
end
-- Unencoded character(s) in title.
local IDS = list_to_set{"⿰", "⿱", "⿲", "⿳", "⿴", "⿵", "⿶", "⿷", "⿸", "⿹", "⿺", "⿻", "⿼", "⿽", "⿾", "⿿", "㇯"}
for char in pairs(explode_pagename) do
if IDS[char] and char ~= data.pagename then
insert(cats, "Perkataan mengandungi aksara tidak dikodkan")
break
end
end
-- Raw wikitext use of a topic or langname category. Also check if any raw sortkeys have been used.
do
local wikitext_topic_cat = {}
local wikitext_langname_cat = {}
local raw_sortkey
-- If a raw sortkey has been found, add it to the relevant table.
-- If there's no table (or the index is just `true`), create one first.
local function add_cat_table(t, lang, sortkey)
local t_lang = t[lang]
if not sortkey then
if not t_lang then
t[lang] = true
end
return
elseif t_lang == true or not t_lang then
t_lang = {}
t[lang] = t_lang
end
t_lang[uupper(decode_entities(sortkey))] = true
end
local function process_category(content, cat, colon, nxt)
local pipe = find(cat, "|", colon + 1, true)
-- Categories cannot end "|]]".
if pipe == #cat then
return
end
local title = new_title(pipe and sub(cat, 1, pipe - 1) or cat)
if not (title and title.namespace == 14) then
return
end
-- Get the sortkey (if any), then canonicalize category title.
local sortkey = pipe and sub(cat, pipe + 1) or nil
cat = title.text
if sortkey then
raw_sortkey = true
-- If the sortkey contains "[", the first "]" of a final "]]]" is treated as part of the sortkey.
if find(sortkey, "[", 1, true) and sub(content, nxt, nxt) == "]" then
sortkey = sortkey .. "]"
end
end
local code = match(cat, "^([%w%-.]+):")
if code then
add_cat_table(wikitext_topic_cat, code, sortkey)
return
end
-- Split by word.
cat = split(cat, " ", true, true)
-- Formerly we looked for the language name anywhere in the category. This is simply wrong
-- because there are no categories like 'Alsatian French lemmas' (only L2 languages
-- have langname categories), but doing it this way wrongly catches things like [[Category:Shapsug Adyghe]]
-- in [[Category:Adyghe entries with language name categories using raw markup]].
local n = #cat - 1
if n <= 0 then
return
end
-- Go from longest to shortest and stop once we've found a language name. Going from shortest
-- to longest or not stopping after a match risks falsely matching (e.g.) German Low German
-- categories as German.
repeat
local name = concat(cat, " ", 1, n)
if (langnames or get_langnames())[name] then
add_cat_table(wikitext_langname_cat, name, sortkey)
return
end
n = n - 1
until n == 0
end
if content then
-- Remove comments, then iterate over category links.
content = remove_comments(content, "BOTH")
local head = find(content, "[[", 1, true)
while head do
local close = find(content, "]]", head + 2, true)
if not close then
break
end
-- Make sure there are no intervening "[[" between head and close.
local open = find(content, "[[", head + 2, true)
while open and open < close do
head = open
open = find(content, "[[", head + 2, true)
end
local cat = sub(content, head + 2, close - 1)
-- Locate the colon, and weed out most unwanted links. "[ _\128-\244]*" catches valid whitespace, and ensures any
-- category links using the colon trick are ignored. We match all non-ASCII characters, as there could be multibyte
-- spaces, and mw.title.new will filter out any remaining false-positives; this is a lot faster than running
-- mw.title.new on every link. Both the local "Kategori" and the canonical "Category" prefixes are accepted.
local colon = match(cat, "^[ _\128-\244]*[CcKk][Aa][Tt][EeGgOoRrIiYy _\128-\244]*():")
if colon then
process_category(content, cat, colon, close + 2)
end
head = open
end
end
data.wikitext_topic_cat = wikitext_topic_cat
data.wikitext_langname_cat = wikitext_langname_cat
if raw_sortkey then
insert(cats, get_category("Laman dengan kunci isih mentah"))
end
end
return data
end
return export
enca5wnu1j5d8aw625qrp26c74f6yrf
Modul:category tree/lang/jpx
828
55921
281367
256038
2026-04-22T07:02:54Z
PeaceSeekers
3334
281367
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_str_utils = require("Module:string utilities")
local concat = table.concat
local full_link = require("Module:links").full_link
local insert = table.insert
local Hani_sort = require("Module:Hani-sortkey").makeSortKey
local match = m_str_utils.match
local sort = table.sort
local tag_text = require("Module:script_utilities").tag_text
local ucfirst = m_str_utils.ucfirst
local Hira = require("Module:scripts").getByCode("Hira")
local Jpan = require("Module:scripts").getByCode("Jpan")
local kana_to_romaji = require("Module:Hrkt-translit").tr
local m_numeric = require("Module:ConvertNumeric")
local kana_capture = "([-" .. require("Module:ja/data/range").kana .. "・]+)"
local yomi_data = require("Module:kanjitab/data")
labels["adnominals"] = {
description = "{{{langname}}} adnominals, or {{ja-r|連%体%詞|れん%たい%し}}, which modify nouns, and do not conjugate or [[predicate#Verb|predicate]].",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Hiragana"] = {
description = "{{{langname}}} terms with hiragana {{mdash}} {{ja-r|平%仮%名|ひら%が%な}} {{mdash}} forms, sorted by conventional hiragana sequence. The hiragana form is a [[phonetic]] representation of that word. " ..
"Wiktionary represents {{{langname}}}-language segments in three ways: in normal form (with [[kanji]], if appropriate), in [[hiragana]] " ..
"form (this differs from kanji form only when the segment contains kanji), and in [[romaji]] form.",
additional = "''Lihat juga'' [[:Kategori:Katakana bahasa {{{langname}}}]]",
toc_template = "categoryTOC-hiragana",
parents = {
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Hiragana",
}
}
labels["historical hiragana"] = {
description = "{{{langname}}} historical [[hiragana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical katakana]].",
toc_template = "categoryTOC-hiragana",
parents = {
"Hiragana",
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Hiragana",
}
}
labels["Katakana"] = {
description = "{{{langname}}} terms with katakana {{mdash}} {{ja-r|片%仮%名|かた%か%な}} {{mdash}} forms, sorted by conventional katakana sequence. Katakana is used primarily for transliterations of foreign words, including old Chinese hanzi not used in [[shinjitai]].",
additional = "''Lihat juga'' [[:Kategori:Hiragana bahasa {{{langname}}}]]",
toc_template = "categoryTOC-katakana",
parents = {
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Katakana",
}
}
labels["historical katakana"] = {
description = "{{{langname}}} historical [[katakana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical hiragana]].",
toc_template = "categoryTOC-katakana",
parents = {
"Katakana",
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Katakana",
}
}
labels["Perkataan dieja dengan kana campuran"] = {
description = "{{{langname}}} terms which combine [[hiragana]] and [[katakana]] characters, potentially with [[kanji]] too.",
parents = {
{name = "{{{langcat}}}", raw = true},
"Hiragana",
"Katakana",
},
}
labels["Kanji"] = {
topright = "{{wp|Kanji}}",
description = "Simbol bahasa {{{langname}}} yang merupakan sebahagian daripada tulisan logogram Han, yang boleh mewakili bunyi atau menyampaikan makna secara langsung.",
toc_template = "Hani-categoryTOC",
umbrella = "Aksara Han",
parents = "Logogram",
}
labels["Kanji mengikut bacaan"] = {
description = "Kanji bahasa {{{langname}}} yang dikategorikan mengikut bacaan.",
parents = {{name = "Kanji", sort = "bacaan"}},
}
labels["Makurakotoba"] = {
topright = "{{wp|Makurakotoba}}",
description = "{{{langname}}} idioms used in poetry to introduce specific words.",
parents = {"peribahasa"},
}
labels["Perkataan mengikut bacaan kanji"] = {
description = "Kategori bahasa {{{langname}}} yang dikumpulkan berdasarkan bacaan kanji yang dieja dengannya.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan mengikut pola bacaan"] = {
description = "Kategori bahasa {{{langname}}} dengan perkataan yang dikumpulkan berdasarkan corak bacaannya.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan mengikut bilangan aksara kanji"] = {
description = "Perkataan bahasa {{{langname}}} dikategorikan mengikut bilangan aksara kanji.",
parents = {"Perkataan mengikut sifat ortografi"},
}
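-- Return a sorted list of links to the categories for each specific on'yomi subtype in `yomi_data`, skipping the
-- generic "on'yomi" type and `cat_yomi_type` itself. `seen` ensures a data table reachable under several keys is
-- only listed once.
local function handle_onyomi_list(category, category_type, cat_yomi_type)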
local onyomi, seen = {}, {}
for _, yomi in pairs(yomi_data) do
if not seen[yomi] and yomi.onyomi then
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
if yomi_type ~= "on'yomi" and yomi_type ~= cat_yomi_type then
insert(onyomi, "[[:Kategori:" .. category:gsub("{{{yomi_catname}}}", yomi_catname) .. " bahasa {{{langname}}}]]")
end
end
end
seen[yomi] = true
end
sort(onyomi)
return onyomi
end
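-- Generate one label per yomi type in `yomi_data` by substituting its category name for {{{yomi_catname}}} in
-- `category`. Types whose `category_type` field is false are skipped.
local function add_yomi_category(category, category_type, parent, description)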
for _, yomi in pairs(yomi_data) do
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
local yomi_desc = yomi.link or yomi_catname
if yomi.description then
yomi_desc = yomi_desc .. "; " .. yomi.description
end
local label = {
description = description .. " " .. yomi_desc .. ".",
breadcrumb = yomi_type,
parents = {{name = parent, sort = yomi_catname}},
}
if yomi.onyomi then
local onyomi = handle_onyomi_list(category, category_type, yomi_type)
label.additional = "Kategori untuk perkataan dengan " ..
(yomi_type == "on'yomi" and "pelbagai lagi" or "lain-lain") ..
" jenis spesifik bacaan on'yomi boleh ditemukan pada kategori berikut:\n* " .. concat(onyomi, "\n* ")
if yomi_type ~= "on'yomi" then
insert(label.parents, 1, {
name = (category:gsub("{{{yomi_catname}}}", yomi_data.on[category_type])),
sort = yomi_catname
})
end
end
labels[category:gsub("{{{yomi_catname}}}", yomi_catname)] = label
end
end
end
add_yomi_category(
"Perkataan dengan bacaan {{{yomi_catname}}}",
"reading_category",
"Perkataan mengikut pola bacaan",
"Perkataan bahasa {{{langname}}} dengan bacaan"
)
add_yomi_category(
"Perkataan dieja dengan kanji dengan bacaan {{{yomi_catname}}}",
"kanji_category",
"Perkataan mengikut jenis bacaan kanji",
"Kategori bahasa {{{langname}}} dengan perkataan yang dieja dengan satu atau lebih banyak aksara kanji dengan bacaan"
)
labels["Perkataan kehilangan yomi"] = {
description = "Perkataan bahasa {{{langname}}} yang kehilangan satu atau lebih [[Lampiran:Glosari bahasa Jepun#yomi|yomi]] dalam {{tl|{{{langcode}}}-kanjitab}}.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["terms with IPA pronunciation with pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that have {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
can_be_empty = true,
parents = {"Penyelenggaraan entri", "pitch accent"},
}
labels["terms with IPA pronunciation missing pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that do not have a {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["pitch accent"] = {
description = "{{{langname}}} terms regarding {{w|Japanese pitch accent|pitch accent}} pronunciation.",
can_be_empty = true,
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms with Heiban pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[平板型|Heiban]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Atamadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[頭高型|Atamadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Nakadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["pitch accent deaccenting before の"] = {
description = "{{{langname}}} terms with {{w|Japanese pitch accent|pitch accent}} pronunciations that have exceptional deaccenting or lack thereof before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent not deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}} and do not become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["terms with Nakadaka pitch accent deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}} and become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["Perkataan mengikut jenis bacaan kanji"] = {
description = "{{{langname}}} categories with terms grouped with regard to the types of readings of the kanji with which " ..
"they are spelled; broadly, those of Chinese origin, {{ja-r|音|おん}} readings, and those of non-Chinese origin, {{ja-r|訓|くん}} readings.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan dieja dengan ateji"] = {
topright = "{{wp|Ateji}}",
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#ateji|ateji]] {{mdash}} {{ja-r|当て字|あてじ}} {{mdash}} which are [[kanji]] used to represent sounds rather than meanings (though meaning may have some influence on which kanji are chosen).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan dieja dengan daiyōji"] = {
description = "Japanese terms spelled using [[Appendix:Japanese glossary#daiyouji|daiyōji]], categorized using {{temp|ja-daiyouji}}.",
parents = {"Perkataan mengikut etimologi"},
}
labels["Perkataan dieja dengan jukujikun"] = {
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#jukujikun|jukujikun]] {{mdash}} {{ja-r|熟%字%訓|じゅく%じ%くん}} {{mdash}} which are [[kanji]] used to represent meanings rather than sounds.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
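-- Register two labels for a kanji grade or list: "Kanji <grade>" and "Perkataan dieja dengan kanji <grade>".
-- `wp` adds a Wikipedia box, `only_one` adds "sekurang-kurangnya satu" to the description of the second label,
-- `parent` names the parent grade (if any), and `sort` overrides the default sort key (`grade`).
local function add_grade_categories(grade, desc, wp, only_one, parent, sort)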
local grade_kanji = "Kanji " .. grade
local topright = wp and ("{{wp|%s}}"):format(ucfirst(grade_kanji)) or nil
labels[grade_kanji] = {
topright = topright,
description = "Kanji bahasa {{{langname}}} " .. desc,
toc_template = "Hani-categoryTOC",
parents = {{
name = parent and ("Kanji " .. parent) or "Kanji",
sort = sort or grade
}},
}
labels["Perkataan dieja dengan " .. grade_kanji:lower()] = {
topright = topright,
description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. (only_one and "sekurang-kurangnya satu " or "") .. "aksara kanji " .. desc,
parents = {{
name = parent and ("Perkataan dieja dengan kanji " .. parent) or "Perkataan mengikut sifat ortografi",
sort = sort or grade
}},
}
end
for i = 1, 6 do
local ord = m_numeric.ones_position_ord[i]
add_grade_categories(
"gred " .. ord,
"diajar dalam gred " .. ord .. " sekolah rendah, seperti yang ditetapkan oleh senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
false,
false,
"kyōiku",
i
)
end
add_grade_categories(
"kyōiku",
"pada senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
true,
false,
"jōyō"
)
add_grade_categories(
"sekolah menengah",
"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}} yang secara umumnya diajar pada peringkat sekolah menengah.",
false,
false,
"jōyō"
)
add_grade_categories(
"jōyō",
"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
true,
false
)
add_grade_categories(
"tōyō",
"pada senarai rasmi {{ja-r|当%用 漢%字|とう%よう かん%じ|aksara kegunaan am}}, yang digunakan pada sekitar tahun 1946{{ndash}}1981 sehingga penerbitan senarai {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
true,
false
)
add_grade_categories(
"jinmeiyō",
"pada senarai rasmi {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}.",
true,
true
)
add_grade_categories(
"hyōgai",
"tidak termasuk pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara kegunaan kerap}} atau {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}, yang dikenali sebagai {{ja-r|表%外 漢%字|ひょう%がい かん%じ}} atau {{ja-r|表%外%字|ひょう%がい%じ|aksara tidak tersenarai}}.",
true,
true
)
labels["Perkataan dengan berbilang bacaan"] = {
description = "Perkataan bahasa {{{langname}}} dengan berbilang cara sebutan (maka juga sama dengan berbilang ejaan [[kana]]).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Bacaan kanji mengikut bilangan mora"] = {
description = "Kategori-kategori bahasa {{{langname}}} dikumpulkan berdasarkan bilangan mora dalam bacaan kanji.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan kanji tunggal"] = {
description = "Perkataan bahasa {{{langname}}} yang ditulis dengan kanji tunggal.",
parents = {
"Perkataan mengikut sifat ortografi",
{name = "Perkataan dengan 1 aksara kanji", sort = " "},
},
}
labels["kanji with kun readings missing okurigana designation"] = {
breadcrumb = "Kanji missing okurigana designation",
description = "{{{langname}}} kanji entries in which one or more kun readings entered into {{tl|{{{langcode}}}-readings}} is missing a hyphen denoting okurigana.",
toc_template = "Hani-categoryTOC",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["Perkataan mengikut aksara individu dalam ejaan sejarah"] = {
breadcrumb = "Bersejarah",
description = "{{{langname}}} terms categorized by whether their spellings in the {{w|historical kana orthography}} included certain individual characters.",
parents = {{name = "Perkataan mengikut aksara individu", sort = " "}},
}
labels["Kata kerja tanpa ketransitifan"] = {
description = "{{{langname}}} verbs missing the {{code|=tr=}} parameter from their headword templates.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["Yojijukugo"] = {
topright = "{{wp|Yojijukugo}}",
description = "{{{langname}}} four-[[kanji]] compound terms, {{ja-r|四%字 熟%語|よ%じ じゅく%ご}}, with idiomatic meanings; typically derived from Classical Chinese, Buddhist scripture or traditional Japanese proverbs.",
additional = "Compare Chinese {{w|chengyu}} and Korean {{w|sajaseong-eo}}.",
umbrella = "four-character idioms",
parents = {"peribahasa"},
}
-- FIXME: Only works for 0 through 19.
local word_to_number = {}
for k, v in pairs(m_numeric.ones_position) do
word_to_number[v] = k
end
local periods = {
lama = true,
kuno = true,
}
local function get_period_text_and_reading_type_link(period, reading_type)
if period and not periods[period] then
return nil
end
local period_text = period and " " .. period or nil
-- Allow periods (historical or ancient) by themselves; they will parse as reading types.
if not period and periods[reading_type] then
return nil, reading_type
end
local reading_type_link = "[[Lampiran:Glosari bahasa Jepun#" .. reading_type .. "|" .. reading_type .. "]]"
return period_text, reading_type_link
end
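-- Return the Hira script object if `str`, ignoring whitespace and punctuation, contains only hiragana characters;
-- otherwise return Jpan.
local function get_sc(str)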
return match(str:gsub("[%s%p]+", ""), "[^" .. Hira:getCharacters() .. "]") and Jpan or Hira
end
local function get_tagged_reading(reading, lang)
return tag_text(reading, lang, get_sc(reading))
end
local function get_reading_link(reading, lang, period, link)
local hist = periods[period]
reading = reading:gsub("[%.%-%s]+", "")
return full_link({
lang = lang,
sc = get_sc(reading),
term = link or reading:gsub("・", ""),
-- If we have okurigana, demarcate furigana.
alt = reading:gsub("^(.-)・", "<span style=\"border-top:1px solid;position:relative;padding:1px;\">%1<span style=\"position:absolute;top:0;bottom:67%%;right:0%%;border-right:1px solid;\"></span></span>"),
tr = kana_to_romaji((reading:gsub("・", ".")), lang:getCode(), nil, {keep_dot = true, hist = hist})
:gsub("^(.-)%.", "<u>%1</u>"),
pos = reading:find("・", 1, true) and get_tagged_reading((reading:gsub("^.-・", "~")), lang) or nil
}, "term")
end
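-- Truthy for reading types that end in "on" preceded by at least one character (e.g. "goon"), but not for plain "on".
local function is_on_subtype(reading_type)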
return reading_type:find(".on$")
end
insert(handlers, function(data)
local n = data.label:match("^Perkataan dengan ([1-9]%d*) aksara kanji$")
if not n then
return
end
local sortkey = require("Module:category tree").numeral_sortkey(n, 2097152)
return {
breadcrumb = n,
description = ("Perkataan bahasa {{{langname}}} yang mengandungi tepat %d aksara kanji."):format(n),
-- TODO: implement this using the same mechanism used to implement parents (i.e. avoiding the need for raw categories).
-- umbrella = {
-- breadcrumb = ("%d kanji"):format(n),
-- parents = {{name = "terms by number of kanji subcategories by language", sort = sortkey}},
-- },
parents = {{name = ("Perkataan mengikut bilangan aksara kanji"), sort = sortkey}}
}
end)
insert(handlers, function(data)
local label_pref, kana = data.label:match("^(Perkataan yang mengikut sejarah dieja dengan )" .. kana_capture .. "$")
if not kana then
return
end
local lang = data.lang
return {
description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. get_reading_link(kana, lang, "lama") .. " dalam {{w|ortografi kana sejarawi}}.",
displaytitle = label_pref .. get_tagged_reading(kana, lang) .. " bahasa {{{langname}}}",
breadcrumb = "sejarah",
parents = {
{name = "Perkataan dieja dengan " .. kana, sort = " "},
{name = "Perkataan mengikut aksara individu dalam ejaan sejarah", sort = lang:makeSortKey(kana)}
},
}
end)
insert(handlers, function(data)
local count = data.label:match("^Bacaan kanji dengan (.+) mora$")
local num = word_to_number[count]
if not num then
return nil
end
return {
description = "Bacaan kanji bahasa {{{langname}}} yang mengandungi " .. count .. " mora.",
breadcrumb = num,
parents = {{name = "Bacaan kanji mengikut bilangan mora", sort = num}},
}
end)
insert(handlers, function(data)
local label_pref, period, reading_type, reading = match(data.label, "^(Kanji dengan bacaan ([a-z]-) ?([%a']+) )" .. kana_capture .. "$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return
end
local lang = data.lang
-- Compute parents.
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "Kanji dengan bacaan" .. (period_text or "") .. " ".. reading_type .. " " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "Kanji mengikut bacaan" .. (period_text or "") .. " " .. reading_type,
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
if is_on_subtype(reading_type) then
insert(parents, {name = "Kanji dengan bacaan" .. (period_text or "") .. " on " .. reading, sort = reading_type})
elseif period_text then
insert(parents, {name = "Kanji dengan bacaan" .. period_text .. " " .. reading, sort = reading_type})
end
if not period_text then
insert(parents, {name = "Kanji dibaca sebagai " .. reading, sort = reading_type})
end
return {
description = "Aksara [[kanji]] bahasa {{{langname}}} dengan bacaan " .. reading_type_link .. " " ..
get_reading_link(reading, lang, period or reading_type) .. ".",
displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}",
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
}
end)
insert(handlers, function(data)
local period, reading_type = match(data.label, "^Kanji mengikut bacaan ([a-z]-) ?([%a']+)$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return nil
end
-- Compute parents.
local parents = {
is_on_subtype(reading_type) and {name = "Kanji mengikut bacaan" .. (period_text or "") .. " on", sort = reading_type} or
period_text and {name = "Kanji mengikut bacaan " .. reading_type, sort = period} or
{name = "Kanji mengikut bacaan", sort = reading_type}
}
if period_text then
insert(parents, {name = "Kanji mengikut bacaan" .. period_text, sort = reading_type})
end
-- Compute description.
local description = "[[kanji|Kanji]] bahasa {{{langname}}} dikategorikan mengikut bacaan" .. (period_text or "") .. " " .. reading_type_link .. "."
return {
description = description,
breadcrumb = reading_type .. (period_text or ""),
parents = parents,
}
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(Kanji dibaca sebagai )" .. kana_capture .. "$")
if not reading then
return
end
local args = require("Module:parameters").process(data.args, {
["histconsol"] = true,
})
local lang = data.lang
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "Kanji dibaca sebagai " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "Kanji mengikut bacaan",
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
local addl
local period_text
if args.histconsol then
period_text = "lama"
addl = ("This is a [[Wikipedia:Historical kana orthography|historical]] [[Wikipedia:Kanazukai|reading]], now " ..
"consolidated with the [[Wikipedia:Modern kana usage|modern reading]] of " ..
get_reading_link(args.histconsol, lang, nil, ("Kategori:Kanji dibaca sebagai %s bahasa Jepun"):format(args.histconsol)) .. ".")
end
return {
description = "[[kanji|Kanji]] bahasa {{{langname}}} dibaca sebagai " .. get_reading_link(reading, lang, period_text) .. ".",
additional = addl,
displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}" ,
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
}, true
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(Perkataan dieja dengan kanji dibaca sebagai )" .. kana_capture .. "$")
if not reading then
return
end
-- Compute parents.
local lang = data.lang
local sort_key = (lang:makeSortKey(reading))
local mora_count = require("Module:ja").count_morae(reading)
local mora_count_words = m_numeric.spell_number(tostring(mora_count))
local parents = {
{name = "Perkataan mengikut bacaan kanji", sort = sort_key},
{name = "Bacaan kanji dengan " .. mora_count_words .. " mora", sort = sort_key},
{name = "Kanji dibaca sebagai " .. reading, sort = " "},
}
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms that contain kanji that exhibit a reading of " .. get_reading_link(reading, lang) ..
" in those terms prior to any sound changes.",
displaytitle = label_pref .. tagged_reading .. " bahasa {{{langname}}}",
breadcrumb = tagged_reading,
parents = parents,
}
end)
insert(handlers, function(data)
local kanji, reading = match(data.label, "^Perkataan dieja dengan (.) dibaca sebagai " .. kana_capture .. "$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("Bagi kategori dalam bentuk \"" .. lang:getCanonicalName() ..
" terms spelled with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types, parents = {}, {}
for _, yomi, category in ipairs(args[1]) do
local yomi_data = yomi_data[yomi]
if not yomi_data then
error("Jenis yomi \"" .. yomi .. "\" tidak sah.")
end
category = yomi_data.kanji_category
if not category then
error("Jenis yomi \"" .. yomi .. "\" tidak sah bagi jenis kategori ini.")
end
insert(yomi_types, yomi_data.link)
insert(parents, {
name = "Perkataan dieja dengan kanji dengan bacaan " .. category,
sort = (lang:makeSortKey(reading))
})
end
insert(parents, 1, {name = "Perkataan dieja dengan " .. kanji, sort = (lang:makeSortKey(reading))})
insert(parents, 2, {name = "Perkataan dieja dengan kanji dibaca sebagai " .. reading, sort = Hani_sort(kanji)})
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms spelled with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. ".",
displaytitle = "Perkataan dieja dengan " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading .. " bahasa {{{langname}}}",
breadcrumb = "dibaca sebagai " .. tagged_reading,
parents = parents,
}, true
end)
insert(handlers, function(data)
local affix, kanji, reading = data.label:match("^Perkataan dengan ([a-z]) (.+) dibaca sebagai " .. kana_capture .. "$")
if not affix or not kanji or not reading then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms AFFIXed with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types = {}
for _, yomi in ipairs(args[1]) do
-- Use a distinct local name so the module-level `yomi_data` table is not shadowed.
local yomi_entry = yomi_data[yomi]
if not yomi_entry then
error("The yomi type \"" .. yomi .. "\" is not recognized.")
end
-- Only validate that this yomi type has a kanji category; the name itself is not used here.
if not yomi_entry.kanji_category then
error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.")
end
insert(yomi_types, yomi_entry.link)
end
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms " .. affix .. "ed with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. ".",
displaytitle = "{{{langname}}} terms " .. affix .. "ed with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading,
breadcrumb = "dibaca sebagai " .. reading,
parents = {
{name = "terms " .. affix .. "ed with " .. kanji, sort = (lang:makeSortKey(reading))},
--{name = "Perkataan dieja dengan " .. kanji .. " dibaca sebagai " .. reading, sort = (lang:makeSortKey(reading)), args=data.args}
},
}, true
end)
insert(handlers, function(data)
local kanji, daiyoji = match(data.label, "^Perkataan dengan (.) digantikan oleh daiyōji (.)$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
["sort"] = true,
})
local lang = data.lang
if not args.sort then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms with KANJI replaced by daiyōji DAIYOJI\", the sort key must be specified using sort=")
end
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_daiyoji = get_tagged_reading(daiyoji, lang)
return {
description = "{{{langname}}} terms with {{l|{{{langcode}}}|" .. kanji .. "}} replaced by [[Appendix:Japanese glossary#daiyouji|daiyōji]] {{l|{{{langcode}}}|" .. daiyoji .. "}}.",
displaytitle = "Perkataan dengan " .. tagged_kanji .. " digantikan oleh daiyōji " .. tagged_daiyoji .. " bahasa {{{langname}}}",
breadcrumb = tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji,
parents = {{name = "Perkataan dieja dengan daiyōji", sort = args.sort}},
}, true
end)
return {LABELS = labels, HANDLERS = handlers}
kdm401tcuk1j3f4p1ncitj47yq9j7w0
281369
281367
2026-04-22T07:03:20Z
PeaceSeekers
3334
281369
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_str_utils = require("Module:string utilities")
local concat = table.concat
local full_link = require("Module:links").full_link
local insert = table.insert
local Hani_sort = require("Module:Hani-sortkey").makeSortKey
local match = m_str_utils.match
local sort = table.sort
local tag_text = require("Module:script_utilities").tag_text
local ucfirst = m_str_utils.ucfirst
local Hira = require("Module:scripts").getByCode("Hira")
local Jpan = require("Module:scripts").getByCode("Jpan")
local kana_to_romaji = require("Module:Hrkt-translit").tr
local m_numeric = require("Module:ConvertNumeric")
local kana_capture = "([-" .. require("Module:ja/data/range").kana .. "・]+)"
local yomi_data = require("Module:kanjitab/data")
labels["adnominals"] = {
description = "{{{langname}}} adnominals, or {{ja-r|連%体%詞|れん%たい%し}}, which modify nouns, and do not conjugate or [[predicate#Verb|predicate]].",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Hiragana"] = {
description = "{{{langname}}} terms with hiragana {{mdash}} {{ja-r|平%仮%名|ひら%が%な}} {{mdash}} forms, sorted by conventional hiragana sequence. The hiragana form is a [[phonetic]] representation of that word. " ..
"Wiktionary represents {{{langname}}}-language segments in three ways: in normal form (with [[kanji]], if appropriate), in [[hiragana]] " ..
"form (this differs from kanji form only when the segment contains kanji), and in [[romaji]] form.",
additional = "''Lihat juga'' [[:Kategori:Katakana bahasa {{{langname}}}]]",
toc_template = "categoryTOC-hiragana",
parents = {
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Hiragana",
}
}
labels["historical hiragana"] = {
description = "{{{langname}}} historical [[hiragana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical katakana]].",
toc_template = "categoryTOC-hiragana",
parents = {
"Hiragana",
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Hiragana",
}
}
labels["Katakana"] = {
description = "{{{langname}}} terms with katakana {{mdash}} {{ja-r|片%仮%名|かた%か%な}} {{mdash}} forms, sorted by conventional katakana sequence. Katakana is used primarily for transliterations of foreign words, including old Chinese hanzi not used in [[shinjitai]].",
additional = "''Lihat juga'' [[:Kategori:Hiragana bahasa {{{langname}}}]]",
toc_template = "categoryTOC-katakana",
parents = {
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Katakana",
}
}
labels["historical katakana"] = {
description = "{{{langname}}} historical [[katakana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical hiragana]].",
toc_template = "categoryTOC-katakana",
parents = {
"Katakana",
{name = "{{{langcat}}}", raw = true},
"Kategori:Aksara Tulisan Katakana",
}
}
labels["Perkataan dieja dengan kana campuran"] = {
description = "{{{langname}}} terms which combine [[hiragana]] and [[katakana]] characters, potentially with [[kanji]] too.",
parents = {
{name = "{{{langcat}}}", raw = true},
"Hiragana",
"Katakana",
},
}
labels["Kanji"] = {
topright = "{{wp|Kanji}}",
description = "Simbol bahasa {{{langname}}} yang merupakan sebahagian daripada tulisan logogram Han, yang boleh mewakili bunyi atau menyampaikan makna secara langsung.",
toc_template = "Hani-categoryTOC",
umbrella = "Aksara Han",
parents = "Logogram",
}
labels["Kanji mengikut bacaan"] = {
description = "Kanji bahasa {{{langname}}} yang dikategorikan mengikut bacaan.",
parents = {{name = "Kanji", sort = "bacaan"}},
}
labels["Makurakotoba"] = {
topright = "{{wp|Makurakotoba}}",
description = "{{{langname}}} idioms used in poetry to introduce specific words.",
parents = {"Peribahasa"},
}
labels["Perkataan mengikut bacaan kanji"] = {
description = "Kategori bahasa {{{langname}}} yang dikumpulkan berdasarkan bacaan kanji yang dieja dengannya.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan mengikut pola bacaan"] = {
description = "Kategori bahasa {{{langname}}} dengan perkataan yang dikumpulkan berdasarkan corak bacaannya.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan mengikut bilangan aksara kanji"] = {
description = "Perkataan bahasa {{{langname}}} dikategorikan mengikut bilangan aksara kanji.",
parents = {"Perkataan mengikut sifat ortografi"},
}
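-- Builds a sorted list of category links for the specific on'yomi subtypes: every entry in
-- `yomi_data` flagged `onyomi`, other than generic on'yomi and other than `cat_yomi_type`.
-- `seen` skips tables already visited (presumably because several keys can alias the same entry).
local function handle_onyomi_list(category, category_type, cat_yomi_type)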
local onyomi, seen = {}, {}
for _, yomi in pairs(yomi_data) do
if not seen[yomi] and yomi.onyomi then
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
if yomi_type ~= "on'yomi" and yomi_type ~= cat_yomi_type then
insert(onyomi, "[[:Kategori:" .. category:gsub("{{{yomi_catname}}}", yomi_catname) .. " bahasa {{{langname}}}]]")
end
end
end
seen[yomi] = true
end
sort(onyomi)
return onyomi
end
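-- Registers one label per yomi type in `yomi_data` whose `category_type` field is not false,
-- substituting that field for the {{{yomi_catname}}} placeholder in `category`.
local function add_yomi_category(category, category_type, parent, description)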
for _, yomi in pairs(yomi_data) do
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
local yomi_desc = yomi.link or yomi_catname
if yomi.description then
yomi_desc = yomi_desc .. "; " .. yomi.description
end
local label = {
description = description .. " " .. yomi_desc .. ".",
breadcrumb = yomi_type,
parents = {{name = parent, sort = yomi_catname}},
}
if yomi.onyomi then
local onyomi = handle_onyomi_list(category, category_type, yomi_type)
label.additional = "Kategori untuk perkataan dengan " ..
(yomi_type == "on'yomi" and "pelbagai lagi" or "lain-lain") ..
" jenis spesifik bacaan on'yomi boleh ditemukan pada kategori berikut:\n* " .. concat(onyomi, "\n* ")
if yomi_type ~= "on'yomi" then
insert(label.parents, 1, {
name = (category:gsub("{{{yomi_catname}}}", yomi_data.on[category_type])),
sort = yomi_catname
})
end
end
labels[category:gsub("{{{yomi_catname}}}", yomi_catname)] = label
end
end
end
add_yomi_category(
"Perkataan dengan bacaan {{{yomi_catname}}}",
"reading_category",
"Perkataan mengikut pola bacaan",
"Perkataan bahasa {{{langname}}} dengan bacaan"
)
add_yomi_category(
"Perkataan dieja dengan kanji dengan bacaan {{{yomi_catname}}}",
"kanji_category",
"Perkataan mengikut jenis bacaan kanji",
"Kategori bahasa {{{langname}}} dengan perkataan yang dieja dengan satu atau lebih banyak aksara kanji dengan bacaan"
)
labels["Perkataan kehilangan yomi"] = {
description = "Perkataan bahasa {{{langname}}} yang kehilangan satu atau lebih [[Lampiran:Glosari bahasa Jepun#yomi|yomi]] dalam {{tl|{{{langcode}}}-kanjitab}}.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["terms with IPA pronunciation with pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that have {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
can_be_empty = true,
parents = {"Penyelenggaraan entri", "pitch accent"},
}
labels["terms with IPA pronunciation missing pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that do not have a {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["pitch accent"] = {
description = "{{{langname}}} terms regarding {{w|Japanese pitch accent|pitch accent}} pronunciation.",
can_be_empty = true,
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms with Heiban pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[平板型|Heiban]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Atamadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[頭高型|Atamadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Nakadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["pitch accent deaccenting before の"] = {
description = "{{{langname}}} terms with {{w|Japanese pitch accent|pitch accent}} pronunciations that have exceptional deaccenting or lack thereof before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent not deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}} and do not become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["terms with Nakadaka pitch accent deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}} and become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["Perkataan mengikut jenis bacaan kanji"] = {
description = "{{{langname}}} categories with terms grouped with regard to the types of readings of the kanji with which " ..
"they are spelled; broadly, those of Chinese origin, {{ja-r|音|おん}} readings, and those of non-Chinese origin, {{ja-r|訓|くん}} readings.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan dieja dengan ateji"] = {
topright = "{{wp|Ateji}}",
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#ateji|ateji]] {{mdash}} {{ja-r|当て字|あてじ}} {{mdash}} which are [[kanji]] used to represent sounds rather than meanings (though meaning may have some influence on which kanji are chosen).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan dieja dengan daiyōji"] = {
description = "Japanese terms spelled using [[Appendix:Japanese glossary#daiyouji|daiyōji]], categorized using {{temp|ja-daiyouji}}.",
parents = {"Perkataan mengikut etimologi"},
}
labels["Perkataan dieja dengan jukujikun"] = {
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#jukujikun|jukujikun]] {{mdash}} {{ja-r|熟%字%訓|じゅく%じ%くん}} {{mdash}} which are [[kanji]] used to represent meanings rather than sounds.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
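-- Registers two labels for a kanji grade or list: "Kanji <grade>" for the characters themselves
-- and "Perkataan dieja dengan kanji <grade>" for terms spelled with them.
-- `wp` adds a Wikipedia box, `only_one` adds the "at least one" wording, `parent` names the
-- parent grade (default: top level), and `sort` overrides the sort key (default: `grade`).
local function add_grade_categories(grade, desc, wp, only_one, parent, sort)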
local grade_kanji = "Kanji " .. grade
local topright = wp and ("{{wp|%s}}"):format(ucfirst(grade_kanji)) or nil
labels[grade_kanji] = {
topright = topright,
description = "Kanji bahasa {{{langname}}} " .. desc,
toc_template = "Hani-categoryTOC",
parents = {{
name = parent and ("Kanji " .. parent) or "Kanji",
sort = sort or grade
}},
}
labels["Perkataan dieja dengan " .. grade_kanji:lower()] = {
topright = topright,
description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. (only_one and "sekurang-kurangnya satu " or "") .. "aksara kanji " .. desc,
parents = {{
name = parent and ("Perkataan dieja dengan kanji " .. parent) or "Perkataan mengikut sifat ortografi",
sort = sort or grade
}},
}
end
for i = 1, 6 do
local ord = m_numeric.ones_position_ord[i]
add_grade_categories(
"gred " .. ord,
"diajar dalam gred " .. ord .. " sekolah rendah, seperti yang ditetapkan oleh senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
false,
false,
"kyōiku",
i
)
end
add_grade_categories(
"kyōiku",
"pada senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
true,
false,
"jōyō"
)
add_grade_categories(
"sekolah menengah",
"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}} yang secara umumnya diajar pada peringkat sekolah menengah.",
false,
false,
"jōyō"
)
add_grade_categories(
"jōyō",
"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
true,
false
)
add_grade_categories(
"tōyō",
"pada senarai rasmi {{ja-r|当%用 漢%字|とう%よう かん%じ|aksara kegunaan umum}}, yang digunakan pada sekitar tahun 1946{{ndash}}1981 sehingga penerbitan senarai {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
true,
false
)
add_grade_categories(
"jinmeiyō",
"pada senarai rasmi {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}.",
true,
true
)
add_grade_categories(
"hyōgai",
"tidak termasuk pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara kegunaan kerap}} atau {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}, yang dikenali sebagai {{ja-r|表%外 漢%字|ひょう%がい かん%じ}} atau {{ja-r|表%外%字|ひょう%がい%じ|aksara tidak tersenarai}}.",
true,
true
)
labels["Perkataan dengan berbilang bacaan"] = {
description = "Perkataan bahasa {{{langname}}} dengan berbilang cara sebutan (maka juga sama dengan berbilang ejaan [[kana]]).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Bacaan kanji mengikut bilangan mora"] = {
description = "Kategori-kategori bahasa {{{langname}}} dikumpulkan berdasarkan bilangan mora dalam bacaan kanji.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["Perkataan kanji tunggal"] = {
description = "Perkataan {{{langname}}} yang ditulis dengan kanji tunggal.",
parents = {
"Perkataan mengikut sifat ortografi",
{name = "Perkataan dengan 1 aksara kanji", sort = " "},
},
}
labels["kanji with kun readings missing okurigana designation"] = {
breadcrumb = "Kanji missing okurigana designation",
description = "{{{langname}}} kanji entries in which one or more kun readings entered into {{tl|{{{langcode}}}-readings}} is missing a hyphen denoting okurigana.",
toc_template = "Hani-categoryTOC",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["Perkataan mengikut aksara individu dalam ejaan sejarah"] = {
breadcrumb = "Bersejarah",
description = "{{{langname}}} terms categorized by whether their spellings in the {{w|historical kana orthography}} included certain individual characters.",
parents = {{name = "Perkataan mengikut aksara individu", sort = " "}},
}
labels["Kata kerja tanpa ketransitifan"] = {
description = "{{{langname}}} verbs missing the {{code|=tr=}} parameter from their headword templates.",
hidden = true,
can_be_empty = true,
parents = {"Penyelenggaraan entri"},
}
labels["Yojijukugo"] = {
topright = "{{wp|Yojijukugo}}",
description = "{{{langname}}} four-[[kanji]] compound terms, {{ja-r|四%字 熟%語|よ%じ じゅく%ご}}, with idiomatic meanings; typically derived from Classical Chinese, Buddhist scripture or traditional Japanese proverbs.",
additional = "Compare Chinese {{w|chengyu}} and Korean {{w|sajaseong-eo}}.",
umbrella = "four-character idioms",
parents = {"Peribahasa"},
}
-- FIXME: Only works for 0 through 19.
local word_to_number = {}
for k, v in pairs(m_numeric.ones_position) do
word_to_number[v] = k
end
local periods = {
lama = true,
kuno = true,
}
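-- Returns `period_text, reading_type_link`.
-- Returns nil when `period` is given but is not a key of `periods`. When no period is given and
-- `reading_type` is itself a period, returns nil plus the bare (unlinked) reading type.
local function get_period_text_and_reading_type_link(period, reading_type)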
if period and not periods[period] then
return nil
end
local period_text = period and " " .. period or nil
-- Allow periods (historical or ancient) by themselves; they will parse as reading types.
if not period and periods[reading_type] then
return nil, reading_type
end
local reading_type_link = "[[Lampiran:Glosari bahasa Jepun#" .. reading_type .. "|" .. reading_type .. "]]"
return period_text, reading_type_link
end
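-- Picks the script for `str`: Hira if, after removing whitespace and punctuation, every
-- character is hiragana; otherwise Jpan.
local function get_sc(str)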
return match(str:gsub("[%s%p]+", ""), "[^" .. Hira:getCharacters() .. "]") and Jpan or Hira
end
local function get_tagged_reading(reading, lang)
return tag_text(reading, lang, get_sc(reading))
end
local function get_reading_link(reading, lang, period, link)
local hist = periods[period]
reading = reading:gsub("[%.%-%s]+", "")
return full_link({
lang = lang,
sc = get_sc(reading),
term = link or reading:gsub("・", ""),
-- If we have okurigana, demarcate furigana.
alt = reading:gsub("^(.-)・", "<span style=\"border-top:1px solid;position:relative;padding:1px;\">%1<span style=\"position:absolute;top:0;bottom:67%%;right:0%%;border-right:1px solid;\"></span></span>"),
tr = kana_to_romaji((reading:gsub("・", ".")), lang:getCode(), nil, {keep_dot = true, hist = hist})
:gsub("^(.-)%.", "<u>%1</u>"),
pos = reading:find("・", 1, true) and get_tagged_reading((reading:gsub("^.-・", "~")), lang) or nil
}, "term")
end
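-- Truthy for reading types whose name ends in "on" with at least one character before it
-- (e.g. "goon"); plain "on" does not match.
local function is_on_subtype(reading_type)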
return reading_type:find(".on$")
end
insert(handlers, function(data)
local n = data.label:match("^Perkataan dengan ([1-9]%d*) aksara kanji$")
if not n then
return
end
local sortkey = require("Module:category tree").numeral_sortkey(n, 2097152)
return {
breadcrumb = n,
description = ("Perkataan bahasa {{{langname}}} yang mengandungi tepat %d aksara kanji."):format(n),
-- TODO: implement this using the same mechanism used to implement parents (i.e. avoiding the need for raw categories).
-- umbrella = {
-- breadcrumb = ("%d kanji"):format(n),
-- parents = {{name = "terms by number of kanji subcategories by language", sort = sortkey}},
-- },
parents = {{name = ("Perkataan mengikut bilangan aksara kanji"), sort = sortkey}}
}
end)
insert(handlers, function(data)
local label_pref, kana = data.label:match("^(Perkataan yang mengikut sejarah dieja dengan )" .. kana_capture .. "$")
if not kana then
return
end
local lang = data.lang
return {
-- The period must be a key of `periods` ("lama" or "kuno") for historical romanization to apply.
description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. get_reading_link(kana, lang, "lama") .. " dalam {{w|ortografi kana sejarawi}}.",
displaytitle = label_pref .. get_tagged_reading(kana, lang) .. " bahasa {{{langname}}}",
breadcrumb = "sejarah",
parents = {
{name = "Perkataan dieja dengan " .. kana, sort = " "},
{name = "Perkataan mengikut aksara individu dalam ejaan sejarah", sort = lang:makeSortKey(kana)}
},
}
end)
insert(handlers, function(data)
local count = data.label:match("^Bacaan kanji dengan (.+) mora$")
local num = word_to_number[count]
if not num then
return nil
end
return {
description = "Bacaan kanji bahasa {{{langname}}} yang mengandungi " .. count .. " mora.",
breadcrumb = num,
parents = {{name = "Bacaan kanji mengikut bilangan mora", sort = num}},
}
end)
insert(handlers, function(data)
local label_pref, period, reading_type, reading = match(data.label, "^(Kanji dengan bacaan ([a-z]-) ?([%a']+) )" .. kana_capture .. "$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return
end
local lang = data.lang
-- Compute parents.
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "Kanji dengan bacaan" .. (period_text or "") .. " ".. reading_type .. " " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "Kanji mengikut bacaan" .. (period_text or "") .. " " .. reading_type,
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
if is_on_subtype(reading_type) then
insert(parents, {name = "Kanji dengan bacaan" .. (period_text or "") .. " on " .. reading, sort = reading_type})
elseif period_text then
insert(parents, {name = "Kanji dengan bacaan" .. period_text .. " " .. reading, sort = reading_type})
end
if not period_text then
insert(parents, {name = "Kanji dibaca sebagai " .. reading, sort = reading_type})
end
return {
description = "Aksara [[kanji]] bahasa {{{langname}}} dengan bacaan " .. reading_type_link .. " " ..
get_reading_link(reading, lang, period or reading_type) .. ".",
displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}",
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
}
end)
insert(handlers, function(data)
local period, reading_type = match(data.label, "^Kanji mengikut bacaan ([a-z]-) ?([%a']+)$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return nil
end
-- Compute parents.
local parents = {
is_on_subtype(reading_type) and {name = "Kanji mengikut bacaan" .. (period_text or "") .. " on", sort = reading_type} or
period_text and {name = "Kanji mengikut bacaan " .. reading_type, sort = period} or
{name = "Kanji mengikut bacaan", sort = reading_type}
}
if period_text then
insert(parents, {name = "Kanji mengikut bacaan" .. period_text, sort = reading_type})
end
-- Compute description.
local description = "[[kanji|Kanji]] bahasa {{{langname}}} dikategorikan mengikut bacaan" .. (period_text or "") .. " " .. reading_type_link .. "."
return {
description = description,
breadcrumb = reading_type .. (period_text or ""),
parents = parents,
}
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(Kanji dibaca sebagai )" .. kana_capture .. "$")
if not reading then
return
end
local args = require("Module:parameters").process(data.args, {
["histconsol"] = true,
})
local lang = data.lang
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "Kanji dibaca sebagai " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "Kanji mengikut bacaan",
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
local addl
local period_text
if args.histconsol then
period_text = "lama"
addl = ("This is a [[Wikipedia:Historical kana orthography|historical]] [[Wikipedia:Kanazukai|reading]], now " ..
"consolidated with the [[Wikipedia:Modern kana usage|modern reading]] of " ..
get_reading_link(args.histconsol, lang, nil, ("Kategori:Kanji dibaca sebagai %s bahasa Jepun"):format(args.histconsol)) .. ".")
end
return {
description = "[[kanji|Kanji]] bahasa {{{langname}}} dibaca sebagai " .. get_reading_link(reading, lang, period_text) .. ".",
additional = addl,
displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}" ,
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
}, true
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(Perkataan dieja dengan kanji dibaca sebagai )" .. kana_capture .. "$")
if not reading then
return
end
-- Compute parents.
local lang = data.lang
local sort_key = (lang:makeSortKey(reading))
local mora_count = require("Module:ja").count_morae(reading)
local mora_count_words = m_numeric.spell_number(tostring(mora_count))
local parents = {
{name = "Perkataan mengikut bacaan kanji", sort = sort_key},
{name = "Bacaan kanji dengan " .. mora_count_words .. " mora", sort = sort_key},
{name = "Kanji dibaca sebagai " .. reading, sort = " "},
}
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms that contain kanji that exhibit a reading of " .. get_reading_link(reading, lang) ..
" in those terms prior to any sound changes.",
displaytitle = label_pref .. tagged_reading .. " bahasa {{{langname}}}",
breadcrumb = tagged_reading,
parents = parents,
}
end)
insert(handlers, function(data)
local kanji, reading = match(data.label, "^Perkataan dieja dengan (.) dibaca sebagai " .. kana_capture .. "$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("For categories of the form \"Perkataan dieja dengan KANJI dibaca sebagai READING bahasa " .. lang:getCanonicalName() ..
"\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types, parents = {}, {}
for _, yomi in ipairs(args[1]) do
-- Use a distinct local name so the module-level `yomi_data` table is not shadowed.
local yomi_entry = yomi_data[yomi]
if not yomi_entry then
error("The yomi type \"" .. yomi .. "\" is not recognized.")
end
local category = yomi_entry.kanji_category
if not category then
error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.")
end
insert(yomi_types, yomi_entry.link)
insert(parents, {
name = "Perkataan dieja dengan kanji dengan bacaan " .. category,
sort = (lang:makeSortKey(reading))
})
end
insert(parents, 1, {name = "Perkataan dieja dengan " .. kanji, sort = (lang:makeSortKey(reading))})
insert(parents, 2, {name = "Perkataan dieja dengan kanji dibaca sebagai " .. reading, sort = Hani_sort(kanji)})
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms spelled with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. ".",
displaytitle = "Perkataan dieja dengan " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading .. " bahasa {{{langname}}}",
breadcrumb = "dibaca sebagai " .. tagged_reading,
parents = parents,
}, true
end)
insert(handlers, function(data)
local affix, kanji, reading = data.label:match("^Perkataan dengan ([a-z]) (.+) dibaca sebagai " .. kana_capture .. "$")
if not affix or not kanji or not reading then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms AFFIXed with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types = {}
for _, yomi in ipairs(args[1]) do
-- Use a distinct local name so the module-level `yomi_data` table is not shadowed.
local yomi_entry = yomi_data[yomi]
if not yomi_entry then
error("The yomi type \"" .. yomi .. "\" is not recognized.")
end
-- Only validate that this yomi type has a kanji category; the name itself is not used here.
if not yomi_entry.kanji_category then
error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.")
end
insert(yomi_types, yomi_entry.link)
end
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms " .. affix .. "ed with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. ".",
displaytitle = "{{{langname}}} terms " .. affix .. "ed with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading,
breadcrumb = "dibaca sebagai " .. reading,
parents = {
{name = "terms " .. affix .. "ed with " .. kanji, sort = (lang:makeSortKey(reading))},
--{name = "Perkataan dieja dengan " .. kanji .. " dibaca sebagai " .. reading, sort = (lang:makeSortKey(reading)), args=data.args}
},
}, true
end)
insert(handlers, function(data)
local kanji, daiyoji = match(data.label, "^Perkataan dengan (.) digantikan oleh daiyōji (.)$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
["sort"] = true,
})
local lang = data.lang
if not args.sort then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms with KANJI replaced by daiyōji DAIYOJI\", the sort key must be specified using sort=")
end
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_daiyoji = get_tagged_reading(daiyoji, lang)
return {
description = "{{{langname}}} terms with {{l|{{{langcode}}}|" .. kanji .. "}} replaced by [[Appendix:Japanese glossary#daiyouji|daiyōji]] {{l|{{{langcode}}}|" .. daiyoji .. "}}.",
displaytitle = "Perkataan dengan " .. tagged_kanji .. " digantikan oleh daiyōji " .. tagged_daiyoji .. " bahasa {{{langname}}}",
breadcrumb = tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji,
parents = {{name = "Perkataan dieja dengan daiyōji", sort = args.sort}},
}, true
end)
return {LABELS = labels, HANDLERS = handlers}
55nzh7armhc450zs6jk9i3eepl0t9es
Modul:MediaWiki message helper
828
57886
281283
184920
2026-04-21T14:11:25Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/84209320|84209320]])
281283
Scribunto
text/plain
local m_str_utils = require("Module:string utilities")
local dump = mw.dumpObject
local get_current_title = mw.title.getCurrentTitle
local gsplit = m_str_utils.gsplit
local make_title = mw.title.makeTitle
local new_title = mw.title.new
local pattern_escape = m_str_utils.pattern_escape
local php_trim = require("Module:Scribunto").php_trim
local ufind = m_str_utils.find
local ugsub = m_str_utils.gsub
local ulower = m_str_utils.lower
local uupper = m_str_utils.upper
local export = {}
local function get_title(frame)
local args = frame.args
local title = args and args.title or nil
return title == nil and get_current_title() or
new_title(title) or
error(("%s is not a valid title"):format(dump(title)))
end
local function print_suggestions(suggestions)
if #suggestions == 0 then
return ""
else
local prefix = "* Adakah anda maksudkan "
local suffix
if #suggestions > 1 then
prefix = prefix .. "salah satu daripada semua ini?\n"
suggestions = suggestions:map(function(link) return "** " .. link end)
suffix = ""
else
suffix = "?"
end
return prefix .. suggestions:concat "\n" .. suffix
end
end
-- For [[MediaWiki:Noarticletext]] on uncreated category pages.
function export.category_suggestions(frame)
local title = ugsub(get_title(frame).text, "^.", uupper)
local output = require("Module:array")()
local function make_suggestion(title, suffix)
output:insert("'''[[:Kategori:" .. title .. "]]'''" .. (suffix or ""))
end
local function check_for_page_from_function(func)
local suggestion = func(title)
if suggestion then
local suggestion_title = make_title(14, suggestion)
if suggestion_title and suggestion_title.exists then
make_suggestion(suggestion)
return true
end
end
return false
end
local function check_for_page_with_suffix(suffix)
return check_for_page_from_function(function(title)
return title .. " " .. suffix
end)
end
local function check_for_page_with_prefix_removed(prefix)
return check_for_page_from_function(function(title)
return title:gsub(pattern_escape(prefix), "")
end)
end
check_for_page_with_prefix_removed("List of ")
check_for_page_with_prefix_removed("list of ")
local has_language_category = check_for_page_with_suffix("language")
check_for_page_with_suffix("Language")
check_for_page_with_suffix("languages")
local has_script_category = check_for_page_with_suffix("script")
local function check_other_names_of_languages(language_name)
for code, data in pairs(require("Module:languages/data/all")) do
local function check_name_list(list)
if list then
for _, name in ipairs(list) do
-- The aliases and varieties are recursive,
-- with subtables that themselves contain names.
if type(name) == "table" then
check_name_list(name)
else
if name == language_name then
local object = require("Module:languages").makeObject(code, data)
make_suggestion(object:getCategoryName())
end
end
end
end
end
check_name_list(data.otherNames)
check_name_list(data.aliases)
check_name_list(data.varieties)
end
end
-- If title looks like a language category, then check if the language name
-- in it is a valid canonical name, or one of the otherNames for some
-- language.
-- If the title looks like a language code, check for a language or a script
-- with that code.
local function check_language_name(language_name, is_language_category, has_language_category)
local ret = false
if not has_language_category then
if require("Module:languages/canonical names")[language_name] then
if not is_language_category then
make_suggestion("bahasa " .. language_name)
else
output:insert("* '''" .. language_name .. "''' merupakan nama bahasa Wikikamus yang sah.")
end
ret = true
end
end
-- Some otherNames are the canonical name of another language.
check_other_names_of_languages(language_name)
return ret
end
local language_name = title:match "^[Bb]ahasa (.+)$"
check_language_name(language_name or title, language_name ~= nil, has_language_category)
-- Most languages (7965/8085 by last count) have uppercase letters at
-- beginning of the name and after whitespace and punctuation characters,
-- and lowercase everywhere else. Exceptions include languages
-- with apostrophes, such as Yup'ik, and languages with tone letters,
-- such as ǃXóõ.
local fixed_capitalization = ugsub(ulower(language_name or title), "%f[^%z%s%p]%a", uupper)
if fixed_capitalization ~= (language_name or title) then
check_language_name(fixed_capitalization)
end
if title:find "^[%a-]+$" then
local function check_for_valid_code(code, ...)
for _, module_name in ipairs { ... } do
local object = require("Modul:" .. module_name).getByCode(code)
if object then
make_suggestion(object:getCategoryName(), " (kod <code>" .. code .. "</code>)")
end
end
end
local code = title:lower()
check_for_valid_code(code, "languages", "etymology languages", "scripts", "families")
check_for_valid_code(code .. "-pro", "languages", "etymology languages")
end
local function check_script_name(script_name, is_script_category, has_script_category)
if not has_script_category then
local object = require("Module:scripts").getByCanonicalName(script_name)
if object then
if is_script_category then
output:insert("* " .. script_name .. " merupakan nama tulisan Wikikamus yang sah.")
else
make_suggestion(object:getCategoryName())
end
end
end
for code, data in pairs(require("Module:scripts/data")) do
local function check_other_names_of_script(list)
if list then
for _, name in ipairs(list) do
if type(name) == "table" then
check_other_names_of_script(name)
elseif script_name == name then
local object = require("Module:scripts").makeObject(code, data)
make_suggestion(object:getCategoryName())
end
end
end
end
check_other_names_of_script(data.otherNames)
check_other_names_of_script(data.varieties)
check_other_names_of_script(data.aliases)
end
end
local script_name = title:match "^[Tt]ulisan (.+)$"
check_script_name(script_name or title, script_name ~= nil, has_script_category)
return print_suggestions(output)
end
-- Suggests existing templates for an uncreated template page.
function export.template_suggestions(frame)
local title = get_title(frame).text
local output = require("Module:array")()
local function make_suggestion(title, suffix)
output:insert("'''[[:Templat:" .. title .. "]]'''" .. (suffix or ""))
end
local function check_for_page_with_prefix(prefix)
local suggestion = prefix .. title
local suggestion_title = make_title(10, suggestion)
if suggestion_title and suggestion_title.exists then
make_suggestion(suggestion)
return true
end
return false
end
if title:find(" ", 1, true) then
local with_hyphen = title:gsub(" ", "-")
local suggestion_title = make_title(10, with_hyphen)
if suggestion_title and suggestion_title.exists then
make_suggestion(with_hyphen)
end
end
local prefixes = frame.args.prefixes
if prefixes then
for prefix in gsplit(prefixes, ",", true, true) do
check_for_page_with_prefix(prefix)
end
end
local prefix, rest = title:match "^([^: ]+) *:(.+)$"
if prefix then
prefix = prefix:upper()
else
prefix, rest = "", title
end
if prefix == "" or prefix == "R" or prefix == "RQ" then
local templates = require("Module:MediaWiki message helper/R: and RQ: templates")
local rest_pattern = pattern_escape(rest)
:gsub("%l", function(letter)
return "[" .. letter:upper() .. letter:lower() .. "]"
end)
for suggestion in templates:gmatch("%f[^%z\n][Rr][Qq]? ?:[a-z-]*[-:]?"
.. rest_pattern
.. "[ :]?%d?%d?%d?%d?[/-]?%d?%d?%d?%d?%l?%f[%z\n]") do
local suggestion_title = make_title(10, suggestion)
if suggestion_title and suggestion_title.exists then
make_suggestion(suggestion)
end
end
end
return print_suggestions(output)
end
-- Suggests existing modules for an uncreated module page.
function export.module_suggestions(frame)
local title = get_title(frame).text
local output = require("Module:array")()
local function make_suggestion(title)
output:insert("'''[[:Module:" .. title .. "]]'''")
end
local pronunciation_suffixes = require("Module:array"){"IPA", "pr", "pron", "pronunc", "pronunciation"}
if pronunciation_suffixes:some(function(suffix)
return title:find("%-" .. suffix .. "$")
end) then
for _, suggestion in ipairs(pronunciation_suffixes:map(function(suffix)
return title:gsub("%-%w+$", "-" .. suffix)
end)) do
local suggestion_title = make_title(828, suggestion)
if suggestion_title and suggestion_title.exists then
make_suggestion(suggestion)
end
end
end
-- Look for the modules actually invoked by the template with the same name
-- as the module (accounting for "Module:Template:..." cases).
local template_title = new_title(title, 10)
if template_title then
local template_text = template_title:getContent()
if template_text then
for template in require("Module:template parser").find_templates(template_text) do
if template:get_name() == "#INVOKE:" then
local args = template:get_arguments()
if args[2] then -- args[2] is the function name, so #INVOKE: will throw an error if not present
make_suggestion(php_trim(args[1])) -- args[1] is the module name
end
end
end
end
end
return print_suggestions(output)
end
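-- For module pages, returns the match position of the title text against the
-- anchored pattern given in the first argument; returns nothing otherwise.
function export.is_data_module_not_documentation(frame)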
local title = get_title(frame)
if require("Module:pages").get_pagetype(title) == "module" then
return ufind(title.text, "^" .. (frame.args[1] or "") .. "$")
end
end
return export
py90hz418oltqr9ytx3oknlhjrub6c4
gelugak
0
68267
281324
241579
2026-04-22T00:33:54Z
PeaceSeekers
3334
281324
wikitext
text/x-wiki
==Bahasa Melayu Sarawak==
===Takrifan===
====Kata kerja====
{{inti|poz-sml|kata kerja}}
# [[selongkar]]
#: {{cp|poz-sml|'''Gelugak''' jak kabat ya, ngare gilak dalam ya.|'''Selongkar''' saja almari itu, bersepah sangat dalam itu.}}
===Sebutan===
* {{AFA|poz-sml|/gə.lu.gaʔ/}}
* {{rima|poz-sml|gaʔ}}
* {{penyempangan|poz-sml|ge|lu|gak}}
th75a44iild6j2qnxdrxurreaoijuvp
Modul:sem-arb-utilities
828
72855
281296
266012
2026-04-21T15:31:27Z
Hakimi97
2668
Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/87627369|87627369]]) (perlu semakan semula)
281296
Scribunto
text/plain
local export = {}
local m_str_utils = require("Module:string utilities")
local m_utilities = require("Module:utilities")
local m_links = require("Module:links")
local m_headword = require("Module:headword")
local m_langs = require("Module:languages")
local m_params = require("Module:parameters")
local m_parse_utils = require("Module:parse utilities")
local m_affix = require("Module:affix")
local m_sc_utils = require("Module:script utilities")
local pluralize = require("Module:en-utilities").pluralize
local lg_ar = m_langs.getByCode("ar")
local lg_sem_arb = lg_ar:getFamily():getCode()
local rsplit = m_str_utils.split
local rsubn = m_str_utils.gsub
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local separator_langs = { ["mt"] = true, ["acy"] = true }
local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" }
local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" }
local lang
local sc
local function ifelse(cond, yes, no)
if cond then
return yes
end
return no
end
local function ucfirst(str)
if str == nil then
return str
end
return mw.language.getContentLanguage():ucfirst(str)
end
local function link(term, alt, id)
if term == "" or term == "—" then
return term
else
return m_links.full_link({
term = term,
alt = alt,
lang = lang,
id = id,
})
end
end
local function parse_inlines(term)
return m_parse_utils.parse_inline_modifiers(
term,
{
param_mods = {tr = {}, t = {}, pos = {}},
generate_obj = function(term) return {term} end,
}
)
end
local function make_part(noninline, lang)
local keys = {"tr", "t", "pos"}
local inline
if type(noninline) == "string" then
inline = parse_inlines(noninline)
else
inline = parse_inlines(noninline[1])
end
local return_value = {
term = m_sc_utils.tag_text(inline[1], lang),
}
for i, key in ipairs(keys) do
if inline[key] and noninline[key] then
error(
key .. " specified twice: "
.. "<" .. key .. ":" .. inline[key] .. ">"
.. " and "
.. "|" .. key .. "=" .. noninline[key]
)
end
return_value[key] = inline[key] or noninline[key]
end
if not return_value.tr then
return_value.tr = lang:transliterate(inline[1])
end
return return_value
end
local function make_parts(lang, raw_parts)
local parts = {}
for i, part in ipairs(raw_parts) do
parts[#parts + 1] = make_part(part, lang)
end
return {
parts = parts,
lang = lang,
sc = lang:findBestScript(parts[1][1]),
}
end
local function show_affix(lang, raw_parts)
return m_affix.show_affix(
make_parts(lang, raw_parts),
{},
lang
)
end
local appendices = {
["active participle"] = {
-- participles have verbal force in (most?) vernaculars
function(args, lang)
return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs")
end,
derived = true,
},
["characteristic adjective"] = "nominals",
["color/defect adjective"] = {
"nominals",
fragment = "Color or defect adjectives",
},
["diminutive"] = "nominals",
["elative"] = "nominals",
["relative"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "suffix",
},
["id"] = {},
["id1"] = {
alias_of = "id",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"),
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine nisba"},
}
)
)
end,
},
["relative-a"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"),
},
["t"] = {
required = true,
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine ending"},
}
)
)
end,
},
["relative-linking"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
}
}
end,
title = function(args)
if args.pl then
return "relative adjectives (nisba)"
end
return "relative adjective (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
args.linking,
{args.suffix, pos = "nisba"},
}
)
)
end,
},
["relative-linking-noun"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
}
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{args, args.linking, args.suffix}
)
)
end,
},
["form"] = {
"verbs",
params = {
[1] = {
alias_of = "wazn",
},
["wazn"] = {
required = true,
},
},
fragment = function(args) return "Form_" .. args.wazn end,
title = function(args) return "form " .. args.wazn end,
},
["instance noun"] = "nominals",
["noun of place"] = "nominals",
["occupational noun"] = "nominals",
["passive participle"] = {
"nominals",
derived = true,
},
["reduplicated"] = {
glossary = "reduplication",
},
["singulative noun"] = "nominals",
["tool noun"] = "nominals",
["verbal noun"] = "nominals",
}
local radicals = {
["Arab"] = {
["ء"] = true,
["ب"] = true,
["ت"] = true,
["ث"] = true,
["ج"] = true,
["ح"] = true,
["خ"] = true,
["د"] = true,
["ذ"] = true,
["ر"] = true,
["ز"] = true,
["س"] = true,
["ش"] = true,
["ص"] = true,
["ض"] = true,
["ط"] = true,
["ظ"] = true,
["ع"] = true,
["غ"] = true,
["ف"] = true,
["ق"] = true,
["ك"] = true,
["ل"] = true,
["م"] = true,
["ن"] = true,
["ه"] = true,
["و"] = true,
["ي"] = true,
["گ"] = true,
["چ"] = true,
["پ"] = true,
["ڭ"] = true,
},
["Latn"] = {
["'"] = true,
["b"] = true,
["c"] = true,
["ċ"] = true,
["d"] = true,
["δ"] = true,
["f"] = true,
["ġ"] = true,
["g"] = true,
["għ"] = true,
["h"] = true,
["ħ"] = true,
["j"] = true,
["k"] = true,
["l"] = true,
["m"] = true,
["n"] = true,
["p"] = true,
["q"] = true,
["r"] = true,
["s"] = true,
["ş"] = true,
["t"] = true,
["v"] = true,
["w"] = true,
["x"] = true,
["y"] = true,
["ż"] = true,
["z"] = true,
["θ"] = true,
}
}
local function validateRoot(rootTable, joined_root)
if type(rootTable) ~= "table" then
error("rootTable is not a table", 2)
end
local len = #rootTable
if len < 3 then
error("Root must have at least three radicals.")
end
if sc == nil then
sc = lang:findBestScript(joined_root):getCode()
end
for i, radical in ipairs(rootTable) do
if not radicals[sc][radical] then
error("Unrecognized radical " .. radical .. " in " .. joined_root)
end
end
end
function export.root(frame)
local output = {}
local categories = {}
local title = mw.title.getCurrentTitle()
local namespace = title.nsText
local fulltitle = title.fullText
if frame.args["lang"] then
lang = require("Module:languages").getByCode(frame.args["lang"])
else
error("Please provide a language code.")
end
local subpage = "Appendix:" .. lang:getCanonicalName() .. " roots/"
local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "")
local params = {
[1] = { list = true },
["nocat"] = { type = "boolean" },
["plain"] = { type = "boolean" },
["notext"] = { type = "boolean" },
["sense"] = {}
}
local args = require("Module:parameters").process(frame:getParent().args, params)
local rootLetters = {}
local roots = args[1]
local plain = args["plain"]
if frame.args["plain"] then
plain = true
end
local langCode = lang:getCode()
local separator = " "
if separator_langs[langCode] then
separator = "-"
else
separator = " "
end
local roots_len = #roots
if #roots == 0 and namespace == "Template" then
if template_preview_per_langcode[langCode] ~= nil then
table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator))
else
table.insert(rootLetters, rsplit("ك ت ب", separator))
end
elseif #roots ~= 0 then
for _, root in ipairs(roots) do
table.insert(rootLetters, rsplit(root, separator))
end
else
table.insert(rootLetters, rsplit(fulltitle, separator))
end
local joined_roots = {}
for i, rootLetter in ipairs(rootLetters) do
table.insert(joined_roots, table.concat(rootLetter, separator))
validateRoot(rootLetter, joined_roots[i])
end
local sense = args["sense"]
local sense_formatted = ""
if sense ~= nil then
sense_formatted = " (" .. sense .. ") "
end
if fulltitle == joined_roots[1] then
if namespace == "" then
error("The root page should be in the Appendix namespace. Please move it to : [[" ..
subpage .. joined_roots[1] .. "]]")
end
if roots_len > 1 then
error("There should be only one root.")
end
table.insert(output,
m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true }))
if args["nocat"] then
return table.concat(output)
else
return table.concat(output) .. table.concat(categories)
end
else
local link_texts = {}
local term_counts = {}
for i, joined_root in ipairs(joined_roots) do
local link_text = subpage .. joined_root
table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense))
table.insert(
categories,
m_utilities.format_categories(
{ lang:getCanonicalName() .. " terms belonging to the root " .. joined_root .. sense_formatted },
lang)
)
table.insert(term_counts,
mw.site.stats.pagesInCategory(
lang:getCanonicalName() .. " terms belonging to the root " .. joined_root .. sense_formatted, "pages")
)
end
if args["nocat"] or plain then
if args["nocat"] then
return table.concat(link_texts, ", ")
else
return table.concat(link_texts, ", ") .. table.concat(categories)
end
else
local link_text_output = ""
for i, link_text in ipairs(link_texts) do
link_text_output = link_text_output ..
"\n|-\n| " .. link_text ..
"\n|-\n| [[:Category:" ..
lang:getCanonicalName() ..
" terms belonging to the root " ..
joined_roots[i] ..
sense_formatted .. "|" .. term_counts[i] .. " term" .. (term_counts[i] == 1 and "" or "s") .. "]]\n"
end
local color = "grey"
if color_langs[langCode] ~= nil then
color = color_langs[langCode]
end
local wikicode = mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-top',
args = {
title = "-",
palette = color,
class = "floatright tr-alongside"
}
}
wikicode = wikicode .. [=[
! [[w:Semitic root|Root]=] .. (#term_counts == 1 and "" or "s") .. [=[]]]=]
wikicode = wikicode .. link_text_output
wikicode = wikicode .. mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-bottom',
}
return wikicode .. table.concat(categories)
end
end
end
local function iffn(val, ...)
if type(val) == "function" then
-- Forward the varargs directly; the implicit `arg` table is a deprecated Lua 5.0 feature.
return val(...)
end
return val
end
function export.etym(frame)
local params = {
[1] = {
alias_of = "lang",
},
[2] = {
alias_of = "class"
},
["fragment"] = {},
["nocat"] = {
type = "boolean",
},
["lang"] = {
type = "language",
replaced_by = false,
required = true,
},
["class"] = {
required = true,
},
}
local args, extra = m_params.process(frame:getParent().args, params, true)
local fixed_indices = {}
for k, v in pairs(extra) do
if type(k) == "number" then
k = k - 2
end
fixed_indices[k] = v
end
extra = fixed_indices
if args.lang:getFamily():getCode() ~= lg_sem_arb then
error(
args.lang:getCode()
.. "'s family is "
.. args.lang:getFamily():getCode()
.. ", not "
.. lg_sem_arb
)
end
local lookup = appendices[mw.ustring.lower(args.class)]
local lookup_args = {}
if not lookup then
error("Unrecognized word type " .. mw.ustring.lower(args.class))
end
if lookup.glossary then
return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]"
end
if lookup.params then
lookup_args = m_params.process(extra, iffn(lookup.params, args.lang))
end
local appendix = nil
if lookup.glossary then
appendix = "Glosari"
else
local appendix_lang = args.lang:getCanonicalName()
-- The first entry may be a function(args, lang), so pass both arguments.
local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lookup_args, args.lang))
appendix = appendix_lang .. " " .. appendix_title
if not mw.title.new(appendix, "Lampiran").exists then
appendix_lang = lg_ar:getCanonicalName()
appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lookup_args, lg_ar))
appendix = appendix_lang .. " " .. appendix_title
end
end
local title = args.class
local desc = ""
local intro = ""
if lookup.derived then
intro = "diterbitkan daripada "
end
if type(lookup) ~= "string" then
title = iffn(lookup.title, lookup_args, args.lang) or ""
desc = iffn(lookup.desc, lookup_args, args.lang) or ""
if mw.ustring.match(args.class, "^%u.*") then
if intro == "" then
title = ucfirst(title) or ""
else
intro = ucfirst(intro) or ""
end
end
end
local fragment = (
iffn(lookup.fragment, lookup_args, args.lang)
or ucfirst(iffn(lookup.title, {pl=true}, args.lang))
or pluralize(ucfirst(args.class))
) or ""
return (
intro
.. "[[Lampiran:"
.. appendix
.. ifelse(fragment ~= "", "#" .. fragment, "")
.. "|"
.. title
.. "]] "
.. desc
)
end
return export
axix2jxl24a4mf6k5fxcdlyvjykpmhi
281298
281296
2026-04-21T15:36:39Z
Hakimi97
2668
281298
Scribunto
text/plain
local export = {}
local m_str_utils = require("Module:string utilities")
local m_utilities = require("Module:utilities")
local m_links = require("Module:links")
local m_headword = require("Module:headword")
local m_langs = require("Module:languages")
local m_params = require("Module:parameters")
local m_parse_utils = require("Module:parse utilities")
local m_affix = require("Module:affix")
local m_sc_utils = require("Module:script utilities")
local pluralize = require("Module:en-utilities").pluralize
local lg_ar = m_langs.getByCode("ar")
local lg_sem_arb = lg_ar:getFamily():getCode()
local rsplit = m_str_utils.split
local rsubn = m_str_utils.gsub
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local separator_langs = { ["mt"] = true, ["acy"] = true }
local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" }
local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" }
local lang
local sc
local function ifelse(cond, yes, no)
if cond then
return yes
end
return no
end
local function ucfirst(str)
if str == nil then
return str
end
return mw.language.getContentLanguage():ucfirst(str)
end
local function link(term, alt, id)
if term == "" or term == "—" then
return term
else
return m_links.full_link({
term = term,
alt = alt,
lang = lang,
id = id,
})
end
end
local function parse_inlines(term)
return m_parse_utils.parse_inline_modifiers(
term,
{
param_mods = {tr = {}, t = {}, pos = {}},
generate_obj = function(term) return {term} end,
}
)
end
local function make_part(noninline, lang)
local keys = {"tr", "t", "pos"}
local inline
if type(noninline) == "string" then
inline = parse_inlines(noninline)
else
inline = parse_inlines(noninline[1])
end
local return_value = {
term = m_sc_utils.tag_text(inline[1], lang),
}
for i, key in ipairs(keys) do
if inline[key] and noninline[key] then
error(
key .. " specified twice: "
.. "<" .. key .. ":" .. inline[key] .. ">"
.. " and "
.. "|" .. key .. "=" .. noninline[key]
)
end
return_value[key] = inline[key] or noninline[key]
end
if not return_value.tr then
return_value.tr = lang:transliterate(inline[1])
end
return return_value
end
local function make_parts(lang, raw_parts)
local parts = {}
for i, part in ipairs(raw_parts) do
parts[#parts + 1] = make_part(part, lang)
end
return {
parts = parts,
lang = lang,
sc = lang:findBestScript(parts[1][1]),
}
end
local function show_affix(lang, raw_parts)
return m_affix.show_affix(
make_parts(lang, raw_parts),
{},
lang
)
end
local appendices = {
["active participle"] = {
-- participles have verbal force in (most?) vernaculars
function(args, lang)
return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs")
end,
derived = true,
},
["characteristic adjective"] = "nominals",
["color/defect adjective"] = {
"nominals",
fragment = "Color or defect adjectives",
},
["diminutive"] = "nominals",
["elative"] = "nominals",
["relative"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "suffix",
},
["id"] = {},
["id1"] = {
alias_of = "id",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"),
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine nisba"},
}
)
)
end,
},
["relative-a"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"),
},
["t"] = {
required = true,
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine ending"},
}
)
)
end,
},
["relative-linking"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
}
}
end,
title = function(args)
if args.pl then
return "relative adjectives (nisba)"
end
return "relative adjective (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
args.linking,
{args.suffix, pos = "nisba"},
}
)
)
end,
},
["relative-linking-noun"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
}
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{args, args.linking, args.suffix}
)
)
end,
},
["form"] = {
"verbs",
params = {
[1] = {
alias_of = "wazn",
},
["wazn"] = {
required = true,
},
},
fragment = function(args) return "Form_" .. args.wazn end,
title = function(args) return "form " .. args.wazn end,
},
["instance noun"] = "nominals",
["noun of place"] = "nominals",
["occupational noun"] = "nominals",
["passive participle"] = {
"nominals",
derived = true,
},
["reduplicated"] = {
glossary = "reduplication",
},
["singulative noun"] = "nominals",
["tool noun"] = "nominals",
["verbal noun"] = "nominals",
}
local radicals = {
["Arab"] = {
["ء"] = true,
["ب"] = true,
["ت"] = true,
["ث"] = true,
["ج"] = true,
["ح"] = true,
["خ"] = true,
["د"] = true,
["ذ"] = true,
["ر"] = true,
["ز"] = true,
["س"] = true,
["ش"] = true,
["ص"] = true,
["ض"] = true,
["ط"] = true,
["ظ"] = true,
["ع"] = true,
["غ"] = true,
["ف"] = true,
["ق"] = true,
["ك"] = true,
["ل"] = true,
["م"] = true,
["ن"] = true,
["ه"] = true,
["و"] = true,
["ي"] = true,
["گ"] = true,
["چ"] = true,
["پ"] = true,
["ڭ"] = true,
},
["Latn"] = {
["'"] = true,
["b"] = true,
["c"] = true,
["ċ"] = true,
["d"] = true,
["δ"] = true,
["f"] = true,
["ġ"] = true,
["g"] = true,
["għ"] = true,
["h"] = true,
["ħ"] = true,
["j"] = true,
["k"] = true,
["l"] = true,
["m"] = true,
["n"] = true,
["p"] = true,
["q"] = true,
["r"] = true,
["s"] = true,
["ş"] = true,
["t"] = true,
["v"] = true,
["w"] = true,
["x"] = true,
["y"] = true,
["ż"] = true,
["z"] = true,
["θ"] = true,
}
}
local function validateRoot(rootTable, joined_root)
if type(rootTable) ~= "table" then
error("rootTable is not a table", 2)
end
local len = #rootTable
if len < 3 then
error("Root must have at least three radicals.")
end
if sc == nil then
sc = lang:findBestScript(joined_root):getCode()
end
for i, radical in ipairs(rootTable) do
if not radicals[sc][radical] then
error("Unrecognized radical " .. radical .. " in " .. joined_root)
end
end
end
function export.root(frame)
local output = {}
local categories = {}
local title = mw.title.getCurrentTitle()
local namespace = title.nsText
local fulltitle = title.fullText
if frame.args["lang"] then
lang = require("Module:languages").getByCode(frame.args["lang"])
else
error("Please provide a language code.")
end
local subpage = "Lampiran:Akar bahasa " .. lang:getCanonicalName() .. "/"
local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "")
local params = {
[1] = { list = true },
["nocat"] = { type = "boolean" },
["plain"] = { type = "boolean" },
["notext"] = { type = "boolean" },
["sense"] = {}
}
local args = require("Module:parameters").process(frame:getParent().args, params)
local rootLetters = {}
local roots = args[1]
local plain = args["plain"]
if frame.args["plain"] then
plain = true
end
local langCode = lang:getCode()
local separator = " "
if separator_langs[langCode] then
separator = "-"
else
separator = " "
end
local roots_len = #roots
if #roots == 0 and namespace == "Templat" then
if template_preview_per_langcode[langCode] ~= nil then
table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator))
else
table.insert(rootLetters, rsplit("ك ت ب", separator))
end
elseif #roots ~= 0 then
for _, root in ipairs(roots) do
table.insert(rootLetters, rsplit(root, separator))
end
else
table.insert(rootLetters, rsplit(fulltitle, separator))
end
local joined_roots = {}
for i, rootLetter in ipairs(rootLetters) do
table.insert(joined_roots, table.concat(rootLetter, separator))
validateRoot(rootLetter, joined_roots[i])
end
local sense = args["sense"]
local sense_formatted = ""
if sense ~= nil then
sense_formatted = " (" .. sense .. ") "
end
if fulltitle == joined_roots[1] then
if namespace == "" then
error("The root page should be in the Appendix namespace. Please move it to: [[" ..
subpage .. joined_roots[1] .. "]]")
end
if roots_len > 1 then
error("There should be only one root.")
end
table.insert(output,
m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true }))
if args["nocat"] then
return table.concat(output)
else
return table.concat(output) .. table.concat(categories)
end
else
local link_texts = {}
local term_counts = {}
for i, joined_root in ipairs(joined_roots) do
local link_text = subpage .. joined_root
table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense))
table.insert(
categories,
m_utilities.format_categories(
{ "Perkataan bahasa " .. lang:getCanonicalName() .. " milik akar " .. joined_root .. sense_formatted },
lang)
)
table.insert(term_counts,
mw.site.stats.pagesInCategory(
"Perkataan bahasa " .. lang:getCanonicalName() .. " milik akar " .. joined_root .. sense_formatted, "pages")
)
end
if args["nocat"] or plain then
if args["nocat"] then
return table.concat(link_texts, ", ")
else
return table.concat(link_texts, ", ") .. table.concat(categories)
end
else
local link_text_output = ""
for i, link_text in ipairs(link_texts) do
link_text_output = link_text_output ..
"\n|-\n| " .. link_text ..
"\n|-\n| [[:Kategori:Perkataan bahasa " ..
lang:getCanonicalName() ..
" milik akar " ..
joined_roots[i] ..
sense_formatted .. "|" .. term_counts[i] .. " perkataan" .. (term_counts[i] == 1 and "" or "") .. "]]\n"
end
local color = "grey"
if color_langs[langCode] ~= nil then
color = color_langs[langCode]
end
local wikicode = mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-top',
args = {
title = "-",
palette = color,
class = "floatright tr-alongside"
}
}
wikicode = wikicode .. [=[
! [[w:Akar bahasa-bahasa Samiah|Akar]=] .. (#term_counts == 1 and "" or "") .. [=[]]]=]
wikicode = wikicode .. link_text_output
wikicode = wikicode .. mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-bottom',
}
return wikicode .. table.concat(categories)
end
end
end
local function iffn(val, ...)
if type(val) == "function" then
return val(...)
end
return val
end
function export.etym(frame)
local params = {
[1] = {
alias_of = "lang",
},
[2] = {
alias_of = "class"
},
["fragment"] = {},
["nocat"] = {
type = "boolean",
},
["lang"] = {
type = "language",
replaced_by = false,
required = true,
},
["class"] = {
required = true,
},
}
local args, extra = m_params.process(frame:getParent().args, params, true)
local fixed_indices = {}
for k, v in pairs(extra) do
if type(k) == "number" then
k = k - 2
end
fixed_indices[k] = v
end
extra = fixed_indices
if args.lang:getFamily():getCode() ~= lg_sem_arb then
error(
args.lang:getCode()
.. "'s family is "
.. args.lang:getFamily():getCode()
.. ", not "
.. lg_sem_arb
)
end
local lookup = appendices[mw.ustring.lower(args.class)]
local lookup_args = {}
if not lookup then
error("Unrecognized word type " .. mw.ustring.lower(args.class))
end
if lookup.glossary then
return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]"
end
if lookup.params then
lookup_args = m_params.process(extra, iffn(lookup.params, args.lang))
end
local appendix = nil
if lookup.glossary then
appendix = "Glosari"
else
local appendix_lang = args.lang:getCanonicalName()
local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], args.lang))
appendix = appendix_lang .. " " .. appendix_title
if not mw.title.new(appendix, "Lampiran").exists then
appendix_lang = lg_ar:getCanonicalName()
appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lg_ar))
appendix = appendix_lang .. " " .. appendix_title
end
end
local title = args.class
local desc = ""
local intro = ""
if lookup.derived then
intro = "diterbitkan daripada "
end
if type(lookup) ~= "string" then
title = iffn(lookup.title, lookup_args, args.lang) or ""
desc = iffn(lookup.desc, lookup_args, args.lang) or ""
if mw.ustring.match(args.class, "^%u.*") then
if intro == "" then
title = ucfirst(title) or ""
else
intro = ucfirst(intro) or ""
end
end
end
local fragment = (
iffn(lookup.fragment, lookup_args, args.lang)
or ucfirst(iffn(lookup.title, {pl=true}, args.lang))
or pluralize(ucfirst(args.class))
) or ""
return (
intro
.. "[[Lampiran:"
.. appendix
.. ifelse(fragment, "#" .. fragment, "")
.. "|"
.. title
.. "]] "
.. desc
)
end
return export
1l5ppnwebn031b4xrzw47woeuiwavib
281299
281298
2026-04-21T15:37:52Z
Hakimi97
2668
281299
Scribunto
text/plain
local export = {}
local m_str_utils = require("Module:string utilities")
local m_utilities = require("Module:utilities")
local m_links = require("Module:links")
local m_headword = require("Module:headword")
local m_langs = require("Module:languages")
local m_params = require("Module:parameters")
local m_parse_utils = require("Module:parse utilities")
local m_affix = require("Module:affix")
local m_sc_utils = require("Module:script utilities")
local pluralize = require("Module:en-utilities").pluralize
local lg_ar = m_langs.getByCode("ar")
local lg_sem_arb = lg_ar:getFamily():getCode()
local rsplit = m_str_utils.split
local rsubn = m_str_utils.gsub
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local separator_langs = { ["mt"] = true, ["acy"] = true }
local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" }
local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" }
local lang
local sc
local function ifelse(cond, yes, no)
if cond then
return yes
end
return no
end
local function ucfirst(str)
if str == nil then
return str
end
return mw.language.getContentLanguage():ucfirst(str)
end
local function link(term, alt, id)
if term == "" or term == "—" then
return term
else
return m_links.full_link({
term = term,
alt = alt,
lang = lang,
id = id,
})
end
end
local function parse_inlines(term)
return m_parse_utils.parse_inline_modifiers(
term,
{
param_mods = {tr = {}, t = {}, pos = {}},
generate_obj = function(term) return {term} end,
}
)
end
local function make_part(noninline, lang)
local keys = {"tr", "t", "pos"}
local inline
if type(noninline) == "string" then
inline = parse_inlines(noninline)
else
inline = parse_inlines(noninline[1])
end
local return_value = {
term = m_sc_utils.tag_text(inline[1], lang),
}
for i, key in ipairs(keys) do
if inline[key] and noninline[key] then
error(
key .. " specified twice: "
.. "<" .. key .. ":" .. inline[key] .. ">"
.. " and "
.. "|" .. key .. "=" .. noninline[key]
)
end
return_value[key] = inline[key] or noninline[key]
end
if not return_value.tr then
return_value.tr = lang:transliterate(inline[1])
end
return return_value
end
local function make_parts(lang, raw_parts)
local parts = {}
for i, part in ipairs(raw_parts) do
parts[#parts + 1] = make_part(part, lang)
end
return {
parts = parts,
lang = lang,
sc = lang:findBestScript(parts[1][1]),
}
end
local function show_affix(lang, raw_parts)
return m_affix.show_affix(
make_parts(lang, raw_parts),
{},
lang
)
end
local appendices = {
["active participle"] = {
-- participles have verbal force in (most?) vernaculars
function(args, lang)
return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs")
end,
derived = true,
},
["characteristic adjective"] = "nominals",
["color/defect adjective"] = {
"nominals",
fragment = "Color or defect adjectives",
},
["diminutive"] = "nominals",
["elative"] = "nominals",
["relative"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "suffix",
},
["id"] = {},
["id1"] = {
alias_of = "id",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"),
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine nisba"},
}
)
)
end,
},
["relative-a"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"),
},
["t"] = {
required = true,
},
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
{args.suffix, pos="feminine ending"},
}
)
)
end,
},
["relative-linking"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
}
}
end,
title = function(args)
if args.pl then
return "relative adjectives (nisba)"
end
return "relative adjective (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{
args,
args.linking,
{args.suffix, pos = "nisba"},
}
)
)
end,
},
["relative-linking-noun"] = {
"nominals",
params = function(lang)
return {
[1] = {
required = true,
},
[2] = {
alias_of = "t",
},
[3] = {
alias_of = "linking",
},
["tr"] = {},
["pl"] = {
type = "boolean",
},
["t"] = {
required = true,
},
["linking"] = {
required = true,
},
["suffix"] = {
default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"),
}
}
end,
title = function(args)
if args.pl then
return "relative nouns (nisba)"
end
return "relative noun (nisba)"
end,
desc = function(args, lang)
if not args[1] then
return ""
end
return (
"composed from "
.. show_affix(
lang,
{args, args.linking, args.suffix}
)
)
end,
},
["form"] = {
"verbs",
params = {
[1] = {
alias_of = "wazn",
},
["wazn"] = {
required = true,
},
},
fragment = function(args) return "Form_" .. args.wazn end,
title = function(args) return "form " .. args.wazn end,
},
["instance noun"] = "nominals",
["noun of place"] = "nominals",
["occupational noun"] = "nominals",
["passive participle"] = {
"nominals",
derived = true,
},
["reduplicated"] = {
glossary = "reduplication",
},
["singulative noun"] = "nominals",
["tool noun"] = "nominals",
["verbal noun"] = "nominals",
}
local radicals = {
["Arab"] = {
["ء"] = true,
["ب"] = true,
["ت"] = true,
["ث"] = true,
["ج"] = true,
["ح"] = true,
["خ"] = true,
["د"] = true,
["ذ"] = true,
["ر"] = true,
["ز"] = true,
["س"] = true,
["ش"] = true,
["ص"] = true,
["ض"] = true,
["ط"] = true,
["ظ"] = true,
["ع"] = true,
["غ"] = true,
["ف"] = true,
["ق"] = true,
["ك"] = true,
["ل"] = true,
["م"] = true,
["ن"] = true,
["ه"] = true,
["و"] = true,
["ي"] = true,
["گ"] = true,
["چ"] = true,
["پ"] = true,
["ڭ"] = true,
},
["Latn"] = {
["'"] = true,
["b"] = true,
["c"] = true,
["ċ"] = true,
["d"] = true,
["δ"] = true,
["f"] = true,
["ġ"] = true,
["g"] = true,
["għ"] = true,
["h"] = true,
["ħ"] = true,
["j"] = true,
["k"] = true,
["l"] = true,
["m"] = true,
["n"] = true,
["p"] = true,
["q"] = true,
["r"] = true,
["s"] = true,
["ş"] = true,
["t"] = true,
["v"] = true,
["w"] = true,
["x"] = true,
["y"] = true,
["ż"] = true,
["z"] = true,
["θ"] = true,
}
}
local function validateRoot(rootTable, joined_root)
if type(rootTable) ~= "table" then
error("rootTable is not a table", 2)
end
local len = #rootTable
if len < 3 then
error("Root must have at least three radicals.")
end
if sc == nil then
sc = lang:findBestScript(joined_root):getCode()
end
for i, radical in ipairs(rootTable) do
if not radicals[sc][radical] then
error("Unrecognized radical " .. radical .. " in " .. joined_root)
end
end
end
function export.root(frame)
local output = {}
local categories = {}
local title = mw.title.getCurrentTitle()
local namespace = title.nsText
local fulltitle = title.fullText
if frame.args["lang"] then
lang = require("Module:languages").getByCode(frame.args["lang"])
else
error("Please provide a language code.")
end
local subpage = "Lampiran:Akar bahasa " .. lang:getCanonicalName() .. "/"
local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "")
local params = {
[1] = { list = true },
["nocat"] = { type = "boolean" },
["plain"] = { type = "boolean" },
["notext"] = { type = "boolean" },
["sense"] = {}
}
local args = require("Module:parameters").process(frame:getParent().args, params)
local rootLetters = {}
local roots = args[1]
local plain = args["plain"]
if frame.args["plain"] then
plain = true
end
local langCode = lang:getCode()
local separator = " "
if separator_langs[langCode] then
separator = "-"
else
separator = " "
end
local roots_len = #roots
if #roots == 0 and namespace == "Templat" then
if template_preview_per_langcode[langCode] ~= nil then
table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator))
else
table.insert(rootLetters, rsplit("ك ت ب", separator))
end
elseif #roots ~= 0 then
for _, root in ipairs(roots) do
table.insert(rootLetters, rsplit(root, separator))
end
else
table.insert(rootLetters, rsplit(fulltitle, separator))
end
local joined_roots = {}
for i, rootLetter in ipairs(rootLetters) do
table.insert(joined_roots, table.concat(rootLetter, separator))
validateRoot(rootLetter, joined_roots[i])
end
local sense = args["sense"]
local sense_formatted = ""
if sense ~= nil then
sense_formatted = " (" .. sense .. ") "
end
if fulltitle == joined_roots[1] then
if namespace == "" then
error("The root page should be in the Appendix namespace. Please move it to: [[" ..
subpage .. joined_roots[1] .. "]]")
end
if roots_len > 1 then
error("There should be only one root.")
end
table.insert(output,
m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true }))
if args["nocat"] then
return table.concat(output)
else
return table.concat(output) .. table.concat(categories)
end
else
local link_texts = {}
local term_counts = {}
for i, joined_root in ipairs(joined_roots) do
local link_text = subpage .. joined_root
table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense))
table.insert(
categories,
m_utilities.format_categories(
{ "Perkataan bahasa " .. lang:getCanonicalName() .. " dengan akar " .. joined_root .. sense_formatted },
lang)
)
table.insert(term_counts,
mw.site.stats.pagesInCategory(
"Perkataan bahasa " .. lang:getCanonicalName() .. " dengan akar " .. joined_root .. sense_formatted, "pages")
)
end
if args["nocat"] or plain then
if args["nocat"] then
return table.concat(link_texts, ", ")
else
return table.concat(link_texts, ", ") .. table.concat(categories)
end
else
local link_text_output = ""
for i, link_text in ipairs(link_texts) do
link_text_output = link_text_output ..
"\n|-\n| " .. link_text ..
"\n|-\n| [[:Kategori:Perkataan bahasa " ..
lang:getCanonicalName() ..
" dengan akar " ..
joined_roots[i] ..
sense_formatted .. "|" .. term_counts[i] .. " perkataan" .. (term_counts[i] == 1 and "" or "") .. "]]\n"
end
local color = "grey"
if color_langs[langCode] ~= nil then
color = color_langs[langCode]
end
local wikicode = mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-top',
args = {
title = "-",
palette = color,
class = "floatright tr-alongside"
}
}
wikicode = wikicode .. [=[
! [[w:Akar bahasa-bahasa Samiah|Akar]=] .. (#term_counts == 1 and "" or "") .. [=[]]]=]
wikicode = wikicode .. link_text_output
wikicode = wikicode .. mw.getCurrentFrame():expandTemplate {
title = 'inflection-table-bottom',
}
return wikicode .. table.concat(categories)
end
end
end
local function iffn(val, ...)
if type(val) == "function" then
return val(...)
end
return val
end
function export.etym(frame)
local params = {
[1] = {
alias_of = "lang",
},
[2] = {
alias_of = "class"
},
["fragment"] = {},
["nocat"] = {
type = "boolean",
},
["lang"] = {
type = "language",
replaced_by = false,
required = true,
},
["class"] = {
required = true,
},
}
local args, extra = m_params.process(frame:getParent().args, params, true)
local fixed_indices = {}
for k, v in pairs(extra) do
if type(k) == "number" then
k = k - 2
end
fixed_indices[k] = v
end
extra = fixed_indices
if args.lang:getFamily():getCode() ~= lg_sem_arb then
error(
args.lang:getCode()
.. "'s family is "
.. args.lang:getFamily():getCode()
.. ", not "
.. lg_sem_arb
)
end
local lookup = appendices[mw.ustring.lower(args.class)]
local lookup_args = {}
if not lookup then
error("Unrecognized word type " .. mw.ustring.lower(args.class))
end
if lookup.glossary then
return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]"
end
if lookup.params then
lookup_args = m_params.process(extra, iffn(lookup.params, args.lang))
end
local appendix = nil
if lookup.glossary then
appendix = "Glosari"
else
local appendix_lang = args.lang:getCanonicalName()
local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], args.lang))
appendix = appendix_lang .. " " .. appendix_title
if not mw.title.new(appendix, "Lampiran").exists then
appendix_lang = lg_ar:getCanonicalName()
appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lg_ar))
appendix = appendix_lang .. " " .. appendix_title
end
end
local title = args.class
local desc = ""
local intro = ""
if lookup.derived then
intro = "diterbitkan daripada "
end
if type(lookup) ~= "string" then
title = iffn(lookup.title, lookup_args, args.lang) or ""
desc = iffn(lookup.desc, lookup_args, args.lang) or ""
if mw.ustring.match(args.class, "^%u.*") then
if intro == "" then
title = ucfirst(title) or ""
else
intro = ucfirst(intro) or ""
end
end
end
local fragment = (
iffn(lookup.fragment, lookup_args, args.lang)
or ucfirst(iffn(lookup.title, {pl=true}, args.lang))
or pluralize(ucfirst(args.class))
) or ""
return (
intro
.. "[[Lampiran:"
.. appendix
.. ifelse(fragment, "#" .. fragment, "")
.. "|"
.. title
.. "]] "
.. desc
)
end
return export
ofaljfp3gndnd0wqj1h0e8dt16klnoq
Modul:place/locations
828
76177
281433
264770
2026-04-22T09:56:11Z
PeaceSeekers
3334
281433
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true to force category generation even on non-mainspace pages
local m_table = require("Module:table")
local string_utilities_module = "Module:string utilities"
local en_utilities_module = "Module:en-utilities"
local insert = table.insert
local concat = table.concat
local dump = mw.dumpObject
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
--[==[ intro:
This module contains data on all known locations, along with some lower-level code to process them (higher-level
known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using
mw.loadData().
===Location data===
'''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]],
especially the section `More about known locations`.'''
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations
and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are
states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table''
that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and
defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data
table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given
location is generally described by three values: (a) the group metadata table for the group the location is part of; (b)
the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all
locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location
and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()`
function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the
arguments to many functions.
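The relationship between the data table, the metadata table and the global list can be pictured with a minimal sketch. Everything named here (`example_states`, `example_states_group`, `example_locations`, `collect_locations`) is hypothetical and chosen only for illustration; the real tables in this module carry many more properties, and the placetype value is illustrative.

```lua
-- Hypothetical data table for one group; keys are canonical location keys.
local example_states = {
	["Georgia, USA"] = {},
	["New York, USA"] = {},
}

-- Hypothetical group metadata table; its `data` field points at the data table.
local example_states_group = {
	default_placetype = "state",
	data = example_states,
}

-- Stand-in for the global `locations` list of group metadata tables.
local example_locations = { example_states_group }

-- Collect (group, key, spec) triples, using the conventional names and order.
local function collect_locations(locations)
	local found = {}
	for _, group in ipairs(locations) do
		for key, spec in pairs(group.data) do
			found[#found + 1] = { group = group, key = key, spec = spec }
		end
	end
	-- pairs() has no defined order, so sort by key for stable output.
	table.sort(found, function(a, b) return a.key < b.key end)
	return found
end
```

Note that the specs in this sketch are uninitialized; in the real module they would first be passed through `initialize_spec()`.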
In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must
be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases
for a given location and the alias keys only need to be unique within a particular group data table, not across all
groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another
group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations,
canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in
New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias
key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for
canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the
location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys
are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have
per-group defaults, but only global defaults.
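As a rough sketch of how canonical and alias keys can coexist, the following resolves a key within a single group data table. The `alias_of` field name and the `resolve_key` helper are assumptions made for this example and are not taken from the module's actual code.

```lua
-- Hypothetical group data tables mirroring the examples in the text above.
local english_cities = {
	["Birmingham"] = {},
	["Newcastle upon Tyne"] = {},
	["Newcastle"] = { alias_of = "Newcastle upon Tyne" }, -- alias spec (assumed shape)
}
local us_cities = {
	["Birmingham, Alabama"] = {},
	["Birmingham"] = { alias_of = "Birmingham, Alabama" }, -- alias spec (assumed shape)
}

-- Return the canonical key and its location spec for `key` in `data`,
-- or nil if the key is not present in this group.
local function resolve_key(data, key)
	local spec = data[key]
	if spec == nil then
		return nil
	end
	if spec.alias_of then
		return spec.alias_of, data[spec.alias_of]
	end
	return key, spec
end
```

The same string `Birmingham` resolves differently depending on the group searched, which is why alias keys only need to be unique within a group while canonical keys must be globally unique.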
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it
must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare
category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding
bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys:
* Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories)
and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified
placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and
placenames, which is critical to understand when working with location data.) This also applies to constituent
countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such
as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena,
Ascension and Tristan da Cunha).
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative
divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or
ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if
different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above.
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and
Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`,
`Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name
in Spain, even though none of those Spanish cities is large enough to be included as a known location in this module. (The
cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
* Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent
territories, use a qualified key that contains the name of the country or constituent country in it, e.g.
`Normandy, France` (a region), `Calvados, France` (a department in the region of Normandy), `Herefordshire, England`
(a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, Finland` (a region),
`Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, Ireland` (a county) and
`New York, USA` (a state). As shown in these various examples, (a) first- and second-level divisions are sometimes both
included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent
country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States
or United States of America); (c) the word `the` is not normally included in the key even if the location is normally
preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this),
except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates
an apparent redundancy, as with `Central Finland, Finland`; and (e) sometimes the placetype is included in the key, as
with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several
other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on
per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in
Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "negeri", "Oblast" etc. in their name
because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and
counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County"
preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article
naming scheme for a given administrative division is a strong clue as to how the division is normally referred to,
and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and
Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. North and South
Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects
containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''.
Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that
differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an
uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This
copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table
into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a
given location property. (The initialization process also does more transformations in a few cases, noted below.) Note
that the default value of a given property is stored under a key in the group metadata table that is preceded by the
string `default_`; for example, the default value corresponding to the `placetype` property of a given location is
specified in the `default_placetype` key in the group metadata table.
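The defaulting step described above can be sketched as follows. This is a simplified illustration, not the module's `initialize_spec()`: the function name `sketch_initialize_spec` and the `initialized` marker field are assumptions, and the additional transformations mentioned in the text (such as container canonicalization) are omitted.

```lua
-- Copy every group-level `default_<prop>` value into the spec unless the spec
-- already sets `<prop>` itself. Calling it again on the same spec is a no-op.
local function sketch_initialize_spec(group, spec)
	if spec.initialized then -- hypothetical marker making the call idempotent
		return spec
	end
	for group_key, value in pairs(group) do
		if type(group_key) == "string" then
			local prop = group_key:match("^default_(.+)$")
			if prop and spec[prop] == nil then
				spec[prop] = value
			end
		end
	end
	spec.initialized = true
	return spec
end

-- Usage: one spec overrides the placetype, the other inherits it.
local group = {
	default_placetype = "province",
	default_divs = { "municipalities" },
	data = {},
}
local overriding = sketch_initialize_spec(group, { placetype = "special municipality" })
local inheriting = sketch_initialize_spec(group, {})
```

After initialization, `overriding.placetype` keeps its own value while `inheriting.placetype` takes the group default, so later code no longer needs to consult the group for these properties.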
The following are the properties of the location spec.
* `placetype`: String specifying the placetype of the location (e.g. "negara", "negeri", "province"). This can also be a
table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but
the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any
of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the
group level, or an error occurs.
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the
immediate ''container'' (or containers) of the given location. A container is another location which this location is
considered to be directly part of, either politically or (above the country level) geographically. Some locations
belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and
Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]])
of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed
the ''container trail'', and some functions compute and return this trail as part of their operation. When a location
spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a
list of canonicalized container structures, each of which is of the form
`{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location
key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if
there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the
placetype from the container structure.) The list of canonicalized container structures is stored into the
`.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec
form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The
canonicalization process is described in more detail below under [[#Container spec canonicalization]].
* `divs`: List of recognized political divisions; e.g. for the Netherlands, a specification of the form
`divs = {"provinces", "municipalities"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]]
and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be
found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as
just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to
all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the
same format as `divs`. This is intended to be used in the situation where some division types are shared among all
locations in the group and others differ from location to location. An example where this is used is the United
States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have
census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties`
and `county seats` are specified in the group-level `default_divs` because not all states have counties and county
seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have
additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have
municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property
associated with the division type), any division type specified on a sub-country-level location must also be specified
on all containers up through the country. For example, since French departments specify `communes` and
`municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for
France itself.
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category
pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is
normal for locations) that computes the location description can also be given. This is used, for example, for
Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the
keydesc is replaced with the default value of the location description, which specifies the location's placename,
placetype, and the corresponding values for each container in the container trail, generally up through (but not
beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct
the full description of various categories, such as bare location categories, whose description generally reads
`"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the
specified or auto-constructed location description.
* `fulldesc`: String overriding the full description for the bare location category (but not for any other category).
This is currently used only for the location `Earth`, at the very top of the tree (because the standard
`people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent
inhabitants). FIXME: This should be renamed `bare_category_fulldesc`.
* `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories
generated based on the immediate container(s). For example, `Hawaii, USA` specifies `Polynesia` as an additional
parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category)
as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an
additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on
the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the
bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME:
This should be renamed `bare_category_addl_parents`.
* `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent
to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how
to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the
elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the
default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named
e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is
Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase
`province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have
to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full
location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category
pages, are shown in the upper right of bare category pages.
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles
and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`.
It rarely needs to be specified because the category page and the article page almost always follow the same format.
* `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the
Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and
`wpcat` and defaults to `wpcat`, which is usually (but not always) correct.
* `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in
category names such as [[:Category:Cities in the Northern Territory, Australia]] and in old-style place descriptions
when the location occurs as the first holonym, such as the city [[Darwin]] described using
{{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean
properties is {nil}, which amounts to the same as {false}.
* `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only
affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as
[[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set
only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail
for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. The
general principle used in setting this is that all countries in Europe, all dependent territories of any such country,
all former British colonies, and any dependent territories of these former colonies, are assumed to use British
spelling, while all other countries and associated dependent territories are assumed to use American spelling. This
can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for
city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire,
Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and
(through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever
the group-level `default_placetype == "city"`, so that all cities get it set without explicitly needing to add a
group-level setting for this. Note that the condition `default_placetype == "city"` intentionally excludes Chinese
prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods,
but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to
categories like [[:Category:Rivers in Osaka Prefecture, Japan]] and [[:Category:Cities in Wuhan]] for holonyms that
are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like
[[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities;
(c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location.
(Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those
that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such
places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more
generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!
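As an illustration of the `wp` formatting directives described above, the following is a minimal sketch of how such a
spec could be expanded into an article name. The helper `expand_wp` and its parameter names are hypothetical and are not
part of this module; the actual expansion code may differ.

```lua
-- Hypothetical helper (not part of this module): expand a `wp`-style spec.
-- `wp` is either `true` or a string containing the directives %l, %e and %c.
local function expand_wp(wp, full_placename, elliptical_placename, container_placename)
	if wp == true then
		wp = "%l"
	end
	local values = {l = full_placename, e = elliptical_placename, c = container_placename}
	-- Each directive letter is looked up in `values`; the parentheses drop gsub's second return value.
	return (wp:gsub("%%([lec])", values))
end

print(expand_wp("%l, %c", "Austin", "Austin", "Texas")) --> Austin, Texas
print(expand_wp("%e province", "Phuket Province", "Phuket", "Thailand")) --> Phuket province
```

Since `wpcat` defaults to `wp` and `commonscat` defaults to `wpcat`, the same kind of expansion would apply to all three
properties.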
====Location divisions====
The `divs` field of a location describes the recognized political division types of that location. Specifying a given
division type causes any place that is defined as being of that division type, and that has the location as a holonym,
to be categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United
States has `"negeri"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under
[[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for
"generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a
`generic_before_cities` field if the location is a city); this includes things like cities, towns, villages,
neighbo(u)rhoods and rivers. A given element in the `divs` list is usually a string naming a plural placetype; the
placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular
plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field
(if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which
gets most things right but has problems with `passes` and `fortresses`, which are singularized to `passe` and
`fortresse`; for this reason, an explicit plural entry is added to placetypes ending in ''-ss''). In place of a string, an object
can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with
the placetype. An example of this is the `divs` list for Canada:
{
["Canada"] = {divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
"counties", "districts", "municipalities", "regional municipalities",
"rural municipalities", "parishes",
"Indian reserves",
"census divisions",
{type = "townships", prep = "di"},
}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a
single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and
territories. Similar things are done for other countries that have more than one type of first-level administrative
division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the
table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and
territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for
use in categories and won't be recognized as the placetype field in a {{tl|place}} description). In addition, townships
are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be
[[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat
related to whether a given placetype is an official administrative or statistical division of the location in question
and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be
used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities
and towns.)
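The effect of `cat_as` and `prep` on the category name can be sketched as follows. This is only an illustration under
simplifying assumptions: the helper names `normalize_div` and `category_name` are hypothetical, `cat_as` is assumed to
be a single string, and the handling of `the` before the location is ignored. The sketch uses the prepositions `of` and
`in` named in the explanation above.

```lua
-- Hypothetical helper: turn a `divs` element (string or table) into a table with explicit fields.
local function normalize_div(entry)
	if type(entry) == "string" then
		entry = {type = entry}
	end
	return {
		type = entry.type,
		cat_as = entry.cat_as or entry.type, -- categorize under the placetype itself by default
		prep = entry.prep or "of", -- `of` is the default preposition
	}
end

-- Hypothetical helper: build the category name for a division entry and a location.
local function category_name(entry, location)
	local div = normalize_div(entry)
	local label = div.cat_as:gsub("^%l", string.upper) -- capitalize the first letter
	return label .. " " .. div.prep .. " " .. location
end

print(category_name({type = "territories", cat_as = "provinces and territories"}, "Canada"))
print(category_name({type = "townships", prep = "in"}, "Canada"))
print(category_name("counties", "Canada"))
```

Under these assumptions the three calls print `Provinces and territories of Canada`, `Townships in Canada` and
`Counties of Canada`.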
Another more complex example is the divisions given for Quebec:
{
["Quebec, Canada"] = {divs = {
"counties",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
{type = "regions", container_parent_type = false},
{type = "townships", prep = "di"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}},
}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the
entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as
its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which
exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one
subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the
`container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be
[[:Category:Regional municipalities of Canada]]. Likewise, `regions` as administrative divisions (as opposed to mere
geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent
using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and
`village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize
`parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties,
just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "counties"}`
means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of
Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level
parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly,
`township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not''
[[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].
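The interaction between a list-valued `cat_as` and `container_parent_type` can be sketched as follows. This is a
simplified illustration, not the module's implementation: the helpers `ucfirst`, `normalize` and `categories_for` are
hypothetical, the country is passed in explicitly rather than derived from the container trail, and the container-level
parent is assumed to use the same preposition as the category itself.

```lua
local function ucfirst(s)
	return (s:gsub("^%l", string.upper))
end

-- Hypothetical helper: make `type`, `prep` and `container_parent_type` explicit.
local function normalize(entry)
	if type(entry) == "string" then
		entry = {type = entry}
	end
	local parent_type = entry.container_parent_type
	if parent_type == nil then
		parent_type = entry.type -- default: same division type at the container level
	end
	return {type = entry.type, prep = entry.prep or "of", container_parent_type = parent_type}
end

-- Hypothetical helper: list the categories (and container-level parents) for one `divs` element.
local function categories_for(entry, location, country)
	local targets = type(entry) == "table" and entry.cat_as or nil
	if targets == nil then
		targets = {entry}
	elseif type(targets) == "string" or targets.type then
		targets = {targets}
	end
	local result = {}
	for _, target in ipairs(targets) do
		local div = normalize(target)
		table.insert(result, {
			category = ucfirst(div.type) .. " " .. div.prep .. " " .. location,
			-- `container_parent_type = false` disables the container-level parent.
			container_parent = div.container_parent_type and
				ucfirst(div.container_parent_type) .. " " .. div.prep .. " " .. country or nil,
		})
	end
	return result
end

local cats = categories_for(
	{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
	"Quebec, Canada", "Canada")
for _, cat in ipairs(cats) do
	print(cat.category, cat.container_parent)
end
```

With the parish municipality entry shown, the sketch yields `Parishes of Quebec, Canada` with container-level parent
`Counties of Canada`, and `Municipalities of Quebec, Canada` with container-level parent `Municipalities of Canada`.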
====Container spec canonicalization====
A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'',
each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a
higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are
contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The
`placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of
initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and
removes the spec from `.container`. It works as follows:
# If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place.
For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies
`default_container = "Brazil"`.
# A single string or canonicalized container object is allowed and made into a one-element list.
# If a list element is a string that did ''not'' come from `default_container`, and there is a group-level
`canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get
a canonicalized container object.
# Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to
`"negara"`.
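The steps above can be sketched as a standalone function. This mirrors the description rather than replacing the real
code in `initialize_spec()`; the function name `canonicalize_containers` and the example group tables are hypothetical.

```lua
-- Standalone sketch of the canonicalization steps (hypothetical function name).
local function canonicalize_containers(container, group)
	local from_default = false
	if container == nil then
		-- Step 1: fall back to the group-level default.
		container = group.default_container
		from_default = true
	end
	if container == nil then
		return nil
	end
	if type(container) == "string" or container.key then
		-- Step 2: wrap a single string or container object in a list.
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				-- Step 3: let the group convert the string.
				cont = group.canonicalize_key_container(cont)
			else
				-- Step 4: remaining strings are treated as countries.
				cont = {key = cont, placetype = "negara"}
			end
		end
		table.insert(containers, cont)
	end
	return containers
end

-- A group like `brazil_states`: no per-location container, only a default.
local brazil = canonicalize_containers(nil, {default_container = "Brazil"})
print(brazil[1].key, brazil[1].placetype)

-- A hypothetical group whose canonicalization function appends the country to the key.
local group = {
	canonicalize_key_container = function(name)
		return {key = name .. ", France", placetype = "region"}
	end,
}
local region = canonicalize_containers("Occitania", group)
print(region[1].key, region[1].placetype)
```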
====Alias keys====
Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec
structure from canonical keys. This structure does not, in general, have defaults at the group level and is not
initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location
spec:
* `alias_of`: The canonical key of which this key is an alias. Required.
* `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the`
but does not pay attention to the value of `the` for the corresponding canonical key.
* `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be
converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise,
the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename
of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display
canonicalizing.
* `placetype`: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype,
and if that is unspecified, to the group-level default placetype.
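As an illustration, the following sketch shows a group data table with one canonical key and two aliases, and how the
`display` property changes which key is used for display. The place names are invented, and the helpers `category_key`
and `display_key` are hypothetical simplifications of the alias handling in this module.

```lua
-- Invented example data: one canonical key and two aliases of it.
local data = {
	["Examplia"] = {placetype = "negara"},
	["Old Examplia"] = {alias_of = "Examplia", display = true}, -- display alias
	["Examplestan"] = {alias_of = "Examplia"}, -- categorization-only alias
}

-- Key used for categorization: aliases always resolve to the canonical key.
local function category_key(key)
	local spec = data[key]
	return spec and spec.alias_of or key
end

-- Key whose placename is shown when the holonym is formatted for display.
local function display_key(key)
	local spec = data[key]
	if not spec or not spec.alias_of or not spec.display then
		return key
	elseif spec.display == true then
		return spec.alias_of
	else
		return spec.display
	end
end

print(category_key("Examplestan"), display_key("Examplestan"))
print(category_key("Old Examplia"), display_key("Old Examplia"))
```

Here `Examplestan` is categorized as `Examplia` but still displayed as `Examplestan`, whereas `Old Examplia` is both
categorized and displayed as `Examplia`.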
====Location group metadata tables====
As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The
metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but
preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only
keys, which are mostly functions. The following are the possible group-only keys:
* `data`: This points to the group data table for the group, as described above.
* `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias)
into the full and elliptical placenames. The difference between full and elliptical placenames is described in the
documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g.
`Phuket Province, Thailand` or `County Mayo, Ireland`), in which case the full placename includes the placetype and
the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or
`Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the
elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is
`Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename
distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there
is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as
`State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs.
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key,
and returns two values, the full and elliptical placenames, respectively. If left undefined, the default is to
chop off anything starting with a comma and return the result as both full and elliptical placename, and if
specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be
defined, it is best to use the helper function `make_key_to_placename`, if possible (or
`make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than
rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default
implementation and such) rather than directly calling the function in the `key_to_placename` field.
* `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be
either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this
(generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or
`make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to
`key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly
invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged
as the key. Otherwise, the default algorithm works as follows:
*# If the group-level `default_placetype == "city"`, use the placename unchanged as the key.
*# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma +
space and use the result as the key.
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and
`placetype` fields), and the `placetype` field is either `negara` or `constituent country`, append the `key` field
to the placename after a comma + space and use the result as the key.
*# Otherwise, use the placename unchanged as the key.
* `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string,
to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to
construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own.
* `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived
from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the
location. See [[#Location divisions]] for more details.
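To make the shape of a metadata table concrete, here is a hypothetical sketch for a group of Thai provinces. The table
name, the hand-written conversion functions and the single data entry are assumptions made for illustration; a real
group would normally build these functions with `make_key_to_placename` and `make_placename_to_key` as recommended
above.

```lua
-- Hypothetical metadata table: `default_*` keys supply location-level defaults,
-- the remaining keys are group-only.
local thai_provinces_group = {
	default_container = "Thailand",
	default_placetype = "province",
	default_wp = "%e province",
	-- Key -> full placename, elliptical placename.
	key_to_placename = function(key)
		local full = key:gsub(",.*", "") -- drop the container
		local elliptical = full:gsub(" Province$", "") -- drop the placetype
		return full, elliptical
	end,
	-- Full or elliptical placename -> key.
	placename_to_key = function(placename)
		if not placename:find(" Province$") then
			placename = placename .. " Province"
		end
		return placename .. ", Thailand"
	end,
	data = {
		["Phuket Province, Thailand"] = {},
	},
}

print(thai_provinces_group.key_to_placename("Phuket Province, Thailand"))
print(thai_provinces_group.placename_to_key("Phuket"))
```

Both the full and the elliptical placename map back to the same key, which is the sense in which `placename_to_key` is
approximately the inverse of `key_to_placename`.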
]==]
-----------------------------------------------------------------------------------
-- Helper functions --
-----------------------------------------------------------------------------------
--[==[
Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to
format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the
error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like
this).
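
A minimal sketch of the calling convention, runnable outside Scribunto. The `dump` function here is a stand-in based on
`tostring`, which is an assumption made because `mw.dumpObject` is only available inside Scribunto; the real output for
tables and strings will therefore look different.

```lua
local unpack = unpack or table.unpack -- Lua 5.1 vs. later versions

-- Stand-in for `mw.dumpObject` (assumption for this sketch only).
local function dump(value)
	return tostring(value)
end

local function process_error(fmt, ...)
	local args = {...}
	-- Use select("#", ...) rather than #args so that nil arguments are dumped too.
	for i = 1, select("#", ...) do
		args[i] = dump(args[i])
	end
	error(string.format(fmt, unpack(args)))
end

-- A nil argument is formatted as "nil" instead of breaking the format call.
local ok, message = pcall(process_error, "Key %s returned a non-string key: %s", "Foo", nil)
print(ok, message)
```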
]==]
function export.process_error(fmt, ...)
local args = {...}
for i = 1, select("#", ...) do
args[i] = dump(args[i])
end
return error(string.format(fmt, unpack(args)))
end
--[==[
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user
error triggered by bad input or a system error due to something like running out of memory or hitting a time limit).
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the
format string as if `fmt:format(...)` were called.
]==]
function export.internal_error(fmt, ...)
export.process_error("Internal error: " .. fmt, ...)
end
local internal_error = export.internal_error
-- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If
-- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item`
-- equals `list_or_element`.
local function list_or_element_contains(list_or_element, item)
if type(list_or_element) == "table" then
return m_table.contains(list_or_element, item) and true or false
end
return list_or_element == item
end
--[==[
Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full
`"County Durham"` vs. elliptical `"Durham"`). If the group sets `key_to_placename` to `false`, the key is returned
unchanged as both placenames. If the group does not define `key_to_placename`, both full and elliptical placenames are
computed by chopping off anything starting with a comma.
]==]
function export.key_to_placename(group, key)
if group.key_to_placename == false then
return key, key
end
if group.key_to_placename then
local full_placename, elliptical_placename = group.key_to_placename(key)
if type(full_placename) ~= "string" then
internal_error("Key %s returned a non-string full placename: %s", key, full_placename)
end
if type(elliptical_placename) ~= "string" then
internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename)
end
return full_placename, elliptical_placename
end
key = key:gsub(",.*", "")
return key, key
end
--[==[
Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames) and return the result. If `placename_to_key` exists with the value `false`,
return the placename unchanged. If the group does not define `placename_to_key`, the default algorithm applies: if the
group-level `default_placetype` is `"city"`, the placename is returned unchanged; otherwise, if the group defines a
`default_container` that is a string, or a canonicalized container object whose `placetype` is `"negara"` or
`"constituent country"`, the container key is appended to the placename after a comma and a space. Otherwise the
placename is returned unchanged.
]==]
function export.placename_to_key(group, placename)
if group.placename_to_key == false then
return placename
elseif group.placename_to_key then
local key = group.placename_to_key(placename)
if type(key) ~= "string" then
internal_error("Placename %s returned a non-string key: %s", placename, key)
end
return key
elseif group.default_placetype == "city" then
return placename
else
local defcon = group.default_container
if not defcon then
return placename
elseif type(defcon) == "string" then
return placename .. ", " .. defcon
elseif type(defcon) == "table" and (defcon.placetype == "negara" or
defcon.placetype == "constituent country") then
return placename .. ", " .. defcon.key
else
return placename
end
end
end
--[==[
Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't
specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and
`placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original
non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more
than one. Containers should be carefully distinguished from category parents. Generally the container is the first
category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents,
which indicate some sort of relation between the category parent and the location but not necessarily one of
containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
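
The defaulting rule can be illustrated with a small standalone sketch. Only an unset ({nil}) property falls back to the
group-level default, so an explicit `false` on the location overrides a group-level `true`. The group and spec values
below are hypothetical, and only a few of the properties are shown.

```lua
local function value_with_default(val, default_val)
	if val == nil then
		return default_val
	end
	return val
end

-- Hypothetical group-level defaults and a location spec that overrides one of them.
local group = {default_placetype = "state", default_the = true, default_british_spelling = false}
local spec = {the = false}

for _, prop in ipairs {"placetype", "the", "british_spelling"} do
	spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
-- `is_city` falls back to whether the group's default placetype is exactly "city".
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city")

print(spec.placetype, spec.the, spec.british_spelling, spec.is_city)
```

Writing `spec[prop] or group["default_" .. prop]` instead would not work here, because an explicit `false` would be
replaced by the group default.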
]==]
function export.initialize_spec(group, key, spec)
if spec.initialized then
return
end
local container = spec.container
local containers
local container_from_default
if not container then
container = group.default_container
container_from_default = true
end
if container then
if type(container) == "string" or container.key then
container = {container}
end
containers = {}
for _, cont in ipairs(container) do
if type(cont) == "string" then
if group.canonicalize_key_container and not container_from_default then
cont = group.canonicalize_key_container(cont)
else
cont = {key = cont, placetype = "negara"}
end
end
insert(containers, cont)
end
end
spec.containers = containers
spec.container = nil
local function value_with_default(val, default_val)
if val == nil then
return default_val
else
return val
end
end
local function set_or_default(prop)
spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
set_or_default("placetype")
if not spec.placetype then
internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec)
end
set_or_default("divs")
spec.addl_divs = group.addl_divs
for _, prop in ipairs {
"keydesc",
"fulldesc",
"addl_parents",
"overriding_bare_label_parents",
"bare_category_parent_type",
"wp",
"wpcat",
"commonscat",
"british_spelling",
"the",
"no_container_cat",
"no_container_parent",
"no_generic_place_cat",
"no_check_holonym_mismatch",
"no_auto_augment_container",
"no_include_container_in_desc",
"is_city",
"is_former_place",
} do
set_or_default(prop)
end
-- `default_placetype == "city"` is correct; if `default_placetype` has something else like `prefecture-level city`
-- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as
-- is_city.
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city")
spec.initialized = true
end
--[=[
Given a location group, a key and the possible placetypes that the key's location must match, check if the key exists in
the group with at least one of its placetypes matching one of the passed-in placetypes. If so, return two values:
the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object,
which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default
property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the
property in question).
`alias_resolution` controls how aliases are resolved. Normally (when `alias_resolution` is {nil} or {"all"}), both display
and category aliases are followed, and the returned key will reflect the canonical location key. However, if
`alias_resolution` is {"none"}, no alias following
happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical
location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not
copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal
case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by
looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"}
except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key,
and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
alias_resolution ~= "all" then
internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
end
local spec = group.data[key]
if not spec then
return nil
end
local function check_correct_placetype(placetype)
if type(placetype) == "table" then
for _, pt in ipairs(placetype) do
if list_or_element_contains(placetypes, pt) then
return true
end
end
return false
else
return list_or_element_contains(placetypes, placetype)
end
end
if spec.alias_of then
local resolved_key = spec.alias_of
local resolved_spec = group.data[resolved_key]
if not resolved_spec then
internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
elseif resolved_spec.alias_of then
internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
key, resolved_key)
end
if alias_resolution == "none" or alias_resolution == "display" then
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
"`default_placetype`", key, spec, resolved_spec)
end
if not check_correct_placetype(placetype) then
return nil
end
if alias_resolution == "display" then
if spec.display == true then
key = resolved_key
elseif spec.display then
key = spec.display
end
end
return key, spec
end
key = resolved_key
spec = resolved_spec
end
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
end
if not check_correct_placetype(placetype) then
return nil
end
export.initialize_spec(group, key, spec)
return key, spec
end
--[=[
Given a location group, a placename and the possible placetypes that the placename must match, check whether the
placename corresponds to a key in the group and whether at least one of that key's placetypes matches one of the
passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
local key = export.placename_to_key(group, placename)
return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end
--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec.
If no such key exists, return {nil}. This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
local found_locations = {}
for _, group in ipairs(export.locations) do
local spec = group.data[key]
if not spec then
-- do nothing
elseif spec.alias_of then
mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
else
insert(found_locations, {group, spec})
end
end
if not found_locations[1] then
return nil
elseif found_locations[2] then
internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
else
local group, spec = unpack(found_locations[1])
export.initialize_spec(group, key, spec)
return group, spec
end
end
--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator
returns three values at each iteration: the location group, canonical key by which the location is known and the spec
object describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
]==]
function export.iterate_matching_location(data)
local i = 0
local n = #export.locations
return function()
while true do
i = i + 1
if i > n then
break
end
local group = export.locations[i]
local key, spec
if data.placename then
key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
data.alias_resolution)
else
if not data.key then
internal_error("'.placename' or '.key' must be defined: %s", data)
end
key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
end
if key then
return group, key, spec
end
end
end
end
--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark`, which is in New Jersey, not
Delaware).
]==]
function export.get_matching_location(data)
local all_found = {}
for group, key, spec in export.iterate_matching_location(data) do
insert(all_found, {group, key, spec})
end
if not all_found[1] then
internal_error("Couldn't find matching location for data %s", data)
elseif all_found[2] then
internal_error("Found multiple matching locations for data %s: %s", data, all_found)
else
return unpack(all_found[1])
end
end
--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
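internal error is thrown if a container loop is suspected (i.e. after more than 10 iterations). This function returns
an iterator; each call returns the list of not-yet-seen containers at the next level up, or {nil} when there are no more
containers. Each element of the list is a location object containing `group`, `key` and `spec` fields.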
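
The level-by-level traversal with duplicate suppression can be sketched on its own as follows. The `containers_of`
table and the `iterate_levels` helper are hypothetical stand-ins using plain string keys; the real function works with
location objects and looks each container up with `get_matching_location`.
```lua
-- Hypothetical container data, following the Russia example above.
local containers_of = {
	["Russia"] = {"Europe", "Asia"},
	["Europe"] = {"Eurasia"},
	["Asia"] = {"Eurasia"},
}

-- Simplified stand-in: each call returns the next level of not-yet-seen containers.
local function iterate_levels(start_key)
	local seen = {[start_key] = true}
	local last_level = {start_key}
	return function()
		local next_level = {}
		for _, key in ipairs(last_level) do
			for _, container in ipairs(containers_of[key] or {}) do
				if not seen[container] then
					seen[container] = true
					table.insert(next_level, container)
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		last_level = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_levels("Russia") do
	table.insert(levels, table.concat(level, ", "))
end
-- levels is now {"Europe, Asia", "Eurasia"}; "Eurasia" appears once even though it is reached twice.
```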
]==]
function export.iterate_containers(group, key, spec)
local keys_seen = {}
keys_seen[key] = true
local iterations = 0
local last_iteration_containers = {{group = group, key = key, spec = spec}}
return function()
iterations = iterations + 1
if iterations > 10 then
internal_error("Probable loop in containers when processing key %s", key)
end
local next_iteration_containers = {}
for _, location in ipairs(last_iteration_containers) do
local containers = location.spec.containers
if containers then
for _, container in ipairs(containers) do
local container_group, container_key, container_spec = export.get_matching_location {
placetypes = container.placetype,
key = container.key,
}
if not keys_seen[container_key] then
insert(next_iteration_containers, {
group = container_group, key = container_key, spec = container_spec
})
keys_seen[container_key] = true
end
end
end
end
if not next_iteration_containers[1] then
return nil
end
last_iteration_containers = next_iteration_containers
return next_iteration_containers
end
end
--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename,
display_form) or ("[[%s]]"):format(placename)
if spec.the then
linked_placename = "the " .. linked_placename
end
return linked_placename
end
--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.) `container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of
the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one
matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full
placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match
and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain
countries (such as South Korean and North Korean counties, which include the word "County" in the key). The resulting
chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped
and the full and elliptical placenames are the same.
Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
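The two chopping steps can be traced by hand with plain `gsub` calls. This is only a sketch of what the generated
function does for a single pattern of each kind; the sample key is an assumption modelled on the usage example above.
```lua
-- Sample key in the format used by the South Korea example above.
local key = "Gangwon County, South Korea"

-- Step 1: remove the container at the end of the key to get the full placename.
-- (`gsub` also returns the substitution count; only the first return value is kept here.)
local full_placename = key:gsub(", South Korea$", "")

-- Step 2: remove the political division type to get the elliptical placename.
local elliptical_placename = full_placename:gsub(" County$", "")

-- full_placename is "Gangwon County" and elliptical_placename is "Gangwon".
```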
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
if type(container_patterns) == "string" then
container_patterns = {container_patterns}
end
if type(divtype_patterns) == "string" then
divtype_patterns = {divtype_patterns}
end
return function(key)
local full_placename = key
if container_patterns then
for _, container_pattern in ipairs(container_patterns) do
local nsubs
full_placename, nsubs = full_placename:gsub(container_pattern, "")
if nsubs > 0 then
break
end
end
end
local elliptical_placename = full_placename
if divtype_patterns then
for _, divtype_pattern in ipairs(divtype_patterns) do
local nsubs
elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
if nsubs > 0 then
break
end
end
end
return full_placename, elliptical_placename
end
end
--[=[
This is typically used to define `placename_to_key`. It generates a function that appends a string to the end of a given
placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group
tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have
special meaning in Lua patterns) to be appended first to the placename; if already present at the end, it is not
appended. `container_suffix` is then appended if given; unlike `divtype_suffix`, it is added unconditionally, with no
check for whether it is already present. Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`)
or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
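
The restriction on hyphens in `divtype_suffix` comes from the pattern-based check for an existing suffix. The following
sketch shows the effect; the suffix `" Sub-County"` and the sample key are hypothetical, and the plain-text comparison
at the end is an alternative shown for contrast, not what this module does.
```lua
local suffix = " Sub-County" -- hypothetical suffix containing a hyphen
local key = "Example Sub-County"

-- Pattern-based check as in the generated function: "b-" is read as a lazy repetition of "b",
-- so the literal hyphen in the key is never matched and the suffix is not detected.
local pattern_match = key:find(suffix .. "$")

-- A suffix without magic characters is detected as expected.
local county_match = ("Gangwon County"):find(" County" .. "$")

-- Plain-text alternative: compare the tail of the string directly (assumes a non-empty suffix).
local plain_match = key:sub(-#suffix) == suffix

-- pattern_match is nil, county_match is non-nil and plain_match is true.
```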
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
return function(placename)
local key = placename
if divtype_suffix then
if not key:find(divtype_suffix .. "$") then
key = key .. divtype_suffix
end
end
if container_suffix then
key = key .. container_suffix
end
return key
end
end
--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is. Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "province"}`.
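
A short sketch of both branches follows. The generator is copied here only so that the snippet runs on its own outside
this module; the `explicit` table is an illustrative value in the already-canonical form.
```lua
-- Standalone copy of the generator so that the snippet is self-contained.
local function make_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container .. (suffix or ""), placetype = placetype}
		end
		return container
	end
end

local canonicalize = make_canonicalize_key_container(", Canada", "province")

-- A string container gets the suffix and the default placetype.
local from_string = canonicalize("Ontario")

-- A container that is already a table is returned unchanged (the same table object).
local explicit = {key = "Bumi", placetype = "planet"}
local from_table = canonicalize(explicit)
```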
]=]
local function make_canonicalize_key_container(suffix, placetype)
return function(container)
if type(container) == "string" then
return {key = container .. (suffix or ""), placetype = placetype}
else
return container
end
end
end
-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------
export.continents = {
["Bumi"] = {the = true, placetype = "planet", addl_parents = {"alam semula jadi"},
fulldesc = "=the planet [[Earth]] and the features found on it"},
["Afrika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
["Amerika"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined",
wp = "Amerika"},
["America"] = {alias_of = "Amerika", the = true},
["Amerika Utara"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}},
["Caribbean"] = {the = true, placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}},
["Amerika Tengah"] = {placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}},
["Amerika Selatan"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}},
["Antartika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"},
fulldesc = "=the territory of [[Antarctica]]"},
["Eurasia"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
["Asia"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
["Eropah"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
["Oceania"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
["Melanesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
["Micronesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
["Polynesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
}
export.continents_group = {
default_overriding_bare_label_parents = {}, -- container parents should be used
default_divs = {{type = "negara", prep = "di"}},
-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
default_no_include_container_in_desc = true,
default_no_container_cat = true,
default_no_container_parent = true,
default_no_auto_augment_container = true,
default_no_generic_place_cat = true,
-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
-- this level. We also run into problems with supercontinents, which have "benua" as the fallback and cause
-- mismatches.
default_no_check_holonym_mismatch = true,
data = export.continents,
}
-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = {
["Afghanistan"] = {container = "Asia", divs = {"provinces", "districts"}},
["Albania"] = {container = "Eropah", divs = {"counties", "municipalities", "communes",
{type = "administrative units", cat_as = "communes"},
}, british_spelling = true},
["Algeria"] = {container = "Afrika", divs = {"provinces", "communes", "districts", "municipalities"}},
["Andorra"] = {container = "Eropah", divs = {"parishes"}, british_spelling = true},
["Angola"] = {container = "Afrika", divs = {"provinces", "municipalities"}},
["Antigua and Barbuda"] = {container = "Caribbean", divs = {"provinces"}, british_spelling = true},
["Argentina"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}},
["Armenia"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts", "municipalities"},
british_spelling = true},
["Republic of Armenia"] = {alias_of = "Armenia", the = true}, -- differs in "the"
-- Both a country and continent
["Australia"] = {container = "Oceania", divs = {
{type = "negeri", cat_as = "states and territories"},
{type = "territories", cat_as = "states and territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"},
"local government areas", "dependent territories",
}, british_spelling = true},
["Austria"] = {container = "Eropah", divs = {"negeri", "districts", "municipalities"}, british_spelling = true},
["Azerbaijan"] = {container = {"Eropah", "Asia"}, divs = {"districts", "municipalities"}, british_spelling = true},
["Bahamas"] = {the = true, container = "Caribbean", divs = {"districts"}, british_spelling = true, wp = "The %l"},
["Bahrain"] = {container = "Asia", divs = {"governorates"}},
["Bangladesh"] = {container = "Asia", divs = {"divisions", "districts", "municipalities"}, british_spelling = true},
["Barbados"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Belarus"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true},
["Belgium"] = {container = "Eropah", divs = {"regions", "provinces", "municipalities"}, british_spelling = true},
["Belize"] = {container = "Amerika Tengah", divs = {"districts"}, british_spelling = true},
["Benin"] = {container = "Afrika", divs = {"departments", "communes"}},
["Bhutan"] = {container = "Asia", divs = {"districts", "gewogs"}},
["Bolivia"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}},
["Bosnia and Herzegovina"] = {container = "Eropah", divs = {"entities", "cantons", "municipalities"}, british_spelling = true},
["Bosnia and Hercegovina"] = {alias_of = "Bosnia and Herzegovina", display = true},
["Bosnia"] = {alias_of = "Bosnia and Herzegovina", display = true},
["Botswana"] = {container = "Afrika", divs = {"districts", "subdistricts"}, british_spelling = true},
["Brazil"] = {container = "Amerika Selatan", divs = {
"negeri", "municipalities", "macroregions",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Brunei"] = {container = "Asia", divs = {"districts", "mukims"}, british_spelling = true},
["Bulgaria"] = {container = "Eropah", divs = {"provinces", "municipalities"}, british_spelling = true},
["Burkina Faso"] = {container = "Afrika", divs = {"regions", "departments", "provinces"}},
["Burundi"] = {container = "Afrika", divs = {"provinces", "communes"}},
["Cambodia"] = {container = "Asia", divs = {"provinces", "districts"}},
["Cameroon"] = {container = "Afrika", divs = {"regions", "departments"}},
["Kanada"] = {container = "Amerika Utara", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of provinces and territories"},
"counties", "districts", "municipalities", "regional municipalities",
"rural municipalities", "parishes",
-- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless
-- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is
-- still at [[w:Indian reserves]]).
"Indian reserves",
"census divisions",
{type = "townships", prep = "di"},
},
british_spelling = true},
["Cape Verde"] = {container = "Afrika", divs = {"municipalities", "parishes"}},
["Central African Republic"] = {the = true, container = "Afrika", divs = {"prefectures", "subprefectures"}},
["Chad"] = {container = "Afrika", divs = {"regions", "departments"}},
["Chile"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "communes"}},
["China"] = {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and autonomous regions"},
{type = "autonomous regions", cat_as = "provinces and autonomous regions"},
{type = "FORMER provinces", cat_as = "former provinces"},
"special administrative regions",
"prefectures",
{type = "FORMER prefectures", cat_as = "former prefectures"},
"prefecture-level cities",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
{type = "FORMER counties", cat_as = "former counties and county-level cities"},
{type = "FORMER county-level cities", cat_as = "former counties and county-level cities"},
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities.
"districts",
{type = "FORMER districts", cat_as = "former districts"},
"subdistricts",
"townships",
"municipalities",
{type = "direct-administered municipalities", cat_as = "municipalities"},
}},
["People's Republic of China"] = {alias_of = "China", the = true}, -- differs in "the"
["Colombia"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}},
["Comoros"] = {the = true, container = "Afrika", divs = {"autonomous islands"}},
["Costa Rica"] = {container = "Amerika Tengah", divs = {"provinces", "cantons"}},
["Croatia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Cuba"] = {container = "Caribbean", divs = {"provinces", "municipalities"}},
["Cyprus"] = {container = {"Eropah", "Asia"}, divs = {"districts"}, british_spelling = true},
["Czech Republic"] = {the = true, container = "Eropah", divs = {"regions", "districts", "municipalities"}, british_spelling = true},
["Czechia"] = {alias_of = "Czech Republic"}, -- differs in "the"
["Democratic Republic of the Congo"] = {the = true, container = "Afrika", divs = {"provinces", "territories"}},
["Congo"] = {alias_of = "Democratic Republic of the Congo", display = true, the = true},
["Denmark"] = {container = "Eropah", divs = {"regions", "municipalities", "dependent territories"},
british_spelling = true,
-- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country)
},
["Djibouti"] = {container = "Afrika", divs = {"regions", "districts"}},
["Dominica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Dominican Republic"] = {the = true, container = "Caribbean", divs = {"provinces", "municipalities"},
keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"},
["East Timor"] = {container = "Asia", divs = {"municipalities"}, wp = "Timor-Leste"},
["Timor-Leste"] = {alias_of = "East Timor", display = true},
["Ecuador"] = {container = "Amerika Selatan", divs = {"provinces", "cantons"}},
["Egypt"] = {container = "Afrika", divs = {"governorates", "regions"}, british_spelling = true},
["El Salvador"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Equatorial Guinea"] = {container = "Afrika", divs = {"provinces"}},
["Eritrea"] = {container = "Afrika", divs = {"regions", "subregions"}},
["Estonia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Eswatini"] = {container = "Afrika", british_spelling = true},
["Swaziland"] = {alias_of = "Eswatini", display = true},
["Ethiopia"] = {container = "Afrika", divs = {"regions", "zones"}},
["Federated States of Micronesia"] = {the = true, container = "Micronesia", divs = {"negeri"}},
["Micronesia"] = {alias_of = "Federated States of Micronesia"},
["Fiji"] = {container = "Melanesia", divs = {"divisions", "provinces"}, british_spelling = true},
["Finland"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true},
["France"] = {container = "Eropah", divs = {"regions", "cantons", "collectivities",
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
"dependent territories", "territories", "provinces",
}, british_spelling = true},
["Gabon"] = {container = "Afrika", divs = {"provinces", "departments"}},
["Gambia"] = {the = true, container = "Afrika", divs = {"divisions", "districts"}, british_spelling = true, wp = "The %l"},
["Georgia"] = {container = {"Eropah", "Asia"}, divs = {"regions", "districts"},
keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"},
["Germany"] = {container = "Eropah", divs = {
"negeri",
-- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but
-- there aren't really enough of them to categorize per state.
"regions",
"municipalities", "districts"}, british_spelling = true},
["Ghana"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Greece"] = {container = "Eropah", divs = {"regions", "regional units", "municipalities",
{type = "peripheries", cat_as = {"regions"}},
}, british_spelling = true},
["Grenada"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Guatemala"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Guinea"] = {container = "Afrika", divs = {"regions", "prefectures"}},
["Guinea-Bissau"] = {container = "Afrika", divs = {"regions"}},
["Guyana"] = {container = "Amerika Selatan", divs = {"regions"}, british_spelling = true},
["Haiti"] = {container = "Caribbean", divs = {"departments", "arrondissements"}},
["Honduras"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Hungary"] = {container = "Eropah", divs = {"counties", "districts"}, british_spelling = true},
["Iceland"] = {container = "Eropah", divs = {"regions", "municipalities", "counties"}, british_spelling = true},
["India"] = {container = "Asia", divs = {
{type = "negeri", cat_as = "states and union territories"},
{type = "union territories", cat_as = "states and union territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"},
{type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"},
"divisions", "districts", "municipalities",
}, british_spelling = true},
["Indonesia"] = {container = "Asia", divs = {"regencies", "provinces",
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"},
}},
["Iran"] = {container = "Asia", divs = {"provinces", "counties"}},
["Iraq"] = {container = "Asia", divs = {"governorates", "districts"}},
["Ireland"] = {container = "Eropah", addl_parents = {"British Isles"},
divs = {"counties", "districts", "provinces"}, british_spelling = true, wp = "Republic of %l"},
["Republic of Ireland"] = {alias_of = "Ireland", the = true}, -- differs in "the"
["Israel"] = {container = "Asia", divs = {"districts"}},
["Italy"] = {container = "Eropah", divs = {
"regions", "provinces", "metropolitan cities", "municipalities",
{type = "autonomous regions", cat_as = "regions"},
}, british_spelling = true},
["Ivory Coast"] = {container = "Afrika", divs = {"districts", "regions"}},
-- We should really be using Ivory Coast (common name) but there are political ramifications to the use of
-- Côte d'Ivoire so don't make it a display alias.
["Côte d'Ivoire"] = {alias_of = "Ivory Coast"},
["Jamaica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Jepun"] = {container = "Asia", divs = {"prefectures", "subprefectures", "municipalities"}},
["Jordan"] = {container = "Asia", divs = {"governorates"}},
["Kazakhstan"] = {container = {"Asia", "Eropah"}, divs = {"regions", "districts"}},
["Kenya"] = {container = "Afrika", divs = {"counties"}, british_spelling = true},
["Kiribati"] = {container = "Micronesia", british_spelling = true},
["Kosovo"] = {container = "Eropah", divs = {"districts", "municipalities"}, british_spelling = true},
["Kuwait"] = {container = "Asia", divs = {"governorates", "areas"}},
["Kyrgyzstan"] = {container = "Asia", divs = {"regions", "districts"}},
["Laos"] = {container = "Asia", divs = {"provinces", "districts"}},
["Latvia"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["Lubnan"] = {container = "Asia", divs = {"governorates", "districts"}},
["Lesotho"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Liberia"] = {container = "Afrika", divs = {"counties", "districts"}},
["Libya"] = {container = "Afrika", divs = {"districts", "municipalities"}},
["Liechtenstein"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["Lithuania"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Luxembourg"] = {container = "Eropah", divs = {"cantons", "districts"}, british_spelling = true},
["Madagascar"] = {container = "Afrika", divs = {"regions", "districts"}},
["Malawi"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Malaysia"] = {container = "Asia", divs = {"negeri", "wilayah persekutuan", "daerah"}, british_spelling = true},
["Maldives"] = {the = true, container = "Asia", divs = {"provinces", "administrative atolls"}, british_spelling = true},
["Mali"] = {container = "Afrika", divs = {"regions", "cercles"}},
["Malta"] = {container = "Eropah", divs = {"regions", "local councils"}, british_spelling = true},
["Kepulauan Marshall"] = {the = true, container = "Micronesia", divs = {"municipalities"}},
["Mauritania"] = {container = "Afrika", divs = {"regions", "departments"}},
["Mauritius"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Mexico"] = {container = "Amerika Utara", addl_parents = {"Amerika Tengah"}, divs = {
"negeri", "municipalities",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Moldova"] = {container = "Eropah", divs = {
{type = "districts", cat_as = "districts and autonomous territorial units"},
{type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"},
"communes", "municipalities",
}, british_spelling = true},
["Monaco"] = {placetype = {"city-state", "negara"}, container = "Eropah",
-- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we
-- want its parent to be "countries in Europe".
bare_category_parent_type = {type = "negara", prep = "di"},
is_city = true, british_spelling = true},
["Mongolia"] = {container = "Asia", divs = {"provinces", "districts"}},
["Montenegro"] = {container = "Eropah", divs = {"municipalities"}},
["Morocco"] = {container = "Afrika", divs = {"regions", "prefectures", "provinces"}},
["Mozambique"] = {container = "Afrika", divs = {"provinces", "districts"}},
["Myanmar"] = {container = "Asia",
divs = {"regions", "negeri", "union territories",
{type = "self-administered zones", cat_as = "self-administered areas"},
{type = "self-administered divisions", cat_as = "self-administered areas"},
"districts"}},
["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations
["Namibia"] = {container = "Afrika", divs = {"regions", "constituencies"}, british_spelling = true},
["Nauru"] = {container = "Micronesia", divs = {"districts"}, british_spelling = true},
["Nepal"] = {container = "Asia", divs = {"provinces", "districts"}},
["Netherlands"] = {the = true, placetype = {"negara", "constituent country"}, container = "Eropah",
divs = {"provinces", "municipalities",
{type = "FORMER municipalities", cat_as = "former municipalities"},
"dependent territories", "constituent countries"}, british_spelling = true,
-- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]]
-- (country)
},
["New Zealand"] = {container = "Polynesia", divs = {
"regions", "dependent territories", "territorial authorities",
{type = "districts", cat_as = "territorial authorities"},
},
british_spelling = true},
["Nicaragua"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Niger"] = {container = "Afrika", divs = {"regions", "departments"}},
["Nigeria"] = {container = "Afrika", divs = {
"negeri",
-- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize
-- everything under 'states and territories' but that seems a bit pointless.
{type = "wilayah persekutuan", cat_as = "negeri"},
"local government areas",
}, british_spelling = true},
["North Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties"}},
["North Macedonia"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true},
["Macedonia"] = {alias_of = "North Macedonia", display = true},
["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Norway"] = {container = "Eropah",
divs = {"counties", "municipalities", "dependent territories", "districts", "unincorporated areas"},
british_spelling = true},
["Oman"] = {container = "Asia", divs = {"governorates", "provinces"}},
["Pakistan"] = {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "administrative territories", cat_as = "provinces and territories"},
{type = "wilayah persekutuan", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
"divisions", "districts",
}, british_spelling = true},
["Palau"] = {container = "Micronesia", divs = {"negeri"}},
["Palestine"] = {container = "Asia", divs = {"governorates"}},
["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the"
["Panama"] = {container = "Amerika Tengah", divs = {"provinces", "districts"}},
["Papua New Guinea"] = {container = "Melanesia", divs = {"provinces", "districts"}, british_spelling = true},
["Paraguay"] = {container = "Amerika Selatan", divs = {"departments", "districts"}},
["Peru"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "districts"}},
["Philippines"] = {the = true, container = "Asia", divs = {"regions", "provinces", "districts", "municipalities", "barangays"}},
["Poland"] = {divs = {"voivodeships", "counties",
{type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}},
}, container = "Eropah", british_spelling = true},
["Portugal"] = {container = "Eropah", divs = {
{type = "autonomous regions", cat_as = "districts and autonomous regions"},
{type = "districts", cat_as = "districts and autonomous regions"},
"provinces", "municipalities"}, british_spelling = true},
["Qatar"] = {container = "Asia", divs = {"municipalities", "zones"}},
["Republic of the Congo"] = {the = true, container = "Afrika", divs = {"departments", "districts"}},
["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true},
["Romania"] = {container = "Eropah", divs = {
"regions", "counties", "communes",
{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
}, british_spelling = true},
["Russia"] = {container = {"Eropah", "Asia"}, divs = {
"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities",
"districts", "federal districts"},
british_spelling = true},
["Rwanda"] = {container = "Afrika", divs = {"provinces", "districts"}},
["Saint Kitts and Nevis"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Saint Lucia"] = {container = "Caribbean", divs = {"districts"}, british_spelling = true},
["Saint Vincent and the Grenadines"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Samoa"] = {container = "Polynesia", divs = {"districts"}, british_spelling = true},
["San Marino"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["São Tomé and Príncipe"] = {container = "Afrika", divs = {"districts"}},
["Arab Saudi"] = {container = "Asia", divs = {"wilayah", "kegaboneran"}},
["Senegal"] = {container = "Afrika", divs = {"regions", "departments"}},
["Serbia"] = {container = "Eropah", divs = {"districts", "municipalities", "autonomous provinces"}},
["Seychelles"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Sierra Leone"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
["Singapore"] = {container = "Asia", divs = {"districts", "regions"}, british_spelling = true},
["Slovakia"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true},
["Slovenia"] = {container = "Eropah", divs = {"statistical regions", "municipalities"}, british_spelling = true},
-- Note: the official name does not include "the" at the beginning, but it sounds strange in
-- English to leave it out and it's commonly included, so we include it.
["Solomon Islands"] = {the = true, container = "Melanesia", divs = {"provinces"}, british_spelling = true},
["Somalia"] = {container = "Afrika", divs = {"regions", "districts"}},
["South Africa"] = {container = "Afrika", divs = {
"provinces",
"districts",
{type = "district municipalities", cat_as = "districts"},
{type = "metropolitan municipalities", cat_as = "districts"},
"municipalities",
}, british_spelling = true},
["South Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties", "districts"}},
["South Sudan"] = {container = "Afrika", divs = {"regions", "negeri", "counties"}, british_spelling = true},
["Spain"] = {container = "Eropah", divs = {"autonomous communities", "provinces", "municipalities",
"comarcas", "autonomous cities"},
british_spelling = true},
["Sri Lanka"] = {container = "Asia", divs = {"provinces", "districts"}, british_spelling = true},
["Sudan"] = {container = "Afrika", divs = {"negeri", "districts"}, british_spelling = true},
["Suriname"] = {container = "Amerika Selatan", divs = {"districts"}},
["Sweden"] = {container = "Eropah", divs = {"provinces", "counties", "municipalities"}, british_spelling = true},
["Switzerland"] = {container = "Eropah", divs = {"cantons", "municipalities", "districts"}, british_spelling = true},
["Syria"] = {container = "Asia", divs = {"governorates", "districts"}},
["Taiwan"] = {container = "Asia", divs = {"counties", "districts", "townships", "special municipalities"}},
["Republic of China"] = {alias_of = "Taiwan", the = true}, -- differs in "the", different political connotations
["Tajikistan"] = {container = "Asia", divs = {"regions", "districts"}},
["Tanzania"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Thailand"] = {container = "Asia", divs = {"wilayah", "daerah", "subdaerah"}},
["Togo"] = {container = "Afrika", divs = {"provinces", "prefectures"}},
["Tonga"] = {container = "Polynesia", divs = {"divisions"}, british_spelling = true},
["Trinidad and Tobago"] = {container = "Caribbean", divs = {"regions", "municipalities"}, british_spelling = true},
["Tunisia"] = {container = "Afrika", divs = {"governorates", "delegations"}},
["Turkey"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts"}},
-- Foreign names generally get display-canonicalized.
["Türkiye"] = {alias_of = "Turkey", display = true},
["Turkmenistan"] = {container = "Asia", divs = {
-- The 5 regions are often also called provinces
"regions", {type = "provinces", cat_as = "regions"}, "districts"},
},
["Tuvalu"] = {container = "Polynesia", divs = {"atolls"}, british_spelling = true},
["Uganda"] = {container = "Afrika", divs = {"districts", "counties"}, british_spelling = true},
["Ukraine"] = {container = "Eropah", divs = {
{type = "oblasts", cat_as = "oblasts and autonomous republics"},
{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
"raions", "hromadas",
}, british_spelling = true},
["United Arab Emirates"] = {the = true, container = "Asia", divs = {"emirates"}},
-- Abbreviations get display-canonicalized.
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true},
["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true},
["United Kingdom"] = {the = true, container = "Eropah", addl_parents = {"British Isles"},
divs = {"constituent countries", "counties", "districts", "boroughs", "territories", "dependent territories",
"traditional counties"},
keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true},
-- Abbreviations get display-canonicalized.
["UK"] = {alias_of = "United Kingdom", display = true, the = true},
["U.K."] = {alias_of = "United Kingdom", display = true, the = true},
["Amerika Syarikat"] = {the = true, container = "Amerika Utara",
divs = {"counties", "county seats", "negeri", "territories", "dependent territories",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
{type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"},
{type = "NICKNAME_FOR states", cat_as = "nicknames for states"},
{type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"},
{type = "boroughs", prep = "di"}, -- exist in Pennsylvania and New Jersey
"municipalities", -- these exist politically at least in Colorado and Connecticut
{type = "census-designated places", prep = "di"},
{type = "unincorporated communities", prep = "di"},
-- Don't change the following to something more politically correct until/unless the US government makes a
-- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at
-- [[w:Indian reservations]]).
"Indian reservations",
}},
-- Abbreviations and long forms (when possible) get display-canonicalized.
["US"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["U.S."] = {alias_of = "Amerika Syarikat", display = true, the = true},
["USA"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["U.S.A."] = {alias_of = "Amerika Syarikat", display = true, the = true},
["United States of America"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["United States"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["Uruguay"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}},
["Uzbekistan"] = {container = "Asia", divs = {"regions", "districts"}},
["Vanuatu"] = {container = "Melanesia", divs = {"provinces"}, british_spelling = true},
["Vatican City"] = {placetype = {"city-state", "negara"}, container = "Eropah",
-- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state,
-- but we want its parent to be "countries in Europe".
bare_category_parent_type = {type = "negara", prep = "di"},
addl_parents = {"Rome"}, is_city = true, british_spelling = true},
["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the"
["Venezuela"] = {container = "Amerika Selatan", divs = {"negeri", "municipalities"}},
["Vietnam"] = {container = "Asia", divs = {"provinces", "districts", "municipalities"}},
["Western Sahara"] = {placetype = {"territory", "negara"}, container = "Afrika",
bare_category_parent_type = {type = "negara", prep = "di"},
},
-- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara
["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true},
["Yemen"] = {container = "Asia", divs = {"governorates", "districts"}},
["Zambia"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
["Zimbabwe"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
}
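--[==[
Illustration only, not part of the module's logic. Entries with `alias_of` point at a canonical key in the same table,
and the comments above indicate that `display = true` marks aliases that are display-canonicalized. How the module
itself resolves aliases is not shown in this section; the sketch below is one plausible reading, and `resolve_alias`
and `sample` are hypothetical names.

```lua
-- Hypothetical helper: map a key to its canonical key, its canonical spec and the name to display.
local function resolve_alias(data, key)
	local spec = data[key]
	if spec and spec.alias_of then
		local canonical = spec.alias_of
		-- Assumption: `display = true` means the canonical name is also what the reader sees.
		local shown = spec.display and canonical or key
		return canonical, data[canonical], shown
	end
	return key, spec, key
end

-- Toy data mirroring the shape of the entries above.
local sample = {
	["Turkey"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts"}},
	["Türkiye"] = {alias_of = "Turkey", display = true},
	["Myanmar"] = {container = "Asia"},
	["Burma"] = {alias_of = "Myanmar"},
}

local key1, spec1, shown1 = resolve_alias(sample, "Türkiye")
local key2, spec2, shown2 = resolve_alias(sample, "Burma")
```

Under this reading, "Türkiye" is both looked up and shown as "Turkey", whereas "Burma" is looked up as "Myanmar" but
still shown as "Burma".
]==]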
local function canonicalize_continent_container(key)
if type(key) ~= "string" then
return key
end
if export.continents[key] then
return {key = key, placetype = export.continents[key].placetype}
end
internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end
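--[==[
Illustration only. The function above turns a bare continent key into a `{key = ..., placetype = ...}` table, passes
non-string values through unchanged and raises an error for unknown keys. The sketch below reproduces that behaviour
in isolation; `continents` and this `internal_error` are stand-ins for `export.continents` and the module's own
`internal_error`, both defined outside this section, and the "continent" placetype value is an assumption.

```lua
-- Stand-ins for objects defined elsewhere in the module.
local continents = {
	["Asia"] = {placetype = "continent"},
	["Eropah"] = {placetype = "continent"},
}
local function internal_error(fmt, ...)
	error(string.format(fmt, ...))
end

local function canonicalize(key)
	if type(key) ~= "string" then
		return key -- already in {key = ..., placetype = ...} form
	end
	if continents[key] then
		return {key = key, placetype = continents[key].placetype}
	end
	internal_error("Unrecognized key %s", key)
end

local expanded = canonicalize("Asia") -- string key is expanded to a table
local passthrough = canonicalize({key = "X", placetype = "region"}) -- tables pass through
local ok = pcall(canonicalize, "Atlantis") -- unknown keys raise an error
```
]==]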
export.countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"+++", "negara"},
default_placetype = "negara",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.countries,
}
-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
-- British Overseas Territory
["Akrotiri and Dhekelia"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Cyprus", "Eropah", "Asia"},
british_spelling = true,
},
-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States
["American Samoa"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
["Anguilla"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["Abkhazia"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Georgia", "Eropah", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- Australian external territory
["Ashmore and Cartier Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
},
-- constituent country of the Netherlands
["Aruba"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
["Bermuda"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Amerika Utara"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Bonaire"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- British Overseas Territory
["British Indian Ocean Territory"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Asia"},
british_spelling = true,
},
-- British Overseas Territory
["British Virgin Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Norwegian dependent territory
["Bouvet Island"] = {
placetype = {"dependent territory", "territory"},
container = "Norway",
addl_parents = {"Afrika"},
british_spelling = true,
},
-- British Overseas Territory
["Cayman Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Australian external territory
["Christmas Island"] = {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
british_spelling = true,
},
-- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the
-- French Southern and Antarctic Lands.
["Clipperton Island"] = {
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"Amerika Utara"},
},
-- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands
["Cocos Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
wp = "Cocos (Keeling) Islands",
british_spelling = true,
},
["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
-- self-governing but in free association with New Zealand
["Cook Islands"] = {
the = true,
placetype = {"negara"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- constituent country of the Netherlands
["Curaçao"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special territory of Chile
["Easter Island"] = {
placetype = {"special territory", "territory"},
container = "Chile",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
["Falkland Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Amerika Selatan"},
british_spelling = true,
},
-- autonomous territory of Denmark
["Faroe Islands"] = {
the = true,
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- overseas department and region of France
["French Guiana"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Amerika Selatan"},
british_spelling = true,
},
-- overseas collectivity of France
["French Polynesia"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- French overseas territory
["French Southern and Antarctic Lands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"Afrika"},
},
-- British Overseas Territory
["Gibraltar"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Eropah"},
is_city = true,
british_spelling = true,
},
-- autonomous territory of Denmark
["Greenland"] = {
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"Amerika Utara"},
divs = {"municipalities"},
british_spelling = true,
},
-- overseas department and region of France
["Guadeloupe"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
addl_parents = {"Caribbean"},
divs = {"communes"},
british_spelling = true,
},
-- unincorporated territory of the United States
["Guam"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- self-governing British Crown dependency; technically called the Bailiwick of Guernsey
["Guernsey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
wp = "Bailiwick of %l",
},
["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true},
-- Australian external territory
["Heard Island and McDonald Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Afrika"},
},
-- special administrative region of China
["Hong Kong"] = {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- self-governing British Crown dependency
["Isle of Man"] = {
the = true,
placetype = {"crown dependency", "dependency", "dependent territory", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
},
-- Norwegian unincorporated area
["Jan Mayen"] = {
placetype = {"unincorporated area", "dependent territory", "territory", "island"},
container = "Norway",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- self-governing British Crown dependency; technically called the Bailiwick of Jersey
["Jersey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
},
["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true},
-- special administrative region of China
["Macau"] = {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- overseas department and region of France
["Martinique"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas department and region of France
["Mayotte"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Afrika"},
british_spelling = true,
},
-- British Overseas Territory
["Montserrat"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special collectivity of France
["New Caledonia"] = {
placetype = {"special collectivity", "collectivity"},
container = "France",
addl_parents = {"Melanesia"},
british_spelling = true,
},
-- dependent territory of New Zealand
["New Zealand Subantarctic Islands"] = {
the = true,
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Antartika"},
british_spelling = true,
},
-- self-governing but in free association with New Zealand
["Niue"] = {
placetype = {"negara"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- Australian external territory
["Norfolk Island"] = {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Cyprus
["Northern Cyprus"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Cyprus", "Turkey", "Eropah", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]",
british_spelling = true,
},
-- commonwealth, unincorporated territory of the United States
["Northern Mariana Islands"] = {
the = true,
placetype = {"commonwealth", "unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- British Overseas Territory
["Pitcairn Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- commonwealth of the United States
["Puerto Rico"] = {
placetype = {"commonwealth", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Caribbean"},
divs = {"municipalities"},
},
-- overseas department and region of France
["Réunion"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Afrika"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Saba"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- overseas collectivity of France
["Saint Barthélemy"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
["Saint Helena, Ascension and Tristan da Cunha"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
divs = {{type = "constituent parts", container_parent_type = false}},
addl_parents = {"Atlantic Ocean", "Afrika"},
british_spelling = true,
},
-- constituent parts of the combined overseas territory
["Ascension Island"] = {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Saint Helena"] = {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Tristan da Cunha"] = {
placetype = {"constituent part", "territory", "archipelago"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
-- overseas collectivity of France
["Saint Martin"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas collectivity of France
["Saint Pierre and Miquelon"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
divs = {"communes"},
addl_parents = {"Amerika Utara"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Sint Eustatius"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- constituent country of the Netherlands
["Sint Maarten"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Somalia
["Somaliland"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Somalia", "Afrika"},
keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]",
british_spelling = true,
},
-- British Overseas Territory
-- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for
-- "Saint Helena, Ascension and Tristan da Cunha".
["South Georgia"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["South Ossetia"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Georgia", "Eropah", "Asia"},
keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- British Overseas Territory
["South Sandwich Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
wp = true,
wpcat = "South Georgia and the South Sandwich Islands",
british_spelling = true,
},
-- Norwegian unincorporated area
["Svalbard"] = {
placetype = {"unincorporated area", "dependent territory", "territory", "archipelago"},
container = "Norway",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- dependent territory of New Zealand
["Tokelau"] = {
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Moldova
["Transnistria"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Moldova", "Eropah"},
keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]",
british_spelling = true,
},
-- British Overseas Territory
["Turks and Caicos Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- unincorporated territory of the United States
["United States Minor Outlying Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Islands", "Micronesia", "Polynesia", "Caribbean"},
},
-- FIXME: We should add entries for the other minor outlying islands; of those listed here, only Wake Island has an
-- entry so far.
-- Baker Island (Oceania)
-- Howland Island (Oceania)
-- Jarvis Island (Oceania)
-- Johnston Atoll (Oceania)
-- Kingman Reef (Oceania)
-- Midway Atoll (Oceania)
-- Navassa Island (Caribbean)
-- Palmyra Atoll (Oceania)
-- Wake Island (Oceania)
["Wake Island"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- unincorporated territory of the United States
["United States Virgin Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Caribbean"},
},
["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
-- overseas collectivity of France
["Wallis and Futuna"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
}
export.country_like_entities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Saint Helena, Ascension and Tristan da Cunha".
key_to_placename = false,
placename_to_key = false,
canonicalize_key_container = make_canonicalize_key_container(nil, "negara"),
default_overriding_bare_label_parents = {"country-like entities"},
default_no_container_cat = true,
default_no_container_parent = true,
-- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas
-- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village
-- in Europe.
default_no_auto_augment_container = true,
data = export.country_like_entities,
}
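--[==[
Illustration only. The group sets `default_no_container_cat = true`, while entries such as "Ascension Island" set
`no_container_cat = false`, which suggests that a per-entry value overrides the group's `default_<property>`. The
lookup code is not part of this section, so the sketch below is an assumption about that fallback; `get_prop` and
`group` are hypothetical names.

```lua
-- Hypothetical helper: read a property from an entry, falling back to the group's `default_<prop>`.
local function get_prop(group, key, prop)
	local spec = group.data[key]
	-- Compare against nil so that an explicit `false` on the entry still overrides the default.
	if spec[prop] ~= nil then
		return spec[prop]
	end
	return group["default_" .. prop]
end

-- Toy group mirroring the shape of the groups in this module.
local group = {
	default_placetype = "negara",
	default_no_container_cat = true,
	data = {
		["Mongolia"] = {container = "Asia"},
		["Monaco"] = {placetype = {"city-state", "negara"}, container = "Eropah"},
		["Ascension Island"] = {no_container_cat = false},
	},
}

local mongolia_type = get_prop(group, "Mongolia", "placetype")
local monaco_type = get_prop(group, "Monaco", "placetype")
local ascension_flag = get_prop(group, "Ascension Island", "no_container_cat")
local mongolia_flag = get_prop(group, "Mongolia", "no_container_cat")
```

The explicit nil check matters: a plain `spec[prop] or default` would ignore an entry-level `false`.
]==]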
-- Former countries and similar entities; we don't create "Cities in ..." categories for them because the entities
-- no longer exist.
export.former_countries = {
-- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan
-- (also known as Nagorno-Karabakh)
-- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out.
["Artsakh"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Azerbaijan", "Eropah", "Asia"},
keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]",
british_spelling = true,
},
["Nagorno-Karabakh"] = {alias_of = "Artsakh"},
["Czechoslovakia"] = {container = "Eropah", british_spelling = true},
["East Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true},
["North Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}},
["Persia"] = {placetype = {"empire", "negara"}, container = "Asia", divs = {"provinces"}},
["Byzantine Empire"] = {
the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"},
addl_parents = {"Ancient Europe", "Ancient Near East"},
divs = {
"provinces", "themes",
}},
["Roman Empire"] = {
the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Rome"},
divs = {
"provinces",
{type = "FORMER provinces", cat_as = "provinces"},
}},
["South Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}},
["Soviet Union"] = {
the = true, container = {"Eropah", "Asia"}, divs = {"republics", "autonomous republics"},
british_spelling = true},
["West Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true},
["Yugoslavia"] = {container = "Eropah", divs = {"districts"},
keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true},
}
export.former_countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"former countries and country-like entities"},
default_is_former_place = true,
default_placetype = "negara",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.former_countries,
}
-----------------------------------------------------------------------------------
-- Subpolity tables --
-----------------------------------------------------------------------------------
export.australia_states_and_territories = {
["Australian Capital Territory, Australia"] = {the = true, placetype = "territory"},
["Jervis Bay Territory, Australia"] = {the = true, placetype = "territory"},
["New South Wales, Australia"] = {},
["Northern Territory, Australia"] = {the = true, placetype = "territory"},
["Queensland, Australia"] = {},
["South Australia, Australia"] = {},
["Tasmania, Australia"] = {},
["Victoria, Australia"] = {},
["Western Australia, Australia"] = {},
}
-- states and territories of Australia
export.australia_group = {
default_container = "Australia",
default_placetype = "negeri",
default_divs = "local government areas",
data = export.australia_states_and_territories,
}
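-- Illustrative sketch only (commented out, not used by this module): one way a consumer of a group table might
-- resolve an entry's effective settings by falling back to the group's `default_*` fields. The helper name
-- `effective_spec` and the fallback order are assumptions for illustration; the real lookup code lives elsewhere
-- and may behave differently.
--[==[
local function effective_spec(group, key)
	local spec = group.data[key]
	if not spec then
		return nil
	end
	return {
		placetype = spec.placetype or group.default_placetype,
		container = spec.container or group.default_container,
		divs = spec.divs or group.default_divs,
	}
end

local group = {
	default_container = "Australia",
	default_placetype = "negeri",
	default_divs = "local government areas",
	data = {
		["Queensland, Australia"] = {},
		["Northern Territory, Australia"] = {the = true, placetype = "territory"},
	},
}
assert(effective_spec(group, "Queensland, Australia").placetype == "negeri")
assert(effective_spec(group, "Northern Territory, Australia").placetype == "territory")
assert(effective_spec(group, "Northern Territory, Australia").container == "Australia")
assert(effective_spec(group, "Atlantis, Australia") == nil)
]==]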
export.austria_states = {
["Vienna, Austria"] = {},
["Lower Austria, Austria"] = {},
["Upper Austria, Austria"] = {},
["Styria, Austria"] = {},
["Tyrol, Austria"] = {wp = "Tyrol (state)"},
["Carinthia, Austria"] = {},
["Salzburg, Austria"] = {wp = "Salzburg (state)"},
["Vorarlberg, Austria"] = {},
["Burgenland, Austria"] = {},
}
-- states of Austria
export.austria_group = {
default_container = "Austria",
default_placetype = "negeri",
default_divs = "municipalities",
data = export.austria_states,
}
export.bangladesh_divisions = {
["Barisal Division, Bangladesh"] = {},
["Chittagong Division, Bangladesh"] = {},
["Dhaka Division, Bangladesh"] = {},
["Khulna Division, Bangladesh"] = {},
["Mymensingh Division, Bangladesh"] = {},
["Rajshahi Division, Bangladesh"] = {},
["Rangpur Division, Bangladesh"] = {},
["Sylhet Division, Bangladesh"] = {},
}
-- divisions of Bangladesh
export.bangladesh_group = {
key_to_placename = make_key_to_placename(", Bangladesh$", " Division$"),
placename_to_key = make_placename_to_key(", Bangladesh", " Division"),
default_container = "Bangladesh",
default_placetype = "division",
default_divs = "districts",
data = export.bangladesh_divisions,
}
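-- Illustrative sketch only (commented out): how the key <-> placename conversions configured above could work.
-- `sketch_key_to_placename` and `sketch_placename_to_key` are hypothetical stand-ins, NOT the module's actual
-- make_key_to_placename() / make_placename_to_key(), whose real behavior and return values may differ. The sketch
-- assumes the first pair of arguments are Lua patterns (note the `$` anchors) and the second pair are literal
-- suffixes.
--[==[
local function sketch_key_to_placename(container_pattern, divtype_pattern)
	return function(key)
		-- Strip the container first, then the division-type suffix.
		local full = key:gsub(container_pattern, "")
		local short = full:gsub(divtype_pattern, "")
		return full, short
	end
end

local function sketch_placename_to_key(container_suffix, divtype_suffix)
	return function(placename)
		-- Re-add the division-type suffix only if it is missing.
		if placename:sub(-#divtype_suffix) ~= divtype_suffix then
			placename = placename .. divtype_suffix
		end
		return placename .. container_suffix
	end
end

local to_placename = sketch_key_to_placename(", Bangladesh$", " Division$")
local full, short = to_placename("Dhaka Division, Bangladesh")
assert(full == "Dhaka Division" and short == "Dhaka")

local to_key = sketch_placename_to_key(", Bangladesh", " Division")
assert(to_key("Dhaka") == "Dhaka Division, Bangladesh")
assert(to_key("Dhaka Division") == "Dhaka Division, Bangladesh")
]==]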
export.brazil_states = {
["Acre, Brazil"] = {wp = "%l (state)"},
["Alagoas, Brazil"] = {},
["Amapá, Brazil"] = {},
["Amazonas, Brazil"] = {wp = "%l (Brazilian state)"},
["Bahia, Brazil"] = {},
["Ceará, Brazil"] = {},
["Distrito Federal, Brazil"] = {wp = "Federal District (Brazil)"},
["Espírito Santo, Brazil"] = {},
["Goiás, Brazil"] = {},
["Maranhão, Brazil"] = {},
["Mato Grosso, Brazil"] = {},
["Mato Grosso do Sul, Brazil"] = {},
["Minas Gerais, Brazil"] = {},
["Pará, Brazil"] = {},
["Paraíba, Brazil"] = {},
["Paraná, Brazil"] = {wp = "%l (state)"},
["Pernambuco, Brazil"] = {},
["Piauí, Brazil"] = {},
["Rio de Janeiro, Brazil"] = {wp = "%l (state)"},
["Rio Grande do Norte, Brazil"] = {},
["Rio Grande do Sul, Brazil"] = {},
["Rondônia, Brazil"] = {},
["Roraima, Brazil"] = {},
["Santa Catarina, Brazil"] = {wp = "%l (state)"},
["São Paulo, Brazil"] = {wp = "%l (state)"},
["Sergipe, Brazil"] = {},
["Tocantins, Brazil"] = {},
}
-- states of Brazil
export.brazil_group = {
default_container = "Brazil",
default_placetype = "negeri",
default_divs = "municipalities",
data = export.brazil_states,
}
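-- Illustrative sketch only (commented out): how a `wp` template such as "%l (state)" might be expanded into a
-- Wikipedia article title. It assumes `%l` stands for the bare placename and `%c` (used by some entries elsewhere
-- in this file) for the container's placename; `expand_wp` is a hypothetical helper, not the module's own code.
--[==[
local function expand_wp(template, placename, container)
	-- The extra parentheses drop gsub's second return value (the match count).
	return (template:gsub("%%([lc])", function(code)
		if code == "l" then
			return placename
		end
		return container
	end))
end

assert(expand_wp("%l (state)", "Acre", "Brazil") == "Acre (state)")
assert(expand_wp("%l (Brazilian state)", "Amazonas", "Brazil") == "Amazonas (Brazilian state)")
-- A template without placeholders is returned unchanged.
assert(expand_wp("Federal District (Brazil)", "Distrito Federal", "Brazil") == "Federal District (Brazil)")
assert(expand_wp("%l, %c", "Jinan", "Shandong") == "Jinan, Shandong")
]==]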
export.canada_provinces_and_territories = {
["Alberta, Canada"] = {divs = {
{type = "municipal districts", container_parent_type = "rural municipalities"},
}},
["British Columbia, Canada"] = {divs = {
{type = "regional districts", container_parent_type = false},
"regional municipalities",
}},
["Manitoba, Canada"] = {divs = {"rural municipalities"}},
["New Brunswick, Canada"] = {divs = {"counties", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
["Newfoundland and Labrador, Canada"] = {},
["Northwest Territories, Canada"] = {the = true, placetype = "territory"},
["Nova Scotia, Canada"] = {divs = {"counties", "regional municipalities"}},
["Nunavut, Canada"] = {placetype = "territory"},
["Ontario, Canada"] = {divs = {"counties", "regional municipalities", {type = "townships", prep = "di"}}},
["Prince Edward Island, Canada"] = {divs = {"counties", "parishes", "rural municipalities"}},
["Saskatchewan, Canada"] = {divs = {"rural municipalities"}},
["Quebec, Canada"] = {divs = {
"counties",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
-- administrative regions have an official (but non-governmental) function but there don't appear to be any
-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
{type = "regions", container_parent_type = false},
{type = "townships", prep = "di"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}},
}},
["Yukon, Canada"] = {placetype = "territory"},
["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true},
}
-- provinces and territories of Canada
export.canada_group = {
default_container = "Canada",
default_placetype = "province",
data = export.canada_provinces_and_territories,
}
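-- Illustrative sketch only (commented out): following an `alias_of` field to the canonical entry, as with
-- "Yukon Territory, Canada" above. `resolve_alias` is a hypothetical helper; how the real module treats aliases
-- (for example, whether chains of aliases are allowed) is not shown in this file section and is an assumption.
--[==[
local function resolve_alias(data, key)
	local seen = {}
	local spec = data[key]
	while spec and spec.alias_of do
		if seen[key] then
			error("alias cycle involving " .. key)
		end
		seen[key] = true
		key = spec.alias_of
		spec = data[key]
	end
	return key, spec
end

local data = {
	["Yukon, Canada"] = {placetype = "territory"},
	["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true},
}
local key, spec = resolve_alias(data, "Yukon Territory, Canada")
assert(key == "Yukon, Canada" and spec.placetype == "territory")
-- A non-alias key resolves to itself.
assert(resolve_alias(data, "Yukon, Canada") == "Yukon, Canada")
]==]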
export.china_provinces_and_autonomous_regions = {
-- direct-administered municipalities are not here but below under prefecture-level cities
["Anhui, China"] = {},
["Fujian, China"] = {},
["Fuchien, China"] = {alias_of = "Fujian, China", display = true},
["Gansu, China"] = {},
["Guangdong, China"] = {},
["Guangxi, China"] = {placetype = "autonomous region"},
["Guizhou, China"] = {},
["Hainan, China"] = {},
["Hebei, China"] = {},
["Heilongjiang, China"] = {},
["Henan, China"] = {},
["Hubei, China"] = {},
["Hunan, China"] = {},
["Inner Mongolia, China"] = {placetype = "autonomous region"},
["Jiangsu, China"] = {},
["Jiangxi, China"] = {},
["Jilin, China"] = {},
["Liaoning, China"] = {},
["Ningxia, China"] = {placetype = "autonomous region"},
["Qinghai, China"] = {},
["Shaanxi, China"] = {},
["Shandong, China"] = {},
["Shanxi, China"] = {},
["Sichuan, China"] = {},
["Tibet, China"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
["Xinjiang, China"] = {placetype = "autonomous region"},
["Yunnan, China"] = {},
["Zhejiang, China"] = {},
}
-- provinces and autonomous regions of China
export.china_group = {
default_container = "China",
default_placetype = "province",
default_divs = {
"prefectures", "prefecture-level cities",
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_provinces_and_autonomous_regions,
}
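-- Illustrative sketch only (commented out): normalizing the two shapes a `divs` element takes above (a bare string
-- or a table with `type` and optional `cat_as`) and collecting the category names they would map to. It assumes
-- `cat_as` is a single string, as in this group; list-valued `cat_as` specs used elsewhere are not handled.
-- `normalize_div` is a hypothetical helper, not the module's own code.
--[==[
local function normalize_div(div)
	if type(div) == "string" then
		return {type = div, cat_as = div}
	end
	return {type = div.type, cat_as = div.cat_as or div.type}
end

local divs = {
	"prefectures",
	{type = "counties", cat_as = "counties and county-level cities"},
	{type = "county-level cities", cat_as = "counties and county-level cities"},
}
local cats = {}
for _, div in ipairs(divs) do
	cats[normalize_div(div).cat_as] = true
end
assert(cats["prefectures"])
assert(cats["counties and county-level cities"])
-- "counties" is folded into the combined category rather than getting its own.
assert(not cats["counties"])
]==]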
export.china_prefecture_level_cities = {
-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
-- administrative unit smaller than a province but bigger than a county, which is administratively controlled by
-- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior
-- to the mid-1980s, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the
-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears
-- the same name as the county-level city).
--
-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
-- province level.
--
-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
-- metro area separated by suburban/exurban or rural land.
--
-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
-- a lot of obscure cities.
--
-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
-- was >= 1 million. These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
-- "administrative area (urban population)" type), which sometimes groups nearby cities into a single agglomeration.
-- The most notable case is the Pearl River Delta, grouped under Guangzhou with an agglomeration population of
-- 72,700,000 and including a large number of nearby large cities (although for some reason not Hong Kong, maybe
-- due to the administrative issues involved). In addition, citypopulation.de includes divisions
-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
-- subdistricts, towns and townships.
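--
-- Illustrative sketch only (commented out): the initial selection rule described above, i.e. the union of
-- "prefectural population >= 6 million" and "urban population >= 2 million" (figures in millions). `qualifies`
-- is a hypothetical helper written for this note; the entries below were compiled by hand, not generated by code.
--[==[
local function qualifies(prefectural_millions, urban_millions)
	return (prefectural_millions or 0) >= 6.0 or (urban_millions or 0) >= 2.0
end

assert(qualifies(6.0, 2.5))      -- Huizhou figures from the comments below: passes both tests
assert(qualifies(4.798, 2.7))    -- Jiangmen figures from the comments below: passes on urban population only
assert(qualifies(9.0, 1.6))      -- Ganzhou figures from the comments below: passes on prefectural population only
assert(not qualifies(3.0, 1.2))  -- made-up figures: fails both tests
]==]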
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shanghai"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de
["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: Not to be confused with Cangzhou in Hebei
["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants
["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Beijing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de
["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de
["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de
["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de
["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration
["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de
["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration
["Chongqing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de
["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de
["Tianjin"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de
["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de
-- Changsha County -- 1.024 urban per citypopulation.de
["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration
["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de
["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de
["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de
["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration
["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de
["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de
["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de
["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de
["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
-- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria
["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de
-- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core).
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration
["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de
-- includes Láiwú city
["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de
-- includes Xīnjí city
["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de
["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de
["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de
["Nanning"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de
["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de
["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
["Ürümqi"] = {container = {key = "Xinjiang, China", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
["Urumqi"] = {alias_of = "Ürümqi", display = true},
["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
["Hohhot"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de
["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de
["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de
["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de
["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de
["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de
["Taizhou"] = {alias_of = "Taizhou, Zhejiang"},
["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de
["Yinchuan"] = {container = {key = "Ningxia, China", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de
["Liuzhou"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de
["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de
["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de
["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de
["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de
-- includes Dìngzhōu city and Xióngān Xīnqū
["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de
["Baotou"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de
["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de
["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de
["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de
["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de
["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de
["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de
["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de
["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de
["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de
["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de
["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de
["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de
["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de
["Guilin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de
["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de
["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de
["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de
["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de
["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de
["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de
["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de
["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de
["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de
["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de
["Jilin"] = {alias_of = "Jilin City"},
["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de
["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de
["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de
["Yulin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de
["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de
["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de
-- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash
["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de
["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de
["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de
["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de
["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de
["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de
["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de
["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de
["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de
["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de
["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de
["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de
["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de
["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de
["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de
["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de
["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de
["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de
["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de
["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de
["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de
["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de
["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de
["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de
["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de
-- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper.
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"subdistricts", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de
["Ulanhad"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de
["Chifeng"] = {alias_of = "Ulanhad"},
["Ulankhad"] = {alias_of = "Ulanhad", display = true},
["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de
["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de
["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de
["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de
-- Shuyang is a "county" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core).
-- The county itself is 37 miles by 34 miles.
["Shuyang"] = {placetype = "county", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de
-- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core).
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de
["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de
["Beihai"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de
["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de
["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de
["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de
["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de
["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de
["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de
["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de
["Guigang"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de
-- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core).
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de
-- NOTE: Not to be confused with Changzhou in Jiangsu
["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de
["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de
["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de
["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de
["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de
-- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core).
["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de
-- Three extra cities that were added in earlier versions of this table and are not listed on the "major
-- agglomerations of the world" page (https://citypopulation.de/en/world/agglomerations/, reference date 2025-01-01).
["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de
["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de
["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de
}
export.china_prefecture_level_cities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Zhejiang" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities,
}
-- Needed to avoid problems with two cities called Taizhou and Suzhou.
export.china_prefecture_level_cities_2 = {
-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
-- NOTE: "Suzhou" by itself is also a key for the larger, better-known Suzhou in Jiangsu; this alias is expected
-- (but not verified) to coexist with that key.
["Suzhou"] = {alias_of = "Suzhou, Anhui"},
}
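-- Illustrative sketch only (an addition, not part of the original data): one way a consumer of these tables might
-- follow `alias_of` entries such as "Taizhou" -> "Taizhou, Jiangsu". The name `example_resolve_alias` and the cycle
-- guard are assumptions made for this example; the module's real lookup code is not shown here and may differ.
local function example_resolve_alias(data, key)
	local seen = {}
	local spec = data[key]
	while spec and spec.alias_of do
		if seen[key] then
			error("Alias cycle detected at key '" .. key .. "'")
		end
		seen[key] = true
		key = spec.alias_of
		spec = data[key]
	end
	-- `spec` is nil if the key (or the alias target) is missing from `data`.
	return key, spec
end
-- Usage (hypothetical): example_resolve_alias(export.china_prefecture_level_cities_2, "Suzhou") returns the key
-- "Suzhou, Anhui" together with its spec table.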
export.china_prefecture_level_cities_group_2 = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Jiangsu".
key_to_placename = false,
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities_2,
}
export.finland_regions = {
["Lapland, Finland"] = {wp = "%l (%c)"},
["North Ostrobothnia, Finland"] = {},
["Northern Ostrobothnia, Finland"] = {alias_of = "North Ostrobothnia, Finland", display = true},
["Kainuu, Finland"] = {},
["North Karelia, Finland"] = {},
["Northern Savonia, Finland"] = {},
["North Savo, Finland"] = {alias_of = "Northern Savonia, Finland", display = true},
["Southern Savonia, Finland"] = {},
["South Savo, Finland"] = {alias_of = "Southern Savonia, Finland", display = true},
["South Karelia, Finland"] = {},
["Central Finland, Finland"] = {},
["South Ostrobothnia, Finland"] = {},
["Southern Ostrobothnia, Finland"] = {alias_of = "South Ostrobothnia, Finland", display = true},
["Ostrobothnia, Finland"] = {wp = "%l (region)"},
["Central Ostrobothnia, Finland"] = {},
["Pirkanmaa, Finland"] = {},
["Satakunta, Finland"] = {},
["Päijänne Tavastia, Finland"] = {},
["Päijät-Häme, Finland"] = {alias_of = "Päijänne Tavastia, Finland", display = true},
["Tavastia Proper, Finland"] = {},
["Kanta-Häme, Finland"] = {alias_of = "Tavastia Proper, Finland", display = true},
["Kymenlaakso, Finland"] = {},
["Uusimaa, Finland"] = {},
["Southwest Finland, Finland"] = {},
["Åland Islands, Finland"] = {the = true, wp = "Åland"},
["Åland, Finland"] = {alias_of = "Åland Islands, Finland"}, -- differs in "the"
}
-- regions of Finland
export.finland_group = {
default_container = "Finland",
default_placetype = "region",
default_divs = "municipalities",
data = export.finland_regions,
}
export.france_administrative_regions = {
["Auvergne-Rhône-Alpes, France"] = {},
["Bourgogne-Franche-Comté, France"] = {},
["Brittany, France"] = {wp = "%l (administrative region)"},
["Centre-Val de Loire, France"] = {},
["Corsica, France"] = {},
-- overseas departments are handled in `export.country_like_entities`
-- ["French Guiana"] = {},
["Grand Est, France"] = {},
-- ["Guadeloupe"] = {},
["Hauts-de-France, France"] = {},
["Île-de-France, France"] = {},
-- ["Martinique"] = {},
-- ["Mayotte"] = {},
["Normandy, France"] = {wp = "%l (administrative region)"},
["Nouvelle-Aquitaine, France"] = {},
["Occitania, France"] = {wp = "%l (administrative region)"},
["Occitanie, France"] = {alias_of = "Occitania, France", display = true},
["Pays de la Loire, France"] = {},
["Provence-Alpes-Côte d'Azur, France"] = {},
-- ["Réunion"] = {},
}
-- administrative regions of France
export.france_group = {
default_container = "France",
-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
-- to 'region').
default_placetype = "region",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
},
data = export.france_administrative_regions,
}
export.france_departments = {
["Ain, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 01
["Aisne, France"] = {container = "Hauts-de-France"}, -- 02
["Allier, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 03
["Alpes-de-Haute-Provence, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04
["Hautes-Alpes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05
["Alpes-Maritimes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06
["Ardèche, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 07
["Ardennes, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 08
["Ariège, France"] = {container = "Occitania", wp = "%l (department)"}, -- 09
["Aube, France"] = {container = "Grand Est"}, -- 10
["Aude, France"] = {container = "Occitania"}, -- 11
["Aveyron, France"] = {container = "Occitania"}, -- 12
["Bouches-du-Rhône, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13
["Calvados, France"] = {container = "Normandy", wp = "%l (department)"}, -- 14
["Cantal, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 15
["Charente, France"] = {container = "Nouvelle-Aquitaine"}, -- 16
["Charente-Maritime, France"] = {container = "Nouvelle-Aquitaine"}, -- 17
["Cher, France"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
["Corrèze, France"] = {container = "Nouvelle-Aquitaine"}, -- 19
["Corse-du-Sud, France"] = {container = "Corsica"}, -- 2A
["Haute-Corse, France"] = {container = "Corsica"}, -- 2B
["Côte-d'Or, France"] = {container = "Bourgogne-Franche-Comté"}, -- 21
["Côte d'Or, France"] = {alias_of = "Côte-d'Or, France", display = true},
["Côtes-d'Armor, France"] = {container = "Brittany"}, -- 22
["Côtes d'Armor, France"] = {alias_of = "Côtes-d'Armor, France", display = true},
["Creuse, France"] = {container = "Nouvelle-Aquitaine"}, -- 23
["Dordogne, France"] = {container = "Nouvelle-Aquitaine"}, -- 24
["Doubs, France"] = {container = "Bourgogne-Franche-Comté"}, -- 25
["Drôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 26
["Eure, France"] = {container = "Normandy"}, -- 27
["Eure-et-Loir, France"] = {container = "Centre-Val de Loire"}, -- 28
["Finistère, France"] = {container = "Brittany"}, -- 29
["Gard, France"] = {container = "Occitania"}, -- 30
["Haute-Garonne, France"] = {container = "Occitania"}, -- 31
["Gers, France"] = {container = "Occitania"}, -- 32
["Gironde, France"] = {container = "Nouvelle-Aquitaine"}, -- 33
["Hérault, France"] = {container = "Occitania"}, -- 34
["Ille-et-Vilaine, France"] = {container = "Brittany"}, -- 35
["Indre, France"] = {container = "Centre-Val de Loire"}, -- 36
["Indre-et-Loire, France"] = {container = "Centre-Val de Loire"}, -- 37
["Isère, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 38
["Jura, France"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39
["Landes, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40
["Loir-et-Cher, France"] = {container = "Centre-Val de Loire"}, -- 41
["Loire, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42
["Haute-Loire, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 43
["Loire-Atlantique, France"] = {container = "Pays de la Loire"}, -- 44
["Loiret, France"] = {container = "Centre-Val de Loire"}, -- 45
["Lot, France"] = {container = "Occitania", wp = "%l (department)"}, -- 46
["Lot-et-Garonne, France"] = {container = "Nouvelle-Aquitaine"}, -- 47
["Lozère, France"] = {container = "Occitania"}, -- 48
["Maine-et-Loire, France"] = {container = "Pays de la Loire"}, -- 49
["Manche, France"] = {container = "Normandy"}, -- 50
["Marne, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 51
["Haute-Marne, France"] = {container = "Grand Est"}, -- 52
["Mayenne, France"] = {container = "Pays de la Loire"}, -- 53
["Meurthe-et-Moselle, France"] = {container = "Grand Est"}, -- 54
["Meuse, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 55
["Morbihan, France"] = {container = "Brittany"}, -- 56
["Moselle, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 57
["Nièvre, France"] = {container = "Bourgogne-Franche-Comté"}, -- 58
["Nord, France"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59
["Oise, France"] = {container = "Hauts-de-France"}, -- 60
["Orne, France"] = {container = "Normandy"}, -- 61
["Pas-de-Calais, France"] = {container = "Hauts-de-France"}, -- 62
["Puy-de-Dôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 63
["Pyrénées-Atlantiques, France"] = {container = "Nouvelle-Aquitaine"}, -- 64
["Hautes-Pyrénées, France"] = {container = "Occitania"}, -- 65
["Pyrénées-Orientales, France"] = {container = "Occitania"}, -- 66
["Bas-Rhin, France"] = {container = "Grand Est"}, -- 67
["Haut-Rhin, France"] = {container = "Grand Est"}, -- 68
["Rhône, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D
["Metropolis of Lyon, France"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M
["Lyon Metropolis, France"] = {alias_of = "Metropolis of Lyon, France"},
["Lyon, France"] = {alias_of = "Metropolis of Lyon, France"},
["Haute-Saône, France"] = {container = "Bourgogne-Franche-Comté"}, -- 70
["Saône-et-Loire, France"] = {container = "Bourgogne-Franche-Comté"}, -- 71
["Sarthe, France"] = {container = "Pays de la Loire"}, -- 72
["Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 73
["Haute-Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 74
["Paris, France"] = {container = "Île-de-France"}, -- 75
["Seine-Maritime, France"] = {container = "Normandy"}, -- 76
["Seine-et-Marne, France"] = {container = "Île-de-France"}, -- 77
["Yvelines, France"] = {container = "Île-de-France"}, -- 78
["Deux-Sèvres, France"] = {container = "Nouvelle-Aquitaine"}, -- 79
["Somme, France"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
["Tarn, France"] = {container = "Occitania", wp = "%l (department)"}, -- 81
["Tarn-et-Garonne, France"] = {container = "Occitania"}, -- 82
["Var, France"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
["Vaucluse, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84
["Vendée, France"] = {container = "Pays de la Loire"}, -- 85
["Vienne, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
["Haute-Vienne, France"] = {container = "Nouvelle-Aquitaine"}, -- 87
["Vosges, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 88
["Yonne, France"] = {container = "Bourgogne-Franche-Comté"}, -- 89
["Territoire de Belfort, France"] = {container = "Bourgogne-Franche-Comté"}, -- 90
["Essonne, France"] = {container = "Île-de-France"}, -- 91
["Hauts-de-Seine, France"] = {container = "Île-de-France"}, -- 92
["Seine-Saint-Denis, France"] = {container = "Île-de-France"}, -- 93
["Val-de-Marne, France"] = {container = "Île-de-France"}, -- 94
["Val-d'Oise, France"] = {container = "Île-de-France"}, -- 95
--["Guadeloupe"] = {container = "Guadeloupe"}, -- 971
--["Martinique"] = {container = "Martinique"}, -- 972
--["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973
--["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974
--["Mayotte"] = {container = "Mayotte"}, -- 976
}
export.france_departments_group = {
placename_to_key = make_placename_to_key(", France"),
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "department",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
},
data = export.france_departments,
}
export.germany_states = {
["Baden-Württemberg, Germany"] = {},
["Bavaria, Germany"] = {},
-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
-- the default_divs setting. Better not to include them at all since they're included as cities down below.
-- ["Berlin"] = {divs = {}},
["Brandenburg, Germany"] = {},
-- ["Bremen"] = {divs = {}},
-- ["Hamburg"] = {divs = {}},
["Hesse, Germany"] = {},
["Lower Saxony, Germany"] = {},
["Mecklenburg-Vorpommern, Germany"] = {},
["Mecklenburg-Western Pomerania, Germany"] = {alias_of = "Mecklenburg-Vorpommern, Germany", display = true},
["North Rhine-Westphalia, Germany"] = {},
["Rhineland-Palatinate, Germany"] = {},
["Saarland, Germany"] = {},
["Saxony, Germany"] = {},
["Saxony-Anhalt, Germany"] = {},
["Schleswig-Holstein, Germany"] = {},
["Thuringia, Germany"] = {},
}
-- states of Germany
export.germany_group = {
default_container = "Germany",
default_placetype = "negeri",
default_divs = {"districts", "municipalities"},
data = export.germany_states,
}
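-- Illustrative sketch only (an addition, not from the original module): the comment in `export.germany_states` about
-- overriding `default_divs` suggests that a per-entry field takes precedence over the group's `default_*` field.
-- Under that assumption, the effective settings for one entry could be computed as below.
-- `example_effective_settings` is a hypothetical name, and the real module may resolve defaults differently.
local function example_effective_settings(group, key)
	local spec = group.data[key]
	if not spec then
		return nil
	end
	local function pick(entry_value, default_value)
		-- Fall back to the group default only when the entry leaves the field unset (nil).
		if entry_value ~= nil then
			return entry_value
		end
		return default_value
	end
	return {
		placetype = pick(spec.placetype, group.default_placetype),
		container = pick(spec.container, group.default_container),
		divs = pick(spec.divs, group.default_divs),
	}
end
-- Usage (hypothetical): example_effective_settings(export.germany_group, "Bavaria, Germany") yields the group's
-- defaults, because that entry is an empty table.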
export.greece_regions = {
["Attica, Greece"] = {wp = "%l (region)"},
["Central Greece, Greece"] = {wp = "%l (administrative region)"},
["Central Macedonia, Greece"] = {},
["Crete, Greece"] = {},
["Eastern Macedonia and Thrace, Greece"] = {},
["Epirus, Greece"] = {wp = "%l (region)"},
["Ionian Islands, Greece"] = {the = true, wp = "%l (region)"},
["North Aegean, Greece"] = {the = true},
-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
-- has "the" in it.
["Peloponnese, Greece"] = {wp = "%l (region)"},
["South Aegean, Greece"] = {the = true},
["Thessaly, Greece"] = {},
["Western Greece, Greece"] = {},
["Western Macedonia, Greece"] = {},
["Mount Athos, Greece"] = {placetype = {"autonomous region", "region"}, wp = "Monastic community of Mount Athos"},
}
-- regions of Greece
export.greece_group = {
default_container = "Greece",
default_placetype = "region",
data = export.greece_regions,
}
local india_polity_with_divisions = {"divisions", "districts"}
local india_polity_without_divisions = {"districts"}
-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
["Andaman and Nicobar Islands, India"] =
{the = true, placetype = "union territory", divs = india_polity_without_divisions},
["Andhra Pradesh, India"] = {divs = india_polity_without_divisions},
["Arunachal Pradesh, India"] = {divs = india_polity_with_divisions},
["Assam, India"] = {divs = india_polity_with_divisions},
["Bihar, India"] = {divs = india_polity_with_divisions},
["Chandigarh, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Chhattisgarh, India"] = {divs = india_polity_with_divisions},
["Dadra and Nagar Haveli and Daman and Diu, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Delhi, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Goa, India"] = {divs = india_polity_without_divisions},
["Gujarat, India"] = {divs = india_polity_without_divisions},
["Haryana, India"] = {divs = india_polity_with_divisions},
["Himachal Pradesh, India"] = {divs = india_polity_with_divisions},
["Jammu and Kashmir, India"] = {placetype = "union territory", divs = india_polity_with_divisions,
wp = "%l (union territory)"},
["Jharkhand, India"] = {divs = india_polity_with_divisions},
["Karnataka, India"] = {divs = india_polity_with_divisions},
["Kerala, India"] = {divs = india_polity_without_divisions},
["Ladakh, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Lakshadweep, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Madhya Pradesh, India"] = {divs = india_polity_with_divisions},
["Maharashtra, India"] = {divs = india_polity_with_divisions},
["Manipur, India"] = {divs = india_polity_without_divisions},
["Meghalaya, India"] = {divs = india_polity_with_divisions},
["Mizoram, India"] = {divs = india_polity_without_divisions},
["Nagaland, India"] = {divs = india_polity_with_divisions},
["Odisha, India"] = {divs = india_polity_with_divisions},
["Puducherry, India"] = {placetype = "union territory", divs = india_polity_without_divisions,
wp = "%l (union territory)"},
["Pondicherry, India"] = {alias_of = "Puducherry, India", display = true},
["Punjab, India"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
["Rajasthan, India"] = {divs = india_polity_with_divisions},
["Sikkim, India"] = {divs = india_polity_without_divisions},
["Tamil Nadu, India"] = {divs = india_polity_without_divisions},
["Telangana, India"] = {divs = india_polity_without_divisions},
["Tripura, India"] = {divs = india_polity_without_divisions},
["Uttar Pradesh, India"] = {divs = india_polity_with_divisions},
["Uttarakhand, India"] = {divs = india_polity_with_divisions},
["West Bengal, India"] = {divs = india_polity_with_divisions},
}
-- states and union territories of India
export.india_group = {
default_container = "India",
default_placetype = "negeri",
data = export.india_states_and_union_territories,
}
export.indonesia_provinces = {
["Aceh, Indonesia"] = {},
["Bali, Indonesia"] = {},
["Bangka Belitung Islands, Indonesia"] = {the = true},
["Banten, Indonesia"] = {},
["Bengkulu, Indonesia"] = {},
["Central Java, Indonesia"] = {},
["Central Kalimantan, Indonesia"] = {},
["Central Papua, Indonesia"] = {},
["Central Sulawesi, Indonesia"] = {},
["East Java, Indonesia"] = {},
["East Kalimantan, Indonesia"] = {},
["East Nusa Tenggara, Indonesia"] = {},
["Gorontalo, Indonesia"] = {},
["Highland Papua, Indonesia"] = {wp = "%l"},
["Special Capital Region of Jakarta, Indonesia"] = {the = true, wp = "Jakarta"},
["Jakarta, Indonesia"] = {alias_of = "Special Capital Region of Jakarta, Indonesia"},
["Jambi, Indonesia"] = {},
["Lampung, Indonesia"] = {},
["Maluku, Indonesia"] = {},
["North Kalimantan, Indonesia"] = {},
["North Maluku, Indonesia"] = {},
["North Sulawesi, Indonesia"] = {},
["North Papua, Indonesia"] = {},
["North Sumatra, Indonesia"] = {},
["Papua, Indonesia"] = {wp = "%l (province)"},
["Riau, Indonesia"] = {},
["Riau Islands, Indonesia"] = {the = true},
["Southeast Sulawesi, Indonesia"] = {},
["South Kalimantan, Indonesia"] = {},
["South Papua, Indonesia"] = {},
["South Sulawesi, Indonesia"] = {},
["South Sumatra, Indonesia"] = {},
["Southwest Papua, Indonesia"] = {},
["West Java, Indonesia"] = {},
["West Kalimantan, Indonesia"] = {},
["West Nusa Tenggara, Indonesia"] = {},
["West Papua, Indonesia"] = {wp = "%l (province)"},
["West Sulawesi, Indonesia"] = {},
["West Sumatra, Indonesia"] = {},
["Special Region of Yogyakarta, Indonesia"] = {the = true},
["Yogyakarta, Indonesia"] = {alias_of = "Special Region of Yogyakarta, Indonesia"},
}
-- provinces of Indonesia
export.indonesia_group = {
default_container = "Indonesia",
default_placetype = "province",
-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
-- spellings.
data = export.indonesia_provinces,
}
export.iran_provinces = {
["Alborz Province, Iran"] = {}, -- abbreviation AL, capital [[w:Karaj]]
["Ardabil Province, Iran"] = {}, -- abbreviation AR, capital [[w:Ardabil]]
["Bushehr Province, Iran"] = {}, -- abbreviation BU, capital [[w:Bushehr]]
["Chaharmahal and Bakhtiari Province, Iran"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]]
["East Azerbaijan Province, Iran"] = {}, -- abbreviation EA, capital [[w:Tabriz]]
["Fars Province, Iran"] = {}, -- abbreviation FA, capital [[w:Shiraz]]
["Pars Province, Iran"] = {alias_of = "Fars Province, Iran", display = true},
["Gilan Province, Iran"] = {}, -- abbreviation GN, capital [[w:Rasht]]
["Golestan Province, Iran"] = {}, -- abbreviation GO, capital [[w:Gorgan]]
["Hamadan Province, Iran"] = {}, -- abbreviation HA, capital [[w:Hamadan]]
["Hormozgan Province, Iran"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]]
["Ilam Province, Iran"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]]
["Isfahan Province, Iran"] = {}, -- abbreviation IS, capital [[w:Isfahan]]
["Kerman Province, Iran"] = {}, -- abbreviation KN, capital [[w:Kerman]]
["Kermanshah Province, Iran"] = {}, -- abbreviation KE, capital [[w:Kermanshah]]
["Khuzestan Province, Iran"] = {}, -- abbreviation KH, capital [[w:Ahvaz]]
["Kohgiluyeh and Boyer-Ahmad Province, Iran"] = {}, -- abbreviation KB, capital [[w:Yasuj]]
["Kurdistan Province, Iran"] = {}, -- abbreviation KU, capital [[w:Sanandaj]]
["Lorestan Province, Iran"] = {}, -- abbreviation LO, capital [[w:Khorramabad]]
["Markazi Province, Iran"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]]
["Mazandaran Province, Iran"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]]
["North Khorasan Province, Iran"] = {}, -- abbreviation NK, capital [[w:Bojnord]]
["Qazvin Province, Iran"] = {}, -- abbreviation QA, capital [[w:Qazvin]]
["Qom Province, Iran"] = {}, -- abbreviation QM, capital [[w:Qom]]
["Razavi Khorasan Province, Iran"] = {}, -- abbreviation RK, capital [[w:Mashhad]]
["Semnan Province, Iran"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]]
["Sistan and Baluchestan Province, Iran"] = {}, -- abbreviation SB, capital [[w:Zahedan]]
["South Khorasan Province, Iran"] = {}, -- abbreviation SK, capital [[w:Birjand]]
["Tehran Province, Iran"] = {}, -- abbreviation TE, capital [[w:Tehran]]
["West Azerbaijan Province, Iran"] = {}, -- abbreviation WA, capital [[w:Urmia]]
["Yazd Province, Iran"] = {}, -- abbreviation YA, capital [[w:Yazd]]
["Zanjan Province, Iran"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]]
}
-- provinces of Iran
export.iran_group = {
key_to_placename = make_key_to_placename(", Iran$", " Province$"),
placename_to_key = make_placename_to_key(", Iran", " Province"),
default_container = "Iran",
default_placetype = "province",
-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
-- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]],
-- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].)
-- default_divs = "counties",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.iran_provinces,
}
export.ireland_counties = {
["County Carlow, Ireland"] = {},
["County Cavan, Ireland"] = {},
["County Clare, Ireland"] = {},
["County Cork, Ireland"] = {},
["County Donegal, Ireland"] = {},
["County Dublin, Ireland"] = {},
["County Galway, Ireland"] = {},
["County Kerry, Ireland"] = {},
["County Kildare, Ireland"] = {},
["County Kilkenny, Ireland"] = {},
["County Laois, Ireland"] = {},
["County Leitrim, Ireland"] = {},
["County Limerick, Ireland"] = {},
["County Longford, Ireland"] = {},
["County Louth, Ireland"] = {},
["County Mayo, Ireland"] = {},
["County Meath, Ireland"] = {},
["County Monaghan, Ireland"] = {},
["County Offaly, Ireland"] = {},
["County Roscommon, Ireland"] = {},
["County Sligo, Ireland"] = {},
["County Tipperary, Ireland"] = {},
["County Waterford, Ireland"] = {},
["County Westmeath, Ireland"] = {},
["County Wexford, Ireland"] = {},
["County Wicklow, Ireland"] = {},
}
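-- Return a `key_to_placename` function for Irish-style county keys. The returned function strips the container
-- (matched by the Lua pattern `container_pattern`) from the key and returns two values: the full placename and the
-- elliptical placename without the leading "County ", e.g. "County Cork, Ireland" -> "County Cork", "Cork".
local function make_irish_type_key_to_placename(container_pattern)
	return function(key)
		key = key:gsub(container_pattern, "")
		local elliptical_key = key:gsub("^County ", "")
		return key, elliptical_key
	end
end
-- Return a `placename_to_key` function for Irish-style counties. The returned function prepends "County " unless the
-- placename already begins with "County " or "City ", then appends `container_suffix`, e.g.
-- "Cork" -> "County Cork, Ireland".
local function make_irish_type_placename_to_key(container_suffix)
	return function(placename)
		if not placename:find("^County ") and not placename:find("^City ") then
			placename = "County " .. placename
		end
		return placename .. container_suffix
	end
end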
-- counties of Ireland
export.ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Ireland"),
default_container = "Ireland",
default_placetype = "county",
data = export.ireland_counties,
}
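-- Illustrative sketch only (an addition, not from the original module): a round trip through a group's
-- `placename_to_key` and `key_to_placename` functions. `example_round_trip` is a hypothetical helper; it assumes both
-- fields are functions (some groups set them to `false` or omit them).
local function example_round_trip(group, placename)
	local key = group.placename_to_key(placename)
	local full, elliptical = group.key_to_placename(key)
	return key, full, elliptical
end
-- With the Irish helper functions defined above, example_round_trip(export.ireland_group, "Cork") returns
-- "County Cork, Ireland", "County Cork", "Cork".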
export.italy_administrative_regions = {
["Abruzzo, Italy"] = {},
["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Apulia, Italy"] = {},
["Basilicata, Italy"] = {},
["Calabria, Italy"] = {},
["Campania, Italy"] = {},
["Emilia-Romagna, Italy"] = {},
["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Lazio, Italy"] = {},
["Liguria, Italy"] = {},
["Lombardy, Italy"] = {},
["Marche, Italy"] = {},
["Molise, Italy"] = {},
["Piedmont, Italy"] = {},
["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Tuscany, Italy"] = {},
["Umbria, Italy"] = {},
["Veneto, Italy"] = {},
}
-- administrative regions of Italy
export.italy_group = {
default_container = "Italy",
default_placetype = "region",
data = export.italy_administrative_regions,
}
-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
["Aichi Prefecture, Japan"] = {},
["Akita Prefecture, Japan"] = {},
["Aomori Prefecture, Japan"] = {},
["Chiba Prefecture, Japan"] = {},
["Ehime Prefecture, Japan"] = {},
["Fukui Prefecture, Japan"] = {},
["Fukuoka Prefecture, Japan"] = {},
["Fukushima Prefecture, Japan"] = {},
["Gifu Prefecture, Japan"] = {},
["Gunma Prefecture, Japan"] = {},
["Hiroshima Prefecture, Japan"] = {},
["Hokkaido Prefecture, Japan"] = {divs = "subprefectures", wp = "Hokkaido"},
["Hyōgo Prefecture, Japan"] = {},
["Hyogo Prefecture, Japan"] = {alias_of = "Hyōgo Prefecture, Japan", display = true},
["Ibaraki Prefecture, Japan"] = {},
["Ishikawa Prefecture, Japan"] = {},
["Iwate Prefecture, Japan"] = {},
["Kagawa Prefecture, Japan"] = {},
["Kagoshima Prefecture, Japan"] = {},
["Kanagawa Prefecture, Japan"] = {},
["Kōchi Prefecture, Japan"] = {},
["Kochi Prefecture, Japan"] = {alias_of = "Kōchi Prefecture, Japan", display = true},
["Kumamoto Prefecture, Japan"] = {},
["Kyoto Prefecture, Japan"] = {},
["Mie Prefecture, Japan"] = {},
["Miyagi Prefecture, Japan"] = {},
["Miyazaki Prefecture, Japan"] = {},
["Nagano Prefecture, Japan"] = {},
["Nagasaki Prefecture, Japan"] = {},
["Nara Prefecture, Japan"] = {},
["Niigata Prefecture, Japan"] = {},
["Ōita Prefecture, Japan"] = {},
["Oita Prefecture, Japan"] = {alias_of = "Ōita Prefecture, Japan", display = true},
["Okayama Prefecture, Japan"] = {},
["Okinawa Prefecture, Japan"] = {},
["Osaka Prefecture, Japan"] = {},
["Saga Prefecture, Japan"] = {},
["Saitama Prefecture, Japan"] = {},
["Shiga Prefecture, Japan"] = {},
["Shimane Prefecture, Japan"] = {},
["Shizuoka Prefecture, Japan"] = {},
["Tochigi Prefecture, Japan"] = {},
["Tokushima Prefecture, Japan"] = {},
["Tottori Prefecture, Japan"] = {},
["Toyama Prefecture, Japan"] = {},
["Wakayama Prefecture, Japan"] = {},
["Yamagata Prefecture, Japan"] = {},
["Yamaguchi Prefecture, Japan"] = {},
["Yamanashi Prefecture, Japan"] = {},
}
-- prefectures of Japan
export.japan_group = {
key_to_placename = make_key_to_placename(", Japan$", " Prefecture$"),
placename_to_key = make_placename_to_key(", Japan", " Prefecture"),
default_container = "Japan",
default_placetype = "prefecture",
data = export.japan_prefectures,
}
export.laos_provinces = {
["Attapeu Province, Laos"] = {},
["Bokeo Province, Laos"] = {},
["Bolikhamxai Province, Laos"] = {},
["Champasak Province, Laos"] = {},
["Houaphanh Province, Laos"] = {},
["Khammouane Province, Laos"] = {},
["Luang Namtha Province, Laos"] = {},
["Luang Prabang Province, Laos"] = {},
["Oudomxay Province, Laos"] = {},
["Phongsaly Province, Laos"] = {},
["Salavan Province, Laos"] = {},
["Savannakhet Province, Laos"] = {},
["Vientiane Province, Laos"] = {},
["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
["Sainyabuli Province, Laos"] = {},
["Sekong Province, Laos"] = {},
["Xaisomboun Province, Laos"] = {},
["Xiangkhouang Province, Laos"] = {},
}
local function laos_placename_to_key(placename)
if placename == "Vientiane Prefecture" then
return placename .. ", Laos"
end
if placename:find(" Province$") then
return placename .. ", Laos"
end
return placename .. " Province, Laos"
end
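-- Illustration only, derived by reading laos_placename_to_key above (not from separate documentation):
--   laos_placename_to_key("Vientiane Prefecture") --> "Vientiane Prefecture, Laos"
--   laos_placename_to_key("Bokeo Province")       --> "Bokeo Province, Laos"
--   laos_placename_to_key("Bokeo")                --> "Bokeo Province, Laos"
-- A bare "Vientiane" therefore resolves to "Vientiane Province, Laos", not to the prefecture.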
-- provinces of Laos
export.laos_group = {
key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
placename_to_key = laos_placename_to_key,
default_container = "Laos",
default_placetype = "province",
-- For obscure reasons, Wikipedia article titles for provinces of Iran, Laos, Thailand and Vietnam use lowercase
-- 'province'.
default_wp = "%e province",
data = export.laos_provinces,
}
export.lebanon_governorates = {
["Akkar Governorate, Lebanon"] = {},
["Baalbek-Hermel Governorate, Lebanon"] = {},
["Beirut Governorate, Lebanon"] = {},
["Beqaa Governorate, Lebanon"] = {},
["Keserwan-Jbeil Governorate, Lebanon"] = {},
["Mount Lebanon Governorate, Lebanon"] = {},
["Nabatieh Governorate, Lebanon"] = {},
-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
-- `gov/South Governorate` with `c/Lebanon`.
["North Governorate, Lebanon"] = {no_auto_augment_container = true},
["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}
-- governorates of Lebanon
export.lebanon_group = {
key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
default_container = "Lebanon",
default_placetype = "governorate",
data = export.lebanon_governorates,
}
export.malaysia_states = {
["Johor, Malaysia"] = {},
["Kedah, Malaysia"] = {},
["Kelantan, Malaysia"] = {},
["Malacca, Malaysia"] = {},
["Negeri Sembilan, Malaysia"] = {},
["Pahang, Malaysia"] = {},
["Penang, Malaysia"] = {},
["Perak, Malaysia"] = {},
["Perlis, Malaysia"] = {},
["Sabah, Malaysia"] = {},
["Sarawak, Malaysia"] = {},
["Selangor, Malaysia"] = {},
["Terengganu, Malaysia"] = {},
}
-- states of Malaysia
export.malaysia_group = {
default_container = "Malaysia",
default_placetype = "negeri",
default_wp = "%l, %c",
data = export.malaysia_states,
}
export.malta_regions = {
-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
-- `r/Northern Region` with `c/Malta`. In particular:
-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
-- El Salvador;
-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
-- Serbia and Uganda;
-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true},
["Gozo Region, Malta"] = {wp = "%l"},
["Northern Region, Malta"] = {no_auto_augment_container = true},
["Port Region, Malta"] = {},
["Southern Region, Malta"] = {no_auto_augment_container = true},
["Western Region, Malta"] = {no_auto_augment_container = true},
}
-- regions of Malta
export.malta_group = {
key_to_placename = make_key_to_placename(", Malta$", " Region$"),
placename_to_key = make_placename_to_key(", Malta", " Region"),
default_container = "Malta",
default_placetype = "region",
default_wp = "%l, %c",
default_the = true,
data = export.malta_regions,
}
export.mexico_states = {
["Aguascalientes, Mexico"] = {},
["Baja California, Mexico"] = {},
-- not display-canonicalizing because the "Norte" could be for emphasis
["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
["Baja California Sur, Mexico"] = {},
["Campeche, Mexico"] = {},
["Chiapas, Mexico"] = {},
["Chihuahua, Mexico"] = {wp = "%l (state)"},
["Coahuila, Mexico"] = {},
["Colima, Mexico"] = {},
["Durango, Mexico"] = {},
["Guanajuato, Mexico"] = {},
["Guerrero, Mexico"] = {},
["Hidalgo, Mexico"] = {wp = "%l (state)"},
["Jalisco, Mexico"] = {},
["State of Mexico, Mexico"] = {the = true},
["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
["Michoacán, Mexico"] = {},
["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
["Morelos, Mexico"] = {},
["Nayarit, Mexico"] = {},
["Nuevo León, Mexico"] = {},
["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
["Oaxaca, Mexico"] = {},
["Puebla, Mexico"] = {},
["Querétaro, Mexico"] = {},
["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
["Quintana Roo, Mexico"] = {},
["San Luis Potosí, Mexico"] = {},
["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
["Sinaloa, Mexico"] = {},
["Sonora, Mexico"] = {},
["Tabasco, Mexico"] = {},
["Tamaulipas, Mexico"] = {},
["Tlaxcala, Mexico"] = {},
["Veracruz, Mexico"] = {},
["Yucatán, Mexico"] = {},
["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true},
["Zacatecas, Mexico"] = {},
}
-- Mexican states
export.mexico_group = {
default_container = "Mexico",
default_placetype = "negeri",
data = export.mexico_states,
}
export.moldova_districts_and_autonomous_territorial_units = {
["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]]
["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]]
["Briceni District, Moldova"] = {}, -- capital [[Briceni]]
["Cahul District, Moldova"] = {}, -- capital [[Cahul]]
["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]]
["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]]
["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]]
["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]]
["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]]
["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]]
["Drochia District, Moldova"] = {}, -- capital [[Drochia]]
["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]]
["Edineț District, Moldova"] = {}, -- capital [[Edineț]]
["Fălești District, Moldova"] = {}, -- capital [[Fălești]]
["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]]
["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]]
["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]]
["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]]
["Leova District, Moldova"] = {}, -- capital [[Leova]]
["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]]
["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]]
["Orhei District, Moldova"] = {}, -- capital [[Orhei]]
["Rezina District, Moldova"] = {}, -- capital [[Rezina]]
["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]]
["Sîngerei District, Moldova"] = {}, -- capital [[Sîngerei]]
["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
["Chișinău, Moldova"] = {placetype = "municipality"},
["Bălți, Moldova"] = {placetype = "municipality"},
["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Comrat]]
-- the remainder are under the de facto control of the unrecognized state of Transnistria
["Bender, Moldova"] = {placetype = "municipality"},
["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Tiraspol]]
["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}
local function moldova_placename_to_key(placename)
local elliptical_key = placename .. ", Moldova"
if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
return elliptical_key
end
if placename:find(" District$") then
return placename .. ", Moldova"
end
return placename .. " District, Moldova"
end
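-- Illustration only, derived by reading moldova_placename_to_key above together with the data table:
--   moldova_placename_to_key("Gagauzia")       --> "Gagauzia, Moldova"       (listed without "District")
--   moldova_placename_to_key("Tighina")        --> "Tighina, Moldova"        (alias entries are matched too)
--   moldova_placename_to_key("Cahul District") --> "Cahul District, Moldova"
--   moldova_placename_to_key("Cahul")          --> "Cahul District, Moldova"
-- The table lookup runs first, so municipalities and autonomous units never get " District" appended.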
-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
key_to_placename = make_key_to_placename(", Moldova$", " District$"),
placename_to_key = moldova_placename_to_key,
default_container = "Moldova",
default_placetype = {"district", "raion"},
default_divs = "communes",
data = export.moldova_districts_and_autonomous_territorial_units,
}
export.morocco_regions = {
["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
["Oriental, Morocco"] = {wp = "%l (%c)"},
["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
["Fez-Meknes, Morocco"] = {},
["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
["Casablanca-Settat, Morocco"] = {},
["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
["Souss-Massa, Morocco"] = {},
["Guelmim-Oued Noun, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]",
},
["Laayoune-Sakia El Hamra, Morocco"] = {
wp = "Laâyoune-Sakia El Hamra",
keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
},
["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
["Dakhla-Oued Ed-Dahab, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]",
},
}
-- regions of Morocco
export.morocco_group = {
default_container = "Morocco",
default_placetype = "region",
data = export.morocco_regions,
}
export.egypt_governorates = {
["Cairo Governorate, Egypt"] = {},
["Giza Governorate, Egypt"] = {},
["Sharqia Governorate, Egypt"] = {},
["Dakahlia Governorate, Egypt"] = {},
["Beheira Governorate, Egypt"] = {},
["Minya Governorate, Egypt"] = {},
["Qalyubia Governorate, Egypt"] = {},
["Sohag Governorate, Egypt"] = {},
["Alexandria Governorate, Egypt"] = {},
["Gharbia Governorate, Egypt"] = {},
["Asyut Governorate, Egypt"] = {},
["Monufia Governorate, Egypt"] = {},
["Faiyum Governorate, Egypt"] = {},
["Kafr El Sheikh Governorate, Egypt"] = {},
["Qena Governorate, Egypt"] = {},
["Beni Suef Governorate, Egypt"] = {},
["Damietta Governorate, Egypt"] = {},
["Aswan Governorate, Egypt"] = {},
["Ismailia Governorate, Egypt"] = {},
["Luxor Governorate, Egypt"] = {},
["Suez Governorate, Egypt"] = {},
["Port Said Governorate, Egypt"] = {},
["Matrouh Governorate, Egypt"] = {},
["North Sinai Governorate, Egypt"] = {},
["Red Sea Governorate, Egypt"] = {},
["New Valley Governorate, Egypt"] = {},
["South Sinai Governorate, Egypt"] = {},
}
-- governorates of Egypt
export.egypt_group = {
key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"),
placename_to_key = make_placename_to_key(", Egypt", " Governorate"),
default_container = "Egypt",
default_placetype = "governorate",
data = export.egypt_governorates,
}
export.netherlands_provinces = {
["Drenthe, Netherlands"] = {},
["Flevoland, Netherlands"] = {},
["Friesland, Netherlands"] = {},
["Gelderland, Netherlands"] = {},
["Groningen, Netherlands"] = {wp = "%l (province)"},
["Limburg, Netherlands"] = {wp = "%l (%c)"},
["North Brabant, Netherlands"] = {},
-- Foreign forms get display-canonicalized.
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true},
["North Holland, Netherlands"] = {},
["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true},
["Overijssel, Netherlands"] = {},
["South Holland, Netherlands"] = {},
["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true},
["Utrecht, Netherlands"] = {wp = "%l (province)"},
["Zeeland, Netherlands"] = {},
}
-- provinces of the Netherlands
export.netherlands_group = {
default_container = "Netherlands",
default_placetype = "province",
default_divs = "municipalities",
data = export.netherlands_provinces,
}
export.new_zealand_regions = {
-- North Island regions
["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]]
["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]]
["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]]
["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]]
["Gisborne, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]]
["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]]
["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]]
["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]]
["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]]
-- South Island regions
["Tasman, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]]
["Nelson, New Zealand"] = {placetype = {"region", "city"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]]
["Marlborough, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]]
["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]]
["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]]
["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]]
["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]]
}
-- regions of New Zealand
export.new_zealand_group = {
default_container = "New Zealand",
default_placetype = "region",
data = export.new_zealand_regions,
}
export.nigeria_states = {
["Abia State, Nigeria"] = {},
["Adamawa State, Nigeria"] = {},
["Akwa Ibom State, Nigeria"] = {},
["Anambra State, Nigeria"] = {},
["Bauchi State, Nigeria"] = {},
["Bayelsa State, Nigeria"] = {},
["Benue State, Nigeria"] = {},
["Borno State, Nigeria"] = {},
["Cross River State, Nigeria"] = {},
["Delta State, Nigeria"] = {},
["Ebonyi State, Nigeria"] = {},
["Edo State, Nigeria"] = {},
["Ekiti State, Nigeria"] = {},
["Enugu State, Nigeria"] = {},
["Federal Capital Territory, Nigeria"] = {
-- not a state but allow it to be referenced as one in holonyms
placetype = {"wilayah persekutuan", "territory", "negeri"}, the = true, wp = "%l (%c)",
},
["Gombe State, Nigeria"] = {},
["Imo State, Nigeria"] = {},
["Jigawa State, Nigeria"] = {},
["Kaduna State, Nigeria"] = {},
["Kano State, Nigeria"] = {},
["Katsina State, Nigeria"] = {},
["Kebbi State, Nigeria"] = {},
["Kogi State, Nigeria"] = {},
["Kwara State, Nigeria"] = {},
["Lagos State, Nigeria"] = {},
["Nasarawa State, Nigeria"] = {},
["Niger State, Nigeria"] = {},
["Ogun State, Nigeria"] = {},
["Ondo State, Nigeria"] = {},
["Osun State, Nigeria"] = {},
["Oyo State, Nigeria"] = {},
["Plateau State, Nigeria"] = {},
["Rivers State, Nigeria"] = {},
["Sokoto State, Nigeria"] = {},
["Taraba State, Nigeria"] = {},
["Yobe State, Nigeria"] = {},
["Zamfara State, Nigeria"] = {},
}
-- states of Nigeria
export.nigeria_group = {
key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
placename_to_key = make_placename_to_key(", Nigeria", " State"),
default_container = "Nigeria",
default_placetype = "negeri",
data = export.nigeria_states,
}
export.north_korea_provinces = {
["Chagang Province, North Korea"] = {},
["North Hamgyong Province, North Korea"] = {},
["South Hamgyong Province, North Korea"] = {},
["North Hwanghae Province, North Korea"] = {},
["South Hwanghae Province, North Korea"] = {},
["Kangwon Province, North Korea"] = {wp = "%l (%c)"},
["North Pyongan Province, North Korea"] = {},
["South Pyongan Province, North Korea"] = {},
["Ryanggang Province, North Korea"] = {},
}
-- provinces of North Korea
export.north_korea_group = {
key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
placename_to_key = make_placename_to_key(", North Korea", " Province"),
default_container = "North Korea",
default_placetype = "province",
data = export.north_korea_provinces,
}
export.norwegian_counties = {
["Oslo, Norway"] = {},
["Rogaland, Norway"] = {},
["Møre og Romsdal, Norway"] = {},
["Nordland, Norway"] = {},
["Østfold, Norway"] = {},
["Akershus, Norway"] = {},
["Buskerud, Norway"] = {},
-- the following two were merged into Innlandet
-- ["Hedmark, Norway"] = {},
-- ["Oppland, Norway"] = {},
["Innlandet, Norway"] = {},
["Vestfold, Norway"] = {},
["Telemark, Norway"] = {},
-- the following two were merged into Agder
-- ["Aust-Agder, Norway"] = {},
-- ["Vest-Agder, Norway"] = {},
["Agder, Norway"] = {},
-- the following two were merged into Vestland
-- ["Hordaland, Norway"] = {},
-- ["Sogn og Fjordane, Norway"] = {},
["Vestland, Norway"] = {},
["Trøndelag, Norway"] = {},
["Troms, Norway"] = {},
["Finnmark, Norway"] = {},
}
-- counties of Norway
export.norway_group = {
default_container = "Norway",
default_placetype = "county",
data = export.norwegian_counties,
}
export.pakistan_provinces_and_territories = {
["Azad Kashmir, Pakistan"] = {
placetype = {"administrative territory", "autonomous territory", "territory"},
},
["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true},
["Balochistan, Pakistan"] = {wp = "%l, %c"},
["Gilgit-Baltistan, Pakistan"] = {
placetype = {"administrative territory", "territory"},
},
["Islamabad Capital Territory, Pakistan"] = {
the = true,
divs = {}, -- no divisions
placetype = {"wilayah persekutuan", "administrative territory", "territory"},
},
-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"},
["Khyber Pakhtunkhwa, Pakistan"] = {},
["Punjab, Pakistan"] = {wp = "%l, %c"},
["Sindh, Pakistan"] = {},
}
-- provinces and territories of Pakistan
export.pakistan_group = {
default_container = "Pakistan",
default_placetype = "province",
default_divs = "divisions",
data = export.pakistan_provinces_and_territories,
}
export.philippines_provinces = {
["Abra, Philippines"] = {wp = "%l (province)"},
["Agusan del Norte, Philippines"] = {},
["Agusan del Sur, Philippines"] = {},
["Aklan, Philippines"] = {},
["Albay, Philippines"] = {},
["Antique, Philippines"] = {wp = "%l (province)"},
["Apayao, Philippines"] = {},
["Aurora, Philippines"] = {wp = "%l (province)"},
["Basilan, Philippines"] = {},
["Bataan, Philippines"] = {},
["Batanes, Philippines"] = {},
["Batangas, Philippines"] = {},
["Benguet, Philippines"] = {},
["Biliran, Philippines"] = {},
["Bohol, Philippines"] = {},
["Bukidnon, Philippines"] = {},
["Bulacan, Philippines"] = {},
["Cagayan, Philippines"] = {},
["Camarines Norte, Philippines"] = {},
["Camarines Sur, Philippines"] = {},
["Camiguin, Philippines"] = {},
["Capiz, Philippines"] = {},
["Catanduanes, Philippines"] = {},
["Cavite, Philippines"] = {},
["Cebu, Philippines"] = {},
["Cotabato, Philippines"] = {},
["Davao de Oro, Philippines"] = {},
["Davao del Norte, Philippines"] = {},
["Davao del Sur, Philippines"] = {},
["Davao Occidental, Philippines"] = {},
["Davao Oriental, Philippines"] = {},
["Dinagat Islands, Philippines"] = {the = true},
["Eastern Samar, Philippines"] = {},
["Guimaras, Philippines"] = {},
["Ifugao, Philippines"] = {},
["Ilocos Norte, Philippines"] = {},
["Ilocos Sur, Philippines"] = {},
["Iloilo, Philippines"] = {},
["Isabela, Philippines"] = {wp = "%l (province)"},
["Kalinga, Philippines"] = {wp = "%l (province)"},
["La Union, Philippines"] = {},
["Laguna, Philippines"] = {wp = "%l (province)"},
["Lanao del Norte, Philippines"] = {},
["Lanao del Sur, Philippines"] = {},
["Leyte, Philippines"] = {wp = "%l (province)"},
["Maguindanao del Norte, Philippines"] = {},
["Maguindanao del Sur, Philippines"] = {},
["Marinduque, Philippines"] = {},
["Masbate, Philippines"] = {},
["Misamis Occidental, Philippines"] = {},
["Misamis Oriental, Philippines"] = {},
["Mountain Province, Philippines"] = {},
["Negros Occidental, Philippines"] = {},
["Negros Oriental, Philippines"] = {},
["Northern Samar, Philippines"] = {},
["Nueva Ecija, Philippines"] = {},
["Nueva Vizcaya, Philippines"] = {},
["Occidental Mindoro, Philippines"] = {},
["Oriental Mindoro, Philippines"] = {},
["Palawan, Philippines"] = {},
["Pampanga, Philippines"] = {},
["Pangasinan, Philippines"] = {},
["Quezon, Philippines"] = {},
["Quirino, Philippines"] = {},
["Rizal, Philippines"] = {wp = "%l (province)"},
["Romblon, Philippines"] = {},
["Samar, Philippines"] = {wp = "%l (province)"},
["Sarangani, Philippines"] = {},
["Siquijor, Philippines"] = {},
["Sorsogon, Philippines"] = {},
["South Cotabato, Philippines"] = {},
["Southern Leyte, Philippines"] = {},
["Sultan Kudarat, Philippines"] = {},
["Sulu, Philippines"] = {},
["Surigao del Norte, Philippines"] = {},
["Surigao del Sur, Philippines"] = {},
["Tarlac, Philippines"] = {},
["Tawi-Tawi, Philippines"] = {},
["Zambales, Philippines"] = {},
["Zamboanga del Norte, Philippines"] = {},
["Zamboanga del Sur, Philippines"] = {},
["Zamboanga Sibugay, Philippines"] = {},
-- not a province but treated as one; allow it to be referred to as a province in holonyms
["Metro Manila, Philippines"] = {placetype = {"region", "province"}},
}
-- provinces of the Philippines
export.philippines_group = {
default_container = "Philippines",
default_placetype = "province",
default_divs = {"municipalities", "barangays"},
data = export.philippines_provinces,
}
export.poland_voivodeships = {
["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław
["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin
["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków
["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw
["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole
["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów
["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok
["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk
["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice
["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}
-- voivodeships of Poland
export.poland_group = {
key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
default_container = "Poland",
default_placetype = "voivodeship",
default_divs = {
-- "counties", -- not enough of them currently
{type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}},
},
data = export.poland_voivodeships,
}
export.portugal_districts_and_autonomous_regions = {
["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "region"}},
["Aveiro District, Portugal"] = {},
["Beja District, Portugal"] = {},
["Braga District, Portugal"] = {},
["Bragança District, Portugal"] = {},
["Castelo Branco District, Portugal"] = {},
["Coimbra District, Portugal"] = {},
["Évora District, Portugal"] = {},
["Faro District, Portugal"] = {},
["Guarda District, Portugal"] = {},
["Leiria District, Portugal"] = {},
["Lisbon District, Portugal"] = {},
["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
["Madeira, Portugal"] = {placetype = {"autonomous region", "region"}},
["Portalegre District, Portugal"] = {},
["Porto District, Portugal"] = {},
["Santarém District, Portugal"] = {},
["Setúbal District, Portugal"] = {},
["Viana do Castelo District, Portugal"] = {},
["Vila Real District, Portugal"] = {},
["Viseu District, Portugal"] = {},
}
local function portugal_placename_to_key(placename)
if placename == "Azores" or placename == "Madeira" then
return placename .. ", Portugal"
end
if placename:find(" District$") then
return placename .. ", Portugal"
end
return placename .. " District, Portugal"
end
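-- Illustration only, derived by reading portugal_placename_to_key above:
--   portugal_placename_to_key("Azores")        --> "Azores, Portugal"
--   portugal_placename_to_key("Madeira")       --> "Madeira, Portugal"
--   portugal_placename_to_key("Faro District") --> "Faro District, Portugal"
--   portugal_placename_to_key("Faro")          --> "Faro District, Portugal"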
-- districts and autonomous regions of Portugal
export.portugal_group = {
key_to_placename = make_key_to_placename(", Portugal$", " District$"),
placename_to_key = portugal_placename_to_key,
default_container = "Portugal",
default_placetype = "district",
default_divs = "municipalities",
data = export.portugal_districts_and_autonomous_regions,
}
export.romania_counties = {
["Alba County, Romania"] = {},
["Arad County, Romania"] = {},
["Argeș County, Romania"] = {},
["Bacău County, Romania"] = {},
["Bihor County, Romania"] = {},
["Bistrița-Năsăud County, Romania"] = {},
["Botoșani County, Romania"] = {},
["Brașov County, Romania"] = {},
["Brăila County, Romania"] = {},
-- Bucharest: not in a county
["Buzău County, Romania"] = {},
["Caraș-Severin County, Romania"] = {},
["Cluj County, Romania"] = {},
["Constanța County, Romania"] = {},
["Covasna County, Romania"] = {},
["Călărași County, Romania"] = {},
["Dolj County, Romania"] = {},
["Dâmbovița County, Romania"] = {},
["Galați County, Romania"] = {},
["Giurgiu County, Romania"] = {},
["Gorj County, Romania"] = {},
["Harghita County, Romania"] = {},
["Hunedoara County, Romania"] = {},
["Ialomița County, Romania"] = {},
["Iași County, Romania"] = {},
["Ilfov County, Romania"] = {},
["Maramureș County, Romania"] = {},
["Mehedinți County, Romania"] = {},
["Mureș County, Romania"] = {},
["Neamț County, Romania"] = {},
["Olt County, Romania"] = {},
["Prahova County, Romania"] = {},
["Satu Mare County, Romania"] = {},
["Sibiu County, Romania"] = {},
["Suceava County, Romania"] = {},
["Sălaj County, Romania"] = {},
["Teleorman County, Romania"] = {},
["Timiș County, Romania"] = {},
["Tulcea County, Romania"] = {},
["Vaslui County, Romania"] = {},
["Vrancea County, Romania"] = {},
["Vâlcea County, Romania"] = {},
}
-- counties of Romania
export.romania_group = {
key_to_placename = make_key_to_placename(", Romania$", " County$"),
placename_to_key = make_placename_to_key(", Romania", " County"),
default_container = "Romania",
default_placetype = "county",
default_divs = "communes",
data = export.romania_counties,
}
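-- Build the spec shared by all Russian federal subjects of type `spectype` (e.g. "krai", "oblast", "republic").
-- `use_the` may be any truthy value and is normalized to a boolean; `wp`, if given, is stored as the spec's `wp` field.
local function make_russia_federal_subject_spec(spectype, use_the, wp)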
return {
placetype = spectype,
the = not not use_the,
bare_category_parent_type = {"federal subjects", spectype .. "s"},
wp = wp,
}
end
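-- Illustration only, derived by reading make_russia_federal_subject_spec above:
--   make_russia_federal_subject_spec("krai") returns
--     {placetype = "krai", the = false, bare_category_parent_type = {"federal subjects", "krais"}}
--   make_russia_federal_subject_spec("republic", "use the") returns the same shape with placetype = "republic",
--   the = true and "republics" as the second parent type.
-- Each spec table built below is shared by reference among all the entries that use it.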
local russia_autonomous_okrug_no_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"},
the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")
export.russia_federal_subjects = {
-- autonomous oblasts
["Jewish Autonomous Oblast, Russia"] =
{the = true, placetype = {"autonomous oblast", "oblast"},
bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
-- autonomous okrugs
["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"},
["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"},
["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"},
-- krais
["Altai Krai, Russia"] = russia_krai,
["Kamchatka Krai, Russia"] = russia_krai,
["Khabarovsk Krai, Russia"] = russia_krai,
["Krasnodar Krai, Russia"] = russia_krai,
["Krasnoyarsk Krai, Russia"] = russia_krai,
["Perm Krai, Russia"] = russia_krai,
["Primorsky Krai, Russia"] = russia_krai,
["Stavropol Krai, Russia"] = russia_krai,
["Zabaykalsky Krai, Russia"] = russia_krai,
-- oblasts
["Amur Oblast, Russia"] = russia_oblast,
["Arkhangelsk Oblast, Russia"] = russia_oblast,
["Astrakhan Oblast, Russia"] = russia_oblast,
["Belgorod Oblast, Russia"] = russia_oblast,
["Bryansk Oblast, Russia"] = russia_oblast,
["Chelyabinsk Oblast, Russia"] = russia_oblast,
["Irkutsk Oblast, Russia"] = russia_oblast,
["Ivanovo Oblast, Russia"] = russia_oblast,
["Kaliningrad Oblast, Russia"] = russia_oblast,
["Kaluga Oblast, Russia"] = russia_oblast,
["Kemerovo Oblast, Russia"] = russia_oblast,
["Kirov Oblast, Russia"] = russia_oblast,
["Kostroma Oblast, Russia"] = russia_oblast,
["Kurgan Oblast, Russia"] = russia_oblast,
["Kursk Oblast, Russia"] = russia_oblast,
["Leningrad Oblast, Russia"] = russia_oblast,
["Lipetsk Oblast, Russia"] = russia_oblast,
["Magadan Oblast, Russia"] = russia_oblast,
["Moscow Oblast, Russia"] = russia_oblast,
["Murmansk Oblast, Russia"] = russia_oblast,
["Nizhny Novgorod Oblast, Russia"] = russia_oblast,
["Novgorod Oblast, Russia"] = russia_oblast,
["Novosibirsk Oblast, Russia"] = russia_oblast,
["Omsk Oblast, Russia"] = russia_oblast,
["Orenburg Oblast, Russia"] = russia_oblast,
["Oryol Oblast, Russia"] = russia_oblast,
["Penza Oblast, Russia"] = russia_oblast,
["Pskov Oblast, Russia"] = russia_oblast,
["Rostov Oblast, Russia"] = russia_oblast,
["Ryazan Oblast, Russia"] = russia_oblast,
["Sakhalin Oblast, Russia"] = russia_oblast,
["Samara Oblast, Russia"] = russia_oblast,
["Saratov Oblast, Russia"] = russia_oblast,
["Smolensk Oblast, Russia"] = russia_oblast,
["Sverdlovsk Oblast, Russia"] = russia_oblast,
["Tambov Oblast, Russia"] = russia_oblast,
["Tomsk Oblast, Russia"] = russia_oblast,
["Tula Oblast, Russia"] = russia_oblast,
["Tver Oblast, Russia"] = russia_oblast,
["Tyumen Oblast, Russia"] = russia_oblast,
["Ulyanovsk Oblast, Russia"] = russia_oblast,
["Vladimir Oblast, Russia"] = russia_oblast,
["Volgograd Oblast, Russia"] = russia_oblast,
["Vologda Oblast, Russia"] = russia_oblast,
["Voronezh Oblast, Russia"] = russia_oblast,
["Yaroslavl Oblast, Russia"] = russia_oblast,
-- republics
--
-- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where
-- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by
-- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence
-- of "the".
["Adygea, Russia"] = russia_republic_no_the,
["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true},
["Bashkortostan, Russia"] = russia_republic_no_the,
["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true},
["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"},
["Buryatia, Russia"] = russia_republic_no_the,
["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true},
["Dagestan, Russia"] = russia_republic_no_the,
["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true},
["Ingushetia, Russia"] = russia_republic_no_the,
["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true},
["Kalmykia, Russia"] = russia_republic_no_the,
["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true},
["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"),
["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true},
["Khakassia, Russia"] = russia_republic_no_the,
["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true},
["Mordovia, Russia"] = russia_republic_no_the,
["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true},
["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash
["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true},
["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Tatarstan, Russia"] = russia_republic_no_the,
["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true},
["Altai Republic, Russia"] = russia_republic_the,
["Chechnya, Russia"] = russia_republic_no_the,
["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true},
["Chuvashia, Russia"] = russia_republic_no_the,
["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true},
["Kabardino-Balkaria, Russia"] = russia_republic_no_the,
["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true},
["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true},
["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia",
display = "Kabardino-Balkarian Republic, Russia", the = true},
["Karachay-Cherkessia, Russia"] = russia_republic_no_the,
["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia", the = true},
["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"),
["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true},
["Mari El, Russia"] = russia_republic_no_the,
["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true},
["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"),
["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia",
the = true},
["Tuva, Russia"] = russia_republic_no_the,
["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
["Udmurtia, Russia"] = russia_republic_no_the,
["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
-- Not included due to being unrecognized and only partly controlled:
-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
-- There are also federal cities (not included because they're cities):
-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}
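-- Convert a key such as "Samara Oblast, Russia" into its full and elliptical placenames.
local function russia_key_to_placename(key)
	-- Strip the container (everything from the first comma onward).
	key = key:gsub(",.*", "")
	local full_placename = key
	-- Special case: never shortened (stripping " Oblast" would leave "Jewish Autonomous").
	if key == "Jewish Autonomous Oblast" then
		return full_placename, full_placename
	end
	local elliptical_placename
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		elliptical_placename = key:match("^(.*) " .. suffix .. "$")
		if elliptical_placename then
			return full_placename, elliptical_placename
		end
	end
	return full_placename, full_placename
end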
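-- Worked example (an illustrative sketch traced by hand from russia_key_to_placename() above, not taken from any
-- test suite):
--   russia_key_to_placename("Samara Oblast, Russia")            --> "Samara Oblast", "Samara"
--   russia_key_to_placename("Perm Krai, Russia")                --> "Perm Krai", "Perm"
--   russia_key_to_placename("Tatarstan, Russia")                --> "Tatarstan", "Tatarstan"
--   russia_key_to_placename("Jewish Autonomous Oblast, Russia") --> "Jewish Autonomous Oblast" (both values)
-- Only the suffixes "Krai" and "Oblast" are stripped, so a name such as "Altai Republic" comes back unchanged in
-- both positions.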
-- Convert a user-specified placename into a key of `export.russia_federal_subjects`.
local function russia_placename_to_key(placename)
	local key = placename .. ", Russia"
	if export.russia_federal_subjects[key] then
		return key
	end
	-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		local suffixed_key = placename .. " " .. suffix .. ", Russia"
		if export.russia_federal_subjects[suffixed_key] then
			return suffixed_key
		end
	end
	-- No match: fall back to the unsuffixed key.
	return placename .. ", Russia"
end
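-- Worked example (a hand-traced sketch of russia_placename_to_key() above; it assumes the table holds no bare
-- "Samara, Russia" or "Perm, Russia" key, which has not been checked against the full table, and "Nowhere" is a
-- made-up placename used only for illustration):
--   russia_placename_to_key("Tatarstan")     --> "Tatarstan, Russia"      (direct hit)
--   russia_placename_to_key("Samara Oblast") --> "Samara Oblast, Russia"  (direct hit)
--   russia_placename_to_key("Samara")        --> "Samara Oblast, Russia"  (found by appending " Oblast")
--   russia_placename_to_key("Perm")          --> "Perm Krai, Russia"      (found by appending " Krai")
--   russia_placename_to_key("Nowhere")       --> "Nowhere, Russia"        (no match; unsuffixed key returned as-is)
-- "Krai" is tried before "Oblast", so a name that existed with both suffixes would resolve to the krai.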
-- Build the default description (keydesc) for a federal subject key.
local function construct_russia_federal_subject_keydesc(group, key, spec)
	-- Strip the container (everything from the first comma onward).
	local placename = key:gsub(",.*", "")
	local linked_placename = export.construct_linked_placename(spec, placename)
	-- A spec may list several placetypes; the first one is used in the description.
	local placetype = spec.placetype
	if type(placetype) == "table" then
		placetype = placetype[1]
	end
	if placetype == "oblast" then
		-- Hack: Oblasts generally don't have entries under "Foo Oblast"
		-- but just under "Foo", so fix the linked key appropriately;
		-- doesn't apply to the Jewish Autonomous Oblast
		linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
	end
	return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end
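-- Worked example (a sketch of construct_russia_federal_subject_keydesc() above; `export.construct_linked_placename`
-- is defined elsewhere, so the link text and the `spec.placetype` values shown here are assumptions made for
-- illustration only):
--   key "Samara Oblast, Russia", placetype "oblast", assumed link "[[Samara Oblast]]"
--     --> "[[Samara]] Oblast, a [[federal subject]] ([[oblast]]) of [[Russia]]"
--   key "Perm Krai, Russia", placetype "krai", assumed link "[[Perm Krai]]"
--     --> "[[Perm Krai]], a [[federal subject]] ([[krai]]) of [[Russia]]"
-- The oblast rewrite only fires when the linked text contains " Oblast]]"; any other link is passed through
-- unchanged.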
-- federal subjects of Russia
export.russia_group = {
key_to_placename = russia_key_to_placename,
placename_to_key = russia_placename_to_key,
default_container = "Russia",
default_keydesc = construct_russia_federal_subject_keydesc,
default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
data = export.russia_federal_subjects,
}
export.saudi_arabia_provinces = {
["Riyadh Province, Saudi Arabia"] = {},
["Mecca Province, Saudi Arabia"] = {},
-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"},
["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"},
["Aseer Province, Saudi Arabia"] = {wp = "Asir"},
["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true},
["Jazan Province, Saudi Arabia"] = {},
["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"},
["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true},
["Tabuk Province, Saudi Arabia"] = {},
["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"},
["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"},
["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
["Najran Province, Saudi Arabia"] = {},
["Northern Borders Province, Saudi Arabia"] = {},
["Al-Bahah Province, Saudi Arabia"] = {},
}
-- provinces of Saudi Arabia
export.saudi_arabia_group = {
key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),
default_container = "Arab Saudi",
default_placetype = "wilayah",
data = export.saudi_arabia_provinces,
}
export.south_africa_provinces = {
["Eastern Cape, South Africa"] = {the = true},
["Free State, South Africa"] = {the = true, wp = "%l (province)"},
["Gauteng, South Africa"] = {},
["KwaZulu-Natal, South Africa"] = {},
["Limpopo, South Africa"] = {},
["Mpumalanga, South Africa"] = {},
-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
["North West, South Africa"] = {wp = "%l (South African province)"},
["Northern Cape, South Africa"] = {the = true},
["Western Cape, South Africa"] = {the = true},
}
-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa",
default_placetype = "province",
default_divs = "municipalities",
data = export.south_africa_provinces,
}
export.south_korea_provinces = {
["North Chungcheong Province, South Korea"] = {},
["South Chungcheong Province, South Korea"] = {},
["Gangwon Province, South Korea"] = {wp = "%l, %c"},
["Gyeonggi Province, South Korea"] = {},
["North Gyeongsang Province, South Korea"] = {},
["South Gyeongsang Province, South Korea"] = {},
["North Jeolla Province, South Korea"] = {},
["South Jeolla Province, South Korea"] = {},
["Jeju Province, South Korea"] = {},
}
-- provinces of South Korea
export.south_korea_group = {
key_to_placename = make_key_to_placename(", South Korea$", " Province$"),
placename_to_key = make_placename_to_key(", South Korea", " Province"),
default_container = "South Korea",
default_placetype = "province",
data = export.south_korea_provinces,
}
export.spain_autonomous_communities = {
["Andalusia, Spain"] = {},
["Aragon, Spain"] = {},
["Asturias, Spain"] = {},
["Balearic Islands, Spain"] = {the = true},
["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"},
["Canary Islands, Spain"] = {the = true},
["Cantabria, Spain"] = {},
["Castile and León, Spain"] = {},
["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash
["Catalonia, Spain"] = {},
["Community of Madrid, Spain"] = {the = true},
["Extremadura, Spain"] = {},
["Galicia, Spain"] = {wp = "%l (Spain)"},
["La Rioja, Spain"] = {},
["Murcia, Spain"] = {wp = "Region of %l"},
["Navarre, Spain"] = {},
["Valencia, Spain"] = {wp = "Valencian Community"},
["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true},
}
-- autonomous communities of Spain
export.spain_group = {
default_container = "Spain",
default_placetype = "autonomous community",
default_divs = {"municipalities", "comarcas"},
data = export.spain_autonomous_communities,
}
export.taiwan_counties = {
["Changhua County, Taiwan"] = {},
["Chiayi County, Taiwan"] = {},
["Hsinchu County, Taiwan"] = {},
["Hualien County, Taiwan"] = {},
["Kinmen County, Taiwan"] = {wp = "Kinmen"},
["Lienchiang County, Taiwan"] = {wp = "Matsu Islands"},
["Miaoli County, Taiwan"] = {},
["Nantou County, Taiwan"] = {},
["Penghu County, Taiwan"] = {wp = "Penghu"},
["Pingtung County, Taiwan"] = {},
["Taitung County, Taiwan"] = {},
["Yilan County, Taiwan"] = {wp = "%l, %c"},
["Yunlin County, Taiwan"] = {},
}
-- counties of Taiwan
export.taiwan_group = {
key_to_placename = make_key_to_placename(", Taiwan$", " County$"),
placename_to_key = make_placename_to_key(", Taiwan", " County"),
default_container = "Taiwan",
default_placetype = "county",
default_divs = {"districts", "townships"},
data = export.taiwan_counties,
}
export.thailand_provinces = {
-- Bangkok (special administrative area)
["Amnat Charoen Province, Thailand"] = {},
["Ang Thong Province, Thailand"] = {},
["Bueng Kan Province, Thailand"] = {},
["Buriram Province, Thailand"] = {},
["Chachoengsao Province, Thailand"] = {},
["Chai Nat Province, Thailand"] = {},
["Chaiyaphum Province, Thailand"] = {},
["Chanthaburi Province, Thailand"] = {},
["Chiang Mai Province, Thailand"] = {},
["Chiang Rai Province, Thailand"] = {},
["Chonburi Province, Thailand"] = {},
["Chumphon Province, Thailand"] = {},
["Kalasin Province, Thailand"] = {},
["Kamphaeng Phet Province, Thailand"] = {},
["Kanchanaburi Province, Thailand"] = {},
["Khon Kaen Province, Thailand"] = {},
["Krabi Province, Thailand"] = {},
["Lampang Province, Thailand"] = {},
["Lamphun Province, Thailand"] = {},
["Loei Province, Thailand"] = {},
["Lopburi Province, Thailand"] = {},
["Mae Hong Son Province, Thailand"] = {},
["Maha Sarakham Province, Thailand"] = {},
["Mukdahan Province, Thailand"] = {},
["Nakhon Nayok Province, Thailand"] = {},
["Nakhon Pathom Province, Thailand"] = {},
["Nakhon Phanom Province, Thailand"] = {},
["Nakhon Ratchasima Province, Thailand"] = {},
["Nakhon Sawan Province, Thailand"] = {},
["Nakhon Si Thammarat Province, Thailand"] = {},
["Nan Province, Thailand"] = {},
["Narathiwat Province, Thailand"] = {},
["Nong Bua Lamphu Province, Thailand"] = {},
["Nong Khai Province, Thailand"] = {},
["Nonthaburi Province, Thailand"] = {},
["Pathum Thani Province, Thailand"] = {},
["Pattani Province, Thailand"] = {},
["Phang Nga Province, Thailand"] = {},
["Phatthalung Province, Thailand"] = {},
["Phayao Province, Thailand"] = {},
["Phetchabun Province, Thailand"] = {},
["Phetchaburi Province, Thailand"] = {},
["Phichit Province, Thailand"] = {},
["Phitsanulok Province, Thailand"] = {},
["Phra Nakhon Si Ayutthaya Province, Thailand"] = {},
["Phrae Province, Thailand"] = {},
["Phuket Province, Thailand"] = {},
["Prachinburi Province, Thailand"] = {},
["Prachuap Khiri Khan Province, Thailand"] = {},
["Ranong Province, Thailand"] = {},
["Ratchaburi Province, Thailand"] = {},
["Rayong Province, Thailand"] = {},
["Roi Et Province, Thailand"] = {},
["Sa Kaeo Province, Thailand"] = {},
["Sakon Nakhon Province, Thailand"] = {},
["Samut Prakan Province, Thailand"] = {},
["Samut Sakhon Province, Thailand"] = {},
["Samut Songkhram Province, Thailand"] = {},
["Saraburi Province, Thailand"] = {},
["Satun Province, Thailand"] = {},
["Sing Buri Province, Thailand"] = {},
["Sisaket Province, Thailand"] = {},
["Songkhla Province, Thailand"] = {},
["Sukhothai Province, Thailand"] = {},
["Suphan Buri Province, Thailand"] = {},
["Surat Thani Province, Thailand"] = {},
["Surin Province, Thailand"] = {},
["Tak Province, Thailand"] = {},
["Trang Province, Thailand"] = {},
["Trat Province, Thailand"] = {},
["Ubon Ratchathani Province, Thailand"] = {},
["Udon Thani Province, Thailand"] = {},
["Uthai Thani Province, Thailand"] = {},
["Uttaradit Province, Thailand"] = {},
["Yala Province, Thailand"] = {},
["Yasothon Province, Thailand"] = {},
}
-- provinces of Thailand
export.thailand_group = {
key_to_placename = make_key_to_placename(", Thailand$", " Province$"),
placename_to_key = make_placename_to_key(", Thailand", " Province"),
default_container = "Thailand",
default_placetype = "wilayah",
default_divs = "daerah",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "Wilayah %e",
data = export.thailand_provinces,
}
export.turkey_provinces = {
["Adana Province, Turkey"] = {}, -- code 01
["Adıyaman Province, Turkey"] = {}, -- code 02
["Afyonkarahisar Province, Turkey"] = {}, -- code 03
["Ağrı Province, Turkey"] = {}, -- code 04
["Amasya Province, Turkey"] = {}, -- code 05
["Ankara Province, Turkey"] = {}, -- code 06
["Antalya Province, Turkey"] = {}, -- code 07
["Artvin Province, Turkey"] = {}, -- code 08
["Aydın Province, Turkey"] = {}, -- code 09
["Balıkesir Province, Turkey"] = {}, -- code 10
["Bilecik Province, Turkey"] = {}, -- code 11
["Bingöl Province, Turkey"] = {}, -- code 12
["Bitlis Province, Turkey"] = {}, -- code 13
["Bolu Province, Turkey"] = {}, -- code 14
["Burdur Province, Turkey"] = {}, -- code 15
["Bursa Province, Turkey"] = {}, -- code 16
["Çanakkale Province, Turkey"] = {}, -- code 17
["Çankırı Province, Turkey"] = {}, -- code 18
["Çorum Province, Turkey"] = {}, -- code 19
["Denizli Province, Turkey"] = {}, -- code 20
["Diyarbakır Province, Turkey"] = {}, -- code 21
["Edirne Province, Turkey"] = {}, -- code 22
["Elazığ Province, Turkey"] = {}, -- code 23
["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true},
["Erzincan Province, Turkey"] = {}, -- code 24
["Erzurum Province, Turkey"] = {}, -- code 25
["Eskişehir Province, Turkey"] = {}, -- code 26
["Gaziantep Province, Turkey"] = {}, -- code 27
["Giresun Province, Turkey"] = {}, -- code 28
["Gümüşhane Province, Turkey"] = {}, -- code 29
["Hakkâri Province, Turkey"] = {}, -- code 30
["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true},
["Hatay Province, Turkey"] = {}, -- code 31
["Isparta Province, Turkey"] = {}, -- code 32
["Mersin Province, Turkey"] = {}, -- code 33
-- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself
["İzmir Province, Turkey"] = {}, -- code 35
["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true},
["Kars Province, Turkey"] = {}, -- code 36
["Kastamonu Province, Turkey"] = {}, -- code 37
["Kayseri Province, Turkey"] = {}, -- code 38
["Kırklareli Province, Turkey"] = {}, -- code 39
["Kırşehir Province, Turkey"] = {}, -- code 40
["Kocaeli Province, Turkey"] = {}, -- code 41
["Konya Province, Turkey"] = {}, -- code 42
["Kütahya Province, Turkey"] = {}, -- code 43
["Malatya Province, Turkey"] = {}, -- code 44
["Manisa Province, Turkey"] = {}, -- code 45
["Kahramanmaraş Province, Turkey"] = {}, -- code 46
["Mardin Province, Turkey"] = {}, -- code 47
["Muğla Province, Turkey"] = {}, -- code 48
["Muş Province, Turkey"] = {}, -- code 49
["Nevşehir Province, Turkey"] = {}, -- code 50
["Niğde Province, Turkey"] = {}, -- code 51
["Ordu Province, Turkey"] = {}, -- code 52
["Rize Province, Turkey"] = {}, -- code 53
["Sakarya Province, Turkey"] = {}, -- code 54
["Samsun Province, Turkey"] = {}, -- code 55
["Siirt Province, Turkey"] = {}, -- code 56
["Sinop Province, Turkey"] = {}, -- code 57
["Sivas Province, Turkey"] = {}, -- code 58
["Tekirdağ Province, Turkey"] = {}, -- code 59
["Tokat Province, Turkey"] = {}, -- code 60
["Trabzon Province, Turkey"] = {}, -- code 61
["Tunceli Province, Turkey"] = {}, -- code 62
["Şanlıurfa Province, Turkey"] = {}, -- code 63
["Uşak Province, Turkey"] = {}, -- code 64
["Van Province, Turkey"] = {}, -- code 65
["Yozgat Province, Turkey"] = {}, -- code 66
["Zonguldak Province, Turkey"] = {}, -- code 67
["Aksaray Province, Turkey"] = {}, -- code 68
["Bayburt Province, Turkey"] = {}, -- code 69
["Karaman Province, Turkey"] = {}, -- code 70
["Kırıkkale Province, Turkey"] = {}, -- code 71
["Batman Province, Turkey"] = {}, -- code 72
["Şırnak Province, Turkey"] = {}, -- code 73
["Bartın Province, Turkey"] = {}, -- code 74
["Ardahan Province, Turkey"] = {}, -- code 75
["Iğdır Province, Turkey"] = {}, -- code 76
["Yalova Province, Turkey"] = {}, -- code 77
["Karabük Province, Turkey"] = {}, -- code 78
["Kilis Province, Turkey"] = {}, -- code 79
["Osmaniye Province, Turkey"] = {}, -- code 80
["Düzce Province, Turkey"] = {}, -- code 81
}
-- provinces of Turkey
export.turkey_group = {
key_to_placename = make_key_to_placename(", Turkey$", " Province$"),
placename_to_key = make_placename_to_key(", Turkey", " Province"),
default_container = "Turkey",
default_placetype = "province",
default_divs = "districts",
data = export.turkey_provinces,
}
export.ukraine_oblasts = {
["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA
["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB
["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE
-- apparently will be renamed to 'Dnipro Oblast'
["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE
["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH
["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT
["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX
["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT''
["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX
-- apparently will be renamed to 'Kropyvnytskyi Oblast'
["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA
["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI
["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true},
["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB
["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC
["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE
["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH
["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true},
["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI
["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK
["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM
["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO
["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB
["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC
["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO
["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP
["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true},
["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM
}
-- oblasts of Ukraine
export.ukraine_group = {
key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"),
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"),
default_container = "Ukraine",
default_placetype = "oblast",
default_divs = {"raions", "hromadas"},
data = export.ukraine_oblasts,
}
export.united_kingdom_constituent_countries = {
["England"] = {divs = {
"counties",
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
{type = "civil parishes", container_parent_type = false},
}},
["Northern Ireland"] = {
placetype = {"constituent country", "province", "negara"},
divs = {"counties", "districts"},
},
["Scotland"] = {divs = {
{type = "council areas", container_parent_type = false},
"districts",
}},
["Wales"] = {divs = {
"counties",
{type = "county boroughs", container_parent_type = false},
{type = "communities", container_parent_type = false},
{type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}},
}},
}
-- constituent countries and provinces of the United Kingdom
export.united_kingdom_group = {
placename_to_key = false,
default_container = "United Kingdom",
default_placetype = {"constituent country", "negara"},
addl_divs = {
"traditional counties",
{type = "historical counties", cat_as = "traditional counties"},
},
-- Don't create categories like 'Category:en:Towns in the United Kingdom'
-- or 'Category:en:Places in the United Kingdom'.
default_no_container_cat = true,
data = export.united_kingdom_constituent_countries,
}
export.england_counties = {
-- NOTE: We used to have various other "no longer" counties commented out, apparently referring to counties that
-- existed officially at some point between 1889 and 1974; I have removed those. I have kept only the three
-- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those
-- still considered "historic counties" per [[w:Historic counties of England]].
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Bedfordshire, England"] = {},
["Berkshire, England"] = {},
-- ["Brighton and Hove, England"] = {}, -- city
-- ["Bristol, England"] = {}, -- city
["Buckinghamshire, England"] = {},
["Cambridgeshire, England"] = {},
["Cheshire, England"] = {},
-- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Cornwall, England"] = {},
-- ["Cumberland, England"] = {}, -- no longer (historic county)
["Cumbria, England"] = {},
["Derbyshire, England"] = {},
["Devon, England"] = {},
["Dorset, England"] = {},
["County Durham, England"] = {},
["East Sussex, England"] = {},
["Essex, England"] = {},
["Gloucestershire, England"] = {},
["Greater London, England"] = {},
["Greater Manchester, England"] = {},
["Hampshire, England"] = {},
["Herefordshire, England"] = {},
["Hertfordshire, England"] = {},
-- ["Humberside, England"] = {}, -- no longer (1974 to 1996)
-- ["Huntingdonshire, England"] = {}, -- no longer (historic county)
["Isle of Wight, England"] = {the = true},
["Kent, England"] = {},
["Lancashire, England"] = {},
["Leicestershire, England"] = {},
["Lincolnshire, England"] = {},
["Merseyside, England"] = {},
-- ["Middlesex, England"] = {}, -- no longer (historic county)
["Norfolk, England"] = {},
["Northamptonshire, England"] = {},
["Northumberland, England"] = {},
["North Yorkshire, England"] = {},
["Nottinghamshire, England"] = {},
["Oxfordshire, England"] = {},
["Rutland, England"] = {},
["Shropshire, England"] = {},
["Somerset, England"] = {},
["South Humberside, England"] = {}, -- NOTE: former postal county (1974 to 1996), not a ceremonial county
["South Yorkshire, England"] = {},
["Staffordshire, England"] = {},
["Suffolk, England"] = {},
["Surrey, England"] = {},
-- ["Sussex, England"] = {}, -- no longer (historic county)
["Tyne and Wear, England"] = {},
["Warwickshire, England"] = {},
["West Midlands, England"] = {the = true, wp = "%l (county)"},
-- ["Westmorland, England"] = {}, -- no longer (historic county)
["West Sussex, England"] = {},
["West Yorkshire, England"] = {},
["Wiltshire, England"] = {},
["Worcestershire, England"] = {},
-- ["Yorkshire, England"] = {}, -- no longer (historic county)
["East Riding of Yorkshire, England"] = {the = true},
}
-- counties of England
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "county",
default_divs = {
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
"civil parishes",
},
data = export.england_counties,
}
export.northern_ireland_counties = {
["County Antrim, Northern Ireland"] = {},
["County Armagh, Northern Ireland"] = {},
["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"},
["County Down, Northern Ireland"] = {},
["County Fermanagh, Northern Ireland"] = {},
["County Londonderry, Northern Ireland"] = {},
["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"},
["County Tyrone, Northern Ireland"] = {},
}
-- counties of Northern Ireland
export.northern_ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"),
default_container = {key = "Northern Ireland", placetype = "constituent country"},
default_placetype = "county",
data = export.northern_ireland_counties,
}
export.scotland_council_areas = {
["Aberdeenshire, Scotland"] = {},
["Angus, Scotland"] = {wp = "%l, %c"},
["Argyll and Bute, Scotland"] = {},
["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"},
["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"},
["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"},
["City of Dundee, Scotland"] = {the = true, wp = "Dundee"},
["Dundee"] = {alias_of = "City of Dundee, Scotland"},
["Dundee City"] = {alias_of = "City of Dundee, Scotland"},
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"},
["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"},
["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"},
["Glasgow"] = {alias_of = "City of Glasgow, Scotland"},
["Clackmannanshire, Scotland"] = {},
["Dumfries and Galloway, Scotland"] = {},
["East Ayrshire, Scotland"] = {},
["East Dunbartonshire, Scotland"] = {},
["East Lothian, Scotland"] = {},
["East Renfrewshire, Scotland"] = {},
["Falkirk, Scotland"] = {wp = "%l council area"},
["Fife, Scotland"] = {},
["Highland, Scotland"] = {wp = "%l council area"},
["Inverclyde, Scotland"] = {},
["Midlothian, Scotland"] = {},
["Moray, Scotland"] = {},
["North Ayrshire, Scotland"] = {},
["North Lanarkshire, Scotland"] = {},
["Orkney Islands, Scotland"] = {the = true},
["Perth and Kinross, Scotland"] = {},
["Renfrewshire, Scotland"] = {},
["Scottish Borders, Scotland"] = {the = true},
["Shetland Islands, Scotland"] = {the = true},
["South Ayrshire, Scotland"] = {},
["South Lanarkshire, Scotland"] = {},
["Stirling, Scotland"] = {wp = "%l council area"},
["West Dunbartonshire, Scotland"] = {},
["West Lothian, Scotland"] = {},
["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"},
["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"},
}
-- council areas of Scotland
export.scotland_group = {
default_container = {key = "Scotland", placetype = "constituent country"},
default_placetype = "council area",
data = export.scotland_council_areas,
}
export.wales_principal_areas = {
["Blaenau Gwent, Wales"] = {},
["Bridgend, Wales"] = {wp = "%l County Borough"},
["Caerphilly, Wales"] = {wp = "%l County Borough"},
-- ["Cardiff, Wales"] = {placetype = "city"},
["Carmarthenshire, Wales"] = {placetype = "county"},
["Ceredigion, Wales"] = {placetype = "county"},
["Conwy, Wales"] = {wp = "%l County Borough"},
["Denbighshire, Wales"] = {placetype = "county"},
["Flintshire, Wales"] = {placetype = "county"},
["Gwynedd, Wales"] = {placetype = "county"},
["Isle of Anglesey, Wales"] = {the = true, placetype = "county"},
["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the"
["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"},
["Monmouthshire, Wales"] = {placetype = "county"},
["Neath Port Talbot, Wales"] = {},
-- ["Newport, Wales"] = {placetype = "city", wp = "%l, %c"},
["Pembrokeshire, Wales"] = {placetype = "county"},
["Powys, Wales"] = {placetype = "county"},
["Rhondda Cynon Taf, Wales"] = {},
-- ["Swansea, Wales"] = {placetype = "city"},
["Torfaen, Wales"] = {},
["Vale of Glamorgan, Wales"] = {the = true},
["Wrexham, Wales"] = {wp = "%l County Borough"},
}
-- principal areas (cities, counties and county boroughs) of Wales
export.wales_group = {
default_container = {key = "Wales", placetype = "constituent country"},
default_placetype = "county borough",
data = export.wales_principal_areas,
}
export.united_states_states = {
["Alabama, USA"] = {},
["Alaska, USA"] = {divs = {
{type = "boroughs", container_parent_type = "counties"},
{type = "borough seats", container_parent_type = "county seats"},
}},
["Arizona, USA"] = {},
["Arkansas, USA"] = {},
["California, USA"] = {},
["Colorado, USA"] = {divs = {"counties", "county seats", "municipalities"}},
["Connecticut, USA"] = {divs = {"counties", "county seats", "municipalities"}},
["Delaware, USA"] = {},
["Florida, USA"] = {},
["Georgia, USA"] = {wp = "%l (U.S. state)"},
["Hawaii, USA"] = {addl_parents = {"Polynesia"}},
["Idaho, USA"] = {},
["Illinois, USA"] = {},
["Indiana, USA"] = {},
["Iowa, USA"] = {},
["Kansas, USA"] = {},
["Kentucky, USA"] = {},
["Louisiana, USA"] = {divs = {
{type = "parishes", container_parent_type = "counties"},
{type = "parish seats", container_parent_type = "county seats"},
}},
["Maine, USA"] = {},
["Maryland, USA"] = {},
["Massachusetts, USA"] = {},
["Michigan, USA"] = {},
["Minnesota, USA"] = {},
["Mississippi, USA"] = {},
["Missouri, USA"] = {},
["Montana, USA"] = {},
["Nebraska, USA"] = {},
["Nevada, USA"] = {},
["New Hampshire, USA"] = {},
["New Jersey, USA"] = {divs = {
"counties", "county seats",
{type = "boroughs", prep = "di"},
}},
["New Mexico, USA"] = {},
["New York, USA"] = {wp = "%l (state)"},
["North Carolina, USA"] = {},
["North Dakota, USA"] = {},
["Ohio, USA"] = {},
["Oklahoma, USA"] = {},
["Oregon, USA"] = {},
["Pennsylvania, USA"] = {divs = {
"counties", "county seats",
{type = "boroughs", prep = "di"},
}},
["Rhode Island, USA"] = {},
["South Carolina, USA"] = {},
["South Dakota, USA"] = {},
["Tennessee, USA"] = {},
["Texas, USA"] = {},
["Utah, USA"] = {},
["Vermont, USA"] = {},
["Virginia, USA"] = {},
["Washington, USA"] = {wp = "%l (state)"},
["West Virginia, USA"] = {},
["Wisconsin, USA"] = {},
["Wyoming, USA"] = {},
}
-- states of the United States
export.united_states_group = {
placename_to_key = make_placename_to_key(", USA"),
default_container = "Amerika Syarikat",
default_placetype = "negeri",
default_divs = {"counties", "county seats"},
addl_divs = {
{type = "census-designated places", prep = "di"},
{type = "unincorporated communities", prep = "di"},
},
data = export.united_states_states,
}
export.vietnam_provinces = {
-- [[Northeast (Vietnam)|Northeast]] region
["Bắc Giang Province, Vietnam"] = {}, -- capital [[Bắc Giang]]
["Bắc Kạn Province, Vietnam"] = {}, -- capital [[Bắc Kạn]]
["Cao Bằng Province, Vietnam"] = {}, -- capital [[Cao Bằng]]
["Hà Giang Province, Vietnam"] = {}, -- capital [[Hà Giang]]
["Lạng Sơn Province, Vietnam"] = {}, -- capital [[Lạng Sơn]]
["Phú Thọ Province, Vietnam"] = {}, -- capital [[Việt Trì]]
["Quảng Ninh Province, Vietnam"] = {}, -- capital [[Hạ Long]]
["Thái Nguyên Province, Vietnam"] = {}, -- capital [[Thái Nguyên]]
["Tuyên Quang Province, Vietnam"] = {}, -- capital [[Tuyên Quang]]
-- [[Northwest (Vietnam)|Northwest]] region
["Lào Cai Province, Vietnam"] = {}, -- capital [[Lào Cai]]
["Yên Bái Province, Vietnam"] = {}, -- capital [[Yên Bái]]
["Điện Biên Province, Vietnam"] = {}, -- capital [[Điện Biên Phủ]]
["Hoà Bình Province, Vietnam"] = {}, -- capital [[Hoà Bình City|Hoà Bình]]
["Hòa Bình Province, Vietnam"] = {alias_of = "Hoà Bình Province, Vietnam", display = true},
["Lai Châu Province, Vietnam"] = {}, -- capital [[Lai Châu]]
["Sơn La Province, Vietnam"] = {}, -- capital [[Sơn La]]
-- [[Red River Delta]] region
["Bắc Ninh Province, Vietnam"] = {}, -- capital [[Bắc Ninh]]
["Hà Nam Province, Vietnam"] = {}, -- capital [[Phủ Lý]]
["Hải Dương Province, Vietnam"] = {}, -- capital [[Hải Dương]]
["Hưng Yên Province, Vietnam"] = {}, -- capital [[Hưng Yên]]
["Nam Định Province, Vietnam"] = {}, -- capital [[Nam Định]]
["Ninh Bình Province, Vietnam"] = {}, -- capital [[Ninh Bình|Hoa Lư]]
["Thái Bình Province, Vietnam"] = {}, -- capital [[Thái Bình]]
["Vĩnh Phúc Province, Vietnam"] = {}, -- capital [[Vĩnh Yên]]
-- ["Hanoi"] = {placetype = {"municipality", "city"}}, -- capital [[Hoàn Kiếm district]]
-- ["Haiphong"] = {placetype = {"municipality", "city"}}, -- capital [[Hồng Bàng district]]
-- [[North Central Coast]] region
["Hà Tĩnh Province, Vietnam"] = {}, -- capital [[Hà Tĩnh]]
["Nghệ An Province, Vietnam"] = {}, -- capital [[Vinh]]
["Quảng Bình Province, Vietnam"] = {}, -- capital [[Đồng Hới]]
["Quảng Trị Province, Vietnam"] = {}, -- capital [[Đông Hà]]
["Thanh Hoá Province, Vietnam"] = {}, -- capital [[Thanh Hoá]]
["Thanh Hóa Province, Vietnam"] = {alias_of = "Thanh Hoá Province, Vietnam", display = true},
-- ["Hue"] = {placetype = {"municipality", "city"}, wp = "Huế"}, -- capital [[Thuận Hoá district]]
-- [[Central Highlands (Vietnam)|Central Highlands]] region
["Đắk Lắk Province, Vietnam"] = {}, -- capital [[Buôn Ma Thuột]]
["Đăk Nông Province, Vietnam"] = {}, -- capital [[Gia Nghĩa]]
["Gia Lai Province, Vietnam"] = {}, -- capital [[Pleiku]]
["Kon Tum Province, Vietnam"] = {}, -- capital [[Kon Tum]]
["Lâm Đồng Province, Vietnam"] = {}, -- capital [[Đà Lạt]]
-- [[South Central Coast]] region
["Bình Định Province, Vietnam"] = {}, -- capital [[Quy Nhon]]
["Bình Thuận Province, Vietnam"] = {}, -- capital [[Phan Thiết]]
["Khánh Hoà Province, Vietnam"] = {}, -- capital [[Nha Trang]]
["Khánh Hòa Province, Vietnam"] = {alias_of = "Khánh Hoà Province, Vietnam", display = true},
["Ninh Thuận Province, Vietnam"] = {}, -- capital [[Phan Rang–Tháp Chàm]]
["Phú Yên Province, Vietnam"] = {}, -- capital [[Tuy Hoà]]
["Quảng Nam Province, Vietnam"] = {}, -- capital [[Tam Kỳ]]
["Quảng Ngãi Province, Vietnam"] = {}, -- capital [[Quảng Ngãi]]
-- ["Da Nang"] = {placetype = {"municipality", "city"}}, -- capital [[Hải Châu district]]
-- [[Southeast (Vietnam)|Southeast]] region
["Bà Rịa–Vũng Tàu Province, Vietnam"] = {}, -- capital [[Bà Rịa]]
["Bình Dương Province, Vietnam"] = {}, -- capital [[Thủ Dầu Một]]
["Bình Phước Province, Vietnam"] = {}, -- capital [[Đồng Xoài]]
["Đồng Nai Province, Vietnam"] = {}, -- capital [[Biên Hoà]]
["Tây Ninh Province, Vietnam"] = {}, -- capital [[Tây Ninh]]
-- ["Ho Chi Minh City"] = {placetype = {"municipality", "city"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']]
-- [[Mekong Delta]] region
["An Giang Province, Vietnam"] = {}, -- capital [[Long Xuyên]]
["Bạc Liêu Province, Vietnam"] = {}, -- capital [[Bạc Liêu]]
["Bến Tre Province, Vietnam"] = {}, -- capital [[Bến Tre]]
["Cà Mau Province, Vietnam"] = {}, -- capital [[Cà Mau]]
["Đồng Tháp Province, Vietnam"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]]
["Hậu Giang Province, Vietnam"] = {}, -- capital [[Vị Thanh]]
["Kiên Giang Province, Vietnam"] = {}, -- capital [[Rạch Giá]]
["Long An Province, Vietnam"] = {}, -- capital [[Tân An]]
["Sóc Trăng Province, Vietnam"] = {}, -- capital [[Sóc Trăng]]
["Tiền Giang Province, Vietnam"] = {}, -- capital [[Mỹ Tho]]
["Trà Vinh Province, Vietnam"] = {}, -- capital [[Trà Vinh]]
["Vĩnh Long Province, Vietnam"] = {}, -- capital [[Vĩnh Long]]
-- ["Can Tho"] = {placetype = {"municipality", "city"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]]
}
-- provinces of Vietnam
export.vietnam_group = {
key_to_placename = make_key_to_placename(", Vietnam$", " Province$"),
placename_to_key = make_placename_to_key(", Vietnam", " Province"),
default_container = "Vietnam",
default_placetype = "province",
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.vietnam_provinces,
}
-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------
export.australia_cities = {
["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration)
["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate])
["Canberra"] = {container = {key = "Australian Capital Territory, Australia", placetype = "territory"}}, -- 510,641 (2024 estimate)
["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration)
["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}
export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Australia", "negeri"),
default_placetype = "city",
data = export.australia_cities,
}
export.brazil_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos)
["Sao Paulo"] = {alias_of = "São Paulo", display = true},
["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area)
["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000
["Recife"] = {container = "Pernambuco"}, -- 4,100,000
["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area)
["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000
["Brasilia"] = {alias_of = "Brasília", display = true},
["Fortaleza"] = {container = "Ceará"}, -- 3,825,000
["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000
["Curitiba"] = {container = "Paraná"}, -- 3,375,000
["Campinas"] = {container = "São Paulo"}, -- 3,250,000
["Goiânia"] = {container = "Goiás"}, -- 2,525,000
["Goiania"] = {alias_of = "Goiânia", display = true},
["Manaus"] = {container = "Amazonas"}, -- 2,275,000
["Belém"] = {container = "Pará"}, -- 2,200,000
["Belem"] = {alias_of = "Belém", display = true},
["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000
["Vitoria"] = {alias_of = "Vitória", display = true},
["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000
["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000
["Sao Luis"] = {alias_of = "São Luís", display = true},
["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000
["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000
["Florianopolis"] = {alias_of = "Florianópolis", display = true},
["Maceió"] = {container = "Alagoas"}, -- 1,220,000
["Maceio"] = {alias_of = "Maceió", display = true},
["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000
["Joao Pessoa"] = {alias_of = "João Pessoa", display = true},
["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000
["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true},
["Londrina"] = {container = "Paraná"}, -- 1,050,000
["Teresina"] = {container = "Piauí"}, -- 1,040,000
}
export.brazil_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Brazil", "negeri"),
default_placetype = "city",
data = export.brazil_cities,
}
export.canada_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton)
["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area)
["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area)
["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area)
["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area)
["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area)
["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census)
["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census)
["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census)
["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census)
}
export.canada_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province"),
default_placetype = "city",
data = export.canada_cities,
}
export.france_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration)
["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration)
["Lyons"] = {alias_of = "Lyon", display = true},
["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration)
["Marseilles"] = {alias_of = "Marseille", display = true},
["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration)
["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration)
["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration)
["Nice"] = {container = "Provence-Alpes-Côte d'Azur"},
["Nantes"] = {container = "Pays de la Loire"},
["Strasbourg"] = {container = "Grand Est"},
["Rennes"] = {container = "Brittany"},
}
export.france_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "city",
data = export.france_cities,
}
export.germany_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
["Cologne"] = {container = "North Rhine-Westphalia"},
["Köln"] = {alias_of = "Cologne", display = true},
["Düsseldorf"] = {container = "North Rhine-Westphalia"},
["Dusseldorf"] = {alias_of = "Düsseldorf", display = true},
["Dortmund"] = {container = "North Rhine-Westphalia"},
["Essen"] = {container = "North Rhine-Westphalia"},
["Duisburg"] = {container = "North Rhine-Westphalia"},
["Berlin"] = {}, -- 4,700,000
["Frankfurt"] = {container = "Hesse"}, -- 3,225,000
["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer
["Hamburg"] = {}, -- 2,900,000
["Munich"] = {container = "Bavaria"}, -- 2,300,000
["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000
["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000
["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000
["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000
["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000
["Leipzig"] = {container = "Saxony"}, -- 1,080,000
["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000
["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias
["Bremen"] = {},
}
export.germany_cities_group = {
default_container = "Germany",
canonicalize_key_container = make_canonicalize_key_container(", Germany", "negeri"),
default_placetype = "city",
data = export.germany_cities,
}
export.india_cities = {
-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
-- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was
-- conducted in 2011, and the results are not accurate any more.
["Delhi"] = {container = {key = "Delhi, India", placetype = "union territory"}}, -- 31,190,000
["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000
["Kolkata"] = {container = "West Bengal"}, -- 21,747,000
["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000
["Bengaluru"] = {alias_of = "Bangalore"},
["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000
["Hyderabad"] = {container = "Telangana"}, -- 9,797,000
["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000
["Pune"] = {container = "Maharashtra"}, -- 6,819,000
["Surat"] = {container = "Gujarat"}, -- 6,601,000
["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000
["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000
["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000
["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000
["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000
["Patna"] = {container = "Bihar"}, -- 3,331,000
["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000
["Kozhikode"] = {container = "Kerala"}, -- 3,049,000
["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000
["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000
["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000
["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000
["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000
["Prayagraj"] = {alias_of = "Allahabad"},
["Kochi"] = {container = "Kerala"}, -- 2,381,000
["Ludhiana"] = {container = "Punjab"}, -- 2,205,000
["Vadodara"] = {container = "Gujarat"}, -- 2,182,000
["Chandigarh"] = {container = {key = "Chandigarh, India", placetype = "union territory"}}, -- 2,168,000
["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000
["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000
["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000
["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000
["Malappuram"] = {container = "Kerala"}, -- 1,868,000
["Nashik"] = {container = "Maharashtra"}, -- 1,810,000
["Asansol"] = {container = "West Bengal"}, -- 1,720,000
["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000
["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000
["Thrissur"] = {container = "Kerala"}, -- 1,578,000
["Kollam"] = {container = "Kerala"}, -- 1,576,000
["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000
["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000
["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000
["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000
["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"},
["Rajkot"] = {container = "Gujarat"}, -- 1,487,000
["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000
["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000
["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000
["Kannur"] = {container = "Kerala"}, -- 1,360,000
["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000
["Guwahati"] = {container = "Assam"}, -- 1,355,000
["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000
["Amritsar"] = {container = "Punjab"}, -- 1,313,000
["Mysore"] = {container = "Karnataka"}, -- 1,296,000
["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000
["Durg-Bhilainagar"] = {alias_of = "Bhilai"},
["Durg-Bhilai"] = {alias_of = "Bhilai"},
["Durg"] = {alias_of = "Bhilai"},
["Bhilainagar"] = {alias_of = "Bhilai"},
["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000
["Srinagar"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,212,000
["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000
["Kota"] = {container = "Rajasthan"}, -- 1,172,000
["Jalandhar"] = {container = "Punjab"}, -- 1,165,000
["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000
["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000
["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000
["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000
["Jammu"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,103,000
["Solapur"] = {container = "Maharashtra"}, -- 1,082,000
["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash
["Hubli"] = {alias_of = "Hubli-Dharwad"},
["Dharwad"] = {alias_of = "Hubli-Dharwad"},
["Puducherry"] = {container = {key = "Puducherry, India", placetype = "union territory"}}, -- 1,024,000
["Pondicherry"] = {alias_of = "Puducherry", display = true},
-- satellite/secondary cities of metro area (none in citypopulation.de)
["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area
["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area
["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true},
["Kalyan"] = {alias_of = "Kalyan-Dombivli"},
["Dombivli"] = {alias_of = "Kalyan-Dombivli"},
["Dombivali"] = {alias_of = "Kalyan-Dombivli"},
["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area
["Vasai"] = {alias_of = "Vasai-Virar"},
["Virar"] = {alias_of = "Vasai-Virar"},
["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area
["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area
["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area
["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true},
}
export.india_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", India", "negeri"),
default_placetype = "city",
data = export.india_cities,
}
export.indonesia_cities = {
-- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate
["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = {
{type = "subdistricts", container_parent_type = false},
}},
["Surabaya"] = {container = "East Java"},
["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area
["Bandung"] = {container = "West Java"},
["Medan"] = {container = "North Sumatra"},
["Depok"] = {container = "West Java"}, -- part of Jakarta metro area
["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Palembang"] = {container = "South Sumatra"},
["Semarang"] = {container = "Central Java"},
["Makassar"] = {container = "South Sulawesi"},
["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Batam"] = {container = "Riau Islands"},
["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area
["Pekanbaru"] = {container = "Riau"},
["Bandar Lampung"] = {container = "Lampung"},
-- other metro areas over 1,000,000 people
["Padang"] = {container = "West Sumatra"},
["Samarinda"] = {container = "East Kalimantan"},
["Malang"] = {container = "East Java"},
["Yogyakarta"] = {container = "Special Region of Yogyakarta"},
["Denpasar"] = {container = "Bali"},
["Cirebon"] = {container = "West Java"},
["Surakarta"] = {container = "Central Java"},
["Banjarmasin"] = {container = "South Kalimantan"},
["Tasikmalaya"] = {container = "West Java"},
}
export.indonesia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Indonesia", "province"),
default_placetype = "city",
data = export.indonesia_cities,
}
export.italy_cities = {
-- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used
-- here, only gives estimates as of Jan 1, 2014.
["Milan"] = {container = "Lombardy"}, -- 6,623,798
["Naples"] = {container = "Campania"}, -- 5,294,546
["Rome"] = {container = "Lazio"}, -- 4,447,881
["Turin"] = {container = "Piedmont"}, -- 1,865,284
["Venice"] = {container = "Veneto"}, -- 1,645,900
["Florence"] = {container = "Tuscany"}, -- 1,485,030
["Bari"] = {container = "Apulia"}, -- 1,257,459
["Palermo"] = {container = "Sicily"}, -- 1,183,084
-- Include a few cities whose metro areas were just below 1,000,000 and may be above it by now (depending on the definition).
["Catania"] = {container = "Sicily"}, -- 988,240
["Brescia"] = {container = "Lombardy"}, -- 924,090
["Genoa"] = {container = "Liguria"}, -- 861,318
}
export.italy_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Italy", "region"),
default_placetype = "city",
data = export.italy_cities,
}
export.japan_cities = {
-- Population figures from [[w:List of cities in Japan]]. Metro areas from
-- [[w:List of metropolitan areas in Japan]].
["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])",
placetype = {"city", "prefecture"},
divs = {
{type = "special wards", container_parent_type = false},
{type = "cities", prep = "di"},
},
},
["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894
["Osaka"] = {container = "Osaka"}, -- 2,668,586
["Nagoya"] = {container = "Aichi"}, -- 2,283,289
-- FIXME, Hokkaido is handled specially.
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096
["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527
["Kobe"] = {container = "Hyōgo"}, -- 1,530,847
["Kyoto"] = {container = "Kyoto"}, -- 1,474,570
["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630
["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418
["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806
["Sendai"] = {container = "Miyagi"}, -- 1,029,552
-- the remaining cities are considered "central cities" in a 1,000,000+ metro area
-- (sometimes there is more than one central city in the area).
["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998
["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695
["Sakai"] = {container = "Osaka"}, -- 835,333
["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053
["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431
["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944
["Sagamihara"] = {container = "Kanagawa"}, -- 706,342
["Okayama"] = {container = "Okayama"}, -- 701,293
["Kumamoto"] = {container = "Kumamoto"}, -- 670,348
["Kagoshima"] = {container = "Kagoshima"}, -- 605,196
-- Skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka)
-- with populations in the range 509k - 587k because they are not central cities in any
-- 1,000,000+ metro area.
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833
}
export.japan_cities_group = {
default_container = "Japan",
canonicalize_key_container = make_canonicalize_key_container(" Prefecture, Japan", "prefecture"),
default_placetype = "city",
data = export.japan_cities,
}
export.mexico_cities = {
["Mexico City"] = {}, -- its own state
["Monterrey"] = {container = "Nuevo León"},
["Guadalajara"] = {container = "Jalisco"},
["Puebla"] = {container = "Puebla", wp = "%l (city)"},
["Toluca"] = {container = "State of Mexico"},
["Tijuana"] = {container = "Baja California"},
-- Include the state in the category for León due to possible confusion with León, Spain.
["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"},
["León"] = {alias_of = "León, Guanajuato"},
["Leon"] = {alias_of = "León, Guanajuato", display = true},
["Querétaro"] = {container = "Querétaro", wp = "%l (city)"},
["Queretaro"] = {alias_of = "Querétaro", display = true},
["Ciudad Juárez"] = {container = "Chihuahua"},
["Juárez"] = {alias_of = "Ciudad Juárez"},
["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"},
["Torreón"] = {container = "Coahuila"},
["Torreon"] = {alias_of = "Torreón", display = true},
-- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or
-- Mérida, Venezuela.
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"},
["Mérida"] = {alias_of = "Mérida, Yucatán"},
["Merida"] = {alias_of = "Mérida, Yucatán", display = true},
["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"},
["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true},
["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"},
["Mexicali"] = {container = "Baja California"},
}
export.mexico_cities_group = {
default_container = "Mexico",
canonicalize_key_container = make_canonicalize_key_container(", Mexico", "negeri"),
default_placetype = "city",
data = export.mexico_cities,
}
export.nigeria_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability)
["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability)
["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability)
["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "wilayah persekutuan"}}, -- 3,050,000 (unindicated; population of low reliability)
["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability)
["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability)
["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability)
["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability)
["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability)
["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability)
["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability)
["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; population of low reliability)
["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability)
["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability)
["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability)
}
export.nigeria_cities_group = {
default_container = "Nigeria",
canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "negeri"),
default_placetype = "city",
data = export.nigeria_cities,
}
export.pakistan_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area)
["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area)
["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad)
["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "wilayah persekutuan"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi)
["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area)
["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area)
-- there is also Hyderabad in India (very confusing)
["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area)
["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"},
["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area)
["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area)
["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area)
["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area)
["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area)
}
export.pakistan_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "province"),
default_placetype = "city",
data = export.pakistan_cities,
}
export.philippines_cities = {
-- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts.
-- Other cities outside Metro Manila are skipped because they are not the central city of their urban area.
["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
-- Don't display-canonicalize Foo to Foo City as it may make the display weird.
["Quezon"] = {alias_of = "Quezon City"},
["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
["Davao City"] = {container = "Davao del Sur"},
["Davao"] = {alias_of = "Davao City"},
["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
["Zamboanga City"] = {container = "Zamboanga del Sur"},
["Zamboanga"] = {alias_of = "Zamboanga City"},
["Cebu City"] = {container = "Cebu"},
["Cebu"] = {alias_of = "Cebu City"},
["Antipolo"] = {container = "Rizal"},
["Cagayan de Oro"] = {container = "Misamis Oriental"},
["Dasmariñas"] = {container = "Cavite"},
["Dasmarinas"] = {alias_of = "Dasmariñas", display = true},
["General Santos"] = {container = "South Cotabato"},
["San Jose del Monte"] = {container = "Bulacan"},
["Bacolod"] = {container = "Negros Occidental"},
["Calamba"] = {container = "Laguna", wp = "%l, %c"},
["Angeles"] = {container = "Pampanga", wp = "Angeles City"},
["Angeles City"] = {alias_of = "Angeles"},
["Iloilo City"] = {container = "Iloilo"},
["Iloilo"] = {alias_of = "Iloilo City"},
}
export.philippines_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Philippines", "province"),
default_placetype = "city",
data = export.philippines_cities,
}
export.russia_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Moscow"] = {}, -- 18,800,000 (Agglomeration)
["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration)
["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration)
["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration)
["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration)
["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration)
["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration)
["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration)
["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true},
["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration)
["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration)
["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration)
["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration)
["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration)
["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration)
["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration)
["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration)
["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration)
}
export.russia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"),
default_container = "Russia",
default_placetype = "city",
data = export.russia_cities,
}
export.saudi_arabia_cities = {
-- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are
-- metro, urban or city proper figures.
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jedda"] = {alias_of = "Jeddah", display = true},
["Jiddah"] = {alias_of = "Jeddah", display = true},
["Jidda"] = {alias_of = "Jeddah", display = true},
["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Makkah"] = {alias_of = "Mecca", display = true},
["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City)
["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true},
}
export.saudi_arabia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "province"),
default_placetype = "city",
data = export.saudi_arabia_cities,
}
export.south_korea_cities = {
-- None of the cities listed is associated with any county.
["Seoul"] = {},
["Busan"] = {},
["Incheon"] = {},
["Daegu"] = {},
["Daejeon"] = {},
["Gwangju"] = {},
["Ulsan"] = {},
}
export.south_korea_cities_group = {
default_container = "South Korea",
canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "province"),
default_placetype = "city",
data = export.south_korea_cities,
}
export.spain_cities = {
["Madrid"] = {container = "Community of Madrid"},
["Barcelona"] = {container = "Catalonia"},
["Valencia"] = {container = "Valencia"},
["Seville"] = {container = "Andalusia"},
["Bilbao"] = {container = "Basque Country"},
}
export.spain_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"),
default_placetype = "city",
data = export.spain_cities,
}
export.taiwan_cities = {
["New Taipei City"] = {},
["New Taipei"] = {alias_of = "New Taipei City", display = true},
["Taichung"] = {},
["Kaohsiung"] = {wp = "%l, Taiwan"},
["Taipei"] = {},
["Taoyuan"] = {},
["Tainan"] = {},
-- these last three are not special municipalities
["Chiayi"] = {placetype = "city"},
["Hsinchu"] = {placetype = "city"},
["Keelung"] = {placetype = "city"},
}
export.taiwan_cities_group = {
placename_to_key = false, -- don't add ", Taiwan" to make the key
canonicalize_key_container = make_canonicalize_key_container(", Taiwan", "county"),
default_container = "Taiwan",
default_placetype = {"special municipality", "municipality", "city"},
default_is_city = true,
default_divs = {"districts"},
data = export.taiwan_cities,
}
-- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct,
-- everything else will be figured out.
export.united_kingdom_cities = {
["London"] = {container = "Greater London"},
["Manchester"] = {container = "Greater Manchester"},
["Birmingham"] = {container = "West Midlands"},
["Liverpool"] = {container = "Merseyside"},
["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}},
["Leeds"] = {container = "West Yorkshire"},
["Newcastle upon Tyne"] = {container = "Tyne and Wear"},
["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
["Bristol"] = {container = {key = "England", placetype = "constituent country"}},
["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}},
["Portsmouth"] = {container = "Hampshire"},
["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}},
-- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]]
["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}},
["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"},
}
export.united_kingdom_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", England", "county"),
default_placetype = "city",
data = export.united_kingdom_cities,
}
export.united_states_cities = {
-- top 50 CSAs by population, with the top and sometimes 2nd or 3rd city listed
["New York City"] = {container = "New York", wp = "%l", divs = {
{type = "boroughs", container_parent_type = false},
}},
-- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York).
["New York"] = {alias_of = "New York City"},
["Newark"] = {container = "New Jersey"},
["Los Angeles"] = {container = "California", wp = "%l"},
["Long Beach"] = {container = "California"},
["Riverside"] = {container = "California"},
["Chicago"] = {container = "Illinois", wp = "%l"},
["Washington, D.C."] = {wp = "%l"},
["Washington, DC"] = {alias_of = "Washington, D.C.", display = true},
["Washington D.C."] = {alias_of = "Washington, D.C.", display = true},
["Washington DC"] = {alias_of = "Washington, D.C.", display = true},
-- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of
-- Columbia holonym).
["Washington"] = {alias_of = "Washington, D.C."},
["Baltimore"] = {container = "Maryland", wp = "%l"},
-- to avoid conflict with San Jose in Costa Rica
["San Jose, California"] = {container = "California"},
["San Jose"] = {alias_of = "San Jose, California"},
["San Francisco"] = {container = "California", wp = "%l"},
["Oakland"] = {container = "California"},
["Boston"] = {container = "Massachusetts", wp = "%l"},
["Providence"] = {container = "Rhode Island"},
["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Fort Worth"] = {container = "Texas"},
["Philadelphia"] = {container = "Pennsylvania", wp = "%l"},
["Houston"] = {container = "Texas", wp = "%l"},
["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"},
["Atlanta"] = {container = "Georgia", wp = "%l"},
["Detroit"] = {container = "Michigan", wp = "%l"},
["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"},
["Mesa"] = {container = "Arizona"},
["Seattle"] = {container = "Washington", wp = "%l"},
["Orlando"] = {container = "Florida"},
["Minneapolis"] = {container = "Minnesota", wp = "%l"},
["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"},
["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"},
["Portland"] = {container = "Oregon"},
["Tampa"] = {container = "Florida"},
["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"},
["Saint Louis"] = {alias_of = "St. Louis", display = true},
["Charlotte"] = {container = "North Carolina"},
["Sacramento"] = {container = "California"},
["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"},
["Salt Lake City"] = {container = "Utah", wp = "%l"},
["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Columbus"] = {container = "Ohio"},
["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"},
["Indianapolis"] = {container = "Indiana", wp = "%l"},
["Las Vegas"] = {container = "Nevada", wp = "%l"},
["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Austin"] = {container = "Texas"},
["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"},
["Raleigh"] = {container = "North Carolina"},
["Nashville"] = {container = "Tennessee"},
["Virginia Beach"] = {container = "Virginia"},
["Norfolk"] = {container = "Virginia"},
["Greensboro"] = {container = "North Carolina"},
["Winston-Salem"] = {container = "North Carolina"},
["Jacksonville"] = {container = "Florida"},
["New Orleans"] = {container = "Louisiana", wp = "%l"},
["Louisville"] = {container = "Kentucky"},
["Greenville"] = {container = "South Carolina"},
["Hartford"] = {container = "Connecticut"},
["Oklahoma City"] = {container = "Oklahoma", wp = "%l"},
["Grand Rapids"] = {container = "Michigan"},
["Memphis"] = {container = "Tennessee"},
["Birmingham, Alabama"] = {container = "Alabama"},
["Birmingham"] = {alias_of = "Birmingham, Alabama"},
["Fresno"] = {container = "California"},
["Richmond"] = {container = "Virginia"},
["Harrisburg"] = {container = "Pennsylvania"},
-- any major city of the top 50 MSAs that was missed by the list above
["Buffalo"] = {container = "New York"},
-- any of the top 50 cities by city population that were missed by the lists above
["El Paso"] = {container = "Texas"},
["Albuquerque"] = {container = "New Mexico"},
["Tucson"] = {container = "Arizona"},
["Colorado Springs"] = {container = "Colorado"},
["Omaha"] = {container = "Nebraska"},
["Tulsa"] = {container = "Oklahoma"},
-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}
export.united_states_cities_group = {
default_container = "Amerika Syarikat",
canonicalize_key_container = make_canonicalize_key_container(", USA", "negeri"),
default_placetype = "city",
default_wp = "%l, %c",
data = export.united_states_cities,
}
export.new_york_boroughs = {
["Bronx"] = {the = true, wp = "The Bronx"},
["Brooklyn"] = {},
["Manhattan"] = {},
["Queens"] = {},
["Staten Island"] = {},
}
export.new_york_boroughs_group = {
default_container = {key = "New York City", placetype = "city"},
default_placetype = "borough",
default_is_city = true,
data = export.new_york_boroughs,
}
export.vietnam_cities = {
-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
["Saigon"] = {alias_of = "Ho Chi Minh City"},
["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
["Danang"] = {alias_of = "Da Nang", display = true},
["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
["Hai Phong"] = {alias_of = "Haiphong", display = true},
-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "city", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia)
["Biên Hòa"] = {alias_of = "Bien Hoa", display = true},
["Biên Hoà"] = {alias_of = "Bien Hoa", display = true},
-- These two are not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are
-- both province-level municipalities and close to the 1,000,000 mark.
["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]]
["Cần Thơ"] = {alias_of = "Can Tho", display = true},
["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); capital [[Thuận Hóa district]]
["Huế"] = {alias_of = "Hue", display = true},
}
export.vietnam_cities_group = {
placename_to_key = false, -- don't add ", Vietnam" to make the key
default_container = "Vietnam",
canonicalize_key_container = make_canonicalize_key_container(" Province, Vietnam", "province"),
-- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of
-- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct
-- known locations.
default_placetype = {"municipality", "city"},
default_is_city = true,
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
data = export.vietnam_cities,
}
export.misc_cities = {
------------------ Africa -------------------
-- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from
-- [[w:List of urban areas in Africa by population]].
["Algiers"] = {container = "Algeria"}, -- 4,325,000 (Consolidated Urban Area)
["Oran"] = {container = "Algeria"}, -- 1,640,000 (Consolidated Urban Area)
["Luanda"] = {container = "Angola"}, -- 9,650,000 (Urban Area)
["Benguela"] = {container = "Angola"}, -- 1,420,000 (Urban Area)
["Cotonou"] = {container = "Benin"}, -- 2,150,000 (Agglomeration)
["Ouagadougou"] = {container = "Burkina Faso"}, -- 3,425,000 (Agglomeration)
["Bobo-Dioulasso"] = {container = "Burkina Faso"}, -- 1,100,000 (Agglomeration)
["Bujumbura"] = {container = "Burundi"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia)
["Yaoundé"] = {container = "Cameroon"}, -- 3,975,000 (City)
["Yaounde"] = {alias_of = "Yaoundé", display = true},
["Douala"] = {container = "Cameroon"}, -- 3,900,000 (City)
["Bangui"] = {container = "Central African Republic"}, -- 1,680,000 (Agglomeration)
["N'Djamena"] = {container = "Chad"}, -- 1,950,000 (City)
["Ndjamena"] = {alias_of = "N'Djamena", display = true},
["Kinshasa"] = {container = "Democratic Republic of the Congo"}, -- 16,300,000 (City; population of low reliability)
["Lubumbashi"] = {container = "Democratic Republic of the Congo"}, -- 2,875,000 (City; population of low reliability)
["Mbuji-Mayi"] = {container = "Democratic Republic of the Congo"}, -- 2,500,000 (City; population of low reliability)
["Kananga"] = {container = "Democratic Republic of the Congo"}, -- 1,370,000 (City; population of low reliability)
["Kisangani"] = {container = "Democratic Republic of the Congo"}, -- 1,300,000 (City; population of low reliability)
["Bukavu"] = {container = "Democratic Republic of the Congo"}, -- 1,100,000 (City; population of low reliability)
["Goma"] = {container = "Democratic Republic of the Congo"}, -- 1,010,000 (City; population of low reliability)
["Tshikapa"] = {container = "Democratic Republic of the Congo"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de)
["Cairo"] = {container = "Egypt"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
["Alexandria"] = {container = "Egypt"}, -- 6,250,000 (Agglomeration)
["Giza"] = {container = "Egypt"}, -- 4,458,135 (2023 from citypopulation.de)
["Shubra El Kheima"] = {container = "Egypt"}, -- 1,240,239 (2021 from citypopulation.de)
["Asmara"] = {container = "Eritrea"}, -- 1,090,000 (City; population of low reliability)
["Asmera"] = {alias_of = "Asmara", display = true},
["Addis Ababa"] = {container = "Ethiopia"}, -- 4,825,000 (Agglomeration)
["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
["Accra"] = {container = "Ghana"}, -- 6,800,000 (Agglomeration)
["Kumasi"] = {container = "Ghana"}, -- 2,900,000 (Agglomeration)
["Conakry"] = {container = "Guinea"}, -- 2,975,000 (Consolidated Urban Area)
["Abidjan"] = {container = "Ivory Coast"}, -- 7,050,000 (Agglomeration)
["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "region"}}, -- 4,450,000 (Municipality (urban population))
["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "region"}}, -- 2,125,000 (Municipality (urban population))
["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "region"}}, -- 1,410,000 (Municipality (urban population))
["Tanger"] = {alias_of = "Tangier", display = true},
["Tangiers"] = {alias_of = "Tangier", display = true},
["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "region"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
["Fes"] = {alias_of = "Fez", display = true},
["Fès"] = {alias_of = "Fez", display = true},
["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "region"}}, -- 1,270,000 (Municipality (urban population))
["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "region"}}, -- 1,140,000 (Municipality (urban population))
["Marrakech"] = {alias_of = "Marrakesh", display = true},
["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
["Niamey"] = {container = "Niger"}, -- 1,530,000 (City)
["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City)
["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration)
["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration)
["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
["Mogadishu"] = {container = "Somalia"}, -- 2,250,000 (unindicated; population of low reliability)
["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "province"}}, -- 5,100,000 (Consolidated Urban Area)
["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "province"}}, -- 3,900,000 (Consolidated Urban Area)
["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 2,921,488 (2011 census)
["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "province"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area)
["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias
["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability)
["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration)
["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration)
["Mwanza City"] = {alias_of = "Mwanza", display = true},
["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration)
["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration)
["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated)
["Lome"] = {alias_of = "Lomé", display = true},
["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population))
["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population))
["Soussa"] = {alias_of = "Sousse", display = true},
["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated)
["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area)
["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration)
------------------ Asia -------------------
-- sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Kabul"] = {container = "Afghanistan"}, -- 5,250,000 (Agglomeration)
["Baku"] = {container = "Azerbaijan"}, -- 3,725,000 (Administrative Area (urban population))
["Manama"] = {container = "Bahrain"}, -- 1,560,000 (unindicated)
["Dhaka"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 23,100,000 (Agglomeration)
["Dacca"] = {alias_of = "Dhaka", display = true},
["Chittagong"] = {container = {key = "Chittagong Division, Bangladesh", placetype = "division"}}, -- 5,050,000 (Agglomeration)
["Gazipur"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
["Khulna"] = {container = {key = "Khulna Division, Bangladesh", placetype = "division"}}, -- 1,210,000 (Agglomeration)
["Phnom Penh"] = {container = "Cambodia"}, -- 2,925,000 (Agglomeration)
["Tehran"] = {container = {key = "Tehran Province, Iran", placetype = "province"}}, -- 16,800,000 (Agglomeration)
["Teheran"] = {alias_of = "Tehran", display = true},
["Mashhad"] = {container = {key = "Razavi Khorasan Province, Iran", placetype = "province"}}, -- 3,475,000 (Agglomeration)
["Mashad"] = {alias_of = "Mashhad", display = true},
["Meshhed"] = {alias_of = "Mashhad", display = true},
["Meshed"] = {alias_of = "Mashhad", display = true},
["Isfahan"] = {container = {key = "Isfahan Province, Iran", placetype = "province"}}, -- 3,425,000 (Agglomeration)
["Esfahan"] = {alias_of = "Isfahan", display = true},
["Tabriz"] = {container = {key = "East Azerbaijan Province, Iran", placetype = "province"}}, -- 1,970,000 (Agglomeration)
["Shiraz"] = {container = {key = "Fars Province, Iran", placetype = "province"}}, -- 1,950,000 (Agglomeration)
["Ahvaz"] = {container = {key = "Khuzestan Province, Iran", placetype = "province"}}, -- 1,550,000 (Agglomeration)
["Qom"] = {container = {key = "Qom Province, Iran", placetype = "province"}}, -- 1,450,000 (City)
["Kermanshah"] = {container = {key = "Kermanshah Province, Iran", placetype = "province"}}, -- 1,130,000 (City)
["Baghdad"] = {container = "Iraq"}, -- 7,800,000 (Administrative Area (urban population))
["Basra"] = {container = "Iraq"}, -- 1,710,000 (Administrative Area (urban population))
["Mosul"] = {container = "Iraq"}, -- 1,550,000 (Administrative Area (urban population))
["Erbil"] = {container = "Iraq"}, -- 1,220,000 (Administrative Area (urban population))
["Kirkuk"] = {container = "Iraq"}, -- 1,160,000 (Administrative Area (urban population))
["Najaf"] = {container = "Iraq"}, -- 1,050,000 (Administrative Area (urban population))
["Tel Aviv"] = {container = "Israel"}, -- 3,000,000 (Agglomeration)
-- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a
-- [[w:corpus separatum]], so put the container as "Asia" and list Israel and Palestine as additional parents for
-- categorization purposes.
["Jerusalem"] = {container = {key = "Asia", placetype = "benua"},
addl_parents = {"Israel", "Palestine"}}, -- 1,080,000 (Agglomeration)
["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated)
["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated)
["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration)
["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize
["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration)
["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration)
["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration)
["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration)
["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability)
-- Kuala Lumpur is a federal capital city, not in any state
["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration)
-- there are various George Towns and Georgetowns
["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "negeri"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration)
["George Town"] = {alias_of = "George Town, Malaysia"},
["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City)
["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true},
["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population))
["Rangoon"] = {alias_of = "Yangon", display = true},
["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population))
["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration)
-- Pyongyang is a directly governed city, not in any province
["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population))
["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration)
["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated)
["Gaza City"] = {alias_of = "Gaza"},
["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration)
["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated)
["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability)
["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability)
["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City)
["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration)
-- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia
-- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]]
["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "province"}},
["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "province"}}, -- 1,570,000 (Agglomeration; including Pattaya)
-- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021;
-- second source is citypopulation.de reference date 2025-01-01.
["Istanbul"] = {placetype = {"city", "province"}, divs = {"districts"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration)
["İstanbul"] = {alias_of = "Istanbul", display = true},
["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "province"}}, -- 5.15 million; 5,200,000 (Agglomeration)
["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "province"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration)
["İzmir"] = {alias_of = "Izmir", display = true},
["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "province"}}, -- 2.02 million; 2,200,000 (Agglomeration)
["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "province"}}, -- 1.77 million; 1,780,000 (Agglomeration)
["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "province"}}, -- 1.71 million; 1,750,000 (Agglomeration)
["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "province"}}, -- 1.3 million; 1,400,000 (Agglomeration)
["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "province"}}, -- 1.35 million; 1,390,000 (Agglomeration)
["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "province"}}, -- 1.07 million; 1,100,000 (Agglomeration)
-- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not
-- display-canonicalize to the Turkish form Diyarbakır.
["Diyarbakir"] = {alias_of = "Diyarbakır"},
["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "province"}}, -- 1.03 million; 1,060,000 (Agglomeration)
["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration)
["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah)
["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City)
["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai)
["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated)
["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability)
["Sana'a"] = {alias_of = "Sanaa", display = true},
["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia)
------------------ Europe or Europe-like (Caucasus etc.) ---------------------
["Yerevan"] = {container = "Armenia"}, -- 1,520,000 (Agglomeration)
["Vienna"] = {container = "Austria"}, -- 2,375,000 (Agglomeration)
["Minsk"] = {container = "Belarus"}, -- 2,100,000 (unindicated)
["Brussels"] = {container = "Belgium"}, -- 2,800,000 (Consolidated Urban Area)
["Antwerp"] = {container = "Belgium"}, -- 1,270,000 (Consolidated Urban Area)
["Sofia"] = {container = "Bulgaria"}, -- 1,260,000 (Agglomeration)
["Zagreb"] = {container = "Croatia"},
["Prague"] = {container = "Czech Republic"}, -- 1,470,000 (Agglomeration)
["Brno"] = {container = "Czech Republic"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office)
["Olomouc"] = {container = "Czech Republic"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms)
["Copenhagen"] = {container = "Denmark"}, -- 1,800,000 (Consolidated Urban Area)
["Helsinki"] = {container = {key = "Uusimaa, Finland", placetype = "region"}}, -- 1,560,000 (Consolidated Urban Area)
["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration)
["Athens"] = {container = "Greece"},
["Thessaloniki"] = {container = "Greece"},
["Budapest"] = {container = "Hungary"},
-- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region"
["Dublin"] = {container = {key = "County Dublin, Ireland", placetype = "county"}},
["Riga"] = {container = "Latvia"},
["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "province"}},
["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "province"}},
["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "province"}},
-- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it.
["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "region"}},
["Oslo"] = {container = {key = "Oslo, Norway", placetype = "county"}},
["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}},
["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Krakow" without accent.
["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"},
["Kraków"] = {alias_of = "Krakow", display = true},
["Cracow"] = {alias_of = "Krakow", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent.
["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}},
["Gdansk"] = {alias_of = "Gdańsk", display = true},
["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}},
["Poznan"] = {alias_of = "Poznań", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Lodz" without accents.
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"},
["Łódź"] = {alias_of = "Lodz", display = true},
["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}},
["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}},
["Oporto"] = {alias_of = "Porto", display = true},
["Bucharest"] = {container = "Romania"},
["Belgrade"] = {container = "Serbia"},
["Stockholm"] = {container = "Sweden"},
["Zurich"] = {container = "Switzerland"},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Zurich" without umlaut.
--- Even Wikipedia uses the form without umlaut.
["Zürich"] = {alias_of = "Zurich", display = true},
["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast
-- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common.
["Kiev"] = {alias_of = "Kyiv"},
["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}},
["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"},
-- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement.
["Odesa"] = {alias_of = "Odessa"},
------------------ North America, South America ---------------------
-- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01);
-- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data;
-- Wikipedia city limits figures from [[w:List of largest cities in the Americas]].
["Buenos Aires"] = {container = "Argentina"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia)
["Córdoba, Argentina"] = {container = "Argentina", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia)
-- to avoid confusion with Córdoba in Spain
["Córdoba"] = {alias_of = "Córdoba, Argentina"},
["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"},
["Rosario"] = {container = "Argentina", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia)
["Mendoza"] = {container = "Argentina", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area)
["San Miguel de Tucumán"] = {container = "Argentina"}, -- 1,110,000 (Consolidated Urban Area)
["Tucumán"] = {alias_of = "San Miguel de Tucumán"},
["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"},
["Santa Cruz de la Sierra"] = {container = "Bolivia"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia)
["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"},
["La Paz"] = {container = "Bolivia"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz)
["El Alto"] = {container = "Bolivia"},
["Cochabamba"] = {container = "Bolivia"}, -- 1,280,000 (Consolidated Urban Area)
["Santiago"] = {container = "Chile"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? per Wikipedia)
["Valparaíso"] = {container = "Chile"}, -- 1,060,000 (Consolidated Urban Area)
["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area)
["Bogotá"] = {container = "Colombia"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia)
["Bogota"] = {alias_of = "Bogotá", display = true},
["Medellín"] = {container = "Colombia"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia)
["Medellin"] = {alias_of = "Medellín", display = true},
["Cali"] = {container = "Colombia"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia)
["Barranquilla"] = {container = "Colombia"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia)
["Bucaramanga"] = {container = "Colombia"}, -- 1,380,000 (Agglomeration)
["Cartagena, Colombia"] = {container = "Colombia", wp = "%l, %c"}, -- 1,250,000 (Agglomeration)
-- to avoid confusion with Cartagena, Spain
["Cartagena"] = {alias_of = "Cartagena, Colombia"},
["Cúcuta"] = {container = "Colombia"}, -- 1,130,000 (Agglomeration)
["Cucuta"] = {alias_of = "Cúcuta", display = true},
-- to avoid conflict with San Jose, California
["San José, Costa Rica"] = {container = "Costa Rica", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia)
["San José"] = {alias_of = "San José, Costa Rica"},
["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME
["Havana"] = {container = "Cuba"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia)
["Santo Domingo"] = {container = "Dominican Republic"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia)
["Guayaquil"] = {container = "Ecuador"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia)
["Quito"] = {container = "Ecuador"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? per Wikipedia)
["San Salvador"] = {container = "El Salvador"}, -- 1,580,000 (Municipality (urban population))
["Guatemala City"] = {container = "Guatemala"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia)
["Port-au-Prince"] = {container = "Haiti"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia)
["San Pedro Sula"] = {container = "Honduras"}, -- 1,330,000 (Consolidated Urban Area)
["Tegucigalpa"] = {container = "Honduras"}, -- 1,220,000 (Urban Area)
["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area)
["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area)
["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population))
["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia)
["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration)
["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area)
["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia)
["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia)
["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? per Wikipedia)
-- to avoid confusion with Valencia (city and autonomous community of Spain)
["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
["Valencia"] = {alias_of = "Valencia, Venezuela"},
["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}
export.misc_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(nil, "negara"),
default_placetype = "city",
data = export.misc_cities,
}
--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the
hundreds).
]==]
export.locations = {
export.continents_group,
export.countries_group,
export.country_like_entities_group,
export.former_countries_group,
export.australia_group,
export.austria_group,
export.bangladesh_group,
export.brazil_group,
export.canada_group,
export.china_group,
export.china_prefecture_level_cities_group,
export.china_prefecture_level_cities_group_2,
export.egypt_group,
export.finland_group,
export.france_group,
export.france_departments_group,
export.germany_group,
export.greece_group,
export.india_group,
export.indonesia_group,
export.iran_group,
export.ireland_group,
export.italy_group,
export.japan_group,
export.laos_group,
export.lebanon_group,
export.malaysia_group,
export.malta_group,
export.mexico_group,
export.moldova_group,
export.morocco_group,
export.netherlands_group,
export.new_zealand_group,
export.nigeria_group,
export.north_korea_group,
export.norway_group,
export.pakistan_group,
export.philippines_group,
export.poland_group,
export.portugal_group,
export.romania_group,
export.russia_group,
export.saudi_arabia_group,
export.south_africa_group,
export.south_korea_group,
export.spain_group,
export.taiwan_group,
export.thailand_group,
export.turkey_group,
export.ukraine_group,
export.united_kingdom_group,
export.united_states_group,
export.england_group,
export.northern_ireland_group,
export.scotland_group,
export.wales_group,
export.vietnam_group,
export.australia_cities_group,
export.brazil_cities_group,
export.canada_cities_group,
export.france_cities_group,
export.germany_cities_group,
export.india_cities_group,
export.indonesia_cities_group,
export.italy_cities_group,
export.japan_cities_group,
export.mexico_cities_group,
export.nigeria_cities_group,
export.pakistan_cities_group,
export.philippines_cities_group,
export.russia_cities_group,
export.saudi_arabia_cities_group,
export.south_korea_cities_group,
export.spain_cities_group,
export.taiwan_cities_group,
export.united_kingdom_cities_group,
export.united_states_cities_group,
export.new_york_boroughs_group,
export.vietnam_cities_group,
export.misc_cities_group,
}
return export
2mni79t5mgnor8ogyeszmd5msabwih8
281435
281433
2026-04-22T10:05:07Z
PeaceSeekers
3334
281435
Scribunto
text/plain
local export = {}
export.force_cat = false -- set to true to force category generation even on non-mainspace pages
local m_table = require("Module:table")
local string_utilities_module = "Module:string utilities"
local en_utilities_module = "Module:en-utilities"
local insert = table.insert
local concat = table.concat
local dump = mw.dumpObject
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
--[==[ intro:
This module contains data on all known locations, along with some lower-level code to process them (higher-level
known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using
mw.loadData().
===Location data===
'''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]],
especially the section `More about known locations`.'''
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations
and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are
states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table''
that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and
defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data
table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given
location is generally described by three values: (a) the group metadata table for the group the location is part of; (b)
the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all
locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location
and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()`
function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the
arguments to many functions.
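As a rough illustration of the layout just described, the following is a minimal sketch and not the module's actual
code: the tables are trimmed-down samples, and `sample_locations` and `find_location` are hypothetical names introduced
only for this example. It shows a group metadata table pointing to its data table, and a scan over the list of groups
that recovers the `group`, `key`, `spec` triple (here the spec is still uninitialized).

```lua
-- Hypothetical, trimmed-down data table: canonical keys mapped to location specs.
local sample_cities = {
	["Budapest"] = {container = "Hungary"},
	["Riga"] = {container = "Latvia"},
}

-- Group metadata table: group-level defaults plus a pointer to the data table.
local sample_cities_group = {
	default_placetype = "city",
	data = sample_cities,
}

-- Stand-in for the global `locations` list of group metadata tables.
local sample_locations = {sample_cities_group}

-- Hypothetical helper: return `group`, `key`, `spec` for a canonical key, or nil if it is unknown.
local function find_location(locations, key)
	for _, group in ipairs(locations) do
		local spec = group.data[key]
		if spec then
			return group, key, spec
		end
	end
	return nil
end

local group, key, spec = find_location(sample_locations, "Riga")
print(group.default_placetype, key, spec.container)
```

Because canonical keys are globally unique, a linear scan like this can stop at the first group that contains the key.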
In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must
be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases
for a given location and the alias keys only need to be unique within a particular group data table, not across all
groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another
group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations,
canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in
New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias
key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for
canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the
location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys
are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have
per-group defaults, but only global defaults.
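The distinction between canonical keys and alias keys can be sketched as follows. This is only an illustration under
stated assumptions: `resolve_key` is a hypothetical helper, not a function of this module, and its treatment of the
`display` property (`true` meaning "show the canonical key", a string meaning "show this string", unset meaning "show
what was written") is inferred from the comments in the data tables rather than from the actual implementation.

```lua
-- Sample entries in the shape used by the group data tables.
local sample_data = {
	["Zurich"] = {container = "Switzerland"},
	["Zürich"] = {alias_of = "Zurich", display = true},
	["Kyiv"] = {container = "Ukraine"},
	["Kiev"] = {alias_of = "Kyiv"},
}

-- Hypothetical helper: follow an alias key to its canonical key within one group's data table.
-- Returns the canonical key, the location spec for that key, and the form to display.
local function resolve_key(data, key)
	local spec = data[key]
	if not spec then
		return nil
	end
	if not spec.alias_of then
		return key, spec, key -- already a canonical key
	end
	local canonical = spec.alias_of
	local shown = key -- assumed default: keep the form that was written
	if spec.display == true then
		shown = canonical
	elseif type(spec.display) == "string" then
		shown = spec.display
	end
	return canonical, data[canonical], shown
end

print(resolve_key(sample_data, "Kiev"))
print(resolve_key(sample_data, "Zürich"))
```

The lookup is deliberately restricted to a single data table, matching the rule that alias keys need only be unique
within their own group.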
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it
must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare
category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding
bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys:
* Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories)
and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified
placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and
placenames, which is critical to understand when working with location data.) This also applies to constituent
countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such
as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena,
Ascension and Tristan da Cunha).
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative
divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or
ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if
different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above.
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and
Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`,
`Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name
in Spain, even though none of those cities are large enough to be included as known locations in this module. (The
cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
* Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent
territories, use a qualified key that contains the name of the country or constituent country in it, e.g.
`Normandy, France` (a region), `Calvados, France` (a department in the region of Normandy), `Herefordshire, England`
(a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, Finland` (a region),
`Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, Ireland` (a county) and
`New York, USA` (a state). As shown in these various examples, (a) first and second-level divisions are sometimes both
included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent
country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States
or United States of America); (c) the word `the` is not normally included in the key even if the location is normally
preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this),
except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates
an apparent redundancy, as with `Central Finland, Finland`; and (e) sometimes the placetype is included in the key, as
with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several
other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on
per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in
Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "negeri", "Oblast" etc. in their name
because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and
counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County"
preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article
naming scheme for a given administrative division is a strong clue as to how the division is normally referred to,
and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and
Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. North and South
Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects
containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''.
Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that
differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an
uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This
copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table
into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a
given location property. (The initialization process also does more transformations in a few cases, noted below.) Note
that the default value of a given property is stored under a key in the group metadata table that is preceded by the
string `default_`; for example, the default value corresponding to the `placetype` property of a given location is
specified in the `default_placetype` key in the group metadata table.
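The default-copying step can be sketched roughly as below. This is a simplified illustration, not the real
`initialize_spec()`: the function name `sketch_initialize_spec` and the `initialized` marker field are hypothetical, and
the other transformations performed during initialization (such as container canonicalization) are left out.

```lua
-- Hypothetical sketch: copy group-level `default_*` values into a location spec.
local function sketch_initialize_spec(group, spec)
	if spec.initialized then
		return spec -- idempotent: an already-initialized spec is returned unchanged
	end
	for group_key, default_value in pairs(group) do
		-- `default_placetype` supplies `placetype`, `default_wp` supplies `wp`, and so on.
		local prop = group_key:match("^default_(.+)$")
		if prop and spec[prop] == nil then
			spec[prop] = default_value
		end
	end
	-- Groups whose default placetype is "city" mark all their locations as cities.
	if group.default_placetype == "city" then
		spec.is_city = true
	end
	spec.initialized = true
	return spec
end

local sample_group = {default_placetype = "city", default_wp = "%l, %c", data = {}}
local spec = sketch_initialize_spec(sample_group, {container = "Hungary", wp = "%l"})
print(spec.placetype, spec.wp, spec.is_city)
```

Only properties that the location spec leaves unset are filled in, so a value given on the location itself (here `wp`)
always takes precedence over the group default.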
The following are the properties of the location spec.
* `placetype`: String specifying the placetype of the location (e.g. "negara", "negeri", "province"). This can also be a
table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but
the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any
of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the
group level, or an error occurs.
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the
immediate ''container'' (or containers) of the given location. A container is another location which this location is
considered to be directly part of, either politically or (above the country level) geographically. Some locations
belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and
Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]])
of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed
the ''container trail'', and some functions compute and return this trail as part of their operation. When a location
spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a
list of canonicalized container structures, each of which is of the form
`{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location
key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if
there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the
placetype from the container structure.) The list of canonicalized container structures is stored into the
`.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec
form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The
canonicalization process is described in more detail below under [[#Container spec canonicalization]].
* `divs`: List of recognized political divisions; e.g. for the Netherlands, a specification of the form
`divs = {"provinces", "municipalities"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]]
and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be
found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as
just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to
all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the
same format as `divs`. This is intended to be used in the situation where some division types are shared among all
locations in the group and others differ from location to location. An example where this is used is the United
States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have
census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties`
and `county seats` are specified in the group-level `default_divs` because not all states have counties and county
seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have
additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have
municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property
associated with the division type), any division type specified on a sub-country-level location must also be specified
on all containers up through the country. For example, since French departments specify `communes` and
`municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for
France itself.
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category
pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is
normal for locations) that computes the location description can also be given. This is used, for example, for
Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the
keydesc is replaced with the default value of the location description, which specifies the location's placename,
placetype, and the corresponding values for each container in the container trail, generally up through (but not
beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct
the full description of various categories, such as bare location categories, whose description generally reads
`"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the
specified or auto-constructed location description.
* `fulldesc`: String overriding the full description for the bare location category (but not for any other category).
This is currently used only for the location `Earth`, at the very top of the tree (because the standard
`people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent
inhabitants). FIXME: This should be renamed `bare_category_fulldesc`.
* `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories
generated based on the immediate container(s). For example, `Hawaii, USA` specifies `Polynesia` as an additional
parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category)
as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an
additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on
the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the
bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME:
This should be renamed `bare_category_addl_parents`.
* `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent
to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how
to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the
elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the
default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named
e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is
Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase
`province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have
to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full
location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category
pages, are shown in the upper right of bare category pages.
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles
and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`.
It rarely needs to be specified because the category page and the article page almost always follow the same format.
* `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the
Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and
`wpcat` and defaults to `wpcat`, which is usually (but not always) correct.
* `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in
category names such as [[:Category:Cities in the Northern Territory, Australia]] and in old-style place descriptions
when the location occurs as the first holonym, such as the city [[Darwin]] described using
{{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean
properties is {nil}, which amounts to the same as {false}.
* `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only
affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as
[[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set
only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail
for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. The
general principle used in setting this is that all countries in Europe, all dependent territories of any such country,
all former British colonies, and any dependent territories of these former colonies, are assumed to use British
spelling, while all other countries and associated dependent territories are assumed to use American spelling. This
can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for
city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire,
Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and
(through a group-level setting) New York boroughs. In addition, it is set to `true` in initialize_spec() whenever
the group-level `default_placetype == "city"`, so that all cities get it set without explicitly needing to add a
group-level setting for this. Note that the condition `default_placetype == "city"` intentionally excludes Chinese
prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods,
but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to
categories like [[:Category:Rivers in Osaka Prefecture, Japan]] and [[:Category:Cities in Wuhan]] for holonyms that
are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like
[[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities;
(c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location.
(Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those
that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such
places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more
generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!
====Location divisions====
The `divs` field of a location describes the recognized political division types of that location. Specifying a given
division type causes any place defined as being of that division type, with the location as a holonym, to be
categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United
States has `"states"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under
[[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for
"generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a
`generic_before_cities` field if the location is a city); this includes things like cities, towns, villages,
neighbo(u)rhoods and rivers. A given element in the `divs` list is usually a string naming a plural placetype; the
placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular
plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field
(if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which
gets most things right but has problems with `passes` and `fortresses`, which are singularized to `passe` and
`fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object
can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with
the placetype. An example of this is the `divs` list for Canada:
{
["Canada"] = {divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
"counties", "districts", "municipalities", "regional municipalities",
"rural municipalities", "parishes",
"Indian reserves",
"census divisions",
{type = "townships", prep = "in"},
}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a
single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and
territories. Similar things are done for other countries that have more than one type of first-level administrative
division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the
table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and
territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for
use in categories and won't be recognized as the placetype field in a {{tl|place}} description). In addition, townships
are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be
[[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat
related to whether a given placetype is an official administrative or statistical division of the location in question
and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be
used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities
and towns.)
Another more complex example is the divisions given for Quebec:
{
["Quebec, Canada"] = {divs = {
"counties",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
{type = "regions", container_parent_type = false},
{type = "townships", prep = "in"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "in"}, "municipalities"}},
}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the
entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as
its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which
exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one
subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the
`container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be
[[:Category:Regional municipalities of Canada]]. Likewise, `regions` as administrative divisions (as opposed to mere
geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent
using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and
`village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize
`parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties,
just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "counties"}`
means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of
Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level
parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly,
`township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not''
[[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].
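As a rough illustration of the rules above, the following standalone sketch maps one `divs` entry to the category
names it implies. The helper names `ucfirst`, `normalize_div` and `category_names` are hypothetical and are not part of
this module. The sketch assumes that `prep` defaults to `of`, uses the English preposition `in` that the prose above
describes, and ignores `container_parent_type` and singular/plural handling.

```lua
-- Hypothetical helpers for illustration only; not this module's implementation.
local function ucfirst(s)
	return (s:gsub("^%l", string.upper))
end

-- Turn a bare string into an object with a `type` field; pass objects through.
local function normalize_div(div)
	if type(div) == "string" then
		return {type = div}
	end
	return div
end

-- Return the list of category names implied by one `divs` entry for `location`.
local function category_names(div, location)
	div = normalize_div(div)
	-- Without `cat_as`, the entry categorizes as itself.
	local cat_as = div.cat_as or {div}
	if type(cat_as) == "string" then
		cat_as = {cat_as}
	end
	local names = {}
	for _, cat in ipairs(cat_as) do
		cat = normalize_div(cat)
		-- Each categorization target carries its own `prep`; assume "of" otherwise.
		local prep = cat.prep or "of"
		table.insert(names, ucfirst(cat.type) .. " " .. prep .. " " .. location)
	end
	return names
end

local quebec_townships = category_names(
	{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
	"Quebec, Canada"
)
print(table.concat(quebec_townships, "; "))
```

Under these assumptions, a `township municipalities` entry yields one name with `in` and one with `of`, matching the
two categories named in the Quebec discussion.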
====Container spec canonicalization====
A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'',
each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a
higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are
contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The
`placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of
initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and
removes the spec from `.container`. It works as follows:
# If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place.
For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies
`default_container = "Brazil"`.
# A single string or canonicalized container object is allowed and made into a one-element list.
# If a list element is a string that did ''not'' come from `default_container`, and there is a group-level
`canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get
a canonicalized container object.
# Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to
`"negara"`.
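The four steps above can be sketched as a standalone function. This is a simplified restatement for reading purposes;
`initialize_spec()` remains the authoritative implementation. The name `canonicalize_containers` and the sample groups
are hypothetical.

```lua
-- Illustrative sketch only; `canonicalize_containers` is a hypothetical name.
local function canonicalize_containers(container, group)
	local from_default = false
	if container == nil then
		-- Step 1: fall back to the group-level default.
		container = group.default_container
		from_default = true
	end
	if container == nil then
		return nil
	end
	-- Step 2: wrap a single string or container object in a one-element list.
	if type(container) == "string" or container.key then
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				-- Step 3: explicitly given strings go through the group's function.
				cont = group.canonicalize_key_container(cont)
			else
				-- Step 4: any remaining string is taken to be a country key.
				cont = {key = cont, placetype = "negara"}
			end
		end
		table.insert(containers, cont)
	end
	return containers
end

-- Hypothetical group that only supplies a default container.
local brazil_like = {default_container = "Brazil"}
local result = canonicalize_containers(nil, brazil_like)
print(result[1].key, result[1].placetype)
```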
====Alias keys====
Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec
structure from canonical keys. This structure does not, in general, have defaults at the group level and is not
initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location
spec:
* `alias_of`: The canonical key of which this key is an alias. Required.
* `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the`
but does not pay attention to the value of `the` for the corresponding canonical key.
* `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be
converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise,
the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename
of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display
canonicalizing.
* `placetype`: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype,
and if that is unspecified, to the group-level default placetype.
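A small sketch of how the `display` property could be interpreted, using invented keys. The table `sample_data` and the
function `display_key` are hypothetical and exist only for this illustration; the real alias entries live in the
group data tables.

```lua
-- Hypothetical group data; the keys are invented examples.
local sample_data = {
	["United States"] = {the = true},
	-- Display alias: shown under the canonical key's placename.
	["USA"] = {alias_of = "United States", display = true},
	-- Category-only alias: displayed as written.
	["America"] = {alias_of = "United States"},
}

-- Return the key whose placename would be used when displaying `key`.
local function display_key(data, key)
	local spec = data[key]
	if not spec or not spec.alias_of then
		return key
	end
	if spec.display == true then
		return spec.alias_of
	elseif spec.display then
		-- Any other non-false value names the key to display as.
		return spec.display
	end
	return key
end

print(display_key(sample_data, "USA"), display_key(sample_data, "America"))
```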
====Location group metadata tables====
As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The
metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but
preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only
keys, which are mostly functions. The following are the possible group-only keys:
* `data`: This points to the group data table for the group, as described above.
* `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias)
into the full and elliptical placenames. The difference between full and elliptical placenames is described in the
documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g.
`Phuket Province, Thailand` or `County Mayo, Ireland`), in which case the full placename includes the placetype and
the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or
`Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the
elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is
`Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename
distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there
is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as
`State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs.
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key,
and returns two values, the full and elliptical placenames, respectively. If left undefined, the default is to
chop off anything starting with a comma and return the result as both full and elliptical placename, and if
specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be
defined, it is best to use the helper function `make_key_to_placename`, if possible (or
`make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than
rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default
implementation and such) rather than directly calling the function in the `key_to_placename` field.
* `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be
either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this
(generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or
`make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to
`key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly
invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged
as the key. Otherwise, the default algorithm works as follows:
*# If the group-level `default_placetype == "city"`, use the placename unchanged as the key.
*# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma +
space and use the result as the key.
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and
`placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field
to the placename after a comma + space and use the result as the key.
*# Otherwise, use the placename unchanged as the key.
* `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string,
to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to
construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own.
* `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived
from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the
location. See [[#Location divisions]] for more details.
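To make the full vs. elliptical distinction for `key_to_placename` concrete, here is a minimal standalone sketch. Both
function names are hypothetical; the second is hand-rolled purely for illustration, and in real group definitions the
helper `make_key_to_placename` is preferred, as noted above.

```lua
-- Default behaviour described above: drop everything from the first comma on.
local function default_key_to_placename(key)
	local placename = key:gsub(",.*", "")
	return placename, placename
end

-- Hand-rolled example for keys such as "Phuket Province, Thailand".
local function thai_province_key_to_placename(key)
	-- Full placename: the key without its container.
	local full = key:gsub(", Thailand$", "")
	-- Elliptical placename: additionally without the placetype.
	local elliptical = full:gsub(" Province$", "")
	return full, elliptical
end

print(default_key_to_placename("Arizona, USA"))
print(thai_province_key_to_placename("Phuket Province, Thailand"))
```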
]==]
-----------------------------------------------------------------------------------
-- Helper functions --
-----------------------------------------------------------------------------------
--[==[
Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to
format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the
error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like
this).
]==]
function export.process_error(fmt, ...)
local args = {...}
for i = 1, select("#", ...) do
args[i] = dump(args[i])
end
return error(string.format(fmt, unpack(args)))
end
--[==[
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user
error triggered by bad input or a system error due to something like running out of memory or hitting a time limit).
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the
format string as if `fmt:format(...)` were called.
]==]
function export.internal_error(fmt, ...)
export.process_error("Internal error: " .. fmt, ...)
end
local internal_error = export.internal_error
-- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If
-- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item`
-- equals `list_or_element`.
local function list_or_element_contains(list_or_element, item)
if type(list_or_element) == "table" then
return m_table.contains(list_or_element, item) and true or false
end
return list_or_element == item
end
--[==[
Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full
`"County Durham"` vs. elliptical `"Durham"`). If `key_to_placename` is set to `false`, the key is returned unchanged as
both placenames. If the group does not define `key_to_placename`, both full and elliptical placenames are computed by
chopping off anything starting with a comma.
]==]
function export.key_to_placename(group, key)
if group.key_to_placename == false then
return key, key
end
if group.key_to_placename then
local full_placename, elliptical_placename = group.key_to_placename(key)
if type(full_placename) ~= "string" then
internal_error("Key %s returned a non-string full placename: %s", key, full_placename)
end
if type(elliptical_placename) ~= "string" then
internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename)
end
return full_placename, elliptical_placename
end
key = key:gsub(",.*", "")
return key, key
end
--[==[
Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for
the distinction between keys and placenames) and return the result. If `placename_to_key` exists with the value `false`,
return the placename unchanged. If the group does not define `placename_to_key`, the placename is returned unchanged
when the group-level `default_placetype` is `city`. Otherwise, if the group defines a `default_container` that is either
a string or a canonical container object for a country or constituent country, the container's key is appended to the
placename after a comma and a space. In all other cases the placename is returned unchanged.
]==]
function export.placename_to_key(group, placename)
if group.placename_to_key == false then
return placename
elseif group.placename_to_key then
local key = group.placename_to_key(placename)
if type(key) ~= "string" then
internal_error("Placename %s returned a non-string key: %s", placename, key)
end
return key
elseif group.default_placetype == "city" then
return placename
else
local defcon = group.default_container
if not defcon then
return placename
elseif type(defcon) == "string" then
return placename .. ", " .. defcon
elseif type(defcon) == "table" and (defcon.placetype == "negara" or
defcon.placetype == "constituent country") then
return placename .. ", " .. defcon.key
else
return placename
end
end
end
--[==[
Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't
specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and
`placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original
non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more
than one. Containers should be carefully distinguished from category parents. Generally the container is the first
category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents,
which indicate some sort of relation between the category parent and the location but not necessarily one of
containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
]==]
function export.initialize_spec(group, key, spec)
if spec.initialized then
return
end
local container = spec.container
local containers
local container_from_default
if not container then
container = group.default_container
container_from_default = true
end
if container then
if type(container) == "string" or container.key then
container = {container}
end
containers = {}
for _, cont in ipairs(container) do
if type(cont) == "string" then
if group.canonicalize_key_container and not container_from_default then
cont = group.canonicalize_key_container(cont)
else
cont = {key = cont, placetype = "negara"}
end
end
insert(containers, cont)
end
end
spec.containers = containers
spec.container = nil
local function value_with_default(val, default_val)
if val == nil then
return default_val
else
return val
end
end
local function set_or_default(prop)
spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
set_or_default("placetype")
if not spec.placetype then
internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec)
end
set_or_default("divs")
spec.addl_divs = group.addl_divs
for _, prop in ipairs {
"keydesc",
"fulldesc",
"addl_parents",
"overriding_bare_label_parents",
"bare_category_parent_type",
"wp",
"wpcat",
"commonscat",
"british_spelling",
"the",
"no_container_cat",
"no_container_parent",
"no_generic_place_cat",
"no_check_holonym_mismatch",
"no_auto_augment_container",
"no_include_container_in_desc",
"is_city",
"is_former_place",
} do
set_or_default(prop)
end
-- `default_placetype == "city"` is correct; if `default_placetype` has something else like `prefecture-level city`
-- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as
-- is_city.
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city")
spec.initialized = true
end
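--[=[
Design note on `value_with_default` as used in `initialize_spec()`: the explicit `val == nil` test matters because
several properties are booleans. The standalone sketch below (with the helper copied out so that it runs on its own)
contrasts it with the common `val or default` idiom, which would discard an explicit `false`.

```lua
local function value_with_default(val, default_val)
	if val == nil then
		return default_val
	end
	return val
end

-- A spec value that explicitly opts out of a group default of `true`.
local spec_value, group_default = false, true
local kept = value_with_default(spec_value, group_default)
local lost = spec_value or group_default
print(kept, lost)
```
]=]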
--[=[
Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group
with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values:
the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object,
which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default
property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the
property in question).
`alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and
the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following
happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical
location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not
copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal
case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by
looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"}
except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key,
and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
alias_resolution ~= "all" then
internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
end
local spec = group.data[key]
if not spec then
return nil
end
local function check_correct_placetype(placetype)
if type(placetype) == "table" then
for _, pt in ipairs(placetype) do
if list_or_element_contains(placetypes, pt) then
return true
end
end
return false
else
return list_or_element_contains(placetypes, placetype)
end
end
if spec.alias_of then
local resolved_key = spec.alias_of
local resolved_spec = group.data[resolved_key]
if not resolved_spec then
internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
elseif resolved_spec.alias_of then
internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
key, resolved_key)
end
if alias_resolution == "none" or alias_resolution == "display" then
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
"`default_placetype`", key, spec, resolved_spec)
end
if not check_correct_placetype(placetype) then
return nil
end
if alias_resolution == "display" then
if spec.display == true then
key = resolved_key
elseif spec.display then
key = spec.display
end
end
return key, spec
end
key = resolved_key
spec = resolved_spec
end
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
end
if not check_correct_placetype(placetype) then
return nil
end
export.initialize_spec(group, key, spec)
return key, spec
end
--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists
in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one
of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
local key = export.placename_to_key(group, placename)
return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end
--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec.
If no such key exists, return {nil}. This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
local found_locations = {}
for _, group in ipairs(export.locations) do
local spec = group.data[key]
if not spec then
-- do nothing
elseif spec.alias_of then
mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
else
insert(found_locations, {group, spec})
end
end
if not found_locations[1] then
return nil
elseif found_locations[2] then
internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
else
local group, spec = unpack(found_locations[1])
export.initialize_spec(group, key, spec)
return group, spec
end
end
--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator
returns three values at each iteration: the location group, canonical key by which the location is known and the spec
object describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
]==]
function export.iterate_matching_location(data)
local i = 0
local n = #export.locations
return function()
while true do
i = i + 1
if i > n then
break
end
local group = export.locations[i]
local key, spec
if data.placename then
key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
data.alias_resolution)
else
if not data.key then
internal_error("'.placename' or '.key' must be defined: %s", data)
end
key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
end
if key then
return group, key, spec
end
end
end
end
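--[=[
Usage sketch for the closure-based iterator pattern used by `iterate_matching_location`. The data and the function
`iterate_matching` below are hypothetical stand-ins (the real function consults `export.locations` and checks
placetypes); the point is only how a stateful closure plugs into a generic `for` loop.

```lua
-- Hypothetical stand-in data: two groups, each mapping keys to specs.
local groups = {
	{name = "countries", data = {["Georgia"] = {placetype = "country"}}},
	{name = "us_states", data = {["Georgia"] = {placetype = "state"}, ["Ohio"] = {placetype = "state"}}},
}

-- Return a stateful iterator over every group that has an entry for `key`.
local function iterate_matching(key)
	local i = 0
	return function()
		while i < #groups do
			i = i + 1
			local spec = groups[i].data[key]
			if spec then
				return groups[i], key, spec
			end
		end
		-- Falling off the end returns nothing, which ends a generic `for`.
	end
end

local found = {}
for group, key, spec in iterate_matching("Georgia") do
	table.insert(found, group.name .. ":" .. spec.placetype)
end
print(table.concat(found, ", "))
```
]=]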
--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
]==]
function export.get_matching_location(data)
local all_found = {}
for group, key, spec in export.iterate_matching_location(data) do
insert(all_found, {group, key, spec})
end
if not all_found[1] then
internal_error("Couldn't find matching location for data %s", data)
elseif all_found[2] then
internal_error("Found multiple matching locations for data %s: %s", data, all_found)
else
return unpack(all_found[1])
end
end
--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
internal error happens if a container loop is detected. Each iteration returns a list of location objects, each of which
contains `group`, `key` and `spec` fields.
]==]
function export.iterate_containers(group, key, spec)
local keys_seen = {}
keys_seen[key] = true
local iterations = 0
local last_iteration_containers = {{group = group, key = key, spec = spec}}
return function()
iterations = iterations + 1
if iterations > 10 then
internal_error("Probable loop in containers when processing key %s", key)
end
local next_iteration_containers = {}
for _, location in ipairs(last_iteration_containers) do
local containers = location.spec.containers
if containers then
for _, container in ipairs(containers) do
local container_group, container_key, container_spec = export.get_matching_location {
placetypes = container.placetype,
key = container.key,
}
if not keys_seen[container_key] then
insert(next_iteration_containers, {
group = container_group, key = container_key, spec = container_spec
})
keys_seen[container_key] = true
end
end
end
end
if not next_iteration_containers[1] then
return nil
end
last_iteration_containers = next_iteration_containers
return next_iteration_containers
end
end
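--[=[
Sketch of the level-by-level traversal performed by `iterate_containers`, on a toy containment graph. The table
`containers_of` and the function `iterate_levels` are hypothetical; the sketch uses plain strings instead of location
objects and omits the iteration cap that guards against loops.

```lua
-- Hypothetical containment graph keyed by location name.
local containers_of = {
	Russia = {"Europe", "Asia"},
	Europe = {"Eurasia"},
	Asia = {"Eurasia"},
	Eurasia = {"Earth"},
	Earth = {},
}

-- Return an iterator yielding one list of not-yet-seen containers per level.
local function iterate_levels(start)
	local seen = {[start] = true}
	local current = {start}
	return function()
		local next_level = {}
		for _, loc in ipairs(current) do
			for _, cont in ipairs(containers_of[loc] or {}) do
				if not seen[cont] then
					-- Record the container so it is never returned twice.
					seen[cont] = true
					table.insert(next_level, cont)
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		current = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_levels("Russia") do
	table.insert(levels, table.concat(level, "+"))
end
print(table.concat(levels, " > "))
```

Eurasia is reachable from both Europe and Asia but appears in only one level, which mirrors the `keys_seen` check
above.
]=]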
--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename,
display_form) or ("[[%s]]"):format(placename)
if spec.the then
linked_placename = "the " .. linked_placename
end
return linked_placename
end
--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.) `container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of
the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one
matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full
placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match
and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain
countries (such as South Korean and North Korean counties, which include the word "County" in the key). The resulting
chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped
and the full and elliptical placenames are the same.
Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
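As a rough sketch of what the generated function does for the second example, the two chopping steps can be written out
by hand with `string.gsub` (only the first return value of `gsub` is kept here):

```lua
local key = "Gangwon County, South Korea"
-- Step 1: remove the container to get the full placename.
local full = key:gsub(", South Korea$", "")
-- Step 2: remove the political division to get the elliptical placename.
local elliptical = full:gsub(" County$", "")
-- full is "Gangwon County"; elliptical is "Gangwon".
```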
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
if type(container_patterns) == "string" then
container_patterns = {container_patterns}
end
if type(divtype_patterns) == "string" then
divtype_patterns = {divtype_patterns}
end
return function(key)
local full_placename = key
if container_patterns then
for _, container_pattern in ipairs(container_patterns) do
local nsubs
full_placename, nsubs = full_placename:gsub(container_pattern, "")
if nsubs > 0 then
break
end
end
end
local elliptical_placename = full_placename
if divtype_patterns then
for _, divtype_pattern in ipairs(divtype_patterns) do
local nsubs
elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
if nsubs > 0 then
break
end
end
end
return full_placename, elliptical_placename
end
end
--[=[
This is typically used to define `placename_to_key`. It generates a function that appends a string to the end of a given
placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group
tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have
special meaning in Lua patterns) to be appended first to the placename; if already present at the end, it is not
appended. `container_suffix` is then appended unconditionally, if given. Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`)
or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
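The restriction on hyphens exists because the suffix is used directly as a Lua pattern. The sketch below shows the
failure with a hypothetical suffix and placename (neither occurs in this module's data), along with one possible
workaround, escaping the suffix first. The workaround is an assumption for illustration and is not what the function
above does.

```lua
local placename = "Foo Sub-District"   -- hypothetical placename
local suffix = " Sub-District"         -- hypothetical suffix containing a hyphen

-- Unescaped, "b-" means "as few b's as possible", so the literal hyphen never matches.
local naive = placename:find(suffix .. "$")

-- Prefixing every non-alphanumeric character with "%" makes the suffix match literally.
local escaped = suffix:gsub("%W", "%%%0")
local safe = placename:find(escaped .. "$")
-- naive is nil; safe is 4 (the position where the suffix starts).
```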
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
return function(placename)
local key = placename
if divtype_suffix then
if not key:find(divtype_suffix .. "$") then
key = key .. divtype_suffix
end
end
if container_suffix then
key = key .. container_suffix
end
return key
end
end
--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is. Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "province"}`.
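The sketch below contrasts the string case with the pass-through case. The factory is repeated as a local copy only so
that the snippet runs on its own, and the `explicit` table is a hypothetical container entry chosen for illustration.

```lua
-- Local copy of the factory, repeated here only to make the snippet self-contained.
local function make_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container .. (suffix or ""), placetype = placetype}
		end
		return container
	end
end

local canonicalize = make_canonicalize_key_container(", Canada", "province")
local from_string = canonicalize("Ontario")
-- Hypothetical container that is already in canonical form.
local explicit = {key = "Nunavut, Canada", placetype = "territory"}
local from_table = canonicalize(explicit)
-- from_string.key is "Ontario, Canada"; from_table is the same table as `explicit`.
```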
]=]
local function make_canonicalize_key_container(suffix, placetype)
return function(container)
if type(container) == "string" then
return {key = container .. (suffix or ""), placetype = placetype}
else
return container
end
end
end
-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------
export.continents = {
["Bumi"] = {the = true, placetype = "planet", addl_parents = {"alam semula jadi"},
fulldesc = "=the planet [[Earth]] and the features found on it"},
["Afrika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
["Amerika"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined",
wp = "Amerika"},
["America"] = {alias_of = "Amerika", the = true},
["Amerika Utara"] = {placetype = "benua", container = {key = "Amerika", placetype = "superbenua"}},
["Caribbean"] = {the = true, placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}},
["Amerika Tengah"] = {placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}},
["Amerika Selatan"] = {placetype = "benua", container = {key = "Amerika", placetype = "superbenua"}},
["Antartika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"},
fulldesc = "=the territory of [[Antarctica]]"},
["Eurasia"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
["Asia"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
["Eropah"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
["Oceania"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
["Melanesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
["Micronesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
["Polynesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
}
export.continents_group = {
default_overriding_bare_label_parents = {}, -- container parents should be used
default_divs = {{type = "negara", prep = "di"}},
-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
default_no_include_container_in_desc = true,
default_no_container_cat = true,
default_no_container_parent = true,
default_no_auto_augment_container = true,
default_no_generic_place_cat = true,
-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
-- this level. We also run into problems with supercontinents, which have "benua" as the fallback and cause
-- mismatches.
default_no_check_holonym_mismatch = true,
data = export.continents,
}
-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = {
["Afghanistan"] = {container = "Asia", divs = {"provinces", "districts"}},
["Albania"] = {container = "Eropah", divs = {"counties", "municipalities", "communes",
{type = "administrative units", cat_as = "communes"},
}, british_spelling = true},
["Algeria"] = {container = "Afrika", divs = {"provinces", "communes", "districts", "municipalities"}},
["Andorra"] = {container = "Eropah", divs = {"parishes"}, british_spelling = true},
["Angola"] = {container = "Afrika", divs = {"provinces", "municipalities"}},
["Antigua dan Barbuda"] = {container = "Caribbean", divs = {"provinces"}, british_spelling = true},
["Argentina"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}},
["Armenia"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts", "municipalities"},
british_spelling = true},
["Republik Armenia"] = {alias_of = "Armenia", the = true}, -- differs in "the"
-- Both a country and continent
["Australia"] = {container = "Oceania", divs = {
{type = "negeri", cat_as = "negeri dan wilayah"},
{type = "wilayah", cat_as = "negeri dan wilayah"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"},
"local government areas", "dependent territories",
}, british_spelling = true},
["Austria"] = {container = "Eropah", divs = {"negeri", "districts", "municipalities"}, british_spelling = true},
["Azerbaijan"] = {container = {"Eropah", "Asia"}, divs = {"districts", "municipalities"}, british_spelling = true},
["Bahamas"] = {the = true, container = "Caribbean", divs = {"districts"}, british_spelling = true, wp = "The %l"},
["Bahrain"] = {container = "Asia", divs = {"governorates"}},
["Bangladesh"] = {container = "Asia", divs = {"divisions", "districts", "municipalities"}, british_spelling = true},
["Barbados"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Belarus"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true},
["Belgium"] = {container = "Eropah", divs = {"regions", "provinces", "municipalities"}, british_spelling = true},
["Belize"] = {container = "Amerika Tengah", divs = {"districts"}, british_spelling = true},
["Benin"] = {container = "Afrika", divs = {"departments", "communes"}},
["Bhutan"] = {container = "Asia", divs = {"districts", "gewogs"}},
["Bolivia"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}},
["Bosnia dan Herzegovina"] = {container = "Eropah", divs = {"entities", "cantons", "municipalities"}, british_spelling = true},
["Bosnia dan Hercegovina"] = {alias_of = "Bosnia dan Herzegovina", display = true},
["Bosnia"] = {alias_of = "Bosnia dan Herzegovina", display = true},
["Botswana"] = {container = "Afrika", divs = {"districts", "subdistricts"}, british_spelling = true},
["Brazil"] = {container = "Amerika Selatan", divs = {
"negeri", "municipalities", "macroregions",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Brunei"] = {container = "Asia", divs = {"daerah", "mukim"}, british_spelling = true},
["Bulgaria"] = {container = "Eropah", divs = {"provinces", "municipalities"}, british_spelling = true},
["Burkina Faso"] = {container = "Afrika", divs = {"regions", "departments", "provinces"}},
["Burundi"] = {container = "Afrika", divs = {"provinces", "communes"}},
["Kemboja"] = {container = "Asia", divs = {"provinces", "districts"}},
["Cameroon"] = {container = "Afrika", divs = {"regions", "departments"}},
["Kanada"] = {container = "Amerika Utara", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of provinces and territories"},
"counties", "districts", "municipalities", "regional municipalities",
"rural municipalities", "parishes",
-- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless
-- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is
-- still at [[w:Indian reserves]]).
"Indian reserves",
"census divisions",
{type = "townships", prep = "di"},
},
british_spelling = true},
["Cape Verde"] = {container = "Afrika", divs = {"municipalities", "parishes"}},
["Central African Republic"] = {the = true, container = "Afrika", divs = {"prefectures", "subprefectures"}},
["Chad"] = {container = "Afrika", divs = {"regions", "departments"}},
["Chile"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "communes"}},
["China"] = {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and autonomous regions"},
{type = "autonomous regions", cat_as = "provinces and autonomous regions"},
{type = "FORMER provinces", cat_as = "former provinces"},
"special administrative regions",
"prefectures",
{type = "FORMER prefectures", cat_as = "former prefectures"},
"prefecture-level cities",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
{type = "FORMER counties", cat_as = "former counties and county-level cities"},
{type = "FORMER county-level cities", cat_as = "former counties and county-level cities"},
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities.
"districts",
{type = "FORMER districts", cat_as = "former districts"},
"subdistricts",
"townships",
"municipalities",
{type = "direct-administered municipalities", cat_as = "municipalities"},
}},
["Republik Rakyat China"] = {alias_of = "China", the = true}, -- differs in "the"
["Colombia"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}},
["Comoros"] = {the = true, container = "Afrika", divs = {"autonomous islands"}},
["Costa Rica"] = {container = "Amerika Tengah", divs = {"provinces", "cantons"}},
["Croatia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Cuba"] = {container = "Caribbean", divs = {"provinces", "municipalities"}},
["Cyprus"] = {container = {"Eropah", "Asia"}, divs = {"districts"}, british_spelling = true},
["Czech Republic"] = {the = true, container = "Eropah", divs = {"regions", "districts", "municipalities"}, british_spelling = true},
["Czechia"] = {alias_of = "Czech Republic"}, -- differs in "the"
["Democratic Republic of the Congo"] = {the = true, container = "Afrika", divs = {"provinces", "territories"}},
["Congo"] = {alias_of = "Democratic Republic of the Congo", display = true, the = true},
["Denmark"] = {container = "Eropah", divs = {"regions", "municipalities", "dependent territories"},
british_spelling = true,
-- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country)
},
["Djibouti"] = {container = "Afrika", divs = {"regions", "districts"}},
["Dominica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Dominican Republic"] = {the = true, container = "Caribbean", divs = {"provinces", "municipalities"},
keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"},
["East Timor"] = {container = "Asia", divs = {"municipalities"}, wp = "Timor-Leste"},
["Timor-Leste"] = {alias_of = "East Timor", display = true},
["Ecuador"] = {container = "Amerika Selatan", divs = {"provinces", "cantons"}},
["Mesir"] = {container = "Afrika", divs = {"kegabenoran", "kawasan"}, british_spelling = true},
["El Salvador"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Equatorial Guinea"] = {container = "Afrika", divs = {"provinces"}},
["Eritrea"] = {container = "Afrika", divs = {"regions", "subregions"}},
["Estonia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Eswatini"] = {container = "Afrika", british_spelling = true},
["Swaziland"] = {alias_of = "Eswatini", display = true},
["Ethiopia"] = {container = "Afrika", divs = {"regions", "zones"}},
["Federated States of Micronesia"] = {the = true, container = "Micronesia", divs = {"negeri"}},
["Micronesia"] = {alias_of = "Federated States of Micronesia"},
["Fiji"] = {container = "Melanesia", divs = {"divisions", "provinces"}, british_spelling = true},
["Finland"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true},
["France"] = {container = "Eropah", divs = {"regions", "cantons", "collectivities",
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
"dependent territories", "territories", "provinces",
}, british_spelling = true},
["Gabon"] = {container = "Afrika", divs = {"provinces", "departments"}},
["Gambia"] = {the = true, container = "Afrika", divs = {"divisions", "districts"}, british_spelling = true, wp = "The %l"},
["Georgia"] = {container = {"Eropah", "Asia"}, divs = {"regions", "districts"},
keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"},
["Germany"] = {container = "Eropah", divs = {
"negeri",
-- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but
-- there aren't really enough of them to categorize per state.
"regions",
"municipalities", "districts"}, british_spelling = true},
["Ghana"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Greece"] = {container = "Eropah", divs = {"regions", "regional units", "municipalities",
{type = "peripheries", cat_as = {"regions"}},
}, british_spelling = true},
["Grenada"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Guatemala"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Guinea"] = {container = "Afrika", divs = {"regions", "prefectures"}},
["Guinea-Bissau"] = {container = "Afrika", divs = {"regions"}},
["Guyana"] = {container = "Amerika Selatan", divs = {"regions"}, british_spelling = true},
["Haiti"] = {container = "Caribbean", divs = {"departments", "arrondissements"}},
["Honduras"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Hungary"] = {container = "Eropah", divs = {"counties", "districts"}, british_spelling = true},
["Iceland"] = {container = "Eropah", divs = {"regions", "municipalities", "counties"}, british_spelling = true},
["India"] = {container = "Asia", divs = {
{type = "negeri", cat_as = "states and union territories"},
{type = "union territories", cat_as = "states and union territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"},
{type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"},
"divisions", "districts", "municipalities",
}, british_spelling = true},
["Indonesia"] = {container = "Asia", divs = {"regencies", "provinces",
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"},
}},
["Iran"] = {container = "Asia", divs = {"provinces", "counties"}},
["Iraq"] = {container = "Asia", divs = {"governorates", "districts"}},
["Ireland"] = {container = "Eropah", addl_parents = {"British Isles"},
divs = {"counties", "districts", "provinces"}, british_spelling = true, wp = "Republic of %l"},
["Republic of Ireland"] = {alias_of = "Ireland", the = true}, -- differs in "the"
["Israel"] = {container = "Asia", divs = {"districts"}},
["Italy"] = {container = "Eropah", divs = {
"regions", "provinces", "metropolitan cities", "municipalities",
{type = "autonomous regions", cat_as = "regions"},
}, british_spelling = true},
["Ivory Coast"] = {container = "Afrika", divs = {"districts", "regions"}},
-- We should really be using Ivory Coast (common name) but there are political ramifications to the use of
-- Côte d'Ivoire so don't make it a display alias.
["Côte d'Ivoire"] = {alias_of = "Ivory Coast"},
["Jamaica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Jepun"] = {container = "Asia", divs = {"prefectures", "subprefectures", "municipalities"}},
["Jordan"] = {container = "Asia", divs = {"governorates"}},
["Kazakhstan"] = {container = {"Asia", "Eropah"}, divs = {"regions", "districts"}},
["Kenya"] = {container = "Afrika", divs = {"counties"}, british_spelling = true},
["Kiribati"] = {container = "Micronesia", british_spelling = true},
["Kosovo"] = {container = "Eropah", divs = {"districts", "municipalities"}, british_spelling = true},
["Kuwait"] = {container = "Asia", divs = {"governorates", "areas"}},
["Kyrgyzstan"] = {container = "Asia", divs = {"regions", "districts"}},
["Laos"] = {container = "Asia", divs = {"provinces", "districts"}},
["Latvia"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["Lubnan"] = {container = "Asia", divs = {"governorates", "districts"}},
["Lesotho"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Liberia"] = {container = "Afrika", divs = {"counties", "districts"}},
["Libya"] = {container = "Afrika", divs = {"districts", "municipalities"}},
["Liechtenstein"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["Lithuania"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true},
["Luxembourg"] = {container = "Eropah", divs = {"cantons", "districts"}, british_spelling = true},
["Madagascar"] = {container = "Afrika", divs = {"regions", "districts"}},
["Malawi"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Malaysia"] = {container = "Asia", divs = {"negeri", "wilayah persekutuan", "daerah"}, british_spelling = true},
["Maldives"] = {the = true, container = "Asia", divs = {"provinces", "administrative atolls"}, british_spelling = true},
["Mali"] = {container = "Afrika", divs = {"regions", "cercles"}},
["Malta"] = {container = "Eropah", divs = {"regions", "local councils"}, british_spelling = true},
["Kepulauan Marshall"] = {the = true, container = "Micronesia", divs = {"municipalities"}},
["Mauritania"] = {container = "Afrika", divs = {"regions", "departments"}},
["Mauritius"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Mexico"] = {container = "Amerika Utara", addl_parents = {"Amerika Tengah"}, divs = {
"negeri", "municipalities",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
["Moldova"] = {container = "Eropah", divs = {
{type = "districts", cat_as = "districts and autonomous territorial units"},
{type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"},
"communes", "municipalities",
}, british_spelling = true},
["Monaco"] = {placetype = {"city-state", "negara"}, container = "Eropah",
-- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we
-- want its parent to be "countries in Europe".
bare_category_parent_type = {type = "negara", prep = "di"},
is_city = true, british_spelling = true},
["Mongolia"] = {container = "Asia", divs = {"provinces", "districts"}},
["Montenegro"] = {container = "Eropah", divs = {"municipalities"}},
["Morocco"] = {container = "Afrika", divs = {"regions", "prefectures", "provinces"}},
["Mozambique"] = {container = "Afrika", divs = {"provinces", "districts"}},
["Myanmar"] = {container = "Asia",
divs = {"regions", "negeri", "union territories",
{type = "self-administered zones", cat_as = "self-administered areas"},
{type = "self-administered divisions", cat_as = "self-administered areas"},
"districts"}},
["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations
["Namibia"] = {container = "Afrika", divs = {"regions", "constituencies"}, british_spelling = true},
["Nauru"] = {container = "Micronesia", divs = {"districts"}, british_spelling = true},
["Nepal"] = {container = "Asia", divs = {"provinces", "districts"}},
["Netherlands"] = {the = true, placetype = {"negara", "constituent country"}, container = "Eropah",
divs = {"provinces", "municipalities",
{type = "FORMER municipalities", cat_as = "former municipalities"},
"dependent territories", "constituent countries"}, british_spelling = true,
-- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]]
-- (country)
},
["New Zealand"] = {container = "Polynesia", divs = {
"regions", "dependent territories", "territorial authorities",
{type = "districts", cat_as = "territorial authorities"},
},
british_spelling = true},
["Nicaragua"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}},
["Niger"] = {container = "Afrika", divs = {"regions", "departments"}},
["Nigeria"] = {container = "Afrika", divs = {
"negeri",
-- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize
-- everything under 'states and territories' but that seems a bit pointless.
{type = "wilayah persekutuan", cat_as = "negeri"},
"local government areas",
}, british_spelling = true},
["North Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties"}},
["North Macedonia"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true},
["Macedonia"] = {alias_of = "North Macedonia", display = true},
["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the"
["Norway"] = {container = "Eropah",
divs = {"counties", "municipalities", "dependent territories", "districts", "unincorporated areas"},
british_spelling = true},
["Oman"] = {container = "Asia", divs = {"governorates", "provinces"}},
["Pakistan"] = {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "administrative territories", cat_as = "provinces and territories"},
{type = "wilayah persekutuan", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
"divisions", "districts",
}, british_spelling = true},
["Palau"] = {container = "Micronesia", divs = {"negeri"}},
["Palestine"] = {container = "Asia", divs = {"governorates"}},
["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the"
["Panama"] = {container = "Amerika Tengah", divs = {"provinces", "districts"}},
["Papua New Guinea"] = {container = "Melanesia", divs = {"provinces", "districts"}, british_spelling = true},
["Paraguay"] = {container = "Amerika Selatan", divs = {"departments", "districts"}},
["Peru"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "districts"}},
["Filipina"] = {the = true, container = "Asia", divs = {"kawasan", "wilayah", "daerah", "perbandaran", "barangay"}},
["Poland"] = {divs = {"voivodeships", "counties",
{type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}},
}, container = "Eropah", british_spelling = true},
["Portugal"] = {container = "Eropah", divs = {
{type = "autonomous regions", cat_as = "districts and autonomous regions"},
{type = "districts", cat_as = "districts and autonomous regions"},
"provinces", "municipalities"}, british_spelling = true},
["Qatar"] = {container = "Asia", divs = {"municipalities", "zones"}},
["Republic of the Congo"] = {the = true, container = "Afrika", divs = {"departments", "districts"}},
["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true},
["Romania"] = {container = "Eropah", divs = {
"regions", "counties", "communes",
{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
}, british_spelling = true},
["Rusia"] = {container = {"Eropah", "Asia"}, divs = {
"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities",
"districts", "federal districts"},
british_spelling = true},
["Rwanda"] = {container = "Afrika", divs = {"provinces", "districts"}},
["Saint Kitts and Nevis"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Saint Lucia"] = {container = "Caribbean", divs = {"districts"}, british_spelling = true},
["Saint Vincent and the Grenadines"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
["Samoa"] = {container = "Polynesia", divs = {"districts"}, british_spelling = true},
["San Marino"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
["São Tomé and Príncipe"] = {container = "Afrika", divs = {"districts"}},
["Arab Saudi"] = {container = "Asia", divs = {"wilayah", "kegabenoran"}},
["Senegal"] = {container = "Afrika", divs = {"regions", "departments"}},
["Serbia"] = {container = "Eropah", divs = {"districts", "municipalities", "autonomous provinces"}},
["Seychelles"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
["Sierra Leone"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
["Singapura"] = {container = "Asia", divs = {"daerah", "kawasan"}, british_spelling = true},
["Slovakia"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true},
["Slovenia"] = {container = "Eropah", divs = {"statistical regions", "municipalities"}, british_spelling = true},
-- Note: the official name does not include "the" at the beginning, but it sounds strange in
-- English to leave it out and it's commonly included, so we include it.
["Solomon Islands"] = {the = true, container = "Melanesia", divs = {"provinces"}, british_spelling = true},
["Somalia"] = {container = "Afrika", divs = {"regions", "districts"}},
["South Africa"] = {container = "Afrika", divs = {
"provinces",
"districts",
{type = "district municipalities", cat_as = "districts"},
{type = "metropolitan municipalities", cat_as = "districts"},
"municipalities",
}, british_spelling = true},
["Korea Selatan"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties", "districts"}},
["South Sudan"] = {container = "Afrika", divs = {"regions", "negeri", "counties"}, british_spelling = true},
["Sepanyol"] = {container = "Eropah", divs = {"autonomous communities", "provinces", "municipalities",
"comarcas", "autonomous cities"},
british_spelling = true},
["Sri Lanka"] = {container = "Asia", divs = {"provinces", "districts"}, british_spelling = true},
["Sudan"] = {container = "Afrika", divs = {"negeri", "districts"}, british_spelling = true},
["Suriname"] = {container = "Amerika Selatan", divs = {"districts"}},
["Sweden"] = {container = "Eropah", divs = {"provinces", "counties", "municipalities"}, british_spelling = true},
["Switzerland"] = {container = "Eropah", divs = {"cantons", "municipalities", "districts"}, british_spelling = true},
["Syria"] = {container = "Asia", divs = {"governorates", "districts"}},
["Taiwan"] = {container = "Asia", divs = {"counties", "districts", "townships", "special municipalities"}},
["Republik China"] = {alias_of = "Taiwan", the = true}, -- differs in "the", different political connotations
["Tajikistan"] = {container = "Asia", divs = {"regions", "districts"}},
["Tanzania"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
["Thailand"] = {container = "Asia", divs = {"wilayah", "daerah", "subdaerah"}},
["Togo"] = {container = "Afrika", divs = {"provinces", "prefectures"}},
["Tonga"] = {container = "Polynesia", divs = {"divisions"}, british_spelling = true},
["Trinidad dan Tobago"] = {container = "Caribbean", divs = {"regions", "municipalities"}, british_spelling = true},
["Tunisia"] = {container = "Afrika", divs = {"governorates", "delegations"}},
["Turki"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts"}},
-- Foreign names generally get display-canonicalized.
["Türkiye"] = {alias_of = "Turki", display = true},
["Turkmenistan"] = {container = "Asia", divs = {
-- The 5 regions are often also called provinces
"regions", {type = "provinces", cat_as = "regions"}, "districts"},
},
["Tuvalu"] = {container = "Polynesia", divs = {"atolls"}, british_spelling = true},
["Uganda"] = {container = "Afrika", divs = {"districts", "counties"}, british_spelling = true},
["Ukraine"] = {container = "Eropah", divs = {
{type = "oblasts", cat_as = "oblasts and autonomous republics"},
{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
"raions", "hromadas",
}, british_spelling = true},
["United Arab Emirates"] = {the = true, container = "Asia", divs = {"emirates"}},
-- Abbreviations get display-canonicalized.
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true},
["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true},
["United Kingdom"] = {the = true, container = "Eropah", addl_parents = {"British Isles"},
divs = {"constituent countries", "counties", "districts", "boroughs", "territories", "dependent territories",
"traditional counties"},
keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true},
-- Abbreviations get display-canonicalized.
["UK"] = {alias_of = "United Kingdom", display = true, the = true},
["U.K."] = {alias_of = "United Kingdom", display = true, the = true},
["Amerika Syarikat"] = {the = true, container = "Amerika Utara",
divs = {"counties", "county seats", "negeri", "territories", "dependent territories",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
{type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"},
{type = "NICKNAME_FOR states", cat_as = "nicknames for states"},
{type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"},
{type = "boroughs", prep = "di"}, -- exist in Pennsylvania and New Jersey
"municipalities", -- these exist politically at least in Colorado and Connecticut
{type = "census-designated places", prep = "di"},
{type = "unincorporated communities", prep = "di"},
-- Don't change the following to something more politically correct until/unless the US government makes a
-- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at
-- [[w:Indian reservations]]).
"Indian reservations",
}},
-- Abbreviations and long forms (when possible) get display-canonicalized.
["US"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["U.S."] = {alias_of = "Amerika Syarikat", display = true, the = true},
["USA"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["U.S.A."] = {alias_of = "Amerika Syarikat", display = true, the = true},
["United States of America"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["United States"] = {alias_of = "Amerika Syarikat", display = true, the = true},
["Uruguay"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}},
["Uzbekistan"] = {container = "Asia", divs = {"regions", "districts"}},
["Vanuatu"] = {container = "Melanesia", divs = {"provinces"}, british_spelling = true},
["Vatican City"] = {placetype = {"city-state", "negara"}, container = "Eropah",
-- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state,
-- but we want its parent to be "countries in Europe".
bare_category_parent_type = {type = "negara", prep = "di"},
addl_parents = {"Rome"}, is_city = true, british_spelling = true},
["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the"
["Venezuela"] = {container = "Amerika Selatan", divs = {"negeri", "municipalities"}},
["Vietnam"] = {container = "Asia", divs = {"wilayah", "daerah", "perbandaran"}},
["Western Sahara"] = {placetype = {"territory", "negara"}, container = "Afrika",
bare_category_parent_type = {type = "negara", prep = "di"},
},
-- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara
["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true},
["Yemen"] = {container = "Asia", divs = {"governorates", "districts"}},
["Zambia"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
["Zimbabwe"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true},
}
local function canonicalize_continent_container(key)
if type(key) ~= "string" then
return key
end
if export.continents[key] then
return {key = key, placetype = export.continents[key].placetype}
end
internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end
export.countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"+++", "negara"},
default_placetype = "negara",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.countries,
}
-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
-- British Overseas Territory
["Akrotiri and Dhekelia"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Cyprus", "Eropah", "Asia"},
british_spelling = true,
},
-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States
["American Samoa"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
["Anguilla"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["Abkhazia"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Georgia", "Eropah", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- Australian external territory
["Ashmore and Cartier Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
},
-- constituent country of the Netherlands
["Aruba"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
["Bermuda"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Amerika Utara"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Bonaire"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- British Overseas Territory
["British Indian Ocean Territory"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Asia"},
british_spelling = true,
},
-- British Overseas Territory
["British Virgin Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Norwegian dependent territory
["Bouvet Island"] = {
placetype = {"dependent territory", "territory"},
container = "Norway",
addl_parents = {"Afrika"},
british_spelling = true,
},
-- British Overseas Territory
["Cayman Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Australian external territory
["Christmas Island"] = {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
british_spelling = true,
},
-- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the
-- French Southern and Antarctic Lands.
["Clipperton Island"] = {
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"Amerika Utara"},
},
-- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands
["Cocos Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
wp = "Cocos (Keeling) Islands",
british_spelling = true,
},
["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true},
-- self-governing but in free association with New Zealand
["Cook Islands"] = {
the = true,
placetype = {"negara"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- constituent country of the Netherlands
["Curaçao"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special territory of Chile
["Easter Island"] = {
placetype = {"special territory", "territory"},
container = "Chile",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
["Falkland Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Amerika Selatan"},
british_spelling = true,
},
-- autonomous territory of Denmark
["Faroe Islands"] = {
the = true,
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- overseas department and region of France
["French Guiana"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Amerika Selatan"},
british_spelling = true,
},
-- overseas collectivity of France
["French Polynesia"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- French overseas territory
["French Southern and Antarctic Lands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"Afrika"},
},
-- British Overseas Territory
["Gibraltar"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Eropah"},
is_city = true,
british_spelling = true,
},
-- autonomous territory of Denmark
["Greenland"] = {
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"Amerika Utara"},
divs = {"municipalities"},
british_spelling = true,
},
-- overseas department and region of France
["Guadeloupe"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
addl_parents = {"Caribbean"},
divs = {"communes"},
british_spelling = true,
},
-- unincorporated territory of the United States
["Guam"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- self-governing British Crown dependency; technically called the Bailiwick of Guernsey
["Guernsey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
wp = "Bailiwick of %l",
},
["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true},
-- Australian external territory
["Heard Island and McDonald Islands"] = {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Afrika"},
},
-- special administrative region of China
["Hong Kong"] = {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- self-governing British Crown dependency
["Isle of Man"] = {
the = true,
placetype = {"crown dependency", "dependency", "dependent territory", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
},
-- Norwegian unincorporated area
["Jan Mayen"] = {
placetype = {"unincorporated area", "dependent territory", "territory", "island"},
container = "Norway",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- self-governing British Crown dependency; technically called the Bailiwick of Jersey
["Jersey"] = {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Eropah"},
british_spelling = true,
},
["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true},
-- special administrative region of China
["Macau"] = {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- overseas department and region of France
["Martinique"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas department and region of France
["Mayotte"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Afrika"},
british_spelling = true,
},
-- British Overseas Territory
["Montserrat"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special collectivity of France
["New Caledonia"] = {
placetype = {"special collectivity", "collectivity"},
container = "France",
addl_parents = {"Melanesia"},
british_spelling = true,
},
-- dependent territory of New Zealand
["New Zealand Subantarctic Islands"] = {
the = true,
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Antartika"},
british_spelling = true,
},
-- self-governing but in free association with New Zealand
["Niue"] = {
placetype = {"negara"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- Australian external territory
["Norfolk Island"] = {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Cyprus
["Northern Cyprus"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Cyprus", "Turki", "Eropah", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]",
british_spelling = true,
},
-- commonwealth, unincorporated territory of the United States
["Northern Mariana Islands"] = {
the = true,
placetype = {"commonwealth", "unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- British Overseas Territory
["Pitcairn Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- commonwealth of the United States
["Puerto Rico"] = {
placetype = {"commonwealth", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Caribbean"},
divs = {"municipalities"},
},
-- overseas department and region of France
["Réunion"] = {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Afrika"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Saba"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- overseas collectivity of France
["Saint Barthélemy"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
["Saint Helena, Ascension and Tristan da Cunha"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
divs = {{type = "constituent parts", container_parent_type = false}},
addl_parents = {"Atlantic Ocean", "Afrika"},
british_spelling = true,
},
-- constituent parts of the combined overseas territory
["Ascension Island"] = {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Saint Helena"] = {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
["Tristan da Cunha"] = {
placetype = {"constituent part", "territory", "archipelago"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
-- overseas collectivity of France
["Saint Martin"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas collectivity of France
["Saint Pierre and Miquelon"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
divs = {"communes"},
addl_parents = {"Amerika Utara"},
british_spelling = true,
},
-- special municipality of the Netherlands
["Sint Eustatius"] = {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- constituent country of the Netherlands
["Sint Maarten"] = {
placetype = {"constituent country", "negara"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Somalia
["Somaliland"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Somalia", "Afrika"},
keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]",
british_spelling = true,
},
-- British Overseas Territory
-- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for
-- "Saint Helena, Ascension and Tristan da Cunha".
["South Georgia"] = {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
["South Ossetia"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Georgia", "Eropah", "Asia"},
keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]",
british_spelling = true,
},
-- British Overseas Territory
["South Sandwich Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
wp = true,
wpcat = "South Georgia and the South Sandwich Islands",
british_spelling = true,
},
-- Norwegian unincorporated area
["Svalbard"] = {
placetype = {"unincorporated area", "dependent territory", "territory", "archipelago"},
container = "Norway",
addl_parents = {"Eropah"},
british_spelling = true,
},
-- dependent territory of New Zealand
["Tokelau"] = {
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Moldova
["Transnistria"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Moldova", "Eropah"},
keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]",
british_spelling = true,
},
-- British Overseas Territory
["Turks and Caicos Islands"] = {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- unincorporated territory of the United States
["United States Minor Outlying Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Islands", "Micronesia", "Polynesia", "Caribbean"},
},
-- FIXME: We should add entries for the other minor outlying islands.
-- Baker Island (Oceania)
-- Howland Island (Oceania)
-- Jarvis Island (Oceania)
-- Johnston Atoll (Oceania)
-- Kingman Reef (Oceania)
-- Midway Atoll (Oceania)
-- Navassa Island (Caribbean)
-- Palmyra Atoll (Oceania)
-- Wake Island (Oceania); already has an entry, see below
["Wake Island"] = {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Micronesia"},
},
-- unincorporated territory of the United States
["United States Virgin Islands"] = {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "Amerika Syarikat",
addl_parents = {"Caribbean"},
},
["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true},
-- overseas collectivity of France
["Wallis and Futuna"] = {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
}
export.country_like_entities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Saint Helena, Ascension and Tristan da Cunha".
key_to_placename = false,
placename_to_key = false,
canonicalize_key_container = make_canonicalize_key_container(nil, "negara"),
default_overriding_bare_label_parents = {"country-like entities"},
default_no_container_cat = true,
default_no_container_parent = true,
-- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas
-- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village
-- in Europe.
default_no_auto_augment_container = true,
data = export.country_like_entities,
}
-- Former countries and such; we don't create "Cities in ..." categories for them because these countries no longer exist.
export.former_countries = {
-- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan
-- (also known as Nagorno-Karabakh)
-- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out.
["Artsakh"] = {
placetype = {"unrecognized country", "negara"},
addl_parents = {"Azerbaijan", "Eropah", "Asia"},
keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]",
british_spelling = true,
},
["Nagorno-Karabakh"] = {alias_of = "Artsakh"},
["Czechoslovakia"] = {container = "Eropah", british_spelling = true},
["East Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true},
["North Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}},
["Persia"] = {placetype = {"empire", "negara"}, container = "Asia", divs = {"provinces"}},
["Byzantine Empire"] = {
the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"},
addl_parents = {"Ancient Europe", "Ancient Near East"},
divs = {
"provinces", "themes",
}},
["Roman Empire"] = {
the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Rome"},
divs = {
"provinces",
{type = "FORMER provinces", cat_as = "provinces"},
}},
["South Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}},
["Soviet Union"] = {
the = true, container = {"Eropah", "Asia"}, divs = {"republics", "autonomous republics"},
british_spelling = true},
["West Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true},
["Yugoslavia"] = {container = "Eropah", divs = {"districts"},
keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true},
}
export.former_countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"former countries and country-like entities"},
default_is_former_place = true,
default_placetype = "negara",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.former_countries,
}
-----------------------------------------------------------------------------------
-- Subpolity tables --
-----------------------------------------------------------------------------------
export.australia_states_and_territories = {
["Australian Capital Territory, Australia"] = {the = true, placetype = "territory"},
["Jervis Bay Territory, Australia"] = {the = true, placetype = "territory"},
["New South Wales, Australia"] = {},
["Northern Territory, Australia"] = {the = true, placetype = "territory"},
["Queensland, Australia"] = {},
["South Australia, Australia"] = {},
["Tasmania, Australia"] = {},
["Victoria, Australia"] = {},
["Western Australia, Australia"] = {},
}
-- states and territories of Australia
export.australia_group = {
default_container = "Australia",
default_placetype = "negeri",
default_divs = "local government areas",
data = export.australia_states_and_territories,
}
export.austria_states = {
["Vienna, Austria"] = {},
["Lower Austria, Austria"] = {},
["Upper Austria, Austria"] = {},
["Styria, Austria"] = {},
["Tyrol, Austria"] = {wp = "Tyrol (state)"},
["Carinthia, Austria"] = {},
["Salzburg, Austria"] = {wp = "Salzburg (state)"},
["Vorarlberg, Austria"] = {},
["Burgenland, Austria"] = {},
}
-- states of Austria
export.austria_group = {
default_container = "Austria",
default_placetype = "negeri",
default_divs = "municipalities",
data = export.austria_states,
}
export.bangladesh_divisions = {
["Barisal Division, Bangladesh"] = {},
["Chittagong Division, Bangladesh"] = {},
["Dhaka Division, Bangladesh"] = {},
["Khulna Division, Bangladesh"] = {},
["Mymensingh Division, Bangladesh"] = {},
["Rajshahi Division, Bangladesh"] = {},
["Rangpur Division, Bangladesh"] = {},
["Sylhet Division, Bangladesh"] = {},
}
-- divisions of Bangladesh
export.bangladesh_group = {
key_to_placename = make_key_to_placename(", Bangladesh$", " Division$"),
placename_to_key = make_placename_to_key(", Bangladesh", " Division"),
default_container = "Bangladesh",
default_placetype = "division",
default_divs = "districts",
data = export.bangladesh_divisions,
}
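-- Illustration only: a minimal sketch of the key-to-placename mapping configured above, assuming that
-- make_key_to_placename() first strips the container suffix and then the division-type suffix. The helper name
-- `strip_suffixes` is hypothetical and is not part of this module.
--
--   local function strip_suffixes(key, container_pattern, divtype_pattern)
--       local full = key:gsub(container_pattern, "")      -- e.g. "Dhaka Division"
--       local elliptical = full:gsub(divtype_pattern, "") -- e.g. "Dhaka"
--       return full, elliptical
--   end
--   strip_suffixes("Dhaka Division, Bangladesh", ", Bangladesh$", " Division$")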
export.brazil_states = {
["Acre, Brazil"] = {wp = "%l (state)"},
["Alagoas, Brazil"] = {},
["Amapá, Brazil"] = {},
["Amazonas, Brazil"] = {wp = "%l (Brazilian state)"},
["Bahia, Brazil"] = {},
["Ceará, Brazil"] = {},
["Distrito Federal, Brazil"] = {wp = "Federal District (Brazil)"},
["Espírito Santo, Brazil"] = {},
["Goiás, Brazil"] = {},
["Maranhão, Brazil"] = {},
["Mato Grosso, Brazil"] = {},
["Mato Grosso do Sul, Brazil"] = {},
["Minas Gerais, Brazil"] = {},
["Pará, Brazil"] = {},
["Paraíba, Brazil"] = {},
["Paraná, Brazil"] = {wp = "%l (state)"},
["Pernambuco, Brazil"] = {},
["Piauí, Brazil"] = {},
["Rio de Janeiro, Brazil"] = {wp = "%l (state)"},
["Rio Grande do Norte, Brazil"] = {},
["Rio Grande do Sul, Brazil"] = {},
["Rondônia, Brazil"] = {},
["Roraima, Brazil"] = {},
["Santa Catarina, Brazil"] = {wp = "%l (state)"},
["São Paulo, Brazil"] = {wp = "%l (state)"},
["Sergipe, Brazil"] = {},
["Tocantins, Brazil"] = {},
}
-- states of Brazil
export.brazil_group = {
default_container = "Brazil",
default_placetype = "negeri",
default_divs = "municipalities",
data = export.brazil_states,
}
export.canada_provinces_and_territories = {
["Alberta, Canada"] = {divs = {
{type = "municipal districts", container_parent_type = "rural municipalities"},
}},
["British Columbia, Canada"] = {divs = {
{type = "regional districts", container_parent_type = false},
"regional municipalities",
}},
["Manitoba, Canada"] = {divs = {"rural municipalities"}},
["New Brunswick, Canada"] = {divs = {"counties", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
["Newfoundland and Labrador, Canada"] = {},
["Northwest Territories, Canada"] = {the = true, placetype = "territory"},
["Nova Scotia, Canada"] = {divs = {"counties", "regional municipalities"}},
["Nunavut, Canada"] = {placetype = "territory"},
["Ontario, Canada"] = {divs = {"counties", "regional municipalities", {type = "townships", prep = "di"}}},
["Prince Edward Island, Canada"] = {divs = {"counties", "parishes", "rural municipalities"}},
["Saskatchewan, Canada"] = {divs = {"rural municipalities"}},
["Quebec, Canada"] = {divs = {
"counties",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
-- administrative regions have an official (but non-governmental) function but there don't appear to be any
-- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping
{type = "regions", container_parent_type = false},
{type = "townships", prep = "di"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}},
}},
["Yukon, Canada"] = {placetype = "territory"},
["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true},
}
-- provinces and territories of Canada
export.canada_group = {
default_container = "Canada",
default_placetype = "province",
data = export.canada_provinces_and_territories,
}
export.china_provinces_and_autonomous_regions = {
-- direct-administered municipalities are not here but below under prefecture-level cities
["Anhui, China"] = {},
["Fujian, China"] = {},
["Fuchien, China"] = {alias_of = "Fujian, China", display = true},
["Gansu, China"] = {},
["Guangdong, China"] = {},
["Guangxi, China"] = {placetype = "autonomous region"},
["Guizhou, China"] = {},
["Hainan, China"] = {},
["Hebei, China"] = {},
["Heilongjiang, China"] = {},
["Henan, China"] = {},
["Hubei, China"] = {},
["Hunan, China"] = {},
["Inner Mongolia, China"] = {placetype = "autonomous region"},
["Jiangsu, China"] = {},
["Jiangxi, China"] = {},
["Jilin, China"] = {},
["Liaoning, China"] = {},
["Ningxia, China"] = {placetype = "autonomous region"},
["Qinghai, China"] = {},
["Shaanxi, China"] = {},
["Shandong, China"] = {},
["Shanxi, China"] = {},
["Sichuan, China"] = {},
["Tibet, China"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
["Xinjiang, China"] = {placetype = "autonomous region"},
["Yunnan, China"] = {},
["Zhejiang, China"] = {},
}
-- provinces and autonomous regions of China
export.china_group = {
default_container = "China",
default_placetype = "province",
default_divs = {
"prefectures", "prefecture-level cities",
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_provinces_and_autonomous_regions,
}
export.china_prefecture_level_cities = {
-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
-- administrative unit smaller than a province but bigger than a county and is administratively controlled by
-- the chief city of the prefecture (which bears the same name as the prefecture) in a unified government. Prior
-- to the mid-1980s, in fact, prefecture-level cities *were* prefectures, and a few prefectures (especially in the
-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, which again bears
-- the same name as the county-level city).
--
-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
-- province level.
--
-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
-- metro area separated by suburban/exurban or rural land.
--
-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
-- a lot of obscure cities.
--
-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
-- was >= 1 million. These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
-- "administrative area (urban population)" type). This value sometimes groups nearby cities into a single
-- agglomeration, most notably the Pearl River Delta, which is grouped under Guangzhou with an agglomeration
-- population of 72,700,000 and includes a large number of nearby large cities (although for some reason not
-- Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de includes divisions
-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
-- subdistricts, towns and townships.
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
["Shanghai"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de
["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: Not to be confused with Cangzhou in Hebei
["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants
["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
["Beijing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de
["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de
["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de
["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de
["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration
["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de
["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration
["Chongqing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de
["Wuhan"] = {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de
["Tianjin"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de
["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de
-- Changsha County -- 1.024 urban per citypopulation.de
["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration
["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de
["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de
["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de
["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration
["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de
["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de
["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de
["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de
["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
-- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria
["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de
-- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core).
["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration
["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de
-- includes Láiwú city
["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de
-- includes Xīnjí city
["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de
["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de
["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de
["Nanning"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de
["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de
["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
["Ürümqi"] = {container = {key = "Xinjiang, China", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
["Urumqi"] = {alias_of = "Ürümqi", display = true},
["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
["Hohhot"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de
["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de
["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de
["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de
["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de
["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de
["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de
["Taizhou"] = {alias_of = "Taizhou, Zhejiang"},
["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de
["Yinchuan"] = {container = {key = "Ningxia, China", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de
["Liuzhou"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de
["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de
["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de
["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de
["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de
-- includes Dìngzhōu city and Xióngān Xīnqū
["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de
["Baotou"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de
["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de
["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de
["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de
["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de
["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de
["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de
["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de
["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de
["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de
["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de
["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de
["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de
["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de
["Guilin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de
["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de
["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de
["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de
["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de
["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de
["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de
["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de
["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de
["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de
["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de
["Jilin"] = {alias_of = "Jilin City"},
["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de
["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de
["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de
["Yulin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de
["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de
["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de
-- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash
["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de
["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de
["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de
["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de
["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de
["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de
["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de
["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de
["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de
["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de
["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de
["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de
["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de
["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de
["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de
["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de
["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de
["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de
["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de
["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de
["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de
["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de
["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de
["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de
["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de
-- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper.
["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"subdistricts", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de
["Ulanhad"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de
["Chifeng"] = {alias_of = "Ulanhad"},
["Ulankhad"] = {alias_of = "Ulanhad", display = true},
["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de
["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de
["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de
["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de
-- Shuyang is a "county" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core).
-- The county itself is 37 miles by 34 miles.
["Shuyang"] = {placetype = "county", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de
-- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core).
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de
["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de
["Beihai"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de
["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de
["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de
["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de
["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de
["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de
["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de
["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de
["Guigang"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de
-- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core).
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de
-- NOTE: Not to be confused with Changzhou in Jiangsu
["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de
["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de
["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de
["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de
["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de
-- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core).
["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de
-- Three extra cities that were added in earlier incarnations and aren't found in the "major agglomerations of the world" page (https://citypopulation.de/en/world/agglomerations/, reference date 2025-01-01)
["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de
["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de
["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de
}
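--[==[
Illustration only. Several entries above (e.g. "Urumqi", "Chifeng", "Ulankhad") carry `alias_of` and point at
another key in the same table. The sketch below shows one way such links could be followed to reach the canonical
entry. `resolve_alias` is a hypothetical helper, not a function of this module; the meaning of `display = true` is
not shown in this section, so the sketch ignores it.

```lua
-- Hypothetical helper: follow `alias_of` links until a non-alias entry is reached.
local function resolve_alias(data, key)
	local seen = {}
	local spec = data[key]
	while spec and spec.alias_of do
		-- Guard against a chain of aliases that loops back on itself.
		if seen[key] then
			error("alias loop at " .. key)
		end
		seen[key] = true
		key = spec.alias_of
		spec = data[key]
	end
	return key, spec
end

-- Trimmed copy of three entries from the table above.
local sample = {
	["Ulanhad"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}},
	["Chifeng"] = {alias_of = "Ulanhad"},
	["Ulankhad"] = {alias_of = "Ulanhad", display = true},
}

local key, spec = resolve_alias(sample, "Chifeng")
```

Here `key` is the canonical key "Ulanhad" and `spec` is its entry; an unknown key comes back unchanged with a nil
spec.
]==]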
export.china_prefecture_level_cities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Zhejiang" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities,
}
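--[==[
Illustration only. Most city entries above give their container as a bare name ("Guangdong"), while the province
table keys are of the form "Guangdong, China"; autonomous regions are instead spelled out as
`{key = ..., placetype = ...}`. The sketch below shows one way a canonicalizer could bridge that gap.
`sketch_canonicalize_key_container` is a hypothetical stand-in written for this example; the real
`make_canonicalize_key_container` is defined elsewhere in the module and may behave differently.

```lua
-- Hypothetical stand-in: build a function that expands a bare container name into a full spec.
local function sketch_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			-- Bare names get the suffix appended and the default placetype attached.
			return {key = container .. suffix, placetype = placetype}
		end
		-- Tables already carry an explicit key and placetype, so pass them through.
		return container
	end
end

local canonicalize = sketch_canonicalize_key_container(", China", "province")
local guangdong = canonicalize("Guangdong")
local guangxi = canonicalize({key = "Guangxi, China", placetype = "autonomous region"})
```

Under these assumptions `guangdong.key` matches the key "Guangdong, China" in the province table, and `guangxi`
keeps its explicit "autonomous region" placetype.
]==]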
-- Needed to avoid key clashes, as there are two cities called Taizhou and two called Suzhou.
export.china_prefecture_level_cities_2 = {
-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
-- Hopefully this will work even though the bare key "Suzhou" is also used for the larger, better-known Suzhou in Jiangsu.
["Suzhou"] = {alias_of = "Suzhou, Anhui"},
}
export.china_prefecture_level_cities_group_2 = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Jiangsu".
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities_2,
}
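--[==[
Illustration only. A single Lua table cannot hold two values under the key "Taizhou", which is why the second
Taizhou and Suzhou live in a table of their own. The sketch below shows one way a lookup across both tables could
collect every candidate for an ambiguous name. `find_all` and the two trimmed groups are hypothetical; how the
module actually consults its groups is not shown in this section.

```lua
-- Hypothetical lookup: collect the canonical key from every group that knows the given key.
local function find_all(groups, key)
	local matches = {}
	for _, group in ipairs(groups) do
		local spec = group.data[key]
		if spec then
			-- Report the alias target when the entry is an alias, else the key itself.
			table.insert(matches, spec.alias_of or key)
		end
	end
	return matches
end

-- Trimmed copies of the Taizhou entries from the two tables.
local group_1 = {data = {
	["Taizhou, Zhejiang"] = {container = "Zhejiang"},
	["Taizhou"] = {alias_of = "Taizhou, Zhejiang"},
}}
local group_2 = {data = {
	["Taizhou, Jiangsu"] = {container = "Jiangsu"},
	["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
}}

local matches = find_all({group_1, group_2}, "Taizhou")
```

With the groups searched in this order, the Zhejiang city is found first and the Jiangsu city second.
]==]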
export.finland_regions = {
["Lapland, Finland"] = {wp = "%l (%c)"},
["North Ostrobothnia, Finland"] = {},
["Northern Ostrobothnia, Finland"] = {alias_of = "North Ostrobothnia, Finland", display = true},
["Kainuu, Finland"] = {},
["North Karelia, Finland"] = {},
["Northern Savonia, Finland"] = {},
["North Savo, Finland"] = {alias_of = "Northern Savonia, Finland", display = true},
["Southern Savonia, Finland"] = {},
["South Savo, Finland"] = {alias_of = "Southern Savonia, Finland", display = true},
["South Karelia, Finland"] = {},
["Central Finland, Finland"] = {},
["South Ostrobothnia, Finland"] = {},
["Southern Ostrobothnia, Finland"] = {alias_of = "South Ostrobothnia, Finland", display = true},
["Ostrobothnia, Finland"] = {wp = "%l (region)"},
["Central Ostrobothnia, Finland"] = {},
["Pirkanmaa, Finland"] = {},
["Satakunta, Finland"] = {},
["Päijänne Tavastia, Finland"] = {},
["Päijät-Häme, Finland"] = {alias_of = "Päijänne Tavastia, Finland", display = true},
["Tavastia Proper, Finland"] = {},
["Kanta-Häme, Finland"] = {alias_of = "Tavastia Proper, Finland", display = true},
["Kymenlaakso, Finland"] = {},
["Uusimaa, Finland"] = {},
["Southwest Finland, Finland"] = {},
["Åland Islands, Finland"] = {the = true, wp = "Åland"},
["Åland, Finland"] = {alias_of = "Åland Islands, Finland"}, -- differs in "the"
}
-- regions of Finland
export.finland_group = {
default_container = "Finland",
default_placetype = "region",
default_divs = "municipalities",
data = export.finland_regions,
}
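--[==[
Illustration only. Some entries above carry a `wp` template such as "%l (%c)" or "%l (region)". The sketch below
assumes that `%l` stands for the place name and `%c` for the container name; the entries suggest this reading, but
this section does not state it. `expand_wp` is a hypothetical helper, not a function of this module.

```lua
-- Hypothetical helper: substitute %l and %c in a `wp` template.
local function expand_wp(template, placename, container)
	-- The replacement table is indexed by the captured letter; the outer parentheses drop gsub's match count.
	return (template:gsub("%%([lc])", {l = placename, c = container}))
end

local lapland = expand_wp("%l (%c)", "Lapland", "Finland")
local ostrobothnia = expand_wp("%l (region)", "Ostrobothnia", "Finland")
```

Under that assumption the two templates expand to "Lapland (Finland)" and "Ostrobothnia (region)".
]==]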
export.france_administrative_regions = {
["Auvergne-Rhône-Alpes, France"] = {},
["Bourgogne-Franche-Comté, France"] = {},
["Brittany, France"] = {wp = "%l (administrative region)"},
["Centre-Val de Loire, France"] = {},
["Corsica, France"] = {},
-- overseas departments are handled in `export.country_like_entities`
-- ["French Guiana"] = {},
["Grand Est, France"] = {},
-- ["Guadeloupe"] = {},
["Hauts-de-France, France"] = {},
["Île-de-France, France"] = {},
-- ["Martinique"] = {},
-- ["Mayotte"] = {},
["Normandy, France"] = {wp = "%l (administrative region)"},
["Nouvelle-Aquitaine, France"] = {},
["Occitania, France"] = {wp = "%l (administrative region)"},
["Occitanie, France"] = {alias_of = "Occitania, France", display = true},
["Pays de la Loire, France"] = {},
["Provence-Alpes-Côte d'Azur, France"] = {},
-- ["Réunion"] = {},
}
-- administrative regions of France
export.france_group = {
default_container = "France",
-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
-- to 'region').
default_placetype = "region",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
},
data = export.france_administrative_regions,
}
export.france_departments = {
["Ain, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 01
["Aisne, France"] = {container = "Hauts-de-France"}, -- 02
["Allier, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 03
["Alpes-de-Haute-Provence, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04
["Hautes-Alpes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05
["Alpes-Maritimes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06
["Ardèche, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 07
["Ardennes, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 08
["Ariège, France"] = {container = "Occitania", wp = "%l (department)"}, -- 09
["Aube, France"] = {container = "Grand Est"}, -- 10
["Aude, France"] = {container = "Occitania"}, -- 11
["Aveyron, France"] = {container = "Occitania"}, -- 12
["Bouches-du-Rhône, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13
["Calvados, France"] = {container = "Normandy", wp = "%l (department)"}, -- 14
["Cantal, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 15
["Charente, France"] = {container = "Nouvelle-Aquitaine"}, -- 16
["Charente-Maritime, France"] = {container = "Nouvelle-Aquitaine"}, -- 17
["Cher, France"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
["Corrèze, France"] = {container = "Nouvelle-Aquitaine"}, -- 19
["Corse-du-Sud, France"] = {container = "Corsica"}, -- 2A
["Haute-Corse, France"] = {container = "Corsica"}, -- 2B
["Côte-d'Or, France"] = {container = "Bourgogne-Franche-Comté"}, -- 21
["Côte d'Or, France"] = {alias_of = "Côte-d'Or, France", display = true},
["Côtes-d'Armor, France"] = {container = "Brittany"}, -- 22
["Côtes d'Armor, France"] = {alias_of = "Côtes-d'Armor, France", display = true},
["Creuse, France"] = {container = "Nouvelle-Aquitaine"}, -- 23
["Dordogne, France"] = {container = "Nouvelle-Aquitaine"}, -- 24
["Doubs, France"] = {container = "Bourgogne-Franche-Comté"}, -- 25
["Drôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 26
["Eure, France"] = {container = "Normandy"}, -- 27
["Eure-et-Loir, France"] = {container = "Centre-Val de Loire"}, -- 28
["Finistère, France"] = {container = "Brittany"}, -- 29
["Gard, France"] = {container = "Occitania"}, -- 30
["Haute-Garonne, France"] = {container = "Occitania"}, -- 31
["Gers, France"] = {container = "Occitania"}, -- 32
["Gironde, France"] = {container = "Nouvelle-Aquitaine"}, -- 33
["Hérault, France"] = {container = "Occitania"}, -- 34
["Ille-et-Vilaine, France"] = {container = "Brittany"}, -- 35
["Indre, France"] = {container = "Centre-Val de Loire"}, -- 36
["Indre-et-Loire, France"] = {container = "Centre-Val de Loire"}, -- 37
["Isère, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 38
["Jura, France"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39
["Landes, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40
["Loir-et-Cher, France"] = {container = "Centre-Val de Loire"}, -- 41
["Loire, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42
["Haute-Loire, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 43
["Loire-Atlantique, France"] = {container = "Pays de la Loire"}, -- 44
["Loiret, France"] = {container = "Centre-Val de Loire"}, -- 45
["Lot, France"] = {container = "Occitania", wp = "%l (department)"}, -- 46
["Lot-et-Garonne, France"] = {container = "Nouvelle-Aquitaine"}, -- 47
["Lozère, France"] = {container = "Occitania"}, -- 48
["Maine-et-Loire, France"] = {container = "Pays de la Loire"}, -- 49
["Manche, France"] = {container = "Normandy"}, -- 50
["Marne, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 51
["Haute-Marne, France"] = {container = "Grand Est"}, -- 52
["Mayenne, France"] = {container = "Pays de la Loire"}, -- 53
["Meurthe-et-Moselle, France"] = {container = "Grand Est"}, -- 54
["Meuse, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 55
["Morbihan, France"] = {container = "Brittany"}, -- 56
["Moselle, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 57
["Nièvre, France"] = {container = "Bourgogne-Franche-Comté"}, -- 58
["Nord, France"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59
["Oise, France"] = {container = "Hauts-de-France"}, -- 60
["Orne, France"] = {container = "Normandy"}, -- 61
["Pas-de-Calais, France"] = {container = "Hauts-de-France"}, -- 62
["Puy-de-Dôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 63
["Pyrénées-Atlantiques, France"] = {container = "Nouvelle-Aquitaine"}, -- 64
["Hautes-Pyrénées, France"] = {container = "Occitania"}, -- 65
["Pyrénées-Orientales, France"] = {container = "Occitania"}, -- 66
["Bas-Rhin, France"] = {container = "Grand Est"}, -- 67
["Haut-Rhin, France"] = {container = "Grand Est"}, -- 68
["Rhône, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D
["Metropolis of Lyon, France"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M
["Lyon Metropolis, France"] = {alias_of = "Metropolis of Lyon, France"},
["Lyon, France"] = {alias_of = "Metropolis of Lyon, France"},
["Haute-Saône, France"] = {container = "Bourgogne-Franche-Comté"}, -- 70
["Saône-et-Loire, France"] = {container = "Bourgogne-Franche-Comté"}, -- 71
["Sarthe, France"] = {container = "Pays de la Loire"}, -- 72
["Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 73
["Haute-Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 74
["Paris, France"] = {container = "Île-de-France"}, -- 75
["Seine-Maritime, France"] = {container = "Normandy"}, -- 76
["Seine-et-Marne, France"] = {container = "Île-de-France"}, -- 77
["Yvelines, France"] = {container = "Île-de-France"}, -- 78
["Deux-Sèvres, France"] = {container = "Nouvelle-Aquitaine"}, -- 79
["Somme, France"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
["Tarn, France"] = {container = "Occitania", wp = "%l (department)"}, -- 81
["Tarn-et-Garonne, France"] = {container = "Occitania"}, -- 82
["Var, France"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
["Vaucluse, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84
["Vendée, France"] = {container = "Pays de la Loire"}, -- 85
["Vienne, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
["Haute-Vienne, France"] = {container = "Nouvelle-Aquitaine"}, -- 87
["Vosges, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 88
["Yonne, France"] = {container = "Bourgogne-Franche-Comté"}, -- 89
["Territoire de Belfort, France"] = {container = "Bourgogne-Franche-Comté"}, -- 90
["Essonne, France"] = {container = "Île-de-France"}, -- 91
["Hauts-de-Seine, France"] = {container = "Île-de-France"}, -- 92
["Seine-Saint-Denis, France"] = {container = "Île-de-France"}, -- 93
["Val-de-Marne, France"] = {container = "Île-de-France"}, -- 94
["Val-d'Oise, France"] = {container = "Île-de-France"}, -- 95
--["Guadeloupe"] = {container = "Guadeloupe"}, -- 971
--["Martinique"] = {container = "Martinique"}, -- 972
--["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973
--["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974
--["Mayotte"] = {container = "Mayotte"}, -- 976
}
export.france_departments_group = {
placename_to_key = make_placename_to_key(", France"),
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "department",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
},
data = export.france_departments,
}
export.germany_states = {
["Baden-Württemberg, Germany"] = {},
["Bavaria, Germany"] = {},
-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
-- the default_divs setting. Better not to include them at all since they're included as cities down below.
-- ["Berlin"] = {divs = {}},
["Brandenburg, Germany"] = {},
-- ["Bremen"] = {divs = {}},
-- ["Hamburg"] = {divs = {}},
["Hesse, Germany"] = {},
["Lower Saxony, Germany"] = {},
["Mecklenburg-Vorpommern, Germany"] = {},
["Mecklenburg-Western Pomerania, Germany"] = {alias_of = "Mecklenburg-Vorpommern, Germany", display = true},
["North Rhine-Westphalia, Germany"] = {},
["Rhineland-Palatinate, Germany"] = {},
["Saarland, Germany"] = {},
["Saxony, Germany"] = {},
["Saxony-Anhalt, Germany"] = {},
["Schleswig-Holstein, Germany"] = {},
["Thuringia, Germany"] = {},
}
-- states of Germany
export.germany_group = {
default_container = "Germany",
default_placetype = "negeri",
default_divs = {"districts", "municipalities"},
data = export.germany_states,
}
export.greece_regions = {
["Attica, Greece"] = {wp = "%l (region)"},
["Central Greece, Greece"] = {wp = "%l (administrative region)"},
["Central Macedonia, Greece"] = {},
["Crete, Greece"] = {},
["Eastern Macedonia and Thrace, Greece"] = {},
["Epirus, Greece"] = {wp = "%l (region)"},
["Ionian Islands, Greece"] = {the = true, wp = "%l (region)"},
["North Aegean, Greece"] = {the = true},
-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
-- has "the" in it.
["Peloponnese, Greece"] = {wp = "%l (region)"},
["South Aegean, Greece"] = {the = true},
["Thessaly, Greece"] = {},
["Western Greece, Greece"] = {},
["Western Macedonia, Greece"] = {},
["Mount Athos, Greece"] = {placetype = {"autonomous region", "region"}, wp = "Monastic community of Mount Athos"},
}
-- regions of Greece
export.greece_group = {
default_container = "Greece",
default_placetype = "region",
data = export.greece_regions,
}
local india_polity_with_divisions = {"divisions", "districts"}
local india_polity_without_divisions = {"districts"}
-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
["Andaman and Nicobar Islands, India"] =
{the = true, placetype = "union territory", divs = india_polity_without_divisions},
["Andhra Pradesh, India"] = {divs = india_polity_without_divisions},
["Arunachal Pradesh, India"] = {divs = india_polity_with_divisions},
["Assam, India"] = {divs = india_polity_with_divisions},
["Bihar, India"] = {divs = india_polity_with_divisions},
["Chandigarh, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Chhattisgarh, India"] = {divs = india_polity_with_divisions},
["Dadra and Nagar Haveli and Daman and Diu, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Delhi, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Goa, India"] = {divs = india_polity_without_divisions},
["Gujarat, India"] = {divs = india_polity_without_divisions},
["Haryana, India"] = {divs = india_polity_with_divisions},
["Himachal Pradesh, India"] = {divs = india_polity_with_divisions},
["Jammu and Kashmir, India"] = {placetype = "union territory", divs = india_polity_with_divisions,
wp = "%l (union territory)"},
["Jharkhand, India"] = {divs = india_polity_with_divisions},
["Karnataka, India"] = {divs = india_polity_with_divisions},
["Kerala, India"] = {divs = india_polity_without_divisions},
["Ladakh, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
["Lakshadweep, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
["Madhya Pradesh, India"] = {divs = india_polity_with_divisions},
["Maharashtra, India"] = {divs = india_polity_with_divisions},
["Manipur, India"] = {divs = india_polity_without_divisions},
["Meghalaya, India"] = {divs = india_polity_with_divisions},
["Mizoram, India"] = {divs = india_polity_without_divisions},
["Nagaland, India"] = {divs = india_polity_with_divisions},
["Odisha, India"] = {divs = india_polity_with_divisions},
["Puducherry, India"] = {placetype = "union territory", divs = india_polity_without_divisions,
wp = "%l (union territory)"},
["Pondicherry, India"] = {alias_of = "Puducherry, India", display = true},
["Punjab, India"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
["Rajasthan, India"] = {divs = india_polity_with_divisions},
["Sikkim, India"] = {divs = india_polity_without_divisions},
["Tamil Nadu, India"] = {divs = india_polity_without_divisions},
["Telangana, India"] = {divs = india_polity_without_divisions},
["Tripura, India"] = {divs = india_polity_without_divisions},
["Uttar Pradesh, India"] = {divs = india_polity_with_divisions},
["Uttarakhand, India"] = {divs = india_polity_with_divisions},
["West Bengal, India"] = {divs = india_polity_with_divisions},
}
-- states and union territories of India
export.india_group = {
default_container = "India",
default_placetype = "negeri",
data = export.india_states_and_union_territories,
}
export.indonesia_provinces = {
["Aceh, Indonesia"] = {},
["Bali, Indonesia"] = {},
["Bangka Belitung Islands, Indonesia"] = {the = true},
["Banten, Indonesia"] = {},
["Bengkulu, Indonesia"] = {},
["Central Java, Indonesia"] = {},
["Central Kalimantan, Indonesia"] = {},
["Central Papua, Indonesia"] = {},
["Central Sulawesi, Indonesia"] = {},
["East Java, Indonesia"] = {},
["East Kalimantan, Indonesia"] = {},
["East Nusa Tenggara, Indonesia"] = {},
["Gorontalo, Indonesia"] = {},
["Highland Papua, Indonesia"] = {wp = "%l"},
["Special Capital Region of Jakarta, Indonesia"] = {the = true, wp = "Jakarta"},
["Jakarta, Indonesia"] = {alias_of = "Special Capital Region of Jakarta, Indonesia"},
["Jambi, Indonesia"] = {},
["Lampung, Indonesia"] = {},
["Maluku, Indonesia"] = {},
["North Kalimantan, Indonesia"] = {},
["North Maluku, Indonesia"] = {},
["North Sulawesi, Indonesia"] = {},
["North Papua, Indonesia"] = {},
["North Sumatra, Indonesia"] = {},
["Papua, Indonesia"] = {wp = "%l (province)"},
["Riau, Indonesia"] = {},
["Riau Islands, Indonesia"] = {the = true},
["Southeast Sulawesi, Indonesia"] = {},
["South Kalimantan, Indonesia"] = {},
["South Papua, Indonesia"] = {},
["South Sulawesi, Indonesia"] = {},
["South Sumatra, Indonesia"] = {},
["Southwest Papua, Indonesia"] = {},
["West Java, Indonesia"] = {},
["West Kalimantan, Indonesia"] = {},
["West Nusa Tenggara, Indonesia"] = {},
["West Papua, Indonesia"] = {wp = "%l (province)"},
["West Sulawesi, Indonesia"] = {},
["West Sumatra, Indonesia"] = {},
["Special Region of Yogyakarta, Indonesia"] = {the = true},
["Yogyakarta, Indonesia"] = {alias_of = "Special Region of Yogyakarta, Indonesia"},
}
-- provinces of Indonesia
export.indonesia_group = {
default_container = "Indonesia",
default_placetype = "province",
-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
-- spellings.
data = export.indonesia_provinces,
}
export.iran_provinces = {
["Alborz Province, Iran"] = {}, -- abbreviation AL, capital [[w:Karaj]]
["Ardabil Province, Iran"] = {}, -- abbreviation AR, capital [[w:Ardabil]]
["Bushehr Province, Iran"] = {}, -- abbreviation BU, capital [[w:Bushehr]]
["Chaharmahal and Bakhtiari Province, Iran"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]]
["East Azerbaijan Province, Iran"] = {}, -- abbreviation EA, capital [[w:Tabriz]]
["Fars Province, Iran"] = {}, -- abbreviation FA, capital [[w:Shiraz]]
["Pars Province, Iran"] = {alias_of = "Fars Province, Iran", display = true},
["Gilan Province, Iran"] = {}, -- abbreviation GN, capital [[w:Rasht]]
["Golestan Province, Iran"] = {}, -- abbreviation GO, capital [[w:Gorgan]]
["Hamadan Province, Iran"] = {}, -- abbreviation HA, capital [[w:Hamadan]]
["Hormozgan Province, Iran"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]]
["Ilam Province, Iran"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]]
["Isfahan Province, Iran"] = {}, -- abbreviation IS, capital [[w:Isfahan]]
["Kerman Province, Iran"] = {}, -- abbreviation KN, capital [[w:Kerman]]
["Kermanshah Province, Iran"] = {}, -- abbreviation KE, capital [[w:Kermanshah]]
["Khuzestan Province, Iran"] = {}, -- abbreviation KH, capital [[w:Ahvaz]]
["Kohgiluyeh and Boyer-Ahmad Province, Iran"] = {}, -- abbreviation KB, capital [[w:Yasuj]]
["Kurdistan Province, Iran"] = {}, -- abbreviation KU, capital [[w:Sanandaj]]
["Lorestan Province, Iran"] = {}, -- abbreviation LO, capital [[w:Khorramabad]]
["Markazi Province, Iran"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]]
["Mazandaran Province, Iran"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]]
["North Khorasan Province, Iran"] = {}, -- abbreviation NK, capital [[w:Bojnord]]
["Qazvin Province, Iran"] = {}, -- abbreviation QA, capital [[w:Qazvin]]
["Qom Province, Iran"] = {}, -- abbreviation QM, capital [[w:Qom]]
["Razavi Khorasan Province, Iran"] = {}, -- abbreviation RK, capital [[w:Mashhad]]
["Semnan Province, Iran"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]]
["Sistan and Baluchestan Province, Iran"] = {}, -- abbreviation SB, capital [[w:Zahedan]]
["South Khorasan Province, Iran"] = {}, -- abbreviation SK, capital [[w:Birjand]]
["Tehran Province, Iran"] = {}, -- abbreviation TE, capital [[w:Tehran]]
["West Azerbaijan Province, Iran"] = {}, -- abbreviation WA, capital [[w:Urmia]]
["Yazd Province, Iran"] = {}, -- abbreviation YA, capital [[w:Yazd]]
["Zanjan Province, Iran"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]]
}
-- provinces of Iran
export.iran_group = {
key_to_placename = make_key_to_placename(", Iran$", " Province$"),
placename_to_key = make_placename_to_key(", Iran", " Province"),
default_container = "Iran",
default_placetype = "province",
-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
-- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]],
-- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].)
-- default_divs = "counties",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.iran_provinces,
}
export.ireland_counties = {
["County Carlow, Ireland"] = {},
["County Cavan, Ireland"] = {},
["County Clare, Ireland"] = {},
["County Cork, Ireland"] = {},
["County Donegal, Ireland"] = {},
["County Dublin, Ireland"] = {},
["County Galway, Ireland"] = {},
["County Kerry, Ireland"] = {},
["County Kildare, Ireland"] = {},
["County Kilkenny, Ireland"] = {},
["County Laois, Ireland"] = {},
["County Leitrim, Ireland"] = {},
["County Limerick, Ireland"] = {},
["County Longford, Ireland"] = {},
["County Louth, Ireland"] = {},
["County Mayo, Ireland"] = {},
["County Meath, Ireland"] = {},
["County Monaghan, Ireland"] = {},
["County Offaly, Ireland"] = {},
["County Roscommon, Ireland"] = {},
["County Sligo, Ireland"] = {},
["County Tipperary, Ireland"] = {},
["County Waterford, Ireland"] = {},
["County Westmeath, Ireland"] = {},
["County Wexford, Ireland"] = {},
["County Wicklow, Ireland"] = {},
}
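-- Return a key_to_placename function for Irish-style counties. It strips the container (matched by
-- `container_pattern`) from the key and returns two values: the full placename (e.g. "County Cork") and the
-- elliptical placename with any leading "County " removed (e.g. "Cork").
local function make_irish_type_key_to_placename(container_pattern)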
return function(key)
key = key:gsub(container_pattern, "")
local elliptical_key = key:gsub("^County ", "")
return key, elliptical_key
end
end
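-- Return a placename_to_key function for Irish-style counties. It prefixes "County " unless the placename already
-- begins with "County " or "City ", then appends `container_suffix` (e.g. "Cork" -> "County Cork, Ireland").
local function make_irish_type_placename_to_key(container_suffix)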
return function(placename)
if not placename:find("^County ") and not placename:find("^City ") then
placename = "County " .. placename
end
return placename .. container_suffix
end
end
-- counties of Ireland
export.ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Ireland"),
default_container = "Ireland",
default_placetype = "county",
data = export.ireland_counties,
}
export.italy_administrative_regions = {
["Abruzzo, Italy"] = {},
["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Apulia, Italy"] = {},
["Basilicata, Italy"] = {},
["Calabria, Italy"] = {},
["Campania, Italy"] = {},
["Emilia-Romagna, Italy"] = {},
["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Lazio, Italy"] = {},
["Liguria, Italy"] = {},
["Lombardy, Italy"] = {},
["Marche, Italy"] = {},
["Molise, Italy"] = {},
["Piedmont, Italy"] = {},
["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
["Tuscany, Italy"] = {},
["Umbria, Italy"] = {},
["Veneto, Italy"] = {},
}
-- administrative regions of Italy
export.italy_group = {
default_container = "Italy",
default_placetype = "region",
data = export.italy_administrative_regions,
}
-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
["Aichi Prefecture, Japan"] = {},
["Akita Prefecture, Japan"] = {},
["Aomori Prefecture, Japan"] = {},
["Chiba Prefecture, Japan"] = {},
["Ehime Prefecture, Japan"] = {},
["Fukui Prefecture, Japan"] = {},
["Fukuoka Prefecture, Japan"] = {},
["Fukushima Prefecture, Japan"] = {},
["Gifu Prefecture, Japan"] = {},
["Gunma Prefecture, Japan"] = {},
["Hiroshima Prefecture, Japan"] = {},
["Hokkaido Prefecture, Japan"] = {divs = "subprefectures", wp = "Hokkaido"},
["Hyōgo Prefecture, Japan"] = {},
["Hyogo Prefecture, Japan"] = {alias_of = "Hyōgo Prefecture, Japan", display = true},
["Ibaraki Prefecture, Japan"] = {},
["Ishikawa Prefecture, Japan"] = {},
["Iwate Prefecture, Japan"] = {},
["Kagawa Prefecture, Japan"] = {},
["Kagoshima Prefecture, Japan"] = {},
["Kanagawa Prefecture, Japan"] = {},
["Kōchi Prefecture, Japan"] = {},
["Kochi Prefecture, Japan"] = {alias_of = "Kōchi Prefecture, Japan", display = true},
["Kumamoto Prefecture, Japan"] = {},
["Kyoto Prefecture, Japan"] = {},
["Mie Prefecture, Japan"] = {},
["Miyagi Prefecture, Japan"] = {},
["Miyazaki Prefecture, Japan"] = {},
["Nagano Prefecture, Japan"] = {},
["Nagasaki Prefecture, Japan"] = {},
["Nara Prefecture, Japan"] = {},
["Niigata Prefecture, Japan"] = {},
["Ōita Prefecture, Japan"] = {},
["Oita Prefecture, Japan"] = {alias_of = "Ōita Prefecture, Japan", display = true},
["Okayama Prefecture, Japan"] = {},
["Okinawa Prefecture, Japan"] = {},
["Osaka Prefecture, Japan"] = {},
["Saga Prefecture, Japan"] = {},
["Saitama Prefecture, Japan"] = {},
["Shiga Prefecture, Japan"] = {},
["Shimane Prefecture, Japan"] = {},
["Shizuoka Prefecture, Japan"] = {},
["Tochigi Prefecture, Japan"] = {},
["Tokushima Prefecture, Japan"] = {},
["Tottori Prefecture, Japan"] = {},
["Toyama Prefecture, Japan"] = {},
["Wakayama Prefecture, Japan"] = {},
["Yamagata Prefecture, Japan"] = {},
["Yamaguchi Prefecture, Japan"] = {},
["Yamanashi Prefecture, Japan"] = {},
}
-- prefectures of Japan
export.japan_group = {
key_to_placename = make_key_to_placename(", Japan$", " Prefecture$"),
placename_to_key = make_placename_to_key(", Japan", " Prefecture"),
default_container = "Japan",
default_placetype = "prefecture",
data = export.japan_prefectures,
}
export.laos_provinces = {
["Attapeu Province, Laos"] = {},
["Bokeo Province, Laos"] = {},
["Bolikhamxai Province, Laos"] = {},
["Champasak Province, Laos"] = {},
["Houaphanh Province, Laos"] = {},
["Khammouane Province, Laos"] = {},
["Luang Namtha Province, Laos"] = {},
["Luang Prabang Province, Laos"] = {},
["Oudomxay Province, Laos"] = {},
["Phongsaly Province, Laos"] = {},
["Salavan Province, Laos"] = {},
["Savannakhet Province, Laos"] = {},
["Vientiane Province, Laos"] = {},
["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
["Sainyabuli Province, Laos"] = {},
["Sekong Province, Laos"] = {},
["Xaisomboun Province, Laos"] = {},
["Xiangkhouang Province, Laos"] = {},
}
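-- Map a placename to its key. "Vientiane Prefecture" and names already ending in " Province" just get ", Laos"
-- appended; bare names such as "Champasak" become "Champasak Province, Laos".
local function laos_placename_to_key(placename)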
if placename == "Vientiane Prefecture" then
return placename .. ", Laos"
end
if placename:find(" Province$") then
return placename .. ", Laos"
end
return placename .. " Province, Laos"
end
-- provinces of Laos
export.laos_group = {
key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
placename_to_key = laos_placename_to_key,
default_container = "Laos",
default_placetype = "province",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.laos_provinces,
}
export.lebanon_governorates = {
["Akkar Governorate, Lebanon"] = {},
["Baalbek-Hermel Governorate, Lebanon"] = {},
["Beirut Governorate, Lebanon"] = {},
["Beqaa Governorate, Lebanon"] = {},
["Keserwan-Jbeil Governorate, Lebanon"] = {},
["Mount Lebanon Governorate, Lebanon"] = {},
["Nabatieh Governorate, Lebanon"] = {},
-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
-- `gov/South Governorate` with `c/Lebanon`.
["North Governorate, Lebanon"] = {no_auto_augment_container = true},
["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}
-- governorates of Lebanon
export.lebanon_group = {
key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
default_container = "Lebanon",
default_placetype = "governorate",
data = export.lebanon_governorates,
}
export.malaysia_states = {
["Johor, Malaysia"] = {},
["Kedah, Malaysia"] = {},
["Kelantan, Malaysia"] = {},
["Malacca, Malaysia"] = {},
["Negeri Sembilan, Malaysia"] = {},
["Pahang, Malaysia"] = {},
["Penang, Malaysia"] = {},
["Perak, Malaysia"] = {},
["Perlis, Malaysia"] = {},
["Sabah, Malaysia"] = {},
["Sarawak, Malaysia"] = {},
["Selangor, Malaysia"] = {},
["Terengganu, Malaysia"] = {},
}
-- states of Malaysia
export.malaysia_group = {
default_container = "Malaysia",
default_placetype = "negeri",
default_wp = "%l, %c",
data = export.malaysia_states,
}
export.malta_regions = {
-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
-- `r/Northern Region` with `c/Malta`. In particular:
-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
-- El Salvador;
-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
-- Serbia and Uganda;
-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
["Eastern Region, Malta"] = {no_auto_augment_container = true},
["Gozo Region, Malta"] = {wp = "%l"},
["Northern Region, Malta"] = {no_auto_augment_container = true},
["Port Region, Malta"] = {},
["Southern Region, Malta"] = {no_auto_augment_container = true},
["Western Region, Malta"] = {no_auto_augment_container = true},
}
-- regions of Malta
export.malta_group = {
key_to_placename = make_key_to_placename(", Malta$", " Region$"),
placename_to_key = make_placename_to_key(", Malta", " Region"),
default_container = "Malta",
default_placetype = "region",
default_wp = "%l, %c",
default_the = true,
data = export.malta_regions,
}
export.mexico_states = {
["Aguascalientes, Mexico"] = {},
["Baja California, Mexico"] = {},
-- not display-canonicalizing because the "Norte" could be for emphasis
["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
["Baja California Sur, Mexico"] = {},
["Campeche, Mexico"] = {},
["Chiapas, Mexico"] = {},
["Chihuahua, Mexico"] = {wp = "%l (state)"},
["Coahuila, Mexico"] = {},
["Colima, Mexico"] = {},
["Durango, Mexico"] = {},
["Guanajuato, Mexico"] = {},
["Guerrero, Mexico"] = {},
["Hidalgo, Mexico"] = {wp = "%l (state)"},
["Jalisco, Mexico"] = {},
["State of Mexico, Mexico"] = {the = true},
["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
["Michoacán, Mexico"] = {},
["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
["Morelos, Mexico"] = {},
["Nayarit, Mexico"] = {},
["Nuevo León, Mexico"] = {},
["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
["Oaxaca, Mexico"] = {},
["Puebla, Mexico"] = {},
["Querétaro, Mexico"] = {},
["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
["Quintana Roo, Mexico"] = {},
["San Luis Potosí, Mexico"] = {},
["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
["Sinaloa, Mexico"] = {},
["Sonora, Mexico"] = {},
["Tabasco, Mexico"] = {},
["Tamaulipas, Mexico"] = {},
["Tlaxcala, Mexico"] = {},
["Veracruz, Mexico"] = {},
["Yucatán, Mexico"] = {},
["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true},
["Zacatecas, Mexico"] = {},
}
-- Mexican states
export.mexico_group = {
default_container = "Mexico",
default_placetype = "negeri",
data = export.mexico_states,
}
export.moldova_districts_and_autonomous_territorial_units = {
["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]]
["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]]
["Briceni District, Moldova"] = {}, -- capital [[Briceni]]
["Cahul District, Moldova"] = {}, -- capital [[Cahul]]
["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]]
["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]]
["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]]
["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]]
["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]]
["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]]
["Drochia District, Moldova"] = {}, -- capital [[Drochia]]
["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]]
["Edineț District, Moldova"] = {}, -- capital [[Edineț]]
["Fălești District, Moldova"] = {}, -- capital [[Fălești]]
["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]]
["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]]
["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]]
["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]]
["Leova District, Moldova"] = {}, -- capital [[Leova]]
["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]]
["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]]
["Orhei District, Moldova"] = {}, -- capital [[Orhei]]
["Rezina District, Moldova"] = {}, -- capital [[Rezina]]
["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]]
["Sîngerei District, Moldova"] = {}, -- capital [[Sîngerei]]
["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
["Chișinău, Moldova"] = {placetype = "municipality"},
["Bălți, Moldova"] = {placetype = "municipality"},
["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Comrat]]
-- the remainder are under the de-facto control of the unrecognized state of Transnistria
["Bender, Moldova"] = {placetype = "municipality"},
["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Tiraspol]]
["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}
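-- Map a placename to its key. First try the placename as given, since municipalities and autonomous units such as
-- "Chișinău" or "Gagauzia" have no " District" suffix; otherwise append " District" if it is missing.
local function moldova_placename_to_key(placename)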
local elliptical_key = placename .. ", Moldova"
if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
return elliptical_key
end
if placename:find(" District$") then
return placename .. ", Moldova"
end
return placename .. " District, Moldova"
end
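-- Illustrative sketch of the lookups performed by moldova_placename_to_key() above. The calls are examples only,
-- traced by hand against the table above, and are not part of the module:
--   moldova_placename_to_key("Gagauzia")       --> "Gagauzia, Moldova" (exact key exists in the table)
--   moldova_placename_to_key("Orhei District") --> "Orhei District, Moldova" (exact key exists in the table)
--   moldova_placename_to_key("Orhei")          --> "Orhei District, Moldova" (" District" appended as a fallback)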
-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
key_to_placename = make_key_to_placename(", Moldova$", " District$"),
placename_to_key = moldova_placename_to_key,
default_container = "Moldova",
default_placetype = {"district", "raion"},
default_divs = "communes",
data = export.moldova_districts_and_autonomous_territorial_units,
}
export.morocco_regions = {
["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
["Oriental, Morocco"] = {wp = "%l (%c)"},
["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
["Fez-Meknes, Morocco"] = {},
["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
["Casablanca-Settat, Morocco"] = {},
["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
["Souss-Massa, Morocco"] = {},
["Guelmim-Oued Noun, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
},
["Laayoune-Sakia El Hamra, Morocco"] = {
wp = "Laâyoune-Sakia El Hamra",
keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
},
["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
["Dakhla-Oued Ed-Dahab, Morocco"] = {
keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]",
},
}
-- regions of Morocco
export.morocco_group = {
default_container = "Morocco",
default_placetype = "region",
data = export.morocco_regions,
}
export.egypt_governorates = {
["Cairo Governorate, Egypt"] = {},
["Giza Governorate, Egypt"] = {},
["Sharqia Governorate, Egypt"] = {},
["Dakahlia Governorate, Egypt"] = {},
["Beheira Governorate, Egypt"] = {},
["Minya Governorate, Egypt"] = {},
["Qalyubia Governorate, Egypt"] = {},
["Sohag Governorate, Egypt"] = {},
["Alexandria Governorate, Egypt"] = {},
["Gharbia Governorate, Egypt"] = {},
["Asyut Governorate, Egypt"] = {},
["Monufia Governorate, Egypt"] = {},
["Faiyum Governorate, Egypt"] = {},
["Kafr El Sheikh Governorate, Egypt"] = {},
["Qena Governorate, Egypt"] = {},
["Beni Suef Governorate, Egypt"] = {},
["Damietta Governorate, Egypt"] = {},
["Aswan Governorate, Egypt"] = {},
["Ismailia Governorate, Egypt"] = {},
["Luxor Governorate, Egypt"] = {},
["Suez Governorate, Egypt"] = {},
["Port Said Governorate, Egypt"] = {},
["Matrouh Governorate, Egypt"] = {},
["North Sinai Governorate, Egypt"] = {},
["Red Sea Governorate, Egypt"] = {},
["New Valley Governorate, Egypt"] = {},
["South Sinai Governorate, Egypt"] = {},
}
-- governorates of Egypt
export.egypt_group = {
key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"),
placename_to_key = make_placename_to_key(", Egypt", " Governorate"),
default_container = "Egypt",
default_placetype = "governorate",
data = export.egypt_governorates,
}
export.netherlands_provinces = {
["Drenthe, Netherlands"] = {},
["Flevoland, Netherlands"] = {},
["Friesland, Netherlands"] = {},
["Gelderland, Netherlands"] = {},
["Groningen, Netherlands"] = {wp = "%l (province)"},
["Limburg, Netherlands"] = {wp = "%l (%c)"},
["North Brabant, Netherlands"] = {},
-- Foreign forms get display-canonicalized.
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true},
["North Holland, Netherlands"] = {},
["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true},
["Overijssel, Netherlands"] = {},
["South Holland, Netherlands"] = {},
["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true},
["Utrecht, Netherlands"] = {wp = "%l (province)"},
["Zeeland, Netherlands"] = {},
}
-- provinces of the Netherlands
export.netherlands_group = {
default_container = "Netherlands",
default_placetype = "province",
default_divs = "municipalities",
data = export.netherlands_provinces,
}
export.new_zealand_regions = {
-- North Island regions
["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]]
["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]]
["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]]
["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]]
["Gisborne, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]]
["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]]
["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]]
["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]]
["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]]
-- South Island regions
["Tasman, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]]
["Nelson, New Zealand"] = {placetype = {"region", "city"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]]
["Marlborough, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]]
["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]]
["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]]
["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]]
["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]]
}
-- regions of New Zealand
export.new_zealand_group = {
default_container = "New Zealand",
default_placetype = "region",
data = export.new_zealand_regions,
}
export.nigeria_states = {
["Abia State, Nigeria"] = {},
["Adamawa State, Nigeria"] = {},
["Akwa Ibom State, Nigeria"] = {},
["Anambra State, Nigeria"] = {},
["Bauchi State, Nigeria"] = {},
["Bayelsa State, Nigeria"] = {},
["Benue State, Nigeria"] = {},
["Borno State, Nigeria"] = {},
["Cross River State, Nigeria"] = {},
["Delta State, Nigeria"] = {},
["Ebonyi State, Nigeria"] = {},
["Edo State, Nigeria"] = {},
["Ekiti State, Nigeria"] = {},
["Enugu State, Nigeria"] = {},
["Federal Capital Territory, Nigeria"] = {
-- not a state but allow it to be referenced as one in holonyms
placetype = {"wilayah persekutuan", "territory", "negeri"}, the = true, wp = "%l (%c)",
},
["Gombe State, Nigeria"] = {},
["Imo State, Nigeria"] = {},
["Jigawa State, Nigeria"] = {},
["Kaduna State, Nigeria"] = {},
["Kano State, Nigeria"] = {},
["Katsina State, Nigeria"] = {},
["Kebbi State, Nigeria"] = {},
["Kogi State, Nigeria"] = {},
["Kwara State, Nigeria"] = {},
["Lagos State, Nigeria"] = {},
["Nasarawa State, Nigeria"] = {},
["Niger State, Nigeria"] = {},
["Ogun State, Nigeria"] = {},
["Ondo State, Nigeria"] = {},
["Osun State, Nigeria"] = {},
["Oyo State, Nigeria"] = {},
["Plateau State, Nigeria"] = {},
["Rivers State, Nigeria"] = {},
["Sokoto State, Nigeria"] = {},
["Taraba State, Nigeria"] = {},
["Yobe State, Nigeria"] = {},
["Zamfara State, Nigeria"] = {},
}
-- states of Nigeria
export.nigeria_group = {
key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
placename_to_key = make_placename_to_key(", Nigeria", " State"),
default_container = "Nigeria",
default_placetype = "negeri",
data = export.nigeria_states,
}
export.north_korea_provinces = {
["Chagang Province, North Korea"] = {},
["North Hamgyong Province, North Korea"] = {},
["South Hamgyong Province, North Korea"] = {},
["North Hwanghae Province, North Korea"] = {},
["South Hwanghae Province, North Korea"] = {},
["Kangwon Province, North Korea"] = {wp = "%l (%c)"},
["North Pyongan Province, North Korea"] = {},
["South Pyongan Province, North Korea"] = {},
["Ryanggang Province, North Korea"] = {},
}
-- provinces of North Korea
export.north_korea_group = {
key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
placename_to_key = make_placename_to_key(", North Korea", " Province"),
default_container = "North Korea",
default_placetype = "province",
data = export.north_korea_provinces,
}
export.norwegian_counties = {
["Oslo, Norway"] = {},
["Rogaland, Norway"] = {},
["Møre og Romsdal, Norway"] = {},
["Nordland, Norway"] = {},
["Østfold, Norway"] = {},
["Akershus, Norway"] = {},
["Buskerud, Norway"] = {},
-- the following two were merged into Innlandet
-- ["Hedmark, Norway"] = {},
-- ["Oppland, Norway"] = {},
["Innlandet, Norway"] = {},
["Vestfold, Norway"] = {},
["Telemark, Norway"] = {},
-- the following two were merged into Agder
-- ["Aust-Agder, Norway"] = {},
-- ["Vest-Agder, Norway"] = {},
["Agder, Norway"] = {},
-- the following two were merged into Vestland
-- ["Hordaland, Norway"] = {},
-- ["Sogn og Fjordane, Norway"] = {},
["Vestland, Norway"] = {},
["Trøndelag, Norway"] = {},
["Troms, Norway"] = {},
["Finnmark, Norway"] = {},
}
-- counties of Norway
export.norway_group = {
default_container = "Norway",
default_placetype = "county",
data = export.norwegian_counties,
}
export.pakistan_provinces_and_territories = {
["Azad Kashmir, Pakistan"] = {
placetype = {"administrative territory", "autonomous territory", "territory"},
},
["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true},
["Balochistan, Pakistan"] = {wp = "%l, %c"},
["Gilgit-Baltistan, Pakistan"] = {
placetype = {"administrative territory", "territory"},
},
["Islamabad Capital Territory, Pakistan"] = {
the = true,
divs = {}, -- no divisions
placetype = {"wilayah persekutuan", "administrative territory", "territory"},
},
-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"},
["Khyber Pakhtunkhwa, Pakistan"] = {},
["Punjab, Pakistan"] = {wp = "%l, %c"},
["Sindh, Pakistan"] = {},
}
-- provinces and territories of Pakistan
export.pakistan_group = {
default_container = "Pakistan",
default_placetype = "province",
default_divs = "divisions",
data = export.pakistan_provinces_and_territories,
}
export.philippines_provinces = {
["Abra, Philippines"] = {wp = "%l (province)"},
["Agusan del Norte, Philippines"] = {},
["Agusan del Sur, Philippines"] = {},
["Aklan, Philippines"] = {},
["Albay, Philippines"] = {},
["Antique, Philippines"] = {wp = "%l (province)"},
["Apayao, Philippines"] = {},
["Aurora, Philippines"] = {wp = "%l (province)"},
["Basilan, Philippines"] = {},
["Bataan, Philippines"] = {},
["Batanes, Philippines"] = {},
["Batangas, Philippines"] = {},
["Benguet, Philippines"] = {},
["Biliran, Philippines"] = {},
["Bohol, Philippines"] = {},
["Bukidnon, Philippines"] = {},
["Bulacan, Philippines"] = {},
["Cagayan, Philippines"] = {},
["Camarines Norte, Philippines"] = {},
["Camarines Sur, Philippines"] = {},
["Camiguin, Philippines"] = {},
["Capiz, Philippines"] = {},
["Catanduanes, Philippines"] = {},
["Cavite, Philippines"] = {},
["Cebu, Philippines"] = {},
["Cotabato, Philippines"] = {},
["Davao de Oro, Philippines"] = {},
["Davao del Norte, Philippines"] = {},
["Davao del Sur, Philippines"] = {},
["Davao Occidental, Philippines"] = {},
["Davao Oriental, Philippines"] = {},
["Dinagat Islands, Philippines"] = {the = true},
["Eastern Samar, Philippines"] = {},
["Guimaras, Philippines"] = {},
["Ifugao, Philippines"] = {},
["Ilocos Norte, Philippines"] = {},
["Ilocos Sur, Philippines"] = {},
["Iloilo, Philippines"] = {},
["Isabela, Philippines"] = {wp = "%l (province)"},
["Kalinga, Philippines"] = {wp = "%l (province)"},
["La Union, Philippines"] = {},
["Laguna, Philippines"] = {wp = "%l (province)"},
["Lanao del Norte, Philippines"] = {},
["Lanao del Sur, Philippines"] = {},
["Leyte, Philippines"] = {wp = "%l (province)"},
["Maguindanao del Norte, Philippines"] = {},
["Maguindanao del Sur, Philippines"] = {},
["Marinduque, Philippines"] = {},
["Masbate, Philippines"] = {},
["Misamis Occidental, Philippines"] = {},
["Misamis Oriental, Philippines"] = {},
["Mountain Province, Philippines"] = {},
["Negros Occidental, Philippines"] = {},
["Negros Oriental, Philippines"] = {},
["Northern Samar, Philippines"] = {},
["Nueva Ecija, Philippines"] = {},
["Nueva Vizcaya, Philippines"] = {},
["Occidental Mindoro, Philippines"] = {},
["Oriental Mindoro, Philippines"] = {},
["Palawan, Philippines"] = {},
["Pampanga, Philippines"] = {},
["Pangasinan, Philippines"] = {},
["Quezon, Philippines"] = {},
["Quirino, Philippines"] = {},
["Rizal, Philippines"] = {wp = "%l (province)"},
["Romblon, Philippines"] = {},
["Samar, Philippines"] = {wp = "%l (province)"},
["Sarangani, Philippines"] = {},
["Siquijor, Philippines"] = {},
["Sorsogon, Philippines"] = {},
["South Cotabato, Philippines"] = {},
["Southern Leyte, Philippines"] = {},
["Sultan Kudarat, Philippines"] = {},
["Sulu, Philippines"] = {},
["Surigao del Norte, Philippines"] = {},
["Surigao del Sur, Philippines"] = {},
["Tarlac, Philippines"] = {},
["Tawi-Tawi, Philippines"] = {},
["Zambales, Philippines"] = {},
["Zamboanga del Norte, Philippines"] = {},
["Zamboanga del Sur, Philippines"] = {},
["Zamboanga Sibugay, Philippines"] = {},
-- not a province but treated as one; allow it to be referred to as a province in holonyms
["Metro Manila, Philippines"] = {placetype = {"region", "province"}},
}
-- provinces of the Philippines
export.philippines_group = {
default_container = "Philippines",
default_placetype = "province",
default_divs = {"municipalities", "barangays"},
data = export.philippines_provinces,
}
export.poland_voivodeships = {
["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław
["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin
["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków
["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw
["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole
["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów
["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok
["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk
["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice
["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}
-- voivodeships of Poland
export.poland_group = {
key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
default_container = "Poland",
default_placetype = "voivodeship",
default_divs = {
-- "counties", -- not enough of them currently
{type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}},
},
data = export.poland_voivodeships,
}
export.portugal_districts_and_autonomous_regions = {
["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "region"}},
["Aveiro District, Portugal"] = {},
["Beja District, Portugal"] = {},
["Braga District, Portugal"] = {},
["Bragança District, Portugal"] = {},
["Castelo Branco District, Portugal"] = {},
["Coimbra District, Portugal"] = {},
["Évora District, Portugal"] = {},
["Faro District, Portugal"] = {},
["Guarda District, Portugal"] = {},
["Leiria District, Portugal"] = {},
["Lisbon District, Portugal"] = {},
["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
["Madeira, Portugal"] = {placetype = {"autonomous region", "region"}},
["Portalegre District, Portugal"] = {},
["Porto District, Portugal"] = {},
["Santarém District, Portugal"] = {},
["Setúbal District, Portugal"] = {},
["Viana do Castelo District, Portugal"] = {},
["Vila Real District, Portugal"] = {},
["Viseu District, Portugal"] = {},
}
local function portugal_placename_to_key(placename)
if placename == "Azores" or placename == "Madeira" then
return placename .. ", Portugal"
end
if placename:find(" District$") then
return placename .. ", Portugal"
end
return placename .. " District, Portugal"
end
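-- Illustrative sketch of portugal_placename_to_key() above (example calls only, traced by hand; not part of the
-- module):
--   portugal_placename_to_key("Madeira")       --> "Madeira, Portugal" (autonomous region; no " District" added)
--   portugal_placename_to_key("Faro District") --> "Faro District, Portugal"
--   portugal_placename_to_key("Faro")          --> "Faro District, Portugal"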
-- districts and autonomous regions of Portugal
export.portugal_group = {
key_to_placename = make_key_to_placename(", Portugal$", " District$"),
placename_to_key = portugal_placename_to_key,
default_container = "Portugal",
default_placetype = "district",
default_divs = "municipalities",
data = export.portugal_districts_and_autonomous_regions,
}
export.romania_counties = {
["Alba County, Romania"] = {},
["Arad County, Romania"] = {},
["Argeș County, Romania"] = {},
["Bacău County, Romania"] = {},
["Bihor County, Romania"] = {},
["Bistrița-Năsăud County, Romania"] = {},
["Botoșani County, Romania"] = {},
["Brașov County, Romania"] = {},
["Brăila County, Romania"] = {},
-- Bucharest: not in a county
["Buzău County, Romania"] = {},
["Caraș-Severin County, Romania"] = {},
["Cluj County, Romania"] = {},
["Constanța County, Romania"] = {},
["Covasna County, Romania"] = {},
["Călărași County, Romania"] = {},
["Dolj County, Romania"] = {},
["Dâmbovița County, Romania"] = {},
["Galați County, Romania"] = {},
["Giurgiu County, Romania"] = {},
["Gorj County, Romania"] = {},
["Harghita County, Romania"] = {},
["Hunedoara County, Romania"] = {},
["Ialomița County, Romania"] = {},
["Iași County, Romania"] = {},
["Ilfov County, Romania"] = {},
["Maramureș County, Romania"] = {},
["Mehedinți County, Romania"] = {},
["Mureș County, Romania"] = {},
["Neamț County, Romania"] = {},
["Olt County, Romania"] = {},
["Prahova County, Romania"] = {},
["Satu Mare County, Romania"] = {},
["Sibiu County, Romania"] = {},
["Suceava County, Romania"] = {},
["Sălaj County, Romania"] = {},
["Teleorman County, Romania"] = {},
["Timiș County, Romania"] = {},
["Tulcea County, Romania"] = {},
["Vaslui County, Romania"] = {},
["Vrancea County, Romania"] = {},
["Vâlcea County, Romania"] = {},
}
-- counties of Romania
export.romania_group = {
key_to_placename = make_key_to_placename(", Romania$", " County$"),
placename_to_key = make_placename_to_key(", Romania", " County"),
default_container = "Romania",
default_placetype = "county",
default_divs = "communes",
data = export.romania_counties,
}
local function make_russia_federal_subject_spec(spectype, use_the, wp)
return {
placetype = spectype,
the = not not use_the,
bare_category_parent_type = {"federal subjects", spectype .. "s"},
wp = wp,
}
end
local russia_autonomous_okrug_no_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"},
the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")
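-- Illustrative note, derived by hand from make_russia_federal_subject_spec() above: each of these specs is a single
-- table referenced by many entries, so mutating one would affect every entry that shares it. For example:
--   russia_krai         is  {placetype = "krai", the = false, bare_category_parent_type = {"federal subjects", "krais"}}
--   russia_republic_the is  {placetype = "republic", the = true,
--                            bare_category_parent_type = {"federal subjects", "republics"}}
-- The `wp` field is nil (absent) in both because no third argument was passed.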
export.russia_federal_subjects = {
-- autonomous oblasts
["Jewish Autonomous Oblast, Russia"] =
{the = true, placetype = {"autonomous oblast", "oblast"},
bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
-- autonomous okrugs
["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"},
["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous Okrug, Russia"},
["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the,
["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"},
-- krais
["Altai Krai, Russia"] = russia_krai,
["Kamchatka Krai, Russia"] = russia_krai,
["Khabarovsk Krai, Russia"] = russia_krai,
["Krasnodar Krai, Russia"] = russia_krai,
["Krasnoyarsk Krai, Russia"] = russia_krai,
["Perm Krai, Russia"] = russia_krai,
["Primorsky Krai, Russia"] = russia_krai,
["Stavropol Krai, Russia"] = russia_krai,
["Zabaykalsky Krai, Russia"] = russia_krai,
-- oblasts
["Amur Oblast, Russia"] = russia_oblast,
["Arkhangelsk Oblast, Russia"] = russia_oblast,
["Astrakhan Oblast, Russia"] = russia_oblast,
["Belgorod Oblast, Russia"] = russia_oblast,
["Bryansk Oblast, Russia"] = russia_oblast,
["Chelyabinsk Oblast, Russia"] = russia_oblast,
["Irkutsk Oblast, Russia"] = russia_oblast,
["Ivanovo Oblast, Russia"] = russia_oblast,
["Kaliningrad Oblast, Russia"] = russia_oblast,
["Kaluga Oblast, Russia"] = russia_oblast,
["Kemerovo Oblast, Russia"] = russia_oblast,
["Kirov Oblast, Russia"] = russia_oblast,
["Kostroma Oblast, Russia"] = russia_oblast,
["Kurgan Oblast, Russia"] = russia_oblast,
["Kursk Oblast, Russia"] = russia_oblast,
["Leningrad Oblast, Russia"] = russia_oblast,
["Lipetsk Oblast, Russia"] = russia_oblast,
["Magadan Oblast, Russia"] = russia_oblast,
["Moscow Oblast, Russia"] = russia_oblast,
["Murmansk Oblast, Russia"] = russia_oblast,
["Nizhny Novgorod Oblast, Russia"] = russia_oblast,
["Novgorod Oblast, Russia"] = russia_oblast,
["Novosibirsk Oblast, Russia"] = russia_oblast,
["Omsk Oblast, Russia"] = russia_oblast,
["Orenburg Oblast, Russia"] = russia_oblast,
["Oryol Oblast, Russia"] = russia_oblast,
["Penza Oblast, Russia"] = russia_oblast,
["Pskov Oblast, Russia"] = russia_oblast,
["Rostov Oblast, Russia"] = russia_oblast,
["Ryazan Oblast, Russia"] = russia_oblast,
["Sakhalin Oblast, Russia"] = russia_oblast,
["Samara Oblast, Russia"] = russia_oblast,
["Saratov Oblast, Russia"] = russia_oblast,
["Smolensk Oblast, Russia"] = russia_oblast,
["Sverdlovsk Oblast, Russia"] = russia_oblast,
["Tambov Oblast, Russia"] = russia_oblast,
["Tomsk Oblast, Russia"] = russia_oblast,
["Tula Oblast, Russia"] = russia_oblast,
["Tver Oblast, Russia"] = russia_oblast,
["Tyumen Oblast, Russia"] = russia_oblast,
["Ulyanovsk Oblast, Russia"] = russia_oblast,
["Vladimir Oblast, Russia"] = russia_oblast,
["Volgograd Oblast, Russia"] = russia_oblast,
["Vologda Oblast, Russia"] = russia_oblast,
["Voronezh Oblast, Russia"] = russia_oblast,
["Yaroslavl Oblast, Russia"] = russia_oblast,
-- republics
--
-- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where
-- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by
-- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence
-- of "the".
["Adygea, Russia"] = russia_republic_no_the,
["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true},
["Bashkortostan, Russia"] = russia_republic_no_the,
["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true},
["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"},
["Buryatia, Russia"] = russia_republic_no_the,
["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true},
["Dagestan, Russia"] = russia_republic_no_the,
["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true},
["Ingushetia, Russia"] = russia_republic_no_the,
["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true},
["Kalmykia, Russia"] = russia_republic_no_the,
["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true},
["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"),
["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true},
["Khakassia, Russia"] = russia_republic_no_the,
["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true},
["Mordovia, Russia"] = russia_republic_no_the,
["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true},
["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash
["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true},
["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true},
["Tatarstan, Russia"] = russia_republic_no_the,
["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true},
["Altai Republic, Russia"] = russia_republic_the,
["Chechnya, Russia"] = russia_republic_no_the,
["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true},
["Chuvashia, Russia"] = russia_republic_no_the,
["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true},
["Kabardino-Balkaria, Russia"] = russia_republic_no_the,
["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true},
["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true},
["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia",
display = "Kabardino-Balkarian Republic, Russia", the = true},
["Karachay-Cherkessia, Russia"] = russia_republic_no_the,
["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia", the = true},
["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"),
["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true},
["Mari El, Russia"] = russia_republic_no_the,
["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true},
["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"),
["Sakha Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia",
the = true},
["Tuva, Russia"] = russia_republic_no_the,
["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
["Udmurtia, Russia"] = russia_republic_no_the,
["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
-- Not included due to being unrecognized and only partly controlled:
-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
-- There are also federal cities (not included because they're cities):
-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}
local function russia_key_to_placename(key)
key = key:gsub(",.*", "")
local full_placename = key
if key == "Jewish Autonomous Oblast" then
return full_placename, full_placename
end
local elliptical_placename
for _, suffix in ipairs({"Krai", "Oblast"}) do
elliptical_placename = key:match("^(.*) " .. suffix .. "$")
if elliptical_placename then
return full_placename, elliptical_placename
end
end
return full_placename, full_placename
end
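-- Illustrative sketch (examples only, traced by hand from russia_key_to_placename() above). The function returns
-- the full placename followed by the elliptical one:
--   russia_key_to_placename("Samara Oblast, Russia")            --> "Samara Oblast", "Samara"
--   russia_key_to_placename("Altai Krai, Russia")               --> "Altai Krai", "Altai"
--   russia_key_to_placename("Tatarstan, Russia")                --> "Tatarstan", "Tatarstan"
--   russia_key_to_placename("Jewish Autonomous Oblast, Russia") --> "Jewish Autonomous Oblast" twice (never shortened)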
local function russia_placename_to_key(placename)
local key = placename .. ", Russia"
if export.russia_federal_subjects[key] then
return key
end
-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
for _, suffix in ipairs({"Krai", "Oblast"}) do
local suffixed_key = placename .. " " .. suffix .. ", Russia"
if export.russia_federal_subjects[suffixed_key] then
return suffixed_key
end
end
return placename .. ", Russia"
end
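-- Illustrative sketch (examples only, traced by hand against export.russia_federal_subjects above):
--   russia_placename_to_key("Tatarstan")      --> "Tatarstan, Russia" (exact key exists)
--   russia_placename_to_key("Samara")         --> "Samara Oblast, Russia" (found by trying the "Oblast" suffix)
--   russia_placename_to_key("Altai")          --> "Altai Krai, Russia" ("Krai" is tried before "Oblast")
--   russia_placename_to_key("Altai Republic") --> "Altai Republic, Russia" (the republic must be named in full)
-- A name that matches no key is returned with ", Russia" appended and no suffix added.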
local function construct_russia_federal_subject_keydesc(group, key, spec)
local placename = key:gsub(",.*", "")
local linked_placename = export.construct_linked_placename(spec, placename)
local placetype = spec.placetype
if type(placetype) == "table" then
placetype = placetype[1]
end
if placetype == "oblast" then
-- Hack: Oblasts generally don't have entries under "Foo Oblast"
-- but just under "Foo", so fix the linked key appropriately;
-- doesn't apply to the Jewish Autonomous Oblast
linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
end
return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end
-- federal subjects of Russia
export.russia_group = {
key_to_placename = russia_key_to_placename,
placename_to_key = russia_placename_to_key,
default_container = "Russia",
default_keydesc = construct_russia_federal_subject_keydesc,
default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
data = export.russia_federal_subjects,
}
export.saudi_arabia_provinces = {
["Riyadh Province, Saudi Arabia"] = {},
["Mecca Province, Saudi Arabia"] = {},
-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"},
["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"},
["Aseer Province, Saudi Arabia"] = {wp = "Asir"},
["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true},
["Jazan Province, Saudi Arabia"] = {},
["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"},
["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true},
["Tabuk Province, Saudi Arabia"] = {},
["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"},
["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"},
["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
["Najran Province, Saudi Arabia"] = {},
["Northern Borders Province, Saudi Arabia"] = {},
["Al-Bahah Province, Saudi Arabia"] = {},
}
-- provinces of Saudi Arabia
export.saudi_arabia_group = {
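-- The suffix patterns must match the keys in `export.saudi_arabia_provinces`, which end in ", Saudi Arabia".
key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),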
default_container = "Arab Saudi",
default_placetype = "wilayah",
data = export.saudi_arabia_provinces,
}
export.south_africa_provinces = {
["Eastern Cape, South Africa"] = {the = true},
["Free State, South Africa"] = {the = true, wp = "%l (province)"},
["Gauteng, South Africa"] = {},
["KwaZulu-Natal, South Africa"] = {},
["Limpopo, South Africa"] = {},
["Mpumalanga, South Africa"] = {},
-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
["North West, South Africa"] = {wp = "%l (South African province)"},
["Northern Cape, South Africa"] = {the = true},
["Western Cape, South Africa"] = {the = true},
}
-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa",
default_placetype = "province",
default_divs = "municipalities",
data = export.south_africa_provinces,
}
export.south_korea_provinces = {
["North Chungcheong Province, South Korea"] = {},
["South Chungcheong Province, South Korea"] = {},
["Gangwon Province, South Korea"] = {wp = "%l, %c"},
["Gyeonggi Province, South Korea"] = {},
["North Gyeongsang Province, South Korea"] = {},
["South Gyeongsang Province, South Korea"] = {},
["North Jeolla Province, South Korea"] = {},
["South Jeolla Province, South Korea"] = {},
["Jeju Province, South Korea"] = {},
}
-- provinces of South Korea
export.south_korea_group = {
key_to_placename = make_key_to_placename(", South Korea$", " Province$"),
placename_to_key = make_placename_to_key(", South Korea", " Province"),
default_container = "South Korea",
default_placetype = "province",
data = export.south_korea_provinces,
}
export.spain_autonomous_communities = {
["Andalusia, Spain"] = {},
["Aragon, Spain"] = {},
["Asturias, Spain"] = {},
["Balearic Islands, Spain"] = {the = true},
["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"},
["Canary Islands, Spain"] = {the = true},
["Cantabria, Spain"] = {},
["Castile and León, Spain"] = {},
["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash
["Catalonia, Spain"] = {},
["Community of Madrid, Spain"] = {the = true},
["Extremadura, Spain"] = {},
["Galicia, Spain"] = {wp = "%l (Spain)"},
["La Rioja, Spain"] = {},
["Murcia, Spain"] = {wp = "Region of %l"},
["Navarre, Spain"] = {},
["Valencia, Spain"] = {wp = "Valencian Community"},
["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true},
}
-- autonomous communities of Spain
export.spain_group = {
default_container = "Spain",
default_placetype = "autonomous community",
default_divs = {"municipalities", "comarcas"},
data = export.spain_autonomous_communities,
}
export.taiwan_counties = {
["Changhua County, Taiwan"] = {},
["Chiayi County, Taiwan"] = {},
["Hsinchu County, Taiwan"] = {},
["Hualien County, Taiwan"] = {},
["Kinmen County, Taiwan"] = {wp = "Kinmen"},
["Lienchiang County, Taiwan"] = {wp = "Matsu Islands"},
["Miaoli County, Taiwan"] = {},
["Nantou County, Taiwan"] = {},
["Penghu County, Taiwan"] = {wp = "Penghu"},
["Pingtung County, Taiwan"] = {},
["Taitung County, Taiwan"] = {},
["Yilan County, Taiwan"] = {wp = "%l, %c"},
["Yunlin County, Taiwan"] = {},
}
-- counties of Taiwan
export.taiwan_group = {
key_to_placename = make_key_to_placename(", Taiwan$", " County$"),
placename_to_key = make_placename_to_key(", Taiwan", " County"),
default_container = "Taiwan",
default_placetype = "county",
default_divs = {"districts", "townships"},
data = export.taiwan_counties,
}
export.thailand_provinces = {
-- Bangkok (special administrative area)
["Amnat Charoen Province, Thailand"] = {},
["Ang Thong Province, Thailand"] = {},
["Bueng Kan Province, Thailand"] = {},
["Buriram Province, Thailand"] = {},
["Chachoengsao Province, Thailand"] = {},
["Chai Nat Province, Thailand"] = {},
["Chaiyaphum Province, Thailand"] = {},
["Chanthaburi Province, Thailand"] = {},
["Chiang Mai Province, Thailand"] = {},
["Chiang Rai Province, Thailand"] = {},
["Chonburi Province, Thailand"] = {},
["Chumphon Province, Thailand"] = {},
["Kalasin Province, Thailand"] = {},
["Kamphaeng Phet Province, Thailand"] = {},
["Kanchanaburi Province, Thailand"] = {},
["Khon Kaen Province, Thailand"] = {},
["Krabi Province, Thailand"] = {},
["Lampang Province, Thailand"] = {},
["Lamphun Province, Thailand"] = {},
["Loei Province, Thailand"] = {},
["Lopburi Province, Thailand"] = {},
["Mae Hong Son Province, Thailand"] = {},
["Maha Sarakham Province, Thailand"] = {},
["Mukdahan Province, Thailand"] = {},
["Nakhon Nayok Province, Thailand"] = {},
["Nakhon Pathom Province, Thailand"] = {},
["Nakhon Phanom Province, Thailand"] = {},
["Nakhon Ratchasima Province, Thailand"] = {},
["Nakhon Sawan Province, Thailand"] = {},
["Nakhon Si Thammarat Province, Thailand"] = {},
["Nan Province, Thailand"] = {},
["Narathiwat Province, Thailand"] = {},
["Nong Bua Lamphu Province, Thailand"] = {},
["Nong Khai Province, Thailand"] = {},
["Nonthaburi Province, Thailand"] = {},
["Pathum Thani Province, Thailand"] = {},
["Pattani Province, Thailand"] = {},
["Phang Nga Province, Thailand"] = {},
["Phatthalung Province, Thailand"] = {},
["Phayao Province, Thailand"] = {},
["Phetchabun Province, Thailand"] = {},
["Phetchaburi Province, Thailand"] = {},
["Phichit Province, Thailand"] = {},
["Phitsanulok Province, Thailand"] = {},
["Phra Nakhon Si Ayutthaya Province, Thailand"] = {},
["Phrae Province, Thailand"] = {},
["Phuket Province, Thailand"] = {},
["Prachinburi Province, Thailand"] = {},
["Prachuap Khiri Khan Province, Thailand"] = {},
["Ranong Province, Thailand"] = {},
["Ratchaburi Province, Thailand"] = {},
["Rayong Province, Thailand"] = {},
["Roi Et Province, Thailand"] = {},
["Sa Kaeo Province, Thailand"] = {},
["Sakon Nakhon Province, Thailand"] = {},
["Samut Prakan Province, Thailand"] = {},
["Samut Sakhon Province, Thailand"] = {},
["Samut Songkhram Province, Thailand"] = {},
["Saraburi Province, Thailand"] = {},
["Satun Province, Thailand"] = {},
["Sing Buri Province, Thailand"] = {},
["Sisaket Province, Thailand"] = {},
["Songkhla Province, Thailand"] = {},
["Sukhothai Province, Thailand"] = {},
["Suphan Buri Province, Thailand"] = {},
["Surat Thani Province, Thailand"] = {},
["Surin Province, Thailand"] = {},
["Tak Province, Thailand"] = {},
["Trang Province, Thailand"] = {},
["Trat Province, Thailand"] = {},
["Ubon Ratchathani Province, Thailand"] = {},
["Udon Thani Province, Thailand"] = {},
["Uthai Thani Province, Thailand"] = {},
["Uttaradit Province, Thailand"] = {},
["Yala Province, Thailand"] = {},
["Yasothon Province, Thailand"] = {},
}
-- provinces of Thailand
export.thailand_group = {
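-- The suffix patterns must match the keys in `export.thailand_provinces`, which end in " Province, Thailand".
key_to_placename = make_key_to_placename(", Thailand$", " Province$"),
placename_to_key = make_placename_to_key(", Thailand", " Province"),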
default_container = "Thailand",
default_placetype = "wilayah",
default_divs = "daerah",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "Wilayah %e",
data = export.thailand_provinces,
}
export.turkey_provinces = {
["Adana Province, Turkey"] = {}, -- code 01
["Adıyaman Province, Turkey"] = {}, -- code 02
["Afyonkarahisar Province, Turkey"] = {}, -- code 03
["Ağrı Province, Turkey"] = {}, -- code 04
["Amasya Province, Turkey"] = {}, -- code 05
["Ankara Province, Turkey"] = {}, -- code 06
["Antalya Province, Turkey"] = {}, -- code 07
["Artvin Province, Turkey"] = {}, -- code 08
["Aydın Province, Turkey"] = {}, -- code 09
["Balıkesir Province, Turkey"] = {}, -- code 10
["Bilecik Province, Turkey"] = {}, -- code 11
["Bingöl Province, Turkey"] = {}, -- code 12
["Bitlis Province, Turkey"] = {}, -- code 13
["Bolu Province, Turkey"] = {}, -- code 14
["Burdur Province, Turkey"] = {}, -- code 15
["Bursa Province, Turkey"] = {}, -- code 16
["Çanakkale Province, Turkey"] = {}, -- code 17
["Çankırı Province, Turkey"] = {}, -- code 18
["Çorum Province, Turkey"] = {}, -- code 19
["Denizli Province, Turkey"] = {}, -- code 20
["Diyarbakır Province, Turkey"] = {}, -- code 21
["Edirne Province, Turkey"] = {}, -- code 22
["Elazığ Province, Turkey"] = {}, -- code 23
["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true},
["Erzincan Province, Turkey"] = {}, -- code 24
["Erzurum Province, Turkey"] = {}, -- code 25
["Eskişehir Province, Turkey"] = {}, -- code 26
["Gaziantep Province, Turkey"] = {}, -- code 27
["Giresun Province, Turkey"] = {}, -- code 28
["Gümüşhane Province, Turkey"] = {}, -- code 29
["Hakkâri Province, Turkey"] = {}, -- code 30
["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true},
["Hatay Province, Turkey"] = {}, -- code 31
["Isparta Province, Turkey"] = {}, -- code 32
["Mersin Province, Turkey"] = {}, -- code 33
-- ["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself
["İzmir Province, Turkey"] = {}, -- code 35
["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true},
["Kars Province, Turkey"] = {}, -- code 36
["Kastamonu Province, Turkey"] = {}, -- code 37
["Kayseri Province, Turkey"] = {}, -- code 38
["Kırklareli Province, Turkey"] = {}, -- code 39
["Kırşehir Province, Turkey"] = {}, -- code 40
["Kocaeli Province, Turkey"] = {}, -- code 41
["Konya Province, Turkey"] = {}, -- code 42
["Kütahya Province, Turkey"] = {}, -- code 43
["Malatya Province, Turkey"] = {}, -- code 44
["Manisa Province, Turkey"] = {}, -- code 45
["Kahramanmaraş Province, Turkey"] = {}, -- code 46
["Mardin Province, Turkey"] = {}, -- code 47
["Muğla Province, Turkey"] = {}, -- code 48
["Muş Province, Turkey"] = {}, -- code 49
["Nevşehir Province, Turkey"] = {}, -- code 50
["Niğde Province, Turkey"] = {}, -- code 51
["Ordu Province, Turkey"] = {}, -- code 52
["Rize Province, Turkey"] = {}, -- code 53
["Sakarya Province, Turkey"] = {}, -- code 54
["Samsun Province, Turkey"] = {}, -- code 55
["Siirt Province, Turkey"] = {}, -- code 56
["Sinop Province, Turkey"] = {}, -- code 57
["Sivas Province, Turkey"] = {}, -- code 58
["Tekirdağ Province, Turkey"] = {}, -- code 59
["Tokat Province, Turkey"] = {}, -- code 60
["Trabzon Province, Turkey"] = {}, -- code 61
["Tunceli Province, Turkey"] = {}, -- code 62
["Şanlıurfa Province, Turkey"] = {}, -- code 63
["Uşak Province, Turkey"] = {}, -- code 64
["Van Province, Turkey"] = {}, -- code 65
["Yozgat Province, Turkey"] = {}, -- code 66
["Zonguldak Province, Turkey"] = {}, -- code 67
["Aksaray Province, Turkey"] = {}, -- code 68
["Bayburt Province, Turkey"] = {}, -- code 69
["Karaman Province, Turkey"] = {}, -- code 70
["Kırıkkale Province, Turkey"] = {}, -- code 71
["Batman Province, Turkey"] = {}, -- code 72
["Şırnak Province, Turkey"] = {}, -- code 73
["Bartın Province, Turkey"] = {}, -- code 74
["Ardahan Province, Turkey"] = {}, -- code 75
["Iğdır Province, Turkey"] = {}, -- code 76
["Yalova Province, Turkey"] = {}, -- code 77
["Karabük Province, Turkey"] = {}, -- code 78
["Kilis Province, Turkey"] = {}, -- code 79
["Osmaniye Province, Turkey"] = {}, -- code 80
["Düzce Province, Turkey"] = {}, -- code 81
}
-- provinces of Turkey
export.turkey_group = {
key_to_placename = make_key_to_placename(", Turkey$", " Province$"),
placename_to_key = make_placename_to_key(", Turkey", " Province"),
default_container = "Turkey",
default_placetype = "province",
default_divs = "districts",
data = export.turkey_provinces,
}
export.ukraine_oblasts = {
["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA
["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB
["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE
-- apparently will be renamed to 'Dnipro Oblast'
["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE
["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH
["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT
["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX
["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT''
["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX
-- apparently will be renamed to 'Kropyvnytskyi Oblast'
["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA
["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI
["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true},
["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB
["Lviv Oblast, Ukraine"] = {}, -- capital [[Lviv]], license plate prefix BC, HC
["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE
["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH
["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true},
["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI
["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK
["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM
["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO
["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB
["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC
["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO
["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP
["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true},
["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM
}
-- oblasts of Ukraine
export.ukraine_group = {
key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"),
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"),
default_container = "Ukraine",
default_placetype = "oblast",
default_divs = {"raions", "hromadas"},
data = export.ukraine_oblasts,
}
export.united_kingdom_constituent_countries = {
["England"] = {divs = {
"counties",
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
{type = "civil parishes", container_parent_type = false},
}},
["Northern Ireland"] = {
placetype = {"constituent country", "province", "negara"},
divs = {"counties", "districts"},
},
["Scotland"] = {divs = {
{type = "council areas", container_parent_type = false},
"districts",
}},
["Wales"] = {divs = {
"counties",
{type = "county boroughs", container_parent_type = false},
{type = "communities", container_parent_type = false},
{type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}},
}},
}
-- constituent countries and provinces of the United Kingdom
export.united_kingdom_group = {
placename_to_key = false,
default_container = "United Kingdom",
default_placetype = {"constituent country", "negara"},
addl_divs = {
"traditional counties",
{type = "historical counties", cat_as = "traditional counties"},
},
-- Don't create categories like 'Category:en:Towns in the United Kingdom'
-- or 'Category:en:Places in the United Kingdom'.
default_no_container_cat = true,
data = export.united_kingdom_constituent_countries,
}
export.england_counties = {
-- NOTE: We used to have various other "no longer" counties commented out, apparently counties that existed
-- officially at some point between 1889 and 1974; these have been removed. The only entries kept (commented out)
-- are the three ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996 and
-- those still considered "historic counties" per [[w:Historic counties of England]].
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Bedfordshire, England"] = {},
["Berkshire, England"] = {},
-- ["Brighton and Hove, England"] = {}, -- city
-- ["Bristol, England"] = {}, -- city
["Buckinghamshire, England"] = {},
["Cambridgeshire, England"] = {},
["Cheshire, England"] = {},
-- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996)
["Cornwall, England"] = {},
-- ["Cumberland, England"] = {}, -- no longer (historic county)
["Cumbria, England"] = {},
["Derbyshire, England"] = {},
["Devon, England"] = {},
["Dorset, England"] = {},
["County Durham, England"] = {},
["East Sussex, England"] = {},
["Essex, England"] = {},
["Gloucestershire, England"] = {},
["Greater London, England"] = {},
["Greater Manchester, England"] = {},
["Hampshire, England"] = {},
["Herefordshire, England"] = {},
["Hertfordshire, England"] = {},
-- ["Humberside, England"] = {}, -- no longer (1974 to 1996)
-- ["Huntingdonshire, England"] = {}, -- no longer (historic county)
["Isle of Wight, England"] = {the = true},
["Kent, England"] = {},
["Lancashire, England"] = {},
["Leicestershire, England"] = {},
["Lincolnshire, England"] = {},
["Merseyside, England"] = {},
-- ["Middlesex, England"] = {}, -- no longer (historic county)
["Norfolk, England"] = {},
["Northamptonshire, England"] = {},
["Northumberland, England"] = {},
["North Yorkshire, England"] = {},
["Nottinghamshire, England"] = {},
["Oxfordshire, England"] = {},
["Rutland, England"] = {},
["Shropshire, England"] = {},
["Somerset, England"] = {},
["South Humberside, England"] = {},
["South Yorkshire, England"] = {},
["Staffordshire, England"] = {},
["Suffolk, England"] = {},
["Surrey, England"] = {},
-- ["Sussex, England"] = {}, -- no longer (historic county)
["Tyne and Wear, England"] = {},
["Warwickshire, England"] = {},
["West Midlands, England"] = {the = true, wp = "%l (county)"},
-- ["Westmorland, England"] = {}, -- no longer (historic county)
["West Sussex, England"] = {},
["West Yorkshire, England"] = {},
["Wiltshire, England"] = {},
["Worcestershire, England"] = {},
-- ["Yorkshire, England"] = {}, -- no longer (historic county)
["East Riding of Yorkshire, England"] = {the = true},
}
-- counties of England
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "county",
default_divs = {
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
"civil parishes",
},
data = export.england_counties,
}
export.northern_ireland_counties = {
["County Antrim, Northern Ireland"] = {},
["County Armagh, Northern Ireland"] = {},
["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"},
["County Down, Northern Ireland"] = {},
["County Fermanagh, Northern Ireland"] = {},
["County Londonderry, Northern Ireland"] = {},
["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"},
["County Tyrone, Northern Ireland"] = {},
}
-- counties of Northern Ireland
export.northern_ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"),
default_container = {key = "Northern Ireland", placetype = "constituent country"},
default_placetype = "county",
data = export.northern_ireland_counties,
}
export.scotland_council_areas = {
["Aberdeenshire, Scotland"] = {},
["Angus, Scotland"] = {wp = "%l, %c"},
["Argyll and Bute, Scotland"] = {},
["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"},
["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"},
["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"},
["City of Dundee, Scotland"] = {the = true, wp = "Dundee"},
["Dundee"] = {alias_of = "City of Dundee, Scotland"},
["Dundee City"] = {alias_of = "City of Dundee, Scotland"},
["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"},
["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"},
["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"},
["Glasgow"] = {alias_of = "City of Glasgow, Scotland"},
["Clackmannanshire, Scotland"] = {},
["Dumfries and Galloway, Scotland"] = {},
["East Ayrshire, Scotland"] = {},
["East Dunbartonshire, Scotland"] = {},
["East Lothian, Scotland"] = {},
["East Renfrewshire, Scotland"] = {},
["Falkirk, Scotland"] = {wp = "%l council area"},
["Fife, Scotland"] = {},
["Highland, Scotland"] = {wp = "%l council area"},
["Inverclyde, Scotland"] = {},
["Midlothian, Scotland"] = {},
["Moray, Scotland"] = {},
["North Ayrshire, Scotland"] = {},
["North Lanarkshire, Scotland"] = {},
["Orkney Islands, Scotland"] = {the = true},
["Perth and Kinross, Scotland"] = {},
["Renfrewshire, Scotland"] = {},
["Scottish Borders, Scotland"] = {the = true},
["Shetland Islands, Scotland"] = {the = true},
["South Ayrshire, Scotland"] = {},
["South Lanarkshire, Scotland"] = {},
["Stirling, Scotland"] = {wp = "%l council area"},
["West Dunbartonshire, Scotland"] = {},
["West Lothian, Scotland"] = {},
["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"},
["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"},
}
-- council areas of Scotland
export.scotland_group = {
default_container = {key = "Scotland", placetype = "constituent country"},
default_placetype = "council area",
data = export.scotland_council_areas,
}
export.wales_principal_areas = {
["Blaenau Gwent, Wales"] = {},
["Bridgend, Wales"] = {wp = "%l County Borough"},
["Caerphilly, Wales"] = {wp = "%l County Borough"},
-- ["Cardiff, Wales"] = {placetype = "city"},
["Carmarthenshire, Wales"] = {placetype = "county"},
["Ceredigion, Wales"] = {placetype = "county"},
["Conwy, Wales"] = {wp = "%l County Borough"},
["Denbighshire, Wales"] = {placetype = "county"},
["Flintshire, Wales"] = {placetype = "county"},
["Gwynedd, Wales"] = {placetype = "county"},
["Isle of Anglesey, Wales"] = {the = true, placetype = "county"},
["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the"
["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"},
["Monmouthshire, Wales"] = {placetype = "county"},
["Neath Port Talbot, Wales"] = {},
-- ["Newport, Wales"] = {placetype = "city", wp = "%l, %c"},
["Pembrokeshire, Wales"] = {placetype = "county"},
["Powys, Wales"] = {placetype = "county"},
["Rhondda Cynon Taf, Wales"] = {},
-- ["Swansea, Wales"] = {placetype = "city"},
["Torfaen, Wales"] = {},
["Vale of Glamorgan, Wales"] = {the = true},
["Wrexham, Wales"] = {wp = "%l County Borough"},
}
-- principal areas (cities, counties and county boroughs) of Wales
export.wales_group = {
default_container = {key = "Wales", placetype = "constituent country"},
default_placetype = "county borough",
data = export.wales_principal_areas,
}
export.united_states_states = {
["Alabama, USA"] = {},
["Alaska, USA"] = {divs = {
{type = "boroughs", container_parent_type = "counties"},
{type = "borough seats", container_parent_type = "county seats"},
}},
["Arizona, USA"] = {},
["Arkansas, USA"] = {},
["California, USA"] = {},
["Colorado, USA"] = {divs = {"counties", "county seats", "municipalities"}},
["Connecticut, USA"] = {divs = {"counties", "county seats", "municipalities"}},
["Delaware, USA"] = {},
["Florida, USA"] = {},
["Georgia, USA"] = {wp = "%l (U.S. state)"},
["Hawaii, USA"] = {addl_parents = {"Polynesia"}},
["Idaho, USA"] = {},
["Illinois, USA"] = {},
["Indiana, USA"] = {},
["Iowa, USA"] = {},
["Kansas, USA"] = {},
["Kentucky, USA"] = {},
["Louisiana, USA"] = {divs = {
{type = "parishes", container_parent_type = "counties"},
{type = "parish seats", container_parent_type = "county seats"},
}},
["Maine, USA"] = {},
["Maryland, USA"] = {},
["Massachusetts, USA"] = {},
["Michigan, USA"] = {},
["Minnesota, USA"] = {},
["Mississippi, USA"] = {},
["Missouri, USA"] = {},
["Montana, USA"] = {},
["Nebraska, USA"] = {},
["Nevada, USA"] = {},
["New Hampshire, USA"] = {},
["New Jersey, USA"] = {divs = {
"counties", "county seats",
{type = "boroughs", prep = "di"},
}},
["New Mexico, USA"] = {},
["New York, USA"] = {wp = "%l (state)"},
["North Carolina, USA"] = {},
["North Dakota, USA"] = {},
["Ohio, USA"] = {},
["Oklahoma, USA"] = {},
["Oregon, USA"] = {},
["Pennsylvania, USA"] = {divs = {
"counties", "county seats",
{type = "boroughs", prep = "di"},
}},
["Rhode Island, USA"] = {},
["South Carolina, USA"] = {},
["South Dakota, USA"] = {},
["Tennessee, USA"] = {},
["Texas, USA"] = {},
["Utah, USA"] = {},
["Vermont, USA"] = {},
["Virginia, USA"] = {},
["Washington, USA"] = {wp = "%l (state)"},
["West Virginia, USA"] = {},
["Wisconsin, USA"] = {},
["Wyoming, USA"] = {},
}
-- states of the United States
export.united_states_group = {
placename_to_key = make_placename_to_key(", USA"),
default_container = "Amerika Syarikat",
default_placetype = "negeri",
default_divs = {"counties", "county seats"},
addl_divs = {
{type = "census-designated places", prep = "di"},
{type = "unincorporated communities", prep = "di"},
},
data = export.united_states_states,
}
export.vietnam_provinces = {
-- [[Northeast (Vietnam)|Northeast]] region
["Bắc Giang Province, Vietnam"] = {}, -- capital [[Bắc Giang]]
["Bắc Kạn Province, Vietnam"] = {}, -- capital [[Bắc Kạn]]
["Cao Bằng Province, Vietnam"] = {}, -- capital [[Cao Bằng]]
["Hà Giang Province, Vietnam"] = {}, -- capital [[Hà Giang]]
["Lạng Sơn Province, Vietnam"] = {}, -- capital [[Lạng Sơn]]
["Phú Thọ Province, Vietnam"] = {}, -- capital [[Việt Trì]]
["Quảng Ninh Province, Vietnam"] = {}, -- capital [[Hạ Long]]
["Thái Nguyên Province, Vietnam"] = {}, -- capital [[Thái Nguyên]]
["Tuyên Quang Province, Vietnam"] = {}, -- capital [[Tuyên Quang]]
-- [[Northwest (Vietnam)|Northwest]] region
["Lào Cai Province, Vietnam"] = {}, -- capital [[Lào Cai]]
["Yên Bái Province, Vietnam"] = {}, -- capital [[Yên Bái]]
["Điện Biên Province, Vietnam"] = {}, -- capital [[Điện Biên Phủ]]
["Hoà Bình Province, Vietnam"] = {}, -- capital [[Hoà Bình City|Hoà Bình]]
["Hòa Bình Province, Vietnam"] = {alias_of = "Hoà Bình Province, Vietnam", display = true},
["Lai Châu Province, Vietnam"] = {}, -- capital [[Lai Châu]]
["Sơn La Province, Vietnam"] = {}, -- capital [[Sơn La]]
-- [[Red River Delta]] region
["Bắc Ninh Province, Vietnam"] = {}, -- capital [[Bắc Ninh]]
["Hà Nam Province, Vietnam"] = {}, -- capital [[Phủ Lý]]
["Hải Dương Province, Vietnam"] = {}, -- capital [[Hải Dương]]
["Hưng Yên Province, Vietnam"] = {}, -- capital [[Hưng Yên]]
["Nam Định Province, Vietnam"] = {}, -- capital [[Nam Định]]
["Ninh Bình Province, Vietnam"] = {}, -- capital [[Ninh Bình|Hoa Lư]]
["Thái Bình Province, Vietnam"] = {}, -- capital [[Thái Bình]]
["Vĩnh Phúc Province, Vietnam"] = {}, -- capital [[Vĩnh Yên]]
-- ["Hanoi"] = {placetype = {"municipality", "city"}}, -- capital [[Hoàn Kiếm district]]
-- ["Haiphong"] = {placetype = {"municipality", "city"}}, -- capital [[Hồng Bàng district]]
-- [[North Central Coast]] region
["Hà Tĩnh Province, Vietnam"] = {}, -- capital [[Hà Tĩnh]]
["Nghệ An Province, Vietnam"] = {}, -- capital [[Vinh]]
["Quảng Bình Province, Vietnam"] = {}, -- capital [[Đồng Hới]]
["Quảng Trị Province, Vietnam"] = {}, -- capital [[Đông Hà]]
["Thanh Hoá Province, Vietnam"] = {}, -- capital [[Thanh Hoá]]
["Thanh Hóa Province, Vietnam"] = {alias_of = "Thanh Hoá Province, Vietnam", display = true},
-- ["Hue"] = {placetype = {"municipality", "city"}, wp = "Huế"}, -- capital [[Thuận Hoá district]]
-- [[Central Highlands (Vietnam)|Central Highlands]] region
["Đắk Lắk Province, Vietnam"] = {}, -- capital [[Buôn Ma Thuột]]
["Đăk Nông Province, Vietnam"] = {}, -- capital [[Gia Nghĩa]]
["Gia Lai Province, Vietnam"] = {}, -- capital [[Pleiku]]
["Kon Tum Province, Vietnam"] = {}, -- capital [[Kon Tum]]
["Lâm Đồng Province, Vietnam"] = {}, -- capital [[Đà Lạt]]
-- [[South Central Coast]] region
["Bình Định Province, Vietnam"] = {}, -- capital [[Quy Nhơn]]
["Bình Thuận Province, Vietnam"] = {}, -- capital [[Phan Thiết]]
["Khánh Hoà Province, Vietnam"] = {}, -- capital [[Nha Trang]]
["Khánh Hòa Province, Vietnam"] = {alias_of = "Khánh Hoà Province, Vietnam", display = true},
["Ninh Thuận Province, Vietnam"] = {}, -- capital [[Phan Rang–Tháp Chàm]]
["Phú Yên Province, Vietnam"] = {}, -- capital [[Tuy Hoà]]
["Quảng Nam Province, Vietnam"] = {}, -- capital [[Tam Kỳ]]
["Quảng Ngãi Province, Vietnam"] = {}, -- capital [[Quảng Ngãi]]
-- ["Da Nang"] = {placetype = {"municipality", "city"}}, -- capital [[Hải Châu district]]
-- [[Southeast (Vietnam)|Southeast]] region
["Bà Rịa–Vũng Tàu Province, Vietnam"] = {}, -- capital [[Bà Rịa]]
["Bình Dương Province, Vietnam"] = {}, -- capital [[Thủ Dầu Một]]
["Bình Phước Province, Vietnam"] = {}, -- capital [[Đồng Xoài]]
["Đồng Nai Province, Vietnam"] = {}, -- capital [[Biên Hoà]]
["Tây Ninh Province, Vietnam"] = {}, -- capital [[Tây Ninh]]
-- ["Ho Chi Minh City"] = {placetype = {"municipality", "city"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']]
-- [[Mekong Delta]] region
["An Giang Province, Vietnam"] = {}, -- capital [[Long Xuyên]]
["Bạc Liêu Province, Vietnam"] = {}, -- capital [[Bạc Liêu]]
["Bến Tre Province, Vietnam"] = {}, -- capital [[Bến Tre]]
["Cà Mau Province, Vietnam"] = {}, -- capital [[Cà Mau]]
["Đồng Tháp Province, Vietnam"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]]
["Hậu Giang Province, Vietnam"] = {}, -- capital [[Vị Thanh]]
["Kiên Giang Province, Vietnam"] = {}, -- capital [[Rạch Giá]]
["Long An Province, Vietnam"] = {}, -- capital [[Tân An]]
["Sóc Trăng Province, Vietnam"] = {}, -- capital [[Sóc Trăng]]
["Tiền Giang Province, Vietnam"] = {}, -- capital [[Mỹ Tho]]
["Trà Vinh Province, Vietnam"] = {}, -- capital [[Trà Vinh]]
["Vĩnh Long Province, Vietnam"] = {}, -- capital [[Vĩnh Long]]
-- ["Can Tho"] = {placetype = {"municipality", "city"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]]
}
-- provinces of Vietnam
export.vietnam_group = {
key_to_placename = make_key_to_placename(", Vietnam$", " Province$"),
placename_to_key = make_placename_to_key(", Vietnam", " Province"),
default_container = "Vietnam",
default_placetype = "province",
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.vietnam_provinces,
}
-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------
export.australia_cities = {
["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration)
["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997, 2024 estimate])
["Canberra"] = {container = {key = "Australian Capital Territory, Australia", placetype = "territory"}}, -- 510,641 (2024 estimate)
["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration)
["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
["Newcastle"] = {alias_of = "Newcastle, New South Wales"},
["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}
export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Australia", "negeri"),
default_placetype = "city",
data = export.australia_cities,
}
export.brazil_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos)
["Sao Paulo"] = {alias_of = "São Paulo", display = true},
["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area)
["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000
["Recife"] = {container = "Pernambuco"}, -- 4,100,000
["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area)
["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000
["Brasilia"] = {alias_of = "Brasília", display = true},
["Fortaleza"] = {container = "Ceará"}, -- 3,825,000
["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000
["Curitiba"] = {container = "Paraná"}, -- 3,375,000
["Campinas"] = {container = "São Paulo"}, -- 3,250,000
["Goiânia"] = {container = "Goiás"}, -- 2,525,000
["Goiania"] = {alias_of = "Goiânia", display = true},
["Manaus"] = {container = "Amazonas"}, -- 2,275,000
["Belém"] = {container = "Pará"}, -- 2,200,000
["Belem"] = {alias_of = "Belém", display = true},
["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000
["Vitoria"] = {alias_of = "Vitória", display = true},
["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000
["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000
["Sao Luis"] = {alias_of = "São Luís", display = true},
["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000
["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000
["Florianopolis"] = {alias_of = "Florianópolis", display = true},
["Maceió"] = {container = "Alagoas"}, -- 1,220,000
["Maceio"] = {alias_of = "Maceió", display = true},
["João Pessoa"] = {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000
["Joao Pessoa"] = {alias_of = "João Pessoa", display = true},
["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000
["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true},
["Londrina"] = {container = "Paraná"}, -- 1,050,000
["Teresina"] = {container = "Piauí"}, -- 1,040,000
}
export.brazil_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Brazil", "negeri"),
default_placetype = "city",
data = export.brazil_cities,
}
export.canada_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton)
["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area)
["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area)
["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area)
["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area)
["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area)
["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census)
["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census)
["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census)
["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census)
}
export.canada_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province"),
default_placetype = "city",
data = export.canada_cities,
}
export.france_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration)
["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration)
["Lyons"] = {alias_of = "Lyon", display = true},
["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration)
["Marseilles"] = {alias_of = "Marseille", display = true},
["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration)
["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration)
["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration)
["Nice"] = {container = "Provence-Alpes-Côte d'Azur"},
["Nantes"] = {container = "Pays de la Loire"},
["Strasbourg"] = {container = "Grand Est"},
["Rennes"] = {container = "Brittany"},
}
export.france_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "city",
data = export.france_cities,
}
export.germany_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
["Cologne"] = {container = "North Rhine-Westphalia"},
["Köln"] = {alias_of = "Cologne", display = true},
["Düsseldorf"] = {container = "North Rhine-Westphalia"},
["Dusseldorf"] = {alias_of = "Düsseldorf", display = true},
["Dortmund"] = {container = "North Rhine-Westphalia"},
["Essen"] = {container = "North Rhine-Westphalia"},
["Duisburg"] = {container = "North Rhine-Westphalia"},
["Berlin"] = {}, -- 4,700,000
["Frankfurt"] = {container = "Hesse"}, -- 3,225,000
["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer
["Hamburg"] = {}, -- 2,900,000
["Munich"] = {container = "Bavaria"}, -- 2,300,000
["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000
["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000
["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000
["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000
["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000
["Leipzig"] = {container = "Saxony"}, -- 1,080,000
["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000
["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias
["Bremen"] = {},
}
export.germany_cities_group = {
default_container = "Germany",
canonicalize_key_container = make_canonicalize_key_container(", Germany", "negeri"),
default_placetype = "city",
data = export.germany_cities,
}
export.india_cities = {
-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
-- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was
-- conducted in 2011, and the results are not accurate any more.
["Delhi"] = {container = {key = "Delhi, India", placetype = "union territory"}}, -- 31,190,000
["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000
["Kolkata"] = {container = "West Bengal"}, -- 21,747,000
["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000
["Bengaluru"] = {alias_of = "Bangalore"},
["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000
["Hyderabad"] = {container = "Telangana"}, -- 9,797,000
["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000
["Pune"] = {container = "Maharashtra"}, -- 6,819,000
["Surat"] = {container = "Gujarat"}, -- 6,601,000
["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000
["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000
["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000
["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000
["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000
["Patna"] = {container = "Bihar"}, -- 3,331,000
["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000
["Kozhikode"] = {container = "Kerala"}, -- 3,049,000
["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000
["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000
["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000
["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000
["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000
["Prayagraj"] = {alias_of = "Allahabad"},
["Kochi"] = {container = "Kerala"}, -- 2,381,000
["Ludhiana"] = {container = "Punjab"}, -- 2,205,000
["Vadodara"] = {container = "Gujarat"}, -- 2,182,000
["Chandigarh"] = {container = {key = "Chandigarh, India", placetype = "union territory"}}, -- 2,168,000
["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000
["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000
["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000
["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000
["Malappuram"] = {container = "Kerala"}, -- 1,868,000
["Nashik"] = {container = "Maharashtra"}, -- 1,810,000
["Asansol"] = {container = "West Bengal"}, -- 1,720,000
["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000
["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000
["Thrissur"] = {container = "Kerala"}, -- 1,578,000
["Kollam"] = {container = "Kerala"}, -- 1,576,000
["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000
["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000
["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000
["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000
["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"},
["Rajkot"] = {container = "Gujarat"}, -- 1,487,000
["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000
["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000
["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000
["Kannur"] = {container = "Kerala"}, -- 1,360,000
["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000
["Guwahati"] = {container = "Assam"}, -- 1,355,000
["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000
["Amritsar"] = {container = "Punjab"}, -- 1,313,000
["Mysore"] = {container = "Karnataka"}, -- 1,296,000
["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000
["Durg-Bhilainagar"] = {alias_of = "Bhilai"},
["Durg-Bhilai"] = {alias_of = "Bhilai"},
["Durg"] = {alias_of = "Bhilai"},
["Bhilainagar"] = {alias_of = "Bhilai"},
["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000
["Srinagar"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,212,000
["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000
["Kota"] = {container = "Rajasthan"}, -- 1,172,000
["Jalandhar"] = {container = "Punjab"}, -- 1,165,000
["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000
["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000
["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000
["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000
["Jammu"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,103,000
["Solapur"] = {container = "Maharashtra"}, -- 1,082,000
["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash
["Hubli"] = {alias_of = "Hubli-Dharwad"},
["Dharwad"] = {alias_of = "Hubli-Dharwad"},
["Puducherry"] = {container = {key = "Puducherry, India", placetype = "union territory"}}, -- 1,024,000
["Pondicherry"] = {alias_of = "Puducherry", display = true},
-- satellite/secondary cities of metro area (none in citypopulation.de)
["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area
["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area
["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area
["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true},
["Kalyan"] = {alias_of = "Kalyan-Dombivli"},
["Dombivli"] = {alias_of = "Kalyan-Dombivli"},
["Dombivali"] = {alias_of = "Kalyan-Dombivli"},
["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area
["Vasai"] = {alias_of = "Vasai-Virar"},
["Virar"] = {alias_of = "Vasai-Virar"},
["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area
["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area
["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area
["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true},
}
export.india_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", India", "negeri"),
default_placetype = "city",
data = export.india_cities,
}
export.indonesia_cities = {
-- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate
["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = {
{type = "subdistricts", container_parent_type = false},
}},
["Surabaya"] = {container = "East Java"},
["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area
["Bandung"] = {container = "West Java"},
["Medan"] = {container = "North Sumatra"},
["Depok"] = {container = "West Java"}, -- part of Jakarta metro area
["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Palembang"] = {container = "South Sumatra"},
["Semarang"] = {container = "Central Java"},
["Makassar"] = {container = "South Sulawesi"},
["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area
["Batam"] = {container = "Riau Islands"},
["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area
["Pekanbaru"] = {container = "Riau"},
["Bandar Lampung"] = {container = "Lampung"},
-- other metro areas over 1,000,000 people
["Padang"] = {container = "West Sumatra"},
["Samarinda"] = {container = "East Kalimantan"},
["Malang"] = {container = "East Java"},
["Yogyakarta"] = {container = "Special Region of Yogyakarta"},
["Denpasar"] = {container = "Bali"},
["Cirebon"] = {container = "West Java"},
["Surakarta"] = {container = "Central Java"},
["Banjarmasin"] = {container = "South Kalimantan"},
["Tasikmalaya"] = {container = "West Java"},
}
export.indonesia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Indonesia", "province"),
default_placetype = "city",
data = export.indonesia_cities,
}
export.italy_cities = {
-- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used
-- here, only gives estimates as of Jan 1, 2014.
["Milan"] = {container = "Lombardy"}, -- 6,623,798
["Naples"] = {container = "Campania"}, -- 5,294,546
["Rome"] = {container = "Lazio"}, -- 4,447,881
["Turin"] = {container = "Piedmont"}, -- 1,865,284
["Venice"] = {container = "Veneto"}, -- 1,645,900
["Florence"] = {container = "Tuscany"}, -- 1,485,030
["Bari"] = {container = "Apulia"}, -- 1,257,459
["Palermo"] = {container = "Sicily"}, -- 1,183,084
-- Include a few metro areas just below 1,000,000 that may be above it by now (depending on the definition).
["Catania"] = {container = "Sicily"}, -- 988,240
["Brescia"] = {container = "Lombardy"}, -- 924,090
["Genoa"] = {container = "Liguria"}, -- 861,318
}
export.italy_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Italy", "region"),
default_placetype = "city",
data = export.italy_cities,
}
export.japan_cities = {
-- Population figures from [[w:List of cities in Japan]]. Metro areas from
-- [[w:List of metropolitan areas in Japan]].
["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])",
placetype = {"city", "prefecture"},
divs = {
{type = "special wards", container_parent_type = false},
{type = "cities", prep = "di"},
},
},
["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894
["Osaka"] = {container = "Osaka"}, -- 2,668,586
["Nagoya"] = {container = "Aichi"}, -- 2,283,289
-- FIXME, Hokkaido is handled specially.
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096
["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527
["Kobe"] = {container = "Hyōgo"}, -- 1,530,847
["Kyoto"] = {container = "Kyoto"}, -- 1,474,570
["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630
["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418
["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806
["Sendai"] = {container = "Miyagi"}, -- 1,029,552
-- the remaining cities are considered "central cities" in a 1,000,000+ metro area
-- (sometimes there is more than one central city in the area).
["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998
["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695
["Sakai"] = {container = "Osaka"}, -- 835,333
["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053
["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431
["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944
["Sagamihara"] = {container = "Kanagawa"}, -- 706,342
["Okayama"] = {container = "Okayama"}, -- 701,293
["Kumamoto"] = {container = "Kumamoto"}, -- 670,348
["Kagoshima"] = {container = "Kagoshima"}, -- 605,196
-- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka)
-- with population in the range 509k - 587k because not central cities in any
-- 1,000,000+ metro area.
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833
}
export.japan_cities_group = {
default_container = "Japan",
canonicalize_key_container = make_canonicalize_key_container(" Prefecture, Japan", "prefecture"),
default_placetype = "city",
data = export.japan_cities,
}
export.mexico_cities = {
["Mexico City"] = {}, -- its own state
["Monterrey"] = {container = "Nuevo León"},
["Guadalajara"] = {container = "Jalisco"},
["Puebla"] = {container = "Puebla", wp = "%l (city)"},
["Toluca"] = {container = "State of Mexico"},
["Tijuana"] = {container = "Baja California"},
-- Include the state in the category for León due to possible confusion with León, Spain.
["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"},
["León"] = {alias_of = "León, Guanajuato"},
["Leon"] = {alias_of = "León, Guanajuato", display = true},
["Querétaro"] = {container = "Querétaro", wp = "%l (city)"},
["Queretaro"] = {alias_of = "Querétaro", display = true},
["Ciudad Juárez"] = {container = "Chihuahua"},
["Juárez"] = {alias_of = "Ciudad Juárez"},
["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"},
["Torreón"] = {container = "Coahuila"},
["Torreon"] = {alias_of = "Torreón", display = true},
-- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or
-- Mérida, Venezuela.
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"},
["Mérida"] = {alias_of = "Mérida, Yucatán"},
["Merida"] = {alias_of = "Mérida, Yucatán", display = true},
["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"},
["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true},
["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"},
["Mexicali"] = {container = "Baja California"},
}
export.mexico_cities_group = {
default_container = "Mexico",
canonicalize_key_container = make_canonicalize_key_container(", Mexico", "negeri"),
default_placetype = "city",
data = export.mexico_cities,
}
export.nigeria_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability)
["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability)
["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability)
["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "wilayah persekutuan"}}, -- 3,050,000 (unindicated; population of low reliability)
["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability)
["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability)
["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability)
["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability)
["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability)
["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability)
["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability)
["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; population of low reliability)
["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability)
["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability)
["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability)
}
export.nigeria_cities_group = {
default_container = "Nigeria",
canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "negeri"),
default_placetype = "city",
data = export.nigeria_cities,
}
export.pakistan_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area)
["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area)
["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad)
["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "wilayah persekutuan"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi)
["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area)
["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area)
-- there is also Hyderabad in India (very confusing)
["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area)
["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"},
["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area)
["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area)
["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area)
["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area)
["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area)
}
export.pakistan_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "province"),
default_placetype = "city",
data = export.pakistan_cities,
}
export.philippines_cities = {
-- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts.
-- Other cities outside Metro Manila skipped as not central city in their urban area.
["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
-- Don't display-canonicalize Foo to Foo City as it may make the display weird.
["Quezon"] = {alias_of = "Quezon City"},
["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
["Davao City"] = {container = "Davao del Sur"},
["Davao"] = {alias_of = "Davao City"},
["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}},
["Zamboanga City"] = {container = "Zamboanga del Sur"},
["Zamboanga"] = {alias_of = "Zamboanga City"},
["Cebu City"] = {container = "Cebu"},
["Cebu"] = {alias_of = "Cebu City"},
["Antipolo"] = {container = "Rizal"},
["Cagayan de Oro"] = {container = "Misamis Oriental"},
["Dasmariñas"] = {container = "Cavite"},
["Dasmarinas"] = {alias_of = "Dasmariñas", display = true},
["General Santos"] = {container = "South Cotabato"},
["San Jose del Monte"] = {container = "Bulacan"},
["Bacolod"] = {container = "Negros Occidental"},
["Calamba"] = {container = "Laguna", wp = "%l, %c"},
["Angeles"] = {container = "Pampanga", wp = "Angeles City"},
["Angeles City"] = {alias_of = "Angeles"},
["Iloilo City"] = {container = "Iloilo"},
["Iloilo"] = {alias_of = "Iloilo City"},
}
export.philippines_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Philippines", "province"),
default_placetype = "city",
data = export.philippines_cities,
}
export.russia_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
["Moscow"] = {}, -- 18,800,000 (Agglomeration)
["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration)
["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration)
["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration)
["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration)
["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration)
["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration)
["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration)
["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true},
["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration)
["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration)
["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration)
["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration)
["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration)
["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration)
["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration)
["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration)
["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration)
}
export.russia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"),
default_container = "Russia",
default_placetype = "city",
data = export.russia_cities,
}
export.saudi_arabia_cities = {
-- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are
-- metro, urban or city proper figures.
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Jedda"] = {alias_of = "Jeddah", display = true},
["Jiddah"] = {alias_of = "Jeddah", display = true},
["Jidda"] = {alias_of = "Jeddah", display = true},
["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Makkah"] = {alias_of = "Mecca", display = true},
["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City)
["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration)
["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true},
}
export.saudi_arabia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "province"),
default_placetype = "city",
data = export.saudi_arabia_cities,
}
export.south_korea_cities = {
-- None of the cities listed is associated with any county.
["Seoul"] = {},
["Busan"] = {},
["Incheon"] = {},
["Daegu"] = {},
["Daejeon"] = {},
["Gwangju"] = {},
["Ulsan"] = {},
}
export.south_korea_cities_group = {
default_container = "South Korea",
canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "province"),
default_placetype = "city",
data = export.south_korea_cities,
}
export.spain_cities = {
["Madrid"] = {container = "Community of Madrid"},
["Barcelona"] = {container = "Catalonia"},
["Valencia"] = {container = "Valencia"},
["Seville"] = {container = "Andalusia"},
["Bilbao"] = {container = "Basque Country"},
}
export.spain_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"),
default_placetype = "city",
data = export.spain_cities,
}
export.taiwan_cities = {
["New Taipei City"] = {},
["New Taipei"] = {alias_of = "New Taipei City", display = true},
["Taichung"] = {},
["Kaohsiung"] = {wp = "%l, Taiwan"},
["Taipei"] = {},
["Taoyuan"] = {},
["Tainan"] = {},
-- these last three are not special municipalities
["Chiayi"] = {placetype = "city"},
["Hsinchu"] = {placetype = "city"},
["Keelung"] = {placetype = "city"},
}
export.taiwan_cities_group = {
placename_to_key = false, -- don't add ", Taiwan" to make the key
canonicalize_key_container = make_canonicalize_key_container(", Taiwan", "county"),
default_container = "Taiwan",
default_placetype = {"special municipality", "municipality", "city"},
default_is_city = true,
default_divs = {"districts"},
data = export.taiwan_cities,
}
-- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct,
-- everything else will be figured out.
export.united_kingdom_cities = {
["London"] = {container = "Greater London"},
["Manchester"] = {container = "Greater Manchester"},
["Birmingham"] = {container = "West Midlands"},
["Liverpool"] = {container = "Merseyside"},
["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}},
["Leeds"] = {container = "West Yorkshire"},
["Newcastle upon Tyne"] = {container = "Tyne and Wear"},
["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
["Bristol"] = {container = {key = "England", placetype = "constituent country"}},
["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}},
["Portsmouth"] = {container = "Hampshire"},
["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}},
-- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]]
["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}},
["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"},
}
export.united_kingdom_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", England", "county"),
default_placetype = "city",
data = export.united_kingdom_cities,
}
export.united_states_cities = {
-- top 50 CSAs by population, with the top and sometimes 2nd or 3rd city listed
["New York City"] = {container = "New York", wp = "%l", divs = {
{type = "boroughs", container_parent_type = false},
}},
-- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York).
["New York"] = {alias_of = "New York City"},
["Newark"] = {container = "New Jersey"},
["Los Angeles"] = {container = "California", wp = "%l"},
["Long Beach"] = {container = "California"},
["Riverside"] = {container = "California"},
["Chicago"] = {container = "Illinois", wp = "%l"},
["Washington, D.C."] = {wp = "%l"},
["Washington, DC"] = {alias_of = "Washington, D.C.", display = true},
["Washington D.C."] = {alias_of = "Washington, D.C.", display = true},
["Washington DC"] = {alias_of = "Washington, D.C.", display = true},
-- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of
-- Columbia holonym).
["Washington"] = {alias_of = "Washington, D.C."},
["Baltimore"] = {container = "Maryland", wp = "%l"},
-- to avoid conflict with San Jose in Costa Rica
["San Jose, California"] = {container = "California"},
["San Jose"] = {alias_of = "San Jose, California"},
["San Francisco"] = {container = "California", wp = "%l"},
["Oakland"] = {container = "California"},
["Boston"] = {container = "Massachusetts", wp = "%l"},
["Providence"] = {container = "Rhode Island"},
["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Fort Worth"] = {container = "Texas"},
["Philadelphia"] = {container = "Pennsylvania", wp = "%l"},
["Houston"] = {container = "Texas", wp = "%l"},
["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"},
["Atlanta"] = {container = "Georgia", wp = "%l"},
["Detroit"] = {container = "Michigan", wp = "%l"},
["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"},
["Mesa"] = {container = "Arizona"},
["Seattle"] = {container = "Washington", wp = "%l"},
["Orlando"] = {container = "Florida"},
["Minneapolis"] = {container = "Minnesota", wp = "%l"},
["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"},
["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"},
["Portland"] = {container = "Oregon"},
["Tampa"] = {container = "Florida"},
["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"},
["Saint Louis"] = {alias_of = "St. Louis", display = true},
["Charlotte"] = {container = "North Carolina"},
["Sacramento"] = {container = "California"},
["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"},
["Salt Lake City"] = {container = "Utah", wp = "%l"},
["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"},
["Columbus"] = {container = "Ohio"},
["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"},
["Indianapolis"] = {container = "Indiana", wp = "%l"},
["Las Vegas"] = {container = "Nevada", wp = "%l"},
["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
["Austin"] = {container = "Texas"},
["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"},
["Raleigh"] = {container = "North Carolina"},
["Nashville"] = {container = "Tennessee"},
["Virginia Beach"] = {container = "Virginia"},
["Norfolk"] = {container = "Virginia"},
["Greensboro"] = {container = "North Carolina"},
["Winston-Salem"] = {container = "North Carolina"},
["Jacksonville"] = {container = "Florida"},
["New Orleans"] = {container = "Louisiana", wp = "%l"},
["Louisville"] = {container = "Kentucky"},
["Greenville"] = {container = "South Carolina"},
["Hartford"] = {container = "Connecticut"},
["Oklahoma City"] = {container = "Oklahoma", wp = "%l"},
["Grand Rapids"] = {container = "Michigan"},
["Memphis"] = {container = "Tennessee"},
["Birmingham, Alabama"] = {container = "Alabama"},
["Birmingham"] = {alias_of = "Birmingham, Alabama"},
["Fresno"] = {container = "California"},
["Richmond"] = {container = "Virginia"},
["Harrisburg"] = {container = "Pennsylvania"},
-- any major city in the top 50 MSAs that is missed by the list above
["Buffalo"] = {container = "New York"},
-- any of the top 50 cities by city population that are missed by the lists above
["El Paso"] = {container = "Texas"},
["Albuquerque"] = {container = "New Mexico"},
["Tucson"] = {container = "Arizona"},
["Colorado Springs"] = {container = "Colorado"},
["Omaha"] = {container = "Nebraska"},
["Tulsa"] = {container = "Oklahoma"},
-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}
export.united_states_cities_group = {
default_container = "Amerika Syarikat",
canonicalize_key_container = make_canonicalize_key_container(", USA", "negeri"),
default_placetype = "city",
default_wp = "%l, %c",
data = export.united_states_cities,
}
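--[==[
Illustrative sketch only; it is kept inside a comment so it does not change the module. It shows one plausible way the
`wp` and `default_wp` templates used above could be expanded. The function name `expand_wp_sketch` is hypothetical, and
the reading of the placeholders (`%l` = the location's own name, `%c` = the name of its container) is an assumption
inferred from entries such as `wp = "%l, %c"`, not taken from the actual expansion code.

	local function expand_wp_sketch(template, placename, container)
		-- Function replacements are inserted literally, so a "%" inside a name is never treated as a capture
		-- reference; if `container` is nil, the "%c" placeholder is simply left in place.
		local expanded = template:gsub("%%l", function() return placename end)
		expanded = expanded:gsub("%%c", function() return container end)
		return expanded
	end

	-- With default_wp = "%l, %c":  expand_wp_sketch("%l, %c", "Newark", "New Jersey")  --> "Newark, New Jersey"
	-- With wp = "%l":              expand_wp_sketch("%l", "Chicago", "Illinois")       --> "Chicago"
]==]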
export.new_york_boroughs = {
["Bronx"] = {the = true, wp = "The Bronx"},
["Brooklyn"] = {},
["Manhattan"] = {},
["Queens"] = {},
["Staten Island"] = {},
}
export.new_york_boroughs_group = {
default_container = {key = "New York City", placetype = "city"},
default_placetype = "borough",
default_is_city = true,
data = export.new_york_boroughs,
}
export.vietnam_cities = {
-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
["Saigon"] = {alias_of = "Ho Chi Minh City"},
["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
["Danang"] = {alias_of = "Da Nang", display = true},
["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
["Hai Phong"] = {alias_of = "Haiphong", display = true},
-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "city", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia)
["Biên Hòa"] = {alias_of = "Bien Hoa", display = true},
["Biên Hoà"] = {alias_of = "Bien Hoa", display = true},
-- These two not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are
-- both province-level municipalities and close to the 1,000,000 mark.
["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]]
["Cần Thơ"] = {alias_of = "Can Tho", display = true},
["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); capital [[Thuận Hóa district]]
["Huế"] = {alias_of = "Hue", display = true},
}
export.vietnam_cities_group = {
placename_to_key = false, -- don't add ", Vietnam" to make the key
default_container = "Vietnam",
canonicalize_key_container = make_canonicalize_key_container(" Province, Vietnam", "province"),
-- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of
-- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct
-- known locations.
default_placetype = {"municipality", "city"},
default_is_city = true,
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
data = export.vietnam_cities,
}
export.misc_cities = {
------------------ Africa -------------------
-- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from
-- [[w:List of urban areas in Africa by population]].
["Algiers"] = {container = "Algeria"}, -- 4,325,000 (Consolidated Urban Area)
["Oran"] = {container = "Algeria"}, -- 1,640,000 (Consolidated Urban Area)
["Luanda"] = {container = "Angola"}, -- 9,650,000 (Urban Area)
["Benguela"] = {container = "Angola"}, -- 1,420,000 (Urban Area)
["Cotonou"] = {container = "Benin"}, -- 2,150,000 (Agglomeration)
["Ouagadougou"] = {container = "Burkina Faso"}, -- 3,425,000 (Agglomeration)
["Bobo-Dioulasso"] = {container = "Burkina Faso"}, -- 1,100,000 (Agglomeration)
["Bujumbura"] = {container = "Burundi"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia)
["Yaoundé"] = {container = "Cameroon"}, -- 3,975,000 (City)
["Yaounde"] = {alias_of = "Yaoundé", display = true},
["Douala"] = {container = "Cameroon"}, -- 3,900,000 (City)
["Bangui"] = {container = "Central African Republic"}, -- 1,680,000 (Agglomeration)
["N'Djamena"] = {container = "Chad"}, -- 1,950,000 (City)
["Ndjamena"] = {alias_of = "N'Djamena", display = true},
["Kinshasa"] = {container = "Democratic Republic of the Congo"}, -- 16,300,000 (City; population of low reliability)
["Lubumbashi"] = {container = "Democratic Republic of the Congo"}, -- 2,875,000 (City; population of low reliability)
["Mbuji-Mayi"] = {container = "Democratic Republic of the Congo"}, -- 2,500,000 (City; population of low reliability)
["Kananga"] = {container = "Democratic Republic of the Congo"}, -- 1,370,000 (City; population of low reliability)
["Kisangani"] = {container = "Democratic Republic of the Congo"}, -- 1,300,000 (City; population of low reliability)
["Bukavu"] = {container = "Democratic Republic of the Congo"}, -- 1,100,000 (City; population of low reliability)
["Goma"] = {container = "Democratic Republic of the Congo"}, -- 1,010,000 (City; population of low reliability)
["Tshikapa"] = {container = "Democratic Republic of the Congo"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de)
["Cairo"] = {container = "Egypt"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
["Alexandria"] = {container = "Egypt"}, -- 6,250,000 (Agglomeration)
["Giza"] = {container = "Egypt"}, -- 4,458,135 (2023 from citypopulation.de)
["Shubra El Kheima"] = {container = "Egypt"}, -- 1,240,239 (2021 from citypopulation.de)
["Asmara"] = {container = "Eritrea"}, -- 1,090,000 (City; population of low reliability)
["Asmera"] = {alias_of = "Asmara", display = true},
["Addis Ababa"] = {container = "Ethiopia"}, -- 4,825,000 (Agglomeration)
["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
["Accra"] = {container = "Ghana"}, -- 6,800,000 (Agglomeration)
["Kumasi"] = {container = "Ghana"}, -- 2,900,000 (Agglomeration)
["Conakry"] = {container = "Guinea"}, -- 2,975,000 (Consolidated Urban Area)
["Abidjan"] = {container = "Ivory Coast"}, -- 7,050,000 (Agglomeration)
["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "region"}}, -- 4,450,000 (Municipality (urban population))
["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "region"}}, -- 2,125,000 (Municipality (urban population))
["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "region"}}, -- 1,410,000 (Municipality (urban population))
["Tanger"] = {alias_of = "Tangier", display = true},
["Tangiers"] = {alias_of = "Tangier", display = true},
["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "region"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
["Fes"] = {alias_of = "Fez", display = true},
["Fès"] = {alias_of = "Fez", display = true},
["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "region"}}, -- 1,270,000 (Municipality (urban population))
["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "region"}}, -- 1,140,000 (Municipality (urban population))
["Marrakech"] = {alias_of = "Marrakesh", display = true},
["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
["Niamey"] = {container = "Niger"}, -- 1,530,000 (City)
["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City)
["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration)
["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration)
["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
["Mogadishu"] = {container = "Somalia"}, -- 2,250,000 (unindicated; population of low reliability)
["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "province"}}, -- 5,100,000 (Consolidated Urban Area)
["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "province"}}, -- 3,900,000 (Consolidated Urban Area)
["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 2,921,488 (2011 census)
["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "province"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area)
["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias
["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability)
["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration)
["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration)
["Mwanza City"] = {alias_of = "Mwanza", display = true},
["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration)
["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration)
["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated)
["Lome"] = {alias_of = "Lomé", display = true},
["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population))
["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population))
["Soussa"] = {alias_of = "Sousse", display = true},
["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated)
["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area)
["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration)
------------------ Asia -------------------
-- sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
["Kabul"] = {container = "Afghanistan"}, -- 5,250,000 (Agglomeration)
["Baku"] = {container = "Azerbaijan"}, -- 3,725,000 (Administrative Area (urban population))
["Manama"] = {container = "Bahrain"}, -- 1,560,000 (unindicated)
["Dhaka"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 23,100,000 (Agglomeration)
["Dacca"] = {alias_of = "Dhaka", display = true},
["Chittagong"] = {container = {key = "Chittagong Division, Bangladesh", placetype = "division"}}, -- 5,050,000 (Agglomeration)
["Gazipur"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
["Khulna"] = {container = {key = "Khulna Division, Bangladesh", placetype = "division"}}, -- 1,210,000 (Agglomeration)
["Phnom Penh"] = {container = "Cambodia"}, -- 2,925,000 (Agglomeration)
["Tehran"] = {container = {key = "Tehran Province, Iran", placetype = "province"}}, -- 16,800,000 (Agglomeration)
["Teheran"] = {alias_of = "Tehran", display = true},
["Mashhad"] = {container = {key = "Razavi Khorasan Province, Iran", placetype = "province"}}, -- 3,475,000 (Agglomeration)
["Mashad"] = {alias_of = "Mashhad", display = true},
["Meshhed"] = {alias_of = "Mashhad", display = true},
["Meshed"] = {alias_of = "Mashhad", display = true},
["Isfahan"] = {container = {key = "Isfahan Province, Iran", placetype = "province"}}, -- 3,425,000 (Agglomeration)
["Esfahan"] = {alias_of = "Isfahan", display = true},
["Tabriz"] = {container = {key = "East Azerbaijan Province, Iran", placetype = "province"}}, -- 1,970,000 (Agglomeration)
["Shiraz"] = {container = {key = "Fars Province, Iran", placetype = "province"}}, -- 1,950,000 (Agglomeration)
["Ahvaz"] = {container = {key = "Khuzestan Province, Iran", placetype = "province"}}, -- 1,550,000 (Agglomeration)
["Qom"] = {container = {key = "Qom Province, Iran", placetype = "province"}}, -- 1,450,000 (City)
["Kermanshah"] = {container = {key = "Kermanshah Province, Iran", placetype = "province"}}, -- 1,130,000 (City)
["Baghdad"] = {container = "Iraq"}, -- 7,800,000 (Administrative Area (urban population))
["Basra"] = {container = "Iraq"}, -- 1,710,000 (Administrative Area (urban population))
["Mosul"] = {container = "Iraq"}, -- 1,550,000 (Administrative Area (urban population))
["Erbil"] = {container = "Iraq"}, -- 1,220,000 (Administrative Area (urban population))
["Kirkuk"] = {container = "Iraq"}, -- 1,160,000 (Administrative Area (urban population))
["Najaf"] = {container = "Iraq"}, -- 1,050,000 (Administrative Area (urban population))
["Tel Aviv"] = {container = "Israel"}, -- 3,000,000 (Agglomeration)
-- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a
-- [[w:corpus separatum]], so put the container as "Asia" and list Israel and Palestine as additional parents for
-- categorization purposes.
["Jerusalem"] = {container = {key = "Asia", placetype = "benua"},
addl_parents = {"Israel", "Palestine"}}, -- 1,080,000 (Agglomeration)
["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated)
["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated)
["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration)
["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize
["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration)
["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration)
["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration)
["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration)
["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability)
-- Kuala Lumpur is a federal capital city, not in any state
["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration)
-- there are various George Towns and Georgetowns
["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = "negeri"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration)
["George Town"] = {alias_of = "George Town, Malaysia"},
["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City)
["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true},
["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population))
["Rangoon"] = {alias_of = "Yangon", display = true},
["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population))
["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration)
-- Pyongyang is a directly governed city, not in any province
["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population))
["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration)
["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated)
["Gaza City"] = {alias_of = "Gaza"},
["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration)
["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated)
["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability)
["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability)
["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City)
["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration)
-- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia
-- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]]
["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "province"}},
["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "province"}}, -- 1,570,000 (Agglomeration; including Pattaya)
-- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021;
-- second source is citypopulation.de reference date 2025-01-01.
["Istanbul"] = {placetype = {"city", "province"}, divs = {"districts"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration)
["İstanbul"] = {alias_of = "Istanbul", display = true},
["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "province"}}, -- 5.15 million; 5,200,000 (Agglomeration)
["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "province"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration)
["İzmir"] = {alias_of = "Izmir", display = true},
["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "province"}}, -- 2.02 million; 2,200,000 (Agglomeration)
["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "province"}}, -- 1.77 million; 1,780,000 (Agglomeration)
["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "province"}}, -- 1.71 million; 1,750,000 (Agglomeration)
["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "province"}}, -- 1.3 million; 1,400,000 (Agglomeration)
["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "province"}}, -- 1.35 million; 1,390,000 (Agglomeration)
["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "province"}}, -- 1.07 million; 1,100,000 (Agglomeration)
-- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not
-- display-canonicalize to the Turkish form Diyarbakır.
["Diyarbakir"] = {alias_of = "Diyarbakır"},
["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "province"}}, -- 1.03 million; 1,060,000 (Agglomeration)
["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration)
["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah)
["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City)
["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai)
["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated)
["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability)
["Sana'a"] = {alias_of = "Sanaa", display = true},
["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia)
------------------ Europe or Europe-like (Caucasus etc.) ---------------------
["Yerevan"] = {container = "Armenia"}, -- 1,520,000 (Agglomeration)
["Vienna"] = {container = "Austria"}, -- 2,375,000 (Agglomeration)
["Minsk"] = {container = "Belarus"}, -- 2,100,000 (unindicated)
["Brussels"] = {container = "Belgium"}, -- 2,800,000 (Consolidated Urban Area)
["Antwerp"] = {container = "Belgium"}, -- 1,270,000 (Consolidated Urban Area)
["Sofia"] = {container = "Bulgaria"}, -- 1,260,000 (Agglomeration)
["Zagreb"] = {container = "Croatia"},
["Prague"] = {container = "Czech Republic"}, -- 1,470,000 (Agglomeration)
["Brno"] = {container = "Czech Republic"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office)
["Olomouc"] = {container = "Czech Republic"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms)
["Copenhagen"] = {container = "Denmark"}, -- 1,800,000 (Consolidated Urban Area)
["Helsinki"] = {container = {key = "Uusimaa, Finland", placetype = "region"}}, -- 1,560,000 (Consolidated Urban Area)
["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration)
["Athens"] = {container = "Greece"},
["Thessaloniki"] = {container = "Greece"},
["Budapest"] = {container = "Hungary"},
-- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region"
["Dublin"] = {container = {key = "County Dublin, Ireland", placetype = "county"}},
["Riga"] = {container = "Latvia"},
["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "province"}},
["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "province"}},
["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "province"}},
-- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it.
["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "region"}},
["Oslo"] = {container = {key = "Oslo, Norway", placetype = "county"}},
["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}},
["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Krakow" without accent.
["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"},
["Kraków"] = {alias_of = "Krakow", display = true},
["Cracow"] = {alias_of = "Krakow", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent.
["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}},
["Gdansk"] = {alias_of = "Gdańsk", display = true},
["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}},
["Poznan"] = {alias_of = "Poznań", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Lodz" without accents.
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"},
["Łódź"] = {alias_of = "Lodz", display = true},
["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}},
["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}},
["Oporto"] = {alias_of = "Porto", display = true},
["Bucharest"] = {container = "Romania"},
["Belgrade"] = {container = "Serbia"},
["Stockholm"] = {container = "Sweden"},
["Zurich"] = {container = "Switzerland"},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Zurich" without umlaut.
--- Even Wikipedia uses the form without umlaut.
["Zürich"] = {alias_of = "Zurich", display = true},
["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast
-- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common.
["Kiev"] = {alias_of = "Kyiv"},
["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}},
["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"},
-- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement.
["Odesa"] = {alias_of = "Odessa"},
------------------ North America, South America ---------------------
-- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01);
-- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data;
-- Wikipedia city limits figures from [[w:List of largest cities in the Americas]].
["Buenos Aires"] = {container = "Argentina"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia)
["Córdoba, Argentina"] = {container = "Argentina", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia)
-- to avoid confusion with Córdoba in Spain
["Córdoba"] = {alias_of = "Córdoba, Argentina"},
["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"},
["Rosario"] = {container = "Argentina", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia)
["Mendoza"] = {container = "Argentina", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area)
["San Miguel de Tucumán"] = {container = "Argentina"}, -- 1,110,000 (Consolidated Urban Area)
["Tucumán"] = {alias_of = "San Miguel de Tucumán"},
["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"},
["Santa Cruz de la Sierra"] = {container = "Bolivia"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia)
["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"},
["La Paz"] = {container = "Bolivia"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz)
["El Alto"] = {container = "Bolivia"},
["Cochabamba"] = {container = "Bolivia"}, -- 1,280,000 (Consolidated Urban Area)
["Santiago"] = {container = "Chile"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? per Wikipedia)
["Valparaíso"] = {container = "Chile"}, -- 1,060,000 (Consolidated Urban Area)
["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area)
["Bogotá"] = {container = "Colombia"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia)
["Bogota"] = {alias_of = "Bogotá", display = true},
["Medellín"] = {container = "Colombia"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia)
["Medellin"] = {alias_of = "Medellín", display = true},
["Cali"] = {container = "Colombia"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia)
["Barranquilla"] = {container = "Colombia"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia)
["Bucaramanga"] = {container = "Colombia"}, -- 1,380,000 (Agglomeration)
["Cartagena, Colombia"] = {container = "Colombia", wp = "%l, %c"}, -- 1,250,000 (Agglomeration)
-- to avoid confusion with Cartagena, Spain
["Cartagena"] = {alias_of = "Cartagena, Colombia"},
["Cúcuta"] = {container = "Colombia"}, -- 1,130,000 (Agglomeration)
["Cucuta"] = {alias_of = "Cúcuta", display = true},
-- to avoid conflict with San Jose, California
["San José, Costa Rica"] = {container = "Costa Rica", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia)
["San José"] = {alias_of = "San José, Costa Rica"},
["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME
["Havana"] = {container = "Cuba"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia)
["Santo Domingo"] = {container = "Dominican Republic"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia)
["Guayaquil"] = {container = "Ecuador"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia)
["Quito"] = {container = "Ecuador"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? per Wikipedia)
["San Salvador"] = {container = "El Salvador"}, -- 1,580,000 (Municipality (urban population))
["Guatemala City"] = {container = "Guatemala"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia)
["Port-au-Prince"] = {container = "Haiti"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia)
["San Pedro Sula"] = {container = "Honduras"}, -- 1,330,000 (Consolidated Urban Area)
["Tegucigalpa"] = {container = "Honduras"}, -- 1,220,000 (Urban Area)
["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area)
["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area)
["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population))
["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia)
["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration)
["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area)
["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia)
["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia)
["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? per Wikipedia)
-- to avoid confusion with Valencia (city and autonomous community of Spain)
["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
["Valencia"] = {alias_of = "Valencia, Venezuela"},
["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}
export.misc_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(nil, "negara"),
default_placetype = "city",
data = export.misc_cities,
}
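--[==[
Illustrative sketch only; it is kept inside a comment so it does not change the module. It shows one plausible way a
consumer could resolve the `alias_of` entries in the city tables above. The function name `resolve_alias_sketch` is
hypothetical, and the reading of `display` (true = show the canonical key; a string = show that string; absent = keep
what the user typed) is an assumption inferred from the "display-canonicalize" comments on the entries, not taken from
the actual lookup code.

	local function resolve_alias_sketch(data, placename)
		local spec = data[placename]
		if not spec then
			return nil
		end
		if not spec.alias_of then
			-- Already a canonical key; display it as typed.
			return placename, placename
		end
		local display
		if spec.display == true then
			display = spec.alias_of
		elseif type(spec.display) == "string" then
			display = spec.display
		else
			display = placename
		end
		return spec.alias_of, display
	end

	-- resolve_alias_sketch(export.misc_cities, "Cordoba")  --> "Córdoba, Argentina", "Córdoba"
	-- resolve_alias_sketch(export.misc_cities, "Kraków")   --> "Krakow", "Krakow"
	-- resolve_alias_sketch(export.misc_cities, "Kiev")     --> "Kyiv", "Kiev"
]==]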
--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones, as the full list runs into the
hundreds).
]==]
export.locations = {
export.continents_group,
export.countries_group,
export.country_like_entities_group,
export.former_countries_group,
export.australia_group,
export.austria_group,
export.bangladesh_group,
export.brazil_group,
export.canada_group,
export.china_group,
export.china_prefecture_level_cities_group,
export.china_prefecture_level_cities_group_2,
export.egypt_group,
export.finland_group,
export.france_group,
export.france_departments_group,
export.germany_group,
export.greece_group,
export.india_group,
export.indonesia_group,
export.iran_group,
export.ireland_group,
export.italy_group,
export.japan_group,
export.laos_group,
export.lebanon_group,
export.malaysia_group,
export.malta_group,
export.mexico_group,
export.moldova_group,
export.morocco_group,
export.netherlands_group,
export.new_zealand_group,
export.nigeria_group,
export.north_korea_group,
export.norway_group,
export.pakistan_group,
export.philippines_group,
export.poland_group,
export.portugal_group,
export.romania_group,
export.russia_group,
export.saudi_arabia_group,
export.south_africa_group,
export.south_korea_group,
export.spain_group,
export.taiwan_group,
export.thailand_group,
export.turkey_group,
export.ukraine_group,
export.united_kingdom_group,
export.united_states_group,
export.england_group,
export.northern_ireland_group,
export.scotland_group,
export.wales_group,
export.vietnam_group,
export.australia_cities_group,
export.brazil_cities_group,
export.canada_cities_group,
export.france_cities_group,
export.germany_cities_group,
export.india_cities_group,
export.indonesia_cities_group,
export.italy_cities_group,
export.japan_cities_group,
export.mexico_cities_group,
export.nigeria_cities_group,
export.pakistan_cities_group,
export.philippines_cities_group,
export.russia_cities_group,
export.saudi_arabia_cities_group,
export.south_korea_cities_group,
export.spain_cities_group,
export.taiwan_cities_group,
export.united_kingdom_cities_group,
export.united_states_cities_group,
export.new_york_boroughs_group,
export.vietnam_cities_group,
export.misc_cities_group,
}
return export
e7wgydrdurf6fp3y3d571dji0neo8h9
Modul:place
828
76178
281374
244167
2026-04-22T07:16:34Z
PeaceSeekers
3334
281374
Scribunto
text/plain
local export = {}
local force_cat = false -- set to true for testing
local m_placetypes = require("Module:place/placetypes")
local m_links = require("Module:links")
local memoize = require("Module:memoize")
local m_strutils = require("Module:string utilities")
local m_table = require("Module:table")
local debug_track_module = "Module:debug/track"
local en_utilities_module = "Module:en-utilities"
local form_of_module = "Module:form of"
local languages_module = "Module:languages"
local parse_interface_module = "Module:parse interface"
local parse_utilities_module = "Module:parse utilities"
local parameter_utilities_module = "Module:parameter utilities"
local utilities_module = "Module:utilities"
local enlang = require(languages_module).getByCode("en")
local rmatch = m_strutils.match
local rfind = m_strutils.find
local ulen = m_strutils.len
local split = m_strutils.split
local dump = mw.dumpObject
local insert = table.insert
local concat = table.concat
local pluralize = require(en_utilities_module).pluralize
local extend = m_table.extend
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
local internal_error = m_placetypes.internal_error
local process_error = m_placetypes.process_error
local placetype_data = m_placetypes.placetype_data
--[==[ intro:
===Introduction===
This module implements {{tl|place}}, which is a template for standardizing the description and categorization of
toponyms (terms that refer to locations such as cities, countries, rivers, etc.). The following modules support this
template:
* [[Module:place]]: The main module.
* [[Module:place/placetypes]]: A module containing data on placetypes, as well as utilities for working with placetypes;
category generation handlers for adding categories based on placetypes; and display handlers for displaying holonyms
(i.e. containing locations) of a specific type. FIXME: Maybe split out the code from the data.
* [[Module:place/locations]]: A module containing data on known locations, as well as utilities for working with
such locations. FIXME: Maybe split out the code from the data.
* [[Module:category tree/topic/Places]]: A category tree module for generating the descriptions of all
categories generated by {{tl|place}}.
* [[Module:place doc]]: A module that generates documentation tables describing known placetypes and locations.
===Basic terminology===
The basic terminology used in this and associated {{tl|place}} modules is:
* A ''location'' (or equivalently, a ''place'') is any geographic feature (either natural or geopolitical), either on
the surface of the Earth or elsewhere. Examples of types of natural places are rivers, mountains, seas and moons;
examples of types of geopolitical places are cities, countries, neighborhoods and roads. A ''known location'' is
specifically a location whose properties are specified in the {{tl|place}} modules; more on them below.
* Specific places are identified by names, referred to as ''toponyms'' or ''placenames''. A given place will often have
multiple names, and a given toponym may be ambiguous, referring to multiple possible locations. Specifically:
** There may be names including different amounts of disambiguating information (`Tucson` vs. `Tucson, Arizona` vs.
`Tucson, Arizona, USA` or `New York` vs. `New York City` vs. `New York, New York`); abbreviations (`NYC`
for `New York City`, `USA` for `United States of America`); ''official'' vs. ''short'' names (e.g.
`Union of Soviet Socialist Republics` vs. `Soviet Union`); spelling variations (`Cracow` vs. `Krakow` vs. `Kraków`);
current vs. former names (`Saint Petersburg` vs. `Leningrad` vs. `Petrograd`); [[exonym]]s vs. [[endonym]]s (e.g.
`Tavastia Proper` vs. `Kanta-Häme`, both referring to the same administrative region in Finland); alternative names
not due to any of the above reasons (`Bashkiria` vs. `Bashkortostan`); etc. In addition, each language that has an
opportunity to refer to the place will have its own name, with the same sorts of variations as exist in English.
** Examples of ambiguous toponyms are `New York` (either a city or a state); `Georgia` (either a state of the US or an
independent country in the Caucasus Mountains); `Paris` (either the capital of France or various small cities and
towns in the US); `Mexico` (either a country, a state of that country, or the capital city of that country); and
`San Antonio` (besides being a major city in Texas, it is the name of dozens of settlements of all sorts throughout
the US and Latin America, and at least 181 distinct [[barangay]]s in the Philippines).
* A ''placetype'' is the (or a) type that a location belongs to (e.g. `city`, `state`, `river`, `administrative region`,
`[[regional county municipality]]`, etc.).
** It is common for locations to be described using multiple placetypes, and even sometimes known locations have
multiple placetypes that they may be identified by (e.g. American Samoa can be identified either as an
`unincorporated territory`, an `overseas territory` or just a `territory`). Both the {{tl|place}} template and the
known location data allow a given location to be identified by multiple placetypes. When in doubt as to the correct
placetype or placetypes for a given location, generally follow how Wikipedia describes the place.
** Some placetypes themselves are ambiguous; e.g. an ''area'' can variously refer to a top-level administrative division
(specifically of Kuwait); a geographic region, generally without unambiguously defined borders; or a section of a
city, similar to a neighborhood. The term ''district'' is similarly ambiguous. A ''[[prefecture]]'' in the context of
Japan is similar to a province, but a prefecture in France is the capital of a ''[[department]]'' (which is similar
to a county). Some of this ambiguity is currently handled automatically; e.g. the ambiguity of areas and districts is
handled by looking at the ''holonyms'', or containing locations, specified for a given place. But sometimes it is
necessary to use a qualifier before the placetype to disambiguate; for example to refer to a French prefecture, use
the placetype `French prefecture` instead of just `prefecture`. (FIXME: Handle this automatically.)
* A ''holonym'', in the context of a description of a place, is a placename that refers to a larger-sized entity that
contains the location being described. For example, `Arizona` and `United States` are holonyms of `Tucson`, and
`United States` is a holonym of `Arizona`.
* A ''place invocation'' consists of the invocation of {{tl|place}}, including all its parameters. Place invocations
may contain one or more ''place descriptions'', each of which provides a description of the location, including its
placetype or types, any holonyms, and any additional raw text needed to properly explain the place in context. Place
invocations may also contain named parameters specifying zero or more English ''glosses'' or translations (for
foreign-language toponyms) and any attached ''extra information'' such as the capital, largest city, official name,
modern name or full name. Multiple place descriptions in a single invocation are separated by a numbered parameter
starting with a semicolon, and are used when it is necessary to provide two or more definitions of a single location
for proper categorization. For example, [[Vatican City]] is defined both as a city-state in Southern Europe and as an
enclave within the city of Rome, as follows:
: {{tl|place|en|city-state|r/Southern Europe|;,|an <<enclave>> within the city of [[Rome]], [[Italy]]|cat=Places in Rome|official=Vatican City State}}.
Similar things need to be done for places like [[Crimea]] that are claimed by two different countries with different
definitions and administrative structures.
** There are two types of place descriptions, ''new-style'' and ''old-style''. (The use of the terms "new" and "old"
indicates chronological precedence in the development of {{tl|place}}, but is not meant to pass any value judgments
on the two types, and does not indicate any intent to deprecate old-style descriptions. Both types of descriptions
are useful; for example, old-style descriptions are generally more succinct but less flexible.) The above invocation
shows both types: an old-style description followed by a new-style description. Old-style descriptions use multiple
numbered parameters, where the first parameter (after the language code) specifies the placetype or types, and
following parameters specify either holonyms (which are always of the form ` ``placetype``/``placename`` `) or raw
text (which is identifiable by not having a slash in it). New-style descriptions use a single parameter, where both
placetypes and holonyms are surrounded by double angle brackets, and all remaining text is raw (displayed as-is). In
both types of descriptions, holonyms include a slash in them to separate the placetype (which is mandatory and often
abbreviated) from the placename.
** In the context of a place description, there are two types of placetypes. The ''entry placetypes'' are the placetypes
of the place being described, while the ''holonym placetypes'' are the placetypes of the holonyms that the place
being described is located within. Currently, a given place can have multiple placetypes specified (e.g. [[Normandy]]
is specified using the ''compound placetype'' `administrative region/former province/and/medieval kingdom`) while a
given holonym can have only one placetype associated with it. Holonym placetypes are frequently abbreviated (e.g.
`r` for `region`, `s` for `state`, `co` for `county`, etc.), while stylistically it is preferred to spell out the
entry placetype (except for some long placetypes with well-known abbreviations, such as `CDP` or `cdp` for
`[[census-designated place]]`).
** All holonyms in place descriptions are automatically linked as if surrounded by {{tl|l|en|...}}; i.e. if double
brackets do not occur in the holonym, the entire holonym will be linked to the corresponding Wiktionary article. For
this reason, the holonym should generally be in the same format as the canonical Wiktionary article describing the
location (see below).
* A ''known location'' is a location whose properties are specifically defined in the {{tl|place}} modules. Generally
each such location has an associated category, and known locations exist in a containment hierarchy, where the
immediately containing known location is known as the ''container'' of the location and the chain of successive
containing locations is known as the ''container trail''. Generally the location's container corresponds to the first
parent of its category. Note that some known locations belong to more than one immediate container; for example,
Russia belongs to both Europe and Asia.
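The distinction between holonyms and raw text in old-style descriptions, and the double-angle-bracket markup of
new-style descriptions, can be illustrated with a minimal Lua sketch. The function names `classify_old_style_arg` and
`bracketed_segments` are hypothetical and are not part of [[Module:place]]; the sketch assumes that any argument
containing a slash is a holonym, and ignores edge cases such as raw text that itself contains a slash.

```lua
-- Hypothetical helper (not the actual Module:place code): classify one
-- old-style numbered argument as a holonym spec or as raw text, based
-- solely on whether it contains a slash.
local function classify_old_style_arg(arg)
	local placetype, placename = arg:match("^([^/]+)/(.+)$")
	if placetype then
		return {kind = "holonym", placetype = placetype, placename = placename}
	end
	return {kind = "raw", text = arg}
end

-- Hypothetical helper: collect the contents of all <<...>> segments in a
-- new-style description; everything outside them is raw text.
local function bracketed_segments(description)
	local segments = {}
	for segment in description:gmatch("<<(.-)>>") do
		segments[#segments + 1] = segment
	end
	return segments
end

local holonym = classify_old_style_arg("r/Southern Europe")
local raw = classify_old_style_arg("in the far north")
local segments = bracketed_segments("an <<enclave>> within the city of [[Rome]], [[Italy]]")
```

Here `holonym` carries the (abbreviated) placetype `r` and the placename `Southern Europe`, `raw` is kept as-is, and
`segments` contains only `enclave`, since the double-bracketed links are ordinary raw text.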
===More about placetypes===
# The following general categories of placetypes exist:
## ''Natural features'' such as lakes, mountains, mountain ranges, islands, archipelagoes, moons, stars, asteroids, etc.
## ''Continents'', ''supercontinents'' (groupings of continents where it makes sense, such as `America` and `Eurasia`)
and ''continent-level regions'' (groupings of countries in a given continent, such as `Central America` and
`Polynesia`).
## ''Political entities'', which are generally classified as either ''polities'' (top-level entities such as countries),
''subpolities'' or ''political divisions'' (non-sovereign divisions, often specifically ''administrative divisions'',
of a polity, where an administrative division has a governmental or statistical function and almost always has
unambiguously defined boundaries), or ''settlements'' (e.g. cities; towns; villages; and divisions of a city such as
neighborhoods, wards, [[barrio]]s and [[barangay]]s, which may or may not be formal administrative divisions and
may or may not have unambiguous boundaries).
## ''Geographic regions'', which refer to recognized areas of the Earth (either with a natural geographic, political or
cultural significance, often of a historical nature). Such regions can be of greatly varying size, may exist either
within a single country or spanning multiple countries or (more often) parts of multiple countries, and may not have
well-defined boundaries. They should be distinguished from ''administrative regions'', which exist within a single
country and have well-defined boundaries and a political or administrative function. Geographic regions are
categorized using the generic term ''geographic and cultural areas'' to emphasize that (a) they have no
administrative significance; (b) they may vary greatly in size; and (c) their cohesion is due either to natural
geographic boundaries, such as rivers or mountain ranges, or to sharing some cultural characteristics.
## ''Man-made structures'' below the level of a settlement or neighborhood, such as airports, roads, individual
buildings, and the like. (Note that such structures, even if named, often do not meet the [[WT:CFI]] criteria; this
is particularly the case for roads.)
# Placetypes support aliases, and the mapping to canonical form happens early on in the processing. For example, `state`
can be abbreviated as `s`; `administrative region` as `adr`; `regional county municipality` as `rcomun`; etc. Some
placetype aliases handle alternative spellings rather than abbreviations. For example, `departmental capital` maps to
`department capital`, and `home-rule city` maps to `home rule city`. Placetype abbreviations are particularly useful
in holonym specs, because every holonym must be accompanied by its placetype, for disambiguation purposes.
# A ''placetype qualifier'' is an adjective prepended to the placetype to give additional information about the
place being described. For example, a given place may be described as a `small city`; logically this is still a city,
but the qualifier `small` gives additional information about the place. Multiple qualifiers can be stacked, e.g.
`small affluent beachfront unincorporated community`, where `unincorporated community` is a recognized placetype and
`small`, `affluent` and `beachfront` are qualifiers. (As shown here, it may not always be obvious where the qualifiers
end and the placetype begins.) For the most part, placetype qualifiers do not affect categorization; a `small city`
is still a city and an `affluent beachfront unincorporated community` is still an unincorporated community, and both
should still be categorized as such. But some qualifiers do change the categorization. In particular, a
`former province` is no longer a province and should not be categorized in e.g. [[:Category:Provinces of Italy]], but
instead in a different set of categories, e.g. [[:Category:Historical political subdivisions]]. There are several
terms treated as equivalent for this purpose: `abandoned`, `ancient`, `extinct`, `historic(al)`, `medi(a)eval` and
`traditional`. Another set of qualifiers that change categorization are `fictional` and `mythological`, which cause
any term using the qualifier to be categorized respectively into [[:Category:Fictional locations]] and
[[:Category:Mythological locations]].
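As a rough illustration of alias canonicalization and qualifier handling, the following Lua sketch maps placetype
aliases to canonical form and peels qualifiers off the front of a description until a recognized placetype remains.
The tables `placetype_aliases` and `known_placetypes` and both function names are assumptions made for this example;
the real data and logic live in [[Module:place/placetypes]] and may work differently.

```lua
-- Illustrative data only; the real tables are in Module:place/placetypes.
local placetype_aliases = {
	s = "state",
	adr = "administrative region",
	["home-rule city"] = "home rule city",
}
local known_placetypes = {
	["city"] = true,
	["state"] = true,
	["province"] = true,
	["administrative region"] = true,
	["home rule city"] = true,
	["unincorporated community"] = true,
}

-- Map an alias to its canonical placetype; unknown strings pass through.
local function canonicalize_placetype(placetype)
	return placetype_aliases[placetype] or placetype
end

-- Strip leading words one at a time until the remainder is a known
-- placetype. Returns the list of qualifiers and the placetype, or nil if
-- no known placetype is found.
local function split_qualifiers(description)
	local qualifiers = {}
	local rest = description
	while not known_placetypes[rest] do
		local first, remainder = rest:match("^(%S+)%s+(.+)$")
		if not first then
			return nil
		end
		qualifiers[#qualifiers + 1] = first
		rest = remainder
	end
	return qualifiers, rest
end

local qualifiers, placetype = split_qualifiers("small affluent beachfront unincorporated community")
```

With this data, `qualifiers` is `small`, `affluent`, `beachfront` and `placetype` is `unincorporated community`. A
qualifier such as `former` would be split off in the same way; deciding that it changes categorization is a separate
step not shown here.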
===More about toponyms===
# Toponyms may be:
## ''simple'' (not including any containing location in its name, such as `Tucson`) or ''multipart'' (including one or
more containing locations, such as `Tucson, Arizona` or `Tucson, USA` or even `Tucson, Arizona, USA`);
## ''bare'' (not including the word `the` if the location normally requires this article when following a preposition,
such as `United States`, `Gambia` or `Community of Madrid`) or ''prefixed'' (including the word `the` as needed, such
as `the United States`, `the Gambia` or `the Community of Madrid`);
## ''elliptical'' (just the placename without any disambiguating placetype, such as `Durham`, `New York` or `Mexico`) or
''full'' (containing a disambiguating placetype or similar identifier if one is commonly included, such as
the city of `Durham` (in England) vs. its containing county `County Durham`; the US city `New York City` vs. its
containing state `New York`; or the three-way distinction between `Mexico` (the country), `Mexico City` (the capital
of this country) and `(the) State of Mexico` (one of the states of the country Mexico, mostly surrounding but not
including Mexico City)).
# The ''canonical Wiktionary article'' is the main article on Wiktionary where a location is described. Canonical
articles, per the above terminology, are generally ''simple'' and ''bare'', but may be either ''full'' or
''elliptical''. The fact that a given article is canonical is often identifiable by the fact that translations are
housed there and not somewhere else. For example, most counties of the US and Canada include the word `County` in their
canonical article name, but most counties elsewhere do not. `Washington, D.C.` is one of the few cases where a
non-simple toponym is used as the canonical article; this is based on common usage, especially by residents of the
city in question (who commonly refer to it as "D.C." but rarely just as "Washington").
===More about known locations===
# The following types of known locations are defined in this module:
## Continents, supercontinents and continent-level regions, into which countries are grouped. Specifically:
### At the top level below `Earth` are the supercontinents `America` and `Eurasia` and the continents `Africa`,
`Oceania` and `Antarctica`.
### `America` is further broken down into the continents `North America` (in turn containing the continental regions
`Central America` and `Caribbean`, with the United States, Canada and Mexico directly under North America) and
`South America`.
### `Eurasia` is further broken down into the continents `Europe` and `Asia`.
### `Oceania` is further broken down into the continental regions `Melanesia`, `Micronesia` and `Polynesia`, with
`Australia` directly under `Oceania`.
### Under the above-specified divisions are countries. Some countries are placed in more than one continent or
continent-level region, either because they actually span two continents (e.g. Russia, Turkey, Kazakhstan, Egypt) or
because they are politically considered to belong to a continent different from the one they are geographically in
(Cyprus, Georgia, Armenia, etc.).
## Political entities, including:
### Top-level political entities, which include:
#### Countries, with a fairly liberal definition, notably including all UN-recognized countries plus some others that
are commonly considered countries, even if not all other countries recognize them as such or consider them
completely independent (notably, Kosovo, Palestine, Taiwan, Western Sahara, Niue and the Cook Islands).
#### Pseudo-countries, which include areas calling themselves countries that are de-facto not under the control of the
country that they are internationally considered part of (e.g. Abkhazia, South Ossetia, Transnistria);
dependent/external/etc. territories of countries (e.g. American Samoa [US], Bermuda [UK], Christmas Island
[Australia], Easter Island [Chile]); constituent countries, autonomous territories and the like (Aruba, Curaçao and
Sint Maarten of the Netherlands; Greenland and the Faroe Islands of Denmark; etc.; but notably not including
England, Scotland, Northern Ireland and Wales, which are treated as regular countries); and a grab bag of other
entities that have a semi-independent existence, such as Hong Kong, Macau, Guadeloupe, Martinique and the like.
Currently, the actual distinction in treatment between "countries" and "country-like entities" is minimal, but in
the future we might restrict the sorts of subcategories of country-like entities more than regular countries.
#### Former countries, e.g. the Soviet Union, Yugoslavia, West Germany and the Roman Empire. These are much more limited
in the sorts of subcategories allowed, because generally locations, especially cities, should be described from the
perspective of which political entity they are currently located in (e.g. "an ancient Roman town in modern Syria")
and categorized as such.
### Subpolities. Generally we only list top-level administrative divisions of countries (and only fairly major countries
are usually included), but sometimes we list second-level administrative divisions, as in the case of the
United Kingdom (where the top-level administrative divisions of the four constituent countries are listed) and China
(where major prefecture-level cities are listed, and are considered administrative divisions rather than cities).
### Cities. Only major cities get categories, with the definition of "major" varying by country but often including
those where the city population itself (sometimes the metro area) is >= 1,000,000 people.
# A distinction should be made in the {{tl|place}} modules between ''keys'' and ''placenames''. Placenames are the forms
in which a location appears in a holonym, and are generally in the same format as the canonical Wiktionary article describing the
location so that when formatted as a link, the link goes to the right article; i.e. they are simple and bare, and may
be full or elliptical according to Wiktionary conventions. The ''canonical key'' of a location is how the location's
category is named, and always uniquely identifies the location from among the known locations in this module (but
not necessarily among all possible locations). In particular, subpolities usually have multipart keys that include the
containing location, such as `Anhui, China` (not just `Anhui`); `Arizona, USA` (not just `Arizona`, and also not
`Arizona, United States`); and `Herefordshire, England` (not just `Herefordshire`, and also in this case not
`Herefordshire, UK` or `Herefordshire, England, UK` or any other possible variation). Cities are normally simple, but
some cities are multipart for disambiguation purposes (e.g. `Newcastle, New South Wales` for the city in Australia vs.
`Newcastle upon Tyne` for the identically-named city in England). Canonical keys may have ''key aliases'', other
ways of referring to the location that are not necessarily unique (e.g. `Newcastle` is a key alias for both of the
above-mentioned cities), and city keys with diacritics generally have diacriticless aliases, such as canonical key
`Düsseldorf` vs. key alias `Dusseldorf`, or canonical key `Łódź` vs. key alias `Lodz`.
# Known locations are gathered into ''groups'' with similar properties, such as all the states of the United States;
all the (ceremonial) counties of England (see below); and all the "sufficiently major" prefecture-level cities in
China (where a prefecture-level city is a prefecture surrounding a major city with a unified government and is more
like a prefecture, i.e. a major administrative division just underneath a province, than like a city, and where
"sufficiently major" is defined according to the population of either the total prefecture or the urban area of the
city). Note that there are multiple types of counties in England, with overlapping but non-identical names and
boundaries; there are, in particular, ''ceremonial counties'', ''local government counties'' and ''historic
counties''; ''ceremonial counties'' have only ceremonial administrative functionality but unlike local government
counties (a) don't frequently change their boundaries or nature, (b) correspond more closely to historic county
boundaries and names, and (c) are what English people usually identify themselves with, and so they are used as top-level
divisions rather than local government counties.
# Some known locations have ''aliases'' defined, which are of two types. ''Display aliases'' map holonyms to their
canonical form near the beginning of processing (in particular before the displayed output is formatted). For example,
`US`, `U.S.`, `USA`, `U.S.A.` and `United States of America` are all canonicalized to `United States` (if identified
as a country), and display as `United States`. Similarly, the foreign forms `Occitanie` (as a region or administrative
region) and `Noord-Brabant` (as a province) are mapped to `Occitania` and `North Brabant` for display purposes. There
are also ''category aliases'', so that if e.g. `Republic of Macedonia` is encountered, it will display as such but
categorize as `North Macedonia`. (This is because, among other reasons, `Republic of Macedonia` is normally preceded
by `"the"` while `North Macedonia` is not, so a call {{tl|place|en|a <<city>> in the <<c/Republic of Macedonia>>}}
would look wrong if `Republic of Macedonia` were converted to `North Macedonia` during display, as the result would be
`a city in the North Macedonia`. There are also frequently political connotations to different category aliases, e.g.
`Burma` vs. `Myanmar`.) All of these aliases are sensitive to the placetype specified. For example, `Mexico` as a
state is categorized under `State of Mexico, Mexico` but `Mexico` the country is categorized as just `Mexico`.
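The difference between display aliases and category aliases, and their sensitivity to placetype, can be sketched in
Lua as follows. The tables `display_aliases` and `category_aliases` and the function `resolve` are hypothetical and
hold only a few illustrative entries; in particular, treating `Mexico` (as a state) as a category alias is an
assumption of this sketch, and the actual module may store and look up this information differently.

```lua
-- Illustrative alias tables, indexed first by placetype and then by the
-- placename as written in the holonym. Not the real Module:place data.
local display_aliases = {
	country = {
		["US"] = "United States",
		["U.S."] = "United States",
		["USA"] = "United States",
		["United States of America"] = "United States",
	},
	province = {["Noord-Brabant"] = "North Brabant"},
}
local category_aliases = {
	country = {["Republic of Macedonia"] = "North Macedonia"},
	state = {["Mexico"] = "State of Mexico, Mexico"},
}

-- Return the form to display and the key to categorize under. Display
-- aliases are applied first; category aliases affect categorization only.
local function resolve(placetype, placename)
	local display = (display_aliases[placetype] or {})[placename] or placename
	local category = (category_aliases[placetype] or {})[display] or display
	return display, category
end

local display, category = resolve("country", "Republic of Macedonia")
```

In this sketch `display` stays `Republic of Macedonia` while `category` becomes `North Macedonia`, so preceding text
such as "the" continues to read correctly. Calling `resolve("state", "Mexico")` and `resolve("country", "Mexico")`
gives different category keys, which is the placetype sensitivity described above.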
===Categories===
There are two main types of categories:
# Categories for known locations, divided into:
## Top-level polity categories (e.g. [[:Category:United States]], [[:Category:Taiwan]], [[:Category:South Ossetia]],
[[:Category:Bermuda]], [[:Category:Soviet Union]], [[:Category:West Germany]]).
## Subpolity categories ([[:Category:Arizona, USA]], [[:Category:Hunan]], [[:Category:Kagoshima Prefecture]],
[[:Category:Cluj County, Romania]]). For historical reasons, different formats are used for the subpolities of
different polities. Increasingly, we are moving towards always including the polity name in the subpolity category,
but whether the subpolity type is included and where it is included (cf. [[:Category:Cluj County, Romania]] vs.
[[:Category:County Cork, Ireland]]) is still inconsistent and will probably remain that way, based on how the
subpolity is normally referred to.
## City categories ([[:Category:Tokyo]], [[:Category:New York City]], [[:Category:Jaipur]]). Normally these do not
include the containing subpolity, but may do so in order to disambiguate.
# Categories for placetypes, divided into:
## "Immediate" political and non-political division categories ([[:Category:States of the United States]],
[[:Category:Municipalities of Tocantins, Brazil]], [[:Category:Ghost towns in Arizona, USA]]). These are name
categories, whose purpose is to contain locations of the specified type. "Immediate" here refers to the fact that
the location in the category name is the immediately-containing polity. Usually these categories use the preposition
"of", but sometimes "in". (Specifically, "of" typically implies that the placetype in question has an official or
semi-official status, whereas "in" implies there is no such official status, but common usage may override this.)
The form of the toponym appearing in these categories is always the same as that of the corresponding toponym
category except that the word "the" may appear (e.g. [[:Category:States of the United States]]), whereas it doesn't
appear in the toponym category itself ([[:Category:United States]], no "the").
## "Skip-polity" categories for second-level political and non-political divisions of a country or other top-level
polity (e.g. [[:Category:Counties of the United States]], [[:Category:Municipalities of Brazil]] and
[[:Category:Subprefectures of Japan]]). These have several purposes:
* They group the immediate division categories mentioned previously.
* They categorize "straggler" toponyms that (often improperly) fail to mention the subpolity they belong to, but
only the top-level polity.
* If categories do not exist for the first-level divisions of a country (and sometimes even when they do), they group
all toponyms of the specified type for the specified country. For example, Lithuania is divided into first-level
counties and second-level municipalities, but since we don't currently have categories for Lithuanian counties,
all municipalities go under [[:Category:Municipalities of Lithuania]] rather than under a category for a specific
county. In addition, even though we do have categories for Japanese prefectures (a first-level division), all
subprefectures (a second-level division) go under [[:Category:Subprefectures of Japan]] because there aren't very
many of them (see below).
## "Generic placetype" categories, both of the immediate and skip-polity type (immediate
[[:Category:Cities in California, USA]] and [[:Category:Neighborhoods of the Bronx]]; skip-polity
[[:Category:Villages in Ivory Coast]], [[:Category:Geographic and cultural areas of England]],
[[:Category:Rivers in Egypt]] and [[:Category:Places in the Philippines]]). As mentioned above, "generic" placetypes
occur in every polity (although the set of generic placetypes allowed for cities is a subset of those allowed for
top-level polities and subpolities). Usually these categories use the preposition "in", but sometimes "of". As above,
skip-polity categories group immediate categories, and in addition there are various reasons a toponym entry is
categorized into a skip-polity category. (For example, as a general rule, geographic and cultural areas only
categorize at the country level, not the subpolity level, both because there often aren't very many in a given
country and because they often span multiple subpolities.)
The parent categories of a given category depend on its type. Generally, location categories have placetype categories
as their first parent, and vice-versa. Specifically:
# Top-level country categories have as their parent e.g. [[:Category:Countries in Europe]],
[[:Category:Countries in Central America]] or [[:Category:Countries in Polynesia]], using the most specific
continental-level region the country is contained in.
# Pseudo-countries are under [[:Category:Country-like entities]] as a neutral designation. There aren't enough of them
to subcategorize under continent-level regions.
# Former countries are under [[:Category:Former countries and country-like entities]].
# Subpolity categories are usually under a placetype category whose placetype is the canonical (first-listed) placetype
of the subpolity and whose toponym is the immediately containing polity, but there are exceptions. Specifically,
sometimes if a polity has multiple types of subpolities, they are combined (e.g. [[:Category:States and territories of
Australia]], [[:Category:Federal subjects of Russia]]). In addition, sometimes a less specific but more identifiable
placetype is used instead of the canonical one (e.g. [[:Category:Regions of France]] when the canonical placetype is
"administrative region"). The same rules and exceptions generally apply when categorizing subpolities themselves; e.g.
both the Australian state of Queensland and territory of Northern Territory go under
[[:Category:en:States and territories of Australia]] rather than separately under [[:Category:en:States of Australia]]
and [[:Category:en:Territories of Australia]]. In addition, sometimes subpolities may "skip a level" if there aren't
very many. For example, there are only 26 subprefectures of Japan (14 under Hokkaido and 12 more scattered under five
other prefectures). Rather than have e.g. [[:Category:en:Subprefectures of Kagoshima Prefecture]] containing at most
two entries and [[:Category:en:Subprefectures of Miyazaki Prefecture]] containing at most one, they are all grouped
under the so-called "skip-subpolity category" [[:Category:en:Subprefectures of Japan]].
# City categories are always under e.g. [[:Category:Cities in the United States]] (e.g. [[:Category:New York City]] is
so-placed, even though [[:Category:Cities in New York, USA]] exists). However, they may have a second, more-specific
parent (e.g. [[:Category:Cities in New York, USA]] in the case of New York City). The city entries themselves will
go under the more specific parent if it exists.
# Immediate placetype categories for second-level divisions of a country generally have two parents: a
"toponym parent" that is the toponym mentioned in the category and a "skip-polity parent" that groups all subpolity
placetype categories of a specific type and containing polity. For example, [[:Category:Counties of Arizona, USA]] has
toponym parent [[:Category:en:Arizona, USA]] and skip-polity parent [[:Category:en:Counties of the United States]].
Sometimes the default skip-polity parent is overridden or disabled entirely. For example, in the US, most states are
divided into counties but Louisiana is divided into parishes and Alaska into boroughs. It would make no sense to put
[[:Category:Parishes of Louisiana, USA]] under [[:Category:Parishes of the United States]] (which would only have one
subcategory), so we include them under [[:Category:Counties of the United States]]. An alternative would be to name
the skip-polity category to explicitly include parishes and boroughs; this would get awkward here but is done in some
cases. Similarly, [[:Category:Regional county municipalities of Quebec]] is placed under
[[:Category:Regional municipalities of Canada]] since that name is used in other provinces. Meanwhile,
[[:Category:Regional districts of British Columbia]] disables its skip-polity category since no other province or
territory of Canada has regional districts or comparable subpolities under a different name (an alternative would be
to place them under [[:Category:Counties of Canada]], since they are sort of comparable to counties).
# Placetype categories for first-level divisions of a country (e.g. [[:Category:States of the United States]]) similarly
have a toponym parent (in this case [[:Category:United States]]), but in place of the skip-polity parent they have two
other parents: a "bare placetype" parent (in this case [[:Category:States]]) and the "generic" parent
[[:Category:Political divisions of specific countries]]. (There is also a bare [[:Category:Political divisions]]
that groups "bare placetype" categories.) Skip-polity placetype categories for second-level divisions of a country
(e.g. [[:Category:Counties of the United States]]) work the same. Placetype categories for countries work likewise
except they are missing the generic parent.
===Place descriptions===
A given place description is defined internally in a table of the following form:
```{
	placetypes = {"``placetype``", "``placetype``", ...},
	holonyms = {
		{ -- holonym object; see below
			placetype = "``placetype``" or nil,
			display_placename = "``placename``",
			unlinked_placename = "``placename``",
			langcode = "``langcode``" or nil,
			no_display = BOOLEAN,
			needs_article = BOOLEAN,
			force_the = BOOLEAN,
			affix_type = "``affix_type``" or nil,
			pluralize_affix = BOOLEAN,
			suppress_affix = BOOLEAN,
			continue_cat_loop = BOOLEAN,
		},
		...
	},
	order = { ``order_item``, ``order_item``, ... }, -- only for new-style place descriptions
	joiner = "``joiner_string``" or nil,
	holonyms_by_placetype = {
		``holonym_placetype`` = {"``placename``", "``placename``", ...},
		``holonym_placetype`` = {"``placename``", "``placename``", ...},
		...
	},
}```
Holonym objects have the following fields:
* `placetype`: The canonicalized placetype if specified as e.g. `c/Australia`; nil if no slash is present (in which case
the placename in `display_placename` refers to raw text).
* `display_placename`: The placename or raw text, in the format to be displayed. Placename display aliases have already
been resolved. It is raw text if `placetype` is nil.
* `unlinked_placename`: Same as `display_placename` but with links and HTML removed.
* `langcode`: The language code prefix if specified as e.g. `c/fr:Australie`; otherwise nil.
* `no_display`: If true (holonym prefixed with !), don't display the holonym but use it for categorization.
* `needs_article`: If true, prepend an article if the placename needs one (e.g. `United States`).
* `force_the`: If true, always prepend the article `the`. Example use: holonym `city:pref:the/Gold Coast`, which gets
formatted as `(the) city of the [[Gold Coast]]`.
* `affix_type`: Type of affix to prepend (values `pref` or `Pref`) or append (values `suf` or `Suf`). The actual affix
added is the placetype (capitalized if values `Pref` or `Suf` are given), or its plural if
`pluralize_affix` is given. Note that some placetypes (e.g. `district` and `department`) have inherent
affixes displayed after (or sometimes before) them.
* `pluralize_affix`: Pluralize any displayed affix. Used for holonyms like `c:pref/Canada,US`, which displays as
`the countries of Canada and the United States`.
* `suppress_affix`: Don't display any affix even if the placetype has an inherent affix. Used for the non-last
placenames when there are multiple and a suffix is present, and for the non-first placenames when
there are multiple and a prefix is present.
* `continue_cat_loop`: If true (holonym used :also), continue producing categories starting with this holonym when
preceding holonyms generated categories.
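As a rough illustration of how the fields above relate to the holonym syntax, the following sketch splits a single holonym spec into a holonym object. It is not the parser in [[Module:place]]: the function name `parse_holonym_sketch` and the exact modifier handling are assumptions made for illustration, placetype aliases such as `c` are not expanded, and comma-separated placenames, links containing slashes and `unlinked_placename` are not handled.

```lua
-- Hypothetical sketch, NOT the real parser in [[Module:place]].
-- Affix modifiers named in the field list above.
local affix_types = { pref = true, Pref = true, suf = true, Suf = true }

local function parse_holonym_sketch(spec)
	local before_slash, placename = spec:match("^([^/]*)/(.*)$")
	if not before_slash then
		-- No slash: raw text, so `placetype` stays nil.
		return { display_placename = spec }
	end
	local holonym = {}
	-- A leading "!" means: use for categorization but do not display.
	if before_slash:sub(1, 1) == "!" then
		holonym.no_display = true
		before_slash = before_slash:sub(2)
	end
	-- The first colon-separated part is the placetype; the rest are modifiers.
	for part in before_slash:gmatch("[^:]+") do
		if not holonym.placetype then
			holonym.placetype = part -- alias expansion (c -> country) omitted
		elseif part == "also" then
			holonym.continue_cat_loop = true
		elseif part == "the" then
			holonym.force_the = true
		elseif affix_types[part] then
			holonym.affix_type = part
		end
	end
	-- Assumed format of a language prefix: lowercase letters or hyphens, then ":".
	local langcode, name = placename:match("^([a-z-]+):(.+)$")
	holonym.langcode = langcode
	holonym.display_placename = name or placename
	return holonym
end
```

With this sketch, `parse_holonym_sketch("city:pref:the/Gold Coast")` yields `placetype = "city"`, `affix_type = "pref"`, `force_the = true` and `display_placename = "Gold Coast"`, while a spec without a slash such as `and` comes back as raw text with no `placetype`.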
Note that new-style place descs (those specified as a single argument using <<...>> to denote placetypes, placetype
qualifiers and holonyms) have an additional `order` field to properly capture the raw text surrounding the items
denoted in double angle brackets. The ``order_item`` items in the `order` field are objects of the following form:
```{
	type = "``order_type``",
	value = "STRING" or INDEX,
}```
Here, the ``order_type`` is one of `"raw"`, `"qualifier"`, `"placetype"` or `"holonym"`:
* `"raw"` is used for raw text surrounding `<<...>>` specs.
* `"qualifier"` is used for `<<...>>` specs without slashes in them that consist only of qualifiers (e.g. the spec
`<<former>>` in `<<former>> French <<colony>>`).
* `"placetype"` is used for `<<...>>` specs without slashes that do not consist only of qualifiers.
* `"holonym"` is used for holonyms, i.e. `<<...>>` specs with a slash in them.
For all types but `"holonym"`, the value is a string, specifying the text in question. For `"holonym"`, the value is a
numeric index into the `holonyms` field.
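To make the role of the `order` field concrete, here is a minimal sketch of walking it to reassemble the text of a new-style place description. The function name `render_order_sketch` is hypothetical, and the sketch assumes that raw-text items carry their own surrounding spaces; the real display code also links placetypes, adds articles and affixes, and so on, none of which is attempted here.

```lua
-- Hypothetical sketch: concatenate the items of `order` in sequence.
local function render_order_sketch(desc)
	local parts = {}
	for _, item in ipairs(desc.order) do
		if item.type == "holonym" then
			-- For holonyms, `value` is an index into `desc.holonyms`.
			local holonym = desc.holonyms[item.value]
			if not holonym.no_display then
				table.insert(parts, holonym.display_placename)
			end
		else
			-- "raw", "qualifier" and "placetype" items store their text directly.
			table.insert(parts, item.value)
		end
	end
	return table.concat(parts)
end
```

For a description whose `order` is qualifier `former`, raw ` French `, placetype `colony`, raw ` in ` and holonym index 1 (with that holonym displaying as `Africa`), the sketch returns `former French colony in Africa`.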
It should be noted that placetypes and placenames occurring inside the holonyms structure are canonicalized, but
placetypes inside the placetypes structure are as specified by the user. Stripping off of qualifiers and
canonicalization of qualifiers and bare placetypes happens later.
The information under `holonyms_by_placetype` is redundant to the information in holonyms but makes categorization
easier. The holonym placenames listed here already have category aliases applied.
For example, the call {{tl|place|en|city|s/Pennsylvania|c/US}} will result in the return value
```{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania" },
		{ placetype = "country", display_placename = "United States", unlinked_placename = "United States" },
	},
	holonyms_by_placetype = {
		state = {"Pennsylvania"},
		country = {"United States"},
	},
}```
Here, the placetype aliases `s` and `c` have been expanded into `state` and `country` respectively, and the placename
display alias `US` has been expanded into `United States`. The `placetypes` field is a list because there may be more than one. For
example, the call {{tl|place|en|city/and/municipality|p/[[Kwango]] Province|c/Congo}} will result in the return value
```
{
	placetypes = {"city", "and", "municipality"},
	holonyms = {
		{ placetype = "province", display_placename = "[[Kwango]] Province", unlinked_placename = "Kwango Province" },
		{ placetype = "country", display_placename = "Congo", unlinked_placename = "Congo" },
	},
	holonyms_by_placetype = {
		country = {"Congo"},
	},
}```
Here, the `unlinked_placename` field has removed links from `display_placename`.
Each value in `holonyms_by_placetype` is likewise a list; e.g. the call {{tl|place|en|city|s/Kansas|and|s/Missouri}} will
return
```
{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
		{ display_placename = "and", unlinked_placename = "and" },
		{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
	},
	holonyms_by_placetype = {
		state = {"Kansas", "Missouri"},
	},
}
```
Note that in `get_cats()` (which runs after the display form has been generated), further changes to the holonym
structure are made to aid in categorization. For example, after `handle_category_implications()` and
`augment_holonyms_with_container()` are called, the above structure will look more like
```
{
	placetypes = {"city"},
	holonyms = {
		{ placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
		{ placetype = "country", unlinked_placename = "United States" },
		{ display_placename = "and", unlinked_placename = "and" },
		{ placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
		{ placetype = "country", unlinked_placename = "United States" },
	},
	holonyms_by_placetype = {
		state = {"Kansas", "Missouri"},
		country = {"United States"}
	},
}
```
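Since `holonyms_by_placetype` is redundant to `holonyms`, it can in principle be rebuilt from it. The following sketch shows one way to do so; the function name `index_holonyms_by_placetype_sketch` is hypothetical, it assumes the indexed value is the `unlinked_placename`, and it does not apply the category aliases that the real code applies.

```lua
-- Hypothetical sketch: group holonym placenames by placetype.
local function index_holonyms_by_placetype_sketch(holonyms)
	local by_placetype = {}
	for _, holonym in ipairs(holonyms) do
		-- Raw-text holonyms (e.g. "and") have no placetype and are skipped.
		if holonym.placetype then
			local names = by_placetype[holonym.placetype]
			if not names then
				names = {}
				by_placetype[holonym.placetype] = names
			end
			-- Avoid listing the same placename twice (e.g. an augmented
			-- "United States" added after both Kansas and Missouri).
			local already_present = false
			for _, name in ipairs(names) do
				if name == holonym.unlinked_placename then
					already_present = true
					break
				end
			end
			if not already_present then
				table.insert(names, holonym.unlinked_placename)
			end
		end
	end
	return by_placetype
end
```

Applied to the augmented `holonyms` list of the Kansas/Missouri example, this produces `state = {"Kansas", "Missouri"}` and `country = {"United States"}`.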
===Overall place specs===
The overall place spec parsed by `parse_overall_place_spec` has the following fields:
* `lang`: The language object (from {{para|1}}).
* `args`: The parsed arguments from the {{tl|place}} call.
* `directives`: List of form-of directives (starting with `@`) parsed from the numeric args beginning with {{para|2}}.
Each directive contains fields `directive` (the directive as specified by the user, e.g. `"former name of"`);
`terms` (list of term objects for the terms specified by the user); `conj` (conjunction specified by the user using
inline modifier `<conj:...>`, or {nil}); `spec` (the corresponding directive spec from `all_form_of_directives`);
`pretext` (the text to display directly before the directive); `posttext` (the text to display directly after the
directive; {nil} except for the last directive).
* `descs`: List of one or more place description objects parsed from the numeric args beginning with {{para|2}}, as
described above.
* `extra_info`: List of extra-info objects for extra info specified using arguments such as {{para|capital}},
{{para|modern}}, etc. Objects are in the order they should be displayed, and each object contains fields `spec` (the
spec for the type of extra info, taken from `export.extra_info_args`), `terms` (list of term objects for the terms
specified by the user); and `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}).
===Category determination===
The algorithm to find the categories to which a given place belongs works off of a place description (which specifies
the entry placetype(s) and holonym(s); see above). If there are multiple place descriptions, each is processed
independently to generate categories. Likewise, if there are multiple entry placetypes in a given place description,
each is processed independently with all the holonyms of the description to generate categories. Furthermore, before
the category-generation algorithm runs, earlier steps have modified the holonyms of the place description (inserting
containing polities whenever possible; see the description above of `handle_category_implications()` and
`augment_holonyms_with_container()`).
Given a single entry placetype and a place description, the algorithm to generate categories processes holonyms from
left to right until it finds one that "matches" in that it produces one or more categories. At that point it attempts
to generate categories for all other holonyms in the place description of the same placetype. Normally, it then stops
processing holonyms, but if a holonym is marked using the `:also` modifier, the category generation process starts over
starting with that holonym (or the leftmost such remaining holonym, if there is more than one marked with `:also`).
This makes it possible, for example, to specify the description of a river that passes through two different types of
political divisions (e.g. Alberta and the Northwest Territories), or categorize a geographic region at both the
continent and country level, such as this:
<pre>
{{place|en|historical region|r/Eastern Europe|located in southeastern|c:also/Poland|*and western|c/Ukraine}}
</pre>
Here, `r/Eastern Europe` has a category implication that adds `cont/Europe` as a holonym directly after it, which
causes the page to be categorized into [[:Category:en:Geographic and cultural areas of Europe]]. The category generation
process would normally stop at this point, but the presence of `:also` causes it to restart with `c/Poland` and
generate the category [[:Category:en:Geographic and cultural areas of Poland]]. After doing this, it looks for other
holonyms of the same placetype as `c/Poland` (i.e. other countries), which causes it to process `c/Ukraine` and generate
the category [[:Category:en:Geographic and cultural areas of Ukraine]].
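The left-to-right loop with `:also` restarts can be sketched roughly as follows. This is a simplified illustration rather than the code in `get_cats()`: the name `collect_cats_sketch` and the callback `find_cats` (which takes a holonym and returns a list of categories, or nil when the holonym does not match) are assumptions, only holonyms to the right of the matching one are checked for the same placetype, and duplicate categories are dropped.

```lua
-- Hypothetical sketch of the holonym loop; `find_cats` is an assumed callback.
local function collect_cats_sketch(holonyms, find_cats)
	local cats, seen = {}, {}
	local function add_all(list)
		for _, cat in ipairs(list or {}) do
			if not seen[cat] then
				seen[cat] = true
				table.insert(cats, cat)
			end
		end
	end

	local i = 1
	while i <= #holonyms do
		local matched = find_cats(holonyms[i])
		if not matched then
			i = i + 1
		else
			add_all(matched)
			-- Also categorize later holonyms of the same placetype.
			for j = i + 1, #holonyms do
				if holonyms[j].placetype == holonyms[i].placetype then
					add_all(find_cats(holonyms[j]))
				end
			end
			-- Normally stop here; resume only at the leftmost remaining
			-- holonym marked with `:also` (continue_cat_loop).
			local restart
			for j = i + 1, #holonyms do
				if holonyms[j].continue_cat_loop then
					restart = j
					break
				end
			end
			if not restart then
				break
			end
			i = restart
		end
	end
	return cats
end
```

In this sketch a river with holonyms for two provinces followed by a territory gets categories for the two provinces only, unless the territory holonym has `continue_cat_loop` set, in which case a third category is added.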
The category generation process works off of the `placetype_data` table, which specifies various properties for
placetypes, such as how to display a holonym of that placetype as well as how to categorize certain pages where the
{{tl|place}} call contains the specified placetype as an entry placetype. For example, the entry for `city-state` in
[[Module:place/placetypes]] might look like
```
	["city-state"] = {
		link = true,
		category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
		has_neighborhoods = true,
		class = "settlement",
		["continent/*"] = {"City-states", "Cities", "Countries", "Countries in +++", "National capitals"},
		default = {"City-states", "Cities", "Countries", "National capitals"},
	},
```
Here, the keys specify, respectively:
# If `city-state` occurs as an entry placetype, link it to the corresponding Wiktionary entry (that is what `true` means
in `link = true`).
# Use the specified `category_link` text for categories such as [[:Category:City-states]].
# City-states are "city-like", i.e. they have neighborhoods; this controls the handling of entry placetypes such as
`neighborhood`, `district`, `area`, etc.
# City-states should be treated as settlements for determining how to handle the placetype `former city-state` and for
categorizing the bare category [[:Category:City-states]] and language-specific equivalents such as
[[:Category:en:City-states]].
# When the entry placetype `city-state` occurs along with a continent holonym, categorize into the specified categories
under `continent/*`. Here, `+++` stands for the holonym in question.
# When the entry placetype `city-state` occurs in any other context, categorize into the specified categories under
`default`.
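As an illustration of how category specs such as those under `continent/*` turn into actual categories, the sketch below substitutes the holonym for `+++` and prepends the language code. The function name `expand_cat_specs_sketch` is hypothetical, and the real code may resolve the holonym to a different form (for example one that includes an article or a containing polity) before substituting it.

```lua
-- Hypothetical sketch: expand "+++" in category specs and add the language code.
local function expand_cat_specs_sketch(langcode, specs, holonym_placename)
	local cats = {}
	for _, spec in ipairs(specs) do
		-- A function replacement is used so that "%" in a placename is not
		-- interpreted as a capture reference by gsub.
		local expanded = spec:gsub("%+%+%+", function()
			return holonym_placename
		end)
		table.insert(cats, langcode .. ":" .. expanded)
	end
	return cats
end
```

For example, `expand_cat_specs_sketch("en", {"City-states", "Countries in +++"}, "Asia")` gives `{"en:City-states", "en:Countries in Asia"}`.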
It's important to realize that the only categorization keys under a given placetype entry that are specified
explicitly in [[Module:place/placetypes]] are certain wildcard keys such as `continent/*` above (i.e. containing a slash
followed by `*`) and under the key `default`. All the remaining categorization happens through category handlers, based
on the information on known locations in [[Module:place/locations]]. For example, [[Module:place/locations]] has an
"England group" specified similarly to the following:
```
export.england_group = {
	default_container = {key = "England", placetype = "constituent country"},
	default_placetype = "county",
	default_divs = {
		"districts",
		{type = "local government districts", cat_as = "districts"},
		{
			type = "local government districts with borough status",
			cat_as = {"districts", "boroughs"},
		},
		{type = "boroughs", cat_as = {"districts", "boroughs"}},
		"civil parishes",
	},
	default_british_spelling = true,
	data = export.england_counties,
}
```
The `default_divs` key here specifies the divisions that exist for each of the counties listed under the `data` key
(unless a given county's entry overrides them). Here, the entry `{type = "boroughs", cat_as = {"districts", "boroughs"}}` directs the
category handler `political_division_cat_handler` in [[Module:place/placetypes]] (which is one of two category handlers
that run for all entry placetypes, along with `generic_place_cat_handler`) to categorize boroughs specified under any of
the counties listed under `data` as both districts and boroughs.
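The entries of `default_divs` come in several shapes: a bare string, a table with a string `cat_as`, or a table with a list `cat_as`. A minimal sketch of normalizing them and looking up how a given placetype categorizes might look like this. The function names are hypothetical, and the assumption that a missing `cat_as` defaults to the division type itself is inferred from the bare-string entries, not confirmed by the module.

```lua
-- Hypothetical sketch: bring a `default_divs` entry into a uniform shape.
local function normalize_div_sketch(div)
	if type(div) == "string" then
		-- Bare string: assumed to categorize as itself.
		return { type = div, cat_as = { div } }
	end
	local cat_as = div.cat_as or div.type
	if type(cat_as) == "string" then
		cat_as = { cat_as }
	end
	return { type = div.type, cat_as = cat_as }
end

-- Return the list of category types for a (plural) placetype, or nil if the
-- placetype is not among the listed divisions.
local function cat_types_for_sketch(divs, plural_placetype)
	for _, div in ipairs(divs) do
		local normalized = normalize_div_sketch(div)
		if normalized.type == plural_placetype then
			return normalized.cat_as
		end
	end
	return nil
end
```

With the `default_divs` list from the England group, looking up `boroughs` returns `{"districts", "boroughs"}`, `local government districts` returns `{"districts"}`, and `civil parishes` returns `{"civil parishes"}`.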
Now, the categorization process proceeds as follows, given an entry placetype and place description, which specifies a
set of holonyms (the code to do this is in `get_placetype_cats()`):
# First, look up the entry placetype and any equivalent placetypes in `placetype_data`, which is defined in
[[Module:place/placetypes]]. Note that the entry in `placetype_data` that specifies the placetype information that is
used to determine the category or categories may not directly correspond to the entry placetype as specified in the
place description. For example, if the entry placetype is `small town`, the placetype whose data is fetched will be
`town` since `small` is a recognized qualifier and there is no entry in `placetype_data` for `small town`. As another
example, if the entry placetype is `administrative capital`, the code will first look up `administrative capital` and
then look up `capital city`, which is where the category handler is found, because `administrative capital` specifies
`capital city` as its fallback.
# Then, iterate over holonyms from left to right, as described above. For each holonym, we proceed as follows:
## First, call `political_division_cat_handler` to check if the entry placetype and holonym match a division in the
`locations` data in [[Module:place/locations]], as in the example above. Note that when doing this, holonyms are
canonicalized so that e.g. `co/Bedfordshire` gets mapped to `county/Bedfordshire` (because there is an entry in
`placetype_aliases` in [[Module:place/placetypes]] that maps `co` to `county`) and `c/USA` gets mapped to
`country/United States` (because there is an entry in the location data for the list of countries that maps
`country/USA` to `country/United States` for both display and categorization purposes). This category handler, as
with all such handlers, is passed the entry placetype and holonym being processed, but is also passed the entire
place description, so it can look at other specified holonyms (particularly those that follow). It either returns
{nil} or a list of category specs (which are the actual categories minus the preceding language code).
## If `political_division_cat_handler` doesn't generate any categories, check if there is a category handler defined
using the `cat_handler` key for the entry placetype. If so, call it to generate the categories (if any).
## If the category handler returns {nil}, or there is no category handler, look for a ''wildcard key'' of the format
e.g. `country/*`, which matches any holonym of placetype `country`. If found, the value is a list of category specs,
which are processed as above.
## If we get this far without generating any categories, move to the next holonym.
## If we do generate any categories, process all other holonyms of the same placetype. For example, if the user says
{{tl|place|en|city|s/Kansas|and|s/Missouri}}, when we get to the holonym `s/Kansas`, we generate the category
[[:Category:en:Cities in Kansas, USA]]. This causes us to look for other holonyms of the same placetype `state`,
and process them accordingly, generating a category [[:Category:en:Cities in Missouri, USA]] as well. The same thing
happens in an invocation like {{tl|place|pl|river|c/Poland,Ukraine,Belarus}}.
# Once we generate categories for a holonym and any other holonyms of the same placetype, we normally stop processing
holonyms. But if a holonym has the `:also` modifier, we restart the left-to-right loop at that holonym. For example,
in the invocation {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr/Northwest Territories}},
we will generate a category [[:Category:en:Rivers in Alberta, Canada]] as well as
[[:Category:en:Rivers in British Columbia, Canada]] (because British Columbia is of the same placetype as Alberta);
but no category will be generated for the Northwest Territories, which is of a different placetype. To fix this, write
{{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr:also/Northwest Territories}}. The use
of `:also` will cause holonym processing to resume at `Northwest Territories` after `Alberta` is processed, leading to
an additional category [[:Category:en:Rivers in the Northwest Territories, Canada]]. (The presence of `the` in this
last category is because `Northwest Territories` is a known location with a spec indicating that it should be preceded
by `the`; it has nothing to do with the raw text `and the` in the invocation.)
# Finally, if we process all holonyms and don't end up producing any categories, we check the entry placetype's data for
a `default` key. If found, it lists category specs, which are processed to generate categories. This is used, for
example, in the placetype `city-state`, as described above.
# It should be noted that the above process runs independently for each combination of entry placetype and place
description. Thus, for example, an invocation {{tl|place|en|city/and/county|s/Kansas,Missouri|c/USA}} will generate
categories for both cities and counties in both Kansas and Missouri.
# Two additional sources of categories are ''bare location'' categories and ''generic place'' categories. These
categories are added by appropriate calls in the outer function `get_cats`, which iterates over placetypes and place
descriptions, calling `get_placetype_cats` on each combination.
## Bare location categories are categories like [[:Category:Arizona, USA]] that are related-to categories containing
terms related to the specified location. The bare location code, for example, adds the term [[Arizona]], and its
equivalents in other languages, to [[:Category:Arizona, USA]]. When looking for terms to consider, it checks the
pagename, the glosses specified using {{para|t}}, and the terms specified using {{para|modern}}, {{para|short}} and
{{para|full}}. It looks to see if any of these parameters match any known locations, but only adds them to a bare
location category if (a) the specified entry placetype matches, so that for example Russian `[[Джорджия]]` goes into
[[:Category:Georgia, USA]] while `[[Грузия]]` goes into [[:Category:Georgia]] (the country), even though both have a
gloss `Georgia`; and (b) there are no conflicting holonyms, so that for example the Old English term [[Munucceaster]]
if defined similarly to {{tl|place|ang|city|in modern|cc/England|t=Newcastle}} won't get added to
[[:Category:Newcastle, New South Wales]] (even though it is also a city) because the latter city is known to be in
Australia, which conflicts with the country `United Kingdom` (added internally to the Old English place description
through the holonym augmentation process, based on the holonym `cc/England`).
## Generic place categories are categories like [[:Category:Places in Kansas, USA]] and [[:Category:Places in England]]
that contain places of arbitrary placetype. These are added through a special category handler that operates like
other category handlers but is run for all placetypes, rather than only for the specified one(s).
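The first step of the process above (finding the `placetype_data` entry that actually supplies the categorization information) can be sketched as follows. The table contents, the key name `fallback`, the qualifier list and the function name `find_placetype_data_sketch` are all assumptions made for illustration; the real tables in [[Module:place/placetypes]] are far larger and may be structured differently.

```lua
-- Hypothetical miniature data, standing in for `placetype_data`.
local placetype_data_sketch = {
	town = { default = { "Towns" } },
	["administrative capital"] = { fallback = "capital city" },
	["capital city"] = { cat_handler = function() return nil end },
}
-- Hypothetical set of recognized qualifiers.
local qualifiers_sketch = { small = true, large = true }

-- Find the placetype whose data contains `wanted_key`, stripping leading
-- qualifiers and then following fallbacks. Returns placetype and data, or nil.
local function find_placetype_data_sketch(placetype, wanted_key)
	-- Strip recognized qualifiers until a known placetype remains.
	while not placetype_data_sketch[placetype] do
		local qualifier, rest = placetype:match("^(%S+)%s+(.+)$")
		if not qualifier or not qualifiers_sketch[qualifier] then
			return nil
		end
		placetype = rest
	end
	-- Follow the fallback chain, guarding against cycles.
	local visited = {}
	while placetype and not visited[placetype] do
		visited[placetype] = true
		local data = placetype_data_sketch[placetype]
		if not data then
			return nil
		end
		if data[wanted_key] ~= nil then
			return placetype, data
		end
		placetype = data.fallback
	end
	return nil
end
```

Under these assumptions, looking up `small town` for the key `default` resolves to `town`, and looking up `administrative capital` for the key `cat_handler` resolves to `capital city`.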
]==]
--[=[
TODO/FIXME:
1. [DONE] Neighborhoods should categorize at the city level. Categories like [[:Category:Places in Los Angeles]] exist
but not [[:Category:Neighborhoods in Los Angeles]]; we can refactor the code in generic_cat_handler() to support this
use case.
2. Display handlers should be smarter. For example, 'co/Travis' as a holonym should display as 'Travis County' in the
United States, but (I think) display handlers don't currently have the full context of holonyms passed in to allow
this to happen.
3. Connected to this, we have various display handlers that add the name of the holonym after or (sometimes) before the
placename if it's not already there. An example is the county_display_handler() in [[Module:place/placetypes]], which
adds "County" before Ireland and Northern Ireland counties and after Taiwan and Romania counties. This should be
integrated into the polity group for these respective polities through a setting rather than requiring a separate
handler that has special casing for various polities.
4. Placetypes for toponyms should also have display handlers rather than just fixed text. This should allow us to
dispense with the need for special types for "fpref" = "French prefecture" (which displays as "prefecture" but links
to the appropriate Wikipedia article on French prefectures, which are completely different from the more general
concept of prefecture). Similarly for "Polish colony" and "Welsh community". ("Israeli settlement" should probably
stay as-is because it displays as "Israeli settlement" not just "settlement".)
5. [DONE] Currently, categories for e.g. states and territories of Australia go into
[[:Category:States and territories of Australia]] but terms for states and territories of Australia go into
(respectively) [[:Category:States of Australia]] and [[:Category:Territories of Australia]]. We should fix this;
maybe this is as easy as setting cat_as in the respective divs definitions.
6. Probably cat_as should support raw categories as well as category types; raw categories would be indicated by being
prefixed with "Category:".
7. [MOSTLY DONE] Update documentation.
8. [DONE] Rename remaining political division categories to include name of country in them.
9. [DONE] Add Pakistan provinces and territories.
10. [DONE] Add a polity group for continents and continent-level regions instead of special-casing. This should make it
possible e.g. to have Jerusalem as a city under "Asia".
11. [DONE] Add better handling of cities that are their own states, like Mexico City.
12. [DONE] Breadcrumb for e.g. [[Category:Aguascalientes, Mexico]] is "Aguascalientes, Mexico" instead of just
"Aguascalientes".
13. [DONE] Unify aliasing system; cities have a completely different mechanism (alias_of) vs. polities/subpolities
(which use `placename_cat_aliases` and `placename_display_aliases` in [[Module:place/placetypes]]).
14. [DONE] More generally, cities should be unified into the polity grouping system to the extent possible; this would
allow for divs of cities (see #17 below).
15. [DONE] We have `no_containing_polity_cat` set for Lebanon, Malta and Saudi Arabia to prevent country-level
implications from being added due to generically-named divisions like "North Governorate", "Central Region" and
"Eastern Province" but (a) this setting seems to do multiple things and should be split, (b) it should be possible
to set this at the division level instead of the country level.
16. Split out the data from the handlers so we can use loadData() on the data because it's becoming very big.
17. [DONE] Cities like Tokyo have special wards; "prefecture-level cities" like Wuhan (which aren't really cities but we
treat them as such) have districts, subdistricts, etc. We need to support divs for cities and even named divisions
of cities (such as we already have for boroughs of New York City).
18. [DONE] It should be allowed to set 'true' to any qualifier (which links it) and have it work correctly; qualifier lookup
in [[Module:place]] needs to remove links first.
19. [DONE] Categories 'Historical polities' and 'Historical political subdivisions' should be renamed 'Former ...' since
"historic(al)" is ambiguous (cf. "historic counties" in England which are not former, but still have a legal
definition).
20. [PARTLY DONE; SUPPORT IS THERE BUT FORMER PROVINCES NOT YET CATEGORIZED] It should be possible to categorize former
subpolities of certain polities; cf. [[:Category:ja:Provinces of Japan]], which contains former provinces.
21. [DONE] In subpolity_keydesc(), we need to generate the correct indefinite article and have a huge hack to check
specifically for "union territory", which is the only placetype that shows up in this function where the default
indefinite article generating function fails. To fix this properly, we need to separate out the non-category
placetype data from `cat_data` in [[Module:place/placetypes]] and move it to [[Module:place/locations]], because we
don't have access to the data in [[Module:place/placetypes]], and that data indicates the correct article for
placetypes like "union territory".
22. [DONE] Simplify the specs in `cat_data`, eliminating the distinction between "inner" and "outer" matching. There
should not be two levels, just one. For example, in "district", instead of
["country/Portugal"] = {
["itself"] = {"Districts and autonomous regions of +++"},
}
we should just have
["country/Portugal"] = {"Districts and autonomous regions of +++"},
And in "dependent territory", instead of
["default"] = {
["itself"] = {true},
["country"] = {true},
},
we should just have
["itself"] = {true},
["country/*"] = {true},
It appears the only remaining spec that can't be easily converted in this fashion is for "subdistrict":
["country/Indonesia"] = {
["municipality"] = {true},
},
This seems to be specifically for Jakarta and doesn't seem to work anyway, as the two entries in
[[:Category:en:Subdistricts of Jakarta]] and the one entry in [[:Category:id:Subdistricts of Jakarta]] are manually
categorized.
23. [DONE] Consolidate the remaining stuff in [[Module:category tree/topic cat/data/Earth]] into
[[Module:category tree/topic cat/data/Places]].
24. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that
are in different polities from the specified containing polity/polities of the city, but doesn't do the same for
larger-level divisions. Likewise for the `city_type_cat_handler`. There are some sufficiently generically-named
divisions that this issue can occur; for example, [[Koforidua]], the capital city of Eastern Region, Ghana, is
incorrectly categorized under [[:Category:en:Cities in Eastern Region, Malta]] and
[[:Category:en:Places in Eastern Region, Malta]]. Note that the function `augment_holonyms_with_container`
''DOES'' do such checks, so we should be able to refactor the code out of that function and use it elsewhere.
25. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that
are in different polities from the specified containing polity/polities of the city; but how smart is it? It will
successfully avoid categorizing a neighborhood in e.g. [[Columbus]], [[Georgia]] that doesn't explicitly mention the
US (only `s/Georgia`) into [[:Category:en:Places in Columbus]], which is for Columbus, Ohio, but will it do the same
for a hypothetical neighborhood of Columbus in say Merseyside, England? This should be investigated. It will
probably work for a hypothetical Columbus in [[Canada]] because `augment_holonyms_with_container` would
auto-add Canada as an additional holonym once say `p/Ontario` is mentioned, but I think there's a setting preventing
this augmentation from happening for the UK. (This relates to FIXME #15. `no_containing_polity_cat` is set on
England, Scotland, etc. to prevent the toponyms from being added to [[:Category:en:Places in the United Kingdom]],
but this same setting is used to prevent augmentation, which it should not be; there should be different settings.)
26. [DONE] The `generic_cat_handler` (or more specifically `find_holonym_keys_for_categorization`) checks for city
holonyms by looking specifically for holonym type `city`. But some cities (particularly those in China) can be
specified using different holonym types, e.g. `prefecture-level city`, `subprovincial city`, etc. We should allow
these when appropriate (which means the cities in China need to have a `placetype` set that indicates their
regional-level status as well as just `city`). I'm not sure if cities support specifying a custom `placetype` at the
moment; this relates to FIXME #14 above concerning unifying cities and political divisions internally.
27. [DONE] The bare category handler (`get_bare_categories` in [[Module:place/placetypes]]) is not smart enough to avoid
overcategorizing cities or other divisions that are of the right placetype but in the wrong containing polity. For
example, Asturian [[Llión]] "León (city in Spain)" gets put in [[:Category:ast:León]] even though the latter is
supposed to refer to a city in Mexico. We can borrow the check-containing-polity code from `generic_cat_handler`.
28. [DONE] Redo handling of singular and plural to respect overrides specified in placetype_data. Check more carefully
for things that may not singularize correctly, e.g. 'passes' -> 'passe'? Definitely 'headquarters' and variants.
29. [DONE] Combine placetype_equivs and other placetype data into `placetype_data`. Figure out if we need the
distinction between `placetype_equivs` and `fallback`.
30. `has_neighborhoods` may need to be a function that can look at the containing holonyms to determine whether the
entity in question is city-like.
31. [DONE] Bare placenames as they appear in holonyms (e.g. `Riau Islands`) instead of category keys (e.g.
`the Riau Islands, Indonesia`) should appear in the polity data tables. As a first pass, the word "the" should not
appear but should instead be a property of the polity.
32. [DONE] `capital_city_cat_handler` should use `get_holonyms_to_check()`.
33. [PARTLY DONE] The code to generate and parse the correct preposition ("di", i.e. "in", or "of") is very convoluted,
and the actual preposition used is specified in various locations with various defaults, sometimes hardcoded. This
should be simplified. It is made more difficult by the fact that the "di"/"of" distinction occurs in several places:
(a) when generating the {{place}} text in old-style descriptions where the preposition isn't explicitly given, which
uses the `preposition` setting in placetype_data, defaulting to "di";
(b) when generating categories based on explicit category specs in placetype_data (which are gradually being
deprecated), which likewise uses the `preposition` setting in placetype_data, defaulting to "di";
(c) when generating categories based on political_division_cat_handler, originating in the `divs` placetypes for
specific known locations in [[Module:place/locations]], which uses the `prep` setting embedded in the `divs`
specifications, defaulting to "of";
(d) when generating categories based on category handlers specified using the `cat_handler` property of entries in
placetype_data, which tend to hardcode "di" or "of" depending on the specific category handler;
(e) when generating category descriptions in [[Module:category tree/topic/Places]] for `divs` categories generated
in (c), which (correctly) uses the same `prep` setting embedded in the `divs` settings that is used when
generating the categories themselves;
(f) when generating category descriptions for categories generated in (b) and (d) above, which relies on the
`generic_before_non_cities` and `generic_before_cities` settings in placetype_data, which need to match the
corresponding prepositions hardcoded in the category generation handlers. Instead of the hardcoding, the
category generation handler should respect the `generic_before_*` settings.
34. [[Krakow]] defined as {{place|en|A <<city>> on the [[Vistula]] River, the <<capital>> of the <<voi/Lesser Poland Voivodeship>> in southern <<c/Poland>>}}
categorizes under [[:Category:Voivodeship capitals]] when it should probably instead be under
[[:Category:Voivodeship capitals of Poland]]. Possibly this is because the various voivodeships haven't yet been
entered as known locations, but this should happen regardless of that.
35. {{tcl}} bugs:
a. [DONE] Lowercase initial letter in new-style {{place}} descriptions in {{tcl}}. Maybe we can have a setting
tcl_nolc=1 to prevent this from happening.
b. [DONE] tcl= and probably new-style {{place}} descriptions in general should recognize ;; to separate distinct {{place}}
descriptions, and similarly ;;and as the equivalent of regular `;and`, etc.
c. [DONE] The value supplied in `modern=` should be displayed in {{tcl}} descriptions regardless of the setting that
normally disables this, so that e.g. the foreign-language equivalent of [[British Honduras]] doesn't just say
it's a former British colony in Central America but specifically identifies it as modern Belize. If the user
gives place_modern= in {{tcl}}, that should override the modern= value and still display.
d. [DONE] The page supplied to {{tcl}} should be used for generating bare categories even if t= is supplied and
overrides the English term displayed.
e. [DONE] If text follows {{place}} and begins with a semicolon, the semicolon isn't copied into {{tcl}}.
36. County boroughs used as holonyms currently display 'borough county borough' because there's an affix setting for
'county borough' and a fallback display handler for 'borough'. We need to rethink this; maybe merge the affix
setting and display handlers.
37. Implement known-location groups and specs in a more standardly object-oriented way using metatables.
38. Implement caching of known location lookup in the holonym. This may have to be keyed by placetype, but we can have a
special field for when the lookup placetype is the same as the user-specified placetype of the holonym. Use this
known location in place of looking up known locations and store the appropriate known location there in
`augment_holonyms_with_container()` instead of calling `key_to_placename`.
39. Bug fixes with 'the':
(a) [DONE] [[Kazaň]] defined as {{place|cs|caplc|rep:Pref/Tatarstan|c/Russia|t1=Kazan}} displays as
"Republic of the Tatarstan".
(b) [[Valday]] defined as {{place|en|town/administrative center|dist:Suf/Valdaysky|obl/Novgorod|c/Russia}}
displays as "a town, the administrative center of the Valdaysky District". Changing to `dist:suf/Valdaysky`
displays as "... of Valdaysky district".
40. [DONE] Bug fix with 'the': [[Verkhoyansk]] defined as {{place|en|town|rep/Sakha|c/Russia}} displays as "a town in
the Sakha".
41. [DONE] [[Category:Cities in Asia]] has [[Category:Cities in Eurasia]] as a parent, which in turn has
[[Category:Cities in the Earth]] as a parent. Continents should not have the second parent like this.
42. [DONE] When checking `british_spelling`, it should check all containers as well; otherwise it's too hard to keep
this in sync across cities, administrative divisions and countries.
43. [DONE] `skip_polity_parent_type` should be renamed to container_parent_type or similar.
44. There should be a flag to allow e.g. departments of France that are currently categorized as departments of their
region to also be categorized as departments of France.
45. [DONE] Aliases are causing iterate_matching_holonym_location() to fail, e.g. if [[براق]] "Prague" is specified as
{{place|acw|capital city|c/Czechia|t1=Prague}}, this fails to add a bare category [[Category:acw:Prague]] because
the code in iterate_matching_holonym_location() isn't resolving aliases when comparing the known container
'Czech Republic'. Probably we want to build an alias table to speed up these sorts of lookups.
46. [DONE; DUE TO TYPO IN HANDLER] The district cat handler is failing to work right, e.g. in [[Saint-Gaudérique]]
defined as {{place|fr|district|city/Perpignan|in|dept/Pyrénées-Orientales|r/Occitania|c/France|t=Saint-Gaudérique}},
only the 'Places in ...' categories are getting triggered.
47. Suburbs of a given city aren't generally in the city and may not even be in the same country or country division,
so they should not categorize as "Places in ..." based on the city and specified country and division. Same goes
for "enclave" (within somewhere) and "exclave".
48. When converting display aliases, we should automatically convert full placenames to full placenames and elliptical
placenames to elliptical placenames instead of always either doing elliptical or full placenames depending on the
value of `display_as_full`.
49. `@obsolete form of` and `@archaic form of` should automatically trigger nocat=1.
50. The handler that adds bare categories should pick up values in <eq:...>.
]=]
--[==[ var:
List specifying the allowed form-of directives, used for former names, official names, abbreviations, etc. of places.
The key is the form-of directive and the value is an object with the following properties:
* `text`: The actual text displayed before the terms. If the value is `+`, the key is used as the text. If the value is
a function, it is passed a single argument, the overall place spec (see comment at top of file) and should return
the text to be displayed.
* `type_prefix`: The prefix used to generate the placetype for looking up the appropriate category or categories in the
placetype data structure. Can be omitted if there are no categories associated with the directive.
* `conjunction`: The conjunction used to join multiple terms, defaulting to `and`.
* `cat`: Additional category or categories to add the term to, whenever this particular directive is used. Normally the
value is a topic-style category minus the langcode prefix, but if prefixed with `cln:`, it is a langname-style
category. For example, the value `"Abbreviations"` would correspond to a category [[:Category:en:Abbreviations]]
(assuming the language of the {{tl|place}} call is English), while the value `"cln:abbreviations"` corresponds to a
category [[:Category:English abbreviations]]. Use a list of such specs for multiple categories.
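* `default_foreign`: If specified, the default language of terms given along with this directive is the language in
  {{para|1}}; otherwise it is English.
* `alias_of`: If specified, this directive is an alias of the given canonical directive, whose properties are used
  instead. An alias may not point to another alias.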
]==]
export.all_form_of_directives = {
["former name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"},
["fmr of"] = {alias_of = "former name of"},
["ancient name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"},
["official name of"] = {text = "+", type_prefix = "OFFICIAL_NAME_OF"},
["former official name of"] = {text = "+", type_prefix = "FORMER_OFFICIAL_NAME_OF"},
["long form of"] = {text = "+", type_prefix = "LONG_FORM_OF"},
["former long form of"] = {text = "+", type_prefix = "FORMER_LONG_FORM_OF"},
["nickname for"] = {text = "+", type_prefix = "NICKNAME_FOR"},
["official nickname for"] = {text = "+", type_prefix = "OFFICIAL_NICKNAME_FOR"},
["former nickname for"] = {text = "+", type_prefix = "FORMER_NICKNAME_FOR"},
["derogatory name for"] = {text = "[[Appendix:Glossary#derogatory|derogatory]] name for", type_prefix = "DEROGATORY_NAME_FOR"},
["synonym of"] = {text = "+"},
["syn of"] = {alias_of = "synonym of"},
["abbreviation of"] = {text = "[[Appendix:Glossary#abbreviation|abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:abbreviations",
default_foreign = true},
["abbr of"] = {alias_of = "abbreviation of"},
["abbrev of"] = {alias_of = "abbreviation of"},
["initialism of"] = {text = "[[Appendix:Glossary#initialism|initialism]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:initialisms",
default_foreign = true},
["init of"] = {alias_of = "initialism of"},
["acronym of"] = {text = "[[Appendix:Glossary#acronym|acronym]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:acronyms",
default_foreign = true},
["syllabic abbreviation of"] = {text = "[[Appendix:Glossary#syllabic abbreviation|syllabic abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:syllabic abbreviations",
default_foreign = true},
["sylabbr of"] = {alias_of = "syllabic abbreviation of"},
["sylabbrev of"] = {alias_of = "syllabic abbreviation of"},
["ellipsis of"] = {text = "[[Appendix:Glossary#ellipsis|ellipsis]] of", type_prefix = "ELLIPSIS_OF", cat = "cln:ellipses",
default_foreign = true},
["ellip of"] = {alias_of = "ellipsis of"},
["clipping of"] = {text = "[[Appendix:Glossary#clipping|clipping]] of", type_prefix = "CLIPPING_OF", cat = "cln:clippings",
default_foreign = true},
["clip of"] = {alias_of = "clipping of"},
["alternative form of"] = {text = "+", default_foreign = true},
["alt form"] = {alias_of = "alternative form of"},
["alternative spelling of"] = {text = "+", default_foreign = true},
["alt spell"] = {alias_of = "alternative spelling of"},
["alt sp"] = {alias_of = "alternative spelling of"},
["dated form of"] = {text = "[[Appendix:Glossary#dated|dated]] form of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms",
default_foreign = true},
["dated form"] = {alias_of = "dated form of"},
["dated spelling of"] = {text = "[[Appendix:Glossary#dated|dated]] spelling of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms",
default_foreign = true},
["dated spell"] = {alias_of = "dated spelling of"},
["dated sp"] = {alias_of = "dated spelling of"},
["archaic form of"] = {text = "[[Appendix:Glossary#archaic|archaic]] form of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms",
default_foreign = true},
["arch form"] = {alias_of = "archaic form of"},
["archaic spelling of"] = {text = "[[Appendix:Glossary#archaic|archaic]] spelling of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms",
default_foreign = true},
["arch spell"] = {alias_of = "archaic spelling of"},
["arch sp"] = {alias_of = "archaic spelling of"},
["obsolete form of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] form of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms",
default_foreign = true},
["obs form"] = {alias_of = "obsolete form of"},
["obsolete spelling of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] spelling of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms",
default_foreign = true},
["obs spell"] = {alias_of = "obsolete spelling of"},
["obs sp"] = {alias_of = "obsolete spelling of"},
}
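-- Return the text displayed for the |seat= extra info ("county seat", "parish seat", "borough seat" or plain "seat"),
-- based on the first placetype of the first place description in `overall_place_spec`.
local function get_seat_text(overall_place_spec)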
local placetype = overall_place_spec.descs[1].placetypes[1]
if placetype == "county" or placetype == "counties" then
return "county seat"
elseif placetype == "parish" or placetype == "parishes" then
return "parish seat"
elseif placetype == "borough" or placetype == "boroughs" then
return "borough seat"
else
return "seat"
end
end
--[==[ var:
List specifying the allowed arguments containing extra information that is sometimes added to a definition, such as the
capital, largest city, modern name, official name, etc., along with associated properties; displayed in the order given.
Each element is an object with the following properties:
* `arg`: The argument name.
* `text`: The actual text displayed before the terms. If the value is `+`, the argument name is used as the text. If the
value is a function, it is passed a single argument, the overall place spec (see the comment at the top of the file)
and should return the text to be displayed.
* `conjunction`: The conjunction used to join multiple terms, defaulting to `and`.
* `display_even_when_dropped`: Display this piece of extra info even when it would normally be dropped (e.g. in
{{tl|tcl}} when the language is other than English).
* `match_sentence_style`: If true, the text will be capitalized and preceded by a period when ''sentence style'' is
in effect (essentially, when the language is English and there is no translation specified using {{para|t}} or
similar parameter); otherwise, the text will be displayed as-is and preceded by a semicolon. If false, the semicolon
style will always be used.
* `auto_plural`: If true, pluralize the text when there is more than one term.
* `with_colon`: If true, follow the text with a colon. (This colon cannot easily be included in the text itself because
if pluralized, the pluralized text goes before the colon.)
]==]
export.extra_info_args = {
{arg = "modern", text = "+", conjunction = "atau", display_even_when_dropped = true},
{arg = "now", text = "kini", conjunction = "atau", display_even_when_dropped = true},
{arg = "full", text = "nama penuh", conjunction = "atau", display_even_when_dropped = true},
{arg = "short", text = "nama pendek", conjunction = "atau"},
{arg = "abbr", text = "singkatan", conjunction = "atau"},
{arg = "former", text = "dahulunya"},
{arg = "official", text = "nama rasmi", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "capital", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "largest city", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "caplc", text = "ibu negara dan bandar terbesar", match_sentence_style = true, auto_plural = false,
with_colon = true},
{arg = "seat", text = get_seat_text, match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "shire town", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
{arg = "headquarters", text = "+", match_sentence_style = true, auto_plural = false, with_colon = true},
{arg = "center", text = "pusat pentadbiran", match_sentence_style = true, auto_plural = false, with_colon = true},
{arg = "centre", text = "pusat pentadbiran", match_sentence_style = true, auto_plural = false, with_colon = true},
}
export.extra_info_arg_map = {}
for _, spec in ipairs(export.extra_info_args) do
export.extra_info_arg_map[spec.arg] = spec
end
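-- Illustrative lookup (a sketch, not used by the module itself): once the loop above has run, each spec in
-- `export.extra_info_args` is reachable by its argument name, e.g.
--   local spec = export.extra_info_arg_map["capital"]
--   -- spec.text == "+", spec.auto_plural == true, spec.with_colon == true
-- so a caller can test `export.extra_info_arg_map[argname]` instead of scanning the ordered list.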
----------- Wikicode utility functions
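-- Return a link to `text` in the language with code `langcode` (like {{l|langcode|text}}), passing `id` through to the
-- link. If `langcode` is nil, return `text` unchanged.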
local function link(text, langcode, id)
if not langcode then
return text
end
return m_links.full_link(
{term = text, lang = require(languages_module).getByCode(langcode, true, "allow etym"), id = id},
nil, "allow self link"
)
end
---------- Basic utility functions
-- Add the page to a tracking "category". To see the pages in the "category",
-- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here".
local function track(page)
require(debug_track_module)("place/" .. page)
return true
end
-- Uppercase the first letter of each space-separated word in `text`.
local function ucfirst_all(text)
if text:find(" ") then
local parts = split(text, " ", true)
for i, part in ipairs(parts) do
parts[i] = m_strutils.ucfirst(part)
end
return concat(parts, " ")
else
return m_strutils.ucfirst(text)
end
end
local function lc(text)
return mw.getContentLanguage():lc(text)
end
---------- Argument parsing functions and utilities
-- Split an argument on comma, but not comma followed by whitespace.
local function split_on_comma(val)
if val:find(",") then
return require(parse_interface_module).split_on_comma(val)
else
return {val}
end
end
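-- Example for split_on_comma() (illustrative). The first result follows directly from the code above; the other two
-- assume that `split_on_comma` in `parse_interface_module` behaves as described in the comment above the function:
--   split_on_comma("Austria")                         --> {"Austria"} (no comma, so a one-element list)
--   split_on_comma("Austria,Germany,Czech Republic")  --> presumably {"Austria", "Germany", "Czech Republic"}
--   split_on_comma("Washington, D.C.")                --> presumably {"Washington, D.C."} (comma + space is kept)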
-- Split an argument on slash, but not slash occurring inside of HTML tags like </span> or <br />.
local function split_on_slash(arg)
if arg:find("<") then
local m_parse_utilities = require(parse_utilities_module)
-- We implement this by parsing balanced segment runs involving <...>, and splitting on slash in the remainder.
-- The result is a list of lists, so we have to rejoin the inner lists by concatenating.
local segments = m_parse_utilities.parse_balanced_segment_run(arg, "<", ">")
local slash_separated_groups = m_parse_utilities.split_alternating_runs(segments, "/")
for i, group in ipairs(slash_separated_groups) do
slash_separated_groups[i] = concat(group)
end
return slash_separated_groups
else
return split(arg, "/", true)
end
end
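-- Example for split_on_slash() (illustrative). Without "<" in the argument, this is a plain split on "/":
--   split_on_slash("c/Italy")                      --> {"c", "Italy"}
--   split_on_slash("township/Admaston/Bromley")    --> {"township", "Admaston", "Bromley"}
-- Callers such as split_holonym() rejoin everything after the first element, so a slash inside a placename
-- survives. When the argument contains "<...>", a slash inside the angle brackets (as in "</span>") should not act
-- as a separator, assuming the [[Module:parse utilities]] functions behave as described in the comments above.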
-- Implement "implications", i.e. where the presence of a given holonym causes additional holonym(s) to be added.
-- Implications apply only to categorization. There used to be support for "general implications" that applied to both
-- display and categorization, but there ended up not being any such implications, so we've removed the support. It is
-- a bad idea in any case to have such implications; the user might purposely leave out a higher-level polity to avoid
-- redundancy in several successive definitions, and we wouldn't want to override that. Note that in practice the
-- mechanism implemented by this function is used specifically for non-administrative geographic regions such as
-- Eastern Europe and the West Bank; there is a similar mechanism for administrative regions handled by
-- `augment_holonyms_with_containing_polity` in [[Module:place/placetypes]].
--
-- `place_descriptions` is a list of place descriptions (see top of file, collectively describing the data passed to
-- {{place}}). `implication_data` is the data used to implement the implications, i.e. a table indexed by holonym
-- placetype, each value of which is a table indexed by holonym placename, each value of which is a list of
-- "PLACETYPE/PLACENAME" holonyms to be added to the list of holonyms, directly after the holonym that triggers
-- them.
local function handle_category_implications(place_descriptions, implication_data)
for i, desc in ipairs(place_descriptions) do
if desc.holonyms then
local new_holonyms = {}
for _, holonym in ipairs(desc.holonyms) do
insert(new_holonyms, holonym)
local imp_data = m_placetypes.get_equiv_placetype_prop(holonym.placetype, function(pt)
local implication = implication_data[pt] and implication_data[pt][holonym.unlinked_placename]
if implication then
return implication
end
end)
if imp_data then
for _, holonym_to_add in ipairs(imp_data) do
local split_holonym = split_on_slash(holonym_to_add)
if #split_holonym ~= 2 then
internal_error("Invalid holonym in implications: %s", holonym_to_add)
end
local holonym_placetype, holonym_placename = unpack(split_holonym, 1, 2)
local new_holonym = {
-- By the time we run, the display has already been generated so we don't need to set
-- display_placename.
placetype = holonym_placetype, unlinked_placename = holonym_placename
}
insert(new_holonyms, new_holonym)
m_placetypes.key_holonym_into_place_desc(desc, new_holonym)
end
end
end
desc.holonyms = new_holonyms
end
end
end
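-- Shape of `implication_data` expected by handle_category_implications(), as a hypothetical sketch (the entry below
-- is made up for illustration and is not taken from the real data tables):
--   local implication_data = {
--       ["region"] = {
--           ["Eastern Europe"] = {"continent/Europe"},
--       },
--   }
-- With such a table, a description whose holonyms include `region/Eastern Europe` would gain an extra holonym
-- {placetype = "continent", unlinked_placename = "Europe"} directly after it, used for categorization only.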
-- Split a holonym (e.g. "continent/Europe" or "country/en:Italy" or "in southern" or "r:suf/O'Higgins" or
-- "c/Austria,Germany,Czech Republic") into its components. Return a list of holonym objects (see top of file). Note
-- that if there isn't a slash in the holonym (e.g. "in southern"), the `placetype` field of the holonym will be nil.
-- Placetype aliases (e.g. "r" for "region") and placename aliases (e.g. "US" or "USA" for "United States") will be
-- expanded.
local function split_holonym(raw)
local no_display, combined_holonym = raw:match("^(!)(.*)$")
no_display = not not no_display
combined_holonym = combined_holonym or raw
local suppress_comma, combined_holonym_without_comma = combined_holonym:match("^(%*)(.*)$")
suppress_comma = not not suppress_comma
combined_holonym = combined_holonym_without_comma or combined_holonym
local holonym_parts = split_on_slash(combined_holonym)
if #holonym_parts == 1 then
-- `unlinked_placename` should not be used.
return {{display_placename = combined_holonym, no_display = no_display, suppress_comma = suppress_comma}}
end
-- Rejoin further slashes in case of slash in holonym placename, e.g. Admaston/Bromley.
local placetype = holonym_parts[1]
local placename = concat(holonym_parts, "/", 2)
-- Check for modifiers after the holonym placetype.
local split_holonym_placetype = split(placetype, ":", true)
placetype = split_holonym_placetype[1]
local affix_type
local saw_also
local saw_the
for i = 2, #split_holonym_placetype do
local modifier = split_holonym_placetype[i]
if modifier == "also" then
if saw_also then
error(("Modifier ':also' occurs twice in holonym '%s'"):format(combined_holonym))
end
saw_also = true
elseif modifier == "the" then
if saw_the then
error(("Modifier ':the' occurs twice in holonym '%s'"):format(combined_holonym))
end
saw_the = true
elseif modifier == "pref" or modifier == "Pref" or modifier == "suf" or modifier == "Suf" or
modifier == "noaff" then
if affix_type then
error(("Affix-type modifier ':%s' occurs twice in holonym '%s'"):format(modifier, combined_holonym))
end
affix_type = modifier
else
error(("Unrecognized holonym placetype modifier '%s', should be one of " ..
"'pref', 'Pref', 'suf', 'Suf', 'noaff', 'also' or 'the'"):format(modifier))
end
end
placetype = m_placetypes.resolve_placetype_aliases(placetype)
local holonyms = split_on_comma(placename)
local pluralize_affix = #holonyms > 1
local affix_holonym_index = (affix_type == "pref" or affix_type == "Pref") and 1 or affix_type == "noaff" and 0 or
#holonyms
for i, placename in ipairs(holonyms) do
-- Check for langcode before the holonym placename, but don't get tripped up by Wikipedia links, which begin
-- "[[w:...]]" or "[[wikipedia:...]]".
local langcode, placename_without_langcode = rmatch(placename, "^([^%[%]]-):(.*)$")
if langcode then
placename = placename_without_langcode
end
placename = m_placetypes.resolve_placename_display_aliases(placetype, placename)
holonyms[i] = {
placetype = placetype,
display_placename = placename,
unlinked_placename = m_placetypes.remove_links_and_html(placename),
langcode = langcode,
affix_type = i == affix_holonym_index and affix_type or nil,
pluralize_affix = i == affix_holonym_index and pluralize_affix,
suppress_affix = i ~= affix_holonym_index,
no_display = no_display,
suppress_comma = suppress_comma,
continue_cat_loop = saw_also,
force_the = i == 1 and saw_the,
}
end
return holonyms
end
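-- Example for split_holonym() (illustrative). A spec without a slash is returned as display-only text:
--   split_holonym("in southern")
--   --> {{display_placename = "in southern", no_display = false, suppress_comma = false}}
-- A spec with a slash yields one holonym object per comma-separated placename. Assuming "c" resolves to the
-- placetype "country" and none of the placenames has a display alias:
--   split_holonym("c/Austria,Germany,Czech Republic")
--   --> three holonyms of placetype "country". With no affix modifier, `affix_type` is nil on all of them, and
--       `suppress_affix` is true on the first two and false on the last (affix_holonym_index == #holonyms).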
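-- Construct (once, via `memoize`) the inline modifier specs accepted on terms: the "link", "q", "l" and "ref" groups,
-- <eq:...>, and an overall <conj:...> restricted to "and", "or" and "and/or".
local get_param_mods = memoize(function()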
local m_param_utils = require(parameter_utilities_module)
return m_param_utils.construct_param_mods {
{group = {"link", "q", "l", "ref"}},
{param = "eq"},
-- FIXME: Finish [[Module:format utilities]].
--{param = "conj", set = require(format_utilities_module).allowed_conjs_for_join_segments, overall = true},
{param = "conj", set = {["and"] = true, ["or"] = true, ["and/or"] = true}, overall = true},
}
end)
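-- Parse `term`, a comma-separated list of terms with optional language prefixes and inline modifiers. `paramname` is
-- the parameter name to use in error messages, and `default_lang` is the language assigned to terms without a language
-- prefix. Return the structure produced by parse_inline_modifiers(), with the parsed terms in `terms` and any overall
-- conjunction in `conj`.
local function parse_term_with_inline_modifiers(term, paramname, default_lang)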
-- FIXME: Finish changes to [[Module:parameter utilities]] and [[Module:parse utilities]] that support continuations
-- and new-format generate_obj().
--local function generate_obj(data)
-- local m_param_utils = require(parameter_utilities_module)
-- data.parse_lang_prefix = true
-- data.special_continuations = m_param_utils.default_special_continuations
-- data.default_lang = default_lang
-- return m_param_utils.generate_obj_maybe_parsing_lang_prefix(data)
--end
local function generate_obj(raw_term, parse_err)
local obj = require(parameter_utilities_module).generate_obj_maybe_parsing_lang_prefix {
term = raw_term,
parse_err = parse_err,
parse_lang_prefix = true,
}
obj.lang = obj.lang or default_lang
return obj
end
return require(parse_interface_module).parse_inline_modifiers(term, {
paramname = paramname,
param_mods = get_param_mods(),
generate_obj = generate_obj,
-- FIXME: See above.
--generate_obj_new_format = true,
splitchar = ",",
outer_container = {},
})
end
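-- Parse an @-directive of the form "@DIRECTIVE:TERM,TERM2,..." given in `arg`. `lang` is the language of the {{place}}
-- call, used as the default term language when the directive's spec sets `default_foreign`; otherwise English is used.
-- `form_of_overridden_args`, if given, maps canonical directives to a replacement value and directive (passed in by
-- [[Module:transclude]]). Return an object with fields `directive` (the canonical directive), `terms`, `conj` and
-- `spec`.
local function parse_form_of_directive(arg, lang, form_of_overridden_args)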
local form_of_directive, raw_terms = arg:match("^@([a-z -]+):(.*)$")
if not form_of_directive then
error("Misformatted @-directive: " .. dump(arg))
end
if not export.all_form_of_directives[form_of_directive] then
local known_directives = {}
for k, _ in pairs(export.all_form_of_directives) do
insert(known_directives, '"' .. k .. '"')
end
table.sort(known_directives)
error(("Unrecognized form-of directive %s in @-directive %s; recognized directives are %s"):format(
dump(form_of_directive), dump(arg), concat(known_directives, ", ")))
end
local spec = export.all_form_of_directives[form_of_directive]
local canonical_directive = form_of_directive
if spec.alias_of then
canonical_directive = spec.alias_of
spec = export.all_form_of_directives[canonical_directive]
if not spec then
internal_error("Form-of directive alias %s points to %s, which is not a directive",
"@" .. form_of_directive, canonical_directive)
elseif spec.alias_of then
internal_error("Form-of directive alias %s points to %s, which is also an alias",
"@" .. form_of_directive, canonical_directive)
end
end
local default_foreign = spec.default_foreign
local directive_param = "@" .. form_of_directive
if form_of_overridden_args and form_of_overridden_args[canonical_directive] then
raw_terms = form_of_overridden_args[canonical_directive].new_value
local new_directive = form_of_overridden_args[canonical_directive].new_directive
local new_spec = export.all_form_of_directives[new_directive]
if not new_spec then
error(("Internal error: [[Module:transclude]] passed in unrecognized replacement directive '@%s'"):
format(new_directive))
end
if new_spec.alias_of then
error(("Internal error: [[Module:transclude]] passed in replacement directive alias '@%s', " ..
"should be canonical"):format(new_directive))
end
if new_directive ~= canonical_directive then
directive_param = directive_param .. (" (replaced with @%s)"):format(new_directive)
canonical_directive = new_directive
spec = new_spec
end
default_foreign = true
end
local terms = parse_term_with_inline_modifiers(raw_terms, directive_param,
default_foreign and lang or enlang)
return {
directive = canonical_directive,
terms = terms.terms,
conj = terms.conj,
spec = spec,
}
end
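-- Example for parse_form_of_directive() (illustrative; the term is hypothetical):
--   parse_form_of_directive("@fmr of:Constantinople", lang)
--   --> {directive = "former name of", terms = <parsed term list>, conj = <nil unless given inline>,
--        spec = export.all_form_of_directives["former name of"]}
-- The alias "fmr of" is resolved to its canonical directive. Since that spec does not set `default_foreign`, the
-- term defaults to English (`enlang`) rather than to `lang`.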
-- Parse an argument containing extra information that is sometimes added to a definition, such as the capital, largest
-- city, modern name, official name, etc. `args` is the value from the parsed argument structure and can be either nil,
-- a string or a list (depending on whether it was declared as a single parameter or a list). `spec` is the extra info
-- spec corresponding to the type of extra info. `default_lang` is the language used for terms that don't specify their
-- own language. Each value in `args` can be a comma-separated list of terms with inline modifiers attached. [FIXME: we
-- should switch to always using the comma-separated format and disallow list parameters such as |capital=, |capital2=,
-- etc.] The return value is nil if no values were given; otherwise it is a structure containing fields `terms` (a list
-- of term objects, each of which is in the format expected by full_link() in [[Module:links]]), `conj` (an explicit
-- conjunction to join multiple terms, or nil if no explicit conjunction was given) and `spec` (the passed-in spec).
local function parse_extra_info_arg(args, spec, default_lang)
if not args then
return nil
end
if type(args) ~= "table" then
args = {args}
end
if not args[1] then
return nil
end
local terms = nil
local conj
for i, arg in ipairs(args) do
local this_terms = parse_term_with_inline_modifiers(arg, spec.arg .. (i == 1 and "" or i), default_lang)
local thisconj = this_terms.conj
if not conj then
conj = thisconj
elseif thisconj and conj ~= thisconj then
error(("Two different conjunctions '%s' and '%s' specified for |%s=; you only need to specify the " ..
"conjunction once"):format(conj, thisconj, spec.arg))
end
if not terms then
terms = this_terms.terms
else
m_table.extend(terms, this_terms.terms)
end
end
return {
spec = spec,
terms = terms,
conj = conj,
}
end
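-- Illustrative sketch of how the argument shapes accepted by `parse_extra_info_arg` are normalized (hypothetical
-- values; assumes a spec whose `arg` field is "capital" and that `parse_term_with_inline_modifiers` returns a table
-- with `terms` and `conj` fields, as the code above relies on):
--   parse_extra_info_arg(nil, spec, enlang)                -- returns nil (argument not given)
--   parse_extra_info_arg({}, spec, enlang)                 -- returns nil (list parameter with no values)
--   parse_extra_info_arg("Paris", spec, enlang)            -- string is wrapped as {"Paris"}; parsed under "capital"
--   parse_extra_info_arg({"Paris", "Lyon"}, spec, enlang)  -- parsed under "capital" and "capital2"; terms are merged
-- If two values carry different explicit conjunctions, an error is thrown rather than one of them silently winning.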
--[==[
Parse a "new-style" place description, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text.
Return value is a place description object as documented at the top of the file. Exported for use by
[[Module:demonyms]].
]==]
function export.parse_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args)
local placetypes = {}
local segments = split(text, "<<(.-)>>")
local retval = {holonyms = {}, order = {}}
local form_of_directives_already_present = form_of_directives and not not form_of_directives[1]
for i, segment in ipairs(segments) do
if i % 2 == 1 then
insert(retval.order, {type = "raw", value = segment})
elseif segment:find("@") then
if not form_of_directives then
error(("Form-of directive '%s' not allowed in this context"):format(segment))
elseif form_of_directives_already_present then
error(("Saw form-of directive '%s' in new-style place desc followed by direct (separate-parameter) form-of directives; not allowed"):format(
segment))
elseif placetypes[1] or retval.holonyms[1] then
error(("Form-of directive '%s' must come first, before placetypes and holonyms"):format(segment))
else
local form_of_directive = parse_form_of_directive(segment, lang, form_of_overridden_args)
if not retval.order[1] or retval.order[1].type ~= "raw" or retval.order[2] then
internal_error("`retval.order` should have a single raw element: %s", retval.order)
end
form_of_directive.pretext = retval.order[1].value
retval.order[1] = nil
insert(form_of_directives, form_of_directive)
end
elseif segment:find("/") then
local holonyms = split_holonym(segment)
for j, holonym in ipairs(holonyms) do
if j > 1 then
if not holonym.no_display then
if j == #holonyms then
insert(retval.order, {type = "raw", value = " and "})
else
insert(retval.order, {type = "raw", value = ", "})
end
end
-- All but the first in a multi-holonym need an article. For the first one, the article is
-- specified in the raw text if needed. (Currently, needs_article is only used when displaying the
-- holonym, so it wouldn't matter when no_display is set, but we set it anyway in case we need it
-- for something else.)
holonym.needs_article = true
end
insert(retval.holonyms, holonym)
if not holonym.no_display then
insert(retval.order, {type = "holonym", value = #retval.holonyms})
end
m_placetypes.key_holonym_into_place_desc(retval, holonym)
end
else
local treat_as, display = segment:match("^(..-):(.+)$")
if treat_as then
segment = treat_as
else
display = segment
end
-- see if the placetype segment is just qualifiers
local only_qualifiers = true
local split_segments = split(segment, " ", true)
for _, split_segment in ipairs(split_segments) do
if m_placetypes.placetype_qualifiers[split_segment] == nil then
only_qualifiers = false
break
end
end
insert(placetypes, {placetype = segment, only_qualifiers = only_qualifiers})
if only_qualifiers then
insert(retval.order, {type = "qualifier", value = display})
else
insert(retval.order, {type = "placetype", value = display})
end
end
end
if not form_of_directives_already_present and form_of_directives and form_of_directives[1] then
form_of_directives[#form_of_directives].posttext = ""
end
local final_placetypes = {}
for i, placetype in ipairs(placetypes) do
if i > 1 and placetypes[i - 1].only_qualifiers then
final_placetypes[#final_placetypes] = final_placetypes[#final_placetypes] .. " " .. placetypes[i].placetype
else
insert(final_placetypes, placetypes[i].placetype)
end
end
retval.placetypes = final_placetypes
return retval
end
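-- Illustrative sketch of the structure built by `export.parse_new_style_place_desc` (hypothetical input; assumes
-- `split` returns the captured `<<...>>` contents interleaved with the surrounding raw text, as the odd/even check
-- relies on, that "small" is a known qualifier, that "city" is not, and that the holonym is displayable). For the text
--   "a <<small>> <<city>> in <<s/Texas>>."
-- the result would look roughly like:
--   order = {
--     {type = "raw", value = "a "}, {type = "qualifier", value = "small"}, {type = "raw", value = " "},
--     {type = "placetype", value = "city"}, {type = "raw", value = " in "}, {type = "holonym", value = 1},
--     {type = "raw", value = "."},
--   }
--   placetypes = {"small city"}  -- a qualifier-only segment is merged into the placetype that follows it
--   holonyms = { <holonym object for "s/Texas", as produced by split_holonym> }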
--[==[
Parse one or more "new-style" place descriptions, with placetypes and holonyms surrounded by `<<...>>` amid otherwise
raw text. Multiple descriptions are separated by two semicolons in a row. Return value is a list of place description
objects as documented at the top of the file.
]==]
local function parse_conjoined_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args)
local separate_specs = split(text, ";(;[^ ]*)")
local descs = {}
for i = 1, #separate_specs do
if i % 2 == 1 then
insert(descs, export.parse_new_style_place_desc(separate_specs[i], lang, form_of_directives,
form_of_overridden_args))
form_of_directives = nil
else
descs[#descs].separator = separate_specs[i]
end
end
return descs
end
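-- Illustrative sketch of the splitting done by `parse_conjoined_new_style_place_desc` (hypothetical input; assumes
-- `split` returns the captured separator interleaved with the surrounding text, as the odd/even check relies on).
-- For the text
--   "<<city>> in <<s/Texas>>;;and <<town>> in <<s/Ohio>>"
-- the pieces would be "<<city>> in <<s/Texas>>", ";and" and " <<town>> in <<s/Ohio>>", giving two place descriptions
-- with `separator = ";and"` stored on the first. Because the separator pattern consumes every non-space character
-- after the two semicolons, a space is needed before the next description; in ";;<<town>>" with no space, the
-- "<<town>>" would be swallowed into the separator.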
--[=[
Process numeric and "extra info" arguments into an overall place spec, as described at the top of the file. `data` is an
object with the following fields:
* `args`: The parsed arguments of {{tl|place}}.
* `from_tcl`: True if we're being invoked from {{tl|tcl}}.
* `extra_info_overridden_set`, `form_of_overridden_args`: Same as the corresponding fields in the `data` object passed
to `export.format`.
]=]
local function parse_overall_place_spec(data)
local args, from_tcl, extra_info_overridden_set, form_of_overridden_args =
data.args, data.from_tcl, data.extra_info_overridden_set, data.form_of_overridden_args
local descs = {}
local this_desc
-- Index of separate (semicolon-separated) place descriptions within `descs`.
local desc_index = 1
-- Index of separate holonyms within a place description. 0 means we've seen no holonyms and have yet to process
-- the placetypes that precede the holonyms. 1 means we've seen no holonyms but have already processed the
-- placetypes.
local holonym_index = 0
local in_place_desc = false
local form_of_directives = {}
local function set_desc_joiner(desc, separator)
if separator == ";" then
desc.joiner = "; "
desc.include_following_article = true
elseif separator == ";;" then
desc.joiner = " "
else
local joiner = separator:sub(2)
if rfind(joiner, "^%a") then
desc.joiner = " " .. joiner .. " "
else
desc.joiner = joiner .. " "
end
end
end
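-- Illustrative sketch of the joiners that `set_desc_joiner` produces for a few separators (derived from its branches;
-- assumes `rfind` behaves like a Lua pattern search):
--   ";"     --> joiner "; " and include_following_article = true
--   ";;"    --> joiner " "
--   ";and"  --> joiner " and " (separator text starting with a letter is padded with a space on both sides)
--   ";,"    --> joiner ", " (any other separator text is followed by a single space)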
for _, arg in ipairs(args[2]) do
if arg:find("^@") then
if not (desc_index == 1 and holonym_index == 0) then
error("@-directives cannot follow place descriptions")
end
local form_of_directive = parse_form_of_directive(arg, args[1], form_of_overridden_args)
if form_of_directives[1] then
form_of_directive.pretext = ", "
else
form_of_directive.pretext = ""
end
insert(form_of_directives, form_of_directive)
elseif arg == ";" or arg:find("^;[^ ]") then
if not this_desc then
error("Saw semicolon joiner without preceding place description")
end
set_desc_joiner(this_desc, arg)
desc_index = desc_index + 1
holonym_index = 0
in_place_desc = false
else
if arg:find("<<") then
if in_place_desc then
error("New-style place description must come first or following a separator (semicolon or similar), not directly following another description")
end
in_place_desc = true
local this_descs = parse_conjoined_new_style_place_desc(arg, args[1], form_of_directives,
form_of_overridden_args)
for j, desc in ipairs(this_descs) do
this_desc = desc
if holonym_index > 0 then
desc_index = desc_index + 1
holonym_index = 0
end
if j < #this_descs then
set_desc_joiner(this_desc, this_desc.separator)
end
descs[desc_index] = this_desc
last_was_new_style = true
holonym_index = #this_desc.holonyms + 1
end
else
-- Old-style arguments can directly follow a new-style argument; they become additional holonyms
-- tacked onto the end of the holonym list, and are displayed old-style except that there is no
-- prefix before the first one following the new-style argument.
in_place_desc = true
if holonym_index == 0 then
local entry_placetypes = split_on_slash(arg)
this_desc = {placetypes = entry_placetypes, holonyms = {}}
descs[desc_index] = this_desc
holonym_index = holonym_index + 1
else
local holonyms = split_holonym(arg)
for j, holonym in ipairs(holonyms) do
if j > 1 then
-- All but the first in a multi-holonym need an article. Not for the first one because e.g.
-- {{place|en|city|s/Arizona|c/United States}} should not display as "a city in Arizona, the
-- United States". The overall first holonym in the place description gets an article if
-- needed regardless of our setting here.
holonym.needs_article = true
-- Insert "and" before the last holonym.
if j == #holonyms then
this_desc.holonyms[holonym_index] = {
-- Use the no_display value from the first holonym; it should be the same for all
-- holonyms. `unlinked_placename` should not be used.
display_placename = "and", no_display = holonyms[1].no_display
}
holonym_index = holonym_index + 1
end
end
this_desc.holonyms[holonym_index] = holonym
m_placetypes.key_holonym_into_place_desc(this_desc, this_desc.holonyms[holonym_index])
holonym_index = holonym_index + 1
end
end
end
end
end
if form_of_directives[1] and not form_of_directives[#form_of_directives].posttext then
form_of_directives[#form_of_directives].posttext =
(args.def and args.def ~= "-" or not args.def and descs[1]) and ": " or ""
end
-- Tracking code. This does nothing but add tracking for seen placetypes and qualifiers. The place will be linked to
-- [[Wiktionary:Tracking/place/entry-placetype/PLACETYPE]] for all entry placetypes seen; in addition, if PLACETYPE
-- has qualifiers (e.g. 'small city'), there will be links for the bare placetype minus qualifiers and separately
-- for the qualifiers themselves:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/BARE_PLACETYPE]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/QUALIFIER]]
-- Note that if there are multiple qualifiers, there will be links for each possible split. For example, for
-- 'small maritime city', there will be the following links:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/small maritime city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/maritime city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/city]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/small]]
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/maritime]]
-- Finally, there are also links for holonym placetypes, e.g. if the holonym 'c/Italy' occurs, there will be the
-- following link:
-- [[Special:WhatLinksHere/Wiktionary:Tracking/place/holonym-placetype/country]]
for _, desc in ipairs(descs) do
for _, entry_placetype in ipairs(desc.placetypes) do
local splits = m_placetypes.split_qualifiers_from_placetype(entry_placetype, "no canon qualifiers")
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
track("entry-placetype/" .. bare_placetype)
if this_qualifier then
track("entry-qualifier/" .. this_qualifier)
end
end
end
for _, holonym in ipairs(desc.holonyms) do
if holonym.placetype then
track("holonym-placetype/" .. holonym.placetype)
end
end
end
local extra_info = {}
for _, extra_info_spec in ipairs(export.extra_info_args) do
local extra_info_terms = parse_extra_info_arg(args[extra_info_spec.arg], extra_info_spec,
-- If called from {{tcl}} and extra info argument was set by {{tcl}}, interpret the argument
-- according to the language in 1=; otherwise interpret as English. To override this, prefix
-- with the appropriate language.
from_tcl and extra_info_overridden_set and extra_info_overridden_set[extra_info_spec.arg] and args[1] or
enlang)
if extra_info_terms then
insert(extra_info, extra_info_terms)
end
end
return {
lang = args[1],
args = args,
directives = form_of_directives,
descs = descs,
extra_info = extra_info,
}
end
-------- Definition-generating functions
-- Return a string with the wikilinks to the English translations of the word.
local function get_translations(transl, ids)
local ret = {}
for i, t in ipairs(transl) do
local arg_transls = split_on_comma(t)
local arg_ids = ids[i]
if arg_ids then
arg_ids = split_on_comma(arg_ids)
if #arg_transls ~= #arg_ids then
error(("Saw %s translation%s in t%s=%s but %s ID%s in tid%s=%s"):format(
#arg_transls, #arg_transls > 1 and "s" or "", i == 1 and "" or i, t, #arg_ids,
#arg_ids > 1 and "s" or "", i == 1 and "" or i, ids[i]))
end
end
for j, arg_transl in ipairs(arg_transls) do
insert(ret, link(arg_transl, "en", arg_ids and arg_ids[j] or nil))
end
end
return concat(ret, ", ")
end
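-- Illustrative sketch of `get_translations` (hypothetical values; assumes `split_on_comma` splits "Rome,Roma" into
-- {"Rome", "Roma"} and that `link` returns a wikilinked string):
--   get_translations({"Rome,Roma"}, {})           -- link("Rome", "en", nil) .. ", " .. link("Roma", "en", nil)
--   get_translations({"Rome,Roma"}, {"id1,id2"})  -- each translation is linked with the ID in the same position
--   get_translations({"Rome,Roma"}, {"id1"})      -- throws an error: 2 translations but only 1 ID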
-- Return the article (currently always `"the"`) to be prepended to the given placename, or nil. `decorated_placename`
-- is the placename as specified by the user along with any affix added to it. `placename` is the raw unlinked
-- placename, defaulting to the unlinked version of `decorated_placename` if not given. `placetypes` is a placetype or
-- list of placetypes for the placename. `suppress_holonym_use_the_check` suppresses checking the placetypes for
-- `holonym_use_the`.
local function get_placename_article(decorated_placename, placetypes, placename, suppress_holonym_use_the_check)
local unlinked_decorated_placename = m_placetypes.remove_links_and_html(decorated_placename)
if unlinked_decorated_placename:find("^the ") then
return nil
end
placename = placename or unlinked_decorated_placename
if type(placetypes) == "string" then
placetypes = {placetypes}
end
for _, placetype in ipairs(placetypes) do
local art = m_placetypes.get_equiv_placetype_prop(placetype, function(pt)
local art = m_placetypes.placename_article[pt] and m_placetypes.placename_article[pt][placename]
if art then
return art
end
end)
if art then
return art
end
end
-- Get equivalent placetypes of the specified placetype so that e.g.
-- {{place|en|@official name of:Bahamas|island country|r/Caribbean}} puts 'the' before Bahamas ("Bahamas" is just
-- specified as a country but "island country" falls back to "negara").
local all_equiv_placetypes = {}
for _, placetype in ipairs(placetypes) do
local this_equiv_placetypes = m_placetypes.get_placetype_equivs(placetype)
for _, this_equiv_placetype in ipairs(this_equiv_placetypes) do
insert(all_equiv_placetypes, this_equiv_placetype.placetype)
end
end
-- Look for a known location. We should be using find_matching_holonym_location() but that function doesn't
-- currently work without alias resolution. Instead we check if any matching location has `the = true` set.
-- In practice there aren't any cases where a given placename matches two locations, only one of which has
-- `the = true` set.
for group, key, spec in m_placetypes.iterate_matching_location {
placetypes = all_equiv_placetypes,
placename = placename,
alias_resolution = "none",
} do
-- `iterate_matching_location` doesn't initialize the spec if alias resolution is turned off, so check both
-- the spec and group. Be careful in case `the = false` is explicitly given by the spec.
if spec.the ~= nil then
if spec.the then
return "the"
end
elseif group.default_the then
return "the"
end
end
if not suppress_holonym_use_the_check then
-- See if the placetype requests an article to be placed before the placename. This occurs e.g. with 'sea'. But
-- if the user specifies e.g. "sea:pref/Cortez", we'll wrongly get "the sea of the Cortez", so in that case we
-- need to ignore the holonym article specified along with the placetype.
for _, placetype in ipairs(placetypes) do
local holonym_use_the = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt) return placetype_data[pt] and placetype_data[pt].holonym_use_the end)
if holonym_use_the then
return "the"
end
end
end
local universal_res = m_placetypes.placename_the_re["*"]
for _, re in ipairs(universal_res) do
if unlinked_decorated_placename:find(re) then
return "the"
end
end
for _, placetype in ipairs(placetypes) do
local matched = m_placetypes.get_equiv_placetype_prop(placetype, function(pt)
local res = m_placetypes.placename_the_re[pt]
if not res then
return nil
end
for _, re in ipairs(res) do
if unlinked_decorated_placename:find(re) then
return true
end
end
return nil
end)
if matched then
return "the"
end
end
return nil
end
-- Return the article (if any) to be prepended to `decorated_placename` (the user-specified placename with any affix
-- added), where the underlying holonym object that generated `decorated_placename` can be found at `holonym_index` in
-- the holonyms in `place_desc`. Returns nil if the holonym has no placetype.
local function get_holonym_article(decorated_placename, place_desc, holonym_index)
local holonym = place_desc.holonyms[holonym_index]
local holonym_placetype = holonym.placetype
if not holonym_placetype then
return nil
end
return get_placename_article(decorated_placename, holonym_placetype, holonym.unlinked_placename,
not not holonym.affix_type)
end
-- Convert a holonym into display format. This adds wikilinks to holonyms and passes them through any display handlers,
-- which may (e.g.) add the placetype to the holonym. If `needs_article` is true, prepend the article `"the"` if the
-- holonym requires it (e.g. if the holonym is `United States`). `needs_article` is set to true if we are processing the
-- first specified holonym in an old-style place description (i.e. the holonym directly following the entry placetype,
-- with no raw-text holonym in between).
--
-- Examples:
-- ({placetype = "negara", display_placename = "United States", unlinked_placename = "United States"}, true) returns
-- the template-expanded equivalent of "the {{l|en|United States}}".
-- ({placetype = "region", display_placename = "O'Higgins", unlinked_placename = "O'Higgins", affix_type = "suf"}, false)
-- returns the template-expanded equivalent of "{{l|en|O'Higgins}} region".
-- ({display_placename = "in the southern"}, false) returns "in the southern" (without wikilinking because .placetype
-- and .langcode are both nil).
local function format_holonym(place_desc, holonym_index, needs_article)
local holonym = place_desc.holonyms[holonym_index]
if holonym.no_display then
return ""
end
local orig_needs_article = needs_article
needs_article = needs_article or holonym.needs_article or holonym.force_the
local output = holonym.display_placename
local placetype = holonym.placetype
local affix_type_pt_data, affix_type, affix_is_prefix, affix, prefix, suffix, no_affix_strings
local pt_equiv_for_affix_type, already_seen_affix, need_affix
-- Implement display handlers.
local display_handler = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt) return placetype_data[pt] and placetype_data[pt].display_handler end)
if display_handler then
output = display_handler(placetype, output)
end
if not holonym.suppress_affix then
-- Implement adding an affix (prefix or suffix) based on the holonym's placetype. The affix will be
-- added either if the placetype's placetype_data spec says so (by setting 'affix_type'), or if the
-- user explicitly called for this (e.g. by using 'r:suf/O'Higgins'). Before adding the affix,
-- however, we check to see if the affix is already present (e.g. the placetype is "district"
-- and the placename is "Mission District"). The placetype can override the affix to add (by setting
-- `prefix`, `suffix` or `affix`) and/or override the strings used for checking if the affix is already
-- present (by setting 'no_affix_strings', which defaults to the affix explicitly given through `prefix`,
-- `suffix` or `affix` if any are given). `prefix` and `suffix` take precedence over `affix` if both are
-- set, but only when the appropriate type of affix is requested.
-- Search through equivalent placetypes for a setting of `affix_type`, `affix`, `prefix` or `suffix`. If we
-- find any, use them. If `affix_type` is given, it is overridden by the user's explicitly specified affix
-- type. If either an `affix_type` is found or the user explicitly specified an affix type, the affix is
-- displayed according to the following:
-- 1. If `prefix`, `suffix` or `affix` is given by the placetype or equivalent placetypes, use it (e.g.
-- placetype `administrative region` requests suffix "region" but doesn't set affix type; if the user
-- explicitly specifies `administrative region` as the placetype for a holonym and specifies a suffixal
-- affix type, use "region"). In this search, we stop looking if we find an explicit `affix_type`
-- setting; if this is found without an associated affix setting, the assumption is the associated
-- placetype was intended as the affix, not some explicit affix setting associated with a fallback
-- placetype.
-- 2. Otherwise, if the user explicitly requested an affix type, use the actual placetype (principle of
-- least surprise).
-- 3. Finally, fall back to the placetype associated with an explicit `affix_type` setting (which will
-- always exist if we get this far).
affix_type_pt_data, pt_equiv_for_affix_type = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt)
local cdpt = placetype_data[pt]
return cdpt and cdpt.affix_type and cdpt or nil
end
)
-- Declared local so that these names don't leak into the global environment.
local affix_pt_data, pt_equiv_for_affix = m_placetypes.get_equiv_placetype_prop(placetype,
function(pt)
local cdpt = placetype_data[pt]
return cdpt and (cdpt.affix_type or cdpt.affix or cdpt.prefix or cdpt.suffix) and cdpt or nil
end
)
if affix_type_pt_data then
affix_type = affix_type_pt_data.affix_type
need_affix = true
end
if affix_pt_data then
prefix = affix_pt_data.prefix or affix_pt_data.affix
suffix = affix_pt_data.suffix or affix_pt_data.affix
need_affix = true
end
no_affix_strings = affix_pt_data and affix_pt_data.no_affix_strings or
affix_type_pt_data and affix_type_pt_data.no_affix_strings
if holonym.affix_type and placetype then
affix_type = holonym.affix_type
prefix = prefix or placetype
suffix = suffix or placetype
need_affix = true
end
if need_affix then
-- At this point the affix_type has been determined and can't change any more, so we can figure out
-- whether we need the calculated prefix or suffix.
affix_is_prefix = affix_type == "pref" or affix_type == "Pref"
if affix_is_prefix then
affix = prefix
else
affix = suffix
end
if not affix then
if not pt_equiv_for_affix_type then
internal_error("Something wrong, `pt_equiv_for_affix_type` not set processing holonym: %s",
holonym)
end
affix = pt_equiv_for_affix_type.placetype
if not affix then
internal_error("Something wrong, no affix could be located in `pt_equiv_for_affix_type` for " ..
"holonym %s: %s", holonym, pt_equiv_for_affix_type)
end
end
no_affix_strings = no_affix_strings or lc(affix)
if holonym.pluralize_affix then
affix = m_placetypes.pluralize_placetype(affix)
end
already_seen_affix = m_placetypes.check_already_seen_string(output, no_affix_strings)
end
end
output = link(output, holonym.langcode or placetype and "en" or nil)
if need_affix and not affix_is_prefix and not already_seen_affix then
output = output .. " " .. (affix_type == "Suf" and ucfirst_all(affix) or affix)
end
if needs_article then
local article = holonym.force_the and "the" or get_holonym_article(output, place_desc, holonym_index)
if article then
output = article .. " " .. output
end
end
if affix_is_prefix and not already_seen_affix then
output = (affix_type == "Pref" and ucfirst_all(affix) or affix) .. " of " .. output
if orig_needs_article then
-- Put the article before the added affix if we're the first holonym in the place description. This is
-- distinct from the article added above for the holonym itself; cf. "c:pref/United States,Canada" ->
-- "the countries of the United States and Canada". We need to use the value of `needs_article` passed
-- in from the function, which indicates whether we're processing the first holonym.
output = "the " .. output
end
end
return output
end
-- Format a holonym for display, taking into account the entry's placetype (specifically, the last placetype if there
-- are more than one, excluding conjunctions and parenthetical items); the holonym's index among the holonyms in the
-- template (which specifies what the previous holonym is and whether it is the first holonym); and the full place
-- description (which helps resolve ambiguities in holonyms when looking up known locations). This may involve putting a
-- preposition ("di" or "of") before the formatted holonym, particularly if it is the first one, and may involve
-- prepending a comma. If `holonym_no_prefix` is specified, nothing except a space is put before the holonym; used
-- when formatting mixed new/old-style descriptions.
local function format_holonym_in_context(entry_placetype, place_desc, holonym_index, holonym_no_prefix)
local desc = ""
-- If holonym.placetype is nil, the holonym is just raw text, e.g. 'in southern'.
if holonym_no_prefix then
desc = " "
else
local holonym = place_desc.holonyms[holonym_index]
if not holonym.no_display then
-- First compute the initial delimiter.
if holonym_index == 1 then
if holonym.placetype then
desc = desc .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " "
elseif not holonym.display_placename:find("^,") then
desc = desc .. " "
end
else
local prev_holonym = place_desc.holonyms[holonym_index - 1]
if prev_holonym.placetype and not holonym.suppress_comma then
local dname = holonym.display_placename
if dname ~= "and" and dname ~= "di" and dname ~= "and the" then
desc = desc .. ","
end
end
if holonym.placetype or not holonym.display_placename:find("^,") then
desc = desc .. " "
end
end
end
end
return desc .. format_holonym(place_desc, holonym_index, not holonym_no_prefix and holonym_index == 1)
end
-- Return the linked description of a placetype. This splits off any qualifiers and displays them separately.
local function get_placetype_description(placetype)
local splits = m_placetypes.split_qualifiers_from_placetype(placetype)
local prefix = ""
for _, split in ipairs(splits) do
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
if this_qualifier then
prefix = (prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier) .. " "
else
prefix = ""
end
local display_form = m_placetypes.get_placetype_display_form(bare_placetype)
if display_form then
return prefix .. display_form
end
placetype = bare_placetype
end
return prefix .. placetype
end
-- Return the linked description of a qualifier (which may be multiple words).
local function get_qualifier_description(qualifier)
local splits = m_placetypes.split_qualifiers_from_placetype(qualifier .. " foo")
local split = splits[#splits]
local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3)
return prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier
end
-- Format a set of form-of directive terms.
local function format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl)
local formatted_terms = {}
local placetypes
if not overall_place_spec.descs[2] then
placetypes = overall_place_spec.descs[1].placetypes
else
placetypes = {}
for _, desc in ipairs(overall_place_spec.descs) do
m_table.extend(placetypes, desc.placetypes)
end
end
for _, termobj in ipairs(directive_terms.terms) do
local placename_article
if not termobj.alt and termobj.term and not termobj.term:find("%[%[") then
placename_article = get_placename_article(termobj.term, placetypes)
end
local linked_term = m_links.full_link(termobj, "term", nil, "show qualifiers")
linked_term = "<span class='form-of-definition-link'>" .. linked_term .. "</span>"
if termobj.eq then
linked_term = linked_term .. " (= " .. m_links.full_link {term = termobj.eq, lang = enlang} .. ")"
end
if placename_article then
linked_term = placename_article .. " " .. linked_term
end
insert(formatted_terms, linked_term)
end
local spec = directive_terms.spec
local text = spec.text
if type(text) == "function" then
text = text(overall_place_spec)
end
if text == "+" then
text = directive_terms.directive
end
if ucfirst then
text = m_strutils.ucfirst(text)
end
if not from_tcl then
local tracking_prefix = "form-of/" .. directive_terms.directive
track(tracking_prefix)
local langcode = overall_place_spec.lang:getCode()
local full_langcode = overall_place_spec.lang:getFullCode()
track(tracking_prefix .. "/" .. langcode)
if full_langcode ~= langcode then
track(tracking_prefix .. "/" .. full_langcode)
end
if full_langcode ~= "en" then
track(tracking_prefix .. "/non-english")
end
end
return (require(form_of_module).format_form_of {
text = text,
lemmas = m_table.serialCommaJoin(formatted_terms, {conj = directive_terms.conj or spec.conjunction or "and"}),
lemma_classes = false,
-- text_classes = "place-text",
})
end
-- Format a set of extra-info terms for extra information that is sometimes added to a definition, such as the capital,
-- largest city, modern name, official name, etc. `overall_place_spec` is the overall parsed {{tl|place}} spec (see
-- comment at top of file); `extra_info_terms` is the terms spec for this type of extra-info (as returned by
-- `parse_extra_info_arg`); and `sentence_style` indicates whether we're generating a sentence-style definition (as
-- suitable for an English-language term without a translation specified using t=).
local function format_extra_info(overall_place_spec, extra_info_terms, sentence_style)
local formatted_terms = {}
for _, termobj in ipairs(extra_info_terms.terms) do
insert(formatted_terms, m_links.full_link(termobj, nil, nil, "show qualifiers"))
end
local spec = extra_info_terms.spec
local text = spec.text
if type(text) == "function" then
text = text(overall_place_spec)
end
if text == "+" then
text = spec.arg
end
if spec.auto_plural and formatted_terms[2] then
text = pluralize(text)
end
if spec.with_colon then
text = text .. ":"
end
if sentence_style and spec.match_sentence_style then
text = ". " .. m_strutils.ucfirst(text)
else
text = "; " .. text
end
-- FIXME: Use joinSegments when available.
-- return text .. " " ..
-- m_table.joinSegments(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "and"})
return text .. " " ..
m_table.serialCommaJoin(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "and"})
end
-- Format an old-style place description (with separate arguments for the placetype and each holonym) for display and
-- return the resulting string.
local function format_old_style_place_desc_for_display(args, place_desc, desc_index, with_article, ucfirst)
-- The placetype used to determine whether "di" or "of" follows is the last placetype if there are
-- multiple slash-separated placetypes, but ignoring "and", "or" and parenthesized notes
-- such as "(one of 254)".
local entry_placetype = nil
local placetypes = place_desc.placetypes
local function is_and_or(item)
return item == "and" or item == "or"
end
local parts = {}
local function ins(txt)
insert(parts, txt)
end
local function ins_space()
if #parts > 0 then
ins(" ")
end
end
local and_or_pos
for i, placetype in ipairs(placetypes) do
if is_and_or(placetype) then
and_or_pos = i
-- no break here; we want the last in case of more than one
end
end
local remaining_placetype_index
if and_or_pos then
track("multiple-placetypes-with-and")
if and_or_pos == #placetypes then
error("Conjunctions 'and' and 'or' cannot occur last in a set of slash-separated placetypes: " ..
concat(placetypes, "/"))
end
local items = {}
for i = 1, and_or_pos + 1 do
local pt = placetypes[i]
if is_and_or(pt) then
-- skip
elseif i > 1 and pt:find("^%(") then
-- append placetypes beginning with a paren to previous item
items[#items] = items[#items] .. " " .. pt
else
entry_placetype = pt
insert(items, get_placetype_description(pt))
end
end
ins(m_table.serialCommaJoin(items, {conj = placetypes[and_or_pos]}))
remaining_placetype_index = and_or_pos + 2
else
remaining_placetype_index = 1
end
for i = remaining_placetype_index, #placetypes do
local pt = placetypes[i]
-- Check for and, or and placetypes beginning with a paren (so that things like
-- "{{place|en|county/(one of 254)|s/Texas}}" work).
if m_placetypes.placetype_is_ignorable(pt) then
ins_space()
ins(pt)
else
entry_placetype = pt
-- Join multiple placetypes with comma unless placetypes are already
-- joined with "and". We allow "the" to precede the second placetype
-- if they're not joined with "and" (so we get "city and county seat of ..."
-- but "city, the county seat of ...").
if i > 1 then
ins(", ")
local article = m_placetypes.get_placetype_article(pt)
if article ~= "the" and i > remaining_placetype_index then
-- Track cases where we are comma-separating multiple placetypes without the second one starting
-- with "the", as they may be mistakes. The occurrence of "the" is usually intentional, e.g.
-- {{place|zh|municipality/state capital|s/Rio de Janeiro|c/Brazil|t1=Rio de Janeiro}}
-- for the city of [[Rio de Janeiro]], which displays as "a municipality, the state capital of ...".
track("multiple-placetypes-without-and-or-the")
end
if article then
ins(article)
ins(" ")
end
end
ins(get_placetype_description(pt))
end
end
if place_desc.holonyms then
for holonym_index, _ in ipairs(place_desc.holonyms) do
ins(format_holonym_in_context(entry_placetype, place_desc, holonym_index))
end
end
local gloss = concat(parts)
if with_article then
local article
if desc_index == 1 then
article = args.a
else
if not place_desc.holonyms then
-- there isn't a following holonym; the place type given might be raw text as well, so don't add
-- an article.
with_article = false
else
local saw_placetype_holonym = false
for _, holonym in ipairs(place_desc.holonyms) do
if holonym.placetype then
saw_placetype_holonym = true
break
end
end
if not saw_placetype_holonym then
-- following holonym(s) is/are just raw text; the place type given might be raw text as well,
-- so don't add an article.
with_article = false
end
end
if with_article then
track("second-or-higher-description-with-added-article")
else
track("second-or-higher-description-suppressed-article")
end
end
if with_article then
article = article or m_placetypes.get_placetype_article(place_desc.placetypes[1], ucfirst)
if article then
gloss = article .. " " .. gloss
elseif ucfirst then
gloss = m_strutils.ucfirst(gloss)
end
end
end
return gloss
end
--[==[
Get the full gloss (English description) of a new-style place description. New-style place descriptions are
specified with a single string containing raw text interspersed with placetypes and holonyms surrounded by `<<...>>`.
Exported for use by [[Module:demonyms]].
]==]
function export.format_new_style_place_desc_for_display(args, place_desc, with_article)
local parts = {}
local function ins(txt)
insert(parts, txt)
end
if with_article and args.a then
ins(args.a .. " ")
end
local max_holonym = 0
for _, order in ipairs(place_desc.order) do
local segment_type, segment = order.type, order.value
if segment_type == "raw" then
ins(segment)
elseif segment_type == "placetype" then
ins(get_placetype_description(segment))
elseif segment_type == "qualifier" then
ins(get_qualifier_description(segment))
elseif segment_type == "holonym" then
ins(format_holonym(place_desc, segment, false))
if segment > max_holonym then
max_holonym = segment
end
else
internal_error("Unrecognized segment type %s", segment_type)
end
end
if place_desc.holonyms and max_holonym < #place_desc.holonyms then
local holonym_no_prefix = true
for holonym_index = max_holonym + 1, #place_desc.holonyms do
ins(format_holonym_in_context(nil, place_desc, holonym_index, holonym_no_prefix))
holonym_no_prefix = false
end
end
return concat(parts)
end
-- Return a string with the gloss (the description of the place itself, as opposed to translations). If `ucfirst` is
-- given, the gloss's first letter is made upper case. If `sentence_style` is given, the "extra info" (modern name,
-- capital, largest city, etc.) is displayed as separated sentences; otherwise, it is displayed separated from the main
-- definition by semicolons.
local function get_display_form(data)
local overall_place_spec, ucfirst, sentence_style, drop_extra_info, extra_info_overridden_set, from_tcl =
data.overall_place_spec, data.ucfirst, data.sentence_style, data.drop_extra_info,
data.extra_info_overridden_set, data.from_tcl
local args = overall_place_spec.args
local parts = {}
local function ins(txt)
table.insert(parts, txt)
end
if overall_place_spec.directives and overall_place_spec.directives[1] then
for i, directive_terms in ipairs(overall_place_spec.directives) do
ins(directive_terms.pretext)
if directive_terms.pretext ~= "" then
ucfirst = false
end
if not args.def or args.def == "-" then
ins(format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl))
ucfirst = false
if i == #overall_place_spec.directives and directive_terms.posttext then
ins(directive_terms.posttext)
end
end
end
end
if args.def == "-" then
return concat(parts)
end
if args.def then
if args.def:find("<<") then
local def_desc = export.parse_new_style_place_desc(args.def, args[1])
ins(export.format_new_style_place_desc_for_display({}, def_desc, false))
else
ins(args.def)
end
else
local include_article = true
for n, desc in ipairs(overall_place_spec.descs) do
if desc.order then
ins(export.format_new_style_place_desc_for_display(args, desc, n == 1))
else
ins(format_old_style_place_desc_for_display(args, desc, n, include_article, ucfirst))
end
if desc.joiner then
ins(desc.joiner)
end
include_article = desc.include_following_article
ucfirst = false
end
end
local addl = args.addl
if addl then
if addl:find("^[;:]") then
ins(addl)
elseif addl:find("^_") then
ins(" " .. addl:sub(2))
else
ins(", " .. addl)
end
end
for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
-- Include a given extra info term either when
-- (1) drop_extra_info not set (it's set by {{tcl}}), or
-- (2) the extra info term is marked as "display even when dropped" (e.g. modern= or full=, to help understand
-- the term's sense), or
-- (3) the term was overridden by a `place_*=` setting in {{tcl}}.
if not drop_extra_info or extra_info_terms.spec.display_even_when_dropped or
extra_info_overridden_set and extra_info_overridden_set[extra_info_terms.spec.arg] then
ins(format_extra_info(overall_place_spec, extra_info_terms, sentence_style))
end
end
return concat(parts)
end
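--[=[
A minimal standalone sketch of how the `addl=` text is attached to the definition in get_display_form() above. The
function name `format_addl` is hypothetical and not part of this module; it only mirrors the three branches (leading
`;` or `:`, leading `_`, and the default comma case).

```lua
-- Hypothetical helper: return the text that would be appended for a given addl= value.
local function format_addl(addl)
	if addl:find("^[;:]") then
		-- Text starting with a semicolon or colon is appended directly, with no separator.
		return addl
	elseif addl:find("^_") then
		-- A leading underscore is replaced by a single space.
		return " " .. addl:sub(2)
	else
		-- Anything else is separated from the definition by a comma and a space.
		return ", " .. addl
	end
end
```

Under these assumptions, `format_addl("_in the far north")` yields the text with a leading space, while plain text
such as `"founded in 1850"` is attached after a comma.
]=]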
-- Return the definition line.
local function get_def(data)
local overall_place_spec, from_tcl, drop_extra_info, extra_info_overridden_set, translation_follows =
data.overall_place_spec, data.from_tcl, data.drop_extra_info, data.extra_info_overridden_set,
data.translation_follows
local args = overall_place_spec.args
local sentence_style = overall_place_spec.lang:getCode() == "en"
local ucfirst = sentence_style and not args.nocap
if #args.t > 0 then
local gloss = get_display_form {
overall_place_spec = overall_place_spec,
ucfirst = false,
sentence_style = false,
drop_extra_info = drop_extra_info,
extra_info_overridden_set = extra_info_overridden_set,
from_tcl = from_tcl,
}
if from_tcl and not args.tcl_nolc then
gloss = m_strutils.lcfirst(gloss)
end
if translation_follows then
return (gloss == "" and "" or gloss .. ": ") .. get_translations(args.t, args.tid)
else
return get_translations(args.t, args.tid) .. (gloss == "" and "" or " (" .. gloss .. ")")
end
else
return get_display_form {
overall_place_spec = overall_place_spec,
ucfirst = ucfirst,
sentence_style = sentence_style,
drop_extra_info = drop_extra_info,
extra_info_overridden_set = extra_info_overridden_set,
from_tcl = from_tcl,
}
end
end
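--[=[
A minimal standalone sketch of how get_def() above combines a gloss with translations. The function name
`combine_gloss_and_translations` is hypothetical, and `translations` stands in for the already formatted string
returned by get_translations(); only the ordering logic is reproduced here.

```lua
-- Hypothetical helper: order the gloss and the translations as get_def() does.
local function combine_gloss_and_translations(gloss, translations, translation_follows)
	if translation_follows then
		-- Gloss first, then a colon, then the translations.
		return (gloss == "" and "" or gloss .. ": ") .. translations
	else
		-- Translations first, with the gloss in parentheses afterwards.
		return translations .. (gloss == "" and "" or " (" .. gloss .. ")")
	end
end
```

An empty gloss is dropped in both orderings, so no stray colon or empty parentheses are produced.
]=]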
---------- Functions for the category wikicode
-- The code in this section finds the categories to which a given place belongs. See comment at top of file.
--[=[
Find the appropriate category specs for a given place description and placetype. For example, for the template
invocation {{tl|place|en|city/and/county|s/Pennsylvania|c/US}}, which results in the place description
```
{
placetypes = {"city", "and", "county"},
holonyms = {
{placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
{placetype = "country", display_placename = "United States", unlinked_placename = "United States"},
},
holonyms_by_placetype = {
state = {"Pennsylvania"},
country = {"United States"},
},
}
```
the call
```
find_placetype_cat_specs {
entry_placetype = "city",
place_desc = {
placetypes = {"city", "and", "county"},
holonyms = {
{placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
{placetype = "country", display_placename = "United States", unlinked_placename = "United States"},
},
holonyms_by_placetype = {
state = {"Pennsylvania"},
country = {"United States"},
},
},
}
```
might produce the return value
```
{
entry_placetype = "city",
cat_specs = {"Cities in Pennsylvania, USA"},
triggering_holonym = {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"},
triggering_holonym_index = 1,
}
```
See the comment at the top of the section for a description of category specs and the overall algorithm.
On entry, `data` is an object with the following fields:
* `entry_placetype`: the entry placetype (or equivalent) used to look up the category data in placetype_data,
which must have already been resolved to a placetype with an entry in `placetype_data`;
* `place_desc`: the full place description as documented at the top of the file (used only for its holonyms);
* `first_holonym_index`: the index of the first holonym to consider when iterating through the holonyms (used to
implement the `:also` holonym placetype modifier);
* `overriding_holonym`: an optional overriding holonym to use, in place of iterating through the holonyms (used to
implement categorizing other holonyms of the same type as the triggering holonym, so that e.g.
{{tl|place|en|river|s/Kansas,Nebraska}}, or equivalently {{tl|place|en|river|s/Kansas|and|s/Nebraska}}, works);
* `from_demonym`: we are called from {{tl|demonym-noun}} or {{tl|demonym-adj}} instead of {{tl|place}}, and should
generate categories appropriate to those templates.
* `form_of_directive`: A form-of directive prefix such as `FORMER_NAME_OF`. If specified, use that type prefix to
generate categories appropriate to the form-of directive (in addition to the regular categories generated for the
{{tl|place}} invocation, which happens in a separate call).
The return value is {nil} if no category specs could be located, otherwise an object with the following fields:
* `entry_placetype`: the placetype that should be used to construct categories when `true` is one of the returned
category specs (normally the same as the `entry_placetype` passed in, but will be different when a "fallback" key
exists and is used);
* `cat_specs`: list of category specs as described above;
* `triggering_holonym`: the triggering holonym (see the comment at the top of the section), or nil if there was no
triggering holonym;
* `triggering_holonym_index`: the index of the triggering holonym in the list of holonyms in `place_desc`, or nil if
an overriding holonym was passed in or there was no triggering holonym.
]=]
local function find_placetype_cat_specs(data)
local entry_placetype, place_desc, first_holonym_index, overriding_holonym, from_demonym =
data.entry_placetype, data.place_desc, data.first_holonym_index, data.overriding_holonym, data.from_demonym
local form_of_directive = data.form_of_directive
local function fetch_cat_specs(holonym_to_match, index, no_fallback)
local holonym_placetype = holonym_to_match.placetype
if not holonym_placetype then
-- raw text in place of holonym
return nil
end
local holonym_placename = holonym_to_match.unlinked_placename
if not holonym_placename then
internal_error("Missing unlinked_placename in holonym (index %s): %s", index, holonym_to_match)
end
local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
return m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt) return m_placetypes.political_division_cat_handler {
entry_placetype = equiv_entry_pt,
holonym_placetype = equiv_holonym_pt,
holonym_placename = holonym_placename,
holonym_index = index,
place_desc = place_desc,
from_demonym = from_demonym,
} end)
end,
{no_fallback = no_fallback, form_of_directive = form_of_directive}
)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
local cat_handler, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt]
if entry_placetype_data and entry_placetype_data.cat_handler then
return entry_placetype_data.cat_handler
end
end,
{no_fallback = no_fallback, form_of_directive = form_of_directive}
)
if cat_handler then
local cat_specs = m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt) return cat_handler {
entry_placetype = equiv_entry_placetype_and_qualifier.placetype,
holonym_placetype = equiv_holonym_pt,
holonym_placename = holonym_placename,
holonym_index = index,
place_desc = place_desc,
from_demonym = from_demonym,
} end)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
end
if not no_fallback then
local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype,
function(equiv_entry_pt)
local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt]
if entry_placetype_data then
return m_placetypes.get_equiv_placetype_prop(holonym_placetype,
function(equiv_holonym_pt)
return entry_placetype_data[equiv_holonym_pt .. "/*"]
end)
end
end,
{form_of_directive = form_of_directive}
)
if cat_specs and cat_specs[1] then
return cat_specs, equiv_entry_placetype_and_qualifier.placetype
end
end
return nil
end
if overriding_holonym then
-- FIXME, change the algorithm to eliminate overriding_holonym
local cat_specs, fetched_entry_placetype = fetch_cat_specs(overriding_holonym, nil)
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = overriding_holonym,
-- no triggering_holonym_index
}
end
else
-- We loop twice over holonyms, the first time setting `no_fallback` so that we process only category specs for
-- the specifically given entry placetype (possibly with preceding qualifiers). The reason for this is to
-- correctly handle cases like [[Poblacion IX]]:
-- {{place|en|barangay|mun/Roxas|p/Capiz|c/Philippines}}.
-- "barangay" falls back to "neighborhood", and without the `no_fallback` loop, the neighborhood cat handler run
-- on the mun/Roxas holonym will take precedence over the barangay-specific setting for p/Capiz because we
-- check, for each holonym in turn, first for a matching spec through political_division_cat_handler, then a cat
-- handler, then a wildcard spec like country/*. During the first no-fallback loop, we disable checking for
-- wildcard specs because it seems a fallback matching exactly or through a cat handler on an earlier holonym
-- would be better than a wildcard match for the exact entry placetype at a later holonym. (FIXME: But I don't
-- know for sure; maybe we should check wildcard holonyms on the exact entry placetype first, or contrariwise
-- maybe we should check only exact-match holonyms through political_division_cat_handler on the exact entry
-- placetype first, not even checking other cat handlers.)
for i, holonym in ipairs(place_desc.holonyms) do
if first_holonym_index and i < first_holonym_index then
-- continue
else
local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i, "no_fallback")
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = holonym,
triggering_holonym_index = i,
}
end
end
end
for i, holonym in ipairs(place_desc.holonyms) do
if first_holonym_index and i < first_holonym_index then
-- continue
else
local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i)
if cat_specs and cat_specs[1] then
return {
entry_placetype = fetched_entry_placetype,
cat_specs = cat_specs,
triggering_holonym = holonym,
triggering_holonym_index = i,
}
end
end
end
end
return nil
end
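--[=[
A minimal standalone sketch of the two-pass holonym loop used in find_placetype_cat_specs() above: every holonym is
first tried with fallbacks disabled, and only if nothing matches are all holonyms tried again with fallbacks enabled.
The names `find_first_match` and `lookup` are hypothetical; `lookup` stands in for fetch_cat_specs() and is assumed
to return a result or nil.

```lua
-- Hypothetical helper: return the first non-nil lookup result and the index of the holonym that produced it.
local function find_first_match(holonyms, lookup)
	-- Pass 1 runs with no_fallback = true, pass 2 with no_fallback = false.
	for _, no_fallback in ipairs({true, false}) do
		for i, holonym in ipairs(holonyms) do
			local result = lookup(holonym, no_fallback)
			if result then
				return result, i
			end
		end
	end
	return nil
end
```

With this ordering, an exact match on a later holonym wins over a fallback match on an earlier one, which is the
behavior the [[Poblacion IX]] example in the comment above relies on.
]=]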
-- Turn a list of category specs (see comment at section top) into the corresponding categories (minus the language
-- code prefix). The function is given the following arguments:
-- (1) `place_desc`: the place description (see comment at top of file), used when matching the triggering holonym
--     against known locations;
-- (2) `cat_data`: an object as returned by find_placetype_cat_specs(), with the fields `cat_specs` (the category
--     specs), `entry_placetype` (the entry placetype used to fetch the entry in `placetype_data`),
--     `triggering_holonym` (a holonym object used to fetch the category specs, or nil if there is no triggering
--     holonym) and `triggering_holonym_index`.
-- The return value is constructed as described in the top-of-section comment.
local function cat_specs_to_categories(place_desc, cat_data)
local all_cats = {}
local cat_specs, entry_placetype, triggering_holonym, triggering_holonym_index =
cat_data.cat_specs, cat_data.entry_placetype, cat_data.triggering_holonym, cat_data.triggering_holonym_index
if triggering_holonym then
for _, cat_spec in ipairs(cat_specs) do
local cat
if cat_spec == true then
cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") .. " " ..
m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " +++"
else
cat = cat_spec
end
if cat:find("%+%+%+") then
local group, key, spec, container_trail = m_placetypes.find_matching_holonym_location {
holonym_placetype = triggering_holonym.placetype,
holonym_placename = triggering_holonym.unlinked_placename,
holonym_index = triggering_holonym_index,
place_desc = place_desc,
}
if group then
cat = cat:gsub("%+%+%+", m_strutils.replacement_escape(m_placetypes.get_prefixed_key(key, spec)))
insert(all_cats, cat)
else
mw.log(("Unable to insert category for cat spec '%s' because holonym '%s/%s' did not match a " ..
"known location"):format(cat, triggering_holonym.placetype, triggering_holonym.unlinked_placename))
track("cant-match-holonym-for-category-spec")
end
else
insert(all_cats, cat)
end
end
else
for _, cat_spec in ipairs(cat_specs) do
local cat
if cat_spec == true then
cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst")
else
cat = cat_spec
if cat:find("%+%+%+") then
internal_error("Category %s contains +++ but there is no holonym to substitute", cat)
end
end
insert(all_cats, cat)
end
end
return all_cats
end
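--[=[
A minimal standalone sketch of the `+++` substitution performed in cat_specs_to_categories() above. Both function
names are hypothetical: `replacement_escape` is only an assumed stand-in for m_strutils.replacement_escape, and
`fill_cat_spec` takes the already resolved location name directly instead of calling
m_placetypes.find_matching_holonym_location().

```lua
-- Hypothetical stand-in: double every "%" so the text is taken literally as a gsub replacement string.
local function replacement_escape(text)
	return (text:gsub("%%", "%%%%"))
end

-- Hypothetical helper: substitute the resolved location name for the "+++" placeholder in a category spec.
local function fill_cat_spec(cat_spec, location_name)
	-- The extra parentheses drop gsub's second return value (the substitution count).
	return (cat_spec:gsub("%+%+%+", replacement_escape(location_name)))
end
```

The escaping step matters because `%` has a special meaning in gsub replacement strings; without it, a location name
containing `%` could raise an error or be altered.
]=]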
-- Return the categories (without initial lang code) that should be added to the entry, given the place description
-- (which specifies the entry placetype(s) and holonym(s); see top of file) and a particular entry placetype (e.g.
-- "city"). Note that only the holonyms from the place description are looked at, not the entry placetypes in the place
-- description.
local function get_placetype_cats(place_desc, entry_placetype, from_demonym, form_of_directive)
local cats = {}
local first_holonym_index = 1
while first_holonym_index <= #place_desc.holonyms do
-- Find the category specs (see top-of-file comment) corresponding to the holonym(s) in the place description.
local cat_data = find_placetype_cat_specs {
entry_placetype = entry_placetype,
place_desc = place_desc,
first_holonym_index = first_holonym_index,
from_demonym = from_demonym,
form_of_directive = form_of_directive,
}
-- Check if no category spec could be found.
if not cat_data then
break
end
local triggering_holonym = cat_data.triggering_holonym
if not triggering_holonym then
internal_error("find_placetype_cat_specs should have returned a triggering holonym: %s", cat_data)
end
-- Generate categories for the category specs found.
extend(cats, cat_specs_to_categories(place_desc, cat_data))
-- Also generate categories for other holonyms of the same placetype, so that e.g.
-- {{place|en|city|s/Kansas|and|s/Missouri|c/USA}} generates both [[:Category:en:Cities in Kansas, USA]] and
-- [[:Category:en:Cities in Missouri, USA]].
first_holonym_index = cat_data.triggering_holonym_index
-- Loop over non-fallback equivalent placetypes to the triggering holonym's placetype, in case it is
-- non-canonical (e.g. `cities/San Francisco`). This matches the loop over equivalent places in
-- key_holonym_into_place_desc().
local equiv_triggering_placetypes = m_placetypes.get_placetype_equivs(triggering_holonym.placetype,
{no_fallback = true})
for _, equiv in ipairs(equiv_triggering_placetypes) do
local other_holonyms_of_same_type = place_desc.holonyms_by_placetype[equiv.placetype]
if other_holonyms_of_same_type then
for _, other_placename_of_same_type in ipairs(other_holonyms_of_same_type) do
if other_placename_of_same_type ~= triggering_holonym.unlinked_placename then
local overriding_holonym = {
placetype = triggering_holonym.placetype,
unlinked_placename = other_placename_of_same_type,
}
local other_cat_data = find_placetype_cat_specs {
entry_placetype = entry_placetype,
place_desc = place_desc,
overriding_holonym = overriding_holonym,
from_demonym = from_demonym,
form_of_directive = form_of_directive,
}
if other_cat_data then
extend(cats, cat_specs_to_categories(place_desc, other_cat_data))
end
end
end
end
end
-- If there are any later-specified holonyms that had the modifier :also, try to produce categories for them
-- as well.
first_holonym_index = first_holonym_index + 1
while first_holonym_index <= #place_desc.holonyms do
if place_desc.holonyms[first_holonym_index].continue_cat_loop then
break
end
first_holonym_index = first_holonym_index + 1
end
end
if cats[1] then
return cats
end
local entry_pt_default, equiv_entry_placetype_and_qualifier =
m_placetypes.get_equiv_placetype_prop(entry_placetype, function(pt)
return m_placetypes.placetype_data[pt] and m_placetypes.placetype_data[pt].default
end,
{form_of_directive = form_of_directive})
if entry_pt_default then
return cat_specs_to_categories(place_desc, {
cat_specs = entry_pt_default,
entry_placetype = equiv_entry_placetype_and_qualifier.placetype,
-- no triggering holonym
})
end
return {}
end
--[==[
Iterate through each type of place and return a list of the categories that need to be added to the entry. The returned
categories need to be formatted using `format_cats`, as they can be either topic-style categories (by default) or
langname-style categories (if prefixed with `cln:`). The function is passed the overall place spec, which contains all
the parsed info on the {{tl|place}} call (see comment at top of file), the parsed arguments (needed for arguments
not parsed by `parse_overall_place_spec` and used primarily to add "bare categories" corresponding to toponyms for known
locations), and `from_demonym`, which is true if we're being called from {{tl|demonym-noun}} or {{tl|demonym-adj}} (in
this case, we only want certain categories added, specifically bare categories corresponding to the specified
holonym(s)).
]==]
function export.get_cats(args, overall_place_spec, from_demonym)
local cats = {}
local place_descriptions = overall_place_spec.descs
handle_category_implications(place_descriptions, m_placetypes.cat_implications)
m_placetypes.augment_holonyms_with_container(place_descriptions)
if overall_place_spec.directives then -- not necessarily when called from [[Module:demonym]]
for _, directive_terms in ipairs(overall_place_spec.directives) do
local spec_cats = directive_terms.spec.cat
if spec_cats then
if type(spec_cats) == "string" then
spec_cats = {spec_cats}
end
for _, spec_cat in ipairs(spec_cats) do
insert(cats, spec_cat)
end
end
if directive_terms.spec.type_prefix then
for _, place_desc in ipairs(place_descriptions) do
for _, placetype in ipairs(place_desc.placetypes) do
if not m_placetypes.placetype_is_ignorable(placetype) then
extend(cats, get_placetype_cats(place_desc, placetype, from_demonym,
directive_terms.spec.type_prefix))
end
end
end
end
end
end
if not from_demonym then
local bare_categories = m_placetypes.get_bare_categories(args, overall_place_spec)
extend(cats, bare_categories)
end
for _, place_desc in ipairs(place_descriptions) do
if not from_demonym then
for _, placetype in ipairs(place_desc.placetypes) do
if not m_placetypes.placetype_is_ignorable(placetype) then
extend(cats, get_placetype_cats(place_desc, placetype))
end
end
end
-- Also add generic place categories for the holonyms listed (e.g. a category like
-- [[Category:Places in Merseyside, England]]). This is handled through the special placetype "*".
extend(cats, get_placetype_cats(place_desc, "*", from_demonym))
end
if args.cat then -- not necessarily when called from [[Module:demonym]]
for _, cat in ipairs(args.cat) do
local split_cats = split_on_comma(cat)
extend(cats, split_cats)
end
end
return cats
end
-- Return the formatted category links for a list of categories, given the language object, the category names
-- (without the language prefix) and an optional sort key.
local function format_cats(lang, cats, sort_key)
local full_cats = {}
local langcode = lang:getFullCode()
for _, cat in ipairs(cats) do
-- 'cln' corresponds to {{cln}}, which generates lang-name categories like [[:Category:English abbreviations]]
-- (as opposed to topic categories like [[:Category:en:Abbreviations of states of the United States]]).
local cln_cat = cat:match("^cln:(.*)$")
if cln_cat then
insert(full_cats, lang:getFullName() .. " " .. cln_cat)
else
insert(full_cats, langcode .. ":" .. cat)
end
end
return require(utilities_module).format_categories(full_cats, lang, sort_key, nil,
force_cat or m_placetypes.get_force_cat())
end
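--[=[
A minimal standalone sketch of the prefix handling in format_cats() above. The function name `full_category_name` is
hypothetical; `langcode` and `langname` are plain strings standing in for lang:getFullCode() and lang:getFullName().

```lua
-- Hypothetical helper: build the full category name for one category returned by get_cats().
local function full_category_name(cat, langcode, langname)
	local cln_cat = cat:match("^cln:(.*)$")
	if cln_cat then
		-- Langname-style category: language name, a space, then the rest of the name.
		return langname .. " " .. cln_cat
	end
	-- Topic-style category: language code, a colon, then the category name.
	return langcode .. ":" .. cat
end
```

Only the name construction is shown; wrapping the names in category links is left to the utilities module, as in the
function above.
]=]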
----------- Main entry point
--[==[
Implementation of {{tl|place}}. Meant to be callable from another module (specifically, [[Module:transclude]]). The
single argument `data` is an object with the following fields:
* `template_args`: Raw arguments specified by {{tl|place}}, possibly modified by {{tl|tcl}}.
* `from_tcl`: True if we're being invoked from {{tl|tcl}}.
* `drop_extra_info`: True if we should drop most of the "extra info" specified using extra info arguments (capital,
largest city, etc.). Usually true when invoked from {{tl|tcl}}. Note that some extra info is still displayed even
when `drop_extra_info` is set in order to establish the context (e.g. {{para|full}} and {{para|modern}}), and any
extra info overridden at the {{tl|tcl}} level is displayed regardless.
* `extra_info_overridden_set`: Set of booleans specifying, for each extra info arg, whether it was overridden at the
{{tl|tcl}} level. This means, for example, that the values are interpreted according to the language in {{para|1}}
instead of always defaulting to English, as is the case when {{tl|place}} is called directly.
* `form_of_overridden_args`: Set of objects of the form `{new_directive = ``directive``, new_value = ``value``}` for
overriding a given form-of directive (the key) with new directive ``directive`` and new unparsed value ``value``.
Both the key and the replacing directive should be canonical. ``value`` will be parsed in the same way as a regular
form-of directive except that all specified terms are interpreted in the language specified in {{para|1}}, never in
English. This is present so that {{tl|tcl}} can be used on abbreviations like [[GDR]] and [[FYROM]], whose
equivalents in a foreign language have language-specific expansions but where the rest of the call should stay the
same.
* `translation_follows`: If true, any translation specified using t= should follow the definition, after a colon,
rather than preceding, with the definition in parens.
]==]
function export.format(data)
local template_args = data.template_args
local list_param = {list = true}
local boolean_param = {type = "boolean"}
local params = {
[1] = {required = true, type = "language", default = "und"},
[2] = {required = true, list = true},
["t"] = list_param,
["tid"] = {list = true, allow_holes = true},
["cat"] = list_param,
["nocat"] = boolean_param,
["nocap"] = boolean_param,
["sort"] = true,
["pagename"] = true, -- for testing or documentation purposes
["a"] = true,
["addl"] = true,
["def"] = true,
-- params that are only used when transcluding using {{tcl}}/{{transclude}}, to transmit information to {{tcl}}.
["tcl"] = true,
["tcl_t"] = list_param,
["tcl_tid"] = list_param,
["tcl_nolb"] = true,
["tcl_nolc"] = boolean_param,
["tcl_noextratext"] = boolean_param,
}
-- add "extra info" parameters
for _, extra_arg_spec in ipairs(export.extra_info_args) do
params[extra_arg_spec.arg] = list_param
end
-- FIXME, once we've flushed out any uses, delete the following clause. That will cause def= to be ignored.
if template_args.def == "" then
error("Cannot currently pass def= as an empty parameter; use def=- if you want to suppress the definition display")
end
local args = require("Module:parameters").process(template_args, params)
if args.a then
track("a")
if args.a:find("^[Aa]n?$") or args.a:find("^[Tt]he$") then
track("a/article")
else
error("a= can only be used to specify a definite or indefinite article (and preferably use |nocap=1 instead to get the initial letter lowercase); see especially the documentation on the [[Template:place#Mixed format|mixed format]], which can be used to add arbitrary text before the placetype")
end
end
data.args = args
local overall_place_spec = parse_overall_place_spec(data)
data.overall_place_spec = overall_place_spec
return get_def(data) .. (
args.nocat and "" or format_cats(args[1], export.get_cats(args, overall_place_spec), args.sort))
end
--[==[
Actual entry point of {{tl|place}}.
]==]
function export.show(frame)
return export.format {
template_args = frame:getParent().args,
}
end
return export
jon52s5c2fdlkwuduulsxgt074fk66m
Kategori:Perkataan dengan terjemahan bahasa Turki Usmaniyah
14
77014
281346
225608
2026-04-22T05:14:34Z
Hakimi97
2668
Hakimi97 telah memindahkan laman [[Kategori:Perkataan dengan terjemahan bahasa Turki Uthmaniyah]] ke [[Kategori:Perkataan dengan terjemahan bahasa Turki Usmaniyah]]: Tajuk salah eja
225608
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Penyelenggaraan entri bahasa Turki Usmaniyah
14
77015
281330
225609
2026-04-22T00:40:18Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Penyelenggaraan entri bahasa Turki Uthmaniyah]] ke [[Kategori:Penyelenggaraan entri bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
225609
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
tahun lompat
0
77174
281417
225940
2026-04-22T08:27:17Z
PeaceSeekers
3334
281417
wikitext
text/x-wiki
== Bahasa Melayu ==
{{Wikipedia}}
=== Takrifan ===
==== Kata nama ====
{{ms-kn|j=تاهون لومڤت}}
# Tahun dalam takwim [[Masihi]] di mana satu hari tambahan ditambah pada akhir bulan [[Februari]] (29 Februari) untuk mengimbangi waktu tambahan [[tahun suria]] berbanding takwim.
#: {{syn|ms|tahun kabisat}}
=== Terjemahan ===
{{trans-top|tahun Masihi dengan hari tambahan}}
* Afrikaans: {{t+|af|skrikkeljaar}}
* Altai:
*: Altai Selatan: {{t|alt|кату јыл}}
* Arab: {{t|ar|سَنَة كَبِيسَة}}
* Belanda: {{t+|nl|schrikkeljaar|n}}
* Breton: {{t|br|bloavezh bizeost|m}}
* Bulgaria: {{t|bg|високосна година|f}}
* Burma: {{t|my|ရက်ထပ်နှစ်}}
* Catalonia: {{t+|ca|any bixest|m}}, {{t+|ca|any bissextil|m}}, {{t+|ca|any de traspàs|m}}
* Cina:
*: Mandarin: {{t+|cmn|閏年}}
* Cornwall: {{t|kw|bledhen lamm|f}}
* Czech: {{t+|cs|přestupný rok|m}}
* Denmark: {{t+|da|skudår|n}}
* Esperanto: {{t|eo|superjaro}}
* Estonia: {{t+|et|liigaasta}}
* Faroe: {{t|fo|leypár|n}}
* Finland: {{t+|fi|karkausvuosi}}
* Gael Scotland: {{t|gd|bliadhna-leum|f}}
* Georgia: {{t|ka|ნაკიანი წელი}}, {{t|ka|ნაკიანი წელიწადი}}
* Hindi: {{t|hi|अधिवर्ष}}, {{t|hi|लीप वर्ष}}
* Hungary: {{t+|hu|szökőév}}
* Ibrani: {{t+|he|שנה מעוברת|m|tr=shaná meubéret}}
* Iceland: {{t|is|hlaupár|n}}
* Ido: {{t|io|bisextila yaro}}
* Indonesia: {{t+|id|tahun kabisat}}
* Inggeris: {{t+|en|leap year}}
* Interlingua: {{t|ia|anno bissextil}}
* Ireland: {{t|ga|bliain bhisigh|f}}
* Itali: {{t|it|anno bisestile|m}}
* Jepun: {{t+|ja|閏年|tr=じゅんねん, junnen; うるうどし, urūdoshi}}
* Jerman: {{t+|de|Schaltjahr|n}}
* Khmer: {{t|km|ឆ្នាំបង្គ្រប់}}
* Korea: {{t+|ko|윤년(閏年)}}
* Lao: {{t|lo|ປີອະທິກະສຸລະທິນ}}
* Lithuania: {{t|lt|keliamieji metai|m-p}}
* Luxembourg: {{t|lb|Schaltjoer|n}}
* Macedonia: {{t|mk|престапна година|f}}
* Malta: {{t|mt|sena biżestili}}
* Māori: {{t|mi|tau kuhurangi}}
* Minangkabau: {{t|min|tahun kabisat}}
* Mongol: {{t|mn|өндөр жил}}
* Norman: {{t|nrf|année bissextile|f}}
* Norway:
*: Bokmål: {{t|nb|skottår|n}}, {{t|nb|skuddår|n}}
*: Nynorsk: {{t|nn|skotår|n}}, {{t|nn|skottår|n}}
* Pashto: {{t|ps|د کبيسې کال|m}}
* Parsi: {{t|fa|سال انباشته|tr=sâl-e anbâšte}}, {{t+|fa|سال کبیسه|tr=sâl-e kabise}}
* Perancis: {{t+|fr|année bissextile|f}}
* Plautdietsch: {{t|pdt|Schaultjoa|n}}
* Poland: {{t+|pl|rok przestępny|m-in}}
* Portugis: {{t+|pt|ano bissexto|m}}
* Romania: {{t+|ro|an bisect|m}}, {{t|ro|an bisectil|m}}
* Rusia: {{t+|ru|високо́сный год|m}}
* Samoa: {{t|sm|puna ifo tausaga}}
* Sepanyol: {{t+|es|año bisiesto|m}}, {{t+|es|bisiesto|m}}
* Serbo-Croatia:
*: Cyril: {{t|sh|преступна го̏дина|f}}
*: Latin: {{t|sh|prestupna gȍdina|f}}, {{t|sh|prijestupna gȍdina|f}}
* Slovak: {{t|sk|priestupný rok|m}}
* Slovene: {{t+|sl|prestopno leto|n}}
* Sweden: {{t+|sv|skottår|n}}
* Tagalog: {{t|tl|taong bisyesto}}
* Tajik: {{t|tg|соли кабиса}}
* Thai: {{t|th|ปีอธิกมาส}}, {{t|th|ปีอธิกสุรทิน}}
* Turki: {{t+|tr|artık yıl}}
* Ukraine: {{t|uk|пере́ступний рік}} {{qualifier|dated}}, {{t|uk|високо́сний рік|m}}
* Urdu: {{t|ur|سال کبیسہ|tr=sāl-e kabīsā}}
* Vietnam: {{t+|vi|năm nhuận}}, {{t|vi|năm nhuần}}
* Wales: {{t|cy|blwyddyn naid}}
* Waray-Waray: {{t|war|tuig bisyesto}}
* Yiddish: {{t|yi|עיבור־יאָר|n|tr=iber-yor}}
* Yunani: {{t|el|δίσεκτο έτος|n}}
{{trans-bottom}}
=== Pautan luar ===
* {{R:PRPM}}
{{C|ms|Takwim|Tahun}}
0c726aa8llj2kgksrsgd454fmmmnswg
Kategori:Kata nama bahasa Turki Usmaniyah
14
77426
281326
226398
2026-04-22T00:38:48Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Kata nama bahasa Turki Uthmaniyah]] ke [[Kategori:Kata nama bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
226398
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Rekonstruksi:Bahasa Indo-Eropah Purba/dwóh₁
110
79527
281345
230299
2026-04-22T03:31:39Z
Hakimi97
2668
/* Terbitan */
281345
wikitext
text/x-wiki
{{reconstructed}}
==Bahasa Indo-Eropah Purba==
{{etymon|ine-pro|id=two}}
===Kata bilangan===
{{cardinalbox|ine-pro|1|2|3|*óynos|*tréyes|ord=*h₂énteros|adv=*dwís|frac=*sēmi|opt=Awalan|optx=*dwi-}}
{{head|ine-pro|kata bilangan}}<ref name="PIEPG">{{R:gem:PIEPG|page=53}}</ref>
# [[dua]], 2
====Bentuk alternatif====
* {{alt|ine-pro|*dwó|*duwó}}<ref name="PIEPG"/><ref name="LIPP">{{R:ine:LIPP|vol=2|entry=*du̯ó-, *du̯í- 'zwei (einzelne)'|page=168-174}}</ref> {{q|bentuk tak terinfleksi}}
* {{alt|ine-pro|*dwṓw}}<ref name="LIPP"/>
====Infleksi====
{{ine-decl-adj|n=d|dwó}}
====Terbitan====
* {{l|ine-pro|*dwi-||pos=kata majmuk}}
** {{desc|ine-pro|*wí|alt=*(h₁)wi-|nolang=1|unc=1}}<ref name="De Vaan"/> {{see desc}}
* {{l|ine-pro||*dwi-gʰo-}}<ref name="LIPP"/>
** {{desc|sqj-pro|*duaigā}} {{q|< {{m|ine-pro||*dwoy-gʰ-eh₂}}}}
*** {{desc|sq|degë|t=cabang (terpisah)}}
** {{desc|ine-bsl-pro|*dweigas}}
*** {{desc|sla-pro|*dvigъ|t=cabang}}
**** {{desc|sla-pro|*dvigati|t=angkat|der=1}}
** {{desc|gem-pro|*twīgą|t=cabang (terpisah); ranting}} {{see desc}}
** {{desc|grk-pro}}
*** {{desc|grc|δίχα}}, {{l|grc|διχθά}}, {{l|grc|διχο-}}, {{l|grc|διξός}}, {{l|grc|δισσός}}
* {{l|ine-pro||*dwí-ko-s}}
** {{desc|gem-pro|*twihô|der=1}}
*** {{desc|gmw-pro|*twihō|t=syak}} {{see desc}}
** {{desc|iir-pro}}
*** {{desc|inc-pro}}
**** {{desc|sa|द्विक}}
* {{l|ine-pro||*dwi-no-s}}<ref>{{R:itc:EDL|head=bis|page=72}}</ref>
** {{desc|gem-pro|*twinaz}} {{see desc}}
** ⇒ {{l|ine-pro||*dwis-no-s}}
*** {{desc|gem-pro|*twiznaz}} {{see desc}}
*** {{desc|itc-pro|alt=*dwiznos}}
**** {{desc|la|bīnus}} {{see desc}}
* {{l|ine-pro|*dwi[[*pel-#Bahasa Indo-Eropah Purba: lipat|-pl-o-s]]||double}}
** {{desc|gem-pro|*twīflaz|t=syak}} {{see desc}}
** {{desc|grk-pro|}}
*** {{desc|grc|διπλόος}}, {{l|grc|δῐπλᾱ́ς}}
***: {{desc|grc-att|δῐπλοῦς}}
***: {{desc|grc-ion|δῐπλέος}}
**** {{desc|el|διπλός}}, {{l|el|δίπλα}}
** {{desc|itc-pro|*dwiplos}}
*** {{desc|la|duplus}} {{see desc}}
* {{l|ine-pro||*dwi-pl-o-m}}
** {{desc|imy|𐊗𐊂𐊆𐊓𐊍𐊚|tr=tbiplẽ|unc=1}}
* {{l|ine-pro|*dwís|*dwí-s|pos=adverba}}
* {{l|ine-pro||*dwi-sk-}}
** {{desc|gem-pro|*twisk(j)a-|t=dua kali ganda}}
*** {{desc|osx|twisk}}
*** {{desc|goh|zuiski}}, {{l|goh|zwisk}}
**** {{desc|gmh|zwisc(h)}}
***** {{desc|de|zwischen|der=1}}
* {{l|ine-pro||*(d)wi-tyo-}}<ref name="De Vaan">{{R:itc:EDL|head=vitium|page=684}}</ref> {{q|dengan disimilasi ''*d…t'' > ''*(h₁)…t''}}
** {{desc|itc-pro|*witjom}}
*** {{desc|la|vitium}} {{see desc}}
* {{l|ine-pro||*dwoy-}}
** {{desc|hit|𒋫𒄿𒊌𒀀|tr=tāiuga|t=umur dua tahun|der=1}}
** {{desc|hyx-pro|-}}
*** {{desc|xcl|*երկե-}}
**** {{desc|xcl|երկերիւր|der=1}}
** {{l|ine-pro||*dwoy-os}}
*** {{desc|ine-bsl-pro|*dwajas}}
**** {{desc|sla-pro|*dъvojь}}
**** {{desc|lt|dveji}}
*** {{desc|grk-pro}}
**** {{desc|grc|δοιός}}
*** {{desc|iir-pro|*dwayás}}
**** {{desc|sa|द्वय|tr=dvayá}}
** {{l|ine-pro||*dwoy-om}}
*** {{desc|hit|𒋫𒀀𒀭|tr=ta-a-an|ts=tān}}<ref>{{R:hit:Kloekhorst|page=826-827|head=tān}}</ref>
*** {{desc|xlu|𔑢𔗬𔐤𔔂|tr=tu-wa-na|unc=1}}
** {{l|ine-pro||*dwoy-o-mṓi-}}
*** {{desc|hit|𒁕𒈠𒄿|tr=dam(m)ai-}}, {{l|hit|𒋫𒈠𒄿|tr=tāmai-|t=kedua, lain}}
* {{l|ine-pro|*dwey-|t=takut}}
;Pembentukan tak dikelaskan
* {{desc|gem-pro|*twīhnaz}}
====Keturunan====
* {{desc|sqj-pro|*duwō}} {{see desc}}
* Anatolia:
** {{desc|hit|𒋫|tr=ta-}}
** {{desc|xlu|𔑢𔗬|𔓯𔖩|tr=tuwa|tr2=i-zi-}}
** {{desc|imy|𐊗𐊂𐊆|tr=tbi}}
** {{desc|xlc|𐊋𐊂𐊆}}
* {{desc|hyx-pro}}
** {{desctree|xcl|երկու}}
** {{desctree|xcl|կրկին|qq=susur terbitan tepat dipertikai}}
* {{desc|ine-bsl-pro|*duwōˀ|*duōˀ}} {{see desc}}
* {{desc|cel-pro|*duwo}} {{see desc}}
* {{desc|gem-pro|*twai}} {{see desc}}
* {{desc|grk-pro|*dúwo}} {{q|< {{l|ine-pro||*duwó}}<ref>{{R:grc:Beekes|head=δύο|page=359}}</ref>}} {{see desc}}
* {{desc|iir-pro|*dwáH}} {{see desc}}
* {{desc|itc-pro|*duō}} {{see desc}}
* {{desc|ine-toc-pro}}
** {{desc|xto|wu|we}}
** {{desc|txb|wi}}
===Rujukan===
<references/>
===Bacaan lanjut===
* {{R:gem:EDPG|page=529-530}}
* {{R:ine:IEW|page=228-232}}
r17hj0wrafddgtyobd2zi3eftntgggv
Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Usmaniyah
14
80819
281331
232212
2026-04-22T00:40:48Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Uthmaniyah]] ke [[Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
232212
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Permintaan perkataan bahasa Turki Usmaniyah
14
88257
281332
248694
2026-04-22T00:40:57Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Permintaan perkataan bahasa Turki Uthmaniyah]] ke [[Kategori:Permintaan perkataan bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
248694
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kata sifat bahasa Turki Usmaniyah
14
91356
281327
252673
2026-04-22T00:39:05Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Kata sifat bahasa Turki Uthmaniyah]] ke [[Kategori:Kata sifat bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
252673
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Modul bahasa Turki Usmaniyah
14
92212
281333
253806
2026-04-22T00:41:27Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Modul bahasa Turki Uthmaniyah]] ke [[Kategori:Modul bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
253806
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:th/กกุธภัณฑ์
4
92266
281421
253881
2026-04-22T09:10:33Z
PeaceSeekers
3334
/* Sebutan */
281421
wikitext
text/x-wiki
==Bahasa Thai==
===Kata nama===
{{inti|th|kata nama}}
===Etimologi===
Daripada {{bor|th|pi|kakudhabhaṇḍa}}, daripada {{m|pi|kakudha|gloss=[[panji]] atau [[simbol]] daripada [[royalti]]}} + {{m|pi|bhaṇḍa|gloss=[[artikel]]; [[instrumen]]; [[perkakas]]}}; bersamaan dengan {{cog|th|-}} {{com|th|กกุธ|ภัณฑ์}}.
===Sebutan===
{{th-seb|กะ-กุด-ทะ-พัน}}
===Kata nama===
{{th-kn}}
# [[alat kebesaran]] [[diraja]]
## {{lb|th|khusus}} Alat-alat kebesaran diraja Thailand, ataupun disebut "Lima Regalia Diraja": Mahkota Kemenangan Besar, Pedang Kemenangan, Tongkat Diraja, Kipas Diraja (dan Cambuk), Sandal Diraja.
==== Lihat juga ====
{{top2}}
* {{l|th|เครื่องราชกกุธภัณฑ์}}
* {{l|th|เบญจราชกกุธภัณฑ์}}
{{bottom}}
<!-- ==== Terjemahan dalam bahasa lain ====
{{trans-top|lambang kerajaan}}
* Jepun: {{t|ja|[[五]][[種]][[の]][[神器]]|tr=ごしゅのじんぎ, โกะชุโนะจิงงิ}}
* Laos: {{t|lo|ກະກຸທະພັນ}}
* Inggeris: {{t|en|[[insignia]] [[of]] [[kingship]]}}, {{t+|en|regalia}}
{{trans-bottom}} -->
{{C|th|Thailand}}
m7h8jf54qij43nulcoo3rith211tyu9
281422
281421
2026-04-22T09:11:25Z
PeaceSeekers
3334
/* Kata nama */
281422
wikitext
text/x-wiki
==Bahasa Thai==
===Kata nama===
{{inti|th|kata nama}}
===Etimologi===
Daripada {{bor|th|pi|kakudhabhaṇḍa}}, daripada {{m|pi|kakudha|gloss=[[panji]] atau [[simbol]] daripada [[royalti]]}} + {{m|pi|bhaṇḍa|gloss=[[artikel]]; [[instrumen]]; [[perkakas]]}}; bersamaan dengan {{cog|th|-}} {{com|th|กกุธ|ภัณฑ์}}.
===Sebutan===
{{th-seb|กะ-กุด-ทะ-พัน}}
===Kata nama===
{{head|th|kata nama}}
# [[alat kebesaran]] [[diraja]]
## {{lb|th|khusus}} Alat-alat kebesaran diraja Thailand, ataupun disebut "Lima Regalia Diraja": Mahkota Kemenangan Besar, Pedang Kemenangan, Tongkat Diraja, Kipas Diraja (dan Cambuk), Sandal Diraja.
==== Lihat juga ====
{{top2}}
* {{l|th|เครื่องราชกกุธภัณฑ์}}
* {{l|th|เบญจราชกกุธภัณฑ์}}
{{bottom}}
<!-- ==== Terjemahan dalam bahasa lain ====
{{trans-top|lambang kerajaan}}
* Jepun: {{t|ja|[[五]][[種]][[の]][[神器]]|tr=ごしゅのじんぎ, โกะชุโนะจิงงิ}}
* Laos: {{t|lo|ກະກຸທະພັນ}}
* Inggeris: {{t|en|[[insignia]] [[of]] [[kingship]]}}, {{t+|en|regalia}}
{{trans-bottom}} -->
{{C|th|Thailand}}
4rrse4qv3fs72ug1ecj6aicju561vko
Kategori:Modul data bahasa Turki Usmaniyah
14
92877
281329
254915
2026-04-22T00:40:09Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:Modul data bahasa Turki Uthmaniyah]] ke [[Kategori:Modul data bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama
254915
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
datsi
0
94530
281399
257304
2026-04-22T07:51:05Z
PeaceSeekers
3334
/* Bahasa Chin Tedim */
281399
wikitext
text/x-wiki
== Bahasa Chin Tedim ==
=== Kata nama ===
{{head|ctd|kata nama}}
# minyak [[petrol]]
=== Etimologi ===
{{bor+|ctd|my|ဓာတ်ဆီ}}.
6mvzgeu2t3qwikymyb5wmm7zfb1lje0
Kategori:beg:Alat dapur
14
112150
281338
278314
2026-04-22T01:04:10Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:beg:Peralatan dapur]] ke [[Kategori:beg:Alat dapur]] tanpa meninggalkan lencongan: Tajuk salah eja
278314
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:hi:Fenomena atmosfera
14
113160
281341
279417
2026-04-22T01:09:20Z
PeaceSeekers
3334
PeaceSeekers telah memindahkan laman [[Kategori:hi:Kejadian atmosfera]] ke [[Kategori:hi:Fenomena atmosfera]] tanpa meninggalkan lencongan: Tajuk salah eja
279417
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Anjing
14
114822
281249
2026-04-21T13:42:29Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281249
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Anjing
14
114823
281250
2026-04-21T13:42:45Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281250
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kuda
14
114824
281251
2026-04-21T13:43:21Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281251
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kutleri
14
114825
281252
2026-04-21T13:43:25Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281252
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Equidae
14
114826
281253
2026-04-21T13:43:37Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281253
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Ungulat kuku ganjil
14
114827
281254
2026-04-21T13:43:40Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281254
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Alat dapur
14
114828
281255
2026-04-21T13:44:29Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281255
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Beri
14
114829
281256
2026-04-21T13:46:32Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281256
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Cacing
14
114830
281257
2026-04-21T13:46:35Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281257
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Hotel
14
114831
281258
2026-04-21T13:47:06Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281258
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Jenis perniagaan
14
114832
281259
2026-04-21T13:47:18Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281259
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Mata wang
14
114833
281260
2026-04-21T13:48:48Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281260
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kebersihan kesihatan
14
114834
281261
2026-04-21T13:48:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281261
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Pendengaran
14
114835
281262
2026-04-21T13:50:25Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281262
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Rangka
14
114836
281263
2026-04-21T13:50:28Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281263
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Ungulat kuku genap
14
114837
281264
2026-04-21T13:50:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281264
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Arah
14
114838
281265
2026-04-21T13:52:06Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281265
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Arca
14
114839
281266
2026-04-21T13:52:11Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281266
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Buddhisme
14
114840
281267
2026-04-21T13:52:53Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281267
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Bulan Kristian Syria
14
114841
281268
2026-04-21T13:52:58Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281268
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Buruj
14
114842
281269
2026-04-21T13:53:01Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281269
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Cengkerik dan belalang
14
114843
281270
2026-04-21T13:57:10Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281270
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Cervidae
14
114844
281271
2026-04-21T13:57:25Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281271
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Crocodilia
14
114845
281272
2026-04-21T14:00:37Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281272
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Cuti
14
114846
281273
2026-04-21T14:00:42Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281273
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Dapur
14
114847
281277
2026-04-21T14:05:16Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281277
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Dekapod
14
114848
281278
2026-04-21T14:05:19Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281278
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Demonim
14
114849
281279
2026-04-21T14:07:35Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281279
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Elipsis bahasa Hungary
14
114850
281280
2026-04-21T14:09:01Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281280
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Pemendekan bahasa Hungary
14
114851
281281
2026-04-21T14:09:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281281
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Individu
14
114852
281282
2026-04-21T14:10:31Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281282
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Ketam
14
114853
281284
2026-04-21T14:11:44Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281284
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kerakyatan
14
114854
281285
2026-04-21T14:11:49Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281285
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kecacatan
14
114855
281286
2026-04-21T14:14:40Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281286
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Katolik
14
114856
281287
2026-04-21T14:14:44Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281287
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Klimatologi
14
114857
281288
2026-04-21T14:17:07Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281288
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Krismas
14
114858
281289
2026-04-21T14:21:02Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281289
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kucing
14
114859
281290
2026-04-21T14:21:07Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281290
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Komelinid
14
114860
281291
2026-04-21T14:22:42Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281291
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Mekanisme
14
114861
281292
2026-04-21T14:24:14Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281292
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Mineral
14
114862
281293
2026-04-21T14:24:24Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281293
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Minuman beralkohol
14
114863
281294
2026-04-21T14:39:12Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281294
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Modul:languages/chars
828
114864
281319
2026-04-21T19:40:58Z
Hakimi97
2668
Mencipta laman baru dengan kandungan 'local export = {} local table = table local insert = table.insert local u = require("Module:string/char") -- UTF-8 encoded strings for some commonly-used diacritics. local c = { prime = u(0x02B9), grave = u(0x0300), acute = u(0x0301), circ = u(0x0302), -- circumflex tilde = u(0x0303), macron = u(0x0304), overline = u(0x0305), breve = u(0x0306), dotabove = u(0x0307), diaer = u(0x0308), -- diaeresis ringabove =...'
281319
Scribunto
text/plain
local export = {}
local table = table
local insert = table.insert
local u = require("Module:string/char")
-- UTF-8 encoded strings for some commonly-used diacritics.
local c = {
	prime = u(0x02B9),
	grave = u(0x0300),
	acute = u(0x0301),
	circ = u(0x0302), -- circumflex
	tilde = u(0x0303),
	macron = u(0x0304),
	overline = u(0x0305),
	breve = u(0x0306),
	dotabove = u(0x0307),
	diaer = u(0x0308), -- diaeresis
	ringabove = u(0x030A),
	dacute = u(0x030B), -- double acute
	caron = u(0x030C),
	lineabove = u(0x030D),
	dgrave = u(0x030F), -- double grave
	invbreve = u(0x0311), -- inverted breve
	turnedcommaabove = u(0x0312),
	commaabove = u(0x0313),
	revcommaabove = u(0x0314), -- reversed comma above
	dotbelow = u(0x0323),
	diaerbelow = u(0x0324), -- diaeresis below
	ringbelow = u(0x0325),
	cedilla = u(0x0327),
	ogonek = u(0x0328),
	caronbelow = u(0x032C),
	brevebelow = u(0x032E),
	macronbelow = u(0x0331),
	perispomeni = u(0x0342),
	ypogegrammeni = u(0x0345),
	CGJ = u(0x034F), -- combining grapheme joiner
	zigzag = u(0x035B),
	dbrevebelow = u(0x035C), -- double breve below
	dmacron = u(0x035E), -- double macron
	dtilde = u(0x0360), -- double tilde
	dinvbreve = u(0x0361), -- double inverted breve
	small_a = u(0x0363),
	small_e = u(0x0364),
	small_i = u(0x0365),
	small_o = u(0x0366),
	small_u = u(0x0367),
	keraia = u(0x0374),
	lowerkeraia = u(0x0375),
	tonos = u(0x0384),
	palatalization = u(0x0484),
	dasiapneumata = u(0x0485),
	psilipneumata = u(0x0486),
	kashida = u(0x0640),
	fathatan = u(0x064B),
	dammatan = u(0x064C),
	kasratan = u(0x064D),
	fatha = u(0x064E),
	damma = u(0x064F),
	kasra = u(0x0650),
	shadda = u(0x0651),
	sukun = u(0x0652),
	hamzaabove = u(0x0654),
	nunghunna = u(0x0658),
	zwarakay = u(0x0659),
	smallv = u(0x065A),
	superalef = u(0x0670),
	udatta = u(0x0951),
	anudatta = u(0x0952),
	tacute = u(0x1ACB), -- triple acute
	dsvarita = u(0x1CDA), -- double svarita
	tsvarita = u(0x1CDB), -- triple svarita
	dottedgrave = u(0x1DC0),
	dottedacute = u(0x1DC1),
	coronis = u(0x1FBD),
	psili = u(0x1FBF),
	dasia = u(0x1FEF),
	ZWNJ = u(0x200C), -- zero width non-joiner
	ZWJ = u(0x200D), -- zero width joiner
	RSQuo = u(0x2019), -- right single quote
	kavyka = u(0xA67C),
	VS01 = u(0xFE00), -- variation selector 1
	-- Punctuation for the standard_chars field.
	-- Note: characters are literal (i.e. no magic characters).
	punc = " ',-‐‑‒–—…∅◌",
	-- Range covering all diacritics.
	diacritics = u(0x300) .. "-" .. u(0x34E) ..
		u(0x350) .. "-" .. u(0x36F) ..
		u(0x1AB0) .. "-" .. u(0x1ACE) ..
		u(0x1DC0) .. "-" .. u(0x1DFF) ..
		u(0x20D0) .. "-" .. u(0x20F0) ..
		u(0xFE20) .. "-" .. u(0xFE2F),
}
-- Braille characters for the standard_chars field.
local braille = {}
for i = 0x2800, 0x28FF do
	insert(braille, u(i))
end
c.braille = table.concat(braille)
export.chars = c
-- PUA characters, generally used in sortkeys.
-- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables).
local p = {}
for i = 1, 32 do
	p[i] = u(0xF000+i-1)
end
export.puaChars = p
local cs = {}
-- Used for the default display_text and strip_diacritics for Grek, but parts also used directly by Albanian (sq).
cs["Grek-displaytext"] = {
	from = {"Þ", "þ", c.turnedcommaabove, "['ʼ" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos: used as the numeral sign in entries.
	to = {"Ϸ", "ϸ", c.revcommaabove, c.RSQuo}
}
cs["Grek-stripdiacritics"] = {
	remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow,
	from = cs["Grek-displaytext"].from,
	to = {"Ϸ", "ϸ", c.revcommaabove, "'"}
}
-- Used in the default strip_diacritics and sort_key for Cyrs, but also used directly by Old Ruthenian (zle-ort).
cs["Cyrs_remove_diacritics"] =
	c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. c.kavyka
export.chars_substitutions = cs
return export
nvo2d2djqerlm03ucvsy9n8dkv3uip8
Kategori:Ekinoderma
14
114865
281335
2026-04-22T00:42:20Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281335
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:hi:Atmosfera
14
114866
281342
2026-04-22T01:09:35Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281342
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
一生懸命
0
114868
281351
2026-04-22T06:01:01Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa Jepun== {{ja-kanjitab|いつ|しょう|けん|めい|k1=いっ|yomi=on}} ===Adverba=== {{ja-pos|adverb|いっしょうけんめい|hhira=いつしやうけんめい}} # secara sedaya upaya, secara bersungguh-sungguh #: {{ja-usex|'''一%生%懸%命'''頑%張る|'''いっ%しょう%けん%めい''' がん%ばる|berusaha '''sedaya upaya'''}} ===Etimologi=== {{ja-yoji}} daripada {{ja-r|一所懸命|いっしょけんめい}} yang merujuk kepad...'
281351
wikitext
text/x-wiki
==Bahasa Jepun==
{{ja-kanjitab|いつ|しょう|けん|めい|k1=いっ|yomi=on}}
===Adverba===
{{ja-pos|adverb|いっしょうけんめい|hhira=いつしやうけんめい}}
# secara sedaya upaya, secara bersungguh-sungguh
#: {{ja-usex|'''一%生%懸%命'''頑%張る|'''いっ%しょう%けん%めい''' がん%ばる|berusaha '''sedaya upaya'''}}
===Etimologi===
{{ja-yoji}} daripada {{ja-r|一所懸命|いっしょけんめい}}, yang merujuk kepada para [[samurai]] yang mempertaruhkan nyawa untuk melindungi wilayah pusaka.
===Sebutan===
{{ja-pron|いっしょうけんめい|acc=5|acc_ref=DJR}}
===Rujukan===
<references/>
:* {{R:Kanjipedia Kotoba|0000230600}}
{{cln|ja|yojijukugo|}}
o0960f6bsyscxv75jp8h69v1jn6rgb6
Kategori:Perkataan dieja dengan 一 dibaca sebagai いつ bahasa Jepun
14
114869
281352
2026-04-22T06:01:56Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}'
281352
wikitext
text/x-wiki
{{auto cat|kan'on}}
clmo3b09zci1t12px7gti5vw1yfsq0y
Kategori:Perkataan dieja dengan kanji dibaca sebagai いつ bahasa Jepun
14
114870
281353
2026-04-22T06:03:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281353
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan dieja dengan 懸 dibaca sebagai けん bahasa Jepun
14
114871
281354
2026-04-22T06:07:19Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}'
281354
wikitext
text/x-wiki
{{auto cat|kan'on}}
clmo3b09zci1t12px7gti5vw1yfsq0y
Kategori:Perkataan dieja dengan 懸 bahasa Jepun
14
114872
281355
2026-04-22T06:07:43Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281355
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan dieja dengan 懸 mengikut bahasa
14
114873
281356
2026-04-22T06:08:10Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281356
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
四方八方
0
114874
281357
2026-04-22T06:19:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa Jepun== {{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}} ===Adverba=== {{ja-pos|kata sifat|しほうはっぽう}} # setiap [[arah]] # setiap tempat; di [[mana-mana]] ===Rujukan=== <references/> :* {{R:Kanjipedia Kotoba|0000230600}} {{cln|ja|yojijukugo|}}'
281357
wikitext
text/x-wiki
==Bahasa Jepun==
{{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}}
===Adverba===
{{ja-pos|kata sifat|しほうはっぽう}}
# setiap [[arah]]
# setiap tempat; di [[mana-mana]]
===Rujukan===
<references/>
:* {{R:Kanjipedia Kotoba|0000230600}}
{{cln|ja|yojijukugo|}}
hsyj08f9qpv1wozk06px1hjplakdbdf
281362
281357
2026-04-22T06:34:02Z
PeaceSeekers
3334
281362
wikitext
text/x-wiki
==Bahasa Jepun==
{{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}}
===Adverba===
{{ja-pos|kata sifat|しほうはっぽう}}
# setiap [[arah]]
# setiap tempat; di [[mana-mana]]
{{cln|ja|yojijukugo|}}
emlr30jpml4lcxgnqrhlrlth3irar7f
Kategori:Perkataan dieja dengan 八 dibaca sebagai はつ bahasa Jepun
14
114875
281358
2026-04-22T06:20:22Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}'
281358
wikitext
text/x-wiki
{{auto cat|kan'on}}
clmo3b09zci1t12px7gti5vw1yfsq0y
Kategori:Perkataan dieja dengan 方 dibaca sebagai ほう bahasa Jepun
14
114876
281359
2026-04-22T06:21:06Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat|on}}'
281359
wikitext
text/x-wiki
{{auto cat|on}}
irnidilxpyzph26fxce9qlrz5zy5gor
Kategori:Yojijukugo bahasa Jepun
14
114877
281368
2026-04-22T07:02:57Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281368
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Paus
14
114878
281370
2026-04-22T07:05:00Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281370
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Setasea
14
114879
281371
2026-04-22T07:05:25Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281371
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Teh
14
114880
281372
2026-04-22T07:08:13Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281372
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:ms/Vanuatu
4
114881
281373
2026-04-22T07:14:41Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # {{place|ms|negara/dan/kepulauan|r/Melanesia|di|cont/Oceania|official=Republik Vanuatu|caplc=Port Vila}}. ===Etimologi=== Akhirnya daripada {{der|ms|bi|Vanuatu}}. ===Rujukan=== * {{R:KDP}}'
281373
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|negara/dan/kepulauan|r/Melanesia|di|cont/Oceania|official=Republik Vanuatu|caplc=Port Vila}}.
===Etimologi===
Akhirnya daripada {{der|ms|bi|Vanuatu}}.
===Rujukan===
* {{R:KDP}}
gwp4falgslrgf6i1czl6befnxapxrna
Vanuatu
0
114882
281375
2026-04-22T07:17:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281375
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Kategori:ms:Vanuatu
14
114883
281376
2026-04-22T07:18:11Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281376
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Vanuatu
14
114884
281377
2026-04-22T07:18:28Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281377
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Negara di Melanesia
14
114885
281378
2026-04-22T07:20:23Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281378
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Melayu diterbitkan daripada bahasa Bislama
14
114886
281379
2026-04-22T07:20:32Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281379
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan diterbitkan daripada bahasa Bislama mengikut bahasa
14
114887
281380
2026-04-22T07:20:46Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281380
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Negara di Melanesia
14
114888
281381
2026-04-22T07:21:30Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281381
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Melanesia
14
114889
281382
2026-04-22T07:22:07Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281382
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Melanesia
14
114890
281383
2026-04-22T07:22:15Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281383
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Oceania
14
114891
281384
2026-04-22T07:23:08Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281384
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Oceania
14
114892
281385
2026-04-22T07:23:20Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281385
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Zambia
14
114893
281387
2026-04-22T07:26:56Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281387
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Zambia
14
114894
281388
2026-04-22T07:27:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281388
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:ms/Melanesia
4
114895
281389
2026-04-22T07:32:57Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆ...'
281389
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}.
====Perkataan setara====
* {{l|ms|Mikronesia}}
* {{l|ms|Polinesia}}
===Etimologi===
Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni.
nplcym5ejel9fi34uhbtykdi20tq51u
281391
281389
2026-04-22T07:33:42Z
PeaceSeekers
3334
/* Kata nama khas */
281391
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}.
====Perkataan setara====
* {{l|ms|Mikronesia}}
* {{l|ms|Polinesia}}
===Etimologi===
Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni.
4qiw5py4uwyzr2r3w0uu8wonrefyla2
281392
281391
2026-04-22T07:34:18Z
PeaceSeekers
3334
/* Etimologi */
281392
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}.
====Perkataan setara====
* {{l|ms|Mikronesia}}
* {{l|ms|Polinesia}}
===Etimologi===
Akhirnya daripada {{der|ms|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni.
6xsv39rvj1smgi52qkwcxj8nkw92uj0
281393
281392
2026-04-22T07:34:57Z
PeaceSeekers
3334
281393
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
{{Wikipedia|lang=ms}}
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}.
====Perkataan setara====
* {{l|ms|Mikronesia}}
* {{l|ms|Polinesia}}
===Etimologi===
Akhirnya daripada {{der|ms|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni.
oa2rl5i70rxrgu3tk8sr1pi99pc69w5
Melanesia
0
114896
281390
2026-04-22T07:33:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281390
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Wikikamus:ms/Mikronesia
4
114897
281394
2026-04-22T07:40:09Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|kawasan benua|cont/Oceania|*di barat laut [[Lautan Pasifik]]}}, dengan kira-kira 2,000 buah pulau kecil. ====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Polinesia}}'
281394
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
{{Wikipedia|lang=ms}}
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|kawasan benua|cont/Oceania|*di barat laut [[Lautan Pasifik]]}}, dengan kira-kira 2,000 buah pulau kecil.
====Perkataan setara====
* {{l|ms|Melanesia}}
* {{l|ms|Polinesia}}
5mvft4vp7y4ysrjfn4yemh5e24oarxl
Mikronesia
0
114898
281395
2026-04-22T07:40:27Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281395
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Wikikamus:ms/Polinesia
4
114899
281396
2026-04-22T07:45:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}. ====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Mikronesia}} ====Terjemahan==== {{trans-top|sebahagian Oceania}} * Afrikaans: {{t|af|Polinesië}} * Albania: {{t|sq|Polinezi|f}}, {{t|sq|Pol...'
281396
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
{{Wikipedia|lang=ms}}
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}.
====Perkataan setara====
* {{l|ms|Melanesia}}
* {{l|ms|Mikronesia}}
====Terjemahan====
{{trans-top|sebahagian Oceania}}
* Afrikaans: {{t|af|Polinesië}}
* Albania: {{t|sq|Polinezi|f}}, {{t|sq|Polinezia|f}} {{qualifier|definite}}
* Amhara: {{t|am|ፖሊኔዥያ}}
* Arab: {{t|ar|بُولِينِزِيَا|f}}
* Armenia: {{t|hy|Պոլինեզիա}}
* Azeri: {{t|az|Polineziya}}
* Belanda: {{t+|nl|Polynesië|n}}
* Belarus: {{t|be|Паліне́зія|f}}, {{t|be|Палінэ́зія|f}}
* Bengali: {{t+|bn|পলিনেশিয়া}}
* Bulgaria: {{t|bg|Полине́зия|f}}
* Burma: {{t|my|ပိုလီနီးရှား}}
* Catalonia: {{t+|ca|Polinèsia|f}}
* Cherokee: {{t|chr|ᏆᎵᏂᏏᎠ|tr=qualinisia}}
* Cina:
*: Kantonis: {{t|yue|波利尼西亞|tr=bo1 lei6 nei4 sai1 aa3}}
*: Mandarin: {{t+|cmn|波利尼西亞|tr=Bōlìníxīyà}}, {{t+|cmn|玻里尼西亞|tr=Bōlǐníxīyà}} {{qualifier|Taiwan}}
* Czech: {{t+|cs|Polynésie|f}}
* Denmark: {{t|da|Polynesien|n}}
* Esperanto: {{t|eo|Polinezio}}
* Estonia: {{t|et|Polüneesia}}
* Farefare: {{t|gur|Polinesia}}
* Finland: {{t+|fi|Polynesia}}
* Galicia: {{t+|gl|Polinesia}}
* Georgia: {{t|ka|პოლინეზია}}
* Hawaii: {{t|haw|Polenekia}}
* Hindi: {{t|hi|पॉलिनेशिया|m}}
* Hungary: {{t+|hu|Polinézia}}
* Ibrani: {{t|he|פּוֹלִינֶזְיָה|f|tr=polinézya}}
* Iceland: {{t|is|Pólýnesía}}
* Indonesia: {{t|id|Polinesia}}
* Ireland: {{t|ga|Polainéis|f|alt=An Pholainéis}}
* Itali: {{t+|it|Polinesia|f}}
* Jepun: {{t+|ja|ポリネシア|tr=Porineshia}}
* Jerman: {{t+|de|Polynesien|n}}
* Kazakh: {{t+|kk|Полинезия}}
* Khmer: {{t|km|ប៉ូលីណេស៊ី}}
* Korea: {{t|ko|^폴리네시아}}
* Kurdi:
*: Kurdi Utara: {{t|kmr|Polînezya}}
* Kyrgyz: {{t+|ky|Полинезия}}
* Lao: {{t|lo|ໂປລີເນຊີ}}
* Latvia: {{t|lv|Polinēzija|f}}
* Lithuania: {{t|lt|Polinezija|f}}
* Macedonia: {{t|mk|Полинезија|f}}
* Māori: {{t|mi|Poronihia}}
* Mongol:
*: Cyril: {{t|mn|Полинези}}
* Norway:
*: Bokmål: {{t|nb|Polynesia}}
*: Nynorsk: {{t|nn|Polynesia}}
* Parsi: {{t|fa|پلینزی|tr=poli-nezi}}, {{t+|fa|پلینزی|tr=polinezi}}
* Perancis: {{t+|fr|Polynésie|f}}
* Polish: {{t+|pl|Polinezja|f}}
* Portugis: {{t+|pt|Polinésia|f}}
* Romania: {{t|ro|Polinezia|f}}
* Rusia: {{t+|ru|Полине́зия|f|tr=Polinɛ́zija}}
* Samoa: {{t|sm|Polenisia}}
* Sepanyol: {{t|es|Polinesia}}
* Serbo-Croatia:
*: Cyril: {{t|sh|Полѝне̄зија|f}}
*: Latin: {{t+|sh|Polìnēzija|f}}
* Sinhala: {{t|si|පොලිනීසියාව}}
* Slovak: {{t|sk|Polynézia|f}}
* Slovene: {{t|sl|Polinezija|f}}
* Sweden: {{t+|sv|Polynesien|n}}
* Tagalog: {{t|tl|Dampuluan}}, {{t|tl|Polynesia}}
* Tahiti: {{t|ty|Pōrīnetia}}
* Tajik: {{t|tg|Полинезия}}
* Tamil: {{t|ta|பொலினீசியா}}
* Tatar: {{t|tt|Полинезия}}
* Thai: {{t|th|พอลินีเชีย}}
* Turki: {{t+|tr|Polinezya}}
* Turkmen: {{t|tk|Polineziýa}}
* Ukraine: {{t|uk|Поліне́зія|f}}
* Urdu: {{t|ur|پولینیشیا|m|tr=polīneśiyā}}
* Uyghur: {{t|ug|پولىنېزىيە}}
* Uzbek: {{t|uz|Polineziya}}
* Vietnam: {{t|vi|Pô-li-nê-di}}, {{t|vi|Đa Đảo}} ({{t|vi|多島}})
* Volapük: {{t+|vo|Möda-Seanuäns}}
* Wales: {{t|cy|Polynesia}}
* Yiddish: {{t|yi|פּאָלינעזיע|n}}
* Yunani: {{t+|el|Πολυνησία|f}}
{{trans-bottom}}
jhbdc5sor9j19pgd4oe5xgyizm3bz65
281398
281396
2026-04-22T07:48:42Z
PeaceSeekers
3334
/* Terjemahan */
281398
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
{{Wikipedia|lang=ms}}
===Kata nama khas===
{{inti|ms|kata nama khas}}
# {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}.
====Perkataan setara====
* {{l|ms|Melanesia}}
* {{l|ms|Mikronesia}}
====Terjemahan====
{{trans-top|sebahagian Oceania}}
* Afrikaans: {{t|af|Polinesië}}
* Albania: {{t|sq|Polinezi|f}}, {{t|sq|Polinezia|f}} {{qualifier|definite}}
* Amhara: {{t|am|ፖሊኔዥያ}}
* Arab: {{t|ar|بُولِينِزِيَا|f}}
* Armenia: {{t|hy|Պոլինեզիա}}
* Azeri: {{t|az|Polineziya}}
* Belanda: {{t+|nl|Polynesië|n}}
* Belarus: {{t|be|Паліне́зія|f}}, {{t|be|Палінэ́зія|f}}
* Bengali: {{t+|bn|পলিনেশিয়া}}
* Bulgaria: {{t|bg|Полине́зия|f}}
* Burma: {{t|my|ပိုလီနီးရှား}}
* Catalonia: {{t+|ca|Polinèsia|f}}
* Cherokee: {{t|chr|ᏆᎵᏂᏏᎠ|tr=qualinisia}}
* Cina:
*: Kantonis: {{t|yue|波利尼西亞|tr=bo1 lei6 nei4 sai1 aa3}}
*: Mandarin: {{t+|cmn|波利尼西亞|tr=Bōlìníxīyà}}, {{t+|cmn|玻里尼西亞|tr=Bōlǐníxīyà}} {{qualifier|Taiwan}}
* Czech: {{t+|cs|Polynésie|f}}
* Denmark: {{t|da|Polynesien|n}}
* Esperanto: {{t|eo|Polinezio}}
* Estonia: {{t|et|Polüneesia}}
* Farefare: {{t|gur|Polinesia}}
* Finland: {{t+|fi|Polynesia}}
* Galicia: {{t+|gl|Polinesia}}
* Georgia: {{t|ka|პოლინეზია}}
* Hawaii: {{t|haw|Polenekia}}
* Hindi: {{t|hi|पॉलिनेशिया|m}}
* Hungary: {{t+|hu|Polinézia}}
* Ibrani: {{t|he|פּוֹלִינֶזְיָה|f|tr=polinézya}}
* Iceland: {{t|is|Pólýnesía}}
* Indonesia: {{t|id|Polinesia}}
* Inggeris: {{t+|en|Polynesia}}
* Ireland: {{t|ga|Polainéis|f|alt=An Pholainéis}}
* Itali: {{t+|it|Polinesia|f}}
* Jepun: {{t+|ja|ポリネシア|tr=Porineshia}}
* Jerman: {{t+|de|Polynesien|n}}
* Kazakh: {{t+|kk|Полинезия}}
* Khmer: {{t|km|ប៉ូលីណេស៊ី}}
* Korea: {{t|ko|^폴리네시아}}
* Kurdi:
*: Kurdi Utara: {{t|kmr|Polînezya}}
* Kyrgyz: {{t+|ky|Полинезия}}
* Lao: {{t|lo|ໂປລີເນຊີ}}
* Latvia: {{t|lv|Polinēzija|f}}
* Lithuania: {{t|lt|Polinezija|f}}
* Macedonia: {{t|mk|Полинезија|f}}
* Māori: {{t|mi|Poronihia}}
* Mongol:
*: Cyril: {{t|mn|Полинези}}
* Norway:
*: Bokmål: {{t|nb|Polynesia}}
*: Nynorsk: {{t|nn|Polynesia}}
* Parsi: {{t|fa|پلینزی|tr=poli-nezi}}, {{t+|fa|پلینزی|tr=polinezi}}
* Perancis: {{t+|fr|Polynésie|f}}
* Poland: {{t+|pl|Polinezja|f}}
* Portugis: {{t+|pt|Polinésia|f}}
* Romania: {{t|ro|Polinezia|f}}
* Rusia: {{t+|ru|Полине́зия|f|tr=Polinɛ́zija}}
* Samoa: {{t|sm|Polenisia}}
* Sepanyol: {{t|es|Polinesia}}
* Serbo-Croatia:
*: Cyril: {{t|sh|Полѝне̄зија|f}}
*: Latin: {{t+|sh|Polìnēzija|f}}
* Sinhala: {{t|si|පොලිනීසියාව}}
* Slovak: {{t|sk|Polynézia|f}}
* Slovene: {{t|sl|Polinezija|f}}
* Sweden: {{t+|sv|Polynesien|n}}
* Tagalog: {{t|tl|Dampuluan}}, {{t|tl|Polynesia}}
* Tahiti: {{t|ty|Pōrīnetia}}
* Tajik: {{t|tg|Полинезия}}
* Tamil: {{t|ta|பொலினீசியா}}
* Tatar: {{t|tt|Полинезия}}
* Thai: {{t|th|พอลินีเชีย}}
* Turki: {{t+|tr|Polinezya}}
* Turkmen: {{t|tk|Polineziýa}}
* Ukraine: {{t|uk|Поліне́зія|f}}
* Urdu: {{t|ur|پولینیشیا|m|tr=polīneśiyā}}
* Uyghur: {{t|ug|پولىنېزىيە}}
* Uzbek: {{t|uz|Polineziya}}
* Vietnam: {{t|vi|Pô-li-nê-di}}, {{t|vi|Đa Đảo}} ({{t|vi|多島}})
* Volapük: {{t+|vo|Möda-Seanuäns}}
* Wales: {{t|cy|Polynesia}}
* Yiddish: {{t|yi|פּאָלינעזיע|n}}
* Yunani: {{t+|el|Πολυνησία|f}}
{{trans-bottom}}
4xcl0sj7wxvfwclq6feeb3y06iow20q
Polinesia
0
114900
281397
2026-04-22T07:46:17Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281397
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Kategori:Perkataan bahasa Chin Tedim dipinjam daripada bahasa Burma
14
114901
281400
2026-04-22T07:51:20Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281400
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa Burma
14
114902
281401
2026-04-22T07:51:23Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281401
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan dipinjam daripada bahasa Burma mengikut bahasa
14
114903
281402
2026-04-22T07:51:37Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281402
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Kata pinjaman bahasa Chin Tedim
14
114904
281403
2026-04-22T07:51:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281403
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan mengikut etimologi bahasa Chin Tedim
14
114905
281404
2026-04-22T07:51:52Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281404
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa lain
14
114906
281405
2026-04-22T07:53:14Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281405
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Lolo-Burma
14
114907
281406
2026-04-22T07:53:18Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281406
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Burma-Qiang
14
114908
281407
2026-04-22T07:53:21Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281407
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Sino-Tibet
14
114909
281408
2026-04-22T07:53:24Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281408
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:ms/taiko
4
114910
281409
2026-04-22T08:04:24Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Takrifan 1=== [[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]] {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Sejenis [[gendang]] tradisional [[Jepun]]. ====Etimologi==== {{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}. {{C|ms|Alat muzik|Jepun}} ===Takrifan 2=== {{inti|{{subst:ROOTPAGENAME}}|kata na...'
281409
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Takrifan 1===
[[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]]
{{inti|ms|kata nama}}
# Sejenis [[gendang]] tradisional [[Jepun]].
====Etimologi====
{{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}.
{{C|ms|Alat muzik|Jepun}}
===Takrifan 2===
{{inti|ms|kata nama}}
# Sejenis [[penyakit]] berjangkit bawaan [[bakteria]] ''Mycobacterium leprae''; [[kusta]].
====Etimologi====
Daripada {{bor|ms|zh|-}} {{bor|ms|nan-hbl|-}} {{zh-l|癩哥|tr=thái-ko|gloss=kusta}}.
{{C|ms|Penyakit}}
===Rujukan===
* {{R:KDP}}
qbrieu35se6cssy2o08hhswm8rioafo
281415
281409
2026-04-22T08:20:28Z
PeaceSeekers
3334
/* Takrifan 1 */
281415
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Takrifan 1===
[[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]]
{{inti|ms|kata nama}}
# Sejenis [[gendang]] tradisional [[Jepun]].
====Etimologi====
{{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}.
{{C|ms|Alat muzik|Jepun}}
====Terjemahan====
{{ter-atas|gendang Jepun}}
* Inggeris: {{t+|en|taiko}}
* Jepun: {{t+|ja|太鼓|tr=taiko}}
{{ter-bawah}}
===Takrifan 2===
{{inti|ms|kata nama}}
# Sejenis [[penyakit]] berjangkit bawaan [[bakteria]] ''Mycobacterium leprae''; [[kusta]].
====Etimologi====
Daripada {{bor|ms|zh|-}} {{bor|ms|nan-hbl|-}} {{zh-l|癩哥|tr=thái-ko|gloss=kusta}}.
{{C|ms|Penyakit}}
===Rujukan===
* {{R:KDP}}
ql25gr3u1twnxsl9sfr2a8vcwloa16p
taiko
0
114911
281410
2026-04-22T08:05:07Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281410
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Kategori:ms:Jepun
14
114912
281411
2026-04-22T08:05:51Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281411
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Jepun
14
114913
281412
2026-04-22T08:06:08Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281412
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:id:Tahun
14
114914
281416
2026-04-22T08:21:40Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281416
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan dengan terjemahan bahasa Cornwall
14
114915
281418
2026-04-22T08:27:35Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281418
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Penyelenggaraan entri bahasa Cornwall
14
114916
281419
2026-04-22T08:27:46Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281419
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:ms/Bahai
4
114917
281423
2026-04-22T09:25:00Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== {{wikipedia|Bahá'í|lang=ms}} ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # Sebuah gerakan [[agama]] yang ditubuhkan oleh agamawan Iran, [[w:Baháʼu'lláh|Baháʼu'lláh]], pada abad ke-19. ====Terjemahan==== {{trans-top|agama}} * Arab: {{t|ar|الْبَهَائِيَّة|f}} * Armenia: {{t|hy|բահայականություն}} * Cina: *: Mandarin: {{t+|cmn|大同教|tr=dàtóngj...'
281423
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
{{wikipedia|Bahá'í|lang=ms}}
===Kata nama khas===
{{inti|ms|kata nama khas}}
# Sebuah gerakan [[agama]] yang ditubuhkan oleh agamawan Iran, [[w:Baháʼu'lláh|Baháʼu'lláh]], pada abad ke-19.
====Terjemahan====
{{trans-top|agama}}
* Arab: {{t|ar|الْبَهَائِيَّة|f}}
* Armenia: {{t|hy|բահայականություն}}
* Cina:
*: Mandarin: {{t+|cmn|大同教|tr=dàtóngjiào}}, {{t+|cmn|巴哈伊信仰|tr=bāhāyī xìnyǎng}}, {{t+|cmn|巴哈伊教|tr=bāhāyījiào}}
* Esperanto: {{t+|eo|Bahaa Kredo}}, {{t|eo|Bahaa Religio}}, {{t+|eo|Bahaismo}}
* Finland: {{t|fi|bahaʼi-usko}}, {{t|fi|bahai-usko}}
* Georgia: {{t|ka|ბაჰაიზმი}}, {{t|ka|ბაჰაი რელიგია}}
* Hindi: {{t|hi|बहाई धर्म|m}}
* Hungary: {{t|hu|[[bahái]] [[hit]]}}, {{t|hu|[[baháʼí]] [[hit]]}}, {{t|hu|baháizmus}}
* Ibrani: {{t|he|הָדָּת הָבָּהָאִית|f-p|tr=ha-dat ha-Baha'it}}
* Inggeris: {{t+|en|Baháʼí Faith}}
* Ireland: {{t|ga|creideamh Bahá'íoch|m}}
* Jepun: {{t|ja|バハイ教|tr=bahai-kyō}}
* Jerman: {{t|de|Bahaitum|n}}, {{t+|de|Bahaismus|m}}
* Kazakh: {{t|kk|Баһаи}}, {{t|kk|Баһаи Сенімі}}
* Khmer: {{t|km|[[ជំនឿ]][[បាហៃ]]}}
* Parsi: {{t|fa|بهائیت|tr=bahâ'iyyat}}
* Perancis: {{t+|fr|bahaïsme|m}}, {{t|fr|foi baháʼíe|f}}, {{t+|fr|béhaïsme|m}}
* Poland: {{t+|pl|bahaizm|m}}
* Portugis: {{t|pt|bahaísmo|m}}, {{t|pt|Fé Bahá'í|f}}
* Rusia: {{t+|ru|бахаи́зм|m}}, {{t|ru|бехаи́зм|m}}, {{t|ru|бахаи́|f}}
* Sepanyol: {{t+|es|bahaísmo|m}}
* Thai: {{t|th|[[ศาสนา]][[บาไฮ]]}}, {{t|th|[[ลัทธิ]][[บาไฮ]]}}, {{t|th|[[ศาสนา]][[บะฮาอี]]}}, {{t|th|[[ลัทธิ]][[บะฮาอี]]}}
* Turki: {{t+|tr|Bahâîlik}}
*: Turki Usmaniyah: {{t|ota|بهائیلك|tr=Behâîlik, Bahâîlik}}
* Urdu: {{t|ur|بہائیت|f|tr=bahāiyat}}
* Uyghur: {{t|ug|باھائىيلىك}}
* Uzbek: {{t|uz|Bahoiy Eʼtiqodi}}, {{t|uz|Bahoiylik}}
{{trans-bottom}}
===Rujukan===
* {{R:KDP}}
{{C|ms|Agama}}
gz986lgx6v5hnk20ycvufcd5lwtyefu
Bahai
0
114918
281424
2026-04-22T09:25:50Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{also|bahai}} {{wt:ms/{{PAGENAME}}}}'
281424
wikitext
text/x-wiki
{{also|bahai}}
{{wt:ms/{{PAGENAME}}}}
8tw19lsvv678dmz7j5bplaedt3182kh
Wikikamus:ms/maulhayat
4
114919
281425
2026-04-22T09:48:13Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Sejenis [[air]] yang dikatakan dapat memberi peminumnya kehidupan secara [[abadi]]. #: {{syn|ms|ainul hayat|air hayat}} ===Etimologi=== {{bor+|ms|ar|ماء الحياة}}. {{C|ms|Bahan cereka|Keabadian}} ===Rujukan=== * {{R:KDP}}'
281425
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama===
{{inti|ms|kata nama}}
# Sejenis [[air]] yang dikatakan dapat memberi peminumnya kehidupan secara [[abadi]].
#: {{syn|ms|ainul hayat|air hayat}}
===Etimologi===
{{bor+|ms|ar|ماء الحياة}}.
{{C|ms|Bahan cereka|Keabadian}}
===Rujukan===
* {{R:KDP}}
ed2fvsh88rq3fjt98y8mrsnc8udke4j
maulhayat
0
114920
281426
2026-04-22T09:48:46Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281426
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Wikikamus:ms/Saudi
4
114921
281427
2026-04-22T09:51:46Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== ===Kata sifat=== {{inti|ms|kata nama khas}} # Berkenaan negara [[Arab Saudi]]. ===Etimologi=== {{bor+|ms|ar|سُعُودِيّ}}. {{root|en|ar|س ع د}} ===Rujukan=== * {{R:KDP}} {{C|ms|Arab Saudi}}'
281427
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata sifat===
{{inti|ms|kata nama khas}}
# Berkenaan negara [[Arab Saudi]].
===Etimologi===
{{bor+|ms|ar|سُعُودِيّ}}. {{root|en|ar|س ع د}}
===Rujukan===
* {{R:KDP}}
{{C|ms|Arab Saudi}}
4e5vdnv9aqo707qx46wpi6m2udf6hsi
281429
281427
2026-04-22T09:52:33Z
PeaceSeekers
3334
281429
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata sifat===
{{inti|ms|kata sifat}}
# Berkenaan negara [[Arab Saudi]].
===Etimologi===
{{bor+|ms|ar|سُعُودِيّ}}. {{root|ms|ar|س ع د}}
===Rujukan===
* {{R:KDP}}
{{C|ms|Arab Saudi}}
punqb32jnacgbu4ig5e1r526vuakgtp
Saudi
0
114922
281428
2026-04-22T09:52:18Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281428
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Kategori:Perkataan bahasa Melayu diterbitkan daripada akar bahasa Arab س ع د
14
114923
281430
2026-04-22T09:52:55Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281430
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:ms:Arab Saudi
14
114924
281431
2026-04-22T09:52:59Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281431
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Perkataan diterbitkan daripada akar bahasa Arab س ع د
14
114925
281432
2026-04-22T09:53:10Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281432
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:Arab Saudi
14
114926
281434
2026-04-22T09:57:30Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281434
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:en/Macau scam
4
114927
281436
2026-04-22T10:16:18Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk memaksa mangsa menyalurkan sejumlah wang [[tebusan]]. ===Etimologi=== Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}. {{C|en|Jenayah}}'
281436
wikitext
text/x-wiki
==Bahasa {{bahasa|en}}==
===Kata nama===
{{inti|en|kata nama}}
# {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk memaksa mangsa menyalurkan sejumlah wang [[tebusan]].
===Etimologi===
Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}.
{{C|en|Jenayah}}
874v2vi42wbsff0lp8me4nwmjz2py6y
281440
281436
2026-04-22T10:20:30Z
PeaceSeekers
3334
/* Bahasa {{bahasa|en}} */
281440
wikitext
text/x-wiki
==Bahasa {{bahasa|en}}==
===Kata nama===
{{inti|en|kata nama}}
# {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk mendesak mangsa menyalurkan sejumlah wang [[tebusan]].
===Etimologi===
Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}.
{{C|en|Jenayah}}
8zeea9pbzda657bhg6frtpoy49ardtm
Macau scam
0
114928
281437
2026-04-22T10:17:00Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:en/{{PAGENAME}}}}'
281437
wikitext
text/x-wiki
{{wt:en/{{PAGENAME}}}}
2y33swzmyjj8jr581mnvur6xi1gpqs8
Kategori:en:Jenayah
14
114929
281438
2026-04-22T10:17:26Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281438
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Kategori:en:Undang-undang jenayah
14
114930
281439
2026-04-22T10:19:01Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{auto cat}}'
281439
wikitext
text/x-wiki
{{auto cat}}
eomzlm5v4j7ond1phrju7cnue91g5qx
Wikikamus:en/MNC
4
114931
281441
2026-04-22T10:25:47Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|en}}== ===Kata nama=== {{inti|en|kata nama}} # {{abbreviation of|en|[[multinational]] [[corporation]]}}. {{C|en|Perniagaan}}'
281441
wikitext
text/x-wiki
==Bahasa {{bahasa|en}}==
===Kata nama===
{{inti|en|kata nama}}
# {{abbreviation of|en|[[multinational]] [[corporation]]}}.
{{C|en|Perniagaan}}
809z9s7jnjotzwwmkdzsuqmdnmrp1b9
MNC
0
114932
281442
2026-04-22T10:26:28Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:en/{{PAGENAME}}}}'
281442
wikitext
text/x-wiki
{{wt:en/{{PAGENAME}}}}
2y33swzmyjj8jr581mnvur6xi1gpqs8
Wikikamus:ms/hiburan malam
4
114933
281443
2026-04-22T10:35:37Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb. #: {{syn|ms|kehidupan malam}} ===Terjemahan=== {{trans-top|hiburan}} * Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}} * Cina: *: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}} * Esperanto: {{t|eo|nokta vivo}} * Finland: {{t+|fi|yöe...'
281443
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama===
{{inti|ms|kata nama}}
# Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb.
#: {{syn|ms|kehidupan malam}}
===Terjemahan===
{{trans-top|hiburan}}
* Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}}
* Cina:
*: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}}
* Esperanto: {{t|eo|nokta vivo}}
* Finland: {{t+|fi|yöelämä}}
* Georgia: {{t|ka|ღამის ცხოვრება}}
* Itali: {{t+|en|nightlife}}
* Itali: {{t|it|vita notturna|f}}
* Jepun: {{t+|ja|夜遊び|tr=よあそび, yoasobi}}, {{t|ja|ナイトライフ|tr=naitoraifu}}
* Jerman: {{t+|de|Nachtleben|n}}
* Macedonia: {{t|mk|ноќен живот|m}}
* Perancis: {{t|fr|vie nocturne|f}}
* Poland: {{t|pl|nocne życie|n}}
* Portugis: {{t|pt|vida noturna|f}}, {{t+|pt|noite|f}}, {{t+|pt|night|f}}
* Rusia: {{t|ru|ночна́я жизнь|f}}
* Rusyn Pannonia: {{t|rsk|ноцни живот|m}}
* Sepanyol: {{t|es|vida nocturna|f}}
* Swahili: {{t|sw|maisha ya usiku}}
* Sweden: {{t+|sv|nattliv|n}}
* Turki: {{t+|tr|gece hayatı}}
* Yunani: {{t|el|νυχτερινή ζωή|f}}
{{trans-bottom}}
{{C|ms|Hiburan|Malam}}
6fyl4psmbkrc8qxs1vuxnsas8y0txqi
281445
281443
2026-04-22T10:37:37Z
PeaceSeekers
3334
/* Terjemahan */
281445
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama===
{{inti|ms|kata nama}}
# Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb.
#: {{syn|ms|kehidupan malam}}
===Terjemahan===
{{trans-top|hiburan}}
* Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}}
* Cina:
*: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}}
* Esperanto: {{t|eo|nokta vivo}}
* Finland: {{t+|fi|yöelämä}}
* Georgia: {{t|ka|ღამის ცხოვრება}}
* Inggeris: {{t+|en|nightlife}}
* Itali: {{t|it|vita notturna|f}}
* Jepun: {{t+|ja|夜遊び|tr=よあそび, yoasobi}}, {{t|ja|ナイトライフ|tr=naitoraifu}}
* Jerman: {{t+|de|Nachtleben|n}}
* Macedonia: {{t|mk|ноќен живот|m}}
* Perancis: {{t|fr|vie nocturne|f}}
* Poland: {{t|pl|nocne życie|n}}
* Portugis: {{t|pt|vida noturna|f}}, {{t+|pt|noite|f}}, {{t+|pt|night|f}}
* Rusia: {{t|ru|ночна́я жизнь|f}}
* Rusyn Pannonia: {{t|rsk|ноцни живот|m}}
* Sepanyol: {{t|es|vida nocturna|f}}
* Swahili: {{t|sw|maisha ya usiku}}
* Sweden: {{t+|sv|nattliv|n}}
* Turki: {{t+|tr|gece hayatı}}
* Yunani: {{t|el|νυχτερινή ζωή|f}}
{{trans-bottom}}
{{C|ms|Hiburan|Malam}}
lyv13nxbpscbslj89dkzvhs0kgjoc62
hiburan malam
0
114934
281444
2026-04-22T10:36:21Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281444
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r
Wikikamus:ms/Hari Bumi
4
114935
281446
2026-04-22T10:59:13Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]]. ===Terjemahan=== {{trans-top|hari peringatan alam sekitar}} * Arab: {{t|ar|يَوْم اَلْأَرْض|m}} * Cina: *: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=D...'
281446
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]].
===Terjemahan===
{{trans-top|hari peringatan alam sekitar}}
* Arab: {{t|ar|يَوْم اَلْأَرْض|m}}
* Cina:
*: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}}
* Finland: {{t|fi|[[maan]] [[päivä]]}}
* Galicia: {{t|gl|Día da Terra}}
* Georgia: {{t|ka|დედამიწის დღე}}
* Inggeris: {{t+|en|Earth Day}}
* Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}}
* Jerman: {{t|de|Tag der Erde|m}}
* Korea: {{t|ko|^지구-의 날}}
* Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}}
* Perancis: {{t|fr|Jour de la Terre|m}}
* Portugis: {{t|pt|Dia da Terra|m}}
* Rusia: {{t|ru|День Земли́|m}}
* Sepanyol: {{t|es|Día de la Tierra|m}}
* Swahili: {{t|sw|Siku ya Dunia}}
* Wales: {{t|cy|Dydd y Ddaear|m}}
{{trans-bottom}}
{{C|en|Cuti}}
f6yty16dzmtasahxnwv7iul7z4j98x6
281447
281446
2026-04-22T11:00:01Z
PeaceSeekers
3334
281447
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{inti|ms|kata nama khas}}
# [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]].
===Terjemahan===
{{trans-top|hari peringatan alam sekitar}}
* Arab: {{t|ar|يَوْم اَلْأَرْض|m}}
* Cina:
*: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}}
* Finland: {{t|fi|[[maan]] [[päivä]]}}
* Galicia: {{t|gl|Día da Terra}}
* Georgia: {{t|ka|დედამიწის დღე}}
* Inggeris: {{t+|en|Earth Day}}
* Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}}
* Jerman: {{t|de|Tag der Erde|m}}
* Korea: {{t|ko|^지구-의 날}}
* Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}}
* Perancis: {{t|fr|Jour de la Terre|m}}
* Portugis: {{t|pt|Dia da Terra|m}}
* Rusia: {{t|ru|День Земли́|m}}
* Sepanyol: {{t|es|Día de la Tierra|m}}
* Swahili: {{t|sw|Siku ya Dunia}}
* Wales: {{t|cy|Dydd y Ddaear|m}}
{{trans-bottom}}
{{C|en|Perayaan}}
2zxsrcf9rnoeigp60flg2bunope367z
281449
281447
2026-04-22T11:01:08Z
PeaceSeekers
3334
281449
wikitext
text/x-wiki
==Bahasa {{bahasa|ms}}==
===Kata nama khas===
{{ms-knk}}
# [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]].
===Terjemahan===
{{trans-top|hari peringatan alam sekitar}}
* Arab: {{t|ar|يَوْم اَلْأَرْض|m}}
* Cina:
*: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}}
* Finland: {{t|fi|[[maan]] [[päivä]]}}
* Galicia: {{t|gl|Día da Terra}}
* Georgia: {{t|ka|დედამიწის დღე}}
* Inggeris: {{t+|en|Earth Day}}
* Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}}
* Jerman: {{t|de|Tag der Erde|m}}
* Korea: {{t|ko|^지구-의 날}}
* Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}}
* Perancis: {{t|fr|Jour de la Terre|m}}
* Portugis: {{t|pt|Dia da Terra|m}}
* Rusia: {{t|ru|День Земли́|m}}
* Sepanyol: {{t|es|Día de la Tierra|m}}
* Swahili: {{t|sw|Siku ya Dunia}}
* Wales: {{t|cy|Dydd y Ddaear|m}}
{{trans-bottom}}
{{C|ms|Perayaan}}
ezwyhhpj8oo8usemya12hxqn6h0jq53
Hari Bumi
0
114936
281448
2026-04-22T11:00:36Z
PeaceSeekers
3334
Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}'
281448
wikitext
text/x-wiki
{{wt:ms/{{PAGENAME}}}}
oduz2pevfujwte0m2yicioulifapb4r