Wikikamus (mswiktionary), https://ms.wiktionary.org/wiki/Wikikamus:Laman_Utama, MediaWiki 1.46.0-wmf.24

bina (revision 281314, 2026-04-21T17:45:52Z, Hakimi97, /* Terjemahan */)

== Malay ==
===Definition===
{{ms-kn}}
# [[bangunan]] (building).
===Etymology===
{{bor+|ms|ar|بنى|tr=banā||t=to build}}, {{m|ar|بناء|tr=binā||t=construction, building}}.
===Pronunciation===
{{dewan|bi|na}}
{{AFA|ms|/bina/|/bena/}}
===Jawi script===
{{ARchar|بينا}}
===Translations===
{{ter-atas|membina}}
* Arabic: {{ARchar|يبني}} (yabne)
* Armenian: {{Armn|կառուցել}} (kaṙuc'el), {{Armn|շինել}} (šinel), {{Armn|սարքել}} (sark'el)
* Dutch: bouwen
* Czech: stavět
* Danish: bygge
* Estonian: ehitama
* Ewe: tu
* Finnish: rakentaa
* Hebrew: {{Hebr|לבנות}}
* Ido: konstruktar
* Indonesian: {{t-|id|bangun|alt=membangun}}, {{t-|id|diri|alt=mendirikan}}
* English: build
* Irish: tóg
* Italian: costruire, edificare
* Japanese: 建てる (たてる, tateru), 建設する (けんせつする, kensetsu-suru)
* German: bauen
* Korean: 만들다 /mandulda/
* Kurdish: {{KUchar|دروستکردن}}
* Latin: mūniō, munīre, munīvī, munītus
* French: construire, édifier
* Polish: budować
* Portuguese: construir
* Romanian: clădi
* Russian: {{Cyrl|строить}} (stroit)
* Scots: build, big
* Spanish: construir, edificar
* Swahili: kujenga
* Swedish: anlägga, bygga, förfärdiga, uppföra, uppresa, upprätta
{{ter-bawah}}
===Derived terms===
* binaan:
** something that is built; a building, a structure;
** 
formation.
* membina:
** to make something until it is built; to construct, to erect;
** to help in the process of growing up;
** to work to make something more advanced; to develop;
** to bring into being, to form;
** to expand;
** to bring about good, to bring benefit.
* pembinaan:
** the act of building;
** development;
** matters concerning building.
* terbina:
** built, constructed, erected;
** formed from several elements.
===Thesaurus===
; Synonyms: [[bangun]].

== Indonesian ==
* See the Malay definition.

Modul:languages (revision 281239, 2026-04-21T12:20:35Z, Hakimi97)

--[==[ intro:
This module implements fetching of language-specific information and processing text in a given language.

===Types of languages===

There are two types of languages: full languages and etymology-only languages. The essential difference is that only full languages appear in L2 headings in vocabulary entries, and hence categories like [[:Category:French nouns]] exist only for full languages. Etymology-only languages have either a full language or another etymology-only language as their parent (in the parent-child inheritance sense), and for etymology-only languages with another etymology-only language as their parent, a full language can always be derived by following the parent links upwards. For example, "Canadian French", code `fr-CA`, is an etymology-only language whose parent is the full language "French", code `fr`. An example of an etymology-only language with another etymology-only parent is "Northumbrian Old English", code `ang-nor`, which has "Anglian Old English", code `ang-ang`, as its parent; this is an etymology-only language whose parent is "Old English", code `ang`, which is a full language. (This is because Northumbrian Old English is considered a variety of Anglian Old English.) 
Sometimes the parent is the "Undetermined" language, code `und`; this is the case, for example, for "substrate" languages such as "Pre-Greek", code `qsb-grc`, and "the BMAC substrate", code `qsb-bma`. It is important to distinguish language ''parents'' from language ''ancestors''. The parent-child relationship is one of containment, i.e. if X is a child of Y, X is considered a variety of Y. On the other hand, the ancestor-descendant relationship is one of descent in time. For example, "Classical Latin", code `la-cla`, and "Late Latin", code `la-lat`, are both etymology-only languages with "Latin", code `la`, as their parents, because both of the former are varieties of Latin. However, Late Latin does *NOT* have Classical Latin as its parent because Late Latin is *not* a variety of Classical Latin; rather, it is a descendant. There is in fact a separate `ancestors` field that is used to express the ancestor-descendant relationship, and Late Latin's ancestor is given as Classical Latin. It is also important to note that sometimes an etymology-only language is actually the conceptual ancestor of its parent language. This happens, for example, with "Old Italian" (code `roa-oit`), which is an etymology-only variant of full language "Italian" (code `it`), and with "Old Latin" (code `itc-ola`), which is an etymology-only variant of Latin. In both cases, the full language has the etymology-only variant listed as an ancestor. This allows a Latin term to inherit from Old Latin using the {{tl|inh}} template (where in this template, "inheritance" refers to ancestral inheritance, i.e. inheritance in time, rather than in the parent-child sense); likewise for Italian and Old Italian. Full languages come in three subtypes: * {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the main namespace. 
There may also be reconstructed terms for the language, which are placed in the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük, among others) are also allowed in the mainspace and considered regular languages. * {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the {Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category. * {appendix-constructed}: This language is attested but does not meet the additional requirements set out for constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in the Appendix namespace, but they are not reconstructed and therefore should not have * prefixed in links. Most constructed languages are of this subtype. Both full languages and etymology-only languages have a {Language} object associated with them, which is fetched using the {getByCode} function in [[Module:languages]] to convert a language code to a {Language} object. Depending on the options supplied to this function, etymology-only languages may or may not be accepted, and family codes may be accepted (returning a {Family} object as described in [[Module:families]]). There are also separate {getByCanonicalName} functions in [[Module:languages]] and [[Module:etymology languages]] to convert a language's canonical name to a {Language} object (depending on whether the canonical name refers to a full or etymology-only language). ===Textual representations=== Textual strings belonging to a given language come in several different ''text variants'': # The ''input text'' is what the user supplies in wikitext, in the parameters to {{tl|m}}, {{tl|l}}, {{tl|ux}}, {{tl|t}}, {{tl|lang}} and the like. 
# The ''corrected input text'' is the input text with some corrections and/or normalizations applied, such as bad-character replacements for certain languages, like replacing `l` or `1` with [[palochka]] in some languages written in Cyrillic. (FIXME: This currently goes under the name ''display text'' but that will be repurposed below. Also, [[User:Surjection]] suggests renaming this to ''normalized input text'', but "normalized" is used in a different sense in [[Module:usex]].)
# The ''display text'' is the text in the form in which it will be displayed to the user. This is what appears in headwords, in usexes, in displayed internal links, etc. This can include accent marks that are removed to form the stripped display text (see below), as well as embedded bracketed links that are variously processed further. The display text is generated from the corrected input text by applying language-specific transformations; for most languages, there will be no such transformations. The general reason for having a difference between input and display text is to allow for extra information in the input text that is not displayed to the user but is sent to the transliteration module. Note that having different display and input text is currently supported only through special-casing but will be generalized. 
Examples of transformations are: (1) removing the {{cd|^}} that is used in certain East Asian (and possibly other unicameral) languages to indicate capitalization of the transliteration (which is currently special-cased); (2) for Korean, removing or otherwise processing hyphens (which is currently special-cased); (3) for Arabic, removing a ''sukūn'' diacritic placed over a ''tāʔ marbūṭa'' (like this: ةْ) to indicate that the ''tāʔ marbūṭa'' is pronounced and transliterated as /t/ instead of being silent [NOTE, NOT IMPLEMENTED YET]; (4) for Thai and Khmer, converting space-separated words to bracketed words and resolving respelling substitutions such as `[กรีน/กฺรีน]`, which indicate how to transliterate given words [NOTE, NOT IMPLEMENTED YET except in language-specific templates like {{tl|th-usex}}].
## The ''right-resolved display text'' is the result of removing brackets around one-part embedded links and resolving two-part embedded links into their right-hand components (i.e. converting two-part links into the displayed form). The process of right-resolution is what happens when you call {{cd|remove_links()}} in [[Module:links]] on some text. When applied to the display text, it produces exactly what the user sees, without any link markup.
# The ''stripped display text'' is the result of applying diacritic-stripping to the display text.
## The ''left-resolved stripped display text'' [NEED BETTER NAME] is the result of applying left-resolution to the stripped display text, i.e. similar to right-resolution but resolving two-part embedded links into their left-hand components (i.e. the linked-to page). If the display text refers to a single page, applying diacritic stripping and left-resolution produces the ''logical pagename''.
# The ''physical pagename text'' is the result of converting the stripped display text into physical page links. 
If the stripped display text contains embedded links, the left side of those links is converted into physical page links; otherwise, the entire text is considered a pagename and converted in the same fashion. The conversion does three things: (1) converts characters not allowed in pagenames into their "unsupported title" representation, e.g. {{cd|Unsupported titles/`gt`}} in place of the logical name {{cd|>}}; (2) handles certain special-cased unsupported-title logical pagenames, such as {{cd|Unsupported titles/Space}} in place of {{cd|[space]}} and {{cd|Unsupported titles/Ancient Greek dish}} in place of a very long Greek name for a gourmet dish as found in Aristophanes; (3) converts "mammoth" pagenames such as [[a]] into their appropriate split component, e.g. [[a/languages A to L]].
# The ''source translit text'' is the text as supplied to the language-specific {{cd|transliterate()}} method. The form of the source translit text may need to be language-specific, e.g. Thai and Khmer will need the corrected input text, whereas other languages may need to work off the display text. [FIXME: It's still unclear to me how embedded bracketed links are handled in the existing code.] In general, embedded links need to be right-resolved (see above), but when this happens is unclear to me [FIXME]. Some languages have a chop-up-and-paste-together scheme that sends parts of the text through the transliterate mechanism, while others (those listed with "cont" in {{cd|substitution}} in [[Module:languages/data]]) receive the full input text, but preprocessed in certain ways. (The wisdom of this is still unclear to me.)
# The ''transliterated text'' (or ''transliteration'') is the result of transliterating the source translit text. Unlike for all the other text variants except the transcribed text, it is always in the Latin script. 
# The ''transcribed text'' (or ''transcription'') is the result of transcribing the source translit text, where "transcription" here means a close approximation to the phonetic form of the language in languages (e.g. Akkadian, Sumerian, Ancient Egyptian, maybe Tibetan) that have a wide difference between the written letters and spoken form. Unlike for all the other text variants other than the transliterated text, it is always in the Latin script. Currently, the transcribed text is always supplied manually by the user; there is no such thing as a {{cd|transcribe()}} method on language objects.
# The ''sort key'' is the text used in sort keys for determining the placing of pages in categories they belong to. The sort key is generated from the pagename or a specified ''sort base'' by lowercasing, doing language-specific transformations and then uppercasing the result. If the sort base is supplied and is generated from input text, it needs to be converted to display text, have embedded links removed through right-resolution and have diacritic-stripping applied.
# There are other text variants that occur in usexes (specifically, there are normalized variants of several of the above text variants), but we can skip them for now.

The following methods exist on {Language} objects to convert between different text variants:
# {correctInputText} (currently called {makeDisplayText}): This converts input text to corrected input text.
# {stripDiacritics}: This converts to stripped display text. [FIXME: This needs some rethinking. In particular, {stripDiacritics} is sometimes called on input text, corrected input text or display text (in various paths inside of [[Module:links]], and, in the case of input text, usually from other modules). We need to make sure we don't try to convert input text to display text twice, but at the same time we need to support calling it directly on input text since so many modules do this. 
This means we need to add a parameter indicating whether the passed-in text is input, corrected input, or display text; if the former two, we call {correctInputText} ourselves.] # {logicalToPhysical}: This converts logical pagenames to physical pagenames. # {transliterate}: This appears to convert input text with embedded brackets removed into a transliteration. [FIXME: This needs some rethinking. In particular, it calls {processDisplayText} on its input, which won't work for Thai and Khmer, so we may need language-specific flags indicating whether to pass the input text directly to the language transliterate method. In addition, I'm not sure how embedded links are handled in the existing translit code; a lot of callers remove the links themselves before calling {transliterate()}, which I assume is wrong.] # {makeSortKey}: This converts display text (?) to a sort key. [FIXME: Clarify this.] ]==] local export = {} local debug_track_module = "Modul:debug/track" local etymology_languages_data_module = "Modul:etymology languages/data" local families_module = "Modul:families" local headword_page_module = "Modul:headword/page" local json_module = "Modul:JSON" local language_like_module = "Modul:language-like" local languages_data_module = "Modul:languages/data" local languages_data_patterns_module = "Modul:languages/data/patterns" local links_data_module = "Modul:links/data" local load_module = "Modul:load" local scripts_module = "Modul:scripts" local scripts_data_module = "Modul:scripts/data" local string_encode_entities_module = "Modul:string/encode entities" local string_pattern_escape_module = "Modul:string/patternEscape" local string_replacement_escape_module = "Modul:string/replacementEscape" local string_utilities_module = "Modul:string utilities" local table_module = "Modul:table" local utilities_module = "Modul:utilities" local wikimedia_languages_module = "Modul:wikimedia languages" local mw = mw local string = string local table = table local char = 
string.char local concat = table.concat local find = string.find local floor = math.floor local get_by_code -- Defined below. local get_data_module_name -- Defined below. local get_extra_data_module_name -- Defined below. local getmetatable = getmetatable local gmatch = string.gmatch local gsub = string.gsub local insert = table.insert local ipairs = ipairs local is_known_language_tag = mw.language.isKnownLanguageTag local make_object -- Defined below. local match = string.match local next = next local pairs = pairs local remove = table.remove local require = require local select = select local setmetatable = setmetatable local sub = string.sub local type = type local unstrip = mw.text.unstrip -- Loaded as needed by findBestScript. local Hans_chars local Hant_chars local function check_object(...) check_object = require(utilities_module).check_object return check_object(...) end local function debug_track(...) debug_track = require(debug_track_module) return debug_track(...) end local function decode_entities(...) decode_entities = require(string_utilities_module).decode_entities return decode_entities(...) end local function decode_uri(...) decode_uri = require(string_utilities_module).decode_uri return decode_uri(...) end local function deep_copy(...) deep_copy = require(table_module).deepCopy return deep_copy(...) end local function encode_entities(...) encode_entities = require(string_encode_entities_module) return encode_entities(...) end local function get_L2_sort_key(...) get_L2_sort_key = require(headword_page_module).get_L2_sort_key return get_L2_sort_key(...) end local function get_script(...) get_script = require(scripts_module).getByCode return get_script(...) end local function find_best_script_without_lang(...) find_best_script_without_lang = require(scripts_module).findBestScriptWithoutLang return find_best_script_without_lang(...) end local function get_family(...) get_family = require(families_module).getByCode return get_family(...) 
end local function get_plaintext(...) get_plaintext = require(utilities_module).get_plaintext return get_plaintext(...) end local function get_wikimedia_lang(...) get_wikimedia_lang = require(wikimedia_languages_module).getByCode return get_wikimedia_lang(...) end local function keys_to_list(...) keys_to_list = require(table_module).keysToList return keys_to_list(...) end local function list_to_set(...) list_to_set = require(table_module).listToSet return list_to_set(...) end local function load_data(...) load_data = require(load_module).load_data return load_data(...) end local function make_family_object(...) make_family_object = require(families_module).makeObject return make_family_object(...) end local function pattern_escape(...) pattern_escape = require(string_pattern_escape_module) return pattern_escape(...) end local function replacement_escape(...) replacement_escape = require(string_replacement_escape_module) return replacement_escape(...) end local function safe_require(...) safe_require = require(load_module).safe_require return safe_require(...) end local function shallow_copy(...) shallow_copy = require(table_module).shallowCopy return shallow_copy(...) end local function split(...) split = require(string_utilities_module).split return split(...) end local function to_json(...) to_json = require(json_module).toJSON return to_json(...) end local function u(...) u = require(string_utilities_module).char return u(...) end local function ugsub(...) ugsub = require(string_utilities_module).gsub return ugsub(...) end local function ulen(...) ulen = require(string_utilities_module).len return ulen(...) end local function ulower(...) ulower = require(string_utilities_module).lower return ulower(...) end local function umatch(...) umatch = require(string_utilities_module).match return umatch(...) end local function uupper(...) uupper = require(string_utilities_module).upper return uupper(...) end local function track(page) debug_track("languages/" .. 
page) return true end local function normalize_code(code) return load_data(languages_data_module).aliases[code] or code end local function check_inputs(self, check, default, ...) local n = select("#", ...) if n == 0 then return false end local ret = check(self, (...)) if ret ~= nil then return ret elseif n > 1 then local inputs = {...} for i = 2, n do ret = check(self, inputs[i]) if ret ~= nil then return ret end end end return default end local function make_link(self, target, display) local prefix, main if self:getFamilyCode() == "qfa-sub" then prefix, main = display:match("^(sebuah )(.*)") if not prefix then prefix, main = display:match("^(suatu )(.*)") end end return (prefix or "") .. "[[" .. target .. "|" .. (main or display) .. "]]" end -- Convert risky characters to HTML entities, which minimizes interference once returned (e.g. for "sms:a", "<!-- -->" etc.). local function escape_risky_characters(text) -- Spacing characters in isolation generally need to be escaped in order to be properly processed by the MediaWiki software. if umatch(text, "^%s*$") then return encode_entities(text, text) end return encode_entities(text, "!#%&*+/:;<=>?@[\\]_{|}") end -- Temporarily convert various formatting characters to PUA to prevent them from being disrupted by the substitution process. local function doTempSubstitutions(text, subbedChars, keepCarets, noTrim) -- Clone so that we don't insert any extra patterns into the table in package.loaded. For some reason, using require seems to keep memory use down; probably because the table is always cloned. local patterns = shallow_copy(require(languages_data_patterns_module)) if keepCarets then insert(patterns, "((\\+)%^)") insert(patterns, "((%^))") end -- Ensure any whitespace at the beginning and end is temp substituted, to prevent it from being accidentally trimmed. We only want to trim any final spaces added during the substitution process (e.g. 
by a module), which means we only do this during the first round of temp substitutions.
	if not noTrim then
		insert(patterns, "^([\128-\191\244]*(%s+))")
		insert(patterns, "((%s+)[\128-\191\244]*)$")
	end
	-- Pre-substitution, of "[[" and "]]", which makes pattern matching more accurate.
	text = gsub(text, "%f[%[]%[%[", "\1"):gsub("%f[%]]%]%]", "\2")
	local i = #subbedChars
	for _, pattern in ipairs(patterns) do
		-- Patterns ending in \0 are for things like "[[" or "]]", so the inserted PUA are treated as breaks between terms by modules that scrape info from pages.
		local term_divider
		pattern = gsub(pattern, "%z$", function(divider)
			term_divider = divider == "\0"
			return ""
		end)
		text = gsub(text, pattern, function(...)
			local m = {...}
			local m1New = m[1]
			for k = 2, #m do
				local n = i + k - 1
				subbedChars[n] = m[k]
				local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
				local byte3 = floor(n / 64) % 64 + 128
				local byte4 = n % 64 + 128
				m1New = gsub(m1New, pattern_escape(m[k]), "\244" .. char(byte2) .. char(byte3) .. char(byte4), 1)
			end
			i = i + #m - 1
			return m1New
		end)
	end
	text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
	return text, subbedChars
end

-- Reinsert any formatting that was temporarily substituted.
local function undoTempSubstitutions(text, subbedChars)
	for i = 1, #subbedChars do
		local byte2 = floor(i / 4096) % 64 + 128
		local byte3 = floor(i / 64) % 64 + 128
		local byte4 = i % 64 + 128
		text = gsub(text, "\244[" .. char(byte2) .. char(byte2+8) .. "]" .. char(byte3) .. char(byte4), replacement_escape(subbedChars[i]))
	end
	text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
	return text
end

-- Check if the raw text is an unsupported title, and if so return that. Otherwise, remove HTML entities. We do the pre-conversion to avoid loading the unsupported title list unnecessarily. 
local function checkNoEntities(self, text) local textNoEnc = decode_entities(text) if textNoEnc ~= text and load_data(links_data_module).unsupported_titles[text] then return text else return textNoEnc end end -- If no script object is provided (or if it's invalid or None), get one. local function checkScript(text, self, sc) if not check_object("script", true, sc) or sc:getCode() == "None" then return self:findBestScript(text) end return sc end local function normalize(text, sc) text = sc:fixDiscouragedSequences(text) return sc:toFixedNFD(text) end -- Subfunction of iterateSectionSubstitutions(). Process an individual chunk of text according to the specifications in -- `substitution_data`. The input parameters are all as in the documentation of iterateSectionSubstitutions() except for -- `recursed`, which is set to true if we called ourselves recursively to process a script-specific setting or -- script-wide fallback. Returns two values: the processed text and the actual substitution data used to do the -- substitutions (same as the `actual_substitution_data` return value to iterateSectionSubstitutions()). local function doSubstitutions(self, text, sc, substitution_data, data_field, function_name, recursed) -- BE CAREFUL in this function because the value at any level can be `false`, which causes no processing to be done -- and blocks any further fallback processing. local actual_substitution_data = substitution_data -- If there are language-specific substitutes given in the data module, use those. if type(substitution_data) == "table" then -- If a script is specified, run this function with the script-specific data before continuing. 
local sc_code = sc:getCode() local has_substitution_data = false if substitution_data[sc_code] ~= nil then has_substitution_data = true if substitution_data[sc_code] then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[sc_code], data_field, function_name, true) end -- Hant, Hans and Hani are usually treated the same, so add a special case to avoid having to specify each one -- separately. elseif sc_code:match("^Han") and substitution_data.Hani ~= nil then has_substitution_data = true if substitution_data.Hani then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data.Hani, data_field, function_name, true) end -- Substitution data with key 1 in the outer table may be given as a fallback. elseif substitution_data[1] ~= nil then has_substitution_data = true if substitution_data[1] then text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[1], data_field, function_name, true) end end -- Iterate over all strings in the "from" subtable, and gsub with the corresponding string in "to". We work with -- the NFD decomposed forms, as this simplifies many substitutions. if substitution_data.from then has_substitution_data = true for i, from in ipairs(substitution_data.from) do -- Normalize each loop, to ensure multi-stage substitutions work correctly. text = sc:toFixedNFD(text) text = ugsub(text, sc:toFixedNFD(from), substitution_data.to[i] or "") end end if substitution_data.remove_diacritics then has_substitution_data = true text = sc:toFixedNFD(text) -- Convert exceptions to PUA. local remove_exceptions, substitutes = substitution_data.remove_exceptions if remove_exceptions then substitutes = {} local i = 0 for _, exception in ipairs(remove_exceptions) do exception = sc:toFixedNFD(exception) text = ugsub(text, exception, function(m) i = i + 1 local subst = u(0x80000 + i) substitutes[subst] = m return subst end) end end -- Strip diacritics. text = ugsub(text, "[" .. 
substitution_data.remove_diacritics .. "]", "") -- Convert exceptions back. if remove_exceptions then text = text:gsub("\242[\128-\191]*", substitutes) end end if not has_substitution_data and sc._data[data_field] then -- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.). text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field, function_name, true) end elseif type(substitution_data) == "string" then -- If there is a dedicated function module, use that. local module = safe_require("Modul:" .. substitution_data) if module then -- TODO: translit functions should take objects, not codes. -- TODO: translit functions should be called with form NFD. if function_name == "tr" then if not module[function_name] then error(("Internal error: Module [[%s]] has no function named 'tr'"):format(substitution_data)) end text = module[function_name](text, self._code, sc:getCode()) elseif function_name == "stripDiacritics" then -- FIXME, get rid of this arm after renaming makeEntryName -> stripDiacritics. if module[function_name] then text = module[function_name](sc:toFixedNFD(text), self, sc) elseif module.makeEntryName then text = module.makeEntryName(sc:toFixedNFD(text), self, sc) else error(("Internal error: Module [[%s]] has no function named 'stripDiacritics' or 'makeEntryName'" ):format(substitution_data)) end else if not module[function_name] then error(("Internal error: Module [[%s]] has no function named '%s'"):format( substitution_data, function_name)) end text = module[function_name](sc:toFixedNFD(text), self, sc) end else error("Substitution data '" .. substitution_data .. "' does not match an existing module.") end elseif substitution_data == nil and sc._data[data_field] then -- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.). 
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field, function_name, true) end -- Don't normalize to NFC if this is the inner loop or if a module returned nil. if recursed or not text then return text, actual_substitution_data end -- Fix any discouraged sequences created during the substitution process, and normalize into the final form. return sc:toFixedNFC(sc:fixDiscouragedSequences(text)), actual_substitution_data end -- Split the text into sections, based on the presence of temporarily substituted formatting characters, then iterate -- over each section to apply substitutions (e.g. transliteration or diacritic stripping). This avoids putting PUA -- characters through language-specific modules, which may be unequipped for them. This function is passed the following -- values: -- * `self` (the Language object); -- * `text` (the text to process); -- * `sc` (the script of the text, which must be specified; callers should call checkScript() as needed to autodetect the -- script of the text if not given explicitly by the user); -- * `subbedChars` (an array of the same length as the text, indicating which characters have been substituted and by -- what, or {nil} if no substitutions are to happen); -- * `keepCarets` (DOCUMENT ME); -- * `substitution_data` (the data indicating which substitutions to apply, taken directly from `data_field` in the -- language's data structure in a submodule of [[Module:languages/data]]); -- * `data_field` (the data field from which `substitution_data` was fetched, such as "sort_key" or "strip_diacritics"); -- * `function_name` (the name of the function to call to do the substitution, in case `substitution_data` specifies a -- module to do the substitution); -- * `notrim` (don't trim whitespace at the edges of `text`; set when computing the sort key, because whitespace at the -- beginning of a sort key is significant and causes the resulting page to be sorted at the beginning of the category -- 
it's in). -- Returns three values: -- (1) the processed text; -- (2) the value of `subbedChars` that was passed in, possibly modified with additional character substitutions; will be -- {nil} if {nil} was passed in; -- (3) the actual substitution data that was used to apply substitutions to `text`; this may be different from the value -- of `substitution_data` passed in if that value recursively specified script-specific substitutions or if no -- substitution data could be found in the language-specific data (e.g. {nil} was passed in or a structure was passed -- in that had no setting for the script given in `sc`), but a script-wide fallback value was set; currently it is -- only used by makeSortKey(). local function iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, substitution_data, data_field, function_name, notrim) local sections -- See [[Module:languages/data]]. if not find(text, "\244") or load_data(languages_data_module).substitution[self._code] == "cont" then sections = {text} else sections = split(text, "\244[\128-\143][\128-\191]*", true) end local actual_substitution_data for _, section in ipairs(sections) do -- Don't bother processing empty strings or whitespace (which may also not be handled well by dedicated -- modules). if gsub(section, "%s+", "") ~= "" then local sub, this_actual_substitution_data = doSubstitutions(self, section, sc, substitution_data, data_field, function_name) actual_substitution_data = this_actual_substitution_data -- Second round of temporary substitutions, in case any formatting was added by the main substitution -- process. However, don't do this if the section contains formatting already (as it would have had to have -- been escaped to reach this stage, and therefore should be given as raw text). if sub and subbedChars then local noSub for _, pattern in ipairs(require(languages_data_patterns_module)) do if match(section, pattern .. 
"%z?") then noSub = true end end if not noSub then sub, subbedChars = doTempSubstitutions(sub, subbedChars, keepCarets, true) end end if not sub then text = sub break end text = sub and gsub(text, pattern_escape(section), replacement_escape(sub), 1) or text end end if not notrim then -- Trim, unless there are only spacing characters, while ignoring any final formatting characters. -- Do not trim sort keys because spaces at the beginning are significant. text = text and text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2"):gsub("(%S)%s+([\128-\191\244]*)$", "%1%2") or nil end return text, subbedChars, actual_substitution_data end -- Process carets (and any escapes). Default to simple removal, if no pattern/replacement is given. local function processCarets(text, pattern, repl) local rep repeat text, rep = gsub(text, "\\\\(\\*^)", "\3%1") until rep == 0 return (text:gsub("\\^", "\4") :gsub(pattern or "%^", repl or "") :gsub("\3", "\\") :gsub("\4", "^")) end -- Remove carets if they are used to capitalize parts of transliterations (unless they have been escaped). local function removeCarets(text, sc) if not sc:hasCapitalization() and sc:isTransliterated() and text:find("^", 1, true) then return processCarets(text) else return text end end local Language = {} --[==[Returns the language code of the language. Example: {{code|lua|"fr"}} for French.]==] function Language:getCode() return self._code end --[==[Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: {{code|lua|"French"}} for French.]==] function Language:getCanonicalName() local name = self._name if name == nil then name = self._data[1] self._name = name end return name end --[==[ Return the display form of the language. 
The display form of a language, family or script is the form it takes when appearing as the <code><var>source</var></code> in categories such as <code>English terms derived from <var>source</var></code> or <code>English given names from <var>source</var></code>, and is also the displayed text in {makeCategoryLink()} links. For full and etymology-only languages, this is the same as the canonical name, but for families, it reads <code>"<var>name</var> languages"</code> (e.g. {"Indo-Iranian languages"}), and for scripts, it reads <code>"<var>name</var> script"</code> (e.g. {"Arabic script"}). ]==] function Language:getDisplayForm() local form = self._displayForm if form == nil then form = self:getCanonicalName() -- Add article and " substrate" to substrates that lack them. if self:getFamilyCode() == "qfa-sub" then if not (sub(form, 1, 7) == "sebuah " or sub(form, 1, 6) == "suatu ") then form = "suatu " .. form end if not match(form, "[Ss]ubstratum") then form = "substratum " .. form end end self._displayForm = form end return form end --[==[Returns the value which should be used in the HTML lang= attribute for tagged text in the language.]==] function Language:getHTMLAttribute(sc, region) local code = self._code if not find(code, "-", 1, true) then return code .. "-" .. sc:getCode() .. (region and "-" .. region or "") end local parent = self:getParent() region = region or match(code, "%f[%u][%u-]+%f[%U]") if parent then return parent:getHTMLAttribute(sc, region) end -- TODO: ISO family codes can also be used. return "mis-" .. sc:getCode() .. (region and "-" .. region or "") end --[==[Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. 
Example: {{code|lua|{"High German", "New High German", "Deutsch"} }} for [[:Category:German language|German]].]==] function Language:getAliases() self:loadInExtraData() return require(language_like_module).getAliases(self) end --[==[ Return a table of the known subvarieties of a given language, excluding subvarieties that have been given explicit etymology-only language codes. The names are not guaranteed to be unique, in that sometimes a given name refers to a subvariety of more than one language. Example: {{code|lua|{"Southern Aymara", "Central Aymara"} }} for [[:Category:Aymara language|Aymara]]. Note that the returned value can have nested tables in it, when a subvariety goes by more than one name. Example: {{code|lua|{"North Azerbaijani", "South Azerbaijani", {"Afshar", "Afshari", "Afshar Azerbaijani", "Afchar"}, {"Qashqa'i", "Qashqai", "Kashkay"}, "Sonqor"} }} for [[:Category:Azerbaijani language|Azerbaijani]]. Here, for example, Afshar, Afshari, Afshar Azerbaijani and Afchar all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value with nested tables in it, specify a non-{{code|lua|nil}} value for the <code>flatten</code> parameter; in that case, the return value would be {{code|lua|{"North Azerbaijani", "South Azerbaijani", "Afshar", "Afshari", "Afshar Azerbaijani", "Afchar", "Qashqa'i", "Qashqai", "Kashkay", "Sonqor"} }}. ]==] function Language:getVarieties(flatten) self:loadInExtraData() return require(language_like_module).getVarieties(self, flatten) end --[==[Returns a table of the "other names" that the language is known by, which are listed in the <code>otherNames</code> field. It should be noted that the <code>otherNames</code> field itself is deprecated, and entries listed there should eventually be moved to either <code>aliases</code> or <code>varieties</code>.]==] function Language:getOtherNames() -- To be eventually removed, once there are no more uses of the `otherNames` field. 
self:loadInExtraData() return require(language_like_module).getOtherNames(self) end --[==[ Return a combined table of the canonical name, aliases, varieties and other names of a given language.]==] function Language:getAllNames() self:loadInExtraData() return require(language_like_module).getAllNames(self) end --[==[Returns a table of types as a lookup table (with the types as keys). The possible types are * {language}: This is a language, either full or etymology-only. * {full}: This is a "full" (not etymology-only) language, i.e. the union of {regular}, {reconstructed} and {appendix-constructed}. Note that the types {full} and {etymology-only} also exist for families, so if you want to check specifically for a full language and you have an object that might be a family, you should use {{lua|hasType("language", "full")}} and not simply {{lua|hasType("full")}}. * {etymology-only}: This is an etymology-only (not full) language, whose parent is another etymology-only language or a full language. Note that the types {full} and {etymology-only} also exist for families, so if you want to check specifically for an etymology-only language and you have an object that might be a family, you should use {{lua|hasType("language", "etymology-only")}} and not simply {{lua|hasType("etymology-only")}}. * {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the main namespace. There may also be reconstructed terms for the language, which are placed in the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük, among others) are also allowed in the mainspace and considered regular languages. * {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the {Reconstruction} namespace. 
All terms in this language are reconstructed, and must be prefixed with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category. * {appendix-constructed}: This language is attested but does not meet the additional requirements set out for constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in the Appendix namespace, but they are not reconstructed and therefore should not have * prefixed in links. ]==] function Language:getTypes() local types = self._types if types == nil then types = {language = true} if self:getFullCode() == self._code then types.full = true else types["etymology-only"] = true end for t in gmatch(self._data.type, "[^,]+") do types[t] = true end self._types = types end return types end --[==[Given a list of types as strings, returns true if the language has all of them.]==] function Language:hasType(...) Language.hasType = require(language_like_module).hasType return self:hasType(...) end --[==[Returns a table containing <code>WikimediaLanguage</code> objects (see [[Module:wikimedia languages]]), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code <code>sh</code> (Serbo-Croatian) maps to four Wikimedia codes: <code>sh</code> (Serbo-Croatian), <code>bs</code> (Bosnian), <code>hr</code> (Croatian) and <code>sr</code> (Serbian). The code for the Wikimedia language is retrieved from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used. 
If none of the available codes is actually a valid Wikimedia code, an empty table is returned.]==] function Language:getWikimediaLanguages() local wm_langs = self._wikimediaLanguageObjects if wm_langs == nil then local codes = self:getWikimediaLanguageCodes() wm_langs = {} for i = 1, #codes do wm_langs[i] = get_wikimedia_lang(codes[i]) end self._wikimediaLanguageObjects = wm_langs end return wm_langs end function Language:getWikimediaLanguageCodes() local wm_langs = self._wikimediaLanguageCodes if wm_langs == nil then wm_langs = self._data.wikimedia_codes if wm_langs then wm_langs = split(wm_langs, ",", true, true) else local code = self._code if is_known_language_tag(code) then wm_langs = {code} else -- Inherit, but only if no codes are specified in the data *and* -- the language code isn't a valid Wikimedia language code. local parent = self:getParent() wm_langs = parent and parent:getWikimediaLanguageCodes() or {} end end self._wikimediaLanguageCodes = wm_langs end return wm_langs end --[==[ Returns the name of the Wikipedia article for the language. `project` specifies the language and project to retrieve the article from, defaulting to {"enwiki"} for the English Wikipedia. Normally if specified it should be the project code for a specific-language Wikipedia e.g. "zhwiki" for the Chinese Wikipedia, but it can be any project, including non-Wikipedia ones. If the project is the English Wikipedia and the property {wikipedia_article} is present in the data module it will be used first. In all other cases, a sitelink will be generated from {:getWikidataItem} (if set). The resulting value (or lack of value) is cached so that subsequent calls are fast. If no value could be determined, and `noCategoryFallback` is {false}, {:getCategoryName} is used as fallback; otherwise, {nil} is returned. Note that if `noCategoryFallback` is {nil} or omitted, it defaults to {false} if the project is the English Wikipedia, otherwise to {true}. 
In other words, under normal circumstances, if the English Wikipedia article couldn't be retrieved, the return value will fall back to a link to the language's category, but this won't normally happen for any other project. ]==] function Language:getWikipediaArticle(noCategoryFallback, project) Language.getWikipediaArticle = require(language_like_module).getWikipediaArticle return self:getWikipediaArticle(noCategoryFallback, project) end function Language:makeWikipediaLink() return make_link(self, "w:" .. self:getWikipediaArticle(), self:getCanonicalName()) end --[==[Returns the name of the Wikimedia Commons category page for the language.]==] function Language:getCommonsCategory() Language.getCommonsCategory = require(language_like_module).getCommonsCategory return self:getCommonsCategory() end --[==[Returns the Wikidata item id for the language or <code>nil</code>. This corresponds to the second field in the data modules.]==] function Language:getWikidataItem() Language.getWikidataItem = require(language_like_module).getWikidataItem return self:getWikidataItem() end --[==[Returns a table of <code>Script</code> objects for all scripts that the language is written in. See [[Module:scripts]].]==] function Language:getScripts() local scripts = self._scriptObjects if scripts == nil then local codes = self:getScriptCodes() if codes[1] == "All" then scripts = load_data(scripts_data_module) else scripts = {} for i = 1, #codes do scripts[i] = get_script(codes[i]) end end self._scriptObjects = scripts end return scripts end --[==[Returns the table of script codes in the language's data file.]==] function Language:getScriptCodes() local scripts = self._scriptCodes if scripts == nil then scripts = self._data[4] if scripts then local codes, n = {}, 0 for code in gmatch(scripts, "[^,]+") do n = n + 1 -- Special handling of "Hants", which represents "Hani", "Hant" and "Hans" collectively.
if code == "Hants" then codes[n] = "Hani" codes[n + 1] = "Hant" codes[n + 2] = "Hans" n = n + 2 else codes[n] = code end end scripts = codes else scripts = {"None"} end self._scriptCodes = scripts end return scripts end --[==[Given some text, this function iterates through the scripts of a given language and tries to find the script that best matches the text. It returns a {{code|lua|Script}} object representing the script. If no match is found at all, it returns the {{code|lua|None}} script object.]==] function Language:findBestScript(text, forceDetect) if not text or text == "" or text == "-" then return get_script("None") end -- Differs from table returned by getScriptCodes, as Hants is not normalized into its constituents. local codes = self._bestScriptCodes if codes == nil then codes = self._data[4] codes = codes and split(codes, ",", true, true) or {"None"} self._bestScriptCodes = codes end local first_sc = codes[1] if first_sc == "All" then return find_best_script_without_lang(text) end local codes_len = #codes if not (forceDetect or first_sc == "Hants" or codes_len > 1) then first_sc = get_script(first_sc) local charset = first_sc.characters return charset and umatch(text, "[" .. charset .. "]") and first_sc or get_script("None") end -- Remove all formatting characters. text = get_plaintext(text) -- Remove all spaces and any ASCII punctuation. Some non-ASCII punctuation is script-specific, so can't be removed. text = ugsub(text, "[%s!\"#%%&'()*,%-./:;?@[\\%]_{}]+", "") if #text == 0 then return get_script("None") end -- Try to match every script against the text, -- and return the one with the most matching characters. local bestcount, bestscript, length = 0 for i = 1, codes_len do local sc = codes[i] -- Special case for "Hants", which is a special code that represents whichever of "Hant" or "Hans" best matches, or "Hani" if they match equally. This avoids having to list all three. 
In addition, "Hants" will be treated as the best match if there is at least one matching character, under the assumption that a Han script is desirable in terms that contain a mix of Han and other scripts (not counting those which use Jpan or Kore). if sc == "Hants" then local Hani = get_script("Hani") if not Hant_chars then Hant_chars = load_data("Modul:zh/data/ts") Hans_chars = load_data("Modul:zh/data/st") end local t, s, found = 0, 0 -- This is faster than using mw.ustring.gmatch directly. for ch in gmatch((ugsub(text, "[" .. Hani.characters .. "]", "\255%0")), "\255(.[\128-\191]*)") do found = true if Hant_chars[ch] then t = t + 1 if Hans_chars[ch] then s = s + 1 end elseif Hans_chars[ch] then s = s + 1 else t, s = t + 1, s + 1 end end if found then if t == s then return Hani end return get_script(t > s and "Hant" or "Hans") end else sc = get_script(sc) if not length then length = ulen(text) end -- Count characters by removing everything in the script's charset and comparing to the original length. local charset = sc.characters local count = charset and length - ulen((ugsub(text, "[" .. charset .. "]+", ""))) or 0 if count >= length then return sc elseif count > bestcount then bestcount = count bestscript = sc end end end -- Return best matching script, or otherwise None. return bestscript or get_script("None") end --[==[Returns a <code>Family</code> object for the language family that the language belongs to. See [[Module:families]].]==] function Language:getFamily() local family = self._familyObject if family == nil then family = self:getFamilyCode() -- If the value is nil, it's cached as false. family = family and get_family(family) or false self._familyObject = family end return family or nil end --[==[Returns the family code in the language's data file.]==] function Language:getFamilyCode() local family = self._familyCode if family == nil then -- If the value is nil, it's cached as false. 
family = self._data[3] or false self._familyCode = family end return family or nil end function Language:getFamilyName() local family = self._familyName if family == nil then family = self:getFamily() -- If the value is nil, it's cached as false. family = family and family:getCanonicalName() or false self._familyName = family end return family or nil end do local function check_family(self, family) if type(family) == "table" then family = family:getCode() end if self:getFamilyCode() == family then return true end local self_family = self:getFamily() if self_family:inFamily(family) then return true -- If the family isn't a real family (e.g. creoles) check any ancestors. elseif self_family:inFamily("qfa-not") then local ancestors = self:getAncestors() for _, ancestor in ipairs(ancestors) do if ancestor:inFamily(family) then return true end end end end --[==[Check whether the language belongs to `family` (which can be a family code or object). A list of objects can be given in place of `family`; in that case, return true if the language belongs to any of the specified families. Note that some languages (in particular, certain creoles) can have multiple immediate ancestors potentially belonging to different families; in that case, return true if the language belongs to any of the specified families.]==] function Language:inFamily(...) if self:getFamilyCode() == nil then return false end return check_inputs(self, check_family, false, ...) end end function Language:getParent() local parent = self._parentObject if parent == nil then parent = self:getParentCode() -- If the value is nil, it's cached as false. parent = parent and get_by_code(parent, nil, true, true) or false self._parentObject = parent end return parent or nil end function Language:getParentCode() local parent = self._parentCode if parent == nil then -- If the value is nil, it's cached as false. 
parent = self._data.parent or false self._parentCode = parent end return parent or nil end function Language:getParentName() local parent = self._parentName if parent == nil then parent = self:getParent() -- If the value is nil, it's cached as false. parent = parent and parent:getCanonicalName() or false self._parentName = parent end return parent or nil end function Language:getParentChain() local chain = self._parentChain if chain == nil then chain = {} local parent, n = self:getParent(), 0 while parent do n = n + 1 chain[n] = parent parent = parent:getParent() end self._parentChain = chain end return chain end do local function check_lang(self, lang) for _, parent in ipairs(self:getParentChain()) do if (type(lang) == "string" and lang or lang:getCode()) == parent:getCode() then return true end end end function Language:hasParent(...) return check_inputs(self, check_lang, false, ...) end end --[==[ If the language is etymology-only, this iterates through parents until a full language or family is found, and the corresponding object is returned. If the language is a full language, then it simply returns itself. ]==] function Language:getFull() local full = self._fullObject if full == nil then full = self:getFullCode() full = full == self._code and self or get_by_code(full) self._fullObject = full end return full end --[==[ If the language is an etymology-only language, this iterates through parents until a full language or family is found, and the corresponding code is returned. If the language is a full language, then it simply returns the language code. ]==] function Language:getFullCode() return self._fullCode or self._code end --[==[ If the language is an etymology-only language, this iterates through parents until a full language or family is found, and the corresponding canonical name is returned. If the language is a full language, then it simply returns the canonical name of the language. 
]==] function Language:getFullName() local full = self._fullName if full == nil then full = self:getFull():getCanonicalName() self._fullName = full end return full end --[==[Returns a table of <code class="nf">Language</code> objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==] function Language:getAncestors() local ancestors = self._ancestorObjects if ancestors == nil then ancestors = {} local ancestor_codes = self:getAncestorCodes() if #ancestor_codes > 0 then for _, ancestor in ipairs(ancestor_codes) do insert(ancestors, get_by_code(ancestor, nil, true)) end else local fam = self:getFamily() local protoLang = fam and fam:getProtoLanguage() or nil -- For the cases where the current language is the proto-language -- of its family, or an etymology-only language that is ancestral to that -- proto-language, we need to step up a level higher right from the -- start. if protoLang and ( protoLang:getCode() == self._code or (self:hasType("etymology-only") and protoLang:hasAncestor(self)) ) then fam = fam:getFamily() protoLang = fam and fam:getProtoLanguage() or nil end while not protoLang and not (not fam or fam:getCode() == "qfa-not") do fam = fam:getFamily() protoLang = fam and fam:getProtoLanguage() or nil end insert(ancestors, protoLang) end self._ancestorObjects = ancestors end return ancestors end do -- Avoid a language being its own ancestor via class inheritance. We only need to check for this if the language has inherited an ancestor table from its parent, because we never want to drop ancestors that have been explicitly set in the data. -- Recursively iterate over ancestors until we either find self or run out. If self is found, return true. 
local function check_ancestor(self, lang) local codes = lang:getAncestorCodes() if not codes then return nil end for i = 1, #codes do local code = codes[i] if code == self._code then return true end local anc = get_by_code(code, nil, true) if check_ancestor(self, anc) then return true end end end --[==[Returns a table of <code class="nf">Language</code> codes for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==] function Language:getAncestorCodes() if self._ancestorCodes then return self._ancestorCodes end local data = self._data local codes = data.ancestors if codes == nil then codes = {} self._ancestorCodes = codes return codes end codes = split(codes, ",", true, true) self._ancestorCodes = codes -- If there are no codes or the ancestors weren't inherited data, there's nothing left to check. if #codes == 0 or self:getData(false, "raw").ancestors ~= nil then return codes end local i, code = 1 while i <= #codes do code = codes[i] if check_ancestor(self, self) then remove(codes, i) else i = i + 1 end end return codes end end --[==[Given a list of language objects or codes, returns true if at least one of them is an ancestor. This includes any etymology-only children of that ancestor. If the language's ancestor(s) are etymology-only languages, it will also return true for those languages' parent(s) (e.g. if Vulgar Latin is the ancestor, it will also return true for its parent, Latin). However, a parent is excluded from this if the ancestor is also ancestral to that parent (e.g. if Classical Persian is the ancestor, Persian would return false, because Classical Persian is also ancestral to Persian).]==] function Language:hasAncestor(...)
local function iterateOverAncestorTree(node, func, parent_check) local ancestors = node:getAncestors() local ancestorsParents = {} for _, ancestor in ipairs(ancestors) do -- When checking the parents of the other language, and the ancestor is also a parent, skip to the next ancestor, so that we exclude any etymology-only children of that parent that are not directly related (see below). local ret = (parent_check or not node:hasParent(ancestor)) and func(ancestor) or iterateOverAncestorTree(ancestor, func, parent_check) if ret then return ret end end -- Check the parents of any ancestors. We don't do this if checking the parents of the other language, so that we exclude any etymology-only children of those parents that are not directly related (e.g. if the ancestor is Vulgar Latin and we are checking New Latin, we want it to return false because they are on different ancestral branches. As such, if we're already checking the parent of New Latin (Latin) we don't want to compare it to the parent of the ancestor (Latin), as this would be a false positive; it should be one or the other). 
if not parent_check then return nil end for _, ancestor in ipairs(ancestors) do local ancestorParents = ancestor:getParentChain() for _, ancestorParent in ipairs(ancestorParents) do if ancestorParent:getCode() == self._code or ancestorParent:hasAncestor(ancestor) then break else insert(ancestorsParents, ancestorParent) end end end for _, ancestorParent in ipairs(ancestorsParents) do local ret = func(ancestorParent) if ret then return ret end end end local function do_iteration(otherlang, parent_check) -- otherlang can't be self if (type(otherlang) == "string" and otherlang or otherlang:getCode()) == self._code then return false end repeat if iterateOverAncestorTree( self, function(ancestor) return ancestor:getCode() == (type(otherlang) == "string" and otherlang or otherlang:getCode()) end, parent_check ) then return true elseif type(otherlang) == "string" then otherlang = get_by_code(otherlang, nil, true) end otherlang = otherlang:getParent() parent_check = false until not otherlang end local parent_check = true for _, otherlang in ipairs{...} do local ret = do_iteration(otherlang, parent_check) if ret then return true end end return false end do local function construct_node(lang, memo) local branch, ancestors = {lang = lang:getCode()} memo[lang:getCode()] = branch for _, ancestor in ipairs(lang:getAncestors()) do if ancestors == nil then ancestors = {} end insert(ancestors, memo[ancestor:getCode()] or construct_node(ancestor, memo)) end branch.ancestors = ancestors return branch end function Language:getAncestorChain() local chain = self._ancestorChain if chain == nil then chain = construct_node(self, {}) self._ancestorChain = chain end return chain end end function Language:getAncestorChainOld() local chain = self._ancestorChain if chain == nil then chain = {} local step = self while true do local ancestors = step:getAncestors() step = #ancestors == 1 and ancestors[1] or nil if not step then break end insert(chain, step) end self._ancestorChain = chain end 
return chain end local function fetch_descendants(self, fmt) local descendants, family = {}, self:getFamily() -- Iterate over all three datasets. for _, data in ipairs{ require("Modul:languages/code to canonical name"), require("Modul:etymology languages/code to canonical name"), require("Modul:families/code to canonical name"), } do for code in pairs(data) do local lang = get_by_code(code, nil, true, true) -- Test for a descendant. Earlier tests weed out most candidates, while the more intensive tests are only used sparingly. if ( code ~= self._code and -- Not self. lang:inFamily(family) and -- In the same family. ( family:getProtoLanguageCode() == self._code or -- Self is the protolanguage. self:hasDescendant(lang) or -- Full hasDescendant check. (lang:getFullCode() == self._code and not self:hasAncestor(lang)) -- Etymology-only child which isn't an ancestor. ) ) then if fmt == "object" then insert(descendants, lang) elseif fmt == "code" then insert(descendants, code) elseif fmt == "name" then insert(descendants, lang:getCanonicalName()) end end end end return descendants end function Language:getDescendants() local descendants = self._descendantObjects if descendants == nil then descendants = fetch_descendants(self, "object") self._descendantObjects = descendants end return descendants end function Language:getDescendantCodes() local descendants = self._descendantCodes if descendants == nil then descendants = fetch_descendants(self, "code") self._descendantCodes = descendants end return descendants end function Language:getDescendantNames() local descendants = self._descendantNames if descendants == nil then descendants = fetch_descendants(self, "name") self._descendantNames = descendants end return descendants end do local function check_lang(self, lang) if type(lang) == "string" then lang = get_by_code(lang, nil, true) end if lang:hasAncestor(self) then return true end end function Language:hasDescendant(...) return check_inputs(self, check_lang, false, ...) 
end end local function fetch_children(self, fmt) local m_etym_data = require(etymology_languages_data_module) local self_code, children = self._code, {} for code, lang in pairs(m_etym_data) do local _lang = lang repeat local parent = _lang.parent if parent == self_code then if fmt == "object" then insert(children, get_by_code(code, nil, true)) elseif fmt == "code" then insert(children, code) elseif fmt == "name" then insert(children, lang[1]) end break end _lang = m_etym_data[parent] until not _lang end return children end function Language:getChildren() local children = self._childObjects if children == nil then children = fetch_children(self, "object") self._childObjects = children end return children end function Language:getChildrenCodes() local children = self._childCodes if children == nil then children = fetch_children(self, "code") self._childCodes = children end return children end function Language:getChildrenNames() local children = self._childNames if children == nil then children = fetch_children(self, "name") self._childNames = children end return children end function Language:hasChild(...) local lang = ... if not lang then return false elseif type(lang) == "string" then lang = get_by_code(lang, nil, true) end if lang:hasParent(self) then return true end return self:hasChild(select(2, ...)) end --[==[Returns the name of the main category of that language. Example: {{code|lua|"French language"}} for French, whose category is at [[:Category:French language]]. Unless optional argument <code>nocap</code> is given, the language name at the beginning of the returned value will be capitalized. This capitalization is correct for category names, but not if the language name is lowercase and the returned value of this function is used in the middle of a sentence.]==] function Language:getCategoryName(nocap) local name = self._categoryName if name == nil then name = self:getCanonicalName() -- If a substrate, omit any leading article. 
if self:getFamilyCode() == "qfa-sub" then name = name:gsub("^sebuah ", ""):gsub("^suatu ", "") end -- Only add "Bahasa " prefix if a full language. if self:hasType("full") then -- Unless the canonical name already starts with "Bahasa", "bahasa", "Lek" or "lek", add "Bahasa " prefix. if not (match(name, "^[Bb]ahasa") or match(name, "^[Ll]ek")) then name = "Bahasa " .. name end end self._categoryName = name end if nocap then return name end return mw.getContentLanguage():ucfirst(name) end --[==[Creates a link to the category; the link text is the canonical name.]==] function Language:makeCategoryLink() return make_link(self, ":Kategori:" .. self:getCategoryName(), self:getDisplayForm()) end function Language:getStandardCharacters(sc) local standard_chars = self._data.standard_chars if type(standard_chars) ~= "table" then return standard_chars elseif sc and type(sc) ~= "string" then check_object("script", nil, sc) sc = sc:getCode() end if (not sc) or sc == "None" then local scripts = {} for _, script in pairs(standard_chars) do insert(scripts, script) end return concat(scripts) end if standard_chars[sc] then return standard_chars[sc] .. (standard_chars[1] or "") end end --[==[ Strip diacritics from display text `text` (in a language-specific fashion), which is in the script `sc`. If `sc` is omitted or {nil}, the script is autodetected. This also strips certain punctuation characters from the end and (in the case of Spanish upside-down question mark and exclamation points) from the beginning; strips any whitespace at the end of the text or between the text and final stripped punctuation characters; and applies some language-specific Unicode normalizations to replace discouraged characters with their prescribed alternatives. Return the stripped text. 
]==] function Language:stripDiacritics(text, sc) if (not text) or text == "" then return text end sc = checkScript(text, self, sc) text = normalize(text, sc) -- FIXME, rename makeEntryName to stripDiacritics and get rid of second and third return values -- everywhere text, _, _ = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.strip_diacritics or self._data.entry_name, "strip_diacritics", "stripDiacritics") text = umatch(text, "^[¿¡]?(.-[^%s%p].-)%s*[؟?!;՛՜ ՞ ՟?!︖︕।॥။၊་།]?$") or text return text end --[==[ Convert a ''logical'' pagename (the pagename as it appears to the user, after diacritics and punctuation have been stripped) to a ''physical'' pagename (the pagename as it appears in the MediaWiki database). Reasons for a difference between the two are (a) unsupported titles such as `[ ]` (with square brackets in them), `#` (pound/hash sign) and `¯\_(ツ)_/¯` (with underscores), as well as overly long titles of various sorts; (b) "mammoth" pages that are split into parts (e.g. `a`, which is split into physical pagenames `a/languages A to L` and `a/languages M to Z`). For almost all purposes, you should work with logical and not physical pagenames. But there are certain use cases that require physical pagenames, such as checking the existence of a page or retrieving a page's contents. `pagename` is the logical pagename to be converted. `is_reconstructed_or_appendix` indicates whether the page is in the `Reconstruction` or `Appendix` namespaces. If it is omitted or has the value {nil}, the pagename is checked for an initial asterisk, and if found, the page is assumed to be a `Reconstruction` page. Setting a value of `false` or `true` to `is_reconstructed_or_appendix` disables this check and allows for mainspace pagenames that begin with an asterisk. ]==] function Language:logicalToPhysical(pagename, is_reconstructed_or_appendix) -- FIXME: This probably shouldn't happen but it happens when makeEntryName() receives nil. 
if pagename == nil then track("nil-passed-to-logicalToPhysical") return nil end local initial_asterisk if is_reconstructed_or_appendix == nil then local pagename_minus_initial_asterisk initial_asterisk, pagename_minus_initial_asterisk = pagename:match("^(%*)(.*)$") if pagename_minus_initial_asterisk then is_reconstructed_or_appendix = true pagename = pagename_minus_initial_asterisk elseif self:hasType("appendix-constructed") then is_reconstructed_or_appendix = true end end if not is_reconstructed_or_appendix then -- Check if the pagename is a listed unsupported title. local unsupportedTitles = load_data(links_data_module).unsupported_titles if unsupportedTitles[pagename] then return "Unsupported titles/" .. unsupportedTitles[pagename] end end -- Set `unsupported` as true if certain conditions are met. local unsupported -- Check if there's an unsupported character. \239\191\189 is the replacement character U+FFFD, which can't be typed -- directly here due to an abuse filter. Unix-style dot-slash notation is also unsupported, as it is used for -- relative paths in links, as are 3 or more consecutive tildes. Note: match is faster with magic -- characters/charsets; find is faster with plaintext. if ( match(pagename, "[#<>%[%]_{|}]") or find(pagename, "\239\191\189") or match(pagename, "%f[^%z/]%.%.?%f[%z/]") or find(pagename, "~~~") ) then unsupported = true -- If it looks like an interwiki link. elseif find(pagename, ":") then local prefix = gsub(pagename, "^:*(.-):.*", ulower) if ( load_data("Modul:data/namespaces")[prefix] or load_data("Modul:data/interwikis")[prefix] ) then unsupported = true end end -- Escape unsupported characters so they can be used in titles. ` is used as a delimiter for this, so a raw use of -- it in an unsupported title is also escaped here to prevent interference; this is only done with unsupported -- titles, though, so inclusion won't in itself mean a title is treated as unsupported (which is why it's excluded -- from the earlier test). 
if unsupported then -- FIXME: This conversion needs to be different for reconstructed pages with unsupported characters. There -- aren't any currently, but if there ever are, we need to fix this e.g. to put them in something like -- Reconstruction:Proto-Indo-European/Unsupported titles/`lowbar``num`. local unsupported_characters = load_data(links_data_module).unsupported_characters pagename = pagename:gsub("[#<>%[%]_`{|}\239]\191?\189?", unsupported_characters) :gsub("%f[^%z/]%.%.?%f[%z/]", function(m) return (gsub(m, "%.", "`period`")) end) :gsub("~~~+", function(m) return (gsub(m, "~", "`tilde`")) end) pagename = "Unsupported titles/" .. pagename elseif not is_reconstructed_or_appendix then -- Check if this is a mammoth page. If so, which subpage should we link to? local m_links_data = load_data(links_data_module) local mammoth_page_type = m_links_data.mammoth_pages[pagename] if mammoth_page_type then local canonical_name = "Bahasa " .. self:getFullName() if canonical_name ~= "Rentas bahasa" and canonical_name ~= "Bahasa Melayu" then local this_subpage local L2_sort_key = get_L2_sort_key(canonical_name) for _, subpage_spec in ipairs(m_links_data.mammoth_page_subpage_types[mammoth_page_type]) do -- unpack() fails utterly on data loaded using mw.loadData() even if offsets are given local subpage, pattern = subpage_spec[1], subpage_spec[2] if pattern == true or L2_sort_key:match(pattern) then this_subpage = subpage break end end if not this_subpage then error(("Internal error: Bad data in mammoth_page_subpage_pages in [[Modul:links/data]] for mammoth page %s, type %s; last entry didn't have 'true' in it"):format( pagename, mammoth_page_type)) end pagename = pagename .. "/" .. this_subpage end end end return (initial_asterisk or "") .. pagename end --[==[ Strip the diacritics from a display pagename and convert the resulting logical pagename into a physical pagename. This allows you, for example, to retrieve the contents of the page or check its existence. 
WARNING: This is deprecated and will be going away. It is a simple composition of `self:stripDiacritics` and `self:logicalToPhysical`; most callers only want the former, and if you need both, call them both yourself. `text` and `sc` are as in `self:stripDiacritics`, and `is_reconstructed_or_appendix` is as in `self:logicalToPhysical`. ]==] function Language:makeEntryName(text, sc, is_reconstructed_or_appendix) return self:logicalToPhysical(self:stripDiacritics(text, sc), is_reconstructed_or_appendix) end --[==[Generates alternative forms using a specified method, and returns them as a table. If no method is specified, returns a table containing only the input term.]==] function Language:generateForms(text, sc) local generate_forms = self._data.generate_forms if generate_forms == nil then return {text} end sc = checkScript(text, self, sc) return require("Modul:" .. self._data.generate_forms).generateForms(text, self, sc) end --[==[Creates a sort key for the given stripped text, following the rules appropriate for the language. This removes diacritical marks from the stripped text if they are not considered significant for sorting, and may perform some other changes. Any initial hyphen is also removed, and anything in parentheses is removed as well. The <code>sort_key</code> setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the stripped text and returns a sortkey.]==] function Language:makeSortKey(text, sc) if (not text) or text == "" then return text end if match(text, "<[^<>]+>") then track("track HTML tag") end -- Remove directional characters, bold, italics, soft hyphens, strip markers and HTML tags. -- FIXME: Partly duplicated with remove_formatting() in [[Module:links]]. 
text = ugsub(text, "[\194\173\226\128\170-\226\128\174\226\129\166-\226\129\169]", "") text = text:gsub("('*)'''(.-'*)'''", "%1%2"):gsub("('*)''(.-'*)''", "%1%2") text = gsub(unstrip(text), "<[^<>]+>", "") text = decode_uri(text, "PATH") text = checkNoEntities(self, text) -- Remove initial hyphens and * unless the term only consists of spacing + punctuation characters. text = ugsub(text, "^([􀀀-􏿽]*)[-־ـ᠊*]+([􀀀-􏿽]*)(.*[^%s%p].*)", "%1%2%3") sc = checkScript(text, self, sc) text = normalize(text, sc) text = removeCarets(text, sc) -- For languages with dotted dotless i, ensure that "İ" is sorted as "i", and "I" is sorted as "ı". if self:hasDottedDotlessI() then text = gsub(text, "I\204\135", "i") -- decomposed "İ" :gsub("I", "ı") text = sc:toFixedNFD(text) end -- Convert to lowercase, make the sortkey, then convert to uppercase. Where the language has dotted dotless i, it is -- usually not necessary to convert "i" to "İ" and "ı" to "I" first, because "I" will always be interpreted as -- conventional "I" (not dotless "İ") by any sorting algorithms, which will have been taken into account by the -- sortkey substitutions themselves. However, if no sortkey substitutions have been specified, then conversion is -- necessary so as to prevent "i" and "ı" both being sorted as "I". -- -- An exception is made for scripts that (sometimes) sort by scraping page content, as that means they are sensitive -- to changes in capitalization (as it changes the target page). if not sc:sortByScraping() then text = ulower(text) end local actual_substitution_data -- Don't trim whitespace here because it's significant at the beginning of a sort key or sort base. 
	text, _, actual_substitution_data = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.sort_key, "sort_key", "makeSortKey", "notrim")

	if not sc:sortByScraping() then
		if self:hasDottedDotlessI() and not actual_substitution_data then
			text = text:gsub("ı", "I"):gsub("i", "İ")
			text = sc:toFixedNFC(text)
		end
		text = uupper(text)
	end

	-- Remove parentheses, as long as they are either preceded or followed by something.
	text = gsub(text, "(.)[()]+", "%1"):gsub("[()]+(.)", "%1")

	text = escape_risky_characters(text)
	return text
end

--[==[Create the form used as a basis for display text and transliteration. FIXME: Rename to correctInputText().]==]
local function processDisplayText(text, self, sc, keepCarets, keepPrefixes)
	local subbedChars = {}
	text, subbedChars = doTempSubstitutions(text, subbedChars, keepCarets)

	text = decode_uri(text, "PATH")
	text = checkNoEntities(self, text)

	sc = checkScript(text, self, sc)
	text = normalize(text, sc)
	text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, self._data.display_text, "display_text", "makeDisplayText")

	text = removeCarets(text, sc)

	-- Remove any interwiki link prefixes (unless they have been escaped or this has been disabled).
	if find(text, ":") and not keepPrefixes then
		local rep
		repeat
			text, rep = gsub(text, "\\\\(\\*:)", "\3%1")
		until rep == 0
		text = gsub(text, "\\:", "\4")
		while true do
			local prefix = gsub(text, "^(.-):.+", function(m1)
				return (gsub(m1, "\244[\128-\191]*", ""))
			end)
			-- Check if the prefix is an interwiki, though ignore capitalised Wikikamus:, which is a namespace.
			if not prefix or prefix == text or prefix == "Wikikamus" or not (load_data("Modul:data/interwikis")[ulower(prefix)] or prefix == "") then
				break
			end
			text = gsub(text, "^(.-):(.*)", function(m1, m2)
				local ret = {}
				for subbedChar in gmatch(m1, "\244[\128-\191]*") do
					insert(ret, subbedChar)
				end
				return concat(ret) ..
m2 end) end text = gsub(text, "\3", "\\"):gsub("\4", ":") end return text, subbedChars end --[==[Make the display text (i.e. what is displayed on the page).]==] function Language:makeDisplayText(text, sc, keepPrefixes) if not text or text == "" then return text end local subbedChars text, subbedChars = processDisplayText(text, self, sc, nil, keepPrefixes) text = escape_risky_characters(text) return undoTempSubstitutions(text, subbedChars) end --[==[Transliterates the text from the given script into the Latin script (see [[Wiktionary:Transliteration and romanization]]). The language must have the <code>translit</code> property for this to work; if it is not present, {{code|lua|nil}} is returned. The <code>sc</code> parameter is handled by the transliteration module, and how it is handled is specific to that module. Some transliteration modules may tolerate {{code|lua|nil}} as the script, others require it to be one of the possible scripts that the module can transliterate, and will throw an error if it's not one of them. For this reason, the <code>sc</code> parameter should always be provided when writing non-language-specific code. The <code>module_override</code> parameter is used to override the default module that is used to provide the transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no default module yet, or you want to demonstrate an alternative version of a transliteration module before making it official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked by [[Wiktionary:Tracking/languages/module_override]]. '''Known bugs''': * This function assumes {tr(s1) .. tr(s2) == tr(s1 .. s2)}. When this assertion fails, wikitext markups like <nowiki>'''</nowiki> can cause wrong transliterations. * HTML entities like <code>&amp;apos;</code>, often used to escape wikitext markups, do not work. 
]==]
function Language:transliterate(text, sc, module_override)
	-- If there is no text, or the text is just "-", return it unchanged.
	if not text or text == "" or text == "-" then
		return text
	end
	-- If the script is not transliteratable (and no override is given), return nil.
	sc = checkScript(text, self, sc)
	if not (sc:isTransliterated() or module_override) then
		-- temporary tracking to see if/when this gets triggered
		track("non-transliterable")
		track("non-transliterable/" .. self._code)
		track("non-transliterable/" .. sc:getCode())
		track("non-transliterable/" .. sc:getCode() .. "/" .. self._code)
		return nil
	end

	-- Remove any strip markers.
	text = unstrip(text)

	-- Do not process the formatting into PUA characters for certain languages.
	local processed = load_data(languages_data_module).substitution[self._code] ~= "none"

	-- Get the display text with the keepCarets flag set.
	local subbedChars
	if processed then
		text, subbedChars = processDisplayText(text, self, sc, true)
	end

	-- Transliterate (using the module override if applicable).
	text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, true, module_override or self._data.translit, "translit", "tr")

	if not text then
		return nil
	end

	-- Incomplete transliterations return nil.
	local charset = sc.characters
	if charset and umatch(text, "[" .. charset .. "]") then
		-- Remove any characters in Latin, which includes Latin characters also included in other scripts (as these are
		-- false positives), as well as any PUA substitutions. Anything remaining should only be script code "None"
		-- (e.g. numerals).
		local check_text = ugsub(text, "[" .. get_script("Latn").characters .. "􀀀-􏿽]+", "")
		-- Set none_is_last_resort_only flag, so that any non-None chars will cause a script other than "None" to be
		-- returned.
if find_best_script_without_lang(check_text, true):getCode() ~= "None" then return nil end end if processed then text = escape_risky_characters(text) text = undoTempSubstitutions(text, subbedChars) end -- If the script does not use capitalization, then capitalize any letters of the transliteration which are -- immediately preceded by a caret (and remove the caret). if text and not sc:hasCapitalization() and text:find("^", 1, true) then text = processCarets(text, "%^([\128-\191\244]*%*?)([^\128-\191\244][\128-\191]*)", function(m1, m2) return m1 .. uupper(m2) end) end -- Track module overrides. if module_override ~= nil then track("module_override") end return text end do local function handle_language_spec(self, spec, sc) local ret = self["_" .. spec] if ret == nil then ret = self._data[spec] if type(ret) == "string" then ret = list_to_set(split(ret, ",", true, true)) end self["_" .. spec] = ret end if type(ret) == "table" then ret = ret[sc:getCode()] end return not not ret end function Language:overrideManualTranslit(sc) return handle_language_spec(self, "override_translit", sc) end function Language:link_tr(sc) return handle_language_spec(self, "link_tr", sc) end end --[==[Returns {{code|lua|true}} if the language has a transliteration module, or {{code|lua|false}} if it doesn't.]==] function Language:hasTranslit() return not not self._data.translit end --[==[Returns {{code|lua|true}} if the language uses the letters I/ı and İ/i, or {{code|lua|false}} if it doesn't.]==] function Language:hasDottedDotlessI() return not not self._data.dotted_dotless_i end function Language:toJSON(opts) local strip_diacritics, strip_diacritics_patterns, strip_diacritics_remove_diacritics = self._data.strip_diacritics if strip_diacritics then if strip_diacritics.from then strip_diacritics_patterns = {} for i, from in ipairs(strip_diacritics.from) do insert(strip_diacritics_patterns, {from = from, to = strip_diacritics.to[i] or ""}) end end strip_diacritics_remove_diacritics = 
			strip_diacritics.remove_diacritics
	end
	-- mainCode should only end up non-nil if dontCanonicalizeAliases is passed to make_object().
	-- props should either contain zero-argument functions to compute the value, or the value itself.
	local props = {
		ancestors = function() return self:getAncestorCodes() end,
		canonicalName = function() return self:getCanonicalName() end,
		categoryName = function() return self:getCategoryName("nocap") end,
		code = self._code,
		mainCode = self._mainCode,
		parent = function() return self:getParentCode() end,
		full = function() return self:getFullCode() end,
		stripDiacriticsPatterns = strip_diacritics_patterns,
		stripDiacriticsRemoveDiacritics = strip_diacritics_remove_diacritics,
		family = function() return self:getFamilyCode() end,
		aliases = function() return self:getAliases() end,
		varieties = function() return self:getVarieties() end,
		otherNames = function() return self:getOtherNames() end,
		scripts = function() return self:getScriptCodes() end,
		type = function() return keys_to_list(self:getTypes()) end,
		wikimediaLanguages = function() return self:getWikimediaLanguageCodes() end,
		wikidataItem = function() return self:getWikidataItem() end,
		wikipediaArticle = function() return self:getWikipediaArticle(true) end,
	}
	local ret = {}
	for prop, val in pairs(props) do
		-- Guard against `opts` being nil, which the return statement below already allows for.
		if not (opts and opts.skip_fields and opts.skip_fields[prop]) then
			if type(val) == "function" then
				ret[prop] = val()
			else
				ret[prop] = val
			end
		end
	end
	-- Use `deep_copy` when returning a table, so that there are no editing restrictions imposed by `mw.loadData`.
	return opts and opts.lua_table and deep_copy(ret) or to_json(ret, opts)
end

function export.getDataModuleName(code)
	local letter = match(code, "^(%l)%l%l?$")
	return "Modul:" .. (
		letter == nil and "languages/data/exceptional" or
		#code == 2 and "languages/data/2" or
		"languages/data/3/" .. letter
	)
end
get_data_module_name = export.getDataModuleName

function export.getExtraDataModuleName(code)
	return get_data_module_name(code) ..
"/extra" end get_extra_data_module_name = export.getExtraDataModuleName do local function make_stack(data) local key_types = { [2] = "unique", aliases = "unique", otherNames = "unique", type = "append", varieties = "unique", wikipedia_article = "unique", wikimedia_codes = "unique" } local function __index(self, k) local stack, key_type = getmetatable(self), key_types[k] -- Data that isn't inherited from the parent. if key_type == "unique" then local v = stack[stack[make_stack]][k] if v == nil then local layer = stack[0] if layer then -- Could be false if there's no extra data. v = layer[k] end end return v -- Data that is appended by each generation. elseif key_type == "append" then local parts, offset, n = {}, 0, stack[make_stack] for i = 1, n do local part = stack[i][k] if part == nil then offset = offset + 1 else parts[i - offset] = part end end return offset ~= n and concat(parts, ",") or nil end local n = stack[make_stack] while true do local layer = stack[n] if not layer then -- Could be false if there's no extra data. return nil end local v = layer[k] if v ~= nil then return v end n = n - 1 end end local function __newindex() error("table is read-only") end local function __pairs(self) -- Iterate down the stack, caching keys to avoid duplicate returns. local stack, seen = getmetatable(self), {} local n = stack[make_stack] local iter, state, k, v = pairs(stack[n]) return function() repeat repeat k = iter(state, k) if k == nil then n = n - 1 local layer = stack[n] if not layer then -- Could be false if there's no extra data. return nil end iter, state, k = pairs(layer) end until not (k == nil or seen[k]) -- Get the value via a lookup, as the one returned by the -- iterator will be the raw value from the current layer, -- which may not be the one __index will return for that -- key. Also memoize the key in `seen` (even if the lookup -- returns nil) so that it doesn't get looked up again. 
-- TODO: store values in `self`, avoiding the need to create -- the `seen` table. The iterator will need to iterate over -- `self` with `next` first to find these on future loops. v, seen[k] = self[k], true until v ~= nil return k, v end end local __ipairs = require(table_module).indexIpairs function make_stack(data) local stack = { data, [make_stack] = 1, -- stores the length and acts as a sentinel to confirm a given metatable is a stack. __index = __index, __newindex = __newindex, __pairs = __pairs, __ipairs = __ipairs, } stack.__metatable = stack return setmetatable({}, stack), stack end return make_stack(data) end local function get_stack(data) local stack = getmetatable(data) return stack and type(stack) == "table" and stack[make_stack] and stack or nil end --[==[ <span style="color: var(--wikt-palette-red,#BA0000)">This function is not for use in entries or other content pages.</span> Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes. If `extra` is set, any extra data in the relevant `/extra` module will be included. (Note that it will be included anyway if it has already been loaded into the language object.) If `raw` is set, then the returned data will not contain any data inherited from parent objects. -- Do NOT use these methods! -- All uses should be pre-approved on the talk page! ]==] function Language:getData(extra, raw) if extra then self:loadInExtraData() end local data = self._data -- If raw is not set, just return the data. if not raw then return data end local stack = get_stack(data) -- If there isn't a stack or its length is 1, return the data. Extra data (if any) will be included, as it's stored at key 0 and doesn't affect the reported length. 
		if stack == nil then
			return data
		end
		local n = stack[make_stack]
		if n == 1 then
			return data
		end
		local extra = stack[0]
		-- If there isn't any extra data, return the top layer of the stack.
		if extra == nil then
			return stack[n]
		end
		-- If there is, return a new stack which has the top layer at key 1 and the extra data at key 0.
		data, stack = make_stack(stack[n])
		stack[0] = extra
		return data
	end

	function Language:loadInExtraData()
		-- Only full languages have extra data.
		if not self:hasType("language", "full") then
			return
		end
		local data = self._data
		-- If there's no stack, create one.
		local stack = get_stack(self._data)
		if stack == nil then
			data, stack = make_stack(data)
		-- If already loaded, return.
		elseif stack[0] ~= nil then
			return
		end
		self._data = data
		-- Load extra data from the relevant module and add it to the stack at key 0, so that the __index and __pairs metamethods will pick it up, since they iterate down the stack until they run out of layers.
		local code = self._code
		local modulename = get_extra_data_module_name(code)
		-- No data cached as false.
		stack[0] = modulename and load_data(modulename)[code] or false
	end

	--[==[Returns the name of the module containing the language's data. For an etymology-only language, this is the etymology languages data module; otherwise, it is the module name returned by `export.getDataModuleName` for the language's main code.]==]
	function Language:getDataModuleName()
		local name = self._dataModuleName
		if name == nil then
			name = self:hasType("etymology-only") and etymology_languages_data_module or
				get_data_module_name(self._mainCode or self._code)
			self._dataModuleName = name
		end
		return name
	end

	--[==[Returns the name of the module containing the language's extra data, as given by `export.getExtraDataModuleName` for the language's main code. Returns {{code|lua|nil}} for an etymology-only language, which has no extra data module.]==]
	function Language:getExtraDataModuleName()
		local name = self._extraDataModuleName
		if name == nil then
			name = not self:hasType("etymology-only") and get_extra_data_module_name(self._mainCode or self._code) or false
			self._extraDataModuleName = name
		end
		return name or nil
	end

	function export.makeObject(code, data, dontCanonicalizeAliases)
		local data_type = type(data)
		if data_type ~= "table" then
			error(("bad argument #2 to 'makeObject' (table expected, got %s)"):format(data_type))
		end

		-- Convert any aliases.
		local input_code = code
		code = normalize_code(code)
		input_code = dontCanonicalizeAliases and input_code or code

		local parent
		if data.parent then
			parent = get_by_code(data.parent, nil, true, true)
		else
			parent = Language
		end
		parent.__index = parent

		local lang = {_code = input_code}
		-- This can only happen if dontCanonicalizeAliases is passed to make_object().
		if code ~= input_code then
			lang._mainCode = code
		end

		local parent_data = parent._data
		if parent_data == nil then
			-- Full code is the same as the code.
			lang._fullCode = parent._code or code
		else
			-- Copy full code.
			lang._fullCode = parent._fullCode
			local stack = get_stack(parent_data)
			if stack == nil then
				parent_data, stack = make_stack(parent_data)
			end
			-- Insert the input data as the new top layer of the stack.
			local n = stack[make_stack] + 1
			data, stack[n], stack[make_stack] = parent_data, data, n
		end
		lang._data = data

		return setmetatable(lang, parent)
	end
	make_object = export.makeObject
end

--[==[Finds the language whose code matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated.
If <code class="n">paramForError</code> is {{code|lua|true}}, a generic error message mentioning the bad code is generated; otherwise <code class="n">paramForError</code> should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.]==] function export.getByCode(code, paramForError, allowEtymLang, allowFamily) -- Track uses of paramForError, ultimately so it can be removed, as error-handling should be done by [[Module:parameters]], not here. if paramForError ~= nil then track("paramForError") end if type(code) ~= "string" then local typ if not code then typ = "nil" elseif check_object("language", true, code) then typ = "a language object" elseif check_object("family", true, code) then typ = "a family object" else typ = "a " .. type(code) end error("The function getByCode expects a string as its first argument, but received " .. typ .. ".") end local m_data = load_data(languages_data_module) if m_data.aliases[code] or m_data.track[code] then track(code) end local norm_code = normalize_code(code) -- Get the data, checking for etymology-only languages if allowEtymLang is set. local data = load_data(get_data_module_name(norm_code))[norm_code] or allowEtymLang and load_data(etymology_languages_data_module)[norm_code] -- If no data was found and allowFamily is set, check the family data. If the main family data was found, make the object with [[Module:families]] instead, as family objects have different methods. However, if it's an etymology-only family, use make_object in this module (which handles object inheritance), and the family-specific methods will be inherited from the parent object. 
if data == nil and allowFamily then data = load_data("Modul:families/data")[norm_code] if data ~= nil then if data.parent == nil then return make_family_object(norm_code, data) elseif not allowEtymLang then data = nil end end end local retval = code and data and make_object(code, data) if not retval and paramForError then require("Modul:languages/errorGetBy").code(code, paramForError, allowEtymLang, allowFamily) end return retval end get_by_code = export.getByCode --[==[Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes. The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result. This function is powered by [[Module:languages/canonical names]], which contains a pre-generated mapping of full-language canonical names to codes. It is generated by going through the [[:Category:Language data modules]] for full languages. 
When <code class="n">allowEtymLang</code> is specified for the above function, [[Module:etymology languages/canonical names]] may also be used, and when <code class="n">allowFamily</code> is specified for the above function, [[Module:families/canonical names]] may also be used.]==] function export.getByCanonicalName(name, errorIfInvalid, allowEtymLang, allowFamily) local byName = load_data("Modul:languages/canonical names") local code = byName and byName[name] if not code and allowEtymLang then byName = load_data("Modul:etymology languages/canonical names") code = byName and byName[name] or byName[gsub(name, "^[Ss]ubstratum ", "")] or byName[gsub(name, "^suatu ", "")] or byName[gsub(name, "^suatu ", ""):gsub("^[Ss]ubstratum ", "")] or -- For etymology families like "ira-pro". -- FIXME: This is not ideal, as it allows " languages" to be appended to any etymology-only language, too. byName[match(name, "^[Bb]ahasa%-bahasa (.*)$")] end if not code and allowFamily then byName = load_data("Modul:families/canonical names") code = byName[name] or byName[match(name, "^[Bb]ahasa%-bahasa (.*)$")] end local retval = code and get_by_code(code, errorIfInvalid, allowEtymLang, allowFamily) if not retval and errorIfInvalid then require("Modul:languages/errorGetBy").canonicalName(name, allowEtymLang, allowFamily) end return retval end --[==[Used by [[Module:languages/data/2]] (et al.) and [[Module:etymology languages/data]], [[Module:families/data]], [[Module:scripts/data]] and [[Module:writing systems/data]] to finalize the data into the format that is actually returned.]==] function export.finalizeData(data, main_type, variety) local fields = {"type"} if main_type == "language" then insert(fields, 4) -- script codes insert(fields, "ancestors") insert(fields, "link_tr") insert(fields, "override_translit") insert(fields, "wikimedia_codes") elseif main_type == "script" then insert(fields, 3) -- writing system codes end -- Families and writing systems have no extra fields to process. 
local fields_len = #fields for _, entity in next, data do if variety then -- Move parent from 3 to "parent" and family from "family" to 3. These are different for the sake of convenience, since very few varieties have the family specified, whereas all of them have a parent. entity.parent, entity[3], entity.family = entity[3], entity.family -- Give the type "regular" iff not a variety and no other types are assigned. elseif not (entity.type or entity.parent) then entity.type = "regular" end for i = 1, fields_len do local key = fields[i] local field = entity[key] if field and type(field) == "string" then entity[key] = gsub(field, "%s*,%s*", ",") end end end return data end --[==[For backwards compatibility only; modules should require the error themselves.]==] function export.err(lang_code, param, code_desc, template_tag, not_real_lang) return require("Modul:languages/error")(lang_code, param, code_desc, template_tag, not_real_lang) end return export jrp354oohz2vhqpfwsukxukfcx757jm Modul:languages/data/3/o 828 9793 281275 280832 2026-04-21T14:04:08Z Hakimi97 2668 281275 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["oaa"] = { "Orok", 33928, "tuw", "Cyrl, Latn", } m["oac"] = { "Oroch", 33650, "tuw", "Latn, Cyrl", } m["oav"] = { "Avar Kuno", nil, "cau-ava", "Geor", } m["obi"] = { "Obispeño", 1288385, "nai-chu", "Latn", } m["obk"] = { "Bontoc Selatan", nil, "phi", "Latn", } m["obl"] = { "Oblo", 36309, } m["obm"] = { "Moabite", 36385, "sem-can", "Phnx", translit = "Phnx-translit", } m["obo"] = { "Obo Manobo", 12953699, "mno", "Latn", } m["obr"] = { "Burma Kuno", 17006600, "tbq-brm", "Mymr, Latn", --and also Pallava } m["obt"] = { "Breton Kuno", 3558112, "cel-bry", "Latn", } m["obu"] = { "Obulom", 3813403, "nic-cde", "Latn", } m["oca"] = { "Ocaina", 3182577, "sai-wit", "Latn", } m["och"] = { "Cina Kuno", 35137, "zhx", "Hant", translit = "zh-translit", sort_key = "Hani-sortkey", } m["oco"] = { "Cornwall Kuno", 48304520, "cel-bry", "Latn", } m["ocu"] = { "Tlahuica", 10751739, "omq", "Latn", } m["oda"] = { "Odut", 3915388, "nic-uce", "Latn", ancestors = "mfn", } m["odk"] = { "Od", 7077191, "inc-wes", "Arab", } m["odt"] = { "Belanda Kuno", 443089, "gmw", "Latn, Runr", ancestors = "frk", entry_name = {remove_diacritics = c.circ .. c.macron}, } m["odu"] = { "Odual", 3813392, "nic-cde", "Latn", } m["ofo"] = { "Ofo", 3349758, "sio-ohv", } m["ofs"] = { "Frisia Kuno", 35133, "gmw-fri", "Latn", entry_name = {remove_diacritics = c.circ .. 
c.macron}, } m["ofu"] = { "Efutop", 35297, "nic-eko", "Latn", } m["ogb"] = { "Ogbia", 3813400, "nic-cde", "Latn", } m["ogc"] = { "Ogbah", 36291, "alv-igb", "Latn", } m["oge"] = { "Georgia Kuno", 34834, "ccs-gzn", "Geor, Geok", translit = { Geor = "Geor-translit", Geok = "Geok-translit", }, override_translit = true, entry_name = {remove_diacritics = c.circ}, } m["ogg"] = { "Ogbogolo", 3813405, "nic-cde", "Latn", } m["ogo"] = { "Khana", 3914409, "nic-ogo", "Latn", } m["ogu"] = { "Ogbronuagum", 3914485, "nic-cde", "Latn", } m["ohu"] = { "Hungary Kuno", nil, "urj-ugr", "Latn", } m["oia"] = { "Oirata", 56738, "ngf", "Latn", } m["oin"] = { "One Inebu", 12953782, "qfa-tor", } m["ojb"] = { "Ojibwa Barat Laut", 7060356, "alg", "Latn", ancestors = "oj", } m["ojc"] = { "Ojibwa Tengah", 5061548, "alg", "Latn", ancestors = "oj", } m["ojg"] = { "Ojibwa Timur", 5330342, "alg", "Latn", ancestors = "oj", } m["ojp"] = { "Jepun Kuno", 5736700, "jpx", "Jpan", sort_key = s["Jpan-sortkey"], } m["ojs"] = { "Severn Ojibwa", 56494, "alg", "Latn", ancestors = "oj", } m["ojv"] = { "Jawa Ontong", 7095071, "poz-pnp", "Latn", } m["ojw"] = { "Ojibwa Barat", 3474222, "alg", "Latn", ancestors = "oj", } m["oka"] = { "Okanagan", 2984602, "sal", "Latn", } m["okb"] = { "Okobo", 3813398, "nic-lcr", "Latn", } m["okd"] = { "Okodia", 36300, "ijo", } m["oke"] = { "Okpe (Edo Barat Daya)", 268924, "alv-swd", "Latn", } m["okg"] = { "Kok-Paponk", nil, "aus-pmn", "Latn", } m["okh"] = { "Koresh-e Rostam", 6432160, "xme-ttc", ancestors = "xme-ttc-cen", } m["oki"] = { "Okiek", 56367, "sdv-kln", "Latn", } m["okj"] = { "Oko-Juwoi", 3436832, "qfa-adc", } m["okk"] = { "Kwamtim One", 19830649, "qfa-tor", "Latn", } m["okl"] = { "Bahasa Isyarat Kentish Kuno", 7084319, "sgn", } m["okm"] = { "Korea Pertengahan", 715339, "qfa-kor", "Kore", ancestors = "oko", translit = "okm-translit", entry_name = s["Kore-entryname"], } m["okn"] = { "Oki-No-Erabu", 3350036, "jpx-ryu", "Jpan", translit = s["Jpan-translit"], sort_key = 
s["Jpan-sortkey"], } m["oko"] = { "Korea Kuno", 715364, "qfa-kor", "Kore", entry_name = s["Kore-entryname"], } m["okr"] = { "Kirike", 11006763, "ijo", } m["oks"] = { "Oko-Eni-Osayen", 36302, "alv-von", "Latn", } m["oku"] = { "Oku", 36289, "nic-rnc", "Latn", } m["okv"] = { "Orokaiva", 7103752, "ngf", "Latn", } m["okx"] = { "Okpe (Edo Barat Laut)", 7082547, "alv-nwd", "Latn", } m["okz"] = { "Khmer Kuno", 9205, "mkh-kmr", "Latn, Khmr", --and also Pallava } m["old"] = { "Mochi", 12952852, "bnt-chg", "Latn", } m["ole"] = { "Olekha", 3695204, "sit-bdi", "Tibt, Latn", translit = {Tibt = "Tibt-translit"}, override_translit = true, display_text = {Tibt = s["Tibt-displaytext"]}, entry_name = {Tibt = s["Tibt-entryname"]}, sort_key = {Tibt = "Tibt-sortkey"}, } m["olm"] = { "Oloma", 3441166, "alv-nwd", "Latn", } m["olo"] = { "Livvi", 36584, "urj-fin", "Latn", } m["olr"] = { "Olrat", 3351562, "poz-vnc", } m["olt"] = { "Lithuania Kuno", 17417801, "bat", "Latn", entry_name = {remove_diacritics = c.grave .. c.acute .. 
c.tilde}, } m["olu"] = { "Kuvale", 6448765, "bnt-swb", "Latn", } m["oma"] = { "Omaha-Ponca", 2917968, "sio-dhe", "Latn", } m["omb"] = { "Omba", 2841471, "poz-vnc", "Latn", } m["omc"] = { "Mochica", 1951641, } m["omg"] = { "Omagua", 33663, "tup-gua", "Latn", } m["omi"] = { "Omi", 56795, "csu-mma", } m["omk"] = { "Omok", 4334657, "qfa-yuk", "Cyrl", translit = "omk-translit", } m["oml"] = { "Ombo", 7089928, "bnt-tet", "Latn", } m["omn"] = { "Minoan", 1669994, nil, "Lina", } m["omo"] = { "Utarmbung", 7902577, "ngf", "Latn", } m["omp"] = { "Manipur Kuno", nil, "sit", "Mtei", translit = "Mtei-translit", } m["omr"] = { "Marathi Kuno", nil, "inc-sou", "Deva, Modi", ancestors = "pmh", translit = { Deva = "sa-translit", Modi = "Modi-translit", }, } m["omt"] = { "Omotik", 36313, "sdv-nis", } m["omu"] = { "Omurano", 1957612, } m["omw"] = { "Tairora Selatan", 20210553, "paa-kag", "Latn", } m["omx"] = { "Mon Kuno", nil, "mkh-mnc", "Mymr, Latn", --and also Pallava } m["ona"] = { "Selk'nam", 2721227, "sai-cho", "Latn", } m["onb"] = { "Lingao", 7093790, "qfa-onb", "Latn", } m["one"] = { "Oneida", 857858, "iro-nor", "Latn", } m["ong"] = { "Olo", 592162, "qfa-tor", "Latn", } m["oni"] = { "Onin", 7093910, "poz-cet", "Latn", } m["onj"] = { "Onjob", 7093968, "ngf", "Latn", } m["onk"] = { "Kabore One", 12953783, "qfa-tor", "Latn", } m["onn"] = { "Onobasulu", 7094437, "ngf", "Latn", } m["ono"] = { "Onondaga", 1077450, "iro-nor", "Latn", ancestors = "iro-oon", } m["onp"] = { "Sartang", 7424639, "sit-khb", } m["onr"] = { "One Utara", 19830648, "qfa-tor", "Latn", } m["ons"] = { "Ono", 11732548, "ngf", "Latn", } m["ont"] = { "Ontenu", 3352827, } m["onu"] = { "Unua", 3552042, "poz-vnc", "Latn", } m["onw"] = { "Nubia Kuno", 2268, "nub", "Copt", translit = "Copt-translit", sort_key = "cop-sortkey", } m["onx"] = { "Pidgin Onin", 12953788, "crp", "Latn", ancestors = "oni", } m["ood"] = { "O'odham", 2393095, "azc", "Latn", } m["oog"] = { "Ong", 12953787, "mkh-kat", } m["oon"] = { "Önge", 2475551, 
"qfa-ong", } m["oor"] = { "Oorlams", 2484337, } m["oos"] = { "Ossetia Kuno", nil, "xsc", "Grek, Latn", translit = "grc-translit", ancestors = "os-pro", } m["opa"] = { "Okpamheri", 3913331, "alv-nwd", "Latn", } m["opk"] = { "Kopkaka", 6431129, "ngf-okk", "Latn", } m["opm"] = { "Oksapmin", 1068097, "ngf", "Latn", } m["opo"] = { "Opao", 7095585, "ngf", "Latn", } m["opt"] = { "Opata", 2304583, "azc-trc", "Latn", } m["opy"] = { "Ofayé", 3446691, "sai-mje", "Latn", } m["ora"] = { "Oroha", 36298, "poz-sls", } m["ore"] = { "Orejón", 3355834, "sai-tuc", "Latn", } m["org"] = { "Oring", 3915308, "nic-ucn", "Latn", } m["orh"] = { "Oroqen", 1367309, "tuw", "Latn", } m["oro"] = { "Orokolo", 7103758, "ngf", "Latn", } m["orr"] = { "Oruma", 36299, "ijo", "Latn", } m["ort"] = { "Oriya Adivasi", 12953791, "inc-eas", "Orya", ancestors = "or", } m["ors"] = { "Orang Seletar", 4208197, "map", "Latn", ancestors = "ms", } m["oru"] = { "Ormuri", 33740, "ira-orp", "fa-Arab", } m["orv"] = { "Slav Timur Kuno", 35228, "zle", "Cyrs", translit = {Cyrs = "Cyrs-translit"}, entry_name = s["Cyrs-entryname"], sort_key = s["Cyrs-sortkey"], } m["orw"] = { "Oro Win", 3450423, "sai-cpc", "Latn", } m["orx"] = { "Oro", 3813396, "nic-lcr", "Latn", } m["orz"] = { "Ormu", 7103494, "poz-ocw", "Latn", } m["osa"] = { "Osage", 2600085, "sio-dhe", "Latn, Osge", } m["osc"] = { "Osci", 36653, "itc-sbl", "Ital, Latn", translit = "Ital-translit", } m["osi"] = { "Osing", 2701322, "poz-sus", "Latn", } m["osn"] = { "Sunda Kuno", 56197074, "poz-msa", "Latn, Sund, Kawi", } m["oso"] = { "Ososo", 3913398, "alv-yek", "Latn", } m["osp"] = { "Sepanyol Kuno", 1088025, "roa-cas", "Latn", } m["ost"] = { "Osatu", 36243, "nic-grs", "Latn", } m["osu"] = { "One Selatan", 12953785, "qfa-tor", "Latn", } m["osx"] = { "Saxon Kuno", 35219, "gmw", "Latn", entry_name = {remove_diacritics = c.circ .. 
c.macron}, } m["ota"] = { "Turki Usmaniyah", 36730, "trk-ogz", "ota-Arab, Armn", ancestors = "trk-oat", entry_name = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef}, translit = {Armn = "ota-Armn-translit"}, } m["otb"] = { "Tibet Kuno", 7085214, "sit-tib", "Tibt", translit = "Tibt-translit", override_translit = true, display_text = s["Tibt-displaytext"], entry_name = s["Tibt-entryname"], sort_key = "Tibt-sortkey", } m["otd"] = { "Ot Danum", 3033781, "poz-brw", "Latn", } m["ote"] = { "Otomi Mezquital", 23755711, "oto-otm", "Latn", } m["oti"] = { "Oti", 3357881, } m["otk"] = { "Turk Kuno", 34988, "trk", "Orkh", translit = "Orkh-translit", } m["otl"] = { "Otomi Tilapa", 7802050, "oto-otm", "Latn", } m["otm"] = { "Otomi Tanah Tinggi Timur", 13581718, "oto-otm", "Latn", } m["otn"] = { "Otomi Tenango", 25559589, "oto-otm", "Latn", } m["otq"] = { "Otomi Querétaro", 23755688, "oto-otm", "Latn", } m["otr"] = { "Otoro", 36328, "alv-hei", } m["ots"] = { "Otomi Estado de México", 7413841, "oto-otm", "Latn", } m["ott"] = { "Otomi Temoaya", 7698191, "oto-otm", "Latn", } m["otu"] = { "Otuke", 7110049, "sai-mje", "Latn", } m["otw"] = { "Ottawa", 133678, "alg", "Latn", ancestors = "oj", } m["otx"] = { "Otomi Texcatepec", 25559590, "oto-otm", "Latn", } m["oty"] = { "Tamil Kuno", 20987452, "dra", "Brah", translit = "Brah-translit", } m["otz"] = { "Otomi Ixtenco", 6101171, "oto-otm", "Latn", } m["oub"] = { "Glio-Oubi", 3914977, "kro-grb", } m["oue"] = { "Oune", 7110521, "paa-sbo", } m["oui"] = { "Uyghur Kuno", nil, "trk-sib", "Ougr, Latn, Brah, Mani, Syrc, Phag", } m["oum"] = { "Ouma", 7110494, "poz-ocw", "Latn", } m["ovd"] = { "Älvdalen", 254950, "gmq", "Latn", ancestors = "non", } m["owi"] = { "Owiniga", 56454, "qfa-mal", "Latn", } m["owl"] = { "Wales Kuno", 2266723, "cel-bry", "Latn", } m["oyb"] = { "Oy", 13593748, "mkh-ban", } m["oyd"] = { "Oyda", 7116251, "omv-nom", } m["oym"] = { "Wayampi", 
7975842, "tup-gua", "Latn", } m["oyy"] = { "Oya'oya", 7116243, "poz-ocw", "Latn", } m["ozm"] = { "Koonzime", 35566, "bnt-ndb", "Latn", } return require("Module:languages").finalizeData(m, "language") 7d637vy00f3zv6l8x23f5epevazae54 Modul:languages/data/3/p 828 9817 281247 273098 2026-04-21T13:32:19Z Hakimi97 2668 281247 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["pab"] = { "Pareci", 3504312, "awd", "Latn", } m["pac"] = { "Pacoh", 3441136, "mkh-kat", "Latn", } m["pad"] = { "Paumarí", 389827, "auf", "Latn", } m["pae"] = { "Pagibete", 7124357, "bnt-bta", "Latn", } m["paf"] = { "Paranawát", 12953806, "tup-gua", "Latn", } m["pag"] = { "Pangasinan", 33879, "phi", "Latn, Tglg", entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer}, } m["pah"] = { "Tenharim", 10266010, "tup-gua", "Latn", } m["pai"] = { "Pe", 3914871, "nic-tar", "Latn", } m["pak"] = { "Parakanã", 12953804, "tup-gua", "Latn", } m["pal"] = { "Parsi Pertengahan", 32063, "ira-swi", "Latn, Phli, pal-Avst, Mani, Phlp, Phlv", -- Latn for translit; Phlv not in Unicode translit = { Phli = "Phli-translit", ["pal-Avst"] = "Avst-translit", Mani = "Mani-translit", }, ancestors = "peo", } m["pam"] = { "Kapampangan", 36121, "phi", "Latn", --also Kulitan, which lacks a code entry_name = {Latn = {remove_diacritics = c.grave .. c.acute .. 
c.circ}}, standardChars = { Latn = "AaBbDdEeGgHhIiKkLlMmNnOoPpRrSsTtUuWwYy", c.punc }, sort_key = { Latn = "tl-sortkey" }, } m["pao"] = { "Paiute Utara", 3360656, "azc-num", "Latn", } m["pap"] = { "Papiamentu", 33856, "crp", "Latn", ancestors = "pt", } m["paq"] = { "Parya", 1135134, "inc-cen", ancestors = "psu", } m["par"] = { "Panamint", 33926, "azc-num", "Latn", } m["pas"] = { "Papasena", 7132508, "paa-lkp", "Latn", } m["pat"] = { "Papitalai", 6528659, "poz-aay", "Latn", } m["pau"] = { "Palau", 33776, "poz", "Latn, Kana", sort_key = { Kana = "Kana-sortkey" }, } m["pav"] = { "Wari'", 3027909, "sai-cpc", "Latn", } m["paw"] = { "Pawnee", 56751, "cdd", "Latn", } m["pax"] = { "Pankararé", 25559779, nil, "Latn", } m["pay"] = { "Pech", 4898889, "cba", "Latn", } m["paz"] = { "Pankararú", 7131310, nil, "Latn", } m["pbb"] = { "Páez", 33677, nil, "Latn", } m["pbc"] = { "Patamona", 3915921, "sai-pem", "Latn", } m["pbe"] = { "Mezontla Popoloca", 42365630, "omq-pop", "Latn", } m["pbf"] = { "Coyotepec Popoloca", 5180100, "omq-pop", "Latn", } m["pbg"] = { "Paraujano", 3501747, "awd-taa", "Latn", } m["pbh"] = { "Panare", 56610, "sai-ven", "Latn", } m["pbi"] = { "Podoko", 3515096, "cdc-cbm", "Latn", } m["pbl"] = { "Mak (Nigeria)", 3915349, "alv-bwj", "Latn", } m["pbm"] = { "Puebla Mazatec", nil, "omq-maz", "Latn", } m["pbn"] = { "Kpasam", 3914902, "alv-mye", "Latn", } m["pbo"] = { "Papel", 36314, "alv-pap", "Latn", } m["pbp"] = { "Badyara", 35095, "alv-ten", "Latn", } m["pbr"] = { "Pangwa", 3847550, "bnt-bki", "Latn", } m["pbs"] = { "Pame Tengah", 3361763, "omq", "Latn", } m["pbv"] = { "Pnar", 3501850, "aav-pkl", "Latn", } m["pby"] = { "Pyu", 2567925, "paa-asa", "Latn", } m["pca"] = { "Santa Inés Ahuatempan Popoloca", 42365276, "omq-pop", "Latn", } m["pcb"] = { "Pear", 6583669, "mkh-pea", "Khmr", } m["pcc"] = { "Bouyei", 35100, "tai-nor", "Latn, Hani", sort_key = {Hani = "Hani-sortkey"}, } m["pcd"] = { "Picard", 34024, "roa-oil", "Latn", ancestors = "fro", sort_key = 
s["roa-oil-sortkey"], } m["pce"] = { "Ruching Palaung", 12953798, "mkh-pal", } m["pcf"] = { "Paliyan", 7127643, "dra", } m["pcg"] = { "Paniya", 7131211, "dra", } m["pch"] = { "Pardhan", 7133207, "dra", ancestors = "gon", } m["pci"] = { "Duruwa", 56753, "dra", "Deva, Orya", } m["pcj"] = { "Parenga", 3111396, "mun", } m["pck"] = { "Paite", 12952337, "tbq-kuk", } m["pcl"] = { "Pardhi", 7136554, "inc-bhi", } m["pcm"] = { "Pijin Nigeria", 33655, "crp", "Latn", ancestors = "en", entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron .. c.macronbelow}, sort_key = { remove_diacritics = c.tilde, from = {"ẹ", "gb", "kp", "ọ", "sh", "zh"}, to = {"e" .. p[1], "g" .. p[1], "k" .. p[1], "o" .. p[1], "s" .. p[1], "z" .. p[1]} }, } m["pcn"] = { "Piti", 3913375, "nic-kne", "Latn", } m["pcp"] = { "Pacahuara", 2591165, "sai-pan", "Latn", } m["pcw"] = { "Pyapun", 3438807, nil, "Latn", } m["pda"] = { "Anam", 3501930, "ngf-mad", "Latn", } m["pdc"] = { "Jerman Pennsylvania", 22711, "gmw", "Latn", ancestors = "gmw-rfr", } m["pdi"] = { "Pa Di", 3359940, nil, "Latn", } m["pdn"] = { "Fedan", 7206699, "poz-ocw", "Latn", } m["pdo"] = { "Padoe", 3360370, "poz-btk", "Latn", } m["pdt"] = { "Plautdietsch", 1751432, "gmw", "Latn", ancestors = "nds-de", } m["pdu"] = { "Kayan", 7123283, "kar", "Latn", } m["pea"] = { "Peranakan Indonesian", 653415, nil, "Latn", } m["peb"] = { "Pomo Timur", 3396032, "nai-pom", "Latn", } m["ped"] = { "Mala (New Guinea)", 11732569, "ngf-mad", "Latn", } m["pee"] = { "Taje", 12953902, nil, "Latn", } m["pef"] = { "Pomo Timur Laut", 3396018, "nai-pom", "Latn", } m["peg"] = { "Pengo", 56758, "dra", "Orya", translit = "kxv-translit", } m["peh"] = { "Bonan", 32983, "xgn-shr", "Latn", } m["pei"] = { "Chichimeca-Jonaz", 3915427, "omq-otp", "Latn", } m["pej"] = { "Pomo Utara", 3396021, "nai-pom", "Latn", } m["pek"] = { "Penchal", 3374631, "poz-aay", "Latn", } m["pel"] = { "Pekal", 3241781, nil, "Latn", } m["pem"] = { "Phende", 7162372, "bnt-pen", "Latn", } 
m["peo"] = { "Parsi Kuno", 35225, "ira-swi", "Xpeo, Latn", translit = "peo-translit", } m["pep"] = { "Kunja", 6444807, nil, "Latn", } m["peq"] = { "Pomo Selatan", 3396023, "nai-pom", "Latn", } -- "pes" IS TREATED AS "fa" (or as etymology-only), SEE WT:LT m["pev"] = { "Pémono", 3439012, "sai-map", "Latn", } m["pex"] = { "Petats", 3376353, "poz-ocw", "Latn", } m["pey"] = { "Petjo", 940486, nil, "Latn", } m["pez"] = { "Penan Timur", 18638342, "poz-swa", "Latn", } m["pfa"] = { "Pááfang", 3063517, "poz-mic", "Latn", } m["pfe"] = { "Peere", 36377, "alv-dur", "Latn", } m["pga"] = { "Arab Juba", 1262143, "crp", "Latn", ancestors = "apd", } m["pgd"] = { "Gandhari", nil, "inc-mid", "Deva, Khar", ancestors = "inc-ash", translit = "Khar-translit", } m["pgg"] = { "Pangwali", 13600429, "him", "Deva, Takr", translit = "hi-translit", } m["pgi"] = { "Pagi", 7124354, "paa-brd", "Latn", } m["pgk"] = { "Rerep", 586907, "poz-vnc", "Latn", } m["pgl"] = { "Primitive Irish", 3320030, "cel-gae", "Ogam", translit = "pgl-translit", } m["pgn"] = { "Paelignian", nil, "itc-sbl", "Latn", } m["pgs"] = { "Pangseng", 3914027, "alv-mum", "Latn", } m["pgu"] = { "Pagu", 7124462, "paa-nha", "Latn", } m["pgz"] = { "Bahasa Isyarat Papua New Guinea", 25044405, "sgn", } m["pha"] = { "Pa-Hng", 2625410, "hmn", } m["phd"] = { "Phudagi", 7188289, } m["phg"] = { "Phuong", 7188376, "mkh-kat", } m["phh"] = { "Phukha", 7188298, "tbq-lol", } m["phk"] = { "Phake", 7675798, "tai-swe", "Mymr", translit = "aio-phk-translit", entry_name = {remove_diacritics = c.VS01}, } m["phl"] = { "Phalura", 2449549, "inc-dar", "Latn, ur-Arab", } m["phm"] = { "Phimbi", 11007144, "bnt-sna", "Latn", } m["phn"] = { "Phoenicia", 36734, "sem-can", "Phnx", translit = "Phnx-translit", } m["pho"] = { "Phunoi", 7188361, "tbq-lol", } m["phq"] = { "Phana'", 7180427, "tbq-lol", } m["phr"] = { "Pahari-Potwari", 33739, "inc-pan", "pa-Arab, Guru", ancestors = "lah", translit = { Guru = "Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, entry_name 
= { ["pa-Arab"] = { remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna, from = {"ݨ", "ࣇ"}, to = {"ن", "ل"} }, } } m["pht"] = { "Phu Thai", 3626597, "tai-swe", } m["phu"] = { "Phuan", 3915665, } m["phv"] = { "Pahlavani", 7124567, } m["phw"] = { "Phangduwali", 12953036, "sit-kie", ancestors = "ybh", } m["pia"] = { "Pima Bajo", 3388544, "azc", "Latn", } m["pib"] = { "Yine", 3135432, "awd", "Latn", } m["pic"] = { "Pinji", 36296, "bnt-tso", "Latn", } m["pid"] = { "Piaroa", 3382207, nil, "Latn", } m["pie"] = { "Piro", 7198055, "nai-kta", "Latn", } m["pif"] = { "Pingelapese", 36421, "poz-mic", "Latn", } m["pig"] = { "Pisabo", 966883, "sai-pan", "Latn", } m["pih"] = { "Pitcairn-Norfolk", 36554, "crp", "Latn", ancestors = "en", } m["pii"] = { "Pini", 10631925, } m["pij"] = { "Pijao", 7193519, } m["pil"] = { "Yom", 36893, "nic-yon", } m["pim"] = { "Powhatan", 2270532, "alg-eas", "Latn", } m["pin"] = { "Piame", 7190042, "paa-sep", "Latn", } m["pio"] = { "Piapoco", 3382208, "awd-nwk", "Latn", } m["pip"] = { "Pero", 2411063, "cdc-wst", } m["pir"] = { "Piratapuyo", 3389119, "sai-tuc", "Latn", } m["pis"] = { "Pijin", 36699, "crp", "Latn", ancestors = "en", } m["pit"] = { "Pitta-Pitta", 6433116, "aus-kar", "Latn", } m["piu"] = { "Pintupi-Luritja", 2591175, "aus-pam", } m["piv"] = { "Pileni", 2976736, "poz-pnp", "Latn", } m["piw"] = { "Pimbwe", 3894132, "bnt-mwi", } m["pix"] = { "Piu", 7199578, } m["piy"] = { "Piya-Kwonci", 3440492, } m["piz"] = { "Pije", 3388339, "poz-cln", "Latn", } m["pjt"] = { "Pitjantjatjara", 2982063, "aus-pam", "pjt-Latn", } m["pkb"] = { "Kipfokomo", 7208693, "bnt-sab", "Latn", } m["pkc"] = { "Baekje", 4841264, "qfa-kor", "Hani, Kana", sort_key = { Hani = "Hani-sortkey", Kana = "Kana-sortkey" }, } m["pkg"] = { "Pak-Tong", 3360711, } m["pkh"] = { "Pankhu", 7130962, "tbq-kuk", } m["pkn"] = { "Pakanha", 954916, "aus-pmn", } m["pko"] = { "Pökoot", 36323, "sdv-kln", } m["pkp"] = { 
"Pukapukan", 36447, "poz-pnp", "Latn", } m["pkr"] = { "Attapady Kurumba", 16835180, "dra", } m["pks"] = { "Bahasa Isyarat Pakistan", 22964057, "sgn", } m["pkt"] = { "Maleng", 6583562, "mkh-vie", } m["pku"] = { "Paku", 2932604, } m["pla"] = { "Miani", 12952844, nil, "Latn", } m["plb"] = { "Polonombauk", 7225957, "poz-vnc", "Latn", } m["plc"] = { "Central Palawano", 12953795, "phi", "Latn", } m["ple"] = { "Palu'e", 2196866, "poz-cet", "Latn", } m["plg"] = { "Pilagá", 2748259, "sai-guc", "Latn", } m["plh"] = { "Paulohi", 7155331, "poz-cma", } m["plj"] = { "Polci", 3914383, } m["plk"] = { "Kohistani Shina", 12953882, "inc-dar", } m["pll"] = { "Shwe Palaung", 27941664, "mkh-pal", } m["pln"] = { "Palenquero", 36665, "crp", "Latn", ancestors = "es", } m["plo"] = { "Oluta Popoluca", 5908687, "nai-miz", "Latn", } m["plq"] = { "Palaic", 36582, "ine-ana", "Xsux", } m["plr"] = { "Palaka Senoufo", 36346, "alv-snf", "Latn", } m["pls"] = { "San Marcos Tlalcoyalco Popoloca", 12641692, "omq-pop", "Latn", } m["plu"] = { "Palikur", 3073448, "awd", "Latn", } m["plv"] = { "Palawano Barat Daya", 15614922, "phi", "Latn", } m["plw"] = { "Palawano Brooke's Point", 12953796, "phi", "Latn", } m["ply"] = { "Bolyu", 3361723, "mkh-pkn", "Latn", } m["plz"] = { "Paluan", 7128795, nil, "Latn", } m["pma"] = { "Paama", 3130286, "poz-vnc", "Latn", } m["pmb"] = { "Pambia", 36267, "znd", "Latn", } m["pmd"] = { "Pallanganmiddang", 7127734, "aus-pam", "Latn", } m["pme"] = { "Pwaamei", 3411152, "poz-cln", "Latn", } m["pmf"] = { "Pamona", 3513320, "poz-kal", "Latn", } m["pmi"] = { "Pumi Utara", 3403245, "sit-qia", } m["pmj"] = { "Pumi Selatan", 3403246, "sit-qia", } m["pmk"] = { "Pamlico", nil, "alg-eas", "Latn", } m["pml"] = { "Sabir", 636479, "crp", "Latn", ancestors = "lij, pro, vec", } m["pmm"] = { "Pol", 36408, "bnt-kak", "Latn", } m["pmn"] = { "Pam", 7129017, "alv-mbm", } m["pmo"] = { "Pom", 7227178, "poz-hce", "Latn", } m["pmq"] = { "Pame Utara", 3361762, "omq", "Latn", } m["pmr"] = { "Paynamar", 
3450824, } m["pms"] = { "Piemonte", 15085, "roa-git", "Latn", } m["pmt"] = { "Tuamotuan", 36763, "poz-pep", "Latn", } m["pmu"] = { "Mirpur Panjabi", 6874480, } m["pmw"] = { "Plains Miwok", 3391031, "nai-you", "Latn", } m["pmx"] = { "Poumei Naga", 12952910, "tbq-anp", } m["pmy"] = { "Papuan Malay", 12473446, nil, "Latn", } m["pmz"] = { "Southern Pame", 3361765, "omq", "Latn", } m["pna"] = { "Punan Bah-Biau", 4842201, "poz-bnn", "Latn", } m["pnb"] = { "Punjabi Barat", 58635, "inc-pan", "pa-Arab", ancestors = "pa", } m["pnc"] = { "Pannei", 7131391, } m["pnd"] = { "Mpinda", 63308194, "bnt-kmb", } m["pne"] = { "Penan Barat", 12953808, "poz-swa", "Latn", } m["png"] = { "Pongu", 36282, "nic-shi", } m["pnh"] = { "Penrhyn", 3130301, "poz-pep", "Latn", } m["pni"] = { "Aoheng", 4778608, "poz", "Latn", } m["pnj"] = { "Pinjarup", 33103591, } m["pnk"] = { "Paunaca", 2064378, "awd", "Latn", } m["pnl"] = { "Paleni", 7127118, "alv-wan", "Latn", } m["pnm"] = { "Punan Batu", 7259892, } m["pnn"] = { "Pinai-Hagahai", 5638511, } m["pno"] = { "Panobo", 3141869, "sai-pan", "Latn", } m["pnp"] = { "Pancana", 7130204, } m["pnq"] = { "Pana (Afrika Barat)", 7129739, "nic-gnn", "Latn", } m["pnr"] = { "Panim", 11732562, "ngf-mad", } m["pns"] = { "Ponosakan", 7227956, "phi", "Latn", } m["pnt"] = { "Yunani Pontus", 36748, "grk", "Grek, Latn, Cyrl", ancestors = "gkm", translit = "el-translit", entry_name = {remove_diacritics = c.caron .. c.diaerbelow .. 
c.brevebelow}, sort_key = s["Grek-sortkey"], } m["pnu"] = { "Jiongnai Bunu", 56325, "hmn", } m["pnv"] = { "Pinigura", 10631927, "aus-psw", "Latn", } m["pnw"] = { "Panyjima", 3913830, "aus-nga", "Latn", } m["pnx"] = { "Phong-Kniang", 3914627, "mkh", } m["pny"] = { "Pinyin", 36250, "nic-nge", "Latn", } m["pnz"] = { "Pana (Afrika Tengah)", 36241, "alv-mbm", "Latn", } m["poc"] = { "Poqomam", 36416, "myn", "Latn", } m["poe"] = { "San Juan Atzingo Popoloca", 12953819, "omq-pop", "Latn", } m["pof"] = { "Poke", 7208577, "bnt-ske", } m["pog"] = { "Potiguára", 56722, "tup-gua", "Latn", } m["poh"] = { "Poqomchi'", 36414, "myn", "Latn", } m["poi"] = { "Highland Popoluca", 7511556, "nai-miz", "Latn", } m["pok"] = { "Pokangá", 25559704, "sai-tuc", "Latn", } m["pom"] = { "Pomo Tenggara", 3396025, "nai-pom", "Latn", } m["pon"] = { "Pohnpei", 28422, "poz-mic", "Latn", } m["poo"] = { "Pomo Tengah", 3396020, "nai-pom", "Latn", } m["pop"] = { "Pwapwa", 3411153, "poz-cln", "Latn", } m["poq"] = { "Texistepec Popoluca", 5908707, "nai-miz", "Latn", } m["pos"] = { "Sayula Popoluca", 5908722, "nai-miz", "Latn", } m["pot"] = { "Potawatomi", 56749, "alg", "Latn", } m["pov"] = { "Kreol Guinea-Bissau", 33339, "crp", "Latn", ancestors = "pt", } m["pow"] = { "San Felipe Otlaltepec Popoloca", 25559598, "omq-pop", "Latn", } m["pox"] = { "Polabia", 36741, "zlw-lch", "Latn", } m["poy"] = { "Pogolo", 2429648, "bnt-kil", } m["ppa"] = { "Pao", 7132069, } m["ppe"] = { "Papi", 7132809, } m["ppi"] = { "Paipai", 56726, "nai-yuc", "Latn", } m["ppk"] = { "Uma", 7881036, "poz-kal", "Latn", } m["ppl"] = { "Pipil", 1186896, "azc-nah", "Latn", entry_name = {remove_diacritics = c.acute .. 
c.macron}, } m["ppm"] = { "Papuma", 7133239, "poz-hce", "Latn", } m["ppn"] = { "Papapana", 3362757, "poz-ocw", "Latn", } m["ppo"] = { "Folopa", 5464843, "paa", "Latn", } m["ppq"] = { "Pei", 7160903, } m["pps"] = { "San Luís Temalacayuca Popoloca", 25559602, "omq-pop", "Latn", } m["ppt"] = { "Pa", 3504757, "ngf", "Latn", } m["ppu"] = { "Papora", 2094884, "map", "Latn", } m["pqa"] = { "Pa'a", 3441315, "cdc-wst", } m["pqm"] = { "Malecite-Passamaquoddy", 3183144, "alg-eas", "Latn", } m["pra"] = { "Prakrit", 192170, "inc-mid", "Brah, Deva, Gujr, Knda", ancestors = "inc-ash", translit = { Brah = "Brah-translit", Deva = "pra-Deva-translit", Gujr = "sa-Gujr-translit", Knda = "pra-Knda-translit", }, entry_name = { from = {"ऎ", "ऒ", u(0x0946), u(0x094A), "य़", "ಯ಼", u(0x11071), u(0x11072), u(0x11073), u(0x11074)}, to = {"ए", "ओ", u(0x0947), u(0x094B), "य", "ಯ", "𑀏", "𑀑", u(0x11042), u(0x11044)} } , } m["prc"] = { "Parachi", 2640637, "ira-orp", } -- "prd" IS NOT INCLUDED, SEE WT:LT m["pre"] = { "Principe", 36520, "crp", "Latn", ancestors = "pt", } m["prf"] = { "Paranan", 7135433, "phi", } m["prg"] = { "Prusia Kuno", 35501, "bat", "Latn", } m["prh"] = { "Porohanon", 6583710, "phi", } m["pri"] = { "Paicî", 732131, "poz-cln", "Latn", } m["prk"] = { "Parauk", 3363719, "mkh-pal", } m["prl"] = { "Bahasa Isyarat Peru", 3915508, "sgn", } m["prm"] = { "Kibiri", 56745, "paa", } m["prn"] = { "Prasuni", 32689, "nur-nor", } m["pro"] = { "Occitan Kuno", 2779185, "roa-ocr", "Latn", sort_key = {remove_diacritics = c.cedilla}, } -- "prp" IS NOT INCLUDED, SEE WT:LT m["prq"] = { "Ashéninka Perené", 3450601, "awd", "Latn", } m["prr"] = { "Puri", 7261687, } -- "prs" IS TREATED AS "fa" (or as etymology-only), SEE WT:LT m["prt"] = { "Phai", 7180184, "mkh", } m["pru"] = { "Puragi", 7260800, "ngf-sbh", } m["prw"] = { "Parawen", 7136291, "ngf-mad", } m["prx"] = { "Purik", 567905, "sit-lab", } m["prz"] = { "Bahasa Isyarat Providencia", 3322084, "sgn", } m["psa"] = { "Asue Awyu", 11266334, } m["psc"] = 
{ "Bahasa Isyarat Parsi", 7170221, "sgn", } m["psd"] = { "Plains Indian Sign Language", 2380124, "sgn", } m["pse"] = { "Melayu Barisan Selatan", 3367751, "poz-mly", "Latn", } m["psg"] = { "Bahasa Isyarat Pulau Pinang", 4924925, "sgn", } m["psh"] = { "Pashayi Barat Daya", 16112270, "inc-dar", } m["psi"] = { "Pashayi Tenggara", 23713536, "inc-dar", "Arab", } m["psl"] = { "Bahasa Isyarat Puerto Rico", 7258608, "sgn-fsl", } m["psm"] = { "Pauserna", 2912846, "tup-gua", "Latn", } m["psn"] = { "Panasuan", 7130113, "poz", } m["pso"] = { "Bahasa Isyarat Poland", 3915194, "sgn-gsl", } m["psp"] = { "Bahasa Isyarat Filipina", 3551357, "sgn-fsl", } m["psq"] = { "Pasi", 7142091, } m["psr"] = { "Bahasa Isyarat Portugis", 3915472, "sgn", } m["pss"] = { "Kaulong", 3194294, "poz-ocw", } m["psw"] = { "Port Sandwich", 3398324, "poz-vnc", "Latn", } m["psy"] = { "Piscataway", 3504233, "alg-eas", } m["pta"] = { "Pai Tavytera", 7124619, "tup-gua", "Latn", } m["pth"] = { "Pataxó Hã-Ha-Hãe", 7144304, } m["pti"] = { "Pintiini", 10632026, "aus-pam", } m["ptn"] = { "Patani", 7144242, "poz-hce", "Latn", } m["pto"] = { "Zo'é", 8073148, "tup-gua", "Latn", } m["ptp"] = { "Patep", 3368679, "poz-ocw", "Latn", } m["ptq"] = { "Pattapu", nil, "dra", } m["ptr"] = { "Piamatsina", 7190040, "poz-vnc", "Latn", } m["ptt"] = { "Enrekang", 12953520, } m["ptu"] = { "Bambam", 4853321, "poz-ssw", "Latn", } m["ptv"] = { "Port Vato", 3398323, nil, "Latn", } m["ptw"] = { "Pentlatch", 2069475, } m["pty"] = { "Pathiya", 7144790, "dra", } m["pua"] = { "Purepecha", 16114351, "qfa-iso", "Latn", sort_key = {remove_diacritics = c.acute}, } m["pub"] = { "Purum", 6400562, "tbq-kuk", "Latn", } m["puc"] = { "Punan Merap", 7259895, } m["pud"] = { "Punan Aput", 4782333, "poz-swa", "Latn", } m["pue"] = { "Puelche", 33660, } m["puf"] = { "Punan Merah", 7259894, } m["pug"] = { "Phuie", 36375, "nic-gnw", } m["pui"] = { "Puinave", 3027918, } m["puj"] = { "Punan Tubu", 7259896, "poz-swa", "Latn", } m["pum"] = { "Puma", 33736, 
"sit-kic", } m["puo"] = { "Puoc", 6440803, "mkh", } m["pup"] = { "Pulabu", 7259163, "ngf-mad", } m["puq"] = { "Puquina", 1207739, } m["pur"] = { "Puruborá", 7261619, "tup", } m["put"] = { "Putoh", 12953832, "poz-swa", "Latn", } m["puu"] = { "Punu", 36401, "bnt-sir", "Latn", } m["puw"] = { "Puluwat", 36397, "poz-mic", "Latn", } m["pux"] = { "Puare", 3507983, } m["puy"] = { "Purisimeño", 2967638, "nai-chu", "Latn", } m["pwa"] = { "Pawaia", 7156099, "paa", "Latn", } m["pwb"] = { "Panawa", 47385077, "nic-jer", "Latn", ancestors = "jer", } m["pwg"] = { "Gapapaiwa", 3095245, "poz-ocw", "Latn", } m["pwi"] = { "Patwin", 3370188, "nai-wtq", "Latn", } m["pwm"] = { "Molbog", 6895718, "poz-san", "Latn", } m["pwn"] = { "Paiwan", 715755, "map", "Latn", } m["pwo"] = { "Pwo Barat", 7988202, "kar", "Mymr", } m["pwr"] = { "Powari", 12640277, "inc-hie", "Deva", } m["pww"] = { "Pwo Utara", 7058885, "kar", "Thai", } m["pxm"] = { "Quetzaltepec Mixe", 6842374, "nai-miz", "Latn", } m["pye"] = { "Pye Krumen", 11157382, "kro-grb", } m["pym"] = { "Fyam", 3914025, "nic-ple", "Latn", } m["pyn"] = { "Poyanáwa", 3401023, "sai-pan", } m["pys"] = { "Bahasa Isyarat Paraguay", 7134698, "sgn", } m["pyu"] = { "Puyuma", 716690, "map", "Latn", } m["pyx"] = { "Tircul", 36259, "sit", } m["pyy"] = { "Pyen", 7262966, "tbq-lol", } m["pzh"] = { "Pazeh", 36435, "map", "Latn", } m["pzn"] = { "Para Naga", 7133667, "sit-aao", } return require("Module:languages").finalizeData(m, "language") fribkhuo11qo3woq888t6l2nhypqvc1 Modul:languages/data/3/x 828 9824 281274 276274 2026-04-21T14:03:17Z Hakimi97 2668 281274 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["xaa"] = { "Arab Andalusia", 1137945, "sem-arb", "Arab, Latn", entry_name = { remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef, from = {u(0x0671)}, to = {u(0x0627)} }, } m["xab"] = { "Sambe", 36265, "nic-alu", "Latn", } m["xac"] = { "Kachari", 3442442, "tbq-bdg", } m["xad"] = { "Adai", 346744, } m["xae"] = { "Aequian", 930579, "itc", } m["xag"] = { "Aghwan", 34931, "cau-esm", "Aghb", translit = "Aghb-translit", override_translit = true, } m["xai"] = { "Kaimbé", 6348017, } m["xaj"] = { "Ararandewára", nil, "tup-gua", "Latn", } m["xak"] = { "Maku", 2032882, nil, "Latn", } m["xal"] = { "Kalmyk", 33634, "xgn-cen", "Cyrl, xwo-Mong", ancestors = "xwo", translit = "xal-translit", override_translit = true, sort_key = "xal-sortkey", } m["xam"] = { "ǀXam", 2086145, "khi-tuu", "Latn", } m["xan"] = { "Xamtanga", 56527, "cus-cen", } m["xao"] = { "Khao", 3196077, "mkh-pal", } m["xap"] = { "Apalachee", 686501, "nai-mus", "Latn", } m["xaq"] = { "Aquitanian", 500522, "euq", "Latn", } m["xar"] = { "Karami", 11732281, } m["xas"] = { "Kamassian", 35991, translit = "xas-translit", "syd", "Cyrl", } m["xat"] = { "Katawixi", 3440512, "sai-ktk", } m["xau"] = { "Kauwera", 6378983, "paa-tkw", } m["xav"] = { "Xavante", 36962, "sai-cje", "Latn", } m["xaw"] = { "Kawaiisu", 56338, "azc-num", "Latn", } m["xay"] = { "Kayan Mahakam", 25337171, } m["xbb"] = { "Lower Burdekin", 6693353, } m["xbc"] = { "Baktria", 756651, "ira-sbc", "Grek, Mani", translit = "xbc-translit", entry_name = { from = {"Þ", "þ"}, to = {"Ϸ", "ϸ"} }, } m["xbd"] = { "Bindal", 4913975, } m["xbe"] = { "Bigambal", 16841801, "aus-pam", --unclassified within } m["xbg"] = { "Bunganditj", 4997615, } m["xbi"] = { "Kombio", 6428259, "qfa-tor", "Latn", } m["xbj"] = { "Birrpayi", nil, } m["xbm"] = { "Breton Pertengahan", 787610, "cel-bry", 
"Latn", ancestors = "obt", } m["xbn"] = { "Kenaboi", 6388752, } m["xbo"] = { "Bulgar", 36880, "trk-ogr", "Arab, Grek", } m["xbp"] = { "Bibbulman", 22918391, } m["xbr"] = { "Kambera", 3053279, "poz-cet", "Latn", } m["xbw"] = { "Kambiwá", 9006744, } m["xby"] = { "Butchulla", 31752631, } m["xcb"] = { "Cumbric", 35965, "cel-bry", } m["xcc"] = { "Camunic", 489011, nil, "Ital", translit = "Ital-translit", } m["xce"] = { "Celtiberian", 37012, "cel", "Latn", } m["xch"] = { "Chemakum", 56397, "chi", "Latn", } m["xcl"] = { "Armenia Kuno", 181074, "hyx", "Armn", translit = "Armn-translit", override_translit = true, entry_name = { remove_diacritics = "՞՜՛՟", from = {"եւ"}, to = {"և"} }, } m["xcm"] = { "Comecrudo", 609808, "nai-pak", } m["xcn"] = { "Cotoname", 56889, "nai-pak", } m["xco"] = { "Khwarezm", 33138, "ira-sbc", "Arab, Armi, Chrs, Phlv, Sogd", translit = {Chrs = "Chrs-translit"}, } m["xcr"] = { "Carian", 35929, "ine-ana", "Cari", } m["xct"] = { "Tibet Klasik", 5128314, "sit-tib", "Tibt, Hani, Marc, Mong, mnc-Mong, xwo-Mong, Phag, Tang, Zanb", translit = { Tibt = "Tibt-translit", Mong = "Mong-translit", ["mnc-Mong"] = "mnc-translit", ["xwo-Mong"] = "xwo-translit", Tang = "txg-translit", }, override_translit = true, display_text = { Tibt = s["Tibt-displaytext"], Mong = s["Mong-displaytext"], }, entry_name = { Tibt = s["Tibt-entryname"], Mong = s["Mong-entryname"], }, sort_key = { Tibt = "Tibt-sortkey", Hani = "Hani-sortkey", }, } m["xcu"] = { "Curonian", 35857, "bat", "Latn", } m["xcv"] = { "Chuva", 3516641, "qfa-yuk", "Cyrl", translit = "xcv-translit" } m["xcw"] = { "Coahuilteco", 2008062, "nai-pak", } m["xcy"] = { "Cayuse", 2472016, } m["xda"] = { "Darkinjung", 5223660, "aus-yuk", "Latn", } m["xdc"] = { "Dacian", 682547, "ine", "Latn", } m["xdk"] = { "Dharug", 1166814, "aus-yuk", "Latn", } m["xdm"] = { "Edom", 2363529, "sem-can", "Phnx", translit = "Phnx-translit", } m["xdq"] = { "Kaitag", 1990659, "cau-drg", "Cyrl", translit = {Cyrl = "dar-translit"}, 
override_translit = true, display_text = {Cyrl = s["cau-Cyrl-displaytext"]}, entry_name = { Cyrl = s["cau-Cyrl-entryname"], Latn = s["cau-Latn-entryname"], }, sort_key = { Cyrl = { from = { "къкъ", "хьхь", -- 4 chars "гъ", "гь", "гӏ", "ё", "къ", "кь", "кӏ", "пп", "пӏ", "сс", "тт", "тӏ", "хх", "хъ", "хь", "хӏ", "цц", "цӏ", "чч", "чӏ" -- 2 chars }, to = { "к" .. p[2], "х" .. p[4], "г" .. p[1], "г" .. p[2], "г" .. p[3], "е" .. p[1], "к" .. p[1], "к" .. p[3], "к" .. p[4], "п" .. p[1], "п" .. p[2], "с" .. p[1], "т" .. p[1], "т" .. p[2], "х" .. p[1], "х" .. p[2], "х" .. p[3], "х" .. p[5], "ц" .. p[1], "ц" .. p[2], "ч" .. p[1], "ч" .. p[2] } }, }, } m["xdy"] = { "Malayic Dayak", 3514892, } m["xeb"] = { "Ebla", 35345, "sem-eas", "Xsux", } m["xed"] = { "Hdi", 56246, "cdc-cbm", "Latn", } m["xeg"] = { "ǁXegwi", 3509732, "khi-tuu", "Latn", } m["xel"] = { "Kelo", 6386412, "sdv-eje", } m["xem"] = { "Kembayan", 6386874, } m["xep"] = { "Epi-Olmec", nil, } m["xer"] = { "Xerénte", 3073436, "sai-cje", "Latn", } m["xes"] = { "Kesawai", 6394907, "ngf-mad", "Latn", } m["xet"] = { "Xetá", 2980404, "tup-gua", "Latn", } m["xeu"] = { "Keoru-Ahia", 11732313, "ngf", } m["xfa"] = { "Falisci", 35669, "itc", "Ital, Latn", translit = "Ital-translit", entry_name = {remove_diacritics = c.macron .. c.breve .. 
c.diaer}, } m["xga"] = { "Galatia", 27403, "cel", "Latn, Grek", ancestors = "cel-gau", } m["xgb"] = { "Gbin", 16934745, "dmn-mse", "Latn", } m["xgd"] = { "Gudang", 5614528, } m["xgf"] = { "Gabrielino-Fernandeño", 56387, "azc-tak", "Latn", } m["xgg"] = { "Goreng", nil, } m["xgi"] = { "Garingbal", nil, } m["xgl"] = { "Galindan", 1190494, "bat", "Latn", } m["xgm"] = { "Darumbal", 16954400, } m["xgr"] = { "Garza", 3098656, "nai-pak", } m["xgu"] = { "Unggumi", 62000004, "aus-wor", "Latn", } m["xgw"] = { "Guwa", 5621992, } m["xha"] = { "Harami", 41506724, nil, "Sarb", translit = "Sarb-translit", } m["xhc"] = { "Hun", 35959, } m["xhd"] = { "Hadrami", 1032453, "sem-osa", "Sarb", translit = "Sarb-translit", } m["xhe"] = { "Khetrani", 2614111, "inc-pan", ancestors = "lah", } m["xhm"] = { "Khmer Pertengahan", 25226861, "mkh-kmr", "Latn, Khmr", --and also Pallava ancestors = "okz", } m["xhr"] = { "Hernican", 5908773, "itc-sbl", "Ital", } m["xht"] = { "Hatti", 31107, "qfa-iso", "Xsux", } m["xhu"] = { "Hurri", 35740, "qfa-hur", "Xsux, Ugar", } m["xhv"] = { "Khua", 22970290, "mkh-kat", } m["xib"] = { "Iberia", 855215, "qfa-iso", "Latn, Ibrn", } m["xii"] = { "Xiri", 36876, } m["xin"] = { "Xinca", 1546494, "nai-xin", "Latn", } m["xil"] = { "Illyria", 35976, "ine", type = "reconstructed", } m["xir"] = { "Xiriâna", 2028772, "awd", "Latn", } m["xis"] = { "Kisan", nil, } m["xiv"] = { "Bahasa Lembah Indus", 3428279, nil, "Inds", } m["xiy"] = { "Xipaya", 13226, "tup", } m["xjb"] = { "Minjungbal", nil, "aus-pam", "Latn", } m["xka"] = { "Kalkoti", 3877551, "inc-dar", "xka-Arab", } m["xkb"] = { "Manigri-Kambolé Ede Nago", 36042, "alv-ede", } m["xkc"] = { "Khoini", 6401919, "xme-ttc", ancestors = "xme-ttc-wes", } m["xkd"] = { "Kayan Mendalam", 12952597, } m["xke"] = { "Kereho", 6437086, "poz", "Latn", } m["xkf"] = { "Khengkha", 3695207, "sit-ebo", "Tibt", translit = "Tibt-translit", override_translit = true, display_text = s["Tibt-displaytext"], entry_name = s["Tibt-entryname"], sort_key = 
"Tibt-sortkey", } m["xkg"] = { "Kagoro", 11159524, "dmn-wmn", } m["xki"] = { "Kenyan Sign Language", 6392859, "sgn", } m["xkj"] = { "Kajali", 14916876, "xme-ttc", ancestors = "xme-ttc-cen", } m["xkk"] = { "Kaco'", 6344767, "mkh", } m["xkl"] = { "Bakung", 6736761, "poz-swa", "Latn", } m["xkn"] = { "Kayan Sungai Kayan", 12473395, "poz", "Latn", } m["xko"] = { "Kiorr", 6414519, "mkh-pal", } m["xkp"] = { "Kabatei", 34165, "xme-ttc", ancestors = "xme-ttc-cen", } m["xkq"] = { "Koroni", 3199000, "poz-btk", } m["xkr"] = { "Xakriabá", 3073441, "sai-cje", "Latn", } m["xks"] = { "Kumbewaha", 6443722, } m["xkt"] = { "Kantosi", 35651, "nic-dag", } m["xku"] = { "Kaamba", 11042324, "bnt-kng", } m["xkv"] = { "Kgalagadi", 2088743, "bnt-sts", "Latn", } m["xkw"] = { "Kembra", 12953627, "paa-pau", } m["xkx"] = { "Karore", 6373260, "poz-ocw", } m["xky"] = { "Uma' Lasan", nil, "poz-swa", } m["xkz"] = { "Kurtöp", 3695193, "sit-ebo", "Tibt, Latn", translit = {Tibt = "Tibt-translit"}, display_text = {Tibt = s["Tibt-displaytext"]}, entry_name = {Tibt = s["Tibt-entryname"]}, sort_key = {Tibt = "Tibt-sortkey"}, } m["xla"] = { "Kamula", 10957277, "ngf", } m["xlb"] = { "Loup B", 13108281, "alg-eas", "Latn", } m["xlc"] = { "Lycia", 35969, "ine-ana", "Lyci", translit = "Lyci-translit", } m["xld"] = { "Lydia", 36095, "ine-ana", "Lydi", translit = "Lydi-translit", } m["xle"] = { "Lemnos", 36203, "qfa-tyn", "Ital", translit = "Ital-translit", } m["xlg"] = { "Liguria Purba", 36104, "ine", } m["xli"] = { "Liburni", 35835, "ine", } --xln is etymology-only m["xlo"] = { "Loup A", 27921265, "alg-eas", "Latn", } m["xlp"] = { "Lepontii", 35993, "cel", "Ital", translit = "Ital-translit", } m["xls"] = { "Lusitania", 35960, "ine", "Latn", } m["xlu"] = { "Luwiya", 12634577, "ine-ana", "Xsux, Hluw", } m["xly"] = { "Elymi", 35329, nil, "Grek", } m["xmb"] = { "Mbonga", 36064, "nic-jrn", "Latn", } m["xmc"] = { "Makhuwa-Marrevone", 11127231, "bnt-mak", ancestors = "vmw", } m["xmd"] = { "Mbudum", 6799790, "cdc-cbm", 
"Latn", } m["xmf"] = { "Mingrelia", 13359, "ccs-zan", "Geor", translit = "Geor-translit", override_translit = true, } m["xmg"] = { "Mengaka", 36017, "bai", "Latn", } m["xmh"] = { "Kugu-Muminh", 10549849, "aus-pmn", "Latn", } m["xmj"] = { "Majera", 6737666, "cdc-cbm", "Latn", } m["xmk"] = { "Macedonia Purba", 35974, "grk", "Polyt", translit = "grc-translit", entry_name = {remove_diacritics = c.macron .. c.breve}, sort_key = s["Grek-sortkey"], } m["xml"] = { "Bahasa Isyarat Malaysia", 33420, "sgn", } m["xmm"] = { "Melayu Manado", 1068112, "crp", "Latn", } m["xmo"] = { "Morerebi", 12953749, "tup", "Latn", } m["xmp"] = { "Kuku-Mu'inh", 10549852, nil, "Latn", } m["xmq"] = { "Kuku-Mangk", 10549851, "aus-pam", "Latn", } m["xmr"] = { "Meroe", 13366, "afa", "Mero, Merc, Latn", -- we have entries in Latn translit = "xmr-translit", } m["xms"] = { "Bahasa Isyarat Maghribi", 6913107, "sgn", } m["xmt"] = { "Matbat", 6786187, "poz-hce", } m["xmu"] = { "Kamu", 6359779, } m["xmx"] = { "Maden", 12952756, "poz-hce", } m["xmy"] = { "Mayaguduna", 3436736, } m["xmz"] = { "Mori Bawah", 3324069, "poz-btk", "Latn", } m["xna"] = { "Arab Utara Purba", 1472213, "sem", "Narb", translit = "Narb-translit", } m["xnb"] = { "Kanakanabu", 172244, "map", "Latn", } m["xng"] = { "Mongol Pertengahan", 2582455, "xgn", "Mong, Phag, Hani, Arab, Armn", translit = {Mong = "Mong-translit"}, display_text = {Mong = s["Mong-displaytext"]}, entry_name = {Mong = s["Mong-entryname"]}, sort_key = {Hani = "Hani-sortkey"}, } m["xnh"] = { "Kuanhua", 6441084, "mkh-pal", } m["xni"] = { "Ngarigu", 7022072, "aus-yuk", } m["xnk"] = { "Nganakarti", 33087049, } m["xnn"] = { "Kankanay Utara", 12953609, "phi", } -- "xno" IS TREATED AS "fro", SEE WT:LT m["xnr"] = { "Kangri", 2331560, "him", "Deva, Takr, fa-Arab", ancestors = "doi", translit = "hi-translit", } m["xns"] = { "Kanashi", 6360672, "sit-whm", } m["xnt"] = { "Narragansett", 3336118, "alg-eas", "Latn", entry_name = {remove_diacritics = c.grave .. c.acute .. c.tilde .. 
c.macron}, } m["xnu"] = { "Nukunul", 7068904, } m["xny"] = { "Nyiyaparli", 16919427, "aus-nga", "Latn", } m["xoc"] = { "O'chi'chi'", 3813833, "nic-cde", "Latn", } m["xod"] = { "Kokoda", 6426734, "ngf-sbh", } m["xog"] = { "Soga", 33784, "bnt-nyg", "Latn", } m["xoi"] = { "Kominimung", 6428352, "paa", "Latn", } m["xok"] = { "Xokleng", 3027930, "sai-sje", } m["xom"] = { "Komo", 56681, "ssa-kom", } m["xon"] = { "Konkomba", 35674, "nic-grm", "Latn", } m["xoo"] = { -- contrast kzw, sai-kat, sai-xoc "Xukurú", 9096758, } m["xop"] = { "Kopar", 11732346, } m["xor"] = { "Korubo", 3199022, } m["xow"] = { "Kowaki", 6434920, "ngf-mad", } m["xpa"] = { "Pirriya", 16978087, } m["xpb"] = { "Pyemmairre", 7262964, nil, "Latn", } m["xpc"] = { "Pecheneg", 877881, "trk", } m["xpd"] = { "Paredarerme", 7136678, nil, "Latn", } m["xpe"] = { "Liberia Kpelle", 20527226, "dmn-msw", ancestors = "kpe", } m["xpf"] = { "Tasmania Tenggara", 7068421, nil, "Latn", } m["xpg"] = { "Phrygia", 36751, "ine", "Grek", translit = "grc-translit", } m["xph"] = { "Tyerrernotepanner", 7859815, nil, "Latn", } m["xpi"] = { "Pict", 856383, "cel", "Ogam, Latn", } m["xpj"] = { "Mpalitjanh", 6928192, "aus-pam", } m["xpk"] = { "Kulina", 6443027, "sai-pan", } m["xpl"] = { "Port Sorell", 7230944, nil, "Latn", } m["xpm"] = { "Pumpokol", 2991985, "qfa-yen", "Latn", } m["xpn"] = { "Kapinawá", 6366667, } m["xpo"] = { "Pochutec", 2427341, "azc-nah", "Latn", } m["xpp"] = { "Puyo-Paekche", nil, } m["xpq"] = { "Mohegan-Pequot", 3319130, "alg-eas", "Latn", } m["xpr"] = { "Parthia", 25953, "ira-mpr", "Prti, Mani, Phlv", translit = { Prti = "Prti-translit", Mani = "Mani-translit", }, } m["xps"] = { "Pisidia", 36580, "ine-ana", } m["xpu"] = { "Punik", 535958, "sem-can", "Phnx, Latn, Grek", ancestors = "phn", translit = {Phnx = "Phnx-translit"}, } m["xpv"] = { "Tommeginne", 7819095, nil, "Latn", } m["xpw"] = { "Peerapper", 7160431, nil, "Latn", } m["xpx"] = { "Toogee", 7824008, nil, "Latn", } m["xpy"] = { "Buyeo", 5003359, "qfa-kor", 
"Hani", sort_key = "Hani-sortkey", } m["xpz"] = { "Pulau Bruny", 4979601, nil, "Latn", } m["xqa"] = { "Karakhanid", nil, "trk-kar", "Arab", entry_name = "ar-entryname", } m["xqt"] = { "Qatabanian", 384101, "sem-osa", "Sarb", translit = "Sarb-translit", } m["xra"] = { "Krahô", 3199549, "sai-nje", "Latn", } m["xrb"] = { "Karaboro Timur", 35716, "alv-krb", } m["xrd"] = { "Gundungurra", nil, } m["xre"] = { "Kreye", 3199686, "sai-nje", } m["xrg"] = { "Minang", 22893424, } m["xri"] = { "Krikati-Timbira", 3199710, } m["xrm"] = { "Armazic", 7599646, } m["xrn"] = { "Arin", 34088, "qfa-yen", "Latn", } m["xrq"] = { "Karranga", 6373349, nil, "Latn", } m["xrr"] = { "Raetic", 36689, nil, "Ital", translit = "Ital-translit", } m["xrt"] = { "Aranama-Tamique", 2859505, } m["xru"] = { "Marriammu", 10577724, "aus-dal", } m["xrw"] = { "Karawa", 6368857, "paa-spk", } m["xsa"] = { "Sabaean", 1070391, "sem-osa", "Sarb", translit = "Sarb-translit", } m["xsb"] = { "Sambal", 2592378, "phi", "Latn", } m["xsd"] = { "Sidetic", 36659, "ine-ana", } m["xse"] = { "Sempan", 3504358, } m["xsh"] = { "Shamang", 3914876, "nic-plc", } m["xsi"] = { "Sio", 3485100, "poz-ocw", } m["xsj"] = { "Subi", 7631298, "bnt-haj", } m["xsl"] = { "Slavey Selatan", 28552, "ath-nor", "Latn", } m["xsm"] = { "Kasem", 35552, "nic-gnn", } m["xsn"] = { "Sanga (Nigeria)", 3915334, "nic-jer", "Latn", } m["xso"] = { "Solano", 2474492, nil, "Latn", } m["xsp"] = { "Silopi", 7515533, "ngf-mad", } m["xsq"] = { "Makhuwa-Saka", 11008159, "bnt-mak", ancestors = "vmw", } m["xsr"] = { "Sherpa", 36612, "sit-tib", "Tibt, Deva", ancestors = "xct", translit = { Tibt = "Tibt-translit", Deva = "xsr-Deva-translit", }, override_translit = true, display_text = {Tibt = s["Tibt-displaytext"]}, entry_name = {Tibt = s["Tibt-entryname"]}, sort_key = {Tibt = "Tibt-sortkey"}, } m["xss"] = { "Assan", 34089, "qfa-yen", "Latn", } m["xsu"] = { "Sanumá", 251728, "sai-ynm", "Latn", } m["xsv"] = { "Sudovian", 35603, "bat", "Latn", } m["xsy"] = { "Saisiyat", 
716695, "map", "Latn", } m["xta"] = { "Alcozauca Mixtec", 25559587, "omq-mxt", "Latn", } m["xtb"] = { "Chazumba Mixtec", 12182838, "omq-mxt", "Latn", } m["xtc"] = { "Kadugli", 3407136, "qfa-kad", "Latn", } m["xtd"] = { "Diuxi-Tilantongo Mixtec", 7802048, "omq-mxt", "Latn", } m["xte"] = { "Ketengban", 10990152, } m["xth"] = { "Yitha Yitha", nil, } m["xti"] = { "Sinicahua Mixtec", 12953733, "omq-mxt", "Latn", } m["xtj"] = { "San Juan Teita Mixtec", 32093049, "omq-mxt", "Latn", } m["xtl"] = { "Tijaltepec Mixtec", 12953738, "omq-mxt", "Latn", } m["xtm"] = { "Mixtec Magdalena Peñasco", 7179700, "omq-mxt", "Latn", } m["xtn"] = { "Mixtec Tlaxiaco Utara", 25559585, "omq-mxt", "Latn", } m["xto"] = { "Tocharia A", 2827041, "ine-toc", "Latn", wikipedia_article = "Tocharian languages", -- wikidata id has no associated article } m["xtp"] = { "Mixtec San Miguel Piedras", 7414970, "omq-mxt", "Latn", } m["xtq"] = { "Tumshuq", nil, "xsc-sak", "Brah, Khar", translit = "Brah-translit", } m["xtr"] = { "Tripuri Awal", nil, } m["xts"] = { "Mixtec Sindihui", 13583581, "omq-mxt", "Latn", } m["xtt"] = { "Mixtec Tacahua", 7673668, "omq-mxt", "Latn", } m["xtu"] = { "Mixtec Cuyamecalco", 12953726, "omq-mxt", "Latn", } m["xtv"] = { "Thawa", 7711494, } m["xtw"] = { "Tawandê", nil, "sai-nmk", "Latn", } m["xty"] = { "Mixtec Yoloxochitl", 8054817, "omq-mxt", "Latn", } m["xtz"] = { "Tasmania", 530739, nil, "Latn", } m["xua"] = { "Kurumba Alu", 12952679, "dra", } m["xub"] = { "Kurumba Betta", 16841033, "dra", "Knda, Mlym, Taml", } m["xud"] = { "Umiida", 61999874, "aus-wor", "Latn", } m["xug"] = { "Kunigami", 56558, "jpx-ryu", "Jpan", translit = s["Jpan-translit"], sort_key = s["Jpan-sortkey"], } m["xuj"] = { "Jennu Kurumba", 21282543, "dra", } m["xul"] = { "Ngunawal", 7022712, "aus-yuk", "Latn", } m["xum"] = { "Umbri", 36957, "itc-sbl", "Ital, Latn", translit = "Ital-translit", } m["xun"] = { "Unggaranggu", 61999823, "aus-wor", "Latn", } m["xuo"] = { "Kuo", 6445233, "alv-mbm", } m["xup"] = { "Upper 
Umpqua", 20607, "ath-pco", "Latn", } m["xur"] = { "Urartian", 36934, "qfa-hur", "Xsux", } m["xut"] = { "Kuthant", 6448417, } m["xuu"] = { "Khwe", 28305, "khi-kal", "Latn", } m["xve"] = { "Venetic", 36871, "ine", "Ital", translit = "Ital-translit", } -- m["xvi"] = { "Kamviri", 1193495, "nur-nor", Arab } moved to etym-only code m["xvn"] = { "Vandalic", 36835, "gme", "Latn", } m["xvo"] = { "Volscian", 622110, "itc-sbl", "Latn", } m["xvs"] = { "Vestinian", 2576407, "itc", "Latn", } m["xwa"] = { "Kwaza", 3200839, } m["xwc"] = { "Woccon", 3569569, "nai-cat", "Latn", } m["xwd"] = { "Wadi Wadi", 7959249, } m["xwe"] = { "Xwela Gbe", 36887, "alv-pph", } m["xwg"] = { "Kwegu", 56723, "sdv", } m["xwj"] = { "Wajuk", 33110188, } m["xwk"] = { "Wangkumara", 7967891, "aus-pam", "Latn", } m["xwl"] = { "Gbe Xwla Barat", 36924, "alv-pph", "Latn", } m["xwo"] = { "Oirat Bertulis", 56959, "xgn-cen", "xwo-Mong", translit = "xwo-translit", } m["xwr"] = { "Kwerba Mamberamo", 6450325, "paa-tkw", } m["xww"] = { "Wemba-Wemba", 18472819, "aus-pam", "Latn", } m["xxb"] = { "Boro", 16844787, nil, "Latn", } m["xxk"] = { "Ke'o", 3195346, } m["xxm"] = { "Minkin", 6867836, } m["xxr"] = { "Koropó", 6432560, } m["xxt"] = { "Tambora", 36711, "paa", "Latn", } m["xya"] = { "Yaygir", 8050525, "aus-pam", } m["xyb"] = { "Yandjibara", nil, nil, "Latn", } m["xyl"] = { "Yalakalore", 12645352, "sai-nmk", "Latn", } m["xyt"] = { "Mayi-Thakurti", 47004719, "aus-pam", "Latn", } m["xyy"] = { "Yorta Yorta", 8055849, "aus-pam", "Latn", } m["xzh"] = { "Zhang-Zhung", 3437292, "sit-alm", "xzh-Tibt, Marc", display_text = {["xzh-Tibt"] = s["Tibt-displaytext"]}, entry_name = {["xzh-Tibt"] = s["Tibt-entryname"]}, } m["xzm"] = { "Zemgalia", 47631, "bat", } m["xzp"] = { "Zapotec Purba", nil, } return require("Module:languages").finalizeData(m, "language") n7ptvgbjsx7mexi60i8l6fpi3knkhmi Modul:kanjitab 828 9911 281363 276293 2026-04-22T06:49:27Z Hakimi97 2668 281363 Scribunto text/plain local export = {} local m_str_utils = 
require("Module:string utilities") local m_utilities = require("Module:utilities") local m_ja = require("Module:ja") local show_labels = require("Module:labels").show_labels --[=[ Other modules used: [[Module:parameters]] ]=] local concat = table.concat local convert_iteration_marks = require("Module:Hani").convert_iteration_marks local find = string.find local gsplit = m_str_utils.gsplit local gsub = string.gsub local kata_to_hira = m_ja.kata_to_hira local insert = table.insert local match = string.match local remove = table.remove local split = m_str_utils.split local sub = string.sub local ugsub = mw.ustring.gsub local ulen = m_str_utils.len local umatch = mw.ustring.match local usub = m_str_utils.sub local PAGENAME = mw.loadData("Module:headword/data").pagename local NAMESPACE = mw.title.getCurrentTitle().nsText local d_range = mw.loadData("Module:ja/data/range") local yomi_data = mw.loadData("Module:kanjitab/data") local kanji_grade_links = { "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 1]]", "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 2]]", "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 3]]", "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 4]]", "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 5]]", "[[Lampiran:Glosari_bahasa_Jepun#kyōiku_kanji|Gred: 6]]", "[[Lampiran:Glosari_bahasa_Jepun#jōyō_kanji|Gred: S]]", -- 7 "[[Lampiran:Glosari_bahasa_Jepun#jinmeiyō_kanji|Jinmeiyō]]", -- 8 "[[Lampiran:Glosari_bahasa_Jepun#hyōgaiji|Hyōgaiji]]" -- 9 } -- this is the function that is called from templates function export.show(frame) local args = require("Module:parameters").process(frame:getParent().args, { [1] = { list = true, allow_holes = true }, k = { list = true, allow_holes = true }, o = { list = true, allow_holes = true }, r = {}, sort = {}, yomi = {}, ateji = {}, alt = {}, alt2 = {}, kyu = { list = true }, y = {alias_of = "yomi"}, clearright = {type = "boolean"}, pagename = {}, }) local lang_code = frame.args[1] local lang = 
require("Module:languages").getByCode(lang_code) local lang_name = lang:getCanonicalName() if args.pagename and NAMESPACE == "" then require("Module:debug/track")("kanjitab/pagename param in mainspace") end local pagename = args.pagename or PAGENAME local categories = {} local cells = {} -- extract kanji and non-kanji local kanji = {} local non_kanji = {} -- 々 and 〻 pagename = convert_iteration_marks(pagename) local kanji_border = 1 ugsub(pagename, "()([" .. d_range.kanji .. "々〻])()", function(p1, w1, p2) insert(non_kanji, usub(pagename, kanji_border, p1 - 1)) kanji_border = p2 insert(kanji, w1) end) insert(non_kanji, usub(pagename, kanji_border)) -- kyujitai local kyu = args.kyu if kyu[1] == "-" then kyu = {} elseif kyu[1] == nil then local form_kyu = {non_kanji[1]} local kyu_data = mw.loadData("Module:ja/data/kyu") local has_kyu, has_kyu_nonsupple, has_shin = false, false, false for i, v in ipairs(kanji) do local v_kyu = match(kyu_data[1], v .. "(%S*)%s") if v_kyu == nil then insert(form_kyu, v) elseif v_kyu == "" then has_shin = true break elseif v_kyu:sub(1, 1) == "&" then has_kyu = true insert(form_kyu, v_kyu) else has_kyu, has_kyu_nonsupple = true, true insert(form_kyu, v_kyu) end insert(form_kyu, non_kanji[i + 1]) end if not has_shin and has_kyu then kyu[1] = (has_kyu_nonsupple and "" or pagename .. "|") .. concat(form_kyu) end if find(pagename, "弁") then require("Module:debug/track")("kanjitab/ambiguous kyujitai for 弁") kyu[1] = "which 弁?" end end local all_yomi, missing_yomi if args.yomi then all_yomi = {} local keys = split(args.yomi, ",") for i, yomi, len in ipairs(keys) do yomi, len = match(yomi, "^(%l*)(%d*)$") yomi = yomi_data[yomi] or error("The yomi type \"" .. yomi .. "\" in the input \"" .. args.yomi .. "\" is not recognized.") if len ~= "" then -- Disallow length 0 or leading zeroes, as a sanity check. len = match(len, "^[1-9]%d*$") and tonumber(len) or error("Cannot specify a length of " .. len .. 
" kanji.") -- Only one yomi with no length given: apply to all kanji. elseif i == 1 and #keys == 1 then len = #kanji else len = 1 end local yomi_type = yomi.type -- If the on'yomi is not specified as goon/kanon/toon/soon, only "on". if yomi_type == "on'yomi" then require("Module:debug/track")("kanjitab/unspecified on") elseif yomi_type == "jūbakoyomi" then require("Module:debug/track")("kanjitab/jubakoyomi") elseif yomi_type == "yutōyomi" then require("Module:debug/track")("kanjitab/yutoyomi") end -- If the yomi requires a specific number of kanji (e.g. jūbakoyomi, yutōyomi). local req_kanji = yomi.required_kanji if req_kanji and #kanji ~= req_kanji then error("The yomi type \"" .. yomi.type .. "\" is only applicable to terms with " .. req_kanji .. " kanji.") elseif yomi.type == "none" then missing_yomi = true end -- Insert yomi data for each applicable kanji. Wrap in a table first, as the range for this input yomi is determined by its identity, so that (e.g.) "kun,kun" is still treated as two separate inputs. yomi = {data = yomi} for _ = 1, len do insert(all_yomi, yomi) end end -- If there are any yomi slots left, handle them as empty. if #all_yomi < #kanji then missing_yomi = true for _ = #all_yomi + 1, #kanji do insert(all_yomi, {data = yomi_data.none}) end end elseif #kanji > 0 then missing_yomi = true end if missing_yomi then insert(categories, "Perkataan kehilangan yomi bahasa " .. lang_name ) end -- process readings local readings = {} local readings_actual = {} local reading_length_total = 0 for i = 1, args[1].maxindex do local reading_kana, reading_length = match(args[1][i] or "", "^(%D*)(%d*)$") reading_kana = reading_kana ~= "" and reading_kana or nil reading_length = reading_kana and tonumber(reading_length) or 1 insert(readings, {reading_kana, reading_length}) reading_length_total = reading_length_total + reading_length end if reading_length_total > #kanji then error("Readings for " .. reading_length_total .. 
" kanji are given, but this word has only " .. #kanji .. " kanji.") else for _ = reading_length_total + 1, #kanji do insert(readings, {nil, 1}) end end local table_head = [=[ {| class="wikitable kanji-table floatright" style="text-align: center; ]=] .. (args.clearright and " clear:right;" or "") .. [=[" ! ]=] .. (#kanji > 1 and "colspan=\"" .. #kanji .. "\" " or "") .. [=[style="font-weight: normal;" | [[Lampiran:Glosari_bahasa_Jepun#kanji|Kanji]] dalam kata ini |- lang="]=] .. lang_code .. [=[" class="Jpan" style="font-size: 2em; background: white; line-height: 1em;" ]=] if args.k.maxindex and args.k.maxindex > args[1].maxindex then error("kanjitab/too many k") end if args.o.maxindex and args.o.maxindex > args[1].maxindex then error("kanjitab/too many o") end local is_ateji = {} if args.ateji then local ateji = args.ateji local cat_ateji = false if ateji == "y" then for i = 1, #kanji do is_ateji[i] = true end cat_ateji = true else for i in gsplit(ateji, ";") do gsub(i, "^(%d+)$", function(a) is_ateji[tonumber(a)] = true cat_ateji = true end) gsub(i, "^(%d+),(%d+)$", function (a, b) for j = tonumber(a), tonumber(b) do is_ateji[j] = true end cat_ateji = true end) end end if cat_ateji then insert(categories, "Perkataan dieja dengan ateji bahasa " .. lang_name) end end -- if hiragana readings were passed, -- make the "spelled with ..." categories, the readings cells on the lower level and build the sort key -- otherwise rely on the pagename to make the original kanjitab and categories local cells_above = {} local cells_below = {} local kanji_pos = 1 for i, reading in ipairs(readings) do local reading_kana, reading_length = reading[1], reading[2] local cell = {} if reading_length <= 1 then insert(cell, "| rowspan=\"2\" | ") else insert(cell, "| colspan =\"" .. reading_length .. "\" | ") end -- display reading, actual reading and okurigana if reading_kana then if reading_kana ~= "" and reading_kana ~= "-" and umatch(reading_kana, "[^" .. d_range.kana .. 
"]") then error("Please remove any non-kana characters from the reading input " .. reading_kana .. ".") end local actual_reading = args.k[i] local okurigana = args.o[i] local okurigana_text = okurigana and "(" .. okurigana .. ")" or "" local actual_reading_text = actual_reading and " > " .. actual_reading .. okurigana_text or "" local text = reading_kana .. okurigana_text .. actual_reading_text readings_actual[i] = {(actual_reading or reading_kana) .. (okurigana or ""), reading_length} insert(cell, "<span class=\"Jpan\" lang=\"" .. lang_code .. "\">" .. text .. "</span>") if reading_length <= 1 then insert(cell, "<br/>") end else readings_actual[i] = {nil, 1} end -- display kanji grade, categorize for j = kanji_pos, kanji_pos + reading_length - 1 do local single_kanji = kanji[j] local kanji_grade = m_ja.kanji_grade(single_kanji) local ateji_text = is_ateji[j] and "<br/><small>([[Lampiran:Glosari bahasa Jepun#ateji|ateji]])</small>" or "" local type, compound if all_yomi then local yomi = all_yomi[j].data type, compound = yomi.type, yomi.compound_reading end if not reading_kana then if type ~= "irregular" then require("Module:debug/track")("kanjitab/no reading") end insert(categories, "Perkataan dieja dengan " .. single_kanji .. " bahasa " .. lang_name ) elseif reading_length ~= 1 or type == "tak teratur" then insert(categories, "Perkataan dieja dengan " .. single_kanji .. " bahasa " .. lang_name ) elseif compound then -- Re-enable once all bad jukujikun calls are fixed. -- error("The yomi type \"" .. type .. "\" is only applicable to compound character readings, so cannot apply to " .. single_kanji .. " read as " .. reading_kana .. ". If this is intended as part of a " .. type .. " reading, please enter the whole reading as one, followed by the number of kanji it applies to.") require("Module:debug/track")("kanjitab/single kanji with jukujikun") else -- Subcategorize by reading. insert(categories, "Perkataan dieja dengan " .. single_kanji .. " dibaca sebagai " .. 
kata_to_hira(reading_kana) .. " bahasa " .. lang_name ) end if reading_length <= 1 then insert(cell, "<small>" .. kanji_grade_links[kanji_grade] .. "</small>" .. ateji_text) else insert(cells_below, "| <small>" .. kanji_grade_links[kanji_grade] .. "</small>" .. ateji_text) end end insert(cells_above, concat(cell)) kanji_pos = kanji_pos + reading_length end insert(cells, "|- style=\"background: white;\"") if #cells_below > 0 then insert(cells, concat(cells_above, "\n")) insert(cells, "|- style=\"background: white;\"") insert(cells, concat(cells_below, "\n")) else for i, v in ipairs(cells_above) do cells_above[i] = gsub(v, "| rowspan=\"2\" | ", "| ") end insert(cells, concat(cells_above, "\n")) end local rendaku = args.r if rendaku then insert(categories, "Perkataan dengan rendaku bahasa " .. lang_name ) end if all_yomi then insert(cells, "|-") local len, all_on, yomi_cat = 1, true for i, yomi in ipairs(all_yomi) do -- If the next kanji has the same yomi table, it's part of the same range. if yomi == all_yomi[i + 1] then len = len + 1 else yomi = yomi.data local yomi_type = yomi.type local display = yomi.display or yomi_type local appendix = yomi.appendix insert(cells, "| colspan=\"" .. len .. "\" |" .. ( appendix == false and display or "[[Lampiran:Glosari_bahasa_Jepun#" .. (appendix or yomi_type) .. "|" .. display .. "]]" )) -- Categorise as irregular if any irregular yomi are found; otherwise, categorise if all yomi are of the same type. If yomi are of different types but are all on, on'yomi is used as a fallback. if yomi_cat ~= "irregular" then local cat_type = yomi_type if cat_type == "irregular" or yomi_cat == nil then yomi_cat = cat_type elseif yomi_cat ~= cat_type then yomi_cat = false end if not yomi.onyomi then all_on = false end end len = 1 end end if yomi_cat then -- Check yomi_data first, in case cat_type is "irregular"; if no match, must be some other type, so get it from the first yomi in all_yomi, since not all yomi types are yomi_data keys. 
yomi_cat = yomi_data[yomi_cat] or all_yomi[1].data elseif all_on then yomi_cat = yomi_data.on elseif #all_yomi == 2 then local y1, y2 = all_yomi[1].data, all_yomi[2].data if ulen(pagename) == 2 then if y1.onyomi and y2.type == "kun'yomi" then yomi_cat = yomi_data.j -- jūbakoyomi elseif y1.type == "kun'yomi" and y2.onyomi then yomi_cat = yomi_data.y -- yutōyomi end end end if yomi_cat then local category = yomi_cat.reading_category if category ~= false then insert(categories, "Perkataan dengan bacaan " .. (category or yomi_cat.type) .. " bahasa " .. lang_name ) end end end local kanji_table if #kanji > 0 then kanji_table = table_head for _, v in ipairs(kanji) do kanji_table = kanji_table .. "| style=\"padding: 0.5em;\" | [[" .. v .. "#" .. lang_name .. "|" .. v .. "]]\n" end kanji_table = kanji_table .. concat(cells, "\n") .. "\n|}" else kanji_table = "" end local forms_table = "" if args.alt == "" or args.alt == "-" then args.alt = nil end if kyu[1] or args.alt then local forms = {} -- |kyu= if kyu[1] == "which 弁?" then insert(forms, "<strong class=\"error\" style=\"font-size:75%;\">Sila tentukan kyujitai yang betul untuk 弁 dengan parameter \"kyu\".</strong>[[Kategori:Permohonan untuk pembersihan dalam masukan bahasa " .. lang_name .. "]]") remove(kyu, 1) end for _, form in ipairs(kyu) do local form_linkto, form_display = match(form, "^(.+)|(.+)$") if not form_linkto then form_linkto, form_display = form, form end insert(forms, concat{ "<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-family:游ゴシック, HanaMinA, sans-serif; font-size:140%;\">[[", form_linkto, form_linkto == pagename and "|" or "#" .. lang_name ..
"|", form_display, "]]</span> <small>", show_labels {labels = {"kyūjitai"}, lang = lang, nocat = true }, "</small>", }) end -- |alt= if args.alt then for form in gsplit(args.alt, ",") do local i_semicolon = find(form, ":") if i_semicolon then local altform = sub(form, 1, i_semicolon - 1) local altlabels = split(sub(form, i_semicolon + 1), " ") insert(forms, concat{ "<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-size:140%\">[[", altform, "#" .. lang_name .. "|", altform, "]]</span> <small>", show_labels { labels = altlabels, lang = lang, nocat = true }, "</small>", }) else insert(forms, concat{ "<span class=\"Jpan\" lang=\"" .. lang_code .. "\" style=\"font-size:140%\">[[", form, "#" .. lang_name .. "|", form, "]]</span>" }) end end end forms_table = "\n" .. [[{| class="wikitable floatright" ! style="font-weight:normal" | Ejaan alternatif]] .. (#forms == 1 and "" or "s") .. [[ |- | style="text-align:center;font-size:108%" | ]] .. concat(forms, "<br>") .. "\n|}" end local forms_table2 = "" if args.alt2 and args.alt2 ~= "" and args.alt2 ~= "-" then local forms2 = {} for form in gsplit(args.alt2, ",") do insert(forms2, "<span class=\"Jpan\" lang=\"" .. lang_code .. "\">[[" .. form .. "#" .. lang_name .. "|" .. form .. "]]</span>") end forms_table2 = "\n" .. [[{| class="wikitable floatright" ! style="font-weight:normal" | Bentuk varian]] .. (#forms2 == 1 and "" or "s") .. "\n" .. [[ | style="text-align:center;font-size:140%" | ]] .. concat(forms2, "<br>") .. "\n|}" end -- use user-provided sortkey if we got one, otherwise -- use the sortkey we've already made by combining the -- readings if provided, if we have neither then -- default to empty string and don't sort local sortkey if args.sort then sortkey = args.sort else sortkey = {non_kanji[1]} local id = 1 for _, v in ipairs(readings_actual) do id = id + v[2] insert(sortkey, (v[1] or "") .. 
(non_kanji[id] or "")) end sortkey = concat(sortkey) end if sortkey == "" then sortkey = nil else sortkey = lang:makeSortKey(sortkey) end if sortkey ~= lang:makeSortKey(PAGENAME) then require("Module:debug/track"){"kanjitab/nonstandard sortkey", "kanjitab/nonstandard sortkey/" .. lang_code} end return kanji_table .. forms_table .. forms_table2 .. m_utilities.format_categories(categories, lang, sortkey) end return export d7dyk2wp790fv7kaubmbw5u1s63ttw2 نام 0 10331 281300 268196 2026-04-21T15:41:05Z Hakimi97 2668 /* Keturunan */ 281300 wikitext text/x-wiki {{Pautan Projek Wikimedia}} == Bahasa Melayu == === Takrifan === {{ms-kn}} # [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain.) #: ''Dia mencatat nama orang yang menderma di dalam senarai itu.'' # {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]]. #: ''Kerana jasanya yang besar, Dia diberi nama Datuk.'' # kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]]. #: ''Dia melakukan semua itu semata-mata untuk mendapat nama.'' === Sebutan === * {{dewan|nã|mã}} * {{rhymes|ms|ma|a}} * {{penyempangan|ms|na|ma}} === Tulisan Rumi === [[nama]] === Terbitan === * {{l|ms|برنام}} * {{l|ms|ترنام}} * {{l|ms|دناماکن}} * {{ARchar|دناما<sup>ء</sup>ي}} * {{l|ms|ڤنام}} * {{ARchar|ڤناما<sup>ء</sup>ن}} * {{l|ms|مناماکن}} * {{ARchar|مناما<sup>ء</sup>ي}} === Kata majmuk === * {{l|ms|براوليه نام}} * {{l|ms|مڠمبيل نام}} * {{l|ms|منچاري نام}} * {{l|ms|منداڤت نام}} * {{l|ms|نام باتڠ توبوه}} * {{ARchar|نام با<sup>ء</sup>يق}} * {{l|ms|نام بندا}} * {{l|ms|نام تيمڠن}} * {{l|ms|نام جولوقن}} * {{l|ms|نام چمبوڠ}} * {{l|ms|نام خاص}} * {{l|ms|نام داݢيڠ}} * {{l|ms|نام سامرن}} * {{l|ms|نام عام}} * {{l|ms|نام ڤيديڠن}} * {{l|ms|نام ڤينا}} * {{l|ms|نام کچيل}} * {{l|ms|نام کلوارݢ}} === Keturunan === {{atas2}} * {{etyl|ms|abs|nama}} {{tengah2}} * {{etyl|ms|id|nama}} {{ter-bawah}} === Tesaurus === ; Sinonim: {{l|ms|اسم}} [[Kategori:Tulisan Jawi]] ==Bahasa Arab== ==== Kata kerja ==== {{lang|ar|نَامَ}} • (nāma) ''I, tidak 
lampau'' {{l|ms|ينام|يَنَامُ}} # untuk [[tidur]] # untuk pergi ke [[katil]] # untuk pergi ke [[tidur]] # untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]] # untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]] # untuk menjadi [[kebas]] # untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]] # untuk [[melupakan]] # untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]] # untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam === Etimologi === Daripada akar {{ar-akar|ن|و|م}} === Lihat juga === * {{l|ar|نوام}} * {{l|ar|نوم}} * {{l|ar|نومي}} * {{l|ar|نومة}} * {{l|ar|نؤوم}} == Bahasa Baluchi == ==== Kata nama ==== {{head|bal|Kata nama|tr=nám}} # [[nama]] # [[reputasi]] # [[designasi]] === Etimologi === Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}. ==Bahasa Parsi== {{wikipedia|lang=fa|sc=fa-Arab}} ==== Kata nama ==== {{fa-regional|نام|نام|ном}} {{fa-kn|tr=nām||pl=نام ها|pltr=nām-hā}} # [[nama]] # [[reputasi]] === Etimologi === Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}). 
=== Sebutan === * {{a|IR}} {{AFA|fa|[nɒːm]}} * {{audio|fa|Fa-نام.ogg|audio}} === Fleksi === {{fa-decl-c|nâm|poss=+}} === Terbitan === * {{l|fa|sc=fa-Arab|نامی|tr=nâmi}} * {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}} * {{l|fa|sc=fa-Arab|بنام|tr=benâm}} === Lihat juga === * {{l|fa|sc=fa-Arab|اسم|tr=esm}} == Bahasa Urdu == ==== Kata nama ==== {{ur-kn|g=m|tr=nām|hi=नाम}} # [[nama]] === Etimologi === Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}. === Deklensi === {{ur-noun-c-c|نام|nām}} oegn7sp9uo704mlzu5ketjruq0vx31a 281301 281300 2026-04-21T15:41:28Z Hakimi97 2668 /* Etimologi */ 281301 wikitext text/x-wiki {{Pautan Projek Wikimedia}} == Bahasa Melayu == === Takrifan === {{ms-kn}} # [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain.) #: ''Dia mencatat nama orang yang menderma di dalam senarai itu.'' # {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]]. #: ''Kerana jasanya yang besar, Dia diberi nama Datuk.'' # kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]]. 
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.'' === Sebutan === * {{dewan|nã|mã}} * {{rhymes|ms|ma|a}} * {{penyempangan|ms|na|ma}} === Tulisan Rumi === [[nama]] === Terbitan === * {{l|ms|برنام}} * {{l|ms|ترنام}} * {{l|ms|دناماکن}} * {{ARchar|دناما<sup>ء</sup>ي}} * {{l|ms|ڤنام}} * {{ARchar|ڤناما<sup>ء</sup>ن}} * {{l|ms|مناماکن}} * {{ARchar|مناما<sup>ء</sup>ي}} === Kata majmuk === * {{l|ms|براوليه نام}} * {{l|ms|مڠمبيل نام}} * {{l|ms|منچاري نام}} * {{l|ms|منداڤت نام}} * {{l|ms|نام باتڠ توبوه}} * {{ARchar|نام با<sup>ء</sup>يق}} * {{l|ms|نام بندا}} * {{l|ms|نام تيمڠن}} * {{l|ms|نام جولوقن}} * {{l|ms|نام چمبوڠ}} * {{l|ms|نام خاص}} * {{l|ms|نام داݢيڠ}} * {{l|ms|نام سامرن}} * {{l|ms|نام عام}} * {{l|ms|نام ڤيديڠن}} * {{l|ms|نام ڤينا}} * {{l|ms|نام کچيل}} * {{l|ms|نام کلوارݢ}} === Keturunan === {{atas2}} * {{etyl|ms|abs|nama}} {{tengah2}} * {{etyl|ms|id|nama}} {{ter-bawah}} === Tesaurus === ; Sinonim: {{l|ms|اسم}} [[Kategori:Tulisan Jawi]] ==Bahasa Arab== ==== Kata kerja ==== {{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ms|ينام|يَنَامُ}} # untuk [[tidur]] # untuk pergi ke [[katil]] # untuk pergi ke [[tidur]] # untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]] # untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]] # untuk menjadi [[kebas]] # untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]] # untuk [[melupakan]] # untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]] # untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam === Etimologi === Daripada akar {{ar-akar|ن و م}} === Lihat juga === * {{l|ar|نوام}} * {{l|ar|نوم}} * {{l|ar|نومي}} * {{l|ar|نومة}} * {{l|ar|نؤوم}} == Bahasa Baluchi == ==== Kata nama ==== {{head|bal|Kata nama|tr=nám}} # [[nama]] # [[reputasi]] # [[designasi]] === Etimologi === Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}. 
==Bahasa Parsi== {{wikipedia|lang=fa|sc=fa-Arab}} ==== Kata nama ==== {{fa-regional|نام|نام|ном}} {{fa-kn|tr=nām||pl=نام ها|pltr=nām-hā}} # [[nama]] # [[reputasi]] === Etimologi === Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}). === Sebutan === * {{a|IR}} {{AFA|fa|[nɒːm]}} * {{audio|fa|Fa-نام.ogg|audio}} === Fleksi === {{fa-decl-c|nâm|poss=+}} === Terbitan === * {{l|fa|sc=fa-Arab|نامی|tr=nâmi}} * {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}} * {{l|fa|sc=fa-Arab|بنام|tr=benâm}} === Lihat juga === * {{l|fa|sc=fa-Arab|اسم|tr=esm}} == Bahasa Urdu == ==== Kata nama ==== {{ur-kn|g=m|tr=nām|hi=नाम}} # [[nama]] === Etimologi === Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}. === Deklensi === {{ur-noun-c-c|نام|nām}} 9y13su2738dl8zy9o01tolrc9kzopow 281302 281301 2026-04-21T15:42:01Z Hakimi97 2668 /* Kata nama */ 281302 wikitext text/x-wiki {{Pautan Projek Wikimedia}} == Bahasa Melayu == === Takrifan === {{ms-kn}} # [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain.) #: ''Dia mencatat nama orang yang menderma di dalam senarai itu.'' # {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]]. #: ''Kerana jasanya yang besar, Dia diberi nama Datuk.'' # kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]]. 
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.'' === Sebutan === * {{dewan|nã|mã}} * {{rhymes|ms|ma|a}} * {{penyempangan|ms|na|ma}} === Tulisan Rumi === [[nama]] === Terbitan === * {{l|ms|برنام}} * {{l|ms|ترنام}} * {{l|ms|دناماکن}} * {{ARchar|دناما<sup>ء</sup>ي}} * {{l|ms|ڤنام}} * {{ARchar|ڤناما<sup>ء</sup>ن}} * {{l|ms|مناماکن}} * {{ARchar|مناما<sup>ء</sup>ي}} === Kata majmuk === * {{l|ms|براوليه نام}} * {{l|ms|مڠمبيل نام}} * {{l|ms|منچاري نام}} * {{l|ms|منداڤت نام}} * {{l|ms|نام باتڠ توبوه}} * {{ARchar|نام با<sup>ء</sup>يق}} * {{l|ms|نام بندا}} * {{l|ms|نام تيمڠن}} * {{l|ms|نام جولوقن}} * {{l|ms|نام چمبوڠ}} * {{l|ms|نام خاص}} * {{l|ms|نام داݢيڠ}} * {{l|ms|نام سامرن}} * {{l|ms|نام عام}} * {{l|ms|نام ڤيديڠن}} * {{l|ms|نام ڤينا}} * {{l|ms|نام کچيل}} * {{l|ms|نام کلوارݢ}} === Keturunan === {{atas2}} * {{etyl|ms|abs|nama}} {{tengah2}} * {{etyl|ms|id|nama}} {{ter-bawah}} === Tesaurus === ; Sinonim: {{l|ms|اسم}} [[Kategori:Tulisan Jawi]] ==Bahasa Arab== ==== Kata kerja ==== {{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ms|ينام|يَنَامُ}} # untuk [[tidur]] # untuk pergi ke [[katil]] # untuk pergi ke [[tidur]] # untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]] # untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]] # untuk menjadi [[kebas]] # untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]] # untuk [[melupakan]] # untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]] # untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam === Etimologi === Daripada akar {{ar-akar|ن و م}} === Lihat juga === * {{l|ar|نوام}} * {{l|ar|نوم}} * {{l|ar|نومي}} * {{l|ar|نومة}} * {{l|ar|نؤوم}} == Bahasa Baluchi == ==== Kata nama ==== {{head|bal|Kata nama|tr=nám}} # [[nama]] # [[reputasi]] # [[designasi]] === Etimologi === Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}. 
==Bahasa Parsi== {{wikipedia|lang=fa|sc=fa-Arab}} ==== Kata nama ==== {{fa-regional|نام|نام|ном}} {{fa-kn|tr=nām||pl=نام ها|tr=nām-hā}} # [[nama]] # [[reputasi]] === Etimologi === Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}). === Sebutan === * {{a|IR}} {{AFA|fa|[nɒːm]}} * {{audio|fa|Fa-نام.ogg|audio}} === Fleksi === {{fa-decl-c|nâm|poss=+}} === Terbitan === * {{l|fa|sc=fa-Arab|نامی|tr=nâmi}} * {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}} * {{l|fa|sc=fa-Arab|بنام|tr=benâm}} === Lihat juga === * {{l|fa|sc=fa-Arab|اسم|tr=esm}} == Bahasa Urdu == ==== Kata nama ==== {{ur-kn|g=m|tr=nām|hi=नाम}} # [[nama]] === Etimologi === Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}. === Deklensi === {{ur-noun-c-c|نام|nām}} 861wup5q0ygbjyj08z6gf9zddv7yocj 281303 281302 2026-04-21T15:42:28Z Hakimi97 2668 /* Kata nama */ 281303 wikitext text/x-wiki {{Pautan Projek Wikimedia}} == Bahasa Melayu == === Takrifan === {{ms-kn}} # [[panggil]]an atau sebutan bagi orang (barang, tempat, pertubuhan, dan lain-lain.) #: ''Dia mencatat nama orang yang menderma di dalam senarai itu.'' # {{konteks|arkaik|lang=ms}} [[gelaran]], [[sebutan]]. #: ''Kerana jasanya yang besar, Dia diberi nama Datuk.'' # kehormatan, kebaikan, kemasyhuran, [[maruah]], [[pujian]]. 
#: ''Dia melakukan semua itu semata-mata untuk mendapat nama.'' === Sebutan === * {{dewan|nã|mã}} * {{rhymes|ms|ma|a}} * {{penyempangan|ms|na|ma}} === Tulisan Rumi === [[nama]] === Terbitan === * {{l|ms|برنام}} * {{l|ms|ترنام}} * {{l|ms|دناماکن}} * {{ARchar|دناما<sup>ء</sup>ي}} * {{l|ms|ڤنام}} * {{ARchar|ڤناما<sup>ء</sup>ن}} * {{l|ms|مناماکن}} * {{ARchar|مناما<sup>ء</sup>ي}} === Kata majmuk === * {{l|ms|براوليه نام}} * {{l|ms|مڠمبيل نام}} * {{l|ms|منچاري نام}} * {{l|ms|منداڤت نام}} * {{l|ms|نام باتڠ توبوه}} * {{ARchar|نام با<sup>ء</sup>يق}} * {{l|ms|نام بندا}} * {{l|ms|نام تيمڠن}} * {{l|ms|نام جولوقن}} * {{l|ms|نام چمبوڠ}} * {{l|ms|نام خاص}} * {{l|ms|نام داݢيڠ}} * {{l|ms|نام سامرن}} * {{l|ms|نام عام}} * {{l|ms|نام ڤيديڠن}} * {{l|ms|نام ڤينا}} * {{l|ms|نام کچيل}} * {{l|ms|نام کلوارݢ}} === Keturunan === {{atas2}} * {{etyl|ms|abs|nama}} {{tengah2}} * {{etyl|ms|id|nama}} {{ter-bawah}} === Tesaurus === ; Sinonim: {{l|ms|اسم}} [[Kategori:Tulisan Jawi]] ==Bahasa Arab== ==== Kata kerja ==== {{lang|ar|نَامَ}} • (nāma) ''I, tidak lampau'' {{l|ms|ينام|يَنَامُ}} # untuk [[tidur]] # untuk pergi ke [[katil]] # untuk pergi ke [[tidur]] # untuk [[mereda]], untuk [[menyurut]], untuk [[berkurang]], untuk [[menenangkan]] # untuk menjadi [[tidak]] [[aktif]], untuk menjadi [[lesu]] # untuk menjadi [[kebas]] # untuk [[mengabaikan]], untuk [[meninggalkan]], untuk dapat [[melihat]] # untuk [[melupakan]] # untuk menjadi [[tenang]], untuk [[menerima]], untuk [[bersetuju]], untuk [[menyetujui]] # untuk [[mempercayai]], untuk mempunyai [[keyakinan]] dalam === Etimologi === Daripada akar {{ar-akar|ن و م}} === Lihat juga === * {{l|ar|نوام}} * {{l|ar|نوم}} * {{l|ar|نومي}} * {{l|ar|نومة}} * {{l|ar|نؤوم}} == Bahasa Baluchi == ==== Kata nama ==== {{head|bal|Kata nama|tr=nám}} # [[nama]] # [[reputasi]] # [[designasi]] === Etimologi === Daripada {{inh|bal|ira-pro|*Hnā́ma}}, daripada {{inh|bal|iir-pro|*Hnā́ma}}, daripada {{inh|bal|ine-pro|*h₁nómn̥}}. 
==Bahasa Parsi== {{wikipedia|lang=fa|sc=fa-Arab}} ==== Kata nama ==== {{fa-regional|نام|نام|ном}} {{fa-kn|tr=nām||pl=نام ها|tr=nām-hā}} # [[nama]] # [[reputasi]] === Etimologi === Daripada {{inh|fa|pal|ŠM|ts=nām}}, daripada {{inh|fa|peo|𐎴𐎠𐎶|ts=nāma}}, daripada {{inh|fa|ira-pro|*Hnā́ma}} (banding dengan {{cog|kmr|nav}}, {{cog|ps|نوم|tr=nūm}}, {{cog|ae|𐬥𐬁𐬨𐬀𐬥|𐬥𐬁𐬨𐬀𐬥-}}) daripada {{inh|fa|iir-pro|*Hnā́ma}} (banding dengan {{cog|el|όνομα}}, {{cog|it|nome}}, Tocharia A {{cog|xto|ñom}}, Armenia {{term|hy|անուն}}, dan Inggeris {{cog|en|name}}). === Sebutan === * {{a|IR}} {{AFA|fa|[nɒːm]}} * {{audio|fa|Fa-نام.ogg|audio}} === Fleksi === {{fa-decl-c|nâm|poss=+}} === Terbitan === * {{l|fa|sc=fa-Arab|نامی|tr=nâmi}} * {{l|fa|sc=fa-Arab|نامیدن|tr=nâmidan}} * {{l|fa|sc=fa-Arab|بنام|tr=benâm}} === Lihat juga === * {{l|fa|sc=fa-Arab|اسم|tr=esm}} == Bahasa Urdu == ==== Kata nama ==== {{ur-kn|tr=nām|hi=नाम}} # [[nama]] === Etimologi === Daripada {{inh+|ur|pra-sau|𑀡𑀸𑀫}}, daripada {{inh|ur|sa|नामन्|tr=nā́man}}, daripada {{inh|ur|inc-pro|*Hnā́ma}}, daripada {{inh|ur|iir-pro|*Hnā́ma}} (banding dengan {{cog|fa|نام|tr=nâm}}), daripada {{inh|ur|ine-pro|*h₁nómn̥||nama}}. Seasal dengan {{cog|pa|ناں}} dan {{cog|en|name}}. 
=== Deklensi === {{ur-noun-c-c|نام|nām}} n67acfapspdl6uaede33p2rrghv4l0a Templat:ar-akar 10 10334 281295 111416 2026-04-21T15:24:51Z Hakimi97 2668 281295 wikitext text/x-wiki {{#invoke:sem-arb-utilities|root|lang=ar|plain=true}}<noinclude>{{documentation}}</noinclude> iikovjqcrd38omyf9t2xqjiby9j55kt Kategori:Bahasa Turki Usmaniyah 14 11029 281334 186546 2026-04-22T00:41:46Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Bahasa Turki Uthmaniyah]] ke [[Kategori:Bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 186546 wikitext text/x-wiki {{auto cat|Turki|extinct=1|setwikt=-}} 8inbrr93k67kuywhlw83a0joemasnmh daun 0 11204 281246 242779 2026-04-21T13:32:01Z Countryball mys123 9925 /* Bahasa Melayu */Tambah gambar 281246 wikitext text/x-wiki == Bahasa Melayu == === Takrifan === [[File:Lisc lipy.jpg|210px|thumb|Sehelai daun]] {{ms-kn|j=داءون}} # Suatu [[helaian]] [[hidup]] pada [[tumbuhan]] yang bertanggungjawab memperoleh [[cahaya matahari]] untuk memperoleh tenaga bagi tumbuhan. # Kepingan benda nipis. === Etimologi === Daripada {{inh|ms|poz-mly-pro|*daun}}, daripada {{inh|ms|poz-mcm-pro|*daun}}, daripada {{inh|ms|poz-msa-pro|*daun}}, daripada {{inh|ms|poz-pro|*dahun}}. === Sebutan === * {{dewan|daun}} * {{a|Johor-Selangor}} {{IPA|ms|/daon/}} * {{a|Riau-Lingga}} {{IPA|ms|/daʊn/}} * {{rima|ms|aon|on}} * {{audio|ms|Ms-MY-daun.ogg|Audio (MY)}} ===Tesaurus=== ====Sinonim==== {{sinonim dialek|ms}} === Rujukan === * {{R:KD4}} === Pautan luar === * {{R:PRPM}} {{C|ms|Botani}} ==Bahasa Iban== ===Takrifan=== ====Kata nama==== {{inti|iba|kata nama}} # daun ===Sebutan=== * {{AFA|iba|/daun/}} ===Rujukan=== {{R:KIMD2}} ==Bahasa Indonesia== Lihat takrifan Bahasa Melayu ==Bahasa Melanau Tengah== ===Takrifan=== ====Kata nama==== {{inti|mel|kata nama}} # daun ===Etimologi=== {{inh+|mel|poz-swa-pro|*dahun}}, daripada {{inh|mel|poz-pro|*dahun}}. 
===Sebutan===
* {{AFA|mel|/daun/}}

===Rujukan===
{{R:KMMD}}
dvmvjm3jnqsmxjdkfmw133jl6j60loh
Modul:category tree/topic/Communication 828 11523 281414 246758 2026-04-22T08:17:04Z PeaceSeekers 3334 281414 Scribunto text/plain
local labels = {}

local unpack = unpack or table.unpack -- Lua 5.2 compatibility

-- FIXME: Lookup langs in the language list.
for _, lang_etc in ipairs {
	"Arab",
	{"Cina", "Bahasa-bahasa Cina"},
	"Inggeris",
	"Jerman",
	"Jepun",
	"Okinawa",
	"Portugis",
	"Sepanyol",
	"Vietnam",
	{"Melayu", "Bahasa-bahasa Melayik"},
} do
	if type(lang_etc) ~= "table" then
		lang_etc = {lang_etc}
	end
	local lang, desc = unpack(lang_etc)
	desc = desc or ("[[:Kategori:Bahasa %s|bahasa %s]]"):format(lang, lang)
	labels["Bahasa " .. lang] = {
		type = "berkenaan",
		description = "=" .. desc,
		parents = {"bahasa-bahasa"},
	}
end

labels["komunikasi"] = {
	type = "berkenaan",
	description = "default",
	parents = {"Semua topik"},
}

labels["huruf"] = {
	type = "nama",
	description = "default",
	parents = {"sistem tulisan"},
}

labels["bahasa buatan"] = {
	-- distinguish from "cat:constructed languages" family category
	type = "nama",
	description = "={{w|constructed language}}s",
	parents = {"bahasa-bahasa"},
}

labels["bahasa badan"] = {
	type = "berkenaan",
	description = "default",
	parents = {"bahasa", "nonverbal communication"},
}

labels["penyiaran"] = {
	type = "berkenaan",
	description = "default",
	parents = {"media", "telekomunikasi"},
}

labels["Komponen aksara Cina"] = {
	type = "set",
	description = "=[[komponen|Komponen]] [[aksara]] [[Cina]].",
	parents = {"Huruf, simbol dan tanda baca"},
}

labels["diacritical marks"] = {
	type = "set",
	description = "default",
	parents = {"Huruf, simbol dan tanda baca"},
}

labels["dialects"] = {
	type = "set",
	description = "default",
	parents = {"bahasa"},
}

labels["dictation"] = {
	type = "berkenaan",
	description = "default",
	parents = {"komunikasi"},
}

labels["bahasa pupus"] = {
	type = "nama",
	description = "default",
	parents = {"bahasa-bahasa"},
}

labels["bahasa
isyarat"] = { type = "nama", description = "default", parents = {"bahasa-bahasa"}, } labels["facial expressions"] = { type = "set", description = "default", parents = {"nonverbal communication", "face"}, } labels["kiasan"] = { type = "set", description = "=[[figure of speech|figures of speech]]", parents = {"retorik"}, } labels["bendera"] = { type = "berkenaan,name,type", description = "default", parents = {"komunikasi"}, } labels["jargon"] = { type = "berkenaan", description = "default", parents = {"bahasa"}, } labels["aksara Han"] = { type = "berkenaan", description = "default", parents = {"sistem tulisan"}, } labels["bahasa"] = { type = "berkenaan", description = "default", parents = {"komunikasi"}, } labels["keluarga bahasa"] = { type = "nama", description = "Topik berkenaan [[keluarga bahasa]], termasuklah yang diterima dan yang bersifat kontroversi.", parents = {"bahasa", "nama"}, } labels["bahasa-bahasa"] = { type = "nama", description = "default", parents = {"bahasa", "nama"}, } labels["Huruf, simbol dan tanda baca"] = { type = "set", description = "=[[letter]]s, [[symbol]]s, and [[punctuation]]", parents = {"Ortografi"}, } labels["logical fallacies"] = { type = "set", description = "=[[logical fallacy|logical fallacies]], clearly defined errors in reasoning used to support or refute an argument", additional = "{{also|Kategori:{{{langcode}}}:biases}}", parents = {"retorik", "logic"}, } labels["media"] = { type = "berkenaan", description = "default", parents = {"komunikasi"}, } labels["telefon bimbit"] = { type = "berkenaan,set", description = "default", parents = {"telefoni"}, } labels["nonverbal communication"] = { type = "berkenaan", description = "default", parents = {"komunikasi"}, } labels["ortografi"] = { type = "berkenaan", description = "default", parents = {"penulisan"}, } labels["palaeography"] = { type = "berkenaan", description = "default", parents = {"penulisan"}, } labels["pos"] = { type = "berkenaan", description = "=[[post#Noun|post]] or 
[[mail#Noun|mail]]", parents = {"komunikasi"}, } labels["postal abbreviations"] = { type = "nama", description = "default", parents = {"pos"}, } labels["public relations"] = { type = "berkenaan", description = "default no singularize", parents = {"komunikasi"}, } labels["tanda baca"] = { type = "set", description = "default", parents = {"Huruf, simbol dan tanda baca"}, } labels["radio"] = { type = "berkenaan", description = "default", parents = {"telekomunikasi"}, } labels["retorik"] = { type = "berkenaan", description = "default", parents = {"bahasa"}, } labels["signs"] = { type = "berkenaan,name,type", description = "default", parents = {"komunikasi"}, } labels["sociolects"] = { type = "nama", description = "default", parents = {"bahasa"}, } labels["simbol"] = { type = "set", description = "=[[symbol]]s, especially [[mathematical]] and [[scientific]] symbols", additional = "Most symbols have equivalent meanings in many languages and can therefore be found in [[:Category:Translingual symbols]].", parents = {"Huruf, simbol dan tanda baca"}, } labels["talking"] = { type = "berkenaan", description = "default", parents = {"bahasa", "tingkah laku manusia"}, } labels["telekomunikasi"] = { type = "berkenaan", description = "default no singularize", parents = {"komunikasi", "teknologi"}, } labels["telegraphy"] = { type = "berkenaan", description = "default", parents = {"telekomunikasi", "elektronik"}, wpcat = true, commonscat = true, } labels["telefoni"] = { type = "berkenaan", description = "default", parents = {"telekomunikasi", "elektronik"}, } labels["texting"] = { type = "berkenaan", description = "default", parents = {"telekomunikasi"}, } labels["textual division"] = { type = "berkenaan", description = "default", parents = {"penulisan"}, } labels["tipografi"] = { type = "berkenaan", description = "default", parents = {"penulisan", "percetakan"}, } labels["penulisan"] = { type = "berkenaan", description = "default", parents = {"bahasa", "tingkah laku manusia"}, } 
labels["sistem tulisan"] = { type = "set", description = "default", parents = {"penulisan"}, } return labels 33yh5uf9t2ik66l1e99302swjc3kj3a Modul:category tree/topic/Culture 828 11524 281339 281227 2026-04-22T01:08:46Z PeaceSeekers 3334 281339 Scribunto text/plain local labels = {} labels["budaya"] = { type = "berkenaan", description = "default", parents = {"masyarakat"}, } labels["A Christmas Carol"] = { type = "berkenaan", wikidata = 62879, displaytitle = "''A Christmas Carol''", description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.", parents = {"British fiction", "Charles Dickens"}, } labels["A Song of Ice and Fire"] = { type = "berkenaan", wikidata = 45875, displaytitle = "''A Song of Ice and Fire''", description = "{{{langname}}} terms used in context of the ''{{w|Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.", parents = {"cereka Amerika", "fantasy", "kesusasteraan"}, } labels["lakonan"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["alternate history"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "history"}, } labels["cereka Amerika"] = { type = "berkenaan", description = "=works of American fiction", parents = {"cereka", "Amerika Syarikat"}, } labels["animasi"] = { type = "berkenaan", description = "default", parents = {"media massa"}, } labels["Arabic fiction"] = { type = "berkenaan", description = "=works of [[fiction]] of [[Arabic]] origin", parents = {"cereka"}, } labels["Arabian deities"] = { type = "nama", description = "default", parents = {"gods", "Arabian mythology"}, } labels["Arabian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["Armenian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Armenia"}, } labels["seni"] = { type = 
"berkenaan", description = "default", parents = {"budaya"}, } labels["Arthurian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "United Kingdom"}, } labels["artistic works"] = { type = "name,jenis", description = "default", parents = {"seni"}, } labels["astrobiology"] = { type = "berkenaan", description = "default", parents = {"astronomy", "biology", "geology"}, } labels["astrologi"] = { type = "berkenaan", description = "default", parents = {"penilikan", "pseudosains", "obsolete scientific theories"}, } labels["Asturian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Asturias, Spain"}, } labels["Avatar: The Last Airbender"] = { type = "berkenaan", wikidata = 11572, displaytitle = "''Avatar: The Last Airbender''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Australian Aboriginal mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Australia"}, } labels["ballet"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["Barbie"] = { type = "berkenaan", wikidata = 167447, description = "=the {{w|Barbie}} fashion doll produced by Mattel", parents = {"toys"}, } labels["Batman"] = { type = "berkenaan", wikidata = 2695156, description = "=the fictional [[superhero]] [[Batman]]", parents = {"DC Comics", "watak cereka"}, } labels["bibliography"] = { type = "berkenaan", description = "default", parents = {"buku"}, } labels["Bilibili"] = { type = "berkenaan", wikidata = 3077586, description = "=the video-sharing website {{w|bilibili}}", parents = {"media sosial", "World Wide Web"}, } labels["blogging"] = { type = "berkenaan", description = "default", parents = {"media sosial"}, } labels["Bluesky"] = { type = "berkenaan", wikidata = 78194383, description = "=the 
microblogging and social networking service {{w|Bluesky}}",
	parents = {"media sosial", "World Wide Web"},
}

labels["body art"] = {
	type = "berkenaan",
	description = "default",
	parents = {"seni", "fesyen"},
}

labels["Bollywood"] = {
	type = "berkenaan",
	wikidata = 93196,
	description = "default",
	parents = {"filem", "India"},
}

labels["buku"] = {
	type = "berkenaan",
	description = "default",
	parents = {"media massa", "kesusasteraan"},
}

labels["books of the Poetic Edda"] = {
	type = "nama",
	displaytitle = "books of the ''Poetic Edda''",
	description = "=[[book]]s of the ''[[Poetic Edda]]''",
	parents = {"Norse mythology"},
}

labels["Brazilian folklore"] = {
	type = "berkenaan",
	description = "default",
	parents = {"folklore", "Brazil"},
}

labels["cereka British"] = {
	type = "berkenaan",
	description = "=works of [[fiction]] of [[British]] origin",
	parents = {"cereka", "United Kingdom"},
}

labels["Buffy the Vampire Slayer"] = {
	type = "berkenaan",
	wikidata = 183513,
	displaytitle = "''Buffy the Vampire Slayer''",
	description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)",
	parents = {"cereka Amerika", "televisyen", "vampires"},
}

labels["cereka Kanada"] = {
	type = "berkenaan",
	description = "=works of [[fiction]] of [[Canada|Canadian]] origin",
	parents = {"cereka", "Kanada"},
}

labels["seni khat"] = {
	type = "berkenaan",
	description = "default",
	parents = {"seni", "penulisan"},
}

labels["cartomancy"] = {
	type = "berkenaan",
	description = "default",
	parents = {"penilikan"},
}

labels["castells"] = {
	type = "berkenaan",
	description = "=[[castell]]s, the Catalan tradition of human tower building",
	additional = "See {{w|castells}}.",
	parents = {"budaya", "sports"},
}

labels["celestial inhabitants"] = {
	type = "jenis",
	description = "=inhabitants of known [[celestial body|celestial bodies]]",
	parents = {"watak cereka", "cereka sains", "demonyms"},
}

labels["Celtic mythology"] = {
	type = "berkenaan",
	description = "default",
	parents = {"mitologi", "Ireland", "Wales"},
}

labels["characters from folklore"] = {
	type = "berkenaan",
	description = "default",
	parents = {"watak cereka", "folklore"},
}

labels["cheerleading"] = {
	type = "berkenaan",
	description = "default",
	parents = {"tarian", "gymnastics", "sports"},
}

labels["Church of England"] = {
	type = "berkenaan",
	description = "default with the",
	parents = {"Anglicanism", "England"},
}

labels["Chinese fiction"] = {
	type = "berkenaan",
	description = "=works of [[fiction]] of [[China|Chinese]] origin, including [[anime]]s, [[manhua]]s, [[novel]]s, [[series]] and [[video game]]s",
	parents = {"cereka", "China"},
}

labels["Chinese mythology"] = {
	type = "berkenaan",
	description = "default",
	parents = {"mitologi", "China"},
}

labels["cinematography"] = {
	type = "berkenaan",
	description = "default",
	parents = {"filem"},
}

labels["circus"] = {
	type = "berkenaan",
	description = "default no singularize",
	parents = {"hiburan", "teater"},
}

labels["comedy"] = {
	type = "berkenaan",
	description = "default",
	parents = {"drama"},
}

labels["komik"] = {
	type = "berkenaan",
	description = "default no singularize",
	parents = {"kesusasteraan"},
}

-- Confucianism: see [[Module:category tree/topic/Philosophy]]

labels["conlanging"] = {
	type = "berkenaan",
	description = "=[[conlanging]] (the making of [[constructed language]]s)",
	parents = {"language", "budaya"},
}

labels["conspiracy theories"] = {
	type = "berkenaan,set",
	description = "=[[conspiracy theory|conspiracy theories]] and theorists",
	parents = {"budaya"},
}

labels["constellations in the zodiac"] = {
	type = "nama",
	description = "=the ring of [[constellations]] that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year",
	parents = {"constellations", "astrologi"},
}

labels["kosmetik"] = {
	type = "berkenaan",
	description = "default",
	parents = {"toiletries", "fesyen"},
}

labels["cosplay"] = {
	type = "berkenaan",
	description = "default",
	parents = {"fandom"},
}

labels["tarian"]
= { type = "berkenaan", description = "default", parents = {"seni", "rekreasi"}, } labels["dances"] = { type = "jenis", description = "default", parents = {"tarian"}, } labels["DC Comics"] = { type = "berkenaan", wikidata = 2924461, description = "={{w|DC Comics}}", parents = {"cereka Amerika", "komik"}, } labels["demoscene"] = { type = "berkenaan", description = "default", parents = {"budaya", "computing"}, } labels["reka bentuk"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["dictionaries"] = { type = "jenis,nama", description = "default", parents = {"reference works", "lexicography"}, } labels["Disney"] = { type = "berkenaan", wikidata = 7414, description = "=the properties of {{w|The Walt Disney Company}}", additional = "This includes properties acquired jointly with or from other companies.", parents = {"cereka Amerika", "komik", "filem", "televisyen"}, } labels["penilikan"] = { type = "jenis", description = "default", parents = {"okultisme"}, } labels["Doctor Who"] = { type = "berkenaan", wikidata = 34316, displaytitle = "''Doctor Who''", description = "=the ''{{w|Doctor Who}}'' franchise", parents = {"British fiction", "cereka sains", "televisyen"}, } labels["Dracula"] = { type = "berkenaan", wikidata = 41542, displaytitle = "''Dracula''", description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}}, and its cultural derivations.", parents = {"fantasy", "kesusasteraan", "vampires"}, } labels["naga"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures"}, } labels["drama"] = { type = "berkenaan", description = "default", parents = {"teater"}, } labels["Egyptian deities"] = { type = "nama", description = "default", parents = {"gods", "Egyptian mythology"}, } labels["Egyptian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ancient Egypt"}, } labels["hiburan"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } 
labels["erotic literature"] = { type = "berkenaan", description = "default", parents = {"cereka", "literary genres", "sex"}, } labels["Etruscan mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Etruria"}, } labels["European folklore"] = { type = "berkenaan", description = "default", parents = {"folklore", "Europe"}, } labels["fairy tale"] = { type = "berkenaan", description = "=[[fairy tale]]s", parents = {"cereka"}, } labels["fairy tale characters"] = { type = "nama", description = "=[[fairy tale]] [[character]]s", parents = {"watak cereka", "fairy tale"}, } labels["fairy tales"] = { type = "nama", description = "default", parents = {"fairy tale"}, } labels["fan fiction"] = { type = "berkenaan", description = "default", parents = {"cereka", "fandom", "kesusasteraan"}, } labels["fandom"] = { type = "berkenaan", description = "{{{langname}}} terms arising from [[fandom]] culture.", parents = {"budaya"}, } labels["fantasy"] = { type = "berkenaan", description = "=the [[genre]] of [[fantasy]]", parents = {"cereka", "cereka spekulatif"}, } labels["fesyen"] = { type = "berkenaan", description = "default", parents = {"budaya", "clothing"}, } labels["faster-than-light travel"] = { type = "berkenaan", description = "default", parents = {"travel", "cereka sains", "astrophysics", "relativity"}, } labels["Fediverse"] = { type = "berkenaan", wikidata = 30325419, description = "=the decentralised social networking services collectively known as the {{w|Fediverse}}", parents = {"media sosial", "World Wide Web"}, } labels["cereka"] = { type = "berkenaan", description = "=specific works of [[fiction]]", parents = {"artistic works"}, } labels["fictional abilities"] = { type = "berkenaan,jenis", description = "=fictional [[ability|abilities]] and [[superpower]]s", parents = {"cereka", "cereka spekulatif"}, } labels["watak cereka"] = { type = "nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional locations"] = { type = 
"nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional planets"] = { type = "nama", description = "default", parents = {"fictional locations"}, } labels["fictional universes"] = { type = "nama,jenis", description = "default", parents = {"fictional locations"}, } labels["filem"] = { type = "berkenaan", description = "default", parents = {"media massa", "hiburan"}, } labels["F/F ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["film genres"] = { type = "jenis,berkenaan", description = "default", parents = {"filem", "genre"}, } labels["film industries"] = { type = "nama", description = "default", parents = {"filem"}, } labels["Finnic mythology"] = { type = "berkenaan", description = "=the [[mythology]] of the [[Finnic]] peoples", additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.", parents = {"mitologi", "Finland", "Estonia"}, } labels["flamenco"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["folklore"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["furry fandom"] = { type = "berkenaan", description = "default", parents = {"fandom", "subbudaya"}, } labels["Germanic deities"] = { type = "nama", description = "default", parents = {"gods", "Germanic mythology"}, } labels["Germanic mythology"] = { type = "nama", description = "=the [[mythology]] of the [[Germanic]] peoples", parents = {"mitologi"}, } labels["genre"] = { type = "jenis,berkenaan", description = "=[[genre]]s and genre classifications", parents = {"hiburan"}, wpcat = true, } labels["ghosts"] = { type = "berkenaan", description = "default", parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasy", "horror", "mythological creatures", "okultisme"}, } 
labels["Glee (TV series)"] = { type = "berkenaan", wikidata = 152178, displaytitle = "''Glee'' (TV series)", description = "=the television series ''[[w:Glee (TV series)|Glee]]'' (2009–2015)", parents = {"cereka Amerika", "televisyen"}, } labels["graphic design"] = { type = "berkenaan", description = "default", parents = {"reka bentuk"}, } labels["Greek deities"] = { type = "nama", description = "default", parents = {"gods", "Greek mythology"}, } labels["Greek mythology"] = { type = "berkenaan", description = "=the [[mythology]] of [[Ancient Greece]]", parents = {"mitologi", "Ancient Greece"}, } labels["Gulliver's Travels"] = { type = "berkenaan", wikidata = 181488, displaytitle = "''Gulliver's Travels''", description = "=''[[w:Gulliver's Travels|Gulliver’s Travels]]''", parents = {"kesusasteraan"}, } labels["Harry Potter"] = { type = "berkenaan", wikidata = 8337, displaytitle = "''Harry Potter''", description = "{{{langname}}} terms used in context of the ''{{w|Harry Potter}}'' franchise.", parents = {"British fiction", "fantasy", "kesusasteraan", "watak cereka"}, } labels["Hawaiian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Hawaii, USA"}, } labels["F/M ships"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.", parents = {"ships (fandom) by relationship type"}, } labels["Hindu deities"] = { type = "nama", description = "default", parents = {"gods", "Hindu mythology"}, } labels["Hindu mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Hinduism"}, } labels["Homestuck"] = { type = "berkenaan", displaytitle ="''Homestuck''", wikidata = 2618713, description = "=the ''{{w|Homestuck}}'' multimedia fiction series", parents = {"cereka Amerika", "komik"}, } labels["Hopi culture"] = { type = "berkenaan", description = "default", parents = {"budaya", "United States"}, } labels["horror"] = { type 
= "berkenaan", description = "=the [[horror]] [[genre]]", parents = {"kesusasteraan", "cereka spekulatif"}, } labels["humanities"] = { type = "berkenaan", description = "default no singularize", parents = {"budaya"}, commonscat = true; } labels["incestuous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.", parents = {"incest", "ships (fandom) by relationship type"}, } labels["idol fandom"] = { type = "berkenaan", description = "default", parents = {"fandom"}, } labels["Instagram"] = { type = "berkenaan", wikidata = 209330, description = "=the photo sharing and social networking service [[Instagram]]", parents = {"photography", "media sosial", "World Wide Web"}, } labels["Iranian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Iran"}, } labels["Irish mythology"] = { type = "berkenaan", description = "default", parents = {"Celtic mythology", "Ireland"}, } labels["James Bond"] = { type = "berkenaan", wikidata = 844, displaytitle = "''James Bond''", description = "=the ''[[James Bond]]'' franchise", parents = {"British fiction", "filem"}, } labels["dewa Jepun"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Jepun"}, } labels["cereka Jepun"] = { type = "berkenaan", description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], [[novel]], [[siri]] dan [[permainan video]]", parents = {"cereka", "Japan"}, } labels["mitologi Jepun"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Jepun"}, } labels["job titles in Romance of the Three Kingdoms"] = { type = "jenis", displaytitle = "job titles in ''Romance of the Three Kingdoms''", description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "titles"}, } labels["kewartawanan"] = { type = "berkenaan", description = "default", parents = 
{"penulisan"}, } labels["Kachinas"] = { type = "nama", description = "default", parents = {"Hopi culture"}, } labels["Komi mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Komi, Russia"}, } labels["Korean fiction"] = { type = "berkenaan", description = "=works of [[fiction]], including [[anime]]s, [[manhwa]]s, [[novel]]s, [[series]] and [[video game]]s, whose origin is of [[Korea]]", parents = {"cereka", "Korea"}, } labels["Korean mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Korea"}, } labels["genre kesusasteraan"] = { type = "jenis", description = "{{{langname}}} terms for [[literary]] [[genre]]s.", parents = {"kesusasteraan", "cereka", "genre"}, } labels["kesusasteraan"] = { type = "berkenaan", description = "default", parents = {"budaya", "hiburan", "penulisan"}, } labels["Lost (TV series)"] = { type = "berkenaan", wikidata = 23567, displaytitle = "''Lost'' (TV series)", description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["Lovecraftian horror"] = { type = "berkenaan", wikidata = 2448865, description = "=the [[literature|literary]] works of {{w|H. P. 
Lovecraft}}", parents = {"horror", "kesusasteraan", "cereka", "supernatural"}, } labels["magic"] = { type = "berkenaan", description = "default", parents = {"supernatural"}, } labels["magic words"] = { type = "set", wikidata = 1135882, description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.", parents = {"plot devices", "cereka"}, } labels["genre manga"] = { type = "jenis", description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.", parents = {"genre kesusasteraan"}, } labels["perkahwinan"] = { type = "berkenaan", description = "default", parents = {"budaya", "keluarga"}, } labels["Marvel Comics"] = { type = "berkenaan", wikidata = 173496, description = "={{w|Marvel Comics}}", parents = {"cereka Amerika", "komik"}, } labels["media massa"] = { type = "berkenaan", description = "default", parents = {"media", "budaya"}, } labels["Meitei deities"] = { type = "nama", description = "default", parents = {"gods", "Meitei mythology"}, } labels["Meitei mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Manipur, India"}, } labels["merpeople"] = { type = "berkenaan", description = "default", parents = {"mythological creatures"}, } labels["Mesopotamian deities"] = { type = "nama", description = "default", parents = {"gods", "Mesopotamian mythology"}, } labels["Mesopotamian mythology"] = { type = "berkenaan", description = "=the [[mythology]] of ancient [[Mesopotamia]]", parents = {"mitologi", "Ancient Near East"}, } labels["M/M ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["modern art"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Mongolian tribes"] = { type = "nama", description = "{{{langname}}} names for 
Mongolian tribes.", parents = {"ethnonyms", "Mongolia"}, } labels["moustaches"] = { type = "jenis", description = "default", parents = {"face", "fesyen", "hair"}, } labels["My Hero Academia"] = { type = "berkenaan", wikidata = 18047903, displaytitle ="''My Hero Academia''", description = "=the ''{{w|My Hero Academia}}'' series", parents = {"Japanese fiction", "animasi", "komik"}, } labels["My Little Pony"] = { type = "berkenaan", wikidata = 1071312, displaytitle = "''My Little Pony''", description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom", parents = {"cereka Amerika", "animasi", "toys"}, } labels["mythological creatures"] = { type = "jenis", description = "default", parents = {"mitologi", "fantasy"}, } labels["mythological figures"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological locations"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological plants"] = { type = "jenis,nama", description = "default", parents = {"mitologi", "plants"}, } labels["mitologi"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["narratology"] = { type = "berkenaan", description = "default", parents = {"kesusasteraan", "drama"}, } labels["Navajo mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["newspapers"] = { type = "nama", description = "default", parents = {"periodicals"}, } labels["Niconico"] = { type = "berkenaan", wikidata = 697233, description = "=the video-sharing website {{w|Niconico}}", parents = {"media sosial", "World Wide Web"}, } labels["Norse deities"] = { type = "nama", description = "default", parents = {"gods", "Germanic deities", "Norse mythology"}, } labels["Norse mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Germanic mythology"}, } labels["okultisme"] = { type = "berkenaan", description = "default with the", parents = 
{"supernatural", "paranormal"}, } labels["omegaverse"] = { type = "berkenaan", wikidata = 96397374, description = "=the [[omegaverse]] genre", parents = {"erotic literature", "fan fiction", "cereka spekulatif"}, } labels["Omori"] = { type = "berkenaan", wikidata = 105618699, displaytitle ="''Omori''", description = "=the ''{{w|Omori (video game)|Omori}}'' series", parents = {"cereka Amerika", "video games"}, } labels["Once Upon a Time"] = { type = "berkenaan", wikidata = 23673, displaytitle = "''Once Upon a Time''", description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)", parents = {"cereka Amerika", "Disney", "televisyen"}, } labels["painting"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["palmistry"] = { type = "berkenaan", description = "default", parents = {"penilikan"}, } labels["parties"] = { type = "jenis,berkenaan", description = "default", parents = {"hiburan", "budaya"}, } labels["people in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "people in ''Romance of the Three Kingdoms''", description = "=people in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms"}, } labels["perfumes"] = { type = "jenis,set", description = "default", parents = {"fesyen", "scents", "perfumery"}, } labels["periodicals"] = { type = "jenis,berkenaan", description = "default", parents = {"media massa", "kesusasteraan"}, } labels["personifications"] = { type = "nama", description = "default", parents = {"narratology"}, } labels["places in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "places in ''Romance of the Three Kingdoms''", description = "=places in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "China"}, } labels["plot devices"] = { type = "jenis", description = "default", parents = {"narratology", "cereka"}, } labels["puisi"] = { type = "berkenaan", description = "default", parents 
= {"kesusasteraan", "seni"}, } labels["polyamorous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.", parents = {"ships (fandom) by relationship type"}, } labels["Private Eye"] = { type = "berkenaan", displaytitle = "''Private Eye''", description = "=the ''{{w|Private Eye}}'' franchise", parents = {"British fiction"}, } labels["Reddit"] = { type = "berkenaan", wikidata = 2195701, description = "=the social news aggregation and discussion website {{w|Reddit}}", parents = {"media sosial", "World Wide Web"}, } labels["reference works"] = { type = "jenis", description = "default", parents = {"buku"}, } labels["Roman deities"] = { type = "nama", description = "default", parents = {"gods", "Roman mythology"}, } labels["Roman mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ancient Rome"}, } labels["romance fiction"] = { type = "berkenaan", description = "default", parents = {"literary genres", "love"}, } labels["Romance of the Three Kingdoms"] = { type = "berkenaan", wikidata = 70806, displaytitle = "''Romance of the Three Kingdoms''", description = "=''{{w|Romance of the Three Kingdoms}}''", parents = {"cereka", "kesusasteraan", "China"}, } labels["RPF ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.", additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].", parents = {"ships (fandom) by relationship type"}, } labels["cereka sains"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "cereka"}, } labels["SCP Foundation"] = { type = "berkenaan", wikidata = 17439649, description = "English terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.", parents = 
{"fantasy", "cereka", "horror", "cereka sains", "supernatural"}, } labels["arca"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Shahnameh"] = { type = "berkenaan", wikidata = 8279, displaytitle = "''Shahnameh''", description = "=''Shahnameh''", parents = {"cereka", "puisi", "kesusasteraan", "Persia"}, } labels["Shahnameh characters"] = { type = "nama", description = "=characters in the [[Shahnameh]]", parents = {"Shahnameh"}, } labels["shapeshifters"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore"}, } labels["Sherlock Holmes"] = { type = "berkenaan", wikidata = 2316684, description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them", parents = {"British fiction", "kesusasteraan"}, } labels["Sherlock (TV series)"] = { type = "berkenaan", wikidata = 192837, displaytitle = "''Sherlock'' (TV series)", description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)", parents = {"Sherlock Holmes", "televisyen"}, } labels["shipping (fandom)"] = { type = "berkenaan", description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)", parents = {"fandom", "romance fiction"}, } labels["ships (fandom)"] = { type = "kumpulan", description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} (i.e., a fictional relationship between two fictional characters or real people)", parents = {"shipping (fandom)"}, } labels["ships (fandom) by relationship type"] = { type = "kumpulan", description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g., [[heterosexual]], [[homosexual]], etc.)", parents = {"ships (fandom)"}, } labels["shippers (fandom)"] = { type = "jenis", description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)", 
parents = {"shipping (fandom)"}, } labels["Slavic deities"] = { type = "nama", description = "default", parents = {"gods", "Slavic mythology"}, } labels["Slavic mythology"] = { type = "berkenaan", description = "=the [[mythology]] of the [[Slav]]s", parents = {"mitologi"}, } labels["Smallville (TV series)"] = { type = "berkenaan", wikidata = 180228, displaytitle = "''Smallville'' (TV series)", description = "=the television series ''{{w|Smallville}}'' (2001–2011)", parents = {"cereka Amerika", "Superman", "televisyen"}, } labels["media sosial"] = { type = "berkenaan", wikidata = 202833, description = "default", parents = {"media massa", "Internet"}, } labels["South Korean idol fandom"] = { type = "berkenaan", wikidata = 39086123, description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]", parents = {"idol fandom", "South Korea"}, } labels["South Park"] = { type = "berkenaan", wikidata = 16538, displaytitle = "''South Park''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Star Trek"] = { type = "berkenaan", wikidata = 1092, displaytitle = "''Star Trek''", description = "=the ''{{w|Star Trek}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"}, } labels["Star Wars"] = { type = "berkenaan", wikidata = 462, displaytitle = "''Star Wars''", description = "=the ''{{w|Star Wars}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "Disney"}, } labels["Steven Universe"] = { type = "berkenaan", wikidata = 7615342, displaytitle = "''Steven Universe''", description = "=the animated television series ''{{w|Steven Universe}}''", parents = {"cereka Amerika", "animasi"}, } labels["stock characters"] = { type = "jenis", wikidata = 636497, description = "default", parents = {"watak cereka"}, } labels["cereka spekulatif"] = { type = "berkenaan", wikidata = 9326077, description = "default", parents = 
{"cereka", "genre"}, } labels["spider fighting"] = { type = "berkenaan", wikidata = 7577058, description = "={{w|spider fighting}}", parents = {"spiders", "human activity"}, } labels["subbudaya"] = { type = "berkenaan", description = "=[[subculture]]s", parents = {"budaya"}, } labels["adiwira"] = { type = "nama", wikidata = 188784, description = "=[[superhero]]es", parents = {"watak cereka"}, } labels["Superman"] = { type = "berkenaan", wikidata = 79015, description = "=the fictional [[superhero]] [[Superman]]", parents = {"DC Comics", "watak cereka"}, } labels["supernatural"] = { type = "berkenaan", wikidata = 80837, description = "default with the", parents = {"folklore"}, } labels["Supernatural (TV series)"] = { type = "berkenaan", wikidata = 130585, displaytitle = "''Supernatural'' (TV series)", description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)", parents = {"cereka Amerika", "televisyen"}, } labels["Tamil deities"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"gods", "Hindu deities", "Tamil mythology"}, } labels["Tamil mythology"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"mitologi", "Hindu mythology", "Tamil Nadu, India"}, } labels["televisyen"] = { type = "berkenaan", wikidata = 289, description = "default", parents = {"media massa", "penyiaran"}, } labels["The Handmaid's Tale"] = { type = "berkenaan", wikidata = 25207350, displaytitle = "''The Handmaid's Tale''", description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)", parents = {"Canadian fiction", "utopian and dystopian fiction", "kesusasteraan"}, } labels["The 
Hunger Games"] = { type = "berkenaan", wikidata = 11679, displaytitle = "''The Hunger Games''", description = "=''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations", parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction", "kesusasteraan"}, } labels["The Matrix"] = { type = "berkenaan", wikidata = 83495, displaytitle = "''The Matrix''", description = "=''{{w|The Matrix}}''", parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction"}, } labels["The Simpsons"] = { type = "berkenaan", wikidata = 886, displaytitle = "''The Simpsons''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.", parents = {"cereka Amerika", "animasi", "Disney"}, } labels["The Walking Dead"] = { type = "berkenaan", wikidata = 232737, displaytitle = "''The Walking Dead''", description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted", parents = {"cereka Amerika", "televisyen", "utopian and dystopian fiction", "zombies"}, } labels["The Wizard of Oz"] = { type = "berkenaan", wikidata = 130295, displaytitle = "''The Wizard of Oz''", description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'', subsequent books or films derived from it, such as the ''[[w:The Wizard of Oz (1939 film)|1939 film]]''.", parents = {"cereka Amerika", "fantasy", "kesusasteraan"}, } labels["The X-Files"] = { type = "berkenaan", wikidata = 2744, displaytitle = "''The X-Files''", description = "=the ''{{w|The X-Files}}'' franchise", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["teater"] = { type = "berkenaan", description = "default", parents = {"seni", "hiburan"}, } labels["Thracian deities"] = { type = "nama", description = "default", parents = {"gods"}, } labels["TikTok"] = { type = "berkenaan", wikidata = 48938223, description = "=the video-sharing and 
social-networking service {{w|TikTok}}", parents = {"media sosial", "World Wide Web"}, } labels["Tupi mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Brazil"}, } labels["Twilight (novel series)"] = { type = "berkenaan", wikidata = 44523, displaytitle = "''Twilight'' (novel series)", description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise", parents = {"cereka Amerika", "fantasy", "kesusasteraan", "vampires"}, } labels["Twitter"] = { type = "berkenaan", wikidata = 918, description = "=the social networking and microblogging service {{w|Twitter}}", parents = {"media sosial", "World Wide Web"}, } labels["Tumblr"] = { type = "berkenaan", wikidata = 384060, description = "=the microblogging and social networking service {{w|Tumblr}}", parents = {"media sosial", "World Wide Web"}, } labels["utopian and dystopian fiction"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif"}, } labels["vampires"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"}, } labels["vampire lifestyle"] = { type = "berkenaan", description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)", parents = {"subbudaya", "vampires"}, } labels["Virtual YouTuber"] = { type = "berkenaan", wikidata = 55155641, description = "=[[virtual YouTuber]]s ([[VTuber]]s)", parents = {"YouTube", "hiburan"}, } labels["web design"] = { type = "berkenaan", description = "default", parents = {"reka bentuk", "World Wide Web"}, } labels["werewolves"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"}, } labels["worldbuilding"] = { type = "berkenaan", description = "default", parents = {"narratology", "cereka spekulatif"}, } labels["Xena: Warrior Princess"] = { type = "berkenaan", wikidata = 
38497, displaytitle = "''Xena: Warrior Princess''", description = "=the television series ''{{w|Xena: Warrior Princess}}'' (1995–2001)", parents = {"cereka Amerika", "fantasy", "televisyen"}, } labels["YouTube"] = { type = "berkenaan", wikidata = 866, description = "=the video-sharing website {{w|YouTube}}", parents = {"media sosial", "World Wide Web", "Google"}, } labels["YouTube Poop"] = { type = "berkenaan", wikidata = 16927904, description = "default", parents = {"YouTube", "Internet memes"}, } labels["zombies"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror"}, } return labels
0cq1z1gik9bzog9quzjsbrm5g2oazso 281343 281339 2026-04-22T01:10:11Z PeaceSeekers 3334 281343 Scribunto text/plain
local labels = {} labels["budaya"] = { type = "berkenaan", description = "default", parents = {"masyarakat"}, } labels["A Christmas Carol"] = { type = "berkenaan", wikidata = 62879, displaytitle = "''A Christmas Carol''", description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.", parents = {"British fiction", "Charles Dickens"}, } labels["A Song of Ice and Fire"] = { type = "berkenaan", wikidata = 45875, displaytitle = "''A Song of Ice and Fire''", description = "{{{langname}}} terms used in context of the ''{{w|Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.", parents = {"cereka Amerika", "fantasy", "kesusasteraan"}, } labels["lakonan"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["alternate history"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "history"}, } labels["cereka Amerika"] = { type = "berkenaan", description = "=works of American fiction", parents = {"cereka", "Amerika Syarikat"}, } labels["animasi"] = { type = "berkenaan", description = 
"default", parents = {"media massa"}, } labels["Arabic fiction"] = { type = "berkenaan", description = "=works of [[fiction]] of [[Arabic]] origin", parents = {"cereka"}, } labels["Arabian deities"] = { type = "nama", description = "default", parents = {"gods", "Arabian mythology"}, } labels["Arabian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["Armenian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Armenia"}, } labels["seni"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["Arthurian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "United Kingdom"}, } labels["artistic works"] = { type = "nama,jenis", description = "default", parents = {"seni"}, } labels["astrobiology"] = { type = "berkenaan", description = "default", parents = {"astronomy", "biology", "geology"}, } labels["astrologi"] = { type = "berkenaan", description = "default", parents = {"penilikan", "pseudosains", "obsolete scientific theories"}, } labels["Asturian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Asturias, Spain"}, } labels["Avatar: The Last Airbender"] = { type = "berkenaan", wikidata = 11572, displaytitle = "''Avatar: The Last Airbender''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Australian Aboriginal mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Australia"}, } labels["ballet"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["Barbie"] = { type = "berkenaan", wikidata = 167447, description = "=the {{w|Barbie}} fashion doll produced by Mattel", parents = {"toys"}, } labels["Batman"] = { type = "berkenaan", wikidata = 2695156, description = 
"=the fictional [[superhero]] [[Batman]]", parents = {"DC Comics", "watak cereka"}, } labels["bibliography"] = { type = "berkenaan", description = "default", parents = {"buku"}, } labels["Bilibili"] = { type = "berkenaan", wikidata = 3077586, description = "=the video-sharing website {{w|bilibili}}", parents = {"media sosial", "World Wide Web"}, } labels["blogging"] = { type = "berkenaan", description = "default", parents = {"media sosial"}, } labels["Bluesky"] = { type = "berkenaan", wikidata = 78194383, description = "=the microblogging and social networking service {{w|Bluesky}}", parents = {"media sosial", "World Wide Web"}, } labels["body art"] = { type = "berkenaan", description = "default", parents = {"seni", "fesyen"}, } labels["Bollywood"] = { type = "berkenaan", wikidata = 93196, description = "default", parents = {"filem", "India"}, } labels["buku"] = { type = "berkenaan", description = "default", parents = {"media massa", "kesusasteraan"}, } labels["books of the Poetic Edda"] = { type = "nama", displaytitle = "books of the ''Poetic Edda''", description = "=[[book]]s of the ''[[Poetic Edda]]''", parents = {"Norse mythology"}, } labels["Brazilian folklore"] = { type = "berkenaan", description = "default", parents = {"folklore", "Brazil"}, } labels["cereka British"] = { type = "berkenaan", description = "=works of [[fiction]] of [[British]] origin", parents = {"cereka", "United Kingdom"}, } labels["Buffy the Vampire Slayer"] = { type = "berkenaan", wikidata = 183513, displaytitle = "''Buffy the Vampire Slayer''", description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)", parents = {"cereka Amerika", "televisyen", "vampires"}, } labels["cereka Kanada"] = { type = "berkenaan", description = "=works of [[fiction]] of [[Canada|Canadian]] origin", parents = {"cereka", "Kanada"}, } labels["seni khat"] = { type = "berkenaan", description = "default", parents = {"seni", "penulisan"}, } labels["cartomancy"] = { type = "berkenaan", 
description = "default", parents = {"penilikan"}, } labels["castells"] = { type = "berkenaan", description = "=[[castell]]s, the Catalan tradition of human tower building", additional = "See {{w|castells}}.", parents = {"budaya", "sports"}, } labels["celestial inhabitants"] = { type = "jenis", description = "=inhabitants of known [[celestial body|celestial bodies]]", parents = {"watak cereka", "cereka sains", "demonyms"}, } labels["Celtic mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ireland", "Wales"}, } labels["characters from folklore"] = { type = "berkenaan", description = "default", parents = {"watak cereka", "folklore"}, } labels["cheerleading"] = { type = "berkenaan", description = "default", parents = {"tarian", "gymnastics", "sports"}, } labels["Church of England"] = { type = "berkenaan", description = "default with the", parents = {"Anglicanism", "England"}, } labels["Chinese fiction"] = { type = "berkenaan", description = "=works of [[fiction]], including [[anime]]s, [[manhua]]s, [[novel]]s, [[series]] and [[video game]]s, whose origin is of [[China]]", parents = {"cereka", "China"}, } labels["Chinese mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "China"}, } labels["cinematography"] = { type = "berkenaan", description = "default", parents = {"filem"}, } labels["circus"] = { type = "berkenaan", description = "default no singularize", parents = {"hiburan", "teater"}, } labels["comedy"] = { type = "berkenaan", description = "default", parents = {"drama"}, } labels["komik"] = { type = "berkenaan", description = "default no singularize", parents = {"kesusasteraan"}, } -- Confucianism: see [[Module:category tree/topic/Philosophy]] labels["conlanging"] = { type = "berkenaan", description = "=[[conlanging]] (the making of [[constructed language]]s)", parents = {"language", "budaya"}, } labels["conspiracy theories"] = { type = "berkenaan,set", description = "=[[conspiracy 
theory|conspiracy theories]] and theorists", parents = {"budaya"}, } labels["constellations in the zodiac"] = { type = "nama", description = "=the ring of [[constellations]] that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year", parents = {"constellations", "astrologi"}, } labels["kosmetik"] = { type = "berkenaan", description = "default", parents = {"toiletries", "fesyen"}, } labels["cosplay"] = { type = "berkenaan", description = "default", parents = {"fandom"}, } labels["tarian"] = { type = "berkenaan", description = "default", parents = {"seni", "rekreasi"}, } labels["dances"] = { type = "jenis", description = "default", parents = {"tarian"}, } labels["DC Comics"] = { type = "berkenaan", wikidata = 2924461, description = "={{w|DC Comics}}", parents = {"cereka Amerika", "komik"}, } labels["demoscene"] = { type = "berkenaan", description = "default", parents = {"budaya", "computing"}, } labels["reka bentuk"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["dictionaries"] = { type = "jenis,nama", description = "default", parents = {"reference works", "lexicography"}, } labels["Disney"] = { type = "berkenaan", wikidata = 7414, description = "=the properties of {{w|The Walt Disney Company}}", additional = "This includes properties acquired jointly with or from other companies.", parents = {"cereka Amerika", "komik", "filem", "televisyen"}, } labels["penilikan"] = { type = "jenis", description = "default", parents = {"okultisme"}, } labels["Doctor Who"] = { type = "berkenaan", wikidata = 34316, displaytitle = "''Doctor Who''", description = "=the ''{{w|Doctor Who}}'' franchise", parents = {"cereka British", "cereka sains", "televisyen"}, } labels["Dracula"] = { type = "berkenaan", wikidata = 41542, displaytitle = "''Dracula''", description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}}, and its cultural derivations.", parents = {"fantasy", 
"kesusasteraan", "vampires"}, } labels["naga"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures"}, } labels["drama"] = { type = "berkenaan", description = "default", parents = {"teater"}, } labels["Egyptian deities"] = { type = "nama", description = "default", parents = {"gods", "Egyptian mythology"}, } labels["Egyptian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ancient Egypt"}, } labels["hiburan"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["erotic literature"] = { type = "berkenaan", description = "default", parents = {"cereka", "genre kesusasteraan", "sex"}, } labels["Etruscan mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Etruria"}, } labels["European folklore"] = { type = "berkenaan", description = "default", parents = {"folklore", "Europe"}, } labels["fairy tale"] = { type = "berkenaan", description = "=[[fairy tale]]s", parents = {"cereka"}, } labels["fairy tale characters"] = { type = "nama", description = "=[[fairy tale]] [[character]]s", parents = {"watak cereka", "fairy tale"}, } labels["fairy tales"] = { type = "nama", description = "default", parents = {"fairy tale"}, } labels["fan fiction"] = { type = "berkenaan", description = "default", parents = {"cereka", "fandom", "kesusasteraan"}, } labels["fandom"] = { type = "berkenaan", description = "{{{langname}}} terms arising from [[fandom]] culture.", parents = {"budaya"}, } labels["fantasy"] = { type = "berkenaan", description = "=the [[genre]] of [[fantasy]]", parents = {"cereka", "cereka spekulatif"}, } labels["fesyen"] = { type = "berkenaan", description = "default", parents = {"budaya", "clothing"}, } labels["faster-than-light travel"] = { type = "berkenaan", description = "default", parents = {"travel", "cereka sains", "astrophysics", "relativity"}, } labels["Fediverse"] = { type = "berkenaan", wikidata = 30325419, description = "=the decentralised 
social networking services collectively known as the {{w|Fediverse}}", parents = {"media sosial", "World Wide Web"}, } labels["cereka"] = { type = "berkenaan", description = "=specific works of [[fiction]]", parents = {"artistic works"}, } labels["fictional abilities"] = { type = "berkenaan,jenis", description = "=fictional [[ability|abilities]] and [[superpower]]s", parents = {"cereka", "cereka spekulatif"}, } labels["watak cereka"] = { type = "nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional locations"] = { type = "nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional planets"] = { type = "nama", description = "default", parents = {"fictional locations"}, } labels["fictional universes"] = { type = "nama,jenis", description = "default", parents = {"fictional locations"}, } labels["filem"] = { type = "berkenaan", description = "default", parents = {"media massa", "hiburan"}, } labels["F/F ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["film genres"] = { type = "jenis,berkenaan", description = "default", parents = {"filem", "genre"}, } labels["film industries"] = { type = "nama", description = "default", parents = {"filem"}, } labels["Finnic mythology"] = { type = "berkenaan", description = "=the [[mythology]] of the [[Finnic]] peoples", additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.", parents = {"mitologi", "Finland", "Estonia"}, } labels["flamenco"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["folklore"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["furry fandom"] = { type = "berkenaan", description = "default", parents = {"fandom", "subbudaya"}, } labels["Germanic deities"] = { type = "nama", description 
= "default", parents = {"gods", "Germanic mythology"}, } labels["Germanic mythology"] = { type = "nama", description = "=the [[mythology]] of the [[Germanic]] peoples", parents = {"mitologi"}, } labels["genre"] = { type = "jenis,berkenaan", description = "=[[genre]]s and genre classifications", parents = {"hiburan"}, wpcat = true, } labels["ghosts"] = { type = "berkenaan", description = "default", parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasy", "horror", "mythological creatures", "okultisme"}, } labels["Glee (TV series)"] = { type = "berkenaan", wikidata = 152178, displaytitle = "''Glee'' (TV series)", description = "=the television series ''[[w:Glee (TV series)|Glee]]'' (2009–2015)", parents = {"cereka Amerika", "televisyen"}, } labels["graphic design"] = { type = "berkenaan", description = "default", parents = {"reka bentuk"}, } labels["Greek deities"] = { type = "nama", description = "default", parents = {"gods", "Greek mythology"}, } labels["Greek mythology"] = { type = "berkenaan", description = "=the [[mythology]] of [[Ancient Greece]]", parents = {"mitologi", "Ancient Greece"}, } labels["Gulliver's Travels"] = { type = "berkenaan", wikidata = 181488, displaytitle = "''Gulliver's Travels''", description = "=''[[w:Gulliver's Travels|Gulliver’s Travels]]''", parents = {"kesusasteraan"}, } labels["Harry Potter"] = { type = "berkenaan", wikidata = 8337, displaytitle = "''Harry Potter''", description = "{{{langname}}} terms used in context of the ''{{w|Harry Potter}}'' franchise.", parents = {"cereka British", "fantasy", "kesusasteraan", "watak cereka"}, } labels["Hawaiian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Hawaii, USA"}, } labels["F/M ships"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.", parents = {"ships (fandom) by relationship type"}, } labels["Hindu deities"] 
= { type = "nama", description = "default", parents = {"gods", "Hindu mythology"}, } labels["Hindu mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Hinduism"}, } labels["Homestuck"] = { type = "berkenaan", displaytitle = "''Homestuck''", wikidata = 2618713, description = "=the ''{{w|Homestuck}}'' multimedia fiction series", parents = {"cereka Amerika", "komik"}, } labels["Hopi culture"] = { type = "berkenaan", description = "default", parents = {"budaya", "United States"}, } labels["horror"] = { type = "berkenaan", description = "=the [[horror]] [[genre]]", parents = {"kesusasteraan", "cereka spekulatif"}, } labels["humanities"] = { type = "berkenaan", description = "default no singularize", parents = {"budaya"}, commonscat = true, } labels["incestuous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.", parents = {"incest", "ships (fandom) by relationship type"}, } labels["idol fandom"] = { type = "berkenaan", description = "default", parents = {"fandom"}, } labels["Instagram"] = { type = "berkenaan", wikidata = 209330, description = "=the photo sharing and social networking service [[Instagram]]", parents = {"photography", "media sosial", "World Wide Web"}, } labels["Iranian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Iran"}, } labels["Irish mythology"] = { type = "berkenaan", description = "default", parents = {"Celtic mythology", "Ireland"}, } labels["James Bond"] = { type = "berkenaan", wikidata = 844, displaytitle = "''James Bond''", description = "=the ''[[James Bond]]'' franchise", parents = {"cereka British", "filem"}, } labels["dewa Jepun"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Jepun"}, } labels["cereka Jepun"] = { type = "berkenaan", description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], 
[[novel]], [[siri]] dan [[permainan video]]", parents = {"cereka", "Japan"}, } labels["mitologi Jepun"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Jepun"}, } labels["job titles in Romance of the Three Kingdoms"] = { type = "jenis", displaytitle = "job titles in ''Romance of the Three Kingdoms''", description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "titles"}, } labels["kewartawanan"] = { type = "berkenaan", description = "default", parents = {"penulisan"}, } labels["Kachinas"] = { type = "nama", description = "default", parents = {"Hopi culture"}, } labels["Komi mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Komi, Russia"}, } labels["Korean fiction"] = { type = "berkenaan", description = "=works of [[fiction]], including [[anime]]s, [[manhwa]]s, [[novel]]s, [[series]] and [[video game]]s, whose origin is of [[Korea]]", parents = {"cereka", "Korea"}, } labels["Korean mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Korea"}, } labels["genre kesusasteraan"] = { type = "jenis", description = "{{{langname}}} terms for [[literary]] [[genre]]s.", parents = {"kesusasteraan", "cereka", "genre"}, } labels["kesusasteraan"] = { type = "berkenaan", description = "default", parents = {"budaya", "hiburan", "penulisan"}, } labels["Lost (TV series)"] = { type = "berkenaan", wikidata = 23567, displaytitle = "''Lost'' (TV series)", description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["Lovecraftian horror"] = { type = "berkenaan", wikidata = 2448865, description = "=the [[literature|literary]] works of {{w|H. P. 
Lovecraft}}", parents = {"horror", "kesusasteraan", "cereka", "supernatural"}, } labels["magic"] = { type = "berkenaan", description = "default", parents = {"supernatural"}, } labels["magic words"] = { type = "set", wikidata = 1135882, description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.", parents = {"plot devices", "cereka"}, } labels["genre manga"] = { type = "jenis", description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.", parents = {"genre kesusasteraan"}, } labels["perkahwinan"] = { type = "berkenaan", description = "default", parents = {"budaya", "keluarga"}, } labels["Marvel Comics"] = { type = "berkenaan", wikidata = 173496, description = "={{w|Marvel Comics}}", parents = {"cereka Amerika", "komik"}, } labels["media massa"] = { type = "berkenaan", description = "default", parents = {"media", "budaya"}, } labels["Meitei deities"] = { type = "nama", description = "default", parents = {"gods", "Meitei mythology"}, } labels["Meitei mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Manipur, India"}, } labels["merpeople"] = { type = "berkenaan", description = "default", parents = {"mythological creatures"}, } labels["Mesopotamian deities"] = { type = "nama", description = "default", parents = {"gods", "Mesopotamian mythology"}, } labels["Mesopotamian mythology"] = { type = "berkenaan", description = "=the [[mythology]] of ancient [[Mesopotamia]]", parents = {"mitologi", "Ancient Near East"}, } labels["M/M ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["modern art"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Mongolian tribes"] = { type = "nama", description = "{{{langname}}} names for 
Mongolian tribes.", parents = {"ethnonyms", "Mongolia"}, } labels["moustaches"] = { type = "jenis", description = "default", parents = {"face", "fesyen", "hair"}, } labels["My Hero Academia"] = { type = "berkenaan", wikidata = 18047903, displaytitle = "''My Hero Academia''", description = "=the ''{{w|My Hero Academia}}'' series", parents = {"cereka Jepun", "animasi", "komik"}, } labels["My Little Pony"] = { type = "berkenaan", wikidata = 1071312, displaytitle = "''My Little Pony''", description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom", parents = {"cereka Amerika", "animasi", "toys"}, } labels["mythological creatures"] = { type = "jenis", description = "default", parents = {"mitologi", "fantasy"}, } labels["mythological figures"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological locations"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological plants"] = { type = "jenis,nama", description = "default", parents = {"mitologi", "plants"}, } labels["mitologi"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["narratology"] = { type = "berkenaan", description = "default", parents = {"kesusasteraan", "drama"}, } labels["Navajo mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["newspapers"] = { type = "nama", description = "default", parents = {"periodicals"}, } labels["Niconico"] = { type = "berkenaan", wikidata = 697233, description = "=the video-sharing website {{w|Niconico}}", parents = {"media sosial", "World Wide Web"}, } labels["Norse deities"] = { type = "nama", description = "default", parents = {"gods", "Germanic deities", "Norse mythology"}, } labels["Norse mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Germanic mythology"}, } labels["okultisme"] = { type = "berkenaan", description = "default with the", parents = 
{"supernatural", "paranormal"}, } labels["omegaverse"] = { type = "berkenaan", wikidata = 96397374, description = "=the [[omegaverse]] genre", parents = {"erotic literature", "fan fiction", "cereka spekulatif"}, } labels["Omori"] = { type = "berkenaan", wikidata = 105618699, displaytitle = "''Omori''", description = "=the ''{{w|Omori (video game)|Omori}}'' series", parents = {"cereka Amerika", "video games"}, } labels["Once Upon a Time"] = { type = "berkenaan", wikidata = 23673, displaytitle = "''Once Upon a Time''", description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)", parents = {"cereka Amerika", "Disney", "televisyen"}, } labels["painting"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["palmistry"] = { type = "berkenaan", description = "default", parents = {"penilikan"}, } labels["parties"] = { type = "jenis,berkenaan", description = "default", parents = {"hiburan", "budaya"}, } labels["people in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "people in ''Romance of the Three Kingdoms''", description = "=people in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms"}, } labels["perfumes"] = { type = "jenis,set", description = "default", parents = {"fesyen", "scents", "perfumery"}, } labels["periodicals"] = { type = "jenis,berkenaan", description = "default", parents = {"media massa", "kesusasteraan"}, } labels["personifications"] = { type = "nama", description = "default", parents = {"narratology"}, } labels["places in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "places in ''Romance of the Three Kingdoms''", description = "=places in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "China"}, } labels["plot devices"] = { type = "jenis", description = "default", parents = {"narratology", "cereka"}, } labels["puisi"] = { type = "berkenaan", description = "default", parents 
= {"kesusasteraan", "seni"}, } labels["polyamorous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.", parents = {"ships (fandom) by relationship type"}, } labels["Private Eye"] = { type = "berkenaan", displaytitle = "''Private Eye''", description = "=the ''{{w|Private Eye}}'' franchise", parents = {"cereka British"}, } labels["Reddit"] = { type = "berkenaan", wikidata = 2195701, description = "=the social news aggregation and discussion website {{w|Reddit}}", parents = {"media sosial", "World Wide Web"}, } labels["reference works"] = { type = "jenis", description = "default", parents = {"buku"}, } labels["Roman deities"] = { type = "nama", description = "default", parents = {"gods", "Roman mythology"}, } labels["Roman mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ancient Rome"}, } labels["romance fiction"] = { type = "berkenaan", description = "default", parents = {"genre kesusasteraan", "love"}, } labels["Romance of the Three Kingdoms"] = { type = "berkenaan", wikidata = 70806, displaytitle = "''Romance of the Three Kingdoms''", description = "=''{{w|Romance of the Three Kingdoms}}''", parents = {"cereka", "kesusasteraan", "China"}, } labels["RPF ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.", additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].", parents = {"ships (fandom) by relationship type"}, } labels["cereka sains"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "cereka"}, } labels["SCP Foundation"] = { type = "berkenaan", wikidata = 17439649, description = "{{{langname}}} terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.", parents = 
{"fantasy", "cereka", "horror", "cereka sains", "supernatural"}, } labels["arca"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Shahnameh"] = { type = "berkenaan", wikidata = 8279, displaytitle = "''Shahnameh''", description = "=''Shahnameh''", parents = {"cereka", "puisi", "kesusasteraan", "Persia"}, } labels["Shahnameh characters"] = { type = "nama", description = "=characters in the [[Shahnameh]]", parents = {"Shahnameh"}, } labels["shapeshifters"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore"}, } labels["Sherlock Holmes"] = { type = "berkenaan", wikidata = 2316684, description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them", parents = {"cereka British", "kesusasteraan"}, } labels["Sherlock (TV series)"] = { type = "berkenaan", wikidata = 192837, displaytitle = "''Sherlock'' (TV series)", description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)", parents = {"Sherlock Holmes", "televisyen"}, } labels["shipping (fandom)"] = { type = "berkenaan", description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)", parents = {"fandom", "romance fiction"}, } labels["ships (fandom)"] = { type = "kumpulan", description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} (i.e., a fictional relationship between two fictional characters or real people)", parents = {"shipping (fandom)"}, } labels["ships (fandom) by relationship type"] = { type = "kumpulan", description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g., [[heterosexual]], [[homosexual]], etc.)", parents = {"ships (fandom)"}, } labels["shippers (fandom)"] = { type = "jenis", description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)", 
parents = {"shipping (fandom)"}, } labels["Slavic deities"] = { type = "nama", description = "default", parents = {"gods", "Slavic mythology"}, } labels["Slavic mythology"] = { type = "berkenaan", description = "=the [[mythology]] of the [[Slav]]s", parents = {"mitologi"}, } labels["Smallville (TV series)"] = { type = "berkenaan", wikidata = 180228, displaytitle = "''Smallville'' (TV series)", description = "=the television series ''{{w|Smallville}}'' (2001–2011)", parents = {"cereka Amerika", "Superman", "televisyen"}, } labels["media sosial"] = { type = "berkenaan", wikidata = 202833, description = "default", parents = {"media massa", "Internet"}, } labels["South Korean idol fandom"] = { type = "berkenaan", wikidata = 39086123, description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]", parents = {"idol fandom", "South Korea"}, } labels["South Park"] = { type = "berkenaan", wikidata = 16538, displaytitle = "''South Park''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Star Trek"] = { type = "berkenaan", wikidata = 1092, displaytitle = "''Star Trek''", description = "=the ''{{w|Star Trek}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"}, } labels["Star Wars"] = { type = "berkenaan", wikidata = 462, displaytitle = "''Star Wars''", description = "=the ''{{w|Star Wars}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "Disney"}, } labels["Steven Universe"] = { type = "berkenaan", wikidata = 7615342, displaytitle = "''Steven Universe''", description = "=the animated television series ''{{w|Steven Universe}}''", parents = {"cereka Amerika", "animasi"}, } labels["stock characters"] = { type = "jenis", wikidata = 636497, description = "default", parents = {"watak cereka"}, } labels["cereka spekulatif"] = { type = "berkenaan", wikidata = 9326077, description = "default", parents = 
{"cereka", "genre"}, } labels["spider fighting"] = { type = "berkenaan", wikidata = 7577058, description = "={{w|spider fighting}}", parents = {"spiders", "human activity"}, } labels["subbudaya"] = { type = "berkenaan", description = "=[[subculture]]s", parents = {"budaya"}, } labels["adiwira"] = { type = "nama", wikidata = 188784, description = "=[[superhero]]es", parents = {"watak cereka"}, } labels["Superman"] = { type = "berkenaan", wikidata = 79015, description = "=the fictional [[superhero]] [[Superman]]", parents = {"DC Comics", "watak cereka"}, } labels["supernatural"] = { type = "berkenaan", wikidata = 80837, description = "default with the", parents = {"folklore"}, } labels["Supernatural (TV series)"] = { type = "berkenaan", wikidata = 130585, displaytitle = "''Supernatural'' (TV series)", description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)", parents = {"cereka Amerika", "televisyen"}, } labels["Tamil deities"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"gods", "Hindu deities", "Tamil mythology"}, } labels["Tamil mythology"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"mitologi", "Hindu mythology", "Tamil Nadu, India"}, } labels["televisyen"] = { type = "berkenaan", wikidata = 289, description = "default", parents = {"media massa", "penyiaran"}, } labels["The Handmaid's Tale"] = { type = "berkenaan", wikidata = 25207350, displaytitle = "''The Handmaid's Tale''", description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)", parents = {"cereka Kanada", "utopian and dystopian fiction", "kesusasteraan"}, } labels["The 
Hunger Games"] = { type = "berkenaan", wikidata = 11679, displaytitle = "''The Hunger Games''", description = "=''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations", parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction", "kesusasteraan"}, } labels["The Matrix"] = { type = "berkenaan", wikidata = 83495, displaytitle = "''The Matrix''", description = "=''{{w|The Matrix}}''", parents = {"cereka Amerika", "cereka sains", "utopian and dystopian fiction"}, } labels["The Simpsons"] = { type = "berkenaan", wikidata = 886, displaytitle = "''The Simpsons''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.", parents = {"cereka Amerika", "animasi", "Disney"}, } labels["The Walking Dead"] = { type = "berkenaan", wikidata = 232737, displaytitle = "''The Walking Dead''", description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted", parents = {"cereka Amerika", "televisyen", "utopian and dystopian fiction", "zombies"}, } labels["The Wizard of Oz"] = { type = "berkenaan", wikidata = 130295, displaytitle = "''The Wizard of Oz''", description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'', subsequent books or films derived from it, such as the ''[[w:The Wizard of Oz (1939 film)|1939 film]]''.", parents = {"cereka Amerika", "fantasy", "kesusasteraan"}, } labels["The X-Files"] = { type = "berkenaan", wikidata = 2744, displaytitle = "''The X-Files''", description = "=the ''{{w|The X-Files}}'' franchise", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["teater"] = { type = "berkenaan", description = "default", parents = {"seni", "hiburan"}, } labels["Thracian deities"] = { type = "nama", description = "default", parents = {"gods"}, } labels["TikTok"] = { type = "berkenaan", wikidata = 48938223, description = "=the video-sharing and 
social-networking service {{w|TikTok}}", parents = {"media sosial", "World Wide Web"}, } labels["Tupi mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Brazil"}, } labels["Twilight (novel series)"] = { type = "berkenaan", wikidata = 44523, displaytitle = "''Twilight'' (novel series)", description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise", parents = {"cereka Amerika", "fantasy", "kesusasteraan", "vampires"}, } labels["Twitter"] = { type = "berkenaan", wikidata = 918, description = "=the social networking and microblogging service {{w|Twitter}}", parents = {"media sosial", "World Wide Web"}, } labels["Tumblr"] = { type = "berkenaan", wikidata = 384060, description = "=the microblogging and social networking service {{w|Tumblr}}", parents = {"media sosial", "World Wide Web"}, } labels["utopian and dystopian fiction"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif"}, } labels["vampires"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"}, } labels["vampire lifestyle"] = { type = "berkenaan", description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)", parents = {"subbudaya", "vampires"}, } labels["Virtual YouTuber"] = { type = "berkenaan", wikidata = 55155641, description = "=[[virtual YouTuber]]s ([[VTuber]]s)", parents = {"YouTube", "hiburan"}, } labels["web design"] = { type = "berkenaan", description = "default", parents = {"reka bentuk", "World Wide Web"}, } labels["werewolves"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"}, } labels["worldbuilding"] = { type = "berkenaan", description = "default", parents = {"narratology", "cereka spekulatif"}, } labels["Xena: Warrior Princess"] = { type = "berkenaan", wikidata = 
38497, displaytitle = "''Xena: Warrior Princess''", description = "=the television series ''{{w|Xena: Warrior Princess}}'' (1995–2001)", parents = {"cereka Amerika", "fantasy", "televisyen"}, } labels["YouTube"] = { type = "berkenaan", wikidata = 866, description = "=the video-sharing website {{w|YouTube}}", parents = {"media sosial", "World Wide Web", "Google"}, } labels["YouTube Poop"] = { type = "berkenaan", wikidata = 16927904, description = "default", parents = {"YouTube", "Internet memes"}, } labels["zombies"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror"}, } return labels grubzadh94jb0mihfa023nbxsztq4rg 281344 281343 2026-04-22T01:25:21Z PeaceSeekers 3334 281344 Scribunto text/plain local labels = {} labels["budaya"] = { type = "berkenaan", description = "default", parents = {"masyarakat"}, } labels["A Christmas Carol"] = { type = "berkenaan", wikidata = 62879, displaytitle = "''A Christmas Carol''", description = "{{{langname}}} terms that are used in the context of the tale ''{{w|A Christmas Carol}}'', by {{w|Charles Dickens}}, such as the names of its characters or author.", parents = {"cereka British", "Charles Dickens"}, } labels["A Song of Ice and Fire"] = { type = "berkenaan", wikidata = 45875, displaytitle = "''A Song of Ice and Fire''", description = "{{{langname}}} terms used in context of the ''{{w|Song of Ice and Fire}}'' novel series and its television adaptation ''{{w|Game of Thrones}}''.", parents = {"cereka Amerika", "fantasi", "kesusasteraan"}, } labels["lakonan"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["alternate history"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "history"}, } labels["cereka Amerika"] = { type = "berkenaan", description = "=works of American fiction", parents = {"cereka", "Amerika Syarikat"}, } labels["animasi"] = { type = "berkenaan", description = 
"default", parents = {"media massa"}, } labels["Arabic fiction"] = { type = "berkenaan", description = "=works of [[fiction]] of [[Arabic]] origin", parents = {"cereka"}, } labels["dewa Arab"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Arab"}, } labels["mitologi Arab"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["mitologi Armenia"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Armenia"}, } labels["seni"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["Arthurian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "United Kingdom"}, } labels["karya seni"] = { type = "nama,jenis", description = "default", parents = {"seni"}, } labels["astrobiology"] = { type = "berkenaan", description = "default", parents = {"astronomy", "biology", "geology"}, } labels["astrologi"] = { type = "berkenaan", description = "default", parents = {"penilikan", "pseudosains", "obsolete scientific theories"}, } labels["Asturian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Asturias, Spain"}, } labels["Avatar: The Last Airbender"] = { type = "berkenaan", wikidata = 11572, displaytitle = "''Avatar: The Last Airbender''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|Avatar: The Last Airbender}}'' and its spin-off ''{{w|The Legend of Korra}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Australian Aboriginal mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Australia"}, } labels["ballet"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["Barbie"] = { type = "berkenaan", wikidata = 167447, description = "=the {{w|Barbie}} fashion doll produced by Mattel", parents = {"toys"}, } labels["Batman"] = { type = "berkenaan", wikidata = 2695156, description = "=the fictional 
[[superhero]] [[Batman]]", parents = {"DC Comics", "watak cereka"}, } labels["bibliography"] = { type = "berkenaan", description = "default", parents = {"buku"}, } labels["Bilibili"] = { type = "berkenaan", wikidata = 3077586, description = "=the video-sharing website {{w|bilibili}}", parents = {"media sosial", "World Wide Web"}, } labels["blogging"] = { type = "berkenaan", description = "default", parents = {"media sosial"}, } labels["Bluesky"] = { type = "berkenaan", wikidata = 78194383, description = "=the microblogging and social networking service {{w|Bluesky}}", parents = {"media sosial", "World Wide Web"}, } labels["body art"] = { type = "berkenaan", description = "default", parents = {"seni", "fesyen"}, } labels["Bollywood"] = { type = "berkenaan", wikidata = 93196, description = "default", parents = {"filem", "India"}, } labels["buku"] = { type = "berkenaan", description = "default", parents = {"media massa", "kesusasteraan"}, } labels["books of the Poetic Edda"] = { type = "nama", displaytitle = "books of the ''Poetic Edda''", description = "=[[book]]s of the ''[[Poetic Edda]]''", parents = {"mitologi Norse"}, } labels["Brazilian folklore"] = { type = "berkenaan", description = "default", parents = {"folklore", "Brazil"}, } labels["cereka British"] = { type = "berkenaan", description = "=works of [[fiction]] of [[British]] origin", parents = {"cereka", "United Kingdom"}, } labels["Buffy the Vampire Slayer"] = { type = "berkenaan", wikidata = 183513, displaytitle = "''Buffy the Vampire Slayer''", description = "=the television series ''{{w|Buffy the Vampire Slayer}}'' (1997–2003)", parents = {"cereka Amerika", "televisyen", "vampires"}, } labels["cereka Kanada"] = { type = "berkenaan", description = "=works of [[fiction]] of [[Canada|Canadian]] origin", parents = {"cereka", "Kanada"}, } labels["seni khat"] = { type = "berkenaan", description = "default", parents = {"seni", "penulisan"}, } labels["cartomancy"] = { type = "berkenaan", description = 
"default", parents = {"penilikan"}, } labels["castells"] = { type = "berkenaan", description = "=[[castell]]s, the Catalan tradition of human tower building", additional = "See {{w|castells}}.", parents = {"budaya", "sports"}, } labels["celestial inhabitants"] = { type = "jenis", description = "=inhabitants of known [[celestial body|celestial bodies]]", parents = {"watak cereka", "cereka sains", "demonyms"}, } labels["Celtic mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Ireland", "Wales"}, } labels["characters from folklore"] = { type = "berkenaan", description = "default", parents = {"watak cereka", "folklore"}, } labels["cheerleading"] = { type = "berkenaan", description = "default", parents = {"tarian", "gymnastics", "sports"}, } labels["Church of England"] = { type = "berkenaan", description = "default with the", parents = {"Anglicanism", "England"}, } labels["cereka China"] = { type = "berkenaan", description = "=works of [[fiction]], including [[anime]]s, [[manhua]]s, [[novel]]s, [[series]] and [[video game]]s, whose origin is of [[China]]", parents = {"cereka", "China"}, } labels["mitologi Cina"] = { type = "berkenaan", description = "default", parents = {"mitologi", "China"}, } labels["sinematografi"] = { type = "berkenaan", description = "default", parents = {"filem"}, } labels["sarkas"] = { type = "berkenaan", description = "default no singularize", parents = {"hiburan", "teater"}, } labels["komedi"] = { type = "berkenaan", description = "default", parents = {"drama"}, } labels["komik"] = { type = "berkenaan", description = "default no singularize", parents = {"kesusasteraan"}, } -- Confucianism: see [[Module:category tree/topic/Philosophy]] labels["conlanging"] = { type = "berkenaan", description = "=[[conlanging]] (the making of [[constructed language]]s)", parents = {"language", "budaya"}, } labels["teori konspirasi"] = { type = "berkenaan,set", description = "=[[conspiracy theory|conspiracy theories]] and 
theorists", parents = {"budaya"}, } labels["constellations in the zodiac"] = { type = "nama", description = "=the ring of [[constellations]] that line the [[ecliptic]], the apparent path of the [[Sun]] across the [[celestial sphere]] over the course of a year", parents = {"constellations", "astrologi"}, } labels["kosmetik"] = { type = "berkenaan", description = "default", parents = {"toiletries", "fesyen"}, } labels["cosplay"] = { type = "berkenaan", description = "default", parents = {"fandom"}, } labels["tarian"] = { type = "berkenaan", description = "default", parents = {"seni", "rekreasi"}, } labels["dances"] = { type = "jenis", description = "default", parents = {"tarian"}, } labels["DC Comics"] = { type = "berkenaan", wikidata = 2924461, description = "={{w|DC Comics}}", parents = {"cereka Amerika", "komik"}, } labels["demoscene"] = { type = "berkenaan", description = "default", parents = {"budaya", "computing"}, } labels["reka bentuk"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["dictionaries"] = { type = "jenis,nama", description = "default", parents = {"reference works", "lexicography"}, } labels["Disney"] = { type = "berkenaan", wikidata = 7414, description = "=the properties of {{w|The Walt Disney Company}}", additional = "This includes properties acquired jointly with or from other companies.", parents = {"cereka Amerika", "komik", "filem", "televisyen"}, } labels["penilikan"] = { type = "jenis", description = "default", parents = {"okultisme"}, } labels["Doctor Who"] = { type = "berkenaan", wikidata = 34316, displaytitle = "''Doctor Who''", description = "=the ''{{w|Doctor Who}}'' franchise", parents = {"cereka British", "cereka sains", "televisyen"}, } labels["Dracula"] = { type = "berkenaan", wikidata = 41542, displaytitle = "''Dracula''", description = "=the 1897 gothic horror novel ''{{w|Dracula}}'' by {{w|Bram Stoker}}, and its cultural derivations.", parents = {"fantasi", "kesusasteraan", "vampires"}, } 
labels["naga"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures"}, } labels["drama"] = { type = "berkenaan", description = "default", parents = {"teater"}, } labels["dewa Mesir"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Mesir"}, } labels["mitologi Mesir"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Mesir Purba"}, } labels["hiburan"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["erotic literature"] = { type = "berkenaan", description = "default", parents = {"cereka", "genre kesusasteraan", "sex"}, } labels["mitologi Etruria"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Etruria"}, } labels["European folklore"] = { type = "berkenaan", description = "default", parents = {"folklore", "Europe"}, } labels["fairy tale"] = { type = "berkenaan", description = "=[[fairy tale]]s", parents = {"cereka"}, } labels["fairy tale characters"] = { type = "nama", description = "=[[fairy tale]] [[character]]s", parents = {"watak cereka", "fairy tale"}, } labels["fairy tales"] = { type = "nama", description = "default", parents = {"fairy tale"}, } labels["fan fiction"] = { type = "berkenaan", description = "default", parents = {"cereka", "fandom", "kesusasteraan"}, } labels["fandom"] = { type = "berkenaan", description = "{{{langname}}} terms arising from [[fandom]] culture.", parents = {"budaya"}, } labels["fantasi"] = { type = "berkenaan", description = "=the [[genre]] of [[fantasy]]", parents = {"cereka", "cereka spekulatif"}, } labels["fesyen"] = { type = "berkenaan", description = "default", parents = {"budaya", "pakaian"}, } labels["faster-than-light travel"] = { type = "berkenaan", description = "default", parents = {"travel", "cereka sains", "astrofizik", "kerelatifan"}, } labels["Fediverse"] = { type = "berkenaan", wikidata = 30325419, description = "=the decentralised social networking services collectively known 
as the {{w|Fediverse}}", parents = {"media sosial", "World Wide Web"}, } labels["cereka"] = { type = "berkenaan", description = "=specific works of [[fiction]]", parents = {"karya seni"}, } labels["fictional abilities"] = { type = "berkenaan,jenis", description = "=fictional [[ability|abilities]] and [[superpower]]s", parents = {"cereka", "cereka spekulatif"}, } labels["watak cereka"] = { type = "nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional locations"] = { type = "nama,jenis", description = "default", parents = {"cereka"}, } labels["fictional planets"] = { type = "nama", description = "default", parents = {"fictional locations"}, } labels["fictional universes"] = { type = "nama,jenis", description = "default", parents = {"fictional locations"}, } labels["filem"] = { type = "berkenaan", description = "default", parents = {"media massa", "hiburan"}, } labels["F/F ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two female characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["film genres"] = { type = "jenis,berkenaan", description = "default", parents = {"filem", "genre"}, } labels["industri filem"] = { type = "nama", description = "default", parents = {"filem"}, } labels["Finnic mythology"] = { type = "berkenaan", description = "=the [[mythology]] of the [[Finnic]] peoples", additional = "This includes (but is not limited to) [[Finnish]] and [[Estonian]] mythology.", parents = {"mitologi", "Finland", "Estonia"}, } labels["flamenco"] = { type = "berkenaan", description = "default", parents = {"tarian"}, } labels["folklore"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["furry fandom"] = { type = "berkenaan", description = "default", parents = {"fandom", "subbudaya"}, } labels["dewa Jermanik"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Jermanik"}, 
} labels["mitologi Jermanik"] = { type = "nama", description = "=the [[mythology]] of the [[Germanic]] peoples", parents = {"mitologi"}, } labels["genre"] = { type = "jenis,berkenaan", description = "=[[genre]]s and genre classifications", parents = {"hiburan"}, wpcat = true, } labels["hantu"] = { type = "berkenaan", description = "default", parents = {"afterlife", "supernatural", "characters from folklore", "death", "fantasi", "horror", "mythological creatures", "okultisme"}, } labels["Glee"] = { type = "berkenaan", wikidata = 152178, description = "=siri televisyen, ''[[w:Glee (siri TV)|Glee]]'' (2009–2015)", parents = {"cereka Amerika", "televisyen"}, } labels["reka bentuk grafik"] = { type = "berkenaan", description = "default", parents = {"reka bentuk"}, } labels["dewa Yunani"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Yunani"}, } labels["mitologi Yunani"] = { type = "berkenaan", description = "=[[mitologi]] masyarakat [[Yunani Purba]]", parents = {"mitologi", "Yunani Purba"}, } labels["Gulliver's Travels"] = { type = "berkenaan", wikidata = 181488, displaytitle = "''Gulliver's Travels''", description = "=''[[w:Gulliver's Travels|Gulliver’s Travels]]''", parents = {"kesusasteraan"}, } labels["Harry Potter"] = { type = "berkenaan", wikidata = 8337, displaytitle = "''Harry Potter''", description = "{{{langname}}} terms used in context of the ''{{w|Harry Potter}}'' franchise.", parents = {"cereka British", "fantasi", "kesusasteraan", "watak cereka"}, } labels["Hawaiian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Hawaii, USA"}, } labels["F/M ships"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between female and male characters.", parents = {"ships (fandom) by relationship type"}, } labels["dewa Hindu"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Hindu"}, } labels["mitologi Hindu"] = { type 
= "berkenaan", description = "default", parents = {"mitologi", "Hinduisme"}, } labels["Homestuck"] = { type = "berkenaan", displaytitle ="''Homestuck''", wikidata = 2618713, description = "=the ''{{w|Homestuck}}'' multimedia fiction series", parents = {"cereka Amerika", "komik"}, } labels["Hopi culture"] = { type = "berkenaan", description = "default", parents = {"budaya", "United States"}, } labels["horror"] = { type = "berkenaan", description = "=the [[horror]] [[genre]]", parents = {"kesusasteraan", "cereka spekulatif"}, } labels["humanities"] = { type = "berkenaan", description = "default no singularize", parents = {"budaya"}, commonscat = true; } labels["incestuous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving fictional incestuous relationships.", parents = {"incest", "ships (fandom) by relationship type"}, } labels["idol fandom"] = { type = "berkenaan", description = "default", parents = {"fandom"}, } labels["Instagram"] = { type = "berkenaan", wikidata = 209330, description = "=the photo sharing and social networking service [[Instagram]]", parents = {"photography", "media sosial", "World Wide Web"}, } labels["Iranian mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Iran"}, } labels["Irish mythology"] = { type = "berkenaan", description = "default", parents = {"Celtic mythology", "Ireland"}, } labels["James Bond"] = { type = "berkenaan", wikidata = 844, displaytitle = "''James Bond''", description = "=the ''[[James Bond]]'' franchise", parents = {"cereka British", "filem"}, } labels["dewa Jepun"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Jepun"}, } labels["cereka Jepun"] = { type = "berkenaan", description = "=bahan-bahan [[cereka]] Jepun, termasuk [[anime]], [[manga]], [[novel]], [[siri]] dan [[permainan video]]", parents = {"cereka", "Japan"}, } labels["mitologi Jepun"] = { type = 
"berkenaan", description = "default", parents = {"mitologi", "Jepun"}, } labels["job titles in Romance of the Three Kingdoms"] = { type = "jenis", displaytitle = "job titles in ''Romance of the Three Kingdoms''", description = "=job titles in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "titles"}, } labels["kewartawanan"] = { type = "berkenaan", description = "default", parents = {"penulisan"}, } labels["Kachinas"] = { type = "nama", description = "default", parents = {"budaya Hopi"}, } labels["Komi mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Komi, Russia"}, } labels["cereka Korea"] = { type = "berkenaan", description = "=works of [[fiction]], including [[anime]]s, [[manhwa]]s, [[novel]]s, [[series]] and [[video game]]s, whose origin is of [[Korea]]", parents = {"cereka", "Korea"}, } labels["mitologi Korea"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Korea"}, } labels["genre kesusasteraan"] = { type = "jenis", description = "{{{langname}}} terms for [[literary]] [[genre]]s.", parents = {"kesusasteraan", "cereka", "genre"}, } labels["kesusasteraan"] = { type = "berkenaan", description = "default", parents = {"budaya", "hiburan", "penulisan"}, } labels["Lost (TV series)"] = { type = "berkenaan", wikidata = 23567, displaytitle = "''Lost'' (TV series)", description = "=the television series ''{{w|Lost (2004 TV series)|Lost}}'' (2004–2010)", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["Lovecraftian horror"] = { type = "berkenaan", wikidata = 2448865, description = "=the [[literature|literary]] works of {{w|H. P. 
Lovecraft}}", parents = {"horror", "kesusasteraan", "cereka", "supernatural"}, } labels["magic"] = { type = "berkenaan", description = "default", parents = {"supernatural"}, } labels["magic words"] = { type = "set", wikidata = 1135882, description = "{{{langname}}} magic words; terms that serve the purpose of effectively or apparently triggering a [[magical]] or [[illusionist]] event.", parents = {"plot devices", "cereka"}, } labels["genre manga"] = { type = "jenis", description = "Istilah [[genre]] [[manga]] dalam bahasa {{{langname}}}.", parents = {"genre kesusasteraan"}, } labels["perkahwinan"] = { type = "berkenaan", description = "default", parents = {"budaya", "keluarga"}, } labels["Marvel Comics"] = { type = "berkenaan", wikidata = 173496, description = "={{w|Marvel Comics}}", parents = {"cereka Amerika", "komik"}, } labels["media massa"] = { type = "berkenaan", description = "default", parents = {"media", "budaya"}, } labels["dewa Meitei"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Meitei"}, } labels["mitologi Meitei"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Manipur, India"}, } labels["merpeople"] = { type = "berkenaan", description = "default", parents = {"mythological creatures"}, } labels["dewa Mesopotamia"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Mesopotamia"}, } labels["mitologi Mesopotamia"] = { type = "berkenaan", description = "=the [[mythology]] of ancient [[Mesopotamia]]", parents = {"mitologi", "Timur Dekat Purba"}, } labels["M/M ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between two male characters.", parents = {"LGBTQ", "ships (fandom) by relationship type"}, } labels["seni moden"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Mongolian tribes"] = { type = "nama", description = "{{{langname}}} names for Mongolian tribes.", 
parents = {"ethnonyms", "Mongolia"}, } labels["misai"] = { type = "jenis", description = "default", parents = {"muka", "fesyen", "rambut"}, } labels["My Hero Academia"] = { type = "berkenaan", wikidata = 18047903, displaytitle ="''My Hero Academia''", description = "=the ''{{w|My Hero Academia}}'' series", parents = {"cereka Jepun", "animasi", "komik"}, } labels["My Little Pony"] = { type = "berkenaan", wikidata = 1071312, displaytitle = "''My Little Pony''", description = "=the ''{{w|My Little Pony}}'' franchise (which includes toys and animated series) and its fandom", parents = {"cereka Amerika", "animasi", "toys"}, } labels["mythological creatures"] = { type = "jenis", description = "default", parents = {"mitologi", "fantasi"}, } labels["mythological figures"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological locations"] = { type = "nama", description = "default", parents = {"mitologi"}, } labels["mythological plants"] = { type = "jenis,nama", description = "default", parents = {"mitologi", "plants"}, } labels["mitologi"] = { type = "berkenaan", description = "default", parents = {"budaya"}, } labels["narratology"] = { type = "berkenaan", description = "default", parents = {"kesusasteraan", "drama"}, } labels["Navajo mythology"] = { type = "berkenaan", description = "default", parents = {"mitologi"}, } labels["akhbar"] = { type = "nama", description = "default", parents = {"terbitan berkala"}, } labels["Niconico"] = { type = "berkenaan", wikidata = 697233, description = "=the video-sharing website {{w|Niconico}}", parents = {"media sosial", "World Wide Web"}, } labels["dewa Norse"] = { type = "nama", description = "default", parents = {"dewa", "dewa Jermanik", "mitologi Norse"}, } labels["mitologi Norse"] = { type = "berkenaan", description = "default", parents = {"mitologi", "mitologi Jermanik"}, } labels["okultisme"] = { type = "berkenaan", description = "default with the", parents = {"supernatural", "paranormal"}, } 
labels["omegaverse"] = { type = "berkenaan", wikidata = 96397374, description = "=the [[omegaverse]] genre", parents = {"erotic literature", "fan fiction", "cereka spekulatif"}, } labels["Omori"] = { type = "berkenaan", wikidata = 105618699, displaytitle ="''Omori''", description = "=the ''{{w|Omori (video game)|Omori}}'' series", parents = {"cereka Amerika", "permainan video"}, } labels["Once Upon a Time"] = { type = "berkenaan", wikidata = 23673, displaytitle = "''Once Upon a Time''", description = "=the television series ''{{w|Once Upon a Time (TV series)|Once Upon a Time}}'' (2011–2018)", parents = {"cereka Amerika", "Disney", "televisyen"}, } labels["painting"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["palmistry"] = { type = "berkenaan", description = "default", parents = {"penilikan"}, } labels["parti"] = { type = "jenis,berkenaan", description = "default", parents = {"hiburan", "budaya"}, } labels["people in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "people in ''Romance of the Three Kingdoms''", description = "=people in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms"}, } labels["minyak wangi"] = { type = "jenis,set", description = "default", parents = {"fesyen", "scents", "perfumery"}, } labels["terbitan berkala"] = { type = "jenis,berkenaan", description = "default", parents = {"media massa", "kesusasteraan"}, } labels["personifications"] = { type = "nama", description = "default", parents = {"narratology"}, } labels["places in Romance of the Three Kingdoms"] = { type = "nama", displaytitle = "places in ''Romance of the Three Kingdoms''", description = "=places in ''{{w|Romance of the Three Kingdoms}}''", parents = {"Romance of the Three Kingdoms", "China"}, } labels["plot devices"] = { type = "jenis", description = "default", parents = {"narratology", "cereka"}, } labels["puisi"] = { type = "berkenaan", description = "default", parents = {"kesusasteraan", 
"seni"}, } labels["polyamorous ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} between three or more characters.", parents = {"ships (fandom) by relationship type"}, } labels["Private Eye"] = { type = "berkenaan", displaytitle = "''Private Eye''", description = "=the ''{{w|Private Eye}}'' franchise", parents = {"cereka British"}, } labels["Reddit"] = { type = "berkenaan", wikidata = 2195701, description = "=the social news aggregation and discussion website {{w|Reddit}}", parents = {"media sosial", "World Wide Web"}, } labels["reference works"] = { type = "jenis", description = "default", parents = {"buku"}, } labels["dewa Rom"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Rom"}, } labels["mitologi Rom"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Rom Purba"}, } labels["romance fiction"] = { type = "berkenaan", description = "default", parents = {"genre kesusasteraan", "cinta"}, } labels["Hikayat Tiga Kerajaan"] = { type = "berkenaan", wikidata = 70806, displaytitle = "''Hikayat Tiga Kerajaan''", description = "=''{{w|Hikayat Tiga Kerajaan}}''", parents = {"cereka", "kesusasteraan", "China"}, } labels["RPF ships (fandom)"] = { type = "nama", description = "{{{langname}}} names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} involving real people in a fictional relationship.", additional = "For actual relationships between real people, see [[:Category:Couple nicknames]].", parents = {"ships (fandom) by relationship type"}, } labels["cereka sains"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif", "cereka"}, } labels["SCP Foundation"] = { type = "berkenaan", wikidata = 17439649, description = "English terms related to the SCP Wiki collaborative writing website and its setting of the {{w|SCP Foundation}}.", parents = {"fantasi", "cereka", "horror", "cereka sains", 
"supernatural"}, } labels["arca"] = { type = "berkenaan", description = "default", parents = {"seni"}, } labels["Shahnameh"] = { type = "berkenaan", wikidata = 8279, displaytitle = "''Shahnameh''", description = "=''Shahnameh''", parents = {"cereka", "puisi", "kesusasteraan", "Parsi"}, } labels["Shahnameh characters"] = { type = "nama", description = "=characters in the [[Shahnameh]]", parents = {"Shahnameh"}, } labels["shapeshifters"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore"}, } labels["Sherlock Holmes"] = { type = "berkenaan", wikidata = 2316684, description = "=the [[Sherlock Holmes]] stories by {{w|Arthur Conan Doyle}} and adaptations of them", parents = {"cereka British", "kesusasteraan"}, } labels["Sherlock (TV series)"] = { type = "berkenaan", wikidata = 192837, displaytitle = "''Sherlock'' (TV series)", description = "=the television series ''[[w:Sherlock (TV series)|Sherlock]]'' (2010–2017)", parents = {"Sherlock Holmes", "televisyen"}, } labels["shipping (fandom)"] = { type = "berkenaan", description = "={{l|en|ship|shipping|id=fandomverb}} (i.e., in [[fandom]], supporting a fictional romantic relationship between two characters)", parents = {"fandom", "romance fiction"}, } labels["ships (fandom)"] = { type = "kumpulan", description = "=names used in [[fandom]] for specific {{l|en|ship|ships|id=fandomnoun}} i.e., a fictional relationship between two fictional characters or real people)", parents = {"shipping (fandom)"}, } labels["ships (fandom) by relationship type"] = { type = "kumpulan", description = "={{l|en|ship|ship|id=fandomnoun}} names organized by the type of relationship (e.g, [[heterosexual]], [[homosexual]], etc.)", parents = {"ships (fandom)"}, } labels["shippers (fandom)"] = { type = "jenis", description = "=[[shipper]]s (i.e., people who support a romantic or sexual relationship between characters or real people)", parents = {"shipping (fandom)"}, } labels["dewa 
Slavik"] = { type = "nama", description = "default", parents = {"dewa", "mitologi Slavik"}, } labels["mitologi Slavik"] = { type = "berkenaan", description = "=[[mitologi]] masyarakat [[Slav]]", parents = {"mitologi"}, } labels["Smallville (TV series)"] = { type = "berkenaan", wikidata = 180228, displaytitle = "''Smallville'' (TV series)", description = "=the television series ''{{w|Smallville}}'' (2001–2011)", parents = {"cereka Amerika", "Superman", "televisyen"}, } labels["media sosial"] = { type = "berkenaan", wikidata = 202833, description = "default", parents = {"media massa", "Internet"}, } labels["South Korean idol fandom"] = { type = "berkenaan", wikidata = 39086123, description = "=[[South Korea|South Korean]] [[idol]] [[fandom]]", parents = {"idol fandom", "South Korea"}, } labels["South Park"] = { type = "berkenaan", wikidata = 16538, displaytitle = "''South Park''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|South Park}}''.", parents = {"cereka Amerika", "animasi"}, } labels["Star Trek"] = { type = "berkenaan", wikidata = 1092, displaytitle = "''Star Trek''", description = "=the ''{{w|Star Trek}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "televisyen"}, } labels["Star Wars"] = { type = "berkenaan", wikidata = 462, displaytitle = "''Star Wars''", description = "=the ''{{w|Star Wars}}'' franchise", parents = {"cereka Amerika", "filem", "cereka sains", "Disney"}, } labels["Steven Universe"] = { type = "berkenaan", wikidata = 7615342, displaytitle = "''Steven Universe''", description = "=the animated television series ''{{w|Steven Universe}}''", parents = {"cereka Amerika", "animasi"}, } labels["stock characters"] = { type = "jenis", wikidata = 636497, description = "default", parents = {"watak cereka"}, } labels["cereka spekulatif"] = { type = "berkenaan", wikidata = 9326077, description = "default", parents = {"cereka", "genre"}, } labels["spider fighting"] = { type = 
"berkenaan", wikidata = 7577058, description = "={{w|spider fighting}}", parents = {"spiders", "human activity"}, } labels["subbudaya"] = { type = "berkenaan", description = "=[[subculture]]s", parents = {"budaya"}, } labels["adiwira"] = { type = "nama", wikidata = 188784, description = "=[[superhero]]es", parents = {"watak cereka"}, } labels["Superman"] = { type = "berkenaan", wikidata = 79015, description = "=the fictional [[superhero]] [[Superman]]", parents = {"DC Comics", "watak cereka"}, } labels["supernatural"] = { type = "berkenaan", wikidata = 80837, description = "default with the", parents = {"folklore"}, } labels["Supernatural (TV series)"] = { type = "berkenaan", wikidata = 130585, displaytitle = "''Supernatural'' (TV series)", description = "=the television series ''[[w:Supernatural (American TV series)|Supernatural]]'' (2005–2020)", parents = {"cereka Amerika", "televisyen"}, } labels["mitologi Tamil"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"dewa", "dewa Hindu", "mitologi Tamil"}, } labels["mitologi Tamil"] = { type = "nama", description = "default", additional = "See [[w:Dravidian folk religion|Dravidian religion]] or [[w:Religion in ancient Tamilakam|Tamil region]] for more.", parents = {"mitologi", "mitologi Hindu", "Tamil Nadu, India"}, } labels["televisyen"] = { type = "berkenaan", wikidata = 289, description = "default", parents = {"media massa", "penyiaran"}, } labels["The Handmaid's Tale"] = { type = "berkenaan", wikidata = 25207350, displaytitle = "''The Handmaid's Tale''", description = "=the 1985 novel ''{{w|The Handmaid's Tale}}'' by {{w|Margaret Atwood}} and its [[w:The Handmaid's Tale (TV series)|television adaptation]] (2017–)", parents = {"Canadian fiction", "cereka utopia dan distopia", "kesusasteraan"}, } labels["The Hunger Games"] = { type = "berkenaan", wikidata = 11679, displaytitle = 
"''The Hunger Games''", description = "=''{{w|The Hunger Games}}'' novel series by {{w|Suzanne Collins}} and its film adaptations", parents = {"cereka Amerika", "cereka sains", "cereka utopia dan distopia", "kesusasteraan"}, } labels["The Matrix"] = { type = "berkenaan", wikidata = 83495, displaytitle = "''The Matrix''", description = "=''{{w|The Matrix}}''", parents = {"cereka Amerika", "cereka sains", "cereka utopia dan distopia"}, } labels["The Simpsons"] = { type = "berkenaan", wikidata = 886, displaytitle = "''The Simpsons''", description = "{{{langname}}} terms derived from and/or related to the animated television series ''{{w|The Simpsons}}''.", parents = {"cereka Amerika", "animasi", "Disney"}, } labels["The Walking Dead"] = { type = "berkenaan", wikidata = 232737, displaytitle = "''The Walking Dead''", description = "=the television series ''[[w:The Walking Dead (TV series)|The Walking Dead]]'' (2010–2022) and the comic series from which it was adapted", parents = {"cereka Amerika", "televisyen", "cereka utopia dan distopia", "zombies"}, } labels["The Wizard of Oz"] = { type = "berkenaan", wikidata = 130295, displaytitle = "''The Wizard of Oz''", description = "=the fantasy novel ''{{w|The Wonderful Wizard of Oz}}'', subsequent books or films derived from it, such as the ''[[w:The Wizard of Oz (1939 film)|1939 film]]''.", parents = {"cereka Amerika", "fantasi", "kesusasteraan"}, } labels["The X-Files"] = { type = "berkenaan", wikidata = 2744, displaytitle = "''The X-Files''", description = "=the ''{{w|The X-Files}}'' franchise", parents = {"cereka Amerika", "cereka sains", "televisyen"}, } labels["teater"] = { type = "berkenaan", description = "default", parents = {"seni", "hiburan"}, } labels["Thracian deities"] = { type = "nama", description = "default", parents = {"dewa"}, } labels["TikTok"] = { type = "berkenaan", wikidata = 48938223, description = "=the video-sharing and social-networking service {{w|TikTok}}", parents = {"media sosial", "World Wide 
Web"}, } labels["mitologi Tupi"] = { type = "berkenaan", description = "default", parents = {"mitologi", "Brazil"}, } labels["Twilight (novel series)"] = { type = "berkenaan", wikidata = 44523, displaytitle = "''Twilight'' (novel series)", description = "=the ''[[w:Twilight (series)|Twilight]]'' franchise", parents = {"cereka Amerika", "fantasi", "kesusasteraan", "vampires"}, } labels["Twitter"] = { type = "berkenaan", wikidata = 918, description = "=the social networking and microblogging service {{w|Twitter}}", parents = {"media sosial", "World Wide Web"}, } labels["Tumblr"] = { type = "berkenaan", wikidata = 384060, description = "=the microblogging and social networking service {{w|Tumblr}}", parents = {"media sosial", "World Wide Web"}, } labels["cereka utopia dan distopia"] = { type = "berkenaan", description = "default", parents = {"cereka spekulatif"}, } labels["vampires"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror", "blood"}, } labels["vampire lifestyle"] = { type = "berkenaan", description = "={{w|vampire lifestyle|the vampire lifestyle}} (i.e., a subculture which roleplays the stereotypical habits of vampires)", parents = {"subbudaya", "vampires"}, } labels["Virtual YouTuber"] = { type = "berkenaan", wikidata = 55155641, description = "=[[virtual YouTuber]]s ([[VTuber]]s)", parents = {"YouTube", "hiburan"}, } labels["web design"] = { type = "berkenaan", description = "default", parents = {"reka bentuk", "World Wide Web"}, } labels["werewolves"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "shapeshifters", "horror"}, } labels["worldbuilding"] = { type = "berkenaan", description = "default", parents = {"narratology", "cereka spekulatif"}, } labels["Xena: Warrior Princess"] = { type = "berkenaan", wikidata = 38497, displaytitle = "''Xena: Warrior Princess''", description = "=the television 
series ''{{w|Xena: Warrior Princess}}'' (1995–2001)", parents = {"cereka Amerika", "fantasi", "televisyen"}, } labels["YouTube"] = { type = "berkenaan", wikidata = 866, description = "=the video-sharing website {{w|YouTube}}", parents = {"media sosial", "World Wide Web", "Google"}, } labels["YouTube Poop"] = { type = "berkenaan", wikidata = 16927904, description = "default", parents = {"YouTube", "Internet memes"}, } labels["zombi"] = { type = "berkenaan,jenis", description = "default", parents = {"mythological creatures", "characters from folklore", "death", "horror"}, } return labels tojb5f765nr8fk8gwqplc7snir4z6on Modul:category tree/topic/Animals 828 11530 281325 279238 2026-04-22T00:37:22Z PeaceSeekers 3334 281325 Scribunto text/plain local labels = {} labels["haiwan"] = { type = "set", description = "default", parents = {"makhluk"}, commonscat = "Animalia", wpcat = true, } labels["ikan akanturoid"] = { type = "set", description = "=[[surgeonfish]], [[light-horseman]], [[louvar]]s, [[scat]]s, [[rabbitfish]], [[Moorish idol]]s and other fish in the [[perciform]] [[suborder]] [[Acanthuroidei]]", parents = {"ikan"}, } labels["accentors"] = { type = "set", description = "=birds in the [[family]] [[Prunellidae]]", parents = {"burung tenggek"}, } labels["accipiters"] = { type = "set", description = "=[[besra]]s, [[Cooper's hawk]]s, [[goshawk]]s, [[sharp-shinned hawk]]s, [[shikra]]s, [[sparrowhawk]]s, and other [[hawk]]s in the [[genus]] ''[[Accipiter]]''", parents = {"burung pemangsa"}, } labels["ikan asipenseriform"] = { type = "set", description = "=[[paddlefish]], [[sturgeon]]s and other fish in the [[order]] [[Acipenseriformes]]", parents = {"ikan"}, } labels["adephagan beetles"] = { type = "set", description = "=[[diving beetle]]s, [[ground beetle]]s (including [[bombardier beetle]]s and [[tiger beetle]]s), [[whirligig beetle]]s and other [[beetle]]s in the [[suborder]] [[Adephaga]]", parents = {"beetles"}, } labels["African insectivores"] = { type = "set", 
description = "=[[aardvark]]s, [[elephant shrew]]s, [[golden mole]]s, [[otter shrew]]s, [[tenrec]]s, and other [[mammal]]s in the [[clade]] [[Afroinsectiphilia]]", parents = {"mamalia"}, } labels["agamid lizards"] = { type = "set", description = "=[[agama]]s, [[bearded dragon]]s, [[flying dragon]]s, [[frilled lizard]]s, [[moloch]]s, [[spiny-tailed lizard]]s, [[stellion]]s and other [[lizard]]s in the [[family]] [[Agamidae]]", parents = {"lizards"}, } labels["alcelaphine antelopes"] = { type = "set", description = "=[[blesbuck]]s, [[bontebok]]s, [[bubal]]s, [[gnu]]s or [[wildebeest]], [[hartebeest]]s, [[hirola]], [[sassaby]]s, [[topi]]s, [[tetel]]s, and other [[antelopes]] in the [[subfamily]] [[Alcelaphinae]]", parents = {"antelopes"}, } labels["ammonites"] = { type = "set", description = "=[[extinct]] [[cephalopod]]s in the [[subclass]] [[Ammonoidea]]", parents = {"sefalopod"}, } labels["amfibia"] = { type = "set", description = "default", parents = {"vertebrat"}, commonscat = "Amphibia", wpcat = true, } labels["amphipods"] = { type = "set", description = "=[[beach flea]]s, [[lawn shrimp]], [[scud]]s, [[side swimmer]]s, [[skeleton shrimp]], [[whale louse|whale lice]], and other [[crustacean]]s in the [[order]] [[Amphipoda]]", parents = {"krustasea"}, } labels["anatid"] = { type = "set", description = "=[[anatid]]s: ([[duck]]s, [[goose|geese]] and [[swan]]s)", parents = {"burung air tawar"}, } labels["annelids"] = { type = "set", description = "=[[earthworm]]s, [[leech]]es, [[ragworm]]s and many other [[segment]]ed [[worm]]s in the [[filum]] [[Annelida]]", parents = {"cacing"}, } labels["anglerfish"] = { type = "set", description = "=fish in the [[order]] [[Lophiiformes]]", parents = {"ikan"}, } labels["anguimorph lizards"] = { type = "set", description = "=[[alligator lizard]]s, [[beaded lizard]]s, [[blindworm]]s, [[crocodile monitor]]s, [[galliwasp]]s, [[Gila monster]]s, [[glass lizard]]s, [[goanna]]s, [[Komodo dragon]]s, [[legless lizard]]s, [[nile monitor]]s, 
[[perentie]]s, [[sheltopusik]]s, [[water monitor]]s, and other [[lizards]] in the [[suborder]] [[Anguimorpha]]", parents = {"lizards"}, } labels["anomurans"] = { type = "set", description = "=crablike [[crustacean]]s in the [[decapod]] [[infraorder]] [[Anomura]], which are closely related to the true [[crab]]s in the infraorder [[Brachyura]]", parents = {"krustasea", "dekapod"}, } labels["anteaters and sloths"] = { type = "set", description = "=[[mammal]]s in the [[order]] [[Pilosa]]", parents = {"mamalia"}, } labels["antelopes"] = { type = "set", description = "default", parents = {"ungulat kuku genap"}, } labels["antilopine antelopes"] = { type = "set", description = "=[[blackbuck]]s, [[chinkara]]s, [[dibatag]]s, [[dik-dik]]s, [[gazelle]]s, [[gerenuk]]s, [[grysbok]]s, [[klipspringer]]s, [[oribi]]s, [[royal antelope]]s, [[saiga]]s, [[springbok]]s, [[steenbok]]s, [[zeren]], and other [[antelope]]s in the [[bovid]] [[subfamily]] [[Antilopinae]]", parents = {"antelopes"}, } labels["ants"] = { type = "set", description = "default", parents = {"Hymenoptera"}, } labels["antshrikes"] = { type = "set", description = "default", parents = {"suboscines", "burung tenggek"}, } labels["anurans"] = { type = "set", description = "=[[amphibian]]s in the [[order]] [[Anura]], which are short-bodied and without tails, having long hind legs adapted for leaping that are typically folded at rest. 
Anurans are mostly known as [[frog]]s or [[toad]]s", parents = {"amfibia"}, } labels["aphids"] = { type = "set", description = "=[[insect]]s in the [[superfamily]] [[Aphidoidea]]", parents = {"hemipterans"}, } labels["apodiforms"] = { type = "set", description = "=[[hummingbird]]s, [[needletail]]s, [[spinetail]]s, [[swift]]s, [[swiftlet]]s, [[treeswift]]s, and other [[bird]]s in the [[order]] [[Apodiformes]]", parents = {"burung"}, } labels["araknid"] = { type = "set", description = "default", parents = {"artropod"}, } labels["lelabah araneoid"] = { type = "set", description = "=[[lelabah tinja burung]], [[cobweb spiders]] (including [[black widow]]s and [[redback]]s), [[orbweaver]]s (including [[cross spider]]s and [[writing spider]]s), [[long-jawed spider]]s, [[money spider]]s, [[nesticid]]s, [[pimoid]], [[pirate spider]]s, [[tetragnathid]]s and other [[spider]]s in the [[superfamily]] [[Araneoidea]]", parents = {"lelabah"}, } labels["ikan argentiniform"] = { type = "set", description = "=[[argentine]]s, [[barreleye]]s, [[blacksmelt]]s, [[smoothtongue]]s and other ikan in the [[order]] [[Argentiniformes]]", parents = {"ikan"}, } labels["armadillos"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["artropod"] = { type = "set", description = "default", parents = {"haiwan"}, commonscat = "Arthropoda", wpcat = true, } labels["aschizan flies"] = { type = "set", description = "=[[fly|flies]] in the [[dipteran]] [[section]] [[Aschiza]]", parents = {"Diptera"}, } labels["asilomorph flies"] = { type = "set", description = "=[[bee fly|bee flies]], [[dance fly|dance flies]], [[Mydas fly|Mydas flies]], [[robber fly|robber flies]], [[stiletto fly|stiletto flies]], [[window fly|window flies]] and other [[fly|flies]] in the [[dipteran]] [[infraorder]] [[Asilomorpha]]", parents = {"Diptera"}, } labels["assassin bugs"] = { type = "set", description = "=[[ambush bug]]s, [[assassin bug]]s, [[corsair]]s, [[feather-legged bug]]s, [[kissing bug]]s or 
[[conenose bug]]s, [[masked hunter]]s, [[wheel bug]]s, and other [[true bug]]s in the [[family]] [[Reduviidae]]", parents = {"true bugs"}, } labels["astacideans"] = { type = "set", description = "=[[crustacean]]s in the [[decapod]] [[infraorder]] [[Astacidea]], including the original [[species]] known as [[crayfish]] and [[lobster]]s, and their relatives", parents = {"krustasea", "dekapod"}, } labels["ikan ateriniform"] = { type = "set", description = "=[[blue-eye]]s, [[hardyhead]]s, [[grunion]], [[jacksmelt]], [[rainbowfish]], [[silverside]]s, [[zona]], and other ikan in the [[order]] [[Atheriniformes]]", parents = {"ikan"}, } labels["auks"] = { type = "set", description = "=[[auk]]s, [[guillemot]]s, [[murre]]s, [[puffin]]s, [[razorbill]]s, and other [[seabird]]s in the family [[Alcidae]]", parents = {"burung laut"}, } labels["ikan aulopiform"] = { type = "set", description = "=[[daggertooth]]s, [[lancetfish]], [[sergeant baker]]s, [[greeneye]]s, [[telescopefish]], [[lizardfish]] and other ikan in the [[order]] [[Aulopiformes]]", parents = {"ikan"}, } labels["Australasian robins"] = { type = "set", description = "=birds in the [[passerine]] [[family]] [[Petroicidae]], which are not closely related to the [[European robin]] (an [[Old World flycatcher]] in the family [[Muscicapidae]]), or the [[American robin]] (a [[thrush]] in the family [[Turdidae]])", parents = {"burung tenggek"}, } labels["anak haiwan"] = { type = "set", description = "default", parents = {"haiwan"}, } labels["bandicoots and bilbies"] = { type = "set", description = "=[[peramelid]]s, [[bandicoot]]s, [[marl]]s, [[quenda]]s, [[chaeropodid]]s, [[pig-footed bandicoot]]s, [[thylacomyid]]s, [[bilby|bilbies]], [[dalgite]]s, [[rabbit-eared bandicoot]]s, [[philander]]s, [[pinkie]]s, and other [[marsupial]]s in the [[order]] [[Peramelemorphia]]", parents = {"marsupials"}, } labels["barklice"] = { type = "set", description = "=non-[[parasitic]] [[insect]]s in the [[order]] [[Psocodea]]", parents = 
{"serangga"}, } labels["barnacles"] = { type = "set", description = "=[[crustacean]]s in the [[infraclass]] [[Cirripedia]], including the parasitic [[rhizocephalan]]s", parents = {"krustasea"}, } labels["kelawar"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["lebah"] = { type = "set", description = "default", parents = {"Hymenoptera", "pemeliharaan lebah"}, } labels["beetles"] = { type = "set", description = "default", parents = {"serangga"}, } labels["ikan beloniform"] = { type = "set", description = "=[[ballyhoo]], [[flying fish]], [[garfish]], [[halfbeak]]s, [[houndfish]], [[mackerel pike]]s, [[medaka]]s, [[needlefish]], [[ricefish]], [[saury|sauries]], [[silver gar]], and other ikan in the [[order]] [[Beloniformes]]", parents = {"ikan"}, } labels["bibionomorphs"] = { type = "set", description = "=[[March fly|March flies]], [[cecidomyiid]] [[gall midge]]s, [[keroplatid]] [[fungus gnat]]s, [[mycetophilid]]s, [[sciarid]]s and other [[fly|flies]], [[gnat]]s and [[midge]]s in the [[dipteran]] [[infraorder]] [[Bibionomorpha]]", parents = {"Diptera"}, } labels["burung"] = { type = "set", description = "default", parents = {"vertebrat"}, commonscat = "Aves", wpcat = true, } labels["burung pemangsa"] = { type = "set", description = "=birds that live by [[predatory]] hunting, and from [[carrion]]", parents = {"burung"}, } labels["bivalvia"] = { type = "set", description = "=[[clam]]s, [[cockle]]s, [[mussel]]s, [[oyster]]s, [[scallop]]s and other [[mollusk]]s in the [[class]] [[Bivalvia]]", parents = {"moluska"}, } labels["blennies"] = { type = "set", description = "=[[blenny|blennies]], [[chaenopsid]]s, [[clinid]]s, [[dactyloscopid]]s, [[klipfish]], [[labrisomid]]s, [[triplefin]]s, [[weedfish]] and other ikan in the [[perciform]] [[suborder]] [[Blennioidei]]", parents = {"ikan"}, } labels["boas"] = { type = "set", description = "=[[snake]]s in the family [[Boidae]]", parents = {"ular"}, } labels["bostrichiform beetles"] = { type = "set", 
description = "=[[carpet beetle]]s, [[deathwatch beetle]]s, [[drugstore beetle]]s, [[museum beetle]]s, [[powder-post beetle]]s, and other [[anobiid]]s/[[ptinid]]s, [[bostrichid]]s, [[dermestid]]s, [[derodontid]]s, [[jacobsoniid]]s and [[nosodendrid]]s in the [[coleopteran]] [[infraorder]] [[Bostrichiformia]]", parents = {"beetles"}, } labels["bovines"] = { type = "set", description = "default", parents = {"ungulat kuku genap"}, } labels["brachiopods"] = { type = "set", description = "=[[animal]]s in the [[filum]] [[Brachiopoda]]. <u>Note</u>: not to be confused with [[branchiopod]]s, which are [[crustacean]]s", parents = {"haiwan"}, } labels["branchiopods"] = { type = "set", description = "=[[brine shrimp]], [[clam shrimp]], [[fairy shrimp]], [[tadpole shrimp]], [[water flea]]s, and other [[crustacean]]s in the [[class]] [[Branchiopoda]]. <u>Note</u>: not to be confused with [[brachiopod]]s, which are a separate [[filum]]", parents = {"krustasea"}, } labels["bryozoans"] = { type = "set", description = "=[[animal]]s in the [[filum]] [[Bryozoa]], also known as [[Ectoprocta]]", parents = {"haiwan"}, } labels["bulbuls"] = { type = "set", description = "=[[bulbul]]s, [[greenbul]]s, [[brownbul]]s, [[leaflove]]s, [[bristlebill]]s, and other birds in the [[passerine]] [[family]] [[Pycnonotidae]]", parents = {"burung tenggek"}, } labels["buteos"] = { type = "set", description = "=[[hawk]]s in the [[genus]] ''[[Buteo]]'', known as [[buzzard]]s in Europe", parents = {"burung pemangsa"}, } labels["butterflies"] = { type = "set", description = "default", parents = {"serangga"}, } labels["caddis flies"] = { type = "set", description = "=serangga in the order [[Trichoptera]], which are closely related to the [[butterfly|butterflies]] and [[moth]]s but with hairs on their wings instead of scales, and which have [[aquatic]] [[larvae]] that live in cases that they build around themselves", parents = {"serangga"}, } labels["caecilians"] = { type = "set", description = 
"=[[amphibian]]s in the [[order]] [[Gymnophiona]], which are legless and resemble [[earthworm]]s or [[snake]]s", parents = {"amfibia"}, } labels["camelids"] = { type = "set", description = "=[[camelid]]s ([[camel]]s, [[llama]]s, [[alpaca]]s, etc.)", parents = {"mamalia", "ungulat kuku genap"}, } labels["kanid"] = { type = "set", description = "default", parents = {"karnivor"}, } labels["caprines"] = { type = "set", description = "=[[sheep]], [[goat]]s, [[goat antelope]]s, [[chamois]], [[muskox]]en, [[bharal]], [[goral]], [[ibex]], [[mouflon]], [[serow]], [[tahr]], [[tur]], [[takin]] and other haiwan in the [[bovid]] [[subfamily]] [[Caprinae]], formerly known as the [[family]] [[Capridae]]", parents = {"ungulat kuku genap"}, } labels["caprimulgiforms"] = { type = "set", description = "=[[caprimulgiform]]s: birds in the taxonomic order [[Caprimulgiformes]]- the [[nightjar]]s, [[oilbird]]s, [[frogmouth]]s, [[potoo]]s, etc", parents = {"burung"}, } labels["carcharhiniform sharks"] = { type = "set", description = "=[[bull shark]]s, [[catshark]]s, [[gummy shark]]s, [[hammerhead]]s, [[leopard shark]]s, [[morgay]]s, [[requiem shark]]s, [[tiger shark]]s, [[tope]]s, [[whaler]]s, [[whitetip]]s and other sharks in the [[order]] [[Carcharhiniformes]]", parents = {"jerung"}, } labels["cardinalids"] = { type = "set", description = "=[[cardinal]]s, [[dickcissel]]s, [[indigo bunting]]s, [[pyrrhuloxia]]s, [[rose-breasted grosbeak]]s, [[scarlet tanager]]s, and other birds in the [[family]] [[Cardinalidae]]", parents = {"burung tenggek"}, } labels["caridean shrimp"] = { type = "set", description = "=[[crustacean]]s in the [[decapod]] [[infraorder]] [[Caridea]], mostly known as [[shrimp]] or [[prawn]]s", parents = {"krustasea", "dekapod"}, } labels["karnivor"] = { type = "set", description = "=[[bear]]s, [[cat]]s, [[civet]]s, [[dog]]s, [[fossa]]s, [[hyaena]]s, [[mongoose]]s, [[panda]]s, [[raccoon]]s, [[seal]]s, [[skunk]]s, [[weasel]]s and various other [[mammal]]s in the [[order]] 
[[Carnivora]]", parents = {"mamalia"}, } labels["carps"] = { type = "set", description = "=ikan in the [[subfamily]] [[Cyprininae]], the [[carps]] and [[goldfish]]", parents = {"cyprinids"}, } labels["catfish"] = { type = "set", description = "default", parents = {"ikan", "ikan otosefalan"}, } labels["kucing"] = { type = "set", description = "=[[cat]]s in the sense of members of the genus ''[[Felis]]''", parents = {"felids"}, commonscat = "Felis silvestris catus", wpcat = true, } labels["cattle"] = { type = "set", description = "default", parents = {"bovines", "ternakan"}, } labels["caviomorphs"] = { type = "set", description = "=[[agouti]]s, [[capybara]]s, [[chinchilla]]s, [[guinea pig]]s, [[New World porcupine]]s, [[nutria]]s, [[tuco-tuco]]s and other [[rodent]]s in the parvorder [[Caviomorpha]]", parents = {"rodensia"}, } labels["sefalopod"] = { type = "set", description = "default", parents = {"moluska"}, } labels["monyet serkopitesin"] = { type = "set", description = "=[[blue monkey]]s, [[Diana monkey]]s, [[guenon]]s, [[lesula]]s, [[malbrouck]]s, [[patas monkey]]s, [[talapoin]]s, [[vervet]]s, and other [[Old World monkey]]s in the [[cercopithecine]] [[tribe]] [[Cercopithecini]]", parents = {"monyet dunia lama"}, } labels["burung sertioid"] = { type = "set", description = "=birds in the [[passerine]] [[superfamily]] [[Certhioidea]], the [[treecreeper]]s, [[nuthatch]]es, [[gnatcatcher]]s and [[wren]]s", parents = {"burung tenggek"}, } labels["Cervidae"] = { type = "set", description = "default", parents = {"ungulat kuku genap"}, } labels["setasea"] = { type = "set", description = "=[[cetacean]]s ([[dolphin]]s, [[whale]]s and [[porpoise]]s)", parents = {"ungulat kuku genap"}, } labels["chalcidoid wasps"] = { type = "set", description = "=[[chalcidid]]s, [[encyrtid]]s, [[fig wasp]]s, [[jointworm]]s, [[mymarid]] [[fairyfly|fairyflies]], [[perilampid]]s, [[torymid]]s, [[trichogramma]]s, and other [[wasp]]s in the [[superfamily]] [[Chalcidoidea]]", parents = 
{"Hymenoptera"}, } labels["characins"] = { type = "set", description = "=fish in the order [[Characiformes]]", parents = {"ikan", "ikan otosefalan"}, } labels["ayam"] = { type = "set", description = "default", parents = {"poltri", "unggas"}, } labels["chimaeras (fish)"] = { type = "set", description = "=[[cartilaginous]] fish in the [[Chimaeriformes]], the only surviving [[order]] of the [[subclass]] [[Holocephali]], and separate from the [[shark]]s, [[ray]]s, [[skate]]s and [[sawfish]] of the subclass [[Elasmobranchii]]", parents = {"ikan"}, } labels["kordata"] = { type = "set", description = "=haiwan dalam [[filum]] [[Chordata]]", parents = {"haiwan"}, } labels["chrysomeloid beetles"] = { type = "set", description = "=[[cerambycid]]s or [[longhorn beetle]]s such as [[apple borer]]s, [[huhu beetle]]s, [[locust borer]]s and [[thunderbolt beetle]]s, as well as [[chrysomelid]]s or [[leaf beetle]]s such as [[asparagus beetle]]s, [[bean weevil]]s, [[Colorado beetle]]s, [[cucumber beetle]]s, [[flea beetle]]s, [[potato beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Chrysomeloidea]]", parents = {"beetles"}, } labels["cicadas"] = { type = "set", description = "=[[insect]]s in the [[superfamily]] [[Cicadoidea]]", parents = {"hemipterans"}, } labels["cichlids"] = { type = "set", description = "=fish in the family [[Cichlidae]]", parents = {"ikan labroid"}, } labels["clinids"] = { type = "set", description = "=fish in the family [[Clinidae]]", parents = {"ikan"}, } labels["knidaria"] = { type = "set", description = "=[[coral]]s, [[gorgonian]]s, [[hydra]]s, [[myxozoan]]s, [[Portuguese man-of-war]], [[sea anemone]]s, [[sea fir]]s, [[sea wasp]]s, and other haiwan in the [[filum]] [[Cnidaria]]", parents = {"haiwan"}, } labels["cockatoos"] = { type = "set", description = "=[[crested]] [[parrot]]s in the [[family]] [[Cacatuidae]]", parents = {"parrots"}, } labels["lipas"] = { type = "set", description = "default", parents = {"serangga"}, } labels["colobine 
monkeys"] = { type = "set", description = "=[[colobus]]es, [[douc]]s, [[langur]]s, [[guereza]]s, [[hanuman]]s, [[leaf monkey]]s, [[lutung]]s, [[proboscis monkey]]s, and other [[Old World monkey]]s in the [[subfamily]] [[Colobinae]]", parents = {"monyet dunia lama"}, } labels["ular kolubrid"] = { type = "set", description = "=[[snake]]s in the family [[Colubridae]]", parents = {"ular"}, } labels["colugos"] = { type = "set", description = "=the [[primate]]-like [[gliding]] [[mammal]]s in the [[order]] [[Dermoptera]], also known as [[flying lemur]]s", parents = {"mamalia"}, } labels["columbids"] = { type = "set", description = "=[[columbid]]s, i.e. [[pigeon]]s and [[dove]]s", parents = {"burung"}, } labels["copepods"] = { type = "set", description = "=[[crustacean]]s in the [[subclass]] [[Copepoda]]", parents = {"krustasea"}, } labels["coraciiforms"] = { type = "set", description = "=[[bee-eater]]s, [[ground rollers]], [[kingfisher]]s, [[motmot]]s, [[roller]]s, [[tody|todies]] and other birds in the taxonomic order [[Coraciiformes]]", parents = {"burung"}, } labels["corvids"] = { type = "set", description = "default", parents = {"burung tenggek", "burung korvoid"}, } labels["burung korvoid"] = { type = "set", description = "=[[apostlebird]]s, [[bird of paradise|birds of paradise]], [[crow]]s, [[drongo]]s, [[fantail]]s, [[grinder]]s, [[jackdaw]]s, [[jay]]s, [[magpie]]s, [[magpie-lark]]s, [[manucode]]s, [[monarchid]]s, [[nutcracker]]s, [[piwakawaka]]s, [[raven]]s, [[restless flycatcher]]s, [[riflebird]]s, [[shrike]]s, [[standard-wing]]s, and other birds in the [[superfamily]] [[Corvoidea]]", parents = {"burung tenggek"}, } labels["cotingas"] = { type = "set", description = "=birds in the [[suboscine]] [[family]] [[Cotingidae]]", parents = {"suboscines"}, } labels["ketam"] = { type = "set", description = "=[[crab]]s, [[decapod]] [[crustacean]]s in the [[infraorder]] [[Brachyura]]", parents = {"krustasea", "dekapod"}, } labels["cranes (birds)"] = { type = "set", 
description = "=[[crane]]s", parents = {"gruiforms"}, } labels["cricetids"] = { type = "set", description = "=[[cotton rat]]s, [[deer mouse|deer mice]], [[hamster]]s, [[harvest mouse|harvest mice]], [[lemming]]s, [[vole]]s, [[woodrat]]s, and other [[rodent]]s in the [[family]] [[Cricetidae]]", parents = {"rodensia"}, } labels["cengkerik dan belalang"] = { type = "set", description = "=[[cengkerik]], [[belalang]], [[katidid]], [[weta]] dan [[serangga]] lain dalam order [[Orthoptera]]", parents = {"serangga"}, } labels["croakers"] = { type = "set", description = "=[[croaker]]s, [[drum]]s, [[weakfish]]s and other fish in the family [[Sciaenidae]]", parents = {"ikan perkoid"}, } labels["Crocodilia"] = { type = "set", description = "=[[buaya]], [[aligator]], kayman dan [[reptilia]] lain dalam order [[Crocodilia]]", parents = {"reptilia"}, } labels["krustasea"] = { type = "set", description = "default", parents = {"artropod"}, } labels["cuckoos"] = { type = "set", description = "=[[cuckoo]]s and other birds in the [[family]] [[Cuculidae]]", parents = {"otidimorph birds"}, } labels["cuckooshrikes and minivets"] = { type = "set", description = "=birds in the [[family]] [[Campephagidae]]", parents = {"burung tenggek"}, } labels["cucujoid beetles"] = { type = "set", description = "=[[flower beetle]]s, [[fungus beetle]]s, [[grain beetle]]s, [[lady beetle]]s, [[lizard beetle]]s, [[Mexican bean beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Cucujoidea]]", parents = {"beetles"}, } labels["ctenophores"] = { type = "set", description = "=haiwan in the [[filum]] [[Ctenophora]], the [[comb jelly|comb jellies]]", parents = {"haiwan"}, } labels["Culicomorpha"] = { type = "set", description = "=[[biting midge]]s, [[blackfly|blackflies]], [[blood worm]]s, [[glassworm]]s, [[meniscus midge]]s, [[mosquito]]s, [[no-see-um]]s, [[non-biting midge]]s, [[phantom midge]]s and other [[insect]]s in the [[dipteran]] [[infraorder]] [[Culicomorpha]]", parents = {"Diptera"}, } 
labels["cyprinids"] = { type = "set", description = "=[[carp]], [[minnow]]s, [[chub]]s and other fish in the [[family]] [[Cyprinidae]]. In some classifications, this group is known as the [[superfamily]] [[Cyprinoidea]] or [[suborder]] [[Cyprinoidei]], with the [[cyprinid]] [[subfamily|subfamilies]] considered to be families", parents = {"ikan", "ikan otosefalan"}, } labels["dabbling ducks"] = { type = "set", description = "=[[gadwall]]s, [[garganey]]s, [[mallard]]s, [[mottled duck]]s, [[pintail]]s, [[shoveler]]s, [[teal]]s, [[wigeon]]s and other ducks in either the [[anatid]] [[tribe]] [[Anatini]] or [[subfamily]] [[Anatinae]], depending on the classification", parents = {"itik"}, } labels["damselflies"] = { type = "set", description = "=[[bluestreak]]s, [[bluetail]]s, [[demoiselle]]s, [[flatwing]]s, [[redtail]]s, [[riverdamsel]]s, [[rubyspot]]s, [[spreadwing]]s, [[threadtail]]s, [[whitetip]]s, and other serangga in the [[odonate]] [[suborder]] [[Zygoptera]]", parents = {"dragonflies and damselflies"}, } labels["danaine butterflies"] = { type = "set", description = "=[[clearwing]]s, [[crow]]s, [[milkweed]]s, [[monarch]]s, [[paper kite butterfly|paper kite butterflies]], [[tiger]]s, [[wanderer]]s and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Danainae]]", parents = {"nymphalid butterflies"}, } labels["dasyuromorphs"] = { type = "set", description = "=[[thylacine]]s, [[numbat]]s, [[dasyure]]s, [[antechinus]]es, [[dibbler]]s, [[dunnart]]s, [[mulgara]]s, 
[[phascogale]]s, [[planigale]]s, [[quoll]]s, [[Tasmanian devil]]s, and other [[marsupial]]s in the [[order]] [[Dasyuromorphia]]", parents = {"marsupials"}, } labels["dekapod"] = { type = "set", description = "=[[crabs]], [[crayfish]], [[lobster]]s, [[prawn]]s, ([[caridean]]) [[shrimp]], and many other [[crustacean]]s in the [[order]] [[Decapoda]]", parents = {"krustasea"}, } labels["delphinids"] = { type = "set", description = "=(oceanic) [[dolphin]]s, [[grampus]]es, [[killer whale]]s/[[orca]]s, [[pilot whale]]s, and other [[cetacean]]s in the [[family]] [[Delphinidae]]", additional = "Note: [[river dolphin]]s and [[porpoise]]s are in other families.", parents = {"setasea"}, } labels["designer dogs"] = { type = "set", description = "default", parents = {"anjing"}, commonscat = true, wpcat = true, } labels["dinosaur"] = { type = "set", description = "default", parents = {"reptilia"}, } labels["lelabah dionika"] = { type = "set", description = "=[[crab spider]]s, [[flattie]]s, [[ground spider]]s, [[huntsman spider]]s, [[jumping spider]], [[scorpion spider]]s, and other [[lelabah]] in the [[entelegyne]] [[clade]] [[Dionycha]]", parents = {"lelabah"}, } labels["Diptera"] = { type = "set", description = "=[[fly|flies]], [[gnat]]s, [[midge]]s, [[mosquito]]s and other [[insect]]s in the order [[Diptera]]", parents = {"serangga"}, } labels["anjing"] = { type = "set", description = "default", parents = {"kanid"}, commonscat = true, wpcat = true, } labels["domestic cats"] = { type = "set", description = "default", parents = {"kucing"}, } labels["dragonflies and damselflies"] = { type = "set", description = "=serangga in the order [[Odonata]]", parents = {"serangga"}, } labels["itik"] = { type = "set", description = "default", parents = {"anatid", "poltri"}, } labels["dugongs and manatees"] = { type = "set", description = "=[[mammal]]s in the order [[Sirenia]]", parents = {"mamalia"}, } labels["eagles"] = { type = "set", description = "default", parents = {"burung pemangsa"}, 
} labels["earthworms"] = { type = "set", description = "=worms in the [[annelid]] [[suborder]] [[Lumbricina]]", parents = {"annelids"}, } labels["earwigs"] = { type = "set", description = "=serangga in the order [[Dermaptera]]", parents = {"serangga"}, } labels["ekinoderma"] = { type = "set", description = "default", parents = {"haiwan"}, commonscat = "Echinodermata", wpcat = true, } labels["belut"] = { type = "set", description = "=[[eel]]s, elongated, snakelike fish in the order [[Anguilliformes]]", parents = {"ikan elopomorf"}, } labels["ular elapid"] = { type = "set", description = "=[[cobra]]s, [[coral snake]]s, [[krait]]s, [[mamba]]s, [[sea snake]]s, and other [[venomous]] ular in the family [[Elapidae]]", parents = {"ular"}, } labels["elateroid beetles"] = { type = "set", description = "=[[click beetle]]s/[[elaterid]]s, [[fire beetle]]s, [[firefly|fireflies]]/[[lampyrid]]s, [[glowworm]]s, [[net-winged beetle]]s/[[lycid]]s, [[railroad worm]]s/[[phengodid]]s, [[soldier beetle]]s/[[cantharid]]s, [[throscid]]s, [[wireworm]]s and other [[beetle]]s in the [[superfamily]] [[Elateroidea]]", parents = {"beetles"}, } labels["elephants"] = { type = "set", description = "default", parents = {"mamalia"}, commonscat = "Elephantidae", wpcat = true, } labels["ikan elopomorf"] = { type = "set", description = "=[[bonefish]], [[eel]]s, [[gulper eel]]s, [[halosaur]]s, [[ladyfish]], [[tarpon]] and other fish in the [[superorder]] [[Elopomorpha]]", parents = {"ikan"}, } labels["emberizids"] = { type = "set", description = "=[[bunting]]s, [[yellowhammer]]s and related birds in the [[passerine]] family [[Emberizidae]]", additional = "<u>Note</u>: for New World species that were formerly classified in this family, see [[:Category:{{{langcode}}}:New World sparrows]].", parents = {"burung tenggek"}, } labels["emydid turtles"] = { type = "set", description = "=(North American) [[box turtle]]s, [[chicken turtle]]s, [[cooter]]s, [[ellachick]]s, [[pond turtle]]s, [[slider]]s, 
[[terrapin]]s, and other [[turtle]]s in the [[family]] [[Emydidae]]", parents = {"turtles"}, } labels["Equidae"] = { type = "set", description = "default", parents = {"ungulat kuku ganjil"}, } labels["erinaceids"] = { type = "set", description = "=[[erinaceid]]s – hedgehogs and relatives", parents = {"mamalia"}, } labels["euplerids"] = { type = "set", description = "=[[euplerid]]s &mdash; mongoose-like mammals found in Madagascar", parents = {"karnivor"}, } labels["ungulat kuku genap"] = { type = "set", description = "=[[mammal]]s in the [[order]] [[Artiodactyla]]", parents = {"mamalia"}, } labels["falconids"] = { type = "set", description = "=[[caracara]]s, [[falcon]]s, [[hobby|hobbies]], [[kestrel]]s, [[lanner]]s, [[merlin]]s, [[saker]]s, and other birds in the [[family]] [[Falconidae]]", parents = {"burung pemangsa"}, } labels["felids"] = { type = "set", description = "default", parents = {"karnivor"}, } labels["female haiwan"] = { type = "set", description = "default", parents = {"haiwan", "female"}, } labels["ikan"] = { type = "set", description = "default", parents = {"vertebrat"}, commonscat = true, wpcat = true, } labels["flamingos"] = { type = "set", description = "default", parents = {"burung air tawar"}, } labels["flatfish"] = { type = "set", description = "=[[sole]]s, [[flounder]]s, [[halibut]]s and other fish in the order [[Pleuronectiformes]]", parents = {"ikan"}, } labels["flatworms"] = { type = "set", description = "=[[fluke]]s, [[monogenean]]s, [[planarian]]s, [[polyclad]]s, [[tapeworm]]s, and other haiwan in the [[filum]] [[Platyhelminthes]]", additional = "For terms related to the study of [[parasitic]] [[worm#Noun|worms]], see [[:Category:Helminthology]] and its subcategories.", parents = {"cacing"}, } labels["fleas"] = { type = "set", description = "default", parents = {"serangga"}, } labels["unggas"] = { type = "set", description = "=[[fowl]]s: land birds in the [[order]] [[Galliformes]]", parents = {"burung"}, } labels["foxes"] = { type = 
"set", description = "default", parents = {"kanid"}, } labels["burung air tawar"] = { type = "set", description = "=birds that live mainly in [[freshwater]] areas, including [[estuaries]]", parents = {"burung"}, } labels["freshwater whitefish"] = { type = "set", description = "=[[cisco]]s, [[houting]]s, [[inconnu]]s, [[lavaret]]s, [[marena]]s, [[omul]]s, [[Otsego bass]], [[peled]]s, [[pollan]]s, [[roundfish]], [[tullibee]]s, [[vendace]]s, [[whitefish]] and other fish in the [[salmonid]] [[subfamily]] [[Coregoninae]]", parents = {"salmonids"}, } labels["frogs"] = { type = "set", description = "default", parents = {"anurans"}, } labels["gadiforms"] = { type = "set", description = "=[[cod]], [[haddock]], [[hake]] and other fish in the [[order]] [[Gadiformes]]", parents = {"ikan"}, } labels["ikan gasterosteiform"] = { type = "set", description = "=[[stickleback]]s, [[hypoptychid]] [[sand eel]]s, [[tubesnout]]s and other fish in the [[order]] [[Gasterosteiformes]]", additional = "Note: See [[:Category:Ikan singnatiform]] for a group formerly included within this order.", parents = {"ikan"}, } labels["gastropod"] = { type = "set", description = "default", parents = {"moluska"}, } labels["geckos"] = { type = "set", description = "=[[lizard]]s in the [[infraorder]] [[Gekkota]], except for the [[legless lizards]] or [[pygopod]]s", parents = {"lizards"}, } labels["angsa"] = { type = "set", description = "default", parents = {"anatid", "poltri"}, } labels["geometrid moths"] = { type = "set", description = "=[[carpet]]s, [[engrailed]]s, [[heath]]s, [[pug]]s, [[peppered moth]]s, [[streak]]s, [[wave]]s and other [[moth]]s in the [[family]] [[Geometridae]], most of which have [[caterpillar]]s known as [[inchworm]]s, [[looper]]s, [[measuring worm]]s or [[spanworm]]s", parents = {"moths"}, } labels["goats"] = { type = "set", description = "default", parents = {"caprines", "ternakan"}, } labels["gobies"] = { type = "set", description = "=[[goby|gobies]], [[dartfish]], 
[[mudskipper]]s, [[sea gudgeon]]s, [[sleeper]]s, [[wormfish]], and other [[fish]] in the [[perciform]] [[suborder]] [[Gobioidei]]", parents = {"ikan"}, } labels["gossamer-winged butterflies"] = { type = "set", description = "=[[blue]]s, [[copper]]s, [[elfin]]s, [[harvester]]s, [[hairstreak]]s, [[sunbeam]]s and other [[butterfly|butterflies]] in the [[family]] [[Lycaenidae]]", parents = {"butterflies"}, } labels["grebes"] = { type = "set", description = "default", parents = {"burung air tawar"}, } labels["grouse"] = { type = "set", description = "=[[blackcock]]s, [[capercaillie]]s, [[grouse]], [[moorcock]]s, [[prairie chicken]]s, [[ptarmigan]]s, [[sagehen]]s, and other birds in the [[phasianid]] [[subfamily]] [[Tetraoninae]]", parents = {"unggas"}, } labels["gruiforms"] = { type = "set", description = "=[[coot]]s, [[crake]]s, [[crane]]s, [[finfoot]]s, [[flufftail]]s, [[gallinule]]s, [[limpkin]]s, [[rail]]s, [[sungrebe]]s, [[trumpeter]]s, and other birds in the [[order]] [[Gruiformes]]", parents = {"burung air tawar"}, } labels["gulls"] = { type = "set", description = "=[[gull]]s, [[seabird]]s in the [[family]] [[Laridae]]", parents = {"burung laut"}, } labels["anjing pemburu"] = { type = "set", description = "default", parents = {"hunting dogs"}, } labels["hares"] = { type = "set", description = "default", parents = {"lagomorphs"}, } labels["hemipterans"] = { type = "set", description = "=[[aphid]]s, [[leafhopper]]s, [[scale insect]]s, [[true bug]]s, [[whitefly|whiteflies]], and other [[insect]]s in the order [[Hemiptera]]", parents = {"serangga"}, } labels["herding dogs"] = { type = "set", description = "default", parents = {"pastoral dogs"}, } labels["herons"] = { type = "set", description = "=[[heron]]s, [[bittern]]s and [[egret]]s", parents = {"burung air tawar"}, } labels["herpestids"] = { type = "set", description = "=[[herpestid]]s: mongooses, meerkats, and relatives", parents = {"karnivor"}, } labels["herrings"] = { type = "set", description = 
"=[[herring]]s, [[shad]]s, [[sardine]]s and other fish in the family [[Clupeidae]]", parents = {"ikan", "ikan otosefalan"}, } labels["ikan holostean"] = { type = "set", description = "=[[gar]]s and [[bowfin]]s, primitive fish in the [[infraclass]] [[Holostei]]", parents = {"ikan"}, } labels["hominid"] = { type = "set", description = "default", parents = {"primat"}, } labels["honeyeaters"] = { type = "set", description = "=Australian [[chat]]s, [[bellbird]]s, [[friarbird]]s, [[gibberbird]]s, [[honeyeater]]s, [[miner]]s, [[spinebill]]s, [[wattlebird]]s, and other birds in the [[family]] [[Meliphagidae]]", parents = {"meliphagoid birds"}, } labels["hoopoes and hornbills"] = { type = "set", description = "=[[hoopoe]]s, [[woodhoopoe]]s (including [[scimitarbill]]s), [[hornbill]]s, [[ground hornbill]]s, and other birds in the taxonomic order [[Bucerotiformes]]", parents = {"burung"}, } labels["horseflies"] = { type = "set", description = "=[[blind-fly|blind-flies]], [[breezefly|breezeflies]], [[cleg]]s, [[deerfly|deerflies]], [[forest fly|forest flies]], [[gadfly|gadflies]], [[horsefly|horseflies]], [[oxfly|oxflies]], [[zimb]]s, and other biting flies in the [[family]] [[Tabanidae]]", parents = {"Diptera"}, } labels["horse breeds"] = { type = "set", description = "default", parents = {"kuda"}, commonscat = true, wpcat = true, } labels["kuda"] = { type = "set", description = "default", parents = {"Equidae", "ternakan"}, } labels["hummingbirds"] = { type = "set", description = "default", parents = {"apodiforms"}, } labels["hunting dogs"] = { type = "set", description = "default", parents = {"anjing"}, } labels["hyaenids"] = { type = "set", description = "default", parents = {"karnivor"}, } labels["hydrozoans"] = { type = "set", description = "=[[bluebottle]]s, [[calycophoran]]s, [[filiferan]]s, [[hydra]]s, [[hydractinian]]s, [[leptothecate]]s, [[narcomedusa]]s, [[pandeid]]s, [[physonect]]s, [[plumularian]]s, [[Portuguese man-of-war]]s, [[siphonophore]]s, [[stylaster]]s, 
[[sea fir]]s, [[sea ginger]], [[trachylid]]s, [[trachymedusa]]s, and other haiwan in the [[cnidarian]] [[class]] [[Hydrozoa]]", parents = {"knidaria"}, } labels["Hymenoptera"] = { type = "set", description = "=[[semut]], [[lebah]], [[penyengat]] dan serangga lain dalam order [[Hymenoptera]]", parents = {"serangga"}, } labels["hyraxes"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["ibises and spoonbills"] = { type = "set", description = "=[[ibis]]es and [[spoonbill]]s", parents = {"burung air tawar"}, } labels["ichthyosauromorphs"] = { type = "set", description = "=[[ichthyosaurs]] and related groups of [[extinct]] [[aquatic]] [[reptile]]s in the [[clade]] [[Ichthyosauromorpha]]", parents = {"reptilia"}, } labels["icterids"] = { type = "set", description = "=birds in the [[New World]] [[passerine]] family [[Icteridae]]", parents = {"burung tenggek"}, } labels["iguanoid lizards"] = { type = "set", description = "=[[anole]]s, [[basilisk]]s, [[collared lizard]]s, [[chuckwalla]]s, [[fence lizard]]s, [[fringe-toed lizard]]s, [[horned lizard]]s, [[iguana]]s, [[leopard lizard]]s, [[side-blotched lizard]]s, [[zebra-tailed lizard]]s and other [[lizard]]s formerly included in the [[family]] [[Iguanidae]], and now mostly treated as comprising either the [[infraorder]] [[Pleurodonta]] or the [[superfamily]] [[Iguanoidea]]", parents = {"lizards"}, } labels["serangga"] = { type = "set", description = "default", parents = {"artropod"}, } labels["isopods"] = { type = "set", description = "=[[gribble]]s, [[pillbug]]s, [[salve bug]]s, [[slater]]s, [[sea slater]]s, [[sowbug]]s, [[woodlouse|woodlice]], and other [[crustacean]]s in the [[order]] [[Isopoda]]", parents = {"krustasea"}, } labels["jackfish"] = { type = "set", description = "=[[jack]]s, [[pompano]]s, [[jack mackerel]]s, [[scad]]s and other fish in the family [[Carangidae]]", parents = {"ikan perkoid"}, } labels["ikan tanpa rahang"] = { type = "set", description = "=[[lamprey]]s and [[hagfish]]: 
primitive eel-like fishes that have no jaws", parents = {"ikan"}, } labels["kingfishers"] = { type = "set", description = "default", parents = {"coraciiforms"}, } labels["kites (birds)"] = { type = "set", description = "=[[hawk]]s in the [[accipitrid]] [[subfamily|subfamilies]] [[Milvinae]] and [[Elaninae]], as well as some in the subfamily [[Perninae]]", parents = {"burung pemangsa"}, } labels["ikan kifosid"] = { type = "set", description = "=[[blackfish]], [[drummer]]s, [[footballer]]s, [[greenfish]], [[halfmoon]]s, [[luderick]]s, [[mado]]s, [[moonlighter]]s, [[nibbler]]s, [[opaleye]]s, [[sea chub]]s, [[stripey]]s, [[sweep]]s and other fish in the [[percoid]] [[family]] [[Kyphosidae]]", parents = {"ikan perkoid"}, } labels["ikan labroid"] = { type = "set", description = "=[[anemonefish]], [[cale]]s, [[cichlid]]s, [[clownfish]], [[damselfish]], [[parrotfish]], [[surfperch]], [[wrasse]]s, and other fish in the [[perciform]] [[suborder]] [[Labroidei]]", parents = {"ikan"}, } labels["ikan labirin"] = { type = "set", description = "=[[climbing perch]], [[gourami]]s, [[paradisefish]], [[Siamese fighting fish]] and other fish in the [[suborder]] [[Anabantoidei]]", parents = {"ikan"}, } labels["lacertoid lizards"] = { type = "set", description = "=[[amphisbaena]]s, [[caiman lizard]]s, [[green lizard]]s, [[ocellated lizard]]s, [[racerunner]]s, [[rock lizard]]s, [[tegu]]s, [[teiid]]s, [[thunderworm]]s, [[viviparous lizard]]s, [[wall lizard]]s, [[whiptail]]s, and other [[lizard]]s in the [[superfamily]] [[Lacertoidea]]", parents = {"lizards"}, } labels["lagomorphs"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["lamniform sharks"] = { type = "set", description = "=[[basking shark]]s, [[goblin shark]]s, [[great white shark]]s, [[mako shark]]s, [[megamouth shark]]s, [[porbeagle]]s, [[sand shark]]s, [[thresher shark]]s, and other [[shark]]s in the [[order]] [[Lamniformes]]", parents = {"jerung"}, } labels["ikan lampriform"] = { type = "set", 
description = "=[[crestfish]], [[oarfish]], [[opah]]s, [[ribbonfish]], [[velifer]]s and other fish in the [[order]] [[Lampridiformes]] (not to be confused with the unrelated [[lamprey]]s)", parents = {"ikan"}, } labels["larks"] = { type = "set", description = "default", parents = {"burung tenggek"}, } labels["laughingthrushes"] = { type = "set", description = "=birds in the [[family]] [[Leiothrichidae]]", parents = {"burung tenggek"}, } labels["leaf warblers"] = { type = "set", description = "=birds in the family [[Phylloscopidae]]", parents = {"warblers"}, } labels["kera kecil"] = { type = "set", description = "=[[gibbon]]s (including [[hoolock]]s, [[lar gibbon]]s, [[wow-wow]]s, etc.) and [[siamang]]s, comprising the [[family]] [[Hylobatidae]], which is closely related to the [[hominid]]s", parents = {"primat"}, } labels["ikan leusisin"] = { type = "set", description = "=[[bream]]s, [[chub]]s, [[dace]]s, [[ide]]s, many [[minnow]]s, [[nase]]s, [[roach]]es, [[shiner]]s, [[ziege]]s, and other fish in the [[cyprinid]] [[subfamily]] [[Leuciscinae]], sometimes treated as the [[family]] [[Leuciscidae]], or as the [[tribe]] [[Leuciscini]] within the [[subfamily]] [[Cyprininae]]", parents = {"cyprinids"}, } labels["libellulid dragonflies"] = { type = "set", description = "=[[amberwing]]s, [[basker]]s, [[darter]]s, [[dropwing]]s, [[duskhawk]]s, [[flutterer]]s, [[glider]]s, [[meadowhawk]]s, [[pennant]]s, [[percher]]s, [[skimmer]]s, [[slimwing]]s, [[swampdragon]]s, [[twister]]s, and other [[dragonfly|dragonflies]] in the [[family]] [[Libellulidae]]", parents = {"dragonflies and damselflies"}, } labels["lice"] = { type = "set", description = "=[[parasitic]] serangga in the [[order]] [[Psocodea]]", parents = {"serangga"}, } labels["limenitidine butterflies"] = { type = "set", description = "=[[admiral]]s, [[clipper]]s, [[count]]s, [[duke]]s, [[purple]]s, [[sister]]s, and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Limenitidinae]]", parents = {"nymphalid 
butterflies"}, } labels["littorinimorphs"] = { type = "set", description = "=[[boat shell]]s, [[carrier shell]]s, [[conch]]s, [[cowry|cowries]], [[flamingo tongue]]s, [[helmet shell]]s, [[moon snail]]s, [[pebblesnail]]s, [[trumpet shell]]s, [[velutinid]]s, [[winkle]]s, [[worm-shell]]s, and other [[gastropod]]s in the [[order]] [[Littorinimorpha]]", parents = {"gastropod"}, } labels["livestock guardian dogs"] = { type = "set", description = "default", parents = {"pastoral dogs"}, } labels["lizards"] = { type = "set", description = "default", parents = {"reptilia"}, } labels["loaches"] = { type = "set", description = "=fish in the [[cypriniform]] [[superfamily]] [[Cobitoidea]]", parents = {"ikan", "ikan otosefalan"}, } labels["ikan sirip lobus"] = { type = "set", description = "=[[coelacanth]]s, [[lungfish]] and other fishes in the [[subclass]] [[Sarcopterygii]] of the [[bony fish]]es", additional = "<u>Please note</u>: although the [[tetrapod]]s (including all [[reptile]]s, [[amphibian]]s, [[bird]]s and [[mammal]]s) are descended from within this group, they are excluded from this category by not being fish.", parents = {"ikan"}, } labels["loons"] = { type = "set", description = "=[[loon]]s, birds known as [[diver]]s outside the US", parents = {"burung air tawar"}, } labels["macaques"] = { type = "set", description = "=[[Barbary ape]]s, [[bonnet monkey]]s, [[crab-eating macaque]]s, [[Japanese macaque]]s, [[moor macaque]]s, [[pigtail macaque]]s, [[rhesus monkey]]s, [[toque]]s, and other [[Old World monkey]]s in the [[genus]] ''[[Macaca]]''", parents = {"monyet dunia lama"}, } labels["macropods"] = { type = "set", description = "=[[bettong]]s, [[kangaroo]]s, [[pademelon]]s, [[potoroo]]s, [[quokka]]s, [[wallaby]]s, and other [[marsupial]]s in the [[diprotodont]] [[suborder]] [[Macropodiformes]]", parents = {"marsupials"}, } labels["malaconotoid birds"] = { type = "set", description = "=[[Australian magpie]]s, [[bushshrike]]s, [[butcherbird]]s, [[boubou]]s, [[brubru]]s, 
[[currawong]]s, [[gonolek]]s, [[squeaker]]s, [[vanga]]s, and other birds in the [[passerine]] [[superfamily]] [[Malaconotoidea]]", parents = {"burung tenggek"}, } labels["male haiwan"] = { type = "set", description = "default", parents = {"haiwan", "male"}, } labels["mamalia"] = { type = "set", description = "default", parents = {"vertebrat"}, } labels["mantids"] = { type = "set", description = "=serangga in the [[order]] [[Mantodea]], often known as [[praying mantis]]es", parents = {"serangga"}, } labels["marsupials"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["mayflies"] = { type = "set", description = "=serangga in the [[order]] [[Ephemeroptera]]", parents = {"serangga"}, } labels["megalopterans"] = { type = "set", description = "=[[alderfly|alderflies]], [[dobsonfly|dobsonflies]], [[fishfly|fishflies]] and other serangga in the [[order]] [[Megaloptera]]", parents = {"serangga"}, } labels["meliphagoid birds"] = { type = "set", description = "=[[blue wren]]s, [[bristlebird]]s, [[emu-wren]]s, [[fairywren]]s, [[gerygone]]s, [[grasswren]]s, [[honeyeater]]s, [[pardalote]]s, [[pilotbird]]s, [[redthroat]]s, [[scrubwren]]s, [[thornbill]]s, [[weebill]]s, [[whiteface]]s, and other birds in the [[passerine]] [[superfamily]] [[Meliphagoidea]]", parents = {"burung tenggek"}, } labels["mephitids"] = { type = "set", description = "=[[mephitid]]s: skunks and stink badgers", parents = {"karnivor"}, } labels["mergansers"] = { type = "set", description = "=[[diving]] [[duck]]s in the [[genus]] ''[[Mergus]]'' and a few similar species", parents = {"itik"}, } labels["mimids"] = { type = "set", description = "=[[catbird]]s, [[mockingbird]]s, [[thrasher]]s and other birds in the [[passerine]] family [[Mimidae]]", parents = {"burung tenggek"}, } labels["mites and ticks"] = { type = "set", description = "=[[arachnid]]s in the [[subclass]] [[Acari]]", parents = {"araknid"}, } labels["moluska"] = { type = "set", description = "default", parents = 
{"haiwan"}, commonscat = "Mollusca", wpcat = "Molluscs", } labels["monyet"] = { type = "set", description = "default", parents = {"primat"}, } labels["monotremes"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["nyamuk"] = { type = "set", description = "=[[insect]]s in the [[dipteran]] [[family]] [[Culicidae]]", parents = {"Culicomorpha"}, } labels["moths"] = { type = "set", description = "default", parents = {"serangga"}, } labels["murids"] = { type = "set", description = "=a number of [[rats]], [[mice]], and other [[rodent]]s in the [[Old World]] [[family]] [[Muridae]]", parents = {"rodensia"}, } labels["muscicapids"] = { type = "set", description = "=birds in the [[passerine]] family [[Muscicapidae]]", parents = {"burung tenggek"}, } labels["muscoid flies"] = { type = "set", description = "=[[anthomyiid]]s such as [[root fly|root flies]], [[cabbage fly|cabbage flies]] and [[onion fly|onion flies]]; [[fanniid]]s; [[muscid]]s such as [[housefly|houseflies]], [[face fly|face flies]] and [[stable fly|stable flies]]; [[scathophagid]]s such as [[dungfly|dungflies]]; and other [[fly|flies]] in the [[dipteran]] [[superfamily]] [[Muscoidea]]", parents = {"Diptera"}, } labels["mustelids"] = { type = "set", description = "default", parents = {"karnivor"}, } labels["lelabah migalomorf"] = { type = "set", description = "=[[baboon spider]]s, [[barking spider]]s, [[bird spider]]s, [[purseweb spider]]s, [[tarantula]]s, [[trapdoor spider]]s, and other [[spider]]s in the [[infraorder]] [[Mygalomorphae]]", parents = {"lelabah"}, } labels["myriapods"] = { type = "set", description = "=[[centipede]]s, [[millipede]]s, [[pauropod]]s, [[symphylan]]s, and other [[arthropod]]s in the [[subfilum]] [[Myriapoda]]", parents = {"artropod"}, } labels["myrmicine ants"] = { type = "set", description = "=[[ant]]s in the [[subfamily]] [[Myrmicinae]]", parents = {"ants"}, } labels["nematodes"] = { type = "set", description = "=[[filaria]], [[gapeworm]]s, [[lungworm]]s, 
[[pinworm]]s, [[threadworm]]s, [[wheatworm]]s, [[whipworm]]s and other [[worm]]s in the [[filum]] [[Nematoda]]", parents = {"cacing"}, } labels["neogastropod"] = { type = "set", description = "=[[admiral shell]]s, [[cone snail]]s, [[harp shell]]s, [[murex]]es, [[olive]]s, [[rhombus]]es, [[spindle]]s, [[tulip shell]]s, [[turnip shell]]s, [[volute]]s, [[whelk]]s, [[winkle]]s and other [[gastropod]]s in the [[clade]] [[Neogastropoda]] (treated as an [[order]] in some classifications)", parents = {"gastropod"}, } labels["monyet dunia baharu"] = { type = "set", description = "=[[capuchin]]s, [[howler monkey]]s, [[marmoset]]s, [[night monkey]]s, [[saki]]s, [[spider monkey]]s, [[squirrel monkey]]s, [[tamarin]]s, [[titi]]s, [[uakari]]s, [[woolly monkey]]s, and other [[monkey]]s in the [[parvorder]] [[Platyrrhini]]", parents = {"monyet"}, } labels["New World quails"] = { type = "set", description = "=birds in the [[family]] [[Odontophoridae]], most of which live in the [[New World]] and are known as [[quail]]s, but the family also includes the African [[genus]] ''[[Ptilopachus]]'' and some [[species]] are known as partridges", parents = {"unggas"}, } labels["New World sparrows"] = { type = "set", description = "=[[sparrow]]- and [[finch]]-like birds in the [[passerine]] [[family]] [[Passerellidae]], until recently considered part of the family [[Emberizidae]]", parents = {"burung tenggek"}, } labels["New World warblers"] = { type = "set", description = "=birds in the family [[Parulidae]]", parents = {"warblers"}, } labels["neuropterans"] = { type = "set", description = "=[[antlion]]s, [[lacewing]]s, [[mantisfly|mantisflies]], [[owlfly|owlflies]] and other serangga in the [[order]] [[Neuroptera]]", parents = {"serangga"}, } labels["newts"] = { type = "set", description = "=[[terrestrial]] [[salamander]]s in the [[subfamily]] [[Pleurodelinae]]", parents = {"salamanders"}, } labels["noctuoid moths"] = { type = "set", description = "=[[armyworm]]s, [[cinnabar]]s, [[corn 
earworm]]s, [[cutworm]]s, [[gypsy moth]]s, [[owlet moth]]s, [[processionary|processionaries]], [[tiger moth]]s, [[underwing]]s, [[wainscot]]s, [[wooly bear]]s, and many other [[moth]]s (and [[caterpillar]]s) in the [[superfamily]] [[Noctuoidea]]", parents = {"moths"}, } labels["nudibranchs"] = { type = "set", description = "=[[sea slug]]s in the [[gastropod]] [[order]] [[Nudibranchia]]", parents = {"gastropod"}, } labels["nymphalid butterflies"] = { type = "set", description = "=[[admiral]]s, [[brown]]s, [[buckeye]]s, [[checkerspot]]s, [[emperor]]s, [[fritillary|fritillaries]], [[leafwing]]s, [[longwing]]s, [[monarch]]s, [[morpho]]s, [[painted lady|painted ladies]], [[ringlet]]s, [[satyr]]s, [[sister]]s, [[snout]]s, [[tortoiseshell]]s, and other butterflies in the [[family]] [[Nymphalidae]]", parents = {"butterflies"}, } labels["kurita"] = { type = "set", description = "default", parents = {"sefalopod"}, } labels["ungulat kuku ganjil"] = { type = "set", description = "=[[mammal]]s in the [[order]] [[Perissodactyla]], including the [[equid]]s, [[tapir]]s and [[rhinoceros]]es", parents = {"mamalia"}, } labels["oestroid flies"] = { type = "set", description = "=[[blowfly|blowflies]], [[bluebottle]]s, [[botfly|botflies]], [[flesh fly|flesh flies]], [[greenbottle]]s, [[mango fly|mango flies]], [[screwworm]]s, [[tachinid]]s, [[torsalo]]s, [[tumbu fly|tumbu flies]], [[warble fly|warble flies]], and other flies in the [[superfamily]] [[Oestroidea]]", parents = {"Diptera"}, } labels["monyet dunia lama"] = { type = "set", description = "=[[baboon]]s, [[colobus]], [[douc]]s, [[gelada]]s, [[green monkey]]s, [[grivet]]s, [[langur]]s, [[malbrouck]]s, [[mandrill]]s, [[mangabey]]s, [[patas monkey]]s, [[proboscis monkey]]s, [[talapoin]]s, [[vervet]]s, and other [[monkeys]] in the [[family]] [[Cercopithecidae]], the only [[members]] of the [[parvorder]] [[Catarrhini]] aside from the greater/lesser apes and humans", parents = {"monyet"}, } labels["Old World orioles"] = { type = "set", 
description = "=[[perching bird]]s in the [[family]] [[Oriolidae]], which are not closely related to the New World orioles in the family [[Icteridae]]", parents = {"burung tenggek"}, } labels["ornithopods"] = { type = "set", description = "=[[camptosaurid]]s, [[hadrosaur]]s, [[iguanodontid]]s, [[lambeosaurid]]s, [[rhabdodontid]]s, [[saurolophid]]s, [[thescelosaurid]]s, [[trachodontid]]s, and other [[dinosaur]]s in the [[ornithischian]] [[clade]] [[Ornithopoda]]", parents = {"dinosaur"}, } labels["ikan osteoglosomorf"] = { type = "set", description = "=[[aba]]s, [[arapaima]]s, [[arowana]]s, [[butterfly fish]], [[elephantfish]], [[featherback]]s, [[mooneye]]s and other fish in the [[superorder]] [[Osteoglossomorpha]]", parents = {"ikan"}, } labels["otariid seals"] = { type = "set", description = "=[[mammal]]s in the [[family]] [[Otariidae]], including the [[fur seal]]s and [[sea lion]]s", parents = {"pinnipeds"}, } labels["burung otidimorf"] = { type = "set", description = "=[[bustard]]s in the [[family]] [[Otididae]] and [[order]] [[Otidiformes]]; [[turaco]]s or [[lourie]]s, [[go-away bird]]s, [[plantain-eater]]s, etc., in the [[family]] [[Musophagidae]] and [[order]] [[Musophagiformes]]; and [[cuckoo]]s in the [[family]] [[Cuculidae]] and [[order]] [[Cuculiformes]]; all in the [[clade]] [[Otidimorphae]]", parents = {"burung"}, } labels["ikan otosefala"] = { type = "set", description = "=[[anchovy|anchovies]], [[beaked salmon]], [[carp]], [[catfish]], [[characin]]s, [[electric eel]]s, [[ghost knifefish]], [[herring]]s, [[loach]]es, [[milkfish]], [[minnow]]s, [[mousefish]], [[slickhead]]s, [[sucker]]s, [[tubeshoulder]]s, and other fish in the [[clade]] [[Otocephala]]", parents = {"ikan"}, } labels["ovenbirds"] = { type = "set", description = "=burung in the [[suboscine]] family [[Furnariidae]], including the former family Dendrocolaptidae (now the [[subfamily]] [[Dendrocolaptinae]])", parents = {"suboscines"}, } labels["owls"] = { type = "set", description = 
"default", parents = {"burung pemangsa"}, } labels["pangolins"] = { type = "set", description = "=[[mammal]]s in the [[order]] [[Pholidota]]", parents = {"mamalia"}, } labels["panthers"] = { type = "set", description = "=[[panther]]s in the sense of members of the genus ''[[Panthera]]''", parents = {"felids"}, } labels["parrots"] = { type = "set", description = "default", parents = {"burung"}, } labels["pastoral dogs"] = { type = "set", description = "default", parents = {"anjing"}, } labels["penguins"] = { type = "set", description = "default", parents = {"burung"}, } labels["pentatomoid bugs"] = { type = "set", description = "=[[acanthosomatid]]s, [[burrowing bug]]s, [[jewel bug]]s, [[shield bug]]s, [[stinkbug]]s, [[thyreocorid]]s, and other [[true bug]]s in the [[superfamily]] [[Pentatomoidea]]", parents = {"true bugs"}, } labels["perch and darters"] = { type = "set", description = "=fish in the family [[Percidae]]", parents = {"ikan perkoid"}, } labels["burung tenggek"] = { type = "set", description = "=Burung tenggek: salah satu ahli order [[Passeriformes]]", parents = {"burung"}, } labels["ikan perkoid"] = { type = "set", description = "=[[archerfish]], [[bass]], [[bigeye]]s, [[bluefish]], [[butterflyfish]], [[cardinalfish]], [[cobia]], [[croaker]]s, [[flagtail]]s, [[goatfish]], [[grouper]]s, [[grunt]]s, [[horse mackerel]], [[jack]]s, [[jawfish]], [[leaffish]], [[mahi-mahi]], [[mojarra]], [[perch]], [[pomfret]]s, [[pompano]], [[ponyfish]], [[porgy|porgies]], [[remora]]s, [[roosterfish]], [[sea bass]], [[sea bream]], [[snapper]], [[sunfish]], [[sweeper]]s, [[threadfin]], [[tilefish]], [[wreckfish]], and other [[perciform]] fish in the [[superfamily]] [[Percoidea]]", parents = {"ikan"}, } labels["phiomorphs"] = { type = "set", description = "=[[blesmol]]s, [[sand mole]]s, [[mole rat]]s, [[dassie rat]]s or [[rock rat]]s, [[Old World porcupine]]s, [[cane rat]]s or [[grasscutter]]s and other [[rodent]]s in the parvorder [[Phiomorpha]], which is the Old World 
counterpart of the [[caviomorph]]s", parents = {"rodensia"}, } labels["phocid seals"] = { type = "set", description = "=[[mammal]]s in the [[family]] [[Phocidae]], including the [[earless seal]]s (also known as [[true seal]]s)", parents = {"pinnipeds"}, } labels["piciforms"] = { type = "set", description = "=[[woodpecker]]s, [[aracari]]s, [[coppersmith]]s, [[honeyguide]]s, [[jacamar]]s, [[nunlet]]s, [[puffbird]]s, [[toucan]]s, and other burung in the [[order]] [[Piciformes]]", parents = {"burung"}, } labels["pierid butterflies"] = { type = "set", description = "=[[brimstone]]s, [[orange tip]]s, [[sulfur]]s, [[white]]s and other [[butterfly|butterflies]] in the [[family]] [[Pieridae]]", parents = {"butterflies"}, } labels["babi"] = { type = "set", description = "default", parents = {"ungulat kuku genap", "ternakan"}, commonscat = "Suidae", wpcat = true, } labels["pikes (fish)"] = { type = "set", description = "=fish in the family [[Esocidae]]", parents = {"ikan"}, } labels["pinnipeds"] = { type = "set", description = "default", parents = {"karnivor"}, } labels["pipits and wagtails"] = { type = "set", description = "=burung in the [[passerine]] family [[Motacillidae]]", parents = {"burung tenggek"}, } labels["placoderms"] = { type = "set", description = "=[[extinct]] armored fish of the [[class]] [[Placodermi]] from the [[Silurian]] and [[Devonian]] [[geologic]] [[period]]s", parents = {"ikan"}, } labels["plovers and lapwings"] = { type = "set", description = "=burung in the [[charadriiform]] [[family]] [[Charadriidae]]", parents = {"shorebirds"}, } labels["pomfrets"] = { type = "set", description = "=fish in the family [[Bramidae]]", parents = {"ikan perkoid"}, } labels["primat"] = { type = "set", description = "default", parents = {"mamalia"}, commonscat = true, wpcat = true, } labels["procyonids"] = { type = "set", description = "=[[procyonid]]s: ([[raccoon]]s, [[coati]]s, [[kinkajou]]s, [[olingo]]s, [[ringtail]]s and [[cacomistle]]s)", parents = {"karnivor"}, } 
labels["prosimian"] = { type = "set", description = "default", parents = {"primat"}, } labels["pterosaurs"] = { type = "set", description = "default", parents = {"reptilia"}, } labels["pyraloid moths"] = { type = "set", description = "=[[bee moth]]s, [[flour moth]]s, [[leaf crumpler]]s, [[magpie moth]]s, [[melonworm]]s, [[mint moth]]s, [[orangeworm]]s, [[pantry moth]]s, [[pickleworm]]s, [[snout moth]]s, [[veneer moth]]s, [[wax moth]]s and other [[crambid]] and [[pyralid]] [[moths]] in the [[superfamily]] [[Pyraloidea]]", parents = {"moths"}, } labels["rabbits"] = { type = "set", description = "default", parents = {"lagomorphs"}, } labels["rallids"] = { type = "set", description = "=[[rallid]]s: [[rail]]s and other burung in the family [[Rallidae]]", parents = {"gruiforms"}, } labels["ratites"] = { type = "set", description = "=[[ratite]]s: burung in the superorder [[Palaeognathae]], including large flightless burung such as [[ostrich]]es, and [[emu]]s, as well as the smaller [[kiwi]]s and [[flighted]] [[tinamous]]", parents = {"burung"}, } labels["rays and skates"] = { type = "set", description = "=[[fish]] in the superorder [[Batoidea]]", parents = {"ikan"}, } labels["reindeers"] = { type = "set", description = "default", parents = {"cervids"}, } labels["reptilia"] = { type = "set", description = "default", parents = {"vertebrat"}, commonscat = "Reptilia", wpcat = true, } labels["retrievers"] = { type = "set", description = "default", parents = {"anjing pemburu"}, } labels["rhinoceroses"] = { type = "set", description = "=[[rhinoceros]]es, [[mammal]]s in the [[perissodactylic]] [[family]] [[Rhinocerotidae]]", parents = {"ungulat kuku ganjil"}, } labels["rodensia"] = { type = "set", description = "default", parents = {"mamalia"}, } labels["salamanders"] = { type = "set", description = "=[[amphiuma]]s, [[axolotl]]s, [[hellbender]]s, [[mud puppy|mud puppies]], [[olm]]s, [[newt]]s, [[salamander]]s, [[siren]]s, and other [[amphibian]]s in the [[order]] [[Caudata]]", 
parents = {"amfibia"}, } labels["salmonids"] = { type = "set", description = "=[[salmon]]s, [[trout]], and other fish in the family [[Salmonidae]]", parents = {"ikan"}, } labels["saturniid moths"] = { type = "set", description = "=[[Atlas moth]]s, [[cecropia]]s, [[hickory horned devil]]s, [[io moth]]s, [[luna moth]]s, [[polyphemus moth]]s, and other [[moth]]s (and [[caterpillar]]s) in the [[family]] [[Saturniidae]]", parents = {"moths"}, } labels["satyrine butterflies"] = { type = "set", description = "=[[brown]]s, [[forester]]s, [[grayling]]s, [[heath]]s, [[palmfly|palmflies]], [[ringlet]]s, [[satyr]]s, and other [[butterfly|butterflies]] in the [[nymphalid]] [[subfamily]] [[Satyrinae]]", parents = {"nymphalid butterflies"}, } labels["sauropod"] = { type = "set", description = "=[[apatosaur]]s, [[brachiosaur]]s, [[brontosaur]]s, [[camarasaur]]s, [[cetiosaur]]s, [[diplodocus]]es, [[saltasaurid]]s, [[titanosaurian]]s, [[turiasaur]]s, [[vulcanodontid]]s, and other [[dinosaurs]] in the [[saurischian]] [[infraorder]] [[Sauropoda]]", parents = {"dinosaur"}, } labels["sauropterygians"] = { type = "set", description = "=[[elasmosaur]]s, [[placodont]]s, [[plesiosaur]]s, and other extinct aquatic [[reptile]]s in the [[superorder]] [[Sauropterygia]]", parents = {"reptilia"}, } labels["sawflies and wood wasps"] = { type = "set", description = "=[[horntail]]s, [[pigeon tremex]], [[rose slug]]s, [[sawfly|sawflies]], [[wood wasp]]s, and other primitive [[hymenopteran]]s in the [[suborder]] [[Symphyta]]", parents = {"Hymenoptera"}, } labels["serangga teritip"] = { type = "set", description = "=[[insect]]s in the [[superfamily]] [[Coccoidea]]", parents = {"hemipterans"}, } labels["scarabaeoids"] = { type = "set", description = "=[[cockchafer]]s, [[dor]]s, [[dung beetle]]s, [[June beetle]]s, [[rain beetle]]s, [[rose chafer]]s, [[scarab]]s, [[stag beetle]]s, and other beetles in the [[superfamily]] [[Scarabaeoidea]]", parents = {"beetles"}, } labels["scenthounds"] = { type = "set", 
description = "default", parents = {"hunting dogs"}, } labels["scincomorph lizards"] = { type = "set", description = "=[[blue-tongue lizard]]s, [[night lizard]]s, [[sandfish]], [[skink]]s, [[sungazer]]s, and other [[lizard]]s in the [[infraorder]] [[Scincomorpha]]", parents = {"lizards"}, } labels["scolopacids"] = { type = "set", description = "=[[curlew]]s, [[dunlin]]s, [[godwit]]s, [[knot]]s, [[redshank]]s, [[ruff]]s, [[sandpiper]]s, [[snipe]]s, [[stint]]s, [[turnstone]]s, [[tattler]]s, [[whimbrel]]s, [[woodcock]]s, [[yellowleg]]s, and other burung in the [[charadriiform]] [[family]] [[Scolopacidae]]", parents = {"shorebirds"}, } labels["scombroids"] = { type = "set", description = "=[[mackerel]]s, [[tuna]]s, [[barracuda]]s, [[swordfish]], and other fish in the suborder [[Scombroidei]]", parents = {"ikan"}, } labels["ikan skorpaeniform"] = { type = "set", description = "=[[bullhead]]s, [[cabezon]], [[golomyanka]], [[greenling]]s, [[gurnard]]s, [[Irish lord]], [[lionfish]], [[lumpsucker]]s, [[pigfish]], [[poacher]]s, [[sablefish]], [[scorpionfish]], [[sculpin]]s, [[sea raven]]s, [[sea toad]]s, [[skilfish]], [[snailfish]], [[stonefish]], [[wingfish]], and other fish in the [[order]] [[Scorpaeniformes]]", parents = {"ikan"}, } labels["scorpions"] = { type = "set", description = "=true [[scorpion]]s: [[arachnid]]s in the [[order]] [[Scorpiones]]", parents = {"araknid"}, } labels["screamers"] = { type = "set", description = "=[[screamer]]s: burung in the family [[Anhimidae]], related to [[duck]]s and [[geese]]", parents = {"burung"}, } labels["burung laut"] = { type = "set", description = "default", parents = {"burung"}, } labels["sea anemones"] = { type = "set", description = "=[[cnidarian]]s in the [[order]] [[Actiniaria]]", parents = {"knidaria"}, } labels["sea cucumbers"] = { type = "set", description = "=[[echinoderm]]s in the [[class]] [[Holothuroidea]]", parents = {"ekinoderma"}, } labels["sea urchins"] = { type = "set", description = "=[[echinoderm]]s in the 
[[class]] [[Echinoidea]], including the [[sand dollar]]s", parents = {"ekinoderma"}, } labels["sea turtles"] = { type = "set", description = "=[[flatback]]s, [[green turtle]]s, [[hawksbill]]s, [[leatherback]]s, [[loggerhead]]s, [[ridley]]s, and other [[turtle]]s in the [[superfamily]] [[Chelonioidea]]", parents = {"turtles"}, } labels["sebastids"] = { type = "set", description = "=fish in the family [[Sebastidae]]", parents = {"ikan skorpaeniform"}, } labels["serranids"] = { type = "set", description = "=[[sea bass]], [[grouper]]s, [[rockcod]]s, [[comber]]s and other fish in the family [[Serranidae]]", parents = {"ikan perkoid"}, } labels["jerung"] = { type = "set", description = "default", parents = {"ikan"}, } labels["kambing biri-biri"] = { type = "set", description = "default", parents = {"caprines", "ternakan"}, } labels["shorebirds"] = { type = "set", description = "default", parents = {"burung"}, } labels["shrikes"] = { type = "set", description = "default", parents = {"burung tenggek", "burung korvoid"}, } labels["sighthounds"] = { type = "set", description = "default", parents = {"hunting dogs"}, } labels["skippers"] = { type = "set", description = "=serangga in the family [[Hesperiidae]]", parents = {"butterflies"}, } labels["smelts"] = { type = "set", description = "=fish in the [[order]] [[Osmeriformes]]", parents = {"ikan"}, } labels["snails"] = { type = "set", description = "default", parents = {"gastropod"}, } labels["ular"] = { type = "set", description = "default", parents = {"reptilia"}, } labels["snappers"] = { type = "set", description = "=ikan in the [[family]] [[Lutjanidae]]", parents = {"ikan perkoid"}, } labels["soft corals"] = { type = "set", description = "=[[calcaxonian]]s, [[dead man's fingers]], [[fan coral]]s, [[gorgonian]]s, [[holaxonian]]s, [[scleraxonian]]s, [[sea feather]]s, [[sea willow]]s, [[stoloniferan]]s, [[whip coral]]s, and other marine haiwan in the [[cnidarian]] order [[Alcyonacea]]", parents = {"knidaria"}, } 
labels["soricomorphs"] = { type = "set", description = "=[[shrew]]s, [[mole]]s, [[solenodon]]s, and other [[mammal]]s in the [[order]] [[Soricomorpha]]", parents = {"mamalia"}, } labels["South American canids"] = { type = "set", description = "=fox-like [[canid]]s in the [[subtribe]] [[Cerdocyonina]], which are more closely related to the [[dog]]s and [[wolf|wolves]] than to the true [[fox]]es. Also known as [[zorro]]s", parents = {"kanid"}, } labels["spaniels"] = { type = "set", description = "default", parents = {"anjing pemburu"}, } labels["sparids"] = { type = "set", description = "=[[sea breams]], [[porgie]]s, [[scup]]s and other ikan in the family [[Sparidae]]", parents = {"ikan perkoid"}, } labels["sphinx moths"] = { type = "set", description = "=[[hawkmoth]]s, [[hornworm]]s, [[hummingbird moth]]s, [[sphinx moth]]s, [[tomato worm]]s, and other [[moth]]s (and [[caterpillar]]s) in the [[family]] [[Sphingidae]]", parents = {"moths"}, } labels["lelabah"] = { type = "set", description = "default", parents = {"araknid"}, } labels["sponges"] = { type = "set", description = "=[[aquatic]] [[animal]]s in the [[filum]] [[Porifera]]", parents = {"haiwan"}, } labels["squid"] = { type = "set", description = "default", parents = {"sefalopod"}, } labels["squirrels"] = { type = "set", description = "=[[squirrel]]s, [[chipmunk]]s, [[marmot]]s, [[prairie dog]]s, [[woodchuck]]s and other [[rodent]]s in the family [[Sciuridae]]", parents = {"rodensia"}, } labels["staphylinoid beetles"] = { type = "set", description = "=[[beetle]]s in the [[superfamily]] [[Staphylinoidea]]", parents = {"beetles"}, } labels["starlings"] = { type = "set", description = "=[[starling]]s, [[mynah]]s, and other birds in the [[passerine]] family [[Sturnidae]]", parents = {"burung tenggek"}, } labels["belalang ranting"] = { type = "set", description = "=[[insect]]s (including the [[leaf insect]]s) in the [[order]] known as either [[Phasmida]] or [[Phasmatodea]], which are noted for their extreme
adaptations in form and color to look like parts of the plants they feed on", parents = {"serangga"}, } labels["stoneflies"] = { type = "set", description = "=[[freshwater]] [[aquatic]] [[insect]]s in the [[order]] [[Plecoptera]]", parents = {"serangga"}, } labels["stony corals"] = { type = "set", description = "=marine haiwan in the [[cnidarian]] order [[Scleractinia]]", parents = {"knidaria"}, } labels["storks"] = { type = "set", description = "default", parents = {"burung air tawar"}, } labels["ikan stromateoid"] = { type = "set", description = "=[[barrelfish]], [[blue eye cod]], [[dollarfish]], [[driftfish]], [[lafayette]], [[medusafish]], [[rudderfish]], [[squaretail]], [[warehou]], and other ikan in the [[perciform]] [[suborder]] [[Stromateoidei]]", parents = {"ikan"}, } labels["sturgeons"] = { type = "set", description = "=ikan in the family [[Acipenseridae]]", parents = {"ikan"}, } labels["suboscines"] = { type = "set", description = "=[[antpitta]]s, [[antshrike]]s, [[antthrush]]es, [[asity|asities]], [[broadbill]]s, [[cotinga]]s, [[crescentchest]]s, [[gnateater]]s, [[manakin]]s, [[ovenbird]]s, [[pitta]]s, [[sharpbill]]s, [[spadebill]]s, [[tapaculo]]s, [[tityra]]s, [[tyrant flycatcher]]s, [[woodcreeper]]s, and other birds in the [[passerine]] [[suborder]] [[Tyranni]]", parents = {"burung tenggek"}, } labels["suckers (ikan)"] = { type = "set", description = "=[[buffalo fish]], [[cuiui]], [[jumprock]]s, [[quillback]], [[redhorse]], [[sucker]]s, and other freshwater ikan in the family [[Catostomidae]]", parents = {"ikan", "ikan otosefalan"}, } labels["suliform birds"] = { type = "set", description = "=[[anhinga]]s, [[booby|boobies]], [[cormorant]]s, [[frigatebird]]s, [[gannet]]s, and other [[burung laut]] in the [[order]] [[Suliformes]]", parents = {"burung laut"}, } labels["sunfish"] = { type = "set", description = "=freshwater ikan otosefalan in the family [[Centrarchidae]]", parents = {"ikan perkoid"}, } labels["swallows"] = { type = "set", description = 
"default", parents = {"burung tenggek"}, } labels["swallowtails"] = { type = "set", description = "=[[apollo]]s, [[batwing]]s, [[birdwing]]s, [[clubtail]]s, [[festoon]]s, [[flying handkerchief]]s, [[Helen]]s, [[jay]]s, [[mime]]s, [[parnassian]]s, [[rose]]s, [[swallowtail]]s, [[swordtail]]s, [[triangle]]s, [[turnus]]es, [[windmill]]s, [[zebra]]s, and other [[butterfly|butterflies]] in the [[family]] [[Papilionidae]], notable for (mostly) having tail-like extensions on their [[hindwing]]s", parents = {"butterflies"}, } labels["swan"] = { type = "set", description = "default", parents = {"anatid"}, } labels["ikan singnatiform"] = { type = "set", description = "=[[bellowsfish]], [[cornetfish]], [[pipefish]], [[razorfish]], [[sea dragon]]s, [[sea horse]]s, [[snipefish]], [[trumpetfish]], and other ikan in the [[order]] [[Syngnathiformes]]", parents = {"ikan"}, } labels["tanagers"] = { type = "set", description = "=[[bananaquit]]s, [[conebill]]s, [[dacnis]]es, [[Darwin's finch]]es, [[grassquit]]s, [[ground finch]]es, [[honeycreeper]]s, [[pardusco]]s, [[tanager]]s, and other [[passerine]] birds in the family [[Thraupidae]]", parents = {"burung tenggek"}, } labels["temnospondyls"] = { type = "set", description = "=[[extinct]] early [[amphibian]]s in the [[order]] [[Temnospondyli]]", parents = {"amfibia"}, } labels["tenebrionoid beetles"] = { type = "set", description = "=[[aderid]]s, [[anthicid]]s, [[blister beetle]]s, [[borid]]s, [[ciid]]s, [[flour beetle]]s, [[darkling beetle]]s, [[mealworm]]s, [[melandryid]]s, [[mordellid]]s, [[mycetophagid]]s, [[oedemerid]]s, [[pinacate beetle]]s, [[pyrochroid]]s, [[pythid]]s, [[ripiphorid]]s, [[salpingid]]s, [[toktokkie]]s, [[ulodid]]s, [[wharf borer]]s, [[zopherid]]s and other [[beetle]]s in the [[superfamily]] [[Tenebrionoidea]]", parents = {"beetles"}, } labels["tephritoid flies"] = { type = "set", description = "=[[cheese fly|cheese flies]], [[tephritid]] [[fruit fly|fruit flies]], [[picture-winged fly|picture-winged flies]] and 
other [[fly|flies]] in the [[dipteran]] [[superfamily]] [[Tephritoidea]]", parents = {"Diptera"}, } labels["termites"] = { type = "set", description = "=[[termite]]s, [[insect]]s in the former [[order]] [[Isoptera]], which is now considered a [[suborder]] or other group within the [[cockroach]]es in the order [[Blattodea]]", parents = {"serangga", "cockroaches"}, } labels["terns"] = { type = "set", description = "=[[tern]]s, [[burung laut]] in the [[family]] [[Sternidae]]", parents = {"burung laut"}, } labels["tetraodontiforms"] = { type = "set", description = "=[[pufferfish]], [[triggerfish]], [[boxfish]], [[ocean sunfish]] and other ikan in the order [[Tetraodontiformes]]", parents = {"ikan"}, } labels["terriers"] = { type = "set", description = "default", parents = {"hunting dogs"}, } labels["theropods"] = { type = "set", description = "=[[dinosaur]]s in the [[clade]] [[Theropoda]]", parents = {"dinosaur"}, } labels["thrushes"] = { type = "set", description = "default", parents = {"burung tenggek"}, } labels["ticks"] = { type = "set", description = "=[[bloodsucking]] [[araknid]] in the [[order]] [[Ixodida]] (also known as [[Metastigmata]])", parents = {"mites and ticks"}, } labels["tinamous"] = { type = "set", description = "default", parents = {"ratites"}, } labels["tits"] = { type = "set", description = "=[[tit]]s, birds known as [[chickadee]]s in the US", parents = {"burung tenggek"}, } labels["toads"] = { type = "set", description = "default", parents = {"anurans"}, } labels["toothcarps"] = { type = "set", description = "=[[four-eyed fish]], [[guppy|guppies]], [[killifish]], [[molly|mollies]], [[mummichog]]s, [[platy|platies]], [[swordtail]]s, [[topminnow]]s and other ikan in the [[order]] [[Cyprinodontiformes]]", parents = {"ikan"}, } labels["tortoises"] = { type = "set", description = "=[[terrestrial]] [[turtle]]s in the [[family]] [[Testudinidae]]", parents = {"turtles"}, } labels["tortricid moths"] = { type = "set", description = "=[[moth]]s (and 
[[caterpillar]]s) in the [[family]] [[Tortricidae]]", parents = {"moths"}, } labels["ikan trakinoid"] = { type = "set", description = "=[[black swallower]]s, [[blue cod]], [[duckbill]]s, [[gaper]]s, [[sand eel]]s, [[torrentfish]], [[weeverfish]] and other ikan in the [[perciform]] [[suborder]] [[Trachinoidei]]", parents = {"ikan"}, } labels["toy dogs"] = { type = "set", description = "default", parents = {"anjing"}, } labels["trilobites"] = { type = "set", description = "default", parents = {"artropod"}, } labels["true bugs"] = { type = "set", description = "=[[insect]]s in the [[hemipteran]] suborder [[Heteroptera]]", parents = {"hemipterans"}, } labels["true finches"] = { type = "set", description = "=[[finch]]es in the [[passerine]] family [[Fringillidae]]", parents = {"burung tenggek"}, } labels["true jellyfish"] = { type = "set", description = "=[[cnidarian]]s in the [[class]] [[Scyphozoa]]", parents = {"knidaria"}, } labels["true sparrows"] = { type = "set", description = "=[[passerine]] birds in the family [[Passeridae]] (for other birds called sparrows, see the [[emberizid]]s)", parents = {"burung tenggek"}, } labels["tubenose birds"] = { type = "set", description = "=[[albatross]]es, [[fulmar]]s, [[petrel]]s, [[prion]]s, [[shearwater]]s, and other [[seabird]]s in the [[order]] [[Procellariiformes]]", parents = {"burung laut"}, } labels["tunicates"] = { type = "set", description = "default", parents = {"haiwan"}, } labels["turtles"] = { type = "set", description = "default", parents = {"reptilia"}, } labels["tyrant flycatchers"] = { type = "set", description = "=[[passerine]] birds in the family [[Tyrannidae]]", parents = {"suboscines"}, } labels["ursids"] = { type = "set", description = "=[[ursid]]s ([[bear]]s)", parents = {"karnivor"}, } labels["Venerida order mollusks"] = { type = "set", description = "=[[basket clam]]s, [[bean clam]]s, [[boring clam]]s, [[cockle]]s, [[duck clam]]s, [[giant clam]]s, [[hard clam]]s, [[lentil shell]]s, [[pipi]]s, 
[[pooquaw]]s, [[quahog]]s, [[surf clam]]s, [[trough-shell]]s, [[ugari]]s, [[Venus clam]]s, [[zebra mussel]]s, and other [[bivalve]]s in the [[order]] [[Venerida]]", parents = {"bivalvia"}, } labels["vertebrat"] = { type = "set", description = "default", parents = {"kordata"}, } labels["vespids"] = { type = "set", description = "=[[hornet]]s, [[paper wasp]]s, [[pollen wasp]]s, [[potter wasp]]s, [[yellow jacket]]s, and other [[wasp]]s in the [[family]] [[Vespidae]]", parents = {"Hymenoptera"}, } labels["vetigastropod"] = { type = "set", description = "=[[abalone]]s or [[ear shell]]s, [[duck's-bill limpet]]s, [[keyhole limpet]]s, [[rosary shell]]s, [[slit-shell]]s, [[topshell]]s, [[turban shell]]s, and other [[gastropod]]s in the [[clade]] [[Vetigastropoda]] (treated in some classifications as an [[order]], in others as a [[subclass]])", parents = {"gastropod"}, } labels["vipers"] = { type = "set", description = "=[[adder]]s, [[asp]]s, [[rattlesnake]]s, [[viper]]s, [[water moccasin]]s and other [[venomous]] ular in the [[Viperidae]]", parents = {"ular"}, } labels["viverrids"] = { type = "set", description = "=[[viverrid]]s ([[civet]]s, [[genet]]s and relatives)", parents = {"karnivor"}, } labels["vombatiforms"] = { type = "set", description = "=[[diprotodontid]]s, [[diprotodon]]s, [[phascolarctid]]s, [[koala]]s, [[vombatid]]s, [[wombat]]s, [[phascolome]]s, [[ilariid]]s, [[maradid]]s, [[palorchestid]]s, [[thylacoleonid]]s, [[marsupial lion]]s, [[wynyardiid]]s and other [[marsupial]]s in the [[diprotodont]] [[suborder]] [[Vombatiformes]]", parents = {"marsupials"}, } labels["vultures"] = { type = "set", description = "=[[vulture]]s (both Old World and New World)", parents = {"burung pemangsa"}, } labels["warblers"] = { type = "set", description = "=[[warbler]]s, various small [[passerine]] songbirds, especially of the families Sylviidae (Old World warblers) and Parulidae (New World warblers)", parents = {"burung tenggek"}, } labels["warren hounds"] = { type = "set",
description = "default", parents = {"hunting dogs"}, } labels["water dogs"] = { type = "set", description = "default", parents = {"retrievers"}, } labels["weaver finches"] = { type = "set", description = "=[[finch]]es in the family [[Estrildidae]]", parents = {"burung tenggek"}, } labels["weaverbirds"] = { type = "set", description = "=[[baya]]s, [[bishop]]s, [[fody|fodies]], [[malimbe]]s, [[quelea]]s, [[sakabula]]s, [[taha]]s, [[weaver]]s, and other birds in the [[family]] [[Ploceidae]]", parents = {"burung tenggek"}, } labels["weevils"] = { type = "set", description = "=[[bill-beetle]]s, [[curculio]]s, [[grugru worm]]s, [[snout beetle]]s, and other [[beetle]]s in the [[superfamily]] [[Curculionoidea]]", parents = {"beetles"}, } labels["paus"] = { type = "set", description = "default", parents = {"setasea"}, } labels["wolves"] = { type = "set", description = "=[[wolves]]", parents = {"kanid"}, } labels["woodpeckers"] = { type = "set", description = "=[[flicker]]s, [[sapsucker]]s, [[wryneck]]s, and other birds in the [[family]] [[Picidae]]", parents = {"piciforms"}, } labels["working dogs"] = { type = "set", description = "default", parents = {"anjing"}, } labels["cacing"] = { type = "set", description = "default", parents = {"haiwan"}, } labels["wrasses"] = { type = "set", description = "=ikan in the family [[Labridae]]", parents = {"ikan labroid"}, } labels["wrens"] = { type = "set", description = "default", parents = {"burung sertioid"}, } labels["ikan zoarkoid"] = { type = "set", description = "=[[butterfish]], [[eelpout]]s, [[guffer]]s, [[gunnel]]s, [[lumper]]s, [[prickleback]]s, [[prowfish]], [[wolf eel]]s and other fish in the [[perciform]] [[suborder]] [[Zoarcoidei]]", parents = {"ikan"}, } labels["zygaenoid moths"] = { type = "set", description = "=[[burnet moth]]s, [[forester]]s, [[hag moth]]s, [[limacodid]]s, [[megalopygid]]s, [[monkey slug]]s, [[puss moth]]s, [[saddleback caterpillar]]s, [[zygaenid]]s, and other [[moth]]s in the [[superfamily]] 
[[Zygaenoidea]]", parents = {"moths"}, } labels["plesiosaurs"] = { type = "set", description = "=[[plesiosaur]]s (order †[[Plesiosauria]])", parents = {"sauropterygians"}, } labels["tarantulas"] = { type = "set", description = "=[[tarantula]]s (family [[Theraphosidae]])", parents = {"mygalomorph spiders"}, } return labels oyregq00fdgn0aur0muspidpkz7vntv Modul:category tree/topic/Nature 828 11535 281323 263976 2026-04-22T00:33:34Z PeaceSeekers 3334 281323 Scribunto text/plain local labels = {} labels["alam semula jadi"] = { type = "berkenaan", description = "default", parents = {"semua topik"}, } labels["bentuk muka bumi"] = { type = "set", description = "=jenis bentuk muka bumi semula jadi", parents = {"alam semula jadi"}, } labels["asid"] = { type = "set", description = "default", parents = {"jirim"}, } labels["unsur kimia siri aktinid"] = { type = "set", description = "{{{langname}}} terms for those chemical elements in the {{w|f-block}} of the [[periodic table]] with [[atomic number]]s from 89 to 103.", parents = {"unsur kimia", "logam", "keradioaktifan"}, } labels["udara"] = { type = "berkenaan", description = "default", parents = {"atmosfera"}, } labels["logam alkali"] = { type = "set", description = "{{{langname}}} terms for [[alkali metal]]s, chemical elements in [[w:Group (periodic table)|group]] 1 of the [[periodic table]], which all have one [[valence electron]].", parents = {"unsur kimia", "logam"}, } labels["logam bumi beralkali"] = { type = "set", description = "{{{langname}}} terms for [[alkaline earth metal]]s, chemical elements in [[w:Group (periodic table)|group]] 2, which all have two [[valence electron]]s.", parents = {"unsur kimia", "logam"}, } labels["alkaloid"] = { type = "set", description = "default", parents = {"sebatian organik"}, } labels["aloi"] = { type = "set", description = "default", parents = {"logam"}, } labels["aluminium"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan boron"}, } labels["asid amino"] = 
{ type = "set", description = "default", parents = {"asid karboksilik"}, } labels["bunyi haiwan"] = { type = "set", description = "default", parents = {"bunyi-bunyi", "penyuaraan"}, } labels["kebajikan haiwan"] = { type = "berkenaan", description = "{{{langname}}} terms closely associated with [[animal welfare]].", parents = {"etika"}, } labels["antijirim"] = { type = "berkenaan", description = "default", parents = {"jirim"}, } labels["antimoni"] = { type = "berkenaan", description = "default", parents = {"pniktogen"}, } labels["argon"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["arsenik"] = { type = "berkenaan", description = "default", parents = {"pniktogen"}, } labels["astatin"] = { type = "berkenaan", description = "default", parents = {"halogen"}, } labels["asteroid"] = { type = "set", description = "default", parents = {"jasad cakerawala"}, } labels["atmosfera"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["fenomena atmosfera"] = { type = "set", description = "default", parents = {"atmosfera"}, } labels["musim luruh"] = { type = "berkenaan", description = "default", parents = {"musim"}, } labels["barium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["barion"] = { type = "set", description = "default", parents = {"hadrons"}, } labels["berilium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["kelahiran"] = { type = "berkenaan", description = "default", parents = {"pembiakan"}, } labels["bismut"] = { type = "berkenaan", description = "default", parents = {"pniktogen"}, } labels["boron"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan boron"}, } labels["unsur kumpulan boron"] = { type = "set", description = "{{{langname}}} terms for chemical elements in [[w:Group (periodic table)|group]] 13 of the [[periodic table]], which all have three [[valence 
electron]]s.", parents = {"unsur kimia"}, } labels["boson"] = { type = "set", description = "default", parents = {"zarah subatom"}, } labels["bromin"] = { type = "berkenaan", description = "default", parents = {"halogen"}, } labels["kadmium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["kalsium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["karbohidrat"] = { type = "set", description = "default", parents = {"sebatian organik"}, } labels["karbon"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan karbon"}, } labels["unsur kumpulan karbon"] = { type = "set", description = "Perkataan bahasa {{{langname}}} bagi unsur-unsur kimia dalam [[w:Kumpulan (jadual berkala)|kumpulan]] 14 dalam [[jadual berkala]] yang memiliki empat [[elektron valens]].", parents = {"unsur kimia"}, } labels["asid karboksilik"] = { type = "set", description = "default", parents = {"asid", "sebatian organik"}, } labels["jasad cakerawala"] = { type = "set", description = "{{{langname}}} terms for various [[celestial body|celestial bodies]]; things found in outer space.", parents = {"angkasa"}, } labels["serium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["sesium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["kalkogen"] = { type = "set", description = "{{{langname}}} terms for chemical elements in [[w:Group (periodic table)|group]] 16 of the [[periodic table]], which all have 6 [[valence electron]]s.", parents = {"unsur kimia"}, } labels["unsur kimia"] = { type = "set", description = "default", parents = {"jirim"}, } labels["isomer kimia"] = { type = "berkenaan", description = "default", parents = {"jirim", "kimia fizik", "bentuk"}, } labels["proses kimia"] = { type = "set", description = "=[[chemical]] [[process]]es", parents = {"alam semula jadi"}, } labels["klorin"] = { type =
"berkenaan", description = "default", parents = {"halogen"}, } labels["kromium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["planet klasik"] = { type = "name", description = "{{{langname}}} names for the [[classical planet]]s of our Solar System.", parents = {"jasad cakerawala"}, } labels["perubahan iklim"] = { type = "berkenaan", description = "=[[anthropogenic]] [[climate change]]", parents = {"alam semula jadi"}, } labels["awan"] = { type = "set", description = "default", parents = {"fenomena atmosfera"}, } labels["arang batu"] = { type = "berkenaan", description = "default", parents = {"bahan api fosil"}, } labels["kobalt"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["koenzim"] = { type = "set", description = "default", parents = {"enzim"}, } labels["warna"] = { type = "set", description = "default", parents = {"cahaya", "penglihatan"}, } for _, color_etc in ipairs { {"hitam"}, {"biru"}, {"perang"}, {"hijau"}, {"kelabu"}, {"jingga"}, {"merah jambu"}, {"ungu"}, {"merah"}, {"putih"}, {"kuning"}, } do local color, desc = unpack(color_etc) desc = desc or ("[[%s]]"):format(color) labels[color] = { type = "set", description = ("=shades of the [[color]] %s"):format(desc), parents = {"warna"}, } end labels["warna pelangi"] = { type = "set", description = "=[[warna]] dalam [[pelangi]]", parents = {"warna"}, } labels["pembakaran"] = { type = "berkenaan", description = "default", parents = {"proses kimia"}, } labels["titik kompas"] = { type = "set", description = "default", parents = {"arah", "navigasi"}, } labels["copper"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["hablur"] = { type = "berkenaan", description = "default", parents = {"jirim", "kimia fizik"}, } labels["kegelapan"] = { type = "berkenaan", description = "default", parents = {"cahaya"}, } labels["arah"] = { type = "set", description = "default", parents = {"alam semula 
jadi"}, } labels["jarak"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["dadah"] = { type = "set", description = "default", parents = {"jirim", "farmakologi"}, } labels["kekeringan"] = { type = "berkenaan", description = "default", parents = {"cecair"}, } labels["planet kerdil Sistem Suria"] = { type = "name", description = "=[[planet kerdil]] di [[Sistem Suria]]", parents = {"jasad cakerawala"}, } labels["pewarna"] = { type = "set", description = "default", parents = {"jirim", "pigmen"}, } labels["tenaga"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["enzim"] = { type = "set", description = "default", parents = {"protein", "pemangkinan"}, } labels["europium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["bahan letupan"] = { type = "set", description = "default", parents = {"jirim", "senjata"}, } labels["warna mata"] = { type = "set", description = "=[[color]]s that are mostly or exclusively used of [[eye]]s", parents = {"warna", "mata"}, } labels["asid lemak"] = { type = "set", description = "default", parents = {"asid karboksilik"}, } labels["fermion"] = { type = "set", description = "default", parents = {"zarah subatom"}, } labels["api"] = { type = "berkenaan", description = "default", parents = {"pembakaran", "sumber cahaya"}, wp = "Api", } labels["fluorin"] = { type = "berkenaan", description = "default", parents = {"halogen"}, } labels["kabus"] = { type = "berkenaan", description = "default", parents = {"cuaca", "air"}, } labels["bahan api fosil"] = { type = "set", description = "default", parents = {"karbon", "tenaga", "sumber asli"}, } labels["fransium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["gadolinium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["galaksi"] = { type = "set", description = "default", parents = 
{"jasad cakerawala"}, } labels["gallium"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan boron"}, } labels["gas"] = { type = "set", description = "default", parents = {"jirim"}, } labels["germanium"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan karbon"}, } labels["gold"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["hadrons"] = { type = "set", description = "default", parents = {"zarah subatom"}, } labels["warna rambut"] = { type = "set", description = "=[[color]]s that are mostly or exclusively used of [[hair]]", parents = {"warna", "rambut"}, } labels["halogen"] = { type = "set", description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 17 of the [[periodic table]], which all have 7 [[valence electron]]s", parents = {"unsur kimia"}, } labels["helium"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["heroin"] = { type = "berkenaan", description = "default", parents = {"dadah rekreasi"}, } labels["ketinggian"] = { type = "berkenaan", description = "default", parents = {"jarak"}, } labels["warna kuda"] = { type = "set", description = "=[[color]]s that are mostly or exclusively used of [[horse]]s", parents = {"warna", "kuda"}, } labels["hydrogen"] = { type = "berkenaan", description = "default", parents = {"unsur kimia"}, } labels["ais"] = { type = "berkenaan", description = "default", parents = {"air"}, } labels["indium"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan boron"}, } labels["sebatian tak organik"] = { type = "set", description = "default", parents = {"jirim"}, } labels["iodine"] = { type = "berkenaan", description = "default", parents = {"halogen"}, } labels["ion"] = { type = "set", description = "default", parents = {"jirim", "kimia", "keelektrikan"}, } labels["iridium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["iron"] = { 
type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["isotop"] = { type = "set", description = "default", parents = {"unsur kimia"}, } labels["krypton"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["unsur kimia siri lantanid"] = { type = "set", description = "=[[chemical element]]s in the {{w|f-block}} of the [[periodic table]] with [[atomic number]]s from 57 to 71", parents = {"unsur kimia"}, } labels["lanthanum"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["lead"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan karbon"}, } labels["panjang"] = { type = "berkenaan", description = "default", parents = {"jarak"}, } labels["lepton"] = { type = "set", description = "default", parents = {"fermion"}, } labels["kehidupan"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["cahaya"] = { type = "berkenaan", description = "default", parents = {"tenaga"}, } labels["sumber cahaya"] = { type = "set", description = "default", parents = {"cahaya"}, } labels["kilat"] = { type = "berkenaan", description = "default", parents = {"cuaca", "keelektrikan"}, } labels["cecair"] = { type = "set", description = "default", -- At what temperature? 
parents = {"jirim"}, } labels["lithium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["magnesium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["manganese"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["Marikh"] = { type = "berkenaan", description = "=planet [[Marikh]]", parents = {"planet Sistem Suria"}, } labels["marijuana"] = { type = "berkenaan", description = "default", parents = {"hemp family plants", "dadah rekreasi"}, } labels["jirim"] = { type = "berkenaan", description = "=physical [[matter]]", parents = {"alam semula jadi", "kimia"}, } labels["mercury (element)"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["mesons"] = { type = "set", description = "default", parents = {"hadrons"}, } labels["metaloid"] = { type = "set", description = "default", parents = {"unsur kimia"}, } labels["logam"] = { type = "set", description = "default", parents = {"jirim", "metalurgi"}, } labels["mineral"] = { type = "set", description = "default", parents = {"jirim", "mineralogi"}, } labels["molibdenum"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["Bulan"] = { type = "berkenaan", description = "=[[Bulan]], satelit semula jadi Bumi", parents = {"alam semula jadi", "cahaya", "badan samawi", "satelit semula jadi"}, } labels["satelit semula jadi"] = { type = "berkenaan", description = "default", parents = {"badan samawi"}, } for _, planet in ipairs {"Marikh", "Haumea", "Musytari", "Zuhal", "Neptun", "Uranus", "Pluto"} do labels["bulan " .. 
planet] = { type = "name", description = ("=[[bulan]] yang mengelilingi orbit [[%s]]"):format(planet), parents = {"satelit-satelit bulan"}, } end labels["produk semula jadi (kimia)"] = { type = "name", description = "=[[organic compound]]s produced by living [[organism]]s", parents = {"sebatian organik"}, } labels["sumber asli"] = { type = "set", description = "default", parents = {"jirim"}, } labels["neodimium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["neon"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["neurotoksin"] = { type = "set", description = "default", parents = {"racun", "neurosains"}, } labels["nickel"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["niobium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["nitrogen"] = { type = "berkenaan", description = "default", parents = {"pniktogen"}, } labels["gas adi"] = { type = "set", description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 18 of the [[periodic table]], which all have a full set of [[valence electron]]s: 2 for helium and 8 for the others", parents = {"unsur kimia", "gas"}, } labels["sebatian organik"] = { type = "set", description = "default", parents = {"jirim"}, } labels["osmium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["oxygen"] = { type = "berkenaan", description = "default", parents = {"kalkogen"}, } labels["palladium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["petroleum"] = { type = "berkenaan", description = "default", parents = {"bahan api fosil", "cecair"}, } labels["pharmaceutical drugs"] = { type = "set", description = "{{{langname}}} names for [[pharmaceutical#Adjective|pharmaceutical]] [[drug#Noun|drugs]].", parents = {"dadah"}, } labels["pharmaceutical effects"] = { type = "set", description 
= "{{{langname}}} names for [[pharmaceutical#Adjective|pharmaceutical]] [[effect#Noun|effects]].", parents = {"farmakologi"}, } labels["fosforus"] = { type = "berkenaan", description = "default", parents = {"pniktogen"}, } labels["pigmen"] = { type = "set", description = "default", parents = {"warna"}, } labels["planetoid"] = { type = "set", description = "default", parents = {"jasad cakerawala"}, } labels["planet"] = { type = "set", description = "default", parents = {"jasad cakerawala"}, } labels["planet Sistem Suria"] = { type = "name", description = "=[[planet]]s of our [[Solar System]]", parents = {"planet"}, } labels["platinum"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["Pluto"] = { type = "berkenaan", description = "=the dwarf planet [[Pluto]]", parents = {"planet kerdil Sistem Suria"}, } labels["pniktogen"] = { type = "set", description = "=[[chemical element]]s in [[w:Group (periodic table)|group]] 15 of the [[periodic table]], which all have 5 [[valence electron]]s", parents = {"unsur kimia"}, } labels["racun"] = { type = "set", description = "default", parents = {"jirim"}, } labels["kalium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["praseodymium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["promesium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["kuark"] = { type = "set", description = "default", parents = {"fermion"}, } labels["sinaran"] = { type = "berkenaan", description = "default", parents = {"tenaga"}, } labels["keradioaktifan"] = { type = "berkenaan", description = "default", parents = {"sinaran", "fizik nuklear"}, } labels["radium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["radon"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["hujan"] = { type = "berkenaan", 
description = "default", parents = {"cuaca", "air"}, } labels["dadah rekreasi"] = { type = "set", description = "default", parents = {"dadah"}, } labels["rodium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["rubidium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["rutenium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["samarium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri lantanid"}, } labels["skandium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["selenium"] = { type = "berkenaan", description = "default", parents = {"kalkogen"}, } labels["bayang"] = { type = "berkenaan", description = "default", parents = {"kegelapan"}, } labels["senyap"] = { type = "berkenaan", description = "default", parents = {"bunyi"}, } labels["silikon"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan karbon"}, } labels["silver"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["saiz"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["salji"] = { type = "berkenaan", description = "default", parents = {"cuaca", "air"}, } labels["natrium"] = { type = "berkenaan", description = "default", parents = {"logam alkali"}, } labels["bunyi"] = { type = "berkenaan", description = "default", parents = {"tenaga"}, } labels["bunyi-bunyi"] = { type = "set", description = "default", parents = {"bunyi"}, } labels["angkasa"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["musim bunga"] = { type = "berkenaan", description = "default", parents = {"musim"}, } labels["skluark"] = { type = "set", description = "default", parents = {"fermion"}, } labels["bintang"] = { type = "set", description = "{{{langname}}} names of individual [[star]]s, not including the 
[[Sun]].", parents = {"jasad cakerawala"}, } labels["steroid"] = { type = "set", description = "default", parents = {"sebatian organik"}, } labels["kekuatan"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi", "health"}, } labels["strontium"] = { type = "berkenaan", description = "default", parents = {"logam bumi beralkali"}, } labels["zarah subatom"] = { type = "set", description = "default", parents = {"jirim", "particle physics"}, } labels["asid gula"] = { type = "set", description = "default", parents = {"asid karboksilik", "karbohidrat"}, } labels["gula"] = { type = "set", description = "default", parents = {"karbohidrat"}, } labels["sulfur"] = { type = "berkenaan", description = "default", parents = {"kalkogen"}, } labels["musim panas"] = { type = "berkenaan", description = "default", parents = {"musim"}, } labels["matahari"] = { type = "berkenaan", description = "=[[Matahari]]", parents = {"alam semula jadi", "cahaya", "jasad cakerawala"}, } labels["tantalum"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["teknesium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["telurium"] = { type = "berkenaan", description = "default", parents = {"kalkogen"}, } labels["suhu"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi", "cuaca"}, } labels["teratogen"] = { type = "set", description = "default", parents = {"racun"}, } labels["talium"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan boron"}, } labels["torium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri aktinid"}, } labels["timah"] = { type = "berkenaan", description = "default", parents = {"unsur kumpulan karbon"}, } labels["titanium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["tembakau"] = { type = "berkenaan", description = "default", parents = {"nightshades", "dadah 
rekreasi", "merokok"}, } labels["logam peralihan"] = { type = "set", description = "{{{langname}}} terms for [[chemical element]]s in [[w:Group (periodic table)|group]]s 3 to 12 of the [[periodic table]], which are also in the {{w|d-block}} of the [[periodic table]] ", parents = {"unsur kimia", "logam"}, } labels["tungsten"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["jenis planet"] = { type = "type", topic = "planet", description = "=[[planet]]", parents = {"planet"}, } labels["uranium"] = { type = "berkenaan", description = "default", parents = {"unsur kimia siri aktinid"}, } labels["vanadium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["penyuaraan"] = { type = "set", description = "default", parents = {"bunyi-bunyi", "communication"}, } labels["air"] = { type = "berkenaan", description = "default", parents = {"cecair"}, } labels["air terjun"] = { type = "berkenaan", description = "default", parents = {"air"}, } labels["cuaca"] = { type = "berkenaan", description = "default", parents = {"atmosfera"}, } labels["berat"] = { type = "berkenaan", description = "default", parents = {"alam semula jadi"}, } labels["kebasahan"] = { type = "berkenaan", description = "default", parents = {"cecair"}, } labels["angin"] = { type = "berkenaan", description = "default", parents = {"cuaca"}, } labels["musim sejuk"] = { type = "berkenaan", description = "default", parents = {"musim"}, } labels["xenon"] = { type = "berkenaan", description = "default", parents = {"gas adi"}, } labels["itrium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["zink"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } labels["zirkonium"] = { type = "berkenaan", description = "default", parents = {"logam peralihan"}, } return labels mh4bqwiwqs0cj7btt0bm84te04937p3 Modul:headword/data 828 11806 281245 281234 2026-04-21T13:26:02Z PeaceSeekers 
3334 Dilindungi "[[Modul:headword/data]]": dia dah siap ([Sunting=Benarkan penyelia sahaja] (tak terbatas)) 281234 Scribunto text/plain local headword_page_module = "Module:headword/page" local list_to_set = require("Module:table").listToSet local data = {} ------ 1. Lists which are converted into sets. ------ --[==[ var: Large pages where we disable label tracking, red link checking and similar. ]==] data.large_pages = list_to_set { -- pages that consistently hit timeouts "a", -- pages that sometimes hit timeouts "A", "baba", "de", "e", "i", "lima", "o", "u", "и", "山", "子", "月", "一", "人", } --[==[ var: Map from singular to plural, and from plural to itself, for recognized parts of speech with irregular plurals. Most of these are invariable plurals, e.g. `kanji` is its own plural; but we also have `mora` plural `morae`. ]==] data.irregular_plurals = list_to_set({ "cmavo", "cmene", "fu'ivla", "gismu", "Han tu", "hanja", "Hanzi", "jyutping", "kana", "Kanji", "lujvo", "phrasebook", "Pinyin", "rafsi", }, function(_, item) return item end) local irregular_plurals = data.irregular_plurals -- Irregular non-zero plurals AND any regular plurals where the singular ends in "s", -- because the module assumes that inputs ending in "s" are plurals. The singular and -- plural both need to be added, as the module will generate a default plural if -- the input doesn't match a key in this table. for sg, pl in next, { mora = "morae" } do irregular_plurals[sg], irregular_plurals[pl] = pl, pl end --[==[ var: Recognized lemmas. If the part of speech in {{tl|head}} is set to one of these or its singular equivalent, the category 'LANG lemmas' will automatically be added. 
If the part of speech is not a singular or plural lemma or non-lemma form and is not an abbreviation that expands to a recognized lemma or non-lemma form, the page will be added to various tracking categories: * [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos]] * [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/LANG]] * [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/pos/POS]] * [[Special:WhatLinksHere/Wiktionary:Tracking/headword/unrecognized pos/pos/POS/LANG]] ]==] data.lemmas = list_to_set{ "Kependekan", "Akronim", "Kata sifat", "kata sifat", "Kata adjektif", -- alias "kata sifat" "kata adjektif", -- alias "kata sifat" "adnominals", "adpositions", "adverba", "Adverba", "Kata keterangan", "kata keterangan", "Imbuhan", "imbuhan", "ambipositions", "Kata sandang", "kata sandang", "Apitan", "apitan", "circumpositions", "Penjodoh bilangan", "penjodoh bilangan", "cmavo", "cmavo clusters", "cmene", "Bentuk gabungan", "Kata hubung", "kata hubung", "counters", "Penunjuk", "Tanda diakritik", "Digraf", "equative adjectives", "fu'ivla", "gismu", "Aksara Han", "Han tu", "hanja", "hanzi", "Hanzi", "ideophones", "Simpulan bahasa", "Sisipan", "sisipan", "initialisms", "Tanda lelaran", "tanda lelaran", "interfixes", "Kata seru", "kata seru", "kana", "kanji", "Kanji", "Huruf", "ligatur", "Logogram", "lujvo", "morae", "Morfem", "non-constituents", "Kata nama", "kata nama", "Kata nama am", -- alias "kata nama" "kata nama am", -- alias "kata nama" "Nombor", "nombor", "Simbol angka", "Kata bilangan", "kata bilangan", "Partikel", "partikel", "Frasa", "frasa", "kata dudi", "Kata dudi", "postpositional phrases", "predicatives", "Awalan", "awalan", "Frasa sendi nama", "frasa sendi nama", "Kata sendi nama", "kata sendi nama", "preverbs", "pronominal adverbs", "Kata ganti nama", "kata ganti nama", "Kata nama khas", "kata nama khas", "Peribahasa", "peribahasa", "Tanda baca", "tanda baca", "relatives", "Akar", "Kata dasar", 
"kata dasar", "Akhiran", "akhiran", "Suku kata", "suku kata", "Simbol", "simbol", "Kata kerja", "kata kerja", } --[==[ var: Recognized non-lemma forms. If the part of speech in {{tl|head}} is set to one of these or its singular equivalent, the category 'LANG non-lemma forms' will automatically be added. If the part of speech is not a singular or plural lemma or non-lemma form and is not an abbreviation that expands to a recognized lemma or non-lemma form, the page will be added to various tracking categories; see the documentation of `data.lemmas`. ]==] data.nonlemmas = list_to_set{ "active participle forms", "active participles", "adjectival participles", "adjective case forms", "Bentuk kata sifat", "bentuk kata sifat", "Bentuk kata adjektif", -- alias "bentuk kata sifat" "bentuk kata adjektif", -- alias "bentuk kata sifat" "Bentuk feminin kata sifat", "Bentuk jamak kata sifat", "Bentuk adverba", "adverbial participles", "agent participles", "Bentuk artikel", "Bentuk apitan", "Bentuk gabungan", "comparative adjective forms", "comparative adjectives", "comparative adverb forms", "comparative adverbs", "conjunction forms", "contractions", "converbs", "determiner comparative forms", "determiner forms", "determiner superlative forms", "diminutive nouns", "elative adjectives", "equative adjective forms", "equative adjectives", "future participles", "gerund", "infinitive forms", "infinitives", "interjection forms", "jyutping", "Kesalahan ejaan", "negative participles", "nominal participles", "noun case forms", "noun dual forms", "Bentuk kata nama", "bentuk kata nama", "noun paucal forms", "Bentuk jamak kata nama", "noun possessive forms", "noun singulative forms", "numeral forms", "partisipel", "bentuk partisipel", "particle forms", "passive participles", "past active participles", "past participles", "past participle forms", "past passive participles", "perfect active participles", "perfect participles", "perfect passive participles", "Pinyin", "Jamak", "Bentuk kata 
dudi", "Bentuk awalan", "preposition contractions", "preposition forms", "prepositional pronouns", "present active participles", "present participles", "present passive participles", "Bentuk kata ganti nama", "bentuk kata ganti nama", "pronoun possessive forms", "Bentuk kata nama khas", "bentuk kata nama khas", "Bentuk jamak kata nama khas", "rafsi", "Perumian", "perumian", "root forms", "singulatives", "Bentuk akhiran", "superlative adjective forms", "Kata sifat superlatif", "superlative adverb forms", "superlative adverbs", "Bentuk kata kerja", "bentuk kata kerja", "verbal nouns", } --[==[ var: List of languages that will not have links to separate parts of the headword. ]==] data.no_multiword_links = list_to_set{ "zh", } --[==[ var: List of languages that will not have `LANG multiword terms` categories added. There are various reasons why languages are in this list: (a) words are written without spaces between them; (b) syllables are written with spaces between them; (c) variant reconstructions are notated with a tilde surrounded by spaces; (d) the language is a sign language, where pagenames are multiword descriptions of the gesture(s) required to make an individual sign; (e) some other weirdnesses. 
]==] data.no_multiword_cat = list_to_set{ -------- Languages without spaces between words (sometimes spaces between phrases) -------- "blt", -- Tai Dam "ja", -- Japanese "khb", -- Lü "km", -- Khmer "lo", -- Lao "mnw", -- Mon "my", -- Burmese "nan", -- Min Nan (some words in Latin script; hyphens between syllables) "nan-hbl", -- Hokkien (some words in Latin script; hyphens between syllables) "nod", -- Northern Thai "ojp", -- Old Japanese "shn", -- Shan "sou", -- Southern Thai "tdd", -- Tai Nüa "th", -- Thai "tts", -- Isan "twh", -- Tai Dón "txg", -- Tangut "zh", -- Chinese (all varieties with Chinese characters) "zkt", -- Khitan -------- Languages with spaces between syllables -------- "ahk", -- Akha "aou", -- A'ou "atb", -- Zaiwa "byk", -- Biao "cdy", -- Chadong --"duu", -- Drung; not sure --"hmx-pro", -- Proto-Hmong-Mien --"hnj", -- Green Hmong; not sure "huq", -- Tsat "ium", -- Iu Mien --"lis", -- Lisu; not sure "mtq", -- Muong --"mww", -- White Hmong; not sure "onb", -- Lingao --"sit-gkh", -- Gokhy; not sure --"swi", -- Sui; not sure "tbq-lol-pro", -- Proto-Loloish "tdh", -- Thulung "ukk", -- Muak Sa-aak "vi", -- Vietnamese "yig", -- Wusa Nasu "zng", -- Mang -------- Languages with ~ with surrounding spaces used to separate variants -------- "mkh-ban-pro", -- Proto-Bahnaric "sit-pro", -- Proto-Sino-Tibetan; listed above -------- Other weirdnesses -------- "mul", -- Translingual; gestures, Morse code, etc. 
"aot", -- Atong (India); bullet is a letter -------- All sign languages -------- "ads", "aed", "aen", "afg", "ase", "asf", "asp", "asq", "asw", "bfi", "bfk", "bog", "bqn", "bqy", "bvl", "bzs", "cds", "csc", "csd", "cse", "csf", "csg", "csl", "csn", "csq", "csr", "doq", "dse", "dsl", "ecs", "esl", "esn", "eso", "eth", "fcs", "fse", "fsl", "fss", "gds", "gse", "gsg", "gsm", "gss", "gus", "hab", "haf", "hds", "hks", "hos", "hps", "hsh", "hsl", "icl", "iks", "ils", "inl", "ins", "ise", "isg", "isr", "jcs", "jhs", "jls", "jos", "jsl", "jus", "kgi", "kvk", "lbs", "lls", "lsl", "lso", "lsp", "lst", "lsy", "lws", "mdl", "mfs", "mre", "msd", "msr", "mzc", "mzg", "mzy", "nbs", "ncs", "nsi", "nsl", "nsp", "nsr", "nzs", "okl", "pgz", "pks", "prl", "prz", "psc", "psd", "psg", "psl", "pso", "psp", "psr", "pys", "rms", "rsl", "rsm", "sdl", "sfb", "sfs", "sgg", "sgx", "slf", "sls", "sqk", "sqs", "ssp", "ssr", "svk", "swl", "syy", "tse", "tsm", "tsq", "tss", "tsy", "tza", "ugn", "ugy", "ukl", "uks", "vgt", "vsi", "vsl", "vsv", "xki", "xml", "xms", "ygs", "ysl", "zib", "zsl", } --[==[ var: List of languages where a hyphen is not considered a word separator for the `LANG multiword terms` category. There are numerous reasons why languages are in this list; by each language should be listed the reason for inclusion. 
]==] data.hyphen_not_multiword_sep = list_to_set{ "akk", -- Akkadian; hyphens between syllables "akl", -- Aklanon; hyphens for mid-word glottal stops "ber-pro", -- Proto-Berber; morphemes separated by hyphens "ceb", -- Cebuano; hyphens for mid-word glottal stops "cnk", -- Khumi Chin; hyphens used in single words "cpi", -- Chinese Pidgin English; Chinese-derived words with hyphens between syllables "de", -- German; too many false positives "esx-esk-pro", -- hyphen used to separate morphemes "fi", -- Finnish; hyphen used to separate components in compound words if the final and initial vowels match, respectively "gd", -- Scottish Gaelic; too many false positives like [[a-chianaibh]], [[a-nìos]], [[an-dè]] and other adverbs in a- and an- "hil", -- Hiligaynon; hyphens for mid-word glottal stops "hnn", -- Hanunoo; too many false positives "ilo", -- Ilocano; hyphens for mid-word glottal stops "kne", -- Kankanaey; hyphens for mid-word glottal stops "lcp", -- Western Lawa; dash as syllable joiner "lwl", -- Eastern Lawa; dash as syllable joiner "mfa", -- Pattani Malay in Thai script; dash as syllable joiner "mkh-vie-pro", -- Proto-Vietic; morphemes separated by hyphens "msb", -- Masbatenyo; too many false positives "tl", -- Tagalog; too many false positives "war", -- Waray-Waray; too many false positives "yo", -- Yoruba; hyphens used to show lengthened nasal vowels } --[==[ var: List of languages that will not have `LANG masculine nouns` and similar categories added. Generally, these languages are lacking gender but use the gender field for other purposes. (This is a massive hack and should be changed.) ]==] data.no_gender_cat = list_to_set{ -- Languages without gender but which use the gender field for other purposes "ja", "th", } --[==[ var: List of languages where [[Module:headword]] should not attempt to generate a transliteration even if the term is written in a non-Latin script. FIXME: Notate reasons why each language is in this list. 
]==] data.notranslit = list_to_set{ "ams", "az", "bbc", "bug", "cdo", "cia", "cjm", "cjy", "cmn", "cnp", "cpi", "cpx", "csp", "czh", "czo", "gan", "hak", "hnm", "hsn", "ja", "kzg", "lad", "ltc", "luh", "lzh", "mnp", "ms", "mul", "mvi", "nan", "nan-dat", "nan-hbl", "nan-hlh", "nan-lnx", "nan-tws", "nan-zhe", "nan-zsh", "och", "oj", "okn", "ryn", "rys", "ryu", "sh", "sjc", "tgt", "th", "tkn", "tly", "txg", "und", "vi", "wuu", "xug", "yoi", "yox", "yue", "za", "zh", "zhx-sic", "zhx-tai", } --[==[ var: List of languages that will default to `sccat` being true, i.e. categories like `LANG POS in SCRIPT script` will automatically be generated. This can be overridden using {{para|sccat|0}} in {{tl|head}} or setting `sccat` to `false` in Lua. ]==] data.default_sccat = list_to_set{ "inc-apa", "inc-ash", "kfr", "ks", "mr", "mwr", "inc-oaw", "inc-ohi", "omr", "inc-opa", "phr", "pi", "pra", "sa", "skr", "sd", } --[==[ var: List of script codes for which a script-tagged display title will be added. ]==] data.toBeTagged = list_to_set{ "Ahom", "Arab", "fa-Arab", "glk-Arab", "kk-Arab", "ks-Arab", "ku-Arab", "mzn-Arab", "ms-Arab", "ota-Arab", "pa-Arab", "ps-Arab", "sd-Arab", "tt-Arab", "ug-Arab", "ur-Arab", "Armi", "Armn", "Avst", "Bali", "Bamu", "Batk", "Beng", "as-Beng", "Bopo", "Brah", "Brai", "Bugi", "Buhd", "Cakm", "Cans", "Cari", "Cham", "Cher", "Copt", "Cprt", "Cyrl", "Cyrs", "Deva", "Dsrt", "Egyd", "Egyp", "Ethi", "Geok", "Geor", "Glag", "Goth", "Grek", "Polyt", "polytonic", "Gujr", "Guru", "Hang", "Hani", "Hano", "Hebr", "Hira", "Hluw", "Ital", "Java", "Kali", "Kana", "Khar", "Khmr", "Knda", "Kthi", "Lana", "Laoo", "Latn", "Latf", "Latg", "Latnx", "Latinx", "pjt-Latn", "Lepc", "Limb", "Linb", "Lisu", "Lyci", "Lydi", "Mand", "Mani", "Marc", "Merc", "Mero", "Mlym", "Mong", "mnc-Mong", "sjo-Mong", "xwo-Mong", "Mtei", "Mymr", "Narb", "Nkoo", "Nshu", "Ogam", "Olck", "Orkh", "Orya", "Osma", "Ougr", "Palm", "Phag", "Phli", "Phlv", "Phnx", "Plrd", "Prti", "Rjng", "Runr", "Samr", 
"Sarb", "Saur", "Sgnw", "Shaw", "Shrd", "Sinh", "Sora", "Sund", "Sylo", "Syrc", "Tagb", "Tale", "Talu", "Taml", "Tang", "Tavt", "Telu", "Tfng", "Tglg", "Thaa", "Thai", "Tibt", "Ugar", "Vaii", "Xpeo", "Xsux", "Yiii", "Zmth", "Zsym", "Ipach", "Music", "Rumin", } --[==[ var: Parts of speech which will not be categorised in categories like `English terms spelled with É` if the term is the character in question (e.g. the letter entry for English [[é]]). This contrasts with entries like the French adjective [[m̂]], which is a one-letter word spelled with the letter. ]==] data.pos_not_spelled_with_self = list_to_set{ "Tanda diakritik", "Aksara Han", "Han tu", "hanja", "hanzi", "Tanda lelaran", "kana", "kanji", "Huruf", "ligatur", "Logogram", "morae", "Simbol angka", "Kata bilangan", "Tanda baca", "Suku kata", "Simbol", } ------ 2. Lists not converted into sets. ------ --[==[ var: Recognized aliases for parts of speech (param 2=). Key is the short form and value is the canonical singular (not pluralized) form. It is singular so the same table can be used in [[Module:form of]] for the {{para|p}}/{{para|POS}} param and [[Module:links]] for the pos= param. Note that any part of speech, abbreviated or not, can be suffixed with `f` to generate the corresponding non-lemma form part of speech, such as `adjf`, `af` or `adjectivef` for `adjective form`, and `nounf` or `nf` for `noun form`. This expansion happens even when it does not make sense for the given part of speech (e.g. `pclf` expands to `particle form` and `symf` expands to `symbol form`), and currently also, at least in [[Module:headword]] (but not [[Module:links]]), even if the part before the `f` is not a recognized part of speech or abbreviation (hence `nerf` expands to `ner form`). 
]==] data.pos_aliases = { a = "kata sifat", adj = "kata sifat", ["Kata adjektif"] = "kata sifat", -- alias "kata sifat" ["kata adjektif"] = "kata sifat", -- alias "kata sifat" ["kata Adjektif"] = "kata sifat", -- alias "kata sifat" ["Kata Adjektif"] = "kata sifat", -- alias "kata sifat" adv = "adverba", aug = "augmentative", art = "kata sandang", cls = "penjodoh bilangan", compadj = "comparative adjective", compadv = "comparative adverb", compdet = "comparative determiner", comppron = "comparative pronoun", cnum = "nombor kardinal", conj = "conjunction", contr = "contraction", conv = "converb", det = "penunjuk", dim = "diminutive", int = "kata seru", interj = "kata seru", intj = "kata seru", n = "kata nama", ["Kata nama am"] = "kata nama", -- alias "kata nama" ["kata nama am"] = "kata nama", -- alias "kata nama" ["kata benda"] = "kata nama", -- alias "kata nama" ["Kata benda"] = "kata nama", -- alias "kata nama" ["Kata Benda"] = "kata nama", -- alias "kata nama" ["Kata am"] = "kata nama", -- alias "kata nama" ["kata am"] = "kata nama", -- alias "kata nama" na = "animate noun", ni = "inanimate noun", num = "kata bilangan", pastpart = "past participle", part = "partisipel", pcl = "partikel", phr = "frasa", pn = "kata nama khas", postp = "kata dudi", pref = "awalan", pre = "kata depan", prep = "kata depan", prepphr = "prepositional phrase", prespart = "present participle", pro = "kata ganti nama", pron = "kata ganti nama", prop = "kata nama khas", proper = "kata nama khas", propn = "kata nama khas", onum = "nombor ordinal", romanisation = "perumian", romanisations = "perumian", suf = "akhiran", supadj = "superlative adjective", supadv = "superlative adverb", supdet = "superlative determiner", suppron = "superlative pronoun", sym = "simbol", v = "kata kerja", vb = "kata kerja", vi = "kata kerja tak transitif", vm = "modal verb", vt = "kata kerja transitif", vii = "kata kerja tak transitif tidak bernyawa", vai = "kata kerja tak transitif bernyawa", vti = "kata kerja 
transitif tidak bernyawa", vta = "kata kerja transitif bernyawa", } --[==[ var: Map of parts of speech for which categories like `German masculine nouns` or `Russian imperfective verbs` will be generated if the headword is of the appropriate gender/number. The map is used to canonicalize parts of speech for categorization purposes; specifically, proper nouns are categorized like nouns. ]==] data.pos_for_gender_number_cat = { ["Kata nama"] = "Kata nama", ["Kata nama khas"] = "Kata nama", ["Akhiran"] = "Akhiran", -- We include verbs because impf and pf are valid "genders". ["Kata kerja"] = "Kata kerja", } --[==[ var: Lower limit for a "long" word in a particular language. Used to categorize terms into e.g. [[:Category:Long English words]] automatically. Languages with no mapping here do not get categorized. ]==] data.long_word_thresholds = { ["af"] = 20, ["bg"] = 20, ["cy"] = 25, ["de"] = 20, ["en"] = 25, ["es"] = 20, ["fr"] = 20, ["ka"] = 20, ["sv"] = 20, ["tl"] = 25, } ------ 3. Page-wide processing (so that it only needs to be done once per page). ------ data.page = require(headword_page_module).process_page() -- Set some page properties directly on `data` for ease of use. 
data.pagename = data.page.pagename data.encoded_pagename = data.page.encoded_pagename return data e25nfyx4xb9ot1xz9cga0fu4e00xa54 Modul:ja-kanji-readings 828 11835 281350 256040 2026-04-22T05:36:35Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88680772|88680772]]) 281350 Scribunto text/plain local export = {} local m_ja = require("Module:ja") local m_str_utils = require("Module:string utilities") local concat = table.concat local find = m_str_utils.find local get_script = require("Module:scripts").getByCode local hira_to_kata = m_ja.hira_to_kata local insert = table.insert local kana_to_romaji = require("Module:Hrkt-translit").tr local kata_to_hira = m_ja.kata_to_hira local gmatch = m_str_utils.gmatch local match = m_str_utils.match local split = m_str_utils.split local Jpan = get_script("Jpan") -- local katakana_script = get_script("Kana") local Hira = get_script("Hira") local PAGENAME = mw.loadData("Module:headword/data").pagename local NAMESPACE = mw.title.getCurrentTitle().nsText -- Only used by commented-out code. -- local data = mw.loadData("Module:ja/data") local CONCAT_SEP = ', ' local labels = { { text = "Go-on", text2 = "goon", classification = "on", }, { text = "Kan-on", text2 = "kan'on", classification = "on", }, { text = "Sō-on", text2 = "sōon", classification = "on", }, { text = "Tō-on", text2 = "tōon", classification = "on", }, { text = "Kan’yō-on", text2 = "kan'yōon", classification = "on", }, { entry = "on'yomi", text = "On", text2 = "on", classification = "on", unclassified = " (tidak dikelaskan)", }, { entry = "kun'yomi", text = "Kun", text2 = "kun", classification = "kun", }, { text = "Nanori", text2 = "nanori", classification = "nanori", }, } local function track(code) require("Module:debug").track("ja-kanji-readings/" .. 
code) end local function plain_link(data) data.term = data.term:gsub('[%.%- ]', '') -- 「かな-し.い」→「かなしい」, 「も-しく は」→「もしくは」 data.tr = data.tr and data.tr:gsub('[%.%-]', '') or '-' data.sc = match(data.term:gsub('[%z\1-\127]', ''), '[^' .. Hira:getCharacters() .. ']') and Jpan or Hira data.pos = data.pos ~= '' and data.pos or nil data.respect_link_tr = true return require("Module:links").full_link(data, "term") --"term" makes italic end --[=[ Copied from [[Module:ja]] on 2017/6/14. Replaces the code in Template:ja-readings which accepted kanji readings, and displayed them in a consistent format. Substantial change in function was introduced in https://en.wiktionary.org/w/index.php?diff=46057625 ]=] function export.show(frame) local args = require("Module:parameters").process(frame:getParent().args, { ["goon"] = {}, ["kanon"] = {}, ["soon"] = {}, ["toon"] = {}, ["on"] = {}, ["kanyoon"] = {}, ["kun"] = {}, ["nanori"] = {}, ["pagename"] = {}, }) local lang_code = frame.args[1] or 'ja' local lang = require'Module:languages'.getByCode(lang_code) local lang_name = lang:getCanonicalName() if args.pagename and NAMESPACE == "" then error("Parameter pagename tidak boleh digunakan dalam penyertaan, kerana hanya untuk ujian.") end local pagename = args.pagename or PAGENAME local yomi_data = mw.loadData("Module:ja/data/jouyou-yomi").yomi -- this holds the finished product composed of wikilinks to be displayed -- in the Readings section under the Kanji section local links, categories = {}, {} local is_old_format = false -- We need a separate kanji sortkey module. local sortkey = (require("Module:Hani-sortkey").makeSortKey(pagename, lang_code, "Jpan")) local function add_reading_category(reading, subtype, period) reading = kata_to_hira(reading:gsub("[%. ]+", ""):gsub("%-$", ""):gsub("%-", "・")) if subtype then return insert(categories, '[[Kategori:Kanji dengan bacaan ' .. (period or '') .. ' ' .. subtype .. ' ' .. reading .. ' bahasa ' .. lang_name .. '|' .. sortkey .. 
']]') else return insert(categories, '[[Kategori:Kanji dibaca sebagai ' .. reading .. ' bahasa ' .. lang_name .. '|' .. sortkey .. ']]') end end local unclassified_on = {} local classified_on = {} local kun = {} local kana = "[ぁ-ー]" for _, label in ipairs(labels) do local readings = args[label.text2:gsub('ō', 'o'):gsub('\'', '')] if readings then local unclassified = "" if label.unclassified then if not (args.goon or args.kanon or args.soon or args.toon or args.kanyoon) then unclassified = label.unclassified end end if find(readings, '%[%[' .. kana) then is_old_format = true if label.classification == 'on' then for reading in gmatch(readings, kana .. '+') do add_reading_category(reading) end end readings = readings:gsub("%[%[([^%]|]+)%]%]", function(entry) if find(entry, "^[" .. Jpan:getCharacters() .. "]+$") then return plain_link{ lang = lang, term = entry, } else return "[[" .. entry .. "]]" end end) else readings = split(readings, "%s*[,、]%s*") for i, reading in ipairs(readings) do local is_jouyou = false local pos, pos_hist, pos_oldest = { }, { '[[w:Ortografi kana bersejarah|bersejarah]]' }, { 'historical' } -- check for formatting indicating presence of historical kana spelling local reading_mod, reading_hist, reading_oldest, reading_surplus = reading:match'^(.-)%f[<%z]<?(.-)%f[<%z]<?(.-)%f[<%z]<?(.*)$' if reading_surplus ~= '' then error("Bacaan " .. reading .. " mengandungi terlalu banyak bacaan bersejarah. Maksimum hanya 3: moden, lama, kuno.") end if label.text2 == "on" then unclassified_on[reading_mod] = true insert(unclassified_on, reading_mod) elseif label.text2 == "kun" then kun[reading_mod] = true insert(kun, reading_mod) elseif label.classification == "on" then classified_on[reading_mod] = true insert(classified_on, reading_mod) end -- test if reading contains katakana if find(reading_mod .. reading_hist .. reading_oldest, '[ァ-ヺ]') then insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. 
'|1]]') -- sometimes legit, like 「頁(ページ)」 end if reading_hist ~= '' or reading_oldest ~= '' then -- test if historical readings contain small kana (anachronistic) if find(reading_hist .. reading_oldest, '[ぁぃぅぇぉゃゅょ]') then insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. '|2]]') -- end -- test if reading contains kun'yomi delimiter thing but historical readings don't if reading_mod:find("-", 1, true) then if reading_hist ~= '' and not reading_hist:find("-", 1, true) or reading_oldest ~= '' and not reading_oldest:find("-", 1, true) then insert(categories, '[[Kategori:Permintaan untuk perhatian mengenai bahasa ' .. lang_name .. '|3]]') end end end -- check if there is data indicating that our kanji is a jouyou kanji if yomi_data[pagename] then local reading = (label.classification == 'on' and hira_to_kata(reading_mod) or reading_mod) reading = reading:gsub('%.', '') -- 「あたら-し.い」→「あたら-しい」 local yomi_type = yomi_data[pagename][reading] if yomi_type then is_jouyou = true if yomi_type == 1 or yomi_type == 2 then insert(pos, '[[w:Jōyō kanji|<abbr title="This reading is listed in the Jōyō kanji table. Click for the Wikipedia article about the Jōyō kanji.">Jōyō</abbr>]]') elseif yomi_type == 3 or yomi_type == 4 then insert(pos, '[[w:Jōyō kanji|<abbr title="This reading is listed in the Jōyō kanji table, but is marked as restricted or rare. 
Click for the Wikipedia article about the Jōyō kanji.">Jōyō <sup>†</sup></abbr>]]') end end end local subtype = label.text2 if reading_mod then add_reading_category(reading_mod, subtype) end if reading_hist ~= '' then add_reading_category(reading_hist, subtype, 'lama') end if reading_oldest ~= '' then add_reading_category(reading_oldest, subtype, 'kuno') end -- process kun readings with okurigana, create kanji-okurigana links if reading:find("-", 1, true) then insert(pos, 1, plain_link{ lang = lang, term = reading_mod:gsub('^.+%-', pagename), }) if reading_hist ~= '' then insert(pos_hist, 1, plain_link{ lang = lang, term = reading_hist:gsub('^.+%-', pagename), }) end if reading_oldest ~= '' then insert(pos_oldest, 1, plain_link{ lang = lang, term = reading_oldest:gsub('^.+%-', pagename), }) end elseif label.classification == 'kun' then insert(categories, '[[Kategori:Kanji ' .. lang_name .. ' dengan bacaan kun hilang penamaan okurigana|' .. sortkey .. ']]') end local rom = kana_to_romaji((reading_mod), lang_code):gsub('^(.+)(%-)', '<u>%1</u>') local rom_hist = kana_to_romaji((reading_hist:gsub('^(.+)(%-)', '<u>%1</u>')), lang_code, nil, {hist = true}) local rom_oldest = kana_to_romaji((reading_oldest:gsub('^(.+)(%-)', '<u>%1</u>')), lang_code, nil, {hist = true}) local mod_link = plain_link{ lang = lang, term = reading_mod, tr = rom, pos = concat(pos, CONCAT_SEP), } if is_jouyou then mod_link = '<span class="jouyou-reading">' .. mod_link .. '</span>' end readings[i] = mod_link .. (reading_hist ~= '' and '<sup>←' .. plain_link{ lang = lang, term = reading_hist, tr = rom_hist, pos = concat(pos_hist, CONCAT_SEP), } .. '</sup>' or '') .. (reading_oldest ~= '' and '<sup>←' .. plain_link{ lang = lang, term = reading_oldest, tr = rom_oldest, pos = concat(pos_oldest, CONCAT_SEP), } .. '</sup>' or '') end readings = concat(readings, "、") end -- Add "on-yomi", "kun-yomi", or "nanori-yomi" class around list of -- readings to allow JavaScript to locate them. 
insert(links, "* '''[[Lampiran:Glosari bahasa Jepun#" .. (label.entry or label.text2) .. '|'.. label.text .. "]]'''" .. unclassified .. ': <span class="' .. label.classification .. '-yomi">' .. readings .. '</span>') end end for _, reading in ipairs(unclassified_on) do -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/duplicate reading]] if classified_on[reading] then track("duplicate reading") end -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading ja]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading ryu]] etc. track("unclassified reading " .. lang_code) -- Track unclassified readings for later classification -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/unclassified reading]] track("unclassified reading") -- Leave a version that is not profiled by lang code, in order to not break any hypothetical scripts relying on the old tracking category end if not next(classified_on) and not next(unclassified_on) then if next(kun) then -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/kun only]] track("kun only") end elseif not next(kun) then -- [[Special:WhatLinksHere/Wiktionary:Tracking/ja-kanji-readings/on only]] track("on only") end if is_old_format then insert(categories, '[[Kategori:Kanji Jepun menggunakan format lama ja-bacaan|' .. sortkey .. ']]') end return concat(links, '\n') .. (NAMESPACE == '' and concat(categories) or '') .. 
require("Module:TemplateStyles")("Template:ja-readings/style.css") end return export s0hx5a524m6rhbbjn217ccn8ku5nqnk ثعبان 0 13164 281304 111415 2026-04-21T15:44:26Z Hakimi97 2668 281304 wikitext text/x-wiki {{juga|تعبان}} ==Bahasa Arab== ===Takrifan=== {{ar-kn|ثُعْبَان|m,f|pl=ثَعَابِين}} # [[ular]] #* {{RQ:Quran|26|32}} #*: {{quote|ar|فَأَلْقَى عَصَاهُ فَإِذَا هِيَ '''ثُعْبَان'''ٌ مُبِينٌ|Nabi Musa pun mencampakkan tongkatnya, maka tiba-tiba tongkatnya itu menjadi seekor ular yang jelas nyata.}} # {{lb|ar|buruj}} (biasanya {{l|ar|الثُعْبَان}}) [[Thuban]] ===Etimologi=== Daripada akar {{ar-akar|ث ع ب}}. ===Sebutan=== * {{ar-AFA|ثُعْبَان}} ===Deklensi=== {{ar-dekl-kn|ثُعْبَان|pl=ثَعَابِين}} [[Kategori:ar:Reptilia]] qnfbfz9fuuua4opeimzqjayw8deeldr 澪標 0 13298 281348 279385 2026-04-22T05:22:29Z Hakimi97 2668 /* Kata nama */ Cuba buang, nak semak mengapa ada penjanaan Kategori:Perkataan dieja dengan 標 dibaca sebagai つくし bahasa Jepun 281348 wikitext text/x-wiki ==Bahasa Jepun== <div style="float:right;"> {{wikipedia|lang=ja}} {{wikipedia|Berup siang}} {{wikipedia|Tiang tambat}} [[File:Miotsukushi_in_Osaka.JPG|thumb|250px|{{lang|ja|澪標}} (''miotsukushi'', ''miozukushi'', ''miojirushi'', ''reihyō''): sebuah '''{{w|tiang tambat}}''' tradisional Jepun di Osaka semasa {{w|zaman Meiji}}.]] </div> ===Etimologi 1=== {{ja-kanjitab|yomi=k,irr|sort=みおづくし|みお|つくし|k2=づくし}} {{ja-kanjitab|yomi=k,irr|sort=みおつくし|みお|つくし}} Kata majmuk bagi {{ja-compound|澪|みお|つ|つ|串|くし|t1=[[saluran]] [[air]]|pos2=partikel kata milik {{inh|ja|ojp|sort=みおつくし|-}}|t3=[[pencucuk]] (biasanya daging)}}.<ref name="DJS">{{R:Daijisen}}</ref> Juga ditemui dengan bacaan ''miozukushi''. {{rendaku2|sort=みおづくし|tsukushi|zukushi}} Terutama, penerbit yang berbeza dari teks sejarah yang sama muncul sebagai pengganti antara bacaan ''miotsukushi'' dan ''miojirushi'', mungkin disebabkan perbezaan sejarah atau dialek. 
====Sebutan==== {{ja-pron|みおつくし|acc=0|acc_ref=DJR|acc2=4|acc2_ref=DJR|acc3=3|acc3_ref=DJR}} ====Kata nama==== # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] #* {{RQ:Manyoshu|14|3429}}, teks di [https://web.archive.org/web/20200925204109/http://jti.lib.virginia.edu/japanese/manyoshu/Man14Yo.html#3429 sini] #*: {{ja-usex|m=等保都安布美 伊奈佐保曽江乃 '''水乎都久思''' 安礼乎多能米弖 安佐麻之物能乎|m_kana=とほつあふみ いなさほそえの '''みをつくし''' あれをたのめて あさましものを |遠%江%引%佐%細%江の'''みをつくし'''我を頼めてあさましものを |^とほ-つ%-^あふみ% ^いな%さ%-ほそ%え の '''みをつくし''' あれ を たのめて あさまし もの を|rom=Tō-tsu-Ōmi Inasa-hosoe no '''miotsukushi''' are o tanomete asamashi mono o|Di Tōtsu Ōmi atas pada Sungai Inasa berdirinya '''palang saluran'''―anda boleh membuat saya mengikuti dan meninggalkan saya di tempat tinggi dan kering.<ref>{{cite-book|1998|Edwin A. Cranston|The Gem-Glistening Cup|page=734|publisher=Stanford University Press|isbn=0-8047-3157-8}}</ref>|sort=みおつくし}} #: {{synonyms|ja|澪木|tr=miogi|澪杭|tr2=miokui|[[水尾坊木]], [[澪坊木]]|tr3=miobōgi}} # [[menyentuh]] (secara tidak langsung) kepada {{m|ja|尽くし|tr=tsukushi||[[kepenatan]]}} #* {{RQ:Manyoshu|12|3162}}, teks di [https://web.archive.org/web/20200918235602/http://jti.lib.virginia.edu/japanese/manyoshu/Man12Yo.html#3162 sini] #*: {{ja-usex|m='''水咫衝%石''' 心%盡%而 念%鴨 此間%毛%本%名 夢%西%所見|m_kana='''みをつく%し'''こころ%つくし%て おもへ%かも ここに%も%もと%な いめ%にし%みゆる|'''みをつくし'''心%尽して思へかもここにももとな夢にし見ゆる|'''みをつくし'''こころ% つくして おもへ か も ここ に も もと な いめ に し みゆる|rom='''miotsukushi''' kokoro tsukushite omoe ka mo koko ni mo moto na ime ni shi miyuru}} # salah satu daripada 60 [[pelbagai]] jenis [[kemenyan]] yang terkenal, yang terbuat dari [[kayu]] [[aromatik]] {{m|ja|伽羅|tr=kyara}} dengan [[bau]]an [[pahit]] #: {{hyper|ja|六十一種名香|tr=rokujūichi shumeikō}} =====Nota penggunaan===== * Pada masa ''Man'yōshū'', maksud "tiang tambat" dirujuk kepada mereka di {{w|Wilayah Tōtōmi}}; semasa {{w|zaman Heian}}, maksudnya hanya untuk penanda di [[teluk]] 
Naniwa, kini [[Osaka]]. * Sejak zaman Heian, makna "tiang tambat" dapat digunakan sebagai {{m|ja|掛詞|tr={{w|kakekotoba}}}} untuk [[plesetan]]/[[pun]]/[[pan]] terhadap makna {{m|ja|身を尽くす|身を尽くし|tr=mi o tsukushi|pos=secara harfiah “[[kepenatan]] [[badan]] seseorang” → “dengan semua [[kekuatan]], dengan semua [[hati]] dan [[nyawa]]”}}: ** {{RQ:Gosenshu|13|860; also ''{{w|Ogura Hyakunin Isshu|Hyakunin Isshu}}'', puisi 20}} **: {{ja-usex|わびぬれば今はた同じ難%波なる'''みをつくし'''ても逢はむとぞ思ふ|わびぬれば いま はた おなじ なに%は なる '''みをつくし'''て も あはむ と ぞ おもふ|rom=wabinureba ima hata onaji Naniwa naru '''mi o tsukushi'''te mo awan to zo omou|Sedih, kini, semuanya sama. '''Tanda saluran''' di Naniwa―walaupun ianya '''menggadai nyawa'''ku, Aku akan bertemu denganmu lagi!<ref>{{cite-book|1996|Joshua S. Mostow |Pictures of the Heart: The Hyakunin Isshu in Word and Image|edition=illustrated|publisher=University of Hawaii Press|isbn=0-8248-1705-2|page=201}}</ref>|sort=みおつくし}} ====Kata nama khas==== {{ja-pos|proper|みおつくし|hhira=みをつくし}} # [[bab]] ke[[empat belas]] bagi ''{{w|Hikayat Genji}}'' ===Etimologi 2=== {{ja-kanjitab|yomi=k|みお|しるし|k2=じるし}} Kata majmuk bagi {{ja-compound|澪|みお|標|しるし|t1=[[saluran]] [[air]]|t2=[[tanda]], [[penanda]]}}. {{rendaku2|sort=みおじるし|shirushi|jirushi}} Terutama, penerbit yang berbeza dari teks sejarah yang sama muncul sebagai pengganti antara bacaan ''miojirushi'' dan ''miotsukushi'', mungkin disebabkan perbezaan sejarah atau dialek. 
====Sebutan==== {{ja-pron|みおじるし|acc=3|acc_ref=DJR}} ====Bentuk alternatif==== * {{ja-l|水脈標}} ====Kata nama==== {{ja-noun|みおじるし|hhira=みをじるし}} # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] #* '''Abad ke-12''', ''{{w|lang=ja|山家集|Sankashū}}'' (buku 1, puisi 217) #*: {{ja-usex|広%瀬%川%渡りの沖の'''みをじるし'''[[水%嵩]]ぞ深き[[五月雨]]の頃|^ひろ%せ%-がは% わたり の おき の '''みをじるし''' み%かさ ぞ ふかき さみだれ の ころ|rom=Hirose-gawa watari no oki no '''miojirushi''' mikasa zo fukaki samidare no koro}} ===Etimologi 3=== {{ja-kanjitab|yomi=kanon2|れい|ひょう}} {{IPAchar|/reiheu/}} → {{IPAchar|/reːhjoː/}} Daripada {{bor|ja|ltc|sort=れいひょう|-}} {{ltc-l|澪標|id=1,1}}. ====Sebutan==== {{ja-pron|れいひょう}} ====Kata nama==== {{ja-noun|れいひょう|hhira=れいへう}} # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] ===Rujukan=== <references/> :* {{R:Kanjipedia Kotoba|0007265800|〈<sup>▲</sup>澪標〉}} {{cln|ja|makurakotoba}} {{C|ja|Nautika}} lmjni4y5pqusernmvnauj25h5q5jpnc 281349 281348 2026-04-22T05:22:50Z Hakimi97 2668 Membatalkan semakan [[Special:Diff/281348|281348]] oleh [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|bincang]]) 281349 wikitext text/x-wiki ==Bahasa Jepun== <div style="float:right;"> {{wikipedia|lang=ja}} {{wikipedia|Berup siang}} {{wikipedia|Tiang tambat}} [[File:Miotsukushi_in_Osaka.JPG|thumb|250px|{{lang|ja|澪標}} (''miotsukushi'', ''miozukushi'', ''miojirushi'', ''reihyō''): sebuah '''{{w|tiang tambat}}''' tradisional Jepun di Osaka semasa {{w|zaman Meiji}}.]] </div> ===Etimologi 1=== {{ja-kanjitab|yomi=k,irr|sort=みおづくし|みお|つくし|k2=づくし}} {{ja-kanjitab|yomi=k,irr|sort=みおつくし|みお|つくし}} Kata majmuk bagi {{ja-compound|澪|みお|つ|つ|串|くし|t1=[[saluran]] [[air]]|pos2=partikel kata milik {{inh|ja|ojp|sort=みおつくし|-}}|t3=[[pencucuk]] (biasanya daging)}}.<ref name="DJS">{{R:Daijisen}}</ref> Juga ditemui dengan 
bacaan ''miozukushi''. {{rendaku2|sort=みおづくし|tsukushi|zukushi}} Terutama, penerbit yang berbeza dari teks sejarah yang sama muncul sebagai pengganti antara bacaan ''miotsukushi'' dan ''miojirushi'', mungkin disebabkan perbezaan sejarah atau dialek. ====Sebutan==== {{ja-pron|みおつくし|acc=0|acc_ref=DJR|acc2=4|acc2_ref=DJR|acc3=3|acc3_ref=DJR}} ====Kata nama==== {{ja-noun|みおつくし|hhira=みをつくし}}<br/>{{ja-altread|hira=みおづくし|hhira=みをづくし}} # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] #* {{RQ:Manyoshu|14|3429}}, teks di [https://web.archive.org/web/20200925204109/http://jti.lib.virginia.edu/japanese/manyoshu/Man14Yo.html#3429 sini] #*: {{ja-usex|m=等保都安布美 伊奈佐保曽江乃 '''水乎都久思''' 安礼乎多能米弖 安佐麻之物能乎|m_kana=とほつあふみ いなさほそえの '''みをつくし''' あれをたのめて あさましものを |遠%江%引%佐%細%江の'''みをつくし'''我を頼めてあさましものを |^とほ-つ%-^あふみ% ^いな%さ%-ほそ%え の '''みをつくし''' あれ を たのめて あさまし もの を|rom=Tō-tsu-Ōmi Inasa-hosoe no '''miotsukushi''' are o tanomete asamashi mono o|Di Tōtsu Ōmi atas pada Sungai Inasa berdirinya '''palang saluran'''―anda boleh membuat saya mengikuti dan meninggalkan saya di tempat tinggi dan kering.<ref>{{cite-book|1998|Edwin A. 
Cranston|The Gem-Glistening Cup|page=734|publisher=Stanford University Press|isbn=0-8047-3157-8}}</ref>|sort=みおつくし}} #: {{synonyms|ja|澪木|tr=miogi|澪杭|tr2=miokui|[[水尾坊木]], [[澪坊木]]|tr3=miobōgi}} # [[menyentuh]] (secara tidak langsung) kepada {{m|ja|尽くし|tr=tsukushi||[[kepenatan]]}} #* {{RQ:Manyoshu|12|3162}}, teks di [https://web.archive.org/web/20200918235602/http://jti.lib.virginia.edu/japanese/manyoshu/Man12Yo.html#3162 sini] #*: {{ja-usex|m='''水咫衝%石''' 心%盡%而 念%鴨 此間%毛%本%名 夢%西%所見|m_kana='''みをつく%し'''こころ%つくし%て おもへ%かも ここに%も%もと%な いめ%にし%みゆる|'''みをつくし'''心%尽して思へかもここにももとな夢にし見ゆる|'''みをつくし'''こころ% つくして おもへ か も ここ に も もと な いめ に し みゆる|rom='''miotsukushi''' kokoro tsukushite omoe ka mo koko ni mo moto na ime ni shi miyuru}} # salah satu daripada 60 [[pelbagai]] jenis [[kemenyan]] yang terkenal, yang terbuat dari [[kayu]] [[aromatik]] {{m|ja|伽羅|tr=kyara}} dengan [[bau]]an [[pahit]] #: {{hyper|ja|六十一種名香|tr=rokujūichi shumeikō}} =====Nota penggunaan===== * Pada masa ''Man'yōshū'', maksud "tiang tambat" dirujuk kepada mereka di {{w|Wilayah Tōtōmi}}; semasa {{w|zaman Heian}}, maksudnya hanya untuk penanda di [[teluk]] Naniwa, kini [[Osaka]]. * Sejak zaman Heian, makna "tiang tambat" dapat digunakan sebagai {{m|ja|掛詞|tr={{w|kakekotoba}}}} untuk [[plesetan]]/[[pun]]/[[pan]] terhadap makna {{m|ja|身を尽くす|身を尽くし|tr=mi o tsukushi|pos=secara harfiah “[[kepenatan]] [[badan]] seseorang” → “dengan semua [[kekuatan]], dengan semua [[hati]] dan [[nyawa]]”}}: ** {{RQ:Gosenshu|13|860; also ''{{w|Ogura Hyakunin Isshu|Hyakunin Isshu}}'', puisi 20}} **: {{ja-usex|わびぬれば今はた同じ難%波なる'''みをつくし'''ても逢はむとぞ思ふ|わびぬれば いま はた おなじ なに%は なる '''みをつくし'''て も あはむ と ぞ おもふ|rom=wabinureba ima hata onaji Naniwa naru '''mi o tsukushi'''te mo awan to zo omou|Sedih, kini, semuanya sama. '''Tanda saluran''' di Naniwa―walaupun ianya '''menggadai nyawa'''ku, Aku akan bertemu denganmu lagi!<ref>{{cite-book|1996|Joshua S. 
Mostow |Pictures of the Heart: The Hyakunin Isshu in Word and Image|edition=illustrated|publisher=University of Hawaii Press|isbn=0-8248-1705-2|page=201}}</ref>|sort=みおつくし}} ====Kata nama khas==== {{ja-pos|proper|みおつくし|hhira=みをつくし}} # [[bab]] ke[[empat belas]] bagi ''{{w|Hikayat Genji}}'' ===Etimologi 2=== {{ja-kanjitab|yomi=k|みお|しるし|k2=じるし}} Kata majmuk bagi {{ja-compound|澪|みお|標|しるし|t1=[[saluran]] [[air]]|t2=[[tanda]], [[penanda]]}}. {{rendaku2|sort=みおじるし|shirushi|jirushi}} Terutama, penerbit yang berbeza dari teks sejarah yang sama muncul sebagai pengganti antara bacaan ''miojirushi'' dan ''miotsukushi'', mungkin disebabkan perbezaan sejarah atau dialek. ====Sebutan==== {{ja-pron|みおじるし|acc=3|acc_ref=DJR}} ====Bentuk alternatif==== * {{ja-l|水脈標}} ====Kata nama==== {{ja-noun|みおじるし|hhira=みをじるし}} # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] #* '''Abad ke-12''', ''{{w|lang=ja|山家集|Sankashū}}'' (buku 1, puisi 217) #*: {{ja-usex|広%瀬%川%渡りの沖の'''みをじるし'''[[水%嵩]]ぞ深き[[五月雨]]の頃|^ひろ%せ%-がは% わたり の おき の '''みをじるし''' み%かさ ぞ ふかき さみだれ の ころ|rom=Hirose-gawa watari no oki no '''miojirushi''' mikasa zo fukaki samidare no koro}} ===Etimologi 3=== {{ja-kanjitab|yomi=kanon2|れい|ひょう}} {{IPAchar|/reiheu/}} → {{IPAchar|/reːhjoː/}} Daripada {{bor|ja|ltc|sort=れいひょう|-}} {{ltc-l|澪標|id=1,1}}. 
====Sebutan==== {{ja-pron|れいひょう}} ====Kata nama==== {{ja-noun|れいひょう|hhira=れいへう}} # {{w|Tiang tambat}} yang dipasang sebagai {{w|berup siang}} atau {{w|tanda siang}}: sebuah [[penanda]] [[pelayaran]] menunjukkan sempadan [[saluran]] [[air]] ===Rujukan=== <references/> :* {{R:Kanjipedia Kotoba|0007265800|〈<sup>▲</sup>澪標〉}} {{cln|ja|makurakotoba}} {{C|ja|Nautika}} ldnhaivb76s1ue9b9op1mtiu3t7mjup Templat:en-peribahasa 10 13613 281240 112271 2026-04-21T12:48:03Z Hakimi97 2668 281240 wikitext text/x-wiki {{#invoke:en-headword|show|proverbs}}<!-- --><noinclude>{{documentation}}</noinclude> sz5x1ciegn3gdcxhbrlyurer0owujgm 281242 281240 2026-04-21T13:02:04Z Hakimi97 2668 281242 wikitext text/x-wiki {{#invoke:en-headword|show|peribahasa}}<!-- --><noinclude>{{documentation}}</noinclude> bo0q937xoaulshjyzaf22r0iz94n1if يوم 0 14405 281305 113435 2026-04-21T15:44:52Z Hakimi97 2668 /* Etimologi */ 281305 wikitext text/x-wiki == Bahasa Arab == === Takrifan === ==== Kata nama ==== {{ar-kn|يَوْم|m|pl=أَيَّام}} # [[hari]] # [[siang]] === Etimologi === Daripada akar {{ar-root|ي و م}}, daripada {{inh|ar|sem-pro|*yawm-}}. === Sebutan === * {{ar-IPA|يَوْم}} * {{audio|ar|Ar-يوم.ogg|Audio}} p1ciiwrmwa5wgoa7q2cwxw1jh707yrp buan 0 14613 281336 116333 2026-04-22T01:03:28Z PeaceSeekers 3334 281336 wikitext text/x-wiki {{juga|buan-|bù'ān|Buan}} ==Bahasa Bajau Sama== ===Takrifan=== ====Kata nama==== {{inti|bdr|kata nama}} # {{lb|bdr|waktu}} [[bulan]] ===Etimologi=== Daripada {{inh|bdr|poz-pro|*bulan}}, daripada {{inh|bdr|map-pro|*bulaN}}. ===Sebutan=== * {{AFA|bdr|/ˈbu.wan/}} * {{rima|bdr|an}} * {{penyempangan|bdr|bu|an}} {{C|bdr|Masa}} ffvwbsf5lpfllvwf7dzmae4x7bcmidk bini-bini 0 16670 281315 239503 2026-04-21T17:46:49Z Hakimi97 2668 /* Takrifan */ 281315 wikitext text/x-wiki {{Pautan Projek Wikimedia}} == Bahasa Melayu == === Takrifan === {{ms-kn|pl=-}} # [[perempuan]]; [[wanita]] ===Etimologi=== Daripada {{der|ms|kxd|bini-bini}}. 
=== Sebutan === * {{AFA|ms|/bi.ni.bi.ni/}} * {{rima|ms|i}} * {{penyempangan|ms|bi|ni|bi|ni}} === Tulisan Jawi === {{ARchar|[[بيني٢]]}} === Rujukan === * {{R:KD4}} * {{R:Kamus Bahasa Melayu Nusantara|2=344}} === Pautan luar === * {{R:PRPM}} ==Bahasa Melayu Brunei== ===Takrifan=== ====Kata nama==== {{inti|kxd|kata nama}} # [[perempuan]] atau [[wanita]] ====Kata sifat==== {{inti|kxd|kata sifat}} # [[perempuan]] ===Etimologi=== {{penggandaan|kxd|bini}} ===Sebutan=== * {{AFA|kxd|/bi.ni.bi.ni/}} ===Tesaurus=== ====Sinonim==== * {{l|kxd|perempuan}} ====Antonim==== * {{l|kxd|laki-laki}} atau {{l|kxd|lelaki}} ====Kata berkaitan==== * {{l|kxd|betina}} 3ahjz3zlriswbu0ma2k2epe374liyf9 pikin 0 16926 281337 117722 2026-04-22T01:03:56Z PeaceSeekers 3334 281337 wikitext text/x-wiki ==Bahasa Belait== ===Takrifan=== [[Fail:B-Mingteller.JPG|thumb|pikin]] ====Kata nama==== {{head|beg|kata nama}} # [[pinggan]]. ===Sebutan=== * {{AFA|beg|/pi.kin/}} * {{rima|beg|in}} * {{penyempangan|beg|pi|kin}} ===Rujukan=== * {{R:DL7D|2=226}} {{C|beg|Alat dapur}} pxqs8exibxsbsh40tiu7e2j318fn2zt Islam 0 17732 281306 119183 2026-04-21T15:45:42Z Hakimi97 2668 /* Etimologi */ 281306 wikitext text/x-wiki {{Pautan Projek Wikimedia}} {{also|İslam}} == Bahasa Melayu == === Takrifan === {{ms-knk|j=إسلام}} # [[agama|Agama]] yang mempercayai [[Allah]] sebagai [[tuhan]] yang tunggal, dan [[Muhammad]] sebagai [[rasul]]. === Etimologi === Daripada {{bor|ms|ar|إِسْلَام||}}, bentuk kata nama bekerja {{m|ar|أَسْلَمَ}}, daripada akar {{ar-root|س ل م|nocat=1}}. === Sebutan === * {{dewan|is|lam}} * {{AFA|ms|/islam/}} * {{rima|ms|lam|am}} === Rujukan === * {{R:KD4}} === Pautan luar === * {{R:PRPM}} [[Kategori:ms:Islam| ]] [[Kategori:ms:Agama]] == Bahasa Indonesia == {{Wikipedia|lang=id}} === Takrifan === ==== Kata nama khas ==== {{head|id|kata nama khas}} # agama Islam === Etimologi === Daripada {{bor|id|ar|إِسْلَام||}}. 
=== Pautan luar === * {{R:KBBI Daring}} [[Kategori:id:Islam| ]] [[Kategori:id:Agama]] == Bahasa Inggeris == {{Wikipedia|lang=en}} === Takrifan === ==== Kata nama khas ==== {{head|en|kata nama khas}} # agama Islam === Etimologi === Daripada {{bor|en|ar|إِسْلَام||}}. === Sebutan === * {{IPA|en|/ɪsˈlɑːm/|/ɪzˈlɑːm/|/ˈɪs.lɑːm/|/ˈɪz.lɑːm/}}, or with {{IPAchar|/-læːm/|lang=en}} ** {{audio|en|LL-Q1860 (eng)-Vealhurl-Islam.wav|Audio (UK)}} * {{rhymes|en|ɑːm|æm}} [[Kategori:en:Islam| ]] [[Kategori:en:Agama]] fr91z8yqimsy1g75yhsl84cqj6rq3u6 Kategori:Lema bahasa Turki Usmaniyah 14 18020 281328 224816 2026-04-22T00:39:26Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Lema bahasa Turki Uthmaniyah]] ke [[Kategori:Lema bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 224816 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx बारिश 0 21084 281340 124483 2026-04-22T01:09:12Z PeaceSeekers 3334 281340 wikitext text/x-wiki == Bahasa Hindi == {{wikipedia|वर्षा|lang=hi}} === Takrifan === ==== Kata nama ==== {{head|hi|kata nama}} # [[hujan]] #: {{syn|hi|बरसात|वर्षा|मेंह}} === Etimologi === Daripada {{bor|hi|fa-cls|بارش|tr=bāriš}}. === Sebutan === * {{audio|hi|LL-Q1568 (hin)-AryamanA-बारिश.wav|Audio}} {{C|hi|Fenomena atmosfera}} l4le1p5o73gzadm8vz3ftd6wyifnu9g magnet 0 21555 281243 125052 2026-04-21T13:18:09Z Countryball mys123 9925 /* Bahasa Melayu */Tambah gambar 281243 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} [[File:Bar magnet crop.jpg|thumb|Magnet]] === Takrifan === ==== Kata nama ==== {{ms-kn|j=مݢنيت}} # Suatu [[besi]] yang berupaya menarik besi lain ke arahnya. # {{lb|ms|kiasan}} Suatu benda yang menarik perhatian atau perkara. === Etimologi === Pinjaman {{bor|ms|en|magnet}}. === Sebutan === * {{dewan|mag|nét}} === Pautan luar === * {{R:PRPM}} {{C|ms|Keelektromagnetan}} == Bahasa Inggeris == {{Wikipedia|lang=en}} === Takrifan === ==== Kata nama ==== {{en-kn}} # Suatu [[besi]] yang berupaya menarik besi lain ke arahnya. 
# {{lb|en|kiasan}} Suatu benda yang menarik perhatian atau perkara. === Etimologi === Daripada {{inh|en|enm|magnete}} melalui {{der|en|fro|magnete}}, {{der|en|la|magnēs|magnēs, magnētem|t=}}, daripada {{der|en|grc||[[μαγνῆτις]] [λίθος]|t=Batu Magnesia}}, sama ada sempena kota Magnesia ad Sipylum (kini Manisa, [[Turki]]) atau bandar Yunani {{m|grc|Μαγνησία}}. Berkait dengan {{m|en|manganese}}, {{m|en|magnesia}} and {{m|en|magnesium}}. === Sebutan === * {{a|GA}} {{IPA|en|/ˈmæɡnɪt/}} * {{a|RP}} {{IPA|en|/ˈmæɡnət/}} * {{audio|en|LL-Q1860 (eng)-Vealhurl-magnet.wav|Audio (UK)}} * {{homophones|en|magnate}} {{qualifier|one pronunciation}} * {{rhymes|en|ɪt|s=2}} {{C|en|Keelektromagnetan}} rgh93g2mmhp2ywnl7k1qw5zzupylkh1 حج 0 22365 281307 126771 2026-04-21T15:46:31Z Hakimi97 2668 /* Kata kerja */ 281307 wikitext text/x-wiki == Bahasa Arab == === Takrifan === ==== Kata nama ==== {{ar-noun|حَجّ|m|pl=-}} # {{ar-verbal noun of|حَجَّ|form=I}} # {{lb|ar|agama}} [[ziarah]] ## {{lb|ar|Islam}} [[haji]] === Kata kerja === {{ar-verb|I/a~u.pass.vn:حَجّ}} # Membalas hujah dengan bukti dan sebagainya # [[membuktikan]]; memberikan [[bukti]] tentang sesuatu. # {{lb|ar|agama}} Melakukan ziarah ## {{lb|ar|Islam}} Melakukan haji === Etimologi === Daripada {{ar-root|ح|ج|ج|}}. Banding dengan {{cog|he|חַג|tr=ḥaḡ|t=hari menjamu}}, {{cog|syc|ܚܓܐ|tr=ḥaggā|t=jamuan}}, {{cog|syc|ܚܳܓ|tr=ḥāgg|t=mengelilingi}}, {{cog|gez|ሕግ|tr=ḥəgg|t=undang-undang}}, {{cog|gez|ሐገገ|tr=ḥaggaga|t=mewartakan (undang-undang)}}. 
=== Sebutan === * {{ar-IPA|حَجّ}} ** {{a|Mesir}} {{IPA|arz|/ħaɡɡ/}} ** {{a|Maghribi}} {{IPA|ary|/ħaʒʒ/}} ** {{a|Levant Utara}} {{IPA|apc|/ħaʒʒ/}} {{C|ar|Haji dan umrah}} kni3b3nlybg2s9whjkp5macbm27lje8 281309 281307 2026-04-21T15:49:58Z Hakimi97 2668 /* Etimologi */ 281309 wikitext text/x-wiki == Bahasa Arab == === Takrifan === ==== Kata nama ==== {{ar-noun|حَجّ|m|pl=-}} # {{ar-verbal noun of|حَجَّ|form=I}} # {{lb|ar|agama}} [[ziarah]] ## {{lb|ar|Islam}} [[haji]] === Kata kerja === {{ar-verb|I/a~u.pass.vn:حَجّ}} # Membalas hujah dengan bukti dan sebagainya # [[membuktikan]]; memberikan [[bukti]] tentang sesuatu. # {{lb|ar|agama}} Melakukan ziarah ## {{lb|ar|Islam}} Melakukan haji === Etimologi === Daripada {{ar-root|ح ج ج|}}. Banding dengan {{cog|he|חַג|tr=ḥaḡ|t=hari menjamu}}, {{cog|syc|ܚܓܐ|tr=ḥaggā|t=jamuan}}, {{cog|syc|ܚܳܓ|tr=ḥāgg|t=mengelilingi}}, {{cog|gez|ሕግ|tr=ḥəgg|t=undang-undang}}, {{cog|gez|ሐገገ|tr=ḥaggaga|t=mewartakan (undang-undang)}}. === Sebutan === * {{ar-IPA|حَجّ}} ** {{a|Mesir}} {{IPA|arz|/ħaɡɡ/}} ** {{a|Maghribi}} {{IPA|ary|/ħaʒʒ/}} ** {{a|Levant Utara}} {{IPA|apc|/ħaʒʒ/}} {{C|ar|Haji dan umrah}} om42qtdk0l324tyynokc9k0uiejicw8 Modul:ar-verb 828 22367 281308 266032 2026-04-21T15:48:34Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88683202|88683202]]) (perlu semakan semula untuk terjemahan label) 281308 Scribunto text/plain local export = {} --[=[ This module implements {{ar-conj}} and provides the underlying conjugation functions for {{ar-verb}} (whose actual formatting is done in [[Module:ar-headword]]). Author: User:Benwing, from an early version (2013-2014) by User:Atitarev, User:ZxxZxxZ. ]=] --[=[ TERMINOLOGY: -- "slot" = A particular combination of tense/mood/person/number/etc. Example slot names for verbs are "past_1s" (past tense first-person singular), "juss_pass_3fp" (non-past jussive passive third-person feminine plural) "ap" (active participle). 
Each slot is filled with zero or more forms. -- "form" = The conjugated Arabic form representing the value of a given slot. -- "lemma" = The dictionary form of a given Arabic term. For Arabic, normally the third person masculine singular past, although other forms may be used if this form is missing (e.g. in passive-only verbs or verbs lacking the past). ]=] --[=[ FIXME: 1. Finish unimplemented conjugation types. Only IX-final-weak left (extremely rare, possibly only one verb اِعْمَايَ (according to Haywood and Nahmad p. 244, who are very specific about the irregular occurrence of alif + yā instead of expected اِعْمَيَّ with doubled yā). Not in Hans Wehr. NOTE: Not true about this, cf. form IX اِرْعَوَى "to desist, to repent, to see the light". Also note form XII اِخْضَوْضَرَ = form IX اِخْضَرَّ "to be or become green". [DONE except for اِعْمَايَ] 2. Implement irregular verbs as special cases and recognize them, e.g. -- laysa "to not be"; only exists in the past tense, no non-past, no imperative, no participles, no passive, no verbal noun. Irregular alternation las-/lays-. [IMPLEMENTABLE USING OVERRIDES] -- istaḥā yastaḥī "be ashamed of" -- this is complex according to Hans Wehr because there are two verbs, regular istaḥyā yastaḥyī "to spare (someone)'s life" and irregular istaḥyā yastaḥyī "to be ashamed to face (someone)", which is irregular because it has the alternate irregular form istaḥā yastaḥī which only applies to this meaning. Currently we follow Haywood and Nahmad in saying that both varieties can be spelled istaḥyā/istaḥā/istaḥḥā, but we should instead use a variant= param similar to حَيَّ to distinguish the two possibilities, and maybe not include istaḥḥā. -- ʿayya/ʿayiya yaʿayyu/yaʿyā "to not find the right way, be incapable of, stammer, falter, fall ill". This appears to be a mixture of a geminate and final-weak verb. Unclear what the whole paradigm looks like. Do the consonant-ending parts in the past follow the final-weak paradigm? 
Is it the same in the non-past? Or can you conjugate the non-past fully as either geminate or final-weak? -- اِنْمَحَى inmaḥā or يمَّحَى immaḥā "to be effaced, obliterated; to disappear, vanish" has irregular assimilation of inm- to imm- as an alternative. inmalasa "to become smooth; to glide; to slip away; to escape" also has immalasa as an alternative. The only other form VII verbs in Hans Wehr beginning with -m- are inmalaḵa "to be pulled out, torn out, wrenched" and inmāʿa "to be melted, to melt, to dissolve", which are not listed with imm- alternatives, but might have them; if so, we should handle this generally. [DONE] -- يَرَعَ yaraʕa yariʕu "to be a coward, to be chickenhearted" as an alternative form of يَرِعَ yariʕa yayraʕu (as given in Wehr). [IMPLEMENTABLE USING OVERRIDES] 3. Implement individual override parameters for each paradigm part. See Module:fro-verb for an example of how to do this generally. Note that {{temp|ar-conj-I}} and other of the older templates already had such individual override params. [DONE] Irregular verbs already implemented: -- [ḥayya/ḥayiya yaḥyā "live" -- behaves like a normal final-weak verb (e.g. past first singular ḥayītu) except in the past-tense parts with vowel-initial endings (all the third person except for the third feminine plural). The normal singular and dual endings have -yiya- in them, which compresses to -yya-, with the normal endings the less preferred ones. In masculine third plural, expected ḥayū is replaced by ḥayyū by analogy to the -yy- parts, and the regular form is not given as an alternant in John Mace. Barron's 201 verbs appears to have the regular ḥayū as the part, however. Note also that final -yā appears with tall alif. This appears to be a spelling convention of Arabic, also applying in ḥayyā (form II, "to keep (someone) alive") and 'aḥyā (form IV, "to animate, revive, give birth to, give new life to").] 
-- implemented -- [ittaxadha yattaxidhu "take"] -- implemented -- [sa'ala yas'alu "ask" with alternative jussive/imperative yasal/sal] -- implemented -- [ra'ā yarā "see"] -- implemented -- ['arā yurī "show"] -- implemented -- ['akala ya'kulu "eat" with imperative kul] -- implemented -- ['axadha ya'xudhu "take" with imperative xudh] -- implemented -- ['amara ya'muru "order" with imperative mur] -- implemented --]=] local force_cat = false -- set to true for debugging -- if true, always maintain manual translit during processing, and compare against full translit at the end local debug_translit = false local lang = require("Module:languages").getByCode("ar") local m_links = require("Module:links") local m_string_utilities = require("Module:string utilities") local m_table = require("Module:table") local ar_utilities = require("Module:ar-utilities") local ar_nominals = require("Module:ar-nominals") local iut = require("Module:inflection utilities") local put = require("Module:parse utilities") local pron_qualifier_module = "Module:pron qualifier" local list_to_text = mw.text.listToText local rfind = m_string_utilities.find local rsubn = m_string_utilities.gsub local rmatch = m_string_utilities.match local rsplit = m_string_utilities.split local usub = m_string_utilities.sub local ulen = m_string_utilities.len local u = m_string_utilities.char local unpack = unpack or table.unpack -- Lua 5.2 compatibility local dump = mw.dumpObject -- Within this module, conjugations are the functions that do the actual -- conjugating by creating the parts of a basic verb. -- They are defined further down. local conjugations = {} -- hamza variants local HAMZA = u(0x0621) -- hamza on the line (stand-alone hamza) = ء local HAMZA_ON_ALIF = u(0x0623) local HAMZA_ON_W = u(0x0624) local HAMZA_UNDER_ALIF = u(0x0625) local HAMZA_ON_Y = u(0x0626) local HAMZA_ANY = "[" .. HAMZA .. HAMZA_ON_ALIF .. HAMZA_UNDER_ALIF .. HAMZA_ON_W .. HAMZA_ON_Y .. 
"]" local HAMZA_PH = u(0xFFF0) -- hamza placeholder local BAD = u(0xFFF1) local BORDER = u(0xFFF2) -- diacritics local A = u(0x064E) -- fatḥa local AN = u(0x064B) -- fatḥatān (fatḥa tanwīn) local U = u(0x064F) -- ḍamma local UN = u(0x064C) -- ḍammatān (ḍamma tanwīn) local I = u(0x0650) -- kasra local IN = u(0x064D) -- kasratān (kasra tanwīn) local SK = u(0x0652) -- sukūn = no vowel local SH = u(0x0651) -- šadda = gemination of consonants local DAGGER_ALIF = u(0x0670) local DIACRITIC_ANY_BUT_SH = "[" .. A .. I .. U .. AN .. IN .. UN .. SK .. DAGGER_ALIF .. "]" -- Pattern matching short vowels local AIU = "[" .. A .. I .. U .. "]" -- Pattern matching short vowels or sukūn local AIUSK = "[" .. A .. I .. U .. SK .. "]" -- Pattern matching any diacritics that may be on a consonant local DIACRITIC = SH .. "?" .. DIACRITIC_ANY_BUT_SH -- translit_patterns local vowels = "aeiouāēīōū" local NV = "[^" .. vowels .. "]" local dia = {a = A, i = I, u = U} local undia = {[A] = "a", [I] = "i", [U] = "u", ["-"] = "-"} -- various letters and signs local ALIF = u(0x0627) -- ʾalif = ا local AMAQ = u(0x0649) -- ʾalif maqṣūra = ى local AMAD = u(0x0622) -- ʾalif madda = آ local TAM = u(0x0629) -- tāʾ marbūṭa = ة local T = u(0x062A) -- tāʾ = ت local HYPHEN = u(0x0640) local N = u(0x0646) -- nūn = ن local W = u(0x0648) -- wāw = و local Y = u(0x064A) -- yāʾ = ي local S = "س" local M = "م" local LRM = u(0x200e) -- left-to-right mark -- common combinations local AH = A .. TAM local AT = A .. T local AA = A .. ALIF local AAMAQ = A .. AMAQ local AAH = AA .. TAM local AAT = AA .. T local II = I .. Y local UU = U .. W local AY = A .. Y local AW = A .. W local AYSK = AY .. SK local AWSK = AW .. SK local NA = N .. A local NI = N .. I local AAN = AA .. N local AANI = AA .. NI local AYNI = AYSK .. NI local AWNA = AWSK .. NA local AYNA = AYSK .. NA local AYAAT = AY .. AAT local UNU = "[" .. UN .. U .. "]" local MA = M .. A local MU = M .. U local TA = T .. A local TU = T .. U local _I = ALIF .. 
I local _U = ALIF .. U local translit_cache = { -- hamza variants [HAMZA] = "ʔ", [HAMZA_ON_ALIF] = "ʔ", [HAMZA_ON_W] = "ʔ", [HAMZA_UNDER_ALIF] = "ʔ", [HAMZA_ON_Y] = "ʔ", [HAMZA_PH] = "ʔ", -- diacritics [A] = "a", [AN] = "an", [U] = "u", [UN] = "un", [I] = "i", [IN] = "in", [SK] = "", [SH] = "*", -- handled specially [DAGGER_ALIF] = "ā", -- various letters and signs [""] = "", [ALIF] = BAD, -- we should never be transliterating ALIF by itself, as its translit in isolation is ambiguous [AMAQ] = BAD, [AMAD] = "ʔā", [TAM] = "", [T] = "t", [N] = "n", [W] = "w", [Y] = "y", [S] = "s", [M] = "m", [LRM] = "", -- common combinations [AH] = "a", [AT] = "at", [AA] = "ā", [AAMAQ] = "ā", [AAH] = "āh", [AAT] = "āt", [II] = "ī", [UU] = "ū", [AY] = "ay", [AW] = "aw", [AYSK] = "ay", [AWSK] = "aw", [NA] = "na", [NI] = "ni", [AAN] = "ān", [AANI] = "āni", [AYNI] = "ayni", [AWNA] = "awna", [AYNA] = "ayna", [AYAAT] = "ayāt", [MA] = "ma", [MU] = "mu", [TA] = "ta", [TU] = "tu", [_I] = "i", [_U] = "u", } local function transliterate(text) local cached = translit_cache[text] if cached then if cached == BAD then error(("Internal error: Unable to transliterate %s because explicitly marked as BAD"):format(text)) end return cached end local tr = (lang:transliterate(text)) if not tr then error(("Internal error: Unable to transliterate: %s"):format(text)) end translit_cache[text] = tr return tr end local all_person_number_list = { "1s", "2ms", "2fs", "3ms", "3fs", "2d", "3md", "3fd", "1p", "2mp", "2fp", "3mp", "3fp" } local function make_person_number_slot_accel_list(list) local slot_accel_list = {} return slot_accel_list end local imp_person_number_list = {} for _, pn in ipairs(all_person_number_list) do if pn:find("^2") then table.insert(imp_person_number_list, pn) end end local passive_types = m_table.listToSet { "pass", -- verb has both active and passive "ipass", -- verb is active with impersonal passive "nopass", -- verb is active-only "onlypass", -- verb is passive-only "onlypass-impers", 
-- verb itself is impersonal, meaning passive-only with impersonal passive } local indicator_flags = m_table.listToSet { "nopast", "no_nonpast", "noimp", "nocat", -- don't categorize or include annotations about this; useful in suppletive parts of verbs "reduced", -- verb has assimilation/reduction of initial coronals "altgem", -- form X with alternative past geminate forms with final-weak endings } export.potential_lemma_slots = {"past_3ms", "past_pass_3ms", "ind_3ms", "ind_pass_3ms", "imp_2ms"} export.unsettable_slots = {} for _, potential_lemma_slot in ipairs(export.potential_lemma_slots) do table.insert(export.unsettable_slots, potential_lemma_slot .. "_linked") end -- We don't set the active participle directly for form I because we don't want stative verbs (with past vowel i or u) -- to default to فَاعِل. Instead we set the special slot 'ap1' and later copy it to 'ap' for non-stative verbs. The user -- meanwhile can explicitly request the فَاعِل form for active participles for stative verbs using `ap:+`. 
table.insert(export.unsettable_slots, "ap1") -- primary default فَاعِل for form I active participles table.insert(export.unsettable_slots, "ap2") -- secondary default فَعِيل for form I active participles (stative I) table.insert(export.unsettable_slots, "ap3") -- secondary default فَعِل for form I active participles (stative II) table.insert(export.unsettable_slots, "apcd") -- secondary default أَفْعَل for form I active participles (color/defect) table.insert(export.unsettable_slots, "apan") -- secondary default فَعْلَان for form I active participles (in -ān) table.insert(export.unsettable_slots, "pp2") -- secondary default فَعِيل for form I passive participles (same as ap2) table.insert(export.unsettable_slots, "vn2") -- secondary default فِعَال for form III verbal nouns export.unsettable_slots_set = m_table.listToSet(export.unsettable_slots) local default_indicator_to_active_participle_slot = { ["+"] = "ap1", ["++"] = "ap2", ["+++"] = "ap3", ["+cd"] = "apcd", ["+an"] = "apan", } local slots_that_may_be_uncertain = { vn = "verbal noun", ap = "active participle", } -- Initialize all the slots for which we generate forms. local function add_slots(alternant_multiword_spec) alternant_multiword_spec.verb_slots = { {"ap", "act|part"}, {"pp", "pass|part"}, {"vn", "vnoun"}, } for _, unsettable_slot in ipairs(export.unsettable_slots) do table.insert(alternant_multiword_spec.verb_slots, {unsettable_slot, "-"}) end -- Add entries for a slot with person/number variants. -- `slot_prefix` is the prefix of the slot, typically specifying the tense/aspect. -- `tag_suffix` is a string listing the set of inflection tags to add after the person/number tags. -- `person_number_list` is a list of the person/number slot suffixes to add to `slot_prefix`. local function add_personal_slot(slot_prefix, tag_suffix, person_number_list) for _, persnum in ipairs(person_number_list) do local slot = slot_prefix .. "_" .. persnum local accel = persnum:gsub("(.)", "%1|") .. 
tag_suffix table.insert(alternant_multiword_spec.verb_slots, {slot, accel}) end end local tenses = { {"past", "past|%s"}, {"ind", "non-past|%s|ind"}, {"sub", "non-past|%s|sub"}, {"juss", "non-past|%s|juss"}, } for _, slot_accel in ipairs(tenses) do local slot, accel = unpack(slot_accel) for _, voice in ipairs {"act", "pass"} do add_personal_slot(voice == "act" and slot or slot .. "_pass", accel:format(voice), all_person_number_list) end end add_personal_slot("imp", "imp", imp_person_number_list) alternant_multiword_spec.verb_slots_map = {} for _, slot_accel in ipairs(alternant_multiword_spec.verb_slots) do local slot, accel = unpack(slot_accel) alternant_multiword_spec.verb_slots_map[slot] = accel end end local overridable_stems = {} local slot_override_param_mods = { footnote = { item_dest = "footnotes", store = "insert", }, alt = {}, t = { -- [[Module:links]] expects the gloss in "gloss". item_dest = "gloss", }, gloss = {}, g = { -- [[Module:links]] expects the genders in "g". `sublist = true` automatically splits on comma (optionally -- with surrounding whitespace). item_dest = "genders", sublist = true, }, pos = {}, lit = {}, id = {}, -- Qualifiers and labels q = { type = "qualifier", }, qq = { type = "qualifier", }, l = { type = "labels", }, ll = { type = "labels", }, } local function generate_obj(formval, parse_err, prefix, is_slot_override) local val, uncertain = formval:match("^(.*)(%?)$") val = val or formval uncertain = not not uncertain local ar, translit = val:match("^(.*)//(.*)$") if not ar then ar = val end if ar == "" then if uncertain then ar = "?" 
else error(("Can't specify blank value for override for %s override '%s'"):format( is_slot_override and "slot" or "stem", prefix)) end end return {form = ar, translit = translit, uncertain = uncertain} end local function parse_inline_modifiers(comma_separated_group, parse_err, prefix, is_slot_override) local function this_generate_obj(formval, parse_err) return generate_obj(formval, parse_err, prefix, is_slot_override) end return put.parse_inline_modifiers_from_segments { group = comma_separated_group, props = { param_mods = slot_override_param_mods, parse_err = parse_err, generate_obj = this_generate_obj, pre_normalize_modifiers = function(data) local modtext = data.modtext modtext = modtext:match("^(%[.*%])$") if modtext then return ("<footnote:%s>"):format(modtext) end return data.modtext end, }, } end local function allow_multiple_values_for_override(comma_separated_groups, data, is_slot_override) local retvals = {} for _, comma_separated_group in ipairs(comma_separated_groups) do local retval if is_slot_override then retval = parse_inline_modifiers(comma_separated_group, data.parse_err) else retval = generate_obj(comma_separated_group[1], data.parse_err, data.prefix, is_slot_override) retval.footnotes = data.fetch_footnotes(comma_separated_group) end table.insert(retvals, retval) end for _, form in ipairs(retvals) do if form.form == "+" or default_indicator_to_active_participle_slot[form.form] then if form.form ~= "+" and default_indicator_to_active_participle_slot[form.form] and not is_slot_override then error(("Stem override '%s' cannot use %s to request a secondary default"):format( data.prefix, form.form)) end data.base.slot_override_uses_default[data.prefix] = true end end for _, form in ipairs(retvals) do if form.form == "-" then data.base.slot_explicitly_missing[data.prefix] = true break end end if data.base.slot_explicitly_missing[data.prefix] then for _, form in ipairs(retvals) do if form.form ~= "-" then data.parse_err(("For slot or stem '%s', saw 
both - and a value other than -, which isn't allowed"): format(data.prefix)) end end return nil end return retvals end local function simple_choice(choices) return function(separated_groups, data) if #separated_groups > 1 then data.parse_err("For spec '" .. data.prefix .. ":', only one value currently allowed") end if #separated_groups[1] > 1 then data.parse_err("For spec '" .. data.prefix .. ":', no footnotes currently allowed") end local choice = separated_groups[1][1] if not m_table.contains(choices, choice) then data.parse_err("For spec '" .. data.prefix .. ":', saw value '" .. choice .. "' but expected one of '" .. table.concat(choices, ",") .. "'") end return choice end end for _, overridable_stem in ipairs { "past", "past_v", "past_c", "past_pass", "past_pass_v", "past_pass_c", "nonpast", "nonpast_v", "nonpast_c", "nonpast_pass", "nonpast_pass_v", "nonpast_pass_c", "imp", "imp_v", "imp_c", } do overridable_stems[overridable_stem] = allow_multiple_values_for_override end overridable_stems.past_final_weak_vowel = simple_choice { "ay", "aw", "ī", "ū" } overridable_stems.past_pass_final_weak_vowel = simple_choice { "ay", "aw", "ī", "ū" } overridable_stems.nonpast_final_weak_vowel = simple_choice { "ā", "ī", "ū" } overridable_stems.nonpast_pass_final_weak_vowel = simple_choice { "ā", "ī", "ū" } ------------------------------------------------------------------------------- -- Utility functions -- ------------------------------------------------------------------------------- -- version of rsubn() that discards all but the first return value local function rsub(term, foo, bar) return (rsubn(term, foo, bar)) end -- version of rsubn() that returns a 2nd argument boolean indicating whether a substitution was made. local function rsubb(term, foo, bar) local retval, nsubs = rsubn(term, foo, bar) return retval, nsubs > 0 end -- Concatenate one or more strings or form objects. local function q(...) 
local not_all_strings = debug_translit local has_manual_translit = debug_translit for i = 1, select("#", ...) do local argt = select(i, ...) if not argt then error(("Internal error: Saw nil at index %s: %s"):format(i, dump({...}))) end if type(argt) ~= "string" then not_all_strings = true if argt.translit then has_manual_translit = true break end end end if not not_all_strings then -- just strings, concatenate directly return table.concat({...}) end local formvals = {} local translit = has_manual_translit and {} or nil local footnotes for i = 1, select("#", ...) do local argt = select(i, ...) if type(argt) == "string" then formvals[i] = argt if has_manual_translit then translit[i] = transliterate(argt) end else formvals[i] = argt.form if has_manual_translit then translit[i] = argt.translit or transliterate(argt.form) end footnotes = iut.combine_footnotes(footnotes, argt.footnotes) end end -- FIXME: Do we want to support other properties? return { form = table.concat(formvals), translit = has_manual_translit and table.concat(translit) or nil, footnotes = footnotes, } end -- Return the formval associated with `rad` (a radical or past/non-past vowel, either a string or form object). local function rget(rad) if type(rad) == "string" then return rad elseif type(rad) == "table" then return rad.form else error(("Internal error: Unexpected type for radical or past/non-past vowel: %s"):format(dump(rad))) end end export.rget = rget -- for use in [[Module:ar-headword]] -- Return the footnotes associated with `rad` (a radical or past/non-past vowel, either a string or form object). local function rget_footnotes(rad) if type(rad) == "string" then return nil elseif type(rad) == "table" then return rad.footnotes else error(("Internal error: Unexpected type for radical or past/non-past vowel: %s"):format(dump(rad))) end end -- Return true if the formval associated with `rad` (a radical or past/non-past vowel, either a string or form object) -- is `val`. 
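-- Illustration (hypothetical calls, not taken from the original source): with the letter constants defined above,
--   req(W, W)            --> true  (plain string radical)
--   req({form = W}, W)   --> true  (form object; only the `form` field is compared)
--   req({form = Y}, W)   --> false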
local function req(rad, val) return rget(rad) == val end -- Map `vow` (a past/non-past vowel, either a string or form object without translit) by passing the formval through -- `fn`. Don't call this on radicals because they may have manual translit and it isn't clear how to handle that. local function map_vowel(vow, fn) if type(vow) == "string" then return fn(vow) elseif type(vow) == "table" then return {form = fn(vow.form), footnotes = vow.footnotes} else error(("Internal error: Unexpected type for past/non-past vowel: %s"):format(dump(vow))) end end local function get_radicals_3(vowel_spec) return vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3, vowel_spec.past, vowel_spec.nonpast end local function get_radicals_4(vowel_spec) return vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3, vowel_spec.rad4 end local function is_final_weak(base, vowel_spec) return vowel_spec.weakness == "final-weak" or base.form == "XV" end local function link_term(text, face, id) return m_links.full_link({lang = lang, term = text, tr = "-", id = id}, face) end local function tag_text(text, tag, class) return m_links.full_link({lang = lang, alt = text, tr = "-"}) end local function track(page) require("Module:debug/track")("ar-verb/" .. page) return true end local function track_if_ar_conj(base, page) if base.alternant_multiword_spec.source_template == "ar-conj" then require("Module:debug/track")("ar-verb/" .. page) end return true end local function reorder_shadda(word) -- shadda+short-vowel (including tanwīn vowels, i.e. -an -in -un) gets -- replaced with short-vowel+shadda during NFC normalisation, which -- MediaWiki does for all Unicode strings; however, it makes various -- processes inconvenient, so undo it. word = rsub(word, "(" .. DIACRITIC_ANY_BUT_SH .. ")" .. SH, SH .. 
"%1") return word end ------------------------------------------------------------------------------- -- Basic functions to inflect tenses -- ------------------------------------------------------------------------------- local function skip_slot(base, slot, allow_overrides) if base.slot_explicitly_missing[slot] then return true end if not allow_overrides and base.slot_overrides[slot] and not base.slot_override_uses_default[slot] then -- Skip any slots for which there are overrides, except those that request the default value using +, ++, etc. return true end if base.passive == "nopass" and (slot == "pp" or slot:find("_pass")) then return true elseif base.passive == "onlypass" and slot ~= "pp" and slot ~= "vn" and not slot:find("_pass") then return true elseif base.passive == "ipass" and slot:find("_pass") and not slot:find("3ms") then return true elseif base.passive == "onlypass-impers" and slot ~= "pp" and slot ~= "vn" and (not slot:find("_pass") or slot:find("_pass") and not slot:find("3ms")) then return true end if base.nopast and slot:find("^past_") then return true end if base.noimp and slot:find("^imp_") then return true end if base.no_nonpast and (slot:find("^ind_") or slot:find("^sub_") or slot:find("^juss")) then return true end return false end local function basic_combine_stem_ending(stem, ending) return stem .. ending end local function basic_combine_stem_ending_tr(stem, ending) return stem .. ending end -- Concatenate `prefixes`, `stems` and `endings` (any of which may be an abbreviate form list, i.e. strings, form -- objects or lists of strings or form objects) and store into `slot`. If a user-supplied override exists for the slot, -- nothing will happen unless `allow_overrides` is provided. local function add3(base, slot, prefixes, stems, endings, allow_overrides) if skip_slot(base, slot, allow_overrides) then return end -- Optimization since the prefixes are almost always single strings. 
if type(prefixes) == "string" then local function do_combine_stem_ending(stem, ending) return prefixes .. stem .. ending end local function do_combine_stem_ending_tr(stem, ending) return transliterate(prefixes) .. stem .. ending end iut.add_forms(base.forms, slot, stems, endings, do_combine_stem_ending, transliterate, do_combine_stem_ending_tr, base.form_footnotes) else iut.add_multiple_forms(base.forms, slot, {prefixes, stems, endings}, basic_combine_stem_ending, transliterate, basic_combine_stem_ending_tr, base.form_footnotes) end end -- Insert one or more forms in `form_or_forms` into `slot`. `form_or_forms` is an abbreviated form list (see comment at -- top of [[Module:inflection utilities]]). If a user-supplied override exists for the slot, nothing will happen unless -- `allow_overrides` is provided. BEWARE: One form object should never occur in two different slots, or twice in a given -- slot; if taking a form object from an existing slot, make sure to shallowCopy() it. local function insert_form_or_forms(base, slot, form_or_forms, allow_overrides, uncertain) if not skip_slot(base, slot, allow_overrides) then -- Some optimizations of the most common case of inserting a single string. if type(form_or_forms) == "string" and not base.form_footnotes then form_or_forms = {form = form_or_forms, uncertain = uncertain} iut.insert_form(base.forms, slot, form_or_forms) else local list = iut.convert_to_general_list_form(form_or_forms, base.form_footnotes) if uncertain then for _, formobj in ipairs(list) do formobj.uncertain = true end end iut.insert_forms(base.forms, slot, list) end end end -- Insert `string_or_form` into both the ap2 and pp2 slots, shallowCopying a form object to make sure no form objects -- occur in two slots. 
local function insert_ap2_pp2(base, string_or_form) insert_form_or_forms(base, "ap2", string_or_form) if type(string_or_form) == "table" then string_or_form = m_table.shallowCopy(string_or_form) end insert_form_or_forms(base, "pp2", string_or_form) end -- Convert `stemforms` (a string, a form object, or a list of strings and/or form objects) into "general form" (a list -- of form objects) and map `fn` over the list of objects. `fn` is passed two arguments (form value and translit) and -- should likewise return the new form value and translit. Footnotes will be preserved. FIXME: Preserve other metadata. local function map_general(stemforms, fn) return iut.map_forms(iut.convert_to_general_list_form(stemforms), fn) end -- Similar to map_general() except that `fn` should return a single value (one or more strings or form objects), instead -- of two values (form value and translit), and the resulting value(s) from all calls to `fn` will be flattened to -- construct the overall return value. Footnotes will be preserved. FIXME: Preserve other metadata. local function flatmap_general(stemforms, fn) return iut.flatmap_forms(iut.convert_to_general_list_form(stemforms), fn) end -- Given user-supplied stem overrides in `base`, construct any derived stem overrides (e.g. vowel-specific or -- consonant-specific variants), and truncate initial y-/ي- in any non-past overrides. 
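-- Illustration (hypothetical input, assuming an override given in the usual third-person masculine singular shape):
-- a `nonpast` override such as يَكْتُب with manual translit "yaktub" would be stored without its initial ي/y, so the
-- translit becomes "aktub". The value + (requesting the default stem) is passed through unchanged, and any other
-- value that does not begin with ي (or whose translit does not begin with y) raises an error.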
local function construct_stems(base) local stems = base.stem_overrides stems.past_v = stems.past_v or stems.past stems.past_c = stems.past_c or stems.past stems.past_pass_v = stems.past_pass_v or stems.past_pass stems.past_pass_c = stems.past_pass_c or stems.past_pass stems.nonpast_v = stems.nonpast_v or stems.nonpast stems.nonpast_c = stems.nonpast_c or stems.nonpast stems.nonpast_pass_v = stems.nonpast_pass_v or stems.nonpast_pass stems.nonpast_pass_c = stems.nonpast_pass_c or stems.nonpast_pass stems.imp_v = stems.imp_v or stems.imp stems.imp_c = stems.imp_c or stems.imp local function truncate_nonpast_initial_cons(stem_type, form, translit) if form == "+" then return form, translit end if not form:find("^" .. Y) then error(("Form value %s for stem type '%s' should begin with ي"):format(form, stem_type)) end form = form:gsub("^" .. Y, "") if translit then if not translit:find("^y") then error(("Translit value %s for stem type '%s' should begin with y"):format(translit, stem_type)) end translit = translit:gsub("^y", "") end return form, translit end for _, nonpast_stem_type in ipairs { "nonpast_v", "nonpast_c", "nonpast_pass_v", "nonpast_pass_c" } do if stems[nonpast_stem_type] then stems[nonpast_stem_type] = map_general(stems[nonpast_stem_type], function(form, translit) return truncate_nonpast_initial_cons(nonpast_stem_type, form, translit) end) end end end -- Given user-specified overrides for stem `stemname`, return overrides with occurrences of + replaced by -- `default_stem`. If no overrides, return `default_stem`, or {} if no default. 
local function override_stem_if_needed(base, stemname, default_stem) local overrides = base.stem_overrides[stemname] if not overrides then return default_stem or {} end return map_general(overrides, function(form, translit) if form ~= "+" and default_indicator_to_active_participle_slot[form] then error(("Stem overrides cannot use secondary default indicators but saw %s in stem override '%s'"):format( form, stemname)) end if form == "+" then if translit then error(("Cannot supply manual translit along with + for stem override '%s'"):format(stemname)) end if not default_stem then error(("Cannot use + for stem override '%s' because no default is available"):format(stemname)) end if type(default_stem) ~= "string" then error(("Internal error: Default stem for '%s' is not a string: %s"):format(stemname, dump(default_stem))) end return default_stem end return form, translit end) end ------------------------------------------------------------------------------- -- Properties of different verbal forms -- ------------------------------------------------------------------------------- local allowed_vforms = {"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI", "XII", "XIII", "XIV", "XV", "Iq", "IIq", "IIIq", "IVq"} local allowed_vforms_set = m_table.listToSet(allowed_vforms) local allowed_vforms_with_weakness = m_table.shallowCopy(allowed_vforms) -- The user needs to be able to explicitly specify that a form-I verb (specifically one whose initial radical is و) is -- sound. Cf. wajiʕa yawjaʕu (not #yajaʕu) "to ache, to hurt". In general, i~a and u~u verbs whose initial radical is و -- seem to not assimilate the first radical; cf. وقح "to be shameless", variously waqaḥa~yaqiḥu, waquḥa~yawquḥu and -- waqiḥa~yawqaḥu, whereas a~i verbs (wafaḍa~yafiḍu "to rush"), i~i verbs (wafiqa~yafiqu "to be proper, to be suitable") -- and a~a verbs (waḍaʕa~yaḍaʕu "to set down, to place") do assimilate. But there are naturally exceptions, e.g. 
-- waṭiʔa~yaṭaʔu "to tread, to trample"; wasiʕa~yasaʕu "to be spacious; to be well-off"; waṯiʔa~yaṯaʔu "to get bruised,
-- to be sprained". Also beware of waniya~yawnā "to be faint; to languish", which is sound in the first radical and
-- final-weak in the last radical. Nonetheless, the regularity of the patterns mentioned above suggests we should provide
-- them as defaults.
-- Note that there are other cases of unexpectedly sound verbs, e.g. izdawaja~yazdawiju "to be in pairs", layisa~yalyasu
-- "to be valiant, to be brave", ʔaḥwaja~yuḥwiju "to need", istahwana~yastahwinu "to consider easy", sawisa~yaswasu "to
-- be or become moth-eaten or worm-eaten" (vs. sāsa~yasūsu "to govern, to rule" from the same radicals), ʕawira~yaʕwaru
-- "to be one-eyed", istajwaba~yastajwibu "to interrogate", etc. But in these cases there is no need for explicit user
-- specification as the lemma itself specifies the unexpected soundness.
for _, form_with_weakness in ipairs { "I-sound", "I-assimilated", "none-sound", "none-hollow", "none-geminate",
	"none-final-weak" } do
	table.insert(allowed_vforms_with_weakness, form_with_weakness)
end
local allowed_vforms_with_weakness_set = m_table.listToSet(allowed_vforms_with_weakness)

local function vform_supports_final_weak(vform)
	return vform ~= "XI" and vform ~= "XV" and vform ~= "IVq"
end

local function vform_supports_geminate(vform)
	return vform == "I" or vform == "III" or vform == "IV" or vform == "VI" or vform == "VII" or vform == "VIII" or
		vform == "X"
end

local function vform_supports_hollow(vform)
	return vform == "I" or vform == "IV" or vform == "VII" or vform == "VIII" or vform == "X"
end

local function vform_probably_impersonal_passive(vform, weakness, past_vowel, nonpast_vowel)
	return vform == "I" and req(past_vowel, I) or vform == "V" or vform == "VI" or vform == "X" or vform == "IIq"
end

local function vform_probably_full_passive(vform)
	return vform == "II" or vform == "III" or vform == "IV" or vform == "Iq"
end

local function
vform_probably_no_passive(vform, weakness, past_vowel, nonpast_vowel)
	return vform == "I" and req(past_vowel, U) or vform == "VII" or vform == "IX" or vform == "XI" or
		vform == "XII" or vform == "XIII" or vform == "XIV" or vform == "XV" or vform == "IIIq" or vform == "IVq"
end

-- Active vforms II, III, IV, Iq use non-past prefixes in -u- instead of -a-.
local function prefix_vowel_from_vform(vform)
	if vform == "II" or vform == "III" or vform == "IV" or vform == "Iq" then
		return "u"
	else
		return "a"
	end
end

-- True if the active non-past takes a-vocalization rather than i-vocalization in its last syllable.
local function vform_nonpast_a_vowel(vform)
	return vform == "V" or vform == "VI" or vform == "XV" or vform == "IIq"
end

-- True if the `passive` spec indicates a passive-only verb.
local function is_passive_only(passive)
	return passive == "onlypass" or passive == "onlypass-impers"
end
export.is_passive_only = is_passive_only -- for use in [[Module:ar-headword]]

-------------------------------------------------------------------------------
--                       Properties of specific sounds                       --
-------------------------------------------------------------------------------

-- Is radical wāw (و) or yāʾ (ي)?
local function is_waw_ya(rad)
	return req(rad, W) or req(rad, Y)
end

-- Check that radical is wāw (و) or yāʾ (ي), error if not
local function check_waw_ya(rad)
	if not is_waw_ya(rad) then
		error("Expecting weak radical: '" .. rget(rad) .. "' should be " .. W .. " or " .. Y)
	end
end

-- Form-I verb حيّ or حيي and form-X verb استحيا or استحى
local function hayy_radicals(rad1, rad2, rad3)
	return req(rad1, "ح") and req(rad2, Y) and is_waw_ya(rad3)
end

-- Workaround: the definitions below are wrapped in a function because Lua otherwise fails with
-- "Lua error at line 1514: main function has more than 200 local variables".
local function create_conjugations() ------------------------------------------------------------------------------- -- Radicals associated with various irregular verbs -- ------------------------------------------------------------------------------- -- Form-I verb أخذ or form-VIII verb اتخذ local function axadh_radicals(rad1, rad2, rad3) return req(rad1, HAMZA) and req(rad2, "خ") and req(rad3, "ذ") end -- Form-I verb whose imperative has a reduced form: أكل and أخذ and أمر. Return "shortonly" if only -- short-form imperatives exist (أكل and أخذ) or "shortlong" if long-form imperatives also exist (أمر); -- they are used after a clitic like فَ and وَ. local function reduced_imperative_verb(rad1, rad2, rad3) return axadh_radicals(rad1, rad2, rad3) and "shortonly" or req(rad1, HAMZA) and req(rad2, "ك") and req(rad3, "ل") and "shortonly" or req(rad1, HAMZA) and req(rad2, "م") and req(rad3, "ر") and "shortlong" end -- Form-I verb رأى and form-IV verb أرى local function raa_radicals(rad1, rad2, rad3) return req(rad1, "ر") and req(rad2, HAMZA) and is_waw_ya(rad3) end -- Form-I verb سأل local function saal_radicals(rad1, rad2, rad3) return req(rad1, "س") and req(rad2, HAMZA) and req(rad3, "ل") end -- Form-I verb كان local function kaan_radicals(rad1, rad2, rad3) return req(rad1, "ك") and req(rad2, W) and req(rad3, N) end ------------------------------------------------------------------------------- -- Sets of past endings -- ------------------------------------------------------------------------------- -- The 13 endings of the sound/hollow/geminate past tense. local past_endings = { -- singular SK .. TU, SK .. TA, SK .. "تِ", A, A .. "تْ", --dual SK .. "تُمَا", AA, A .. "تَا", -- plural SK .. "نَا", SK .. "تُمْ", -- shadda + vowel diacritic ends up in the wrong order due to Unicode -- bug, so keep them separate to avoid this SK .. "تُن" .. SH .. A, UU .. ALIF, SK .. "نَ" } -- Make endings for final-weak past in -aytu or -awtu. AYAW is AY or AW as appropriate. 
Note that AA and AW are -- global variables. local function make_past_endings_ay_aw(ayaw, third_sg_masc) return { -- singular ayaw .. SK .. TU, ayaw .. SK .. TA, ayaw .. SK .. "تِ", third_sg_masc, A .. "تْ", --dual ayaw .. SK .. "تُمَا", ayaw .. AA, A .. "تَا", -- plural ayaw .. SK .. "نَا", ayaw .. SK .. "تُمْ", -- shadda + vowel diacritic ends up in the wrong order due to Unicode -- bug, so keep them separate to avoid this ayaw .. SK .. "تُن" .. SH .. A, AW .. SK .. ALIF, ayaw .. SK .. "نَ" } end -- past final-weak -aytu endings local past_endings_ay = make_past_endings_ay_aw(AY, AAMAQ) -- past final-weak -awtu endings local past_endings_aw = make_past_endings_ay_aw(AW, AA) -- used for alternative endings for form-X geminate verbs like اِسْتَمَرَّ local past_endings_ay_12_person_only = { -- singular AY .. SK .. TU, AY .. SK .. TA, AY .. SK .. "تِ", {}, {}, --dual AY .. SK .. "تُمَا", {}, {}, -- plural AY .. SK .. "نَا", AY .. SK .. "تُمْ", -- shadda + vowel diacritic ends up in the wrong order due to Unicode -- bug, so keep them separate to avoid this AY .. SK .. "تُن" .. SH .. A, {}, {}, } -- Make endings for final-weak past in -ītu or -ūtu. IIUU is ī or ū as appropriate. Note that AA and UU are global -- variables. local function make_past_endings_ii_uu(iiuu) return { -- singular iiuu .. TU, iiuu .. TA, iiuu .. "تِ", iiuu .. A, iiuu .. A .. "تْ", --dual iiuu .. "تُمَا", iiuu .. AA, iiuu .. A .. "تَا", -- plural iiuu .. "نَا", iiuu .. "تُمْ", -- shadda + vowel diacritic ends up in the wrong order due to Unicode -- bug, so keep them separate to avoid this iiuu .. "تُن" .. SH .. A, UU .. ALIF, iiuu .. 
"نَ" } end -- past final-weak -ītu endings local past_endings_ii = make_past_endings_ii_uu(II) -- past final-weak -ūtu endings local past_endings_uu = make_past_endings_ii_uu(UU) ------------------------------------------------------------------------------- -- Sets of non-past prefixes and endings -- ------------------------------------------------------------------------------- local nonpast_prefix_consonants = { -- singular HAMZA, T, T, Y, T, -- dual T, Y, T, -- plural N, T, T, Y, Y } -- There are only five distinct endings in all non-past verbs. Make any set of non-past endings given these five -- distinct endings. local function make_nonpast_endings(null, fem, dual, pl, fempl) return { -- singular null, null, fem, null, null, -- dual dual, dual, dual, -- plural null, pl, fempl, pl, fempl } end -- endings for non-past indicative local ind_endings = make_nonpast_endings( U, II .. NA, AANI, UU .. NA, SK .. NA ) -- Make the endings for non-past subjunctive/jussive, given the vowel diacritic used in "null" endings -- (1s/2ms/3ms/3fs/1p). local function make_sub_juss_endings(dia_null) return make_nonpast_endings( dia_null, II, AA, UU .. ALIF, SK .. NA ) end -- endings for non-past subjunctive local sub_endings = make_sub_juss_endings(A) -- endings for non-past jussive local juss_endings = make_sub_juss_endings(SK) -- endings for alternative geminate non-past jussive in -a; same as subjunctive local juss_endings_alt_a = sub_endings -- endings for alternative geminate non-past jussive in -i local juss_endings_alt_i = make_sub_juss_endings(I) -- Endings for final-weak non-past indicative in -ā. Note that AY, AW and AAMAQ are global variables. local ind_endings_aa = make_nonpast_endings( AAMAQ, AYSK .. NA, AY .. AANI, AWSK .. NA, AYSK .. NA ) -- Make endings for final-weak non-past indicative in -ī or -ū; IIUU is ī or ū as appropriate. Note that II and UU -- are global variables. local function make_ind_endings_ii_uu(iiuu) return make_nonpast_endings( iiuu, II .. 
NA, iiuu .. AANI, UU .. NA, iiuu .. NA ) end -- endings for final-weak non-past indicative in -ī local ind_endings_ii = make_ind_endings_ii_uu(II) -- endings for final-weak non-past indicative in -ū local ind_endings_uu = make_ind_endings_ii_uu(UU) -- Endings for final-weak non-past subjunctive in -ā. Note that AY, AW, ALIF, AAMAQ are global variables. local sub_endings_aa = make_nonpast_endings( AAMAQ, AYSK, AY .. AA, AWSK .. ALIF, AYSK .. NA ) -- Make endings for final-weak non-past subjunctive in -ī or -ū. IIUU is ī or ū as appropriate. Note that AA, II, -- UU, ALIF are global variables. local function make_sub_endings_ii_uu(iiuu) return make_nonpast_endings( iiuu .. A, II, iiuu .. AA, UU .. ALIF, iiuu .. NA ) end -- endings for final-weak non-past subjunctive in -ī local sub_endings_ii = make_sub_endings_ii_uu(II) -- endings for final-weak non-past subjunctive in -ū local sub_endings_uu = make_sub_endings_ii_uu(UU) -- endings for final-weak non-past jussive in -ā local juss_endings_aa = make_nonpast_endings( A, AYSK, AY .. AA, AWSK .. ALIF, AYSK .. NA ) -- Make endings for final-weak non-past jussive in -ī or -ū. IU is short i or u, IIUU is long ī or ū as appropriate. -- Note that AA, II, UU, ALIF are global variables. local function make_juss_endings_ii_uu(iu, iiuu) return make_nonpast_endings( iu, II, iiuu .. AA, UU .. ALIF, iiuu .. NA ) end -- endings for final-weak non-past jussive in -ī local juss_endings_ii = make_juss_endings_ii_uu(I, II) -- endings for final-weak non-past jussive in -ū local juss_endings_uu = make_juss_endings_ii_uu(U, UU) ------------------------------------------------------------------------------- -- Sets of imperative endings -- ------------------------------------------------------------------------------- -- Extract the second person jussive endings to get corresponding imperative endings. 
local function imperative_endings_from_jussive(endings) return {endings[2], endings[3], endings[6], endings[10], endings[11]} end -- normal imperative endings local imp_endings = imperative_endings_from_jussive(juss_endings) -- alternative geminate imperative endings in -a local imp_endings_alt_a = imperative_endings_from_jussive(juss_endings_alt_a) -- alternative geminate imperative endings in -i local imp_endings_alt_i = imperative_endings_from_jussive(juss_endings_alt_i) -- final-weak imperative endings in -ā local imp_endings_aa = imperative_endings_from_jussive(juss_endings_aa) -- final-weak imperative endings in -ī local imp_endings_ii = imperative_endings_from_jussive(juss_endings_ii) -- final-weak imperative endings in -ū local imp_endings_uu = imperative_endings_from_jussive(juss_endings_uu) ------------------------------------------------------------------------------- -- Basic functions to inflect tenses -- ------------------------------------------------------------------------------- -- Add to `base` the inflections for the tense indicated by `tense` (the prefix in the slot names, e.g. 'past' -- or 'juss_pass'), formed by combining the `prefixes`, `stems` and `endings`. Each of `prefixes`, `stems` and -- `endings` is either a sequence of 5 (for the imperative) or 13 (for other tenses) abbreviated form lists (each of -- which is either a string, a form object, or a list of strings and/or form objects; see -- [[Module:inflection utilities]] for more info). Alternatively, any of `prefixes`, `stems` or `endings` can be a -- single-element list containing an abbreviated form list, with an additional key `all_same` set to true, or (as a -- special case) a single string; in the latter cases, the same value is used for all 5 or 13 slots. If existing -- inflections already exist, they will be added to, not overridden. `pnums` is the list of person/number slot name -- suffixes, which must match up with the elements in `prefixes`, `stems` and `endings` (i.e. 
5 for imperative, 13 -- otherwise). local function inflect_tense_1(base, tense, prefixes, stems, endings, pnums) if not prefixes or not stems or not endings then return end local function verify_affixes(affixname, affixes) local function interr(msg) error(("Internal error: For tense '%s', '%s' %s: %s"):format(tense, affixname, msg, dump(affixes))) end if type(affixes) == "string" then -- do nothing elseif type(affixes) ~= "table" then interr("is not a table or string") elseif affixes.all_same then if #affixes ~= 1 then interr(("with all_same = true should have length 1 but has length %s"):format(#affixes)) end else if #affixes ~= #pnums then interr(("should have length %s but has length %s"):format(#pnums, #affixes)) end end end verify_affixes("prefixes", prefixes) verify_affixes("stems", stems) verify_affixes("endings", endings) local function get_affix(affixes, i) if type(affixes) == "string" then return affixes elseif affixes.all_same then return affixes[1] else return affixes[i] end end for i, pnum in ipairs(pnums) do local prefix = get_affix(prefixes, i) local stem = get_affix(stems, i) local ending = get_affix(endings, i) local slot = tense .. "_" .. pnum add3(base, slot, prefix, stem, ending) end end -- Add to `base` the inflections for the tense indicated by `tense` (the prefix in the slot names, e.g. 'past' -- or 'juss_pass'), formed by combining the `prefixes`, `stems` and `endings`. This is a simple wrapper around -- inflect_tense_1() that applies to all tenses other than the imperative; see inflect_tense_1() for more -- information about the parameters. local function inflect_tense(base, tense, prefixes, stems, endings) inflect_tense_1(base, tense, prefixes, stems, endings, all_person_number_list) end -- Like inflect_tense() but for the imperative, which has only five parts instead of 13 and no prefixes. 
local function inflect_tense_imp(base, stems, endings) inflect_tense_1(base, "imp", "", stems, endings, imp_person_number_list) end ------------------------------------------------------------------------------- -- Functions to inflect the past tense -- ------------------------------------------------------------------------------- -- Generate past verbs using specified vowel and consonant stems; works for sound, assimilated, hollow, and geminate -- verbs, active and passive. local function past_2stem_conj(base, tense, v_stem, c_stem, footnote_12) local passive = tense:find("_pass") and "_pass" or "" -- Override stems with user-specified stems if available. v_stem = override_stem_if_needed(base, "past" .. passive .. "_v", v_stem) local c_stem_12 = c_stem if footnote_12 then c_stem_12 = iut.combine_form_and_footnotes(c_stem_12, footnote_12) end c_stem_12 = override_stem_if_needed(base, "past" .. passive .. "_c", c_stem_12) local c_stem_3 = override_stem_if_needed(base, "past" .. passive .. "_c", c_stem) inflect_tense(base, tense, "", { -- singular c_stem_12, c_stem_12, c_stem_12, v_stem, v_stem, --dual c_stem_12, v_stem, v_stem, -- plural c_stem_12, c_stem_12, c_stem_12, v_stem, c_stem_3 }, past_endings) end -- Generate past verbs using single specified stem; works for sound and assimilated verbs, active and passive. local function past_1stem_conj(base, tense, stem) past_2stem_conj(base, tense, stem, stem) end ------------------------------------------------------------------------------- -- Functions to inflect non-past tenses -- ------------------------------------------------------------------------------- -- Generate non-past conjugation, with two stems, for vowel-initial and consonant-initial endings, respectively. -- Useful for active and passive; for all forms; for all weaknesses (sound, assimilated, hollow, final-weak and -- geminate) and for all types of non-past (indicative, subjunctive, jussive) except for the imperative. 
-- (There is a separate wrapper function below for geminate jussives because they have three alternants.) Both
-- stems may be the same, e.g. for sound verbs.
-- `prefix_vowel` will be either "a" or "u". `endings` should be an array of 13 items. If `endings` is nil or
-- omitted, infer the endings from the tense. If `jussive` is true, or `endings` is nil and `tense` indicates
-- jussive, use the jussive pattern of vowel/consonant stems (different from the normal ones).
local function nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, endings, jussive)
	local passive = tense:find("_pass") and "_pass" or ""
	-- Override stems with user-specified stems if available.
	v_stem = override_stem_if_needed(base, "nonpast" .. passive .. "_v",
		v_stem and q(dia[prefix_vowel], v_stem) or nil)
	c_stem = override_stem_if_needed(base, "nonpast" .. passive .. "_c",
		c_stem and q(dia[prefix_vowel], c_stem) or nil)
	if not endings then
		if tense:find("^ind") then
			endings = ind_endings
		elseif tense:find("^sub") then
			endings = sub_endings
		elseif tense:find("^juss") then
			jussive = true
			endings = juss_endings
		else
			error("Internal error: Unrecognized tense '" .. tense .. "'")
		end
	end
	if not jussive then
		inflect_tense(base, tense, nonpast_prefix_consonants, {
			-- singular
			v_stem, v_stem, v_stem, v_stem, v_stem,
			-- dual
			v_stem, v_stem, v_stem,
			-- plural
			v_stem, v_stem, c_stem, v_stem, c_stem
		}, endings)
	else
		inflect_tense(base, tense, nonpast_prefix_consonants, {
			-- singular
			-- 'adlul, tadlul, tadullī, yadlul, tadlul
			c_stem, c_stem, v_stem, c_stem, c_stem,
			-- dual
			-- tadullā, yadullā, tadullā
			v_stem, v_stem, v_stem,
			-- plural
			-- nadlul, tadullū, tadlulna, yadullū, yadlulna
			c_stem, v_stem, c_stem, v_stem, c_stem
		}, endings)
	end
end

-- Generate non-past conjugation with one stem (no distinct stems for vowel-initial and consonant-initial endings).
-- See nonpast_2stem_conj().
local function nonpast_1stem_conj(base, tense, prefix_vowel, stem, endings, jussive)
	nonpast_2stem_conj(base, tense, prefix_vowel, stem, stem, endings, jussive)
end

-- Generate the active/passive geminate jussive. There are three alternants: two with terminations -a and -i, and
-- one with a null termination and a distinct pattern of vowel/consonant stem usage. See nonpast_2stem_conj() for a
-- description of the arguments.
local function jussive_gem_conj(base, tense, prefix_vowel, v_stem, c_stem)
	-- alternative in -a
	nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings_alt_a)
	-- alternative in -i
	nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings_alt_i)
	-- alternative with null ending; requires a different combination of v_stem and
	-- c_stem since the null endings require the c_stem (e.g. "tadlul" here)
	-- whereas the corresponding endings above in -a or -i require the v_stem
	-- (e.g. "tadulla, tadulli" above)
	nonpast_2stem_conj(base, tense, prefix_vowel, v_stem, c_stem, juss_endings, "jussive")
end

-------------------------------------------------------------------------------
--                    Functions to inflect the imperative                    --
-------------------------------------------------------------------------------

-- Generate imperative conjugation, with two stems, for vowel-initial and consonant-initial endings, respectively.
-- Useful for all forms, and for all weaknesses other than final-weak. Note that the two stems may be the same
-- (specifically for sound and assimilated verbs). If `endings` is nil or omitted, use `imp_endings`. If `alt_gem`
-- is specified, use the pattern of vowel and consonant stems appropriate for the alternative geminate imperatives,
-- which end in -a or -i in the slots that otherwise take no ending.
local function make_2stem_imperative(base, v_stem, c_stem, endings, alt_gem)
	endings = endings or imp_endings
	-- Override stems with user-specified stems if available.
v_stem = override_stem_if_needed(base, "imp_v", v_stem) c_stem = override_stem_if_needed(base, "imp_c", c_stem) if alt_gem then inflect_tense_imp(base, {v_stem, v_stem, v_stem, v_stem, c_stem}, endings) else inflect_tense_imp(base, {c_stem, v_stem, v_stem, v_stem, c_stem}, endings) end end -- Generate imperative parts for sound or assimilated verbs. local function make_1stem_imperative(base, stem) make_2stem_imperative(base, stem, stem) end -- Generate imperative parts for geminate verbs form I (also IV, VII, VIII, X). local function make_gem_imperative(base, v_stem, c_stem) make_2stem_imperative(base, v_stem, c_stem, imp_endings_alt_a, "alt gem") make_2stem_imperative(base, v_stem, c_stem, imp_endings_alt_i, "alt gem") make_2stem_imperative(base, v_stem, c_stem) end ------------------------------------------------------------------------------- -- Functions to inflect entire verbs -- ------------------------------------------------------------------------------- -- Generate finite parts of a sound verb (also works for assimilated verbs) from five stems (past and non-past, -- active and passive, plus imperative) plus the prefix vowel in the active non-past ("a" or "u"). 
local function make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, prefix_vowel) past_1stem_conj(base, "past", past_stem) past_1stem_conj(base, "past_pass", past_pass_stem) nonpast_1stem_conj(base, "ind", prefix_vowel, nonpast_stem) nonpast_1stem_conj(base, "sub", prefix_vowel, nonpast_stem) nonpast_1stem_conj(base, "juss", prefix_vowel, nonpast_stem) nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem) nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem) nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem) make_1stem_imperative(base, imp_stem) end local function past_final_weak_endings_from_vowel(vowel) if vowel == "ay" then return past_endings_ay elseif vowel == "aw" then return past_endings_aw elseif vowel == "ī" then return past_endings_ii elseif vowel == "ū" then return past_endings_uu elseif not vowel then return nil else error(("Internal error: Unrecognized past final-weak vowel spec '%s'"):format(vowel)) end end local function nonpast_final_weak_endings_from_vowel(vowel) if vowel == "ā" then return ind_endings_aa, sub_endings_aa, juss_endings_aa, imp_endings_aa elseif vowel == "ī" then return ind_endings_ii, sub_endings_ii, juss_endings_ii, imp_endings_ii elseif vowel == "ū" then return ind_endings_uu, sub_endings_uu, juss_endings_uu, imp_endings_uu elseif not vowel then return nil else error(("Internal error: Unrecognized non-past final-weak vowel spec '%s'"):format(vowel)) end end -- Generate finite parts of a final-weak verb from five stems (past and non-past, active and passive, plus -- imperative), the past active ending vowel (ay, aw, ī or ū), the non-past active ending vowel (ā, ī or ū) and the -- prefix vowel in the active non-past (a or u). 
local function make_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, past_ending_vowel, nonpast_ending_vowel, prefix_vowel) past_stem = override_stem_if_needed(base, "past", past_stem) past_pass_stem = override_stem_if_needed(base, "past_pass", past_pass_stem) -- Don't call override_stem_if_needed() here for non-past stems; it's called in nonpast_2stem_conj(). imp_stem = override_stem_if_needed(base, "imp", imp_stem) -- + not supported for ending vowel overrides past_ending_vowel = base.stem_overrides.past_final_weak_vowel or past_ending_vowel local past_pass_ending_vowel = base.stem_overrides.past_pass_final_weak_vowel or "ī" nonpast_ending_vowel = base.stem_overrides.nonpast_final_weak_vowel or nonpast_ending_vowel local nonpast_pass_ending_vowel = base.stem_overrides.nonpast_pass_final_weak_vowel or "ā" local past_endings = past_final_weak_endings_from_vowel(past_ending_vowel) local past_pass_endings = past_final_weak_endings_from_vowel(past_pass_ending_vowel) local ind_endings, sub_endings, juss_endings, imp_endings = nonpast_final_weak_endings_from_vowel(nonpast_ending_vowel) local ind_pass_endings, sub_pass_endings, juss_pass_endings = nonpast_final_weak_endings_from_vowel(nonpast_pass_ending_vowel) inflect_tense(base, "past", "", {past_stem, all_same = 1}, past_endings) inflect_tense(base, "past_pass", "", {past_pass_stem, all_same = 1}, past_pass_endings) nonpast_1stem_conj(base, "ind", prefix_vowel, nonpast_stem, ind_endings) nonpast_1stem_conj(base, "sub", prefix_vowel, nonpast_stem, sub_endings) nonpast_1stem_conj(base, "juss", prefix_vowel, nonpast_stem, juss_endings) nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem, ind_pass_endings) nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem, sub_pass_endings) nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem, juss_pass_endings) inflect_tense_imp(base, {imp_stem, all_same = 1}, imp_endings) end -- Generate finite parts of an augmented 
-- (form II+) final-weak verb from five stems (past and non-past, active and passive, plus imperative) plus the
-- prefix vowel in the active non-past ("a" or "u") and a flag indicating if it behaves like a form V/VI verb in
-- taking non-past endings in -ā instead of -ī.
local function make_augmented_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem,
	imp_stem, prefix_vowel, form56)
	make_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, "ay",
		form56 and "ā" or "ī", prefix_vowel)
end

-- Generate finite parts of an augmented (form II+) sound or final-weak verb, given:
-- * `base` (conjugation data structure);
-- * `vowel_spec` (radicals, weakness);
-- * `past_stem_base` (active past stem minus last syllable (= -al or -ā));
-- * `nonpast_stem_base` (non-past stem minus last syllable (= -al/-il or -ā/-ī));
-- * `past_pass_stem_base` (passive past stem minus last syllable (= -il or -ī));
-- * `vn` (verbal noun).
local function make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base,
	past_pass_stem_base, vn)
	insert_form_or_forms(base, "vn", vn)

	local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3
	local final_weak = is_final_weak(base, vowel_spec)
	local prefix_vowel = prefix_vowel_from_vform(base.verb_form)
	local form56 = vform_nonpast_a_vowel(base.verb_form)
	local a_base_suffix = final_weak and "" or q(A, lastrad)
	local i_base_suffix = final_weak and "" or q(I, lastrad)

	-- past and non-past stems, active and passive
	local past_stem = q(past_stem_base, a_base_suffix)
	-- In forms 5 and 6, the non-past has /a/ as its last stem vowel
	-- in both active and passive, but /i/ in the active participle and /a/
	-- in the passive participle. Elsewhere, consistent /i/ in active non-past
	-- and participle, consistent /a/ in passive non-past and participle.
-- Hence, forms 5 and 6 differ only in the non-past active (but not -- active participle), so we have to split the finite non-past stem and -- active participle stem. local nonpast_stem = q(nonpast_stem_base, form56 and a_base_suffix or i_base_suffix) local ap_stem = q(nonpast_stem_base, i_base_suffix) local past_pass_stem = q(past_pass_stem_base, i_base_suffix) local nonpast_pass_stem = q(nonpast_stem_base, a_base_suffix) -- imperative stem local imp_stem = q(past_stem_base, form56 and a_base_suffix or i_base_suffix) -- make parts if final_weak then make_augmented_final_weak_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, prefix_vowel, form56) else make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, prefix_vowel) end -- active and passive participle if final_weak then insert_form_or_forms(base, "ap", q(MU, ap_stem, IN)) insert_form_or_forms(base, "pp", q(MU, nonpast_pass_stem, AN, AMAQ)) else insert_form_or_forms(base, "ap", q(MU, ap_stem)) insert_form_or_forms(base, "pp", q(MU, nonpast_pass_stem)) end end -- Generate finite parts of a hollow or geminate verb from ten stems (vowel and consonant stems for each of past and -- non-past, active and passive, plus imperative) plus the prefix vowel in the active non-past ("a" or "u"), plus a -- flag indicating if we are a geminate verb. 
local function make_hollow_geminate_verb(base, geminate, past_v_stem, past_c_stem, past_pass_v_stem, past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem, imp_c_stem, prefix_vowel, altgem_note) past_2stem_conj(base, "past", past_v_stem, past_c_stem, altgem_note) past_2stem_conj(base, "past_pass", past_pass_v_stem, past_pass_c_stem) nonpast_2stem_conj(base, "ind", prefix_vowel, nonpast_v_stem, nonpast_c_stem) nonpast_2stem_conj(base, "sub", prefix_vowel, nonpast_v_stem, nonpast_c_stem) nonpast_2stem_conj(base, "ind_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem) nonpast_2stem_conj(base, "sub_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem) if geminate then jussive_gem_conj(base, "juss", prefix_vowel, nonpast_v_stem, nonpast_c_stem) jussive_gem_conj(base, "juss_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem) make_gem_imperative(base, imp_v_stem, imp_c_stem) else nonpast_2stem_conj(base, "juss", prefix_vowel, nonpast_v_stem, nonpast_c_stem) nonpast_2stem_conj(base, "juss_pass", "u", nonpast_pass_v_stem, nonpast_pass_c_stem) make_2stem_imperative(base, imp_v_stem, imp_c_stem) end end -- Generate finite parts of an augmented (form II+) hollow verb, given: -- * `base` (conjugation data structure); -- * `vowel_spec` (radicals, weakness); -- * `past_stem_base` (invariable part of active past stem); -- * `nonpast_stem_base` (invariable part of nonpast stem); -- * `past_pass_stem_base` (invariable part of passive past stem); -- * `vn` (verbal noun). 
local function make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) insert_form_or_forms(base, "vn", vn) local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3 local form410 = base.verb_form == "IV" or base.verb_form == "X" local prefix_vowel = prefix_vowel_from_vform(base.verb_form) local a_base_suffix_v, a_base_suffix_c local i_base_suffix_v, i_base_suffix_c a_base_suffix_v = q(AA, lastrad) -- 'af-āl-a, inf-āl-a a_base_suffix_c = q(A, lastrad) -- 'af-al-tu, inf-al-tu i_base_suffix_v = q(II, lastrad) -- 'uf-īl-a, unf-īl-a i_base_suffix_c = q(I, lastrad) -- 'uf-il-tu, unf-il-tu -- past and non-past stems, active and passive, for vowel-initial and -- consonant-initial endings local past_v_stem = q(past_stem_base, a_base_suffix_v) local past_c_stem = q(past_stem_base, a_base_suffix_c) -- yu-f-īl-u, ya-staf-īl-u but yanf-āl-u, yaft-āl-u local nonpast_v_stem = q(nonpast_stem_base, form410 and i_base_suffix_v or a_base_suffix_v) local nonpast_c_stem = q(nonpast_stem_base, form410 and i_base_suffix_c or a_base_suffix_c) local past_pass_v_stem = q(past_pass_stem_base, i_base_suffix_v) local past_pass_c_stem = q(past_pass_stem_base, i_base_suffix_c) local nonpast_pass_v_stem = q(nonpast_stem_base, a_base_suffix_v) local nonpast_pass_c_stem = q(nonpast_stem_base, a_base_suffix_c) -- imperative stem local imp_v_stem = q(past_stem_base, form410 and i_base_suffix_v or a_base_suffix_v) local imp_c_stem = q(past_stem_base, form410 and i_base_suffix_c or a_base_suffix_c) -- make parts make_hollow_geminate_verb(base, false, past_v_stem, past_c_stem, past_pass_v_stem, past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem, imp_c_stem, prefix_vowel) -- active participle insert_form_or_forms(base, "ap", q(MU, nonpast_v_stem)) -- passive participle insert_form_or_forms(base, "pp", q(MU, nonpast_pass_v_stem)) end -- Generate finite parts of an augmented (form II+) geminate 
verb, given: -- * `base` (conjugation data structure); -- * `vowel_spec` (radicals, weakness); -- * `past_stem_base` (invariable part of active past stem; this and the stem bases below will end with a consonant -- for forms IV, X, IVq, and a short vowel for the others); -- * `nonpast_stem_base` (invariable part of nonpast stem); -- * `past_pass_stem_base` (invariable part of passive past stem); -- * `vn` (verbal noun); -- * `altgem_note` (footnote to add to active past 1/2-person forms, when alternative forms are supplied [form X]). local function make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn, altgem_note) insert_form_or_forms(base, "vn", vn) local vform = base.verb_form local lastrad = base.quadlit and vowel_spec.rad4 or vowel_spec.rad3 local prefix_vowel = prefix_vowel_from_vform(vform) local a_base_suffix_v, a_base_suffix_c local i_base_suffix_v, i_base_suffix_c if vform == "IV" or vform == "X" or vform == "IVq" then a_base_suffix_v = q(A, lastrad, SH) -- 'af-all a_base_suffix_c = q(SK, lastrad, A, lastrad) -- 'af-lal i_base_suffix_v = q(I, lastrad, SH) -- yuf-ill i_base_suffix_c = q(SK, lastrad, I, lastrad) -- yuf-lil else a_base_suffix_v = q(lastrad, SH) -- fā-ll, infa-ll a_base_suffix_c = q(lastrad, A, lastrad) -- fā-lal, infa-lal i_base_suffix_v = q(lastrad, SH) -- yufā-ll, yanfa-ll i_base_suffix_c = q(lastrad, I, lastrad) -- yufā-lil, yanfa-lil end -- past and non-past stems, active and passive, for vowel-initial and -- consonant-initial endings local past_v_stem = q(past_stem_base, a_base_suffix_v) local past_c_stem = q(past_stem_base, a_base_suffix_c) local nonpast_v_stem = q(nonpast_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_v or i_base_suffix_v) local nonpast_c_stem = q(nonpast_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_c or i_base_suffix_c) -- NOTE: Formerly had a comment that "vform III and VI passive past do not have contracted parts, only -- uncontracted parts, 
which are added separately by those functions". This is based on Mace -- "Arabic Verbs and Essential Grammar" (1999) entry 63 (continued), which shows passive ḥūjija but no ḥūjja; -- but that is apparently a mistake, as (1) verb tables in other books do show contracted passive parts for -- these forms; (2) there is no mention of such an exception on p. 99, which explains how geminate ("doubled") -- verbs work (on the contrary, it says "The contracted and uncontracted pairs (see above) are found all -- over Forms III and VI of the doubled verbs"). local past_pass_v_stem = q(past_pass_stem_base, i_base_suffix_v) local past_pass_c_stem = q(past_pass_stem_base, i_base_suffix_c) local nonpast_pass_v_stem = q(nonpast_stem_base, a_base_suffix_v) local nonpast_pass_c_stem = q(nonpast_stem_base, a_base_suffix_c) -- imperative stem local imp_v_stem = q(past_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_v or i_base_suffix_v) local imp_c_stem = q(past_stem_base, vform_nonpast_a_vowel(vform) and a_base_suffix_c or i_base_suffix_c) -- make parts make_hollow_geminate_verb(base, "geminate", past_v_stem, past_c_stem, past_pass_v_stem, past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem, imp_c_stem, prefix_vowel, altgem_note) -- active participle insert_form_or_forms(base, "ap", q(MU, nonpast_v_stem)) -- passive participle insert_form_or_forms(base, "pp", q(MU, nonpast_pass_v_stem)) end ------------------------------------------------------------------------------- -- Conjugation functions for specific conjugation types -- ------------------------------------------------------------------------------- local function form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1) local imp_vowel = map_vowel(nonpast_vowel, function(vow) if vow == A or vow == I then return I elseif vow == U then return U elseif not skip_slot(base, "imp_2ms") then error(("Internal error: Non-past vowel %s isn't a, i, or u, should have been caught 
earlier"):format(dump(nonpast_vowel)))
		else
			-- Passive-only; imperative won't ever be displayed so it doesn't matter.
			return I
		end
	end)
	-- Mace ("Arabic Verbs and Essential Grammar", p. 63:
	-- [https://archive.org/details/arabicverbsessen00john/page/62/mode/2up]) claims that initial hamza is
	-- assimilated/elided into a long vowel in the form-I imperative, but apparently this isn't correct.
	local vowel_on_alif = map_vowel(imp_vowel, function(vow) return ALIF .. vow end)
	return q(vowel_on_alif, rad1, SK)
end

-- Implement form-I sound or assimilated verb. ASSIMILATED is true for assimilated verbs.
local function make_form_i_sound_assimilated_verb(base, vowel_spec, assimilated)
	local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)

	-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.

	-- past and non-past stems, active and passive
	local past_stem = q(rad1, A, rad2, past_vowel, rad3)
	local nonpast_stem = assimilated and q(rad2, nonpast_vowel, rad3) or
		q(rad1, SK, rad2, nonpast_vowel, rad3)
	local past_pass_stem = q(rad1, U, rad2, I, rad3)
	local nonpast_pass_stem = q(rad1, SK, rad2, A, rad3)

	-- imperative stem
	-- check for irregular verb with reduced imperative (أَخَذَ or أَكَلَ or أَمَرَ)
	local reducedimp = reduced_imperative_verb(rad1, rad2, rad3)
	if reducedimp then
		base.irregular = true
	end
	local imp_stem_suffix = q(rad2, nonpast_vowel, rad3)
	local long_imp_stem_base = form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1)
	local short_imp_stem_base = ""
	local imp_stem = q((assimilated or reducedimp) and "" or long_imp_stem_base, imp_stem_suffix)

	-- make parts
	make_sound_verb(base, past_stem, past_pass_stem, nonpast_stem, nonpast_pass_stem, imp_stem, "a")

	if reducedimp == "shortlong" then
		make_1stem_imperative(base, iut.combine_form_and_footnotes(q(long_imp_stem_base, imp_stem_suffix),
			mw.getCurrentFrame():preprocess("[used especially with a clitic such as {{m|ar|فَ}} or {{m|ar|وَ}}]")))
	end

	-- Check for irregular
verb سَأَلَ with alternative jussive and imperative. Calling this after make_sound_verb() -- adds additional entries to the paradigm parts. if saal_radicals(rad1, rad2, rad3) then base.irregular = true nonpast_1stem_conj(base, "juss", "a", "سَل") nonpast_1stem_conj(base, "juss_pass", "u", "سَل") make_1stem_imperative(base, "سَل") end -- Active participle. insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, I, rad3)) -- Insert alternative active participle (stative type I) فَعِيل. Since not all verbs have this, we require that -- verbs that do have it specify it explicitly; a shortcut ++ is provided to make this easier (e.g. <ap:++> to -- indicate that the alternative form should be used for the active participle, <ap:+,++> to indicate that both -- forms can be used, and <ap:-> to indicate that there is no active participle). The same form is used for -- secondary default passive participle. insert_ap2_pp2(base, q(rad1, A, rad2, II, rad3)) -- Active participle, stative type II فَعِل (+++). insert_form_or_forms(base, "ap3", q(rad1, A, rad2, I, rad3)) -- Active participle, color/defect أَفْعَل (+cd). insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, A, rad3)) -- Active participle, -ān فَعْلَان (+an). insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN)) -- Passive participle. insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, UU, rad3)) end conjugations["I-sound"] = function(base, vowel_spec) make_form_i_sound_assimilated_verb(base, vowel_spec, false) end conjugations["none-sound"] = function(base, vowel_spec) -- All default stems are nil. make_sound_verb(base) end conjugations["none-hollow"] = function(base, vowel_spec) -- All default stems are nil. make_hollow_geminate_verb(base, false) end conjugations["none-geminate"] = function(base, vowel_spec) -- All default stems are nil. make_hollow_geminate_verb(base, "geminate") end conjugations["none-final-weak"] = function(base, vowel_spec) -- All default stems are nil. 
	make_final_weak_verb(base)
end

conjugations["I-assimilated"] = function(base, vowel_spec)
	make_form_i_sound_assimilated_verb(base, vowel_spec, "assimilated")
end

local function make_form_i_hayy_verb(base, vowel_spec)
	-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.
	base.irregular = true

	-- past and non-past stems, active and passive, and imperative stem
	local past_c_stem = "حَيِي"
	local past_v_stem_long = past_c_stem
	local past_v_stem_short = "حَيّ"
	local past_pass_c_stem = "حُيِي"
	local past_pass_v_stem_long = past_pass_c_stem
	local past_pass_v_stem_short = "حُيّ"

	local nonpast_stem = "حْي"
	local nonpast_pass_stem = nonpast_stem
	local imp_stem = _I .. nonpast_stem

	-- make parts

	past_2stem_conj(base, "past", {}, past_c_stem)
	past_2stem_conj(base, "past_pass", {}, past_pass_c_stem)
	local variant = vowel_spec.variant or "both"
	if variant == "short" or variant == "both" then
		past_2stem_conj(base, "past", past_v_stem_short, {})
		past_2stem_conj(base, "past_pass", past_pass_v_stem_short, {})
	end
	-- Declared local so that the helper does not leak into the global environment.
	local function inflect_long_variant(tense, long_stem, short_stem)
		inflect_tense_1(base, tense, "",
			{long_stem, long_stem, long_stem, long_stem, short_stem},
			{past_endings[4], past_endings[5], past_endings[7], past_endings[8], past_endings[12]},
			{"3ms", "3fs", "3md", "3fd", "3mp"})
	end
	if variant == "long" or variant == "both" then
		inflect_long_variant("past", past_v_stem_long, past_v_stem_short)
		inflect_long_variant("past_pass", past_pass_v_stem_long, past_pass_v_stem_short)
	end

	nonpast_1stem_conj(base, "ind", "a", nonpast_stem, ind_endings_aa)
	nonpast_1stem_conj(base, "sub", "a", nonpast_stem, sub_endings_aa)
	nonpast_1stem_conj(base, "juss", "a", nonpast_stem, juss_endings_aa)
	nonpast_1stem_conj(base, "ind_pass", "u", nonpast_pass_stem, ind_endings_aa)
	nonpast_1stem_conj(base, "sub_pass", "u", nonpast_pass_stem, sub_endings_aa)
	nonpast_1stem_conj(base, "juss_pass", "u", nonpast_pass_stem, juss_endings_aa)
	inflect_tense_imp(base, {imp_stem, all_same = 1},
imp_endings_aa)

		-- active and passive participles apparently do not exist for this verb
	end

	-- Implement a form-I final-weak or assimilated+final-weak verb. ASSIMILATED is truthy for assimilated verbs.
	local function make_form_i_final_weak_verb(base, vowel_spec, assimilated)
		local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec)

		-- حَيَّ or حَيِيَ is weird enough that we handle it as a separate function.
		if hayy_radicals(rad1, rad2, rad3) then
			make_form_i_hayy_verb(base, vowel_spec)
			return
		end

		-- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied.

		-- Past and non-past stems, active and passive, and imperative stem.
		local past_stem = q(rad1, A, rad2)
		local past_pass_stem = q(rad1, U, rad2)
		local nonpast_stem, nonpast_pass_stem, imp_stem
		if raa_radicals(rad1, rad2, rad3) then
			base.irregular = true
			nonpast_stem = rad1
			nonpast_pass_stem = rad1
			imp_stem = rad1
		else
			nonpast_pass_stem = q(rad1, SK, rad2)
			if assimilated then
				nonpast_stem = rad2
				imp_stem = rad2
			else
				nonpast_stem = nonpast_pass_stem
				imp_stem = q(form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1), rad2)
			end
		end

		-- Make parts.
		local past_ending_vowel =
			req(rad3, Y) and req(past_vowel, A) and "ay" or
			req(rad3, W) and req(past_vowel, A) and "aw" or
			req(past_vowel, I) and "ī" or "ū"
		-- Try to preserve footnotes attached to the third radical and/or past and/or non-past vowels.
local past_footnotes = iut.combine_footnotes(rget_footnotes(rad3), rget_footnotes(past_vowel)) local nonpast_ending_vowel = req(nonpast_vowel, A) and "ā" or req(nonpast_vowel, I) and "ī" or "ū" local nonpast_footnotes = iut.combine_footnotes(rget_footnotes(rad3), rget_footnotes(nonpast_vowel)) make_final_weak_verb(base, iut.combine_form_and_footnotes(past_stem, past_footnotes), iut.combine_form_and_footnotes(past_pass_stem, past_footnotes), iut.combine_form_and_footnotes(nonpast_stem, nonpast_footnotes), iut.combine_form_and_footnotes(nonpast_pass_stem, nonpast_footnotes), iut.combine_form_and_footnotes(imp_stem, nonpast_footnotes), past_ending_vowel, nonpast_ending_vowel, "a") -- Active participle. insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, IN)) -- Active participle, stative type I فَعِيّ (++). FIXME: Is this correct when rad3 is W? insert_ap2_pp2(base, q(rad1, A, rad2, II, SH)) -- Active participle, stative type II فَعٍ (+++). FIXME: Any examples of this to verify it's correct? insert_form_or_forms(base, "ap3", q(rad1, A, rad2, IN)) -- Active participle, color/defect أَفْعَى (+cd). insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, AAMAQ)) -- Active participle, -ān فَعْيَان or فَعْوَان (+an). -- FIXME: Any examples of this for both rad3 = W and y to verify it's correct? insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN)) -- Passive participle. 
insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, req(rad3, Y) and II or UU, SH)) end conjugations["I-final-weak"] = function(base, vowel_spec) make_form_i_final_weak_verb(base, vowel_spec, false) end conjugations["I-assimilated+final-weak"] = function(base, vowel_spec) make_form_i_final_weak_verb(base, vowel_spec, "assimilated") end conjugations["I-hollow"] = function(base, vowel_spec) local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec) -- In some sense, hollow vowels i~i and u~u are more "correct" than a~i and a~u, but the latter follow the -- pattern of other form-I verbs, so we map i~i to a~i and u~u to a~u in infer_radicals(). Now however we have -- to undo this to get the actual past vowel based on the non-past vowel. if req(past_vowel, A) then past_vowel = map_vowel(past_vowel, function(vow) return req(nonpast_vowel, A) and I or rget(nonpast_vowel) end) end local lengthened_nonpast = map_vowel(nonpast_vowel, function(vow) return vow == U and UU or vow == I and II or AA end) -- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied. 
-- active past stems - vowel (v) and consonant (c) local past_v_stem = q(rad1, AA, rad3) local past_c_stem = q(rad1, past_vowel, rad3) -- active non-past stems - vowel (v) and consonant (c) local nonpast_v_stem = q(rad1, lengthened_nonpast, rad3) local nonpast_c_stem = q(rad1, nonpast_vowel, rad3) -- passive past stems - vowel (v) and consonant (c) -- 'ufīla, 'ufiltu local past_pass_v_stem = q(rad1, II, rad3) local past_pass_c_stem = q(rad1, I, rad3) -- passive non-past stems - vowel (v) and consonant (c) -- yufāla/yufalna -- stem is built differently but conjugation is identical to sound verbs local nonpast_pass_v_stem = q(rad1, AA, rad3) local nonpast_pass_c_stem = q(rad1, A, rad3) -- imperative stem local imp_v_stem = nonpast_v_stem local imp_c_stem = nonpast_c_stem -- make parts make_hollow_geminate_verb(base, false, past_v_stem, past_c_stem, past_pass_v_stem, past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem, imp_c_stem, "a") if kaan_radicals(rad1, rad2, rad3) then local endings = make_nonpast_endings(U, {}, {}, {}, {}) inflect_tense(base, "juss", nonpast_prefix_consonants, q(A, rad1), endings) base.irregular = true end -- Active participle. insert_form_or_forms(base, "ap1", req(rad3, HAMZA) and q(rad1, AA, HAMZA, IN) or q(rad1, AA, HAMZA, I, rad3)) -- Active participle, stative type I فَيِّد (++). FIXME: Any examples of this to verify it's correct? insert_ap2_pp2(base, q(rad1, A, Y, SH, I, rad3)) -- Active participle, stative type II فَيِد (+++). FIXME: Any examples of this to verify it's correct? insert_form_or_forms(base, "ap3", q(rad1, A, Y, I, rad3)) -- Active participle, color/defect أَفّيَد or أَفّوَد (+cd). FIXME: Any examples of this to verify it's correct? insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, SK, rad2, A, rad3)) -- Active participle, -ān فَيْدَان or فَوْدَان (+an). 
Example: جَاعَ "to be hungry", act part جَوْعَان insert_form_or_forms(base, "apan", q(rad1, A, rad2, SK, rad3, AAN)) -- Passive participle. insert_form_or_forms(base, "pp", q(MA, rad1, req(rad2, Y) and II or UU, rad3)) end conjugations["I-geminate"] = function(base, vowel_spec) local rad1, rad2, rad3, past_vowel, nonpast_vowel = get_radicals_3(vowel_spec) -- Verbal nouns (maṣādir) for form I are unpredictable and have to be supplied. -- active past stems - vowel (v) and consonant (c) local past_v_stem = q(rad1, A, rad2, SH) local past_c_stem = q(rad1, A, rad2, past_vowel, rad2) -- active non-past stems - vowel (v) and consonant (c) local nonpast_v_stem = q(rad1, nonpast_vowel, rad2, SH) local nonpast_c_stem = q(rad1, SK, rad2, nonpast_vowel, rad2) -- passive past stems - vowel (v) and consonant (c) -- dulla/dulilta local past_pass_v_stem = q(rad1, U, rad2, SH) local past_pass_c_stem = q(rad1, U, rad2, I, rad2) -- passive non-past stems - vowel (v) and consonant (c) --yudallu/yudlalna -- stem is built differently but conjugation is identical to sound verbs local nonpast_pass_v_stem = q(rad1, A, rad2, SH) local nonpast_pass_c_stem = q(rad1, SK, rad2, A, rad2) -- imperative stem local imp_v_stem = q(rad1, nonpast_vowel, rad2, SH) local imp_c_stem = q(form_i_imp_stem_through_rad1(base, nonpast_vowel, rad1), rad2, nonpast_vowel, rad2) -- make parts make_hollow_geminate_verb(base, "geminate", past_v_stem, past_c_stem, past_pass_v_stem, past_pass_c_stem, nonpast_v_stem, nonpast_c_stem, nonpast_pass_v_stem, nonpast_pass_c_stem, imp_v_stem, imp_c_stem, "a") -- Active participle. insert_form_or_forms(base, "ap1", q(rad1, AA, rad2, SH)) -- Active participle, stative type I فَعِيع (++). FIXME: Any examples of this to verify it's correct? insert_ap2_pp2(base, q(rad1, A, rad2, II, rad2)) -- Active participle, stative type II فَعّ (+++). 
Example: بَرَّ "to be pious", active participle بَرّ insert_form_or_forms(base, "ap3", q(rad1, A, rad2, SH)) -- Active participle, color/defect أَفَعّ (+cd). -- Example: لَصَّ "to be thievish, to steal repeatedly", active participle أَلَصّ. insert_form_or_forms(base, "apcd", q(HAMZA, A, rad1, A, rad2, SH)) -- Active participle, -ān فَعَّان (+an). FIXME: Any examples of this to verify it's correct? insert_form_or_forms(base, "apan", q(rad1, A, rad2, SH, AAN)) -- Passive participle. insert_form_or_forms(base, "pp", q(MA, rad1, SK, rad2, UU, rad2)) end -- Return the ta- (active, past and non-past) and tu- (passive past) prefixes for a form II/III/V/VI verb. -- Form V and VI verbs normally use ta- and tu-, but reduced (base.reduced) verbs use different prefixes. Form II -- and III verbs have no prefix. local function form_ii_iii_v_vi_ta_tu_prefix(base, rad1) local vform = base.verb_form if vform == "V" or vform == "VI" then if base.reduced then -- To simplify the code, we generate two rad1's with a sukūn between them, which is cleaned up in -- postprocessing. return q(_I, rad1, SK), q(rad1, SK), q(_U, rad1, SK) else return TA, TA, TU end else return "", "", "" end end -- Make form II or V sound or final-weak verb. 
local function make_form_ii_v_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) local vform = base.verb_form local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1) local vn = vform == "V" and q(ta_past_prefix, rad1, A, rad2, SH, final_weak and IN or q(U, rad3)) or q(TA, rad1, SK, rad2, II, final_weak and AH or rad3) -- various stem bases local past_stem_base = q(ta_past_prefix, rad1, A, rad2, SH) local nonpast_stem_base = q(ta_nonpast_prefix, rad1, A, rad2, SH) local past_pass_stem_base = q(tu_past_prefix, rad1, U, rad2, SH) -- make parts make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["II-sound"] = function(base, vowel_spec) make_form_ii_v_sound_final_weak_verb(base, vowel_spec) end conjugations["II-final-weak"] = function(base, vowel_spec) make_form_ii_v_sound_final_weak_verb(base, vowel_spec) end local function make_form_iii_alt_vn(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) -- Insert alternative verbal noun فِعَال. Since not all verbs have this, we require that verbs that do have it -- specify it explicitly; a shortcut ++ is provided to make this easier (e.g. <vn:+,++> to indicate that -- both the normal verbal noun مُفَاعَلَة and secondary verbal noun فِعَال are available). insert_form_or_forms(base, "vn2", q(rad1, I, rad2, AA, final_weak and HAMZA or rad3)) end -- Make form III or VI sound or final-weak verb. 
local function make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) local vform = base.verb_form local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1) local vn = vform == "VI" and q(ta_past_prefix, rad1, AA, rad2, final_weak and IN or q(U, rad3)) or q(MU, rad1, AA, rad2, final_weak and AAH or q(A, rad3, AH)) -- various stem bases local past_stem_base = q(ta_past_prefix, rad1, AA, rad2) local nonpast_stem_base = q(ta_nonpast_prefix, rad1, AA, rad2) local past_pass_stem_base = q(tu_past_prefix, rad1, UU, rad2) -- make parts make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) if vform == "III" then make_form_iii_alt_vn(base, vowel_spec) end end conjugations["III-sound"] = function(base, vowel_spec) make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) end conjugations["III-final-weak"] = function(base, vowel_spec) make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) end -- Make form III or VI geminate verb. local function make_form_iii_vi_geminate_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vform = base.verb_form local ta_past_prefix, ta_nonpast_prefix, tu_past_prefix = form_ii_iii_v_vi_ta_tu_prefix(base, rad1) -- Alternative verbal noun فِعَال will be inserted when we add sound parts below. local vn = vform == "VI" and q(ta_past_prefix, rad1, AA, rad2, SH) or q(MU, rad1, AA, rad2, SH, AH) -- Various stem bases. local past_stem_base = q(ta_past_prefix, rad1, AA) local nonpast_stem_base = q(ta_nonpast_prefix, rad1, AA) local past_pass_stem_base = q(tu_past_prefix, rad1, UU) -- Make parts. 
local variant = vowel_spec.variant or "short" if variant == "short" or variant == "both" then make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end -- Also add alternative sound (non-compressed) parts. This will lead to some duplicate entries, but they are -- removed during addition. if variant == "long" or variant == "both" then make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) elseif vform == "III" then -- Still need to add the alternative form-III verbal noun. make_form_iii_alt_vn(base, vowel_spec) end end conjugations["III-geminate"] = function(base, vowel_spec) make_form_iii_vi_geminate_verb(base, vowel_spec) end -- Make form IV sound or final-weak verb. local function make_form_iv_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) -- core of stem base, minus stem prefixes local stem_core -- check for irregular verb أَرَى local is_raa = raa_radicals(rad1, rad2, rad3) if is_raa then base.irregular = true stem_core = rad1 else stem_core = q(rad1, SK, rad2) end -- verbal noun local vn = is_raa and q(HAMZA, I, stem_core, AA, HAMZA, AH) or q(HAMZA, I, stem_core, AA, final_weak and HAMZA or rad3) -- various stem bases local past_stem_base = q(HAMZA, A, stem_core) local nonpast_stem_base = stem_core local past_pass_stem_base = q(HAMZA, U, stem_core) -- make parts make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["IV-sound"] = function(base, vowel_spec) make_form_iv_sound_final_weak_verb(base, vowel_spec) end conjugations["IV-final-weak"] = function(base, vowel_spec) make_form_iv_sound_final_weak_verb(base, vowel_spec) end conjugations["IV-hollow"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) -- verbal noun local vn = q(HAMZA, I, rad1, AA, rad3, AH) -- various stem bases local past_stem_base = 
q(HAMZA, A, rad1) local nonpast_stem_base = rad1 local past_pass_stem_base = q(HAMZA, U, rad1) -- make parts make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["IV-geminate"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = q(HAMZA, I, rad1, SK, rad2, AA, rad2) -- various stem bases local past_stem_base = q(HAMZA, A, rad1) local nonpast_stem_base = rad1 local past_pass_stem_base = q(HAMZA, U, rad1) -- make parts make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["V-sound"] = function(base, vowel_spec) make_form_ii_v_sound_final_weak_verb(base, vowel_spec) end conjugations["V-final-weak"] = function(base, vowel_spec) make_form_ii_v_sound_final_weak_verb(base, vowel_spec) end conjugations["VI-sound"] = function(base, vowel_spec) make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) end conjugations["VI-final-weak"] = function(base, vowel_spec) make_form_iii_vi_sound_final_weak_verb(base, vowel_spec) end conjugations["VI-geminate"] = function(base, vowel_spec) make_form_iii_vi_geminate_verb(base, vowel_spec) end -- Make a verbal noun of the general form that applies to forms VII and above. RAD12 is the first consonant cluster -- (after initial اِ) and RAD34 is the second consonant cluster. RAD5 is the final consonant. local function high_form_verbal_noun(rad12, rad34, rad5) return q(_I, rad12, I, rad34, AA, rad5) end -- Populate a sound or final-weak verb for any of the various high-numbered augmented forms (form VII and up) that -- have up to 5 consonants in two clusters in the stem and the same pattern of vowels between. 
Some of these -- consonants in certain verb parts are w's, which leads to apparent anomalies in certain stems of these parts, but -- these anomalies are handled automatically in postprocessing, where we resolve sequences of iwC -> īC, uwC -> ūC, -- w + sukūn + w -> w + shadda. -- RAD12 is the first consonant cluster (after initial اِ) and RAD34 is the second consonant cluster. RAD5 is the -- final consonant. local function make_high_form_sound_final_weak_verb(base, vowel_spec, rad12, rad34, rad5) local final_weak = is_final_weak(base, vowel_spec) local vn = high_form_verbal_noun(rad12, rad34, final_weak and HAMZA or rad5) -- various stem bases local nonpast_stem_base = q(rad12, A, rad34) local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, rad12, U, rad34) -- make parts make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end local function form_vii_nrad1(base, rad1) if base.reduced then if not req(rad1, M) then error(("Internal error: Form VII first radical %s is not م but .reduced specified; should have been caught earlier"): format(rget(rad1))) end return M .. SH else return q("نْ", rad1) end end -- Make form VII sound or final-weak verb. 
local function make_form_vii_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) make_high_form_sound_final_weak_verb(base, vowel_spec, form_vii_nrad1(base, rad1), rad2, rad3) end conjugations["VII-sound"] = function(base, vowel_spec) make_form_vii_sound_final_weak_verb(base, vowel_spec) end conjugations["VII-final-weak"] = function(base, vowel_spec) make_form_vii_sound_final_weak_verb(base, vowel_spec) end conjugations["VII-hollow"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local nrad1 = form_vii_nrad1(base, rad1) local vn = high_form_verbal_noun(nrad1, Y, rad3) -- various stem bases local nonpast_stem_base = nrad1 local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, nrad1) -- make parts make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["VII-geminate"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local nrad1 = form_vii_nrad1(base, rad1) local vn = high_form_verbal_noun(nrad1, rad2, rad2) -- various stem bases local nonpast_stem_base = q(nrad1, A) local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, nrad1, U) -- make parts make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end -- Return Form VIII verbal noun. local function form_viii_verbal_noun(base, vowel_spec, rad1, rad2, rad3) local final_weak = is_final_weak(base, vowel_spec) rad3 = final_weak and HAMZA or rad3 return {high_form_verbal_noun(vowel_spec.form_viii_assim, rad2, rad3)} end -- Make form VIII sound or final-weak verb. 
local function make_form_viii_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) -- check for irregular verb اِتَّخَذَ if axadh_radicals(rad1, rad2, rad3) then base.irregular = true rad1 = T end make_high_form_sound_final_weak_verb(base, vowel_spec, vowel_spec.form_viii_assim, rad2, rad3) end conjugations["VIII-sound"] = function(base, vowel_spec) make_form_viii_sound_final_weak_verb(base, vowel_spec) end conjugations["VIII-final-weak"] = function(base, vowel_spec) make_form_viii_sound_final_weak_verb(base, vowel_spec) end conjugations["VIII-hollow"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = form_viii_verbal_noun(base, vowel_spec, rad1, Y, rad3) -- various stem bases local nonpast_stem_base = vowel_spec.form_viii_assim local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, nonpast_stem_base) -- make parts make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["VIII-geminate"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = form_viii_verbal_noun(base, vowel_spec, rad1, rad2, rad2) -- various stem bases local nonpast_stem_base = q(vowel_spec.form_viii_assim, A) local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, vowel_spec.form_viii_assim, U) -- make parts make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["IX-sound"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = q(_I, rad1, SK, rad2, I, rad3, AA, rad3) -- various stem bases local nonpast_stem_base = q(rad1, SK, rad2, A) local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, rad1, SK, rad2, U) -- make parts make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end 
conjugations["IX-final-weak"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) make_high_form_sound_final_weak_verb(base, vowel_spec, q(rad1, SK, rad2), rad3, rad3) end -- Populate a sound or final-weak verb for any of the various high-numbered -- augmented forms that have 5 consonants in the stem and the same pattern of -- vowels. Some of these consonants in certain verb parts are w's, which leads to -- apparent anomalies in certain stems of these parts, but these anomalies -- are handled automatically in postprocessing, where we resolve sequences of -- iwC -> īC, uwC -> ūC, w + sukūn + w -> w + shadda. local function make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, rad3, rad4, rad5) make_high_form_sound_final_weak_verb(base, vowel_spec, q(rad1, SK, rad2), q(rad3, SK, rad4), rad5) end -- Make form X sound or final-weak verb. local function make_form_x_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) -- check for irregular verb اِسْتَحْيَا (also اِسْتَحَى) local is_hayy = hayy_radicals(rad1, rad2, rad3) local variant = vowel_spec.variant or "both" if not is_hayy or variant == "long" or variant == "both" then make_high5_form_sound_final_weak_verb(base, vowel_spec, S, T, rad1, rad2, rad3) end if is_hayy and (variant == "short" or variant == "both") then base.irregular = true -- Add alternative entries to the verbal paradigms. Any duplicates are removed during addition. make_high_form_sound_final_weak_verb(base, vowel_spec, S .. SK .. 
T, rad1, rad3) end end conjugations["X-sound"] = function(base, vowel_spec) make_form_x_sound_final_weak_verb(base, vowel_spec) end conjugations["X-final-weak"] = function(base, vowel_spec) make_form_x_sound_final_weak_verb(base, vowel_spec) end conjugations["X-hollow"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = q(base.reduced and "اِسْ" or "اِسْتِ", rad1, AA, rad3, AH) -- various stem bases local past_stem_base = q(base.reduced and "اِسْ" or "اِسْتَ", rad1) local nonpast_stem_base = q(base.reduced and "سْ" or "سْتَ", rad1) local past_pass_stem_base = q(base.reduced and "اُسْ" or "اُسْتُ", rad1) -- make parts make_augmented_hollow_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["X-geminate"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = q("اِسْتِ", rad1, SK, rad2, AA, rad2) -- various stem bases local past_stem_base = q("اِسْتَ", rad1) local nonpast_stem_base = q("سْتَ", rad1) local past_pass_stem_base = q("اُسْتُ", rad1) -- make parts if base.altgem then inflect_tense(base, "past", "", {q(past_stem_base, A, rad2, SH), all_same = 1}, past_endings_ay_12_person_only) end make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn, base.altgem and "[uncommon]" or nil) end conjugations["XI-sound"] = function(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local vn = q(_I, rad1, SK, rad2, II, rad3, AA, rad3) -- various stem bases local nonpast_stem_base = q(rad1, SK, rad2, AA) local past_stem_base = q(_I, nonpast_stem_base) local past_pass_stem_base = q(_U, rad1, SK, rad2, UU) -- make parts make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end -- Probably no form XI final-weak, since already geminate in form; would behave as XI-sound. -- Make form XII sound or final-weak verb. 
local function make_form_xii_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, W, rad2, rad3) end conjugations["XII-sound"] = function(base, vowel_spec) make_form_xii_sound_final_weak_verb(base, vowel_spec) end conjugations["XII-final-weak"] = function(base, vowel_spec) make_form_xii_sound_final_weak_verb(base, vowel_spec) end -- Make form XIII sound or final-weak verb. local function make_form_xiii_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, W, W, rad3) end conjugations["XIII-sound"] = function(base, vowel_spec) make_form_xiii_sound_final_weak_verb(base, vowel_spec) end conjugations["XIII-final-weak"] = function(base, vowel_spec) make_form_xiii_sound_final_weak_verb(base, vowel_spec) end -- Make a form XIV or XV sound or final-weak verb. Last radical appears twice (if`anlala / yaf`anlilu) so if it were -- w or y you'd get if`anwā / yaf`anwī or if`anyā / yaf`anyī, i.e. unlike for most augmented verbs, the identity of -- the radical matters. local function make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3 = get_radicals_3(vowel_spec) local lastrad = base.verb_form == "XV" and Y or rad3 make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, N, rad3, lastrad) end conjugations["XIV-sound"] = function(base, vowel_spec) make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec) end conjugations["XIV-final-weak"] = function(base, vowel_spec) make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec) end conjugations["XV-sound"] = function(base, vowel_spec) make_form_xiv_xv_sound_final_weak_verb(base, vowel_spec) end -- Probably no form XV final-weak, since already final-weak in form; would behave as XV-sound. -- Make form Iq or IIq sound or final-weak verb. 
local function make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec) local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) local vform = base.verb_form local vn = vform == "IIq" and q(TA, rad1, A, rad2, SK, rad3, (final_weak and IN or q(U, rad4))) or q(rad1, A, rad2, SK, rad3, (final_weak and AAH or q(A, rad4, AH))) local ta_pref = vform == "IIq" and TA or "" local tu_pref = vform == "IIq" and TU or "" -- various stem bases local past_stem_base = q(ta_pref, rad1, A, rad2, SK, rad3) local nonpast_stem_base = past_stem_base local past_pass_stem_base = q(tu_pref, rad1, U, rad2, SK, rad3) -- make parts make_augmented_sound_final_weak_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn) end conjugations["Iq-sound"] = function(base, vowel_spec) make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec) end conjugations["Iq-final-weak"] = function(base, vowel_spec) make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec) end conjugations["IIq-sound"] = function(base, vowel_spec) make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec) end conjugations["IIq-final-weak"] = function(base, vowel_spec) make_form_iq_iiq_sound_final_weak_verb(base, vowel_spec) end -- Make form IIIq sound or final-weak verb. 
	local function make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
		local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
		make_high5_form_sound_final_weak_verb(base, vowel_spec, rad1, rad2, N, rad3, rad4)
	end

	conjugations["IIIq-sound"] = function(base, vowel_spec)
		make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
	end

	conjugations["IIIq-final-weak"] = function(base, vowel_spec)
		make_form_iiiq_sound_final_weak_verb(base, vowel_spec)
	end

	conjugations["IVq-sound"] = function(base, vowel_spec)
		local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec)
		local vn = q(_I, rad1, SK, rad2, I, rad3, SK, rad4, AA, rad4)

		-- various stem bases
		local past_stem_base = q(_I, rad1, SK, rad2, A, rad3)
		local nonpast_stem_base = q(rad1, SK, rad2, A, rad3)
		local past_pass_stem_base = q(_U, rad1, SK, rad2, U, rad3)

		-- make parts
		make_augmented_geminate_verb(base, vowel_spec, past_stem_base, nonpast_stem_base, past_pass_stem_base, vn)
	end

	-- Probably no form IVq final-weak, since already geminate in form; would behave as IVq-sound.
end

create_conjugations()

-------------------------------------------------------------------------------
--                      Guts of main conjugation function                     --
-------------------------------------------------------------------------------

-- Given form, weakness and radicals, check to make sure the radicals present are allowable for the weakness. Hamzas on
-- alif/wāw/yāʾ seats are never allowed (should always appear as hamza-on-the-line), and various weaknesses have various
-- strictures on allowable consonants.
local function check_radicals(form, weakness, rad1, rad2, rad3, rad4)
	local function hamza_check(index, rad)
		if rad == HAMZA_ON_ALIF or rad == HAMZA_UNDER_ALIF or rad == HAMZA_ON_W or rad == HAMZA_ON_Y then
			error("Radical " .. index .. " is " .. rad .. " but should be ء (hamza on the line)")
		end
	end

	local function check_waw_ya(index, rad)
		if not is_waw_ya(rad) then
			error("Radical " .. index .. " is " .. rad .. " but should be و or ي")
		end
	end

	local function check_not_waw_ya(index, rad)
		if is_waw_ya(rad) then
			error("In a sound verb, radical " .. index .. " should not be و or ي")
		end
	end

	-- Pass the radical index first; hamza_check() takes (index, rad), so calling it with the radical alone
	-- leaves `rad` nil and the check never fires.
	hamza_check(1, rad1)
	hamza_check(2, rad2)
	hamza_check(3, rad3)
	hamza_check(4, rad4)

	if weakness == "assimilated" or weakness == "assimilated+final-weak" then
		if rad1 ~= W then
			error("Radical 1 is " .. rad1 .. " but should be و")
		end
		-- don't check that non-assimilated form I verbs don't have wāw as their
		-- first radical because some form-I verbs exist where a first-radical wāw
		-- behaves as sound, e.g. wajuha yawjuhu "to be distinguished".
	end
	if weakness == "final-weak" or weakness == "assimilated+final-weak" then
		if rad4 then
			check_waw_ya(4, rad4)
		else
			check_waw_ya(3, rad3)
		end
	elseif vform_supports_final_weak(form) then
		-- non-final-weak verbs cannot have weak final radical if there's a corresponding
		-- final-weak verb category. I think this is safe. We may have problems with
		-- ḥayya/ḥayiya yaḥyā if we treat it as a geminate verb.
		if rad4 then
			check_not_waw_ya(4, rad4)
		else
			check_not_waw_ya(3, rad3)
		end
	end
	if weakness == "hollow" then
		check_waw_ya(2, rad2)
		-- don't check that non-hollow verbs in forms that support hollow verbs
		-- don't have wāw or yāʾ as their second radical because some verbs exist
		-- where a middle-radical wāw/yāʾ behaves as sound, e.g. form-VIII izdawaja
		-- "to be in pairs".
	end
	if weakness == "geminate" then
		if rad4 then
			error("Internal error: No geminate quadriliterals, should not be seen")
		end
		if rad2 ~= rad3 then
			error("Weakness is geminate; radical 3 is " .. rad3 .. " but should be same as radical 2 " .. rad2)
		end
	elseif vform_supports_geminate(form) then
		-- non-geminate verbs cannot have second and third radical same if there's
		-- a corresponding geminate verb category. I think this is safe. We
		-- don't fuss over double wāw or double yāʾ because this could legitimately
		-- be a final-weak verb with middle wāw/yāʾ, treated as sound.
		if rad4 then
			error("Internal error: No quadriliterals should support geminate verbs")
		end
		if rad2 == rad3 and not is_waw_ya(rad2) then
			error("Weakness is '" .. weakness .. "'; radical 2 and 3 are same at " .. rad2 .. " but should not be; consider making weakness 'geminate'")
		end
	end
end

-- array of substitutions; each element is a 2-entry array FROM, TO; do it
-- this way so the concatenations only get evaluated once
local postprocess_subs = {
	-- reorder short-vowel + shadda -> shadda + short-vowel for easier processing
	{"(" .. AIU .. ")" .. SH, SH .. "%1"},

	----------same letter separated by sukūn should instead use shadda---------
	------------happens e.g. in kun-nā "we were".-----------------
	{"(.)" .. SK .. "%1", "%1" .. SH},

	---------------------------- assimilated verbs ----------------------------
	-- iw, iy -> ī (assimilated verbs)
	{I .. W .. SK, II},
	{I .. Y .. SK, II},

	-- uw, uy -> ū (assimilated verbs)
	{U .. W .. SK, UU},
	{U .. Y .. SK, UU},

	-------------- final -yā uses tall alif not alif maqṣūra ------------------
	{"(" .. Y .. SH .. "?" .. A .. ")" .. AMAQ, "%1" .. ALIF},

	----------------------- handle hamza assimilation -------------------------
	-- initial hamza + short-vowel + hamza + sukūn -> hamza + long vowel
	{HAMZA .. A .. HAMZA .. SK, HAMZA .. A .. ALIF},
	{HAMZA .. I .. HAMZA .. SK, HAMZA .. I .. Y},
	{HAMZA .. U .. HAMZA .. SK, HAMZA .. U .. W}
}

local postprocess_tr_subs = {
	{"ī([" .. vowels .. "y*])", "iy%1"},
	{"ū([" .. vowels .. "w*])", "uw%1"},
	{"(.)%*", "%1%1"}, -- implement shadda

	---------------------------- assimilated verbs ----------------------------
	-- iw, iy -> ī (assimilated verbs)
	{"iw([^" .. vowels .. "w])", "ī%1"},
	{"iy([^" .. vowels .. "y])", "ī%1"},

	-- uw, uy -> ū (assimilated verbs)
	{"uw([^" .. vowels .. "w])", "ū%1"},
	{"uy([^" .. vowels .. "y])", "ū%1"},

	----------------------- handle hamza assimilation -------------------------
	-- initial hamza + short-vowel + hamza + sukūn -> hamza + long vowel
	{"ʔaʔ(" .. NV ..
")", "ʔā%1"}, {"ʔiʔ(" .. NV .. ")", "ʔī%1"}, {"ʔuʔ(" .. NV .. ")", "ʔū%1"}, } -- Post-process verb parts to eliminate phonological anomalies. Many of the changes, particularly the tricky ones, -- involve converting hamza to have the proper seat. The rules for this are complicated and are documented on the -- [[w:Hamza]] Wikipedia page. In some cases there are alternatives allowed, and we handle them below by returning -- multiple possibilities. local function postprocess_term(term) if term == "?" then return "?" end -- Add BORDER at text boundaries. term = BORDER .. term .. BORDER -- Do the main post-processing, based on the pattern substitutions in postprocess_subs. for _, sub in ipairs(postprocess_subs) do term = rsub(term, sub[1], sub[2]) end term = term:gsub(BORDER, "") if not rfind(term, HAMZA) then return term end term = term:gsub(HAMZA, HAMZA_PH) term = ar_utilities.process_hamza(term) if #term == 1 then term = term[1] end return term end local function postprocess_translit(translit) if translit == "?" then return "?" end -- Add BORDER at text boundaries. translit = BORDER .. translit .. BORDER -- Do the main post-processing, based on the pattern substitutions in postprocess_tr_subs. 
for _, sub in ipairs(postprocess_tr_subs) do translit = rsub(translit, sub[1], sub[2]) end translit = translit:gsub(BORDER, "") return translit end local function postprocess_forms(base) local converted_values = {} for slot, forms in pairs(base.forms) do local need_dedup = false for i, form in ipairs(forms) do local term = postprocess_term(form.form) local translit = form.translit and postprocess_translit(form.translit) or nil if term ~= form.form or translit ~= form.translit then need_dedup = true end converted_values[i] = {term, translit} end if need_dedup then local temp_dedup = {} for i = 1, #forms do local new_term, new_translit = unpack(converted_values[i]) if type(new_term) == "table" then for _, nt in ipairs(new_term) do local new_formobj = { form = nt, translit = new_translit, footnotes = forms[i].footnotes, } iut.insert_form(temp_dedup, "temp", new_formobj) end else local new_formobj = { form = new_term, translit = new_translit, footnotes = forms[i].footnotes, } iut.insert_form(temp_dedup, "temp", new_formobj) end end base.forms[slot] = temp_dedup.temp end end end local function process_slot_overrides(base) for slot, forms in pairs(base.slot_overrides) do local existing_values = base.forms[slot] base.forms[slot] = nil for _, form in ipairs(forms) do -- + in active participle for form I requests slot ap1 if form.form == "+" and (base.verb_form ~= "I" or slot ~= "ap") then if not existing_values then error(("Slot '%s' requested the default value but no such value available"):format(slot)) end -- We maintain an invariant that no two slots share a form object (although they may share the footnote -- lists inside the form objects). However, there is no need to copy the form objects here because there -- is a one-to-one correspondence between slots and slot overrides, i.e. you can't have a default value -- go into two slots. 
				insert_form_or_forms(base, slot, existing_values, "allow overrides", form.uncertain)
			elseif default_indicator_to_active_participle_slot[form.form] then
				if form.form == "++" then
					if slot ~= "vn" and slot ~= "ap" and slot ~= "pp" then
						error(("Secondary default value request '++' only applicable to verbal nouns and participles, but found in slot '%s'"):
							format(slot))
					end
				else
					if slot ~= "ap" then
						error(("Secondary default value request '%s' only applicable to active participles, but found in slot '%s'"):
							format(form.form, slot))
					end
				end
				local secondary_default_slot =
					slot == "vn" and "vn2" or
					slot == "pp" and "pp2" or
					default_indicator_to_active_participle_slot[form.form]
				local existing_values = base.forms[secondary_default_slot]
				if not existing_values then
					error(("Slot '%s' requested a secondary default value using '%s' but no such value available"):
						format(slot, form.form))
				end
				-- See comment above about the lack of need to copy the form objects.
				insert_form_or_forms(base, slot, existing_values, "allow overrides", form.uncertain)
				-- To make sure there aren't shared form objects.
				base.forms[secondary_default_slot] = nil
			else
				insert_form_or_forms(base, slot, form, "allow overrides", form.uncertain)
			end
		end
	end

	-- Now, for non-stative form-I verbs, fill the active participle slot from ap1 unless it should be missing (e.g.
	-- passive-only or user specified 'ap:-').
	if base.verb_form == "I" and not base.forms.ap and base.forms.ap1 and not skip_slot(base, "ap") then
		local saw_non_stative = false
		for _, vowel_spec in ipairs(base.conj_vowels) do
			if req(vowel_spec.past, A) then
				saw_non_stative = true
				break
			end
		end
		if saw_non_stative then
			base.forms.ap = base.forms.ap1
			-- To make sure there aren't shared form objects.
			base.forms.ap1 = nil
		end
	end
end

local function handle_lemma_linked(base)
	-- Compute linked versions of potential lemma slots, for use in {{ar-verb}}.
	-- We substitute the original lemma (before removing links) for forms that are the same as the lemma, if the
	-- original lemma has links.
	for _, slot in ipairs(export.potential_lemma_slots) do
		if base.forms[slot] then
			insert_form_or_forms(base, slot .. "_linked", iut.map_forms(base.forms[slot], function(form)
				if form == base.lemma and rfind(base.linked_lemma, "%[%[") then
					return base.linked_lemma
				else
					return form
				end
			end))
		end
	end
end

-- Process specs given by the user using 'addnote[SLOTSPEC][FOOTNOTE][FOOTNOTE][...]'.
local function process_addnote_specs(base)
	for _, spec in ipairs(base.addnote_specs) do
		for _, slot_spec in ipairs(spec.slot_specs) do
			slot_spec = "^" .. slot_spec .. "$"
			for slot, forms in pairs(base.forms) do
				if rfind(slot, slot_spec) then
					-- To save on memory, side-effect the existing forms.
					for _, form in ipairs(forms) do
						form.footnotes = iut.combine_footnotes(form.footnotes, spec.footnotes)
					end
				end
			end
		end
	end
end

local function add_missing_links_to_forms(base)
	-- Any forms without links should get them now. Redundant ones will be stripped later.
	for slot, forms in pairs(base.forms) do
		for _, form in ipairs(forms) do
			if not form.form:find("%[%[") then
				form.form = "[[" .. form.form .. "]]"
			end
		end
	end
end

local function conjugate_verb(base)
	construct_stems(base)
	for _, vowel_spec in ipairs(base.conj_vowels) do
		-- Reconstruct conjugation type from verb form and (possibly inferred) weakness.
		conj_type = base.verb_form .. "-" .. vowel_spec.weakness

		-- Check that the conjugation type is recognized.
		if not conjugations[conj_type] then
			error("Unknown conjugation type '" .. conj_type .. "'")
		end

		-- The way the conjugation functions work is they always add entries to the appropriate parts of the paradigm
		-- (each of which is an array), rather than setting the values. This makes it possible to call more than one
		-- conjugation function and essentially get a paradigm of the "either A or B" kind.
		-- Doing this may insert duplicate entries into a particular paradigm part, but this is not a problem because we
		-- check for duplicate entries when adding them, and don't insert in that case.
		conjugations[conj_type](base, vowel_spec)
	end
	postprocess_forms(base)
	process_slot_overrides(base)
	-- This should happen before add_missing_links_to_forms() so that the comparison `form == base.lemma` in
	-- handle_lemma_linked() works correctly and compares unlinked forms to unlinked forms.
	handle_lemma_linked(base)
	process_addnote_specs(base)
	if not base.alternant_multiword_spec.args.noautolinkverb then
		add_missing_links_to_forms(base)
	end
end

local function parse_indicator_spec(angle_bracket_spec)
	-- Store the original angle bracket spec so we can reconstruct the overall conj spec with the lemma(s) in them.
	local base = {
		angle_bracket_spec = angle_bracket_spec,
		conj_vowels = {},
		root_consonants = {},
		user_stem_overrides = {},
		user_slot_overrides = {},
		slot_explicitly_missing = {},
		slot_uncertain = {},
		slot_override_uses_default = {},
		addnote_specs = {},
	}
	local function parse_err(msg)
		error(msg .. ": " .. angle_bracket_spec)
	end
	local function fetch_footnotes(separated_group)
		local footnotes
		for j = 2, #separated_group - 1, 2 do
			if separated_group[j + 1] ~= "" then
				parse_err("Extraneous text after bracketed footnotes: '" .. table.concat(separated_group) .. "'")
			end
			if not footnotes then
				footnotes = {}
			end
			table.insert(footnotes, separated_group[j])
		end
		return footnotes
	end

	local inside = angle_bracket_spec:match("^<(.*)>$")
	assert(inside)
	local segments = put.parse_multi_delimiter_balanced_segment_run(inside, {{"[", "]"}, {"<", ">"}})
	local dot_separated_groups = put.split_alternating_runs_and_strip_spaces(segments, "%.")

	-- The first dot-separated element must specify the verb form, e.g. IV or IIq. If the form is I, it needs to include
	-- the past and non-past vowels, e.g. I/a~u for kataba ~ yaktubu.
	-- More than one vowel can be given, comma-separated, and more than one past~non-past pair can be given,
	-- slash-separated, e.g. I/a,u~u/i~a for form I كمل, which can be conjugated as kamala/kamula ~ yakmulu or
	-- kamila ~ yakmalu. An individual vowel spec must be one of a, i or u and in general (a) at least one past~non-past
	-- pair must be given, and (b) both past and non-past vowels must be given even though sometimes the vowel can be
	-- determined from the unvocalized form. An exception is passive-only verbs, where the vowels can't in general be
	-- determined (except indirectly in some cases by looking at an associated non-passive verb); in that case, the
	-- vowel~vowel spec can be left out.
	local slash_separated_groups = put.split_alternating_runs_and_strip_spaces(dot_separated_groups[1], "/")
	local form_spec = slash_separated_groups[1]
	base.form_footnotes = fetch_footnotes(form_spec)
	if form_spec[1] == "" then
		parse_err("Missing verb form")
	end
	if not allowed_vforms_with_weakness_set[form_spec[1]] then
		parse_err(("Unrecognized verb form '%s', should be one of %s"):format(
			form_spec[1], list_to_text(allowed_vforms, nil, " or ")))
	end
	if form_spec[1]:find("%-") then
		base.verb_form, base.explicit_weakness = form_spec[1]:match("^(.-)%-(.*)$")
	else
		base.verb_form = form_spec[1]
	end

	if #slash_separated_groups > 1 then
		if base.verb_form ~= "I" then
			parse_err(("Past~non-past vowels can only be specified when verb form is I, but saw form '%s'"):format(
				base.verb_form))
		end
		for i = 2, #slash_separated_groups do
			local slash_separated_group = slash_separated_groups[i]
			local tilde_separated_groups = put.split_alternating_runs_and_strip_spaces(slash_separated_group, "~")
			if #tilde_separated_groups ~= 2 then
				parse_err(("Expected two tilde-separated vowel specs: %s"):format(table.concat(slash_separated_group)))
			end
			local function parse_conj_vowels(tilde_separated_group, vtype)
				local conj_vowel_objects = {}
				local comma_separated_groups =
put.split_alternating_runs_and_strip_spaces(tilde_separated_group, ",") for _, comma_separated_group in ipairs(comma_separated_groups) do local conj_vowel = comma_separated_group[1] if conj_vowel ~= "a" and conj_vowel ~= "i" and conj_vowel ~= "u" then parse_err(("Expected %s conjugation vowel '%s' to be one of a, i or u in %s"):format( vtype, conj_vowel, table.concat(slash_separated_group))) end conj_vowel = dia[conj_vowel] local conj_vowel_footnotes = fetch_footnotes(comma_separated_group) -- Try to use strings when possible as it makes q() significantly more efficient. if conj_vowel_footnotes then table.insert(conj_vowel_objects, {form = conj_vowel, footnotes = conj_vowel_footnotes}) else table.insert(conj_vowel_objects, conj_vowel) end end return conj_vowel_objects end local conj_vowel_spec = { past = parse_conj_vowels(tilde_separated_groups[1], "past"), nonpast = parse_conj_vowels(tilde_separated_groups[2], "non-past"), } table.insert(base.conj_vowels, conj_vowel_spec) end end for i = 2, #dot_separated_groups do local dot_separated_group = dot_separated_groups[i] local first_element = dot_separated_group[1] if first_element == "addnote" then local spec_and_footnotes = fetch_footnotes(dot_separated_group) if #spec_and_footnotes < 2 then parse_err("Spec with 'addnote' should be of the form 'addnote[SLOTSPEC][FOOTNOTE][FOOTNOTE][...]'") end local slot_spec = table.remove(spec_and_footnotes, 1) local slot_spec_inside = rmatch(slot_spec, "^%[(.*)%]$") if not slot_spec_inside then parse_err("Internal error: slot_spec " .. slot_spec .. " should be surrounded with brackets") end local slot_specs = rsplit(slot_spec_inside, ",") -- FIXME: Here, [[Module:it-verb]] called strip_spaces(). Generally we don't do this. Should we? 
table.insert(base.addnote_specs, {slot_specs = slot_specs, footnotes = spec_and_footnotes}) elseif first_element:find("^var:") then if #dot_separated_group > 1 then parse_err(("Can't attach footnotes to 'var:' spec '%s'"):format(first_element)) end base.variant = first_element:match("^var:(.*)$") elseif first_element:find("^I+V?:") then local root_cons, root_cons_value = first_element:match("^(I+V?):(.*)$") local root_index if root_cons == "I" then root_index = 1 elseif root_cons == "II" then root_index = 2 elseif root_cons == "III" then root_index = 3 elseif root_cons == "IV" then root_index = 4 if not base.verb_form:find("q$") then parse_err(("Can't specify root consonant IV for non-quadriliteral verb form '%s': %s"):format( base.verb_form, first_element)) end end local cons, translit = root_cons_value:match("^(.*)//(.*)$") if not cons then cons = root_cons_value end local root_footnotes = fetch_footnotes(dot_separated_group) if not translit and not root_footnotes then base.root_consonants[root_index] = cons else base.root_consonants[root_index] = {form = cons, translit = translit, footnotes = root_footnotes} end elseif first_element:find("^[a-z][a-z0-9_]*:") then local slot_or_stem, remainder = first_element:match("^(.-):(.*)$") dot_separated_group[1] = remainder local comma_separated_groups = put.split_alternating_runs_and_strip_spaces(dot_separated_group, "[,،]") if overridable_stems[slot_or_stem] then if base.user_stem_overrides[slot_or_stem] then parse_err("Overridable stem '" .. slot_or_stem .. "' specified twice") end base.user_stem_overrides[slot_or_stem] = overridable_stems[slot_or_stem](comma_separated_groups, {prefix = slot_or_stem, base = base, parse_err = parse_err, fetch_footnotes = fetch_footnotes}) else -- assume a form override; we validate further later when the possible slots are available if base.user_slot_overrides[slot_or_stem] then parse_err("Form override '" .. slot_or_stem .. 
"' specified twice") end base.user_slot_overrides[slot_or_stem] = allow_multiple_values_for_override(comma_separated_groups, {prefix = slot_or_stem, base = base, parse_err = parse_err, fetch_footnotes = fetch_footnotes}, "is form override") end elseif indicator_flags[first_element] then if #dot_separated_group > 1 then parse_err("No footnotes allowed with '" .. first_element .. "' spec") end if base[first_element] then parse_err("Spec '" .. first_element .. "' specified twice") end base[first_element] = true else local passive, uncertain = first_element:match("^(.*)(%?)$") passive = passive or first_element uncertain = not not uncertain if passive_types[passive] then if #dot_separated_group > 1 then parse_err("No footnotes allowed with '" .. passive .. "' spec") end if base.passive then parse_err("Value for passive type specified twice") end base.passive = passive base.passive_uncertain = uncertain else parse_err("Unrecognized spec '" .. first_element .. "'") end end end return base end -- Normalize all lemmas, substituting the pagename for blank lemmas and adding links to multiword lemmas. local function normalize_all_lemmas(alternant_multiword_spec, head) -- (1) Add links to all before and after text. Remember the original text so we can reconstruct the verb spec later. if not alternant_multiword_spec.args.noautolinktext then iut.add_links_to_before_and_after_text(alternant_multiword_spec, "remember original") end -- (2) Remove any links from the lemma, but remember the original form so we can use it below in the 'lemma_linked' -- form. 
iut.map_word_specs(alternant_multiword_spec, function(base) if base.lemma == "" then base.lemma = head end base.user_specified_lemma = base.lemma base.lemma = m_links.remove_links(base.lemma) base.user_specified_verb = base.lemma base.verb = base.user_specified_verb local linked_lemma if alternant_multiword_spec.args.noautolinkverb or base.user_specified_lemma:find("%[%[") then linked_lemma = base.user_specified_lemma else -- Add links to the lemma so the user doesn't specifically need to, since we preserve -- links in multiword lemmas and include links in non-lemma forms rather than allowing -- the entire form to be a link. linked_lemma = iut.add_links(base.user_specified_lemma) end base.linked_lemma = linked_lemma end) end -- Determine weakness from radicals. Used when root given in place of lemma (e.g. for {{ar-verb forms}}). local function weakness_from_radicals(form, rad1, rad2, rad3, rad4) local weakness = nil local quadlit = form:find("q$") -- If weakness unspecified, derive from radicals. if not quadlit then if is_waw_ya(rad3) and rad1 == W and form == "I" then weakness = "assimilated+final-weak" elseif is_waw_ya(rad3) and vform_supports_final_weak(form) then weakness = "final-weak" elseif rad2 == rad3 and vform_supports_geminate(form) then weakness = "geminate" elseif is_waw_ya(rad2) and vform_supports_hollow(form) then weakness = "hollow" elseif rad1 == W and form == "I" then weakness = "assimilated" else weakness = "sound" end else if is_waw_ya(rad4) then weakness = "final-weak" else weakness = "sound" end end return weakness end -- Join the infixed tāʔ (ت) to the first radical in form VIII verbs. This may cause assimilation of the tāʔ to the -- radical or in some cases the radical to the tāʔ. Used when a root is supplied instead of a lemma (which already has -- the appropriate assimilation in it). 
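--
-- Illustration only (read off the branches of the function that follows; shown as comments and not executed):
--   form_viii_join_ta("ز")  returns "زْد"  (cf. form-VIII izdawaja)
--   form_viii_join_ta("ص")  returns "صْط"
--   form_viii_join_ta("ك")  returns "ك" .. SK .. "ت"  (no special case, so the radical takes sukūn before the tāʔ)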
local function form_viii_join_ta(rad)
	if rad == W or rad == Y or rad == "ت" then
		return "تّ"
	elseif rad == "د" then
		return "دّ"
	elseif rad == "ث" then
		return "ثّ"
	elseif rad == "ذ" then
		return "ذّ"
	elseif rad == "ز" then
		return "زْد"
	elseif rad == "ص" then
		return "صْط"
	elseif rad == "ض" then
		return "ضْط"
	elseif rad == "ط" then
		return "طّ"
	elseif rad == "ظ" then
		return "ظّ"
	else
		return rad .. SK .. "ت"
	end
end

local function detect_indicator_spec(base)
	base.forms = {}
	base.stem_overrides = {}
	base.slot_overrides = {}

	if not base.conj_vowels[1] then
		-- These may be converted to inferred vowels. If not, we throw an error if form I and not passive-only.
		base.conj_vowels = {{
			past = "-",
			nonpast = "-",
		}}
	else
		-- If multiple vowels are specified for a given vowel type (e.g. a,u~u), expand so that each spec in the
		-- resulting list has exactly one past vowel and one non-past vowel.
		local expansion = {}
		for _, spec in ipairs(base.conj_vowels) do
			for _, past in ipairs(spec.past) do
				for _, nonpast in ipairs(spec.nonpast) do
					table.insert(expansion, {past = past, nonpast = nonpast})
				end
			end
		end
		base.conj_vowels = expansion
	end

	local vform = base.verb_form

	-- check for quadriliteral form (Iq, IIq, IIIq, IVq)
	base.quadlit = not not vform:find("q$")

	-- Infer radicals as necessary. We infer a separate set of radicals for each past~non-past vowel combination because
	-- they may be different (particularly with form-I hollow verbs).
	for _, vowel_spec in ipairs(base.conj_vowels) do
		-- NOTE: rad1, rad2, etc. refer to user-specified radicals, which are formobj tables that optionally specify an
		-- explicit manual translit, whereas ir1, ir2, etc. refer to inferred radicals, which are either strings or
		-- lists of possible radicals.
		local rads = base.root_consonants
		local rad1, rad2, rad3, rad4 = rads[1], rads[2], rads[3], rads[4]

		-- Default any unspecified radicals to radicals determined from the headword.
The returned radicals may be -- lists of possible radicals, where the first radical should be chosen if the user didn't explicitly specify a -- radical but all are allowed. If `ambig = true` is set in the table, the radical is considered ambiguous and -- categories won't be created for weak radicals. local weakness, ir1, ir2, ir3, ir4 if vform ~= "none" then ir1, ir2, ir3 = rmatch(base.lemma, "^([^_])_([^_])_([^_])$") if not ir1 then ir1, ir2, ir3, ir4 = rmatch(base.lemma, "^([^_])_([^_])_([^_])_([^_])$") end if ir1 then -- root given instead of lemma weakness = weakness_from_radicals(vform, ir1, ir2, ir3, ir4) if vform == "VIII" then vowel_spec.form_viii_assim = form_viii_join_ta(ir1) end else local ret = export.infer_radicals { headword = base.lemma, vform = vform, passive = base.passive, past_vowel = vowel_spec.past, nonpast_vowel = vowel_spec.nonpast, is_reduced = base.reduced, } weakness, ir1, ir2, ir3, ir4 = ret.weakness, ret.rad1, ret.rad2, ret.rad3, ret.rad4 vowel_spec.form_viii_assim = ret.form_viii_assim vowel_spec.past = ret.past_vowel vowel_spec.nonpast = ret.nonpast_vowel vowel_spec.variant = base.variant or ret.variant end end -- For most ambiguous radicals, the choice of radical doesn't matter because it doesn't affect the conjugation -- one way or another. For form I hollow verbs, however, it definitely does. In fact, the choice of radical is -- critical even beyond the past and non-past vowels because it affects the form of the passive participle. So, -- check for this and signal an error if the radical could not be inferred and is not given explicitly. 
if vform == "I" and type(ir2) == "table" and ir2.need_radical and not rad2 then error("Unable to guess middle radical of hollow form I verb; need to specify radical explicitly") end if vform == "I" and not is_passive_only(base.passive) and ( rget(vowel_spec.past) == "-" or rget(vowel_spec.nonpast) == "-") then error("Form I verb that isn't passive-only or final-weak must have past~non-past vowels specified") end -- Convert ambiguous radicals. local function regularize_inferred_radical(rad) if type(rad) == "table" then if rad.ambig then return {form = rad[1], ambig = true} else return rad[1] end else return rad end end -- Return the appropriate radical at index `index` (1 through 4), based either on the user-specified radical -- `user_radical` or (if unspecified) `inferred_radical`, inferred from the unvocalized lemma. Two values are -- returned, the "regularized" version of the radical (where ambiguous inferred radicals are converted to their -- most likely actual radical) and the non-regularized version. The returned values are form objects rather than -- strings. 
local function fetch_radical(user_radical, inferred_radical, index) if not user_radical then return regularize_inferred_radical(inferred_radical), inferred_radical else local rad_formval = rget(user_radical) if type(inferred_radical) == "table" then local allowed_radical_set = m_table.listToSet(inferred_radical) if not allowed_radical_set[rad_formval] then error(("For lemma %s, radical %s ambiguously inferred as %s but user radical incompatibly given as %s"): format(base.lemma, index, list_to_text(inferred_radical, nil, " or "), rad_formval)) end elseif rad_formval ~= inferred_radical then error(("For lemma %s, radical %s inferred as %s but user radical incompatibly given as %s"): format(base.lemma, index, inferred_radical, rad_formval)) end return user_radical, user_radical end end if vform ~= "none" then vowel_spec.rad1, vowel_spec.unreg_rad1 = fetch_radical(rad1, ir1, 1) vowel_spec.rad2, vowel_spec.unreg_rad2 = fetch_radical(rad2, ir2, 2) vowel_spec.rad3, vowel_spec.unreg_rad3 = fetch_radical(rad3, ir3, 3) if base.quadlit then vowel_spec.rad4, vowel_spec.unreg_rad4 = fetch_radical(rad4, ir4, 4) end end if vform == "I" then -- If explicit weakness given using 'I-sound' or 'I-assimilated', we may need to adjust the inferred weakness. if base.explicit_weakness == "sound" then if weakness == "assimilated" then weakness = "sound" elseif weakness == "assimilated+final-weak" then -- Verbs like waniya~yawnā "to be faint; to languish" (although the defaults should handle this -- correctly) weakness = "final-weak" else error(("Can't specify form 'I-sound' when inferred weakness is '%s' for lemma %s"):format( weakness, base.lemma)) end elseif base.explicit_weakness == "assimilated" then if weakness == "sound" then -- i~a verbs like waṭiʔa~yaṭaʔu "to tread, to trample"; wasiʕa~yasaʕu "to be spacious; to be well-off"; -- waṯiʔa~yaṯaʔu "to get bruised, to be sprained", which would default to sound. 
weakness = "assimilated" elseif weakness == "final-weak" then -- For completeness; not clear if any verbs occur where this is needed. (There are plenty of -- assimilated+final-weak verbs but the defaults should take care of them.) weakness = "assimilated+final-weak" else error(("Can't specify form 'I-assimilated' when inferred weakness is '%s' for lemma %s"):format( weakness, base.lemma)) end elseif base.explicit_weakness then error(("Internal error: Unrecognized value '%s' for base.explicit_weakness"):format(base.explicit_weakness)) end elseif vform == "none" then weakness = base.explicit_weakness elseif base.explicit_weakness then error(("Internal error: Explicit weakness should not be specifiable except with forms I and none, but saw explicit weakness '%s' with verb form '%s'"): format(base.explicit_weakness, vform)) end vowel_spec.weakness = weakness if vform ~= "none" then -- Error if radicals are wrong given the weakness. More likely to happen if the weakness is explicitly given -- rather than inferred. Will also happen if certain incorrect letters are included as radicals e.g. hamza on -- top of various letters, alif maqṣūra, tā' marbūṭa. check_radicals(vform, weakness, rget(vowel_spec.rad1), rget(vowel_spec.rad2), rget(vowel_spec.rad3), base.quadlit and rget(vowel_spec.rad4) or nil) end -- Check the variant value. 
local form_iii_vi_geminate = (vform == "III" or vform == "VI") and rget(vowel_spec.rad2) == rget(vowel_spec.rad3) and not req(vowel_spec.rad2, Y) local hayy_i_x = hayy_radicals(vowel_spec.rad1, vowel_spec.rad2, vowel_spec.rad3) and (vform == "I" or vform == "X") if form_iii_vi_geminate or hayy_i_x then if vowel_spec.variant and vowel_spec.variant ~= "long" and vowel_spec.variant ~= "short" and vowel_spec.variant ~= "both" then error(("For form-III/VI geminate verb or form-I/X verb with ح-ي-ي radicals, saw unrecognized 'var:%s' value; should be 'var:long', 'var:short' or 'var:both'"):format( vowel_spec.variant)) end elseif vowel_spec.variant then error(("Variant value 'var:%s' not allowed in this context"):format(vowel_spec.variant)) end end -- If form I, regroup expanded vowels for display purposes. if vform == "I" then local group_by_past = {} for _, vowel_spec in ipairs(base.conj_vowels) do m_table.insertIfNot(group_by_past, { past = undia[rget(vowel_spec.past)], nonpasts = {undia[rget(vowel_spec.nonpast)]}, }, { key = function(obj) return obj.past end, combine = function(obj1, obj2) for _, nonpast in ipairs(obj2.nonpasts) do m_table.insertIfNot(obj1.nonpasts, nonpast) end end, }) end local group_by_nonpast = {} for _, vowel_spec in ipairs(group_by_past) do m_table.insertIfNot(group_by_nonpast, { pasts = {vowel_spec.past}, nonpasts = vowel_spec.nonpasts, }, { key = function(obj) return obj.nonpasts end, combine = function(obj1, obj2) for _, past in ipairs(obj2.pasts) do m_table.insertIfNot(obj1.pasts, past) end end, }) end base.grouped_conj_vowels = group_by_nonpast end -- Set value of passive. If not specified, default is yes for forms II, III, IV and Iq; no but uncertainly for -- forms VII, IX, XI - XV and IIIq - IVq, as well as form I with past vowel u; impersonal but uncertainly for form -- V, VI, X and IIq, as well as form I with past vowel i; and yes but uncertainly for the remainder (form I with -- past vowel only a and form VIII). 
if not base.passive then base.passive_defaulted = true -- Temporary tracking for defaulted passives by verb form, weakness and (for form I) past/non-past vowels. track_if_ar_conj(base, "passive-defaulted/" .. vform) for _, vowel_spec in ipairs(base.conj_vowels) do track_if_ar_conj(base, "passive-defaulted/" .. vform.. "/" .. vowel_spec.weakness) if vform == "I" then local past_nonpast = ("%s~%s"):format(undia[vowel_spec.past], undia[vowel_spec.nonpast]) track_if_ar_conj(base, "passive-defaulted/I/" .. past_nonpast) track_if_ar_conj(base, "passive-defaulted/I/" .. vowel_spec.weakness .. "/" .. past_nonpast) end end if vform_probably_full_passive(vform) then base.passive = "pass" else base.passive_uncertain = true for _, vowel_spec in ipairs(base.conj_vowels) do if vform_probably_no_passive(vform, vowel_spec.weakness, vowel_spec.past, vowel_spec.nonpast) then base.passive = "nopass" break elseif vform_probably_impersonal_passive(vform, vowel_spec.weakness, vowel_spec.past, vowel_spec.nonpast) then base.passive = "ipass" break end end base.passive = base.passive or "pass" end end -- NOTE: Currently there are no built-in stems or form overrides for Arabic; this code is inherited from -- [[Module:ca-verb]], where such things do exist, and is kept for generality in case we decide in the future to -- implement such things. -- Override built-in verb stems and overrides with user-specified ones. for stem, values in pairs(base.user_stem_overrides) do base.stem_overrides[stem] = values end for slot, values in pairs(base.user_slot_overrides) do if not base.alternant_multiword_spec.verb_slots_map[slot] then error("Unrecognized override slot '" .. slot .. "': " .. base.angle_bracket_spec) end if export.unsettable_slots_set[slot] then error("Slot '" .. slot .. "' cannot be set using an override: " .. base.angle_bracket_spec) end if skip_slot(base, slot, "allow overrides") then error("Override slot '" .. slot .. 
"' would be skipped based on the passive, 'noimp' and/or 'no_nonpast' settings: " .. base.angle_bracket_spec) end base.slot_overrides[slot] = values end if base.verb_form == "none-final-weak" then for _, stem_type in ipairs { "past", "past_pass", "nonpast", "nonpast_pass" } do if base.stem_overrides[stem_type .. "_c"] or base.stem_overrides[stem_type .. "_v"] then error(("Specify past stem for verb type 'none-final-weak' using '%s:...' not '%s_c:...' or '%s_v:...'"): format(stem_type, stem_type, stem_type)) end end for _, stem_type in ipairs { "past", "nonpast" } do if base.stem_overrides[stem_type] or not base.stem_overrides[stem_type .. "_final_weak_vowel"] then error(("For verb type 'none-final-weak', if '%s:...' specified, so must '%s_final_weak_vowel:...'"): format(stem_type, stem_type)) end end end end local function detect_all_indicator_specs(alternant_multiword_spec) add_slots(alternant_multiword_spec) alternant_multiword_spec.verb_forms = {} -- This means at least one individual base had the slot marked as explicitly missing. Another base (e.g. when -- there are multiple alternants) might have a value for the slot. In practice, we only respect this when there are -- no overall values in the slot and `slot_uncertain` isn't set; in this case, we display "no ..." for the slot -- instead of simply not displaying anything for the slot. alternant_multiword_spec.slot_explicitly_missing = {} -- This means at least one individual base had no values for the slot and the slot marked as explicitly uncertain. -- Note that this is different from a value being present but marked as uncertain (e.g. if an override was given -- with a ? after it); this causes the form object for the value to have `uncertain = true` set. If there are no -- overall values in the slot and `slot_uncertain` is set, we display this in the headword. alternant_multiword_spec.slot_uncertain = {} iut.map_word_specs(alternant_multiword_spec, function(base) -- So arguments, etc. can be accessed. 
		-- WARNING: Creates circular reference.
		base.alternant_multiword_spec = alternant_multiword_spec
		detect_indicator_spec(base)
		if not base.nocat then
			m_table.insertIfNot(alternant_multiword_spec.verb_forms, base.verb_form)
		end
		if base.passive_uncertain then
			alternant_multiword_spec.passive_uncertain = true
		end
		for slot, _ in pairs(base.slot_explicitly_missing) do
			alternant_multiword_spec.slot_explicitly_missing[slot] = true
		end
	end)
end

local function determine_slot_uncertainty_from_forms(alternant_multiword_spec)
	iut.map_word_specs(alternant_multiword_spec, function(base)
		-- If no verbal noun and verb form is not 'none' (manually-specified stems), which currently only happens for
		-- form I, and the verbal noun wasn't explicitly indicated as missing using <vn:->, we assume it's just
		-- unknown/unspecified rather than missing. Same with active participles.
		-- NOTE: the check uses base.verb_form; no `vform` variable is defined in this function's scope.
		for uncertain_slot, _ in pairs(slots_that_may_be_uncertain) do
			if not base.forms[uncertain_slot] and base.verb_form ~= "none" and not skip_slot(base, uncertain_slot) then
				base.slot_uncertain[uncertain_slot] = true
			end
		end
		-- Propagate slot uncertainty up. Currently only the verbal noun can have this set but we write the code
		-- generally.
		for slot, _ in pairs(base.slot_uncertain) do
			alternant_multiword_spec.slot_uncertain[slot] = true
		end
	end)

	-- If slot is uncertain and has no value, explicitly set its value to "?".
	for uncertain_slot, _ in pairs(slots_that_may_be_uncertain) do
		if not alternant_multiword_spec.forms[uncertain_slot] and alternant_multiword_spec.slot_uncertain[uncertain_slot] then
			alternant_multiword_spec.forms[uncertain_slot] = {{form = "?"}}
		end
	end
end

-- Determine certain properties of the verb from the overall forms, such as whether the verb is active-only or
-- passive-only, is impersonal, lacks an imperative, etc.
local function determine_verb_properties_from_forms(alternant_multiword_spec) alternant_multiword_spec.has_active = false alternant_multiword_spec.has_passive = false alternant_multiword_spec.has_non_impers_active = false alternant_multiword_spec.has_non_impers_passive = false alternant_multiword_spec.has_imp = false alternant_multiword_spec.has_past = false alternant_multiword_spec.has_nonpast = false for slot, _ in pairs(alternant_multiword_spec.forms) do if slot == "ap" or slot:find("[123]") and not slot:find("_pass") then alternant_multiword_spec.has_active = true end if slot == "pp" or slot:find("[123]") and slot:find("_pass") then alternant_multiword_spec.has_passive = true end if slot:find("[123]") and not slot:find("pass_[123]") and not slot:find("3ms") then alternant_multiword_spec.has_non_impers_active = true end if slot:find("pass_[123]") and not slot:find("3ms") then alternant_multiword_spec.has_non_impers_passive = true end if slot:find("^imp_") then alternant_multiword_spec.has_imp = true end if slot:find("^past_") then alternant_multiword_spec.has_past = true end if slot:find("^ind_") or slot:find("^sub_") or slot:find("^juss_") then alternant_multiword_spec.has_nonpast = true end end end local function add_categories_and_annotation(alternant_multiword_spec, base, multiword_lemma, insert_ann, insert_cat) -- Useful e.g. in constructing suppletive verbs out of parts. For a verb like جاء or أتى whose imperative comes -- from the unrelated verb تعالى, we don't want the latter verb showing up in categories or annotations. if base.nocat then return end local vform = base.verb_form if vform ~= "none" then insert_ann("form", vform) insert_cat("form-" .. vform .. " verbs") end if base.reduced then insert_ann("reduced", "reduced") if vform ~= "none" then insert_cat("form-" .. vform .. 
" reduced verbs") end end if base.quadlit then insert_cat("verbs with quadriliteral roots") end if base.passive_defaulted then insert_cat("verbs with defaulted passive") end for _, vowel_spec in ipairs(base.conj_vowels) do local rad1, rad2, rad3, rad4 = get_radicals_4(vowel_spec) local final_weak = is_final_weak(base, vowel_spec) local weakness = vowel_spec.weakness -- We have to distinguish weakness by form and weakness by conjugation. Weakness by form merely indicates the -- presence of weak letters in certain positions in the radicals. Weakness by conjugation is related to how the -- verbs are conjugated. For example, form-II verbs that are "hollow by form" (middle radical is wāw or yāʾ) are -- conjugated as sound verbs. Another example: form-I verbs with initial wāw are "assimilated by form" and most -- are assimilated by conjugation as well, but a few are sound by conjugation, e.g. wajuha yawjuhu "to be -- distinguished" (rather than wajuha yajuhu); similarly for some hollow-by-form verbs in various forms, e.g. -- form VIII izdawaja yazdawiju "to be in pairs" (rather than izdāja yazdāju). Categories referring to weakness -- always refer to weakness by conjugation; weakness by form is distinguished only by categories such as -- [[:Category:Arabic form-III verbs with و as second radical]]. insert_ann("weakness", weakness) if vform ~= "none" then insert_cat(("%s form-%s verbs"):format(weakness, vform)) end local function radical_is_ambiguous(rad) return type(rad) == "table" and rad.ambig end local function radical_is_unambiguous_weak(rad) return not radical_is_ambiguous(rad) and (is_waw_ya(rad) or req(rad, HAMZA)) end if vform ~= "none" then local ur1, ur2, ur3, ur4 = vowel_spec.unreg_rad1, vowel_spec.unreg_rad2, vowel_spec.unreg_rad3, vowel_spec.unreg_rad4 -- Create headword categories based on the radicals. Do the following before -- converting the Latin radicals into Arabic ones so we distinguish -- between ambiguous and non-ambiguous radicals. 
			if radical_is_ambiguous(ur1) or radical_is_ambiguous(ur2) or radical_is_ambiguous(ur3) or
				ur4 and radical_is_ambiguous(ur4) then
				insert_cat("verbs with ambiguous radicals")
			end
			if radical_is_unambiguous_weak(ur1) then
				insert_cat("form-" .. vform .. " verbs with " .. rget(ur1) .. " as first radical")
			end
			if radical_is_unambiguous_weak(ur2) then
				insert_cat("form-" .. vform .. " verbs with " .. rget(ur2) .. " as second radical")
			end
			if radical_is_unambiguous_weak(ur3) then
				insert_cat("form-" .. vform .. " verbs with " .. rget(ur3) .. " as third radical")
			end
			if ur4 and radical_is_unambiguous_weak(ur4) then
				insert_cat("form-" .. vform .. " verbs with " .. rget(ur4) .. " as fourth radical")
			end
		end
	end

	if vform == "I" and not is_passive_only(base.passive) then
		for _, vowel_spec in ipairs(base.grouped_conj_vowels) do
			insert_ann("vowels", ("%s ~ %s"):format(table.concat(vowel_spec.pasts, "/"),
				table.concat(vowel_spec.nonpasts, "/")))
			for _, past in ipairs(vowel_spec.pasts) do
				for _, nonpast in ipairs(vowel_spec.nonpasts) do
					if past == "-" or nonpast == "-" then
						-- Fill in the %s placeholders so the message reports the offending vowels.
						error(("Internal error: Saw form I past vowel %s and non-past vowel %s but - in place of vowel should have triggered an error earlier"):
							format(past, nonpast))
					end
					insert_cat(("form-I verbs with past vowel %s and non-past vowel %s"):format(past, nonpast))
				end
			end
		end
	end

	for slot, name in pairs(slots_that_may_be_uncertain) do
		if base.slot_uncertain[slot] then
			-- An unspecified and non-defaulted verbal noun (form I) is considered uncertain rather than explicitly
			-- missing. Use <vn:-> to explicitly indicate the lack of verbal noun. Same for form-I stative active
			-- participles.
			insert_cat(("verbs with unknown or uncertain %ss"):format(name))
		end
	end

	if base.irregular then
		insert_ann("irreg", "irregular")
		insert_cat("irregular verbs")
	end
end

-- Compute the categories to add the verb to, as well as the annotation to display in the conjugation title bar.
We -- combine the code to do these functions as both categories and title bar contain similar information. local function compute_categories_and_annotation(alternant_multiword_spec) alternant_multiword_spec.categories = {} local ann = {} alternant_multiword_spec.annotation = ann ann.form = {} ann.weakness = {} ann.vowels = {} ann.passive = nil ann.reduced = {} ann.irreg = {} ann.defective = {} local multiword_lemma = false for _, slot in ipairs(export.potential_lemma_slots) do if alternant_multiword_spec.forms[slot] then for _, formobj in ipairs(alternant_multiword_spec.forms[slot]) do if formobj.form:find(" ") then multiword_lemma = true break end end break end end local function insert_ann(anntype, value) m_table.insertIfNot(alternant_multiword_spec.annotation[anntype], value) end local function insert_cat(cat, also_when_multiword) -- Don't place multiword terms in categories like 'Arabic form-II verbs' to avoid spamming the categories with -- such terms. if also_when_multiword or not multiword_lemma then m_table.insertIfNot(alternant_multiword_spec.categories, "Arabic " .. cat) end end iut.map_word_specs(alternant_multiword_spec, function(base) add_categories_and_annotation(alternant_multiword_spec, base, multiword_lemma, insert_ann, insert_cat) end) for slot, name in pairs(slots_that_may_be_uncertain) do if alternant_multiword_spec.forms[slot] then for _, form in ipairs(alternant_multiword_spec.forms[slot]) do if form.uncertain then if form.form == "?" 
then insert_cat(("verbs with explicitly unknown %ss"):format(name)) else insert_cat(("verbs needing %s checked"):format(name)) end break end end end end if alternant_multiword_spec.has_active then if alternant_multiword_spec.has_passive and alternant_multiword_spec.has_non_impers_passive then insert_cat("verbs with full passive") ann.passive = "full passive" elseif alternant_multiword_spec.has_passive then insert_cat("verbs with impersonal passive") ann.passive = "impersonal passive" else insert_cat("verbs lacking passive forms") ann.passive = "no passive" end else if alternant_multiword_spec.has_non_impers_passive then insert_cat("passive verbs") insert_cat("verbs with full passive") ann.passive = "passive-only" else insert_cat("passive verbs") insert_cat("impersonal verbs") insert_cat("verbs with impersonal passive") ann.passive = "impersonal (passive-only)" end end if alternant_multiword_spec.passive_uncertain then insert_cat("verbs needing passive checked") ann.passive = ann.passive .. 
' <abbr title="passive status uncertain">(?)</abbr>' end if alternant_multiword_spec.has_active and not alternant_multiword_spec.has_imp then insert_ann("defective", "no imperative") insert_cat("verbs lacking imperative forms") end if not alternant_multiword_spec.has_past then insert_ann("defective", "no past") insert_cat("verbs lacking past forms") end if not alternant_multiword_spec.has_nonpast then insert_ann("defective", "no non-past") insert_cat("verbs lacking non-past forms") end local ann_parts = {} local function insert_ann_part(part, conj) local val = table.concat(ann[part], conj or " or ") if val ~= "" and val ~= "regular" then table.insert(ann_parts, val) end end insert_ann_part("form") insert_ann_part("weakness") insert_ann_part("reduced") insert_ann_part("vowels") if ann.passive then table.insert(ann_parts, ann.passive) end insert_ann_part("irreg") insert_ann_part("defective", ", ") alternant_multiword_spec.annotation = table.concat(ann_parts, ", ") end local function show_forms(alternant_multiword_spec) local lemmas = {} for _, slot in ipairs(export.potential_lemma_slots) do if alternant_multiword_spec.forms[slot] then for _, formobj in ipairs(alternant_multiword_spec.forms[slot]) do table.insert(lemmas, formobj) end break end end alternant_multiword_spec.lemmas = lemmas -- save for later use in make_table() alternant_multiword_spec.vn = alternant_multiword_spec.forms.vn -- save for later use in make_table() -- Reconstruct the original verb spec without overrides for verbal nouns and participles, since those specific slots -- are ignored by {{ar-verb form}}. Compute this once beforehand; `transform_accel_obj` is called repeatedly on each -- form and we don't want to compute this repeatedly. 
local reconstructed_verb_spec = iut.reconstruct_original_spec(alternant_multiword_spec, { preprocess_angle_bracket_spec = function(spec) spec = spec:match("^<(.*)>$") assert(spec) local segments = put.parse_multi_delimiter_balanced_segment_run(spec, {{"[", "]"}, {"<", ">"}}) local dot_separated_groups = put.split_alternating_runs_and_strip_spaces(segments, "%.") -- Rejoin each dot-separated group into a single string, since we aren't actually going to do any parsing -- of bracket-bounded textual runs; then filter out overrides for verbal nouns and participles. local filtered_indicators = {} for _, dot_separated_group in ipairs(dot_separated_groups) do local indicator = table.concat(dot_separated_group) -- FIXME: Do we want to filter out any other indicators? if not (indicator:find("^vn:") or indicator:find("^[ap]p:")) then table.insert(filtered_indicators, indicator) end end return ("<%s>"):format(table.concat(filtered_indicators, ".")) end, }) -- If we're dealing with a single word, no alternants and a single verb form, use the auto-conjugation-fetching -- variant. local reconstructed_lemma, inside = reconstructed_verb_spec:match("^([^ <>()]+)(%b<>)$") if inside and alternant_multiword_spec.verb_forms[1] and not alternant_multiword_spec.verb_forms[2] then reconstructed_verb_spec = ("+%s<%s>"):format(reconstructed_lemma, alternant_multiword_spec.verb_forms[1]) end local function transform_accel_obj(slot, formobj, accel_obj) if not accel_obj then return accel_obj end if slot == "ap" or slot == "pp" or slot == "vn" then -- FIXME: [[Module:accel]] can't correctly handle more than one verb form for participles and verbal nouns accel_obj.form = slot .. "-" .. table.concat(alternant_multiword_spec.verb_forms, ",") else accel_obj.form = "verb-form-" .. reconstructed_verb_spec end return accel_obj end local function generate_link(data) local form = data.form local term = form.formval_for_link local alt = form.alt if term == "?" then term = nil alt = "?" 
end local link = m_links.full_link { lang = lang, term = term, tr = "-", accel = form.accel_obj, alt = alt, gloss = form.gloss, genders = form.genders, pos = form.pos, lit = form.lit, id = form.id, } .. iut.get_footnote_text(form.footnotes, data.footnote_obj) if form.q and form.q[1] or form.qq and form.qq[1] or form.l and form.l[1] or form.ll and form.ll[1] then link = require(pron_qualifier_module).format_qualifiers { lang = lang, text = link, q = form.q, qq = form.qq, l = form.l, ll = form.ll, } end return link end local props = { lang = lang, lemmas = lemmas, transform_accel_obj = transform_accel_obj, generate_link = generate_link, slot_list = alternant_multiword_spec.verb_slots, include_translit = true, } iut.show_forms(alternant_multiword_spec.forms, props) end ------------------------------------------------------------------------------- -- Functions to create inflection tables -- ------------------------------------------------------------------------------- -- Make the conjugation table. Called from export.show(). local function make_table(alternant_multiword_spec) local text = mw.getCurrentFrame():expandTemplate{ title = 'inflection-table-top', args = { title = 'Conjugation of {title}', tall = 'yes', palette = "green", category = 'conjugation', class = 'tr-alongside', -- temp hack to prevent extra line break } } text = text .. [=[ ! colspan="6" | verbal noun<br /><<الْمَصْدَر>> | colspan="7" | {vn} ]=] if alternant_multiword_spec.has_active then text = text .. [=[ |- ! colspan="6" | active participle<br /><<اِسْم الْفَاعِل>> | colspan="7" | {ap} ]=] end if alternant_multiword_spec.has_passive then text = text .. [=[ |- ! colspan="6" | passive participle<br /><<اِسْم الْمَفْعُول>> | colspan="7" | {pp} ]=] end text = text .. [=[ |- ! colspan="999" class="separator" | ]=] if alternant_multiword_spec.has_active then text = text .. [=[ |- ! colspan="12" class="outer" | active voice<br /><<الْفِعْل الْمَعْلُوم>> |- ! colspan="2" | ! 
colspan="3" | singular<br /><<الْمُفْرَد>> ! rowspan="12" class="separator" | ! colspan="2" | dual<br /><<الْمُثَنَّى>> ! rowspan="12" class="separator" | ! colspan="3"| plural<br /><<الْجَمْع>> |- ! colspan="2"| ! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> ! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> |- ! rowspan="2" | past (perfect) indicative<br /><<الْمَاضِي>> ! class="secondary" | m | rowspan="2" | {past_1s} | {past_2ms} | {past_3ms} | rowspan="2" | {past_2d} | {past_3md} | rowspan="2" | {past_1p} | {past_2mp} | {past_3mp} |- ! class="secondary" | f | {past_2fs} | {past_3fs} | {past_3fd} | {past_2fp} | {past_3fp} |- ! rowspan="2" | non-past (imperfect) indicative<br /><<الْمُضَارِع الْمَرْفُوع>> ! class="secondary" | m | rowspan="2" | {ind_1s} | {ind_2ms} | {ind_3ms} | rowspan="2" | {ind_2d} | {ind_3md} | rowspan="2" | {ind_1p} | {ind_2mp} | {ind_3mp} |- ! class="secondary" | f | {ind_2fs} | {ind_3fs} | {ind_3fd} | {ind_2fp} | {ind_3fp} |- ! rowspan="2" | subjunctive<br /><<الْمُضَارِع الْمَنْصُوب>> ! class="secondary" | m | rowspan="2" | {sub_1s} | {sub_2ms} | {sub_3ms} | rowspan="2" | {sub_2d} | {sub_3md} | rowspan="2" | {sub_1p} | {sub_2mp} | {sub_3mp} |- ! class="secondary" | f | {sub_2fs} | {sub_3fs} | {sub_3fd} | {sub_2fp} | {sub_3fp} |- ! rowspan="2" | jussive<br /><<الْمُضَارِع الْمَجْزُوم>> ! class="secondary" | m | rowspan="2" | {juss_1s} | {juss_2ms} | {juss_3ms} | rowspan="2" | {juss_2d} | {juss_3md} | rowspan="2" | {juss_1p} | {juss_2mp} | {juss_3mp} |- ! class="secondary" | f | {juss_2fs} | {juss_3fs} | {juss_3fd} | {juss_2fp} | {juss_3fp} |- ! rowspan="2" | imperative<br /><<الْأَمْر>> ! 
class="secondary" | m | rowspan="2" | | {imp_2ms} | rowspan="2" | | rowspan="2" | {imp_2d} | rowspan="2" | | rowspan="2" | | {imp_2mp} | rowspan="2" | |- ! class="secondary" | f | {imp_2fs} | {imp_2fp} ]=] end if alternant_multiword_spec.has_passive then text = text .. [=[ |- ! colspan="999" class="separator" | |- ! colspan="12" class="outer" | passive voice<br /><<الْفِعْل الْمَجْهُول>> |- ! colspan="2" | ! colspan="3" | singular<br /><<الْمُفْرَد>> ! rowspan="10" class="separator" | ! colspan="2" | dual<br /><<الْمُثَنَّى>> ! rowspan="10" class="separator" | ! colspan="3" | plural<br /><<الْجَمْع>> |- ! colspan="2" | ! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> ! 1<sup>st</sup> person<br /><<الْمُتَكَلِّم>> ! 2<sup>nd</sup> person<br /><<الْمُخَاطَب>> ! 3<sup>rd</sup> person<br /><<الْغَائِب>> |- ! rowspan="2" | past (perfect) indicative<br /><<الْمَاضِي>> ! class="secondary" | m | rowspan="2" | {past_pass_1s} | {past_pass_2ms} | {past_pass_3ms} | rowspan="2" | {past_pass_2d} | {past_pass_3md} | rowspan="2" | {past_pass_1p} | {past_pass_2mp} | {past_pass_3mp} |- ! class="secondary" | f | {past_pass_2fs} | {past_pass_3fs} | {past_pass_3fd} | {past_pass_2fp} | {past_pass_3fp} |- ! rowspan="2" | non-past (imperfect) indicative<br /><<الْمُضَارِع الْمَرْفُوع>> ! class="secondary" | m | rowspan="2" | {ind_pass_1s} | {ind_pass_2ms} | {ind_pass_3ms} | rowspan="2" | {ind_pass_2d} | {ind_pass_3md} | rowspan="2" | {ind_pass_1p} | {ind_pass_2mp} | {ind_pass_3mp} |- ! class="secondary" | f | {ind_pass_2fs} | {ind_pass_3fs} | {ind_pass_3fd} | {ind_pass_2fp} | {ind_pass_3fp} |- ! rowspan="2" | subjunctive<br /><<الْمُضَارِع الْمَنْصُوب>> ! 
class="secondary" | m | rowspan="2" | {sub_pass_1s} | {sub_pass_2ms} | {sub_pass_3ms} | rowspan="2" | {sub_pass_2d} | {sub_pass_3md} | rowspan="2" | {sub_pass_1p} | {sub_pass_2mp} | {sub_pass_3mp} |- ! class="secondary" | f | {sub_pass_2fs} | {sub_pass_3fs} | {sub_pass_3fd} | {sub_pass_2fp} | {sub_pass_3fp} |- ! rowspan="2" | jussive<br /><<الْمُضَارِع الْمَجْزُوم>> ! class="secondary" | m | rowspan="2" | {juss_pass_1s} | {juss_pass_2ms} | {juss_pass_3ms} | rowspan="2" | {juss_pass_2d} | {juss_pass_3md} | rowspan="2" | {juss_pass_1p} | {juss_pass_2mp} | {juss_pass_3mp} |- ! class="secondary" | f | {juss_pass_2fs} | {juss_pass_3fs} | {juss_pass_3fd} | {juss_pass_2fp} | {juss_pass_3fp} ]=] end text = text .. mw.getCurrentFrame():expandTemplate{ title = 'inflection-table-bottom', args = { notes = '{footnote}', } } local forms = alternant_multiword_spec.forms if not alternant_multiword_spec.lemmas then forms.title = "—" else local linked_lemmas = {} for _, form in ipairs(alternant_multiword_spec.lemmas) do table.insert(linked_lemmas, link_term(form.form, "term")) end forms.title = table.concat(linked_lemmas, ", ") end local ann_parts = {} if alternant_multiword_spec.annotation ~= "" then table.insert(ann_parts, alternant_multiword_spec.annotation) end if alternant_multiword_spec.vn then local linked_vns = {} for _, form in ipairs(alternant_multiword_spec.vn) do table.insert(linked_vns, link_term(form.form, "term")) end table.insert(ann_parts, (#linked_vns > 1 and "verbal nouns" or "verbal noun") .. " " .. table.concat(linked_vns, ", ")) end local annotation = table.concat(ann_parts, ", ") if annotation ~= "" then forms.title = forms.title .. " (" .. annotation .. ")" end -- Format the table. 
local tagged_table = rsub(text, "<<(.-)>>", tag_text) return m_string_utilities.format(tagged_table, forms) end ------------------------------------------------------------------------------- -- External entry points -- ------------------------------------------------------------------------------- -- Append two lists `l1` and `l2`, removing duplicates. If either is {nil}, just return the other. local function combine_lists(l1, l2) -- combine_footnotes() does exactly what we want. return iut.combine_footnotes(l1, l2) end local function combine_metadata(data) local src1 = data.form1 local src2 = data.form2 local dest = data.dest_form dest.uncertain = src1.uncertain or src2.uncertain if src1.genders and src2.genders and not m_table.deepEquals(src1.genders, src2.genders) then -- do nothing else dest.genders = src1.genders or src2.genders end if src1.pos and src2.pos and src1.pos ~= src2.pos then -- do nothing else dest.pos = src1.pos or src2.pos end -- Don't copy .alt, .gloss, .lit, .id, which describe a single term and don't extend to multiword terms. dest.q = combine_lists(src1.q, src2.q) dest.qq = combine_lists(src1.qq, src2.qq) dest.l = combine_lists(src1.l, src2.l) dest.ll = combine_lists(src1.ll, src2.ll) end -- Externally callable function to parse and conjugate a verb given user-specified arguments. -- Return value is WORD_SPEC, an object where the conjugated forms are in `WORD_SPEC.forms` -- for each slot. If there are no values for a slot, the slot key will be missing. The value -- for a given slot is a list of objects {form=FORM, footnotes=FOOTNOTES}. function export.do_generate_forms(args, source_template, headword_head) local PAGENAME = mw.loadData("Module:headword/data").pagename local function in_template_space() return mw.title.getCurrentTitle().nsText == "Template" end -- Determine the verb spec we're being asked to generate the conjugation of. 
This may be taken from the current page -- title or the value of |pagename=; but not when called from {{ar-verb form}}, where the page title is a -- non-lemma form. Note that the verb spec may omit the lemma; e.g. it may be "<II>". For this reason, we use the -- value of `pagename` computed here down below, when calling normalize_all_lemmas(). local pagename = source_template ~= "ar-verb form" and args.pagename or PAGENAME local head = headword_head or pagename local arg1 = args[1] if not arg1 then if (pagename == "ar-conj" or pagename == "ar-verb" or pagename == "ar-verb form") and in_template_space() then arg1 = "كتب<I/a~u.pass>" else arg1 = "<>" end end -- When called from {{ar-verb form}}, determine the non-lemma form whose inflections we're being asked to -- determine. This normally comes from the page title or the value of |pagename=. local verb_form_of_form if source_template == "ar-verb form" then verb_form_of_form = args.pagename if not verb_form_of_form then if PAGENAME == "ar-verb form" and in_template_space() then verb_form_of_form = "كتبت" else verb_form_of_form = PAGENAME end end end local incorporated_headword_head_into_lemma = false if arg1:find("^<.*>$") then -- missing lemma if head:find(" ") then -- If multiword lemma, try to add arg spec after the first word. -- Try to preserve the brackets in the part after the verb, but don't do it -- if there aren't the same number of left and right brackets in the verb -- (which means the verb was linked as part of a larger expression). local first_word, post = rmatch(head, "^(.-)( .*)$") local left_brackets = rsub(first_word, "[^%[]", "") local right_brackets = rsub(first_word, "[^%]]", "") if #left_brackets == #right_brackets then arg1 = iut.remove_redundant_links(first_word) .. arg1 .. post incorporated_headword_head_into_lemma = true else -- Try again using the form without links. 
local linkless_head = m_links.remove_links(head) if linkless_head:find(" ") then first_word, post = rmatch(linkless_head, "^(.-)( .*)$") arg1 = first_word .. arg1 .. post else error("Unable to incorporate <...> spec into explicit head due to a multiword linked verb or " .. "unbalanced brackets; please include <> explicitly: " .. arg1) end end else -- Will be incorporated through `head` below in the call to normalize_all_lemmas(). incorporated_headword_head_into_lemma = true end end local parse_props = { parse_indicator_spec = parse_indicator_spec, angle_brackets_omittable = true, allow_blank_lemma = true, } local alternant_multiword_spec = iut.parse_inflected_text(arg1, parse_props) alternant_multiword_spec.pos = pos or "verbs" alternant_multiword_spec.args = args alternant_multiword_spec.source_template = source_template alternant_multiword_spec.verb_form_of_form = verb_form_of_form alternant_multiword_spec.incorporated_headword_head_into_lemma = incorporated_headword_head_into_lemma normalize_all_lemmas(alternant_multiword_spec, head) detect_all_indicator_specs(alternant_multiword_spec) local inflect_props = { lang = lang, slot_list = alternant_multiword_spec.verb_slots, inflect_word_spec = conjugate_verb, combine_metadata = combine_metadata, -- We add links around the generated verbal forms rather than allow the entire multiword -- expression to be a link, so ensure that user-specified links get included as well. 
include_user_specified_links = true, } iut.inflect_multiword_or_alternant_multiword_spec(alternant_multiword_spec, inflect_props) if debug_translit then for slot, forms in pairs(alternant_multiword_spec.forms) do for _, form in ipairs(forms) do if form.translit then local full_form_translit = (lang:transliterate(m_links.remove_links(form.form))) if full_form_translit ~= form.translit then error(("Internal error: For slot '%s', form '%s' incremental translit '%s' not same as full translit '%s'"): format(slot, form.form, form.translit, full_form_translit)) end end form.form = iut.remove_redundant_links(form.form) end end end -- Remove redundant brackets around entire forms. for slot, forms in pairs(alternant_multiword_spec.forms) do for _, form in ipairs(forms) do form.form = iut.remove_redundant_links(form.form) end end determine_slot_uncertainty_from_forms(alternant_multiword_spec) determine_verb_properties_from_forms(alternant_multiword_spec) compute_categories_and_annotation(alternant_multiword_spec) if args.json and source_template == "ar-conj" then -- There is a circular reference in `base.alternant_multiword_spec`, which points back to top level. iut.map_word_specs(alternant_multiword_spec, function(base) base.alternant_multiword_spec = nil end) return require("Module:JSON").toJSON(alternant_multiword_spec) end return alternant_multiword_spec end -- Entry point for {{ar-conj}}. Template-callable function to parse and conjugate a verb given -- user-specified arguments and generate a displayable table of the conjugated forms. 
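-- Illustrative sketch only (not called anywhere in this module): do_generate_forms() clears
-- `base.alternant_multiword_spec` before JSON output because that field points back to the top-level spec, and a
-- serializer that walks tables recursively may loop or fail on such a structure. The helper name `has_cycle` is
-- hypothetical; it reports whether a table can reach itself through its own fields.
local function has_cycle(value, on_path)
	if type(value) ~= "table" then
		return false
	end
	on_path = on_path or {}
	if on_path[value] then
		-- We came back to a table that is still being visited: a genuine cycle.
		return true
	end
	on_path[value] = true
	for _, child in pairs(value) do
		if has_cycle(child, on_path) then
			return true
		end
	end
	-- Unmark on the way out so that a table merely shared by two fields is not reported as a cycle.
	on_path[value] = nil
	return false
end
-- With a back-pointer from a base to its spec, has_cycle(spec) is true; once the back-pointer is set to nil, as
-- done before the JSON conversion, it is false.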
function export.show(frame) local parent_args = frame:getParent().args local params = { [1] = {}, ["noautolinktext"] = {type = "boolean"}, ["noautolinkverb"] = {type = "boolean"}, ["t"] = {}, -- for use by {{ar-verb form}}; otherwise ignored ["id"] = {}, -- for use by {{ar-verb form}}; otherwise ignored ["pagename"] = {}, -- for testing/documentation pages ["json"] = {type = "boolean"}, -- for bot use } local args = require("Module:parameters").process(parent_args, params) local alternant_multiword_spec = export.do_generate_forms(args, "ar-conj") if type(alternant_multiword_spec) == "string" then -- JSON return value return alternant_multiword_spec end show_forms(alternant_multiword_spec) return make_table(alternant_multiword_spec) .. require("Module:utilities").format_categories(alternant_multiword_spec.categories, lang, nil, nil, force_cat) end function export.verb_forms(frame) local parargs = frame:getParent().args local params = { [1] = {}, [2] = {}, [3] = {}, [4] = {}, [5] = {}, pagename = {}, } for _, form in ipairs(allowed_vforms) do -- FIXME: We go up to 5 here. The code supports unlimited variants but it's unlikely we will ever see more than -- 2. for index = 1, 5 do local prefix = index == 1 and form or form .. index params[prefix .. "-pv"] = {} for _, extn in ipairs { "", "-vn", "-ap", "-pp" } do params[prefix .. extn] = {} params[prefix .. extn .. "-head"] = {} -- FIXME: No -tr? params[prefix .. extn .. "-gloss"] = {} end end end local args = require("Module:parameters").process(parargs, params) local i = 1 local past_vowel_re = "^[aui,]*$" local combined_root = nil if not args[i] or rfind(args[i], past_vowel_re) then combined_root = args.pagename or mw.loadData("Module:headword/data").pagename if not rfind(combined_root, "^([^ ]) ([^ ]) ([^ ])$") and not rfind(combined_root, "^([^ ]) ([^ ]) ([^ ]) ([^ ])$") then error("When inferring roots from page title, need three or four space-separated radicals: " .. 
combined_root) end elseif rfind(args[i], " ") then combined_root = args[i] i = i + 1 else local separate_roots = {} while args[i] and not rfind(args[i], past_vowel_re) do table.insert(separate_roots, args[i]) i = i + 1 end combined_root = table.concat(separate_roots, " ") end local past_vowel = args[i] i = i + 1 if past_vowel and not rfind(past_vowel, past_vowel_re) then error("Unrecognized past vowel, should be 'a', 'i', 'u', 'a,u', etc. or empty: " .. past_vowel) end -- Spaces interfere with parsing as a unit in [[Module:inflection utilities]], so replace with underscore. combined_root = combined_root:gsub(" ", "_") local split_root = rsplit(combined_root, "_") -- Map from verb forms (I, II, etc.) to a table of verb properties, -- which has entries e.g. for "verb" (either true to autogenerate the verb -- head, or an explicitly specified verb head using e.g. argument "I-head"), -- and for "verb-gloss" (which comes from e.g. the argument "I" or "I-gloss"), -- and for "vn" and "vn-gloss", "ap" and "ap-gloss", "pp" and "pp-gloss". local verb_properties = {} for _, form in ipairs(allowed_vforms) do local formpropslist = {} local derivs = {{"verb", ""}, {"vn", "-vn"}, {"ap", "-ap"}, {"pp", "-pp"}} local index = 1 while true do local formprops = {} local prefix = index == 1 and form or form .. index if prefix == "I" then formprops.pv = past_vowel end if args[prefix .. "-pv"] then formprops.pv = args[prefix .. "-pv"] end for _, deriv in ipairs(derivs) do local prop = deriv[1] local extn = deriv[2] if args[prefix .. extn] == "+" then formprops[prop] = true elseif args[prefix .. extn] == "-" then formprops[prop] = false elseif args[prefix .. extn] then formprops[prop] = true formprops[prop .. "-gloss"] = args[prefix .. extn] end if args[prefix .. extn .. "-head"] then if formprops[prop] == nil then formprops[prop] = true end formprops[prop] = args[prefix .. extn .. "-head"] end if args[prefix .. extn .. 
"-gloss"] then if formprops[prop] == nil then formprops[prop] = true end formprops[prop .. "-gloss"] = args[prefix .. extn .. "-gloss"] end end if formprops.verb then -- If a verb form specified, also turn on vn (unless form I, with -- unpredictable vn) and ap, and maybe pp, according to form, -- weakness and past vowel. But don't turn these on if there's -- an explicit on/off specification for them (e.g. I-pp=-). if form ~= "I" and formprops.vn == nil then formprops.vn = true end if formprops.ap == nil then formprops.ap = true end local weakness = weakness_from_radicals(form, split_root[1], split_root[2], split_root[3], split_root[4]) if formprops.pp == nil and not vform_probably_no_passive(form, weakness, rsplit(formprops.pv or "", ","), {}) then formprops.pp = true end if formprops.verb == true or formprops.vn == true or formprops.ap == true or formprops.pp == true then formprops.need_autogen = true end table.insert(formpropslist, formprops) index = index + 1 else break end end table.insert(verb_properties, {form, formpropslist}) end -- Go through and create the verb form derivations as necessary, when they haven't been explicitly given. for _, vplist in ipairs(verb_properties) do local vform = vplist[1] for _, props in ipairs(vplist[2]) do if props.need_autogen then local form_with_vowels if vform == "I" then local pv = props.pv if not pv then -- Make up likely past vowels based on weakness and actual radical. if split_root[3] == W then -- final-weak form_with_vowels = "I/a~u" elseif split_root[3] == Y then form_with_vowels = "I/a~i" elseif split_root[2] == W then --hollow form_with_vowels = "I/u~u" elseif split_root[2] == Y then form_with_vowels = "I/i~i" else -- most common; doesn't matter so much since we're not displaying the non-past form_with_vowels = "I/a~u" end else local pvs = rsplit(pv, ",") local vowel_sufs = {} for _, pv in ipairs(pvs) do local vowel_spec if pv == "a" then -- Make up likely past vowels based on weakness and actual radical. 
if split_root[3] == W then -- final-weak vowel_spec = "a~u" elseif split_root[3] == Y then vowel_spec = "a~i" elseif split_root[2] == W then --hollow vowel_spec = "a~u" elseif split_root[2] == Y then vowel_spec = "a~i" else -- most common; doesn't matter so much since we're not displaying the non-past vowel_spec = "a~u" end elseif pv == "i" then -- most common; doesn't matter so much since we're not displaying the non-past vowel_spec = "i~a" elseif pv == "u" then -- most common; doesn't matter so much since we're not displaying the non-past vowel_spec = "u~u" else error(("Internal error: Bad past vowel '%s' in {{ar-verb forms}}"):format(pv)) end table.insert(vowel_sufs, vowel_spec) end form_with_vowels = "I/" .. table.concat(vowel_sufs, "/") end else form_with_vowels = vform end local angle_bracket_spec = ("%s<%s.pass>"):format(combined_root, form_with_vowels) local alternant_multiword_spec = export.do_generate_forms({angle_bracket_spec}, "ar-verb forms") local function format_forms(forms) if not forms then return "-" -- FIXME: Throw an error? end local formatted = {} for _, form in ipairs(forms) do if form.translit then table.insert(formatted, ("%s//%s"):format(form.form, form.translit)) else table.insert(formatted, form.form) end end return table.concat(formatted, ",") end if props.verb == true then props.verb = format_forms(alternant_multiword_spec.forms.past_3ms) end for _, deriv in ipairs({"vn", "ap", "pp"}) do if props[deriv] == true then props[deriv] = format_forms(alternant_multiword_spec.forms[deriv]) end end end end end -- Go through and output the result local formtextarr = {} for _, vplist in ipairs(verb_properties) do local form = vplist[1] for _, props in ipairs(vplist[2]) do local textarr = {} if props.verb then local text = "* '''[[Appendix:Arabic verbs#Form " .. form .. "|Form " .. form .. 
"]]''': " local linktext = {} local splitheads = rsplit(props.verb, "[,،]") for _, head in ipairs(splitheads) do table.insert(linktext, m_links.full_link({lang = lang, term = head, gloss = props["verb-gloss"]})) end text = text .. table.concat(linktext, ", ") table.insert(textarr, text) for _, derivengl in ipairs({{"vn", "Kata nama kerjaan"}, {"ap", "Active participle"}, {"pp", "Passive participle"}}) do local deriv = derivengl[1] local engl = derivengl[2] if props[deriv] then local text = "** " .. engl .. ": " local linktext = {} local splitheads = rsplit(props[deriv], "[,،]") for _, head in ipairs(splitheads) do local ar, translit = head:match("^(.*)//(.-)$") if not ar then ar = head end table.insert(linktext, m_links.full_link {lang = lang, term = ar, tr = translit, gloss = props[deriv .. "-gloss"]} ) end text = text .. table.concat(linktext, ", ") table.insert(textarr, text) end end table.insert(formtextarr, table.concat(textarr, "\n")) end end end return table.concat(formtextarr, "\n") end -- Infer radicals from lemma headword (i.e. 3rd masculine singular past) and verb form (I, II, etc.). Throw an error if -- headword is malformed. A given returned radical may be actually be a list of possible radicals, where the first one -- should be used if the user didn't explicitly give the radical. If the list contains a field `ambig = true`, the -- radical is considered ambiguous and should not be categorized. `is_reduced` indicates that the user specified -- `.reduced` to indicate that the verb form is reduced by assimilation and/or haplology (typically archaic Koranic -- forms such as اِدَّارَأَ instead of تَدَارَأَ; or اِسْطَاعَ instead of اِسْتِطَاعَ; etc. 
function export.infer_radicals(data) local headword, vform, passive, past_vowel, nonpast_vowel, is_reduced = data.headword, data.vform, data.passive, data.past_vowel, data.nonpast_vowel, data.is_reduced past_vowel = past_vowel or "-" nonpast_vowel = nonpast_vowel or "-" local function verify_vowel(vowel, param) if vowel ~= A and vowel ~= I and vowel ~= U and vowel ~= "-" then error(("Internal error: Bad value for %s: %s (should be Arabic diacritic vowel or '-')"):format( param, vowel)) end end verify_vowel(past_vowel, "past_vowel") verify_vowel(nonpast_vowel, "nonpast_vowel") local ch = {} local form_viii_assim, variant -- sub out alif-madda for easier processing headword = rsub(headword, AMAD, HAMZA .. ALIF) local function infer_err(msg, noann) local anns = {} local nohead, novform if noann == "nohead" then nohead = true elseif noann == "novform" then novform = true elseif noann == "nohead-vform" then nohead = true novform = true elseif noann then error(("Internal error: Unrecognized value for 'noann': %s"):format(dump(noann))) end if not nohead then table.insert(anns, ("headword=%s"):format(data.headword)) end if not novform then table.insert(anns, ("verb form=%s"):format(data.vform)) end anns = table.concat(anns, ", ") if anns ~= "" then anns = ": " .. anns end error(msg .. anns) end local len = ulen(headword) local expected_length -- extract the headword letters into an array for i = 1, len do table.insert(ch, usub(headword, i, i)) end -- check that the letter at the given index is the given string, or -- is one of the members of the given array local function check(index, must) local letter = ch[index] if type(must) == "string" then if not letter then infer_err("Letter " .. index .. " is nil") end if letter ~= must then infer_err(("For verb form %s, letter %s must be %s, not %s"):format(vform, index, must, letter), "novform") end elseif not m_table.contains(must, letter) then infer_err("For verb form " .. vform .. ", radical " .. index .. " must be one of " .. 
table.concat(must, " ") .. ", not " .. letter, "novform") end end -- Check that length of headword is within [min, max] local function check_len(min, max) if min and len < min then infer_err(("Not enough letters for verb form %s, expected at least %s"):format(vform, min), "novform") end if max and len > max then infer_err(("Too many letters for verb form %s, expected at most %s"):format(vform, max), "novform") end end -- If the vowels are i~a or u~u, a form I verb beginning with w- normally keeps the w in the non-past. Otherwise it -- loses it (i.e. it is "assimilated"). local function form_I_w_non_assimilated() return req(past_vowel, I) and req(nonpast_vowel, A) or req(past_vowel, U) and req(nonpast_vowel, U) end -- Convert radicals to canonical form (handle various hamza varieties and check for misplaced alif or alif maqṣūra; -- legitimate cases of these letters are handled above). local function convert(rad, index) if type(rad) == "table" then for i, r in ipairs(rad) do rad[i] = convert(r, index) end return rad elseif rad == HAMZA_ON_ALIF or rad == HAMZA_UNDER_ALIF or rad == HAMZA_ON_W or rad == HAMZA_ON_Y then return HAMZA elseif rad == AMAQ then infer_err("Radical " .. index .. " must not be alif maqṣūra") elseif rad == ALIF then infer_err("Radical " .. index .. " must not be alif") else return rad end end local quadlit = vform:find("q$") -- find first radical, start of second/third radicals, check for -- required letters local radstart, rad1, rad2, rad3, rad4 local weakness if vform == "I" or vform == "II" then rad1 = ch[1] radstart = 2 elseif vform == "III" then rad1 = ch[1] check(2, {ALIF, W}) -- W occurs in passive-only verbs radstart = 3 elseif vform == "IV" then -- this would be alif-madda but we replaced it with hamza-alif above. 
if ch[1] == HAMZA and ch[2] == ALIF then rad1 = HAMZA else check(1, HAMZA_ON_ALIF) rad1 = ch[2] end radstart = 3 elseif vform == "V" then check(1, is_reduced and ALIF or T) rad1 = ch[2] radstart = 3 elseif vform == "VI" then check(1, is_reduced and ALIF or T) if ch[2] == AMAD then rad1 = HAMZA radstart = 3 else rad1 = ch[2] check(3, {ALIF, W}) -- W occurs in passive-only verbs radstart = 4 end elseif vform == "VII" then check(1, ALIF) if is_reduced then check(2, M) rad1 = M radstart = 3 else check(2, N) rad1 = ch[3] radstart = 4 end elseif vform == "VIII" then check(1, ALIF) rad1 = ch[2] if rad1 == "د" then rad1 = {"د", "ذ"} -- not considered ambiguous since it's usually د radstart = 3 form_viii_assim = "دّ" elseif rad1 == "ظ" and ch[3] == "ط" and len >= 5 then -- [[اظطلم]], variant of [[اظلم]] radstart = 4 form_viii_assim = "ظْط" elseif rad1 == "ذ" and ch[3] == "د" and len >= 5 then -- [[اذدكر]], variant of [[اذكر]] radstart = 4 form_viii_assim = "ذْد" elseif rad1 == T or rad1 == "ث" or rad1 == "ذ" or rad1 == "ط" or rad1 == "ظ" then radstart = 3 form_viii_assim = rad1 .. SH elseif rad1 == "ز" then check(3, "د") radstart = 4 form_viii_assim = "زْد" elseif rad1 == "ص" or rad1 == "ض" then check(3, "ط") radstart = 4 form_viii_assim = rad1 .. SK .. "ط" else check(3, T) radstart = 4 rad1 = convert(rad1, 1) form_viii_assim = rad1 .. SK .. "ت" end if rad1 == T then -- Radical is ambiguous, might be ت or و or ي but doesn't affect conjugation. Note that there are no -- form-VIII verbs with initial radical ي given in Hans Wehr but Lane mentions at least: -- - (page 2973) اِتَّأَسَ, with assimilation of the ي to ت, from root ي ء س; -- - (page 2975) اِتَّبَسَ non-past يَتَّبِسُ and alternative اِيتَبَسَ non-past يَاتَبِسُ from the root ي ب س; -- - (page 2976) اِتَّسَرَ non-past يَتَّسِرُ or alternatively يَأْتَسِرُ with hamza preserved from the root ي س ر. 
-- These alternative forms seem very rare and probably not worth worrying about, but if we want to handle -- them, we can do it when the time comes. rad1 = {T, W, Y, ambig = true} -- اِتَّخَذَ irregularly has hamza as the radical but assimilates like و if ch[3] == "خ" and ch[4] == "ذ" then rad1[4] = HAMZA end end elseif vform == "IX" then check(1, ALIF) rad1 = ch[2] radstart = 3 elseif vform == "X" then check(1, ALIF) check(2, S) if is_reduced then rad1 = ch[3] radstart = 4 else check(3, T) rad1 = ch[4] radstart = 5 end elseif vform == "Iq" then rad1 = ch[1] rad2 = ch[2] radstart = 3 elseif vform == "IIq" then check(1, T) rad1 = ch[2] rad2 = ch[3] radstart = 4 elseif vform == "IIIq" then check(1, ALIF) rad1 = ch[2] rad2 = ch[3] check(4, N) radstart = 5 elseif vform == "IVq" then check(1, ALIF) rad1 = ch[2] rad2 = ch[3] radstart = 4 elseif vform == "XI" then check_len(5, 5) check(1, ALIF) rad1 = ch[2] rad2 = ch[3] check(4, ALIF) rad3 = ch[5] weakness = "sound" elseif vform == "XII" then check(1, ALIF) rad1 = ch[2] if ch[3] ~= ch[5] then infer_err("For verb form XII, letters 3 and 5 should be the same", "novform") end check(4, W) radstart = 5 elseif vform == "XIII" then check_len(5, 5) check(1, ALIF) rad1 = ch[2] rad2 = ch[3] check(4, W) rad3 = ch[5] if rad3 == AMAQ then weakness = "final-weak" else weakness = "sound" end elseif vform == "XIV" then check_len(6, 6) check(1, ALIF) rad1 = ch[2] rad2 = ch[3] check(4, N) rad3 = ch[5] if ch[6] == AMAQ then check_waw_ya(rad3) weakness = "final-weak" else if ch[5] ~= ch[6] then infer_err("For verb form XIV, letters 5 and 6 should be the same", "novform") end weakness = "sound" end elseif vform == "XV" then check_len(6, 6) check(1, ALIF) rad1 = ch[2] rad2 = ch[3] check(4, N) rad3 = ch[5] if rad3 == Y then check(6, ALIF) else check(6, AMAQ) end weakness = "sound" else error("Internal error: Unrecognized verb form " .. vform) end -- Process the last two radicals. RADSTART is the index of the first of the two. 
If it's nil then all radicals have -- already been processed above, and we don't do anything. if radstart then -- There must (normally) be one or two letters left. if len == radstart then if vform == "I" and ch[len] == Y then -- short form حَيَّ weakness = "final-weak" rad2 = Y rad3 = Y variant = "short" elseif vform == "IV" and rad1 == "ر" and ch[len] == AMAQ then -- irregular verb أَرَى weakness = "final-weak" rad2 = HAMZA rad3 = Y elseif vform == "X" and rad1 == "ح" and ch[len] == AMAQ then -- irregular verb اِسْتَحَى weakness = "final-weak" rad2 = Y rad3 = Y variant = "short" else -- If one letter left, then it's a geminate verb. If the letter is alif or alif maqṣūra, it will trigger -- an error down the line. if vform_supports_geminate(vform) then weakness = "geminate" rad2 = ch[len] rad3 = ch[len] if vform == "III" or vform == "VI" then variant = "short" end else infer_err("Apparent geminate verb, but geminate verbs not allowed for this verb form") end end elseif quadlit then -- Process last two radicals of a quadriliteral verb form. rad3 = ch[radstart] rad4 = ch[radstart + 1] expected_length = radstart + 1 check_len(expected_length) if rad4 == AMAQ or rad4 == ALIF and rad3 == Y or rad4 == Y then -- rad4 can be Y in passive-only verbs. if vform_supports_final_weak(vform) then weakness = "final-weak" -- Ambiguous radical; randomly pick wāw as radical (but avoid two wāws in a row); it could be wāw or -- yāʾ, but doesn't affect the conjugation. rad4 = rad3 == W and {Y, W, ambig = true} or {W, Y, ambig = true} else infer_err("Last radical is " .. rad4 .. " but verb form " .. vform .. " doesn't support final-weak verbs", "novform") end else weakness = "sound" end else -- Process last two radicals of a triliteral verb form. 
rad2 = ch[radstart] rad3 = ch[radstart + 1] expected_length = radstart + 1 check_len(expected_length) if vform == "I" and (is_waw_ya(rad3) or rad3 == ALIF or rad3 == AMAQ) then local inferred_past_vowel, inferred_nonpast_vowel -- Check for final-weak form I verb. It can end in tall alif (rad3 = wāw) or alif maqṣūra (rad3 = yāʾ) -- or a wāw or yāʾ (with a past vowel of i or u, e.g. nasiya/yansā "forget" or with a passive-only -- verb). if rad1 == W and not form_I_w_non_assimilated() then weakness = "assimilated+final-weak" else weakness = "final-weak" end if rad3 == ALIF then rad3 = W inferred_past_vowel = A inferred_nonpast_vowel = U if is_passive_only(passive) then infer_err("Final-weak form-I passive verbs should end in yāʔ (ي), not tall alif (ا)", "novform") end elseif rad3 == AMAQ then rad3 = Y inferred_past_vowel = A inferred_nonpast_vowel = I if is_passive_only(passive) then infer_err("Final-weak form-I passive verbs should end in yāʔ (ي), not alif maqṣūra (ى)", "novform") end elseif rad1 == "ح" and rad2 == Y and rad3 == Y then -- Long variant حَيِيَ. inferred_past_vowel = I inferred_nonpast_vowel = A variant = "long" else if not is_passive_only(passive) then -- does a non-passive final-weak verb in -uwa ever happen? (YES: e.g. [[رجو]] "to be slack") inferred_past_vowel = rad3 == Y and I or U inferred_nonpast_vowel = A end -- Ambiguous radical; randomly pick wāw as radical (but avoid two wāws); it could be wāw or yāʾ, but -- doesn't affect the conjugation. rad3 = (rad1 == W or rad2 == W) and {Y, W, ambig = true} or {W, Y, ambig = true} -- ambiguous end if inferred_past_vowel then local raw_past_vowel = rget(past_vowel) local raw_nonpast_vowel = rget(nonpast_vowel) if raw_past_vowel ~= "-" then if raw_past_vowel ~= inferred_past_vowel then infer_err(("Final-weak form-I verb inferred past vowel %s, which disagrees with " .. 
"explicitly specified %s"):format(undia[inferred_past_vowel], undia[raw_past_vowel]), "novform") else -- in case of footnote in past_vowel inferred_past_vowel = past_vowel end end if raw_nonpast_vowel ~= "-" and raw_nonpast_vowel ~= A and inferred_nonpast_vowel == U then -- if inferred as I or A, the reality can be the reverse; form-I final-weak verbs with a~a and -- i~i exist, e.g. سَعَى/يَسْعَى, وَلِيَ/يَلِي. Weird verb [[صها]] (also written [[صهى]]) has non-past -- يصهى so we can't throw an error in this situation. if raw_nonpast_vowel ~= inferred_nonpast_vowel then infer_err(("Final-weak form-I verb inferred non-past vowel %s, which disagrees with " .. "explicitly specified %s"):format(undia[inferred_nonpast_vowel], undia[raw_nonpast_vowel]), "novform") else -- in case of footnote in nonpast_vowel inferred_nonpast_vowel = nonpast_vowel end end end if not is_passive_only(passive) then if rget(past_vowel) == "-" then past_vowel = inferred_past_vowel end if rget(nonpast_vowel) == "-" then nonpast_vowel = inferred_nonpast_vowel end end elseif vform == "IX" and is_waw_ya(rad3) and len == radstart + 2 and ch[len] == AMAQ then -- Final-weak form IX verbs like اِرْعَوَى "to desist, to repent, to see the light". weakness = "final-weak" expected_length = radstart + 2 elseif vform == "X" and rad1 == "ح" and rad2 == Y and rad3 == ALIF then -- Long variant اِسْتَحْيَا. weakness = "final-weak" rad3 = Y variant = "long" elseif rad3 == AMAQ or rad2 == Y and rad3 == ALIF or rad3 == Y then -- rad3 == Y happens in passive-only verbs. if vform_supports_final_weak(vform) then weakness = "final-weak" else infer_err("Last radical is " .. rad3 .. " but verb form doesn't support final-weak verbs") end -- Ambiguous radical; randomly pick wāw as radical (but avoid two wāws); it could be wāw or yāʾ, but -- doesn't affect the conjugation. 
rad3 = (rad1 == W or rad2 == W) and {Y, W, ambig = true} or {W, Y, ambig = true} elseif rad2 == ALIF then if vform_supports_hollow(vform) then weakness = "hollow" local function set_past_to_a() if req(past_vowel, A) then -- already set elseif req(past_vowel, "-") or req(past_vowel, rget(nonpast_vowel)) then past_vowel = A else infer_err(("Form I hollow verb with nonpast vowel set to '%s' must have past vowel set to 'a' or the same value, not %s"): format(undia[rget(nonpast_vowel)], undia[rget(past_vowel)]), "novform") end end if vform == "I" and req(nonpast_vowel, U) then rad2 = W set_past_to_a() elseif vform == "I" and req(nonpast_vowel, I) then rad2 = Y set_past_to_a() else if req(nonpast_vowel, A) and not req(past_vowel, I) then infer_err(("Form I hollow verb with nonpast vowel set to 'a' must have past vowel set to 'i', not %s"): format(undia[rget(past_vowel)]), "novform") end -- Ambiguous radical; could be wāw or yāʾ; if verb form I, it's critical to get this right, and -- the caller checks for this situation and throws an error if non-past vowel is "a" and second -- radical isn't explicitly given. 
rad2 = {W, Y, ambig = true, need_radical = true} end else infer_err("Second radical is alif but verb form doesn't support hollow verbs") end elseif vform == "I" and rad1 == W and not form_I_w_non_assimilated() then weakness = "assimilated" elseif rad2 == rad3 and (vform == "III" or vform == "VI") then weakness = "geminate" variant = "long" else weakness = "sound" end end if expected_length then check_len(expected_length, expected_length) end end rad1 = convert(rad1, 1) rad2 = convert(rad2, 2) rad3 = convert(rad3, 3) rad4 = convert(rad4, 4) if not weakness then error("Internal error: Returned weakness from infer_radicals() is nil") end return { weakness = weakness, rad1 = rad1, rad2 = rad2, rad3 = rad3, rad4 = rad4, past_vowel = past_vowel, nonpast_vowel = nonpast_vowel, form_viii_assim = form_viii_assim, variant = variant, } end -- bot interface to infer_radicals() function export.infer_radicals_json(frame) local iparams = { headword = {}, vform = {}, passive = {}, past_vowel = {}, nonpast_vowel = {}, is_reduced = {type = "boolean"}, } local iargs = require("Module:parameters").process(frame.args, iparams) return require("Module:JSON").toJSON(export.infer_radicals(iargs)) end -- Infer vocalization from participle headword (active or passive), verb form (I, II, etc.) and whether the headword is -- active or passive. Throw an error if headword is malformed. Returned radicals may contain Latin letters "t", "w" or "y" -- indicating ambiguous radicals guessed to be tāʾ, wāw or yāʾ respectively. function export.infer_participle_vocalization(headword, vform, weakness, is_active) local chars = {} local orig_headword = headword -- Sub out alif-madda for easier processing. headword = rsub(headword, AMAD, HAMZA .. ALIF) local len = ulen(headword) -- Extract the headword letters into an array. 
for i = 1, len do table.insert(chars, usub(headword, i, i)) end local function form_intro_error_msg() return ("For verb form %s %s%s participle %s, "):format(vform, orig_headword ~= headword and "normalized " or "", is_active and "active" or "passive", headword) end local function err(msg) error(form_intro_error_msg() .. msg, 1) end -- Check that length of headword is within [min, max]. local function check_len(min, max) if min and len < min then err(("expected at least %s letters but saw %s"):format(min, len)) elseif max and len > max then err(("expected at most %s letters but saw %s"):format(max, len)) end end -- Get the character at `ind`, making sure it exists. local function c(ind) check_len(ind) return chars[ind] end -- Check that the letter at the given index is the given string, or is one of the members of the given array local function check(index, must) local letter = chars[index] local function make_possible_values() if type(must) == "string" then return must else return list_to_text(must, nil, " or ") end end if not letter then err(("expected a letter (specifically %s) at position %s, but participle is too short"):format( make_possible_values(), index)) end local matches if type(must) == "string" then matches = letter == must else matches = m_table.contains(must, letter) end if not matches then err(("letter %s at index %s must be %s"):format(letter, index, make_possible_values())) end end local function check_weakness(values, allow_missing, invert_condition) local function make_possible_weaknesses() for i, val in ipairs(values) do values[i] = "'" .. val .. 
"'" end return list_to_text(values, nil, " or ") end if allow_missing and invert_condition then error("Internal error: Can't specify both allow_missing and invert_condition") end if not weakness then if allow_missing or invert_condition then return else err(("weakness is unspecified but must be %s"):format(make_possible_weaknesses())) end else local matches = m_table.contains(values, weakness) if invert_condition and matches then err(("weakness '%s' must not be %s"):format(weakness, make_possible_weaknesses())) elseif not invert_condition and not matches then err(("weakness '%s' must be %s"):format(weakness, make_possible_weaknesses())) end end end local vocalized local function handle_possibly_final_weak(sound_prefix, expected_length) check_len(expected_length, expected_length) if c(expected_length) == AMAQ then -- passive final-weak if is_active then err("participle in -ِى only allowed for passive participles") end check_weakness({"final-weak", "assimilated+final-weak"}, "allow missing") vocalized = sound_prefix .. AN .. AMAQ else -- all others behave as if sound check_weakness({"final-weak", "assimilated+final-weak"}, nil, "invert condition") vocalized = sound_prefix .. (is_active and I or A) .. c(expected_length) end end if not (vform == "I" and is_active) then -- all participles except verb form I active begin in م-. check(1, M) end if vform == "I" then if is_active then check(2, ALIF) local sound_prefix = c(1) .. AA .. c(3) if len == 3 then if c(3) == HAMZA then -- Either hollow with hamzated third radical, e.g. [[شاء]] active participle 'شَاءٍ', or final-weak -- with hamzated second radical, e.g. [[رأى]] active participle 'رَاءٍ'. Theoretically (?), also -- geminate with hamzated second/third radical, but I don't know if any such verbs exist. if weakness == "geminate" then vocalized = sound_prefix .. SH else check_weakness({"hollow", "final-weak"}, "allow missing") vocalized = sound_prefix .. 
IN end else check_weakness({"final-weak", "geminate"}) if weakness == "geminate" then vocalized = sound_prefix .. SH else vocalized = sound_prefix .. IN end end else check_len(4, 4) -- we will convert back to alif maqṣūra below as needed vocalized = sound_prefix .. I .. c(4) end else -- assimilated verbs: regular, e.g. مَوْزُون "weighed" -- geminate verbs: regular, e.g. مَبْلُول "moistened" -- third-hamzated verbs: مَبْرُوء -- hollow verbs: مَقُود "led, driven"; مَزِيد "added, increased" -- hollow first-hamzated verbs: مَئِيض "returned, reverted"; مَأْيُوس "despaired" (NOTE: formation is sound); -- مَأُود or مَؤُود "bent; depleted" -- hollow third-hamzated verbs: مَشِيء "willed, intended", مَضُوء "glittered?" -- final-weak: مَلْقِيّ "found, encountered"; مَصْغُوّ "inclined" -- hollow + final-weak: مَشْوِيّ "fried, grilled", مَهْوِيّ "loved" -- first-hamzated + hollow + final-weak: مَأْوِيّ "received hospitably" local sound_prefix = MA .. c(2) .. SK .. c(3) if len == 5 then -- sound, assimilated or geminate check(4, W) vocalized = sound_prefix .. UU .. c(5) else check_len(4, 4) if c(4) == W then -- final-weak third-wāw vocalized = sound_prefix .. U .. W .. SH elseif c(4) == Y then -- final-weak third-yāʾ vocalized = sound_prefix .. I .. Y .. SH else -- hollow check(3, {W, Y}) if c(3) == W then vocalized = MA .. c(2) .. UU .. c(4) else vocalized = MA .. c(2) .. II .. c(4) end end end end elseif vform == "II" or vform == "V" or vform == "XII" or vform == "XIII" or vform == "Iq" or vform == "IIq" or vform == "IIIq" then local sound_prefix, expected_length if vform == "II" then sound_prefix = MU .. c(2) .. A .. c(3) .. SH expected_length = 4 elseif vform == "V" then check(2, T) sound_prefix = MU .. T .. A .. c(3) .. A .. c(4) .. SH expected_length = 5 elseif vform == "XII" then -- e.g. 
			-- [[احدودب]] "to be or become convex or humpbacked", مُحْدَوْدِب (active);
			-- [[اثنونى]] "to be bent; to be doubled up", مُثْنَوْنٍ (active)
			check(4, W)
			if c(3) ~= c(5) then
				err(("third letter %s should be the same as the fifth letter %s"):format(c(3), c(5)))
			end
			sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. W .. SK .. c(5)
			expected_length = 6
		elseif vform == "XIII" then
			-- e.g. [[اخروط]] "to get entangled; to extend", مُخْرَوِّط (active), مُخْرَوَّط (passive)
			check(4, W)
			sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. W .. SH
			expected_length = 5
		elseif vform == "Iq" then
			sound_prefix = MU .. c(2) .. A .. c(3) .. SK .. c(4)
			expected_length = 5
		elseif vform == "IIq" then
			check(2, T)
			sound_prefix = MU .. T .. A .. c(3) .. A .. c(4) .. SK .. c(5)
			expected_length = 6
		elseif vform == "IIIq" then
			-- e.g. [[اخرنطم]] "to be proud and angry"
			check(4, N)
			sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. N .. SK .. c(5)
			expected_length = 6
		else
			error("Internal error: Unhandled verb form " .. vform)
		end
		if len == expected_length - 1 then -- active final-weak
			if not is_active then
				err(("length-%s participle only allowed for active participles"):format(len))
			end
			check_weakness({"final-weak", "assimilated+final-weak"}, "allow missing")
			vocalized = sound_prefix .. IN
		else
			handle_possibly_final_weak(sound_prefix, expected_length)
		end
	elseif vform == "III" or vform == "VI" then
		local sound_prefix, expected_length
		if vform == "VI" then
			check(2, T)
			check(4, ALIF)
			sound_prefix = MU .. T .. A .. c(3) .. AA .. c(5)
			expected_length = 6
		else
			sound_prefix = MU .. c(2) .. AA .. c(4)
			expected_length = 5
		end
		if len == expected_length - 1 then -- active final-weak or active or passive geminate
			if is_active then
				check_weakness({"geminate", "final-weak", "assimilated+final-weak"})
				if weakness == "geminate" then
					vocalized = sound_prefix .. SH
				else
					vocalized = sound_prefix .. IN
				end
			else
				check_weakness({"geminate"}, "allow missing")
				vocalized = sound_prefix ..
SH end else handle_possibly_final_weak(sound_prefix, expected_length) end elseif vform == "IV" or vform == "X" then -- form IV: -- sound: مُرْسِخ (active, "entrenching"), مُرْسَخ (passive, "entrenched") -- first-hamzated (like sound): مُؤْيِس (active, "causing to despair"), مُؤْيَس (passive, "caused to despair") -- final-weak: مُكْرٍ (active, "renting out"), مُكْرًى (passive, "rented out") -- assimilated: مُورِد (active, "transferring"), مُورَد (passive, "transferred"); same when first-Y, e.g. -- أَيْقَنَ "to be certain of": مُوقِن (active), مُوقَن (passive) -- assimilated + final-weak: مُورٍ (active, "setting fire, kindling"), مُورًى (passive, "set fire, kindled") -- geminate: مُمِدّ (active, "granting, helping"), مُمَدّ (passive, "granted, helped") -- hollow: مُزِيل (active, "eliminating"), مُزَال (passive, "eliminated") -- hollow + final-weak: مُعْيٍ (active, "tiring"), مُعْيًى (passive, "tired") local sound_prefix, expected_length if vform == "X" then check(2, S) check(3, T) sound_prefix = MU .. S .. SK .. T .. A .. c(4) expected_length = 6 else sound_prefix = MU .. c(2) expected_length = 4 end if len == expected_length and c(len - 1) == Y and c(len) ~= AMAQ then -- active hollow if not is_active then err("this shape only allowed for active participles") end check_weakness({"hollow"}, "allow missing") vocalized = sound_prefix .. II .. c(len) elseif len == expected_length and c(len - 1) == ALIF then -- passive hollow if is_active then err("this shape only allowed for passive participles") end check_weakness({"hollow"}, "allow missing") vocalized = sound_prefix .. AA .. c(len) elseif len == expected_length - 1 then -- active final-weak or active or passive geminate if is_active then check_weakness({"geminate", "final-weak", "assimilated+final-weak"}) if weakness == "geminate" then vocalized = sound_prefix .. I .. c(len) .. SH elseif vform == "IV" and c(2) == W then -- assimilated final-weak vocalized = sound_prefix .. c(len) .. IN else vocalized = sound_prefix .. 
SK .. c(len) .. IN end else check_weakness({"geminate"}, "allow missing") vocalized = sound_prefix .. A .. c(len) .. SH end else if vform == "IV" and c(2) == W then -- assimilated, possibly final-weak sound_prefix = sound_prefix .. c(expected_length - 1) else sound_prefix = sound_prefix .. SK .. c(expected_length - 1) end handle_possibly_final_weak(sound_prefix, expected_length) end elseif vform == "VII" or vform == "VIII" then -- form VII (passive participles are fairly rare but do exist): -- sound: مُنْكَتِب (active "subscribing"), مُنْكَتَب (passive "subscribed") -- geminate: مُنْضَمّ (both active "joining, containing" and passive "joined, contained") -- final-weak: مُنْطَلٍ (active "fooling (someone)"), مُنْطَلًى (passive "fooled") -- final-weak with medial wāw: مُنْطَوٍ (active "involving"), مُنْطَوًى (passive "involved") -- hollow: مُنْقَاد (both active "complying with" and passive "complied with") -- -- for form VIII, the same variants exist but things are complicated by assimilations involving the template T. 
-- sound third-hamzated no assimilation: مُبْتَدِئ (active "beginning"), مُبْتَدَأ (passive "begun") -- geminate no assimilation: مُبْتَزّ (both active "robbing" and passive "robbed") -- final-weak no assimilation: مُبْتَنٍ (active "building"), مُبْتَنًى (passive "built") -- final-weak with medial wāw no assimilation: مُحْتَوٍ (active "containing"), مُحْتَوًى (passive "contained") -- hollow no assimilation: مُخْتَار (both active "choosing" and passive "chosen") -- -- sound with total assimilation: مُتَّبِع (active "following"), مُتَّبَع (passive "followed") -- sound with total assimilation, assimilating wāw: مُتَّعِد (active "threatening"), مُتَّعَد (passive "threatened") -- sound with total assimilation, irregularly assimilating hamza: مُتَّخِذ (active "taking"), مُتَّخَذ (passive "taken") -- sound with total assimilation (to ḏāl, producing dāl): مُدَّخِر (active "reserving"), مُدَّخَر (passive "reserved") -- sound with total assimilation (to ḏāl): مُذَّكِر (active "remembering"), مُذَّكَر (passive "remembered") -- sound with total assimilation (to ṭāʔ): مُطَّرِح (active "discarding"), مُطَّرَح (passive "discarded") -- sound with total assimilation (to ẓāʔ): مُظَّلِم (active "tolerating"), مُظَّلَم (passive "tolerated") -- final-weak with total assimilation, assimilating wāw: مُتَّقٍ (active "guarding against"), مُتَّقًى (passive "guarded against") -- final-weak with total assimilation (to ṯāʔ): مُثَّنٍ (active "undulating"), مُثَّنًى (passive "undulated") -- final-weak with total assimilation (to dāl): مُدَّعٍ (active "claiming"), مُدَّعًى (passive "claimed") -- sound with partial assimilation (to zayn): مُزْدَهِر (active "thriving"), مُزْدَهَر (passive "thrived") -- sound with medial wāw with partial assimilation (to zayn): مُزْدَوِج (active "appearing twice") -- sound with partial assimilation (to ṣād): مُصْطَبِح (active "illuminating"), مُصْطَبَح (passive, "illuminated") -- sound with partial assimilation (to ḍād): مُضْطَرِب (active "to be disturbed"; no 
		-- passive)
		-- geminate with partial assimilation (to ṣād): مُصْطَبّ (both active "effusing" and passive "effused")
		-- geminate with partial assimilation (to ḍād): مُضْطَرّ (both active "forcing" and passive "forced")
		-- final-weak with partial assimilation (to ṣād): مُصْطَلٍ (active "warming"), مُصْطَلًى (passive "warmed")
		-- hollow with partial assimilation (to zayn): مُزْدَاد (both active "increasing" and passive "increased")
		-- hollow with partial assimilation (to ṣād): مُصْطَاد (both active "hunting" and passive "hunted")
		local sound_prefix, sufind
		if vform == "VII" then
			check(2, N)
			sound_prefix = MU .. N .. SK .. c(3)
			sufind = 4
		else
			local c2 = c(2)
			if c2 == T or c2 == "د" or c2 == "ث" or c2 == "ذ" or c2 == "ط" or c2 == "ظ" then -- full assimilation
				sound_prefix = MU .. c2 .. SH
				sufind = 3
			else -- partial or no assimilation
				if c2 == "ز" then
					check(3, "د")
				elseif c2 == "ص" or c2 == "ض" then
					check(3, "ط")
				else
					check(3, T)
				end
				sound_prefix = MU .. c2 .. SK .. c(3)
				sufind = 4
			end
		end
		if c(sufind) == ALIF then -- hollow, active or passive
			check_len(sufind + 1, sufind + 1)
			check_weakness({"hollow"}, "allow missing")
			vocalized = sound_prefix .. AA .. c(sufind + 1)
		elseif len == sufind then -- active final-weak or active or passive geminate
			if is_active then
				check_weakness({"geminate", "final-weak", "assimilated+final-weak"})
				if weakness == "geminate" then
					vocalized = sound_prefix .. A .. c(len) .. SH
				else
					vocalized = sound_prefix .. A .. c(len) .. IN
				end
			else
				check_weakness({"geminate"}, "allow missing")
				vocalized = sound_prefix .. A .. c(len) .. SH
			end
		else
			sound_prefix = sound_prefix .. A .. c(sufind)
			handle_possibly_final_weak(sound_prefix, sufind + 1)
		end
	elseif vform == "IX" then
		check_len(4, 4)
		vocalized = MU .. c(2) .. SK .. c(3) .. A .. c(4) .. SH
	elseif vform == "IVq" then
		-- e.g.
[[اذلعب]] "to scamper away", مُذْلَعِبّ (active), مُذْلَعَبّ (passive); -- [[اطمأن]] "to remain quietly; to be certain", مُطْمَئِنّ (active), مُطْمَأَنّ (passive) check_len(5, 5) local sound_prefix = MU .. c(2) .. SK .. c(3) .. A .. c(4) if is_active then vocalized = sound_prefix .. I .. c(5) .. SH else vocalized = sound_prefix .. A .. c(5) .. SH end elseif vform == "XI" then check_len(5, 5) check(4, ALIF) vocalized = MU .. c(2) .. SK .. c(3) .. AA .. c(5) .. SH -- e.g. [[احمار]] "to turn red, to blush", مُحْمَارّ (active) elseif vform == "XIV" or vform == "XV" then -- FIXME: Implement. No examples in Wiktionary currently; need to look up in a grammar. error("Support for verb form " .. vform .. " not implemented yet") else error("Don't recognize verb form " .. vform) end vocalized = rsub(vocalized, HAMZA .. AA, AMAD) local reconstructed_headword = lang:stripDiacritics(vocalized) if reconstructed_headword ~= orig_headword then error(("Internal error: Vocalized participle %s doesn't match original participle %s"):format( vocalized, orig_headword)) end return vocalized end function export.infer_participle_vocalization_json(frame) local iparams = { [1] = {required = true}, [2] = {required = true}, ["weakness"] = {}, ["passive"] = {type = "boolean"} } local iargs = require("Module:parameters").process(frame.args, iparams) return export.infer_participle_vocalization(iargs[1], iargs[2], iargs.weakness, not iargs.passive) end return export tdgd3u6jfiyf70dpiezhrk7sexvygv6 مؤذن 0 22396 281310 126849 2026-04-21T15:51:00Z Hakimi97 2668 /* Etimologi */ 281310 wikitext text/x-wiki == Bahasa Arab == {{Wikipedia|lang=ar}} === Takrifan === ==== Kata ==== {{ar-noun|مُؤَذِّن|m|pl=مُؤَذِّنُون}} # [[bilal]], [[muazin]] ===== Deklensi ===== {{ar-decl-noun|مُؤَذِّن|pl=مُؤَذِّنُون}} === Etimologi === Daripada {{m|ar|أَذَّنَ||[[panggil]]}}, daripada akar {{ar-root|ء ذ ن}}. 
{{C|ar|Solat|Agamawan Islam}} 7nbt9iek917cs7r356bzx9aoddlyiy8 اعتقاد 0 24132 281311 144426 2026-04-21T15:52:12Z Hakimi97 2668 /* Kata nama */ 281311 wikitext text/x-wiki ==Bahasa Melayu== ===Takrifan=== ====Kata nama==== {{ms-noun|pl=-}} # {{ms-jawi|iktikad}} === Pautan luar === * {{R:PRPM}} == Bahasa Arab == === Takrifan === ==== Kata nama ==== {{ar-noun|اِعْتِقَاد|m|pl=اِعْتِقَادَات}} # [[kepercayaan]], [[pegangan]], [[akidah]] ===Etimologi=== Daripada dasar {{ar-root|ع ق د}}. ===Sebutan=== * {{ar-IPA|اِعْتِقَاد}} == Bahasa Parsi == === Takrifan === ==== Kata nama ==== {{fa-kn|tr=e'teqâd}} # [[kepercayaan]], [[pegangan]] # [[pendapat]] ===Etimologi=== Pinjaman {{bor|fa|ar|اِعْتِقَاد}}. ===Sebutan=== {{fa-AFA|i'ti`qād}} rfjvg6r9qfrz78kt0aqzkqlr4z24ky3 buli 0 24834 281244 130760 2026-04-21T13:23:13Z Countryball mys123 9925 /* Bahasa Melayu */Tambah gambar 281244 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} <!-- Kalau ada --> [[File:Bullying Prevention in the United States.jpg|thumb|Gambaran buli]] === Takrifan === ==== Kata nama ==== {{ms-kn|j=بولي}} # Perbuatan mengganggu, memaksa dan merendah-rendahkan seseorang secara melampau-lampau, terutamanya yang berdarjat lebih rendah. ==== Kata kerja ==== {{ms-kk|j=بولي}} # Melakukan perbuatan buli. === Sebutan === * {{dewan|bu|li}} === Pautan luar === * {{R:PRPM}} 91ladtnkpvdj4zahc8o095098qvmdli gonob 0 25875 281413 178454 2026-04-22T08:06:54Z ~2026-24499-96 10668 281413 wikitext text/x-wiki ==Bahasa Kadazandusun== ===Takrifan=== ====Kata nama==== {{inti|dtp|kata nama}} # [[sarung]] # [[kain basahan]] #: {{cp|dtp| Nopupuan ku no '''gonob''' di odu.| Saya sudah mencuci '''kain sarung''' nenek.} ===Sebutan=== * {{IPA|dtp|/ɡɔ.nɔɓ/}} * {{rima|dtp|nɔɓ|ɔɓ}} * {{penyempangan|dtp|go|nob}} ===Terbitan=== * {{l|dtp|mononggonob}} * {{l|dtp|kigonob}} ===Tesaurus=== ; Sinonim: [[tapi]], [[sarung]]. ===Rujukan=== Mongulud Boros Dusun Kadazan (1994). Komoiboros Dusunkadazan. Kota Kinabalu: MBDK. 
3bbn1ddwywy3r3kxvynju8qonbj1ssb 281420 281413 2026-04-22T09:10:07Z Hakimi97 2668 Membatalkan semakan [[Special:Diff/281413|281413]] oleh [[Special:Contributions/~2026-24499-96|~2026-24499-96]] ([[User talk:~2026-24499-96|bincang]]) 281420 wikitext text/x-wiki ==Bahasa Kadazandusun== ===Takrifan=== ====Kata nama==== {{inti|dtp|kata nama}} # [[sarung]] # [[kain basahan]] #: {{cp|dtp| Nopupuan ku no '''gonob''' di odu.| Saya sudah mencuci '''kain sarung''' nenek.}} ===Sebutan=== * {{IPA|dtp|/ɡɔ.nɔɓ/}} * {{rima|dtp|nɔɓ|ɔɓ}} * {{penyempangan|dtp|go|nob}} ===Terbitan=== * {{l|dtp|mononggonob}} * {{l|dtp|kigonob}} ===Tesaurus=== ; Sinonim: [[tapi]], [[sarung]]. ===Rujukan=== Mongulud Boros Dusun Kadazan (1994). Komoiboros Dusunkadazan. Kota Kinabalu: MBDK. nt90ms78lqr09yrx5kcxoae60rfot3w ليل 0 27195 281297 134057 2026-04-21T15:33:25Z Hakimi97 2668 281297 wikitext text/x-wiki == Bahasa Arab == === Takrifan === ==== Kata nama ==== {{ar-noun|لَيْل|m|pl=-}} # [[malam]] #: {{ant|ar|نَهَار}} === Etimologi === Daripada {{ar-root|ل ي ل}}, daripada {{inh|ar|sem-pro|*layl-}}. === Sebutan === * {{ar-IPA|لَيْل}} * {{audio|ar|Ar-ليل.ogg|Audio}} === Lihat juga === * {{l|ar|لَيْلَة}} 6jsf56ix68w0efrtv1xolvytcvhbe5x Maghribi 0 27340 281312 149953 2026-04-21T15:52:43Z Hakimi97 2668 /* Etimologi */ 281312 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} <!-- Kalau ada --> [[Image:MAR orthographic.svg|upright=1.13|thumb|right|Peta Maghribi.]] === Takrifan === ==== Kata nama khas ==== {{ms-knk|j=مغربي}} # Sebuah negara di barat [[Afrika]]. === Etimologi === Daripada {{bor|ms|ar|الْمَغْرِب}}; daripada {{ar-root|غ ر ب}}; {{m|ar|غَرْب||[[barat]]}}. Lihat juga ''[[maghrib]]''. 
=== Sebutan === * {{dewan|Magh|ri|bi}} === Lihat juga === * {{senarai:negara di Afrika/ms}} === Pautan luar === * {{R:PRPM}} 4v6mdxzdxzqs4q56i80f1zy007kcp03 Modul:languages/data 828 33717 281318 223008 2026-04-21T19:40:16Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/89499015|89499015]]) 281318 Scribunto text/plain local export = {} -- We can't use mw.loadData() on [[Module:languages/chars]] because [[Module:languages/data]] itself is sometimes loaded -- using mw.loadData(), and calling mw.loadData() on [[Module:languages/chars]] will insert metatables into the -- character tables, which the second mw.loadData() will choke on. local m_chars = require("Module:languages/chars") local u = require("Module:string/char") local c = m_chars.chars export.chars = c local p = m_chars.puaChars export.puaChars = p local cs = m_chars.chars_substitutions export.chars_substitutions = cs -- FIXME! Many of the script-specific values below can be moved to [[Module:scripts/data]] to serve as script-wide -- fallback values instead of specifying them for each language using the script. local s = {} -- These values are placed here to make it possible to synchronise a group of languages without the need for a dedicated function module. -- cau do local cau_remove_diacritics = c.grave .. c.acute .. c.macron local cau_from = {"[IlΙІӀᴴ]"} local cau_to = {{ ["l"] = "ӏ", ["Ι"] = "ӏ", ["І"] = "ӏ", ["Ӏ"] = "ӏ", ["ᴴ"] = "ᵸ", }} s["cau-Cyrl-displaytext"] = { from = cau_from, to = cau_to, } s["cau-Cyrl-stripdiacritics"] = { remove_diacritics = cau_remove_diacritics, from = cau_from, to = cau_to, } s["cau-Latn-stripdiacritics"] = {remove_diacritics = cau_remove_diacritics} end s["itc-Latn-displaytext"] = { from = {c.caron}, to = {c.breve}, } s["itc-Latn-stripdiacritics"] = {remove_diacritics = c.macron .. c.breve .. c.diaer .. c.caron .. c.dinvbreve} s["itc-Latn-sortkey"] = { remove_diacritics = c.circ .. c.tilde .. c.macron .. c.breve .. 
c.diaer .. c.caron .. c.zigzag .. c.dmacron .. c.dtilde .. c.dinvbreve .. c.small_a .. c.small_e .. c.small_i .. c.small_o .. c.small_u, -- Chiefly medieval abbreviations. from = {"ᵃ", "æ", "[đꝱꟈ]", "ᵉ", "ⁱ", "ꝁ", "[ƚꝉꝲ]", "ꝳ", "ꝴ", "[ꝋᵒ]", "œ", "[ꝑꝓꝕ]", "[ꝗꝙ]", "[ꝛꝵꝶꝝ]", "[ꟊˢ]", "[ꝷᵗ]", "ᵘ", "ꝟ", "⁊"}, to = {"a", "ae", "d", "e", "i", "k", "l", "m", "n", "o", "oe", "p", "q", "r", "s", "t", "u", "v", "&"} } s["Jpan-standardchars"] = -- exclude ぢづヂヅ "ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちっつてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろん" .. "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチッツテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロン" local jpx_displaytext = { from = {"~", "="}, to = {"〜", "゠"} } s["jpx-displaytext"] = { Jpan = jpx_displaytext, Hani = jpx_displaytext, Hrkt = jpx_displaytext, Hira = jpx_displaytext, Kana = jpx_displaytext -- not Latn or Brai } s["jpx-stripdiacritics"] = s["jpx-displaytext"] s["jpx-sortkey"] = { Jpan = "Jpan-sortkey", Hani = "Hani-sortkey", Hrkt = "Hira-sortkey", -- sort general kana by normalizing to Hira Hira = "Hira-sortkey", Kana = "Kana-sortkey", Latn = {remove_diacritics = c.tilde .. c.macron .. c.diaer} } s["jpx-translit"] = { Hrkt = "Hrkt-translit", Hira = "Hrkt-translit", Kana = "Hrkt-translit" } s["roa-oil-sortkey"] = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. "'", from = {"æ", "œ", "·"}, to = {"ae", "oe", " "} } s["wen-sortkey"] = { from = {"ch", "[lłßꞩẜ]", "dz[" .. c.caron .. c.acute .. "]", "[bcefmnoprswz][" .. c.caron .. c.acute .. c.dotabove .. "]"}, to = { "h" .. p[1], { ["l"] = "l" .. p[1], ["ł"] = "l", ["ß"] = "s", ["ꞩ"] = "š", ["ẜ"] = "š", }, { ["dz" .. c.caron] = "d" .. p[1], ["dz" .. c.acute] = "d" .. p[2] }, { ["b" .. c.acute] = "b" .. p[1], ["c" .. c.caron] = "c" .. p[1], ["c" .. c.acute] = "c" .. p[2], ["e" .. c.caron] = "e" .. p[1], ["e" .. c.dotabove] = "e" .. p[1], ["f" .. c.acute] = "f" .. p[1], ["m" .. c.acute] = "m" .. p[1], ["n" .. c.acute] = "n" .. p[1], ["o" .. 
c.acute] = "o" .. p[1], ["p" .. c.acute] = "p" .. p[1], ["r" .. c.caron] = "r" .. p[1], ["r" .. c.acute] = "r" .. p[2], ["s" .. c.caron] = "s" .. p[1], ["s" .. c.acute] = "s" .. p[2], ["w" .. c.acute] = "w" .. p[1], ["z" .. c.caron] = "z" .. p[1], ["z" .. c.acute] = "z" .. p[2], } } } -- Myanmar dotted form : https://www.unicode.org/Public/UNIDATA/StandardizedVariants.txt s["aio-displaytext"] = { from = {"([ကဂငတထပမယလဝဢေၵၸၺႀꩠꩡꩢꩣꩤꩥꩦꩫꩬꩯꩺ])"}, to = {"%1" .. c.VS01} } s["aio-stripdiacritics"] = { remove_diacritics = c.VS01, } s["phk-displaytext"] = s["aio-displaytext"] s["phk-stripdiacritics"] = s["aio-stripdiacritics"] s["kht-displaytext"] = s["aio-displaytext"] s["kht-stripdiacritics"] = s["aio-stripdiacritics"] export.shared = s --[==[ var: Short-term solution to override the standard substitution process, by forcing the module to substitute the entire text in one pass, if "cont" is given. This results in any PUA characters that are used as stand-ins for formatting being handled by the language-specific substitution process, which is usually undesirable. If the value is "none" then the formatting tags do not get turned into PUA characters in the first place. This override is provided for languages which use formatting between strings of text which might need to interact with each other (e.g. Korean 값이 transliterates as "gaps-i", but [[값]] has the formatting '''값'''[[-이]]. The normal process would split the text at the second '''.) ]==] export.substitution = { ["gmy"] = "none", ["ja"] = "cont", ["jje"] = "cont", ["ko"] = "cont", ["ko-ear"] = "cont", ["ru"] = "cont", ["th-new"] = "cont", ["sa"] = "cont", ["zkt"] = "cont", } --[==[ var: Code aliases. The left side is the alias and the right side is the canonical code. NOTE: These are gradually being deprecated, so should not be added to on a permanent basis. Temporary additions are permitted under reasonable circumstances (e.g. to facilitate changing a language's code). 
When an alias is no longer used, it should be removed. Aliases in this table are tracked at [[Wiktionary:Tracking/languages/LANG]]; see e.g. [[Special:WhatLinksHere/Wiktionary:Tracking/languages/VL.]] for the `VL.` alias. ]==] export.aliases = { ["EL."] = "la-ecc", ["LL."] = "la-lat", ["ML."] = "la-med", ["NL."] = "la-new", ["VL."] = "la-vul", ["nds-DE"] = "nds-de", ["nds-NL"] = "nds-nl", ["roa-oan"] = "roa-ona", ["sa-cls"] = "cls", ["sa-ved"] = "vsn", } --[==[ var: Codes which are tracked. Note that all aliases listed above are also tracked, so should not be duplicated here. Tracking uses the same mechanism described above in the comment above `export.aliases`. ]==] export.track = { -- Codes duplicated between full and etymology-only languages. ["lzh-lit"] = true, ["lzh"] = true, -- Languages actively being converted to families. ["bh"] = true, -- inc-bih ["nan"] = true, -- zhx-nan } return export b89ykupfotcto2w975o7485wv63kf7f 281320 281318 2026-04-21T19:43:53Z Hakimi97 2668 Membatalkan semakan [[Special:Diff/281318|281318]] oleh [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|bincang]]) 281320 Scribunto text/plain local m_scripts = require("Module:scripts") local table = table local insert = table.insert local u = require("Module:string/char") local export = {} -- UTF-8 encoded strings for some commonly-used diacritics. 
local c = { prime = u(0x02B9), grave = u(0x0300), acute = u(0x0301), circ = u(0x0302), tilde = u(0x0303), macron = u(0x0304), overline = u(0x0305), breve = u(0x0306), dotabove = u(0x0307), diaer = u(0x0308), ringabove = u(0x030A), dacute = u(0x030B), caron = u(0x030C), lineabove = u(0x030D), dgrave = u(0x030F), invbreve = u(0x0311), commaabove = u(0x0313), revcommaabove = u(0x0314), dotbelow = u(0x0323), diaerbelow = u(0x0324), ringbelow = u(0x0325), cedilla = u(0x0327), ogonek = u(0x0328), brevebelow = u(0x032E), macronbelow = u(0x0331), perispomeni = u(0x0342), ypogegrammeni = u(0x0345), CGJ = u(0x034F), -- combining grapheme joiner zigzag = u(0x035B), dbrevebelow = u(0x035C), dmacron = u(0x035E), dtilde = u(0x0360), dinvbreve = u(0x0361), small_a = u(0x0363), small_e = u(0x0364), small_i = u(0x0365), small_o = u(0x0366), small_u = u(0x0367), keraia = u(0x0374), lowerkeraia = u(0x0375), tonos = u(0x0384), palatalization = u(0x0484), dasiapneumata = u(0x0485), psilipneumata = u(0x0486), kashida = u(0x0640), fathatan = u(0x064B), dammatan = u(0x064C), kasratan = u(0x064D), fatha = u(0x064E), damma = u(0x064F), kasra = u(0x0650), shadda = u(0x0651), sukun = u(0x0652), hamzaabove = u(0x0654), nunghunna = u(0x0658), zwarakay = u(0x0659), smallv = u(0x065A), superalef = u(0x0670), udatta = u(0x0951), anudatta = u(0x0952), dottedgrave = u(0x1DC0), dottedacute = u(0x1DC1), coronis = u(0x1FBD), psili = u(0x1FBF), dasia = u(0x1FEF), ZWNJ = u(0x200C), -- zero width non-joiner ZWJ = u(0x200D), -- zero width joiner RSQuo = u(0x2019), -- right single quote kavyka = u(0xA67C), VS01 = u(0xFE00), -- variation selector 1 -- Punctuation for the standardChars field. -- Note: characters are literal (i.e. no magic characters). punc = " ',-‐‑‒–—…∅", -- Range covering all diacritics. diacritics = u(0x300) .. "-" .. u(0x34E) .. u(0x350) .. "-" .. u(0x36F) .. u(0x1AB0) .. "-" .. u(0x1ACE) .. u(0x1DC0) .. "-" .. u(0x1DFF) .. u(0x20D0) .. "-" .. u(0x20F0) .. u(0xFE20) .. "-" .. 
u(0xFE2F), } -- Braille characters for the standardChars field. local braille = {} for i = 0x2800, 0x28FF do insert(braille, u(i)) end c.braille = table.concat(braille) export.chars = c -- PUA characters, generally used in sortkeys. -- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables). local p = {} for i = 1, 32 do p[i] = u(0xF000+i-1) end export.puaChars = p local s = {} -- These values are placed here to make it possible to synchronise a group of languages without the need for a dedicated function module. -- cau do local cau_remove_diacritics = c.grave .. c.acute .. c.macron local cau_from = {"[IlΙІӀᴴ]"} local cau_to = {{ ["l"] = "ӏ", ["Ι"] = "ӏ", ["І"] = "ӏ", ["Ӏ"] = "ӏ", ["ᴴ"] = "ᵸ", }} s["cau-Cyrl-displaytext"] = { from = cau_from, to = cau_to, } s["cau-Cyrl-entryname"] = { remove_diacritics = cau_remove_diacritics, from = cau_from, to = cau_to, } s["cau-Latn-entryname"] = {remove_diacritics = cau_remove_diacritics} end -- Cyrs do local Cyrs_remove_diacritics = c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. c.kavyka s["Cyrs-entryname"] = {remove_diacritics = Cyrs_remove_diacritics} s["Cyrs-sortkey"] = { remove_diacritics = Cyrs_remove_diacritics, from = { "ї", "оу", -- 2 chars "[ґꙣєѕꙃꙅꙁіꙇђꙉѻꙩꙫꙭꙮꚙꚛꙋѡѿꙍѽꙑѣꙗѥꙕѧꙙѩꙝꙛѫѭѯѱѳѵҁ]" }, to = { "и" .. p[1], "у", { ["ґ"] = "г" .. p[1], ["ꙣ"] = "д" .. p[1], ["є"] = "е", ["ѕ"] = "ж" .. p[1], ["ꙃ"] = "ж" .. p[1], ["ꙅ"] = "ж" .. p[1], ["ꙁ"] = "з", ["і"] = "и" .. p[1], ["ꙇ"] = "и" .. p[1], ["ђ"] = "и" .. p[2], ["ꙉ"] = "и" .. p[2], ["ѻ"] = "о", ["ꙩ"] = "о", ["ꙫ"] = "о", ["ꙭ"] = "о", ["ꙮ"] = "о", ["ꚙ"] = "о", ["ꚛ"] = "о", ["ꙋ"] = "у", ["ѡ"] = "х" .. p[1], ["ѿ"] = "х" .. p[1], ["ꙍ"] = "х" .. p[1], ["ѽ"] = "х" .. p[1], ["ꙑ"] = "ы", ["ѣ"] = "ь" .. p[1], ["ꙗ"] = "ь" .. p[2], ["ѥ"] = "ь" .. p[3], ["ꙕ"] = "ю", ["ѧ"] = "я", ["ꙙ"] = "я", ["ѩ"] = "я" .. p[1], ["ꙝ"] = "я" .. 
p[1], ["ꙛ"] = "я" .. p[2], ["ѫ"] = "я" .. p[3], ["ѭ"] = "я" .. p[4], ["ѯ"] = "я" .. p[5], ["ѱ"] = "я" .. p[6], ["ѳ"] = "я" .. p[7], ["ѵ"] = "я" .. p[8], ["ҁ"] = "я" .. p[9], } }, } end s["Grek-displaytext"] = { from = {"Þ", "þ", "['" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos, used as the numeral sign in entries. to = {"Ϸ", "ϸ", c.RSQuo} } s["Grek-entryname"] = { remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow, from = s["Grek-displaytext"].from, to = {"Ϸ", "ϸ", "'"} } s["Grek-sortkey"] = { remove_diacritics = "';·`¨´῀" .. c.grave .. c.acute .. c.diaer .. c.caron .. c.commaabove .. c.revcommaabove .. c.macron .. c.breve .. c.diaerbelow .. c.brevebelow .. c.perispomeni .. c.ypogegrammeni .. c.RSQuo .. c.prime .. c.keraia .. c.lowerkeraia .. c.tonos .. c.coronis .. c.psili .. c.dasia, from = {"ϝ", "ͷ", "ϛ", "ͱ", "ͺ", "ϳ", "ϻ", "[ϟϙ]", "[ςϲ]", "ͳ"}, to = {"ε" .. p[1], "ε" .. p[2], "ε" .. p[3], "ζ" .. p[1], "ι", "ι" .. p[1], "π" .. p[1], "π" .. p[2], "σ", "ϡ"} } s["itc-Latn-displaytext"] = { from = {c.caron}, to = {c.breve}, } s["itc-Latn-entryname"] = {remove_diacritics = c.macron .. c.breve .. c.diaer .. c.caron .. c.dinvbreve} s["itc-Latn-sortkey"] = { remove_diacritics = c.circ .. c.tilde .. c.macron .. c.breve .. c.diaer .. c.caron .. c.zigzag .. c.dmacron .. c.dtilde .. c.dinvbreve .. c.small_a .. c.small_e .. c.small_i .. c.small_o .. c.small_u, -- Chiefly medieval abbreviations. from = {"ᵃ", "æ", "[đꝱꟈ]", "ᵉ", "ⁱ", "ꝁ", "[ƚꝉꝲ]", "ꝳ", "ꝴ", "[ꝋᵒ]", "œ", "[ꝑꝓꝕ]", "[ꝗꝙ]", "[ꝛꝵꝶꝝ]", "[ꟊˢ]", "[ꝷᵗ]", "ᵘ", "ꝟ", "⁊"}, to = {"a", "ae", "d", "e", "i", "k", "l", "m", "n", "o", "oe", "p", "q", "r", "s", "t", "u", "v", "&"} } s["Jpan-standardchars"] = -- exclude ぢづヂヅ "ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちっつてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろん" .. 
"ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチッツテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロン" local jpx_displaytext = { from = {"~", "="}, to = {"〜", "゠"} } s["jpx-displaytext"] = { Jpan = jpx_displaytext, Hani = jpx_displaytext, Hrkt = jpx_displaytext, Hira = jpx_displaytext, Kana = jpx_displaytext -- not Latn or Brai } s["jpx-entryname"] = s["jpx-displaytext"] s["jpx-sortkey"] = { Jpan = "Jpan-sortkey", Hani = "Hani-sortkey", Hrkt = "Hira-sortkey", -- sort general kana by normalizing to Hira Hira = "Hira-sortkey", Kana = "Kana-sortkey", Latn = {remove_diacritics = c.tilde .. c.macron .. c.diaer} } s["jpx-translit"] = { Hrkt = "Hrkt-translit", Hira = "Hrkt-translit", Kana = "Hrkt-translit" } local HaniChars = m_scripts.getByCode("Hani"):getCharacters() -- `漢字(한자)`→`漢字` -- `가-나-다`→`가나다`, `가--나--다`→`가-나-다` -- `온돌(溫突/溫堗)`→`온돌` ([[ondol]]) s["Kore-entryname"] = { remove_diacritics = u(0x302E) .. u(0x302F), from = {"([" .. HaniChars .. "])%(.-%)", "^%-", "%-$", "%-(%-?)", "\1", "%([" .. HaniChars .. "/]+%)"}, to = {"%1", "\1", "\1", "%1", "-"} } s["Lisu-sortkey"] = { from = {"𑾰"}, to = {"ꓬ" .. p[1]} } s["Mong-displaytext"] = { from = {"([ᠨ-ᡂᡸ])ᠶ([ᠨ-ᡂᡸ])", "([ᠠ-ᡂᡸ])ᠸ([^᠋ᠠ-ᠧ])", "([ᠠ-ᡂᡸ])ᠸ$"}, to = {"%1ᠢ%2", "%1ᠧ%2", "%1ᠧ"} } s["Mong-entryname"] = s["Mong-displaytext"] s["Polyt-displaytext"] = s["Grek-displaytext"] s["Polyt-entryname"] = { remove_diacritics = c.macron .. c.breve .. c.dbrevebelow, from = s["Grek-entryname"].from, to = s["Grek-entryname"].to } s["Polyt-sortkey"] = s["Grek-sortkey"] -- Samr do s["Samr-entryname"] = { remove_diacritics = c.CGJ .. u(0x0816) .. "-" .. u(0x082D), } s["Samr-sortkey"] = s["Samr-entryname"] end s["roa-oil-sortkey"] = { remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.diaer .. c.ringabove .. c.cedilla .. 
"'", from = {"æ", "œ", "·"}, to = {"ae", "oe", " "} } s["Tibt-displaytext"] = { from = {"ༀ", "༌", "།།", "༚༚", "༚༝", "༝༚", "༝༝", "ཷ", "ཹ", "ེེ", "ོོ"}, to = {"ཨོཾ", "་", "༎", "༛", "༟", "࿎", "༞", "ྲཱྀ", "ླཱྀ", "ཻ", "ཽ"} } s["Tibt-entryname"] = s["Tibt-displaytext"] s["wen-sortkey"] = { from = {"ch", "[lłßꞩẜ]", "dz[" .. c.caron .. c.acute .. "]", "[bcefmnoprswz][" .. c.caron .. c.acute .. c.dotabove .. "]"}, to = { "h" .. p[1], { ["l"] = "l" .. p[1], ["ł"] = "l", ["ß"] = "s", ["ꞩ"] = "š", ["ẜ"] = "š", }, { ["dz" .. c.caron] = "d" .. p[1], ["dz" .. c.acute] = "d" .. p[2] }, { ["b" .. c.acute] = "b" .. p[1], ["c" .. c.caron] = "c" .. p[1], ["c" .. c.acute] = "c" .. p[2], ["e" .. c.caron] = "e" .. p[1], ["e" .. c.dotabove] = "e" .. p[1], ["f" .. c.acute] = "f" .. p[1], ["m" .. c.acute] = "m" .. p[1], ["n" .. c.acute] = "n" .. p[1], ["o" .. c.acute] = "o" .. p[1], ["p" .. c.acute] = "p" .. p[1], ["r" .. c.caron] = "r" .. p[1], ["r" .. c.acute] = "r" .. p[2], ["s" .. c.caron] = "s" .. p[1], ["s" .. c.acute] = "s" .. p[2], ["w" .. c.acute] = "w" .. p[1], ["z" .. c.caron] = "z" .. p[1], ["z" .. c.acute] = "z" .. p[2], } } } export.shared = s -- Short-term solution to override the standard substitution process, by forcing the module to substitute the entire text in one pass, if "cont" is given. This results in any PUA characters that are used as stand-ins for formatting being handled by the language-specific substitution process, which is usually undesirable. If the value is "none" then the formatting tags do not get turned into PUA characters in the first place. -- This override is provided for languages which use formatting between strings of text which might need to interact with each other (e.g. Korean 값이 transliterates as "gaps-i", but [[값]] has the formatting '''값'''[[-이]]. The normal process would split the text at the second '''.) 
export.substitution = { ["gmy"] = "none", ["ja"] = "cont", ["jje"] = "cont", ["ko"] = "cont", ["ko-ear"] = "cont", ["ru"] = "cont", ["th-new"] = "cont", ["sa"] = "cont", ["zkt"] = "cont", } -- Code aliases. The left side is the alias and the right side is the canonical code. NOTE: These are gradually -- being deprecated, so should not be added to on a permanent basis. Temporary additions are permitted under reasonable -- circumstances (e.g. to facilitate changing a language's code). When an alias is no longer used, it should be removed. -- Aliases in this table are tracked at [[Wiktionary:Tracking/languages/LANG]]; see e.g. -- [[Special:WhatLinksHere/Wiktionary:Tracking/languages/VL.]] for the `VL.` alias. export.aliases = { ["EL."] = "la-ecc", ["LL."] = "la-lat", ["ML."] = "la-med", ["NL."] = "la-new", ["VL."] = "la-vul", ["nds-DE"] = "nds-de", ["nds-NL"] = "nds-nl", ["roa-oan"] = "roa-ona", } -- Codes which are tracked. Note that all aliases listed above are also tracked, so should not be duplicated here. -- Tracking uses the same mechanism described above in the comment above `export.aliases`. export.track = { -- Codes duplicated between full and etymology-only languages. ["lzh-lit"] = true, -- Languages actively being converted to families. ["bh"] = true, -- inc-bih ["nan"] = true, -- zhx-nan } return export e070ni97phn8zzfz8rxgqlnnoivur06 Modul:languages/data/exceptional 828 33718 281316 276272 2026-04-21T19:33:44Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/89762531|89762531]]) 281316 Scribunto text/plain local m_langdata = require("Module:languages/data") -- Loaded on demand, as it may not be needed (depending on the data). local function u(...) u = require("Module:string utilities").char return u(...) 
end local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["aav-khs-pro"] = { "Khasi Purba", 116773216, "aav-khs", "Latn", type = "reconstructed", } m["aav-nic-pro"] = { "Nicobar Purba", 116773793, "aav-nic", "Latn", type = "reconstructed", } m["aav-pkl-pro"] = { "Pnar-Khasi-Lyngngam Purba", 116773259, "aav-pkl", "Latn", type = "reconstructed", } m["aav-pro"] = { -- mkh-pro will merge into this "Austroasia Purba", 116773186, "aav", "Latn", type = "reconstructed", } m["afa-pro"] = { "Afroasia Purba", 269125, "afa", "Latn", type = "reconstructed", } m["alg-aga"] = { "Agawam", nil, "alg-eas", "Latn", } m["alg-pro"] = { "Algonquin Purba", 7251834, "alg", "Latn", type = "reconstructed", sort_key = {remove_diacritics = "·"}, } m["alv-ama"] = { "Amasi", 4740400, "nic-grs", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. c.macron}, } m["alv-bgu"] = { "Baïnounk Gubëeher", 17002646, "alv-bny", "Latn", } m["alv-bua-pro"] = { "Bua Purba", 116773723, "alv-bua", "Latn", type = "reconstructed", } m["alv-cng-pro"] = { "Cangin Purba", 116773726, "alv-cng", "Latn", type = "reconstructed", } m["alv-edo-pro"] = { "Edoid Purba", 116773206, "alv-edo", "Latn", type = "reconstructed", } m["alv-fli-pro"] = { "Fali Purba", 116773754, "alv-fli", "Latn", type = "reconstructed", } m["alv-gbe-pro"] = { "Gbe Purba", 116773208, "alv-gbe", "Latn", type = "reconstructed", } m["alv-gng-pro"] = { "Guang Purba", 116773757, "alv-gng", "Latn", type = "reconstructed", } m["alv-gtm-pro"] = { "Togo Tengah Purba", 116773732, "alv-gtm", "Latn", type = "reconstructed", } m["alv-gwa"] = { "Gwara", 16945580, "nic-pla", "Latn", } m["alv-hei-pro"] = { "Heiban Purba", 116773760, "alv-hei", "Latn", type = "reconstructed", } m["alv-ido-pro"] = { "Idomoid Purba", 116773764, "alv-ido", "Latn", type = "reconstructed", } m["alv-igb-pro"] = { "Igboid Purba", 116773765, "alv-igb", "Latn", type = "reconstructed", } m["alv-kwa-pro"] = { 
"Kwa Purba", 116773780, "alv-kwa", "Latn", type = "reconstructed", } m["alv-mum-pro"] = { "Mumuye Purba", 116773791, "alv-mum", "Latn", type = "reconstructed", } m["alv-nup-pro"] = { "Nupoid Purba", 116773795, "alv-nup", "Latn", type = "reconstructed", } m["alv-pro"] = { "Atlantik-Congo Purba", 116732838, "alv", "Latn", type = "reconstructed", } m["alv-edk-pro"] = { "Edekiri Purba", nil, "alv-edk", "Latn", type = "reconstructed", } m["alv-yor-pro"] = { "Yoruba Purba", nil, "alv-yor", "Latn", type = "reconstructed", } m["alv-yrd-pro"] = { "Yoruboid Purba", 116773824, "alv-yrd", "Latn", type = "reconstructed", } m["alv-von-pro"] = { "Volta-Niger Purba", 116773820, "alv-von", "Latn", type = "reconstructed", } m["apa-pro"] = { "Apache Purba", 116773135, "apa", "Latn", type = "reconstructed", } m["aql-pro"] = { "Algik Purba", 18389588, "aql", "Latn", type = "reconstructed", sort_key = {remove_diacritics = "·"}, } m["art-adu"] = { "Adûni", 1232159, "art", "Latn", type = "appendix-constructed", } m["art-bel"] = { "Kreol Belter", 108055510, "art", "Latn", type = "appendix-constructed", sort_key = { remove_diacritics = c.acute, from = {"ɒ"}, to = {"a"}, }, } m["art-blk"] = { "Bolak", 2909283, "art", "Latn", type = "appendix-constructed", } m["art-bsp"] = { "Black Speech", 686210, "art", "Latn, Teng", type = "appendix-constructed", } m["art-com"] = { "Communicationssprache", 35227, "art", "Latn", type = "appendix-constructed", } m["art-dtk"] = { "Dothraki", 2914733, "art", "Latn", type = "appendix-constructed", } m["art-elo"] = { "Eloi", nil, "art", "Latn", type = "appendix-constructed", } m["art-gld"] = { "Goa'uld", 19823, "art", "Latn, Egyp, Mero", type = "appendix-constructed", } m["art-lap"] = { "Lapine", 6488195, "art", "Latn", type = "appendix-constructed", } m["art-man"] = { "Mandalorian", 54289, "art", "Latn", type = "appendix-constructed", } m["art-mun"] = { "Mundolinco", 851355, "art", "Latn", type = "appendix-constructed", } m["art-nav"] = { "Na'vi", 316939, 
"art", "Latn", type = "appendix-constructed", } m["art-vlh"] = { "High Valyrian", 64483808, "art", "Latn", type = "appendix-constructed", } m["ath-nic"] = { "Nicola", 20609, "ath-nor", "Latn", } m["ath-pro"] = { "Athabaska Purba", 104841722, "ath", "Latn", type = "reconstructed", } m["auf-pro"] = { "Arawa Purba", 116773706, "auf", "Latn", type = "reconstructed", } m["aus-alu"] = { "Alungul", 16827670, "aus-pmn", "Latn", } m["aus-and"] = { "Andjingith", 4754509, "aus-pmn", "Latn", } m["aus-ang"] = { "Angkula", 16828520, "aus-pmn", "Latn", } m["aus-arn-pro"] = { "Arnhem Purba", 116773720, "aus-arn", "Latn", type = "reconstructed", } m["aus-bra"] = { "Barranbinya", 4863220, "aus-pmn", "Latn", } m["aus-brm"] = { "Barunggam", 4865914, "aus-pmn", "Latn", } m["aus-cww-pro"] = { "New South Wales Tengah Purba", 116773199, "aus-cww", "Latn", type = "reconstructed", } m["aus-dal-pro"] = { "Daly Purba", 116773743, "aus-dal", "Latn", type = "reconstructed", } m["aus-guw"] = { "Guwar", 6652138, "aus-pam", "Latn", } m["aus-lsw"] = { "Little Swanport", 6652138, "qfa-unc", "Latn", } m["aus-mbi"] = { "Mbiywom", 6799701, "aus-pmn", "Latn", } m["aus-ngk"] = { "Ngkoth", 7022405, "aus-pmn", "Latn", } m["aus-nyu-pro"] = { "Nyulnyulan Purba", 116773797, "aus-nyu", "Latn", type = "reconstructed", } m["aus-pam-pro"] = { "Pama-Nyunga Purba", 33942, "aus-pam", "Latn", type = "reconstructed", } m["aus-tul"] = { "Tulua", 16938541, "aus-pam", "Latn", } m["aus-uwi"] = { "Uwinymil", 7903995, "aus-arn", "Latn", } m["aus-wdj-pro"] = { "Iwaidjan Purba", 116773767, "aus-wdj", "Latn", type = "reconstructed", } m["aus-won"] = { "Wong-gie", nil, "aus-pam", "Latn", } m["aus-wul"] = { "Wulguru", 8039196, "aus-dyb", "Latn", } m["aus-ynk"] = { -- contrast nny "Yangkaal", 3913770, "aus-tnk", "Latn", } m["awd-amc-pro"] = { "Amuesha-Chamicuro Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-kmp-pro"] = { "Kampa Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-prw-pro"] = { "Paresi-Waura 
Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-ama"] = { "Amarizana", 16827787, "awd", "Latn", } m["awd-ana"] = { "Anauyá", 16828252, "awd", "Latn", } m["awd-apo"] = { "Apolista", 16916645, "awd", "Latn", } m["awd-cab"] = { "Cabre", 16850160, "awd", "Latn", } m["awd-gnu"] = { "Guinau", 3504087, "awd", "Latn", } m["awd-kar"] = { "Cariay", 16920253, "awd", "Latn", } m["awd-kaw"] = { "Kawishana", 6379993, "awd-nwk", "Latn", } m["awd-kus"] = { "Kustenau", 5196293, "awd", "Latn", } m["awd-man"] = { "Manao", 6746920, "awd", "Latn", } m["awd-mar"] = { "Marawan", 6755108, "awd", "Latn", } m["awd-mpr"] = { "Maipure", 6736872, "awd", "Latn", } m["awd-mrt"] = { "Mariaté", 16910017, "awd-nwk", "Latn", } m["awd-nwk-pro"] = { "Nawiki Purba", 116773234, "awd-nwk", "Latn", type = "reconstructed", } m["awd-pai"] = { "Paikoneka", 128807835, "awd", "Latn", } m["awd-pas"] = { "Pasé", 7143168, "awd-nwk", "Latn", } m["awd-pro"] = { "Arawak Purba", 97573478, "awd", "Latn", type = "reconstructed", } m["awd-she"] = { "Shebayo", 7492248, "awd", "Latn", } m["awd-taa-pro"] = { "Ta-Arawak Purba", 116773282, "awd-taa", "Latn", type = "reconstructed", } m["awd-wai"] = { "Wainumá", 16910017, "awd-nwk", "Latn", } m["awd-yum"] = { "Yumana", 8061062, "awd-nwk", "Latn", } m["azc-caz"] = { "Cazcan", 5055514, "azc", "Latn", } m["azc-cup-pro"] = { "Cupan Purba", 116773738, "azc-cup", "Latn", type = "reconstructed", } m["azc-ktn"] = { "Kitanemuk", 3197558, "azc-tak", "Latn", } m["azc-nah-pro"] = { "Nahua Purba", 7251860, "azc-nah", "Latn", type = "reconstructed", } m["azc-num-pro"] = { "Numi Purba", 116773247, "azc-num", "Latn", type = "reconstructed", } m["azc-pro"] = { "Uto-Aztek Purba", 96400333, "azc", "Latn", type = "reconstructed", } m["azc-tak-pro"] = { "Takik Purba", 116773283, "azc-tak", "Latn", type = "reconstructed", } m["azc-tat"] = { "Tataviam", 743736, "azc", "Latn", } m["ber-pro"] = { "Barbar Purba", 2855698, "ber", "Latn", type = "reconstructed", } m["ber-fog"] = { "Fogaha", 
107610173, "ber", "Latn", } m["ber-zuw"] = { "Zuwara", 4117169, "ber", "Latn", } m["bnt-bal"] = { "Balong", 93935237, "bnt-bbo", "Latn", } m["bnt-bon"] = { "Boma Nkuu", nil, "bnt", "Latn", } m["bnt-boy"] = { "Boma Yumu", nil, "bnt", "Latn", } m["bnt-bwa"] = { "Bwala", 128810345, "bnt-tek", "Latn", } m["bnt-cmw"] = { "Chimwiini", 4958328, "bnt-swh", "Latn", } m["bnt-ind"] = { "Indanga", 51412803, "bnt", "Latn", } m["bnt-lal"] = { "Lala (Afrika Selatan)", 6480154, "bnt-ngu", "Latn", } m["bnt-mpi"] = { "Mpiin", 93937013, "bnt-bdz", "Latn", } m["bnt-mpu"] = { "Mpuono", -- not to be confused with Mbuun zmp 36056, "bnt", "Latn", } m["bnt-ngu-pro"] = { "Nguni Purba", 961559, "bnt-ngu", "Latn", type = "reconstructed", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron}, } m["bnt-phu"] = { "Phuthi", 33796, "bnt-ngu", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute}, } m["bnt-pro"] = { "Bantu Purba", 3408025, "bnt", "Latn", type = "reconstructed", sort_key = "bnt-pro-sortkey", } m["bnt-sab-pro"] = { "Sabaki Purba", nil, -- Q2209395 is the code for the Sabaki family "bnt-sab", "Latn", type = "reconstructed", } m["bnt-sbo"] = { "Boma Selatan", nil, "bnt", "Latn", } m["bnt-sts-pro"] = { "Sotho-Tswana Purba", 116773278, "bnt-sts", "Latn", type = "reconstructed", } m["btk-pro"] = { "Batak Purba", 116773191, "btk", "Latn", type = "reconstructed", } m["cau-abz-pro"] = { "Abkhaz-Abaza Purba", 7251831, "cau-abz", "Latn", type = "reconstructed", } m["cau-and-pro"] = { "Andi Purba", nil, "cau-and", "Latn", type = "reconstructed", } m["cau-ava-pro"] = { "Avar-Andi Purba", 116773187, "cau-ava", "Latn", type = "reconstructed", } m["cau-cir-pro"] = { "Circassia Purba", 7251838, "cau-cir", "Latn", type = "reconstructed", } m["cau-drg-pro"] = { "Dargwa Purba", 116773205, "cau-drg", "Latn", type = "reconstructed", } m["cau-lzg-pro"] = { "Lezgi Purba", 116773223, "cau-lzg", "Latn", type = "reconstructed", } m["cau-nec-pro"] = { "Kaukasus Timur Laut 
Purba", 116773244, "cau-nec", "Latn", type = "reconstructed", } m["cau-nkh-pro"] = { "Nakh Purba", 108032840, "cau-nkh", "Latn", type = "reconstructed", } m["cau-nwc-pro"] = { "Kaukasus Barat Laut Purba", 7251861, "cau-nwc", "Latn", type = "reconstructed", } m["cau-tsz-pro"] = { "Tsez Purba", 116773287, "cau-tsz", "Latn", type = "reconstructed", } m["cba-ata"] = { "Atanques", 4812783, "cba", "Latn", } m["cba-cat"] = { "Catío Chibcha", 7083619, "cba", "Latn", } m["cba-dor"] = { "Dorasque", 5297532, "cba", "Latn", } m["cba-dui"] = { "Duit", 3041061, "cba", "Latn", } m["cba-hue"] = { "Huetar", 35514, "cba", "Latn", } m["cba-nut"] = { "Nutabe", 7070405, "cba", "Latn", } m["cba-pro"] = { "Chibchan Purba", 116773203, "cba", "Latn", type = "reconstructed", } m["ccs-pro"] = { "Kartvelia Purba", 2608203, "ccs", "Latn", type = "reconstructed", strip_diacritics = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["ccs-gzn-pro"] = { "Georgia-Zan Purba", 23808119, "ccs-gzn", "Latn", type = "reconstructed", strip_diacritics = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["cdc-cbm-pro"] = { "Chad Tengah Purba", 116773197, "cdc-cbm", "Latn", type = "reconstructed", } m["cdc-mas-pro"] = { "Masa Purba", 116773789, "cdc-mas", "Latn", type = "reconstructed", } m["cdc-pro"] = { "Chad Purba", 116773201, "cdc", "Latn", type = "reconstructed", } m["cdd-pro"] = { "Caddoan Purba", 116773725, "cdd", "Latn", type = "reconstructed", } m["cel-bry-pro"] = { "Briton Purba", 1248800, "cel-bry", "Latn, Polyt", sort_key = { Latn = "cel-bry-pro-sortkey", }, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cel-gal"] = { "Gallaecia", 3094789, "cel-his", } m["cel-gau"] = { "Gallia", 29977, "cel", "Latn, Polyt, Ital", strip_diacritics = { Latn = {remove_diacritics = c.macron .. c.breve .. 
c.diaer}, }, sort_key = { Latn = "cel-bry-pro-sortkey", }, -- Ital translit in [[Module:scripts/data]] -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["cel-pro"] = { "Keltik Purba", 653649, "cel", "Latn", type = "reconstructed", sort_key = "cel-pro-sortkey", } m["chi-pro"] = { "Chimakuan Purba", 116773734, "chi", "Latn", type = "reconstructed", } m["chm-pro"] = { "Mari Purba", 116773788, "chm", "Latn", type = "reconstructed", } m["cmc-pro"] = { "Chamik Purba", 114793834, "cmc", "Latn", type = "reconstructed", } m["crp-bip"] = { "Pijin Basque-Iceland", 810378, "crp", "Latn", ancestors = "eu", } m["crp-kia"] = { "Pijin Jerman Kiautschou", 108314615, "crp", "Latn", ancestors = "de", } m["crp-gep"] = { "Pijin Greenland Barat", 17036301, "crp", "Latn", ancestors = "kl", } m["crp-mar"] = { "Maroon Spirit Language", 1093206, "crp", "Latn", ancestors = "en", } m["crp-mpp"] = { "Portugis Pijin Macau", 128804537, "crp", "Hant, Latn", ancestors = "pt", sort_key = {Hant = "Hani-sortkey"}, } m["crp-rsn"] = { "Russenorsk", 505125, "crp", "Cyrl, Latn", ancestors = "nn, ru", translit = {Cyrl = "ru-translit"}, } m["crp-spp"] = { "Samoan Plantation Pidgin", 7409948, "crp", "Latn", ancestors = "en", } m["crp-slb"] = { "Inggeris Solombala", 7558525, "crp", "Cyrl, Latn", ancestors = "en, ru", translit = {Cyrl = "ru-translit"}, } m["crp-tpr"] = { "Rusia Pijin Taimyr", 16930506, "crp", "Cyrl", ancestors = "ru", translit = "ru-translit", } m["csu-bba-pro"] = { "Bongo-Bagirmi Purba", 116773722, "csu-bba", "Latn", type = "reconstructed", } m["csu-maa-pro"] = { "Mangbetu Purba", 116773786, "csu-maa", "Latn", type = "reconstructed", } m["csu-pro"] = { "Sudan Tengah Purba", 116773730, "csu", "Latn", type = "reconstructed", } m["csu-sar-pro"] = { "Sara Purba", 116773809, "csu-sar", "Latn", type = "reconstructed", } m["cus-ash"] = { "Ashraaf", 4805855, "cus-som", "Latn", } m["cus-hec-pro"] = { "Kusyi Timur Tanah Tinggi Purba", 116773761, "cus-hec", 
"Latn", type = "reconstructed", } m["cus-som-pro"] = { "Somaloid Purba", nil, "cus-som", "Latn", type = "reconstructed", } m["cus-sou-pro"] = { "Kusyi Selatan Purba", 126081567, "cus-sou", "Latn", type = "reconstructed", } m["cus-pro"] = { "Kusyi Purba", 116773204, "cus", "Latn", type = "reconstructed", } m["dmn-dam"] = { "Dama (Sierra Leone)", 19601574, "dmn", "Latn", } m["dra-bry"] = { "Beary", 1089116, "qfa-mix", "Mlym, Knda", ancestors = "ml, tcy", -- Knda translit in [[Module:scripts/data]] -- Mlym translit in [[Module:scripts/data]] } m["dra-cen-pro"] = { "Dravidia Tengah Purba", nil, "dra-cen", "Latn", type = "reconstructed", } m["dra-mkn"] = { "Kannada Pertengahan", 128810572, "dra-kan", "Knda", -- Knda translit in [[Module:scripts/data]] } m["dra-nor-pro"] = { "Dravidia Utara Purba", 124433593, "dra-nor", "Latn", type = "reconstructed", } m["dra-okn"] = { "Kannada Kuno", 15723156, "dra-kan", "Knda", -- Knda translit in [[Module:scripts/data]] } m["dra-ote"] = { "Telugu Kuno", 126720868, "dra-tel", "Telu", translit = "te-translit", } m["dra-pro"] = { "Dravidia Purba", 1702853, "dra", "Latn", type = "reconstructed", } m["dra-sdo-pro"] = { "Dravidia Selatan I Purba", 104847952, -- Wikipedia's "Dravidia Selatan Purba" is Dravidia Selatan Purba I in this scheme. 
"dra-sdo", "Latn", type = "reconstructed", } m["dra-sdt-pro"] = { "Dravidia Selatan II Purba", 128885257, "dra-sdt", "Latn", type = "reconstructed", } m["dra-sou-pro"] = { "Dravidia Selatan Purba", 128886121, "dra-sou", "Latn", type = "reconstructed", } m["egx-dem"] = { "Demotik", 36765, "egx", "Latn, Egyd, Polyt", sort_key = { Latn = { remove_diacritics = "'%-%s", from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"}, to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]} }, }, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["dmn-pro"] = { "Mande Purba", 116773785, "dmn", "Latn", type = "reconstructed", } m["dmn-mdw-pro"] = { "Mande Barat Purba", 116773822, "dmn-mdw", "Latn", type = "reconstructed", } m["dru-pro"] = { "Rukai Purba", 116773807, "map", "Latn", type = "reconstructed", } m["ero-gsz"] = { "Geshiza", nil, "ero", "Latn", } m["ero-nya"] = { "Nyagrong Minyag", nil, "ero", "Latn", } m["ero-tau"] = { "Stau", nil, "ero", "Latn", } m["esx-esk-pro"] = { "Eskimo Purba", 7251842, "esx-esk", "Latn", type = "reconstructed", } m["esx-ink"] = { "Inuktun", 1671647, "esx-inu", "Latn", } m["esx-inq"] = { "Inuinnaqtun", 28070, "esx-inu", "Latn", } m["esx-inu-pro"] = { "Inuit Purba", 60785588, "esx-inu", "Latn", type = "reconstructed", } m["esx-pro"] = { "Eskimo-Aleut Purba", 7251843, "esx", "Latn", type = "reconstructed", } m["esx-tut"] = { "Tunumiisut", 15665389, "esx-inu", "Latn", } m["euq-pro"] = { "Vascon Purba", 938011, "euq", "Latn", type = "reconstructed", } m["gba-pro"] = { "Gbaya Purba", nil, "gba", "Latn", type = "reconstructed", } m["gem-pro"] = { "Jermanik Purba", 669623, "gem", "Latn", type = "reconstructed", sort_key = "gem-pro-sortkey", } m["gme-bur"] = { "Burgundians", 
47625, "gme", "Latn", } m["gme-cgo"] = { "Goth Crimea", 36211, "gme", "Latn", } m["gmq-gut"] = { "Gutnish", 1256646, "gmq", "Latn", ancestors = "gmq-ogt", } m["gmq-jmk"] = { "Jamtish", 35512, "gmq-eas", "Latn", } m["gmq-mno"] = { "Norway Pertengahan", 3417070, "gmq-wes", "Latn", } m["gmq-oda"] = { "Denmark Kuno", 12330003, "gmq-eas", "Latn, Runr", strip_diacritics = {remove_diacritics = c.macron}, } m["gmq-ogt"] = { "Gutnish Kuno", 1133488, "gmq", "Latn, Runr", ancestors = "non", } m["gmq-osw"] = { "Sweden Kuno", 2417210, "gmq-eas", "Latn, Runr", strip_diacritics = {remove_diacritics = c.macron}, } m["gmq-pro"] = { "Norse Purba", 1671294, "gmq", "Runr", translit = "Runr-translit", } m["gmq-scy"] = { "Scanian", 768017, "gmq-eas", "Latn", } m["gmw-bgh"] = { "Bergish", 329030, "gmw-frk", "Latn", } m["gmw-cfr"] = { "Franconia Tengah", 572197, "gmw-hgm", "Latn", ancestors = "gmh", wikimedia_codes = "ksh", } m["gmw-ecg"] = { "Jerman Tengah Timur", 499344, -- subsumes Q699284, Q152965 "gmw-hgm", "Latn", ancestors = "gmh", } m["gmw-fin"] = { "Fingallian", 3072588, "gmw-ian", "Latn", } m["gmw-gts"] = { "Gottscheerish", 533109, "gmw-hgm", "Latn", ancestors = "bar", } m["gmw-jdt"] = { "Belanda Jersey", 1687911, "gmw-frk", "Latn", ancestors = "nl", } m["gmw-msc"] = { "Scots Pertengahan", 3327000, "gmw-ang", "Latn", ancestors = "enm-esc", } m["gmw-pro"] = { "Jermanik Barat Purba", 78079021, "gmw", "Latn, Runr", -- type = "reconstructed", -- largely but not entirely reconstructed (like Proto-Norse); see April '24 BP, set back to reconstructed (?) 
if 'anti-asterisk' is added sort_key = "gmw-pro-sortkey", } m["gmw-rfr"] = { "Franconia Rhine", 707007, "gmw-hgm", "Latn", ancestors = "gmh", } m["gmw-stm"] = { "Sathmar Swabian", 2223059, "gmw-hgm", "Latn", ancestors = "swg", } m["gmw-tsx"] = { "Transylvanian Saxon", 260942, "gmw-hgm", "Latn", ancestors = "gmw-cfr", } m["gmw-vog"] = { "Jerman Volga", 312574, "gmw-hgm", "Latn", ancestors = "gmw-rfr", } m["gmw-zps"] = { "Jerman Zipser", 205548, "gmw-hgm", "Latn", ancestors = "gmh", } m["gn-cls"] = { "Guarani Klasik", 17478065, "gn", "Latn", } m["grk-cal"] = { "Yunani Calabria", 1146398, "grk", "Latn, Grek", ancestors = "grk-ita", translit = { Grek = "el-translit", }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-ita"] = { "Yunani Itali", 19720507, "grk", "Latn, Polyt", ancestors = "gkm", translit = { Grek = "el-translit", }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-mar"] = { "Yunani Mariupol", 4400023, "grk", "Cyrl, Latn, Polyt", ancestors = "gkm", translit = { Cyrl = "grk-mar-translit", Polyt = "grk-mar-translit", }, override_translit = true, strip_diacritics = { Cyrl = {remove_diacritics = c.acute}, }, -- Grek display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["grk-pro"] = { "Hellenik Purba", 1231805, "grk", "Latn, Polyt", type = "reconstructed", sort_key = {Latn = { from = {"ʰ", "ʷ"}, to = {"h", "w"}, remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. 
c.caron }}, -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] -- NOTE: formerly no translit specified for Polyt; presumably an accidental omission; if not, set Polyt = false in -- the translit section } m["hmn-pro"] = { "Hmong Purba", 116773210, "hmn", "Latn", type = "reconstructed", } m["hmx-mie-pro"] = { "Mien Purba", 116773229, "hmx-mie", "Latn", type = "reconstructed", } m["hmx-pro"] = { "Hmong-Mien Purba", 7251846, "hmx", "Latn", type = "reconstructed", } m["hyx-pro"] = { "Armenia Purba", 3848498, "hyx", "Latn", type = "reconstructed", } m["iir-nur-pro"] = { "Nuristani Purba", 116773248, "iir-nur", "Latn", type = "reconstructed", } m["iir-pro"] = { "Indo-Iran Purba", 966439, "iir", "Latn", type = "reconstructed", } m["ijo-pro"] = { "Ijoid Purba", 116773766, "ijo", "Latn", type = "reconstructed", } m["inc-apa"] = { "Apabhramsa", 616419, "inc-mid", "Deva, Shrd, Sidd", ancestors = "pra", translit = { Deva = "sa-translit", -- Shrd translit in [[Module:scripts/data]] -- Sidd translit in [[Module:scripts/data]] }, } m["inc-ash"] = { "Prakrit Ashoka", 104854379, "inc-mid", "Brah, Khar", ancestors = "sa", translit = { -- Brah translit in [[Module:scripts/data]] Khar = "Khar-translit", }, } m["inc-dng-pro"] = { "Dangari Purba", nil, "inc-dng", "Latn", type = "reconstructed", } m["inc-kam"] = { "Prakrit Kamarupi", 6356097, "inc-bas", "Brah, Sidd", -- Brah, Sidd translit in [[Module:scripts/data]] } m["inc-kho"] = { "Kholosi", 24952008, "inc-snd", "Latn", } m["inc-krd-pro"] = { "Kamta Purba", 128816843, "inc-bas", "Latn", ancestors = "inc-kam", type = "reconstructed", } m["inc-mas"] = { "Assam Pertengahan", 128806836, "inc-bas", "as-Beng", ancestors = "inc-oas", translit = "inc-mas-translit", } m["inc-mbn"] = { "Benggali Pertengahan", 113559927, "inc-bas", "Beng", ancestors = "inc-obn", translit = "inc-mbn-translit", } m["inc-mgu"] = { "Gujarat Pertengahan", 24907429, "inc-wes", "Deva", ancestors = "inc-ogu", } m["inc-mor"] = { "Odia
Pertengahan", 128810882, "inc-eas", "Orya", ancestors = "inc-oor", } m["inc-oas"] = { "Assam Awal", 85758237, "inc-bas", "as-Beng", ancestors = "inc-kam", translit = "inc-oas-translit", } m["inc-oaw"] = { "Awadhi Kuno", nil, "inc-hie", "Deva, Kthi, ur-Arab", strip_diacritics = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ"‎ (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, translit = { Deva = "sa-translit", Kthi = "sa-Kthi-translit", ["ur-Arab"] = "inc-ohi-translit", }, } m["inc-obn"] = { "Benggali Kuno", 113559926, "inc-bas", "Beng", } m["inc-ogu"] = { "Gujarati Kuno", 24907427, "inc-wes", "Deva", translit = "sa-translit", } m["inc-ohi"] = { "Hindi Kuno", 48767781, "inc-hiw", "Deva, ur-Arab", strip_diacritics = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ"‎ (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, translit = { Deva = "sa-translit", ["ur-Arab"] = "inc-ohi-translit", }, } m["inc-oor"] = { "Odia Kuno", 128807801, "inc-eas", "Orya", } m["inc-opa"] = { "Punjabi Kuno", 115270971, "inc-pan", "Guru, pa-Arab", translit = { Guru = "inc-opa-Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, strip_diacritics = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. 
c.sukun}, } m["inc-pro"] = { "Indo-Arya Purba", 23808344, "inc", "Latn", type = "reconstructed", } m["ine-ana-pro"] = { "Anatolia Purba", 7251833, "ine-ana", "Latn", type = "reconstructed", } m["ine-bsl-pro"] = { "Balto-Slavik Purba", 1703347, "ine-bsl", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"}, to = {"a", "e", "i", "o", "u"} }, } m["ine-kal"] = { "Kalašma", 122770439, "ine-ana", "Xsux", } m["ine-pae"] = { "Paeonian", 2705672, "ine", "Polyt", -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["ine-pro"] = { "Indo-Eropah Purba", 37178, "ine", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron}, to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"} }, } m["ine-toc-pro"] = { "Tocharia Purba", 104841462, "ine-toc", "Latn", type = "reconstructed", } m["xme-old"] = { "Medes Kuno", 36461, "xme", "Polyt, Latn", } m["xme-mid"] = { "Medes Pertengahan", 12836150, "xme", "Latn", } m["xme-ker"] = { "Kerman", 129850, "xme", "fa-Arab, Latn, Hebr", ancestors = "xme-mid", -- Hebr display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["xme-taf"] = { "Tafreshi", nil, "xme", "fa-Arab, Latn", ancestors = "xme-mid", } m["xme-ttc-pro"] = { "Tat Purba", 122973870, "xme-ttc", "Latn", ancestors = "xme-mid", } m["xme-kls"] = { "Kalasuri", nil, "xme-ttc", ancestors = "xme-ttc-nor", } m["xme-klt"] = { "Kilit", 3612452, "xme-ttc", "Cyrl", -- and fa-Arab? 
} m["xme-ott"] = { "Tati Kuno", 434697, "xme-ttc", "fa-Arab, Latn", } m["ira-kms-pro"] = { "Komisenian Purba", 116773777, "ira-kms", "Latn", type = "reconstructed", } m["ira-mpr-pro"] = { "Medo-Parthia Purba", 116773227, "ira-mpr", "Latn", type = "reconstructed", } m["ira-pat-pro"] = { "Pathan Purba", 116773255, "ira-pat", "Latn", type = "reconstructed", } m["ira-pro"] = { "Iran Purba", 4167865, "ira", "Latn", type = "reconstructed", } m["ira-zgr-pro"] = { "Zaza-Gorani Purba", 116775031, "ira-zgr", "Latn", type = "reconstructed", } m["xsc-pro"] = { "Scythia Purba", 116773273, "xsc", "Latn", type = "reconstructed", } m["xsc-sar-pro"] = { "Sarmatia Purba", 116773249, "xsc-sar", "Latn", type = "reconstructed", } m["xsc-skw-pro"] = { "Saka-Wakhi Purba", 116773267, "xsc-skw", "Latn", type = "reconstructed", } m["xsc-sak-pro"] = { "Saka Purba", 116773264, "xsc-sak", "Latn", type = "reconstructed", } m["ira-sym-pro"] = { "Shughni-Yazghulami-Munji Purba", 116773813, "ira-sym", "Latn", type = "reconstructed", } m["ira-sgi-pro"] = { "Sanglechi-Ishkashimi Purba", 116773808, "ira-sgi", "Latn", type = "reconstructed", } m["ira-mny-pro"] = { "Munji-Yidgha Purba", 116773792, "ira-mny", "Latn", type = "reconstructed", } m["ira-shy-pro"] = { "Shughni-Yazghulami Purba", 116773812, "ira-shy", "Latn", type = "reconstructed", } m["ira-shr-pro"] = { "Shughni-Roshani Purba", 116773811, "ira-shr", "Latn", type = "reconstructed", } m["ira-sgc-pro"] = { "Sogdia Purba", 116773276, "ira-sgc", "Latn", type = "reconstructed", } m["ira-wnj"] = { "Vanji", 3398419, "ira-shy", "Latn", } m["iro-ere"] = { "Erie", 5388365, "iro-nor", "Latn", } m["iro-min"] = { "Mingo", 128531, "iro-nor", "Latn", ietf_subtag = "i-mingo", -- grandfathered IETF tag } m["iro-nor-pro"] = { "Iroquois Utara Purba", 116773242, "iro-nor", "Latn", type = "reconstructed", } m["iro-pro"] = { "Iroquois Purba", 7251852, "iro", "Latn", type = "reconstructed", } m["itc-pro"] = { "Italik Purba", 17102720, "itc", "Latn", type = 
"reconstructed", } m["itc-psa"] = { "Pra-Samnita", 7239186, "itc-sbl", "Ital, Polyt, Latn", -- Ital translit in [[Module:scripts/data]] (NOTE: formerly not present, probably an accidental omission) -- Polyt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["jpx-hcj"] = { "Hachijō", 5637049, "jpx", "Jpan", ancestors = "ojp-eas", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], strip_diacritics = s["jpx-stripdiacritics"], sort_key = s["jpx-sortkey"], } m["jpx-pro"] = { "Jepunik Purba", 3924309, "jpx", "Latn", type = "reconstructed", } m["jpx-ryu-pro"] = { "Ryukyu Purba", 56349069, "jpx-ryu", "Latn", type = "reconstructed", } m["kar-pro"] = { "Karen Purba", 85794783, "kar", "Latn", type = "reconstructed", } m["kca-eas"] = { "Khanty Timur", 30304622, "kca", "Cyrl", translit = "kca-translit", override_translit = true, -- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side) sort_key = { Cyrl = { from = {"ᲊ"}, to = {"Ᲊ"} } }, } m["kca-nor"] = { "Khanty Utara", 30304527, "kca", "Cyrl", translit = "kca-translit", override_translit = true, -- TODO temporary until MediaWiki supports Unicode 16 (probably requires a PHP update from their side) sort_key = { Cyrl = { from = {"ᲊ"}, to = {"Ᲊ"} } }, } m["kca-pro"] = { "Khanty Purba", 127505171, "kca", "Latn", type = "reconstructed", } m["kca-sou"] = { "Khanty Selatan", 30304618, "kca", "Cyrl", translit = "kca-translit", override_translit = true, } m["khi-kho-pro"] = { "Khoe Purba", 116773218, "khi-kho", "Latn", type = "reconstructed", } m["khi-kun"] = { "ǃKung", 32904, "khi-kxa", "Latn", } m["ko-ear"] = { "Korea Moden Awal", 756014, "qfa-kor", "Kore", ancestors = "okm", translit = "okm-translit", -- Kore strip_diacritics in [[Module:scripts/data]] } m["kro-pro"] = { "Kru Purba", 116773778, "kro", "Latn", type = "reconstructed", } m["ku-pro"] = { "Kurdi Purba", 116773221, "ku", "Latn", type = "reconstructed", } m["map-ata-pro"] = { 
"Atayal Purba", 116773151, "map-ata", "Latn", type = "reconstructed", } m["map-bms"] = { "Banyumasan", 33219, "map", "Latn, Java", } m["map-pro"] = { "Austronesia Purba", 49230, "map", "Latn", type = "reconstructed", } m["mis-hkl"] = { "Hokkien Kelantan Peranakan", 108794818, "qfa-mix", ancestors = "nan-hbl, sou, mfa", } m["mis-idn"] = { "Idiom Neutral", 35847, "art", "Latn", type = "appendix-constructed", } m["mis-isa"] = { "Isaurian", 16956868, nil, -- "Xsux, Hluw, Latn", } m["mis-jie"] = { "Jie", 124424186, nil, "Hani", sort_key = "Hani-sortkey", } m["mis-jzh"] = { "Jizhao", 45242758, "qfa-bej", "Latn", } m["mis-kas"] = { "Kassite", 35612, nil, "Xsux", } m["mis-mmd"] = { "Mimi of Decorse", 6862206, nil, "Latn", } m["mis-mmn"] = { "Mimi of Nachtigal", 6862207, nil, "Latn", } m["mis-phi"] = { "Philistine", 2230924, nil, "Phnx", -- Phnx translit in [[Module:scripts/data]] (NOTE: not present before, presumably an accidental omission) } m["mis-rou"] = { "Rouran", 48816637, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tdl"] = { "Turdulian", 133176492, } m["mis-tdt"] = { "Turdetanian", 133176461, } m["mis-tnw"] = { "Tangwang", 7683179, "qfa-mix", "Latn", ancestors = "cmn, sce", } m["mis-tuh"] = { "Tuyuhun", 48816625, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tuo"] = { "Tuoba", 48816629, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-wuh"] = { "Wuhuan", 118976867, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-xbi"] = { "Xianbei", 4448647, "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-xnu"] = { "Xiongnu", 10901674, nil, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mjg-mgl"] = { "Mongghul", 53765528, "mjg", "Latn", -- also Mong, Cyrl ? } m["mjg-mgr"] = { "Mangghuer", 56285392, "mjg", "Latn", -- also Mong, Cyrl ? 
} m["mkh-asl-pro"] = { "Asli Purba", 55630680, "mkh-asl", "Latn", type = "reconstructed", } m["mkh-ban-pro"] = { "Bahnar Purba", 116773189, "mkh-ban", "Latn", type = "reconstructed", } m["mkh-kat-pro"] = { "Katu Purba", 116773772, "mkh-kat", "Latn", type = "reconstructed", } m["mkh-khm-pro"] = { "Khmu Purba", 116773774, "mkh-khm", "Latn", type = "reconstructed", } m["mkh-kmr-pro"] = { "Khmer Purba", 55630684, "mkh-kmr", "Latn", type = "reconstructed", } m["mkh-mmn"] = { "Mon Pertengahan", 121337926, "mkh-mnc", "Latn, Mymr", --and also Pallava ancestors = "omx", } m["mkh-mnc-pro"] = { "Mon Purba", 116773231, "mkh-mnc", "Latn", type = "reconstructed", } m["mkh-mvi"] = { "Vietnam Pertengahan", 9199, "mkh-vie", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mkh-pal-pro"] = { "Palaung Purba", 104847372, "mkh-pal", "Latn", type = "reconstructed", } m["mkh-pea-pro"] = { "Pear Purba", 116773804, "mkh-pea", "Latn", type = "reconstructed", } m["mkh-pkn-pro"] = { "Pakan Purba", 116773803, "mkh-pkn", "Latn", type = "reconstructed", } m["mkh-pro"] = { --This will be merged into 2015 aav-pro. "Mon-Khmer Purba", 7251859, "mkh", "Latn", type = "reconstructed", } m["mnw-tha"] = { -- To be removed. 
"Thai Mon", nil, "mkh-mnc", "Mymr, Thai", ancestors = "mkh-mmn", sort_key = { from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"}, to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"} }, } m["mkh-vie-pro"] = { "Viet Purba", 109432616, "mkh-vie", "Latn", type = "reconstructed", } m["mns-cen"] = { "Mansi Tengah", 128810384, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-nor"] = { "Mansi Utara", 30304537, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-pro"] = { "Mansi Purba", 128883093, "mns", "Latn", type = "reconstructed", } m["mns-sou"] = { "Mansi Selatan", 30304629, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mun-pro"] = { "Munda Purba", 105102373, "mun", "Latn", type = "reconstructed", } m["myn-chl"] = { -- the stage after ''emy'' "Ch'olti'", 873995, "myn", "Latn", } m["myn-pro"] = { "Maya Purba", 3321532, "myn", "Latn", type = "reconstructed", } m["nai-ala"] = { "Alazapa", 128810233, nil, "Latn", } m["nai-bay"] = { "Bayogoula", 1563704, nil, "Latn", } m["nai-cal"] = { "Calusa", 51782, nil, "Latn", } m["nai-chi"] = { "Chiquimulilla", 25339627, "nai-xin", "Latn", } m["nai-chu-pro"] = { "Chumash Purba", 116773736, "nai-chu", "Latn", type = "reconstructed", } m["nai-cig"] = { "Ciguayo", 20741700, nil, "Latn", } m["nai-ckn-pro"] = { "Chinook Purba", 116773735, "nai-ckn", "Latn", type = "reconstructed", } m["nai-guz"] = { "Guazacapán", 19572028, "nai-xin", "Latn", } m["nai-hit"] = { "Hitchiti", 1542882, "nai-mus", "Latn", } m["nai-ipa"] = { "Ipai", 3027474, "nai-yuc", "Latn", } m["nai-jtp"] = { "Jutiapa", nil, "nai-xin", "Latn", } m["nai-jum"] = { "Jumaytepeque", 25339626, "nai-xin", "Latn", } m["nai-kat"] = { "Kathlamet", 6376639, "nai-ckn", "Latn", } m["nai-klp-pro"] = { "Kalapuyan Purba", 116773771, "nai-klp", "Latn", type = "reconstructed", } m["nai-knm"] = { "Konomihu", 3198734, "nai-shs", "Latn", } m["nai-kum"] = 
{ "Kumeyaay", 4910139, "nai-yuc", "Latn", } m["nai-mac"] = { "Macoris", 21070851, nil, "Latn", } m["nai-mdu-pro"] = { "Maidun Purba", 116773784, "nai-mdu", "Latn", type = "reconstructed", } m["nai-miz-pro"] = { "Mixe-Zoque Purba", 7251858, "nai-miz", "Latn", type = "reconstructed", } m["nai-mus-pro"] = { "Muscogee Purba", 116775368, "nai-mus", "Latn", type = "reconstructed", } m["nai-nao"] = { "Naolan", 6964594, nil, "Latn", } m["nai-nrs"] = { "New River Shasta", 7011254, "nai-shs", "Latn", } m["nai-okw"] = { "Okwanuchu", 3350126, "nai-shs", "Latn", } m["nai-per"] = { "Pericú", 3375369, nil, "Latn", } m["nai-pic"] = { "Picuris", 7191257, "nai-kta", "Latn", } m["nai-plp-pro"] = { "Penuti Penara Purba", 116773806, "nai-plp", "Latn", type = "reconstructed", } m["nai-pom-pro"] = { "Pomo Purba", 116773262, "nai-pom", "Latn", type = "reconstructed", } m["nai-qng"] = { "Quinigua", 36360, nil, "Latn", } m["nai-sca-pro"] = { -- NB 'sio-pro' "Proto-Siouan" which is Proto-Western Siouan "Sioux-Catawba Purba", 116773275, "nai-sca", "Latn", type = "reconstructed", } m["nai-sin"] = { "Sinacantán", 24190249, "nai-xin", "Latn", } m["nai-sln"] = { "Salvadoran Lenca", 3229434, "nai-len", "Latn", } m["nai-spt"] = { "Sahaptin", 3833015, "nai-shp", "Latn", } m["nai-tap"] = { "Tapachultec", 7684401, "nai-miz", "Latn", } m["nai-taw"] = { "Tawasa", 7689233, nil, "Latn", } m["nai-teq"] = { "Tequistlatec", 2964454, "nai-tqn", "Latn", } m["nai-tip"] = { "Tipai", 3027471, "nai-yuc", "Latn", } m["nai-tot-pro"] = { "Totozoque Purba", 116773285, "nai-tot", "Latn", type = "reconstructed", } m["nai-tsi-pro"] = { "Tsimshian Purba", nil, "nai-tsi", "Latn", type = "reconstructed", } m["nai-utn-pro"] = { "Uti Purba", 116773290, "nai-utn", "Latn", type = "reconstructed", } m["nai-wai"] = { "Waikuri", 3118702, nil, "Latn", } m["nai-wji"] = { "Jicaque Barat", 3178610, "nai-jcq", "Latn", } m["nai-yup"] = { "Yupiltepeque", 25339628, "nai-xin", "Latn", } m["nan-dat"] = { "Datian Min", 19855572, "zhx-nan", 
"Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-hbl"] = { "Hokkien", 1624231, "zhx-nan", "Hants, Latn, Bopo, Kana", wikimedia_codes = "zh-min-nan", generate_forms = "zh-generateforms", sort_key = { Hani = "Hani-sortkey", Kana = "Kana-sortkey" }, } m["nan-hlh"] = { "Min Hailufeng", 120755728, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-lnx"] = { "Min Longyan", 6674568, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-tws"] = { "Teochew", 36759, "zhx-nan", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["nan-zhe"] = { "Min Zhenan", 3846710, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-zsh"] = { "Min Sanxiang", 7420769, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["ngf-bin-pro"] = { "Binanderean Purba", 137881672, "ngf-bin", "Latn", type = "reconstructed", } m["ngf-pro"] = { "Trans-New Guinea Purba", 85794785, "ngf", "Latn", type = "reconstructed", } m["nic-bco-pro"] = { "Benue-Congo Purba", 116773194, "nic-bco", "Latn", type = "reconstructed", } m["nic-bod-pro"] = { "Bantoid Purba", 116773190, "nic-bod", "Latn", type = "reconstructed", } m["nic-eov-pro"] = { "Oti-Volta Timur Purba", 116773753, "nic-eov", "Latn", type = "reconstructed", } m["nic-gns-pro"] = { "Gurunsi Purba", 116773759, "nic-gns", "Latn", type = "reconstructed", } m["nic-grf-pro"] = { "Grassfields Purba", 116773755, "nic-grf", "Latn", type = "reconstructed", } m["nic-gur-pro"] = { "Gur Purba", 116773758, "nic-gur", "Latn", type = "reconstructed", } m["nic-jkn-pro"] = { "Jukunoid Purba", 116773769, "nic-jkn", "Latn", type = "reconstructed", } m["nic-lcr-pro"] = { "Lower Cross River Purba", 116773782, "nic-lcr", "Latn", type = "reconstructed", } m["nic-ogo-pro"] = { "Ogoni Purba", 116773799, "nic-ogo", "Latn", type = "reconstructed", } 
m["nic-ovo-pro"] = { "Oti-Volta Purba", 116773802, "nic-ovo", "Latn", type = "reconstructed", } m["nic-plt-pro"] = { "Plateau Purba", 116773805, "nic-plt", "Latn", type = "reconstructed", } m["nic-pro"] = { "Niger-Congo Purba", 108000748, "nic", "Latn", type = "reconstructed", } m["nic-ubg-pro"] = { "Ubangi Purba", 116773818, "nic-ubg", "Latn", type = "reconstructed", } m["nic-ucr-pro"] = { "Upper Cross River Purba", 116773819, "nic-ucr", "Latn", type = "reconstructed", } m["nic-vco-pro"] = { "Volta-Congo Purba", 116773293, "nic-vco", "Latn", type = "reconstructed", } m["njo-jgl"] = { "Chungli Ao", 55607615, "sit-aao", "Latn", } m["nub-har"] = { "Haraza", 19572059, "nub", "Arab, Latn", } m["nub-pro"] = { "Nubia Purba", 116773246, "nub", "Latn", type = "reconstructed", } m["omq-cha-pro"] = { "Chatino Purba", 116773202, "omq-cha", "Latn", type = "reconstructed", } m["omq-maz-pro"] = { "Mazatec Purba", 116773790, "omq-maz", "Latn", type = "reconstructed", } m["omq-mix-pro"] = { "Mixtecan Purba", 21573423, "omq-mix", "Latn", type = "reconstructed", } m["omq-mxt-pro"] = { "Mixtec Purba", 21573424, "omq-mxt", "Latn", type = "reconstructed", } m["omq-otp-pro"] = { "Oto-Pamean Purba", 116773251, "omq-otp", "Latn", type = "reconstructed", } m["omq-pro"] = { "Oto-Manguean Purba", 33669, "omq", "Latn", type = "reconstructed", } m["omq-sjq"] = { "San Juan Quiahije Chatino", 138330751, "omq-cha", "Latn", } m["omq-tel"] = { "Teposcolula Mixtec", nil, "omq-mxt", "Latn", } m["omq-teo"] = { "Teojomulco Chatino", 25340451, "omq-cha", "Latn", } m["omq-tri-pro"] = { "Trique Purba", 116773817, "omq-tri", "Latn", type = "reconstructed", } m["omq-zap-pro"] = { "Zapotecan Purba", 116773297, "omq-zap", "Latn", type = "reconstructed", } m["omq-zpc-pro"] = { "Zapotec Purba", 116773296, "omq-zpc", "Latn", type = "reconstructed", } m["omv-aro-pro"] = { "Aroid Purba", 116773721, "omv-aro", "Latn", type = "reconstructed", } m["omv-diz-pro"] = { "Dizoid Purba", 116773750, "omv-diz", "Latn", type 
= "reconstructed", } m["omv-pro"] = { "Omo Purba", 116773800, "omv", "Latn", type = "reconstructed", } m["oto-otm-pro"] = { "Otomi Purba", 5908710, "oto-otm", "Latn", type = "reconstructed", } m["oto-pro"] = { "Otomian Purba", 116773252, "oto", "Latn", type = "reconstructed", } m["paa-kmn"] = { "Kómnzo", 18344310, "paa-wko", "Latn", } m["paa-kwn"] = { "Kuwani", 6449056, "qfa-unc", -- poorly attested, possibly the same as or related to Kalabra "Latn", } m["paa-lei"] = { "Leitre", 85776228, "paa-isk", } m["paa-nha-pro"] = { "Halmahera Utara Purba", 116773241, "paa-nha", "Latn", type = "reconstructed" } m["paa-nun"] = { "Nungon", 128807788, "ngf-ynu", "Latn", } m["phi-din"] = { "Dinapigue Agta", 16945774, "phi", "Latn", } m["phi-kal-pro"] = { "Kalamian Purba", 116773213, "phi-kal", "Latn", type = "reconstructed", } m["phi-nag"] = { "Nagtipunan Agta", 16966111, "phi", "Latn", } m["phi-pro"] = { "Filipina Purba", 18204898, "phi", "Latn", type = "reconstructed", } m["poz-abi"] = { "Abai", 19570729, "poz-san", "Latn", } m["poz-bal"] = { "Baliledo", 4850912, "poz", "Latn", } m["poz-btk-pro"] = { "Bungku-Tolaki Purba", 116773724, "poz-btk", "Latn", type = "reconstructed", } m["poz-cet-pro"] = { "Melayu-Polinesia Tengah Timur Purba", 2269883, "poz-cet", "Latn", type = "reconstructed", } m["poz-hce-pro"] = { "Halmahera Cenderawasih Purba", 116773209, "poz-hce", "Latn", type = "reconstructed", } m["poz-lgx-pro"] = { "Lampung Purba", 116773222, "poz-lgx", "Latn", type = "reconstructed", } m["poz-mcm-pro"] = { "Melayu-Chamik Purba", 116773225, "poz-mcm", "Latn", type = "reconstructed", } m["poz-mic-pro"] = { "Mikronesia Purba", 111939079, "poz-mic", "Latn", type = "reconstructed", } m["poz-mly-pro"] = { "Melayik Purba", 98057728, "poz-mly", "Latn", type = "reconstructed", } m["poz-msa-pro"] = { "Melayu-Sumbawa Purba", 116773226, "poz-msa", "Latn", type = "reconstructed", } m["poz-oce-pro"] = { "Oceania Purba", 141741, "poz-oce", "Latn", type = "reconstructed", } m["poz-pep-pro"] =
{ "Polinesia Timur Purba", 113988745, "poz-pep", "Latn", type = "reconstructed", } m["poz-pnp-pro"] = { "Polinesia Teras Purba", 113988746, "poz-pnp", "Latn", type = "reconstructed", } m["poz-pol-pro"] = { "Polinesia Purba", 1658709, "poz-pol", "Latn", type = "reconstructed", } m["poz-pro"] = { "Melayu-Polinesia Purba", 3832960, "poz", "Latn", type = "reconstructed", } m["poz-sml"] = { "Melayu Sarawak", 4251702, "poz-mly", "Latn, ms-Arab", } m["poz-ssw-pro"] = { "Sulawesi Selatan Purba", 116773279, "poz-ssw", "Latn", type = "reconstructed", } m["poz-swa-pro"] = { "Sarawak Utara Purba", 116773243, "poz-swa", "Latn", type = "reconstructed", } m["poz-ter"] = { "Melayu Terengganu", 4207412, "poz-mly", "Latn, ms-Arab", } m["pqe-pro"] = { "Melayu-Polinesia Timur Purba", 2269883, "pqe", "Latn", type = "reconstructed", } m["pra-niy"] = { "Prakrit Niya", 11991601, "inc-mid", "Khar", ancestors = "inc-ash", translit = "Khar-translit", } m["qfa-adm-pro"] = { "Andaman Raya Purba", 116773756, "qfa-adm", "Latn", type = "reconstructed", } m["qfa-bet-pro"] = { "Be-Tai Purba", 116773193, "qfa-bet", "Latn", type = "reconstructed", } m["qfa-cka-pro"] = { "Chukotko-Kamchatka Purba", 7251837, "qfa-cka", "Latn", type = "reconstructed", } m["qfa-hur-pro"] = { "Hurro-Urartu Purba", 116773211, "qfa-hur", "Latn", type = "reconstructed", } m["qfa-kad-pro"] = { "Kadu Purba", 116773770, "qfa-kad", "Latn", type = "reconstructed", } m["qfa-kms-pro"] = { "Kam-Sui Purba", 55630682, "qfa-kms", "Latn", type = "reconstructed", } m["qfa-kor-pro"] = { "Korea Purba", 467883, "qfa-kor", "Latn", type = "reconstructed", } m["qfa-kra-pro"] = { "Kra Purba", 7251854, "qfa-kra", "Latn", type = "reconstructed", } m["qfa-lic-pro"] = { "Hlai Purba", 7251845, "qfa-lic", "Latn", type = "reconstructed", } m["qfa-onb-pro"] = { "Be Purba", 116773192, "qfa-onb", "Latn", type = "reconstructed", } m["qfa-ong-pro"] = { "Ongan Purba", 116773801, "qfa-ong", "Latn", type = "reconstructed", } m["qfa-tak-pro"] = { "Kra-Dai 
Purba", 104901616, "qfa-tak", "Latn", type = "reconstructed", } m["qfa-yen-pro"] = { "Yenisei Purba", 27639, "qfa-yen", "Latn", type = "reconstructed", } m["qfa-yuk-pro"] = { "Yukaghir Purba", 116773294, "qfa-yuk", "Latn", type = "reconstructed", } m["qwe-kch"] = { "Kichwa", 1740805, "qwe", "Latn", ancestors = "qu", } m["qwe-pro"] = { "Quechua Purba", 5575757, "qwe", "Latn", type = "reconstructed", } m["roa-ang"] = { "Angevin", 56782, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-bbn"] = { "Bourbonnais-Berrichon", 2899128, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-brg"] = { "Bourguignon", 508332, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-can"] = { "Cantabria", 917021, "roa-asl", "Latn", } m["roa-cha"] = { "Champenois", 430018, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-fcm"] = { "Franc-Comtois", 510561, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-gal"] = { "Gallo", 37300, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-gib"] = { "Gallo-Italic of Basilicata", 3094838, "roa-git", ancestors = "pms-old", "Latn", } m["roa-gis"] = { "Gallo-Italic of Sicily", 2629019, "roa-git", "Latn", ancestors = "pms-old", } m["roa-leo"] = { "Leon", 34108, "roa-asl", "Latn", } m["roa-lor"] = { "Lorrain", 671198, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-oca"] = { "Catalonia Kuno", 15478520, "roa-ocr", "Latn", sort_key = {remove_diacritics = c.grave .. c.acute .. c.diaer .. c.cedilla .. "·"}, } m["roa-ole"] = { "Leon Kuno", 125977465, "roa-asl", "Latn", } m["roa-ona"] = { "Navarro-Aragon Kuno", 2736184, "roa-nar", "Latn", } m["roa-opt"] = { "Galicia-Portugis Kuno", 1072111, "roa-gap", "Latn", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. 
c.circ}, } m["roa-orl"] = { "Orléanais", 28497058, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-poi"] = { "Poitevin-Saintongeais", 514123, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-tar"] = { "Tarantino", 695526, "roa-itr", "Latn", wikimedia_codes = "roa-tara", } m["sai-all"] = { "Allentiac", 19570789, "sai-hrp", "Latn", } m["sai-and"] = { -- not to be confused with 'cbc' or 'ano' "Andoquero", 16828359, "sai-wit", "Latn", } m["sai-ayo"] = { "Ayomán", 16937754, "sai-jir", "Latn", } m["sai-bae"] = { "Baenan", 3401998, "qfa-unc", -- extinct, poorly attested; only known through 9 words "Latn", } m["sai-bag"] = { "Bagua", 5390321, "qfa-unc", -- extinct, poorly attested; possibly Cariban "Latn", } m["sai-bet"] = { "Betoi", 926551, "qfa-iso", "Latn", } m["sai-bor-pro"] = { "Boran Purba", nil, "sai-bor", "Latn", } m["sai-cac"] = { "Cacán", 945482, "qfa-unc", -- extinct, poorly attested; no consensus on classification "Latn", } m["sai-caq"] = { "Caranqui", 2937753, "sai-bar", "Latn", } m["sai-car-pro"] = { "Cariban Purba", 116773196, "sai-car", "Latn", type = "reconstructed", } m["sai-cat"] = { "Catacao", 5051136, "sai-ctc", "Latn", } m["sai-cer-pro"] = { "Cerrado Purba", 116773200, "sai-cer", "Latn", type = "reconstructed", } m["sai-chi"] = { "Chirino", 5390321, "qfa-unc", -- extinct, only four words known; possibly related to Candoshi-Shapra (cbu) "Latn", } m["sai-chn"] = { "Chaná", 5072718, "sai-crn", "Latn", } m["sai-chp"] = { "Chapacura", 5072884, "sai-cpc", "Latn", } m["sai-chr"] = { "Charrua", 5086680, "sai-crn", "Latn", } m["sai-chu"] = { "Churuya", 5118339, "sai-guh", "Latn", } m["sai-cje-pro"] = { "Jê Tengah Purba", 116773198, "sai-cje", "Latn", type = "reconstructed", } m["sai-cmg"] = { "Comechingon", 6644203, "qfa-unc", -- extinct, poorly attested; no consensus on classification "Latn", } m["sai-cno"] = { "Chono", 5104704, "qfa-unc", -- extinct, poorly attested; no consensus on classification, possibly spurious "Latn", } 
m["sai-cnr"] = { "Cañari", 5055572, "qfa-unc", -- extinct, poorly attested; possibly Chimuan or Barbacoan "Latn", } m["sai-coe"] = { "Coeruna", 6425639, "sai-wit", "Latn", } m["sai-col"] = { "Colán", 5141893, "sai-ctc", "Latn", } m["sai-cop"] = { "Copallén", 5390321, "qfa-unc", -- extinct, only four words attested; possibly Cholonan "Latn", } m["sai-crd"] = { "Coroado Puri", 24191321, "sai-mje", "Latn", } m["sai-ctq"] = { "Catuquinaru", 16858455, "qfa-unc", -- extinct, poorly attested; vocabulary does not resemble other languages "Latn", } m["sai-cul"] = { "Culli", 2879660, "qfa-unc", -- extinct, poorly attested; often considered an isolate "Latn", } m["sai-cva"] = { "Cueva", 5192644, "qfa-unc", -- extinct, poorly attested; possibly Chocoan "Latn", } m["sai-esm"] = { "Esmeralda", 3058083, "qfa-unc", -- extinct, poorly attested; possibly related to Yaruro "Latn", } m["sai-ewa"] = { "Ewarhuyana", 16898104, nil, "Latn", } m["sai-gam"] = { "Gamela", 5403661, "qfa-unc", -- extinct, poorly attested; possibly an isolate "Latn", } m["sai-gay"] = { "Gayón", 5528902, "sai-jir", "Latn", } m["sai-gmo"] = { "Guamo", 5613495, "qfa-unc", -- extinct; "Kaufman (1990) finds a connection with the Chapacuran languages convincing." [Wikipedia] Considered an isolate by Campbell (2024). "Latn", } m["sai-gua"] = { "Guachí", 5613172, "sai-guc", "Latn", } m["sai-gue"] = { "Güenoa", 5626799, "sai-crn", "Latn", } m["sai-hau"] = { "Haush", 3128376, "sai-cho", "Latn", } m["sai-jee-pro"] = { "Jê Purba", 116773212, "sai-jee", "Latn", type = "reconstructed", } m["sai-jko"] = { "Jeikó", 6176527, "sai-mje", "Latn", } m["sai-jrj"] = { "Jirajara", 6202966, "sai-jir", "Latn", } m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc "Katembri", 6375925, "qfa-unc", -- extinct, poorly attested; "Kaufman (1990) has linked it with the nearly extinct Taruma, although this has not been accepted by other scholars." 
[Wikipedia] "Latn", } m["sai-mal"] = { "Malalí", 6741212, "sai-mje", -- considered the most divergent Maxakalían language (a subdivision of Macro-Jê), for which we have no entry "Latn", } m["sai-mar"] = { "Maratino", 6755055, "qfa-unc", -- extinct, poorly attested; possibly Uto-Aztecan "Latn", } m["sai-mat"] = { "Matanawi", 6786047, "qfa-unc", -- extinct; either an isolate or distantly related to the Muran languages; Campbell (2024) lists it as an isolate, Glottolog gives it as unclassified "Latn", } m["sai-mcn"] = { "Mocana", 3402048, "qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade) "Latn", } m["sai-men"] = { "Menien", 16890110, "sai-mje", "Latn", } m["sai-mil"] = { "Millcayac", 19573012, "sai-hrp", "Latn", } m["sai-mlb"] = { "Malibu", 3402048, "qfa-unc", -- extinct, poorly attested; given as part of the Malibu languages (geographic grouping; not a clade) "Latn", } m["sai-msk"] = { "Masakará", 6782426, "sai-mje", "Latn", } m["sai-muc"] = { "Mucuchí", 6931290, nil, -- generally considered Timotean, for which we have no entry "Latn", } m["sai-mue"] = { "Muellama", 16886936, "sai-bar", "Latn", } m["sai-muz"] = { "Muzo", 6644203, "qfa-unc", -- extinct language of Colombia, poorly attested; may be Pijao (Cariban) "Latn", } m["sai-mys"] = { "Maynas", 16919393, "sai-cah", -- per Campbell (2024); formerly considered unclassified "Latn", } m["sai-nat"] = { "Natú", 9006749, "qfa-unc", -- extinct, poorly attested; "only Greenberg dares to classify [it]".[Wikipedia, quoting Moseley, Christopher; Asher, R. 
E.; Tait, Mary (1994), Atlas of the world's languages] "Latn", } m["sai-nje-pro"] = { "Jê Utara Purba", 116773245, "sai-nje", "Latn", type = "reconstructed", } m["sai-opo"] = { "Opón", 7099152, "sai-car", "Latn", } m["sai-oto"] = { "Otomaco", 16879234, "sai-otm", "Latn", } m["sai-pal"] = { "Palta", 3042978, "qfa-unc", -- extinct, unclassified; possibly Chicham "Latn", } m["sai-pam"] = { "Pamigua", 5908689, "sai-otm", "Latn", } m["sai-par"] = { "Paratió", 16890038, "qfa-unc", -- extinct, poorly attested; possibly Xukuruan "Latn", } m["sai-peb"] = { "Peba", 3373890, "sai-pey", "Latn", } m["sai-pnz"] = { "Panzaleo", 3123275, "qfa-unc", -- extinct, unclassified; possibly Paezan "Latn", } m["sai-prh"] = { "Puruhá", 3410994, "qfa-unc", -- extinct, poorly attested; possibly in a family with Cañari "Latn", } m["sai-ptg"] = { "Patagón", 128807870, "sai-tar", -- extinct, only known from 4 words, which suggest Cariban lineage (Campbell 2024) "Latn", } m["sai-pur"] = { "Purukotó", 7261622, "sai-pem", "Latn", } m["sai-pyg"] = { "Payaguá", 7156643, "sai-guc", "Latn", } m["sai-pyk"] = { "Pykobjê", 98113977, "sai-nje", "Latn", } m["sai-qmb"] = { "Quimbaya", 7272043, "qfa-unc", -- extinct, might not exist; few known words "Latn", } m["sai-qtm"] = { "Quitemo", 7272651, "sai-cpc", "Latn", } m["sai-rab"] = { "Rabona", 6644203, "qfa-unc", -- extinct, poorly attested, mostly plant names; possibly Candoshi-Shapra "Latn", } m["sai-ram"] = { "Ramanos", 16902824, "qfa-unc", -- extinct, poorly attested, possibly an isolate; per Glottolog: "the minuscule wordlist ... 
shows no convincing resemblances to surrounding languages" "Latn", } m["sai-sac"] = { "Sácata", 5390321, "qfa-unc", -- extinct, only 3 words known; possibly Candoshí or Arawakan "Latn", } m["sai-san"] = { "Sanaviron", 16895999, "qfa-unc", -- extinct, unclassified; no consensus on classification "Latn", } m["sai-sap"] = { "Sapará", 7420922, "sai-car", "Latn", } m["sai-sec"] = { "Sechura", 7442912, "qfa-unc", -- extinct, poorly attested; possibly Catacaoan "Latn", } m["sai-sin"] = { "Sinúfana", 7525275, "qfa-unc", -- moribund, poorly attested; possibly Chocoan "Latn", } m["sai-sje-pro"] = { "Jê Selatan Purba", 116773814, "sai-sje", "Latn", type = "reconstructed", } m["sai-tab"] = { "Tabancale", 5390321, "qfa-unc", -- extinct, only 5 words known; no obvious connections, might be an isolate "Latn", } m["sai-tal"] = { "Tallán", 16910468, "qfa-unc", -- extinct, poorly attested; might be Catacaoan "Latn", } m["sai-tap"] = { "Tapayuna", 30719984, "sai-nje", "Latn", } m["sai-tar-pro"] = { "Taranoan Purba", 116773816, "sai-tar", "Latn", type = "reconstructed", } m["sai-teu"] = { "Teushen", 3519243, "qfa-unc", -- probably extinct by the 1950's; possibly Chonan "Latn", } m["sai-tim"] = { "Timote", 7806995, nil, -- possibly in a small Timotean family "Latn", } m["sai-tpr"] = { "Taparita", 7684460, "sai-otm", "Latn", } m["sai-trr"] = { "Tarairiú", 7685313, "qfa-unc", -- extinct, too poorly attested to classify "Latn", } m["sai-wai"] = { "Waitaká", 16918610, "qfa-unc", -- extinct, possibly Purian "Latn", } m["sai-way"] = { "Wayumara", 7960726, "sai-car", "Latn", } m["sai-wit-pro"] = { "Witotoan Purba", 116773823, "sai-wit", "Latn", type = "reconstructed", } m["sai-wnm"] = { "Wanham", 16879440, "sai-cpc", "Latn", } m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat "Xocó", 12953620, "qfa-unc", -- extinct and poorly attested; not clear if one or three languages "Latn", } m["sai-yao"] = { "Yao (Amerika Selatan)", 16979655, "sai-ven", "Latn", } m["sai-yar"] = { -- not the same family as 
'suy' "Yarumá", 3505859, "sai-pek", "Latn", } m["sai-yri"] = { "Yuri", 2669157, "sai-tyu", "Latn", } m["sai-yup"] = { "Yupua", 8061430, "sai-tuc", "Latn", } m["sai-yur"] = { "Yurumanguí", 1281291, "qfa-unc", -- extinct, too poorly attested to classify "Latn", } m["sal-pro"] = { "Salish Purba", 116773269, "sal", "Latn", type = "reconstructed", } m["sdv-daj-pro"] = { "Daju Purba", 116773739, "sdv-daj", "Latn", type = "reconstructed", } m["sdv-eje-pro"] = { "Jabal Timur Purba", 116773751, "sdv-eje", "Latn", type = "reconstructed", } m["sdv-nil-pro"] = { "Nil Purba", 116773794, "sdv-nil", "Latn", type = "reconstructed", } m["sdv-nyi-pro"] = { "Nyima Purba", 116773796, "sdv-nyi", "Latn", type = "reconstructed", } m["sdv-tmn-pro"] = { "Taman Purba", 116773815, "sdv-tmn", "Latn", type = "reconstructed", } m["sel-nor"] = { "Selkup Utara", 30304565, "sel", "Cyrl", translit = "sel-nor-translit", } m["sel-pro"] = { "Selkup Purba", 128884235, "sel", "Latn", type = "reconstructed", } m["sel-sou"] = { "Selkup Selatan", 30304639, "sel", "Cyrl", translit = "sel-sou-translit", } m["sem-amm"] = { "Ammun", 279181, "sem-can", "Phnx", -- Phnx translit in [[Module:scripts/data]] } m["sem-amo"] = { "Amorit", 35941, "sem-nwe", "Xsux, Latn", } m["sem-cha"] = { "Chaha", 35543, "sem-eth", "Ethi", translit = "Ethi-translit", } m["sem-dad"] = { "Dadanitic", 21838040, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-dum"] = { "Dumaitic", 128810397, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-has"] = { "Hasaitic", 3541433, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-his"] = { "Hismaic", 22948260, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-mhr"] = { "Muher", 33743, "sem-eth", "Latn", } m["sem-pro"] = { "Samiah Purba", 1658554, "sem", "Latn", type = "reconstructed", } m["sem-saf"] = { "Safaitic", 472586, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-sam"] = { 
"Samalia", 85847147, "sem-nwe", "Phnx", -- Phnx translit in [[Module:scripts/data]] } m["sem-srb"] = { "Arab Selatan Kuno", 35025, "sem-osa", "Sarb", -- Sarb translit in [[Module:scripts/data]] } m["sem-tay"] = { "Taymanitic", 24912301, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-tha"] = { "Thamudic", 843030, "sem-cen", "Narb", -- Narb translit in [[Module:scripts/data]] } m["sem-wes-pro"] = { "Samiah Barat Purba", 98021726, "sem-wes", "Latn", type = "reconstructed", } m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro' "Sioux Purba", 34181, "sio", "Latn", type = "reconstructed", } m["sit-aao-pro"] = { "Naga Tengah Purba", nil, "sit-aao", "Latn", type = "reconstructed", } m["sit-bai-pro"] = { "Bai Purba", nil, "sit-bai", "Latn", type = "reconstructed", } m["sit-ban"] = { "Bangru", 56071779, "sit-hrs", "Latn", } m["sit-bdi-pro"] = { "Bodish Purba", nil, "sit-bdi", "Latn", type = "reconstructed", } m["sit-bok"] = { "Bokar", 4938727, "sit-tan", "Latn, Tibt", override_translit = true, -- Tibt translit, display_text, strip_diacritics, sort_key in [[Module:scripts/data]] } m["sit-cai"] = { "Caijia", 5017528, "sit-cln", "Latn" } m["sit-cha"] = { "Chairel", 5068066, "sit-luu", "Latn", } m["sit-ers-pro"] = { "Ersuic Purba", nil, "sit-ers", "Latn", type = "reconstructed", } m["sit-hrs-pro"] = { "Hrusish Purba", 116773762, "sit-hrs", "Latn", type = "reconstructed", } m["sit-jap"] = { "Japhug", 3162245, "sit-egy", "Latn", } m["sit-kha-pro"] = { "Kham Purba", 116773773, "sit-kha", "Latn", type = "reconstructed", } m["sit-khb-pro"] = { "Kho-Bwa Purba", nil, "sit-khb", "Latn", type = "reconstructed", } m["sit-khp-pro"] = { "Puroik Purba", nil, "sit-khb", "Latn", type = "reconstructed", } m["sit-khw-pro"] = { "Kho-Bwa Barat Purba", nil, "sit-khw", "Latn", type = "reconstructed", } m["sit-kon-pro"] = { "Naga Utara Purba", nil, "sit-kon", "Latn", type = "reconstructed", } m["sit-liz"] = { "Lizu", 6660653, "sit-ers", "Latn", -- and Ersu 
Shaba } m["sit-lnj"] = { "Longjia", 17096251, "sit-cln", "Latn" } m["sit-lrn"] = { "Luren", 16946370, "sit-cln", "Latn" } m["sit-luu-pro"] = { "Luish Purba", 116773783, "sit-luu", "Latn", type = "reconstructed", } m["sit-nas-pro"] = { "Naish Purba", nil, "sit-nas", "Latn", type = "reconstructed", } m["sit-prn"] = { "Puiron", 7259048, "sit-zem", } m["sit-pro"] = { "Sino-Tibet Purba", 24839178, "sit", "Latn", type = "reconstructed", } m["sit-sit"] = { "Situ", 19840830, "sit-egy", "Latn", } m["sit-tam-pro"] = { "Tamang Purba", 117469295, "sit-tam", "Latn", type = "reconstructed", } m["sit-tan-pro"] = { "Tani Purba", 116773284, "sit-tan", "Latn", -- needs verification type = "reconstructed", } m["sit-tgm"] = { "Tangam", 17041370, "sit-tan", "Latn", } m["sit-tng-pro"] = { "Tangkhulic Purba", nil, "sit-tng", "Latn", type = "reconstructed" } m["sit-tos"] = { "Tosu", 7827899, "sit-ers", "Latn", -- also Ersu Shaba } m["sit-tsh"] = { "Tshobdun", 19840950, "sit-egy", "Latn", } m["sit-zbu"] = { "Zbu", 19841106, "sit-egy", "Latn", } m["sla-pro"] = { "Slavik Purba", 747537, "sla", "Latn", type = "reconstructed", strip_diacritics = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. 
c.invbreve, remove_exceptions = {'ś'}, }, sort_key = { from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"}, to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"}, } } m["smi-pro"] = { "Sami Purba", 7251862, "smi", "Latn", type = "reconstructed", sort_key = { from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"}, to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"} }, } m["son-pro"] = { "Songhay Purba", 116773277, "son", "Latn", type = "reconstructed", } m["sqj-pro"] = { "Albania Purba", 18210846, "sqj", "Latn", type = "reconstructed", } m["ssa-klk-pro"] = { "Kuliak Purba", 116773779, "ssa-klk", "Latn", type = "reconstructed", } m["ssa-kom-pro"] = { "Koman Purba", 116773775, "ssa-kom", "Latn", type = "reconstructed", } m["ssa-pro"] = { "Nilo-Sahara Purba", 116773236, "ssa", "Latn", type = "reconstructed", } m["syd-pro"] = { "Samoyed Purba", 7251863, "syd", "Latn", type = "reconstructed", } m["tai-pro"] = { "Tai Purba", 6583709, "tai", "Latn", type = "reconstructed", } m["tai-swe-pro"] = { "Tai Barat Daya Purba", 116773280, "tai-swe", "Latn", type = "reconstructed", } m["tbq-bdg-pro"] = { "Bodo-Garo Purba", 116773195, "tbq-bdg", "Latn", type = "reconstructed", } m["tbq-blg"] = { "Bailang", 2879843, "tbq-lob", "Hani", sort_key = "Hani-sortkey", } m["tbq-brm-pro"] = { "Burma Purba", nil, "tbq-brm", "Latn", type = "reconstructed", } m["tbq-gkh"] = { "Gokhy", 5578069, "tbq-sil", "Latn", } m["tbq-kuk-pro"] = { "Kukish Purba", 116773220, "tbq-kuk", "Latn", type = "reconstructed", } m["tbq-lal-pro"] = { "Lalo Purba", 116773781, "tbq-lal", "Latn", type = "reconstructed", } m["tbq-laz"] = { "Laze", 17007626, "sit-nas", "Latn", } m["tbq-lob-pro"] = { "Lolo-Burma Purba", 116773224, "tbq-lob", "Latn", type = "reconstructed", } m["tbq-lol-pro"] = { "Lolo Purba", 7251855, "tbq-lol", "Latn", type = "reconstructed", } m["tbq-mil"] = { "Milang", 6850761, "sit-gsi", "Deva, Latn", } m["tbq-mor"] = { 
"Moran", 6909216, "tbq-bdg", "Latn", } m["tbq-ngo"] = { "Ngochang", 56582, "tbq-brm", "Latn", } -- tbq-pro is now etymology-only m["trk-dkh"] = { "Dukhan", 12809273, "trk-ssb", "Latn, Cyrl, Mong", -- Mong translit, display_text and strip_diacritics in [[Module:scripts/data]] } -- As described in Mahmud al-Kashgari's 11th century ''Dīwān Lughāt al-Turk''. m["trk-eog"] = { "Oghuz Kuno Awal", nil, "trk-ogz", "ota-Arab", strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"}, } m["trk-oat"] = { "Turki Anatolia Kuno", 7083390, "trk-ogz", "ota-Arab", strip_diacritics = {["ota-Arab"] = "ar-stripdiacritics"}, } m["trk-pro"] = { "Turk Purba", 3657773, "trk", "Latn", type = "reconstructed", standard_chars = { Latn = " ()-abdegiklmnoprstuxyzïöüāčēīĺŋōŕšūǖȫẹ" .. c.macron, } } m["tup-gua-pro"] = { "Tupi-Guarani Purba", 116773288, "tup-gua", "Latn", type = "reconstructed", } m["tup-kab"] = { "Kabishiana", 15302988, "tup", "Latn", } m["tup-pro"] = { "Tupi Purba", 10354700, "tup", "Latn", type = "reconstructed", } m["tuw-alk"] = { "Alchuka", 113553616, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-bal"] = { "Bala", 86730632, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-kkl"] = { "Kyakala", 118875708, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-kli"] = { "Kili", 6406892, "tuw-ewe", "Cyrl", } m["tuw-pro"] = { "Tungus Purba", 85872335, "tuw", "Latn", type = "reconstructed", } m["tuw-sol"] = { "Solon", 30004, "tuw-ewe", } m["urj-fin-pro"] = { "Finnik Purba", 11883720, "urj-fin", "Latn", type = "reconstructed", } m["urj-koo"] = { "Komi Kuno", 86679962, "kv", "Perm, Cyrs", translit = "urj-koo-translit", -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]]; previously, Cyrs strip_diacritics not present } m["urj-kuk"] = { "Kukkuzi", 107410460, "urj-fin", "Latn", ancestors = "vot", } m["urj-kya"] = { "Komi-Yazva", 2365210, "kv", "Cyrl", translit = "kv-translit", override_translit = true, strip_diacritics 
= {remove_diacritics = c.acute}, } m["urj-mdv-pro"] = { "Mordvin Purba", 116773232, "urj-mdv", "Latn", type = "reconstructed", } m["urj-prm-pro"] = { "Perm Purba", 116773257, "urj-prm", "Latn", type = "reconstructed", } m["urj-pro"] = { "Ural Purba", 288765, "urj", "Latn", type = "reconstructed", } m["urj-ugr-pro"] = { "Ugri Purba", 156631, "urj-ugr", "Latn", type = "reconstructed", } m["xnd-pro"] = { "Na-Dene Purba", 116773233, "xnd", "Latn", type = "reconstructed", } m["xgn-pro"] = { "Mongol Purba", 2493677, "xgn", "Latn", type = "reconstructed", sort_key = { from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"}, to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1]}, }, } m["yok-bvy"] = { "Yokuts Buena Vista", 4985474, "yok", "Latn", } m["yok-dly"] = { "Yokuts Delta", 70923266, "yok", "Latn", } m["yok-gsy"] = { "Gashowu", 3098708, "yok", "Latn", } m["yok-kry"] = { "Yokuts Sungai Kings", 6413014, "yok", "Latn", } m["yok-nvy"] = { "Yokuts Lembah Utara", 85789777, "yok", "Latn", } m["yok-ply"] = { "Yokuts Palewyami", 2387391, "yok", "Latn", } m["yok-svy"] = { "Yokuts Lembah Selatan", 12642473, "yok", "Latn", } m["yok-tky"] = { "Yokuts Tule-Kaweah", 7851988, "yok", "Latn", } m["ypk-pro"] = { "Yupik Purba", 116773295, "ypk", "Latn", type = "reconstructed", } m["yrk-for"] = { "Forest Nenets", 1295107, "yrk", "Cyrl", translit = "yrk-for-translit", strip_diacritics = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. 
c.dotabove}, } m["yrk-tun"] = { "Tundra Nenets", 36452, "yrk", "Cyrl", strip_diacritics = { from = {"ӑ", "а̄", "э̇", "ӣ", "ы̄", "ӯ", "ю̄", "я̆", "я̄"}, to = {"а", "а", "э", "и", "ы", "у", "ю", "я", "я"}, }, translit = "yrk-tun-translit", } m["zhx-min-pro"] = { "Min Purba", 19646347, "zhx-min", "Latn", type = "reconstructed", } m["zhx-sht"] = { "Shaozhou Tuhua", 1920769, "zhx", "Nshu, Hants", generate_forms = "zh-generateforms", sort_key = {Hani = "Hani-sortkey"}, } m["zhx-sic"] = { "Sichuan", 2278732, "zhx-man", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zhx-tai"] = { "Taishan", 2208940, "zhx-yue", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zle-ono"] = { "Novgorodia Kuno", 162013, "zle", "Cyrs, Glag", translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"}, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["zle-ort"] = { "Ruthenia Kuno", 13211, "zle", "Arab, Cyrs, Latn", ancestors = "orv", translit = { Cyrs = "zle-ort-translit", Arab = "zle-ort-Arab-translit", }, strip_diacritics = { Cyrs = { remove_diacritics = m_langdata.chars_substitutions["Cyrs_remove_diacritics"], remove_exceptions = {"Ї", "ї"}, }, Arab = "ar-stripdiacritics", }, -- Cyrs sort_key in [[Module:scripts/data]] } m["zls-chs"] = { "Slav Gereja", 33251, "zls", "Cyrs, Glag, Latn", ancestors = "cu", translit = { Cyrs = "Cyrs-translit", Glag = "Glag-translit" }, -- Cyrs strip_diacritics, sort_key in [[Module:scripts/data]] } m["zlw-ocs"] = { "Czech Kuno", 593096, "zlw", "Latn", } m["zlw-opl"] = { "Poland Kuno", 149838, "zlw-lch", "Latn", strip_diacritics = {remove_diacritics = c.ringabove}, } m["zlw-osk"] = { "Slovak Kuno", 12776676, "zlw", "Latn", } m["zlw-slv"] = { "Slovincia", 36822, "zlw-pom", "Latn", strip_diacritics = {remove_diacritics = c.macron .. 
c.breve}, } m["zlm-coa"] = { "Melayu Terengganu Pesisir", 4207412, "poz-mly", "Latn, ms-Arab", } m["zlm-pah"] = { "Melayu Pahang", 7310370, "poz-mly", "Latn", } return require("Module:languages").finalizeData(m, "language") b5nkainh67h0t8own0oi8tskhrtcnqc 281322 281316 2026-04-21T19:44:55Z Hakimi97 2668 Membatalkan semakan [[Special:Diff/281316|281316]] oleh [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|bincang]]) 281322 Scribunto text/plain local m_lang = require("Module:languages") local m_langdata = require("Module:languages/data") local u = require("Module:string utilities").char local c = m_langdata.chars local p = m_langdata.puaChars local s = m_langdata.shared local m = {} m["aav-khs-pro"] = { "Khasi Purba", 116773216, "aav-khs", "Latn", type = "reconstructed", } m["aav-nic-pro"] = { "Nicobar Purba", 116773793, "aav-nic", "Latn", type = "reconstructed", } m["aav-pkl-pro"] = { "Pnar-Khasi-Lyngngam Purba", 116773259, "aav-pkl", "Latn", type = "reconstructed", } m["aav-pro"] = { -- mkh-pro will merge into this "Austroasia Purba", 116773186, "aav", "Latn", type = "reconstructed", } m["afa-pro"] = { "Afroasia Purba", 269125, "afa", "Latn", type = "reconstructed", } m["alg-aga"] = { "Agawam", nil, "alg-eas", "Latn", } m["alg-pro"] = { "Algonquin Purba", 7251834, "alg", "Latn", type = "reconstructed", sort_key = {remove_diacritics = "·"}, } m["alv-ama"] = { "Amasi", 4740400, "nic-grs", "Latn", entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.tilde .. 
c.macron}, } m["alv-bgu"] = { "Baïnounk Gubëeher", 17002646, "alv-bny", "Latn", } m["alv-bua-pro"] = { "Bua Purba", 116773723, "alv-bua", "Latn", type = "reconstructed", } m["alv-cng-pro"] = { "Cangin Purba", 116773726, "alv-cng", "Latn", type = "reconstructed", } m["alv-edo-pro"] = { "Edoid Purba", 116773206, "alv-edo", "Latn", type = "reconstructed", } m["alv-fli-pro"] = { "Fali Purba", 116773754, "alv-fli", "Latn", type = "reconstructed", } m["alv-gbe-pro"] = { "Gbe Purba", 116773208, "alv-gbe", "Latn", type = "reconstructed", } m["alv-gng-pro"] = { "Guang Purba", 116773757, "alv-gng", "Latn", type = "reconstructed", } m["alv-gtm-pro"] = { "Togo Tengah Purba", 116773732, "alv-gtm", "Latn", type = "reconstructed", } m["alv-gwa"] = { "Gwara", 16945580, "nic-pla", "Latn", } m["alv-hei-pro"] = { "Heiban Purba", 116773760, "alv-hei", "Latn", type = "reconstructed", } m["alv-ido-pro"] = { "Idomoid Purba", 116773764, "alv-ido", "Latn", type = "reconstructed", } m["alv-igb-pro"] = { "Igboid Purba", 116773765, "alv-igb", "Latn", type = "reconstructed", } m["alv-kwa-pro"] = { "Kwa Purba", 116773780, "alv-kwa", "Latn", type = "reconstructed", } m["alv-mum-pro"] = { "Mumuye Purba", 116773791, "alv-mum", "Latn", type = "reconstructed", } m["alv-nup-pro"] = { "Nupoid Purba", 116773795, "alv-nup", "Latn", type = "reconstructed", } m["alv-pro"] = { "Atlantik-Congo Purba", 116732838, "alv", "Latn", type = "reconstructed", } m["alv-edk-pro"] = { "Edekiri Purba", nil, "alv-edk", "Latn", type = "reconstructed", } m["alv-yor-pro"] = { "Yoruba Purba", nil, "alv-yor", "Latn", type = "reconstructed", } m["alv-yrd-pro"] = { "Yoruboid Purba", 116773824, "alv-yrd", "Latn", type = "reconstructed", } m["alv-von-pro"] = { "Volta-Niger Purba", 116773820, "alv-von", "Latn", type = "reconstructed", } m["apa-pro"] = { "Apache Purba", 116773135, "apa", "Latn", type = "reconstructed", } m["aql-pro"] = { "Algik Purba", 18389588, "aql", "Latn", type = "reconstructed", sort_key = {remove_diacritics = 
"·"}, } m["art-adu"] = { "Adûni", 1232159, "art", "Latn", type = "appendix-constructed", } m["art-bel"] = { "Kreol Belter", 108055510, "art", "Latn", type = "appendix-constructed", sort_key = { remove_diacritics = c.acute, from = {"ɒ"}, to = {"a"}, }, } m["art-blk"] = { "Bolak", 2909283, "art", "Latn", type = "appendix-constructed", } m["art-bsp"] = { "Black Speech", 686210, "art", "Latn, Teng", type = "appendix-constructed", } m["art-com"] = { "Communicationssprache", 35227, "art", "Latn", type = "appendix-constructed", } m["art-dtk"] = { "Dothraki", 2914733, "art", "Latn", type = "appendix-constructed", } m["art-elo"] = { "Eloi", nil, "art", "Latn", type = "appendix-constructed", } m["art-gld"] = { "Goa'uld", 19823, "art", "Latn, Egyp, Mero", type = "appendix-constructed", } m["art-lap"] = { "Lapine", 6488195, "art", "Latn", type = "appendix-constructed", } m["art-man"] = { "Mandalorian", 54289, "art", "Latn", type = "appendix-constructed", } m["art-mun"] = { "Mundolinco", 851355, "art", "Latn", type = "appendix-constructed", } m["art-nav"] = { "Na'vi", 316939, "art", "Latn", type = "appendix-constructed", } m["art-vlh"] = { "High Valyrian", 64483808, "art", "Latn", type = "appendix-constructed", } m["ath-nic"] = { "Nicola", 20609, "ath-nor", "Latn", } m["ath-pro"] = { "Athabaska Purba", 104841722, "ath", "Latn", type = "reconstructed", } m["auf-pro"] = { "Arawa Purba", 116773706, "auf", "Latn", type = "reconstructed", } m["aus-alu"] = { "Alungul", 16827670, "aus-pmn", "Latn", } m["aus-and"] = { "Andjingith", 4754509, "aus-pmn", "Latn", } m["aus-ang"] = { "Angkula", 16828520, "aus-pmn", "Latn", } m["aus-arn-pro"] = { "Arnhem Purba", 116773720, "aus-arn", "Latn", type = "reconstructed", } m["aus-bra"] = { "Barranbinya", 4863220, "aus-pmn", "Latn", } m["aus-brm"] = { "Barunggam", 4865914, "aus-pmn", "Latn", } m["aus-cww-pro"] = { "New South Wales Tengah Purba", 116773199, "aus-cww", "Latn", type = "reconstructed", } m["aus-dal-pro"] = { "Daly Purba", 116773743, 
"aus-dal", "Latn", type = "reconstructed", } m["aus-guw"] = { "Guwar", 6652138, "aus-pam", "Latn", } m["aus-lsw"] = { "Little Swanport", 6652138, nil, "Latn", } m["aus-mbi"] = { "Mbiywom", 6799701, "aus-pmn", "Latn", } m["aus-ngk"] = { "Ngkoth", 7022405, "aus-pmn", "Latn", } m["aus-nyu-pro"] = { "Nyulnyulan Purba", 116773797, "aus-nyu", "Latn", type = "reconstructed", } m["aus-pam-pro"] = { "Pama-Nyunga Purba", 33942, "aus-pam", "Latn", type = "reconstructed", } m["aus-tul"] = { "Tulua", 16938541, "aus-pam", "Latn", } m["aus-uwi"] = { "Uwinymil", 7903995, "aus-arn", "Latn", } m["aus-wdj-pro"] = { "Iwaidjan Purba", 116773767, "aus-wdj", "Latn", type = "reconstructed", } m["aus-won"] = { "Wong-gie", nil, "aus-pam", "Latn", } m["aus-wul"] = { "Wulguru", 8039196, "aus-dyb", "Latn", } m["aus-ynk"] = { -- contrast nny "Yangkaal", 3913770, "aus-tnk", "Latn", } m["awd-amc-pro"] = { "Amuesha-Chamicuro Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-kmp-pro"] = { "Kampa Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-prw-pro"] = { "Paresi-Waura Purba", nil, "awd", "Latn", type = "reconstructed", } m["awd-ama"] = { "Amarizana", 16827787, "awd", "Latn", } m["awd-ana"] = { "Anauyá", 16828252, "awd", "Latn", } m["awd-apo"] = { "Apolista", 16916645, "awd", "Latn", } m["awd-cab"] = { "Cabre", 16850160, "awd", "Latn", } m["awd-gnu"] = { "Guinau", 3504087, "awd", "Latn", } m["awd-kar"] = { "Cariay", 16920253, "awd", "Latn", } m["awd-kaw"] = { "Kawishana", 6379993, "awd-nwk", "Latn", } m["awd-kus"] = { "Kustenau", 5196293, "awd", "Latn", } m["awd-man"] = { "Manao", 6746920, "awd", "Latn", } m["awd-mar"] = { "Marawan", 6755108, "awd", "Latn", } m["awd-mpr"] = { "Maipure", 6736872, "awd", "Latn", } m["awd-mrt"] = { "Mariaté", 16910017, "awd-nwk", "Latn", } m["awd-nwk-pro"] = { "Nawiki Purba", 116773234, "awd-nwk", "Latn", type = "reconstructed", } m["awd-pai"] = { "Paikoneka", 128807835, "awd", "Latn", } m["awd-pas"] = { "Pasé", 7143168, "awd-nwk", "Latn", } 
m["awd-pro"] = { "Arawak Purba", 97573478, "awd", "Latn", type = "reconstructed", } m["awd-she"] = { "Shebayo", 7492248, "awd", "Latn", } m["awd-taa-pro"] = { "Ta-Arawak Purba", 116773282, "awd-taa", "Latn", type = "reconstructed", } m["awd-wai"] = { "Wainumá", 16910017, "awd-nwk", "Latn", } m["awd-yum"] = { "Yumana", 8061062, "awd-nwk", "Latn", } m["azc-caz"] = { "Cazcan", 5055514, "azc", "Latn", } m["azc-cup-pro"] = { "Cupan Purba", 116773738, "azc-cup", "Latn", type = "reconstructed", } m["azc-ktn"] = { "Kitanemuk", 3197558, "azc-tak", "Latn", } m["azc-nah-pro"] = { "Nahua Purba", 7251860, "azc-nah", "Latn", type = "reconstructed", } m["azc-num-pro"] = { "Numi Purba", 116773247, "azc-num", "Latn", type = "reconstructed", } m["azc-pro"] = { "Uto-Aztek Purba", 96400333, "azc", "Latn", type = "reconstructed", } m["azc-tak-pro"] = { "Takik Purba", 116773283, "azc-tak", "Latn", type = "reconstructed", } m["azc-tat"] = { "Tataviam", 743736, "azc", "Latn", } m["ber-pro"] = { "Barbar Purba", 2855698, "ber", "Latn", type = "reconstructed", } m["ber-fog"] = { "Fogaha", 107610173, "ber", "Latn", } m["ber-zuw"] = { "Zuwara", 4117169, "ber", "Latn", } m["bnt-bal"] = { "Balong", 93935237, "bnt-bbo", "Latn", } m["bnt-bon"] = { "Boma Nkuu", nil, "bnt", "Latn", } m["bnt-boy"] = { "Boma Yumu", nil, "bnt", "Latn", } m["bnt-bwa"] = { "Bwala", 128810345, "bnt-tek", "Latn", } m["bnt-cmw"] = { "Chimwiini", 4958328, "bnt-swh", "Latn", } m["bnt-ind"] = { "Indanga", 51412803, "bnt", "Latn", } m["bnt-lal"] = { "Lala (Afrika Selatan)", 6480154, "bnt-ngu", "Latn", } m["bnt-mpi"] = { "Mpiin", 93937013, "bnt-bdz", "Latn", } m["bnt-mpu"] = { "Mpuono", -- not to be confused with Mbuun zmp 36056, "bnt", "Latn", } m["bnt-ngu-pro"] = { "Nguni Purba", 961559, "bnt-ngu", "Latn", type = "reconstructed", sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.caron}, } m["bnt-phu"] = { "Phuthi", 33796, "bnt-ngu", "Latn", entry_name = {remove_diacritics = c.grave .. 
c.acute}, } m["bnt-pro"] = { "Bantu Purba", 3408025, "bnt", "Latn", type = "reconstructed", sort_key = "bnt-pro-sortkey", } m["bnt-sbo"] = { "Boma Selatan", nil, "bnt", "Latn", } m["bnt-sts-pro"] = { "Sotho-Tswana Purba", 116773278, "bnt-sts", "Latn", type = "reconstructed", } m["btk-pro"] = { "Batak Purba", 116773191, "btk", "Latn", type = "reconstructed", } m["cau-abz-pro"] = { "Abkhaz-Abaza Purba", 7251831, "cau-abz", "Latn", type = "reconstructed", } m["cau-and-pro"] = { "Andi Purba", nil, "cau-and", "Latn", type = "reconstructed", } m["cau-ava-pro"] = { "Avar-Andi Purba", 116773187, "cau-ava", "Latn", type = "reconstructed", } m["cau-cir-pro"] = { "Circassia Purba", 7251838, "cau-cir", "Latn", type = "reconstructed", } m["cau-drg-pro"] = { "Dargwa Purba", 116773205, "cau-drg", "Latn", type = "reconstructed", } m["cau-lzg-pro"] = { "Lezgi Purba", 116773223, "cau-lzg", "Latn", type = "reconstructed", } m["cau-nec-pro"] = { "Kaukasus Timur Laut Purba", 116773244, "cau-nec", "Latn", type = "reconstructed", } m["cau-nkh-pro"] = { "Nakh Purba", 108032840, "cau-nkh", "Latn", type = "reconstructed", } m["cau-nwc-pro"] = { "Kaukasus Barat Laut Purba", 7251861, "cau-nwc", "Latn", type = "reconstructed", } m["cau-tsz-pro"] = { "Tsez Purba", 116773287, "cau-tsz", "Latn", type = "reconstructed", } m["cba-ata"] = { "Atanques", 4812783, "cba", "Latn", } m["cba-cat"] = { "Catío Chibcha", 7083619, "cba", "Latn", } m["cba-dor"] = { "Dorasque", 5297532, "cba", "Latn", } m["cba-dui"] = { "Duit", 3041061, "cba", "Latn", } m["cba-hue"] = { "Huetar", 35514, "cba", "Latn", } m["cba-nut"] = { "Nutabe", 7070405, "cba", "Latn", } m["cba-pro"] = { "Chibchan Purba", 116773203, "cba", "Latn", type = "reconstructed", } m["ccn-pro"] = { "Kaukasus Utara Purba", 116773237, "ccn", "Latn", type = "reconstructed", } m["ccs-pro"] = { "Kartvelia Purba", 2608203, "ccs", "Latn", type = "reconstructed", entry_name = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["ccs-gzn-pro"] = { 
"Georgia-Zan Purba", 23808119, "ccs-gzn", "Latn", type = "reconstructed", entry_name = { from = {"q̣", "p̣", "ʓ", "ċ"}, to = {"q̇", "ṗ", "ʒ", "c̣"} }, } m["cdc-cbm-pro"] = { "Chad Tengah Purba", 116773197, "cdc-cbm", "Latn", type = "reconstructed", } m["cdc-mas-pro"] = { "Masa Purba", 116773789, "cdc-mas", "Latn", type = "reconstructed", } m["cdc-pro"] = { "Chad Purba", 116773201, "cdc", "Latn", type = "reconstructed", } m["cdd-pro"] = { "Caddoan Purba", 116773725, "cdd", "Latn", type = "reconstructed", } m["cel-bry-pro"] = { "Briton Purba", 1248800, "cel-bry", "Latn, Grek", sort_key = "cel-bry-pro-sortkey", } m["cel-gal"] = { "Gallaecia", 3094789, "cel-his", } m["cel-gau"] = { "Gallia", 29977, "cel", "Latn, Grek, Ital", entry_name = {remove_diacritics = c.macron .. c.breve .. c.diaer}, } m["cel-pro"] = { "Keltik Purba", 653649, "cel", "Latn", type = "reconstructed", sort_key = "cel-pro-sortkey", } m["chi-pro"] = { "Chimakuan Purba", 116773734, "chi", "Latn", type = "reconstructed", } m["chm-pro"] = { "Mari Purba", 116773788, "chm", "Latn", type = "reconstructed", } m["cmc-pro"] = { "Chamik Purba", 114793834, "cmc", "Latn", type = "reconstructed", } m["crp-bip"] = { "Pijin Basque-Iceland", 810378, "crp", "Latn", ancestors = "eu", } m["crp-gep"] = { "Pijin Greenland Barat", 17036301, "crp", "Latn", ancestors = "kl", } m["crp-mar"] = { "Maroon Spirit Language", 1093206, "crp", "Latn", ancestors = "en", } m["crp-mpp"] = { "Portugis Pijin Macau", 128804537, "crp", "Hant, Latn", ancestors = "pt", sort_key = {Hant = "Hani-sortkey"}, } m["crp-rsn"] = { "Russenorsk", 505125, "crp", "Cyrl, Latn", ancestors = "nn, ru", translit = {Cyrl = "ru-translit"}, } m["crp-spp"] = { "Samoan Plantation Pidgin", 7409948, "crp", "Latn", ancestors = "en", } m["crp-slb"] = { "Inggeris Solombala", 7558525, "crp", "Cyrl, Latn", ancestors = "en, ru", translit = {Cyrl = "ru-translit"}, } m["crp-tpr"] = { "Rusia Pijin Taimyr", 16930506, "crp", "Cyrl", ancestors = "ru", translit = "ru-translit", 
} m["csu-bba-pro"] = { "Bongo-Bagirmi Purba", 116773722, "csu-bba", "Latn", type = "reconstructed", } m["csu-maa-pro"] = { "Mangbetu Purba", 116773786, "csu-maa", "Latn", type = "reconstructed", } m["csu-pro"] = { "Sudan Tengah Purba", 116773730, "csu", "Latn", type = "reconstructed", } m["csu-sar-pro"] = { "Sara Purba", 116773809, "csu-sar", "Latn", type = "reconstructed", } m["cus-ash"] = { "Ashraaf", 4805855, "cus-som", "Latn", } m["cus-hec-pro"] = { "Kusyi Timur Tanah Tinggi Purba", 116773761, "cus-hec", "Latn", type = "reconstructed", } m["cus-som-pro"] = { "Somaloid Purba", nil, "cus-som", "Latn", type = "reconstructed", } m["cus-sou-pro"] = { "Kusyi Selatan Purba", 126081567, "cus-sou", "Latn", type = "reconstructed", } m["cus-pro"] = { "Kusyi Purba", 116773204, "cus", "Latn", type = "reconstructed", } m["dmn-dam"] = { "Dama (Sierra Leone)", 19601574, "dmn", "Latn", } m["dra-bry"] = { "Beary", 1089116, "qfa-mix", "Mlym, Knda", ancestors = "ml, tcy", translit = { Mlym = "ml-translit", Knda = "kn-translit", }, } m["dra-cen-pro"] = { "Dravidia Tengah Purba", nil, "dra-cen", "Latn", type = "reconstructed", } m["dra-mkn"] = { "Kannada Pertengahan", 128810572, "dra-kan", "Knda", translit = "kn-translit", } m["dra-nor-pro"] = { "Dravidia Utara Purba", 124433593, "dra-nor", "Latn", type = "reconstructed", } m["dra-okn"] = { "Kannada Kuno", 15723156, "dra-kan", "Knda", translit = "kn-translit", } m["dra-ote"] = { "Telugu Kuno", 126720868, "dra-tel", "Telu", translit = "te-translit", } m["dra-pro"] = { "Dravidia Purba", 1702853, "dra", "Latn", type = "reconstructed", } m["dra-sdo-pro"] = { "Dravidia Selatan I Purba", 104847952, -- Wikipedia's "Dravidia Selatan Purba" is Dravidia Selatan Purba I in this scheme. 
"dra-sdo", "Latn", type = "reconstructed", } m["dra-sdt-pro"] = { "Dravidia Selatan II Purba", 128885257, "dra-sdt", "Latn", type = "reconstructed", } m["dra-sou-pro"] = { "Dravidia Selatan Purba", 128886121, "dra-sou", "Latn", type = "reconstructed", } m["egx-dem"] = { "Demotik", 36765, "egx", "Latn, Egyd, Polyt", translit = { Polyt = "grc-translit", }, entry_name = { Polyt = s["Polyt-entryname"], }, sort_key = { Latn = { remove_diacritics = "'%-%s", from = {"ꜣ", "j", "e", "ꜥ", "y", "w", "b", "p", "f", "m", "n", "r", "l", "ḥ", "ḫ", "h̭", "ẖ", "h", "š", "s", "q", "k", "g", "ṱ", "ṯ", "t", "ḏ", "%.", "⸗"}, to = {p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9], p[10], p[11], p[12], p[13], p[15], p[16], p[16], p[17], p[14], p[19], p[18], p[20], p[21], p[22], p[23], p[24], p[23], p[25], p[26], p[26]} }, Polyt = s["Grek-sortkey"], }, } m["dmn-pro"] = { "Mande Purba", 116773785, "dmn", "Latn", type = "reconstructed", } m["dmn-mdw-pro"] = { "Mande Barat Purba", 116773822, "dmn-mdw", "Latn", type = "reconstructed", } m["dru-pro"] = { "Rukai Purba", 116773807, "map", "Latn", type = "reconstructed", } m["esx-esk-pro"] = { "Eskimo Purba", 7251842, "esx-esk", "Latn", type = "reconstructed", } m["esx-ink"] = { "Inuktun", 1671647, "esx-inu", "Latn", } m["esx-inq"] = { "Inuinnaqtun", 28070, "esx-inu", "Latn", } m["esx-inu-pro"] = { "Inuit Purba", 60785588, "esx-inu", "Latn", type = "reconstructed", } m["esx-pro"] = { "Eskimo-Aleut Purba", 7251843, "esx", "Latn", type = "reconstructed", } m["esx-tut"] = { "Tunumiisut", 15665389, "esx-inu", "Latn", } m["euq-pro"] = { "Vascon Purba", 938011, "euq", "Latn", type = "reconstructed", } m["gba-pro"] = { "Gbaya Purba", nil, "gba", "Latn", type = "reconstructed", } m["gem-pro"] = { "Jermanik Purba", 669623, "gem", "Latn", type = "reconstructed", sort_key = "gem-pro-sortkey", } m["gme-bur"] = { "Burgundians", 47625, "gme", "Latn", } m["gme-cgo"] = { "Goth Crimea", 36211, "gme", "Latn", } m["gmq-gut"] = { "Gutnish", 1256646, "gmq", 
	"Latn",
	ancestors = "gmq-ogt",
}

m["gmq-jmk"] = {
	"Jamtish",
	35512,
	"gmq-eas",
	"Latn",
}

m["gmq-mno"] = {
	"Norway Pertengahan",
	3417070,
	"gmq-wes",
	"Latn",
}

m["gmq-oda"] = {
	"Denmark Kuno",
	12330003,
	"gmq-eas",
	"Latn, Runr",
	entry_name = {remove_diacritics = c.macron},
}

m["gmq-ogt"] = {
	"Gutnish Kuno",
	1133488,
	"gmq",
	"Latn",
	ancestors = "non",
}

m["gmq-osw"] = {
	"Sweden Kuno",
	2417210,
	"gmq-eas",
	"Latn, Runr",
	entry_name = {remove_diacritics = c.macron},
}

m["gmq-pro"] = {
	"Norse Purba",
	1671294,
	"gmq",
	"Runr",
	translit = "Runr-translit",
}

m["gmq-scy"] = {
	"Scanian",
	768017,
	"gmq-eas",
	"Latn",
}

m["gmw-bgh"] = {
	"Bergish",
	329030,
	"gmw-frk",
	"Latn",
}

m["gmw-cfr"] = {
	"Franconia Tengah",
	572197,
	"gmw-hgm",
	"Latn",
	ancestors = "gmh",
	wikimedia_codes = "ksh",
}

m["gmw-ecg"] = {
	"Jerman Tengah Timur",
	499344, -- subsumes Q699284, Q152965
	"gmw-hgm",
	"Latn",
	ancestors = "gmh",
}

m["gmw-fin"] = {
	"Fingallian",
	3072588,
	"gmw-ian",
	"Latn",
}

m["gmw-gts"] = {
	"Gottscheerish",
	533109,
	"gmw-hgm",
	"Latn",
	ancestors = "bar",
}

m["gmw-jdt"] = {
	"Belanda Jersey",
	1687911,
	"gmw-frk",
	"Latn",
	ancestors = "nl",
}

m["gmw-msc"] = {
	"Scots Pertengahan",
	3327000,
	"gmw-ang",
	"Latn",
	ancestors = "enm-esc",
}

m["gmw-pro"] = {
	"Jermanik Barat Purba",
	78079021,
	"gmw",
	"Latn",
	-- type = "reconstructed", -- largely but not entirely reconstructed (like Norse); see April '24 BP, set back to reconstructed (?) if 'anti-asterisk' is added
	sort_key = "gmw-pro-sortkey",
}

m["gmw-rfr"] = {
	"Franconia Rhine",
	707007,
	"gmw-hgm",
	"Latn",
	ancestors = "gmh",
}

m["gmw-stm"] = {
	"Sathmar Swabian",
	2223059,
	"gmw-hgm",
	"Latn",
	ancestors = "swg",
}

m["gmw-tsx"] = {
	"Transylvanian Saxon",
	260942,
	"gmw-hgm",
	"Latn",
	ancestors = "gmw-cfr",
}

m["gmw-vog"] = {
	"Jerman Volga",
	312574,
	"gmw-hgm",
	"Latn",
	ancestors = "gmw-rfr",
}

m["gmw-zps"] = {
	"Jerman Zipser",
	205548,
	"gmw-hgm",
	"Latn",
	ancestors = "gmh",
}

m["gn-cls"] = {
	"Guaraní Klasik",
	17478065,
	"tup-gua",
	"Latn",
	ancestors = "gn",
}

m["grk-cal"] = {
	"Yunani Calabria",
	1146398,
	"grk",
	"Latn",
	ancestors = "grk-ita",
}

m["grk-ita"] = {
	"Yunani Itali",
	19720507,
	"grk",
	"Latn, Grek",
	ancestors = "gkm",
	entry_name = {remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow},
	sort_key = s["Grek-sortkey"],
}

m["grk-mar"] = {
	"Yunani Mariupol",
	4400023,
	"grk",
	"Cyrl, Latn, Grek",
	ancestors = "gkm",
	translit = {
		Cyrl = "grk-mar-translit",
		Grek = "grk-mar-translit",
	},
	override_translit = true,
	display_text = {
		Grek = s["Grek-displaytext"],
	},
	entry_name = {
		Cyrl = {remove_diacritics = c.acute},
		Grek = s["Grek-entryname"],
	},
	sort_key = {
		Grek = s["Grek-sortkey"],
	},
}

m["grk-pro"] = {
	"Hellenik Purba",
	1231805,
	"grk",
	"Latn",
	type = "reconstructed",
	sort_key = {
		from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ď", "ľ", "ň", "ř", "ʰ", "ʷ", c.acute, c.macron},
		to = {"a", "e", "i", "o", "u", "d", "l", "n", "r", "¯h", "¯w"}
	},
}

m["hmn-pro"] = {
	"Hmong Purba",
	116773210,
	"hmn",
	"Latn",
	type = "reconstructed",
}

m["hmx-mie-pro"] = {
	"Mien Purba",
	116773229,
	"hmx-mie",
	"Latn",
	type = "reconstructed",
}

m["hmx-pro"] = {
	"Hmong-Mien Purba",
	7251846,
	"hmx",
	"Latn",
	type = "reconstructed",
}

m["hyx-pro"] = {
	"Armenia Purba",
	3848498,
	"hyx",
	"Latn",
	type = "reconstructed",
}

m["iir-nur-pro"] = {
	"Nuristani Purba",
	116773248,
	"iir-nur",
	"Latn",
	type = "reconstructed",
}

m["iir-pro"] = {
	"Indo-Iran Purba",
	966439,
	"iir",
	"Latn",
	type = "reconstructed",
}
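--[==[
Illustrative note (not part of the language data): the `grk-pro` entry above gives a `sort_key` with parallel `from` and `to` lists in which `to` is shorter than `from`. One plausible reading, stated here as an assumption rather than as the module's confirmed behaviour, is that each `from[i]` is replaced by `to[i]` and that patterns with no counterpart in `to` are simply deleted. The function name `apply_substitutions` and the sample rule table are hypothetical. The sketch uses only literal patterns, because the byte-oriented `string.gsub` of plain Lua does not treat a class such as "[áā]" as a set of whole characters.

```lua
-- Hypothetical sketch of applying a from/to substitution table.
local function apply_substitutions(text, rules)
	for i, pattern in ipairs(rules.from) do
		-- A missing entry in `to` means the match is removed.
		text = string.gsub(text, pattern, rules.to[i] or "")
	end
	return text
end

-- "\204\129" is the UTF-8 encoding of U+0301 (combining acute accent);
-- it stands in for the module's c.acute, which is not defined in this sketch.
local acute = "\204\129"

local rules = {
	from = {"ď", "ʰ", "ʷ", acute},
	to = {"d", "¯h", "¯w"},
}

print(apply_substitutions("pʰa" .. acute, rules))
```

Under this reading the order of the lists matters: earlier rules run first, so a later rule sees the output of the earlier ones.
]==]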
m["ijo-pro"] = { "Ijoid Purba", 116773766, "ijo", "Latn", type = "reconstructed", } m["inc-apa"] = { "Apabhramsa", 616419, "inc-mid", "Deva, Shrd, Sidd", ancestors = "pra", translit = { Deva = "sa-translit", Shrd = "Shrd-translit", Sidd = "Sidd-translit", }, } m["inc-ash"] = { "Prakrit Ashoka", 104854379, "inc-mid", "Brah, Khar", ancestors = "sa", translit = { Brah = "Brah-translit", Khar = "Khar-translit", }, } m["inc-kam"] = { "Prakrit Kamarupi", 6356097, "inc-eas", "Brah, Sidd", translit = { Brah = "Brah-translit", Sidd = "Sidd-translit", }, } m["inc-kho"] = { "Kholosi", 24952008, "inc-snd", "Latn", } m["inc-krn-pro"] = { "KRDS lects Purba", 128816843, "inc-eas", "Latn", ancestors = "inc-kam", type = "reconstructed", } m["inc-mas"] = { "Assam Pertengahan", 128806836, "inc-eas", "as-Beng", ancestors = "inc-oas", translit = "inc-mas-translit", } m["inc-mbn"] = { "Benggali Pertengahan", 113559927, "inc-eas", "Beng", ancestors = "inc-obn", translit = "inc-mbn-translit", } m["inc-mgu"] = { "Gujarat Pertengahan", 24907429, "inc-wes", "Deva", ancestors = "inc-ogu", } m["inc-mor"] = { "Odia Pertengahan", 128810882, "inc-eas", "Orya", ancestors = "inc-oor", } m["inc-oas"] = { "Assam Awal", 85758237, "inc-eas", "as-Beng", ancestors = "inc-kam", translit = "inc-oas-translit", } m["inc-oaw"] = { "Awadhi Kuno", nil, "inc-hie", "Deva, Kthi, ur-Arab", entry_name = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ"‎ (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. 
c.superalef }, translit = { Deva = "sa-translit", Kthi = "sa-Kthi-translit", ["ur-Arab"] = "inc-ohi-translit", }, } m["inc-obn"] = { "Benggali Kuno", 113559926, "inc-eas", "Beng", } m["inc-ogu"] = { "Gujarati Kuno", 24907427, "inc-wes", "Deva", translit = "sa-translit", } m["inc-ohi"] = { "Hindi Kuno", 48767781, "inc-hiw", "Deva, ur-Arab", entry_name = { from = {"هٔ", "ۂ"}, -- character "ۂ" code U+06C2 to "ه" and "هٔ"‎ (U+0647 + U+0654) to "ه" to = {"ہ", "ہ"}, remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.nunghunna .. c.superalef }, translit = { Deva = "sa-translit", ["ur-Arab"] = "inc-ohi-translit", }, } m["inc-oor"] = { "Odia Kuno", 128807801, "inc-eas", "Orya", } m["inc-opa"] = { "Punjabi Kuno", 115270971, "inc-pan", "Guru, pa-Arab", translit = { Guru = "inc-opa-Guru-translit", ["pa-Arab"] = "pa-Arab-translit", }, entry_name = {remove_diacritics = c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. 
c.sukun}, } m["inc-pro"] = { "Indo-Arya Purba", 23808344, "inc", "Latn", type = "reconstructed", } m["ine-ana-pro"] = { "Anatolia Purba", 7251833, "ine-ana", "Latn", type = "reconstructed", } m["ine-bsl-pro"] = { "Balto-Slavik Purba", 1703347, "ine-bsl", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", c.acute, c.macron, "ˀ"}, to = {"a", "e", "i", "o", "u"} }, } m["ine-kal"] = { "Kalašma", 122770439, "ine-ana", "Xsux", } m["ine-pae"] = { "Paeonian", 2705672, "ine", "Polyt", translit = "grc-translit", entry_name = s["Polyt-entryname"], sort_key = s["Grek-sortkey"], } m["ine-pro"] = { "Indo-Eropah Purba", 37178, "ine", "Latn", type = "reconstructed", sort_key = { from = {"[áā]", "[éēḗ]", "[íī]", "[óōṓ]", "[úū]", "ĺ", "ḿ", "ń", "ŕ", "ǵ", "ḱ", "ʰ", "ʷ", "₁", "₂", "₃", c.ringbelow, c.acute, c.macron}, to = {"a", "e", "i", "o", "u", "l", "m", "n", "r", "g'", "k'", "¯h", "¯w", "1", "2", "3"} }, } m["ine-toc-pro"] = { "Tocharia Purba", 37029, "ine-toc", "Latn", type = "reconstructed", } m["xme-old"] = { "Medes Kuno", 36461, "xme", "Grek, Latn", } m["xme-mid"] = { "Medes Pertengahan", 12836150, "xme", "Latn", } m["xme-ker"] = { "Kerman", 129850, "xme", "fa-Arab, Latn", ancestors = "xme-mid", } m["xme-taf"] = { "Tafreshi", nil, "xme", "fa-Arab, Latn", ancestors = "xme-mid", } m["xme-ttc-pro"] = { "Tat Purba", 122973870, "xme-ttc", "Latn", ancestors = "xme-mid", } m["xme-kls"] = { "Kalasuri", nil, "xme-ttc", ancestors = "xme-ttc-nor", } m["xme-klt"] = { "Kilit", 3612452, "xme-ttc", "Cyrl", -- and fa-Arab? 
}

m["xme-ott"] = {
	"Tati Kuno",
	434697,
	"xme-ttc",
	"fa-Arab, Latn",
}

m["ira-kms-pro"] = {
	"Komisenian Purba",
	116773777,
	"ira-kms",
	"Latn",
	type = "reconstructed",
}

m["ira-mpr-pro"] = {
	"Medo-Parthia Purba",
	116773227,
	"ira-mpr",
	"Latn",
	type = "reconstructed",
}

m["ira-pat-pro"] = {
	"Pathan Purba",
	116773255,
	"ira-pat",
	"Latn",
	type = "reconstructed",
}

m["ira-pro"] = {
	"Iran Purba",
	4167865,
	"ira",
	"Latn",
	type = "reconstructed",
}

m["ira-zgr-pro"] = {
	"Zaza-Gorani Purba",
	116775031,
	"ira-zgr",
	"Latn",
	type = "reconstructed",
}

m["os-pro"] = {
	"Ossetia Purba",
	116773249,
	"xsc",
	"Latn",
	type = "reconstructed",
}

m["xsc-pro"] = {
	"Scythia Purba",
	116773273,
	"xsc",
	"Latn",
	type = "reconstructed",
}

m["xsc-skw-pro"] = {
	"Saka-Wakhi Purba",
	116773267,
	"xsc-skw",
	"Latn",
	type = "reconstructed",
}

m["xsc-sak-pro"] = {
	"Saka Purba",
	116773264,
	"xsc-sak",
	"Latn",
	type = "reconstructed",
}

m["ira-sym-pro"] = {
	"Shughni-Yazghulami-Munji Purba",
	116773813,
	"ira-sym",
	"Latn",
	type = "reconstructed",
}

m["ira-sgi-pro"] = {
	"Sanglechi-Ishkashimi Purba",
	116773808,
	"ira-sgi",
	"Latn",
	type = "reconstructed",
}

m["ira-mny-pro"] = {
	"Munji-Yidgha Purba",
	116773792,
	"ira-mny",
	"Latn",
	type = "reconstructed",
}

m["ira-shy-pro"] = {
	"Shughni-Yazghulami Purba",
	116773812,
	"ira-shy",
	"Latn",
	type = "reconstructed",
}

m["ira-shr-pro"] = {
	"Shughni-Roshani Purba",
	116773811,
	"ira-shr",
	"Latn",
	type = "reconstructed",
}

m["ira-sgc-pro"] = {
	"Sogdia Purba",
	116773276,
	"ira-sgc",
	"Latn",
	type = "reconstructed",
}

m["ira-wnj"] = {
	"Vanji",
	3398419,
	"ira-shy",
	"Latn",
}

m["iro-ere"] = {
	"Erie",
	5388365,
	"iro-nor",
	"Latn",
}

m["iro-min"] = {
	"Mingo",
	128531,
	"iro-nor",
	"Latn",
	ietf_subtag = "i-mingo", -- grandfathered IETF tag
}

m["iro-nor-pro"] = {
	"Iroquois Utara Purba",
	116773242,
	"iro-nor",
	"Latn",
	type = "reconstructed",
}

m["iro-pro"] = {
	"Iroquois Purba",
	7251852,
	"iro",
	"Latn",
	type = "reconstructed",
}

m["itc-pro"] = {
	"Italik Purba",
	17102720,
	"itc",
	"Latn",
	type =
"reconstructed", } m["jpx-hcj"] = { "Hachijō", 5637049, "jpx", "Jpan", ancestors = "ojp-eas", translit = s["jpx-translit"], display_text = s["jpx-displaytext"], entry_name = s["jpx-entryname"], sort_key = s["jpx-sortkey"], } m["jpx-pro"] = { "Jepunik Purba", 3924309, "jpx", "Latn", type = "reconstructed", } m["jpx-ryu-pro"] = { "Ryukyu Purba", 56349069, "jpx-ryu", "Latn", type = "reconstructed", } m["kar-pro"] = { "Karen Purba", 85794783, "kar", "Latn", type = "reconstructed", } m["kca-eas"] = { "Khanty Timur", 30304622, "kca", "Cyrl", translit = "kca-translit", override_translit = true, } m["kca-nor"] = { "Khanty Utara", 30304527, "kca", "Cyrl", translit = "kca-translit", override_translit = true, } m["kca-pro"] = { "Khanty Purba", 127505171, "kca", "Latn", type = "reconstructed", } m["kca-sou"] = { "Khanty Selatan", 30304618, "kca", "Cyrl", translit = "kca-translit", override_translit = true, } m["khi-kho-pro"] = { "Khoe Purba", 116773218, "khi-kho", "Latn", type = "reconstructed", } m["khi-kun"] = { "ǃKung", 32904, "khi-kxa", "Latn", } m["ko-ear"] = { "Korea Moden Awal", 756014, "qfa-kor", "Kore", ancestors = "okm", translit = "okm-translit", entry_name = s["Kore-entryname"], } m["kro-pro"] = { "Kru Purba", 116773778, "kro", "Latn", type = "reconstructed", } m["ku-pro"] = { "Kurdi Purba", 116773221, "ku", "Latn", type = "reconstructed", } m["map-ata-pro"] = { "Atayal Purba", 116773151, "map-ata", "Latn", type = "reconstructed", } m["map-bms"] = { "Banyumasan", 33219, "map", "Latn, Java", } m["map-pro"] = { "Austronesia Purba", 49230, "map", "Latn", type = "reconstructed", } m["mis-hkl"] = { "Kelantan Peranakan", 108794818, "qfa-mix", ancestors = "nan-hbl, sou, mfa", } m["mis-isa"] = { "Isaurian", 16956868, nil, -- "Xsux, Hluw, Latn", } m["mis-jie"] = { "Jie", 124424186, nil, "Hani", sort_key = "Hani-sortkey", } m["mis-jzh"] = { "Jizhao", 45242758, "qfa-bej", "Latn", } m["mis-kas"] = { "Kassite", 35612, nil, "Xsux", } m["mis-mmd"] = { "Mimi of Decorse", 6862206, 
nil, "Latn", } m["mis-mmn"] = { "Mimi of Nachtigal", 6862207, nil, "Latn", } m["mis-phi"] = { "Philistine", 2230924, nil, "Phnx", } m["mis-rou"] = { "Rouran", 48816637, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tnw"] = { "Tangwang", 7683179, "qfa-mix", "Latn", ancestors = "cmn, sce", } m["mis-tuh"] = { "Tuyuhun", 48816625, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-tuo"] = { "Tuoba", 48816629, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-wuh"] = { "Wuhuan", 118976867, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mis-xbi"] = { "Xianbei", 4448647, "qfa-xgx", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mjg-mgl"] = { "Mongghul", 53765528, "mjg", "Latn", -- also Mong, Cyrl ? } m["mjg-mgr"] = { "Mangghuer", 56285392, "mjg", "Latn", -- also Mong, Cyrl ? } m["mkh-asl-pro"] = { "Asli Purba", 55630680, "mkh-asl", "Latn", type = "reconstructed", } m["mkh-ban-pro"] = { "Bahnar Purba", 116773189, "mkh-ban", "Latn", type = "reconstructed", } m["mkh-kat-pro"] = { "Katu Purba", 116773772, "mkh-kat", "Latn", type = "reconstructed", } m["mkh-khm-pro"] = { "Khmu Purba", 116773774, "mkh-khm", "Latn", type = "reconstructed", } m["mkh-kmr-pro"] = { "Khmer Purba", 55630684, "mkh-kmr", "Latn", type = "reconstructed", } m["mkh-mmn"] = { "Mon Pertengahan", 121337926, "mkh-mnc", "Latn, Mymr", --and also Pallava ancestors = "omx", } m["mkh-mnc-pro"] = { "Mon Purba", 116773231, "mkh-mnc", "Latn", type = "reconstructed", } m["mkh-mvi"] = { "Vietnam Pertengahan", 9199, "mkh-vie", "Hani, Latn", sort_key = {Hani = "Hani-sortkey"}, } m["mkh-pal-pro"] = { "Palaung Purba", 104847372, "mkh-pal", "Latn", type = "reconstructed", } m["mkh-pea-pro"] = { "Pear Purba", 116773804, "mkh-pea", "Latn", type = "reconstructed", } m["mkh-pkn-pro"] = { "Pakan Purba", 116773803, "mkh-pkn", "Latn", type = "reconstructed", } m["mkh-pro"] = { --This will be merged into 2015 aav-pro. 
"Mon-Khmer Purba", 7251859, "mkh", "Latn", type = "reconstructed", } m["mnw-tha"] = { -- To be removed. "Thai Mon", nil, "mkh-mnc", "Mymr, Thai", ancestors = "mkh-mmn", sort_key = { from = {"[%p]", "ျ", "ြ", "ွ", "ှ", "ၞ", "ၟ", "ၠ", "ၚ", "ဿ", "[็-๎]", "([เแโใไ])([ก-ฮ])ฺ?"}, to = {"", "္ယ", "္ရ", "္ဝ", "္ဟ", "္န", "္မ", "္လ", "င", "သ္သ", "", "%2%1"} }, } m["mkh-vie-pro"] = { "Viet Purba", 109432616, "mkh-vie", "Latn", type = "reconstructed", } m["mns-cen"] = { "Mansi Tengah", 128810384, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-nor"] = { "Mansi Utara", 30304537, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mns-pro"] = { "Mansi Purba", 128883093, "mns", "Latn", type = "reconstructed", } m["mns-sou"] = { "Mansi Selatan", 30304629, "mns", "Cyrl", translit = "mns-translit", override_translit = true, } m["mun-pro"] = { "Munda Purba", 105102373, "mun", "Latn", type = "reconstructed", } m["myn-chl"] = { -- the stage after ''emy'' "Ch'olti'", 873995, "myn", "Latn", } m["myn-pro"] = { "Maya Purba", 3321532, "myn", "Latn", type = "reconstructed", } m["nai-ala"] = { "Alazapa", 128810233, nil, "Latn", } m["nai-bay"] = { "Bayogoula", 1563704, nil, "Latn", } m["nai-cal"] = { "Calusa", 51782, nil, "Latn", } m["nai-chi"] = { "Chiquimulilla", 25339627, "nai-xin", "Latn", } m["nai-chu-pro"] = { "Chumash Purba", 116773736, "nai-chu", "Latn", type = "reconstructed", } m["nai-cig"] = { "Ciguayo", 20741700, nil, "Latn", } m["nai-ckn-pro"] = { "Chinook Purba", 116773735, "nai-ckn", "Latn", type = "reconstructed", } m["nai-guz"] = { "Guazacapán", 19572028, "nai-xin", "Latn", } m["nai-hit"] = { "Hitchiti", 1542882, "nai-mus", "Latn", } m["nai-ipa"] = { "Ipai", 3027474, "nai-yuc", "Latn", } m["nai-jtp"] = { "Jutiapa", nil, "nai-xin", "Latn", } m["nai-jum"] = { "Jumaytepeque", 25339626, "nai-xin", "Latn", } m["nai-kat"] = { "Kathlamet", 6376639, "nai-ckn", "Latn", } m["nai-klp-pro"] = { "Kalapuyan Purba", 116773771, "nai-klp", 
"Latn", type = "reconstructed", } m["nai-knm"] = { "Konomihu", 3198734, "nai-shs", "Latn", } m["nai-kum"] = { "Kumeyaay", 4910139, "nai-yuc", "Latn", } m["nai-mac"] = { "Macoris", 21070851, nil, "Latn", } m["nai-mdu-pro"] = { "Maidun Purba", 116773784, "nai-mdu", "Latn", type = "reconstructed", } m["nai-miz-pro"] = { "Mixe-Zoque Purba", 7251858, "nai-miz", "Latn", type = "reconstructed", } m["nai-mus-pro"] = { "Muscogee Purba", 116775368, "nai-mus", "Latn", type = "reconstructed", } m["nai-nao"] = { "Naolan", 6964594, nil, "Latn", } m["nai-nrs"] = { "New River Shasta", 7011254, "nai-shs", "Latn", } m["nai-okw"] = { "Okwanuchu", 3350126, "nai-shs", "Latn", } m["nai-per"] = { "Pericú", 3375369, nil, "Latn", } m["nai-pic"] = { "Picuris", 7191257, "nai-kta", "Latn", } m["nai-plp-pro"] = { "Penuti Penara Purba", 116773806, "nai-plp", "Latn", type = "reconstructed", } m["nai-pom-pro"] = { "Pomo Purba", 116773262, "nai-pom", "Latn", type = "reconstructed", } m["nai-qng"] = { "Quinigua", 36360, nil, "Latn", } m["nai-sca-pro"] = { -- NB 'sio-pro' "Siouan" which is Western Siouan "Sioux-Catawba Purba", 116773275, "nai-sca", "Latn", type = "reconstructed", } m["nai-sin"] = { "Sinacantán", 24190249, "nai-xin", "Latn", } m["nai-sln"] = { "Salvadoran Lenca", 3229434, "nai-len", "Latn", } m["nai-spt"] = { "Sahaptin", 3833015, "nai-shp", "Latn", } m["nai-tap"] = { "Tapachultec", 7684401, "nai-miz", "Latn", } m["nai-taw"] = { "Tawasa", 7689233, nil, "Latn", } m["nai-teq"] = { "Tequistlatec", 2964454, "nai-tqn", "Latn", } m["nai-tip"] = { "Tipai", 3027471, "nai-yuc", "Latn", } m["nai-tot-pro"] = { "Totozoque Purba", 116773285, "nai-tot", "Latn", type = "reconstructed", } m["nai-tsi-pro"] = { "Tsimshian Purba", nil, "nai-tsi", "Latn", type = "reconstructed", } m["nai-utn-pro"] = { "Uti Purba", 116773290, "nai-utn", "Latn", type = "reconstructed", } m["nai-wai"] = { "Waikuri", 3118702, nil, "Latn", } m["nai-wji"] = { "Jicaque Barat", 3178610, "nai-jcq", "Latn", } m["nai-yup"] = { 
"Yupiltepeque", 25339628, "nai-xin", "Latn", } m["nan-dat"] = { "Datian Min", 19855572, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-hbl"] = { "Hokkien", 1624231, "zhx-nan", "Hants, Latn, Bopo, Kana", wikimedia_codes = "zh-min-nan", generate_forms = "zh-generateforms", sort_key = { Hani = "Hani-sortkey", Kana = "Kana-sortkey" }, } m["nan-hlh"] = { "Min Hailufeng", 120755728, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-hnm"] = { "Hainan", 934541, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-lnx"] = { "Min Longyan", 6674568, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-luh"] = { "Min Leizhou", 1988433, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-tws"] = { "Teochew", 36759, "zhx-nan", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["nan-zhe"] = { "Min Zhenan", 3846710, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nan-zsh"] = { "Min Sanxiang", 7420769, "zhx-nan", "Hants", generate_forms = "zh-generateforms", sort_key = "Hani-sortkey", } m["nds-de"] = { "German Low German", 25433, "gmw-lgm", "Latn", ancestors = "nds", ietf_subtag = "nds-DE", -- should we make this the actual code? wikimedia_codes = "nds", } m["nds-nl"] = { "Dutch Low Saxon", 516137, "gmw-lgm", "Latn", ancestors = "nds", ietf_subtag = "nds-NL", -- should we make this the actual code? 
}

m["ngf-pro"] = {
	"Trans-New Guinea Purba",
	85794785,
	"ngf",
	"Latn",
	type = "reconstructed",
}

m["nic-bco-pro"] = {
	"Benue-Congo Purba",
	116773194,
	"nic-bco",
	"Latn",
	type = "reconstructed",
}

m["nic-bod-pro"] = {
	"Bantoid Purba",
	116773190,
	"nic-bod",
	"Latn",
	type = "reconstructed",
}

m["nic-eov-pro"] = {
	"Oti-Volta Timur Purba",
	116773753,
	"nic-eov",
	"Latn",
	type = "reconstructed",
}

m["nic-gns-pro"] = {
	"Gurunsi Purba",
	116773759,
	"nic-gns",
	"Latn",
	type = "reconstructed",
}

m["nic-grf-pro"] = {
	"Grassfields Purba",
	116773755,
	"nic-grf",
	"Latn",
	type = "reconstructed",
}

m["nic-gur-pro"] = {
	"Gur Purba",
	116773758,
	"nic-gur",
	"Latn",
	type = "reconstructed",
}

m["nic-jkn-pro"] = {
	"Jukunoid Purba",
	116773769,
	"nic-jkn",
	"Latn",
	type = "reconstructed",
}

m["nic-lcr-pro"] = {
	"Lower Cross River Purba",
	116773782,
	"nic-lcr",
	"Latn",
	type = "reconstructed",
}

m["nic-ogo-pro"] = {
	"Ogoni Purba",
	116773799,
	"nic-ogo",
	"Latn",
	type = "reconstructed",
}

m["nic-ovo-pro"] = {
	"Oti-Volta Purba",
	116773802,
	"nic-ovo",
	"Latn",
	type = "reconstructed",
}

m["nic-plt-pro"] = {
	"Plateau Purba",
	116773805,
	"nic-plt",
	"Latn",
	type = "reconstructed",
}

m["nic-pro"] = {
	"Niger-Congo Purba",
	108000748,
	"nic",
	"Latn",
	type = "reconstructed",
}

m["nic-ubg-pro"] = {
	"Ubangi Purba",
	116773818,
	"nic-ubg",
	"Latn",
	type = "reconstructed",
}

m["nic-ucr-pro"] = {
	"Upper Cross River Purba",
	116773819,
	"nic-ucr",
	"Latn",
	type = "reconstructed",
}

m["nic-vco-pro"] = {
	"Volta-Congo Purba",
	116773293,
	"nic-vco",
	"Latn",
	type = "reconstructed",
}

m["nub-har"] = {
	"Haraza",
	19572059,
	"nub",
	"Arab, Latn",
}

m["nub-pro"] = {
	"Nubia Purba",
	116773246,
	"nub",
	"Latn",
	type = "reconstructed",
}

m["omq-cha-pro"] = {
	"Chatino Purba",
	116773202,
	"omq-cha",
	"Latn",
	type = "reconstructed",
}

m["omq-maz-pro"] = {
	"Mazatec Purba",
	116773790,
	"omq-maz",
	"Latn",
	type = "reconstructed",
}

m["omq-mix-pro"] = {
	"Mixtecan Purba",
	21573423,
	"omq-mix",
	"Latn",
	type = "reconstructed",
}

m["omq-mxt-pro"] = {
	"Mixtec Purba",
	21573424,
	"omq-mxt",
	"Latn",
	type = "reconstructed",
}

m["omq-otp-pro"] = {
	"Oto-Pamean Purba",
	116773251,
	"omq-otp",
	"Latn",
	type = "reconstructed",
}

m["omq-pro"] = {
	"Oto-Manguean Purba",
	33669,
	"omq",
	"Latn",
	type = "reconstructed",
}

m["omq-sjq"] = {
	"San Juan Quiahije Chatino",
	17003130,
	"omq-cha",
	"Latn",
}

m["omq-tel"] = {
	"Teposcolula Mixtec",
	nil,
	"omq-mxt",
	"Latn",
}

m["omq-teo"] = {
	"Teojomulco Chatino",
	25340451,
	"omq-cha",
	"Latn",
}

m["omq-tri-pro"] = {
	"Trique Purba",
	116773817,
	"omq-tri",
	"Latn",
	type = "reconstructed",
}

m["omq-zap-pro"] = {
	"Zapotecan Purba",
	116773297,
	"omq-zap",
	"Latn",
	type = "reconstructed",
}

m["omq-zpc-pro"] = {
	"Zapotec Purba",
	116773296,
	"omq-zpc",
	"Latn",
	type = "reconstructed",
}

m["omv-aro-pro"] = {
	"Aroid Purba",
	116773721,
	"omv-aro",
	"Latn",
	type = "reconstructed",
}

m["omv-diz-pro"] = {
	"Dizoid Purba",
	116773750,
	"omv-diz",
	"Latn",
	type = "reconstructed",
}

m["omv-pro"] = {
	"Omo Purba",
	116773800,
	"omv",
	"Latn",
	type = "reconstructed",
}

m["oto-otm-pro"] = {
	"Otomi Purba",
	5908710,
	"oto-otm",
	"Latn",
	type = "reconstructed",
}

m["oto-pro"] = {
	"Otomian Purba",
	116773252,
	"oto",
	"Latn",
	type = "reconstructed",
}

m["paa-kom"] = {
	"Kómnzo",
	18344310,
	"paa-yam",
	"Latn",
}

m["paa-kwn"] = {
	"Kuwani",
	6449056,
	"paa",
	"Latn",
}

m["paa-nha-pro"] = {
	"Halmahera Utara Purba",
	116773241,
	"paa-nha",
	"Latn",
	type = "reconstructed",
}

m["paa-nun"] = {
	"Nungon",
	128807788,
	"paa",
	"Latn",
}

m["phi-din"] = {
	"Dinapigue Agta",
	16945774,
	"phi",
	"Latn",
}

m["phi-kal-pro"] = {
	"Kalamian Purba",
	116773213,
	"phi-kal",
	"Latn",
	type = "reconstructed",
}

m["phi-nag"] = {
	"Nagtipunan Agta",
	16966111,
	"phi",
	"Latn",
}

m["phi-pro"] = {
	"Filipina Purba",
	18204898,
	"phi",
	"Latn",
	type = "reconstructed",
}

m["poz-abi"] = {
	"Abai",
	19570729,
	"poz-san",
	"Latn",
}

m["poz-bal"] = {
	"Baliledo",
	4850912,
	"poz",
	"Latn",
}

m["poz-btk-pro"] = {
	"Bungku-Tolaki Purba",
	116773724,
	"poz-btk",
	"Latn",
	type = "reconstructed",
}

m["poz-cet-pro"] = {
"Melayu-Polinesia Tengah Timur Purba", 2269883, "poz-cet", "Latn", type = "reconstructed", } m["poz-hce-pro"] = { "Halmahera Cenderawasih Purba", 116773209, "poz-hce", "Latn", type = "reconstructed", } m["poz-lgx-pro"] = { "Lampung Purba", 116773222, "poz-lgx", "Latn", type = "reconstructed", } m["poz-mcm-pro"] = { "Melayu-Chamik Purba", 116773225, "poz-mcm", "Latn", type = "reconstructed", } m["poz-mic-pro"] = { "Mikronesia Purba", 111939079, "poz-mic", "Latn", type = "reconstructed", } m["poz-mly-pro"] = { "Melayik Purba", 98057728, "poz-mly", "Latn", type = "reconstructed", } m["poz-msa-pro"] = { "Melayu-Sumbawa Purba", 116773226, "poz-msa", "Latn", type = "reconstructed", } m["poz-oce-pro"] = { "Oceania Purba", 141741, "poz-oce", "Latn", type = "reconstructed", } m["poz-pep-pro"] = { "Polinesia Timur Purba", 113988745, "poz-pep", "Latn", type = "reconstructed", } m["poz-pnp-pro"] = { "Polinesia Teras Purba", 113988746, "poz-pnp", "Latn", type = "reconstructed", } m["poz-pol-pro"] = { "Polinesia Purba", 1658709, "poz-pol", "Latn", type = "reconstructed", } m["poz-pro"] = { "Melayu-Polinesia Purba", 3832960, "poz", "Latn", type = "reconstructed", } m["poz-sml"] = { "Melayu Sarawak", 4251702, "poz-mly", "Latn, ms-Arab", } m["poz-ssw-pro"] = { "Sulawesi Selatan Purba", 116773279, "poz-ssw", "Latn", type = "reconstructed", } m["poz-sus-pro"] = { "Sunda-Sulawesi Purba", 116773281, "poz-sus", "Latn", type = "reconstructed", } m["poz-swa-pro"] = { "Sarawak Utara Purba", 116773243, "poz-swa", "Latn", type = "reconstructed", } m["poz-ter"] = { "Melayu Terengganu", 4207412, "poz-mly", "Latn, ms-Arab", } m["pqe-pro"] = { "Melayu-Polinesia Timur Purba", 2269883, "pqe", "Latn", type = "reconstructed", } m["pra-niy"] = { "Prakrit Niya", 11991601, "inc-mid", "Khar", ancestors = "inc-ash", translit = "Khar-translit", } m["qfa-adm-pro"] = { "Andaman Raya Purba", 116773756, "qfa-adm", "Latn", type = "reconstructed", } m["qfa-bet-pro"] = { "Be-Tai Purba", 116773193, "qfa-bet", 
"Latn", type = "reconstructed", } m["qfa-cka-pro"] = { "Chukotko-Kamchatka Purba", 7251837, "qfa-cka", "Latn", type = "reconstructed", } m["qfa-hur-pro"] = { "Hurro-Urartu Purba", 116773211, "qfa-hur", "Latn", type = "reconstructed", } m["qfa-kad-pro"] = { "Kadu Purba", 116773770, "qfa-kad", "Latn", type = "reconstructed", } m["qfa-kms-pro"] = { "Kam-Sui Purba", 55630682, "qfa-kms", "Latn", type = "reconstructed", } m["qfa-kor-pro"] = { "Korea Purba", 467883, "qfa-kor", "Latn", type = "reconstructed", } m["qfa-kra-pro"] = { "Kra Purba", 7251854, "qfa-kra", "Latn", type = "reconstructed", } m["qfa-lic-pro"] = { "Hlai Purba", 7251845, "qfa-lic", "Latn", type = "reconstructed", } m["qfa-onb-pro"] = { "Be Purba", 116773192, "qfa-onb", "Latn", type = "reconstructed", } m["qfa-ong-pro"] = { "Ongan Purba", 116773801, "qfa-ong", "Latn", type = "reconstructed", } m["qfa-tak-pro"] = { "Kra-Dai Purba", 104901616, "qfa-tak", "Latn", type = "reconstructed", } m["qfa-yen-pro"] = { "Yenisei Purba", 27639, "qfa-yen", "Latn", type = "reconstructed", } m["qfa-yuk-pro"] = { "Yukaghir Purba", 116773294, "qfa-yuk", "Latn", type = "reconstructed", } m["qwe-kch"] = { "Kichwa", 1740805, "qwe", "Latn", ancestors = "qu", } m["qwe-pro"] = { "Quechua Purba", 5575757, "qwe", "Latn", type = "reconstructed", } m["roa-ang"] = { "Angevin", 56782, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-bbn"] = { "Bourbonnais-Berrichon", 2899128, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-brg"] = { "Bourguignon", 508332, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-cha"] = { "Champenois", 430018, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-fcm"] = { "Franc-Comtois", 510561, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-gal"] = { "Gallo", 37300, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-gib"] = { "Gallo-Italic of Basilicata", 3094838, "roa-git", "Latn", } m["roa-gis"] = { "Gallo-Italic of Sicily", 2629019, 
"roa-git", "Latn", } m["roa-leo"] = { "Leon", 34108, "roa-ibe", "Latn", ancestors = "roa-ole", } m["roa-lor"] = { "Lorrain", 671198, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-oan"] = { "Navarro-Aragon", 2736184, "roa-ibe", "Latn", } m["roa-oca"] = { "Catalonia Kuno", 15478520, "roa-ocr", "Latn", sort_key = { from = {"à", "[èé]", "[íï]", "[òó]", "[úü]", "ç", "·"}, to = {"a", "e", "i", "o", "u", "c"} }, } m["roa-ole"] = { "Leon Kuno", 125977465, "roa-ibe", "Latn", } m["roa-opt"] = { "Galicia-Portugis Kuno", 1072111, "roa-ibe", "Latn", entry_name = {remove_diacritics = c.grave .. c.acute .. c.circ}, } m["roa-orl"] = { "Orléanais", 28497058, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-poi"] = { "Poitevin-Saintongeais", 514123, "roa-oil", "Latn", sort_key = s["roa-oil-sortkey"], } m["roa-tar"] = { "Tarantino", 695526, "roa-itd", "Latn", ancestors = "nap", wikimedia_codes = "roa-tara", } m["sai-all"] = { "Allentiac", 19570789, "sai-hrp", "Latn", } m["sai-and"] = { -- not to be confused with 'cbc' or 'ano' "Andoquero", 16828359, "sai-wit", "Latn", } m["sai-ayo"] = { "Ayomán", 16937754, "sai-jir", "Latn", } m["sai-bae"] = { "Baenan", 3401998, nil, "Latn", } m["sai-bag"] = { "Bagua", 5390321, nil, "Latn", } m["sai-bet"] = { "Betoi", 926551, "qfa-iso", "Latn", } m["sai-bor-pro"] = { "Boran Purba", nil, "sai-bor", "Latn", } m["sai-cac"] = { "Cacán", 945482, nil, "Latn", } m["sai-caq"] = { "Caranqui", 2937753, "sai-bar", "Latn", } m["sai-car-pro"] = { "Cariban Purba", 116773196, "sai-car", "Latn", type = "reconstructed", } m["sai-cat"] = { "Catacao", 5051136, "sai-ctc", "Latn", } m["sai-cer-pro"] = { "Cerrado Purba", 116773200, "sai-cer", "Latn", type = "reconstructed", } m["sai-chi"] = { "Chirino", 5390321, nil, "Latn", } m["sai-chn"] = { "Chaná", 5072718, "sai-crn", "Latn", } m["sai-chp"] = { "Chapacura", 5072884, "sai-cpc", "Latn", } m["sai-chr"] = { "Charrua", 5086680, "sai-crn", "Latn", } m["sai-chu"] = { "Churuya", 5118339, "sai-guh", 
"Latn", } m["sai-cje-pro"] = { "Jê Tengah Purba", 116773198, "sai-cje", "Latn", type = "reconstructed", } m["sai-cmg"] = { "Comechingon", 6644203, nil, "Latn", } m["sai-cno"] = { "Chono", 5104704, nil, "Latn", } m["sai-cnr"] = { "Cañari", 5055572, nil, "Latn", } m["sai-coe"] = { "Coeruna", 6425639, "sai-wit", "Latn", } m["sai-col"] = { "Colán", 5141893, "sai-ctc", "Latn", } m["sai-cop"] = { "Copallén", 5390321, nil, "Latn", } m["sai-crd"] = { "Coroado Puri", 24191321, "sai-mje", "Latn", } m["sai-ctq"] = { "Catuquinaru", 16858455, nil, "Latn", } m["sai-cul"] = { "Culli", 2879660, nil, "Latn", } m["sai-cva"] = { "Cueva", 5192644, nil, "Latn", } m["sai-esm"] = { "Esmeralda", 3058083, nil, "Latn", } m["sai-ewa"] = { "Ewarhuyana", 16898104, nil, "Latn", } m["sai-gam"] = { "Gamela", 5403661, nil, "Latn", } m["sai-gay"] = { "Gayón", 5528902, "sai-jir", "Latn", } m["sai-gmo"] = { "Guamo", 5613495, nil, "Latn", } m["sai-gue"] = { "Güenoa", 5626799, "sai-crn", "Latn", } m["sai-hau"] = { "Haush", 3128376, "sai-cho", "Latn", } m["sai-jee-pro"] = { "Jê Purba", 116773212, "sai-jee", "Latn", type = "reconstructed", } m["sai-jko"] = { "Jeikó", 6176527, "sai-mje", "Latn", } m["sai-jrj"] = { "Jirajara", 6202966, "sai-jir", "Latn", } m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc "Katembri", 6375925, nil, "Latn", } m["sai-mal"] = { "Malalí", 6741212, nil, "Latn", } m["sai-mar"] = { "Maratino", 6755055, nil, "Latn", } m["sai-mat"] = { "Matanawi", 6786047, nil, "Latn", } m["sai-mcn"] = { "Mocana", 3402048, nil, "Latn", } m["sai-men"] = { "Menien", 16890110, "sai-mje", "Latn", } m["sai-mil"] = { "Millcayac", 19573012, "sai-hrp", "Latn", } m["sai-mlb"] = { "Malibu", 3402048, nil, "Latn", } m["sai-msk"] = { "Masakará", 6782426, "sai-mje", "Latn", } m["sai-muc"] = { "Mucuchí", 6931290, nil, "Latn", } m["sai-mue"] = { "Muellama", 16886936, "sai-bar", "Latn", } m["sai-muz"] = { "Muzo", 6644203, nil, "Latn", } m["sai-mys"] = { "Maynas", 16919393, nil, "Latn", } m["sai-nat"] = { "Natú", 
9006749, nil, "Latn", } m["sai-nje-pro"] = { "Jê Utara Purba", 116773245, "sai-nje", "Latn", type = "reconstructed", } m["sai-opo"] = { "Opón", 7099152, "sai-car", "Latn", } m["sai-oto"] = { "Otomaco", 16879234, "sai-otm", "Latn", } m["sai-pal"] = { "Palta", 3042978, nil, "Latn", } m["sai-pam"] = { "Pamigua", 5908689, "sai-otm", "Latn", } m["sai-par"] = { "Paratió", 16890038, nil, "Latn", } m["sai-pnz"] = { "Panzaleo", 3123275, nil, "Latn", } m["sai-prh"] = { "Puruhá", 3410994, nil, "Latn", } m["sai-ptg"] = { "Patagón", 128807870, nil, "Latn", } m["sai-pur"] = { "Purukotó", 7261622, "sai-pem", "Latn", } m["sai-pyg"] = { "Payaguá", 7156643, "sai-guc", "Latn", } m["sai-pyk"] = { "Pykobjê", 98113977, "sai-nje", "Latn", } m["sai-qmb"] = { "Quimbaya", 7272043, nil, "Latn", } m["sai-qtm"] = { "Quitemo", 7272651, "sai-cpc", "Latn", } m["sai-rab"] = { "Rabona", 6644203, nil, "Latn", } m["sai-ram"] = { "Ramanos", 16902824, nil, "Latn", } m["sai-sac"] = { "Sácata", 5390321, nil, "Latn", } m["sai-san"] = { "Sanaviron", 16895999, nil, "Latn", } m["sai-sap"] = { "Sapará", 7420922, "sai-car", "Latn", } m["sai-sec"] = { "Sechura", 7442912, nil, "Latn", } m["sai-sin"] = { "Sinúfana", 7525275, nil, "Latn", } m["sai-sje-pro"] = { "Jê Selatan Purba", 116773814, "sai-sje", "Latn", type = "reconstructed", } m["sai-tab"] = { "Tabancale", 5390321, nil, "Latn", } m["sai-tal"] = { "Tallán", 16910468, nil, "Latn", } m["sai-tap"] = { "Tapayuna", 30719984, "sai-nje", "Latn", } m["sai-tar-pro"] = { "Taranoan Purba", 116773816, "sai-tar", "Latn", type = "reconstructed", } m["sai-teu"] = { "Teushen", 3519243, nil, "Latn", } m["sai-tim"] = { "Timote", 7806995, nil, "Latn", } m["sai-tpr"] = { "Taparita", 7684460, "sai-otm", "Latn", } m["sai-trr"] = { "Tarairiú", 7685313, nil, "Latn", } m["sai-wai"] = { "Waitaká", 16918610, nil, "Latn", } m["sai-way"] = { "Wayumara", 7960726, "sai-car", "Latn", } m["sai-wit-pro"] = { "Witotoan Purba", 116773823, "sai-wit", "Latn", type = "reconstructed", } 
m["sai-wnm"] = { "Wanham", 16879440, "sai-cpc", "Latn", } m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat "Xocó", 12953620, nil, "Latn", } m["sai-yao"] = { "Yao (Amerika Selatan)", 16979655, "sai-ven", "Latn", } m["sai-yar"] = { -- not the same family as 'suy' "Yarumá", 3505859, "sai-pek", "Latn", } m["sai-yri"] = { "Yuri", 2669157, "sai-tyu", "Latn", } m["sai-yup"] = { "Yupua", 8061430, "sai-tuc", "Latn", } m["sai-yur"] = { "Yurumanguí", 1281291, nil, "Latn", } m["sal-pro"] = { "Salish Purba", 116773269, "sal", "Latn", type = "reconstructed", } m["sdv-daj-pro"] = { "Daju Purba", 116773739, "sdv-daj", "Latn", type = "reconstructed", } m["sdv-eje-pro"] = { "Jabal Timur Purba", 116773751, "sdv-eje", "Latn", type = "reconstructed", } m["sdv-nil-pro"] = { "Nil Purba", 116773794, "sdv-nil", "Latn", type = "reconstructed", } m["sdv-nyi-pro"] = { "Nyima Purba", 116773796, "sdv-nyi", "Latn", type = "reconstructed", } m["sdv-tmn-pro"] = { "Taman Purba", 116773815, "sdv-tmn", "Latn", type = "reconstructed", } m["sel-nor"] = { "Selkup Utara", 30304565, "sel", "Cyrl", translit = "sel-nor-translit", } m["sel-pro"] = { "Selkup Purba", 128884235, "sel", "Latn", type = "reconstructed", } m["sel-sou"] = { "Selkup Selatan", 30304639, "sel", "Cyrl", } m["sem-amm"] = { "Ammun", 279181, "sem-can", "Phnx", translit = "Phnx-translit", } m["sem-amo"] = { "Amorit", 35941, "sem-nwe", "Xsux, Latn", } m["sem-cha"] = { "Chaha", 35543, "sem-eth", "Ethi", translit = "Ethi-translit", } m["sem-dad"] = { "Dadanitic", 21838040, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-dum"] = { "Dumaitic", 128810397, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-has"] = { "Hasaitic", 3541433, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-his"] = { "Hismaic", 22948260, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-mhr"] = { "Muher", 33743, "sem-eth", "Latn", } m["sem-pro"] = { "Samiah Purba", 1658554, "sem", "Latn", type = "reconstructed", } m["sem-saf"] = { "Safaitic", 
472586, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-srb"] = { "Arab Selatan Kuno", 35025, "sem-osa", "Sarb", translit = "Sarb-translit", } m["sem-tay"] = { "Taymanitic", 24912301, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-tha"] = { "Thamudic", 843030, "sem-cen", "Narb", translit = "Narb-translit", } m["sem-wes-pro"] = { "Samiah Barat Purba", 98021726, "sem-wes", "Latn", type = "reconstructed", } m["sio-pro"] = { -- NB this is not Siouan-Catawban 'nai-sca-pro' "Sioux Purba", 34181, "sio", "Latn", type = "reconstructed", } m["sit-bai-pro"] = { "Bai Purba", nil, "sit-bai", "Latn", type = "reconstructed", } m["sit-bok"] = { "Bokar", 4938727, "sit-tan", "Latn, Tibt", translit = {Tibt = "Tibt-translit"}, override_translit = true, display_text = {Tibt = s["Tibt-displaytext"]}, entry_name = {Tibt = s["Tibt-entryname"]}, sort_key = {Tibt = "Tibt-sortkey"}, } m["sit-cai"] = { "Caijia", 5017528, "sit-cln", "Latn" } m["sit-cha"] = { "Chairel", 5068066, "sit-luu", "Latn", } m["sit-hrs-pro"] = { "Hrusish Purba", 116773762, "sit-hrs", "Latn", type = "reconstructed", } m["sit-jap"] = { "Japhug", 3162245, "sit-rgy", "Latn", } m["sit-kha-pro"] = { "Kham Purba", 116773773, "sit-kha", "Latn", type = "reconstructed", } m["sit-liz"] = { "Lizu", 6660653, "sit-qia", "Latn", -- and Ersu Shaba } m["sit-lnj"] = { "Longjia", 17096251, "sit-cln", "Latn" } m["sit-lrn"] = { "Luren", 16946370, "sit-cln", "Latn" } m["sit-luu-pro"] = { "Luish Purba", 116773783, "sit-luu", "Latn", type = "reconstructed", } m["sit-prn"] = { "Puiron", 7259048, "sit-zem", } m["sit-pro"] = { "Sino-Tibet Purba", 45961, "sit", "Latn", type = "reconstructed", } m["sit-sit"] = { "Situ", 19840830, "sit-rgy", "Latn", } m["sit-tam-pro"] = { "Tamang Purba", 117469295, "sit-tam", "Latn", type = "reconstructed", } m["sit-tan-pro"] = { "Tani Purba", 116773284, "sit-tan", "Latn", -- needs verification type = "reconstructed", } m["sit-tgm"] = { "Tangam", 17041370, "sit-tan", "Latn", } m["sit-tos"] = { 
"Tosu", 7827899, "sit-qia", "Latn", -- also Ersu Shaba } m["sit-tsh"] = { "Tshobdun", 19840950, "sit-rgy", "Latn", } m["sit-zbu"] = { "Zbu", 19841106, "sit-rgy", "Latn", } m["sla-pro"] = { "Slavik Purba", 747537, "sla", "Latn", type = "reconstructed", entry_name = { remove_diacritics = c.grave .. c.acute .. c.tilde .. c.macron .. c.dgrave .. c.invbreve, remove_exceptions = {'ś'}, }, sort_key = { from = {"č", "ď", "ě", "ę", "ь", "ľ", "ň", "ǫ", "ř", "š", "ś", "ť", "ъ", "ž"}, to = {"c²", "d²", "e²", "e³", "i²", "l²", "nj", "o²", "r²", "s²", "s³", "t²", "u²", "z²"}, } } m["smi-pro"] = { "Sami Purba", 7251862, "smi", "Latn", type = "reconstructed", sort_key = { from = {"ā", "č", "δ", "[ëē]", "ŋ", "ń", "ō", "š", "θ", "%([^()]+%)"}, to = {"a", "c²", "d", "e", "n²", "n³", "o", "s²", "t²"} }, } m["son-pro"] = { "Songhay Purba", 116773277, "son", "Latn", type = "reconstructed", } m["sqj-pro"] = { "Albania Purba", 18210846, "sqj", "Latn", type = "reconstructed", } m["ssa-klk-pro"] = { "Kuliak Purba", 116773779, "ssa-klk", "Latn", type = "reconstructed", } m["ssa-kom-pro"] = { "Koman Purba", 116773775, "ssa-kom", "Latn", type = "reconstructed", } m["ssa-pro"] = { "Nilo-Sahara Purba", 116773236, "ssa", "Latn", type = "reconstructed", } m["syd-fne"] = { "Forest Nenets", 1295107, "syd", "Cyrl", translit = "syd-fne-translit", entry_name = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.breve .. 
c.dotabove}, } m["syd-pro"] = { "Samoyed Purba", 7251863, "syd", "Latn", type = "reconstructed", } m["tai-pro"] = { "Tai Purba", 6583709, "tai", "Latn", type = "reconstructed", } m["tai-swe-pro"] = { "Tai Barat Daya Purba", 116773280, "tai-swe", "Latn", type = "reconstructed", } m["tbq-bdg-pro"] = { "Bodo-Garo Purba", 116773195, "tbq-bdg", "Latn", type = "reconstructed", } m["tbq-blg"] = { "Bailang", 2879843, "tbq-lob", "Hani", sort_key = "Hani-sortkey", } m["tbq-gkh"] = { "Gokhy", 5578069, "tbq-sil", "Latn", } m["tbq-kuk-pro"] = { "Kukish Purba", 116773220, "tbq-kuk", "Latn", type = "reconstructed", } m["tbq-lal-pro"] = { "Lalo Purba", 116773781, "tbq-lal", "Latn", type = "reconstructed", } m["tbq-laz"] = { "Laze", 17007626, "sit-nas", "Latn", } m["tbq-lob-pro"] = { "Lolo-Burma Purba", 116773224, "tbq-lob", "Latn", type = "reconstructed", } m["tbq-lol-pro"] = { "Lolo Purba", 7251855, "tbq-lol", "Latn", type = "reconstructed", } m["tbq-mil"] = { "Milang", 6850761, "sit-gsi", "Deva, Latn", } m["tbq-mor"] = { "Moran", 6909216, "tbq-bdg", "Latn", } m["tbq-ngo"] = { "Ngochang", 56582, "tbq-brm", "Latn", } -- tbq-pro is now etymology-only m["trk-dkh"] = { "Dukhan", 12809273, "trk-ssb", "Latn, Cyrl, Mong", translit = {Mong = "Mong-translit"}, display_text = {Mong = s["Mong-displaytext"]}, entry_name = {Mong = s["Mong-entryname"]}, } m["trk-oat"] = { "Turki Anatolia Kuno", 7083390, "trk-ogz", "ota-Arab", entry_name = {["ota-Arab"] = "ar-entryname"}, } m["trk-pro"] = { "Turk Purba", 3657773, "trk", "Latn", type = "reconstructed", } m["tup-gua-pro"] = { "Tupi-Guarani Purba", 116773288, "tup-gua", "Latn", type = "reconstructed", } m["tup-kab"] = { "Kabishiana", 15302988, "tup", "Latn", } m["tup-pro"] = { "Tupi Purba", 10354700, "tup", "Latn", type = "reconstructed", } m["tuw-alk"] = { "Alchuka", 113553616, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-bal"] = { "Bala", 86730632, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } 
m["tuw-kkl"] = { "Kyakala", 118875708, "tuw-jrc", "Latn, Hans", sort_key = {Hans = "Hani-sortkey"}, } m["tuw-kli"] = { "Kili", 6406892, "tuw-ewe", "Cyrl", } m["tuw-pro"] = { "Tungus Purba", 85872335, "tuw", "Latn", type = "reconstructed", } m["tuw-sol"] = { "Solon", 30004, "tuw-ewe", } m["urj-fin-pro"] = { "Finnik Purba", 11883720, "urj-fin", "Latn", type = "reconstructed", } m["urj-koo"] = { "Komi Kuno", 86679962, "urj-prm", "Perm, Cyrs", translit = "urj-koo-translit", sort_key = {Cyrs = s["Cyrs-sortkey"]}, } m["urj-kuk"] = { "Kukkuzi", 107410460, "urj-fin", "Latn", ancestors = "vot", } m["urj-kya"] = { "Komi-Yazva", 2365210, "urj-prm", "Cyrl", translit = "kv-translit", override_translit = true, entry_name = {remove_diacritics = c.acute}, } m["urj-mdv-pro"] = { "Mordvin Purba", 116773232, "urj-mdv", "Latn", type = "reconstructed", } m["urj-prm-pro"] = { "Perm Purba", 116773257, "urj-prm", "Latn", type = "reconstructed", } m["urj-pro"] = { "Ural Purba", 288765, "urj", "Latn", type = "reconstructed", } m["urj-ugr-pro"] = { "Ugri Purba", 156631, "urj-ugr", "Latn", type = "reconstructed", } m["xnd-pro"] = { "Na-Dene Purba", 116773233, "xnd", "Latn", type = "reconstructed", } m["xgn-pro"] = { "Mongol Purba", 2493677, "xgn", "Latn", type = "reconstructed", sort_key = { from = {"č", "i", "ï", "ǰ", "ŋ", "ö", "š", "ü"}, to = {"c", "i" .. p[1], "i", "j", "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. 
p[1]}, }, } m["yok-bvy"] = { "Yokuts Buena Vista", 4985474, "yok", "Latn", } m["yok-dly"] = { "Yokuts Delta", 70923266, "yok", "Latn", } m["yok-gsy"] = { "Gashowu", 3098708, "yok", "Latn", } m["yok-kry"] = { "Yokuts Sungai Kings", 6413014, "yok", "Latn", } m["yok-nvy"] = { "Yokuts Lembah Utara", 85789777, "yok", "Latn", } m["yok-ply"] = { "Palewyami", 2387391, "yok", "Latn", } m["yok-svy"] = { "Yokuts Lembah Selatan", 12642473, "yok", "Latn", } m["yok-tky"] = { "Yokuts Tule-Kaweah", 7851988, "yok", "Latn", } m["ypk-pro"] = { "Yupik Purba", 116773295, "ypk", "Latn", type = "reconstructed", } m["zhx-min-pro"] = { "Min Purba", 19646347, "zhx-min", "Latn", type = "reconstructed", } m["zhx-sht"] = { "Shaozhou Tuhua", 1920769, "zhx", "Nshu, Hants", generate_forms = "zh-generateforms", sort_key = {Hani = "Hani-sortkey"}, } m["zhx-sic"] = { "Sichuan", 2278732, "zhx-man", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zhx-tai"] = { "Taishan", 2208940, "zhx-yue", "Hants", generate_forms = "zh-generateforms", translit = "zh-translit", sort_key = "Hani-sortkey", } m["zlw-mas"] = { "Masurian", 489691, "zlw-lch", "Latn", ancestors = "zlw-opl", } m["zle-ono"] = { "Novgorodia Kuno", 162013, "zle", "Cyrs, Glag", translit = {Cyrs = "Cyrs-translit", Glag = "Glag-translit"}, entry_name = {Cyrs = s["Cyrs-entryname"]}, sort_key = {Cyrs = s["Cyrs-sortkey"]}, } m["zle-ort"] = { "Ruthenia Kuno", 13211, "zle", "Cyrs", ancestors = "orv", translit = "zle-ort-translit", entry_name = { remove_diacritics = s["Cyrs-entryname"].remove_diacritics, remove_exceptions = {"Ї", "ї"} }, sort_key = s["Cyrs-sortkey"], } m["zlw-ocs"] = { "Czech Kuno", 593096, "zlw", "Latn", } m["zlw-opl"] = { "Poland Kuno", 149838, "zlw-lch", "Latn", entry_name = {remove_diacritics = c.ringabove}, } m["zlw-osk"] = { "Slovak Kuno", 12776676, "zlw", "Latn", } m["zlw-slv"] = { "Slovincia", 36822, "zlw-pom", "Latn", entry_name = "zlw-slv-entryname" } m["zlm-coa"] = { 
"Melayu Terengganu Pesisir", 4207412, "poz-mly", "Latn, ms-Arab", } m["zlm-pah"] = { "Melayu Pahang", 7310370, "poz-mly", "Latn", } return require("Module:languages").finalizeData(m, "language") p5xmcz5s172d3fd3gtp8bq8eg2xvkws Modul:languages/data/exceptional/extra 828 33778 281317 245803 2026-04-21T19:37:20Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/89762527|89762527]]) (perlu semakan semula) 281317 Scribunto text/plain local m = {} m["aav-khs-pro"] = { aliases = {"Proto-Khasic"}, } m["aav-nic-pro"] = { } m["aav-pkl-pro"] = { } m["aav-pro"] = { -- mkh-pro will merge into this. } m["afa-pro"] = { aliases = {"Proto-Afro-Asiatic", "Hamito-Semitic"}, } m["alg-aga"] = { aliases = {"Agwam", "Agaam"}, } m["alg-pro"] = { } m["alv-ama"] = { } m["alv-bgu"] = { aliases = {"Baïnounk Gubëeher", -- Wikipedia's name "Gubeeher-Gufangor-Gubelor", -- Glottolog's name, "Gubëeher", "Nyun Gubëeher", "Nun Gubëeher"}, -- N(y)un appears to be the family name varieties = {"Gubeeher", "Gufangor", "Gubelor"}, } m["alv-bua-pro"] = { } m["alv-cng-pro"] = { } m["alv-edk-pro"] = { } m["alv-edo-pro"] = { } m["alv-fli-pro"] = { } m["alv-gbe-pro"] = { } m["alv-gng-pro"] = { } m["alv-gtm-pro"] = { aliases = {"Proto-Ghana-Togo Mountain"}, } m["alv-gwa"] = { } m["alv-hei-pro"] = { } m["alv-ido-pro"] = { } m["alv-igb-pro"] = { } m["alv-kwa-pro"] = { } m["alv-mum-pro"] = { } m["alv-nup-pro"] = { } m["alv-pro"] = { } m["alv-von-pro"] = { } m["alv-yor-pro"] = { } m["alv-yrd-pro"] = { } m["apa-pro"] = { aliases = {"Proto-Apache", "Proto-Southern Athabaskan"}, } m["aql-pro"] = { } m["art-adu"] = { aliases = {"Westron"}, } m["art-bel"] = { } m["art-blk"] = { } m["art-bsp"] = { } m["art-com"] = { } m["art-dtk"] = { } m["art-elo"] = { } m["art-gld"] = { } m["art-lap"] = { } m["art-man"] = { } m["art-mun"] = { } m["art-nav"] = { } m["art-vlh"] = { } m["ath-nic"] = { } m["ath-pro"] = { } m["auf-pro"] = { aliases = {"Proto-Arawan", "Proto-Arauan"}, } 
m["aus-alu"] = { other_names = {"Ogh-Alungul", "Alngula"}, } m["aus-and"] = { aliases = {"Adithinngithigh"}, } m["aus-ang"] = { other_names = {"Ogh-Anggula", "Anggula", "Ogh-Anggul", "Anggul"}, } m["aus-arn-pro"] = { } m["aus-bra"] = { aliases = {"Barranbinja", "Baranbinya", "Burranbinya", "Burrumbiniya", "Burrunbinya", "Barrumbinya", "Barren-binya", "Parran-binye"}, } m["aus-brm"] = { } m["aus-cww-pro"] = { } m["aus-dal-pro"] = { } m["aus-guw"] = { other_names = {"Gowar", "Goowar", "Gooar", "Guar", "Gowr-burra", "Ngugi", "Mugee", "Wogee", "Gnoogee", "Chunchiburri", "Booroo-geen-merrie"}, } m["aus-lsw"] = { aliases = {"Little Swanport Tasmanian"}, } m["aus-mbi"] = { other_names = {"Mbeiwum"}, } m["aus-ngk"] = { other_names = {"Ngkot", "Nggoth"}, } m["aus-nyu-pro"] = { } m["aus-pam-pro"] = { } m["aus-tul"] = { other_names = {"Dappil", "Dapil", "Toolooa", "Dulua", "Narung", "Dandan"}, } m["aus-uwi"] = { other_names = {"Uwinjmil"}, } m["aus-wdj-pro"] = { } m["aus-won"] = { } m["aus-wul"] = { other_names = {"Manbara", "Wulgurugaba", "Wulgurukaba", "Nhawalgaba"}, } m["aus-ynk"] = { -- contrast nny } m["awd-amc-pro"] = { other_names = {"Western Maipuran"}, } m["awd-kmp-pro"] = { other_names = {"Campa", "Kampan", "Campan", "Pre-Andine Maipurean"}, } m["awd-prw-pro"] = { other_names = {"Paresí-Waurá", "Parecí–Xingú", "Paresí–Xingu", "Central Arawak", "Central Maipurean"}, } m["awd-ama"] = { } m["awd-ana"] = { aliases = {"Anauya"}, } m["awd-apo"] = { other_names = {"Lapachu"}, } m["awd-cab"] = { aliases = {"Cabere", "Cávere", "Cavere"}, } m["awd-gnu"] = { other_names = {"Guinao", "Inao", "Guniare", "Quinhau", "Guiano"}, } m["awd-kar"] = { aliases = {"Kariaí", "Kariai", "Cariyai", "Carihiahy"}, } m["awd-kaw"] = { aliases = {"Cawishana", "Cayuishana", "Kaishana", "Cauixana"}, } m["awd-kus"] = { aliases = {"Kustenaú", "Custenau", "Kutenabu"}, } m["awd-man"] = { } m["awd-mar"] = { aliases = {"Marawán"}, } m["awd-mpr"] = { aliases = {"Maypure", "Mejepure"}, } m["awd-mrt"] = { 
aliases = {"Mariate"}, } m["awd-nwk-pro"] = { aliases = {"Proto-Newiki"}, } m["awd-pai"] = { aliases = {"Paiconeca", "Paikone", "Paicone"}, } m["awd-pas"] = { aliases = {"Passé", "Pazé"}, } m["awd-pro"] = { other_names = {"Proto-Arawakan", "Proto-Maipurean", "Proto-Maipuran"}, } m["awd-she"] = { aliases = {"Shebaya", "Shebaye"}, } m["awd-taa-pro"] = { other_names = {"Proto-Ta-Arawakan", "Proto-Caribbean Northern Arawak"}, } m["awd-wai"] = { other_names = {"Wainuma", "Wai", "Waima", "Wainumi", "Wainambí", "Waiwana", "Waipi", "Yanuma"}, } m["awd-yum"] = { aliases = {"Jumana"}, } m["azc-caz"] = { aliases = {"Caxcan", "Kaskán"}, } m["azc-cup-pro"] = { } m["azc-ktn"] = { aliases = {"Gitanemuk"}, } m["azc-nah-pro"] = { } m["azc-num-pro"] = { } m["azc-pro"] = { } m["azc-tak-pro"] = { } m["azc-tat"] = { } m["ber-fog"] = { other_names = {"El-Fogaha", "El-Foqaha", "Foqaha", "Fuqaha"}, } m["ber-pro"] = { } m["ber-zuw"] = { } m["bnt-bal"] = { } m["bnt-bon"] = { } m["bnt-boy"] = { } m["bnt-bwa"] = { } m["bnt-cmw"] = { other_names = {"Bravanese", "Mwiini", "Mwini", "Chimwini", "Chimini", "Brava"}, } m["bnt-ind"] = { other_names = {"Kɔlɔmɔnyi", "Kɔlɛ", "Kasaï Oriental"}, } m["bnt-lal"] = { } m["bnt-mpi"] = { } m["bnt-mpu"] = { } m["bnt-ngu-pro"] = { } m["bnt-phu"] = { aliases = {"Siphuthi"}, } m["bnt-pro"] = { } m["bnt-sab-pro"] = { } m["bnt-sbo"] = { } m["bnt-sts-pro"] = { } m["btk-pro"] = { } m["cau-abz-pro"] = { other_names = {"Proto-Abazgi", "Proto-Abkhaz-Tapanta"}, } m["cau-and-pro"] = { aliases = {"Proto-Andi", "Proto-Andic"}, } m["cau-ava-pro"] = { aliases = {"Proto-Avar-Andian", "Proto-Avar-Andi", "Proto-Avar-Andic"}, } m["cau-cir-pro"] = { other_names = {"Proto-Adyghe-Kabardian", "Proto-Adyghe-Circassian"}, } m["cau-drg-pro"] = { other_names = {"Proto-Dargin"}, } m["cau-lzg-pro"] = { aliases = {"Proto-Lezgi", "Proto-Lezgian", "Proto-Lezgic"}, } m["cau-nec-pro"] = { } m["cau-nkh-pro"] = { } m["cau-nwc-pro"] = { } m["cau-tsz-pro"] = { other_names = {"Proto-Tsezic", 
"Proto-Didoic"}, } m["cba-ata"] = { other_names = {"Atanque", "Cancuamo", "Kankuamo", "Kankwe", "Kankuí", "Atanke"}, } m["cba-cat"] = { other_names = {"Catio Chibcha", "Old Catio"}, } m["cba-dor"] = { other_names = {"Chumulu", "Changuena", "Changuina", "Chánguena", "Gualaca"}, } m["cba-dui"] = { } m["cba-hue"] = { other_names = {"Güetar", "Guetar", "Brusela"}, } m["cba-nut"] = { other_names = {"Nutabane"}, } m["cba-pro"] = { } m["ccs-pro"] = { } m["ccs-gzn-pro"] = { aliases = {"Proto-Karto-Zan"}, } m["cdc-cbm-pro"] = { aliases = {"Proto-Central-Chadic", "Proto-Biu-Mandara"}, } m["cdc-mas-pro"] = { } m["cdc-pro"] = { } m["cdd-pro"] = { } m["cel-bry-pro"] = { aliases = {"Proto-Brittonic", "Common Brythonic", "Common Brittonic"}, } m["cel-gal"] = { } m["cel-gau"] = { } m["cel-pro"] = { } m["chi-pro"] = { } m["chm-pro"] = { } m["cmc-pro"] = { } m["crp-bip"] = { } m["crp-gep"] = { aliases = {"Greenlandic Pidgin", "Greenlandic Eskimo Pidgin"}, } m["crp-kia"] = { aliases = {"Kiautschou Pidgin German"}, } m["crp-mar"] = { other_names = {"Jamaican Maroon Spirit Possession Language"}, } m["crp-mpp"] = { aliases = {"Macao Pidgin Portuguese"}, } m["crp-rsn"] = { } m["crp-slb"] = { other_names = {"Solombala-English", "Solombala English-Russian Pidgin"}, } m["crp-spp"] = { } m["crp-tpr"] = { } m["csu-bba-pro"] = { } m["csu-maa-pro"] = { } m["csu-pro"] = { } m["csu-sar-pro"] = { } m["cus-ash"] = { other_names = {"Ashraf", "Af-Ashraaf"}, varieties = { {"Marka, Lower Shabelle"}, "Shingani"}, } m["cus-hec-pro"] = { } m["cus-som-pro"] = { aliases = {"Proto-Sam", "Proto-Macro-Somali"}, } m["cus-sou-pro"] = { other_names = {"Proto-Rift"}, } m["cus-pro"] = { } m["dmn-dam"] = { } m["dra-bry"] = { aliases = {"Byari"}, } m["dra-cen-pro"] = { } m["dra-mkn"] = { aliases = {"Nadugannada"}, } m["dra-nor-pro"] = { } m["dra-okn"] = { aliases = {"Halegannada"}, } m["dra-ote"] = { } m["dra-pro"] = { } m["dra-sdo-pro"] = { aliases = {"Proto-South Dravidian"}, } m["dra-sdt-pro"] = { aliases = 
{"Proto-South-Central Dravidian"}, } m["dra-sou-pro"] = { aliases = {"Proto-Southern Dravidian"}, } m["egx-dem"] = { aliases = {"Demotic", "Enchorial"}, } m["dmn-pro"] = { } m["dmn-mdw-pro"] = { } m["dru-pro"] = { } m["ero-gsz"] = { } m["ero-nya"] = { } m["ero-tau"] = { other_names = {"Rtau"}, } m["esx-esk-pro"] = { } m["esx-ink"] = { } m["esx-inq"] = { } m["esx-inu-pro"] = { } m["esx-pro"] = { } m["esx-tut"] = { } m["euq-pro"] = { aliases = {"Proto-Vasconic"}, } m["gba-pro"] = { } m["gem-pro"] = { aliases = {"Common Germanic"}, } m["gme-bur"] = { aliases = {"Burgundish", "Burgundic"}, } m["gme-cgo"] = { } m["gmq-gut"] = { } m["gmq-jmk"] = { aliases = {"Jamtlandic"}, } m["gmq-mno"] = { } m["gmq-oda"] = { } m["gmq-ogt"] = { aliases = {"Old Gotlandic"}, } m["gmq-osw"] = { } m["gmq-pro"] = { aliases = {"Proto-Scandinavian", "Primitive Norse", "Proto-Nordic", "Ancient Nordic", "Ancient Scandinavian", "Old Nordic", "Old Scandinavian", "Proto-North Germanic", "North Proto-Germanic", "Common Scandinavian"}, } m["gmq-scy"] = { } m["gmw-bgh"] = { } m["gmw-cfr"] = { varieties = {"Mittelfränkisch", "Ripuarian", "Moselle Franconian", "Colognian", "Kölsch"}, } m["gmw-ecg"] = { varieties = {"Thuringian", "Thüringisch", "Upper Saxon", "Upper Saxon German", "Obersächsisch", "Lusatian", "Erzgebirgisch", "Silesian", "Silesian German", "High Prussian"}, } m["gmw-fin"] = { aliases = {"Fingal"}, } m["gmw-gts"] = { aliases = {"Gottscheerisch"}, } m["gmw-jdt"] = { } m["gmw-msc"] = { } m["gmw-pro"] = { } m["gmw-rfr"] = { aliases = {"Rheinfränkisch", "Rhenish Franconian"}, varieties = {"Hessian", "Lorraine Franconian", "Lorrainian", "Lothringisch", "Palatine German", "Pfälzisch", "Pälzisch", "Palatinate German"}, } m["gmw-stm"] = { aliases = {"Satu Mare Swabian", "Sathmarschwäbisch", "Sathmarisch"}, } m["gmw-tsx"] = { aliases = {"Siebenbürger Saxon"}, } m["gmw-vog"] = { } m["gmw-zps"] = { aliases = {"Zipser", "Zipserisch", "Outzäpsersch"}, } m["gn-cls"] = { } m["grk-cal"] = { aliases = 
{"Italian Greek", "Bova"}, } m["grk-ita"] = { aliases = {"Griko", "Grico", "Grecanic"}, } m["grk-mar"] = { aliases = {"Mariupolitan Greek", "Rumeíka", "Rumeika"}, } m["grk-pro"] = { aliases = {"Proto-Greek"}, } m["hmn-pro"] = { } m["hmx-mie-pro"] = { } m["hmx-pro"] = { } m["hyx-pro"] = { } m["iir-nur-pro"] = { } m["iir-pro"] = { } m["ijo-pro"] = { aliases = {"Proto-Ijaw"}, } m["inc-apa"] = { aliases = {"Apabhraṃśa"}, } m["inc-ash"] = { aliases = {"Asokan Prakrit", "Aśokan Prakrit"}, } m["inc-dng-pro"] = { } m["inc-kam"] = { } m["inc-kho"] = { } m["inc-krd-pro"] = { aliases = {"Proto-Kamata"}, } m["inc-mas"] = { } m["inc-mbn"] = { } m["inc-mgu"] = { } m["inc-mor"] = { aliases = {"Middle Oriya"}, } m["inc-oas"] = { } m["inc-oaw"] = { aliases = {"Early Awadhi"}, } m["inc-obn"] = { } m["inc-ogu"] = { aliases = {"Old Western Rajasthani"}, } m["inc-ohi"] = { aliases = {"Dehlavi"}, } m["inc-oor"] = { aliases = {"Old Oriya"}, } m["inc-opa"] = { } m["inc-pro"] = { } m["ine-ana-pro"] = { } m["ine-bsl-pro"] = { } m["ine-kal"] = { aliases = {"Kalasma", "Kalashma", "Kalašmaic", "Kalasmaic", "Kalašmian", "Kalasmian"}, } m["ine-pae"] = { } m["ine-pro"] = { } m["ine-toc-pro"] = { } m["xme-old"] = { } m["xme-mid"] = { aliases = {"Atropatenian"}, } m["xme-ker"] = { other_names = {"Kermanian", "Central Iranian Dialects", "Central Plateau Dialects", "Central Iranian", "South Median", "Gazi", "Soi", "Sohi", "Abuzeydabadi", "Abyanehi", "Farizandi", "Jowshaqani", "Nashalji", "Qohrudi", "Yarandi", "Tari", "Sedehi", "Ardestani", "Zefrehi", "Isfahani", "Kafroni", "Varzenehi", "Khuri", "Nayini", "Anaraki", "Zoroastrian Dari", "Behdināni", "Behdinani", "Gabri", "Gavrŭni", "Gavruni", "Gabrōni", "Gabroni", "Kermani", "Yazdi", "Bidhandi", "Bijagani", "Chimehi", "Hanjani", "Komjani", "Naraqi", "Qalhari", "Varani", "Zori"}, } m["xme-taf"] = { } m["xme-ttc-pro"] = { } m["xme-kls"] = { aliases = {"Kalāsuri", "Kalasur", "Kalāsur"}, } m["xme-klt"] = { } m["xme-ott"] = { other_names = {"Old Tatic", 
"Old Azeri", "Azari", "Azeri", "Āḏarī", "Adari", "Adhari"}, } m["ira-kms-pro"] = { } m["ira-mpr-pro"] = { } m["ira-pat-pro"] = { } m["ira-pro"] = { } m["ira-zgr-pro"] = { } m["xsc-pro"] = { } m["xsc-sar-pro"] = { } m["xsc-skw-pro"] = { } m["xsc-sak-pro"] = { aliases = {"Proto-Sakan", "Proto-Tumshuqese-Khotanese"}, } m["ira-sym-pro"] = { } m["ira-sgi-pro"] = { } m["ira-mny-pro"] = { } m["ira-shy-pro"] = { } m["ira-shr-pro"] = { } m["ira-sgc-pro"] = { aliases = {"Proto-Sogdian"}, } m["ira-wnj"] = { aliases = {"Old Vanji", "Vanchi", "Vanži", "Wanji"}, } m["iro-ere"] = { } m["iro-min"] = { } m["iro-nor-pro"] = { } m["iro-pro"] = { } m["itc-pro"] = { } m["itc-psa"] = { } m["jpx-hcj"] = { aliases = {"Hachijo"}, } m["jpx-pro"] = { } m["jpx-ryu-pro"] = { } m["kar-pro"] = { } m["kca-eas"] = { } m["kca-nor"] = { } m["kca-pro"] = { } m["kca-sou"] = { } m["khi-kho-pro"] = { } m["khi-kun"] = { other_names = {"ǃOǃKung", "ǃ'OǃKung", "Kung", "Ekoka ǃKung", "Ekoka Kung", "Sekele"}, } m["ko-ear"] = { } m["kro-pro"] = { } m["ku-pro"] = { } m["map-ata-pro"] = { } m["map-bms"] = { } m["map-pro"] = { } m["mis-hkl"] = { aliases = {"Kelantan Peranakan Chinese", "Hokkien Kelantan", "Kelantan Local Hokkien"} } m["mis-idn"] = { } m["mis-isa"] = { } m["mis-jie"] = { aliases = {"Chieh", "Kjet"}, } m["mis-jzh"] = { aliases = {"Haihua"}, } m["mis-kas"] = { aliases = {"Cassite", "Kassitic", "Kaššite"}, } m["mis-mmd"] = { other_names = {"Mimi of Gaudefroy-Demombynes", "Mimi-D"}, } m["mis-mmn"] = { other_names = {"Mimi-N"}, } m["mis-phi"] = { aliases = {"Philistian", "Philistinian"}, } m["mis-rou"] = { aliases = {"Ruanruan", "Ruan-ruan", "Juan-juan"}, } m["mis-tdl"] = { aliases = {"Turduli"}, } m["mis-tdt"] = { aliases = {"Turdetani"}, } m["mis-tnw"] = { aliases = {"Tangwanghua"}, } m["mis-tuh"] = { aliases = {"'Azha"}, } m["mis-tuo"] = { aliases = {"Tabghach", "Taghbach"}, } m["mis-wuh"] = { aliases = {"Wuwan", "Awar"}, } m["mis-xbi"] = { aliases = {"Serbi", "Shirwi"}, } m["mis-xnu"] = { aliases = 
{"Hsiung-nu", "Hiong-nu"}, } m["mjg-mgl"] = { aliases = {"Huzhu", "Huzhu Monguor"}, } m["mjg-mgr"] = { aliases = {"Minhe", "Minhe Monguor"}, } m["mkh-asl-pro"] = { } m["mkh-ban-pro"] = { } m["mkh-kat-pro"] = { } m["mkh-khm-pro"] = { } m["mkh-kmr-pro"] = { } m["mkh-mmn"] = { } m["mkh-mnc-pro"] = { } m["mkh-mvi"] = { } m["mkh-pal-pro"] = { } m["mkh-pea-pro"] = { } m["mkh-pkn-pro"] = { } m["mkh-pro"] = { --This will be merged into 2015 aav-pro. } m["mnw-tha"] = { aliases = {"Raman", "Thai Raman", "Siamese Mon"}, } m["mkh-vie-pro"] = { } m["mns-cen"] = { } m["mns-nor"] = { } m["mns-pro"] = { } m["mns-sou"] = { } m["mun-pro"] = { aliases = {"Proto-Mundan"}, } m["myn-chl"] = { -- the stage after ''emy'' other_names = {"Cholti", "Colonial Ch'olti'", "Colonial Cholti"}, } m["myn-pro"] = { aliases = {"Proto-Maya"}, } m["nai-ala"] = { other_names = {"Alasapa", "Pinto"}, } m["nai-bay"] = { other_names = {"Bayougoula", "Bayou Goula", "Ischenoca"}, -- tribe merged with "Mougulasha", "Mongoulacha", "Mugulasha", "Mougulasha", "Muglahsa", "Muglasha", "Muguasha", "Imongolosha", "Houma", "Acolapissa" } m["nai-cal"] = { } m["nai-chi"] = { } m["nai-chu-pro"] = { aliases = {"Proto-Chumashan"}, } m["nai-cig"] = { } m["nai-ckn-pro"] = { aliases = {"Proto-Chinook"}, } m["nai-guz"] = { aliases = {"Guazacapan"}, } m["nai-hit"] = { other_names = {"Atcik-hata", "At-pasha-shliha"}, } m["nai-ipa"] = { other_names = {"'Iipay 'aa", "Northern Diegueño", "Diegueño"}, } m["nai-jtp"] = { other_names = {"Xutiapa", "Jalapa", "Xalapa"}, } m["nai-jum"] = { aliases = {"Jumaitepeque", "Jumaytepec"}, } m["nai-kat"] = { other_names = {"Kathlamet Chinook"}, } m["nai-klp-pro"] = { } m["nai-knm"] = { } m["nai-kum"] = { other_names = {"Kumiai", "Central Diegueño", "Diegueño"}, } m["nai-mac"] = { aliases = {"Macorís", "Macorix", "Mazorij", "Mazorig", "Mazoriges"}, } m["nai-mdu-pro"] = { aliases = {"Proto-Maiduan"}, } m["nai-miz-pro"] = { aliases = {"Proto-Mixe-Zoquean"}, } m["nai-mus-pro"] = { aliases = 
{"Proto-Muskhogean", "Proto-Muskogee"}, } m["nai-nao"] = { } m["nai-nrs"] = { } m["nai-okw"] = { } m["nai-per"] = { } m["nai-pic"] = { } m["nai-plp-pro"] = { } m["nai-pom-pro"] = { aliases = {"Proto-Pomoan"}, } m["nai-qng"] = { } m["nai-sca-pro"] = { -- NB 'sio-pro' "Proto-Siouan" which is Proto-Western Siouan } m["nai-sin"] = { aliases = {"Sinacantan", "Zinacantán", "Zinacantan"}, } m["nai-sln"] = { } m["nai-spt"] = { aliases = {"Shahaptin"}, } m["nai-tap"] = { other_names = {"Tapachulteca", "Tapachulteco", "Tapachula"}, } m["nai-taw"] = { } m["nai-teq"] = { other_names = {"Tequistlateco", "Tequistlateca", "Chontal", "Chontol of Oaxaca", "Oaxaca Chontal", "Oaxacan Chontal"}, } m["nai-tip"] = { other_names = {"Tipay", "Tiipai", "Tiipay", "Jamul Tiipay", "Southern Digueño", "Diegueño"}, } m["nai-tot-pro"] = { } m["nai-tsi-pro"] = { } m["nai-utn-pro"] = { other_names = {"Proto-Miwok-Costanoan"}, } m["nai-wai"] = { aliases = {"Guaycura", "Waicura"}, } m["nai-wji"] = { other_names = {"Jicaque of El Palmar", "Sula"}, } m["nai-yup"] = { aliases = {"Jupiltepeque", "Yupiltepec", "Jupiltepec", "Xupiltepec"}, } m["nan-dat"] = { aliases = {"Datian"}, } m["nan-hbl"] = { aliases = {"Hokkienese", "Quanzhang", "Fukien", "Banlam", "Banlamese", "Ban-lam"}, } m["nan-hlh"] = { aliases = {"Hailufeng", "Hoklo Min", "Hai Lok Hong"}, } m["nan-lnx"] = { aliases = {"Longyan", "Liongna"}, } m["nan-tws"] = { aliases = {"Teochew Min", "Chiuchow", "Teo-Swa", "Teo-Swa Min", "Tio-Sua"}, } m["nan-zhe"] = { aliases = {"Zhenan"}, } m["nan-zsh"] = { aliases = {"Sanxiang", "Samheung", "Sahiu"}, } m["ngf-pro"] = { } m["nic-bco-pro"] = { } m["nic-bod-pro"] = { } m["nic-eov-pro"] = { } m["nic-gns-pro"] = { } m["nic-grf-pro"] = { } m["nic-gur-pro"] = { } m["nic-jkn-pro"] = { } m["nic-lcr-pro"] = { } m["nic-ogo-pro"] = { } m["nic-ovo-pro"] = { } m["nic-plt-pro"] = { } m["nic-pro"] = { } m["nic-ubg-pro"] = { } m["nic-ucr-pro"] = { } m["nic-vco-pro"] = { } m["njo-jgl"] = { } m["nub-har"] = { aliases = 
{"Ḥarāza"}, } m["nub-pro"] = { } m["omq-cha-pro"] = { } m["omq-maz-pro"] = { aliases = {"Proto-Mazatecan"}, } m["omq-mix-pro"] = { } m["omq-mxt-pro"] = { } m["omq-otp-pro"] = { } m["omq-pro"] = { aliases = {"Proto-Otomanguean", "Proto-Oto-Mangue"}, } m["omq-sjq"] = { } m["omq-tel"] = { } m["omq-teo"] = { } m["omq-tri-pro"] = { aliases = {"Proto-Trique"}, } m["omq-zap-pro"] = { } m["omq-zpc-pro"] = { } m["omv-aro-pro"] = { } m["omv-diz-pro"] = { aliases = {"Proto-Maji"}, } m["omv-pro"] = { } m["oto-otm-pro"] = { } m["oto-pro"] = { } m["ngf-bin-pro"] = { } m["paa-kmn"] = { aliases = {"Komnzo", "Kómnjo", "Komnjo", "Kamundjo", "Rouku"}, } m["paa-kwn"] = { } m["paa-lei"] = { } m["paa-nha-pro"] = { } m["paa-nun"] = { } m["phi-din"] = { } m["phi-kal-pro"] = { aliases = {"Proto-Calamian"}, } m["phi-nag"] = { } m["phi-pro"] = { } m["poz-abi"] = { other_names = {"Sembuak", "Tubu"}, } m["poz-bal"] = { } m["poz-btk-pro"] = { } m["poz-cet-pro"] = { } m["poz-hce-pro"] = { other_names = {"Proto-South Halmahera - West New Guinea"}, } m["poz-lgx-pro"] = { } m["poz-mcm-pro"] = { } m["poz-mic-pro"] = { } m["poz-mly-pro"] = { } m["poz-msa-pro"] = { } m["poz-oce-pro"] = { } m["poz-pep-pro"] = { aliases = {"Proto-Eastern-Polynesian", "Proto-East Polynesian", "Proto-East-Polynesian"}, } m["poz-pnp-pro"] = { } m["poz-pol-pro"] = { } m["poz-pro"] = { other_names = {"Proto-Western Malayo-Polynesian"}, -- Western is subsumed into general Proto-MP } m["poz-sml"] = { aliases = {"Sarawak"}, } m["poz-ssw-pro"] = { } m["poz-swa-pro"] = { } m["poz-ter"] = { aliases = {"Terengganu"}, } m["pqe-pro"] = { } m["pra-niy"] = { } m["qfa-adm-pro"] = { } m["qfa-bet-pro"] = { aliases = {"Proto-Tai-Be"}, } m["qfa-cka-pro"] = { } m["qfa-hur-pro"] = { } m["qfa-kad-pro"] = { } m["qfa-kms-pro"] = { } m["qfa-kor-pro"] = { } m["qfa-kra-pro"] = { } m["qfa-lic-pro"] = { } m["qfa-onb-pro"] = { aliases = {"Proto-Ong-Be", "Proto-Bê"}, } m["qfa-ong-pro"] = { } m["qfa-tak-pro"] = { aliases = {"Proto-Tai-Kadai"}, } 
m["qfa-yen-pro"] = { } m["qfa-yuk-pro"] = { } m["qwe-kch"] = { other_names = {"Kichwa shimi", "Runashimi", "Runa", "Quichua", "Quecha", "Inga", "Chimborazo", "Imbabura Highland Kichwa", "Cañar Highland Quecha", "Quechua"}, } m["qwe-pro"] = { } m["roa-ang"] = { other_names = {"Craonnais", "Baugeois", "Saumurois"}, } m["roa-bbn"] = { other_names = {"Bourbonnais", "Berrichon", "Moulins", "Allier", "Nivernais", "Haut-Berrichon", "Bas-Berrichon"}, } m["roa-brg"] = { other_names = {"Burgundian", "Bregognon", "Dijonnais", "Morvandiau", "Morvandeau", "Morvan", "Bourguignon-Morvandiau", "Mâconnais", "Brionnais", "Brionnais-Charolais", "Auxerrois", "Beaunois", "Langrois", "Valsaônois", "Verduno-Chalonnais", "Sédelocien"}, } m["roa-can"] = { } m["roa-cha"] = { other_names = {"Bassignot", "Langrois", "Sennonais", "Vallage", "Troyen", "Briard", "Der", "Perthois", "Rémois", "Argonnais", "Porcien", "Ardennais", "Sugny"}, } m["roa-fcm"] = { other_names = {"Frainc-Comtou", "Comtois", "Jurassien", "Ajoulot", "Vâdais", "Taignon", "Bisontin", "Bousbot"}, } m["roa-gal"] = { } m["roa-gib"] = { } m["roa-gis"] = { } m["roa-leo"] = { } m["roa-lor"] = { other_names = {"Gaumais", "Vosgien", "Welche", "Argonnais", "Longovicien", "Messin", "Nancéien", "Spinalien", "Déodatien"}, } m["roa-oca"] = { aliases = {"Medieval Catalan"}, } m["roa-ole"] = { aliases = {"Medieval Leonese"}, } m["roa-ona"] = { aliases = {"Navarro-Aragonese", "Medieval Navarro-Aragonese", "Old Aragonese", "Medieval Aragonese"}, } m["roa-opt"] = { aliases = {"Old Galician Portuguese", "Old Galician–Portuguese", "Old Galician", "Old Portuguese", "Galician-Portuguese", "Galician Portuguese", "Galician–Portuguese", "Medieval Galician-Portuguese", "Medieval Galician Portuguese", "Medieval Galician–Portuguese", "Medieval Galician", "Medieval Portuguese", "Galaic-Portuguese"}, } m["roa-orl"] = { other_names = {"Beauceron", "Solognot", "Gâtinais", "Blaisois", "Vendômois"}, } m["roa-poi"] = { other_names = {"Poitevin", 
"Saintongeais", "Maraîchin"}, } m["roa-tar"] = { } m["sai-all"] = { other_names = {"Alyentiyak", "Huarpe", "Warpe"}, } m["sai-and"] = { -- not to be confused with 'cbc' or 'ano' other_names = {"Miranya", "Miranha", "Miranha Carapana-Tapuya", "Miraña-Carapana-Tapuyo", "Andokero", "Miranya-Karapana-Tapuyo", "Miraña", "Carapana"}, } m["sai-ayo"] = { aliases = {"Ayoman", "Ayamán", "Ayaman"}, } m["sai-bae"] = { aliases = {"Baenã", "Baenán", "Baena"}, } m["sai-bag"] = { other_names = {"Patagón de Bagua"}, } m["sai-bet"] = { other_names = {"Betoy", "Betoya", "Betoye", "Betoi-Jirara", "Jirara"}, } m["sai-bor-pro"] = { other_names = {"Proto-Bora-Muinane", "Proto-Bora-Muiname"}, } m["sai-cac"] = { other_names = {"Kakán", "Diaguita", "Cacan", "Kakan", "Calchaquí", "Chaka", "Kaka", "Kaká", "Caca", "Caca-Diaguita", "Catamarcano", "Capayán", "Capayana", "Yacampis"}, } m["sai-caq"] = { other_names = {"Cara", "Kara"}, } m["sai-car-pro"] = { } m["sai-cat"] = { } m["sai-cer-pro"] = { other_names = {"Proto-Amazonian Jê"}, } m["sai-chi"] = { } m["sai-chn"] = { aliases = {"Chana"}, } m["sai-chp"] = { aliases = {"Txapacura", "Xapacura", "Guapore", "Šapakura", "Txapakura", "Txapakúra", "Xapakúra"}, } m["sai-chr"] = { aliases = {"Charrúa", "Charruá"}, } m["sai-chu"] = { aliases = {"Churoya"}, } m["sai-cje-pro"] = { other_names = {"Proto-Akuwẽ"}, } m["sai-cmg"] = { aliases = {"Comechingón", "Comechingona", "Comechingone"}, } m["sai-cno"] = { other_names = {"Chonos", "Caucau"}, } m["sai-cnr"] = { aliases = {"Cañar"}, } m["sai-coe"] = { aliases = {"Koeruna"}, } m["sai-col"] = { aliases = {"Colan"}, } m["sai-cop"] = { } m["sai-crd"] = { other_names = {"Coroado"}, } m["sai-ctq"] = { aliases = {"Catuquinarú", "Katukinaru"}, } m["sai-cul"] = { other_names = {"Culle", "Kulyi", "Ilinga", "Linga"}, } m["sai-cva"] = { } m["sai-esm"] = { other_names = {"Esmeraldeño", "Atacame", "Takame"}, } m["sai-ewa"] = { } m["sai-gam"] = { aliases = {"Gamella", "Acobu", "Curinsi", "Barbados"}, } m["sai-gay"] = { 
aliases = {"Gayon"}, } m["sai-gmo"] = { other_names = {"Wamo", "Santa Rosa", "San Jose", "Barinas", "Guamotey", "Guama"}, } m["sai-gua"] = { aliases = {"Guachi", "Wachí", "Wachi"}, } m["sai-gue"] = { aliases = {"Guenoa"}, } m["sai-hau"] = { other_names = {"Manek'enk"}, } m["sai-jee-pro"] = { other_names = {"Proto-Gê", "Proto-Jean", "Proto-Gean", "Proto-Jê-Kaingang", "Proto-Ye"}, } m["sai-jko"] = { aliases = {"Geicó", "Jeicó", "Jaikó", "Geikó", "Yeikó", "Jeiko", "Geico", "Jeico", "Jaiko", "Geiko", "Yeiko", "Eyco"}, } m["sai-jrj"] = { } m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc other_names = {"Catrimbi", "Catembri", "Kariri de Mirandela", "Mirandela", "Kariri", "Kiriri"}, } m["sai-mal"] = { aliases = {"Malali"}, } m["sai-mar"] = { } m["sai-mat"] = { other_names = {"Matanauí", "Matanaui", "Matanawü", "Mitandua", "Moutoniway"}, } m["sai-mcn"] = { aliases = {"Mokana"}, } m["sai-men"] = { aliases = {"Menién"}, } m["sai-mil"] = { other_names = {"Milykayak", "Huarpe", "Warpe"}, } m["sai-mlb"] = { aliases = {"Malibú", "Malebú"}, } m["sai-msk"] = { aliases = {"Masakara", "Masacará", "Masacara"}, } m["sai-muc"] = { other_names = {"Mucuchi", "Mokochi", "Mocochí", "Mirripú", "Maripú", "Mucuchí-Maripú"}, } m["sai-mue"] = { aliases = {"Muellamués"}, } m["sai-muz"] = { } m["sai-mys"] = { other_names = {"Mayna", "Maina", "Rimachu"}, } m["sai-nat"] = { other_names = {"Natu", "Peagaxinan"}, } m["sai-nje-pro"] = { other_names = {"Proto-Core Jê"}, } m["sai-opo"] = { other_names = {"Opon", "Opón-Karare", "Opón-Carare", "Carare", "Carare-Opón"}, } m["sai-oto"] = { aliases = {"Otomako", "Otomacan", "Otomac", "Otomak"}, } m["sai-pal"] = { } m["sai-pam"] = { aliases = {"Pamiwa"}, } m["sai-par"] = { aliases = {"Paratio", "Prarto"}, } m["sai-peb"] = { aliases = {"Peva"}, varieties = {"Cauwachi", "Caumari", "Pacaya"}, -- per Wikipedia, according to the American anthropologist and linguist John Alden Mason (1950) } m["sai-pnz"] = { aliases = {"Pansaleo"}, } m["sai-prh"] = { } m["sai-ptg"] 
= { other_names = {"Patagón de Perico"}, } m["sai-pur"] = { aliases = {"Purukoto", "Purucotó", "Purucoto"}, } m["sai-pyg"] = { aliases = {"Payawá", "Payagua"}, } m["sai-pyk"] = { aliases = {"Gavião-Pykobjê", "Pykobjê-Gavião", "Gavião", "Pyhcopji", "Gavião-Pyhcopji"}, } m["sai-qmb"] = { other_names = {"Kimbaya", "Quindío", "Quindio", "Quindo"}, } m["sai-qtm"] = { aliases = {"Quitemoca"}, } m["sai-rab"] = { } m["sai-ram"] = { } m["sai-sac"] = { other_names = {"Sacata", "Zácata", "Chillao"}, } m["sai-san"] = { aliases = {"Sanavirón", "Sanabirón", "Sanabiron", "Sanavirona", "Zanavirona"}, } m["sai-sap"] = { aliases = {"Zapará", "Zapara"}, } m["sai-sec"] = { other_names = {"Sek", "Sec"}, } m["sai-sin"] = { other_names = {"Cenúfana", "Zenúfana", "Cinifaná", "Sinufana", "Sinú", "Cenú", "Zenú", "Finzenú", "Fincenú", "Pancenú", "Sutagao"}, } m["sai-sje-pro"] = { } m["sai-tab"] = { other_names = {"Aconipa"}, } m["sai-tal"] = { other_names = {"Atalán", "Tallan", "Tallanca", "Atalan", "Sek"}, } m["sai-tap"] = { other_names = {"Tapayúna", "Kajkwakhrattxi"}, } m["sai-tar-pro"] = { } m["sai-teu"] = { aliases = {"Tehues", "Teuéx"}, } m["sai-tim"] = { other_names = {"Cuica", "Timote-Cuica"}, } m["sai-tpr"] = { aliases = {"Taparito"}, } m["sai-trr"] = { other_names = {"Caratiú"}, } m["sai-wai"] = { aliases = {"Waitaka", "Waitacá", "Waitaca", "Goytacá", "Goitacá", "Guaitacá", "Guiatacá", "Guiatacás", "Goiatacá", "Goiatacás", "Guaiatacá", "Goytacaz", "Goitacaz", "Goyataca", "Aitacaz", "Uetacaz", "Uetacá", "Outacá", "Ouetacá", "Eutacá", "Itacaz", "Vaitacá"}, } m["sai-way"] = { aliases = {"Wajumará", "Wajumara", "Wayumará", "Azumara", "Guimara"}, } m["sai-wit-pro"] = { other_names = {"Proto-Huitotoan", "Proto-Uitotoan"}, } m["sai-wnm"] = { other_names = {"Wañam", "Wanyam", "Huanyam", "Uanham", "Abitana"}, } m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat other_names = {"Xoco", "Chocó", "Shokó", "Shoko", "Shocó", "Shoco", "Choco", "Chocaz", "Kariri-Xocó", "Kariri-Xoco", "Kariri-Shoko", 
"Cariri-Chocó", "Xukuru-Kariri", "Xucuru-Kariri", "Xucuru-Cariri", "Xukurú-Kirirí"}, } m["sai-yao"] = { aliases = {"Yao", "Jaoi", "Yaoi", "Yaio", "Anacaioury"}, } m["sai-yar"] = { -- not the same family as 'suy' aliases = {"Yaruma"}, } m["sai-yri"] = { aliases = {"Jurí"}, } m["sai-yup"] = { other_names = {"Yupuá", "Yupúa", "Jupua", "Jupuá", "Jupúa", "Hiupiá", "Yupuá-Duriña", "Duriña"}, } m["sai-yur"] = { aliases = {"Yurumangui", "Yurimangí", "Yurimangi", "Yurimanguí", "Yurimangui"}, } m["sal-pro"] = { aliases = {"Proto-Salishan"}, } m["sdv-daj-pro"] = { } m["sdv-eje-pro"] = { } m["sdv-nil-pro"] = { } m["sdv-nyi-pro"] = { } m["sdv-tmn-pro"] = { } m["sel-nor"] = { aliases = {"Taz Selkup"}, } m["sel-pro"] = { } m["sel-sou"] = { } m["sem-amm"] = { } m["sem-amo"] = { aliases = {"Amoritic"}, } m["sem-cha"] = { aliases = {"Cheha", "Čäha", "Čäxa"}, } m["sem-dad"] = { other_names = {"Dadanite", "Lihyanite", "Lihyanitic"}, } m["sem-dum"] = { } m["sem-has"] = { } m["sem-his"] = { other_names = {"Thamudic E"}, } m["sem-mhr"] = { other_names = {"Muher Gurage", "Muxar", "Muxər", "Muhər", "Muḫər"}, } m["sem-pro"] = { } m["sem-saf"] = { } m["sem-sam"] = { other_names = {"Sam'alian"}, } m["sem-srb"] = { } m["sem-tay"] = { other_names = {"Taymanite", "Thamudic A"}, } m["sem-tha"] = { } m["sem-wes-pro"] = { } m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro' } m["sit-aao-pro"] = { } m["sit-bok"] = { other_names = {"Ramo", "Pailibo"}, } m["sit-bai-pro"] = { } m["sit-ban"] = { } m["sit-bdi-pro"] = { } m["sit-cai"] = { } m["sit-cha"] = { } m["sit-ers-pro"] = { } m["sit-hrs-pro"] = { } m["sit-jap"] = { other_names = {"Chabao", "Kuru"}, } m["sit-kha-pro"] = { } m["sit-khb-pro"] = { } m["sit-khp-pro"] = { } m["sit-khw-pro"] = { } m["sit-kon-pro"] = { } m["sit-liz"] = { } m["sit-lnj"] = { } m["sit-lrn"] = { } m["sit-luu-pro"] = { } m["sit-nas-pro"] = { } m["sit-prn"] = { } m["sit-pro"] = { } m["sit-sit"] = { other_names = {"Eastern rGyalrong", "rGyalrong", "Rgyalrong", 
"rGyalrongic", "Gyalrong", "Gyarong", "rGyarong", "Gyarung", "Jiarong", "Jiarongyu", "Jyarong", "Jyarung", "Yelong", "Kuru"}, } m["sit-tam-pro"] = { aliases = {"Proto-Tamang"}, } m["sit-tan-pro"] = { } m["sit-tgm"] = { } m["sit-tng-pro"] = { } m["sit-tos"] = { } m["sit-tsh"] = { other_names = {"Caodeng", "Sidaba", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"}, } m["sit-zbu"] = { other_names = {"Ribu", "Rdzong'bur", "Rdzongmbur", "Showu", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"}, } m["sla-pro"] = { aliases = {"Common Slavic"}, } m["smi-pro"] = { aliases = {"Proto-Sami"}, } m["son-pro"] = { aliases = {"Proto-Songhai"}, } m["sqj-pro"] = { } m["ssa-klk-pro"] = { aliases = {"Proto-Rub"}, } m["ssa-kom-pro"] = { } m["ssa-pro"] = { } m["syd-pro"] = { } m["tai-pro"] = { } m["tai-swe-pro"] = { } m["tbq-bdg-pro"] = { } m["tbq-blg"] = { aliases = {"Pai-lang", "Pailang"}, } m["tbq-brm-pro"] = { } m["tbq-gkh"] = { aliases = {"Gɔkhý", "Gɔkhy", "Gouke"}, } m["tbq-kuk-pro"] = { other_names = {"Proto-Kukish"}, } m["tbq-lal-pro"] = { } m["tbq-laz"] = { other_names = {"Lare", "Shuitianhua"}, } m["tbq-lob-pro"] = { } m["tbq-lol-pro"] = { aliases = {"Proto-Yi", "Proto-Ngwi", "Proto-Nisoic"}, } m["tbq-mil"] = { } m["tbq-mor"] = { aliases = {"Morān"}, } m["tbq-ngo"] = { other_names = {"Ngachang", "Achang"}, } -- tbq-pro is now etymology-only m["trk-dkh"] = { aliases = {"Dukha"}, } m["trk-eog"] = { } m["trk-oat"] = { } m["trk-pro"] = { } m["tup-gua-pro"] = { } m["tup-kab"] = { aliases = {"Kabixiana", "Cabixiana", "Cabishiana", "Kapishana", "Capishana", "Kapišana", "Cabichiana", "Capichana", "Capixana"}, } m["tuw-alk"] = { aliases = {"Alechuka"}, } m["tuw-bal"] = { } m["tuw-kkl"] = { aliases = {"Chinese Kyakala"}, } m["tuw-kli"] = { aliases = {"Kilen", "Kirin", "Kila", "Hezhe", "Qile'en"}, } m["tup-pro"] = { } m["tuw-pro"] = { } m["tuw-sol"] = { } m["urj-fin-pro"] = { } m["urj-koo"] = { aliases = {"Old Permian"}, } m["urj-kuk"] = { aliases = {"Kukkuzi Votic", 
"Kukkuzi Ingrian", "Kukkusi"}, } m["urj-kya"] = { } m["urj-mdv-pro"] = { } m["urj-prm-pro"] = { } m["urj-pro"] = { other_names = {"Proto-Finno-Ugric", "Proto-Finno-Permic"}, -- PFU and PFP are subsumed into PU per [[Wiktionary:Beer parlour/2015/January#Merging Finno-Volgaic, Finno-Samic, Finno-Permic and Finno-Ugric into Uralic]] } m["urj-ugr-pro"] = { } m["xgn-pro"] = { } m["xnd-pro"] = { other_names = {"Proto-Na-Dené", "Proto-Athabaskan-Eyak-Tlingit"}, } m["yok-bvy"] = { other_names = {"Tulamni-Hometwoli", "Tulamni", "Tulamne", "Tuolumne", "Tawitchi", "Hometwoli", "Taneshach"}, } m["yok-dly"] = { other_names = {"Far Northern Valley Yokuts", "Yachikumne", "Yachikumni", "Chulamni", "Lower San Joaquin", "Lakisamni", "Tawalimni"}, } m["yok-gsy"] = { } m["yok-kry"] = { other_names = {"Choinimni", "Choynimni", "Ayticha", "Kocheyali", "Ayitcha", "Michahay", "Chukaymina", "Chukaimina"}, } m["yok-nvy"] = { other_names = {"Chukchansi", "Kechayi", "Dumna", "Chawchila", "Noptinte", "Nopṭinṭe", "Nopthrinthre", "Nopchinchi", "Takin"}, } m["yok-ply"] = { other_names = {"Paleuyami", "Altinin", "Poso Creek", "Poso Creek Yokuts"}, } m["yok-svy"] = { other_names = {"Yawelmani", "Tachi", "Koyeti", "Nutunutu", "Chunut", "Wo'lasi", "Choynok", "Choinok", "Wechihit"}, } m["yok-tky"] = { other_names = {"Wikchamni", "Wukchamni", "Wukchumni", "Yawdanchi"}, } m["ypk-pro"] = { } m["yrk-for"] = { } m["yrk-tun"] = { other_names = {"Yurak"}, varieties = { { "Western Nenets" }, { "Eastern Nenets" }, } } m["zhx-min-pro"] = { } m["zhx-sht"] = { other_names = {"Xiangnan Tuhua", "Yuebei Tuhua", "Shipo", "Shina"}, } m["zhx-sic"] = { aliases = {"Sichuanese Mandarin"}, } m["zhx-tai"] = { aliases = {"Toishanese"}, } m["zle-ono"] = { } m["zle-ort"] = { } m["zls-chs"] = { } m["zlw-ocs"] = { } m["zlw-opl"] = { } m["zlw-osk"] = { } m["zlw-slv"] = { } m["zlm-coa"] = { } m["zlm-pah"] = { } return m h2a8ycwfcvvyuj273xn3e0ftvyjv8gi 281321 281317 2026-04-21T19:44:28Z Hakimi97 2668 Membatalkan semakan 
[[Special:Diff/281317|281317]] oleh [[Special:Contributions/Hakimi97|Hakimi97]] ([[User talk:Hakimi97|bincang]]) 281321 Scribunto text/plain local m = {} m["aav-khs-pro"] = { aliases = {"Proto-Khasic"}, } m["aav-nic-pro"] = { } m["aav-pkl-pro"] = { } m["aav-pro"] = { -- mkh-pro will merge into this. } m["afa-pro"] = { aliases = {"Proto-Afro-Asiatic", "Hamito-Semitic"}, } m["alg-aga"] = { aliases = {"Agwam", "Agaam"}, } m["alg-pro"] = { } m["alv-ama"] = { } m["alv-bgu"] = { otherNames = {"Gubëeher", "Nyun Gubëeher", "Nun Gubëeher"}, } m["alv-bua-pro"] = { } m["alv-cng-pro"] = { } m["alv-edk-pro"] = { } m["alv-edo-pro"] = { } m["alv-fli-pro"] = { } m["alv-gbe-pro"] = { } m["alv-gng-pro"] = { } m["alv-gtm-pro"] = { aliases = {"Proto-Ghana-Togo Mountain"}, } m["alv-gwa"] = { } m["alv-hei-pro"] = { } m["alv-ido-pro"] = { } m["alv-igb-pro"] = { } m["alv-kwa-pro"] = { } m["alv-mum-pro"] = { } m["alv-nup-pro"] = { } m["alv-pro"] = { } m["alv-von-pro"] = { } m["alv-yor-pro"] = { } m["alv-yrd-pro"] = { } m["apa-pro"] = { aliases = {"Proto-Apache", "Proto-Southern Athabaskan"}, } m["aql-pro"] = { } m["art-adu"] = { aliases = {"Westron"}, } m["art-bel"] = { } m["art-blk"] = { } m["art-bsp"] = { } m["art-com"] = { } m["art-dtk"] = { } m["art-elo"] = { } m["art-gld"] = { } m["art-lap"] = { } m["art-man"] = { } m["art-mun"] = { } m["art-nav"] = { } m["art-vlh"] = { } m["ath-nic"] = { } m["ath-pro"] = { } m["auf-pro"] = { aliases = {"Proto-Arawan", "Proto-Arauan"}, } m["aus-alu"] = { otherNames = {"Ogh-Alungul", "Alngula"}, } m["aus-and"] = { aliases = {"Adithinngithigh"}, } m["aus-ang"] = { otherNames = {"Ogh-Anggula", "Anggula", "Ogh-Anggul", "Anggul"}, } m["aus-arn-pro"] = { } m["aus-bra"] = { aliases = {"Barranbinja", "Baranbinya", "Burranbinya", "Burrumbiniya", "Burrunbinya", "Barrumbinya", "Barren-binya", "Parran-binye"}, } m["aus-brm"] = { } m["aus-cww-pro"] = { } m["aus-dal-pro"] = { } m["aus-guw"] = { otherNames = {"Gowar", "Goowar", "Gooar", "Guar", "Gowr-burra", 
"Ngugi", "Mugee", "Wogee", "Gnoogee", "Chunchiburri", "Booroo-geen-merrie"}, } m["aus-lsw"] = { aliases = {"Little Swanport Tasmanian"}, } m["aus-mbi"] = { otherNames = {"Mbeiwum"}, } m["aus-ngk"] = { otherNames = {"Ngkot", "Nggoth"}, } m["aus-nyu-pro"] = { } m["aus-pam-pro"] = { } m["aus-tul"] = { otherNames = {"Dappil", "Dapil", "Toolooa", "Dulua", "Narung", "Dandan"}, } m["aus-uwi"] = { otherNames = {"Uwinjmil"}, } m["aus-wdj-pro"] = { } m["aus-won"] = { } m["aus-wul"] = { otherNames = {"Manbara", "Wulgurugaba", "Wulgurukaba", "Nhawalgaba"}, } m["aus-ynk"] = { -- contrast nny } m["awd-amc-pro"] = { otherNames = {"Western Maipuran"}, } m["awd-kmp-pro"] = { otherNames = {"Campa", "Kampan", "Campan", "Pre-Andine Maipurean"}, } m["awd-prw-pro"] = { otherNames = {"Paresí-Waurá", "Parecí–Xingú", "Paresí–Xingu", "Central Arawak", "Central Maipurean"}, } m["awd-ama"] = { } m["awd-ana"] = { aliases = {"Anauya"}, } m["awd-apo"] = { otherNames = {"Lapachu"}, } m["awd-cab"] = { aliases = {"Cabere", "Cávere", "Cavere"}, } m["awd-gnu"] = { otherNames = {"Guinao", "Inao", "Guniare", "Quinhau", "Guiano"}, } m["awd-kar"] = { aliases = {"Kariaí", "Kariai", "Cariyai", "Carihiahy"}, } m["awd-kaw"] = { aliases = {"Cawishana", "Cayuishana", "Kaishana", "Cauixana"}, } m["awd-kus"] = { aliases = {"Kustenaú", "Custenau", "Kutenabu"}, } m["awd-man"] = { } m["awd-mar"] = { aliases = {"Marawán"}, } m["awd-mpr"] = { aliases = {"Maypure", "Mejepure"}, } m["awd-mrt"] = { aliases = {"Mariate"}, } m["awd-nwk-pro"] = { aliases = {"Proto-Newiki"}, } m["awd-pai"] = { aliases = {"Paiconeca", "Paikone", "Paicone"}, } m["awd-pas"] = { aliases = {"Passé", "Pazé"}, } m["awd-pro"] = { otherNames = {"Proto-Arawakan", "Proto-Maipurean", "Proto-Maipuran"}, } m["awd-she"] = { aliases = {"Shebaya", "Shebaye"}, } m["awd-taa-pro"] = { otherNames = {"Proto-Ta-Arawakan", "Proto-Caribbean Northern Arawak"}, } m["awd-wai"] = { otherNames = {"Wainuma", "Wai", "Waima", "Wainumi", "Wainambí", "Waiwana", "Waipi", 
"Yanuma"}, } m["awd-yum"] = { aliases = {"Jumana"}, } m["azc-caz"] = { aliases = {"Caxcan", "Kaskán"}, } m["azc-cup-pro"] = { } m["azc-ktn"] = { aliases = {"Gitanemuk"}, } m["azc-nah-pro"] = { } m["azc-num-pro"] = { } m["azc-pro"] = { } m["azc-tak-pro"] = { } m["azc-tat"] = { } m["ber-fog"] = { otherNames = {"El-Fogaha", "El-Foqaha", "Foqaha", "Fuqaha"}, } m["ber-pro"] = { } m["ber-zuw"] = { } m["bnt-bal"] = { } m["bnt-bon"] = { } m["bnt-boy"] = { } m["bnt-bwa"] = { } m["bnt-cmw"] = { otherNames = {"Bravanese", "Mwiini", "Mwini", "Chimwini", "Chimini", "Brava"}, } m["bnt-ind"] = { otherNames = {"Kɔlɔmɔnyi", "Kɔlɛ", "Kasaï Oriental"}, } m["bnt-lal"] = { } m["bnt-mpi"] = { } m["bnt-mpu"] = { } m["bnt-ngu-pro"] = { } m["bnt-phu"] = { aliases = {"Siphuthi"}, } m["bnt-pro"] = { } m["bnt-sbo"] = { } m["bnt-sts-pro"] = { } m["btk-pro"] = { } m["cau-abz-pro"] = { otherNames = {"Proto-Abazgi", "Proto-Abkhaz-Tapanta"}, } m["cau-and-pro"] = { aliases = {"Proto-Andi", "Proto-Andic"}, } m["cau-ava-pro"] = { aliases = {"Proto-Avar-Andian", "Proto-Avar-Andi", "Proto-Avar-Andic"}, } m["cau-cir-pro"] = { otherNames = {"Proto-Adyghe-Kabardian", "Proto-Adyghe-Circassian"}, } m["cau-drg-pro"] = { otherNames = {"Proto-Dargin"}, } m["cau-lzg-pro"] = { aliases = {"Proto-Lezgi", "Proto-Lezgian", "Proto-Lezgic"}, } m["cau-nec-pro"] = { } m["cau-nkh-pro"] = { } m["cau-nwc-pro"] = { } m["cau-tsz-pro"] = { otherNames = {"Proto-Tsezic", "Proto-Didoic"}, } m["cba-ata"] = { otherNames = {"Atanque", "Cancuamo", "Kankuamo", "Kankwe", "Kankuí", "Atanke"}, } m["cba-cat"] = { otherNames = {"Catio Chibcha", "Old Catio"}, } m["cba-dor"] = { otherNames = {"Chumulu", "Changuena", "Changuina", "Chánguena", "Gualaca"}, } m["cba-dui"] = { } m["cba-hue"] = { otherNames = {"Güetar", "Guetar", "Brusela"}, } m["cba-nut"] = { otherNames = {"Nutabane"}, } m["cba-pro"] = { } m["ccn-pro"] = { } m["ccs-pro"] = { } m["ccs-gzn-pro"] = { aliases = {"Proto-Karto-Zan"}, } m["cdc-cbm-pro"] = { otherNames = 
{"Proto-Central-Chadic", "Proto-Biu-Mandara"}, } m["cdc-mas-pro"] = { } m["cdc-pro"] = { } m["cdd-pro"] = { } m["cel-bry-pro"] = { aliases = {"Proto-Brittonic", "Common Brythonic", "Common Brittonic"}, } m["cel-gal"] = { } m["cel-gau"] = { } m["cel-pro"] = { } m["chi-pro"] = { } m["chm-pro"] = { } m["cmc-pro"] = { } m["crp-bip"] = { } m["crp-gep"] = { aliases = {"Greenlandic Pidgin", "Greenlandic Eskimo Pidgin"}, } m["crp-mar"] = { otherNames = {"Jamaican Maroon Spirit Possession Language"}, } m["crp-mpp"] = { aliases = {"Macao Pidgin Portuguese"}, } m["crp-rsn"] = { } m["crp-slb"] = { otherNames = {"Solombala-English", "Solombala English-Russian Pidgin"}, } m["crp-spp"] = { } m["crp-tpr"] = { } m["csu-bba-pro"] = { } m["csu-maa-pro"] = { } m["csu-pro"] = { } m["csu-sar-pro"] = { } m["cus-ash"] = { otherNames = {"Ashraf", "Af-Ashraaf"}, varieties = { {"Marka, Lower Shabelle"}, "Shingani"}, } m["cus-hec-pro"] = { } m["cus-som-pro"] = { otherNames = {"Proto-Sam", "Proto-Macro-Somali"}, } m["cus-sou-pro"] = { otherNames = {"Proto-Rift"}, } m["cus-pro"] = { } m["dmn-dam"] = { } m["dra-bry"] = { aliases = {"Byari"}, } m["dra-cen-pro"] = { } m["dra-mkn"] = { aliases = {"Nadugannada"}, } m["dra-nor-pro"] = { } m["dra-okn"] = { aliases = {"Halegannada"}, } m["dra-ote"] = { } m["dra-pro"] = { } m["dra-sdo-pro"] = { aliases = {"Proto-South Dravidian"}, } m["dra-sdt-pro"] = { aliases = {"Proto-South-Central Dravidian"}, } m["dra-sou-pro"] = { aliases = {"Proto-Southern Dravidian"}, } m["egx-dem"] = { aliases = {"Demotic Egyptian", "Enchorial"}, } m["dmn-pro"] = { } m["dmn-mdw-pro"] = { } m["dru-pro"] = { } m["esx-esk-pro"] = { } m["esx-ink"] = { } m["esx-inq"] = { } m["esx-inu-pro"] = { } m["esx-pro"] = { } m["esx-tut"] = { } m["euq-pro"] = { aliases = {"Proto-Vasconic"}, } m["gba-pro"] = { } m["gem-pro"] = { aliases = {"Common Germanic"}, } m["gme-bur"] = { aliases = {"Burgundish", "Burgundic"}, } m["gme-cgo"] = { } m["gmq-gut"] = { } m["gmq-jmk"] = { aliases = 
{"Jamtlandic"}, } m["gmq-mno"] = { } m["gmq-oda"] = { } m["gmq-ogt"] = { aliases = {"Old Gotlandic"}, } m["gmq-osw"] = { } m["gmq-pro"] = { aliases = {"Proto-Scandinavian", "Primitive Norse", "Proto-Nordic", "Ancient Nordic", "Ancient Scandinavian", "Old Nordic", "Old Scandinavian", "Proto-North Germanic", "North Proto-Germanic", "Common Scandinavian"}, } m["gmq-scy"] = { } m["gmw-bgh"] = { } m["gmw-cfr"] = { varieties = {"Mittelfränkisch", "Ripuarian", "Moselle Franconian", "Colognian", "Kölsch"}, } m["gmw-ecg"] = { varieties = {"Thuringian", "Thüringisch", "Upper Saxon", "Upper Saxon German", "Obersächsisch", "Lusatian", "Erzgebirgisch", "Silesian", "Silesian German", "High Prussian"}, } m["gmw-fin"] = { aliases = {"Fingal"}, } m["gmw-gts"] = { aliases = {"Gottscheerisch"}, } m["gmw-jdt"] = { } m["gmw-msc"] = { } m["gmw-pro"] = { } m["gmw-rfr"] = { aliases = {"Rheinfränkisch", "Rhenish Franconian"}, varieties = {"Hessian", "Lorraine Franconian", "Lorrainian", "Lothringisch", "Palatine German", "Pfälzisch", "Pälzisch", "Palatinate German"}, } m["gmw-stm"] = { aliases = {"Satu Mare Swabian", "Sathmarschwäbisch", "Sathmarisch"}, } m["gmw-tsx"] = { aliases = {"Siebenbürger Saxon"}, } m["gmw-vog"] = { } m["gmw-zps"] = { aliases = {"Zipser", "Zipserisch", "Outzäpsersch"}, } m["gn-cls"] = { } m["grk-cal"] = { aliases = {"Italian Greek", "Bova"}, } m["grk-ita"] = { aliases = {"Griko", "Grico", "Grecanic"}, } m["grk-mar"] = { aliases = {"Mariupolitan Greek", "Rumeíka", "Rumeika"}, } m["grk-pro"] = { aliases = {"Proto-Greek"}, } m["hmn-pro"] = { } m["hmx-mie-pro"] = { } m["hmx-pro"] = { } m["hyx-pro"] = { } m["iir-nur-pro"] = { } m["iir-pro"] = { } m["ijo-pro"] = { aliases = {"Proto-Ijaw"}, } m["inc-apa"] = { aliases = {"Apabhraṃśa"}, } m["inc-ash"] = { aliases = {"Asokan Prakrit", "Aśokan Prakrit"}, } m["inc-kam"] = { } m["inc-kho"] = { } m["inc-krn-pro"] = { aliases = {"Proto Kamta", "Proto-Kamata", "Proto Kamata"}, } m["inc-mas"] = { } m["inc-mbn"] = { } m["inc-mgu"] = 
{ } m["inc-mor"] = { aliases = {"Middle Oriya"}, } m["inc-oas"] = { } m["inc-oaw"] = { aliases = {"Early Awadhi"}, } m["inc-obn"] = { } m["inc-ogu"] = { otherNames = {"Old Western Rajasthani"}, } m["inc-ohi"] = { aliases = {"Dehlavi"}, } m["inc-oor"] = { aliases = {"Old Oriya"}, } m["inc-opa"] = { } m["inc-pro"] = { } m["ine-ana-pro"] = { } m["ine-bsl-pro"] = { } m["ine-kal"] = { aliases = {"Kalašmaic", "Kalasmaic"}, } m["ine-pae"] = { } m["ine-pro"] = { } m["ine-toc-pro"] = { } m["xme-old"] = { } m["xme-mid"] = { aliases = {"Atropatenian"}, } m["xme-ker"] = { otherNames = {"Kermanian", "Central Iranian Dialects", "Central Plateau Dialects", "Central Iranian", "South Median", "Gazi", "Soi", "Sohi", "Abuzeydabadi", "Abyanehi", "Farizandi", "Jowshaqani", "Nashalji", "Qohrudi", "Yarandi", "Tari", "Sedehi", "Ardestani", "Zefrehi", "Isfahani", "Kafroni", "Varzenehi", "Khuri", "Nayini", "Anaraki", "Zoroastrian Dari", "Behdināni", "Behdinani", "Gabri", "Gavrŭni", "Gavruni", "Gabrōni", "Gabroni", "Kermani", "Yazdi", "Bidhandi", "Bijagani", "Chimehi", "Hanjani", "Komjani", "Naraqi", "Qalhari", "Varani", "Zori"}, } m["xme-taf"] = { } m["xme-ttc-pro"] = { } m["xme-kls"] = { aliases = {"Kalāsuri", "Kalasur", "Kalāsur"}, } m["xme-klt"] = { } m["xme-ott"] = { otherNames = {"Old Tatic", "Old Azeri", "Azari", "Azeri", "Āḏarī", "Adari", "Adhari"}, } m["ira-kms-pro"] = { } m["ira-mpr-pro"] = { } m["ira-pat-pro"] = { } m["ira-pro"] = { } m["ira-zgr-pro"] = { } m["os-pro"] = { otherNames = {"Sarmatian"}, } m["xsc-pro"] = { } m["xsc-skw-pro"] = { } m["xsc-sak-pro"] = { aliases = {"Proto-Sakan"}, } m["ira-sym-pro"] = { } m["ira-sgi-pro"] = { } m["ira-mny-pro"] = { } m["ira-shy-pro"] = { } m["ira-shr-pro"] = { } m["ira-sgc-pro"] = { aliases = {"Proto-Sogdian"}, } m["ira-wnj"] = { aliases = {"Old Vanji", "Vanchi", "Vanži", "Wanji"}, } m["iro-ere"] = { } m["iro-min"] = { } m["iro-nor-pro"] = { } m["iro-pro"] = { } m["itc-pro"] = { } m["jpx-hcj"] = { aliases = {"Hachijo"}, } m["jpx-pro"] = 
{ } m["jpx-ryu-pro"] = { } m["kar-pro"] = { } m["kca-eas"] = { } m["kca-nor"] = { } m["kca-pro"] = { } m["kca-sou"] = { } m["khi-kho-pro"] = { } m["khi-kun"] = { otherNames = {"ǃOǃKung", "ǃ'OǃKung", "Kung", "Ekoka ǃKung", "Ekoka Kung", "Sekele"}, } m["ko-ear"] = { } m["kro-pro"] = { } m["ku-pro"] = { } m["map-ata-pro"] = { } m["map-bms"] = { } m["map-pro"] = { } m["mis-hkl"] = { aliases = {"Kelantan Peranakan Chinese", "Kelantan Peranakan Hokkien", "Hokkien Kelantan", "Kelantan Local Hokkien"} } m["mis-isa"] = { } m["mis-jie"] = { aliases = {"Chieh", "Kjet"}, } m["mis-jzh"] = { aliases = {"Haihua"}, } m["mis-kas"] = { aliases = {"Cassite", "Kassitic", "Kaššite"}, } m["mis-mmd"] = { otherNames = {"Mimi of Gaudefroy-Demombynes", "Mimi-D"}, } m["mis-mmn"] = { otherNames = {"Mimi-N"}, } m["mis-phi"] = { aliases = {"Philistian", "Philistinian"}, } m["mis-rou"] = { aliases = {"Ruanruan", "Ruan-ruan", "Juan-juan"}, } m["mis-tnw"] = { aliases = {"Tangwanghua"}, } m["mis-tuh"] = { aliases = {"'Azha"}, } m["mis-tuo"] = { aliases = {"Tabghach", "Taghbach"}, } m["mis-wuh"] = { aliases = {"Wuwan", "Awar"}, } m["mis-xbi"] = { aliases = {"Serbi", "Shirwi"}, } m["mjg-mgl"] = { aliases = {"Huzhu", "Huzhu Monguor"}, } m["mjg-mgr"] = { aliases = {"Minhe", "Minhe Monguor"}, } m["mkh-asl-pro"] = { } m["mkh-ban-pro"] = { } m["mkh-kat-pro"] = { } m["mkh-khm-pro"] = { } m["mkh-kmr-pro"] = { } m["mkh-mmn"] = { } m["mkh-mnc-pro"] = { } m["mkh-mvi"] = { } m["mkh-pal-pro"] = { } m["mkh-pea-pro"] = { } m["mkh-pkn-pro"] = { } m["mkh-pro"] = { --This will be merged into 2015 aav-pro. 
}
m["mnw-tha"] = { aliases = {"Raman", "Thai Raman", "Siamese Mon"}, }
m["mkh-vie-pro"] = { }
m["mns-cen"] = { }
m["mns-nor"] = { }
m["mns-pro"] = { }
m["mns-sou"] = { }
m["mun-pro"] = { aliases = {"Proto-Mundan"}, }
m["myn-chl"] = { -- the stage after ''emy''
	otherNames = {"Cholti", "Colonial Ch'olti'", "Colonial Cholti"}, }
m["myn-pro"] = { aliases = {"Proto-Maya"}, }
m["nai-ala"] = { otherNames = {"Alasapa", "Pinto"}, }
m["nai-bay"] = { otherNames = {"Bayougoula", "Bayou Goula", "Ischenoca"}, -- tribe merged with "Mougulasha", "Mongoulacha", "Mugulasha", "Mougulasha", "Muglahsa", "Muglasha", "Muguasha", "Imongolosha", "Houma", "Acolapissa"
}
m["nai-cal"] = { }
m["nai-chi"] = { }
m["nai-chu-pro"] = { aliases = {"Proto-Chumashan"}, }
m["nai-cig"] = { }
m["nai-ckn-pro"] = { aliases = {"Proto-Chinook"}, }
m["nai-guz"] = { aliases = {"Guazacapan"}, }
m["nai-hit"] = { otherNames = {"Atcik-hata", "At-pasha-shliha"}, }
m["nai-ipa"] = { otherNames = {"'Iipay 'aa", "Northern Diegueño", "Diegueño"}, }
m["nai-jtp"] = { otherNames = {"Xutiapa", "Jalapa", "Xalapa"}, }
m["nai-jum"] = { aliases = {"Jumaitepeque", "Jumaytepec"}, }
m["nai-kat"] = { otherNames = {"Kathlamet Chinook"}, }
m["nai-klp-pro"] = { }
m["nai-knm"] = { }
m["nai-kum"] = { otherNames = {"Kumiai", "Central Diegueño", "Diegueño"}, }
m["nai-mac"] = { aliases = {"Macorís", "Macorix", "Mazorij", "Mazorig", "Mazoriges"}, }
m["nai-mdu-pro"] = { aliases = {"Proto-Maiduan"}, }
m["nai-miz-pro"] = { aliases = {"Proto-Mixe-Zoquean"}, }
m["nai-mus-pro"] = { aliases = {"Proto-Muskhogean", "Proto-Muskogee"}, }
m["nai-nao"] = { }
m["nai-nrs"] = { }
m["nai-okw"] = { }
m["nai-per"] = { }
m["nai-pic"] = { }
m["nai-plp-pro"] = { }
m["nai-pom-pro"] = { aliases = {"Proto-Pomoan"}, }
m["nai-qng"] = { }
m["nai-sca-pro"] = { -- NB 'sio-pro' "Proto-Siouan" which is Proto-Western Siouan
}
m["nai-sin"] = { aliases = {"Sinacantan", "Zinacantán", "Zinacantan"}, }
m["nai-sln"] = { }
m["nai-spt"] = { aliases = {"Shahaptin"}, }
m["nai-tap"]
= { otherNames = {"Tapachulteca", "Tapachulteco", "Tapachula"}, } m["nai-taw"] = { } m["nai-teq"] = { otherNames = {"Tequistlateco", "Tequistlateca", "Chontal", "Chontol of Oaxaca", "Oaxaca Chontal", "Oaxacan Chontal"}, } m["nai-tip"] = { otherNames = {"Tipay", "Tiipai", "Tiipay", "Jamul Tiipay", "Southern Digueño", "Diegueño"}, } m["nai-tot-pro"] = { } m["nai-tsi-pro"] = { } m["nai-utn-pro"] = { otherNames = {"Proto-Miwok-Costanoan"}, } m["nai-wai"] = { aliases = {"Guaycura", "Waicura"}, } m["nai-wji"] = { otherNames = {"Jicaque of El Palmar", "Sula"}, } m["nai-yup"] = { aliases = {"Jupiltepeque", "Yupiltepec", "Jupiltepec", "Xupiltepec"}, } m["nan-dat"] = { aliases = {"Datian"}, } m["nan-hbl"] = { aliases = {"Hokkienese", "Quanzhang", "Fukien", "Banlam", "Banlamese", "Ban-lam"}, } m["nan-hlh"] = { aliases = {"Hailufeng", "Hoklo Min", "Hai Lok Hong"}, } m["nan-hnm"] = { aliases = {"Hainamese", "Hailamese", "Hainam", "Hainan Min", "Hainam Min"}, } m["nan-lnx"] = { aliases = {"Longyan", "Liongna"}, } m["nan-luh"] = { aliases = {"Leizhou", "Luichew", "Luichew Min"} } m["nan-tws"] = { aliases = {"Teochew Min", "Chiuchow", "Teo-Swa", "Teo-Swa Min", "Tio-Sua"}, } m["nan-zhe"] = { aliases = {"Zhenan"}, } m["nan-zsh"] = { aliases = {"Sanxiang", "Samheung", "Sahiu"}, } m["nds-de"] = { } m["nds-nl"] = { varieties = {"Achterhoeks", "Drents", "Gronings", "Sallands", "Stellingwerfs", "Twents", "Veluws"}, } m["ngf-pro"] = { } m["nic-bco-pro"] = { } m["nic-bod-pro"] = { } m["nic-eov-pro"] = { } m["nic-gns-pro"] = { } m["nic-grf-pro"] = { } m["nic-gur-pro"] = { } m["nic-jkn-pro"] = { } m["nic-lcr-pro"] = { } m["nic-ogo-pro"] = { } m["nic-ovo-pro"] = { } m["nic-plt-pro"] = { } m["nic-pro"] = { } m["nic-ubg-pro"] = { } m["nic-ucr-pro"] = { } m["nic-vco-pro"] = { } m["nub-har"] = { aliases = {"Ḥarāza"}, } m["nub-pro"] = { } m["omq-cha-pro"] = { } m["omq-maz-pro"] = { aliases = {"Proto-Mazatecan"}, } m["omq-mix-pro"] = { } m["omq-mxt-pro"] = { } m["omq-otp-pro"] = { } m["omq-pro"] = 
{ aliases = {"Proto-Otomanguean", "Proto-Oto-Mangue"}, }
m["omq-sjq"] = { aliases = {"Chatino Sign Language", "San Juan Quiahije Chatino Sign Language"}, }
m["omq-tel"] = { }
m["omq-teo"] = { }
m["omq-tri-pro"] = { }
m["omq-zap-pro"] = { }
m["omq-zpc-pro"] = { }
m["omv-aro-pro"] = { }
m["omv-diz-pro"] = { aliases = {"Proto-Maji"}, }
m["omv-pro"] = { }
m["oto-otm-pro"] = { }
m["oto-pro"] = { }
m["paa-kom"] = { aliases = {"Komnzo", "Kómnjo", "Komnjo", "Kamundjo"}, }
m["paa-kwn"] = { }
m["paa-nha-pro"] = { }
m["paa-nun"] = { }
m["phi-din"] = { }
m["phi-kal-pro"] = { aliases = {"Proto-Calamian"}, }
m["phi-nag"] = { }
m["phi-pro"] = { }
m["poz-abi"] = { otherNames = {"Sembuak", "Tubu"}, }
m["poz-bal"] = { }
m["poz-btk-pro"] = { }
m["poz-cet-pro"] = { }
m["poz-hce-pro"] = { otherNames = {"Proto-South Halmahera - West New Guinea"}, }
m["poz-lgx-pro"] = { }
m["poz-mcm-pro"] = { }
m["poz-mic-pro"] = { }
m["poz-mly-pro"] = { }
m["poz-msa-pro"] = { }
m["poz-oce-pro"] = { }
m["poz-pep-pro"] = { aliases = {"Proto-Eastern-Polynesian", "Proto-East Polynesian", "Proto-East-Polynesian"}, }
m["poz-pnp-pro"] = { }
m["poz-pol-pro"] = { }
m["poz-pro"] = { otherNames = {"Proto-Western Malayo-Polynesian"}, -- Western is subsumed into general Proto-MP
}
m["poz-sml"] = { aliases = {"Sarawak"}, }
m["poz-ssw-pro"] = { }
m["poz-sus-pro"] = { }
m["poz-swa-pro"] = { }
m["poz-ter"] = { aliases = {"Terengganu"}, }
m["pqe-pro"] = { }
m["pra-niy"] = { }
m["qfa-adm-pro"] = { }
m["qfa-bet-pro"] = { aliases = {"Proto-Tai-Be"}, }
m["qfa-cka-pro"] = { }
m["qfa-hur-pro"] = { }
m["qfa-kad-pro"] = { }
m["qfa-kms-pro"] = { }
m["qfa-kor-pro"] = { }
m["qfa-kra-pro"] = { }
m["qfa-lic-pro"] = { }
m["qfa-onb-pro"] = { aliases = {"Proto-Ong-Be", "Proto-Bê"}, }
m["qfa-ong-pro"] = { }
m["qfa-tak-pro"] = { aliases = {"Proto-Tai-Kadai"}, }
m["qfa-yen-pro"] = { }
m["qfa-yuk-pro"] = { }
m["qwe-kch"] = { otherNames = {"Kichwa shimi", "Runashimi", "Runa", "Quichua", "Quecha", "Inga", "Chimborazo", "Imbabura Highland Kichwa", "Cañar Highland Quecha", "Quechua"}, }
m["qwe-pro"] = { }
m["roa-ang"] = { otherNames = {"Craonnais", "Baugeois", "Saumurois"}, }
m["roa-bbn"] = { otherNames = {"Bourbonnais", "Berrichon", "Moulins", "Allier", "Nivernais", "Haut-Berrichon", "Bas-Berrichon"}, }
m["roa-brg"] = { otherNames = {"Burgundian", "Bregognon", "Dijonnais", "Morvandiau", "Morvandeau", "Morvan", "Bourguignon-Morvandiau", "Mâconnais", "Brionnais", "Brionnais-Charolais", "Auxerrois", "Beaunois", "Langrois", "Valsaônois", "Verduno-Chalonnais", "Sédelocien"}, }
m["roa-cha"] = { otherNames = {"Bassignot", "Langrois", "Sennonais", "Vallage", "Troyen", "Briard", "Der", "Perthois", "Rémois", "Argonnais", "Porcien", "Ardennais", "Sugny"}, }
m["roa-fcm"] = { otherNames = {"Frainc-Comtou", "Comtois", "Jurassien", "Ajoulot", "Vâdais", "Taignon", "Bisontin", "Bousbot"}, }
m["roa-gal"] = { }
m["roa-gib"] = { }
m["roa-gis"] = { }
m["roa-leo"] = { }
m["roa-lor"] = { otherNames = {"Gaumais", "Vosgien", "Welche", "Argonnais", "Longovicien", "Messin", "Nancéien", "Spinalien", "Déodatien"}, }
m["roa-oan"] = { aliases = {"Old Aragonese"}, }
m["roa-oca"] = { }
m["roa-ole"] = { }
m["roa-opt"] = { aliases = {"Galician-Portuguese", "Galician Portuguese", "Medieval Galician", "Medieval Portuguese", "Old Galician", "Old Portuguese"}, }
m["roa-orl"] = { otherNames = {"Beauceron", "Solognot", "Gâtinais", "Blaisois", "Vendômois"}, }
m["roa-poi"] = { otherNames = {"Poitevin", "Saintongeais", "Maraîchin"}, }
m["roa-tar"] = { }
m["sai-all"] = { otherNames = {"Alyentiyak", "Huarpe", "Warpe"}, }
m["sai-and"] = { -- not to be confused with 'cbc' or 'ano'
	otherNames = {"Miranya", "Miranha", "Miranha Carapana-Tapuya", "Miraña-Carapana-Tapuyo", "Andokero", "Miranya-Karapana-Tapuyo", "Miraña", "Carapana"}, }
m["sai-ayo"] = { aliases = {"Ayoman", "Ayamán", "Ayaman"}, }
m["sai-bae"] = { aliases = {"Baenã", "Baenán", "Baena"}, }
m["sai-bag"] = { otherNames = {"Patagón de Bagua"}, }
m["sai-bet"] = { otherNames = {"Betoy", "Betoya", "Betoye", "Betoi-Jirara", "Jirara"}, }
m["sai-bor-pro"] = { otherNames = {"Proto-Bora-Muinane", "Proto-Bora-Muiname"}, }
m["sai-cac"] = { otherNames = {"Kakán", "Diaguita", "Cacan", "Kakan", "Calchaquí", "Chaka", "Kaka", "Kaká", "Caca", "Caca-Diaguita", "Catamarcano", "Capayán", "Capayana", "Yacampis"}, }
m["sai-caq"] = { otherNames = {"Cara", "Kara"}, }
m["sai-car-pro"] = { }
m["sai-cat"] = { }
m["sai-cer-pro"] = { otherNames = {"Proto-Amazonian Jê"}, }
m["sai-chi"] = { }
m["sai-chn"] = { aliases = {"Chana"}, }
m["sai-chp"] = { aliases = {"Txapacura", "Xapacura", "Guapore", "Šapakura", "Txapakura", "Txapakúra", "Xapakúra"}, }
m["sai-chr"] = { aliases = {"Charrúa", "Charruá"}, }
m["sai-chu"] = { aliases = {"Churoya"}, }
m["sai-cje-pro"] = { otherNames = {"Proto-Akuwẽ"}, }
m["sai-cmg"] = { aliases = {"Comechingón", "Comechingona", "Comechingone"}, }
m["sai-cno"] = { otherNames = {"Chonos", "Caucau"}, }
m["sai-cnr"] = { aliases = {"Cañar"}, }
m["sai-coe"] = { aliases = {"Koeruna"}, }
m["sai-col"] = { aliases = {"Colan"}, }
m["sai-cop"] = { }
m["sai-crd"] = { otherNames = {"Coroado"}, }
m["sai-ctq"] = { aliases = {"Catuquinarú", "Katukinaru"}, }
m["sai-cul"] = { otherNames = {"Culle", "Kulyi", "Ilinga", "Linga"}, }
m["sai-cva"] = { }
m["sai-esm"] = { otherNames = {"Esmeraldeño", "Atacame", "Takame"}, }
m["sai-ewa"] = { }
m["sai-gam"] = { aliases = {"Gamella", "Acobu", "Curinsi", "Barbados"}, }
m["sai-gay"] = { aliases = {"Gayon"}, }
m["sai-gmo"] = { otherNames = {"Wamo", "Santa Rosa", "San Jose", "Barinas", "Guamotey", "Guama"}, }
m["sai-gue"] = { aliases = {"Guenoa"}, }
m["sai-hau"] = { otherNames = {"Manek'enk"}, }
m["sai-jee-pro"] = { otherNames = {"Proto-Gê", "Proto-Jean", "Proto-Gean", "Proto-Jê-Kaingang", "Proto-Ye"}, }
m["sai-jko"] = { aliases = {"Geicó", "Jeicó", "Jaikó", "Geikó", "Yeikó", "Jeiko", "Geico", "Jeico", "Jaiko", "Geiko", "Yeiko", "Eyco"}, }
m["sai-jrj"] = { }
m["sai-kat"] = { -- contrast xoo, kzw, sai-xoc
	otherNames = {"Catrimbi", "Catembri", "Kariri de Mirandela", "Mirandela", "Kariri", "Kiriri"}, }
m["sai-mal"] = { aliases = {"Malali"}, }
m["sai-mar"] = { }
m["sai-mat"] = { otherNames = {"Matanauí", "Matanaui", "Matanawü", "Mitandua", "Moutoniway"}, }
m["sai-mcn"] = { aliases = {"Mokana"}, }
m["sai-men"] = { aliases = {"Menién"}, }
m["sai-mil"] = { otherNames = {"Milykayak", "Huarpe", "Warpe"}, }
m["sai-mlb"] = { aliases = {"Malibú", "Malebú"}, }
m["sai-msk"] = { aliases = {"Masakara", "Masacará", "Masacara"}, }
m["sai-muc"] = { otherNames = {"Mucuchi", "Mokochi", "Mocochí", "Mirripú", "Maripú", "Mucuchí-Maripú"}, }
m["sai-mue"] = { aliases = {"Muellamués"}, }
m["sai-muz"] = { }
m["sai-mys"] = { otherNames = {"Mayna", "Maina", "Rimachu"}, }
m["sai-nat"] = { otherNames = {"Natu", "Peagaxinan"}, }
m["sai-nje-pro"] = { otherNames = {"Proto-Core Jê"}, }
m["sai-opo"] = { otherNames = {"Opon", "Opón-Karare", "Opón-Carare", "Carare", "Carare-Opón"}, }
m["sai-oto"] = { aliases = {"Otomako", "Otomacan", "Otomac", "Otomak"}, }
m["sai-pal"] = { }
m["sai-pam"] = { aliases = {"Pamiwa"}, }
m["sai-par"] = { aliases = {"Paratio", "Prarto"}, }
m["sai-pnz"] = { aliases = {"Pansaleo"}, }
m["sai-prh"] = { }
m["sai-ptg"] = { otherNames = {"Patagón de Perico"}, }
m["sai-pur"] = { aliases = {"Purukoto", "Purucotó", "Purucoto"}, }
m["sai-pyg"] = { aliases = {"Payawá", "Payagua"}, }
m["sai-pyk"] = { aliases = {"Gavião-Pykobjê", "Pykobjê-Gavião", "Gavião", "Pyhcopji", "Gavião-Pyhcopji"}, }
m["sai-qmb"] = { otherNames = {"Kimbaya", "Quindío", "Quindio", "Quindo"}, }
m["sai-qtm"] = { aliases = {"Quitemoca"}, }
m["sai-rab"] = { }
m["sai-ram"] = { }
m["sai-sac"] = { otherNames = {"Sacata", "Zácata", "Chillao"}, }
m["sai-san"] = { aliases = {"Sanavirón", "Sanabirón", "Sanabiron", "Sanavirona", "Zanavirona"}, }
m["sai-sap"] = { aliases = {"Zapará", "Zapara"}, }
m["sai-sec"] = { otherNames = {"Sek", "Sec"}, }
m["sai-sin"] = { otherNames = {"Cenúfana", "Zenúfana", "Cinifaná", "Sinufana", "Sinú", "Cenú", "Zenú", "Finzenú", "Fincenú", "Pancenú", "Sutagao"}, }
m["sai-sje-pro"] = { }
m["sai-tab"] = { otherNames = {"Aconipa"}, }
m["sai-tal"] = { otherNames = {"Atalán", "Tallan", "Tallanca", "Atalan", "Sek"}, }
m["sai-tap"] = { otherNames = {"Tapayúna", "Kajkwakhrattxi"}, }
m["sai-tar-pro"] = { }
m["sai-teu"] = { aliases = {"Tehues", "Teuéx"}, }
m["sai-tim"] = { otherNames = {"Cuica", "Timote-Cuica"}, }
m["sai-tpr"] = { aliases = {"Taparito"}, }
m["sai-trr"] = { otherNames = {"Caratiú"}, }
m["sai-wai"] = { aliases = {"Waitaka", "Waitacá", "Waitaca", "Goytacá", "Goitacá", "Guaitacá", "Guiatacá", "Guiatacás", "Goiatacá", "Goiatacás", "Guaiatacá", "Goytacaz", "Goitacaz", "Goyataca", "Aitacaz", "Uetacaz", "Uetacá", "Outacá", "Ouetacá", "Eutacá", "Itacaz", "Vaitacá"}, }
m["sai-way"] = { aliases = {"Wajumará", "Wajumara", "Wayumará", "Azumara", "Guimara"}, }
m["sai-wit-pro"] = { otherNames = {"Proto-Huitotoan", "Proto-Uitotoan"}, }
m["sai-wnm"] = { otherNames = {"Wañam", "Wanyam", "Huanyam", "Uanham", "Abitana"}, }
m["sai-xoc"] = { -- contrast xoo, kzw, sai-kat
	otherNames = {"Xoco", "Chocó", "Shokó", "Shoko", "Shocó", "Shoco", "Choco", "Chocaz", "Kariri-Xocó", "Kariri-Xoco", "Kariri-Shoko", "Cariri-Chocó", "Xukuru-Kariri", "Xucuru-Kariri", "Xucuru-Cariri", "Xukurú-Kirirí"}, }
m["sai-yao"] = { aliases = {"Yao", "Jaoi", "Yaoi", "Yaio", "Anacaioury"}, }
m["sai-yar"] = { -- not the same family as 'suy'
	aliases = {"Yaruma"}, }
m["sai-yri"] = { aliases = {"Jurí"}, }
m["sai-yup"] = { otherNames = {"Yupuá", "Yupúa", "Jupua", "Jupuá", "Jupúa", "Hiupiá", "Yupuá-Duriña", "Duriña"}, }
m["sai-yur"] = { aliases = {"Yurumangui", "Yurimangí", "Yurimangi", "Yurimanguí", "Yurimangui"}, }
m["sal-pro"] = { aliases = {"Proto-Salishan"}, }
m["sdv-daj-pro"] = { }
m["sdv-eje-pro"] = { }
m["sdv-nil-pro"] = { }
m["sdv-nyi-pro"] = { }
m["sdv-tmn-pro"] = { }
m["sel-nor"] = { aliases = {"Taz Selkup"}, }
m["sel-pro"] = { }
m["sel-sou"] = { }
m["sem-amm"] = { }
m["sem-amo"] = { aliases = {"Amoritic"}, }
m["sem-cha"] = {
	aliases = {"Cheha", "Čäha", "Čäxa"}, }
m["sem-dad"] = { otherNames = {"Dadanite", "Lihyanite", "Lihyanitic"}, }
m["sem-dum"] = { }
m["sem-has"] = { }
m["sem-his"] = { otherNames = {"Thamudic E"}, }
m["sem-mhr"] = { otherNames = {"Muher Gurage", "Muxar", "Muxər", "Muhər", "Muḫər"}, }
m["sem-pro"] = { }
m["sem-saf"] = { }
m["sem-srb"] = { }
m["sem-tay"] = { otherNames = {"Taymanite", "Thamudic A"}, }
m["sem-tha"] = { }
m["sem-wes-pro"] = { }
m["sio-pro"] = { -- NB this is not Proto-Siouan-Catawban 'nai-sca-pro'
}
m["sit-bok"] = { otherNames = {"Ramo", "Pailibo"}, }
m["sit-bai-pro"] = { }
m["sit-cai"] = { }
m["sit-cha"] = { }
m["sit-hrs-pro"] = { }
m["sit-jap"] = { otherNames = {"Chabao", "Kuru"}, }
m["sit-kha-pro"] = { }
m["sit-liz"] = { }
m["sit-lnj"] = { }
m["sit-lrn"] = { }
m["sit-luu-pro"] = { }
m["sit-prn"] = { }
m["sit-pro"] = { }
m["sit-sit"] = { otherNames = {"Eastern rGyalrong", "rGyalrong", "Rgyalrong", "rGyalrongic", "Gyalrong", "Gyarong", "rGyarong", "Gyarung", "Jiarong", "Jiarongyu", "Jyarong", "Jyarung", "Yelong", "Kuru"}, }
m["sit-tam-pro"] = { aliases = {"Proto-Tamang"}, }
m["sit-tan-pro"] = { }
m["sit-tgm"] = { }
m["sit-tos"] = { }
m["sit-tsh"] = { otherNames = {"Caodeng", "Sidaba", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"}, }
m["sit-zbu"] = { otherNames = {"Ribu", "Rdzong'bur", "Rdzongmbur", "Showu", "rGyalrong", "Rgyalrong", "Jiarong", "Gyarung", "Kuru"}, }
m["sla-pro"] = { aliases = {"Common Slavic"}, }
m["smi-pro"] = { aliases = {"Proto-Sami"}, }
m["son-pro"] = { aliases = {"Proto-Songhai"}, }
m["sqj-pro"] = { }
m["ssa-klk-pro"] = { aliases = {"Proto-Rub"}, }
m["ssa-kom-pro"] = { }
m["ssa-pro"] = { }
m["syd-fne"] = { }
m["syd-pro"] = { }
m["tai-pro"] = { }
m["tai-swe-pro"] = { }
m["tbq-bdg-pro"] = { }
m["tbq-blg"] = { aliases = {"Pai-lang", "Pailang"}, }
m["tbq-gkh"] = { aliases = {"Gɔkhý", "Gɔkhy", "Gouke"}, }
m["tbq-kuk-pro"] = { otherNames = {"Proto-Kukish"}, }
m["tbq-lal-pro"] = { }
m["tbq-laz"] = { otherNames = {"Lare", "Shuitianhua"}, }
m["tbq-lob-pro"] = { }
m["tbq-lol-pro"] = { otherNames = {"Proto-Yi", "Proto-Ngwi", "Proto-Nisoic"}, }
m["tbq-mil"] = { }
m["tbq-mor"] = { aliases = {"Morān"}, }
m["tbq-ngo"] = { otherNames = {"Ngachang", "Achang"}, }
-- tbq-pro is now etymology-only
m["trk-dkh"] = { aliases = {"Dukha"}, }
m["trk-oat"] = { }
m["trk-pro"] = { }
m["tup-gua-pro"] = { }
m["tup-kab"] = { aliases = {"Kabixiana", "Cabixiana", "Cabishiana", "Kapishana", "Capishana", "Kapišana", "Cabichiana", "Capichana", "Capixana"}, }
m["tuw-alk"] = { aliases = {"Alechuka"}, }
m["tuw-bal"] = { }
m["tuw-kkl"] = { aliases = {"Chinese Kyakala"}, }
m["tuw-kli"] = { aliases = {"Kilen", "Kirin", "Kila", "Hezhe", "Qile'en"}, }
m["tup-pro"] = { }
m["tuw-pro"] = { }
m["tuw-sol"] = { }
m["urj-fin-pro"] = { }
m["urj-koo"] = { aliases = {"Old Permian"}, }
m["urj-kuk"] = { aliases = {"Kukkuzi Votic", "Kukkuzi Ingrian", "Kukkusi"}, }
m["urj-kya"] = { }
m["urj-mdv-pro"] = { }
m["urj-prm-pro"] = { }
m["urj-pro"] = { otherNames = {"Proto-Finno-Ugric", "Proto-Finno-Permic"}, -- PFU and PFP are subsumed into PU per [[Wiktionary:Beer parlour/2015/January#Merging Finno-Volgaic, Finno-Samic, Finno-Permic and Finno-Ugric into Uralic]]
}
m["urj-ugr-pro"] = { }
m["xgn-pro"] = { }
m["xnd-pro"] = { otherNames = {"Proto-Na-Dené", "Proto-Athabaskan-Eyak-Tlingit"}, }
m["yok-bvy"] = { otherNames = {"Tulamni-Hometwoli", "Tulamni", "Tulamne", "Tuolumne", "Tawitchi", "Hometwoli", "Taneshach"}, }
m["yok-dly"] = { otherNames = {"Far Northern Valley Yokuts", "Yachikumne", "Yachikumni", "Chulamni", "Lower San Joaquin", "Lakisamni", "Tawalimni"}, }
m["yok-gsy"] = { }
m["yok-kry"] = { otherNames = {"Choinimni", "Choynimni", "Ayticha", "Kocheyali", "Ayitcha", "Michahay", "Chukaymina", "Chukaimina"}, }
m["yok-nvy"] = { otherNames = {"Chukchansi", "Kechayi", "Dumna", "Chawchila", "Noptinte", "Nopṭinṭe", "Nopthrinthre", "Nopchinchi", "Takin"}, }
m["yok-ply"] = { otherNames = {"Paleuyami", "Altinin", "Poso Creek", "Poso Creek Yokuts"}, }
m["yok-svy"] = { otherNames = {"Yawelmani", "Tachi", "Koyeti", "Nutunutu", "Chunut", "Wo'lasi", "Choynok", "Choinok", "Wechihit"}, }
m["yok-tky"] = { otherNames = {"Wikchamni", "Wukchamni", "Wukchumni", "Yawdanchi"}, }
m["ypk-pro"] = { }
m["zhx-min-pro"] = { }
m["zhx-sht"] = { otherNames = {"Xiangnan Tuhua", "Yuebei Tuhua", "Shipo", "Shina"}, }
m["zhx-sic"] = { otherNames = {"Sichuanese Mandarin"}, }
m["zhx-tai"] = { aliases = {"Toishanese"}, }
m["zlw-mas"] = { aliases = {"Mazurian"}, }
m["zle-ono"] = { }
m["zle-ort"] = { }
m["zlw-ocs"] = { }
m["zlw-opl"] = { }
m["zlw-osk"] = { }
m["zlw-slv"] = { }
m["zlm-coa"] = { }
m["zlm-pah"] = { }
return m
c44ahyqdiyqwf2wdft7nfmybeiuymem Modul:Jpan-headword 828 34824 281360 245830 2026-04-22T06:28:21Z PeaceSeekers 3334 281360 Scribunto text/plain
local m_ja = require("Module:ja")
local m_ja_ruby = require("Module:ja-ruby")
local m_str_utils = require("Module:string utilities")

local byteoffset = mw.ustring.byteoffset
local concat = table.concat
local gsplit = m_str_utils.gsplit
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local max_index = require("Module:table").maxIndex
local moraify = m_ja.moraify
local remove = table.remove
local ugmatch = mw.ustring.gmatch
local ugsub = m_str_utils.gsub
local ulen = m_str_utils.len
local ulower = m_str_utils.lower
local umatch = mw.ustring.match
local usub = m_str_utils.sub

local export = {}
local pos_functions = {}

local range = mw.loadData('Module:ja/data/range')
local Jpan = require("Module:scripts").getByCode("Jpan")

local function remove_links(text)
	return (text:gsub("%[%[[^|%]]-|", "")
		:gsub("%[%[", "")
		:gsub("%]%]", ""))
end

local function assign_kana_to_kanji(head, kana, pagename, template_name)
	-- TODO: uses deprecated module
	local m_tu = require'Module:template utilities'
	local kanji_pos = {[0] = { nil, 0}}
	local head_nolink = {}
	local link_border = 0
	local function insert_kanji_pos(substr)
		insert(head_nolink, substr)
		for p1,
		w1 in ugmatch(substr, '()([々' .. range.kanji .. '])') do
			p1 = byteoffset(substr, p1) + link_border
			insert(kanji_pos, { p1, p1 + w1:len() - 1 })
		end
	end
	for p1, p2, w1 in m_tu.gfind_bracket(head, {['%[%['] = ']]'}) do
		insert_kanji_pos(head:sub(link_border + 1, p1 - 1))
		local p_pipe = w1:find'|' or 2
		link_border = p1 + p_pipe - 1
		insert_kanji_pos(w1:sub(p_pipe + 1, -3))
		link_border = p2
	end
	insert_kanji_pos(head:sub(link_border + 1))
	head_nolink = concat(head_nolink)
	local pagetext = mw.title.new(pagename):getContent()
	if not pagetext then
		return head, kana
	end
	local non_kanji = {}
	local last_kanji = 1
	for p1 in ugmatch(head_nolink, '[々' .. range.kanji .. ']()') do
		insert(non_kanji, usub(head_nolink, last_kanji, p1 - 2))
		last_kanji = p1
	end
	insert(non_kanji, usub(head_nolink, last_kanji))
	for kanjitab in pagetext:gmatch('(){{%s*' .. template_name) do
		kanjitab = select(3, m_tu.find_bracket(pagetext, m_tu.brackets_temp, kanjitab))
		if not kanjitab then
			error('ill-formed [[t:' .. template_name:gsub('%%', '') .. ']] syntax')
		end
		kanjitab = m_tu.parse_temp(kanjitab)
		local readings = {}
		local readings_len = {}
		for i = 1, max_index(kanjitab.args) do
			local r_i = kanjitab.args[i] or ''
			local r_o = kanjitab.args['o' .. i] or ''
			if kanjitab.args['k' .. i] then
				readings[i] = kanjitab.args['k' .. i] .. r_o
				readings_len[i] = tonumber(r_i:match'^%s*%D*(%d*)%s*$') or 1
			else
				local r_kana, r_len = r_i:match'^%s*(%D*)(%d*)%s*$'
				readings[i] = r_kana .. r_o
				readings_len[i] = tonumber(r_len) or 1
			end
		end
		local kana_decom = {}
		local reading_id = 1
		local reading_len = 1
		for i = 1, #non_kanji - 1 do
			if reading_len <= 1 then
				reading_len = readings_len[reading_id] or 1
				insert(kana_decom, non_kanji[i])
				insert(kana_decom, readings[reading_id])
				reading_id = reading_id + 1
			else
				reading_len = reading_len - 1
			end
		end
		insert(kana_decom, non_kanji[#non_kanji])
		local function strip_nonkana(str, repl)
			return ugsub(str, '[^' .. range.kana ..
				']+', repl) or nil
		end
		local xeno_reading = {strip_nonkana(kana, ''):match('^' .. strip_nonkana(concat(kana_decom), '(.-)') .. '$')}
		if #xeno_reading > 0 then
			local head_decom = {}
			reading_id = 1
			reading_len = 1
			for i = 1, #non_kanji - 1 do
				if reading_len <= 1 then
					reading_len = readings_len[reading_id] or 1
					insert(head_decom, head:sub(kanji_pos[i - 1][2] + 1, kanji_pos[i][1] - 1))
					insert(head_decom, head:sub(kanji_pos[i][1], kanji_pos[i + reading_len - 1][2]))
					reading_id = reading_id + 1
				else
					reading_len = reading_len - 1
				end
			end
			insert(head_decom, head:sub(kanji_pos[#non_kanji - 1][2] + 1))
			if #head_decom ~= #kana_decom then
				error('number of parameters in [[t:' .. template_name:gsub('%%', '') .. ']] is incorrect')
			end
			local n_xeno_reading = 0
			for i = 1, #kana_decom, 2 do
				kana_decom[i] = ugsub(kana_decom[i], '[^' .. range.kana .. ']+', function()
					n_xeno_reading = n_xeno_reading + 1
					if xeno_reading[n_xeno_reading] == '' then
						return nil
					else
						return xeno_reading[n_xeno_reading]
					end
				end)
			end
			return concat(head_decom, '%'), concat(kana_decom, '%')
		end
	end
	return head, kana
end

local en_grades = {
	"gred pertama", "gred kedua", "gred ketiga",
	"gred keempat", "gred kelima", "gred keenam",
	"sekolah menengah", "jinmeiyō", "hyōgai"
}

local aliases = {
	['transitive']='tr', ['trans']='tr',
	['intransitive']='in', ['intrans']='in', ['intr']='in',
	['godan']='1', ['ichidan']='2', ['irregular']='irr'
}

local adverbs_optional_tag = 'optionally '
local adverbs_optional_aliases = {
	['to']='と', ['と']='と', ['ト']='と',
	['ni']='に', ['に']='に', ['ニ']='に',
}
local adverbs_optional_links = {
	['と']='[[と#Japanese:_adverbs|と]]',
	['に']='[[に]]',
}

local function formatting_adjustments(rom, kana, pos_category)
	-- hyphens for prefixes, suffixes, and counters (classifiers)
	if pos_category == "Awalan" then
		rom = rom:gsub('%-?$', '-')
	elseif pos_category == "Akhiran" or pos_category == "Bentuk akhiran" or pos_category == "counters" or pos_category == "classifiers" then
		rom = rom:gsub('^%-?', '-')
	elseif
		pos_category == "Kata nama khas" and not kana:match'%^' then
		-- automatic caps for proper nouns, if not already specified
		rom = ugsub(ugsub(rom, '%f[^%s%c%p]%l', string.uupper), "%w'%u", ulower) -- no caps after medial apostrophes
	end
	return rom
end

local function kana_to_romaji_with_pos_format(kana, data, args)
	if data.headword.pos_category == "Bentuk gabungan" or data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" then
		return "-"
	end
	local rom = remove_links(kana_to_romaji(kana, data.lang_code))
	-- make adjustments for -u verbs and -i adjectives
	if args['infl'] == '1' or args['infl'] == '1s' or args['infl'] == 'godan' then
		rom = rom:gsub('ō$', 'ou'):gsub('ū$', 'uu')
	elseif args['infl'] == 'i' or args['infl'] == 'is' or args['infl'] == 'い' then
		rom = rom:gsub('ī$', 'ii')
	end
	return formatting_adjustments(rom, kana, data.headword.pos_category)
end

local function iterate_rare_chars(text)
	local ch, i
	return function()
		repeat
			ch, i = umatch(text, "([" .. range.kana .. range.kana_graph .. "!-/:-@%[\\-`×△○◎。-〠〶〷〻-〽・·゠=~][゙゚]*)()", i)
		until not (ch and umatch(ch, "^[ぁ-ちっつて-ろんァ-チッツテ-ロンヲ-゚]$"))
		return ch
	end
end

local function historical_kana(data, hist_kana, modern_kana)
	-- Disallow historical kana for kana and morae, as there's no one-to-one correspondence.
	local pos = data.headword.pos_category
	if pos == "syllables" or pos == "kana" or pos == "morae" then
		error(("Cannot specify historical kana for %s."):format(pos))
	end
	local hist_kana_no_formatting = hist_kana:gsub("[%^%-%. %%]+", "")
	local rare_chars, lang_name, hc = {}, data.lang_name, data.headword.categories
	for ch in iterate_rare_chars(hist_kana_no_formatting) do
		if not (modern_kana and modern_kana:find(ch)) then
			rare_chars[ch] = true
		end
	end
	for _, mora in ipairs(moraify((ugsub(hist_kana_no_formatting, "[^" .. range.kana .. "]+", " ")))) do
		if not (mora:gsub(" +", ""):match("^.?[\128-\191]*$") or (modern_kana and modern_kana:find(mora))) then
			rare_chars[mora] = true
		end
	end
	for ch in pairs(rare_chars) do
		insert(hc, "Perkataan mengikut sejarah dieja dengan " .. ch .. " bahasa " .. lang_name)
	end
	insert(data.info_hist, require("Module:ja-link").link({
		lang = data.headword.lang,
		lemma = hist_kana,
		tr = formatting_adjustments(
			remove_links(kana_to_romaji(hist_kana, data.lang_code, nil, {hist = true})),
			hist_kana,
			pos
		)
	}, {
		face = "head",
		disableSelfLink = true,
	}))
end

local function detect_pagename_kana(data, digraphs)
	local pagename = data.pagename
	-- Exclude "&" and "@", which are part of %p (e.g. リズム&ブルース).
	local function remove_kana(m)
		return m:match("[&@]") or ""
	end
	if ugsub(pagename, '[%p%s%c' .. range.hiragana .. (digraphs and "ゟ" or "") .. ']', remove_kana) == "" then
		return 'hira'
	elseif ugsub(pagename, '[%p%s%c' .. range.katakana .. (digraphs and "ヿ" or "") .. ']', remove_kana) == "" then
		return 'kata'
	elseif ugsub(pagename, '[%p%s%c' .. range.kana .. (digraphs and "ゟヿ" or "") .. ']', remove_kana) == "" then
		return 'both'
	end
end

-- go through args and build inflections by finding whatever kanas were given to us
local function format_headword(args, data)
	local pagename, kanas, lang_name = data.pagename, data.kanas, data.lang_name
	data.pagename_kana = detect_pagename_kana(data)
	if args[1][1] and not args[1][1]:match'[\128-\255]' then -- filter out POS designations
		remove(args[1], 1)
	end
	local linked_translit = data.headword.lang:link_tr(Jpan)
	local suru_ending, rom_suru_ending
	if data.headword.pos_category == "kata kerja suru" then
		suru_ending = "[[する]]"
		rom_suru_ending = linked_translit and " [[suru]]" or " suru"
	else
		suru_ending, rom_suru_ending = "", ""
	end
	if data.pagename_kana then
		-- pure-kana-title entry
		if #args.head > 0 or args.head.default then
			insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " ..
				lang_name)
		end
		-- {{ja-xxx}} vs {{ja-xxx|こ.うし}} vs {{ja-xxx|コウシ}} in [[こうし]]
		if not args[1][1] then
			args[1][1] = pagename
		elseif remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) ~= pagename then
			insert(args[1], 1, pagename)
		end
		for i, k in ipairs(args[1]) do
			insert(data.headword.heads, {
				term = k:gsub("[%^%-%. %%]+", "") .. suru_ending,
				tr = '-',
				l = args.label[i] and {args.label[i]} or nil,
			})
		end
		for i = 1, math.max(args.rom.maxindex, 1) do
			local rom = args.rom[i] or args.rom.default or kana_to_romaji_with_pos_format(args[1][1], data, args)
			if not data.headword.heads[i] then
				data.headword.heads[i] = {term = data.headword.heads[i-1].term}
			end
			if rom == "-" then
				data.headword.heads[i].tr = "-"
			elseif linked_translit then
				data.headword.heads[i].tr = "[[" .. rom .. "]]" .. rom_suru_ending
			else
				data.headword.heads[i].tr = rom .. rom_suru_ending
			end
			if not data.inflection_base.form then
				data.inflection_base.form = remove_links(args[i][1]:gsub("[%^%-%. %%]+", "")) .. suru_ending
				data.inflection_base.romaji = rom .. rom_suru_ending
			end
		end
		kanas[1] = pagename
		if args.hist[1] then
			historical_kana(data, args.hist[1], args[1][1])
		end
	else
		-- non-pure-kana-title entry
		if #args[1] == 0 and not (data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" or data.headword.pos_category == "Simbol") then
			error("Kana form is required.")
		end
		if args.head.default == pagename then
			insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
		end
		local rom_repetition_final = {}
		for i, k in ipairs(args[1]) do
			local rom_auto = kana_to_romaji_with_pos_format(k, data, args)
			local head = args.head[i] or args.head.default or pagename
			if args.head[i] == pagename then
				insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " ..
					lang_name)
			end
			local head_for_ruby, kana_for_ruby
			if ulen(head) > 1 and head:match'%%' == nil and k:match'%%' == nil then
				head_for_ruby, kana_for_ruby = assign_kana_to_kanji(head, k, pagename, data.lang_code .. '%-kanjitab')
			else
				head_for_ruby, kana_for_ruby = head, k
			end
			local format_table = m_ja_ruby.parse_text(head_for_ruby, kana_for_ruby, {
				try = 'force',
				try_force_limit = 10000
			})
			local kana_bare = remove_links(k:gsub("[%^%-%. %%]+", ""))
			local rom = args.rom[i] or args.rom.default or rom_auto
			head = {
				term = m_ja_ruby.to_wiki(format_table, {
					break_link = true,
				}):gsub('<rt>(..-)</rt>', "<rt>[[" .. kana_bare .."|%1]]</rt>") .. suru_ending,
				l = args.label[i] and {args.label[i]} or nil,
			}
			if rom == "-" or rom_repetition_final[rom] then
				head.tr = "-"
			elseif linked_translit then
				head.tr = "[[" .. rom .. "]]" .. rom_suru_ending
			else
				head.tr = rom .. rom_suru_ending
			end
			insert(data.headword.heads, head)
			rom_repetition_final[rom] = true
			insert(kanas, kana_bare)
			if args.hist[i] then
				historical_kana(data, args.hist[i], k)
			end
			if not data.inflection_base.form then
				data.inflection_base.form = remove_links(m_ja_ruby.to_markup(format_table)) .. suru_ending
				data.inflection_base.romaji = rom .. rom_suru_ending
			end
		end
		local first_reading, multiple = kanas[1]
		if not first_reading then
			return
		end
		first_reading = ulower(kana_to_romaji(first_reading, data.lang_code)):gsub("%%", "")
		for i = 2, #kanas do
			if ulower(kana_to_romaji(kanas[i], data.lang_code)):gsub("%%", "") ~= first_reading then
				multiple = true
				break
			end
		end
		if not multiple then
			local lang_code = data.lang_code
			local content = mw.title.getCurrentTitle():getContent()
			local loc1, loc2 = content:find("%f[^%z%s]==%s*" .. lang_name:gsub("%-", "%%%-") .. "%s*==()")
			loc2 = content:find("%f[^%z%s]==[^\n=]+==", loc2)
			if loc1 then
				content = content:sub(loc1, loc2)
				for template in require("Module:template parser").find_templates(content) do
					local name, reading = template:get_name()
					if (
						name == lang_code .. "-head" or
						name == lang_code .. "-pos"
					) then
						reading = template:get_arguments()[2]
						if reading ~= nil then
							reading = remove_links(reading):gsub("%%", "")
						end
					elseif (
						name == lang_code .. "-noun" or
						name == lang_code .. "-verb" or
						name == lang_code .. "-adj" or
						name == lang_code .. "-phrase" or
						name == lang_code .. "-verb form" or
						name == lang_code .. "-verb-suru"
					) then
						reading = template:get_arguments()[1]
						if reading ~= nil then
							reading = remove_links(reading):gsub("%%", "")
						end
					elseif name == lang_code .. "-see" then
						reading = template:get_arguments()[1]
						if reading ~= nil then
							reading = remove_links(reading):gsub("%%", "")
						end
						-- if umatch(reading, "[^" .. range.kana .. "]") then
							-- TODO: check linked page
						-- end
					end
					if reading and ulower(kana_to_romaji(reading, lang_code)):gsub("%%", "") ~= first_reading then
						multiple = true
					end
				end
			end
		end
		if multiple then
			insert(data.headword.categories, "Perkataan dengan pelbagai bacaan bahasa " .. lang_name)
		end
	end
end

local function add_transitivity(data, tr)
	local categories, lang_name = data.headword.categories, data.lang_name
	tr = aliases[tr] or tr
	if tr == "tr" then
		insert(data.info_mid, 'transitive')
		insert(categories, "Kata kerja transitif bahasa " .. lang_name)
	elseif tr == "in" then
		insert(data.info_mid, 'intransitive')
		insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
	elseif tr == "both" then
		insert(data.info_mid, 'transitive or intransitive')
		insert(categories, "Kata kerja transitif bahasa " .. lang_name)
		insert(categories, "Kata kerja tak transitif bahasa " .. lang_name)
	else
		insert(categories, "Kata kerja tanpa ketransitifan bahasa " ..
			lang_name)
	end
end

local function get_final(lemma, data)
	return kana_to_romaji(remove(moraify(m_ja_ruby.to_ruby(m_ja_ruby.parse_markup(lemma)))), data.lang_code)
end

local function add_language_fragment(t, lang_name)
	for k, v in ipairs(t) do
		t[k] = v:gsub("%[%[([^]#]*)%]%]", "[[%1#%%s|%1]]"):format(lang_name)
	end
end

local function add_inflections(data, inflection_type, cat_suffix)
	local lang_name = data.lang_name
	local lemma = data.inflection_base.form
	local romaji = data.inflection_base.romaji
	inflection_type = aliases[inflection_type] or inflection_type

	local function replace_suffix(lemma_from, lemma_to, romaji_from, romaji_to)
		-- e.g. 持って来る, lemma = "[持](も)って来(く)る"
		-- lemma_from = "くる", lemma_to = {"き","きた"}
		add_language_fragment(lemma_to, lang_name)
		add_language_fragment(romaji_to, lang_name)
		local result = {}
		local pattern_from, n_from = lemma_from:gsub('.[\128-\191]*', function(c) return '[' .. c .. m_ja.hira_to_kata(c) .. ']([^' .. range.kana .. ']*)' end)
		pattern_from = pattern_from .. '$'
		-- "[くク]([^kana range]*)[るル]([^kana range]*)$"
		for i_lemma_to, s_lemma_to in ipairs(lemma_to) do
			local n_to = 0
			local pattern_to = s_lemma_to:gsub('.[\128-\191]*', function(c)
				if n_to < n_from then
					n_to = n_to + 1
					return c .. "%" .. n_to
				else
					return c
				end
			end)
			for i = n_to + 1, n_from do
				pattern_to = pattern_to .. "%" .. i
			end
			-- "き%1%2", "き%1た%2"
			local lemma_inflected, success = ugsub(lemma, pattern_from, pattern_to)
			if success == 0 then return end
			local romaji_inflected
			romaji_inflected, success = romaji:gsub(romaji_from .. "$", romaji_to[i_lemma_to])
			if success == 0 then
				romaji_inflected, success = romaji:gsub("%[%[" .. romaji_from .. "%]%]$", "[[" .. romaji_to[i_lemma_to] .. "]]")
				if success == 0 then return end
			end
			insert(result, {lemma = lemma_inflected, romaji = romaji_inflected})
		end
		return result
		-- {{lemma="[持](も)って来(き)",romaji="motteki"},{lemma="[持](も)って来(き)た",romaji="mottekita"}}
	end

	local function insert_form(label, ...)
		-- label = "stem" or "past" etc.
		-- ... = {lemma=...,romaji=...},{lemma=...,romaji=...}
		local labeled_forms = {label = label}
		for _, v in ipairs{...} do
			local table_form = m_ja_ruby.parse_markup(v.lemma)
			local form_term = m_ja_ruby.to_wiki(table_form)
			if not form_term:find'%[%[.+%]%]' then
				form_term = '[[' .. m_ja_ruby.to_text(table_form) .. '#' .. lang_name .. '|' .. form_term .. ']]'
			end
			insert(labeled_forms, {
				term = form_term,
				tr = v.romaji,
			})
		end
		insert(data.headword.inflections, labeled_forms)
	end

	local inflected_forms
	if data.lang_code == 'ja' then
		if inflection_type == '1' or inflection_type == '1s' then
			insert(data.info_mid, '<abbr title="godan (group 1) conjugation">godan</abbr>')
			if cat_suffix then
				insert(data.headword.categories, cat_suffix .. " godan bahasa " .. lang_name)
				local romaji = data.inflection_base.romaji
				if cat_suffix == "Kata kerja" then
					local final = get_final(lemma, data)
					-- concatenate the final mora; without the ".." Lua would try to call the string `final`
					insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -" .. final .. " bahasa " .. lang_name)
					if final == "ru" then
						if umatch(romaji, "[iIīĪ]ru$") then
							insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -iru bahasa " .. lang_name)
						elseif umatch(romaji, "[eEēĒ]ru$") then
							insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -eru bahasa " ..
								lang_name)
						end
					end
				end
			end
			if inflection_type == '1' then
				inflected_forms =
					replace_suffix('く', {'き', 'いた'}, 'ku', {'ki', 'ita'}) or
					replace_suffix('ぐ', {'ぎ', 'いだ'}, 'gu', {'gi', 'ida'}) or
					replace_suffix('す', {'し', 'した'}, 'su', {'shi', 'shita'}) or
					replace_suffix('つ', {'ち', 'った'}, 'tsu', {'chi', 'tta'}) or
					replace_suffix('ぬ', {'に', 'んだ'}, 'nu', {'ni', 'nda'}) or
					replace_suffix('ぶ', {'び', 'んだ'}, 'bu', {'bi', 'nda'}) or
					replace_suffix('む', {'み', 'んだ'}, 'mu', {'mi', 'nda'}) or
					replace_suffix('る', {'り', 'った'}, 'ru', {'ri', 'tta'}) or
					replace_suffix('う', {'い', 'った'}, 'u', {'i', 'tta'})
				if inflected_forms then
					insert_form('dasar', inflected_forms[1])
					insert_form('lampau', inflected_forms[2])
				else
					require'Module:debug'.track'Jpan-headword/inflection failed/ja'
				end
			else
				inflected_forms =
					replace_suffix('る', {'り', 'った', 'い'}, 'ru', {'ri', 'tta', 'i'}) or --くださる
					replace_suffix('いく', {'いき', 'いった'}, 'iku', {'iki', 'itta'}) or --行く
					replace_suffix('う', {'い', 'うた'}, 'ou', {'oi', 'ōta'}) --問う
				if inflected_forms then
					insert_form('dasar', inflected_forms[1], inflected_forms[3])
					insert_form('lampau', inflected_forms[2])
				else
					require'Module:debug'.track'Jpan-headword/inflection failed/ja'
				end
			end
		elseif inflection_type == '2' then
			insert(data.info_mid, '<abbr title="ichidan (group 2) conjugation">ichidan</abbr>')
			if cat_suffix then
				insert(data.headword.categories, cat_suffix .. " ichidan bahasa " .. lang_name)
				local romaji = data.inflection_base.romaji
				if umatch(romaji, "[iIīĪ]ru$") then
					insert(data.headword.categories, cat_suffix .. " kami ichidan bahasa " .. lang_name)
				elseif umatch(romaji, "[eEēĒ]ru$") then
					insert(data.headword.categories, cat_suffix .. " shimo ichidan bahasa " .. lang_name)
				else
					insert(data.headword.categories, cat_suffix .. " irregular bahasa " ..
lang_name) end end inflected_forms = replace_suffix('る', {'', 'た'}, 'ru', {'', 'ta'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'suru' then insert(data.info_mid, '<abbr title="suru (group 3) conjugation">suru</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " suru bahasa " .. lang_name) end inflected_forms = replace_suffix('する', {'し', 'した'}, 'suru', {'shi', 'shita'}) or replace_suffix('ずる', {'じ', 'じた'}, 'zuru', {'ji', 'jita'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'kuru' then insert(data.info_mid, '<abbr title="kuru (group 3) conjugation">kuru</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " kuru bahasa " .. lang_name) end inflected_forms = replace_suffix('くる', {'き', 'きた'}, 'kuru', {'ki', 'kita'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'i' or inflection_type == 'い' then insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name) end inflected_forms = replace_suffix('い', {'く'}, 'i', {'ku'}) if inflected_forms then insert_form('adverbial', inflected_forms[1]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'is' then insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. 
lang_name) end inflected_forms = replace_suffix('いい', {'よく'}, 'ii', {'yoku'}) if inflected_forms then insert_form('adverbial', inflected_forms[1]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'na' or inflection_type == 'な' then insert(data.info_mid, '<abbr title="-na (type II) inflection">-na</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " な-na bahasa " .. lang_name) end inflected_forms = replace_suffix('', {'[[な]]', '[[に]]'}, '', {' [[na]]', ' [[ni]]'}) insert_form('adnominal', inflected_forms[1]) insert_form('adverbial', inflected_forms[2]) elseif inflection_type == "yo" then insert(data.info_mid, '<abbr title="yodan conjugation (classical)"><sup><small>†</small></sup>yodan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " yodan " .. lang_name) insert(data.headword.categories, cat_suffix .. " yodan berakhir dengan -" .. get_final(lemma, data) .. " bahasa ".. lang_name) end elseif inflection_type == "kami ni" then insert(data.info_mid, '<abbr title="kami nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name) insert(data.headword.categories, cat_suffix .. " kami nidan bahasa " .. lang_name) end elseif inflection_type == "shimo ni" then insert(data.info_mid, '<abbr title="shimo nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name) insert(data.headword.categories, cat_suffix .. " shimo nidan bahasa " .. 
lang_name) end elseif inflection_type == "rahen" then insert(data.info_mid, '<abbr title="r-special conjugation (classical)"><sup><small>†</small></sup>-ri</abbr>') elseif inflection_type == "sahen" then insert(data.info_mid, '<abbr title="s-special conjugation (classical)"><sup><small>†</small></sup>-se</abbr>') elseif inflection_type == "kahen" then insert(data.info_mid, '<abbr title="k-special conjugation (classical)"><sup><small>†</small></sup>-ko</abbr>') elseif inflection_type == "nahen" then insert(data.info_mid, '<abbr title="n-special conjugation (classical)"><sup><small>†</small></sup>-n</abbr>') elseif inflection_type == "nari" or inflection_type == "なり" then insert(data.info_mid, '<abbr title="-nari inflection (classical)"><sup><small>†</small></sup>-nari</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " なり-nari bahasa " .. lang_name) end elseif inflection_type == 'tari' or inflection_type == 'たり' then insert(data.info_mid, '<abbr title="-tari inflection (classical)"><sup><small>†</small></sup>-tari</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " たり-tari bahasa " .. lang_name) end inflected_forms = replace_suffix('', {'[[とした]]', '[[たる]]', '[[と]]', '[[として]]'}, '', {' [[to shita]]', ' [[taru]]', ' [[to]]', ' [[to shite]]'}) insert_form('adnominal', inflected_forms[1], inflected_forms[2]) insert_form('adverbial', inflected_forms[3], inflected_forms[4]) elseif inflection_type == "ku" or inflection_type == "く" then insert(data.info_mid, '<abbr title="-ku inflection (classical)"><sup><small>†</small></sup>-ku</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " く-ku bahasa " .. lang_name) end elseif inflection_type == "shiku" or inflection_type == "しく" then insert(data.info_mid, '<abbr title="-shiku inflection (classical)"><sup><small>†</small></sup>-shiku</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " しく-shiku bahasa " .. 
					lang_name)
			end
		elseif inflection_type == "ka" or inflection_type == "か" then
			insert(data.info_mid, '<abbr title="-ka inflection (dialectal)"><sup><small>†</small></sup>-ka</abbr>')
			if cat_suffix then
				insert(data.headword.categories, cat_suffix .. " か-ka " .. lang_name)
			end
		elseif inflection_type and inflection_type:len() > adverbs_optional_tag:len() and inflection_type:sub(1, adverbs_optional_tag:len()) == adverbs_optional_tag then
			local adverbs_optional_list = inflection_type:sub(adverbs_optional_tag:len() + 1)
			for option in gsplit(adverbs_optional_list, ':') do
				local normalized_option = adverbs_optional_aliases[option]
				if not normalized_option then
					error('unrecognized adverb opt= argument: "' .. option .. '"')
				end
				local normalized_option_romaji = kana_to_romaji(normalized_option, data.lang_code)
				local normalized_option_link = adverbs_optional_links[normalized_option]
				inflected_forms = replace_suffix('', {normalized_option_link}, '', {' [[' .. normalized_option_romaji .. ']]'})
				insert_form('optionally as', inflected_forms[1])
				if cat_suffix then
					insert(data.headword.categories, cat_suffix .. " secara pilihan mengambil " .. normalized_option .. "-" .. normalized_option_romaji .. " bahasa " .. lang_name)
				end
			end
		elseif inflection_type == 'irr' then
			insert(data.info_mid, 'irregular')
			if cat_suffix then
				insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name)
			end
		elseif inflection_type == '-' or inflection_type == 'un' then
			insert(data.info_mid, 'uninflectable')
		end
	--elseif data.lang_code == 'ryu' then ...
	end
end

local function add_categories(data)
	local lang_name = data.lang_name
	local pagename = data.pagename
	local tc = data.headword.categories

	-- adds category [langname] terms spelled with jōyō kanji or [langname] terms spelled with non-jōyō kanji
	-- (if it contains any kanji)
	local number_of_kanji = 0
	for c in ugmatch(pagename, "[" .. range.kanji ..
"々〻]") do number_of_kanji = number_of_kanji + 1 if c ~= "々" and c ~= "〻" then -- Not a kanji for the purposes of categorisation. insert(tc, ("Perkataan dieja dengan kanji %s bahasa " .. lang_name):format(en_grades[m_ja.kanji_grade(c)])) end end -- categorize by number of kanji if number_of_kanji ~= 0 then insert(tc, ("Perkataan dengan %s aksara kanji bahasa " .. lang_name):format(number_of_kanji)) -- single-kanji terms if ulen(pagename) == 1 then insert(tc, "Perkataan dieja dengan " .. pagename .. " bahasa " .. lang_name) insert(tc, "Perkataan kanji tunggal bahasa " .. lang_name) end end -- categorize by the script of the pagename or specific characters contained in it -- if pagename is hiragana or katakana if detect_pagename_kana(data, true) == 'hira' then insert(tc, "Hiragana bahasa " .. lang_name) end if detect_pagename_kana(data, true) == 'kata' then insert(data.katakana_category, "Katakana bahasa " .. lang_name) end local p, n = ugsub(pagename, '[' .. range.kana .. range.kanji .. range.ideograph .. range.kana_graph .. range.punctuation .. ']+', '') if p ~= '' and n > 0 then insert(tc, "Perkataan ditulis dalam pelbagai tulisan bahasa " .. lang_name) end local pos = data.headword.pos_category local rare_chars = {} for ch in iterate_rare_chars(pagename) do rare_chars[ch] = true end -- Categorise yōon, but exclude kana and mora entries, since they can't be spelled with themselves. -- FIXME: allow kana categories for morae. if not (pos == "syllables" or pos == "kana" or pos == "morae") then for _, mora in ipairs(moraify((ugsub(pagename, "[^" .. range.kana .. "]+", " ")))) do if not mora:gsub(" +", ""):match("^.?[\128-\191]*$") then rare_chars[mora] = true end end end for ch in pairs(rare_chars) do insert(tc, "Perkataan dieja dengan " .. ch .. " bahasa " .. lang_name) end if ( pos ~= "proverbs" and pos ~= "phrases" and umatch(ugsub(pagename, "[" .. range.katakana .. "]+", ""), "[" .. range.hiragana .. "]") and umatch(ugsub(pagename, "[" .. range.hiragana .. 
"]+", ""), "[" .. range.katakana .. "]") ) then insert(tc, "Perkataan dieja dengan campuran kana bahasa " .. lang_name) end end pos_functions["Kata kerja"] = function(args, data) add_transitivity(data, args["tr"]) add_inflections(data, args["infl"], 'verbs') end pos_functions["Akhiran"] = function(args, data) add_inflections(data, args["infl"]) end pos_functions["Kata kerja bantu"] = function(args, data) insert(data.headword.categories, "Kata kerja bantu bahasa " .. data.lang_name) add_inflections(data, args["infl"]) data.headword.pos_category = "Kata kerja" end pos_functions["Kata kerja suru"] = function(args, data) add_transitivity(data, args["tr"]) add_inflections(data, 'suru', 'verbs') data.headword.pos_category = "Kata kerja" end pos_functions["kata sifat"] = function(args, data) add_inflections(data, args["infl"], 'adjectives') end pos_functions["kata nama"] = function(args, data) -- the counter (classifier) parameter, only relevant for nouns local counter = args["count"] or "" if counter == "-" then insert(data.headword.inflections, {label = "uncountable"}) elseif counter ~= "" then insert(data.headword.inflections, {label = "counter", counter}) end end pos_functions["Adverba"] = function(args, data) local opt = args["opt"] if opt then opt = adverbs_optional_tag .. opt end add_inflections(data, opt, 'adverbs') end --[==[ Generate categories by pagename, also optionally by POS Also for use in soft redirect pages ([[Module:ja-see]]). Sortkey is not provided. data = { pagename = ..., -- (required) lang = ..., -- (required) language object categories = {}, -- (required) receive categories katakana_category = {}, -- (required) receive katakana-sorted categories pos = ..., "noun", "verb", etc. no POS categories if not given } ]==] function export.cat(data) data.lang_name = data.lang:getCanonicalName() data.pagename_kana = detect_pagename_kana(data) if data.pos then local pos = data.pos:gsub('x$', 'xe') .. '' insert(data.categories, pos .. ' bahasa ' .. 
			data.lang_name)
		insert(data.categories, require'Module:headword'.pos_lemma_or_nonlemma(pos, true) .. ' bahasa ' .. data.lang_name)
	end
	data.headword = {
		categories = data.categories
	}
	add_categories(data)
end

--[==[
The main entry point.
This is the only function that can be invoked from a template.
]==]
function export.show(frame)
	local poscat = frame.args[2] or frame.args[1] or error("Part of speech has not been specified. Please pass parameter 1 to the module invocation.")
	local alias_of_hist = {alias_of = 'hist', list = false}
	local alias_of_infl = {alias_of = "infl"}
	local list = {list = true}
	local list_allow_holes_separate_no_index = {list = true, allow_holes = true, separate_no_index = true}

	local params = {
		[1] = list,
		['rom'] = list_allow_holes_separate_no_index,
		['head'] = list_allow_holes_separate_no_index,
		['label'] = {list = true, allow_holes = true},
		['hist'] = list,
		['hhira'] = alias_of_hist,
		['hkata'] = alias_of_hist,
		['tr'] = true,
		['infl'] = true,
		['type'] = alias_of_infl,
		['decl'] = alias_of_infl,
		['opt'] = true,
		['count'] = true,
		['sort'] = true,
		['pagename'] = true,
	}

	-- For backwards compatibility with uses of {{ja-syllable}} with the script parameter.
	if poscat == "syllables" then
		params["sc"] = true
	end

	local args = require('Module:parameters').process(frame:getParent().args, params)

	local data = {
		headword = {
			pos_category = poscat,
			categories = {},
			heads = {},
			no_redundant_head_cat = true,
			inflections = {},
			genders = {'m'}, -- placeholder
			nogendercat = true
		},
		--custom info
		pagename = args.pagename or mw.loadData("Module:headword/data").pagename,
		pagename_kana = nil, -- "hira" "kata" "both", nil
		lang_code = frame.args[1],
		lang_name = nil, -- "Japanese", "Okinawan" ...
		katakana_category = {},
		info_mid = {}, -- "godan", "intransitive" ...
		info_hist = {}, -- historical kana
		inflection_base = {}, -- base of inflections
		kanas = {}, -- kana id
	}

	data.headword.lang = require("Module:languages").getByCode(data.lang_code)
	data.lang_name = data.headword.lang:getCanonicalName()

	-- sort out all the kanas and do the romanization business
	format_headword(args, data)

	-- add certain inflections and categories for adjectives, verbs, nouns, or adverbs
	if pos_functions[poscat] then
		pos_functions[poscat](args, data)
	end

	-- categories
	add_categories(data)

	local sort_base = args.sort or data.kanas[1] or data.pagename
	data.headword.sort_key = data.headword.lang:makeSortKey(sort_base)

	local katakana_category = #data.katakana_category > 0 and require("Module:utilities").format_categories(
		data.katakana_category,
		data.headword.lang,
		nil,
		sort_base,
		nil,
		require("Module:scripts").getByCode("Kana")
	) or ""

	-- output
	local i_kanas = 0
	return katakana_category .. require('Module:headword').full_headword(data.headword):gsub('<span class="gender">.-</span>', function()
		return (#data.info_hist > 0 and '<sup>←' .. concat(data.info_hist, ' or ') .. '<sup>[[w:Historical kana orthography|?]]</sup></sup>' or '') ..
			('<i>' .. concat(data.info_mid, '&nbsp;') ..
			'</i>')
	end):gsub('<strong .->.-</strong>', function(m0)
		i_kanas = i_kanas + 1
		if data.kanas[i_kanas] then
			return m0
		end
	end):gsub('<span class="headword%-tr tr" dir="ltr"><span class="Latn" lang="ja">', '<span lang="ja-Latn" class="headword-tr tr Latn" dir="ltr">'):gsub('</span></span>', '</span>')
end

return export
egwsrby262fhe6hpnwknt7huiegcpys
281361 281360 2026-04-22T06:29:02Z PeaceSeekers 3334 Membatalkan semakan [[Special:Diff/281360|281360]] oleh [[Special:Contributions/PeaceSeekers|PeaceSeekers]] ([[User talk:PeaceSeekers|bincang]]) 281361 Scribunto text/plain
local m_ja = require("Module:ja")
local m_ja_ruby = require("Module:ja-ruby")
local m_str_utils = require("Module:string utilities")

local byteoffset = mw.ustring.byteoffset
local concat = table.concat
local gsplit = m_str_utils.gsplit
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local max_index = require("Module:table").maxIndex
local moraify = m_ja.moraify
local remove = table.remove
local ugmatch = mw.ustring.gmatch
local ugsub = m_str_utils.gsub
local ulen = m_str_utils.len
local ulower = m_str_utils.lower
local umatch = mw.ustring.match
local usub = m_str_utils.sub

local export = {}
local pos_functions = {}

local range = mw.loadData('Module:ja/data/range')
local Jpan = require("Module:scripts").getByCode("Jpan")

local function remove_links(text)
	return (text:gsub("%[%[[^|%]]-|", "")
		:gsub("%[%[", "")
		:gsub("%]%]", ""))
end

local function assign_kana_to_kanji(head, kana, pagename, template_name)
	-- TODO: uses deprecated module
	local m_tu = require'Module:template utilities'
	local kanji_pos = {[0] = { nil, 0}}
	local head_nolink = {}
	local link_border = 0
	local function insert_kanji_pos(substr)
		insert(head_nolink, substr)
		for p1, w1 in ugmatch(substr, '()([々' .. range.kanji ..
'])') do p1 = byteoffset(substr, p1) + link_border insert(kanji_pos, { p1, p1 + w1:len() - 1 }) end end for p1, p2, w1 in m_tu.gfind_bracket(head, {['%[%['] = ']]'}) do insert_kanji_pos(head:sub(link_border + 1, p1 - 1)) local p_pipe = w1:find'|' or 2 link_border = p1 + p_pipe - 1 insert_kanji_pos(w1:sub(p_pipe + 1, -3)) link_border = p2 end insert_kanji_pos(head:sub(link_border + 1)) head_nolink = concat(head_nolink) local pagetext = mw.title.new(pagename):getContent() if not pagetext then return head, kana end local non_kanji = {} local last_kanji = 1 for p1 in ugmatch(head_nolink, '[々' .. range.kanji .. ']()') do insert(non_kanji, usub(head_nolink, last_kanji, p1 - 2)) last_kanji = p1 end insert(non_kanji, usub(head_nolink, last_kanji)) for kanjitab in pagetext:gmatch('(){{%s*' .. template_name) do kanjitab = select(3, m_tu.find_bracket(pagetext, m_tu.brackets_temp, kanjitab)) if not kanjitab then error('ill-formed [[t:' .. template_name:gsub('%%', '') .. ']] syntax') end kanjitab = m_tu.parse_temp(kanjitab) local readings = {} local readings_len = {} for i = 1, max_index(kanjitab.args) do local r_i = kanjitab.args[i] or '' local r_o = kanjitab.args['o' .. i] or '' if kanjitab.args['k' .. i] then readings[i] = kanjitab.args['k' .. i] .. r_o readings_len[i] = tonumber(r_i:match'^%s*%D*(%d*)%s*$') or 1 else local r_kana, r_len = r_i:match'^%s*(%D*)(%d*)%s*$' readings[i] = r_kana .. r_o readings_len[i] = tonumber(r_len) or 1 end end local kana_decom = {} local reading_id = 1 local reading_len = 1 for i = 1, #non_kanji - 1 do if reading_len <= 1 then reading_len = readings_len[reading_id] or 1 insert(kana_decom, non_kanji[i]) insert(kana_decom, readings[reading_id]) reading_id = reading_id + 1 else reading_len = reading_len - 1 end end insert(kana_decom, non_kanji[#non_kanji]) local function strip_nonkana(str, repl) return ugsub(str, '[^' .. range.kana .. ']+', repl) or nil end local xeno_reading = {strip_nonkana(kana, ''):match('^' .. 
			strip_nonkana(concat(kana_decom), '(.-)') .. '$')}
		if #xeno_reading > 0 then
			local head_decom = {}
			reading_id = 1
			reading_len = 1
			for i = 1, #non_kanji - 1 do
				if reading_len <= 1 then
					reading_len = readings_len[reading_id] or 1
					insert(head_decom, head:sub(kanji_pos[i - 1][2] + 1, kanji_pos[i][1] - 1))
					insert(head_decom, head:sub(kanji_pos[i][1], kanji_pos[i + reading_len - 1][2]))
					reading_id = reading_id + 1
				else
					reading_len = reading_len - 1
				end
			end
			insert(head_decom, head:sub(kanji_pos[#non_kanji - 1][2] + 1))
			if #head_decom ~= #kana_decom then
				error('number of parameters in [[t:' .. template_name:gsub('%%', '') .. ']] is incorrect')
			end
			local n_xeno_reading = 0
			for i = 1, #kana_decom, 2 do
				kana_decom[i] = ugsub(kana_decom[i], '[^' .. range.kana .. ']+', function()
					n_xeno_reading = n_xeno_reading + 1
					if xeno_reading[n_xeno_reading] == '' then
						return nil
					else
						return xeno_reading[n_xeno_reading]
					end
				end)
			end
			return concat(head_decom, '%'), concat(kana_decom, '%')
		end
	end
	return head, kana
end

local en_grades = {
	"gred pertama", "gred kedua", "gred ketiga",
	"gred keempat", "gred kelima", "gred keenam",
	"sekolah menengah", "jinmeiyō", "hyōgai"
}

local aliases = {
	['transitive']='tr', ['trans']='tr',
	['intransitive']='in', ['intrans']='in', ['intr']='in',
	['godan']='1', ['ichidan']='2', ['irregular']='irr'
}

local adverbs_optional_tag = 'optionally '
local adverbs_optional_aliases = {
	['to']='と', ['と']='と', ['ト']='と',
	['ni']='に', ['に']='に', ['ニ']='に',
}
local adverbs_optional_links = {
	['と']='[[と#Japanese:_adverbs|と]]',
	['に']='[[に]]',
}

local function formatting_adjustments(rom, kana, pos_category)
	-- hyphens for prefixes, suffixes, and counters (classifiers)
	if pos_category == "Awalan" then
		rom = rom:gsub('%-?$', '-')
	elseif pos_category == "Akhiran" or pos_category == "Bentuk akhiran" or pos_category == "counters" or pos_category == "classifiers" then
		rom = rom:gsub('^%-?', '-')
	elseif pos_category == "Kata nama khas" and not kana:match'%^' then -- automatic caps for proper nouns, if not already specified
		rom = ugsub(ugsub(rom, '%f[^%s%c%p]%l', string.uupper), "%w'%u", ulower) -- no caps after medial apostrophes
	end
	return rom
end

local function kana_to_romaji_with_pos_format(kana, data, args)
	if data.headword.pos_category == "Bentuk gabungan" or data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" then
		return "-"
	end
	local rom = remove_links(kana_to_romaji(kana, data.lang_code))
	-- make adjustments for -u verbs and -i adjectives
	if args['infl'] == '1' or args['infl'] == '1s' or args['infl'] == 'godan' then
		rom = rom:gsub('ō$', 'ou'):gsub('ū$', 'uu')
	elseif args['infl'] == 'i' or args['infl'] == 'is' or args['infl'] == 'い' then
		rom = rom:gsub('ī$', 'ii')
	end
	return formatting_adjustments(rom, kana, data.headword.pos_category)
end

local function iterate_rare_chars(text)
	local ch, i
	return function()
		repeat
			ch, i = umatch(text, "([" .. range.kana .. range.kana_graph .. "!-/:-@%[\\-`×△○◎。-〠〶〷〻-〽・·゠=~][゙゚]*)()", i)
		until not (ch and umatch(ch, "^[ぁ-ちっつて-ろんァ-チッツテ-ロンヲ-゚]$"))
		return ch
	end
end

local function historical_kana(data, hist_kana, modern_kana)
	-- Disallow historical kana for kana and morae, as there's no one-to-one correspondence.
	local pos = data.headword.pos_category
	if pos == "syllables" or pos == "kana" or pos == "morae" then
		error(("Cannot specify historical kana for %s."):format(pos))
	end
	local hist_kana_no_formatting = hist_kana:gsub("[%^%-%. %%]+", "")
	local rare_chars, lang_name, hc = {}, data.lang_name, data.headword.categories
	for ch in iterate_rare_chars(hist_kana_no_formatting) do
		if not (modern_kana and modern_kana:find(ch)) then
			rare_chars[ch] = true
		end
	end
	for _, mora in ipairs(moraify((ugsub(hist_kana_no_formatting, "[^" .. range.kana ..
"]+", " ")))) do if not (mora:gsub(" +", ""):match("^.?[\128-\191]*$") or (modern_kana and modern_kana:find(mora))) then rare_chars[mora] = true end end for ch in pairs(rare_chars) do insert(hc, "Perkataan mengikut sejarah dieja dengan " .. ch .. " bahasa " .. lang_name) end insert(data.info_hist, require("Module:ja-link").link({ lang = data.headword.lang, lemma = hist_kana, tr = formatting_adjustments( remove_links(kana_to_romaji(hist_kana, data.lang_code, nil, {hist = true})), hist_kana, pos ) }, { face = "head", disableSelfLink = true, })) end local function detect_pagename_kana(data, digraphs) local pagename = data.pagename -- Exclude "&" and "@", which are part of %p (e.g. リズム&ブルース). local function remove_kana(m) return m:match("[&@]") or "" end if ugsub(pagename, '[%p%s%c' .. range.hiragana .. (digraphs and "ゟ" or "") .. ']', remove_kana) == "" then return 'hira' elseif ugsub(pagename, '[%p%s%c' .. range.katakana .. (digraphs and "ヿ" or "") .. ']', remove_kana) == "" then return 'kata' elseif ugsub(pagename, '[%p%s%c' .. range.kana .. (digraphs and "ゟヿ" or "") .. ']', remove_kana) == "" then return 'both' end end -- go through args and build inflections by finding whatever kanas were given to us local function format_headword(args, data) local pagename, kanas, lang_name = data.pagename, data.kanas, data.lang_name data.pagename_kana = detect_pagename_kana(data) if args[1][1] and not args[1][1]:match'[\128-\255]' then -- filter out POS designations remove(args[1], 1) end local linked_translit = data.headword.lang:link_tr(Jpan) local suru_ending, rom_suru_ending if data.headword.pos_category == "kata kerja suru" then suru_ending = "[[する]]" rom_suru_ending = linked_translit and " [[suru]]" or " suru" else suru_ending, rom_suru_ending = "", "" end if data.pagename_kana then -- pure-kana-title entry if #args.head > 0 or args.head.default then insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. 
				lang_name)
		end

		-- {{ja-xxx}} vs {{ja-xxx|こ.うし}} vs {{ja-xxx|コウシ}} in [[こうし]]
		if not args[1][1] then
			args[1][1] = pagename
		elseif remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) ~= pagename then
			insert(args[1], 1, pagename)
		end

		for i, k in ipairs(args[1]) do
			insert(data.headword.heads, {
				term = k:gsub("[%^%-%. %%]+", "") .. suru_ending,
				tr = '-',
				l = args.label[i] and {args.label[i]} or nil,
			})
		end

		for i = 1, math.max(args.rom.maxindex, 1) do
			local rom = args.rom[i] or args.rom.default or kana_to_romaji_with_pos_format(args[1][1], data, args)
			if not data.headword.heads[i] then
				data.headword.heads[i] = {term = data.headword.heads[i-1].term}
			end
			if rom == "-" then
				data.headword.heads[i].tr = "-"
			elseif linked_translit then
				data.headword.heads[i].tr = "[[" .. rom .. "]]" .. rom_suru_ending
			else
				data.headword.heads[i].tr = rom .. rom_suru_ending
			end
			if not data.inflection_base.form then
				data.inflection_base.form = remove_links(args[i][1]:gsub("[%^%-%. %%]+", "")) .. suru_ending
				data.inflection_base.romaji = rom .. rom_suru_ending
			end
		end

		kanas[1] = pagename
		if args.hist[1] then
			historical_kana(data, args.hist[1], args[1][1])
		end
	else -- non-pure-kana-title entry
		if #args[1] == 0 and not (data.headword.pos_category == "Tanda baca" or data.headword.pos_category == "Tanda lelaran" or data.headword.pos_category == "Simbol") then
			error("Kana form is required.")
		end
		if args.head.default == pagename then
			insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " .. lang_name)
		end
		local rom_repetition_final = {}
		for i, k in ipairs(args[1]) do
			local rom_auto = kana_to_romaji_with_pos_format(k, data, args)
			local head = args.head[i] or args.head.default or pagename
			if args.head[i] == pagename then
				insert(data.headword.categories, "Perkataan dengan parameter pengepala lewah bahasa " ..
lang_name) end local head_for_ruby, kana_for_ruby if ulen(head) > 1 and head:match'%%' == nil and k:match'%%' == nil then head_for_ruby, kana_for_ruby = assign_kana_to_kanji(head, k, pagename, data.lang_code .. '%-kanjitab') else head_for_ruby, kana_for_ruby = head, k end local format_table = m_ja_ruby.parse_text(head_for_ruby, kana_for_ruby, { try = 'force', try_force_limit = 10000 }) local kana_bare = remove_links(k:gsub("[%^%-%. %%]+", "")) local rom = args.rom[i] or args.rom.default or rom_auto head = { term = m_ja_ruby.to_wiki(format_table, { break_link = true, }):gsub('<rt>(..-)</rt>', "<rt>[[" .. kana_bare .."|%1]]</rt>") .. suru_ending, l = args.label[i] and {args.label[i]} or nil, } if rom == "-" or rom_repetition_final[rom] then head.tr = "-" elseif linked_translit then head.tr = "[[" .. rom .. "]]" .. rom_suru_ending else head.tr = rom .. rom_suru_ending end insert(data.headword.heads, head) rom_repetition_final[rom] = true insert(kanas, kana_bare) if args.hist[i] then historical_kana(data, args.hist[i], k) end if not data.inflection_base.form then data.inflection_base.form = remove_links(m_ja_ruby.to_markup(format_table)) .. suru_ending data.inflection_base.romaji = rom .. rom_suru_ending end end local first_reading, multiple = kanas[1] if not first_reading then return end first_reading = ulower(kana_to_romaji(first_reading, data.lang_code)):gsub("%%", "") for i = 2, #kanas do if ulower(kana_to_romaji(kanas[i], data.lang_code)):gsub("%%", "") ~= first_reading then multiple = true break end end if not multiple then local lang_code = data.lang_code local content = mw.title.getCurrentTitle():getContent() local loc1, loc2 = content:find("%f[^%z%s]==%s*" .. lang_name:gsub("%-", "%%%-") .. "%s*==()") loc2 = content:find("%f[^%z%s]==[^\n=]+==", loc2) if loc1 then content = content:sub(loc1, loc2) for template in require("Module:template parser").find_templates(content) do local name, reading = template:get_name() if ( name == lang_code .. 
"-head" or name == lang_code .. "-pos" ) then reading = template:get_arguments()[2] if reading ~= nil then reading = remove_links(reading):gsub("%%", "") end elseif ( name == lang_code .. "-noun" or name == lang_code .. "-verb" or name == lang_code .. "-adj" or name == lang_code .. "-phrase" or name == lang_code .. "-verb form" or name == lang_code .. "-verb-suru" ) then reading = template:get_arguments()[1] if reading ~= nil then reading = remove_links(reading):gsub("%%", "") end elseif name == lang_code .. "-see" then reading = template:get_arguments()[1] if reading ~= nil then reading = remove_links(reading):gsub("%%", "") end -- if umatch(reading, "[^" .. range.kana .. "]") then -- TODO: check linked page -- end end if reading and ulower(kana_to_romaji(reading, lang_code)):gsub("%%", "") ~= first_reading then multiple = true end end end end if multiple then insert(data.headword.categories, "Perkataan dengan pelbagai bacaan bahasa " .. lang_name) end end end local function add_transitivity(data, tr) local categories, lang_name = data.headword.categories, data.lang_name tr = aliases[tr] or tr if tr == "tr" then insert(data.info_mid, 'transitive') insert(categories, "Kata kerja transitif bahasa " .. lang_name) elseif tr == "in" then insert(data.info_mid, 'intransitive') insert(categories, "Kata kerja tak transitif bahasa " .. lang_name) elseif tr == "both" then insert(data.info_mid, 'transitive or intransitive') insert(categories, "Kata kerja transitif bahasa " .. lang_name) insert(categories, "Kata kerja tak transitif bahasa " .. lang_name) else insert(categories, "Kata kerja tanpa ketransitifan bahasa " .. 
lang_name) end end local function get_final(lemma, data) return kana_to_romaji(remove(moraify(m_ja_ruby.to_ruby(m_ja_ruby.parse_markup(lemma)))), data.lang_code) end local function add_language_fragment(t, lang_name) for k, v in ipairs(t) do t[k] = v:gsub("%[%[([^]#]*)%]%]", "[[%1#%%s|%1]]"):format(lang_name) end end local function add_inflections(data, inflection_type, cat_suffix) local lang_name = data.lang_name local lemma = data.inflection_base.form local romaji = data.inflection_base.romaji inflection_type = aliases[inflection_type] or inflection_type local function replace_suffix(lemma_from, lemma_to, romaji_from, romaji_to) -- e.g. 持って来る, lemma = "[持](も)って来(く)る" -- lemma_from = "くる", lemma_to = {"き","きた"} add_language_fragment(lemma_to, lang_name) add_language_fragment(romaji_to, lang_name) local result = {} local pattern_from, n_from = lemma_from:gsub('.[\128-\191]*', function(c) return '[' .. c .. m_ja.hira_to_kata(c) .. ']([^' .. range.kana .. ']*)' end) pattern_from = pattern_from .. '$' -- "[くク]([^kana range]*)[るル]([^kana range]*)$" for i_lemma_to, s_lemma_to in ipairs(lemma_to) do local n_to = 0 local pattern_to = s_lemma_to:gsub('.[\128-\191]*', function(c) if n_to < n_from then n_to = n_to + 1 return c .. "%" .. n_to else return c end end) for i = n_to + 1, n_from do pattern_to = pattern_to .. "%" .. i end -- "き%1%2", "き%1た%2" local lemma_inflected, success = ugsub(lemma, pattern_from, pattern_to) if success == 0 then return end local romaji_inflected romaji_inflected, success = romaji:gsub(romaji_from .. "$", romaji_to[i_lemma_to]) if success == 0 then romaji_inflected, success = romaji:gsub("%[%[" .. romaji_from .. "%]%]$", "[[" .. romaji_to[i_lemma_to] .. "]]") if success == 0 then return end end insert(result, {lemma = lemma_inflected, romaji = romaji_inflected}) end return result -- {{lemma="[持](も)って来(き)",romaji="motteki"},{lemma="[持](も)って来(き)た",romaji="mottekita"}} end local function insert_form(label, ...) -- label = "stem" or "past" etc. 
-- ... = {lemma=...,romaji=...},{lemma=...,romaji=...} local labeled_forms = {label = label} for _, v in ipairs{...} do local table_form = m_ja_ruby.parse_markup(v.lemma) local form_term = m_ja_ruby.to_wiki(table_form) if not form_term:find'%[%[.+%]%]' then form_term = '[[' .. m_ja_ruby.to_text(table_form) .. '#' .. lang_name .. '|' .. form_term .. ']]' end insert(labeled_forms, { term = form_term, tr = v.romaji, }) end insert(data.headword.inflections, labeled_forms) end local inflected_forms if data.lang_code == 'ja' then if inflection_type == '1' or inflection_type == '1s' then insert(data.info_mid, '<abbr title="godan (group 1) conjugation">godan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " godan bahasa " .. lang_name) local romaji = data.inflection_base.romaji if cat_suffix == "Kata kerja" then local final = get_final(lemma, data) insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -" .. final .. " bahasa " .. lang_name) if final == "ru" then if umatch(romaji, "[iIīĪ]ru$") then insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -iru bahasa " .. lang_name) elseif umatch(romaji, "[eEēĒ]ru$") then insert(data.headword.categories, cat_suffix .. " godan berakhir dengan -eru bahasa " ..
lang_name) end end end end if inflection_type == '1' then inflected_forms = replace_suffix('く', {'き', 'いた'}, 'ku', {'ki', 'ita'}) or replace_suffix('ぐ', {'ぎ', 'いだ'}, 'gu', {'gi', 'ida'}) or replace_suffix('す', {'し', 'した'}, 'su', {'shi', 'shita'}) or replace_suffix('つ', {'ち', 'った'}, 'tsu', {'chi', 'tta'}) or replace_suffix('ぬ', {'に', 'んだ'}, 'nu', {'ni', 'nda'}) or replace_suffix('ぶ', {'び', 'んだ'}, 'bu', {'bi', 'nda'}) or replace_suffix('む', {'み', 'んだ'}, 'mu', {'mi', 'nda'}) or replace_suffix('る', {'り', 'った'}, 'ru', {'ri', 'tta'}) or replace_suffix('う', {'い', 'った'}, 'u', {'i', 'tta'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end else inflected_forms = replace_suffix('る', {'り', 'った', 'い'}, 'ru', {'ri', 'tta', 'i'}) or --くださる replace_suffix('いく', {'いき', 'いった'}, 'iku', {'iki', 'itta'}) or --行く replace_suffix('う', {'い', 'うた'}, 'ou', {'oi', 'ōta'}) --問う if inflected_forms then insert_form('dasar', inflected_forms[1], inflected_forms[3]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end end elseif inflection_type == '2' then insert(data.info_mid, '<abbr title="ichidan (group 2) conjugation">ichidan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " ichidan bahasa " .. lang_name) local romaji = data.inflection_base.romaji if umatch(romaji, "[iIīĪ]ru$") then insert(data.headword.categories, cat_suffix .. " kami ichidan bahasa " .. lang_name) elseif umatch(romaji, "[eEēĒ]ru$") then insert(data.headword.categories, cat_suffix .. " shimo ichidan bahasa " .. lang_name) else insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. 
lang_name) end end inflected_forms = replace_suffix('る', {'', 'た'}, 'ru', {'', 'ta'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'suru' then insert(data.info_mid, '<abbr title="suru (group 3) conjugation">suru</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " suru bahasa " .. lang_name) end inflected_forms = replace_suffix('する', {'し', 'した'}, 'suru', {'shi', 'shita'}) or replace_suffix('ずる', {'じ', 'じた'}, 'zuru', {'ji', 'jita'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'kuru' then insert(data.info_mid, '<abbr title="kuru (group 3) conjugation">kuru</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " kuru bahasa " .. lang_name) end inflected_forms = replace_suffix('くる', {'き', 'きた'}, 'kuru', {'ki', 'kita'}) if inflected_forms then insert_form('dasar', inflected_forms[1]) insert_form('lampau', inflected_forms[2]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'i' or inflection_type == 'い' then insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. lang_name) end inflected_forms = replace_suffix('い', {'く'}, 'i', {'ku'}) if inflected_forms then insert_form('adverbial', inflected_forms[1]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'is' then insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " い-i bahasa " .. 
lang_name) end inflected_forms = replace_suffix('いい', {'よく'}, 'ii', {'yoku'}) if inflected_forms then insert_form('adverbial', inflected_forms[1]) else require'Module:debug'.track'Jpan-headword/inflection failed/ja' end elseif inflection_type == 'na' or inflection_type == 'な' then insert(data.info_mid, '<abbr title="-na (type II) inflection">-na</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " な-na bahasa " .. lang_name) end inflected_forms = replace_suffix('', {'[[な]]', '[[に]]'}, '', {' [[na]]', ' [[ni]]'}) insert_form('adnominal', inflected_forms[1]) insert_form('adverbial', inflected_forms[2]) elseif inflection_type == "yo" then insert(data.info_mid, '<abbr title="yodan conjugation (classical)"><sup><small>†</small></sup>yodan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " yodan bahasa " .. lang_name) insert(data.headword.categories, cat_suffix .. " yodan berakhir dengan -" .. get_final(lemma, data) .. " bahasa " .. lang_name) end elseif inflection_type == "kami ni" then insert(data.info_mid, '<abbr title="kami nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name) insert(data.headword.categories, cat_suffix .. " kami nidan bahasa " .. lang_name) end elseif inflection_type == "shimo ni" then insert(data.info_mid, '<abbr title="shimo nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " nidan bahasa " .. lang_name) insert(data.headword.categories, cat_suffix .. " shimo nidan bahasa " ..
lang_name) end elseif inflection_type == "rahen" then insert(data.info_mid, '<abbr title="r-special conjugation (classical)"><sup><small>†</small></sup>-ri</abbr>') elseif inflection_type == "sahen" then insert(data.info_mid, '<abbr title="s-special conjugation (classical)"><sup><small>†</small></sup>-se</abbr>') elseif inflection_type == "kahen" then insert(data.info_mid, '<abbr title="k-special conjugation (classical)"><sup><small>†</small></sup>-ko</abbr>') elseif inflection_type == "nahen" then insert(data.info_mid, '<abbr title="n-special conjugation (classical)"><sup><small>†</small></sup>-n</abbr>') elseif inflection_type == "nari" or inflection_type == "なり" then insert(data.info_mid, '<abbr title="-nari inflection (classical)"><sup><small>†</small></sup>-nari</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " なり-nari bahasa " .. lang_name) end elseif inflection_type == 'tari' or inflection_type == 'たり' then insert(data.info_mid, '<abbr title="-tari inflection (classical)"><sup><small>†</small></sup>-tari</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " たり-tari bahasa " .. lang_name) end inflected_forms = replace_suffix('', {'[[とした]]', '[[たる]]', '[[と]]', '[[として]]'}, '', {' [[to shita]]', ' [[taru]]', ' [[to]]', ' [[to shite]]'}) insert_form('adnominal', inflected_forms[1], inflected_forms[2]) insert_form('adverbial', inflected_forms[3], inflected_forms[4]) elseif inflection_type == "ku" or inflection_type == "く" then insert(data.info_mid, '<abbr title="-ku inflection (classical)"><sup><small>†</small></sup>-ku</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " く-ku bahasa " .. lang_name) end elseif inflection_type == "shiku" or inflection_type == "しく" then insert(data.info_mid, '<abbr title="-shiku inflection (classical)"><sup><small>†</small></sup>-shiku</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " しく-shiku bahasa " .. 
lang_name) end elseif inflection_type == "ka" or inflection_type == "か" then insert(data.info_mid, '<abbr title="-ka inflection (dialectal)"><sup><small>†</small></sup>-ka</abbr>') if cat_suffix then insert(data.headword.categories, cat_suffix .. " か-ka bahasa " .. lang_name) end elseif inflection_type and inflection_type:len() > adverbs_optional_tag:len() and inflection_type:sub(1, adverbs_optional_tag:len()) == adverbs_optional_tag then local adverbs_optional_list = inflection_type:sub(adverbs_optional_tag:len() + 1) for option in gsplit(adverbs_optional_list, ':') do local normalized_option = adverbs_optional_aliases[option] if not normalized_option then error('unrecognized adverb opt= argument: "' .. option .. '"') end local normalized_option_romaji = kana_to_romaji(normalized_option, data.lang_code) local normalized_option_link = adverbs_optional_links[normalized_option] inflected_forms = replace_suffix('', {normalized_option_link}, '', {' [[' .. normalized_option_romaji .. ']]'}) insert_form('optionally as', inflected_forms[1]) if cat_suffix then insert(data.headword.categories, cat_suffix .. " secara pilihan mengambil " .. normalized_option .. "-" .. normalized_option_romaji .. " bahasa " .. lang_name) end end elseif inflection_type == 'irr' then insert(data.info_mid, 'irregular') if cat_suffix then insert(data.headword.categories, cat_suffix .. " irregular bahasa " .. lang_name) end elseif inflection_type == '-' or inflection_type == 'un' then insert(data.info_mid, 'uninflectable') end --elseif data.lang_code == 'ryu' then ... end end local function add_categories(data) local lang_name = data.lang_name local pagename = data.pagename local tc = data.headword.categories -- adds category [langname] terms spelled with jōyō kanji or [langname] terms spelled with non-jōyō kanji -- (if it contains any kanji) local number_of_kanji = 0 for c in ugmatch(pagename, "[" .. range.kanji ..
"々〻]") do number_of_kanji = number_of_kanji + 1 if c ~= "々" and c ~= "〻" then -- Not a kanji for the purposes of categorisation. insert(tc, ("Perkataan dieja dengan kanji %s bahasa " .. lang_name):format(en_grades[m_ja.kanji_grade(c)])) end end -- categorize by number of kanji if number_of_kanji ~= 0 then insert(tc, ("Perkataan dengan %s aksara kanji bahasa " .. lang_name):format(number_of_kanji)) -- single-kanji terms if ulen(pagename) == 1 then insert(tc, "Perkataan dieja dengan " .. pagename .. " bahasa " .. lang_name) insert(tc, "Perkataan kanji tunggal bahasa " .. lang_name) end end -- categorize by the script of the pagename or specific characters contained in it -- if pagename is hiragana or katakana if detect_pagename_kana(data, true) == 'hira' then insert(tc, "Hiragana bahasa " .. lang_name) end if detect_pagename_kana(data, true) == 'kata' then insert(data.katakana_category, "Katakana bahasa " .. lang_name) end local p, n = ugsub(pagename, '[' .. range.kana .. range.kanji .. range.ideograph .. range.kana_graph .. range.punctuation .. ']+', '') if p ~= '' and n > 0 then insert(tc, "Perkataan ditulis dalam pelbagai tulisan bahasa " .. lang_name) end local pos = data.headword.pos_category local rare_chars = {} for ch in iterate_rare_chars(pagename) do rare_chars[ch] = true end -- Categorise yōon, but exclude kana and mora entries, since they can't be spelled with themselves. -- FIXME: allow kana categories for morae. if not (pos == "syllables" or pos == "kana" or pos == "morae") then for _, mora in ipairs(moraify((ugsub(pagename, "[^" .. range.kana .. "]+", " ")))) do if not mora:gsub(" +", ""):match("^.?[\128-\191]*$") then rare_chars[mora] = true end end end for ch in pairs(rare_chars) do insert(tc, "Perkataan dieja dengan " .. ch .. " bahasa " .. lang_name) end if ( pos ~= "proverbs" and pos ~= "phrases" and umatch(ugsub(pagename, "[" .. range.katakana .. "]+", ""), "[" .. range.hiragana .. "]") and umatch(ugsub(pagename, "[" .. range.hiragana .. 
"]+", ""), "[" .. range.katakana .. "]") ) then insert(tc, "Perkataan dieja dengan campuran kana bahasa " .. lang_name) end end pos_functions["Kata kerja"] = function(args, data) add_transitivity(data, args["tr"]) add_inflections(data, args["infl"], 'verbs') end pos_functions["Akhiran"] = function(args, data) add_inflections(data, args["infl"]) end pos_functions["Kata kerja bantu"] = function(args, data) insert(data.headword.categories, "Kata kerja bantu bahasa " .. data.lang_name) add_inflections(data, args["infl"]) data.headword.pos_category = "Kata kerja" end pos_functions["Kata kerja suru"] = function(args, data) add_transitivity(data, args["tr"]) add_inflections(data, 'suru', 'verbs') data.headword.pos_category = "Kata kerja" end pos_functions["Kata sifat"] = function(args, data) add_inflections(data, args["infl"], 'adjectives') end pos_functions["Kata nama"] = function(args, data) -- the counter (classifier) parameter, only relevant for nouns local counter = args["count"] or "" if counter == "-" then insert(data.headword.inflections, {label = "uncountable"}) elseif counter ~= "" then insert(data.headword.inflections, {label = "counter", counter}) end end pos_functions["Adverba"] = function(args, data) local opt = args["opt"] if opt then opt = adverbs_optional_tag .. opt end add_inflections(data, opt, 'adverbs') end --[==[ Generate categories by pagename, also optionally by POS Also for use in soft redirect pages ([[Module:ja-see]]). Sortkey is not provided. data = { pagename = ..., -- (required) lang = ..., -- (required) language object categories = {}, -- (required) receive categories katakana_category = {}, -- (required) receive katakana-sorted categories pos = ..., "noun", "verb", etc. no POS categories if not given } ]==] function export.cat(data) data.lang_name = data.lang:getCanonicalName() data.pagename_kana = detect_pagename_kana(data) if data.pos then local pos = data.pos:gsub('x$', 'xe') .. '' insert(data.categories, pos .. ' bahasa ' .. 
data.lang_name) insert(data.categories, require'Module:headword'.pos_lemma_or_nonlemma(pos, true) .. ' bahasa ' .. data.lang_name) end data.headword = { categories = data.categories } add_categories(data) end --[==[ The main entry point. This is the only function that can be invoked from a template. ]==] function export.show(frame) local poscat = frame.args[2] or frame.args[1] or error("Part of speech has not been specified. Please pass parameter 1 to the module invocation.") local alias_of_hist = {alias_of = 'hist', list = false} local alias_of_infl = {alias_of = "infl"} local list = {list = true} local list_allow_holes_separate_no_index = {list = true, allow_holes = true, separate_no_index = true} local params = { [1] = list, ['rom'] = list_allow_holes_separate_no_index, ['head'] = list_allow_holes_separate_no_index, ['label'] = {list = true, allow_holes = true}, ['hist'] = list, ['hhira'] = alias_of_hist, ['hkata'] = alias_of_hist, ['tr'] = true, ['infl'] = true, ['type'] = alias_of_infl, ['decl'] = alias_of_infl, ['opt'] = true, ['count'] = true, ['sort'] = true, ['pagename'] = true, } -- For backwards compatibility with uses of {{ja-syllable}} with the script parameter. if poscat == "syllables" then params["sc"] = true end local args = require('Module:parameters').process(frame:getParent().args, params) local data = { headword = { pos_category = poscat, categories = {}, heads = {}, no_redundant_head_cat = true, inflections = {}, genders = {'m'}, -- placeholder nogendercat = true }, --custom info pagename = args.pagename or mw.loadData("Module:headword/data").pagename, pagename_kana = nil, -- "hira" "kata" "both", nil lang_code = frame.args[1], lang_name = nil, -- "Japanese", "Okinawan" ... katakana_category = {}, info_mid = {}, -- "godan", "intransitive" ... 
info_hist = {}, -- historical kana inflection_base = {}, -- base of inflections kanas = {}, -- kana id } data.headword.lang = require("Module:languages").getByCode(data.lang_code) data.lang_name = data.headword.lang:getCanonicalName() -- sort out all the kanas and do the romanization business format_headword(args, data) -- add certain inflections and categories for adjectives, verbs, nouns, or adverbs if pos_functions[poscat] then pos_functions[poscat](args, data) end -- categories add_categories(data) local sort_base = args.sort or data.kanas[1] or data.pagename data.headword.sort_key = data.headword.lang:makeSortKey(sort_base) local katakana_category = #data.katakana_category > 0 and require("Module:utilities").format_categories( data.katakana_category, data.headword.lang, nil, sort_base, nil, require("Module:scripts").getByCode("Kana") ) or "" -- output local i_kanas = 0 return katakana_category .. require('Module:headword').full_headword(data.headword):gsub('<span class="gender">.-</span>', function() return (#data.info_hist > 0 and '<sup>←' .. concat(data.info_hist, ' or ') .. '<sup>[[w:Historical kana orthography|?]]</sup></sup>' or '') .. ('<i>' .. concat(data.info_mid, '&nbsp;') .. '</i>') end):gsub('<strong .->.-</strong>', function(m0) i_kanas = i_kanas + 1 if data.kanas[i_kanas] then return m0 end end):gsub('<span class="headword%-tr tr" dir="ltr"><span class="Latn" lang="ja">', '<span lang="ja-Latn" class="headword-tr tr Latn" dir="ltr">'):gsub('</span></span>', '</span>') end return export laelznpt61s3a3oafnsfj3trmwx3ys1 Modul:script tag link 828 34900 281276 146325 2026-04-21T14:05:10Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/90159474|90159474]]) 281276 Scribunto text/plain local export = {} local codepoint_to_script = require("Module:scripts").charToScript -- FIXME: Temporary hack for script renames. 
local alias_mapping = { polytonic = "Polyt", Latinx = "Latn", Latnx = "Latn", } -- If there are characters in both scripts (the key and value), the value should be used. local overridden_by = { Grek = "Polyt", Cyrl = "Cyrs", } local function get_script(text) local sc, curr_sc for codepoint in mw.ustring.gcodepoint(text) do curr_sc = codepoint_to_script(codepoint) curr_sc = alias_mapping[curr_sc] or curr_sc if curr_sc ~= "None" then if sc == nil then sc = curr_sc elseif curr_sc ~= sc then -- For instance, Grek -> Polyt. if overridden_by[sc] == curr_sc then sc = curr_sc -- For instance, Grek and Latn. elseif overridden_by[curr_sc] ~= sc then require("Module:debug").track("also/no sc detected") mw.log("Two scripts found in " .. tostring(text) .. ": " .. tostring(sc) .. " and " .. tostring(curr_sc) .. ".") sc = nil break end end end end return sc end local function link(text, sc) return '<span class="' .. sc .. '">[[' .. text .. ']]</span>' end function export.tag_link(link_innards, text) local sc = get_script(text or link_innards) or "None" return link(link_innards, sc) end function export.tag_links(str) str = str:gsub('%[%[(.-)%]%]', function (innards) local sc -- The actual displayed text, whose script we need to detect, -- if different from link innards. local text if innards:find("|") then text = innards:match("|(.+)%]%]$") if not text then return end end return export.tag_link(innards, text) end) return str end function export.tag_links_frame(frame) local args = {} for k, v in pairs(frame:getParent().args) do if k == 1 then args[k] = v else error("The parameter " .. k .. " is not used by this template.") end end local text = args[1] if text then text = mw.text.trim(text) return export.tag_links(text) end end function export.link(frame) local args = {} for k, v in pairs(frame:getParent().args) do if k == 1 then args[k] = v else error("The parameter " .. k .. 
" is not used by this template.") end end local text = args[1] if text then return export.tag_link(text) end end return export b6041o91hdoysgx74sd5awl3zxoddru Templat:Jpan-pos/format 10 35002 281365 247886 2026-04-22T06:58:03Z PeaceSeekers 3334 281365 wikitext text/x-wiki <includeonly>{{#switch:{{{1|}}} |acronym=Akronim |adj|adjective=Kata sifat |adjective form=Bentuk kata sifat |adnominal=Adnominal |adv|adverb|adverba=Adverba |adverb form=Bentuk adverba |affix=Imbuhan |aux=auxiliary verbs |classifier=Penjodoh bilangan |combining form=combining forms |conjunction=Kata hubung |counter=counters |ideophonic root=ideophonic roots |idiom=Peribahasa |infix=Sisipan |interjection=Kata seru |iteration mark=iteration marks |kana=kana |mora=mora |noun=Kata nama |noun form=Bentuk kata nama |numeral=Kata bilangan |numeral symbol=numeral symbols |particle=Partikel |phrase=Frasa |postposition=postpositions |prefix=Awalan |pronoun=Kata ganti nama |pronoun form=Bentuk kata ganti nama |proper|proper noun=Kata nama khas |proverb=proverbs |punctuation mark=Tanda baca |suffix=Akhiran |suffix form=Bentuk akhiran |syllable=Suku kata |symbol=Simbol |verb=Kata kerja |verb suru=Kata kerja suru |verb form=Bentuk kata kerja |#default = {{error|Invalid part of speech.}}}}</includeonly><noinclude>{{documentation}}</noinclude> 7m2fl6wd09xntr0ch2lo18n5og3o88y 281366 281365 2026-04-22T06:59:59Z PeaceSeekers 3334 281366 wikitext text/x-wiki <includeonly>{{#switch:{{{1|}}} |acronym|akronim=Akronim |adj|adjective|kata sifat|kata adjektif=Kata sifat |adjective form|bentuk kata sifat=Bentuk kata sifat |adnominal=Adnominal |adv|adverb|adverba=Adverba |adverb form|bentuk adverba=Bentuk adverba |affix|imbuhan=Imbuhan |aux=auxiliary verbs |classifier|penjodoh bilangan=Penjodoh bilangan |combining form=combining forms |conjunction|kata hubung=Kata hubung |counter=counters |ideophonic root=ideophonic roots |idiom|peribahasa=Peribahasa |infix|sisipan=Sisipan |interjection|kata seru=Kata seru |iteration 
mark=iteration marks |kana=kana |mora=mora |noun|kata nama=Kata nama |noun form|bentuk kata nama=Bentuk kata nama |numeral|kata bilangan=Kata bilangan |numeral symbol=numeral symbols |particle|partikel=Partikel |phrase|frasa=Frasa |postposition=postpositions |prefix|awalan=Awalan |pronoun|kata ganti nama=Kata ganti nama |pronoun form|bentuk kata ganti nama=Bentuk kata ganti nama |proper|proper noun|kata nama khas=Kata nama khas |proverb=proverbs |punctuation mark=Tanda baca |suffix|akhiran=Akhiran |suffix form|bentuk akhiran=Bentuk akhiran |syllable|suku kata=Suku kata |symbol|simbol=Simbol |verb|kata kerja=Kata kerja |verb suru=Kata kerja suru |verb form|bentuk kata kerja=Bentuk kata kerja |#default = {{error|Invalid part of speech.}}}}</includeonly><noinclude>{{documentation}}</noinclude> kpwp9i93fva5eldt0kbwe07te81rz24 Zambia 0 37655 281386 150447 2026-04-22T07:26:32Z PeaceSeekers 3334 281386 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} <!-- Kalau ada --> === Takrifan === ==== Kata nama khas ==== {{ms-knk|j=زمبيا}} # {{place|ms|negara|r/selatan Afrika|official=Republik Zambia}}. 
=== Sebutan === * {{dewan|Zam|bia}} === Lihat juga === * {{senarai:negara di Afrika/ms}} === Pautan luar === * {{R:PRPM}} == Bahasa Indonesia == {{Wikipedia|lang=id}} <!-- Kalau ada --> === Takrifan === ==== Kata nama khas ==== {{id-knk}} # Zambia; negara di selatan Afrika === Sebutan === * {{penyempangan|id|Zam|bia}} === Lihat juga === * {{senarai:negara di Afrika/id}} === Pautan luar === * {{R:KBBI Daring}} == Bahasa Inggeris == {{Wikipedia|lang=en}} <!-- Kalau ada --> === Takrifan === ==== Kata nama khas ==== {{en-knk}} # Zambia; negara di selatan Afrika === Sebutan === * {{IPA|en|/ˈzæmbiə/}} * {{audio|en|LL-Q1860 (eng)-Vealhurl-Zambia.wav |Audio (England Selatan)}} * {{audio|en|Zambia.wav|Audio (Zambia)}} === Lihat juga === * {{senarai:negara di Afrika/en}} s1f36bg2l179bats43flaprnm0vh4ji hadiah 0 44008 281248 159303 2026-04-21T13:37:48Z Countryball mys123 9925 /* Bahasa Melayu */Tambah gambar 281248 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} <!-- Kalau ada --> [[File:Gift packing.jpg|thumb|Bungkusan hadiah]] === Takrifan === ==== Kata nama ==== {{ms-kn|j=هديه}} # Suatu [[pemberian]] yang diberi untuk menghargai seseorang atau sesuatu. === Etimologi === Daripada {{der|ms|ar|هَدِيَّة}}. === Sebutan === * {{dewan|ha|diah}} * {{IPA|ms|/ha.di.(j)ah/}} === Pautan luar === * {{R:PRPM}} == Bahasa Indonesia == {{Wikipedia|lang=id}} <!-- Kalau ada --> === Takrifan === ==== Kata nama ==== {{id-kn}} # Suatu [[pemberian]] yang diberi untuk menghargai seseorang atau sesuatu. === Etimologi === Daripada {{der|id|ar|هَدِيَّة}}. 
=== Sebutan === * {{IPA|id|/haˈdi(j)ah/}} * {{rhymes|id|jah|ah|h|s=3}} * {{hyphenation|id|ha|di|ah}} === Pautan luar === * {{R:KBBI Daring}} b9fblfedhxc33r1kvxck3t6t0wawoep خزن 0 47969 281313 165692 2026-04-21T15:53:20Z Hakimi97 2668 /* Etimologi */ 281313 wikitext text/x-wiki == Bahasa Arab == === Takrifan === ==== Kata kerja ==== {{head|ar|kata kerja}} # [[simpan]], [[kumpul]] # [[kandung]] # menyimpan [[rahsia]] === Etimologi === Daripada akar {{ar-root|خ ز ن}}. 84fjqh8jj4gww5l4jz3c03odqq6gyo3 Modul:headword/page 828 51771 281241 265774 2026-04-21T13:00:15Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/88725393|88725393]]) 281241 Scribunto text/plain local export = {} local languages_module = "Module:languages" local maintenance_category_module = "Module:maintenance category" local pages_module = "Module:pages" local string_compare_module = "Module:string/compare" local string_decode_entities_module = "Module:string/decodeEntities" local string_remove_comments_module = "Module:string/removeComments" local string_utilities_module = "Module:string utilities" local table_module = "Module:table" local template_parser_module = "Module:template parser" local mw = mw local string = string local table = table local ustring = mw.ustring local concat = table.concat local find = string.find local format = string.format local gsub = string.gsub local insert = table.insert local load_data = mw.loadData local match = string.match local new_title = mw.title.new local pairs = pairs local require = require local sub = string.sub local toNFC = ustring.toNFC local toNFD = ustring.toNFD local ugsub = ustring.gsub local function class_else_type(...) class_else_type = require(template_parser_module).class_else_type return class_else_type(...) end local function decode_entities(...) decode_entities = require(string_decode_entities_module) return decode_entities(...) end local function encode_entities(...) 
encode_entities = require(string_utilities_module).encode_entities return encode_entities(...) end local function get_category(...) get_category = require(maintenance_category_module).get_category return get_category(...) end local function get_lang(...) get_lang = require(languages_module).getByCode return get_lang(...) end local function list_to_set(...) list_to_set = require(table_module).listToSet return list_to_set(...) end local function parse(...) parse = require(template_parser_module).parse return parse(...) end local function remove_comments(...) remove_comments = require(string_remove_comments_module) return remove_comments(...) end local function physical_to_logical_pagename_if_mammoth(...) physical_to_logical_pagename_if_mammoth = require(pages_module).physical_to_logical_pagename_if_mammoth return physical_to_logical_pagename_if_mammoth(...) end local function split(...) split = require(string_utilities_module).split return split(...) end local function string_compare(...) string_compare = require(string_compare_module) return string_compare(...) end local function uupper(...) uupper = require(string_utilities_module).upper return uupper(...) end --[==[ Loaders for objects, which load data (or some other object) into some variable, which can then be accessed as "foo or get_foo()", where the function get_foo sets the object to "foo" and then returns it. This ensures they are only loaded when needed, and avoids the need to check for the existence of the object each time, since once "foo" has been set, "get_foo" will not be called again.]==] local langnames local function get_langnames() langnames, get_langnames = load_data("Module:languages/canonical names"), nil return langnames end -- Combining character data used when categorising unusual characters. These resolve into two patterns, used to find -- single combining characters (i.e. character + diacritic(s)) or double combining characters (i.e. character + -- diacritic(s) + character). 
-- Charsets are in the format used by Unicode's UnicodeSet tool: https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp. -- Single combining characters. -- Charset: [[:M:]&[:^Canonical_Combining_Class=/^Double_/:]&[:^subhead=Grapheme joiner:]&[:^Variation_Selector=Yes:]] -- Note: concatenating hundreds of lines at once gives an error, so () are used every 150 lines to break it up into chunks. local comb_chars_single = ("\204\128-\205\142" .. -- U+0300-U+034E "\205\144-\205\155" .. -- U+0350-U+035B "\205\163-\205\175" .. -- U+0363-U+036F "\210\131-\210\137" .. -- U+0483-U+0489 "\214\145-\214\189" .. -- U+0591-U+05BD "\214\191" .. -- U+05BF "\215\129" .. -- U+05C1 "\215\130" .. -- U+05C2 "\215\132" .. -- U+05C4 "\215\133" .. -- U+05C5 "\215\135" .. -- U+05C7 "\216\144-\216\154" .. -- U+0610-U+061A "\217\139-\217\159" .. -- U+064B-U+065F "\217\176" .. -- U+0670 "\219\150-\219\156" .. -- U+06D6-U+06DC "\219\159-\219\164" .. -- U+06DF-U+06E4 "\219\167" .. -- U+06E7 "\219\168" .. -- U+06E8 "\219\170-\219\173" .. -- U+06EA-U+06ED "\220\145" .. -- U+0711 "\220\176-\221\138" .. -- U+0730-U+074A "\222\166-\222\176" .. -- U+07A6-U+07B0 "\223\171-\223\179" .. -- U+07EB-U+07F3 "\223\189" .. -- U+07FD "\224\160\150-\224\160\153" .. -- U+0816-U+0819 "\224\160\155-\224\160\163" .. -- U+081B-U+0823 "\224\160\165-\224\160\167" .. -- U+0825-U+0827 "\224\160\169-\224\160\173" .. -- U+0829-U+082D "\224\161\153-\224\161\155" .. -- U+0859-U+085B "\224\162\151-\224\162\159" .. -- U+0897-U+089F "\224\163\138-\224\163\161" .. -- U+08CA-U+08E1 "\224\163\163-\224\164\131" .. -- U+08E3-U+0903 "\224\164\186-\224\164\188" .. -- U+093A-U+093C "\224\164\190-\224\165\143" .. -- U+093E-U+094F "\224\165\145-\224\165\151" .. -- U+0951-U+0957 "\224\165\162" .. -- U+0962 "\224\165\163" .. -- U+0963 "\224\166\129-\224\166\131" .. -- U+0981-U+0983 "\224\166\188" .. -- U+09BC "\224\166\190-\224\167\132" .. -- U+09BE-U+09C4 "\224\167\135" .. -- U+09C7 "\224\167\136" .. 
-- U+09C8 "\224\167\139-\224\167\141" .. -- U+09CB-U+09CD "\224\167\151" .. -- U+09D7 "\224\167\162" .. -- U+09E2 "\224\167\163" .. -- U+09E3 "\224\167\190" .. -- U+09FE "\224\168\129-\224\168\131" .. -- U+0A01-U+0A03 "\224\168\188" .. -- U+0A3C "\224\168\190-\224\169\130" .. -- U+0A3E-U+0A42 "\224\169\135" .. -- U+0A47 "\224\169\136" .. -- U+0A48 "\224\169\139-\224\169\141" .. -- U+0A4B-U+0A4D "\224\169\145" .. -- U+0A51 "\224\169\176" .. -- U+0A70 "\224\169\177" .. -- U+0A71 "\224\169\181" .. -- U+0A75 "\224\170\129-\224\170\131" .. -- U+0A81-U+0A83 "\224\170\188" .. -- U+0ABC "\224\170\190-\224\171\133" .. -- U+0ABE-U+0AC5 "\224\171\135-\224\171\137" .. -- U+0AC7-U+0AC9 "\224\171\139-\224\171\141" .. -- U+0ACB-U+0ACD "\224\171\162" .. -- U+0AE2 "\224\171\163" .. -- U+0AE3 "\224\171\186-\224\171\191" .. -- U+0AFA-U+0AFF "\224\172\129-\224\172\131" .. -- U+0B01-U+0B03 "\224\172\188" .. -- U+0B3C "\224\172\190-\224\173\132" .. -- U+0B3E-U+0B44 "\224\173\135" .. -- U+0B47 "\224\173\136" .. -- U+0B48 "\224\173\139-\224\173\141" .. -- U+0B4B-U+0B4D "\224\173\149-\224\173\151" .. -- U+0B55-U+0B57 "\224\173\162" .. -- U+0B62 "\224\173\163" .. -- U+0B63 "\224\174\130" .. -- U+0B82 "\224\174\190-\224\175\130" .. -- U+0BBE-U+0BC2 "\224\175\134-\224\175\136" .. -- U+0BC6-U+0BC8 "\224\175\138-\224\175\141" .. -- U+0BCA-U+0BCD "\224\175\151" .. -- U+0BD7 "\224\176\128-\224\176\132" .. -- U+0C00-U+0C04 "\224\176\188" .. -- U+0C3C "\224\176\190-\224\177\132" .. -- U+0C3E-U+0C44 "\224\177\134-\224\177\136" .. -- U+0C46-U+0C48 "\224\177\138-\224\177\141" .. -- U+0C4A-U+0C4D "\224\177\149" .. -- U+0C55 "\224\177\150" .. -- U+0C56 "\224\177\162" .. -- U+0C62 "\224\177\163" .. -- U+0C63 "\224\178\129-\224\178\131" .. -- U+0C81-U+0C83 "\224\178\188" .. -- U+0CBC "\224\178\190-\224\179\132" .. -- U+0CBE-U+0CC4 "\224\179\134-\224\179\136" .. -- U+0CC6-U+0CC8 "\224\179\138-\224\179\141" .. -- U+0CCA-U+0CCD "\224\179\149" .. -- U+0CD5 "\224\179\150" .. -- U+0CD6 "\224\179\162" .. 
-- U+0CE2 "\224\179\163" .. -- U+0CE3 "\224\179\179" .. -- U+0CF3 "\224\180\128-\224\180\131" .. -- U+0D00-U+0D03 "\224\180\187" .. -- U+0D3B "\224\180\188" .. -- U+0D3C "\224\180\190-\224\181\132" .. -- U+0D3E-U+0D44 "\224\181\134-\224\181\136" .. -- U+0D46-U+0D48 "\224\181\138-\224\181\141" .. -- U+0D4A-U+0D4D "\224\181\151" .. -- U+0D57 "\224\181\162" .. -- U+0D62 "\224\181\163" .. -- U+0D63 "\224\182\129-\224\182\131" .. -- U+0D81-U+0D83 "\224\183\138" .. -- U+0DCA "\224\183\143-\224\183\148" .. -- U+0DCF-U+0DD4 "\224\183\150" .. -- U+0DD6 "\224\183\152-\224\183\159" .. -- U+0DD8-U+0DDF "\224\183\178" .. -- U+0DF2 "\224\183\179" .. -- U+0DF3 "\224\184\177" .. -- U+0E31 "\224\184\180-\224\184\186" .. -- U+0E34-U+0E3A "\224\185\135-\224\185\142" .. -- U+0E47-U+0E4E "\224\186\177" .. -- U+0EB1 "\224\186\180-\224\186\188" .. -- U+0EB4-U+0EBC "\224\187\136-\224\187\142" .. -- U+0EC8-U+0ECE "\224\188\152" .. -- U+0F18 "\224\188\153" .. -- U+0F19 "\224\188\181" .. -- U+0F35 "\224\188\183" .. -- U+0F37 "\224\188\185" .. -- U+0F39 "\224\188\190" .. -- U+0F3E "\224\188\191" .. -- U+0F3F "\224\189\177-\224\190\132" .. -- U+0F71-U+0F84 "\224\190\134" .. -- U+0F86 "\224\190\135" .. -- U+0F87 "\224\190\141-\224\190\151" .. -- U+0F8D-U+0F97 "\224\190\153-\224\190\188" .. -- U+0F99-U+0FBC "\224\191\134" .. -- U+0FC6 "\225\128\171-\225\128\190" .. -- U+102B-U+103E "\225\129\150-\225\129\153" .. -- U+1056-U+1059 "\225\129\158-\225\129\160" .. -- U+105E-U+1060 "\225\129\162-\225\129\164" .. -- U+1062-U+1064 "\225\129\167-\225\129\173" .. -- U+1067-U+106D "\225\129\177-\225\129\180" .. -- U+1071-U+1074 "\225\130\130-\225\130\141" .. -- U+1082-U+108D "\225\130\143" .. -- U+108F "\225\130\154-\225\130\157" .. -- U+109A-U+109D "\225\141\157-\225\141\159" .. -- U+135D-U+135F "\225\156\146-\225\156\149" .. -- U+1712-U+1715 "\225\156\178-\225\156\180" .. -- U+1732-U+1734 "\225\157\146" .. -- U+1752 "\225\157\147" .. -- U+1753 "\225\157\178" .. -- U+1772 "\225\157\179" .. 
-- U+1773 "\225\158\180-\225\159\147") .. -- U+17B4-U+17D3 ("\225\159\157" .. -- U+17DD "\225\162\133" .. -- U+1885 "\225\162\134" .. -- U+1886 "\225\162\169" .. -- U+18A9 "\225\164\160-\225\164\171" .. -- U+1920-U+192B "\225\164\176-\225\164\187" .. -- U+1930-U+193B "\225\168\151-\225\168\155" .. -- U+1A17-U+1A1B "\225\169\149-\225\169\158" .. -- U+1A55-U+1A5E "\225\169\160-\225\169\188" .. -- U+1A60-U+1A7C "\225\169\191" .. -- U+1A7F "\225\170\176-\225\171\142" .. -- U+1AB0-U+1ACE "\225\172\128-\225\172\132" .. -- U+1B00-U+1B04 "\225\172\180-\225\173\132" .. -- U+1B34-U+1B44 "\225\173\171-\225\173\179" .. -- U+1B6B-U+1B73 "\225\174\128-\225\174\130" .. -- U+1B80-U+1B82 "\225\174\161-\225\174\173" .. -- U+1BA1-U+1BAD "\225\175\166-\225\175\179" .. -- U+1BE6-U+1BF3 "\225\176\164-\225\176\183" .. -- U+1C24-U+1C37 "\225\179\144-\225\179\146" .. -- U+1CD0-U+1CD2 "\225\179\148-\225\179\168" .. -- U+1CD4-U+1CE8 "\225\179\173" .. -- U+1CED "\225\179\180" .. -- U+1CF4 "\225\179\183-\225\179\185" .. -- U+1CF7-U+1CF9 "\225\183\128-\225\183\140" .. -- U+1DC0-U+1DCC "\225\183\142-\225\183\187" .. -- U+1DCE-U+1DFB "\225\183\189-\225\183\191" .. -- U+1DFD-U+1DFF "\226\131\144-\226\131\176" .. -- U+20D0-U+20F0 "\226\179\175-\226\179\177" .. -- U+2CEF-U+2CF1 "\226\181\191" .. -- U+2D7F "\226\183\160-\226\183\191" .. -- U+2DE0-U+2DFF "\227\128\170-\227\128\175" .. -- U+302A-U+302F "\227\130\153" .. -- U+3099 "\227\130\154" .. -- U+309A "\234\153\175-\234\153\178" .. -- U+A66F-U+A672 "\234\153\180-\234\153\189" .. -- U+A674-U+A67D "\234\154\158" .. -- U+A69E "\234\154\159" .. -- U+A69F "\234\155\176" .. -- U+A6F0 "\234\155\177" .. -- U+A6F1 "\234\160\130" .. -- U+A802 "\234\160\134" .. -- U+A806 "\234\160\139" .. -- U+A80B "\234\160\163-\234\160\167" .. -- U+A823-U+A827 "\234\160\172" .. -- U+A82C "\234\162\128" .. -- U+A880 "\234\162\129" .. -- U+A881 "\234\162\180-\234\163\133" .. -- U+A8B4-U+A8C5 "\234\163\160-\234\163\177" .. -- U+A8E0-U+A8F1 "\234\163\191" .. 
-- U+A8FF "\234\164\166-\234\164\173" .. -- U+A926-U+A92D "\234\165\135-\234\165\147" .. -- U+A947-U+A953 "\234\166\128-\234\166\131" .. -- U+A980-U+A983 "\234\166\179-\234\167\128" .. -- U+A9B3-U+A9C0 "\234\167\165" .. -- U+A9E5 "\234\168\169-\234\168\182" .. -- U+AA29-U+AA36 "\234\169\131" .. -- U+AA43 "\234\169\140" .. -- U+AA4C "\234\169\141" .. -- U+AA4D "\234\169\187-\234\169\189" .. -- U+AA7B-U+AA7D "\234\170\176" .. -- U+AAB0 "\234\170\178-\234\170\180" .. -- U+AAB2-U+AAB4 "\234\170\183" .. -- U+AAB7 "\234\170\184" .. -- U+AAB8 "\234\170\190" .. -- U+AABE "\234\170\191" .. -- U+AABF "\234\171\129" .. -- U+AAC1 "\234\171\171-\234\171\175" .. -- U+AAEB-U+AAEF "\234\171\181" .. -- U+AAF5 "\234\171\182" .. -- U+AAF6 "\234\175\163-\234\175\170" .. -- U+ABE3-U+ABEA "\234\175\172" .. -- U+ABEC "\234\175\173" .. -- U+ABED "\239\172\158" .. -- U+FB1E "\239\184\160-\239\184\175" .. -- U+FE20-U+FE2F "\240\144\135\189" .. -- U+101FD "\240\144\139\160" .. -- U+102E0 "\240\144\141\182-\240\144\141\186" .. -- U+10376-U+1037A "\240\144\168\129-\240\144\168\131" .. -- U+10A01-U+10A03 "\240\144\168\133" .. -- U+10A05 "\240\144\168\134" .. -- U+10A06 "\240\144\168\140-\240\144\168\143" .. -- U+10A0C-U+10A0F "\240\144\168\184-\240\144\168\186" .. -- U+10A38-U+10A3A "\240\144\168\191" .. -- U+10A3F "\240\144\171\165" .. -- U+10AE5 "\240\144\171\166" .. -- U+10AE6 "\240\144\180\164-\240\144\180\167" .. -- U+10D24-U+10D27 "\240\144\181\169-\240\144\181\173" .. -- U+10D69-U+10D6D "\240\144\186\171" .. -- U+10EAB "\240\144\186\172" .. -- U+10EAC "\240\144\187\188-\240\144\187\191" .. -- U+10EFC-U+10EFF "\240\144\189\134-\240\144\189\144" .. -- U+10F46-U+10F50 "\240\144\190\130-\240\144\190\133" .. -- U+10F82-U+10F85 "\240\145\128\128-\240\145\128\130" .. -- U+11000-U+11002 "\240\145\128\184-\240\145\129\134" .. -- U+11038-U+11046 "\240\145\129\176" .. -- U+11070 "\240\145\129\179" .. -- U+11073 "\240\145\129\180" .. -- U+11074 "\240\145\129\191-\240\145\130\130" .. 
-- U+1107F-U+11082 "\240\145\130\176-\240\145\130\186" .. -- U+110B0-U+110BA "\240\145\131\130" .. -- U+110C2 "\240\145\132\128-\240\145\132\130" .. -- U+11100-U+11102 "\240\145\132\167-\240\145\132\180" .. -- U+11127-U+11134 "\240\145\133\133" .. -- U+11145 "\240\145\133\134" .. -- U+11146 "\240\145\133\179" .. -- U+11173 "\240\145\134\128-\240\145\134\130" .. -- U+11180-U+11182 "\240\145\134\179-\240\145\135\128" .. -- U+111B3-U+111C0 "\240\145\135\137-\240\145\135\140" .. -- U+111C9-U+111CC "\240\145\135\142" .. -- U+111CE "\240\145\135\143" .. -- U+111CF "\240\145\136\172-\240\145\136\183" .. -- U+1122C-U+11237 "\240\145\136\190" .. -- U+1123E "\240\145\137\129" .. -- U+11241 "\240\145\139\159-\240\145\139\170" .. -- U+112DF-U+112EA "\240\145\140\128-\240\145\140\131" .. -- U+11300-U+11303 "\240\145\140\187" .. -- U+1133B "\240\145\140\188" .. -- U+1133C "\240\145\140\190-\240\145\141\132" .. -- U+1133E-U+11344 "\240\145\141\135" .. -- U+11347 "\240\145\141\136" .. -- U+11348 "\240\145\141\139-\240\145\141\141" .. -- U+1134B-U+1134D "\240\145\141\151" .. -- U+11357 "\240\145\141\162" .. -- U+11362 "\240\145\141\163" .. -- U+11363 "\240\145\141\166-\240\145\141\172" .. -- U+11366-U+1136C "\240\145\141\176-\240\145\141\180" .. -- U+11370-U+11374 "\240\145\142\184-\240\145\143\128" .. -- U+113B8-U+113C0 "\240\145\143\130" .. -- U+113C2 "\240\145\143\133" .. -- U+113C5 "\240\145\143\135-\240\145\143\138" .. -- U+113C7-U+113CA "\240\145\143\140-\240\145\143\144" .. -- U+113CC-U+113D0 "\240\145\143\146" .. -- U+113D2 "\240\145\143\161" .. -- U+113E1 "\240\145\143\162" .. -- U+113E2 "\240\145\144\181-\240\145\145\134" .. -- U+11435-U+11446 "\240\145\145\158" .. -- U+1145E "\240\145\146\176-\240\145\147\131" .. -- U+114B0-U+114C3 "\240\145\150\175-\240\145\150\181" .. -- U+115AF-U+115B5 "\240\145\150\184-\240\145\151\128" .. -- U+115B8-U+115C0 "\240\145\151\156" .. -- U+115DC "\240\145\151\157" .. -- U+115DD "\240\145\152\176-\240\145\153\128" .. 
-- U+11630-U+11640 "\240\145\154\171-\240\145\154\183" .. -- U+116AB-U+116B7 "\240\145\156\157-\240\145\156\171" .. -- U+1171D-U+1172B "\240\145\160\172-\240\145\160\186" .. -- U+1182C-U+1183A "\240\145\164\176-\240\145\164\181" .. -- U+11930-U+11935 "\240\145\164\183" .. -- U+11937 "\240\145\164\184" .. -- U+11938 "\240\145\164\187-\240\145\164\190" .. -- U+1193B-U+1193E "\240\145\165\128") .. -- U+11940 ("\240\145\165\130" .. -- U+11942 "\240\145\165\131" .. -- U+11943 "\240\145\167\145-\240\145\167\151" .. -- U+119D1-U+119D7 "\240\145\167\154-\240\145\167\160" .. -- U+119DA-U+119E0 "\240\145\167\164" .. -- U+119E4 "\240\145\168\129-\240\145\168\138" .. -- U+11A01-U+11A0A "\240\145\168\179-\240\145\168\185" .. -- U+11A33-U+11A39 "\240\145\168\187-\240\145\168\190" .. -- U+11A3B-U+11A3E "\240\145\169\135" .. -- U+11A47 "\240\145\169\145-\240\145\169\155" .. -- U+11A51-U+11A5B "\240\145\170\138-\240\145\170\153" .. -- U+11A8A-U+11A99 "\240\145\176\175-\240\145\176\182" .. -- U+11C2F-U+11C36 "\240\145\176\184-\240\145\176\191" .. -- U+11C38-U+11C3F "\240\145\178\146-\240\145\178\167" .. -- U+11C92-U+11CA7 "\240\145\178\169-\240\145\178\182" .. -- U+11CA9-U+11CB6 "\240\145\180\177-\240\145\180\182" .. -- U+11D31-U+11D36 "\240\145\180\186" .. -- U+11D3A "\240\145\180\188" .. -- U+11D3C "\240\145\180\189" .. -- U+11D3D "\240\145\180\191-\240\145\181\133" .. -- U+11D3F-U+11D45 "\240\145\181\135" .. -- U+11D47 "\240\145\182\138-\240\145\182\142" .. -- U+11D8A-U+11D8E "\240\145\182\144" .. -- U+11D90 "\240\145\182\145" .. -- U+11D91 "\240\145\182\147-\240\145\182\151" .. -- U+11D93-U+11D97 "\240\145\187\179-\240\145\187\182" .. -- U+11EF3-U+11EF6 "\240\145\188\128" .. -- U+11F00 "\240\145\188\129" .. -- U+11F01 "\240\145\188\131" .. -- U+11F03 "\240\145\188\180-\240\145\188\186" .. -- U+11F34-U+11F3A "\240\145\188\190-\240\145\189\130" .. -- U+11F3E-U+11F42 "\240\145\189\154" .. -- U+11F5A "\240\147\145\128" .. -- U+13440 "\240\147\145\135-\240\147\145\149" .. 
-- U+13447-U+13455 "\240\150\132\158-\240\150\132\175" .. -- U+1611E-U+1612F "\240\150\171\176-\240\150\171\180" .. -- U+16AF0-U+16AF4 "\240\150\172\176-\240\150\172\182" .. -- U+16B30-U+16B36 "\240\150\189\143" .. -- U+16F4F "\240\150\189\145-\240\150\190\135" .. -- U+16F51-U+16F87 "\240\150\190\143-\240\150\190\146" .. -- U+16F8F-U+16F92 "\240\150\191\164" .. -- U+16FE4 "\240\150\191\176" .. -- U+16FF0 "\240\150\191\177" .. -- U+16FF1 "\240\155\178\157" .. -- U+1BC9D "\240\155\178\158" .. -- U+1BC9E "\240\156\188\128-\240\156\188\173" .. -- U+1CF00-U+1CF2D "\240\156\188\176-\240\156\189\134" .. -- U+1CF30-U+1CF46 "\240\157\133\165-\240\157\133\169" .. -- U+1D165-U+1D169 "\240\157\133\173-\240\157\133\178" .. -- U+1D16D-U+1D172 "\240\157\133\187-\240\157\134\130" .. -- U+1D17B-U+1D182 "\240\157\134\133-\240\157\134\139" .. -- U+1D185-U+1D18B "\240\157\134\170-\240\157\134\173" .. -- U+1D1AA-U+1D1AD "\240\157\137\130-\240\157\137\132" .. -- U+1D242-U+1D244 "\240\157\168\128-\240\157\168\182" .. -- U+1DA00-U+1DA36 "\240\157\168\187-\240\157\169\172" .. -- U+1DA3B-U+1DA6C "\240\157\169\181" .. -- U+1DA75 "\240\157\170\132" .. -- U+1DA84 "\240\157\170\155-\240\157\170\159" .. -- U+1DA9B-U+1DA9F "\240\157\170\161-\240\157\170\175" .. -- U+1DAA1-U+1DAAF "\240\158\128\128-\240\158\128\134" .. -- U+1E000-U+1E006 "\240\158\128\136-\240\158\128\152" .. -- U+1E008-U+1E018 "\240\158\128\155-\240\158\128\161" .. -- U+1E01B-U+1E021 "\240\158\128\163" .. -- U+1E023 "\240\158\128\164" .. -- U+1E024 "\240\158\128\166-\240\158\128\170" .. -- U+1E026-U+1E02A "\240\158\130\143" .. -- U+1E08F "\240\158\132\176-\240\158\132\182" .. -- U+1E130-U+1E136 "\240\158\138\174" .. -- U+1E2AE "\240\158\139\172-\240\158\139\175" .. -- U+1E2EC-U+1E2EF "\240\158\147\172-\240\158\147\175" .. -- U+1E4EC-U+1E4EF "\240\158\151\174" .. -- U+1E5EE "\240\158\151\175" .. -- U+1E5EF "\240\158\163\144-\240\158\163\150" .. 
-- U+1E8D0-U+1E8D6 "\240\158\165\132-\240\158\165\138") -- U+1E944-U+1E94A -- Double combining characters. -- Charset: [[:M:]&[:Canonical_Combining_Class=/^Double_/:]&[:^subhead=Grapheme joiner:]&[:^Variation_Selector=Yes:]] local comb_chars_double = "\205\156-\205\162" .. -- U+035C-U+0362 "\225\183\141" .. -- U+1DCD "\225\183\188" -- U+1DFC -- Variation selectors etc.; separated out so that we don't get categories for them. -- Charset: [[:M:]&[[:subhead=Grapheme joiner:][:Variation_Selector=Yes:]]]. local comb_chars_other = "\205\143" .. -- U+034F "\225\160\139-\225\160\141" .. -- U+180B-U+180D "\225\160\143" .. -- U+180F "\239\184\128-\239\184\143" .. -- U+FE00-U+FE0F "\243\160\132\128-\243\160\135\175" -- U+E0100-U+E01EF local comb_chars_all = comb_chars_single .. comb_chars_double .. comb_chars_other local comb_chars = { combined_single = "[^" .. comb_chars_all .. "][" .. comb_chars_single .. comb_chars_other .. "]+%f[^" .. comb_chars_all .. "]", combined_double = "[^" .. comb_chars_all .. "][" .. comb_chars_single .. comb_chars_other .. "]*[" .. comb_chars_double .. "]+[" .. comb_chars_all .. "]*.[" .. comb_chars_single .. comb_chars_other .. "]*", diacritics_single = "[" .. comb_chars_single .. "]", diacritics_double = "[" .. comb_chars_double .. "]", diacritics_all = "[" .. comb_chars_all .. "]" } -- Somewhat curated list from https://unicode.org/Public/emoji/16.0/emoji-sequences.txt. -- NOTE: There are lots more emoji sequences involving non-emoji Plane 0 symbols followed by 0xFE0F, which we don't -- (yet?) handle. local emoji_chars = "\226\140\154" .. -- U+231A (⌚) "\226\140\155" .. -- U+231B (⌛) "\226\140\168" .. -- U+2328 (⌨) "\226\143\143" .. -- U+23CF (⏏) "\226\143\169-\226\143\179" .. -- U+23E9-U+23F3 (⏩-⏳) "\226\143\184-\226\143\186" .. -- U+23F8-U+23FA (⏸-⏺) "\226\150\170" .. -- U+25AA (▪) "\226\150\171" .. -- U+25AB (▫) "\226\150\182" .. -- U+25B6 (▶) "\226\151\128" .. -- U+25C0 (◀) "\226\151\187-\226\151\190" .. 
-- U+25FB-U+25FE (◻-◾) "\226\152\128-\226\152\132" .. -- U+2600-U+2604 (☀-☄) "\226\152\142" .. -- U+260E (☎) "\226\152\145" .. -- U+2611 (☑) "\226\152\148" .. -- U+2614 (☔) "\226\152\149" .. -- U+2615 (☕) "\226\152\152" .. -- U+2618 (☘) "\226\152\157" .. -- U+261D (☝) "\226\152\160" .. -- U+2620 (☠) "\226\152\162" .. -- U+2622 (☢) "\226\152\163" .. -- U+2623 (☣) "\226\152\166" .. -- U+2626 (☦) "\226\152\170" .. -- U+262A (☪) "\226\152\174" .. -- U+262E (☮) "\226\152\175" .. -- U+262F (☯) "\226\152\184-\226\152\186" .. -- U+2638-U+263A (☸-☺) "\226\153\136-\226\153\147" .. -- U+2648-U+2653 (♈-♓) "\226\153\159" .. -- U+265F (♟) "\226\153\160" .. -- U+2660 (♠) "\226\153\163" .. -- U+2663 (♣) "\226\153\165" .. -- U+2665 (♥) "\226\153\166" .. -- U+2666 (♦) "\226\153\168" .. -- U+2668 (♨) "\226\153\187" .. -- U+267B (♻) "\226\153\190" .. -- U+267E (♾) "\226\153\191" .. -- U+267F (♿) "\226\154\146-\226\154\151" .. -- U+2692-U+2697 (⚒-⚗) "\226\154\153" .. -- U+2699 (⚙) "\226\154\155" .. -- U+269B (⚛) "\226\154\156" .. -- U+269C (⚜) "\226\154\160" .. -- U+26A0 (⚠) "\226\154\161" .. -- U+26A1 (⚡) "\226\154\170" .. -- U+26AA (⚪) "\226\154\171" .. -- U+26AB (⚫) "\226\154\176" .. -- U+26B0 (⚰) "\226\154\177" .. -- U+26B1 (⚱) "\226\154\189" .. -- U+26BD (⚽) "\226\154\190" .. -- U+26BE (⚾) "\226\155\132" .. -- U+26C4 (⛄) "\226\155\133" .. -- U+26C5 (⛅) "\226\155\136" .. -- U+26C8 (⛈) "\226\155\142" .. -- U+26CE (⛎) "\226\155\143" .. -- U+26CF (⛏) "\226\155\145" .. -- U+26D1 (⛑) "\226\155\147" .. -- U+26D3 (⛓) "\226\155\148" .. -- U+26D4 (⛔) "\226\155\169" .. -- U+26E9 (⛩) "\226\155\170" .. -- U+26EA (⛪) "\226\155\176-\226\155\181" .. -- U+26F0-U+26F5 (⛰-⛵) "\226\155\183-\226\155\186" .. -- U+26F7-U+26FA (⛷-⛺) "\226\155\189" .. -- U+26FD (⛽) "\226\156\130" .. -- U+2702 (✂) "\226\156\133" .. -- U+2705 (✅) "\226\156\136-\226\156\141" .. -- U+2708-U+270D (✈-✍) "\226\156\143" .. -- U+270F (✏) "\226\156\146" .. -- U+2712 (✒) "\226\156\148" .. -- U+2714 (✔) "\226\156\150" .. 
-- U+2716 (✖) "\226\156\157" .. -- U+271D (✝) "\226\156\161" .. -- U+2721 (✡) "\226\156\168" .. -- U+2728 (✨) "\226\156\179" .. -- U+2733 (✳) "\226\156\180" .. -- U+2734 (✴) "\226\157\132" .. -- U+2744 (❄) "\226\157\135" .. -- U+2747 (❇) "\226\157\140" .. -- U+274C (❌) "\226\157\142" .. -- U+274E (❎) "\226\157\147-\226\157\149" .. -- U+2753-U+2755 (❓-❕) "\226\157\151" .. -- U+2757 (❗) "\226\157\163" .. -- U+2763 (❣) "\226\157\164" .. -- U+2764 (❤) "\226\158\149-\226\158\151" .. -- U+2795-U+2797 (➕-➗) "\226\158\161" .. -- U+27A1 (➡) "\226\158\176" .. -- U+27B0 (➰) "\226\158\191" .. -- U+27BF (➿) "\226\164\180" .. -- U+2934 (⤴) "\226\164\181" .. -- U+2935 (⤵) "\226\172\133-\226\172\135" .. -- U+2B05-U+2B07 (⬅-⬇) "\226\172\155" .. -- U+2B1B (⬛) "\226\172\156" .. -- U+2B1C (⬜) "\226\173\144" .. -- U+2B50 (⭐) "\226\173\149" .. -- U+2B55 (⭕) "\227\128\176" .. -- U+3030 (〰) "\227\128\189" .. -- U+303D (〽) "\227\138\151" .. -- U+3297 (㊗) "\227\138\153" .. -- U+3299 (㊙) "\240\159\128\132" .. -- U+1F004 (🀄) "\240\159\131\143" .. -- U+1F0CF (🃏) "\240\159\133\176" .. -- U+1F170 (🅰) "\240\159\133\177" .. -- U+1F171 (🅱) "\240\159\133\190" .. -- U+1F17E (🅾) "\240\159\133\191" .. -- U+1F17F (🅿) "\240\159\134\142" .. -- U+1F18E (🆎) "\240\159\134\145-\240\159\134\154" .. -- U+1F191-U+1F19A (🆑-🆚) "\240\159\136\129" .. -- U+1F201 (🈁) "\240\159\136\130" .. -- U+1F202 (🈂) "\240\159\136\154" .. -- U+1F21A (🈚) "\240\159\136\175" .. -- U+1F22F (🈯) "\240\159\136\178-\240\159\136\186" .. -- U+1F232-U+1F23A (🈲-🈺) "\240\159\137\144" .. -- U+1F250 (🉐) "\240\159\137\145" .. -- U+1F251 (🉑) "\240\159\140\128-\240\159\153\143" .. -- U+1F300-U+1F64F (🌀-🙏) "\240\159\154\128-\240\159\155\151" .. -- U+1F680-U+1F6D7 (🚀-🛗) "\240\159\155\156-\240\159\155\172" .. -- U+1F6DC-U+1F6EC (🛜-🛬) "\240\159\155\176-\240\159\155\188" .. -- U+1F6F0-U+1F6FC (🛰-🛼) "\240\159\159\160-\240\159\159\171" .. -- U+1F7E0-U+1F7EB (🟠-🟫) "\240\159\159\176" .. -- U+1F7F0 (🟰) "\240\159\164\140-\240\159\169\147" .. 
-- U+1F90C-U+1FA53 (🤌-🩓)
	"\240\159\169\160-\240\159\169\173" .. -- U+1FA60-U+1FA6D (🩠-🩭)
	"\240\159\169\176-\240\159\169\188" .. -- U+1FA70-U+1FA7C (🩰-🩼)
	"\240\159\170\128-\240\159\170\137" .. -- U+1FA80-U+1FA89 (🪀-🪉)
	"\240\159\170\143-\240\159\171\134" .. -- U+1FA8F-U+1FAC6 (🪏-🫆)
	"\240\159\171\142-\240\159\171\156" .. -- U+1FACE-U+1FADC (🫎-🫜)
	"\240\159\171\159-\240\159\171\169" .. -- U+1FADF-U+1FAE9 (🫟-🫩)
	"\240\159\171\176-\240\159\171\184" -- U+1FAF0-U+1FAF8 (🫰-🫸)

-- Load the map of unsupported characters and invert it (so the keys are the "`...`" escapes used in pagenames and the
-- values are the characters they stand for).
local unsupported_characters
local function get_unsupported_characters()
	unsupported_characters, get_unsupported_characters = {}, nil
	for k, v in pairs(load_data("Module:links/data").unsupported_characters) do
		unsupported_characters[v] = k
	end
	return unsupported_characters
end

-- Load the list of unsupported titles and invert it (so the keys are pagenames and values are canonical titles).
local unsupported_titles
local function get_unsupported_titles()
	unsupported_titles, get_unsupported_titles = {}, nil
	for k, v in pairs(load_data("Module:links/data").unsupported_titles) do
		unsupported_titles[v] = k
	end
	return unsupported_titles
end

-- To save on memory, we only cache names with either non-ASCII characters in them or ASCII characters to be removed or
-- transformed (apostrophe, double quote, hyphen).
local L2_sort_key_cache = {}

function export.get_L2_sort_key(L2)
	if L2 == "Rentas bahasa" then
		return "\1"
	elseif L2 == "Bahasa Melayu" then
		return "\2"
	elseif match(L2, "^[%z\1-\b\14-!#-&(-,.-\127]+$") then
		return L2
	end
	local sort_key = L2_sort_key_cache[L2]
	if sort_key then
		return sort_key
	end
	sort_key = toNFC(ugsub(ugsub(toNFD(L2), "[" .. comb_chars_all .. "'\"ʻʼ]+", ""), "[%s%-]+", " "))
	L2_sort_key_cache[L2] = sort_key
	return sort_key
end

--[==[
Given a pagename (or {nil} for the current page), create and return a data structure describing the page.
The returned object includes the following fields:
* `comb_chars`: A table containing various Lua patterns for different types of combined characters (those that decompose into multiple characters in the NFD decomposition). The patterns are meant to be used with {mw.ustring.find()}. The keys are:
** `diacritics_single`: Character class (with surrounding brackets) matching a single combining character, i.e. a diacritic used in a character + diacritic combination;
** `diacritics_double`: Character class (with surrounding brackets) matching a double combining character, i.e. a diacritic used in a character + diacritic + character combination;
** `diacritics_all`: Character class (with surrounding brackets) matching any single or double combining character, as well as variation selectors and similar marks;
** `combined_single`: Lua pattern for matching a spacing character followed by one or more single combining characters;
** `combined_double`: Lua pattern for matching a combination of two spacing characters separated by one or more double combining characters, possibly also with single combining characters;
* `emoji_pattern`: A Lua character class pattern (including surrounding brackets) that matches emojis. Meant to be used with {mw.ustring.find()}.
* `L2_list`: Ordered list of L2 headings on the page, with the extra key `n` that gives the length of the list.
* `L2_sections`: Lookup table of L2 headings on the page, where the key is the section number assigned by the preprocessor, and the value is the L2 heading name. Once an invocation has got its actual section number from get_current_L2 in [[Module:pages]], it can use this table to determine its parent L2. TODO: We could expand this to include subsections, to check POS headings are correct etc.
* `unsupported_titles`: Map from pagenames to canonical titles for unsupported-title pages.
* `namespace`: Namespace of the pagename.
* `ns`: Namespace table for the page from mw.site.namespaces (TODO: merge with `namespace` above).
* `full_raw_pagename`: Full version of the '''RAW''' pagename (i.e. unsupported-title pages aren't canonicalized), including the namespace and the base (portion before the slash).
* `pagename`: Canonicalized subpage portion of the pagename (unsupported-title pages are canonicalized).
* `pagename_with_base`: Same as `pagename` in the main namespace; otherwise, the whole pagename without the namespace.
* `decompose_pagename`: Equivalent of `pagename` in NFD decomposition.
* `pagename_len`: Length of `pagename` in Unicode chars, where combinations of spacing character + decomposed diacritic are treated as single characters.
* `explode_pagename`: Set of characters found in `pagename`. The keys are characters (where combinations of spacing character + decomposed diacritic are treated as single characters).
* `encoded_pagename`: `pagename` with HTML entities encoded (the result of `encode_entities`).
* `pagename_defaultsort`: Sort key generated from `encoded_pagename` by the `makeSortKey` method of the `mul` language object; used as the DEFAULTSORT key unless `no_fetch_content` is set.
* `raw_defaultsort`: Uppercased text of the raw title (without the namespace).
* `cats`: List of maintenance categories collected while processing the page.
* `wikitext_topic_cat`: FIXME: Document me.
* `wikitext_langname_cat`: FIXME: Document me.

`no_fetch_content` says to not fetch and parse the content or set a DEFAULTSORT sort key, in order to save time on test and documentation pages that have lots of template invocations that set `|pagename=`. It turns out nearly all the time of this function is spent in the line `frame:callParserFunction("DEFAULTSORT", data.pagename_defaultsort)`, so we skip it on test and documentation pages where it accomplishes nothing in any case.
]==]
function export.process_page(pagename, no_fetch_content)
	local data = {
		comb_chars = comb_chars,
		emoji_pattern = "[" .. emoji_chars .. "]",
		unsupported_titles = unsupported_titles or get_unsupported_titles()
	}
	local cats = {}
	data.cats = cats

	-- We cannot store `raw_title` in `data` because it contains a metatable.
local raw_title local function bad_pagename() if not pagename then error("Internal error: Something wrong, `data.pagename` not specified but current title contains illegal characters") else error(format("Bad value for `data.pagename`: '%s', which must not contain illegal characters", pagename)) end end if pagename then -- for testing, doc pages, etc. raw_title = new_title(pagename) if not raw_title then bad_pagename() end else raw_title = mw.title.getCurrentTitle() end local nsText = raw_title.nsText local namespace_is_reconstruction = nsText == "Rekonstruksi" data.namespace = nsText data.ns = mw.site.namespaces[raw_title.namespace] local full_raw_pagename = raw_title.fullText data.full_raw_pagename = full_raw_pagename local frame = mw.getCurrentFrame() -- WARNING: `content` may be nil, e.g. if we're substing a template like {{ja-new}} on a not-yet-created page -- or if the module specifies the subpage as `data.pagename` (which many modules do) and we're in an Appendix -- or other non-mainspace page. We used to make the latter an error but there are too many modules that do it, -- and substing on a nonexistent page is totally legit, and we don't actually need to be able to access the -- content of the page. local content = not no_fetch_content and raw_title:getContent() or nil -- Get the pagename. pagename = physical_to_logical_pagename_if_mammoth(raw_title) pagename = gsub(pagename, "^Unsupported titles/(.+)", function(m) insert(cats, "Tajuk tidak disokong") local title = (unsupported_titles or get_unsupported_titles())[m] if title then return title end -- Substitute pairs of "`". Those not used for escaping should be escaped as "`grave`", but might not be, -- so if a pair don't form a match, the closing "`" should become the opening "`" of the next match attempt. -- This has to be done manually, instead of using gsub. 
local open_pos = find(m, "`") if not open_pos then return m end title = {sub(m, 1, open_pos - 1)} while true do local close_pos = find(m, "`", open_pos + 1) if not close_pos then -- Add "`" plus any remaining characters. insert(title, sub(m, open_pos)) break end local escape = sub(m, open_pos, close_pos) local ch = (unsupported_characters or get_unsupported_characters())[escape] -- Match found, so substitute the character and move to the first "`" after the match if found, or -- otherwise return. if ch then insert(title, ch) local nxt_pos = close_pos + 1 open_pos = find(m, "`", nxt_pos) -- Add any characters between the match and the next "`" or end. if open_pos then insert(title, sub(m, nxt_pos, open_pos - 1)) else insert(title, sub(m, nxt_pos)) break end -- Match not found, so make the closing "`" the opening "`" of the next attempt. else -- Add the failed match, except for the closing "`". insert(title, sub(m, open_pos, close_pos - 1)) open_pos = close_pos end end return concat(title) end) -- Save pagename, as the local variable will be destructively modified. data.pagename = pagename if nsText == "" then data.pagename_with_base = pagename else data.pagename_with_base = raw_title.text end -- Decompose the pagename in Unicode normalization form D. data.decompose_pagename = toNFD(pagename) -- Explode the current page name into a character table, taking decomposed combining characters into account. local explode_pagename = {} local pagename_len = 0 local function explode(char) explode_pagename[char] = true pagename_len = pagename_len + 1 return "" end pagename = ugsub(pagename, comb_chars.combined_double, explode) pagename = gsub(ugsub(pagename, comb_chars.combined_single, explode), ".[\128-\191]*", explode) data.explode_pagename = explode_pagename data.pagename_len = pagename_len -- Generate DEFAULTSORT. 
data.encoded_pagename = encode_entities(data.pagename) data.pagename_defaultsort = get_lang("mul"):makeSortKey(data.encoded_pagename) if not no_fetch_content then frame:callParserFunction("DEFAULTSORT", data.pagename_defaultsort) end data.raw_defaultsort = uupper(raw_title.text) -- Make `L2_list` and `L2_sections`, note raw wikitext use of {{DEFAULTSORT:}} and {{DISPLAYTITLE:}}, then add categories if any unwanted L1 headings are found, the L2 headings are in the wrong order, or they don't match a canonical language name. -- Note: HTML comments shouldn't be removed from `content` until after this step, as they can affect the result. do local L2_list, L2_list_len, L2_sections = {}, 0, {} local prev, rc local new_cats, L2_wrong_order = {} local function handle_heading(heading) local level = heading.level if level > 2 then return end local name = heading:get_name() -- heading:get_name() will return nil if there are any newline characters in the preprocessed heading name (e.g. from an expanded template). In such cases, the preprocessor section count still increments (since it's calculated pre-expansion), but the heading will fail, so the L2 count shouldn't be incremented. if name == nil then return end L2_list_len = L2_list_len + 1 L2_list[L2_list_len] = name L2_sections[heading.section] = name -- Also add any L1s, since they terminate the preceding L2, but add a maintenance category since it's probably a mistake. if level == 1 then new_cats["Laman dengan pengepala L1 tidak dikehendaki"] = true end -- Check the heading is in the right order. -- FIXME: we need a more sophisticated sorting method which handles non-diacritic special characters (e.g. Magɨ). if prev and not ( L2_wrong_order or string_compare(export.get_L2_sort_key(prev), export.get_L2_sort_key(name)) ) then new_cats["Laman dengan pengepala bahasa dalam susunan salah"] = true L2_wrong_order = true end -- Check it's a canonical language name. 
			-- Assumption: L2 headings on this wiki may carry a "Bahasa " prefix (e.g. "Bahasa Melayu") while
			-- canonical names may be stored without it, so the heading is accepted if either form is known.
			local names = langnames or get_langnames()
			if not (names[name] or names[(gsub(name, "^Bahasa ", ""))]) then
				new_cats["Laman dengan pengepala bahasa tidak piawai"] = true
			end
			prev = name
		end

		local function handle_template(template)
			-- Turn off redirect checking except in the Reconstruction namespace because the rc flag is only
			-- used in the Reconstruction namespace and the other names are parser functions, which AFAIK can't
			-- be redirected to.
			local name = template:get_name(nil, not namespace_is_reconstruction and "no_redirect" or nil)
			if name == "DEFAULTSORT:" then
				new_cats["Laman dengan percanggahan DEFAULTSORT"] = true
			elseif name == "DISPLAYTITLE:" then
				new_cats["Laman dengan percanggahan DISPLAYTITLE"] = true
			elseif name == "reconstructed" then
				rc = true
			end
		end

		if content then
			for node in parse(content):iterate_nodes() do
				local node_class = class_else_type(node)
				if node_class == "heading" then
					handle_heading(node)
				elseif node_class == "template" then
					handle_template(node)
				elseif node_class == "parameter" then
					new_cats["Laman dengan parameter templat bertanda kurung dakap ganda tiga"] = true
				end
			end
		end

		L2_list.n = L2_list_len
		data.L2_list = L2_list
		data.L2_sections = L2_sections

		insert(cats, get_category("Laman dengan entri"))
		insert(cats, get_category(format("Laman dengan %s entri", L2_list_len)))

		for cat in pairs(new_cats) do
			insert(cats, get_category(cat))
		end

		if namespace_is_reconstruction and not rc then
			local langname = match(full_raw_pagename, "^Rekonstruksi:([^/]+)/.")
			if langname then
				insert(cats, get_category("Entri bahasa " .. langname .. " kehilangan Templat:reconstructed"))
			end
		end
	end

	------ 4. Parse page for maintenance categories. ------
	-- Use of tab characters.
	if content and find(content, "\t", 1, true) then
		insert(cats, get_category("Laman dengan aksara tab"))
	end

	-- Unencoded character(s) in title.
local IDS = list_to_set{"⿰", "⿱", "⿲", "⿳", "⿴", "⿵", "⿶", "⿷", "⿸", "⿹", "⿺", "⿻", "⿼", "⿽", "⿾", "⿿", "㇯"} for char in pairs(explode_pagename) do if IDS[char] and char ~= data.pagename then insert(cats, "Perkataan mengandungi aksara tidak dikodkan") break end end -- Raw wikitext use of a topic or langname category. Also check if any raw sortkeys have been used. do local wikitext_topic_cat = {} local wikitext_langname_cat = {} local raw_sortkey -- If a raw sortkey has been found, add it to the relevant table. -- If there's no table (or the index is just `true`), create one first. local function add_cat_table(t, lang, sortkey) local t_lang = t[lang] if not sortkey then if not t_lang then t[lang] = true end return elseif t_lang == true or not t_lang then t_lang = {} t[lang] = t_lang end t_lang[uupper(decode_entities(sortkey))] = true end local function process_category(content, cat, colon, nxt) local pipe = find(cat, "|", colon + 1, true) -- Categories cannot end "|]]". if pipe == #cat then return end local title = new_title(pipe and sub(cat, 1, pipe - 1) or cat) if not (title and title.namespace == 14) then return end -- Get the sortkey (if any), then canonicalize category title. local sortkey = pipe and sub(cat, pipe + 1) or nil cat = title.text if sortkey then raw_sortkey = true -- If the sortkey contains "[", the first "]" of a final "]]]" is treated as part of the sortkey. if find(sortkey, "[", 1, true) and sub(content, nxt, nxt) == "]" then sortkey = sortkey .. "]" end end local code = match(cat, "^([%w%-.]+):") if code then add_cat_table(wikitext_topic_cat, code, sortkey) return end -- Split by word. cat = split(cat, " ", true, true) -- Formerly we looked for the language name anywhere in the category. 
This is simply wrong -- because there are no categories like 'Alsatian French lemmas' (only L2 languages -- have langname categories), but doing it this way wrongly catches things like [[Category:Shapsug Adyghe]] -- in [[Category:Adyghe entries with language name categories using raw markup]]. local n = #cat - 1 if n <= 0 then return end -- Go from longest to shortest and stop once we've found a language name. Going from shortest -- to longest or not stopping after a match risks falsely matching (e.g.) German Low German -- categories as German. repeat local name = concat(cat, " ", 1, n) if "Bahasa " and (langnames or get_langnames())[name] then add_cat_table(wikitext_langname_cat, name, sortkey) return end n = n - 1 until n == 0 end if content then -- Remove comments, then iterate over category links. content = remove_comments(content, "BOTH") local head = find(content, "[[", 1, true) while head do local close = find(content, "]]", head + 2, true) if not close then break end -- Make sure there are no intervening "[[" between head and close. local open = find(content, "[[", head + 2, true) while open and open < close do head = open open = find(content, "[[", head + 2, true) end local cat = sub(content, head + 2, close - 1) -- Locate the colon, and weed out most unwanted links. "[ _\128-\244]*" catches valid whitespace, and ensures any category links using the colon trick are ignored. We match all non-ASCII characters, as there could be multibyte spaces, and mw.title.new will filter out any remaining false-positives; this is a lot faster than running mw.title.new on every link. 
local colon = match(cat, "^[ _\128-\244]*[Kk][Aa][Tt][EeGgOoRrIi _\128-\244]*():") if colon then process_category(content, cat, colon, close + 2) end head = open end end data.wikitext_topic_cat = wikitext_topic_cat data.wikitext_langname_cat = wikitext_langname_cat if raw_sortkey then insert(cats, get_category("Laman dengan kunci isih mentah")) end end return data end return export enca5wnu1j5d8aw625qrp26c74f6yrf Modul:category tree/lang/jpx 828 55921 281367 256038 2026-04-22T07:02:54Z PeaceSeekers 3334 281367 Scribunto text/plain local labels = {} local handlers = {} local m_str_utils = require("Module:string utilities") local concat = table.concat local full_link = require("Module:links").full_link local insert = table.insert local Hani_sort = require("Module:Hani-sortkey").makeSortKey local match = m_str_utils.match local sort = table.sort local tag_text = require("Module:script_utilities").tag_text local ucfirst = m_str_utils.ucfirst local Hira = require("Module:scripts").getByCode("Hira") local Jpan = require("Module:scripts").getByCode("Jpan") local kana_to_romaji = require("Module:Hrkt-translit").tr local m_numeric = require("Module:ConvertNumeric") local kana_capture = "([-" .. require("Module:ja/data/range").kana .. "・]+)" local yomi_data = require("Module:kanjitab/data") labels["adnominals"] = { description = "{{{langname}}} adnominals, or {{ja-r|連%体%詞|れん%たい%し}}, which modify nouns, and do not conjugate or [[predicate#Verb|predicate]].", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Hiragana"] = { description = "{{{langname}}} terms with hiragana {{mdash}} {{ja-r|平%仮%名|ひら%が%な}} {{mdash}} forms, sorted by conventional hiragana sequence. The hiragana form is a [[phonetic]] representation of that word. " .. "Wiktionary represents {{{langname}}}-language segments in three ways: in normal form (with [[kanji]], if appropriate), in [[hiragana]] " .. 
"form (this differs from kanji form only when the segment contains kanji), and in [[romaji]] form.", additional = "''Lihat juga'' [[:Kategori:Katakana bahasa {{{langname}}}]]", toc_template = "categoryTOC-hiragana", parents = { {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Hiragana", } } labels["historical hiragana"] = { description = "{{{langname}}} historical [[hiragana]].", additional = "''See also'' [[:Category:{{{langname}}} historical katakana]].", toc_template = "categoryTOC-hiragana", parents = { "Hiragana", {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Hiragana", } } labels["Katakana"] = { description = "{{{langname}}} terms with katakana {{mdash}} {{ja-r|片%仮%名|かた%か%な}} {{mdash}} forms, sorted by conventional katakana sequence. Katakana is used primarily for transliterations of foreign words, including old Chinese hanzi not used in [[shinjitai]].", additional = "''Lihat juga'' [[:Kategori:Hiragana bahasa {{{langname}}}]]", toc_template = "categoryTOC-katakana", parents = { {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Katakana", } } labels["historical katakana"] = { description = "{{{langname}}} historical [[katakana]].", additional = "''See also'' [[:Category:{{{langname}}} historical hiragana]].", toc_template = "categoryTOC-katakana", parents = { "Katakana", {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Katakana", } } labels["Perkataan dieja dengan kana campuran"] = { description = "{{{langname}}} terms which combine [[hiragana]] and [[katakana]] characters, potentially with [[kanji]] too.", parents = { {name = "{{{langcat}}}", raw = true}, "Hiragana", "Katakana", }, } labels["Kanji"] = { topright = "{{wp|Kanji}}", description = "Simbol bahasa {{{langname}}} yang merupakan sebahagian daripada tulisan logogram Han, yang boleh mewakili bunyi atau menyampaikan makna secara langsung.", toc_template = "Hani-categoryTOC", umbrella = "Aksara Han", parents = "Logogram", } labels["Kanji mengikut 
bacaan"] = { description = "Kanji bahasa {{{langname}}} yang dikategorikan mengikut bacaan.", parents = {{name = "Kanji", sort = "bacaan"}}, } labels["Makurakotoba"] = { topright = "{{wp|Makurakotoba}}", description = "{{{langname}}} idioms used in poetry to introduce specific words.", parents = {"peribahasa"}, } labels["Perkataan mengikut bacaan kanji"] = { description = "Kategori bahasa {{{langname}}} yang dikumpulkan berdasarkan bacaan kanji yang dieja dengannya.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan mengikut pola bacaan"] = { description = "Kategori bahasa {{{langname}}} dengan perkataan yang dikumpulkan berdasarkan corak bacaannya.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan mengikut bilangan aksara kanji"] = { description = "Perkataan bahasa {{{langname}}} dikategorikan mengikut bilangan aksara kanji.", parents = {"Perkataan mengikut sifat ortografi"}, } local function handle_onyomi_list(category, category_type, cat_yomi_type) local onyomi, seen = {}, {} for _, yomi in pairs(yomi_data) do if not seen[yomi] and yomi.onyomi then local yomi_catname = yomi[category_type] if yomi_catname ~= false then local yomi_type = yomi.type if yomi_type ~= "on'yomi" and yomi_type ~= cat_yomi_type then insert(onyomi, "[[:Kategori:" .. category:gsub("{{{yomi_catname}}}", yomi_catname) .. " bahasa {{{langname}}}]]") end end end seen[yomi] = true end sort(onyomi) return onyomi end local function add_yomi_category(category, category_type, parent, description) for _, yomi in pairs(yomi_data) do local yomi_catname = yomi[category_type] if yomi_catname ~= false then local yomi_type = yomi.type local yomi_desc = yomi.link or yomi_catname if yomi.description then yomi_desc = yomi_desc .. "; " .. yomi.description end local label = { description = description .. " " .. yomi_desc .. 
".", breadcrumb = yomi_type, parents = {{name = parent, sort = yomi_catname}}, } if yomi.onyomi then local onyomi = handle_onyomi_list(category, category_type, yomi_type) label.additional = "Kategori untuk perkataan dengan " .. (yomi_type == "on'yomi" and "pelbagai lagi" or "lain-lain") .. " jenis spesifik bacaan on'yomi boleh ditemukan pada kategori berikut:\n* " .. concat(onyomi, "\n* ") if yomi_type ~= "on'yomi" then insert(label.parents, 1, { name = (category:gsub("{{{yomi_catname}}}", yomi_data.on[category_type])), sort = yomi_catname }) end end labels[category:gsub("{{{yomi_catname}}}", yomi_catname)] = label end end end add_yomi_category( "Perkataan dengan bacaan {{{yomi_catname}}}", "reading_category", "Perkataan mengikut pola bacaan", "Perkataan bahasa {{{langname}}} dengan bacaan" ) add_yomi_category( "Perkataan dieja dengan kanji dengan bacaan {{{yomi_catname}}}", "kanji_category", "Perkataan mengikut jenis bacaan kanji", "Kategori bahasa {{{langname}}} dengan perkataan yang dieja dengan satu atau lebih banyak aksara kanji dengan bacaan" ) labels["Perkataan kehilangan yomi"] = { description = "Perkataan bahasa {{{langname}}} yang kehilangan satu atau lebih [[Lampiran:Glosari bahasa Jepun#yomi|yomi]] dalam {{tl|{{{langcode}}}-kanjitab}}.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["terms with IPA pronunciation with pitch accent"] = { description = "{{{langname}}} terms with pronunciations that have {{w|Japanese pitch accent|pitch accent}} specified.", additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.", can_be_empty = true, parents = {"Penyelenggaraan entri", "pitch accent"}, } labels["terms with IPA pronunciation missing pitch accent"] = { description = "{{{langname}}} terms with pronunciations that do not have a {{w|Japanese pitch accent|pitch accent}} specified.", additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the 
{{code|=acc=}} parameter.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["pitch accent"] = { description = "{{{langname}}} terms regarding {{w|Japanese pitch accent|pitch accent}} pronunciation.", can_be_empty = true, parents = {{name = "{{{langcat}}}", raw = true}}, } labels["terms with Heiban pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[平板型|Heiban]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Atamadaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[頭高型|Atamadaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Nakadaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Odaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["pitch accent deaccenting before の"] = { description = "{{{langname}}} terms with {{w|Japanese pitch accent|pitch accent}} pronunciations that have exceptional deaccenting or lack thereof before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Odaka pitch accent not deaccented before の (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}} and do not become deaccented before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent deaccenting before の"} } labels["terms with Nakadaka pitch accent deaccented before の (Tōkyō)"] = { description = "{{{langname}}} 
terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}} and become deaccented before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent deaccenting before の"} } labels["Perkataan mengikut jenis bacaan kanji"] = { description = "{{{langname}}} categories with terms grouped with regard to the types of readings of the kanji with which " .. "they are spelled; broadly, those of Chinese origin, {{ja-r|音|おん}} readings, and those of non-Chinese origin, {{ja-r|訓|くん}} readings.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan dieja dengan ateji"] = { topright = "{{wp|Ateji}}", description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#ateji|ateji]] {{mdash}} {{ja-r|当て字|あてじ}} {{mdash}} which are [[kanji]] used to represent sounds rather than meanings (though meaning may have some influence on which kanji are chosen).", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan dieja dengan daiyōji"] = { description = "Japanese terms spelled using [[Appendix:Japanese glossary#daiyouji|daiyōji]], categorized using {{temp|ja-daiyouji}}.", parents = {"Perkataan mengikut etimologi"}, } labels["Perkataan dieja dengan jukujikun"] = { description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#jukujikun|jukujikun]] {{mdash}} {{ja-r|熟%字%訓|じゅく%じ%くん}} {{mdash}} which are [[kanji]] used to represent meanings rather than sounds.", parents = {{name = "{{{langcat}}}", raw = true}}, } local function add_grade_categories(grade, desc, wp, only_one, parent, sort) local grade_kanji = "Kanji " .. grade local topright = wp and ("{{wp|%s}}"):format(ucfirst(grade_kanji)) or nil labels[grade_kanji] = { topright = topright, description = "Kanji bahasa {{{langname}}} " .. desc, toc_template = "Hani-categoryTOC", parents = {{ name = parent and ("Kanji " .. parent) or "Kanji", sort = sort or grade }}, } labels["Perkataan dieja dengan " .. 
		grade_kanji:lower()] = {
		topright = topright,
		description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. (only_one and "sekurang-kurangnya satu " or "") .. "aksara kanji " .. desc,
		parents = {{
			name = parent and ("Perkataan dieja dengan kanji " .. parent) or "Perkataan mengikut sifat ortografi",
			sort = sort or grade
		}},
	}
end

for i = 1, 6 do
	local ord = m_numeric.ones_position_ord[i]
	add_grade_categories(
		"gred " .. ord,
		"diajar dalam gred " .. ord .. " sekolah rendah, seperti yang ditetapkan oleh senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
		false,
		false,
		"kyōiku",
		i
	)
end

add_grade_categories(
	"kyōiku",
	"pada senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.",
	true,
	false,
	"jōyō"
)

add_grade_categories(
	"sekolah menengah",
	"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}} yang secara umumnya diajar pada peringkat sekolah menengah.",
	false,
	false,
	"jōyō"
)

add_grade_categories(
	"jōyō",
	"pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
	true,
	false
)

-- The tōyō list is 当用漢字 (とうようかんじ), distinct from the jōyō list (常用漢字) that replaced it.
add_grade_categories(
	"tōyō",
	"pada senarai rasmi {{ja-r|当%用 漢%字|とう%よう かん%じ|aksara penggunaan am}}, yang digunakan pada sekitar tahun 1946{{ndash}}1981 sehingga penerbitan senarai {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.",
	true,
	false
)

add_grade_categories(
	"jinmeiyō",
	"pada senarai rasmi {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}.",
	true,
	true
)

add_grade_categories(
	"hyōgai",
	"tidak termasuk pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara kegunaan kerap}} atau {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}, yang dikenali sebagai {{ja-r|表%外 漢%字|ひょう%がい かん%じ}} atau {{ja-r|表%外%字|ひょう%がい%じ|aksara tidak tersenarai}}.",
	true,
	true
)

labels["Perkataan dengan berbilang bacaan"] = {
	description = "Perkataan bahasa {{{langname}}} dengan berbilang cara sebutan (maka juga sama dengan berbilang ejaan [[kana]]).",
	parents = {{name = "{{{langcat}}}", raw = true}},
}

labels["Bacaan kanji mengikut bilangan mora"] = {
	description = "Kategori-kategori bahasa {{{langname}}} dikumpulkan berdasarkan bilangan mora dalam bacaan kanji.",
	parents = {{name = "{{{langcat}}}", raw =
true}}, } labels["Bacaan kanji mengikut bilangan mora"] = { description = "Kategori-kategori bahasa {{{langname}}} dikumpulkan berdasarkan bilangan mora dalam bacaan kanji.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan kanji tunggal"] = { description = "Perkataan {{{langname}}} yang ditulis dengan kanji tunggal.", parents = { "Perkataan mengikut sifat ortografi", {name = "Perkataan dengan 1 aksara kanji", sort = " "}, }, } labels["kanji with kun readings missing okurigana designation"] = { breadcrumb = "Kanji missing okurigana designation", description = "{{{langname}}} kanji entries in which one or more kun readings entered into {{tl|{{{langcode}}}-readings}} is missing a hyphen denoting okurigana.", toc_template = "Hani-categoryTOC", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["Perkataan mengikut aksara individu dalam ejaan sejarah"] = { breadcrumb = "Bersejarah", description = "{{{langname}}} terms categorized by whether their spellings in the {{w|historical kana orthography}} included certain individual characters.", parents = {{name = "Perkataan mengikut aksara individu", sort = " "}}, } labels["Kata kerja tanpa ketransitifan"] = { description = "{{{langname}}} verbs missing the {{code|=tr=}} parameter from their headword templates.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["Yojijukugo"] = { topright = "{{wp|Yojijukugo}}", description = "{{{langname}}} four-[[kanji]] compound terms, {{ja-r|四%字 熟%語|よ%じ じゅく%ご}}, with idiomatic meanings; typically derived from Classical Chinese, Buddhist scripture or traditional Japanese proverbs.", additional = "Compare Chinese {{w|chengyu}} and Korean {{w|sajaseong-eo}}.", umbrella = "four-character idioms", parents = {"peribahasa"}, } -- FIXME: Only works for 0 through 19. 
local word_to_number = {} for k, v in pairs(m_numeric.ones_position) do word_to_number[v] = k end local periods = { lama = true, kuno = true, } local function get_period_text_and_reading_type_link(period, reading_type) if period and not periods[period] then return nil end local period_text = period and " " .. period or nil -- Allow periods (historical or ancient) by themselves; they will parse as reading types. if not period and periods[reading_type] then return nil, reading_type end local reading_type_link = "[[Lampiran:Glosari bahasa Jepun#" .. reading_type .. "|" .. reading_type .. "]]" return period_text, reading_type_link end local function get_sc(str) return match(str:gsub("[%s%p]+", ""), "[^" .. Hira:getCharacters() .. "]") and Jpan or Hira end local function get_tagged_reading(reading, lang) return tag_text(reading, lang, get_sc(reading)) end local function get_reading_link(reading, lang, period, link) local hist = periods[period] reading = reading:gsub("[%.%-%s]+", "") return full_link({ lang = lang, sc = get_sc(reading), term = link or reading:gsub("・", ""), -- If we have okurigana, demarcate furigana. 
		alt = reading:gsub("^(.-)・", "<span style=\"border-top:1px solid;position:relative;padding:1px;\">%1<span style=\"position:absolute;top:0;bottom:67%%;right:0%%;border-right:1px solid;\"></span></span>"),
		tr = kana_to_romaji((reading:gsub("・", ".")), lang:getCode(), nil, {keep_dot = true, hist = hist})
			:gsub("^(.-)%.", "<u>%1</u>"),
		pos = reading:find("・", 1, true) and get_tagged_reading((reading:gsub("^.-・", "~")), lang) or nil
	}, "term")
end

local function is_on_subtype(reading_type)
	return reading_type:find(".on$")
end

insert(handlers, function(data)
	local n = data.label:match("^Perkataan dengan ([1-9]%d*) aksara kanji$")
	if not n then
		return
	end
	local sortkey = require("Module:category tree").numeral_sortkey(n, 2097152)
	return {
		breadcrumb = n,
		description = ("Perkataan bahasa {{{langname}}} yang mengandungi tepat %d aksara kanji."):format(n),
		-- TODO: implement this using the same mechanism used to implement parents (i.e. avoiding the need for raw categories).
		-- umbrella = {
		--	breadcrumb = ("%d kanji"):format(n),
		--	parents = {{name = "terms by number of kanji subcategories by language", sort = sortkey}},
		-- },
		parents = {{name = ("Perkataan mengikut bilangan aksara kanji"), sort = sortkey}}
	}
end)

insert(handlers, function(data)
	local label_pref, kana = data.label:match("^(Perkataan yang mengikut sejarah dieja dengan )" .. kana_capture .. "$")
	if not kana then
		return
	end
	local lang = data.lang
	return {
		-- The period passed here must be a key of `periods` ("lama" or "kuno") for historical romanization to apply.
		description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. get_reading_link(kana, lang, "lama") .. " dalam {{w|ortografi kana sejarawi}}.",
		displaytitle = label_pref .. get_tagged_reading(kana, lang) .. " bahasa {{{langname}}}",
		breadcrumb = "sejarah",
		parents = {
			{name = "Perkataan dieja dengan " ..
kana, sort = " "}, {name = "Perkataan mengikut aksara individu dalam ejaan sejarah", sort = lang:makeSortKey(kana)} }, } end) insert(handlers, function(data) local count, plural = data.label:match("^Bacaan kanji dengan (.+) mora$") local num = word_to_number[count] if not num then return nil end return { description = "Bacaan kanji bahasa {{{langname}}} yang mengandungi " .. count .. " mora.", breadcrumb = num, parents = {{name = "Bacaan kanji mengikut bilangan mora", sort = num}}, } end) insert(handlers, function(data) local label_pref, period, reading_type, reading = match(data.label, "^(Kanji dengan bacaan ([a-z]-) ?([%a']+) )" .. kana_capture .. "$") if not period then return end period = period ~= "" and period or nil local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type) if not reading_type_link then return end local lang = data.lang -- Compute parents. local parents, breadcrumb = {} if reading:find("・", 1, true) then local okurigana = reading:match("・(.*)") insert(parents, { name = "Kanji dengan bacaan" .. (period_text or "") .. " ".. reading_type .. " " .. reading:match("(.-)・"), -- Sort by okurigana, since all coordinate categories will have the same furigana. sort = (lang:makeSortKey(okurigana)) }) breadcrumb = "~" .. okurigana else insert(parents, { name = "Kanji mengikut bacaan" .. (period_text or "") .. " " .. reading_type, sort = (lang:makeSortKey(reading)) }) breadcrumb = reading end if is_on_subtype(reading_type) then insert(parents, {name = "Kanji dengan bacaan" .. (period_text or "") .. " on " .. reading, sort = reading_type}) elseif period_text then insert(parents, {name = "Kanji dengan bacaan" .. period_text .. " " .. reading, sort = reading_type}) end if not period_text then insert(parents, {name = "Kanji dibaca sebagai " .. reading, sort = reading_type}) end return { description = "Aksara [[kanji]] bahasa {{{langname}}} dengan bacaan " .. reading_type_link .. " " .. 
			get_reading_link(reading, lang, period or reading_type) .. ".",
		displaytitle = "{{{langname}}} " .. label_pref .. get_tagged_reading(reading, lang),
		breadcrumb = get_tagged_reading(breadcrumb, lang),
		parents = parents,
	}
end)

insert(handlers, function(data)
	local period, reading_type = match(data.label, "^Kanji mengikut bacaan ([a-z]-) ?([%a']+)$")
	if not period then
		return
	end
	period = period ~= "" and period or nil
	local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
	if not reading_type_link then
		return nil
	end

	-- Compute parents.
	local parents = {
		is_on_subtype(reading_type) and {name = "Kanji mengikut bacaan" .. (period_text or "") .. " on", sort = reading_type} or
		period_text and {name = "Kanji mengikut bacaan " .. reading_type, sort = period} or
		{name = "Kanji mengikut bacaan", sort = reading_type}
	}
	if period_text then
		insert(parents, {name = "Kanji mengikut bacaan" .. period_text, sort = reading_type})
	end

	-- Compute description. `period_text` already carries a leading space, so only the reading type needs one.
	local description = "[[kanji|Kanji]] bahasa {{{langname}}} dikategorikan mengikut bacaan" .. (period_text or "") .. " " .. reading_type_link .. "."

	return {
		description = description,
		breadcrumb = reading_type .. (period_text or ""),
		parents = parents,
	}
end)

insert(handlers, function(data)
	local label_pref, reading = match(data.label, "^(Kanji dibaca sebagai )" .. kana_capture .. "$")
	if not reading then
		return
	end
	local args = require("Module:parameters").process(data.args, {
		["histconsol"] = true,
	})
	local lang = data.lang
	local parents, breadcrumb = {}
	if reading:find("・", 1, true) then
		local okurigana = reading:match("・(.*)")
		insert(parents, {
			name = "Kanji dibaca sebagai " .. reading:match("(.-)・"),
			-- Sort by okurigana, since all coordinate categories will have the same furigana.
			sort = (lang:makeSortKey(okurigana))
		})
		breadcrumb = "~" ..
okurigana else insert(parents, { name = "Kanji mengikut bacaan", sort = (lang:makeSortKey(reading)) }) breadcrumb = reading end local addl local period_text if args.histconsol then period_text = "lama" addl = ("This is a [[Wikipedia:Historical kana orthography|historical]] [[Wikipedia:Kanazukai|reading]], now " .. "consolidated with the [[Wikipedia:Modern kana usage|modern reading]] of " .. get_reading_link(args.histconsol, lang, nil, ("Kategori:Kanji dibaca sebagai %s bahasa Jepun"):format(args.histconsol)) .. ".") end return { description = "[[kanji|Kanji]] bahasa {{{langname}}} dibaca sebagai " .. get_reading_link(reading, lang, period_text) .. ".", additional = addl, displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}" , breadcrumb = get_tagged_reading(breadcrumb, lang), parents = parents, }, true end) insert(handlers, function(data) local label_pref, reading = match(data.label, "^(Perkataan dieja dengan kanji dibaca sebagai )" .. kana_capture .. "$") if not reading then return end -- Compute parents. local lang = data.lang local sort_key = (lang:makeSortKey(reading)) local mora_count = require("Module:ja").count_morae(reading) local mora_count_words = m_numeric.spell_number(tostring(mora_count)) local parents = { {name = "Perkataan mengikut bacaan kanji", sort = sort_key}, {name = "Bacaan kanji dengan " .. mora_count_words .. " mora", sort = sort_key}, {name = "Kanji dibaca sebagai " .. reading, sort = " "}, } local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms that contain kanji that exhibit a reading of " .. get_reading_link(reading, lang) .. " in those terms prior to any sound changes.", displaytitle = "{{{langname}}} " .. label_pref .. tagged_reading, breadcrumb = tagged_reading, parents = parents, } end) insert(handlers, function(data) local kanji, reading = match(data.label, "^Perkataan dieja dengan (.) dibaca sebagai " .. kana_capture .. 
"$") if not kanji then return nil end local args = require("Module:parameters").process(data.args, { [1] = {list = true}, }) local lang = data.lang if #args[1] == 0 then error("Bagi kategori dalam bentuk \"" .. lang:getCanonicalName() .. " terms spelled with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.") end local yomi_types, parents = {}, {} for _, yomi, category in ipairs(args[1]) do local yomi_data = yomi_data[yomi] if not yomi_data then error("Jenis yomi \"" .. yomi .. "\" tidak sah.") end category = yomi_data.kanji_category if not category then error("Jenis yomi \"" .. yomi .. "\" tidak sah bagi jenis kategori ini.") end insert(yomi_types, yomi_data.link) insert(parents, { name = "Perkataan dieja dengan kanji dengan bacaan " .. category, sort = (lang:makeSortKey(reading)) }) end insert(parents, 1, {name = "Perkataan dieja dengan " .. kanji, sort = (lang:makeSortKey(reading))}) insert(parents, 2, {name = "Perkataan dieja dengan kanji dibaca sebagai " .. reading, sort = Hani_sort(kanji)}) yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " .. require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) .. " reading" .. (#yomi_types > 1 and "s" or "") local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms spelled with {{l|{{{langcode}}}|" .. kanji .. "}} with " .. yomi_types .. " of " .. get_reading_link(reading, lang) .. ".", displaytitle = "{{{langname}}} terms spelled with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading, breadcrumb = "dibaca sebagai " .. tagged_reading, parents = parents, }, true end) insert(handlers, function(data) local affix, kanji, reading = data.label:match("^Perkataan dengan ([a-z]) (.+) dibaca sebagai " .. kana_capture .. 
"$") if not affix or not kanji or not reading then return nil end local args = require("Module:parameters").process(data.args, { [1] = {list = true}, }) local lang = data.lang if #args[1] == 0 then error("For categories of the form \"" .. lang:getCanonicalName() .. " terms AFFIXed with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.") end local yomi_types = {} for _, yomi, category in ipairs(args[1]) do local yomi_data = yomi_data[yomi] if not yomi_data then error("The yomi type \"" .. yomi .. "\" is not recognized.") end category = yomi_data.kanji_category if not category then error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.") end insert(yomi_types, yomi_data.link) end yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " .. require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) .. " reading" .. (#yomi_types > 1 and "s" or "") local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms " .. affix .. "ed with {{l|{{{langcode}}}|" .. kanji .. "}} with " .. yomi_types .. " of " .. get_reading_link(reading, lang) .. ".", displaytitle = "{{{langname}}} terms " .. affix .. "ed with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading, breadcrumb = "dibaca sebagai " .. reading, parents = { {name = "terms " .. affix .. "ed with " .. kanji, sort = (lang:makeSortKey(reading))}, --{name = "Perkataan dieja dengan " .. kanji .. " dibaca sebagai " .. reading, sort = (lang:makeSortKey(reading)), args=data.args} }, }, true end) insert(handlers, function(data) local kanji, daiyoji = match(data.label, "^Perkataan dengan (.) 
digantikan oleh daiyōji (.)$") if not kanji then return nil end local args = require("Module:parameters").process(data.args, { ["sort"] = true, }) local lang = data.lang if not args.sort then error("For categories of the form \"" .. lang:getCanonicalName() .. " terms with KANJI replaced by daiyōji DAIYOJI\", the sort key must be specified using sort=") end local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_daiyoji = get_tagged_reading(daiyoji, lang) return { description = "{{{langname}}} terms with {{l|{{{langcode}}}|" .. kanji .. "}} replaced by [[Appendix:Japanese glossary#daiyouji|daiyōji]] {{l|{{{langcode}}}|" .. daiyoji .. "}}.", displaytitle = "{{{langname}}} terms with " .. tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji, breadcrumb = tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji, parents = {{name = "Perkataan dieja dengan daiyōji", sort = args.sort}}, }, true end) return {LABELS = labels, HANDLERS = handlers} kdm401tcuk1j3f4p1ncitj47yq9j7w0 281369 281367 2026-04-22T07:03:20Z PeaceSeekers 3334 281369 Scribunto text/plain local labels = {} local handlers = {} local m_str_utils = require("Module:string utilities") local concat = table.concat local full_link = require("Module:links").full_link local insert = table.insert local Hani_sort = require("Module:Hani-sortkey").makeSortKey local match = m_str_utils.match local sort = table.sort local tag_text = require("Module:script_utilities").tag_text local ucfirst = m_str_utils.ucfirst local Hira = require("Module:scripts").getByCode("Hira") local Jpan = require("Module:scripts").getByCode("Jpan") local kana_to_romaji = require("Module:Hrkt-translit").tr local m_numeric = require("Module:ConvertNumeric") local kana_capture = "([-" .. require("Module:ja/data/range").kana .. 
"・]+)" local yomi_data = require("Module:kanjitab/data") labels["adnominals"] = { description = "{{{langname}}} adnominals, or {{ja-r|連%体%詞|れん%たい%し}}, which modify nouns, and do not conjugate or [[predicate#Verb|predicate]].", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Hiragana"] = { description = "{{{langname}}} terms with hiragana {{mdash}} {{ja-r|平%仮%名|ひら%が%な}} {{mdash}} forms, sorted by conventional hiragana sequence. The hiragana form is a [[phonetic]] representation of that word. " .. "Wiktionary represents {{{langname}}}-language segments in three ways: in normal form (with [[kanji]], if appropriate), in [[hiragana]] " .. "form (this differs from kanji form only when the segment contains kanji), and in [[romaji]] form.", additional = "''Lihat juga'' [[:Kategori:Katakana bahasa {{{langname}}}]]", toc_template = "categoryTOC-hiragana", parents = { {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Hiragana", } } labels["historical hiragana"] = { description = "{{{langname}}} historical [[hiragana]].", additional = "''See also'' [[:Category:{{{langname}}} historical katakana]].", toc_template = "categoryTOC-hiragana", parents = { "Hiragana", {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Hiragana", } } labels["Katakana"] = { description = "{{{langname}}} terms with katakana {{mdash}} {{ja-r|片%仮%名|かた%か%な}} {{mdash}} forms, sorted by conventional katakana sequence. 
Katakana is used primarily for transliterations of foreign words, including old Chinese hanzi not used in [[shinjitai]].", additional = "''Lihat juga'' [[:Kategori:Hiragana bahasa {{{langname}}}]]", toc_template = "categoryTOC-katakana", parents = { {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Katakana", } } labels["historical katakana"] = { description = "{{{langname}}} historical [[katakana]].", additional = "''See also'' [[:Category:{{{langname}}} historical hiragana]].", toc_template = "categoryTOC-katakana", parents = { "Katakana", {name = "{{{langcat}}}", raw = true}, "Kategori:Aksara Tulisan Katakana", } } labels["Perkataan dieja dengan kana campuran"] = { description = "{{{langname}}} terms which combine [[hiragana]] and [[katakana]] characters, potentially with [[kanji]] too.", parents = { {name = "{{{langcat}}}", raw = true}, "Hiragana", "Katakana", }, } labels["Kanji"] = { topright = "{{wp|Kanji}}", description = "Simbol bahasa {{{langname}}} yang merupakan sebahagian daripada tulisan logogram Han, yang boleh mewakili bunyi atau menyampaikan makna secara langsung.", toc_template = "Hani-categoryTOC", umbrella = "Aksara Han", parents = "Logogram", } labels["Kanji mengikut bacaan"] = { description = "Kanji bahasa {{{langname}}} yang dikategorikan mengikut bacaan.", parents = {{name = "Kanji", sort = "bacaan"}}, } labels["Makurakotoba"] = { topright = "{{wp|Makurakotoba}}", description = "{{{langname}}} idioms used in poetry to introduce specific words.", parents = {"Peribahasa"}, } labels["Perkataan mengikut bacaan kanji"] = { description = "Kategori bahasa {{{langname}}} yang dikumpulkan berdasarkan bacaan kanji yang dieja dengannya.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan mengikut pola bacaan"] = { description = "Kategori bahasa {{{langname}}} dengan perkataan yang dikumpulkan berdasarkan corak bacaannya.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan mengikut bilangan aksara 
kanji"] = { description = "Perkataan bahasa {{{langname}}} dikategorikan mengikut bilangan aksara kanji.", parents = {"Perkataan mengikut sifat ortografi"}, } local function handle_onyomi_list(category, category_type, cat_yomi_type) local onyomi, seen = {}, {} for _, yomi in pairs(yomi_data) do if not seen[yomi] and yomi.onyomi then local yomi_catname = yomi[category_type] if yomi_catname ~= false then local yomi_type = yomi.type if yomi_type ~= "on'yomi" and yomi_type ~= cat_yomi_type then insert(onyomi, "[[:Kategori:" .. category:gsub("{{{yomi_catname}}}", yomi_catname) .. " bahasa {{{langname}}}]]") end end end seen[yomi] = true end sort(onyomi) return onyomi end local function add_yomi_category(category, category_type, parent, description) for _, yomi in pairs(yomi_data) do local yomi_catname = yomi[category_type] if yomi_catname ~= false then local yomi_type = yomi.type local yomi_desc = yomi.link or yomi_catname if yomi.description then yomi_desc = yomi_desc .. "; " .. yomi.description end local label = { description = description .. " " .. yomi_desc .. ".", breadcrumb = yomi_type, parents = {{name = parent, sort = yomi_catname}}, } if yomi.onyomi then local onyomi = handle_onyomi_list(category, category_type, yomi_type) label.additional = "Kategori untuk perkataan dengan " .. (yomi_type == "on'yomi" and "pelbagai lagi" or "lain-lain") .. " jenis spesifik bacaan on'yomi boleh ditemukan pada kategori berikut:\n* " .. 
concat(onyomi, "\n* ") if yomi_type ~= "on'yomi" then insert(label.parents, 1, { name = (category:gsub("{{{yomi_catname}}}", yomi_data.on[category_type])), sort = yomi_catname }) end end labels[category:gsub("{{{yomi_catname}}}", yomi_catname)] = label end end end add_yomi_category( "Perkataan dengan bacaan {{{yomi_catname}}}", "reading_category", "Perkataan mengikut pola bacaan", "Perkataan bahasa {{{langname}}} dengan bacaan" ) add_yomi_category( "Perkataan dieja dengan kanji dengan bacaan {{{yomi_catname}}}", "kanji_category", "Perkataan mengikut jenis bacaan kanji", "Kategori bahasa {{{langname}}} dengan perkataan yang dieja dengan satu atau lebih banyak aksara kanji dengan bacaan" ) labels["Perkataan kehilangan yomi"] = { description = "Perkataan bahasa {{{langname}}} yang kehilangan satu atau lebih [[Lampiran:Glosari bahasa Jepun#yomi|yomi]] dalam {{tl|{{{langcode}}}-kanjitab}}.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["terms with IPA pronunciation with pitch accent"] = { description = "{{{langname}}} terms with pronunciations that have {{w|Japanese pitch accent|pitch accent}} specified.", additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.", can_be_empty = true, parents = {"Penyelenggaraan entri", "pitch accent"}, } labels["terms with IPA pronunciation missing pitch accent"] = { description = "{{{langname}}} terms with pronunciations that do not have a {{w|Japanese pitch accent|pitch accent}} specified.", additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["pitch accent"] = { description = "{{{langname}}} terms regarding {{w|Japanese pitch accent|pitch accent}} pronunciation.", can_be_empty = true, parents = {{name = "{{{langcat}}}", raw = true}}, } labels["terms with Heiban pitch accent (Tōkyō)"] = { description = 
"{{{langname}}} terms with pronunciations that are (Tōkyō) [[平板型|Heiban]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Atamadaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[頭高型|Atamadaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Nakadaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Odaka pitch accent (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}}.", can_be_empty = true, parents = {"pitch accent"} } labels["pitch accent deaccenting before の"] = { description = "{{{langname}}} terms with {{w|Japanese pitch accent|pitch accent}} pronunciations that have exceptional deaccenting or lack thereof before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent"} } labels["terms with Odaka pitch accent not deaccented before の (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}} and do not become deaccented before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent deaccenting before の"} } labels["terms with Nakadaka pitch accent deaccented before の (Tōkyō)"] = { description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}} and become deaccented before の ({{ja-deaccenting-before-no}}).", can_be_empty = true, parents = {"pitch accent deaccenting before の"} } labels["Perkataan mengikut jenis bacaan kanji"] = { description = "{{{langname}}} categories with terms grouped with regard to the types of 
readings of the kanji with which " .. "they are spelled; broadly, those of Chinese origin, {{ja-r|音|おん}} readings, and those of non-Chinese origin, {{ja-r|訓|くん}} readings.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan dieja dengan ateji"] = { topright = "{{wp|Ateji}}", description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#ateji|ateji]] {{mdash}} {{ja-r|当て字|あてじ}} {{mdash}} which are [[kanji]] used to represent sounds rather than meanings (though meaning may have some influence on which kanji are chosen).", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan dieja dengan daiyōji"] = { description = "Japanese terms spelled using [[Appendix:Japanese glossary#daiyouji|daiyōji]], categorized using {{temp|ja-daiyouji}}.", parents = {"Perkataan mengikut etimologi"}, } labels["Perkataan dieja dengan jukujikun"] = { description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#jukujikun|jukujikun]] {{mdash}} {{ja-r|熟%字%訓|じゅく%じ%くん}} {{mdash}} which are [[kanji]] used to represent meanings rather than sounds.", parents = {{name = "{{{langcat}}}", raw = true}}, } local function add_grade_categories(grade, desc, wp, only_one, parent, sort) local grade_kanji = "Kanji " .. grade local topright = wp and ("{{wp|%s}}"):format(ucfirst(grade_kanji)) or nil labels[grade_kanji] = { topright = topright, description = "Kanji bahasa {{{langname}}} " .. desc, toc_template = "Hani-categoryTOC", parents = {{ name = parent and ("Kanji " .. parent) or "Kanji", sort = sort or grade }}, } labels["Perkataan dieja dengan " .. grade_kanji:lower()] = { topright = topright, description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. (only_one and "sekurang-kurangnya satu " or "") .. " aksara kanji " .. desc, parents = {{ name = parent and ("Perkataan dieja dengan kanji " .. 
parent) or "Perkataan mengikut sifat ortografi", sort = sort or grade }}, } end for i = 1, 6 do local ord = m_numeric.ones_position_ord[i] add_grade_categories( "gred " .. ord, "diajar dalam gred " .. ord .. " sekolah rendah, seperti yang ditetapkan oleh senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.", false, false, "kyōiku", i ) end add_grade_categories( "kyōiku", "pada senarai rasmi {{ja-r|教%育 漢%字|きょう%いく かん%じ|sukatan pendidikan kanji}}.", true, false, "jōyō" ) add_grade_categories( "sekolah menengah", "pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}} yang secara umumnya diajar pada peringkat sekolah menengah.", false, false, "jōyō" ) add_grade_categories( "jōyō", "pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.", true, false ) add_grade_categories( "tōyō", "pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}, yang digunakan pada sekitar tahun 1946{{ndash}}1981 sehingga penerbitan senarai {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara penggunaan biasa}}.", true, false ) add_grade_categories( "jinmeiyō", "pada senarai rasmi {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}.", true, true ) add_grade_categories( "hyōgai", "tidak termasuk pada senarai rasmi {{ja-r|常%用 漢%字|じょう%よう かん%じ|aksara kegunaan kerap}} atau {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji untuk kegunaan nama peribadi}}, yang dikenali sebagai {{ja-r|表%外 漢%字|ひょう%がい かん%じ}} atau {{ja-r|表%外%字|ひょう%がい%じ|aksara tidak tersenarai}}.", true, true ) labels["Perkataan dengan berbilang bacaan"] = { description = "Perkataan bahasa {{{langname}}} dengan berbilang cara sebutan (maka juga sama dengan berbilang ejaan [[kana]]).", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Bacaan kanji mengikut bilangan mora"] = { description = "Kategori-kategori bahasa {{{langname}}} dikumpulkan berdasarkan bilangan mora dalam bacaan kanji.", parents = {{name = "{{{langcat}}}", raw = true}}, } labels["Perkataan kanji tunggal"] = { 
description = "Perkataan {{{langname}}} yang ditulis dengan kanji tunggal.", parents = { "Perkataan mengikut sifat ortografi", {name = "Perkataan dengan 1 aksara kanji", sort = " "}, }, } labels["kanji with kun readings missing okurigana designation"] = { breadcrumb = "Kanji missing okurigana designation", description = "{{{langname}}} kanji entries in which one or more kun readings entered into {{tl|{{{langcode}}}-readings}} is missing a hyphen denoting okurigana.", toc_template = "Hani-categoryTOC", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["Perkataan mengikut aksara individu dalam ejaan sejarah"] = { breadcrumb = "Bersejarah", description = "{{{langname}}} terms categorized by whether their spellings in the {{w|historical kana orthography}} included certain individual characters.", parents = {{name = "Perkataan mengikut aksara individu", sort = " "}}, } labels["Kata kerja tanpa ketransitifan"] = { description = "{{{langname}}} verbs missing the {{code|=tr=}} parameter from their headword templates.", hidden = true, can_be_empty = true, parents = {"Penyelenggaraan entri"}, } labels["Yojijukugo"] = { topright = "{{wp|Yojijukugo}}", description = "{{{langname}}} four-[[kanji]] compound terms, {{ja-r|四%字 熟%語|よ%じ じゅく%ご}}, with idiomatic meanings; typically derived from Classical Chinese, Buddhist scripture or traditional Japanese proverbs.", additional = "Compare Chinese {{w|chengyu}} and Korean {{w|sajaseong-eo}}.", umbrella = "four-character idioms", parents = {"Peribahasa"}, } -- FIXME: Only works for 0 through 19. local word_to_number = {} for k, v in pairs(m_numeric.ones_position) do word_to_number[v] = k end local periods = { lama = true, kuno = true, } local function get_period_text_and_reading_type_link(period, reading_type) if period and not periods[period] then return nil end local period_text = period and " " .. period or nil -- Allow periods (historical or ancient) by themselves; they will parse as reading types. 
if not period and periods[reading_type] then return nil, reading_type end local reading_type_link = "[[Lampiran:Glosari bahasa Jepun#" .. reading_type .. "|" .. reading_type .. "]]" return period_text, reading_type_link end local function get_sc(str) return match(str:gsub("[%s%p]+", ""), "[^" .. Hira:getCharacters() .. "]") and Jpan or Hira end local function get_tagged_reading(reading, lang) return tag_text(reading, lang, get_sc(reading)) end local function get_reading_link(reading, lang, period, link) local hist = periods[period] reading = reading:gsub("[%.%-%s]+", "") return full_link({ lang = lang, sc = get_sc(reading), term = link or reading:gsub("・", ""), -- If we have okurigana, demarcate furigana. alt = reading:gsub("^(.-)・", "<span style=\"border-top:1px solid;position:relative;padding:1px;\">%1<span style=\"position:absolute;top:0;bottom:67%%;right:0%%;border-right:1px solid;\"></span></span>"), tr = kana_to_romaji((reading:gsub("・", ".")), lang:getCode(), nil, {keep_dot = true, hist = hist}) :gsub("^(.-)%.", "<u>%1</u>"), pos = reading:find("・", 1, true) and get_tagged_reading((reading:gsub("^.-・", "~")), lang) or nil }, "term") end local function is_on_subtype(reading_type) return reading_type:find(".on$") end insert(handlers, function(data) local n =data.label:match("^Perkataan dengan ([1-9]%d*) aksara kanji$") if not n then return end local sortkey = require("Module:category tree").numeral_sortkey(n, 2097152) return { breadcrumb = n, description = ("Perkataan bahasa {{{langname}}} yang mengandungi tepat %d aksara kanji."):format(n), -- TODO: implement this using the same mechanism used to implement parents (i.e. avoiding the need for raw categories). 
-- umbrella = { -- breadcrumb = ("%d kanji"):format(n), -- parents = {{name = "terms by number of kanji subcategories by language", sort = sortkey}}, -- }, parents = {{name = ("Perkataan mengikut bilangan aksara kanji"), sort = sortkey}} } end) insert(handlers, function(data) local label_pref, kana = data.label:match("^(Perkataan yang mengikut sejarah dieja dengan )" .. kana_capture .. "$") if not kana then return end local lang = data.lang return { description = "Perkataan bahasa {{{langname}}} yang dieja dengan " .. get_reading_link(kana, lang, "lama") .. " dalam {{w|ortografi kana sejarawi}}.", displaytitle = label_pref .. get_tagged_reading(kana, lang) .. " bahasa {{{langname}}}", breadcrumb = "sejarah", parents = { {name = "Perkataan dieja dengan " .. kana, sort = " "}, {name = "Perkataan mengikut aksara individu dalam ejaan sejarah", sort = lang:makeSortKey(kana)} }, } end) insert(handlers, function(data) local count = data.label:match("^Bacaan kanji dengan (.+) mora$") local num = word_to_number[count] if not num then return nil end return { description = "Bacaan kanji bahasa {{{langname}}} yang mengandungi " .. count .. " mora.", breadcrumb = num, parents = {{name = "Bacaan kanji mengikut bilangan mora", sort = num}}, } end) insert(handlers, function(data) local label_pref, period, reading_type, reading = match(data.label, "^(Kanji dengan bacaan ([a-z]-) ?([%a']+) )" .. kana_capture .. "$") if not period then return end period = period ~= "" and period or nil local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type) if not reading_type_link then return end local lang = data.lang -- Compute parents. local parents, breadcrumb = {} if reading:find("・", 1, true) then local okurigana = reading:match("・(.*)") insert(parents, { name = "Kanji dengan bacaan" .. (period_text or "") .. " ".. reading_type .. " " .. 
reading:match("(.-)・"), -- Sort by okurigana, since all coordinate categories will have the same furigana. sort = (lang:makeSortKey(okurigana)) }) breadcrumb = "~" .. okurigana else insert(parents, { name = "Kanji mengikut bacaan" .. (period_text or "") .. " " .. reading_type, sort = (lang:makeSortKey(reading)) }) breadcrumb = reading end if is_on_subtype(reading_type) then insert(parents, {name = "Kanji dengan bacaan" .. (period_text or "") .. " on " .. reading, sort = reading_type}) elseif period_text then insert(parents, {name = "Kanji dengan bacaan" .. period_text .. " " .. reading, sort = reading_type}) end if not period_text then insert(parents, {name = "Kanji dibaca sebagai " .. reading, sort = reading_type}) end return { description = "Aksara [[kanji]] bahasa {{{langname}}} dengan bacaan " .. reading_type_link .. " " .. get_reading_link(reading, lang, period or reading_type) .. ".", displaytitle = "{{{langname}}} " .. label_pref .. get_tagged_reading(reading, lang), breadcrumb = get_tagged_reading(breadcrumb, lang), parents = parents, } end) insert(handlers, function(data) local period, reading_type = match(data.label, "^Kanji mengikut bacaan ([a-z]-) ?([%a']+)$") if not period then return end period = period ~= "" and period or nil local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type) if not reading_type_link then return nil end -- Compute parents. local parents = { is_on_subtype(reading_type) and {name = "Kanji mengikut bacaan" .. (period_text or "") .. " on", sort = reading_type} or period_text and {name = "Kanji mengikut bacaan " .. reading_type, sort = period} or {name = "Kanji mengikut bacaan", sort = reading_type} } if period_text then insert(parents, {name = "Kanji mengikut bacaan" .. period_text, sort = reading_type}) end -- Compute description. local description = "[[kanji|Kanji]] bahasa {{{langname}}} dikategorikan mengikut bacaan" .. (period_text or "") .. " " .. reading_type_link .. "." 
return { description = description, breadcrumb = reading_type .. (period_text or ""), parents = parents, } end) insert(handlers, function(data) local label_pref, reading = match(data.label, "^(Kanji dibaca sebagai )" .. kana_capture .. "$") if not reading then return end local args = require("Module:parameters").process(data.args, { ["histconsol"] = true, }) local lang = data.lang local parents, breadcrumb = {} if reading:find("・", 1, true) then local okurigana = reading:match("・(.*)") insert(parents, { name = "Kanji dibaca sebagai " .. reading:match("(.-)・"), -- Sort by okurigana, since all coordinate categories will have the same furigana. sort = (lang:makeSortKey(okurigana)) }) breadcrumb = "~" .. okurigana else insert(parents, { name = "Kanji mengikut bacaan", sort = (lang:makeSortKey(reading)) }) breadcrumb = reading end local addl local period_text if args.histconsol then period_text = "lama" addl = ("This is a [[Wikipedia:Historical kana orthography|historical]] [[Wikipedia:Kanazukai|reading]], now " .. "consolidated with the [[Wikipedia:Modern kana usage|modern reading]] of " .. get_reading_link(args.histconsol, lang, nil, ("Kategori:Kanji dibaca sebagai %s bahasa Jepun"):format(args.histconsol)) .. ".") end return { description = "[[kanji|Kanji]] bahasa {{{langname}}} dibaca sebagai " .. get_reading_link(reading, lang, period_text) .. ".", additional = addl, displaytitle = label_pref .. get_tagged_reading(reading, lang) .. " bahasa {{{langname}}}" , breadcrumb = get_tagged_reading(breadcrumb, lang), parents = parents, }, true end) insert(handlers, function(data) local label_pref, reading = match(data.label, "^(Perkataan dieja dengan kanji dibaca sebagai )" .. kana_capture .. "$") if not reading then return end -- Compute parents. 
local lang = data.lang local sort_key = (lang:makeSortKey(reading)) local mora_count = require("Module:ja").count_morae(reading) local mora_count_words = m_numeric.spell_number(tostring(mora_count)) local parents = { {name = "Perkataan mengikut bacaan kanji", sort = sort_key}, {name = "Bacaan kanji dengan " .. mora_count_words .. " mora", sort = sort_key}, {name = "Kanji dibaca sebagai " .. reading, sort = " "}, } local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms that contain kanji that exhibit a reading of " .. get_reading_link(reading, lang) .. " in those terms prior to any sound changes.", displaytitle = "{{{langname}}} " .. label_pref .. tagged_reading, breadcrumb = tagged_reading, parents = parents, } end) insert(handlers, function(data) local kanji, reading = match(data.label, "^Perkataan dieja dengan (.) dibaca sebagai " .. kana_capture .. "$") if not kanji then return nil end local args = require("Module:parameters").process(data.args, { [1] = {list = true}, }) local lang = data.lang if #args[1] == 0 then error("Bagi kategori dalam bentuk \"" .. lang:getCanonicalName() .. " terms spelled with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.") end local yomi_types, parents = {}, {} for _, yomi, category in ipairs(args[1]) do local yomi_data = yomi_data[yomi] if not yomi_data then error("Jenis yomi \"" .. yomi .. "\" tidak sah.") end category = yomi_data.kanji_category if not category then error("Jenis yomi \"" .. yomi .. "\" tidak sah bagi jenis kategori ini.") end insert(yomi_types, yomi_data.link) insert(parents, { name = "Perkataan dieja dengan kanji dengan bacaan " .. category, sort = (lang:makeSortKey(reading)) }) end insert(parents, 1, {name = "Perkataan dieja dengan " .. 
kanji, sort = (lang:makeSortKey(reading))}) insert(parents, 2, {name = "Perkataan dieja dengan kanji dibaca sebagai " .. reading, sort = Hani_sort(kanji)}) yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " .. require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) .. " reading" .. (#yomi_types > 1 and "s" or "") local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms spelled with {{l|{{{langcode}}}|" .. kanji .. "}} with " .. yomi_types .. " of " .. get_reading_link(reading, lang) .. ".", displaytitle = "{{{langname}}} terms spelled with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading, breadcrumb = "dibaca sebagai " .. tagged_reading, parents = parents, }, true end) insert(handlers, function(data) local affix, kanji, reading = data.label:match("^Perkataan dengan ([a-z]) (.+) dibaca sebagai " .. kana_capture .. "$") if not affix or not kanji or not reading then return nil end local args = require("Module:parameters").process(data.args, { [1] = {list = true}, }) local lang = data.lang if #args[1] == 0 then error("For categories of the form \"" .. lang:getCanonicalName() .. " terms AFFIXed with KANJI dibaca sebagai READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.") end local yomi_types = {} for _, yomi, category in ipairs(args[1]) do local yomi_data = yomi_data[yomi] if not yomi_data then error("The yomi type \"" .. yomi .. "\" is not recognized.") end category = yomi_data.kanji_category if not category then error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.") end insert(yomi_types, yomi_data.link) end yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " .. require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) .. " reading" .. 
(#yomi_types > 1 and "s" or "") local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_reading = get_tagged_reading(reading, lang) return { description = "{{{langname}}} terms " .. affix .. "ed with {{l|{{{langcode}}}|" .. kanji .. "}} with " .. yomi_types .. " of " .. get_reading_link(reading, lang) .. ".", displaytitle = "{{{langname}}} terms " .. affix .. "ed with " .. tagged_kanji .. " dibaca sebagai " .. tagged_reading, breadcrumb = "dibaca sebagai " .. reading, parents = { {name = "terms " .. affix .. "ed with " .. kanji, sort = (lang:makeSortKey(reading))}, --{name = "Perkataan dieja dengan " .. kanji .. " dibaca sebagai " .. reading, sort = (lang:makeSortKey(reading)), args=data.args} }, }, true end) insert(handlers, function(data) local kanji, daiyoji = match(data.label, "^Perkataan dengan (.) digantikan oleh daiyōji (.)$") if not kanji then return nil end local args = require("Module:parameters").process(data.args, { ["sort"] = true, }) local lang = data.lang if not args.sort then error("For categories of the form \"" .. lang:getCanonicalName() .. " terms with KANJI replaced by daiyōji DAIYOJI\", the sort key must be specified using sort=") end local tagged_kanji = get_tagged_reading(kanji, lang) local tagged_daiyoji = get_tagged_reading(daiyoji, lang) return { description = "{{{langname}}} terms with {{l|{{{langcode}}}|" .. kanji .. "}} replaced by [[Appendix:Japanese glossary#daiyouji|daiyōji]] {{l|{{{langcode}}}|" .. daiyoji .. "}}.", displaytitle = "{{{langname}}} terms with " .. tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji, breadcrumb = tagged_kanji .. " replaced by daiyōji " .. 
tagged_daiyoji, parents = {{name = "Perkataan dieja dengan daiyōji", sort = args.sort}}, }, true end) return {LABELS = labels, HANDLERS = handlers} 55nzh7armhc450zs6jk9i3eepl0t9es Modul:MediaWiki message helper 828 57886 281283 184920 2026-04-21T14:11:25Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/84209320|84209320]]) 281283 Scribunto text/plain local m_str_utils = require("Module:string utilities") local dump = mw.dumpObject local get_current_title = mw.title.getCurrentTitle local gsplit = m_str_utils.gsplit local make_title = mw.title.makeTitle local new_title = mw.title.new local pattern_escape = m_str_utils.pattern_escape local php_trim = require("Module:Scribunto").php_trim local ufind = m_str_utils.find local ugsub = m_str_utils.gsub local ulower = m_str_utils.lower local uupper = m_str_utils.upper local export = {} local function get_title(frame) local args = frame.args local title = args and args.title or nil return title == nil and get_current_title() or new_title(title) or error(("%s is not a valid title"):format(dump(title))) end local function print_suggestions(suggestions) if #suggestions == 0 then return "" else local prefix = "* Adakah anda maksudkan " local suffix if #suggestions > 1 then prefix = prefix .. " salah satu daripada semua ini?\n" suggestions = suggestions:map(function(link) return "** " .. link end) suffix = "" else suffix = "?" end return prefix .. suggestions:concat "\n" .. suffix end end -- For [[MediaWiki:Noarticletext]] on uncreated category pages. function export.category_suggestions(frame) local title = ugsub(get_title(frame).text, "^.", uupper) local output = require("Module:array")() local function make_suggestion(title, suffix) output:insert("'''[[:Kategori:" .. title .. "]]'''" .. 
(suffix or "")) end local function check_for_page_from_function(func) local suggestion = func(title) if suggestion then local suggestion_title = make_title(14, suggestion) if suggestion_title and suggestion_title.exists then make_suggestion(suggestion) return true end end return false end local function check_for_page_with_suffix(suffix) return check_for_page_from_function(function(title) return title .. " " .. suffix end) end local function check_for_page_with_prefix_removed(prefix) return check_for_page_from_function(function(title) return title:gsub(pattern_escape(prefix), "") end) end check_for_page_with_prefix_removed("List of ") check_for_page_with_prefix_removed("list of ") local has_language_category = check_for_page_with_suffix("language") check_for_page_with_suffix("Language") check_for_page_with_suffix("languages") local has_script_category = check_for_page_with_suffix("script") local function check_other_names_of_languages(language_name) for code, data in pairs(require("Module:languages/data/all")) do local function check_name_list(list) if list then for _, name in ipairs(list) do -- The aliases and varieties are recursive, -- with subtables that themselves contain names. if type(name) == "table" then check_name_list(name) else if name == language_name then local object = require("Module:languages").makeObject(code, data) make_suggestion(object:getCategoryName()) end end end end end check_name_list(data.otherNames) check_name_list(data.aliases) check_name_list(data.varieties) end end -- If title looks like a language category, then check if the language name -- in it is a valid canonical name, or one of the otherNames for some -- language. -- If the title looks like a language code, check for a language or a script -- with that code. 
local function check_language_name(language_name, is_language_category, has_language_category) local ret = false if not has_language_category then if require("Module:languages/canonical names")[language_name] then if not is_language_category then make_suggestion("bahasa " .. language_name) else output:insert("* '''" .. language_name .. "''' merupakan nama bahasa Wikikamus yang sah.") end ret = true end end -- Some otherNames are the canonical name of another language. check_other_names_of_languages(language_name) return ret end local language_name = title:match "^[Bb]ahasa (.+)$" or title:match "^(Bahasa .+)$" check_language_name(language_name or title, language_name ~= nil, has_language_category) -- Most languages (7965/8085 by last count) have uppercase letters at -- beginning of the name and after whitespace and punctuation characters, -- and lowercase everywhere else. Exceptions include languages -- with apostrophes, such as Yup'ik, and languages with tone letters, -- such as ǃXóõ. local fixed_capitalization = ugsub(ulower(language_name or title), "%f[^%z%s%p]%a", uupper) if fixed_capitalization ~= (language_name or title) then check_language_name(fixed_capitalization) end if title:find "^[%a-]+$" then local function check_for_valid_code(code, ...) for _, module_name in ipairs { ... } do local object = require("Modul:" .. module_name).getByCode(code) if object then make_suggestion(object:getCategoryName(), " (kod <code>" .. code .. "</code>)") end end end local code = title:lower() check_for_valid_code(code, "languages", "etymology languages", "scripts", "families") check_for_valid_code(code .. "-pro", "languages", "etymology languages") end local function check_script_name(script_name, is_script_category, has_script_category) if not has_script_category then local object = require("Module:scripts").getByCanonicalName(script_name) if object then if is_script_category then output:insert("* " .. script_name .. 
" merupakan nama tulisan Wikikamus yang sah.") else make_suggestion(object:getCategoryName()) end end end for code, data in pairs(require("Module:scripts/data")) do local function check_other_names_of_script(list) if list then for _, name in ipairs(list) do if type(name) == "table" then check_other_names_of_script(name) elseif script_name == name then local object = require("Module:scripts").makeObject(code, data) make_suggestion(object:getCategoryName()) end end end end check_other_names_of_script(data.otherNames) check_other_names_of_script(data.varieties) check_other_names_of_script(data.aliases) end end local script_name = title:match "^[Tt]ulisan (.+)$" check_script_name(script_name or title, script_name ~= nil, has_script_category) return print_suggestions(output) end function export.template_suggestions(frame) local title = get_title(frame).text local output = require("Module:array")() local function make_suggestion(title, suffix) output:insert("'''[[:Templat:" .. title .. "]]'''" .. (suffix or "")) end local function check_for_page_with_prefix(prefix) local suggestion = prefix .. title local suggestion_title = make_title(10, suggestion) if suggestion_title and suggestion_title.exists then make_suggestion(suggestion) return true end return false end if title:find(" ", 1, true) then local with_hyphen = title:gsub(" ", "-") local suggestion_title = make_title(10, with_hyphen) if suggestion_title and suggestion_title.exists then make_suggestion(with_hyphen) end end local prefixes = frame.args.prefixes if prefixes then for prefix in gsplit(prefixes, ",", true, true) do check_for_page_with_prefix(prefix) end end local prefix, rest = title:match "^([^: ]+) *:(.+)$" if prefix then prefix = prefix:upper() else prefix, rest = "", title end if prefix == "" or prefix == "R" or prefix == "RQ" then local templates = require("Module:MediaWiki message helper/R: and RQ: templates") local rest_pattern = pattern_escape(rest) :gsub("%l", function(letter) return "[" .. 
letter:upper() .. letter:lower() .. "]" end) for suggestion in templates:gmatch("%f[^%z\n][Rr][Qq]? ?:[a-z-]*[-:]?" .. rest_pattern .. "[ :]?%d?%d?%d?%d?[/-]?%d?%d?%d?%d?%l?%f[%z\n]") do local suggestion_title = make_title(10, suggestion) if suggestion_title and suggestion_title.exists then make_suggestion(suggestion) end end end return print_suggestions(output) end function export.module_suggestions(frame) local title = get_title(frame).text local output = require("Module:array")() local function make_suggestion(title) output:insert("'''[[:Module:" .. title .. "]]'''") end local pronunciation_suffixes = require("Module:array"){"IPA", "pr", "pron", "pronunc", "pronunciation"} if pronunciation_suffixes:some(function(suffix) return title:find("%-" .. suffix .. "$") end) then for _, suggestion in ipairs(pronunciation_suffixes:map(function(suffix) return title:gsub("%-%w+$", "-" .. suffix) end)) do local suggestion_title = make_title(828, suggestion) if suggestion_title and suggestion_title.exists then make_suggestion(suggestion) end end end -- Look for the modules actually invoked by the template with the same name -- as the module (accounting for "Module:Template:..." cases). local template_title = new_title(title, 10) if template_title then local template_text = template_title:getContent() if template_text then for template in require("Module:template parser").find_templates(template_text) do if template:get_name() == "#INVOKE:" then local args = template:get_arguments() if args[2] then -- args[2] is the function name, so #INVOKE: will throw an error if not present make_suggestion(php_trim(args[1])) -- args[1] is the module name end end end end end return print_suggestions(output) end function export.is_data_module_not_documentation(frame) local title = get_title(frame) if require("Module:pages").get_pagetype(title) == "module" then return ufind(title.text, "^" .. (frame.args[1] or "") .. 
"$") end end return export py90hz418oltqr9ytx3oknlhjrub6c4 gelugak 0 68267 281324 241579 2026-04-22T00:33:54Z PeaceSeekers 3334 281324 wikitext text/x-wiki ==Bahasa Melayu Sarawak== ===Takrifan=== ====Kata kerja==== {{inti|poz-sml|kata kerja}} # [[selongkar]] #: {{cp|poz-sml|'''Gelugak''' jak kabat ya, ngare gilak dalam ya.|'''Selongkar''' saja almari itu, bersepah sangat dalam itu.}} ===Sebutan=== * {{AFA|poz-sml|/gə.lu.gaʔ/}} * {{rima|poz-sml|gaʔ}} * {{penyempangan|poz-sml|ge|lu|gak}} th75a44iild6j2qnxdrxurreaoijuvp Modul:sem-arb-utilities 828 72855 281296 266012 2026-04-21T15:31:27Z Hakimi97 2668 Mengemas kini mengikut padanan Wikikamus bahasa Inggeris (semakan [[en:Special:Diff/87627369|87627369]]) (perlu semakan semula) 281296 Scribunto text/plain local export = {} local m_str_utils = require("Module:string utilities") local m_utilities = require("Module:utilities") local m_links = require("Module:links") local m_headword = require("Module:headword") local m_langs = require("Module:languages") local m_params = require("Module:parameters") local m_parse_utils = require("Module:parse utilities") local m_affix = require("Module:affix") local m_sc_utils = require("Module:script utilities") local pluralize = require("Module:en-utilities").pluralize local lg_ar = m_langs.getByCode("ar") local lg_sem_arb = lg_ar:getFamily():getCode() local rsplit = m_str_utils.split local rsubn = m_str_utils.gsub local unpack = unpack or table.unpack -- Lua 5.2 compatibility local separator_langs = { ["mt"] = true, ["acy"] = true } local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" } local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" } local lang local sc local function ifelse(cond, yes, no) if cond then return yes end return no end local function ucfirst(str) if str == nil then return str end return mw.language.getContentLanguage():ucfirst(str) end local function link(term, alt, id) if term == "" or term == "&mdash;" 
then return term else return m_links.full_link({ term = term, alt = alt, lang = lang, id = id, }) end end local function parse_inlines(term) return m_parse_utils.parse_inline_modifiers( term, { param_mods = {tr = {}, t = {}, pos = {}}, generate_obj = function(term) return {term} end, } ) end local function make_part(noninline, lang) local keys = {"tr", "t", "pos"} local inline if type(noninline) == "string" then inline = parse_inlines(noninline) else inline = parse_inlines(noninline[1]) end local return_value = { term = m_sc_utils.tag_text(inline[1], lang), } for i, key in ipairs(keys) do if inline[key] and noninline[key] then error( key .. " specified twice: " .. "<" .. key .. ":" .. inline[key] .. ">" .. " and " .. "|" .. key .. "=" .. noninline[key] ) end return_value[key] = inline[key] or noninline[key] end if not return_value.tr then return_value.tr = lang:transliterate(inline[1]) end return return_value end local function make_parts(lang, raw_parts) local parts = {} for i, part in ipairs(raw_parts) do parts[#parts + 1] = make_part(part, lang) end return { parts = parts, lang = lang, sc = lang:findBestScript(parts[1][1]), } end local function show_affix(lang, raw_parts) return m_affix.show_affix( make_parts(lang, raw_parts), {}, lang ) end local appendices = { ["active participle"] = { -- participles have verbal force in (most?)
vernaculars function(args, lang) return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs") end, derived = true, }, ["characteristic adjective"] = "nominals", ["color/defect adjective"] = { "nominals", fragment = "Color or defect adjectives", }, ["diminutive"] = "nominals", ["elative"] = "nominals", ["relative"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "suffix", }, ["id"] = {}, ["id1"] = { alias_of = "id", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"), }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, {args.suffix, pos="feminine nisba"}, } ) ) end, }, ["relative-a"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"), }, ["t"] = { required = true, }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. 
show_affix( lang, { args, {args.suffix, pos="feminine ending"}, } ) ) end, }, ["relative-linking"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), }, ["t"] = { required = true, }, ["linking"] = { required = true, } } end, title = function(args) if args.pl then return "relative adjectives (nisba)" end return "relative adjective (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, args.linking, {args.suffix, pos = "nisba"}, } ) ) end, }, ["relative-linking-noun"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["linking"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), } } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, {args, args.linking, args.suffix} ) ) end, }, ["form"] = { "verbs", params = { [1] = { alias_of = "wazn", }, ["wazn"] = { required = true, }, }, fragment = function(args) return "Form_" .. args.wazn end, title = function(args) return "form " .. 
args.wazn end, }, ["instance noun"] = "nominals", ["noun of place"] = "nominals", ["occupational noun"] = "nominals", ["passive participle"] = { "nominals", derived = true, }, ["reduplicated"] = { glossary = "reduplication", }, ["singulative noun"] = "nominals", ["tool noun"] = "nominals", ["verbal noun"] = "nominals", } local radicals = { ["Arab"] = { ["ء"] = true, ["ب"] = true, ["ت"] = true, ["ث"] = true, ["ج"] = true, ["ح"] = true, ["خ"] = true, ["د"] = true, ["ذ"] = true, ["ر"] = true, ["ز"] = true, ["س"] = true, ["ش"] = true, ["ص"] = true, ["ض"] = true, ["ط"] = true, ["ظ"] = true, ["ع"] = true, ["غ"] = true, ["ف"] = true, ["ق"] = true, ["ك"] = true, ["ل"] = true, ["م"] = true, ["ن"] = true, ["ه"] = true, ["و"] = true, ["ي"] = true, ["گ"] = true, ["چ"] = true, ["پ"] = true, ["ڭ"] = true, }, ["Latn"] = { ["'"] = true, ["b"] = true, ["c"] = true, ["ċ"] = true, ["d"] = true, ["δ"] = true, ["f"] = true, ["ġ"] = true, ["g"] = true, ["għ"] = true, ["h"] = true, ["ħ"] = true, ["j"] = true, ["k"] = true, ["l"] = true, ["m"] = true, ["n"] = true, ["p"] = true, ["q"] = true, ["r"] = true, ["s"] = true, ["ş"] = true, ["t"] = true, ["v"] = true, ["w"] = true, ["x"] = true, ["y"] = true, ["ż"] = true, ["z"] = true, ["θ"] = true, } } local function validateRoot(rootTable, joined_root) if type(rootTable) ~= "table" then error("rootTable is not a table", 2) end local len = #rootTable if len < 3 then error("Root must have at least three radicals.") end if sc == nil then sc = lang:findBestScript(joined_root):getCode() end for i, radical in ipairs(rootTable) do if not radicals[sc][radical] then error("Unrecognized radical " .. radical .. " in " .. 
joined_root) end end end function export.root(frame) local output = {} local categories = {} local title = mw.title.getCurrentTitle() local namespace = title.nsText local fulltitle = title.fullText if frame.args["lang"] then lang = require("Module:languages").getByCode(frame.args["lang"]) else error("Please provide a language code.") end local subpage = "Appendix:" .. lang:getCanonicalName() .. " roots/" local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "") local params = { [1] = { list = true }, ["nocat"] = { type = "boolean" }, ["plain"] = { type = "boolean" }, ["notext"] = { type = "boolean" }, ["sense"] = {} } local args = require("Module:parameters").process(frame:getParent().args, params) local rootLetters = {} local roots = args[1] local plain = args["plain"] if frame.args["plain"] then plain = true end local langCode = lang:getCode() local separator = " " if separator_langs[langCode] then separator = "-" else separator = " " end local roots_len = #roots if #roots == 0 and namespace == "Template" then if template_preview_per_langcode[langCode] ~= nil then table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator)) else table.insert(rootLetters, rsplit("ك ت ب", separator)) end elseif #roots ~= 0 then for _, root in ipairs(roots) do table.insert(rootLetters, rsplit(root, separator)) end else table.insert(rootLetters, rsplit(fulltitle, separator)) end local joined_roots = {} for i, rootLetter in ipairs(rootLetters) do table.insert(joined_roots, table.concat(rootLetter, separator)) validateRoot(rootLetter, joined_roots[i]) end local sense = args["sense"] local sense_formatted = "" if sense ~= nil then sense_formatted = " (" .. sense .. ") " end if fulltitle == joined_roots[1] then if namespace == "" then error("The root page should be in the Appendix namespace. Please move it to : [[" .. subpage .. joined_roots[1] .. 
"]]") end if roots_len > 1 then error("There should be only one root.") end table.insert(output, m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true })) if args["nocat"] then return table.concat(output) else return table.concat(output) .. table.concat(categories) end else local link_texts = {} local term_counts = {} for i, joined_root in ipairs(joined_roots) do local link_text = subpage .. joined_root table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense)) table.insert( categories, m_utilities.format_categories( { lang:getCanonicalName() .. " terms belonging to the root " .. joined_root .. sense_formatted }, lang) ) table.insert(term_counts, mw.site.stats.pagesInCategory( lang:getCanonicalName() .. " terms belonging to the root " .. joined_root .. sense_formatted, "pages") ) end if args["nocat"] or plain then if args["nocat"] then return table.concat(link_texts, ", ") else return table.concat(link_texts, ", ") .. table.concat(categories) end else local link_text_output = "" for i, link_text in ipairs(link_texts) do link_text_output = link_text_output .. "\n|-\n| " .. link_text .. "\n|-\n| [[:Category:" .. lang:getCanonicalName() .. " terms belonging to the root " .. joined_roots[i] .. sense_formatted .. "|" .. term_counts[i] .. " term" .. (term_counts[i] == 1 and "" or "s") .. "]]\n" end local color = "grey" if color_langs[langCode] ~= nil then color = color_langs[langCode] end local wikicode = mw.getCurrentFrame():expandTemplate { title = 'inflection-table-top', args = { title = "-", palette = color, class = "floatright tr-alongside" } } wikicode = wikicode .. [=[ ! [[w:Semitic root|Root]=] .. (#term_counts == 1 and "" or "s") .. [=[]]]=] wikicode = wikicode .. link_text_output wikicode = wikicode .. mw.getCurrentFrame():expandTemplate { title = 'inflection-table-bottom', } return wikicode .. 
table.concat(categories) end end end local function iffn(val, ...) if type(val) == "function" then return val(...) end return val end function export.etym(frame) local params = { [1] = { alias_of = "lang", }, [2] = { alias_of = "class" }, ["fragment"] = {}, ["nocat"] = { type = "boolean", }, ["lang"] = { type = "language", replaced_by = false, required = true, }, ["class"] = { required = true, }, } local args, extra = m_params.process(frame:getParent().args, params, true) local fixed_indices = {} for k, v in pairs(extra) do if type(k) == "number" then k = k - 2 end fixed_indices[k] = v end extra = fixed_indices if args.lang:getFamily():getCode() ~= lg_sem_arb then error( args.lang:getCode() .. "'s family is " .. args.lang:getFamily():getCode() .. ", not " .. lg_sem_arb ) end local lookup = appendices[mw.ustring.lower(args.class)] local lookup_args = {} if not lookup then error("Unrecognized word type " .. mw.ustring.lower(args.class)) end if lookup.glossary then return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]" end if lookup.params then lookup_args = m_params.process(extra, iffn(lookup.params, args.lang)) end local appendix = nil if lookup.glossary then appendix = "Glosari" else local appendix_lang = args.lang:getCanonicalName() local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], args.lang)) appendix = appendix_lang .. " " .. appendix_title if not mw.title.new(appendix, "Lampiran").exists then appendix_lang = lg_ar:getCanonicalName() appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lg_ar)) appendix = appendix_lang .. " " ..
appendix_title end end local title = args.class local desc = "" local intro = "" if lookup.derived then intro = "diterbitkan daripada " end if type(lookup) ~= "string" then title = iffn(lookup.title, lookup_args, args.lang) or "" desc = iffn(lookup.desc, lookup_args, args.lang) or "" if mw.ustring.match(args.class, "^%u.*") then if intro == "" then title = ucfirst(title) or "" else intro = ucfirst(intro) or "" end end end local fragment = ( iffn(lookup.fragment, lookup_args, args.lang) or ucfirst(iffn(lookup.title, {pl=true}, args.lang)) or pluralize(ucfirst(args.class)) ) or "" return ( intro .. "[[Lampiran:" .. appendix .. ifelse(fragment, "#" .. fragment, "") .. "|" .. title .. "]] " .. desc ) end return export axix2jxl24a4mf6k5fxcdlyvjykpmhi 281298 281296 2026-04-21T15:36:39Z Hakimi97 2668 281298 Scribunto text/plain local export = {} local m_str_utils = require("Module:string utilities") local m_utilities = require("Module:utilities") local m_links = require("Module:links") local m_headword = require("Module:headword") local m_langs = require("Module:languages") local m_params = require("Module:parameters") local m_parse_utils = require("Module:parse utilities") local m_affix = require("Module:affix") local m_sc_utils = require("Module:script utilities") local pluralize = require("Module:en-utilities").pluralize local lg_ar = m_langs.getByCode("ar") local lg_sem_arb = lg_ar:getFamily():getCode() local rsplit = m_str_utils.split local rsubn = m_str_utils.gsub local unpack = unpack or table.unpack -- Lua 5.2 compatibility local separator_langs = { ["mt"] = true, ["acy"] = true } local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" } local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" } local lang local sc local function ifelse(cond, yes, no) if cond then return yes end return no end local function ucfirst(str) if str == nil then return str end return mw.language.getContentLanguage():ucfirst(str) end 
local function link(term, alt, id) if term == "" or term == "&mdash;" then return term else return m_links.full_link({ term = term, alt = alt, lang = lang, id = id, }) end end local function parse_inlines(term) return m_parse_utils.parse_inline_modifiers( term, { param_mods = {tr = {}, t = {}, pos = {}}, generate_obj = function(term) return {term} end, } ) end local function make_part(noninline, lang) local keys = {"tr", "t", "pos"} local inline if type(noninline) == "string" then inline = parse_inlines(noninline) else inline = parse_inlines(noninline[1]) end local return_value = { term = m_sc_utils.tag_text(inline[1], lang), } for i, key in ipairs(keys) do if inline[key] and noninline[key] then error( key .. " specified twice: " .. "<" .. key .. ":" .. inline[key] .. ">" .. " and " .. "|" .. key .. "=" .. noninline[key] ) end return_value[key] = inline[key] or noninline[key] end if not return_value.tr then return_value.tr = lang:transliterate(inline[1]) end return return_value end local function make_parts(lang, raw_parts) local parts = {} for i, part in ipairs(raw_parts) do parts[#parts + 1] = make_part(part, lang) end return { parts = parts, lang = lang, sc = lang:findBestScript(parts[1][1]), } end local function show_affix(lang, raw_parts) return m_affix.show_affix( make_parts(lang, raw_parts), {}, lang ) end local appendices = { ["active participle"] = { -- participles have verbal force in (most?)
vernaculars function(args, lang) return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs") end, derived = true, }, ["characteristic adjective"] = "nominals", ["color/defect adjective"] = { "nominals", fragment = "Color or defect adjectives", }, ["diminutive"] = "nominals", ["elative"] = "nominals", ["relative"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "suffix", }, ["id"] = {}, ["id1"] = { alias_of = "id", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"), }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, {args.suffix, pos="feminine nisba"}, } ) ) end, }, ["relative-a"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"), }, ["t"] = { required = true, }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. 
show_affix( lang, { args, {args.suffix, pos="feminine ending"}, } ) ) end, }, ["relative-linking"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), }, ["t"] = { required = true, }, ["linking"] = { required = true, } } end, title = function(args) if args.pl then return "relative adjectives (nisba)" end return "relative adjective (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, args.linking, {args.suffix, pos = "nisba"}, } ) ) end, }, ["relative-linking-noun"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["linking"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), } } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, {args, args.linking, args.suffix} ) ) end, }, ["form"] = { "verbs", params = { [1] = { alias_of = "wazn", }, ["wazn"] = { required = true, }, }, fragment = function(args) return "Form_" .. args.wazn end, title = function(args) return "form " .. 
args.wazn end, }, ["instance noun"] = "nominals", ["noun of place"] = "nominals", ["occupational noun"] = "nominals", ["passive participle"] = { "nominals", derived = true, }, ["reduplicated"] = { glossary = "reduplication", }, ["singulative noun"] = "nominals", ["tool noun"] = "nominals", ["verbal noun"] = "nominals", } local radicals = { ["Arab"] = { ["ء"] = true, ["ب"] = true, ["ت"] = true, ["ث"] = true, ["ج"] = true, ["ح"] = true, ["خ"] = true, ["د"] = true, ["ذ"] = true, ["ر"] = true, ["ز"] = true, ["س"] = true, ["ش"] = true, ["ص"] = true, ["ض"] = true, ["ط"] = true, ["ظ"] = true, ["ع"] = true, ["غ"] = true, ["ف"] = true, ["ق"] = true, ["ك"] = true, ["ل"] = true, ["م"] = true, ["ن"] = true, ["ه"] = true, ["و"] = true, ["ي"] = true, ["گ"] = true, ["چ"] = true, ["پ"] = true, ["ڭ"] = true, }, ["Latn"] = { ["'"] = true, ["b"] = true, ["c"] = true, ["ċ"] = true, ["d"] = true, ["δ"] = true, ["f"] = true, ["ġ"] = true, ["g"] = true, ["għ"] = true, ["h"] = true, ["ħ"] = true, ["j"] = true, ["k"] = true, ["l"] = true, ["m"] = true, ["n"] = true, ["p"] = true, ["q"] = true, ["r"] = true, ["s"] = true, ["ş"] = true, ["t"] = true, ["v"] = true, ["w"] = true, ["x"] = true, ["y"] = true, ["ż"] = true, ["z"] = true, ["θ"] = true, } } local function validateRoot(rootTable, joined_root) if type(rootTable) ~= "table" then error("rootTable is not a table", 2) end local len = #rootTable if len < 3 then error("Root must have at least three radicals.") end if sc == nil then sc = lang:findBestScript(joined_root):getCode() end for i, radical in ipairs(rootTable) do if not radicals[sc][radical] then error("Unrecognized radical " .. radical .. " in " .. 
joined_root) end end end function export.root(frame) local output = {} local categories = {} local title = mw.title.getCurrentTitle() local namespace = title.nsText local fulltitle = title.fullText if frame.args["lang"] then lang = require("Module:languages").getByCode(frame.args["lang"]) else error("Please provide a language code.") end local subpage = "Lampiran:Akar bahasa " .. lang:getCanonicalName() .. "/" local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "") local params = { [1] = { list = true }, ["nocat"] = { type = "boolean" }, ["plain"] = { type = "boolean" }, ["notext"] = { type = "boolean" }, ["sense"] = {} } local args = require("Module:parameters").process(frame:getParent().args, params) local rootLetters = {} local roots = args[1] local plain = args["plain"] if frame.args["plain"] then plain = true end local langCode = lang:getCode() local separator = " " if separator_langs[langCode] then separator = "-" else separator = " " end local roots_len = #roots if #roots == 0 and namespace == "Templat" then if template_preview_per_langcode[langCode] ~= nil then table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator)) else table.insert(rootLetters, rsplit("ك ت ب", separator)) end elseif #roots ~= 0 then for _, root in ipairs(roots) do table.insert(rootLetters, rsplit(root, separator)) end else table.insert(rootLetters, rsplit(fulltitle, separator)) end local joined_roots = {} for i, rootLetter in ipairs(rootLetters) do table.insert(joined_roots, table.concat(rootLetter, separator)) validateRoot(rootLetter, joined_roots[i]) end local sense = args["sense"] local sense_formatted = "" if sense ~= nil then sense_formatted = " (" .. sense .. ") " end if fulltitle == joined_roots[1] then if namespace == "" then error("The root page should be in the Appendix namespace. Please move it to : [[" .. subpage .. joined_roots[1] .. 
"]]") end if roots_len > 1 then error("There should be only one root.") end table.insert(output, m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true })) if args["nocat"] then return table.concat(output) else return table.concat(output) .. table.concat(categories) end else local link_texts = {} local term_counts = {} for i, joined_root in ipairs(joined_roots) do local link_text = subpage .. joined_root table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense)) table.insert( categories, m_utilities.format_categories( { "Perkataan bahasa " .. lang:getCanonicalName() .. " milik akar " .. joined_root .. sense_formatted }, lang) ) table.insert(term_counts, mw.site.stats.pagesInCategory( "Perkataan bahasa " .. lang:getCanonicalName() .. " milik akar " .. joined_root .. sense_formatted, "pages") ) end if args["nocat"] or plain then if args["nocat"] then return table.concat(link_texts, ", ") else return table.concat(link_texts, ", ") .. table.concat(categories) end else local link_text_output = "" for i, link_text in ipairs(link_texts) do link_text_output = link_text_output .. "\n|-\n| " .. link_text .. "\n|-\n| [[:Kategori:Perkataan bahasa " .. lang:getCanonicalName() .. " milik akar " .. joined_roots[i] .. sense_formatted .. "|" .. term_counts[i] .. " perkataan" .. (term_counts[i] == 1 and "" or "") .. "]]\n" end local color = "grey" if color_langs[langCode] ~= nil then color = color_langs[langCode] end local wikicode = mw.getCurrentFrame():expandTemplate { title = 'inflection-table-top', args = { title = "-", palette = color, class = "floatright tr-alongside" } } wikicode = wikicode .. [=[ ! [[w:Akar bahasa-bahasa Samiah|Akar]=] .. (#term_counts == 1 and "" or "") .. [=[]]]=] wikicode = wikicode .. link_text_output wikicode = wikicode .. mw.getCurrentFrame():expandTemplate { title = 'inflection-table-bottom', } return wikicode .. 
table.concat(categories) end end end local function iffn(val, ...) if type(val) == "function" then return val(...) end return val end function export.etym(frame) local params = { [1] = { alias_of = "lang", }, [2] = { alias_of = "class" }, ["fragment"] = {}, ["nocat"] = { type = "boolean", }, ["lang"] = { type = "language", replaced_by = false, required = true, }, ["class"] = { required = true, }, } local args, extra = m_params.process(frame:getParent().args, params, true) local fixed_indices = {} for k, v in pairs(extra) do if type(k) == "number" then k = k - 2 end fixed_indices[k] = v end extra = fixed_indices if args.lang:getFamily():getCode() ~= lg_sem_arb then error( args.lang:getCode() .. "'s family is " .. args.lang:getFamily():getCode() .. ", not " .. lg_sem_arb ) end local lookup = appendices[mw.ustring.lower(args.class)] local lookup_args = {} if not lookup then error("Unrecognized word type " .. mw.ustring.lower(args.class)) end if lookup.glossary then return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]" end if lookup.params then lookup_args = m_params.process(extra, iffn(lookup.params, args.lang)) end local appendix = nil if lookup.glossary then appendix = "Glosari" else local appendix_lang = args.lang:getCanonicalName() local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], args.lang)) appendix = appendix_lang .. " " .. appendix_title if not mw.title.new(appendix, "Lampiran").exists then appendix_lang = lg_ar:getCanonicalName() appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lg_ar)) appendix = appendix_lang .. " " .. 
appendix_title end end local title = args.class local desc = "" local intro = "" if lookup.derived then intro = "diterbitkan daripada " end if type(lookup) ~= "string" then title = iffn(lookup.title, lookup_args, args.lang) or "" desc = iffn(lookup.desc, lookup_args, args.lang) or "" if mw.ustring.match(args.class, "^%u.*") then if intro == "" then title = ucfirst(title) or "" else intro = ucfirst(intro) or "" end end end local fragment = ( iffn(lookup.fragment, lookup_args, args.lang) or ucfirst(iffn(lookup.title, {pl=true}, args.lang)) or pluralize(ucfirst(args.class)) ) or "" return ( intro .. "[[Lampiran:" .. appendix .. ifelse(fragment, "#" .. fragment, "") .. "|" .. title .. "]] " .. desc ) end return export 1l5ppnwebn031b4xrzw47woeuiwavib 281299 281298 2026-04-21T15:37:52Z Hakimi97 2668 281299 Scribunto text/plain local export = {} local m_str_utils = require("Module:string utilities") local m_utilities = require("Module:utilities") local m_links = require("Module:links") local m_headword = require("Module:headword") local m_langs = require("Module:languages") local m_params = require("Module:parameters") local m_parse_utils = require("Module:parse utilities") local m_affix = require("Module:affix") local m_sc_utils = require("Module:script utilities") local pluralize = require("Module:en-utilities").pluralize local lg_ar = m_langs.getByCode("ar") local lg_sem_arb = lg_ar:getFamily():getCode() local rsplit = m_str_utils.split local rsubn = m_str_utils.gsub local unpack = unpack or table.unpack -- Lua 5.2 compatibility local separator_langs = { ["mt"] = true, ["acy"] = true } local color_langs = { ["mt"] = "red", ["ary"] = "red", ["ar"] = "green", ["shu"] = "yellow" } local template_preview_per_langcode = { ["mt"] = "k-t-b", ["acy"] = "k-t-p" } local lang local sc local function ifelse(cond, yes, no) if cond then return yes end return no end local function ucfirst(str) if str == nil then return str end return mw.language.getContentLanguage():ucfirst(str) end 
local function link(term, alt, id) if term == "" or term == "&mdash;" then return term else return m_links.full_link({ term = term, alt = alt, lang = lang, id = id, }) end end local function parse_inlines(term) return m_parse_utils.parse_inline_modifiers( term, { param_mods = {tr = {}, t = {}, pos = {}}, generate_obj = function(term) return {term} end, } ) end local function make_part(noninline, lang) local keys = {"tr", "t", "pos"} local inline if type(noninline) == "string" then inline = parse_inlines(noninline) else inline = parse_inlines(noninline[1]) end local return_value = { term = m_sc_utils.tag_text(inline[1], lang), } for i, key in ipairs(keys) do if inline[key] and noninline[key] then error( key .. " specified twice: " .. "<" .. key .. ":" .. inline[key] .. ">" .. " and " .. "|" .. key .. "=" .. noninline[key] ) end return_value[key] = inline[key] or noninline[key] end if not return_value.tr then return_value.tr = lang:transliterate(inline[1]) end return return_value end local function make_parts(lang, raw_parts) local parts = {} for i, part in ipairs(raw_parts) do parts[#parts + 1] = make_part(part, lang) end return { parts = parts, lang = lang, sc = lang:findBestScript(parts[1][1]), } end local function show_affix(lang, raw_parts) return m_affix.show_affix( make_parts(lang, raw_parts), {}, lang ) end local appendices = { ["active participle"] = { -- participles have verbal force in (most?) 
vernaculars function(args, lang) return ifelse(lang:getCode() == lg_ar:getCode(), "nominals", "verbs") end, derived = true, }, ["characteristic adjective"] = "nominals", ["color/defect adjective"] = { "nominals", fragment = "Color or defect adjectives", }, ["diminutive"] = "nominals", ["elative"] = "nominals", ["relative"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "suffix", }, ["id"] = {}, ["id1"] = { alias_of = "id", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيَّة", "ـية"), }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, {args.suffix, pos="feminine nisba"}, } ) ) end, }, ["relative-a"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـَة", "ـة"), }, ["t"] = { required = true, }, } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. 
show_affix( lang, { args, {args.suffix, pos="feminine ending"}, } ) ) end, }, ["relative-linking"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), }, ["t"] = { required = true, }, ["linking"] = { required = true, } } end, title = function(args) if args.pl then return "relative adjectives (nisba)" end return "relative adjective (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, { args, args.linking, {args.suffix, pos = "nisba"}, } ) ) end, }, ["relative-linking-noun"] = { "nominals", params = function(lang) return { [1] = { required = true, }, [2] = { alias_of = "t", }, [3] = { alias_of = "linking", }, ["tr"] = {}, ["pl"] = { type = "boolean", }, ["t"] = { required = true, }, ["linking"] = { required = true, }, ["suffix"] = { default = ifelse(lang:getCode() == lg_ar:getCode(), "ـِيّ", "ـي"), } } end, title = function(args) if args.pl then return "relative nouns (nisba)" end return "relative noun (nisba)" end, desc = function(args, lang) if not args[1] then return "" end return ( "composed from " .. show_affix( lang, {args, args.linking, args.suffix} ) ) end, }, ["form"] = { "verbs", params = { [1] = { alias_of = "wazn", }, ["wazn"] = { required = true, }, }, fragment = function(args) return "Form_" .. args.wazn end, title = function(args) return "form " .. 
args.wazn end, }, ["instance noun"] = "nominals", ["noun of place"] = "nominals", ["occupational noun"] = "nominals", ["passive participle"] = { "nominals", derived = true, }, ["reduplicated"] = { glossary = "reduplication", }, ["singulative noun"] = "nominals", ["tool noun"] = "nominals", ["verbal noun"] = "nominals", } local radicals = { ["Arab"] = { ["ء"] = true, ["ب"] = true, ["ت"] = true, ["ث"] = true, ["ج"] = true, ["ح"] = true, ["خ"] = true, ["د"] = true, ["ذ"] = true, ["ر"] = true, ["ز"] = true, ["س"] = true, ["ش"] = true, ["ص"] = true, ["ض"] = true, ["ط"] = true, ["ظ"] = true, ["ع"] = true, ["غ"] = true, ["ف"] = true, ["ق"] = true, ["ك"] = true, ["ل"] = true, ["م"] = true, ["ن"] = true, ["ه"] = true, ["و"] = true, ["ي"] = true, ["گ"] = true, ["چ"] = true, ["پ"] = true, ["ڭ"] = true, }, ["Latn"] = { ["'"] = true, ["b"] = true, ["c"] = true, ["ċ"] = true, ["d"] = true, ["δ"] = true, ["f"] = true, ["ġ"] = true, ["g"] = true, ["għ"] = true, ["h"] = true, ["ħ"] = true, ["j"] = true, ["k"] = true, ["l"] = true, ["m"] = true, ["n"] = true, ["p"] = true, ["q"] = true, ["r"] = true, ["s"] = true, ["ş"] = true, ["t"] = true, ["v"] = true, ["w"] = true, ["x"] = true, ["y"] = true, ["ż"] = true, ["z"] = true, ["θ"] = true, } } local function validateRoot(rootTable, joined_root) if type(rootTable) ~= "table" then error("rootTable is not a table", 2) end local len = #rootTable if len < 3 then error("Root must have at least three radicals.") end if sc == nil then sc = lang:findBestScript(joined_root):getCode() end for i, radical in ipairs(rootTable) do if not radicals[sc][radical] then error("Unrecognized radical " .. radical .. " in " .. 
joined_root) end end end function export.root(frame) local output = {} local categories = {} local title = mw.title.getCurrentTitle() local namespace = title.nsText local fulltitle = title.fullText if frame.args["lang"] then lang = require("Module:languages").getByCode(frame.args["lang"]) else error("Please provide a language code.") end local subpage = "Lampiran:Akar bahasa " .. lang:getCanonicalName() .. "/" local fulltitle = rsubn(fulltitle, rsubn(subpage, "([^%w])", "%%%1"), "") local params = { [1] = { list = true }, ["nocat"] = { type = "boolean" }, ["plain"] = { type = "boolean" }, ["notext"] = { type = "boolean" }, ["sense"] = {} } local args = require("Module:parameters").process(frame:getParent().args, params) local rootLetters = {} local roots = args[1] local plain = args["plain"] if frame.args["plain"] then plain = true end local langCode = lang:getCode() local separator = " " if separator_langs[langCode] then separator = "-" else separator = " " end local roots_len = #roots if #roots == 0 and namespace == "Templat" then if template_preview_per_langcode[langCode] ~= nil then table.insert(rootLetters, rsplit(template_preview_per_langcode[langCode], separator)) else table.insert(rootLetters, rsplit("ك ت ب", separator)) end elseif #roots ~= 0 then for _, root in ipairs(roots) do table.insert(rootLetters, rsplit(root, separator)) end else table.insert(rootLetters, rsplit(fulltitle, separator)) end local joined_roots = {} for i, rootLetter in ipairs(rootLetters) do table.insert(joined_roots, table.concat(rootLetter, separator)) validateRoot(rootLetter, joined_roots[i]) end local sense = args["sense"] local sense_formatted = "" if sense ~= nil then sense_formatted = " (" .. sense .. ") " end if fulltitle == joined_roots[1] then if namespace == "" then error("The root page should be in the Appendix namespace. Please move it to : [[" .. subpage .. joined_roots[1] .. 
"]]") end if roots_len > 1 then error("There should be only one root.") end table.insert(output, m_headword.full_headword({ lang = lang, pos_category = "roots", categories = {}, heads = { fulltitle }, nomultiwordcat = true, noposcat = true })) if args["nocat"] then return table.concat(output) else return table.concat(output) .. table.concat(categories) end else local link_texts = {} local term_counts = {} for i, joined_root in ipairs(joined_roots) do local link_text = subpage .. joined_root table.insert(link_texts, link(link_text, joined_root .. sense_formatted, sense)) table.insert( categories, m_utilities.format_categories( { "Perkataan bahasa " .. lang:getCanonicalName() .. " dengan akar " .. joined_root .. sense_formatted }, lang) ) table.insert(term_counts, mw.site.stats.pagesInCategory( "Perkataan bahasa " .. lang:getCanonicalName() .. " dengan akar " .. joined_root .. sense_formatted, "pages") ) end if args["nocat"] or plain then if args["nocat"] then return table.concat(link_texts, ", ") else return table.concat(link_texts, ", ") .. table.concat(categories) end else local link_text_output = "" for i, link_text in ipairs(link_texts) do link_text_output = link_text_output .. "\n|-\n| " .. link_text .. "\n|-\n| [[:Kategori:Perkataan bahasa " .. lang:getCanonicalName() .. " dengan akar " .. joined_roots[i] .. sense_formatted .. "|" .. term_counts[i] .. " perkataan" .. (term_counts[i] == 1 and "" or "") .. "]]\n" end local color = "grey" if color_langs[langCode] ~= nil then color = color_langs[langCode] end local wikicode = mw.getCurrentFrame():expandTemplate { title = 'inflection-table-top', args = { title = "-", palette = color, class = "floatright tr-alongside" } } wikicode = wikicode .. [=[ ! [[w:Akar bahasa-bahasa Samiah|Akar]=] .. (#term_counts == 1 and "" or "") .. [=[]]]=] wikicode = wikicode .. link_text_output wikicode = wikicode .. mw.getCurrentFrame():expandTemplate { title = 'inflection-table-bottom', } return wikicode .. 
table.concat(categories) end end end local function iffn(val, ...) if type(val) == "function" then return val(...) end return val end function export.etym(frame) local params = { [1] = { alias_of = "lang", }, [2] = { alias_of = "class" }, ["fragment"] = {}, ["nocat"] = { type = "boolean", }, ["lang"] = { type = "language", replaced_by = false, required = true, }, ["class"] = { required = true, }, } local args, extra = m_params.process(frame:getParent().args, params, true) local fixed_indices = {} for k, v in pairs(extra) do if type(k) == "number" then k = k - 2 end fixed_indices[k] = v end extra = fixed_indices if args.lang:getFamily():getCode() ~= lg_sem_arb then error( args.lang:getCode() .. "'s family is " .. args.lang:getFamily():getCode() .. ", not " .. lg_sem_arb ) end local lookup = appendices[mw.ustring.lower(args.class)] local lookup_args = {} if not lookup then error("Unrecognized word type " .. mw.ustring.lower(args.class)) end if lookup.glossary then return "[[Lampiran:Glosari#" .. lookup.glossary .. "]]" end if lookup.params then lookup_args = m_params.process(extra, iffn(lookup.params, args.lang)) end local appendix = nil if lookup.glossary then appendix = "Glosari" else local appendix_lang = args.lang:getCanonicalName() local appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], args.lang)) appendix = appendix_lang .. " " .. appendix_title if not mw.title.new(appendix, "Lampiran").exists then appendix_lang = lg_ar:getCanonicalName() appendix_title = ifelse(type(lookup) == "string", lookup, iffn(lookup[1], lg_ar)) appendix = appendix_lang .. " " .. 
appendix_title end end local title = args.class local desc = "" local intro = "" if lookup.derived then intro = "diterbitkan daripada " end if type(lookup) ~= "string" then title = iffn(lookup.title, lookup_args, args.lang) or "" desc = iffn(lookup.desc, lookup_args, args.lang) or "" if mw.ustring.match(args.class, "^%u.*") then if intro == "" then title = ucfirst(title) or "" else intro = ucfirst(intro) or "" end end end local fragment = ( iffn(lookup.fragment, lookup_args, args.lang) or ucfirst(iffn(lookup.title, {pl=true}, args.lang)) or pluralize(ucfirst(args.class)) ) or "" return ( intro .. "[[Lampiran:" .. appendix .. ifelse(fragment, "#" .. fragment, "") .. "|" .. title .. "]] " .. desc ) end return export ofaljfp3gndnd0wqj1h0e8dt16klnoq Modul:place/locations 828 76177 281433 264770 2026-04-22T09:56:11Z PeaceSeekers 3334 281433 Scribunto text/plain local export = {} export.force_cat = false -- set to true to force category generation even on non-mainspace pages local m_table = require("Module:table") local string_utilities_module = "Module:string utilities" local en_utilities_module = "Module:en-utilities" local insert = table.insert local concat = table.concat local dump = mw.dumpObject local unpack = unpack or table.unpack -- Lua 5.2 compatibility --[==[ intro: This module contains data on all known locations, along with some lower-level code to process them (higher-level known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using mw.loadData(). ===Location data=== '''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]], especially the section `More about known locations`.''' The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations and their relationships. 
Locations are grouped into ''location groups'' that share some common properties (examples are states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table'' that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given location is generally described by three values: (a) the group metadata table for the group the location is part of; (b) the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()` function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the arguments to many functions. In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases for a given location and the alias keys only need to be unique within a particular group data table, not across all groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another group. 
(For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations, canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have per-group defaults, but only global defaults. The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys: * Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories) and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and placenames, which is critical to understand when working with location data.) 
This also applies to constituent countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena, Ascension and Tristan da Cunha). * Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above. Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`, `Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name in Spain, even though none of those cities are large enough to be included as known locations in this module. (The cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.) * Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent territories, use a qualified key that contains the name of the country or constituent country in it, e.g. `Normandy, France` (a region), `Calvados, France` (a department in the region of Normandy), `Herefordshire, England` (a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, Finland` (a region), `Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, Ireland` (a county) and `New York, USA` (a state). 
As shown in these various examples, (a) first and second-level divisions are sometimes both included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States or United States of America); (c) the word `the` is not normally included in the key even if the location is normally preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this), except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates an apparent redundancy, as with `Central Finland, Finland`; and (e) sometimes the placetype is included in the key, as with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "negeri", "Oblast" etc. in their name because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County" preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article naming scheme for a given administrative division is a strong clue as to how the division is normally referred to, and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. 
North and South Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.) As mentioned above, associated with canonical keys in the group data table are location specs, which are objects containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''. Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a given location property. (The initialization process also does more transformations in a few cases, noted below.) Note that the default value of a given property is stored in the group metadata table under a key whose name is the property name prefixed with `default_`; for example, the default value corresponding to the `placetype` property of a given location is specified in the `default_placetype` key in the group metadata table. The following are the properties of the location spec. * `placetype`: String specifying the placetype of the location (e.g. "negara", "negeri", "province"). This can also be a table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the group level, or an error occurs. 
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the immediate ''container'' (or containers) of the given location. A container is another location which this location is considered to be directly part of, either politically or (above the country level) geographically. Some locations belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]]) of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed the ''container trail'', and some functions compute and return this trail as part of their operation. When a location spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a list of canonicalized container structures, each of which is of the form `{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the placetype from the container structure.) The list of canonicalized container structures is stored into the `.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The canonicalization process is described in more detail below under [[#Container spec canonicalization]]. * `divs`: List of recognized political divisions; e.g. 
for the Netherlands, a specification of the form `divs = {"provinces", "municipalities"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]] and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the same format as `divs`. This is intended to be used in the situation where some division types are shared among all locations in the group and others differ from location to location. An example where this is used is the United States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties` and `county seats` are specified in the group-level `default_divs` because not all states have counties and county seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property associated with the division type), any division type specified on a sub-country-level location must also be specified on all containers up through the country. For example, since French departments specify `communes` and `municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for France itself. 
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is normal for locations) that computes the location description can also be given. This is used, for example, for Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the keydesc is replaced with the default value of the location description, which specifies the location's placename, placetype, and the corresponding values for each container in the container trail, generally up through (but not beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct the full description of various categories, such as bare location categories, whose description generally reads `"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the specified or auto-constructed location description. * `fulldesc`: String overriding the full description for the bare location category (but not for any other category). This is currently used only for the location `Earth`, at the very top of the tree (because the standard `people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent inhabitants). FIXME: This should be renamed `bare_category_fulldesc`. * `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories generated based on the immediate container(s). 
For example, `Hawaii, USA` specifies `Polynesia` as an additional parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category) as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME: This should be renamed `bare_category_addl_parents`. * `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase `province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category pages, are shown in the upper right of bare category pages. 
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`. It rarely needs to be specified because the category page and the article page almost always follow the same format. * `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the MediaWiki Commons site listing articles and categories relevant to the location). It has the same format as `wp` and `wpcat` and defaults to `wpcat`, which is usually (but not always) correct. * `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in category names such as [[:Category:Cities in the Northern Territory, Australia]] and in old-style place descriptions when the location occurs as the first holonym, such as the city [[Darwin]] described using {{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean properties is {nil}, which amounts to the same as {false}. * `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as [[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. 
The general principle used in setting this is that all countries in Europe, all dependent territories of any such country, all former British colonies, and any dependent territories of these former colonies, are assumed to use British spelling, while all other countries and associated dependent territories are assumed to use American spelling. This can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire, Gibraltar), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and (through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever the group-level `default_placetype == "city"`, so that all cities get it set without explicitly needing to add a group-level setting for this. Note that the condition `default_placetype == "city"` intentionally excludes Chinese prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods, but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to categories like [[:Category:Rivers in Osaka Prefecture, Japan]] and [[:Category:Cities in Wuhan]] for holonyms that are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like [[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities; (c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location. (Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!

====Location divisions====

The `divs` field of a location describes the recognized political division types of that location. Specifying a given division type will cause places defined as being of the specified division type and with the location as a holonym to be categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United States has `"negeri"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under [[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for "generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a `generic_before_cities` field if the location is a city); this includes things like cities, towns, villages, neighbo(u)rhoods and rivers.
A given element in the `divs` list is usually a string naming a plural placetype; the placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field (if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which gets most things right but has problems with `passes` and `fortresses`, which are singularized to `passe` and `fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with the placetype. An example of this is the `divs` list for Canada:
{
	["Canada"] = {divs = {
		{type = "provinces", cat_as = "provinces and territories"},
		{type = "territories", cat_as = "provinces and territories"},
		"counties",
		"districts",
		"municipalities",
		"regional municipalities",
		"rural municipalities",
		"parishes",
		"Indian reserves",
		"census divisions",
		{type = "townships", prep = "di"},
	}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and territories. Similar things are done for other countries that have more than one type of first-level administrative division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and territories!` (the use of an exclamation point following a plural placetype means that the placetype is present only for use in categories and won't be recognized as the placetype field in a {{tl|place}} description).
In addition, townships are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be [[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat related to whether a given placetype is an official administrative or statistical division of the location in question and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities and towns.)

Another more complex example is the divisions given for Quebec:
{
	["Quebec, Canada"] = {divs = {
		"counties",
		{type = "regional county municipalities", container_parent_type = "regional municipalities"},
		{type = "regions", container_parent_type = false},
		{type = "townships", prep = "di"},
		{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
		{type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}},
		{type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}},
	}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the `container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be [[:Category:Regional municipalities of Canada]].
Likewise, `regions` as administrative divisions (as opposed to mere geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and `village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize `parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties, just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "counties"}` means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly, `township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not'' [[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]].

====Container spec canonicalization====

A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'', each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The `placetype` field should correspond to the first (canonical) placetype listed for the key in question.

The process of initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and removes the spec from `.container`.
It works as follows: # If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place. For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies `default_container = "Brazil"`. # A single string or canonicalized container object is allowed and made into a one-element list. # If a list element is a string that did ''not'' come from `default_container`, and there is a group-level `canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get a canonicalized container object. # Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to `"negara"`. ====Alias keys==== Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec structure from canonical keys. This structure does not, in general, have defaults at the group level and is not initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location spec: * `alias_of`: The canonical key of which this key is an alias. Required. * `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the` but does not pay attention to the value of `the` for the corresponding canonical key. * `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise, the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display canonicalizing. * `placetype`: The placetype of the alias. 
Rarely needs to be specified as it defaults to the canonical key's placetype, and if that is unspecified, to the group-level default placetype. ====Location group metadata tables==== As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only keys, which are mostly functions. The following are the possible group-only keys: * `data`: This points to the group data table for the group, as described above. * `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias) into the full and elliptical placenames. The difference between full and elliptical placenames is described in the documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g. `Phuket Province, Thailand` or `County Mayo, Ireland`), in which case the full placename includes the placetype and the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or `Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is `Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as `State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs. 
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key, and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to chop off anything starting with a comma and return the result as both full and elliptical placename, and if specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be defined, it is best to use the helper function `make_key_to_placename`, if possible (or `make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default implementation and such) rather than directly calling the function in the `key_to_placename` field. * `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this (generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or `make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to `key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged as the key. Otherwise, the default algorithm works as follows: *# If the group-level `default_placetype == "city"`, use the placename unchanged as the key. *# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma + space and use the result as the key. 
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and `placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field to the placename after a comma + space and use the result as the key. *# Otherwise, use the placename unchanged as the key. * `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string, to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own. * `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the location. See [[#Location divisions]] for more details. ]==] ----------------------------------------------------------------------------------- -- Helper functions -- ----------------------------------------------------------------------------------- --[==[ Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like this). ]==] function export.process_error(fmt, ...) local args = {...} for i = 1, select("#", ...) do args[i] = dump(args[i]) end return error(string.format(fmt, unpack(args))) end --[==[ Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user error triggered by bad input or a system error due to something like running out of memory or hitting a time limit). 
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. ]==] function export.internal_error(fmt, ...) export.process_error("Internal error: " .. fmt, ...) end local internal_error = export.internal_error -- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If -- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item` -- equals `list_or_element`. local function list_or_element_contains(list_or_element, item) if type(list_or_element) == "table" then return m_table.contains(list_or_element, item) and true or false end return list_or_element == item end --[==[ Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full `"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical placenames are computed by chopping off anything starting with a comma. ]==] function export.key_to_placename(group, key) if group.key_to_placename == false then return key, key end if group.key_to_placename then local full_placename, elliptical_placename = group.key_to_placename(key) if type(full_placename) ~= "string" then internal_error("Key %s returned a non-string full placename: %s", key, full_placename) end if type(elliptical_placename) ~= "string" then internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename) end return full_placename, elliptical_placename end key = key:gsub(",.*", "") return key, key end --[==[ Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames) and return the result. 
If `placename_to_key` exists with the value `false`, return the placename unchanged. If the group does not define `placename_to_key`, and it defines a `default_container` whose placetype is either `country` or `constituent country`, the container name is appended to the placename after a comma and a space. Otherwise the placename is returned unchanged. ]==] function export.placename_to_key(group, placename) if group.placename_to_key == false then return placename elseif group.placename_to_key then local key = group.placename_to_key(placename) if type(key) ~= "string" then internal_error("Placename %s returned a non-string key: %s", placename, key) end return key elseif group.default_placetype == "city" then return placename else local defcon = group.default_container if not defcon then return placename elseif type(defcon) == "string" then return placename .. ", " .. defcon elseif type(defcon) == "table" and (defcon.placetype == "negara" or defcon.placetype == "constituent country") then return placename .. ", " .. defcon.key else return placename end end end --[==[ Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and `placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more than one. Containers should be carefully distinguished from category parents. Generally the container is the first category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents, which indicate some sort of relation between the category parent and the location but not necessarily one of containment.) This function is idempotent in that nothing happens if called more than once on the same spec. 
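As an illustration only, the container canonicalization performed by this function can be sketched in isolation as follows. The helper name `canonicalize_containers` and the sample group table are hypothetical and exist only for this example; the fallback placetype `"negara"` is copied from the code in this module.

```lua
-- Hypothetical standalone sketch; `canonicalize_containers` is not defined in this module.
local function canonicalize_containers(group, spec)
	local container = spec.container
	local from_default = false
	if container == nil then
		-- Fall back to the group-level default when the spec has no container.
		container = group.default_container
		from_default = true
	end
	if container == nil then
		return nil
	end
	-- A lone string or a lone {key = ..., placetype = ...} object becomes a one-element list.
	if type(container) == "string" or container.key then
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				-- Explicit string containers go through the group's canonicalizer, if any.
				cont = group.canonicalize_key_container(cont)
			else
				-- Remaining strings are taken to name countries.
				cont = {key = cont, placetype = "negara"}
			end
		end
		table.insert(containers, cont)
	end
	return containers
end

-- Sample data modeled on the Brazilian states case: no `container`, group default "Brazil".
local sample_group = {default_container = "Brazil"}
local result = canonicalize_containers(sample_group, {})
print(result[1].key, result[1].placetype)
```

The sketch returns the list instead of storing it in `spec.containers`, which keeps the example free of side effects.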
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables. ]==] function export.initialize_spec(group, key, spec) if spec.initialized then return end local container = spec.container local containers local container_from_default if not container then container = group.default_container container_from_default = true end if container then if type(container) == "string" or container.key then container = {container} end containers = {} for _, cont in ipairs(container) do if type(cont) == "string" then if group.canonicalize_key_container and not container_from_default then cont = group.canonicalize_key_container(cont) else cont = {key = cont, placetype = "negara"} end end insert(containers, cont) end end spec.containers = containers spec.container = nil local function value_with_default(val, default_val) if val == nil then return default_val else return val end end local function set_or_default(prop) spec[prop] = value_with_default(spec[prop], group["default_" .. prop]) end set_or_default("placetype") if not spec.placetype then internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec) end set_or_default("divs") spec.addl_divs = group.addl_divs for _, prop in ipairs { "keydesc", "fulldesc", "addl_parents", "overriding_bare_label_parents", "bare_category_parent_type", "wp", "wpcat", "commonscat", "british_spelling", "the", "no_container_cat", "no_container_parent", "no_generic_place_cat", "no_check_holonym_mismatch", "no_auto_augment_container", "no_include_container_in_desc", "is_city", "is_former_place", } do set_or_default(prop) end -- `default_placetype == "city"` is correct; if `default_placetype` has something else like `prefecture-level city` -- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as -- is_city. 
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city") spec.initialized = true end --[=[ Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values: the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object, which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the property in question). `alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"} except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key, and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string. 
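The three alias-resolution behaviors described above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not this module's API: the function `resolve` and every key in `sample_data` are invented for the example, and placetype checking and spec initialization are left out.

```lua
-- Hypothetical standalone sketch of the alias-resolution modes; the data below is invented.
local sample_data = {
	["Canonical Place"] = {placetype = "city"},
	["Category Alias"] = {alias_of = "Canonical Place"},
	["Display Alias"] = {alias_of = "Canonical Place", display = true},
}

local function resolve(data, key, alias_resolution)
	local spec = data[key]
	if not spec then
		return nil
	end
	if not spec.alias_of then
		-- Canonical keys are returned unchanged in every mode.
		return key, spec
	end
	local resolved_key = spec.alias_of
	if alias_resolution == "none" or alias_resolution == "display" then
		if alias_resolution == "display" then
			if spec.display == true then
				key = resolved_key
			elseif spec.display then
				key = spec.display
			end
		end
		-- The alias spec itself is returned, so callers can check `alias_of`.
		return key, spec
	end
	-- Default: follow the alias to the canonical key and spec.
	return resolved_key, data[resolved_key]
end

-- The extra parentheses keep only the first return value (the key).
print((resolve(sample_data, "Category Alias")))
print((resolve(sample_data, "Display Alias", "none")))
print((resolve(sample_data, "Display Alias", "display")))
```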
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known). ]=] local function find_matching_key_in_group(group, placetypes, key, alias_resolution) if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and alias_resolution ~= "all" then internal_error("Bad value for 'alias_resolution': %s", alias_resolution) end local spec = group.data[key] if not spec then return nil end local function check_correct_placetype(placetype) if type(placetype) == "table" then for _, pt in ipairs(placetype) do if list_or_element_contains(placetypes, pt) then return true end end return false else return list_or_element_contains(placetypes, placetype) end end if spec.alias_of then local resolved_key = spec.alias_of local resolved_spec = group.data[resolved_key] if not resolved_spec then internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key) elseif resolved_spec.alias_of then internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed", key, resolved_key) end if alias_resolution == "none" or alias_resolution == "display" then -- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group. local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype if not placetype then internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " .. 
					"`default_placetype`", key, spec, resolved_spec)
			end
			if not check_correct_placetype(placetype) then
				return nil
			end
			if alias_resolution == "display" then
				if spec.display == true then
					key = resolved_key
				elseif spec.display then
					key = spec.display
				end
			end
			return key, spec
		end
		key = resolved_key
		spec = resolved_spec
	end

	-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
	local placetype = spec.placetype or group.default_placetype
	if not placetype then
		internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
	end
	if not check_correct_placetype(placetype) then
		return nil
	end
	export.initialize_spec(group, key, spec)
	return key, spec
end

--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys. `alias_resolution` is as in `find_matching_key_in_group()`.

This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
	local key = export.placename_to_key(group, placename)
	return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end

--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec. If no such key exists, return {nil}.
This throws an internal error if two locations with the same key are found. ]==] function export.find_canonical_key(key) local found_locations = {} for _, group in ipairs(export.locations) do local spec = group.data[key] if not spec then -- do nothing elseif spec.alias_of then mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of)) else insert(found_locations, {group, spec}) end end if not found_locations[1] then return nil elseif found_locations[2] then internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations) else local group, spec = unpack(found_locations[1]) export.initialize_spec(group, key, spec) return group, spec end end --[==[ Iterator that returns all locations matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator returns three values at each iteration: the location group, canonical key by which the location is known and the spec object describing the location. `data` contains the following possible fields: * `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string specifying a placetype, which must match one of the location's placetypes. This must be specified. * `placename`: The placename of the location. Either this or `key` must be specified. * `key`: The key of the location. Either this or `placename` must be specified. * `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`. The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if `alias_resolution` is given and the specified key or placename is an alias; see the documentation for `find_matching_key_in_group`). 
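For readers unfamiliar with closure-based iterators, the general pattern can be sketched in isolation as follows. This is a simplified illustration, not this module's API: `iterate_matches`, the `name` field and the sample groups are all invented for the example, and matching is reduced to a plain key lookup.

```lua
-- Hypothetical standalone sketch of the closure-based iterator pattern; not this module's API.
local function iterate_matches(groups, wanted_key)
	local i = 0
	local n = #groups
	return function()
		while true do
			i = i + 1
			if i > n then
				-- Returning nil ends a generic `for` loop.
				return nil
			end
			local group = groups[i]
			local spec = group.data[wanted_key]
			if spec then
				return group, wanted_key, spec
			end
		end
	end
end

-- Invented sample groups; "Springfield" stands in for a key found in more than one group.
local sample_groups = {
	{name = "group A", data = {["Springfield"] = {placetype = "city"}}},
	{name = "group B", data = {["Shelbyville"] = {placetype = "city"}}},
	{name = "group C", data = {["Springfield"] = {placetype = "town"}}},
}

for group, key, spec in iterate_matches(sample_groups, "Springfield") do
	print(group.name, key, spec.placetype)
end
```

The counter `i` lives in the closure, so each call to `iterate_matches` yields an independent iterator that resumes where the previous step stopped.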
]==]
function export.iterate_matching_location(data)
	local i = 0
	local n = #export.locations
	return function()
		while true do
			i = i + 1
			if i > n then
				break
			end
			local group = export.locations[i]
			local key, spec
			if data.placename then
				key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
					data.alias_resolution)
			else
				if not data.key then
					internal_error("'.placename' or '.key' must be defined: %s", data)
				end
				key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
			end
			if key then
				return group, key, spec
			end
		end
	end
end

--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
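
The "exactly one match" contract can be sketched in isolation as follows. This is an illustration under stated
assumptions, not the module's code: `exactly_one` and `list_iter` are hypothetical helpers, and plain `error()` stands
in for the module's `internal_error()`.

```lua
-- Portable unpack: Lua 5.1 (as used by Scribunto) has a global `unpack`, later versions `table.unpack`.
local unpack = table.unpack or unpack

-- Hypothetical helper: drain an iterator and insist on exactly one result.
local function exactly_one(iter, description)
	local all_found = {}
	for a, b, c in iter do
		all_found[#all_found + 1] = {a, b, c}
	end
	if not all_found[1] then
		error("Couldn't find a match for " .. description)
	elseif all_found[2] then
		error("Found multiple matches for " .. description)
	end
	return unpack(all_found[1])
end

-- Hypothetical helper: turn a list of {group, key, spec} triples into a closure-based iterator.
local function list_iter(items)
	local i = 0
	return function()
		i = i + 1
		local item = items[i]
		if item then
			return item[1], item[2], item[3]
		end
	end
end

-- One match: the triple is returned unchanged.
local group, key, spec = exactly_one(list_iter({{"countries", "France", "spec"}}), "France")
-- Zero or several matches: an error is raised, caught here with pcall for demonstration.
local ok_none = pcall(exactly_one, list_iter({}), "Atlantis")
local ok_many = pcall(exactly_one, list_iter({{"a", "x", 1}, {"b", "x", 2}}), "x")
```

Raising an error on zero or multiple matches suits internally specified data, where either outcome indicates a mistake
in the data rather than in user input.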
]==]
function export.get_matching_location(data)
	local all_found = {}
	for group, key, spec in export.iterate_matching_location(data) do
		insert(all_found, {group, key, spec})
	end
	if not all_found[1] then
		internal_error("Couldn't find matching location for data %s", data)
	elseif all_found[2] then
		internal_error("Found multiple matching locations for data %s: %s", data, all_found)
	else
		return unpack(all_found[1])
	end
end

--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
internal error is thrown if a container loop is detected. The value returned at each iteration is a list of location
objects, each of which contains `group`, `key` and `spec` fields.
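
The level-by-level traversal with duplicate suppression can be sketched on the Russia example as follows. The table
`containers_of` and the function `iterate_levels` are hypothetical simplifications (plain string keys instead of
location objects, `error()` instead of `internal_error()`); the cap of ten iterations is the loop guard.

```lua
-- Hypothetical container graph using plain keys instead of full location objects.
local containers_of = {
	["Russia"] = {"Europe", "Asia"},
	["Europe"] = {"Eurasia"},
	["Asia"] = {"Eurasia"},
	["Eurasia"] = {},
}

-- Each call of the returned closure yields the next level of not-yet-seen containers.
local function iterate_levels(start_key)
	local seen = {[start_key] = true}
	local last_level = {start_key}
	local iterations = 0
	return function()
		iterations = iterations + 1
		if iterations > 10 then
			error("Probable loop in containers when processing key " .. start_key)
		end
		local next_level = {}
		for _, key in ipairs(last_level) do
			for _, container in ipairs(containers_of[key] or {}) do
				if not seen[container] then
					seen[container] = true
					next_level[#next_level + 1] = container
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		last_level = next_level
		return next_level
	end
end

-- Collect each level as a comma-separated string for inspection.
local levels = {}
for level in iterate_levels("Russia") do
	levels[#levels + 1] = table.concat(level, ",")
end
```

Here the first level is Europe and Asia and the second is Eurasia alone: it is reached from both Europe and Asia but
recorded only once.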
]==]
function export.iterate_containers(group, key, spec)
	local keys_seen = {}
	keys_seen[key] = true
	local iterations = 0
	local last_iteration_containers = {{group = group, key = key, spec = spec}}
	return function()
		iterations = iterations + 1
		if iterations > 10 then
			internal_error("Probable loop in containers when processing key %s", key)
		end
		local next_iteration_containers = {}
		for _, location in ipairs(last_iteration_containers) do
			local containers = location.spec.containers
			if containers then
				for _, container in ipairs(containers) do
					local container_group, container_key, container_spec = export.get_matching_location {
						placetypes = container.placetype,
						key = container.key,
					}
					if not keys_seen[container_key] then
						insert(next_iteration_containers, {
							group = container_group, key = container_key, spec = container_spec
						})
						keys_seen[container_key] = true
					end
				end
			end
		end
		if not next_iteration_containers[1] then
			return nil
		end
		last_iteration_containers = next_iteration_containers
		return next_iteration_containers
	end
end

--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
	local linked_placename = display_form and placename ~= display_form and
		("[[%s|%s]]"):format(placename, display_form) or ("[[%s]]"):format(placename)
	if spec.the then
		linked_placename = "the " .. linked_placename
	end
	return linked_placename
end

--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical
placenames.)
`container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of the key,
which will be used to remove that container. If multiple patterns are specified, each one is tried until one matches.
If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full
placename.

If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match and remove
the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain countries
(such as South Korean and North Korean counties, which include the word "County" in the key). The resulting chopped
string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped and the
full and elliptical placenames are the same.

Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
	if type(container_patterns) == "string" then
		container_patterns = {container_patterns}
	end
	if type(divtype_patterns) == "string" then
		divtype_patterns = {divtype_patterns}
	end
	return function(key)
		local full_placename = key
		if container_patterns then
			for _, container_pattern in ipairs(container_patterns) do
				local nsubs
				full_placename, nsubs = full_placename:gsub(container_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		local elliptical_placename = full_placename
		if divtype_patterns then
			for _, divtype_pattern in ipairs(divtype_patterns) do
				local nsubs
				elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
				if nsubs > 0 then
					break
				end
			end
		end
		return full_placename, elliptical_placename
	end
end

--[=[
This is typically used to define `placename_to_key`.
It generates a function that appends a string to the end of a given placename to get the key (see the definition of
`placename_to_key` above in the documentation under "Location group tables"). Optional `divtype_suffix` is a raw string
(which should not contain hyphens or other characters that have special meaning in Lua patterns) to be appended first
to the placename; if already present at the end, it is not appended. `container_suffix`, if given, is then appended
(as written, without a corresponding check for whether it is already present).

Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`) or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
	return function(placename)
		local key = placename
		if divtype_suffix then
			if not key:find(divtype_suffix .. "$") then
				key = key .. divtype_suffix
			end
		end
		if container_suffix then
			key = key .. container_suffix
		end
		return key
	end
end

--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is.

Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "province"}`.
]=]
local function make_canonicalize_key_container(suffix, placetype)
	return function(container)
		if type(container) == "string" then
			return {key = container ..
(suffix or ""), placetype = placetype}
		else
			return container
		end
	end
end

-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------

export.continents = {
	["Bumi"] = {the = true, placetype = "planet", addl_parents = {"alam semula jadi"},
		fulldesc = "=the planet [[Earth]] and the features found on it"},
	["Afrika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
	["Amerika"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
		keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined", wp = "Amerika"},
	["America"] = {alias_of = "Amerika", the = true},
	["Amerika Utara"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}},
	["Caribbean"] = {the = true, placetype = {"kawasan benua", "region"},
		container = {key = "Amerika Utara", placetype = "benua"}},
	["Amerika Tengah"] = {placetype = {"kawasan benua", "region"},
		container = {key = "Amerika Utara", placetype = "benua"}},
	["Amerika Selatan"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}},
	["Antartika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"},
		fulldesc = "=the territory of [[Antarctica]]"},
	["Eurasia"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"},
		keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
	["Asia"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
	["Eropah"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}},
	["Oceania"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}},
	["Melanesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
	["Micronesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
	["Polynesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}},
}

export.continents_group = {
	default_overriding_bare_label_parents = {}, -- container parents should be used
	default_divs = {{type = "negara", prep = "di"}},
	-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
	-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
	default_no_include_container_in_desc = true,
	default_no_container_cat = true,
	default_no_container_parent = true,
	default_no_auto_augment_container = true,
	default_no_generic_place_cat = true,
	-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
	-- this level. We also run into problems with supercontinents, which have "benua" as the fallback and cause
	-- mismatches.
	default_no_check_holonym_mismatch = true,
	data = export.continents,
}

-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = { ["Afghanistan"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Albania"] = {container = "Eropah", divs = {"counties", "municipalities", "communes", {type = "administrative units", cat_as = "communes"}, }, british_spelling = true}, ["Algeria"] = {container = "Afrika", divs = {"provinces", "communes", "districts", "municipalities"}}, ["Andorra"] = {container = "Eropah", divs = {"parishes"}, british_spelling = true}, ["Angola"] = {container = "Afrika", divs = {"provinces", "municipalities"}}, ["Antigua and Barbuda"] = {container = "Caribbean", divs = {"provinces"}, british_spelling = true}, ["Argentina"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}}, ["Armenia"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts", "municipalities"}, british_spelling = true}, ["Republic of Armenia"] = {alias_of = "Armenia", the = true}, -- differs in "the" -- Both a country and continent ["Australia"] = {container = "Oceania", divs = { {type = "negeri", cat_as = "states and territories"}, {type = "territories", cat_as = "states and territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"}, "local government areas", "dependent territories", }, british_spelling = true}, ["Austria"] = {container = "Eropah", divs = {"negeri", "districts", "municipalities"}, british_spelling = true}, ["Azerbaijan"] = {container = {"Eropah", "Asia"}, divs = {"districts", "municipalities"}, british_spelling = true}, ["Bahamas"] = {the = true, container = "Caribbean", divs = {"districts"}, british_spelling = true, wp = "The %l"}, ["Bahrain"] = {container = "Asia", divs = {"governorates"}}, ["Bangladesh"] = {container = "Asia", divs = {"divisions", "districts", "municipalities"}, british_spelling = true}, ["Barbados"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = 
true}, ["Belarus"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true}, ["Belgium"] = {container = "Eropah", divs = {"regions", "provinces", "municipalities"}, british_spelling = true}, ["Belize"] = {container = "Amerika Tengah", divs = {"districts"}, british_spelling = true}, ["Benin"] = {container = "Afrika", divs = {"departments", "communes"}}, ["Bhutan"] = {container = "Asia", divs = {"districts", "gewogs"}}, ["Bolivia"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}}, ["Bosnia and Herzegovina"] = {container = "Eropah", divs = {"entities", "cantons", "municipalities"}, british_spelling = true}, ["Bosnia and Hercegovina"] = {alias_of = "Bosnia and Herzegovina", display = true}, ["Bosnia"] = {alias_of = "Bosnia and Herzegovina", display = true}, ["Botswana"] = {container = "Afrika", divs = {"districts", "subdistricts"}, british_spelling = true}, ["Brazil"] = {container = "Amerika Selatan", divs = { "negeri", "municipalities", "macroregions", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Brunei"] = {container = "Asia", divs = {"districts", "mukims"}, british_spelling = true}, ["Bulgaria"] = {container = "Eropah", divs = {"provinces", "municipalities"}, british_spelling = true}, ["Burkina Faso"] = {container = "Afrika", divs = {"regions", "departments", "provinces"}}, ["Burundi"] = {container = "Afrika", divs = {"provinces", "communes"}}, ["Cambodia"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Cameroon"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Kanada"] = {container = "Amerika Utara", divs = { {type = "provinces", cat_as = "provinces and territories"}, {type = "territories", cat_as = "provinces and territories"}, {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of provinces and territories"}, "counties", "districts", 
"municipalities", "regional municipalities", "rural municipalities", "parishes", -- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless -- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is -- still at [[w:Indian reserves]]). "Indian reserves", "census divisions", {type = "townships", prep = "di"}, }, british_spelling = true}, ["Cape Verde"] = {container = "Afrika", divs = {"municipalities", "parishes"}}, ["Central African Republic"] = {the = true, container = "Afrika", divs = {"prefectures", "subprefectures"}}, ["Chad"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Chile"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "communes"}}, ["China"] = {container = "Asia", divs = { {type = "provinces", cat_as = "provinces and autonomous regions"}, {type = "autonomous regions", cat_as = "provinces and autonomous regions"}, {type = "FORMER provinces", cat_as = "former provinces"}, "special administrative regions", "prefectures", {type = "FORMER prefectures", cat_as = "former prefectures"}, "prefecture-level cities", {type = "counties", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, {type = "FORMER counties", cat_as = "former counties and county-level cities"}, {type = "FORMER county-level cities", cat_as = "former counties and county-level cities"}, -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities. 
"districts", {type = "FORMER districts", cat_as = "former districts"}, "subdistricts", "townships", "municipalities", {type = "direct-administered municipalities", cat_as = "municipalities"}, }}, ["People's Republic of China"] = {alias_of = "China", the = true}, -- differs in "the" ["Colombia"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}}, ["Comoros"] = {the = true, container = "Afrika", divs = {"autonomous islands"}}, ["Costa Rica"] = {container = "Amerika Tengah", divs = {"provinces", "cantons"}}, ["Croatia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true}, ["Cuba"] = {container = "Caribbean", divs = {"provinces", "municipalities"}}, ["Cyprus"] = {container = {"Eropah", "Asia"}, divs = {"districts"}, british_spelling = true}, ["Czech Republic"] = {the = true, container = "Eropah", divs = {"regions", "districts", "municipalities"}, british_spelling = true}, ["Czechia"] = {alias_of = "Czech Republic"}, -- differs in "the" ["Democratic Republic of the Congo"] = {the = true, container = "Afrika", divs = {"provinces", "territories"}}, ["Congo"] = {alias_of = "Democratic Republic of the Congo", display = true, the = true}, ["Denmark"] = {container = "Eropah", divs = {"regions", "municipalities", "dependent territories"}, british_spelling = true, -- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country) }, ["Djibouti"] = {container = "Afrika", divs = {"regions", "districts"}}, ["Dominica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Dominican Republic"] = {the = true, container = "Caribbean", divs = {"provinces", "municipalities"}, keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"}, ["East Timor"] = {container = "Asia", divs = {"municipalities"}, wp = "Timor-Leste"}, ["Timor-Leste"] = {alias_of = "East Timor", display = true}, ["Ecuador"] = {container = "Amerika 
Selatan", divs = {"provinces", "cantons"}}, ["Egypt"] = {container = "Afrika", divs = {"governorates", "regions"}, british_spelling = true}, ["El Salvador"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Equatorial Guinea"] = {container = "Afrika", divs = {"provinces"}}, ["Eritrea"] = {container = "Afrika", divs = {"regions", "subregions"}}, ["Estonia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true}, ["Eswatini"] = {container = "Afrika", british_spelling = true}, ["Swaziland"] = {alias_of = "Eswatini", display = true}, ["Ethiopia"] = {container = "Afrika", divs = {"regions", "zones"}}, ["Federated States of Micronesia"] = {the = true, container = "Micronesia", divs = {"negeri"}}, ["Micronesia"] = {alias_of = "Federated States of Micronesia"}, ["Fiji"] = {container = "Melanesia", divs = {"divisions", "provinces"}, british_spelling = true}, ["Finland"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true}, ["France"] = {container = "Eropah", divs = {"regions", "cantons", "collectivities", "communes", {type = "municipalities", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, "dependent territories", "territories", "provinces", }, british_spelling = true}, ["Gabon"] = {container = "Afrika", divs = {"provinces", "departments"}}, ["Gambia"] = {the = true, container = "Afrika", divs = {"divisions", "districts"}, british_spelling = true, wp = "The %l"}, ["Georgia"] = {container = {"Eropah", "Asia"}, divs = {"regions", "districts"}, keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"}, ["Germany"] = {container = "Eropah", divs = { "negeri", -- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but -- there aren't really enough 
of them to categorize per state. "regions", "municipalities", "districts"}, british_spelling = true}, ["Ghana"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true}, ["Greece"] = {container = "Eropah", divs = {"regions", "regional units", "municipalities", {type = "peripheries", cat_as = {"regions"}}, }, british_spelling = true}, ["Grenada"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Guatemala"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Guinea"] = {container = "Afrika", divs = {"regions", "prefectures"}}, ["Guinea-Bissau"] = {container = "Afrika", divs = {"regions"}}, ["Guyana"] = {container = "Amerika Selatan", divs = {"regions"}, british_spelling = true}, ["Haiti"] = {container = "Caribbean", divs = {"departments", "arrondissements"}}, ["Honduras"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Hungary"] = {container = "Eropah", divs = {"counties", "districts"}, british_spelling = true}, ["Iceland"] = {container = "Eropah", divs = {"regions", "municipalities", "counties"}, british_spelling = true}, ["India"] = {container = "Asia", divs = { {type = "negeri", cat_as = "states and union territories"}, {type = "union territories", cat_as = "states and union territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"}, {type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"}, "divisions", "districts", "municipalities", }, british_spelling = true}, ["Indonesia"] = {container = "Asia", divs = {"regencies", "provinces", {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"}, }}, ["Iran"] = {container = "Asia", divs = {"provinces", "counties"}}, ["Iraq"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Ireland"] = {container = "Eropah", addl_parents = {"British Isles"}, divs = {"counties", "districts", "provinces"}, 
british_spelling = true, wp = "Republic of %l"}, ["Republic of Ireland"] = {alias_of = "Ireland", the = true}, -- differs in "the" ["Israel"] = {container = "Asia", divs = {"districts"}}, ["Italy"] = {container = "Eropah", divs = { "regions", "provinces", "metropolitan cities", "municipalities", {type = "autonomous regions", cat_as = "regions"}, }, british_spelling = true}, ["Ivory Coast"] = {container = "Afrika", divs = {"districts", "regions"}}, -- We should really be using Ivory Coast (common name) but there are political ramifications to the use of -- Côte d'Ivoire so don't make it a display alias. ["Côte d'Ivoire"] = {alias_of = "Ivory Coast"}, ["Jamaica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Jepun"] = {container = "Asia", divs = {"prefectures", "subprefectures", "municipalities"}}, ["Jordan"] = {container = "Asia", divs = {"governorates"}}, ["Kazakhstan"] = {container = {"Asia", "Eropah"}, divs = {"regions", "districts"}}, ["Kenya"] = {container = "Afrika", divs = {"counties"}, british_spelling = true}, ["Kiribati"] = {container = "Micronesia", british_spelling = true}, ["Kosovo"] = {container = "Eropah", divs = {"districts", "municipalities"}, british_spelling = true}, ["Kuwait"] = {container = "Asia", divs = {"governorates", "areas"}}, ["Kyrgyzstan"] = {container = "Asia", divs = {"regions", "districts"}}, ["Laos"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Latvia"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true}, ["Lubnan"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Lesotho"] = {container = "Afrika", divs = {"districts"}, british_spelling = true}, ["Liberia"] = {container = "Afrika", divs = {"counties", "districts"}}, ["Libya"] = {container = "Afrika", divs = {"districts", "municipalities"}}, ["Liechtenstein"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true}, ["Lithuania"] = {container = "Eropah", divs = {"counties", 
"municipalities"}, british_spelling = true}, ["Luxembourg"] = {container = "Eropah", divs = {"cantons", "districts"}, british_spelling = true}, ["Madagascar"] = {container = "Afrika", divs = {"regions", "districts"}}, ["Malawi"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true}, ["Malaysia"] = {container = "Asia", divs = {"negeri", "wilayah persekutuan", "daerah"}, british_spelling = true}, ["Maldives"] = {the = true, container = "Asia", divs = {"provinces", "administrative atolls"}, british_spelling = true}, ["Mali"] = {container = "Afrika", divs = {"regions", "cercles"}}, ["Malta"] = {container = "Eropah", divs = {"regions", "local councils"}, british_spelling = true}, ["Kepulauan Marshall"] = {the = true, container = "Micronesia", divs = {"municipalities"}}, ["Mauritania"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Mauritius"] = {container = "Afrika", divs = {"districts"}, british_spelling = true}, ["Mexico"] = {container = "Amerika Utara", addl_parents = {"Amerika Tengah"}, divs = { "negeri", "municipalities", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Moldova"] = {container = "Eropah", divs = { {type = "districts", cat_as = "districts and autonomous territorial units"}, {type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"}, "communes", "municipalities", }, british_spelling = true}, ["Monaco"] = {placetype = {"city-state", "negara"}, container = "Eropah", -- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we -- want its parent to be "countries in Europe". 
bare_category_parent_type = {type = "negara", prep = "di"}, is_city = true, british_spelling = true}, ["Mongolia"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Montenegro"] = {container = "Eropah", divs = {"municipalities"}}, ["Morocco"] = {container = "Afrika", divs = {"regions", "prefectures", "provinces"}}, ["Mozambique"] = {container = "Afrika", divs = {"provinces", "districts"}}, ["Myanmar"] = {container = "Asia", divs = {"regions", "negeri", "union territories", {type = "self-administered zones", cat_as = "self-administered areas"}, {type = "self-administered divisions", cat_as = "self-administered areas"}, "districts"}}, ["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations ["Namibia"] = {container = "Afrika", divs = {"regions", "constituencies"}, british_spelling = true}, ["Nauru"] = {container = "Micronesia", divs = {"districts"}, british_spelling = true}, ["Nepal"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Netherlands"] = {the = true, placetype = {"negara", "constituent country"}, container = "Eropah", divs = {"provinces", "municipalities", {type = "FORMER municipalities", cat_as = "former municipalities"}, "dependent territories", "constituent countries"}, british_spelling = true, -- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]] -- (country) }, ["New Zealand"] = {container = "Polynesia", divs = { "regions", "dependent territories", "territorial authorities", {type = "districts", cat_as = "territorial authorities"}, }, british_spelling = true}, ["Nicaragua"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Niger"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Nigeria"] = {container = "Afrika", divs = { "negeri", -- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize -- everything under 'states and territories' but that seems a bit 
pointless. {type = "wilayah persekutuan", cat_as = "negeri"}, "local government areas", }, british_spelling = true}, ["North Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties"}}, ["North Macedonia"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true}, ["Macedonia"] = {alias_of = "North Macedonia", display = true}, ["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Norway"] = {container = "Eropah", divs = {"counties", "municipalities", "dependent territories", "districts", "unincorporated areas"}, british_spelling = true}, ["Oman"] = {container = "Asia", divs = {"governorates", "provinces"}}, ["Pakistan"] = {container = "Asia", divs = { {type = "provinces", cat_as = "provinces and territories"}, {type = "administrative territories", cat_as = "provinces and territories"}, {type = "wilayah persekutuan", cat_as = "provinces and territories"}, {type = "territories", cat_as = "provinces and territories"}, "divisions", "districts", }, british_spelling = true}, ["Palau"] = {container = "Micronesia", divs = {"negeri"}}, ["Palestine"] = {container = "Asia", divs = {"governorates"}}, ["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the" ["Panama"] = {container = "Amerika Tengah", divs = {"provinces", "districts"}}, ["Papua New Guinea"] = {container = "Melanesia", divs = {"provinces", "districts"}, british_spelling = true}, ["Paraguay"] = {container = "Amerika Selatan", divs = {"departments", "districts"}}, ["Peru"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "districts"}}, ["Philippines"] = {the = true, container = "Asia", divs = {"regions", "provinces", "districts", "municipalities", "barangays"}}, ["Poland"] = {divs = {"voivodeships", "counties", {type = "Polish colonies", cat_as = {{type = "villages", prep = 
"di"}}}, }, container = "Eropah", british_spelling = true}, ["Portugal"] = {container = "Eropah", divs = { {type = "autonomous regions", cat_as = "districts and autonomous regions"}, {type = "districts", cat_as = "districts and autonomous regions"}, "provinces", "municipalities"}, british_spelling = true}, ["Qatar"] = {container = "Asia", divs = {"municipalities", "zones"}}, ["Republic of the Congo"] = {the = true, container = "Afrika", divs = {"departments", "districts"}}, ["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true}, ["Romania"] = {container = "Eropah", divs = { "regions", "counties", "communes", {type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"}, }, british_spelling = true}, ["Russia"] = {container = {"Eropah", "Asia"}, divs = { "federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities", "districts", "federal districts"}, british_spelling = true}, ["Rwanda"] = {container = "Afrika", divs = {"provinces", "districts"}}, ["Saint Kitts and Nevis"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Saint Lucia"] = {container = "Caribbean", divs = {"districts"}, british_spelling = true}, ["Saint Vincent and the Grenadines"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Samoa"] = {container = "Polynesia", divs = {"districts"}, british_spelling = true}, ["San Marino"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true}, ["São Tomé and Príncipe"] = {container = "Afrika", divs = {"districts"}}, ["Arab Saudi"] = {container = "Asia", divs = {"wilayah", "kegaboneran"}}, ["Senegal"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Serbia"] = {container = "Eropah", divs = {"districts", "municipalities", "autonomous provinces"}}, ["Seychelles"] = {container = "Afrika", divs = {"districts"}, british_spelling = true}, ["Sierra Leone"] = {container = "Afrika", divs = 
{"provinces", "districts"}, british_spelling = true}, ["Singapore"] = {container = "Asia", divs = {"districts", "regions"}, british_spelling = true}, ["Slovakia"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true}, ["Slovenia"] = {container = "Eropah", divs = {"statistical regions", "municipalities"}, british_spelling = true}, -- Note: the official name does not include "the" at the beginning, but it sounds strange in -- English to leave it out and it's commonly included, so we include it. ["Solomon Islands"] = {the = true, container = "Melanesia", divs = {"provinces"}, british_spelling = true}, ["Somalia"] = {container = "Afrika", divs = {"regions", "districts"}}, ["South Africa"] = {container = "Afrika", divs = { "provinces", "districts", {type = "district municipalities", cat_as = "districts"}, {type = "metropolitan municipalities", cat_as = "districts"}, "municipalities", }, british_spelling = true}, ["South Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties", "districts"}}, ["South Sudan"] = {container = "Afrika", divs = {"regions", "negeri", "counties"}, british_spelling = true}, ["Spain"] = {container = "Eropah", divs = {"autonomous communities", "provinces", "municipalities", "comarcas", "autonomous cities"}, british_spelling = true}, ["Sri Lanka"] = {container = "Asia", divs = {"provinces", "districts"}, british_spelling = true}, ["Sudan"] = {container = "Afrika", divs = {"negeri", "districts"}, british_spelling = true}, ["Suriname"] = {container = "Amerika Selatan", divs = {"districts"}}, ["Sweden"] = {container = "Eropah", divs = {"provinces", "counties", "municipalities"}, british_spelling = true}, ["Switzerland"] = {container = "Eropah", divs = {"cantons", "municipalities", "districts"}, british_spelling = true}, ["Syria"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Taiwan"] = {container = "Asia", divs = {"counties", "districts", "townships", "special 
municipalities"}}, ["Republic of China"] = {alias_of = "Taiwan", the = true}, -- differs in "the", different political connotations ["Tajikistan"] = {container = "Asia", divs = {"regions", "districts"}}, ["Tanzania"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true}, ["Thailand"] = {container = "Asia", divs = {"wilayah", "daerah", "subdaerah"}}, ["Togo"] = {container = "Afrika", divs = {"provinces", "prefectures"}}, ["Tonga"] = {container = "Polynesia", divs = {"divisions"}, british_spelling = true}, ["Trinidad and Tobago"] = {container = "Caribbean", divs = {"regions", "municipalities"}, british_spelling = true}, ["Tunisia"] = {container = "Afrika", divs = {"governorates", "delegations"}}, ["Turkey"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts"}}, -- Foreign names generally get display-canonicalized. ["Türkiye"] = {alias_of = "Turkey", display = true}, ["Turkmenistan"] = {container = "Asia", divs = { -- The 5 regions are often also called provinces "regions", {type = "provinces", cat_as = "regions"}, "districts"}, }, ["Tuvalu"] = {container = "Polynesia", divs = {"atolls"}, british_spelling = true}, ["Uganda"] = {container = "Afrika", divs = {"districts", "counties"}, british_spelling = true}, ["Ukraine"] = {container = "Eropah", divs = { {type = "oblasts", cat_as = "oblasts and autonomous republics"}, {type = "autonomous republics", cat_as = "oblasts and autonomous republics"}, "raions", "hromadas", }, british_spelling = true}, ["United Arab Emirates"] = {the = true, container = "Asia", divs = {"emirates"}}, -- Abbreviations get display-canonicalized. 
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true}, ["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true}, ["United Kingdom"] = {the = true, container = "Eropah", addl_parents = {"British Isles"}, divs = {"constituent countries", "counties", "districts", "boroughs", "territories", "dependent territories", "traditional counties"}, keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true}, -- Abbreviations get display-canonicalized. ["UK"] = {alias_of = "United Kingdom", display = true, the = true}, ["U.K."] = {alias_of = "United Kingdom", display = true, the = true}, ["Amerika Syarikat"] = {the = true, container = "Amerika Utara", divs = {"counties", "county seats", "negeri", "territories", "dependent territories", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, {type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"}, {type = "NICKNAME_FOR states", cat_as = "nicknames for states"}, {type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"}, {type = "boroughs", prep = "di"}, -- exist in Pennsylvania and New Jersey "municipalities", -- these exist politically at least in Colorado and Connecticut {type = "census-designated places", prep = "di"}, {type = "unincorporated communities", prep = "di"}, -- Don't change the following to something more politically correct until/unless the US government makes a -- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at -- [[w:Indian reservations]]). "Indian reservations", }}, -- Abbreviations and long forms (when possible) get display-canonicalized. 
["US"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["U.S."] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["USA"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["U.S.A."] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["United States of America"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["United States"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["Uruguay"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}}, ["Uzbekistan"] = {container = "Asia", divs = {"regions", "districts"}}, ["Vanuatu"] = {container = "Melanesia", divs = {"provinces"}, british_spelling = true}, ["Vatican City"] = {placetype = {"city-state", "negara"}, container = "Eropah", -- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state, -- but we want its parent to be "countries in Europe". bare_category_parent_type = {type = "negara", prep = "di"}, addl_parents = {"Rome"}, is_city = true, british_spelling = true}, ["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the" ["Venezuela"] = {container = "Amerika Selatan", divs = {"negeri", "municipalities"}}, ["Vietnam"] = {container = "Asia", divs = {"provinces", "districts", "municipalities"}}, ["Western Sahara"] = {placetype = {"territory", "negara"}, container = "Afrika", bare_category_parent_type = {type = "negara", prep = "di"}, }, -- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara ["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true}, ["Yemen"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Zambia"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true}, ["Zimbabwe"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true}, } local function canonicalize_continent_container(key) if 
type(key) ~= "string" then return key end if export.continents[key] then return {key = key, placetype = export.continents[key].placetype} end internal_error("Unrecognized key %s in `canonicalize_continent_container`", key) end export.countries_group = { canonicalize_key_container = canonicalize_continent_container, default_overriding_bare_label_parents = {"+++", "negara"}, default_placetype = "negara", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.countries, } -- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases -- are not internationally recognized as sovereign nations but which we treat similarly to countries. export.country_like_entities = { -- British Overseas Territory ["Akrotiri and Dhekelia"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Cyprus", "Eropah", "Asia"}, british_spelling = true, }, -- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in -- [[w:List of sovereign states and dependent territories by continent]]. 
-- unincorporated territory of the United States ["American Samoa"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Polynesia"}, }, -- British Overseas Territory ["Anguilla"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["Abkhazia"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Georgia", "Eropah", "Asia"}, divs = {"districts"}, keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- Australian external territory ["Ashmore and Cartier Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, }, -- constituent country of the Netherlands ["Aruba"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- British Overseas Territory ["Bermuda"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Amerika Utara"}, british_spelling = true, }, -- special municipality of the Netherlands ["Bonaire"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- British Overseas Territory ["British Indian Ocean Territory"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Asia"}, british_spelling = true, }, -- British Overseas Territory ["British Virgin Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- 
Norwegian dependent territory ["Bouvet Island"] = { placetype = {"dependent territory", "territory"}, container = "Norway", addl_parents = {"Afrika"}, british_spelling = true, }, -- British Overseas Territory ["Cayman Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- Australian external territory ["Christmas Island"] = { placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, british_spelling = true, }, -- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the -- French Southern and Antarctic Lands. ["Clipperton Island"] = { placetype = {"overseas territory", "territory"}, container = "France", addl_parents = {"Amerika Utara"}, }, -- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands ["Cocos Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, wp = "Cocos (Keeling) Islands", british_spelling = true, }, ["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, ["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, -- self-governing but in free association with New Zealand ["Cook Islands"] = { the = true, placetype = {"negara"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- constituent country of the Netherlands ["Curaçao"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- special territory of Chile ["Easter Island"] = { placetype = {"special territory", "territory"}, container = "Chile", addl_parents = {"Polynesia"}, }, -- British Overseas Territory ["Falkland Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United 
Kingdom", addl_parents = {"Amerika Selatan"}, british_spelling = true, }, -- autonomous territory of Denmark ["Faroe Islands"] = { the = true, placetype = {"autonomous territory", "territory"}, container = "Denmark", addl_parents = {"Eropah"}, british_spelling = true, }, -- overseas department and region of France ["French Guiana"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Amerika Selatan"}, british_spelling = true, }, -- overseas collectivity of France ["French Polynesia"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Polynesia"}, british_spelling = true, }, -- French overseas territory ["French Southern and Antarctic Lands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "France", addl_parents = {"Afrika"}, }, -- British Overseas Territory ["Gibraltar"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Eropah"}, is_city = true, british_spelling = true, }, -- autonomous territory of Denmark ["Greenland"] = { placetype = {"autonomous territory", "territory"}, container = "Denmark", addl_parents = {"Amerika Utara"}, divs = {"municipalities"}, british_spelling = true, }, -- overseas department and region of France ["Guadeloupe"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", addl_parents = {"Caribbean"}, divs = {"communes"}, british_spelling = true, }, -- unincorporated territory of the United States ["Guam"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- self-governing British Crown dependency; technically called the Bailiwick of Guernsey ["Guernsey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"}, container = 
"United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, wp = "Bailiwick of %l", }, ["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true}, -- Australian external territory ["Heard Island and McDonald Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Afrika"}, }, -- special administrative region of China ["Hong Kong"] = { placetype = {"special administrative region", "city"}, container = "China", is_city = true, british_spelling = true, }, -- self-governing British Crown dependency ["Isle of Man"] = { the = true, placetype = {"crown dependency", "dependency", "dependent territory", "territory"}, container = "United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, }, -- Norwegian unincorporated area ["Jan Mayen"] = { placetype = {"unincorporated area", "dependent territory", "territory", "island"}, container = "Norway", addl_parents = {"Eropah"}, british_spelling = true, }, -- self-governing British Crown dependency; technically called the Bailiwick of Jersey ["Jersey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"}, container = "United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, }, ["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true}, -- special administrative region of China ["Macau"] = { placetype = {"special administrative region", "city"}, container = "China", is_city = true, british_spelling = true, }, -- overseas department and region of France ["Martinique"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Caribbean"}, british_spelling = true, }, -- overseas department and region of France ["Mayotte"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, 
addl_parents = {"Afrika"}, british_spelling = true, }, -- British Overseas Territory ["Montserrat"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- special collectivity of France ["New Caledonia"] = { placetype = {"special collectivity", "collectivity"}, container = "France", addl_parents = {"Melanesia"}, british_spelling = true, }, -- dependent territory of New Zealand ["New Zealand Subantarctic Islands"] = { the = true, placetype = {"dependent territory", "territory"}, container = "New Zealand", addl_parents = {"Antartika"}, british_spelling = true, }, -- self-governing but in free association with New Zealand ["Niue"] = { placetype = {"negara"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- Australian external territory ["Norfolk Island"] = { placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Polynesia"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Cyprus ["Northern Cyprus"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Cyprus", "Turkey", "Eropah", "Asia"}, divs = {"districts"}, keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]", british_spelling = true, }, -- commonwealth, unincorporated territory of the United States ["Northern Mariana Islands"] = { the = true, placetype = {"commonwealth", "unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- British Overseas Territory ["Pitcairn Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Polynesia"}, british_spelling = true, }, -- commonwealth of the United States ["Puerto Rico"] = { placetype = {"commonwealth", "overseas territory", 
"territory"}, container = "Amerika Syarikat", addl_parents = {"Caribbean"}, divs = {"municipalities"}, }, -- overseas department and region of France ["Réunion"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Afrika"}, british_spelling = true, }, -- special municipality of the Netherlands ["Saba"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- overseas collectivity of France ["Saint Barthélemy"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Caribbean"}, british_spelling = true, }, -- British Overseas Territory ["Saint Helena, Ascension and Tristan da Cunha"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", divs = {{type = "constituent parts", container_parent_type = false}}, addl_parents = {"Atlantic Ocean", "Afrika"}, british_spelling = true, }, -- constituent parts of the combined oveseas territory ["Ascension Island"] = { placetype = {"constituent part", "territory", "island"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Saint Helena"] = { placetype = {"constituent part", "territory", "island"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Tristan da Cunha"] = { placetype = {"constituent part", "territory", "archipelago"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", 
placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, -- overseas collectivity of France ["Saint Martin"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Caribbean"}, british_spelling = true, }, -- overseas collectivity of France ["Saint Pierre and Miquelon"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", divs = {"communes"}, addl_parents = {"Amerika Utara"}, british_spelling = true, }, -- special municipality of the Netherlands ["Sint Eustatius"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- constituent country of the Netherlands ["Sint Maarten"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Somalia ["Somaliland"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Somalia", "Afrika"}, keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]", british_spelling = true, }, -- British Overseas Territory -- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for -- "Saint Helena, Ascension and Tristan da Cunha". 
["South Georgia"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Atlantic Ocean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["South Ossetia"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Georgia", "Eropah", "Asia"}, keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- British Overseas Territory ["South Sandwich Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Atlantic Ocean"}, wp = true, wpcat = "South Georgia and the South Sandwich Islands", british_spelling = true, }, -- Norwegian unincorporated area ["Svalbard"] = { placetype = {"unincorporated area", "dependent territory", "territory", "archipelago"}, container = "Norway", addl_parents = {"Eropah"}, british_spelling = true, }, -- dependent territory of New Zealand ["Tokelau"] = { placetype = {"dependent territory", "territory"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Moldova ["Transnistria"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Moldova", "Eropah"}, keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]", british_spelling = true, }, -- British Overseas Territory ["Turks and Caicos Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- unincorporated territory of the United States ["United States Minor Outlying Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Islands", "Micronesia", 
"Polynesia", "Caribbean"}, }, -- FIXME: We should add entries for the other minor outlying islands. -- Baker Island (Oceania) -- Howland Island (Oceania) -- Jarvis Island (Oceania) -- Johnston Atoll (Oceania) -- Kingman Reef (Oceania) -- Midway Atoll (Oceania) -- Navassa Island (Caribbean) -- Palmyra Atoll (Oceania) -- Wake Island (Oceania) ["Wake Island"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- unincorporated territory of the United States ["United States Virgin Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Caribbean"}, }, ["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, ["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, -- overseas collectivity of France ["Wallis and Futuna"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Polynesia"}, british_spelling = true, }, } export.country_like_entities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Saint Helena, Ascension and Tristan da Cunha". key_to_placename = false, placename_to_key = false, canonicalize_key_container = make_canonicalize_key_container(nil, "negara"), default_overriding_bare_label_parents = {"country-like entities"}, default_no_container_cat = true, default_no_container_parent = true, -- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas -- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village -- in Europe. default_no_auto_augment_container = true, data = export.country_like_entities, } -- Former countries and such; we don't create "Cities in ..." 
categories because they don't exist anymore export.former_countries = { -- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan -- (also known as Nagorno-Karabakh) -- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out. ["Artsakh"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Azerbaijan", "Eropah", "Asia"}, keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]", british_spelling = true, }, ["Nagorno-Karabakh"] = {alias_of = "Artsakh"}, ["Czechoslovakia"] = {container = "Eropah", british_spelling = true}, ["East Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true}, ["North Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}}, ["Persia"] = {placetype = {"empire", "negara"}, container = "Asia", divs = {"provinces"}}, ["Byzantine Empire"] = { the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Ancient Europe", "Ancient Near East"}, divs = { "provinces", "themes", }}, ["Roman Empire"] = { the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Rome"}, divs = { "provinces", {type = "FORMER provinces", cat_as = "provinces"}, }}, ["South Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}}, ["Soviet Union"] = { the = true, container = {"Eropah", "Asia"}, divs = {"republics", "autonomous republics"}, british_spelling = true}, ["West Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true}, ["Yugoslavia"] = {container = "Eropah", divs = {"districts"}, keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true}, } export.former_countries_group = { canonicalize_key_container = canonicalize_continent_container, 
default_overriding_bare_label_parents = {"former countries and country-like entities"}, default_is_former_place = true, default_placetype = "negara", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.former_countries, } ----------------------------------------------------------------------------------- -- Subpolity tables -- ----------------------------------------------------------------------------------- export.australia_states_and_territories = { ["Australian Capital Territory, Australia"] = {the = true, placetype = "territory"}, ["Jervis Bay Territory, Australia"] = {the = true, placetype = "territory"}, ["New South Wales, Australia"] = {}, ["Northern Territory, Australia"] = {the = true, placetype = "territory"}, ["Queensland, Australia"] = {}, ["South Australia, Australia"] = {}, ["Tasmania, Australia"] = {}, ["Victoria, Australia"] = {}, ["Western Australia, Australia"] = {}, } -- states and territories of Australia export.australia_group = { default_container = "Australia", default_placetype = "negeri", default_divs = "local government areas", data = export.australia_states_and_territories, } export.austria_states = { ["Vienna, Austria"] = {}, ["Lower Austria, Austria"] = {}, ["Upper Austria, Austria"] = {}, ["Styria, Austria"] = {}, ["Tyrol, Austria"] = {wp = "Tyrol (state)"}, ["Carinthia, Austria"] = {}, ["Salzburg, Austria"] = {wp = "Salzburg (state)"}, ["Vorarlberg, Austria"] = {}, ["Burgenland, Austria"] = {}, } -- states of Austria export.austria_group = { default_container = "Austria", default_placetype = "negeri", default_divs = "municipalities", data = export.austria_states, } export.bangladesh_divisions = { ["Barisal Division, Bangladesh"] = {}, ["Chittagong Division, Bangladesh"] = {}, ["Dhaka Division, Bangladesh"] = {}, ["Khulna Division, Bangladesh"] = {}, ["Mymensingh Division, 
Bangladesh"] = {}, ["Rajshahi Division, Bangladesh"] = {}, ["Rangpur Division, Bangladesh"] = {}, ["Sylhet Division, Bangladesh"] = {}, } -- divisions of Bangladesh export.bangladesh_group = { key_to_placename = make_key_to_placename(", Bangladesh$", " Division$"), placename_to_key = make_placename_to_key(", Bangladesh", " Division"), default_container = "Bangladesh", default_placetype = "division", default_divs = "districts", data = export.bangladesh_divisions, } export.brazil_states = { ["Acre, Brazil"] = {wp = "%l (state)"}, ["Alagoas, Brazil"] = {}, ["Amapá, Brazil"] = {}, ["Amazonas, Brazil"] = {wp = "%l (Brazilian state)"}, ["Bahia, Brazil"] = {}, ["Ceará, Brazil"] = {}, ["Distrito Federal, Brazil"] = {wp = "Federal District (Brazil)"}, ["Espírito Santo, Brazil"] = {}, ["Goiás, Brazil"] = {}, ["Maranhão, Brazil"] = {}, ["Mato Grosso, Brazil"] = {}, ["Mato Grosso do Sul, Brazil"] = {}, ["Minas Gerais, Brazil"] = {}, ["Pará, Brazil"] = {}, ["Paraíba, Brazil"] = {}, ["Paraná, Brazil"] = {wp = "%l (state)"}, ["Pernambuco, Brazil"] = {}, ["Piauí, Brazil"] = {}, ["Rio de Janeiro, Brazil"] = {wp = "%l (state)"}, ["Rio Grande do Norte, Brazil"] = {}, ["Rio Grande do Sul, Brazil"] = {}, ["Rondônia, Brazil"] = {}, ["Roraima, Brazil"] = {}, ["Santa Catarina, Brazil"] = {wp = "%l (state)"}, ["São Paulo, Brazil"] = {wp = "%l (state)"}, ["Sergipe, Brazil"] = {}, ["Tocantins, Brazil"] = {}, } -- states of Brazil export.brazil_group = { default_container = "Brazil", default_placetype = "negeri", default_divs = "municipalities", data = export.brazil_states, } export.canada_provinces_and_territories = { ["Alberta, Canada"] = {divs = { {type = "municipal districts", container_parent_type = "rural municipalities"}, }}, ["British Columbia, Canada"] = {divs = { {type = "regional districts", container_parent_type = false}, "regional municipalities", }}, ["Manitoba, Canada"] = {divs = {"rural municipalities"}}, ["New Brunswick, Canada"] = {divs = {"counties", "parishes", {type = "civil 
parishes", cat_as = "parishes"}}}, ["Newfoundland and Labrador, Canada"] = {}, ["Northwest Territories, Canada"] = {the = true, placetype = "territory"}, ["Nova Scotia, Canada"] = {divs = {"counties", "regional municipalities"}}, ["Nunavut, Canada"] = {placetype = "territory"}, ["Ontario, Canada"] = {divs = {"counties", "regional municipalities", {type = "townships", prep = "di"}}}, ["Prince Edward Island, Canada"] = {divs = {"counties", "parishes", "rural municipalities"}}, ["Saskatchewan, Canada"] = {divs = {"rural municipalities"}}, ["Quebec, Canada"] = {divs = { "counties", {type = "regional county municipalities", container_parent_type = "regional municipalities"}, -- administrative regions have an official (but non-governmental) function but there don't appear to be any -- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping {type = "regions", container_parent_type = false}, {type = "townships", prep = "di"}, {type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}}, {type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}}, {type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}}, }}, ["Yukon, Canada"] = {placetype = "territory"}, ["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true}, } -- provinces and territories of Canada export.canada_group = { default_container = "Canada", default_placetype = "province", data = export.canada_provinces_and_territories, } export.china_provinces_and_autonomous_regions = { -- direct-administered municipalities are not here but below under prefecture-level cities ["Anhui, China"] = {}, ["Fujian, China"] = {}, ["Fuchien, China"] = {alias_of = "Fujian, China", display = true}, ["Gansu, China"] = {}, ["Guangdong, China"] = {}, ["Guangxi, China"] = {placetype = "autonomous region"}, ["Guizhou, China"] = {}, ["Hainan, China"] = {}, 
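--[==[
The group tables in this module pair a `default_placetype` with per-entry overrides such as
`placetype = "autonomous region"`. The sketch below shows one plausible way such a fallback could be
resolved. `effective_placetype` is a hypothetical helper written for illustration only; it is not a
function of this module, and the real lookup code may differ.

```lua
-- Hypothetical helper (an assumption, not part of this module): return the
-- entry's own placetype if it sets one, otherwise the group's default.
local function effective_placetype(group, key)
	local spec = group.data[key]
	if not spec then
		return nil -- unknown key
	end
	return spec.placetype or group.default_placetype
end

-- Trimmed-down copies of the data shapes used in this module.
local provinces = {
	["Anhui, China"] = {},
	["Guangxi, China"] = {placetype = "autonomous region"},
}
local china_group = {
	default_container = "China",
	default_placetype = "province",
	data = provinces,
}

print(effective_placetype(china_group, "Anhui, China"))   -- province
print(effective_placetype(china_group, "Guangxi, China")) -- autonomous region
```

Keeping the common case in the group and only the exceptions in the entries is what lets most entries
stay as empty tables.
]==]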
["Hebei, China"] = {}, ["Heilongjiang, China"] = {}, ["Henan, China"] = {}, ["Hubei, China"] = {}, ["Hunan, China"] = {}, ["Inner Mongolia, China"] = {placetype = "autonomous region"}, ["Jiangsu, China"] = {}, ["Jiangxi, China"] = {}, ["Jilin, China"] = {}, ["Liaoning, China"] = {}, ["Ningxia, China"] = {placetype = "autonomous region"}, ["Qinghai, China"] = {}, ["Shaanxi, China"] = {}, ["Shandong, China"] = {}, ["Shanxi, China"] = {}, ["Sichuan, China"] = {}, ["Tibet, China"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"}, ["Xinjiang, China"] = {placetype = "autonomous region"}, ["Yunnan, China"] = {}, ["Zhejiang, China"] = {}, } -- provinces and autonomous regions of China export.china_group = { default_container = "China", default_placetype = "province", default_divs = { "prefectures", "prefecture-level cities", "districts", "subdistricts", "townships", {type = "counties", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_provinces_and_autonomous_regions, } export.china_prefecture_level_cities = { -- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an -- administrative unit smaller than a province but bigger than a county, which is administratively controlled by -- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior -- to the mid-1980's, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the -- western portion of China) have not yet been converted. Generally a given province is entirely tiled by -- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se. 
	-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
	-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, which again bears
	-- the same name as the county-level city).
	--
	-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
	-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
	-- province level.
	--
	-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
	-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
	-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
	-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
	-- metro area separated by suburban/exurban or rural land.
	--
	-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
	-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
	-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
	-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
	-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
	-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
	-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
	-- a lot of obscure cities.
	--
	-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
	-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
	-- was >= 1 million. The entries below are sorted by the urban agglomeration value (generally of the "adm-urb" =
	-- "administrative area (urban population)" type), which sometimes groups nearby cities into a single agglomeration
	-- (most notably in the case of the Pearl River Delta, grouped under Guangzhou with an agglomeration population of
	-- 72,700,000 and including a large number of nearby large cities in the agglomeration, although for some reason not
	-- Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de includes divisions
	-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
	-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
	-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
	-- subdistricts, towns and townships.
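--[==[
The first selection pass described in the comments above combines two thresholds: a prefecture-level
population of at least 6 million, or an urban population of at least 2 million. A minimal sketch of
that union follows. The three real cities use figures (in millions) copied from the per-city comments
in this table; the fourth row is a made-up entry, labeled hypothetical, included only to show a city
that fails both tests. The names `MIN_PREFECTURAL`, `MIN_URBAN` and `selected` are illustrative
assumptions, not part of this module.

```lua
local MIN_PREFECTURAL = 6.0 -- millions; first criterion
local MIN_URBAN = 2.0       -- millions; second criterion

-- A city is kept if it passes either criterion (the "merge" of the two lists).
local function selected(pop)
	return pop.prefectural >= MIN_PREFECTURAL or pop.urban >= MIN_URBAN
end

local sample = {
	["Guangzhou"] = {prefectural = 18.7, urban = 18.8},  -- passes both
	["Cangzhou"] = {prefectural = 7.3, urban = 0.621},   -- passes on prefectural only
	["Haikou"] = {prefectural = 2.873, urban = 2.3},     -- passes on urban only
	["(hypothetical small city)"] = {prefectural = 3.0, urban = 1.2}, -- invented; fails both
}

local names = {}
for name, pop in pairs(sample) do
	if selected(pop) then
		table.insert(names, name)
	end
end
table.sort(names) -- pairs() order is unspecified, so sort for a stable result
print(table.concat(names, ", ")) -- Cangzhou, Guangzhou, Haikou
```
]==]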
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de ["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shanghai"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de ["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: Not to be confused with Cangzhou in Hebei ["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants ["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by 
citypopulation.de in Shanghai agglomeration ["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Beijing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de ["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de ["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de ["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de ["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration ["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de ["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration ["Chongqing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de ["Wuhan"] = {container = "Hubei"}, -- 12.4 
prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de ["Tianjin"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de ["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de -- Changsha County -- 1.024 urban per citypopulation.de ["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration ["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de ["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de ["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de ["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration ["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de ["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de ["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de ["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb 
including Cixi, Yuyao) per citypopulation.de ["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration ["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration -- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria ["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de -- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core). ["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration ["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de -- includes Láiwú city ["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de -- includes Xīnjí city ["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de ["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de ["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de ["Nanning"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de ["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de 
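--[==[
Several entries carry a `wp` spec such as `"%l, %c"` or `"%l (state)"`. The sketch below assumes that
`%l` stands for the bare placename and `%c` for the container's name; that reading is inferred from
the entries in this file and is not confirmed by it. `expand_wp` is a hypothetical helper written for
illustration, not a function of this module.

```lua
-- Hypothetical expansion of a `wp` spec (assumed semantics: %l = placename,
-- %c = container). A function replacement is used so that any "%" inside
-- the names themselves is never interpreted by gsub.
local function expand_wp(spec, placename, container)
	return (spec:gsub("%%([lc])", function(code)
		if code == "l" then
			return placename
		end
		return container
	end))
end

print(expand_wp("%l, %c", "Jinan", "Shandong"))                 -- Jinan, Shandong
print(expand_wp("%l (state)", "Acre", "Brazil"))                -- Acre (state)
print(expand_wp("Tibet Autonomous Region", "Tibet", "China"))   -- Tibet Autonomous Region
```

A spec without placeholders passes through unchanged, which matches entries that spell out a full
article title.
]==]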
	["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
	["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
	["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
	["Ürümqi"] = {container = {key = "Xinjiang, China", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
	["Urumqi"] = {alias_of = "Ürümqi", display = true},
	["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
	["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
	["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
	["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
	["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
	["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
	["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
	["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
	["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
	["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
	["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
	["Hohhot"] = {container = {key = "Inner Mongolia, China", placetype = 
"autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de ["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de ["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de ["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de ["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de ["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de ["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de ["Taizhou"] = {alias_of = "Taizhou, Zhejiang"}, ["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de ["Yinchuan"] = {container = {key = "Ningxia, China", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de ["Liuzhou"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de ["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de ["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de ["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de ["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de -- includes Dìngzhōu city and Xióngān Xīnqū ["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de ["Baotou"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 
adm-urb) per citypopulation.de ["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de ["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de ["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de ["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de ["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de ["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de ["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de ["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de ["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de ["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de ["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de ["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de ["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de ["Guilin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de ["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de ["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de ["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de ["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de ["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban 
(1.750 adm-urb) per citypopulation.de ["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de ["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de ["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de ["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de ["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de ["Jilin"] = {alias_of = "Jilin City"}, ["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de ["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de ["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de ["Yulin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de ["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de ["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de -- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash ["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de ["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de ["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de ["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de ["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de 
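--[==[
Entries such as `["Jilin"] = {alias_of = "Jilin City"}` point at another key in the same table. The
sketch below shows one way such aliases could be followed to the canonical entry. `resolve_alias` is a
hypothetical helper written for illustration, not a function of this module, and the sample table
simplifies the `container` fields to plain strings.

```lua
-- Simplified sample data in the same shape as the table above.
local cities = {
	["Ürümqi"] = {container = "Xinjiang"},
	["Urumqi"] = {alias_of = "Ürümqi", display = true},
	["Jilin City"] = {container = "Jilin"},
	["Jilin"] = {alias_of = "Jilin City"},
}

-- Hypothetical helper (an assumption): follow `alias_of` links until an entry
-- without one is reached. Returns the canonical key and its spec (the spec is
-- nil if the key is unknown). A `seen` set guards against alias cycles.
local function resolve_alias(data, key)
	local seen = {}
	local spec = data[key]
	while spec and spec.alias_of do
		if seen[key] then
			error("alias cycle at " .. key)
		end
		seen[key] = true
		key = spec.alias_of
		spec = data[key]
	end
	return key, spec
end

local key, spec = resolve_alias(cities, "Jilin")
print(key, spec.container) -- Jilin City	Jilin
```
]==]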
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de ["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de ["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de ["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de ["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de ["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de ["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de ["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de ["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de ["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de ["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de ["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de ["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de ["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de ["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de ["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de ["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de ["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de ["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de ["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de ["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de -- Wanzhou is a 
"district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper. ["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"subdistricts", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de ["Ulanhad"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de ["Chifeng"] = {alias_of = "Ulanhad"}, ["Ulankhad"] = {alias_of = "Ulanhad", display = true}, ["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de ["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de ["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de ["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de -- Shuyang is a "county" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core). -- The county itself is 37 miles by 34 miles. ["Shuyang"] = {placetype = "county", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de -- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core). 
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de ["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de ["Beihai"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de ["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de ["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de ["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de ["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de ["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de ["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de ["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de ["Guigang"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de -- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core). 
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de -- NOTE: Not to be confused with Changzhou in Jiangsu ["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de ["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de ["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de ["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de ["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de -- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core). ["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de -- 3 extra that got added in earlier incarnations and aren't found in the "major agglomerations of the world" page https://citypopulation.de/en/world/agglomerations/ reference date 2025-01-01 ["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de ["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de ["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de } export.china_prefecture_level_cities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Zhejiang" or "Suzhou, Anhui". 
key_to_placename = false, placename_to_key = false, -- don't add ", China" to make the key default_container = "China", canonicalize_key_container = make_canonicalize_key_container(", China", "province"), -- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people -- don't understand how Chinese administrative divisions work. default_placetype = {"prefecture-level city", "city"}, default_divs = { -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities, -- and prefecture-level cities (as well as county-level cities) are considered non-cities. "districts", "subdistricts", "townships", {type = "counties", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_prefecture_level_cities, } -- Needed to avoid problems with two cities called Taizhou and Suzhou. export.china_prefecture_level_cities_2 = { -- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang. ["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census ["Taizhou"] = {alias_of = "Taizhou, Jiangsu"}, -- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu. ["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census -- hopefully this will work because we also have Suzhou as a key by itself for the larger, more-well-known Suzhou in Jiangsu ["Suzhou"] = {alias_of = "Suzhou, Anhui"}, } export.china_prefecture_level_cities_group_2 = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Jiangsu". 
	key_to_placename = false,
	placename_to_key = false, -- don't add ", China" to make the key
	default_container = "China",
	canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
	-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
	-- don't understand how Chinese administrative divisions work.
	default_placetype = {"prefecture-level city", "city"},
	default_divs = {
		-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
		-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
		"districts",
		"subdistricts",
		"townships",
		{type = "counties", cat_as = "counties and county-level cities"},
		{type = "county-level cities", cat_as = "counties and county-level cities"},
	},
	data = export.china_prefecture_level_cities_2,
}

export.finland_regions = {
	["Lapland, Finland"] = {wp = "%l (%c)"},
	["North Ostrobothnia, Finland"] = {},
	["Northern Ostrobothnia, Finland"] = {alias_of = "North Ostrobothnia, Finland", display = true},
	["Kainuu, Finland"] = {},
	["North Karelia, Finland"] = {},
	["Northern Savonia, Finland"] = {},
	["North Savo, Finland"] = {alias_of = "Northern Savonia, Finland", display = true},
	["Southern Savonia, Finland"] = {},
	["South Savo, Finland"] = {alias_of = "Southern Savonia, Finland", display = true},
	["South Karelia, Finland"] = {},
	["Central Finland, Finland"] = {},
	["South Ostrobothnia, Finland"] = {},
	["Southern Ostrobothnia, Finland"] = {alias_of = "South Ostrobothnia, Finland", display = true},
	["Ostrobothnia, Finland"] = {wp = "%l (region)"},
	["Central Ostrobothnia, Finland"] = {},
	["Pirkanmaa, Finland"] = {},
	["Satakunta, Finland"] = {},
	["Päijänne Tavastia, Finland"] = {},
	["Päijät-Häme, Finland"] = {alias_of = "Päijänne Tavastia, Finland", display = true},
	["Tavastia Proper, Finland"] = {},
	["Kanta-Häme, Finland"] = {alias_of = "Tavastia Proper, Finland", display = true},
	["Kymenlaakso, Finland"] = {},
	["Uusimaa, Finland"] = {},
	["Southwest Finland, Finland"] = {},
	["Åland Islands, Finland"] = {the = true, wp = "Åland"},
	["Åland, Finland"] = {alias_of = "Åland Islands, Finland"}, -- differs in "the"
}

-- regions of Finland
export.finland_group = {
	default_container = "Finland",
	default_placetype = "region",
	default_divs = "municipalities",
	data = export.finland_regions,
}

export.france_administrative_regions = {
	["Auvergne-Rhône-Alpes, France"] = {},
	["Bourgogne-Franche-Comté, France"] = {},
	["Brittany, France"] = {wp = "%l (administrative region)"},
	["Centre-Val de Loire, France"] = {},
	["Corsica, France"] = {},
	-- overseas departments are handled in `export.country_like_entities`
	-- ["French Guiana"] = {},
	["Grand Est, France"] = {},
	-- ["Guadeloupe"] = {},
	["Hauts-de-France, France"] = {},
	["Île-de-France, France"] = {},
	-- ["Martinique"] = {},
	-- ["Mayotte"] = {},
	["Normandy, France"] = {wp = "%l (administrative region)"},
	["Nouvelle-Aquitaine, France"] = {},
	["Occitania, France"] = {wp = "%l (administrative region)"},
	["Occitanie, France"] = {alias_of = "Occitania, France", display = true},
	["Pays de la Loire, France"] = {},
	["Provence-Alpes-Côte d'Azur, France"] = {},
	-- ["Réunion"] = {},
}

-- administrative regions of France
export.france_group = {
	default_container = "France",
	-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
	-- to 'region').
default_placetype = "region", default_divs = { "communes", {type = "municipalities", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, }, data = export.france_administrative_regions, } export.france_departments = { ["Ain, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 01 ["Aisne, France"] = {container = "Hauts-de-France"}, -- 02 ["Allier, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 03 ["Alpes-de-Haute-Provence, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04 ["Hautes-Alpes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05 ["Alpes-Maritimes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06 ["Ardèche, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 07 ["Ardennes, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 08 ["Ariège, France"] = {container = "Occitania", wp = "%l (department)"}, -- 09 ["Aube, France"] = {container = "Grand Est"}, -- 10 ["Aude, France"] = {container = "Occitania"}, -- 11 ["Aveyron, France"] = {container = "Occitania"}, -- 12 ["Bouches-du-Rhône, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13 ["Calvados, France"] = {container = "Normandy", wp = "%l (department)"}, -- 14 ["Cantal, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 15 ["Charente, France"] = {container = "Nouvelle-Aquitaine"}, -- 16 ["Charente-Maritime, France"] = {container = "Nouvelle-Aquitaine"}, -- 17 ["Cher, France"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18 ["Corrèze, France"] = {container = "Nouvelle-Aquitaine"}, -- 19 ["Corse-du-Sud, France"] = {container = "Corsica"}, -- 2A ["Haute-Corse, France"] = {container = "Corsica"}, -- 2B ["Côte-d'Or, France"] = {container = "Bourgogne-Franche-Comté"}, -- 21 ["Côte d'Or, France"] = {alias_of = "Côte-d'Or, France", display = true}, ["Côtes-d'Armor, France"] = {container = 
"Brittany"}, -- 22 ["Côtes d'Armor, France"] = {alias_of = "Côtes-d'Armor, France", display = true}, ["Creuse, France"] = {container = "Nouvelle-Aquitaine"}, -- 23 ["Dordogne, France"] = {container = "Nouvelle-Aquitaine"}, -- 24 ["Doubs, France"] = {container = "Bourgogne-Franche-Comté"}, -- 25 ["Drôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 26 ["Eure, France"] = {container = "Normandy"}, -- 27 ["Eure-et-Loir, France"] = {container = "Centre-Val de Loire"}, -- 28 ["Finistère, France"] = {container = "Brittany"}, -- 29 ["Gard, France"] = {container = "Occitania"}, -- 30 ["Haute-Garonne, France"] = {container = "Occitania"}, -- 31 ["Gers, France"] = {container = "Occitania"}, -- 32 ["Gironde, France"] = {container = "Nouvelle-Aquitaine"}, -- 33 ["Hérault, France"] = {container = "Occitania"}, -- 34 ["Ille-et-Vilaine, France"] = {container = "Brittany"}, -- 35 ["Indre, France"] = {container = "Centre-Val de Loire"}, -- 36 ["Indre-et-Loire, France"] = {container = "Centre-Val de Loire"}, -- 37 ["Isère, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 38 ["Jura, France"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39 ["Landes, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40 ["Loir-et-Cher, France"] = {container = "Centre-Val de Loire"}, -- 41 ["Loire, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42 ["Haute-Loire, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 43 ["Loire-Atlantique, France"] = {container = "Pays de la Loire"}, -- 44 ["Loiret, France"] = {container = "Centre-Val de Loire"}, -- 45 ["Lot, France"] = {container = "Occitania", wp = "%l (department)"}, -- 46 ["Lot-et-Garonne, France"] = {container = "Nouvelle-Aquitaine"}, -- 47 ["Lozère, France"] = {container = "Occitania"}, -- 48 ["Maine-et-Loire, France"] = {container = "Pays de la Loire"}, -- 49 ["Manche, France"] = {container = "Normandy"}, -- 50 ["Marne, France"] = {container = "Grand Est", wp = "%l 
(department)"}, -- 51 ["Haute-Marne, France"] = {container = "Grand Est"}, -- 52 ["Mayenne, France"] = {container = "Pays de la Loire"}, -- 53 ["Meurthe-et-Moselle, France"] = {container = "Grand Est"}, -- 54 ["Meuse, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 55 ["Morbihan, France"] = {container = "Brittany"}, -- 56 ["Moselle, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 57 ["Nièvre, France"] = {container = "Bourgogne-Franche-Comté"}, -- 58 ["Nord, France"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59 ["Oise, France"] = {container = "Hauts-de-France"}, -- 60 ["Orne, France"] = {container = "Normandy"}, -- 61 ["Pas-de-Calais, France"] = {container = "Hauts-de-France"}, -- 62 ["Puy-de-Dôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 63 ["Pyrénées-Atlantiques, France"] = {container = "Nouvelle-Aquitaine"}, -- 64 ["Hautes-Pyrénées, France"] = {container = "Occitania"}, -- 65 ["Pyrénées-Orientales, France"] = {container = "Occitania"}, -- 66 ["Bas-Rhin, France"] = {container = "Grand Est"}, -- 67 ["Haut-Rhin, France"] = {container = "Grand Est"}, -- 68 ["Rhône, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D ["Metropolis of Lyon, France"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M ["Lyon Metropolis, France"] = {alias_of = "Metropolis of Lyon, France"}, ["Lyon, France"] = {alias_of = "Metropolis of Lyon, France"}, ["Haute-Saône, France"] = {container = "Bourgogne-Franche-Comté"}, -- 70 ["Saône-et-Loire, France"] = {container = "Bourgogne-Franche-Comté"}, -- 71 ["Sarthe, France"] = {container = "Pays de la Loire"}, -- 72 ["Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 73 ["Haute-Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 74 ["Paris, France"] = {container = "Île-de-France"}, -- 75 ["Seine-Maritime, France"] = {container = "Normandy"}, -- 76 ["Seine-et-Marne, France"] = {container = "Île-de-France"}, -- 77 ["Yvelines, 
France"] = {container = "Île-de-France"}, -- 78 ["Deux-Sèvres, France"] = {container = "Nouvelle-Aquitaine"}, -- 79 ["Somme, France"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80 ["Tarn, France"] = {container = "Occitania", wp = "%l (department)"}, -- 81 ["Tarn-et-Garonne, France"] = {container = "Occitania"}, -- 82 ["Var, France"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83 ["Vaucluse, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84 ["Vendée, France"] = {container = "Pays de la Loire"}, -- 85 ["Vienne, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86 ["Haute-Vienne, France"] = {container = "Nouvelle-Aquitaine"}, -- 87 ["Vosges, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 88 ["Yonne, France"] = {container = "Bourgogne-Franche-Comté"}, -- 89 ["Territoire de Belfort, France"] = {container = "Bourgogne-Franche-Comté"}, -- 90 ["Essonne, France"] = {container = "Île-de-France"}, -- 91 ["Hauts-de-Seine, France"] = {container = "Île-de-France"}, -- 92 ["Seine-Saint-Denis, France"] = {container = "Île-de-France"}, -- 93 ["Val-de-Marne, France"] = {container = "Île-de-France"}, -- 94 ["Val-d'Oise, France"] = {container = "Île-de-France"}, -- 95 --["Guadeloupe"] = {container = "Guadeloupe"}, -- 971 --["Martinique"] = {container = "Martinique"}, -- 972 --["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973 --["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974 --["Mayotte"] = {container = "Mayotte"}, -- 976 } export.france_departments_group = { placename_to_key = make_placename_to_key(", France"), canonicalize_key_container = make_canonicalize_key_container(", France", "region"), default_placetype = "department", default_divs = { "communes", {type = "municipalities", cat_as = "communes"}, }, data = export.france_departments, } export.germany_states = { ["Baden-Württemberg, Germany"] = {}, ["Bavaria, Germany"] = {}, -- Berlin, Bremen 
and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override -- the default_divs setting. Better not to include them at all since they're included as cities down below. -- ["Berlin"] = {divs = {}}, ["Brandenburg, Germany"] = {}, -- ["Bremen"] = {divs = {}}, -- ["Hamburg"] = {divs = {}}, ["Hesse, Germany"] = {}, ["Lower Saxony, Germany"] = {}, ["Mecklenburg-Vorpommern, Germany"] = {}, ["Mecklenburg-Western Pomerania, Germany"] = {alias_of = "Mecklenburg-Vorpommern, Germany", display = true}, ["North Rhine-Westphalia, Germany"] = {}, ["Rhineland-Palatinate, Germany"] = {}, ["Saarland, Germany"] = {}, ["Saxony, Germany"] = {}, ["Saxony-Anhalt, Germany"] = {}, ["Schleswig-Holstein, Germany"] = {}, ["Thuringia, Germany"] = {}, } -- states of Germany export.germany_group = { default_container = "Germany", default_placetype = "negeri", default_divs = {"districts", "municipalities"}, data = export.germany_states, } export.greece_regions = { ["Attica, Greece"] = {wp = "%l (region)"}, ["Central Greece, Greece"] = {wp = "%l (administrative region)"}, ["Central Macedonia, Greece"] = {}, ["Crete, Greece"] = {}, ["Eastern Macedonia and Thrace, Greece"] = {}, ["Epirus, Greece"] = {wp = "%l (region)"}, ["Ionian Islands, Greece"] = {the = true, wp = "%l (region)"}, ["North Aegean, Greece"] = {the = true}, -- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]] -- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]] -- has "the" in it. 
	["Peloponnese, Greece"] = {wp = "%l (region)"},
	["South Aegean, Greece"] = {the = true},
	["Thessaly, Greece"] = {},
	["Western Greece, Greece"] = {},
	["Western Macedonia, Greece"] = {},
	["Mount Athos, Greece"] = {placetype = {"autonomous region", "region"}, wp = "Monastic community of Mount Athos"},
}

-- regions of Greece
export.greece_group = {
	default_container = "Greece",
	default_placetype = "region",
	data = export.greece_regions,
}

local india_polity_with_divisions = {"divisions", "districts"}
local india_polity_without_divisions = {"districts"}

-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
	["Andaman and Nicobar Islands, India"] = {the = true, placetype = "union territory", divs = india_polity_without_divisions},
	["Andhra Pradesh, India"] = {divs = india_polity_without_divisions},
	["Arunachal Pradesh, India"] = {divs = india_polity_with_divisions},
	["Assam, India"] = {divs = india_polity_with_divisions},
	["Bihar, India"] = {divs = india_polity_with_divisions},
	["Chandigarh, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
	["Chhattisgarh, India"] = {divs = india_polity_with_divisions},
	["Dadra and Nagar Haveli and Daman and Diu, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
	["Delhi, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
	["Goa, India"] = {divs = india_polity_without_divisions},
	["Gujarat, India"] = {divs = india_polity_without_divisions},
	["Haryana, India"] = {divs = india_polity_with_divisions},
	["Himachal Pradesh, India"] = {divs = india_polity_with_divisions},
	["Jammu and Kashmir, India"] = {placetype = "union territory", divs = india_polity_with_divisions, wp = "%l (union territory)"},
	["Jharkhand, India"] = {divs = india_polity_with_divisions},
	["Karnataka, India"] = {divs = india_polity_with_divisions},
	["Kerala, India"] = {divs = india_polity_without_divisions},
	["Ladakh, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
	["Lakshadweep, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
	["Madhya Pradesh, India"] = {divs = india_polity_with_divisions},
	["Maharashtra, India"] = {divs = india_polity_with_divisions},
	["Manipur, India"] = {divs = india_polity_without_divisions},
	["Meghalaya, India"] = {divs = india_polity_with_divisions},
	["Mizoram, India"] = {divs = india_polity_without_divisions},
	["Nagaland, India"] = {divs = india_polity_with_divisions},
	["Odisha, India"] = {divs = india_polity_with_divisions},
	["Puducherry, India"] = {placetype = "union territory", divs = india_polity_without_divisions, wp = "%l (union territory)"},
	["Pondicherry, India"] = {alias_of = "Puducherry, India", display = true},
	["Punjab, India"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
	["Rajasthan, India"] = {divs = india_polity_with_divisions},
	["Sikkim, India"] = {divs = india_polity_without_divisions},
	["Tamil Nadu, India"] = {divs = india_polity_without_divisions},
	["Telangana, India"] = {divs = india_polity_without_divisions},
	["Tripura, India"] = {divs = india_polity_without_divisions},
	["Uttar Pradesh, India"] = {divs = india_polity_with_divisions},
	["Uttarakhand, India"] = {divs = india_polity_with_divisions},
	["West Bengal, India"] = {divs = india_polity_with_divisions},
}

-- states and union territories of India
export.india_group = {
	default_container = "India",
	default_placetype = "negeri",
	data = export.india_states_and_union_territories,
}

export.indonesia_provinces = {
	["Aceh, Indonesia"] = {},
	["Bali, Indonesia"] = {},
	["Bangka Belitung Islands, Indonesia"] = {the = true},
	["Banten, Indonesia"] = {},
	["Bengkulu, Indonesia"] = {},
	["Central Java, Indonesia"] = {},
	["Central Kalimantan, Indonesia"] = {},
	["Central Papua, Indonesia"] = {},
	["Central Sulawesi, Indonesia"] = {},
	["East Java, Indonesia"] = {},
	["East Kalimantan, Indonesia"] = {},
	["East Nusa Tenggara, Indonesia"] = {},
	["Gorontalo, Indonesia"] = {},
	["Highland Papua, Indonesia"] = {wp = "%l"},
	["Special Capital Region of Jakarta, Indonesia"] = {the = true, wp = "Jakarta"},
	["Jakarta, Indonesia"] = {alias_of = "Special Capital Region of Jakarta, Indonesia"},
	["Jambi, Indonesia"] = {},
	["Lampung, Indonesia"] = {},
	["Maluku, Indonesia"] = {},
	["North Kalimantan, Indonesia"] = {},
	["North Maluku, Indonesia"] = {},
	["North Sulawesi, Indonesia"] = {},
	["North Papua, Indonesia"] = {},
	["North Sumatra, Indonesia"] = {},
	["Papua, Indonesia"] = {wp = "%l (province)"},
	["Riau, Indonesia"] = {},
	["Riau Islands, Indonesia"] = {the = true},
	["Southeast Sulawesi, Indonesia"] = {},
	["South Kalimantan, Indonesia"] = {},
	["South Papua, Indonesia"] = {},
	["South Sulawesi, Indonesia"] = {},
	["South Sumatra, Indonesia"] = {},
	["Southwest Papua, Indonesia"] = {},
	["West Java, Indonesia"] = {},
	["West Kalimantan, Indonesia"] = {},
	["West Nusa Tenggara, Indonesia"] = {},
	["West Papua, Indonesia"] = {wp = "%l (province)"},
	["West Sulawesi, Indonesia"] = {},
	["West Sumatra, Indonesia"] = {},
	["Special Region of Yogyakarta, Indonesia"] = {the = true},
	["Yogyakarta, Indonesia"] = {alias_of = "Special Region of Yogyakarta, Indonesia"},
}

-- provinces of Indonesia
export.indonesia_group = {
	default_container = "Indonesia",
	default_placetype = "province",
	-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
	-- spellings.
data = export.indonesia_provinces, } export.iran_provinces = { ["Alborz Province, Iran"] = {}, -- abbreviation AL, capital [[w:Karaj]] ["Ardabil Province, Iran"] = {}, -- abbreviation AR, capital [[w:Ardabil]] ["Bushehr Province, Iran"] = {}, -- abbreviation BU, capital [[w:Bushehr]] ["Chaharmahal and Bakhtiari Province, Iran"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]] ["East Azerbaijan Province, Iran"] = {}, -- abbreviation EA, capital [[w:Tabriz]] ["Fars Province, Iran"] = {}, -- abbreviation FA, capital [[w:Shiraz]] ["Pars Province, Iran"] = {alias_of = "Fars Province, Iran", display = true}, ["Gilan Province, Iran"] = {}, -- abbreviation GN, capital [[w:Rasht]] ["Golestan Province, Iran"] = {}, -- abbreviation GO, capital [[w:Gorgan]] ["Hamadan Province, Iran"] = {}, -- abbreviation HA, capital [[w:Hamadan]] ["Hormozgan Province, Iran"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]] ["Ilam Province, Iran"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]] ["Isfahan Province, Iran"] = {}, -- abbreviation IS, capital [[w:Isfahan]] ["Kerman Province, Iran"] = {}, -- abbreviation KN, capital [[w:Kerman]] ["Kermanshah Province, Iran"] = {}, -- abbreviation KE, capital [[w:Kermanshah]] ["Khuzestan Province, Iran"] = {}, -- abbreviation KH, capital [[w:Ahvaz]] ["Kohgiluyeh and Boyer-Ahmad Province, Iran"] = {}, -- abbreviation KB, capital [[w:Yasuj]] ["Kurdistan Province, Iran"] = {}, -- abbreviation KU, capital [[w:Sanandaj]] ["Lorestan Province, Iran"] = {}, -- abbreviation LO, capital [[w:Khorramabad]] ["Markazi Province, Iran"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]] ["Mazandaran Province, Iran"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]] ["North Khorasan Province, Iran"] = {}, -- abbreviation NK, capital [[w:Bojnord]] ["Qazvin Province, Iran"] = {}, -- abbreviation QA, capital [[w:Qazvin]] ["Qom Province, Iran"] = {}, -- abbreviation QM, capital [[w:Qom]] ["Razavi Khorasan Province, Iran"] = {}, -- abbreviation 
RK, capital [[w:Mashhad]] ["Semnan Province, Iran"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]] ["Sistan and Baluchestan Province, Iran"] = {}, -- abbreviation SB, capital [[w:Zahedan]] ["South Khorasan Province, Iran"] = {}, -- abbreviation SK, capital [[w:Birjand]] ["Tehran Province, Iran"] = {}, -- abbreviation TE, capital [[w:Tehran]] ["West Azerbaijan Province, Iran"] = {}, -- abbreviation WA, capital [[w:Urmia]] ["Yazd Province, Iran"] = {}, -- abbreviation YA, capital [[w:Yazd]] ["Zanjan Province, Iran"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]] } -- provinces of Iran export.iran_group = { key_to_placename = make_key_to_placename(", Iran", " Province$"), placename_to_key = make_placename_to_key(", Iran", " Province"), default_container = "Iran", default_placetype = "province", -- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them -- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]], -- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].) 
-- default_divs = "counties", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "%e province", data = export.iran_provinces, } export.ireland_counties = { ["County Carlow, Ireland"] = {}, ["County Cavan, Ireland"] = {}, ["County Clare, Ireland"] = {}, ["County Cork, Ireland"] = {}, ["County Donegal, Ireland"] = {}, ["County Dublin, Ireland"] = {}, ["County Galway, Ireland"] = {}, ["County Kerry, Ireland"] = {}, ["County Kildare, Ireland"] = {}, ["County Kilkenny, Ireland"] = {}, ["County Laois, Ireland"] = {}, ["County Leitrim, Ireland"] = {}, ["County Limerick, Ireland"] = {}, ["County Longford, Ireland"] = {}, ["County Louth, Ireland"] = {}, ["County Mayo, Ireland"] = {}, ["County Meath, Ireland"] = {}, ["County Monaghan, Ireland"] = {}, ["County Offaly, Ireland"] = {}, ["County Roscommon, Ireland"] = {}, ["County Sligo, Ireland"] = {}, ["County Tipperary, Ireland"] = {}, ["County Waterford, Ireland"] = {}, ["County Westmeath, Ireland"] = {}, ["County Wexford, Ireland"] = {}, ["County Wicklow, Ireland"] = {}, } local function make_irish_type_key_to_placename(container_pattern) return function(key) key = key:gsub(container_pattern, "") local elliptical_key = key:gsub("^County ", "") return key, elliptical_key end end local function make_irish_type_placename_to_key(container_suffix) return function(placename) if not placename:find("^County ") and not placename:find("^City ") then placename = "County " .. placename end return placename .. 
container_suffix end end -- counties of Ireland export.ireland_group = { key_to_placename = make_irish_type_key_to_placename(", Ireland$"), placename_to_key = make_irish_type_placename_to_key(", Ireland"), default_container = "Ireland", default_placetype = "county", data = export.ireland_counties, } export.italy_administrative_regions = { ["Abruzzo, Italy"] = {}, ["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}}, ["Apulia, Italy"] = {}, ["Basilicata, Italy"] = {}, ["Calabria, Italy"] = {}, ["Campania, Italy"] = {}, ["Emilia-Romagna, Italy"] = {}, ["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}}, ["Lazio, Italy"] = {}, ["Liguria, Italy"] = {}, ["Lombardy, Italy"] = {}, ["Marche, Italy"] = {}, ["Molise, Italy"] = {}, ["Piedmont, Italy"] = {}, ["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}}, ["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}}, ["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}}, ["Tuscany, Italy"] = {}, ["Umbria, Italy"] = {}, ["Veneto, Italy"] = {}, } -- administrative regions of Italy export.italy_group = { default_container = "Italy", default_placetype = "region", data = export.italy_administrative_regions, } -- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately export.japan_prefectures = { ["Aichi Prefecture, Japan"] = {}, ["Akita Prefecture, Japan"] = {}, ["Aomori Prefecture, Japan"] = {}, ["Chiba Prefecture, Japan"] = {}, ["Ehime Prefecture, Japan"] = {}, ["Fukui Prefecture, Japan"] = {}, ["Fukuoka Prefecture, Japan"] = {}, ["Fukushima Prefecture, Japan"] = {}, ["Gifu Prefecture, Japan"] = {}, ["Gunma Prefecture, Japan"] = {}, ["Hiroshima Prefecture, Japan"] = {}, ["Hokkaido Prefecture, Japan"] = {divs = "subprefectures", wp = "Hokkaido"}, ["Hyōgo Prefecture, 
Japan"] = {}, ["Hyogo Prefecture, Japan"] = {alias_of = "Hyōgo Prefecture, Japan", display = true}, ["Ibaraki Prefecture, Japan"] = {}, ["Ishikawa Prefecture, Japan"] = {}, ["Iwate Prefecture, Japan"] = {}, ["Kagawa Prefecture, Japan"] = {}, ["Kagoshima Prefecture, Japan"] = {}, ["Kanagawa Prefecture, Japan"] = {}, ["Kōchi Prefecture, Japan"] = {}, ["Kochi Prefecture, Japan"] = {alias_of = "Kōchi Prefecture, Japan", display = true}, ["Kumamoto Prefecture, Japan"] = {}, ["Kyoto Prefecture, Japan"] = {}, ["Mie Prefecture, Japan"] = {}, ["Miyagi Prefecture, Japan"] = {}, ["Miyazaki Prefecture, Japan"] = {}, ["Nagano Prefecture, Japan"] = {}, ["Nagasaki Prefecture, Japan"] = {}, ["Nara Prefecture, Japan"] = {}, ["Niigata Prefecture, Japan"] = {}, ["Ōita Prefecture, Japan"] = {}, ["Oita Prefecture, Japan"] = {alias_of = "Ōita Prefecture, Japan", display = true}, ["Okayama Prefecture, Japan"] = {}, ["Okinawa Prefecture, Japan"] = {}, ["Osaka Prefecture, Japan"] = {}, ["Saga Prefecture, Japan"] = {}, ["Saitama Prefecture, Japan"] = {}, ["Shiga Prefecture, Japan"] = {}, ["Shimane Prefecture, Japan"] = {}, ["Shizuoka Prefecture, Japan"] = {}, ["Tochigi Prefecture, Japan"] = {}, ["Tokushima Prefecture, Japan"] = {}, ["Tottori Prefecture, Japan"] = {}, ["Toyama Prefecture, Japan"] = {}, ["Wakayama Prefecture, Japan"] = {}, ["Yamagata Prefecture, Japan"] = {}, ["Yamaguchi Prefecture, Japan"] = {}, ["Yamanashi Prefecture, Japan"] = {}, } -- prefectures of Japan export.japan_group = { key_to_placename = make_key_to_placename(", Japan$", " Prefecture$"), placename_to_key = make_placename_to_key(", Japan", " Prefecture"), default_container = "Japan", default_placetype = "prefecture", data = export.japan_prefectures, } export.laos_provinces = { ["Attapeu Province, Laos"] = {}, ["Bokeo Province, Laos"] = {}, ["Bolikhamxai Province, Laos"] = {}, ["Champasak Province, Laos"] = {}, ["Houaphanh Province, Laos"] = {}, ["Khammouane Province, Laos"] = {}, ["Luang Namtha Province, Laos"] = 
{}, ["Luang Prabang Province, Laos"] = {}, ["Oudomxay Province, Laos"] = {}, ["Phongsaly Province, Laos"] = {}, ["Salavan Province, Laos"] = {}, ["Savannakhet Province, Laos"] = {}, ["Vientiane Province, Laos"] = {}, ["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"}, ["Sainyabuli Province, Laos"] = {}, ["Sekong Province, Laos"] = {}, ["Xaisomboun Province, Laos"] = {}, ["Xiangkhouang Province, Laos"] = {}, } local function laos_placename_to_key(placename) if placename == "Vientiane Prefecture" then return placename .. ", Laos" end if placename:find(" Province$") then return placename .. ", Laos" end return placename .. " Province, Laos" end -- provinces of Laos export.laos_group = { key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}), placename_to_key = laos_placename_to_key, default_container = "Laos", default_placetype = "province", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "%e province", data = export.laos_provinces, } export.lebanon_governorates = { ["Akkar Governorate, Lebanon"] = {}, ["Baalbek-Hermel Governorate, Lebanon"] = {}, ["Beirut Governorate, Lebanon"] = {}, ["Beqaa Governorate, Lebanon"] = {}, ["Keserwan-Jbeil Governorate, Lebanon"] = {}, ["Mount Lebanon Governorate, Lebanon"] = {}, ["Nabatieh Governorate, Lebanon"] = {}, -- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or -- `gov/South Governorate` with `c/Lebanon`. 
	["North Governorate, Lebanon"] = {no_auto_augment_container = true},
	["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}

-- governorates of Lebanon
export.lebanon_group = {
	key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
	placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
	default_container = "Lebanon",
	default_placetype = "governorate",
	data = export.lebanon_governorates,
}

export.malaysia_states = {
	["Johor, Malaysia"] = {},
	["Kedah, Malaysia"] = {},
	["Kelantan, Malaysia"] = {},
	["Malacca, Malaysia"] = {},
	["Negeri Sembilan, Malaysia"] = {},
	["Pahang, Malaysia"] = {},
	["Penang, Malaysia"] = {},
	["Perak, Malaysia"] = {},
	["Perlis, Malaysia"] = {},
	["Sabah, Malaysia"] = {},
	["Sarawak, Malaysia"] = {},
	["Selangor, Malaysia"] = {},
	["Terengganu, Malaysia"] = {},
}

-- states of Malaysia
export.malaysia_group = {
	default_container = "Malaysia",
	default_placetype = "negeri",
	default_wp = "%l, %c",
	data = export.malaysia_states,
}

export.malta_regions = {
	-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
	-- `r/Northern Region` with `c/Malta`. In particular:
	-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
	--   El Salvador;
	-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
	-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
	--   Serbia and Uganda;
	-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
	["Eastern Region, Malta"] = {no_auto_augment_container = true},
	["Gozo Region, Malta"] = {wp = "%l"},
	["Northern Region, Malta"] = {no_auto_augment_container = true},
	["Port Region, Malta"] = {},
	["Southern Region, Malta"] = {no_auto_augment_container = true},
	["Western Region, Malta"] = {no_auto_augment_container = true},
}

-- regions of Malta
export.malta_group = {
	key_to_placename = make_key_to_placename(", Malta$", " Region"),
	placename_to_key = make_placename_to_key(", Malta", " Region"),
	default_container = "Malta",
	default_placetype = "region",
	default_wp = "%l, %c",
	default_the = true,
	data = export.malta_regions,
}

export.mexico_states = {
	["Aguascalientes, Mexico"] = {},
	["Baja California, Mexico"] = {},
	-- not display-canonicalizing because the "Norte" could be for emphasis
	["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
	["Baja California Sur, Mexico"] = {},
	["Campeche, Mexico"] = {},
	["Chiapas, Mexico"] = {},
	["Chihuahua, Mexico"] = {wp = "%l (state)"},
	["Coahuila, Mexico"] = {},
	["Colima, Mexico"] = {},
	["Durango, Mexico"] = {},
	["Guanajuato, Mexico"] = {},
	["Guerrero, Mexico"] = {},
	["Hidalgo, Mexico"] = {wp = "%l (state)"},
	["Jalisco, Mexico"] = {},
	["State of Mexico, Mexico"] = {the = true},
	["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
	-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
	["Michoacán, Mexico"] = {},
	["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
	["Morelos, Mexico"] = {},
	["Nayarit, Mexico"] = {},
	["Nuevo León, Mexico"] = {},
	["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
	["Oaxaca, Mexico"] = {},
	["Puebla, Mexico"] = {},
	["Querétaro, Mexico"] = {},
	["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
	["Quintana Roo, Mexico"] = {},
	["San Luis Potosí, Mexico"] = {},
	["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
	["Sinaloa, Mexico"] = {},
["Sonora, Mexico"] = {}, ["Tabasco, Mexico"] = {}, ["Tamaulipas, Mexico"] = {}, ["Tlaxcala, Mexico"] = {}, ["Veracruz, Mexico"] = {}, ["Yucatán, Mexico"] = {}, ["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true}, ["Zacatecas, Mexico"] = {}, } -- Mexican states export.mexico_group = { default_container = "Mexico", default_placetype = "negeri", data = export.mexico_states, } export.moldova_districts_and_autonomous_territorial_units = { ["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]] ["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]] ["Briceni District, Moldova"] = {}, -- capital [[Briceni]] ["Cahul District, Moldova"] = {}, -- capital [[Cahul]] ["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]] ["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]] ["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]] ["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]] ["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]] ["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]] ["Drochia District, Moldova"] = {}, -- capital [[Drochia]] ["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]] ["Edineț District, Moldova"] = {}, -- capital [[Edineț]] ["Fălești District, Moldova"] = {}, -- capital [[Fălești]] ["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]] ["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]] ["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]] ["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]] ["Leova District, Moldova"] = {}, -- capital [[Leova]] ["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]] ["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]] ["Orhei District, Moldova"] = {}, -- capital [[Orhei]] ["Rezina District, Moldova"] = {}, -- capital [[Rezina]] ["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]] ["Sîngerei District, Moldova"] = {}, -- capital 
[[Sîngerei]]
	["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
	["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
	["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
	["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
	["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
	["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
	["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
	["Chișinău, Moldova"] = {placetype = "municipality"},
	["Bălți, Moldova"] = {placetype = "municipality"},
	["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Comrat]]
	-- the remainder are under the de-facto control of the unrecognized state of Transnistria
	["Bender, Moldova"] = {placetype = "municipality"},
	["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
	["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Tiraspol]]
	["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
	["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}

-- Map a user-specified placename to its key. Names that are already keys once ", Moldova" is
-- appended (municipalities, autonomous units and their aliases) are returned as-is; otherwise
-- both "Orhei" and "Orhei District" map to "Orhei District, Moldova".
local function moldova_placename_to_key(placename)
	local elliptical_key = placename .. ", Moldova"
	if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
		return elliptical_key
	end
	if placename:find(" District$") then
		return placename .. ", Moldova"
	end
	return placename .. " District, Moldova"
end

-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
	key_to_placename = make_key_to_placename(", Moldova$", " District$"),
	placename_to_key = moldova_placename_to_key,
	default_container = "Moldova",
	default_placetype = {"district", "raion"},
	default_divs = "communes",
	data = export.moldova_districts_and_autonomous_territorial_units,
}

export.morocco_regions = {
	["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
	["Oriental, Morocco"] = {wp = "%l (%c)"},
	["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
	["Fez-Meknes, Morocco"] = {},
	["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
	["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
	["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
	["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
	["Casablanca-Settat, Morocco"] = {},
	["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
	["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
	["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
	["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
	["Souss-Massa, Morocco"] = {},
	["Guelmim-Oued Noun, Morocco"] = {
		keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
	},
	["Laayoune-Sakia El Hamra, Morocco"] = {
		wp = "Laâyoune-Sakia El Hamra",
		keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
	},
	["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
	["Dakhla-Oued Ed-Dahab, Morocco"] = {
		keydesc = "+++. 
'''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]", }, } -- regions of Morocco export.morocco_group = { default_container = "Morocco", default_placetype = "region", data = export.morocco_regions, } export.egypt_governorates = { ["Cairo Governorate, Egypt"] = {}, ["Giza Governorate, Egypt"] = {}, ["Sharqia Governorate, Egypt"] = {}, ["Dakahlia Governorate, Egypt"] = {}, ["Beheira Governorate, Egypt"] = {}, ["Minya Governorate, Egypt"] = {}, ["Qalyubia Governorate, Egypt"] = {}, ["Sohag Governorate, Egypt"] = {}, ["Alexandria Governorate, Egypt"] = {}, ["Gharbia Governorate, Egypt"] = {}, ["Asyut Governorate, Egypt"] = {}, ["Monufia Governorate, Egypt"] = {}, ["Faiyum Governorate, Egypt"] = {}, ["Kafr El Sheikh Governorate, Egypt"] = {}, ["Qena Governorate, Egypt"] = {}, ["Beni Suef Governorate, Egypt"] = {}, ["Damietta Governorate, Egypt"] = {}, ["Aswan Governorate, Egypt"] = {}, ["Ismailia Governorate, Egypt"] = {}, ["Luxor Governorate, Egypt"] = {}, ["Suez Governorate, Egypt"] = {}, ["Port Said Governorate, Egypt"] = {}, ["Matrouh Governorate, Egypt"] = {}, ["North Sinai Governorate, Egypt"] = {}, ["Red Sea Governorate, Egypt"] = {}, ["New Valley Governorate, Egypt"] = {}, ["South Sinai Governorate, Egypt"] = {}, } -- governorates of Egypt export.egypt_group = { key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"), placename_to_key = make_placename_to_key(", Egypt", " Governorate"), default_container = "Egypt", default_placetype = "governorate", data = export.egypt_governorates, } export.netherlands_provinces = { ["Drenthe, Netherlands"] = {}, ["Flevoland, Netherlands"] = {}, ["Friesland, Netherlands"] = {}, ["Gelderland, Netherlands"] = {}, ["Groningen, Netherlands"] = {wp = "%l (province)"}, ["Limburg, Netherlands"] = {wp = "%l (%c)"}, ["North Brabant, Netherlands"] = {}, -- Foreign forms get display-canonicalized. 
["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true}, ["North Holland, Netherlands"] = {}, ["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true}, ["Overijssel, Netherlands"] = {}, ["South Holland, Netherlands"] = {}, ["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true}, ["Utrecht, Netherlands"] = {wp = "%l (province)"}, ["Zeeland, Netherlands"] = {}, } -- provinces of the Netherlands export.netherlands_group = { default_container = "Netherlands", default_placetype = "province", default_divs = "municipalities", data = export.netherlands_provinces, } export.new_zealand_regions = { -- North Island regions ["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]] ["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]] ["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]] ["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]] ["Gisborne, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]] ["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]] ["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]] ["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]] ["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true}, ["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]] -- South Island 
regions ["Tasman, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]] ["Nelson, New Zealand"] = {placetype = {"region", "city"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]] ["Marlborough, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]] ["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]] ["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]] ["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]] ["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]] } -- regions of New Zealand export.new_zealand_group = { default_container = "New Zealand", default_placetype = "region", data = export.new_zealand_regions, } export.nigeria_states = { ["Abia State, Nigeria"] = {}, ["Adamawa State, Nigeria"] = {}, ["Akwa Ibom State, Nigeria"] = {}, ["Anambra State, Nigeria"] = {}, ["Bauchi State, Nigeria"] = {}, ["Bayelsa State, Nigeria"] = {}, ["Benue State, Nigeria"] = {}, ["Borno State, Nigeria"] = {}, ["Cross River State, Nigeria"] = {}, ["Delta State, Nigeria"] = {}, ["Ebonyi State, Nigeria"] = {}, ["Edo State, Nigeria"] = {}, ["Ekiti State, Nigeria"] = {}, ["Enugu State, Nigeria"] = {}, ["Federal Capital Territory, Nigeria"] = { -- not a state but allow it to be referenced as one in holonyms placetype = {"wilayah persekutuan", "territory", "negeri"}, the = true, wp = "%l (%c)", }, ["Gombe State, Nigeria"] = {}, ["Imo State, Nigeria"] = {}, ["Jigawa State, Nigeria"] = {}, ["Kaduna State, Nigeria"] = {}, ["Kano State, Nigeria"] = {}, ["Katsina State, Nigeria"] = {}, ["Kebbi State, Nigeria"] = {}, 
["Kogi State, Nigeria"] = {}, ["Kwara State, Nigeria"] = {}, ["Lagos State, Nigeria"] = {}, ["Nasarawa State, Nigeria"] = {}, ["Niger State, Nigeria"] = {}, ["Ogun State, Nigeria"] = {}, ["Ondo State, Nigeria"] = {}, ["Osun State, Nigeria"] = {}, ["Oyo State, Nigeria"] = {}, ["Plateau State, Nigeria"] = {}, ["Rivers State, Nigeria"] = {}, ["Sokoto State, Nigeria"] = {}, ["Taraba State, Nigeria"] = {}, ["Yobe State, Nigeria"] = {}, ["Zamfara State, Nigeria"] = {}, } -- states of Nigeria export.nigeria_group = { key_to_placename = make_key_to_placename(", Nigeria$", " State$"), placename_to_key = make_placename_to_key(", Nigeria", " State"), default_container = "Nigeria", default_placetype = "negeri", data = export.nigeria_states, } export.north_korea_provinces = { ["Chagang Province, North Korea"] = {}, ["North Hamgyong Province, North Korea"] = {}, ["South Hamgyong Province, North Korea"] = {}, ["North Hwanghae Province, North Korea"] = {}, ["South Hwanghae Province, North Korea"] = {}, ["Kangwon Province, North Korea"] = {wp = "%l (%c)"}, ["North Pyongan Province, North Korea"] = {}, ["South Pyongan Province, North Korea"] = {}, ["Ryanggang Province, North Korea"] = {}, } -- provinces of North Korea export.north_korea_group = { key_to_placename = make_key_to_placename(", North Korea$", " Province$"), placename_to_key = make_placename_to_key(", North Korea", " Province"), default_container = "North Korea", default_placetype = "province", data = export.north_korea_provinces, } export.norwegian_counties = { ["Oslo, Norway"] = {}, ["Rogaland, Norway"] = {}, ["Møre og Romsdal, Norway"] = {}, ["Nordland, Norway"] = {}, ["Østfold, Norway"] = {}, ["Akershus, Norway"] = {}, ["Buskerud, Norway"] = {}, -- the following two were merged into Innlandet -- ["Hedmark, Norway"] = {}, -- ["Oppland, Norway"] = {}, ["Innlandet, Norway"] = {}, ["Vestfold, Norway"] = {}, ["Telemark, Norway"] = {}, -- the following two were merged into Agder -- ["Aust-Agder, Norway"] = {}, -- 
["Vest-Agder, Norway"] = {}, ["Agder, Norway"] = {}, -- the following two were merged into Vestland -- ["Hordaland, Norway"] = {}, -- ["Sogn og Fjordane, Norway"] = {}, ["Vestland, Norway"] = {}, ["Trøndelag, Norway"] = {}, ["Troms, Norway"] = {}, ["Finnmark, Norway"] = {}, } -- counties of Norway export.norway_group = { default_container = "Norway", default_placetype = "county", data = export.norwegian_counties, } export.pakistan_provinces_and_territories = { ["Azad Kashmir, Pakistan"] = { placetype = {"administrative territory", "autonomous territory", "territory"}, }, ["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true}, ["Balochistan, Pakistan"] = {wp = "%l, %c"}, ["Gilgit-Baltistan, Pakistan"] = { placetype = {"administrative territory", "territory"}, }, ["Islamabad Capital Territory, Pakistan"] = { the = true, divs = {}, -- no divisions placetype = {"wilayah persekutuan", "administrative territory", "territory"}, }, -- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes ["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"}, ["Khyber Pakhtunkhwa, Pakistan"] = {}, ["Punjab, Pakistan"] = {wp = "%l, %c"}, ["Sindh, Pakistan"] = {}, } -- provinces and territories of Pakistan export.pakistan_group = { default_container = "Pakistan", default_placetype = "province", default_divs = "divisions", data = export.pakistan_provinces_and_territories, } export.philippines_provinces = { ["Abra, Philippines"] = {wp = "%l (province)"}, ["Agusan del Norte, Philippines"] = {}, ["Agusan del Sur, Philippines"] = {}, ["Aklan, Philippines"] = {}, ["Albay, Philippines"] = {}, ["Antique, Philippines"] = {wp = "%l (province)"}, ["Apayao, Philippines"] = {}, ["Aurora, Philippines"] = {wp = "%l (province)"}, ["Basilan, Philippines"] = {}, ["Bataan, Philippines"] = {}, ["Batanes, Philippines"] = {}, ["Batangas, Philippines"] = {}, ["Benguet, Philippines"] = {}, ["Biliran, Philippines"] = 
{}, ["Bohol, Philippines"] = {}, ["Bukidnon, Philippines"] = {}, ["Bulacan, Philippines"] = {}, ["Cagayan, Philippines"] = {}, ["Camarines Norte, Philippines"] = {}, ["Camarines Sur, Philippines"] = {}, ["Camiguin, Philippines"] = {}, ["Capiz, Philippines"] = {}, ["Catanduanes, Philippines"] = {}, ["Cavite, Philippines"] = {}, ["Cebu, Philippines"] = {}, ["Cotabato, Philippines"] = {}, ["Davao de Oro, Philippines"] = {}, ["Davao del Norte, Philippines"] = {}, ["Davao del Sur, Philippines"] = {}, ["Davao Occidental, Philippines"] = {}, ["Davao Oriental, Philippines"] = {}, ["Dinagat Islands, Philippines"] = {the = true}, ["Eastern Samar, Philippines"] = {}, ["Guimaras, Philippines"] = {}, ["Ifugao, Philippines"] = {}, ["Ilocos Norte, Philippines"] = {}, ["Ilocos Sur, Philippines"] = {}, ["Iloilo, Philippines"] = {}, ["Isabela, Philippines"] = {wp = "%l (province)"}, ["Kalinga, Philippines"] = {wp = "%l (province)"}, ["La Union, Philippines"] = {}, ["Laguna, Philippines"] = {wp = "%l (province)"}, ["Lanao del Norte, Philippines"] = {}, ["Lanao del Sur, Philippines"] = {}, ["Leyte, Philippines"] = {wp = "%l (province)"}, ["Maguindanao del Norte, Philippines"] = {}, ["Maguindanao del Sur, Philippines"] = {}, ["Marinduque, Philippines"] = {}, ["Masbate, Philippines"] = {}, ["Misamis Occidental, Philippines"] = {}, ["Misamis Oriental, Philippines"] = {}, ["Mountain Province, Philippines"] = {}, ["Negros Occidental, Philippines"] = {}, ["Negros Oriental, Philippines"] = {}, ["Northern Samar, Philippines"] = {}, ["Nueva Ecija, Philippines"] = {}, ["Nueva Vizcaya, Philippines"] = {}, ["Occidental Mindoro, Philippines"] = {}, ["Oriental Mindoro, Philippines"] = {}, ["Palawan, Philippines"] = {}, ["Pampanga, Philippines"] = {}, ["Pangasinan, Philippines"] = {}, ["Quezon, Philippines"] = {}, ["Quirino, Philippines"] = {}, ["Rizal, Philippines"] = {wp = "%l (province)"}, ["Romblon, Philippines"] = {}, ["Samar, Philippines"] = {wp = "%l (province)"}, ["Sarangani, Philippines"] = 
{}, ["Siquijor, Philippines"] = {}, ["Sorsogon, Philippines"] = {}, ["South Cotabato, Philippines"] = {}, ["Southern Leyte, Philippines"] = {}, ["Sultan Kudarat, Philippines"] = {}, ["Sulu, Philippines"] = {}, ["Surigao del Norte, Philippines"] = {}, ["Surigao del Sur, Philippines"] = {}, ["Tarlac, Philippines"] = {}, ["Tawi-Tawi, Philippines"] = {}, ["Zambales, Philippines"] = {}, ["Zamboanga del Norte, Philippines"] = {}, ["Zamboanga del Sur, Philippines"] = {}, ["Zamboanga Sibugay, Philippines"] = {}, -- not a province but treated as one; allow it to be referred to as a province in holonyms ["Metro Manila, Philippines"] = {placetype = {"region", "province"}}, } -- provinces of the Philippines export.philippines_group = { default_container = "Philippines", default_placetype = "province", default_divs = {"municipalities", "barangays"}, data = export.philippines_provinces, } export.poland_voivodeships = { ["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław ["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal) ["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin ["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal) ["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź ["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true}, ["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków ["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw ["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole ["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów ["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok ["Pomeranian Voivodeship, Poland"] = {}, -- 
abbr PM, code 22, capital Gdańsk ["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice ["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce ["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true}, ["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn ["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań ["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin } -- voivodeships of Poland export.poland_group = { key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"), placename_to_key = make_placename_to_key(", Poland", " Voivodeship"), default_container = "Poland", default_placetype = "voivodeship", default_divs = { -- "counties", -- not enough of them currently {type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}}, }, data = export.poland_voivodeships, } export.portugal_districts_and_autonomous_regions = { ["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "region"}}, ["Aveiro District, Portugal"] = {}, ["Beja District, Portugal"] = {}, ["Braga District, Portugal"] = {}, ["Bragança District, Portugal"] = {}, ["Castelo Branco District, Portugal"] = {}, ["Coimbra District, Portugal"] = {}, ["Évora District, Portugal"] = {}, ["Faro District, Portugal"] = {}, ["Guarda District, Portugal"] = {}, ["Leiria District, Portugal"] = {}, ["Lisbon District, Portugal"] = {}, ["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true}, ["Madeira, Portugal"] = {placetype = {"autonomous region", "region"}}, ["Portalegre District, Portugal"] = {}, ["Porto District, Portugal"] = {}, ["Santarém District, Portugal"] = {}, ["Setúbal District, Portugal"] = {}, ["Viana do Castelo District, Portugal"] = {}, ["Vila Real District, Portugal"] = {}, ["Viseu District, Portugal"] = {}, 
}

-- Map a user-specified placename to its key. The autonomous regions have no " District" suffix;
-- for districts, both "Faro" and "Faro District" map to "Faro District, Portugal".
local function portugal_placename_to_key(placename)
	if placename == "Azores" or placename == "Madeira" then
		return placename .. ", Portugal"
	end
	if placename:find(" District$") then
		return placename .. ", Portugal"
	end
	return placename .. " District, Portugal"
end

-- districts and autonomous regions of Portugal
export.portugal_group = {
	key_to_placename = make_key_to_placename(", Portugal$", " District$"),
	placename_to_key = portugal_placename_to_key,
	default_container = "Portugal",
	default_placetype = "district",
	default_divs = "municipalities",
	data = export.portugal_districts_and_autonomous_regions,
}

export.romania_counties = {
	["Alba County, Romania"] = {},
	["Arad County, Romania"] = {},
	["Argeș County, Romania"] = {},
	["Bacău County, Romania"] = {},
	["Bihor County, Romania"] = {},
	["Bistrița-Năsăud County, Romania"] = {},
	["Botoșani County, Romania"] = {},
	["Brașov County, Romania"] = {},
	["Brăila County, Romania"] = {},
	-- Bucharest: not in a county
	["Buzău County, Romania"] = {},
	["Caraș-Severin County, Romania"] = {},
	["Cluj County, Romania"] = {},
	["Constanța County, Romania"] = {},
	["Covasna County, Romania"] = {},
	["Călărași County, Romania"] = {},
	["Dolj County, Romania"] = {},
	["Dâmbovița County, Romania"] = {},
	["Galați County, Romania"] = {},
	["Giurgiu County, Romania"] = {},
	["Gorj County, Romania"] = {},
	["Harghita County, Romania"] = {},
	["Hunedoara County, Romania"] = {},
	["Ialomița County, Romania"] = {},
	["Iași County, Romania"] = {},
	["Ilfov County, Romania"] = {},
	["Maramureș County, Romania"] = {},
	["Mehedinți County, Romania"] = {},
	["Mureș County, Romania"] = {},
	["Neamț County, Romania"] = {},
	["Olt County, Romania"] = {},
	["Prahova County, Romania"] = {},
	["Satu Mare County, Romania"] = {},
	["Sibiu County, Romania"] = {},
	["Suceava County, Romania"] = {},
	["Sălaj County, Romania"] = {},
	["Teleorman County, Romania"] = {},
	["Timiș County, Romania"] = {},
	["Tulcea County, Romania"] = {},
	["Vaslui County, Romania"] = {},
	["Vrancea 
County, Romania"] = {}, ["Vâlcea County, Romania"] = {}, } -- counties of Romania export.romania_group = { key_to_placename = make_key_to_placename(", Romania$", " County$"), placename_to_key = make_placename_to_key(", Romania", " County"), default_container = "Romania", default_placetype = "county", default_divs = "communes", data = export.romania_counties, } local function make_russia_federal_subject_spec(spectype, use_the, wp) return { placetype = spectype, the = not not use_the, bare_category_parent_type = {"federal subjects", spectype .. "s"}, wp = wp, } end local russia_autonomous_okrug_no_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}} local russia_autonomous_okrug_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}, the = true} local russia_krai = make_russia_federal_subject_spec("krai") local russia_oblast = make_russia_federal_subject_spec("oblast") local russia_republic_the = make_russia_federal_subject_spec("republic", "use the") local russia_republic_no_the = make_russia_federal_subject_spec("republic") export.russia_federal_subjects = { -- autonomous oblasts ["Jewish Autonomous Oblast, Russia"] = {the = true, placetype = {"autonomous oblast", "oblast"}, bare_category_parent_type = {"federal subjects", "autonomous oblasts"}}, -- autonomous okrugs ["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"}, ["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous 
Okrug, Russia"}, ["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"}, -- krais ["Altai Krai, Russia"] = russia_krai, ["Kamchatka Krai, Russia"] = russia_krai, ["Khabarovsk Krai, Russia"] = russia_krai, ["Krasnodar Krai, Russia"] = russia_krai, ["Krasnoyarsk Krai, Russia"] = russia_krai, ["Perm Krai, Russia"] = russia_krai, ["Primorsky Krai, Russia"] = russia_krai, ["Stavropol Krai, Russia"] = russia_krai, ["Zabaykalsky Krai, Russia"] = russia_krai, -- oblasts ["Amur Oblast, Russia"] = russia_oblast, ["Arkhangelsk Oblast, Russia"] = russia_oblast, ["Astrakhan Oblast, Russia"] = russia_oblast, ["Belgorod Oblast, Russia"] = russia_oblast, ["Bryansk Oblast, Russia"] = russia_oblast, ["Chelyabinsk Oblast, Russia"] = russia_oblast, ["Irkutsk Oblast, Russia"] = russia_oblast, ["Ivanovo Oblast, Russia"] = russia_oblast, ["Kaliningrad Oblast, Russia"] = russia_oblast, ["Kaluga Oblast, Russia"] = russia_oblast, ["Kemerovo Oblast, Russia"] = russia_oblast, ["Kirov Oblast, Russia"] = russia_oblast, ["Kostroma Oblast, Russia"] = russia_oblast, ["Kurgan Oblast, Russia"] = russia_oblast, ["Kursk Oblast, Russia"] = russia_oblast, ["Leningrad Oblast, Russia"] = russia_oblast, ["Lipetsk Oblast, Russia"] = russia_oblast, ["Magadan Oblast, Russia"] = russia_oblast, ["Moscow Oblast, Russia"] = russia_oblast, ["Murmansk Oblast, Russia"] = russia_oblast, ["Nizhny Novgorod Oblast, Russia"] = russia_oblast, ["Novgorod Oblast, Russia"] = russia_oblast, ["Novosibirsk Oblast, Russia"] = russia_oblast, ["Omsk Oblast, Russia"] = russia_oblast, ["Orenburg Oblast, Russia"] = russia_oblast, ["Oryol Oblast, Russia"] = russia_oblast, ["Penza Oblast, Russia"] = russia_oblast, ["Pskov Oblast, Russia"] = russia_oblast, ["Rostov Oblast, Russia"] = russia_oblast, ["Ryazan Oblast, Russia"] = russia_oblast, ["Sakhalin Oblast, Russia"] = russia_oblast, ["Samara Oblast, Russia"] = russia_oblast, ["Saratov 
Oblast, Russia"] = russia_oblast, ["Smolensk Oblast, Russia"] = russia_oblast, ["Sverdlovsk Oblast, Russia"] = russia_oblast, ["Tambov Oblast, Russia"] = russia_oblast, ["Tomsk Oblast, Russia"] = russia_oblast, ["Tula Oblast, Russia"] = russia_oblast, ["Tver Oblast, Russia"] = russia_oblast, ["Tyumen Oblast, Russia"] = russia_oblast, ["Ulyanovsk Oblast, Russia"] = russia_oblast, ["Vladimir Oblast, Russia"] = russia_oblast, ["Volgograd Oblast, Russia"] = russia_oblast, ["Vologda Oblast, Russia"] = russia_oblast, ["Voronezh Oblast, Russia"] = russia_oblast, ["Yaroslavl Oblast, Russia"] = russia_oblast, -- republics -- -- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where -- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by -- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence -- of "the". ["Adygea, Russia"] = russia_republic_no_the, ["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true}, ["Bashkortostan, Russia"] = russia_republic_no_the, ["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true}, ["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"}, ["Buryatia, Russia"] = russia_republic_no_the, ["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true}, ["Dagestan, Russia"] = russia_republic_no_the, ["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true}, ["Ingushetia, Russia"] = russia_republic_no_the, ["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true}, ["Kalmykia, Russia"] = russia_republic_no_the, ["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true}, ["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"), ["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true}, 
["Khakassia, Russia"] = russia_republic_no_the, ["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true}, ["Mordovia, Russia"] = russia_republic_no_the, ["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true}, ["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash ["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true}, ["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Tatarstan, Russia"] = russia_republic_no_the, ["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true}, ["Altai Republic, Russia"] = russia_republic_the, ["Chechnya, Russia"] = russia_republic_no_the, ["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true}, ["Chuvashia, Russia"] = russia_republic_no_the, ["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true}, ["Kabardino-Balkaria, Russia"] = russia_republic_no_the, ["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true}, ["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true}, ["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = "Kabardino-Balkarian Republic, Russia", the = true}, ["Karachay-Cherkessia, Russia"] = russia_republic_no_the, ["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia"}, ["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"), ["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true}, ["Mari El, Russia"] = russia_republic_no_the, ["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true}, ["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"), ["Sakha 
Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
	["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
	["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
	["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia", the = true},
	["Tuva, Russia"] = russia_republic_no_the,
	["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
	["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
	["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
	["Udmurtia, Russia"] = russia_republic_no_the,
	["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
	-- Not included due to being unrecognized and only partly controlled:
	-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
	-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
	-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
	-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
	-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
	-- There are also federal cities (not included because they're cities):
	-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}

-- Return the full placename and the elliptical placename for a key. The elliptical form drops a
-- trailing "Krai" or "Oblast"; the Jewish Autonomous Oblast keeps its full name in both positions.
local function russia_key_to_placename(key)
	key = key:gsub(",.*", "")
	local full_placename = key
	if key == "Jewish Autonomous Oblast" then
		return full_placename, full_placename
	end
	local elliptical_placename
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		elliptical_placename = key:match("^(.*) " .. suffix .. "$")
		if elliptical_placename then
			return full_placename, elliptical_placename
		end
	end
	return full_placename, full_placename
end

local function russia_placename_to_key(placename)
	local key = placename ..
", Russia"
	if export.russia_federal_subjects[key] then
		return key
	end
	-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		local suffixed_key = placename .. " " .. suffix .. ", Russia"
		if export.russia_federal_subjects[suffixed_key] then
			return suffixed_key
		end
	end
	return placename .. ", Russia"
end

local function construct_russia_federal_subject_keydesc(group, key, spec)
	local placename = key:gsub(",.*", "")
	local linked_placename = export.construct_linked_placename(spec, placename)
	local placetype = spec.placetype
	if type(placetype) == "table" then
		placetype = placetype[1]
	end
	if placetype == "oblast" then
		-- Hack: Oblasts generally don't have entries under "Foo Oblast"
		-- but just under "Foo", so fix the linked key appropriately;
		-- doesn't apply to the Jewish Autonomous Oblast
		linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
	end
	return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end

-- federal subjects of Russia
export.russia_group = {
	key_to_placename = russia_key_to_placename,
	placename_to_key = russia_placename_to_key,
	default_container = "Russia",
	default_keydesc = construct_russia_federal_subject_keydesc,
	default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
	data = export.russia_federal_subjects,
}

export.saudi_arabia_provinces = {
	["Riyadh Province, Saudi Arabia"] = {},
	["Mecca Province, Saudi Arabia"] = {},
	-- Name is too generic to assume it's in Saudi Arabia if not specified.
["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"}, ["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"}, ["Aseer Province, Saudi Arabia"] = {wp = "Asir"}, ["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true}, ["Jazan Province, Saudi Arabia"] = {}, ["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"}, ["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true}, ["Tabuk Province, Saudi Arabia"] = {}, ["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"}, ["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true}, ["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"}, ["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true}, ["Najran Province, Saudi Arabia"] = {}, ["Northern Borders Province, Saudi Arabia"] = {}, ["Al-Bahah Province, Saudi Arabia"] = {}, } -- provinces of Saudi Arabia export.saudi_arabia_group = { key_to_placename = make_key_to_placename(", Arab Saudi$", " Province$"), placename_to_key = make_placename_to_key(", Arab Saudi", " Province"), default_container = "Arab Saudi", default_placetype = "wilayah", data = export.saudi_arabia_provinces, } export.south_africa_provinces = { ["Eastern Cape, South Africa"] = {the = true}, ["Free State, South Africa"] = {the = true, wp = "%l (province)"}, ["Gauteng, South Africa"] = {}, ["KwaZulu-Natal, South Africa"] = {}, ["Limpopo, South Africa"] = {}, ["Mpumalanga, South Africa"] = {}, -- per Wikipedia and other sources, `North West` doesn't normally have `the` before it ["North West, South Africa"] = {wp = "%l (South African province)"}, ["Northern Cape, South Africa"] = {the = true}, ["Western Cape, South Africa"] = {the = true}, } -- provinces of South Africa export.south_africa_group = { 
default_container = "South Africa", default_placetype = "province", default_divs = "municipalities", data = export.south_africa_provinces, } export.south_korea_provinces = { ["North Chungcheong Province, South Korea"] = {}, ["South Chungcheong Province, South Korea"] = {}, ["Gangwon Province, South Korea"] = {wp = "%l, %c"}, ["Gyeonggi Province, South Korea"] = {}, ["North Gyeongsang Province, South Korea"] = {}, ["South Gyeongsang Province, South Korea"] = {}, ["North Jeolla Province, South Korea"] = {}, ["South Jeolla Province, South Korea"] = {}, ["Jeju Province, South Korea"] = {}, } -- provinces of South Korea export.south_korea_group = { key_to_placename = make_key_to_placename(", South Korea$", " Province$"), placename_to_key = make_placename_to_key(", South Korea", " Province"), default_container = "South Korea", default_placetype = "province", data = export.south_korea_provinces, } export.spain_autonomous_communities = { ["Andalusia, Spain"] = {}, ["Aragon, Spain"] = {}, ["Asturias, Spain"] = {}, ["Balearic Islands, Spain"] = {the = true}, ["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"}, ["Canary Islands, Spain"] = {the = true}, ["Cantabria, Spain"] = {}, ["Castile and León, Spain"] = {}, ["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash ["Catalonia, Spain"] = {}, ["Community of Madrid, Spain"] = {the = true}, ["Extremadura, Spain"] = {}, ["Galicia, Spain"] = {wp = "%l (Spain)"}, ["La Rioja, Spain"] = {}, ["Murcia, Spain"] = {wp = "Region of %l"}, ["Navarre, Spain"] = {}, ["Valencia, Spain"] = {wp = "Valencian Community"}, ["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true}, } -- autonomous communities of Spain export.spain_group = { default_container = "Spain", default_placetype = "autonomous community", default_divs = {"municipalities", "comarcas"}, data = export.spain_autonomous_communities, } export.taiwan_counties = { ["Changhua County, Taiwan"] = {}, ["Chiayi County, 
Taiwan"] = {}, ["Hsinchu County, Taiwan"] = {}, ["Hualien County, Taiwan"] = {}, ["Kinmen County, Taiwan"] = {wp = "Kinmen"}, ["Lienchiang County, Taiwan"] = {wp = "Matsu Islands"}, ["Miaoli County, Taiwan"] = {}, ["Nantou County, Taiwan"] = {}, ["Penghu County, Taiwan"] = {wp = "Penghu"}, ["Pingtung County, Taiwan"] = {}, ["Taitung County, Taiwan"] = {}, ["Yilan County, Taiwan"] = {wp = "%l, %c"}, ["Yunlin County, Taiwan"] = {}, } -- counties of Taiwan export.taiwan_group = { key_to_placename = make_key_to_placename(", Taiwan$", " County$"), placename_to_key = make_placename_to_key(", Taiwan", " County"), default_container = "Taiwan", default_placetype = "county", default_divs = {"districts", "townships"}, data = export.taiwan_counties, } export.thailand_provinces = { -- Bangkok (special administrative area) ["Amnat Charoen Province, Thailand"] = {}, ["Ang Thong Province, Thailand"] = {}, ["Bueng Kan Province, Thailand"] = {}, ["Buriram Province, Thailand"] = {}, ["Chachoengsao Province, Thailand"] = {}, ["Chai Nat Province, Thailand"] = {}, ["Chaiyaphum Province, Thailand"] = {}, ["Chanthaburi Province, Thailand"] = {}, ["Chiang Mai Province, Thailand"] = {}, ["Chiang Rai Province, Thailand"] = {}, ["Chonburi Province, Thailand"] = {}, ["Chumphon Province, Thailand"] = {}, ["Kalasin Province, Thailand"] = {}, ["Kamphaeng Phet Province, Thailand"] = {}, ["Kanchanaburi Province, Thailand"] = {}, ["Khon Kaen Province, Thailand"] = {}, ["Krabi Province, Thailand"] = {}, ["Lampang Province, Thailand"] = {}, ["Lamphun Province, Thailand"] = {}, ["Loei Province, Thailand"] = {}, ["Lopburi Province, Thailand"] = {}, ["Mae Hong Son Province, Thailand"] = {}, ["Maha Sarakham Province, Thailand"] = {}, ["Mukdahan Province, Thailand"] = {}, ["Nakhon Nayok Province, Thailand"] = {}, ["Nakhon Pathom Province, Thailand"] = {}, ["Nakhon Phanom Province, Thailand"] = {}, ["Nakhon Ratchasima Province, Thailand"] = {}, ["Nakhon Sawan Province, Thailand"] = {}, ["Nakhon Si Thammarat
Province, Thailand"] = {}, ["Nan Province, Thailand"] = {}, ["Narathiwat Province, Thailand"] = {}, ["Nong Bua Lamphu Province, Thailand"] = {}, ["Nong Khai Province, Thailand"] = {}, ["Nonthaburi Province, Thailand"] = {}, ["Pathum Thani Province, Thailand"] = {}, ["Pattani Province, Thailand"] = {}, ["Phang Nga Province, Thailand"] = {}, ["Phatthalung Province, Thailand"] = {}, ["Phayao Province, Thailand"] = {}, ["Phetchabun Province, Thailand"] = {}, ["Phetchaburi Province, Thailand"] = {}, ["Phichit Province, Thailand"] = {}, ["Phitsanulok Province, Thailand"] = {}, ["Phra Nakhon Si Ayutthaya Province, Thailand"] = {}, ["Phrae Province, Thailand"] = {}, ["Phuket Province, Thailand"] = {}, ["Prachinburi Province, Thailand"] = {}, ["Prachuap Khiri Khan Province, Thailand"] = {}, ["Ranong Province, Thailand"] = {}, ["Ratchaburi Province, Thailand"] = {}, ["Rayong Province, Thailand"] = {}, ["Roi Et Province, Thailand"] = {}, ["Sa Kaeo Province, Thailand"] = {}, ["Sakon Nakhon Province, Thailand"] = {}, ["Samut Prakan Province, Thailand"] = {}, ["Samut Sakhon Province, Thailand"] = {}, ["Samut Songkhram Province, Thailand"] = {}, ["Saraburi Province, Thailand"] = {}, ["Satun Province, Thailand"] = {}, ["Sing Buri Province, Thailand"] = {}, ["Sisaket Province, Thailand"] = {}, ["Songkhla Province, Thailand"] = {}, ["Sukhothai Province, Thailand"] = {}, ["Suphan Buri Province, Thailand"] = {}, ["Surat Thani Province, Thailand"] = {}, ["Surin Province, Thailand"] = {}, ["Tak Province, Thailand"] = {}, ["Trang Province, Thailand"] = {}, ["Trat Province, Thailand"] = {}, ["Ubon Ratchathani Province, Thailand"] = {}, ["Udon Thani Province, Thailand"] = {}, ["Uthai Thani Province, Thailand"] = {}, ["Uttaradit Province, Thailand"] = {}, ["Yala Province, Thailand"] = {}, ["Yasothon Province, Thailand"] = {}, } -- provinces of Thailand export.thailand_group = { key_to_placename = make_key_to_placename(", Thailand$", "Wilayah "), placename_to_key = make_placename_to_key(", 
Thailand", "Wilayah "), default_container = "Thailand", default_placetype = "wilayah", default_divs = "daerah", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "Wilayah %e", data = export.thailand_provinces, } export.turkey_provinces = { ["Adana Province, Turkey"] = {}, -- code 01 ["Adıyaman Province, Turkey"] = {}, -- code 02 ["Afyonkarahisar Province, Turkey"] = {}, -- code 03 ["Ağrı Province, Turkey"] = {}, -- code 04 ["Amasya Province, Turkey"] = {}, -- code 05 ["Ankara Province, Turkey"] = {}, -- code 06 ["Antalya Province, Turkey"] = {}, -- code 07 ["Artvin Province, Turkey"] = {}, -- code 08 ["Aydın Province, Turkey"] = {}, -- code 09 ["Balıkesir Province, Turkey"] = {}, -- code 10 ["Bilecik Province, Turkey"] = {}, -- code 11 ["Bingöl Province, Turkey"] = {}, -- code 12 ["Bitlis Province, Turkey"] = {}, -- code 13 ["Bolu Province, Turkey"] = {}, -- code 14 ["Burdur Province, Turkey"] = {}, -- code 15 ["Bursa Province, Turkey"] = {}, -- code 16 ["Çanakkale Province, Turkey"] = {}, -- code 17 ["Çankırı Province, Turkey"] = {}, -- code 18 ["Çorum Province, Turkey"] = {}, -- code 19 ["Denizli Province, Turkey"] = {}, -- code 20 ["Diyarbakır Province, Turkey"] = {}, -- code 21 ["Edirne Province, Turkey"] = {}, -- code 22 ["Elazığ Province, Turkey"] = {}, -- code 23 ["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true}, ["Erzincan Province, Turkey"] = {}, -- code 24 ["Erzurum Province, Turkey"] = {}, -- code 25 ["Eskişehir Province, Turkey"] = {}, -- code 26 ["Gaziantep Province, Turkey"] = {}, -- code 27 ["Giresun Province, Turkey"] = {}, -- code 28 ["Gümüşhane Province, Turkey"] = {}, -- code 29 ["Hakkâri Province, Turkey"] = {}, -- code 30 ["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true}, ["Hatay Province, Turkey"] = {}, -- code 31 ["Isparta Province, Turkey"] = {}, -- code 32 ["Mersin Province, Turkey"] = {}, -- code 33 -- 
["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself ["İzmir Province, Turkey"] = {}, -- code 35 ["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true}, ["Kars Province, Turkey"] = {}, -- code 36 ["Kastamonu Province, Turkey"] = {}, -- code 37 ["Kayseri Province, Turkey"] = {}, -- code 38 ["Kırklareli Province, Turkey"] = {}, -- code 39 ["Kırşehir Province, Turkey"] = {}, -- code 40 ["Kocaeli Province, Turkey"] = {}, -- code 41 ["Konya Province, Turkey"] = {}, -- code 42 ["Kütahya Province, Turkey"] = {}, -- code 43 ["Malatya Province, Turkey"] = {}, -- code 44 ["Manisa Province, Turkey"] = {}, -- code 45 ["Kahramanmaraş Province, Turkey"] = {}, -- code 46 ["Mardin Province, Turkey"] = {}, -- code 47 ["Muğla Province, Turkey"] = {}, -- code 48 ["Muş Province, Turkey"] = {}, -- code 49 ["Nevşehir Province, Turkey"] = {}, -- code 50 ["Niğde Province, Turkey"] = {}, -- code 51 ["Ordu Province, Turkey"] = {}, -- code 52 ["Rize Province, Turkey"] = {}, -- code 53 ["Sakarya Province, Turkey"] = {}, -- code 54 ["Samsun Province, Turkey"] = {}, -- code 55 ["Siirt Province, Turkey"] = {}, -- code 56 ["Sinop Province, Turkey"] = {}, -- code 57 ["Sivas Province, Turkey"] = {}, -- code 58 ["Tekirdağ Province, Turkey"] = {}, -- code 59 ["Tokat Province, Turkey"] = {}, -- code 60 ["Trabzon Province, Turkey"] = {}, -- code 61 ["Tunceli Province, Turkey"] = {}, -- code 62 ["Şanlıurfa Province, Turkey"] = {}, -- code 63 ["Uşak Province, Turkey"] = {}, -- code 64 ["Van Province, Turkey"] = {}, -- code 65 ["Yozgat Province, Turkey"] = {}, -- code 66 ["Zonguldak Province, Turkey"] = {}, -- code 67 ["Aksaray Province, Turkey"] = {}, -- code 68 ["Bayburt Province, Turkey"] = {}, -- code 69 ["Karaman Province, Turkey"] = {}, -- code 70 ["Kırıkkale Province, Turkey"] = {}, -- code 71 ["Batman Province, Turkey"] = {}, -- code 72 ["Şırnak Province, Turkey"] = {}, -- code 73 ["Bartın Province, Turkey"] = {}, -- code 74 ["Ardahan 
Province, Turkey"] = {}, -- code 75 ["Iğdır Province, Turkey"] = {}, -- code 76 ["Yalova Province, Turkey"] = {}, -- code 77 ["Karabük Province, Turkey"] = {}, -- code 78 ["Kilis Province, Turkey"] = {}, -- code 79 ["Osmaniye Province, Turkey"] = {}, -- code 80 ["Düzce Province, Turkey"] = {}, -- code 81 } -- provinces of Turkey export.turkey_group = { key_to_placename = make_key_to_placename(", Turkey$", " Province$"), placename_to_key = make_placename_to_key(", Turkey", " Province"), default_container = "Turkey", default_placetype = "province", default_divs = "districts", data = export.turkey_provinces, } export.ukraine_oblasts = { ["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA ["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB ["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE -- apparently will be renamed to 'Dnipro Oblast' ["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE ["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH ["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT ["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX ["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT'' ["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX -- apparently will be renamed to 'Kropyvnytskyi Oblast' ["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA ["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI ["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true}, ["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB ["Lviv Oblast, Ukraine"] = {}, -- 
capital [[Lviv]], license plate prefix BC, HC ["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE ["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH ["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true}, ["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI ["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK ["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM ["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO ["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB ["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC ["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO ["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP ["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true}, ["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM } -- oblasts of Ukraine export.ukraine_group = { key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"), placename_to_key = make_placename_to_key(", Ukraine", " Oblast"), default_container = "Ukraine", default_placetype = "oblast", default_divs = {"raions", "hromadas"}, data = export.ukraine_oblasts, } export.united_kingdom_constituent_countries = { ["England"] = {divs = { "counties", "districts", {type = "local government districts", cat_as = "districts"}, { type = "local government districts with borough status", cat_as = {"districts", "boroughs"}, }, {type = "boroughs", cat_as = {"districts", "boroughs"}}, {type = "civil parishes", container_parent_type = false}, }}, ["Northern Ireland"] = { placetype = {"constituent country", "province", "negara"}, divs = {"counties", "districts"}, }, 
["Scotland"] = {divs = { {type = "council areas", container_parent_type = false}, "districts", }}, ["Wales"] = {divs = { "counties", {type = "county boroughs", container_parent_type = false}, {type = "communities", container_parent_type = false}, {type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}}, }}, } -- constituent countries and provinces of the United Kingdom export.united_kingdom_group = { placename_to_key = false, default_container = "United Kingdom", default_placetype = {"constituent country", "negara"}, addl_divs = { "traditional counties", {type = "historical counties", cat_as = "traditional counties"}, }, -- Don't create categories like 'Category:en:Towns in the United Kingdom' -- or 'Category:en:Places in the United Kingdom'. default_no_container_cat = true, data = export.united_kingdom_constituent_countries, } export.england_counties = { -- NOTE: We used to have various other "no longer" counties commented out, which seems to refer to counties that -- existed officially at some point between 1889 and 1974, which I have removed. I have only kept the three -- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those -- still considered "historic counties" per [[w:Historic counties of England]]. 
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Bedfordshire, England"] = {}, ["Berkshire, England"] = {}, -- ["Brighton and Hove, England"] = {}, -- city -- ["Bristol, England"] = {}, -- city ["Buckinghamshire, England"] = {}, ["Cambridgeshire, England"] = {}, ["Cheshire, England"] = {}, -- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Cornwall, England"] = {}, -- ["Cumberland, England"] = {}, -- no longer (historic county) ["Cumbria, England"] = {}, ["Derbyshire, England"] = {}, ["Devon, England"] = {}, ["Dorset, England"] = {}, ["County Durham, England"] = {}, ["East Sussex, England"] = {}, ["Essex, England"] = {}, ["Gloucestershire, England"] = {}, ["Greater London, England"] = {}, ["Greater Manchester, England"] = {}, ["Hampshire, England"] = {}, ["Herefordshire, England"] = {}, ["Hertfordshire, England"] = {}, -- ["Humberside, England"] = {}, -- no longer (1974 to 1996) -- ["Huntingdonshire, England"] = {}, -- no longer (historic county) ["Isle of Wight, England"] = {the = true}, ["Kent, England"] = {}, ["Lancashire, England"] = {}, ["Leicestershire, England"] = {}, ["Lincolnshire, England"] = {}, ["Merseyside, England"] = {}, -- ["Middlesex, England"] = {}, -- no longer (historic county) ["Norfolk, England"] = {}, ["Northamptonshire, England"] = {}, ["Northumberland, England"] = {}, ["North Yorkshire, England"] = {}, ["Nottinghamshire, England"] = {}, ["Oxfordshire, England"] = {}, ["Rutland, England"] = {}, ["Shropshire, England"] = {}, ["Somerset, England"] = {}, ["South Humberside, England"] = {}, ["South Yorkshire, England"] = {}, ["Staffordshire, England"] = {}, ["Suffolk, England"] = {}, ["Surrey, England"] = {}, -- ["Sussex, England"] = {}, -- no longer (historic county) ["Tyne and Wear, England"] = {}, ["Warwickshire, England"] = {}, ["West Midlands, England"] = {the = true, wp = "%l (county)"}, -- ["Westmorland, England"] = {}, -- no longer (historic county) ["West Sussex, England"] = {}, 
["West Yorkshire, England"] = {}, ["Wiltshire, England"] = {}, ["Worcestershire, England"] = {}, -- ["Yorkshire, England"] = {}, -- no longer (historic county) ["East Riding of Yorkshire, England"] = {the = true}, } -- counties of England export.england_group = { default_container = {key = "England", placetype = "constituent country"}, default_placetype = "county", default_divs = { "districts", {type = "local government districts", cat_as = "districts"}, { type = "local government districts with borough status", cat_as = {"districts", "boroughs"}, }, {type = "boroughs", cat_as = {"districts", "boroughs"}}, "civil parishes", }, data = export.england_counties, } export.northern_ireland_counties = { ["County Antrim, Northern Ireland"] = {}, ["County Armagh, Northern Ireland"] = {}, ["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"}, ["County Down, Northern Ireland"] = {}, ["County Fermanagh, Northern Ireland"] = {}, ["County Londonderry, Northern Ireland"] = {}, ["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"}, ["County Tyrone, Northern Ireland"] = {}, } -- counties of Northern Ireland export.northern_ireland_group = { key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"), placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"), default_container = {key = "Northern Ireland", placetype = "constituent country"}, default_placetype = "county", data = export.northern_ireland_counties, } export.scotland_council_areas = { ["Aberdeenshire, Scotland"] = {}, ["Angus, Scotland"] = {wp = "%l, %c"}, ["Argyll and Bute, Scotland"] = {}, ["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"}, ["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"}, ["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"}, ["City of Dundee, Scotland"] = {the = true, wp = "Dundee"}, ["Dundee"] = {alias_of = "City of Dundee, Scotland"}, ["Dundee City"] = {alias_of = "City of Dundee, 
Scotland"}, ["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"}, ["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"}, ["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"}, ["Glasgow"] = {alias_of = "City of Glasgow, Scotland"}, ["Clackmannanshire, Scotland"] = {}, ["Dumfries and Galloway, Scotland"] = {}, ["East Ayrshire, Scotland"] = {}, ["East Dunbartonshire, Scotland"] = {}, ["East Lothian, Scotland"] = {}, ["East Renfrewshire, Scotland"] = {}, ["Falkirk, Scotland"] = {wp = "%l council area"}, ["Fife, Scotland"] = {}, ["Highland, Scotland"] = {wp = "%l council area"}, ["Inverclyde, Scotland"] = {}, ["Midlothian, Scotland"] = {}, ["Moray, Scotland"] = {}, ["North Ayrshire, Scotland"] = {}, ["North Lanarkshire, Scotland"] = {}, ["Orkney Islands, Scotland"] = {the = true}, ["Perth and Kinross, Scotland"] = {}, ["Renfrewshire, Scotland"] = {}, ["Scottish Borders, Scotland"] = {the = true}, ["Shetland Islands, Scotland"] = {the = true}, ["South Ayrshire, Scotland"] = {}, ["South Lanarkshire, Scotland"] = {}, ["Stirling, Scotland"] = {wp = "%l council area"}, ["West Dunbartonshire, Scotland"] = {}, ["West Lothian, Scotland"] = {}, ["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"}, ["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"}, } -- council areas of Scotland export.scotland_group = { default_container = {key = "Scotland", placetype = "constituent country"}, default_placetype = "council area", data = export.scotland_council_areas, } export.wales_principal_areas = { ["Blaenau Gwent, Wales"] = {}, ["Bridgend, Wales"] = {wp = "%l County Borough"}, ["Caerphilly, Wales"] = {wp = "%l County Borough"}, -- ["Cardiff, Wales"] = {placetype = "city"}, ["Carmarthenshire, Wales"] = {placetype = "county"}, ["Ceredigion, Wales"] = {placetype = "county"}, ["Conwy, Wales"] = {wp = "%l County Borough"}, ["Denbighshire, Wales"] = {placetype = "county"}, ["Flintshire, Wales"] = {placetype = "county"}, ["Gwynedd, 
Wales"] = {placetype = "county"}, ["Isle of Anglesey, Wales"] = {the = true, placetype = "county"}, ["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the" ["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"}, ["Monmouthshire, Wales"] = {placetype = "county"}, ["Neath Port Talbot, Wales"] = {}, -- ["Newport, Wales"] = {placetype = "city", wp = "%l, %c"}, ["Pembrokeshire, Wales"] = {placetype = "county"}, ["Powys, Wales"] = {placetype = "county"}, ["Rhondda Cynon Taf, Wales"] = {}, -- ["Swansea, Wales"] = {placetype = "city"}, ["Torfaen, Wales"] = {}, ["Vale of Glamorgan, Wales"] = {the = true}, ["Wrexham, Wales"] = {wp = "%l County Borough"}, } -- principal areas (cities, counties and county boroughs) of Wales export.wales_group = { default_container = {key = "Wales", placetype = "constituent country"}, default_placetype = "county borough", data = export.wales_principal_areas, } export.united_states_states = { ["Alabama, USA"] = {}, ["Alaska, USA"] = {divs = { {type = "boroughs", container_parent_type = "counties"}, {type = "borough seats", container_parent_type = "county seats"}, }}, ["Arizona, USA"] = {}, ["Arkansas, USA"] = {}, ["California, USA"] = {}, ["Colorado, USA"] = {divs = {"counties", "county seats", "municipalities"}}, ["Connecticut, USA"] = {divs = {"counties", "county seats", "municipalities"}}, ["Delaware, USA"] = {}, ["Florida, USA"] = {}, ["Georgia, USA"] = {wp = "%l (U.S. 
state)"}, ["Hawaii, USA"] = {addl_parents = {"Polynesia"}}, ["Idaho, USA"] = {}, ["Illinois, USA"] = {}, ["Indiana, USA"] = {}, ["Iowa, USA"] = {}, ["Kansas, USA"] = {}, ["Kentucky, USA"] = {}, ["Louisiana, USA"] = {divs = { {type = "parishes", container_parent_type = "counties"}, {type = "parish seats", container_parent_type = "county seats"}, }}, ["Maine, USA"] = {}, ["Maryland, USA"] = {}, ["Massachusetts, USA"] = {}, ["Michigan, USA"] = {}, ["Minnesota, USA"] = {}, ["Mississippi, USA"] = {}, ["Missouri, USA"] = {}, ["Montana, USA"] = {}, ["Nebraska, USA"] = {}, ["Nevada, USA"] = {}, ["New Hampshire, USA"] = {}, ["New Jersey, USA"] = {divs = { "counties", "county seats", {type = "boroughs", prep = "di"}, }}, ["New Mexico, USA"] = {}, ["New York, USA"] = {wp = "%l (state)"}, ["North Carolina, USA"] = {}, ["North Dakota, USA"] = {}, ["Ohio, USA"] = {}, ["Oklahoma, USA"] = {}, ["Oregon, USA"] = {}, ["Pennsylvania, USA"] = {divs = { "counties", "county seats", {type = "boroughs", prep = "di"}, }}, ["Rhode Island, USA"] = {}, ["South Carolina, USA"] = {}, ["South Dakota, USA"] = {}, ["Tennessee, USA"] = {}, ["Texas, USA"] = {}, ["Utah, USA"] = {}, ["Vermont, USA"] = {}, ["Virginia, USA"] = {}, ["Washington, USA"] = {wp = "%l (state)"}, ["West Virginia, USA"] = {}, ["Wisconsin, USA"] = {}, ["Wyoming, USA"] = {}, } -- states of the United States export.united_states_group = { placename_to_key = make_placename_to_key(", USA"), default_container = "Amerika Syarikat", default_placetype = "negeri", default_divs = {"counties", "county seats"}, addl_divs = { {type = "census-designated places", prep = "di"}, {type = "unincorporated communities", prep = "di"}, }, data = export.united_states_states, } export.vietnam_provinces = { -- [[Northeast (Vietnam)|Northeast]] region ["Bắc Giang Province, Vietnam"] = {}, -- capital [[Bắc Giang]] ["Bắc Kạn Province, Vietnam"] = {}, -- capital [[Bắc Kạn]] ["Cao Bằng Province, Vietnam"] = {}, -- capital [[Cao Bằng]] ["Hà Giang Province, 
Vietnam"] = {}, -- capital [[Hà Giang]] ["Lạng Sơn Province, Vietnam"] = {}, -- capital [[Lạng Sơn]] ["Phú Thọ Province, Vietnam"] = {}, -- capital [[Việt Trì]] ["Quảng Ninh Province, Vietnam"] = {}, -- capital [[Hạ Long]] ["Thái Nguyên Province, Vietnam"] = {}, -- capital [[Thái Nguyên]] ["Tuyên Quang Province, Vietnam"] = {}, -- capital [[Tuyên Quang]] -- [[Northwest (Vietnam)|Northwest]] region ["Lào Cai Province, Vietnam"] = {}, -- capital [[Lào Cai]] ["Yên Bái Province, Vietnam"] = {}, -- capital [[Yên Bái]] ["Điện Biên Province, Vietnam"] = {}, -- capital [[Điện Biên Phủ]] ["Hoà Bình Province, Vietnam"] = {}, -- capital [[Hoà Bình City|Hoà Bình]] ["Hòa Bình Province, Vietnam"] = {alias_of = "Hoà Bình Province, Vietnam", display = true}, ["Lai Châu Province, Vietnam"] = {}, -- capital [[Lai Châu]] ["Sơn La Province, Vietnam"] = {}, -- capital [[Sơn La]] -- [[Red River Delta]] region ["Bắc Ninh Province, Vietnam"] = {}, -- capital [[Bắc Ninh]] ["Hà Nam Province, Vietnam"] = {}, -- capital [[Phủ Lý]] ["Hải Dương Province, Vietnam"] = {}, -- capital [[Hải Dương]] ["Hưng Yên Province, Vietnam"] = {}, -- capital [[Hưng Yên]] ["Nam Định Province, Vietnam"] = {}, -- capital [[Nam Định]] ["Ninh Bình Province, Vietnam"] = {}, -- capital [[Ninh Bình|Hoa Lư]] ["Thái Bình Province, Vietnam"] = {}, -- capital [[Thái Bình]] ["Vĩnh Phúc Province, Vietnam"] = {}, -- capital [[Vĩnh Yên]] -- ["Hanoi"] = {placetype = {"municipality", "city"}}, -- capital [[Hoàn Kiếm district]] -- ["Haiphong"] = {placetype = {"municipality", "city"}}, -- capital [[Hồng Bàng district]] -- [[North Central Coast]] region ["Hà Tĩnh Province, Vietnam"] = {}, -- capital [[Hà Tĩnh]] ["Nghệ An Province, Vietnam"] = {}, -- capital [[Vinh]] ["Quảng Bình Province, Vietnam"] = {}, -- capital [[Đồng Hới]] ["Quảng Trị Province, Vietnam"] = {}, -- capital [[Đông Hà]] ["Thanh Hoá Province, Vietnam"] = {}, -- capital [[Thanh Hoá]] ["Thanh Hóa Province, Vietnam"] = {alias_of = "Thanh Hoá Province, Vietnam", 
display = true}, -- ["Hue"] = {placetype = {"municipality", "city"}, wp = "Huế"}, -- capital [[Thuận Hoá district]] -- [[Central Highlands (Vietnam)|Central Highlands]] region ["Đắk Lắk Province, Vietnam"] = {}, -- capital [[Buôn Ma Thuột]] ["Đăk Nông Province, Vietnam"] = {}, -- capital [[Gia Nghĩa]] ["Gia Lai Province, Vietnam"] = {}, -- capital [[Pleiku]] ["Kon Tum Province, Vietnam"] = {}, -- capital [[Kon Tum]] ["Lâm Đồng Province, Vietnam"] = {}, -- capital [[Đà Lạt]] -- [[South Central Coast]] region ["Bình Định Province, Vietnam"] = {}, -- capital [[Quy Nhon]] ["Bình Thuận Province, Vietnam"] = {}, -- capital [[Phan Thiết]] ["Khánh Hoà Province, Vietnam"] = {}, -- capital [[Nha Trang]] ["Khánh Hòa Province, Vietnam"] = {alias_of = "Khánh Hoà Province, Vietnam", display = true}, ["Ninh Thuận Province, Vietnam"] = {}, -- capital [[Phan Rang–Tháp Chàm]] ["Phú Yên Province, Vietnam"] = {}, -- capital [[Tuy Hoà]] ["Quảng Nam Province, Vietnam"] = {}, -- capital [[Tam Kỳ]] ["Quảng Ngãi Province, Vietnam"] = {}, -- capital [[Quảng Ngãi]] -- ["Da Nang"] = {placetype = {"municipality", "city"}}, -- capital [[Hải Châu district]] -- [[Southeast (Vietnam)|Southeast]] region ["Bà Rịa–Vũng Tàu Province, Vietnam"] = {}, -- capital [[Bà Rịa]] ["Bình Dương Province, Vietnam"] = {}, -- capital [[Thủ Dầu Một]] ["Bình Phước Province, Vietnam"] = {}, -- capital [[Đồng Xoài]] ["Đồng Nai Province, Vietnam"] = {}, -- capital [[Biên Hoà]] ["Tây Ninh Province, Vietnam"] = {}, -- capital [[Tây Ninh]] -- ["Ho Chi Minh City"] = {placetype = {"municipality", "city"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']] -- [[Mekong Delta]] region ["An Giang Province, Vietnam"] = {}, -- capital [[Long Xuyên]] ["Bạc Liêu Province, Vietnam"] = {}, -- capital [[Bạc Liêu]] ["Bến Tre Province, Vietnam"] = {}, -- capital [[Bến Tre]] ["Cà Mau Province, Vietnam"] = {}, -- capital [[Cà Mau]] ["Đồng Tháp Province, Vietnam"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]] ["Hậu Giang Province, 
Vietnam"] = {}, -- capital [[Vị Thanh]] ["Kiên Giang Province, Vietnam"] = {}, -- capital [[Rạch Giá]] ["Long An Province, Vietnam"] = {}, -- capital [[Tân An]] ["Sóc Trăng Province, Vietnam"] = {}, -- capital [[Sóc Trăng]] ["Tiền Giang Province, Vietnam"] = {}, -- capital [[Mỹ Tho]] ["Trà Vinh Province, Vietnam"] = {}, -- capital [[Trà Vinh]] ["Vĩnh Long Province, Vietnam"] = {}, -- capital [[Vĩnh Long]] -- ["Can Tho"] = {placetype = {"municipality", "city"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]] } -- provinces of Vietnam export.vietnam_group = { key_to_placename = make_key_to_placename(", Vietnam$", " Province$"), placename_to_key = make_placename_to_key(", Vietnam", " Province"), default_container = "Vietnam", default_placetype = "province", -- There may not be enough districts to subcategorize like this. -- default_divs = "districts", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "%e province", data = export.vietnam_provinces, } ----------------------------------------------------------------------------------- -- City data -- ----------------------------------------------------------------------------------- export.australia_cities = { ["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration) ["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate]) ["Canberra"] = {container = {key = "Australian Capital Territory, Australia", placetype = "territory"}}, -- 510,641 (2024 estimate) ["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration) ["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate) ["Newcastle"] = {alias_of = "Newcastle, New South Wales"}, ["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration) ["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration) } export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Australia", "negeri"), default_placetype = "city", data = export.australia_cities, } export.brazil_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos) ["Sao Paulo"] = {alias_of = "São Paulo", display = true}, ["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area) ["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000 ["Recife"] = {container = "Pernambuco"}, -- 4,100,000 ["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area) ["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000 ["Brasilia"] = {alias_of = "Brasília", display = true}, ["Fortaleza"] = {container = "Ceará"}, -- 3,825,000 ["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000 ["Curitiba"] = {container = "Paraná"}, -- 3,375,000 ["Campinas"] = {container = "São Paulo"}, -- 3,250,000 ["Goiânia"] = {container = "Goiás"}, -- 2,525,000 ["Goiania"] = {alias_of = "Goiânia", display = true}, ["Manaus"] = {container = "Amazonas"}, -- 2,275,000 ["Belém"] = {container = "Pará"}, -- 2,200,000 ["Belem"] = {alias_of = "Belém", display = true}, ["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000 ["Vitoria"] = {alias_of = "Vitória", display = true}, ["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000 ["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000 ["Sao Luis"] = {alias_of = "São Luís", display = true}, ["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000 ["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000 ["Florianopolis"] = {alias_of = "Florianópolis", display = true}, ["Maceió"] = {container = "Alagoas"}, -- 1,220,000 ["Maceio"] = {alias_of = "Maceió", display = true}, ["João Pessoa"] = 
{container = "Paraíba", wp = "%l, %c"}, -- 1,210,000 ["Joao Pessoa"] = {alias_of = "João Pessoa", display = true}, ["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000 ["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true}, ["Londrina"] = {container = "Paraná"}, -- 1,050,000 ["Teresina"] = {container = "Piauí"}, -- 1,040,000 } export.brazil_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Brazil", "negeri"), default_placetype = "city", data = export.brazil_cities, } export.canada_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton) ["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area) ["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area) ["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area) ["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area) ["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area) ["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census) ["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census) ["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census) ["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census) } export.canada_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Canada", "province"), default_placetype = "city", data = export.canada_cities, } export.france_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration) ["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration) ["Lyons"] = {alias_of = "Lyon", display = true}, ["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration) ["Marseilles"] = {alias_of = "Marseille", display = true}, ["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration) ["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration) ["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration) ["Nice"] = {container = "Provence-Alpes-Côte d'Azur"}, ["Nantes"] = {container = "Pays de la Loire"}, ["Strasbourg"] = {container = "Grand Est"}, ["Rennes"] = {container = "Brittany"}, } export.france_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", France", "region"), default_placetype = "city", data = export.france_cities, } export.germany_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area) ["Cologne"] = {container = "North Rhine-Westphalia"}, ["Köln"] = {alias_of = "Cologne", display = true}, ["Düsseldorf"] = {container = "North Rhine-Westphalia"}, ["Dusseldorf"] = {alias_of = "Düsseldorf", display = true}, ["Dortmund"] = {container = "North Rhine-Westphalia"}, ["Essen"] = {container = "North Rhine-Westphalia"}, ["Duisburg"] = {container = "North Rhine-Westphalia"}, ["Berlin"] = {}, -- 4,700,000 ["Frankfurt"] = {container = "Hesse"}, -- 3,225,000 ["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer ["Hamburg"] = {}, -- 2,900,000 ["Munich"] = {container = "Bavaria"}, -- 2,300,000 ["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000 ["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000 ["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000 ["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000 ["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000 ["Leipzig"] = {container = "Saxony"}, -- 1,080,000 ["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000 ["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias ["Bremen"] = {}, } export.germany_cities_group = { default_container = "Germany", canonicalize_key_container = make_canonicalize_key_container(", Germany", "negeri"), default_placetype = "city", data = export.germany_cities, } export.india_cities = { -- This lists the 65 metro areas per Demographia's 2023 estimates, as found in -- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was -- conducted in 2011, and the results are not accurate any more. 
["Delhi"] = {container = {key = "Delhi, India", placetype = "union territory"}}, -- 31,190,000 ["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000 ["Kolkata"] = {container = "West Bengal"}, -- 21,747,000 ["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000 ["Bengaluru"] = {alias_of = "Bangalore"}, ["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000 ["Hyderabad"] = {container = "Telangana"}, -- 9,797,000 ["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000 ["Pune"] = {container = "Maharashtra"}, -- 6,819,000 ["Surat"] = {container = "Gujarat"}, -- 6,601,000 ["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000 ["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000 ["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000 ["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000 ["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000 ["Patna"] = {container = "Bihar"}, -- 3,331,000 ["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000 ["Kozhikode"] = {container = "Kerala"}, -- 3,049,000 ["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000 ["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000 ["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000 ["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000 ["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000 ["Prayagraj"] = {alias_of = "Allahabad"}, ["Kochi"] = {container = "Kerala"}, -- 2,381,000 ["Ludhiana"] = {container = "Punjab"}, -- 2,205,000 ["Vadodara"] = {container = "Gujarat"}, -- 2,182,000 ["Chandigarh"] = {container = {key = "Chandigarh, India", placetype = "union territory"}}, -- 2,168,000 ["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000 ["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000 ["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000 ["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000 ["Malappuram"] = {container = "Kerala"}, -- 1,868,000 ["Nashik"] = {container = "Maharashtra"}, -- 1,810,000 
["Asansol"] = {container = "West Bengal"}, -- 1,720,000 ["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000 ["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000 ["Thrissur"] = {container = "Kerala"}, -- 1,578,000 ["Kollam"] = {container = "Kerala"}, -- 1,576,000 ["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000 ["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000 ["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000 ["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000 ["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"}, ["Rajkot"] = {container = "Gujarat"}, -- 1,487,000 ["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000 ["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000 ["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000 ["Kannur"] = {container = "Kerala"}, -- 1,360,000 ["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000 ["Guwahati"] = {container = "Assam"}, -- 1,355,000 ["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000 ["Amritsar"] = {container = "Punjab"}, -- 1,313,000 ["Mysore"] = {container = "Karnataka"}, -- 1,296,000 ["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000 ["Durg-Bhilainagar"] = {alias_of = "Bhilai"}, ["Durg-Bhilai"] = {alias_of = "Bhilai"}, ["Durg"] = {alias_of = "Bhilai"}, ["Bhilainagar"] = {alias_of = "Bhilai"}, ["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000 ["Srinagar"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,212,000 ["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000 ["Kota"] = {container = "Rajasthan"}, -- 1,172,000 ["Jalandhar"] = {container = "Punjab"}, -- 1,165,000 ["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000 ["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000 ["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000 ["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000 ["Jammu"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 
1,103,000 ["Solapur"] = {container = "Maharashtra"}, -- 1,082,000 ["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash ["Hubli"] = {alias_of = "Hubli-Dharwad"}, ["Dharwad"] = {alias_of = "Hubli-Dharwad"}, ["Puducherry"] = {container = {key = "Puducherry, India", placetype = "union territory"}}, -- 1,024,000 ["Pondicherry"] = {alias_of = "Puducherry", display = true}, -- satellite/secondary cities of metro area (none in citypopulation.de) ["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area ["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area ["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true}, ["Kalyan"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivli"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivali"] = {alias_of = "Kalyan-Dombivli"}, ["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area ["Vasai"] = {alias_of = "Vasai-Virar"}, ["Virar"] = {alias_of = "Vasai-Virar"}, ["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area ["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area ["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area ["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true}, } export.india_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", India", "negeri"), default_placetype = "city", data = export.india_cities, } 
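--[==[
The `alias_of` entries in the table above (e.g. `["Bengaluru"] = {alias_of = "Bangalore"}`) name the key of the canonical entry. The following is a minimal sketch, not part of this module, of how such an entry could be followed to its canonical spec. The function name `resolve_alias` is hypothetical, aliases are assumed to point directly at a canonical key (no chains), and the sketch deliberately ignores the `display` field, whose handling is defined elsewhere.

```lua
-- Hypothetical helper: follow a single `alias_of` link in a place data table.
local function resolve_alias(data, key)
	local spec = data[key]
	if spec and spec.alias_of then
		-- Assumption: the alias target is itself a canonical (non-alias) entry.
		return spec.alias_of, data[spec.alias_of]
	end
	return key, spec
end

-- Self-contained usage with two entries copied from `export.india_cities`.
local sample = {
	["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"},
	["Bengaluru"] = {alias_of = "Bangalore"},
}
local canonical_key, canonical_spec = resolve_alias(sample, "Bengaluru")
assert(canonical_key == "Bangalore")
assert(canonical_spec.container == "Karnataka")
```

Keeping aliases as plain keys in the same table means a lookup needs only one extra table access, at the cost of having to keep each `alias_of` value in sync with the spelling of its canonical key.
]==]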
export.indonesia_cities = { -- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate ["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = { {type = "subdistricts", container_parent_type = false}, }}, ["Surabaya"] = {container = "East Java"}, ["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area ["Bandung"] = {container = "West Java"}, ["Medan"] = {container = "North Sumatra"}, ["Depok"] = {container = "West Java"}, -- part of Jakarta metro area ["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Palembang"] = {container = "South Sumatra"}, ["Semarang"] = {container = "Central Java"}, ["Makassar"] = {container = "South Sulawesi"}, ["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Batam"] = {container = "Riau Islands"}, ["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area ["Pekanbaru"] = {container = "Riau"}, ["Bandar Lampung"] = {container = "Lampung"}, -- other metro areas over 1,000,000 people ["Padang"] = {container = "West Sumatra"}, ["Samarinda"] = {container = "East Kalimantan"}, ["Malang"] = {container = "East Java"}, ["Yogyakarta"] = {container = "Special Region of Yogyakarta"}, ["Denpasar"] = {container = "Bali"}, ["Cirebon"] = {container = "West Java"}, ["Surakarta"] = {container = "Central Java"}, ["Banjarmasin"] = {container = "South Kalimantan"}, ["Tasikmalaya"] = {container = "West Java"}, } export.indonesia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Indonesia", "province"), default_placetype = "city", data = export.indonesia_cities, } export.italy_cities = { -- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used -- here, only gives estimates as of Jan 1, 2014. 
["Milan"] = {container = "Lombardy"}, -- 6,623,798 ["Naples"] = {container = "Campania"}, -- 5,294,546 ["Rome"] = {container = "Lazio"}, -- 4,447,881 ["Turin"] = {container = "Piedmont"}, -- 1,865,284 ["Venice"] = {container = "Veneto"}, -- 1,645,900 ["Florence"] = {container = "Tuscany"}, -- 1,485,030 ["Bari"] = {container = "Apulia"}, -- 1,257,459 ["Palermo"] = {container = "Sicily"}, -- 1,183,084 -- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition). ["Catania"] = {container = "Sicily"}, -- 988,240 ["Brescia"] = {container = "Lombardy"}, -- 924,090 ["Genoa"] = {container = "Liguria"}, -- 861,318 } export.italy_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Italy", "region"), default_placetype = "city", data = export.italy_cities, } export.japan_cities = { -- Population figures from [[w:List of cities in Japan]]. Metro areas from -- [[w:List of metropolitan areas in Japan]]. ["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])", placetype = {"city", "prefecture"}, divs = { {type = "special wards", container_parent_type = false}, {type = "cities", prep = "di"}, }, }, ["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894 ["Osaka"] = {container = "Osaka"}, -- 2,668,586 ["Nagoya"] = {container = "Aichi"}, -- 2,283,289 -- FIXME, Hokkaido is handled specially. 
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096 ["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527 ["Kobe"] = {container = "Hyōgo"}, -- 1,530,847 ["Kyoto"] = {container = "Kyoto"}, -- 1,474,570 ["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630 ["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418 ["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806 ["Sendai"] = {container = "Miyagi"}, -- 1,029,552 -- the remaining cities are considered "central cities" in a 1,000,000+ metro area -- (sometimes there is more than one central city in the area). ["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998 ["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695 ["Sakai"] = {container = "Osaka"}, -- 835,333 ["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053 ["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431 ["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944 ["Sagamihara"] = {container = "Kanagawa"}, -- 706,342 ["Okayama"] = {container = "Okayama"}, -- 701,293 ["Kumamoto"] = {container = "Kumamoto"}, -- 670,348 ["Kagoshima"] = {container = "Kagoshima"}, -- 605,196 -- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka) -- with population in the range 509k - 587k because not central cities in any -- 1,000,000+ metro area. 
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833 } export.japan_cities_group = { default_container = "Japan", canonicalize_key_container = make_canonicalize_key_container(" Prefecture, Japan", "prefecture"), default_placetype = "city", data = export.japan_cities, } export.mexico_cities = { ["Mexico City"] = {}, -- its own state ["Monterrey"] = {container = "Nuevo León"}, ["Guadalajara"] = {container = "Jalisco"}, ["Puebla"] = {container = "Puebla", wp = "%l (city)"}, ["Toluca"] = {container = "State of Mexico"}, ["Tijuana"] = {container = "Baja California"}, -- Include the state in the category for León due to possible confusion with León, Spain. ["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"}, ["León"] = {alias_of = "León, Guanajuato"}, ["Leon"] = {alias_of = "León, Guanajuato", display = true}, ["Querétaro"] = {container = "Querétaro", wp = "%l (city)"}, ["Queretaro"] = {alias_of = "Querétaro", display = true}, ["Ciudad Juárez"] = {container = "Chihuahua"}, ["Juárez"] = {alias_of = "Ciudad Juárez"}, ["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"}, ["Torreón"] = {container = "Coahuila"}, ["Torreon"] = {alias_of = "Torreón", display = true}, -- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or -- Mérida, Venezuela. 
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"}, ["Mérida"] = {alias_of = "Mérida, Yucatán"}, ["Merida"] = {alias_of = "Mérida, Yucatán", display = true}, ["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"}, ["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true}, ["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"}, ["Mexicali"] = {container = "Baja California"}, } export.mexico_cities_group = { default_container = "Mexico", canonicalize_key_container = make_canonicalize_key_container(", Mexico", "negeri"), default_placetype = "city", data = export.mexico_cities, } export.nigeria_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. ["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability) ["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability) ["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability) ["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "wilayah persekutuan"}}, -- 3,050,000 (unindicated; population of low reliability) ["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability) ["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability) ["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability) ["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability) ["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability) ["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability) ["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability) ["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 
(unindicated; population of low reliability) ["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability) ["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability) ["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability) } export.nigeria_cities_group = { default_container = "Nigeria", canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "negeri"), default_placetype = "city", data = export.nigeria_cities, } export.pakistan_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. ["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area) ["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area) ["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad) ["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "wilayah persekutuan"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi) ["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area) ["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area) -- there is also Hyderabad in India (very confusing) ["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area) ["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"}, ["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area) ["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area) ["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area) ["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area) ["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area) } export.pakistan_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "province"), default_placetype = "city", data 
= export.pakistan_cities, } export.philippines_cities = { -- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts. -- Other cities outside Metro Manila skipped as not central city in their urban area. ["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, -- Don't display-canonicalize Foo to Foo City as it may make the display weird. ["Quezon"] = {alias_of = "Quezon City"}, ["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, ["Davao City"] = {container = "Davao del Sur"}, ["Davao"] = {alias_of = "Davao City"}, ["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, ["Zamboanga City"] = {container = "Zamboanga del Sur"}, ["Zamboanga"] = {alias_of = "Zamboanga City"}, ["Cebu City"] = {container = "Cebu"}, ["Cebu"] = {alias_of = "Cebu City"}, ["Antipolo"] = {container = "Rizal"}, ["Cagayan de Oro"] = {container = "Misamis Oriental"}, ["Dasmariñas"] = {container = "Cavite"}, ["Dasmarinas"] = {alias_of = "Dasmariñas", display = true}, ["General Santos"] = {container = "South Cotabato"}, ["San Jose del Monte"] = {container = "Bulacan"}, ["Bacolod"] = {container = "Negros Occidental"}, ["Calamba"] = {container = "Laguna", wp = "%l, %c"}, ["Angeles"] = {container = "Pampanga", wp = "Angeles City"}, ["Angeles City"] = {alias_of = "Angeles"}, ["Iloilo City"] = {container = "Iloilo"}, ["Iloilo"] = {alias_of = "Iloilo City"}, } export.philippines_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Philippines", "province"), default_placetype = "city", data = export.philippines_cities, } export.russia_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. 
["Moscow"] = {}, -- 18,800,000 (Agglomeration) ["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration) ["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration) ["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration) ["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration) ["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration) ["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration) ["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration) ["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true}, ["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration) ["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration) ["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration) ["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration) ["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration) ["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration) ["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration) ["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration) ["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration) } export.russia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"), default_container = "Russia", default_placetype = "city", data = export.russia_cities, } export.saudi_arabia_cities = { -- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are -- metro, urban or city proper figures. 
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jedda"] = {alias_of = "Jeddah", display = true}, ["Jiddah"] = {alias_of = "Jeddah", display = true}, ["Jidda"] = {alias_of = "Jeddah", display = true}, ["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Makkah"] = {alias_of = "Mecca", display = true}, ["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City) ["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true}, } export.saudi_arabia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "province"), default_placetype = "city", data = export.saudi_arabia_cities, } export.south_korea_cities = { -- All cities listed are not associated with any county. 
["Seoul"] = {}, ["Busan"] = {}, ["Incheon"] = {}, ["Daegu"] = {}, ["Daejeon"] = {}, ["Gwangju"] = {}, ["Ulsan"] = {}, } export.south_korea_cities_group = { default_container = "South Korea", canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "province"), default_placetype = "city", data = export.south_korea_cities, } export.spain_cities = { ["Madrid"] = {container = "Community of Madrid"}, ["Barcelona"] = {container = "Catalonia"}, ["Valencia"] = {container = "Valencia"}, ["Seville"] = {container = "Andalusia"}, ["Bilbao"] = {container = "Basque Country"}, } export.spain_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"), default_placetype = "city", data = export.spain_cities, } export.taiwan_cities = { ["New Taipei City"] = {}, ["New Taipei"] = {alias_of = "New Taipei City", display = true}, ["Taichung"] = {}, ["Kaohsiung"] = {wp = "%l, Taiwan"}, ["Taipei"] = {}, ["Taoyuan"] = {}, ["Tainan"] = {}, -- these last three are not special municipalities ["Chiayi"] = {placetype = "city"}, ["Hsinchu"] = {placetype = "city"}, ["Keelung"] = {placetype = "city"}, } export.taiwan_cities_group = { placename_to_key = false, -- don't add ", Taiwan" to make the key canonicalize_key_container = make_canonicalize_key_container(", Taiwan", "county"), default_container = "Taiwan", default_placetype = {"special municipality", "municipality", "city"}, default_is_city = true, default_divs = {"districts"}, data = export.taiwan_cities, } -- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct, -- everything else will be figured out. 
export.united_kingdom_cities = { ["London"] = {container = "Greater London"}, ["Manchester"] = {container = "Greater Manchester"}, ["Birmingham"] = {container = "West Midlands"}, ["Liverpool"] = {container = "Merseyside"}, ["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}}, ["Leeds"] = {container = "West Yorkshire"}, ["Newcastle upon Tyne"] = {container = "Tyne and Wear"}, ["Newcastle"] = {alias_of = "Newcastle upon Tyne"}, ["Bristol"] = {container = {key = "England", placetype = "constituent country"}}, ["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Portsmouth"] = {container = "Hampshire"}, ["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}}, -- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]] ["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"}, } export.united_kingdom_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", England", "county"), default_placetype = "city", data = export.united_kingdom_cities, } export.united_states_cities = { -- top 50 CSA's by population, with the top and sometimes 2nd or 3rd city listed ["New York City"] = {container = "New York", wp = "%l", divs = { {type = "boroughs", container_parent_type = false}, }}, -- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York). 
["New York"] = {alias_of = "New York City"}, ["Newark"] = {container = "New Jersey"}, ["Los Angeles"] = {container = "California", wp = "%l"}, ["Long Beach"] = {container = "California"}, ["Riverside"] = {container = "California"}, ["Chicago"] = {container = "Illinois", wp = "%l"}, ["Washington, D.C."] = {wp = "%l"}, ["Washington, DC"] = {alias_of = "Washington, D.C.", display = true}, ["Washington D.C."] = {alias_of = "Washington, D.C.", display = true}, ["Washington DC"] = {alias_of = "Washington, D.C.", display = true}, -- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of -- Columbia holonym). ["Washington"] = {alias_of = "Washington, D.C."}, ["Baltimore"] = {container = "Maryland", wp = "%l"}, -- to avoid conflict with San Jose in Costa Rica ["San Jose, California"] = {container = "California"}, ["San Jose"] = {alias_of = "San Jose, California"}, ["San Francisco"] = {container = "California", wp = "%l"}, ["Oakland"] = {container = "California"}, ["Boston"] = {container = "Massachusetts", wp = "%l"}, ["Providence"] = {container = "Rhode Island"}, ["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Fort Worth"] = {container = "Texas"}, ["Philadelphia"] = {container = "Pennsylvania", wp = "%l"}, ["Houston"] = {container = "Texas", wp = "%l"}, ["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"}, ["Atlanta"] = {container = "Georgia", wp = "%l"}, ["Detroit"] = {container = "Michigan", wp = "%l"}, ["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"}, ["Mesa"] = {container = "Arizona"}, ["Seattle"] = {container = "Washington", wp = "%l"}, ["Orlando"] = {container = "Florida"}, ["Minneapolis"] = {container = "Minnesota", wp = "%l"}, ["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"}, ["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"}, 
["Portland"] = {container = "Oregon"}, ["Tampa"] = {container = "Florida"}, ["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"}, ["Saint Louis"] = {alias_of = "St. Louis", display = true}, ["Charlotte"] = {container = "North Carolina"}, ["Sacramento"] = {container = "California"}, ["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"}, ["Salt Lake City"] = {container = "Utah", wp = "%l"}, ["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Columbus"] = {container = "Ohio"}, ["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"}, ["Indianapolis"] = {container = "Indiana", wp = "%l"}, ["Las Vegas"] = {container = "Nevada", wp = "%l"}, ["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Austin"] = {container = "Texas"}, ["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"}, ["Raleigh"] = {container = "North Carolina"}, ["Nashville"] = {container = "Tennessee"}, ["Virginia Beach"] = {container = "Virginia"}, ["Norfolk"] = {container = "Virginia"}, ["Greensboro"] = {container = "North Carolina"}, ["Winston-Salem"] = {container = "North Carolina"}, ["Jacksonville"] = {container = "Florida"}, ["New Orleans"] = {container = "Louisiana", wp = "%l"}, ["Louisville"] = {container = "Kentucky"}, ["Greenville"] = {container = "South Carolina"}, ["Hartford"] = {container = "Connecticut"}, ["Oklahoma City"] = {container = "Oklahoma", wp = "%l"}, ["Grand Rapids"] = {container = "Michigan"}, ["Memphis"] = {container = "Tennessee"}, ["Birmingham, Alabama"] = {container = "Alabama"}, ["Birmingham"] = {alias_of = "Birmingham, Alabama"}, ["Fresno"] = {container = "California"}, ["Richmond"] = {container = "Virginia"}, ["Harrisburg"] = {container = "Pennsylvania"}, -- any major city of top 50 MSA's that's missed by previous ["Buffalo"] = {container = "New York"}, -- any of the top 50 city by city population that's missed by previous ["El Paso"] = 
{container = "Texas"},
	["Albuquerque"] = {container = "New Mexico"},
	["Tucson"] = {container = "Arizona"},
	["Colorado Springs"] = {container = "Colorado"},
	["Omaha"] = {container = "Nebraska"},
	["Tulsa"] = {container = "Oklahoma"},
	-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}

export.united_states_cities_group = {
	default_container = "Amerika Syarikat",
	canonicalize_key_container = make_canonicalize_key_container(", USA", "negeri"),
	default_placetype = "city",
	default_wp = "%l, %c",
	data = export.united_states_cities,
}

export.new_york_boroughs = {
	["Bronx"] = {the = true, wp = "The Bronx"},
	["Brooklyn"] = {},
	["Manhattan"] = {},
	["Queens"] = {},
	["Staten Island"] = {},
}

export.new_york_boroughs_group = {
	default_container = {key = "New York City", placetype = "city"},
	default_placetype = "borough",
	default_is_city = true,
	data = export.new_york_boroughs,
}

export.vietnam_cities = {
	-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
	["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
	["Saigon"] = {alias_of = "Ho Chi Minh City"},
	["Hanoi"] = {}, -- 7,350,000 (Agglomeration)
	["Da Nang"] = {}, -- 1,500,000 (Agglomeration)
	["Danang"] = {alias_of = "Da Nang", display = true},
	["Haiphong"] = {}, -- 1,450,000 (Agglomeration)
	["Hai Phong"] = {alias_of = "Haiphong", display = true},
	-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
	-- meaning it is directly under its province as opposed to being contained in a district.
["Bien Hoa"] = {placetype = "city", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia) ["Biên Hòa"] = {alias_of = "Bien Hoa", display = true}, ["Biên Hoà"] = {alias_of = "Bien Hoa", display = true}, -- These two not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are -- both province-level municipalities and close to the 1,000,000 mark. ["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]] ["Cần Thơ"] = {alias_of = "Can Tho", display = true}, ["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); -- capital [[Thuận Hóa district]] ["Huế"] = {alias_of = "Hue", display = true}, } export.vietnam_cities_group = { placename_to_key = false, -- don't add ", Vietnam" to make the key default_container = "Vietnam", canonicalize_key_container = make_canonicalize_key_container(" Province, Vietnam", "province"), -- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of -- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct -- known locations. default_placetype = {"municipality", "city"}, default_is_city = true, -- There may not be enough districts to subcategorize like this. -- default_divs = "districts", data = export.vietnam_cities, } export.misc_cities = { ------------------ Africa ------------------- -- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from -- [[w:List of urban areas in Africa by population]]. 
["Algiers"] = {container = "Algeria"}, -- 4,325,000 (Consolidated Urban Area) ["Oran"] = {container = "Algeria"}, -- 1,640,000 (Consolidated Urban Area) ["Luanda"] = {container = "Angola"}, -- 9,650,000 (Urban Area) ["Benguela"] = {container = "Angola"}, -- 1,420,000 (Urban Area) ["Cotonou"] = {container = "Benin"}, -- 2,150,000 (Agglomeration) ["Ouagadougou"] = {container = "Burkina Faso"}, -- 3,425,000 (Agglomeration) ["Bobo-Dioulasso"] = {container = "Burkina Faso"}, -- 1,100,000 (Agglomeration) ["Bujumbura"] = {container = "Burundi"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia) ["Yaoundé"] = {container = "Cameroon"}, -- 3,975,000 (City) ["Yaounde"] = {alias_of = "Yaoundé", display = true}, ["Douala"] = {container = "Cameroon"}, -- 3,900,000 (City) ["Bangui"] = {container = "Central African Republic"}, -- 1,680,000 (Agglomeration) ["N'Djamena"] = {container = "Chad"}, -- 1,950,000 (City) ["Ndjamena"] = {alias_of = "N'Djamena", display = true}, ["Kinshasa"] = {container = "Democratic Republic of the Congo"}, -- 16,300,000 (City; population of low reliability) ["Lubumbashi"] = {container = "Democratic Republic of the Congo"}, -- 2,875,000 (City; population of low reliability) ["Mbuji-Mayi"] = {container = "Democratic Republic of the Congo"}, -- 2,500,000 (City; population of low reliability) ["Kananga"] = {container = "Democratic Republic of the Congo"}, -- 1,370,000 (City; population of low reliability) ["Kisangani"] = {container = "Democratic Republic of the Congo"}, -- 1,300,000 (City; population of low reliability) ["Bukavu"] = {container = "Democratic Republic of the Congo"}, -- 1,100,000 (City; population of low reliability) ["Goma"] = {container = "Democratic Republic of the Congo"}, -- 1,010,000 (City; population of low reliability) ["Tshikapa"] = {container = "Democratic Republic of the Congo"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de) 
	["Cairo"] = {container = "Egypt"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
	["Alexandria"] = {container = "Egypt"}, -- 6,250,000 (Agglomeration)
	["Giza"] = {container = "Egypt"}, -- 4,458,135 (2023 from citypopulation.de)
	["Shubra El Kheima"] = {container = "Egypt"}, -- 1,240,239 (2021 from citypopulation.de)
	["Asmara"] = {container = "Eritrea"}, -- 1,090,000 (City; population of low reliability)
	["Asmera"] = {alias_of = "Asmara", display = true},
	["Addis Ababa"] = {container = "Ethiopia"}, -- 4,825,000 (Agglomeration)
	["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration)
	["Accra"] = {container = "Ghana"}, -- 6,800,000 (Agglomeration)
	["Kumasi"] = {container = "Ghana"}, -- 2,900,000 (Agglomeration)
	["Conakry"] = {container = "Guinea"}, -- 2,975,000 (Consolidated Urban Area)
	["Abidjan"] = {container = "Ivory Coast"}, -- 7,050,000 (Agglomeration)
	["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated)
	["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City)
	["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area)
	["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
	["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
	["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City)
	["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration)
	["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City)
	["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "region"}}, -- 4,450,000 (Municipality (urban population))
	["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "region"}}, -- 2,125,000 (Municipality (urban population))
	["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "region"}}, -- 1,410,000 (Municipality (urban population))
	["Tanger"] = {alias_of = "Tangier", display = true},
	["Tangiers"] = {alias_of = "Tangier", display = true},
	["Fez"] = {container = {key = "Fez-Meknes, Morocco", placetype = "region"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
	["Fes"] = {alias_of = "Fez", display = true},
	["Fès"] = {alias_of = "Fez", display = true},
	["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "region"}}, -- 1,270,000 (Municipality (urban population))
	["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "region"}}, -- 1,140,000 (Municipality (urban population))
	["Marrakech"] = {alias_of = "Marrakesh", display = true},
	["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
	["Niamey"] = {container = "Niger"}, -- 1,530,000 (City)
	["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
	["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City)
	["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
	["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration)
	["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration)
	["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
	["Mogadishu"] = {container = "Somalia"}, -- 2,250,000 (unindicated; population of low reliability)
	["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "province"}}, -- 5,100,000 (Consolidated Urban Area) ["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "province"}}, -- 3,900,000 (Consolidated Urban Area) ["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 2,921,488 (2011 census) ["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "province"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area) ["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias ["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability) ["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration) ["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration) ["Mwanza City"] = {alias_of = "Mwanza", display = true}, ["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration) ["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration) ["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated) ["Lome"] = {alias_of = "Lomé", display = true}, ["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population)) ["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population)) ["Soussa"] = {alias_of = "Sousse", display = true}, ["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated) ["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area) ["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration) ------------------ Asia ------------------- -- sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. 
	["Kabul"] = {container = "Afghanistan"}, -- 5,250,000 (Agglomeration)
	["Baku"] = {container = "Azerbaijan"}, -- 3,725,000 (Administrative Area (urban population))
	["Manama"] = {container = "Bahrain"}, -- 1,560,000 (unindicated)
	["Dhaka"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 23,100,000 (Agglomeration)
	["Dacca"] = {alias_of = "Dhaka", display = true},
	["Chittagong"] = {container = {key = "Chittagong Division, Bangladesh", placetype = "division"}}, -- 5,050,000 (Agglomeration)
	["Gazipur"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
	["Khulna"] = {container = {key = "Khulna Division, Bangladesh", placetype = "division"}}, -- 1,210,000 (Agglomeration)
	["Phnom Penh"] = {container = "Cambodia"}, -- 2,925,000 (Agglomeration)
	["Tehran"] = {container = {key = "Tehran Province, Iran", placetype = "province"}}, -- 16,800,000 (Agglomeration)
	["Teheran"] = {alias_of = "Tehran", display = true},
	["Mashhad"] = {container = {key = "Razavi Khorasan Province, Iran", placetype = "province"}}, -- 3,475,000 (Agglomeration)
	["Mashad"] = {alias_of = "Mashhad", display = true},
	["Meshhed"] = {alias_of = "Mashhad", display = true},
	["Meshed"] = {alias_of = "Mashhad", display = true},
	["Isfahan"] = {container = {key = "Isfahan Province, Iran", placetype = "province"}}, -- 3,425,000 (Agglomeration)
	["Esfahan"] = {alias_of = "Isfahan", display = true},
	["Tabriz"] = {container = {key = "East Azerbaijan Province, Iran", placetype = "province"}}, -- 1,970,000 (Agglomeration)
	["Shiraz"] = {container = {key = "Fars Province, Iran", placetype = "province"}}, -- 1,950,000 (Agglomeration)
	["Ahvaz"] = {container = {key = "Khuzestan Province, Iran", placetype = "province"}}, -- 1,550,000 (Agglomeration)
	["Qom"] = {container = {key = "Qom Province, Iran", placetype = "province"}}, -- 1,450,000 (City)
	["Kermanshah"] = {container = {key =
"Kermanshah Province, Iran", placetype = "province"}}, -- 1,130,000 (City) ["Baghdad"] = {container = "Iraq"}, -- 7,800,000 (Administrative Area (urban population)) ["Basra"] = {container = "Iraq"}, -- 1,710,000 (Administrative Area (urban population)) ["Mosul"] = {container = "Iraq"}, -- 1,550,000 (Administrative Area (urban population)) ["Erbil"] = {container = "Iraq"}, -- 1,220,000 (Administrative Area (urban population)) ["Kirkuk"] = {container = "Iraq"}, -- 1,160,000 (Administrative Area (urban population)) ["Najaf"] = {container = "Iraq"}, -- 1,050,000 (Administrative Area (urban population)) ["Tel Aviv"] = {container = "Israel"}, -- 3,000,000 (Agglomeration) -- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a -- [[w:corpus separatum]], so put the container as "Asia" and list Israel and Palestine as additional parents for -- categorization purposes. ["Jerusalem"] = {container = {key = "Asia", placetype = "benua"}, addl_parents = {"Israel", "Palestine"}}, -- 1,080,000 (Agglomeration) ["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated) ["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated) ["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration) ["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize ["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration) ["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration) ["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration) ["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration) ["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability) -- Kuala Lumpur is a federal capital city, not in any state ["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration) -- there are various George Towns and Georgetowns ["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = 
"negeri"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration) ["George Town"] = {alias_of = "George Town, Malaysia"}, ["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City) ["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true}, ["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population)) ["Rangoon"] = {alias_of = "Yangon", display = true}, ["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population)) ["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration) -- Pyongyang is a directly governed city, not in any province ["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population)) ["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration) ["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated) ["Gaza City"] = {alias_of = "Gaza"}, ["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration) ["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated) ["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability) ["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability) ["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City) ["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration) -- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia -- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]] ["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "province"}}, ["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "province"}}, -- 1,570,000 (Agglomeration; including Pattaya) -- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021; -- second source is citypopulation.de reference date 2025-01-01. 
["Istanbul"] = {placetype = {"city", "province"}, divs = {"districts"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration) ["İstanbul"] = {alias_of = "Istanbul", display = true}, ["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "province"}}, -- 5.15 million; 5,200,000 (Agglomeration) ["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "province"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration) ["İzmir"] = {alias_of = "Izmir", display = true}, ["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "province"}}, -- 2.02 million; 2,200,000 (Agglomeration) ["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "province"}}, -- 1.77 million; 1,780,000 (Agglomeration) ["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "province"}}, -- 1.71 million; 1,750,000 (Agglomeration) ["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "province"}}, -- 1.3 million; 1,400,000 (Agglomeration) ["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "province"}}, -- 1.35 million; 1,390,000 (Agglomeration) ["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "province"}}, -- 1.07 million; 1,100,000 (Agglomeration) -- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not -- display-canonicalize to the Turkish form Diyarbakır. 
["Diyarbakir"] = {alias_of = "Diyarbakır"}, ["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "province"}}, -- 1.03 million; 1,060,000 (Agglomeration) ["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration) ["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah) ["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City) ["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai) ["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated) ["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability) ["Sana'a"] = {alias_of = "Sanaa", display = true}, ["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia) ------------------ Europe or Europe-like (Caucasus etc.) --------------------- ["Yerevan"] = {container = "Armenia"}, -- 1,520,000 (Agglomeration) ["Vienna"] = {container = "Austria"}, -- 2,375,000 (Agglomeration) ["Minsk"] = {container = "Belarus"}, -- 2,100,000 (unindicated) ["Brussels"] = {container = "Belgium"}, -- 2,800,000 (Consolidated Urban Area) ["Antwerp"] = {container = "Belgium"}, -- 1,270,000 (Consolidated Urban Area) ["Sofia"] = {container = "Bulgaria"}, -- 1,260,000 (Agglomeration) ["Zagreb"] = {container = "Croatia"}, ["Prague"] = {container = "Czech Republic"}, -- 1,470,000 (Agglomeration) ["Brno"] = {container = "Czech Republic"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office) ["Olomouc"] = {container = "Czech Republic"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms) ["Copenhagen"] = {container = "Denmark"}, -- 1,800,000 (Consolidated Urban Area) ["Helsinki"] = {container = {key = "Uusimaa, Finland", placetype = "region"}}, -- 1,560,000 (Consolidated Urban Area) ["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration) 
["Athens"] = {container = "Greece"}, ["Thessaloniki"] = {container = "Greece"}, ["Budapest"] = {container = "Hungary"}, -- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region" ["Dublin"] = {container = {key = "County Dublin, Ireland", placetype = "county"}}, ["Riga"] = {container = "Latvia"}, ["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "province"}}, ["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "province"}}, ["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "province"}}, -- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it. ["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "region"}}, ["Oslo"] = {container = {key = "Oslo, Norway", placetype = "county"}}, ["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}}, ["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Krakow" without accent. ["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"}, ["Kraków"] = {alias_of = "Krakow", display = true}, ["Cracow"] = {alias_of = "Krakow", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent. ["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}}, ["Gdansk"] = {alias_of = "Gdańsk", display = true}, ["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}}, ["Poznan"] = {alias_of = "Poznań", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Lodz" without accents. 
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"}, ["Łódź"] = {alias_of = "Lodz", display = true}, ["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}}, ["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}}, ["Oporto"] = {alias_of = "Porto", display = true}, ["Bucharest"] = {container = "Romania"}, ["Belgrade"] = {container = "Serbia"}, ["Stockholm"] = {container = "Sweden"}, ["Zurich"] = {container = "Switzerland"}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Zurich" without umlaut. --- Even Wikipedia uses the form without umlaut. ["Zürich"] = {alias_of = "Zurich", display = true}, ["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast -- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common. ["Kiev"] = {alias_of = "Kyiv"}, ["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}}, ["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"}, -- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement. ["Odesa"] = {alias_of = "Odessa"}, ------------------ North America, South America --------------------- -- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01); -- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data; -- Wikipedia city limits figures from [[w:List of largest cities in the Americas]]. 
["Buenos Aires"] = {container = "Argentina"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia) ["Córdoba, Argentina"] = {container = "Argentina", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia) -- to avoid confusion with Córdoba in Spain ["Córdoba"] = {alias_of = "Córdoba, Argentina"}, ["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"}, ["Rosario"] = {container = "Argentina", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia) ["Mendoza"] = {container = "Argentina", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area) ["San Miguel de Tucumán"] = {container = "Argentina"}, -- 1,110,000 (Consolidated Urban Area) ["Tucumán"] = {alias_of = "San Miguel de Tucumán"}, ["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"}, ["Santa Cruz de la Sierra"] = {container = "Bolivia"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia) ["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"}, ["La Paz"] = {container = "Bolivia"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz) ["El Alto"] = {container = "Bolivia"}, ["Cochabamba"] = {container = "Bolivia"}, -- 1,280,000 (Consolidated Urban Area) ["Santiago"] = {container = "Chile"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? 
per Wikipedia) ["Valparaíso"] = {container = "Chile"}, -- 1,060,000 (Consolidated Urban Area) ["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area) ["Bogotá"] = {container = "Colombia"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia) ["Bogota"] = {alias_of = "Bogotá", display = true}, ["Medellín"] = {container = "Colombia"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia) ["Medellin"] = {alias_of = "Medellín", display = true}, ["Cali"] = {container = "Colombia"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia) ["Barranquilla"] = {container = "Colombia"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia) ["Bucaramanga"] = {container = "Colombia"}, -- 1,380,000 (Agglomeration) ["Cartagena, Colombia"] = {container = "Colombia", wp = "%l, %c"}, -- 1,250,000 (Agglomeration) -- to avoid confusion with Cartagena, Spain ["Cartagena"] = {alias_of = "Cartagena, Colombia"}, ["Cúcuta"] = {container = "Colombia"}, -- 1,130,000 (Agglomeration) ["Cucuta"] = {alias_of = "Cúcuta", display = true}, -- to avoid conflict with San Jose, California ["San José, Costa Rica"] = {container = "Costa Rica", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia) ["San José"] = {alias_of = "San José, Costa Rica"}, ["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME ["Havana"] = {container = "Cuba"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia) ["Santo Domingo"] = {container = "Dominican Republic"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia) ["Guayaquil"] = {container = "Ecuador"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia) ["Quito"] = {container = "Ecuador"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? 
per Wikipedia) ["San Salvador"] = {container = "El Salvador"}, -- 1,580,000 (Municipality (urban population)) ["Guatemala City"] = {container = "Guatemala"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia) ["Port-au-Prince"] = {container = "Haiti"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia) ["San Pedro Sula"] = {container = "Honduras"}, -- 1,330,000 (Consolidated Urban Area) ["Tegucigalpa"] = {container = "Honduras"}, -- 1,220,000 (Urban Area) ["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area) ["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area) ["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population)) ["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia) ["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration) ["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area) ["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia) ["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia) ["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? 
per Wikipedia) -- to avoid confusion with Valencia (city and autonomous community of Spain) ["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area) ["Valencia"] = {alias_of = "Valencia, Venezuela"}, ["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area) ["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area) } export.misc_cities_group = { canonicalize_key_container = make_canonicalize_key_container(nil, "negara"), default_placetype = "city", data = export.misc_cities, } --[==[ var: List of all known locations, in groups. The first group lists continents and continental regions, followed by three groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities (administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the hundreds). 
]==] export.locations = { export.continents_group, export.countries_group, export.country_like_entities_group, export.former_countries_group, export.australia_group, export.austria_group, export.bangladesh_group, export.brazil_group, export.canada_group, export.china_group, export.china_prefecture_level_cities_group, export.china_prefecture_level_cities_group_2, export.egypt_group, export.finland_group, export.france_group, export.france_departments_group, export.germany_group, export.greece_group, export.india_group, export.indonesia_group, export.iran_group, export.ireland_group, export.italy_group, export.japan_group, export.laos_group, export.lebanon_group, export.malaysia_group, export.malta_group, export.mexico_group, export.moldova_group, export.morocco_group, export.netherlands_group, export.new_zealand_group, export.nigeria_group, export.north_korea_group, export.norway_group, export.pakistan_group, export.philippines_group, export.poland_group, export.portugal_group, export.romania_group, export.russia_group, export.saudi_arabia_group, export.south_africa_group, export.south_korea_group, export.spain_group, export.taiwan_group, export.thailand_group, export.turkey_group, export.ukraine_group, export.united_kingdom_group, export.united_states_group, export.england_group, export.northern_ireland_group, export.scotland_group, export.wales_group, export.vietnam_group, export.australia_cities_group, export.brazil_cities_group, export.canada_cities_group, export.france_cities_group, export.germany_cities_group, export.india_cities_group, export.indonesia_cities_group, export.italy_cities_group, export.japan_cities_group, export.mexico_cities_group, export.nigeria_cities_group, export.pakistan_cities_group, export.philippines_cities_group, export.russia_cities_group, export.saudi_arabia_cities_group, export.south_korea_cities_group, export.spain_cities_group, export.taiwan_cities_group, export.united_kingdom_cities_group, export.united_states_cities_group, 
export.new_york_boroughs_group, export.vietnam_cities_group, export.misc_cities_group, } return export 2mni79t5mgnor8ogyeszmd5msabwih8 281435 281433 2026-04-22T10:05:07Z PeaceSeekers 3334 281435 Scribunto text/plain local export = {} export.force_cat = false -- set to true to force category generation even on non-mainspace pages local m_table = require("Module:table") local string_utilities_module = "Module:string utilities" local en_utilities_module = "Module:en-utilities" local insert = table.insert local concat = table.concat local dump = mw.dumpObject local unpack = unpack or table.unpack -- Lua 5.2 compatibility --[==[ intro: This module contains data on all known locations, along with some lower-level code to process them (higher-level known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using mw.loadData(). ===Location data=== '''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]], especially the section `More about known locations`.''' The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table'' that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. 
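The relationship between a data table, its metadata table and the global list can be sketched in miniature as follows. This is only an illustration: the names `example_cities`, `example_cities_group`, `example_locations` and `count_keys` are hypothetical and do not exist in this module, and the entries are trimmed copies of the kind of data found in the real group tables.

```lua
-- Hypothetical miniature of one location group (illustration only).
-- The data table lists locations and their individual properties.
local example_cities = {
	["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"},
	["Valencia"] = {alias_of = "Valencia, Venezuela"},
	["Maracay"] = {container = "Venezuela"},
}

-- The metadata table holds group-level defaults and points at the data table.
local example_cities_group = {
	default_placetype = "city",
	data = example_cities,
}

-- The global list holds the group metadata tables, not the data tables.
local example_locations = {example_cities_group}

-- Count canonical and alias keys by walking every group's data table.
local function count_keys(locations)
	local canonical, aliases = 0, 0
	for _, group in ipairs(locations) do
		for _, spec in pairs(group.data) do
			if spec.alias_of then
				aliases = aliases + 1
			else
				canonical = canonical + 1
			end
		end
	end
	return canonical, aliases
end
```

Because each metadata table carries its own `data` field, code that iterates over the list of groups never needs a separate registry of data tables.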
A given location is generally described by three values: (a) the group metadata table for the group the location is part of; (b) the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()` function. These are typically named `group`, `key` and `spec`, in that order, and are found in the arguments to many functions. In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases for a given location, and the alias keys only need to be unique within a particular group data table, not across all groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations, canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have per-group defaults, but only global defaults. 
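As a rough sketch of how the same string can be a canonical key in one group and an alias key in another, consider the following. The helper `resolve_key` and the two miniature groups are hypothetical, written only for this illustration (the container values are placeholders), and the sketch ignores spec initialization entirely.

```lua
-- Hypothetical miniature groups (illustration only; containers are placeholders).
local england_cities_sketch = {
	data = {
		["Birmingham"] = {container = "England"},
		["Newcastle upon Tyne"] = {container = "England"},
		["Newcastle"] = {alias_of = "Newcastle upon Tyne"},
	},
}

local us_cities_sketch = {
	data = {
		["Birmingham, Alabama"] = {container = "Alabama, USA"},
		["Birmingham"] = {alias_of = "Birmingham, Alabama"},
	},
}

-- Hypothetical helper: look up `key` in one group and follow a single
-- `alias_of` link. Returns the canonical key and its (uninitialized) spec,
-- or nil if the key is unknown in that group.
local function resolve_key(group, key)
	local spec = group.data[key]
	if not spec then
		return nil
	end
	if spec.alias_of then
		return spec.alias_of, group.data[spec.alias_of]
	end
	return key, spec
end
```

With these tables, looking up `Birmingham` yields the canonical key `Birmingham` in the English group but `Birmingham, Alabama` in the US group, which is why alias keys only need to be unique within a single group data table.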
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys: * Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories) and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and placenames, which is critical to understand when working with location data.) This also applies to constituent countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena, Ascension and Tristan da Cunha). * Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above. 
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`, `Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name in Spain, even though none of those cities are large enough to be included as known locations in this module. (The cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.) * Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent territories, use a qualified key that contains the name of the country or constituent country in it, e.g. `Normandy, France` (a region), `Calvados, France` (a department in the region of Normandy), `Herefordshire, England` (a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, Finland` (a region), `Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, Ireland` (a county) and `New York, USA` (a state). 
As shown in these various examples, (a) first- and second-level divisions are sometimes both included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States or United States of America); (c) the word `the` is not normally included in the key even if the location is normally preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this), except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates an apparent redundancy, as with `Central Finland, Finland`; and (e) sometimes the placetype is included in the key, as with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "negeri", "Oblast" etc. in their name because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County" preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article naming scheme for a given administrative division is a strong clue as to how the division is normally referred to, and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. 
North and South Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.) As mentioned above, associated with canonical keys in the group data table are location specs, which are objects containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''. Uninitialized specs are as directly specified in [[Module:place/locations]], containing only those properties that differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a given location property. (The initialization process also does more transformations in a few cases, noted below.) Note that the default value of a given property is stored in the group metadata table under a key whose name is prefixed with `default_`; for example, the default value corresponding to the `placetype` property of a given location is specified in the `default_placetype` key in the group metadata table. The following are the properties of the location spec. * `placetype`: String specifying the placetype of the location (e.g. "negara", "negeri", "province"). This can also be a table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the group level, or an error occurs. 
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the immediate ''container'' (or containers) of the given location. A container is another location which this location is considered to be directly part of, either politically or (above the country level) geographically. Some locations belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph]]) of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed the ''container trail'', and some functions compute and return this trail as part of their operation. When a location spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a list of canonicalized container structures, each of which is of the form `{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the placetype from the container structure.) The list of canonicalized container structures is stored into the `.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The canonicalization process is described in more detail below under [[#Container spec canonicalization]]. * `divs`: List of recognized political divisions; e.g. 
for the Netherlands, a specification of the form `divs = {"provinces", "municipalities"}` will allow categories such as [[:Category:de:Provinces of the Netherlands]] and [[:Category:pt:Municipalities of the Netherlands]] to be created. Any division that appears here must also be found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the same format as `divs`. This is intended to be used in the situation where some division types are shared among all locations in the group and others differ from location to location. An example where this is used is the United States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have census-designated places categorized as e.g. [[:Category:Census-designated places in Arizona, USA]], but `counties` and `county seats` are specified in the group-level `default_divs` because not all states have counties and county seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property associated with the division type), any division type specified on a sub-country-level location must also be specified on all containers up through the country. For example, since French departments specify `communes` and `municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for France itself. 
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is normal for locations) that computes the location description can also be given. This is used, for example, for Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the keydesc is replaced with the default value of the location description, which specifies the location's placename, placetype, and the corresponding values for each container in the container trail, generally up through (but not beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct the full description of various categories, such as bare location categories, whose description generally reads `"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the specified or auto-constructed location description. * `fulldesc`: String overriding the full description for the bare location category (but not for any other category). This is currently used only for the location `Earth`, at the very top of the tree (because the standard `people, culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent inhabitants). FIXME: This should be renamed `bare_category_fulldesc`. * `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories generated based on the immediate container(s). 
For example, `Hawaii, USA` specifies `Polynesia` as an additional parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category) as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME: This should be renamed `bare_category_addl_parents`. * `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase `province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category pages, is shown in the upper right of bare category pages. 
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`. It rarely needs to be specified because the category page and the article page almost always follow the same format. * `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and `wpcat` and defaults to `wpcat`, which is usually (but not always) correct. * `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in category names such as [[:Category:Cities in the Northern Territory, Australia]] and in old-style place descriptions when the location occurs as the first holonym, such as the city [[Darwin]] described using {{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean properties is {nil}, which amounts to the same as {false}. * `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as [[:Category:Neighborhoods of New York City]] and [[:Category:Neighbourhoods of Sydney]]. This usually needs to be set only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. 
The general principle used in setting this is that all countries in Europe, all dependent territories of any such country, all former British colonies, and any dependent territories of these former colonies, are assumed to use British spelling, while all other countries and associated dependent territories are assumed to use American spelling. This can potentially be modified on a case-by-case basis. * `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire and Gibraltar), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and (through a group-level setting) New York boroughs. In addition, it is set to `true` in `initialize_spec()` whenever the group-level `default_placetype == "city"`, so that all cities get it set without explicitly needing to add a group-level setting for this. Note that the condition `default_placetype == "city"` intentionally excludes Chinese prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods, but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to categories like [[:Category:Rivers in Osaka Prefecture, Japan]] and [[:Category:Cities in Wuhan]] for holonyms that are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like [[:Category:Neighborhoods of Brooklyn]] and [[:Category:Neighborhoods of Monaco]] for holonyms that ''are'' cities; (c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location. (Those that can occur with cities have a `generic_before_cities` setting in [[Module:place/placetypes]], and those that can occur with non-cities have a `generic_before_non_cities` setting.) 
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such places, categories such as [[:Category:fr:Rivers in the Soviet Union]] are neither generated nor recognized (more generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`. * `overriding_bare_label_parents`: Document me! * `bare_category_parent_type`: Document me! * `no_container_cat`: Document me! * `no_container_parent`: Document me! * `no_generic_place_cat`: Document me! * `no_check_holonym_mismatch`: Document me! * `no_auto_augment_container`: Document me! * `no_include_container_in_desc`: Document me! ====Location divisions==== The `divs` field of a location describes the recognized political division types of that location. Specifying a given division type causes any place that is defined as being of that division type, and that has the location as a holonym, to be categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United States has `"negeri"` as a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under [[:Category:fr:States of the United States]]. Note that you do not have to explicitly specify division types for "generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a `generic_before_cities` field if the location is a city); this includes things like cities, towns, villages, neighbo(u)rhoods and rivers. 
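The shape of the resulting category name can be sketched as below. This is a simplified illustration under stated assumptions, not the code actually used for categorization: the function `div_category` and its `opts` table (with hypothetical `prep` and `the` fields) are made up for this example, and the language-code prefix, placetype lookups and all special cases are left out.

```lua
-- Hypothetical helper (illustration only): build the bare category name
-- " placetypes in/of location " from a plural division type.
-- `opts.prep` picks the preposition (assumed default "of");
-- `opts.the` says whether the location takes a preceding "the".
local function div_category(div, location, opts)
	opts = opts or {}
	local prep = opts.prep or "of"
	local name = (opts.the and "the " or "") .. location
	-- Capitalize the first letter of the plural placetype.
	local label = div:sub(1, 1):upper() .. div:sub(2)
	return label .. " " .. prep .. " " .. name
end
```

For instance, `div_category("states", "United States", {the = true})` produces `States of the United States`, matching the category named in the example above apart from the language prefix.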
A given element in the `divs` list is usually a string naming a plural placetype; the placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field (if the `plural` isn't explicitly given, the default singularization algorithm in [[Module:en-utilities]] is run, which gets most things correctly but has problems with `passes` and `fortresses`, which are singularized to `passe` and `fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with the placetype. An example of this is the `divs` list for Canada: { ["Canada"] = {divs = { {type = "provinces", cat_as = "provinces and territories"}, {type = "territories", cat_as = "provinces and territories"}, "counties", "districts", "municipalities", "regional municipalities", "rural municipalities", "parishes", "Indian reserves", "census divisions", {type = "townships", prep = "di"}, }, ...}, } Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and territories. Similar things are done for other countries that have more than one type of first-level administrative division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the table of placetypes in [[Module:place/placetypes]], and in fact there is a category-only entry there for `provinces and territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for use in categories and won't be recognized as the placetype field in a {{tl|place}} description). 
In addition, townships are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be [[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat related to whether a given placetype is an official administrative or statistical division of the location in question and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities and towns.) Another more complex example is the divisions given for Quebec: { ["Quebec, Canada"] = {divs = { "counties", {type = "regional county municipalities", container_parent_type = "regional municipalities"}, {type = "regions", container_parent_type = false}, {type = "townships", prep = "di"}, {type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}}, {type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}}, {type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}}, }, ...}, } Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the `container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be [[:Category:Regional municipalities of Canada]]. 
Likewise, `regions` as administrative divisions (as opposed to mere geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and `village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize `parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties, just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "counties"}` means that any place defined as a parish municipality in Quebec will be categorized under both [[:Category:Parishes of Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will have a container-level parent of [[:Category:Counties of Canada]] (rather than the default of [[:Category:Parishes of Canada]]). Similarly, `township municipalities` will be categorized under both [[:Category:Townships in Quebec, Canada]] (''not'' [[:Category:Townships of Quebec, Canada]]) and [[:Category:Municipalities of Quebec, Canada]]. ====Container spec canonicalization==== A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'', each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The `placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and removes the spec from `.container`. 
It works as follows: # If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place. For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies `default_container = "Brazil"`. # A single string or canonicalized container object is allowed and made into a one-element list. # If a list element is a string that did ''not'' come from `default_container`, and there is a group-level `canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get a canonicalized container object. # Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to `"negara"`. ====Alias keys==== Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec structure from canonical keys. This structure does not, in general, have defaults at the group level and is not initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location spec: * `alias_of`: The canonical key of which this key is an alias. Required. * `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the` but does not pay attention to the value of `the` for the corresponding canonical key. * `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise, the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display canonicalizing. * `placetype`: The placetype of the alias. 
Rarely needs to be specified as it defaults to the canonical key's placetype, and if that is unspecified, to the group-level default placetype. ====Location group metadata tables==== As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only keys, which are mostly functions. The following are the possible group-only keys: * `data`: This points to the group data table for the group, as described above. * `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias) into the full and elliptical placenames. The difference between full and elliptical placenames is described in the documentation for [[Module:place]], but in essence, it applies for keys that include the placetype in them (e.g. `Phuket Province, Thailand` or `County Mayo, Ireland`), in which case the full placename includes the placetype and the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or `Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is `Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as `State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs. 
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key, and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to chop off anything starting with a comma and return the result as both full and elliptical placename, and if specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be defined, it is best to use the helper function `make_key_to_placename`, if possible (or `make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default implementation and such) rather than directly calling the function in the `key_to_placename` field. * `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this (generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or `make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to `key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged as the key. Otherwise, the default algorithm works as follows: *# If the group-level `default_placetype == "city"`, use the placename unchanged as the key. *# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma + space and use the result as the key. 
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and `placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field to the placename after a comma + space and use the result as the key. *# Otherwise, use the placename unchanged as the key. * `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string, to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own. * `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the location. See [[#Location divisions]] for more details. ]==] ----------------------------------------------------------------------------------- -- Helper functions -- ----------------------------------------------------------------------------------- --[==[ Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like this). ]==] function export.process_error(fmt, ...) local args = {...} for i = 1, select("#", ...) do args[i] = dump(args[i]) end return error(string.format(fmt, unpack(args))) end --[==[ Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user error triggered by bad input or a system error due to something like running out of memory or hitting a time limit). 
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the format string as if `fmt:format(...)` were called. ]==] function export.internal_error(fmt, ...) export.process_error("Internal error: " .. fmt, ...) end local internal_error = export.internal_error -- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If -- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item` -- equals `list_or_element`. local function list_or_element_contains(list_or_element, item) if type(list_or_element) == "table" then return m_table.contains(list_or_element, item) and true or false end return list_or_element == item end --[==[ Call the location group's `key_to_placename` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full `"County Durham"` vs. elliptical `"Durham"`). If the group does not define `key_to_placename`, both full and elliptical placenames are computed by chopping off anything starting with a comma. ]==] function export.key_to_placename(group, key) if group.key_to_placename == false then return key, key end if group.key_to_placename then local full_placename, elliptical_placename = group.key_to_placename(key) if type(full_placename) ~= "string" then internal_error("Key %s returned a non-string full placename: %s", key, full_placename) end if type(elliptical_placename) ~= "string" then internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename) end return full_placename, elliptical_placename end key = key:gsub(",.*", "") return key, key end --[==[ Call the location group's `placename_to_key` function if it exists (see the comment at the top of [[Module:place]] for the distinction between keys and placenames) and return the result. 
If `placename_to_key` exists with the value `false`, return the placename unchanged. If the group does not define `placename_to_key`, and it defines a `default_container` whose placetype is either `country` or `constituent country`, the container name is appended to the placename after a comma and a space. Otherwise the placename is returned unchanged. ]==] function export.placename_to_key(group, placename) if group.placename_to_key == false then return placename elseif group.placename_to_key then local key = group.placename_to_key(placename) if type(key) ~= "string" then internal_error("Placename %s returned a non-string key: %s", placename, key) end return key elseif group.default_placetype == "city" then return placename else local defcon = group.default_container if not defcon then return placename elseif type(defcon) == "string" then return placename .. ", " .. defcon elseif type(defcon) == "table" and (defcon.placetype == "negara" or defcon.placetype == "constituent country") then return placename .. ", " .. defcon.key else return placename end end end --[==[ Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and `placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more than one. Containers should be carefully distinguished from category parents. Generally the container is the first category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents, which indicate some sort of relation between the category parent and the location but not necessarily one of containment.) This function is idempotent in that nothing happens if called more than once on the same spec. 
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables. ]==] function export.initialize_spec(group, key, spec) if spec.initialized then return end local container = spec.container local containers local container_from_default if not container then container = group.default_container container_from_default = true end if container then if type(container) == "string" or container.key then container = {container} end containers = {} for _, cont in ipairs(container) do if type(cont) == "string" then if group.canonicalize_key_container and not container_from_default then cont = group.canonicalize_key_container(cont) else cont = {key = cont, placetype = "negara"} end end insert(containers, cont) end end spec.containers = containers spec.container = nil local function value_with_default(val, default_val) if val == nil then return default_val else return val end end local function set_or_default(prop) spec[prop] = value_with_default(spec[prop], group["default_" .. prop]) end set_or_default("placetype") if not spec.placetype then internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec) end set_or_default("divs") spec.addl_divs = group.addl_divs for _, prop in ipairs { "keydesc", "fulldesc", "addl_parents", "overriding_bare_label_parents", "bare_category_parent_type", "wp", "wpcat", "commonscat", "british_spelling", "the", "no_container_cat", "no_container_parent", "no_generic_place_cat", "no_check_holonym_mismatch", "no_auto_augment_container", "no_include_container_in_desc", "is_city", "is_former_place", } do set_or_default(prop) end -- `default_placetype == "city"` is correct; if `default_placetype` has something else like `prefecture-level city` -- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as -- is_city. 
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city") spec.initialized = true end --[=[ Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values: the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object, which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the property in question). `alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"} except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key, and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string. 
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known). ]=] local function find_matching_key_in_group(group, placetypes, key, alias_resolution) if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and alias_resolution ~= "all" then internal_error("Bad value for 'alias_resolution': %s", alias_resolution) end local spec = group.data[key] if not spec then return nil end local function check_correct_placetype(placetype) if type(placetype) == "table" then for _, pt in ipairs(placetype) do if list_or_element_contains(placetypes, pt) then return true end end return false else return list_or_element_contains(placetypes, placetype) end end if spec.alias_of then local resolved_key = spec.alias_of local resolved_spec = group.data[resolved_key] if not resolved_spec then internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key) elseif resolved_spec.alias_of then internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed", key, resolved_key) end if alias_resolution == "none" or alias_resolution == "display" then -- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group. local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype if not placetype then internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " .. 
"`default_placetype`", key, spec, resolved_spec) end if not check_correct_placetype(placetype) then return nil end if alias_resolution == "display" then if spec.display == true then key = resolved_key elseif spec.display then key = spec.display end end return key, spec end key = resolved_key spec = resolved_spec end -- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group. local placetype = spec.placetype or group.default_placetype if not placetype then internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec) end if not check_correct_placetype(placetype) then return nil end export.initialize_spec(group, key, spec) return key, spec end --[=[ Given a location group, placename and possible placetypes that the placename must match, check if the placename exists in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys. `alias_resolution` is as in `find_matching_key_in_group()`. This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or `find_canonical_key` (for known-canonical locations where the placetype isn't known). ]=] local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution) local key = export.placename_to_key(group, placename) return find_matching_key_in_group(group, placetypes, key, alias_resolution) end --[==[ If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec. If no such key exists, return {nil}.
This throws an internal error if two locations with the same key are found. ]==] function export.find_canonical_key(key) local found_locations = {} for _, group in ipairs(export.locations) do local spec = group.data[key] if not spec then -- do nothing elseif spec.alias_of then mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of)) else insert(found_locations, {group, spec}) end end if not found_locations[1] then return nil elseif found_locations[2] then internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations) else local group, spec = unpack(found_locations[1]) export.initialize_spec(group, key, spec) return group, spec end end --[==[ Iterator that returns all locations matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator returns three values at each iteration: the location group, canonical key by which the location is known and the spec object describing the location. `data` contains the following possible fields: * `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string specifying a placetype, which must match one of the location's placetypes. This must be specified. * `placename`: The placename of the location. Either this or `key` must be specified. * `key`: The key of the location. Either this or `placename` must be specified. * `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`. The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if `alias_resolution` is given and the specified key or placename is an alias; see the documentation for `find_matching_key_in_group`). 
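The closure-based iteration described above can be sketched in isolation. This is only an illustrative stand-in, not the module's code: the `locations` list, the group `name` field and the `iterate_matching` helper are hypothetical, and alias handling and placetype lists are left out.

```lua
-- Hypothetical miniature of the group list; the real groups carry far more metadata.
local locations = {
	{name = "countries", data = {["Brunei"] = {placetype = "negara"}}},
	{name = "cities", data = {["Brunei"] = {placetype = "city"}, ["Kuching"] = {placetype = "city"}}},
}

-- Return a stateful iterator that scans the groups in order and yields every
-- group whose entry for `key` has the wanted placetype.
local function iterate_matching(key, placetype)
	local i = 0
	return function()
		while i < #locations do
			i = i + 1
			local group = locations[i]
			local spec = group.data[key]
			if spec and spec.placetype == placetype then
				return group, key, spec
			end
		end
		-- Falling off the end returns nothing, which terminates the generic `for`.
	end
end

local matches = {}
for group, key, spec in iterate_matching("Brunei", "city") do
	matches[#matches + 1] = group.name .. ":" .. key
end
```

Because the iterator keeps its position in an upvalue, a caller can stop after the first match without scanning the remaining groups.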
]==] function export.iterate_matching_location(data) local i = 0 local n = #export.locations return function() while true do i = i + 1 if i > n then break end local group = export.locations[i] local key, spec if data.placename then key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename, data.alias_resolution) else if not data.key then internal_error("'.placename' or '.key' must be defined: %s", data) end key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution) end if key then return group, key, spec end end end end --[==[ Return the location matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if there is not exactly one location found; as such, it is for use with internally specified locations (such as the containers of known locations) rather than externally specified locations, which may not match a known location and in some cases may match multiple known locations. For finding an externally specified location, consider using `find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g. {{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
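The "exactly one result" contract can be illustrated on its own. The following is a minimal sketch under stated assumptions: `exactly_one` and `values` are hypothetical helpers invented for this example, and plain `error` stands in for the module's `internal_error`.

```lua
-- Hypothetical helper: drain an iterator and demand exactly one result.
local function exactly_one(iter)
	local found = {}
	for value in iter do
		found[#found + 1] = value
	end
	if #found == 0 then
		error("no match")
	elseif #found > 1 then
		error("multiple matches")
	end
	return found[1]
end

-- Throwaway iterator over the elements of a list (illustrative only).
local function values(list)
	local i = 0
	return function()
		i = i + 1
		return list[i]
	end
end

local single = exactly_one(values({"Newark"}))
-- `pcall` captures the error raised for the zero-match and multi-match cases.
local ok_none = pcall(exactly_one, values({}))
local ok_many = pcall(exactly_one, values({"Newark", "Newark"}))
```

Collecting every result before deciding is what lets the multi-match error report all of the conflicting entries rather than only the second one.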
]==] function export.get_matching_location(data) local all_found = {} for group, key, spec in export.iterate_matching_location(data) do insert(all_found, {group, key, spec}) end if not all_found[1] then internal_error("Couldn't find matching location for data %s", data) elseif all_found[2] then internal_error("Found multiple matching locations for data %s: %s", data, all_found) else return unpack(all_found[1]) end end --[==[ Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An internal error is thrown if more than 10 levels are traversed, which is taken as a sign of a container loop. Each iteration returns a list of the location objects newly reached at that level, each of which contains `group`, `key` and `spec` fields.
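The level-by-level traversal with de-duplication can be sketched in isolation. This is a simplified illustration, not the module's implementation: the `containers_of` table and the `iterate_levels` helper are hypothetical, bare key strings stand in for the `group`/`key`/`spec` objects, and the loop guard is omitted.

```lua
-- Hypothetical container graph: each key maps to the list of its immediate containers.
local containers_of = {
	["Russia"] = {"Europe", "Asia"},
	["Europe"] = {"Eurasia"},
	["Asia"] = {"Eurasia"},
	["Eurasia"] = {"Earth"},
	["Earth"] = {},
}

-- Return an iterator that yields one list of not-yet-seen containers per level.
local function iterate_levels(start_key)
	local seen = {[start_key] = true}
	local current = {start_key}
	return function()
		local next_level = {}
		for _, key in ipairs(current) do
			for _, container in ipairs(containers_of[key] or {}) do
				if not seen[container] then
					seen[container] = true
					next_level[#next_level + 1] = container
				end
			end
		end
		if not next_level[1] then
			return nil
		end
		current = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_levels("Russia") do
	levels[#levels + 1] = table.concat(level, ",")
end
```

In this sketch `Eurasia` is reached through both `Europe` and `Asia` but appears only once, because the `seen` set is shared across all levels.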
]==] function export.iterate_containers(group, key, spec) local keys_seen = {} keys_seen[key] = true local iterations = 0 local last_iteration_containers = {{group = group, key = key, spec = spec}} return function() iterations = iterations + 1 if iterations > 10 then internal_error("Probable loop in containers when processing key %s", key) end local next_iteration_containers = {} for _, location in ipairs(last_iteration_containers) do local containers = location.spec.containers if containers then for _, container in ipairs(containers) do local container_group, container_key, container_spec = export.get_matching_location { placetypes = container.placetype, key = container.key, } if not keys_seen[container_key] then insert(next_iteration_containers, { group = container_group, key = container_key, spec = container_spec }) keys_seen[container_key] = true end end end end if not next_iteration_containers[1] then return nil end last_iteration_containers = next_iteration_containers return next_iteration_containers end end --[==[ Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add `"the "` to the beginning if called for in `spec`. ]==] function export.construct_linked_placename(spec, placename, display_form) local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename, display_form) or ("[[%s]]"):format(placename) if spec.the then linked_placename = "the " .. linked_placename end return linked_placename end --[=[ This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the documentation above for `key_to_placename` under "Location group tables" for the difference between full and elliptical placenames.) 
`container_patterns` is a Lua pattern or a list of possible patterns matching the container at the end of the key, which will be used to remove that container. If multiple patterns are specified, each one is tried until one matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes the full placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain countries (such as South Korean and North Korean counties, which include the word "County" in the key). The resulting chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped and the full and elliptical placenames are the same. Typical usage is as follows: ``` key_to_placename = make_key_to_placename(", England$"), ``` or (when the political division is part of the key) ``` key_to_placename = make_key_to_placename(", South Korea$", " County$") ``` ]=] local function make_key_to_placename(container_patterns, divtype_patterns) if type(container_patterns) == "string" then container_patterns = {container_patterns} end if type(divtype_patterns) == "string" then divtype_patterns = {divtype_patterns} end return function(key) local full_placename = key if container_patterns then for _, container_pattern in ipairs(container_patterns) do local nsubs full_placename, nsubs = full_placename:gsub(container_pattern, "") if nsubs > 0 then break end end end local elliptical_placename = full_placename if divtype_patterns then for _, divtype_pattern in ipairs(divtype_patterns) do local nsubs elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "") if nsubs > 0 then break end end end return full_placename, elliptical_placename end end --[=[ This is typically used to define `placename_to_key`.
It generates a function that appends a string to the end of a given placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have special meaning in Lua patterns) to be appended first to the placename; if already present at the end, it is not appended. `container_suffix` is then added in the same fashion if given. Typical usage is like this: ``` placename_to_key = make_placename_to_key(", England") ``` (which will convert e.g. `"Hampshire"` into `"Hampshire, England"`) or ``` placename_to_key = make_placename_to_key(", South Korea", " County") ``` (which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`). ]=] local function make_placename_to_key(container_suffix, divtype_suffix) return function(placename) local key = placename if divtype_suffix then if not key:find(divtype_suffix .. "$") then key = key .. divtype_suffix end end if container_suffix then key = key .. container_suffix end return key end end --[=[ This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location data into the canonical form containing both the full container key and its placetype. It generates a function to do the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil} or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left as-is. Typical usage is like this: ``` canonicalize_key_container = make_canonicalize_key_container(", Canada", "province") ``` which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "province"}`. ]=] local function make_canonicalize_key_container(suffix, placetype) return function(container) if type(container) == "string" then return {key = container .. 
(suffix or ""), placetype = placetype} else return container end end end ----------------------------------------------------------------------------------- -- Top-level tables -- ----------------------------------------------------------------------------------- export.continents = { ["Bumi"] = {the = true, placetype = "planet", addl_parents = {"alam semula jadi"}, fulldesc = "=the planet [[Earth]] and the features found on it"}, ["Afrika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}}, ["Amerika"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"}, keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined", wp = "Amerika"}, ["America"] = {alias_of = "Amerika", the = true}, ["Amerika Utara"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}}, ["Caribbean"] = {the = true, placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}}, ["Amerika Tengah"] = {placetype = {"kawasan benua", "region"}, container = {key = "Amerika Utara", placetype = "benua"}}, ["Amerika Selatan"] = {placetype = "benua", container = {key = "America", placetype = "superbenua"}}, ["Antartika"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}, fulldesc = "=the territory of [[Antarctica]]"}, ["Eurasia"] = {placetype = {"superbenua", "benua"}, container = {key = "Bumi", placetype = "planet"}, keydesc = "[[Eurasia]], i.e. 
[[Europe]] and [[Asia]] together"}, ["Asia"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}}, ["Eropah"] = {placetype = "benua", container = {key = "Eurasia", placetype = "superbenua"}}, ["Oceania"] = {placetype = "benua", container = {key = "Bumi", placetype = "planet"}}, ["Melanesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}}, ["Micronesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}}, ["Polynesia"] = {placetype = {"kawasan benua", "region"}, container = {key = "Oceania", placetype = "benua"}}, } export.continents_group = { default_overriding_bare_label_parents = {}, -- container parents should be used default_divs = {{type = "negara", prep = "di"}}, -- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g. -- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...". default_no_include_container_in_desc = true, default_no_container_cat = true, default_no_container_parent = true, default_no_auto_augment_container = true, default_no_generic_place_cat = true, -- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at -- this level. We also run into problems with supercontinents, which have "benua" as the fallback and cause -- mismatches. default_no_check_holonym_mismatch = true, data = export.continents, } -- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = { ["Afghanistan"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Albania"] = {container = "Eropah", divs = {"counties", "municipalities", "communes", {type = "administrative units", cat_as = "communes"}, }, british_spelling = true}, ["Algeria"] = {container = "Afrika", divs = {"provinces", "communes", "districts", "municipalities"}}, ["Andorra"] = {container = "Eropah", divs = {"parishes"}, british_spelling = true}, ["Angola"] = {container = "Afrika", divs = {"provinces", "municipalities"}}, ["Antigua dan Barbuda"] = {container = "Caribbean", divs = {"provinces"}, british_spelling = true}, ["Argentina"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}}, ["Armenia"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts", "municipalities"}, british_spelling = true}, ["Republik Armenia"] = {alias_of = "Armenia", the = true}, -- differs in "the" -- Both a country and continent ["Australia"] = {container = "Oceania", divs = { {type = "negeri", cat_as = "negeri dan wilayah"}, {type = "wilayah", cat_as = "negeri dan wilayah"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"}, "local government areas", "dependent territories", }, british_spelling = true}, ["Austria"] = {container = "Eropah", divs = {"negeri", "districts", "municipalities"}, british_spelling = true}, ["Azerbaijan"] = {container = {"Eropah", "Asia"}, divs = {"districts", "municipalities"}, british_spelling = true}, ["Bahamas"] = {the = true, container = "Caribbean", divs = {"districts"}, british_spelling = true, wp = "The %l"}, ["Bahrain"] = {container = "Asia", divs = {"governorates"}}, ["Bangladesh"] = {container = "Asia", divs = {"divisions", "districts", "municipalities"}, british_spelling = true}, ["Barbados"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, 
["Belarus"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true}, ["Belgium"] = {container = "Eropah", divs = {"regions", "provinces", "municipalities"}, british_spelling = true}, ["Belize"] = {container = "Amerika Tengah", divs = {"districts"}, british_spelling = true}, ["Benin"] = {container = "Afrika", divs = {"departments", "communes"}}, ["Bhutan"] = {container = "Asia", divs = {"districts", "gewogs"}}, ["Bolivia"] = {container = "Amerika Selatan", divs = {"provinces", "departments", "municipalities"}}, ["Bosnia dan Herzegovina"] = {container = "Eropah", divs = {"entities", "cantons", "municipalities"}, british_spelling = true}, ["Bosnia dan Hercegovina"] = {alias_of = "Bosnia dan Herzegovina", display = true}, ["Bosnia"] = {alias_of = "Bosnia dan Herzegovina", display = true}, ["Botswana"] = {container = "Afrika", divs = {"districts", "subdistricts"}, british_spelling = true}, ["Brazil"] = {container = "Amerika Selatan", divs = { "negeri", "municipalities", "macroregions", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Brunei"] = {container = "Asia", divs = {"daerah", "mukim"}, british_spelling = true}, ["Bulgaria"] = {container = "Eropah", divs = {"provinces", "municipalities"}, british_spelling = true}, ["Burkina Faso"] = {container = "Afrika", divs = {"regions", "departments", "provinces"}}, ["Burundi"] = {container = "Afrika", divs = {"provinces", "communes"}}, ["Kemboja"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Cameroon"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Kanada"] = {container = "Amerika Utara", divs = { {type = "provinces", cat_as = "provinces and territories"}, {type = "territories", cat_as = "provinces and territories"}, {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces and territories"}, {type = "ABBREVIATION_OF territories", cat_as = "abbreviations of provinces and territories"}, "counties", "districts",
"municipalities", "regional municipalities", "rural municipalities", "parishes", -- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless -- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is -- still at [[w:Indian reserves]]). "Indian reserves", "census divisions", {type = "townships", prep = "di"}, }, british_spelling = true}, ["Cape Verde"] = {container = "Afrika", divs = {"municipalities", "parishes"}}, ["Central African Republic"] = {the = true, container = "Afrika", divs = {"prefectures", "subprefectures"}}, ["Chad"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Chile"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "communes"}}, ["China"] = {container = "Asia", divs = { {type = "provinces", cat_as = "provinces and autonomous regions"}, {type = "autonomous regions", cat_as = "provinces and autonomous regions"}, {type = "FORMER provinces", cat_as = "former provinces"}, "special administrative regions", "prefectures", {type = "FORMER prefectures", cat_as = "former prefectures"}, "prefecture-level cities", {type = "counties", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, {type = "FORMER counties", cat_as = "former counties and county-level cities"}, {type = "FORMER county-level cities", cat_as = "former counties and county-level cities"}, -- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities. 
"districts", {type = "FORMER districts", cat_as = "former districts"}, "subdistricts", "townships", "municipalities", {type = "direct-administered municipalities", cat_as = "municipalities"}, }}, ["Republik Rakyat China"] = {alias_of = "China", the = true}, -- differs in "the" ["Colombia"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}}, ["Comoros"] = {the = true, container = "Afrika", divs = {"autonomous islands"}}, ["Costa Rica"] = {container = "Amerika Tengah", divs = {"provinces", "cantons"}}, ["Croatia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true}, ["Cuba"] = {container = "Caribbean", divs = {"provinces", "municipalities"}}, ["Cyprus"] = {container = {"Eropah", "Asia"}, divs = {"districts"}, british_spelling = true}, ["Czech Republic"] = {the = true, container = "Eropah", divs = {"regions", "districts", "municipalities"}, british_spelling = true}, ["Czechia"] = {alias_of = "Czech Republic"}, -- differs in "the" ["Democratic Republic of the Congo"] = {the = true, container = "Afrika", divs = {"provinces", "territories"}}, ["Congo"] = {alias_of = "Democratic Republic of the Congo", display = true, the = true}, ["Denmark"] = {container = "Eropah", divs = {"regions", "municipalities", "dependent territories"}, british_spelling = true, -- Wikipedia separates [[w:Denmark]] (constituent country) from [[w:Danish Realm]] (country) }, ["Djibouti"] = {container = "Afrika", divs = {"regions", "districts"}}, ["Dominica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Dominican Republic"] = {the = true, container = "Caribbean", divs = {"provinces", "municipalities"}, keydesc = "the [[Dominican Republic]], the country that shares the [[Caribbean]] island of [[Hispaniola]] with [[Haiti]]"}, ["East Timor"] = {container = "Asia", divs = {"municipalities"}, wp = "Timor-Leste"}, ["Timor-Leste"] = {alias_of = "East Timor", display = true}, ["Ecuador"] = {container = "Amerika 
Selatan", divs = {"provinces", "cantons"}}, ["Mesir"] = {container = "Afrika", divs = {"kegabenoran", "kawasan"}, british_spelling = true}, ["El Salvador"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Equatorial Guinea"] = {container = "Afrika", divs = {"provinces"}}, ["Eritrea"] = {container = "Afrika", divs = {"regions", "subregions"}}, ["Estonia"] = {container = "Eropah", divs = {"counties", "municipalities"}, british_spelling = true}, ["Eswatini"] = {container = "Afrika", british_spelling = true}, ["Swaziland"] = {alias_of = "Eswatini", display = true}, ["Ethiopia"] = {container = "Afrika", divs = {"regions", "zones"}}, ["Federated States of Micronesia"] = {the = true, container = "Micronesia", divs = {"negeri"}}, ["Micronesia"] = {alias_of = "Federated States of Micronesia"}, ["Fiji"] = {container = "Melanesia", divs = {"divisions", "provinces"}, british_spelling = true}, ["Finland"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true}, ["France"] = {container = "Eropah", divs = {"regions", "cantons", "collectivities", "communes", {type = "municipalities", cat_as = "communes"}, "departments", {type = "prefectures", cat_as = {"prefectures", "departmental capitals"}}, {type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}}, "dependent territories", "territories", "provinces", }, british_spelling = true}, ["Gabon"] = {container = "Afrika", divs = {"provinces", "departments"}}, ["Gambia"] = {the = true, container = "Afrika", divs = {"divisions", "districts"}, british_spelling = true, wp = "The %l"}, ["Georgia"] = {container = {"Eropah", "Asia"}, divs = {"regions", "districts"}, keydesc = "the country of [[Georgia]], in [[Eurasia]]", british_spelling = true, wp = "%l (country)"}, ["Germany"] = {container = "Eropah", divs = { "negeri", -- Bavaria, Baden-Württemberg, Hesse and North Rhine-Westphalia have administrative regions as divisions, but -- there aren't really enough 
of them to categorize per state. "regions", "municipalities", "districts"}, british_spelling = true}, ["Ghana"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true}, ["Greece"] = {container = "Eropah", divs = {"regions", "regional units", "municipalities", {type = "peripheries", cat_as = {"regions"}}, }, british_spelling = true}, ["Grenada"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Guatemala"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Guinea"] = {container = "Afrika", divs = {"regions", "prefectures"}}, ["Guinea-Bissau"] = {container = "Afrika", divs = {"regions"}}, ["Guyana"] = {container = "Amerika Selatan", divs = {"regions"}, british_spelling = true}, ["Haiti"] = {container = "Caribbean", divs = {"departments", "arrondissements"}}, ["Honduras"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Hungary"] = {container = "Eropah", divs = {"counties", "districts"}, british_spelling = true}, ["Iceland"] = {container = "Eropah", divs = {"regions", "municipalities", "counties"}, british_spelling = true}, ["India"] = {container = "Asia", divs = { {type = "negeri", cat_as = "states and union territories"}, {type = "union territories", cat_as = "states and union territories"}, {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"}, {type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"}, "divisions", "districts", "municipalities", }, british_spelling = true}, ["Indonesia"] = {container = "Asia", divs = {"regencies", "provinces", {type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"}, }}, ["Iran"] = {container = "Asia", divs = {"provinces", "counties"}}, ["Iraq"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Ireland"] = {container = "Eropah", addl_parents = {"British Isles"}, divs = {"counties", "districts", "provinces"}, 
british_spelling = true, wp = "Republic of %l"}, ["Republic of Ireland"] = {alias_of = "Ireland", the = true}, -- differs in "the" ["Israel"] = {container = "Asia", divs = {"districts"}}, ["Italy"] = {container = "Eropah", divs = { "regions", "provinces", "metropolitan cities", "municipalities", {type = "autonomous regions", cat_as = "regions"}, }, british_spelling = true}, ["Ivory Coast"] = {container = "Afrika", divs = {"districts", "regions"}}, -- We should really be using Ivory Coast (common name) but there are political ramifications to the use of -- Côte d'Ivoire so don't make it a display alias. ["Côte d'Ivoire"] = {alias_of = "Ivory Coast"}, ["Jamaica"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true}, ["Jepun"] = {container = "Asia", divs = {"prefectures", "subprefectures", "municipalities"}}, ["Jordan"] = {container = "Asia", divs = {"governorates"}}, ["Kazakhstan"] = {container = {"Asia", "Eropah"}, divs = {"regions", "districts"}}, ["Kenya"] = {container = "Afrika", divs = {"counties"}, british_spelling = true}, ["Kiribati"] = {container = "Micronesia", british_spelling = true}, ["Kosovo"] = {container = "Eropah", divs = {"districts", "municipalities"}, british_spelling = true}, ["Kuwait"] = {container = "Asia", divs = {"governorates", "areas"}}, ["Kyrgyzstan"] = {container = "Asia", divs = {"regions", "districts"}}, ["Laos"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Latvia"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true}, ["Lubnan"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Lesotho"] = {container = "Afrika", divs = {"districts"}, british_spelling = true}, ["Liberia"] = {container = "Afrika", divs = {"counties", "districts"}}, ["Libya"] = {container = "Afrika", divs = {"districts", "municipalities"}}, ["Liechtenstein"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true}, ["Lithuania"] = {container = "Eropah", divs = {"counties", 
"municipalities"}, british_spelling = true}, ["Luxembourg"] = {container = "Eropah", divs = {"cantons", "districts"}, british_spelling = true}, ["Madagascar"] = {container = "Afrika", divs = {"regions", "districts"}}, ["Malawi"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true}, ["Malaysia"] = {container = "Asia", divs = {"negeri", "wilayah persekutuan", "daerah"}, british_spelling = true}, ["Maldives"] = {the = true, container = "Asia", divs = {"provinces", "administrative atolls"}, british_spelling = true}, ["Mali"] = {container = "Afrika", divs = {"regions", "cercles"}}, ["Malta"] = {container = "Eropah", divs = {"regions", "local councils"}, british_spelling = true}, ["Kepulauan Marshall"] = {the = true, container = "Micronesia", divs = {"municipalities"}}, ["Mauritania"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Mauritius"] = {container = "Afrika", divs = {"districts"}, british_spelling = true}, ["Mexico"] = {container = "Amerika Utara", addl_parents = {"Amerika Tengah"}, divs = { "negeri", "municipalities", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, }}, ["Moldova"] = {container = "Eropah", divs = { {type = "districts", cat_as = "districts and autonomous territorial units"}, {type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"}, "communes", "municipalities", }, british_spelling = true}, ["Monaco"] = {placetype = {"city-state", "negara"}, container = "Eropah", -- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we -- want its parent to be "countries in Europe". 
bare_category_parent_type = {type = "negara", prep = "di"}, is_city = true, british_spelling = true}, ["Mongolia"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Montenegro"] = {container = "Eropah", divs = {"municipalities"}}, ["Morocco"] = {container = "Afrika", divs = {"regions", "prefectures", "provinces"}}, ["Mozambique"] = {container = "Afrika", divs = {"provinces", "districts"}}, ["Myanmar"] = {container = "Asia", divs = {"regions", "negeri", "union territories", {type = "self-administered zones", cat_as = "self-administered areas"}, {type = "self-administered divisions", cat_as = "self-administered areas"}, "districts"}}, ["Burma"] = {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations ["Namibia"] = {container = "Afrika", divs = {"regions", "constituencies"}, british_spelling = true}, ["Nauru"] = {container = "Micronesia", divs = {"districts"}, british_spelling = true}, ["Nepal"] = {container = "Asia", divs = {"provinces", "districts"}}, ["Netherlands"] = {the = true, placetype = {"negara", "constituent country"}, container = "Eropah", divs = {"provinces", "municipalities", {type = "FORMER municipalities", cat_as = "former municipalities"}, "dependent territories", "constituent countries"}, british_spelling = true, -- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]] -- (country) }, ["New Zealand"] = {container = "Polynesia", divs = { "regions", "dependent territories", "territorial authorities", {type = "districts", cat_as = "territorial authorities"}, }, british_spelling = true}, ["Nicaragua"] = {container = "Amerika Tengah", divs = {"departments", "municipalities"}}, ["Niger"] = {container = "Afrika", divs = {"regions", "departments"}}, ["Nigeria"] = {container = "Afrika", divs = { "negeri", -- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize -- everything under 'states and territories' but that seems a bit 
pointless. {type = "wilayah persekutuan", cat_as = "negeri"}, "local government areas", }, british_spelling = true}, ["North Korea"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties"}}, ["North Macedonia"] = {container = "Eropah", divs = {"regions", "municipalities"}, british_spelling = true}, ["Macedonia"] = {alias_of = "North Macedonia", display = true}, ["Republic of North Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Republic of Macedonia"] = {alias_of = "North Macedonia", the = true}, -- differs in "the" ["Norway"] = {container = "Eropah", divs = {"counties", "municipalities", "dependent territories", "districts", "unincorporated areas"}, british_spelling = true}, ["Oman"] = {container = "Asia", divs = {"governorates", "provinces"}}, ["Pakistan"] = {container = "Asia", divs = { {type = "provinces", cat_as = "provinces and territories"}, {type = "administrative territories", cat_as = "provinces and territories"}, {type = "wilayah persekutuan", cat_as = "provinces and territories"}, {type = "territories", cat_as = "provinces and territories"}, "divisions", "districts", }, british_spelling = true}, ["Palau"] = {container = "Micronesia", divs = {"negeri"}}, ["Palestine"] = {container = "Asia", divs = {"governorates"}}, ["State of Palestine"] = {alias_of = "Palestine", the = true}, -- differs in "the" ["Panama"] = {container = "Amerika Tengah", divs = {"provinces", "districts"}}, ["Papua New Guinea"] = {container = "Melanesia", divs = {"provinces", "districts"}, british_spelling = true}, ["Paraguay"] = {container = "Amerika Selatan", divs = {"departments", "districts"}}, ["Peru"] = {container = "Amerika Selatan", divs = {"regions", "provinces", "districts"}}, ["Filipina"] = {the = true, container = "Asia", divs = {"kawasan", "wilayah", "daerah", "perbandaran", "barangay"}}, ["Poland"] = {divs = {"voivodeships", "counties", {type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}}, }, 
container = "Eropah", british_spelling = true},
	["Portugal"] = {container = "Eropah", divs = {
		{type = "autonomous regions", cat_as = "districts and autonomous regions"},
		{type = "districts", cat_as = "districts and autonomous regions"},
		"provinces", "municipalities"}, british_spelling = true},
	["Qatar"] = {container = "Asia", divs = {"municipalities", "zones"}},
	["Republic of the Congo"] = {the = true, container = "Afrika", divs = {"departments", "districts"}},
	["Congo Republic"] = {alias_of = "Republic of the Congo", display = true, the = true},
	["Romania"] = {container = "Eropah", divs = {
		"regions", "counties", "communes",
		{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
	}, british_spelling = true},
	["Rusia"] = {container = {"Eropah", "Asia"}, divs = {
		"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais",
		"federal cities", "districts", "federal districts"}, british_spelling = true},
	["Rwanda"] = {container = "Afrika", divs = {"provinces", "districts"}},
	["Saint Kitts and Nevis"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
	["Saint Lucia"] = {container = "Caribbean", divs = {"districts"}, british_spelling = true},
	["Saint Vincent and the Grenadines"] = {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
	["Samoa"] = {container = "Polynesia", divs = {"districts"}, british_spelling = true},
	["San Marino"] = {container = "Eropah", divs = {"municipalities"}, british_spelling = true},
	["São Tomé and Príncipe"] = {container = "Afrika", divs = {"districts"}},
	["Arab Saudi"] = {container = "Asia", divs = {"wilayah", "kegabenoran"}},
	["Senegal"] = {container = "Afrika", divs = {"regions", "departments"}},
	["Serbia"] = {container = "Eropah", divs = {"districts", "municipalities", "autonomous provinces"}},
	["Seychelles"] = {container = "Afrika", divs = {"districts"}, british_spelling = true},
	["Sierra Leone"] = {container = "Afrika", divs = {"provinces",
"districts"}, british_spelling = true}, ["Singapura"] = {container = "Asia", divs = {"daerah", "kawasan"}, british_spelling = true}, ["Slovakia"] = {container = "Eropah", divs = {"regions", "districts"}, british_spelling = true}, ["Slovenia"] = {container = "Eropah", divs = {"statistical regions", "municipalities"}, british_spelling = true}, -- Note: the official name does not include "the" at the beginning, but it sounds strange in -- English to leave it out and it's commonly included, so we include it. ["Solomon Islands"] = {the = true, container = "Melanesia", divs = {"provinces"}, british_spelling = true}, ["Somalia"] = {container = "Afrika", divs = {"regions", "districts"}}, ["South Africa"] = {container = "Afrika", divs = { "provinces", "districts", {type = "district municipalities", cat_as = "districts"}, {type = "metropolitan municipalities", cat_as = "districts"}, "municipalities", }, british_spelling = true}, ["Korea Selatan"] = {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties", "districts"}}, ["South Sudan"] = {container = "Afrika", divs = {"regions", "negeri", "counties"}, british_spelling = true}, ["Sepanyol"] = {container = "Eropah", divs = {"autonomous communities", "provinces", "municipalities", "comarcas", "autonomous cities"}, british_spelling = true}, ["Sri Lanka"] = {container = "Asia", divs = {"provinces", "districts"}, british_spelling = true}, ["Sudan"] = {container = "Afrika", divs = {"negeri", "districts"}, british_spelling = true}, ["Suriname"] = {container = "Amerika Selatan", divs = {"districts"}}, ["Sweden"] = {container = "Eropah", divs = {"provinces", "counties", "municipalities"}, british_spelling = true}, ["Switzerland"] = {container = "Eropah", divs = {"cantons", "municipalities", "districts"}, british_spelling = true}, ["Syria"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Taiwan"] = {container = "Asia", divs = {"counties", "districts", "townships", "special municipalities"}}, 
	["Republik China"] = {alias_of = "Taiwan", the = true}, -- differs in "the", different political connotations
	["Tajikistan"] = {container = "Asia", divs = {"regions", "districts"}},
	["Tanzania"] = {container = "Afrika", divs = {"regions", "districts"}, british_spelling = true},
	["Thailand"] = {container = "Asia", divs = {"wilayah", "daerah", "subdaerah"}},
	["Togo"] = {container = "Afrika", divs = {"provinces", "prefectures"}},
	["Tonga"] = {container = "Polynesia", divs = {"divisions"}, british_spelling = true},
	["Trinidad dan Tobago"] = {container = "Caribbean", divs = {"regions", "municipalities"}, british_spelling = true},
	["Tunisia"] = {container = "Afrika", divs = {"governorates", "delegations"}},
	["Turki"] = {container = {"Eropah", "Asia"}, divs = {"provinces", "districts"}},
	-- Foreign names generally get display-canonicalized.
	["Türkiye"] = {alias_of = "Turki", display = true},
	["Turkmenistan"] = {container = "Asia", divs = {
		-- The 5 regions are often also called provinces
		"regions", {type = "provinces", cat_as = "regions"}, "districts"},
	},
	["Tuvalu"] = {container = "Polynesia", divs = {"atolls"}, british_spelling = true},
	["Uganda"] = {container = "Afrika", divs = {"districts", "counties"}, british_spelling = true},
	["Ukraine"] = {container = "Eropah", divs = {
		{type = "oblasts", cat_as = "oblasts and autonomous republics"},
		{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
		"raions", "hromadas",
	}, british_spelling = true},
	["United Arab Emirates"] = {the = true, container = "Asia", divs = {"emirates"}},
	-- Abbreviations get display-canonicalized.
["UAE"] = {alias_of = "United Arab Emirates", display = true, the = true}, ["U.A.E."] = {alias_of = "United Arab Emirates", display = true, the = true}, ["United Kingdom"] = {the = true, container = "Eropah", addl_parents = {"British Isles"}, divs = {"constituent countries", "counties", "districts", "boroughs", "territories", "dependent territories", "traditional counties"}, keydesc = "the [[United Kingdom]] of Great Britain and Northern Ireland", british_spelling = true}, -- Abbreviations get display-canonicalized. ["UK"] = {alias_of = "United Kingdom", display = true, the = true}, ["U.K."] = {alias_of = "United Kingdom", display = true, the = true}, ["Amerika Syarikat"] = {the = true, container = "Amerika Utara", divs = {"counties", "county seats", "negeri", "territories", "dependent territories", {type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"}, {type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"}, {type = "NICKNAME_FOR states", cat_as = "nicknames for states"}, {type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"}, {type = "boroughs", prep = "di"}, -- exist in Pennsylvania and New Jersey "municipalities", -- these exist politically at least in Colorado and Connecticut {type = "census-designated places", prep = "di"}, {type = "unincorporated communities", prep = "di"}, -- Don't change the following to something more politically correct until/unless the US government makes a -- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at -- [[w:Indian reservations]]). "Indian reservations", }}, -- Abbreviations and long forms (when possible) get display-canonicalized. 
["US"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["U.S."] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["USA"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["U.S.A."] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["United States of America"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["United States"] = {alias_of = "Amerika Syarikat", display = true, the = true}, ["Uruguay"] = {container = "Amerika Selatan", divs = {"departments", "municipalities"}}, ["Uzbekistan"] = {container = "Asia", divs = {"regions", "districts"}}, ["Vanuatu"] = {container = "Melanesia", divs = {"provinces"}, british_spelling = true}, ["Vatican City"] = {placetype = {"city-state", "negara"}, container = "Eropah", -- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state, -- but we want its parent to be "countries in Europe". bare_category_parent_type = {type = "negara", prep = "di"}, addl_parents = {"Rome"}, is_city = true, british_spelling = true}, ["Vatican"] = {alias_of = "Vatican City", the = true}, -- differs in "the" ["Venezuela"] = {container = "Amerika Selatan", divs = {"negeri", "municipalities"}}, ["Vietnam"] = {container = "Asia", divs = {"wilayah", "daerah", "perbandaran"}}, ["Western Sahara"] = {placetype = {"territory", "negara"}, container = "Afrika", bare_category_parent_type = {type = "negara", prep = "di"}, }, -- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara ["Sahrawi Arab Democratic Republic"] = {alias_of = "Western Sahara", the = true}, ["Yemen"] = {container = "Asia", divs = {"governorates", "districts"}}, ["Zambia"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true}, ["Zimbabwe"] = {container = "Afrika", divs = {"provinces", "districts"}, british_spelling = true}, } local function canonicalize_continent_container(key) if type(key) ~= 
"string" then
		return key
	end
	if export.continents[key] then
		return {key = key, placetype = export.continents[key].placetype}
	end
	internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end

export.countries_group = {
	canonicalize_key_container = canonicalize_continent_container,
	default_overriding_bare_label_parents = {"+++", "negara"},
	default_placetype = "negara",
	default_no_container_cat = true,
	default_no_container_parent = true,
	-- No need to augment country holonyms with continents; not needed for disambiguation.
	default_no_auto_augment_container = true,
	data = export.countries,
}

-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
	-- British Overseas Territory
	["Akrotiri and Dhekelia"] = {
		placetype = {"overseas territory", "territory"},
		container = "United Kingdom",
		addl_parents = {"Cyprus", "Eropah", "Asia"},
		british_spelling = true,
	},
	-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
	-- [[w:List of sovereign states and dependent territories by continent]].
-- unincorporated territory of the United States ["American Samoa"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Polynesia"}, }, -- British Overseas Territory ["Anguilla"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["Abkhazia"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Georgia", "Eropah", "Asia"}, divs = {"districts"}, keydesc = "the de-facto independent state of [[Abkhazia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- Australian external territory ["Ashmore and Cartier Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, }, -- constituent country of the Netherlands ["Aruba"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- British Overseas Territory ["Bermuda"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Amerika Utara"}, british_spelling = true, }, -- special municipality of the Netherlands ["Bonaire"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- British Overseas Territory ["British Indian Ocean Territory"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Asia"}, british_spelling = true, }, -- British Overseas Territory ["British Virgin Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- 
Norwegian dependent territory ["Bouvet Island"] = { placetype = {"dependent territory", "territory"}, container = "Norway", addl_parents = {"Afrika"}, british_spelling = true, }, -- British Overseas Territory ["Cayman Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- Australian external territory ["Christmas Island"] = { placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, british_spelling = true, }, -- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the -- French Southern and Antarctic Lands. ["Clipperton Island"] = { placetype = {"overseas territory", "territory"}, container = "France", addl_parents = {"Amerika Utara"}, }, -- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands ["Cocos Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Asia"}, wp = "Cocos (Keeling) Islands", british_spelling = true, }, ["Cocos (Keeling) Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, ["Keeling Islands"] = {alias_of = "Cocos Islands", display = true, the = true}, -- self-governing but in free association with New Zealand ["Cook Islands"] = { the = true, placetype = {"negara"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- constituent country of the Netherlands ["Curaçao"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- special territory of Chile ["Easter Island"] = { placetype = {"special territory", "territory"}, container = "Chile", addl_parents = {"Polynesia"}, }, -- British Overseas Territory ["Falkland Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United 
Kingdom", addl_parents = {"Amerika Selatan"}, british_spelling = true, }, -- autonomous territory of Denmark ["Faroe Islands"] = { the = true, placetype = {"autonomous territory", "territory"}, container = "Denmark", addl_parents = {"Eropah"}, british_spelling = true, }, -- overseas department and region of France ["French Guiana"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Amerika Selatan"}, british_spelling = true, }, -- overseas collectivity of France ["French Polynesia"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Polynesia"}, british_spelling = true, }, -- French overseas territory ["French Southern and Antarctic Lands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "France", addl_parents = {"Afrika"}, }, -- British Overseas Territory ["Gibraltar"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Eropah"}, is_city = true, british_spelling = true, }, -- autonomous territory of Denmark ["Greenland"] = { placetype = {"autonomous territory", "territory"}, container = "Denmark", addl_parents = {"Amerika Utara"}, divs = {"municipalities"}, british_spelling = true, }, -- overseas department and region of France ["Guadeloupe"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", addl_parents = {"Caribbean"}, divs = {"communes"}, british_spelling = true, }, -- unincorporated territory of the United States ["Guam"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- self-governing British Crown dependency; technically called the Bailiwick of Guernsey ["Guernsey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"}, container = 
"United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, wp = "Bailiwick of %l", }, ["Bailiwick of Guernsey"] = {alias_of = "Guernsey", the = true}, -- Australian external territory ["Heard Island and McDonald Islands"] = { the = true, placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Afrika"}, }, -- special administrative region of China ["Hong Kong"] = { placetype = {"special administrative region", "city"}, container = "China", is_city = true, british_spelling = true, }, -- self-governing British Crown dependency ["Isle of Man"] = { the = true, placetype = {"crown dependency", "dependency", "dependent territory", "territory"}, container = "United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, }, -- Norwegian unincorporated area ["Jan Mayen"] = { placetype = {"unincorporated area", "dependent territory", "territory", "island"}, container = "Norway", addl_parents = {"Eropah"}, british_spelling = true, }, -- self-governing British Crown dependency; technically called the Bailiwick of Jersey ["Jersey"] = { placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"}, container = "United Kingdom", addl_parents = {"British Isles", "Eropah"}, british_spelling = true, }, ["Bailiwick of Jersey"] = {alias_of = "Jersey", the = true}, -- special administrative region of China ["Macau"] = { placetype = {"special administrative region", "city"}, container = "China", is_city = true, british_spelling = true, }, -- overseas department and region of France ["Martinique"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Caribbean"}, british_spelling = true, }, -- overseas department and region of France ["Mayotte"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, 
addl_parents = {"Afrika"}, british_spelling = true, }, -- British Overseas Territory ["Montserrat"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- special collectivity of France ["New Caledonia"] = { placetype = {"special collectivity", "collectivity"}, container = "France", addl_parents = {"Melanesia"}, british_spelling = true, }, -- dependent territory of New Zealand ["New Zealand Subantarctic Islands"] = { the = true, placetype = {"dependent territory", "territory"}, container = "New Zealand", addl_parents = {"Antartika"}, british_spelling = true, }, -- self-governing but in free association with New Zealand ["Niue"] = { placetype = {"negara"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- Australian external territory ["Norfolk Island"] = { placetype = {"external territory", "territory"}, container = "Australia", addl_parents = {"Polynesia"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Cyprus ["Northern Cyprus"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Cyprus", "Turkey", "Eropah", "Asia"}, divs = {"districts"}, keydesc = "the de-facto independent state of [[Northern Cyprus]], internationally recognized as part of the country of [[Cyprus]]", british_spelling = true, }, -- commonwealth, unincorporated territory of the United States ["Northern Mariana Islands"] = { the = true, placetype = {"commonwealth", "unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- British Overseas Territory ["Pitcairn Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Polynesia"}, british_spelling = true, }, -- commonwealth of the United States ["Puerto Rico"] = { placetype = {"commonwealth", "overseas territory", 
"territory"}, container = "Amerika Syarikat", addl_parents = {"Caribbean"}, divs = {"municipalities"}, }, -- overseas department and region of France ["Réunion"] = { placetype = {"overseas department", "department", "administrative region", "region"}, container = "France", divs = {"communes"}, addl_parents = {"Afrika"}, british_spelling = true, }, -- special municipality of the Netherlands ["Saba"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- overseas collectivity of France ["Saint Barthélemy"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Caribbean"}, british_spelling = true, }, -- British Overseas Territory ["Saint Helena, Ascension and Tristan da Cunha"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", divs = {{type = "constituent parts", container_parent_type = false}}, addl_parents = {"Atlantic Ocean", "Afrika"}, british_spelling = true, }, -- constituent parts of the combined overseas territory ["Ascension Island"] = { placetype = {"constituent part", "territory", "island"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Saint Helena"] = { placetype = {"constituent part", "territory", "island"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, ["Tristan da Cunha"] = { placetype = {"constituent part", "territory", "archipelago"}, container = {key = "Saint Helena, Ascension and Tristan da Cunha", 
placetype = "overseas territory"}, addl_parents = {"Atlantic Ocean"}, overriding_bare_label_parents = {}, no_container_cat = false, no_container_parent = false, no_auto_augment_container = false, }, -- overseas collectivity of France ["Saint Martin"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Caribbean"}, british_spelling = true, }, -- overseas collectivity of France ["Saint Pierre and Miquelon"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", divs = {"communes"}, addl_parents = {"Amerika Utara"}, british_spelling = true, }, -- special municipality of the Netherlands ["Sint Eustatius"] = { placetype = {"special municipality", "municipality", "overseas territory", "territory"}, container = "Netherlands", addl_parents = {"Caribbean"}, is_city = true, british_spelling = true, }, -- constituent country of the Netherlands ["Sint Maarten"] = { placetype = {"constituent country", "negara"}, container = "Netherlands", addl_parents = {"Caribbean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Somalia ["Somaliland"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Somalia", "Afrika"}, keydesc = "the de-facto independent state of [[Somaliland]], internationally recognized as part of the country of [[Somalia]]", british_spelling = true, }, -- British Overseas Territory -- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for -- "Saint Helena, Ascension and Tristan da Cunha". 
["South Georgia"] = { placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Atlantic Ocean"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Georgia ["South Ossetia"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Georgia", "Eropah", "Asia"}, keydesc = "the de-facto independent state of [[South Ossetia]], internationally recognized as part of the country of [[Georgia]]", british_spelling = true, }, -- British Overseas Territory ["South Sandwich Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Atlantic Ocean"}, wp = true, wpcat = "South Georgia and the South Sandwich Islands", british_spelling = true, }, -- Norwegian unincorporated area ["Svalbard"] = { placetype = {"unincorporated area", "dependent territory", "territory", "archipelago"}, container = "Norway", addl_parents = {"Eropah"}, british_spelling = true, }, -- dependent territory of New Zealand ["Tokelau"] = { placetype = {"dependent territory", "territory"}, container = "New Zealand", addl_parents = {"Polynesia"}, british_spelling = true, }, -- de-facto independent state, internationally recognized as part of Moldova ["Transnistria"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Moldova", "Eropah"}, keydesc = "the de-facto independent state of [[Transnistria]], internationally recognized as part of [[Moldova]]", british_spelling = true, }, -- British Overseas Territory ["Turks and Caicos Islands"] = { the = true, placetype = {"overseas territory", "territory"}, container = "United Kingdom", addl_parents = {"Caribbean"}, british_spelling = true, }, -- unincorporated territory of the United States ["United States Minor Outlying Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Islands", "Micronesia", 
"Polynesia", "Caribbean"}, }, -- FIXME: We should add entries for the other minor outlying islands. -- Baker Island (Oceania) -- Howland Island (Oceania) -- Jarvis Island (Oceania) -- Johnston Atoll (Oceania) -- Kingman Reef (Oceania) -- Midway Atoll (Oceania) -- Navassa Island (Caribbean) -- Palmyra Atoll (Oceania) -- Wake Island (Oceania) ["Wake Island"] = { placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Micronesia"}, }, -- unincorporated territory of the United States ["United States Virgin Islands"] = { the = true, placetype = {"unincorporated territory", "overseas territory", "territory"}, container = "Amerika Syarikat", addl_parents = {"Caribbean"}, }, ["U.S. Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, ["US Virgin Islands"] = {alias_of = "United States Virgin Islands", display = true, the = true}, -- overseas collectivity of France ["Wallis and Futuna"] = { placetype = {"overseas collectivity", "collectivity"}, container = "France", addl_parents = {"Polynesia"}, british_spelling = true, }, } export.country_like_entities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Saint Helena, Ascension and Tristan da Cunha". key_to_placename = false, placename_to_key = false, canonicalize_key_container = make_canonicalize_key_container(nil, "negara"), default_overriding_bare_label_parents = {"country-like entities"}, default_no_container_cat = true, default_no_container_parent = true, -- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas -- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village -- in Europe. default_no_auto_augment_container = true, data = export.country_like_entities, } -- Former countries and such; we don't create "Cities in ..." 
categories because they don't exist anymore export.former_countries = { -- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan -- (also known as Nagorno-Karabakh) -- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out. ["Artsakh"] = { placetype = {"unrecognized country", "negara"}, addl_parents = {"Azerbaijan", "Eropah", "Asia"}, keydesc = "the former de-facto independent state of [[Artsakh]], internationally recognized as part of [[Azerbaijan]]", british_spelling = true, }, ["Nagorno-Karabakh"] = {alias_of = "Artsakh"}, ["Czechoslovakia"] = {container = "Eropah", british_spelling = true}, ["East Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true}, ["North Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}}, ["Persia"] = {placetype = {"empire", "negara"}, container = "Asia", divs = {"provinces"}}, ["Byzantine Empire"] = { the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Ancient Europe", "Ancient Near East"}, divs = { "provinces", "themes", }}, ["Roman Empire"] = { the = true, placetype = {"empire", "negara"}, container = {"Eropah", "Afrika", "Asia"}, addl_parents = {"Rome"}, divs = { "provinces", {type = "FORMER provinces", cat_as = "provinces"}, }}, ["South Vietnam"] = {container = "Asia", addl_parents = {"Vietnam"}}, ["Soviet Union"] = { the = true, container = {"Eropah", "Asia"}, divs = {"republics", "autonomous republics"}, british_spelling = true}, ["West Germany"] = {container = "Eropah", addl_parents = {"Germany"}, british_spelling = true}, ["Yugoslavia"] = {container = "Eropah", divs = {"districts"}, keydesc = "the former [[Kingdom of Yugoslavia]] (1918–1943) or the former [[Socialist Federal Republic of Yugoslavia]] (1943–1992)", british_spelling = true}, } export.former_countries_group = { canonicalize_key_container = canonicalize_continent_container, 
default_overriding_bare_label_parents = {"former countries and country-like entities"}, default_is_former_place = true, default_placetype = "negara", default_no_container_cat = true, default_no_container_parent = true, -- No need to augment country holonyms with continents; not needed for disambiguation. default_no_auto_augment_container = true, data = export.former_countries, } ----------------------------------------------------------------------------------- -- Subpolity tables -- ----------------------------------------------------------------------------------- export.australia_states_and_territories = { ["Australian Capital Territory, Australia"] = {the = true, placetype = "territory"}, ["Jervis Bay Territory, Australia"] = {the = true, placetype = "territory"}, ["New South Wales, Australia"] = {}, ["Northern Territory, Australia"] = {the = true, placetype = "territory"}, ["Queensland, Australia"] = {}, ["South Australia, Australia"] = {}, ["Tasmania, Australia"] = {}, ["Victoria, Australia"] = {}, ["Western Australia, Australia"] = {}, } -- states and territories of Australia export.australia_group = { default_container = "Australia", default_placetype = "negeri", default_divs = "local government areas", data = export.australia_states_and_territories, } export.austria_states = { ["Vienna, Austria"] = {}, ["Lower Austria, Austria"] = {}, ["Upper Austria, Austria"] = {}, ["Styria, Austria"] = {}, ["Tyrol, Austria"] = {wp = "Tyrol (state)"}, ["Carinthia, Austria"] = {}, ["Salzburg, Austria"] = {wp = "Salzburg (state)"}, ["Vorarlberg, Austria"] = {}, ["Burgenland, Austria"] = {}, } -- states of Austria export.austria_group = { default_container = "Austria", default_placetype = "negeri", default_divs = "municipalities", data = export.austria_states, } export.bangladesh_divisions = { ["Barisal Division, Bangladesh"] = {}, ["Chittagong Division, Bangladesh"] = {}, ["Dhaka Division, Bangladesh"] = {}, ["Khulna Division, Bangladesh"] = {}, ["Mymensingh Division, 
Bangladesh"] = {}, ["Rajshahi Division, Bangladesh"] = {}, ["Rangpur Division, Bangladesh"] = {}, ["Sylhet Division, Bangladesh"] = {}, } -- divisions of Bangladesh export.bangladesh_group = { key_to_placename = make_key_to_placename(", Bangladesh$", " Division$"), placename_to_key = make_placename_to_key(", Bangladesh", " Division"), default_container = "Bangladesh", default_placetype = "division", default_divs = "districts", data = export.bangladesh_divisions, } export.brazil_states = { ["Acre, Brazil"] = {wp = "%l (state)"}, ["Alagoas, Brazil"] = {}, ["Amapá, Brazil"] = {}, ["Amazonas, Brazil"] = {wp = "%l (Brazilian state)"}, ["Bahia, Brazil"] = {}, ["Ceará, Brazil"] = {}, ["Distrito Federal, Brazil"] = {wp = "Federal District (Brazil)"}, ["Espírito Santo, Brazil"] = {}, ["Goiás, Brazil"] = {}, ["Maranhão, Brazil"] = {}, ["Mato Grosso, Brazil"] = {}, ["Mato Grosso do Sul, Brazil"] = {}, ["Minas Gerais, Brazil"] = {}, ["Pará, Brazil"] = {}, ["Paraíba, Brazil"] = {}, ["Paraná, Brazil"] = {wp = "%l (state)"}, ["Pernambuco, Brazil"] = {}, ["Piauí, Brazil"] = {}, ["Rio de Janeiro, Brazil"] = {wp = "%l (state)"}, ["Rio Grande do Norte, Brazil"] = {}, ["Rio Grande do Sul, Brazil"] = {}, ["Rondônia, Brazil"] = {}, ["Roraima, Brazil"] = {}, ["Santa Catarina, Brazil"] = {wp = "%l (state)"}, ["São Paulo, Brazil"] = {wp = "%l (state)"}, ["Sergipe, Brazil"] = {}, ["Tocantins, Brazil"] = {}, } -- states of Brazil export.brazil_group = { default_container = "Brazil", default_placetype = "negeri", default_divs = "municipalities", data = export.brazil_states, } export.canada_provinces_and_territories = { ["Alberta, Canada"] = {divs = { {type = "municipal districts", container_parent_type = "rural municipalities"}, }}, ["British Columbia, Canada"] = {divs = {type = "regional districts", container_parent_type = false}, "regional municipalities", }, ["Manitoba, Canada"] = {divs = {"rural municipalities"}}, ["New Brunswick, Canada"] = {divs = {"counties", "parishes", {type = "civil 
parishes", cat_as = "parishes"}}}, ["Newfoundland and Labrador, Canada"] = {}, ["Northwest Territories, Canada"] = {the = true, placetype = "territory"}, ["Nova Scotia, Canada"] = {divs = {"counties", "regional municipalities"}}, ["Nunavut, Canada"] = {placetype = "territory"}, ["Ontario, Canada"] = {divs = {"counties", "regional municipalities", {type = "townships", prep = "di"}}}, ["Prince Edward Island, Canada"] = {divs = {"counties", "parishes", "rural municipalities"}}, ["Saskatchewan, Canada"] = {divs = {"rural municipalities"}}, ["Quebec, Canada"] = {divs = { "counties", {type = "regional county municipalities", container_parent_type = "regional municipalities"}, -- administrative regions have an official (but non-governmental) function but there don't appear to be any -- equivalent regions elsewhere in Canada, so disable the [[Category:Regions of Canada]] grouping {type = "regions", container_parent_type = false}, {type = "townships", prep = "di"}, {type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}}, {type = "township municipalities", cat_as = {{type = "townships", prep = "di"}, "municipalities"}}, {type = "village municipalities", cat_as = {{type = "villages", prep = "di"}, "municipalities"}}, }}, ["Yukon, Canada"] = {placetype = "territory"}, ["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true}, } -- provinces and territories of Canada export.canada_group = { default_container = "Canada", default_placetype = "province", data = export.canada_provinces_and_territories, } export.china_provinces_and_autonomous_regions = { -- direct-administered municipalities are not here but below under prefecture-level cities ["Anhui, China"] = {}, ["Fujian, China"] = {}, ["Fuchien, China"] = {alias_of = "Fujian, China", display = true}, ["Gansu, China"] = {}, ["Guangdong, China"] = {}, ["Guangxi, China"] = {placetype = "autonomous region"}, ["Guizhou, China"] = {}, ["Hainan, China"] = {}, 
["Hebei, China"] = {}, ["Heilongjiang, China"] = {}, ["Henan, China"] = {}, ["Hubei, China"] = {}, ["Hunan, China"] = {}, ["Inner Mongolia, China"] = {placetype = "autonomous region"}, ["Jiangsu, China"] = {}, ["Jiangxi, China"] = {}, ["Jilin, China"] = {}, ["Liaoning, China"] = {}, ["Ningxia, China"] = {placetype = "autonomous region"}, ["Qinghai, China"] = {}, ["Shaanxi, China"] = {}, ["Shandong, China"] = {}, ["Shanxi, China"] = {}, ["Sichuan, China"] = {}, ["Tibet, China"] = {placetype = "autonomous region", wp = "Tibet Autonomous Region"}, ["Xinjiang, China"] = {placetype = "autonomous region"}, ["Yunnan, China"] = {}, ["Zhejiang, China"] = {}, } -- provinces and autonomous regions of China export.china_group = { default_container = "China", default_placetype = "province", default_divs = { "prefectures", "prefecture-level cities", "districts", "subdistricts", "townships", {type = "counties", cat_as = "counties and county-level cities"}, {type = "county-level cities", cat_as = "counties and county-level cities"}, }, data = export.china_provinces_and_autonomous_regions, } export.china_prefecture_level_cities = { -- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an -- administrative unit smaller than a province but bigger than a county, which is administratively controlled by -- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior -- to the mid-1980's, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the -- western portion of China) have not yet been converted. Generally a given province is entirely tiled by -- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se. 
	-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
	-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, which again bears
	-- the same name as the county-level city).
	--
	-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
	-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
	-- province level.
	--
	-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
	-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
	-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
	-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
	-- metro area separated by suburban/exurban or rural land.
	--
	-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
	-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
	-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
	-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
	-- [[w:List of cities in China by population#Cities and towns by population]] (a total of 61 cities; if we cut off
	-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
	-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
	-- a lot of obscure cities.
	--
	-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
	-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
	-- was >= 1 million. These are sorted below by the urban agglomeration value, which is generally of the "adm-urb" =
	-- "administrative area (urban population)" type and sometimes groups nearby cities into a single agglomeration
	-- (most notably in the case of the Pearl River Delta, grouped under Guangzhou with an agglomeration population of
	-- 72,700,000 and including a large number of nearby large cities in the agglomeration, although for some reason not
	-- Hong Kong, maybe due to the administrative issues involved). In addition, citypopulation.de includes divisions
	-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
	-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
	-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
	-- subdistricts, towns and townships.
["Guangzhou"] = {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de ["Dongguan"] = {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Foshan"] = {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Huizhou"] = {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Jiangmen"] = {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shenzhen"] = {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Zhongshan"] = {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration ["Shanghai"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de ["Changshu"] = {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: Not to be confused with Cangzhou in Hebei ["Changzhou"] = {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration -- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants ["Suzhou"] = {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by 
citypopulation.de in Shanghai agglomeration ["Wuxi"] = {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration ["Beijing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de ["Chengdu"] = {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de ["Xiamen"] = {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de ["Jinjiang"] = {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Quanzhou"] = {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Putian"] = {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration ["Hangzhou"] = {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de ["Shaoxing"] = {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration ["Xi'an"] = {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de ["Xianyang"] = {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration ["Chongqing"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de ["Wuhan"] = {container = "Hubei"}, -- 12.4 
prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de ["Tianjin"] = {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de ["Changsha"] = {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de -- Changsha County -- 1.024 urban per citypopulation.de ["Zhuzhou"] = {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration ["Zhengzhou"] = {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de ["Nanjing"] = {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de ["Shenyang"] = {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de ["Fushun"] = {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration ["Hefei"] = {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de ["Shantou"] = {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de ["Chaozhou"] = {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Jieyang"] = {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration ["Qingdao"] = {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de ["Ningbo"] = {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb 
including Cixi, Yuyao) per citypopulation.de ["Cixi"] = {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration ["Yuyao"] = {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration -- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria ["Wenzhou"] = {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de -- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core). ["Rui'an"] = {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration ["Kunming"] = {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de -- includes Láiwú city ["Jinan"] = {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de -- includes Xīnjí city ["Shijiazhuang"] = {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de ["Taiyuan"] = {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de ["Harbin"] = {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de ["Nanning"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de ["Dalian"] = {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de 
["Guiyang"] = {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de ["Changchun"] = {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de ["Nanchang"] = {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de ["Ürümqi"] = {container = {key = "Xinjiang, China", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de ["Urumqi"] = {alias_of = "Ürümqi", display = true}, ["Fuzhou"] = {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de ["Linyi"] = {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de ["Zibo"] = {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de ["Luoyang"] = {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de ["Lanzhou"] = {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de ["Nantong"] = {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de ["Weifang"] = {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de ["Jiangyin"] = {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de ["Zhangjiagang"] = {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures ["Xuzhou"] = {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de ["Handan"] = {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de ["Hohhot"] = {container = {key = "Inner Mongolia, China", placetype = 
"autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de ["Haikou"] = {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de ["Tangshan"] = {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de ["Xinxiang"] = {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de ["Yiwu"] = {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de ["Zhuhai"] = {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de ["Taizhou, Zhejiang"] = {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de ["Taizhou"] = {alias_of = "Taizhou, Zhejiang"}, ["Yantai"] = {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de ["Yinchuan"] = {container = {key = "Ningxia, China", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de ["Liuzhou"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de ["Anshan"] = {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de ["Yangzhou"] = {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de ["Jiaxing"] = {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de ["Xining"] = {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de -- includes Dìngzhōu city and Xióngān Xīnqū ["Baoding"] = {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de ["Baotou"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 
adm-urb) per citypopulation.de ["Ganzhou"] = {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de ["Pingdingshan"] = {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de ["Zunyi"] = {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de ["Bengbu"] = {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de ["Datong"] = {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de ["Anyang"] = {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de ["Huai'an"] = {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de ["Zaozhuang"] = {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de ["Zhanjiang"] = {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de ["Huainan"] = {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de ["Jining"] = {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de ["Daqing"] = {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de ["Wuhu"] = {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de ["Guilin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de ["Mianyang"] = {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de ["Xiangyang"] = {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de ["Huzhou"] = {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de ["Puyang"] = {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de ["Shangqiu"] = {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban 
(1.750 adm-urb) per citypopulation.de ["Qinhuangdao"] = {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de ["Xingtai"] = {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de ["Nanyang"] = {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de ["Jiaozuo"] = {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de ["Jilin City"] = {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de ["Jilin"] = {alias_of = "Jilin City"}, ["Jinhua"] = {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de ["Shangrao"] = {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro [sic]; 1.342 urban (1.580 adm-urb) per citypopulation.de ["Heze"] = {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de ["Yulin"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de ["Tai'an"] = {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de ["Weihai"] = {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de -- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash ["Yancheng"] = {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de ["Zhangjiakou"] = {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de ["Maoming"] = {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de ["Nanchong"] = {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de ["Fuyang"] = {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de 
["Xuchang"] = {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de ["Yichang"] = {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de ["Dazhou"] = {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de ["Kaifeng"] = {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de ["Luzhou"] = {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de ["Qingyuan"] = {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de ["Huaibei"] = {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de ["Yibin"] = {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de ["Lu'an"] = {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de ["Dezhou"] = {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de ["Rizhao"] = {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de ["Changzhi"] = {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de ["Hengyang"] = {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de ["Jinzhou"] = {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de ["Liaocheng"] = {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de ["Changde"] = {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de ["Suqian"] = {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de ["Xinyang"] = {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de ["Baoji"] = {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de ["Yueyang"] = {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de ["Zhenjiang"] = {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de -- Wanzhou is a 
"district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper. ["Wanzhou"] = {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"subdistricts", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de ["Ulanhad"] = {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de ["Chifeng"] = {alias_of = "Ulanhad"}, ["Ulankhad"] = {alias_of = "Ulanhad", display = true}, ["Ezhou"] = {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de ["Zhaoqing"] = {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de ["Lianyungang"] = {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de ["Qujing"] = {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de -- Shuyang is a "county" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core). -- The county itself is 37 miles by 34 miles. ["Shuyang"] = {placetype = "county", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de -- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core). 
["Yongkang"] = {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de ["Zhoukou"] = {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de ["Beihai"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de ["Jiujiang"] = {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de ["Shaoyang"] = {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de ["Chuzhou"] = {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de ["Hengshui"] = {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de ["Shiyan"] = {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de ["Huludao"] = {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de ["Dongying"] = {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de ["Guigang"] = {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de -- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core). 
["Liuyang"] = {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de -- NOTE: Not to be confused with Changzhou in Jiangsu ["Cangzhou"] = {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de ["Liupanshui"] = {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de ["Panjin"] = {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de ["Qiqihar"] = {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de ["Linfen"] = {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de -- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core). ["Tengzhou"] = {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de -- 3 extra that got added in earlier incarnations and aren't found in the "major agglomerations of the world" page https://citypopulation.de/en/world/agglomerations/ reference date 2025-01-01 ["Kunshan"] = {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de ["Zhumadian"] = {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de ["Bijie"] = {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de } export.china_prefecture_level_cities_group = { -- don't do any transformations between key and placename; in particular, don't chop off anything from -- "Taizhou, Zhejiang" or "Suzhou, Anhui". 
	key_to_placename = false,
	placename_to_key = false, -- don't add ", China" to make the key
	default_container = "China",
	canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
	-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
	-- don't understand how Chinese administrative divisions work.
	default_placetype = {"prefecture-level city", "city"},
	default_divs = {
		-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
		-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
		"districts", "subdistricts", "townships",
		{type = "counties", cat_as = "counties and county-level cities"},
		{type = "county-level cities", cat_as = "counties and county-level cities"},
	},
	data = export.china_prefecture_level_cities,
}

-- Needed to avoid problems with two cities called Taizhou and Suzhou.
export.china_prefecture_level_cities_2 = {
	-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
	["Taizhou, Jiangsu"] = {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
	["Taizhou"] = {alias_of = "Taizhou, Jiangsu"},
	-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
	["Suzhou, Anhui"] = {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
	-- hopefully this will work because we also have Suzhou as a key by itself for the larger, more-well-known Suzhou in Jiangsu
	["Suzhou"] = {alias_of = "Suzhou, Anhui"},
}

export.china_prefecture_level_cities_group_2 = {
	-- don't do any transformations between key and placename; in particular, don't chop off anything from
	-- "Taizhou, Jiangsu".
	key_to_placename = false,
	placename_to_key = false, -- don't add ", China" to make the key
	default_container = "China",
	canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
	-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
	-- don't understand how Chinese administrative divisions work.
	default_placetype = {"prefecture-level city", "city"},
	default_divs = {
		-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
		-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
		"districts", "subdistricts", "townships",
		{type = "counties", cat_as = "counties and county-level cities"},
		{type = "county-level cities", cat_as = "counties and county-level cities"},
	},
	data = export.china_prefecture_level_cities_2,
}

export.finland_regions = {
	["Lapland, Finland"] = {wp = "%l (%c)"},
	["North Ostrobothnia, Finland"] = {},
	["Northern Ostrobothnia, Finland"] = {alias_of = "North Ostrobothnia, Finland", display = true},
	["Kainuu, Finland"] = {},
	["North Karelia, Finland"] = {},
	["Northern Savonia, Finland"] = {},
	["North Savo, Finland"] = {alias_of = "Northern Savonia, Finland", display = true},
	["Southern Savonia, Finland"] = {},
	["South Savo, Finland"] = {alias_of = "Southern Savonia, Finland", display = true},
	["South Karelia, Finland"] = {},
	["Central Finland, Finland"] = {},
	["South Ostrobothnia, Finland"] = {},
	["Southern Ostrobothnia, Finland"] = {alias_of = "South Ostrobothnia, Finland", display = true},
	["Ostrobothnia, Finland"] = {wp = "%l (region)"},
	["Central Ostrobothnia, Finland"] = {},
	["Pirkanmaa, Finland"] = {},
	["Satakunta, Finland"] = {},
	["Päijänne Tavastia, Finland"] = {},
	["Päijät-Häme, Finland"] = {alias_of = "Päijänne Tavastia, Finland", display = true},
	["Tavastia Proper, Finland"] = {},
	["Kanta-Häme, Finland"] = {alias_of = "Tavastia Proper, Finland", display = true},
	["Kymenlaakso, Finland"] = {},
	["Uusimaa, Finland"] = {},
	["Southwest Finland, Finland"] = {},
	["Åland Islands, Finland"] = {the = true, wp = "Åland"},
	["Åland, Finland"] = {alias_of = "Åland Islands, Finland"}, -- differs in "the"
}

-- regions of Finland
export.finland_group = {
	default_container = "Finland",
	default_placetype = "region",
	default_divs = "municipalities",
	data = export.finland_regions,
}

export.france_administrative_regions = {
	["Auvergne-Rhône-Alpes, France"] = {},
	["Bourgogne-Franche-Comté, France"] = {},
	["Brittany, France"] = {wp = "%l (administrative region)"},
	["Centre-Val de Loire, France"] = {},
	["Corsica, France"] = {},
	-- overseas departments are handled in `export.country_like_entities`
	-- ["French Guiana"] = {},
	["Grand Est, France"] = {},
	-- ["Guadeloupe"] = {},
	["Hauts-de-France, France"] = {},
	["Île-de-France, France"] = {},
	-- ["Martinique"] = {},
	-- ["Mayotte"] = {},
	["Normandy, France"] = {wp = "%l (administrative region)"},
	["Nouvelle-Aquitaine, France"] = {},
	["Occitania, France"] = {wp = "%l (administrative region)"},
	["Occitanie, France"] = {alias_of = "Occitania, France", display = true},
	["Pays de la Loire, France"] = {},
	["Provence-Alpes-Côte d'Azur, France"] = {},
	-- ["Réunion"] = {},
}

-- administrative regions of France
export.france_group = {
	default_container = "France",
	-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
	-- to 'region').
	default_placetype = "region",
	default_divs = {
		"communes",
		{type = "municipalities", cat_as = "communes"},
		"departments",
		{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
		{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
	},
	data = export.france_administrative_regions,
}

export.france_departments = {
	["Ain, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 01
	["Aisne, France"] = {container = "Hauts-de-France"}, -- 02
	["Allier, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 03
	["Alpes-de-Haute-Provence, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 04
	["Hautes-Alpes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 05
	["Alpes-Maritimes, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 06
	["Ardèche, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 07
	["Ardennes, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 08
	["Ariège, France"] = {container = "Occitania", wp = "%l (department)"}, -- 09
	["Aube, France"] = {container = "Grand Est"}, -- 10
	["Aude, France"] = {container = "Occitania"}, -- 11
	["Aveyron, France"] = {container = "Occitania"}, -- 12
	["Bouches-du-Rhône, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 13
	["Calvados, France"] = {container = "Normandy", wp = "%l (department)"}, -- 14
	["Cantal, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 15
	["Charente, France"] = {container = "Nouvelle-Aquitaine"}, -- 16
	["Charente-Maritime, France"] = {container = "Nouvelle-Aquitaine"}, -- 17
	["Cher, France"] = {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
	["Corrèze, France"] = {container = "Nouvelle-Aquitaine"}, -- 19
	["Corse-du-Sud, France"] = {container = "Corsica"}, -- 2A
	["Haute-Corse, France"] = {container = "Corsica"}, -- 2B
	["Côte-d'Or, France"] = {container = "Bourgogne-Franche-Comté"}, -- 21
	["Côte d'Or, France"] = {alias_of = "Côte-d'Or, France", display = true},
	["Côtes-d'Armor, France"] = {container =
"Brittany"}, -- 22 ["Côtes d'Armor, France"] = {alias_of = "Côtes-d'Armor, France", display = true}, ["Creuse, France"] = {container = "Nouvelle-Aquitaine"}, -- 23 ["Dordogne, France"] = {container = "Nouvelle-Aquitaine"}, -- 24 ["Doubs, France"] = {container = "Bourgogne-Franche-Comté"}, -- 25 ["Drôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 26 ["Eure, France"] = {container = "Normandy"}, -- 27 ["Eure-et-Loir, France"] = {container = "Centre-Val de Loire"}, -- 28 ["Finistère, France"] = {container = "Brittany"}, -- 29 ["Gard, France"] = {container = "Occitania"}, -- 30 ["Haute-Garonne, France"] = {container = "Occitania"}, -- 31 ["Gers, France"] = {container = "Occitania"}, -- 32 ["Gironde, France"] = {container = "Nouvelle-Aquitaine"}, -- 33 ["Hérault, France"] = {container = "Occitania"}, -- 34 ["Ille-et-Vilaine, France"] = {container = "Brittany"}, -- 35 ["Indre, France"] = {container = "Centre-Val de Loire"}, -- 36 ["Indre-et-Loire, France"] = {container = "Centre-Val de Loire"}, -- 37 ["Isère, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 38 ["Jura, France"] = {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39 ["Landes, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40 ["Loir-et-Cher, France"] = {container = "Centre-Val de Loire"}, -- 41 ["Loire, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42 ["Haute-Loire, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 43 ["Loire-Atlantique, France"] = {container = "Pays de la Loire"}, -- 44 ["Loiret, France"] = {container = "Centre-Val de Loire"}, -- 45 ["Lot, France"] = {container = "Occitania", wp = "%l (department)"}, -- 46 ["Lot-et-Garonne, France"] = {container = "Nouvelle-Aquitaine"}, -- 47 ["Lozère, France"] = {container = "Occitania"}, -- 48 ["Maine-et-Loire, France"] = {container = "Pays de la Loire"}, -- 49 ["Manche, France"] = {container = "Normandy"}, -- 50 ["Marne, France"] = {container = "Grand Est", wp = "%l 
(department)"}, -- 51 ["Haute-Marne, France"] = {container = "Grand Est"}, -- 52 ["Mayenne, France"] = {container = "Pays de la Loire"}, -- 53 ["Meurthe-et-Moselle, France"] = {container = "Grand Est"}, -- 54 ["Meuse, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 55 ["Morbihan, France"] = {container = "Brittany"}, -- 56 ["Moselle, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 57 ["Nièvre, France"] = {container = "Bourgogne-Franche-Comté"}, -- 58 ["Nord, France"] = {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59 ["Oise, France"] = {container = "Hauts-de-France"}, -- 60 ["Orne, France"] = {container = "Normandy"}, -- 61 ["Pas-de-Calais, France"] = {container = "Hauts-de-France"}, -- 62 ["Puy-de-Dôme, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 63 ["Pyrénées-Atlantiques, France"] = {container = "Nouvelle-Aquitaine"}, -- 64 ["Hautes-Pyrénées, France"] = {container = "Occitania"}, -- 65 ["Pyrénées-Orientales, France"] = {container = "Occitania"}, -- 66 ["Bas-Rhin, France"] = {container = "Grand Est"}, -- 67 ["Haut-Rhin, France"] = {container = "Grand Est"}, -- 68 ["Rhône, France"] = {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D ["Metropolis of Lyon, France"] = {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M ["Lyon Metropolis, France"] = {alias_of = "Metropolis of Lyon, France"}, ["Lyon, France"] = {alias_of = "Metropolis of Lyon, France"}, ["Haute-Saône, France"] = {container = "Bourgogne-Franche-Comté"}, -- 70 ["Saône-et-Loire, France"] = {container = "Bourgogne-Franche-Comté"}, -- 71 ["Sarthe, France"] = {container = "Pays de la Loire"}, -- 72 ["Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 73 ["Haute-Savoie, France"] = {container = "Auvergne-Rhône-Alpes"}, -- 74 ["Paris, France"] = {container = "Île-de-France"}, -- 75 ["Seine-Maritime, France"] = {container = "Normandy"}, -- 76 ["Seine-et-Marne, France"] = {container = "Île-de-France"}, -- 77 ["Yvelines, 
France"] = {container = "Île-de-France"}, -- 78
	["Deux-Sèvres, France"] = {container = "Nouvelle-Aquitaine"}, -- 79
	["Somme, France"] = {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
	["Tarn, France"] = {container = "Occitania", wp = "%l (department)"}, -- 81
	["Tarn-et-Garonne, France"] = {container = "Occitania"}, -- 82
	["Var, France"] = {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
	["Vaucluse, France"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 84
	["Vendée, France"] = {container = "Pays de la Loire"}, -- 85
	["Vienne, France"] = {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
	["Haute-Vienne, France"] = {container = "Nouvelle-Aquitaine"}, -- 87
	["Vosges, France"] = {container = "Grand Est", wp = "%l (department)"}, -- 88
	["Yonne, France"] = {container = "Bourgogne-Franche-Comté"}, -- 89
	["Territoire de Belfort, France"] = {container = "Bourgogne-Franche-Comté"}, -- 90
	["Essonne, France"] = {container = "Île-de-France"}, -- 91
	["Hauts-de-Seine, France"] = {container = "Île-de-France"}, -- 92
	["Seine-Saint-Denis, France"] = {container = "Île-de-France"}, -- 93
	["Val-de-Marne, France"] = {container = "Île-de-France"}, -- 94
	["Val-d'Oise, France"] = {container = "Île-de-France"}, -- 95
	--["Guadeloupe"] = {container = "Guadeloupe"}, -- 971
	--["Martinique"] = {container = "Martinique"}, -- 972
	--["Guyane"] = {container = "French Guiana", wp = "French Guiana"}, -- 973
	--["La Réunion"] = {container = "Réunion", wp = "Réunion"}, -- 974
	--["Mayotte"] = {container = "Mayotte"}, -- 976
}

export.france_departments_group = {
	placename_to_key = make_placename_to_key(", France"),
	canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
	default_placetype = "department",
	default_divs = {
		"communes",
		{type = "municipalities", cat_as = "communes"},
	},
	data = export.france_departments,
}

export.germany_states = {
	["Baden-Württemberg, Germany"] = {},
	["Bavaria, Germany"] = {},
	-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts ([[Kreise]]), so override
	-- the default_divs setting. Better not to include them at all since they're included as cities down below.
	-- ["Berlin"] = {divs = {}},
	["Brandenburg, Germany"] = {},
	-- ["Bremen"] = {divs = {}},
	-- ["Hamburg"] = {divs = {}},
	["Hesse, Germany"] = {},
	["Lower Saxony, Germany"] = {},
	["Mecklenburg-Vorpommern, Germany"] = {},
	["Mecklenburg-Western Pomerania, Germany"] = {alias_of = "Mecklenburg-Vorpommern, Germany", display = true},
	["North Rhine-Westphalia, Germany"] = {},
	["Rhineland-Palatinate, Germany"] = {},
	["Saarland, Germany"] = {},
	["Saxony, Germany"] = {},
	["Saxony-Anhalt, Germany"] = {},
	["Schleswig-Holstein, Germany"] = {},
	["Thuringia, Germany"] = {},
}

-- states of Germany
export.germany_group = {
	default_container = "Germany",
	default_placetype = "negeri",
	default_divs = {"districts", "municipalities"},
	data = export.germany_states,
}

export.greece_regions = {
	["Attica, Greece"] = {wp = "%l (region)"},
	["Central Greece, Greece"] = {wp = "%l (administrative region)"},
	["Central Macedonia, Greece"] = {},
	["Crete, Greece"] = {},
	["Eastern Macedonia and Thrace, Greece"] = {},
	["Epirus, Greece"] = {wp = "%l (region)"},
	["Ionian Islands, Greece"] = {the = true, wp = "%l (region)"},
	["North Aegean, Greece"] = {the = true},
	-- I would expect 'the Peloponnese' but Wikipedia mostly has categories like [[w:Category:Geography of Peloponnese (region)]]
	-- and [[w:Category:Buildings and structures in Peloponnese (region)]]; only [[w:Category:People from the Peloponnese (region)]]
	-- has "the" in it.
["Peloponnese, Greece"] = {wp = "%l (region)"}, ["South Aegean, Greece"] = {the = true}, ["Thessaly, Greece"] = {}, ["Western Greece, Greece"] = {}, ["Western Macedonia, Greece"] = {}, ["Mount Athos, Greece"] = {placetype = {"autonomous region", "region"}, wp = "Monastic community of Mount Athos"}, } -- regions of Greece export.greece_group = { default_container = "Greece", default_placetype = "region", data = export.greece_regions, } local india_polity_with_divisions = {"divisions", "districts"} local india_polity_without_divisions = {"districts"} -- States and union territories of India. Only some of them are divided into divisions. export.india_states_and_union_territories = { ["Andaman and Nicobar Islands, India"] = {the = true, placetype = "union territory", divs = india_polity_without_divisions}, ["Andhra Pradesh, India"] = {divs = india_polity_without_divisions}, ["Arunachal Pradesh, India"] = {divs = india_polity_with_divisions}, ["Assam, India"] = {divs = india_polity_with_divisions}, ["Bihar, India"] = {divs = india_polity_with_divisions}, ["Chandigarh, India"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Chhattisgarh, India"] = {divs = india_polity_with_divisions}, ["Dadra and Nagar Haveli and Daman and Diu, India"] = {placetype = "union territory", divs = india_polity_without_divisions}, ["Delhi, India"] = {placetype = "union territory", divs = india_polity_with_divisions}, ["Goa, India"] = {divs = india_polity_without_divisions}, ["Gujarat, India"] = {divs = india_polity_without_divisions}, ["Haryana, India"] = {divs = india_polity_with_divisions}, ["Himachal Pradesh, India"] = {divs = india_polity_with_divisions}, ["Jammu and Kashmir, India"] = {placetype = "union territory", divs = india_polity_with_divisions, wp = "%l (union territory)"}, ["Jharkhand, India"] = {divs = india_polity_with_divisions}, ["Karnataka, India"] = {divs = india_polity_with_divisions}, ["Kerala, India"] = {divs = india_polity_without_divisions}, 
	["Ladakh, India"] = {placetype = "union territory", divs = india_polity_with_divisions},
	["Lakshadweep, India"] = {placetype = "union territory", divs = india_polity_without_divisions},
	["Madhya Pradesh, India"] = {divs = india_polity_with_divisions},
	["Maharashtra, India"] = {divs = india_polity_with_divisions},
	["Manipur, India"] = {divs = india_polity_without_divisions},
	["Meghalaya, India"] = {divs = india_polity_with_divisions},
	["Mizoram, India"] = {divs = india_polity_without_divisions},
	["Nagaland, India"] = {divs = india_polity_with_divisions},
	["Odisha, India"] = {divs = india_polity_with_divisions},
	["Puducherry, India"] = {placetype = "union territory", divs = india_polity_without_divisions, wp = "%l (union territory)"},
	["Pondicherry, India"] = {alias_of = "Puducherry, India", display = true},
	["Punjab, India"] = {divs = india_polity_with_divisions, wp = "%l, %c"},
	["Rajasthan, India"] = {divs = india_polity_with_divisions},
	["Sikkim, India"] = {divs = india_polity_without_divisions},
	["Tamil Nadu, India"] = {divs = india_polity_without_divisions},
	["Telangana, India"] = {divs = india_polity_without_divisions},
	["Tripura, India"] = {divs = india_polity_without_divisions},
	["Uttar Pradesh, India"] = {divs = india_polity_with_divisions},
	["Uttarakhand, India"] = {divs = india_polity_with_divisions},
	["West Bengal, India"] = {divs = india_polity_with_divisions},
}

-- states and union territories of India
export.india_group = {
	default_container = "India",
	default_placetype = "negeri",
	data = export.india_states_and_union_territories,
}

export.indonesia_provinces = {
	["Aceh, Indonesia"] = {},
	["Bali, Indonesia"] = {},
	["Bangka Belitung Islands, Indonesia"] = {the = true},
	["Banten, Indonesia"] = {},
	["Bengkulu, Indonesia"] = {},
	["Central Java, Indonesia"] = {},
	["Central Kalimantan, Indonesia"] = {},
	["Central Papua, Indonesia"] = {},
	["Central Sulawesi, Indonesia"] = {},
	["East Java, Indonesia"] = {},
	["East Kalimantan, Indonesia"] = {},
	["East Nusa Tenggara, Indonesia"] = {},
	["Gorontalo, Indonesia"] = {},
	["Highland Papua, Indonesia"] = {wp = "%l"},
	["Special Capital Region of Jakarta, Indonesia"] = {the = true, wp = "Jakarta"},
	["Jakarta, Indonesia"] = {alias_of = "Special Capital Region of Jakarta, Indonesia"},
	["Jambi, Indonesia"] = {},
	["Lampung, Indonesia"] = {},
	["Maluku, Indonesia"] = {},
	["North Kalimantan, Indonesia"] = {},
	["North Maluku, Indonesia"] = {},
	["North Sulawesi, Indonesia"] = {},
	["North Papua, Indonesia"] = {},
	["North Sumatra, Indonesia"] = {},
	["Papua, Indonesia"] = {wp = "%l (province)"},
	["Riau, Indonesia"] = {},
	["Riau Islands, Indonesia"] = {the = true},
	["Southeast Sulawesi, Indonesia"] = {},
	["South Kalimantan, Indonesia"] = {},
	["South Papua, Indonesia"] = {},
	["South Sulawesi, Indonesia"] = {},
	["South Sumatra, Indonesia"] = {},
	["Southwest Papua, Indonesia"] = {},
	["West Java, Indonesia"] = {},
	["West Kalimantan, Indonesia"] = {},
	["West Nusa Tenggara, Indonesia"] = {},
	["West Papua, Indonesia"] = {wp = "%l (province)"},
	["West Sulawesi, Indonesia"] = {},
	["West Sumatra, Indonesia"] = {},
	["Special Region of Yogyakarta, Indonesia"] = {the = true},
	["Yogyakarta, Indonesia"] = {alias_of = "Special Region of Yogyakarta, Indonesia"},
}

-- provinces of Indonesia
export.indonesia_group = {
	default_container = "Indonesia",
	default_placetype = "province",
	-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
	-- spellings.
	data = export.indonesia_provinces,
}

export.iran_provinces = {
	["Alborz Province, Iran"] = {}, -- abbreviation AL, capital [[w:Karaj]]
	["Ardabil Province, Iran"] = {}, -- abbreviation AR, capital [[w:Ardabil]]
	["Bushehr Province, Iran"] = {}, -- abbreviation BU, capital [[w:Bushehr]]
	["Chaharmahal and Bakhtiari Province, Iran"] = {}, -- abbreviation CB, capital [[w:Shahr-e Kord]]
	["East Azerbaijan Province, Iran"] = {}, -- abbreviation EA, capital [[w:Tabriz]]
	["Fars Province, Iran"] = {}, -- abbreviation FA, capital [[w:Shiraz]]
	["Pars Province, Iran"] = {alias_of = "Fars Province, Iran", display = true},
	["Gilan Province, Iran"] = {}, -- abbreviation GN, capital [[w:Rasht]]
	["Golestan Province, Iran"] = {}, -- abbreviation GO, capital [[w:Gorgan]]
	["Hamadan Province, Iran"] = {}, -- abbreviation HA, capital [[w:Hamadan]]
	["Hormozgan Province, Iran"] = {}, -- abbreviation HO, capital [[w:Bandar Abbas]]
	["Ilam Province, Iran"] = {}, -- abbreviation IL, capital [[w:Ilam, Iran|Ilam]]
	["Isfahan Province, Iran"] = {}, -- abbreviation IS, capital [[w:Isfahan]]
	["Kerman Province, Iran"] = {}, -- abbreviation KN, capital [[w:Kerman]]
	["Kermanshah Province, Iran"] = {}, -- abbreviation KE, capital [[w:Kermanshah]]
	["Khuzestan Province, Iran"] = {}, -- abbreviation KH, capital [[w:Ahvaz]]
	["Kohgiluyeh and Boyer-Ahmad Province, Iran"] = {}, -- abbreviation KB, capital [[w:Yasuj]]
	["Kurdistan Province, Iran"] = {}, -- abbreviation KU, capital [[w:Sanandaj]]
	["Lorestan Province, Iran"] = {}, -- abbreviation LO, capital [[w:Khorramabad]]
	["Markazi Province, Iran"] = {}, -- abbreviation MA, capital [[w:Arak, Iran|Arak]]
	["Mazandaran Province, Iran"] = {}, -- abbreviation MN, capital [[w:Sari, Iran|Sari]]
	["North Khorasan Province, Iran"] = {}, -- abbreviation NK, capital [[w:Bojnord]]
	["Qazvin Province, Iran"] = {}, -- abbreviation QA, capital [[w:Qazvin]]
	["Qom Province, Iran"] = {}, -- abbreviation QM, capital [[w:Qom]]
	["Razavi Khorasan Province, Iran"] = {}, -- abbreviation RK, capital [[w:Mashhad]]
	["Semnan Province, Iran"] = {}, -- abbreviation SE, capital [[w:Semnan, Iran|Semnan]]
	["Sistan and Baluchestan Province, Iran"] = {}, -- abbreviation SB, capital [[w:Zahedan]]
	["South Khorasan Province, Iran"] = {}, -- abbreviation SK, capital [[w:Birjand]]
	["Tehran Province, Iran"] = {}, -- abbreviation TE, capital [[w:Tehran]]
	["West Azerbaijan Province, Iran"] = {}, -- abbreviation WA, capital [[w:Urmia]]
	["Yazd Province, Iran"] = {}, -- abbreviation YA, capital [[w:Yazd]]
	["Zanjan Province, Iran"] = {}, -- abbreviation ZA, capital [[w:Zanjan, Iran|Zanjan]]
}

-- provinces of Iran
export.iran_group = {
	key_to_placename = make_key_to_placename(", Iran", " Province$"),
	placename_to_key = make_placename_to_key(", Iran", " Province"),
	default_container = "Iran",
	default_placetype = "province",
	-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
	-- per-province. (As of 2025-05-09, there are only 6 counties in each of [[Category:en:Counties of Iran]],
	-- [[Category:fa:Counties of Iran]] and [[Category:ar:Counties of Iran]].)
	-- default_divs = "counties",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "%e province",
	data = export.iran_provinces,
}

export.ireland_counties = {
	["County Carlow, Ireland"] = {},
	["County Cavan, Ireland"] = {},
	["County Clare, Ireland"] = {},
	["County Cork, Ireland"] = {},
	["County Donegal, Ireland"] = {},
	["County Dublin, Ireland"] = {},
	["County Galway, Ireland"] = {},
	["County Kerry, Ireland"] = {},
	["County Kildare, Ireland"] = {},
	["County Kilkenny, Ireland"] = {},
	["County Laois, Ireland"] = {},
	["County Leitrim, Ireland"] = {},
	["County Limerick, Ireland"] = {},
	["County Longford, Ireland"] = {},
	["County Louth, Ireland"] = {},
	["County Mayo, Ireland"] = {},
	["County Meath, Ireland"] = {},
	["County Monaghan, Ireland"] = {},
	["County Offaly, Ireland"] = {},
	["County Roscommon, Ireland"] = {},
	["County Sligo, Ireland"] = {},
	["County Tipperary, Ireland"] = {},
	["County Waterford, Ireland"] = {},
	["County Westmeath, Ireland"] = {},
	["County Wexford, Ireland"] = {},
	["County Wicklow, Ireland"] = {},
}

local function make_irish_type_key_to_placename(container_pattern)
	return function(key)
		key = key:gsub(container_pattern, "")
		local elliptical_key = key:gsub("^County ", "")
		return key, elliptical_key
	end
end

local function make_irish_type_placename_to_key(container_suffix)
	return function(placename)
		if not placename:find("^County ") and not placename:find("^City ") then
			placename = "County " .. placename
		end
		return placename .. container_suffix
	end
end

-- counties of Ireland
export.ireland_group = {
	key_to_placename = make_irish_type_key_to_placename(", Ireland$"),
	placename_to_key = make_irish_type_placename_to_key(", Ireland"),
	default_container = "Ireland",
	default_placetype = "county",
	data = export.ireland_counties,
}

export.italy_administrative_regions = {
	["Abruzzo, Italy"] = {},
	["Aosta Valley, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
	["Apulia, Italy"] = {},
	["Basilicata, Italy"] = {},
	["Calabria, Italy"] = {},
	["Campania, Italy"] = {},
	["Emilia-Romagna, Italy"] = {},
	["Friuli-Venezia Giulia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
	["Lazio, Italy"] = {},
	["Liguria, Italy"] = {},
	["Lombardy, Italy"] = {},
	["Marche, Italy"] = {},
	["Molise, Italy"] = {},
	["Piedmont, Italy"] = {},
	["Sardinia, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
	["Sicily, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
	["Trentino-Alto Adige, Italy"] = {placetype = {"autonomous region", "administrative region", "region"}},
	["Tuscany, Italy"] = {},
	["Umbria, Italy"] = {},
	["Veneto, Italy"] = {},
}

-- administrative regions of Italy
export.italy_group = {
	default_container = "Italy",
	default_placetype = "region",
	data = export.italy_administrative_regions,
}

-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
	["Aichi Prefecture, Japan"] = {},
	["Akita Prefecture, Japan"] = {},
	["Aomori Prefecture, Japan"] = {},
	["Chiba Prefecture, Japan"] = {},
	["Ehime Prefecture, Japan"] = {},
	["Fukui Prefecture, Japan"] = {},
	["Fukuoka Prefecture, Japan"] = {},
	["Fukushima Prefecture, Japan"] = {},
	["Gifu Prefecture, Japan"] = {},
	["Gunma Prefecture, Japan"] = {},
	["Hiroshima Prefecture, Japan"] = {},
	["Hokkaido Prefecture, Japan"] = {divs = "subprefectures", wp = "Hokkaido"},
	["Hyōgo Prefecture, Japan"] = {},
	["Hyogo Prefecture, Japan"] = {alias_of = "Hyōgo Prefecture, Japan", display = true},
	["Ibaraki Prefecture, Japan"] = {},
	["Ishikawa Prefecture, Japan"] = {},
	["Iwate Prefecture, Japan"] = {},
	["Kagawa Prefecture, Japan"] = {},
	["Kagoshima Prefecture, Japan"] = {},
	["Kanagawa Prefecture, Japan"] = {},
	["Kōchi Prefecture, Japan"] = {},
	["Kochi Prefecture, Japan"] = {alias_of = "Kōchi Prefecture, Japan", display = true},
	["Kumamoto Prefecture, Japan"] = {},
	["Kyoto Prefecture, Japan"] = {},
	["Mie Prefecture, Japan"] = {},
	["Miyagi Prefecture, Japan"] = {},
	["Miyazaki Prefecture, Japan"] = {},
	["Nagano Prefecture, Japan"] = {},
	["Nagasaki Prefecture, Japan"] = {},
	["Nara Prefecture, Japan"] = {},
	["Niigata Prefecture, Japan"] = {},
	["Ōita Prefecture, Japan"] = {},
	["Oita Prefecture, Japan"] = {alias_of = "Ōita Prefecture, Japan", display = true},
	["Okayama Prefecture, Japan"] = {},
	["Okinawa Prefecture, Japan"] = {},
	["Osaka Prefecture, Japan"] = {},
	["Saga Prefecture, Japan"] = {},
	["Saitama Prefecture, Japan"] = {},
	["Shiga Prefecture, Japan"] = {},
	["Shimane Prefecture, Japan"] = {},
	["Shizuoka Prefecture, Japan"] = {},
	["Tochigi Prefecture, Japan"] = {},
	["Tokushima Prefecture, Japan"] = {},
	["Tottori Prefecture, Japan"] = {},
	["Toyama Prefecture, Japan"] = {},
	["Wakayama Prefecture, Japan"] = {},
	["Yamagata Prefecture, Japan"] = {},
	["Yamaguchi Prefecture, Japan"] = {},
	["Yamanashi Prefecture, Japan"] = {},
}

-- prefectures of Japan
export.japan_group = {
	key_to_placename = make_key_to_placename(", Japan$", " Prefecture$"),
	placename_to_key = make_placename_to_key(", Japan", " Prefecture"),
	default_container = "Japan",
	default_placetype = "prefecture",
	data = export.japan_prefectures,
}

export.laos_provinces = {
	["Attapeu Province, Laos"] = {},
	["Bokeo Province, Laos"] = {},
	["Bolikhamxai Province, Laos"] = {},
	["Champasak Province, Laos"] = {},
	["Houaphanh Province, Laos"] = {},
	["Khammouane Province, Laos"] = {},
	["Luang Namtha Province, Laos"] = {},
	["Luang Prabang Province, Laos"] = {},
	["Oudomxay Province, Laos"] = {},
	["Phongsaly Province, Laos"] = {},
	["Salavan Province, Laos"] = {},
	["Savannakhet Province, Laos"] = {},
	["Vientiane Province, Laos"] = {},
	["Vientiane Prefecture, Laos"] = {placetype = "prefecture", wp = "%l"},
	["Sainyabuli Province, Laos"] = {},
	["Sekong Province, Laos"] = {},
	["Xaisomboun Province, Laos"] = {},
	["Xiangkhouang Province, Laos"] = {},
}

local function laos_placename_to_key(placename)
	if placename == "Vientiane Prefecture" then
		return placename .. ", Laos"
	end
	if placename:find(" Province$") then
		return placename .. ", Laos"
	end
	return placename .. " Province, Laos"
end

-- provinces of Laos
export.laos_group = {
	key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
	placename_to_key = laos_placename_to_key,
	default_container = "Laos",
	default_placetype = "province",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "%e province",
	data = export.laos_provinces,
}

export.lebanon_governorates = {
	["Akkar Governorate, Lebanon"] = {},
	["Baalbek-Hermel Governorate, Lebanon"] = {},
	["Beirut Governorate, Lebanon"] = {},
	["Beqaa Governorate, Lebanon"] = {},
	["Keserwan-Jbeil Governorate, Lebanon"] = {},
	["Mount Lebanon Governorate, Lebanon"] = {},
	["Nabatieh Governorate, Lebanon"] = {},
	-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
	-- `gov/South Governorate` with `c/Lebanon`.
	["North Governorate, Lebanon"] = {no_auto_augment_container = true},
	["South Governorate, Lebanon"] = {no_auto_augment_container = true},
}

-- governorates of Lebanon
export.lebanon_group = {
	key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
	placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
	default_container = "Lebanon",
	default_placetype = "governorate",
	data = export.lebanon_governorates,
}

export.malaysia_states = {
	["Johor, Malaysia"] = {},
	["Kedah, Malaysia"] = {},
	["Kelantan, Malaysia"] = {},
	["Malacca, Malaysia"] = {},
	["Negeri Sembilan, Malaysia"] = {},
	["Pahang, Malaysia"] = {},
	["Penang, Malaysia"] = {},
	["Perak, Malaysia"] = {},
	["Perlis, Malaysia"] = {},
	["Sabah, Malaysia"] = {},
	["Sarawak, Malaysia"] = {},
	["Selangor, Malaysia"] = {},
	["Terengganu, Malaysia"] = {},
}

-- states of Malaysia
export.malaysia_group = {
	default_container = "Malaysia",
	default_placetype = "negeri",
	default_wp = "%l, %c",
	data = export.malaysia_states,
}

export.malta_regions = {
	-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
	-- `r/Northern Region` with `c/Malta`. In particular:
	-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
	--   El Salvador;
	-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
	-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
	--   Serbia and Uganda;
	-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
	["Eastern Region, Malta"] = {no_auto_augment_container = true},
	["Gozo Region, Malta"] = {wp = "%l"},
	["Northern Region, Malta"] = {no_auto_augment_container = true},
	["Port Region, Malta"] = {},
	["Southern Region, Malta"] = {no_auto_augment_container = true},
	["Western Region, Malta"] = {no_auto_augment_container = true},
}

-- regions of Malta
export.malta_group = {
	key_to_placename = make_key_to_placename(", Malta$", " Region"),
	placename_to_key = make_placename_to_key(", Malta", " Region"),
	default_container = "Malta",
	default_placetype = "region",
	default_wp = "%l, %c",
	default_the = true,
	data = export.malta_regions,
}

export.mexico_states = {
	["Aguascalientes, Mexico"] = {},
	["Baja California, Mexico"] = {},
	-- not display-canonicalizing because the "Norte" could be for emphasis
	["Baja California Norte, Mexico"] = {alias_of = "Baja California, Mexico"},
	["Baja California Sur, Mexico"] = {},
	["Campeche, Mexico"] = {},
	["Chiapas, Mexico"] = {},
	["Chihuahua, Mexico"] = {wp = "%l (state)"},
	["Coahuila, Mexico"] = {},
	["Colima, Mexico"] = {},
	["Durango, Mexico"] = {},
	["Guanajuato, Mexico"] = {},
	["Guerrero, Mexico"] = {},
	["Hidalgo, Mexico"] = {wp = "%l (state)"},
	["Jalisco, Mexico"] = {},
	["State of Mexico, Mexico"] = {the = true},
	["Mexico, Mexico"] = {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
	-- ["Mexico City, Mexico"] = {}, doesn't belong here because it's a city
	["Michoacán, Mexico"] = {},
	["Michoacan, Mexico"] = {alias_of = "Michoacán, Mexico", display = true},
	["Morelos, Mexico"] = {},
	["Nayarit, Mexico"] = {},
	["Nuevo León, Mexico"] = {},
	["Nuevo Leon, Mexico"] = {alias_of = "Nuevo León, Mexico", display = true},
	["Oaxaca, Mexico"] = {},
	["Puebla, Mexico"] = {},
	["Querétaro, Mexico"] = {},
	["Queretaro, Mexico"] = {alias_of = "Querétaro, Mexico", display = true},
	["Quintana Roo, Mexico"] = {},
	["San Luis Potosí, Mexico"] = {},
	["San Luis Potosi, Mexico"] = {alias_of = "San Luis Potosí, Mexico", display = true},
	["Sinaloa, Mexico"] = {},
	["Sonora, Mexico"] = {},
	["Tabasco, Mexico"] = {},
	["Tamaulipas, Mexico"] = {},
	["Tlaxcala, Mexico"] = {},
	["Veracruz, Mexico"] = {},
	["Yucatán, Mexico"] = {},
	["Yucatan, Mexico"] = {alias_of = "Yucatán, Mexico", display = true},
	["Zacatecas, Mexico"] = {},
}

-- Mexican states
export.mexico_group = {
	default_container = "Mexico",
	default_placetype = "negeri",
	data = export.mexico_states,
}

export.moldova_districts_and_autonomous_territorial_units = {
	["Anenii Noi District, Moldova"] = {}, -- capital [[Anenii Noi]]
	["Basarabeasca District, Moldova"] = {}, -- capital [[Basarabeasca]]
	["Briceni District, Moldova"] = {}, -- capital [[Briceni]]
	["Cahul District, Moldova"] = {}, -- capital [[Cahul]]
	["Cantemir District, Moldova"] = {}, -- capital [[Cantemir, Moldova|Cantemir]]
	["Călărași District, Moldova"] = {}, -- capital [[Călărași, Moldova|Călărași]]
	["Căușeni District, Moldova"] = {}, -- capital [[Căușeni]]
	["Cimișlia District, Moldova"] = {}, -- capital [[Cimișlia]]
	["Criuleni District, Moldova"] = {}, -- capital [[Criuleni]]
	["Dondușeni District, Moldova"] = {}, -- capital [[Dondușeni]]
	["Drochia District, Moldova"] = {}, -- capital [[Drochia]]
	["Dubăsari District, Moldova"] = {}, -- capital [[Cocieri]]
	["Edineț District, Moldova"] = {}, -- capital [[Edineț]]
	["Fălești District, Moldova"] = {}, -- capital [[Fălești]]
	["Florești District, Moldova"] = {}, -- capital [[Florești, Moldova|Florești]]
	["Glodeni District, Moldova"] = {}, -- capital [[Glodeni]]
	["Hîncești District, Moldova"] = {}, -- capital [[Hîncești]]
	["Ialoveni District, Moldova"] = {}, -- capital [[Ialoveni]]
	["Leova District, Moldova"] = {}, -- capital [[Leova]]
	["Nisporeni District, Moldova"] = {}, -- capital [[Nisporeni]]
	["Ocnița District, Moldova"] = {}, -- capital [[Ocnița]]
	["Orhei District, Moldova"] = {}, -- capital [[Orhei]]
	["Rezina District, Moldova"] = {}, -- capital [[Rezina]]
	["Rîșcani District, Moldova"] = {}, -- capital [[Rîșcani]]
	["Sîngerei District, Moldova"] = {}, -- capital [[Sîngerei]]
	["Soroca District, Moldova"] = {}, -- capital [[Soroca]]
	["Strășeni District, Moldova"] = {}, -- capital [[Strășeni]]
	["Șoldănești District, Moldova"] = {}, -- capital [[Șoldănești]]
	["Ștefan Vodă District, Moldova"] = {}, -- capital [[Ștefan Vodă]]
	["Taraclia District, Moldova"] = {}, -- capital [[Taraclia]]
	["Telenești District, Moldova"] = {}, -- capital [[Telenești]]
	["Ungheni District, Moldova"] = {}, -- capital [[Ungheni]]
	["Chișinău, Moldova"] = {placetype = "municipality"},
	["Bălți, Moldova"] = {placetype = "municipality"},
	["Gagauzia, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Comrat]]
	-- the remainder are under the de-facto control of the unrecognized state of Transnistria
	["Bender, Moldova"] = {placetype = "municipality"},
	["Tighina, Moldova"] = {alias_of = "Bender, Moldova"},
	["Transnistria, Moldova"] = {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital [[Tiraspol]]
	["Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
	["Administrative-Territorial Units of the Left Bank of the Dniester, Moldova"] = {alias_of = "Transnistria, Moldova", the = true},
}

local function moldova_placename_to_key(placename)
	local elliptical_key = placename .. ", Moldova"
	if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
		return elliptical_key
	end
	if placename:find(" District$") then
		return placename .. ", Moldova"
	end
	return placename .. " District, Moldova"
end

-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
	key_to_placename = make_key_to_placename(", Moldova$", " District"),
	placename_to_key = moldova_placename_to_key,
	default_container = "Moldova",
	default_placetype = {"district", "raion"},
	default_divs = "communes",
	data = export.moldova_districts_and_autonomous_territorial_units,
}

export.morocco_regions = {
	["Tangier-Tetouan-Al Hoceima, Morocco"] = {},
	["Oriental, Morocco"] = {wp = "%l (%c)"},
	["L'Oriental, Morocco"] = {alias_of = "Oriental, Morocco", display = true},
	["Fez-Meknes, Morocco"] = {},
	["Rabat-Sale-Kenitra, Morocco"] = {wp = "Rabat-Salé-Kénitra"},
	["Rabat-Salé-Kénitra, Morocco"] = {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
	["Beni Mellal-Khenifra, Morocco"] = {wp = "Béni Mellal-Khénifra"},
	["Béni Mellal-Khénifra, Morocco"] = {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
	["Casablanca-Settat, Morocco"] = {},
	["Marrakesh-Safi, Morocco"] = {wp = "Marrakesh–Safi"}, -- WP title has en-dash
	["Marrakech-Safi, Morocco"] = {alias_of = "Marrakesh-Safi, Morocco", display = true},
	["Draa-Tafilalet, Morocco"] = {wp = "Drâa-Tafilalet"},
	["Drâa-Tafilalet, Morocco"] = {alias_of = "Draa-Tafilalet, Morocco", display = true},
	["Souss-Massa, Morocco"] = {},
	["Guelmim-Oued Noun, Morocco"] = {
		keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of [[Western Sahara]]"
	},
	["Laayoune-Sakia El Hamra, Morocco"] = {
		wp = "Laâyoune-Sakia El Hamra",
		keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of [[Western Sahara]]",
	},
	["Laâyoune-Sakia El Hamra, Morocco"] = {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
	["Dakhla-Oued Ed-Dahab, Morocco"] = {
		keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of [[Western Sahara]]",
	},
}

-- regions of Morocco
export.morocco_group = {
	default_container = "Morocco",
	default_placetype = "region",
	data = export.morocco_regions,
}

export.egypt_governorates = {
	["Cairo Governorate, Egypt"] = {},
	["Giza Governorate, Egypt"] = {},
	["Sharqia Governorate, Egypt"] = {},
	["Dakahlia Governorate, Egypt"] = {},
	["Beheira Governorate, Egypt"] = {},
	["Minya Governorate, Egypt"] = {},
	["Qalyubia Governorate, Egypt"] = {},
	["Sohag Governorate, Egypt"] = {},
	["Alexandria Governorate, Egypt"] = {},
	["Gharbia Governorate, Egypt"] = {},
	["Asyut Governorate, Egypt"] = {},
	["Monufia Governorate, Egypt"] = {},
	["Faiyum Governorate, Egypt"] = {},
	["Kafr El Sheikh Governorate, Egypt"] = {},
	["Qena Governorate, Egypt"] = {},
	["Beni Suef Governorate, Egypt"] = {},
	["Damietta Governorate, Egypt"] = {},
	["Aswan Governorate, Egypt"] = {},
	["Ismailia Governorate, Egypt"] = {},
	["Luxor Governorate, Egypt"] = {},
	["Suez Governorate, Egypt"] = {},
	["Port Said Governorate, Egypt"] = {},
	["Matrouh Governorate, Egypt"] = {},
	["North Sinai Governorate, Egypt"] = {},
	["Red Sea Governorate, Egypt"] = {},
	["New Valley Governorate, Egypt"] = {},
	["South Sinai Governorate, Egypt"] = {},
}

-- governorates of Egypt
export.egypt_group = {
	key_to_placename = make_key_to_placename(", Egypt$", " Governorate$"),
	placename_to_key = make_placename_to_key(", Egypt", " Governorate"),
	default_container = "Egypt",
	default_placetype = "governorate",
	data = export.egypt_governorates,
}

export.netherlands_provinces = {
	["Drenthe, Netherlands"] = {},
	["Flevoland, Netherlands"] = {},
	["Friesland, Netherlands"] = {},
	["Gelderland, Netherlands"] = {},
	["Groningen, Netherlands"] = {wp = "%l (province)"},
	["Limburg, Netherlands"] = {wp = "%l (%c)"},
	["North Brabant, Netherlands"] = {},
	-- Foreign forms get display-canonicalized.
	["Noord-Brabant, Netherlands"] = {alias_of = "North Brabant, Netherlands", display = true},
	["North Holland, Netherlands"] = {},
	["Noord-Holland, Netherlands"] = {alias_of = "North Holland, Netherlands", display = true},
	["Overijssel, Netherlands"] = {},
	["South Holland, Netherlands"] = {},
	["Zuid-Holland, Netherlands"] = {alias_of = "South Holland, Netherlands", display = true},
	["Utrecht, Netherlands"] = {wp = "%l (province)"},
	["Zeeland, Netherlands"] = {},
}

-- provinces of the Netherlands
export.netherlands_group = {
	default_container = "Netherlands",
	default_placetype = "province",
	default_divs = "municipalities",
	data = export.netherlands_provinces,
}

export.new_zealand_regions = {
	-- North Island regions
	["Northland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital [[Whangārei]]
	["Auckland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital [[Auckland]]
	["Waikato, New Zealand"] = {}, -- ISO 3166-2 code NZ-WKO, number 3, capital [[Hamilton, New Zealand|Hamilton]]
	["Bay of Plenty, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital [[Whakatāne]]
	["Gisborne, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital [[Gisborne, New Zealand|Gisborne]]
	["Hawke's Bay, New Zealand"] = {}, -- ISO 3166-2 code NZ-HKB, number 6, capital [[Napier, New Zealand|Napier]]
	["Taranaki, New Zealand"] = {}, -- ISO 3166-2 code NZ-TKI, number 7, capital [[Stratford, New Zealand|Stratford]]
	["Manawatū-Whanganui, New Zealand"] = {}, -- ISO 3166-2 code NZ-MWT, number 8, capital [[Palmerston North]]
	["Manawatu-Whanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
	["Manawatu-Wanganui, New Zealand"] = {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
	["Wellington, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital [[Wellington]]
	-- South Island regions
	["Tasman, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital [[Richmond, New Zealand|Richmond]]
	["Nelson, New Zealand"] = {placetype = {"region", "city"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital [[Nelson, New Zealand|Nelson]]
	["Marlborough, New Zealand"] = {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital [[Blenheim, New Zealand|Blenheim]]
	["West Coast, New Zealand"] = {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital [[Greymouth]]
	["Canterbury, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital [[Christchurch]]
	["Otago, New Zealand"] = {}, -- ISO 3166-2 code NZ-OTA, number 15, capital [[Dunedin]]
	["Southland, New Zealand"] = {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital [[Invercargill]]
}

-- regions of New Zealand
export.new_zealand_group = {
	default_container = "New Zealand",
	default_placetype = "region",
	data = export.new_zealand_regions,
}

export.nigeria_states = {
	["Abia State, Nigeria"] = {},
	["Adamawa State, Nigeria"] = {},
	["Akwa Ibom State, Nigeria"] = {},
	["Anambra State, Nigeria"] = {},
	["Bauchi State, Nigeria"] = {},
	["Bayelsa State, Nigeria"] = {},
	["Benue State, Nigeria"] = {},
	["Borno State, Nigeria"] = {},
	["Cross River State, Nigeria"] = {},
	["Delta State, Nigeria"] = {},
	["Ebonyi State, Nigeria"] = {},
	["Edo State, Nigeria"] = {},
	["Ekiti State, Nigeria"] = {},
	["Enugu State, Nigeria"] = {},
	["Federal Capital Territory, Nigeria"] = {
		-- not a state but allow it to be referenced as one in holonyms
		placetype = {"wilayah persekutuan", "territory", "negeri"},
		the = true,
		wp = "%l (%c)",
	},
	["Gombe State, Nigeria"] = {},
	["Imo State, Nigeria"] = {},
	["Jigawa State, Nigeria"] = {},
	["Kaduna State, Nigeria"] = {},
	["Kano State, Nigeria"] = {},
	["Katsina State, Nigeria"] = {},
	["Kebbi State, Nigeria"] = {},
	["Kogi State, Nigeria"] = {},
	["Kwara State, Nigeria"] = {},
	["Lagos State, Nigeria"] = {},
	["Nasarawa State, Nigeria"] = {},
	["Niger State, Nigeria"] = {},
	["Ogun State, Nigeria"] = {},
	["Ondo State, Nigeria"] = {},
	["Osun State, Nigeria"] = {},
	["Oyo State, Nigeria"] = {},
	["Plateau State, Nigeria"] = {},
	["Rivers State, Nigeria"] = {},
	["Sokoto State, Nigeria"] = {},
	["Taraba State, Nigeria"] = {},
	["Yobe State, Nigeria"] = {},
	["Zamfara State, Nigeria"] = {},
}

-- states of Nigeria
export.nigeria_group = {
	key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
	placename_to_key = make_placename_to_key(", Nigeria", " State"),
	default_container = "Nigeria",
	default_placetype = "negeri",
	data = export.nigeria_states,
}

export.north_korea_provinces = {
	["Chagang Province, North Korea"] = {},
	["North Hamgyong Province, North Korea"] = {},
	["South Hamgyong Province, North Korea"] = {},
	["North Hwanghae Province, North Korea"] = {},
	["South Hwanghae Province, North Korea"] = {},
	["Kangwon Province, North Korea"] = {wp = "%l (%c)"},
	["North Pyongan Province, North Korea"] = {},
	["South Pyongan Province, North Korea"] = {},
	["Ryanggang Province, North Korea"] = {},
}

-- provinces of North Korea
export.north_korea_group = {
	key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
	placename_to_key = make_placename_to_key(", North Korea", " Province"),
	default_container = "North Korea",
	default_placetype = "province",
	data = export.north_korea_provinces,
}

export.norwegian_counties = {
	["Oslo, Norway"] = {},
	["Rogaland, Norway"] = {},
	["Møre og Romsdal, Norway"] = {},
	["Nordland, Norway"] = {},
	["Østfold, Norway"] = {},
	["Akershus, Norway"] = {},
	["Buskerud, Norway"] = {},
	-- the following two were merged into Innlandet
	-- ["Hedmark, Norway"] = {},
	-- ["Oppland, Norway"] = {},
	["Innlandet, Norway"] = {},
	["Vestfold, Norway"] = {},
	["Telemark, Norway"] = {},
	-- the following two were merged into Agder
	-- ["Aust-Agder, Norway"] = {},
	-- ["Vest-Agder, Norway"] = {},
	["Agder, Norway"] = {},
	-- the following two were merged into Vestland
	-- ["Hordaland, Norway"] = {},
	-- ["Sogn og Fjordane, Norway"] = {},
	["Vestland, Norway"] = {},
	["Trøndelag, Norway"] = {},
	["Troms, Norway"] = {},
	["Finnmark, Norway"] = {},
}

-- counties of Norway
export.norway_group = {
	default_container = "Norway",
	default_placetype = "county",
	data = export.norwegian_counties,
}

export.pakistan_provinces_and_territories = {
	["Azad Kashmir, Pakistan"] = {
		placetype = {"administrative territory", "autonomous territory", "territory"},
	},
	["Azad Jammu and Kashmir, Pakistan"] = {alias_of = "Azad Kashmir, Pakistan", display = true},
	["Balochistan, Pakistan"] = {wp = "%l, %c"},
	["Gilgit-Baltistan, Pakistan"] = {
		placetype = {"administrative territory", "territory"},
	},
	["Islamabad Capital Territory, Pakistan"] = {
		the = true,
		divs = {}, -- no divisions
		placetype = {"wilayah persekutuan", "administrative territory", "territory"},
	},
	-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
	["Islamabad, Pakistan"] = {alias_of = "Islamabad Capital Territory, Pakistan"},
	["Khyber Pakhtunkhwa, Pakistan"] = {},
	["Punjab, Pakistan"] = {wp = "%l, %c"},
	["Sindh, Pakistan"] = {},
}

-- provinces and territories of Pakistan
export.pakistan_group = {
	default_container = "Pakistan",
	default_placetype = "province",
	default_divs = "divisions",
	data = export.pakistan_provinces_and_territories,
}

export.philippines_provinces = {
	["Abra, Philippines"] = {wp = "%l (province)"},
	["Agusan del Norte, Philippines"] = {},
	["Agusan del Sur, Philippines"] = {},
	["Aklan, Philippines"] = {},
	["Albay, Philippines"] = {},
	["Antique, Philippines"] = {wp = "%l (province)"},
	["Apayao, Philippines"] = {},
	["Aurora, Philippines"] = {wp = "%l (province)"},
	["Basilan, Philippines"] = {},
	["Bataan, Philippines"] = {},
	["Batanes, Philippines"] = {},
	["Batangas, Philippines"] = {},
	["Benguet, Philippines"] = {},
	["Biliran, Philippines"] = {},
	["Bohol, Philippines"] = {},
	["Bukidnon, Philippines"] = {},
	["Bulacan, Philippines"] = {},
	["Cagayan, Philippines"] = {},
	["Camarines Norte, Philippines"] = {},
	["Camarines Sur, Philippines"] = {},
	["Camiguin, Philippines"] = {},
	["Capiz, Philippines"] = {},
	["Catanduanes, Philippines"] = {},
	["Cavite, Philippines"] = {},
	["Cebu, Philippines"] = {},
	["Cotabato, Philippines"] = {},
	["Davao de Oro, Philippines"] = {},
	["Davao del Norte, Philippines"] = {},
	["Davao del Sur, Philippines"] = {},
	["Davao Occidental, Philippines"] = {},
	["Davao Oriental, Philippines"] = {},
	["Dinagat Islands, Philippines"] = {the = true},
	["Eastern Samar, Philippines"] = {},
	["Guimaras, Philippines"] = {},
	["Ifugao, Philippines"] = {},
	["Ilocos Norte, Philippines"] = {},
	["Ilocos Sur, Philippines"] = {},
	["Iloilo, Philippines"] = {},
	["Isabela, Philippines"] = {wp = "%l (province)"},
	["Kalinga, Philippines"] = {wp = "%l (province)"},
	["La Union, Philippines"] = {},
	["Laguna, Philippines"] = {wp = "%l (province)"},
	["Lanao del Norte, Philippines"] = {},
	["Lanao del Sur, Philippines"] = {},
	["Leyte, Philippines"] = {wp = "%l (province)"},
	["Maguindanao del Norte, Philippines"] = {},
	["Maguindanao del Sur, Philippines"] = {},
	["Marinduque, Philippines"] = {},
	["Masbate, Philippines"] = {},
	["Misamis Occidental, Philippines"] = {},
	["Misamis Oriental, Philippines"] = {},
	["Mountain Province, Philippines"] = {},
	["Negros Occidental, Philippines"] = {},
	["Negros Oriental, Philippines"] = {},
	["Northern Samar, Philippines"] = {},
	["Nueva Ecija, Philippines"] = {},
	["Nueva Vizcaya, Philippines"] = {},
	["Occidental Mindoro, Philippines"] = {},
	["Oriental Mindoro, Philippines"] = {},
	["Palawan, Philippines"] = {},
	["Pampanga, Philippines"] = {},
	["Pangasinan, Philippines"] = {},
	["Quezon, Philippines"] = {},
	["Quirino, Philippines"] = {},
	["Rizal, Philippines"] = {wp = "%l (province)"},
	["Romblon, Philippines"] = {},
	["Samar, Philippines"] = {wp = "%l (province)"},
	["Sarangani, Philippines"] = {},
	["Siquijor, Philippines"] = {},
	["Sorsogon, Philippines"] = {},
	["South Cotabato, Philippines"] = {},
	["Southern Leyte, Philippines"] = {},
	["Sultan Kudarat, Philippines"] = {},
	["Sulu, Philippines"] = {},
	["Surigao del Norte, Philippines"] = {},
	["Surigao del Sur, Philippines"] = {},
	["Tarlac, Philippines"] = {},
	["Tawi-Tawi, Philippines"] = {},
	["Zambales, Philippines"] = {},
	["Zamboanga del Norte, Philippines"] = {},
	["Zamboanga del Sur, Philippines"] = {},
	["Zamboanga Sibugay, Philippines"] = {},
	-- not a province but treated as one; allow it to be referred to as a province in holonyms
	["Metro Manila, Philippines"] = {placetype = {"region", "province"}},
}

-- provinces of the Philippines
export.philippines_group = {
	default_container = "Philippines",
	default_placetype = "province",
	default_divs = {"municipalities", "barangays"},
	data = export.philippines_provinces,
}

export.poland_voivodeships = {
	["Lower Silesian Voivodeship, Poland"] = {}, -- abbr DS, code 02, capital Wrocław
	["Kuyavian-Pomeranian Voivodeship, Poland"] = {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
	["Lublin Voivodeship, Poland"] = {}, -- abbr LU, code 06, capital Lublin
	["Lubusz Voivodeship, Poland"] = {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
	["Lodz Voivodeship, Poland"] = {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
	["Łódź Voivodeship, Poland"] = {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
	["Lesser Poland Voivodeship, Poland"] = {}, -- abbr MA, code 12, capital Kraków
	["Masovian Voivodeship, Poland"] = {}, -- abbr MZ, code 14, capital Warsaw
	["Opole Voivodeship, Poland"] = {}, -- abbr OP, code 16, capital Opole
	["Subcarpathian Voivodeship, Poland"] = {}, -- abbr PK, code 18, capital Rzeszów
	["Podlaskie Voivodeship, Poland"] = {}, -- abbr PD, code 20, capital Białystok
	["Pomeranian Voivodeship, Poland"] = {}, -- abbr PM, code 22, capital Gdańsk
	["Silesian Voivodeship, Poland"] = {}, -- abbr SL, code 24, capital Katowice
	["Holy Cross Voivodeship, Poland"] = {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
	["Świętokrzyskie Voivodeship, Poland"] = {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
	["Warmian-Masurian Voivodeship, Poland"] = {}, -- abbr WN, code 28, capital Olsztyn
	["Greater Poland Voivodeship, Poland"] = {}, -- abbr WP, code 30, capital Poznań
	["West Pomeranian Voivodeship, Poland"] = {}, -- abbr ZP, code 32, capital Szczecin
}

-- voivodeships of Poland
export.poland_group = {
	key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
	placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
	default_container = "Poland",
	default_placetype = "voivodeship",
	default_divs = {
		-- "counties", -- not enough of them currently
		{type = "Polish colonies", cat_as = {{type = "villages", prep = "di"}}},
	},
	data = export.poland_voivodeships,
}

export.portugal_districts_and_autonomous_regions = {
	["Azores, Portugal"] = {the = true, placetype = {"autonomous region", "region"}},
	["Aveiro District, Portugal"] = {},
	["Beja District, Portugal"] = {},
	["Braga District, Portugal"] = {},
	["Bragança District, Portugal"] = {},
	["Castelo Branco District, Portugal"] = {},
	["Coimbra District, Portugal"] = {},
	["Évora District, Portugal"] = {},
	["Faro District, Portugal"] = {},
	["Guarda District, Portugal"] = {},
	["Leiria District, Portugal"] = {},
	["Lisbon District, Portugal"] = {},
	["Lisboa District, Portugal"] = {alias_of = "Lisbon District, Portugal", display = true},
	["Madeira, Portugal"] = {placetype = {"autonomous region", "region"}},
	["Portalegre District, Portugal"] = {},
	["Porto District, Portugal"] = {},
	["Santarém District, Portugal"] = {},
	["Setúbal District, Portugal"] = {},
	["Viana do Castelo District, Portugal"] = {},
	["Vila Real District, Portugal"] = {},
	["Viseu District, Portugal"] = {},
} local function portugal_placename_to_key(placename) if placename == "Azores" or placename == "Madeira" then return placename .. ", Portugal" end if placename:find(" District$") then return placename .. ", Portugal" end return placename .. " District, Portugal" end -- districts and autonomous regions of Portugal export.portugal_group = { key_to_placename = make_key_to_placename(", Portugal$", " District$"), placename_to_key = portugal_placename_to_key, default_container = "Portugal", default_placetype = "district", default_divs = "municipalities", data = export.portugal_districts_and_autonomous_regions, } export.romania_counties = { ["Alba County, Romania"] = {}, ["Arad County, Romania"] = {}, ["Argeș County, Romania"] = {}, ["Bacău County, Romania"] = {}, ["Bihor County, Romania"] = {}, ["Bistrița-Năsăud County, Romania"] = {}, ["Botoșani County, Romania"] = {}, ["Brașov County, Romania"] = {}, ["Brăila County, Romania"] = {}, -- Bucharest: not in a county ["Buzău County, Romania"] = {}, ["Caraș-Severin County, Romania"] = {}, ["Cluj County, Romania"] = {}, ["Constanța County, Romania"] = {}, ["Covasna County, Romania"] = {}, ["Călărași County, Romania"] = {}, ["Dolj County, Romania"] = {}, ["Dâmbovița County, Romania"] = {}, ["Galați County, Romania"] = {}, ["Giurgiu County, Romania"] = {}, ["Gorj County, Romania"] = {}, ["Harghita County, Romania"] = {}, ["Hunedoara County, Romania"] = {}, ["Ialomița County, Romania"] = {}, ["Iași County, Romania"] = {}, ["Ilfov County, Romania"] = {}, ["Maramureș County, Romania"] = {}, ["Mehedinți County, Romania"] = {}, ["Mureș County, Romania"] = {}, ["Neamț County, Romania"] = {}, ["Olt County, Romania"] = {}, ["Prahova County, Romania"] = {}, ["Satu Mare County, Romania"] = {}, ["Sibiu County, Romania"] = {}, ["Suceava County, Romania"] = {}, ["Sălaj County, Romania"] = {}, ["Teleorman County, Romania"] = {}, ["Timiș County, Romania"] = {}, ["Tulcea County, Romania"] = {}, ["Vaslui County, Romania"] = {}, ["Vrancea 
County, Romania"] = {}, ["Vâlcea County, Romania"] = {}, } -- counties of Romania export.romania_group = { key_to_placename = make_key_to_placename(", Romania$", " County$"), placename_to_key = make_placename_to_key(", Romania", " County"), default_container = "Romania", default_placetype = "county", default_divs = "communes", data = export.romania_counties, } local function make_russia_federal_subject_spec(spectype, use_the, wp) return { placetype = spectype, the = not not use_the, bare_category_parent_type = {"federal subjects", spectype .. "s"}, wp = wp, } end local russia_autonomous_okrug_no_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}} local russia_autonomous_okrug_the = {placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}, the = true} local russia_krai = make_russia_federal_subject_spec("krai") local russia_oblast = make_russia_federal_subject_spec("oblast") local russia_republic_the = make_russia_federal_subject_spec("republic", "use the") local russia_republic_no_the = make_russia_federal_subject_spec("republic") export.russia_federal_subjects = { -- autonomous oblasts ["Jewish Autonomous Oblast, Russia"] = {the = true, placetype = {"autonomous oblast", "oblast"}, bare_category_parent_type = {"federal subjects", "autonomous oblasts"}}, -- autonomous okrugs ["Chukotka Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Chukotka, Russia"] = {alias_of = "Chukotka Autonomous Okrug, Russia"}, ["Khanty-Mansi Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Khanty-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Khantia-Mansia, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Yugra, Russia"] = {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"}, ["Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Nenetsia, Russia"] = {alias_of = "Nenets Autonomous 
Okrug, Russia"}, ["Yamalo-Nenets Autonomous Okrug, Russia"] = russia_autonomous_okrug_the, ["Yamalia, Russia"] = {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"}, -- krais ["Altai Krai, Russia"] = russia_krai, ["Kamchatka Krai, Russia"] = russia_krai, ["Khabarovsk Krai, Russia"] = russia_krai, ["Krasnodar Krai, Russia"] = russia_krai, ["Krasnoyarsk Krai, Russia"] = russia_krai, ["Perm Krai, Russia"] = russia_krai, ["Primorsky Krai, Russia"] = russia_krai, ["Stavropol Krai, Russia"] = russia_krai, ["Zabaykalsky Krai, Russia"] = russia_krai, -- oblasts ["Amur Oblast, Russia"] = russia_oblast, ["Arkhangelsk Oblast, Russia"] = russia_oblast, ["Astrakhan Oblast, Russia"] = russia_oblast, ["Belgorod Oblast, Russia"] = russia_oblast, ["Bryansk Oblast, Russia"] = russia_oblast, ["Chelyabinsk Oblast, Russia"] = russia_oblast, ["Irkutsk Oblast, Russia"] = russia_oblast, ["Ivanovo Oblast, Russia"] = russia_oblast, ["Kaliningrad Oblast, Russia"] = russia_oblast, ["Kaluga Oblast, Russia"] = russia_oblast, ["Kemerovo Oblast, Russia"] = russia_oblast, ["Kirov Oblast, Russia"] = russia_oblast, ["Kostroma Oblast, Russia"] = russia_oblast, ["Kurgan Oblast, Russia"] = russia_oblast, ["Kursk Oblast, Russia"] = russia_oblast, ["Leningrad Oblast, Russia"] = russia_oblast, ["Lipetsk Oblast, Russia"] = russia_oblast, ["Magadan Oblast, Russia"] = russia_oblast, ["Moscow Oblast, Russia"] = russia_oblast, ["Murmansk Oblast, Russia"] = russia_oblast, ["Nizhny Novgorod Oblast, Russia"] = russia_oblast, ["Novgorod Oblast, Russia"] = russia_oblast, ["Novosibirsk Oblast, Russia"] = russia_oblast, ["Omsk Oblast, Russia"] = russia_oblast, ["Orenburg Oblast, Russia"] = russia_oblast, ["Oryol Oblast, Russia"] = russia_oblast, ["Penza Oblast, Russia"] = russia_oblast, ["Pskov Oblast, Russia"] = russia_oblast, ["Rostov Oblast, Russia"] = russia_oblast, ["Ryazan Oblast, Russia"] = russia_oblast, ["Sakhalin Oblast, Russia"] = russia_oblast, ["Samara Oblast, Russia"] = russia_oblast, ["Saratov 
Oblast, Russia"] = russia_oblast, ["Smolensk Oblast, Russia"] = russia_oblast, ["Sverdlovsk Oblast, Russia"] = russia_oblast, ["Tambov Oblast, Russia"] = russia_oblast, ["Tomsk Oblast, Russia"] = russia_oblast, ["Tula Oblast, Russia"] = russia_oblast, ["Tver Oblast, Russia"] = russia_oblast, ["Tyumen Oblast, Russia"] = russia_oblast, ["Ulyanovsk Oblast, Russia"] = russia_oblast, ["Vladimir Oblast, Russia"] = russia_oblast, ["Volgograd Oblast, Russia"] = russia_oblast, ["Vologda Oblast, Russia"] = russia_oblast, ["Voronezh Oblast, Russia"] = russia_oblast, ["Yaroslavl Oblast, Russia"] = russia_oblast, -- republics -- -- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where -- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by -- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence -- of "the". ["Adygea, Russia"] = russia_republic_no_the, ["Republic of Adygea, Russia"] = {alias_of = "Adygea, Russia", the = true}, ["Bashkortostan, Russia"] = russia_republic_no_the, ["Republic of Bashkortostan, Russia"] = {alias_of = "Bashkortostan, Russia", the = true}, ["Bashkiria, Russia"] = {alias_of = "Bashkortostan, Russia"}, ["Buryatia, Russia"] = russia_republic_no_the, ["Republic of Buryatia, Russia"] = {alias_of = "Buryatia, Russia", the = true}, ["Dagestan, Russia"] = russia_republic_no_the, ["Republic of Dagestan, Russia"] = {alias_of = "Dagestan, Russia", the = true}, ["Ingushetia, Russia"] = russia_republic_no_the, ["Republic of Ingushetia, Russia"] = {alias_of = "Ingushetia, Russia", the = true}, ["Kalmykia, Russia"] = russia_republic_no_the, ["Republic of Kalmykia, Russia"] = {alias_of = "Kalmykia, Russia", the = true}, ["Karelia, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"), ["Republic of Karelia, Russia"] = {alias_of = "Karelia, Russia", the = true}, 
["Khakassia, Russia"] = russia_republic_no_the, ["Republic of Khakassia, Russia"] = {alias_of = "Khakassia, Russia", the = true}, ["Mordovia, Russia"] = russia_republic_no_the, ["Republic of Mordovia, Russia"] = {alias_of = "Mordovia, Russia", the = true}, ["North Ossetia-Alania, Russia"] = make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash ["Republic of North Ossetia-Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", the = true}, ["North Ossetia, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Alania, Russia"] = {alias_of = "North Ossetia-Alania, Russia", display = true}, ["Tatarstan, Russia"] = russia_republic_no_the, ["Republic of Tatarstan, Russia"] = {alias_of = "Tatarstan, Russia", the = true}, ["Altai Republic, Russia"] = russia_republic_the, ["Chechnya, Russia"] = russia_republic_no_the, ["Chechen Republic, Russia"] = {alias_of = "Chechnya, Russia", the = true}, ["Chuvashia, Russia"] = russia_republic_no_the, ["Chuvash Republic, Russia"] = {alias_of = "Chuvashia, Russia", the = true}, ["Kabardino-Balkaria, Russia"] = russia_republic_no_the, ["Kabardino-Balkariya, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = true}, ["Kabardino-Balkarian Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", the = true}, ["Kabardino-Balkar Republic, Russia"] = {alias_of = "Kabardino-Balkaria, Russia", display = "Kabardino-Balkarian Republic, Russia", the = true}, ["Karachay-Cherkessia, Russia"] = russia_republic_no_the, ["Karachay-Cherkess Republic, Russia"] = {alias_of = "Karachay-Cherkessia, Russia"}, ["Komi, Russia"] = make_russia_federal_subject_spec("republic", nil, "Komi Republic"), ["Komi Republic, Russia"] = {alias_of = "Komi, Russia", the = true}, ["Mari El, Russia"] = russia_republic_no_the, ["Mari El Republic, Russia"] = {alias_of = "Mari El, Russia", the = true}, ["Sakha, Russia"] = make_russia_federal_subject_spec("republic", nil, "Sakha Republic"), ["Sakha 
Republic, Russia"] = {alias_of = "Sakha, Russia", the = true},
	["Yakutia, Russia"] = {alias_of = "Sakha, Russia"},
	["Yakutiya, Russia"] = {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
	["Republic of Yakutia (Sakha), Russia"] = {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia", the = true},
	["Tuva, Russia"] = russia_republic_no_the,
	["Tyva, Russia"] = {alias_of = "Tuva, Russia", display = true},
	["Tuva Republic, Russia"] = {alias_of = "Tuva, Russia", the = true},
	["Tyva Republic, Russia"] = {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
	["Udmurtia, Russia"] = russia_republic_no_the,
	["Udmurt Republic, Russia"] = {alias_of = "Udmurtia, Russia", the = true},
	-- Not included due to being unrecognized and only partly controlled:
	-- ["Crimea, Russia"] = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
	-- ["Donetsk People's Republic, Russia"] = russia_republic_the,
	-- ["Luhansk People's Republic, Russia"] = russia_republic_the,
	-- ["Zaporozhye Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
	-- ["Kherson Oblast, Russia"] = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
	-- There are also federal cities (not included because they're cities):
	-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}

local function russia_key_to_placename(key)
	key = key:gsub(",.*", "")
	local full_placename = key
	if key == "Jewish Autonomous Oblast" then
		return full_placename, full_placename
	end
	local elliptical_placename
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		elliptical_placename = key:match("^(.*) " .. suffix .. "$")
		if elliptical_placename then
			return full_placename, elliptical_placename
		end
	end
	return full_placename, full_placename
end

local function russia_placename_to_key(placename)
	local key = placename .. ", Russia"
	if export.russia_federal_subjects[key] then
		return key
	end
	-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
	for _, suffix in ipairs({"Krai", "Oblast"}) do
		local suffixed_key = placename .. " " .. suffix .. ", Russia"
		if export.russia_federal_subjects[suffixed_key] then
			return suffixed_key
		end
	end
	return placename .. ", Russia"
end

local function construct_russia_federal_subject_keydesc(group, key, spec)
	local placename = key:gsub(",.*", "")
	local linked_placename = export.construct_linked_placename(spec, placename)
	local placetype = spec.placetype
	if type(placetype) == "table" then
		placetype = placetype[1]
	end
	if placetype == "oblast" then
		-- Hack: Oblasts generally don't have entries under "Foo Oblast"
		-- but just under "Foo", so fix the linked key appropriately;
		-- doesn't apply to the Jewish Autonomous Oblast
		linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
	end
	return linked_placename .. ", a [[federal subject]] ([[" .. placetype .. "]]) of [[Russia]]"
end

-- federal subjects of Russia
export.russia_group = {
	key_to_placename = russia_key_to_placename,
	placename_to_key = russia_placename_to_key,
	default_container = "Russia",
	default_keydesc = construct_russia_federal_subject_keydesc,
	default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
	data = export.russia_federal_subjects,
}

export.saudi_arabia_provinces = {
	["Riyadh Province, Saudi Arabia"] = {},
	["Mecca Province, Saudi Arabia"] = {},
	-- Name is too generic to assume it's in Saudi Arabia if not specified.
	["Eastern Province, Saudi Arabia"] = {no_auto_augment_container = true, wp = "%l, %c"},
	["Medina Province, Saudi Arabia"] = {wp = "%l (%c)"},
	["Aseer Province, Saudi Arabia"] = {wp = "Asir"},
	["Asir Province, Saudi Arabia"] = {alias_of = "Aseer Province, Saudi Arabia", display = true},
	["Jazan Province, Saudi Arabia"] = {},
	["Qassim Province, Saudi Arabia"] = {wp = "Al-Qassim Province"},
	["Al-Qassim Province, Saudi Arabia"] = {alias_of = "Qassim Province, Saudi Arabia", display = true},
	["Tabuk Province, Saudi Arabia"] = {},
	["Hail Province, Saudi Arabia"] = {wp = "Ḥa'il Province"},
	["Ha'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
	["Ḥa'il Province, Saudi Arabia"] = {alias_of = "Hail Province, Saudi Arabia", display = true},
	["Al-Jouf Province, Saudi Arabia"] = {wp = "Al-Jawf Province"},
	["Al-Jawf Province, Saudi Arabia"] = {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
	["Najran Province, Saudi Arabia"] = {},
	["Northern Borders Province, Saudi Arabia"] = {},
	["Al-Bahah Province, Saudi Arabia"] = {},
}

-- provinces of Saudi Arabia
export.saudi_arabia_group = {
	-- The patterns and container must match the ", Saudi Arabia" suffix used by the keys above.
	key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
	placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),
	default_container = "Saudi Arabia",
	default_placetype = "wilayah",
	data = export.saudi_arabia_provinces,
}

export.south_africa_provinces = {
	["Eastern Cape, South Africa"] = {the = true},
	["Free State, South Africa"] = {the = true, wp = "%l (province)"},
	["Gauteng, South Africa"] = {},
	["KwaZulu-Natal, South Africa"] = {},
	["Limpopo, South Africa"] = {},
	["Mpumalanga, South Africa"] = {},
	-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
	["North West, South Africa"] = {wp = "%l (South African province)"},
	["Northern Cape, South Africa"] = {the = true},
	["Western Cape, South Africa"] = {the = true},
}

-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa", default_placetype = "province", default_divs = "municipalities", data = export.south_africa_provinces, } export.south_korea_provinces = { ["North Chungcheong Province, South Korea"] = {}, ["South Chungcheong Province, South Korea"] = {}, ["Gangwon Province, South Korea"] = {wp = "%l, %c"}, ["Gyeonggi Province, South Korea"] = {}, ["North Gyeongsang Province, South Korea"] = {}, ["South Gyeongsang Province, South Korea"] = {}, ["North Jeolla Province, South Korea"] = {}, ["South Jeolla Province, South Korea"] = {}, ["Jeju Province, South Korea"] = {}, } -- provinces of South Korea export.south_korea_group = { key_to_placename = make_key_to_placename(", South Korea$", " Province$"), placename_to_key = make_placename_to_key(", South Korea", " Province"), default_container = "South Korea", default_placetype = "province", data = export.south_korea_provinces, } export.spain_autonomous_communities = { ["Andalusia, Spain"] = {}, ["Aragon, Spain"] = {}, ["Asturias, Spain"] = {}, ["Balearic Islands, Spain"] = {the = true}, ["Basque Country, Spain"] = {the = true, wp = "%l (autonomous community)"}, ["Canary Islands, Spain"] = {the = true}, ["Cantabria, Spain"] = {}, ["Castile and León, Spain"] = {}, ["Castilla-La Mancha, Spain"] = {wp = "Castilla–La Mancha"}, -- with en-dash ["Catalonia, Spain"] = {}, ["Community of Madrid, Spain"] = {the = true}, ["Extremadura, Spain"] = {}, ["Galicia, Spain"] = {wp = "%l (Spain)"}, ["La Rioja, Spain"] = {}, ["Murcia, Spain"] = {wp = "Region of %l"}, ["Navarre, Spain"] = {}, ["Valencia, Spain"] = {wp = "Valencian Community"}, ["Valencian Community, Spain"] = {alias_of = "Valencia, Spain", the = true}, } -- autonomous communities of Spain export.spain_group = { default_container = "Spain", default_placetype = "autonomous community", default_divs = {"municipalities", "comarcas"}, data = export.spain_autonomous_communities, } export.taiwan_counties = { ["Changhua County, Taiwan"] = {}, ["Chiayi County, 
Taiwan"] = {},
	["Hsinchu County, Taiwan"] = {},
	["Hualien County, Taiwan"] = {},
	["Kinmen County, Taiwan"] = {wp = "Kinmen"},
	["Lienchiang County, Taiwan"] = {wp = "Matsu Islands"},
	["Miaoli County, Taiwan"] = {},
	["Nantou County, Taiwan"] = {},
	["Penghu County, Taiwan"] = {wp = "Penghu"},
	["Pingtung County, Taiwan"] = {},
	["Taitung County, Taiwan"] = {},
	["Yilan County, Taiwan"] = {wp = "%l, %c"},
	["Yunlin County, Taiwan"] = {},
}

-- counties of Taiwan
export.taiwan_group = {
	key_to_placename = make_key_to_placename(", Taiwan$", " County$"),
	placename_to_key = make_placename_to_key(", Taiwan", " County"),
	default_container = "Taiwan",
	default_placetype = "county",
	default_divs = {"districts", "townships"},
	data = export.taiwan_counties,
}

export.thailand_provinces = {
	-- Bangkok (special administrative area)
	["Amnat Charoen Province, Thailand"] = {},
	["Ang Thong Province, Thailand"] = {},
	["Bueng Kan Province, Thailand"] = {},
	["Buriram Province, Thailand"] = {},
	["Chachoengsao Province, Thailand"] = {},
	["Chai Nat Province, Thailand"] = {},
	["Chaiyaphum Province, Thailand"] = {},
	["Chanthaburi Province, Thailand"] = {},
	["Chiang Mai Province, Thailand"] = {},
	["Chiang Rai Province, Thailand"] = {},
	["Chonburi Province, Thailand"] = {},
	["Chumphon Province, Thailand"] = {},
	["Kalasin Province, Thailand"] = {},
	["Kamphaeng Phet Province, Thailand"] = {},
	["Kanchanaburi Province, Thailand"] = {},
	["Khon Kaen Province, Thailand"] = {},
	["Krabi Province, Thailand"] = {},
	["Lampang Province, Thailand"] = {},
	["Lamphun Province, Thailand"] = {},
	["Loei Province, Thailand"] = {},
	["Lopburi Province, Thailand"] = {},
	["Mae Hong Son Province, Thailand"] = {},
	["Maha Sarakham Province, Thailand"] = {},
	["Mukdahan Province, Thailand"] = {},
	["Nakhon Nayok Province, Thailand"] = {},
	["Nakhon Pathom Province, Thailand"] = {},
	["Nakhon Phanom Province, Thailand"] = {},
	["Nakhon Ratchasima Province, Thailand"] = {},
	["Nakhon Sawan Province, Thailand"] = {},
	["Nakhon Si Thammarat Province, Thailand"] = {},
	["Nan Province, Thailand"] = {},
	["Narathiwat Province, Thailand"] = {},
	["Nong Bua Lamphu Province, Thailand"] = {},
	["Nong Khai Province, Thailand"] = {},
	["Nonthaburi Province, Thailand"] = {},
	["Pathum Thani Province, Thailand"] = {},
	["Pattani Province, Thailand"] = {},
	["Phang Nga Province, Thailand"] = {},
	["Phatthalung Province, Thailand"] = {},
	["Phayao Province, Thailand"] = {},
	["Phetchabun Province, Thailand"] = {},
	["Phetchaburi Province, Thailand"] = {},
	["Phichit Province, Thailand"] = {},
	["Phitsanulok Province, Thailand"] = {},
	["Phra Nakhon Si Ayutthaya Province, Thailand"] = {},
	["Phrae Province, Thailand"] = {},
	["Phuket Province, Thailand"] = {},
	["Prachinburi Province, Thailand"] = {},
	["Prachuap Khiri Khan Province, Thailand"] = {},
	["Ranong Province, Thailand"] = {},
	["Ratchaburi Province, Thailand"] = {},
	["Rayong Province, Thailand"] = {},
	["Roi Et Province, Thailand"] = {},
	["Sa Kaeo Province, Thailand"] = {},
	["Sakon Nakhon Province, Thailand"] = {},
	["Samut Prakan Province, Thailand"] = {},
	["Samut Sakhon Province, Thailand"] = {},
	["Samut Songkhram Province, Thailand"] = {},
	["Saraburi Province, Thailand"] = {},
	["Satun Province, Thailand"] = {},
	["Sing Buri Province, Thailand"] = {},
	["Sisaket Province, Thailand"] = {},
	["Songkhla Province, Thailand"] = {},
	["Sukhothai Province, Thailand"] = {},
	["Suphan Buri Province, Thailand"] = {},
	["Surat Thani Province, Thailand"] = {},
	["Surin Province, Thailand"] = {},
	["Tak Province, Thailand"] = {},
	["Trang Province, Thailand"] = {},
	["Trat Province, Thailand"] = {},
	["Ubon Ratchathani Province, Thailand"] = {},
	["Udon Thani Province, Thailand"] = {},
	["Uthai Thani Province, Thailand"] = {},
	["Uttaradit Province, Thailand"] = {},
	["Yala Province, Thailand"] = {},
	["Yasothon Province, Thailand"] = {},
}

-- provinces of Thailand
export.thailand_group = {
	-- The suffix patterns must match the " Province" suffix used by the keys above.
	key_to_placename = make_key_to_placename(", Thailand$", " Province$"),
	placename_to_key = make_placename_to_key(", Thailand", " Province"),
	default_container = "Thailand",
	default_placetype = "wilayah",
	default_divs = "daerah",
	-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
	default_wp = "Wilayah %e",
	data = export.thailand_provinces,
}

export.turkey_provinces = {
	["Adana Province, Turkey"] = {}, -- code 01
	["Adıyaman Province, Turkey"] = {}, -- code 02
	["Afyonkarahisar Province, Turkey"] = {}, -- code 03
	["Ağrı Province, Turkey"] = {}, -- code 04
	["Amasya Province, Turkey"] = {}, -- code 05
	["Ankara Province, Turkey"] = {}, -- code 06
	["Antalya Province, Turkey"] = {}, -- code 07
	["Artvin Province, Turkey"] = {}, -- code 08
	["Aydın Province, Turkey"] = {}, -- code 09
	["Balıkesir Province, Turkey"] = {}, -- code 10
	["Bilecik Province, Turkey"] = {}, -- code 11
	["Bingöl Province, Turkey"] = {}, -- code 12
	["Bitlis Province, Turkey"] = {}, -- code 13
	["Bolu Province, Turkey"] = {}, -- code 14
	["Burdur Province, Turkey"] = {}, -- code 15
	["Bursa Province, Turkey"] = {}, -- code 16
	["Çanakkale Province, Turkey"] = {}, -- code 17
	["Çankırı Province, Turkey"] = {}, -- code 18
	["Çorum Province, Turkey"] = {}, -- code 19
	["Denizli Province, Turkey"] = {}, -- code 20
	["Diyarbakır Province, Turkey"] = {}, -- code 21
	["Edirne Province, Turkey"] = {}, -- code 22
	["Elazığ Province, Turkey"] = {}, -- code 23
	["Elâzığ Province, Turkey"] = {alias_of = "Elazığ Province, Turkey", display = true},
	["Erzincan Province, Turkey"] = {}, -- code 24
	["Erzurum Province, Turkey"] = {}, -- code 25
	["Eskişehir Province, Turkey"] = {}, -- code 26
	["Gaziantep Province, Turkey"] = {}, -- code 27
	["Giresun Province, Turkey"] = {}, -- code 28
	["Gümüşhane Province, Turkey"] = {}, -- code 29
	["Hakkâri Province, Turkey"] = {}, -- code 30
	["Hakkari Province, Turkey"] = {alias_of = "Hakkâri Province, Turkey", display = true},
	["Hatay Province, Turkey"] = {}, -- code 31
	["Isparta Province, Turkey"] = {}, -- code 32
	["Mersin Province, Turkey"] = {}, -- code 33 -- 
["Istanbul Province, Turkey"] = {}, -- code 34; this is coextensive with the city itself ["İzmir Province, Turkey"] = {}, -- code 35 ["Izmir Province, Turkey"] = {alias_of = "İzmir Province, Turkey", display = true}, ["Kars Province, Turkey"] = {}, -- code 36 ["Kastamonu Province, Turkey"] = {}, -- code 37 ["Kayseri Province, Turkey"] = {}, -- code 38 ["Kırklareli Province, Turkey"] = {}, -- code 39 ["Kırşehir Province, Turkey"] = {}, -- code 40 ["Kocaeli Province, Turkey"] = {}, -- code 41 ["Konya Province, Turkey"] = {}, -- code 42 ["Kütahya Province, Turkey"] = {}, -- code 43 ["Malatya Province, Turkey"] = {}, -- code 44 ["Manisa Province, Turkey"] = {}, -- code 45 ["Kahramanmaraş Province, Turkey"] = {}, -- code 46 ["Mardin Province, Turkey"] = {}, -- code 47 ["Muğla Province, Turkey"] = {}, -- code 48 ["Muş Province, Turkey"] = {}, -- code 49 ["Nevşehir Province, Turkey"] = {}, -- code 50 ["Niğde Province, Turkey"] = {}, -- code 51 ["Ordu Province, Turkey"] = {}, -- code 52 ["Rize Province, Turkey"] = {}, -- code 53 ["Sakarya Province, Turkey"] = {}, -- code 54 ["Samsun Province, Turkey"] = {}, -- code 55 ["Siirt Province, Turkey"] = {}, -- code 56 ["Sinop Province, Turkey"] = {}, -- code 57 ["Sivas Province, Turkey"] = {}, -- code 58 ["Tekirdağ Province, Turkey"] = {}, -- code 59 ["Tokat Province, Turkey"] = {}, -- code 60 ["Trabzon Province, Turkey"] = {}, -- code 61 ["Tunceli Province, Turkey"] = {}, -- code 62 ["Şanlıurfa Province, Turkey"] = {}, -- code 63 ["Uşak Province, Turkey"] = {}, -- code 64 ["Van Province, Turkey"] = {}, -- code 65 ["Yozgat Province, Turkey"] = {}, -- code 66 ["Zonguldak Province, Turkey"] = {}, -- code 67 ["Aksaray Province, Turkey"] = {}, -- code 68 ["Bayburt Province, Turkey"] = {}, -- code 69 ["Karaman Province, Turkey"] = {}, -- code 70 ["Kırıkkale Province, Turkey"] = {}, -- code 71 ["Batman Province, Turkey"] = {}, -- code 72 ["Şırnak Province, Turkey"] = {}, -- code 73 ["Bartın Province, Turkey"] = {}, -- code 74 ["Ardahan 
Province, Turkey"] = {}, -- code 75 ["Iğdır Province, Turkey"] = {}, -- code 76 ["Yalova Province, Turkey"] = {}, -- code 77 ["Karabük Province, Turkey"] = {}, -- code 78 ["Kilis Province, Turkey"] = {}, -- code 79 ["Osmaniye Province, Turkey"] = {}, -- code 80 ["Düzce Province, Turkey"] = {}, -- code 81 } -- provinces of Turkey export.turkey_group = { key_to_placename = make_key_to_placename(", Turkey$", " Province$"), placename_to_key = make_placename_to_key(", Turkey", " Province"), default_container = "Turkey", default_placetype = "province", default_divs = "districts", data = export.turkey_provinces, } export.ukraine_oblasts = { ["Cherkasy Oblast, Ukraine"] = {}, -- capital [[Cherkasy]], license plate prefix CA, IA ["Chernihiv Oblast, Ukraine"] = {}, -- capital [[Chernihiv]], license plate prefix CB, IB ["Chernivtsi Oblast, Ukraine"] = {}, -- capital [[Chernivtsi]], license plate prefix CE, IE -- apparently will be renamed to 'Dnipro Oblast' ["Dnipropetrovsk Oblast, Ukraine"] = {}, -- capital [[Dnipro]], license plate prefix AE, KE ["Donetsk Oblast, Ukraine"] = {}, -- capital ''[[Donetsk]] ([[Kramatorsk]])'', license plate prefix AH, KH ["Ivano-Frankivsk Oblast, Ukraine"] = {}, -- capital [[Ivano-Frankivsk]], license plate prefix AT, KT ["Kharkiv Oblast, Ukraine"] = {}, -- capital [[Kharkiv]], license plate prefix AX, KX ["Kherson Oblast, Ukraine"] = {}, -- capital ''[[Kherson]]'', license plate prefix ''BT, HT'' ["Khmelnytskyi Oblast, Ukraine"] = {}, -- capital [[Khmelnytskyi]], license plate prefix BX, HX -- apparently will be renamed to 'Kropyvnytskyi Oblast' ["Kirovohrad Oblast, Ukraine"] = {}, -- capital [[Kropyvnytskyi]], license plate prefix BA, HA ["Kyiv Oblast, Ukraine"] = {}, -- capital [[Kyiv]], license plate prefix AI, KI ["Kiev Oblast, Ukraine"] = {alias_of = "Kyiv Oblast, Ukraine", display = true}, ["Luhansk Oblast, Ukraine"] = {}, -- capital ''[[Luhansk]] ([[Sievierodonetsk]])'', license plate prefix BB, HB ["Lviv Oblast, Ukraine"] = {}, -- 
capital [[Lviv]], license plate prefix BC, HC ["Mykolaiv Oblast, Ukraine"] = {}, -- capital [[Mykolaiv]], license plate prefix BE, HE ["Odesa Oblast, Ukraine"] = {}, -- capital [[Odesa]], license plate prefix BH, HH ["Odessa Oblast, Ukraine"] = {alias_of = "Odesa Oblast, Ukraine", display = true}, ["Poltava Oblast, Ukraine"] = {}, -- capital [[Poltava]], license plate prefix BI, HI ["Rivne Oblast, Ukraine"] = {}, -- capital [[Rivne]], license plate prefix BK, HK ["Sumy Oblast, Ukraine"] = {}, -- capital [[Sumy]], license plate prefix BM, HM ["Ternopil Oblast, Ukraine"] = {}, -- capital [[Ternopil]], license plate prefix BO, HO ["Vinnytsia Oblast, Ukraine"] = {}, -- capital [[Vinnytsia]], license plate prefix AB, KB ["Volyn Oblast, Ukraine"] = {}, -- capital [[Lutsk]], license plate prefix AC, KC ["Zakarpattia Oblast, Ukraine"] = {}, -- capital [[Uzhhorod]], license plate prefix AO, KO ["Zaporizhzhia Oblast, Ukraine"] = {}, -- capital ''[[Zaporizhzhia]]'', license plate prefix AP, KP ["Zaporizhia Oblast, Ukraine"] = {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true}, ["Zhytomyr Oblast, Ukraine"] = {}, -- capital [[Zhytomyr]], license plate prefix AM, KM } -- oblasts of Ukraine export.ukraine_group = { key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"), placename_to_key = make_placename_to_key(", Ukraine", " Oblast"), default_container = "Ukraine", default_placetype = "oblast", default_divs = {"raions", "hromadas"}, data = export.ukraine_oblasts, } export.united_kingdom_constituent_countries = { ["England"] = {divs = { "counties", "districts", {type = "local government districts", cat_as = "districts"}, { type = "local government districts with borough status", cat_as = {"districts", "boroughs"}, }, {type = "boroughs", cat_as = {"districts", "boroughs"}}, {type = "civil parishes", container_parent_type = false}, }}, ["Northern Ireland"] = { placetype = {"constituent country", "province", "negara"}, divs = {"counties", "districts"}, }, 
["Scotland"] = {divs = { {type = "council areas", container_parent_type = false}, "districts", }}, ["Wales"] = {divs = { "counties", {type = "county boroughs", container_parent_type = false}, {type = "communities", container_parent_type = false}, {type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}}, }}, } -- constituent countries and provinces of the United Kingdom export.united_kingdom_group = { placename_to_key = false, default_container = "United Kingdom", default_placetype = {"constituent country", "negara"}, addl_divs = { "traditional counties", {type = "historical counties", cat_as = "traditional counties"}, }, -- Don't create categories like 'Category:en:Towns in the United Kingdom' -- or 'Category:en:Places in the United Kingdom'. default_no_container_cat = true, data = export.united_kingdom_constituent_countries, } export.england_counties = { -- NOTE: We used to have various other "no longer" counties commented out, which seems to refer to counties that -- existed officially at some point between 1889 and 1974, which I have removed. I have only kept the three -- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those -- still considered "historic counties" per [[w:Historic counties of England]]. 
-- ["Avon, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Bedfordshire, England"] = {}, ["Berkshire, England"] = {}, -- ["Brighton and Hove, England"] = {}, -- city -- ["Bristol, England"] = {}, -- city ["Buckinghamshire, England"] = {}, ["Cambridgeshire, England"] = {}, ["Cheshire, England"] = {}, -- ["Cleveland, England"] = {wp = "%l (county)"}, -- no longer (1974 to 1996) ["Cornwall, England"] = {}, -- ["Cumberland, England"] = {}, -- no longer (historic county) ["Cumbria, England"] = {}, ["Derbyshire, England"] = {}, ["Devon, England"] = {}, ["Dorset, England"] = {}, ["County Durham, England"] = {}, ["East Sussex, England"] = {}, ["Essex, England"] = {}, ["Gloucestershire, England"] = {}, ["Greater London, England"] = {}, ["Greater Manchester, England"] = {}, ["Hampshire, England"] = {}, ["Herefordshire, England"] = {}, ["Hertfordshire, England"] = {}, -- ["Humberside, England"] = {}, -- no longer (1974 to 1996) -- ["Huntingdonshire, England"] = {}, -- no longer (historic county) ["Isle of Wight, England"] = {the = true}, ["Kent, England"] = {}, ["Lancashire, England"] = {}, ["Leicestershire, England"] = {}, ["Lincolnshire, England"] = {}, ["Merseyside, England"] = {}, -- ["Middlesex, England"] = {}, -- no longer (historic county) ["Norfolk, England"] = {}, ["Northamptonshire, England"] = {}, ["Northumberland, England"] = {}, ["North Yorkshire, England"] = {}, ["Nottinghamshire, England"] = {}, ["Oxfordshire, England"] = {}, ["Rutland, England"] = {}, ["Shropshire, England"] = {}, ["Somerset, England"] = {}, ["South Humberside, England"] = {}, ["South Yorkshire, England"] = {}, ["Staffordshire, England"] = {}, ["Suffolk, England"] = {}, ["Surrey, England"] = {}, -- ["Sussex, England"] = {}, -- no longer (historic county) ["Tyne and Wear, England"] = {}, ["Warwickshire, England"] = {}, ["West Midlands, England"] = {the = true, wp = "%l (county)"}, -- ["Westmorland, England"] = {}, -- no longer (historic county) ["West Sussex, England"] = {}, 
["West Yorkshire, England"] = {}, ["Wiltshire, England"] = {}, ["Worcestershire, England"] = {}, -- ["Yorkshire, England"] = {}, -- no longer (historic county) ["East Riding of Yorkshire, England"] = {the = true}, } -- counties of England export.england_group = { default_container = {key = "England", placetype = "constituent country"}, default_placetype = "county", default_divs = { "districts", {type = "local government districts", cat_as = "districts"}, { type = "local government districts with borough status", cat_as = {"districts", "boroughs"}, }, {type = "boroughs", cat_as = {"districts", "boroughs"}}, "civil parishes", }, data = export.england_counties, } export.northern_ireland_counties = { ["County Antrim, Northern Ireland"] = {}, ["County Armagh, Northern Ireland"] = {}, ["City of Belfast, Northern Ireland"] = {the = true, is_city = true, wp = "Belfast"}, ["County Down, Northern Ireland"] = {}, ["County Fermanagh, Northern Ireland"] = {}, ["County Londonderry, Northern Ireland"] = {}, ["City of Derry, Northern Ireland"] = {the = true, is_city = true, wp = "Derry"}, ["County Tyrone, Northern Ireland"] = {}, } -- counties of Northern Ireland export.northern_ireland_group = { key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"), placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"), default_container = {key = "Northern Ireland", placetype = "constituent country"}, default_placetype = "county", data = export.northern_ireland_counties, } export.scotland_council_areas = { ["Aberdeenshire, Scotland"] = {}, ["Angus, Scotland"] = {wp = "%l, %c"}, ["Argyll and Bute, Scotland"] = {}, ["City of Aberdeen, Scotland"] = {the = true, wp = "Aberdeen"}, ["Aberdeen"] = {alias_of = "City of Aberdeen, Scotland"}, ["Aberdeen City"] = {alias_of = "City of Aberdeen, Scotland"}, ["City of Dundee, Scotland"] = {the = true, wp = "Dundee"}, ["Dundee"] = {alias_of = "City of Dundee, Scotland"}, ["Dundee City"] = {alias_of = "City of Dundee, 
Scotland"}, ["City of Edinburgh, Scotland"] = {the = true, wp = "%l council area"}, ["Edinburgh"] = {alias_of = "City of Edinburgh, Scotland"}, ["City of Glasgow, Scotland"] = {the = true, wp = "Glasgow"}, ["Glasgow"] = {alias_of = "City of Glasgow, Scotland"}, ["Clackmannanshire, Scotland"] = {}, ["Dumfries and Galloway, Scotland"] = {}, ["East Ayrshire, Scotland"] = {}, ["East Dunbartonshire, Scotland"] = {}, ["East Lothian, Scotland"] = {}, ["East Renfrewshire, Scotland"] = {}, ["Falkirk, Scotland"] = {wp = "%l council area"}, ["Fife, Scotland"] = {}, ["Highland, Scotland"] = {wp = "%l council area"}, ["Inverclyde, Scotland"] = {}, ["Midlothian, Scotland"] = {}, ["Moray, Scotland"] = {}, ["North Ayrshire, Scotland"] = {}, ["North Lanarkshire, Scotland"] = {}, ["Orkney Islands, Scotland"] = {the = true}, ["Perth and Kinross, Scotland"] = {}, ["Renfrewshire, Scotland"] = {}, ["Scottish Borders, Scotland"] = {the = true}, ["Shetland Islands, Scotland"] = {the = true}, ["South Ayrshire, Scotland"] = {}, ["South Lanarkshire, Scotland"] = {}, ["Stirling, Scotland"] = {wp = "%l council area"}, ["West Dunbartonshire, Scotland"] = {}, ["West Lothian, Scotland"] = {}, ["Western Isles, Scotland"] = {the = true, wp = "Outer Hebrides"}, ["Na h-Eileanan Siar, Scotland"] = {alias_of = "Western Isles, Scotland"}, } -- council areas of Scotland export.scotland_group = { default_container = {key = "Scotland", placetype = "constituent country"}, default_placetype = "council area", data = export.scotland_council_areas, } export.wales_principal_areas = { ["Blaenau Gwent, Wales"] = {}, ["Bridgend, Wales"] = {wp = "%l County Borough"}, ["Caerphilly, Wales"] = {wp = "%l County Borough"}, -- ["Cardiff, Wales"] = {placetype = "city"}, ["Carmarthenshire, Wales"] = {placetype = "county"}, ["Ceredigion, Wales"] = {placetype = "county"}, ["Conwy, Wales"] = {wp = "%l County Borough"}, ["Denbighshire, Wales"] = {placetype = "county"}, ["Flintshire, Wales"] = {placetype = "county"}, ["Gwynedd, 
Wales"] = {placetype = "county"}, ["Isle of Anglesey, Wales"] = {the = true, placetype = "county"}, ["Anglesey, Wales"] = {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the" ["Merthyr Tydfil, Wales"] = {wp = "%l County Borough"}, ["Monmouthshire, Wales"] = {placetype = "county"}, ["Neath Port Talbot, Wales"] = {}, -- ["Newport, Wales"] = {placetype = "city", wp = "%l, %c"}, ["Pembrokeshire, Wales"] = {placetype = "county"}, ["Powys, Wales"] = {placetype = "county"}, ["Rhondda Cynon Taf, Wales"] = {}, -- ["Swansea, Wales"] = {placetype = "city"}, ["Torfaen, Wales"] = {}, ["Vale of Glamorgan, Wales"] = {the = true}, ["Wrexham, Wales"] = {wp = "%l County Borough"}, } -- principal areas (cities, counties and county boroughs) of Wales export.wales_group = { default_container = {key = "Wales", placetype = "constituent country"}, default_placetype = "county borough", data = export.wales_principal_areas, } export.united_states_states = { ["Alabama, USA"] = {}, ["Alaska, USA"] = {divs = { {type = "boroughs", container_parent_type = "counties"}, {type = "borough seats", container_parent_type = "county seats"}, }}, ["Arizona, USA"] = {}, ["Arkansas, USA"] = {}, ["California, USA"] = {}, ["Colorado, USA"] = {divs = {"counties", "county seats", "municipalities"}}, ["Connecticut, USA"] = {divs = {"counties", "county seats", "municipalities"}}, ["Delaware, USA"] = {}, ["Florida, USA"] = {}, ["Georgia, USA"] = {wp = "%l (U.S. 
state)"}, ["Hawaii, USA"] = {addl_parents = {"Polynesia"}}, ["Idaho, USA"] = {}, ["Illinois, USA"] = {}, ["Indiana, USA"] = {}, ["Iowa, USA"] = {}, ["Kansas, USA"] = {}, ["Kentucky, USA"] = {}, ["Louisiana, USA"] = {divs = { {type = "parishes", container_parent_type = "counties"}, {type = "parish seats", container_parent_type = "county seats"}, }}, ["Maine, USA"] = {}, ["Maryland, USA"] = {}, ["Massachusetts, USA"] = {}, ["Michigan, USA"] = {}, ["Minnesota, USA"] = {}, ["Mississippi, USA"] = {}, ["Missouri, USA"] = {}, ["Montana, USA"] = {}, ["Nebraska, USA"] = {}, ["Nevada, USA"] = {}, ["New Hampshire, USA"] = {}, ["New Jersey, USA"] = {divs = { "counties", "county seats", {type = "boroughs", prep = "di"}, }}, ["New Mexico, USA"] = {}, ["New York, USA"] = {wp = "%l (state)"}, ["North Carolina, USA"] = {}, ["North Dakota, USA"] = {}, ["Ohio, USA"] = {}, ["Oklahoma, USA"] = {}, ["Oregon, USA"] = {}, ["Pennsylvania, USA"] = {divs = { "counties", "county seats", {type = "boroughs", prep = "di"}, }}, ["Rhode Island, USA"] = {}, ["South Carolina, USA"] = {}, ["South Dakota, USA"] = {}, ["Tennessee, USA"] = {}, ["Texas, USA"] = {}, ["Utah, USA"] = {}, ["Vermont, USA"] = {}, ["Virginia, USA"] = {}, ["Washington, USA"] = {wp = "%l (state)"}, ["West Virginia, USA"] = {}, ["Wisconsin, USA"] = {}, ["Wyoming, USA"] = {}, } -- states of the United States export.united_states_group = { placename_to_key = make_placename_to_key(", USA"), default_container = "Amerika Syarikat", default_placetype = "negeri", default_divs = {"counties", "county seats"}, addl_divs = { {type = "census-designated places", prep = "di"}, {type = "unincorporated communities", prep = "di"}, }, data = export.united_states_states, } export.vietnam_provinces = { -- [[Northeast (Vietnam)|Northeast]] region ["Bắc Giang Province, Vietnam"] = {}, -- capital [[Bắc Giang]] ["Bắc Kạn Province, Vietnam"] = {}, -- capital [[Bắc Kạn]] ["Cao Bằng Province, Vietnam"] = {}, -- capital [[Cao Bằng]] ["Hà Giang Province, 
Vietnam"] = {}, -- capital [[Hà Giang]] ["Lạng Sơn Province, Vietnam"] = {}, -- capital [[Lạng Sơn]] ["Phú Thọ Province, Vietnam"] = {}, -- capital [[Việt Trì]] ["Quảng Ninh Province, Vietnam"] = {}, -- capital [[Hạ Long]] ["Thái Nguyên Province, Vietnam"] = {}, -- capital [[Thái Nguyên]] ["Tuyên Quang Province, Vietnam"] = {}, -- capital [[Tuyên Quang]] -- [[Northwest (Vietnam)|Northwest]] region ["Lào Cai Province, Vietnam"] = {}, -- capital [[Lào Cai]] ["Yên Bái Province, Vietnam"] = {}, -- capital [[Yên Bái]] ["Điện Biên Province, Vietnam"] = {}, -- capital [[Điện Biên Phủ]] ["Hoà Bình Province, Vietnam"] = {}, -- capital [[Hoà Bình City|Hoà Bình]] ["Hòa Bình Province, Vietnam"] = {alias_of = "Hoà Bình Province, Vietnam", display = true}, ["Lai Châu Province, Vietnam"] = {}, -- capital [[Lai Châu]] ["Sơn La Province, Vietnam"] = {}, -- capital [[Sơn La]] -- [[Red River Delta]] region ["Bắc Ninh Province, Vietnam"] = {}, -- capital [[Bắc Ninh]] ["Hà Nam Province, Vietnam"] = {}, -- capital [[Phủ Lý]] ["Hải Dương Province, Vietnam"] = {}, -- capital [[Hải Dương]] ["Hưng Yên Province, Vietnam"] = {}, -- capital [[Hưng Yên]] ["Nam Định Province, Vietnam"] = {}, -- capital [[Nam Định]] ["Ninh Bình Province, Vietnam"] = {}, -- capital [[Ninh Bình|Hoa Lư]] ["Thái Bình Province, Vietnam"] = {}, -- capital [[Thái Bình]] ["Vĩnh Phúc Province, Vietnam"] = {}, -- capital [[Vĩnh Yên]] -- ["Hanoi"] = {placetype = {"municipality", "city"}}, -- capital [[Hoàn Kiếm district]] -- ["Haiphong"] = {placetype = {"municipality", "city"}}, -- capital [[Hồng Bàng district]] -- [[North Central Coast]] region ["Hà Tĩnh Province, Vietnam"] = {}, -- capital [[Hà Tĩnh]] ["Nghệ An Province, Vietnam"] = {}, -- capital [[Vinh]] ["Quảng Bình Province, Vietnam"] = {}, -- capital [[Đồng Hới]] ["Quảng Trị Province, Vietnam"] = {}, -- capital [[Đông Hà]] ["Thanh Hoá Province, Vietnam"] = {}, -- capital [[Thanh Hoá]] ["Thanh Hóa Province, Vietnam"] = {alias_of = "Thanh Hoá Province, Vietnam", 
display = true}, -- ["Hue"] = {placetype = {"municipality", "city"}, wp = "Huế"}, -- capital [[Thuận Hoá district]] -- [[Central Highlands (Vietnam)|Central Highlands]] region ["Đắk Lắk Province, Vietnam"] = {}, -- capital [[Buôn Ma Thuột]] ["Đăk Nông Province, Vietnam"] = {}, -- capital [[Gia Nghĩa]] ["Gia Lai Province, Vietnam"] = {}, -- capital [[Pleiku]] ["Kon Tum Province, Vietnam"] = {}, -- capital [[Kon Tum]] ["Lâm Đồng Province, Vietnam"] = {}, -- capital [[Đà Lạt]] -- [[South Central Coast]] region ["Bình Định Province, Vietnam"] = {}, -- capital [[Quy Nhon]] ["Bình Thuận Province, Vietnam"] = {}, -- capital [[Phan Thiết]] ["Khánh Hoà Province, Vietnam"] = {}, -- capital [[Nha Trang]] ["Khánh Hòa Province, Vietnam"] = {alias_of = "Khánh Hoà Province, Vietnam", display = true}, ["Ninh Thuận Province, Vietnam"] = {}, -- capital [[Phan Rang–Tháp Chàm]] ["Phú Yên Province, Vietnam"] = {}, -- capital [[Tuy Hoà]] ["Quảng Nam Province, Vietnam"] = {}, -- capital [[Tam Kỳ]] ["Quảng Ngãi Province, Vietnam"] = {}, -- capital [[Quảng Ngãi]] -- ["Da Nang"] = {placetype = {"municipality", "city"}}, -- capital [[Hải Châu district]] -- [[Southeast (Vietnam)|Southeast]] region ["Bà Rịa–Vũng Tàu Province, Vietnam"] = {}, -- capital [[Bà Rịa]] ["Bình Dương Province, Vietnam"] = {}, -- capital [[Thủ Dầu Một]] ["Bình Phước Province, Vietnam"] = {}, -- capital [[Đồng Xoài]] ["Đồng Nai Province, Vietnam"] = {}, -- capital [[Biên Hoà]] ["Tây Ninh Province, Vietnam"] = {}, -- capital [[Tây Ninh]] -- ["Ho Chi Minh City"] = {placetype = {"municipality", "city"}}, -- capital [[District 1, Ho Chi Minh City|'''District 1''']] -- [[Mekong Delta]] region ["An Giang Province, Vietnam"] = {}, -- capital [[Long Xuyên]] ["Bạc Liêu Province, Vietnam"] = {}, -- capital [[Bạc Liêu]] ["Bến Tre Province, Vietnam"] = {}, -- capital [[Bến Tre]] ["Cà Mau Province, Vietnam"] = {}, -- capital [[Cà Mau]] ["Đồng Tháp Province, Vietnam"] = {}, -- capital [[Cao Lãnh City|Cao Lãnh]] ["Hậu Giang Province, 
Vietnam"] = {}, -- capital [[Vị Thanh]] ["Kiên Giang Province, Vietnam"] = {}, -- capital [[Rạch Giá]] ["Long An Province, Vietnam"] = {}, -- capital [[Tân An]] ["Sóc Trăng Province, Vietnam"] = {}, -- capital [[Sóc Trăng]] ["Tiền Giang Province, Vietnam"] = {}, -- capital [[Mỹ Tho]] ["Trà Vinh Province, Vietnam"] = {}, -- capital [[Trà Vinh]] ["Vĩnh Long Province, Vietnam"] = {}, -- capital [[Vĩnh Long]] -- ["Can Tho"] = {placetype = {"municipality", "city"}, wp = "Cần Thơ"}, -- capital [[Ninh Kiều district]] } -- provinces of Vietnam export.vietnam_group = { key_to_placename = make_key_to_placename(", Vietnam$", " Province$"), placename_to_key = make_placename_to_key(", Vietnam", " Province"), default_container = "Vietnam", default_placetype = "province", -- There may not be enough districts to subcategorize like this. -- default_divs = "districts", -- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province' default_wp = "%e province", data = export.vietnam_provinces, } ----------------------------------------------------------------------------------- -- City data -- ----------------------------------------------------------------------------------- export.australia_cities = { ["Adelaide"] = {container = "South Australia"}, -- 1,450,000 (Agglomeration) ["Brisbane"] = {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast [750,997 2024 estimate]) ["Canberra"] = {container = {key = "Australian Capital Territory, Australia", placetype = "territory"}}, -- 510,641 (2024 estimate) ["Melbourne"] = {container = "Victoria"}, -- 5,200,000 (Agglomeration) ["Newcastle, New South Wales"] = {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate) ["Newcastle"] = {alias_of = "Newcastle, New South Wales"}, ["Perth"] = {container = "Western Australia"}, -- 2,350,000 (Agglomeration) ["Sydney"] = {container = "New South Wales"}, -- 5,100,000 (Agglomeration) } export.australia_cities_group = { 
canonicalize_key_container = make_canonicalize_key_container(", Australia", "negeri"), default_placetype = "city", data = export.australia_cities, } export.brazil_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["São Paulo"] = {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos) ["Sao Paulo"] = {alias_of = "São Paulo", display = true}, ["Rio de Janeiro"] = {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area) ["Belo Horizonte"] = {container = "Minas Gerais"}, -- 5,300,000 ["Recife"] = {container = "Pernambuco"}, -- 4,100,000 ["Porto Alegre"] = {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area) ["Brasília"] = {container = "Distrito Federal"}, -- 3,850,000 ["Brasilia"] = {alias_of = "Brasília", display = true}, ["Fortaleza"] = {container = "Ceará"}, -- 3,825,000 ["Salvador"] = {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000 ["Curitiba"] = {container = "Paraná"}, -- 3,375,000 ["Campinas"] = {container = "São Paulo"}, -- 3,250,000 ["Goiânia"] = {container = "Goiás"}, -- 2,525,000 ["Goiania"] = {alias_of = "Goiânia", display = true}, ["Manaus"] = {container = "Amazonas"}, -- 2,275,000 ["Belém"] = {container = "Pará"}, -- 2,200,000 ["Belem"] = {alias_of = "Belém", display = true}, ["Vitória"] = {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000 ["Vitoria"] = {alias_of = "Vitória", display = true}, ["Santos"] = {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000 ["São Luís"] = {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000 ["Sao Luis"] = {alias_of = "São Luís", display = true}, ["Natal"] = {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000 ["Florianópolis"] = {container = "Santa Catarina"}, -- 1,260,000 ["Florianopolis"] = {alias_of = "Florianópolis", display = true}, ["Maceió"] = {container = "Alagoas"}, -- 1,220,000 ["Maceio"] = {alias_of = "Maceió", display = true}, ["João Pessoa"] = 
{container = "Paraíba", wp = "%l, %c"}, -- 1,210,000 ["Joao Pessoa"] = {alias_of = "João Pessoa", display = true}, ["São José dos Campos"] = {container = "São Paulo"}, -- 1,090,000 ["Sao Jose dos Campos"] = {alias_of = "São José dos Campos", display = true}, ["Londrina"] = {container = "Paraná"}, -- 1,050,000 ["Teresina"] = {container = "Piauí"}, -- 1,040,000 } export.brazil_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Brazil", "negeri"), default_placetype = "city", data = export.brazil_cities, } export.canada_cities = { -- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01. ["Toronto"] = {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton) ["Montreal"] = {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area) ["Vancouver"] = {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area) ["Calgary"] = {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area) ["Edmonton"] = {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area) ["Ottawa"] = {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area) ["Quebec City"] = {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census) ["Winnipeg"] = {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census) ["Hamilton"] = {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census) ["Kitchener"] = {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census) } export.canada_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Canada", "province"), default_placetype = "city", data = export.canada_cities, } export.france_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
["Paris"] = {container = "Île-de-France"}, -- 11,500,000 (Conglomeration) ["Lyon"] = {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration) ["Lyons"] = {alias_of = "Lyon", display = true}, ["Marseille"] = {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration) ["Marseilles"] = {alias_of = "Marseille", display = true}, ["Lille"] = {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration) ["Bordeaux"] = {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration) ["Toulouse"] = {container = "Occitania"}, -- 1,150,000 (Conglomeration) ["Nice"] = {container = "Provence-Alpes-Côte d'Azur"}, ["Nantes"] = {container = "Pays de la Loire"}, ["Strasbourg"] = {container = "Grand Est"}, ["Rennes"] = {container = "Brittany"}, } export.france_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", France", "region"), default_placetype = "city", data = export.france_cities, } export.germany_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. 
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area) ["Cologne"] = {container = "North Rhine-Westphalia"}, ["Köln"] = {alias_of = "Cologne", display = true}, ["Düsseldorf"] = {container = "North Rhine-Westphalia"}, ["Dusseldorf"] = {alias_of = "Düsseldorf", display = true}, ["Dortmund"] = {container = "North Rhine-Westphalia"}, ["Essen"] = {container = "North Rhine-Westphalia"}, ["Duisburg"] = {container = "North Rhine-Westphalia"}, ["Berlin"] = {}, -- 4,700,000 ["Frankfurt"] = {container = "Hesse"}, -- 3,225,000 ["Frankfurt am Main"] = {alias_of = "Frankfurt"}, -- not a display alias as it's longer ["Hamburg"] = {}, -- 2,900,000 ["Munich"] = {container = "Bavaria"}, -- 2,300,000 ["Stuttgart"] = {container = "Baden-Württemberg"}, -- 2,300,000 ["Mannheim"] = {container = "Baden-Württemberg"}, -- 1,550,000 ["Nuremberg"] = {container = "Bavaria"}, -- 1,120,000 ["Hanover"] = {container = "Lower Saxony"}, -- 1,090,000 ["Bielefeld"] = {container = "North Rhine-Westphalia"}, -- 1,080,000 ["Leipzig"] = {container = "Saxony"}, -- 1,080,000 ["Aachen"] = {container = "North Rhine-Westphalia"}, -- 1,000,000 ["Aix-la-Chapelle"] = {alias_of = "Aachen"}, -- historical; not a display alias ["Bremen"] = {}, } export.germany_cities_group = { default_container = "Germany", canonicalize_key_container = make_canonicalize_key_container(", Germany", "negeri"), default_placetype = "city", data = export.germany_cities, } export.india_cities = { -- This lists the 65 metro areas per Demographia's 2023 estimates, as found in -- [[w:List_of_million-plus_urban_agglomerations_in_India]]. The last census in India (as of April 2025) was -- conducted in 2011, and the results are not accurate any more. 
["Delhi"] = {container = {key = "Delhi, India", placetype = "union territory"}}, -- 31,190,000 ["Mumbai"] = {container = "Maharashtra"}, -- 25,189,000 ["Kolkata"] = {container = "West Bengal"}, -- 21,747,000 ["Bangalore"] = {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000 ["Bengaluru"] = {alias_of = "Bangalore"}, ["Chennai"] = {container = "Tamil Nadu"}, -- 11,570,000 ["Hyderabad"] = {container = "Telangana"}, -- 9,797,000 ["Ahmedabad"] = {container = "Gujarat"}, -- 8,006,000 ["Pune"] = {container = "Maharashtra"}, -- 6,819,000 ["Surat"] = {container = "Gujarat"}, -- 6,601,000 ["Lucknow"] = {container = "Uttar Pradesh"}, -- 4,661,000 ["Jaipur"] = {container = "Rajasthan"}, -- 4,360,000 ["Kanpur"] = {container = "Uttar Pradesh"}, -- 4,350,000 ["Indore"] = {container = "Madhya Pradesh"}, -- 3,765,000 ["Nagpur"] = {container = "Maharashtra"}, -- 3,493,000 ["Patna"] = {container = "Bihar"}, -- 3,331,000 ["Varanasi"] = {container = "Uttar Pradesh"}, -- 3,229,000 ["Kozhikode"] = {container = "Kerala"}, -- 3,049,000 ["Thiruvananthapuram"] = {container = "Kerala"}, -- 2,851,000 ["Agra"] = {container = "Uttar Pradesh"}, -- 2,737,000 ["Bhopal"] = {container = "Madhya Pradesh"}, -- 2,562,000 ["Coimbatore"] = {container = "Tamil Nadu"}, -- 2,551,000 ["Allahabad"] = {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000 ["Prayagraj"] = {alias_of = "Allahabad"}, ["Kochi"] = {container = "Kerala"}, -- 2,381,000 ["Ludhiana"] = {container = "Punjab"}, -- 2,205,000 ["Vadodara"] = {container = "Gujarat"}, -- 2,182,000 ["Chandigarh"] = {container = {key = "Chandigarh, India", placetype = "union territory"}}, -- 2,168,000 ["Madurai"] = {container = "Tamil Nadu"}, -- 2,048,000 ["Meerut"] = {container = "Uttar Pradesh"}, -- 2,011,000 ["Visakhapatnam"] = {container = "Andhra Pradesh"}, -- 2,005,000 ["Jamshedpur"] = {container = "Jharkhand"}, -- 1,925,000 ["Malappuram"] = {container = "Kerala"}, -- 1,868,000 ["Nashik"] = {container = "Maharashtra"}, -- 1,810,000 
["Asansol"] = {container = "West Bengal"}, -- 1,720,000 ["Aligarh"] = {container = "Uttar Pradesh"}, -- 1,660,000 ["Ranchi"] = {container = "Jharkhand"}, -- 1,638,000 ["Thrissur"] = {container = "Kerala"}, -- 1,578,000 ["Kollam"] = {container = "Kerala"}, -- 1,576,000 ["Jabalpur"] = {container = "Madhya Pradesh"}, -- 1,533,000 ["Dhanbad"] = {container = "Jharkhand"}, -- 1,503,000 ["Jodhpur"] = {container = "Rajasthan"}, -- 1,497,000 ["Aurangabad"] = {container = "Maharashtra"}, -- 1,490,000 ["Chhatrapati Sambhajinagar"] = {alias_of = "Aurangabad"}, ["Rajkot"] = {container = "Gujarat"}, -- 1,487,000 ["Gwalior"] = {container = "Madhya Pradesh"}, -- 1,477,000 ["Raipur"] = {container = "Chhattisgarh"}, -- 1,429,000 ["Gorakhpur"] = {container = "Uttar Pradesh"}, -- 1,410,000 ["Kannur"] = {container = "Kerala"}, -- 1,360,000 ["Bareilly"] = {container = "Uttar Pradesh"}, -- 1,355,000 ["Guwahati"] = {container = "Assam"}, -- 1,355,000 ["Moradabad"] = {container = "Uttar Pradesh"}, -- 1,345,000 ["Amritsar"] = {container = "Punjab"}, -- 1,313,000 ["Mysore"] = {container = "Karnataka"}, -- 1,296,000 ["Bhilai"] = {container = "Chhattisgarh"}, -- 1,293,000 ["Durg-Bhilainagar"] = {alias_of = "Bhilai"}, ["Durg-Bhilai"] = {alias_of = "Bhilai"}, ["Durg"] = {alias_of = "Bhilai"}, ["Bhilainagar"] = {alias_of = "Bhilai"}, ["Vijayawada"] = {container = "Andhra Pradesh"}, -- 1,232,000 ["Srinagar"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,212,000 ["Salem"] = {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000 ["Kota"] = {container = "Rajasthan"}, -- 1,172,000 ["Jalandhar"] = {container = "Punjab"}, -- 1,165,000 ["Saharanpur"] = {container = "Uttar Pradesh"}, -- 1,152,000 ["Dehradun"] = {container = "Uttarakhand"}, -- 1,136,000 ["Tiruchirappalli"] = {container = "Tamil Nadu"}, -- 1,131,000 ["Bhubaneswar"] = {container = "Odisha"}, -- 1,112,000 ["Jammu"] = {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 
1,103,000 ["Solapur"] = {container = "Maharashtra"}, -- 1,082,000 ["Hubli-Dharwad"] = {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash ["Hubli"] = {alias_of = "Hubli-Dharwad"}, ["Dharwad"] = {alias_of = "Hubli-Dharwad"}, ["Puducherry"] = {container = {key = "Puducherry, India", placetype = "union territory"}}, -- 1,024,000 ["Pondicherry"] = {alias_of = "Puducherry", display = true}, -- satellite/secondary cities of metro area (none in citypopulation.de) ["Ghaziabad"] = {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area ["Faridabad"] = {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area ["Thane"] = {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivli"] = {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area ["Kalyan-Dombivali"] = {alias_of = "Kalyan-Dombivli", display = true}, ["Kalyan"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivli"] = {alias_of = "Kalyan-Dombivli"}, ["Dombivali"] = {alias_of = "Kalyan-Dombivli"}, ["Vasai-Virar"] = {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area ["Vasai"] = {alias_of = "Vasai-Virar"}, ["Virar"] = {alias_of = "Vasai-Virar"}, ["Navi Mumbai"] = {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area ["Howrah"] = {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area ["Pimpri-Chinchwad"] = {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area ["Pimpri Chinchwad"] = {alias_of = "Pimpri-Chinchwad", display = true}, } export.india_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", India", "negeri"), default_placetype = "city", data = export.india_cities, } 
export.indonesia_cities = { -- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate ["Jakarta"] = {container = "Special Capital Region of Jakarta", divs = { {type = "subdistricts", container_parent_type = false}, }}, ["Surabaya"] = {container = "East Java"}, ["Bekasi"] = {container = "West Java"}, -- part of Jakarta metro area ["Bandung"] = {container = "West Java"}, ["Medan"] = {container = "North Sumatra"}, ["Depok"] = {container = "West Java"}, -- part of Jakarta metro area ["Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Palembang"] = {container = "South Sumatra"}, ["Semarang"] = {container = "Central Java"}, ["Makassar"] = {container = "South Sulawesi"}, ["South Tangerang"] = {container = "Banten"}, -- part of Jakarta metro area ["Batam"] = {container = "Riau Islands"}, ["Bogor"] = {container = "West Java"}, -- part of Jakarta metro area ["Pekanbaru"] = {container = "Riau"}, ["Bandar Lampung"] = {container = "Lampung"}, -- other metro areas over 1,000,000 people ["Padang"] = {container = "West Sumatra"}, ["Samarinda"] = {container = "East Kalimantan"}, ["Malang"] = {container = "East Java"}, ["Yogyakarta"] = {container = "Special Region of Yogyakarta"}, ["Denpasar"] = {container = "Bali"}, ["Cirebon"] = {container = "West Java"}, ["Surakarta"] = {container = "Central Java"}, ["Banjarmasin"] = {container = "South Kalimantan"}, ["Tasikmalaya"] = {container = "West Java"}, } export.indonesia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Indonesia", "province"), default_placetype = "city", data = export.indonesia_cities, } export.italy_cities = { -- Data per [[w:List_of_metropolitan_areas_of_Italy]]. There are several lists given; the most recent one, used -- here, only gives estimates as of Jan 1, 2014. 
["Milan"] = {container = "Lombardy"}, -- 6,623,798 ["Naples"] = {container = "Campania"}, -- 5,294,546 ["Rome"] = {container = "Lazio"}, -- 4,447,881 ["Turin"] = {container = "Piedmont"}, -- 1,865,284 ["Venice"] = {container = "Veneto"}, -- 1,645,900 ["Florence"] = {container = "Tuscany"}, -- 1,485,030 ["Bari"] = {container = "Apulia"}, -- 1,257,459 ["Palermo"] = {container = "Sicily"}, -- 1,183,084 -- include a few just below 1,000,000 metro area that may be above it by now (depending on the definition). ["Catania"] = {container = "Sicily"}, -- 988,240 ["Brescia"] = {container = "Lombardy"}, -- 924,090 ["Genoa"] = {container = "Liguria"}, -- 861,318 } export.italy_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Italy", "region"), default_placetype = "city", data = export.italy_cities, } export.japan_cities = { -- Population figures from [[w:List of cities in Japan]]. Metro areas from -- [[w:List of metropolitan areas in Japan]]. ["Tokyo"] = {keydesc = "[[Tokyo]] Metropolis, the [[capital city]] and a [[prefecture]] of [[Japan]] (which is a country in [[Asia]])", placetype = {"city", "prefecture"}, divs = { {type = "special wards", container_parent_type = false}, {type = "cities", prep = "di"}, }, }, ["Yokohama"] = {container = "Kanagawa"}, -- 3,697,894 ["Osaka"] = {container = "Osaka"}, -- 2,668,586 ["Nagoya"] = {container = "Aichi"}, -- 2,283,289 -- FIXME, Hokkaido is handled specially. 
["Sapporo"] = {container = "Hokkaido"}, -- 1,918,096 ["Fukuoka"] = {container = "Fukuoka"}, -- 1,581,527 ["Kobe"] = {container = "Hyōgo"}, -- 1,530,847 ["Kyoto"] = {container = "Kyoto"}, -- 1,474,570 ["Kawasaki"] = {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630 ["Saitama"] = {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418 ["Hiroshima"] = {container = "Hiroshima"}, -- 1,163,806 ["Sendai"] = {container = "Miyagi"}, -- 1,029,552 -- the remaining cities are considered "central cities" in a 1,000,000+ metro area -- (sometimes there is more than one central city in the area). ["Kitakyushu"] = {container = "Fukuoka"}, -- 986,998 ["Chiba"] = {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695 ["Sakai"] = {container = "Osaka"}, -- 835,333 ["Niigata"] = {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053 ["Hamamatsu"] = {container = "Shizuoka"}, -- 811,431 ["Shizuoka"] = {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944 ["Sagamihara"] = {container = "Kanagawa"}, -- 706,342 ["Okayama"] = {container = "Okayama"}, -- 701,293 ["Kumamoto"] = {container = "Kumamoto"}, -- 670,348 ["Kagoshima"] = {container = "Kagoshima"}, -- 605,196 -- skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka) -- with population in the range 509k - 587k because not central cities in any -- 1,000,000+ metro area. 
["Utsunomiya"] = {container = "Tochigi"}, -- 507,833 } export.japan_cities_group = { default_container = "Japan", canonicalize_key_container = make_canonicalize_key_container(" Prefecture, Japan", "prefecture"), default_placetype = "city", data = export.japan_cities, } export.mexico_cities = { ["Mexico City"] = {}, -- its own state ["Monterrey"] = {container = "Nuevo León"}, ["Guadalajara"] = {container = "Jalisco"}, ["Puebla"] = {container = "Puebla", wp = "%l (city)"}, ["Toluca"] = {container = "State of Mexico"}, ["Tijuana"] = {container = "Baja California"}, -- Include the state in the category for León due to possible confusion with León, Spain. ["León, Guanajuato"] = {container = "Guanajuato", wp = "%l, %c"}, ["León"] = {alias_of = "León, Guanajuato"}, ["Leon"] = {alias_of = "León, Guanajuato", display = true}, ["Querétaro"] = {container = "Querétaro", wp = "%l (city)"}, ["Queretaro"] = {alias_of = "Querétaro", display = true}, ["Ciudad Juárez"] = {container = "Chihuahua"}, ["Juárez"] = {alias_of = "Ciudad Juárez"}, ["Juarez"] = {alias_of = "Ciudad Juárez", display = "Juárez"}, ["Torreón"] = {container = "Coahuila"}, ["Torreon"] = {alias_of = "Torreón", display = true}, -- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or -- Mérida, Venezuela. 
["Mérida, Yucatán"] = {container = "Yucatán", wp = "%l, %c"}, ["Mérida"] = {alias_of = "Mérida, Yucatán"}, ["Merida"] = {alias_of = "Mérida, Yucatán", display = true}, ["San Luis Potosí"] = {container = "San Luis Potosí", wp = "%l (city)"}, ["San Luis Potosi"] = {alias_of = "San Luis Potosí", display = true}, ["Aguascalientes"] = {container = "Aguascalientes", wp = "%l (city)"}, ["Mexicali"] = {container = "Baja California"}, } export.mexico_cities_group = { default_container = "Mexico", canonicalize_key_container = make_canonicalize_key_container(", Mexico", "negeri"), default_placetype = "city", data = export.mexico_cities, } export.nigeria_cities = { -- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01. ["Lagos"] = {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability) ["Kano"] = {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability) ["Ibadan"] = {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability) ["Abuja"] = {container = {key = "Federal Capital Territory, Nigeria", placetype = "wilayah persekutuan"}}, -- 3,050,000 (unindicated; population of low reliability) ["Port Harcourt"] = {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability) ["Kaduna"] = {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability) ["Benin City"] = {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability) ["Aba"] = {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability) ["Onitsha"] = {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability) ["Maiduguri"] = {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability) ["Ilorin"] = {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability) ["Sokoto"] = {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 
(unindicated; population of low reliability) ["Jos"] = {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability) ["Zaria"] = {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability) ["Enugu"] = {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability) } export.nigeria_cities_group = { default_container = "Nigeria", canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "negeri"), default_placetype = "city", data = export.nigeria_cities, } export.pakistan_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. ["Karachi"] = {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area) ["Lahore"] = {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area) ["Rawalpindi"] = {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad) ["Islamabad"] = {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "wilayah persekutuan"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi) ["Faisalabad"] = {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area) ["Gujranwala"] = {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area) -- there is also Hyderabad in India (very confusing) ["Hyderabad, Pakistan"] = {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area) ["Hyderabad"] = {alias_of = "Hyderabad, Pakistan"}, ["Multan"] = {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area) ["Peshawar"] = {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area) ["Quetta"] = {container = "Balochistan"}, -- 1,720,000 (Urban Area) ["Sargodha"] = {container = "Punjab"}, -- 1,080,000 (Urban Area) ["Sialkot"] = {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area) } export.pakistan_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "province"), default_placetype = "city", data 
= export.pakistan_cities, } export.philippines_cities = { -- Skipped some cities in Metro Manila (Taguig, Pasig) which don't have districts. -- Other cities outside Metro Manila skipped as not central city in their urban area. ["Quezon City"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, -- Don't display-canonicalize Foo to Foo City as it may make the display weird. ["Quezon"] = {alias_of = "Quezon City"}, ["Manila"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, ["Davao City"] = {container = "Davao del Sur"}, ["Davao"] = {alias_of = "Davao City"}, ["Caloocan"] = {container = {key = "Metro Manila, Philippines", placetype = "region"}}, ["Zamboanga City"] = {container = "Zamboanga del Sur"}, ["Zamboanga"] = {alias_of = "Zamboanga City"}, ["Cebu City"] = {container = "Cebu"}, ["Cebu"] = {alias_of = "Cebu City"}, ["Antipolo"] = {container = "Rizal"}, ["Cagayan de Oro"] = {container = "Misamis Oriental"}, ["Dasmariñas"] = {container = "Cavite"}, ["Dasmarinas"] = {alias_of = "Dasmariñas", display = true}, ["General Santos"] = {container = "South Cotabato"}, ["San Jose del Monte"] = {container = "Bulacan"}, ["Bacolod"] = {container = "Negros Occidental"}, ["Calamba"] = {container = "Laguna", wp = "%l, %c"}, ["Angeles"] = {container = "Pampanga", wp = "Angeles City"}, ["Angeles City"] = {alias_of = "Angeles"}, ["Iloilo City"] = {container = "Iloilo"}, ["Iloilo"] = {alias_of = "Iloilo City"}, } export.philippines_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Philippines", "province"), default_placetype = "city", data = export.philippines_cities, } export.russia_cities = { -- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01. 
["Moscow"] = {}, -- 18,800,000 (Agglomeration) ["Saint Petersburg"] = {}, -- 6,350,000 (Agglomeration) ["Novosibirsk"] = {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration) ["Yekaterinburg"] = {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration) ["Nizhny Novgorod"] = {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration) ["Kazan"] = {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration) ["Chelyabinsk"] = {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration) ["Rostov-on-Don"] = {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration) ["Rostov-na-Donu"] = {alias_of = "Rostov-on-Don", display = true}, ["Krasnodar"] = {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration) ["Samara"] = {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration) ["Krasnoyarsk"] = {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration) ["Ufa"] = {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration) ["Saratov"] = {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration) ["Omsk"] = {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration) ["Voronezh"] = {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration) ["Volgograd"] = {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration) ["Perm"] = {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration) } export.russia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"), default_container = "Russia", default_placetype = "city", data = export.russia_cities, } export.saudi_arabia_cities = { -- Figures for the first five from [[w:List of cities and towns in Saudi Arabia]] as of 2022. Unclear if these are -- metro, urban or city proper figures. 
["Riyadh"] = {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jeddah"] = {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Jedda"] = {alias_of = "Jeddah", display = true}, ["Jiddah"] = {alias_of = "Jeddah", display = true}, ["Jidda"] = {alias_of = "Jeddah", display = true}, ["Dammam"] = {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Mecca"] = {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Makkah"] = {alias_of = "Mecca", display = true}, ["Medina"] = {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City) ["Hofuf"] = {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushait"] = {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration) ["Khamis Mushayt"] = {alias_of = "Khamis Mushait", display = true}, } export.saudi_arabia_cities_group = { canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "province"), default_placetype = "city", data = export.saudi_arabia_cities, } export.south_korea_cities = { -- All cities listed are not associated with any county. 
["Seoul"] = {}, ["Busan"] = {}, ["Incheon"] = {}, ["Daegu"] = {}, ["Daejeon"] = {}, ["Gwangju"] = {}, ["Ulsan"] = {}, } export.south_korea_cities_group = { default_container = "South Korea", canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "province"), default_placetype = "city", data = export.south_korea_cities, } export.spain_cities = { ["Madrid"] = {container = "Community of Madrid"}, ["Barcelona"] = {container = "Catalonia"}, ["Valencia"] = {container = "Valencia"}, ["Seville"] = {container = "Andalusia"}, ["Bilbao"] = {container = "Basque Country"}, } export.spain_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"), default_placetype = "city", data = export.spain_cities, } export.taiwan_cities = { ["New Taipei City"] = {}, ["New Taipei"] = {alias_of = "New Taipei City", display = true}, ["Taichung"] = {}, ["Kaohsiung"] = {wp = "%l, Taiwan"}, ["Taipei"] = {}, ["Taoyuan"] = {}, ["Tainan"] = {}, -- these last three are not special municipalities ["Chiayi"] = {placetype = "city"}, ["Hsinchu"] = {placetype = "city"}, ["Keelung"] = {placetype = "city"}, } export.taiwan_cities_group = { placename_to_key = false, -- don't add ", Taiwan" to make the key canonicalize_key_container = make_canonicalize_key_container(", Taiwan", "county"), default_container = "Taiwan", default_placetype = {"special municipality", "municipality", "city"}, default_is_city = true, default_divs = {"districts"}, data = export.taiwan_cities, } -- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct, -- everything else will be figured out. 
export.united_kingdom_cities = { ["London"] = {container = "Greater London"}, ["Manchester"] = {container = "Greater Manchester"}, ["Birmingham"] = {container = "West Midlands"}, ["Liverpool"] = {container = "Merseyside"}, ["Glasgow"] = {container = {key = "City of Glasgow, Scotland", placetype = "council area"}}, ["Leeds"] = {container = "West Yorkshire"}, ["Newcastle upon Tyne"] = {container = "Tyne and Wear"}, ["Newcastle"] = {alias_of = "Newcastle upon Tyne"}, ["Bristol"] = {container = {key = "England", placetype = "constituent country"}}, ["Cardiff"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Portsmouth"] = {container = "Hampshire"}, ["Edinburgh"] = {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}}, -- under 1,000,000 people but principal areas of Wales; requested by [[User:Donnanz]] ["Swansea"] = {container = {key = "Wales", placetype = "constituent country"}}, ["Newport"] = {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"}, } export.united_kingdom_cities_group = { canonicalize_key_container = make_canonicalize_key_container(", England", "county"), default_placetype = "city", data = export.united_kingdom_cities, } export.united_states_cities = { -- top 50 CSA's by population, with the top and sometimes 2nd or 3rd city listed ["New York City"] = {container = "New York", wp = "%l", divs = { {type = "boroughs", container_parent_type = false}, }}, -- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York). 
["New York"] = {alias_of = "New York City"}, ["Newark"] = {container = "New Jersey"}, ["Los Angeles"] = {container = "California", wp = "%l"}, ["Long Beach"] = {container = "California"}, ["Riverside"] = {container = "California"}, ["Chicago"] = {container = "Illinois", wp = "%l"}, ["Washington, D.C."] = {wp = "%l"}, ["Washington, DC"] = {alias_of = "Washington, D.C.", display = true}, ["Washington D.C."] = {alias_of = "Washington, D.C.", display = true}, ["Washington DC"] = {alias_of = "Washington, D.C.", display = true}, -- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of -- Columbia holonym). ["Washington"] = {alias_of = "Washington, D.C."}, ["Baltimore"] = {container = "Maryland", wp = "%l"}, -- to avoid conflict with San Jose in Costa Rica ["San Jose, California"] = {container = "California"}, ["San Jose"] = {alias_of = "San Jose, California"}, ["San Francisco"] = {container = "California", wp = "%l"}, ["Oakland"] = {container = "California"}, ["Boston"] = {container = "Massachusetts", wp = "%l"}, ["Providence"] = {container = "Rhode Island"}, ["Dallas"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Fort Worth"] = {container = "Texas"}, ["Philadelphia"] = {container = "Pennsylvania", wp = "%l"}, ["Houston"] = {container = "Texas", wp = "%l"}, ["Miami"] = {container = "Florida", wp = "%l", commonscat = "%l, %c"}, ["Atlanta"] = {container = "Georgia", wp = "%l"}, ["Detroit"] = {container = "Michigan", wp = "%l"}, ["Phoenix"] = {container = "Arizona", wp = "%l", commonscat = "%l, %c"}, ["Mesa"] = {container = "Arizona"}, ["Seattle"] = {container = "Washington", wp = "%l"}, ["Orlando"] = {container = "Florida"}, ["Minneapolis"] = {container = "Minnesota", wp = "%l"}, ["Cleveland"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Denver"] = {container = "Colorado", wp = "%l", commonscat = "%l, %c"}, ["San Diego"] = {container = "California", wp = "%l", commonscat = "%l, %c"}, 
["Portland"] = {container = "Oregon"}, ["Tampa"] = {container = "Florida"}, ["St. Louis"] = {container = "Missouri", wp = "%l", commonscat = "%l, %c"}, ["Saint Louis"] = {alias_of = "St. Louis", display = true}, ["Charlotte"] = {container = "North Carolina"}, ["Sacramento"] = {container = "California"}, ["Pittsburgh"] = {container = "Pennsylvania", wp = "%l"}, ["Salt Lake City"] = {container = "Utah", wp = "%l"}, ["San Antonio"] = {container = "Texas", wp = "%l", commonscat = "%l, %c"}, ["Columbus"] = {container = "Ohio"}, ["Kansas City"] = {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"}, ["Indianapolis"] = {container = "Indiana", wp = "%l"}, ["Las Vegas"] = {container = "Nevada", wp = "%l"}, ["Cincinnati"] = {container = "Ohio", wp = "%l", commonscat = "%l, %c"}, ["Austin"] = {container = "Texas"}, ["Milwaukee"] = {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"}, ["Raleigh"] = {container = "North Carolina"}, ["Nashville"] = {container = "Tennessee"}, ["Virginia Beach"] = {container = "Virginia"}, ["Norfolk"] = {container = "Virginia"}, ["Greensboro"] = {container = "North Carolina"}, ["Winston-Salem"] = {container = "North Carolina"}, ["Jacksonville"] = {container = "Florida"}, ["New Orleans"] = {container = "Louisiana", wp = "%l"}, ["Louisville"] = {container = "Kentucky"}, ["Greenville"] = {container = "South Carolina"}, ["Hartford"] = {container = "Connecticut"}, ["Oklahoma City"] = {container = "Oklahoma", wp = "%l"}, ["Grand Rapids"] = {container = "Michigan"}, ["Memphis"] = {container = "Tennessee"}, ["Birmingham, Alabama"] = {container = "Alabama"}, ["Birmingham"] = {alias_of = "Birmingham, Alabama"}, ["Fresno"] = {container = "California"}, ["Richmond"] = {container = "Virginia"}, ["Harrisburg"] = {container = "Pennsylvania"}, -- any major city of top 50 MSA's that's missed by previous ["Buffalo"] = {container = "New York"}, -- any of the top 50 city by city population that's missed by previous ["El Paso"] = 
{container = "Texas"}, ["Albuquerque"] = {container = "New Mexico"}, ["Tucson"] = {container = "Arizona"}, ["Colorado Springs"] = {container = "Colorado"}, ["Omaha"] = {container = "Nebraska"}, ["Tulsa"] = {container = "Oklahoma"}, -- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia } export.united_states_cities_group = { default_container = "Amerika Syarikat", canonicalize_key_container = make_canonicalize_key_container(", USA", "negeri"), default_placetype = "city", default_wp = "%l, %c", data = export.united_states_cities, } export.new_york_boroughs = { ["Bronx"] = {the = true, wp = "The Bronx"}, ["Brooklyn"] = {}, ["Manhattan"] = {}, ["Queens"] = {}, ["Staten Island"] = {}, } export.new_york_boroughs_group = { default_container = {key = "New York City", placetype = "city"}, default_placetype = "borough", default_is_city = true, data = export.new_york_boroughs, } export.vietnam_cities = { -- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. ["Ho Chi Minh City"] = {}, -- 14,300,000 (Agglomeration; including Bien Hoa) ["Saigon"] = {alias_of = "Ho Chi Minh City"}, ["Hanoi"] = {}, -- 7,350,000 (Agglomeration) ["Da Nang"] = {}, -- 1,500,000 (Agglomeration) ["Danang"] = {alias_of = "Da Nang", display = true}, ["Haiphong"] = {}, -- 1,450,000 (Agglomeration) ["Hai Phong"] = {alias_of = "Haiphong", display = true}, -- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city" -- meaning it is directly under its province as opposed to being contained in a district. 
["Bien Hoa"] = {placetype = "city", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia) ["Biên Hòa"] = {alias_of = "Bien Hoa", display = true}, ["Biên Hoà"] = {alias_of = "Bien Hoa", display = true}, -- These two not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are -- both province-level municipalities and close to the 1,000,000 mark. ["Can Tho"] = {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital [[Ninh Kiều district]] ["Cần Thơ"] = {alias_of = "Can Tho", display = true}, ["Hue"] = {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); -- capital [[Thuận Hóa district]] ["Huế"] = {alias_of = "Hue", display = true}, } export.vietnam_cities_group = { placename_to_key = false, -- don't add ", Vietnam" to make the key default_container = "Vietnam", canonicalize_key_container = make_canonicalize_key_container(" Province, Vietnam", "province"), -- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of -- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct -- known locations. default_placetype = {"municipality", "city"}, default_is_city = true, -- There may not be enough districts to subcategorize like this. -- default_divs = "districts", data = export.vietnam_cities, } export.misc_cities = { ------------------ Africa ------------------- -- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from -- [[w:List of urban areas in Africa by population]]. 
["Algiers"] = {container = "Algeria"}, -- 4,325,000 (Consolidated Urban Area) ["Oran"] = {container = "Algeria"}, -- 1,640,000 (Consolidated Urban Area) ["Luanda"] = {container = "Angola"}, -- 9,650,000 (Urban Area) ["Benguela"] = {container = "Angola"}, -- 1,420,000 (Urban Area) ["Cotonou"] = {container = "Benin"}, -- 2,150,000 (Agglomeration) ["Ouagadougou"] = {container = "Burkina Faso"}, -- 3,425,000 (Agglomeration) ["Bobo-Dioulasso"] = {container = "Burkina Faso"}, -- 1,100,000 (Agglomeration) ["Bujumbura"] = {container = "Burundi"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia) ["Yaoundé"] = {container = "Cameroon"}, -- 3,975,000 (City) ["Yaounde"] = {alias_of = "Yaoundé", display = true}, ["Douala"] = {container = "Cameroon"}, -- 3,900,000 (City) ["Bangui"] = {container = "Central African Republic"}, -- 1,680,000 (Agglomeration) ["N'Djamena"] = {container = "Chad"}, -- 1,950,000 (City) ["Ndjamena"] = {alias_of = "N'Djamena", display = true}, ["Kinshasa"] = {container = "Democratic Republic of the Congo"}, -- 16,300,000 (City; population of low reliability) ["Lubumbashi"] = {container = "Democratic Republic of the Congo"}, -- 2,875,000 (City; population of low reliability) ["Mbuji-Mayi"] = {container = "Democratic Republic of the Congo"}, -- 2,500,000 (City; population of low reliability) ["Kananga"] = {container = "Democratic Republic of the Congo"}, -- 1,370,000 (City; population of low reliability) ["Kisangani"] = {container = "Democratic Republic of the Congo"}, -- 1,300,000 (City; population of low reliability) ["Bukavu"] = {container = "Democratic Republic of the Congo"}, -- 1,100,000 (City; population of low reliability) ["Goma"] = {container = "Democratic Republic of the Congo"}, -- 1,010,000 (City; population of low reliability) ["Tshikapa"] = {container = "Democratic Republic of the Congo"}, -- 1,020,468 (2023 Wikipedia [[w:List of cities with over one million inhabitants]] from populationstat.com; not in citypopulation.de) 
["Cairo"] = {container = "Egypt"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima) ["Alexandria"] = {container = "Egypt"}, -- 6,250,000 (Agglomeration) ["Giza"] = {container = "Egypt"}, -- 4,458,135 (2023 from citypopulation.de) ["Shubra El Kheima"] = {container = "Egypt"}, -- 1,240,239 (2021 from citypopulation.de) ["Asmara"] = {container = "Eritrea"}, -- 1,090,000 (City; population of low reliability) ["Asmera"] = {alias_of = "Asmara", display = true}, ["Addis Ababa"] = {container = "Ethiopia"}, -- 4,825,000 (Agglomeration) ["Banjul"] = {container = "Gambia"}, -- 1,170,000 (Agglomeration) ["Accra"] = {container = "Ghana"}, -- 6,800,000 (Agglomeration) ["Kumasi"] = {container = "Ghana"}, -- 2,900,000 (Agglomeration) ["Conakry"] = {container = "Guinea"}, -- 2,975,000 (Consolidated Urban Area) ["Abidjan"] = {container = "Ivory Coast"}, -- 7,050,000 (Agglomeration) ["Nairobi"] = {container = "Kenya"}, -- 6,900,000 (unindicated) ["Mombasa"] = {container = "Kenya"}, -- 1,370,000 (City) ["Monrovia"] = {container = "Liberia"}, -- 1,940,000 (Urban Area) ["Tripoli"] = {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated) ["Antananarivo"] = {container = "Madagascar"}, -- 3,150,000 (Agglomeration) ["Lilongwe"] = {container = "Malawi"}, -- 1,210,000 (City) ["Bamako"] = {container = "Mali"}, -- 5,700,000 (Agglomeration) ["Nouakchott"] = {container = "Mauritania"}, -- 1,500,000 (City) ["Casablanca"] = {container = {key = "Casablanca-Settat, Morocco", placetype = "region"}}, -- 4,450,000 (Municipality (urban population)) ["Rabat"] = {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "region"}}, -- 2,125,000 (Municipality (urban population)) ["Tangier"] = {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "region"}}, -- 1,410,000 (Municipality (urban population)) ["Tanger"] = {alias_of = "Tangier", display = true}, ["Tangiers"] = {alias_of = "Tangier", display = true}, ["Fez"] = {container = {key = "Fez-Meknes, 
Morocco", placetype = "region"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population)) ["Fes"] = {alias_of = "Fez", display = true}, ["Fès"] = {alias_of = "Fez", display = true}, ["Agadir"] = {container = {key = "Souss-Massa, Morocco", placetype = "region"}}, -- 1,270,000 (Municipality (urban population)) ["Marrakesh"] = {container = {key = "Marrakesh-Safi, Morocco", placetype = "region"}}, -- 1,140,000 (Municipality (urban population)) ["Marrakech"] = {alias_of = "Marrakesh", display = true}, ["Maputo"] = {container = "Mozambique"}, -- 2,575,000 (Agglomeration) ["Niamey"] = {container = "Niger"}, -- 1,530,000 (City) ["Brazzaville"] = {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration) ["Pointe-Noire"] = {container = "Republic of the Congo"}, -- 1,480,000 (City) ["Kigali"] = {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population)) ["Dakar"] = {container = "Senegal"}, -- 4,225,000 (Agglomeration) ["Touba"] = {container = "Senegal"}, -- 1,320,000 (Agglomeration) ["Freetown"] = {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration) ["Mogadishu"] = {container = "Somalia"}, -- 2,250,000 (unindicated; population of low reliability) ["Johannesburg"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.) 
["Cape Town"] = {container = {key = "Western Cape, South Africa", placetype = "province"}}, -- 5,100,000 (Consolidated Urban Area) ["Durban"] = {container = {key = "KwaZulu-Natal, South Africa", placetype = "province"}}, -- 3,900,000 (Consolidated Urban Area) ["Pretoria"] = {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 2,921,488 (2011 census) ["Port Elizabeth"] = {container = {key = "Eastern Cape, South Africa", placetype = "province"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area) ["Gqeberha"] = {alias_of = "Port Elizabeth"}, -- official name; not a display alias ["Khartoum"] = {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability) ["Dar es Salaam"] = {container = "Tanzania"}, -- 6,650,000 (Agglomeration) ["Mwanza"] = {container = "Tanzania"}, -- 1,340,000 (Agglomeration) ["Mwanza City"] = {alias_of = "Mwanza", display = true}, ["Arusha"] = {container = "Tanzania"}, -- 1,190,000 (Agglomeration) ["Zanzibar"] = {container = "Tanzania"}, -- 1,030,000 (Agglomeration) ["Lomé"] = {container = "Togo"}, -- 2,625,000 (unindicated) ["Lome"] = {alias_of = "Lomé", display = true}, ["Tunis"] = {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population)) ["Sousse"] = {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population)) ["Soussa"] = {alias_of = "Sousse", display = true}, ["Kampala"] = {container = "Uganda"}, -- 4,300,000 (unindicated) ["Lusaka"] = {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area) ["Harare"] = {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration) ------------------ Asia ------------------- -- sorted by country and then within the country, by decreasing population; figures from citypopulation.de -- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated. 
["Kabul"] = {container = "Afghanistan"}, -- 5,250,000 (Agglomeration) ["Baku"] = {container = "Azerbaijan"}, -- 3,725,000 (Administrative Area (urban population)) ["Manama"] = {container = "Bahrain"}, -- 1,560,000 (unindicated) ["Dhaka"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 23,100,000 (Agglomeration) ["Dacca"] = {alias_of = "Dhaka", display = true}, ["Chittagong"] = {container = {key = "Chittagong Division, Bangladesh", placetype = "division"}}, -- 5,050,000 (Agglomeration) ["Gazipur"] = {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area) ["Khulna"] = {container = {key = "Khulna Division, Bangladesh", placetype = "division"}}, -- 1,210,000 (Agglomeration) ["Phnom Penh"] = {container = "Cambodia"}, -- 2,925,000 (Agglomeration) ["Tehran"] = {container = {key = "Tehran Province, Iran", placetype = "province"}}, -- 16,800,000 (Agglomeration) ["Teheran"] = {alias_of = "Tehran", display = true}, ["Mashhad"] = {container = {key = "Razavi Khorasan Province, Iran", placetype = "province"}}, -- 3,475,000 (Agglomeration) ["Mashad"] = {alias_of = "Mashhad", display = true}, ["Meshhed"] = {alias_of = "Mashhad", display = true}, ["Meshed"] = {alias_of = "Mashhad", display = true}, ["Isfahan"] = {container = {key = "Isfahan Province, Iran", placetype = "province"}}, -- 3,425,000 (Agglomeration) ["Esfahan"] = {alias_of = "Isfahan", display = true}, ["Tabriz"] = {container = {key = "East Azerbaijan Province, Iran", placetype = "province"}}, -- 1,970,000 (Agglomeration) ["Shiraz"] = {container = {key = "Fars Province, Iran", placetype = "province"}}, -- 1,950,000 (Agglomeration) ["Ahvaz"] = {container = {key = "Khuzestan Province, Iran", placetype = "province"}}, -- 1,550,000 (Agglomeration) ["Qom"] = {container = {key = "Qom Province, Iran", placetype = "province"}}, -- 1,450,000 (City) ["Kermanshah"] = {container = {key = 
"Kermanshah Province, Iran", placetype = "province"}}, -- 1,130,000 (City) ["Baghdad"] = {container = "Iraq"}, -- 7,800,000 (Administrative Area (urban population)) ["Basra"] = {container = "Iraq"}, -- 1,710,000 (Administrative Area (urban population)) ["Mosul"] = {container = "Iraq"}, -- 1,550,000 (Administrative Area (urban population)) ["Erbil"] = {container = "Iraq"}, -- 1,220,000 (Administrative Area (urban population)) ["Kirkuk"] = {container = "Iraq"}, -- 1,160,000 (Administrative Area (urban population)) ["Najaf"] = {container = "Iraq"}, -- 1,050,000 (Administrative Area (urban population)) ["Tel Aviv"] = {container = "Israel"}, -- 3,000,000 (Agglomeration) -- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a -- [[w:corpus separatum]], so put the container as "Asia" and list Israel and Palestine as additional parents for -- categorization purposes. ["Jerusalem"] = {container = {key = "Asia", placetype = "benua"}, addl_parents = {"Israel", "Palestine"}}, -- 1,080,000 (Agglomeration) ["Amman"] = {container = "Jordan"}, -- 6,150,000 (unindicated) ["Irbid"] = {container = "Jordan"}, -- 1,070,000 (unindicated) ["Almaty"] = {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration) ["Alma-Ata"] = {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize ["Astana"] = {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration) ["Shymkent"] = {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration) ["Kuwait City"] = {container = "Kuwait"}, -- 5,050,000 (Agglomeration) ["Bishkek"] = {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration) ["Beirut"] = {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability) -- Kuala Lumpur is a federal capital city, not in any state ["Kuala Lumpur"] = {container = "Malaysia"}, -- 9,550,000 (Agglomeration) -- there are various George Towns and Georgetowns ["George Town, Malaysia"] = {container = {key = "Penang, Malaysia", placetype = 
"negeri"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration) ["George Town"] = {alias_of = "George Town, Malaysia"}, ["Ulaanbaatar"] = {container = "Mongolia"}, -- 1,610,000 (City) ["Ulan Bator"] = {alias_of = "Ulaanbaatar", display = true}, ["Yangon"] = {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population)) ["Rangoon"] = {alias_of = "Yangon", display = true}, ["Mandalay"] = {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population)) ["Kathmandu"] = {container = "Nepal"}, -- 3,175,000 (Agglomeration) -- Pyongyang is a directly governed city, not in any province ["Pyongyang"] = {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population)) ["Muscat"] = {container = "Oman"}, -- 1,620,000 (Agglomeration) ["Gaza"] = {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated) ["Gaza City"] = {alias_of = "Gaza"}, ["Doha"] = {container = "Qatar"}, -- 2,650,000 (Agglomeration) ["Colombo"] = {container = "Sri Lanka"}, -- 4,975,000 (unindicated) ["Damascus"] = {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability) ["Aleppo"] = {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability) ["Dushanbe"] = {container = "Tajikistan"}, -- 1,270,000 (City) ["Bangkok"] = {container = "Thailand"}, -- 21,800,000 (Agglomeration) -- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia -- [[w:List_of_municipalities_in_Thailand#Largest_cities_by_urban_population]] ["Chiang Mai"] = {container = {key = "Chiang Mai Province, Thailand", placetype = "province"}}, ["Chonburi"] = {container = {key = "Chonburi Province, Thailand", placetype = "province"}}, -- 1,570,000 (Agglomeration; including Pattaya) -- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021; -- second source is citypopulation.de reference date 2025-01-01. 
["Istanbul"] = {placetype = {"city", "province"}, divs = {"districts"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration) ["İstanbul"] = {alias_of = "Istanbul", display = true}, ["Ankara"] = {container = {key = "Ankara Province, Turkey", placetype = "province"}}, -- 5.15 million; 5,200,000 (Agglomeration) ["Izmir"] = {container = {key = "İzmir Province, Turkey", placetype = "province"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration) ["İzmir"] = {alias_of = "Izmir", display = true}, ["Bursa"] = {container = {key = "Bursa Province, Turkey", placetype = "province"}}, -- 2.02 million; 2,200,000 (Agglomeration) ["Adana"] = {container = {key = "Adana Province, Turkey", placetype = "province"}}, -- 1.77 million; 1,780,000 (Agglomeration) ["Gaziantep"] = {container = {key = "Gaziantep Province, Turkey", placetype = "province"}}, -- 1.71 million; 1,750,000 (Agglomeration) ["Antalya"] = {container = {key = "Antalya Province, Turkey", placetype = "province"}}, -- 1.3 million; 1,400,000 (Agglomeration) ["Konya"] = {container = {key = "Konya Province, Turkey", placetype = "province"}}, -- 1.35 million; 1,390,000 (Agglomeration) ["Diyarbakır"] = {container = {key = "Diyarbakır Province, Turkey", placetype = "province"}}, -- 1.07 million; 1,100,000 (Agglomeration) -- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not -- display-canonicalize to the Turkish form Diyarbakır. 
["Diyarbakir"] = {alias_of = "Diyarbakır"}, ["Mersin"] = {container = {key = "Mersin Province, Turkey", placetype = "province"}}, -- 1.03 million; 1,060,000 (Agglomeration) ["Ashgabat"] = {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration) ["Dubai"] = {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah) ["Abu Dhabi"] = {container = "United Arab Emirates"}, -- 1,850,000 (City) ["Sharjah"] = {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai) ["Tashkent"] = {container = "Uzbekistan"}, -- 3,850,000 (unindicated) ["Sanaa"] = {container = "Yemen"}, -- 3,275,000 (City; population of low reliability) ["Sana'a"] = {alias_of = "Sanaa", display = true}, ["Aden"] = {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia) ------------------ Europe or Europe-like (Caucasus etc.) --------------------- ["Yerevan"] = {container = "Armenia"}, -- 1,520,000 (Agglomeration) ["Vienna"] = {container = "Austria"}, -- 2,375,000 (Agglomeration) ["Minsk"] = {container = "Belarus"}, -- 2,100,000 (unindicated) ["Brussels"] = {container = "Belgium"}, -- 2,800,000 (Consolidated Urban Area) ["Antwerp"] = {container = "Belgium"}, -- 1,270,000 (Consolidated Urban Area) ["Sofia"] = {container = "Bulgaria"}, -- 1,260,000 (Agglomeration) ["Zagreb"] = {container = "Croatia"}, ["Prague"] = {container = "Czech Republic"}, -- 1,470,000 (Agglomeration) ["Brno"] = {container = "Czech Republic"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office) ["Olomouc"] = {container = "Czech Republic"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms) ["Copenhagen"] = {container = "Denmark"}, -- 1,800,000 (Consolidated Urban Area) ["Helsinki"] = {container = {key = "Uusimaa, Finland", placetype = "region"}}, -- 1,560,000 (Consolidated Urban Area) ["Tbilisi"] = {container = "Georgia"}, -- 1,430,000 (Agglomeration) 
["Athens"] = {container = "Greece"}, ["Thessaloniki"] = {container = "Greece"}, ["Budapest"] = {container = "Hungary"}, -- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region" ["Dublin"] = {container = {key = "County Dublin, Ireland", placetype = "county"}}, ["Riga"] = {container = "Latvia"}, ["Amsterdam"] = {container = {key = "North Holland, Netherlands", placetype = "province"}}, ["Rotterdam"] = {container = {key = "South Holland, Netherlands", placetype = "province"}}, ["The Hague"] = {container = {key = "South Holland, Netherlands", placetype = "province"}}, -- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it. ["Auckland"] = {container = {key = "Auckland, New Zealand", placetype = "region"}}, ["Oslo"] = {container = {key = "Oslo, Norway", placetype = "county"}}, ["Warsaw"] = {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}}, ["Katowice"] = {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Krakow" without accent. ["Krakow"] = {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"}, ["Kraków"] = {alias_of = "Krakow", display = true}, ["Cracow"] = {alias_of = "Krakow", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent. ["Gdańsk"] = {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}}, ["Gdansk"] = {alias_of = "Gdańsk", display = true}, ["Poznań"] = {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}}, ["Poznan"] = {alias_of = "Poznań", display = true}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Lodz" without accents. 
["Lodz"] = {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"}, ["Łódź"] = {alias_of = "Lodz", display = true}, ["Lisbon"] = {container = {key = "Lisbon District, Portugal", placetype = "district"}}, ["Porto"] = {container = {key = "Porto District, Portugal", placetype = "district"}}, ["Oporto"] = {alias_of = "Porto", display = true}, ["Bucharest"] = {container = "Romania"}, ["Belgrade"] = {container = "Serbia"}, ["Stockholm"] = {container = "Sweden"}, ["Zurich"] = {container = "Switzerland"}, --- Ngrams (up through 2022) and Google Scholar (>= 2024) confirms the common form "Zurich" without umlaut. --- Even Wikipedia uses the form without umlaut. ["Zürich"] = {alias_of = "Zurich", display = true}, ["Kyiv"] = {container = "Ukraine"}, -- not in Kyiv Oblast -- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common. ["Kiev"] = {alias_of = "Kyiv"}, ["Kharkiv"] = {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}}, ["Odessa"] = {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"}, -- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement. ["Odesa"] = {alias_of = "Odessa"}, ------------------ North America, South America --------------------- -- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01); -- Wikipedia metropolitan figures from [[w:List of metropolitan areas in the Americas]] based on per-country data; -- Wikipedia city limits figures from [[w:List of largest cities in the Americas]]. 
["Buenos Aires"] = {container = "Argentina"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia) ["Córdoba, Argentina"] = {container = "Argentina", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia) -- to avoid confusion with Córdoba in Spain ["Córdoba"] = {alias_of = "Córdoba, Argentina"}, ["Cordoba"] = {alias_of = "Córdoba, Argentina", display = "Córdoba"}, ["Rosario"] = {container = "Argentina", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia) ["Mendoza"] = {container = "Argentina", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area) ["San Miguel de Tucumán"] = {container = "Argentina"}, -- 1,110,000 (Consolidated Urban Area) ["Tucumán"] = {alias_of = "San Miguel de Tucumán"}, ["Tucuman"] = {alias_of = "San Miguel de Tucumán", display = "Tucumán"}, ["Santa Cruz de la Sierra"] = {container = "Bolivia"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia) ["Santa Cruz"] = {alias_of = "Santa Cruz de la Sierra"}, ["La Paz"] = {container = "Bolivia"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz) ["El Alto"] = {container = "Bolivia"}, ["Cochabamba"] = {container = "Bolivia"}, -- 1,280,000 (Consolidated Urban Area) ["Santiago"] = {container = "Chile"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? 
per Wikipedia) ["Valparaíso"] = {container = "Chile"}, -- 1,060,000 (Consolidated Urban Area) ["Valparaiso"] = {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area) ["Bogotá"] = {container = "Colombia"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia) ["Bogota"] = {alias_of = "Bogotá", display = true}, ["Medellín"] = {container = "Colombia"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia) ["Medellin"] = {alias_of = "Medellín", display = true}, ["Cali"] = {container = "Colombia"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia) ["Barranquilla"] = {container = "Colombia"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia) ["Bucaramanga"] = {container = "Colombia"}, -- 1,380,000 (Agglomeration) ["Cartagena, Colombia"] = {container = "Colombia", wp = "%l, %c"}, -- 1,250,000 (Agglomeration) -- to avoid confusion with Cartagena, Spain ["Cartagena"] = {alias_of = "Cartagena, Colombia"}, ["Cúcuta"] = {container = "Colombia"}, -- 1,130,000 (Agglomeration) ["Cucuta"] = {alias_of = "Cúcuta", display = true}, -- to avoid conflict with San Jose, California ["San José, Costa Rica"] = {container = "Costa Rica", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia) ["San José"] = {alias_of = "San José, Costa Rica"}, ["San Jose"] = {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME ["Havana"] = {container = "Cuba"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia) ["Santo Domingo"] = {container = "Dominican Republic"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia) ["Guayaquil"] = {container = "Ecuador"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia) ["Quito"] = {container = "Ecuador"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? 
per Wikipedia) ["San Salvador"] = {container = "El Salvador"}, -- 1,580,000 (Municipality (urban population)) ["Guatemala City"] = {container = "Guatemala"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia) ["Port-au-Prince"] = {container = "Haiti"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia) ["San Pedro Sula"] = {container = "Honduras"}, -- 1,330,000 (Consolidated Urban Area) ["Tegucigalpa"] = {container = "Honduras"}, -- 1,220,000 (Urban Area) ["Managua"] = {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area) ["Panama City"] = {container = "Panama"}, -- 1,430,000 (Urban Area) ["Asunción"] = {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population)) ["Lima"] = {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia) ["Arequipa"] = {container = "Peru"}, -- 1,210,000 (Agglomeration) ["San Juan"] = {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area) ["Montevideo"] = {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia) ["Caracas"] = {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia) ["Maracaibo"] = {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? 
per Wikipedia) -- to avoid confusion with Valencia (city and autonomous community of Spain) ["Valencia, Venezuela"] = {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area) ["Valencia"] = {alias_of = "Valencia, Venezuela"}, ["Maracay"] = {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area) ["Barquisimeto"] = {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area) } export.misc_cities_group = { canonicalize_key_container = make_canonicalize_key_container(nil, "negara"), default_placetype = "city", data = export.misc_cities, } --[==[ var: List of all known locations, in groups. The first group lists continents and continental regions, followed by three groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities (administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the hundreds). 
]==] export.locations = { export.continents_group, export.countries_group, export.country_like_entities_group, export.former_countries_group, export.australia_group, export.austria_group, export.bangladesh_group, export.brazil_group, export.canada_group, export.china_group, export.china_prefecture_level_cities_group, export.china_prefecture_level_cities_group_2, export.egypt_group, export.finland_group, export.france_group, export.france_departments_group, export.germany_group, export.greece_group, export.india_group, export.indonesia_group, export.iran_group, export.ireland_group, export.italy_group, export.japan_group, export.laos_group, export.lebanon_group, export.malaysia_group, export.malta_group, export.mexico_group, export.moldova_group, export.morocco_group, export.netherlands_group, export.new_zealand_group, export.nigeria_group, export.north_korea_group, export.norway_group, export.pakistan_group, export.philippines_group, export.poland_group, export.portugal_group, export.romania_group, export.russia_group, export.saudi_arabia_group, export.south_africa_group, export.south_korea_group, export.spain_group, export.taiwan_group, export.thailand_group, export.turkey_group, export.ukraine_group, export.united_kingdom_group, export.united_states_group, export.england_group, export.northern_ireland_group, export.scotland_group, export.wales_group, export.vietnam_group, export.australia_cities_group, export.brazil_cities_group, export.canada_cities_group, export.france_cities_group, export.germany_cities_group, export.india_cities_group, export.indonesia_cities_group, export.italy_cities_group, export.japan_cities_group, export.mexico_cities_group, export.nigeria_cities_group, export.pakistan_cities_group, export.philippines_cities_group, export.russia_cities_group, export.saudi_arabia_cities_group, export.south_korea_cities_group, export.spain_cities_group, export.taiwan_cities_group, export.united_kingdom_cities_group, export.united_states_cities_group, 
export.new_york_boroughs_group, export.vietnam_cities_group, export.misc_cities_group, } return export e7wgydrdurf6fp3y3d571dji0neo8h9 Modul:place 828 76178 281374 244167 2026-04-22T07:16:34Z PeaceSeekers 3334 281374 Scribunto text/plain local export = {} local force_cat = false -- set to true for testing local m_placetypes = require("Module:place/placetypes") local m_links = require("Module:links") local memoize = require("Module:memoize") local m_strutils = require("Module:string utilities") local m_table = require("Module:table") local debug_track_module = "Module:debug/track" local en_utilities_module = "Module:en-utilities" local form_of_module = "Module:form of" local languages_module = "Module:languages" local parse_interface_module = "Module:parse interface" local parse_utilities_module = "Module:parse utilities" local parameter_utilities_module = "Module:parameter utilities" local utilities_module = "Module:utilities" local enlang = require(languages_module).getByCode("en") local rmatch = m_strutils.match local rfind = m_strutils.find local ulen = m_strutils.len local split = m_strutils.split local dump = mw.dumpObject local insert = table.insert local concat = table.concat local pluralize = require(en_utilities_module).pluralize local extend = m_table.extend local unpack = unpack or table.unpack -- Lua 5.2 compatibility local internal_error = m_placetypes.internal_error local process_error = m_placetypes.process_error local placetype_data = m_placetypes.placetype_data --[==[ intro: ===Introduction=== This module implements {{tl|place}}, which is a template for standardizing the description and categorization of toponyms (terms that refer to locations such as cities, countries, rivers, etc.). The following modules support this template: * [[Module:place]]: The main module. 
* [[Module:place/placetypes]]: A module containing data on placetypes, as well as utilities for working with placetypes; category generation handlers for adding categories based on placetypes; and display handlers for displaying holonyms (i.e. containing locations) of a specific type. FIXME: Maybe split out the code from the data. * [[Module:place/locations]]: A module containing data on known locations, as well as utilities for working with such locations. FIXME: Maybe split out the code from the data. * [[Module:category tree/topic/Places]]: A category tree module for generating the descriptions of all categories generated by {{tl|place}}. * [[Module:place doc]]: A module that generates documentation tables describing known placetypes and locations. ===Basic terminology=== The basic terminology used in this and associated {{tl|place}} modules is: * A ''location'' (or equivalently, a ''place'') is any geographic feature (either natural or geopolitical), either on the surface of the Earth or elsewhere. Examples of types of natural places are rivers, mountains, seas and moons; examples of types of geopolitical places are cities, countries, neighborhoods and roads. A ''known location'' is specifically a location whose properties are specified in the {{tl|place}} modules; more on them below. * Specific places are identified by names, referred to as ''toponyms'' or ''placenames''. A given place will often have multiple names, and a given toponym may be ambiguous, referring to multiple possible locations. Specifically: ** There may be names including different amounts of disambiguating information (`Tucson` vs. `Tucson, Arizona` vs. `Tucson, Arizona, USA` or `New York` vs. `New York City` vs. `New York, New York`); abbreviations (`NYC` for `New York City`, `USA` for `United States of America`); ''official'' vs. ''short'' names (e.g. `Union of Soviet Socialist Republics` vs. `Soviet Union`); spelling variations (`Cracow` vs. `Krakow` vs. `Kraków`); current vs. 
former names (`Saint Petersburg` vs. `Leningrad` vs. `Petrograd`); [[exonym]]s vs. [[endonym]]s (e.g. `Tavastia Proper` vs. `Kanta-Häme`, both referring to the same administrative region in Finland); alternative names not due to any of the above reasons (`Bashkiria` vs. `Bashkortostan`); etc. In addition, each language that has an opportunity to refer to the place will have its own name, with the same sorts of variations as exist in English. ** Examples of ambiguous toponyms are `New York` (either a city or a state); `Georgia` (either a state of the US or an independent country in the Caucasus Mountains); `Paris` (either the capital of France or various small cities and towns in the US); `Mexico` (either a country, a state of that country, or the capital city of that country); and `San Antonio` (besides being a major city in Texas, it is the name of dozens of settlements of all sorts throughout the US and Latin America, and at least 181 distinct [[barangay]]s in the Philippines). * A ''placetype'' is the (or a) type that a location belongs to (e.g. `city`, `state`, `river`, `administrative region`, `[[regional county municipality]]`, etc.). ** It is common for locations to be described using multiple placetypes, and even sometimes known locations have multiple placetypes that they may be identified by (e.g. American Samoa can be identified either as an `unincorporated territory`, an `overseas territory` or just a `territory`). Both the {{tl|place}} template and the known location data allow a given location to be identified by multiple placetypes. When in doubt as to the correct placetype or placetypes for a given location, generally follow how Wikipedia describes the place. ** Some placetypes themselves are ambiguous; e.g. an ''area'' can variously refer to a top-level administrative division (specifically of Kuwait); a geographic region, generally without unambiguously defined borders; or a section of a city, similar to a neighborhood. 
The term ''district'' is similarly ambiguous. A ''[[prefecture]]'' in the context of Japan is similar to a province, but a prefecture in France is the capital of a ''[[department]]'' (which is similar to a county). Some of this ambiguity is currently handled automatically; e.g. the ambiguity of areas and districts is handled by looking at the ''holonyms'', or containing locations, specified for a given place. But sometimes it is necessary to use a qualifier before the placetype to disambiguate; for example to refer to a French prefecture, use the placetype `French prefecture` instead of just `prefecture`. (FIXME: Handle this automatically.) * A ''holonym'', in the context of a description of a place, is a placename that refers to a larger-sized entity that contains the location being described. For example, `Arizona` and `United States` are holonyms of `Tucson`, and `United States` is a holonym of `Arizona`. * A ''place invocation'' consists of the invocation of {{tl|place}}, including all its parameters. Place invocations may contain one or more ''place descriptions'', each of which provides a description of the location, including its placetype or types, any holonyms, and any additional raw text needed to properly explain the place in context. Place invocations may also contain named parameters specifying zero or more English ''glosses'' or translations (for foreign-language toponyms) and any attached ''extra information'' such as the capital, largest city, official name, modern name or full name. Multiple place descriptions in a single invocation are separated by a numbered parameter starting with a semicolon, and are used when it is necessary to provide two or more definitions of a single location for proper categorization. 
For example, [[Vatican City]] is defined both as a city-state in Southern Europe and as an enclave within the city of Rome, as follows: : {{tl|place|en|city-state|r/Southern Europe|;,|an <<enclave>> within the city of [[Rome]], [[Italy]]|cat=Places in Rome|official=Vatican City State}}. Similar things need to be done for places like [[Crimea]] that are claimed by two different countries with different definitions and administrative structures. ** There are two types of place descriptions, ''new-style'' and ''old-style''. (The use of the terms "new" and "old" indicates chronological precedence in the development of {{tl|place}}, but is not meant to pass any value judgments on the two types, and does not indicate any intent to deprecate old-style descriptions. Both types of descriptions are useful; for example, old-style descriptions are generally more succinct but less flexible.) The above invocation shows both types: an old-style description followed by a new-style description. Old-style descriptions use multiple numbered parameters, where the first parameter (after the language code) specifies the placetype or types, and following parameters specify either holonyms (which are always of the form ` ``placetype``/``placename`` `) or raw text (which is identifiable by not having a slash in it). New-style descriptions use a single parameter, where both placetypes and holonyms are surrounded by double angle brackets, and all remaining text is raw (displayed as-is). In both types of descriptions, holonyms include a slash in them to separate the placetype (which is mandatory and often abbreviated) from the placename. ** In the context of a place description, there are two types of placetypes. The ''entry placetypes'' are the placetypes of the place being described, while the ''holonym placetypes'' are the placetypes of the holonyms that the place being described is located within. Currently, a given place can have multiple placetypes specified (e.g. 
[[Normandy]] is specified using the ''compound placetype'' `administrative region/former province/and/medieval kingdom`) while a given holonym can have only one placetype associated with it. Holonym placetypes are frequently abbreviated (e.g. `r` for `region`, `s` for `state`, `co` for `county`, etc.), while stylistically it is preferred to spell out the entry placetype (except for some long placetypes with well-known abbreviations, such as `CDP` or `cdp` for `[[census-designated place]]`). ** All holonyms in place descriptions are automatically linked as if surrounded by {{tl|l|en|...}}; i.e. if double brackets do not occur in the holonym, the entire holonym will be linked to the corresponding Wiktionary article. For this reason, the holonym should generally be in the same format as the canonical Wiktionary article describing the location (see below). * A ''known location'' is a location whose properties are specifically defined in the {{tl|place}} modules. Generally each such location has an associated category, and known locations exist in a containment hierarchy, where the immediately containing known location is known as the ''container'' of the location and the chain of successive containing locations is known as the ''container trail''. Generally the location's container corresponds to the first parent of its category. Note that some known locations belong to more than one immediate container; for example, Russia belongs to both Europe and Asia. ===More about placetypes=== # The following general categories of placetypes exist: ## ''Natural features'' such as lakes, mountains, mountain ranges, islands, archipelagoes, moons, stars, asteroids, etc. ## ''Continents'', ''supercontinents'' (groupings of continents where it makes sense, such as `America` and `Eurasia`) and ''continent-level regions'' (grouping of countries in a given continent, such as `Central America` and `Polynesia`). 
## ''Political entities'', which are generally classified as either ''polities'' (top-level entities such as countries), ''subpolities'' or ''political divisions'' (non-sovereign divisions, often specifically ''administrative divisions'', of a polity, where an administrative division has a governmental or statistical function and almost always has unambiguously defined boundaries), or ''settlements'' (e.g. cities; towns; villages; and divisions of a city such as neighborhoods, wards, [[barrio]]s and [[barangay]]s, which may or may not be formal administrative divisions and may or may not have unambiguous boundaries). ## ''Geographic regions'', which refer to recognized areas of the Earth (either with a natural geographic, political or cultural significance, often of a historical nature). Such regions can be of greatly varying size, may exist either within a single country or spanning multiple countries or (more often) parts of multiple countries, and may not have well-defined boundaries. They should be distinguished from ''administrative regions'', which exist within a single country and have well-defined boundaries and a political or administrative function. Geographic regions are categorized using the generic term ''geographic and cultural areas'' to emphasize that (a) they have no administrative significance; (b) they may vary greatly in size; and (c) their cohesion is due either to natural geographic boundaries, such as rivers or mountain ranges, or to sharing some cultural characteristics. ## ''Man-made structures'' below the level of a settlement or neighborhood, such as airports, roads, individual buildings, and the like. (Note that such structures, even if named, often do not meet the [[WT:CFI]] criteria; this is particularly the case for roads.) # Placetypes support aliases, and the mapping to canonical form happens early on in the processing. 
For example, `state` can be abbreviated as `s`; `administrative region` as `adr`; `regional county municipality` as `rcomun`; etc. Some placetype aliases handle alternative spellings rather than abbreviations. For example, `departmental capital` maps to `department capital`, and `home-rule city` maps to `home rule city`. Placetype abbreviations are particularly useful in holonym specs, because every holonym must be accompanied by its placetype, for disambiguation purposes. # A ''placetype qualifier'' is an adjective prepended to the placetype to give additional information about the place being described. For example, a given place may be described as a `small city`; logically this is still a city, but the qualifier `small` gives additional information about the place. Multiple qualifiers can be stacked, e.g. `small affluent beachfront unincorporated community`, where `unincorporated community` is a recognized placetype and `small`, `affluent` and `beachfront` are qualifiers. (As shown here, it may not always be obvious where the qualifiers end and the placetype begins.) For the most part, placetype qualifiers do not affect categorization; a `small city` is still a city and an `affluent beachfront unincorporated community` is still an unincorporated community, and both should still be categorized as such. But some qualifiers do change the categorization. In particular, a `former province` is no longer a province and should not be categorized in e.g. [[:Category:Provinces of Italy]], but instead in a different set of categories, e.g. [[:Category:Historical political subdivisions]]. There are several terms treated as equivalent for this purpose: `abandoned`, `ancient`, `extinct`, `historic(al)`, `medi(a)eval` and `traditional`. Another set of qualifiers that change categorization are `fictional` and `mythological`, which cause any term using the qualifier to be categorized respectively into [[:Category:Fictional locations]] and [[:Category:Mythological locations]]. 
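The alias and qualifier handling described above can be illustrated with a minimal Lua sketch. This is not the actual code of [[Module:place/placetypes]]; the tables `placetype_aliases`, `known_placetypes` and `former_qualifiers` and the functions `canonicalize_placetype`, `split_qualifiers` and `is_former` are hypothetical names that hold only a handful of the entries mentioned in the text, and the real module may resolve qualifiers differently.

```lua
-- Hypothetical miniature data; the real tables are far larger.
local placetype_aliases = {
	s = "state",
	adr = "administrative region",
	["departmental capital"] = "department capital",
}
local known_placetypes = {
	city = true,
	state = true,
	province = true,
	["unincorporated community"] = true,
	["administrative region"] = true,
	["department capital"] = true,
}
-- Qualifiers assumed here to move a place into "historical" categories.
local former_qualifiers = {
	abandoned = true, ancient = true, extinct = true, former = true,
	historic = true, historical = true, medieval = true, mediaeval = true,
	traditional = true,
}

-- Map an abbreviation or alternative spelling to its canonical placetype.
local function canonicalize_placetype(placetype)
	return placetype_aliases[placetype] or placetype
end

-- Strip leading words one at a time until the remainder is a known
-- placetype. Returns the list of qualifiers and the canonical placetype,
-- or nil if no known placetype is found.
local function split_qualifiers(placetype)
	local qualifiers = {}
	local rest = placetype
	while not known_placetypes[canonicalize_placetype(rest)] do
		local first, remainder = rest:match("^(%S+)%s+(.+)$")
		if not first then
			return nil
		end
		table.insert(qualifiers, first)
		rest = remainder
	end
	return qualifiers, canonicalize_placetype(rest)
end

-- True if any qualifier marks the place as no longer existing as such.
local function is_former(qualifiers)
	for _, qualifier in ipairs(qualifiers) do
		if former_qualifiers[qualifier] then
			return true
		end
	end
	return false
end
```

Under these assumptions, `split_qualifiers("small affluent beachfront unincorporated community")` yields the qualifiers `small`, `affluent` and `beachfront` with the placetype `unincorporated community`, while `split_qualifiers("former province")` yields a qualifier list for which `is_former` returns true, signalling that the term belongs in a historical category rather than among current provinces.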
===More about toponyms===
# Toponyms may be:
## ''simple'' (not including any containing location in its name, such as `Tucson`) or ''multipart'' (including one or more containing locations, such as `Tucson, Arizona` or `Tucson, USA` or even `Tucson, Arizona, USA`);
## ''bare'' (not including the word `the` if the location normally requires this article when following a preposition, such as `United States`, `Gambia` or `Community of Madrid`) or ''prefixed'' (including the word `the` as needed, such as `the United States`, `the Gambia` or `the Community of Madrid`);
## ''elliptical'' (just the placename without any disambiguating placetype, such as `Durham`, `New York` or `Mexico`) or ''full'' (containing a disambiguating placetype or similar identifier if one is commonly included, such as the city of `Durham` (in England) vs. its containing county `County Durham`; the US city `New York City` vs. its containing state `New York`; or the three-way distinction between `Mexico` (the country), `Mexico City` (the capital of this country) and `(the) State of Mexico` (one of the states of the country Mexico, mostly surrounding but not including Mexico City)).
# The ''canonical Wiktionary article'' is the main article on Wiktionary where a location is described. Canonical articles, per the above terminology, are generally ''simple'' and ''bare'', but may be either ''full'' or ''elliptical''. The fact that a given article is canonical is often identifiable by the fact that translations are housed there and not somewhere else. For example, most counties of the US and Canada include the word `County` in their canonical article name, but most counties elsewhere do not. `Washington, D.C.` is one of the few cases where a non-simple toponym is used as the canonical article; this is based on common usage, especially by residents of the city in question (who commonly refer to it as "D.C." but rarely just as "Washington").
===More about known locations===
# The following types of known locations are defined in this module:
## Continents, supercontinents and continent-level regions, into which countries are grouped. Specifically:
### At the top level below `Earth` are the supercontinents `America` and `Eurasia` and the continents `Africa`, `Oceania` and `Antarctica`.
### `America` is further broken down into the continents `North America` (in turn containing the continental regions `Central America` and `Caribbean`, with the United States, Canada and Mexico directly under North America) and `South America`.
### `Eurasia` is further broken down into the continents `Europe` and `Asia`.
### `Oceania` is further broken down into the continental regions `Melanesia`, `Micronesia` and `Polynesia`, with `Australia` directly under `Oceania`.
### Under the above-specified divisions are countries. Some countries are placed in more than one continent or continent-level region, either because they actually span two continents (e.g. Russia, Turkey, Kazakhstan, Egypt) or because they are politically considered to belong to a continent different from the one they are geographically in (Cyprus, Georgia, Armenia, etc.).
## Political entities, including:
### Top-level political entities, which include:
#### Countries, with a fairly liberal definition, notably including all UN-recognized countries plus some others that are commonly considered countries, even if not all other countries recognize them as such or consider them completely independent (notably, Kosovo, Palestine, Taiwan, Western Sahara, Niue and the Cook Islands).
#### Pseudo-countries, which include areas calling themselves countries that are de facto not under the control of the country that they are internationally considered part of (e.g. Abkhazia, South Ossetia, Transnistria); dependent/external/etc. territories of countries (e.g.
American Samoa [US], Bermuda [UK], Christmas Island [Australia], Easter Island [Chile]); constituent countries, autonomous territories and the like (Aruba, Curaçao and Sint Maarten of the Netherlands; Greenland and the Faroe Islands of Denmark; etc.; but notably not including England, Scotland, Northern Ireland and Wales, which are treated as regular countries); and a grab bag of other entities that have a semi-independent existence, such as Hong Kong, Macau, Guadeloupe, Martinique and the like. Currently, the actual distinction in treatment between "countries" and "country-like entities" is minimal, but in the future we might restrict the sorts of subcategories of country-like entities more than regular countries.
#### Former countries, e.g. the Soviet Union, Yugoslavia, West Germany and the Roman Empire. These are much more limited in the sorts of subcategories allowed, because generally locations, especially cities, should be described from the perspective of which political entity they are currently located in (e.g. "an ancient Roman town in modern Syria") and categorized as such.
### Subpolities. Generally we only list top-level administrative divisions of countries (and only fairly major countries are usually included), but sometimes we list second-level administrative divisions, as in the case of the United Kingdom (where the top-level administrative divisions of the four constituent countries are listed) and China (where major prefecture-level cities are listed, and are considered administrative divisions rather than cities).
### Cities. Only major cities get categories, with the definition of "major" varying by country but often including those where the city population itself (sometimes the metro area) is >= 1,000,000 people.
# A distinction should be made in the {{tl|place}} modules between ''keys'' and ''placenames''.
Placenames are the form in which a location appears in a holonym, and are generally in the same format as the canonical Wiktionary article describing the location so that when formatted as a link, the link goes to the right article; i.e. they are simple and bare, and may be full or elliptical according to Wiktionary conventions. The ''canonical key'' of a location is how the location's category is named, and always uniquely identifies the location from among the known locations in this module (but not necessarily among all possible locations). In particular, subpolities usually have multipart keys that include the containing location, such as `Anhui, China` (not just `Anhui`); `Arizona, USA` (not just `Arizona`, and also not `Arizona, United States`); and `Herefordshire, England` (not just `Herefordshire`, and also in this case not `Herefordshire, UK` or `Herefordshire, England, UK` or any other possible variation). Cities are normally simple, but some cities are multipart for disambiguation purposes (e.g. `Newcastle, New South Wales` for the city in Australia vs. `Newcastle upon Tyne` for the identically-named city in England). Canonical keys may have ''key aliases'', other ways of referring to the location that are not necessarily unique (e.g. `Newcastle` is a key alias for both of the above-mentioned cities), and city keys with diacritics generally have diacriticless aliases, such as canonical key `Düsseldorf` vs. key alias `Dusseldorf`, or canonical key `Łódź` vs. key alias `Lodz`.
# Known locations are gathered into ''groups'' with similar properties, such as all the states of the United States; all the (ceremonial) counties of England (see below); and all the "sufficiently major" prefecture-level cities in China (where a prefecture-level city is a prefecture surrounding a major city with a unified government and is more like a prefecture, i.e.
a major administrative division just underneath a province, than like a city, and where "sufficiently major" is defined according to the population of either the total prefecture or the urban area of the city). Note that there are multiple types of counties in England, with overlapping but non-identical names and boundaries; there are, in particular, ''ceremonial counties'', ''local government counties'' and ''historic counties''; ''ceremonial counties'' have only ceremonial administrative functionality but unlike local government counties (a) don't frequently change their boundaries or nature, (b) correspond more closely to historic county boundaries and names, and (c) are what Englanders usually identify themselves with, and so they are used as top-level divisions rather than local government counties.
# Some known locations have ''aliases'' defined, which are of two types. ''Display aliases'' map holonyms to their canonical form near the beginning of processing (in particular before the displayed output is formatted). For example, `US`, `U.S.`, `USA`, `U.S.A.` and `United States of America` are all canonicalized to `United States` (if identified as a country), and display as `United States`. Similarly, the foreign forms `Occitanie` (as a region or administrative region) and `Noord-Brabant` (as a province) are mapped to `Occitania` and `North Brabant` for display purposes. There are also ''category aliases'', so that if e.g. `Republic of Macedonia` is encountered, it will display as such but categorize as `North Macedonia`. (This is because, among other reasons, `Republic of Macedonia` is normally preceded by `"the"` while `North Macedonia` is not, so a call {{tl|place|en|a <<city>> in the <<c/Republic of Macedonia>>}} would look wrong if `Republic of Macedonia` were converted to `North Macedonia` during display, as the result would be `a city in the North Macedonia`. There are also frequently political connotations to different category aliases, e.g. `Burma` vs.
`Myanmar`.) All of these aliases are sensitive to the placetype specified. For example, `Mexico` as a state is categorized under `State of Mexico, Mexico` but `Mexico` the country is categorized as just `Mexico`.

===Categories===
There are two main types of categories:
# Categories for known locations, divided into:
## Top-level polity categories (e.g. [[:Category:United States]], [[:Category:Taiwan]], [[:Category:South Ossetia]], [[:Category:Bermuda]], [[:Category:Soviet Union]], [[:Category:West Germany]]).
## Subpolity categories ([[:Category:Arizona, USA]], [[:Category:Hunan]], [[:Category:Kagoshima Prefecture]], [[:Category:Cluj County, Romania]]). For historical reasons, different formats are used for the subpolities of different polities. Increasingly, we are moving towards always including the polity name in the subpolity category, but whether the subpolity type is included and where it is included (cf. [[:Category:Cluj County, Romania]] vs. [[:Category:County Cork, Ireland]]) is still inconsistent and will probably remain that way, based on how the subpolity is normally referred to.
## City categories ([[:Category:Tokyo]], [[:Category:New York City]], [[:Category:Jaipur]]). Normally these do not include the containing subpolity, but may do so in order to disambiguate.
# Categories for placetypes, divided into:
## "Immediate" political and non-political division categories ([[:Category:States of the United States]], [[:Category:Municipalities of Tocantins, Brazil]], [[:Category:Ghost towns in Arizona, USA]]). These are name categories, whose purpose is to contain locations of the specified type. "Immediate" here refers to the fact that the location in the category name is the immediately-containing polity. Usually these categories use the preposition "of", but sometimes "in".
(Specifically, "of" typically implies that the placetype in question has an official or semi-official status, whereas "in" implies there is no such official status, but common usage may override this.) The form of the toponym appearing in these categories is always the same as that of the corresponding toponym category except that the word "the" may appear (e.g. [[:Category:States of the United States]]), whereas it doesn't appear in the toponym category itself ([[:Category:United States]], no "the").
## "Skip-polity" categories for second-level political and non-political divisions of a country or other top-level polity (e.g. [[:Category:Counties of the United States]], [[:Category:Municipalities of Brazil]] and [[:Category:Subprefectures of Japan]]). These have several purposes:
* They group the immediate division categories mentioned previously.
* They categorize "straggler" toponyms that (often improperly) fail to mention the subpolity they belong to, but only the top-level polity.
* If categories do not exist for the first-level divisions of a country (and sometimes even when they do), they group all toponyms of the specified type for the specified country. For example, Lithuania is divided into first-level counties and second-level municipalities, but since we don't currently have categories for Lithuanian counties, all municipalities go under [[:Category:Municipalities of Lithuania]] rather than under a category for a specific county. In addition, even though we do have categories for Japanese prefectures (a first-level division), all subprefectures (a second-level division) go under [[:Category:Subprefectures of Japan]] because there aren't very many of them (see below).
## "Generic placetype" categories, both of the immediate and skip-polity type (immediate [[:Category:Cities in California, USA]] and [[:Category:Neighborhoods of the Bronx]]; skip-polity [[:Category:Villages in Ivory Coast]], [[:Category:Geographic and cultural areas of England]], [[:Category:Rivers in Egypt]] and [[:Category:Places in the Philippines]]). As mentioned above, "generic" placetypes occur in every polity (although the set of generic placetypes allowed for cities is a subset of those allowed for top-level polities and subpolities). Usually these categories use the preposition "in", but sometimes "of". As above, skip-polity categories group immediate categories, and in addition there are various reasons a toponym entry is categorized into a skip-polity category. (For example, as a general rule, geographic and cultural areas only categorize at the country level, not the subpolity level, both because there often aren't very many in a given country and because they often span multiple subpolities.)

The parent categories of a given category depend on its type. Generally, location categories have placetype categories as their first parent, and vice-versa. Specifically:
# Top-level country categories have as their parent e.g. [[:Category:Countries in Europe]], [[:Category:Countries in Central America]] or [[:Category:Countries in Polynesia]], using the most specific continental-level region the country is contained in.
# Pseudo-countries are under [[:Category:Country-like entities]] as a neutral designation. There aren't enough of them to subcategorize under continent-level regions.
# Former countries are under [[:Category:Former countries and country-like entities]].
# Subpolity categories are usually under a placetype category whose placetype is the canonical (first-listed) placetype of the subpolity and whose toponym is the immediately containing polity, but there are exceptions.
Specifically, sometimes if a polity has multiple types of subpolities, they are combined (e.g. [[:Category:States and territories of Australia]], [[:Category:Federal subjects of Russia]]). In addition, sometimes a less specific but more identifiable placetype is used instead of the canonical one (e.g. [[:Category:Regions of France]] when the canonical placetype is "administrative region"). The same rules and exceptions generally apply when categorizing subpolities themselves; e.g. both the Australian state of Queensland and territory of Northern Territory go under [[:Category:en:States and territories of Australia]] rather than separately under [[:Category:en:States of Australia]] and [[:Category:en:Territories of Australia]]. In addition, sometimes subpolities may "skip a level" if there aren't very many. For example, there are only 26 subprefectures of Japan (14 under Hokkaido and 12 more scattered under five other prefectures). Rather than have e.g. [[:Category:en:Subprefectures of Kagoshima Prefecture]] containing at most two entries and [[:Category:en:Subprefectures of Miyazaki Prefecture]] containing at most one, they are all grouped under the so-called "skip-subpolity category" [[:Category:en:Subprefectures of Japan]].
# City categories are always under e.g. [[:Category:Cities in the United States]] (e.g. [[:Category:New York City]] is so-placed, even though [[:Category:Cities in New York, USA]] exists). However, they may have a second, more-specific parent (e.g. [[:Category:Cities in New York, USA]] in the case of New York City). The city entries themselves will go under the more specific parent if it exists.
# Immediate placetype categories for second-level divisions of a country generally have, respectively, a "toponym parent" that is the toponym mentioned in the category and a "skip-polity parent" that groups all subpolity placetype categories of a specific type and containing polity.
For example, [[:Category:Counties of Arizona, USA]] has toponym parent [[:Category:en:Arizona, USA]] and skip-polity parent [[:Category:en:Counties of the United States]]. Sometimes the default skip-polity parent is overridden or disabled entirely. For example, in the US, most states are divided into counties but Louisiana is divided into parishes and Alaska into boroughs. It would make no sense to put [[:Category:Parishes of Louisiana, USA]] under [[:Category:Parishes of the United States]] (which would only have one subcategory), so we include them under [[:Category:Counties of the United States]]. An alternative would be to name the skip-polity category to explicitly include parishes and boroughs; this would get awkward here but is done in some cases. Similarly, [[:Category:Regional county municipalities of Quebec]] is placed under [[:Category:Regional municipalities of Canada]] since that name is used in other provinces. Meanwhile, [[:Category:Regional districts of British Columbia]] disables its skip-polity category since no other province or territory of Canada has regional districts or comparable subpolities under a different name (an alternative would be to place them under [[:Category:Counties of Canada]], since they are sort of comparable to counties).
# Placetype categories for first-level divisions of a country similarly (e.g. [[:Category:States of the United States]]) have a toponym parent (in this case [[:Category:United States]]), but in place of the skip-polity parent they have two other parents: a "bare placetype" parent (in this case [[:Category:States]]) and the "generic" parent [[:Category:Political divisions of specific countries]]. (There is also a bare [[:Category:Political divisions]] that groups "bare placetype" categories.) Skip-polity placetype categories for second-level divisions of a country (e.g. [[:Category:Counties of the United States]]) work the same.
Placetype categories for countries work likewise except they are missing the generic parent.

===Place descriptions===
A given place description is defined internally in a table of the following form:
```{
    placetypes = {"``placetype``", "``placetype``", ...},
    holonyms = {
        { -- holonym object; see below
            placetype = "``placetype``" or nil,
            display_placename = "``placename``",
            unlinked_placename = "``placename``",
            langcode = "``langcode``" or nil,
            no_display = BOOLEAN,
            needs_article = BOOLEAN,
            force_the = BOOLEAN,
            affix_type = "``affix_type``" or nil,
            pluralize_affix = BOOLEAN,
            suppress_affix = BOOLEAN,
            continue_cat_loop = BOOLEAN,
        },
        ...
    },
    order = { ``order_item``, ``order_item``, ... }, -- (only for new-style place descriptions)
    joiner = "``joiner_string``" or nil,
    holonyms_by_placetype = {
        ``holonym_placetype`` = {"``placename``", "``placename``", ...},
        ``holonym_placetype`` = {"``placename``", "``placename``", ...},
        ...
    },
}```

Holonym objects have the following fields:
* `placetype`: The canonicalized placetype if specified as e.g. `c/Australia`; nil if no slash is present (in which case the placename in `display_placename` refers to raw text).
* `display_placename`: The placename or raw text, in the format to be displayed. Placename display aliases have already been resolved. It is raw text if `placetype` is nil.
* `unlinked_placename`: Same as `display_placename` but with links and HTML removed.
* `langcode`: The language code prefix if specified as e.g. `c/fr:Australie`; otherwise nil.
* `no_display`: If true (holonym prefixed with !), don't display the holonym but use it for categorization.
* `needs_article`: If true, prepend an article if the placename needs one (e.g. `United States`).
* `force_the`: If true, always prepend the article `the`. Example use: holonym `city:pref:the/Gold Coast`, which gets formatted as `(the) city of the [[Gold Coast]]`.
* `affix_type`: Type of affix to prepend (values `pref` or `Pref`) or append (values `suf` or `Suf`).
The actual affix added is the placetype (capitalized if values `Pref` or `Suf` are given), or its plural if `pluralize_affix` is given. Note that some placetypes (e.g. `district` and `department`) have inherent affixes displayed after (or sometimes before) them.
* `pluralize_affix`: Pluralize any displayed affix. Used for holonyms like `c:pref/Canada,US`, which displays as `the countries of Canada and the United States`.
* `suppress_affix`: Don't display any affix even if the placetype has an inherent affix. Used for the non-last placenames when there are multiple and a suffix is present, and for the non-first placenames when there are multiple and a prefix is present.
* `continue_cat_loop`: If true (holonym used `:also`), continue producing categories starting with this holonym when preceding holonyms generated categories.

Note that new-style place descs (those specified as a single argument using `<<...>>` to denote placetypes, placetype qualifiers and holonyms) have an additional `order` field to properly capture the raw text surrounding the items denoted in double angle brackets. The ``order_item`` items in the `order` field are objects of the following form:
```{
    type = "``order_type``",
    value = "STRING" or INDEX,
}```
Here, the ``order_type`` is one of `"raw"`, `"qualifier"`, `"placetype"` or `"holonym"`:
* `"raw"` is used for raw text surrounding `<<...>>` specs.
* `"qualifier"` is used for `<<...>>` specs without slashes in them that consist only of qualifiers (e.g. the spec `<<former>>` in `<<former>> French <<colony>>`).
* `"placetype"` is used for `<<...>>` specs without slashes that do not consist only of qualifiers.
* `"holonym"` is used for holonyms, i.e. `<<...>>` specs with a slash in them.
For all types but `"holonym"`, the value is a string, specifying the text in question. For `"holonym"`, the value is a numeric index into the `holonyms` field.
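
As a rough illustration of how the `order` field ties raw text, qualifiers, placetypes and holonyms together, the sketch below walks an `order` list and reassembles a plain display string. It is not the module's actual formatting code: the function name `render_order` is hypothetical, the sample description (including the `cont/Africa` holonym and the exact whitespace in the raw items) is assumed for the example, and links, articles and affixes are deliberately ignored.

```lua
-- Hypothetical parsed form of "<<former>> French <<colony>> in <<cont/Africa>>".
local desc = {
    placetypes = {"colony"},
    holonyms = {
        { placetype = "continent", display_placename = "Africa", unlinked_placename = "Africa" },
    },
    order = {
        { type = "qualifier", value = "former" },
        { type = "raw", value = " French " },
        { type = "placetype", value = "colony" },
        { type = "raw", value = " in " },
        { type = "holonym", value = 1 }, -- numeric index into desc.holonyms
    },
}

-- Concatenate the items in display order, resolving holonym indices.
local function render_order(place_desc)
    local parts = {}
    for _, item in ipairs(place_desc.order) do
        if item.type == "holonym" then
            local holonym = place_desc.holonyms[item.value]
            -- Holonyms flagged no_display are kept for categorization only.
            if not holonym.no_display then
                table.insert(parts, holonym.display_placename)
            end
        else
            -- "raw", "qualifier" and "placetype" items carry their text directly.
            table.insert(parts, item.value)
        end
    end
    return table.concat(parts)
end
```

For the sample description, `render_order(desc)` produces `former French colony in Africa`. Storing an index rather than a copy of the holonym means that any later changes to the `holonyms` list entries are picked up automatically when the description is rendered.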
It should be noted that placetypes and placenames occurring inside the holonyms structure are canonicalized, but placetypes inside the placetypes structure are as specified by the user. Stripping off of qualifiers and canonicalization of qualifiers and bare placetypes happens later.

The information under `holonyms_by_placetype` is redundant to the information in holonyms but makes categorization easier. The holonym placenames listed here already have category aliases applied.

For example, the call {{tl|place|en|city|s/Pennsylvania|c/US}} will result in the return value
```{
    placetypes = {"city"},
    holonyms = {
        { placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania" },
        { placetype = "country", display_placename = "United States", unlinked_placename = "United States" },
    },
    holonyms_by_placetype = {
        state = {"Pennsylvania"},
        country = {"United States"},
    },
}```
Here, the placetype aliases `s` and `c` have been expanded into `state` and `country` respectively, and the placename display alias `US` has been expanded into `United States`. `placetypes` is a list because there may be more than one. For example, the call {{tl|place|en|city/and/municipality|p/[[Kwango]] Province|c/Congo}} will result in the return value
```
{
    placetypes = {"city", "and", "municipality"},
    holonyms = {
        { placetype = "province", display_placename = "[[Kwango]] Province", unlinked_placename = "Kwango Province" },
        { placetype = "country", display_placename = "Congo", unlinked_placename = "Congo" },
    },
    holonyms_by_placetype = {
        country = {"Congo"},
    },
}```
Here, the `unlinked_placename` field has removed links from `display_placename`. The value in the key/value pairs is likewise a list; e.g.
the call {{tl|place|en|city|s/Kansas|and|s/Missouri}} will return
```
{
    placetypes = {"city"},
    holonyms = {
        { placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
        { display_placename = "and", unlinked_placename = "and" },
        { placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
    },
    holonyms_by_placetype = {
        state = {"Kansas", "Missouri"},
    },
}
```
Note that in `get_cats()` (which runs after the display form has been generated), further changes to the holonym structure are made to aid in categorization. For example, after `handle_category_implications()` and `augment_holonyms_with_container()` are called, the above structure will look more like
```
{
    placetypes = {"city"},
    holonyms = {
        { placetype = "state", display_placename = "Kansas", unlinked_placename = "Kansas" },
        { placetype = "country", unlinked_placename = "United States" },
        { display_placename = "and", unlinked_placename = "and" },
        { placetype = "state", display_placename = "Missouri", unlinked_placename = "Missouri" },
        { placetype = "country", unlinked_placename = "United States" },
    },
    holonyms_by_placetype = {
        state = {"Kansas", "Missouri"},
        country = {"United States"}
    },
}
```

===Overall place specs===
The overall place spec parsed by `parse_overall_place_spec` has the following fields:
* `lang`: The language object (from {{para|1}}).
* `args`: The parsed arguments from the {{tl|place}} call.
* `directives`: List of form-of directives (starting with `@`) parsed from the numeric args beginning with {{para|2}}. Each directive contains fields `directive` (the directive as specified by the user, e.g.
`"former name of"`); `terms` (list of term objects for the terms specified by the user); `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}); `spec` (the corresponding directive spec from `all_form_of_directives`); `pretext` (the text to display directly before the directive); `posttext` (the text to display directly after the directive; {nil} except for the last directive).
* `descs`: List of one or more place description objects parsed from the numeric args beginning with {{para|2}}, as described above.
* `extra_info`: List of extra-info objects for extra info specified using arguments such as {{para|capital}}, {{para|modern}}, etc. Objects are in the order they should be displayed, and each object contains fields `spec` (the spec for the type of extra info, taken from `export.extra_info_args`), `terms` (list of term objects for the terms specified by the user); and `conj` (conjunction specified by the user using inline modifier `<conj:...>`, or {nil}).

===Category determination===
The algorithm to find the categories to which a given place belongs works off of a place description (which specifies the entry placetype(s) and holonym(s); see above). If there are multiple place descriptions, each is processed independently to generate categories. Likewise, if there are multiple entry placetypes in a given place description, each is processed independently with all the holonyms of the description to generate categories. Furthermore, before the category-generation algorithm runs, earlier steps have modified the holonyms of the place description (inserting containing polities whenever possible; see the description above of `handle_category_implications()` and `augment_holonyms_with_container()`).

Given a single entry placetype and a place description, the algorithm to generate categories processes holonyms from left to right until it finds one that "matches" in that it produces one or more categories.
At that point it attempts to generate categories for all other holonyms in the place description of the same placetype. Normally, it then stops processing holonyms, but if a holonym is marked using the `:also` modifier, the category generation process starts over starting with that holonym (or the leftmost such remaining holonym, if there is more than one marked with `:also`). This makes it possible, for example, to specify the description of a river that passes through two different types of political divisions (e.g. Alberta and the Northwest Territories), or categorize a geographic region at both the continent and country level, such as this:
<pre>
{{place|en|historical region|r/Eastern Europe|located in southeastern|c:also/Poland|*and western|c/Ukraine}}
</pre>
Here, `r/Eastern Europe` has a category implication that adds `cont/Europe` as a holonym directly after it, which causes the page to be categorized into [[:Category:en:Geographic and cultural areas of Europe]]. The category generation process would normally stop at this point, but the presence of `:also` causes it to restart with `c/Poland` and generate the category [[:Category:en:Geographic and cultural areas of Poland]]. After doing this, it looks for other holonyms of the same placetype as `c/Poland` (i.e. other countries), which causes it to process `c/Ukraine` and generate the category [[:Category:en:Geographic and cultural areas of Ukraine]].

The category generation process works off of the `placetype_data` table, which specifies various properties for placetypes, such as how to display a holonym of that placetype as well as how to categorize certain pages where the {{tl|place}} call contains the specified placetype as an entry placetype.
For example, the entry for `city-state` in [[Module:place/placetypes]] might look like
```
["city-state"] = {
    link = true,
    category_link = "[[sovereign]] [[microstate]]s consisting of a single [[city]] and [[w:dependent territory|dependent territories]]",
    has_neighborhoods = true,
    class = "settlement",
    ["continent/*"] = {"City-states", "Cities", "Countries", "Countries in +++", "National capitals"},
    default = {"City-states", "Cities", "Countries", "National capitals"},
},
```
Here, the keys specify, respectively:
# If `city-state` occurs as an entry placetype, link it to the corresponding Wiktionary entry (that is what `true` means in `link = true`).
# Use the specified `category_link` text for categories such as [[:Category:City-states]].
# City-states are "city-like", i.e. they have neighborhoods; this controls the handling of entry placetypes such as `neighborhood`, `district`, `area`, etc.
# City-states should be treated as settlements for determining how to handle the placetype `former city-state` and for categorizing the bare category [[:Category:City-states]] and language-specific equivalents such as [[:Category:en:City-states]].
# When the entry placetype `city-state` occurs along with a continent holonym, categorize into the specified categories under `continent/*`. Here, `+++` stands for the holonym in question.
# When the entry placetype `city-state` occurs in any other context, categorize into the specified categories under `default`.

It's important to realize that the only categorization keys under a given placetype entry that are specified explicitly in [[Module:place/placetypes]] are certain wildcard keys such as `continent/*` above (i.e. containing a slash followed by `*`) and under the key `default`. All the remaining categorization happens through category handlers, based on the information on known locations in [[Module:place/locations]].
For example, [[Module:place/locations]] has an "England group" specified similarly to the following:
```
export.england_group = {
    default_container = {key = "England", placetype = "constituent country"},
    default_placetype = "county",
    default_divs = {
        "districts",
        {type = "local government districts", cat_as = "districts"},
        {
            type = "local government districts with borough status",
            cat_as = {"districts", "boroughs"},
        },
        {type = "boroughs", cat_as = {"districts", "boroughs"}},
        "civil parishes",
    },
    default_british_spelling = true,
    data = export.england_counties,
}
```
The `default_divs` key here specifies the divisions that exist for each of the counties listed under the `data` key (unless the key overrides them). Here, the entry `{type = "boroughs", cat_as = {"districts", "boroughs"}}` directs the category handler `political_division_cat_handler` in [[Module:place/placetypes]] (which is one of two category handlers that run for all entry placetypes, along with `generic_place_cat_handler`) to categorize boroughs specified under any of the counties listed under `data` as both districts and boroughs.

Now, the categorization process proceeds as follows, given an entry placetype and place description, which specifies a set of holonyms (the code to do this is in `get_placetype_cats()`):
# First, look up the entry placetype and any equivalent placetypes in `placetype_data`, which is defined in [[Module:place/placetypes]]. Note that the entry in `placetype_data` that specifies the placetype information that is used to determine the category or categories may not directly correspond to the entry placetype as specified in the place description. For example, if the entry placetype is `small town`, the placetype whose data is fetched will be `town` since `small` is a recognized qualifier and there is no entry in `placetype_data` for `small town`.
As another example, if the entry placetype is `administrative capital`, the code will first look up `administrative capital` and then look up `capital city`, which is where the category handler is found, because `administrative capital` specifies `capital city` as its fallback. # Then, iterate over holonyms from left to right, as described above. For each holonym, we proceed as follows: ## First, call `political_division_cat_handler` to check if the entry placetype and holonym match a division in the `locations` data in [[Module:place/locations]], as in the example above. Note that when doing this, holonyms are canonicalized so that e.g. `co/Bedfordshire` gets mapped to `county/Bedfordshire` (because there is an entry in `placetype_aliases` in [[Module:place/placetypes]] that maps `co` to `county`) and `c/USA` gets mapped to `country/United States` (because there is an entry in the location data for the list of countries that maps `country/USA` to `country/United States` for both display and categorization purposes). This category handler, as with all such handlers, is passed the entry placetype and holonym being processed, but is also passed the entire place description, so it can look at other specified holonyms (particularly those that follow). It either returns {nil} or a list of category specs (which are the actual categories minus the preceding language code). ## If `political_division_cat_handler` doesn't generate any categories, check if there is a category handler defined using the `cat_handler` key for the entry placetype. If so, call it to generate the categories (if any). ## If the category handler returns {nil}, or there is no category handler, look for a ''wildcard key'' of the format e.g. `country/*`, which matches any holonym of placetype `country`. If found, the value is a list of category specs, which are processed as above. ## If we get this far without generating any categories, move to the next holonym. 
## If we do generate any categories, process all other holonyms of the same placetype. For example, if the user says {{tl|place|en|city|s/Kansas|and|s/Missouri}}, when we get to the holonym `s/Kansas`, we generate the category [[:Category:en:Cities in Kansas, USA]]. This causes us to look for other holonyms of the same placetype `state`, and process them accordingly, generating a category [[:Category:en:Cities in Missouri, USA]] as well. The same thing happens in an invocation like {{tl|place|pl|river|c/Poland,Ukraine,Belarus}}. # Once we generate categories for a holonym and any other holonyms of the same placetype, we normally stop processing holonyms. But if a holonym has the `:also` modifier, we restart the left-to-right loop at that holonym. For example, in the invocation {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr/Northwest Territories}}, we will generate a category [[:Category:en:Rivers in Alberta, Canada]] as well as [[:Category:en:Rivers in British Columbia, Canada]] (because British Columbia is of the same placetype as Alberta); but no category will be generated for the Northwest Territories, which is of a different placetype. To fix this, write {{tl|place|en|river|flowing through|p/Alberta|p/British Columbia|and the|terr:also/Northwest Territories}}. The use of `:also` will cause holonym processing to resume at `Northwest Territories` after `Alberta` is processed, leading to an additional category [[:Category:en:Rivers in the Northwest Territories, Canada]]. (The presence of `the` in this last category is because `Northwest Territories` is a known location with a spec indicating that it should be preceded by `the`; it has nothing to do with the raw text `and the` in the invocation.) # Finally, if we process all holonyms and don't end up producing any categories, we check the entry placetype's data for a `default` key. If found, it lists category specs, which are processed to generate categories. 
This is used, for example, in the placetype `city-state`, as described above. # It should be noted that the above process runs independently for each combination of entry placetype and place description. Thus, for example, an invocation {{tl|place|en|city/and/county|s/Kansas,Missouri|c/USA}} will generate categories for both cities and counties in both Kansas and Missouri. # Two additional sources of categories are ''bare location'' categories and ''generic place'' categories. These categories are added by appropriate calls in the outer function `get_cats`, which iterates over placetypes and place descriptions, calling `get_placetype_cats` on each combination. ## Bare location categories are categories like [[:Category:Arizona, USA]] that are related-to categories containing terms related to the specified location. The bare location code, for example, adds the term [[Arizona]], and its equivalents in other languages, to [[:Category:Arizona, USA]]. When looking for terms to consider, it checks the pagename, the glosses specified using {{para|t}}, and the terms specified using {{para|modern}}, {{para|short}} and {{para|full}}. 
It looks to see if any of these parameters match any known locations, but only adds them to a bare location category if (a) the specified entry placetype matches, so that for example Russian `[[Джорджия]]` goes into [[:Category:Georgia, USA]] while `[[Грузия]]` goes into [[:Category:Georgia]] (the country), even though both have a gloss `Georgia`; and (b) there are no conflicting holonyms, so that for example the Old English term [[Munucceaster]] if defined similarly to {{tl|place|ang|city|in modern|cc/England|t=Newcastle}} won't get added to [[:Category:Newcastle, New South Wales]] (even though it is also a city) because the latter city is known to be in Australia, which conflicts with the country `United Kingdom` (added internally to the Old English place description through the holonym augmentation process, based on the holonym `cc/England`). ## Generic place categories are categories like [[:Category:Places in Kansas, USA]] and [[:Category:Places in England]] that contain places of arbitrary placetype. These are added through a special category handler that operates like other category handlers but is run for all placetypes, rather than only for the specified one(s). ]==] --[=[ TODO/FIXME: 1. [DONE] Neighborhoods should categorize at the city level. Categories like [[:Category:Places in Los Angeles]] exist but not [[:Category:Neighborhoods in Los Angeles]]; we can refactor the code in generic_cat_handler() to support this use case. 2. Display handlers should be smarter. For example, 'co/Travis' as a holonym should display as 'Travis County' in the United States, but (I think) display handlers don't currently have the full context of holonyms passed in to allow this to happen. 3. Connected to this, we have various display handlers that add the name of the holonym after or (sometimes) before the placename if it's not already there. 
An example is the county_display_handler() in [[Module:place/placetypes]], which adds "County" before Ireland and Northern Ireland counties and after Taiwan and Romania counties. This should be integrated into the polity group for these respective polities through a setting rather than requiring a separate handler that has special casing for various polities. 4. Placetypes for toponyms should also have display handlers rather than just fixed text. This should allow us to dispense with the need for special types for "fpref" = "French prefecture" (which displays as "prefecture" but links to the appropriate Wikipedia article on French prefectures, which are completely different from the more general concept of prefecture). Similarly for "Polish colony" and "Welsh community". ("Israeli settlement" should probably stay as-is because it displays as "Israeli settlement" not just "settlement".) 5. [DONE] Currently, categories for e.g. states and territories of Australia go into [[:Category:States and territories of Australia]] but terms for states and territories of Australia go into (respectively) [[:Category:States of Australia]] and [[:Category:Territories of Australia]]. We should fix this; maybe this is as easy as setting cat_as in the respective divs definitions. 6. Probably cat_as should support raw categories as well as category types; raw categories would be indicated by being prefixed with "Category:". 7. [MOSTLY DONE] Update documentation. 8. [DONE] Rename remaining political division categories to include name of country in them. 9. [DONE] Add Pakistan provinces and territories. 10. [DONE] Add a polity group for continents and continent-level regions instead of special-casing. This should make it possible e.g. to have Jerusalem as a city under "Asia". 11. [DONE] Add better handling of cities that are their own states, like Mexico City. 12. [DONE] Breadcrumb for e.g. [[Category:Aguascalientes, Mexico]] is "Aguascalientes, Mexico" instead of just "Aguascalientes".
13. [DONE] Unify aliasing system; cities have a completely different mechanism (alias_of) vs. polities/subpolities (which use `placename_cat_aliases` and `placename_display_aliases` in [[Module:place/placetypes]]). 14. [DONE] More generally, cities should be unified into the polity grouping system to the extent possible; this would allow for divs of cities (see #17 below). 15. [DONE] We have `no_containing_polity_cat` set for Lebanon, Malta and Saudi Arabia to prevent country-level implications from being added due to generically-named divisions like "North Governorate", "Central Region" and "Eastern Province" but (a) this setting seems to do multiple things and should be split, (b) it should be possible to set this at the division level instead of the country level. 16. Split out the data from the handlers so we can use loadData() on the data because it's becoming very big. 17. [DONE] Cities like Tokyo have special wards; "prefecture-level cities" like Wuhan (which aren't really cities but we treat them as such) have districts, subdistricts, etc. We need to support divs for cities and even named divisions of cities (such as we already have for boroughs of New York City). 18. [DONE] It should be allowed to set 'true' to any qualifier (which links it) and have it work correctly; qualifier lookup in [[Module:place]] needs to remove links first. 19. [DONE] Categories 'Historical polities' and 'Historical political subdivisions' should be renamed 'Former ...' since "historic(al)" is ambiguous (cf. "historic counties" in England which are not former, but still have a legal definition). 20. [PARTLY DONE; SUPPORT IS THERE BUT FORMER PROVINCES NOT YET CATEGORIZED] It should be possible to categorize former subpolities of certain polities; cf. [[:Category:ja:Provinces of Japan]], which contains former provinces. 21.
[DONE] In subpolity_keydesc(), we need to generate the correct indefinite article and have a huge hack to check specifically for "union territory", which is the only placetype that shows up in this function where the default indefinite article generating function fails. To fix this properly, we need to separate out the non-category placetype data from `cat_data` in [[Module:place/placetypes]] and move it to [[Module:place/locations]], because we don't have access to the data in [[Module:place/placetypes]], and that data indicates the correct article for placetypes like "union territory". 22. [DONE] Simplify the specs in `cat_data`, eliminating the distinction between "inner" and "outer" matching. There should not be two levels, just one. For example, in "district", instead of ["country/Portugal"] = { ["itself"] = {"Districts and autonomous regions of +++"}, } we should just have ["country/Portugal"] = {"Districts and autonomous regions of +++"}, And in "dependent territory", instead of ["default"] = { ["itself"] = {true}, ["negara"] = {true}, }, we should just have ["itself"] = {true}, ["country/*"] = {true}, It appears the only remaining spec that can't be easily converted in this fashion is for "subdistrict": ["country/Indonesia"] = { ["municipality"] = {true}, }, This seems to be specifically for Jakarta and doesn't seem to work anyway, as the two entries in [[:Category:en:Subdistricts of Jakarta]] and the one entry in [[:Category:id:Subdistricts of Jakarta]] are manually categorized. 23. [DONE] Consolidate the remaining stuff in [[Module:category tree/topic cat/data/Earth]] into [[Module:category tree/topic cat/data/Places]]. 24. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that are in different polities from the specified containing polity/polities of the city, but doesn't do the same for larger-level divisions. Likewise for the `city_type_cat_handler`. 
There are some sufficiently generically-named divisions that this issue can occur; for example, [[Koforidua]], the capital city of Eastern Region, Ghana, is incorrectly categorized under [[:Category:en:Cities in Eastern Region, Malta]] and [[:Category:en:Places in Eastern Region, Malta]]. Note that the function `augment_holonyms_with_container` ''DOES'' do such checks, so we should be able to refactor the code out of that function and use it elsewhere. 25. [DONE] The `generic_cat_handler` that categorizes into `Places in FOO` is smart enough not to categorize cities that are in different polities from the specified containing polity/polities of the city; but how smart is it? It will successfully avoid categorizing a neighborhood in e.g. [[Columbus]], [[Georgia]] that doesn't explicitly mention the US (only `s/Georgia`) into [[:Category:en:Places in Columbus]], which is for Columbus, Ohio, but will it do the same for a hypothetical neighborhood of Columbus in say Merseyside, England? This should be investigated. It will probably work for a hypothetical Columbus in [[Canada]] because `augment_holonyms_with_container` would auto-add Canada as an additional holonym once say `p/Ontario` is mentioned, but I think there's a setting preventing this augmentation from happening for the UK. (This relates to FIXME #15. `no_containing_polity_cat` is set on England, Scotland, etc. to prevent the toponyms from being added to [[:Category:en:Places in the United Kingdom]], but this same setting is used to prevent augmentation, which it should not be; there should be different settings.) 26. [DONE] The `generic_cat_handler` (or more specifically `find_holonym_keys_for_categorization`) checks for city holonyms by looking specifically for holonym type `city`. But some cities (particularly those in China) can be specified using different holonym types, e.g. `prefecture-level city`, `subprovincial city`, etc. 
We should allow these when appropriate (which means the cities in China need to have a `placetype` set that indicates their regional-level status as well as just `city`). I'm not sure if cities support specifying a custom `placetype` at the moment; this relates to FIXME #14 above concerning unifying cities and political divisions internally. 27. [DONE] The bare category handler (`get_bare_categories` in [[Module:place/placetypes]]) is not smart enough to avoid overcategorizing cities or other divisions that are of the right placetype but in the wrong containing polity. For example, Asturian [[Llión]] "León (city in Spain)" gets put in [[:Category:ast:León]] even though the latter is supposed to refer to a city in Mexico. We can borrow the check-containing-polity code from `generic_cat_handler`. 28. [DONE] Redo handling of singular and plural to respect overrides specified in placetype_data. Check more carefully for things that may not singularize correctly, e.g. 'passes' -> 'passe'? Definitely 'headquarters' and variants. 29. [DONE] Combine placetype_equivs and other placetype data into `placetype_data`. Figure out if we need the distinction between `placetype_equivs` and `fallback`. 30. `has_neighborhoods` may need to be a function that can look at the containing holonyms to determine whether the entity in question is city-like. 31. [DONE] Bare placenames as they appear in holonyms (e.g. `Riau Islands`) instead of category keys (e.g. `the Riau Islands, Indonesia`) should appear in the polity data tables. As a first pass, the word "the" should not appear but should instead be a property of the polity. 32. [DONE] `capital_city_cat_handler` should use `get_holonyms_to_check()`. 33. [PARTLY DONE] The code to generate and parse the correct preposition ("di" or "of") is very convoluted, and the actual preposition used is specified in various locations with various defaults, sometimes hardcoded. This should be simplified. 
It is made more difficult by the fact that the in/of distinction occurs in several places: (a) when generating the {{place}} text in old-style descriptions where the preposition isn't explicitly given, which uses the `preposition` setting in placetype_data, defaulting to "di"; (b) when generating categories based on explicit category specs in placetype_data (which are gradually being deprecated), which likewise uses the `preposition` setting in placetype_data, defaulting to "di"; (c) when generating categories based on political_division_cat_handler, originating in the `divs` placetypes for specific known locations in [[Module:place/locations]], which uses the `prep` setting embedded in the `divs` specifications, defaulting to "of"; (d) when generating categories based on category handlers specified using the `cat_handler` property of entries in placetype_data, which tend to hardcode "di" or "of" depending on the specific category handler; (e) when generating category descriptions in [[Module:category tree/topic/Places]] for `divs` categories generated in (c), which (correctly) uses the same `prep` setting embedded in the `divs` settings that is used when generating the categories themselves; (f) when generating category descriptions for categories generated in (b) and (d) above, which relies on the `generic_before_non_cities` and `generic_before_cities` settings in placetype_data, which need to match the corresponding prepositions hardcoded in the category generation handlers. Instead of the hardcoding, the category generation handler should respect the `generic_before_*` settings. 34. [[Krakow]] defined as {{place|en|A <<city>> on the [[Vistula]] River, the <<capital>> of the <<voi/Lesser Poland Voivodeship>> in southern <<c/Poland>>}} categorizes under [[:Category:Voivodeship capitals]] when it should probably instead be under [[:Category:Voivodeship capitals of Poland]]. 
Possibly this is because the various voivodeships haven't yet been entered as known locations, but this should happen regardless of that. 35. {{tcl}} bugs: a. [DONE] Lowercase initial letter in new-style {{place}} descriptions in {{tcl}}. Maybe we can have a setting tcl_nolc=1 to prevent this from happening. b. [DONE] tcl= and probably new-style {{place}} descriptions in general should recognize ;; to separate distinct {{place}} descriptions, and similarly ;;and as the equivalent of regular `;and`, etc. c. [DONE] The value supplied in `modern=` should be displayed in {{tcl}} descriptions regardless of the setting that normally disables this, so that e.g. the foreign-language equivalent of [[British Honduras]] doesn't just say it's a former British colony in Central America but specifically identifies it as modern Belize. If the user gives place_modern= in {{tcl}}, that should override the modern= value and still display. d. [DONE] The page supplied to {{tcl}} should be used for generating bare categories even if t= is supplied and overrides the English term displayed. e. [DONE] If text follows {{place}} and begins with a semicolon, the semicolon isn't copied into {{tcl}}. 36. County boroughs used as holonyms currently display 'borough county borough' because there's an affix setting for 'county borough' and a fallback display handler for 'borough'. We need to rethink this; maybe merge the affix setting and display handlers. 37. Implement known-location groups and specs in a more standardly object-oriented way using metatables. 38. Implement caching of known location lookup in the holonym. This may have to be keyed by placetype, but we can have a special field for when the lookup placetype is the same as the user-specified placetype of the holonym. Use this known location in place of looking up known locations and store the appropriate known location there in `augment_holonyms_with_container()` instead of calling `key_to_placename`. 39.
Bug fixes with 'the': (a) [DONE] [[Kazaň]] defined as {{place|cs|caplc|rep:Pref/Tatarstan|c/Russia|t1=Kazan}} displays as "Republic of the Tatarstan". (b) [[Valday]] defined as {{place|en|town/administrative center|dist:Suf/Valdaysky|obl/Novgorod|c/Russia}} displays as "a town, the administrative center of the Valdaysky District". Changing to `dist:suf/Valdaysky` displays as "... of Valdaysky district". 40. [DONE] Bug fix with 'the': [[Verkhoyansk]] defined as {{place|en|town|rep/Sakha|c/Russia}} displays as "a town in the Sakha". 41. [DONE] [[Category:Cities in Asia]] has [[Category:Cities in Eurasia]] as a parent, which in turn has [[Category:Cities in the Earth]] as a parent. Continents should not have the second parent like this. 42. [DONE] When checking `british_spelling`, it should check all containers as well; otherwise it's too hard to keep this in sync across cities, administrative divisions and countries. 43. [DONE] `skip_polity_parent_type` should be renamed to container_parent_type or similar. 44. There should be a flag to allow e.g. departments of France that are currently categorized as departments of their region to also be categorized as departments of France. 45. [DONE] Aliases are causing iterate_matching_holonym_location() to fail, e.g. if [[براق]] "Prague" is specified as {{place|acw|capital city|c/Czechia|t1=Prague}}, this fails to add a bare category [[Category:acw:Prague]] because the code in iterate_matching_holonym_location() isn't resolving aliases when comparing the known container 'Czech Republic'. Probably we want to build an alias table to speed up these sorts of lookups. 46. [DONE; DUE TO TYPO IN HANDLER] The district cat handler is failing to work right, e.g. in [[Saint-Gaudérique]] defined as {{place|fr|district|city/Perpignan|in|dept/Pyrénées-Orientales|r/Occitania|c/France|t=Saint-Gaudérique}}, only the 'Places in ...' categories are getting triggered. 47.
Suburbs of a given city aren't generally in the city and may not even be in the same country or country division, so they should not categorize as "Places in ..." based on the city and specified country and division. Same goes for "enclave" (within somewhere) and "exclave". 48. When converting display aliases, we should automatically convert full placenames to full placenames and elliptical placenames to elliptical placenames instead of always either doing elliptical or full placenames depending on the value of `display_as_full`. 49. `@obsolete form of` and `@archaic form of` should automatically trigger nocat=1. 50. The handler that adds bare categories should pick up values in <eq:...>. ]=] --[==[ var: List specifying the allowed form-of directives, used for former names, official names, abbreviations, etc. of places. The key is the form-of directive and the value is an object with the following properties: * `text`: The actual text displayed before the terms. If the value is `+`, the key is used as the text. If the value is a function, it is passed a single argument, the overall place spec (see comment at top of file) and should return the text to be displayed. * `type_prefix`: The prefix used to generate the placetype for looking up the appropriate category or categories in the placetype data structure. Can be omitted if there are no categories associated with the directive. * `conjunction`: The conjunction used to join multiple terms, defaulting to `and`. * `cat`: Additional category or categories to add the term to, whenever this particular directive is used. Normally the value is a topic-style category minus the langcode prefix, but if prefixed with `cln:`, it is a langname-style category. For example, the value `"Abbreviations"` would correspond to a category [[:Category:en:Abbreviations]] (assuming the language of the {{tl|place}} call is English), while the value `"cln:abbreviations"` corresponds to a category [[:Category:English abbreviations]]. 
Use a list of such specs for multiple categories. * `default_foreign`: If specified, the default language of terms given along with this directive is the language in {{para|1}}; otherwise it is English. ]==] export.all_form_of_directives = { ["former name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"}, ["fmr of"] = {alias_of = "former name of"}, ["ancient name of"] = {text = "+", type_prefix = "FORMER_NAME_OF"}, ["official name of"] = {text = "+", type_prefix = "OFFICIAL_NAME_OF"}, ["former official name of"] = {text = "+", type_prefix = "FORMER_OFFICIAL_NAME_OF"}, ["long form of"] = {text = "+", type_prefix = "LONG_FORM_OF"}, ["former long form of"] = {text = "+", type_prefix = "FORMER_LONG_FORM_OF"}, ["nickname for"] = {text = "+", type_prefix = "NICKNAME_FOR"}, ["official nickname for"] = {text = "+", type_prefix = "OFFICIAL_NICKNAME_FOR"}, ["former nickname for"] = {text = "+", type_prefix = "FORMER_NICKNAME_FOR"}, ["derogatory name for"] = {text = "[[Appendix:Glossary#derogatory|derogatory]] name for", type_prefix = "DEROGATORY_NAME_FOR"}, ["synonym of"] = {text = "+"}, ["syn of"] = {alias_of = "synonym of"}, ["abbreviation of"] = {text = "[[Appendix:Glossary#abbreviation|abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:abbreviations", default_foreign = true}, ["abbr of"] = {alias_of = "abbreviation of"}, ["abbrev of"] = {alias_of = "abbreviation of"}, ["initialism of"] = {text = "[[Appendix:Glossary#initialism|initialism]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:initialisms", default_foreign = true}, ["init of"] = {alias_of = "initialism of"}, ["acronym of"] = {text = "[[Appendix:Glossary#acronym|acronym]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:acronyms", default_foreign = true}, ["syllabic abbreviation of"] = {text = "[[Appendix:Glossary#syllabic abbreviation|syllabic abbreviation]] of", type_prefix = "ABBREVIATION_OF", cat = "cln:syllabic abbreviations", default_foreign = true}, ["sylabbr of"] = {alias_of = "syllabic 
abbreviation of"}, ["sylabbrev of"] = {alias_of = "syllabic abbreviation of"}, ["ellipsis of"] = {text = "[[Appendix:Glossary#ellipsis|ellipsis]] of", type_prefix = "ELLIPSIS_OF", cat = "cln:ellipses", default_foreign = true}, ["ellip of"] = {alias_of = "ellipsis of"}, ["clipping of"] = {text = "[[Appendix:Glossary#clipping|clipping]] of", type_prefix = "CLIPPING_OF", cat = "cln:clippings", default_foreign = true}, ["clip of"] = {alias_of = "clipping of"}, ["alternative form of"] = {text = "+", default_foreign = true}, ["alt form"] = {alias_of = "alternative form of"}, ["alternative spelling of"] = {text = "+", default_foreign = true}, ["alt spell"] = {alias_of = "alternative spelling of"}, ["alt sp"] = {alias_of = "alternative spelling of"}, ["dated form of"] = {text = "[[Appendix:Glossary#dated|dated]] form of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms", default_foreign = true}, ["dated form"] = {alias_of = "dated form of"}, ["dated spelling of"] = {text = "[[Appendix:Glossary#dated|dated]] spelling of", type_prefix = "DATED_FORM_OF", cat = "cln:dated forms", default_foreign = true}, ["dated spell"] = {alias_of = "dated spelling of"}, ["dated sp"] = {alias_of = "dated spelling of"}, ["archaic form of"] = {text = "[[Appendix:Glossary#archaic|archaic]] form of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms", default_foreign = true}, ["arch form"] = {alias_of = "archaic form of"}, ["archaic spelling of"] = {text = "[[Appendix:Glossary#archaic|archaic]] spelling of", type_prefix = "ARCHAIC_FORM_OF", cat = "cln:archaic forms", default_foreign = true}, ["arch spell"] = {alias_of = "archaic spelling of"}, ["arch sp"] = {alias_of = "archaic spelling of"}, ["obsolete form of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] form of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms", default_foreign = true}, ["obs form"] = {alias_of = "obsolete form of"}, ["obsolete spelling of"] = {text = "[[Appendix:Glossary#obsolete|obsolete]] 
spelling of", type_prefix = "OBSOLETE_FORM_OF", cat = "cln:obsolete forms", default_foreign = true},
	["obs spell"] = {alias_of = "obsolete spelling of"},
	["obs sp"] = {alias_of = "obsolete spelling of"},
}

local function get_seat_text(overall_place_spec)
	local placetype = overall_place_spec.descs[1].placetypes[1]
	if placetype == "county" or placetype == "counties" then
		return "county seat"
	elseif placetype == "parish" or placetype == "parishes" then
		return "parish seat"
	elseif placetype == "borough" or placetype == "boroughs" then
		return "borough seat"
	else
		return "seat"
	end
end

--[==[ var:
List specifying the allowed arguments containing extra information that is sometimes added to a definition, such as the capital, largest city, modern name, official name, etc., along with associated properties; displayed in the order given. Each element is an object with the following properties:
* `arg`: The argument name.
* `text`: The actual text displayed before the terms. If the value is `+`, the argument name is used as the text. If the value is a function, it is passed a single argument, the overall place spec (see the comment at the top of the file) and should return the text to be displayed.
* `conjunction`: The conjunction used to join multiple terms, defaulting to `and`.
* `display_even_when_dropped`: Display this piece of extra info even when it would normally be dropped (e.g. in {{tl|tcl}} when the language is other than English).
* `match_sentence_style`: If true, the text will be capitalized and preceded by a period when ''sentence style'' is in effect (essentially, when the language is English and there is no translation specified using {{para|t}} or similar parameter); otherwise, the text will be displayed as-is and preceded by a semicolon. If false, the semicolon style will always be used.
* `auto_plural`: If true, pluralize the text when there is more than one term.
* `with_colon`: If true, follow the text with a colon.
(This colon cannot easily be included in the text itself because if pluralized, the pluralized text goes before the colon.)
]==]
export.extra_info_args = {
	{arg = "modern", text = "+", conjunction = "atau", display_even_when_dropped = true},
	{arg = "now", text = "kini", conjunction = "atau", display_even_when_dropped = true},
	{arg = "full", text = "nama penuh", conjunction = "atau", display_even_when_dropped = true},
	{arg = "short", text = "nama pendek", conjunction = "atau"},
	{arg = "abbr", text = "singkatan", conjunction = "atau"},
	{arg = "former", text = "dahulunya"},
	{arg = "official", text = "nama rasmi", match_sentence_style = true, auto_plural = true, with_colon = true},
	{arg = "capital", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
	{arg = "largest city", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
	{arg = "caplc", text = "ibu negara dan bandar terbesar", match_sentence_style = true, auto_plural = false, with_colon = true},
	{arg = "seat", text = get_seat_text, match_sentence_style = true, auto_plural = true, with_colon = true},
	{arg = "shire town", text = "+", match_sentence_style = true, auto_plural = true, with_colon = true},
	{arg = "headquarters", text = "+", match_sentence_style = true, auto_plural = false, with_colon = true},
	{arg = "center", text = "pusat pentadbiran", match_sentence_style = true, auto_plural = false, with_colon = true},
	{arg = "centre", text = "pusat pentadbiran", match_sentence_style = true, auto_plural = false, with_colon = true},
}

export.extra_info_arg_map = {}
for _, spec in ipairs(export.extra_info_args) do
	export.extra_info_arg_map[spec.arg] = spec
end

----------- Wikicode utility functions

-- Return a wikilink link {{l|language|text}}
local function link(text, langcode, id)
	if not langcode then
		return text
	end

	return m_links.full_link(
		{term = text, lang = require(languages_module).getByCode(langcode, true, "allow etym"), id = id},
		nil, "allow self link"
	)
end ---------- Basic utility functions -- Add the page to a tracking "category". To see the pages in the "category", -- go to [[Wiktionary:Tracking/place/PAGE]] and click on "What links here". local function track(page) require(debug_track_module)("place/" .. page) return true end local function ucfirst_all(text) if text:find(" ") then local parts = split(text, " ", true) for i, part in ipairs(parts) do parts[i] = m_strutils.ucfirst(part) end return concat(parts, " ") else return m_strutils.ucfirst(text) end end local function lc(text) return mw.getContentLanguage():lc(text) end ---------- Argument parsing functions and utilities -- Split an argument on comma, but not comma followed by whitespace. local function split_on_comma(val) if val:find(",") then return require(parse_interface_module).split_on_comma(val) else return {val} end end -- Split an argument on slash, but not slash occurring inside of HTML tags like </span> or <br />. local function split_on_slash(arg) if arg:find("<") then local m_parse_utilities = require(parse_utilities_module) -- We implement this by parsing balanced segment runs involving <...>, and splitting on slash in the remainder. -- The result is a list of lists, so we have to rejoin the inner lists by concatenating. local segments = m_parse_utilities.parse_balanced_segment_run(arg, "<", ">") local slash_separated_groups = m_parse_utilities.split_alternating_runs(segments, "/") for i, group in ipairs(slash_separated_groups) do slash_separated_groups[i] = concat(group) end return slash_separated_groups else return split(arg, "/", true) end end -- Implement "implications", i.e. where the presence of a given holonym causes additional holonym(s) to be added. -- Implications apply only to categorization. There used to be support for "general implications" that applied to both -- display and categorization, but there ended up not being any such implications, so we've removed the support. 
It is -- a bad idea in any case to have such implications; the user might purposely leave out a higher-level polity to avoid -- redundancy in several successive definitions, and we wouldn't want to override that. Note that in practice the -- mechanism implemented by this function is used specifically for non-administrative geographic regions such as -- Eastern Europe and the West Bank; there is a similar mechanism for administrative regions handled by -- `augment_holonyms_with_containing_polity` in [[Module:place/placetypes]]. -- -- `place_descriptions` is a list of place descriptions (see top of file, collectively describing the data passed to -- {{place}}). `implication_data` is the data used to implement the implications, i.e. a table indexed by holonym -- placetype, each value of which is a table indexed by holonym placename, each value of which is a list of -- "PLACETYPE/PLACENAME" holonyms to be added to the end of the list of holonyms. local function handle_category_implications(place_descriptions, implication_data) for i, desc in ipairs(place_descriptions) do if desc.holonyms then local new_holonyms = {} for _, holonym in ipairs(desc.holonyms) do insert(new_holonyms, holonym) local imp_data = m_placetypes.get_equiv_placetype_prop(holonym.placetype, function(pt) local implication = implication_data[pt] and implication_data[pt][holonym.unlinked_placename] if implication then return implication end end) if imp_data then for _, holonym_to_add in ipairs(imp_data) do local split_holonym = split_on_slash(holonym_to_add) if #split_holonym ~= 2 then internal_error("Invalid holonym in implications: %s", holonym_to_add) end local holonym_placetype, holonym_placename = unpack(split_holonym, 1, 2) local new_holonym = { -- By the time we run, the display has already been generated so we don't need to set -- display_placename. 
placetype = holonym_placetype, unlinked_placename = holonym_placename } insert(new_holonyms, new_holonym) m_placetypes.key_holonym_into_place_desc(desc, new_holonym) end end end desc.holonyms = new_holonyms end end end -- Split a holonym (e.g. "continent/Europe" or "country/en:Italy" or "in southern" or "r:suf/O'Higgins" or -- "c/Austria,Germany,Czech Republic") into its components. Return a list of holonym objects (see top of file). Note -- that if there isn't a slash in the holonym (e.g. "in southern"), the `placetype` field of the holonym will be nil. -- Placetype aliases (e.g. "r" for "region") and placename aliases (e.g. "US" or "USA" for "United States") will be -- expanded. local function split_holonym(raw) local no_display, combined_holonym = raw:match("^(!)(.*)$") no_display = not not no_display combined_holonym = combined_holonym or raw local suppress_comma, combined_holonym_without_comma = combined_holonym:match("^(%*)(.*)$") suppress_comma = not not suppress_comma combined_holonym = combined_holonym_without_comma or combined_holonym local holonym_parts = split_on_slash(combined_holonym) if #holonym_parts == 1 then -- `unlinked_placename` should not be used. return {{display_placename = combined_holonym, no_display = no_display, suppress_comma = suppress_comma}} end -- Rejoin further slashes in case of slash in holonym placename, e.g. Admaston/Bromley. local placetype = holonym_parts[1] local placename = concat(holonym_parts, "/", 2) -- Check for modifiers after the holonym placetype. 
local split_holonym_placetype = split(placetype, ":", true) placetype = split_holonym_placetype[1] local affix_type local saw_also local saw_the for i = 2, #split_holonym_placetype do local modifier = split_holonym_placetype[i] if modifier == "also" then if saw_also then error(("Modifier ':also' occurs twice in holonym '%s'"):format(combined_holonym)) end saw_also = true elseif modifier == "the" then if saw_the then error(("Modifier ':the' occurs twice in holonym '%s'"):format(combined_holonym)) end saw_the = true elseif modifier == "pref" or modifier == "Pref" or modifier == "suf" or modifier == "Suf" or modifier == "noaff" then if affix_type then error(("Affix-type modifier ':%s' occurs twice in holonym '%s'"):format(modifier, combined_holonym)) end affix_type = modifier else error(("Unrecognized holonym placetype modifier '%s', should be one of " .. "'pref', 'Pref', 'suf', 'Suf', 'noaff', 'also' or 'the'"):format(modifier)) end end placetype = m_placetypes.resolve_placetype_aliases(placetype) local holonyms = split_on_comma(placename) local pluralize_affix = #holonyms > 1 local affix_holonym_index = (affix_type == "pref" or affix_type == "Pref") and 1 or affix_type == "noaff" and 0 or #holonyms for i, placename in ipairs(holonyms) do -- Check for langcode before the holonym placename, but don't get tripped up by Wikipedia links, which begin -- "[[w:...]]" or "[[wikipedia:]]". 
local langcode, placename_without_langcode = rmatch(placename, "^([^%[%]]-):(.*)$") if langcode then placename = placename_without_langcode end placename = m_placetypes.resolve_placename_display_aliases(placetype, placename) holonyms[i] = { placetype = placetype, display_placename = placename, unlinked_placename = m_placetypes.remove_links_and_html(placename), langcode = langcode, affix_type = i == affix_holonym_index and affix_type or nil, pluralize_affix = i == affix_holonym_index and pluralize_affix, suppress_affix = i ~= affix_holonym_index, no_display = no_display, suppress_comma = suppress_comma, continue_cat_loop = saw_also, force_the = i == 1 and saw_the, } end return holonyms end local get_param_mods = memoize(function() local m_param_utils = require(parameter_utilities_module) return m_param_utils.construct_param_mods { {group = {"link", "q", "l", "ref"}}, {param = "eq"}, -- FIXME: Finish [[Module:format utilities]]. --{param = "conj", set = require(format_utilities_module).allowed_conjs_for_join_segments, overall = true}, {param = "conj", set = {["and"] = true, ["or"] = true, ["and/or"] = true}, overall = true}, } end) local function parse_term_with_inline_modifiers(term, paramname, default_lang) -- FIXME: Finish changes to [[Module:parameter utilities]] and [[Module:parse utilities]] that support continuations -- and new-format generate_obj(). 
--local function generate_obj(data) -- local m_param_utils = require(parameter_utilities_module) -- data.parse_lang_prefix = true -- data.special_continuations = m_param_utils.default_special_continuations -- data.default_lang = default_lang -- return m_param_utils.generate_obj_maybe_parsing_lang_prefix(data) --end local function generate_obj(raw_term, parse_err) local obj = require(parameter_utilities_module).generate_obj_maybe_parsing_lang_prefix { term = raw_term, parse_err = parse_err, parse_lang_prefix = true, } obj.lang = obj.lang or default_lang return obj end return require(parse_interface_module).parse_inline_modifiers(term, { paramname = paramname, param_mods = get_param_mods(), generate_obj = generate_obj, -- FIXME: See above. --generate_obj_new_format = true, splitchar = ",", outer_container = {}, }) end local function parse_form_of_directive(arg, lang, form_of_overridden_args) local form_of_directive, raw_terms = arg:match("^@([a-z -]+):(.*)$") if not form_of_directive then error("Misformatted @-directive: " .. dump(arg)) end if not export.all_form_of_directives[form_of_directive] then local known_directives = {} for k, _ in pairs(export.all_form_of_directives) do insert(known_directives, '"' .. k .. '"') end table.sort(known_directives) error(("Unrecognized form-of directive %s in @-directive %s; recognized directives are %s"):format( dump(form_of_directive), dump(arg), concat(known_directives, ", "))) end local spec = export.all_form_of_directives[form_of_directive] local canonical_directive = form_of_directive if spec.alias_of then canonical_directive = spec.alias_of spec = export.all_form_of_directives[canonical_directive] if not spec then internal_error("Form-of directive alias %s points to %s, which is not a directive", "@" .. form_of_directive, canonical_directive) elseif spec.alias_of then internal_error("Form-of directive alias %s points to %s, which is also an alias", "@" .. 
form_of_directive, canonical_directive) end end local default_foreign = spec.default_foreign local directive_param = "@" .. form_of_directive if form_of_overridden_args and form_of_overridden_args[canonical_directive] then raw_terms = form_of_overridden_args[canonical_directive].new_value local new_directive = form_of_overridden_args[canonical_directive].new_directive local new_spec = export.all_form_of_directives[new_directive] if not new_spec then error(("Internal error: [[Module:transclude]] passed in unrecognized replacement directive '@%s'"): format(new_directive)) end if new_spec.alias_of then error(("Internal error: [[Module:transclude]] passed in replacement directive alias '@%s', " .. "should be canonical"):format(new_directive)) end if new_directive ~= canonical_directive then directive_param = directive_param .. (" (replaced with @%s)"):format(new_directive) canonical_directive = new_directive spec = new_spec end default_foreign = true end local terms = parse_term_with_inline_modifiers(raw_terms, directive_param, default_foreign and lang or enlang) return { directive = canonical_directive, terms = terms.terms, conj = terms.conj, spec = spec, } end -- Parse an argument containing extra information that is sometimes added to a definition, such as the capital, largest -- city, modern name, official name, etc. `args` is the value from the parsed argument structure and can be either nil, -- a string or a list (depending on whether it was declared as a single parameter or a list). `spec` is the extra info -- spec corresponding to the type of extra info. Each value in `args` can be a comma-separated list of terms with inline -- modifiers attached. [FIXME: we should switch to always using the comma-separated format and disallow list parameters -- such as |capital=, |capital2=, etc.] 
The return value is a structure containing fields `terms` (a list of term -- objects, each of which is in the format expected by full_link() in [[Module:links]]), `conj` (an explicit -- conjunction to join multiple terms, or nil if no explicit conjunction was given) and `spec` (the passed-in spec). local function parse_extra_info_arg(args, spec, default_lang) if not args then return nil end if type(args) ~= "table" then args = {args} end if not args[1] then return nil end local terms = nil local conj for i, arg in ipairs(args) do local this_terms = parse_term_with_inline_modifiers(arg, spec.arg .. (i == 1 and "" or i), default_lang) local thisconj = this_terms.conj if not conj then conj = thisconj elseif thisconj and conj ~= thisconj then error(("Two different conjunctions '%s' and '%s' specified for |%s=; you only need to specify the " .. "conjunction once"):format(conj, thisconj)) end if not terms then terms = this_terms.terms else m_table.extend(terms, this_terms.terms) end end return { spec = spec, terms = terms, conj = conj, } end --[==[ Parse a "new-style" place description, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text. Return value is a place description object as documented at the top of the file. Exported for use by [[Module:demonyms]]. 
]==] function export.parse_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args) local placetypes = {} local segments = split(text, "<<(.-)>>") local retval = {holonyms = {}, order = {}} local form_of_directives_already_present = form_of_directives and not not form_of_directives[1] for i, segment in ipairs(segments) do if i % 2 == 1 then insert(retval.order, {type = "raw", value = segment}) elseif segment:find("@") then if not form_of_directives then error(("Form-of directive '%s' not allowed in this context"):format(segment)) elseif form_of_directives_already_present then error(("Saw form-of directive '%s' in new-style place desc followed by direct (separate-parameter) form-of directives; not allowed"):format( segment)) elseif placetypes[1] or retval.holonyms[1] then error(("Form-of directive '%s' must come first, before placetypes and holonyms"):format(segment)) else local form_of_directive = parse_form_of_directive(segment, lang, form_of_overridden_args) if not retval.order[1] or retval.order[1].type ~= "raw" or retval.order[2] then internal_error("`retval.order` should have a single raw element: %s", retval.order) end form_of_directive.pretext = retval.order[1].value retval.order[1] = nil insert(form_of_directives, form_of_directive) end elseif segment:find("/") then local holonyms = split_holonym(segment) for j, holonym in ipairs(holonyms) do if j > 1 then if not holonym.no_display then if j == #holonyms then insert(retval.order, {type = "raw", value = " and "}) else insert(retval.order, {type = "raw", value = ", "}) end end -- All but the first in a multi-holonym need an article. For the first one, the article is -- specified in the raw text if needed. (Currently, needs_article is only used when displaying the -- holonym, so it wouldn't matter when no_display is set, but we set it anyway in case we need it -- for something else.) 
holonym.needs_article = true end insert(retval.holonyms, holonym) if not holonym.no_display then insert(retval.order, {type = "holonym", value = #retval.holonyms}) end m_placetypes.key_holonym_into_place_desc(retval, holonym) end else local treat_as, display = segment:match("^(..-):(.+)$") if treat_as then segment = treat_as else display = segment end -- see if the placetype segment is just qualifiers local only_qualifiers = true local split_segments = split(segment, " ", true) for _, split_segment in ipairs(split_segments) do if m_placetypes.placetype_qualifiers[split_segment] == nil then only_qualifiers = false break end end insert(placetypes, {placetype = segment, only_qualifiers = only_qualifiers}) if only_qualifiers then insert(retval.order, {type = "qualifier", value = display}) else insert(retval.order, {type = "placetype", value = display}) end end end if not form_of_directives_already_present and form_of_directives and form_of_directives[1] then form_of_directives[#form_of_directives].posttext = "" end local final_placetypes = {} for i, placetype in ipairs(placetypes) do if i > 1 and placetypes[i - 1].only_qualifiers then final_placetypes[#final_placetypes] = final_placetypes[#final_placetypes] .. " " .. placetypes[i].placetype else insert(final_placetypes, placetypes[i].placetype) end end retval.placetypes = final_placetypes return retval end --[==[ Parse one or more "new-style" place descriptions, with placetypes and holonyms surrounded by `<<...>>` amid otherwise raw text. Multiple descriptions are separated by two semicolons in a row. Return value is a list of place description objects as documented at the top of the file. 
]==] local function parse_conjoined_new_style_place_desc(text, lang, form_of_directives, form_of_overridden_args) local separate_specs = split(text, ";(;[^ ]*)") local descs = {} for i = 1, #separate_specs do if i % 2 == 1 then insert(descs, export.parse_new_style_place_desc(separate_specs[i], lang, form_of_directives, form_of_overridden_args)) form_of_directives = nil else descs[#descs].separator = separate_specs[i] end end return descs end --[=[ Process numeric and "extra info" arguments into an overall place spec, as described at the top of the file. `data` is an object with the following fields: * `args`: The parsed arguments of {{tl|place}}. * `from_tcl`: True if we're being invoked from {{tl|tcl}}. * `extra_info_overridden_set`, `form_of_overridden_args`: Same as the corresponding fields in the `data` object passed to `export.format`. ]=] local function parse_overall_place_spec(data) local args, from_tcl, extra_info_overridden_set, form_of_overridden_args = data.args, data.from_tcl, data.extra_info_overridden_set, data.form_of_overridden_args local descs = {} local this_desc -- Index of separate (semicolon-separated) place descriptions within `descs`. local desc_index = 1 -- Index of separate holonyms within a place description. 0 means we've seen no holonyms and have yet to process -- the placetypes that precede the holonyms. 1 means we've seen no holonyms but have already processed the -- placetypes. local holonym_index = 0 local in_place_desc = false local form_of_directives = {} local function set_desc_joiner(desc, separator) if separator == ";" then this_desc.joiner = "; " this_desc.include_following_article = true elseif separator == ";;" then this_desc.joiner = " " else local joiner = separator:sub(2) if rfind(joiner, "^%a") then this_desc.joiner = " " .. joiner .. " " else this_desc.joiner = joiner .. 
" " end end end for _, arg in ipairs(args[2]) do if arg:find("^@") then if not (desc_index == 1 and holonym_index == 0) then error("@-directives cannot follow place descriptions") end local form_of_directive = parse_form_of_directive(arg, args[1], form_of_overridden_args) if form_of_directives[1] then form_of_directive.pretext = ", " else form_of_directive.pretext = "" end insert(form_of_directives, form_of_directive) elseif arg == ";" or arg:find("^;[^ ]") then if not this_desc then error("Saw semicolon joiner without preceding place description") end set_desc_joiner(this_desc, arg) desc_index = desc_index + 1 holonym_index = 0 in_place_desc = false else if arg:find("<<") then if in_place_desc then error("New-style place description must come first or following a separator (semicolon or similar), not directly following another description") end in_place_desc = true local this_descs = parse_conjoined_new_style_place_desc(arg, args[1], form_of_directives, form_of_overridden_args) for j, desc in ipairs(this_descs) do this_desc = desc if holonym_index > 0 then desc_index = desc_index + 1 holonym_index = 0 end if j < #this_descs then set_desc_joiner(this_desc, this_desc.separator) end descs[desc_index] = this_desc last_was_new_style = true holonym_index = #this_desc.holonyms + 1 end else -- Old-style arguments can directly follow a new-style argument; they become additional holonyms -- tacked onto the end of the holonym list, and are displayed old-style except that there is no -- prefix before the first one following the new-style argument. in_place_desc = true if holonym_index == 0 then local entry_placetypes = split_on_slash(arg) this_desc = {placetypes = entry_placetypes, holonyms = {}} descs[desc_index] = this_desc holonym_index = holonym_index + 1 else local holonyms = split_holonym(arg) for j, holonym in ipairs(holonyms) do if j > 1 then -- All but the first in a multi-holonym need an article. Not for the first one because e.g. 
-- {{place|en|city|s/Arizona|c/United States}} should not display as "a city in Arizona, the -- United States". The overall first holonym in the place description gets an article if -- needed regardless of our setting here. holonym.needs_article = true -- Insert "and" before the last holonym. if j == #holonyms then this_desc.holonyms[holonym_index] = { -- Use the no_display value from the first holonym; it should be the same for all -- holonyms. `unlinked_placename` should not be used. display_placename = "and", no_display = holonyms[1].no_display } holonym_index = holonym_index + 1 end end this_desc.holonyms[holonym_index] = holonym m_placetypes.key_holonym_into_place_desc(this_desc, this_desc.holonyms[holonym_index]) holonym_index = holonym_index + 1 end end end end end if form_of_directives[1] and not form_of_directives[#form_of_directives].posttext then form_of_directives[#form_of_directives].posttext = (args.def and args.def ~= "-" or not args.def and descs[1]) and ": " or "" end -- Tracking code. This does nothing but add tracking for seen placetypes and qualifiers. The place will be linked to -- [[Wiktionary:Tracking/place/entry-placetype/PLACETYPE]] for all entry placetypes seen; in addition, if PLACETYPE -- has qualifiers (e.g. 'small city'), there will be links for the bare placetype minus qualifiers and separately -- for the qualifiers themselves: -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/BARE_PLACETYPE]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/QUALIFIER]] -- Note that if there are multiple qualifiers, there will be links for each possible split. 
For example, for -- 'small maritime city', there will be the following links: -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/small maritime city]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/maritime city]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-placetype/city]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/small]] -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/entry-qualifier/maritime]] -- Finally, there are also links for holonym placetypes, e.g. if the holonym 'c/Italy' occurs, there will be the -- following link: -- [[Special:WhatLinksHere/Wiktionary:Tracking/place/holonym-placetype/country]] for _, desc in ipairs(descs) do for _, entry_placetype in ipairs(desc.placetypes) do local splits = m_placetypes.split_qualifiers_from_placetype(entry_placetype, "no canon qualifiers") for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3) track("entry-placetype/" .. bare_placetype) if this_qualifier then track("entry-qualifier/" .. this_qualifier) end end end for _, holonym in ipairs(desc.holonyms) do if holonym.placetype then track("holonym-placetype/" .. holonym.placetype) end end end local extra_info = {} for _, extra_info_spec in ipairs(export.extra_info_args) do local extra_info_terms = parse_extra_info_arg(args[extra_info_spec.arg], extra_info_spec, -- If called from {{tcl}} and extra info argument was set by {{tcl}}, interpret the argument -- according to the language in 1=; otherwise interpret as English. To override this, prefix -- with the appropriate language. 
from_tcl and extra_info_overridden_set and extra_info_overridden_set[extra_info_spec.arg] and args[1] or enlang) if extra_info_terms then insert(extra_info, extra_info_terms) end end return { lang = args[1], args = args, directives = form_of_directives, descs = descs, extra_info = extra_info, } end -------- Definition-generating functions -- Return a string with the wikilinks to the English translations of the word. local function get_translations(transl, ids) local ret = {} for i, t in ipairs(transl) do local arg_transls = split_on_comma(t) local arg_ids = ids[i] if arg_ids then arg_ids = split_on_comma(arg_ids) if #arg_transls ~= #arg_ids then error(("Saw %s translation%s in t%s=%s but %s ID%s in tid%s=%s"):format( #arg_transls, #arg_transls > 1 and "s" or "", i == 1 and "" or i, t, #arg_ids, #arg_ids > 1 and "'s" or "", i == 1 and "" or i, ids[i])) end end for j, arg_transl in ipairs(arg_transls) do insert(ret, link(arg_transl, "en", arg_ids and arg_ids[j] or nil)) end end return concat(ret, ", ") end -- Return the article (currently always `"the"`) to be prepended to the given placename, or nil. `decorated_placename` -- is the placename as specified by the user along with any affix added to it. `placename` is the raw unlinked -- placename, defaulting to the unlinked version of `decorated_placename` if not given. `placetypes` is a placetype or -- list of placetypes for the placename. `suppress_holonym_use_the_check` suppresses checking the placetypes for -- `holonym_use_the`. 
local function get_placename_article(decorated_placename, placetypes, placename, suppress_holonym_use_the_check) local unlinked_decorated_placename = m_placetypes.remove_links_and_html(decorated_placename) if unlinked_decorated_placename:find("^the ") then return nil end placename = placename or unlinked_decorated_placename if type(placetypes) == "string" then placetypes = {placetypes} end for _, placetype in ipairs(placetypes) do local art = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) local art = m_placetypes.placename_article[pt] and m_placetypes.placename_article[pt][placename] if art then return art end end) if art then return art end end -- Get equivalent placetypes of the specified placetype so that e.g. -- {{place|en|@official name of:Bahamas|island country|r/Caribbean}} puts 'the' before Bahamas ("Bahamas" is just -- specified as a country but "island country" falls back to "negara"). local all_equiv_placetypes = {} for _, placetype in ipairs(placetypes) do local this_equiv_placetypes = m_placetypes.get_placetype_equivs(placetype) for _, this_equiv_placetype in ipairs(this_equiv_placetypes) do insert(all_equiv_placetypes, this_equiv_placetype.placetype) end end -- Look for a known location. We should be using find_matching_holonym_location() but that function doesn't -- currently work without alias resolution. Instead we check if any matching location has `the = true` set. -- In practice there aren't any cases where a given placename matches two locations, only one of which has -- `the = true` set. for group, key, spec in m_placetypes.iterate_matching_location { placetypes = all_equiv_placetypes, placename = placename, alias_resolution = "none", } do -- `iterate_holonym_location` doesn't initialize the spec if alias resolution is turned off, so check both -- the spec and group. Be careful in case `the = false` is explicitly given by the spec. 
if spec.the ~= nil then if spec.the then return "the" end elseif group.default_the then return "the" end end if not suppress_holonym_use_the_check then -- See if the placetype requests an article to be placed before the placename. This occurs e.g. with 'sea'. But -- if the user specifies e.g. "sea:pref/Cortez", we'll wrongly get "the sea of the Cortez", so in that case we -- need to ignore the holonym article specified along with the placetype. for _, placetype in ipairs(placetypes) do local holonym_use_the = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) return placetype_data[pt] and placetype_data[pt].holonym_use_the end) if holonym_use_the then return "the" end end end local universal_res = m_placetypes.placename_the_re["*"] for _, re in ipairs(universal_res) do if unlinked_decorated_placename:find(re) then return "the" end end for _, placetype in ipairs(placetypes) do local matched = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) local res = m_placetypes.placename_the_re[pt] if not res then return nil end for _, re in ipairs(res) do if unlinked_decorated_placename:find(re) then return true end end return nil end) if matched then return "the" end end return nil end -- Prepend the appropriate article if needed to `decorated_placename` (the user-specified placename with any affix -- added), where the underlying holonym object that generated `linked_placename` can be found at `holonym_index` in the -- holonyms in `place_desc`. local function get_holonym_article(decorated_placename, place_desc, holonym_index) local holonym = place_desc.holonyms[holonym_index] local holonym_placetype = holonym.placetype if not holonym_placetype then return nil end return get_placename_article(decorated_placename, holonym_placetype, holonym.unlinked_placename, not not holonym.affix_type) end -- Convert a holonym into display format. This adds wikilinks to holonyms and passes them through any display handlers, -- which may (e.g.) 
add the placetype to the holonym. If `needs_article` is true, prepend the article `"the"` if the -- holonym requires it (e.g. if the holonym is `United States`). `needs_article` is set to true if we are processing the -- first specified holonym in an old-style place description (i.e. the holonym directly following the entry placetype, -- with no raw-text holonym in between). -- -- Examples: -- ({placetype = "negara", display_placename = "United States", unlinked_placename = "United States"}, true) returns -- the template-expanded equivalent of "the {{l|en|United States}}". -- ({placetype = "region", display_placename = "O'Higgins", unlinked_placename = "O'Higgins", affix_type = "suf"}, false) -- returns the template-expanded equivalent of "{{l|en|O'Higgins}} region". -- ({display_placename = "in the southern"}, false) returns "in the southern" (without wikilinking because .placetype -- and .langcode are both nil). local function format_holonym(place_desc, holonym_index, needs_article) local holonym = place_desc.holonyms[holonym_index] if holonym.no_display then return "" end local orig_needs_article = needs_article needs_article = needs_article or holonym.needs_article or holonym.force_the local output = holonym.display_placename local placetype = holonym.placetype local affix_type_pt_data, affix_type, affix_is_prefix, affix, prefix, suffix, no_affix_strings local pt_equiv_for_affix_type, already_seen_affix, need_affix -- Implement display handlers. local display_handler = m_placetypes.get_equiv_placetype_prop(placetype, function(pt) return placetype_data[pt] and placetype_data[pt].display_handler end) if display_handler then output = display_handler(placetype, output) end if not holonym.suppress_affix then -- Implement adding an affix (prefix or suffix) based on the holonym's placetype. The affix will be -- added either if the placetype's placetype_data spec says so (by setting 'affix_type'), or if the -- user explicitly called for this (e.g. 
by using 'r:suf/O'Higgins'). Before adding the affix, -- however, we check to see if the affix is already present (e.g. the placetype is "district" -- and the placename is "Mission District"). The placetype can override the affix to add (by setting -- `prefix`, `suffix` or `affix`) and/or override the strings used for checking if the affix is already -- present (by setting 'no_affix_strings', which defaults to the affix explicitly given through `prefix`, -- `suffix` or `affix` if any are given). `prefix` and `suffix` take precedence over `affix` if both are -- set, but only when the appropriate type of affix is requested. -- Search through equivalent placetypes for a setting of `affix_type`, `affix`, `prefix` or `suffix`. If we -- find any, use them. If `affix_type` is given, it is overridden by the user's explicitly specified affix -- type. If either an `affix_type` is found or the user explicitly specified an affix type, the affix is -- displayed according to the following: -- 1. If `prefix`, `suffix` or `affix` is given by the placetype or equivalent placetypes, use it (e.g. -- placetype `administrative region` requests suffix "region" but doesn't set affix type; if the user -- explicitly specifies `administrative region` as the placetype for a holonym and specifies a suffixal -- affix type, use "region"). In this search, we stop looking if we find an explicit `affix_type` -- setting; if this is found without an associated affix setting, the assumption is the associated -- placetype was intended as the affix, not some explicit affix setting associated with a fallback -- placetype. -- 2. Otherwise, if the user explicitly requested an affix type, use the actual placetype (principle of -- least surprise). -- 3. Finally, fall back to the placetype associated with an explicit `affix_type` setting (which will -- always exist if we get this far). 
		affix_type_pt_data, pt_equiv_for_affix_type = m_placetypes.get_equiv_placetype_prop(placetype,
			function(pt)
				local cdpt = placetype_data[pt]
				return cdpt and cdpt.affix_type and cdpt or nil
			end
		)
		local affix_pt_data, pt_equiv_for_affix = m_placetypes.get_equiv_placetype_prop(placetype,
			function(pt)
				local cdpt = placetype_data[pt]
				return cdpt and (cdpt.affix_type or cdpt.affix or cdpt.prefix or cdpt.suffix) and cdpt or nil
			end
		)
		if affix_type_pt_data then
			affix_type = affix_type_pt_data.affix_type
			need_affix = true
		end
		if affix_pt_data then
			prefix = affix_pt_data.prefix or affix_pt_data.affix
			suffix = affix_pt_data.suffix or affix_pt_data.affix
			need_affix = true
		end
		no_affix_strings = affix_pt_data and affix_pt_data.no_affix_strings or
			affix_type_pt_data and affix_type_pt_data.no_affix_strings
		if holonym.affix_type and placetype then
			affix_type = holonym.affix_type
			prefix = prefix or placetype
			suffix = suffix or placetype
			need_affix = true
		end
		if need_affix then
			-- At this point the affix_type has been determined and can't change any more, so we can figure out
			-- whether we need the calculated prefix or suffix.
			affix_is_prefix = affix_type == "pref" or affix_type == "Pref"
			if affix_is_prefix then
				affix = prefix
			else
				affix = suffix
			end
			if not affix then
				if not pt_equiv_for_affix_type then
					internal_error("Something wrong, `pt_equiv_for_affix_type` not set processing holonym: %s",
						holonym)
				end
				affix = pt_equiv_for_affix_type.placetype
				if not affix then
					internal_error("Something wrong, no affix could be located in `pt_equiv_for_affix_type` for " ..
"holonym %s: %s", holonym, pt_equiv_for_affix_type) end end no_affix_strings = no_affix_strings or lc(affix) if holonym.pluralize_affix then affix = m_placetypes.pluralize_placetype(affix) end already_seen_affix = m_placetypes.check_already_seen_string(output, no_affix_strings) end end output = link(output, holonym.langcode or placetype and "en" or nil) if need_affix and not affix_is_prefix and not already_seen_affix then output = output .. " " .. (affix_type == "Suf" and ucfirst_all(affix) or affix) end if needs_article then local article = holonym.force_the and "the" or get_holonym_article(output, place_desc, holonym_index) if article then output = article .. " " .. output end end if affix_is_prefix and not already_seen_affix then output = (affix_type == "Pref" and ucfirst_all(affix) or affix) .. " of " .. output if orig_needs_article then -- Put the article before the added affix if we're the first holonym in the place description. This is -- distinct from the article added above for the holonym itself; cf. "c:pref/United States,Canada" -> -- "the countries of the United States and Canada". We need to use the value of `needs_article` passed -- in from the function, which indicates whether we're processing the first holonym. output = "the " .. output end end return output end -- Format a holonym for display, taking into account the entry's placetype (specifically, the last placetype if there -- are more than one, excluding conjunctions and parenthetical items); the holonym's index among the holonyms in the -- template (which specifies what the previous holonym is and whether it is the first holonym); and the full place -- description (which helps resolve ambiguities in holonyms when looking up known locations). This may involve putting a -- preposition ("di" or "of") before the formatted holonym, particularly if it is the first one, and may involve -- prepending a comma. 
If `holonym_no_prefix` is specified, nothing except a space is put before the holonym; used -- when formatting mixed new/old-style descriptions. local function format_holonym_in_context(entry_placetype, place_desc, holonym_index, holonym_no_prefix) local desc = "" -- If holonym.placetype is nil, the holonym is just raw text, e.g. 'in southern'. if holonym_no_prefix then desc = " " else local holonym = place_desc.holonyms[holonym_index] if not holonym.no_display then -- First compute the initial delimiter. if holonym_index == 1 then if holonym.placetype then desc = desc .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. " " elseif not holonym.display_placename:find("^,") then desc = desc .. " " end else local prev_holonym = place_desc.holonyms[holonym_index - 1] if prev_holonym.placetype and not holonym.suppress_comma then local dname = holonym.display_placename if dname ~= "and" and dname ~= "di" and dname ~= "and the" and dname ~= "di" then desc = desc .. "," end end if holonym.placetype or not holonym.display_placename:find("^,") then desc = desc .. " " end end end end return desc .. format_holonym(place_desc, holonym_index, not holonym_no_prefix and holonym_index == 1) end -- Return the linked description of a placetype. This splits off any qualifiers and displays them separately. local function get_placetype_description(placetype) local splits = m_placetypes.split_qualifiers_from_placetype(placetype) local prefix = "" for _, split in ipairs(splits) do local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3) if this_qualifier then prefix = (prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier) .. " " else prefix = "" end local display_form = m_placetypes.get_placetype_display_form(bare_placetype) if display_form then return prefix .. display_form end placetype = bare_placetype end return prefix .. placetype end -- Return the linked description of a qualifier (which may be multiple words). 
local function get_qualifier_description(qualifier) local splits = m_placetypes.split_qualifiers_from_placetype(qualifier .. " foo") local split = splits[#splits] local prev_qualifier, this_qualifier, bare_placetype = unpack(split, 1, 3) return prev_qualifier and prev_qualifier .. " " .. this_qualifier or this_qualifier end -- Format a set of form-of directive terms. local function format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl) local formatted_terms = {} local placetypes if not overall_place_spec.descs[2] then placetypes = overall_place_spec.descs[1].placetypes else placetypes = {} for _, desc in ipairs(overall_place_spec.descs) do m_table.extend(placetypes, desc.placetypes) end end for _, termobj in ipairs(directive_terms.terms) do local placename_article if not termobj.alt and termobj.term and not termobj.term:find("%[%[") then placename_article = get_placename_article(termobj.term, placetypes) end local linked_term = m_links.full_link(termobj, "term", nil, "show qualifiers") linked_term = "<span class='form-of-definition-link'>" .. linked_term .. "</span>" if termobj.eq then linked_term = linked_term .. " (= " .. m_links.full_link {term = termobj.eq, lang = enlang} .. ")" end if placename_article then linked_term = placename_article .. " " .. linked_term end insert(formatted_terms, linked_term) end local spec = directive_terms.spec local text = spec.text if type(text) == "function" then text = text(overall_place_spec) end if text == "+" then text = directive_terms.directive end if ucfirst then text = m_strutils.ucfirst(text) end if not from_tcl then local tracking_prefix = "form-of/" .. directive_terms.directive track(tracking_prefix) local langcode = overall_place_spec.lang:getCode() local full_langcode = overall_place_spec.lang:getFullCode() track(tracking_prefix .. "/" .. langcode) if full_langcode ~= langcode then track(tracking_prefix .. "/" .. full_langcode) end if full_langcode ~= "en" then track(tracking_prefix .. 
"/non-english")
		end
	end
	return (require(form_of_module).format_form_of {
		text = text,
		lemmas = m_table.serialCommaJoin(formatted_terms, {conj = directive_terms.conj or spec.conjunction or "and"}),
		lemma_classes = false,
		-- text_classes = "place-text",
	})
end

-- Format a set of extra-info terms for extra information that is sometimes added to a definition, such as the capital,
-- largest city, modern name, official name, etc. `overall_place_spec` is the overall parsed {{tl|place}} spec (see
-- comment at top of file); `extra_info_terms` is the terms spec for this type of extra-info (as returned by
-- `parse_extra_info_arg`); and `sentence_style` indicates whether we're generating a sentence-style definition (as
-- suitable for an English-language term without a translation specified using t=).
local function format_extra_info(overall_place_spec, extra_info_terms, sentence_style)
	local formatted_terms = {}
	for _, termobj in ipairs(extra_info_terms.terms) do
		insert(formatted_terms, m_links.full_link(termobj, nil, nil, "show qualifiers"))
	end
	local spec = extra_info_terms.spec
	local text = spec.text
	if type(text) == "function" then
		text = text(overall_place_spec)
	end
	if text == "+" then
		text = spec.arg
	end
	if spec.auto_plural and formatted_terms[2] then
		text = pluralize(text)
	end
	if spec.with_colon then
		text = text .. ":"
	end
	if sentence_style and spec.match_sentence_style then
		text = ". " .. m_strutils.ucfirst(text)
	else
		text = "; " .. text
	end
	-- FIXME: Use joinSegments when available.
	-- return text .. " " ..
	--   m_table.joinSegments(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "and"})
	return text .. " " ..
		m_table.serialCommaJoin(formatted_terms, {conj = extra_info_terms.conj or spec.conjunction or "and"})
end

-- Format an old-style place description (with separate arguments for the placetype and each holonym) for display and
-- return the resulting string.
local function format_old_style_place_desc_for_display(args, place_desc, desc_index, with_article, ucfirst) -- The placetype used to determine whether "di" or "of" follows is the last placetype if there are -- multiple slash-separated placetypes, but ignoring "and", "or" and parenthesized notes -- such as "(one of 254)". local entry_placetype = nil local placetypes = place_desc.placetypes local function is_and_or(item) return item == "and" or item == "or" end local parts = {} local function ins(txt) insert(parts, txt) end local function ins_space() if #parts > 0 then ins(" ") end end local and_or_pos for i, placetype in ipairs(placetypes) do if is_and_or(placetype) then and_or_pos = i -- no break here; we want the last in case of more than one end end local remaining_placetype_index if and_or_pos then track("multiple-placetypes-with-and") if and_or_pos == #placetypes then error("Conjunctions 'and' and 'or' cannot occur last in a set of slash-separated placetypes: " .. concat(placetypes, "/")) end local items = {} for i = 1, and_or_pos + 1 do local pt = placetypes[i] if is_and_or(pt) then -- skip elseif i > 1 and pt:find("^%(") then -- append placetypes beginning with a paren to previous item items[#items] = items[#items] .. " " .. pt else entry_placetype = pt insert(items, get_placetype_description(pt)) end end ins(m_table.serialCommaJoin(items, {conj = placetypes[and_or_pos]})) remaining_placetype_index = and_or_pos + 2 else remaining_placetype_index = 1 end for i = remaining_placetype_index, #placetypes do local pt = placetypes[i] -- Check for and, or and placetypes beginning with a paren (so that things like -- "{{place|en|county/(one of 254)|s/Texas}}" work). if m_placetypes.placetype_is_ignorable(pt) then ins_space() ins(pt) else entry_placetype = pt -- Join multiple placetypes with comma unless placetypes are already -- joined with "and". 
We allow "the" to precede the second placetype
			-- if they're not joined with "and" (so we get "city and county seat of ..."
			-- but "city, the county seat of ...").
			if i > 1 then
				ins(", ")
				local article = m_placetypes.get_placetype_article(pt)
				if article ~= "the" and i > remaining_placetype_index then
					-- Track cases where we are comma-separating multiple placetypes without the second one starting
					-- with "the", as they may be mistakes. The occurrence of "the" is usually intentional, e.g.
					-- {{place|zh|municipality/state capital|s/Rio de Janeiro|c/Brazil|t1=Rio de Janeiro}}
					-- for the city of [[Rio de Janeiro]], which displays as "a municipality, the state capital of ...".
					track("multiple-placetypes-without-and-or-the")
				end
				if article then
					ins(article)
					ins(" ")
				end
			end
			ins(get_placetype_description(pt))
		end
	end

	if place_desc.holonyms then
		for holonym_index, _ in ipairs(place_desc.holonyms) do
			ins(format_holonym_in_context(entry_placetype, place_desc, holonym_index))
		end
	end

	local gloss = concat(parts)

	if with_article then
		local article
		if desc_index == 1 then
			article = args.a
		else
			if not place_desc.holonyms then
				-- there isn't a following holonym; the place type given might be raw text as well, so don't add
				-- an article.
				with_article = false
			else
				local saw_placetype_holonym = false
				for _, holonym in ipairs(place_desc.holonyms) do
					if holonym.placetype then
						saw_placetype_holonym = true
						break
					end
				end
				if not saw_placetype_holonym then
					-- following holonym(s) is/are just raw text; the place type given might be raw text as well,
					-- so don't add an article.
					with_article = false
				end
			end
			if with_article then
				track("second-or-higher-description-with-added-article")
			else
				track("second-or-higher-description-suppressed-article")
			end
		end
		if with_article then
			article = article or m_placetypes.get_placetype_article(place_desc.placetypes[1], ucfirst)
			if article then
				gloss = article .. " " ..
gloss elseif ucfirst then gloss = m_strutils.ucfirst(gloss) end end end return gloss end --[==[ Get the full gloss (English description) of a new-style place description. New-style place descriptions are specified with a single string containing raw text interspersed with placetypes and holonyms surrounded by `<<...>>`. Exported for use by [[Module:demonyms]]. ]==] function export.format_new_style_place_desc_for_display(args, place_desc, with_article) local parts = {} local function ins(txt) insert(parts, txt) end if with_article and args.a then ins(args.a .. " ") end local max_holonym = 0 for _, order in ipairs(place_desc.order) do local segment_type, segment = order.type, order.value if segment_type == "raw" then ins(segment) elseif segment_type == "placetype" then ins(get_placetype_description(segment)) elseif segment_type == "qualifier" then ins(get_qualifier_description(segment)) elseif segment_type == "holonym" then ins(format_holonym(place_desc, segment, false)) if segment > max_holonym then max_holonym = segment end else internal_error("Unrecognized segment type %s", segment_type) end end if place_desc.holonyms and max_holonym < #place_desc.holonyms then local holonym_no_prefix = true for holonym_index = max_holonym + 1, #place_desc.holonyms do ins(format_holonym_in_context(nil, place_desc, holonym_index, holonym_no_prefix)) holonym_no_prefix = false end end return concat(parts) end -- Return a string with the gloss (the description of the place itself, as opposed to translations). If `ucfirst` is -- given, the gloss's first letter is made upper case. If `sentence_style` is given, the "extra info" (modern name, -- capital, largest city, etc.) is displayed as separated sentences; otherwise, it is displayed separated from the main -- definition by semicolons. 
local function get_display_form(data)
	local overall_place_spec, ucfirst, sentence_style, drop_extra_info, extra_info_overridden_set, from_tcl =
		data.overall_place_spec, data.ucfirst, data.sentence_style, data.drop_extra_info,
		data.extra_info_overridden_set, data.from_tcl
	local args = overall_place_spec.args
	local parts = {}
	local function ins(txt)
		table.insert(parts, txt)
	end

	if overall_place_spec.directives and overall_place_spec.directives[1] then
		for i, directive_terms in ipairs(overall_place_spec.directives) do
			ins(directive_terms.pretext)
			if directive_terms.pretext ~= "" then
				ucfirst = false
			end
			if not args.def or args.def == "-" then
				ins(format_form_of_directive(overall_place_spec, directive_terms, ucfirst, from_tcl))
				ucfirst = false
				if i == #overall_place_spec.directives and directive_terms.posttext then
					ins(directive_terms.posttext)
				end
			end
		end
	end

	if args.def == "-" then
		return concat(parts)
	end
	if args.def then
		if args.def:find("<<") then
			local def_desc = export.parse_new_style_place_desc(args.def, args[1])
			ins(export.format_new_style_place_desc_for_display({}, def_desc, false))
		else
			ins(args.def)
		end
	else
		local include_article = true
		for n, desc in ipairs(overall_place_spec.descs) do
			if desc.order then
				ins(export.format_new_style_place_desc_for_display(args, desc, n == 1))
			else
				ins(format_old_style_place_desc_for_display(args, desc, n, include_article, ucfirst))
			end
			if desc.joiner then
				ins(desc.joiner)
			end
			include_article = desc.include_following_article
			ucfirst = false
		end
	end

	local addl = args.addl
	if addl then
		if addl:find("^[;:]") then
			ins(addl)
		elseif addl:find("^_") then
			ins(" " .. addl:sub(2))
		else
			ins(", " .. addl)
		end
	end

	for _, extra_info_terms in ipairs(overall_place_spec.extra_info) do
		-- Include a given extra info term either when
		-- (1) drop_extra_info not set (it's set by {{tcl}}), or
		-- (2) the extra info term is marked as "display even when dropped" (e.g.
modern= or full=, to help understand -- the term's sense), or -- (3) the term was overridden by a `place_*=` setting in {{tcl}}. if not drop_extra_info or extra_info_terms.spec.display_even_when_dropped or extra_info_overridden_set and extra_info_overridden_set[extra_info_terms.spec.arg] then ins(format_extra_info(overall_place_spec, extra_info_terms, sentence_style)) end end return concat(parts) end -- Return the definition line. local function get_def(data) local overall_place_spec, from_tcl, drop_extra_info, extra_info_overridden_set, translation_follows = data.overall_place_spec, data.from_tcl, data.drop_extra_info, data.extra_info_overridden_set, data.translation_follows local args = overall_place_spec.args local sentence_style = overall_place_spec.lang:getCode() == "en" local ucfirst = sentence_style and not args.nocap if #args.t > 0 then local gloss = get_display_form { overall_place_spec = overall_place_spec, ucfirst = false, sentence_style = false, drop_extra_info = drop_extra_info, extra_info_overridden_set = extra_info_overridden_set, from_tcl = from_tcl, } if from_tcl and not args.tcl_nolc then gloss = m_strutils.lcfirst(gloss) end if translation_follows then return (gloss == "" and "" or gloss .. ": ") .. get_translations(args.t, args.tid) else return get_translations(args.t, args.tid) .. (gloss == "" and "" or " (" .. gloss .. ")") end else return get_display_form { overall_place_spec = overall_place_spec, ucfirst = ucfirst, sentence_style = sentence_style, drop_extra_info = drop_extra_info, extra_info_overridden_set = extra_info_overridden_set, from_tcl = from_tcl, } end end ---------- Functions for the category wikicode -- The code in this section finds the categories to which a given place belongs. See comment at top of file. --[=[ Find the appropriate category specs for a given place description and placetype. 
For example, for the template invocation {{tl|place|en|city/and/county|s/Pennsylvania|c/US}}, which results in the place description ``` { placetypes = {"city", "and", "county"}, holonyms = { {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, {placetype = "negara", display_placename = "United States", unlinked_placename = "United States"}, }, holonyms_by_placetype = { state = {"Pennsylvania"}, country = {"United States"}, }, } ``` the call ``` find_placetype_cat_specs { entry_placetype = "city", place_desc = { placetypes = {"city", "and", "county"}, holonyms = { {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, {placetype = "negara", display_placename = "United States", unlinked_placename = "United States"}, }, holonyms_by_placetype = { state = {"Pennsylvania"}, country = {"United States"}, }, }, } ``` might produce the return value ``` { entry_placetype = "city", cat_specs = {"Cities in Pennsylvania, USA"}, triggering_holonym = {placetype = "state", display_placename = "Pennsylvania", unlinked_placename = "Pennsylvania"}, triggering_holonym_index = 1, } ``` See the comment at the top of the section for a description of category specs and the overall algorithm. 
On entry, `data` is an object with the following fields: * `entry_placetype`: the entry placetype (or equivalent) used to look up the category data in placetype_data, which must have already been resolved to a placetype with an entry in `placetype_data`; * `place_desc`: the full place description as documented at the top of the file (used only for its holonyms); * `first_holonym_index`: the index of the first holonym to consider when iterating through the holonyms (used to implement the `:also` holonym placetype modifier); * `overriding_holonym`: an optional overriding holonym to use, in place of iterating through the holonyms (used to implement categorizing other holonyms of the same type as the triggering holonym, so that e.g. {{tl|place|en|river|s/Kansas,Nebraska}}, or equivalently {{tl|place|en|river|s/Kansas|and|s/Nebraska}}, works); * `from_demonym`: we are called from {{tl|demonym-noun}} or {{tl|demonym-adj}} instead of {{tl|place}}, and should generate categories appropriate to those templates. * `form_of_directive`: A form-of directive prefix such as `FORMER_NAME_OF`. If specified, use that type prefix to generate categories appropriate to the form-of directive (in addition to the regular categories generated for the {{tl|place}} invocation, which happens in a separate call). 
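
As a rough illustration of how a caller can choose the next `first_holonym_index` (a sketch only: the helper name
`next_also_index` is hypothetical and not part of this module, and it assumes that the `:also` modifier is recorded on
a holonym as the `continue_cat_loop` field, which is the field `get_placetype_cats` checks when it resumes its search):
```lua
-- Hypothetical helper: find the first holonym at or after index `start` that is flagged with
-- `continue_cat_loop`. Returns its index, or nil if no such holonym exists.
local function next_also_index(holonyms, start)
	for i = start, #holonyms do
		if holonyms[i].continue_cat_loop then
			return i
		end
	end
	return nil
end

-- Toy holonym list; only the fields needed for the sketch are filled in.
local holonyms = {
	{placetype = "state", unlinked_placename = "Kansas"},
	{placetype = "negara", unlinked_placename = "United States"},
	{placetype = "state", unlinked_placename = "Texas", continue_cat_loop = true},
}

-- After a match triggered by the holonym at index 1, resume the search at the next flagged holonym.
local resume_at = next_also_index(holonyms, 2)
print(resume_at) --> 3
```
The value found this way would then be passed back in as `first_holonym_index` on the next call.
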
The return value is {nil} if no category specs could be located, otherwise an object with the following fields: * `entry_placetype`: the placetype that should be used to construct categories when `true` is one of the returned category specs (normally the same as the `entry_placetype` passed in, but will be different when a "fallback" key exists and is used); * `cat_specs`: list of category specs as described above; * `triggering_holonym`: the triggering holonym (see the comment at the top of the section), or nil if there was no triggering holonym; * `triggering_holonym_index`: the index of the triggering holonym in the list of holonyms in `place_desc`, or nil if an overriding holonym was passed in or there was no triggering holonym. ]=] local function find_placetype_cat_specs(data) local entry_placetype, place_desc, first_holonym_index, overriding_holonym, from_demonym = data.entry_placetype, data.place_desc, data.first_holonym_index, data.overriding_holonym, data.from_demonym local form_of_directive = data.form_of_directive local function fetch_cat_specs(holonym_to_match, index, no_fallback) local holonym_placetype = holonym_to_match.placetype if not holonym_placetype then -- raw text in place of holonym return nil end local holonym_placename = holonym_to_match.unlinked_placename if not holonym_placename then internal_error("Missing unlinked_placename in holonym (index %s): %s", index, holonym_to_match) end local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) return m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return m_placetypes.political_division_cat_handler { entry_placetype = equiv_entry_pt, holonym_placetype = equiv_holonym_pt, holonym_placename = holonym_placename, holonym_index = index, place_desc = place_desc, from_demonym = from_demonym, } end) end, {no_fallback = no_fallback, form_of_directive = form_of_directive} ) if cat_specs and 
cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end local cat_handler, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt] if entry_placetype_data and entry_placetype_data.cat_handler then return entry_placetype_data.cat_handler end end, {no_fallback = no_fallback, form_of_directive = form_of_directive} ) if cat_handler then local cat_specs = m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return cat_handler { entry_placetype = equiv_entry_placetype_and_qualifier.placetype, holonym_placetype = equiv_holonym_pt, holonym_placename = holonym_placename, holonym_index = index, place_desc = place_desc, from_demonym = from_demonym, } end) if cat_specs and cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end end if not no_fallback then local cat_specs, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(equiv_entry_pt) local entry_placetype_data = m_placetypes.placetype_data[equiv_entry_pt] if entry_placetype_data then return m_placetypes.get_equiv_placetype_prop(holonym_placetype, function(equiv_holonym_pt) return entry_placetype_data[equiv_holonym_pt .. 
"/*"] end) end end, {form_of_directive = form_of_directive} ) if cat_specs and cat_specs[1] then return cat_specs, equiv_entry_placetype_and_qualifier.placetype end end return nil end if overriding_holonym then -- FIXME, change the algorithm to eliminate overriding_holonym local cat_specs, fetched_entry_placetype = fetch_cat_specs(overriding_holonym, nil) if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = overriding_holonym, -- no triggering_holonym_index } end else -- We loop twice over holonyms, the first time setting `no_fallback` so that we process only category specs for -- the specifically given entry placetype (possibly with preceding qualifiers). The reason for this is to -- correctly handle cases like [[Poblacion IX]]: -- {{place|en|barangay|mun/Roxas|p/Capiz|c/Philippines}}. -- "barangay" falls back to "neighborhood", and without the `no_fallback` loop, the neighborhood cat handler run -- on the mun/Roxas holonym will take precedence over the barangay-specific setting for p/Capiz because we -- check, for each holonym in turn, first for a matching spec through political_division_cat_handler, then a cat -- handler, then a wildcard spec like country/*. During the first no-fallback loop, we disable checking for -- wildcard specs because it seems a fallback matching exactly or through a cat handler on an earlier holonym -- would be better than a wildcard match for the exact entry placetype at a later holonym. (FIXME: But I don't -- know for sure; maybe we should check wildcard holonyms on the exact entry placetype first, or contrariwise -- maybe we should check only exact-match holonyms through political_division_cat_handler on the exact entry -- placetype first, not even checking other cat handlers.) 
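--[=[
Illustration only (this long comment is not executed): the two passes described above can be sketched in isolation as
follows. The names `find_first_match` and `lookup` are hypothetical and exist only for this sketch; the real loops
below call `fetch_cat_specs` and return a richer table.

```lua
-- `lookup(item, exact_only)` is a hypothetical callback that returns a result or nil.
local function find_first_match(items, lookup)
	-- Pass 1: scan every item, accepting only exact (non-fallback) matches.
	for i, item in ipairs(items) do
		local result = lookup(item, true)
		if result then
			return result, i
		end
	end
	-- Pass 2: scan again, this time allowing fallback matches.
	for i, item in ipairs(items) do
		local result = lookup(item, false)
		if result then
			return result, i
		end
	end
	return nil
end

-- Toy data modeled on the Poblacion IX case: the first holonym matches only through a fallback,
-- while the second one matches exactly, so the second one should win.
local holonyms = {"mun/Roxas", "p/Capiz"}
local function lookup(holonym, exact_only)
	if holonym == "p/Capiz" then
		return "exact:" .. holonym
	elseif holonym == "mun/Roxas" and not exact_only then
		return "fallback:" .. holonym
	end
	return nil
end

local result, index = find_first_match(holonyms, lookup)
-- result is "exact:p/Capiz" and index is 2
print(result, index)
```

With a single pass that allowed fallbacks immediately, the first holonym would have matched and the exact setting on
the later holonym would never have been consulted.
]=]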
for i, holonym in ipairs(place_desc.holonyms) do if first_holonym_index and i < first_holonym_index then -- continue else local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i, "no_fallback") if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = holonym, triggering_holonym_index = i, } end end end for i, holonym in ipairs(place_desc.holonyms) do if first_holonym_index and i < first_holonym_index then -- continue else local cat_specs, fetched_entry_placetype = fetch_cat_specs(holonym, i) if cat_specs and cat_specs[1] then return { entry_placetype = fetched_entry_placetype, cat_specs = cat_specs, triggering_holonym = holonym, triggering_holonym_index = i, } end end end end return nil end -- Turn a list of category specs (see comment at section top) into the corresponding categories (minus the language -- code prefix). The function is given the following arguments: -- (1) the category specs retrieved using find_placetype_cat_specs(); -- (2) the entry placetype used to fetch the entry in `placetype_data` -- (3) the triggering holonym (a holonym object; see comment at top of file) used to fetch the category specs -- (see top-of-section comment); or nil if no triggering holonym. -- The return value is constructed as described in the top-of-section comment. local function cat_specs_to_categories(place_desc, cat_data) local all_cats = {} local cat_specs, entry_placetype, triggering_holonym, triggering_holonym_index = cat_data.cat_specs, cat_data.entry_placetype, cat_data.triggering_holonym, cat_data.triggering_holonym_index if triggering_holonym then for _, cat_spec in ipairs(cat_specs) do local cat if cat_spec == true then cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") .. " " .. m_placetypes.get_placetype_entry_preposition(entry_placetype) .. 
" +++" else cat = cat_spec end if cat:find("%+%+%+") then local group, key, spec, container_trail = m_placetypes.find_matching_holonym_location { holonym_placetype = triggering_holonym.placetype, holonym_placename = triggering_holonym.unlinked_placename, holonym_index = triggering_holonym_index, place_desc = place_desc, } if group then cat = cat:gsub("%+%+%+", m_strutils.replacement_escape(m_placetypes.get_prefixed_key(key, spec))) insert(all_cats, cat) else mw.log(("Unable to insert category for cat spec '%s' because holonym '%s/%s' did not match a " .. "known location"):format(cat, triggering_holonym.placetype, triggering_holonym.unlinked_placename)) track("cant-match-holonym-for-category-spec") end else insert(all_cats, cat) end end else for _, cat_spec in ipairs(cat_specs) do local cat if cat_spec == true then cat = m_placetypes.pluralize_placetype(entry_placetype, "ucfirst") else cat = cat_spec if cat:find("%+%+%+") then internal_error("Category %s contains +++ but there is no holonym to substitute", cat) end end insert(all_cats, cat) end end return all_cats end -- Return the categories (without initial lang code) that should be added to the entry, given the place description -- (which specifies the entry placetype(s) and holonym(s); see top of file) and a particular entry placetype (e.g. -- "city"). Note that only the holonyms from the place description are looked at, not the entry placetypes in the place -- description. local function get_placetype_cats(place_desc, entry_placetype, from_demonym, form_of_directive) local cats = {} local first_holonym_index = 1 while first_holonym_index <= #place_desc.holonyms do -- Find the category specs (see top-of-file comment) corresponding to the holonym(s) in the place description. 
local cat_data = find_placetype_cat_specs { entry_placetype = entry_placetype, place_desc = place_desc, first_holonym_index = first_holonym_index, from_demonym = from_demonym, form_of_directive = form_of_directive, } -- Check if no category spec could be found. if not cat_data then break end local triggering_holonym = cat_data.triggering_holonym if not triggering_holonym then internal_error("find_placetype_cat_specs should have returned a triggering holonym: %s", cat_data) end -- Generate categories for the category specs found. extend(cats, cat_specs_to_categories(place_desc, cat_data)) -- Also generate categories for other holonyms of the same placetype, so that e.g. -- {{place|en|city|s/Kansas|and|s/Missouri|c/USA}} generates both [[:Category:en:Cities in Kansas, USA]] and -- [[:Category:en:Cities in Missouri, USA]]. first_holonym_index = cat_data.triggering_holonym_index -- Loop over non-fallback equivalent placetypes to the triggering holonym's placetype, in case it is -- non-canonical (e.g. `cities/San Francisco`). This matches the loop over equivalent places in -- key_holonym_into_place_desc(). 
local equiv_triggering_placetypes = m_placetypes.get_placetype_equivs(triggering_holonym.placetype, {no_fallback = true}) for _, equiv in ipairs(equiv_triggering_placetypes) do local other_holonyms_of_same_type = place_desc.holonyms_by_placetype[equiv.placetype] if other_holonyms_of_same_type then for _, other_placename_of_same_type in ipairs(other_holonyms_of_same_type) do if other_placename_of_same_type ~= triggering_holonym.unlinked_placename then local overriding_holonym = { placetype = triggering_holonym.placetype, unlinked_placename = other_placename_of_same_type, } local other_cat_data = find_placetype_cat_specs { entry_placetype = entry_placetype, place_desc = place_desc, overriding_holonym = overriding_holonym, from_demonym = from_demonym, form_of_directive = form_of_directive, } if other_cat_data then extend(cats, cat_specs_to_categories(place_desc, other_cat_data)) end end end end end -- If there are any later-specified holonyms that had the modifier :also, try to produce categories for them -- as well. first_holonym_index = first_holonym_index + 1 while first_holonym_index <= #place_desc.holonyms do if place_desc.holonyms[first_holonym_index].continue_cat_loop then break end first_holonym_index = first_holonym_index + 1 end end if cats[1] then return cats end local entry_pt_default, equiv_entry_placetype_and_qualifier = m_placetypes.get_equiv_placetype_prop(entry_placetype, function(pt) return m_placetypes.placetype_data[pt] and m_placetypes.placetype_data[pt].default end, {form_of_directive = form_of_directive}) if entry_pt_default then return cat_specs_to_categories(place_desc, { cat_specs = entry_pt_default, entry_placetype = equiv_entry_placetype_and_qualifier.placetype, -- no triggering holonym }) end return {} end --[==[ Iterate through each type of place and return a list of the categories that need to be added to the entry. 
The returned categories need to be formatted using `format_cats`, as they can be either topic-style categories (by default) or langname-style categories (if prefixed with `cln:`). The function is passed the overall place spec, which contains all the parsed info on the {{tl|place}} call (see comment at top of file), the parsed arguments (needed for arguments not parsed by `parse_overall_place_spec` and used primarily to add "bare categories" corresponding to toponyms for known locations), and `from_demonym`, which is true if we're being called from {{tl|demonym-noun}} or {{tl|demonym-adj}} (in this case, we only want certain categories added, specifically bare categories corresponding to the specified holonym(s)). ]==] function export.get_cats(args, overall_place_spec, from_demonym) local cats = {} local place_descriptions = overall_place_spec.descs handle_category_implications(place_descriptions, m_placetypes.cat_implications) m_placetypes.augment_holonyms_with_container(place_descriptions) if overall_place_spec.directives then -- not necessarily when called from [[Module:demonym]] for _, directive_terms in ipairs(overall_place_spec.directives) do local spec_cats = directive_terms.spec.cat if spec_cats then if type(spec_cats) == "string" then spec_cats = {spec_cats} end for _, spec_cat in ipairs(spec_cats) do insert(cats, spec_cat) end end if directive_terms.spec.type_prefix then for _, place_desc in ipairs(place_descriptions) do for _, placetype in ipairs(place_desc.placetypes) do if not m_placetypes.placetype_is_ignorable(placetype) then extend(cats, get_placetype_cats(place_desc, placetype, from_demonym, directive_terms.spec.type_prefix)) end end end end end end if not from_demonym then local bare_categories = m_placetypes.get_bare_categories(args, overall_place_spec) extend(cats, bare_categories) end for _, place_desc in ipairs(place_descriptions) do if not from_demonym then for _, placetype in ipairs(place_desc.placetypes) do if not 
m_placetypes.placetype_is_ignorable(placetype) then extend(cats, get_placetype_cats(place_desc, placetype)) end end end -- Also add generic place categories for the holonyms listed (e.g. a category like -- [[Category:Places in Merseyside, England]]). This is handled through the special placetype "*". extend(cats, get_placetype_cats(place_desc, "*", from_demonym)) end if args.cat then -- not necessarily when called from [[Module:demonym]] for _, cat in ipairs(args.cat) do local split_cats = split_on_comma(cat) extend(cats, split_cats) end end return cats end -- Return the formatted category links for a list of categories, given the language object, the category names and an optional sort key. local function format_cats(lang, cats, sort_key) local full_cats = {} local langcode = lang:getFullCode() for _, cat in ipairs(cats) do -- 'cln' corresponds to {{cln}}, which generates lang-name categories like [[:Category:English abbreviations]] -- (as opposed to topic categories like [[:Category:en:Abbreviations of states of the United States]]). local cln_cat = cat:match("^cln:(.*)$") if cln_cat then insert(full_cats, lang:getFullName() .. " " .. cln_cat) else insert(full_cats, langcode .. ":" .. cat) end end return require(utilities_module).format_categories(full_cats, lang, sort_key, nil, force_cat or m_placetypes.get_force_cat()) end ----------- Main entry point --[==[ Implementation of {{tl|place}}. Meant to be callable from another module (specifically, [[Module:transclude]]). The single argument `data` is an object with the following fields: * `template_args`: Raw arguments specified by {{tl|place}}, possibly modified by {{tl|tcl}}. * `from_tcl`: True if we're being invoked from {{tl|tcl}}. * `drop_extra_info`: True if we should drop most of the "extra info" specified using extra info arguments (capital, largest city, etc.). Usually true when invoked from {{tl|tcl}}. Note that some extra info is still displayed even when `drop_extra_info` is set in order to establish the context (e.g.
{{para|full}} and {{para|modern}}), and any extra info overridden at the {{tl|tcl}} level is displayed regardless. * `extra_info_overridden_set`: Set of booleans specifying, for each extra info arg, whether it was overridden at the {{tl|tcl}} level. This means, for example, that the values are interpreted according to the language in {{para|1}} instead of always defaulting to English, as is the case when {{tl|place}} is called directly. * `form_of_overridden_args`: Set of objects of the form `{new_directive = ``directive``, new_value = ``value``}` for overriding a given form-of directive (the key) with new directive ``directive`` and new unparsed value ``value``. Both the key and the replacing directive should be canonical. ``value`` will be parsed in the same way as a regular form-of directive except that all specified terms are interpreted in the language specified in {{para|1}}, never in English. This is present so that {{tl|tcl}} can be used on abbreviations like [[GDR]] and [[FYROM]], whose equivalents in a foreign language have language-specific expansions but where the rest of the call should stay the same. * `translation_follows`: If true, any translation specified using t= should follow the definition, after a colon, rather than preceding, with the definition in parens. ]==] function export.format(data) local template_args = data.template_args local list_param = {list = true} local boolean_param = {type = "boolean"} local params = { [1] = {required = true, type = "language", default = "und"}, [2] = {required = true, list = true}, ["t"] = list_param, ["tid"] = {list = true, allow_holes = true}, ["cat"] = list_param, ["nocat"] = boolean_param, ["nocap"] = boolean_param, ["sort"] = true, ["pagename"] = true, -- for testing or documentation purposes ["a"] = true, ["addl"] = true, ["def"] = true, -- params that are only used when transcluding using {{tcl}}/{{transclude}}, to transmit information to {{tcl}}. 
["tcl"] = true, ["tcl_t"] = list_param, ["tcl_tid"] = list_param, ["tcl_nolb"] = true, ["tcl_nolc"] = boolean_param, ["tcl_noextratext"] = boolean_param, } -- add "extra info" parameters for _, extra_arg_spec in ipairs(export.extra_info_args) do params[extra_arg_spec.arg] = list_param end -- FIXME, once we've flushed out any uses, delete the following clause. That will cause def= to be ignored. if template_args.def == "" then error("Cannot currently pass def= as an empty parameter; use def=- if you want to suppress the definition display") end local args = require("Module:parameters").process(template_args, params) if args.a then track("a") if args.a:find("^[Aa]n?$") or args.a:find("^[Tt]he$") then track("a/article") else error("a= can only be used to specify a definite or indefinite article (and preferably use |nocap=1 instead to get the initial letter lowercase); see especially the documentation on the [[Template:place#Mixed format|mixed format]], which can be used to add arbitrary text before the placetype") end end data.args = args local overall_place_spec = parse_overall_place_spec(data) data.overall_place_spec = overall_place_spec return get_def(data) .. ( args.nocat and "" or format_cats(args[1], export.get_cats(args, overall_place_spec), args.sort)) end --[==[ Actual entry point of {{tl|place}}. 
]==] function export.show(frame) return export.format { template_args = frame:getParent().args, } end return export jon52s5c2fdlkwuduulsxgt074fk66m Kategori:Perkataan dengan terjemahan bahasa Turki Usmaniyah 14 77014 281346 225608 2026-04-22T05:14:34Z Hakimi97 2668 Hakimi97 telah memindahkan laman [[Kategori:Perkataan dengan terjemahan bahasa Turki Uthmaniyah]] ke [[Kategori:Perkataan dengan terjemahan bahasa Turki Usmaniyah]]: Tajuk salah eja 225608 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Penyelenggaraan entri bahasa Turki Usmaniyah 14 77015 281330 225609 2026-04-22T00:40:18Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Penyelenggaraan entri bahasa Turki Uthmaniyah]] ke [[Kategori:Penyelenggaraan entri bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 225609 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx tahun lompat 0 77174 281417 225940 2026-04-22T08:27:17Z PeaceSeekers 3334 281417 wikitext text/x-wiki == Bahasa Melayu == {{Wikipedia}} <!-- Kalau ada --> === Takrifan === ==== Kata nama ==== {{ms-kn|j=تاهون لومڤت}} # Tahun dalam takwim [[Masihi]] di mana satu hari tambahan ditambah pada akhir bulan [[Februari]] (29 Februari) untuk mengimbangi waktu tambahan [[tahun suria]] berbanding takwim. 
#: {{syn|ms|tahun kabisat}} === Terjemahan === {{trans-top|tahun Masihi dengan hari tambahan}} * Afrikaans: {{t+|af|skrikkeljaar}} * Altai: *: Altai Selatan: {{t|alt|кату јыл}} * Arab: {{t|ar|سَنَة كَبِيسَة}} * Belanda: {{t+|nl|schrikkeljaar|n}} * Breton: {{t|br|bloavezh bizeost|m}} * Bulgaria: {{t|bg|високосна година|f}} * Burma: {{t|my|ရက်ထပ်နှစ်}} * Catalonia: {{t+|ca|any bixest|m}}, {{t+|ca|any bissextil|m}}, {{t+|ca|any de traspàs|m}} * Cina: *: Mandarin: {{t+|cmn|閏年}} * Cornwall: {{t|kw|bledhen lamm|f}} * Czech: {{t+|cs|přestupný rok|m}} * Denmark: {{t+|da|skudår|n}} * Esperanto: {{t|eo|superjaro}} * Estonia: {{t+|et|liigaasta}} * Faroe: {{t|fo|leypár|n}} * Finland: {{t+|fi|karkausvuosi}} * Gael Scotland: {{t|gd|bliadhna-leum|f}} * Georgia: {{t|ka|ნაკიანი წელი}}, {{t|ka|ნაკიანი წელიწადი}} * Hindi: {{t|hi|अधिवर्ष}}, {{t|hi|लीप वर्ष}} * Hungary: {{t+|hu|szökőév}} * Ibrani: {{t+|he|שנה מעוברת|m|tr=shaná meubéret}} * Iceland: {{t|is|hlaupár|n}} * Ido: {{t|io|bisextila yaro}} * Indonesia: {{t+|id|tahun kabisat}} * Inggeris: {{t+|en|leap year}} * Interlingua: {{t|ia|anno bissextil}} * Ireland: {{t|ga|bliain bhisigh|f}} * Itali: {{t|it|anno bisestile|m}} * Jepun: {{t+|ja|閏年|tr=じゅんねん, junnen; うるうどし, urūdoshi}} * Jerman: {{t+|de|Schaltjahr|n}} * Khmer: {{t|km|ឆ្នាំបង្គ្រប់}} * Korea: {{t+|ko|윤년(閏年)}} * Lao: {{t|lo|ປີອະທິກະສຸລະທິນ}} * Lithuania: {{t|lt|keliamieji metai|m-p}} * Luxembourg: {{t|lb|Schaltjoer|n}} * Macedonia: {{t|mk|престапна година|f}} * Malta: {{t|mt|sena biżestili}} * Māori: {{t|mi|tau kuhurangi}} * Minangkabau: {{t|min|tahun kabisat}} * Mongol: {{t|mn|өндөр жил}} * Norman: {{t|nrf|année bissextile|f}} * Norway: *: Bokmål: {{t|nb|skottår|n}}, {{t|nb|skuddår|n}} *: Nynorsk: {{t|nn|skotår|n}}, {{t|nn|skottår|n}} * Pashto: {{t|ps|د کبيسې کال|m}} * Parsi: {{t|fa|سال انباشته|tr=sâl-e anbâšte}}, {{t+|fa|سال کبیسه|tr=sâl-e kabise}} * Perancis: {{t+|fr|année bissextile|f}} * Plautdietsch: {{t|pdt|Schaultjoa|n}} * Poland: {{t+|pl|rok przestępny|m-in}} * 
Portugis: {{t+|pt|ano bissexto|m}} * Romania: {{t+|ro|an bisect|m}}, {{t|ro|an bisectil|m}} * Rusia: {{t+|ru|високо́сный год|m}} * Samoa: {{t|sm|puna ifo tausaga}} * Sepanyol: {{t+|es|año bisiesto|m}}, {{t+|es|bisiesto|m}} * Serbo-Croatia: *: Cyril: {{t|sh|преступна го̏дина|f}} *: Latin: {{t|sh|prestupna gȍdina|f}}, {{t|sh|prijestupna gȍdina|f}} * Slovak: {{t|sk|priestupný rok|m}} * Slovene: {{t+|sl|prestopno leto|n}} * Sweden: {{t+|sv|skottår|n}} * Tagalog: {{t|tl|taong bisyesto}} * Tajik: {{t|tg|соли кабиса}} * Thai: {{t|th|ปีอธิกมาส}}, {{t|th|ปีอธิกสุรทิน}} * Turki: {{t+|tr|artık yıl}} * Ukraine: {{t|uk|пере́ступний рік}} {{qualifier|dated}}, {{t|uk|високо́сний рік|m}} * Urdu: {{t|ur|سال کبیسہ|tr=sāl-e kabīsā}} * Vietnam: {{t+|vi|năm nhuận}}, {{t|vi|năm nhuần}} * Wales: {{t|cy|blwyddyn naid}} * Waray-Waray: {{t|war|tuig bisyesto}} * Yiddish: {{t|yi|עיבור־יאָר|n|tr=iber-yor}} * Yunani: {{t|el|δίσεκτο έτος|n}} {{trans-bottom}} === Pautan luar === * {{R:PRPM}} {{C|ms|Takwim|Tahun}} 0c726aa8llj2kgksrsgd454fmmmnswg Kategori:Kata nama bahasa Turki Usmaniyah 14 77426 281326 226398 2026-04-22T00:38:48Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Kata nama bahasa Turki Uthmaniyah]] ke [[Kategori:Kata nama bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 226398 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Rekonstruksi:Bahasa Indo-Eropah Purba/dwóh₁ 110 79527 281345 230299 2026-04-22T03:31:39Z Hakimi97 2668 /* Terbitan */ 281345 wikitext text/x-wiki {{reconstructed}} ==Bahasa Indo-Eropah Purba== {{etymon|ine-pro|id=two}} ===Kata bilangan=== {{cardinalbox|ine-pro|1|2|3|*óynos|*tréyes|ord=*h₂énteros|adv=*dwís|frac=*sēmi|opt=Awalan|optx=*dwi-}} {{head|ine-pro|kata bilangan}}<ref name="PIEPG">{{R:gem:PIEPG|page=53}}</ref> # [[dua]], 2 ====Bentuk alternatif==== * {{alt|ine-pro|*dwó|*duwó}}<ref name="PIEPG"/><ref name="LIPP">{{R:ine:LIPP|vol=2|entry=*du̯ó-, *du̯í- 'zwei (einzelne)'|page=168-174}}</ref> {{q|bentuk tak
terinfleksi}} * {{alt|ine-pro|*dwṓw}}<ref name="LIPP"/> ====Infleksi==== {{ine-decl-adj|n=d|dwó}} ====Terbitan==== * {{l|ine-pro|*dwi-||pos=kata majmuk}} ** {{desc|ine-pro|*wí|alt=*(h₁)wi-|nolang=1|unc=1}}<ref name="De Vaan"/> {{see desc}} * {{l|ine-pro||*dwi-gʰo-}}<ref name="LIPP"/> ** {{desc|sqj-pro|*duaigā}} {{q|< {{m|ine-pro||*dwoy-gʰ-eh₂}}}} *** {{desc|sq|degë|t=cabang (terpisah)}} ** {{desc|ine-bsl-pro|*dweigas}} *** {{desc|sla-pro|*dvigъ|t=cabang}} **** {{desc|sla-pro|*dvigati|t=angkat|der=1}} ** {{desc|gem-pro|*twīgą|t=cabang (terpisah); ranting}} {{see desc}} ** {{desc|grk-pro}} *** {{desc|grc|δίχα}}, {{l|grc|διχθά}}, {{l|grc|διχο-}}, {{l|grc|διξός}}, {{l|grc|δισσός}} * {{l|ine-pro||*dwí-ko-s}} ** {{desc|gem-pro|*twihô|der=1}} *** {{desc|gmw-pro|*twihō|t=syak}} {{see desc}} ** {{desc|iir-pro}} *** {{desc|inc-pro}} **** {{desc|sa|द्विक}} * {{l|ine-pro||*dwi-no-s}}<ref>{{R:itc:EDL|head=bis|page=72}}</ref> ** {{desc|gem-pro|*twinaz}} {{see desc}} ** ⇒ {{l|ine-pro||*dwis-no-s}} *** {{desc|gem-pro|*twiznaz}} {{see desc}} *** {{desc|itc-pro|alt=*dwiznos}} **** {{desc|la|bīnus}} {{see desc}} * {{l|ine-pro|*dwi[[*pel-#Bahasa Indo-Eropah Purba: lipat|-pl-o-s]]||double}} ** {{desc|gem-pro|*twīflaz|t=syak}} {{see desc}} ** {{desc|grk-pro|}} *** {{desc|grc|διπλόος}}, {{l|grc|δῐπλᾱ́ς}} ***: {{desc|grc-att|δῐπλοῦς}} ***: {{desc|grc-ion|δῐπλέος}} **** {{desc|el|διπλός}}, {{l|el|δίπλα}} ** {{desc|itc-pro|*dwiplos}} *** {{desc|la|duplus}} {{see desc}} * {{l|ine-pro||*dwi-pl-o-m}} ** {{desc|imy|𐊗𐊂𐊆𐊓𐊍𐊚|tr=tbiplẽ|unc=1}} * {{l|ine-pro|*dwís|*dwí-s|pos=adverba}} * {{l|ine-pro||*dwi-sk-}} ** {{desc|gem-pro|*twisk(j)a-|t=dua kali ganda}} *** {{desc|osx|twisk}} *** {{desc|goh|zuiski}}, {{l|goh|zwisk}} **** {{desc|gmh|zwisc(h)}} ***** {{desc|de|zwischen|der=1}} * {{l|ine-pro||*(d)wi-tyo-}}<ref name="De Vaan">{{R:itc:EDL|head=vitium|page=684}}</ref> {{q|dengan disimilasi ''*d…t'' > ''*(h₁)…t''}} ** {{desc|itc-pro|*witjom}} *** {{desc|la|vitium}} {{see desc}} * {{l|ine-pro||*dwoy-}} 
** {{desc|hit|𒋫𒄿𒊌𒀀|tr=tāiuga|t=umur dua tahun|der=1}} ** {{desc|hyx-pro|-}} *** {{desc|xcl|*երկե-}} **** {{desc|xcl|երկերիւր|der=1}} ** {{l|ine-pro||*dwoy-os}} *** {{desc|ine-bsl-pro|*dwajas}} **** {{desc|sla-pro|*dъvojь}} **** {{desc|lt|dveji}} *** {{desc|grk-pro}} **** {{desc|grc|δοιός}} *** {{desc|iir-pro|*dwayás}} **** {{desc|sa|द्वय|tr=dvayá}} ** {{l|ine-pro||*dwoy-om}} *** {{desc|hit|𒋫𒀀𒀭|tr=ta-a-an|ts=tān}}<ref>{{R:hit:Kloekhorst|page=826-827|head=tān}}</ref> *** {{desc|xlu|𔑢𔗬𔐤𔔂|tr=tu-wa-na|unc=1}} ** {{l|ine-pro||*dwoy-o-mṓi-}} *** {{desc|hit|𒁕𒈠𒄿|tr=dam(m)ai-}}, {{l|hit|𒋫𒈠𒄿|tr=tāmai-|t=kedua, lain}} * {{l|ine-pro|*dwey-|t=takut}} ;Pembentukan tak dikelaskan * {{desc|gem-pro|*twīhnaz}} ====Keturunan==== * {{desc|sqj-pro|*duwō}} {{see desc}} * Anatolia: ** {{desc|hit|𒋫|tr=ta-}} ** {{desc|xlu|𔑢𔗬|𔓯𔖩|tr=tuwa|tr2=i-zi-}} ** {{desc|imy|𐊗𐊂𐊆|tr=tbi}} ** {{desc|xlc|𐊋𐊂𐊆}} * {{desc|hyx-pro}} ** {{desctree|xcl|երկու}} ** {{desctree|xcl|կրկին|qq=susur terbitan tepat dipertikai}} * {{desc|ine-bsl-pro|*duwōˀ|*duōˀ}} {{see desc}} * {{desc|cel-pro|*duwo}} {{see desc}} * {{desc|gem-pro|*twai}} {{see desc}} * {{desc|grk-pro|*dúwo}} {{q|< {{l|ine-pro||*duwó}}<ref>{{R:grc:Beekes|head=δύο|page=359}}</ref>}} {{see desc}} * {{desc|iir-pro|*dwáH}} {{see desc}} * {{desc|itc-pro|*duō}} {{see desc}} * {{desc|ine-toc-pro}} ** {{desc|xto|wu|we}} ** {{desc|txb|wi}} ===Rujukan=== <references/> ===Bacaan lanjut=== * {{R:gem:EDPG|page=529-530}} * {{R:ine:IEW|page=228-232}} r17hj0wrafddgtyobd2zi3eftntgggv Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Usmaniyah 14 80819 281331 232212 2026-04-22T00:40:48Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Uthmaniyah]] ke [[Kategori:Perkataan dengan kod tulisan lewah bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 232212 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Permintaan perkataan bahasa Turki Usmaniyah 14 88257 281332 
248694 2026-04-22T00:40:57Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Permintaan perkataan bahasa Turki Uthmaniyah]] ke [[Kategori:Permintaan perkataan bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 248694 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kata sifat bahasa Turki Usmaniyah 14 91356 281327 252673 2026-04-22T00:39:05Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Kata sifat bahasa Turki Uthmaniyah]] ke [[Kategori:Kata sifat bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 252673 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Modul bahasa Turki Usmaniyah 14 92212 281333 253806 2026-04-22T00:41:27Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Modul bahasa Turki Uthmaniyah]] ke [[Kategori:Modul bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 253806 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:th/กกุธภัณฑ์ 4 92266 281421 253881 2026-04-22T09:10:33Z PeaceSeekers 3334 /* Sebutan */ 281421 wikitext text/x-wiki ==Bahasa Thai== ===Kata nama=== {{inti|th|kata nama}} ===Etimologi=== Daripada {{bor|th|pi|kakudhabhaṇḍa}}, daripada {{m|pi|kakudha|gloss=[[panji]] atau [[simbol]] daripada [[royalti]]}} + {{m|pi|bhaṇḍa|gloss=[[artikel]]; [[instrumen]]; [[perkakas]]}}; bersamaan dengan {{cog|th|-}} {{com|th|กกุธ|ภัณฑ์}}. ===Sebutan=== {{th-seb|กะ-กุด-ทะ-พัน}} ===Kata nama=== {{th-kn}} # [[alat kebesaran]] [[diraja]] ## {{lb|th|khusus}} Alat-alat kebesaran diraja Thailand, ataupun disebut "Lima Regalia Diraja": Mahkota Kemenangan Besar, Pedang Kemenangan, Tongkat Diraja, Kipas Diraja (dan Cambuk), Sandal Diraja.
==== Lihat juga ==== {{top2}} * {{l|th|เครื่องราชกกุธภัณฑ์}} * {{l|th|เบญจราชกกุธภัณฑ์}} {{bottom}} <!-- ==== Terjemahan dalam bahasa lain ==== {{trans-top|lambang kerajaan}} * Jepun: {{t|ja|[[五]][[種]][[の]][[神器]]|tr=ごしゅのじんぎ, โกะชุโนะจิงงิ}} * Laos: {{t|lo|ກະກຸທະພັນ}} * Inggeris: {{t|en|[[insignia]] [[of]] [[kingship]]}}, {{t+|en|regalia}} {{trans-bottom}} --> {{C|th|Thailand}} m7h8jf54qij43nulcoo3rith211tyu9 281422 281421 2026-04-22T09:11:25Z PeaceSeekers 3334 /* Kata nama */ 281422 wikitext text/x-wiki ==Bahasa Thai== ===Kata nama=== {{inti|th|kata nama}} ===Etimologi=== Daripada {{bor|th|pi|kakudhabhaṇḍa}}, daripada {{m|pi|kakudha|gloss=[[panji]] atau [[simbol]] daripada [[royalti]]}} + {{m|pi|bhaṇḍa|gloss=[[artikel]]; [[instrumen]]; [[perkakas]]}}; bersamaan dengan {{cog|th|-}} {{com|th|กกุธ|ภัณฑ์}}. ===Sebutan=== {{th-seb|กะ-กุด-ทะ-พัน}} ===Kata nama=== {{head|th|kata nama}} # [[alat kebesaran]] [[diraja]] ## {{lb|th|khusus}} Alat-alat kebesaran diraja Thailand, ataupun disebut "Lima Regalia Diraja": Mahkota Kemenangan Besar, Pedang Kemenangan, Tongkat Diraja, Kipas Diraja (dan Cambuk), Sandal Diraja.
==== Lihat juga ==== {{top2}} * {{l|th|เครื่องราชกกุธภัณฑ์}} * {{l|th|เบญจราชกกุธภัณฑ์}} {{bottom}} <!-- ==== Terjemahan dalam bahasa lain ==== {{trans-top|lambang kerajaan}} * Jepun: {{t|ja|[[五]][[種]][[の]][[神器]]|tr=ごしゅのじんぎ, โกะชุโนะจิงงิ}} * Laos: {{t|lo|ກະກຸທະພັນ}} * Inggeris: {{t|en|[[insignia]] [[of]] [[kingship]]}}, {{t+|en|regalia}} {{trans-bottom}} --> {{C|th|Thailand}} 4rrse4qv3fs72ug1ecj6aicju561vko Kategori:Modul data bahasa Turki Usmaniyah 14 92877 281329 254915 2026-04-22T00:40:09Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:Modul data bahasa Turki Uthmaniyah]] ke [[Kategori:Modul data bahasa Turki Usmaniyah]] tanpa meninggalkan lencongan: Tukar nama 254915 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx datsi 0 94530 281399 257304 2026-04-22T07:51:05Z PeaceSeekers 3334 /* Bahasa Chin Tedim */ 281399 wikitext text/x-wiki == Bahasa Chin Tedim == === Kata nama === {{head|ctd|kata nama}} # minyak [[petrol]] === Etimologi === {{bor+|ctd|my|ဓာတ်ဆီ}}. 
6mvzgeu2t3qwikymyb5wmm7zfb1lje0 Kategori:beg:Alat dapur 14 112150 281338 278314 2026-04-22T01:04:10Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:beg:Peralatan dapur]] ke [[Kategori:beg:Alat dapur]] tanpa meninggalkan lencongan: Tajuk salah eja 278314 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:hi:Fenomena atmosfera 14 113160 281341 279417 2026-04-22T01:09:20Z PeaceSeekers 3334 PeaceSeekers telah memindahkan laman [[Kategori:hi:Kejadian atmosfera]] ke [[Kategori:hi:Fenomena atmosfera]] tanpa meninggalkan lencongan: Tajuk salah eja 279417 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Anjing 14 114822 281249 2026-04-21T13:42:29Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281249 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Anjing 14 114823 281250 2026-04-21T13:42:45Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281250 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kuda 14 114824 281251 2026-04-21T13:43:21Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281251 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kutleri 14 114825 281252 2026-04-21T13:43:25Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281252 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Equidae 14 114826 281253 2026-04-21T13:43:37Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281253 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Ungulat kuku ganjil 14 114827 281254 2026-04-21T13:43:40Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281254 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Alat dapur 14 114828 281255 2026-04-21T13:44:29Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281255 
wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Beri 14 114829 281256 2026-04-21T13:46:32Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281256 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Cacing 14 114830 281257 2026-04-21T13:46:35Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281257 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Hotel 14 114831 281258 2026-04-21T13:47:06Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281258 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Jenis perniagaan 14 114832 281259 2026-04-21T13:47:18Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281259 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Mata wang 14 114833 281260 2026-04-21T13:48:48Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281260 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kebersihan kesihatan 14 114834 281261 2026-04-21T13:48:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281261 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Pendengaran 14 114835 281262 2026-04-21T13:50:25Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281262 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Rangka 14 114836 281263 2026-04-21T13:50:28Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281263 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Ungulat kuku genap 14 114837 281264 2026-04-21T13:50:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281264 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Arah 14 114838 281265 2026-04-21T13:52:06Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto 
cat}}' 281265 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Arca 14 114839 281266 2026-04-21T13:52:11Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281266 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Buddhisme 14 114840 281267 2026-04-21T13:52:53Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281267 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Bulan Kristian Syria 14 114841 281268 2026-04-21T13:52:58Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281268 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Buruj 14 114842 281269 2026-04-21T13:53:01Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281269 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Cengkerik dan belalang 14 114843 281270 2026-04-21T13:57:10Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281270 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Cervidae 14 114844 281271 2026-04-21T13:57:25Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281271 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Crocodilia 14 114845 281272 2026-04-21T14:00:37Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281272 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Cuti 14 114846 281273 2026-04-21T14:00:42Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281273 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Dapur 14 114847 281277 2026-04-21T14:05:16Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281277 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Dekapod 14 114848 281278 2026-04-21T14:05:19Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan 
'{{auto cat}}' 281278 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Demonim 14 114849 281279 2026-04-21T14:07:35Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281279 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Elipsis bahasa Hungary 14 114850 281280 2026-04-21T14:09:01Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281280 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Pemendekan bahasa Hungary 14 114851 281281 2026-04-21T14:09:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281281 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Individu 14 114852 281282 2026-04-21T14:10:31Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281282 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Ketam 14 114853 281284 2026-04-21T14:11:44Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281284 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kerakyatan 14 114854 281285 2026-04-21T14:11:49Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281285 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kecacatan 14 114855 281286 2026-04-21T14:14:40Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281286 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Katolik 14 114856 281287 2026-04-21T14:14:44Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281287 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Klimatologi 14 114857 281288 2026-04-21T14:17:07Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281288 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Krismas 14 114858 281289 2026-04-21T14:21:02Z PeaceSeekers 3334 Mencipta laman 
baru dengan kandungan '{{auto cat}}' 281289 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kucing 14 114859 281290 2026-04-21T14:21:07Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281290 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Komelinid 14 114860 281291 2026-04-21T14:22:42Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281291 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Mekanisme 14 114861 281292 2026-04-21T14:24:14Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281292 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Mineral 14 114862 281293 2026-04-21T14:24:24Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281293 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Minuman beralkohol 14 114863 281294 2026-04-21T14:39:12Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281294 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Modul:languages/chars 828 114864 281319 2026-04-21T19:40:58Z Hakimi97 2668 Mencipta laman baru dengan kandungan 'local export = {} local table = table local insert = table.insert local u = require("Module:string/char") -- UTF-8 encoded strings for some commonly-used diacritics. local c = { prime = u(0x02B9), grave = u(0x0300), acute = u(0x0301), circ = u(0x0302), -- circumflex tilde = u(0x0303), macron = u(0x0304), overline = u(0x0305), breve = u(0x0306), dotabove = u(0x0307), diaer = u(0x0308), -- diaeresis ringabove =...' 281319 Scribunto text/plain local export = {} local table = table local insert = table.insert local u = require("Module:string/char") -- UTF-8 encoded strings for some commonly-used diacritics. 
local c = { prime = u(0x02B9), grave = u(0x0300), acute = u(0x0301), circ = u(0x0302), -- circumflex tilde = u(0x0303), macron = u(0x0304), overline = u(0x0305), breve = u(0x0306), dotabove = u(0x0307), diaer = u(0x0308), -- diaeresis ringabove = u(0x030A), dacute = u(0x030B), -- double acute caron = u(0x030C), lineabove = u(0x030D), dgrave = u(0x030F), -- double grave invbreve = u(0x0311), -- inverted breve turnedcommaabove = u(0x0312), commaabove = u(0x0313), revcommaabove = u(0x0314), -- reversed comma above dotbelow = u(0x0323), diaerbelow = u(0x0324), -- diaeresis below ringbelow = u(0x0325), cedilla = u(0x0327), ogonek = u(0x0328), caronbelow = u(0x032C), brevebelow = u(0x032E), macronbelow = u(0x0331), perispomeni = u(0x0342), ypogegrammeni = u(0x0345), CGJ = u(0x034F), -- combining grapheme joiner zigzag = u(0x035B), dbrevebelow = u(0x035C), -- double breve below dmacron = u(0x035E), -- double macron dtilde = u(0x0360), -- double tilde dinvbreve = u(0x0361), -- double inverted breve small_a = u(0x0363), small_e = u(0x0364), small_i = u(0x0365), small_o = u(0x0366), small_u = u(0x0367), keraia = u(0x0374), lowerkeraia = u(0x0375), tonos = u(0x0384), palatalization = u(0x0484), dasiapneumata = u(0x0485), psilipneumata = u(0x0486), kashida = u(0x0640), fathatan = u(0x064B), dammatan = u(0x064C), kasratan = u(0x064D), fatha = u(0x064E), damma = u(0x064F), kasra = u(0x0650), shadda = u(0x0651), sukun = u(0x0652), hamzaabove = u(0x0654), nunghunna = u(0x0658), zwarakay = u(0x0659), smallv = u(0x065A), superalef = u(0x0670), udatta = u(0x0951), anudatta = u(0x0952), tacute = u(0x1ACB), -- triple acute dsvarita = u(0x1CDA), -- double svarita tsvarita = u(0x1CDB), -- triple svarita dottedgrave = u(0x1DC0), dottedacute = u(0x1DC1), coronis = u(0x1FBD), psili = u(0x1FBF), dasia = u(0x1FEF), ZWNJ = u(0x200C), -- zero width non-joiner ZWJ = u(0x200D), -- zero width joiner RSQuo = u(0x2019), -- right single quote kavyka = u(0xA67C), VS01 = u(0xFE00), -- variation 
selector 1 -- Punctuation for the standard_chars field. -- Note: characters are literal (i.e. no magic characters). punc = " ',-​‌‍‐‑‒–—…∅◌", -- Range covering all diacritics. diacritics = u(0x300) .. "-" .. u(0x34E) .. u(0x350) .. "-" .. u(0x36F) .. u(0x1AB0) .. "-" .. u(0x1ACE) .. u(0x1DC0) .. "-" .. u(0x1DFF) .. u(0x20D0) .. "-" .. u(0x20F0) .. u(0xFE20) .. "-" .. u(0xFE2F), } -- Braille characters for the standard_chars field. local braille = {} for i = 0x2800, 0x28FF do insert(braille, u(i)) end c.braille = table.concat(braille) export.chars = c -- PUA characters, generally used in sortkeys. -- Note: if the limit needs to be increased, do so in powers of 2 (due to the way memory is allocated for tables). local p = {} for i = 1, 32 do p[i] = u(0xF000+i-1) end export.puaChars = p local cs = {} -- Used for the default display_text and strip_diacritics for Grek, but parts also used directly by Albanian (sq). cs["Grek-displaytext"] = { from = {"Þ", "þ", c.turnedcommaabove, "['ʼ" .. c.RSQuo .. c.prime .. c.keraia .. c.coronis .. c.psili .. "]"}, -- Not tonos: used as the numeral sign in entries. to = {"Ϸ", "ϸ", c.revcommaabove, c.RSQuo} } cs["Grek-stripdiacritics"] = { remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow, from = cs["Grek-displaytext"].from, to = {"Ϸ", "ϸ", c.revcommaabove, "'"} } -- Used in the default strip_diacritics and sort_key for Cyrs, but also used directly by Old Ruthenian (zle-ort). cs["Cyrs_remove_diacritics"] = c.grave .. c.acute .. c.dotabove .. c.diaer .. c.invbreve .. c.palatalization .. c.dasiapneumata .. c.psilipneumata .. c.dottedgrave .. c.dottedacute .. 
c.kavyka export.chars_substitutions = cs return export nvo2d2djqerlm03ucvsy9n8dkv3uip8 Kategori:Ekinoderma 14 114865 281335 2026-04-22T00:42:20Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281335 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:hi:Atmosfera 14 114866 281342 2026-04-22T01:09:35Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281342 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx 一生懸命 0 114868 281351 2026-04-22T06:01:01Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa Jepun== {{ja-kanjitab|いつ|しょう|けん|めい|k1=いっ|yomi=on}} ===Adverba=== {{ja-pos|adverb|いっしょうけんめい|hhira=いつしやうけんめい}} # secara sedaya upaya, secara bersungguh-sungguh #: {{ja-usex|'''一%生%懸%命'''頑%張る|'''いっ%しょう%けん%めい''' がん%ばる|berusaha '''sedaya upaya'''}} ===Etimologi=== {{ja-yoji}} daripada {{ja-r|一所懸命|いっしょけんめい}} yang merujuk kepad...' 281351 wikitext text/x-wiki ==Bahasa Jepun== {{ja-kanjitab|いつ|しょう|けん|めい|k1=いっ|yomi=on}} ===Adverba=== {{ja-pos|adverb|いっしょうけんめい|hhira=いつしやうけんめい}} # secara sedaya upaya, secara bersungguh-sungguh #: {{ja-usex|'''一%生%懸%命'''頑%張る|'''いっ%しょう%けん%めい''' がん%ばる|berusaha '''sedaya upaya'''}} ===Etimologi=== {{ja-yoji}} daripada {{ja-r|一所懸命|いっしょけんめい}} yang merujuk kepada para [[samurai]] menaruhkan nyawa untuk melindungi wilayah pusaka. 
===Sebutan=== {{ja-pron|いっしょうけんめい|acc=5|acc_ref=DJR}} ===Rujukan=== <references/> :* {{R:Kanjipedia Kotoba|0000230600}} {{cln|ja|yojijukugo|}} o0960f6bsyscxv75jp8h69v1jn6rgb6 Kategori:Perkataan dieja dengan 一 dibaca sebagai いつ bahasa Jepun 14 114869 281352 2026-04-22T06:01:56Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}' 281352 wikitext text/x-wiki {{auto cat|kan'on}} clmo3b09zci1t12px7gti5vw1yfsq0y Kategori:Perkataan dieja dengan kanji dibaca sebagai いつ bahasa Jepun 14 114870 281353 2026-04-22T06:03:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281353 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan dieja dengan 懸 dibaca sebagai けん bahasa Jepun 14 114871 281354 2026-04-22T06:07:19Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}' 281354 wikitext text/x-wiki {{auto cat|kan'on}} clmo3b09zci1t12px7gti5vw1yfsq0y Kategori:Perkataan dieja dengan 懸 bahasa Jepun 14 114872 281355 2026-04-22T06:07:43Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281355 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan dieja dengan 懸 mengikut bahasa 14 114873 281356 2026-04-22T06:08:10Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281356 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx 四方八方 0 114874 281357 2026-04-22T06:19:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa Jepun== {{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}} ===Adverba=== {{ja-pos|kata sifat|しほうはっぽう}} # setiap [[arah]] # setiap tempat; di [[mana-mana]] ===Rujukan=== <references/> :* {{R:Kanjipedia Kotoba|0000230600}} {{cln|ja|yojijukugo|}}' 281357 wikitext text/x-wiki ==Bahasa Jepun== {{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}} ===Adverba=== {{ja-pos|kata sifat|しほうはっぽう}} # setiap [[arah]] # setiap tempat; di [[mana-mana]] ===Rujukan=== <references/> :* {{R:Kanjipedia 
Kotoba|0000230600}} {{cln|ja|yojijukugo|}} hsyj08f9qpv1wozk06px1hjplakdbdf 281362 281357 2026-04-22T06:34:02Z PeaceSeekers 3334 281362 wikitext text/x-wiki ==Bahasa Jepun== {{ja-kanjitab|し|ほう|はつ|ほう|k3=はっ|k4=ぽう|yomi=on}} ===Adverba=== {{ja-pos|kata sifat|しほうはっぽう}} # setiap [[arah]] # setiap tempat; di [[mana-mana]] {{cln|ja|yojijukugo|}} emlr30jpml4lcxgnqrhlrlth3irar7f Kategori:Perkataan dieja dengan 八 dibaca sebagai はつ bahasa Jepun 14 114875 281358 2026-04-22T06:20:22Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat|kan'on}}' 281358 wikitext text/x-wiki {{auto cat|kan'on}} clmo3b09zci1t12px7gti5vw1yfsq0y Kategori:Perkataan dieja dengan 方 dibaca sebagai ほう bahasa Jepun 14 114876 281359 2026-04-22T06:21:06Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat|on}}' 281359 wikitext text/x-wiki {{auto cat|on}} irnidilxpyzph26fxce9qlrz5zy5gor Kategori:Yojijukugo bahasa Jepun 14 114877 281368 2026-04-22T07:02:57Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281368 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Paus 14 114878 281370 2026-04-22T07:05:00Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281370 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Setasea 14 114879 281371 2026-04-22T07:05:25Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281371 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Teh 14 114880 281372 2026-04-22T07:08:13Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281372 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:ms/Vanuatu 4 114881 281373 2026-04-22T07:14:41Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # {{place|ms|negara/dan/kepulauan|r/Melanesia|di|cont/Oceania|official=Republik 
Vanuatu|caplc=Port Vila}}. ===Etimologi=== Akhirnya daripada {{der|ms|bi|Vanuatu}}. ===Rujukan=== * {{R:KDP}}' 281373 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|negara/dan/kepulauan|r/Melanesia|di|cont/Oceania|official=Republik Vanuatu|caplc=Port Vila}}. ===Etimologi=== Akhirnya daripada {{der|ms|bi|Vanuatu}}. ===Rujukan=== * {{R:KDP}} gwp4falgslrgf6i1czl6befnxapxrna Vanuatu 0 114882 281375 2026-04-22T07:17:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281375 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Kategori:ms:Vanuatu 14 114883 281376 2026-04-22T07:18:11Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281376 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Vanuatu 14 114884 281377 2026-04-22T07:18:28Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281377 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Negara di Melanesia 14 114885 281378 2026-04-22T07:20:23Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281378 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Melayu diterbitkan daripada bahasa Bislama 14 114886 281379 2026-04-22T07:20:32Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281379 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan diterbitkan daripada bahasa Bislama mengikut bahasa 14 114887 281380 2026-04-22T07:20:46Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281380 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Negara di Melanesia 14 114888 281381 2026-04-22T07:21:30Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281381 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Melanesia 14 114889 
281382 2026-04-22T07:22:07Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281382 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Melanesia 14 114890 281383 2026-04-22T07:22:15Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281383 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Oceania 14 114891 281384 2026-04-22T07:23:08Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281384 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Oceania 14 114892 281385 2026-04-22T07:23:20Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281385 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Zambia 14 114893 281387 2026-04-22T07:26:56Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281387 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Zambia 14 114894 281388 2026-04-22T07:27:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281388 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:ms/Melanesia 4 114895 281389 2026-04-22T07:32:57Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆ...' 
281389 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni. nplcym5ejel9fi34uhbtykdi20tq51u 281391 281389 2026-04-22T07:33:42Z PeaceSeekers 3334 /* Kata nama khas */ 281391 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|en|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni. 4qiw5py4uwyzr2r3w0uu8wonrefyla2 281392 281391 2026-04-22T07:34:18Z PeaceSeekers 3334 /* Etimologi */ 281392 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|ms|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni. 
6xsv39rvj1smgi52qkwcxj8nkw92uj0 281393 281392 2026-04-22T07:34:57Z PeaceSeekers 3334 281393 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|Sebuah <<kawasan benua>> di <<cont/Oceania>> yang terdiri daripada [[New Guinea]], [[Kepulauan Bismarck]], [[Kepulauan Solomon]], [[New Caledonia]], [[Vanuatu]] dan [[Fiji]]}}. ====Perkataan setara==== * {{l|ms|Mikronesia}} * {{l|ms|Polinesia}} ===Etimologi=== Akhirnya daripada {{der|ms|grc|μέλας||gelap}} + {{m|grc|νῆσος||pulau}}, dengan "gelap" di sini merujuk kepada warna kulit warga penghuni. oa2rl5i70rxrgu3tk8sr1pi99pc69w5 Melanesia 0 114896 281390 2026-04-22T07:33:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281390 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Wikikamus:ms/Mikronesia 4 114897 281394 2026-04-22T07:40:09Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|kawasan benua|cont/Oceania|*di barat laut [[Lautan Pasifik]]}}, dengan kira-kira 2,000 buah pulau kecil. ====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Polinesia}}' 281394 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|ms|kawasan benua|cont/Oceania|*di barat laut [[Lautan Pasifik]]}}, dengan kira-kira 2,000 buah pulau kecil. 
====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Polinesia}} 5mvft4vp7y4ysrjfn4yemh5e24oarxl Mikronesia 0 114898 281395 2026-04-22T07:40:27Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281395 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Wikikamus:ms/Polinesia 4 114899 281396 2026-04-22T07:45:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}. ====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Mikronesia}} ====Terjemahan==== {{trans-top|sebahagian Oceania}} * Afrikaans: {{t|af|Polinesië}} * Albania: {{t|sq|Polinezi|f}}, {{t|sq|Pol...' 281396 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}. 
====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Mikronesia}} ====Terjemahan==== {{trans-top|sebahagian Oceania}} * Afrikaans: {{t|af|Polinesië}} * Albania: {{t|sq|Polinezi|f}}, {{t|sq|Polinezia|f}} {{qualifier|definite}} * Amhara: {{t|am|ፖሊኔዥያ}} * Arab: {{t|ar|بُولِينِزِيَا|f}} * Armenia: {{t|hy|Պոլինեզիա}} * Azeri: {{t|az|Polineziya}} * Belanda: {{t+|nl|Polynesië|n}} * Belarus: {{t|be|Паліне́зія|f}}, {{t|be|Палінэ́зія|f}} * Bengali: {{t+|bn|পলিনেশিয়া}} * Bulgaria: {{t|bg|Полине́зия|f}} * Burma: {{t|my|ပိုလီနီးရှား}} * Catalonia: {{t+|ca|Polinèsia|f}} * Cherokee: {{t|chr|ᏆᎵᏂᏏᎠ|tr=qualinisia}} * Cina: *: Kantonis: {{t|yue|波利尼西亞|tr=bo1 lei6 nei4 sai1 aa3}} *: Mandarin: {{t+|cmn|波利尼西亞|tr=Bōlìníxīyà}}, {{t+|cmn|玻里尼西亞|tr=Bōlǐníxīyà}} {{qualifier|Taiwan}} * Czech: {{t+|cs|Polynésie|f}} * Denmark: {{t|da|Polynesien|n}} * Esperanto: {{t|eo|Polinezio}} * Estonia: {{t|et|Polüneesia}} * Farefare: {{t|gur|Polinesia}} * Finland: {{t+|fi|Polynesia}} * Galicia: {{t+|gl|Polinesia}} * Georgia: {{t|ka|პოლინეზია}} * Hawaii: {{t|haw|Polenekia}} * Hindi: {{t|hi|पॉलिनेशिया|m}} * Hungary: {{t+|hu|Polinézia}} * Ibrani: {{t|he|פּוֹלִינֶזְיָה|f|tr=polinézya}} * Iceland: {{t|is|Pólýnesía}} * Indonesia: {{t|id|Polinesia}} * Ireland: {{t|ga|Polainéis|f|alt=An Pholainéis}} * Itali: {{t+|it|Polinesia|f}} * Jepun: {{t+|ja|ポリネシア|tr=Porineshia}} * Jerman: {{t+|de|Polynesien|n}} * Kazakh: {{t+|kk|Полинезия}} * Khmer: {{t|km|ប៉ូលីណេស៊ី}} * Korea: {{t|ko|^폴리네시아}} * Kurdi: *: Kurdi Utara: {{t|kmr|Polînezya}} * Kyrgyz: {{t+|ky|Полинезия}} * Lao: {{t|lo|ໂປລີເນຊີ}} * Latvia: {{t|lv|Polinēzija|f}} * Lithuania: {{t|lt|Polinezija|f}} * Macedonia: {{t|mk|Полинезија|f}} * Māori: {{t|mi|Poronihia}} * Mongol: *: Cyril: {{t|mn|Полинези}} * Norway: *: Bokmål: {{t|nb|Polynesia}} *: Nynorsk: {{t|nn|Polynesia}} * Parsi: {{t|fa|پلی‌نزی|tr=poli-nezi}}, {{t+|fa|پلینزی|tr=polinezi}} * Perancis: {{t+|fr|Polynésie|f}} * Polish: {{t+|pl|Polinezja|f}} * Portugis: {{t+|pt|Polinésia|f}} * Romania: {{t|ro|Polinezia|f}} 
* Rusia: {{t+|ru|Полине́зия|f|tr=Polinɛ́zija}} * Samoa: {{t|sm|Polenisia}} * Sepanyol: {{t|es|Polinesia}} * Serbo-Croatia: *: Cyril: {{t|sh|Полѝне̄зија|f}} *: Latin: {{t+|sh|Polìnēzija|f}} * Sinhala: {{t|si|පොලිනීසියාව}} * Slovak: {{t|sk|Polynézia|f}} * Slovene: {{t|sl|Polinezija|f}} * Sweden: {{t+|sv|Polynesien|n}} * Tagalog: {{t|tl|Dampuluan}}, {{t|tl|Polynesia}} * Tahiti: {{t|ty|Pōrīnetia}} * Tajik: {{t|tg|Полинезия}} * Tamil: {{t|ta|பொலினீசியா}} * Tatar: {{t|tt|Полинезия}} * Thai: {{t|th|พอลินีเชีย}} * Turki: {{t+|tr|Polinezya}} * Turkmen: {{t|tk|Polineziýa}} * Ukraine: {{t|uk|Поліне́зія|f}} * Urdu: {{t|ur|پولینیشیا|m|tr=polīneśiyā}} * Uyghur: {{t|ug|پولىنېزىيە}} * Uzbek: {{t|uz|Polineziya}} * Vietnam: {{t|vi|Pô-li-nê-di}}, {{t|vi|Đa Đảo}} ({{t|vi|多島}}) * Volapük: {{t+|vo|Möda-Seanuäns}} * Wales: {{t|cy|Polynesia}} * Yiddish: {{t|yi|פּאָלינעזיע|n}} * Yunani: {{t+|el|Πολυνησία|f}} {{trans-bottom}} jhbdc5sor9j19pgd4oe5xgyizm3bz65 281398 281396 2026-04-22T07:48:42Z PeaceSeekers 3334 /* Terjemahan */ 281398 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== {{Wikipedia|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # {{place|en|Sebuah <<kawasan benua>> di <<cont/Oceania>>, termasuk [[Pulau Easter]], [[Hawaii]], [[New Zealand]] dan kebanyakan pulau di antara sesama mereka}}. 
====Perkataan setara==== * {{l|ms|Melanesia}} * {{l|ms|Mikronesia}} ====Terjemahan==== {{trans-top|sebahagian Oceania}} * Afrikaans: {{t|af|Polinesië}} * Albania: {{t|sq|Polinezi|f}}, {{t|sq|Polinezia|f}} {{qualifier|definite}} * Amhara: {{t|am|ፖሊኔዥያ}} * Arab: {{t|ar|بُولِينِزِيَا|f}} * Armenia: {{t|hy|Պոլինեզիա}} * Azeri: {{t|az|Polineziya}} * Belanda: {{t+|nl|Polynesië|n}} * Belarus: {{t|be|Паліне́зія|f}}, {{t|be|Палінэ́зія|f}} * Bengali: {{t+|bn|পলিনেশিয়া}} * Bulgaria: {{t|bg|Полине́зия|f}} * Burma: {{t|my|ပိုလီနီးရှား}} * Catalonia: {{t+|ca|Polinèsia|f}} * Cherokee: {{t|chr|ᏆᎵᏂᏏᎠ|tr=qualinisia}} * Cina: *: Kantonis: {{t|yue|波利尼西亞|tr=bo1 lei6 nei4 sai1 aa3}} *: Mandarin: {{t+|cmn|波利尼西亞|tr=Bōlìníxīyà}}, {{t+|cmn|玻里尼西亞|tr=Bōlǐníxīyà}} {{qualifier|Taiwan}} * Czech: {{t+|cs|Polynésie|f}} * Denmark: {{t|da|Polynesien|n}} * Esperanto: {{t|eo|Polinezio}} * Estonia: {{t|et|Polüneesia}} * Farefare: {{t|gur|Polinesia}} * Finland: {{t+|fi|Polynesia}} * Galicia: {{t+|gl|Polinesia}} * Georgia: {{t|ka|პოლინეზია}} * Hawaii: {{t|haw|Polenekia}} * Hindi: {{t|hi|पॉलिनेशिया|m}} * Hungary: {{t+|hu|Polinézia}} * Ibrani: {{t|he|פּוֹלִינֶזְיָה|f|tr=polinézya}} * Iceland: {{t|is|Pólýnesía}} * Indonesia: {{t|id|Polinesia}} * Inggeris: {{t+|en|Polynesia}} * Ireland: {{t|ga|Polainéis|f|alt=An Pholainéis}} * Itali: {{t+|it|Polinesia|f}} * Jepun: {{t+|ja|ポリネシア|tr=Porineshia}} * Jerman: {{t+|de|Polynesien|n}} * Kazakh: {{t+|kk|Полинезия}} * Khmer: {{t|km|ប៉ូលីណេស៊ី}} * Korea: {{t|ko|^폴리네시아}} * Kurdi: *: Kurdi Utara: {{t|kmr|Polînezya}} * Kyrgyz: {{t+|ky|Полинезия}} * Lao: {{t|lo|ໂປລີເນຊີ}} * Latvia: {{t|lv|Polinēzija|f}} * Lithuania: {{t|lt|Polinezija|f}} * Macedonia: {{t|mk|Полинезија|f}} * Māori: {{t|mi|Poronihia}} * Mongol: *: Cyril: {{t|mn|Полинези}} * Norway: *: Bokmål: {{t|nb|Polynesia}} *: Nynorsk: {{t|nn|Polynesia}} * Parsi: {{t|fa|پلی‌نزی|tr=poli-nezi}}, {{t+|fa|پلینزی|tr=polinezi}} * Perancis: {{t+|fr|Polynésie|f}} * Polish: {{t+|pl|Polinezja|f}} * Portugis: {{t+|pt|Polinésia|f}} 
* Romania: {{t|ro|Polinezia|f}} * Rusia: {{t+|ru|Полине́зия|f|tr=Polinɛ́zija}} * Samoa: {{t|sm|Polenisia}} * Sepanyol: {{t|es|Polinesia}} * Serbo-Croatia: *: Cyril: {{t|sh|Полѝне̄зија|f}} *: Latin: {{t+|sh|Polìnēzija|f}} * Sinhala: {{t|si|පොලිනීසියාව}} * Slovak: {{t|sk|Polynézia|f}} * Slovene: {{t|sl|Polinezija|f}} * Sweden: {{t+|sv|Polynesien|n}} * Tagalog: {{t|tl|Dampuluan}}, {{t|tl|Polynesia}} * Tahiti: {{t|ty|Pōrīnetia}} * Tajik: {{t|tg|Полинезия}} * Tamil: {{t|ta|பொலினீசியா}} * Tatar: {{t|tt|Полинезия}} * Thai: {{t|th|พอลินีเชีย}} * Turki: {{t+|tr|Polinezya}} * Turkmen: {{t|tk|Polineziýa}} * Ukraine: {{t|uk|Поліне́зія|f}} * Urdu: {{t|ur|پولینیشیا|m|tr=polīneśiyā}} * Uyghur: {{t|ug|پولىنېزىيە}} * Uzbek: {{t|uz|Polineziya}} * Vietnam: {{t|vi|Pô-li-nê-di}}, {{t|vi|Đa Đảo}} ({{t|vi|多島}}) * Volapük: {{t+|vo|Möda-Seanuäns}} * Wales: {{t|cy|Polynesia}} * Yiddish: {{t|yi|פּאָלינעזיע|n}} * Yunani: {{t+|el|Πολυνησία|f}} {{trans-bottom}} 4xcl0sj7wxvfwclq6feeb3y06iow20q Polinesia 0 114900 281397 2026-04-22T07:46:17Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281397 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Kategori:Perkataan bahasa Chin Tedim dipinjam daripada bahasa Burma 14 114901 281400 2026-04-22T07:51:20Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281400 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa Burma 14 114902 281401 2026-04-22T07:51:23Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281401 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan dipinjam daripada bahasa Burma mengikut bahasa 14 114903 281402 2026-04-22T07:51:37Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281402 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Kata pinjaman bahasa Chin Tedim 14 114904 281403 
2026-04-22T07:51:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281403 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan mengikut etimologi bahasa Chin Tedim 14 114905 281404 2026-04-22T07:51:52Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281404 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa lain 14 114906 281405 2026-04-22T07:53:14Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281405 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Lolo-Burma 14 114907 281406 2026-04-22T07:53:18Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281406 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Burma-Qiang 14 114908 281407 2026-04-22T07:53:21Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281407 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan bahasa Chin Tedim diterbitkan daripada bahasa-bahasa Sino-Tibet 14 114909 281408 2026-04-22T07:53:24Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281408 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:ms/taiko 4 114910 281409 2026-04-22T08:04:24Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Takrifan 1=== [[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]] {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Sejenis [[gendang]] tradisional [[Jepun]]. ====Etimologi==== {{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}. {{C|ms|Alat muzik|Jepun}} ===Takrifan 2=== {{inti|{{subst:ROOTPAGENAME}}|kata na...' 
281409 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Takrifan 1=== [[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]] {{inti|ms|kata nama}} # Sejenis [[gendang]] tradisional [[Jepun]]. ====Etimologi==== {{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}. {{C|ms|Alat muzik|Jepun}} ===Takrifan 2=== {{inti|ms|kata nama}} # Sejenis [[penyakit]] berjangkit bawaan [[bakteria]] ''Mycobacterium leprae''; [[kusta]]. ====Etimologi==== Daripada {{bor|ms|zh|-}} {{bor|ms|nan-hbl|-}} {{zh-l|癩哥|tr=thái-ko|gloss=kusta}}. {{C|ms|Penyakit}} ===Rujukan=== * {{R:KDP}} qbrieu35se6cssy2o08hhswm8rioafo 281415 281409 2026-04-22T08:20:28Z PeaceSeekers 3334 /* Takrifan 1 */ 281415 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Takrifan 1=== [[Fail:TaikoDrummersAichiJapan.jpg|thumb|Orang Jepun bermain '''taiko'''.]] {{inti|ms|kata nama}} # Sejenis [[gendang]] tradisional [[Jepun]]. ====Etimologi==== {{bor+|ms|ja|太鼓|tr=たいこ ''taiko''}}, daripada {{m|ltc|太|tr=tʰàj|t=besar}} + {{m|ltc|鼓|tr=kú|t=dram, gendang}}. {{C|ms|Alat muzik|Jepun}} ====Terjemahan==== {{ter-atas|gendang Jepun}} * Inggeris: {{t+|en|taiko}} * Jepun: {{t+|ja|太鼓|tr=taiko}} {{ter-bawah}} ===Takrifan 2=== {{inti|ms|kata nama}} # Sejenis [[penyakit]] berjangkit bawaan [[bakteria]] ''Mycobacterium leprae''; [[kusta]]. ====Etimologi==== Daripada {{bor|ms|zh|-}} {{bor|ms|nan-hbl|-}} {{zh-l|癩哥|tr=thái-ko|gloss=kusta}}. 
{{C|ms|Penyakit}} ===Rujukan=== * {{R:KDP}} ql25gr3u1twnxsl9sfr2a8vcwloa16p taiko 0 114911 281410 2026-04-22T08:05:07Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281410 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Kategori:ms:Jepun 14 114912 281411 2026-04-22T08:05:51Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281411 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Jepun 14 114913 281412 2026-04-22T08:06:08Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281412 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:id:Tahun 14 114914 281416 2026-04-22T08:21:40Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281416 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan dengan terjemahan bahasa Cornwall 14 114915 281418 2026-04-22T08:27:35Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281418 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Penyelenggaraan entri bahasa Cornwall 14 114916 281419 2026-04-22T08:27:46Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281419 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:ms/Bahai 4 114917 281423 2026-04-22T09:25:00Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== {{wikipedia|Bahá'í|lang=ms}} ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # Sebuah gerakan [[agama]] yang ditubuhkan oleh agamawan Iran, [[w:Baháʼu'lláh|Baháʼu'lláh]], pada abad ke-19. ====Terjemahan==== {{trans-top|agama}} * Arab: {{t|ar|الْبَهَائِيَّة|f}} * Armenia: {{t|hy|բահայականություն}} * Cina: *: Mandarin: {{t+|cmn|大同教|tr=dàtóngj...' 
281423 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== {{wikipedia|Bahá'í|lang=ms}} ===Kata nama khas=== {{inti|ms|kata nama khas}} # Sebuah gerakan [[agama]] yang ditubuhkan oleh agamawan Iran, [[w:Baháʼu'lláh|Baháʼu'lláh]], pada abad ke-19. ====Terjemahan==== {{trans-top|agama}} * Arab: {{t|ar|الْبَهَائِيَّة|f}} * Armenia: {{t|hy|բահայականություն}} * Cina: *: Mandarin: {{t+|cmn|大同教|tr=dàtóngjiào}}, {{t+|cmn|巴哈伊信仰|tr=bāhāyī xìnyǎng}}, {{t+|cmn|巴哈伊教|tr=bāhāyījiào}} * Esperanto: {{t+|eo|Bahaa Kredo}}, {{t|eo|Bahaa Religio}}, {{t+|eo|Bahaismo}} * Finland: {{t|fi|bahaʼi-usko}}, {{t|fi|bahai-usko}} * Georgia: {{t|ka|ბაჰაიზმი}}, {{t|ka|ბაჰაი რელიგია}} * Ibrani: {{t|he|הָדָּת הָבָּהָאִית|f-p|tr=ha-dat ha-Baha'it}} * Inggeris: {{t+|en|Baháʼí Faith}} * Jerman: {{t|de|Bahaitum|n}}, {{t+|de|Bahaismus|m}} * Hindi: {{t|hi|बहाई धर्म|m}} * Hungary: {{t|hu|[[bahái]] [[hit]]}}, {{t|hu|[[baháʼí]] [[hit]]}}, {{t|hu|baháizmus}} * Ireland: {{t|ga|creideamh Bahá'íoch|m}} * Jepun: {{t|ja|バハイ教|tr=bahai-kyō}} * Kazakh: {{t|kk|Баһаи}}, {{t|kk|Баһаи Сенімі}} * Khmer: {{t|km|[[ជំនឿ]][[បាហៃ]]}} * Parsi: {{t|fa|بهائیت|tr=bahâ'iyyat}} * Perancis: {{t+|fr|bahaïsme|m}}, {{t|fr|foi baháʼíe|f}}, {{t+|fr|béhaïsme|m}} * Poland: {{t+|pl|bahaizm|m}} * Portugis: {{t|pt|bahaísmo|m}}, {{t|pt|Fé Bahá'í|f}} * Rusia: {{t+|ru|бахаи́зм|m}}, {{t|ru|бехаи́зм|m}}, {{t|ru|бахаи́|f}} * Sepanyol: {{t+|es|bahaísmo|m}} * Thai: {{t|th|[[ศาสนา]][[บาไฮ]]}}, {{t|th|[[ลัทธิ]][[บาไฮ]]}}, {{t|th|[[ศาสนา]][[บะฮาอี]]}}, {{t|th|[[ลัทธิ]][[บะฮาอี]]}} * Turki: {{t+|tr|Bahâîlik}} *: Turki Usmaniyah: {{t|ota|بهائیلك|tr=Behâîlik, Bahâîlik}} * Urdu: {{t|ur|بہائیت|f|tr=bahāiyat}} * Uyghur: {{t|ug|باھائىيلىك}} * Uzbek: {{t|uz|Bahoiy Eʼtiqodi}}, {{t|uz|Bahoiylik}} {{trans-bottom}} ===Rujukan=== * {{R:KDP}} {{C|ms|Agama}} gz986lgx6v5hnk20ycvufcd5lwtyefu Bahai 0 114918 281424 2026-04-22T09:25:50Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{also|bahai}} {{wt:ms/{{PAGENAME}}}}' 281424 wikitext text/x-wiki {{also|bahai}} 
{{wt:ms/{{PAGENAME}}}} 8tw19lsvv678dmz7j5bplaedt3182kh Wikikamus:ms/maulhayat 4 114919 281425 2026-04-22T09:48:13Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Sejenis [[air]] yang dikatakan dapat memberi peminumnya kehidupan secara [[abadi]]. #: {{syn|ms|ainul hayat|air hayat}} ===Etimologi=== {{bor+|ms|ar|ماء الحياة}}. {{C|ms|Bahan cereka|Keabadian}} ===Rujukan=== * {{R:KDP}}' 281425 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama=== {{inti|ms|kata nama}} # Sejenis [[air]] yang dikatakan dapat memberi peminumnya kehidupan secara [[abadi]]. #: {{syn|ms|ainul hayat|air hayat}} ===Etimologi=== {{bor+|ms|ar|ماء الحياة}}. {{C|ms|Bahan cereka|Keabadian}} ===Rujukan=== * {{R:KDP}} ed2fvsh88rq3fjt98y8mrsnc8udke4j maulhayat 0 114920 281426 2026-04-22T09:48:46Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281426 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Wikikamus:ms/Saudi 4 114921 281427 2026-04-22T09:51:46Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|ms}}== ===Kata sifat=== {{inti|ms|kata nama khas}} # Berkenaan negara [[Arab Saudi]]. ===Etimologi=== {{bor+|ms|ar|سُعُودِيّ}}. {{root|en|ar|س ع د}} ===Rujukan=== * {{R:KDP}} {{C|ms|Arab Saudi}}' 281427 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata sifat=== {{inti|ms|kata nama khas}} # Berkenaan negara [[Arab Saudi]]. ===Etimologi=== {{bor+|ms|ar|سُعُودِيّ}}. {{root|en|ar|س ع د}} ===Rujukan=== * {{R:KDP}} {{C|ms|Arab Saudi}} 4e5vdnv9aqo707qx46wpi6m2udf6hsi 281429 281427 2026-04-22T09:52:33Z PeaceSeekers 3334 281429 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata sifat=== {{inti|ms|kata nama khas}} # Berkenaan negara [[Arab Saudi]]. ===Etimologi=== {{bor+|ms|ar|سُعُودِيّ}}. 
{{root|ms|ar|س ع د}} ===Rujukan=== * {{R:KDP}} {{C|ms|Arab Saudi}} punqb32jnacgbu4ig5e1r526vuakgtp Saudi 0 114922 281428 2026-04-22T09:52:18Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281428 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Kategori:Perkataan bahasa Melayu diterbitkan daripada akar bahasa Arab س ع د 14 114923 281430 2026-04-22T09:52:55Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281430 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:ms:Arab Saudi 14 114924 281431 2026-04-22T09:52:59Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281431 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Perkataan diterbitkan daripada akar bahasa Arab س ع د 14 114925 281432 2026-04-22T09:53:10Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281432 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:Arab Saudi 14 114926 281434 2026-04-22T09:57:30Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281434 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:en/Macau scam 4 114927 281436 2026-04-22T10:16:18Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk memaksa mangsa menyalurkan sejumlah wang [[tebusan]]. ===Etimologi=== Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}. 
{{C|en|Jenayah}}' 281436 wikitext text/x-wiki ==Bahasa {{bahasa|en}}== ===Kata nama=== {{inti|en|kata nama}} # {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk memaksa mangsa menyalurkan sejumlah wang [[tebusan]]. ===Etimologi=== Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}. {{C|en|Jenayah}} 874v2vi42wbsff0lp8me4nwmjz2py6y 281440 281436 2026-04-22T10:20:30Z PeaceSeekers 3334 /* Bahasa {{bahasa|en}} */ 281440 wikitext text/x-wiki ==Bahasa {{bahasa|en}}== ===Kata nama=== {{inti|en|kata nama}} # {{lb|en|Malaysia}} Sejenis taktik [[penipuan]] di mana seseorang [[samar|menyamar]] sebagai suatu pihak berkuasa untuk mendesak mangsa menyalurkan sejumlah wang [[tebusan]]. ===Etimologi=== Daripada {{l|en|Macau}}, di mana jenayah ini mula dilaporkan. Banding dengan {{noncog|tl|lutong-makaw}}. {{C|en|Jenayah}} 8zeea9pbzda657bhg6frtpoy49ardtm Macau scam 0 114928 281437 2026-04-22T10:17:00Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:en/{{PAGENAME}}}}' 281437 wikitext text/x-wiki {{wt:en/{{PAGENAME}}}} 2y33swzmyjj8jr581mnvur6xi1gpqs8 Kategori:en:Jenayah 14 114929 281438 2026-04-22T10:17:26Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281438 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Kategori:en:Undang-undang jenayah 14 114930 281439 2026-04-22T10:19:01Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{auto cat}}' 281439 wikitext text/x-wiki {{auto cat}} eomzlm5v4j7ond1phrju7cnue91g5qx Wikikamus:en/MNC 4 114931 281441 2026-04-22T10:25:47Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|en}}== ===Kata nama=== {{inti|en|kata nama}} # {{abbreviation of|en|[[multinational]] [[corporation]]}}. 
{{C|en|Perniagaan}}' 281441 wikitext text/x-wiki ==Bahasa {{bahasa|en}}== ===Kata nama=== {{inti|en|kata nama}} # {{abbreviation of|en|[[multinational]] [[corporation]]}}. {{C|en|Perniagaan}} 809z9s7jnjotzwwmkdzsuqmdnmrp1b9 MNC 0 114932 281442 2026-04-22T10:26:28Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:en/{{PAGENAME}}}}' 281442 wikitext text/x-wiki {{wt:en/{{PAGENAME}}}} 2y33swzmyjj8jr581mnvur6xi1gpqs8 Wikikamus:ms/hiburan malam 4 114933 281443 2026-04-22T10:35:37Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama=== {{inti|{{subst:ROOTPAGENAME}}|kata nama}} # Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb. #: {{syn|ms|kehidupan malam}} ===Terjemahan=== {{trans-top|hiburan}} * Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}} * Cina: *: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}} * Esperanto: {{t|eo|nokta vivo}} * Finland: {{t+|fi|yöe...' 281443 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama=== {{inti|ms|kata nama}} # Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb. 
#: {{syn|ms|kehidupan malam}} ===Terjemahan=== {{trans-top|hiburan}} * Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}} * Cina: *: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}} * Esperanto: {{t|eo|nokta vivo}} * Finland: {{t+|fi|yöelämä}} * Georgia: {{t|ka|ღამის ცხოვრება}} * Itali: {{t+|en|nightlife}} * Itali: {{t|it|vita notturna|f}} * Jepun: {{t+|ja|夜遊び|tr=よあそび, yoasobi}}, {{t|ja|ナイトライフ|tr=naitoraifu}} * Jerman: {{t+|de|Nachtleben|n}} * Macedonia: {{t|mk|ноќен живот|m}} * Perancis: {{t|fr|vie nocturne|f}} * Poland: {{t|pl|nocne życie|n}} * Portugis: {{t|pt|vida noturna|f}}, {{t+|pt|noite|f}}, {{t+|pt|night|f}} * Rusia: {{t|ru|ночна́я жизнь|f}} * Rusyn Pannonia: {{t|rsk|ноцни живот|m}} * Sepanyol: {{t|es|vida nocturna|f}} * Swahili: {{t|sw|maisha ya usiku}} * Sweden: {{t+|sv|nattliv|n}} * Turki: {{t+|tr|gece hayatı}} * Yunani: {{t|el|νυχτερινή ζωή|f}} {{trans-bottom}} {{C|ms|Hiburan|Malam}} 6fyl4psmbkrc8qxs1vuxnsas8y0txqi 281445 281443 2026-04-22T10:37:37Z PeaceSeekers 3334 /* Terjemahan */ 281445 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama=== {{inti|ms|kata nama}} # Aneka jenis [[hiburan]] yang biasanya dibuka pada waktu [[malam]] seperti [[kelab malam]], [[bar]] dsb. 
#: {{syn|ms|kehidupan malam}} ===Terjemahan=== {{trans-top|hiburan}} * Belanda: {{t+|nl|nachtleven|n}}, {{t+|nl|uitgaansleven|n}} * Cina: *: Mandarin: {{t+|cmn|夜生活|tr=yèshēnghuó}} * Esperanto: {{t|eo|nokta vivo}} * Finland: {{t+|fi|yöelämä}} * Georgia: {{t|ka|ღამის ცხოვრება}} * Inggeris: {{t+|en|nightlife}} * Itali: {{t|it|vita notturna|f}} * Jepun: {{t+|ja|夜遊び|tr=よあそび, yoasobi}}, {{t|ja|ナイトライフ|tr=naitoraifu}} * Jerman: {{t+|de|Nachtleben|n}} * Macedonia: {{t|mk|ноќен живот|m}} * Perancis: {{t|fr|vie nocturne|f}} * Poland: {{t|pl|nocne życie|n}} * Portugis: {{t|pt|vida noturna|f}}, {{t+|pt|noite|f}}, {{t+|pt|night|f}} * Rusia: {{t|ru|ночна́я жизнь|f}} * Rusyn Pannonia: {{t|rsk|ноцни живот|m}} * Sepanyol: {{t|es|vida nocturna|f}} * Swahili: {{t|sw|maisha ya usiku}} * Sweden: {{t+|sv|nattliv|n}} * Turki: {{t+|tr|gece hayatı}} * Yunani: {{t|el|νυχτερινή ζωή|f}} {{trans-bottom}} {{C|ms|Hiburan|Malam}} lyv13nxbpscbslj89dkzvhs0kgjoc62 hiburan malam 0 114934 281444 2026-04-22T10:36:21Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281444 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r Wikikamus:ms/Hari Bumi 4 114935 281446 2026-04-22T10:59:13Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '==Bahasa {{bahasa|{{subst:ROOTPAGENAME}}}}== ===Kata nama khas=== {{inti|{{subst:ROOTPAGENAME}}|kata nama khas}} # [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]]. ===Terjemahan=== {{trans-top|hari peringatan alam sekitar}} * Arab: {{t|ar|يَوْم اَلْأَرْض|m}} * Cina: *: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=D...' 281446 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]]. 
===Terjemahan=== {{trans-top|hari peringatan alam sekitar}} * Arab: {{t|ar|يَوْم اَلْأَرْض|m}} * Cina: *: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}} * Finland: {{t|fi|[[maan]] [[päivä]]}} * Galicia: {{t|gl|Día da Terra}} * Georgia: {{t|ka|დედამიწის დღე}} * Inggeris: {{t+|en|Earth Day}} * Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}} * Jerman: {{t|de|Tag der Erde|m}} * Korea: {{t|ko|^지구-의 날}} * Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}} * Perancis: {{t|fr|Jour de la Terre|m}} * Portugis: {{t|pt|Dia da Terra|m}} * Rusia: {{t|ru|День Земли́|m}} * Sepanyol: {{t|es|Día de la Tierra|m}} * Swahili: {{t|sw|Siku ya Dunia}} * Wales: {{t|cy|Dydd y Ddaear|m}} {{trans-bottom}} {{C|en|Cuti}} f6yty16dzmtasahxnwv7iul7z4j98x6 281447 281446 2026-04-22T11:00:01Z PeaceSeekers 3334 281447 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{inti|ms|kata nama khas}} # [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]]. 
===Terjemahan=== {{trans-top|hari peringatan alam sekitar}} * Arab: {{t|ar|يَوْم اَلْأَرْض|m}} * Cina: *: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}} * Finland: {{t|fi|[[maan]] [[päivä]]}} * Galicia: {{t|gl|Día da Terra}} * Georgia: {{t|ka|დედამიწის დღე}} * Inggeris: {{t+|en|Earth Day}} * Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}} * Jerman: {{t|de|Tag der Erde|m}} * Korea: {{t|ko|^지구-의 날}} * Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}} * Perancis: {{t|fr|Jour de la Terre|m}} * Portugis: {{t|pt|Dia da Terra|m}} * Rusia: {{t|ru|День Земли́|m}} * Sepanyol: {{t|es|Día de la Tierra|m}} * Swahili: {{t|sw|Siku ya Dunia}} * Wales: {{t|cy|Dydd y Ddaear|m}} {{trans-bottom}} {{C|en|Perayaan}} 2zxsrcf9rnoeigp60flg2bunope367z 281449 281447 2026-04-22T11:01:08Z PeaceSeekers 3334 281449 wikitext text/x-wiki ==Bahasa {{bahasa|ms}}== ===Kata nama khas=== {{ms-knk}} # [[hari|Hari]] peringatan khas yang ditetapkan pada 22 April di peringkat antarabangsa sebagai hari kesedaran menjaga [[alam sekitar]]. 
===Terjemahan=== {{trans-top|hari peringatan alam sekitar}} * Arab: {{t|ar|يَوْم اَلْأَرْض|m}} * Cina: *: Mandarin: {{t+|cmn|世界地球日|tr=Shìjiè Dìqiúrì}}, {{t+|cmn|地球日|tr=Dìqiúrì}} * Finland: {{t|fi|[[maan]] [[päivä]]}} * Galicia: {{t|gl|Día da Terra}} * Georgia: {{t|ka|დედამიწის დღე}} * Inggeris: {{t+|en|Earth Day}} * Jepun: {{t|ja|アースデイ|tr=Āsu-dei}}, {{t|ja|地球の日|tr=ちきゅうのひ, Chikyū no hi}} * Jerman: {{t|de|Tag der Erde|m}} * Korea: {{t|ko|^지구-의 날}} * Navajo: {{t|nv|Nahasdzáán Baa Hą́ą́hwiindzin Bá Hazʼą́}}, {{t|nv|Nahasdzáán baa ʼáháyą́}} * Perancis: {{t|fr|Jour de la Terre|m}} * Portugis: {{t|pt|Dia da Terra|m}} * Rusia: {{t|ru|День Земли́|m}} * Sepanyol: {{t|es|Día de la Tierra|m}} * Swahili: {{t|sw|Siku ya Dunia}} * Wales: {{t|cy|Dydd y Ddaear|m}} {{trans-bottom}} {{C|ms|Perayaan}} ezwyhhpj8oo8usemya12hxqn6h0jq53 Hari Bumi 0 114936 281448 2026-04-22T11:00:36Z PeaceSeekers 3334 Mencipta laman baru dengan kandungan '{{wt:ms/{{PAGENAME}}}}' 281448 wikitext text/x-wiki {{wt:ms/{{PAGENAME}}}} oduz2pevfujwte0m2yicioulifapb4r