ဝိက်ရှေန်နရဳ
mnwwiktionary
https://mnw.wiktionary.org/wiki/%E1%80%9D%E1%80%AD%E1%80%80%E1%80%BA%E1%80%9B%E1%80%BE%E1%80%B1%E1%80%94%E1%80%BA%E1%80%94%E1%80%9B%E1%80%B3:%E1%80%99%E1%80%AF%E1%80%80%E1%80%BA%E1%80%9C%E1%80%AD%E1%80%80%E1%80%BA%E1%80%90%E1%80%99%E1%80%BA
MediaWiki 1.46.0-wmf.22
case-sensitive
မဝ်ဂျူ:languages
828
651
385634
385596
2026-04-02T17:02:13Z
咽頭べさ
33
Error in language
385634
Scribunto
text/plain
--[==[ intro:
This module implements fetching of language-specific information and processing text in a given language.
===Types of languages===
There are two types of languages: full languages and etymology-only languages. The essential difference is that only
full languages appear in L2 headings in vocabulary entries, and hence categories like [[:Category:French nouns]] exist
only for full languages. Etymology-only languages have either a full language or another etymology-only language as
their parent (in the parent-child inheritance sense), and for etymology-only languages with another etymology-only
language as their parent, a full language can always be derived by following the parent links upwards. For example,
"Canadian French", code `fr-CA`, is an etymology-only language whose parent is the full language "French", code `fr`.
An example of an etymology-only language with another etymology-only parent is "Northumbrian Old English", code
`ang-nor`, which has "Anglian Old English", code `ang-ang` as its parent; this is an etymology-only language whose
parent is "Old English", code `ang`, which is a full language. (This is because Northumbrian Old English is considered
a variety of Anglian Old English.) Sometimes the parent is the "Undetermined" language, code `und`; this is the case,
for example, for "substrate" languages such as "Pre-Greek", code `qsb-grc`, and "the BMAC substrate", code `qsb-bma`.
It is important to distinguish language ''parents'' from language ''ancestors''. The parent-child relationship is one
of containment, i.e. if X is a child of Y, X is considered a variety of Y. On the other hand, the ancestor-descendant
relationship is one of descent in time. For example, "Classical Latin", code `la-cla`, and "Late Latin", code `la-lat`,
are both etymology-only languages with "Latin", code `la`, as their parent, because both of the former are varieties
of Latin. However, Late Latin does *NOT* have Classical Latin as its parent because Late Latin is *not* a variety of
Classical Latin; rather, it is a descendant. There is in fact a separate `ancestors` field that is used to express the
ancestor-descendant relationship, and Late Latin's ancestor is given as Classical Latin. It is also important to note
that sometimes an etymology-only language is actually the conceptual ancestor of its parent language. This happens,
for example, with "Old Italian" (code `roa-oit`), which is an etymology-only variant of full language "Italian" (code
`it`), and with "Old Latin" (code `itc-ola`), which is an etymology-only variant of Latin. In both cases, the full
language has the etymology-only variant listed as an ancestor. This allows a Latin term to inherit from Old Latin
using the {{tl|inh}} template (where in this template, "inheritance" refers to ancestral inheritance, i.e. inheritance
in time, rather than in the parent-child sense); likewise for Italian and Old Italian.
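
The parent links described above can be followed mechanically to reach a full language. The following is a minimal
sketch in plain Lua, not this module's actual implementation: the `langs` table, its `full`, `parent` and `ancestors`
fields and the `get_full_code` helper are hypothetical names chosen for illustration; the real data lives in the
language data modules.

```lua
-- Hypothetical, hand-written data for illustration only.
local langs = {
	ang = {name = "Old English", full = true},
	["ang-ang"] = {name = "Anglian Old English", parent = "ang"},
	["ang-nor"] = {name = "Northumbrian Old English", parent = "ang-ang"},
	la = {name = "Latin", full = true},
	["la-cla"] = {name = "Classical Latin", parent = "la"},
	-- Parent (containment) and ancestor (descent in time) are separate fields.
	["la-lat"] = {name = "Late Latin", parent = "la", ancestors = {"la-cla"}},
}

-- Follow parent links upwards until a full language is reached.
-- Returns nil for unknown codes or if the chain never reaches a full language.
local function get_full_code(code)
	local seen = {}
	while true do
		local data = langs[code]
		if not data then
			return nil
		elseif data.full then
			return code
		elseif seen[code] then
			return nil -- Guard against a cycle in the hypothetical data.
		end
		seen[code] = true
		code = data.parent
	end
end
```

In this sketch, Late Latin resolves to Latin through its parent link, while its `ancestors` field plays no part in
the lookup.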
Full languages come in three subtypes:
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted in the
main namespace. There may also be reconstructed terms for the language, which are placed in the
{Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full languages
are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto and Volapük,
among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed with
*. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore be in
the Appendix namespace, but they are not reconstructed and therefore should not have *
prefixed in links. Most constructed languages are of this subtype.
Both full languages and etymology-only languages have a {Language} object associated with them. It is fetched using
the {getByCode} function in [[Module:languages]], which converts a language code to a {Language} object. Depending on the
options supplied to this function, etymology-only languages may or may not be accepted, and family codes may be
accepted (returning a {Family} object as described in [[Module:families]]). There are also separate {getByCanonicalName}
functions in [[Module:languages]] and [[Module:etymology languages]] to convert a language's canonical name to a
{Language} object (depending on whether the canonical name refers to a full or etymology-only language).
===Textual representations===
Textual strings belonging to a given language come in several different ''text variants'':
# The ''input text'' is what the user supplies in wikitext, in the parameters to {{tl|m}}, {{tl|l}}, {{tl|ux}},
{{tl|t}}, {{tl|lang}} and the like.
# The ''corrected input text'' is the input text with some corrections and/or normalizations applied, such as
bad-character replacements for certain languages, like replacing `l` or `1` with [[palochka]] in some languages written
in Cyrillic. (FIXME: This currently goes under the name ''display text'' but that will be repurposed below. Also,
[[User:Surjection]] suggests renaming this to ''normalized input text'', but "normalized" is used in a different sense
in [[Module:usex]].)
# The ''display text'' is the text in the form as it will be displayed to the user. This is what appears in headwords,
in usexes, in displayed internal links, etc. This can include accent marks that are removed to form the stripped
display text (see below), as well as embedded bracketed links that are variously processed further. The display text
is generated from the corrected input text by applying language-specific transformations; for most languages, there
will be no such transformations. The general reason for having a difference between input and display text is to allow
for extra information in the input text that is not displayed to the user but is sent to the transliteration module.
Note that having different display and input text is only supported currently through special-casing but will be
generalized. Examples of transformations are: (1) Removing the {{cd|^}} that is used in certain East Asian (and
possibly other unicameral) languages to indicate capitalization of the transliteration (which is currently
special-cased); (2) for Korean, removing or otherwise processing hyphens (which is currently special-cased); (3) for
Arabic, removing a ''sukūn'' diacritic placed over a ''tāʔ marbūṭa'' (like this: ةْ) to indicate that the
''tāʔ marbūṭa'' is pronounced and transliterated as /t/ instead of being silent [NOTE, NOT IMPLEMENTED YET]; (4) for
Thai and Khmer, converting space-separated words to bracketed words and resolving respelling substitutions such as
`[กรีน/กฺรีน]`, which indicate how to transliterate given words [NOTE, NOT IMPLEMENTED YET except in language-specific
templates like {{tl|th-usex}}].
## The ''right-resolved display text'' is the result of removing brackets around one-part embedded links and resolving
two-part embedded links into their right-hand components (i.e. converting two-part links into the displayed form).
The process of right-resolution is what happens when you call {{cd|remove_links()}} in [[Module:links]] on some text.
When applied to the display text, it produces exactly what the user sees, without any link markup.
# The ''stripped display text'' is the result of applying diacritic-stripping to the display text.
## The ''left-resolved stripped display text'' [NEED BETTER NAME] is the result of applying left-resolution to the
stripped display text, i.e. similar to right-resolution but resolving two-part embedded links into their left-hand
components (i.e. the linked-to page). If the display text refers to a single page, applying diacritic stripping and
left-resolution produces the ''logical pagename''.
# The ''physical pagename text'' is the result of converting the stripped display text into physical page links. If the
stripped display text contains embedded links, the left side of those links is converted into physical page links;
otherwise, the entire text is considered a pagename and converted in the same fashion. The conversion does three
things: (1) converts characters not allowed in pagenames into their "unsupported title" representation, e.g.
{{cd|Unsupported titles/`gt`}} in place of the logical name {{cd|>}}; (2) handles certain special-cased
unsupported-title logical pagenames, such as {{cd|Unsupported titles/Space}} in place of {{cd|[space]}} and
{{cd|Unsupported titles/Ancient Greek dish}} in place of a very long Greek name for a gourmet dish as found in
Aristophanes; (3) converts "mammoth" pagenames such as [[a]] into their appropriate split component, e.g.
[[a/languages A to L]].
# The ''source translit text'' is the text as supplied to the language-specific {{cd|transliterate()}} method. The form
of the source translit text may need to be language-specific, e.g. Thai and Khmer will need the corrected input text,
whereas other languages may need to work off the display text. [FIXME: It's still unclear to me how embedded bracketed
links are handled in the existing code.] In general, embedded links need to be right-resolved (see above), but when
this happens is unclear to me [FIXME]. Some languages have a chop-up-and-paste-together scheme that sends parts of the
text through the transliterate mechanism, and for others (those listed with "cont" in {{cd|substitution}} in
[[Module:languages/data]]) they receive the full input text, but preprocessed in certain ways. (The wisdom of this is
still unclear to me.)
# The ''transliterated text'' (or ''transliteration'') is the result of transliterating the source translit text. Unlike
for all the other text variants except the transcribed text, it is always in the Latin script.
# The ''transcribed text'' (or ''transcription'') is the result of transcribing the source translit text, where
"transcription" here means a close approximation to the phonetic form of the language in languages (e.g. Akkadian,
Sumerian, Ancient Egyptian, maybe Tibetan) that have a wide difference between the written letters and spoken form.
Unlike for all the other text variants other than the transliterated text, it is always in the Latin script.
Currently, the transcribed text is always supplied manually by the user; there is no such thing as a
{{cd|transcribe()}} method on language objects.
# The ''sort key'' is the text used in sort keys for determining the placing of pages in categories they belong to. The
sort key is generated from the pagename or a specified ''sort base'' by lowercasing, doing language-specific
transformations and then uppercasing the result. If the sort base is supplied and is generated from input text, it
needs to be converted to display text, have embedded links removed through right-resolution and have
diacritic-stripping applied.
# There are other text variants that occur in usexes (specifically, there are normalized variants of several of the
above text variants), but we can skip them for now.
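
The difference between right-resolution and left-resolution of embedded links can be sketched as follows. This is a
simplified illustration in plain Lua under stated assumptions: `right_resolve` and `left_resolve` are hypothetical
helper names, the patterns only handle well-formed, non-nested `[[target]]` and `[[target|display]]` links, and the
real {{cd|remove_links()}} function in [[Module:links]] covers more cases.

```lua
-- Right-resolution: keep what the reader sees.
local function right_resolve(text)
	-- Two-part links: keep the right-hand (displayed) component.
	text = text:gsub("%[%[[^%[%]|]*|([^%[%]]*)%]%]", "%1")
	-- One-part links: remove the brackets.
	return (text:gsub("%[%[([^%[%]|]*)%]%]", "%1"))
end

-- Left-resolution: keep the linked-to page.
local function left_resolve(text)
	-- Two-part links: keep the left-hand (target) component.
	text = text:gsub("%[%[([^%[%]|]*)|[^%[%]]*%]%]", "%1")
	-- One-part links: remove the brackets.
	return (text:gsub("%[%[([^%[%]|]*)%]%]", "%1"))
end
```

Both helpers agree on one-part links and on text without links; they only differ in which side of a two-part link
survives.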
The following methods exist on {Language} objects to convert between different text variants:
# {correctInputText} (currently called {makeDisplayText}): This converts input text to corrected input text.
# {stripDiacritics}: This converts to stripped display text. [FIXME: This needs some rethinking. In particular,
{stripDiacritics} is sometimes called on input text, corrected input text or display text (in various paths inside of
[[Module:links]], and, in the case of input text, usually from other modules). We need to make sure we don't try to
convert input text to display text twice, but at the same time we need to support calling it directly on input text
since so many modules do this. This means we need to add a parameter indicating whether the passed-in text is input,
corrected input, or display text; if the former two, we call {correctInputText} ourselves.]
# {logicalToPhysical}: This converts logical pagenames to physical pagenames.
# {transliterate}: This appears to convert input text with embedded brackets removed into a transliteration.
[FIXME: This needs some rethinking. In particular, it calls {processDisplayText} on its input, which won't work
for Thai and Khmer, so we may need language-specific flags indicating whether to pass the input text directly to the
language transliterate method. In addition, I'm not sure how embedded links are handled in the existing translit code;
a lot of callers remove the links themselves before calling {transliterate()}, which I assume is wrong.]
# {makeSortKey}: This converts display text (?) to a sort key. [FIXME: Clarify this.]
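
The sort key steps described earlier (lowercase, apply language-specific transformations, then uppercase) can be
sketched as below. This is an assumption-laden illustration rather than the behavior of {makeSortKey}: the
`make_sort_key` function and its `transform` callback are hypothetical, and the sketch uses the byte-based
`string.lower` and `string.upper`, so it is only meaningful for ASCII input, whereas the module uses Unicode-aware
functions.

```lua
-- `sort_base` is assumed to already be stripped display text with links removed.
-- `transform` is an optional, hypothetical stand-in for language-specific substitutions.
local function make_sort_key(sort_base, transform)
	local key = sort_base:lower()
	if transform then
		key = transform(key)
	end
	return key:upper()
end

-- Example transformation: treat hyphens as spaces when sorting.
local function hyphens_to_spaces(text)
	return (text:gsub("%-", " "))
end
```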
]==]
local export = {}
local debug_track_module = "Module:debug/track"
local etymology_languages_data_module = "Module:etymology languages/data"
local families_module = "Module:families"
local headword_page_module = "Module:headword/page"
local json_module = "Module:JSON"
local language_like_module = "Module:language-like"
local languages_data_module = "Module:languages/data"
local languages_data_patterns_module = "Module:languages/data/patterns"
local links_data_module = "Module:links/data"
local load_module = "Module:load"
local scripts_module = "Module:scripts"
local scripts_data_module = "Module:scripts/data"
local string_encode_entities_module = "Module:string/encode entities"
local string_pattern_escape_module = "Module:string/patternEscape"
local string_replacement_escape_module = "Module:string/replacementEscape"
local string_utilities_module = "Module:string utilities"
local table_module = "Module:table"
local utilities_module = "Module:utilities"
local wikimedia_languages_module = "Module:wikimedia languages"
local mw = mw
local string = string
local table = table
local char = string.char
local concat = table.concat
local find = string.find
local floor = math.floor
local get_by_code -- Defined below.
local get_data_module_name -- Defined below.
local get_extra_data_module_name -- Defined below.
local getmetatable = getmetatable
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_known_language_tag = mw.language.isKnownLanguageTag
local make_object -- Defined below.
local match = string.match
local next = next
local pairs = pairs
local remove = table.remove
local require = require
local select = select
local setmetatable = setmetatable
local sub = string.sub
local type = type
local unstrip = mw.text.unstrip
-- Loaded as needed by findBestScript.
local Hans_chars
local Hant_chars
local function check_object(...)
check_object = require(utilities_module).check_object
return check_object(...)
end
local function debug_track(...)
debug_track = require(debug_track_module)
return debug_track(...)
end
local function decode_entities(...)
decode_entities = require(string_utilities_module).decode_entities
return decode_entities(...)
end
local function decode_uri(...)
decode_uri = require(string_utilities_module).decode_uri
return decode_uri(...)
end
local function deep_copy(...)
deep_copy = require(table_module).deepCopy
return deep_copy(...)
end
local function encode_entities(...)
encode_entities = require(string_encode_entities_module)
return encode_entities(...)
end
local function get_L2_sort_key(...)
get_L2_sort_key = require(headword_page_module).get_L2_sort_key
return get_L2_sort_key(...)
end
local function get_script(...)
get_script = require(scripts_module).getByCode
return get_script(...)
end
local function find_best_script_without_lang(...)
find_best_script_without_lang = require(scripts_module).findBestScriptWithoutLang
return find_best_script_without_lang(...)
end
local function get_family(...)
get_family = require(families_module).getByCode
return get_family(...)
end
local function get_plaintext(...)
get_plaintext = require(utilities_module).get_plaintext
return get_plaintext(...)
end
local function get_wikimedia_lang(...)
get_wikimedia_lang = require(wikimedia_languages_module).getByCode
return get_wikimedia_lang(...)
end
local function keys_to_list(...)
keys_to_list = require(table_module).keysToList
return keys_to_list(...)
end
local function list_to_set(...)
list_to_set = require(table_module).listToSet
return list_to_set(...)
end
local function load_data(...)
load_data = require(load_module).load_data
return load_data(...)
end
local function make_family_object(...)
make_family_object = require(families_module).makeObject
return make_family_object(...)
end
local function pattern_escape(...)
pattern_escape = require(string_pattern_escape_module)
return pattern_escape(...)
end
local function replacement_escape(...)
replacement_escape = require(string_replacement_escape_module)
return replacement_escape(...)
end
local function safe_require(...)
safe_require = require(load_module).safe_require
return safe_require(...)
end
local function shallow_copy(...)
shallow_copy = require(table_module).shallowCopy
return shallow_copy(...)
end
local function split(...)
split = require(string_utilities_module).split
return split(...)
end
local function to_json(...)
to_json = require(json_module).toJSON
return to_json(...)
end
local function u(...)
u = require(string_utilities_module).char
return u(...)
end
local function ugsub(...)
ugsub = require(string_utilities_module).gsub
return ugsub(...)
end
local function ulen(...)
ulen = require(string_utilities_module).len
return ulen(...)
end
local function ulower(...)
ulower = require(string_utilities_module).lower
return ulower(...)
end
local function umatch(...)
umatch = require(string_utilities_module).match
return umatch(...)
end
local function uupper(...)
uupper = require(string_utilities_module).upper
return uupper(...)
end
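--[==[
The loader functions above all follow the same self-replacing pattern: the first call loads the real function,
overwrites the local with it, and forwards the arguments, so later calls skip the loading step. A minimal sketch of
that pattern in plain Lua, where `expensive_loader` and `load_count` are hypothetical stand-ins for `require` and are
not part of this module:

```lua
local load_count = 0

-- Stand-in for require(): counts how many times the "module" is loaded.
local function expensive_loader()
	load_count = load_count + 1
	return function(s)
		return s:upper()
	end
end

-- On first use, replace the local `upper` with the real function, then call it.
local function upper(...)
	upper = expensive_loader()
	return upper(...)
end
```

Because `local function` declares the name before the body is compiled, the assignment inside the body rebinds the
same local that callers use.
]==]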
local function track(page)
debug_track("languages/" .. page)
return true
end
local function normalize_code(code)
return load_data(languages_data_module).aliases[code] or code
end
local function check_inputs(self, check, default, ...)
local n = select("#", ...)
if n == 0 then
return false
end
local ret = check(self, (...))
if ret ~= nil then
return ret
elseif n > 1 then
local inputs = {...}
for i = 2, n do
ret = check(self, inputs[i])
if ret ~= nil then
return ret
end
end
end
return default
end
local function make_link(self, target, display)
local prefix, main
if self:getFamilyCode() == "qfa-sub" then
prefix, main = display:match("^(the )(.*)")
if not prefix then
prefix, main = display:match("^(a )(.*)")
end
end
return (prefix or "") .. "[[" .. target .. "|" .. (main or display) .. "]]"
end
-- Convert risky characters to HTML entities, which minimizes interference once returned (e.g. for "sms:a", "<!-- -->" etc.).
local function escape_risky_characters(text)
-- Spacing characters in isolation generally need to be escaped in order to be properly processed by the MediaWiki software.
if umatch(text, "^%s*$") then
return encode_entities(text, text)
end
return encode_entities(text, "!#%&*+/:;<=>?@[\\]_{|}")
end
-- Temporarily convert various formatting characters to PUA to prevent them from being disrupted by the substitution process.
local function doTempSubstitutions(text, subbedChars, keepCarets, noTrim)
-- Clone so that we don't insert any extra patterns into the table in package.loaded. For some reason, using require seems to keep memory use down; probably because the table is always cloned.
local patterns = shallow_copy(require(languages_data_patterns_module))
if keepCarets then
insert(patterns, "((\\+)%^)")
insert(patterns, "((%^))")
end
-- Ensure any whitespace at the beginning and end is temp substituted, to prevent it from being accidentally trimmed. We only want to trim any final spaces added during the substitution process (e.g. by a module), which means we only do this during the first round of temp substitutions.
if not noTrim then
insert(patterns, "^([\128-\191\244]*(%s+))")
insert(patterns, "((%s+)[\128-\191\244]*)$")
end
-- Pre-substitution of "[[" and "]]", which makes pattern matching more accurate.
text = gsub(text, "%f[%[]%[%[", "\1"):gsub("%f[%]]%]%]", "\2")
local i = #subbedChars
for _, pattern in ipairs(patterns) do
-- Patterns ending in \0 stand for things like "[[" or "]]", so the inserted PUA are treated as breaks between terms by modules that scrape info from pages.
local term_divider
pattern = gsub(pattern, "%z$", function(divider)
term_divider = divider == "\0"
return ""
end)
text = gsub(text, pattern, function(...)
local m = {...}
local m1New = m[1]
for k = 2, #m do
local n = i + k - 1
subbedChars[n] = m[k]
local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
local byte3 = floor(n / 64) % 64 + 128
local byte4 = n % 64 + 128
m1New = gsub(m1New, pattern_escape(m[k]), "\244" .. char(byte2) .. char(byte3) .. char(byte4), 1)
end
i = i + #m - 1
return m1New
end)
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text, subbedChars
end
-- Reinsert any formatting that was temporarily substituted.
local function undoTempSubstitutions(text, subbedChars)
for i = 1, #subbedChars do
local byte2 = floor(i / 4096) % 64 + 128
local byte3 = floor(i / 64) % 64 + 128
local byte4 = i % 64 + 128
text = gsub(text, "\244[" .. char(byte2) .. char(byte2+8) .. "]" .. char(byte3) .. char(byte4),
replacement_escape(subbedChars[i]))
end
text = gsub(text, "\1", "%[%["):gsub("\2", "%]%]")
return text
end
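--[==[
The placeholder scheme used by the two functions above packs an index into a four-byte sequence starting with byte
244. The following sketch isolates that arithmetic in plain Lua. The names `encode_placeholder` and
`decode_placeholder` are hypothetical, and the decoder assumes the index is below 32768 so that the second byte
unambiguously records whether the placeholder is a term divider.

```lua
local floor, char, byte = math.floor, string.char, string.byte

-- Encode index `n` as "\244" plus three bytes carrying 6 bits each.
local function encode_placeholder(n, term_divider)
	local byte2 = floor(n / 4096) % 64 + (term_divider and 128 or 136)
	local byte3 = floor(n / 64) % 64 + 128
	local byte4 = n % 64 + 128
	return "\244" .. char(byte2, byte3, byte4)
end

-- Recover the index and the term-divider flag (valid for n < 32768 only).
local function decode_placeholder(s)
	local b1, b2, b3, b4 = byte(s, 1, 4)
	assert(b1 == 244, "not a placeholder")
	local is_divider = b2 < 136
	local high = b2 - (is_divider and 128 or 136)
	return high * 4096 + (b3 - 128) * 64 + (b4 - 128), is_divider
end
```
]==]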
-- Check if the raw text is an unsupported title, and if so return that. Otherwise, remove HTML entities. We do the pre-conversion to avoid loading the unsupported title list unnecessarily.
local function checkNoEntities(self, text)
local textNoEnc = decode_entities(text)
if textNoEnc ~= text and load_data(links_data_module).unsupported_titles[text] then
return text
else
return textNoEnc
end
end
-- If no script object is provided (or if it's invalid or None), get one.
local function checkScript(text, self, sc)
if not check_object("script", true, sc) or sc:getCode() == "None" then
return self:findBestScript(text)
end
return sc
end
local function normalize(text, sc)
text = sc:fixDiscouragedSequences(text)
return sc:toFixedNFD(text)
end
-- Subfunction of iterateSectionSubstitutions(). Process an individual chunk of text according to the specifications in
-- `substitution_data`. The input parameters are all as in the documentation of iterateSectionSubstitutions() except for
-- `recursed`, which is set to true if we called ourselves recursively to process a script-specific setting or
-- script-wide fallback. Returns two values: the processed text and the actual substitution data used to do the
-- substitutions (same as the `actual_substitution_data` return value to iterateSectionSubstitutions()).
local function doSubstitutions(self, text, sc, substitution_data, data_field, function_name, recursed)
-- BE CAREFUL in this function because the value at any level can be `false`, which causes no processing to be done
-- and blocks any further fallback processing.
local actual_substitution_data = substitution_data
-- If there are language-specific substitutes given in the data module, use those.
if type(substitution_data) == "table" then
-- If a script is specified, run this function with the script-specific data before continuing.
local sc_code = sc:getCode()
local has_substitution_data = false
if substitution_data[sc_code] ~= nil then
has_substitution_data = true
if substitution_data[sc_code] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[sc_code], data_field,
function_name, true)
end
-- Hant, Hans and Hani are usually treated the same, so add a special case to avoid having to specify each one
-- separately.
elseif sc_code:match("^Han") and substitution_data.Hani ~= nil then
has_substitution_data = true
if substitution_data.Hani then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data.Hani, data_field,
function_name, true)
end
-- Substitution data with key 1 in the outer table may be given as a fallback.
elseif substitution_data[1] ~= nil then
has_substitution_data = true
if substitution_data[1] then
text, actual_substitution_data = doSubstitutions(self, text, sc, substitution_data[1], data_field,
function_name, true)
end
end
-- Iterate over all strings in the "from" subtable, and gsub with the corresponding string in "to". We work with
-- the NFD decomposed forms, as this simplifies many substitutions.
if substitution_data.from then
has_substitution_data = true
for i, from in ipairs(substitution_data.from) do
-- Normalize each loop, to ensure multi-stage substitutions work correctly.
text = sc:toFixedNFD(text)
text = ugsub(text, sc:toFixedNFD(from), substitution_data.to[i] or "")
end
end
if substitution_data.remove_diacritics then
has_substitution_data = true
text = sc:toFixedNFD(text)
-- Convert exceptions to PUA.
local remove_exceptions, substitutes = substitution_data.remove_exceptions
if remove_exceptions then
substitutes = {}
local i = 0
for _, exception in ipairs(remove_exceptions) do
exception = sc:toFixedNFD(exception)
text = ugsub(text, exception, function(m)
i = i + 1
local subst = u(0x80000 + i)
substitutes[subst] = m
return subst
end)
end
end
-- Strip diacritics.
text = ugsub(text, "[" .. substitution_data.remove_diacritics .. "]", "")
-- Convert exceptions back.
if remove_exceptions then
text = text:gsub("\242[\128-\191]*", substitutes)
end
end
if not has_substitution_data and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
elseif type(substitution_data) == "string" then
-- If there is a dedicated function module, use that.
local module = safe_require("Module:" .. substitution_data)
if module then
-- TODO: translit functions should take objects, not codes.
-- TODO: translit functions should be called with form NFD.
if function_name == "tr" then
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named 'tr'"):format(substitution_data))
end
text = module[function_name](text, self._code, sc:getCode())
elseif function_name == "stripDiacritics" then
-- FIXME, get rid of this arm after renaming makeEntryName -> stripDiacritics.
if module[function_name] then
text = module[function_name](sc:toFixedNFD(text), self, sc)
elseif module.makeEntryName then
text = module.makeEntryName(sc:toFixedNFD(text), self, sc)
else
error(("Internal error: Module [[%s]] has no function named 'stripDiacritics' or 'makeEntryName'"
):format(substitution_data))
end
else
if not module[function_name] then
error(("Internal error: Module [[%s]] has no function named '%s'"):format(
substitution_data, function_name))
end
text = module[function_name](sc:toFixedNFD(text), self, sc)
end
else
error("Substitution data '" .. substitution_data .. "' does not match an existing module.")
end
elseif substitution_data == nil and sc._data[data_field] then
-- If language-specific sort key (etc.) is nil, fall back to script-wide sort key (etc.).
text, actual_substitution_data = doSubstitutions(self, text, sc, sc._data[data_field], data_field,
function_name, true)
end
-- Don't normalize to NFC if this is the inner loop or if a module returned nil.
if recursed or not text then
return text, actual_substitution_data
end
-- Fix any discouraged sequences created during the substitution process, and normalize into the final form.
return sc:toFixedNFC(sc:fixDiscouragedSequences(text)), actual_substitution_data
end
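--[==[
The `from`/`to` loop in doSubstitutions() can be illustrated in isolation. This is a simplified sketch in plain Lua:
`apply_from_to` is a hypothetical name, and it uses the byte-based `string.gsub` on ASCII input, whereas the function
above works on NFD-normalized text with a Unicode-aware gsub.

```lua
-- Apply each pattern in `data.from` in order, replacing with the matching entry in `data.to`.
-- A missing `to` entry means the match is deleted.
local function apply_from_to(text, data)
	for i, from in ipairs(data.from) do
		text = text:gsub(from, data.to[i] or "")
	end
	return text
end
```

Substitutions are applied sequentially, so a later pattern sees the output of the earlier ones.
]==]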
-- Split the text into sections, based on the presence of temporarily substituted formatting characters, then iterate
-- over each section to apply substitutions (e.g. transliteration or diacritic stripping). This avoids putting PUA
-- characters through language-specific modules, which may be unequipped for them. This function is passed the following
-- values:
-- * `self` (the Language object);
-- * `text` (the text to process);
-- * `sc` (the script of the text, which must be specified; callers should call checkScript() as needed to autodetect the
-- script of the text if not given explicitly by the user);
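-- * `subbedChars` (an array of the substrings that have been temporarily substituted so far, indexed by placeholder
-- number, or {nil} if no substitutions are to happen);
-- * `keepCarets` (if set, carets and their backslash escapes are also temporarily substituted, so that they survive
-- the substitution process);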
-- * `substitution_data` (the data indicating which substitutions to apply, taken directly from `data_field` in the
-- language's data structure in a submodule of [[Module:languages/data]]);
-- * `data_field` (the data field from which `substitution_data` was fetched, such as "sort_key" or "strip_diacritics");
-- * `function_name` (the name of the function to call to do the substitution, in case `substitution_data` specifies a
-- module to do the substitution);
-- * `notrim` (don't trim whitespace at the edges of `text`; set when computing the sort key, because whitespace at the
-- beginning of a sort key is significant and causes the resulting page to be sorted at the beginning of the category
-- it's in).
-- Returns three values:
-- (1) the processed text;
-- (2) the value of `subbedChars` that was passed in, possibly modified with additional character substitutions; will be
-- {nil} if {nil} was passed in;
-- (3) the actual substitution data that was used to apply substitutions to `text`; this may be different from the value
-- of `substitution_data` passed in if that value recursively specified script-specific substitutions or if no
-- substitution data could be found in the language-specific data (e.g. {nil} was passed in or a structure was passed
-- in that had no setting for the script given in `sc`), but a script-wide fallback value was set; currently it is
-- only used by makeSortKey().
local function iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, substitution_data, data_field,
function_name, notrim)
local sections
-- See [[Module:languages/data]].
if not find(text, "\244") or load_data(languages_data_module).substitution[self._code] == "cont" then
sections = {text}
else
sections = split(text, "\244[\128-\143][\128-\191]*", true)
end
local actual_substitution_data
for _, section in ipairs(sections) do
-- Don't bother processing empty strings or whitespace (which may also not be handled well by dedicated
-- modules).
if gsub(section, "%s+", "") ~= "" then
local sub, this_actual_substitution_data = doSubstitutions(self, section, sc, substitution_data, data_field,
function_name)
actual_substitution_data = this_actual_substitution_data
-- Second round of temporary substitutions, in case any formatting was added by the main substitution
-- process. However, don't do this if the section contains formatting already (as it would have had to have
-- been escaped to reach this stage, and therefore should be given as raw text).
if sub and subbedChars then
local noSub
for _, pattern in ipairs(require(languages_data_patterns_module)) do
if match(section, pattern .. "%z?") then
noSub = true
end
end
if not noSub then
sub, subbedChars = doTempSubstitutions(sub, subbedChars, keepCarets, true)
end
end
if not sub then
text = sub
break
end
text = sub and gsub(text, pattern_escape(section), replacement_escape(sub), 1) or text
end
end
if not notrim then
-- Trim, unless there are only spacing characters, while ignoring any final formatting characters.
-- Do not trim sort keys because spaces at the beginning are significant.
text = text and text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2"):gsub("(%S)%s+([\128-\191\244]*)$", "%1%2") or
nil
end
return text, subbedChars, actual_substitution_data
end
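--[=[
A minimal standalone sketch of the trimming step at the end of iterateSectionSubstitutions(). The helper name
`trim_keeping_formatting` is hypothetical; only the two patterns are taken from the code above. It assumes plain
byte strings and standard `string.gsub`, and is meant as an illustration rather than a replacement.

```lua
-- Hypothetical helper; the patterns are copied from the trimming step above.
local function trim_keeping_formatting(text)
	-- Drop leading whitespace that follows any run of formatting bytes.
	text = text:gsub("^([\128-\191\244]*)%s+(%S)", "%1%2")
	-- Drop trailing whitespace that precedes any final formatting bytes.
	text = text:gsub("(%S)%s+([\128-\191\244]*)$", "%1%2")
	return text
end

print(trim_keeping_formatting("  abc  "))
```

Because both patterns require a non-space character next to the whitespace, a string made only of spaces is left
untouched, which matches the "unless there are only spacing characters" remark in the code.
]=]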
-- Process carets (and any escapes). Default to simple removal, if no pattern/replacement is given.
local function processCarets(text, pattern, repl)
local rep
repeat
text, rep = gsub(text, "\\\\(\\*^)", "\3%1")
until rep == 0
return (text:gsub("\\^", "\4")
:gsub(pattern or "%^", repl or "")
:gsub("\3", "\\")
:gsub("\4", "^"))
end
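--[=[
A standalone sketch of how processCarets() treats escapes, assuming standard `string.gsub` in place of the
module-level `gsub` alias. The name `process_carets_sketch` is hypothetical; the patterns mirror the function above.

```lua
local function process_carets_sketch(text, pattern, repl)
	local rep
	repeat
		-- Protect an escaped backslash that precedes a caret.
		text, rep = text:gsub("\\\\(\\*^)", "\3%1")
	until rep == 0
	text = text:gsub("\\^", "\4")                 -- protect escaped carets
	text = text:gsub(pattern or "%^", repl or "") -- handle the remaining carets
	text = text:gsub("\3", "\\")                  -- restore backslashes
	text = text:gsub("\4", "^")                   -- restore literal carets
	return text
end

print(process_carets_sketch("^abc"))   -- unescaped caret is removed
print(process_carets_sketch("a\\^b"))  -- escaped caret survives as a literal caret
```

With the default arguments, an unescaped caret disappears, `\^` yields a literal caret, and `\\^` yields a single
backslash with the caret removed.
]=]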
-- Remove carets if they are used to capitalize parts of transliterations (unless they have been escaped).
local function removeCarets(text, sc)
if not sc:hasCapitalization() and sc:isTransliterated() and text:find("^", 1, true) then
return processCarets(text)
else
return text
end
end
local Language = {}
--[==[Returns the language code of the language. Example: {{code|lua|"fr"}} for French.]==]
function Language:getCode()
return self._code
end
--[==[Returns the canonical name of the language. This is the name used to represent that language on Wiktionary, and is guaranteed to be unique to that language alone. Example: {{code|lua|"French"}} for French.]==]
function Language:getCanonicalName()
local name = self._name
if name == nil then
name = self._data[1]
self._name = name
end
return name
end
--[==[
Return the display form of the language. The display form of a language, family or script is the form it takes when
appearing as the <code><var>source</var></code> in categories such as <code>English terms derived from
<var>source</var></code> or <code>English given names from <var>source</var></code>, and is also the displayed text
in {makeCategoryLink()} links. For full and etymology-only languages, this is the same as the canonical name, but
for families, it reads <code>"<var>name</var> languages"</code> (e.g. {"Indo-Iranian languages"}), and for scripts,
it reads <code>"<var>name</var> script"</code> (e.g. {"Arabic script"}).
]==]
function Language:getDisplayForm()
local form = self._displayForm
if form == nil then
form = self:getCanonicalName()
-- Add article and " substrate" to substrates that lack them.
if self:getFamilyCode() == "qfa-sub" then
if not (sub(form, 1, 4) == "the " or sub(form, 1, 2) == "a ") then
form = "a " .. form
end
if not match(form, " [Ss]ubstrate") then
form = form .. " substrate"
end
end
self._displayForm = form
end
return form
end
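--[=[
A small sketch of the substrate branch of getDisplayForm(), written against plain strings. The helper name and the
sample name "Foo" are hypothetical; the checks mirror the ones above.

```lua
local function substrate_display_form(name)
	-- Prepend an indefinite article unless the name already starts with one.
	if not (name:sub(1, 4) == "the " or name:sub(1, 2) == "a ") then
		name = "a " .. name
	end
	-- Append " substrate" unless the name already mentions it.
	if not name:match(" [Ss]ubstrate") then
		name = name .. " substrate"
	end
	return name
end

print(substrate_display_form("Foo"))
```
]=]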
--[==[Returns the value which should be used in the HTML lang= attribute for tagged text in the language.]==]
function Language:getHTMLAttribute(sc, region)
local code = self._code
if not find(code, "-", 1, true) then
return code .. "-" .. sc:getCode() .. (region and "-" .. region or "")
end
local parent = self:getParent()
region = region or match(code, "%f[%u][%u-]+%f[%U]")
if parent then
return parent:getHTMLAttribute(sc, region)
end
-- TODO: ISO family codes can also be used.
return "mis-" .. sc:getCode() .. (region and "-" .. region or "")
end
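--[=[
A sketch of the region extraction used by getHTMLAttribute(), assuming plain strings instead of Language and Script
objects. The helper names and the script code "Latn" are assumptions for illustration; the frontier pattern is the
one used above.

```lua
-- Hypothetical helpers; only the frontier pattern is taken from getHTMLAttribute().
local function region_from_code(code)
	-- An upper-case run bounded by non-upper-case characters, e.g. the "CA" in "fr-CA".
	return code:match("%f[%u][%u-]+%f[%U]")
end

local function html_lang(full_code, script_code, region)
	return full_code .. "-" .. script_code .. (region and "-" .. region or "")
end

print(html_lang("fr", "Latn", region_from_code("fr-CA")))
```

A code such as "ang-nor" has no upper-case run, so no region is derived and only the parent's code and the script
code end up in the attribute.
]=]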
--[==[Returns a table of the aliases that the language is known by, excluding the canonical name. Aliases are synonyms for the language in question. The names are not guaranteed to be unique, in that sometimes more than one language is known by the same name. Example: {{code|lua|{"High German", "New High German", "Deutsch"} }} for [[:Category:German language|German]].]==]
function Language:getAliases()
self:loadInExtraData()
return require(language_like_module).getAliases(self)
end
--[==[
Return a table of the known subvarieties of a given language, excluding subvarieties that have been given
explicit etymology-only language codes. The names are not guaranteed to be unique, in that sometimes a given name
refers to a subvariety of more than one language. Example: {{code|lua|{"Southern Aymara", "Central Aymara"} }} for
[[:Category:Aymara language|Aymara]]. Note that the returned value can have nested tables in it, when a subvariety
goes by more than one name. Example: {{code|lua|{"North Azerbaijani", "South Azerbaijani", {"Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar"}, {"Qashqa'i", "Qashqai", "Kashkay"}, "Sonqor"} }} for
[[:Category:Azerbaijani language|Azerbaijani]]. Here, for example, Afshar, Afshari, Afshar Azerbaijani and Afchar
all refer to the same subvariety, whose preferred name is Afshar (the one listed first). To avoid a return value
with nested tables in it, specify a non-{{code|lua|nil}} value for the <code>flatten</code> parameter; in that case,
the return value would be {{code|lua|{"North Azerbaijani", "South Azerbaijani", "Afshar", "Afshari",
"Afshar Azerbaijani", "Afchar", "Qashqa'i", "Qashqai", "Kashkay", "Sonqor"} }}.
]==]
function Language:getVarieties(flatten)
self:loadInExtraData()
return require(language_like_module).getVarieties(self, flatten)
end
--[==[Returns a table of the "other names" that the language is known by, which are listed in the <code>otherNames</code> field. It should be noted that the <code>otherNames</code> field itself is deprecated, and entries listed there should eventually be moved to either <code>aliases</code> or <code>varieties</code>.]==]
function Language:getOtherNames() -- To be eventually removed, once there are no more uses of the `otherNames` field.
self:loadInExtraData()
return require(language_like_module).getOtherNames(self)
end
--[==[
Return a combined table of the canonical name, aliases, varieties and other names of a given language.]==]
function Language:getAllNames()
self:loadInExtraData()
return require(language_like_module).getAllNames(self)
end
--[==[Returns a table of types as a lookup table (with the types as keys).
The possible types are
* {language}: This is a language, either full or etymology-only.
* {full}: This is a "full" (not etymology-only) language, i.e. the union of {regular}, {reconstructed} and
{appendix-constructed}. Note that the types {full} and {etymology-only} also exist for families, so if you
want to check specifically for a full language and you have an object that might be a family, you should
use {{lua|hasType("language", "full")}} and not simply {{lua|hasType("full")}}.
* {etymology-only}: This is an etymology-only (not full) language, whose parent is another etymology-only
language or a full language. Note that the types {full} and {etymology-only} also exist for
families, so if you want to check specifically for an etymology-only language and you have an
object that might be a family, you should use {{lua|hasType("language", "etymology-only")}}
and not simply {{lua|hasType("etymology-only")}}.
* {regular}: This indicates a full language that is attested according to [[WT:CFI]] and therefore permitted
in the main namespace. There may also be reconstructed terms for the language, which are placed in
the {Reconstruction} namespace and must be prefixed with * to indicate a reconstruction. Most full
languages are natural (not constructed) languages, but a few constructed languages (e.g. Esperanto
and Volapük, among others) are also allowed in the mainspace and considered regular languages.
* {reconstructed}: This language is not attested according to [[WT:CFI]], and therefore is allowed only in the
{Reconstruction} namespace. All terms in this language are reconstructed, and must be prefixed
with *. Languages such as Proto-Indo-European and Proto-Germanic are in this category.
* {appendix-constructed}: This language is attested but does not meet the additional requirements set out for
constructed languages ([[WT:CFI#Constructed languages]]). Its entries must therefore
be in the Appendix namespace, but they are not reconstructed and therefore should
not have * prefixed in links.
]==]
function Language:getTypes()
local types = self._types
if types == nil then
types = {language = true}
if self:getFullCode() == self._code then
types.full = true
else
types["etymology-only"] = true
end
for t in gmatch(self._data.type, "[^,]+") do
types[t] = true
end
self._types = types
end
return types
end
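--[=[
A sketch of how getTypes() builds its lookup table, using plain arguments instead of a Language object. The helper
name and the sample `type_field` values are assumptions; the real values come from the data modules.

```lua
local function build_types(code, full_code, type_field)
	local types = {language = true}
	-- A language whose full code is its own code is a full language.
	if full_code == code then
		types.full = true
	else
		types["etymology-only"] = true
	end
	-- The type field is a comma-separated list; each item becomes a key.
	for t in type_field:gmatch("[^,]+") do
		types[t] = true
	end
	return types
end

local types = build_types("fr-CA", "fr", "regular")
print(types["etymology-only"], types.full)
```
]=]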
--[==[Given a list of types as strings, returns true if the language has all of them.]==]
function Language:hasType(...)
Language.hasType = require(language_like_module).hasType
return self:hasType(...)
end
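--[=[
Several methods here (hasType, getWikipediaArticle, getCommonsCategory, getWikidataItem) replace themselves with the
shared implementation on first use. A minimal sketch of that pattern follows; `load_has_type` stands in for
`require(language_like_module).hasType`, and its body is an assumption, not the real implementation.

```lua
local calls = 0

-- Stand-in for loading the shared implementation; counts how often it is loaded.
local function load_has_type()
	calls = calls + 1
	return function(self, ...)
		for i = 1, select("#", ...) do
			if not self.types[select(i, ...)] then
				return false
			end
		end
		return true
	end
end

local Lang = {}
Lang.__index = Lang

function Lang:hasType(...)
	-- Overwrite the stub with the real function, then dispatch to it.
	Lang.hasType = load_has_type()
	return self:hasType(...)
end

local obj = setmetatable({types = {language = true, full = true}}, Lang)
print(obj:hasType("language", "full"))
```

After the first call the stub is gone, so later calls go straight to the loaded function and the load happens once.
]=]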
--[==[Returns a table containing <code>WikimediaLanguage</code> objects (see [[Module:wikimedia languages]]), which represent languages and their codes as they are used in Wikimedia projects for interwiki linking and such. More than one object may be returned, as a single Wiktionary language may correspond to multiple Wikimedia languages. For example, Wiktionary's single code <code>sh</code> (Serbo-Croatian) maps to four Wikimedia codes: <code>sh</code> (Serbo-Croatian), <code>bs</code> (Bosnian), <code>hr</code> (Croatian) and <code>sr</code> (Serbian).
The code for the Wikimedia language is retrieved from the <code>wikimedia_codes</code> property in the data modules. If that property is not present, the code of the current language is used. If none of the available codes is actually a valid Wikimedia code, an empty table is returned.]==]
function Language:getWikimediaLanguages()
local wm_langs = self._wikimediaLanguageObjects
if wm_langs == nil then
local codes = self:getWikimediaLanguageCodes()
wm_langs = {}
for i = 1, #codes do
wm_langs[i] = get_wikimedia_lang(codes[i])
end
self._wikimediaLanguageObjects = wm_langs
end
return wm_langs
end
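-- Returns the Wikimedia language codes for the language: the `wikimedia_codes` data field if set; otherwise the
-- language's own code if it is a known language tag; otherwise the parent's codes (or an empty table).
function Language:getWikimediaLanguageCodes()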
local wm_langs = self._wikimediaLanguageCodes
if wm_langs == nil then
wm_langs = self._data.wikimedia_codes
if wm_langs then
wm_langs = split(wm_langs, ",", true, true)
else
local code = self._code
if is_known_language_tag(code) then
wm_langs = {code}
else
-- Inherit, but only if no codes are specified in the data *and*
-- the language code isn't a valid Wikimedia language code.
local parent = self:getParent()
wm_langs = parent and parent:getWikimediaLanguageCodes() or {}
end
end
self._wikimediaLanguageCodes = wm_langs
end
return wm_langs
end
--[==[
Returns the name of the Wikipedia article for the language. `project` specifies the language and project to retrieve
the article from, defaulting to {"enwiki"} for the English Wikipedia. Normally if specified it should be the project
code for a specific-language Wikipedia e.g. "zhwiki" for the Chinese Wikipedia, but it can be any project, including
non-Wikipedia ones. If the project is the English Wikipedia and the property {wikipedia_article} is present in the data
module it will be used first. In all other cases, a sitelink will be generated from {:getWikidataItem} (if set). The
resulting value (or lack of value) is cached so that subsequent calls are fast. If no value could be determined, and
`noCategoryFallback` is {false}, {:getCategoryName} is used as fallback; otherwise, {nil} is returned. Note that if
`noCategoryFallback` is {nil} or omitted, it defaults to {false} if the project is the English Wikipedia, otherwise
to {true}. In other words, under normal circumstances, if the English Wikipedia article couldn't be retrieved, the
return value will fall back to a link to the language's category, but this won't normally happen for any other project.
]==]
function Language:getWikipediaArticle(noCategoryFallback, project)
Language.getWikipediaArticle = require(language_like_module).getWikipediaArticle
return self:getWikipediaArticle(noCategoryFallback, project)
end
function Language:makeWikipediaLink()
return make_link(self, "w:" .. self:getWikipediaArticle(), self:getCanonicalName())
end
--[==[Returns the name of the Wikimedia Commons category page for the language.]==]
function Language:getCommonsCategory()
Language.getCommonsCategory = require(language_like_module).getCommonsCategory
return self:getCommonsCategory()
end
--[==[Returns the Wikidata item id for the language or <code>nil</code>. This corresponds to the second field in the data modules.]==]
function Language:getWikidataItem()
Language.getWikidataItem = require(language_like_module).getWikidataItem
return self:getWikidataItem()
end
--[==[Returns a table of <code>Script</code> objects for all scripts that the language is written in. See [[Module:scripts]].]==]
function Language:getScripts()
local scripts = self._scriptObjects
if scripts == nil then
local codes = self:getScriptCodes()
if codes[1] == "All" then
scripts = load_data(scripts_data_module)
else
scripts = {}
for i = 1, #codes do
scripts[i] = get_script(codes[i])
end
end
self._scriptObjects = scripts
end
return scripts
end
--[==[Returns the table of script codes in the language's data file.]==]
function Language:getScriptCodes()
local scripts = self._scriptCodes
if scripts == nil then
scripts = self._data[4]
if scripts then
local codes, n = {}, 0
for code in gmatch(scripts, "[^,]+") do
n = n + 1
-- Special handling of "Hants", which represents "Hani", "Hant" and "Hans" collectively.
if code == "Hants" then
codes[n] = "Hani"
codes[n + 1] = "Hant"
codes[n + 2] = "Hans"
n = n + 2
else
codes[n] = code
end
end
scripts = codes
else
scripts = {"None"}
end
self._scriptCodes = scripts
end
return scripts
end
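--[=[
A standalone sketch of the "Hants" expansion performed by getScriptCodes(), taking the comma-separated script field
directly. The helper name is hypothetical and the sample input is made up for illustration.

```lua
local function expand_script_codes(scripts)
	local codes, n = {}, 0
	for code in scripts:gmatch("[^,]+") do
		if code == "Hants" then
			-- "Hants" stands for "Hani", "Hant" and "Hans" collectively.
			codes[n + 1], codes[n + 2], codes[n + 3] = "Hani", "Hant", "Hans"
			n = n + 3
		else
			n = n + 1
			codes[n] = code
		end
	end
	return codes
end

print(table.concat(expand_script_codes("Latn,Hants"), ","))
```
]=]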
--[==[Given some text, this function iterates through the scripts of a given language and tries to find the script that best matches the text. It returns a {{code|lua|Script}} object representing the script. If no match is found at all, it returns the {{code|lua|None}} script object.]==]
function Language:findBestScript(text, forceDetect)
if not text or text == "" or text == "-" then
return get_script("None")
end
-- Differs from table returned by getScriptCodes, as Hants is not normalized into its constituents.
local codes = self._bestScriptCodes
if codes == nil then
codes = self._data[4]
codes = codes and split(codes, ",", true, true) or {"None"}
self._bestScriptCodes = codes
end
local first_sc = codes[1]
if first_sc == "All" then
return find_best_script_without_lang(text)
end
local codes_len = #codes
if not (forceDetect or first_sc == "Hants" or codes_len > 1) then
first_sc = get_script(first_sc)
local charset = first_sc.characters
return charset and umatch(text, "[" .. charset .. "]") and first_sc or get_script("None")
end
-- Remove all formatting characters.
text = get_plaintext(text)
-- Remove all spaces and any ASCII punctuation. Some non-ASCII punctuation is script-specific, so can't be removed.
text = ugsub(text, "[%s!\"#%%&'()*,%-./:;?@[\\%]_{}]+", "")
if #text == 0 then
return get_script("None")
end
-- Try to match every script against the text,
-- and return the one with the most matching characters.
local bestcount, bestscript, length = 0
for i = 1, codes_len do
local sc = codes[i]
-- Special case for "Hants", which is a special code that represents whichever of "Hant" or "Hans" best matches, or "Hani" if they match equally. This avoids having to list all three. In addition, "Hants" will be treated as the best match if there is at least one matching character, under the assumption that a Han script is desirable in terms that contain a mix of Han and other scripts (not counting those which use Jpan or Kore).
if sc == "Hants" then
local Hani = get_script("Hani")
if not Hant_chars then
Hant_chars = load_data("Module:zh/data/ts")
Hans_chars = load_data("Module:zh/data/st")
end
local t, s, found = 0, 0
-- This is faster than using mw.ustring.gmatch directly.
for ch in gmatch((ugsub(text, "[" .. Hani.characters .. "]", "\255%0")), "\255(.[\128-\191]*)") do
found = true
if Hant_chars[ch] then
t = t + 1
if Hans_chars[ch] then
s = s + 1
end
elseif Hans_chars[ch] then
s = s + 1
else
t, s = t + 1, s + 1
end
end
if found then
if t == s then
return Hani
end
return get_script(t > s and "Hant" or "Hans")
end
else
sc = get_script(sc)
if not length then
length = ulen(text)
end
-- Count characters by removing everything in the script's charset and comparing to the original length.
local charset = sc.characters
local count = charset and length - ulen((ugsub(text, "[" .. charset .. "]+", ""))) or 0
if count >= length then
return sc
elseif count > bestcount then
bestcount = count
bestscript = sc
end
end
end
-- Return best matching script, or otherwise None.
return bestscript or get_script("None")
end
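--[=[
A simplified sketch of the counting loop in findBestScript(): each script's count is the text length minus the
length left after removing that script's characters. This version is ASCII-only and uses byte lengths; the script
table and its codes are made up, whereas the real code uses Unicode charsets via mw.ustring.

```lua
-- ASCII-only stand-ins for Script objects (assumption for illustration).
local scripts = {
	{code = "Upper", characters = "A-Z"},
	{code = "Digit", characters = "0-9"},
}

local function best_script(text)
	local length = #text
	if length == 0 then
		return "None"
	end
	local bestcount, bestscript = 0, nil
	for _, sc in ipairs(scripts) do
		-- Count matching characters by deleting them and comparing lengths.
		local stripped = text:gsub("[" .. sc.characters .. "]+", "")
		local count = length - #stripped
		if count >= length then
			return sc.code -- every character matched; no need to look further
		elseif count > bestcount then
			bestcount, bestscript = count, sc.code
		end
	end
	return bestscript or "None"
end

print(best_script("AB1234"))
```

Ties go to the script listed first, because a later script only wins with a strictly higher count.
]=]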
--[==[Returns a <code>Family</code> object for the language family that the language belongs to. See [[Module:families]].]==]
function Language:getFamily()
local family = self._familyObject
if family == nil then
family = self:getFamilyCode()
-- If the value is nil, it's cached as false.
family = family and get_family(family) or false
self._familyObject = family
end
return family or nil
end
--[==[Returns the family code in the language's data file.]==]
function Language:getFamilyCode()
local family = self._familyCode
if family == nil then
-- If the value is nil, it's cached as false.
family = self._data[3] or false
self._familyCode = family
end
return family or nil
end
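--[=[
The getters in this part of the module cache a missing value as `false`, so that `nil` can keep meaning "not yet
computed". A minimal sketch of that idiom, with a hypothetical object and a counter to show the lookup runs once:

```lua
local lookups = 0
local obj = {_data = {}} -- no family code at index 3

local function get_family_code(self)
	local family = self._familyCode
	if family == nil then
		lookups = lookups + 1
		-- A missing value is stored as false so the lookup is not repeated.
		family = self._data[3] or false
		self._familyCode = family
	end
	-- Callers still see nil, never false.
	return family or nil
end

print(get_family_code(obj), get_family_code(obj), lookups)
```
]=]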
-- Returns the canonical name of the language's family, or nil if it has none.
function Language:getFamilyName()
local family = self._familyName
if family == nil then
family = self:getFamily()
-- If the value is nil, it's cached as false.
family = family and family:getCanonicalName() or false
self._familyName = family
end
return family or nil
end
do
local function check_family(self, family)
if type(family) == "table" then
family = family:getCode()
end
if self:getFamilyCode() == family then
return true
end
local self_family = self:getFamily()
if self_family:inFamily(family) then
return true
-- If the family isn't a real family (e.g. creoles) check any ancestors.
elseif self_family:inFamily("qfa-not") then
local ancestors = self:getAncestors()
for _, ancestor in ipairs(ancestors) do
if ancestor:inFamily(family) then
return true
end
end
end
end
--[==[Check whether the language belongs to `family` (which can be a family code or object). A list of objects can be given in place of `family`; in that case, return true if the language belongs to any of the specified families. Note that some languages (in particular, certain creoles) can have multiple immediate ancestors potentially belonging to different families; in that case, return true if the language belongs to any of the specified families.]==]
function Language:inFamily(...)
if self:getFamilyCode() == nil then
return false
end
return check_inputs(self, check_family, false, ...)
end
end
-- Returns the parent of the language as an object, or nil if it has none. The result is cached.
function Language:getParent()
local parent = self._parentObject
if parent == nil then
parent = self:getParentCode()
-- If the value is nil, it's cached as false.
parent = parent and get_by_code(parent, nil, true, true) or false
self._parentObject = parent
end
return parent or nil
end
function Language:getParentCode()
local parent = self._parentCode
if parent == nil then
-- If the value is nil, it's cached as false.
parent = self._data.parent or false
self._parentCode = parent
end
return parent or nil
end
function Language:getParentName()
local parent = self._parentName
if parent == nil then
parent = self:getParent()
-- If the value is nil, it's cached as false.
parent = parent and parent:getCanonicalName() or false
self._parentName = parent
end
return parent or nil
end
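-- Returns a list of the language's parents, nearest first, obtained by following parent links upwards.
function Language:getParentChain()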
local chain = self._parentChain
if chain == nil then
chain = {}
local parent, n = self:getParent(), 0
while parent do
n = n + 1
chain[n] = parent
parent = parent:getParent()
end
self._parentChain = chain
end
return chain
end
do
local function check_lang(self, lang)
for _, parent in ipairs(self:getParentChain()) do
if (type(lang) == "string" and lang or lang:getCode()) == parent:getCode() then
return true
end
end
end
function Language:hasParent(...)
return check_inputs(self, check_lang, false, ...)
end
end
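--[=[
A sketch of getParentChain() and hasParent() over a plain code-to-parent table instead of Language objects. The
parent links are the ones given in the module introduction (ang-nor, then ang-ang, then ang); the helper names are
hypothetical.

```lua
local parents = {["ang-nor"] = "ang-ang", ["ang-ang"] = "ang"}

local function parent_chain(code)
	local chain, n = {}, 0
	local parent = parents[code]
	while parent do
		n = n + 1
		chain[n] = parent
		parent = parents[parent]
	end
	return chain
end

local function has_parent(code, candidate)
	for _, parent in ipairs(parent_chain(code)) do
		if parent == candidate then
			return true
		end
	end
	return false
end

print(table.concat(parent_chain("ang-nor"), ","))
```
]=]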
--[==[
If the language is etymology-only, this iterates through parents until a full language or family is found, and the
corresponding object is returned. If the language is a full language, then it simply returns itself.
]==]
function Language:getFull()
local full = self._fullObject
if full == nil then
full = self:getFullCode()
full = full == self._code and self or get_by_code(full)
self._fullObject = full
end
return full
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding code is returned. If the language is a full language, then it simply returns the
language code.
]==]
function Language:getFullCode()
return self._fullCode or self._code
end
--[==[
If the language is an etymology-only language, this iterates through parents until a full language or family is
found, and the corresponding canonical name is returned. If the language is a full language, then it simply returns
the canonical name of the language.
]==]
function Language:getFullName()
local full = self._fullName
if full == nil then
full = self:getFull():getCanonicalName()
self._fullName = full
end
return full
end
--[==[Returns a table of <code class="nf">Language</code> objects for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestors()
local ancestors = self._ancestorObjects
if ancestors == nil then
ancestors = {}
local ancestor_codes = self:getAncestorCodes()
if #ancestor_codes > 0 then
for _, ancestor in ipairs(ancestor_codes) do
insert(ancestors, get_by_code(ancestor, nil, true))
end
else
local fam = self:getFamily()
local protoLang = fam and fam:getProtoLanguage() or nil
-- For the cases where the current language is the proto-language
-- of its family, or an etymology-only language that is ancestral to that
-- proto-language, we need to step up a level higher right from the
-- start.
if protoLang and (
protoLang:getCode() == self._code or
(self:hasType("etymology-only") and protoLang:hasAncestor(self))
) then
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
while not protoLang and not (not fam or fam:getCode() == "qfa-not") do
fam = fam:getFamily()
protoLang = fam and fam:getProtoLanguage() or nil
end
insert(ancestors, protoLang)
end
self._ancestorObjects = ancestors
end
return ancestors
end
do
-- Avoid a language being its own ancestor via class inheritance. We only need to check for this if the language has inherited an ancestor table from its parent, because we never want to drop ancestors that have been explicitly set in the data.
-- Recursively iterate over ancestors until we either find self or run out. If self is found, return true.
local function check_ancestor(self, lang)
local codes = lang:getAncestorCodes()
if not codes then
return nil
end
for i = 1, #codes do
local code = codes[i]
if code == self._code then
return true
end
local anc = get_by_code(code, nil, true)
if check_ancestor(self, anc) then
return true
end
end
end
--[==[Returns a table of <code class="nf">Language</code> codes for all languages that this language is directly descended from. Generally this is only a single language, but creoles, pidgins and mixed languages can have multiple ancestors.]==]
function Language:getAncestorCodes()
if self._ancestorCodes then
return self._ancestorCodes
end
local data = self._data
local codes = data.ancestors
if codes == nil then
codes = {}
self._ancestorCodes = codes
return codes
end
codes = split(codes, ",", true, true)
self._ancestorCodes = codes
-- If there are no codes or the ancestors weren't inherited data, there's nothing left to check.
if #codes == 0 or self:getData(false, "raw").ancestors ~= nil then
return codes
end
local i, code = 1
while i <= #codes do
code = codes[i]
if check_ancestor(self, self) then
remove(codes, i)
else
i = i + 1
end
end
return codes
end
end
--[==[Given a list of language objects or codes, returns true if at least one of them is an ancestor. This includes any etymology-only children of that ancestor. If the language's ancestor(s) are etymology-only languages, it will also return true for those languages' parent(s) (e.g. if Vulgar Latin is the ancestor, it will also return true for its parent, Latin). However, a parent is excluded from this if the ancestor is also ancestral to that parent (e.g. if Classical Persian is the ancestor, Persian would return false, because Classical Persian is also ancestral to Persian).]==]
function Language:hasAncestor(...)
local function iterateOverAncestorTree(node, func, parent_check)
local ancestors = node:getAncestors()
local ancestorsParents = {}
for _, ancestor in ipairs(ancestors) do
-- When checking the parents of the other language, and the ancestor is also a parent, skip to the next ancestor, so that we exclude any etymology-only children of that parent that are not directly related (see below).
local ret = (parent_check or not node:hasParent(ancestor)) and
func(ancestor) or iterateOverAncestorTree(ancestor, func, parent_check)
if ret then
return ret
end
end
-- Check the parents of any ancestors. We don't do this if checking the parents of the other language, so that we exclude any etymology-only children of those parents that are not directly related (e.g. if the ancestor is Vulgar Latin and we are checking New Latin, we want it to return false because they are on different ancestral branches. As such, if we're already checking the parent of New Latin (Latin) we don't want to compare it to the parent of the ancestor (Latin), as this would be a false positive; it should be one or the other).
if not parent_check then
return nil
end
for _, ancestor in ipairs(ancestors) do
local ancestorParents = ancestor:getParentChain()
for _, ancestorParent in ipairs(ancestorParents) do
if ancestorParent:getCode() == self._code or ancestorParent:hasAncestor(ancestor) then
break
else
insert(ancestorsParents, ancestorParent)
end
end
end
for _, ancestorParent in ipairs(ancestorsParents) do
local ret = func(ancestorParent)
if ret then
return ret
end
end
end
local function do_iteration(otherlang, parent_check)
-- otherlang can't be self
if (type(otherlang) == "string" and otherlang or otherlang:getCode()) == self._code then
return false
end
repeat
if iterateOverAncestorTree(
self,
function(ancestor)
return ancestor:getCode() == (type(otherlang) == "string" and otherlang or otherlang:getCode())
end,
parent_check
) then
return true
elseif type(otherlang) == "string" then
otherlang = get_by_code(otherlang, nil, true)
end
otherlang = otherlang:getParent()
parent_check = false
until not otherlang
end
local parent_check = true
for _, otherlang in ipairs{...} do
local ret = do_iteration(otherlang, parent_check)
if ret then
return true
end
end
return false
end
do
local function construct_node(lang, memo)
local branch, ancestors = {lang = lang:getCode()}
memo[lang:getCode()] = branch
for _, ancestor in ipairs(lang:getAncestors()) do
if ancestors == nil then
ancestors = {}
end
insert(ancestors, memo[ancestor:getCode()] or construct_node(ancestor, memo))
end
branch.ancestors = ancestors
return branch
end
function Language:getAncestorChain()
local chain = self._ancestorChain
if chain == nil then
chain = construct_node(self, {})
self._ancestorChain = chain
end
return chain
end
end
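--[=[
A sketch of the memoized tree construction behind getAncestorChain(), using a plain table of ancestor codes. All
codes below are hypothetical; the point is that a shared ancestor is built once and reused by reference.

```lua
local ancestors_of = { -- hypothetical codes
	creole = {"lexifier", "substrate"},
	lexifier = {"proto"},
	substrate = {"proto"},
	proto = {},
}

local function construct_node(code, memo)
	local branch, ancestors = {lang = code}
	-- Register the node before recursing so it can be reused.
	memo[code] = branch
	for _, anc in ipairs(ancestors_of[code]) do
		ancestors = ancestors or {}
		ancestors[#ancestors + 1] = memo[anc] or construct_node(anc, memo)
	end
	branch.ancestors = ancestors -- stays nil for a node without ancestors
	return branch
end

local tree = construct_node("creole", {})
print(tree.lang, #tree.ancestors)
```
]=]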
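-- Older, flat variant of getAncestorChain(): follows single-ancestor links upwards and stops at the first language
-- with zero or several ancestors. Cached separately from getAncestorChain(), which stores a nested table.
function Language:getAncestorChainOld()
local chain = self._ancestorChainOld
if chain == nil then
chain = {}
local step = self
while true do
local ancestors = step:getAncestors()
step = #ancestors == 1 and ancestors[1] or nil
if not step then
break
end
insert(chain, step)
end
self._ancestorChainOld = chain
end
return chain
end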
local function fetch_descendants(self, fmt)
local descendants, family = {}, self:getFamily()
-- Iterate over all three datasets.
for _, data in ipairs{
require("Module:languages/code to canonical name"),
require("Module:etymology languages/code to canonical name"),
require("Module:families/code to canonical name"),
} do
for code in pairs(data) do
local lang = get_by_code(code, nil, true, true)
-- Test for a descendant. Earlier tests weed out most candidates, while the more intensive tests are only used sparingly.
if (
code ~= self._code and -- Not self.
lang:inFamily(family) and -- In the same family.
(
family:getProtoLanguageCode() == self._code or -- Self is the protolanguage.
self:hasDescendant(lang) or -- Full hasDescendant check.
(lang:getFullCode() == self._code and not self:hasAncestor(lang)) -- Etymology-only child which isn't an ancestor.
)
) then
if fmt == "object" then
insert(descendants, lang)
elseif fmt == "code" then
insert(descendants, code)
elseif fmt == "name" then
insert(descendants, lang:getCanonicalName())
end
end
end
end
return descendants
end
function Language:getDescendants()
local descendants = self._descendantObjects
if descendants == nil then
descendants = fetch_descendants(self, "object")
self._descendantObjects = descendants
end
return descendants
end
function Language:getDescendantCodes()
local descendants = self._descendantCodes
if descendants == nil then
descendants = fetch_descendants(self, "code")
self._descendantCodes = descendants
end
return descendants
end
function Language:getDescendantNames()
local descendants = self._descendantNames
if descendants == nil then
descendants = fetch_descendants(self, "name")
self._descendantNames = descendants
end
return descendants
end
do
local function check_lang(self, lang)
if type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasAncestor(self) then
return true
end
end
function Language:hasDescendant(...)
return check_inputs(self, check_lang, false, ...)
end
end
local function fetch_children(self, fmt)
local m_etym_data = require(etymology_languages_data_module)
local self_code, children = self._code, {}
for code, lang in pairs(m_etym_data) do
local _lang = lang
repeat
local parent = _lang.parent
if parent == self_code then
if fmt == "object" then
insert(children, get_by_code(code, nil, true))
elseif fmt == "code" then
insert(children, code)
elseif fmt == "name" then
insert(children, lang[1])
end
break
end
_lang = m_etym_data[parent]
until not _lang
end
return children
end
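--[=[
A sketch of the search in fetch_children(), run over a small hand-written table. The table is a hypothetical subset
shaped like the etymology-language data (name at index 1, `parent` field); the names and parents are the ones given
in the module introduction. Results are sorted here only to make the output deterministic.

```lua
local etym_data = {
	["ang-ang"] = {"Anglian Old English", parent = "ang"},
	["ang-nor"] = {"Northumbrian Old English", parent = "ang-ang"},
	["fr-CA"] = {"Canadian French", parent = "fr"},
}

local function child_codes(self_code)
	local children = {}
	for code, lang in pairs(etym_data) do
		local node = lang
		repeat
			local parent = node.parent
			if parent == self_code then
				children[#children + 1] = code
				break
			end
			-- Keep climbing through etymology-only parents.
			node = etym_data[parent]
		until not node
	end
	table.sort(children)
	return children
end

print(table.concat(child_codes("ang"), ","))
```

Note that the climb makes indirect children count as well: "ang-nor" is reported for "ang" even though its
immediate parent is "ang-ang".
]=]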
function Language:getChildren()
local children = self._childObjects
if children == nil then
children = fetch_children(self, "object")
self._childObjects = children
end
return children
end
function Language:getChildrenCodes()
local children = self._childCodes
if children == nil then
children = fetch_children(self, "code")
self._childCodes = children
end
return children
end
function Language:getChildrenNames()
local children = self._childNames
if children == nil then
children = fetch_children(self, "name")
self._childNames = children
end
return children
end
function Language:hasChild(...)
local lang = ...
if not lang then
return false
elseif type(lang) == "string" then
lang = get_by_code(lang, nil, true)
end
if lang:hasParent(self) then
return true
end
return self:hasChild(select(2, ...))
end
--[==[Returns the name of the main category of that language. Example: {{code|lua|"French language"}} for French, whose category is at [[:Category:French language]]. Unless optional argument <code>nocap</code> is given, the language name at the beginning of the returned value will be capitalized. This capitalization is correct for category names, but not if the language name is lowercase and the returned value of this function is used in the middle of a sentence.]==]
function Language:getCategoryName(nocap)
local name = self._categoryName
if name == nil then
name = self:getCanonicalName()
-- For a full language, return the canonical name as-is: no " language" suffix is appended here, and the early
-- return skips both caching and capitalization.
if self:hasType("full") then
return name
end
self._categoryName = name
end
if nocap then
return name
end
return mw.getContentLanguage():ucfirst(name)
end
--[==[Creates a link to the category; the link text is the display form (see {getDisplayForm()}).]==]
function Language:makeCategoryLink()
return make_link(self, ":Category:" .. self:getCategoryName(), self:getDisplayForm())
end
function Language:getStandardCharacters(sc)
local standard_chars = self._data.standard_chars
if type(standard_chars) ~= "table" then
return standard_chars
elseif sc and type(sc) ~= "string" then
check_object("script", nil, sc)
sc = sc:getCode()
end
if (not sc) or sc == "None" then
local scripts = {}
for _, script in pairs(standard_chars) do
insert(scripts, script)
end
return concat(scripts)
end
if standard_chars[sc] then
return standard_chars[sc] .. (standard_chars[1] or "")
end
end
--[==[
Strip diacritics from display text `text` (in a language-specific fashion), which is in the script `sc`. If `sc` is
omitted or {nil}, the script is autodetected. This also strips certain punctuation characters from the end and (in the
case of Spanish upside-down question mark and exclamation points) from the beginning; strips any whitespace at the
end of the text or between the text and final stripped punctuation characters; and applies some language-specific
Unicode normalizations to replace discouraged characters with their prescribed alternatives. Return the stripped text.
]==]
function Language:stripDiacritics(text, sc)
if (not text) or text == "" then
return text
end
sc = checkScript(text, self, sc)
text = normalize(text, sc)
-- FIXME, rename makeEntryName to stripDiacritics and get rid of second and third return values
-- everywhere
text, _, _ = iterateSectionSubstitutions(self, text, sc, nil, nil,
self._data.strip_diacritics or self._data.entry_name, "strip_diacritics", "stripDiacritics")
text = umatch(text, "^[¿¡]?(.-[^%s%p].-)%s*[؟?!;՛՜ ՞ ՟?!︖︕।॥။၊་།]?$") or text
return text
end
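--[=[
Illustrative sketch, not part of the module: the final-punctuation step of `Language:stripDiacritics` can be
approximated with plain byte-based patterns. The helper name `strip_final_punctuation` is hypothetical, and the
punctuation set is reduced to ASCII `?`, `!` and `;`; the real function uses the Unicode-aware `umatch` with a much
larger set and also handles a leading inverted question or exclamation mark.

```lua
-- Hypothetical ASCII-only stand-in for the umatch() call in stripDiacritics.
local function strip_final_punctuation(text)
	-- The capture must contain at least one character that is neither
	-- whitespace nor punctuation; trailing whitespace and at most one final
	-- "?", "!" or ";" fall outside the capture. If nothing matches (e.g. the
	-- text is punctuation only), the text is returned unchanged.
	return text:match("^(.-[^%s%p].-)%s*[?!;]?$") or text
end

print(strip_final_punctuation("hello ?"))
print(strip_final_punctuation("what?!"))
```

Only one final mark is removed per call, so a doubled mark such as `?!` loses just its last character.
]=]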
--[==[
Convert a ''logical'' pagename (the pagename as it appears to the user, after diacritics and punctuation have been
stripped) to a ''physical'' pagename (the pagename as it appears in the MediaWiki database). Reasons for a difference
between the two are (a) unsupported titles such as `[ ]` (with square brackets in them), `#` (pound/hash sign) and
`¯\_(ツ)_/¯` (with underscores), as well as overly long titles of various sorts; (b) "mammoth" pages that are split into
parts (e.g. `a`, which is split into physical pagenames `a/languages A to L` and `a/languages M to Z`). For almost all
purposes, you should work with logical and not physical pagenames. But there are certain use cases that require physical
pagenames, such as checking the existence of a page or retrieving a page's contents.
`pagename` is the logical pagename to be converted. `is_reconstructed_or_appendix` indicates whether the page is in the
`Reconstruction` or `Appendix` namespaces. If it is omitted or has the value {nil}, the pagename is checked for an
initial asterisk, and if found, the page is assumed to be a `Reconstruction` page. Setting a value of `false` or `true`
to `is_reconstructed_or_appendix` disables this check and allows for mainspace pagenames that begin with an asterisk.
]==]
function Language:logicalToPhysical(pagename, is_reconstructed_or_appendix)
-- FIXME: This probably shouldn't happen but it happens when makeEntryName() receives nil.
if pagename == nil then
track("nil-passed-to-logicalToPhysical")
return nil
end
local initial_asterisk
if is_reconstructed_or_appendix == nil then
local pagename_minus_initial_asterisk
initial_asterisk, pagename_minus_initial_asterisk = pagename:match("^(%*)(.*)$")
if pagename_minus_initial_asterisk then
is_reconstructed_or_appendix = true
pagename = pagename_minus_initial_asterisk
elseif self:hasType("appendix-constructed") then
is_reconstructed_or_appendix = true
end
end
if not is_reconstructed_or_appendix then
-- Check if the pagename is a listed unsupported title.
local unsupportedTitles = load_data(links_data_module).unsupported_titles
if unsupportedTitles[pagename] then
return "Unsupported titles/" .. unsupportedTitles[pagename]
end
end
-- Set `unsupported` as true if certain conditions are met.
local unsupported
-- Check if there's an unsupported character. \239\191\189 is the replacement character U+FFFD, which can't be typed
-- directly here due to an abuse filter. Unix-style dot-slash notation is also unsupported, as it is used for
-- relative paths in links, as are 3 or more consecutive tildes. Note: match is faster with magic
-- characters/charsets; find is faster with plaintext.
if (
match(pagename, "[#<>%[%]_{|}]") or
find(pagename, "\239\191\189") or
match(pagename, "%f[^%z/]%.%.?%f[%z/]") or
find(pagename, "~~~")
) then
unsupported = true
-- If it looks like an interwiki link.
elseif find(pagename, ":") then
local prefix = gsub(pagename, "^:*(.-):.*", ulower)
if (
load_data("Module:data/namespaces")[prefix] or
load_data("Module:data/interwikis")[prefix]
) then
unsupported = true
end
end
-- Escape unsupported characters so they can be used in titles. ` is used as a delimiter for this, so a raw use of
-- it in an unsupported title is also escaped here to prevent interference; this is only done with unsupported
-- titles, though, so inclusion won't in itself mean a title is treated as unsupported (which is why it's excluded
-- from the earlier test).
if unsupported then
-- FIXME: This conversion needs to be different for reconstructed pages with unsupported characters. There
-- aren't any currently, but if there ever are, we need to fix this e.g. to put them in something like
-- Reconstruction:Proto-Indo-European/Unsupported titles/`lowbar``num`.
local unsupported_characters = load_data(links_data_module).unsupported_characters
pagename = pagename:gsub("[#<>%[%]_`{|}\239]\191?\189?", unsupported_characters)
:gsub("%f[^%z/]%.%.?%f[%z/]", function(m)
return (gsub(m, "%.", "`period`"))
end)
:gsub("~~~+", function(m)
return (gsub(m, "~", "`tilde`"))
end)
pagename = "Unsupported titles/" .. pagename
elseif not is_reconstructed_or_appendix then
-- Check if this is a mammoth page. If so, which subpage should we link to?
local m_links_data = load_data(links_data_module)
local mammoth_page_type = m_links_data.mammoth_pages[pagename]
if mammoth_page_type then
local canonical_name = self:getFullName()
if canonical_name ~= "မအရေဝ်ပံၚ်ကောံ" and canonical_name ~= "အၚ်္ဂလိက်" then
local this_subpage
local L2_sort_key = get_L2_sort_key(canonical_name)
for _, subpage_spec in ipairs(m_links_data.mammoth_page_subpage_types[mammoth_page_type]) do
-- unpack() fails utterly on data loaded using mw.loadData() even if offsets are given
local subpage, pattern = subpage_spec[1], subpage_spec[2]
if pattern == true or L2_sort_key:match(pattern) then
this_subpage = subpage
break
end
end
if not this_subpage then
error(("Internal error: Bad data in mammoth_page_subpage_types in [[Module:links/data]] for mammoth page %s, type %s; last entry didn't have 'true' in it"):format(
pagename, mammoth_page_type))
end
pagename = pagename .. "/" .. this_subpage
end
end
end
return (initial_asterisk or "") .. pagename
end
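--[=[
Illustrative sketch, not part of the module: how the escaping branch of `Language:logicalToPhysical` turns an
unsupported title into a physical pagename. The table `escapes` and the function `escape_unsupported` are
hypothetical; only the names `num`, `lowbar` and `tilde` are taken from the code and comments of this module, and the
real mapping is loaded from the links data module. The dot-slash handling is left out.

```lua
-- Hypothetical subset of the escape table used for unsupported characters.
local escapes = { ["#"] = "`num`", ["_"] = "`lowbar`" }

local function escape_unsupported(pagename)
	-- Replace each listed character with its backtick-delimited name.
	local escaped = pagename:gsub("[#_]", escapes)
	-- Runs of three or more tildes are escaped tilde by tilde.
	escaped = escaped:gsub("~~~+", function(run)
		return (run:gsub("~", "`tilde`"))
	end)
	return "Unsupported titles/" .. escaped
end

print(escape_unsupported("a_b#"))
```

Shorter tilde runs are left alone, which mirrors the `~~~+` pattern used above.
]=]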
--[==[
Strip the diacritics from a display pagename and convert the resulting logical pagename into a physical pagename.
This allows you, for example, to retrieve the contents of the page or check its existence. WARNING: This is deprecated
and will be going away. It is a simple composition of `self:stripDiacritics` and `self:logicalToPhysical`; most callers
only want the former, and if you need both, call them both yourself.
`text` and `sc` are as in `self:stripDiacritics`, and `is_reconstructed_or_appendix` is as in `self:logicalToPhysical`.
]==]
function Language:makeEntryName(text, sc, is_reconstructed_or_appendix)
return self:logicalToPhysical(self:stripDiacritics(text, sc), is_reconstructed_or_appendix)
end
--[==[Generates alternative forms using a specified method, and returns them as a table. If no method is specified, returns a table containing only the input term.]==]
function Language:generateForms(text, sc)
local generate_forms = self._data.generate_forms
if generate_forms == nil then
return {text}
end
sc = checkScript(text, self, sc)
return require("Module:" .. self._data.generate_forms).generateForms(text, self, sc)
end
--[==[Creates a sort key for the given stripped text, following the rules appropriate for the language. This removes
diacritical marks from the stripped text if they are not considered significant for sorting, and may perform some other
changes. Any initial hyphen or asterisk is also removed, and parenthesis characters are stripped as well.
The <code>sort_key</code> setting for each language in the data modules defines the replacements made by this function, or it gives the name of the module that takes the stripped text and returns a sortkey.]==]
function Language:makeSortKey(text, sc)
if (not text) or text == "" then
return text
end
if match(text, "<[^<>]+>") then
track("track HTML tag")
end
-- Remove directional characters, bold, italics, soft hyphens, strip markers and HTML tags.
-- FIXME: Partly duplicated with remove_formatting() in [[Module:links]].
text = ugsub(text, "[\194\173\226\128\170-\226\128\174\226\129\166-\226\129\169]", "")
text = text:gsub("('*)'''(.-'*)'''", "%1%2"):gsub("('*)''(.-'*)''", "%1%2")
text = gsub(unstrip(text), "<[^<>]+>", "")
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
-- Remove initial hyphens and * unless the term only consists of spacing + punctuation characters.
text = ugsub(text, "^([-]*)[-־ـ᠊*]+([-]*)(.*[^%s%p].*)", "%1%2%3")
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text = removeCarets(text, sc)
-- For languages with dotted dotless i, ensure that "İ" is sorted as "i", and "I" is sorted as "ı".
if self:hasDottedDotlessI() then
text = gsub(text, "I\204\135", "i") -- decomposed "İ"
:gsub("I", "ı")
text = sc:toFixedNFD(text)
end
-- Convert to lowercase, make the sortkey, then convert to uppercase. Where the language has dotted dotless i, it is
-- usually not necessary to convert "i" to "İ" and "ı" to "I" first, because "I" will always be interpreted as
-- conventional "I" (not dotless "İ") by any sorting algorithms, which will have been taken into account by the
-- sortkey substitutions themselves. However, if no sortkey substitutions have been specified, then conversion is
-- necessary so as to prevent "i" and "ı" both being sorted as "I".
--
-- An exception is made for scripts that (sometimes) sort by scraping page content, as that means they are sensitive
-- to changes in capitalization (as it changes the target page).
if not sc:sortByScraping() then
text = ulower(text)
end
local actual_substitution_data
-- Don't trim whitespace here because it's significant at the beginning of a sort key or sort base.
text, _, actual_substitution_data = iterateSectionSubstitutions(self, text, sc, nil, nil, self._data.sort_key,
"sort_key", "makeSortKey", "notrim")
if not sc:sortByScraping() then
if self:hasDottedDotlessI() and not actual_substitution_data then
text = text:gsub("ı", "I"):gsub("i", "İ")
text = sc:toFixedNFC(text)
end
text = uupper(text)
end
-- Remove parentheses, as long as they are either preceded or followed by something.
text = gsub(text, "(.)[()]+", "%1"):gsub("[()]+(.)", "%1")
text = escape_risky_characters(text)
return text
end
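--[=[
Illustrative sketch, not part of the module: two of the clean-up steps in `Language:makeSortKey`, reduced to ASCII.
The function name `simple_sort_key` is hypothetical. Language-specific substitutions, script detection and the
dotted/dotless i handling are omitted, and the leading-character set is limited to `-` and `*`.

```lua
-- Hypothetical, ASCII-only reduction of two steps in makeSortKey.
local function simple_sort_key(text)
	-- Drop leading hyphens and asterisks, but only if something other than
	-- whitespace and punctuation follows them.
	text = text:gsub("^[-*]+(.*[^%s%p].*)", "%1")
	-- Strip parenthesis characters that have a neighbour on either side.
	text = text:gsub("(.)[()]+", "%1"):gsub("[()]+(.)", "%1")
	-- Sort keys are returned in uppercase.
	return text:upper()
end

print(simple_sort_key("-ness"))
print(simple_sort_key("colo(u)r"))
```

A term consisting only of punctuation, such as `---`, is left untouched by the first step, as in the original pattern.
]=]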
--[==[Create the form used as a basis for display text and transliteration. FIXME: Rename to correctInputText().]==]
local function processDisplayText(text, self, sc, keepCarets, keepPrefixes)
local subbedChars = {}
text, subbedChars = doTempSubstitutions(text, subbedChars, keepCarets)
text = decode_uri(text, "PATH")
text = checkNoEntities(self, text)
sc = checkScript(text, self, sc)
text = normalize(text, sc)
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, keepCarets, self._data.display_text,
"display_text", "makeDisplayText")
text = removeCarets(text, sc)
-- Remove any interwiki link prefixes (unless they have been escaped or this has been disabled).
if find(text, ":") and not keepPrefixes then
local rep
repeat
text, rep = gsub(text, "\\\\(\\*:)", "\3%1")
until rep == 0
text = gsub(text, "\\:", "\4")
while true do
local prefix = gsub(text, "^(.-):.+", function(m1)
return (gsub(m1, "\244[\128-\191]*", ""))
end)
-- Check if the prefix is an interwiki, though ignore capitalised Wiktionary:, which is a namespace.
if not prefix or prefix == text or prefix == "ဝိက်ရှေန်နရဳ"
or not (load_data("Module:data/interwikis")[ulower(prefix)] or prefix == "") then
break
end
text = gsub(text, "^(.-):(.*)", function(m1, m2)
local ret = {}
for subbedChar in gmatch(m1, "\244[\128-\191]*") do
insert(ret, subbedChar)
end
return concat(ret) .. m2
end)
end
text = gsub(text, "\3", "\\"):gsub("\4", ":")
end
return text, subbedChars
end
--[==[Make the display text (i.e. what is displayed on the page).]==]
function Language:makeDisplayText(text, sc, keepPrefixes)
if not text or text == "" then
return text
end
local subbedChars
text, subbedChars = processDisplayText(text, self, sc, nil, keepPrefixes)
text = escape_risky_characters(text)
return undoTempSubstitutions(text, subbedChars)
end
--[==[Transliterates the text from the given script into the Latin script (see
[[Wiktionary:Transliteration and romanization]]). The language must have the <code>translit</code> property for this to
work; if it is not present, {{code|lua|nil}} is returned.
The <code>sc</code> parameter is handled by the transliteration module, and how it is handled is specific to that
module. Some transliteration modules may tolerate {{code|lua|nil}} as the script, others require it to be one of the
possible scripts that the module can transliterate, and will throw an error if it's not one of them. For this reason,
the <code>sc</code> parameter should always be provided when writing non-language-specific code.
The <code>module_override</code> parameter is used to override the default module that is used to provide the
transliteration. This is useful in cases where you need to demonstrate a particular module in use, but there is no
default module yet, or you want to demonstrate an alternative version of a transliteration module before making it
official. It should not be used in real modules or templates, only for testing. All uses of this parameter are tracked
by [[Wiktionary:Tracking/languages/module_override]].
'''Known bugs''':
* This function assumes {tr(s1) .. tr(s2) == tr(s1 .. s2)}. When this assertion fails, wikitext markups like <nowiki>'''</nowiki> can cause wrong transliterations.
* HTML entities like <code>&apos;</code>, often used to escape wikitext markups, do not work.
]==]
function Language:transliterate(text, sc, module_override)
-- If there is no text, or the text is just "-", return it unchanged.
if not text or text == "" or text == "-" then
return text
end
-- If the script is not transliteratable (and no override is given), return nil.
sc = checkScript(text, self, sc)
if not (sc:isTransliterated() or module_override) then
-- temporary tracking to see if/when this gets triggered
track("non-transliterable")
track("non-transliterable/" .. self._code)
track("non-transliterable/" .. sc:getCode())
track("non-transliterable/" .. sc:getCode() .. "/" .. self._code)
return nil
end
-- Remove any strip markers.
text = unstrip(text)
-- Do not process the formatting into PUA characters for certain languages.
local processed = load_data(languages_data_module).substitution[self._code] ~= "none"
-- Get the display text with the keepCarets flag set.
local subbedChars
if processed then
text, subbedChars = processDisplayText(text, self, sc, true)
end
-- Transliterate (using the module override if applicable).
text, subbedChars = iterateSectionSubstitutions(self, text, sc, subbedChars, true, module_override or
self._data.translit, "translit", "tr")
if not text then
return nil
end
-- Incomplete transliterations return nil.
local charset = sc.characters
if charset and umatch(text, "[" .. charset .. "]") then
-- Remove any characters in Latin, which includes Latin characters also included in other scripts (as these are
-- false positives), as well as any PUA substitutions. Anything remaining should only be script code "None"
-- (e.g. numerals).
local check_text = ugsub(text, "[" .. get_script("Latn").characters .. "-]+", "")
-- Set none_is_last_resort_only flag, so that any non-None chars will cause a script other than "None" to be
-- returned.
if find_best_script_without_lang(check_text, true):getCode() ~= "None" then
return nil
end
end
if processed then
text = escape_risky_characters(text)
text = undoTempSubstitutions(text, subbedChars)
end
-- If the script does not use capitalization, then capitalize any letters of the transliteration which are
-- immediately preceded by a caret (and remove the caret).
if text and not sc:hasCapitalization() and text:find("^", 1, true) then
text = processCarets(text, "%^([\128-\191\244]*%*?)([^\128-\191\244][\128-\191]*)", function(m1, m2)
return m1 .. uupper(m2)
end)
end
-- Track module overrides.
if module_override ~= nil then
track("module_override")
end
return text
end
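--[=[
Illustrative sketch, not part of the module: the caret step at the end of `Language:transliterate`, reduced to ASCII
letters. The function name `capitalize_carets` is hypothetical; the real code goes through `processCarets` and also
skips over substituted placeholder characters.

```lua
-- Hypothetical ASCII-only version of the caret capitalization step.
local function capitalize_carets(translit)
	-- A caret followed by a letter becomes the uppercase letter; the caret
	-- itself is removed. Parentheses drop gsub's second return value.
	return (translit:gsub("%^(%a)", function(letter)
		return letter:upper()
	end))
end

print(capitalize_carets("^new ^york"))
```

This is only relevant for scripts without capitalization, where the caret is the only way to request a capital letter
in the transliteration.
]=]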
do
local function handle_language_spec(self, spec, sc)
local ret = self["_" .. spec]
if ret == nil then
ret = self._data[spec]
if type(ret) == "string" then
ret = list_to_set(split(ret, ",", true, true))
end
self["_" .. spec] = ret
end
if type(ret) == "table" then
ret = ret[sc:getCode()]
end
return not not ret
end
function Language:overrideManualTranslit(sc)
return handle_language_spec(self, "override_translit", sc)
end
function Language:link_tr(sc)
return handle_language_spec(self, "link_tr", sc)
end
end
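--[=[
Illustrative sketch, not part of the module: how a per-script setting such as `link_tr` or `override_translit` is
resolved. `spec_to_set` and `applies_to_script` are hypothetical stand-ins for `split`, `list_to_set` and
`handle_language_spec`; unlike the real code, the result is not cached and the script is passed as a plain code
string rather than a script object.

```lua
-- Hypothetical stand-in for split() followed by list_to_set().
local function spec_to_set(spec)
	local set = {}
	for code in spec:gmatch("[^,]+") do
		set[code] = true
	end
	return set
end

local function applies_to_script(spec, script_code)
	-- A string lists the script codes the setting applies to.
	if type(spec) == "string" then
		spec = spec_to_set(spec)
	end
	-- A table is looked up by script code; any other value applies to all scripts.
	if type(spec) == "table" then
		spec = spec[script_code]
	end
	return not not spec
end

print(applies_to_script("Cyrl,Arab", "Cyrl"))
```
]=]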
--[==[Returns {{code|lua|true}} if the language has a transliteration module, or {{code|lua|false}} if it doesn't.]==]
function Language:hasTranslit()
return not not self._data.translit
end
--[==[Returns {{code|lua|true}} if the language uses the letters I/ı and İ/i, or {{code|lua|false}} if it doesn't.]==]
function Language:hasDottedDotlessI()
return not not self._data.dotted_dotless_i
end
function Language:toJSON(opts)
local strip_diacritics, strip_diacritics_patterns, strip_diacritics_remove_diacritics = self._data.strip_diacritics
if strip_diacritics then
if strip_diacritics.from then
strip_diacritics_patterns = {}
for i, from in ipairs(strip_diacritics.from) do
insert(strip_diacritics_patterns, {from = from, to = strip_diacritics.to[i] or ""})
end
end
strip_diacritics_remove_diacritics = strip_diacritics.remove_diacritics
end
-- mainCode should only end up non-nil if dontCanonicalizeAliases is passed to make_object().
-- props should either contain zero-argument functions to compute the value, or the value itself.
local props = {
ancestors = function() return self:getAncestorCodes() end,
canonicalName = function() return self:getCanonicalName() end,
categoryName = function() return self:getCategoryName("nocap") end,
code = self._code,
mainCode = self._mainCode,
parent = function() return self:getParentCode() end,
full = function() return self:getFullCode() end,
stripDiacriticsPatterns = strip_diacritics_patterns,
stripDiacriticsRemoveDiacritics = strip_diacritics_remove_diacritics,
family = function() return self:getFamilyCode() end,
aliases = function() return self:getAliases() end,
varieties = function() return self:getVarieties() end,
otherNames = function() return self:getOtherNames() end,
scripts = function() return self:getScriptCodes() end,
type = function() return keys_to_list(self:getTypes()) end,
wikimediaLanguages = function() return self:getWikimediaLanguageCodes() end,
wikidataItem = function() return self:getWikidataItem() end,
wikipediaArticle = function() return self:getWikipediaArticle(true) end,
}
local ret = {}
for prop, val in pairs(props) do
if not (opts and opts.skip_fields and opts.skip_fields[prop]) then
if type(val) == "function" then
ret[prop] = val()
else
ret[prop] = val
end
end
end
-- Use `deep_copy` when returning a table, so that there are no editing restrictions imposed by `mw.loadData`.
return opts and opts.lua_table and deep_copy(ret) or to_json(ret, opts)
end
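--[=[
Illustrative sketch, not part of the module: the lazy property table used by `Language:toJSON`. Values are stored
either directly or as zero-argument functions, so skipped fields are never computed. The helper name `collect` and
the sample field names are hypothetical.

```lua
-- Hypothetical helper mirroring the property loop in toJSON.
local function collect(props, skip_fields)
	local ret = {}
	for prop, val in pairs(props) do
		if not (skip_fields and skip_fields[prop]) then
			-- Functions are called only for fields that are actually kept.
			if type(val) == "function" then
				ret[prop] = val()
			else
				ret[prop] = val
			end
		end
	end
	return ret
end

local calls = 0
local out = collect({
	code = "x",
	name = function() calls = calls + 1 return "X" end,
	family = function() calls = calls + 1 return "fam" end,
}, { family = true })
print(out.code, out.name, out.family, calls)
```
]=]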
function export.getDataModuleName(code)
local letter = match(code, "^(%l)%l%l?$")
return "Module:" .. (
letter == nil and "languages/data/exceptional" or
#code == 2 and "languages/data/2" or
"languages/data/3/" .. letter
)
end
get_data_module_name = export.getDataModuleName
function export.getExtraDataModuleName(code)
return get_data_module_name(code) .. "/extra"
end
get_extra_data_module_name = export.getExtraDataModuleName
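--[=[
Illustrative sketch, not part of the module: which data module a code resolves to. The function below is a
standalone copy of the selection logic in `export.getDataModuleName`, renamed `data_module_name` (a hypothetical
name) and using `string.match` directly.

```lua
-- Standalone copy of the selection logic, for experimentation outside the wiki.
local function data_module_name(code)
	-- Two- or three-letter lowercase codes are filed by length and first letter;
	-- everything else goes to the "exceptional" module.
	local letter = code:match("^(%l)%l%l?$")
	return "Module:" .. (
		letter == nil and "languages/data/exceptional" or
		#code == 2 and "languages/data/2" or
		"languages/data/3/" .. letter
	)
end

print(data_module_name("fr"))
print(data_module_name("ang"))
print(data_module_name("fr-CA"))
```

The extra-data module name is the same string with `/extra` appended.
]=]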
do
local function make_stack(data)
local key_types = {
[2] = "unique",
aliases = "unique",
otherNames = "unique",
type = "append",
varieties = "unique",
wikipedia_article = "unique",
wikimedia_codes = "unique"
}
local function __index(self, k)
local stack, key_type = getmetatable(self), key_types[k]
-- Data that isn't inherited from the parent.
if key_type == "unique" then
local v = stack[stack[make_stack]][k]
if v == nil then
local layer = stack[0]
if layer then -- Could be false if there's no extra data.
v = layer[k]
end
end
return v
-- Data that is appended by each generation.
elseif key_type == "append" then
local parts, offset, n = {}, 0, stack[make_stack]
for i = 1, n do
local part = stack[i][k]
if part == nil then
offset = offset + 1
else
parts[i - offset] = part
end
end
return offset ~= n and concat(parts, ",") or nil
end
local n = stack[make_stack]
while true do
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
local v = layer[k]
if v ~= nil then
return v
end
n = n - 1
end
end
local function __newindex()
error("table is read-only")
end
local function __pairs(self)
-- Iterate down the stack, caching keys to avoid duplicate returns.
local stack, seen = getmetatable(self), {}
local n = stack[make_stack]
local iter, state, k, v = pairs(stack[n])
return function()
repeat
repeat
k = iter(state, k)
if k == nil then
n = n - 1
local layer = stack[n]
if not layer then -- Could be false if there's no extra data.
return nil
end
iter, state, k = pairs(layer)
end
until not (k == nil or seen[k])
-- Get the value via a lookup, as the one returned by the
-- iterator will be the raw value from the current layer,
-- which may not be the one __index will return for that
-- key. Also memoize the key in `seen` (even if the lookup
-- returns nil) so that it doesn't get looked up again.
-- TODO: store values in `self`, avoiding the need to create
-- the `seen` table. The iterator will need to iterate over
-- `self` with `next` first to find these on future loops.
v, seen[k] = self[k], true
until v ~= nil
return k, v
end
end
local __ipairs = require(table_module).indexIpairs
function make_stack(data)
local stack = {
data,
[make_stack] = 1, -- stores the length and acts as a sentinel to confirm a given metatable is a stack.
__index = __index,
__newindex = __newindex,
__pairs = __pairs,
__ipairs = __ipairs,
}
stack.__metatable = stack
return setmetatable({}, stack), stack
end
return make_stack(data)
end
local function get_stack(data)
local stack = getmetatable(data)
return stack and type(stack) == "table" and stack[make_stack] and stack or nil
end
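--[=[
Illustrative sketch, not part of the module: the default (inherited) lookup performed by the stack's `__index`
metamethod. The function name `layered` is hypothetical. Layer 1 is the oldest ancestor and the last layer is the
language itself; the special handling of "unique" and "append" keys and of extra data at key 0 is omitted.

```lua
-- Hypothetical reduction of make_stack() to its inherited-key behaviour.
local function layered(layers)
	return setmetatable({}, {
		__index = function(_, key)
			-- Walk from the top layer down and return the first non-nil value.
			for i = #layers, 1, -1 do
				local value = layers[i][key]
				if value ~= nil then
					return value
				end
			end
			return nil
		end,
		__newindex = function()
			error("table is read-only")
		end,
	})
end

local data = layered({ { a = 1, b = 2 }, { b = 3 } })
print(data.a, data.b, data.c)
```

Because the proxy table itself stays empty, every read goes through `__index` and every write through `__newindex`.
]=]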
--[==[
<span style="color: var(--wikt-palette-red,#BA0000)">This function is not for use in entries or other content pages.</span>
Returns a blob of data about the language. The format of this blob is undocumented, and perhaps unstable; it's intended for things like the module's own unit-tests, which are "close friends" with the module and will be kept up-to-date as the format changes. If `extra` is set, any extra data in the relevant `/extra` module will be included. (Note that it will be included anyway if it has already been loaded into the language object.) If `raw` is set, then the returned data will not contain any data inherited from parent objects.
-- Do NOT use these methods!
-- All uses should be pre-approved on the talk page!
]==]
function Language:getData(extra, raw)
if extra then
self:loadInExtraData()
end
local data = self._data
-- If raw is not set, just return the data.
if not raw then
return data
end
local stack = get_stack(data)
-- If there isn't a stack or its length is 1, return the data. Extra data (if any) will be included, as it's stored at key 0 and doesn't affect the reported length.
if stack == nil then
return data
end
local n = stack[make_stack]
if n == 1 then
return data
end
local extra = stack[0]
-- If there isn't any extra data, return the top layer of the stack.
if extra == nil then
return stack[n]
end
-- If there is, return a new stack which has the top layer at key 1 and the extra data at key 0.
data, stack = make_stack(stack[n])
stack[0] = extra
return data
end
function Language:loadInExtraData()
-- Only full languages have extra data.
if not self:hasType("language", "full") then
return
end
local data = self._data
-- If there's no stack, create one.
local stack = get_stack(self._data)
if stack == nil then
data, stack = make_stack(data)
-- If already loaded, return.
elseif stack[0] ~= nil then
return
end
self._data = data
-- Load extra data from the relevant module and add it to the stack at key 0, so that the __index and __pairs metamethods will pick it up, since they iterate down the stack until they run out of layers.
local code = self._code
local modulename = get_extra_data_module_name(code)
-- No data cached as false.
stack[0] = modulename and load_data(modulename)[code] or false
end
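--[==[Returns the name of the module containing the language's data. For etymology-only languages this is the etymology languages data module; otherwise it is the module returned by `getDataModuleName` for the language's code.]==]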
function Language:getDataModuleName()
local name = self._dataModuleName
if name == nil then
name = self:hasType("etymology-only") and etymology_languages_data_module or
get_data_module_name(self._mainCode or self._code)
self._dataModuleName = name
end
return name
end
--[==[Returns the name of the module containing the language's extra data (the `/extra` subpage of its data module), or {{code|lua|nil}} for etymology-only languages, which have no extra data module.]==]
function Language:getExtraDataModuleName()
local name = self._extraDataModuleName
if name == nil then
name = not self:hasType("etymology-only") and get_extra_data_module_name(self._mainCode or self._code) or false
self._extraDataModuleName = name
end
return name or nil
end
function export.makeObject(code, data, dontCanonicalizeAliases)
local data_type = type(data)
if data_type ~= "table" then
error(("bad argument #2 to 'makeObject' (table expected, got %s)"):format(data_type))
end
-- Convert any aliases.
local input_code = code
code = normalize_code(code)
input_code = dontCanonicalizeAliases and input_code or code
local parent
if data.parent then
parent = get_by_code(data.parent, nil, true, true)
else
parent = Language
end
parent.__index = parent
local lang = {_code = input_code}
-- This can only happen if dontCanonicalizeAliases is passed to make_object().
if code ~= input_code then
lang._mainCode = code
end
local parent_data = parent._data
if parent_data == nil then
-- Full code is the same as the code.
lang._fullCode = parent._code or code
else
-- Copy full code.
lang._fullCode = parent._fullCode
local stack = get_stack(parent_data)
if stack == nil then
parent_data, stack = make_stack(parent_data)
end
-- Insert the input data as the new top layer of the stack.
local n = stack[make_stack] + 1
data, stack[n], stack[make_stack] = parent_data, data, n
end
lang._data = data
return setmetatable(lang, parent)
end
make_object = export.makeObject
end
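--[=[
Illustrative sketch, not part of the module: the object inheritance set up by `export.makeObject`, where each parent
object doubles as the metatable of its children. `Base` and `make_child` are hypothetical names; the data stack and
the `_fullCode` bookkeeping are left out. The codes are the ones used as examples in the module introduction.

```lua
-- Hypothetical miniature of the parent/child object chain.
local Base = {}
Base.__index = Base

function Base:getCode()
	return self._code
end

local function make_child(code, parent)
	parent = parent or Base
	-- Method lookups on the child fall through to the parent, and from there
	-- further up the chain.
	parent.__index = parent
	return setmetatable({ _code = code }, parent)
end

local ang = make_child("ang")
local anglian = make_child("ang-ang", ang)
local northumbrian = make_child("ang-nor", anglian)
print(northumbrian:getCode())
```

`getCode` is defined only on `Base`, yet it is reachable from every object in the chain.
]=]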
--[==[Finds the language whose code matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">paramForError</code> is {{code|lua|true}}, a generic error message mentioning the bad code is generated; otherwise <code class="n">paramForError</code> should be a string or number specifying the parameter that the code came from, and this parameter will be mentioned in the error message along with the bad code. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.]==]
function export.getByCode(code, paramForError, allowEtymLang, allowFamily)
-- Track uses of paramForError, ultimately so it can be removed, as error-handling should be done by [[Module:parameters]], not here.
if paramForError ~= nil then
track("paramForError")
end
if type(code) ~= "string" then
local typ
if not code then
typ = "nil"
elseif check_object("language", true, code) then
typ = "a language object"
elseif check_object("family", true, code) then
typ = "a family object"
else
typ = "a " .. type(code)
end
error("The function getByCode expects a string as its first argument, but received " .. typ .. ".")
end
local m_data = load_data(languages_data_module)
if m_data.aliases[code] or m_data.track[code] then
track(code)
end
local norm_code = normalize_code(code)
-- Get the data, checking for etymology-only languages if allowEtymLang is set.
local data = load_data(get_data_module_name(norm_code))[norm_code] or
allowEtymLang and load_data(etymology_languages_data_module)[norm_code]
-- If no data was found and allowFamily is set, check the family data. If the main family data was found, make the object with [[Module:families]] instead, as family objects have different methods. However, if it's an etymology-only family, use make_object in this module (which handles object inheritance), and the family-specific methods will be inherited from the parent object.
if data == nil and allowFamily then
data = load_data("Module:families/data")[norm_code]
if data ~= nil then
if data.parent == nil then
return make_family_object(norm_code, data)
elseif not allowEtymLang then
data = nil
end
end
end
local retval = code and data and make_object(code, data)
if not retval and paramForError then
require("Module:languages/errorGetBy").code(code, paramForError, allowEtymLang, allowFamily)
end
return retval
end
get_by_code = export.getByCode
--[==[Finds the language whose canonical name (the name used to represent that language on Wiktionary) or other name matches the one provided. If it exists, it returns a <code class="nf">Language</code> object representing the language. Otherwise, it returns {{code|lua|nil}}, unless <code class="n">paramForError</code> is given, in which case an error is generated. If <code class="n">allowEtymLang</code> is specified, etymology-only language codes are allowed and looked up along with normal language codes. If <code class="n">allowFamily</code> is specified, language family codes are allowed and looked up along with normal language codes.
The canonical name of languages should always be unique (it is an error for two languages on Wiktionary to share the same canonical name), so this is guaranteed to give at most one result.
This function is powered by [[Module:languages/canonical names]], which contains a pre-generated mapping of full-language canonical names to codes. It is generated by going through the [[:Category:Language data modules]] for full languages. When <code class="n">allowEtymLang</code> is specified for the above function, [[Module:etymology languages/canonical names]] may also be used, and when <code class="n">allowFamily</code> is specified for the above function, [[Module:families/canonical names]] may also be used.]==]
function export.getByCanonicalName(name, errorIfInvalid, allowEtymLang, allowFamily)
local byName = load_data("Module:languages/canonical names")
local code = byName and byName[name]
if not code and allowEtymLang then
byName = load_data("Module:etymology languages/canonical names")
code = byName and byName[name] or
byName[gsub(name, " [Ss]ubstrate$", "")] or
byName[gsub(name, "^a ", "")] or
byName[gsub(name, "^a ", ""):gsub(" [Ss]ubstrate$", "")] or
-- For etymology families like "ira-pro".
-- FIXME: This is not ideal, as it allows " languages" to be appended to any etymology-only language, too.
byName[match(name, "^ဘာသာ(.*)$")]
end
if not code and allowFamily then
byName = load_data("Module:families/canonical names")
code = byName[name] or byName[match(name, "^ဘာသာ(.*)$")]
end
local retval = code and get_by_code(code, errorIfInvalid, allowEtymLang, allowFamily)
if not retval and errorIfInvalid then
require("Module:languages/errorGetBy").canonicalName(name, allowEtymLang, allowFamily)
end
return retval
end
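-- Usage sketch (illustrative only). It assumes this module is loaded as
-- "Module:languages" and that a language with the canonical name "French" exists in
-- the name data; on this wiki the stored canonical names may be localized, so the
-- literal name below is an assumption.
--   local m_languages = require("Module:languages")
--   -- Returns a Language object, or nil if the name is unknown:
--   local lang = m_languages.getByCanonicalName("French")
--   -- Also search etymology-only languages and families:
--   local any = m_languages.getByCanonicalName("French", nil, true, true)
--   -- Raise an error instead of returning nil when nothing matches:
--   local strict = m_languages.getByCanonicalName("French", true)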
--[==[Used by [[Module:languages/data/2]] (et al.) and [[Module:etymology languages/data]], [[Module:families/data]], [[Module:scripts/data]] and [[Module:writing systems/data]] to finalize the data into the format that is actually returned.]==]
function export.finalizeData(data, main_type, variety)
local fields = {"type"}
if main_type == "language" then
insert(fields, 4) -- script codes
insert(fields, "ancestors")
insert(fields, "link_tr")
insert(fields, "override_translit")
insert(fields, "wikimedia_codes")
elseif main_type == "script" then
insert(fields, 3) -- writing system codes
end -- Families and writing systems have no extra fields to process.
local fields_len = #fields
for _, entity in next, data do
if variety then
-- Move parent from 3 to "parent" and family from "family" to 3. These are different for the sake of convenience, since very few varieties have the family specified, whereas all of them have a parent.
entity.parent, entity[3], entity.family = entity[3], entity.family
-- Give the type "regular" iff not a variety and no other types are assigned.
elseif not (entity.type or entity.parent) then
entity.type = "regular"
end
for i = 1, fields_len do
local key = fields[i]
local field = entity[key]
if field and type(field) == "string" then
entity[key] = gsub(field, "%s*,%s*", ",")
end
end
end
return data
end
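-- Illustrative sketch of the normalization above, using made-up data and assuming the
-- local `gsub` behaves like string.gsub:
--   local s = ("Latn , Cyrl,  Grek"):gsub("%s*,%s*", ",")
--   -- s == "Latn,Cyrl,Grek"
-- For a variety, a hypothetical entry {"Example dialect", nil, "fr", family = "xyz"}
-- ends up with parent = "fr", [3] = "xyz" and family = nil, because the right-hand
-- side of the multiple assignment is evaluated before any field is overwritten.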
--[==[For backwards compatibility only; modules should require the error themselves.]==]
function export.err(lang_code, param, code_desc, template_tag, not_real_lang)
return require("Module:languages/error")(lang_code, param, code_desc, template_tag, not_real_lang)
end
return export
မဝ်ဂျူ:Jpan-headword
828
949
385652
385562
2026-04-02T20:14:40Z
咽頭べさ
33
385652
Scribunto
text/plain
local m_ja = require("Module:ja")
local m_ja_ruby = require("Module:ja-ruby")
local m_str_utils = require("Module:string utilities")
local byteoffset = mw.ustring.byteoffset
local concat = table.concat
local gsplit = m_str_utils.gsplit
local insert = table.insert
local kana_to_romaji = require("Module:Hrkt-translit").tr
local max_index = require("Module:table").maxIndex
local moraify = m_ja.moraify
local remove = table.remove
local ugmatch = mw.ustring.gmatch
local ugsub = m_str_utils.gsub
local ulen = m_str_utils.len
local ulower = m_str_utils.lower
local umatch = mw.ustring.match
local usub = m_str_utils.sub
local export = {}
local pos_functions = {}
local range = mw.loadData('Module:ja/data/range')
local Jpan = require("Module:scripts").getByCode("Jpan")
local function remove_links(text)
return (text:gsub("%[%[[^|%]]-|", "")
:gsub("%[%[", "")
:gsub("%]%]", ""))
end
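-- Example (a sketch with a made-up input): piped links keep their display text,
-- plain links keep their target.
--   remove_links("[[東京|とうきょう]]へ[[行く]]") --> "とうきょうへ行く"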
local function assign_kana_to_kanji(head, kana, pagename, template_name)
-- TODO: uses deprecated module
local m_tu = require'Module:template utilities'
local kanji_pos = {[0] = {nil, 0}}
local head_nolink = {}
local link_border = 0
local function insert_kanji_pos(substr)
insert(head_nolink, substr)
for p1, w1 in ugmatch(substr, '()([々' .. range.kanji .. '])') do
p1 = byteoffset(substr, p1) + link_border
insert(kanji_pos, {p1, p1 + w1:len() - 1})
end
end
for p1, p2, w1 in m_tu.gfind_bracket(head, {['%[%['] = ']]'}) do
insert_kanji_pos(head:sub(link_border + 1, p1 - 1))
local p_pipe = w1:find'|' or 2
link_border = p1 + p_pipe - 1
insert_kanji_pos(w1:sub(p_pipe + 1, -3))
link_border = p2
end
insert_kanji_pos(head:sub(link_border + 1))
head_nolink = concat(head_nolink)
local pagetext = mw.title.new(pagename):getContent()
if not pagetext then return head, kana end
local non_kanji = {}
local last_kanji = 1
for p1 in ugmatch(head_nolink, '[々' .. range.kanji .. ']()') do
insert(non_kanji, usub(head_nolink, last_kanji, p1 - 2))
last_kanji = p1
end
insert(non_kanji, usub(head_nolink, last_kanji))
for kanjitab in pagetext:gmatch('(){{%s*' .. template_name) do
kanjitab = select(3, m_tu.find_bracket(pagetext, m_tu.brackets_temp, kanjitab))
if not kanjitab then error('ill-formed [[t:' .. template_name:gsub('%%', '') .. ']] syntax') end
kanjitab = m_tu.parse_temp(kanjitab)
local readings = {}
local readings_len = {}
for i = 1, max_index(kanjitab.args) do
local r_i = kanjitab.args[i] or ''
local r_o = kanjitab.args['o' .. i] or ''
if kanjitab.args['k' .. i] then
readings[i] = kanjitab.args['k' .. i] .. r_o
readings_len[i] = tonumber(r_i:match'^%s*%D*(%d*)%s*$') or 1
else
local r_kana, r_len = r_i:match'^%s*(%D*)(%d*)%s*$'
readings[i] = r_kana .. r_o
readings_len[i] = tonumber(r_len) or 1
end
end
local kana_decom = {}
local reading_id = 1
local reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(kana_decom, non_kanji[i])
insert(kana_decom, readings[reading_id])
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(kana_decom, non_kanji[#non_kanji])
local function strip_nonkana(str, repl)
return ugsub(str, '[^' .. range.kana .. ']+', repl) or nil
end
local xeno_reading = {strip_nonkana(kana, ''):match('^' .. strip_nonkana(concat(kana_decom), '(.-)') .. '$')}
if #xeno_reading > 0 then
local head_decom = {}
reading_id = 1
reading_len = 1
for i = 1, #non_kanji - 1 do
if reading_len <= 1 then
reading_len = readings_len[reading_id] or 1
insert(head_decom, head:sub(kanji_pos[i - 1][2] + 1, kanji_pos[i][1] - 1))
insert(head_decom, head:sub(kanji_pos[i][1], kanji_pos[i + reading_len - 1][2]))
reading_id = reading_id + 1
else
reading_len = reading_len - 1
end
end
insert(head_decom, head:sub(kanji_pos[#non_kanji - 1][2] + 1))
if #head_decom ~= #kana_decom then error('number of parameters in [[t:' .. template_name:gsub('%%', '') .. ']] is incorrect') end
local n_xeno_reading = 0
for i = 1, #kana_decom, 2 do
kana_decom[i] = ugsub(kana_decom[i], '[^' .. range.kana .. ']+', function()
n_xeno_reading = n_xeno_reading + 1
if xeno_reading[n_xeno_reading] == '' then return nil
else return xeno_reading[n_xeno_reading] end
end)
end
return concat(head_decom, '%'), concat(kana_decom, '%')
end
end
return head, kana
end
local en_grades = {
"first grade", "second grade", "third grade",
"fourth grade", "fifth grade", "sixth grade",
"secondary school", "jinmeiyō", "hyōgai",
}
local aliases = {
['transitive']='tr', ['trans']='tr',
['intransitive']='in', ['intrans']='in', ['intr']='in',
['godan']='1', ['ichidan']='2', ['irregular']='irr'
}
local adverbs_optional_tag = 'optionally '
local adverbs_optional_aliases = {
['to']='と', ['と']='と', ['ト']='と',
['ni']='に', ['に']='に', ['ニ']='に',
}
local adverbs_optional_links = {
['と']='[[と#Japanese:_adverbs|と]]',
['に']='[[に]]',
}
local function formatting_adjustments(rom, kana, pos_category)
-- hyphens for prefixes, suffixes, and counters (classifiers)
if pos_category == "prefixes" then
rom = rom:gsub('%-?$', '-')
elseif pos_category == "suffixes" or pos_category == "suffix forms" or pos_category == "counters" or pos_category == "classifiers" then
rom = rom:gsub('^%-?', '-')
elseif pos_category == "proper nouns" and not kana:match'%^' then -- automatic caps for proper nouns, if not already specified
rom = ugsub(ugsub(rom, '%f[^%s%c%p]%l', mw.ustring.upper), "%w'%u", ulower) -- capitalize word-initial letters (string.uupper is not a standard function); no caps after medial apostrophes
end
return rom
end
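-- Examples (a sketch; the romaji and kana strings are made up for illustration):
--   formatting_adjustments("o", "お", "prefixes")     --> "o-"
--   formatting_adjustments("san", "さん", "suffixes") --> "-san"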
local function kana_to_romaji_with_pos_format(kana, data, args)
if data.headword.pos_category == "combining forms" or data.headword.pos_category == "punctuation marks" or data.headword.pos_category == "iteration marks" then
return "-"
end
local rom = remove_links(kana_to_romaji(kana, data.lang_code))
-- make adjustments for -u verbs and -i adjectives
if args['infl'] == '1' or args['infl'] == '1s' or args['infl'] == 'godan' then
rom = rom:gsub('ō$', 'ou'):gsub('ū$', 'uu')
elseif args['infl'] == 'i' or args['infl'] == 'is' or args['infl'] == 'い' then
rom = rom:gsub('ī$', 'ii')
end
return formatting_adjustments(rom, kana, data.headword.pos_category)
end
local function iterate_rare_chars(text)
local ch, i
return function()
repeat
ch, i = umatch(text, "([" .. range.kana .. range.kana_graph .. "!-/:-@%[\\-`×△○◎。-〠〶〷〻-〽・·゠=~][゙゚]*)()", i)
until not (ch and umatch(ch, "^[ぁ-ちっつて-ろんァ-チッツテ-ロンヲ-゚]$"))
return ch
end
end
local function historical_kana(data, hist_kana, modern_kana)
-- Disallow historical kana for kana and morae, as there's no one-to-one correspondence.
local pos = data.headword.pos_category
if pos == "syllables" or pos == "kana" or pos == "morae" then
error(("Cannot specify historical kana for %s."):format(pos))
end
local hist_kana_no_formatting = hist_kana:gsub("[%^%-%. %%]+", "")
local rare_chars, lang_name, hc = {}, data.lang_name, data.headword.categories
for ch in iterate_rare_chars(hist_kana_no_formatting) do
if not (modern_kana and modern_kana:find(ch)) then
rare_chars[ch] = true
end
end
for _, mora in ipairs(moraify((ugsub(hist_kana_no_formatting, "[^" .. range.kana .. "]+", " ")))) do
if not (mora:gsub(" +", ""):match("^.?[\128-\191]*$") or (modern_kana and modern_kana:find(mora))) then
rare_chars[mora] = true
end
end
for ch in pairs(rare_chars) do
-- insert(hc, lang_name .. " terms historically spelled with " .. ch)
end
insert(data.info_hist, require("Module:ja-link").link({
lang = data.headword.lang,
lemma = hist_kana,
tr = formatting_adjustments(
remove_links(kana_to_romaji(hist_kana, data.lang_code, nil, {hist = true})),
hist_kana,
pos
),
}, {
face = "head",
disableSelfLink = true,
}))
end
local function detect_pagename_kana(data, digraphs)
local pagename = data.pagename
-- Exclude "&" and "@", which are part of %p (e.g. リズム&ブルース).
local function remove_kana(m)
return m:match("[&@]") or ""
end
if ugsub(pagename, '[%p%s%c' .. range.hiragana .. (digraphs and "ゟ" or "") .. ']', remove_kana) == "" then
return 'hira'
elseif ugsub(pagename, '[%p%s%c' .. range.katakana .. (digraphs and "ヿ" or "") .. ']', remove_kana) == "" then
return 'kata'
elseif ugsub(pagename, '[%p%s%c' .. range.kana .. (digraphs and "ゟヿ" or "") .. ']', remove_kana) == "" then
return 'both'
end
end
-- Go through args and build the headword forms (heads and romanizations) from whichever kana were supplied.
local function format_headword(args, data)
local pagename, kanas, lang_name = data.pagename, data.kanas, data.lang_name
data.pagename_kana = detect_pagename_kana(data)
if args[1][1] and not args[1][1]:match'[\128-\255]' then
-- filter out POS designations
remove(args[1], 1)
end
local linked_translit = data.headword.lang:link_tr(Jpan)
local suru_ending, rom_suru_ending
if data.headword.pos_category == "suru verbs" then
suru_ending = "[[する]]"
rom_suru_ending = linked_translit and " [[suru]]" or " suru"
else
suru_ending, rom_suru_ending = "", ""
end
if data.pagename_kana then -- pure-kana-title entry
if #args.head > 0 or args.head.default then
-- insert(data.headword.categories, lang_name .. " terms with redundant head parameter")
end
-- {{ja-xxx}} vs {{ja-xxx|こ.うし}} vs {{ja-xxx|コウシ}} in [[こうし]]
if not args[1][1] then
args[1][1] = pagename
elseif remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) ~= pagename then
insert(args[1], 1, pagename)
end
for i, k in ipairs(args[1]) do
insert(data.headword.heads, {
term = k:gsub("[%^%-%. %%]+", "") .. suru_ending,
tr = '-',
l = args.label[i] and {args.label[i]} or nil,
})
end
for i = 1, math.max(args.rom.maxindex, 1) do
local rom = args.rom[i] or args.rom.default or kana_to_romaji_with_pos_format(args[1][1], data, args)
if not data.headword.heads[i] then
data.headword.heads[i] = {term = data.headword.heads[i-1].term}
end
if rom == "-" then
data.headword.heads[i].tr = "-"
elseif linked_translit then
data.headword.heads[i].tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
data.headword.heads[i].tr = rom .. rom_suru_ending
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(args[1][1]:gsub("[%^%-%. %%]+", "")) .. suru_ending -- always based on the first kana form; args[i] is not a list for i > 1
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
kanas[1] = pagename
if args.hist[1] then
historical_kana(data, args.hist[1], args[1][1])
end
else -- non-pure-kana-title entry
if #args[1] == 0 and not (data.headword.pos_category == "punctuation marks" or data.headword.pos_category == "iteration marks" or data.headword.pos_category == "symbols") then
error("Kana form is required.")
end
if args.head.default == pagename then
-- insert(data.headword.categories, lang_name .. " terms with redundant head parameter")
end
local rom_repetition_final = {}
for i, k in ipairs(args[1]) do
local rom_auto = kana_to_romaji_with_pos_format(k, data, args)
local head = args.head[i] or args.head.default or pagename
if args.head[i] == pagename then
-- insert(data.headword.categories, lang_name .. " terms with redundant head parameter")
end
local head_for_ruby, kana_for_ruby
if ulen(head) > 1 and head:match'%%' == nil and k:match'%%' == nil then
head_for_ruby, kana_for_ruby = assign_kana_to_kanji(head, k, pagename, data.lang_code .. '%-kanjitab')
else
head_for_ruby, kana_for_ruby = head, k
end
local format_table = m_ja_ruby.parse_text(head_for_ruby, kana_for_ruby, {
try = 'force',
try_force_limit = 10000,
})
local kana_bare = remove_links(k:gsub("[%^%-%. %%]+", ""))
local rom = args.rom[i] or args.rom.default or rom_auto
head = {
term = m_ja_ruby.to_wiki(format_table, {
break_link = true,
}):gsub('<rt>(..-)</rt>', "<rt>[[" .. kana_bare .."|%1]]</rt>") .. suru_ending,
l = args.label[i] and {args.label[i]} or nil,
}
if rom == "-" or rom_repetition_final[rom] then
head.tr = "-"
elseif linked_translit then
head.tr = "[[" .. rom .. "]]" .. rom_suru_ending
else
head.tr = rom .. rom_suru_ending
end
insert(data.headword.heads, head)
rom_repetition_final[rom] = true
insert(kanas, kana_bare)
if args.hist[i] then
historical_kana(data, args.hist[i], k)
end
if not data.inflection_base.form then
data.inflection_base.form = remove_links(m_ja_ruby.to_markup(format_table)) .. suru_ending
data.inflection_base.romaji = rom .. rom_suru_ending
end
end
local first_reading, multiple = kanas[1]
if not first_reading then
return
end
first_reading = ulower(kana_to_romaji(first_reading, data.lang_code)):gsub("%%", "")
for i = 2, #kanas do
if ulower(kana_to_romaji(kanas[i], data.lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
break
end
end
if not multiple then
local lang_code = data.lang_code
local content = mw.title.getCurrentTitle():getContent() or "" -- getContent() returns nil if the page does not exist yet
local loc1, loc2 = content:find("%f[^%z%s]==%s*" .. lang_name:gsub("%-", "%%%-") .. "%s*==()")
loc2 = content:find("%f[^%z%s]==[^\n=]+==", loc2)
if loc1 then
content = content:sub(loc1, loc2)
for template in require("Module:template parser").find_templates(content) do
local name, reading = template:get_name()
if (
name == lang_code .. "-head" or
name == lang_code .. "-pos"
) then
reading = template:get_arguments()[2]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif (
name == lang_code .. "-noun" or
name == lang_code .. "-verb" or
name == lang_code .. "-adj" or
name == lang_code .. "-phrase" or
name == lang_code .. "-verb form" or
name == lang_code .. "-verb-suru"
) then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
elseif name == lang_code .. "-see" then
reading = template:get_arguments()[1]
if reading ~= nil then
reading = remove_links(reading):gsub("%%", "")
end
-- if umatch(reading, "[^" .. range.kana .. "]") then
-- TODO: check linked page
-- end
end
if reading and ulower(kana_to_romaji(reading, lang_code)):gsub("%%", "") ~= first_reading then
multiple = true
end
end
end
end
if multiple then
-- insert(data.headword.categories, lang_name .. " terms with multiple readings")
end
end
end
local function add_transitivity(data, tr)
local categories, lang_name = data.headword.categories, data.lang_name
tr = aliases[tr] or tr
if tr == "tr" then
insert(data.info_mid, 'transitive')
-- insert(categories, lang_name .. " transitive verbs")
elseif tr == "in" then
insert(data.info_mid, 'intransitive')
-- insert(categories, lang_name .. " intransitive verbs")
elseif tr == "both" then
-- insert(data.info_mid, 'transitive or intransitive')
-- insert(categories, lang_name .. " transitive verbs")
-- insert(categories, lang_name .. " intransitive verbs")
else
-- insert(categories, lang_name .. " verbs without transitivity")
end
end
local function get_final(lemma, data)
return kana_to_romaji(remove(moraify(m_ja_ruby.to_ruby(m_ja_ruby.parse_markup(lemma)))), data.lang_code)
end
local function add_language_fragment(t, lang_name)
for k, v in ipairs(t) do
t[k] = v:gsub("%[%[([^]#]*)%]%]", function (s)
return "[[" .. s .. "#" .. lang_name .. "|" .. s .. "]]"
end)
end
end
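-- Example (a sketch; "Japanese" stands in for whatever lang_name holds): bare links
-- gain a section anchor, while links that already contain "#" are left alone.
--   local t = {"[[な]]", "[[と#Japanese:_adverbs|と]]"}
--   add_language_fragment(t, "Japanese")
--   -- t[1] == "[[な#Japanese|な]]"; t[2] is unchanged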
local function add_inflections(data, inflection_type, cat_suffix)
local lang_name = data.lang_name
local lemma = data.inflection_base.form
local romaji = data.inflection_base.romaji
inflection_type = aliases[inflection_type] or inflection_type
local function replace_suffix(lemma_from, lemma_to, romaji_from, romaji_to)
-- e.g. 持って来る, lemma = "[持](も)って来(く)る"
-- lemma_from = "くる", lemma_to = {"き","きた"}
add_language_fragment(lemma_to, lang_name)
add_language_fragment(romaji_to, lang_name)
local result = {}
local pattern_from, n_from = lemma_from:gsub('.[\128-\191]*', function(c)
return '[' .. c .. m_ja.hira_to_kata(c) .. ']([^' .. range.kana .. ']*)'
end)
pattern_from = pattern_from .. '$'
-- "[くク]([^kana range]*)[るル]([^kana range]*)$"
for i_lemma_to, s_lemma_to in ipairs(lemma_to) do
local n_to = 0
local pattern_to = s_lemma_to:gsub('.[\128-\191]*', function(c)
if n_to < n_from then
n_to = n_to + 1
return c .. "%" .. n_to
else
return c
end
end)
for i = n_to + 1, n_from do
pattern_to = pattern_to .. "%" .. i
end
-- "き%1%2", "き%1た%2"
local lemma_inflected, success = ugsub(lemma, pattern_from, pattern_to)
if success == 0 then
return
end
local romaji_inflected
romaji_inflected, success = romaji:gsub(romaji_from .. "$", romaji_to[i_lemma_to])
if success == 0 then
romaji_inflected, success = romaji:gsub("%[%[" .. romaji_from .. "%]%]$", "[[" .. romaji_to[i_lemma_to] .. "]]")
if success == 0 then
return
end
end
insert(result, {lemma = lemma_inflected, romaji = romaji_inflected})
end
return result -- {{lemma="[持](も)って来(き)",romaji="motteki"},{lemma="[持](も)って来(き)た",romaji="mottekita"}}
end
local function insert_form(label, ...)
-- label = "stem" or "past" etc.
-- ... = {lemma=...,romaji=...},{lemma=...,romaji=...}
local labeled_forms = {label = label}
for _, v in ipairs{...} do
local table_form = m_ja_ruby.parse_markup(v.lemma)
local form_term = m_ja_ruby.to_wiki(table_form)
if not form_term:find'%[%[.+%]%]' then
form_term = '[[' .. m_ja_ruby.to_text(table_form) .. '#' .. lang_name .. '|' .. form_term .. ']]'
end
insert(labeled_forms, {
term = form_term,
tr = v.romaji,
})
end
insert(data.headword.inflections, labeled_forms)
end
local inflected_forms
if data.lang_code == 'ja' then
if inflection_type == '1' or inflection_type == '1s' then
insert(data.info_mid, '<abbr title="godan (group 1) conjugation">godan</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " godan " .. cat_suffix)
local romaji = data.inflection_base.romaji
if cat_suffix == "ကြိယာ" then
local final = get_final(lemma, data)
-- insert(data.headword.categories, lang_name .. " godan " .. cat_suffix .. " ending with -" .. final)
if final == "ru" then
if umatch(romaji, "[iIīĪ]ru$") then
-- insert(data.headword.categories, lang_name .. " godan " .. cat_suffix .. " ending with -iru")
elseif umatch(romaji, "[eEēĒ]ru$") then
-- insert(data.headword.categories, lang_name .. " godan " .. cat_suffix .. " ending with -eru")
end
end
end
end
if inflection_type == '1' then
inflected_forms =
replace_suffix('く', {'き', 'いた'}, 'ku', {'ki', 'ita'}) or
replace_suffix('ぐ', {'ぎ', 'いだ'}, 'gu', {'gi', 'ida'}) or
replace_suffix('す', {'し', 'した'}, 'su', {'shi', 'shita'}) or
replace_suffix('つ', {'ち', 'った'}, 'tsu', {'chi', 'tta'}) or
replace_suffix('ぬ', {'に', 'んだ'}, 'nu', {'ni', 'nda'}) or
replace_suffix('ぶ', {'び', 'んだ'}, 'bu', {'bi', 'nda'}) or
replace_suffix('む', {'み', 'んだ'}, 'mu', {'mi', 'nda'}) or
replace_suffix('る', {'り', 'った'}, 'ru', {'ri', 'tta'}) or
replace_suffix('う', {'い', 'った'}, 'u', {'i', 'tta'})
if inflected_forms then
insert_form('stem', inflected_forms[1])
insert_form('past', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
else
inflected_forms =
replace_suffix('る', {'り', 'った', 'い'}, 'ru', {'ri', 'tta', 'i'}) or --くださる
replace_suffix('いく', {'いき', 'いった'}, 'iku', {'iki', 'itta'}) or --行く
replace_suffix('う', {'い', 'うた'}, 'ou', {'oi', 'ōta'}) --問う
if inflected_forms then
insert_form('stem', inflected_forms[1], inflected_forms[3])
insert_form('past', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
end
elseif inflection_type == '2' then
insert(data.info_mid, '<abbr title="ichidan (group 2) conjugation">ichidan</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " ichidan " .. cat_suffix)
local romaji = data.inflection_base.romaji
if umatch(romaji, "[iIīĪ]ru$") then
-- insert(data.headword.categories, lang_name .. " kami ichidan " .. cat_suffix)
elseif umatch(romaji, "[eEēĒ]ru$") then
-- insert(data.headword.categories, lang_name .. " shimo ichidan " .. cat_suffix)
else
-- insert(data.headword.categories, lang_name .. " irregular " .. cat_suffix)
end
end
inflected_forms = replace_suffix('る', {'', 'た'}, 'ru', {'', 'ta'})
if inflected_forms then
insert_form('stem', inflected_forms[1])
insert_form('past', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'suru' then
insert(data.info_mid, '<abbr title="suru (group 3) conjugation">suru</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " suru " .. cat_suffix)
end
inflected_forms =
replace_suffix('する', {'し', 'した'}, 'suru', {'shi', 'shita'}) or
replace_suffix('ずる', {'じ', 'じた'}, 'zuru', {'ji', 'jita'})
if inflected_forms then
insert_form('stem', inflected_forms[1])
insert_form('past', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'kuru' then
insert(data.info_mid, '<abbr title="kuru (group 3) conjugation">kuru</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " kuru " .. cat_suffix)
end
inflected_forms = replace_suffix('くる', {'き', 'きた'}, 'kuru', {'ki', 'kita'})
if inflected_forms then
insert_form('stem', inflected_forms[1])
insert_form('past', inflected_forms[2])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'i' or inflection_type == 'い' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " い-i " .. cat_suffix)
end
inflected_forms = replace_suffix('い', {'く'}, 'i', {'ku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'is' then
insert(data.info_mid, '<abbr title="-i (type I) inflection">-i</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " い-i " .. cat_suffix)
end
inflected_forms = replace_suffix('いい', {'よく'}, 'ii', {'yoku'})
if inflected_forms then
insert_form('adverbial', inflected_forms[1])
else
require'Module:debug'.track'Jpan-headword/inflection failed/ja'
end
elseif inflection_type == 'na' or inflection_type == 'な' then
insert(data.info_mid, '<abbr title="-na (type II) inflection">-na</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " な-na " .. cat_suffix)
end
inflected_forms = replace_suffix('', {'[[な]]', '[[に]]'}, '', {' [[na]]', ' [[ni]]'})
insert_form('adnominal', inflected_forms[1])
insert_form('adverbial', inflected_forms[2])
elseif inflection_type == "yo" then
insert(data.info_mid, '<abbr title="yodan conjugation (classical)"><sup><small>†</small></sup>yodan</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " yodan " .. cat_suffix)
-- insert(data.headword.categories, lang_name .. " yodan " .. cat_suffix .. " ending with -" .. get_final(lemma, data))
end
elseif inflection_type == "kami ni" then
insert(data.info_mid, '<abbr title="kami nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " nidan " .. cat_suffix)
-- insert(data.headword.categories, lang_name .. " kami nidan " .. cat_suffix)
end
elseif inflection_type == "shimo ni" then
insert(data.info_mid, '<abbr title="shimo nidan conjugation (classical)"><sup><small>†</small></sup>nidan</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " nidan " .. cat_suffix)
-- insert(data.headword.categories, lang_name .. " shimo nidan " .. cat_suffix)
end
elseif inflection_type == "rahen" then
insert(data.info_mid, '<abbr title="r-special conjugation (classical)"><sup><small>†</small></sup>-ri</abbr>')
elseif inflection_type == "sahen" then
insert(data.info_mid, '<abbr title="s-special conjugation (classical)"><sup><small>†</small></sup>-se</abbr>')
elseif inflection_type == "kahen" then
insert(data.info_mid, '<abbr title="k-special conjugation (classical)"><sup><small>†</small></sup>-ko</abbr>')
elseif inflection_type == "nahen" then
insert(data.info_mid, '<abbr title="n-special conjugation (classical)"><sup><small>†</small></sup>-n</abbr>')
elseif inflection_type == "nari" or inflection_type == "なり" then
insert(data.info_mid, '<abbr title="-nari inflection (classical)"><sup><small>†</small></sup>-nari</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " なり-nari " .. cat_suffix)
end
elseif inflection_type == 'tari' or inflection_type == 'たり' then
insert(data.info_mid, '<abbr title="-tari inflection (classical)"><sup><small>†</small></sup>-tari</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " たり-tari " .. cat_suffix)
end
inflected_forms = replace_suffix('', {'[[とした]]', '[[たる]]', '[[と]]', '[[として]]'}, '', {' [[to shita]]', ' [[taru]]', ' [[to]]', ' [[to shite]]'})
insert_form('adnominal', inflected_forms[1], inflected_forms[2])
insert_form('adverbial', inflected_forms[3], inflected_forms[4])
elseif inflection_type == "ku" or inflection_type == "く" then
insert(data.info_mid, '<abbr title="-ku inflection (classical)"><sup><small>†</small></sup>-ku</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " く-ku " .. cat_suffix)
end
elseif inflection_type == "shiku" or inflection_type == "しく" then
insert(data.info_mid, '<abbr title="-shiku inflection (classical)"><sup><small>†</small></sup>-shiku</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " しく-shiku " .. cat_suffix)
end
elseif inflection_type == "ka" or inflection_type == "か" then
insert(data.info_mid, '<abbr title="-ka inflection (dialectal)"><sup><small>†</small></sup>-ka</abbr>')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " か-ka " .. cat_suffix)
end
elseif inflection_type and inflection_type:len() > adverbs_optional_tag:len() and inflection_type:sub(1, adverbs_optional_tag:len()) == adverbs_optional_tag then
local adverbs_optional_list = inflection_type:sub(adverbs_optional_tag:len() + 1)
for option in gsplit(adverbs_optional_list, ':') do
local normalized_option = adverbs_optional_aliases[option]
if not normalized_option then
error('unrecognized adverb opt= argument: "' .. option .. '"')
end
local normalized_option_romaji = kana_to_romaji(normalized_option, data.lang_code)
local normalized_option_link = adverbs_optional_links[normalized_option]
inflected_forms = replace_suffix('', {normalized_option_link}, '', {' [[' .. normalized_option_romaji .. ']]'})
insert_form('optionally as', inflected_forms[1])
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " " .. cat_suffix .. " optionally taking " .. normalized_option .. "-" .. normalized_option_romaji)
end
end
elseif inflection_type == 'irr' then
insert(data.info_mid, 'irregular')
if cat_suffix then
-- insert(data.headword.categories, lang_name .. " irregular " .. cat_suffix)
end
elseif inflection_type == '-' or inflection_type == 'un' then
insert(data.info_mid, 'uninflectable')
end
--elseif data.lang_code == 'ryu' then ...
end
end
local function add_categories(data)
local lang_name = data.lang_name
local pagename = data.pagename
local tc = data.headword.categories
-- adds category [langname] terms spelled with jōyō kanji or [langname] terms spelled with non-jōyō kanji
-- (if it contains any kanji)
local number_of_kanji = 0
for c in ugmatch(pagename, "[" .. range.kanji .. "々〻]") do
number_of_kanji = number_of_kanji + 1
if c ~= "々" and c ~= "〻" then -- Not a kanji for the purposes of categorisation.
-- insert(tc, ("%s terms spelled with %s kanji"):format(lang_name, en_grades[m_ja.kanji_grade(c)]))
end
end
-- categorize by number of kanji
if number_of_kanji ~= 0 then
-- insert(tc, ("%s terms with %s kanji"):format(lang_name, number_of_kanji))
-- single-kanji terms
if ulen(pagename) == 1 then
-- insert(tc, lang_name .. " terms spelled with " .. pagename)
-- insert(tc, lang_name .. " single-kanji terms")
end
end
-- categorize by the script of the pagename or specific characters contained in it
-- if pagename is hiragana or katakana
if detect_pagename_kana(data, true) == 'hira' then insert(tc, "ဟဳရာဂန" .. lang_name) end
if detect_pagename_kana(data, true) == 'kata' then insert(data.katakana_category, "ကာတာကာနာ" .. lang_name) end
local p, n = ugsub(pagename, '[' .. range.kana .. range.kanji .. range.ideograph .. range.kana_graph .. range.punctuation .. ']+', '')
if p ~= '' and n > 0 then insert(tc, "ဝေါဟာ".. lang_name .. "မချူလဝ်ပ္ဍဲမဂၠိုၚ်ကဵုအက္ခရ်ဂမၠိုၚ်") end
local pos = data.headword.pos_category
local rare_chars = {}
for ch in iterate_rare_chars(pagename) do
rare_chars[ch] = true
end
-- Categorise yōon, but exclude kana and mora entries, since they can't be spelled with themselves.
-- FIXME: allow kana categories for morae.
if not (pos == "syllables" or pos == "kana" or pos == "morae") then
for _, mora in ipairs(moraify((ugsub(pagename, "[^" .. range.kana .. "]+", " ")))) do
if not mora:gsub(" +", ""):match("^.?[\128-\191]*$") then
rare_chars[mora] = true
end
end
end
for ch in pairs(rare_chars) do
-- insert(tc, lang_name .. " terms spelled with " .. ch)
end
if (
pos ~= "ပေါရာဏာံပေါရာဒါံ" and
pos ~= "ဝါကျ" and
umatch(ugsub(pagename, "[" .. range.katakana .. "]+", ""), "[" .. range.hiragana .. "]") and
umatch(ugsub(pagename, "[" .. range.hiragana .. "]+", ""), "[" .. range.katakana .. "]")
) then
-- insert(tc, lang_name .. " terms spelled with mixed kana")
end
end
pos_functions["ကြိယာ"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, args["infl"], 'ကြိယာ')
end
pos_functions["အဆက်လက္ကရဴ"] = function(args, data)
add_inflections(data, args["infl"])
end
pos_functions["auxiliary verbs"] = function(args, data)
insert(data.headword.categories,"ကြိယာအရီုအဗၚ်" .. data.lang_name .. "ဂမၠိုၚ်")
add_inflections(data, args["infl"])
data.headword.pos_category = "ကြိယာ"
end
pos_functions["suru verbs"] = function(args, data)
add_transitivity(data, args["tr"])
add_inflections(data, 'suru', 'ကြိယာ')
data.headword.pos_category = "ကြိယာ"
end
pos_functions["နာမဝိသေသန"] = function(args, data)
add_inflections(data, args["infl"], 'နာမဝိသေသန')
end
pos_functions["နာမ်"] = function(args, data)
-- the counter (classifier) parameter, only relevant for nouns
local counter = args["count"] or ""
if counter == "-" then
insert(data.headword.inflections, {label = "တော်ဟွံမာန်"})
elseif counter ~= "" then
insert(data.headword.inflections, {label = "ရိုဟ်သၠုဲ", counter})
end
end
pos_functions["ကြိယာဝိသေသန"] = function(args, data)
local opt = args["opt"]
if opt then
opt = adverbs_optional_tag .. opt
end
add_inflections(data, opt, 'ကြိယာဝိသေသန')
end
--[==[
Generate categories by pagename, also optionally by POS
Also for use in soft redirect pages ([[Module:ja-see]]).
Sortkey is not provided.
data = {
pagename = ..., -- (required)
lang = ..., -- (required) language object
categories = {}, -- (required) receive categories
katakana_category = {}, -- (required) receive katakana-sorted categories
pos = ..., -- (optional) "noun", "verb", etc.; no POS categories are added if not given
}
]==]
function export.cat(data)
data.lang_name = data.lang:getCanonicalName()
data.pagename_kana = detect_pagename_kana(data)
if data.pos then
local pos = data.pos:gsub('x$', 'xe')
insert(data.categories, pos .. data.lang_name .. 'ဂမၠိုၚ်')
insert(data.categories, require'Module:headword'.pos_lemma_or_nonlemma(pos, true) .. data.lang_name .. 'ဂမၠိုၚ်')
end
data.headword = {categories = data.categories}
add_categories(data)
end
--[==[
The main entry point.
This is the only function that can be invoked from a template.
]==]
function export.show(frame)
local poscat = frame.args[2] or frame.args[1] or error("Part of speech has not been specified. Please pass parameter 1 to the module invocation.")
local alias_of_hist = {alias_of = 'hist', list = false}
local alias_of_infl = {alias_of = "infl"}
local list = {list = true}
local list_allow_holes_separate_no_index = {list = true, allow_holes = true, separate_no_index = true}
local params = {
[1] = list,
['rom'] = list_allow_holes_separate_no_index,
['head'] = list_allow_holes_separate_no_index,
['label'] = {list = true, allow_holes = true},
['hist'] = list, ['hhira'] = alias_of_hist, ['hkata'] = alias_of_hist,
['tr'] = true,
['infl'] = true, ['type'] = alias_of_infl, ['decl'] = alias_of_infl,
['opt'] = true,
['count'] = true,
['sort'] = true,
['pagename'] = true,
}
-- For backwards compatibility with uses of {{ja-syllable}} with the script parameter.
if poscat == "syllables" then
params["sc"] = true
end
local args = require('Module:parameters').process(frame:getParent().args, params)
local data = {
headword = {
pos_category = poscat,
categories = {},
heads = {},
no_redundant_head_cat = true,
inflections = {},
genders = {'m'}, -- placeholder
nogendercat = true,
},
--custom info
pagename = args.pagename or mw.loadData("Module:headword/data").pagename,
pagename_kana = nil, -- "hira" "kata" "both", nil
lang_code = frame.args[1],
lang_name = nil, -- "Japanese", "Okinawan" ...
katakana_category = {},
info_mid = {}, -- "godan", "intransitive" ...
info_hist = {}, -- historical kana
inflection_base = {}, -- base of inflections
kanas = {}, -- kana id
}
data.headword.lang = require("Module:languages").getByCode(data.lang_code)
data.lang_name = data.headword.lang:getCanonicalName()
-- sort out all the kanas and do the romanization business
format_headword(args, data)
-- add certain inflections and categories for adjectives, verbs, nouns, or adverbs
if pos_functions[poscat] then
pos_functions[poscat](args, data)
end
-- categories
add_categories(data)
local sort_base = args.sort or data.kanas[1] or data.pagename
data.headword.sort_key = data.headword.lang:makeSortKey(sort_base)
local katakana_category = #data.katakana_category > 0 and
require("Module:utilities").format_categories(
data.katakana_category,
data.headword.lang,
nil,
sort_base,
nil,
require("Module:scripts").getByCode("Kana")
) or ""
-- output
local i_kanas = 0
return katakana_category .. require('Module:headword').full_headword(data.headword):gsub('<span class="gender">.-</span>', function()
return (#data.info_hist > 0 and '<sup>←' .. concat(data.info_hist, ' or ') .. '<sup>[[w:Historical kana orthography|?]]</sup></sup>' or '') .. ('<i>' .. concat(data.info_mid, ' ') .. '</i>')
end):gsub('<strong .->.-</strong>', function(m0)
i_kanas = i_kanas + 1
if data.kanas[i_kanas] then
return m0
end
end)
end
return export
swhiidoqh0mb3cwe5277961yn8pv384
မဝ်ဂျူ:category tree
828
1140
385637
189243
2026-04-02T17:16:45Z
咽頭べさ
33
385637
Scribunto
text/plain
-- Prevent substitution.
if mw.isSubsting() then
return require("Module:unsubst")
end
local export = {}
local category_tree_submodule_prefix = "Module:category tree/"
local category_tree_styles_css = "Module:category tree/styles.css"
local m_str_utils = require("Module:string utilities")
local m_template_parser = require("Module:template parser")
local m_utilities = require("Module:utilities")
local ceil = math.ceil
local class_else_type = m_template_parser.class_else_type
local concat = table.concat
local deep_copy = require("Module:table").deepCopy
local full_url = mw.uri.fullUrl
local insert = table.insert
local is_callable = require("Module:fun").is_callable
local log10 = math.log10 or require("Module:math").log10
local new_title = mw.title.new
local pages_in_category = mw.site.stats.pagesInCategory
local parse = m_template_parser.parse
local remove_comments = require("Module:string/removeComments")
local sort = table.sort
local split = m_str_utils.split
local string_compare = require("Module:string/compare")
local trim = m_str_utils.trim
local uupper = m_str_utils.upper
local yesno = require("Module:yesno")
local current_frame = mw.getCurrentFrame()
local current_title = mw.title.getCurrentTitle()
local namespace = current_title.namespace
local poscatboiler_subsystem = "poscatboiler"
local extra_args_error = "Extra arguments to {{((}}auto cat{{))}} are not allowed for this category."
-- Generates a sortkey for a numeral `n`, adding leading zeroes to avoid the "1, 10, 2, 3" sorting problem.
-- `max_n` is the greatest expected value of `n`, and is used to determine how many leading zeroes are needed.
-- If not supplied, it defaults to the number of languages.
function export.numeral_sortkey(n, max_n)
max_n = max_n or require("Module:list of languages").count()
return ("#%%0%dd"):format(ceil(log10(max_n + 1))):format(n)
end
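--[==[
Illustrative sketch only, not part of the module: the zero-padding idea behind `numeral_sortkey`, written so it
runs outside MediaWiki. The `log10` fallback and the sample value `150` for `max_n` are assumptions made for the
example; the real function defaults `max_n` to the number of languages.

```lua
-- math.log10 is not available in every Lua version, so fall back to a manual computation.
local log10 = math.log10 or function(x) return math.log(x) / math.log(10) end

local function numeral_sortkey(n, max_n)
	-- Number of digits needed to print any value up to `max_n`.
	local width = math.ceil(log10(max_n + 1))
	-- The first format call builds the pattern (e.g. "#%03d"); the second applies it to `n`.
	return ("#%%0%dd"):format(width):format(n)
end

-- Padded keys sort numerically even under plain string comparison.
local keys = {numeral_sortkey(10, 150), numeral_sortkey(2, 150), numeral_sortkey(1, 150)}
table.sort(keys)
print(table.concat(keys, " "))
```

The double `format` is what lets the padding width depend on `max_n` without building the pattern by hand.
]==]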
function export.split_lang_label(title_text)
local getByCanonicalName = require("Module:languages").getByCanonicalName
-- Progressively remove a word from the potential canonical name until it
-- matches an actual canonical name.
local words = split(title_text, " ", true)
for i = #words - 1, 1, -1 do
local lang = getByCanonicalName(concat(words, " ", 1, i))
if lang then
return lang, concat(words, " ", i + 1)
end
end
return nil, title_text
end
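--[==[
Illustrative sketch only, not part of the module: the longest-prefix search used by `split_lang_label`, with a
hypothetical `known_names` table standing in for `getByCanonicalName`. The names in that table are assumptions
chosen for the example, and the word splitting here is simplified.

```lua
-- Hypothetical stand-in for the language lookup: a fixed set of known canonical names.
local known_names = {["English"] = true, ["Old English"] = true, ["Mato"] = true}
local function get_by_canonical_name(name)
	return known_names[name] and name or nil
end

local function split_lang_label(title_text)
	local words = {}
	for word in title_text:gmatch("[^ ]+") do
		words[#words + 1] = word
	end
	-- Try the longest candidate prefix first, always leaving at least one word for the label.
	for i = #words - 1, 1, -1 do
		local lang = get_by_canonical_name(table.concat(words, " ", 1, i))
		if lang then
			return lang, table.concat(words, " ", i + 1)
		end
	end
	return nil, title_text
end

print(split_lang_label("Old English non-lemma forms"))
print(split_lang_label("Mato Grosso, Brazil"))
```

The second call shows why a match here is only tentative: a place name that happens to begin with a language name
is split as if it were a language plus a label.
]==]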
local function show_error(text)
return require("Module:message box").maintenance(
"red",
"[[File:Ambox warning pn.svg|50px]]",
"This category is not defined in Wiktionary's category tree.",
text
)
end
-- Show the text that goes at the very top right of the page.
local function show_topright(current)
return current.getTopright and current:getTopright() or nil
end
local function link_box(content)
return ("<div class=\"noprint plainlinks\" style=\"float: right; clear: both; margin: 0 0 .5em 1em; border: 1px var(--border-color-base, #aaaaaa) solid; margin-top: -1px; padding: 5px; font-weight: bold;\">%s</div>"):format(content)
end
local function show_editlink(current)
return link_box(("[%s စၟတ်သမ္တီပလေဝ်ဒါန်ကဏ္ဍ]"):format(tostring(full_url(current:getDataModule(), "action=edit"))))
end
local function show_related_changes()
local title = current_title.fullText
return link_box(("[%s <span title=\"Recent edits and other changes to pages in %s\">အပြံၚ်လှာဲလက္ကရဴအိုတ်</span>]"):format(
tostring(full_url("Special:RecentChangesLinked", {
target = title,
showlinkedto = 0,
})),
title
))
end
local function show_pagelist(current)
local namespace = "namespace="
local info = current:getInfo()
local lang_code = info.code
if info.label == "citations" or info.label == "citations of undefined terms" then
namespace = namespace .. "Citations"
elseif lang_code then
local lang = require("Module:languages").getByCode(lang_code, true)
if lang then
-- Proto-Norse (gmq-pro) is probably the only language with a code ending in -pro
-- that's intended to have mostly non-reconstructed entries.
if (lang_code:find("%-pro$") and lang_code ~= "gmq-pro") or lang:hasType("reconstructed") then
namespace = namespace .. "ဗီုပြၚ်သိုၚ်တၟိ"
elseif lang:hasType("appendix-constructed") then
namespace = namespace .. "အဆက်လက္ကရဴ"
end
end
elseif info.label:match("ထာမ်ပလိက်") then
namespace = namespace .. "ထာမ်ပလိက်"
elseif info.label:match("မဝ်ဂျူ") then
namespace = namespace .. "မဝ်ဂျူ"
elseif info.label:match("^ဝိက်ရှေန်နရဳ") or info.label:match("^မုက်လိက်") then
namespace = ""
end
return ([=[
{| id="newest-and-oldest-pages" class="wikitable mw-collapsible" style="float: right; clear: both; margin: 0 0 .5em 1em;"
! မုက်လိက်တၟိကဵုတြေံအိုတ်
|-
| id="recent-additions" style="font-size:0.9em;" | '''မုက်လိက်တၟိအိုတ်မပလေဝ်ဒါန်လဝ်နူ[[mw:Manual:Categorylinks table#cl_timestamp|ကဏ္ဍလေန်ပ္တိုန်တၟိ]]:'''
%s
|-
| id="oldest-pages" style="font-size:0.9em;" | '''မုက်လိက်တြေံအိုတ်မပလေဝ်ဒါန်လဝ်လက္ကရဴအိုတ်:'''
%s
|}]=]):format(
current_frame:extensionTag(
"DynamicPageList",
([=[
category=%s
%s
count=10
mode=ordered
ordermethod=categoryadd
order=descending]=]
):format(current_title.text, namespace)
),
current_frame:extensionTag(
"DynamicPageList",
([=[
category=%s
%s
count=10
mode=ordered
ordermethod=lastedit
order=ascending]=]
):format(current_title.text, namespace)
)
)
end
-- Show navigational "breadcrumbs" at the top of the page.
local function show_breadcrumbs(current)
local steps = {}
-- Start at the current label and work our way up the "chain" from child to parent, until we can't go any further.
while current do
local category, display_name, nocap
if type(current) == "string" then
category = current
display_name = current:gsub("^ကဏ္ဍ:", "")
else
if not current.getCategoryName then
error("Internal error: Bad format in breadcrumb chain structure, probably a misformatted value for `parents`: " ..
mw.dumpObject(current))
end
category = "ကဏ္ဍ:" .. current:getCategoryName()
display_name, nocap = current:getBreadcrumbName()
end
if not nocap then
display_name = mw.getContentLanguage():ucfirst(display_name)
end
insert(steps, 1, ("[[:%s|%s]]"):format(category, display_name))
-- Move up the "chain" by one level.
if type(current) == "string" then
current = nil
else
current = current:getParents()
end
if current then
current = current[1].name
end
end
local templateStyles = require("Module:TemplateStyles")(category_tree_styles_css)
local ol = mw.html.create("ol")
for i, step in ipairs(steps) do
local li = mw.html.create("li")
if i ~= 1 then
local span = mw.html.create("span")
:attr("aria-hidden", "true")
:addClass("ts-categoryBreadcrumbs-separator")
:wikitext(" » ")
li:node(span)
end
li:wikitext(step)
ol:node(li)
end
return templateStyles .. tostring(mw.html.create("div")
:attr("role", "navigation")
:attr("aria-label", "Breadcrumb")
:addClass("ts-categoryBreadcrumbs")
:node(ol))
end
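--[==[
Illustrative sketch only, not part of the module: the child-to-parent walk that `show_breadcrumbs` performs,
reduced to collecting names. `make_category` and the category names are hypothetical; they loosely mirror the
`getCategoryName`/`getParents` interface used above and omit capitalisation and HTML output.

```lua
-- Hypothetical category object exposing a name and an optional single parent.
local function make_category(name, parent)
	return {
		getCategoryName = function(self) return name end,
		getParents = function(self) return parent and {{name = parent}} or nil end,
	}
end

local function breadcrumb_names(current)
	local steps = {}
	while current do
		if type(current) == "string" then
			-- A raw page title ends the chain.
			table.insert(steps, 1, current)
			current = nil
		else
			table.insert(steps, 1, current:getCategoryName())
			local parents = current:getParents()
			-- Only the first parent is followed.
			current = parents and parents[1].name
		end
	end
	return steps
end

local root = make_category("All languages", "Category:Fundamental")
local lang = make_category("English language", root)
local nouns = make_category("English nouns", lang)
print(table.concat(breadcrumb_names(nouns), " > "))
```

Inserting at position 1 keeps the list ordered from the root down, so no reversal is needed afterwards.
]==]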
local function show_also(current)
local also = current._info.also
if also and #also > 0 then
return ('<div style="margin-top:-1em;margin-bottom:1.5em">%s</div>'):format(require("Module:also").main(also))
end
return nil
end
-- Show a short description text for the category.
local function show_description(current)
return current.getDescription and current:getDescription() or nil
end
local function show_appendix(current)
local appendix = current.getAppendix and current:getAppendix()
return appendix and ("ယဝ်ရထပ်နွံပၟိက်မိက်ဂွံတီဏီတှ်ေ၊ ဆက်ဗဵုအာ [[%s]]။"):format(appendix) or nil
end
local function sort_children(child1, child2)
return string_compare(uupper(child1.sort), uupper(child2.sort))
end
-- Show a list of child categories.
local function show_children(current)
local children = current.getChildren and current:getChildren() or nil
if not children then
return nil
end
sort(children, sort_children)
local children_list = {}
for _, child in ipairs(children) do
local child_name, child_pagetitle = child.name
if type(child_name) == "string" then
child_pagetitle = child_name
else
child_pagetitle = "ကဏ္ဍ:" .. child_name:getCategoryName()
end
if new_title(child_pagetitle).exists then
insert(children_list, ("* [[:%s]]: %s"):format(
child_pagetitle,
child.description or
type(child_name) == "string" and child_name:gsub("^ကဏ္ဍ:", "") .. "." or
child_name:getDescription("child")
))
end
end
return concat(children_list, "\n")
end
-- Show a table of contents with links to each letter in the language's script.
local function show_TOC(current)
local titleText = current_title.text
local inCategoryPages = pages_in_category(titleText, "pages")
local inCategorySubcats = pages_in_category(titleText, "subcats")
local TOC_type
-- Compute type of table of contents required.
if inCategoryPages > 2500 or inCategorySubcats > 2500 then
TOC_type = "full"
elseif inCategoryPages > 200 or inCategorySubcats > 200 then
TOC_type = "normal"
else
-- No (usual) need for a TOC if all pages or subcategories can fit on one page;
-- but allow this to be overridden by a custom TOC handler.
TOC_type = "none"
end
if current.getTOC then
local TOC_text = current:getTOC(TOC_type)
if TOC_text ~= true then
return TOC_text or nil
end
end
if TOC_type ~= "none" then
local templatename = current:getTOCTemplateName()
local TOC_template
if TOC_type == "full" then
-- This category is very large, see if there is a "full" version of the TOC.
local TOC_template_full = new_title(templatename .. "/full")
if TOC_template_full.exists then
TOC_template = TOC_template_full
end
end
if not TOC_template then
local TOC_template_normal = new_title(templatename)
if TOC_template_normal.exists then
TOC_template = TOC_template_normal
end
end
if TOC_template then
return current_frame:expandTemplate{title = TOC_template.text, args = {}}
end
end
return nil
end
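--[==[
Illustrative sketch only, not part of the module: the size thresholds that `show_TOC` uses to pick a table of
contents type, isolated as a pure function. The name `toc_type` is an assumption for the example; the page counts
are passed in directly instead of being read from `pagesInCategory`.

```lua
local function toc_type(pages, subcats)
	if pages > 2500 or subcats > 2500 then
		return "full"
	elseif pages > 200 or subcats > 200 then
		return "normal"
	end
	-- Everything fits on one category page, so no TOC is needed by default.
	return "none"
end

print(toc_type(3000, 0), toc_type(10, 201), toc_type(200, 200))
```

Both comparisons are strict, so a category with exactly 200 or exactly 2500 members stays in the lower tier.
]==]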
-- Show the "catfix" that adds language attributes and script classes to the page.
local function show_catfix(current)
local lang, sc = current:getCatfixInfo()
return lang and m_utilities.catfix(lang, sc) or nil
end
-- Show the parent categories that the current category should be placed in.
local function show_categories(current, categories)
local parents = current.getParents and current:getParents() or nil
if not parents then
return nil
end
for _, parent in ipairs(parents) do
local parent_name = parent.name
local sortkey = type(parent.sort) == "table" and parent.sort:makeSortKey() or parent.sort
if type(parent_name) == "string" then
insert(categories, ("[[%s|%s]]"):format(parent_name, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(parent_name:getCategoryName(), sortkey))
end
end
-- Also put the category in its corresponding "umbrella" or "by language" category.
local umbrella = current:getUmbrella()
if umbrella then
-- FIXME: use a language-neutral sorting function like the Unicode Collation Algorithm.
local sortkey = current._lang and current._lang:getCanonicalName() or current:getCategoryName()
sortkey = require("Module:languages").getByCode("mnw", true):makeSortKey(sortkey)
if type(umbrella) == "string" then
insert(categories, ("[[%s|%s]]"):format(umbrella, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(umbrella:getCategoryName(), sortkey))
end
end
-- Check for various unwanted parser functions, which should be integrated into the category tree data instead.
-- Note: HTML comments shouldn't be removed from `content` until after this step, as they can affect the result.
local content = current_title:getContent()
if not content then
-- This happens when using [[Special:ExpandTemplates]] to call {{auto cat}} on a nonexistent category page,
-- which is needed by Benwing's create_wanted_categories.py script.
return
end
local defaultsort, displaytitle, page_has_param
for node in parse(content):iterate_nodes() do
local node_class = class_else_type(node)
if node_class == "template" then
local name = node:get_name()
if name == "DEFAULTSORT:" and not defaultsort then
insert(categories, "[[Category:Pages with DEFAULTSORT conflicts]]")
defaultsort = true
elseif name == "DISPLAYTITLE:" and not displaytitle then
insert(categories,"[[Category:Pages with DISPLAYTITLE conflicts]]")
displaytitle = true
end
elseif node_class == "parameter" and not page_has_param then
insert(categories,"[[Category:Pages with raw triple-brace template parameters]]")
page_has_param = true
end
end
-- Check for raw category markup, which should also be integrated into the category tree data.
content = remove_comments(content, "BOTH")
local head = content:find("[[", 1, true)
while head do
local close = content:find("]]", head + 2, true)
if not close then
break
end
-- Make sure there are no intervening "[[" between head and close.
local open = content:find("[[", head + 2, true)
while open and open < close do
head = open
open = content:find("[[", head + 2, true)
end
local cat = content:sub(head + 2, close - 1)
local colon = cat:match("^[ _\128-\244]*[Cc][Aa][Tt][EeGgOoRrYy _\128-\244]*():")
if colon then
local pipe = cat:find("|", colon + 1, true)
if pipe ~= #cat then
local title = new_title(pipe and cat:sub(1, pipe - 1) or cat)
if title and title.namespace == 14 then
insert(categories,"[[Category:Categories with categories using raw markup]]")
break
end
end
end
head = open
end
end
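--[==[
Illustrative sketch only, not part of the module: the bracket scan at the end of `show_categories`, reduced to
collecting the text of each innermost `[[...]]` span. The function name `innermost_links` and the sample string
are assumptions for the example; the namespace check on each span is left out.

```lua
local function innermost_links(content)
	local found = {}
	local head = content:find("[[", 1, true)
	while head do
		local close = content:find("]]", head + 2, true)
		if not close then
			break
		end
		-- If another "[[" opens before `close`, restart from it so only the innermost span is taken.
		local open = content:find("[[", head + 2, true)
		while open and open < close do
			head = open
			open = content:find("[[", head + 2, true)
		end
		found[#found + 1] = content:sub(head + 2, close - 1)
		-- Continue from the next opening bracket after this span, if any.
		head = open
	end
	return found
end

local links = innermost_links("a [[x [[Category:Foo|bar]] y]] b [[baz]]")
print(table.concat(links, "; "))
```

Passing `true` as the fourth argument to `find` makes the search literal, which matters because `[` is a pattern
metacharacter.
]==]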
local function generate_output(current)
if current then
for _, functionName in pairs{
"getBreadcrumbName",
"getDataModule",
"canBeEmpty",
"getDescription",
"getParents",
"getChildren",
"getUmbrella",
"getAppendix",
"getTOCTemplateName",
} do
if not is_callable(current[functionName]) then
require("Module:debug").track{"category tree/missing function", "category tree/missing function/" .. functionName}
end
end
end
local boxes, display, categories = {}, {}, {}
-- Categories should never show files as a gallery.
insert(categories, "__NOGALLERY__")
if current_frame:getParent():getTitle() == "ထာမ်ပလိက်:auto cat" then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍပွမမကော်ခဴ ထာမ်ပလိက်:ဗ္ဂဲအဝ်တဝ်ဂမၠိုၚ်]]")
end
-- Check if the category is empty
local totalPages = pages_in_category(current_title.text, "all")
local hugeCategory = totalPages > 1000000 -- 1 million
-- Categorize huge categories, as they cause DynamicPageList to time out and make the category inaccessible.
if hugeCategory then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍၝောံယာဲဂမၠိုၚ်]]")
end
-- Are the parameters valid?
if not current then
insert(categories, "[[Category:Categories that are not defined in the category tree]]")
insert(categories, totalPages == 0 and "[[ကဏ္ဍ:ကဏ္ဍသၠးဒၟံၚ်ဂမၠိုၚ်]]" or nil)
insert(display, show_error(
"Double-check the category name for typos. <br>" ..
"[[Special:Search/Category: " .. current_title.text:gsub("^.+:", ""):gsub(" ", "~2 ") .. '~2|Search existing categories]] to check if this category should be created under a different name (for example, "Fruits" instead of "Fruit"). <br>' ..
"To add a new category to Wiktionary's category tree, please consult " .. current_frame:expandTemplate{title = "section link", args = {
"Help:Category#How_to_create_a_category",
}} .. "."))
-- Exit here, as all code beyond here relies on current not being nil
return concat(categories, "") .. concat(display, "\n\n"), true
end
-- Does the category have the correct name?
local currentName = current:getCategoryName()
local correctName = current_title.text == currentName
if not correctName then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍမနွံကဵုယၟုဟွံဒးရးဂမၠိုၚ်]]")
insert(display, show_error(("Based on the data in the category tree, this category should be called '''[[:Category:%s]]'''."):format(currentName)))
end
-- Add cleanup category for empty categories.
local canBeEmpty = current:canBeEmpty()
if canBeEmpty and correctName then
insert(categories, " __EXPECTUNUSEDCATEGORY__")
elseif totalPages == 0 then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍသၠးဒၟံၚ်ဂမၠိုၚ်]]")
end
if current:isHidden() then
insert(categories, "__HIDDENCAT__")
end
-- Put all the float-right stuff into a <div> that does not clear, so that float-left stuff like the breadcrumbs and
-- description can go opposite the float-right stuff without vertical space.
insert(boxes, "<div style=\"float: right;\">")
insert(boxes, show_topright(current))
insert(boxes, show_editlink(current))
insert(boxes, show_related_changes())
-- Show pagelist, unless it's a huge category (since they can't use DynamicPageList - see above).
if not hugeCategory then
insert(boxes, show_pagelist(current))
end
insert(boxes, "</div>")
-- Generate the displayed information
insert(display, show_breadcrumbs(current))
insert(display, show_also(current))
insert(display, show_description(current))
insert(display, show_appendix(current))
insert(display, show_children(current))
insert(display, show_TOC(current))
insert(display, show_catfix(current))
insert(display, '<br class="clear-both-in-vector-2022-only">')
show_categories(current, categories)
return concat(boxes, "\n") .. "\n" .. concat(display, "\n\n") .. concat(categories, "")
end
--[==[
List of handler functions that try to match the page name. A handler should return the name of a submodule to
[[Module:category tree]] and an info table which is passed as an argument to the submodule. If a handler does not
recognize the page name, it should return nil. Note that the order of handlers matters!
]==]
local handlers = {}
-- Thesaurus per-language category
insert(handlers, function(title)
local code, label = title:match("^အဘိဓာန်:(%l[%a-]*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Topic per-language category
insert(handlers, function(title)
local code, label = title:match("^(%l[%a-]*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Lect category e.g. for [[:Category:New Zealand English]] or [[:Category:Issime Walser]]
insert(handlers, function(title, args)
local lect = args.lect or args.dialect
if lect ~= "" and yesno(lect, true) then -- Same as boolean in [[Module:parameters]].
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end
end)
-- poscatboiler per-language label, e.g. [[Category:English non-lemma forms]]
insert(handlers, function(title, args)
local lang, label = export.split_lang_label(title)
if not lang then
return
end
local baseLabel, script = label:match("(.+) in (.-) script$")
if script and baseLabel ~= "ဝေါဟာ" then
local scriptObj = require("Module:scripts").getByCanonicalName(script)
if scriptObj then
return poscatboiler_subsystem, {label = baseLabel, code = lang:getCode(), sc = scriptObj:getCode(), args = args}
end
end
return poscatboiler_subsystem, {label = label, code = lang:getCode(), args = args}
end)
-- poscatboiler label umbrella category
insert(handlers, function(title, args)
local label = title:match("^ဗက်အလိုက်အရေဝ်ဘာသာဂမၠိုၚ်")
if label then
-- The poscatboiler code will appropriately lowercase if needed.
return poscatboiler_subsystem, {label = label, args = args}
end
end)
-- poscatboiler raw handlers
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end)
-- poscatboiler umbrella handlers without 'by language'
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args}
end)
function export.show(frame)
local args, other_args = require("Module:parameters").process(frame:getParent().args, {
["also"] = {type = "title", sublist = "comma without whitespace", namespace = 14}
}, true)
if args.also then
for k, arg in next, args.also do
args.also[k] = arg.prefixedText
end
end
for k, arg in next, other_args do
other_args[k] = trim(arg)
end
if namespace == 10 then -- Template
return "(This template should be used on pages in the [[Help:Namespaces#Category|Category:]] namespace.)"
elseif namespace ~= 14 then -- Category
error("This template/module can only be used on pages in the [[mw:Help:Namespaces#Category|Category:]] namespace.")
end
local first_fail_args_handled, first_fail_cattext
-- Go through each handler in turn. If a handler doesn't recognize the format of the category, it will return nil,
-- and we will consider the next handler. Otherwise, it returns a template name and arguments to call it with, but
-- even then, that template might return an error, and we need to consider the next handler. This happens, for
-- example, with the category "CAT:Mato Grosso, Brazil", where "Mato" is the name of a language, so the poscatboiler
-- per-language label handler fires and tries to find a label "Grosso, Brazil". This throws an error, and
-- previously, this blocked further handler consideration, but now we check for the error and continue checking
-- handlers; eventually, the topic umbrella handler will fire and correctly handle the category.
for _, handler in ipairs(handlers) do
-- Use a new title object and args table for each handler, to keep them isolated.
local submodule, info = handler(current_title.text, deep_copy(other_args))
if submodule then
info.also = deep_copy(args.also)
require("Module:debug").track("auto cat/" .. submodule)
-- `failed` is true if no match was found.
submodule = require(category_tree_submodule_prefix .. submodule)
local cattext, failed = generate_output(submodule.main(info))
if failed then
if not first_fail_cattext then
first_fail_cattext = cattext
first_fail_args_handled = info.args and true or false
end
elseif not info.args and next(other_args) then
error(extra_args_error)
else
return cattext
end
end
end
-- If there were no matches, throw an error if any arguments were given, or otherwise return the cattext
-- from the first fail encountered. The final handlers call the boilers unconditionally, so there should
-- always be something to return.
if not first_fail_args_handled and next(other_args) then
error(extra_args_error)
end
return first_fail_cattext
end
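--[==[
Illustrative sketch only, not part of the module: the "try each handler, remember the first failure" control flow
of `export.show`, without the argument checks. The `dispatch` function and the three sample handlers are
hypothetical; a handler here returns nil when it does not recognise the title, or a text plus a `failed` flag.

```lua
local function dispatch(handlers, title)
	local first_fail
	for _, handler in ipairs(handlers) do
		local text, failed = handler(title)
		if text then
			if not failed then
				return text
			end
			-- Remember only the first failure as a last-resort result.
			first_fail = first_fail or text
		end
	end
	return first_fail
end

local sample_handlers = {
	-- Does not recognise anything.
	function(title) return nil end,
	-- Recognises the title but cannot resolve the label, so it reports a failure.
	function(title)
		if title:match("^Mato ") then
			return "error: unknown label '" .. title:sub(6) .. "'", true
		end
	end,
	-- Catch-all that always succeeds.
	function(title) return "output for " .. title, false end,
}

print(dispatch(sample_handlers, "Mato Grosso, Brazil"))
```

A failed handler does not stop the loop, so a later handler can still produce the page; the stored failure is
returned only when nothing succeeds.
]==]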
-- TODO: new test entrypoint.
return export
eyb4efu3u2hxwrrev3a2l9ejjgansk7
385643
385637
2026-04-02T17:46:02Z
咽頭べさ
33
385643
Scribunto
text/plain
-- Prevent substitution.
if mw.isSubsting() then
return require("Module:unsubst")
end
local export = {}
local category_tree_submodule_prefix = "Module:category tree/"
local category_tree_styles_css = "Module:category tree/styles.css"
local m_str_utils = require("Module:string utilities")
local m_template_parser = require("Module:template parser")
local m_utilities = require("Module:utilities")
local ceil = math.ceil
local class_else_type = m_template_parser.class_else_type
local concat = table.concat
local deep_copy = require("Module:table").deepCopy
local full_url = mw.uri.fullUrl
local insert = table.insert
local is_callable = require("Module:fun").is_callable
local log10 = math.log10 or require("Module:math").log10
local new_title = mw.title.new
local pages_in_category = mw.site.stats.pagesInCategory
local parse = m_template_parser.parse
local remove_comments = require("Module:string/removeComments")
local sort = table.sort
local split = m_str_utils.split
local string_compare = require("Module:string/compare")
local trim = m_str_utils.trim
local uupper = m_str_utils.upper
local yesno = require("Module:yesno")
local current_frame = mw.getCurrentFrame()
local current_title = mw.title.getCurrentTitle()
local namespace = current_title.namespace
local poscatboiler_subsystem = "poscatboiler"
local extra_args_error = "Extra arguments to {{((}}auto cat{{))}} are not allowed for this category."
-- Generates a sortkey for a numeral `n`, adding leading zeroes to avoid the "1, 10, 2, 3" sorting problem.
-- `max_n` is the greatest expected value of `n`, and is used to determine how many leading zeroes are needed.
-- If not supplied, it defaults to the number of languages.
function export.numeral_sortkey(n, max_n)
max_n = max_n or require("Module:list of languages").count()
return ("#%%0%dd"):format(ceil(log10(max_n + 1))):format(n)
end
function export.split_lang_label(title_text)
local getByCanonicalName = require("Module:languages").getByCanonicalName
-- Progressively remove a word from the potential canonical name until it
-- matches an actual canonical name.
local words = split(title_text, " ", true)
for i = #words - 1, 1, -1 do
local lang = getByCanonicalName(concat(words, " ", 1, i))
if lang then
return lang, concat(words, " ", i + 1)
end
end
return nil, title_text
end
local function show_error(text)
return require("Module:message box").maintenance(
"red",
"[[File:Ambox warning pn.svg|50px]]",
"This category is not defined in Wiktionary's category tree.",
text
)
end
-- Show the text that goes at the very top right of the page.
local function show_topright(current)
return current.getTopright and current:getTopright() or nil
end
local function link_box(content)
return ("<div class=\"noprint plainlinks\" style=\"float: right; clear: both; margin: 0 0 .5em 1em; border: 1px var(--border-color-base, #aaaaaa) solid; margin-top: -1px; padding: 5px; font-weight: bold;\">%s</div>"):format(content)
end
local function show_editlink(current)
return link_box(("[%s စၟတ်သမ္တီပလေဝ်ဒါန်ကဏ္ဍ]"):format(tostring(full_url(current:getDataModule(), "action=edit"))))
end
local function show_related_changes()
local title = current_title.fullText
return link_box(("[%s <span title=\"Recent edits and other changes to pages in %s\">အပြံၚ်လှာဲလက္ကရဴအိုတ်</span>]"):format(
tostring(full_url("Special:RecentChangesLinked", {
target = title,
showlinkedto = 0,
})),
title
))
end
local function show_pagelist(current)
local namespace = "namespace="
local info = current:getInfo()
local lang_code = info.code
if info.label == "citations" or info.label == "citations of undefined terms" then
namespace = namespace .. "Citations"
elseif lang_code then
local lang = require("Module:languages").getByCode(lang_code, true)
if lang then
-- Proto-Norse (gmq-pro) is probably the only language with a code ending in -pro
-- that's intended to have mostly non-reconstructed entries.
if (lang_code:find("%-pro$") and lang_code ~= "gmq-pro") or lang:hasType("reconstructed") then
namespace = namespace .. "ဗီုပြၚ်သိုၚ်တၟိ"
elseif lang:hasType("appendix-constructed") then
namespace = namespace .. "အဆက်လက္ကရဴ"
end
end
elseif info.label:match("ထာမ်ပလိက်") then
namespace = namespace .. "ထာမ်ပလိက်"
elseif info.label:match("မဝ်ဂျူ") then
namespace = namespace .. "မဝ်ဂျူ"
elseif info.label:match("^ဝိက်ရှေန်နရဳ") or info.label:match("^မုက်လိက်") then
namespace = ""
end
return ([=[
{| id="newest-and-oldest-pages" class="wikitable mw-collapsible" style="float: right; clear: both; margin: 0 0 .5em 1em;"
! မုက်လိက်တၟိကဵုတြေံအိုတ်
|-
| id="recent-additions" style="font-size:0.9em;" | '''မုက်လိက်တၟိအိုတ်မပလေဝ်ဒါန်လဝ်နူ[[mw:Manual:Categorylinks table#cl_timestamp|ကဏ္ဍလေန်ပ္တိုန်တၟိ]]:'''
%s
|-
| id="oldest-pages" style="font-size:0.9em;" | '''မုက်လိက်တြေံအိုတ်မပလေဝ်ဒါန်လဝ်လက္ကရဴအိုတ်:'''
%s
|}]=]):format(
current_frame:extensionTag(
"DynamicPageList",
([=[
category=%s
%s
count=10
mode=ordered
ordermethod=categoryadd
order=descending]=]
):format(current_title.text, namespace)
),
current_frame:extensionTag(
"DynamicPageList",
([=[
category=%s
%s
count=10
mode=ordered
ordermethod=lastedit
order=ascending]=]
):format(current_title.text, namespace)
)
)
end
-- Show navigational "breadcrumbs" at the top of the page.
local function show_breadcrumbs(current)
local steps = {}
-- Start at the current label and work our way up the "chain" from child to parent, until we can't go any further.
while current do
local category, display_name, nocap
if type(current) == "string" then
category = current
display_name = current:gsub("^ကဏ္ဍ:", "")
else
if not current.getCategoryName then
error("Internal error: Bad format in breadcrumb chain structure, probably a misformatted value for `parents`: " ..
mw.dumpObject(current))
end
category = "ကဏ္ဍ:" .. current:getCategoryName()
display_name, nocap = current:getBreadcrumbName()
end
if not nocap then
display_name = mw.getContentLanguage():ucfirst(display_name)
end
insert(steps, 1, ("[[:%s|%s]]"):format(category, display_name))
-- Move up the "chain" by one level.
if type(current) == "string" then
current = nil
else
current = current:getParents()
end
if current then
current = current[1].name
end
end
local templateStyles = require("Module:TemplateStyles")(category_tree_styles_css)
local ol = mw.html.create("ol")
for i, step in ipairs(steps) do
local li = mw.html.create("li")
if i ~= 1 then
local span = mw.html.create("span")
:attr("aria-hidden", "true")
:addClass("ts-categoryBreadcrumbs-separator")
:wikitext(" » ")
li:node(span)
end
li:wikitext(step)
ol:node(li)
end
return templateStyles .. tostring(mw.html.create("div")
:attr("role", "navigation")
:attr("aria-label", "Breadcrumb")
:addClass("ts-categoryBreadcrumbs")
:node(ol))
end
local function show_also(current)
local also = current._info.also
if also and #also > 0 then
return ('<div style="margin-top:-1em;margin-bottom:1.5em">%s</div>'):format(require("Module:also").main(also))
end
return nil
end
-- Show a short description text for the category.
local function show_description(current)
return current.getDescription and current:getDescription() or nil
end
local function show_appendix(current)
local appendix = current.getAppendix and current:getAppendix()
return appendix and ("ယဝ်ရထပ်နွံပၟိက်မိက်ဂွံတီဏီတှ်ေ၊ ဆက်ဗဵုအာ [[%s]]။"):format(appendix) or nil
end
local function sort_children(child1, child2)
return string_compare(uupper(child1.sort), uupper(child2.sort))
end
-- Show a list of child categories.
local function show_children(current)
local children = current.getChildren and current:getChildren() or nil
if not children then
return nil
end
sort(children, sort_children)
local children_list = {}
for _, child in ipairs(children) do
local child_name, child_pagetitle = child.name
if type(child_name) == "string" then
child_pagetitle = child_name
else
child_pagetitle = "ကဏ္ဍ:" .. child_name:getCategoryName()
end
if new_title(child_pagetitle).exists then
insert(children_list, ("* [[:%s]]: %s"):format(
child_pagetitle,
child.description or
type(child_name) == "string" and child_name:gsub("^ကဏ္ဍ:", "") .. "." or
child_name:getDescription("child")
))
end
end
return concat(children_list, "\n")
end
-- Show a table of contents with links to each letter in the language's script.
local function show_TOC(current)
local titleText = current_title.text
local inCategoryPages = pages_in_category(titleText, "pages")
local inCategorySubcats = pages_in_category(titleText, "subcats")
local TOC_type
-- Compute type of table of contents required.
if inCategoryPages > 2500 or inCategorySubcats > 2500 then
TOC_type = "full"
elseif inCategoryPages > 200 or inCategorySubcats > 200 then
TOC_type = "normal"
else
-- No (usual) need for a TOC if all pages or subcategories can fit on one page;
-- but allow this to be overridden by a custom TOC handler.
TOC_type = "none"
end
if current.getTOC then
local TOC_text = current:getTOC(TOC_type)
if TOC_text ~= true then
return TOC_text or nil
end
end
if TOC_type ~= "none" then
local templatename = current:getTOCTemplateName()
local TOC_template
if TOC_type == "full" then
-- This category is very large, see if there is a "full" version of the TOC.
local TOC_template_full = new_title(templatename .. "/full")
if TOC_template_full.exists then
TOC_template = TOC_template_full
end
end
if not TOC_template then
local TOC_template_normal = new_title(templatename)
if TOC_template_normal.exists then
TOC_template = TOC_template_normal
end
end
if TOC_template then
return current_frame:expandTemplate{title = TOC_template.text, args = {}}
end
end
return nil
end
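--[==[
The size thresholds used by show_TOC() can be isolated as a pure function. This is only an illustrative sketch; the
name `toc_type_for` is hypothetical and does not exist in this module.

```lua
-- Classify a category by its page and subcategory counts, using the same
-- thresholds as show_TOC(): more than 2500 is "full", more than 200 is "normal".
local function toc_type_for(pages, subcats)
	if pages > 2500 or subcats > 2500 then
		return "full"
	elseif pages > 200 or subcats > 200 then
		return "normal"
	end
	return "none"
end

print(toc_type_for(150, 3000))
```

Either count alone is enough to push a category into a larger TOC type.
]==]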
-- Show the "catfix" that adds language attributes and script classes to the page.
local function show_catfix(current)
local lang, sc = current:getCatfixInfo()
return lang and m_utilities.catfix(lang, sc) or nil
end
-- Show the parent categories that the current category should be placed in.
local function show_categories(current, categories)
local parents = current.getParents and current:getParents() or nil
if not parents then
return nil
end
for _, parent in ipairs(parents) do
local parent_name = parent.name
local sortkey = type(parent.sort) == "table" and parent.sort:makeSortKey() or parent.sort
if type(parent_name) == "string" then
insert(categories, ("[[%s|%s]]"):format(parent_name, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(parent_name:getCategoryName(), sortkey))
end
end
-- Also put the category in its corresponding "umbrella" or "by language" category.
local umbrella = current:getUmbrella()
if umbrella then
-- FIXME: use a language-neutral sorting function like the Unicode Collation Algorithm.
local sortkey = current._lang and current._lang:getCanonicalName() or current:getCategoryName()
sortkey = require("Module:languages").getByCode("mnw", true):makeSortKey(sortkey)
if type(umbrella) == "string" then
insert(categories, ("[[%s|%s]]"):format(umbrella, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(umbrella:getCategoryName(), sortkey))
end
end
-- Check for various unwanted parser functions, which should be integrated into the category tree data instead.
-- Note: HTML comments shouldn't be removed from `content` until after this step, as they can affect the result.
local content = current_title:getContent()
if not content then
-- This happens when using [[Special:ExpandTemplates]] to call {{auto cat}} on a nonexistent category page,
-- which is needed by Benwing's create_wanted_categories.py script.
return
end
local defaultsort, displaytitle, page_has_param
for node in parse(content):iterate_nodes() do
local node_class = class_else_type(node)
if node_class == "ထာမ်ပလိက်" then
local name = node:get_name()
if name == "DEFAULTSORT:" and not defaultsort then
insert(categories, "[[Category:Pages with DEFAULTSORT conflicts]]")
defaultsort = true
elseif name == "DISPLAYTITLE:" and not displaytitle then
insert(categories,"[[Category:Pages with DISPLAYTITLE conflicts]]")
displaytitle = true
end
elseif node_class == "parameter" and not page_has_param then
insert(categories,"[[Category:Pages with raw triple-brace template parameters]]")
page_has_param = true
end
end
-- Check for raw category markup, which should also be integrated into the category tree data.
content = remove_comments(content, "BOTH")
local head = content:find("[[", 1, true)
while head do
local close = content:find("]]", head + 2, true)
if not close then
break
end
-- Make sure there are no intervening "[[" between head and close.
local open = content:find("[[", head + 2, true)
while open and open < close do
head = open
open = content:find("[[", head + 2, true)
end
local cat = content:sub(head + 2, close - 1)
local colon = cat:match("^[ _\128-\244]*[Cc][Aa][Tt][EeGgOoRrYy _\128-\244]*():")
if colon then
local pipe = cat:find("|", colon + 1, true)
if pipe ~= #cat then
local title = new_title(pipe and cat:sub(1, pipe - 1) or cat)
if title and title.namespace == 14 then
insert(categories,"[[Category:Categories with categories using raw markup]]")
break
end
end
end
head = open
end
end
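--[==[
The raw-markup check at the end of show_categories() scans for the innermost "[[...]]" spans using plain-text
string.find. A minimal sketch of just that scanning loop, assuming a hypothetical helper `innermost_links` that
collects the span bodies instead of testing them for a category prefix:

```lua
-- Collect the text of each innermost [[...]] span, mirroring the scan in show_categories().
local function innermost_links(content)
	local links = {}
	local head = content:find("[[", 1, true)
	while head do
		local close = content:find("]]", head + 2, true)
		if not close then
			break
		end
		-- Skip forward past any "[[" that opens before the first "]]".
		local open = content:find("[[", head + 2, true)
		while open and open < close do
			head = open
			open = content:find("[[", head + 2, true)
		end
		links[#links + 1] = content:sub(head + 2, close - 1)
		head = open
	end
	return links
end

local found = innermost_links("a [[x [[Category:Foo|bar]] y]] b [[Baz]]")
print(table.concat(found, ", "))
```

Passing `true` as the fourth argument to find() disables pattern matching, so "[" is treated literally.
]==]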
local function generate_output(current)
if current then
for _, functionName in pairs{
"getBreadcrumbName",
"getDataModule",
"canBeEmpty",
"getDescription",
"getParents",
"getChildren",
"getUmbrella",
"getAppendix",
"getTOCTemplateName",
} do
if not is_callable(current[functionName]) then
require("Module:debug").track{"category tree/missing function", "category tree/missing function/" .. functionName}
end
end
end
local boxes, display, categories = {}, {}, {}
-- Categories should never show files as a gallery.
insert(categories, "__NOGALLERY__")
if current_frame:getParent():getTitle() == "ထာမ်ပလိက်:auto cat" then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍပွမမကော်ခဴ ထာမ်ပလိက်:ဗ္ဂဲအဝ်တဝ်ဂမၠိုၚ်]]")
end
-- Check if the category is empty
local totalPages = pages_in_category(current_title.text, "all")
local hugeCategory = totalPages > 1000000 -- 1 million
-- Categorize huge categories, as they cause DynamicPageList to time out and make the category inaccessible.
if hugeCategory then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍၝောံယာဲဂမၠိုၚ်]]")
end
-- Are the parameters valid?
if not current then
insert(categories, "[[Category:Categories that are not defined in the category tree]]")
insert(categories, totalPages == 0 and "[[ကဏ္ဍ:ကဏ္ဍသၠးဒၟံၚ်ဂမၠိုၚ်]]" or nil)
insert(display, show_error(
"Double-check the category name for typos. <br>" ..
"[[Special:Search/ကဏ္ဍ: " .. current_title.text:gsub("^.+:", ""):gsub(" ", "~2 ") .. '~2|Search existing categories]] to check if this category should be created under a different name (for example, "Fruits" instead of "Fruit"). <br>' ..
"To add a new category to Wiktionary's category tree, please consult " .. current_frame:expandTemplate{title = "section link", args = {
"Help:Category#How_to_create_a_category",
}} .. "."))
-- Exit here, as all code beyond here relies on current not being nil
return concat(categories, "") .. concat(display, "\n\n"), true
end
-- Does the category have the correct name?
local currentName = current:getCategoryName()
local correctName = current_title.text == currentName
if not correctName then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍမနွံကဵုယၟုဟွံဒးရးဂမၠိုၚ်]]")
insert(display, show_error(("Based on the data in the category tree, this category should be called '''[[:Category:%s]]'''."):format(currentName)))
end
-- Add cleanup category for empty categories.
local canBeEmpty = current:canBeEmpty()
if canBeEmpty and correctName then
insert(categories, " __EXPECTUNUSEDCATEGORY__")
elseif totalPages == 0 then
insert(categories, "[[ကဏ္ဍ:ကဏ္ဍသၠးဒၟံၚ်ဂမၠိုၚ်]]")
end
if current:isHidden() then
insert(categories, "__HIDDENCAT__")
end
-- Put all the float-right stuff into a <div> that does not clear, so that float-left stuff like the breadcrumbs and
-- description can go opposite the float-right stuff without vertical space.
insert(boxes, "<div style=\"float: right;\">")
insert(boxes, show_topright(current))
insert(boxes, show_editlink(current))
insert(boxes, show_related_changes())
-- Show pagelist, unless it's a huge category (since they can't use DynamicPageList - see above).
if not hugeCategory then
insert(boxes, show_pagelist(current))
end
insert(boxes, "</div>")
-- Generate the displayed information
insert(display, show_breadcrumbs(current))
insert(display, show_also(current))
insert(display, show_description(current))
insert(display, show_appendix(current))
insert(display, show_children(current))
insert(display, show_TOC(current))
insert(display, show_catfix(current))
insert(display, '<br class="clear-both-in-vector-2022-only">')
show_categories(current, categories)
return concat(boxes, "\n") .. "\n" .. concat(display, "\n\n") .. concat(categories, "")
end
--[==[
List of handler functions that try to match the page name. A handler should return the name of a submodule to
[[Module:category tree]] and an info table which is passed as an argument to the submodule. If a handler does not
recognize the page name, it should return nil. Note that the order of handlers matters!
]==]
local handlers = {}
-- Thesaurus per-language category
insert(handlers, function(title)
local code, label = title:match("^အဘိဓာန်:(%l[%a-]*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Topic per-language category
insert(handlers, function(title)
local code, label = title:match("^(%l[%a-]*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Lect category e.g. for [[:Category:New Zealand English]] or [[:Category:Issime Walser]]
insert(handlers, function(title, args)
local lect = args.lect or args.dialect
if lect ~= "" and yesno(lect, true) then -- Same as boolean in [[Module:parameters]].
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end
end)
-- poscatboiler per-language label, e.g. [[Category:English non-lemma forms]]
insert(handlers, function(title, args)
local lang, label = export.split_lang_label(title)
if not lang then
return
end
local baseLabel, script = label:match("(.+) in (.-) script$")
if script and baseLabel ~= "ဝေါဟာ" then
local scriptObj = require("Module:scripts").getByCanonicalName(script)
if scriptObj then
return poscatboiler_subsystem, {label = baseLabel, code = lang:getCode(), sc = scriptObj:getCode(), args = args}
end
end
return poscatboiler_subsystem, {label = label, code = lang:getCode(), args = args}
end)
-- poscatboiler label umbrella category
insert(handlers, function(title, args)
local label = title:match("^ဗက်အလိုက်အရေဝ်ဘာသာဂမၠိုၚ်")
if label then
-- The poscatboiler code will appropriately lowercase if needed.
return poscatboiler_subsystem, {label = label, args = args}
end
end)
-- poscatboiler raw handlers
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end)
-- poscatboiler umbrella handlers without 'by language'
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args}
end)
function export.show(frame)
local args, other_args = require("Module:parameters").process(frame:getParent().args, {
["also"] = {type = "title", sublist = "comma without whitespace", namespace = 14}
}, true)
if args.also then
for k, arg in next, args.also do
args.also[k] = arg.prefixedText
end
end
for k, arg in next, other_args do
other_args[k] = trim(arg)
end
if namespace == 10 then -- Template
return "(This template should be used on pages in the [[Help:Namespaces#Category|Category:]] namespace.)"
elseif namespace ~= 14 then -- Anything other than Category
error("This template/module can only be used on pages in the [[mw:Help:Namespaces#Category|Category:]] namespace.")
end
local first_fail_args_handled, first_fail_cattext
-- Go through each handler in turn. If a handler doesn't recognize the format of the category, it will return nil,
-- and we will consider the next handler. Otherwise, it returns a template name and arguments to call it with, but
-- even then, that template might return an error, and we need to consider the next handler. This happens, for
-- example, with the category "CAT:Mato Grosso, Brazil", where "Mato" is the name of a language, so the poscatboiler
-- per-language label handler fires and tries to find a label "Grosso, Brazil". This throws an error, and
-- previously, this blocked further handler consideration, but now we check for the error and continue checking
-- handlers; eventually, the topic umbrella handler will fire and correctly handle the category.
for _, handler in ipairs(handlers) do
-- Pass each handler the title text and a fresh copy of the args table, to keep the handlers isolated.
local submodule, info = handler(current_title.text, deep_copy(other_args))
if submodule then
info.also = deep_copy(args.also)
require("Module:debug").track("auto cat/" .. submodule)
submodule = require(category_tree_submodule_prefix .. submodule)
-- `failed` is true if no match was found.
local cattext, failed = generate_output(submodule.main(info))
if failed then
if not first_fail_cattext then
first_fail_cattext = cattext
first_fail_args_handled = info.args and true or false
end
elseif not info.args and next(other_args) then
error(extra_args_error)
else
return cattext
end
end
end
-- If there were no matches, throw an error if any arguments were given, or otherwise return the cattext
-- from the first fail encountered. The final handlers call the boilers unconditionally, so there should
-- always be something to return.
if not first_fail_args_handled and next(other_args) then
error(extra_args_error)
end
return first_fail_cattext
end
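--[==[
The loop in export.show() tries each handler in order, keeps the first failed result as a fallback, and returns
the first successful one. A minimal sketch of that control flow, assuming hypothetical handlers that return a plain
result table with a `failed` flag (the real handlers return a submodule name and an info table):

```lua
-- Hypothetical handlers: each returns a result table, or nil if it does not recognize the title.
local handlers = {
	function(title)
		local code = title:match("^(%l%l):")
		if code then
			return {kind = "topic", failed = (code ~= "en")}
		end
	end,
	function(title)
		return {kind = "fallback", failed = (title == "")}
	end,
}

-- Return the first successful result; otherwise the first failure encountered.
local function dispatch(title)
	local first_fail
	for _, handler in ipairs(handlers) do
		local result = handler(title)
		if result then
			if not result.failed then
				return result
			end
			first_fail = first_fail or result
		end
	end
	return first_fail
end

print(dispatch("xx:Fruits").kind)
```

Because the last handler accepts any title, dispatch() always has something to return, which matches the remark
above that the final handlers call the boilers unconditionally.
]==]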
-- TODO: new test entrypoint.
return export
မဝ်ဂျူ:category tree/poscatboiler
828
1144
385642
385622
2026-04-02T17:44:38Z
咽頭べさ
33
385642
Scribunto
text/plain
local lang_independent_data = require("Module:category tree/data")
local lang_specific_module = "Module:category tree/lang"
local lang_specific_module_prefix = lang_specific_module .. "/"
local family_specific_module = "Module:category tree/fam"
local family_specific_module_prefix = family_specific_module .. "/"
local labels_utilities_module = "Module:labels/utilities"
local template_parser_module = "Module:template parser"
local concat = table.concat
local dump = mw.dumpObject
local expand_template = require("Module:frame").expandTemplate
local insert = table.insert
local is_callable = require("Module:fun").is_callable
local lcfirst = require("Module:string utilities").lcfirst
local list_to_set = require("Module:table").listToSet
local make_title = mw.title.makeTitle
local new_title = mw.title.new
local parse = require(template_parser_module).parse
local sparse_concat = require("Module:table").sparseConcat
local tostring = tostring
local type = type
local ucfirst = require("Module:string utilities").ucfirst
local uupper = require("Module:string utilities").upper
local function internal_error(msg)
error("Internal error: " .. msg)
end
local function get_lang(...)
local _get_lang = require("Module:languages").getByCode
function get_lang(...)
return _get_lang(...) or require("Module:languages/errorGetBy").code(...)
end
return get_lang(...)
end
local function get_script(...)
local _get_script = require("Module:scripts").getByCode
function get_script(code)
return _get_script(code) or require("Module:languages/error")(code, true, "script code")
end
return get_script(...)
end
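--[==[
get_lang() and get_script() above use a self-replacing loader: the first call requires the dependency, then rebinds
the local function name to a cheaper closure. A minimal sketch of the pattern, assuming a hypothetical `load_table`
in place of require("Module:languages"):

```lua
local load_count = 0

-- Hypothetical expensive loader standing in for a require() call.
local function load_table()
	load_count = load_count + 1
	return {en = "English", fr = "French"}
end

local function get_name(code)
	local names = load_table()
	-- Rebind the local `get_name` so that later calls skip the loading step.
	function get_name(code)
		return names[code]
	end
	return get_name(code)
end

print(get_name("en"), get_name("fr"), load_count)
```

This works because `local function get_name` puts the name in scope inside its own body, so the inner
`function get_name` assigns to that same local rather than creating a global.
]==]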
-- Category object
local Category = {}
Category.__index = Category
function Category:get_originating_info()
local originating_info = ""
if self._info.originating_label then
originating_info = " (originating from label \"" .. self._info.originating_label .. "\" in module [[" .. self._info.originating_module .. "]])"
end
return originating_info
end
local valid_keys = list_to_set{"code", "label", "sc", "raw", "args", "also", "called_from_inside", "originating_label", "originating_module"}
function Category.new(info)
for key in pairs(info) do
if not valid_keys[key] then
internal_error("The parameter \"" .. key .. "\" was not recognized.")
end
end
local self = setmetatable({}, Category)
self._info = info
if not self._info.label then
internal_error("No label was specified.")
end
self:initCommon()
if not self._data then
internal_error("The " .. (self._info.raw and "raw " or "") .. "label \"" .. self._info.label .. "\" does not exist" .. self:get_originating_info() .. ".")
end
return self
end
function Category:initCommon()
local function patch_args(args)
-- This works around Scribunto automatically converting numeric table keys
-- to strings, which in turn causes errors when argument names that should
-- be numbers arrive as strings. Keys that look like canonical positive
-- integers (e.g. "1", "12") are converted back to numbers, recursively.
if type(args) ~= "table" then
return args
end
local new_args = {}
for k, v in pairs(args) do
if type(k) == "string" and string.len(k) < 10 and not string.match(k, "^0") and string.match(k, "^%d+$") then
new_args[tonumber(k)] = patch_args(v)
else
new_args[k] = patch_args(v)
end
end
return new_args
end
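--[==[
The effect of patch_args() can be seen in isolation. This sketch copies its logic into a hypothetical standalone
function `patch_keys` and runs it on a small table:

```lua
-- Convert string keys that look like canonical positive integers back to numbers, recursively.
local function patch_keys(args)
	if type(args) ~= "table" then
		return args
	end
	local patched = {}
	for k, v in pairs(args) do
		if type(k) == "string" and #k < 10 and not k:match("^0") and k:match("^%d+$") then
			patched[tonumber(k)] = patch_keys(v)
		else
			patched[k] = patch_keys(v)
		end
	end
	return patched
end

local patched = patch_keys({["1"] = "a", ["02"] = "b", name = {["3"] = "c"}})
print(patched[1], patched["02"], patched.name[3])
```

Keys with a leading zero such as "02" are left as strings, since converting them would not round-trip.
]==]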
local args_handled = false
if self._info.raw then
-- Check if the category exists
local raw_categories = lang_independent_data["RAW_CATEGORIES"]
self._data = raw_categories[self._info.label]
if self._data then
if self._data.lang then
self._lang = get_lang(self._data.lang, nil, true)
self._info.code = self._lang:getCode()
end
if self._data.sc then
self._sc = get_script(self._data.sc)
self._info.sc = self._sc:getCode()
end
else
-- Go through raw handlers
local data = {
category = self._info.label,
args = patch_args(self._info.args) or {},
called_from_inside = self._info.called_from_inside,
}
for _, handler in ipairs(lang_independent_data["RAW_HANDLERS"]) do
self._data, args_handled = handler.handler(data)
if self._data then
self._data.module = self._data.module or handler.module
break
end
end
if self._data then
-- Update the label if the handler specified a canonical name for it.
if self._data.canonical_name then
self._info.canonical_name = self._data.canonical_name
end
if self._data.lang then
if type(self._data.lang) ~= "string" then
internal_error("Received non-string value " .. dump(self._data.lang) .. " for self._data.lang, label \"" .. self._info.label .. "\"" .. self:get_originating_info() .. ".")
end
self._lang = get_lang(self._data.lang, nil, true)
self._info.code = self._lang:getCode()
end
if self._data.sc then
if type(self._data.sc) ~= "string" then
internal_error("Received non-string value " .. dump(self._data.sc) .. " for self._data.sc, label \"" .. self._info.label .. "\"" .. self:get_originating_info() .. ".")
end
self._sc = get_script(self._data.sc)
self._info.sc = self._sc:getCode()
end
end
end
else
-- Already parsed into language + label
if self._info.code then
self._lang = get_lang(self._info.code, nil, true)
else
self._lang = nil
end
if self._info.sc then
self._sc = get_script(self._info.sc)
else
self._sc = nil
end
self._info.orig_label = self._info.label
if not self._lang then
-- Umbrella categories without a preceding language always begin with a capital letter, but the actual label may be
-- lowercase (cf. [[:Category:Nouns by language]] with label 'nouns' with per-language [[:Category:English nouns]];
-- but [[:Category:Reddit slang by language]] with label 'Reddit slang' with per-language
-- [[:Category:English Reddit slang]]). Since the label is almost always lowercase, we lowercase it for umbrella
-- categories, storing the original into `orig_label`, and correct it later if needed.
self._info.label = lcfirst(self._info.label)
end
-- First, check lang-specific labels and handlers if this is not an umbrella category.
if self._lang then
local objects_with_modules = require(lang_specific_module)
local obj, seen = self._lang, {}
local object_specific_module_prefix = lang_specific_module_prefix
local is_family = false
repeat
if objects_with_modules[obj:getCode()] then
local module = object_specific_module_prefix .. obj:getCode()
local labels_and_handlers = require(module)
if labels_and_handlers.LABELS then
self._data = labels_and_handlers.LABELS[self._info.label]
if self._data then
if not is_family and self._data.umbrella == nil and self._data.umbrella_parents == nil then
self._data.umbrella = false
end
self._data.module = self._data.module or module
end
end
if not self._data and labels_and_handlers.HANDLERS then
for _, handler in ipairs(labels_and_handlers.HANDLERS) do
local data = {
label = self._info.label,
lang = self._lang,
sc = self._sc,
args = patch_args(self._info.args) or {},
called_from_inside = self._info.called_from_inside,
}
self._data, args_handled = handler(data)
if self._data then
if not is_family and self._data.umbrella == nil and
self._data.umbrella_parents == nil then
self._data.umbrella = false
end
self._data.module = self._data.module or module
break
end
end
end
if self._data then
break
end
end
seen[obj:getCode()] = true
obj = obj:getFamily()
if not is_family then
is_family = true
object_specific_module_prefix = family_specific_module_prefix
objects_with_modules = require(family_specific_module)
end
until not obj or seen[obj:getCode()]
end
local function fetch_label_data(labels)
self._data = labels[self._info.label]
-- See comment above about uppercase- vs. lowercase-initial labels, which are indistinguishable
-- in umbrella categories.
if not self._data then
self._data = labels[self._info.orig_label]
if self._data then
self._info.label = self._info.orig_label
end
end
end
-- Then check lang-independent labels.
if not self._data then
-- lang_independent_data.LABELS should always exist.
fetch_label_data(lang_independent_data.LABELS)
if not self._data and not self._lang then
-- Check family-specific labels for umbrella label.
local families_with_modules = require(family_specific_module)
for famcode, _ in pairs(families_with_modules) do
local module = family_specific_module_prefix .. famcode
local labels_and_handlers = require(module)
if labels_and_handlers.LABELS then
fetch_label_data(labels_and_handlers.LABELS)
if self._data then
self._data.module = self._data.module or module
break
end
end
end
end
end
-- Then check lang-independent handlers.
if not self._data then
local data = {
label = self._info.label,
lang = self._lang,
sc = self._sc,
args = patch_args(self._info.args) or {},
called_from_inside = self._info.called_from_inside,
}
for _, handler in ipairs(lang_independent_data["HANDLERS"]) do
self._data, args_handled = handler.handler(data)
if self._data then
self._data.module = self._data.module or handler.module
break
end
end
if not self._data and not self._lang then
-- Check family-specific handlers for umbrella label.
local families_with_modules = require(family_specific_module)
for famcode, _ in pairs(families_with_modules) do
local module = family_specific_module_prefix .. famcode
local labels_and_handlers = require(module)
if labels_and_handlers.HANDLERS then
for _, handler in ipairs(labels_and_handlers.HANDLERS) do
local data = {
label = self._info.label,
sc = self._sc,
args = patch_args(self._info.args) or {},
called_from_inside = self._info.called_from_inside,
}
self._data, args_handled = handler(data)
if self._data then
self._data.module = self._data.module or module
break
end
end
end
if self._data then
break
end
end
end
end
end
if not args_handled and self._data and self._info.args and next(self._info.args) then
local module_text = " (handled in [[" .. (self._data.module or "UNKNOWN").. "]])"
local args_text = {}
for k, v in pairs(self._info.args) do
insert(args_text, k .. "=" .. ((type(v) == "string" or type(v) == "number") and v or dump(v)))
end
error("poscatboiler label '" .. self._info.label .. "' " .. module_text .. " doesn't accept extra args " ..
concat(args_text, ", "))
end
if self._sc and not self._lang then
internal_error("Umbrella categories cannot have a script specified.")
end
end
function Category:convert_spec_to_string(desc)
if not desc then
return desc
end
local desc_type = type(desc)
if desc_type == "string" then
return desc
elseif desc_type == "number" then
return tostring(desc)
elseif not is_callable(desc) then
internal_error("`desc` must be a string, number, function, callable table or nil; received " .. dump(desc))
end
desc = desc {
lang = self._lang,
sc = self._sc,
label = self._info.label,
raw = self._info.raw,
}
if not desc then
return desc
end
desc_type = type(desc)
if desc_type == "string" then
return desc
end
internal_error("The value returned by `desc` must be a string or nil; received " .. dump(desc))
end
local function add_obj_args(args, obj, obj_type)
if obj then
args[obj_type .. "code"] = obj:getCode()
args[obj_type .. "name"] = obj:getCanonicalName()
args[obj_type .. "disp"] = obj:getDisplayForm()
args[obj_type .. "cat"] = obj:getCategoryName()
args[obj_type .. "link"] = obj:makeCategoryLink()
end
end
-- Expands `desc` like a template, passing values for specs like {{{langname}}}.
function Category:substitute_template_specs(desc)
-- This may end up happening twice but that's OK as the function is (usually) idempotent.
-- FIXME: Not idempotent if a preprocessed template returns wikicode.
desc = self:convert_spec_to_string(desc)
if not desc then
return nil
end
-- Populate the substitution arguments.
local args = {}
args.umbrella_msg = "This is an umbrella category. It contains no dictionary entries, but only other, language-specific categories, which in turn contain relevant terms in a given language."
args.umbrella_meta_msg = "This is an umbrella metacategory, covering a general area such as \"lemmas\", \"names\" or \"terms by etymology\". It contains no dictionary entries, but holds only umbrella (\"by language\") categories covering specific subtopics, which in turn contain language-specific categories holding terms in a given language for that same topic."
add_obj_args(args, self._lang, "lang")
add_obj_args(args, self._sc, "sc")
return parse(desc, true):expand(args)
end
function Category:substitute_template_specs_in_args(args)
if not args then
return args
end
local pinfo = {}
for k, v in pairs(args) do
pinfo[self:substitute_template_specs(k)] = self:substitute_template_specs(v)
end
return pinfo
end
function Category:make_new(info)
info.originating_label = self._info.label
info.originating_module = self._data.module
info.called_from_inside = true
return Category.new(info)
end
function Category:getBreadcrumbName()
local ret
if self._lang or self._info.raw then
ret = self._data.breadcrumb or self._data.breadcrumb_and_first_sort_key or
self._data.breadcrumb_and_first_sort_base or nil
else
ret = self._data.umbrella and (self._data.umbrella.breadcrumb or
self._data.umbrella.breadcrumb_and_first_sort_key or self._data.umbrella.breadcrumb_and_first_sort_base) or
nil
end
if not ret then
ret = self._info.label
end
if type(ret) ~= "table" then
ret = {name = ret}
end
local name = self:substitute_template_specs(ret.name)
local nocap = ret.nocap
if self._sc then
name = name .. " ပ္ဍဲ " .. self._sc:getDisplayForm()
end
return name, nocap
end
local function expand_toc_template_if(template)
local template_obj = new_title(template, 10)
if template_obj.exists then
return expand_template{title = template_obj.text}
end
return nil
end
-- Return the textual expansion of the first existing template among the given templates, first performing
-- substitutions on the template name such as replacing {{{langcode}}} with the current language's code (if any).
-- If no templates exist after expansion, or if nil is passed in, return nil. If a single string is passed in,
-- treat it like a one-element list consisting of that string.
function Category:get_template_text(templates)
if templates == nil then
return nil
elseif type(templates) ~= "table" then
templates = {templates}
end
for _, template in ipairs(templates) do
if template == false then
return false
end
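template = self:substitute_template_specs(template)
-- Only return if the template exists; otherwise try the next candidate.
local text = expand_toc_template_if(template)
if text then
return text
end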
end
return nil
end
function Category:getTOC(toc_type)
-- Type "none" means everything fits on a single page; in that case, display nothing.
if toc_type == "none" then
return nil
end
local templates, fallback_templates
-- If TOC type is "full" (more than 2500 entries), do the following, in order:
-- 1. look up and expand the `toc_template_full` templates (normal or umbrella, depending on whether there is
-- a current language);
-- 2. look up and expand the `toc_template` templates (normal or umbrella, as above);
-- 3. do the default behavior, which is as follows:
-- 3a. look up a language-specific "full" template according to the current language (using English if there
-- is no current language);
-- 3b. look up a script-specific "full" template according to the first script of the current language (using English
-- if there is no current language);
-- 3c. look up a language-specific "normal" template according to the current language (using English if there
-- is no current language);
-- 3d. look up a script-specific "normal" template according to the first script of the current language (using
-- English if there is no current language);
-- 3e. display nothing.
--
-- If TOC type is "normal" (between 200 and 2500 entries), do the following, in order:
-- 1. look up and expand the `toc_template` templates (normal or umbrella, depending on whether there is
-- a current language);
-- 2. do the default behavior, which is as follows:
-- 2a. look up a language-specific "normal" template according to the current language (using English if there
-- is no current language);
-- 2b. look up a script-specific "normal" template according to the first script of the current language (using
-- English if there is no current language);
-- 2c. display nothing.
local data_source
if self._lang or self._info.raw then
data_source = self._data
else
data_source = self._data.umbrella
end
if data_source then
if toc_type == "full" then
templates = data_source.toc_template_full
fallback_templates = data_source.toc_template
else
templates = data_source.toc_template
end
end
local text = self:get_template_text(templates)
if text then
return text
elseif text == false then
return nil
end
text = self:get_template_text(fallback_templates)
if text then
return text
elseif text == false then
return nil
end
local default_toc_templates_to_check = {}
local lang, sc = self:getCatfixInfo()
-- Fall back to English and Latin script if there is no current language.
local langcode = lang and lang:getCode() or "en"
local sccode = sc and sc:getCode() or lang and lang:getScriptCodes()[1] or "Latn"
-- FIXME: What is toctemplateprefix used for?
local tocname = (self._data.toctemplateprefix or "") .. "categoryTOC"
if toc_type == "full" then
insert(default_toc_templates_to_check, ("%s-%s/full"):format(langcode, tocname))
insert(default_toc_templates_to_check, ("%s-%s/full"):format(sccode, tocname))
end
insert(default_toc_templates_to_check, ("%s-%s"):format(langcode, tocname))
insert(default_toc_templates_to_check, ("%s-%s"):format(sccode, tocname))
for _, toc_template in ipairs(default_toc_templates_to_check) do
local toc_template_text = expand_toc_template_if(toc_template)
if toc_template_text then
return toc_template_text
end
end
return nil
end
function Category:getInfo()
return self._info
end
function Category:getDataModule()
return self._data.module
end
function Category:canBeEmpty()
if self._lang or self._info.raw then
return self._data.can_be_empty
end
return self._data.umbrella and self._data.umbrella.can_be_empty
end
function Category:isHidden()
if self._lang or self._info.raw then
return self._data.hidden
end
return self._data.umbrella and self._data.umbrella.hidden
end
function Category:getCategoryName()
if self._info.raw then
return self._info.canonical_name or self._info.label
elseif self._lang then
local ret = self._info.label .. self._lang:getCanonicalName() .. "ဂမၠိုၚ်"
if self._sc then
ret = ret .. " in " .. self._sc:getDisplayForm()
end
return ucfirst(ret)
end
local ret = ucfirst(self._info.label)
if not (self._data.no_by_language or self._data.umbrella and self._data.umbrella.no_by_language) then
ret = ret .. " by language"
end
return ret
end
function Category:getTopright()
if self._lang or self._info.raw then
return self:substitute_template_specs(self._data.topright)
end
return self._data.umbrella and self:substitute_template_specs(self._data.umbrella.topright)
end
function Category:display_title(displaytitle, lang)
if type(displaytitle) == "string" then
displaytitle = self:substitute_template_specs(displaytitle)
else
displaytitle = displaytitle(self:getCategoryName(), lang)
end
mw.getCurrentFrame():callParserFunction("DISPLAYTITLE", "ကဏ္ဍ:" .. displaytitle)
end
function Category:get_labels_categorizing()
local m_labels_utilities = require(labels_utilities_module)
local pos_cat_labels, sense_cat_labels, use_tlb
pos_cat_labels = m_labels_utilities.find_labels_for_category(self._info.label, "pos", self._lang)
local sense_label = self._info.label:match("^(.*) terms$")
if sense_label then
use_tlb = true
else
sense_label = self._info.label:match("^ဝေါဟာမနွံ (.*) senses$")
end
if not sense_label then
return nil
end
sense_cat_labels = m_labels_utilities.find_labels_for_category(sense_label, "sense", self._lang)
if use_tlb then
return m_labels_utilities.format_labels_categorizing(pos_cat_labels, sense_cat_labels, self._lang)
end
local all_labels = pos_cat_labels
for k, v in pairs(sense_cat_labels) do
all_labels[k] = v
end
return m_labels_utilities.format_labels_categorizing(all_labels, nil, self._lang)
end
-- FIXME: this is clunky.
local function remove_lang_params(desc)
-- Simply remove a language name/code/category from the beginning of the string, but replace the language name
-- in the middle of the string with either "specific languages" or "specific-language" depending on whether the
-- language name appears to be an attributive qualifier of another noun or to stand by itself. This may be wrong,
-- in which case the category in question should supply its own umbrella description.
desc = desc:gsub("^{{{langname}}} ", "")
:gsub("{{{langname}}} %(", "specific languages (")
:gsub("{{{langname}}}([.,])", "specific languages%1")
:gsub("{{{langname}}} ", "specific-language ")
:gsub("{{{langdisp}}}", "specific languages")
:gsub("{{{langlink}}}", "specific languages")
return desc
end
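-- Illustration only (not executed): a minimal sketch of what remove_lang_params() does to a few
-- hypothetical description strings, assuming the substitutions above are applied in the order written.
--   remove_lang_params("{{{langname}}} nouns.")             --> "nouns."
--   remove_lang_params("Terms in {{{langname}}}.")          --> "Terms in specific languages."
--   remove_lang_params("Terms with {{{langname}}} idioms.") --> "Terms with specific-language idioms."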
function Category:getDescription(isChild)
-- Allows different text in the list of a category's children
local isChild = isChild == "child"
if self._lang or self._info.raw then
if not isChild and self._data.displaytitle then
self:display_title(self._data.displaytitle, self._lang)
end
if self._sc then
return self:getCategoryName() .. "ဂမၠိုၚ်။"
end
local desc = self:substitute_template_specs(self._data.description)
if not desc then
return nil
elseif isChild then
return desc
end
return sparse_concat({
self:substitute_template_specs(self._data.preceding),
desc,
self:substitute_template_specs(self._data.additional),
self:substitute_template_specs(self:get_labels_categorizing()),
}, "\n\n")
end
local umbrella = self._data.umbrella
if not isChild and umbrella and umbrella.displaytitle then
self:display_title(umbrella.displaytitle)
end
local desc = self:substitute_template_specs(umbrella and umbrella.description)
local has_umbrella_desc = not not desc
if not desc then
desc = self:convert_spec_to_string(self._data.description)
if desc then
desc = remove_lang_params(desc)
desc = lcfirst(desc)
desc = desc:gsub("%.$", "")
desc = "Categories with " .. desc .. "."
else
desc = "Categories with " .. self._info.label .. " in various specific languages."
end
desc = self:substitute_template_specs(desc)
end
if isChild then
return desc
end
return sparse_concat({
self:substitute_template_specs(umbrella and umbrella.preceding or not has_umbrella_desc and self._data.preceding),
desc,
self:substitute_template_specs(umbrella and umbrella.additional or not has_umbrella_desc and self._data.additional),
self:substitute_template_specs("{{{umbrella_msg}}}"),
self:substitute_template_specs(self:get_labels_categorizing()),
}, "\n\n")
end
function Category:new_sortkey(sortkey)
local sortkey_type = type(sortkey)
if sortkey_type == "string" then
sortkey = uupper(sortkey)
elseif sortkey_type == "table" then
function sortkey:makeSortKey()
local sort_func = self.sort_func
if sort_func ~= nil then
return sort_func(self.sort_base)
end
local lang = self.lang
if lang == nil then
return self.sort_base
end
lang = get_lang(lang, nil, true)
if lang == nil then
return self.sort_base
end
local sc = self.sc
if sc ~= nil then
sc = get_script(sc)
end
return lang:makeSortKey(self.sort_base, sc)
end
end
return sortkey
end
function Category:inherit_spec(spec, parent_spec, substitute_result)
if spec == false then
return nil
end
local retval = spec or parent_spec
if substitute_result then
retval = self:substitute_template_specs(retval)
end
return retval
end
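-- Illustration only (not executed): how inherit_spec() resolves a value, shown with hypothetical arguments
-- and with `substitute_result` left unset so that no template substitution takes place.
--   self:inherit_spec(false, "fr") --> nil   (an explicit `false` blocks inheritance)
--   self:inherit_spec(nil, "fr")   --> "fr"  (nothing given, so the parent's value is used)
--   self:inherit_spec("de", "fr")  --> "de"  (an explicit value wins over the parent's)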
function Category:canonicalize_parents_children(cats, is_children, fallback_sort_key, fallback_sort_base)
if not cats then
return nil
elseif type(cats) == "table" then
if cats.name or cats.module then
cats = {cats}
elseif #cats == 0 then
return nil
end
else
cats = {cats}
end
local ret = {}
for _, cat in ipairs(cats) do
if type(cat) ~= "table" or not cat.name and not cat.module then
cat = {name = cat}
end
insert(ret, cat)
end
local is_umbrella = not self._lang and not self._info.raw
local table_type = is_children and "extra_children" or "parents"
for i, cat in ipairs(ret) do
local raw
if self._info.raw or is_umbrella then
raw = not cat.is_label
else
raw = cat.raw
end
local lang = self:inherit_spec(cat.lang, not raw and self._info.code or nil, "substitute")
local sc = self:inherit_spec(cat.sc, not raw and self._info.sc or nil, "substitute")
-- Get the sortkey.
local sortkey = self:inherit_spec(cat.sort, i == 1 and (fallback_sort_key or fallback_sort_base and {sort_base = fallback_sort_base}) or nil)
if type(sortkey) == "table" then
sortkey.sort_base = self:substitute_template_specs(sortkey.sort_base) or
internal_error("Missing .sort_base in '" .. table_type .. "' .sort table for '" ..
self._info.label .. "' category entry in module '" .. (self._data.module or "unknown") .. "'")
if sortkey.sort_func then
-- Not allowed to give a lang and/or script if sort_func is given.
local bad_spec = sortkey.lang and "lang" or sortkey.sc and "sc" or nil
if bad_spec then
internal_error("Cannot specify both ." .. bad_spec .. " and .sort_func in '" .. table_type ..
"' .sort table for '" .. self._info.label .. "' category entry in module '" ..
(self._data.module or "unknown") .. "'")
end
else
sortkey.lang = self:inherit_spec(sortkey.lang, lang, "substitute")
sortkey.sc = self:inherit_spec(sortkey.sc, sc, "substitute")
end
else
sortkey = self:substitute_template_specs(sortkey)
end
local name
if cat.module then
-- A reference to a category using another category tree module.
if not cat.args then
internal_error("Missing .args in '" .. table_type .. "' table with module=\"" .. cat.module .. "\" for '" ..
self._info.label .. "' category entry in module '" .. (self._data.module or "unknown") .. "'")
end
name = require("Module:category tree/" .. cat.module).new(self:substitute_template_specs_in_args(cat.args))
else
name = cat.name
if not name then
internal_error("Missing .name in " .. (is_umbrella and "umbrella " or "") .. "'" .. table_type .. "' table for '" ..
self._info.label .. "' category entry in module '" .. (self._data.module or "unknown") .. "'")
elseif type(name) == "string" then -- otherwise, assume it's a category object and use it directly
name = self:substitute_template_specs(name)
if name:find("^ကဏ္ဍ:") then
-- It's a non-poscatboiler category name.
sortkey = sortkey or is_children and name:gsub("^ကဏ္ဍ:", "") or self:getCategoryName()
else
-- It's a label.
sortkey = sortkey or is_children and name or self._info.label
name = self:make_new{
label = name, code = lang, sc = sc,
raw = raw, args = self:substitute_template_specs_in_args(cat.args)
}
end
end
end
sortkey = sortkey or is_children and " " or self._info.label
ret[i] = {
name = name,
description = is_children and self:substitute_template_specs(cat.description) or nil,
sort = self:new_sortkey(sortkey)
}
end
return ret
end
function Category:getParents()
local is_umbrella, ret = not self._lang and not self._info.raw
if self._sc then
local parent1 = self:make_new{code = self._info.code, label = "ဝေါဟာ" .. self._sc:getCanonicalName() .. "အပ္ဍဲအက္ခရ်ဂမၠိုၚ်"}
local parent2 = self:make_new{code = self._info.code, label = self._info.label, raw = self._info.raw, args = self._info.args}
ret = {
{name = parent1, sort = self._sc:getCanonicalName()},
{name = parent2, sort = self._sc:getCanonicalName()},
}
else
local parents, fallback_sort_key, fallback_sort_base
if is_umbrella then
parents = self._data.umbrella and self._data.umbrella.parents or self._data.umbrella_parents
fallback_sort_key = self._data.umbrella and self._data.umbrella.breadcrumb_and_first_sort_key or nil
fallback_sort_base = self._data.umbrella and self._data.umbrella.breadcrumb_and_first_sort_base or nil
else
parents = self._data.parents
fallback_sort_key = self._data.breadcrumb_and_first_sort_key
fallback_sort_base = self._data.breadcrumb_and_first_sort_base
end
ret = self:canonicalize_parents_children(parents, nil, fallback_sort_key, fallback_sort_base)
if not ret then
return nil
end
end
local self_cat = self:getCategoryName()
for _, parent in ipairs(ret) do
local parent_cat = parent.name.getCategoryName and parent.name:getCategoryName()
if self_cat == parent_cat then
internal_error(("Infinite loop would occur, as parent category '%s' is the same as the child category"):format(self_cat))
end
end
return ret
end
function Category:getChildren()
local is_umbrella = not self._lang and not self._info.raw
local children = self._data.children
local ret = {}
if not is_umbrella and children then
for _, child in ipairs(children) do
child = mw.clone(child)
if type(child) ~= "table" then
child = {name = child}
end
if not child.sort then
child.sort = child.name
end
-- FIXME, is preserving the script correct?
child.name = self:make_new{code = self._info.code, label = child.name, raw = child.raw, sc = self._info.sc}
insert(ret, child)
end
end
local extra_children
if is_umbrella then
extra_children = self._data.umbrella and self._data.umbrella.extra_children
else
extra_children = self._data.extra_children
end
extra_children = self:canonicalize_parents_children(extra_children, "children")
if extra_children then
for _, child in ipairs(extra_children) do
insert(ret, child)
end
end
return #ret > 0 and ret or nil
end
function Category:getUmbrella()
local umbrella = self._data.umbrella
if umbrella == false or self._info.raw or not self._lang or self._sc then
return nil
end
-- If `umbrella` is a string, use that; otherwise, use the label.
return self:make_new({label = type(umbrella) == "string" and umbrella or self._info.label})
end
function Category:getAppendix()
-- FIXME, this should be customizable.
local lang, label = self._lang, self._info.label
if self._info.raw or not (lang and label) then
return nil
end
local appendix = make_title(100, label .. lang:getCanonicalName() .. "ဂမၠိုၚ်")
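-- `make_title` may return nil for an invalid title, so check the object before indexing it.
return appendix and appendix.exists and appendix.fullText or nil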
end
function Category:getCatfixInfo()
if self._lang or self._sc or self._info.raw then
local langcode, sccode = self._data.catfix, self._data.catfix_sc
local lang, sc
if langcode then
langcode = self:substitute_template_specs(langcode)
lang = get_lang(langcode, nil, true)
elseif langcode == nil then -- not false
lang = self._lang
end
if sccode then
sccode = self:substitute_template_specs(sccode)
sc = get_script(sccode)
elseif sccode == nil then -- not false
sc = self._sc
end
if lang then
lang = lang:getFull()
end
return lang, sc
elseif not self._data.umbrella then
return
end
-- umbrella
local langcode, sccode = self._data.umbrella.catfix, self._data.umbrella.catfix_sc
local lang, sc
if langcode then
langcode = self:substitute_template_specs(langcode)
lang = get_lang(langcode, nil, true)
end
if sccode then
sccode = self:substitute_template_specs(sccode)
sc = get_script(sccode)
end
if lang then
lang = lang:getFull()
end
return lang, sc
end
function Category:getTOCTemplateName()
-- This should only be invoked if getTOC() returns true, meaning to do the default algorithm, but getTOC()
-- implements its own default algorithm.
internal_error("This should never get called")
end
local export = {}
function export.main(info)
local self = setmetatable({_info = info}, Category)
self:initCommon()
return self._data and self or nil
end
export.new = Category.new
return export
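-- ---------------------------------------------------------------------------
-- Page: မဝ်ဂျူ:category tree/အရေဝ်ဘာသာ (namespace 828, page ID 1747)
-- Revision: 385635 (parent 385624), 2026-04-02T17:07:48Z, by 咽頭べさ (user ID 33)
-- Content model: Scribunto (text/plain)
-- ---------------------------------------------------------------------------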
local new_title = mw.title.new
local ucfirst = require("Module:string utilities").ucfirst
local split = require("Module:string utilities").split
local raw_categories = {}
local raw_handlers = {}
local m_languages = require("Module:languages")
local m_sc_getByCode = require("Module:scripts").getByCode
local m_table = require("Module:table")
local parse_utilities_module = "Module:parse utilities"
local concat = table.concat
local insert = table.insert
local reverse_ipairs = m_table.reverseIpairs
local serial_comma_join = m_table.serialCommaJoin
local size = m_table.size
local sorted_pairs = m_table.sortedPairs
local to_json = require("Module:JSON").toJSON
local Hang = m_sc_getByCode("Hang")
local Hani = m_sc_getByCode("Hani")
local Hira = m_sc_getByCode("Hira")
local Hrkt = m_sc_getByCode("Hrkt")
local Kana = m_sc_getByCode("Kana")
local function track(page)
-- [[Special:WhatLinksHere/Wiktionary:Tracking/category tree/languages/PAGE]]
return require("Module:debug/track")("category tree/languages/" .. page)
end
-- This handles language categories of the form e.g. [[:Category:French language]] and
-- [[:Category:British Sign Language]]; categories like [[:Category:Languages of Indonesia]]; categories like
-- [[:Category:English-based creole or pidgin languages]]; and categories like
-- [[:Category:English-based constructed languages]].
-----------------------------------------------------------------------------
-- --
-- RAW CATEGORIES --
-- --
-----------------------------------------------------------------------------
raw_categories["အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်"] = {
topright = "{{commonscat|Languages}}\n[[File:Languages world map-transparent background.svg|thumb|right|250px|Rough world map of language families]]",
description = "This category contains the categories for every language on Wiktionary.",
additional = "Not all languages that Wiktionary recognises may have a category here yet. There are many that have " ..
"not yet received any attention from editors, mainly because not all Wiktionary users know about every single " ..
"language. See [[Wiktionary:List of languages]] for a full list.",
parents = {
"ဒၞာဲလုပ်အဝေါၚ်ကဵုပၟိက်",
},
}
raw_categories["All extinct languages"] = {
description = "This category contains the categories for every [[extinct language]] on Wiktionary.",
additional = "Do not confuse this category with [[:Category:Extinct languages]], which is an umbrella category for the names of extinct languages in specific other languages (e.g. {{m+|de|Langobardisch}} for the ancient [[Lombardic]] language).",
parents = {
"အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်",
},
}
raw_categories["Languages by country"] = {
topright = "{{commonscat|Languages by continent}}",
description = "Categories that group languages by country.",
additional = "{{{umbrella_meta_msg}}}",
parents = {
"အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်",
},
}
raw_categories["Language isolates"] = {
topright = "{{wikipedia|Language isolate}}\n{{commonscat|Language isolates}}",
description = "Languages with no known relatives.",
parents = {
{name = "Languages by family", sort = "*Isolates"},
{name = "All language families", sort = "Isolates"},
},
}
raw_categories["Languages not sorted into a location category"] = {
description = "Languages which do not specify (in their {{tl|auto cat}} call) the location(s) where they are spoken.",
additional = "This excludes constructed and reconstructed languages; as a result, all languages in this category explicitly specify their location as {{cd|UNKNOWN}}.",
parents = {
{name = "Requests"},
},
hidden = true,
}
-----------------------------------------------------------------------------
-- --
-- RAW HANDLERS --
-- --
-----------------------------------------------------------------------------
-- Given a category (without the "Category:" prefix), look up the page defining the category, find the call to
-- {{auto cat}} (if any), and return a table of its arguments. If the category page doesn't exist or doesn't have
-- an {{auto cat}} invocation, return nil.
--
-- FIXME: Duplicated in [[Module:category tree/lects]].
local function scrape_category_for_auto_cat_args(cat)
local cat_page = mw.title.new("ကဏ္ဍ:" .. cat)
if cat_page then
local contents = cat_page:getContent()
if contents then
for template in require("Module:template parser").find_templates(contents) do
-- The template parser automatically handles redirects and canonicalizes them, so uses of {{autocat}}
-- will also be found.
if template:get_name() == "auto cat" then
return template:get_arguments()
end
end
end
end
return nil
end
local function link_location(location)
local location_no_the = location:match("^the (.*)$")
local bare_location = location_no_the or location
local location_link
local bare_location_parts = split(bare_location, ", ")
for i, part in ipairs(bare_location_parts) do
bare_location_parts[i] = ("[[%s]]"):format(part)
end
location_link = concat(bare_location_parts, ", ")
if location_no_the then
location_link = "the " .. location_link
end
return location_link
end
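-- Illustration only (not executed): expected results of link_location() for a few hypothetical inputs,
-- assuming `split` breaks the string at each literal ", ".
--   link_location("France")          --> "[[France]]"
--   link_location("the Philippines") --> "the [[Philippines]]"
--   link_location("Gujarat, India")  --> "[[Gujarat]], [[India]]"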
local function linkbox(lang, setwiki, setwikt, setsister, entryname)
local wiktionarylinks = {}
local canonicalName = lang:getCanonicalName()
local wikimediaLanguages = lang:getWikimediaLanguages()
local wikipediaArticle = setwiki or lang:getWikipediaArticle()
setsister = setsister and ucfirst(setsister) or nil
if setwikt then
track("setwikt")
if setwikt == "-" then
track("setwikt/hyphen")
end
end
if setwikt ~= "-" and wikimediaLanguages and wikimediaLanguages[1] then
for _, wikimedialang in ipairs(wikimediaLanguages) do
local check = new_title(wikimedialang:getCode() .. ":")
if check and check.isExternal then
insert(wiktionarylinks,
(wikimedialang:getCanonicalName() ~= canonicalName and "(''" .. wikimedialang:getCanonicalName() .. "'') " or "") ..
"'''[[:" .. wikimedialang:getCode() .. ":|" .. wikimedialang:getCode() .. ".wiktionary.org]]'''")
end
end
wiktionarylinks = concat(wiktionarylinks, "<br/>")
end
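-- Guard against a nil list here as well, matching the check made before the loop above.
local wikt_plural = wikimediaLanguages and wikimediaLanguages[2] and "s" or ""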
if #wiktionarylinks == 0 then
wiktionarylinks = "''None.''"
end
if setsister then
track("setsister")
if setsister == "-" then
track("setsister/hyphen")
else
setsister = "ကဏ္ဍ:" .. setsister
end
else
setsister = lang:getCommonsCategory() or "-"
end
return concat{
[=[<div class="wikitable" style="float: right; clear: right; margin: 0 0 0.5em 1em; width: 300px; padding: 5px;">
<div style="text-align: center; margin-bottom: 10px; margin-top: 5px">လေန်အရေဝ်ဘာသာ''']=], canonicalName, [=[ဂမၠိုၚ်'''</div>
{| style="font-size: 90%"
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Wikipedia-logo.png|35px|none|ဝဳကဳပဳဒဳယာ]]
| style="border-bottom: 1px solid lightgray;" | '''ဝဳကဳပဳဒဳယာဘာသာမန်'''မနွံဒၟံၚ်လိက်ပရေၚ်လ္တူ:
<div style="padding: 5px 10px">]=], (setwiki == "-" and "''None.''" or "'''[[w:" .. wikipediaArticle .. "|" .. wikipediaArticle .. "]]'''"), [=[</div>
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Wikimedia-logo.svg|35px|none|ဝဳကဳမဳဒဳယာ ခမ်မောန်]]
| style="border-bottom: 1px solid lightgray;" | '''ဝဳကဳမဳဒဳယာ ခမ်မောန်'''မနွံဒၟံၚ်လေန်နကဵုမဆက်စပ်လဝ်ပရောပရာ]=], canonicalName, [=[ပ္ဍဲပရဝ်ဂျေတ်ၝုဲဒေံဂမၠိုၚ်:
<div style="padding: 5px 10px">]=], (setsister == "-" and "''None.''" or "'''[[commons:" .. setsister .. "|" .. setsister .. "]]'''"), [=[</div>
|-
| style="vertical-align: top; height: 35px; width: 40px; border-bottom: 1px solid lightgray;" | [[File:Wiktionary-logo-v2.svg|35px|none|ဝိက်ရှေန်နရဳ]]
|style="border-bottom: 1px solid lightgray;" | '''ဝိက်ရှေန်နရဳမချူပလေဝ်ဒါန်''']=], wikt_plural, [=[ဆၜိုတ်ပ္ဍဲ]=], canonicalName, [=[:
<div style="padding: 5px 10px">]=], wiktionarylinks, [=[</div>
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Open book nae 02.svg|35px|none|ပရေၚ်ပၠောပ်စုတ်]]
| style="border-bottom: 1px solid lightgray;" | '''ဝိက်ရှေန်နရဳပရေၚ်ပၠောပ်စုတ်'''သွက်ဆေၚ်စပ်ကဵုယၟုဘာသာမန်ဂမၠိုၚ်:
<div style="padding: 5px 10px">''']=], require("Module:links").full_link({lang = m_languages.getByCode("en"), term = entryname or canonicalName}), [=['''</div>
|-
| style="vertical-align: top; height: 35px;" | [[File:Crystal kfind.png|35px|none|Considerations]]
|| '''ဝိက်ရှေန်နရဳသောၚ်တလး'''သွက်ဂွံပွမဗိုၚ်ချူပလေဝ်ဒါန်နကဵုပရေၚ်ပၠောပ်စုတ်]=], canonicalName, [=[ဂမၠိုၚ်:
<div style="padding: 5px 0">
* '''[[ဝိက်ရှေန်နရဳ:ထ္ၜးဂၠံၚ်ပရေၚ်ပၠောပ်စုတ်]=], canonicalName, [=[ဂမၠိုၚ်]]'''
* '''[[:ကဏ္ဍ:ထာမ်ပလိက်နိဿဲ]=], canonicalName, [=[ဂမၠိုၚ်|ထာမ်ပလိက်နိဿဲ]] ({{PAGESINCAT:ထာမ်ပလိက်နိဿဲ]=], canonicalName, [=[ဂမၠိုၚ်}})'''
* '''[[အဆက်လက္ကရဴ:စရၚ်ပြကိုဟ်နိဿဲ]=], canonicalName, [=[ |စရၚ်ပြကိုဟ်နိဿဲ]]'''
|}
</div>]=]
}
end
local function edit_link(title, text)
return '<span class="plainlinks">['
-- The `action` query parameter is a fixed MediaWiki keyword and must not be localized.
.. tostring(mw.uri.fullUrl(title, { action = "edit" }))
.. ' ' .. text .. ']</span>'
end
-- Should perhaps use wiki syntax.
local function infobox(lang)
local ret = {}
insert(ret, '<table class="wikitable language-category-info"')
local raw_data = lang:getData("extra")
if raw_data then
local replacements = {
[1] = "canonical-name",
[2] = "wikidata-item",
[3] = "family",
[4] = "scripts",
}
local function replacer(letter1, letter2)
return letter1:lower() .. "-" .. letter2:lower()
end
-- For each key in the language data modules, returns a descriptive
-- kebab-case version (containing ASCII lowercase words separated
-- by hyphens).
local function kebab_case(key)
key = replacements[key] or key
key = key:gsub("(%l)(%u)", replacer):gsub("(%l)_(%l)", replacer)
return key
end
local compress = {compress = true}
local function html_attribute_encode(str)
str = to_json(str, compress)
:gsub('"', "&quot;")
-- & in attributes is automatically escaped.
-- :gsub("&", "&amp;")
:gsub("<", "&lt;")
:gsub(">", "&gt;")
return str
end
insert(ret, ' data-code="' .. lang:getCode() .. '"')
for k, v in sorted_pairs(raw_data) do
insert(ret, " data-" .. kebab_case(k)
.. '="'
.. html_attribute_encode(v)
.. '"')
end
end
insert(ret, '>\n')
insert(ret, '<tr class="language-category-data">\n<th colspan="2">'
.. edit_link(lang:getDataModuleName(), "ဒေတာပလေဝ်ဒါန်အရေဝ်ဘာသာ")
.. "</th>\n</tr>\n")
insert(ret, "<tr>\n<th>ယၟုတိုၚ်ခဳ</th><td>" .. lang:getCanonicalName() .. "</td>\n</tr>\n")
local otherNames = lang:getOtherNames()
if otherNames then
local names = {}
for _, name in ipairs(otherNames) do
insert(names, "<li>" .. name .. "</li>")
end
if #names > 0 then
insert(ret, "<tr>\n<th>Other names</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
local aliases = lang:getAliases()
if aliases then
local names = {}
for _, name in ipairs(aliases) do
insert(names, "<li>" .. name .. "</li>")
end
if #names > 0 then
insert(ret, "<tr>\n<th>ယၟုမထပ်ကော်သာ်တၞဟ်</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
local varieties = lang:getVarieties()
if varieties then
local names = {}
for _, name in ipairs(varieties) do
if type(name) == "string" then
insert(names, "<li>" .. name .. "</li>")
else
assert(type(name) == "table")
local first_var
local subvars = {}
for i, var in ipairs(name) do
if i == 1 then
first_var = var
else
insert(subvars, "<li>" .. var .. "</li>")
end
end
if #subvars > 0 then
insert(names, "<li><dl><dt>" .. first_var .. "</dt>\n<dd><ul>" .. concat(subvars, "\n") .. "</ul></dd></dl></li>")
elseif first_var then
insert(names, "<li>" .. first_var .. "</li>")
end
end
end
if #names > 0 then
insert(ret, "<tr>\n<th>Varieties</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဂမၠိုၚ်|ကုဒ်အရေဝ်ဘာသာ]]</th><td><code>" .. lang:getCode() .. "</code></td>\n</tr>\n")
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဝေါၚ်သဂမၠိုၚ်|အရေဝ်ဘာသာဝေါၚ်သ]]</th>\n")
local fam = lang:getFamily()
local famCode = fam and fam:getCode()
if not fam then
insert(ret, "<td>unclassified</td>")
elseif famCode == "qfa-iso" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်ဂမၠိုၚ်|အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်]]</td>")
elseif famCode == "qfa-mix" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်ဂမၠိုၚ်|အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်]]</td>")
elseif famCode == "sgn" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာလက္ခဏာသမ္တီဂမၠိုၚ်|အရေဝ်ဘာသာလက္ခဏာသမ္တီ]]</td>")
elseif famCode == "crp" then
insert(ret, "<td>[[:ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်ဂမၠိုၚ်|ခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်]]</td>")
elseif famCode == "art" then
insert(ret, "<td>[[:ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ဂမၠိုၚ်|ဂကောံဘာသာခၞံဗဒှ်လဝ်]]</td>")
else
insert(ret, "<td>" .. fam:makeCategoryLink() .. "</td>")
end
insert(ret, "\n</tr>\n<tr>\n<th>ဇုဇဗဴလဂမၠိုၚ်</th>\n<td>")
local ancestors = lang:getAncestors()
if ancestors[2] then
local ancestorList = {}
for i, anc in ipairs(ancestors) do
ancestorList[i] = "<li>" .. anc:makeCategoryLink() .. "</li>"
end
insert(ret, "<ul>\n" .. concat(ancestorList, "\n") .. "</ul>")
else
local ancestorChain = lang:getAncestorChainOld()
if ancestorChain[1] then
local chain = {}
for _, anc in reverse_ipairs(ancestorChain) do
insert(chain, "<li>" .. anc:makeCategoryLink() .. "</li>")
end
insert(ret, "<ul>\n" .. concat(chain, "\n<ul>\n") .. ("</ul>"):rep(#chain))
else
insert(ret, "unknown")
end
end
insert(ret, "</td>\n</tr>\n")
local scripts = lang:getScripts()
if scripts[1] then
local script_text = {}
local function makeScriptLine(sc)
local code = sc:getCode()
local url = tostring(mw.uri.fullUrl('Special:Search', {
search = 'contentmodel:css insource:"' .. code
.. '" insource:/\\.' .. code .. '/',
ns8 = '1'
}))
return sc:makeCategoryLink()
.. ' (<span class="plainlinks" title="Search for stylesheets referencing this script">[' .. url .. ' <code>' .. code .. '</code>]</span>)'
end
local function add_Hrkt(text)
insert(text, "<li>" .. makeScriptLine(Hrkt))
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hira) .. "</li>")
insert(text, "<li>" .. makeScriptLine(Kana) .. "</li>")
insert(text, "</ul>")
insert(text, "</li>")
end
for _, sc in ipairs(scripts) do
local text = {}
local code = sc:getCode()
if code == "Hrkt" then
add_Hrkt(text)
else
insert(text, "<li>" .. makeScriptLine(sc))
if code == "Jpan" then
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hani) .. "</li>")
add_Hrkt(text)
insert(text, "</ul>")
elseif code == "Kore" then
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hang) .. "</li>")
insert(text, "<li>" .. makeScriptLine(Hani) .. "</li>")
insert(text, "</ul>")
end
insert(text, "</li>")
end
insert(script_text, concat(text, "\n"))
end
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အက္ခရ်ဂမၠိုၚ်|အက္ခရ်ဂမၠိုၚ်]]</th>\n<td><ul>\n" .. concat(script_text, "\n") .. "</ul></td>\n</tr>\n")
else
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အက္ခရ်ဂမၠိုၚ်|အက္ခရ်ဂမၠိုၚ်]]</th>\n<td>not specified</td>\n</tr>\n")
end
local function add_module_info(raw_data, heading)
if raw_data then
local scripts = lang:getScriptCodes()
local module_info, add = {}, false
if type(raw_data) == "string" then
insert(module_info,
("[[မဝ်ဂျူ:%s]]"):format(raw_data))
add = true
else
local raw_data_type = type(raw_data)
if raw_data_type == "table" and size(scripts) == 1 and type(raw_data[scripts[1]]) == "string" then
insert(module_info,
("[[မဝ်ဂျူ:%s]]"):format(raw_data[scripts[1]]))
add = true
elseif raw_data_type == "table" then
insert(module_info, "<ul>")
for script, data in sorted_pairs(raw_data) do
if type(data) == "string" and m_sc_getByCode(script) then
insert(module_info, ("<li><code>%s</code>: [[မဝ်ဂျူ:%s]]</li>"):format(script, data))
end
end
insert(module_info, "</ul>")
add = size(module_info) > 2
end
end
if add then
insert(ret, [=[
<tr>
<th>]=] .. heading .. [=[</th>
<td>]=] .. concat(module_info) .. [=[</td>
</tr>
]=])
end
end
end
-- `raw_data` may be nil (see the check above), so only index it when it exists.
if raw_data then
add_module_info(raw_data.generate_forms, "Form-generating<br>module")
add_module_info(raw_data.translit, "[[ဝိက်ရှေန်နရဳ:ကၠာဲပ္တိတ်မအခဝ် ကဵု ပြၚ်လှာဲအက္ခရ်ရဝ်မာန်|မဝ်ဂျူ<br>ကၠာဲပ္တိတ်မအခဝ်]]")
add_module_info(raw_data.display_text, "မဝ်ဂျူ<br>ထ္ၜးမလိက်")
add_module_info(raw_data.entry_name, "မဝ်ဂျူ<br>ယၟုစရၚ်")
add_module_info(raw_data.sort_key, "[[sortkey|မဝ်ဂျူ]]<br>ကဳပါ်အဇာ")
end
local wikidataItem = lang:getWikidataItem()
if wikidataItem and mw.wikibase then
local URL = mw.wikibase.getEntityUrl(wikidataItem)
local link
if URL then
link = '[' .. URL .. ' ' .. wikidataItem .. ']'
else
link = '<span class="error">Invalid Wikidata item: <code>' .. wikidataItem .. '</code></span>'
end
insert(ret, "<tr><th>Wikidata</th><td>" .. link .. "</td></tr>")
end
insert(ret, "</table>")
return concat(ret)
end
local function NavFrame(content, title)
return '<div class="NavFrame"><div class="NavHead">'
.. (title or '{{{title}}}') .. '</div>'
.. '<div class="NavContent" style="text-align: left;">'
.. content
.. '</div></div>'
end
local function get_description_topright_additional(lang, locations, extinct, setwiki, setwikt, setsister, entryname)
local nameWithLanguage = lang:getCategoryName("nocap")
if lang:getCode() == "und" then
local description =
"ဣတဏအ်ဂှ်ဝွံဆေၚ်စပ်ကဵုကဏ္ဍအဓိက'''" .. nameWithLanguage .. "'''၊ မအာတ်မိက်ထ္ၜးလဝ်ပ္ဍဲဝိက်ရှေန်နရဳသီုကဵု[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဂမၠိုၚ်|ကုဒ်]] '''" .. lang:getCode() .. "'''ရအဴ။" ..
"မအရေဝ်ဘာသာလုပ်အဝေါၚ်တဏအ်ဝွံပ္ဍဲပွမချူဆေၚ်စပ်ကဵုဝၚ်၊ သီုကဵုမအရေဝ်အဓိပ္ပာဲကီုလေဝ်မက္တဵုဒှ်လဝ်ဂလာန်သတ်ဒတ်နူကဵုတၠပညာဟၟဲမွဲဏီ။"
return description, nil, nil
end
local canonicalName = lang:getCanonicalName()
local topright = linkbox(lang, setwiki, setwikt, setsister, entryname)
local the_prefix
if canonicalName:find("ဘာသာ$") then
the_prefix = ""
else
the_prefix = "the "
end
local description = "ဣတဏအ်ဂှ်ဝွံဆေၚ်စပ်ကဵုကဏ္ဍအဓိက" .. the_prefix .. "'''" .. nameWithLanguage .. "'''ရအဴ။"
local location_links = {}
local prep
local saw_embedded_comma = false
for _, location in ipairs(locations) do
local this_prep
if location == "the world" then
this_prep = "across"
insert(location_links, location)
elseif location ~= "UNKNOWN" then
this_prep = "in"
if location:find(",") then
saw_embedded_comma = true
end
insert(location_links, link_location(location))
end
if this_prep then
if prep and this_prep ~= prep then
error("Can't handle location 'the world' along with another location (clashing prepositions)")
end
prep = this_prep
end
end
local location_desc
if #location_links > 0 then
local location_link_text
if saw_embedded_comma and #location_links >= 3 then
location_link_text = mw.text.listToText(location_links, "; ", "; and ")
else
location_link_text = serial_comma_join(location_links)
end
location_desc = ("It is %s %s %s.\n\n"):format(
extinct and "an [[extinct language]] that was formerly spoken" or "spoken", prep, location_link_text)
elseif extinct then
location_desc = "It is an [[extinct language]].\n\n"
else
location_desc = ""
end
local add = location_desc .. "Information about " .. canonicalName .. ":\n\n" .. infobox(lang)
if lang:hasType("reconstructed") then
add = add .. "\n\n" ..
ucfirst(canonicalName) .. " is a reconstructed language. Its words and roots are not directly attested in any written works, but have been reconstructed through the ''comparative method'', " ..
"which finds regular similarities between languages that cannot be explained by coincidence or word-borrowing, and extrapolates ancient forms from these similarities.\n\n" ..
"According to our [[Wiktionary:Criteria for inclusion|criteria for inclusion]], terms in " .. canonicalName ..
" should '''not''' be present in entries in the main namespace, but may be added to the Reconstruction: namespace."
elseif lang:hasType("appendix-constructed") then
add = add .. "\n\n" ..
ucfirst(canonicalName) .. " is a constructed language that is only in sporadic use. " ..
"According to our [[Wiktionary:Criteria for inclusion|criteria for inclusion]], terms in " .. canonicalName ..
" should '''not''' be present in entries in the main namespace, but may be added to the Appendix: namespace. " ..
"All terms in this language may be available at [[Appendix:" .. ucfirst(canonicalName) .. "]]."
end
local about = new_title("ဝိက်ရှေန်နရဳ:ပရူ" .. canonicalName)
if about.exists then
add = add .. "\n\n" ..
"Please see '''[[ဝိက်ရှေန်နရဳ:ပရူ" .. canonicalName .. "]]''' for information and special considerations for creating " .. nameWithLanguage .. " entries."
end
local ok, tree_of_descendants = pcall(
require("Module:family tree").print_children,
lang:getCode(), {
protolanguage_under_family = true,
must_have_descendants = true
})
if ok then
if tree_of_descendants then
add = add .. NavFrame(
tree_of_descendants,
"Family tree")
else
add = add .. "\n\n" .. ucfirst(lang:getCanonicalName())
.. " has no descendants or varieties listed in Wiktionary's language data modules."
end
else
mw.log("error while generating tree: " .. tostring(tree_of_descendants))
end
return description, topright, add
end
local function get_parents(lang, locations, extinct)
local canonicalName = lang:getCanonicalName()
local sortkey = {sort_base = canonicalName, lang = "en"}
local ret = {{name = "အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်", sort = sortkey}}
local fam = lang:getFamily()
local famCode = fam and fam:getCode()
-- FIXME: Some of the following categories should be added to this module.
if not fam then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာအပြောံဂမၠိုၚ်", sort = sortkey})
elseif famCode == "qfa-iso" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "qfa-mix" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "sgn" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာလက္ခဏာသမ္တီသီုဖ္အိုတ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "crp" then
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်ဂမၠိုၚ်", sort = sortkey})
for _, anc in ipairs(lang:getAncestors()) do
-- Avoid Haitian Creole being categorised in [[:Category:Haitian Creole-based creole or pidgin languages]], as one of its ancestors is an etymology-only variety of it.
-- Use that ancestor's ancestors instead.
if anc:getFullCode() == lang:getCode() then
for _, anc_extra in ipairs(anc:getAncestors()) do
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc_extra:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
else
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
end
elseif famCode == "art" then
if lang:hasType("appendix-constructed") then
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ပါဲနူအဆက်လက္ကရဴဂမၠိုၚ်", sort = sortkey})
else
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ဂမၠိုၚ်", sort = sortkey})
end
for _, anc in ipairs(lang:getAncestors()) do
if anc:getFullCode() == lang:getCode() then
for _, anc_extra in ipairs(anc:getAncestors()) do
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc_extra:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
else
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
end
else
insert(ret, {name = "ကဏ္ဍ:" .. fam:getCategoryName(), sort = sortkey})
if lang:hasType("reconstructed") then
insert(ret, {
name = "ကဏ္ဍ:အရေဝ်ဘာသာဗီုပြၚ်သိုၚ်တၟိဂမၠိုၚ်",
sort = {sort_base = canonicalName:gsub("^%-အခိုက်ကၞာ", ""), lang = "en"}
})
end
end
local function add_sc_cat(sc)
insert(ret, {name = "ကဏ္ဍ:ဘာသာ" .. sc:getCategoryName() , sort = sortkey})
end
local function add_Hrkt()
add_sc_cat(Hrkt)
add_sc_cat(Hira)
add_sc_cat(Kana)
end
for _, sc in ipairs(lang:getScripts()) do
if sc:getCode() == "Hrkt" then
add_Hrkt()
else
add_sc_cat(sc)
if sc:getCode() == "Jpan" then
add_sc_cat(Hani)
add_Hrkt()
elseif sc:getCode() == "Kore" then
add_sc_cat(Hang)
add_sc_cat(Hani)
end
end
end
if lang:hasTranslit() then
insert(ret, {name = "ကဏ္ဍ:ဘာသာမနွံကဵုပြၚ်လှာဲကၠာဲမအခဝ်အဝ်တဝ်", sort = sortkey})
end
local function insert_location_language_cat(location)
local cat = "အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု" .. location .. "ဂမၠိုၚ်"
insert(ret, {name = "ကဏ္ဍ:" .. cat, sort = sortkey})
local auto_cat_args = scrape_category_for_auto_cat_args(cat)
local location_parent = auto_cat_args and auto_cat_args.parent
if location_parent then
local split_parents = require(parse_utilities_module).split_on_comma(location_parent)
for _, parent in ipairs(split_parents) do
parent = parent:match("^(.-):.*$") or parent
insert_location_language_cat(parent)
end
end
end
local saw_location = false
for _, location in ipairs(locations) do
if location ~= "UNKNOWN" then
saw_location = true
insert_location_language_cat(location)
end
end
if extinct then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမကၠေံဗ္ဒန်အာသီုဖ္အိုတ်ဂမၠိုၚ်", sort = sortkey})
end
if not saw_location and not (lang:hasType("reconstructed") or (fam and fam:getCode() == "art")) then
-- Constructed and reconstructed languages don't need a location specified and often won't have one,
-- so don't put them in this maintenance category.
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာဟွံဂွံစုတ်အဇာအပ္ဍဲကဏ္ဍဍုၚ်အတေံဏီ", sort = sortkey})
end
return ret
end
local function get_children()
local ret = {}
-- FIXME: We should work on the children mechanism so it isn't necessary to manually specify these.
for _, label in ipairs({"ဝေါဟာအဓိက"}) do
insert(ret, {name = label, is_label = true})
end
return ret
end
-- Handle language categories of the form e.g. [[:Category:French language]] and
-- [[:Category:British Sign Language]].
insert(raw_handlers, function(data)
local category = data.category
local lang = m_languages.getByCanonicalName(category)
if not lang then
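-- Capture the name that follows the "ဘာသာ" prefix; without a capture, match() returns only the prefix itself.
local langname = category:match("^ဘာသာ(.+)$")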
if langname then
lang = m_languages.getByCanonicalName(langname)
end
if not lang then
return nil
end
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
["setwiki"] = true,
["setwikt"] = true,
["setsister"] = true,
["entryname"] = true,
["extinct"] = {type = "boolean"},
})
-- If called from inside, don't require any arguments, as they can't be known
-- in general and aren't needed just to generate the first parent (used for
-- breadcrumbs).
if #args[1] == 0 and not data.called_from_inside then
-- At least one location must be specified unless the language is constructed (e.g. Esperanto) or reconstructed (e.g. Proto-Indo-European).
local fam = lang:getFamily()
if not (lang:hasType("reconstructed") or (fam and fam:getCode() == "art")) then
error("At least one location (param 1=) must be specified for language '" .. lang:getCanonicalName() .. "' (code '" .. lang:getCode() .. "'). " ..
"Use the value UNKNOWN if the language's location is truly unknown.")
end
end
local description, topright, additional = "", "", ""
-- If called from inside the category tree system, it's called when generating
-- parents or children, and we don't need to generate the description or additional
-- text (which is very expensive in terms of memory because it calls [[Module:family tree]],
-- which calls [[Module:languages/data/all]]).
if not data.called_from_inside then
description, topright, additional = get_description_topright_additional(
lang, args[1], args.extinct, args.setwiki, args.setwikt, args.setsister, args.entryname
)
end
return {
canonical_name = lang:getCategoryName(),
description = description,
lang = lang:getCode(),
topright = topright,
additional = additional,
breadcrumb = lang:getCanonicalName(),
parents = get_parents(lang, args[1], args.extinct),
extra_children = get_children(lang),
umbrella = false,
can_be_empty = true,
}, true
end)
-- Handle categories such as [[:Category:Languages of Indonesia]].
insert(raw_handlers, function(data)
local location = data.category:match("^အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု(.*)$")
if location then
local args = require("Module:parameters").process(data.args, {
["flagfile"] = true,
["commonscat"] = true,
["wp"] = true,
["basename"] = true,
["parent"] = true,
["locationcat"] = true,
["locationlink"] = true,
})
local topright
local basename = args.basename or location:gsub(", .*", "")
if args.flagfile ~= "-" then
local flagfile_arg = args.flagfile or ("Flag of %s.svg"):format(basename)
local files = require(parse_utilities_module).split_on_comma(flagfile_arg)
local topright_parts = {}
for _, file in ipairs(files) do
local flagfile = "File:" .. file
local flagfile_page = new_title(flagfile)
if flagfile_page and flagfile_page.file.exists then
insert(topright_parts, ("[[%s|right|100px|border]]"):format(flagfile))
elseif args.flagfile then
error(("Explicit flagfile '%s' doesn't exist"):format(flagfile))
end
end
topright = concat(topright_parts)
end
if args.wp then
local wp = require("Module:yesno")(args.wp, "+")
if wp == "+" or wp == true then
wp = data.category
end
if wp then
local wp_topright = ("{{wikipedia|%s}}"):format(wp)
if topright then
topright = topright .. wp_topright
else
topright = wp_topright
end
end
end
if args.commonscat then
local commonscat = require("Module:yesno")(args.commonscat, "+")
if commonscat == "+" or commonscat == true then
commonscat = data.category
end
if commonscat then
local commons_topright = ("{{commonscat|%s}}"):format(commonscat)
if topright then
topright = topright .. commons_topright
else
topright = commons_topright
end
end
end
local bare_location = location:match("^the (.*)$") or location
local location_link = args.locationlink or link_location(location)
local bare_basename = basename:match("^the (.*)$") or basename
local parents = {}
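-- Sketch of the parent= syntax handled below (the example values are hypothetical, not taken from any
-- real category page). Each comma-separated item is either PARENT or PARENT:SORTKEY, and every "+" in
-- SORTKEY is replaced with the bare location. For the location "the Philippines":
--   parent=Asia     --> parent "Asia", sort key " Philippines" (note the leading space)
--   parent=Asia:*+  --> parent "Asia", sort key "*Philippines"
-- In both cases the parent category name is the location-category prefix followed by PARENT.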
if args.parent then
local explicit_parents = require(parse_utilities_module).split_on_comma(args.parent)
for i, parent in ipairs(explicit_parents) do
local actual_parent, sort_key = parent:match("^(.-):(.*)$")
if actual_parent then
parent = actual_parent
sort_key = sort_key:gsub("%+", bare_location)
else
sort_key = " " .. bare_location
end
insert(parents, {name = "အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု" .. parent, sort = sort_key})
end
else
insert(parents, {name = "အရေဝ်ဘာသာဗက်အလိုက်ဍုၚ်ရး", sort = {sort_base = bare_location, lang = "en"}})
end
if args.locationcat then
local explicit_location_cats = require(parse_utilities_module).split_on_comma(args.locationcat)
for i, locationcat in ipairs(explicit_location_cats) do
insert(parents, {name = "ကဏ္ဍ:ဘာသာ" .. locationcat})
end
else
local location_cat = ("ကဏ္ဍ:%s"):format(bare_location)
local location_page = new_title(location_cat)
if location_page and location_page.exists then
-- Use the category page whose existence was just checked.
insert(parents, {name = location_cat})
end
end
local description = ("Categories for languages of %s (including sublects)."):format(location_link)
return {
topright = topright,
description = description,
parents = parents,
breadcrumb = bare_basename,
additional = "{{{umbrella_msg}}}",
}, true
end
end)
-- Handle categories such as [[:Category:English-based creole or pidgin languages]].
insert(raw_handlers, function(data)
local langname = data.category:match("(.*)%-based creole or pidgin languages$")
if langname then
local lang = m_languages.getByCanonicalName(langname)
if lang then
return {
lang = lang:getCode(),
description = "Languages which developed as a [[creole]] or [[pidgin]] from " .. lang:makeCategoryLink() .. ".",
parents = {{name = "Creole or pidgin languages", sort = {sort_base = "*" .. langname, lang = "en"}}},
breadcrumb = lang:getCanonicalName() .. "-based",
}
end
end
end)
-- Handle categories such as [[:Category:English-based constructed languages]].
insert(raw_handlers, function(data)
local langname = data.category:match("(.*)%-based constructed languages$")
if langname then
local lang = m_languages.getByCanonicalName(langname)
if lang then
return {
lang = lang:getCode(),
description = "Constructed languages which are based on " .. lang:makeCategoryLink() .. ".",
parents = {{name = "Constructed languages", sort = {sort_base = "*" .. langname, lang = "en"}}},
breadcrumb = lang:getCanonicalName() .. "-based",
}
end
end
end)
return {RAW_CATEGORIES = raw_categories, RAW_HANDLERS = raw_handlers}
8hvqfi4u7cvojaioxu0ojfdy6okycr5
385638
385635
2026-04-02T17:34:47Z
咽頭べさ
33
385638
Scribunto
text/plain
local new_title = mw.title.new
local ucfirst = require("Module:string utilities").ucfirst
local split = require("Module:string utilities").split
local raw_categories = {}
local raw_handlers = {}
local m_languages = require("Module:languages")
local m_sc_getByCode = require("Module:scripts").getByCode
local m_table = require("Module:table")
local parse_utilities_module = "Module:parse utilities"
local concat = table.concat
local insert = table.insert
local reverse_ipairs = m_table.reverseIpairs
local serial_comma_join = m_table.serialCommaJoin
local size = m_table.size
local sorted_pairs = m_table.sortedPairs
local to_json = require("Module:JSON").toJSON
local Hang = m_sc_getByCode("Hang")
local Hani = m_sc_getByCode("Hani")
local Hira = m_sc_getByCode("Hira")
local Hrkt = m_sc_getByCode("Hrkt")
local Kana = m_sc_getByCode("Kana")
local function track(page)
-- [[Special:WhatLinksHere/Wiktionary:Tracking/category tree/languages/PAGE]]
return require("Module:debug/track")("category tree/languages/" .. page)
end
-- This handles language categories of the form e.g. [[:Category:French language]] and
-- [[:Category:British Sign Language]]; categories like [[:Category:Languages of Indonesia]]; categories like
-- [[:Category:English-based creole or pidgin languages]]; and categories like
-- [[:Category:English-based constructed languages]].
-----------------------------------------------------------------------------
-- --
-- RAW CATEGORIES --
-- --
-----------------------------------------------------------------------------
raw_categories["အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်"] = {
topright = "{{commonscat|Languages}}\n[[File:Languages world map-transparent background.svg|thumb|right|250px|Rough world map of language families]]",
description = "This category contains the categories for every language on Wiktionary.",
additional = "Not all languages that Wiktionary recognises may have a category here yet. There are many that have " ..
"not yet received any attention from editors, mainly because not all Wiktionary users know about every single " ..
"language. See [[Wiktionary:List of languages]] for a full list.",
parents = {
"ဒၞာဲလုပ်အဝေါၚ်ကဵုပၟိက်",
},
}
raw_categories["All extinct languages"] = {
description = "This category contains the categories for every [[extinct language]] on Wiktionary.",
additional = "Do not confuse this category with [[:Category:Extinct languages]], which is an umbrella category for the names of extinct languages in specific other languages (e.g. {{m+|de|Langobardisch}} for the ancient [[Lombardic]] language).",
parents = {
"အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်",
},
}
raw_categories["Languages by country"] = {
topright = "{{commonscat|Languages by continent}}",
description = "Categories that group languages by country.",
additional = "{{{umbrella_meta_msg}}}",
parents = {
"အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်",
},
}
raw_categories["Language isolates"] = {
topright = "{{wikipedia|Language isolate}}\n{{commonscat|Language isolates}}",
description = "Languages with no known relatives.",
parents = {
{name = "Languages by family", sort = "*Isolates"},
{name = "All language families", sort = "Isolates"},
},
}
raw_categories["Languages not sorted into a location category"] = {
description = "Languages which do not specify (in their {{tl|auto cat}} call) the location(s) where they are spoken.",
additional = "This excludes constructed and reconstructed languages; as a result, all languages in this category explicitly specify their location as {{cd|UNKNOWN}}.",
parents = {
{name = "Requests"},
},
hidden = true,
}
-----------------------------------------------------------------------------
-- --
-- RAW HANDLERS --
-- --
-----------------------------------------------------------------------------
-- Given a category (without the "Category:" prefix), look up the page defining the category, find the call to
-- {{auto cat}} (if any), and return a table of its arguments. If the category page doesn't exist or doesn't have
-- an {{auto cat}} invocation, return nil.
--
-- FIXME: Duplicated in [[Module:category tree/lects]].
local function scrape_category_for_auto_cat_args(cat)
local cat_page = mw.title.new("ကဏ္ဍ:" .. cat)
if cat_page then
local contents = cat_page:getContent()
if contents then
local frame = mw.getCurrentFrame()
for template in require("Module:template parser").find_templates(contents) do
-- The template parser automatically handles redirects and canonicalizes them, so uses of {{autocat}}
-- will also be found.
if template:get_name() == "auto cat" then
return template:get_arguments()
end
end
end
end
return nil
end
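-- Usage sketch for link_location() below. Illustrative only: it assumes that `split` from
-- [[Module:string utilities]] splits on the literal separator ", ".
--   link_location("Indonesia")              --> "[[Indonesia]]"
--   link_location("the Philippines")        --> "the [[Philippines]]"
--   link_location("Georgia, United States") --> "[[Georgia]], [[United States]]"
-- A leading "the " stays outside the link, and each comma-separated part is linked separately.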
local function link_location(location)
local location_no_the = location:match("^the (.*)$")
local bare_location = location_no_the or location
local location_link
local bare_location_parts = split(bare_location, ", ")
for i, part in ipairs(bare_location_parts) do
bare_location_parts[i] = ("[[%s]]"):format(part)
end
location_link = concat(bare_location_parts, ", ")
if location_no_the then
location_link = "the " .. location_link
end
return location_link
end
local function linkbox(lang, setwiki, setwikt, setsister, entryname)
local wiktionarylinks = {}
local canonicalName = lang:getCanonicalName()
local wikimediaLanguages = lang:getWikimediaLanguages()
local wikipediaArticle = setwiki or lang:getWikipediaArticle()
setsister = setsister and ucfirst(setsister) or nil
if setwikt then
track("setwikt")
if setwikt == "-" then
track("setwikt/hyphen")
end
end
if setwikt ~= "-" and wikimediaLanguages and wikimediaLanguages[1] then
for _, wikimedialang in ipairs(wikimediaLanguages) do
local check = new_title(wikimedialang:getCode() .. ":")
if check and check.isExternal then
insert(wiktionarylinks,
(wikimedialang:getCanonicalName() ~= canonicalName and "(''" .. wikimedialang:getCanonicalName() .. "'') " or "") ..
"'''[[:" .. wikimedialang:getCode() .. ":|" .. wikimedialang:getCode() .. ".wiktionary.org]]'''")
end
end
wiktionarylinks = concat(wiktionarylinks, "<br/>")
end
local wikt_plural = wikimediaLanguages[2] and "s" or ""
if #wiktionarylinks == 0 then
wiktionarylinks = "''None.''"
end
if setsister then
track("setsister")
if setsister == "-" then
track("setsister/hyphen")
else
setsister = "ကဏ္ဍ:" .. setsister
end
else
setsister = lang:getCommonsCategory() or "-"
end
return concat{
[=[<div class="wikitable" style="float: right; clear: right; margin: 0 0 0.5em 1em; width: 300px; padding: 5px;">
<div style="text-align: center; margin-bottom: 10px; margin-top: 5px">လေန်အရေဝ်ဘာသာ''']=], canonicalName, [=[ဂမၠိုၚ်'''</div>
{| style="font-size: 90%"
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Wikipedia-logo.png|35px|none|ဝဳကဳပဳဒဳယာ]]
| style="border-bottom: 1px solid lightgray;" | '''ဝဳကဳပဳဒဳယာဘာသာမန်'''မနွံဒၟံၚ်လိက်ပရေၚ်လ္တူ:
<div style="padding: 5px 10px">]=], (setwiki == "-" and "''None.''" or "'''[[w:" .. wikipediaArticle .. "|" .. wikipediaArticle .. "]]'''"), [=[</div>
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Wikimedia-logo.svg|35px|none|ဝဳကဳမဳဒဳယာ ခမ်မောန်]]
| style="border-bottom: 1px solid lightgray;" | '''ဝဳကဳမဳဒဳယာ ခမ်မောန်'''မနွံဒၟံၚ်လေန်နကဵုမဆက်စပ်လဝ်ပရောပရာ]=], canonicalName, [=[ပ္ဍဲပရဝ်ဂျေတ်ၝုဲဒေံဂမၠိုၚ်:
<div style="padding: 5px 10px">]=], (setsister == "-" and "''None.''" or "'''[[commons:" .. setsister .. "|" .. setsister .. "]]'''"), [=[</div>
|-
| style="vertical-align: top; height: 35px; width: 40px; border-bottom: 1px solid lightgray;" | [[File:Wiktionary-logo-v2.svg|35px|none|ဝိက်ရှေန်နရဳ]]
|style="border-bottom: 1px solid lightgray;" | '''ဝိက်ရှေန်နရဳမချူပလေဝ်ဒါန်''']=], wikt_plural, [=[ဆၜိုတ်ပ္ဍဲ]=], canonicalName, [=[:
<div style="padding: 5px 10px">]=], wiktionarylinks, [=[</div>
|-
| style="vertical-align: top; height: 35px; border-bottom: 1px solid lightgray;" | [[File:Open book nae 02.svg|35px|none|ပရေၚ်ပၠောပ်စုတ်]]
| style="border-bottom: 1px solid lightgray;" | '''ဝိက်ရှေန်နရဳပရေၚ်ပၠောပ်စုတ်'''သွက်ဆေၚ်စပ်ကဵုယၟုဘာသာမန်ဂမၠိုၚ်:
<div style="padding: 5px 10px">''']=], require("Module:links").full_link({lang = m_languages.getByCode("en"), term = entryname or canonicalName}), [=['''</div>
|-
| style="vertical-align: top; height: 35px;" | [[File:Crystal kfind.png|35px|none|Considerations]]
|| '''ဝိက်ရှေန်နရဳသောၚ်တလး'''သွက်ဂွံပွမဗိုၚ်ချူပလေဝ်ဒါန်နကဵုပရေၚ်ပၠောပ်စုတ်]=], canonicalName, [=[ဂမၠိုၚ်:
<div style="padding: 5px 0">
* '''[[ဝိက်ရှေန်နရဳ:ထ္ၜးဂၠံၚ်ပရေၚ်ပၠောပ်စုတ်]=], canonicalName, [=[ဂမၠိုၚ်]]'''
* '''[[:ကဏ္ဍ:ထာမ်ပလိက်နိဿဲ]=], canonicalName, [=[ဂမၠိုၚ်|ထာမ်ပလိက်နိဿဲ]] ({{PAGESINCAT:ထာမ်ပလိက်နိဿဲ]=], canonicalName, [=[ဂမၠိုၚ်}})'''
* '''[[အဆက်လက္ကရဴ:စရၚ်ပြကိုဟ်နိဿဲ]=], canonicalName, [=[ |စရၚ်ပြကိုဟ်နိဿဲ]]'''
|}
</div>]=]
}
end
local function edit_link(title, text)
return '<span class="plainlinks">['
-- The "action" query parameter is a fixed MediaWiki keyword and must not be localized.
.. tostring(mw.uri.fullUrl(title, { action = "edit" }))
.. ' ' .. text .. ']</span>'
end
-- Should perhaps use wiki syntax.
local function infobox(lang)
local ret = {}
insert(ret, '<table class="wikitable language-category-info"')
local raw_data = lang:getData("extra")
if raw_data then
local replacements = {
[1] = "canonical-name",
[2] = "wikidata-item",
[3] = "family",
[4] = "scripts",
}
local function replacer(letter1, letter2)
return letter1:lower() .. "-" .. letter2:lower()
end
-- For each key in the language data modules, returns a descriptive
-- kebab-case version (containing ASCII lowercase words separated
-- by hyphens).
local function kebab_case(key)
key = replacements[key] or key
key = key:gsub("(%l)(%u)", replacer):gsub("(%l)_(%l)", replacer)
return key
end
local compress = {compress = true}
local function html_attribute_encode(str)
str = to_json(str, compress)
:gsub('"', "&quot;")
-- & in attributes is automatically escaped.
-- :gsub("&", "&amp;")
:gsub("<", "&lt;")
:gsub(">", "&gt;")
return str
end
insert(ret, ' data-code="' .. lang:getCode() .. '"')
for k, v in sorted_pairs(raw_data) do
insert(ret, " data-" .. kebab_case(k)
.. '="'
.. html_attribute_encode(v)
.. '"')
end
end
insert(ret, '>\n')
insert(ret, '<tr class="language-category-data">\n<th colspan="2">'
.. edit_link(lang:getDataModuleName(), "ဒေတာပလေဝ်ဒါန်အရေဝ်ဘာသာ")
.. "</th>\n</tr>\n")
insert(ret, "<tr>\n<th>ယၟုတိုၚ်ခဳ</th><td>" .. lang:getCanonicalName() .. "</td>\n</tr>\n")
local otherNames = lang:getOtherNames()
if otherNames then
local names = {}
for _, name in ipairs(otherNames) do
insert(names, "<li>" .. name .. "</li>")
end
if #names > 0 then
insert(ret, "<tr>\n<th>Other names</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
local aliases = lang:getAliases()
if aliases then
local names = {}
for _, name in ipairs(aliases) do
insert(names, "<li>" .. name .. "</li>")
end
if #names > 0 then
insert(ret, "<tr>\n<th>ယၟုမထပ်ကော်သာ်တၞဟ်</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
local varieties = lang:getVarieties()
if varieties then
local names = {}
for _, name in ipairs(varieties) do
if type(name) == "string" then
insert(names, "<li>" .. name .. "</li>")
else
assert(type(name) == "table")
local first_var
local subvars = {}
for i, var in ipairs(name) do
if i == 1 then
first_var = var
else
insert(subvars, "<li>" .. var .. "</li>")
end
end
if #subvars > 0 then
insert(names, "<li><dl><dt>" .. first_var .. "</dt>\n<dd><ul>" .. concat(subvars, "\n") .. "</ul></dd></dl></li>")
elseif first_var then
insert(names, "<li>" .. first_var .. "</li>")
end
end
end
if #names > 0 then
insert(ret, "<tr>\n<th>Varieties</th><td><ul>" .. concat(names, "\n") .. "</ul></td>\n</tr>\n")
end
end
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဂမၠိုၚ်|ကုဒ်အရေဝ်ဘာသာ]]</th><td><code>" .. lang:getCode() .. "</code></td>\n</tr>\n")
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဝေါၚ်သဂမၠိုၚ်|အရေဝ်ဘာသာဝေါၚ်သ]]</th>\n")
local fam = lang:getFamily()
local famCode = fam and fam:getCode()
if not fam then
insert(ret, "<td>unclassified</td>")
elseif famCode == "qfa-iso" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်ဂမၠိုၚ်|အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်]]</td>")
elseif famCode == "qfa-mix" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်ဂမၠိုၚ်|အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်]]</td>")
elseif famCode == "sgn" then
insert(ret, "<td>[[:ကဏ္ဍ:အရေဝ်ဘာသာလက္ခဏာသမ္တီဂမၠိုၚ်|အရေဝ်ဘာသာလက္ခဏာသမ္တီ]]</td>")
elseif famCode == "crp" then
insert(ret, "<td>[[:ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်ဂမၠိုၚ်|ခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်]]</td>")
elseif famCode == "art" then
insert(ret, "<td>[[:ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ဂမၠိုၚ်|ဂကောံဘာသာခၞံဗဒှ်လဝ်]]</td>")
else
insert(ret, "<td>" .. fam:makeCategoryLink() .. "</td>")
end
insert(ret, "\n</tr>\n<tr>\n<th>ဇုဇဗဴလဂမၠိုၚ်</th>\n<td>")
local ancestors = lang:getAncestors()
if ancestors[2] then
local ancestorList = {}
for i, anc in ipairs(ancestors) do
ancestorList[i] = "<li>" .. anc:makeCategoryLink() .. "</li>"
end
insert(ret, "<ul>\n" .. concat(ancestorList, "\n") .. "</ul>")
else
local ancestorChain = lang:getAncestorChainOld()
if ancestorChain[1] then
local chain = {}
for _, anc in reverse_ipairs(ancestorChain) do
insert(chain, "<li>" .. anc:makeCategoryLink() .. "</li>")
end
insert(ret, "<ul>\n" .. concat(chain, "\n<ul>\n") .. ("</ul>"):rep(#chain))
else
insert(ret, "unknown")
end
end
insert(ret, "</td>\n</tr>\n")
local scripts = lang:getScripts()
if scripts[1] then
local script_text = {}
local function makeScriptLine(sc)
local code = sc:getCode()
local url = tostring(mw.uri.fullUrl('Special:Search', {
search = 'contentmodel:css insource:"' .. code
.. '" insource:/\\.' .. code .. '/',
ns8 = '1'
}))
return sc:makeCategoryLink()
.. ' (<span class="plainlinks" title="Search for stylesheets referencing this script">[' .. url .. ' <code>' .. code .. '</code>]</span>)'
end
local function add_Hrkt(text)
insert(text, "<li>" .. makeScriptLine(Hrkt))
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hira) .. "</li>")
insert(text, "<li>" .. makeScriptLine(Kana) .. "</li>")
insert(text, "</ul>")
insert(text, "</li>")
end
for _, sc in ipairs(scripts) do
local text = {}
local code = sc:getCode()
if code == "Hrkt" then
add_Hrkt(text)
else
insert(text, "<li>" .. makeScriptLine(sc))
if code == "Jpan" then
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hani) .. "</li>")
add_Hrkt(text)
insert(text, "</ul>")
elseif code == "Kore" then
insert(text, "<ul>")
insert(text, "<li>" .. makeScriptLine(Hang) .. "</li>")
insert(text, "<li>" .. makeScriptLine(Hani) .. "</li>")
insert(text, "</ul>")
end
insert(text, "</li>")
end
insert(script_text, concat(text, "\n"))
end
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အက္ခရ်ဂမၠိုၚ်|အက္ခရ်ဂမၠိုၚ်]]</th>\n<td><ul>\n" .. concat(script_text, "\n") .. "</ul></td>\n</tr>\n")
else
insert(ret, "<tr>\n<th>[[ဝိက်ရှေန်နရဳ:အက္ခရ်ဂမၠိုၚ်|အက္ခရ်ဂမၠိုၚ်]]</th>\n<td>not specified</td>\n</tr>\n")
end
local function add_module_info(raw_data, heading)
if raw_data then
local scripts = lang:getScriptCodes()
local module_info, add = {}, false
if type(raw_data) == "string" then
insert(module_info,
("[[မဝ်ဂျူ:%s]]"):format(raw_data))
add = true
else
local raw_data_type = type(raw_data)
if raw_data_type == "table" and size(scripts) == 1 and type(raw_data[scripts[1]]) == "string" then
insert(module_info,
("[[မဝ်ဂျူ:%s]]"):format(raw_data[scripts[1]]))
add = true
elseif raw_data_type == "table" then
insert(module_info, "<ul>")
for script, data in sorted_pairs(raw_data) do
if type(data) == "string" and m_sc_getByCode(script) then
insert(module_info, ("<li><code>%s</code>: [[မဝ်ဂျူ:%s]]</li>"):format(script, data))
end
end
insert(module_info, "</ul>")
add = size(module_info) > 2
end
end
if add then
insert(ret, [=[
<tr>
<th>]=] .. heading .. [=[</th>
<td>]=] .. concat(module_info) .. [=[</td>
</tr>
]=])
end
end
end
add_module_info(raw_data.generate_forms, "Form-generating<br>module")
add_module_info(raw_data.translit, "[[ဝိက်ရှေန်နရဳ:ကၠာဲပ္တိတ်မအခဝ် ကဵု ပြၚ်လှာဲအက္ခရ်ရဝ်မာန်|မဝ်ဂျူ<br>ကၠာဲပ္တိတ်မအခဝ်]]")
add_module_info(raw_data.display_text, "မဝ်ဂျူ<br>ထ္ၜးမလိက်")
add_module_info(raw_data.entry_name, "မဝ်ဂျူ<br>ယၟုစရၚ်")
add_module_info(raw_data.sort_key, "[[sortkey|မဝ်ဂျူ]]<br>ကဳပါ်အဇာ")
local wikidataItem = lang:getWikidataItem()
if wikidataItem and mw.wikibase then
local URL = mw.wikibase.getEntityUrl(wikidataItem)
local link
if URL then
link = '[' .. URL .. ' ' .. wikidataItem .. ']'
else
link = '<span class="error">Invalid Wikidata item: <code>' .. wikidataItem .. '</code></span>'
end
insert(ret, "<tr><th>Wikidata</th><td>" .. link .. "</td></tr>")
end
insert(ret, "</table>")
return concat(ret)
end
local function NavFrame(content, title)
return '<div class="NavFrame"><div class="NavHead">'
.. (title or '{{{title}}}') .. '</div>'
.. '<div class="NavContent" style="text-align: left;">'
.. content
.. '</div></div>'
end
local function get_description_topright_additional(lang, locations, extinct, setwiki, setwikt, setsister, entryname)
local nameWithLanguage = lang:getCategoryName("nocap")
if lang:getCode() == "und" then
local description =
"ဣတဏအ်ဂှ်ဝွံဆေၚ်စပ်ကဵုကဏ္ဍအဓိက'''" .. nameWithLanguage .. "'''၊ မအာတ်မိက်ထ္ၜးလဝ်ပ္ဍဲဝိက်ရှေန်နရဳသီုကဵု[[ဝိက်ရှေန်နရဳ:အရေဝ်ဘာသာဂမၠိုၚ်|ကုဒ်]] '''" .. lang:getCode() .. "'''ရအဴ။" ..
"မအရေဝ်ဘာသာလုပ်အဝေါၚ်တဏအ်ဝွံပ္ဍဲပွမချူဆေၚ်စပ်ကဵုဝၚ်၊ သီုကဵုမအရေဝ်အဓိပ္ပာဲကီုလေဝ်မက္တဵုဒှ်လဝ်ဂလာန်သတ်ဒတ်နူကဵုတၠပညာဟၟဲမွဲဏီ။"
return description, nil, nil
end
local canonicalName = lang:getCanonicalName()
local topright = linkbox(lang, setwiki, setwikt, setsister, entryname)
-- The Mon description takes no article before the language name (compare the "und" case above).
local description = "ဣတဏအ်ဂှ်ဝွံဆေၚ်စပ်ကဵုကဏ္ဍအဓိက'''" .. nameWithLanguage .. "'''ရအဴ။"
local location_links = {}
local prep
local saw_embedded_comma = false
for _, location in ipairs(locations) do
local this_prep
if location == "the world" then
this_prep = "across"
insert(location_links, location)
elseif location ~= "UNKNOWN" then
this_prep = "in"
if location:find(",") then
saw_embedded_comma = true
end
insert(location_links, link_location(location))
end
if this_prep then
if prep and this_prep ~= prep then
error("Can't handle location 'the world' along with another location (clashing prepositions)")
end
prep = this_prep
end
end
local location_desc
if #location_links > 0 then
local location_link_text
if saw_embedded_comma and #location_links >= 3 then
location_link_text = mw.text.listToText(location_links, "; ", "; and ")
else
location_link_text = serial_comma_join(location_links)
end
location_desc = ("It is %s %s %s.\n\n"):format(
extinct and "an [[extinct language]] that was formerly spoken" or "spoken", prep, location_link_text)
elseif extinct then
location_desc = "It is an [[extinct language]].\n\n"
else
location_desc = ""
end
local add = location_desc .. "Information about " .. canonicalName .. ":\n\n" .. infobox(lang)
if lang:hasType("reconstructed") then
add = add .. "\n\n" ..
ucfirst(canonicalName) .. " is a reconstructed language. Its words and roots are not directly attested in any written works, but have been reconstructed through the ''comparative method'', " ..
"which finds regular similarities between languages that cannot be explained by coincidence or word-borrowing, and extrapolates ancient forms from these similarities.\n\n" ..
"According to our [[Wiktionary:Criteria for inclusion|criteria for inclusion]], terms in " .. canonicalName ..
" should '''not''' be present in entries in the main namespace, but may be added to the Reconstruction: namespace."
elseif lang:hasType("appendix-constructed") then
add = add .. "\n\n" ..
ucfirst(canonicalName) .. " is a constructed language that is only in sporadic use. " ..
"According to our [[Wiktionary:Criteria for inclusion|criteria for inclusion]], terms in " .. canonicalName ..
" should '''not''' be present in entries in the main namespace, but may be added to the Appendix: namespace. " ..
"All terms in this language may be available at [[Appendix:" .. ucfirst(canonicalName) .. "]]."
end
local about = new_title("ဝိက်ရှေန်နရဳ:ပရူ" .. canonicalName)
if about.exists then
add = add .. "\n\n" ..
"Please see '''[[ဝိက်ရှေန်နရဳ:ပရူ" .. canonicalName .. "]]''' for information and special considerations for creating " .. nameWithLanguage .. " entries."
end
local ok, tree_of_descendants = pcall(
require("Module:family tree").print_children,
lang:getCode(), {
protolanguage_under_family = true,
must_have_descendants = true
})
if ok then
if tree_of_descendants then
add = add .. NavFrame(
tree_of_descendants,
"Family tree")
else
add = add .. "\n\n" .. ucfirst(lang:getCanonicalName())
.. " has no descendants or varieties listed in Wiktionary's language data modules."
end
else
mw.log("error while generating tree: " .. tostring(tree_of_descendants))
end
return description, topright, add
end
local function get_parents(lang, locations, extinct)
local canonicalName = lang:getCanonicalName()
local sortkey = {sort_base = canonicalName, lang = "mnw"}
local ret = {{name = "အရေဝ်ဘာသာအိုတ်သီုဂမၠိုၚ်", sort = sortkey}}
local fam = lang:getFamily()
local famCode = fam and fam:getCode()
-- FIXME: Some of the following categories should be added to this module.
if not fam then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာအပြောံဂမၠိုၚ်", sort = sortkey})
elseif famCode == "qfa-iso" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမပါ်ပ္တိတ်လဝ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "qfa-mix" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမပံၚ်ဖနှဴလဝ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "sgn" then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာလက္ခဏာသမ္တီသီုဖ္အိုတ်ဂမၠိုၚ်", sort = sortkey})
elseif famCode == "crp" then
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်ဂမၠိုၚ်", sort = sortkey})
for _, anc in ipairs(lang:getAncestors()) do
-- Avoid Haitian Creole being categorised in [[:Category:Haitian Creole-based creole or pidgin languages]], as one of its ancestors is an etymology-only variety of it.
-- Use that ancestor's ancestors instead.
if anc:getFullCode() == lang:getCode() then
for _, anc_extra in ipairs(anc:getAncestors()) do
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc_extra:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
else
insert(ret, {name = "ကဏ္ဍ:ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
end
elseif famCode == "art" then
if lang:hasType("appendix-constructed") then
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ပါဲနူအဆက်လက္ကရဴဂမၠိုၚ်", sort = sortkey})
else
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်ဂမၠိုၚ်", sort = sortkey})
end
for _, anc in ipairs(lang:getAncestors()) do
if anc:getFullCode() == lang:getCode() then
for _, anc_extra in ipairs(anc:getAncestors()) do
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc_extra:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
else
insert(ret, {name = "ကဏ္ဍ:ဂကောံဘာသာခၞံဗဒှ်လဝ်မရပ်စပ်လဝ်ဘာသာ" .. ucfirst(anc:getFullName()) .. "နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်", sort = sortkey})
end
end
else
insert(ret, {name = "ကဏ္ဍ:" .. fam:getCategoryName(), sort = sortkey})
if lang:hasType("reconstructed") then
insert(ret, {
name = "ကဏ္ဍ:အရေဝ်ဘာသာဗီုပြၚ်သိုၚ်တၟိဂမၠိုၚ်",
sort = {sort_base = canonicalName:gsub("^%-အခိုက်ကၞာ", ""), lang = "mnw"}
})
end
end
local function add_sc_cat(sc)
insert(ret, {name = "ကဏ္ဍ:ဘာသာ" .. sc:getCategoryName() , sort = sortkey})
end
local function add_Hrkt()
add_sc_cat(Hrkt)
add_sc_cat(Hira)
add_sc_cat(Kana)
end
for _, sc in ipairs(lang:getScripts()) do
if sc:getCode() == "Hrkt" then
add_Hrkt()
else
add_sc_cat(sc)
if sc:getCode() == "Jpan" then
add_sc_cat(Hani)
add_Hrkt()
elseif sc:getCode() == "Kore" then
add_sc_cat(Hang)
add_sc_cat(Hani)
end
end
end
if lang:hasTranslit() then
insert(ret, {name = "ကဏ္ဍ:ဘာသာမနွံကဵုပြၚ်လှာဲကၠာဲမအခဝ်အဝ်တဝ်", sort = sortkey})
end
local function insert_location_language_cat(location)
local cat = "အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု" .. location .. "ဂမၠိုၚ်"
insert(ret, {name = "ကဏ္ဍ:" .. cat, sort = sortkey})
local auto_cat_args = scrape_category_for_auto_cat_args(cat)
local location_parent = auto_cat_args and auto_cat_args.parent
if location_parent then
local split_parents = require(parse_utilities_module).split_on_comma(location_parent)
for _, parent in ipairs(split_parents) do
parent = parent:match("^(.-):.*$") or parent
insert_location_language_cat(parent)
end
end
end
local saw_location = false
for _, location in ipairs(locations) do
if location ~= "UNKNOWN" then
saw_location = true
insert_location_language_cat(location)
end
end
if extinct then
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာမကၠေံဗ္ဒန်အာသီုဖ္အိုတ်ဂမၠိုၚ်", sort = sortkey})
end
if not saw_location and not (lang:hasType("reconstructed") or (fam and fam:getCode() == "art")) then
-- Constructed and reconstructed languages don't need a location specified and often won't have one,
-- so don't put them in this maintenance category.
insert(ret, {name = "ကဏ္ဍ:အရေဝ်ဘာသာဟွံဂွံစုတ်အဇာအပ္ဍဲကဏ္ဍဍုၚ်အတေံဏီ", sort = sortkey})
end
return ret
end
local function get_children()
local ret = {}
-- FIXME: We should work on the children mechanism so it isn't necessary to manually specify these.
for _, label in ipairs({"ဝေါဟာအဓိက"}) do
insert(ret, {name = label, is_label = true})
end
return ret
end
-- Handle language categories of the form e.g. [[:Category:French language]] and
-- [[:Category:British Sign Language]].
insert(raw_handlers, function(data)
local category = data.category
local lang = m_languages.getByCanonicalName(category)
if not lang then
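-- Capture the language name that follows the "ဘာသာ" prefix; without a capture, match() returns only the prefix itself.
local langname = category:match("^ဘာသာ(.+)$")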
if langname then
lang = m_languages.getByCanonicalName(langname)
end
if not lang then
return nil
end
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
["setwiki"] = true,
["setwikt"] = true,
["setsister"] = true,
["entryname"] = true,
["extinct"] = {type = "boolean"},
})
-- If called from inside, don't require any arguments, as they can't be known
-- in general and aren't needed just to generate the first parent (used for
-- breadcrumbs).
if #args[1] == 0 and not data.called_from_inside then
-- At least one location must be specified unless the language is constructed (e.g. Esperanto) or reconstructed (e.g. Proto-Indo-European).
local fam = lang:getFamily()
if not (lang:hasType("reconstructed") or (fam and fam:getCode() == "art")) then
error("At least one location (param 1=) must be specified for language '" .. lang:getCanonicalName() .. "' (code '" .. lang:getCode() .. "'). " ..
"Use the value UNKNOWN if the language's location is truly unknown.")
end
end
local description, topright, additional = "", "", ""
-- If called from inside the category tree system, it's called when generating
-- parents or children, and we don't need to generate the description or additional
-- text (which is very expensive in terms of memory because it calls [[Module:family tree]],
-- which calls [[Module:languages/data/all]]).
if not data.called_from_inside then
description, topright, additional = get_description_topright_additional(
lang, args[1], args.extinct, args.setwiki, args.setwikt, args.setsister, args.entryname
)
end
return {
canonical_name = lang:getCategoryName(),
description = description,
lang = lang:getCode(),
topright = topright,
additional = additional,
breadcrumb = lang:getCanonicalName(),
parents = get_parents(lang, args[1], args.extinct),
extra_children = get_children(lang),
umbrella = false,
can_be_empty = true,
}, true
end)
-- Handle categories such as [[:Category:Languages of Indonesia]].
insert(raw_handlers, function(data)
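-- Capture the location between the fixed prefix and the plural suffix used by insert_location_language_cat().
local location = data.category:match("^အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု(.+)ဂမၠိုၚ်$")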
if location then
local args = require("Module:parameters").process(data.args, {
["flagfile"] = true,
["commonscat"] = true,
["wp"] = true,
["basename"] = true,
["parent"] = true,
["locationcat"] = true,
["locationlink"] = true,
})
local topright
local basename = args.basename or location:gsub(", .*", "")
if args.flagfile ~= "-" then
local flagfile_arg = args.flagfile or ("Flag of %s.svg"):format(basename)
local files = require(parse_utilities_module).split_on_comma(flagfile_arg)
local topright_parts = {}
for _, file in ipairs(files) do
local flagfile = "File:" .. file
local flagfile_page = new_title(flagfile)
if flagfile_page and flagfile_page.file.exists then
insert(topright_parts, ("[[%s|right|100px|border]]"):format(flagfile))
elseif args.flagfile then
error(("Explicit flagfile '%s' doesn't exist"):format(flagfile))
end
end
topright = concat(topright_parts)
end
if args.wp then
local wp = require("Module:yesno")(args.wp, "+")
if wp == "+" or wp == true then
wp = data.category
end
if wp then
local wp_topright = ("{{wikipedia|%s}}"):format(wp)
if topright then
topright = topright .. wp_topright
else
topright = wp_topright
end
end
end
if args.commonscat then
local commonscat = require("Module:yesno")(args.commonscat, "+")
if commonscat == "+" or commonscat == true then
commonscat = data.category
end
if commonscat then
local commons_topright = ("{{commonscat|%s}}"):format(commonscat)
if topright then
topright = topright .. commons_topright
else
topright = commons_topright
end
end
end
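-- Strip the leading particle if present; the capture returns the remainder rather than the particle itself.
local bare_location = location:match("^မဆေၚ်စပ်ကဵု(.*)$") or location
local location_link = args.locationlink or link_location(location)
local bare_basename = basename:match("^မဆေၚ်စပ်ကဵု(.*)$") or basename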
local parents = {}
if args.parent then
local explicit_parents = require(parse_utilities_module).split_on_comma(args.parent)
for i, parent in ipairs(explicit_parents) do
local actual_parent, sort_key = parent:match("^(.-):(.*)$")
if actual_parent then
parent = actual_parent
sort_key = sort_key:gsub("%+", bare_location)
else
sort_key = " " .. bare_location
end
insert(parents, {name = "အရေဝ်ဘာသာမဆေၚ်စပ်ကဵု" .. parent, sort = sort_key})
end
else
insert(parents, {name = "အရေဝ်ဘာသာဗက်အလိုက်ဍုၚ်ရး", sort = {sort_base = bare_location, lang = "mnw"}})
end
if args.locationcat then
local explicit_location_cats = require(parse_utilities_module).split_on_comma(args.locationcat)
for i, locationcat in ipairs(explicit_location_cats) do
-- No explicit sort key: the stray positional `sort` referred to an undefined global (nil).
insert(parents, {name = "ကဏ္ဍ:ဘာသာ" .. locationcat})
end
else
local location_cat = ("ကဏ္ဍ:%s"):format(bare_location)
local location_page = new_title(location_cat)
if location_page and location_page.exists then
-- No explicit sort key: the stray positional `sort` referred to an undefined global (nil).
insert(parents, {name = "ဘာသာ" .. location_cat})
end
end
local description = ("Categories for languages of %s (including sublects)."):format(location_link)
return {
topright = topright,
description = description,
parents = parents,
breadcrumb = bare_basename,
additional = "{{{umbrella_msg}}}",
}, true
end
end)
-- Handle categories such as [[:Category:English-based creole or pidgin languages]].
insert(raw_handlers, function(data)
-- Capture the base language name between the fixed prefix and suffix built in get_parents().
local langname = data.category:match("^ဘာသာခရေဝ်အဝ် ဝါ ဖှေတ်ကျေန်မရပ်စပ်လဝ်ဘာသာ(.+)နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်$")
if langname then
local lang = m_languages.getByCanonicalName(langname)
if lang then
return {
lang = lang:getCode(),
description = "Languages which developed as a [[creole]] or [[pidgin]] from " .. lang:makeCategoryLink() .. ".",
parents = {{name = "Creole or pidgin languages", sort = {sort_base = "*" .. langname, lang = "mnw"}}},
breadcrumb = "တံသ္ဇိုၚ်-" .. lang:getCanonicalName(),
}
end
end
end)
-- Handle categories such as [[:Category:English-based constructed languages]].
insert(raw_handlers, function(data)
-- Capture the base language name between the fixed prefix and suffix built in get_parents().
local langname = data.category:match("^ဂကောံဘာသာခၞံဗဒှ်လဝ်မရပ်စပ်လဝ်ဘာသာ(.+)နကဵုတံသ္ဇိုၚ်ဂမၠိုၚ်$")
if langname then
local lang = m_languages.getByCanonicalName(langname)
if lang then
return {
lang = lang:getCode(),
description = "Constructed languages which are based on " .. lang:makeCategoryLink() .. ".",
parents = {{name = "Constructed languages", sort = {sort_base = "*" .. langname, lang = "mnw"}}},
breadcrumb = "တံသ္ဇိုၚ်-" .. lang:getCanonicalName(),
}
end
end
end)
return {RAW_CATEGORIES = raw_categories, RAW_HANDLERS = raw_handlers}
ကဏ္ဍ:ဘာသာသာမိ သၟဝ်ကျာ
14
22842
385640
385621
2026-04-02T17:38:06Z
咽頭べさ
33
385640
wikitext
text/x-wiki
{{auto cat|ဍုၚ်နဝ်ဝေ|ဍုၚ်သွဳဒေန်|ဍုၚ်ဖေန်လာန်}}
မဝ်ဂျူ:ja/data/range
828
62224
385650
104620
2026-04-02T20:06:23Z
咽頭べさ
33
385650
Scribunto
text/plain
local u = require("Module:string utilities").char
local range = {}
range.kanji =
u(0x2E80) .. "-" .. u(0x2FDF) .. -- CJK Radicals Supplement + Kangxi Radicals
u(0x4E00) .. "-" .. u(0x9FFF) .. -- CJK Unified Ideographs
u(0x3400) .. "-" .. u(0x4DBF) .. -- CJK Unified Ideographs Extension A
u(0xF900) .. "-" .. u(0xFAFF) .. -- CJK Compatibility Ideographs
u(0x20000) .. "-" .. u(0x2A6DF) .. -- CJK Unified Ideographs Extension B
u(0x2A700) .. "-" .. u(0x2EE5F) .. -- CJK Unified Ideographs Extension C-F & I
u(0x2F800) .. "-" .. u(0x2FA1F) .. -- CJK Compatibility Ideographs Supplement
u(0x30000) .. "-" .. u(0x323AF) .. -- CJK Unified Ideographs Extension G-H
u(0x323B0) .. "-" .. u(0x3347F) -- CJK Unified Ideographs Extension J
range.kana_combining_characters =
u(0x3099) .. "-" .. u(0x309C) .. -- Hiragana
u(0xFF9E) .. u(0xFF9F) .. -- Halfwidth and Fullwidth Forms
u(0x0305) .. u(0x0323) -- Combining Diacritical Marks
range.kana_overlap =
range.kana_combining_characters ..
"〰-〵" .. -- CJK Symbols and Punctuation
"ー" -- Katakana
local hiragana_exclusive =
"ぁ-ゖゝゞ" .. -- Hiragana
"𛀁𛀆𛄟" .. -- Kana Supplement + Kana Extended-A
"𛄲𛅐-𛅒" -- Small Kana Extension
range.hiragana = range.kana_overlap .. hiragana_exclusive
local katakana_exclusive =
"ァ-ヺヽヾ" .. -- Katakana
"ㇰ-ㇿ" .. -- Katakana Phonetic Extensions
u(0xFF66) .. "-" .. u(0xFF9D) .. -- Halfwidth and Fullwidth Forms
"𚿰-𚿾" .. -- Kana Extended-B
"𛀀𛄠-𛄢" .. -- Kana Supplement + Kana Extended-A
"𛅕𛅤-𛅧" -- Small Kana Extension
range.katakana = range.kana_overlap .. katakana_exclusive
range.hentaigana =
"𛀂-𛀅𛀇-𛄞" -- Kana Supplement + Kana Extended-A
range.kana = range.kana_overlap .. hiragana_exclusive .. katakana_exclusive .. range.hentaigana
-- Note: not other sutegana like っ, as they aren't submoraic.
range.submoraic_kana =
"ぁぃぅぇぉゃゅょゎ" .. -- Hiragana
"ァィゥェォャュョヮ" .. -- Katakana
u(0xFF67) .. "-" .. u(0xFF6E) .. -- Halfwidth and Fullwidth Forms (halfwidth small kana)
"𛅐𛅑𛅒𛅤𛅥𛅦" -- Small Kana Extension
range.vowels = {
a = "ぁあかがさざただなはばぱまゃやらゎわァアカガサザタダナハバパマャヤラヮワヷ",
i = "ぃいきぎしじちぢにひびぴみ𛀆り𛅐ゐィイキギシジチヂニヒビピミ𛄠リ𛅤ヰヸ",
u = "ぅうゔくぐすずつづぬふぶぷむゅゆる𛄟ゥウヴクグスズツヅヌフブプムュユル𛄢",
e = "ぇえけげせぜてでねへべぺめ𛀁れ𛅑ゑェエ𛀀ケゲセゼテデネヘベペメ𛄡レ𛅥ヱヹ",
o = "ぉおこごそぞとどのほぼぽもょよろ𛅒をォオコゴソゾトドノホボポモョヨロ𛅦ヲヺ",
n = "んン"
}
range.ideograph =
"〃々-〇〱-〵〻〼" .. -- CJK Symbols and Punctuation
"㈠-㉟㊀-㋿" .. -- Enclosed CJK Letters and Months
"㍘-㏿" .. -- CJK Compatibility
"🈂-" -- Enclosed Ideographic Supplement
range.kana_graph =
"ゟヿ" .. -- Hiragana + Katakana
"㌀-㍗" .. -- CJK Compatibility
"🈀🈁" -- Enclosed Ideographic Supplement
range.punctuation =
u(0x3000) .. "-。〈-】〔-〟〽" .. -- CJK Symbols and Punctuation (range starts at the ideographic space)
"゠・" .. -- Katakana
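u(0xFF01) .. "-" .. u(0xFF0F) .. u(0xFF1A) .. "-" .. u(0xFF20) .. u(0xFF3B) .. "-" .. u(0xFF40) .. u(0xFF5B) .. "-" .. u(0xFF65) .. u(0xFFE0) .. "-" .. u(0xFFEE) -- Halfwidth and Fullwidth Forms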
range.latin = require("Module:scripts").getByCode("Latn"):getCharacters()
range.numbers =
"0-9" .. -- Basic Latin
u(0xFF10) .. "-" .. u(0xFF19) -- Halfwidth and Fullwidth Forms (fullwidth digits)
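-- Usage sketch (an assumption about how callers consume this table, not part of the original module):
-- the string-valued fields above are bodies of character classes, so a caller would wrap them in
-- brackets before matching. `text` below is a hypothetical input string.
--   local range = require("Module:ja/data/range")
--   local only_kana = mw.ustring.find(text, "^[" .. range.kana .. "]+$") ~= nil
-- range.vowels is the exception: it is a table of plain character lists keyed by vowel.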
return range
မဝ်ဂျူ:category tree/fam/trk
828
285995
385632
2026-04-02T16:36:03Z
咽頭べさ
33
Created page with "local labels = {} ------- Turkic izafet I/II/III compounds ------- -- FIXME: Possibly should be limited to a subfamily of Turkic. labels["izafet I compounds"] = { description = "{{{langname}}} izafet I compounds, i.e. nominal compounds consisting of two nouns both lacking 3rd-person possessive marking.", additional = "These compounds are right-headed (the second noun is modified by the first), unlike Persian {{lg..."
385632
Scribunto
text/plain
local labels = {}
------- Turkic izafet I/II/III compounds -------
-- FIXME: Possibly should be limited to a subfamily of Turkic.
labels["izafet I compounds"] = {
description = "{{{langname}}} izafet I compounds, i.e. nominal compounds consisting of two nouns both lacking 3rd-person possessive marking.",
additional = "These compounds are right-headed (the second noun is modified by the first), unlike Persian {{lg|ezafe}} compounds, which are typically left-headed.",
breadcrumb_and_first_sort_key = "izafet I",
parents = {"compound terms"},
}
labels["izafet II compounds"] = {
description = "{{{langname}}} izafet II compounds, i.e. nominal compounds with the first noun having zero-marking, and the second noun receiving a possessive suffix.",
additional = "These compounds are right-headed (the second noun is modified by the first), unlike Persian {{lg|ezafe}} compounds, which are typically left-headed.",
breadcrumb_and_first_sort_key = "izafet II",
parents = {"compound terms"},
}
labels["izafet III compounds"] = {
description = "{{{langname}}} izafet III compounds, i.e. nominal compounds with the first noun in the genitive case and the second noun receiving a possessive suffix.",
additional = "These compounds are right-headed (the second noun is modified by the first), unlike Persian {{lg|ezafe}} compounds, which are typically left-headed.",
breadcrumb_and_first_sort_key = "izafet III",
parents = {"compound terms"},
}
labels["Persian-style izafet compounds"] = {
description = "{{{langname}}} Persian-style izafet compounds, i.e. left-headed nominal compounds with the first noun receiving a Persian-style {{lg|ezafe}} suffix and the second noun having zero-marking.",
additional = "These compounds are left-headed (the first noun is modified by the second), unlike native Turkic izafet compounds, which are always right-headed.",
breadcrumb_and_first_sort_key = "Persian-style",
parents = {"izafet II compounds"},
}
-- Add 'umbrella_parents' key if not already present.
for key, data in pairs(labels) do
if not data.umbrella_parents then
data.umbrella_parents = "Types of compound terms by language"
end
end
return {LABELS = labels}
မဝ်ဂျူ:category tree/fam/roa-ibe
828
285996
385633
2026-04-02T16:37:52Z
咽頭べさ
33
Created page with "local labels = {} local conjugations = { ["ar"] = "{{{langname}}} first conjugation verbs, derived from Latin [[:Category:Latin first conjugation verbs|first conjugation (-āre) verbs]].", ["er"] = "{{{langname}}} second conjugation verbs, derived from Latin [[:Category:Latin second conjugation verbs|second conjugation (-ēre)]] or [[:Category:Latin third conjugation verbs|third conjugation (-ere)]] verbs.", ["ir"..."
385633
Scribunto
text/plain
local labels = {}
local conjugations = {
["ar"] = "{{{langname}}} first conjugation verbs, derived from Latin [[:Category:Latin first conjugation verbs|first conjugation (-āre) verbs]].",
["er"] = "{{{langname}}} second conjugation verbs, derived from Latin [[:Category:Latin second conjugation verbs|second conjugation (-ēre)]] or [[:Category:Latin third conjugation verbs|third conjugation (-ere)]] verbs.",
["ir"] = "{{{langname}}} third conjugation verbs, derived from Latin [[:Category:Latin third conjugation verbs|third conjugation (-ere)]] or [[:Category:Latin fourth conjugation verbs|fourth conjugation (-īre)]] verbs.",
}
labels["verbs by conjugation"] = {
description = "{{{langname}}} verbs categorized by conjugation.",
parents = {"verbs by inflection type"},
}
for conj, conjdesc in pairs(conjugations) do
labels["verbs ending in -" .. conj] = {
description = conjdesc,
displaytitle = "{{{langname}}} verbs ending in {{m|{{{langcode}}}||-" .. conj .. "}}",
parents = {
{name = "verbs by conjugation", sort = conj},
},
breadcrumb = "{{m|{{{langcode}}}||-" .. conj .. "}}",
}
end
labels["verbs by vowel alternation"] = {
description = "{{{langname}}} verbs categorized by type of vowel alternation.",
parents = {"verbs by inflection type"},
}
labels["verbs by consonant alternation"] = {
description = "{{{langname}}} verbs categorized by type of consonant alternation.",
parents = {"verbs by inflection type"},
}
labels["third-person-only verbs"] = {
description = "{{{langname}}} verbs with forms that exist only in the third person, and have no imperatives.",
parents = {{name = "defective verbs"}},
breadcrumb = "third-person-only",
}
-- Add 'umbrella_parents' key if not already present.
for key, data in pairs(labels) do
if not data.umbrella_parents then
data.umbrella_parents = "Terms by grammatical category subcategories by language"
end
end
return {LABELS = labels}
မဝ်ဂျူ:category tree/fam/zhx
828
285997
385636
2026-04-02T17:08:56Z
咽頭べさ
33
Created page with "local labels = {} local handlers = {} labels["hanzi"] = { topright = "{{wp|Chinese characters}}", description = "{{{langname}}} symbols of the Han logographic script, which can represent sounds or convey meanings directly.", umbrella = "Han characters", parents = "logograms", } labels["chengyu"] = { topright = "{{wp|Chengyu}}", description = "{{{langname}}} traditional idiomatic expressions, usually consisting..."
385636
Scribunto
text/plain
local labels = {}
local handlers = {}
labels["hanzi"] = {
topright = "{{wp|Chinese characters}}",
description = "{{{langname}}} symbols of the Han logographic script, which can represent sounds or convey meanings directly.",
umbrella = "Han characters",
parents = "logograms",
}
labels["chengyu"] = {
topright = "{{wp|Chengyu}}",
description = "{{{langname}}} traditional idiomatic expressions, usually consisting of four [[hanzi]]; typically derived from [[Classical Chinese]].",
additional = "Compare Japanese {{w|yojijukugo}} and Korean {{w|sajaseong-eo}}.",
parents = "idioms",
}
labels["terms with uncreated forms"] = {
description = "{{{langname}}} terms that use a hanzi box template (such as {{temp|zh-forms}}) with a form not having a page of its own, or a {{temp|zh-see}} template linking to a page without a Chinese section or to a nonexistent page.",
additional = "If the redlink in the hanzi box is a variant or simplified form, the page may be created with {{temp|subst:zh-new}}.",
parents = {"redlinks", "entry maintenance"},
}
for _, source in ipairs {
"Mencius",
"the Analects",
"the Book of Documents",
"the Book of Rites",
"the Classic of Poetry",
"the Han Feizi",
"the I Ching",
"the Zhuangzi",
"the Zuo Zhuan",
} do
local book = source:match("^the (.*)$")
local sort_key = book or source
local italicized = book and "the ''" .. book .. "''" or source
labels["terms derived from " .. source] = {
displaytitle = book and "{{{langname}}} terms derived from " .. italicized or nil,
parents = {{name = "terms attributed to a specific source", sort = sort_key}},
description = "{{{langname}}} terms derived from " .. italicized .. ".",
breadcrumb = italicized,
}
labels["chengyu derived from " .. source] = {
displaytitle = book and "{{{langname}}} chengyu derived from " .. italicized or nil,
parents = {{name = "chengyu", sort = sort_key}, "terms derived from " .. source},
description = "{{{langname}}} [[chengyu]] derived from " .. italicized .. ".",
breadcrumb = "derived from " .. italicized,
}
end
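-- Worked example of the loop above, traced from the code itself:
--   source = "the Analects" -> book = "Analects", sort_key = "Analects", italicized = "the ''Analects''";
--     this defines labels["terms derived from the Analects"] and labels["chengyu derived from the Analects"].
--   source = "Mencius" -> no leading "the", so book is nil, sort_key and italicized are both "Mencius",
--     and displaytitle stays nil (the default title is used, with no italics).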
return {LABELS = labels, HANDLERS = handlers}
မဝ်ဂျူ:category tree/fam/gem
828
285998
385639
2026-04-02T17:36:27Z
咽頭べさ
33
Created page with "local labels = {} ------- GERMANIC VERB CLASSES ------- labels["strong verbs"] = { description = "{{{langname}}} verbs that do not use a dental suffix to mark the past tense and past participle, instead using vowel change ([[ablaut]]) and often a suffix ''-(e)n'' in the past participle.", breadcrumb = "strong", parents = {"verbs by inflection type"}, } labels["weak verbs"] = { description = "{{{langname}}} verb..."
385639
Scribunto
text/plain
local labels = {}
------- GERMANIC VERB CLASSES -------
labels["strong verbs"] = {
description = "{{{langname}}} verbs that do not use a dental suffix to mark the past tense and past participle, instead using vowel change ([[ablaut]]) and often a suffix ''-(e)n'' in the past participle.",
breadcrumb = "strong",
parents = {"verbs by inflection type"},
}
labels["weak verbs"] = {
description = "{{{langname}}} verbs that display dental suffixes in their past tense conjugated forms.",
breadcrumb = "weak",
parents = {"verbs by inflection type"},
}
labels["preterite-present verbs"] = {
description = "{{{langname}}} verbs that inflect in the present tense like the past tense of strong verbs.",
breadcrumb = "preterite-present",
parents = {"verbs by inflection type"},
}
labels["class 1 strong verbs"] = {
description = "{{{langname}}} class 1 strong verbs, where the [[ablaut]] vowel was followed by ''-y-'' in Proto-Indo-European.",
breadcrumb = "class 1",
parents = {{name = "strong verbs", sort = "1"}},
}
labels["class 1 weak verbs"] = {
description = "{{{langname}}} class 1 weak verbs, where the stem was followed by {{ic|/i/~/j/}} in Proto-Germanic (or {{ic|/ij/}} after a heavy stem, due to {{w|Sievers' Law}}).",
additional = "This triggered [[umlaut]] in most daughter languages, as well as gemination of the final consonant in light stems in West Germanic.",
breadcrumb = "class 1",
parents = {{name = "weak verbs", sort = "1"}},
}
labels["class 1 weak j-present verbs"] = {
description = "{{{langname}}} class 1 weak verbs with {{ic|/i/~/j/~/ij/}} in Proto-Germanic only in the present tense, but not elsewhere.",
additional = "Most class 1 weak verbs had {{ic|/i/}} in the past tense and past participle, leading to " ..
"[[umlaut]] throughout the verb in daughter languages with umlaut. A few archaic verbs, however, lacked " ..
"this [[interfix]], with the [[dental]] consonant of the ending attached directly to the stem. Original " ..
"instances of this are the reflexes of English [[seek]], [[think]], [[buy]] and [[work]], with apparently " ..
"irregular pasts ''sought'', ''thought'', ''bought'' and archaic ''wrought'', and it was often extended " ..
"to other verbs in various daughter languages (e.g. the [[Old English]] reflexes of [[sell]], [[tell]], " ..
"[[teach]] and formerly [[reach]], with apparently irregular pasts ''sold'', ''told'', ''taught'' and " ..
"now-obsolete ''raught''). The apparent reversal of umlaut in the past tense is sometimes called " ..
"{{m|de|Rückumlaut|lit=backwards umlaut}} in Germanic studies.",
breadcrumb = "''j''-present",
parents = {{name = "class 1 weak verbs", sort = "j-present"}},
}
labels["class 1 weak heavy-stem verbs"] = {
description = "{{{langname}}} class 1 weak verbs with a heavy stem in Proto-Germanic, i.e. a stem containing a long vowel or ending in two consonants.",
additional = "Such verbs had the {{w|Sievers' Law}} variant interfix {{ic|/ij/}} between the stem and endings " ..
"in the present tense, which evolved differently from light-stem verbs in most daughter languages, which " ..
"had an interfix {{ic|/i/~/j/}} in the present tense. Note that some verbs with multisyllabic stems were " ..
"treated as heavy-stem and some as light-stem, depending on the analysis of the metrical feet of the stem.",
breadcrumb_and_first_sort_key = "heavy-stem",
parents = "class 1 weak verbs",
}
labels["class 1 weak light-stem verbs"] = {
description = "{{{langname}}} class 1 weak verbs with a light stem in Proto-Germanic, i.e. a stem containing a short vowel and ending in only one consonant.",
additional = "Such verbs had the {{w|Sievers' Law}} variant interfix {{ic|/i/~/j/}} between the stem and " ..
"endings in the present tense, which evolved differently from heavy-stem verbs in most daughter languages, " ..
"which had an interfix {{ic|/ij/}} in the present tense. Note that some verbs with multisyllabic stems were " ..
"treated as heavy-stem and some as light-stem, depending on the analysis of the metrical feet of the stem.",
breadcrumb_and_first_sort_key = "light-stem",
parents = "class 1 weak verbs",
}
labels["class 2 strong verbs"] = {
description = "{{{langname}}} class 2 strong verbs, where the [[ablaut]] vowel was followed by ''-w-'' in Proto-Indo-European.",
breadcrumb = "class 2",
parents = {{name = "strong verbs", sort = "2"}},
}
labels["class 2a strong verbs"] = {
description = "{{{langname}}} class 2 strong verbs where the [[ablaut]] vowel was ''*eu'' in Proto-Germanic.",
breadcrumb = "class 2a",
parents = {{name = "class 2 strong verbs", sort = "1"}},
}
labels["class 2b strong verbs"] = {
description = "{{{langname}}} class 2 strong verbs where the [[ablaut]] vowel was ''*ū'' in Proto-Germanic.",
breadcrumb = "class 2b",
parents = {{name = "class 2 strong verbs", sort = "2"}},
}
labels["class 2 weak verbs"] = {
description = "{{{langname}}} class 2 weak verbs, where the stem was followed by ''*ō'' in Proto-Germanic.",
breadcrumb = "class 2",
parents = {{name = "weak verbs", sort = "2"}},
}
labels["class 3 weak verbs"] = {
description = "{{{langname}}} class 3 weak verbs, where the stem was followed by ''*ai''~''*ā'' in Proto-Germanic, which was generalized to ''*ē'' in West Germanic.",
breadcrumb = "class 3",
parents = {{name = "weak verbs", sort = "3"}},
}
labels["class 3 strong verbs"] = {
description = "{{{langname}}} class 3 strong verbs, where the [[ablaut]] vowel was followed by a [[consonant cluster]] in Proto-Indo-European.",
breadcrumb = "class 3",
parents = {{name = "strong verbs", sort = "3"}},
}
labels["class 3a strong verbs"] = {
description = "{{{langname}}} class 3 strong verbs where the [[consonant cluster]] following the [[ablaut]] vowel begins with a nasal consonant.",
breadcrumb = "class 3a",
parents = {{name = "class 3 strong verbs", sort = "1"}},
}
labels["class 3b strong verbs"] = {
description = "{{{langname}}} class 3 strong verbs where the [[consonant cluster]] following the [[ablaut]] vowel begins with a lateral consonant or velar fricative.",
breadcrumb = "class 3b",
parents = {{name = "class 3 strong verbs", sort = "2"}},
}
labels["class 3c strong verbs"] = {
description = "{{{langname}}} class 3 strong verbs where the [[consonant cluster]] following the [[ablaut]] vowel begins with a rhotic consonant.",
breadcrumb = "class 3c",
parents = {{name = "class 3 strong verbs", sort = "3"}},
}
labels["class 4 strong verbs"] = {
description = "{{{langname}}} class 4 strong verbs, where the [[ablaut]] vowel was followed by a [[sonorant]] (''m'', ''n'', ''l'', ''r'') but no other consonant in Proto-Indo-European.",
breadcrumb = "class 4",
parents = {{name = "strong verbs", sort = "4"}},
}
labels["class 4 weak verbs"] = {
description = "{{{langname}}} class 4 weak verbs, where the stem was followed by ''*n'' in Proto-Germanic.",
breadcrumb = "class 4",
parents = {{name = "weak verbs", sort = "4"}},
}
labels["class 5 strong verbs"] = {
description = "{{{langname}}} class 5 strong verbs, where the [[ablaut]] vowel was followed by a [[consonant]] other than a [[sonorant]] in Proto-Indo-European.",
breadcrumb = "class 5",
parents = {{name = "strong verbs", sort = "5"}},
}
labels["class 5 strong j-present verbs"] = {
description = "{{{langname}}} class 5 strong verbs with a {{IPAchar|/j/}} suffix in the present tense in Proto-Germanic.",
additional = "This [[umlaut]]ed the root vowel to {{ic|/i/}}, and caused gemination of the stem-final consonant in the West Germanic languages. The {{ic|/j/}} was maintained in Gothic, Old Norse (and modern Icelandic) and Old Saxon, but otherwise dropped.",
breadcrumb = "''j''-present",
parents = {{name = "class 5 strong verbs", sort = "j-present"}},
}
labels["class 6 strong verbs"] = {
description = "{{{langname}}} class 6 strong verbs, with the stem vowel ''-a-'' (and usually a single stem-final consonant), except those where it is followed by a sonorant and another consonant (this combination was considered a diphthong in PIE and therefore belonged to class 7).",
additional = "The Proto-Indo-European origin of this class is not securely known.",
breadcrumb = "class 6",
parents = {{name = "strong verbs", sort = "6"}},
}
labels["class 6 strong j-present verbs"] = {
description = "{{{langname}}} class 6 strong verbs with a {{IPAchar|/j/}} suffix in the present tense in Proto-Germanic.",
additional = "This caused gemination of the stem-final consonant in the West Germanic languages, and [[umlaut]] of the root vowel in most languages. The {{ic|/j/}} was maintained in Gothic, Old Norse (and modern Icelandic) and Old Saxon, but otherwise dropped.",
breadcrumb = "''j''-present",
parents = {{name = "class 6 strong verbs", sort = "j-present"}},
}
labels["class 7 strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs, which retained their reduplication in the past tense in Proto-Germanic.",
breadcrumb = "class 7",
parents = {{name = "strong verbs", sort = "7"}},
}
labels["class 7a strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs where the root vowel was ''*ai'' in Proto-Germanic, analogous to class 1.",
breadcrumb = "class 7a",
parents = {{name = "class 7 strong verbs", sort = "a"}},
}
labels["class 7b strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs where the root vowel was ''*au'' in Proto-Germanic, analogous to class 2.",
breadcrumb = "class 7b",
parents = {{name = "class 7 strong verbs", sort = "b"}},
}
labels["class 7c strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs where the root vowel was ''*a'' followed by a [[consonant cluster]] in Proto-Germanic, analogous to class 3.",
breadcrumb = "class 7c",
parents = {{name = "class 7 strong verbs", sort = "c"}},
}
labels["class 7d strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs where the root vowel was ''*ē'' in Proto-Germanic.",
breadcrumb = "class 7d",
parents = {{name = "class 7 strong verbs", sort = "d"}},
}
labels["class 7e strong verbs"] = {
description = "{{{langname}}} class 7 strong verbs where the root vowel was ''*ō'' in Proto-Germanic.",
breadcrumb = "class 7e",
parents = {{name = "class 7 strong verbs", sort = "e"}},
}
labels["class 7 strong j-present verbs"] = {
description = "{{{langname}}} class 7 strong verbs with a {{IPAchar|/j/}} suffix in the present tense in Proto-Germanic.",
additional = "This caused [[umlaut]] of the root vowel in most languages. The {{ic|/j/}} was maintained in Gothic, Old Norse (and modern Icelandic) and Old Saxon, but otherwise dropped.",
breadcrumb = "''j''-present",
parents = {{name = "class 7 strong verbs", sort = "j-present"}},
}
-- Add 'umbrella_parents' key if not already present.
for key, data in pairs(labels) do
if not data.umbrella_parents then
data.umbrella_parents = "Terms by grammatical category subcategories by language"
end
end
return {LABELS = labels}
88k7sky1bf589e9xdsynl8pwglofaen
မဝ်ဂျူ:category tree/fam/jpx
828
285999
385641
2026-04-02T17:39:13Z
咽頭べさ
33
Created page with "local labels = {} local handlers = {} local m_str_utils = require("Module:string utilities") local concat = table.concat local full_link = require("Module:links").full_link local insert = table.insert local Hani_sort = require("Module:Hani-sortkey").makeSortKey local match = m_str_utils.match local sort = table.sort local tag_text = require("Module:script_utilities").tag_text local ucfirst = m_str_utils.ucfirst loc..."
385641
Scribunto
text/plain
local labels = {}
local handlers = {}
local m_str_utils = require("Module:string utilities")
local concat = table.concat
local full_link = require("Module:links").full_link
local insert = table.insert
local Hani_sort = require("Module:Hani-sortkey").makeSortKey
local match = m_str_utils.match
local sort = table.sort
local tag_text = require("Module:script_utilities").tag_text
local ucfirst = m_str_utils.ucfirst
local Hira = require("Module:scripts").getByCode("Hira")
local Jpan = require("Module:scripts").getByCode("Jpan")
local kana_to_romaji = require("Module:Hrkt-translit").tr
local m_numeric = require("Module:ConvertNumeric")
local kana_capture = "([-" .. require("Module:ja/data/range").kana .. "・]+)"
local yomi_data = require("Module:kanjitab/data")
labels["adnominals"] = {
description = "{{{langname}}} adnominals, or {{ja-r|連%体%詞|れん%たい%し}}, which modify nouns, and do not conjugate or [[predicate#Verb|predicate]].",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["hiragana"] = {
description = "{{{langname}}} terms with hiragana {{mdash}} {{ja-r|平%仮%名|ひら%が%な}} {{mdash}} forms, sorted by conventional hiragana sequence. The hiragana form is a [[phonetic]] representation of that word. " ..
"Wiktionary represents {{{langname}}}-language segments in three ways: in normal form (with [[kanji]], if appropriate), in [[hiragana]] " ..
"form (this differs from kanji form only when the segment contains kanji), and in [[romaji]] form.",
additional = "''See also'' [[:Category:{{{langname}}} katakana]]",
toc_template = "Hira-categoryTOC",
toc_template_full = "Hira-categoryTOC/full",
parents = {
{name = "{{{langcat}}}", raw = true},
"Category:Hiragana script characters",
}
}
labels["historical hiragana"] = {
description = "{{{langname}}} historical [[hiragana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical katakana]].",
toc_template = "Hira-categoryTOC",
toc_template_full = "Hira-categoryTOC/full",
parents = {
"hiragana",
{name = "{{{langcat}}}", raw = true},
"Category:Hiragana script characters",
}
}
labels["katakana"] = {
description = "{{{langname}}} terms with katakana {{mdash}} {{ja-r|片%仮%名|かた%か%な}} {{mdash}} forms, sorted by conventional katakana sequence. Katakana is used primarily for transliterations of foreign words, including old Chinese hanzi not used in [[shinjitai]].",
additional = "''See also'' [[:Category:{{{langname}}} hiragana]]",
toc_template = "Kana-categoryTOC",
toc_template_full = "Kana-categoryTOC/full",
parents = {
{name = "{{{langcat}}}", raw = true},
"Category:Katakana script characters",
}
}
labels["historical katakana"] = {
description = "{{{langname}}} historical [[katakana]].",
additional = "''See also'' [[:Category:{{{langname}}} historical hiragana]].",
toc_template = "Kana-categoryTOC",
toc_template_full = "Kana-categoryTOC/full",
parents = {
"katakana",
{name = "{{{langcat}}}", raw = true},
"Category:Katakana script characters",
}
}
labels["terms spelled with mixed kana"] = {
description = "{{{langname}}} terms which combine [[hiragana]] and [[katakana]] characters, potentially with [[kanji]] too.",
parents = {
{name = "{{{langcat}}}", raw = true},
"hiragana",
"katakana",
},
}
labels["kanji"] = {
topright = "{{wp|Kanji}}",
description = "{{{langname}}} symbols of the Han logographic script, which can represent sounds or convey meanings directly.",
toc_template = "Hani-categoryTOC",
umbrella = "Han characters",
parents = "logograms",
}
labels["kanji by reading"] = {
description = "{{{langname}}} kanji categorized by reading.",
parents = {{name = "kanji", sort = "reading"}},
}
labels["makurakotoba"] = {
topright = "{{wp|Makurakotoba}}",
description = "{{{langname}}} idioms used in poetry to introduce specific words.",
parents = {"idioms"},
}
labels["terms by kanji readings"] = {
description = "{{{langname}}} categories grouped with regard to the readings of the kanji with which they are spelled.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms by reading pattern"] = {
description = "{{{langname}}} categories with terms grouped by their reading patterns.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms by number of kanji"] = {
description = "{{{langname}}} terms categorized by number of kanji.",
parents = {"terms by orthographic property"},
}
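-- Return a sorted list of category links, one per on'yomi subtype other than `cat_yomi_type`
-- (generic on'yomi itself is excluded). `seen` skips yomi tables reachable under more than one key.
local function handle_onyomi_list(category, category_type, cat_yomi_type)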
local onyomi, seen = {}, {}
for _, yomi in pairs(yomi_data) do
if not seen[yomi] and yomi.onyomi then
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
if yomi_type ~= "on'yomi" and yomi_type ~= cat_yomi_type then
insert(onyomi, "[[:Category:{{{langname}}} " .. category:gsub("{{{yomi_catname}}}", yomi_catname) .. "]]")
end
end
end
seen[yomi] = true
end
sort(onyomi)
return onyomi
end
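-- Register one label per yomi type by substituting its category name for `{{{yomi_catname}}}`
-- in `category`. Yomi types whose `category_type` field is false are skipped.
local function add_yomi_category(category, category_type, parent, description)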
for _, yomi in pairs(yomi_data) do
local yomi_catname = yomi[category_type]
if yomi_catname ~= false then
local yomi_type = yomi.type
local yomi_desc = yomi.link or yomi_catname
if yomi.description then
yomi_desc = yomi_desc .. "; " .. yomi.description
end
local label = {
description = description .. " " .. yomi_desc .. ".",
breadcrumb = yomi_type,
parents = {{name = parent, sort = yomi_catname}},
}
if yomi.onyomi then
local onyomi = handle_onyomi_list(category, category_type, yomi_type)
label.additional = "Categories of terms with " ..
(yomi_type == "on'yomi" and "more" or "other") ..
" specific types of on'yomi readings can be found in the following categories:\n* " .. concat(onyomi, "\n* ")
if yomi_type ~= "on'yomi" then
insert(label.parents, 1, {
name = (category:gsub("{{{yomi_catname}}}", yomi_data.on[category_type])),
sort = yomi_catname
})
end
end
labels[category:gsub("{{{yomi_catname}}}", yomi_catname)] = label
end
end
end
add_yomi_category(
"terms read with {{{yomi_catname}}}",
"reading_category",
"terms by reading pattern",
"{{{langname}}} terms read with"
)
add_yomi_category(
"terms spelled with kanji with {{{yomi_catname}}} readings",
"kanji_category",
"terms by kanji reading type",
"{{{langname}}} categories with terms that are spelled with one or more kanji read with"
)
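--[=[
A minimal sketch of how the two calls above expand a label template into concrete label names. The `yomi_data` table here is a hypothetical stand-in: the real entries and field values live in Module:kanjitab/data and may differ. `expand_labels` is likewise an illustrative name, not part of this module.

```lua
-- Hypothetical stand-in for Module:kanjitab/data; the field values are assumed.
local yomi_data = {
	on = {type = "on'yomi", reading_category = "on'yomi", kanji_category = "on'yomi"},
	kun = {type = "kun'yomi", reading_category = "kun'yomi", kanji_category = "kun'yomi"},
	irr = {type = "irregular", reading_category = "irregular kanji readings", kanji_category = false},
}

local function expand_labels(category, category_type)
	local names = {}
	for _, yomi in pairs(yomi_data) do
		local yomi_catname = yomi[category_type]
		if yomi_catname ~= false then
			-- The parentheses discard gsub's second return value (the substitution count).
			names[#names + 1] = (category:gsub("{{{yomi_catname}}}", yomi_catname))
		end
	end
	table.sort(names) -- pairs() order is unspecified, so sort for a stable result
	return names
end

local kanji_labels = expand_labels("terms spelled with kanji with {{{yomi_catname}}} readings", "kanji_category")
```

An entry whose `category_type` field is `false` yields no label, so a yomi type can appear in one family of categories and be left out of the other.
]=]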
labels["terms with missing yomi"] = {
description = "{{{langname}}} terms where at least one [[Appendix:Japanese glossary#yomi|yomi]] is missing from {{tl|{{{langcode}}}-kanjitab}}.",
hidden = true,
can_be_empty = true,
parents = {"entry maintenance"},
}
labels["terms with IPA pronunciation with pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that have {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
can_be_empty = true,
parents = {"entry maintenance", "pitch accent"},
}
labels["terms with IPA pronunciation missing pitch accent"] = {
description = "{{{langname}}} terms with pronunciations that do not have a {{w|Japanese pitch accent|pitch accent}} specified.",
additional = "Pitch accent can be specified in {{tl|{{{langcode}}}-pron}} with the {{code|=acc=}} parameter.",
hidden = true,
can_be_empty = true,
parents = {"entry maintenance"},
}
labels["pitch accent"] = {
description = "{{{langname}}} terms regarding {{w|Japanese pitch accent|pitch accent}} pronunciation.",
can_be_empty = true,
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms with Heiban pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[平板型|Heiban]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Atamadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[頭高型|Atamadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Nakadaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with complex pitch accent (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) complex {{w|Japanese pitch accent|pitch accent}}, as in having more than one {{m|ja|アクセント句}}.",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["pitch accent deaccenting before の"] = {
description = "{{{langname}}} terms with {{w|Japanese pitch accent|pitch accent}} pronunciations that have exceptional deaccenting or lack thereof before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent"}
}
labels["terms with Odaka pitch accent not deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[尾高型|Odaka]] {{w|Japanese pitch accent|pitch accent}} and do not become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["terms with Nakadaka pitch accent deaccented before の (Tōkyō)"] = {
description = "{{{langname}}} terms with pronunciations that are (Tōkyō) [[中高型|Nakadaka]] {{w|Japanese pitch accent|pitch accent}} and become deaccented before の ({{ja-deaccenting-before-no}}).",
can_be_empty = true,
parents = {"pitch accent deaccenting before の"}
}
labels["terms by kanji reading type"] = {
description = "{{{langname}}} categories with terms grouped with regard to the types of readings of the kanji with which " ..
"they are spelled; broadly, those of Chinese origin, {{ja-r|音|おん}} readings, and those of non-Chinese origin, {{ja-r|訓|くん}} readings.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms spelled with ateji"] = {
topright = "{{wp|Ateji}}",
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#ateji|ateji]] {{mdash}} {{ja-r|当て字|あてじ}} {{mdash}} which are [[kanji]] used to represent sounds rather than meanings (though meaning may have some influence on which kanji are chosen).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["terms spelled with daiyōji"] = {
description = "{{{langname}}} terms spelled using [[Appendix:Japanese glossary#daiyouji|daiyōji]], categorized using {{temp|ja-daiyouji}}.",
parents = {"terms by etymology"},
}
labels["terms spelled with jukujikun"] = {
description = "{{{langname}}} terms containing one or more [[Appendix:Japanese glossary#jukujikun|jukujikun]] {{mdash}} {{ja-r|熟%字%訓|じゅく%じ%くん}} {{mdash}} which are [[kanji]] used to represent meanings rather than sounds.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
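-- Register two labels for a kanji grade or list: "<grade> kanji" and "terms spelled with <grade> kanji".
-- `wp` adds a Wikipedia link box, `only_one` inserts "at least one" into the term description,
-- and `parent` and `sort` override the default parent category and sort key.
local function add_grade_categories(grade, desc, wp, only_one, parent, sort)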
local grade_kanji = grade .. " kanji"
local topright = wp and ("{{wp|%s}}"):format(ucfirst(grade_kanji)) or nil
labels[grade_kanji] = {
topright = topright,
description = "{{{langname}}} kanji " .. desc,
toc_template = "Hani-categoryTOC",
parents = {{
name = parent and (parent .. " kanji") or "kanji",
sort = sort or grade
}},
}
labels["terms spelled with " .. grade_kanji] = {
topright = topright,
description = "{{{langname}}} terms spelled with " .. (only_one and "at least one " or "") .. "kanji " .. desc,
parents = {{
name = parent and ("terms spelled with " .. parent .. " kanji") or "terms by orthographic property",
sort = sort or grade
}},
}
end
for i = 1, 6 do
local ord = m_numeric.ones_position_ord[i]
add_grade_categories(
ord .. " grade",
"taught in the " .. ord .. " grade of elementary school, as designated by the official list of {{ja-r|教%育 漢%字|きょう%いく かん%じ|education kanji}}.",
false,
false,
"kyōiku",
i
)
end
add_grade_categories(
"kyōiku",
"on the official list of {{ja-r|教%育 漢%字|きょう%いく かん%じ|education kanji}}.",
true,
false,
"jōyō"
)
add_grade_categories(
"secondary school",
"on the official list of {{ja-r|常%用 漢%字|じょう%よう かん%じ|regular-use characters}} that are generally taught in secondary school.",
false,
false,
"jōyō"
)
add_grade_categories(
"jōyō",
"on the official list of {{ja-r|常%用 漢%字|じょう%よう かん%じ|regular-use characters}}.",
true,
false
)
add_grade_categories(
"tōyō",
"on the official list of {{ja-r|当%用 漢%字|とう%よう かん%じ|general-use characters}}, which was in use from 1946 until the publication of the list of {{ja-r|常%用 漢%字|じょう%よう かん%じ|regular-use characters}} in 1981.",
true,
false
)
add_grade_categories(
"jinmeiyō",
"on the official list of {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji for use in personal names}}.",
true,
true
)
add_grade_categories(
"hyōgai",
"not included on the official list of {{ja-r|常%用 漢%字|じょう%よう かん%じ|regular-use characters}} or {{ja-r|人%名%用 漢%字|じん%めい%-よう かん%じ|kanji for use in personal names}}, known as {{ja-r|表%外 漢%字|ひょう%がい かん%じ}} or {{ja-r|表%外%字|ひょう%がい%じ|unlisted characters}}.",
true,
true
)
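--[=[
A rough sketch of how the `parent` argument in the calls above shapes the category hierarchy. `grade_parents` is a hypothetical helper that mirrors only the parent-name logic of `add_grade_categories`; it leaves out descriptions, sort keys and the Wikipedia box.

```lua
-- Hypothetical helper: maps each generated label name to the name of its parent label.
local function grade_parents(grade, parent)
	local kanji_parent = parent and (parent .. " kanji") or "kanji"
	local terms_parent = parent and ("terms spelled with " .. parent .. " kanji")
		or "terms by orthographic property"
	return {
		[grade .. " kanji"] = kanji_parent,
		["terms spelled with " .. grade .. " kanji"] = terms_parent,
	}
end

local kyoiku = grade_parents("kyōiku", "jōyō")
local joyo = grade_parents("jōyō", nil)
```

Read against the calls above, the six grade labels nest under kyōiku, kyōiku and secondary school nest under jōyō, and the lists called without a parent (jōyō, tōyō, jinmeiyō, hyōgai) sit directly under the generic labels.
]=]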
labels["terms with multiple readings"] = {
description = "{{{langname}}} terms with multiple pronunciations (hence multiple [[kana]] spellings).",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["kanji readings by number of morae"] = {
description = "{{{langname}}} categories grouped with regard to the number of morae in their kanji readings.",
parents = {{name = "{{{langcat}}}", raw = true}},
}
labels["single-kanji terms"] = {
description = "{{{langname}}} terms written as a single kanji.",
parents = {
"terms by orthographic property",
{name = "terms with 1 kanji", sort = " "},
},
}
labels["kanji with kun readings missing okurigana designation"] = {
breadcrumb = "Kanji missing okurigana designation",
description = "{{{langname}}} kanji entries in which one or more kun readings entered into {{tl|{{{langcode}}}-readings}} is missing a hyphen denoting okurigana.",
toc_template = "Hani-categoryTOC",
hidden = true,
can_be_empty = true,
parents = {"entry maintenance"},
}
labels["terms by the individual characters in their historical spellings"] = {
breadcrumb = "Historical",
description = "{{{langname}}} terms categorized by whether their spellings in the {{w|historical kana orthography}} included certain individual characters.",
parents = {{name = "terms by their individual characters", sort = " "}},
}
labels["verbs without transitivity"] = {
description = "{{{langname}}} verbs missing the {{code|=tr=}} parameter from their headword templates.",
hidden = true,
can_be_empty = true,
parents = {"entry maintenance"},
}
labels["yojijukugo"] = {
topright = "{{wp|Yojijukugo}}",
description = "{{{langname}}} four-[[kanji]] compound terms, {{ja-r|四%字 熟%語|よ%じ じゅく%ご}}, with idiomatic meanings; typically derived from Classical Chinese, Buddhist scripture or traditional Japanese proverbs.",
additional = "Compare Chinese {{w|chengyu}} and Korean {{w|sajaseong-eo}}.",
umbrella = "four-character idioms",
parents = {"idioms"},
}
-- FIXME: Only works for 0 through 19.
local word_to_number = {}
for k, v in pairs(m_numeric.ones_position) do
word_to_number[v] = k
end
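--[=[
A small sketch of the inversion performed by the loop above. The `ones_position` table here is a hypothetical four-entry excerpt; the real table comes from Module:ConvertNumeric and, per the FIXME, covers 0 through 19.

```lua
-- Assumed excerpt: index 0 is set explicitly, the rest fill indices 1, 2, 3.
local ones_position = {[0] = "zero", "one", "two", "three"}

local word_to_number = {}
for k, v in pairs(ones_position) do
	word_to_number[v] = k
end

local two = word_to_number["two"]
local zero = word_to_number["zero"]
local missing = word_to_number["twenty"]
```

Words outside the source table map to `nil`, which is why a handler relying on this lookup has to check the result before using it.
]=]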
local periods = {
historical = true,
ancient = true,
}
local function get_period_text_and_reading_type_link(period, reading_type)
if period and not periods[period] then
return nil
end
local period_text = period and period .. " " or nil
-- Allow periods (historical or ancient) by themselves; they will parse as reading types.
if not period and periods[reading_type] then
return nil, reading_type
end
local reading_type_link = "[[Appendix:Japanese glossary#" .. reading_type .. "|" .. reading_type .. "]]"
return period_text, reading_type_link
end
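--[=[
The function above has four distinct outcomes depending on its arguments. The sketch below repeats it verbatim so that the example runs on its own, then exercises each case; the argument values are chosen for illustration only.

```lua
local periods = {historical = true, ancient = true}

-- Copy of the function above, repeated so the example is self-contained.
local function get_period_text_and_reading_type_link(period, reading_type)
	if period and not periods[period] then
		return nil
	end
	local period_text = period and period .. " " or nil
	if not period and periods[reading_type] then
		return nil, reading_type
	end
	local reading_type_link = "[[Appendix:Japanese glossary#" .. reading_type .. "|" .. reading_type .. "]]"
	return period_text, reading_type_link
end

-- Recognised period plus reading type: both values are returned.
local text, link = get_period_text_and_reading_type_link("historical", "kun")
-- A bare period is passed through as the second value, unlinked.
local text2, link2 = get_period_text_and_reading_type_link(nil, "ancient")
-- An unrecognised period yields nothing, so the calling handler returns early.
local text3, link3 = get_period_text_and_reading_type_link("modern", "kun")
-- No period at all: only the glossary link is returned.
local text4, link4 = get_period_text_and_reading_type_link(nil, "kun")
```
]=]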
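-- Return Hira if `str` consists solely of hiragana once whitespace and punctuation are stripped,
-- otherwise Jpan.
local function get_sc(str)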
return match(str:gsub("[%s%p]+", ""), "[^" .. Hira:getCharacters() .. "]") and Jpan or Hira
end
local function get_tagged_reading(reading, lang)
return tag_text(reading, lang, get_sc(reading))
end
local function get_reading_link(reading, lang, period, link)
local hist = periods[period]
reading = reading:gsub("[%.%-%s]+", "")
return full_link({
lang = lang,
sc = get_sc(reading),
term = link or reading:gsub("・", ""),
-- If we have okurigana, demarcate furigana.
alt = reading:gsub("^(.-)・", "<span style=\"border-top:1px solid;position:relative;padding:1px;\">%1<span style=\"position:absolute;top:0;bottom:67%%;right:0%%;border-right:1px solid;\"></span></span>"),
tr = kana_to_romaji((reading:gsub("・", ".")), lang:getCode(), nil, {keep_dot = true, hist = hist})
:gsub("^(.-)%.", "<u>%1</u>"),
pos = reading:find("・", 1, true) and get_tagged_reading((reading:gsub("^.-・", "~")), lang) or nil
}, "term")
end
local function is_on_subtype(reading_type)
return reading_type:find(".on$")
end
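--[=[
The pattern `.on$` requires at least one character before the final "on", so the bare reading type "on" does not count as its own subtype. A small check, using reading-type names that are assumed here for illustration:

```lua
-- Copy of the function above, repeated so the example is self-contained.
local function is_on_subtype(reading_type)
	return reading_type:find(".on$")
end

-- find() returns a start index or nil, so compare against nil to get booleans.
local results = {
	goon = is_on_subtype("goon") ~= nil,
	["kan'on"] = is_on_subtype("kan'on") ~= nil,
	on = is_on_subtype("on") ~= nil,
	kun = is_on_subtype("kun") ~= nil,
}
```

Because the function returns the match position rather than `true`, callers should only test it for truthiness, as the handlers below do.
]=]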
insert(handlers, function(data)
local n = data.label:match("^terms with ([1-9]%d*) kanji$")
if not n then
return
end
local sortkey = require("Module:category tree").numeral_sortkey(n, 2097152)
return {
breadcrumb = n,
description = ("{{{langname}}} terms containing exactly %d kanji."):format(n),
-- TODO: implement this using the same mechanism used to implement parents (i.e. avoiding the need for raw categories).
-- umbrella = {
-- breadcrumb = ("%d kanji"):format(n),
-- parents = {{name = "terms by number of kanji subcategories by language", sort = sortkey}},
-- },
parents = {{name = "terms by number of kanji", sort = sortkey}}
}
end)
insert(handlers, function(data)
local label_pref, kana = data.label:match("^(terms historically spelled with )" .. kana_capture .. "$")
if not kana then
return
end
local lang = data.lang
return {
description = "{{{langname}}} terms spelled with " .. get_reading_link(kana, lang, "historical") .. " in the {{w|historical kana orthography}}.",
displaytitle = "{{{langname}}} " .. label_pref .. get_tagged_reading(kana, lang),
breadcrumb = "historical",
parents = {
{name = "terms spelled with " .. kana, sort = " "},
{name = "terms by the individual characters in their historical spellings", sort = lang:makeSortKey(kana)}
},
umbrella = false,
}
end)
insert(handlers, function(data)
local count, plural = data.label:match("^kanji readings with (.+) mora(e?)$")
-- Make sure 'one' goes with singular and other numbers with plural.
if not count or (count == "one") ~= (plural == "") then
return
end
local num = word_to_number[count]
if not num then
return nil
end
return {
description = "{{{langname}}} kanji readings containing " .. count .. " mora" .. plural .. ".",
breadcrumb = num,
parents = {{name = "kanji readings by number of morae", sort = num}},
umbrella = false,
}
end)
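--[=[
A minimal sketch of the singular/plural check in the handler above. `parse_mora_label` is a hypothetical helper that isolates the pattern match and the agreement test; it does not look the count up in `word_to_number`.

```lua
-- Hypothetical helper isolating the label check used by the handler above.
local function parse_mora_label(label)
	local count, plural = label:match("^kanji readings with (.+) mora(e?)$")
	-- "one" must pair with "mora" and any other count with "morae";
	-- the two comparisons disagree exactly when the label is ill-formed.
	if not count or (count == "one") ~= (plural == "") then
		return nil
	end
	return count, plural
end

local c1, p1 = parse_mora_label("kanji readings with one mora")
local c2, p2 = parse_mora_label("kanji readings with two morae")
local c3 = parse_mora_label("kanji readings with one morae")
local c4 = parse_mora_label("kanji readings with two mora")
```

Comparing two booleans with `~=` acts as an exclusive-or, which keeps the agreement rule to a single condition.
]=]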
insert(handlers, function(data)
local label_pref, period, reading_type, reading = match(data.label, "^(kanji with ([a-z]-) ?([%a']+) reading )" .. kana_capture .. "$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return
end
local lang = data.lang
-- Compute parents.
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "kanji with " .. (period_text or "") .. reading_type .. " reading " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "kanji by " .. (period_text or "") .. reading_type .. " reading",
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
if is_on_subtype(reading_type) then
insert(parents, {name = "kanji with " .. (period_text or "") .. "on reading " .. reading, sort = reading_type})
elseif period_text then
insert(parents, {name = "kanji with " .. period_text .. "reading " .. reading, sort = reading_type})
end
if not period_text then
insert(parents, {name = "kanji read as " .. reading, sort = reading_type})
end
return {
description = "{{{langname}}} [[kanji]] with the " .. (period_text or "") .. reading_type_link .. " reading " ..
get_reading_link(reading, lang, period or reading_type) .. ".",
displaytitle = "{{{langname}}} " .. label_pref .. get_tagged_reading(reading, lang),
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
umbrella = false,
}
end)
insert(handlers, function(data)
local period, reading_type = match(data.label, "^kanji by ([a-z]-) ?([%a']+) reading$")
if not period then
return
end
period = period ~= "" and period or nil
local period_text, reading_type_link = get_period_text_and_reading_type_link(period, reading_type)
if not reading_type_link then
return nil
end
-- Compute parents.
local parents = {
is_on_subtype(reading_type) and {name = "kanji by " .. (period_text or "") .. "on reading", sort = reading_type} or
period_text and {name = "kanji by " .. reading_type .. " reading", sort = period} or
{name = "kanji by reading", sort = reading_type}
}
if period_text then
insert(parents, {name = "kanji by " .. period_text .. "reading", sort = reading_type})
end
-- Compute description.
local description = "{{{langname}}} [[kanji]] categorized by " .. (period_text or "") .. reading_type_link .. " reading."
return {
description = description,
breadcrumb = (period_text or "") .. reading_type,
parents = parents,
umbrella = false,
}
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(kanji read as )" .. kana_capture .. "$")
if not reading then
return
end
local args = require("Module:parameters").process(data.args, {
["histconsol"] = true,
})
local lang = data.lang
local parents, breadcrumb = {}
if reading:find("・", 1, true) then
local okurigana = reading:match("・(.*)")
insert(parents, {
name = "kanji read as " .. reading:match("(.-)・"),
-- Sort by okurigana, since all coordinate categories will have the same furigana.
sort = (lang:makeSortKey(okurigana))
})
breadcrumb = "~" .. okurigana
else
insert(parents, {
name = "kanji by reading",
sort = (lang:makeSortKey(reading))
})
breadcrumb = reading
end
local addl
local period_text
if args.histconsol then
period_text = "historical"
addl = ("This is a [[Wikipedia:Historical kana orthography|historical]] [[Wikipedia:Kanazukai|reading]], now " ..
"consolidated with the [[Wikipedia:Modern kana usage|modern reading]] of " ..
get_reading_link(args.histconsol, lang, nil, ("Category:Japanese kanji read as %s"):format(args.histconsol)) .. ".")
end
return {
description = "{{{langname}}} [[kanji]] read as " .. get_reading_link(reading, lang, period_text) .. ".",
additional = addl,
displaytitle = "{{{langname}}} " .. label_pref .. get_tagged_reading(reading, lang),
breadcrumb = get_tagged_reading(breadcrumb, lang),
parents = parents,
umbrella = false,
}, true
end)
insert(handlers, function(data)
local label_pref, reading = match(data.label, "^(terms spelled with kanji read as )" .. kana_capture .. "$")
if not reading then
return
end
-- Compute parents.
local lang = data.lang
local sort_key = (lang:makeSortKey(reading))
local mora_count = require("Module:ja").count_morae(reading)
local mora_count_words = m_numeric.spell_number(tostring(mora_count))
local parents = {
{name = "terms by kanji readings", sort = sort_key},
{name = "kanji readings with " .. mora_count_words .. " mora" .. (mora_count > 1 and "e" or ""), sort = sort_key},
{name = "kanji read as " .. reading, sort = " "},
}
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms that contain kanji that exhibit a reading of " .. get_reading_link(reading, lang) ..
" in those terms prior to any sound changes.",
displaytitle = "{{{langname}}} " .. label_pref .. tagged_reading,
breadcrumb = tagged_reading,
parents = parents,
umbrella = false,
}
end)
insert(handlers, function(data)
local kanji, reading = match(data.label, "^terms spelled with (.) read as " .. kana_capture .. "$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms spelled with KANJI read as READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types, parents = {}, {}
for _, yomi in ipairs(args[1]) do
local yomi_data = yomi_data[yomi]
if not yomi_data then
error("The yomi type \"" .. yomi .. "\" is not recognized.")
end
local category = yomi_data.kanji_category
if not category then
error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.")
end
insert(yomi_types, yomi_data.link)
insert(parents, {
name = "terms spelled with kanji with " .. category .. " readings",
sort = (lang:makeSortKey(reading))
})
end
insert(parents, 1, {name = "terms spelled with " .. kanji, sort = (lang:makeSortKey(reading))})
insert(parents, 2, {name = "terms spelled with kanji read as " .. reading, sort = Hani_sort(kanji)})
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = "{{{langname}}} terms spelled with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. ".",
displaytitle = "{{{langname}}} terms spelled with " .. tagged_kanji .. " read as " .. tagged_reading,
breadcrumb = "read as " .. tagged_reading,
parents = parents,
umbrella = false,
}, true
end)
insert(handlers, function(data)
local affix, kanji, reading = data.label:match("^terms ([a-z]+fix)ed with (.+) read as " .. kana_capture .. "$")
if not affix or not kanji or not reading then
return nil
end
local args = require("Module:parameters").process(data.args, {
[1] = {list = true},
})
local lang = data.lang
if #args[1] == 0 then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms AFFIXed with KANJI read as READING\", at least one reading type (e.g. <code>kun</code> or <code>on</code>) must be specified using <code>1=</code>, <code>2=</code>, <code>3=</code>, etc.")
end
local yomi_types = {}
for _, yomi in ipairs(args[1]) do
local yomi_data = yomi_data[yomi]
if not yomi_data then
error("The yomi type \"" .. yomi .. "\" is not recognized.")
end
local category = yomi_data.kanji_category
if not category then
error("The yomi type \"" .. yomi .. "\" is not valid for this type of category.")
end
insert(yomi_types, yomi_data.link)
end
yomi_types = (#yomi_types > 1 and "one of " or "") .. "its " ..
require("Module:table").serialCommaJoin(yomi_types, {conj = "or"}) ..
" reading" .. (#yomi_types > 1 and "s" or "")
local description = "{{{langname}}} terms " .. affix .. "ed with {{l|{{{langcode}}}|" .. kanji .. "}} with " ..
yomi_types .. " of " .. get_reading_link(reading, lang) .. "."
local tl
if affix == "suffix" then
tl = "{{tl|ja-compound|<var>|...|</var>|" .. kanji .. "|-" .. reading .. "}}"
elseif affix == "prefix" then
tl = "{{tl|ja-compound|" .. kanji .. "|" .. reading .. "-|<var>|...|</var>}}"
elseif affix == "infix" then
tl = "{{tl|ja-compound|<var>|...|</var>|" .. kanji .. "|-" .. reading .. "-|<var>|...|</var>}}"
end
if tl then
description = description .. "\n\n" .. "Terms are placed in this category using " .. tl .. "."
end
local parents = {}
insert(parents, {name = "terms " .. affix .. "ed with " .. kanji, sort = (lang:makeSortKey(reading))})
if mw.title.new("Category:" .. lang:getCanonicalName() .. " terms spelled with " .. kanji .. " read as " .. reading).exists then
insert(parents, {name = "terms spelled with " .. kanji .. " read as " .. reading, sort = (lang:makeSortKey(reading)), args=data.args})
end
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_reading = get_tagged_reading(reading, lang)
return {
description = description,
displaytitle = "{{{langname}}} terms " .. affix .. "ed with " .. tagged_kanji .. " read as " .. tagged_reading,
breadcrumb = "read as " .. reading,
parents = parents,
umbrella = false,
}, true
end)
insert(handlers, function(data)
local kanji, daiyoji = match(data.label, "^terms with (.) replaced by daiyōji (.)$")
if not kanji then
return nil
end
local args = require("Module:parameters").process(data.args, {
["sort"] = true,
})
local lang = data.lang
if not args.sort then
error("For categories of the form \"" .. lang:getCanonicalName() ..
" terms with KANJI replaced by daiyōji DAIYOJI\", the sort key must be specified using sort=")
end
local tagged_kanji = get_tagged_reading(kanji, lang)
local tagged_daiyoji = get_tagged_reading(daiyoji, lang)
return {
description = "{{{langname}}} terms with {{l|{{{langcode}}}|" .. kanji .. "}} replaced by [[Appendix:Japanese glossary#daiyouji|daiyōji]] {{l|{{{langcode}}}|" .. daiyoji .. "}}.",
displaytitle = "{{{langname}}} terms with " .. tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji,
breadcrumb = tagged_kanji .. " replaced by daiyōji " .. tagged_daiyoji,
parents = {{name = "terms spelled with daiyōji", sort = args.sort}},
umbrella = false,
}, true
end)
return {LABELS = labels, HANDLERS = handlers}
j94p85e23zpo7288bz9jnkb85ljqsnq
မဝ်ဂျူ:category tree/fam/sla
828
286000
385644
2026-04-02T19:51:40Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "local labels = {} labels["multidirectional verbs"] = { description = "{{{langname}}} verbs of motion whose motion is multidirectional (as opposed to unidirectional) or indirect, or whose action is repeated or in a series, instead of being a single, completed action.", additional = "Multidirectional verbs are always imperfective in aspect, even with prefixes that are normally associated with the perfective aspect. S..."
385644
Scribunto
text/plain
local labels = {}
labels["multidirectional verbs"] = {
	description = "{{{langname}}} verbs of motion whose motion is multidirectional (as opposed to unidirectional) or indirect, or whose action is repeated or in a series, instead of being a single, completed action.",
	additional = "Multidirectional verbs are always imperfective in aspect, even with prefixes that are normally associated with the perfective aspect. See also {{lg|unidirectional verb}}.",
	parents = {"verbs"},
	umbrella_parents = "Lemmas subcategories by language",
}
labels["unidirectional verbs"] = {
	description = "{{{langname}}} verbs of motion whose motion is unidirectional (as opposed to multidirectional), a definitely directed motion, or a single, completed action (instead of a repeated action or series of actions).",
	additional = "Unidirectional verbs may be either imperfective or perfective. See also {{lg|multidirectional verb}}.",
	parents = {"verbs"},
	umbrella_parents = "Lemmas subcategories by language",
}
------- Slavic terms with prothetic consonants -------
for _, back_sound in ipairs {"v-", "w-", "в-"} do
	labels["terms with prothetic " .. back_sound] = {
		description = "{{{langname}}} terms with a prothetic {{m|{{{langcode}}}||" .. back_sound .. "}}, which was not present etymologically in {{w|Proto-Slavic}}.",
		displaytitle = "{{{langname}}} terms with prothetic {{m|{{{langcode}}}||" .. back_sound .. "}}",
		additional = "This sound was originally added before terms beginning with a back vowel (''o'' or ''u'') to prevent {{lg|hiatus}} when the preceding word ended in a vowel, and in time was incorporated into the word itself.",
		breadcrumb = "with prothetic {{m|{{{langcode}}}||" .. back_sound .. "}}",
		parents = {{name = "terms by lexical property", sort = "prothetic"}},
		umbrella = {
			description = "Categories with terms with a prothetic {{m|und||" .. back_sound .. "}}, which was not present etymologically in {{w|Proto-Slavic}}.",
			breadcrumb = "Terms with prothetic {{m|und||" .. back_sound .. "}}",
			parents = {"Terms by lexical property subcategories by language"},
		}
	}
end
labels["terms with prothetic h-"] = {
	description = "{{{langname}}} terms with a prothetic {{m|{{{langcode}}}||h}}, which was not present etymologically in {{w|Proto-Slavic}}.",
	displaytitle = "{{{langname}}} terms with prothetic {{m|{{{langcode}}}||h}}",
	additional = "This sound was originally added before terms beginning with a vowel to prevent {{lg|hiatus}} when the preceding word ended in a vowel, and in time was incorporated into the word itself.",
	breadcrumb = "with prothetic {{m|{{{langcode}}}||h}}",
	parents = {{name = "terms by lexical property", sort = "prothetic"}},
	umbrella = {
		description = "Categories with terms with a prothetic {{m|und||h}}, which was not present etymologically in {{w|Proto-Slavic}}.",
		breadcrumb = "Terms with prothetic {{m|und||h}}",
		parents = {"Terms by lexical property subcategories by language"},
	}
}
return {LABELS = labels}
s4khv5hpn2b3ncc7mqr1hlxeaftnq8z
မဝ်ဂျူ:category tree/fam/alg
828
286001
385645
2026-04-02T19:53:03Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "local labels = {} labels["transitive animate verbs"] = { description = "{{{langname}}} transitive verbs with an animate object, commonly abbreviated VTA.", breadcrumb = "transitive animate", parents = {"verbs by inflection type", {name = "transitive verbs", sort = "animate"}}, } labels["transitive inanimate verbs"] = { description = "{{{langname}}} transitive verbs with an inanimate object, commonly abbreviated..."
385645
Scribunto
text/plain
local labels = {}
labels["transitive animate verbs"] = {
	description = "{{{langname}}} transitive verbs with an animate object, commonly abbreviated VTA.",
	breadcrumb = "transitive animate",
	parents = {"verbs by inflection type", {name = "transitive verbs", sort = "animate"}},
}
labels["transitive inanimate verbs"] = {
	description = "{{{langname}}} transitive verbs with an inanimate object, commonly abbreviated VTI.",
	breadcrumb = "transitive inanimate",
	parents = {"verbs by inflection type", {name = "transitive verbs", sort = "inanimate"}},
}
labels["animate intransitive verbs"] = {
	description = "{{{langname}}} intransitive verbs with an animate subject, commonly abbreviated VAI.",
	breadcrumb = "animate intransitive",
	parents = {"verbs by inflection type", {name = "intransitive verbs", sort = "animate"}},
}
labels["inanimate intransitive verbs"] = {
	description = "{{{langname}}} intransitive verbs with an inanimate subject, commonly abbreviated VII.",
	breadcrumb = "inanimate intransitive",
	parents = {"verbs by inflection type", {name = "intransitive verbs", sort = "inanimate"}},
}
-- Add 'umbrella_parents' key if not already present.
for _, data in pairs(labels) do
	if not data.umbrella_parents then
		data.umbrella_parents = "Terms by grammatical category subcategories by language"
	end
end
return {LABELS = labels}
85qdgi9z8fm0fttnf1ltxgr2wyrtvw0
မဝ်ဂျူ:category tree/fam/sem-ara
828
286002
385646
2026-04-02T19:54:41Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "local labels = {} --[=[ This module handles language-specific categories for all Aramaic varieties. ]=] ----------------------------------------------------------------------------- -- -- -- NOUNS -- -- -- -----------------..."
385646
Scribunto
text/plain
local labels = {}
--[=[
This module handles language-specific categories for all Aramaic varieties.
]=]
-----------------------------------------------------------------------------
-- --
-- NOUNS --
-- --
-----------------------------------------------------------------------------
---------------------------------- Noun labels ---------------------------------
local function add_noun_labels(labels)
	-- Currently there is no [[Appendix:Aramaic nominals]]. Formerly the code conditionalized on the language
	-- but only referenced [[Appendix:Assyrian Neo-Aramaic nominals]] (which also doesn't exist) for lang code 'aii',
	-- and was broken for all other languages. If we want to conditionalize on the language now, we have to make the
	-- description a function (in which case it will be passed an object whose `.lang` field is the language), or use
	-- a handler.
	local nominal_appendix = "Appendix:Aramaic nominals"
	local appendix_exists = mw.title.new(nominal_appendix).exists
	local function make_appendix_link(text, anchor)
		if appendix_exists then
			anchor = anchor or mw.getContentLanguage():ucfirst(text)
			return ("[[%s#%s|%s]]"):format(nominal_appendix, anchor, text)
		else
			return text
		end
	end
	labels["nouns by derivation type"] = {
		description = "{{{langname}}} nouns categorized by type of derivation.",
		breadcrumb = "by derivation type",
		parents = {{name = "nouns", sort = "derivation type"}},
	}
	labels["active nouns"] = {
		description = "{{{langname}}} " .. make_appendix_link("active nouns") .. ", i.e. nouns having the meaning \"one who does X\" for some verb.",
		breadcrumb_and_first_sort_key = "active nouns",
		parents = {"nouns by derivation type"},
	}
	labels["instance nouns"] = {
		description = "{{{langname}}} " .. make_appendix_link("instance nouns") .. ", i.e. nouns having the meaning \"an instance of doing X\" for some verb.",
		breadcrumb_and_first_sort_key = "instance nouns",
		parents = {"nouns by derivation type"},
	}
	labels["nouns of place"] = {
		description = "{{{langname}}} " .. make_appendix_link("nouns of place") .. ", i.e. nouns having the approximate meaning \"the place for doing X\" for some verb.",
		breadcrumb_and_first_sort_key = "nouns of place",
		parents = {"nouns by derivation type"},
	}
	labels["occupational nouns"] = {
		description = "{{{langname}}} " .. make_appendix_link("occupational nouns") .. ", i.e. nouns referring to people employed in doing something.",
		breadcrumb_and_first_sort_key = "occupational nouns",
		parents = {"nouns by derivation type"},
	}
	labels["tool nouns"] = {
		description = "{{{langname}}} " .. make_appendix_link("tool nouns") .. ", i.e. nouns having the approximate meaning \"tool for doing X\" for some verb.",
		breadcrumb_and_first_sort_key = "tool nouns",
		parents = {"nouns by derivation type"},
	}
	-- Add 'umbrella_parents' key if not already present.
	for _, data in pairs(labels) do
		if not data.umbrella_parents then
			data.umbrella_parents = "Lemmas subcategories by language"
		end
	end
end
-----------------------------------------------------------------------------
-- --
-- WRAPPERS --
-- --
-----------------------------------------------------------------------------
add_noun_labels(labels)
return {LABELS = labels}
rw2e49106ubwcdoxppupbvau909v1qc
きかん
0
286003
385647
2026-04-02T20:00:13Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "{{also|きがん|ぎかん|ぎがん}} =={{=ja=}}== # {{ja-def|期間|機関|器官|基幹|季刊|既刊|気管|帰還|帰還|饋還|旗艦}|奇観}}"
385647
wikitext
text/x-wiki
{{also|きがん|ぎかん|ぎがん}}
=={{=ja=}}==
# {{ja-def|期間|機関|器官|基幹|季刊|既刊|気管|帰還|帰還|饋還|旗艦|奇観}}
k3r26pt1iegbhs6e1okki0s4va3mwpx
385648
385647
2026-04-02T20:00:43Z
咽頭べさ
33
385648
wikitext
text/x-wiki
{{also|きがん|ぎかん|ぎがん}}
=={{=ja=}}==
{{ja-see|期間|機関|器官|基幹|季刊|既刊|気管|帰還|帰還|饋還|旗艦|奇観}}
sr9iu6km3ivgvgc2g8yu62yyaeaug0b
385651
385648
2026-04-02T20:10:50Z
咽頭べさ
33
385651
wikitext
text/x-wiki
{{also|きがん|ぎかん|ぎがん}}
=={{=ja=}}==
===နာမ်===
{{ja-noun}}
# {{ja-def|期間}}
# {{ja-def|機関}}
# {{ja-def|器官}}
# {{ja-def|基幹}}
# {{ja-def|季刊}}
# {{ja-def|既刊}}
# {{ja-def|気管}}
# {{ja-def|帰還}}
# {{ja-def|帰還|饋還}}
# {{ja-def|旗艦}}
# {{ja-def|奇観}}
6qim2595pqdodf4x1f8b9e6oabuvojl
ထာမ်ပလိက်:ja-see/documentation
10
286004
385649
2026-04-02T20:04:10Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "{{documentation subpage}} {{uses lua|Module:ja-see}} Soft-redirect template for Japanese entries. This template can be used to redirect alternative spellings to lemma entries. The advantage over {{tl|alternative spelling of|ja}} is that: (1) This template can be the whole content under <code>==Japanese==</code> or <code>===Etymology <var>n</var>===</code>, so that you don't need to copy the PoS header and the headwor..."
385649
wikitext
text/x-wiki
{{documentation subpage}}
{{uses lua|Module:ja-see}}
Soft-redirect template for Japanese entries.
This template can be used to redirect alternative spellings to lemma entries. It has two advantages over {{tl|alternative spelling of|ja}}: (1) it can be the whole content under <code>==Japanese==</code> or <code>===Etymology <var>n</var>===</code>, so you do not need to copy the PoS header and the headword template from the lemma entry to the alternative spelling entries; (2) it offers a preview of the lemma entry, currently the definitions and the other alternative spellings (more can be added in the future).
See {{m|ja|い}} and {{m|ja|貴方}} for examples.
==Usage==
; {{para|1}} and {{para|2}}, {{para|3}} ...
: Entries to redirect to.
; {{para|type|opt=1}} and {{para|type1}}, {{para|type2}}, {{para|type3}} ...
: Type of the redirect, which explains the relation between the redirect entry and the main entry. Predefined types are:
:* {{para|type|rom}}: Romaji.
:* {{para|type|hira}}: Hiragana spelling.
:* {{para|type|kata}}: Katakana spelling.
:* {{para|type|hkana}}: Historical kana spelling.
:* {{para|type|kyu}}: Kyūjitai.
:* {{para|type|kyualt}}: Kyūjitai of an alternative form.
:* {{para|type|alt}}: Alternative form.
: The template can usually detect the redirect types above by itself so this parameter is not needed for them.
:* {{para|type|eshin}}: Extended shinjitai.
: Custom redirect types are also accepted:
:* {{para|type|an uncommon shinjitai form}}
; {{para|key|opt=1}} and {{para|key2}}, {{para|key3}} ...
: This parameter is a kana form used to further filter the alternative forms by reading. It is only used when the redirect type is {{para|type1|alt}} or {{para|type1|kyualt}}; otherwise it has no effect.
; {{para|term}}
: This parameter is used on kana entries to force a transliteration when the automatically generated one is undesired. E.g. <code>term=きに.いる</code> on the page [[きにいる]] to force the transliteration ''kiniiru'' instead of ''kinīru''.
==Requirements on the lemma entry==
This template displays the relevant definitions on the entries it redirects to. Definitions are considered relevant if:
* In the case of {{para|type|rom}}, {{para|type|hira}}, {{para|type|kata}}, one of [[:Category:Japanese headword-line templates|the headword templates]] is required, such as {{temp|ja-noun}}, {{temp|ja-verb}}, etc. The title of the redirect entry should be consistent with a kana spelling in parameters passed to the headword template.
* In the case of {{para|type|alt}}, {{para|type|kyualt}}, the title of the redirect entry (or its shinjitai form) should be listed in either {{temp|ja-kanjitab|alt=...}} or {{temp|ja-def|...}}. If {{para|key}} exists, it should also be one of the kana spellings in the headword templates.
* In the case of {{para|type|kyu}}, the title of the redirect entry should be the kyūjitai of the main entry.
* In the case of all other redirect types, including custom ones, there is no requirement. All definitions found are considered relevant.
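The {{temp|ja-def|...}} requirement for {{para|type|alt}} can be illustrated with a short standalone Lua sketch that checks whether a redirect title is listed as a positional argument of a <code>ja-def</code> call in the lemma's wikitext. The helper name <code>is_listed_in_ja_def</code> is hypothetical, and the sketch makes no claim about how Module:ja-see actually parses the lemma entry; it ignores nested templates and the {{temp|ja-kanjitab|alt=...}} route.

```lua
-- Illustrative sketch only; the helper name is hypothetical.

-- Returns true when `title` is a positional argument of any
-- {{ja-def|...}} call found in `wikitext`, false otherwise.
local function is_listed_in_ja_def(wikitext, title)
	-- Capture the argument list of each {{ja-def|...}} call.
	for args in wikitext:gmatch("{{ja%-def|(.-)}}") do
		-- Split the argument list on "|".
		for arg in (args .. "|"):gmatch("(.-)|") do
			-- Skip named arguments (name=value); they are not spellings.
			if not arg:find("=", 1, true) and arg == title then
				return true
			end
		end
	end
	return false
end

local lemma_text = "# {{ja-def|帰還|饋還}}\n# {{ja-def|旗艦}}"
print(is_listed_in_ja_def(lemma_text, "饋還")) --> true
print(is_listed_in_ja_def(lemma_text, "奇観")) --> false
```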
==Placement of the template==
==={{l|ja|かえる}}===
<pre>
==Japanese==
===Etymology x===
{{ja-see|蛙}}
</pre>
==={{l|ja|蝦}}===
<pre>
==Japanese==
===Etymology x===
{{ja-kanjitab|かえる|yomi=k}}
{{ja-see|蛙}}
</pre>
==={{l|ja|言葉}}===
<pre>
==Japanese==
{{ja-kanjitab|こと|は|k2=ば|yomi=k}}
{{ja-see|ことば}}
</pre>
==={{l|ja|真っ当}}===
<pre>
==Japanese==
{{ja-kanjitab|ま|とう|yomi=k,o|ateji=y}}
===Etymology===
This spelling is an example of {{ja-ateji}}, based on a reanalysis of the term as {{prefix|ja|真っ|tr1=maQ-|pos1=intensifying prefix|当|tr2=tō|t2=[[right]], [[proper]]}}.
===Definitions===
{{ja-see|まっとう}}
</pre>
==={{l|ja|貴方}}===
<pre>
==Japanese==
===Etymology 1===
{{ja-kanjitab|yomi=jukujikun}}
{{ja-see|あなた|jukujikun}}
===Etymology 2===
...
</pre>
or
<pre>
==Japanese==
===Etymology 1===
{{ja-kanjitab|yomi=irreg}}
This spelling is an example of {{ja-jukujikun}}, first attested in ….
====Definitions====
{{ja-see|あなた|jukujikun}}
===Etymology 2===
...
</pre>
==={{l|ja|迄}}===
<pre>
==Japanese==
===Kanji===
...
{{ja-kanjitab|まで|yomi=k}}
===Definitions===
{{ja-see|まで}}
</pre>
or
<pre>
==Japanese==
===Kanji===
...
{{ja-kanjitab|まで|yomi=k}}
===Etymology===
This spelling is ….
===Definitions===
{{ja-see|まで}}
</pre>
(It is also okay to put the kanjitab immediately below the header.)
==Note==
Please use this template for alternative spellings in the Japanese script only. Alternative sound forms (e.g. {{m|ja|おっぱらう}}) and alternative spellings in other scripts (e.g. [[H#Japanese]]) currently still use the old approach.
==See also==
* {{temp|ja-gv}}, for ''kyūjitai'' (old kanji) forms
* {{temp|ja-see-kango}}, for grouping Sino-Japanese terms into one etymology section to save space.
<includeonly>
[[ကဏ္ဍ:ထာမ်ပလိက်ဂျပါန်ဂမၠိုၚ်|see]]
[[ကဏ္ဍ:ထာမ်ပလိက်ရံၚ်ဗပေၚ်ဂမၠိုၚ်]]
</includeonly>
1lk94r2wabtp46y38m3t62w2xr7s36v
奇観
0
286005
385653
2026-04-02T20:18:11Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "=={{=ja=}}== {{ja-kanjitab|き|かん|yomi=on}} ===ဗွဟ်ရမ္သာၚ်=== {{ja-pron|きかん}} ===နာမ်=== {{ja-noun|きかん}} # လညာတ်တၟေၚ်တၟဟ်။"
385653
wikitext
text/x-wiki
=={{=ja=}}==
{{ja-kanjitab|き|かん|yomi=on}}
===ဗွဟ်ရမ္သာၚ်===
{{ja-pron|きかん}}
===နာမ်===
{{ja-noun|きかん}}
# လညာတ်တၟေၚ်တၟဟ်။
kip0qb7n121c1ep5alnp6wu9xjlzbnn
奇觀
0
286006
385654
2026-04-02T20:19:54Z
咽頭べさ
33
ခၞံကၠောန်လဝ် မုက်လိက် နကု "=={{=ja=}}== {{ja-kanjitab|き|かん|yomi=on}} ===ဗွဟ်ရမ္သာၚ်=== {{ja-pron|きかん}} ===နာမ်=== {{ja-noun|きかん}} # {{alternative form of|ja|奇観}}"
385654
wikitext
text/x-wiki
=={{=ja=}}==
{{ja-kanjitab|き|かん|yomi=on}}
===ဗွဟ်ရမ္သာၚ်===
{{ja-pron|きかん}}
===နာမ်===
{{ja-noun|きかん}}
# {{alternative form of|ja|奇観}}
tv7ulzhnssd6pfqsydzo6wiy8prcue8
ဗိတ်ကွိုၚ်
0
286007
385655
2026-04-03T07:54:52Z
Aue Nai
29
ခၞံကၠောန်လဝ် မုက်လိက် နကု "== ဗိတ်ကွိုၚ် (v. / adj.) == === ၁. ပွံက် (Definition) === * '''(ကြိယာ)''' ပွမကၟာတ်လဒဵုလဝ် နကဵုသော၊ ဟွံသေင်မ္ဂး နကဵုသၞောတ်စဵုဒၞာမွဲမွဲ ညံင်ဂွံနွံကဵုဂကန်ခိုင်ကၠိုက်။ * '''(နာမဝိသေသန)''' အ..."
385655
wikitext
text/x-wiki
== ဗိတ်ကွိုၚ် (v. / adj.) ==
=== ၁. ပွံက် (Definition) ===
* '''(ကြိယာ)''' ပွမကၟာတ်လဒဵုလဝ် နကဵုသော၊ ဟွံသေင်မ္ဂး နကဵုသၞောတ်စဵုဒၞာမွဲမွဲ ညံင်ဂွံနွံကဵုဂကန်ခိုင်ကၠိုက်။
* '''(နာမဝိသေသန)''' အကာဲအရာမနွံကဵု ဂုဏ်စရာဲစဵုဒၞာ၊ မနွံကဵုဂကန် (Security)။
=== ၂. တမ်ရိုဟ်ဝေါဟာရ (Etymology) ===
မက္တဵုဒှ်ကၠုင်နူ ဝေါဟာရမန်တြေံ:
* '''ဗိတ် (v.)''' - ပွမကၟာတ်၊ ပွမကၟာတ်လဒဵု။
* '''ကွိုၚ် (v.)''' - ပွမဒက်ဂၞိန်၊ ပွမဒက်ဗဗိုန် (နကဵုသော ဟွံသေင်မ္ဂး ဇုက်)။
=== ၃. ဝေါဟာရပရိယာယ (Synonyms) ===
* <nowiki>[[ကၟာတ်ဗိုန်]]</nowiki>
* <nowiki>[[ဂစပ်လဒဵု]]</nowiki>
=== ၄. ဗီုပြင်စကာ ပ္ဍဲဝါကျ (Usage Examples) ===
* '''Digital Context''': "တင်ဂၞင်သုတေသနဏအ်ဂှ် ဒး '''ဗိတ်ကွိုၚ်''' လဝ် နကဵုသၞောတ် Password ရ။"
* '''Academic Context''': "ပွမ '''ဗိတ်ကွိုၚ်''' လဝ် အဝဵုအုပ်ဓုပ်သံဃာမန်ဂှ် ဒှ်သဇိုင်သုတေသန Ph.D. ရ။"
=== ၅. ကဏ္ဍ (Categories) ===
<nowiki>[[Category:ဝေါဟာရမန်]]</nowiki>
<nowiki>[[Category:ကြိယာမန်]]</nowiki>
<nowiki>[[Category:ယေန်သၞာင်နှဴရဴဗေဒ]]</nowiki>
7owwm1dcvwh0wrwxrym6821vxx0ca90