Modul:data consistency check/dok
This is the documentation subpage for Module:data consistency check.
This module checks the validity and internal consistency of the language, language family, and script data used on Wiktionary: the modules in Kategorija:Jezički moduli podataka as well as Modul:scripts/data.
Output
Discrepancies detected:
- Literary Chinese, the canonical name for the code lzh-lit, is wrong; it should be Literary Chinese.
- The code nds-lpr and the canonical name Low Prussian should be removed; they are not found in Modul:etymology languages/data.
- Literary Chinese, the canonical name for the code lzh-lit, is wrong; it should be Literary Chinese.
- Literary Chinese jezik (lzh-lit) has a canonical name that is not unique; it is also used by the code lzh.
- The data key preprocess_links for ??? (th-new) is invalid.
- The canonical name North Germanic (gmq) is missing.
- Severno germanski, the canonical name for the code gmq, is wrong; it should be North Germanic.
- The code ira-mid and the canonical name Middle Iranian should be removed; they are not found in Module:families/data.
- The code ira-old and the canonical name Old Iranian should be removed; they are not found in Module:families/data.
- The canonical name Northern Ryukyuan (jpx-nry) is missing.
- The canonical name Southern Ryukyuan (jpx-sry) is missing.
- Indo-Aryan, the canonical name for the code inc, is wrong; it should be Indo-Arijan.
- Indo-European, the canonical name for the code ine, is wrong; it should be Indo-Evropski.
- Balto-Slavic, the canonical name for the code ine-bsl, is wrong; it should be Baltoslovenski.
- The code ira-mid and the canonical name Middle Iranian should be removed; they are not found in Module:families/data.
- The code ira-old and the canonical name Old Iranian should be removed; they are not found in Module:families/data.
- Slavic, the canonical name for the code sla, is wrong; it should be Slovenski.
- East Slavic, the canonical name for the code zle, is wrong; it should be Istočnoslovenski.
- Old Indo-Aryan family (inc-old) has no child families or languages.
- Prakrit family (pra) has no child families or languages.
- Southern Amami Ōshima, the canonical name for the code ams, is wrong; it should be Southern Amami-Oshima.
- The canonical name Southern Amami-Oshima (ams) is missing.
- The canonical name Američki znakovni jezik (ase) is missing.
- American Sign Language, the canonical name for the code ase, is wrong; it should be Američki znakovni jezik.
- The canonical name Dhundhari (dhd) is missing.
- Proto-West Germanic, the canonical name for the code gmw-pro, is wrong; it should be Pra-Zapadno Germanski.
- The canonical name Pra-Zapadno Germanski (gmw-pro) is missing.
- The canonical name Proto-Indo-European (ine-pro) is missing.
- Pra-Indo-Evropski, the canonical name for the code ine-pro, is wrong; it should be Proto-Indo-European.
- Aiwoo, the canonical name for the code nfl, is wrong; it should be Äiwoo.
- The canonical name Äiwoo (nfl) is missing.
- Moabite, the canonical name for the code obm, is wrong; it should be Moavski.
- The canonical name Moavski (obm) is missing.
- Pra-Semitski, the canonical name for the code sem-pro, is wrong; it should be Proto-Semitic.
- The canonical name Proto-Semitic (sem-pro) is missing.
- The canonical name Kantonski (yue) is missing.
- Cantonese, the canonical name for the code yue, is wrong; it should be Kantonski.
- Afar, the canonical name for the code aa, is wrong; it should be Afarski.
- Afrikaans, the canonical name for the code af, is wrong; it should be Afrikanski.
- Amharic, the canonical name for the code am, is wrong; it should be Amharski.
- Southern Amami Ōshima, the canonical name for the code ams, is wrong; it should be Southern Amami-Oshima.
- Old English, the canonical name for the code ang, is wrong; it should be Stari Engleski.
- Arabic, the canonical name for the code ar, is wrong; it should be Arapski.
- Aramaic, the canonical name for the code arc, is wrong; it should be Aramejski.
- American Sign Language, the canonical name for the code ase, is wrong; it should be Američki znakovni jezik.
- Azerbaijani, the canonical name for the code az, is wrong; it should be Azerbejdžanski.
- Belarusian, the canonical name for the code be, is wrong; it should be Beloruski.
- Bulgarian, the canonical name for the code bg, is wrong; it should be Bugarski.
- Braj, the canonical name for the code bra, is wrong; it should be Braj.
- Catalan, the canonical name for the code ca, is wrong; it should be Katalonski.
- Mandarin, the canonical name for the code cmn, is wrong; it should be Mandarin.
- Corsican, the canonical name for the code co, is wrong; it should be Korzički.
- Czech, the canonical name for the code cs, is wrong; it should be Češki.
- Welsh, the canonical name for the code cy, is wrong; it should be Velški.
- Danish, the canonical name for the code da, is wrong; it should be Danski.
- German, the canonical name for the code de, is wrong; it should be Nemački.
- Dungan, the canonical name for the code dng, is wrong; it should be Dungan.
- Greek, the canonical name for the code el, is wrong; it should be Grčki.
- English, the canonical name for the code en, is wrong; it should be Engleski.
- Middle English, the canonical name for the code enm, is wrong; it should be Srednji Engleski.
- Esperanto, the canonical name for the code eo, is wrong; it should be Esperanto.
- Spanish, the canonical name for the code es, is wrong; it should be Španski.
- Basque, the canonical name for the code eu, is wrong; it should be Baskijski.
- Finnish, the canonical name for the code fi, is wrong; it should be Finski.
- French, the canonical name for the code fr, is wrong; it should be Francuski.
- Old French, the canonical name for the code fro, is wrong; it should be Stari Francuski.
- Irish, the canonical name for the code ga, is wrong; it should be Irski.
- Proto-West Germanic, the canonical name for the code gmw-pro, is wrong; it should be Pra-Zapadno Germanski.
- Gothic, the canonical name for the code got, is wrong; it should be Gotski.
- Ancient Greek, the canonical name for the code grc, is wrong; it should be Antički Grčki.
- Gujarati, the canonical name for the code gu, is wrong; it should be Gudžarati.
- Hawaiian, the canonical name for the code haw, is wrong; it should be Havajski.
- Hebrew, the canonical name for the code he, is wrong; it should be Hebrejski.
- Hindi, the canonical name for the code hi, is wrong; it should be Hindi.
- Hungarian, the canonical name for the code hu, is wrong; it should be Mađarski.
- Armenian, the canonical name for the code hy, is wrong; it should be Jermenski.
- Ido, the canonical name for the code io, is wrong; it should be Ido.
- Italian, the canonical name for the code it, is wrong; it should be Italijanski.
- Japanese, the canonical name for the code ja, is wrong; it should be Japanski.
- Korean, the canonical name for the code ko, is wrong; it should be Korejski.
- Latin, the canonical name for the code la, is wrong; it should be Latinski.
- Ladino, the canonical name for the code lad, is wrong; it should be Ladino.
- Macedonian, the canonical name for the code mk, is wrong; it should be Makedonski.
- Malayalam, the canonical name for the code ml, is wrong; it should be Malajalam.
- Mongolian, the canonical name for the code mn, is wrong; it should be Mongolski.
- Marathi, the canonical name for the code mr, is wrong; it should be Marati.
- Malay, the canonical name for the code ms, is wrong; it should be Malajski.
- Maltese, the canonical name for the code mt, is wrong; it should be Malteški.
- Translingual, the canonical name for the code mul, is wrong; it should be Međunarodni.
- Nepali, the canonical name for the code ne, is wrong; it should be Nepali.
- Dutch, the canonical name for the code nl, is wrong; it should be Holandski.
- Norwegian, the canonical name for the code no, is wrong; it should be Norveški.
- Moabite, the canonical name for the code obm, is wrong; it should be Moavski.
- Okinoerabu, the canonical name for the code okn, is wrong; it should be Oki-No-Erabu.
- Old Marathi, the canonical name for the code omr, is wrong; it should be Stari Marati.
- Old Tamil, the canonical name for the code oty, is wrong; it should be Stari Tamilski.
- Pali, the canonical name for the code pi, is wrong; it should be Pali.
- Polish, the canonical name for the code pl, is wrong; it should be Poljski.
- Portuguese, the canonical name for the code pt, is wrong; it should be Portugalski.
- Romanian, the canonical name for the code ro, is wrong; it should be Rumunski.
- Russian, the canonical name for the code ru, is wrong; it should be Ruski.
- Sanskrit, the canonical name for the code sa, is wrong; it should be Sanskrt.
- Scots, the canonical name for the code sco, is wrong; it should be Škotski.
- Serbo-Croatian, the canonical name for the code sh, is wrong; it should be Srpskohrvatski.
- Slovak, the canonical name for the code sk, is wrong; it should be Slovački.
- Slovene, the canonical name for the code sl, is wrong; it should be Slovenski.
- Proto-Slavic, the canonical name for the code sla-pro, is wrong; it should be Pra-Slovenski.
- Albanian, the canonical name for the code sq, is wrong; it should be Albanski.
- Swedish, the canonical name for the code sv, is wrong; it should be Švedski.
- Thai, the canonical name for the code th, is wrong; it should be Tajski.
- Tokunoshima, the canonical name for the code tkn, is wrong; it should be Toku-No-Shima.
- Tagalog, the canonical name for the code tl, is wrong; it should be Tagalog.
- Tok Pisin, the canonical name for the code tpi, is wrong; it should be Tok Pisin.
- Turkish, the canonical name for the code tr, is wrong; it should be Turski.
- Ukrainian, the canonical name for the code uk, is wrong; it should be Ukrajinski.
- Vietnamese, the canonical name for the code vi, is wrong; it should be Vijetnamski.
- Yiddish, the canonical name for the code yi, is wrong; it should be Jidiš.
- Cantonese, the canonical name for the code yue, is wrong; it should be Kantonski.
- Norwegian Bokmål jezik (nb) has Middle Norwegian jezik (gmq-mno) set as an ancestor, but is not in the West Scandinavian family (gmq-wes).
- Norwegian Bokmål jezik (nb) has Danski jezik (da) set as an ancestor, but is not in the East Scandinavian family (gmq-eas).
- Southern Amami-Oshima, the canonical name for ams, is repeated in the table of aliases.
- Panyi Bai, the canonical name for bfc, is repeated in the table of otherNames.
- Daakaka, the canonical name for bpa, is repeated in the table of otherNames.
- Caribbean Hindustani jezik (hns) has Bhojpuri jezik (bho) set as an ancestor, but is not in the Eastern Indo-Aryan family (inc-eas).
- Caribbean Hindustani jezik (hns) has Awadhi jezik (awa) set as an ancestor, but is not in the Eastern Hindi family (inc-hie).
- Äiwoo, the canonical name for nfl, is repeated in the table of otherNames.
- Toku-No-Shima, the canonical name for tkn, is repeated in the table of aliases.
- Ura (Papua New Guinea), the canonical name for uro, is repeated in the table of otherNames.
- Wiradjuri, the canonical name for wrh, is repeated in the table of otherNames.
- Arapski, the canonical name for the code Arab, is wrong; it should be Arabic.
- Armenian (Armn) is missing.
- Jermenski, the canonical name for the code Armn, is wrong; it should be Armenian.
- Old Cyrillic (Cyrs) is missing.
- Stara Ćirilica, the canonical name for the code Cyrs, is wrong; it should be Old Cyrillic.
- Gotski, the canonical name for the code Goth, is wrong; it should be Gothic.
- Gothic (Goth) is missing.
- Grčki, the canonical name for the code Grek, is wrong; it should be Greek.
- Gudžarati, the canonical name for the code Gujr, is wrong; it should be Gujarati.
- Gujarati (Gujr) is missing.
- Hangul, the canonical name for the code Hang, is wrong; it should be Hangul.
- Hangul (Hang) is missing.
- Han (Hani) is missing.
- Han, the canonical name for the code Hani, is wrong; it should be Han.
- Hebrew (Hebr) is missing.
- Hebrejski, the canonical name for the code Hebr, is wrong; it should be Hebrew.
- Japanski, the canonical name for the code Jpan, is wrong; it should be Japanese.
- Japanese (Jpan) is missing.
- Kannada, the canonical name for the code Knda, is wrong; it should be Kannada.
- Kannada (Knda) is missing.
- Korean (Kore) is missing.
- Korejski, the canonical name for the code Kore, is wrong; it should be Korean.
- Latinica (Latn) is missing.
- Latinski, the canonical name for the code Latn, is wrong; it should be Latinica.
- Malajalam, the canonical name for the code Mlym, is wrong; it should be Malayalam.
- Malayalam (Mlym) is missing.
- Feničanski (Phnx) is missing.
- Phoenician, the canonical name for the code Phnx, is wrong; it should be Feničanski.
- Tamil (Taml) is missing.
- Tamilski, the canonical name for the code Taml, is wrong; it should be Tamil.
- Telugu, the canonical name for the code Telu, is wrong; it should be Telugu.
- Telugu (Telu) is missing.
- Tibetski, the canonical name for the code Tibt, is wrong; it should be Tibetan.
- Tibetan (Tibt) is missing.
- Adlam, the canonical name for the code Adlm, is wrong; it should be Adlam.
- Cyrillic, the canonical name for the code Cyrl, is wrong; it should be Ćirilica.
- Devanagari, the canonical name for the code Deva, is wrong; it should be Devanagari.
- Hiragana, the canonical name for the code Hira, is wrong; it should be Hiragana.
- Katakana, the canonical name for the code Kana, is wrong; it should be Katakana.
- Phoenician, the canonical name for the code Phnx, is wrong; it should be Feničanski.
- Thai, the canonical name for the code Thai, is wrong; it should be Tajski.
- Blissymbols tekst (Blis) is not used by any language and has no characters listed for auto-detection.
- Cypro-Minoan tekst (Cpmn) is not used by any language.
- Hiragana tekst (Hira) is not used by any language.
- Kana tekst (Hrkt) is not used by any language.
- Image-rendered tekst (Image) is not used by any language and has no characters listed for auto-detection.
- International Phonetic Alphabet tekst (Ipach) is not used by any language and has no characters listed for auto-detection.
- Moon tekst (Moon) is not used by any language and has no characters listed for auto-detection.
- Morse code (Morse) is not used by any language and has no characters listed for auto-detection.
- Musical notation tekst (Music) is not used by any language.
- Unspecified tekst (None) is not used by any language and has no characters listed for auto-detection.
- Rongorongo tekst (Roro) is not used by any language and has no characters listed for auto-detection.
- Rumi numerals tekst (Rumin) is not used by any language.
- flag semaphore (Semap) is not used by any language and has no characters listed for auto-detection.
- Visible Speech tekst (Visp) is not used by any language and has no characters listed for auto-detection.
- mathematical notation tekst (Zmth) is not used by any language.
- symbol tekst (Zsym) is not used by any language.
- undetermined tekst (Zyyy) is not used by any language and has no characters listed for auto-detection.
- uncoded tekst (Zzzz) is not used by any language and has no characters listed for auto-detection.
- The codes fa-Arab, ug-Arab, ks-Arab, ps-Arab, ur-Arab, tt-Arab, ota-Arab, ku-Arab, mzn-Arab and sd-Arab are currently alias codes. Only one code should be used in the data.
- The codes ms-Arab and kk-Arab are currently alias codes. Only one code should be used in the data.
- The data key sort_by_scraping for Japanese tekst (Jpan) is invalid.
- Code: aa. Saw name: Afar. Expected name: Afarski.
- Code: af. Saw name: Afrikaans. Expected name: Afrikanski.
- Code: als. Saw name: Albanian. Expected name: Albanski.
- Code: ams. Saw name: Southern Amami Ōshima. Expected name: Southern Amami-Oshima.
- Code: ang. Saw name: Old English. Expected name: Stari Engleski.
- Code: ar. Saw name: Arabic. Expected name: Arapski.
- Code: arc. Saw name: Aramaic. Expected name: Aramejski.
- Code: az. Saw name: Azerbaijani. Expected name: Azerbejdžanski.
- Code: be. Saw name: Belarusian. Expected name: Beloruski.
- Code: bg. Saw name: Bulgarian. Expected name: Bugarski.
- Code: ca. Saw name: Catalan. Expected name: Katalonski.
- Code: cmn. Saw name: Mandarin. Expected name: Mandarin.
- Code: cmn-ear. Saw name: Mandarin. Expected name: Mandarin.
- Code: co. Saw name: Corsican. Expected name: Korzički.
- Code: cs. Saw name: Czech. Expected name: Češki.
- Code: cy. Saw name: Welsh. Expected name: Velški.
- Code: da. Saw name: Danish. Expected name: Danski.
- Code: de. Saw name: German. Expected name: Nemački.
- Code: dng. Saw name: Dungan. Expected name: Dungan.
- Code: el. Saw name: Greek. Expected name: Grčki.
- Code: en. Saw name: English. Expected name: Engleski.
- Code: enm. Saw name: Middle English. Expected name: Srednji Engleski.
- Code: eo. Saw name: Esperanto. Expected name: Esperanto.
- Code: es. Saw name: Spanish. Expected name: Španski.
- Code: eu. Saw name: Basque. Expected name: Baskijski.
- Code: fi. Saw name: Finnish. Expected name: Finski.
- Code: fr. Saw name: French. Expected name: Francuski.
- Code: fr-CA. Saw name: French. Expected name: Francuski.
- Code: frk. Saw name: Proto-West Germanic. Expected name: Pra-Zapadno Germanski.
- Code: fro. Saw name: Old French. Expected name: Stari Francuski.
- Code: fro-nor. Saw name: Old French. Expected name: Stari Francuski.
- Code: ga. Saw name: Irish. Expected name: Irski.
- Code: gem. Saw name: Germanic. Expected name: Germanski.
- Code: gem-pro. Saw name: Proto-Germanic. Expected name: Pra-Germanski.
- Code: gkm. Saw name: Ancient Greek. Expected name: Antički Grčki.
- Code: gmw-pro. Saw name: Proto-West Germanic. Expected name: Pra-Zapadno Germanski.
- Code: got. Saw name: Gothic. Expected name: Gotski.
- Code: grc. Saw name: Ancient Greek. Expected name: Antički Grčki.
- Code: gu. Saw name: Gujarati. Expected name: Gudžarati.
- Code: haw. Saw name: Hawaiian. Expected name: Havajski.
- Code: he. Saw name: Hebrew. Expected name: Hebrejski.
- Code: hi. Saw name: Hindi. Expected name: Hindi.
- Code: hu. Saw name: Hungarian. Expected name: Mađarski.
- Code: hy. Saw name: Armenian. Expected name: Jermenski.
- Code: io. Saw name: Ido. Expected name: Ido.
- Code: it. Saw name: Italian. Expected name: Italijanski.
- Code: itc-ola. Saw name: Latin. Expected name: Latinski.
- Code: ja. Saw name: Japanese. Expected name: Japanski.
- Code: ko. Saw name: Korean. Expected name: Korejski.
- Code: la. Saw name: Latin. Expected name: Latinski.
- Code: lad. Saw name: Ladino. Expected name: Ladino.
- Code: mk. Saw name: Macedonian. Expected name: Makedonski.
- Code: ml. Saw name: Malayalam. Expected name: Malajalam.
- Code: mn. Saw name: Mongolian. Expected name: Mongolski.
- Code: mr. Saw name: Marathi. Expected name: Marati.
- Code: ms. Saw name: Malay. Expected name: Malajski.
- Code: ms-cla. Saw name: Malay. Expected name: Malajski.
- Code: ms-old. Saw name: Malay. Expected name: Malajski.
- Code: mt. Saw name: Maltese. Expected name: Malteški.
- Code: mul. Saw name: Translingual. Expected name: Međunarodni.
- Code: ne. Saw name: Nepali. Expected name: Nepali.
- Code: nl. Saw name: Dutch. Expected name: Holandski.
- Code: no. Saw name: Norwegian. Expected name: Norveški.
- Code: okn. Saw name: Okinoerabu. Expected name: Oki-No-Erabu.
- Code: pi. Saw name: Pali. Expected name: Pali.
- Code: pl. Saw name: Polish. Expected name: Poljski.
- Code: pt. Saw name: Portuguese. Expected name: Portugalski.
- Code: ro. Saw name: Romanian. Expected name: Rumunski.
- Code: ru. Saw name: Russian. Expected name: Ruski.
- Code: sa. Saw name: Sanskrit. Expected name: Sanskrt.
- Code: sa-ved. Saw name: Sanskrit. Expected name: Sanskrt.
- Code: sco. Saw name: Scots. Expected name: Škotski.
- Code: sh. Saw name: Serbo-Croatian. Expected name: Srpskohrvatski.
- Code: sk. Saw name: Slovak. Expected name: Slovački.
- Code: sl. Saw name: Slovene. Expected name: Slovenski.
- Code: sla. Saw name: Slavic. Expected name: Slovenski.
- Code: sla-pro. Saw name: Proto-Slavic. Expected name: Pra-Slovenski.
- Code: sq. Saw name: Albanian. Expected name: Albanski.
- Code: sv. Saw name: Swedish. Expected name: Švedski.
- Code: ta. Saw name: Tamil. Expected name: Tamil.
- Code: th. Saw name: Thai. Expected name: Tajski.
- Code: tl. Saw name: Tagalog. Expected name: Tagalog.
- Code: tpi. Saw name: Tok Pisin. Expected name: Tok Pisin.
- Code: tr. Saw name: Turkish. Expected name: Turski.
- Code: uk. Saw name: Ukrainian. Expected name: Ukrajinski.
- Code: vi. Saw name: Vietnamese. Expected name: Vijetnamski.
- Code: xno. Saw name: Old French. Expected name: Stari Francuski.
- Code: yi. Saw name: Yiddish. Expected name: Jidiš.
- Code: yue. Saw name: Cantonese. Expected name: Kantonski.
- Code: zh. Saw name: Chinese. Expected name: Kineski.
Checks performed
For multiple data modules:
- Codes for languages, families and etymology-only languages must be unique and cannot clash with one another.
- Canonical names for languages, families, and etymology-only languages must not be found in the list of other names.
- Each name in the list of other names must appear only once.
- otherNames, if present, must be an array.
- Wikidata item IDs must be a positive integer or a string starting with Q and ending with decimal digits.
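The checks on other names and Wikidata IDs can be illustrated with a minimal sketch. It is written in Python rather than the module's Lua, the function names are hypothetical, and it reads "a string starting with Q and ending with decimal digits" as Q followed only by digits, which is an assumption about the module's behaviour.

```python
import re


def is_valid_wikidata_id(value):
    """Accept a positive integer, or a string of the form Q<digits> (assumed reading)."""
    if isinstance(value, bool):
        return False  # bool is a subclass of int in Python; reject it explicitly
    if isinstance(value, int):
        return value > 0
    if isinstance(value, str):
        return re.fullmatch(r"Q[0-9]+", value) is not None
    return False


def check_other_names(canonical_name, other_names):
    """Return a list of problems found in an otherNames list (hypothetical helper)."""
    if not isinstance(other_names, list):
        return ["otherNames must be an array"]
    problems = []
    seen = set()
    for name in other_names:
        if name == canonical_name:
            problems.append(f"{name} is the canonical name and is repeated in otherNames")
        if name in seen:
            problems.append(f"{name} appears more than once in otherNames")
        seen.add(name)
    return problems
```

For example, check_other_names("Daakaka", ["Dakaka", "Daakaka"]) reports the repeated canonical name, which corresponds to the "is repeated in the table of otherNames" messages in the output above.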
The following must be true of the data used by Module:languages:
- Each code must be defined in the correct submodule according to whether it is two-letter, three-letter or exceptional.
- The canonical name (field 1) must be present and must not be the same as the canonical name of another language.
- If field 2 is not nil, it must be a valid Wikidata item ID.
- If field 3 or family is given and not nil, it must be a valid family code.
- If field 4 or scripts is given and not nil, it must be an array, and each string in the array must be a valid script code.
- If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code.
- If family is given, it must be a valid family code.
- If type is given, it must be one of the recognised values (regular, reconstructed, appendix-constructed).
- If entry_name is given, it must be a table that contains either two arrays (from and to) or a string (remove_diacritics) or both.
- If sort_key is given, it may either be a string, or a table that in turn contains either two arrays (from and to) or a string (remove_diacritics).
- If entry_name or sort_key is given, the from array must be longer than or equal in length to the to array.
- If standardChars is given, it must form a valid Lua string pattern when placed between square brackets with ^ before it ("[^...]"). (It should match all characters regularly used in the language, but that cannot be tested.)
- If override_translit is set, translit must also be set, because there must be a transliteration module that can override manual transliteration.
- If link_tr is present, it must be true.
- There must be no data keys besides these: 1, 2, 3, "entry_name", "sort_key", "display", "otherNames", "aliases", "varieties", "type", "scripts", "ancestors", "wikimedia_codes", "wikipedia_article", "standardChars", "translit", "override_translit", "link_tr".
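A few of these per-language rules can be sketched as follows. This is an illustration in Python, not the module's Lua code: check_language is a hypothetical name, a language entry is modelled as a dict, and only the key list and type values are taken from the text above. The Lua pattern checks are left out because they cannot be reproduced faithfully outside Lua.

```python
# Key list and type values copied from the documentation above.
ALLOWED_KEYS = {
    1, 2, 3, "entry_name", "sort_key", "display", "otherNames", "aliases",
    "varieties", "type", "scripts", "ancestors", "wikimedia_codes",
    "wikipedia_article", "standardChars", "translit", "override_translit",
    "link_tr",
}
ALLOWED_TYPES = {"regular", "reconstructed", "appendix-constructed"}


def check_language(code, data):
    """Return a list of problems for one language entry (hypothetical helper)."""
    problems = []
    for key in data:
        if key not in ALLOWED_KEYS:
            problems.append(f"The data key {key} for {code} is invalid.")
    if "type" in data and data["type"] not in ALLOWED_TYPES:
        problems.append(f"{code}: unrecognised type {data['type']}")
    for field in ("entry_name", "sort_key"):
        value = data.get(field)
        if isinstance(value, dict):
            # The from array must be at least as long as the to array.
            if len(value.get("from", [])) < len(value.get("to", [])):
                problems.append(f"{code}: {field}.from is shorter than {field}.to")
    if data.get("override_translit") and not data.get("translit"):
        problems.append(f"{code}: override_translit is set but translit is not")
    if "link_tr" in data and data["link_tr"] is not True:
        problems.append(f"{code}: link_tr must be true if present")
    return problems
```

An entry carrying an unlisted key such as preprocess_links would produce a message of the same shape as the "data key ... is invalid" line in the output above.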
Checks not performed:
- If translit is present, it should be the name of a module, and this module should contain a tr function that takes a pagename (and optionally a language code and script code) as arguments.
- If sort_key is a string, it should be the name of a module, and this module should contain a makeSortKey function that takes a pagename (and optionally a language code and script code) as arguments.
- If entry_name or sort_key is a table and contains a field remove_diacritics, the value of the field should be a string that forms a valid Lua pattern when it is placed inside negated set notation ([^...]).
These are not checked here, because module errors will quickly crop up in entries if these conditions are not met, assuming that Module:utilities attempts to generate a sortkey for a category pertaining to the language in question, or full_link attempts to use the transliteration module.
Module:languages/code to canonical name and Module:languages/canonical names must contain all the codes and canonical names found in the data submodules of Module:languages, and no more.
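A two-way comparison of this kind can be sketched as below. This is a simplified Python illustration, not the module's actual logic: compare_lookup is a hypothetical name, each language entry is a dict whose key 1 holds the canonical name, and the message wording only imitates the output listed earlier.

```python
def compare_lookup(language_data, code_to_name):
    """Compare language data against a code -> canonical name lookup table."""
    problems = []
    # Every code in the data must be in the lookup table with the same name.
    for code, data in language_data.items():
        expected = data[1]
        seen = code_to_name.get(code)
        if seen is None:
            problems.append(f"The canonical name {expected} ({code}) is missing.")
        elif seen != expected:
            problems.append(
                f"{seen}, the canonical name for the code {code}, is wrong; "
                f"it should be {expected}."
            )
    # The lookup table must not contain anything that is absent from the data.
    for code, name in code_to_name.items():
        if code not in language_data:
            problems.append(
                f"The code {code} and the canonical name {name} should be removed."
            )
    return problems
```

The second loop is what enforces the "and no more" part of the rule.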
The following must be true of the data used by Module:etymology languages:
- canonicalName must be given.
- parent must be given and must be a valid language, family or etymology-only language code.
- If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code. The etymology language should also be listed as the ancestor of a regular language.
- There must be no data keys besides these: "canonicalName", "otherNames", "parent", "ancestors", "wikipedia_article", "wikidata_item".
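The canonicalName and parent rules might be checked along these lines. This is a hedged Python sketch with hypothetical names; etymology-language entries are modelled as dicts, and the sets of valid language and family codes are assumed to be supplied by the caller.

```python
def check_etymology_languages(etym_langs, language_codes, family_codes):
    """Check canonicalName and parent for each etymology-only language."""
    # A parent may be a language, a family, or another etymology-only language.
    valid_parents = set(language_codes) | set(family_codes) | set(etym_langs)
    problems = []
    for code, data in etym_langs.items():
        if "canonicalName" not in data:
            problems.append(f"{code}: canonicalName is missing")
        parent = data.get("parent")
        if parent is None:
            problems.append(f"{code}: parent is missing")
        elif parent not in valid_parents:
            problems.append(f"{code}: parent {parent} is not a valid code")
    return problems
```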
Codes in Module:families data must:
- Have canonicalName, which must not be the same as the canonical name of another family.
- Have a valid family code in family, if it is given.
- Have at least one language or subfamily belonging to it.
- Have no data keys besides these: "canonicalName", "otherNames", "family", "protoLanguage", "wikidata_item".
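The "at least one language or subfamily" rule amounts to collecting every family code that something points to and reporting the rest. A minimal Python sketch follows, assuming dict-shaped data in which a language gives its family as field 3 or family; the function name is hypothetical.

```python
def childless_families(families, languages):
    """Return the family codes that no language or subfamily belongs to."""
    has_child = set()
    for code, data in families.items():
        parent = data.get("family")
        if parent and parent != code:  # ignore a family that lists itself
            has_child.add(parent)
    for data in languages.values():
        family = data.get(3) or data.get("family")
        if family:
            has_child.add(family)
    return sorted(code for code in families if code not in has_child)
```

Each returned code would correspond to a "has no child families or languages" message in the output above.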
Codes in Module:scripts data must:
- Have canonicalName.
- Have at least one language that lists it as one of its scripts.
- Have a characters pattern for script autodetection, and this must form a valid Lua string pattern when placed between square brackets ("[...]"). (It should match all characters in the script, but that cannot be tested.)
- Have no data keys besides these: "canonicalName", "otherNames", "parent", "systems", "wikipedia_article", "characters", "direction".
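The usage and characters rules can be sketched together, since the output above combines them into one message per script. This Python illustration uses hypothetical names and dict-shaped data; it only tests whether characters is present and does not validate it as a Lua pattern.

```python
def check_scripts(scripts, languages):
    """Report scripts that are unused or lack a characters pattern."""
    used = set()
    for data in languages.values():
        # A language lists its scripts as field 4 or under the scripts key.
        used.update(data.get("scripts") or data.get(4) or [])
    problems = []
    for code, data in scripts.items():
        issues = []
        if code not in used:
            issues.append("is not used by any language")
        if not data.get("characters"):
            issues.append("has no characters listed for auto-detection")
        if issues:
            name = data.get("canonicalName", "???")
            problems.append(f"{name} ({code}) " + " and ".join(issues) + ".")
    return problems
```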