Modul:data consistency check/dok

This is the documentation subpage for Модул:data consistency check.

This module checks the validity and internal consistency of the language, language family, and script data used on Wiktionary: the modules in Kategorija:Jezički moduli podataka as well as Modul:scripts/data.

Output

Discrepancies detected:

  • Literary Chinese, the canonical name for the code lzh-lit, is wrong; it should be Literary Chinese.
  • The code nds-lpr and the canonical name Low Prussian should be removed; they are not found in Modul:etymology languages/data.
  • Literary Chinese, the canonical name for the code lzh-lit, is wrong; it should be Literary Chinese.
  • Literary Chinese jezik (lzh-lit) has a canonical name that is not unique; it is also used by the code lzh.
  • The data key preprocess_links for ??? (th-new) is invalid.
  • The canonical name North Germanic (gmq) is missing.
  • Severno germanski, the canonical name for the code gmq, is wrong; it should be North Germanic.
  • The code ira-mid and the canonical name Middle Iranian should be removed; they are not found in Module:families/data.
  • The code ira-old and the canonical name Old Iranian should be removed; they are not found in Module:families/data.
  • The canonical name Northern Ryukyuan (jpx-nry) is missing.
  • The canonical name Southern Ryukyuan (jpx-sry) is missing.
  • Indo-Aryan, the canonical name for the code inc, is wrong; it should be Indo-Arijan.
  • Indo-European, the canonical name for the code ine, is wrong; it should be Indo-Evropski.
  • Balto-Slavic, the canonical name for the code ine-bsl, is wrong; it should be Baltoslovenski.
  • The code ira-mid and the canonical name Middle Iranian should be removed; they are not found in Module:families/data.
  • The code ira-old and the canonical name Old Iranian should be removed; they are not found in Module:families/data.
  • Slavic, the canonical name for the code sla, is wrong; it should be Slovenski.
  • East Slavic, the canonical name for the code zle, is wrong; it should be Istočnoslovenski.
  • Southern Amami Ōshima, the canonical name for the code ams, is wrong; it should be Southern Amami-Oshima.
  • The canonical name Southern Amami-Oshima (ams) is missing.
  • The canonical name Američki znakovni jezik (ase) is missing.
  • American Sign Language, the canonical name for the code ase, is wrong; it should be Američki znakovni jezik.
  • The canonical name Dhundhari (dhd) is missing.
  • Proto-West Germanic, the canonical name for the code gmw-pro, is wrong; it should be Pra-Zapadno Germanski.
  • The canonical name Pra-Zapadno Germanski (gmw-pro) is missing.
  • The canonical name Proto-Indo-European (ine-pro) is missing.
  • Pra-Indo-Evropski, the canonical name for the code ine-pro, is wrong; it should be Proto-Indo-European.
  • Aiwoo, the canonical name for the code nfl, is wrong; it should be Äiwoo.
  • The canonical name Äiwoo (nfl) is missing.
  • Moabite, the canonical name for the code obm, is wrong; it should be Moavski.
  • The canonical name Moavski (obm) is missing.
  • Pra-Semitski, the canonical name for the code sem-pro, is wrong; it should be Proto-Semitic.
  • The canonical name Proto-Semitic (sem-pro) is missing.
  • The canonical name Kantonski (yue) is missing.
  • Cantonese, the canonical name for the code yue, is wrong; it should be Kantonski.
  • Afar, the canonical name for the code aa, is wrong; it should be Afarski.
  • Afrikaans, the canonical name for the code af, is wrong; it should be Afrikanski.
  • Amharic, the canonical name for the code am, is wrong; it should be Amharski.
  • Southern Amami Ōshima, the canonical name for the code ams, is wrong; it should be Southern Amami-Oshima.
  • Old English, the canonical name for the code ang, is wrong; it should be Stari Engleski.
  • Arabic, the canonical name for the code ar, is wrong; it should be Arapski.
  • Aramaic, the canonical name for the code arc, is wrong; it should be Aramejski.
  • American Sign Language, the canonical name for the code ase, is wrong; it should be Američki znakovni jezik.
  • Azerbaijani, the canonical name for the code az, is wrong; it should be Azerbejdžanski.
  • Belarusian, the canonical name for the code be, is wrong; it should be Beloruski.
  • Bulgarian, the canonical name for the code bg, is wrong; it should be Bugarski.
  • Braj, the canonical name for the code bra, is wrong; it should be Braj.
  • Catalan, the canonical name for the code ca, is wrong; it should be Katalonski.
  • Mandarin, the canonical name for the code cmn, is wrong; it should be Mandarin.
  • Corsican, the canonical name for the code co, is wrong; it should be Korzički.
  • Czech, the canonical name for the code cs, is wrong; it should be Češki.
  • Welsh, the canonical name for the code cy, is wrong; it should be Velški.
  • Danish, the canonical name for the code da, is wrong; it should be Danski.
  • German, the canonical name for the code de, is wrong; it should be Nemački.
  • Dungan, the canonical name for the code dng, is wrong; it should be Dungan.
  • Greek, the canonical name for the code el, is wrong; it should be Grčki.
  • English, the canonical name for the code en, is wrong; it should be Engleski.
  • Middle English, the canonical name for the code enm, is wrong; it should be Srednji Engleski.
  • Esperanto, the canonical name for the code eo, is wrong; it should be Esperanto.
  • Spanish, the canonical name for the code es, is wrong; it should be Španski.
  • Basque, the canonical name for the code eu, is wrong; it should be Baskijski.
  • Finnish, the canonical name for the code fi, is wrong; it should be Finski.
  • French, the canonical name for the code fr, is wrong; it should be Francuski.
  • Old French, the canonical name for the code fro, is wrong; it should be Stari Francuski.
  • Irish, the canonical name for the code ga, is wrong; it should be Irski.
  • Proto-West Germanic, the canonical name for the code gmw-pro, is wrong; it should be Pra-Zapadno Germanski.
  • Gothic, the canonical name for the code got, is wrong; it should be Gotski.
  • Ancient Greek, the canonical name for the code grc, is wrong; it should be Antički Grčki.
  • Gujarati, the canonical name for the code gu, is wrong; it should be Gudžarati.
  • Hawaiian, the canonical name for the code haw, is wrong; it should be Havajski.
  • Hebrew, the canonical name for the code he, is wrong; it should be Hebrejski.
  • Hindi, the canonical name for the code hi, is wrong; it should be Hindi.
  • Hungarian, the canonical name for the code hu, is wrong; it should be Mađarski.
  • Armenian, the canonical name for the code hy, is wrong; it should be Jermenski.
  • Ido, the canonical name for the code io, is wrong; it should be Ido.
  • Italian, the canonical name for the code it, is wrong; it should be Italijanski.
  • Japanese, the canonical name for the code ja, is wrong; it should be Japanski.
  • Korean, the canonical name for the code ko, is wrong; it should be Korejski.
  • Latin, the canonical name for the code la, is wrong; it should be Latinski.
  • Ladino, the canonical name for the code lad, is wrong; it should be Ladino.
  • Macedonian, the canonical name for the code mk, is wrong; it should be Makedonski.
  • Malayalam, the canonical name for the code ml, is wrong; it should be Malajalam.
  • Mongolian, the canonical name for the code mn, is wrong; it should be Mongolski.
  • Marathi, the canonical name for the code mr, is wrong; it should be Marati.
  • Malay, the canonical name for the code ms, is wrong; it should be Malajski.
  • Maltese, the canonical name for the code mt, is wrong; it should be Malteški.
  • Translingual, the canonical name for the code mul, is wrong; it should be Međunarodni.
  • Nepali, the canonical name for the code ne, is wrong; it should be Nepali.
  • Dutch, the canonical name for the code nl, is wrong; it should be Holandski.
  • Norwegian, the canonical name for the code no, is wrong; it should be Norveški.
  • Moabite, the canonical name for the code obm, is wrong; it should be Moavski.
  • Okinoerabu, the canonical name for the code okn, is wrong; it should be Oki-No-Erabu.
  • Old Marathi, the canonical name for the code omr, is wrong; it should be Stari Marati.
  • Old Tamil, the canonical name for the code oty, is wrong; it should be Stari Tamilski.
  • Pali, the canonical name for the code pi, is wrong; it should be Pali.
  • Polish, the canonical name for the code pl, is wrong; it should be Poljski.
  • Portuguese, the canonical name for the code pt, is wrong; it should be Portugalski.
  • Romanian, the canonical name for the code ro, is wrong; it should be Rumunski.
  • Russian, the canonical name for the code ru, is wrong; it should be Ruski.
  • Sanskrit, the canonical name for the code sa, is wrong; it should be Sanskrt.
  • Scots, the canonical name for the code sco, is wrong; it should be Škotski.
  • Serbo-Croatian, the canonical name for the code sh, is wrong; it should be Srpskohrvatski.
  • Slovak, the canonical name for the code sk, is wrong; it should be Slovački.
  • Slovene, the canonical name for the code sl, is wrong; it should be Slovenski.
  • Proto-Slavic, the canonical name for the code sla-pro, is wrong; it should be Pra-Slovenski.
  • Albanian, the canonical name for the code sq, is wrong; it should be Albanski.
  • Swedish, the canonical name for the code sv, is wrong; it should be Švedski.
  • Thai, the canonical name for the code th, is wrong; it should be Tajski.
  • Tokunoshima, the canonical name for the code tkn, is wrong; it should be Toku-No-Shima.
  • Tagalog, the canonical name for the code tl, is wrong; it should be Tagalog.
  • Tok Pisin, the canonical name for the code tpi, is wrong; it should be Tok Pisin.
  • Turkish, the canonical name for the code tr, is wrong; it should be Turski.
  • Ukrainian, the canonical name for the code uk, is wrong; it should be Ukrajinski.
  • Vietnamese, the canonical name for the code vi, is wrong; it should be Vijetnamski.
  • Yiddish, the canonical name for the code yi, is wrong; it should be Jidiš.
  • Cantonese, the canonical name for the code yue, is wrong; it should be Kantonski.
  • Southern Amami-Oshima, the canonical name for ams, is repeated in the table of aliases.
  • Panyi Bai, the canonical name for bfc, is repeated in the table of otherNames.
  • Daakaka, the canonical name for bpa, is repeated in the table of otherNames.
  • Äiwoo, the canonical name for nfl, is repeated in the table of otherNames.
  • Toku-No-Shima, the canonical name for tkn, is repeated in the table of aliases.
  • Ura (Papua New Guinea), the canonical name for uro, is repeated in the table of otherNames.
  • Wiradjuri, the canonical name for wrh, is repeated in the table of otherNames.
  • Arapski, the canonical name for the code Arab, is wrong; it should be Arabic.
  • Armenian (Armn) is missing
  • Jermenski, the canonical name for the code Armn, is wrong; it should be Armenian.
  • Old Cyrillic (Cyrs) is missing
  • Stara Ćirilica, the canonical name for the code Cyrs, is wrong; it should be Old Cyrillic.
  • Gotski, the canonical name for the code Goth, is wrong; it should be Gothic.
  • Gothic (Goth) is missing
  • Grčki, the canonical name for the code Grek, is wrong; it should be Greek.
  • Gudžarati, the canonical name for the code Gujr, is wrong; it should be Gujarati.
  • Gujarati (Gujr) is missing
  • Hangul, the canonical name for the code Hang, is wrong; it should be Hangul.
  • Hangul (Hang) is missing
  • Han (Hani) is missing
  • Han, the canonical name for the code Hani, is wrong; it should be Han.
  • Hebrew (Hebr) is missing
  • Hebrejski, the canonical name for the code Hebr, is wrong; it should be Hebrew.
  • Japanski, the canonical name for the code Jpan, is wrong; it should be Japanese.
  • Japanese (Jpan) is missing
  • Kannada, the canonical name for the code Knda, is wrong; it should be Kannada.
  • Kannada (Knda) is missing
  • Korean (Kore) is missing
  • Korejski, the canonical name for the code Kore, is wrong; it should be Korean.
  • Latinica (Latn) is missing
  • Latinski, the canonical name for the code Latn, is wrong; it should be Latinica.
  • Malajalam, the canonical name for the code Mlym, is wrong; it should be Malayalam.
  • Malayalam (Mlym) is missing
  • Feničanski (Phnx) is missing
  • Phoenician, the canonical name for the code Phnx, is wrong; it should be Feničanski.
  • Tamil (Taml) is missing
  • Tamilski, the canonical name for the code Taml, is wrong; it should be Tamil.
  • Telugu, the canonical name for the code Telu, is wrong; it should be Telugu.
  • Telugu (Telu) is missing
  • Tibetski, the canonical name for the code Tibt, is wrong; it should be Tibetan.
  • Tibetan (Tibt) is missing
  • Adlam, the canonical name for the code Adlm, is wrong; it should be Adlam.
  • Cyrillic, the canonical name for the code Cyrl, is wrong; it should be Ćirilica.
  • Devanagari, the canonical name for the code Deva, is wrong; it should be Devanagari.
  • Hiragana, the canonical name for the code Hira, is wrong; it should be Hiragana.
  • Katakana, the canonical name for the code Kana, is wrong; it should be Katakana.
  • Phoenician, the canonical name for the code Phnx, is wrong; it should be Feničanski.
  • Thai, the canonical name for the code Thai, is wrong; it should be Tajski.
  • Blissymbols tekst (Blis) is not used by any language and has no characters listed for auto-detection.
  • Cypro-Minoan tekst (Cpmn) is not used by any language.
  • Hiragana tekst (Hira) is not used by any language.
  • Kana tekst (Hrkt) is not used by any language.
  • Image-rendered tekst (Image) is not used by any language and has no characters listed for auto-detection.
  • International Phonetic Alphabet tekst (Ipach) is not used by any language and has no characters listed for auto-detection.
  • Moon tekst (Moon) is not used by any language and has no characters listed for auto-detection.
  • Morse code (Morse) is not used by any language and has no characters listed for auto-detection.
  • Musical notation tekst (Music) is not used by any language.
  • Unspecified tekst (None) is not used by any language and has no characters listed for auto-detection.
  • Rongorongo tekst (Roro) is not used by any language and has no characters listed for auto-detection.
  • Rumi numerals tekst (Rumin) is not used by any language.
  • flag semaphore (Semap) is not used by any language and has no characters listed for auto-detection.
  • Visible Speech tekst (Visp) is not used by any language and has no characters listed for auto-detection.
  • mathematical notation tekst (Zmth) is not used by any language.
  • symbol tekst (Zsym) is not used by any language.
  • undetermined tekst (Zyyy) is not used by any language and has no characters listed for auto-detection.
  • uncoded tekst (Zzzz) is not used by any language and has no characters listed for auto-detection.
  • The codes fa-Arab, ug-Arab, ks-Arab, ps-Arab, ur-Arab, tt-Arab, ota-Arab, ku-Arab, mzn-Arab and sd-Arab are currently alias codes. Only one code should be used in the data.
  • The codes ms-Arab and kk-Arab are currently alias codes. Only one code should be used in the data.
  • The data key sort_by_scraping for Japanese tekst (Jpan) is invalid.
  • Code: aa. Saw name: Afar. Expected name: Afarski.
  • Code: af. Saw name: Afrikaans. Expected name: Afrikanski.
  • Code: als. Saw name: Albanian. Expected name: Albanski.
  • Code: ams. Saw name: Southern Amami Ōshima. Expected name: Southern Amami-Oshima.
  • Code: ang. Saw name: Old English. Expected name: Stari Engleski.
  • Code: ar. Saw name: Arabic. Expected name: Arapski.
  • Code: arc. Saw name: Aramaic. Expected name: Aramejski.
  • Code: az. Saw name: Azerbaijani. Expected name: Azerbejdžanski.
  • Code: be. Saw name: Belarusian. Expected name: Beloruski.
  • Code: bg. Saw name: Bulgarian. Expected name: Bugarski.
  • Code: ca. Saw name: Catalan. Expected name: Katalonski.
  • Code: cmn. Saw name: Mandarin. Expected name: Mandarin.
  • Code: cmn-ear. Saw name: Mandarin. Expected name: Mandarin.
  • Code: co. Saw name: Corsican. Expected name: Korzički.
  • Code: cs. Saw name: Czech. Expected name: Češki.
  • Code: cy. Saw name: Welsh. Expected name: Velški.
  • Code: da. Saw name: Danish. Expected name: Danski.
  • Code: de. Saw name: German. Expected name: Nemački.
  • Code: dng. Saw name: Dungan. Expected name: Dungan.
  • Code: el. Saw name: Greek. Expected name: Grčki.
  • Code: en. Saw name: English. Expected name: Engleski.
  • Code: enm. Saw name: Middle English. Expected name: Srednji Engleski.
  • Code: eo. Saw name: Esperanto. Expected name: Esperanto.
  • Code: es. Saw name: Spanish. Expected name: Španski.
  • Code: eu. Saw name: Basque. Expected name: Baskijski.
  • Code: fi. Saw name: Finnish. Expected name: Finski.
  • Code: fr. Saw name: French. Expected name: Francuski.
  • Code: fr-CA. Saw name: French. Expected name: Francuski.
  • Code: frk. Saw name: Proto-West Germanic. Expected name: Pra-Zapadno Germanski.
  • Code: fro. Saw name: Old French. Expected name: Stari Francuski.
  • Code: fro-nor. Saw name: Old French. Expected name: Stari Francuski.
  • Code: ga. Saw name: Irish. Expected name: Irski.
  • Code: gem. Saw name: Germanic. Expected name: Germanski.
  • Code: gem-pro. Saw name: Proto-Germanic. Expected name: Pra-Germanski.
  • Code: gkm. Saw name: Ancient Greek. Expected name: Antički Grčki.
  • Code: gmw-pro. Saw name: Proto-West Germanic. Expected name: Pra-Zapadno Germanski.
  • Code: got. Saw name: Gothic. Expected name: Gotski.
  • Code: grc. Saw name: Ancient Greek. Expected name: Antički Grčki.
  • Code: gu. Saw name: Gujarati. Expected name: Gudžarati.
  • Code: haw. Saw name: Hawaiian. Expected name: Havajski.
  • Code: he. Saw name: Hebrew. Expected name: Hebrejski.
  • Code: hi. Saw name: Hindi. Expected name: Hindi.
  • Code: hu. Saw name: Hungarian. Expected name: Mađarski.
  • Code: hy. Saw name: Armenian. Expected name: Jermenski.
  • Code: io. Saw name: Ido. Expected name: Ido.
  • Code: it. Saw name: Italian. Expected name: Italijanski.
  • Code: itc-ola. Saw name: Latin. Expected name: Latinski.
  • Code: ja. Saw name: Japanese. Expected name: Japanski.
  • Code: ko. Saw name: Korean. Expected name: Korejski.
  • Code: la. Saw name: Latin. Expected name: Latinski.
  • Code: lad. Saw name: Ladino. Expected name: Ladino.
  • Code: mk. Saw name: Macedonian. Expected name: Makedonski.
  • Code: ml. Saw name: Malayalam. Expected name: Malajalam.
  • Code: mn. Saw name: Mongolian. Expected name: Mongolski.
  • Code: mr. Saw name: Marathi. Expected name: Marati.
  • Code: ms. Saw name: Malay. Expected name: Malajski.
  • Code: ms-cla. Saw name: Malay. Expected name: Malajski.
  • Code: ms-old. Saw name: Malay. Expected name: Malajski.
  • Code: mt. Saw name: Maltese. Expected name: Malteški.
  • Code: mul. Saw name: Translingual. Expected name: Međunarodni.
  • Code: ne. Saw name: Nepali. Expected name: Nepali.
  • Code: nl. Saw name: Dutch. Expected name: Holandski.
  • Code: no. Saw name: Norwegian. Expected name: Norveški.
  • Code: okn. Saw name: Okinoerabu. Expected name: Oki-No-Erabu.
  • Code: pi. Saw name: Pali. Expected name: Pali.
  • Code: pl. Saw name: Polish. Expected name: Poljski.
  • Code: pt. Saw name: Portuguese. Expected name: Portugalski.
  • Code: ro. Saw name: Romanian. Expected name: Rumunski.
  • Code: ru. Saw name: Russian. Expected name: Ruski.
  • Code: sa. Saw name: Sanskrit. Expected name: Sanskrt.
  • Code: sa-ved. Saw name: Sanskrit. Expected name: Sanskrt.
  • Code: sco. Saw name: Scots. Expected name: Škotski.
  • Code: sh. Saw name: Serbo-Croatian. Expected name: Srpskohrvatski.
  • Code: sk. Saw name: Slovak. Expected name: Slovački.
  • Code: sl. Saw name: Slovene. Expected name: Slovenski.
  • Code: sla. Saw name: Slavic. Expected name: Slovenski.
  • Code: sla-pro. Saw name: Proto-Slavic. Expected name: Pra-Slovenski.
  • Code: sq. Saw name: Albanian. Expected name: Albanski.
  • Code: sv. Saw name: Swedish. Expected name: Švedski.
  • Code: ta. Saw name: Tamil. Expected name: Tamil.
  • Code: th. Saw name: Thai. Expected name: Tajski.
  • Code: tl. Saw name: Tagalog. Expected name: Tagalog.
  • Code: tpi. Saw name: Tok Pisin. Expected name: Tok Pisin.
  • Code: tr. Saw name: Turkish. Expected name: Turski.
  • Code: uk. Saw name: Ukrainian. Expected name: Ukrajinski.
  • Code: vi. Saw name: Vietnamese. Expected name: Vijetnamski.
  • Code: xno. Saw name: Old French. Expected name: Stari Francuski.
  • Code: yi. Saw name: Yiddish. Expected name: Jidiš.
  • Code: yue. Saw name: Cantonese. Expected name: Kantonski.
  • Code: zh. Saw name: Chinese. Expected name: Kineski.

Checks performed

For multiple data modules:

  • Codes for languages, families and etymology-only languages must be unique and cannot clash with one another.
  • Canonical names for languages, families, and etymology-only languages must not be found in the list of other names.
  • Each name in the list of other names must appear only once.
  • otherNames, if present, must be an array.
  • Each Wikidata item ID must be a positive integer or a string starting with Q and ending with decimal digits.
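
The shared checks on Wikidata IDs and lists of other names can be illustrated with a minimal Python sketch. This is not the module's own (Lua) code: the function names `is_valid_wikidata_id` and `check_other_names` and the message wording are hypothetical, and the ID rule is read here as "Q followed only by decimal digits", which is an assumption about the intended format.

```python
import re


def is_valid_wikidata_id(value):
    """Accept a positive integer, or a string such as "Q1860" (assumed format)."""
    if isinstance(value, bool):  # bool is a subclass of int in Python; reject it
        return False
    if isinstance(value, int):
        return value > 0
    if isinstance(value, str):
        return re.fullmatch(r"Q[0-9]+", value) is not None
    return False


def check_other_names(canonical_name, other_names):
    """Return a list of problems found in a list of other names."""
    if not isinstance(other_names, list):
        return ["otherNames must be an array"]
    problems = []
    seen = set()
    for name in other_names:
        if name == canonical_name:
            problems.append(f"{name} is repeated in otherNames")
        if name in seen:
            problems.append(f"{name} appears more than once in otherNames")
        seen.add(name)
    return problems
```

For example, `check_other_names("Äiwoo", ["Aiwoo", "Äiwoo"])` reports the canonical name as repeated, which corresponds to the kind of discrepancy listed for nfl in the output above.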

The following must be true of the data used by Module:languages:

  • Each code must be defined in the correct submodule according to whether it is two-letter, three-letter or exceptional.
  • The canonical name (field 1) must be present and must not be the same as the canonical name of another language.
  • If field 2 is not nil, it must be a valid Wikidata item ID.
  • If field 3 or family is given and not nil, it must be a valid family code.
  • If field 4 or scripts is given and not nil, it must be an array, and each string in the array must be a valid script code.
  • If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code.
  • If family is given, it must be a valid family code.
  • If type is given, it must be one of the recognised values (regular, reconstructed, appendix-constructed).
  • If entry_name is given, it must be a table that contains either two arrays (from and to) or a string (remove_diacritics) or both.
  • If sort_key is given, it may either be a string, or a table that in turn contains either two arrays (from and to) or a string (remove_diacritics).
  • If entry_name or sort_key is given, the from array must be at least as long as the to array.
  • If standardChars is given, it must form a valid Lua string pattern when placed between square brackets with ^ before it ("[^...]"). (It should match all characters regularly used in the language, but that cannot be tested.)
  • If override_translit is set, translit must also be set, because there must be a transliteration module that can override manual transliteration.
  • If link_tr is present, it must be true.
  • There must be no data keys besides these: 1, 2, 3, "entry_name", "sort_key", "display", "otherNames", "aliases", "varieties", "type", "scripts", "ancestors", "wikimedia_codes", "wikipedia_article", "standardChars", "translit", "override_translit", "link_tr".
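
A few of the per-language checks above can be sketched in Python as follows. This is an illustration only, not the module's Lua implementation: `check_language` is a hypothetical name, a language record is modelled as a Python dict, the allowed-key set is copied from the list above, and only the invalid-key message imitates the wording seen in the output section (the other messages are invented for the sketch).

```python
ALLOWED_KEYS = {
    1, 2, 3, "entry_name", "sort_key", "display", "otherNames", "aliases",
    "varieties", "type", "scripts", "ancestors", "wikimedia_codes",
    "wikipedia_article", "standardChars", "translit", "override_translit",
    "link_tr",
}
ALLOWED_TYPES = {"regular", "reconstructed", "appendix-constructed"}


def check_language(code, data):
    """Return a list of problems for one language record (hypothetical helper)."""
    problems = []
    name = data.get(1) or "???"  # placeholder when the canonical name is absent
    for key in data:
        if key not in ALLOWED_KEYS:
            problems.append(f"The data key {key} for {name} ({code}) is invalid.")
    if 1 not in data:
        problems.append(f"{code} has no canonical name.")
    if "type" in data and data["type"] not in ALLOWED_TYPES:
        problems.append(f"{name} ({code}) has an unrecognised type.")
    for field in ("entry_name", "sort_key"):
        table = data.get(field)
        if isinstance(table, dict):
            src, dst = table.get("from", []), table.get("to", [])
            if len(src) < len(dst):  # from must be at least as long as to
                problems.append(
                    f"{name} ({code}): {field}.from is shorter than {field}.to."
                )
    if data.get("override_translit") and not data.get("translit"):
        problems.append(f"{name} ({code}) sets override_translit without translit.")
    if "link_tr" in data and data["link_tr"] is not True:
        problems.append(f"{name} ({code}) has link_tr set to something other than true.")
    return problems
```

Called as `check_language("th-new", {"preprocess_links": True})`, the sketch reports both the invalid key and the missing canonical name.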

Checks not performed:

  • If translit is present, it should be the name of a module, and this module should contain a tr function that takes a pagename (and optionally a language code and script code) as arguments.
  • If sort_key is a string, it should be the name of a module, and this module should contain a makeSortKey function that takes a pagename (and optionally a language code and script code) as arguments.
  • If entry_name or sort_key is a table and contains a field remove_diacritics, the value of the field should be a string that forms a valid Lua pattern when it is placed inside negated set notation ([^...]).

These are not checked here, because module errors will quickly crop up in entries if these conditions are not met, assuming that Module:utilities attempts to generate a sortkey for a category pertaining to the language in question, or full_link attempts to use the transliteration module.

Module:languages/code to canonical name and Module:languages/canonical names must contain all the codes and canonical names found in the data submodules of Module:languages, and no more.
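
The "all, and no more" requirement amounts to a two-way comparison between the data and the lookup tables. The Python sketch below shows one way such a comparison could work. It is a guess at the logic, not the module's code: `compare_lookup_tables` and its three dict arguments are hypothetical, and the messages are only loosely modelled on those in the output section.

```python
def compare_lookup_tables(data, code_to_name, name_to_code):
    """Compare code -> canonical name data against two lookup tables.

    data:         {code: canonical name} as found in the data submodules
    code_to_name: lookup table from code to canonical name
    name_to_code: lookup table from canonical name to code
    """
    problems = []
    for code, name in data.items():
        listed = code_to_name.get(code)
        if listed is None:
            problems.append(f"The code {code} is missing.")
        elif listed != name:
            problems.append(
                f"{listed}, the canonical name for the code {code}, "
                f"is wrong; it should be {name}."
            )
        if name not in name_to_code:
            problems.append(f"The canonical name {name} ({code}) is missing.")
    # Anything left in the lookup table but absent from the data is surplus.
    for code, name in code_to_name.items():
        if code not in data:
            problems.append(
                f"The code {code} and the canonical name {name} should be removed."
            )
    return problems
```

Under these assumptions, a lookup table that still holds a localized name while the data holds the English one (or the reverse) yields a "wrong" message and a "missing" message for the same code, which would explain why such pairs appear together in the output.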

The following must be true of the data used by Module:etymology languages:

  • canonicalName must be given.
  • parent must be given and must be a valid language, family or etymology-only language code.
  • If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code. The etymology language should also be listed as the ancestor of a regular language.
  • There must be no data keys besides these: "canonicalName", "otherNames", "parent", "ancestors", "wikipedia_article", "wikidata_item".

Codes in Module:families data must:

  • Have canonicalName, which must not be the same as the canonical name of another family.
  • Have a valid family code as family, if family is given.
  • Have at least one language or subfamily belonging to it.
  • Have no data keys besides these: "canonicalName", "otherNames", "family", "protoLanguage", "wikidata_item".
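
The requirement that every family has at least one member can be checked by collecting every family code that something points to. A minimal Python sketch, assuming records are dicts, that a language names its family in field 3 or in family, and that a family listing itself as its parent does not count as its own subfamily (that last point is an assumption, as is the name `find_empty_families`):

```python
def find_empty_families(families, languages):
    """Return the family codes that no language or subfamily belongs to."""
    used = set()
    for data in languages.values():
        family = data.get(3) or data.get("family")
        if family:
            used.add(family)
    for code, data in families.items():
        parent = data.get("family")
        if parent and parent != code:  # assumption: self-reference is ignored
            used.add(parent)
    return sorted(code for code in families if code not in used)
```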

Codes in Module:scripts data must:

  • Have canonicalName.
  • Have at least one language that lists it as one of its scripts.
  • Have a characters pattern for script autodetection, and this must form a valid Lua string pattern when placed between square brackets ("[...]"). (It should match all characters in the script, but that cannot be tested.)
  • Have no data keys besides these: "canonicalName", "otherNames", "parent", "systems", "wikipedia_article", "characters", "direction".
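
The usage and autodetection checks for scripts can be sketched in Python as below. This is an illustration under assumptions, not the module's code: `check_scripts` is a hypothetical name, a language is assumed to list its scripts in scripts or in field 4, and the message wording only approximates the script messages in the output section. Whether characters forms a valid Lua pattern is not tested here, since that needs a Lua pattern engine.

```python
def check_scripts(scripts, languages):
    """Report scripts that are unused or lack autodetection characters."""
    used = set()
    for data in languages.values():
        used.update(data.get("scripts") or data.get(4) or [])
    problems = []
    for code, data in scripts.items():
        name = data.get("canonicalName", "???")
        complaints = []
        if code not in used:
            complaints.append("is not used by any language")
        if not data.get("characters"):
            complaints.append("has no characters listed for auto-detection")
        if complaints:
            problems.append(f"{name} ({code}) " + " and ".join(complaints) + ".")
    return problems
```

A script that is both unused and lacks a characters field produces a single combined message, matching the shape of entries such as the one for Blis above.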