Mul language

From Wikimedia Belgium

For a substantial number of Wikidata items, the labels, aliases, and/or descriptions are identical across many languages.

This affects Wikidata storage and the runtime performance of Wikidata Query, ListeriaBot, Pywikibot, and other Wikimedia tools.

Technical impact

As a consequence, Wikidata has storage issues and Wikidata Query has performance problems: redundant labels, aliases, and descriptions take up storage, memory, and processing time. In addition, the item revision history grows much larger because of the unnecessary duplication of labels.

For this reason, a new feature was introduced in 2024: mul labels and mul aliases. When an item has no label in a specific language, the mul label is shown instead, without any notice. This places less emphasis on the English language. In principle, every item could therefore have a mul label. A description should still be provided for each language, except when it would be the same for all instances of a type (e.g. first name, last name, category).
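
The fallback described above can be sketched in a few lines of Python. This is only an illustration of the lookup order, not the actual Wikibase implementation; the function name `resolve_label`, the default fallback chain, and the sample label are hypothetical.

```python
from typing import Mapping, Optional, Sequence


def resolve_label(
    labels: Mapping[str, str],
    language: str,
    fallbacks: Sequence[str] = ("mul", "en"),
) -> Optional[str]:
    """Return the label for `language`, else the first fallback that exists."""
    for code in (language, *fallbacks):
        value = labels.get(code)
        if value:  # skip missing or empty labels
            return value
    return None


# Hypothetical item that stores only a mul label.
labels = {"mul": "Jan Janssens"}
print(resolve_label(labels, "nl"))
```

A tool that reads labels directly (for example from an item's JSON) would need a lookup of this kind, because the mul label is not copied into the individual languages.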

No description is ever stored for the mul language itself, because mul is not a real language.
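
Because there is no mul description to fall back on, a tool that wants to display something for items without a stored description could derive it from the label of the item's type. The sketch below assumes the labels of the type have already been loaded into a dictionary; the name `resolve_description` and the sample data are hypothetical.

```python
from typing import Mapping, Optional


def resolve_description(
    descriptions: Mapping[str, str],
    type_labels: Mapping[str, str],
    language: str,
) -> Optional[str]:
    """Prefer a stored description; otherwise reuse the label of the type."""
    stored = descriptions.get(language)
    if stored:
        return stored
    # Descriptions are never stored under "mul", so only the type's labels
    # are consulted here (language first, then the type's own mul label).
    return type_labels.get(language) or type_labels.get("mul")


# Hypothetical family-name item without any stored description.
type_labels = {"en": "family name", "nl": "familienaam"}
print(resolve_description({}, type_labels, "nl"))
```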

Examples

  • Names of persons, first names, last names → store labels only for the mul language (no description); the same applies to aliases
  • Descriptions of instances of first names, last names, categories → should not be stored; retrieve them at runtime from the label of the instance's type in the relevant language
  • Scholarly articles and ISBN editions → register the title only in the mul label, because a publication normally has a single language; register the actual language of the publication with "language of work or name" (P407)
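
To illustrate the first example, the following sketch shows how identical per-language labels could be collapsed into a single mul label. It is an assumption-laden illustration, not a Wikidata tool or policy: the function name `collapse_to_mul`, the `min_languages` threshold, and the sample labels are all hypothetical.

```python
from collections import Counter
from typing import Dict, Mapping


def collapse_to_mul(labels: Mapping[str, str], min_languages: int = 2) -> Dict[str, str]:
    """Drop labels that duplicate the mul label, choosing one if none exists."""
    per_language = {code: text for code, text in labels.items() if code != "mul"}
    if not per_language:
        return dict(labels)
    mul = labels.get("mul")
    if mul is None:
        # No mul label yet: propose the most widely shared label.
        candidate, count = Counter(per_language.values()).most_common(1)[0]
        if count < min_languages:
            return dict(labels)  # nothing is shared widely enough
        mul = candidate
    result = {"mul": mul}
    # Keep only the labels that genuinely differ from the mul label.
    result.update({code: text for code, text in per_language.items() if text != mul})
    return result


# Hypothetical family-name item with one transliterated label.
labels = {"en": "Janssens", "nl": "Janssens", "fr": "Janssens", "ru": "Янссенс"}
print(collapse_to_mul(labels))
```

Labels in another script, such as the Cyrillic one in the sample, are kept because they differ from the mul label.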

Impacted functions

Wikidata query

# Show the label in the user's language; fall back to the mul label, then to English.
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P6104 wd:Q134895452.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],mul,en". }
}
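
Scripts that generate such queries need to place mul in the language chain themselves. The sketch below builds the same query text in Python without contacting any endpoint; the helper names `label_language_chain` and `build_label_query` are hypothetical, and English as the final fallback is an assumption taken from the query above.

```python
from typing import Sequence


def label_language_chain(
    preferred: Sequence[str] = ("[AUTO_LANGUAGE]",),
    final_fallback: str = "en",
) -> str:
    """Build the wikibase:language value with "mul" before the final fallback."""
    chain = [code for code in preferred if code not in ("mul", final_fallback)]
    chain += ["mul", final_fallback]
    return ",".join(chain)


def build_label_query(property_id: str, value_qid: str, languages: str) -> str:
    """Return a SPARQL query selecting items and their labels."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:{property_id} wd:{value_qid}.\n"
        "  SERVICE wikibase:label { bd:serviceParam wikibase:language "
        f'"{languages}". }}\n'
        "}"
    )


print(build_label_query("P6104", "Q134895452", label_language_chain()))
```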

Wikidata project documentation

The same redundancy problem applies to inverse and derived statements.