Wikidata data validation
Wikidata data must be validated.
Even so, it is only eventually complete, which is a theoretical status because it will never be attained.
Characteristics
- Items and properties have unique and immutable keys
- Triplestore database with statements consisting of subject-property-value triples (subject-predicate-object; comparable to entity relationships in a relational database)
- Easily updateable (simple concept; user driven data model)
- Never 100% complete nor correct (eventually consistent)
- Multilingual:
- Labels, Descriptions, and Aliases are language sensitive
- Properties are described like items (Property namespace)
- New properties can be defined (by developers, upon user request, and approval)
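The characteristics above can be illustrated with a minimal sketch of the data model. The `Statement` and `Item` classes are hypothetical names chosen for this illustration and are not part of any Wikidata library; the identifiers Q42 (Douglas Adams), P31 (instance of), and Q5 (human) are real Wikidata keys.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One subject-property-value triple (hypothetical helper class)."""
    subject: str  # item key, e.g. "Q42"
    prop: str     # property key, e.g. "P31"
    value: str    # another item key or a literal value


@dataclass
class Item:
    """Simplified item: an immutable key plus language-sensitive labels."""
    key: str
    labels: dict = field(default_factory=dict)  # language code -> label


# Douglas Adams (Q42) is an instance of (P31) human (Q5).
stmt = Statement(subject="Q42", prop="P31", value="Q5")

# Labels are stored per language; the key itself never changes.
adams = Item(key="Q42", labels={"en": "Douglas Adams", "nl": "Douglas Adams"})
```

Making `Statement` frozen mirrors the idea that a triple is a fixed fact about immutable keys, while labels can be added or changed per language without touching the key.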
Tips
- Items must be unique
- Unique Label - Description combination (homonym distinction)
- Description must be different from Label
- Label cannot be repeated in Alias
- Items must have either an instance of (P31) or a subclass of (P279) statement
- Notability:
- Item could be created because a Wikipedia page exists
- The subject is notable (well known)
- Item is required to describe another item
- Labels in lowercase (nouns) or initial capital (proper names); some languages are exceptions, e.g. German, which capitalizes all nouns
- Add sources (references) to statements (more information, proof of validity)
- Use maintained by WikiProject (P6104) to list related items
- Can be used by ListeriaBot to build Wikipedia edit-a-thon and project tables/lists
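Some of the tips above can be checked mechanically. The following is a minimal sketch, assuming a simplified item dictionary (`labels`, `descriptions`, `aliases`, `claims`) that is flatter than the JSON Wikidata actually returns; the function name `check_item` is hypothetical.

```python
def check_item(item):
    """Return a list of problems found in a simplified item dict.

    Assumed shape (a simplification, not the real Wikidata JSON):
      labels:       {lang: str}
      descriptions: {lang: str}
      aliases:      {lang: [str, ...]}
      claims:       {property key: [values, ...]}
    """
    problems = []
    labels = item.get("labels", {})
    descriptions = item.get("descriptions", {})
    aliases = item.get("aliases", {})

    for lang, label in labels.items():
        # Description must be different from Label.
        if descriptions.get(lang) == label:
            problems.append(f"{lang}: description equals label")
        # Label cannot be repeated in Alias.
        if label in aliases.get(lang, []):
            problems.append(f"{lang}: label repeated in aliases")

    # Items need an instance of (P31) or subclass of (P279) statement.
    claims = item.get("claims", {})
    if not (claims.get("P31") or claims.get("P279")):
        problems.append("no P31 or P279 statement")

    return problems


bad_item = {
    "labels": {"en": "Mercury"},
    "descriptions": {"en": "Mercury"},
    "aliases": {"en": ["Mercury"]},
    "claims": {},
}
for problem in check_item(bad_item):
    print(problem)
```

An item with a distinct description, no repeated alias, and at least one P31 or P279 value yields an empty list.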
Techniques
- Constraints:
- additional classes might be added
- reciprocal property
- inverse property
- Missing language labels
- Homonym detection and different from (P1889) registration
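Two of the techniques above, finding missing language labels and detecting homonyms, can be sketched on in-memory data. This assumes the same simplified item dictionaries as a stand-in for real Wikidata output; the function names and the item keys `Qa`, `Qb`, `Qc` are placeholders, not real identifiers.

```python
from collections import defaultdict


def missing_labels(item, wanted_langs):
    """Return the language codes from wanted_langs that have no label."""
    labels = item.get("labels", {})
    return [lang for lang in wanted_langs if lang not in labels]


def find_homonyms(items, lang):
    """Group item keys that share the same label in one language.

    Comparison is case-insensitive. Each returned group is a candidate
    for either a merge (true duplicate) or a different from (P1889)
    statement (distinct subjects with the same name).
    """
    by_label = defaultdict(list)
    for key, item in items.items():
        label = item.get("labels", {}).get(lang)
        if label:
            by_label[label.casefold()].append(key)
    return {label: keys for label, keys in by_label.items() if len(keys) > 1}


# Placeholder keys for illustration only.
items = {
    "Qa": {"labels": {"en": "Mercury", "nl": "Mercurius"}},
    "Qb": {"labels": {"en": "mercury"}},
    "Qc": {"labels": {"en": "Venus"}},
}

print(missing_labels(items["Qb"], ["en", "nl", "de"]))
print(find_homonyms(items, "en"))
```

The grouping only proposes candidates; deciding between a merge and a P1889 statement still needs human review of the descriptions.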
Tools
Known problems
- Constraints are not proactively enforced
- Duplicates, data quality problems