Wikipedia:WikiProject Tabular Data
This WikiProject is defunct. Consider looking for related projects for help or ask at the Teahouse. If you feel this project may be worth reviving, please discuss with related projects first. Feel free to change this tag if the parameters were changed in error.
Welcome to WikiProject Tabular Data. This WikiProject is still being set up and needs further discussion. Please help and get involved.
Wikipedia thrives on many well-written articles that explain how things are connected and are enjoyable to read.
Good articles also depend on facts that are used in several articles or that change continually. Harmonizing these facts across several articles and keeping them up to date is time-consuming and labour-intensive. This project tries to solve this problem with software support in MediaWiki, replacing the bot runs that are otherwise constantly required.
Organization
The structuring, coordination, and maintenance of the metadata templates take place at Wikipedia:WikiProject metadata/data organization.
Basics
Data such as the population of a municipality or the gross domestic product of a country are often used in several articles, both in running text and in infoboxes or tables. Such data are characterized by several attributes:
- Which (key): which object is meant (e.g. Berlin)
- What (relation): what kind of data is it (e.g. population number)
- How much (value): what value does the datum have (e.g. 3420786)
- When (date): when was this value determined (e.g. March 31, 2008)
- From where (source): where does the datum come from (e.g. Amt für Statistik Berlin-Brandenburg)
The first three attributes describe the datum itself. In the Resource Description Framework (RDF) they are called subject, predicate, and object. The last two attributes are metadata in the strict sense, that is, data about data.
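As an illustration of the five attributes, the Berlin example can be sketched as a small record type. This is only a sketch in Python; the class name `Datum` and its field names are assumptions made for this example and are not part of any MediaWiki template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Datum:
    """One datum with its metadata (hypothetical structure for illustration)."""
    key: str        # which object is meant (subject)
    relation: str   # what kind of data it is (predicate)
    value: int      # the value of the datum (object)
    as_of: date     # when the value was determined (metadata)
    source: str     # where the datum comes from (metadata)

    def triple(self):
        """Return the subject, predicate, object part without the metadata."""
        return (self.key, self.relation, self.value)


berlin = Datum(
    key="Berlin",
    relation="population number",
    value=3420786,
    as_of=date(2008, 3, 31),
    source="Amt für Statistik Berlin-Brandenburg",
)
print(berlin.triple())
```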
Templates make it easy to return a value based on an input parameter. For example, the template “population number” (predicate) could return the value “3420786” (object) when given the parameter “Berlin” (subject). The parser function #switch offers a convenient way to implement this in template code.
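The lookup behaviour described here can be modelled outside MediaWiki as a simple key-to-value mapping. The following Python sketch only imitates what a switch-based template would do; the names `POPULATION_NUMBER` and `population_number` are hypothetical, and the table holds just the one value mentioned above.

```python
# Model of a switch list: each case maps a key to a stored value.
POPULATION_NUMBER = {
    "Berlin": 3420786,
}


def population_number(key, default=None):
    """Return the stored value for `key`, or `default` if no case matches."""
    return POPULATION_NUMBER.get(key, default)


print(population_number("Berlin"))
```

An unknown key falls through to the default, in the same way that a switch list can define a fallback case.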
Data types
The data come in different data types. For practical reasons they should be stored in a machine-readable form and converted into the display form only at output. The machine-readable form (e.g. numbers without thousands separators and with a point as the decimal separator) makes it possible, for example, to calculate the population density as the quotient “population number/area”.
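A minimal Python sketch of this idea: the numbers are stored raw, the calculation uses the raw values, and formatting happens only at output. The function names are hypothetical, and the area value is an illustrative assumption rather than a figure taken from this page.

```python
def population_density(population, area_km2):
    """Inhabitants per square kilometre, computed from raw numbers."""
    return population / area_km2


def format_for_output(value, decimals=0):
    """Add thousands separators only when the value is displayed."""
    return f"{value:,.{decimals}f}"


population = 3420786   # stored raw, as in the Berlin example
area_km2 = 891.85      # illustrative area value (an assumption)

density = population_density(population, area_km2)
print(format_for_output(population))   # 3,420,786
print(format_for_output(density, 1))
```

If the stored value already contained separators, such as "3,420,786", the division would first require parsing the display form back into a number.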
Nomenclature of the templates
Templates of the type “metadata” are named according to the scheme:
Template:Metadata basis separation
For example: Template:Metadata population number DE-NI
The template is called as follows:
{{Metadata Yyz|Key|Accessory}}
For example: {{Metadata population number DE-NI|12345}}
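The naming scheme and the call syntax can be sketched as two small string-building helpers. This Python sketch is only an illustration; the function names are hypothetical, and the optional third name part is assumed to be the separation of the dataset (here DE-NI).

```python
def template_name(basis, separation=None):
    """Build the template title from the basis and an optional separation."""
    parts = ["Metadata", basis]
    if separation:
        parts.append(separation)
    return "Template:" + " ".join(parts)


def template_call(basis, key, separation=None, accessory=None):
    """Build the wikitext call for a key, with an optional extra parameter."""
    name = template_name(basis, separation)[len("Template:"):]
    args = [name, key]
    if accessory:
        args.append(accessory)
    return "{{" + "|".join(args) + "}}"


print(template_name("population number", "DE-NI"))
print(template_call("population number", "12345", separation="DE-NI"))
```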
Basis
Data of the same basis are described by an umbrella term that is as universal as possible. Examples:
population number
covers the population data of a political subdivision.
head
covers the data on the head of a political subdivision, which may correspond to the mayor of a municipality.
GDP
covers the data on the gross domestic product of a country.
Separation of the dataset
The optional separation of the dataset follows these criteria:
- Form in which the data source supplies the data,
- Size of the data set (for example, there should be no more than 2000 entries in one switch list),
- Logical coherence of the data group.
Examples:
- The data templates for the population numbers of German municipalities, associations of municipalities, counties and administrative regions may be arranged by federal state (Bundesland).
- The data templates for the Human Development Index may be arranged by country.
The separations are likewise named according to criteria that are as universal as possible. For example, data about administrative units are separated according to ISO 3166.
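The separation rule, including the size limit of 2000 entries per switch list, could be checked with a sketch like the following. It is written in Python for illustration only; the function names, the record layout (separation, key, value) and the sample keys and values are placeholders, not real data.

```python
from collections import defaultdict

MAX_ENTRIES = 2000  # size limit per switch list mentioned in the criteria


def split_dataset(records):
    """Group (separation, key, value) records into one mapping per separation."""
    groups = defaultdict(dict)
    for separation, key, value in records:
        groups[separation][key] = value
    return dict(groups)


def oversized(groups, limit=MAX_ENTRIES):
    """Return the separations whose switch list would exceed the limit."""
    return [name for name, entries in groups.items() if len(entries) > limit]


# Placeholder records: the keys and values are made up for this example.
records = [
    ("DE-NI", "key-1", 100),
    ("DE-NI", "key-2", 200),
    ("DE-BE", "key-3", 300),
]
groups = split_dataset(records)
print(sorted(groups))
print(oversized(groups))
```

A separation reported by `oversized` would be a candidate for a further split.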
Key
Data are assigned using keys that are universal, unambiguous, and independent of Wikipedia-internal conventions. For example, the Community Identification Number should be used for data about municipalities, and the ISO 3166 code for data about countries. This prevents data links from being lost when an article title (lemma) changes, and allows the data sets to be used independently of the naming conventions of the different language editions of Wikipedia.
Interested editors
- Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits
- -- Quiddity (talk)