Commons:Structured data/Archive/2014/MoreFAQ



This page is a work in progress page, not an article or policy, and may be incomplete and/or unreliable.
Please offer suggestions on the talk page.

FAQ

Who will develop this project?

This project will be a collaboration between developers and the communities of Wikimedia Commons, Wikidata, Wikipedia and sister projects. Development will be done by the WMF’s Multimedia team, the Wikidata team, and community developers.

How will the community be involved?

The Wikimedia Commons and Wikidata communities will make decisions about the new data structure. Community help will also be needed to migrate data from the old structure to the new one. Some of this can be automated and some will be manual work, but we've done it before (remember when we didn't have an {{Information}} template?). A good place to start is making sure all files have machine-readable data.

How long will this take?

This is a long-term project and commitment from the WMF, Wikidata, and the community members helping to make this happen. First results will be visible in 2015, but completion will take several years.

What will be developed first?

As a first step, the engineering team aims to create mock interfaces that behave like Wikidata interfaces but use wikitext to read and write data. The storage interface would involve basic concepts like "file", "work", "contributor", "license", etc. For UploadWizard (where we only write information, in the form of information/license templates on a newly created page), we are considering hiding the current code that generates template text behind an API similar to wbeditentity/wbcreateclaim. After an initial period of experimentation and testing, the engineering team would build a high-end API to support a range of features and allow easy access by tool authors.
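The idea of a claim-style interface that writes wikitext behind the scenes can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `MockMediaInfo` and its methods `create_claim` and `to_wikitext` are hypothetical names invented here, not part of MediaWiki, Wikibase or UploadWizard, and the property names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class MockMediaInfo:
    """Hypothetical stand-in for the structured record of one file."""

    claims: dict = field(default_factory=dict)

    def create_claim(self, prop: str, value: str) -> None:
        # Loosely modelled on the idea behind wbcreateclaim:
        # one property, one value. Not the real API.
        self.claims[prop] = value

    def to_wikitext(self) -> str:
        # Callers never build template text themselves; the
        # serialization to an {{Information}} call is hidden here.
        lines = ["{{Information"]
        for prop, value in self.claims.items():
            lines.append(f"|{prop}={value}")
        lines.append("}}")
        return "\n".join(lines)


info = MockMediaInfo()
info.create_claim("description", "Sunset over a lake")
info.create_claim("author", "Example User")
wikitext = info.to_wikitext()
print(wikitext)
```

Because callers only see `create_claim`, the wikitext backend in this sketch could later be swapped for real structured storage without changing the calling code.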

Where will the structured data be stored?

The current proposal is to store structured data in a dedicated section on the file description page, but the final decision will be made once the engineers have a clearer view of the implementation. From the user's perspective, the interface should be consistent enough that it doesn't matter whether the structured data is on the file page or on an associated info page (on Commons).

How will people edit the structured data?

We would adapt the current Wikidata editing tools to make it easy to enter and edit structured data stored as "media info", and to migrate it from the unstructured data now on the file description pages. In the next development period, we would build special-purpose editing interfaces for media info, after proposed designs have been reviewed, prototyped and tested extensively by community members. New design mockups will be proposed in the coming weeks to help visualize how that might work. An updated version of UploadWizard is likely to be used for entering the structured data when a file is uploaded. Other editing methods may be added later on.

Will we still use templates?

Templates will still be used to format the structured data. For example, the {{Information}} template would be changed to pull information like the contributor or description from the structured data. Template parameters would only be used in rare cases to override the values from the structured data. Ideally, in the end the wikitext of file description pages might only contain an {{Information}} call with no parameters, but that wouldn't happen until all the information is accessible in the structured data.
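The override rule described above (structured data as the default, template parameters winning only when they are set) could look roughly like this. The function `resolve_fields` and the field names are hypothetical and chosen purely for illustration; the actual template logic has not been designed yet.

```python
def resolve_fields(structured: dict, template_params: dict) -> dict:
    """Merge structured data with explicit template parameters.

    Structured values are the default; a template parameter replaces
    one only if it is non-empty. Hypothetical helper, not an existing API.
    """
    resolved = dict(structured)
    for name, value in template_params.items():
        if value.strip():  # empty parameters do not override anything
            resolved[name] = value
    return resolved


structured = {"author": "Example User", "description": "Sunset over a lake"}

# An {{Information}} call with no parameters: everything comes from structured data.
no_override = resolve_fields(structured, {})

# A rare manual override of a single field; the empty parameter is ignored.
with_override = resolve_fields(structured, {"description": "Sunset, cropped", "author": ""})
print(with_override)
```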

Will we still use categories?

Categories will be fully supported by the structured data system, and might even be provided in a variety of languages, not just English. Many have suggested that categories be complemented by a more granular set of "topics", which can be linked to corresponding items on Wikidata, as well as intersected to improve search.
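As a rough sketch of what intersecting topics for search could mean, assume each topic maps to the set of files tagged with it. The index contents, the file names and the function `files_matching_all` are made up for this example and do not describe any existing search feature.

```python
def files_matching_all(topics: list, index: dict) -> set:
    """Return the files tagged with every one of the given topics."""
    file_sets = [index.get(topic, set()) for topic in topics]
    if not file_sets:
        return set()
    # Intersecting the per-topic sets keeps only files that carry all topics.
    return set.intersection(*file_sets)


# Hypothetical topic index: topic label -> files tagged with that topic.
topic_index = {
    "cat": {"Cat_on_sofa.jpg", "Cat_in_snow.jpg"},
    "snow": {"Cat_in_snow.jpg", "Snowy_street.jpg"},
}

result = files_matching_all(["cat", "snow"], topic_index)
print(result)
```

With one fine-grained topic per concept, a query such as "cat" and "snow" would not need a dedicated combined category to exist in advance.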

What data structure will be used?

The data structure will be developed by the multimedia and Wikidata teams, in collaboration with community members. Some of the data currently used on Commons will probably be stored in structured format (on Commons), while the rest will be stored by MediaWiki (on Commons). Since many users and contributors seem confused by the many different kinds of data and templates now in use on Wikimedia sites, it might be worth streamlining and consolidating these options to eliminate redundancies, or at least to display them consistently.

How will this affect Wikimedia projects outside of Commons?

Initially this is being developed for Commons. However, it is being designed with other Wikimedia projects in mind, so that the same technology could later be used to provide structured data there.