Commons:Structured data/Development

From Wikimedia Commons, the free media repository


This page is a work in progress page, not an article or policy, and may be incomplete and/or unreliable.
Please offer suggestions on the talk page.


Development steps are listed per component of the Structured Commons project and software, and are shown in reverse chronological order (most recent on top).

General: Timeline and roadmap

Roadmap for development on Structured Data on Wikimedia Commons in 2017-2019. Version of October 31, 2017.

A timeline for development on Structured Commons can be found in this roadmap document (version October 31, 2017), which will be updated as development plans are updated.

The roadmap is the best estimate of when things might happen based on the information we have now. As we get more information, the estimates will change.

As a general rule, the timeline should be fairly accurate for things happening in the next 3-6 months, and much less accurate for things more than 6 months in the future.

The team is working on updating this document with user-facing milestones, such as the expected date on which the first feature will be deployed.

Current and future development

This section contains a high-level overview of some of the features being developed. For more detailed information, including project reports, quarterly goals, and technical requirements, please visit the team page on mediawiki.org.

Technology

MediaInfo extension

MediaInfo is a new entity type for Wikibase that can handle structured metadata for multimedia files.

The extension hooks into a file description page and adds a link to a MediaInfo page storing supplemental metadata about the file. This may, for example, include the author, detailed license information, and the concepts that a picture actually depicts.

Further information: Extension:WikibaseMediaInfo

  • October-December 2017: The multimedia team at the Wikimedia Foundation is gaining expertise in Wikibase, and unblocking further development for Structured Commons, by completing the MediaInfo extension for Wikibase.
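
As a rough illustration of the kind of supplemental metadata described above (author, license, depicted concepts), the following minimal sketch models a MediaInfo-like entity in plain Python. It is not the extension's actual data model or API: the class names, the `M1` identifier, and all property and item IDs (`P_AUTHOR`, `Q_CAT`, and so on) are hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One property/value claim about a file (hypothetical shape)."""
    property_id: str  # placeholder property ID, not a real Wikidata ID
    value: str        # placeholder item ID or a plain string


@dataclass
class MediaInfoEntity:
    """Toy stand-in for the structured metadata attached to one file page."""
    entity_id: str
    file_title: str
    statements: List[Statement] = field(default_factory=list)

    def add(self, property_id: str, value: str) -> None:
        """Attach one more statement to this entity."""
        self.statements.append(Statement(property_id, value))

    def values_for(self, property_id: str) -> List[str]:
        """Return all values recorded for the given property, in insertion order."""
        return [s.value for s in self.statements if s.property_id == property_id]


# All identifiers below are illustrative placeholders.
info = MediaInfoEntity(entity_id="M1", file_title="File:Example.jpg")
info.add("P_AUTHOR", "Jane Doe")
info.add("P_LICENSE", "Q_CC_BY_SA")
info.add("P_DEPICTS", "Q_CAT")
info.add("P_DEPICTS", "Q_SOFA")

print(info.values_for("P_DEPICTS"))  # ['Q_CAT', 'Q_SOFA']
```

Keeping each fact as a separate property/value statement, rather than as free text in a template, is what would make questions such as "which files depict a cat?" answerable by a simple filter.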

Federation

In a technical sense, a federated database system is a management system in which multiple autonomous databases work together as a single, so-called federated, database (see https://en.wikipedia.org/wiki/Federated_database_system). Wikibase Federation is implemented for Structured Data on Wikimedia Commons: it makes it possible to use entities (Items and Properties) defined on one Wikibase repository (here, Wikidata) on another Wikibase repository (here, Wikimedia Commons).
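
The idea of one repository using entities defined on another can be sketched as a lookup that falls back to a foreign repository. This is a deliberately simplified toy model, not how Wikibase Federation is actually implemented; the `Repository` class, the `describe` helper, and every entity ID shown are assumptions made for illustration.

```python
class Repository:
    """Toy Wikibase-like repository: a name plus a dict of entities."""

    def __init__(self, name, entities=None, foreign=None):
        self.name = name
        self.entities = dict(entities or {})
        self.foreign = foreign  # optional repository consulted on a local miss

    def lookup(self, entity_id):
        """Return (repository name, entity), checking the local repository first."""
        if entity_id in self.entities:
            return self.name, self.entities[entity_id]
        if self.foreign is not None:
            return self.foreign.lookup(entity_id)
        raise KeyError(entity_id)


# Placeholder IDs: items and properties live on "wikidata",
# the media entity lives on "commons".
wikidata = Repository(
    "wikidata",
    {"Q_CAT": {"label": "cat"}, "P_DEPICTS": {"label": "depicts"}},
)
commons = Repository(
    "commons",
    {"M1": {"statements": [("P_DEPICTS", "Q_CAT")]}},
    foreign=wikidata,
)


def describe(repo, media_id):
    """Render a media entity's statements using labels from whichever repo holds them."""
    _, media = repo.lookup(media_id)
    lines = []
    for prop_id, item_id in media["statements"]:
        _, prop = repo.lookup(prop_id)
        _, item = repo.lookup(item_id)
        lines.append(f"{prop['label']}: {item['label']}")
    return lines


print(describe(commons, "M1"))  # ['depicts: cat']
```

In this sketch the Commons-side repository never stores a copy of the item or the property; it only stores references and resolves them on the other repository when needed.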

Multi-Content Revisions

So-called multi-content revisions form an important building block for structured data on Wikimedia Commons (and on other Wikimedia projects). Multi-content revisions are groundwork that makes information in MediaWiki wikis technically more straightforward to organize. It will become possible to split the current wikitext pages into separate documents (slots) with different functionality (such as infoboxes, categories, template documentation); these slots can then be integrated into one page, sharing page-level functionality and a single history. Specifically for structured data on Wikimedia Commons, multi-content revisions make it possible to store a structured data entity (an item, a property, a MediaInfo entity) and wikitext in the same page. Structured Commons is a major use case for multi-content revisions.
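
The slot idea can be pictured with a small toy model in which every revision of a page carries several named slots and all slots share one history. This is only a sketch under assumptions, not MediaWiki's storage layer: the `Page` and `Revision` classes and the slot names used here are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Revision:
    """One entry in a page history, holding the content of every slot."""
    rev_id: int
    slots: Dict[str, str]  # slot name -> content


class Page:
    """Toy page whose revisions each contain several slots."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.history: List[Revision] = []

    def save(self, **changed_slots: str) -> Revision:
        """Create a new revision; slots that were not changed are carried over."""
        previous = self.history[-1].slots if self.history else {}
        slots = {**previous, **changed_slots}
        revision = Revision(rev_id=len(self.history) + 1, slots=slots)
        self.history.append(revision)
        return revision


page = Page("File:Example.jpg")
page.save(main="== Summary ==\nA cat on a sofa.")       # wikitext only
page.save(mediainfo='{"depicts": ["Q_CAT"]}')           # adds structured data
latest = page.history[-1]

print(sorted(latest.slots))  # ['main', 'mediainfo']
print(len(page.history))     # 2
```

Editing only the structured data still produces an ordinary new revision of the same page, which is the sense in which the slots share one history.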

First feature to release: multilingual captions

  • July-September (?) 2018: The first feature of Structured Commons will be released: multilingual, translatable file captions.

Metrics

How will we measure the effectiveness of new functionalities on Wikimedia Commons? In order to be able to do this, we need to establish relevant criteria that can be measured and a (2017) baseline against which we can compare in the future.

  • October-December 2017: metrics and a metrics baseline for Commons are defined.


Research

Research of Commons use by community members

  • Upcoming: interviews with Commons contributors (phab:T175185)
  • December 2016: Qualitative design research of heavy Commons users by Jan Dittrich (WMDE).

Research of Commons use and needs by GLAMs

  • October 2017: GLAM users invited to take survey. Jonathan Morgan adds an announcement to the main Structured Data on Commons page, asking GLAM users to take a 15-minute survey on how they upload images to Commons. See https://commons.wikimedia.org/wiki/Commons:Structured_data and https://wikimedia.qualtrics.com/jfe/form/SV_7WDA2RZvPDuaV7f.
  • July 5, 2017: Design interviews begin with GLAM institutions. Senior Design Researcher Jonathan Morgan begins researching GLAM institutions’ batch upload workflows and conducting interviews with Commons contributors at museums, universities, and other institutions. The interviews will take place over the next four months. See https://meta.wikimedia.org/wiki/Research:Supporting_Commons_contribution_by_GLAM_institutions

Past development

Earlier research and development that took place in 2014 is documented at Commons:Structured data/Archive/2014/Development.

Metadata cleanup drive (2014)

In 2014 and early 2015, a large metadata cleanup campaign took place across MediaWiki wikis, in order to prepare as many files as possible for conversion to machine-readable data.