Commons:Structured data/About

From Wikimedia Commons, the free media repository

Wikimedia Commons, a sister project of Wikipedia, is a collection of more than 50 million free media files. The project Structured Data on Wikimedia Commons converts information about these files to a structured and machine-readable format, making them easier to view, search, edit, organize and re-use, in many languages.

This is implemented with Wikibase, the same technology as used for Wikidata.

Wikimedia community members and staff from the Wikimedia Foundation (WMF) and Wikimedia Deutschland (WMDE, home of the Wikidata team) are working on this project from 2017 until the end of 2019.

About Wikimedia Commons

Wikimedia Commons is Wikimedia's free media repository. As of autumn 2018, it contains more than 50 million media files, which typically fall into three groups:

  1. personal photography and media uploaded by individuals;
  2. freely licensed media files uploaded to Commons from locations on the internet like Flickr, YouTube, open access journals, and other repositories;
  3. uploads from institutions and organizations with substantial media collections, like UNESCO, NASA, and the British Library.

In 2017–2019, through the Structured Data on Commons project, Wikimedia Commons gets access to new infrastructure and tools. These will help Commons' community of contributors to provide machine-readable, structured data about the media files, in addition to the free text (in wikitext markup) with which the files have been described before.

The Structured Data on Commons project is funded by a grant from the Alfred P. Sloan Foundation. Read more about the grant application and the community consultation that preceded it.

Structured data?

Wikimedia Commons runs on MediaWiki, the same software that powers Wikipedia. MediaWiki was developed primarily for hosting text, as on Wikipedia. As a result, each media file on Commons is typically accompanied by plain-text descriptions (wikitext, templates) and categories. These are usually available in only one language, mostly English, and, most importantly, are not consistently machine-readable.

Structured metadata will allow the files to be accessible in a robust, consistent, structured and linked format: a format that allows software to understand, on a large scale, what the metadata fields mean (structured) and to connect them to other databases on the internet, putting them in a broader context (linked). Structured metadata is also more granular and easier to translate than unstructured data.

This switch makes it possible to use Commons' media in new ways, and makes the files on Commons much easier to view, search, edit, curate or organize, use and reuse, in many languages.
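The difference between free text and structured, linked metadata can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`depicts`, `created`), the identifiers (`Q_EXAMPLE_...`), the file names and the `external` lookup table are all hypothetical and do not reflect the actual Commons data model.

```python
# Free text: readable for a person, but software has to guess at its parts.
free_text = "Lighthouse at dusk, photographed on 3 October 2018."

# Structured: every field has a fixed meaning that software can rely on.
# All names and identifiers below are hypothetical examples.
files = [
    {"file": "Example A.jpg", "depicts": "Q_EXAMPLE_LIGHTHOUSE", "created": "2018-10-03"},
    {"file": "Example B.jpg", "depicts": "Q_EXAMPLE_BRIDGE", "created": "2017-05-21"},
    {"file": "Example C.jpg", "depicts": "Q_EXAMPLE_LIGHTHOUSE", "created": "2016-01-12"},
]

# Linked: the identifiers point into another (here: made-up) dataset,
# which supplies context that the file records themselves do not carry.
external = {
    "Q_EXAMPLE_LIGHTHOUSE": {"kind": "building"},
    "Q_EXAMPLE_BRIDGE": {"kind": "infrastructure"},
}


def files_depicting(records, item_id):
    """Return the names of all files whose 'depicts' field matches item_id."""
    return [r["file"] for r in records if r["depicts"] == item_id]


def files_of_kind(records, linked, kind):
    """Return the names of all files depicting something of the given kind,
    using the linked dataset to look up what each identifier stands for."""
    return [r["file"] for r in records if linked[r["depicts"]]["kind"] == kind]


print(files_depicting(files, "Q_EXAMPLE_LIGHTHOUSE"))
print(files_of_kind(files, external, "infrastructure"))
```

Neither query is possible against `free_text` without parsing natural language; with structured fields, both are a simple comparison.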

See also: Why we work on this.

What is changing on Wikimedia Commons?

Information about files on Wikimedia Commons will be enhanced with structured data, powered by Wikidata.

Wikidata, the structured data repository in the Wikimedia ecosystem, debuted in 2012. Wikidata is overseen by a team at Wikimedia Deutschland (WMDE) and serves as a foundation for Structured Data on Commons.

Wikidata's software, Wikibase, stores concepts (called 'items') rather than articles in wikitext. Each item contains 'statements' that describe, for instance, the item's title, its relations to other items, and dates.

The structured data on Wikidata is freely re-usable across Wikimedia sites and by third parties. Additionally, computers can easily process and understand it. Because of this flexibility, Wikidata is increasingly used in STEM[1] fields, but also in cultural heritage and the humanities. Moreover, because of its support of broad translation, data entered in Wikidata in one language can immediately be made available in many other languages as well. The integration of structured data in Wikimedia Commons happens through the integration of Wikibase, and metadata from Wikidata, into file descriptions on Wikimedia Commons.
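As a rough illustration of the item and statement model described above, the following Python sketch represents one concept with labels in several languages and a couple of statements. It is not Wikibase's actual schema or API: the class names, the `label` and `values` helpers, and the `P_...` and `Q_...` identifiers are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A single claim about an item: a property paired with a value."""
    property_id: str  # hypothetical ID, e.g. "P_SUBCLASS_OF"
    value: str        # another item's ID, a date, or a plain string


@dataclass
class Item:
    """A concept identified by an ID, not by a page title in one language."""
    item_id: str
    labels: dict = field(default_factory=dict)      # language code -> label
    statements: list = field(default_factory=list)  # list of Statement

    def label(self, language: str, fallback: str = "en") -> str:
        """Return the label in the given language, or the fallback language."""
        return self.labels.get(language, self.labels[fallback])

    def values(self, property_id: str) -> list:
        """Return the values of all statements that use the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# A made-up item; the identifiers do not correspond to real Wikidata entries.
lighthouse = Item(
    item_id="Q_EXAMPLE_1",
    labels={"en": "lighthouse", "de": "Leuchtturm", "fr": "phare"},
    statements=[
        Statement("P_SUBCLASS_OF", "Q_EXAMPLE_TOWER"),
        Statement("P_INCEPTION", "1875"),
    ],
)

print(lighthouse.label("fr"))               # phare
print(lighthouse.label("pl"))               # lighthouse (falls back to English)
print(lighthouse.values("P_SUBCLASS_OF"))   # ['Q_EXAMPLE_TOWER']
```

Because the statements refer to identifiers rather than to words, they only need to be entered once; adding a label in another language makes the same data readable in that language.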

See also: Frequently Asked Questions.
See also: Glossary.

When is this developed?

Development of Structured Data on Commons takes place between 2017 and the end of 2019.

Throughout the three-year project, the team plans to encourage and support volunteers and partners who want to build tools that allow active and diverse editing, maintenance, conversion and cleanup of the files on Commons.

It is expected that, after the three years, at least approximately 5 million media files on Wikimedia Commons will contain some structured metadata. Depending on community processes, there will probably also be a 'long tail' period of several additional years until all files on Commons are described with some structured data.

See also: Development.

Who is working on this?

Structured Commons is a collaboration between developers, the communities of Wikimedia Commons, Wikidata, Wikipedia, and sister projects, and partners and allies of the Wikimedia movement.

The developer team consists of staff from both the Wikimedia Foundation and Wikimedia Deutschland. Community developers (tool developers, bot operators, developers at partner organizations) can also play a large role in this project. All the developed features are conceptualized, created, tested and improved in close collaboration with the community of active contributors to Commons and Wikidata, as well as Wikipedia and sister projects. We also warmly welcome active feedback from cultural institutions (GLAMs – Galleries, Libraries, Archives and Museums).

See also: Team.
See also: Community focus group.

How you can help and get involved

  • For Wikimedia community members, there are many ways to contribute – by providing feedback, helping others, being part of the community focus group, translating content...
  • Representatives from cultural and knowledge institutions (GLAMs) can also provide feedback, and be part of a focus group.
See also: Get involved.
See also: Commons:International Image Interoperability Framework.
  1. Science, technology, engineering, and mathematics.