Commons:Structured data/About

Wikimedia Commons, a sister project of Wikipedia, is a collection of more than 60 million free media files. The project Structured Data on Wikimedia Commons converts information about these files to a structured and machine-readable format, making them easier to view, search, edit, organize and re-use, in many languages.

This is implemented with Wikibase, the same technology that is used for Wikidata.

Wikimedia community members and staff from the Wikimedia Foundation (WMF) and Wikimedia Deutschland (WMDE) (the Wikidata team) worked on this project from 2017 until the end of 2019.

About Wikimedia Commons

Wikimedia Commons is Wikimedia's free media repository. As of summer 2020, it contains more than 60 million media files, which are typically:

  1. personal photography and media uploaded by individuals;
  2. freely licensed media files uploaded to Commons from locations on the internet like Flickr, YouTube, open access journals, and other repositories;
  3. uploads from institutions and organizations with substantial media collections, like UNESCO, NASA, and the British Library.

In 2017–2019, through the Structured Data on Commons project, Wikimedia Commons gained access to new infrastructure and tools. These help Commons' community of contributors to provide machine-readable, structured data about the media files, in addition to the free text (in wikitext markup) with which the files were described before.

The Structured Data on Commons project was funded by a grant from the Alfred P. Sloan Foundation. Read more about the grant application and the community consultation that preceded it.

Structured data?

Wikimedia Commons operates on MediaWiki, the same software that powers Wikipedia. MediaWiki was primarily developed for hosting text, as on Wikipedia. As a result, each media file on Commons is typically accompanied by plain-text descriptions (wikitext, templates) and categories. These are usually available in only one language – mostly English – and, most importantly, are not consistently machine-readable.

Structured metadata allows the files to be accessible in a robust, consistent, structured and linked format: a format that allows software to understand, on a large scale, what the metadata fields mean (structured) and to connect them to other databases on the internet, putting them in a broader context (linked). Structured metadata is also more granular and easier to translate than unstructured data.

This switch makes it possible to use Commons' media in new ways, and makes the files on Commons much easier to view, search, edit, curate or organize, use and reuse, in many languages.

See also: Why we work on this.

What changed on Wikimedia Commons?

Information about files on Wikimedia Commons was enhanced with structured data, powered by Wikidata.

Wikidata, the structured data repository in the Wikimedia ecosystem, debuted in 2012. Wikidata is overseen by a team at Wikimedia Deutschland (WMDE) and serves as a foundation for Structured Data on Commons.

Wikidata's software – Wikibase – does not store articles in wikitext; instead, it stores concepts (called 'items'). Each item contains many 'statements' that describe, for instance, the item's title, its relations to other items, and dates.
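The idea of items that carry statements can be illustrated with a small sketch. This is not Wikibase's actual data model or API: the class names `Item` and `Statement`, the property names and all identifiers are hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One claim about an item: a property paired with a value."""
    property_id: str  # hypothetical name for the kind of claim
    value: str        # a literal (such as a date) or another item's ID


@dataclass
class Item:
    """A concept described by statements rather than by wikitext."""
    item_id: str
    statements: List[Statement] = field(default_factory=list)

    def values_of(self, property_id: str) -> List[str]:
        """Return every value stated for the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# All identifiers and values below are made up for illustration.
photo = Item(
    item_id="ITEM-1",
    statements=[
        Statement("title", "Sunset over a harbour"),
        Statement("depicts", "ITEM-2"),      # relation to another item
        Statement("created", "2019-06-01"),  # a date
    ],
)

print(photo.values_of("depicts"))  # ['ITEM-2']
```

Because every statement names its property explicitly, software can ask for "everything this item depicts" without having to parse free text.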

The structured data on Wikidata is freely re-usable across Wikimedia sites and by third parties. Additionally, computers can easily process and understand it. Because of this flexibility, Wikidata is increasingly used in STEM[1] fields, but also in cultural heritage and the humanities. Moreover, because of its broad support for translation, data entered in Wikidata in one language can immediately be made available in many other languages as well. Structured data is brought to Wikimedia Commons by integrating Wikibase, and metadata from Wikidata, into the file descriptions on Wikimedia Commons.
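The multilingual point above can be sketched as follows. This is a simplified illustration, not how Wikibase implements language handling: the `LABELS` table and the `label_for` helper are hypothetical. The idea is that a concept is stored once, with labels in several languages, so a reader's preferred language only changes which label is displayed.

```python
from typing import Dict, Optional, Sequence

# Hypothetical labels for a single concept, keyed by language code.
LABELS: Dict[str, str] = {
    "en": "lighthouse",
    "de": "Leuchtturm",
    "fr": "phare",
}


def label_for(labels: Dict[str, str], preferred: Sequence[str]) -> Optional[str]:
    """Return the label in the first preferred language that has one."""
    for lang in preferred:
        if lang in labels:
            return labels[lang]
    return None  # no label exists in any of the preferred languages


# A reader who prefers Dutch, then German, then English.
print(label_for(LABELS, ["nl", "de", "en"]))  # Leuchtturm
```

A statement that points to this concept never needs to be re-entered per language; only the label table grows as translations are added.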

See also: Frequently Asked Questions.
See also: Glossary.

When was this developed?

Development of Structured Data on Commons took place between 2017 and the end of 2019.

Throughout the three-year project, the team encouraged and supported volunteers and partners who wanted to build tools that allowed active and diverse editing, maintenance, conversion and cleanup of the files on Commons.

It was expected that, after three years, at least 5 million media files on Wikimedia Commons would contain some structured metadata. Because describing files depends on community processes, we are currently in a 'long tail' period of several additional years, during which all files on Commons are gradually being described with some structured data.

See also: Development.

Who worked on this?

Structured Commons is a collaboration between developers, the communities of Wikimedia Commons, Wikidata, Wikipedia, and sister projects, and partners and allies of the Wikimedia movement.

The developer team consisted of staff from both the Wikimedia Foundation and Wikimedia Deutschland. Community developers (tool developers, bot operators, developers at partner organizations) also played a large role in this project. All the developed features were conceptualized, created, tested and improved in close collaboration with the community of active contributors to Commons and Wikidata, as well as Wikipedia and sister projects. We also warmly welcome active feedback from cultural institutions (GLAMs – Galleries, Libraries, Archives and Museums).

See also: Team.

How you can help and get involved

  • For Wikimedia community members, there are many ways to contribute – by providing feedback, helping others, adding structured data statements, translating content, helping to decide how to model certain metadata...
  • Representatives from cultural and knowledge institutions (GLAMs) can also provide feedback, make use of the tools available, and help to decide how to model certain metadata.
See also: Get involved.
See also: Commons:Structured data/GLAM.
See also: Commons:International Image Interoperability Framework.

  1. Science, technology, engineering, and mathematics.