Commons:Structured data/Archive/2014/Berlin Bootcamp
How can we make multimedia data easier to use on Wikimedia Commons, Wikipedia and sister sites?
Today, information about media files on Wikimedia sites is stored in unstructured formats that cause a range of issues: for example, file information is hard to search, some of it is only available in English, and it is difficult to edit file information or to re-use files in compliance with their license terms.
Group photo of participants at the Structured Data Bootcamp in Berlin, October 5-10, 2014.
The focus of this event was to investigate how to structure data on Wikimedia Commons, reusing the technology developed for Wikidata. Participants collaborated in small workgroups to explore a range of problems and solutions, in parallel sessions focused on community, design, engineering, licensing and product management challenges.
Each workgroup produced concrete examples of how these ideas could be implemented, including:
- first ideas for data models for structuring file information, to make it both license-compliant and machine-readable
- first user interface designs for viewing and editing structured data seamlessly, alongside unstructured data
- a working prototype of a high-level API, for reading and updating metadata about media files
- improvements to a prototype dashboard identifying files missing machine-readable metadata.
These preliminary ideas are now being documented on Commons so we, the Commons community, can all use them as an initial basis for an informed discussion. We may end up collectively changing or rewriting these preliminary requirements, designs and initial code as part of that discussion. For a project overview, check out this development page and these project slides.
The bootcamp was very productive, but many questions remain unanswered. The current thinking is that the Structured Data project could take several years to complete. A gradual development process seems preferable, to take time to build this properly and to minimize disruption. Next steps include community discussions, design, prototype building and testing, and a series of experiments with structured data formats before starting actual development and data migration.
Everyone is invited to get involved in this important project. The Structured data hub is the best place to get started; please consider adding it (and related pages) to your watchlist, and signing up for the newsletter. Your ideas and comments are very welcome, and developers would love your active participation in defining and guiding this project.
We look forward to working together to better support the needs of our users and modernize our multimedia infrastructure.
Participants in this Berlin bootcamp included users from the Wikimedia Community (from Commons and other projects), the Multimedia Team (from the Wikimedia Foundation) and the Wikidata Team (from Wikimedia Deutschland).
- Wikimedia community members
- Multimedia team
  - Fabrice Florin, Product Manager
  - Gilles Dubuc, Senior Software Engineer
  - Pau Giner, Interaction Designer
  - Stephen LaPorte, Legal Counsel
  - Erik Moeller, VP Product
  - Guillaume Paumier, Product Analyst
  - Keegan Peterzell, Community Liaison
  - Gergő Tisza, Back-end Software Engineer
- Wikidata team
  - Lydia Pintscher, Product Manager
  - Daniel Kinzler, Senior Software Engineer
  - Katie Filbert
  - Thiemo Mättig
  - Birgit Müller
  - Jan Zerebecki
- Europeana representative
  - Hugo Manguinhas
Etherpad notes from each day of the meeting: