OpenRefine is an open-source, highly flexible power tool for working with data: cleaning it, transforming it, and moving it from one data repository to another. It is widely used by data scientists, data journalists, cultural institutions, and other professionals who work with data. OpenRefine is also a popular tool for batch uploading and batch editing data on Wikidata.
Since 2022, with support from a Wikimedia grant, it has been possible to use OpenRefine to batch edit and upload files on Wikimedia Commons, with a focus on adding multilingual, linked, structured data to the files on Commons.
This new Wikimedia Commons functionality in OpenRefine is especially useful for cultural institutions that want to upload files to Commons with linked, structured data. OpenRefine offers powerful import features for various data formats (CSV, TSV, Excel sheets, XML…) and for APIs (for those cultural institutions that use them). It also makes it possible to revisit existing Wikimedia Commons files, improve their metadata, and add multilingual structured data to them. Wikimedians in general can also use OpenRefine to batch upload their own or externally hosted files to Wikimedia Commons.
For 2023-24, as part of its support for Wikimedia Commons, the Wikimedia Foundation is funding OpenRefine for bug fixes to its Commons features, a train-the-trainer program, documentation, and a WikiLearn course.
As this program develops, this page will provide updates and more information.
Train-the-trainer course, 2023-24
From November 2023 until April 2024, an intensive online train-the-trainer course runs for candidate OpenRefine-Wikimedia trainers. Read more on the dedicated info page.
- September 1 to 15, 2023: registration period
- November 1, 2023: course starts
Log of activities
|2023-08-31||Announcement of train-the-trainer course. Registration period for this course starts||Info page about the course / registration form|
|2023-07-21||This info page is published||This page :-)|