Commons:Batch uploading

Bot policy and list · Requests to operate a bot · Requests for work to be done by a bot · Changes to allow localization  · Requests for batch uploads

This page has a backlog that requires the attention of experienced editors.
Please remove this notice if it won't be needed in the future.



Commons Batch Uploading is a project to centralize the uploading of collections of files whose creators have released their work into the public domain or under any Commons-compatible license. Each request is assigned to a bot operator, who works out how it can be fulfilled. (To upload batches from Flickr, please make requests at Commons:Flickr batch uploading.)

Before you request a batch upload here, please read the guide to batch uploading.

See w:Wikipedia:Public domain image resources for potential future batch uploads.




Scripts, Examples and Information

New requests

State Library of North Carolina

  • Description of Content

The State Library of North Carolina has over 10,000 photographs taken during a statewide survey of North Carolina's cultural heritage institutions, including libraries, museums, archives, and historic sites. The survey was conducted as part of the North Carolina Exploring Cultural Heritage Online project (NC ECHO), which surveyed all 100 counties in the state, and ultimately identified and photographed over 950 institutions between 2001 and 2009. The collection includes images of building interiors and exteriors, historical marker signage, displays, and exhibits. The State Library is also the official depository for all state agency publications, and has an extensive collection of digitized historical publications on North Carolina and its people.

The State Library is interested in adding the NC ECHO images to Wikimedia Commons (I, Retrent, work for the State Library), and we are particularly interested in doing so in a way that takes advantage of the recent and forthcoming integration of Commons and Wikidata, and the Structured Data Project.

The NC ECHO images are currently online through the North Carolina Digital Collections, a portal jointly managed by the State Library and State Archives of North Carolina. The images can be found at The images are described, and it is relatively easy to capture metadata and image links in whatever format is required for upload tools. State Library staff are available to facilitate.

Example images:

  • Which license tag(s) should be applied?

The photographs were created by a North Carolina government agency, and as such are public records and the State Library considers them to be in the public domain.

The Template:PD-SLNC license was developed by the State Library and Wikimedia DC as part of the Summer of Monuments Project. It is based on the State Library's official rights statement, and it may be used for these images.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Template:Photograph is likely sufficient. However, we are unsure how to take advantage of the recent integration of Wikidata and Commons, and whether we would need a special template to insert Wikidata links in fields like Depicted place. Once the images are uploaded, we'd like to add authority control links to the buildings/locations depicted in them.

Also, five NC ECHO images were uploaded manually as part of the Summer of Monuments project under the username NCandbeyond. These images were uploaded with Template:Information, and we are unsure of how to change the template so that we can enter additional metadata.

Retrent (talk) 16:56, 11 December 2014 (UTC)


Assigned to Progress Bot name Category

National Museum of Korea

  • Source to upload from: here (375, some uploaded), here (3285 / some duplicates with above folder.), here (7277 files / some duplicates with above folders.)
  • Describe the works to be uploaded in detail (audio files, images by …):

Images taken by the National Museum of Korea, depicting all of its heritage items.

  • Which license tag(s) should be applied?


  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

{{NMoK}} should be used after the source link, and {{Institution:국립중앙박물관}} should be given as the author.  revimsg 04:54, 14 November 2014 (UTC)


Assigned to Progress Bot name Category

The Digital South Asia Library

For Bond photos, the direct URL to the images is
For Keagle, it is
The "NNNN" is a 4 digit number.

For Hensley, it is The "x" is a small case letter, and the "NNN" is a 3 digit number.

    • Do you know whether the site has an API


    • What else can ease uploading (is the site valid XHTML, WCM they use…)?


    • Did you contact the site owner?


  • Describe the works to be uploaded in detail (audio files, images by …):

All the files are images taken of South Asia (mostly India and Burma) during World War II by American servicemen Glenn S. Hensley, Robert Keagle and Frank Bond.

  • Which license tag(s) should be applied?


  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

The regular "Information" template should be fine. Co9man (talk) 08:44, 10 October 2014 (UTC)


Assigned to Progress Bot name Category

Musée des Augustins

Over several months the Musée des Augustins (fine arts museum in Toulouse) uploaded its media collections to Commons (see Category:Media_contributed_by_the_Musée_des_Augustins_de_Toulouse).

I am finishing up this work with the works in the reserves: 683 paintings & 921 sculptures.

The alignment of artists' names is done at Commons:Musée des Augustins/alignment/ARTIST reserves.

Test of metadata is at Commons:Batch uploading/Musée des Augustins/test.

Looking forward to your opinions, Jean-Fred (talk) 22:10, 26 September 2014 (UTC)

Good work. Thank you. --Slick (talk) 04:59, 23 October 2014 (UTC)


Assigned to Progress Bot name Category

Rubin Kazan - Llevant UD

  • Source to upload from:
    • Did you observe an URL pattern
    • Do you know whether the site has an API
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
  • Describe the works to be uploaded in detail (audio files, images by …):

Photo gallery of a historic match for Levante UD. Good-quality images of players who may not have any better portraits.

  • Which license tag(s) should be applied?

The usual license.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?


Assigned to | Progress | Bot name | Category
(none) | (none) | (none) | Category:Rubin Kazan-Llevant 15-03-2013


This request for the upload was ignored: Openclipart SVG files

    • Did you observe an URL pattern
    • Do you know whether the site has an API
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?

I contacted Wolfgang Spraul on 22 January 2014, and he told me that the best thing I can do is "to tell all your friends about Openclipart. :-) Tweet, Blog, say thanks from wikitranslate, but most importantly tell them in person."

When I asked him how to credit the project for the work, I was told to add a link to the homepage.

Natkabrown (talk) 14:55, 9 August 2014 (UTC)

  • Describe the works to be uploaded in detail (audio files, images by …):

Public domain clipart files in SVG format.

  • Which license tag(s) should be applied?


  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Rillke(q?) 10:16, 9 August 2014 (UTC)

{{PD-OpenClipart}} as permission, which also sets a hidden category. User: Perhelion (Commons: = crap?) 14:30, 9 August 2014 (UTC)
All SVG files from openclipart (I've downloaded the 1.4GB dump for Aug 9th) seem to contain embedded metadata with the Author, a link to the source page for the file, and a bunch of openclipart-specific "tags". I think the key to importing the images is finding a good mapping from the oca tags to commons categories. --Dschwen (talk) 15:15, 9 August 2014 (UTC)
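A minimal sketch of the tag-mapping idea described above, assuming the embedded metadata uses the Dublin Core / RDF layout that Inkscape writes (dc:creator, dc:subject with an rdf:Bag of tags) and the creativecommons.org namespace for cc:. The function names and the TAG_TO_CATEGORY table are hypothetical; a real mapping would have to be compiled by hand or from the dump.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for the embedded metadata (older files may use others).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cc": "http://creativecommons.org/ns#",
}

# Hypothetical mapping from Openclipart tags to Commons categories.
TAG_TO_CATEGORY = {"fruit": "Fruit", "bird": "Birds"}


def extract_metadata(svg_text):
    """Return the author and the list of tags found in an SVG document."""
    root = ET.fromstring(svg_text)
    author = root.find(".//dc:creator/cc:Agent/dc:title", NS)
    tags = [
        li.text.strip()
        for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
        if li.text
    ]
    return {"author": author.text if author is not None else None, "tags": tags}


def categories_for(tags, mapping=TAG_TO_CATEGORY):
    """Split tags into mapped Commons categories and still-unmapped tags."""
    mapped = sorted({mapping[t] for t in tags if t in mapping})
    unmapped = [t for t in tags if t not in mapping]
    return mapped, unmapped
```

Collecting the unmapped tags over the whole dump would show which tags are frequent enough to be worth mapping first.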


Assigned to | Progress | Bot name | Category
Dschwen | Writing the code to analyze the metadata | DschwenBot | (none)

Digitaler Portraitindex

Zoomify images at Example:
    • Do you know whether the site has an API
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
  • Describe the works to be uploaded in detail (audio files, images by …):
257,000 high resolution zoomify print portraits of Early Modern figures. These are very useful for WP articles.
  • Which license tag(s) should be applied?
{{PD-Art-100}}, I believe all images were created in the Early Modern period.
  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
I could create one if this batch upload is possible

Jfhutson (talk) 01:41, 22 May 2014 (UTC)


Comment: They use Zoomify, so it should be possible to get the highest resolution of the images in general. --Slick (talk) 09:35, 26 June 2014 (UTC)

Assigned to Progress Bot name Category

Manuscripts by Srečko Kosovel

  • Source to upload from:[1] (10 more pages) - images are accessible only from the main list for some reason (small icon next to the image for example [2]).
    • Did you observe an URL pattern? /
    • Do you know whether the site has an API? /
    • What else can ease uploading (is the site valid XHTML, WCM they use…)? /
    • Did you contact the site owner? /
  • Describe the works to be uploaded in detail (audio files, images by …):

1046 manuscripts by the Slovene poet Srečko Kosovel, mostly poems, but also letters, essays and other texts.

  • Which license tag(s) should be applied?


  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
|Date={{other date|by|1926}} (unless known - for about 80 manuscripts)
|Author={{Creator:Srečko Kosovel}}

[[Category:Manuscripts by Srečko Kosovel]]

Name of the files should be something like: Srečko Kosovel - <title>.jpg. For {{}}, the single parameter is the last part of the URN; for URN:NBN:SI:IMG-ZXFUYLHN it should be ZXFUYLHN.
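The two naming rules above could be sketched as follows. Both function names are hypothetical, and the set of characters replaced in titles is an assumption about what MediaWiki rejects in file names rather than a complete list.

```python
# Characters assumed to be problematic in MediaWiki file names.
_FORBIDDEN = str.maketrans({c: "-" for c in "#<>[]|{}/:"})


def urn_suffix(urn):
    """Return the last part of a URN, e.g. 'ZXFUYLHN' for '...:IMG-ZXFUYLHN'."""
    return urn.rsplit("-", 1)[-1]


def kosovel_file_name(title):
    """Build 'Srečko Kosovel - <title>.jpg' with unsafe characters replaced."""
    safe_title = title.strip().translate(_FORBIDDEN)
    return f"Srečko Kosovel - {safe_title}.jpg"
```

For the URN quoted in the request, `urn_suffix("URN:NBN:SI:IMG-ZXFUYLHN")` gives `"ZXFUYLHN"`.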

Sporti (talk) 07:08, 29 March 2014 (UTC)


Assigned to Progress Bot name Category

Geograph Deutschland

All images uploaded to this site are licensed under CC BY-SA 2.0. Some time ago, a (small) batch of images was already uploaded here under Category:Images from the Geograph Deutschland project. Is it possible to transfer all of the nearly 50,000 images uploaded so far to Commons? Is it also possible to upload newly added images automatically from time to time? In line with the project's goal of "collecting geographically representative photos for every square kilometre of Germany", almost all of the images are also relevant for Wikipedia.

P170 (talk) 16:11, 21 March 2014 (UTC)


Assigned to Progress Bot name Category

Glitch Artwork

*.zip (but directory is not traversable)

    • Do you know whether the site has an API


    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?


  • Describe the works to be uploaded in detail (audio files, images by …):

Images and animations in .png, .swf, and .fla format archived in .zip files

  • Which license tag(s) should be applied?
Creative Commons CC-Zero This file is made available under the Creative Commons CC0 1.0 Universal Public Domain Dedication.
The person who associated a work with this deed has dedicated the work to the public domain by waiving all of his or her rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Lepidoptera (talk) 11:21, 13 February 2014 (UTC)


Comment: I reviewed this, and only the first ZIP file ([3], 3557 PNGs, ~1.5 KB each, ~30x30 pixels) contains usable file formats. I don't know whether the FLA or SWF files in the other ZIP files can be converted in a batch, or whether that would be useful; I cannot convert them myself. I am also not sure this is within scope here, because these are thousands of small pieces (especially in the first file I reviewed). I would like a second opinion, please. --Slick (talk) 19:29, 24 February 2014 (UTC)

Assigned to Progress Bot name Category


  • Source to upload from:

    • Did you observe an URL pattern

A list of all applicable seals is currently being created. It can be supplied either as plain URLs or as a file (e.g. CSV) with description, tags, etc.

    • Do you know whether the site has an API

The site uses MediaWiki-Software.

    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?

This is our own site.

  • Describe the works to be uploaded in detail (audio files, images by …):

This site contains a huge collection of historic letter seals from German governmental and administrative authorities.

  • Which license tag(s) should be applied?

Template:Bild-PD-Amtliches Werk

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Veikkos-archiv (talk) 07:43, 16 January 2014 (UTC)

We need help selecting the best upload process. Did we understand correctly that we have to wait for an admin to show up in the "Assigned to" field?


Assigned to Progress Bot name Category

Kurt Rasmussen

  • Source to upload from: Kurt Rasmussen,
    • Did you observe an URL pattern
      • where xxxx is a four-digit number
    • Do you know whether the site has an API
      • I think not.
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
      • Essentially the algorithm that needs to be done is:
        • for each page in the linked search results
          • for each div class="bildvorschau" in the search results
            • download the url given in the first a href=, use this URL as source in the final information template
            • in the now downloaded file, find div class="bildcontainer"
            • in it, from the p class="beschreibung", extract the description to be used in the final information template
            • from the img tag immediately following it, download the url in the src attribute
            • upload it to Commons
  • Describe the works to be uploaded in detail (audio files, images by …):
    • All images by Kurt Rasmussen.
  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

In parallel, I am also trying to coordinate a manual upload of this huge collection of extremely valuable photos. For that, see User:Darkweasel94/Rasmussen. darkweasel94 13:42, 13 December 2013 (UTC)
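The parsing half of the algorithm outlined in this request can be sketched with Python's standard html.parser. This is only an illustration under the assumptions stated in the request (div class="bildvorschau" on result pages; div class="bildcontainer" with a p class="beschreibung" followed by an img on detail pages). Downloading the pages and uploading to Commons are left out, and all class and function names below are hypothetical.

```python
from html.parser import HTMLParser


class PreviewLinks(HTMLParser):
    """Collect the first link inside every <div class="bildvorschau">."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._depth = 0       # div nesting depth inside a preview block
        self._taken = False   # True once this block's first link is stored

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "bildvorschau" in (attrs.get("class") or "").split():
                self._depth, self._taken = 1, False
        elif tag == "a" and self._depth and not self._taken and attrs.get("href"):
            self.links.append(attrs["href"])
            self._taken = True

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1


class DetailPage(HTMLParser):
    """Extract description and image URL from <div class="bildcontainer">."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.image_url = None
        self._depth = 0
        self._in_description = False
        self._description_done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "bildcontainer" in classes:
                self._depth = 1
        elif not self._depth:
            return
        elif tag == "p" and "beschreibung" in classes:
            self._in_description = True
        elif tag == "img" and self._description_done and self.image_url is None:
            # First image after the description paragraph.
            self.image_url = attrs.get("src")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
        elif tag == "p" and self._in_description:
            self._in_description = False
            self._description_done = True

    def handle_data(self, data):
        if self._in_description:
            self.description += data


def preview_links(html):
    """Return the detail-page URLs found on one page of search results."""
    parser = PreviewLinks()
    parser.feed(html)
    return parser.links


def parse_detail(html):
    """Return (description, image URL) for one detail page."""
    parser = DetailPage()
    parser.feed(html)
    return parser.description.strip(), parser.image_url
```

Each URL returned by `preview_links` would serve as the source in the information template, and the pair returned by `parse_detail` supplies the description and the file to fetch.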


Assigned to | Progress | Bot name | Category
darkweasel94 | finished coding, will upload in the next days | will probably do this from my own user account | Category:Files uploaded by darkweasel94 (cleanup) (also contains other stuff)

Bildarchiv Austria

  • Source to upload from: Bildarchiv Austria
  • Describe the works to be uploaded in detail (audio files, images by …): All photos there that have a date before 1912. I think this should include everything tagged "um (= around) 1900" or earlier, but not "um 1910" per COM:PRP (because this might include stuff that was taken in 1913)
  • Which license tag(s) should be applied? {{PD-Austria-1932}} {{PD-1996}}
  • Is there a template that could be used on the file description pages? Do you think a special template should be created? I don't know if we need anything more specific than {{information}}.
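The date rule proposed above (keep photos dated before 1912, treat "um" dates cautiously) might be expressed as a small filter. The five-year tolerance for "um" labels is purely an assumption chosen so that "um 1900" passes and "um 1910" does not; the function name and the expected label format are hypothetical as well.

```python
import re

CUTOFF_YEAR = 1912   # photos must date from before this year
UM_TOLERANCE = 5     # assumed uncertainty, in years, of an "um" (around) date


def is_safe_to_upload(date_label):
    """Return True if a label such as '1905' or 'um 1900' is safely before 1912."""
    match = re.fullmatch(r"\s*(um\s+)?(\d{4})\s*", date_label)
    if not match:
        return False  # unknown formats are skipped, in the spirit of COM:PRP
    year = int(match.group(2))
    if match.group(1):
        year += UM_TOLERANCE  # assume the photo could be this much later
    return year < CUTOFF_YEAR
```

With these assumptions, "um 1900" is accepted and "um 1910" is rejected, matching the reasoning in the request.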

It isn't urgent that this be done because this source isn't likely to disappear, but if somebody wants to hack together something to get these photos with adequate description, I'd be very grateful because there are many useful photos of Austrian history there. darkweasel94 10:04, 3 December 2013 (UTC)

  • Info: collection example, single image example, raw image example. The image download size is at most 800px; otherwise you have to pay. Maybe it is possible to request a higher resolution for Commons? --Slick (talk) 14:46, 3 December 2013 (UTC)
    • It might be, but that would very much harm their business, so I wouldn't find it especially likely that they will do so. The 800px versions would already be very nice to have, but if we can get more, that's of course even better. ;) darkweasel94 15:14, 3 December 2013 (UTC)


Assigned to Progress Bot name Category


As part of a partnership with Wikimédia France, the Musées de la Haute-Saône (codenamed Champlitte) are sharing part of their collection on Wikimedia Commons. Jean-Fred (talk) 23:25, 24 November 2013 (UTC)


The first tests look great. Is there anything that needs to be done for alignment? I don't see any subject in the source records in Joconde that could be used to decide which categories to use, so does everything go into the museum category, to be categorized afterwards? Symac (talk) 09:52, 26 November 2013 (UTC)

Thanks Symac for reviewing this.
The dataset provided by the museum was just a sample, with merely 136 records. Looking at the metadata export, I did not see any obvious candidate for an alignment − but that may be because of the sample size. I did not see any good source for categorisation, unfortunately − but that may be because I get a bit confused with the numerous Joconde fields.
There is though some parsing work to do. Size should be pretty much done ; a reverse look-up table together with a split-match-and-apply should do for technique (modulo the metadata confusion) ; rest does not seem to be good candidates for that either. Less work to do it seems :) Jean-Fred (talk) 22:05, 11 December 2013 (UTC)
If after two weeks there are no more concerns than mine (which you answered perfectly), I think it would be a good idea to ask the provider for more data to go further with this partnership. Symac (talk) 07:18, 12 December 2013 (UTC)
Okay, we had a phone call with the museums last week, and the project is back on track. We will make use of the GLAMwiki Toolset Project. The current target is to proceed with the upload at the end of January. Jean-Fred (talk) 23:22, 21 December 2013 (UTC)
Update: This is still happening. The museum folks are experimenting right now with the GWToolset on Commons Beta. Jean-Fred (talk) 10:20, 5 February 2014 (UTC)


Redux: GWToolset

We are getting very close to pushing files here. User:Tounoki from the museum is managing this.

Test files:


Jean-Fred (talk) 13:34, 11 April 2014 (UTC)

Some quick feedback based on the examples above.
  1. If the descriptions are always in French then a {{fr|1=<description>}} should be wrapped around it.
  2. "lieu de création" (currently part of the object history) should be mapped against the "place of creation" parameter instead
  3. Date should use {{other_date}} if possible. Unsure if GWToolset supports this, but looking at the JSON, the source data might have sufficient structure.
  4. Measures should use {{size}}. Looking at the JSON, the source data should have sufficient structure.
  5. In the Sabot images
    1. The license seems to have disappeared
    2. an empty Creator template is used
    3. the institution template seems to be broken
    4. apostrophes ( ' ) seem to have been replaced by "&​#39;" (everywhere else is used)
/Lokal_Profil 14:55, 11 April 2014 (UTC)
Hi André, thanks a lot for your feedback.
  1. ✓ Done I used {{Original caption}} as it labels it as a description − we need to stuff other things into this field.
  2. ✓ Done
  3. I gave a try at parsing the date but it does not capture the complexity of the dates (for example « début 14e siècle » is only parsed to 14th century)
  4. ✓ Done Right.
As for the Sabot images I’m not sure what happened there − maybe User:Tounoki would know?
Jean-Fred (talk) 16:07, 14 April 2014 (UTC)
Assigned to Progress Bot name Category

Fonds Eugène Trutat bis

Follow-up of Commons:Batch uploading/Fonds Eugène Trutat. The archives provided the rest of the Fonds Eugène Trutat, and in better resolution. Jean-Fred (talk) 19:48, 19 October 2013 (UTC)



Some of the metadata is processed using manual alignment.

   Done − Let’s say it’s good enough as it is :-þ


All right, I think we are all set. Tests have been updated on /test.

Here is what we are looking at categorisation-wise − note that these numbers only account for the categorisation made through the alignment; all files are at least in Category:Fonds Trutat - Archives municipales de Toulouse (plus a bunch of hidden ones).

Per category
The program added 912 categories, 92 distinct ones
The most used category is on 405 files
The least used is on 1 file
On average, a category is used 9.9 times (mean)
The median is: 2.5
Per file
The most categorized file has 5 categories
The least categorized file has 0 categories
We have 10 uncategorized files
We have 289 files with two categories or more, which makes 60.2%
On average, a file has 1.9 categories (mean)
The median is: 2.0
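Figures like the ones above can be derived from a simple mapping of file names to assigned categories. The sketch below is not the script that produced these numbers; the function name and input shape are assumptions, and it expects at least one file and one category.

```python
from collections import Counter
from statistics import mean, median


def categorisation_stats(file_categories):
    """Summarise a mapping {file name: [categories added by the alignment]}."""
    per_file = [len(cats) for cats in file_categories.values()]
    usage = Counter(cat for cats in file_categories.values() for cat in cats)
    counts = list(usage.values())
    two_or_more = sum(1 for n in per_file if n >= 2)
    return {
        # Per category
        "categories_added": sum(per_file),
        "distinct_categories": len(counts),
        "most_used": max(counts),
        "least_used": min(counts),
        "mean_per_category": mean(counts),
        "median_per_category": median(counts),
        # Per file
        "most_categorised_file": max(per_file),
        "uncategorised_files": per_file.count(0),
        "files_with_two_or_more": two_or_more,
        "share_with_two_or_more": two_or_more / len(per_file),
        "mean_per_file": mean(per_file),
        "median_per_file": median(per_file),
    }
```

As a consistency check on the reported numbers, 912 added categories over 92 distinct ones gives the stated mean of about 9.9 uses per category.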

Jean-Fred (talk) 00:00, 18 November 2013 (UTC)

First one uploaded: File:(Avignon (Vaucluse). Remparts) - Fonds Trutat - 51Fi480.jpg. Will proceed with the rest shortly. Jean-Fred (talk) 23:14, 20 November 2013 (UTC)
  • Ten more done ; here they are (using the awesome {{MyUploads/grid-photostream}} ;-) Jean-Fred (talk) 00:11, 22 November 2013 (UTC)
  • 40 more done. I’ll probably fire the rest at the end of the WE if there are no further remarks. Jean-Fred (talk) 23:58, 22 November 2013 (UTC)
  • All new ones done. Still have the ones uploaded 3 years ago to update. Jean-Fred (talk) 17:39, 3 December 2013 (UTC)
Assigned to | Progress | Bot name | Category
User:Jean-Frédéric | (none) | User:TrutatBot | (none)

Spread the sign

This is what the films look like. This one shows the Swedish sign for apelsin (orange).

  • Describe the works to be uploaded in detail (audio files, images by …):
    • Spreadthesign has around 150 000 films of signs in 16 different languages, and is continuing to make more films in new languages. They want to share their films to help raise awareness of sign language and to make better use of the material they have. The films are of high quality, but they have yet to decide what resolution and format they want to upload.

  • Which license tag(s) should be applied?
    • CC-BY-SA

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
    • It would probably be a good idea to have a special template. One thing I'm thinking about is a field that allows connecting films with the same word in different languages for example.

Axel Pettersson (WMSE) (talk) 10:36, 29 August 2013 (UTC)


In a first meeting, Lokal_Profil and I had the following comments on the code:

  • Make the filename $word_$language-spreadthesign.ogv
  • Create categories with $wordclass (or what it might be called to have nouns, verbs and such in separate categories)
  • Create the categories Videos of sign language in $language and $language sign language
  • Make the description more dynamic and at least in English and the language of the film. Help with l10n might be needed here, although they have a network with partners in all languages they have films in.
That’s a cool project! Awesome :)
Just a few rapid thoughts
  • I gather from the Gist that source links all follow the same pattern − it might be worth creating {{STS link|$vid}} to generate the link (whose label could be i18n).
  • Not sure what you need a special template for. Connecting to similar films could be done through the other versions field. Am I missing something?
  • License: File:67329.webm is tagged with CC-BY.
  • Author: do we have better metadata for that? « own work » does not really cut it. Who should be attributed here?
All for now. Hope that helps!
Jean-Fred (talk) 21:06, 29 August 2013 (UTC)
Thanks for your concerns, Jean-Fred; we also feel it's an important project.
  • We want to use one correct template to create the Commons pages for all 150,000 uploaded films.
  • Yes, we have better data. « own work » is gone; each file name will be as suggested above, $word_$language-spreadthesign.ogv, with a better, dynamic (language-supported) description too.
Spreadthesign 10:21, 30 August 2013 (UTC)
Updated bot code.
  • New version is available here
Spreadthesign 10:21, 30 August 2013 (UTC)
Updated bot code.
  • Changes link source here
  • We will use CC-BY-SA-3.0 for the upload.
Spreadthesign 09:33, 2 September 2013 (UTC)
Updated bot code.
  • Added support for wordclass category [[category:verb]] here
Spreadthesign(talk) 10:45, 3 September 2013 (UTC)
Updated bot code.
Example: Apelsin spreadthesign.ogv
SpreadthesignBot (talk) 08:35, 4 September 2013 (UTC)
I created {{STS-cooperation}}. Please help translate it and put it in the code somewhere. /Axel Pettersson (WMSE) (talk) 10:14, 5 September 2013 (UTC)
A few things: 1) Which language codes are you planning to use? It makes sense to use the sign language code and not the spoken/written language code (so swl for the Swedish sign language, not sv). 2) In the bot code I don't see any simple way to cancel uploading or pausing between uploads. You need a way to shut it down in case of "emergency", and in the beginning you should start uploading at 1-2 files per minute, and gradually increase that. 3) Are the videos already encoded as OGV? If not, you might consider using Webm instead, since that gives better quality, but it's no big deal. Skalman (talk) 08:04, 6 September 2013 (UTC)
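One way to get both the slow start and an "emergency" shut-off is a loop that sleeps between uploads and checks for a stop file before each one. This is a sketch only, not the bot's actual code; `upload` stands for whatever callable performs a single upload, and the stop-file convention and all names are assumptions.

```python
import time
from pathlib import Path


def throttled_upload(items, upload, per_minute=1.0,
                     stop_file=Path("STOP"), sleep=time.sleep):
    """Upload items one by one at a fixed rate; stop if stop_file exists.

    Returns the number of items uploaded before finishing or being stopped.
    """
    delay = 60.0 / per_minute  # seconds to wait after each upload
    done = 0
    for item in items:
        if stop_file.exists():
            break  # operator created the stop file: halt immediately
        upload(item)
        done += 1
        sleep(delay)
    return done
```

Starting with `per_minute=1` and raising it in later runs follows the gradual ramp-up suggested above; creating a file named STOP in the working directory halts the run before the next upload.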
Wordclass (ordklass in Swedish) is called part of speech in English, POS for short. As I understand it, filenames on Commons need to be unique. The same word can belong to different parts of speech. So, if I am not wrong, I suggest that the part of speech should also be part of the filename. It should also be possible to add another optional distinguisher if two or more signs are used for the same word of the same part of speech.
Regarding the order $word-$language, isn't it better to state the $language first, like $language-$word-$pos-$optional_dist-spreadthesign.webm/ogv? Please give me your thoughts.
I don't know if this matters, but working with Wiktionary I am used to the fact that capitalization matters. Is there a reason to not normally name the files without capital letter for the word? Like swl-apelsin-noun-spreadthesign.webm/ogv. ~ Dodde (talk) 17:16, 6 September 2013 (UTC)
Thanks all.
To Skalman.
  • We were planning to use the language code (e.g. sv for Swedish, Svenska) for naming the files simply because we are able to support it; now we may add support for sign language codes.
  • Yes, we thought we could limit the number of files processed by limiting how many rows are fetched from the database in SQL. As you mentioned, we are going to start with a few records and then increase the amount.
  • All videos we are going to upload are in flash, mp4 or/and ogv format.
Spreadthesign 13:31, 11 September 2013 (UTC)
To Dodde.
  • Today we cannot support using POS for naming the files, I'll create another distinguisher.
  • I can't see any benefit in using $language before $word or the other way around. If there is any, please tell me.
  • Capitalization matters, I guess it was just a typo or test case.
Spreadthesign (talk) 19:51, 11 September 2013 (UTC)
A matter of organizing the files, I suppose. It's easier to see which language the word belongs to if the language code is presented first. In a listing, words of the same language would be grouped together. That is all I can say.
In order to be able to insert entries for the signs, the information regarding part of speech needs to be present. Is this information present somewhere else (in some database), or is it expected the person who runs the bot should manually decide or sometimes guess for each word before creating the entry? ~ Dodde (talk) 22:54, 11 September 2013 (UTC)
For pronunciation files, I believe that the language always comes first, see Category:Pronunciation. It makes sense to use the same system for sign language videos.
If you're not using the actual sign language codes, what do you intend to do with written languages that are covered by multiple sign languages, such as English (US, UK)? Are you using codes such as en-uk? I really think it'd be easier and more correct to just use the sign language code.
Regarding capitalization: The first letter of a file name is always capitalized here (which is another reason to have the lang code first). However, File:Coca-cola spreadthesign.ogv should actually have a capital C in the middle (as well as the lang code).
I feel that it'd be good if somebody who is part of the Commons community commented on this as well (I'm only familiar with the Swedish language Wiktionary). Skalman (talk) 06:26, 12 September 2013 (UTC)
Updated bot code.
  • Added support for sign language codes for naming the files. The file names we create will now be swl-apelsin-spreadthesign.ogv (sign for orange in Swedish Sign Language) or ase-orange-spreadthesign.ogv (sign for orange in American Sign Language). here.

SpreadthesignBot (talk) 12:16, 12 September 2013 (UTC)

Great work with the upload preparations. A few thoughts though.
  • Are the words always unique? I.e. the Swedish banan (banana/the track) could technically have two signs which would each end up becoming swl-banan-spreadthesign.ogv. A way around this would be to append the internal STS-id. In this case banana would become swl-banan-spreadthesign-98036.ogv. This also solves the issue with e.g. apelsin having three separate videos
    • As a follow up to the three different apelsin videos. Will the information "vanligast/lika vanlig/används i Norrland" which distinguishes the three be included somehow?
  • The license used should probably be {{cc-by-sa-3.0|[ Spread the Sign]}} instead of {{self|cc-by-sa-3.0}}. What I did was to bake this into the {{STS-cooperation}} template so that this can be used as the permission parameter instead. This template also includes the Media contributed by Spread the sign-category meaning that it should also be removed from the github code.
  • The license should be complemented by an e-mail being sent to stating that STS are the owners of the material uploaded by User:Spreadthesign and releases it under cc-by-sa-3.0. Once this is properly registered the id can be added to {{STS-cooperation}}.
  • As for using a special information template. Since the information is structured (POS, language etc.) and there are so many videos it might actually help localisation to use a purpose created template. What I'm thinking is that if there is e.g. a parameter for POS then the language mapping could be done directly in the template (similar to how {{Technique}} works). Similar thing could be done with the languages etc. Other opinions on this would be welcome though.
/André Costa (WMSE) (talk) 15:12, 16 September 2013 (UTC)
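The naming scheme suggested above (sign language code first, internal STS id appended to keep names unique) could look like this in the upload script. The function name and parameters are hypothetical and not taken from the bot code.

```python
def sts_filename(sign_lang_code, word, sts_id, extension="ogv"):
    """Build a name such as 'swl-banan-spreadthesign-98036.ogv'.

    Appending the internal STS id keeps names unique when one word has
    several signs. MediaWiki itself capitalises the first letter on upload.
    """
    return f"{sign_lang_code}-{word.strip()}-spreadthesign-{sts_id}.{extension}"
```

For the example discussed above, `sts_filename("swl", "banan", 98036)` yields `"swl-banan-spreadthesign-98036.ogv"`.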
Using a template to enter POS and any other information would be very helpful when inserting videos+descriptions automatically into Wiktionary entries, as long as it's well-structured. Skalman (talk) 08:48, 17 September 2013 (UTC)
Yeah, Spreadthesign is back. Thank you all for your thoughts, advice and assistance. We are very excited to begin the upload of our material very soon.
Now for the questions:
  • All files should be unique; I added a distinguisher to each file name.
  • I'm not quite sure how the license should be if not {{self|cc-by-sa-3.0}}. Please explain!
  • {{STS-cooperation}} was complemented, id release is added.
  • We understand how useful it would be to use the POS, but today this is not possible due to lack of support in the db.
  • Updated bot code is available here

SpreadthesignBot (talk) 13:29, 30 September 2013 (UTC)

In the code you should change permission from {{CC-BY-SA 3.0}} to {{STS-cooperation}}. Then it will be as in Apelsin spreadthesign.ogv with complete cooperation, license and OTRS-templates.
I still think the description should include both English and the language of the film. Probably something like {{Multilingual description|en=$desc $categoryLanguage|(something that finds out language)=($desc in the language of the film $language in the own language)}} as it would be helpful to non-English communities.
The bot request is here. Please help with the approval there.
/Axel Pettersson (WMSE) (talk) 12:22, 3 October 2013 (UTC)
Please don’t use {{Multilingual description}}, please favor {{sv|...}}{{en|...}}. The behaviour of Mld is automatically triggered when there are more than N languages in the description field (don’t remember right now how much is N). Jean-Fred (talk) 13:12, 3 October 2013 (UTC)
Sorry about that, I didn't know. On the other hand, if the description field only has two languages, English and the language of the film, will it be triggered then? Or maybe it doesn't matter as there are only two languages there. We still have the problem of inserting the right language code there as well, or is there an existing solution somewhere? /Axel Pettersson (WMSE) (talk) 13:39, 3 October 2013 (UTC)
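As a rough illustration of the per-language templates suggested above, the description field could be assembled from a mapping of language code to text. The helper name `language_descriptions` and the sample texts are hypothetical, and the sketch assumes the correct language code for each film is already known.

```python
from typing import Mapping


def language_descriptions(texts: Mapping[str, str]) -> str:
    """Wrap each description in its language template, e.g. {{sv|1=...}}."""
    # "1=" keeps the text intact even if it contains an equals sign.
    return "\n".join(
        "{{%s|1=%s}}" % (code, text) for code, text in texts.items()
    )


# Sample texts are placeholders, not actual STS descriptions.
print(language_descriptions({"sv": "banan", "en": "banana"}))
```

Adding a further language is then just another key in the mapping, with no change to the template logic.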

Way to go, File:Bho-make+a+reservation-spreadthesign-9982.ogv is up and running. Some thoughts:

  • No need for + in filenames, it should be Bho-make a reservation-spreadthesign-9982.ogv
  • Format the upload as this
  • The description should state that it's British Sign Language. Something like "Book a table in a restaurant at a particular time so that you can eat a meal in British sign language."
  • Categories should be on one line each
  • Categories should (probably) be category:British English sign language and category:Videos of sign language in British English for Bho.

/Axel Pettersson (WMSE) (talk) 09:02, 21 October 2013 (UTC)

A few more points:
  • The name of the language is British Sign language - any categories should probably not include the word "English". The categories Axel suggested should probably be category:British Sign Language and category:Videos in British Sign Language (though I don't understand the difference between them - are both needed?)
  • I am confused as to which language this is. "bho" is the language code of Bhojpuri - British Sign Language has the language code "bfi".
  • The link back to should not be a Swedish language link. ->
  • Where does the description/definition "Book a table in a restaurant at a particular time so that you can eat a meal" come from? On I only see the text "make a reservation" (+the translations to other written languages). I believe you can "make a reservation" at a hotel too, so which description is accurate?
Skalman (talk) 12:23, 21 October 2013 (UTC)


  • + in a file name is a bug and it's fixed already.
  • The description will be complemented
  • Categories as well.
  • "bho" comes from I think I got it right.
  • Nice point with the link back to will fix it before the next test upload.
  • Sign descriptions come from our colleagues around the world, who get the chance to help since they know better what each sign means in their own language.

SpreadthesignBot (talk) 19:02, 21 October 2013 (UTC)

@SpreadthesignBot: That's an old version of Ethnologue. Starting with Ethnologue 15 it's "bfi". See here for the current version: On sv-wikt we use ISO 639-3 if Wikimedia doesn't have a special code (and I believe that the new version is the same as ISO 639-3). Skalman (talk) 08:40, 22 October 2013 (UTC)
To Skalman

Thanks a lot Skalman, my mistake.

SpreadthesignBot (talk) 12:13, 22 October 2013 (UTC)

Any status update? Skalman (talk) 11:51, 15 November 2013 (UTC)

Reboot in December

I've added some new movies

New description, wordclass in the categories and some more issues solved. Please have a look and comment here. /user:SpreadthesignBot (through Axel Pettersson (WMSE) (talk) 10:15, 6 December 2013 (UTC))

No comments after waiting for a few days. Moving along with some more uploads now, but feel free to interrupt or comment as we move along. /Axel Pettersson (WMSE) (talk) 10:54, 9 December 2013 (UTC)
Hey, I just had a quick look. Looks very good, not much to say − please upload more!
One feature request just for the pleasure to ask for the impossible ;-). I see descriptions are provided in English and Swedish − good. But I see that translations are available in many more languages on STS website ; for example 81278, if inserted with a /de/, gives /de/81278/ which says “personen”. Any chance to fetch all those and add them to the file description page ? :) Jean-Fred (talk) 11:48, 9 December 2013 (UTC)
My suggestions:
  • Put the word in quotes (e.g. "annat" på svenskt teckenspråk)
  • In English, language names use capital letters, so it should be Swedish Sign Language (but "svenskt teckenspråk")
  • I'm wondering about File:Swl-annat-spreadthesign-73566.ogv - "annat" in Swedish does not mean "else" in English (annat=other, annars=else). I hope such mistakes are uncommon, but it would be nice to know what the actual meaning is - should we assume that for Swedish Sign Language videos, the Swedish description is (most likely to be) correct? Is there a good place to report errors?
Nice to see some activity here! Skalman (talk) 16:12, 9 December 2013 (UTC)

How is it going? Do you need help uploading? I might be able to help. Are there other considerations? Skalman (talk) 14:22, 14 January 2014 (UTC)

Files uploaded during test period

These files should be deleted and uploaded again with correct name and format later on.


Assigned to | Progress | Bot name | Category
Axel Pettersson (WMSE), Spreadthesign | coding | SpreadthesignBot | Media contributed by Spread the sign


See Com:DPLA for an overview of the project. The DPLA has metadata for over 2 million records; sadly only a portion of these are PD. User:Bdcousineau is going through collection by collection to reveal PD materials. See Com:DPLA for the list.

  • Source to upload from:
    • Did you observe a URL pattern?
    • Do you know whether the site has an API?
The DPLA has an API that is available for use; however, it is a metadata repository. The source files will be linked from the local website. See Commons:Bots/Requests/Smallbot (10) for sample templating, etc. Bot operator retired before upload was begun.
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
Given the mission of the DPLA, there may be no need. The DPLA has representation on the project page. Project coordinator is happy to contact site owners as needed, if needed.
  • Describe the works to be uploaded in detail (audio files, images by …):
Jpg and tiff files.
  • Which license tag(s) should be applied?
Depending on the collection, either a {{PD-US}} tag, or a {{PD-USGov}}.
  • Is there a template that could be used on the file description pages? Do you think a special template should be created? Depending on the collection {{Artwork}}, {{Photo}}, {{Book}}. We've also created a preliminary institution tag that will be adjusted to reflect the owning institution.

Bdcousineau (talk) 00:59, 5 August 2013 (UTC)


I'd be happy to assist with the task. We however need to establish a good way to handle this. Perhaps a specific template should be created that holds all the notes on the linked nom. This way we would have more control on the licensing should we desire to make slight updates. Also was the code to retrieve the files ever created before? -- とある白い猫 ちぃ? 10:31, 8 September 2013 (UTC)

Hi, thanks! Please know I'm not a techie, so what is "linked nom"? For the initial batch we were working with, we mimicked the templating created by prior uploads (see Commons:Bots/Requests/Smallbot (10) - there is a sample of the JSON source on that page as well). I can see however that an overarching template will be needed. As far as the licensing goes, all the materials we started to work with are {{PD-USGov}}, the others will be different.
I guess the big question is, where would you like to start? With where we left off? Or with a smaller batch? A smaller batch makes the most sense, in that the templating and licensing can be adjusted as you suggest. The Massachusetts Digital Commonwealth has a few smaller collections that are PD - and total approx 1500 items.
To be clear, disclaimer: I am a NARA employee - this project has nothing to do with my official duties, nor does it reflect official policy, etc etc. Bdcousineau (talk) 12:27, 8 September 2013 (UTC)
OK so perhaps we should do this like a Q&A to avoid mistakes. I meant the Commons:Bots/Requests/Smallbot (10) when I stated "linked nom".
  1. At this repository do we have a variety of licenses? If so is there a list of it? Can we easily distinguish the license of each file?
  2. Were any files copied to commons with a bot before? Or was that never the case? I'd rather avoid re-engineering code if one already exists. What exactly do you mean by "where we left off"
  3. Does this repository grow in size? If so how often do we need to update?
  4. Do you have a link to the API and example sample images?
-- とある白い猫 ちぃ? 13:04, 8 September 2013 (UTC)
Much easier, thanks!
  1. For this project, only PD materials are appropriate. Some will be {{PD-USGov}}, others {{PD-art/1923}}, and others will be {{PD-1923}}. In general, each collection will have the same license for each of their files - for example, the ARTStor files (10K files) will be {{PD-art/1923}}, and the MassDigCommonwealth will be {{PD-1923}}. Even though the DPLA has a huge number of files, only a small percentage are PD. Yes, easily distinguishable. Also, I can generate any list you'll need collection by collection.
  2. No, no files copied to Commons yet. "Where we left off": group consensus asked previous uploader to upload small sample batch for further review. Task not completed. Most likely that previous work is useless, and should be ignored.
  3. Yes, DPLA grows in size, both inside each collection, and as new service hubs/partners are added. There is no consistent wording for licensing, either; licensing is developed at the donor level and varies wildly. Last time I checked, searching by licensing field was not an option - PD mapping done by hand. Since the project has a DPLA contact (user:SJ), it might be possible to get better access to the rate at which material is added to DPLA. Can this be put off for a moment? Also, since the project does have a DPLA connection, it's reasonable that at some point he needs to be drawn in to the consensus process around templating, etc, especially if DPLA-specific templates are developed.
  4. You have to get a key for the API here. Sample images: DPLA is broken, can't get any search results. Will try again later today.
New: The DPLA Dev team and others associated with the DPLA were excited when we contacted them about this (April 2013)... so I am assuming we can get some support from them if needed. Bdcousineau (talk) 14:18, 8 September 2013 (UTC)
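Since each collection is said to carry a single license tag, the hand-made PD mapping could be kept as a simple lookup table. The sketch below is only an assumption about how a bot might store it; the dictionary keys reuse the collection names as written above, and `license_for` is a hypothetical helper.

```python
# Collection name -> license tag, as stated in the discussion above.
# The keys are informal names, not identifiers taken from the DPLA API.
COLLECTION_LICENSES = {
    "ARTStor": "{{PD-art/1923}}",
    "MassDigCommonwealth": "{{PD-1923}}",
}


def license_for(collection):
    """Return the license tag for a collection that was mapped by hand."""
    try:
        return COLLECTION_LICENSES[collection]
    except KeyError:
        # Unknown collections are refused rather than guessed, because
        # only a portion of DPLA material is public domain.
        raise ValueError("no PD mapping for collection: %r" % collection)


print(license_for("ARTStor"))
```

Refusing unmapped collections keeps the bot from uploading anything whose status has not been reviewed manually.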
One possibility is them uploading to Flickr, and I can use existing code to receive it. They can throttle their internet usage this way too, to prevent outages, as the bot would be relentless (since I don't know their upload limits). They can for instance use . For the script to work they must release it with a free license. If they are willing to do this option, I wouldn't need to code. Or they can upload directly to Commons of course. I am just curious whether they are unwilling to do either. -- とある白い猫 ちぃ? 21:25, 8 September 2013 (UTC)
Hmmm... that level of support is unlikely, it'll be more like a thumbs-up/pat on the back/yes go for it. IMHO I don't think the DPLA is in the business of pushing the files out once the service hubs sign on, they are strictly a repository; while a great angle, this version of the plan is prolly a non-starter. It'll be up to Wikimedians to figure out a way to bring the files to Commons. BTW I really appreciate having this discussion, thanks. Bdcousineau (talk) 23:59, 8 September 2013 (UTC)
Well, I need sample images, urls etc to work with. -- とある白い猫 ちぃ? 20:42, 11 September 2013 (UTC)

Ok, will try by Saturday, surely by Sunday am. Tied up til then. Thanks so much. Bdcousineau (talk) 01:02, 12 September 2013 (UTC)

Please do not hurry, I am rather busy with real world affairs until more or less the end of this month. This is an issue that needs to be handled with time and care anyways. -- とある白い猫 ちぃ? 21:12, 13 September 2013 (UTC)
Great! Here are sample urls of declared PD materials:
All the metadata from the DPLA's API is PD. Let me know if this is useful, and what you needed. Bdcousineau (talk) 22:15, 15 September 2013 (UTC)
Assigned to Progress Bot name Category

Museum of History of Photography

The Museum of History of Photography has an online collection of public domain photographs in two categories: Photos and Equipment. It would be valuable for Commons and Wikipedia projects, as it includes:

  • old cameras and camera equipment,
  • historical context of geographic places,
  • photographs of notable people,
  • photographs by professional photographers; different techniques of photography,
  • etc.
  • Source to upload from:
    • There is an URL pattern.
    • Site does not provide any API.
    • What else can ease uploading?
      • There are also books available at the website (scanned pages as JPG). There are sections which can be downloaded optionally: BOOKS, BULLETINS, POSTERS, CALENDARS, ARCHIVE
  • Which license tag(s) should be applied?
    • Public domain. The website says: "Copyright: Exhibits in the Public Domain MHF allows visitors to make full use of the digitalised images of the museum exhibits in the formats published by the Museum in the public domain on condition that the source and creators are acknowledged. The terms and conditions of the accessibility of superior quality digital images of museum exhibits are stated in The Rules of Accessibility of the MHP Collections and the Price List."
  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
    • Author:
    • Subject:
    • Geographical location:
    • Dating:
    • Genus:
    • Technique:
    • Dimensions:
    • Inventory number:

dariusz woźniak (talk) 07:37, 16 July 2013 (UTC)


Assigned to Progress Bot name Category

Rubin Kazan - Llevant UD

  • Source to upload from:
    • Did you observe a URL pattern?
    • Do you know whether the site has an API?
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
  • Describe the works to be uploaded in detail (audio files, images by …):

Photo gallery of a historic match for Levante UD. Good-quality images of players who may not have any better portraits.

  • Which license tag(s) should be applied?

The usual license.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?


Assigned to | Progress | Bot name | Category
 |  |  | Category:Rubin Kazan-Llevant 15-03-2013


The renovated en:Rijksmuseum in Amsterdam has made their digital collection of 111,000+ objects available digitally under a CC-0 license. An API key is needed for digital downloading. According to the museum:

"All object descriptions available via this API are covered by a Creative Commons 0 licence. The images are in the Public Domain, according to which the data and the images are free of rights and may be copied, changed, distributed or exported without the Rijksmuseum’s permission."

Sandstein (talk) 20:40, 7 April 2013 (UTC)

  • Describe the works to be uploaded in detail (audio files, images by …): Presumably the entire collection is of use. According to "The Rijksmuseum API Collection is a set of more than 110,000 descriptions of objects (metadata) and digital images from the Rijksmuseum collection. The works of art and implements in the set date from ancient times through to the late 19th century and provide an excellent overview of the richness, diversity and beauty of the Dutch and international heritage. Unfortunately, copyright restrictions mean that we are not yet able to include any works from the 20th or 21st centuries. The set includes paintings and prints (ranging from the great masters of the Golden Age through to anonymous biblical paintings and other painted objects from the Middle Ages), 19th-century photographs, ceramics, furniture, silverware, doll’s houses, miniatures, etc. Digital photographs were taken of all of the objects in this set."
  • Is there a template that could be used on the file description pages? Do you think a special template should be created? Museum:Rijksmuseum. Also, the following should probably be taken into account even though we are not an app: "In all apps to be built in which images belonging to the Rijksmuseum are used, app designers will credit these as having been built with the API of the Rijksmuseum, including images and documentation. The credit must be placed where it can be seen easily by users. App-builders will credit all images with the words ‘Rijksmuseum collection’."


  • I'm quite aware of this awesome collection. Haven't uploaded it yet because we're planning to use it as pilot for Commons:GLAMwiki Toolset Project. Not sure when this will happen exactly, probably in the next months. Multichill (talk) 10:42, 13 April 2013 (UTC)
Assigned to Progress Bot name Category

Fonds Ancely

This upload is part of a partnership between Wikimédia France and the Library of Toulouse. It consists of 2085 public domain files. You may see general notes and work in progress on User:Jean-Frédéric/Ancely.

The metadata is held in an OAI-PMH repository. The code explores it and retrieves records; then, if applicable, the various fields are matched to a manual alignment of Commons categories and tags, community curated. This is then fed to a data ingestion template which translates the metadata to {{Artwork}}. Actual upload is made with Pywikipedia-rewrite by User:AncelyBot.
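The last step of that pipeline, turning one harvested record into {{Artwork}} wikitext, might look roughly like the sketch below. It is not the project's actual code: the record keys (Dublin Core style names such as `creator` and `identifier`), the field mapping, and the sample values are all assumptions made for illustration.

```python
# Assumed mapping from harvested record keys to {{Artwork}} parameters.
FIELD_MAP = [
    ("creator", "artist"),
    ("title", "title"),
    ("description", "description"),
    ("date", "date"),
    ("identifier", "accession number"),
]


def to_artwork(record, categories=()):
    """Render a metadata record (dict) as {{Artwork}} wikitext."""
    lines = ["{{Artwork"]
    for source_key, parameter in FIELD_MAP:
        # Missing fields become empty parameters rather than errors.
        lines.append(" |%s = %s" % (parameter, record.get(source_key, "")))
    lines.append("}}")
    # Categories come from the separate, community-curated alignment.
    lines.extend("[[Category:%s]]" % name for name in categories)
    return "\n".join(lines)


sample = {"title": "Example title", "identifier": "B315556101_A_LEVASSEUR_066"}
print(to_artwork(sample, categories=["Mountains in art"]))
```

Keeping the field mapping in one table makes it easy to review the alignment separately from the upload code.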

In its current state, the categorisation system with the alignment outputs 31,801 categories (1,694 distinct) − the drawback is that many are high-level categories (“Shawls”, “men”, etc.)

Looking forward to your thoughts, Jean-Fred (talk) 22:49, 6 March 2013 (UTC)


  • Uploaded five more − see Special:ListFiles/AncelyBot Jean-Fred (talk) 01:14, 16 March 2013 (UTC)
  • Uploaded fifteen more − and I will continue uploading files until my demands are met! Jean-Fred (talk) 00:23, 19 March 2013 (UTC)
  • Support everything looks fine for me. (may be a bit overcat) --PierreSelim (talk) 14:24, 20 March 2013 (UTC)
  • Ok, uploading 100 right now. Jean-Fred (talk) 21:06, 11 April 2013 (UTC)
  • Looks very good. The only thing that worries me a bit is the number of categories per image. That might become a problem. Please upload more! Multichill (talk) 10:39, 13 April 2013 (UTC)
  • Oppose now, we have forgotten to finish the Creator mapping User:Jean-Frédéric/Ancely/Creator --PierreSelim (talk) 12:02, 25 April 2013 (UTC)
  • Uploaded the first 350. Jean-Fred (talk) 23:08, 7 May 2013 (UTC)
  • Uploaded the first 500. Jean-Fred (talk) 13:04, 8 May 2013 (UTC)
  • Uploaded the first 800. Jean-Fred (talk) 14:05, 10 May 2013 (UTC)
  • Made it 1,000. Jean-Fred (talk) 23:15, 12 May 2013 (UTC)
  • ✓ Done. 2041 files uploaded + 33 dupes + 11 errors = 2085 files, the size of the corpus. Jean-Fred (talk) 14:49, 24 May 2013 (UTC)



The following files were already on Commons − we might want to update their file descriptions (current: 33)


The following files failed to upload (current: 11)

Categorisation statistics
Per category

30266 categories, 1760 distinct Mean: 17.1965909091 Median: 2.0 Max 1045 // Min 1

Top 10: [(u'Mountains in art', 1045), (u'Men in art', 992), (u'Women in art', 878), (u'Trees in art', 780), (u'Houses in art', 736), (u'Pyr\xe9n\xe9es-Atlantiques', 693), (u'Hautes-Pyr\xe9n\xe9es', 617), (u'Pyrenees', 470), (u'National costumes in art', 468), (u'Rivers in art', 440)]

Lose 10: [(u'Estrades', 1), (u'Pierre Bayle', 1), (u'Morla\xe0s', 1), (u'Louis-Fran\xe7ois Couch\xe9', 1), (u'Jean Racine', 1), (u'Faience in France', 1), (u'Marmite', 1), (u'Corsica', 1), (u'Dordogne River', 1), (u'Esera River', 1)]

Per file

Mean: 14.5160671463 Median: 13.0 Max 47 // Min 0

Top N: [('B315556101_A_LEVASSEUR_066', 47), ('B315556101_A_LEVASSEUR_068', 46), ('B315556101_A_LEVASSEUR_018', 44), ('B315556101_A_LEVASSEUR_056', 42), ('B315556101_A_LEVASSEUR_057', 42)]

Lose N: [('B315556101_A_BERTHIER_010', 1), ('B315556101_A_BERTHIER_024', 0), ('B315556101_A_BERTHIER_021', 0), ('B315556101_A_BERTHIER_018', 0), ('B315556101_A_BERTHIER_013', 0)]
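Figures like the ones above could be computed along the following lines. This is a sketch under the assumption that the alignment output is available as a mapping from file identifier to a list of category names; that structure, the function names, and the sample data are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def per_category_stats(file_categories):
    """Summarise how many files each category receives."""
    counts = Counter(
        category
        for categories in file_categories.values()
        for category in categories
    )
    sizes = list(counts.values())
    return {
        "total": sum(sizes),        # category assignments overall
        "distinct": len(counts),    # number of different categories
        "mean": mean(sizes),
        "median": median(sizes),
        "max": max(sizes),
        "min": min(sizes),
        "top": counts.most_common(10),
    }


def per_file_stats(file_categories):
    """Summarise how many categories each file receives."""
    sizes = [len(categories) for categories in file_categories.values()]
    return {
        "mean": mean(sizes),
        "median": median(sizes),
        "max": max(sizes),
        "min": min(sizes),
    }


# Tiny made-up sample, only to show the shape of the input.
sample = {
    "file_a": ["Mountains in art", "Men in art"],
    "file_b": ["Mountains in art"],
    "file_c": [],
}
print(per_category_stats(sample))
print(per_file_stats(sample))
```

The per-category mean is total assignments divided by distinct categories, and the per-file mean is total assignments divided by the number of files, which matches the relationship between the figures reported above.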

Assigned to | Job | Progress
Jean-Frédéric | Metadata pre-processing | Status: Done
Jean-Frédéric, Symac, Léna, PierreSelim | Metadata alignment | Status: Done
User:Jean-Frédéric | Upload | Status: Done
 | Dupes and errors processing | Status: todo

South African churches

User af:Gebruiker:Morne has uploaded hundreds of perfect images of buildings in South Africa (mostly churches) in Afrikaans Wikipedia, all under the same licence "you are free to use, copy, modify, if you properly credit the author" (see an example). I consider it important, as there are unfortunately relatively few images of South African cities, towns and villages in Wikipedia. --Dmitri Lytov (talk) 03:21, 3 March 2013 (UTC)

  • Describe the works to be uploaded in detail (audio files, images by …):

It's a collection of several hundred images of churches in South Africa.

  • Which license tag(s) should be applied?

"you are free to use, copy, modify, if you properly credit the author" (see an example).

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Sorry, no idea.


Assigned to Progress Bot name Category

National Gallery of Art

Jean-Fred (talk) 14:19, 18 January 2013 (UTC)

  • Source to upload from: National Gallery of Art online database, per their open access policy
    • Did you observe a URL pattern?
    • Do you know whether the site has an API?
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
      See here, they welcome the idea
  • Describe the works to be uploaded in detail (audio files, images by …):

Artwork digitisations

  • Which license tag(s) should be applied?

Existing uploads seem to rely on {{PD-author|National Gallery of Art}} or {{PD-art|PD-old-100}}.

I guess a custom wrapper for {{Licensed-PD-Art|PD-old-whatever|{{PD-author|National Gallery of Art}}}} would do the trick (in the spirit of {{Walters Art Museum license/2D}})

Went ahead and created {{PD-Art-National Gallery of Art}}. Jean-Fred (talk) 14:47, 18 January 2013 (UTC)


Assigned to Progress Bot name Category

US Army Research Laboratory Eniac

  • Describe the works to be uploaded in detail (audio files, images by …):
    Images (PNGs are high-res; also lo-res GIF, JPG); those which are photos should ideally be converted to JPG.
    I only count around 20 such images, so please state if that's too few for a batch upload to be considered.
    There are a few duplicates within Category:ENIAC, but I gather that the batch proposal equivalents are generally of better quality.
  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
    Not that I know of... possibly something about ENIAC


Assigned to Progress Bot name Category

11k of Aerial Photos

In the course of the aerial photo project of German Wikimedia, de:Wikipedia:Projekt Fotoflüge, I wrote an article for a pilots' magazine. After that I got in contact with a pilot who wants to share his own aerial photo collection, which he created over the past 24 years. It seems that all photos are already geo-referenced and classified (by type, such as solar power plant or church, as well as by region, such as Europe, Andalucia, Sanlucar). The classification and the geo-reference are stored in the EXIF data of the images. During a manual upload the geo-reference was recognized correctly by Commons. Because of the large number of pictures, it would be good to have some way to automate the upload and, if possible, to match the classification of the pictures to Commons categories. I have no idea if or how this is possible, and it would be great to get some information or some help with this request. The classification is sometimes in German and does not match the Commons categories. The pilot has already created a Wikipedia/Commons user account and uploaded one example file where you can see how the data is stored in the EXIF data.
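One possible approach to matching the photographer's classification to Commons categories is a hand-maintained lookup table, with anything unmatched set aside for manual review. The sketch below assumes the keywords have already been read from the EXIF data; the keyword spellings, the category names, and the helper `map_keywords` are placeholders, not verified values.

```python
# Hypothetical keyword -> Commons category table; entries are examples only.
KEYWORD_TO_CATEGORY = {
    "Kirche": "Churches",
    "Solarkraftwerk": "Solar power plants",
}


def map_keywords(keywords):
    """Split keywords into matched categories and unmatched leftovers."""
    matched, unmatched = [], []
    for keyword in keywords:
        category = KEYWORD_TO_CATEGORY.get(keyword.strip())
        if category is None:
            # Keep the original keyword so a human can extend the table.
            unmatched.append(keyword)
        elif category not in matched:
            matched.append(category)
    return matched, unmatched


print(map_keywords(["Kirche", "Andalucia", "Solarkraftwerk"]))
```

Collecting the unmatched keywords across the whole batch would show which table entries are still missing before any upload starts.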

  • Source to upload from:

The files are on a computer of the pilot / photographer.

    • Did you observe a URL pattern?
    • Do you know whether the site has an API?
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?

Not the site owner but the photographer User:Graf-flugplatz

  • Describe the works to be uploaded in detail (audio files, images by …):

About 11,000 digital aerial photos should be uploaded.

Solarthermie Kraftwerk 100919005.JPG
  • Which license tag(s) should be applied?

Has to be clarified with Author, but expect "CC BY-SA 3.0" like the example.

Update 18.12.2012: License "CC BY-SA 3.0" is approved by Author User:Graf-flugplatz.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?


Nice sample images. I'm from Germany too and I would like to help. But not before the end of April 2013 because I am away and busy. If this will be ok, just waiting ... --Slick (talk) 17:26, 9 January 2013 (UTC)

Ok, how can I get the images to upload? I would like to have them here, so I can check the tags they have and try to find the best categories for them. Possible solutions: I download them all from a source, or you send them to me on CD/DVD (I am from Germany too). You can contact me (in German please) here about this. Additionally, I suggest the pilot (or you) add some minimal content to his user page for others who are interested in the source/creator (i.e. the same information as in this request). --Slick (talk) 08:51, 6 February 2013 (UTC)

Assigned to | Progress | Bot name | Category
Slick | Waiting for user response... |  |

Garden of the Victory in Chelyabinsk

  • Source to upload from:

User Ain92 asked me to upload some photos with Panoramio Picker but I have never done it and found that it's too complicated to understand it in the nearest time. So I ask to upload for category:Garden of the Victory in Chelyabinsk all photos from this page and 2-9th photos from this page (they are cc-by). Анастасия Львоваru (ru-n, en-2) 07:03, 11 December 2012 (UTC)


Assigned to Progress Bot name Category


  • Describe the works to be uploaded in detail (audio files, images by …):

We wanted to upload free images from the AELG website because they have galleries of Galician writers. They have a CC-BY-SA license for some photos in the authors' galleries, namely photos by Eduardo Castro Bal and Santos-Díez.

There is an index of authors here and this is an example of the gallery of a writer. The individual photos have a URL like this.

  • Which license tag(s) should be applied?

The images are CC-BY-SA, some by photographer Eduardo Castro and others by photographer Santos-Díez.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

There is a template to use with the photos: {{AELG}}. There is a category too. Bye, --Elisardojm (talk) 00:14, 28 September 2012 (UTC)


Comment Could somebody review whether this work can be carried out, or whether more information is needed? Thanks, --Elisardojm (talk) 22:14, 9 November 2012 (UTC)

Looks good and no more information is needed yet. But usually it can take some time to realize. Just waiting ... --Slick (talk) 16:25, 9 December 2012 (UTC)
Ok, if somebody needs more details or is going to try this task, I would appreciate it if they notified me on my talk page. Thanks!, --Elisardojm (talk) 09:54, 11 December 2012 (UTC)
I'll do the upload tmrw. Smallman12q (talk) 03:19, 22 January 2013 (UTC)
If you need more information or details about this task, you can ask me. Thanks!, --Elisardojm (talk) 14:16, 22 January 2013 (UTC)

✓ Done I've completed the upload...~800 uploaded. Some such as File:Valentín_Arias_(AELG)-1.jpg aren't thumbnailing...but work fine in Firefox and show metadata so it's a bug on the wiki side. Cheers. Smallman12q (talk) 21:22, 22 January 2013 (UTC)

Image rendering bug is being looked into at w:Wikipedia:Village_pump_(technical)#Images_not_rendering. Smallman12q (talk) 23:49, 23 January 2013 (UTC)
Image rendering bug was resolved. Fixed issue with spacing in direct links brought up at User_talk:Smallman12q#AELG_photo.27s_upload.

Smallman12q (talk) 04:39, 26 January 2013 (UTC) Per User_talk:Smallman12q#AELG_photo.27s_upload, added author link:

Smallman12q (talk) 03:08, 7 February 2013 (UTC)

Assigned to | Progress | Bot name | Category
User:Smallman12q | Done | User:Smallbot | Category:Images from AELG

Gerald R. Ford Presidential Library and Museum

The Ford Presidential Lib/Museum is a federal archives, part of NARA. We'd like to create a partnership with Wikimedia:Commons and get all of our digitized material up. All materials are in the public domain. Agency management is on board, and we have a team already working on this! I've been uploading materials one-by-one, I've gotten about 170 images uploaded - see Commons:Gerald R. Ford Presidential Library and Museum - I figure it should take me til oh, 2215 to get everything up! We're looking for an administrator to work with and develop a plan. Bdcousineau (talk) 18:50, 5 September 2012 (UTC)

See Commons:Gerald R. Ford Presidential Library and Museum for current progress. Smallman12q (talk) 23:26, 17 September 2012 (UTC)


Assigned to Progress Bot name Category

Rudolf Steiner Gesamtausgabe

The following page offers all works of Rudolf Steiner's complete edition (Gesamtausgabe, public domain) as scans in citable editions. A transfer to Wikimedia Commons was discussed and requested here.

  • I am downloading the files and preparing them for upload. Which is the correct licence template in this case? I guess PD-old. Is that enough, or is a second one needed? --Slick (talk) 21:14, 11 August 2012 (UTC)
  • Downloads finished. --Slick (talk) 08:47, 13 August 2012 (UTC)

A discussion in German about the licence can be found here. It looks like there is a problem with scans from sources newer than 1923. --Slick (talk) 13:35, 15 August 2012 (UTC)

I am withdrawing my support for this batch job and removing all local work already done, because help/support was missing although requested more than once. Reverting the job to the request list. --Slick (talk) 20:30, 23 August 2012 (UTC)


Assigned to Progress Bot name Category

Detroit Publishing Company at LoC

"This collection of photographs from the Detroit Publishing Company Collection includes over 25,000 glass negatives and transparencies as well as about 300 color photolithograph prints, mostly of the eastern United States. The collection includes the work of a number of photographers, one of whom was the well known photographer William Henry Jackson. A small group within the larger collection includes about 900 Mammoth Plate Photographs taken by William Henry Jackson along several railroad lines in the United States and Mexico in the 1880s and 1890s. The group also includes views of California, Wyoming and the Canadian Rockies." Subject index; geographical index. cmadler (talk) 17:17, 20 March 2012 (UTC)


Support Great collection of historical images. --Junkyardsparkle (talk) 21:48, 26 April 2014 (UTC)

Assigned to Progress Bot name Category

Cesare Brizio

Photographer Cesare Brizio has agreed to donate 1300+ images here. Images may be taken from the web page OR originals can be sent to anyone on a DVD if required. He also suggested some sound files - but they are in the wrong format (mp3).

Data from OTRS ticket 2012021810002796 follows (permission obtained to copy this OTRS message here)
Dear Ron Jones: yes, I confirm that I am actually glad to release all the images located at via the "View Media" link at as "Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)". Furthermore, I can provide upon request higher resolution versions (1024x768 or more) of almost all the same images.

By the way, I would gladly release as "Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)" all the audio samples (recordings of animal sounds) available at the web pages listed here:

best regards,

Cesare Brizio


Support Sounds good, especially since we don't have pictures for all biological articles. --Sanandros (talk) 14:49, 21 August 2012 (UTC)

Assigned to Progress Bot name Category

Works of Maurice Ravel

All files from,_Maurice can be uploaded to Commons (57 files).

Maurice Ravel's works have been in the public domain in France since a 2007 decision by the Cour de cassation (French Supreme Court). See Wikipedia articles for details. About 35 of them were published before 1923, so there is no URAA issue for those. Yann (talk) 12:12, 15 September 2012 (UTC)

Category:Compositions by Maurice Ravel
License {{PD-old}}


Assigned to Progress Bot name Category
  • Waiting for the backlog of this page to clear may take longer than manually uploading 57 images using Special:UploadWizard. Bennylin (yes?) 12:49, 26 February 2012 (UTC)
    • If you had just had a look at the page, or had at least a bit of musical knowledge...; but today I am feeling generous and will not respond with further unhelpful comments. I just ask myself how you could have become a steward with such hasty comments. If you want to help, you could take upload requests or analyze them carefully. Or are you even paid by WMF to advertise UpWiz?
  • Please make some suggestions on how to get good descriptions from the page (including a custom template, categories, ...).
  • Page structure:
  • {{Not-PD-US-URAA}} is not a valid license template (says that right on it!). Works should be verified as being PD or otherwise free in the US before uploading. Otherwise you're just adding to the Commons:WikiProject Public Domain/URAA review workload. cmadler (talk) 12:17, 27 April 2012 (UTC)
    • Pre-1923 works should be tagged with {{PD-1923}} to cover US copyright status. Post-1923 works are probably still copyrighted in the US, and should not be uploaded without investigation into the status. cmadler (talk) 13:35, 17 September 2012 (UTC)
      • Is this tag necessary even for non-US works? Yann (talk) 15:29, 17 September 2012 (UTC)
        • Yes, because works on Commons must be free in both the country of origin and the US. (Right on {{PD-old}}, it says, "You must also include a United States public domain tag to indicate why this work is in the public domain in the United States.") Alternatively, {{PD-old-70-1923}} is a single template covering both the US and French copyright. cmadler (talk) 12:39, 18 September 2012 (UTC)
  • Oppose Actually, now that I look at it, I don't think any of his works are in public domain in France, the country of origin. The Cour de cassation ruling found that the prorogations de guerre (extensions for the two World Wars) were superseded by later copyright laws, but only for non-musical works. Since we're discussing musical works, the prorogations still need to be taken into account. Works published through 1920 get an additional 14 years, 272 days, while works published from 1920 through 1947 (since Ravel died in 1937, this covers all the rest of his works) get an additional 8 years, 120 days. So Ravel's works through 1920 are copyrighted in France until late 2022 (272 days gets you almost to the end of September), while his post-1920 works are copyrighted in France until 2016 (120 days goes to late April). cmadler (talk) 12:49, 18 September 2012 (UTC)
    • The Cour de cassation did not mention the type of works to which its ruling applies. Yann (talk) 13:43, 18 September 2012 (UTC)
      • If I understand correctly, the 2007 Cour de cassation ruling related primarily to the 1997 law, which had extended the normal duration for non-musical works from 50 years to 70 years (but was not cumulative with the war extensions), and dealt specifically with the works of two painters, Monet and Boldini. But musical works had already been extended to 70 years pma in 1985, by the "Lang" law, and in the 2007 ruling, the court found that this law was cumulative with the war extensions ("la loi du 3 juillet 1985 avait porté à 70 ans la durée de protection normale, de sorte que les bénéficiaires des prorogations de guerre applicables à cette date pouvaient prétendre à une durée de protection excédant 70 ans"), but only in the case of composers who had already "acquired" the right (already died, starting the copyright clock) prior to July 1992. Have I misunderstood an aspect of this? cmadler (talk) 16:34, 18 September 2012 (UTC)
      • After its two rulings, the Cour de Cassation summarized the situation in its annual report for 2007. It mentions the particular situation of musical works, in the terms quoted above by cmadler. However, as the 2007 rulings were not about music or Ravel, there are apparently still some arguments about how to interpret and apply the principles, how the term of protection should be computed in the specific case of Ravel and, depending on the result, whether his works are still under copyright in France or in the public domain there. This 2008 article concluded that, at that time, the question was still uncertain but that commentators seemed to lean more toward the theory of the longer term of protection. Anyway, it seems that the SACEM still collects money relating to the author's rights in Ravel's works for uses of those works "à l'étranger" (outside of France, in some countries where the works are still under copyright).[4] I didn't find anything stating clearly whether they still collected fees for uses of Ravel's works in France after 2008. If the works are still under copyright in France, and given the sums of money that would represent, it is somewhat surprising that no litigation is found. It may not help clarify the situation that the money collected from the copyright used to be claimed by a mysterious offshore company, although I suppose that does not affect the term of protection. -- Asclepias (talk) 19:33, 19 September 2012 (UTC)
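
The term arithmetic in cmadler's reading above can be checked with a short sketch. It assumes a 70-year term counted from 1 January following the author's death, with the war extension added on top; whether that reading actually applies to Ravel is exactly what the thread disputes, so this only reproduces the dates, it does not settle the question. The function name `protection_end` is made up for illustration.

```python
from datetime import date, timedelta

def protection_end(death_year, base_years, ext_years, ext_days):
    """Approximate end of protection under the assumptions stated above."""
    # Assumption: the term is counted from 1 January following the death.
    start = date(death_year + 1, 1, 1)
    # Add the whole years first: base term plus the year part of the extension.
    after_years = date(start.year + base_years + ext_years, 1, 1)
    # Then add the day part of the war extension.
    return after_years + timedelta(days=ext_days)

# Ravel died in 1937; extensions as quoted in the discussion.
through_1920 = protection_end(1937, 70, 14, 272)
after_1920 = protection_end(1937, 70, 8, 120)
print(through_1920)  # 2022-09-30
print(after_1920)    # 2016-04-30
```

Both results match the "end of September 2022" and "late April 2016" figures given in the comment.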


While working on the English Wikipedia I stumbled upon the Historic American Buildings Survey/Historic American Engineering Record/Historic American collection several times. This is a huge (350,000+) collection of photographs and drawings of historic buildings in the US. The collection is in the public domain, although it contains some exceptions (I haven't been able to find one). The collection has good metadata like title, author, date and location (awesome for categorization). Every item has a high-resolution TIFF file. I'm using User:Multichill/HABS as a layout template right now. I did some tests. Once the template is fully tweaked, it will be substituted on upload. The images are high-resolution TIFFs, which is of course very nice, but also problematic because TIFF images are not rendered at the moment. The WMF has plans to change that, so I'd rather not upload JPEGs too. Any opinions on this? Multichill (talk) 12:06, 14 January 2012 (UTC)

Decided I'd upload both jpg and tiffs. Did some more tweaking:

Multichill (talk) 16:36, 22 January 2012 (UTC)


So I started this a couple of months ago. Ran into some technical problems and a lot of negative feedback so I decided to waste my time on something else.

  • In the meantime the file size limit was raised to 25 megapixels so I no longer need to upload two images. I'm just going to upload one high-res tiff image.
  • Categorization is probably going to be category:buildings in <county> with fallback to category:<county>
  • Naming of the files is quite rough, need to improve that (too long, too many weird characters)
  • Need to use the json and see what kind of useful information is in there that I'm not using (like coordinates)
  • I need to have a very conservative copyright check to not upload dozens of unfree files.
  • I should probably add a template like {{Maybe US heritage}} that explains that this photo might be on the NRHP or some local registry and asks for the template to be replaced with the right one. If that gets picked up, it would combine really nicely with all the images we took ourselves (for example in Wiki Loves Monuments).

Multichill (talk) 19:38, 24 October 2012 (UTC)
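
A minimal sketch of two of the items in the list above: the category fallback and the file-name cleanup. This is not the bot's actual code. The function names, the `existing_categories` set and the 120-character limit are assumptions for illustration; the stripped characters are ones MediaWiki rejects in titles, plus a few that are merely awkward.

```python
import re

def pick_category(county, existing_categories):
    """Prefer 'Buildings in <county>', falling back to '<county>'."""
    preferred = f"Buildings in {county}"
    if preferred in existing_categories:
        return preferred
    return county

def clean_title(raw_title, max_length=120):
    """Drop awkward characters, collapse whitespace and cap the length."""
    # '#', '<', '>', '[', ']', '|', '{', '}' are invalid in MediaWiki titles;
    # ':', '/' and '\\' are removed here only because they are awkward.
    title = re.sub(r"[#<>\[\]|{}:/\\]", " ", raw_title)
    title = re.sub(r"\s+", " ", title).strip()
    if len(title) > max_length:
        title = title[:max_length].rstrip()
    return title

known = {"Buildings in Wayne County, Michigan"}
print(pick_category("Wayne County, Michigan", known))
print(pick_category("Alger County, Michigan", known))
print(clean_title("Old  Mill [view #2]"))
```

In a real run, the set of existing categories would have to come from the wiki itself rather than from a hard-coded set.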

Everything is about ready. Waiting on en:Wikipedia talk:WikiProject National Register of Historic Places#HABS upload. Multichill (talk) 21:47, 9 January 2013 (UTC)
I'm quite happy to see that User:Fæ took over where I left off at Commons:Batch uploading/Library of Congress. I guess this one can be closed now! :-) Multichill (talk) 10:25, 10 July 2014 (UTC)

Chris's Acorns

I accept that this is a premature request, so please accept my apologies if that's undesirable.

  1. All the (approx. 3000) photos arguably have educational value, so should that be the target? If not, under what criteria should decisions be made?
  2. Some pages (e.g. Acorn Phoebe 2100) contain non-free publicity photographs. How are such photos best tagged for omission?
  3. In order to collate them as a collection, would it be appropriate for them to be put in Photographs by Chris Whytehead (within Photographs by author)? Or should they go elsewhere, as he's apparently a registered Wikipedian?
  4. What are the options for proceeding with the upload and what will be required of Chris?

All advice gratefully received. Thanks for reading. --Trevj (talk) 12:06, 4 October 2011 (UTC)


Assigned to Progress Bot name Category
Smallman12q Bot Request Filed Smallbot (talk · contribs)

Request filed. Smallman12q (talk) 02:20, 17 November 2011 (UTC)

Maritime photo collection

Category:Frederic Logghe Maritime photo collection includes only part of the collection available at the website listed there. The collection itself didn't seem to have grown recently and Commons might be a good place to maintain it in the long term. --07:09, 28. Sep. 2011‎ Docu

Someone should check the licence before a mass import. I am not sure they are all free. I found a lot of pictures with copyright notices, e.g. [5] [6] [7] --Slick (talk) 16:30, 4 August 2012 (UTC)


Assigned to Progress Bot name Category

Images from Caelum Observatory & The Mount Lemmon SkyCenter

Adam Block from The Mount Lemmon SkyCenter has kindly agreed to release a large number of his images with a CC BY-SA 3.0 license. He has done this specifically so they can be used on wiki projects. A .zip containing all of the released images can be found here. I would like to be able to upload them all into a category called 'Images from Caelum Observatory & The Mount Lemmon SkyCenter' or something in that vein. Many of them will be very useful and have high EV. A link to one of his galleries showing the relevant copyright statements can be found here. As there are 200+ files in the .zip file, uploading them all would be very tedious. I would be very grateful if someone could assist me with this matter. Thanks, Originalwana (talk) 13:16, 10 September 2011 (UTC)

It looks difficult to upload the files from the zip with a batch job because information such as descriptions is missing. IMHO it makes more sense to parse the website for images under CC, because it has very useful descriptions. (Example) --Slick (talk) 11:02, 11 August 2012 (UTC)
That would be great but I have no idea how to go about it, do you know how this could be done? Thanks Originalwana (talk) 10:22, 13 August 2012 (UTC)


Assigned to Progress Bot name Category


The site [8] has a great public domain collection of Norwegian manuscripts, for example the entire manuscript oeuvre of Henrik Ibsen (part of the UNESCO heritage).

The objective is to download all the pictures, convert them into one DjVu file per book, and upload the DjVu files to Commons.

Is it possible? Thanks! --M0tty (talk) 19:36, 3 October 2010 (UTC)


Assigned to Progress Bot name Category


All the images / videos from UMich listed in these two directories [9]. If they could all be added to a single category, I will then incorporate them into Wikipedia. --James Heilman, MD (talk) 23:13, 19 July 2011 (UTC)


Assigned to Progress Bot name Category


The owners of ECGpedia have agreed to release their images under a Creative Commons 3.0 license. This applies to all images except those which they are unable to release due to continued non-commercial requirements. There are about 2000 images in all. A list can be found here --James Heilman, MD (talk) 18:51, 13 July 2011 (UTC)

All images are licensed as "Creative Commons Attribution Noncommercial Share-Alike". Wikimedia commons does not allow "Noncommercial" licenses, so unless ECGpedia re-license their images we are not going to be able to use them. If they re-license that will need to be marked on the individual images themselves or through OTRS, which will list which images are covered. --Jarekt (talk) 19:36, 13 July 2011 (UTC)
Yes they have agreed to re-release the images under a license that allows commercial use. So the images will need to be marked as such.--James Heilman, MD (talk) 20:23, 13 July 2011 (UTC)
Here is the OTRS Ticket#2011102310008874 There are about 3000 ECGs and 700 echo images. --James Heilman, MD (talk) 13:55, 23 November 2011 (UTC)


Assigned to Progress Bot name Category
Smallman12q Smallbot

The site offers 3251 free high-resolution images and 2544 free vector symbols licensed under CC BY 3.0. --Leyo 06:59, 21 June 2011 (UTC)

A worthy set of images. I was able to download 2546 SVG files in a single ZIP file, but matching it with metadata is more challenging. --Jarekt (talk) 03:37, 29 June 2011 (UTC)
Are the file names in the ZIP file self-explanatory or rather meaningless? --Leyo 09:07, 29 June 2011 (UTC)
Filenames identify source and have few words about content, see for example here. For SVG files, I think we need to write some scraping software to create a spreadsheet with:
  • "Author" and "Author Company"
  • Title and description
  • "Date created"
  • URL (to be used to link back to the source image)
  • "Album name" and "Keywords" can be useful for choosing categories
  • "Filename" (to match it with the downloaded file)
I am at the moment rather busy with Commons:Batch uploading/Web Gallery of Art but if someone can gather the metadata I can upload the files. --Jarekt (talk) 15:05, 29 June 2011 (UTC)
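
As a rough sketch of the spreadsheet layout proposed above, the following writes one CSV row per file using the listed fields. The scraping step itself is left out because it depends on the site's page structure, which is not described here. The column names and the `write_metadata` helper are assumptions for illustration only.

```python
import csv
import io

# Column names are hypothetical; they mirror the fields listed in the thread.
FIELDS = ["filename", "author", "author_company", "title", "description",
          "date_created", "url", "album_name", "keywords"]

def write_metadata(records, out):
    """Write one row per image; fields missing from a record become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        writer.writerow({field: record.get(field, "") for field in FIELDS})

# Made-up sample record, only to show the shape of the output.
sample = [{"filename": "example.svg", "title": "Example symbol", "keywords": "a; b"}]
buffer = io.StringIO()
write_metadata(sample, buffer)
print(buffer.getvalue())
```

Keeping the file name as the first column makes it easy to match each row against the files extracted from the ZIP archive.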
According to our discussion here (in German), these files additionally need to be fixed so that numbers that omit the leading zero (like .12345) include it ( ---> 0.12345), otherwise Wikipedia's renderer doesn't parse them correctly. (The substitution can also be of the type -.12345 ---> -0.12345.) This is just in case someone very suddenly rushes in to upload these :) Iridos (talk) 23:24, 4 July 2011 (UTC)
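
The substitution described above could be sketched with a regular expression like the one below. It is meant to be applied to the value of a numeric attribute (path data, coordinates), not to the whole file, since a blanket replace would also touch things like file names containing ".2". The function name `fix_leading_zeros` is made up for illustration, and compact forms such as "1.5.5" (two numbers sharing no separator) are not handled.

```python
import re

# A decimal point that is followed by a digit but not preceded by one
# marks a number written without its leading zero.
MISSING_ZERO = re.compile(r"(?<!\d)\.(?=\d)")

def fix_leading_zeros(value):
    """Insert the omitted zero: '.5' -> '0.5', '-.25' -> '-0.25'."""
    return MISSING_ZERO.sub("0.", value)

print(fix_leading_zeros("M.5-.25L1.5,.75"))  # M0.5-0.25L1.5,0.75
```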
All of the SVGs in this library were originally created with Illustrator, although most were run through SCOUR, which I now see strips leading zeros. Does anyone know of any other SVG parsers that have a problem without leading zeros? — Preceding unsigned comment added by Adrianbj (talk • contribs)
All the SVG files already contain DC metadata. There is also an online spreadsheet and excel version of metadata available.
Links to searchable database of all images/symbols and custom download builder for all the symbols in SVG, AI and PNG in a zip archive.
Just read through a translated version of the German discussion. Not sure why that virus didn't rasterize well. The PNG previews and downloadable versions on the IAN website were all created automatically with iMagick and rSVG, although problems like the ones you are seeing did occur with various older versions of iMagick and rSVG. — Preceding unsigned comment added by Adrianbj (talk • contribs)

It seems they changed their licensing terms; the new license doesn't allow redistribution or sales, which makes it unacceptable for Commons. I guess the upload could still happen since CC licenses are irrevocable, but I imagine they wouldn't appreciate it much. The best solution would be for someone to contact them and ask them to change it back to CC BY. InverseHypercube 07:51, 18 February 2012 (UTC)

Sorry about the licensing change - we do rely on this resource to bring traffic to our website, so we would really appreciate honoring of our new license. Thanks. — Preceding unsigned comment added by Adrianbj (talk • contribs)
Thanks for commenting! However, if the images are licensed under CC-BY, we would be required to attribute (and link back to) your website, so no traffic loss would occur. In fact, since CC-BY would allow us to transfer images from your website, having your images on high-traffic sites such as Wikipedia would increase hits to your site, since they would all link to it. InverseHypercube 04:37, 8 March 2012 (UTC)
A custom license tag such as in Category:Custom license tags might be used. It might contain a link to the website and or a direct link to the respective image (example). --Leyo 09:32, 8 March 2012 (UTC)
I'd like to make the preview sized versions of all our images (photos and vector illustrations) available on Commons with a custom license tag (and attribution) and a direct link to the respective image on our site where users can register (free) and download the full resolution / vector (SVG) versions. Almost all the photos (JPG) have metadata embedded. All the SVG files also have metadata, but the preview PNGs do not because of the metadata issues with the PNG format. As I mentioned above, there is an automated spreadsheet available from our site with all the metadata. We are constantly adding images to the library. Is there any possibility to automatically update commons if I create a web service (XML/JSON) of all the images and metadata?
That sounds great, and it can definitely be done. However, as I understand copyright law, by licensing the previews under CC-BY, for example, you would also be licensing the SVG files under the same license, since they do not meet the threshold of originality over the previews. While we might only upload the previews, I don't think you could stop others from distributing the SVG files. InverseHypercube 17:35, 8 March 2012 (UTC)
I guess I was thinking of a custom license, rather than CC-BY, as suggested by Leyo. Would that work?
I still think you would be effectively licensing the SVG files under the same license. InverseHypercube 17:49, 8 March 2012 (UTC)
I understand what you are saying, but if the custom license says that users cannot redistribute or sell and that they must provide attribution even for the preview PNGs, would that work? Maybe this is too problematic for posting to Commons? We are actually also wanting to add the option for users to purchase the right to use our images without attribution, because at the moment, there are many cases when they can't use them due to the attribution requirement. We think this dual licensing model will make them more useful for more people. I'd be curious if anyone has any further suggestions.
No, that wouldn't be allowed on Commons. See Commons:Licensing#Acceptable licenses; non-commercial licenses are not permitted. However, if you licensed the SVGs under a license that required attribution for redistribution, it would apply to the PNGs too. InverseHypercube 22:14, 9 March 2012 (UTC)
Looks like I was wrong about not being able to license the preview images and the vector versions under separate licenses; the community consensus seems to be that you can. See Commons:Village_pump/Copyright/Archive/2012/01#CC_BY-SA_3.0_and_the_original_image_quality. InverseHypercube 18:38, 14 March 2012 (UTC)