Commons:Batch uploading


This page has a backlog that requires the attention of experienced editors.
Please remove this notice if it won't be needed in the future.



Shortcut
[[COM:BATCH]]

Commons Batch Uploading is a project to centralize the uploading of collections of files whose creators or rights holders have released them as PD or under any Commons-compatible license. Each request is assigned to a bot operator, who works out how it will be fulfilled. (To upload batches from Flickr, please make requests at Commons:Flickr batch uploading.)

See w:Wikipedia:Public domain image resources for potential future batch uploads.


Scripters

Tools

Scripts, Examples and Information

New requests

Defence Imagery

  • Source to upload from: http://www.defenceimagery.mod.uk/
    • Did you observe a URL pattern: Yes, of a sort. Every file has an ID, but it's extremely long, and the process is even more complicated because only some of the files on the site are OGL-licenced - the rest are copyrighted. This is because the OGL licence is 'opt-in' for the MoD.
    • Do you know whether the site has an API: Not as far as I am aware, no. It may have one available for members of the press, but not projects like ours...
    • What else can ease uploading (is the site valid XHTML, WCM they use…)? I don't understand the question, I'm sorry. If it helps, the images have extremely detailed metadata.
    • Did you contact the site owner? No, but I can do the legwork if it would make it easier.
  • Describe the works to be uploaded in detail (audio files, images by …): All of the images at http://www.defenceimagery.mod.uk/fotoweb/Grid.fwx?archiveId=5042 - Archive 5042 is, I believe, the archive of OGL-licenced images. There are a variety of authors, but they are all high-quality JPG images. Unfortunately, the archive also updates every few days with new pictures as they're uploaded. Ideally, a bot would need to scrape this site maybe once a week.


  • Which license tag(s) should be applied? The {{OGL}} licence.


  • Is there a template that could be used on the file description pages? Do you think a special template should be created? I would be happy to create a special template.


Opinions

Assigned to Progress Bot name Category

Rijksmuseum

The renovated en:Rijksmuseum in Amsterdam has made its digital collection of 111,000+ objects available under a CC-0 license (https://www.rijksmuseum.nl/en/api/terms-and-conditions-of-use). An API key is needed for downloading (https://www.rijksmuseum.nl/en/api). According to the museum:

"All object descriptions available via this API are covered by a Creative Commons 0 licence. The images are in the Public Domain, according to which the data and the images are free of rights and may be copied, changed, distributed or exported without the Rijksmuseum’s permission."

Sandstein (talk) 20:40, 7 April 2013 (UTC)

  • Describe the works to be uploaded in detail (audio files, images by …): Presumably the entire collection is of use. According to https://www.rijksmuseum.nl/en/api/instructions-for-use: "The Rijksmuseum API Collection is a set of more than 110,000 descriptions of objects (metadata) and digital images from the Rijksmuseum collection. The works of art and implements in the set date from ancient times through to the late 19th century and provide an excellent overview of the richness, diversity and beauty of the Dutch and international heritage. Unfortunately, copyright restrictions mean that we are not yet able to include any works from the 20th or 21st centuries. The set includes paintings and prints (ranging from the great masters of the Golden Age through to anonymous biblical paintings and other painted objects from the Middle Ages), 19th-century photographs, ceramics, furniture, silverware, doll’s houses, miniatures, etc. Digital photographs were taken of all of the objects in this set."
  • Is there a template that could be used on the file description pages? Do you think a special template should be created? Museum:Rijksmuseum. Also, the following should probably be taken into account even though we are not an app: "In all apps to be built in which images belonging to the Rijksmuseum are used, app designers will credit these as having been built with the API of the Rijksmuseum, including images and documentation. The credit must be placed where it can be seen easily by users. App-builders will credit all images with the words ‘Rijksmuseum collection’."

Opinions

  • I'm quite aware of this awesome collection. I haven't uploaded it yet because we're planning to use it as a pilot for the Commons:GLAMToolset project. Not sure exactly when this will happen, probably in the next few months. Multichill (talk) 10:42, 13 April 2013 (UTC)
Assigned to Progress Bot name Category

Los Angeles County Museum of Art

  • Source to upload from:

collections.lacma.org

  • Describe the works to be uploaded in detail (audio files, images by …):

« The Los Angeles County Museum of Public Art has released some 20,000 PD images of their collection ([1], example: [2]). » Jean-Fred (talk) 14:00, 13 March 2013

  • Which license tag(s) should be applied?


  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Opinions

  • Unless someone wants to pick this up early, I would be happy to look at it in a few weeks, it seems right up my alley. -- (talk) 17:00, 16 March 2013 (UTC)
  • How will it be done? Not all of the works are PD. I think LACMA could create an XML file for us. Dominikmatus (talk) 06:20, 20 March 2013 (UTC)
    • An easy test seems to be whether the image is marked as "Image not zoomable due to copyright restrictions"1, has a copyright note (looking like <div class="field-name-field-copyright-text">© John Baldessari</div>[3]) or whether it has a download link. The text does not quite match the Terms of Use[4], and for that reason I would email LACMA just to explain what Wikimedia Commons is and which photographs would be uploaded. I doubt that LACMA could offer much more than is already in the online gallery, as curator and conservation notes and so forth are likely to have unclear copyright. (BTW some images have extensive curator notes available.[5] I would check whether the full text is intended to be reusable, off-limits as "Protected Content" as defined in the Terms of Use, or whether a limited extract might be okay, such as the first 50 words, as I have done with other batch uploads.) -- (talk) 07:52, 20 March 2013 (UTC)
    • An easy filter in LACMA's website search is "has_unrestricted_image", so this might either be better than the above checks or be run in addition to them. See this example search http://collections.lacma.org/search/site/?f[0]=bm_field_has_image%3Atrue&f[1]=im_field_chronology%3A14337&f[]=bm_field_has_unrestricted_image%3Atrue for unrestricted images of ancient artefacts (1,816 images); in practice the upload might usefully be staged by chronology as by default this starts with the least possibly contentious in terms of copyright. -- (talk) 09:09, 21 March 2013 (UTC)
  • Question: I am close to finishing a nice mapping using BeautifulSoup to generate image description pages, but I have a problem with the way LACMA appear to have "updated" their website. We have a prior upload of an 18th C. waistcoat at a very high resolution of 4,000x6,000+ px. The original source is at [6] but I cannot see a way of getting from the current catalogue entry [7] to that old version. The new system shows 5 images: the first duplicates the old upload but is half the resolution (when the expanded button is selected), whilst the other 4 are good detail shots that appear clipped from the high resolution one we already have. Unfortunately id=159291 is the only relevant old reference and there is no mention of that number anywhere on the new catalogue entry, or in the ids for its images. -- (talk) 22:37, 21 March 2013 (UTC)
I think we should email LACMA about this problem. It is not good (for SEO) to change URLs without redirection. Dominikmatus (talk) 09:42, 22 March 2013 (UTC)
Yes, I was coming to the conclusion that should be my next step. It might not be solvable technically, so if LACMA cannot help, or don't have the time to, then the solution might be to go ahead with the batch upload even if a few files will be scaled down (but still high quality) duplicates of some high resolution photos we already have. I'll do my best not to be left in that position and I'll start drafting up an email - no hurry as I don't expect a same day answer from the museum on a Friday. -- (talk) 10:29, 22 March 2013 (UTC)
I have written today to the web contact at LACMA and asked about how to use the d/b id to track down the large resolution image and whether it is okay to scrape the text from the catalogue entry (such as curator notes). -- (talk) 10:35, 26 March 2013 (UTC)
I did a bunch of those high-res downloads (by hand). I'll be interested to see what LACMA says. - PKM (talk) 01:12, 6 April 2013 (UTC)
No response yet. I might get on with an initial batch for testing as soon as the mobile upload problem is resolved, rather than expecting a reply. -- (talk) 01:56, 6 April 2013 (UTC)
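
The first two checks suggested in this thread (the "not zoomable" notice and the copyright note) could be sketched roughly as follows, using only the Python standard library rather than BeautifulSoup. This is a minimal sketch, not the code used for the upload: the class name and the notice text are taken from the comment above, the sample HTML snippets are invented for illustration, and the download-link check is left out because its markup is not described here.

```python
from html.parser import HTMLParser

# Strings quoted in the discussion above.
COPYRIGHT_CLASS = "field-name-field-copyright-text"
NOT_ZOOMABLE = "Image not zoomable due to copyright restrictions"


class RestrictionScanner(HTMLParser):
    """Record whether a catalogue page shows either restriction marker."""

    def __init__(self):
        super().__init__()
        self.has_copyright_note = False
        self.has_not_zoomable_text = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and COPYRIGHT_CLASS in classes:
            self.has_copyright_note = True

    def handle_data(self, data):
        if NOT_ZOOMABLE in data:
            self.has_not_zoomable_text = True


def looks_restricted(html):
    """Return True if the page carries a copyright note or the notice text."""
    scanner = RestrictionScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.has_copyright_note or scanner.has_not_zoomable_text


# Hypothetical page fragments, for illustration only.
restricted_page = '<div class="field-name-field-copyright-text">© John Baldessari</div>'
open_page = '<div class="field-name-title">Waistcoat</div>'
```

A page that passes this check would still need the "has_unrestricted_image" filter or a manual look before upload; the sketch only rules pages out.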
Job | Assigned to | Progress | Links
Code and initial batch (some ancient artefacts) | (talk) | In progress |
Resolve multiple view artefacts | (talk) | In progress |
Inform LACMA | (talk) | Done |
Create digestion template | (talk) | Pending |
Complete upload | (talk) | Pending |
Promote to community | (talk) | Pending |

Fonds Ancely

This upload is part of a partnership between Wikimédia France and the Library of Toulouse. It consists of 2085 public domain files. You may see general notes and work in progress on User:Jean-Frédéric/Ancely.

The metadata is held in an OAI-PMH repository. The code explores it and retrieves records; then, where applicable, the various fields are matched against a manual, community-curated alignment with Commons categories and tags. The result is fed to a data ingestion template, which translates the metadata to {{Artwork}}. The actual upload is made with Pywikipedia-rewrite by User:AncelyBot.
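
As a rough illustration of the record-to-category step, here is a minimal sketch that parses an OAI-PMH ListRecords response and looks each subject up in an alignment table. It is not the project's code. It assumes the records are exposed as Dublin Core (oai_dc), which is not stated above; the sample record, its identifier and the alignment entry are made up for the example.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response with a single invented record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Vue des Pyrénées</dc:title>
          <dc:subject>Montagne</dc:subject>
          <dc:subject>Inconnu</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical alignment: source keyword -> Commons categories.
ALIGNMENT = {"Montagne": ["Mountains in art"]}


def records_to_categories(xml_text, alignment):
    """Map each record identifier to the categories its subjects align to."""
    root = ET.fromstring(xml_text)
    result = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        categories = []
        for subject in record.iterfind(".//dc:subject", NS):
            # Subjects without an alignment entry contribute nothing.
            categories.extend(alignment.get(subject.text, []))
        result[identifier] = categories
    return result
```

Unmatched subjects are silently skipped here; a real run would probably log them so the alignment can be extended.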

In its current state, the categorisation system with the alignment outputs 31,801 categories (1,694 distinct) − the drawback is that many are high-level categories (“Shawls”, “men”, etc.).

Looking forward to your thoughts, Jean-Fred (talk) 22:49, 6 March 2013 (UTC)

Opinions

  • Uploaded five more − see Special:ListFiles/AncelyBot Jean-Fred (talk) 01:14, 16 March 2013 (UTC)
  • Uploaded fifteen more − and I will continue uploading files until my demands are met! Jean-Fred (talk) 00:23, 19 March 2013 (UTC)
  • Support: everything looks fine to me (maybe a bit overcategorised). --PierreSelim (talk) 14:24, 20 March 2013 (UTC)
  • Ok, uploading 100 right now. Jean-Fred (talk) 21:06, 11 April 2013 (UTC)
  • Looks very good. The only thing that worries me a bit is the number of categories per image. That might become a problem. Please upload more! Multichill (talk) 10:39, 13 April 2013 (UTC)
  • Oppose for now: we have forgotten to finish the Creator mapping at User:Jean-Frédéric/Ancely/Creator. --PierreSelim (talk) 12:02, 25 April 2013 (UTC)
  • Uploaded the first 350. Jean-Fred (talk) 23:08, 7 May 2013 (UTC)
  • Uploaded the first 500. Jean-Fred (talk) 13:04, 8 May 2013 (UTC)
  • Uploaded the first 800. Jean-Fred (talk) 14:05, 10 May 2013 (UTC)
  • Made it 1,000. Jean-Fred (talk) 23:15, 12 May 2013 (UTC)
  • ✓ Done. 2041 files uploaded + 33 dupes + 11 errors = 2085 files, the size of the corpus. Jean-Fred (talk) 14:49, 24 May 2013 (UTC)

Dupes

The following files were already on Commons − we might want to update their file descriptions (current: 33)

Errors

The following files failed to upload (current: 11)

Categorisation statistics

Per category

30,266 categories, 1,760 distinct. Mean: 17.1965909091; Median: 2.0; Max: 1045; Min: 1

Top 10: [('Mountains in art', 1045), ('Men in art', 992), ('Women in art', 878), ('Trees in art', 780), ('Houses in art', 736), ('Pyrénées-Atlantiques', 693), ('Hautes-Pyrénées', 617), ('Pyrenees', 470), ('National costumes in art', 468), ('Rivers in art', 440)]

Bottom 10: [('Estrades', 1), ('Pierre Bayle', 1), ('Morlaàs', 1), ('Louis-François Couché', 1), ('Jean Racine', 1), ('Faience in France', 1), ('Marmite', 1), ('Corsica', 1), ('Dordogne River', 1), ('Esera River', 1)]

Per file

Mean: 14.5160671463; Median: 13.0; Max: 47; Min: 0

Top N: [('B315556101_A_LEVASSEUR_066', 47), ('B315556101_A_LEVASSEUR_068', 46), ('B315556101_A_LEVASSEUR_018', 44), ('B315556101_A_LEVASSEUR_056', 42), ('B315556101_A_LEVASSEUR_057', 42)]

Bottom N: [('B315556101_A_BERTHIER_010', 1), ('B315556101_A_BERTHIER_024', 0), ('B315556101_A_BERTHIER_021', 0), ('B315556101_A_BERTHIER_018', 0), ('B315556101_A_BERTHIER_013', 0)]
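
For anyone wanting to reproduce figures of this kind, a minimal sketch follows. The toy per-file counts are invented; only the totals (30,266 assignments, 1,760 distinct categories, 2,085 files) come from this page. The two reported means appear consistent with dividing the total by the number of distinct categories and by the corpus size respectively, but that reading is an inference, not something the statistics state.

```python
from statistics import mean, median


def summarise(counts):
    """Return mean, median, max and min of an iterable of counts."""
    values = list(counts)
    return {
        "mean": mean(values),
        "median": median(values),
        "max": max(values),
        "min": min(values),
    }


# Hypothetical toy data: number of categories assigned to each file.
categories_per_file = {"file_a": 3, "file_b": 0, "file_c": 1}
toy = summarise(categories_per_file.values())

# Cross-check of the reported means from the totals quoted on this page.
total_assignments = 30266
mean_per_category = total_assignments / 1760  # distinct categories
mean_per_file = total_assignments / 2085      # size of the corpus
```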

Assigned to | Job | Progress
Jean-Frédéric | Metadata pre-processing | Done
Jean-Frédéric, Symac, Léna, PierreSelim | Metadata alignment | Done
User:Jean-Frédéric | Upload | Done
 | Dupes and errors processing | To do

South African churches

User af:Gebruiker:Morne has uploaded hundreds of perfect images of buildings in South Africa (mostly churches) to the Afrikaans Wikipedia, all under the same licence: "you are free to use, copy, modify, if you properly credit the author" (see an example). I consider it important, as there are unfortunately relatively few images of South African cities, towns and villages in Wikipedia. --Dmitri Lytov (talk) 03:21, 3 March 2013 (UTC)

  • Describe the works to be uploaded in detail (audio files, images by …):

It's a collection of several hundred images of churches in South Africa.

  • Which license tag(s) should be applied?

"you are free to use, copy, modify, if you properly credit the author" (see an example).

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

Sorry, no idea.

Opinions

Assigned to Progress Bot name Category

LSH

  • Livrustkammaren och Skoklosters slott med Stiftelsen Hallwylska museet (COM:LSH):
    • Each image carries a unique identifier which may be linked to a URL (although these aren't live yet)
    • No API
    • They are donating the images and together with these the associated metadata. They've also done some preliminary matching between keywords and Commons categories as well as between artist/events/depicted people and (Swedish) Wikipedia pages.
    • It's a collaboration so yes!
  • Describe the works to be uploaded in detail (audio files, images by …):

This is a collection of approx. 20,000 high-resolution photographs in TIFF format of the objects held in the collections of these three museums. The files are all smaller than 500 MB but may be larger than 100 MB; resolution is less than 25 megapixels (relevant with respect to Commons:Maximum file size).

  • Which license tag(s) should be applied?

All of the depicted objects are owned by the museum and PD-old. The photographs themselves are either old enough to be PD-Sweden-photo or released as either CC0 or CC-BY-SA (don't know which version yet). {{LSH license}} has been prepared for this purpose.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

{{LSH_artwork}}

Opinions

Having looked around a bit, it looks as though chunked uploads might not be integrated into the pywikipediabot framework. Does anyone know anything more about this, or whether there is a practical workaround? I can ask them to downsample the images, but that seems a waste when they've offered us high-res. /André Costa (WMSE) (talk) 09:32, 7 February 2013 (UTC)

Seems it is not integrated indeed :-/ source. Jean-Fred (talk) 12:23, 7 February 2013 (UTC)
Yes I saw that one. Since it was close to a year ago though I was hoping that the situation had changed since then and that I had somehow missed the follow-up e-mail. /André Costa (WMSE) (talk) 08:17, 8 February 2013 (UTC)

If you want to start at a low level, you can construct your XMLHTTP requests yourself. There is a sample showing how a chunked upload works and what the XHR should look like at mw:API:Upload. On Windows, I used Fiddler2 to check that everything worked as it should. If you like I can supply my VB(A) classes, but I guess you are on Linux. -- Rillke(q?) 18:18, 22 March 2013 (UTC)

The chunked uploads issue has been resolved thanks to this py-script by Smallman12q. /Lokal_Profil 13:10, 6 May 2013 (UTC)
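
For readers unfamiliar with chunked uploading, the general idea can be sketched as below. This is not the script mentioned above. It only splits a file into chunks and builds the form fields for each request; no network call is made and token handling is omitted. The field names (action, stash, filename, filesize, offset, filekey) follow my reading of mw:API:Upload and should be checked against that page; the file name and payload are placeholders.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield (offset, data) pairs covering the stream in order."""
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            return
        yield offset, data
        offset += len(data)


def chunk_params(filename, filesize, offset, filekey=None):
    """Form fields for one chunk request (chunk bytes and token go alongside)."""
    params = {
        "action": "upload",
        "format": "json",
        "stash": "1",
        "filename": filename,
        "filesize": str(filesize),
        "offset": str(offset),
    }
    # Every request after the first refers to the key returned by the server.
    if filekey is not None:
        params["filekey"] = filekey
    return params


payload = b"abcdefghij"  # stand-in for a large TIFF
chunks = list(iter_chunks(io.BytesIO(payload), chunk_size=4))
first = chunk_params("Example.tif", len(payload), chunks[0][0])
```

Once all chunks are stashed, a final request with the file key and no stash flag would publish the file; that step is left out here.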

Finally got a proper grip on the metadata preparations and the first example information templates can now be seen at /Examples. There are a few more bits of information to include but most of the final product should be there. To make connections to commons categories and templates I largely rely on the following lists:

  • /Keywords (used to match the keywords used for the images with categories on Commons)
  • /People (used to identify artists, depicted people and provenance/owners)
  • /Events (identifies categories related to historical events)
  • /Places (identifies places and links these using city/country-templates. Also identifies institutions which are linked to page/institution-template)
  • /Materials (identifies materials/techniques and links these using the technique template)
  • /ObjKeywords [provisional] (used to match keywords used for object descriptions with categories on Commons, to be merged with earlier information here and be sorted by frequency)

Feel free to help out in expanding these (although I expect a basic knowledge of Swedish is required). As the last few fields are tidied up there should be one or two more such lists appearing. I might also try some automated matching for a few of these.

The filenames will likely take the form <description> - <museum> - <identifier/museum system filename>.tif. The descriptions used for the filenames can be viewed at /Filenames. I've strived to keep the descriptions shorter than 100 characters with a hard limit of 128 characters. These limits as well as other filters for the descriptions can easily be changed based on feedback.
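
A minimal sketch of how such a filename could be assembled is given below. It is only an illustration of the pattern and limits described above, not the bot's code: the trimming rule (cut at the last word boundary within the 100-character target, fall back to a plain cut at 128 characters when there is no space to cut at) is an assumption, and the identifier in the example is made up. The later update on this page notes that the pattern was changed slightly, so treat this as the earlier form.

```python
def shorten(description, soft_limit=100, hard_limit=128):
    """Trim a description to the soft limit at a word boundary if possible."""
    text = " ".join(description.split())  # collapse runs of whitespace
    if len(text) <= soft_limit:
        return text
    cut = text.rfind(" ", 0, soft_limit + 1)
    if cut <= 0:
        # No word boundary available: enforce the hard limit only.
        return text[:hard_limit]
    return text[:cut]


def build_filename(description, museum, identifier):
    """Assemble '<description> - <museum> - <identifier>.tif'."""
    return f"{shorten(description)} - {museum} - {identifier}.tif"


# Hypothetical example values.
example = build_filename("Sword", "Livrustkammaren", "12345")
```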

Having had a closer look at the data, one or two parameters in the artwork template may also change. /Lokal_Profil 06:45, 13 May 2013 (UTC)

An update
Most of the code preparations are now done with the main remaining step being translations/mappings of the above lists. The filename pattern was changed slightly and an "event" parameter was added to the Artwork template. All of these changes can be seen at /Examples. I'm going to try and do the most frequent translations/mappings and will then do a few test uploads (unless someone suggests otherwise). /Lokal_Profil 22:39, 19 May 2013 (UTC)
First two actual files are now up at
Still lots of unmatched parameters kicking around thus the lack of categories. Feel free to help out with any of the subpages above. /Lokal_Profil 14:46, 24 May 2013 (UTC)
Assigned to | Progress | Bot name | Category
Lokal_Profil | | LSHuploadBot | Category:Images from Livrustkammaren och Skoklosters slott med Stiftelsen Hallwylska museet

National Gallery of Art

Jean-Fred (talk) 14:19, 18 January 2013 (UTC)

  • Source to upload from: National Gallery of Art online database, per their open access policy
    • Did you observe a URL pattern
    • Do you know whether the site has an API
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?
      See here, they welcome the idea
  • Describe the works to be uploaded in detail (audio files, images by …):

Artwork digitisations

  • Which license tag(s) should be applied?

Existing uploads seem to rely on {{PD-author|National Gallery of Art}} or {{PD-art|PD-old-100}}.

I guess a custom wrapper for {{Licensed-PD-Art|PD-old-whatever|{{PD-author|National Gallery of Art}}}} would do the trick (in the spirit of {{Walters Art Museum license/2D}})

Went ahead and created {{PD-Art-National Gallery of Art}}. Jean-Fred (talk) 14:47, 18 January 2013 (UTC)


Opinions

Assigned to Progress Bot name Category

US Army Research Laboratory Eniac

  • Describe the works to be uploaded in detail (audio files, images by …):
    Images (PNGs are high-res; there are also lo-res GIFs and JPGs); those which are photos should ideally be converted to JPG.
    I only count around 20 such images, so please state if that's too few for a batch upload to be considered.
    There are a few duplicates within Category:ENIAC, but I gather that the batch proposal equivalents are generally of better quality.
  • Is there a template that could be used on the file description pages? Do you think a special template should be created?
    Not that I know of... possibly something about ENIAC

Opinions

Assigned to Progress Bot name Category

11k Aerial Photos

In the course of the German Wikimedia aerial photo project, de:Wikipedia:Projekt Fotoflüge, I wrote an article for a pilots' magazine. After that I got in contact with a pilot who wants to share the aerial photo collection he has created over the past 24 years. It seems that all photos are already geo-referenced and classified (by type, such as solar power plant or church, as well as by region, such as Europe, Andalucia, Sanlucar). The classification and the geo-reference are stored in the EXIF data of the images. During a manual upload the geo-reference was recognized correctly by Commons. Because of the large number of pictures, it would be good if there were some way to automate the upload and, if possible, to match the classification of the pictures to Commons categories. I have no idea if or how this is possible, and it would be great to get some information on whether it is, or some help with this request. The classification is sometimes in German and does not match the Commons categories. The pilot has already created a Wikipedia / Commons user account and uploaded one example file, where you can see how the data is stored within the EXIF data.

  • Source to upload from:

The files are on a computer of the pilot / photographer.

    • Did you observe a URL pattern
    • Do you know whether the site has an API
    • What else can ease uploading (is the site valid XHTML, WCM they use…)?
    • Did you contact the site owner?

Not the site owner but the photographer User:Graf-flugplatz

  • Describe the works to be uploaded in detail (audio files, images by …):

About 11,000 digital aerial photos should be uploaded.

Solarthermie Kraftwerk 100919005.JPG
  • Which license tag(s) should be applied?

Has to be clarified with Author, but expect "CC BY-SA 3.0" like the example.

Update 18.12.2012: License "CC BY-SA 3.0" is approved by Author User:Graf-flugplatz.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?


Opinions

Nice sample images. I'm from Germany too and I would like to help, but not before the end of April 2013, because I am away and busy. If that is OK, just wait ... --Slick (talk) 17:26, 9 January 2013 (UTC)

OK, how can I get the images to upload? I would like to have them here, so I can check the tags they have and try to find the best categories for them. Possible solutions are that I download them all from a source, or you send them to me on CD/DVD (I am from Germany too). You can contact me (in German, please) here about this. Additionally, I suggest that the pilot (or you) put some minimal content on his user page for others who are interested in the source/creator (i.e. the same information as in this request). --Slick (talk) 08:51, 6 February 2013 (UTC)

Assigned to | Progress | Bot name | Category
Slick | Waiting for user response... | |

Garden of the Victory in Chelyabinsk

  • Source to upload from:

User Ain92 asked me to upload some photos with Panoramio Picker, but I have never done it and found that it's too complicated to learn in the near future. So I am asking for all photos from this page and the 2nd-9th photos from this page (they are cc-by) to be uploaded to category:Garden of the Victory in Chelyabinsk. Анастасия Львоваru (ru-n, en-2) 07:03, 11 December 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

AELG

  • Describe the works to be uploaded in detail (audio files, images by …):

We want to upload free images from the AELG website, because it has galleries of Galician writers. Some photos in the authors' galleries, those by the photographers Eduardo Castro Bal and Santos-Díez, are under a CC-BY-SA license.

There is an index of authors here and this is an example of the gallery of a writer. The individual photos have a URL like this.

  • Which license tag(s) should be applied?

The images are CC-BY-SA, some from Eduardo Castro photographer and others from Santos-Díez photographer.

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

There is a template to use with the photos: {{AELG}}. There is a category too. Bye, --Elisardojm (talk) 00:14, 28 September 2012 (UTC)

Opinions

Comment: Could somebody review whether this work can be carried out, or whether more information is needed? Thanks, --Elisardojm (talk) 22:14, 9 November 2012 (UTC)

Looks good and no more information is needed yet. But it can usually take some time to carry out. Just wait ... --Slick (talk) 16:25, 9 December 2012 (UTC)
OK, if somebody needs more details or is going to try to carry out this task, I would appreciate being notified on my talk page. Thanks!, --Elisardojm (talk) 09:54, 11 December 2012 (UTC)
I'll do the upload tmrw. Smallman12q (talk) 03:19, 22 January 2013 (UTC)
If you need more information or details about this task, you can ask me. Thanks!, --Elisardojm (talk) 14:16, 22 January 2013 (UTC)

✓ Done I've completed the upload... ~800 uploaded. Some, such as File:Valentín_Arias_(AELG)-1.jpg, aren't thumbnailing... but they work fine in Firefox and show metadata, so it's a bug on the wiki side. Cheers. Smallman12q (talk) 21:22, 22 January 2013 (UTC)

The image rendering bug is being looked into at w:Wikipedia:Village_pump_(technical)#Images_not_rendering. Smallman12q (talk) 23:49, 23 January 2013 (UTC)
Image rendering bug was resolved. Fixed issue with spacing in direct links brought up at User_talk:Smallman12q#AELG_photo.27s_upload.

Smallman12q (talk) 04:39, 26 January 2013 (UTC) Per User_talk:Smallman12q#AELG_photo.27s_upload, added author link:

Smallman12q (talk) 03:08, 7 February 2013 (UTC)

Assigned to | Progress | Bot name | Category
User:Smallman12q | Done | User:Smallbot | Category:Images from AELG

Gerald R. Ford Presidential Library and Museum

The Ford Presidential Lib/Museum is a federal archives, part of NARA. We'd like to create a partnership with Wikimedia:Commons and get all of our digitized material up. All materials are in the public domain. Agency management is on board, and we have a team already working on this! I've been uploading materials one-by-one, I've gotten about 170 images uploaded - see Commons:Gerald R. Ford Presidential Library and Museum - I figure it should take me til oh, 2215 to get everything up! We're looking for an administrator to work with and develop a plan. Bdcousineau (talk) 18:50, 5 September 2012 (UTC)

See Commons:Gerald R. Ford Presidential Library and Museum for current progress. Smallman12q (talk) 23:26, 17 September 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Rudolf Steiner Gesamtausgabe

The following site offers all works of Rudolf Steiner's Gesamtausgabe (complete edition, public domain) as scans in citable editions. A transfer to Wikimedia Commons was discussed and requested here.

http://bdn-steiner.ru/modules.php?name=Ga

  • I am downloading the files and preparing them for upload. Which is the correct licence template in this case? I guess PD-old. Is that enough, or is a second one needed? --Slick (talk) 21:14, 11 August 2012 (UTC)
  • Downloads finished. --Slick (talk) 08:47, 13 August 2012 (UTC)

A discussion in German about the licence can be found here. It looks like there is a problem with scans from sources newer than 1923. --Slick (talk) 13:35, 15 August 2012 (UTC)

I am withdrawing my support for this batch job and removing all local work already done, because the help/support I requested more than once was not forthcoming. Reverting the job to the request list. --Slick (talk) 20:30, 23 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Detroit Publishing Company at LoC

"This collection of photographs from the Detroit Publishing Company Collection includes over 25,000 glass negatives and transparencies as well as about 300 color photolithograph prints, mostly of the eastern United States. The collection includes the work of a number of photographers, one of whom was the well known photographer William Henry Jackson. A small group within the larger collection includes about 900 Mammoth Plate Photographs taken by William Henry Jackson along several railroad lines in the United States and Mexico in the 1880s and 1890s. The group also includes views of California, Wyoming and the Canadian Rockies." Subject index; geographical index. cmadler (talk) 17:17, 20 March 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Cesare Brizio

Photographer Cesare Brizio has agreed to donate 1300+ images here. Images may be taken from the web page OR originals can be sent to anyone on a DVD if required. He also suggested some sound files - but they are in the wrong format (mp3).

Data from OTRS ticket 2012021810002796 follows (permission obtained to copy this OTRS message here)
++++++++++++++
Dear Ron Jones: yes, I confirm that I am actually glad to release all the images located at via the "View Media" link at http://tolweb.org/onlinecontributors/app?service=external/ImageContributorDetailPage&sp=1810 as "Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)". Furthermore, I can provide upon request higher resolution versions (1024x768 or more) of almost all the same images.

By the way, I would gladly release as "Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)" all the audio samples (recordings of animal sounds) available at the web pages listed here:

best regards,

Cesare Brizio
+++++++++++++

Opinions

Support: Sounds good, especially given that we don't have pictures for all biological articles. --Sanandros (talk) 14:49, 21 August 2012 (UTC)

Assigned to Progress Bot name Category

Works of Maurice Ravel

All files from http://imslp.org/wiki/Category:Ravel,_Maurice can be uploaded to Commons (57 files).

Maurice Ravel's works are in the public domain in France since a decision by the Cour de cassation in 2007 (French Supreme Court). See Wikipedia articles for details. There are about 35 published before 1923, for which there is no URAA issue. Yann (talk) 12:12, 15 September 2012 (UTC)

Category:Compositions by Maurice Ravel
License {{PD-old}}

Opinions

Assigned to Progress Bot name Category
  • Waiting for the backlog of this page may take longer than manually uploading 57 images using Special:UploadWizard. Bennylin (yes?) 12:49, 26 February 2012 (UTC)
    • If you had just had a look at the page, or had at least a bit of music knowledge...; but today I am bountiful and will not respond with other unhelpful comments. I just ask myself how you could have become a steward with such hasty comments. If you want to help, you could take upload requests or analyze them carefully. Or are you even paid by WMF to advertise UpWiz?
  • Please make some suggestions on how to get good descriptions from the page. (Including a custom template, categories, ...)
  • Page structure:
  • {{Not-PD-US-URAA}} is not a valid license template (says that right on it!). Works should be verified as being PD or otherwise free in the US before uploading. Otherwise you're just adding to the Commons:WikiProject Public Domain/URAA review workload. cmadler (talk) 12:17, 27 April 2012 (UTC)
    • Pre-1923 works should be tagged with {{PD-1923}} to cover US copyright status. Post-1923 works are probably still copyrighted in the US, and should not be uploaded without investigation into the status. cmadler (talk) 13:35, 17 September 2012 (UTC)
      • Is this tag necessary even for non-US works? Yann (talk) 15:29, 17 September 2012 (UTC)
        • Yes, because works on Commons must be free in both the country of origin and the US. (Right on {{PD-old}}, it says, "You must also include a United States public domain tag to indicate why this work is in the public domain in the United States.") Alternatively, {{PD-old-70-1923}} is a single template covering both the US and French copyright. cmadler (talk) 12:39, 18 September 2012 (UTC)
  • Oppose: Actually, now that I look at it, I don't think any of his works are in the public domain in France, the country of origin. The Cour de cassation ruling found that the prorogations de guerre (extensions for the two World Wars) were superseded by later copyright laws, but only for non-musical works. Since we're discussing musical works, the prorogations still need to be taken into account. Works published through 1920 get an additional 14 years, 272 days, while works published from 1920 through 1947 (since Ravel died in 1937, this covers all the rest of his works) get an additional 8 years, 120 days. So Ravel's works through 1920 are copyrighted in France until late 2022 (272 days gets you almost to the end of September), while his post-1920 works are copyrighted in France until 2016 (120 days goes to late April). cmadler (talk) 12:49, 18 September 2012 (UTC)
    • The Cour de cassation did not mention the type of works to which its ruling applies. Yann (talk) 13:43, 18 September 2012 (UTC)
      • If I understand correctly, the 2007 Cour de cassation ruling related primarily to the 1997 law, which had extended the normal duration for non-musical works from 50 years to 70 years (but was not cumulative with the war extensions), and dealt specifically with the works of two painters, Monet and Boldini. But musical works had already been extended to 70 years pma in 1985, by the "Lang" law, and in the 2007 ruling, the court found that this law was cumulative with the war extensions ("la loi du 3 juillet 1985 avait porté à 70 ans la durée de protection normale, de sorte que les bénéficiaires des prorogations de guerre applicables à cette date pouvaient prétendre à une durée de protection excédant 70 ans"), but only in the case of composers who had already "acquired" the right (already died, starting the copyright clock) prior to July 1992. Have I misunderstood an aspect of this? cmadler (talk) 16:34, 18 September 2012 (UTC)
      • After its two rulings, the Cour de Cassation summarized the situation in its annual report for 2007. It mentions the particular situation of musical works, in the terms quoted above by cmadler. However, as the 2007 rulings were not about music or Ravel, there are apparently still some arguments about how to interpret and apply the principles, how the term of protection should be computed in the specific case of Ravel and, depending on the result, whether his works are still under copyright in France or in the public domain there. This 2008 article concluded that, at that time, the question was still uncertain but that commentators seemed to lean more toward the theory of the longer term of protection. Anyway, it seems that the SACEM still collects money relating to the author's rights in Ravel's works for uses of those works "à l'étranger" (outside of France, in some countries where the works are still under copyright).[8] I didn't find anything stating clearly whether they still collected fees for uses of Ravel's works in France after 2008. If the works are still under copyright in France, and given the sums of money that would represent, it is somewhat surprising that no litigation is found. It may not help clarify the situation that the money collected from the copyright used to be claimed by a mysterious offshore company, although I suppose that does not affect the term of protection. -- Asclepias (talk) 19:33, 19 September 2012 (UTC)
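
For reference, the date arithmetic in cmadler's oppose comment above can be reproduced with a few lines of Python. This is only a sketch of that arithmetic (70 years counted from 1 January after the author's death, plus the quoted war extension); it assumes that reading of the computation, which the thread itself disputes, and is not a statement of French law. The function name is made up for illustration.

```python
from datetime import date, timedelta

def extended_term_end(death_year, ext_years, ext_days, base_term=70):
    """Approximate end of protection: base term counted from 1 January of the
    year after death, plus a war extension given as years and days.
    Simplified model of the arithmetic in the discussion, not legal advice."""
    # The term starts on 1 January following the author's death.
    start_year = death_year + 1
    # Add the base term and the whole-year part of the extension.
    after_years = date(start_year + base_term + ext_years, 1, 1)
    # Add the remaining days of the extension.
    return after_years + timedelta(days=ext_days)

# Ravel died in 1937; extensions as quoted in the comment above.
works_through_1920 = extended_term_end(1937, ext_years=14, ext_days=272)
works_after_1920 = extended_term_end(1937, ext_years=8, ext_days=120)
print(works_through_1920)  # 2022-09-30
print(works_after_1920)    # 2016-04-30
```

Both results match the "late September 2022" and "late April 2016" dates given in the comment.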

HABS

While working on the English Wikipedia I stumbled upon the Historic American Buildings Survey/Historic American Engineering Record/Historic American collection several times. This is a huge (350,000+) collection of photographs and drawings of historic buildings in the US. The collection is in the public domain, although it contains some exceptions (I haven't been able to find one). The collection has good metadata like title, author, date and location (awesome for categorization). Every item has a high-resolution TIFF file. I'm using User:Multichill/HABS as a layout template right now. I did some tests. Once the template is all tweaked I will substitute it. After that the template will be substituted on upload. The images are high-resolution TIFFs, which is of course very nice, but also problematic because these images are not rendered at the moment. The WMF has plans to change that, so I'd rather not upload JPEGs too. Any opinions on this? Multichill (talk) 12:06, 14 January 2012 (UTC)

Decided I'd upload both jpg and tiffs. Did some more tweaking:

Multichill (talk) 16:36, 22 January 2012 (UTC)

Restart

So I started this a couple of months ago. Ran into some technical problems and a lot of negative feedback so I decided to waste my time on something else.

  • In the meantime the file size limit was raised to 25 megapixels so I no longer need to upload two images. I'm just going to upload one high-res tiff image.
  • Categorization is probably going to be category:buildings in <county> with fallback to category:<county>
  • Naming of the files is quite rough, need to improve that (too long, too many weird characters)
  • Need to use the json and see what kind of useful information is in there that I'm not using (like coordinates)
  • I need to have a very conservative copyright check to not upload dozens of unfree files.
  • I should probably add a template like {{Maybe US heritage}} that explains that this photo might be on the NRHP or some local registry and asks to replace this template with the right one. If that gets picked up, it would combine nicely with all the images we took ourselves (for example in Wiki Loves Monuments)

Multichill (talk) 19:38, 24 October 2012 (UTC)
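
Two of the points in the list above (category fallback and file name cleanup) could be sketched roughly as follows. This is not the code BotMultichillT uses; the function names, the 120-character limit and the exact set of replaced characters are assumptions made for illustration.

```python
import re
import unicodedata

def pick_category(county, existing_categories):
    """Prefer 'Buildings in <county>', fall back to '<county>'."""
    preferred = f"Buildings in {county}"
    if preferred in existing_categories:
        return preferred
    return county

def clean_title(raw_title, max_length=120):
    """Shorten a title and remove characters that are awkward in file names."""
    # Decompose accented letters, then drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", raw_title)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Replace characters MediaWiki rejects in titles (# < > [ ] | { })
    # plus colon and slashes, which are awkward in file names.
    text = re.sub(r"[#<>\[\]|{}:/\\]", " ", text)
    # Collapse runs of whitespace.
    text = re.sub(r"\s+", " ", text).strip()
    # Cut overly long titles at a word boundary where possible.
    if len(text) > max_length:
        text = text[:max_length].rsplit(" ", 1)[0]
    return text

print(pick_category("Cook County, Illinois", {"Buildings in Cook County, Illinois"}))
print(clean_title("Main  St. [Bridge] #4 / north: view"))
```

The set of existing categories would have to come from a Commons query; here it is simply passed in.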

Everything is about ready. Waiting on en:Wikipedia talk:WikiProject National Register of Historic Places#HABS upload. Multichill (talk) 21:47, 9 January 2013 (UTC)
Assigned to | Progress | Bot name | Category
Multichill | Did some first tests | BotMultichillT | None

Chris's Acorns

I accept that this is a premature request, so please accept my apologies if that's undesirable.

  1. All the (approx. 3000) photos arguably have educational value, so should that be the target? If not, under what criteria should decisions be made?
  2. Some pages (e.g. Acorn Phoebe 2100) contain non-free publicity photographs. How are such photos best tagged for omission?
  3. In order to collate them as a collection, would it be appropriate for them to be put in Photographs by Chris Whytehead (within Photographs by author)? Or should they go elsewhere, as he's apparently a registered Wikipedian?
  4. What are the options for proceeding with the upload and what will be required of Chris?

All advice gratefully received. Thanks for reading. --Trevj (talk) 12:06, 4 October 2011 (UTC)

Opinions

Assigned to | Progress | Bot name | Category
Smallman12q | Bot Request Filed | Smallbot (talk · contribs) |

Request filed. Smallman12q (talk) 02:20, 17 November 2011 (UTC)

Maritime photo collection

Category:Frederic Logghe Maritime photo collection includes only part of the collection available at the website listed there. The collection itself didn't seem to have grown recently and Commons might be a good place to maintain it in the long term. --07:09, 28. Sep. 2011‎ Docu

Somebody should check the licence before a mass import. I am not sure they are all free. I found a lot of pictures with copyright information, e.g. [9] [10] [11] --Slick (talk) 16:30, 4 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Images from Caelum Observatory & The Mount Lemmon SkyCenter

Adam Block from The Mount Lemmon SkyCenter has kindly agreed to release a large number of his images with a CC BY-SA 3.0 license. He has done this specifically so they can be used on wiki projects. A .zip containing all of the released images can be found here. I would like to be able to upload them all into a category called 'Images from Caelum Observatory & The Mount Lemmon SkyCenter' or something in that vein. Many of them will be very useful and have high EV. A link to one of his galleries showing the relevant copyright statements can be found here. As there are 200+ files in the .zip file, uploading them all would be very tedious. I would be very grateful if someone could assist me with this matter. Thanks, Originalwana (talk) 13:16, 10 September 2011 (UTC)

It looks difficult to upload the files from the zip with a batch job because information is missing, e.g. the description. IMHO it makes more sense to parse the website for images under CC, because there are very useful descriptions there. (Example) --Slick (talk) 11:02, 11 August 2012 (UTC)
That would be great but I have no idea how to go about it, do you know how this could be done? Thanks Originalwana (talk) 10:22, 13 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Dokpro

The site [12] has a great public domain collection of Norwegian manuscripts, for example the complete manuscript works of Henrik Ibsen (listed as UNESCO heritage).

The objective is to download all pictures, convert them into one DjVu file per book, and upload the DjVu files to Commons.

Is it possible? Thanks! --M0tty (talk) 19:36, 3 October 2010 (UTC)

Opinions

Assigned to Progress Bot name Category

UMich

All the images / videos from UMich listed in these two directories [13]. If they could all be added to a single category, I will then incorporate them into Wikipedia. --James Heilman, MD (talk) 23:13, 19 July 2011 (UTC)

Opinions

Assigned to Progress Bot name Category

ECGPedia

The owners of ECGpedia have agreed to allow release of their images under a Creative Commons 3.0 license (http://en.ecgpedia.org/wiki/Main_Page, http://www.echopedia.org, http://www.pcipedia.org). This applies to all images except http://en.ecgpedia.org/wiki/Rhythm_Puzzles, which they are unable to release due to a continuing non-commercial requirement. There are about 2000 images in all. A list can be found here: http://en.ecgpedia.org/index.php?title=Special%3AAllPages&from=&to=&namespace=6 --James Heilman, MD (talk) 18:51, 13 July 2011 (UTC)

All images are licensed as "Creative Commons Attribution Noncommercial Share-Alike". Wikimedia Commons does not allow "Noncommercial" licenses, so unless ECGpedia re-licenses their images we are not going to be able to use them. If they re-license, that will need to be marked on the individual images themselves or through OTRS, which will list which images are covered. --Jarekt (talk) 19:36, 13 July 2011 (UTC)
Yes they have agreed to re-release the images under a license that allows commercial use. So the images will need to be marked as such.--James Heilman, MD (talk) 20:23, 13 July 2011 (UTC)
Here is the OTRS Ticket#2011102310008874. There are about 3000 ECGs and 700 echo images. --James Heilman, MD (talk) 13:55, 23 November 2011 (UTC)

Opinions

Assigned to Progress Bot name Category
Smallman12q Smallbot

ian.umces.edu

http://ian.umces.edu offers 3251 free high resolution images and 2544 free vector symbols licensed under CC BY 3.0. --Leyo 06:59, 21 June 2011 (UTC)

A worthy set of images. I was able to download 2546 SVG files in a single ZIP file, but matching it with metadata is more challenging. --Jarekt (talk) 03:37, 29 June 2011 (UTC)
Are the file names in the ZIP file self-explanatory or rather meaningless? --Leyo 09:07, 29 June 2011 (UTC)
Filenames identify the source and include a few words about the content, see for example here. For the SVG files, I think we need to write some scraping software to create a spreadsheet with:
  • "Author" and "Author Company"
  • Title and description
  • "Date created"
  • URL (to be used to link back to the source image)
  • "Album name" and "Keywords" can be useful for choosing categories
  • "Filename" (to match it with the downloaded file)
I am at the moment rather busy with Commons:Batch uploading/Web Gallery of Art but if someone can gather the metadata I can upload the files. --Jarekt (talk) 15:05, 29 June 2011 (UTC)
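The spreadsheet described in the list above could be laid out roughly like this. It is only a sketch: the column names are taken from the list, the scraping itself is left out because the site's page structure is not described here, and the sample record is invented.

```python
import csv
import io

# Columns taken from the list above; "Filename" is the key used to match
# a metadata row with the corresponding file from the downloaded ZIP.
FIELDS = [
    "Filename", "Author", "Author Company", "Title", "Description",
    "Date created", "URL", "Album name", "Keywords",
]

def write_metadata_csv(records, out):
    """Write scraped records (dicts) as CSV; missing fields become empty."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)

def index_by_filename(csv_text):
    """Load the CSV back and index the rows by file name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Filename"]: row for row in reader}

# Invented sample record, for illustration only.
sample = [{"Filename": "example-symbol.svg", "Title": "Example symbol",
           "Author": "Example Author", "Keywords": "example; symbol"}]
buffer = io.StringIO()
write_metadata_csv(sample, buffer)
rows = index_by_filename(buffer.getvalue())
print(rows["example-symbol.svg"]["Title"])
```

With such an index, the uploader can look up each file from the ZIP by name and fill in the description page from the matching row.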
According to our discussion here (in German), these files additionally need to be fixed so that numbers which omit the leading zero (like .12345) include it (---> 0.12345); otherwise Wikipedia's renderer doesn't parse them correctly. (The substitution can also be of the type -.12345 ---> -0.12345.) This is just in case someone very suddenly rushes in to upload these :) Iridos (talk) 23:24, 4 July 2011 (UTC)
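The leading-zero substitution described in the comment above could be done with a regular expression along these lines. This is a sketch under assumptions: the function name is made up, it is meant to be applied to numeric attribute values (such as path data) rather than to the whole file, and it does not handle compressed sequences like "1.5.5", where the second number directly follows the first.

```python
import re

# A dot that starts a number: not preceded by a digit or another dot,
# and followed by a digit.
_BARE_DOT = re.compile(r"(?<![\d.])\.(?=\d)")

def add_leading_zeros(value):
    """Turn '.12345' into '0.12345' and '-.12345' into '-0.12345'."""
    return _BARE_DOT.sub("0.", value)

print(add_leading_zeros("M.5-.25L1.5,.75"))  # M0.5-0.25L1.5,0.75
```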
All of the SVGs in this library were originally created with Illustrator, although most were run through SCOUR, which I now see strips leading zeros. Does anyone know of any other SVG parsers that have a problem without leading zeros? —Preceding unsigned comment added by Adrianbj (talk • contribs)
All the SVG files already contain DC metadata. There is also an online spreadsheet and an Excel version of the metadata available.
Links to searchable database of all images/symbols and custom download builder for all the symbols in SVG, AI and PNG in a zip archive.
Just read through a translated version of the german discussion. Not sure why that virus didn't rasterize well. The PNG previews and downloadable versions on the IAN website were all created automatically with iMagick and rSVG, although problems like you are seeing did occur with various older versions of iMagick and rSVG. —Preceding unsigned comment added by Adrianbj (talk • contribs)

It seems they changed their licensing terms; the new license doesn't allow redistribution or sales, which makes it unacceptable for Commons. I guess the upload could still happen since CC licenses are irrevocable, but I imagine they wouldn't appreciate it much. The best solution would be for someone to contact them and ask them to change it back to CC BY. InverseHypercube 07:51, 18 February 2012 (UTC)

Sorry about the licensing change - we do rely on this resource to bring traffic to our website, so we would really appreciate honoring of our new license. Thanks. —Preceding unsigned comment added by Adrianbj (talk • contribs)
Thanks for commenting! However, if the images are licensed under CC-BY, we would be required to attribute (and link back to) your website, so no traffic loss would occur. In fact, since CC-BY would allow us to transfer images from your website, having your images on high-traffic sites such as Wikipedia would increase hits to your site, since they would all link to it. InverseHypercube 04:37, 8 March 2012 (UTC)
A custom license tag such as those in Category:Custom license tags might be used. It might contain a link to the website and/or a direct link to the respective image (example). --Leyo 09:32, 8 March 2012 (UTC)
I'd like to make the preview sized versions of all our images (photos and vector illustrations) available on Commons with a custom license tag (and attribution) and a direct link to the respective image on our site where users can register (free) and download the full resolution / vector (SVG) versions. Almost all the photos (JPG) have metadata embedded. All the SVG files also have metadata, but the preview PNGs do not because of the metadata issues with the PNG format. As I mentioned above, there is an automated spreadsheet available from our site with all the metadata. We are constantly adding images to the library. Is there any possibility to automatically update commons if I create a web service (XML/JSON) of all the images and metadata?
That sounds great, and it can definitely be done. However, as I understand copyright law, by licensing the previews under CC-BY, for example, you would also be licensing the SVG files under the same license, since they do not meet the threshold of originality over the previews. While we might only upload the previews, I don't think you could stop others from distributing the SVG files. InverseHypercube 17:35, 8 March 2012 (UTC)
I guess I was thinking of a custom license, rather than CC-BY, as suggested by Leyo. Would that work?
I still think you would be effectively licensing the SVG files under the same license. InverseHypercube 17:49, 8 March 2012 (UTC)
I understand what you are saying, but if the custom license says that users cannot redistribute or sell and that they must provide attribution even for the preview PNGs, would that work? Maybe this is too problematic for posting to Commons? We are actually also wanting to add the option for users to purchase the right to use our images without attribution, because at the moment, there are many cases when they can't use them due to the attribution requirement. We think this dual licensing model will make them more useful for more people. I'd be curious if anyone has any further suggestions.
No, that wouldn't be allowed on Commons. See Commons:Licensing#Acceptable licenses; non-commercial licenses are not permitted. However, if you licensed the SVGs under a license that required attribution for redistribution, it would apply to the PNGs too. InverseHypercube 22:14, 9 March 2012 (UTC)
Looks like I was wrong about not being able to license the preview images and the vector versions under separate licenses; the community consensus seems to be that you can. See Commons:Village_pump/Copyright/Archive/2012/01#CC_BY-SA_3.0_and_the_original_image_quality. InverseHypercube 18:38, 14 March 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Yale

As discussed at the Village Pump and announced here, Yale released 250k images in its database under the {{Cc-by-3.0}} license; see here for details.

We should start looking into moving them here while retaining all available metadata. --Jarekt (talk) 14:43, 3 June 2011 (UTC)

Opinions

My preliminary evaluation:

  • 47343 images of paintings are available in high resolution at the present time. (Go here, fill in no fields, and click "Find.")
  • Images are made available as TIFF files, max resolution appears to be 2400 x 3000 px, 8-bit color, often smaller (they're crops of a single photo, but not bad). We should upload original TIFFs as well as JPEG versions, and cross-link them.
  • Image downloads via the website are protected by a re-CAPTCHA system. This needs to be either defeated, circumvented, or we need special permission to bypass it.
  • Download speed appears to be throttled to about 80 KB/s. At this rate it will take roughly 93 days just to download them all. This is expected and should not be circumvented, since bandwidth hogging costs money and draws ire.
  • We will require a special license tag for these, because the situation is not simple. Yale has released their digitizations under CC-BY, which will be important in nations where digitizations may be protected by copyright or by a publisher's right, or in case of a hypothetical reversal of Bridgeman v. Corel. On the other hand, PD-Art indicates that attributing the source is not a legal requirement in the United States or other nations where reproductions carry no copyright, and we should not make reusers think that it is required. We need a special tag that combines these, while referring to the original entry in Yale's collection.
  • I don't know if the URL suffix is a stable reference number. We should instead link to a search for the Accession Number, like this.
  • Extracting metadata from HTML should be straightforward. Their metadata fields match our {{Artwork}} template rather well.
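
As a cross-check of the download estimate in the list above, the figures given there (47343 images, about 80 KB/s, roughly 93 days) imply an average file size, which can be worked out as below. The average size is not stated in the evaluation; it is derived here, assuming 1 KB = 1000 bytes and a continuous download.

```python
SECONDS_PER_DAY = 86_400

def download_days(total_bytes, rate_bytes_per_s):
    """Days needed to transfer total_bytes at a constant rate."""
    return total_bytes / rate_bytes_per_s / SECONDS_PER_DAY

def implied_average_size(image_count, days, rate_bytes_per_s):
    """Average file size implied by a total duration at a constant rate."""
    return days * SECONDS_PER_DAY * rate_bytes_per_s / image_count

images = 47_343
rate = 80 * 1000  # 80 KB/s, assuming 1 KB = 1000 bytes

average = implied_average_size(images, 93, rate)
print(f"implied average size: {average / 1e6:.1f} MB")
print(f"days for {images} files of that size: {download_days(average * images, rate):.0f}")
```

The estimate therefore corresponds to TIFF files of a bit under 14 MB each on average; a different average size scales the duration proportionally.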

I can write a tool to get started on this, but have other obligations this week. Other opinions are welcome. Dcoetzee (talk) 07:29, 5 June 2011 (UTC)

We already have good contacts at Yale. Meg Bellinger from Yale gave the keynote speech at GLAMcamp_NYC (see notes and slides). We can ask en:User:Witty lama, who I think interacted with them, to check what would be the way to get the data with the least interruptions. We can also check if and how they would prefer that we link to their system. I can start on the license templates, institution templates, etc. --Jarekt (talk) 20:35, 5 June 2011 (UTC)

License

I created {{PD-Art-Yale}} for 2D artworks. Please verify & correct/improve. I think we should add attribution text parameter and possibly put parts of it in an info box with Yale Logo so the credit is not lost in the text.

It is uncertain to me if CC license extends to "digitization" of 3D objects which are otherwise in PD. --Jarekt (talk) 14:05, 9 June 2011 (UTC)

Looks good so far. I don't know if this collection includes three-dimensional works, or paintings with three-dimensional frames, but if it does it's worth noting that they must be used under the terms of the CC license in all nations (as the photograph would not be a mere copy). Dcoetzee (talk) 23:27, 9 June 2011 (UTC)
Yes, if the CC license extends to photography of the 3D objects, then we would need a separate license: Artwork - PD-old, Photography - CC

--Jarekt (talk) 02:09, 10 June 2011 (UTC)

Assigned to Progress Bot name Category


Geheugen van Nederland

Initial request from Commons:Picture requests/Requests/Europe:

"There is a collection of photographs of historic maps, originating as far as I understand from the "Nederlands Scheepvaartmuseum Amsterdam". I have seen it here [14]. At the moment I am expecially interested in [15]. The maps would be interesting for a great number of articles, the latter one for some articles about "Noord-Friesland". Maybe somebody can make it possible to upload the whole collection. Thanks in advance and with best regards --92.230.245.120 03:35, 24 May 2010 (UTC)"

There are several collections that might be of use:

820 files from geheugenvannederland.nl are already available at Commons. -- Common Good 19:10, 29 April 2011 (UTC)

I am not sure about the licence. I only found this. Are we sure we can import the images? --Slick (talk) 19:50, 13 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Africa Centre

Africa Centre is a non-profit organisation in Cape Town that supports arts and culture projects across Africa. Since 2007 they have commissioned thousands of arts and culture images that are related to their projects. The images give an insight into performance art, public art, site-specific art, poetry, visual art, social innovation, architecture, public space, etc. in Africa. They have applied the Creative Commons Attribution-ShareAlike 3.0 license, and have given me permission to upload their files. The photos for each of the Africa Centre projects would be uploaded under the categories Performance Art, Visual Art, Public Art, Poetry, Culture, Arts, and City of origin. Riannedac (talk) 08:54, 15 April 2011 (UTC)

I guess a lot of these photographs are actually derivative works of modern art. Does the Africa Centre own the copyright to the works? Permission should be arranged with Commons:OTRS.
For the actual uploading part we're writing Commons:Guide to batch uploading.
Are you already in touch with Wikimedia South Africa? I'll send them an email about this project. Multichill (talk) 14:30, 16 May 2011 (UTC)

de.wikipedia.org

Everything in wikipedia:de:Kategorie:Datei:Commonsfähig that does not have wikipedia:de:Vorlage:NoCommons attached. Matt (talk) 12:06, 2 January 2011 (UTC)

Opinions

Some guys including myself at wikipedia:de:Wikipedia:WikiProjekt Commons-Transfer are currently doing this half-manually using Commons:Tools#Commons Helper, which is a quite old and unmaintained script. As there are >100,000 images ready for transfer, this is going to take too long. It is also a very repetitive task, as CommonsHelper makes the same conversion errors over and over again and its successor is also not working or actively developed. There is User:Boteas, but it needs an extra template to start working, which complicates everything even more. A more automated solution would be great. Matt (talk) 12:06, 2 January 2011 (UTC)

  • Oppose: I am astonished... why do we have de:WP:NCF? Keep in mind that not all pictures which should have the NoCommons template attached actually have it, e.g. photographs of protected buildings in France. → Automated transferring is not possible. At some step the licensing needs to be checked. Cheers --Saibo (Δ) 16:07, 2 January 2011 (UTC)
    • Oppose: It should be clear that the category Datei:Commonsfähig isn't set manually after review, but only by license templates. We actually have some pictures by Paul Klee that are tagged with PD-old at de.wp but aren't free in the US and so are not Commons-compatible. This can't be sorted out by a bot. At this time there are some local projects for transferring files to Commons, so the number of files on de.wp decreases by about 100 per day. Every picture that is not manually checked and isn't ready for Commons has to be undeleted on de.wp and deleted here. There are 459 files transferred to Commons that haven't been checked - why work on this first? Additionally, all files in the sub-categories of de:Wikipedia:Dateiüberprüfung have to be excluded. --Quedel (talk) 23:42, 2 January 2011 (UTC)
Further templates that need a manual review: Panoramafreiheit (list not complete yet). --Quedel (talk) 23:49, 2 January 2011 (UTC)
  • Perhaps: It should be possible to find some categories to work on, for example PD-old, NASA images etc. If the license is wrong then the file should be deleted on de-wiki anyway. Admins could do the check when they delete the file on de-wiki. If the file is OK they delete the local file. If not, they nominate the local + Commons file for deletion. Files could also be tagged with a special template on Commons to show that the file was transferred without a review. If the template has two links, "OK" and "Not OK", that users can click on, it only takes a moment to review. If OK, the template is removed, and if not OK, the file is nominated for deletion. --MGA73 (talk) 16:46, 7 January 2011 (UTC)
    • Agree: >95% is just the usual GFDL and/or CC-BY-SA licensing, mostly photos. These can be moved automatically and checked afterwards, rather than transferred half-manually and then "bot move checked". Matt (talk) 13:30, 8 January 2011 (UTC)
Then we need to disable revobot to not have all the images in their "raw state" (unchecked) directly in nowCommons category at dewp. There needs to be a detailed plan when which actions need to be done and who does it and how long it will approximately take. And of course a calculation where the savings (if there are at all) are. Your first request here was a bit rushed, Matt. And: to be honest, I would prefer to discuss this in German. Cheers --Saibo (Δ) 21:48, 8 January 2011 (UTC)
To MGA73's suggestion: there is a good system for files with missing information, which results in properly licensed files for about half of them. With this new approach these files will be lost, because automatically asking the uploader will no longer be possible. Another problem is: who will do the work? The list of "NowCommons" files not deleted on de.wikipedia is growing; there aren't enough admins to check them. And there are more than 500 files transferred to Commons that have not been checked. Additionally, there are a number of files not tagged with FoP or similar, or files that are PD-old in Germany but not on Commons (for example, pictures by Mr. Klee). Such a file would be transferred and get the "NowCommons" template locally, and would then have to be deleted on Commons and untagged locally. How many files are in Kategorie:Commonsfähig? Approximately 80,000 files (only half of all files). Testing the category GFDL-Bild shows that only 55% are Commons-ready as currently tagged (for CC-by-sa, the most-chosen license, I cannot check it; CatScan doesn't handle that many files). To Saibo: discussing in German would be nice. --Quedel (talk) 23:11, 8 January 2011 (UTC)
The plan could be like this. We choose a category where we expect perhaps > 95 % of the files to be good. Then some users scan the category for junk and possible copyvios and tag those few files for deletion or with a "don't move". Then a bot moves the rest of the files - perhaps except those without a description. Then admins on de-wiki delete the local file if the transfer is OK. If not, they change the "NowCommons" to a "NoCommons" (or whatever template is used on de-wiki) and mark the file for deletion on Commons. That way we only have to check files once.
If the bot works correctly there is no need to check that information is transferred correctly. Then we "just" need to check if the license is valid and the categories. If license is ok and the only outstanding issue then we could still delete the file on Commons.
Big categories are not a problem. We just do the queries we need on toolserver to find the files to work on. So just bring up all ideas and things we should know, do or not do. --MGA73 (talk) 18:14, 10 January 2011 (UTC)
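
The plan outlined above can be written down as two small decision steps, shown here as a Python sketch. The class and function names are hypothetical and no real bot or wiki API is involved; the point is only to make the order of checks explicit.

```python
from dataclasses import dataclass

@dataclass
class LocalFile:
    name: str
    description: str = ""
    # Set during the manual scan: tagged for deletion or "don't move".
    flagged: bool = False

def should_transfer(local_file):
    """Bot step: move everything that was not flagged and has a description."""
    return not local_file.flagged and bool(local_file.description.strip())

def after_transfer_action(license_ok):
    """Admin step on de-wiki after the bot has moved a file."""
    if license_ok:
        return "delete local file"
    return "replace NowCommons with NoCommons locally; nominate Commons copy for deletion"

files = [
    LocalFile("Example A.jpg", description="A described photo"),
    LocalFile("Example B.jpg", description=""),
    LocalFile("Example C.jpg", description="Possible copyvio", flagged=True),
]
print([f.name for f in files if should_transfer(f)])  # ['Example A.jpg']
```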

I'll make a list (it can surely be consolidated more efficiently)

include
exclude
To be considered and sorted into the categories above

--Quedel (talk) 23:56, 8 January 2011 (UTC)

needs manual tweaking or intelligent bot

Info: Announcement of a new bot: COM:VP#Easy bulk transfer from English Wikipedia to Commons. --Leyo 16:51, 22 January 2011 (UTC)

Assigned to Progress Bot name Category

Canada Line

From the English Wikipedia I stumbled upon http://canadalinephotos.blogspot.com/. "Here you will find photography of the Vancouver, B.C. Canada Line, which opened to the public on August 17th, 2009. <...> The photography presented in this blog (780 posts containing 22,000 photos) will be kept online as a historical archive of the construction of the Canada Line." All files are licensed {{cc-by-2.5-ca}} and you have to attribute "Tafyrn & Seamora" and link back to the blog. You can find the actual images in http://www.seataf.com/blogs/canadaline/ . Based on the page it's used on (for example http://canadalinephotos.blogspot.com/2009/04/2009-04-16-waterfront-station.html) you should be able to decide on the title and give it a category (or Category:Canada Line if you can't find anything else).
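
Deriving a title and category from the blog post URL, as suggested above, might look like the following sketch. It assumes that post slugs follow the "YYYY-MM-DD-place-name" pattern seen in the example URL; the function names and the set of known categories are made up for illustration.

```python
import posixpath
import re
from urllib.parse import urlparse

# Slug pattern assumed from the example URL: date, then a hyphenated name.
SLUG_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-(.+)$")

def describe_post(url):
    """Return (date, place) taken from the last path segment of a post URL."""
    slug = posixpath.splitext(posixpath.basename(urlparse(url).path))[0]
    match = SLUG_RE.match(slug)
    if match is None:
        # No date prefix: use the whole slug as the name.
        return None, slug.replace("-", " ").title()
    date_part, name_part = match.groups()
    return date_part, name_part.replace("-", " ").title()

def pick_category(place, known_categories, fallback="Canada Line"):
    """Use a category named after the place if one exists, else the fallback."""
    return place if place in known_categories else fallback

url = "http://canadalinephotos.blogspot.com/2009/04/2009-04-16-waterfront-station.html"
print(describe_post(url))  # ('2009-04-16', 'Waterfront Station')
```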

Great collection, but all images are 640×480. Are we sure we want to upload at this small resolution? Can anybody with better English than me contact the author and request a higher resolution for Commons? --Slick (talk) 11:13, 11 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Codex Gigas

The Swedish National Library has made available the Codex Gigas, a 13th-century Bible manuscript which is also the largest medieval manuscript in existence, in its entirety. It's available in high resolution through FSI Viewer and in medium resolution as JPEGs. The whole file structure is available at the National Library's website here. The JPEGs seem quite simple to download, but it would be even more interesting to extract the high-resolution pictures out of the viewer.

As a reproduction of a medieval volume, there are no copyright issues to worry about (except for perhaps the pictures of the highly ornate cover).

Peter Isotalo 13:30, 29 November 2010 (UTC)

Opinions

Assigned to Progress Bot name Category

Right Livelihood Award

After some discussion with the Right Livelihood Award Foundation I got a clarification on the usage conditions of the photos provided on their website. The details of the discussion can be found in the OTRS, ticket 2010103110002401.

Basically, pictures (mainly portraits of the laureates) from download.rightlivelihood.org which are marked with a copyright by the Right Livelihood Award Foundation in the respective license files can be used freely with attribution of the photographer and the Foundation, i.e. Template:Attribution. All pictures with other copyrights are in general incompatible with Wikimedia Commons, since the Foundation does not own all the rights and they are "free to use as long as they are used in the context of the Foundation's and its Laureates' work." In this respect, the information on http://rightlivelihood.org/press_room.html is not well formulated.

So I wonder whether some kind of batch upload of these pictures makes any sense, or whether it is faster to sort and upload them completely manually. --Prolineserver (talk) 22:37, 26 November 2010 (UTC)


Opinions

Assigned to Progress Bot name Category

Pictures of Tom Ruen

I request an upload of astronomical images of Tom Ruen from English Wikipedia. As far as I can see, all of them have a free license, so they can be uploaded to Commons. --Emaus (talk) 19:34, 21 October 2010 (UTC)

Opinions

Assigned to Progress Bot name Category

Old city maps

Please have a look at this website. It is a digitization project of old maps. It is done by the Hebrew University of Jerusalem and other institutions. They have a sizable database of old maps that are mostly in the public domain. I searched Commons for a sample of their files to find out if they have already been uploaded, and couldn't find any. These are very rare centuries-old maps and they could be invaluable for many Wikimedia projects.

The maps contain copyright watermarks which obviously don't represent the true status of the copyrights. However, if the university can be contacted and asked if they could collaborate with us and give us access to the un-watermarked maps, and in return we could offer a customized tag (like what'd been done for other mass uploads), it would save us incredible amounts of time and effort working on removing them at the Graphic lab, especially since we have enough work at our hands (just look at the Category:Images for cleanup backlog).

I hope you can start this upload soon. Regards, -- Orionisttalk 23:41, 5 October 2010 (UTC)

Support: very nice collection --Jarekt (talk) 14:26, 6 October 2010 (UTC)

Opinions

Assigned to Progress Bot name Category

IUCN red list

As I mentioned on the talk page, we have established a partnership with the International Union for the Conservation of Nature (IUCN) to produce range maps for many species of animals. See Commons:IUCN red list for a few more details. The GIS manager at IUCN has now kindly produced around 6000 maps (in .gif) for all the amphibian species they currently have data on. They have placed the zip file on a password-protected FTP site (I can send someone the file; it is only 60-something megabytes). You can see the samples I have uploaded at Commons talk:IUCN red list#New developments. I also have a .dbf file with information about the source of the spatial data, and I will shortly get a file relating species names to the species' identification numbers on the Red List website (for example, 56054 is the ID for Acanthixalus sonjae). This can be used to extract the Assessor information required to complete the description file. I know that Polbot's sixth task used information retrieved from the IUCN red list website, so it should be possible to reuse part of its code to retrieve this information. There should be a few other batches later on. GoEThe (talk) 12:41, 1 October 2010 (UTC)

Update: The IUCN is going to send me a total of 30,000 images to be uploaded. They would like it to be done in time for their next website update, which will be on November 11th. Can anybody help me with this? GoEThe (talk) 16:06, 21 October 2011 (UTC)

Opinions

  • Support: I support this, as plenty of articles are missing range maps. It is nice that the IUCN would like to support us by releasing range maps to us, even if they are in .gif format. --Clarkcj12 (talk) 22:27, 24 January 2012 (UTC)
Maybe a bot could also convert them to PNG, or is that not possible? --Sanandros (talk) 14:31, 21 August 2012 (UTC)
Assigned to Progress Bot name Category
I think I might be able to do this if I get enough information. Werieth (talk) 01:24, 28 January 2013 (UTC)
I would be interested in helping this move forward. -- Daniel Mietchen - WiR/OS (talk) 22:04, 8 February 2013 (UTC)

KROK2009

Please upload photos from "Festival of world animation "KROK2009"". License: CC-BY 3.0.


Opinions

What kind of festival is that, and who are the people? Right now I don't see how this is in scope. --Sanandros (talk) 14:32, 21 August 2012 (UTC)

Assigned to Progress Bot name Category

VOA pronunciation sound files

Voice of America has a great pronunciation guide with sound files for 2200 hard-to-pronounce names, places, etc. The sound files are PD as US govt works. These would be great additions to many Wikipedia articles. The pronunciation guide is here. These would need to be downloaded, converted to OGG, and uploaded. Thoughts? Calliopejen1 (talk) 18:41, 12 September 2010 (UTC)

Opinions

Assigned to | Progress | Bot name | Category
User:Smallman12q | Commons:Bots/Requests/Smallbot 9 | User:Smallbot |

It looks like there are ~6500 entries under "list lookup". The sounds seem to be good. The conversion could easily be done with ffmpeg. The MP3s don't seem to be more than 15 kB each, so the total upload would be around a hundred MB. In addition to the MP3, there is information on name, country, pronunciation, and notes which could be used to categorize them. Provided the entries are indeed PD, they could easily be batch uploaded. For naming, you could use VOA-name. Smallman12q (talk) 23:44, 27 December 2010 (UTC)
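A minimal sketch of the conversion and naming step described above, assuming ffmpeg is installed and built with the libvorbis encoder. The function names are hypothetical, and the "VOA-name" pattern is only the suggestion from this comment, not an agreed convention. The commands are built but not executed here.

```python
import re


def commons_name(entry_name):
    """Build a 'VOA-name' style filename from a pronunciation guide entry."""
    cleaned = re.sub(r"\s+", " ", entry_name.strip())
    return f"VOA-{cleaned}.ogg"


def ffmpeg_command(mp3_path, ogg_path):
    """Argument list that transcodes an MP3 file to Ogg Vorbis with ffmpeg."""
    return ["ffmpeg", "-i", mp3_path, "-c:a", "libvorbis", ogg_path]


def estimated_total_mb(entries=6500, max_kb_per_file=15):
    """Upper bound on the source size, using the figures quoted above."""
    return entries * max_kb_per_file / 1000


print(estimated_total_mb())  # 97.5, i.e. "around a hundred MB"
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` once the MP3s are on disk. Names containing characters that are not allowed in Commons titles would still need extra cleaning.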

How can you really say they are PD-VOA? Who are the authors? I'd upload them under PD threshold of originality.
I can confirm that they are the work of a VOA employee. With the rollout of the improved voa-pronunciation-guide-reimagined-for-2013/ , we could confirm with an OTRS email. [16] Use "PD-USGov-VOA" and Category:VOA pronunciation. Slowking4†@1₭ 21:40, 30 April 2013 (UTC)
I've filed a BRFA at Commons:Bots/Requests/Smallbot 9. Smallman12q (talk) 20:50, 2 May 2013 (UTC)
Great. A regular refresh is a good idea. I've been in touch with the maintainers of the list; they are indeed PD-VOA. --SJ+ 23:08, 12 May 2013 (UTC)

Population distributions of Japan

I would like to upload images from this category. The images in question are population distributions of various Japanese cities, towns and villages. They are used, for example, in this article or this one. I've uploaded a bunch of samples: 1, 2, 3, the full list. Claymore (talk) 14:50, 6 August 2010 (UTC)

Opinions

Looks very good! Nothing comes to mind to change here. At jawp however you should add {{NowCommons}} to the images and replace all usage so the admins at jawp can easily delete the files. Multichill (talk) 17:58, 6 August 2010 (UTC)

They depend on a template system that requires the files to be named "Demography(xxxxx).svg". I'll see if I can convince them to move to the template implementation I created for ruwp. Claymore (talk) 07:07, 7 August 2010 (UTC)
A template system which is based on the names of files is sooooo broken. Multichill (talk) 09:03, 7 August 2010 (UTC)


Assigned to | Progress | Bot name | Category
Claymore | | ClaymoreBot | Population distribution of Japan

The Tansey Collection of Miniatures

Hi. The Tansey Collection of Miniatures has a large collection of 17th-, 18th- and 19th-century miniature portrait paintings in high resolution. The paintings are definitely within our scope and would be a great addition to Commons. I have therefore uploaded some of them here, but since there are so many and the frames need to be cropped to make them eligible for PD-Art, some help would be appreciated. Cheers —P. S. Burton (talk) 17:25, 29 July 2010 (UTC)

Opinions

  • This sounds like it could end up being a situation similar to the one that developed with the UK National Portrait Gallery. Have you, as a courtesy, considered contacting the curators of the collection before starting a systematic process such as this? I also have objections to this based on the cropping, but that is a separate issue from batch uploading, so I will raise it back at the Village pump (though any need for cropping will make automation difficult or impossible here, especially for the circular and oval miniatures, which is most of them). Carcharoth (Commons) (talk) 06:29, 31 July 2010 (UTC)
Assigned to Progress Bot name Category

Piqs.de

We could have a bot upload images from http://www.piqs.de/. It is a site like Flickr, but all images are licensed under http://creativecommons.org/licenses/by/2.0/de/deed.de and are therefore all OK for Commons.

I created a category for the images and a template to use {{Piqs}}. It needs a better picture but the biggest problem is which images we should upload. We could upload ALL images or have users SELECT images. Perhaps we could make a bot like the one we use to upload images from Flickr. Suggestions? Opinions? --MGA73 (talk) 20:39, 24 July 2010 (UTC)

Nice page. For the initial import, my suggestion is to parse the subpages of the top pictures or here or here or here. If possible, watch for new files there in the future (e.g. via the RSS feeds provided). That way we will get only the best and not all the others. But this only makes sense if it is done at intervals, not just once. Another hint: the bot should log in to get the highest resolution (or maybe there is a workaround to download the original?). --Slick (talk) 14:09, 11 August 2012 (UTC)

Opinions

Assigned to Progress Bot name Category

Pearson Scott Foresman SVG files

Users at the Open Clip Art Library have created many SVG versions of line drawing files by Pearson Scott Foresman here. They should be uploaded with the DerivativeFX tool, and the raster version tagged with Template:SupersededSVG. File:Catfish (PSF).svg is one file I have uploaded so far; use it as a basis for formatting new SVG upload filepages. --Siddharth Patil (talk) 15:13, 28 June 2010 (UTC)

Opinions

Assigned to Progress Bot name Category

Weather maps

The Hydrometeorological Prediction Center provides daily weather maps of the United States from 2003 to the present. These are high-quality and educational, and are able to be used in galleries, articles and other content pages on several projects. All strictly {{PD-USGov}}. This is just a proposal for now, rather than an actual request, to see what folks think. –Juliancolton | Talk 16:54, 19 March 2010 (UTC)

Opinions

I don't bite! :) –Juliancolton | Talk 00:22, 24 May 2010 (UTC)
If you feel like the images would indeed be useful, then I am happy to upload them to Commons. There are exactly 18490 images to be downloaded, counting from September 1, 2002 until today, October 15, 2012. I'll let you know as soon as I have them on my HDD. odder (talk) 14:20, 15 October 2012 (UTC)
Looks like the Hydrometeorological Prediction Center doesn't publish its maps in advance, so I only managed to get the maps until October 14, 2012. There are 14,841 .gif files to be uploaded, and they're around 650 MiB in total. odder (talk) 07:50, 19 October 2012 (UTC)
Taking a look. It would be interesting to do this jointly; we can lay out the process as we go along. No hurry on this one, so let's consider this a slow and careful project. -- (talk) 14:15, 21 May 2013 (UTC)
I think there are five different files here, for (almost) every day since September 1, 2002: maxmin, colormaxmin, dwm500_test, precip and stnplot. (Two files from August 10, 2005 are missing, though: 1, 2). odder (talk) 15:27, 21 May 2013 (UTC)
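The figure of 18,490 images quoted above is consistent with five map types per day when both end dates are counted. A quick check, using the five type names listed in this thread (the function name is made up for illustration):

```python
from datetime import date

# The five file types named in the comment above.
MAP_TYPES = ("maxmin", "colormaxmin", "dwm500_test", "precip", "stnplot")


def expected_map_count(first_day, last_day, types=MAP_TYPES):
    """Number of files if every type exists for every day, ends inclusive."""
    days = (last_day - first_day).days + 1
    return days * len(types)


print(expected_map_count(date(2002, 9, 1), date(2012, 10, 15)))  # 18490
```

This does not subtract the two files reported missing for August 10, 2005.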

Coordination

  1. Identify key fields. I'm not sure what can be drawn from the source. Source file names look like "colormaxmin_20020901.gif" and there appear to be 5 different types of graphic that can be uploaded. So I guess the variables that can be pulled into a data ingestion template on upload are just:
    1. {date} the day the graph represents (inferred from the file name)
    2. {type} the graph type (also inferred from the file name), this may be transcluded with a standard plain English explanation of one of the five file types:
      1. maxmin - Max-min Temperature Map
      2. colormaxmin - Color Max-min Temperature Map
      3. dwm500_test - 500-Millibar Height Contour Map
      4. precip - 24-hr Precipitation Map
      5. stnplot - Surface Weather Map
      Static text can include the permission statement from source, the standard source URL where the images can be looked up by date, the equivalent of author and any credits that are appropriate.
    Consequently if I upload from odder's cache of images, then there is no need to data-mine the source website. Other suggestions? -- (talk) 14:46, 21 May 2013 (UTC)
  2. Set up ingestion template. Draft started at User:Fæ/dailywxmap. Moved to Template:NOAA-dailywxmap
  3. Agree standard filename structure. Perhaps "Weather map <ISOdate> <typename> NOAA.png"?
    How about <ISO date> <typename> NOAA.png? For instance, 2013-05-19 Surface Weather Map NOAA.png (or Min-Max Temp Map, 500-Millibar Height Contour Map, 24-hr Precipitation Map)? odder (talk) 16:18, 21 May 2013 (UTC)
    Looks good, let's go with that so long as it is accurate and navigable (i.e. we might want to give links to next-map, last-map in the template or in parent categories; one category per month might be useful). -- (talk) 16:21, 21 May 2013 (UTC)
  4. Convert GIFs to PNGs.
    Done, with the exception of dwm500_test_20031009, which appears corrupted. (I couldn't even open the file on my computer.) odder (talk) 17:07, 21 May 2013 (UTC)
  5. Run test sample upload.
    Done - first 3 days worth (15 maps) available at Category:NCEP 2002 weather maps.
    Extended sample of the first 3 months of images uploaded, over 500 files, using Noaabot.
  6. Set up special bot account.
    User:Noaabot created. Flag requested. Full upload will start once the bot flag is given.
    (Optional) Integrate the gif->png conversion into the Python script (in OSX this can probably be a simple sips command) to make future regular updates easy, perhaps an automated weekly or monthly run.
  7. Extend to pdfs
    There are a number of recent PDF summaries at pdffiles. This appears to be a limited backlog: weekly summaries from week 49 of 2012 through to week 20 of 2013, and then some recent daily summaries for the remaining days (the files are named DWM<2 digit week><2 digit year>.pdf and they are in black and white as well as colour). There is no archive of past daily summaries; they appear to get bundled into the weekly ones. These are potentially valuable because the maps embedded in the PDFs are vector maps rather than GIFs, so they are worth uploading. Potentially the maps might be un-embedded, or the NCEP might release them online so we can upload directly.
    Naming:
    Weekly pdf summaries: <4 digit year> week <week number> Daily Weather Map <color> summary NOAA.pdf
    "color" will only be included if in color.
    (Parked for the moment, these effectively duplicate the pages in the weeklies) Daily pdf summaries: <ISO date> Daily Weather Map summary NOAA.pdf
    Noaabot is uploading these to Category:NCEP weekly weather maps.
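The filename mapping agreed in steps 1 and 3 can be sketched as follows. This is only an illustration, not Noaabot's actual code: the function names are hypothetical, and it assumes the plain-English type names from step 1 are used verbatim in the target filename.

```python
import re
from datetime import datetime

# Plain-English names for the five source file types listed in step 1.
TYPE_NAMES = {
    "maxmin": "Max-min Temperature Map",
    "colormaxmin": "Color Max-min Temperature Map",
    "dwm500_test": "500-Millibar Height Contour Map",
    "precip": "24-hr Precipitation Map",
    "stnplot": "Surface Weather Map",
}

# Source names look like "colormaxmin_20020901.gif"; the type itself may
# contain an underscore (dwm500_test), so the date is anchored at the end.
SOURCE_PATTERN = re.compile(r"^(?P<type>[a-z0-9_]+)_(?P<date>\d{8})\.gif$")


def parse_source_name(filename):
    """Return (date, type) inferred from a source GIF filename."""
    match = SOURCE_PATTERN.match(filename)
    if match is None or match.group("type") not in TYPE_NAMES:
        raise ValueError(f"unrecognised source filename: {filename!r}")
    day = datetime.strptime(match.group("date"), "%Y%m%d").date()
    return day, match.group("type")


def commons_filename(filename):
    """Build '<ISO date> <typename> NOAA.png' for a source GIF filename."""
    day, map_type = parse_source_name(filename)
    return f"{day.isoformat()} {TYPE_NAMES[map_type]} NOAA.png"


print(commons_filename("colormaxmin_20020901.gif"))
# 2002-09-01 Color Max-min Temperature Map NOAA.png
```

Raising on unknown names means a corrupted or unexpected file stops the run instead of being uploaded under a wrong title.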
Assigned to | Progress | Bot name | Category
Fæ and odder | Beta test - first month of images | Noaabot | Category:Images from NOAA uploaded by Noaabot
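For the weekly PDF summaries in step 7, the renaming rule could look like the sketch below. Several points are assumptions: the two-digit year is read as 20xx, the week number is written without zero padding, and whether a file is in colour is passed in by the caller because the thread does not say how the source distinguishes the two variants. The function name is hypothetical.

```python
import re


def weekly_pdf_name(source_name, in_color):
    """Map 'DWM<week><year>.pdf' to the weekly summary naming scheme."""
    match = re.fullmatch(r"DWM(\d{2})(\d{2})\.pdf", source_name)
    if match is None:
        raise ValueError(f"unrecognised weekly summary name: {source_name!r}")
    week = int(match.group(1))
    year = 2000 + int(match.group(2))  # assumption: all years are 20xx
    parts = [str(year), "week", str(week), "Daily Weather Map"]
    if in_color:
        parts.append("color")  # only included for colour files
    parts += ["summary", "NOAA.pdf"]
    return " ".join(parts)


print(weekly_pdf_name("DWM4912.pdf", in_color=False))
# 2012 week 49 Daily Weather Map summary NOAA.pdf
```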

Old Book Art

http://www.oldbookart.com/ This site has tons of old public domain book illustrations. If a bot can upload them, I'll happily categorize them. Rocket000 (talk) 14:31, 21 January 2010 (UTC)

After reading http://www.oldbookart.com/about/, I think it's best to contact him and see if we can do some sort of partnership. Are you willing to contact him? Multichill (talk) 22:15, 23 May 2010 (UTC)
The images are either released as CC-by-sa or public domain. I think he specifies that he would like as a courtesy a link back to his website, so I think that would suffice for this upload.--Diaa abdelmoneim (talk) 07:22, 14 September 2010 (UTC)
Assigned to Progress Bot name Category

US Coast Guard

Like the other US federal government sites, this site contains lots of nice images; see the gallery. Multichill (talk) 21:45, 16 January 2010 (UTC)

Opinions

  • I think these would be great to upload. Definitely public domain. As I'm getting into batch uploads, I'm willing to work on these. -Aude (talk | contribs) 21:25, 9 March 2011 (UTC)

Support: like the Navy pictures, these are also fine for us. --Sanandros (talk) 14:40, 21 August 2012 (UTC)

Assigned to Progress Bot name Category

University of Washington Digital Collections

The same algorithm applied to Commons:Batch uploading/Freshwater and Marine Image Bank can be used on several other UW collections. I'll list some here, with the reason why the images would be PD.

There are many more that could be checked.--Diaa abdelmoneim (talk) 17:54, 17 October 2009 (UTC)

Opinions

Assigned to Progress Bot name Category

NOAA Photo Library

The FEMA request got me started. NOAA has a nice set of images at http://www.photolib.noaa.gov/ . I'm not sure how many images we're talking about, but at least a couple of thousand. Multichill (talk) 20:09, 14 October 2009 (UTC)

See the catalog of images.

Opinions

If possible, go ahead with it since there haven't been any objections. –Juliancolton | Talk 16:55, 19 March 2010 (UTC)

It does sound good. -- User:Docu at 19:42, 2 May 2010 (UTC)

Some or all of these images don't have metadata, including the dates of when they were taken. --O (висчвын) 20:04, 07 August 2010 (GMT)

Hmm, are they really free? Some of them have an author who does not work for NOAA directly, but for a university that takes part in the project. --Sanandros (talk) 14:44, 21 August 2012 (UTC)

Assigned to Progress Bot name

Images from Beinecke's collections

One more wonderful collection with lots of PD images: http://beinecke.library.yale.edu/digitallibrary/ has 200,000 digitized images of photographs, illuminated manuscripts, maps, works of art, and books from the Beinecke's collections --Butko (talk) 08:50, 16 April 2009 (UTC)

Did you contact them? Did you get a release? Or is this merely a suggestion? That shouldn't go here, imho. Nice collection though; we should contact them to get some nice images. Multichill (talk) 14:05, 7 June 2009 (UTC)
I would like to help out on the acquisition of images of this library. I wanted to send an e-mail but thought it would be best if we work together on a draft. --Diaa abdelmoneim (talk) 14:59, 7 June 2009 (UTC)
Ok. As discussed on irc: You'll contact the library. Please keep me posted. Multichill (talk) 15:07, 7 June 2009 (UTC)
Any update on this one? Multichill (talk) 23:14, 4 September 2009 (UTC)
I emailed them several times but they didn't reply. --Diaa abdelmoneim (talk) 23:18, 4 September 2009 (UTC)
  • User:JovanCormac seems to have started uploading the Detroit Company images. Maybe the batch should be split into many parts then each uploaded on its own.--Diaa abdelmoneim (talk) 17:06, 16 October 2009 (UTC)

Can this be removed from the list (Commons:Batch uploading)? -- RE rillke questions? 18:27, 4 June 2012 (UTC)

Images from World Digital Library

New site with PD images: http://www.wdl.org. It contains 1170 items --Butko (talk) 06:52, 22 April 2009 (UTC)

User:Sj has shown interest in working on this upload. Looks like a very nice collection. Some points:
  • The items have an id (http://www.wdl.org/en/item/100/), so they are easy to loop over
  • The description of the items is available in a lot of languages; we should use that
  • Lots of metadata is available; this should make categorization easier
  • One item can contain multiple files. We should be aware of that
  • Files are available in the TIFF file format. We should either have TIFF thumbnails or upload both the TIFF and a JPG version (transcoding!)
  • Experience and code gained with the usgov uploads should be (re)used
  • Some items have curator videos, which might be fun to upload too
Multichill (talk) 14:13, 8 November 2009 (UTC)
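Looping over item ids, as suggested in the first two points above, might start from something like this sketch. Only the English URL pattern comes from the example above; using the same path with other language codes, and the assumption that ids fall in a simple numeric range, are guesses that would need checking against the site. The function names are hypothetical.

```python
def item_url(item_id, lang="en"):
    """URL of one item page, following the /en/item/100/ example above."""
    return f"http://www.wdl.org/{lang}/item/{item_id}/"


def candidate_urls(first_id, last_id, langs=("en",)):
    """Yield (id, language, URL) for every id in an inclusive range.

    Not every id necessarily exists, so a scraper would have to skip
    pages that return an error.
    """
    for item_id in range(first_id, last_id + 1):
        for lang in langs:
            yield item_id, lang, item_url(item_id, lang)


for item_id, lang, url in candidate_urls(100, 101, langs=("en", "fr")):
    print(item_id, lang, url)
```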
Aside: There's a lot of interest in using data from how these images are used in encyclopedia articles, and how traffic is driven to the original archives, to inspire more libraries to take part in WDL. +sj + 14:14, 8 November 2009 (UTC)

Any progress? -- RE rillke questions? 18:29, 4 June 2012 (UTC)

Thanks for the reminder. They've done a batch of updates recently; I'll see if I can get a dump next week before finding a suitable scraper. --SJ+ 06:52, 21 June 2012 (UTC)

Maps from Ryhiner Collection

Available from www.stub.unibe.ch/stub/ryhiner/. I've been dealing with this collection for some time (see this file for an example). The collection consists of "over 16000 high resolution images: maps, town plans and topographical views from the 16th to the early 19th century". So, if this statement can be taken at face value, there is no copyright problem, because these maps are already in the public domain and, being 2D works, their digital copies are also PD. If that is correct, the whole collection could be uploaded to Commons by a bot. The maps are available in high resolution using Zoomify (see the example map on their site). Tm (talk) 13:20, 22 April 2009 (UTC)

Opinions

Looks like a great collection. Is it possible to access the source files? Did you try contacting them? Multichill (talk) 14:03, 7 June 2009 (UTC)

Sorry for the delayed answer. To answer your first question, I don't know if it's possible to get online access to their source files, and I am not very tech-savvy. I also didn't try to contact them. In your opinion, what are the next steps to take? Tm (talk) 01:25, 15 June 2009 (UTC)

I sent an email today asking for their permission to make this batch upload. I thought that asking at this stage whether their source files are available online would be too soon. Tm (talk) 15:10, 2 July 2009 (UTC)
Sorry about not responding sooner; looks like I forgot to watchlist this page. We're in the non-tech phase. Try to contact them and see if they like it. If that turns out all right, we can start the actual data retrieval and uploading part. Writing a general story about this is still on my list. I'll see if I can make a first version. Multichill (talk) 16:59, 2 July 2009 (UTC)

Just a quick update: I received an automatic reply saying that the person I emailed is absent, and I forwarded my message to an address given in that reply. If and when I receive an answer, I'll update this page. Tm (talk) 00:48, 3 July 2009 (UTC)

I received an answer and have already replied to it, but I am waiting for permission to republish the email or the contents of the answer that I received. Tm (talk) 04:10, 12 July 2009 (UTC)

You can always use OTRS if you want to keep it private. Multichill (talk) 10:56, 12 July 2009 (UTC)

The question isn't exactly about privacy, but more about building trust between the parties after the NPG case (I fully support Dcoetzee), which these people might have heard about and which might have given them a bad impression of Wikimedia Commons and its users. I can say, without breaking the confidentiality of the correspondence, that the answer I received was slightly positive about the possibility of cooperation, but the person who answered raised some questions, doubts and remarks about this possible cooperation that need to be addressed (I gave my opinion), and I requested that the answer be published so that more people can give their input. Despite this, I received an automatic reply to my second email saying that I might not receive a second email until 10 August. Tm (talk) 07:39, 19 July 2009 (UTC)

Any update on this one? Multichill (talk) 23:13, 4 September 2009 (UTC)

Not much. I received an email on 11 August saying that, due to the holidays of the person I had emailed, the answer would be delayed, but I have not received anything since. Tm (talk) 23:43, 4 September 2009 (UTC)

I sent an email today, as I had only received an email on 15 September telling me that the person I contacted had contacted the library but was still waiting for an answer. In this email I asked whether there is an answer yet. When I receive an answer, I'll update this page. Tm (talk) 04:05, 21 November 2009 (UTC)
I received an email some days ago from the same person I have been in contact with from the beginning, saying that there still isn't any answer from the library responsible for this collection to the enquiry I made some months ago. Comments? Tm (talk) 13:11, 6 December 2009 (UTC)
I have to report that the library that keeps this collection has unfortunately decided to reject the request made some months ago because, according to the person I exchanged emails with, this request "lacks a formal application and there is no treatment needed because the maps are already available online for the public." Tm (talk) 23:34, 14 January 2010 (UTC)
Ok. Looks like we're going to scrape their site after all. I'll have a look at it. Multichill (talk) 23:51, 14 January 2010 (UTC)
These images are easily scrapable with a bit of regex and looping. The various galleries are listed here; each gallery has about 40 images of the same subject, probably from different periods. Next to each gallery the name of the place is listed, so the category could simply be something like Category:Scotland maps. We've done uploads from Zoomify before, so the experience is there. --Diaa abdelmoneim (talk) 10:07, 17 October 2009 (UTC)
I had a look at {{PD-art}}. Seems to work in Switzerland so no NPG issues ;-)
The plan:
  • Loop over the galleries at http://www.zb.unibe.ch/maps/ryhiner/sammlung/?group=volume (does that contain all maps?)
  • Loop over all images in a gallery
  • For each image pull the metadata. Several sources. Have to see what information is useful
  • Pull the image with some dezoomify tool
  • Generate filename, description and categories
  • Upload to Commons
What metadata to use exactly is somewhat tricky. Also, the dezoomify step is a bit of extra work. Multichill (talk) 15:43, 15 January 2010 (UTC)
Multichill, might I volunteer my dezoomify.py script, which will take in a web page holding a zoomify Flash object, regex for the location of the image tiles automatically and download and recompose the highest zoom level available. Have a look at: this page, which has a full code listing. Example of its work can be seen here. I hope it's useful. Inductiveload (talk) 02:58, 19 February 2010 (UTC)
Sure. Looks nice at first glance, but you should split it up into functions and use objects so it can be used in other programs (like pywikipedia). It's probably best to make a lib part and a command-line part (which uses the lib part). What license is your code under? Do you need some help restructuring it? Did you take a look at this script when you wrote your code? Multichill (talk) 09:19, 19 February 2010 (UTC)
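To give an idea of what the "lib part" of a dezoomify tool has to compute, here is a minimal sketch of the tile bookkeeping. It is not taken from the dezoomify.py script mentioned above. It assumes the commonly described Zoomify layout: 256-pixel tiles, each tier half the size of the next, tiles named <tier>-<column>-<row>.jpg, and tiles stored 256 per TileGroup folder counting from the smallest tier. The exact rounding used by a given site should be checked against its ImageProperties.xml and real tiles.

```python
import math


def tier_grid_sizes(width, height, tile_size=256):
    """Return [(columns, rows), ...] from the smallest tier to the largest."""
    tiers = []
    w, h = float(width), float(height)
    while True:
        tiers.append((math.ceil(w / tile_size), math.ceil(h / tile_size)))
        if w <= tile_size and h <= tile_size:
            break  # the whole image now fits in a single tile
        w, h = w / 2, h / 2
    tiers.reverse()
    return tiers


def tile_path(tiers, tier, col, row, tiles_per_group=256):
    """Relative path of one tile, e.g. 'TileGroup0/2-3-2.jpg'."""
    cols, rows = tiers[tier]
    if not (0 <= col < cols and 0 <= row < rows):
        raise ValueError("tile coordinates outside this tier")
    # Tiles are counted tier by tier, row by row, to find the group folder.
    preceding = sum(c * r for c, r in tiers[:tier])
    index = preceding + row * cols + col
    return f"TileGroup{index // tiles_per_group}/{tier}-{col}-{row}.jpg"


tiers = tier_grid_sizes(1024, 768)
print(tiers)                      # [(1, 1), (2, 2), (4, 3)]
print(tile_path(tiers, 2, 3, 2))  # TileGroup0/2-3-2.jpg
```

Downloading every tile of the last tier and pasting it at (col * 256, row * 256) recomposes the full image. Keeping this arithmetic in plain functions makes it reusable from an upload bot as well as from a command-line wrapper.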

Any progress? -- RE rillke questions? 18:25, 4 June 2012 (UTC)

Assigned to Progress Bot name

Freshwater and Marine Image Bank

The Freshwater and Marine Image Bank from the Digital Collections at the University of Washington states:

"Materials in the Freshwater and Marine Image Bank are in the public domain. No copyright permissions are needed. Acknowledgement of the Freshwater and Marine Image Bank as a source for borrowed images is requested."

The entire library can be browsed here: [17]

These photos would be useful in the many marine and freshwater life articles of the Wikipedias. The images are encyclopedic and are very high quality.

The digital collection has been "closed" since June, but the site is still accessible. My guess is the site will shut down within a few days (whenever their webspace subscription ends).

Any way someone could set up a batch for this? Thanks, Bob the Wikipedian (talk) 19:46, 13 July 2009 (UTC)

Opinions

Um...no responses yet? Perhaps I should revisit the fact this database isn't supposed to be up much longer. Either we take the images now or they might not be there a few months from now. Bob the Wikipedian (talk) 01:23, 28 July 2009 (UTC)

I would like to echo the great potential utility of the UW image database! In many of the subjects of particular interest to me (e.g. North Pacific marine ecology, marine mammals, Pacific salmon, sturgeon species, indigenous people) the collection is a real goldmine. Somebody, whoever is out here making such magical batch uploads possible, please respond! Best, Eliezg (talk) 21:13, 9 August 2009 (UTC)
Assigned to Progress Bot name

Zorger

Message below was posted on the Commons:Village pump --Jarekt (talk) 19:23, 18 September 2009 (UTC)

Looks like a batch upload could be useful here: public-domain.zorger.com. Tekstman (talk) 18:03, 18 September 2009 (UTC)

I browsed the site and they seem to have a few hundred images scanned from old books, with clear sources and their own PD justification. Some of those images might be useful, like those. Someone should match them to our PD licenses. --Jarekt (talk) 19:23, 18 September 2009 (UTC)

Opinions

Assigned to Progress Bot name

Mollusca by Jan Delsing

Photos of shells of Mollusca (143 bivalves, 1469 gastropods) by Jan Delsing from http://www.biolib.cz/en/galleryuser/?uid=3973

The only uploaded example is: http://commons.wikimedia.org/wiki/File:Pythia_scarabaeus_shell.jpg

The best names of files would be: BINOMIAL NAME shell.jpg

Example of filenames:

  • File:Pythia scarabaeus shell.jpg
  • File:Pythia scarabaeus shell 2.jpg
  • File:Pythia scarabaeus shell 3.jpg
  • File:Pythia scarabaeus shell 4.jpg
  • and so on.

Thanks. --Snek01 (talk) 18:09, 6 October 2009 (UTC)
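The naming pattern requested above ("BINOMIAL NAME shell.jpg", then "shell 2", "shell 3" and so on) can be generated mechanically. A small sketch, with a made-up function name:

```python
def shell_filenames(binomial_name, photo_count):
    """Return file page names for all photos of one species, in order."""
    names = []
    for n in range(1, photo_count + 1):
        suffix = "" if n == 1 else f" {n}"  # the first photo has no number
        names.append(f"File:{binomial_name} shell{suffix}.jpg")
    return names


print(shell_filenames("Pythia scarabaeus", 4))
```

A real bot would also have to check whether a name is already taken on Commons, since File:Pythia scarabaeus shell.jpg already exists.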

If this information could help, then EOL has cooperation with biolib.cz and EOL takes public domain images and Creative Commons images from this source automatically. --Snek01 (talk) 10:18, 12 November 2009 (UTC)

Opinions

Assigned to Progress Bot name