Commons:Bots/Requests/File Upload Bot (AntWeb)

From Wikimedia Commons, the free media repository

AntWeb Bot

Operator: User:DaveThau

Bot's tasks for which permission is being sought: Initial upload of 30,000 high-quality ant images from AntWeb.org, followed by considerably smaller monthly updates. The bot is a slight variant of User:File Upload Bot (Eloquence).

Automatic or manually assisted: Automatic

Edit type (e.g. continuous, daily, one-time run): Large one-time run, then monthly

Maximum edit rate (e.g. edits per minute): 6

Bot flag requested: (Y/N): Y

Programming language(s): Perl

--File Upload Bot (AntWeb) (talk) 15:27, 12 October 2009 (UTC)

Discussion

Looks like a nice website to copy. Some points to improve the upload:

  1. It looks like the website is also available under the {{GFDL}}, so you should update {{AntWeb permission}}.
  2. Do you have the scientific name for each ant? You should take a look at the trick I used at User:Multichill/Starr. This will add {{Check categories}} when the species category doesn't exist. Maybe you could do something similar.
  3. Do you have a list of categories you're going to use with the number of images which will end up there? Might be useful to create the tree before you do the big upload
  4. I made Category:AntWeb Pictures hidden because it's a source category. Maybe it should be renamed to Category:Images from AntWeb
  5. Don't use your bot account to do manual edits.
  6. Use Commons:Batch uploading for reference and please create a subpage describing your project. See for example Commons:Batch uploading/Starr images

Multichill (talk) 20:19, 12 October 2009 (UTC)
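
The check-categories idea from point 2 could be sketched roughly as follows. This is only an illustration in Python, not the code behind User:Multichill/Starr or the AntWeb bot: the function name `category_wikitext` is hypothetical, `existing_categories` is assumed to be a set of category names fetched beforehand, and falling back to the genus category is an assumption as well.

```python
def category_wikitext(genus, species, existing_categories):
    """Return the category lines for one image.

    existing_categories is assumed to be a set of category names
    (without the "Category:" prefix) that already exist on Commons.
    """
    species_cat = f"{genus} {species}"
    if species_cat in existing_categories:
        return f"[[Category:{species_cat}]]"
    # The species category is missing: flag the page for human review and
    # fall back to the genus category (this fallback is an assumption).
    return "{{Check categories}}\n" + f"[[Category:{genus}]]"
```

With this shape, an image whose species category has not been created yet still lands somewhere sensible and is marked for a human to check.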

Thanks, Multichill. I've updated the antweb.org site to correctly state the license we're using. I would like to create the category tree before beginning the upload, but I'm not sure how to do that. I have species, genus, and subfamily levels of the tree. We're only uploading images for ants which have valid scientific names. I don't currently have a list of all the categories with numbers, but I could create that easily. I'll look at your User:Multichill/Starr page for the check categories trick. I've also created Commons:Batch uploading/AntWeb_images and linked it to the main batch upload page under new requests. Thanks for your feedback! Davethau (talk) 02:42, 13 October 2009 (UTC)
Just a note: there's a category for unidentified ants, so I guess you could upload the remaining images for which you don't have valid scientific names, and put them there. Who knows, maybe someone, somewhere knows which species an image depicts, and could point that out in the file's page :) --Waldir talk 10:57, 13 October 2009 (UTC)
I took a look at one of your uploads, it contains:
  • Order: Hymenoptera
  • Family: Formicidae
  • Subfamily: Myrmicinae
  • Genus: Crematogaster
  • Species: lobata
Do you have this in a database or something like that? You can process this data to calculate occurrences and to generate the categories. I can help you with that if you like.
Something else. It won't hurt if you confirm permission to use all these files with Commons:OTRS. Multichill (talk) 18:47, 13 October 2009 (UTC)
Really? On all 30,000+ images? Seems like overkill. The antweb.org site and all the images are already on record as being under cc-by-sa-3.0. Wouldn't this cause a problem for the OTRS? Davethau (talk) 20:20, 13 October 2009 (UTC)
No, don't add it to all images. See for example {{KIT-license}}. The template contains an otrs permission template, but this is not transcluded. The same could be done with {{AntWeb permission}}. It's just an extra to prevent any future problems. Please also take a look at my first question (about the categories). Multichill (talk) 08:21, 14 October 2009 (UTC)
Ok. I'll do that. Thanks for the offer on the categories. Everything is databased. I'd like to generate the categories if that's possible. Maybe send me details in email? Davethau (talk) 14:46, 14 October 2009 (UTC)
The easiest way is probably to discuss this on IRC. You can find me in #wikimedia-commons at freenode (use http://webchat.freenode.net/ if you don't have an IRC client installed). Multichill (talk) 21:00, 14 October 2009 (UTC)
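
The counting step suggested above (processing the database rows to calculate occurrences and generate the categories) might look something like this. It is a minimal sketch, assuming each image is available as a (subfamily, genus, species) row; the sample rows and the name `build_category_tree` are illustrative and not taken from the AntWeb database or the bot.

```python
from collections import Counter

# Illustrative rows, one per image: (subfamily, genus, species).
rows = [
    ("Myrmicinae", "Crematogaster", "lobata"),
    ("Myrmicinae", "Crematogaster", "lobata"),
    ("Myrmicinae", "Crematogaster", "rasoherinae"),
]

def build_category_tree(rows):
    """Count images per species category and record each category's parent."""
    images_per_species = Counter()
    parent = {}
    for subfamily, genus, species in rows:
        species_cat = f"{genus} {species}"
        images_per_species[species_cat] += 1
        parent[species_cat] = genus   # species category sits under its genus
        parent[genus] = subfamily     # genus category sits under its subfamily
    return images_per_species, parent

counts, parent = build_category_tree(rows)
for category, n in sorted(counts.items()):
    print(f"{category}: {n} image(s), parent category: {parent[category]}")
```

The resulting list gives the number of images per category, and the parent map is enough to create the tree before the big upload.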

← I'm concerned with the upload speed of this bot. I think we should respect things like maxlag and try not to slow down the site for everyone. I'm sorry if I sound like a pest for bringing this up twice now (the other time was on the Village Pump discussion), but nobody has responded to my previous mention of this topic (that I can see). Killiondude (talk) 20:01, 13 October 2009 (UTC)

A few minutes after I posted that, Dave changed the edit per minute rate. Thanks. :-) Killiondude (talk) 20:08, 13 October 2009 (UTC)
I didn't mention it because I don't think he can go this fast. One upload bot can't slow down the site. Multichill (talk) 20:14, 13 October 2009 (UTC)
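
For reference, the stated limit of 6 edits per minute works out to one edit every 10 seconds. A rough sketch of how a bot could enforce that spacing and pass the MediaWiki API's `maxlag` parameter is below. The `Throttle` class and `with_maxlag` helper are hypothetical names used for illustration, the value 5 is only an example, and none of this is taken from the AntWeb bot itself.

```python
import time

MAX_EDITS_PER_MINUTE = 6                      # rate stated in the request
MIN_INTERVAL = 60.0 / MAX_EDITS_PER_MINUTE    # seconds between edits

class Throttle:
    """Make sure at least min_interval seconds pass between edits."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last = None   # time of the previous edit, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now

def with_maxlag(params, maxlag=5):
    """Return a copy of the API request parameters with maxlag added."""
    merged = dict(params)
    merged["maxlag"] = str(maxlag)
    return merged
```

Calling `wait()` before each upload keeps the bot at or below the requested rate, and sending `maxlag` with each request lets the servers turn the bot away while replication lag is high.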

Some comments

  1. Having the order, family... in the description is redundant since it's already in the {{taxonavigation}} template that you're adding.--Diaa abdelmoneim (talk) 21:02, 13 October 2009 (UTC)
  2. Remove the text "CC-BY-SA-3.0" next to permission, since it's also redundant with the licensing template. The field would then just read "See below", which directs the user to the licensing section...--Diaa abdelmoneim (talk) 21:02, 13 October 2009 (UTC)
  3. Please add "== {{int:license}} ==" before {{AntWeb permission}} for a link to what licensing is in different languages...--Diaa abdelmoneim (talk) 21:02, 13 October 2009 (UTC)
  4. Please also add "== {{int:filedesc}} ==" before everything to indicate a summary.--Diaa abdelmoneim (talk) 21:02, 13 October 2009 (UTC)
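
Taken together, the four points above suggest a page layout along these lines. This is a hedged sketch only: the function `build_description_page` and its arguments are hypothetical, and the placement of the taxonavigation block between the two sections is an assumption rather than something settled in this discussion.

```python
def build_description_page(description, source, author, taxonavigation):
    """Assemble file-page wikitext in the order suggested above."""
    return "\n".join([
        "== {{int:filedesc}} ==",       # localized "Summary" heading
        "{{Information",
        f"|Description={description}",
        f"|Source={source}",
        f"|Author={author}",
        "|Permission=See below",        # license name is left to the template
        "}}",
        taxonavigation,                 # assumed position for {{taxonavigation}}
        "",
        "== {{int:license}} ==",        # localized "Licensing" heading
        "{{AntWeb permission}}",
    ])
```

The order, family and so on stay out of the description because the taxonavigation block already carries them.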
Ok, I made those changes on image File:Boloponera_vicans_casent0401737_profile_1.jpg. Looks good, I'll change the bot accordingly. Also, I put some links back to the antweb.org site into the description. I can't figure out how to get them there without the links being surrounded by brackets. Any ideas? Davethau (talk) 02:30, 14 October 2009 (UTC)
External links just need one set of brackets, but internal links ("wikilinks") need two sets of brackets. I made the change on the picture you listed in your comment above. Killiondude (talk) 02:44, 14 October 2009 (UTC)
Sorry Killiondude, you seem to have the same problem I did - the description is completely gone now :( Davethau (talk) 03:10, 14 October 2009 (UTC)
Oh, I see. The {{en}} template doesn't like links that have equal signs in them, so you have to put in {{en|1=}} for descriptions that have links. I've fixed that picture with this edit. Killiondude (talk) 04:32, 14 October 2009 (UTC)
Nice! Thanks. Davethau (talk) 04:38, 14 October 2009 (UTC)
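
For the bot, the fix described above amounts to always passing the description as the explicit first parameter. A small sketch, assuming a hypothetical helper name `english_description` and a placeholder URL that is not a real AntWeb address:

```python
def english_description(text):
    """Wrap a description in {{en}} using the explicit 1= parameter.

    Without "1=", an equals sign inside the text (for example in a URL
    query string) is read as a named template parameter, and the
    description disappears from the rendered page.
    """
    return "{{en|1=" + text + "}}"

# Placeholder external link in single brackets, with "=" in the query string.
link = "[http://example.org/specimen?name=casent0401737 specimen page]"
print(english_description("Profile view, see " + link))
```

Using `1=` unconditionally is harmless for descriptions without links, so the bot does not need to check for equals signs first.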
Since we're talking about links, why aren't you adding the link to the antweb staff page in the author name anymore? --Waldir talk 08:13, 14 October 2009 (UTC)
Yes. That would be good to add. Once you've done that, could you do a testrun of a couple of images. Multichill (talk) 08:21, 14 October 2009 (UTC)
I took it out because AntWeb has community generated content, so not all photographers are on our staff. So, except in cases where the person is staff, it would be misleading. I suppose I could check to see if the person is staff, and then put a link, but then what if that person leaves antweb? Are we then required to put an emeritus section on our staff page? All in all, it seemed problematic. Davethau (talk) 14:46, 14 October 2009 (UTC)
I see. The best would obviously be if the authors had personal webpages, blogs or whatever online profile they would like to be used for attribution / contact (I suppose most photographers have accounts on flickr or other photo sharing sites). If they do, it's a nice touch to add it. But I wouldn't ask you to go to the trouble of gathering that data -- not for now, at least. --Waldir talk 15:29, 14 October 2009 (UTC)
Multichill, aren't the 200 images already uploaded good enough to serve as a test run? I know some things changed, but it seems evident that Dave knows what he's doing, and besides these are minor changes. --Waldir talk 15:32, 14 October 2009 (UTC)
Dude, just a couple of images to see how all these changes work out. Nice to see and it speeds up the process. You know how long a bot approval can take here. Multichill (talk) 21:00, 14 October 2009 (UTC)
Ok, once I get the sense that things have settled down here, I'll run another 10 images. Davethau (talk) 04:41, 15 October 2009 (UTC)
  • Remove "CC-BY-SA-3.0" from the permission field of the information template.--Diaa abdelmoneim (talk) 21:11, 14 October 2009 (UTC)
Ok. I did it on the test image. Davethau (talk) 04:38, 15 October 2009 (UTC)
  • Why are you using this permission template and not just {{Cc-by-sa-3.0}}? If you want to have AntWeb's name stated there, create a source template like {{Fotothek-License}}, where the AntWeb source template would stand alone, with the license under it...--Diaa abdelmoneim (talk) 21:37, 15 October 2009 (UTC)
    • The template was changed by a couple other users (like me!). It should look fine now. Multichill (talk) 22:38, 15 October 2009 (UTC)
  1. What does "Fisher, 2006" mean in authority? I mean shouldn't this be better stated? — Preceding unsigned comment added by Diaa abdelmoneim (talk • contribs) 16 October 2009 (UTC)
Taken from en:Scientific name: "In scholarly texts, the main entry for the binomial is followed by the abbreviated (in botany) or full (in zoology) surname of the scientific authority – the scientist who first published the classification. If in the original description the species was assigned to a different genus from that to which it is assigned today, the abbreviation or name of the describer and the description date are set in parentheses." — raeky (talk | edits) 16:55, 16 October 2009 (UTC)
Yes, but couldn't there be a link to this authority? I mean, what's the use of an abbreviation if there are hundreds of "Fisher"s...--Diaa abdelmoneim (talk) 17:02, 16 October 2009 (UTC)
It's a specific reference to a specific author, the full reference is "Boloponera vicans Fisher, 2006: 113, figs. 1-4, 24 (w.) CENTRAL AFRICAN REPUBLIC." It's the standard authority for that species (Boloponera vicans) and can be found in MANY reference sites on the web, see. Anyone with a scientific background will understand authority abbreviations. — raeky (talk | edits) 17:08, 16 October 2009 (UTC)
Ok, now after all the fixes could there be a 10 edit test before approval?--Diaa abdelmoneim (talk) 17:12, 16 October 2009 (UTC)
Ok, I'll get it going some time today. Lots of meetings... not much time.. Davethau (talk) 18:14, 16 October 2009 (UTC)

Check 'em out. I just uploaded 9 images (e.g. http://commons.wikimedia.org/wiki/File:Crematogaster_rasoherinae_casent0136650_head_1.jpg) Davethau (talk) 19:19, 16 October 2009 (UTC)

  • Why would the label images be of use? I mean, the info is already in the images, right?--Diaa abdelmoneim (talk) 19:31, 16 October 2009 (UTC)
The label can be very informative. For example this label dates from 1903 and it shows that the ant is held in a museum at Harvard. Most importantly, the red labels show that this specimen is a special one that is used to represent the entire species. Davethau (talk) 21:43, 16 October 2009 (UTC)
  • Support Ok, I don't see anything wrong with the bot... --Diaa abdelmoneim (talk) 21:58, 16 October 2009 (UTC)
  • Support Everything seems fine to me :) --Waldir talk 08:37, 17 October 2009 (UTC)
  • Support All concerns are addressed now, right? That takes it from a nice upload to a great upload! Multichill (talk) 13:06, 17 October 2009 (UTC)
✓ Bot approved by EugeneZelenko 16:22, 17 October 2009 (UTC)