Commons:Bots/Requests/Open Access Media Importer Bot

From Wikimedia Commons, the free media repository

Open Access Media Importer Bot (talk · contribs)

Operator: Daniel Mietchen

Bot's tasks for which permission is being sought:

  • Upload media files that
    • are available from PubMed Central under Commons-compatible licenses as supplements of scholarly articles
    • have been converted to Commons-accepted file types
with all the necessary metadata.

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): 6 edits per minute

Bot flag requested: (Y/N): No

Programming language(s): Python

Daniel Mietchen - WiR/OS (talk) 01:46, 18 July 2012 (UTC)

Discussion

Sounds like a good idea to me. I chatted with Daniel in person last weekend and he explained to me what kind of media to expect. I am convinced it is within Commons' scope. A test run would be great to see whether the quality of the metadata (information template generation, categorization, etc.) is up to our standards. --Dschwen (talk) 04:08, 18 July 2012 (UTC)

OK, will report here when test run is done. -- Daniel Mietchen - WiR/OS (talk) 04:13, 18 July 2012 (UTC)
My main criticism so far is that the file names do not conform to Commons:File naming. They need to be meaningful to the reader (perhaps parse the figure/paper titles?). --99of9 (talk) 01:08, 31 July 2012 (UTC)
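The suggestion to derive names from figure and paper titles could look roughly like the following. This is only a sketch, not the bot's actual code: the function name `build_file_name`, its parameters, the 120-character limit and the exact set of replaced characters are all assumptions made for illustration.

```python
import re

# Characters MediaWiki rejects in page titles, plus colon and path
# separators, which are assumed here to be unsafe in upload names.
_FORBIDDEN = re.compile(r'[#<>\[\]|{}:/\\]')


def build_file_name(article_title: str, label: str, extension: str,
                    max_length: int = 120) -> str:
    """Build a readable file name from a paper title and a figure label.

    Hypothetical helper: the name pattern and length limit are assumptions.
    """
    stem = f"{article_title} - {label}"
    # Replace forbidden characters with spaces, then collapse whitespace runs.
    stem = _FORBIDDEN.sub(" ", stem)
    stem = re.sub(r"\s+", " ", stem).strip(" .")
    # Truncate overly long titles without leaving a trailing space or dot.
    stem = stem[:max_length].rstrip(" .")
    return f"{stem}.{extension}"
```

For instance, `build_file_name("Imaging of cells: a study", "Video S1", "ogv")` gives a name a reader can interpret without opening the file page.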
Please use a language template for the Source/Author/Permission fields. Why are the descriptions not created during upload? --EugeneZelenko (talk) 14:29, 1 August 2012 (UTC)
We have now tested the bot quite intensively and adapted it accordingly. What it has uploaded over the course of this month should have become increasingly compliant with Commons policies and standards, and so we invite you to have another look at it now. -- Daniel Mietchen - WiR/OS (talk) 09:43, 28 September 2012 (UTC)
The latest uploads look good. But please make sure to remove leading spaces in descriptions (such as here), as they cause the description to be rendered as a non-wrapping pre element! --Dschwen (talk) 17:42, 1 October 2012 (UTC)
Thanks - that's on the to-do list. I am cleaning up the initial uploads manually. -- Daniel Mietchen - WiR/OS (talk) 00:13, 2 October 2012 (UTC)
The leading whitespace issue is now fixed (sample upload). -- Daniel Mietchen - WiR/OS (talk) 14:40, 2 October 2012 (UTC)
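For context: in wikitext, a line that begins with a space is rendered as preformatted text, which is why leading spaces in descriptions matter. A minimal sketch of such a cleanup step is shown below; the function name `strip_leading_whitespace` is hypothetical, and this is not necessarily how the bot implements the fix.

```python
def strip_leading_whitespace(description: str) -> str:
    """Remove leading spaces and tabs from every line of a description.

    Hypothetical helper; assumes the description is plain wikitext.
    """
    return "\n".join(line.lstrip(" \t") for line in description.splitlines())
```

Stripping every line, rather than only the first one, also covers multi-line descriptions in which a later line starts with a space.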
I don't see any further issues standing in the way of granting this request. --Dschwen (talk) 17:16, 2 October 2012 (UTC)
Fixed. Test upload. -- Daniel Mietchen - WiR/OS (talk) 18:56, 16 October 2012 (UTC)
Thanks. --99of9 (talk) 03:18, 17 October 2012 (UTC)
  • The permission field is very wordy, which may cause tl;dr for some reusers. The template {{PLOS}} needs a parameter that allows the message to the uploader to be removed once the conditions are complied with. Since the bot always satisfies the conditions, the bot should then set that flag. Secondly, it is not customary to put an additional line "Copyright owner: Garwood et al" when the author= field has already been specified with full details. Are there any cases where these two fields will be different? --99of9 (talk) 03:18, 17 October 2012 (UTC)
We have removed the mention of the copyright owner (sample upload) and also simplified the {{PLOS}} template a bit, since the information contained in the note to uploaders is already stated on the relevant category pages. Note that far from all uploaded materials are from PLOS: for a file from BMC, for instance, see here. -- Daniel Mietchen - WiR/OS (talk) 20:15, 20 October 2012 (UTC)
That new PLOS upload looks good. The BMC examples often (always?) have a useless sentence at the start of their description. I wouldn't insist on the bot detecting this, but if it's easy, it will definitely reduce the human cleanup work. --99of9 (talk) 03:22, 21 October 2012 (UTC)
It seems to be always there at BMC, but never at any other source, and so far, we have tried to keep the code as publisher-independent as possible. So we could in principle introduce a line of the sort "if it's a BMC paper and starts with 'Additional file #number' then remove that line" but I am not sure we should. -- Daniel Mietchen - WiR/OS (talk) 05:21, 21 October 2012 (UTC)
We've removed "Additional file 1" and similar wording. Sample upload. -- Daniel Mietchen - WiR/OS (talk) 18:36, 21 October 2012 (UTC)
BMC sample upload.-- Daniel Mietchen - WiR/OS (talk) 19:01, 21 October 2012 (UTC)
Great. --99of9 (talk) 22:39, 21 October 2012 (UTC)
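The removal of the "Additional file #number" wording discussed above can be sketched as a publisher-independent prefix filter. The regular expression and the name `drop_additional_file_prefix` are assumptions for illustration; the exact wording the bot matches is not stated in this discussion.

```python
import re

# Assumed pattern: "Additional file", a number, and an optional "." or ":".
_ADDITIONAL_FILE_PREFIX = re.compile(
    r"^\s*Additional file\s+\d+\s*[.:]?\s*", re.IGNORECASE
)


def drop_additional_file_prefix(description: str) -> str:
    """Strip a leading 'Additional file N' marker, if present.

    Hypothetical helper; descriptions without the marker are returned as-is.
    """
    return _ADDITIONAL_FILE_PREFIX.sub("", description, count=1)
```

Because the pattern is anchored at the start of the description and needs no publisher check, descriptions from other sources that lack the marker pass through unchanged.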

My concerns are addressed. Unless someone has anything new to discuss, I suggest we approve the bot. --99of9 (talk) 22:39, 21 October 2012 (UTC)

✓ Approved --99of9 (talk) 04:59, 29 October 2012 (UTC)