Commons:Bots/Requests

From Wikimedia Commons, the free media repository

Shortcut: COM:BRFA

Bot policy and list · Requests to operate a bot · Requests for work to be done by a bot  · Requests for batch uploads

If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot

Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

JhealdBot (talk · contribs) (5)

Operator: Jheald (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

To rename multiple files, uploaded here from the British Library "Mechanical Curator" set on Flickr.

Automatic or manually assisted:

Automatic, based on prepared hand-reviewed batches of up to several hundred at a time.

Edit type (e.g. Continuous, daily, one time run):

In batches, with review after each batch.

Maximum edit rate (e.g. edits per minute):

12 (reduced from 60)

Bot flag requested: (Y/N):

JhealdBot already has the bot flag. But I think it would need the file-mover flag.

Programming language(s):

Perl, using MediaWiki::Bot

More detailed explanation:

For some time I have been working towards an upload of images identified as maps, found in the 1 million images extracted from old books that were uploaded by the British Library to Flickr.
These uploads would include categorisation based on locations inferred from the georeferencing campaign here, and meaningful file names based on map titles input as part of the georeferencing process. Here's a page showing some typical intended filenames:
Commons:British Library/MC maps batch 01 (GB counties)
(See also further pages in Category:MC_upload_prep_pages.) These filenames would also be similar to those used for many of the BL books (over 20,000 images in all) by User:Metilsteiner, for example those in Category:Finska_kriget_och_Finlands_krigare_1808-1809_(1897)_by_DANIELSON.
However, I took too long about it, and recently User:Artix Kreiger has uploaded just over 20,000 of the images (about 40% of the maps total) using Flickr2Commons, giving standard F2C filenames like these:
User:Jheald/Kreiger/UK IE/MC maps batch 01 (counties)
-- ie the book name (at some length) and page number, but not eg the date or what the file is actually about.
Rather than these, I believe that file names indicating what the map itself actually contains, along with book author, date, and volume / page number, would be more useful when eg reviewing thumbnails in a category or adding further categorisation with Cat-a-lot.
Therefore I would like permission to work through batch-by-batch to rewrite these filenames per my original intended scheme. That would also mean they would harmonise better with the remaining majority of files, which are still to be uploaded.


Jheald (talk) 17:34, 28 March 2018 (UTC)

Discussion

Since I was pinged, I wanted to say that the maps were already given these names by the British Library. None of the names were given by me (although I removed the part "Image taken from" while uploading). This invariably also uploaded several duplicates: files that had actually been rotated but were not flagged as duplicates by the automatic detection software. JuTa, since you dealt with some of these pics, do you have any opinion? Artix Kreiger (talk) 17:40, 28 March 2018 (UTC)

  • Will it invoke delinker for each file move, or will it do the delinking on its own? --Zhuyifei1999 (talk) 18:02, 28 March 2018 (UTC)
    • I was literally just going to use the move call in MediaWiki::Bot (if it still works), which in turn calls the MediaWiki mw:API:Move. I haven't used the Move function before, so hadn't considered Commons:Delinker. I presumed the old file pages would be turned into redirects, and was going to be content to leave it at that. But I am happy to do more, if more is required. Jheald (talk) 18:34, 28 March 2018 (UTC)
      • I think a filemove should get the files delinked, either through delinker or by the bot itself (I recommend the former). You can add the requests in batches if you want. --Zhuyifei1999 (talk) 18:39, 28 March 2018 (UTC)
        • Okay, that looks straightforward enough, if it's really not a problem dropping a couple of hundred requests there in one go. Though they were only uploaded quite recently, and not advertised, so I think file usage so far is pretty minimal. Just having a look with PetScan to see. Jheald (talk) 18:53, 28 March 2018 (UTC)
        There is currently one file being used externally from Artix Kreiger's "Set of Maps" upload. But older files that others have uploaded may have accumulated more uses, so I acknowledge that it's important to check this. Jheald (talk) 19:11, 28 March 2018 (UTC)
        Example check with PetScan: [1] Jheald (talk) 15:34, 7 October 2018 (UTC)
File mover rights requested for User:JhealdBot Jheald (talk) 12:53, 29 March 2018 (UTC)
  • Please explain the edit rate of 60 per minute. All your other tasks run at 12 per minute (one every 5 seconds). Based on the quantity of changes suggested here, there seems to be no need for a higher rate. --Schlurcher (talk) 22:41, 1 April 2018 (UTC)
@Schlurcher: Changed to 12. Don't know where the 60 came from. Completely agree, no need for anything faster. Jheald (talk) 09:31, 2 April 2018 (UTC)
No further comments and no concerns regarding this request. --Schlurcher (talk) 16:49, 6 April 2018 (UTC)
  • Do you intend to use suppress_redirect when moving the files? --Krd 20:00, 6 April 2018 (UTC)
@Krd: I had thought to keep the redirects, just in case somebody has externally linked to one of these files. Redirects are cheap. And leaving the old pages may also help flag possible duplication to any future uploaders from Flickr, if some of these files have been cropped or adjusted in the meantime.
BTW, I do hope to do a demonstration run really soon now. I'm just finishing some work preparing for book & author wikidata links & categories for the files that will be in the first set, per JhealdBot (4) Jheald (talk) 20:41, 6 April 2018 (UTC)
The bot already has file mover rights, when ready please continue and make a small test run. --Krd 06:36, 7 April 2018 (UTC)
@Jheald: Please advise when ready to test. Thank you. --Krd 05:41, 15 April 2018 (UTC)

No update for one month, closing as stale. --Krd 10:00, 17 May 2018 (UTC)

Reopened per request. --Krd 05:54, 7 October 2018 (UTC)
@Krd: Test run completed, see bot contributions from 15:46 to 16:07 (103 files renamed), old file names -> new file names.
Now to rewrite the pages (JhealdBot (4)), to complete the job. Jheald (talk) 16:58, 7 October 2018 (UTC)
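
The settings agreed in this discussion (at most 12 moves per minute, redirects kept at the old titles) can be sketched as follows. This is only an illustration in Python, not the operator's actual Perl/MediaWiki::Bot script; the function names and the file names in the usage note are hypothetical. The request parameters (`action=move`, `from`, `to`, `reason`, `noredirect`, `token`) are those of the MediaWiki move API, and nothing is actually sent to the wiki here.

```python
import time

MAX_MOVES_PER_MINUTE = 12  # rate agreed in the discussion above


def seconds_between_moves(max_per_minute):
    """Minimum delay between moves that keeps the bot at or under the rate."""
    return 60.0 / max_per_minute


def build_move_params(old_title, new_title, reason, token, keep_redirect=True):
    """Build the POST parameters for one action=move request.

    Omitting 'noredirect' leaves a redirect behind at the old title.
    """
    params = {
        "action": "move",
        "from": old_title,
        "to": new_title,
        "reason": reason,
        "token": token,
        "format": "json",
    }
    if not keep_redirect:
        params["noredirect"] = "1"
    return params


def plan_batch(renames, reason, token, sleep=time.sleep):
    """Yield move parameters for each (old, new) pair, pausing between moves."""
    delay = seconds_between_moves(MAX_MOVES_PER_MINUTE)
    for index, (old_title, new_title) in enumerate(renames):
        if index > 0:
            sleep(delay)  # throttle: no pause needed before the first move
        yield build_move_params(old_title, new_title, reason, token)
```

For example, `list(plan_batch([("File:Old name.jpg", "File:New name.jpg")], "Rename per COM:BRFA", token))` would produce the parameters for a single move; a real script would POST each set to the API and check the response before continuing.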

Ijonbot (talk · contribs)

Operator: Ijon (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Iterating over Category:Images from Wiki Loves Monuments 2018 in Ukraine and ensuring Jewish heritage photos are properly categorized, according to whether they are nationally-recognized monuments or only special-nomination objects, according to this page on Ukrainian Wikipedia.

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): One time run.

Maximum edit rate (e.g. edits per minute): 50

Bot flag requested: (Y/N): Y

Programming language(s): Ruby

Ijon (talk) 14:13, 6 October 2018 (UTC)

Discussion

  • Please use Help:Gadget-HotCat-style edit summaries. --EugeneZelenko (talk) 14:18, 7 October 2018 (UTC)
    • Could you clarify what exactly you'd like to see? Explicitly name all the categories affected (the bot makes multiple cat changes in a single edit)? Or do you just mean to add a link to the bot page from the edit summary? Ijon (talk) 14:49, 7 October 2018 (UTC)
      I meant a list of the categories affected. The task description could be shorter. --EugeneZelenko (talk) 13:39, 8 October 2018 (UTC)
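
An edit summary that lists every category affected by a multi-category edit, as requested above, could be built along these lines. This is a sketch in Python (the bot itself is written in Ruby), the function name is hypothetical, and the wording is only loosely modelled on HotCat; the exact text HotCat produces may differ.

```python
def category_summary(added=(), removed=()):
    """Return an edit summary naming each category removed and added."""
    parts = []
    for name in removed:
        parts.append("removed [[Category:%s]]" % name)
    for name in added:
        parts.append("added [[Category:%s]]" % name)
    if not parts:
        return "No category changes"
    summary = "; ".join(parts)
    # Capitalise only the first letter, leaving category names untouched.
    return summary[0].upper() + summary[1:]
```

Called with placeholder names, `category_summary(added=["B"], removed=["A"])` gives a summary with one entry per category, so a reviewer can see all changes made in a single edit.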

JhealdBot (talk · contribs) (6)

Operator: Jheald (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

Batch uploading for an image release by the British Library. Mostly 18th-century maps and engravings. Pilot uploads in Category:BL18C_pilot. Potential for about 20,000 images if everybody is happy.

Automatic or manually assisted:

Automatic, but with extensive manual pre-preparation and post-upload review.

Edit type (e.g. Continuous, daily, one time run):

Batches of up to a couple of hundred images at a time

Maximum edit rate (e.g. edits per minute):

Pilot images are taking about a minute and a half each to upload, via my home broadband connection (which is what will probably also be used for the production run). Production images may be circa five times larger (less compressed). Script pauses for about 8 seconds before starting each new image.
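
As a rough worked estimate of what these figures imply for a batch, the arithmetic can be sketched as below. It assumes that upload time scales linearly with file size and takes 200 images as "a couple of hundred"; both are assumptions for illustration, and the function names are hypothetical.

```python
def seconds_per_image(pilot_upload_s, size_factor, pause_s):
    """Upload time scaled by relative file size, plus the fixed pause."""
    return pilot_upload_s * size_factor + pause_s


def batch_hours(images, per_image_s):
    """Total wall-clock time for a batch, in hours."""
    return images * per_image_s / 3600.0


pilot = seconds_per_image(90, 1, 8)       # 1.5 min upload + 8 s pause
production = seconds_per_image(90, 5, 8)  # assumed five times larger files

print("pilot batch of 200: %.1f hours" % batch_hours(200, pilot))
print("production batch of 200: %.1f hours" % batch_hours(200, production))
```

Under these assumptions the effective rate stays well below one upload per minute in either case.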

Bot flag requested: (Y/N):

Bot already has it.

Programming language(s):

Perl is used to create a file of description pages, based on various fields of MARC records extracted from the library catalogue, plus semi-manual matching of people and places to items known to Wikidata using OpenRefine. The upload itself is then executed using the Pywikibot upload.py script, driven by a further Perl script.

Additional comments:

Pilot images have been extracted from existing released images using a dezoomify-type script. This is the reason for their lack of EXIF information. Production images will come more directly from the BL's own internal systems, and may be somewhat less compressed. Initially JPG versions will be uploaded; TIFF versions may be added at a later date.
Permission is also sought, as JhealdBot (6A), to make minor post-upload edits to the pages, similar to those already approved for JhealdBot (4) -- eg to make any post-upload corrections that may be needed, additional categorisation, additional information that may become available such as scan resolution, updates to the pages to reflect updates in the BL catalogue, etc. The scripts to do this would be closely similar to those already used as JhealdBot (4) for a different set of images.

Jheald (talk) 06:36, 22 September 2018 (UTC)

Discussion

  • Looks OK to me, but it would be a good idea to use a language tag or Wikidata for Draftsman:. --EugeneZelenko (talk) 14:05, 22 September 2018 (UTC)
@EugeneZelenko: Hmmm... {{occupation}} gives quite a nice, visually unobtrusive way to internationalise this (eg [2]), albeit with maybe a bit less coverage than Wikidata. Would that be acceptable? Jheald (talk) 21:59, 22 September 2018 (UTC)
I'm not sure this will always work, since a person could have multiple occupations and the file needs only one of them. Could we query translations from the item for Draftsman? --EugeneZelenko (talk) 13:59, 23 September 2018 (UTC)
That's not a problem. The {{occupation}} template simply translates the text it is given, and when I create the page I will only give it one occupation. On the minus side, the range of translations the template provides is (I think) not quite as extensive as wikidata. With luck somebody will fix that eventually, and adjust the template to use wikidata as a fallback source for translations. The plus side is that the translations are specialised to Commons, and are sometimes more accurate than what may be more generic translations given by Wikidata labels. Jheald (talk) 14:23, 23 September 2018 (UTC)