Commons:Bots/Requests

Shortcut: COM:BRFA

If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot

Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

ImagehashBot (talk · contribs)

Operator:

Bot's tasks for which permission is being sought: Adding pHash checksum (P9310) and Imagehash difference hash (P12563) values to photos.

Documentation for the hashes
Example images with P9310 and P12563 values

The first targets are photos from Europeana, Estonia, Finland, Sweden and Flickr, but the long-term goal is to add image hashes to all Commons photos. So far we have used FinnaUploadBot for Finna images. The reason for the new account is to have a dedicated account and service for the non-Finna-related edits.

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): first batch jobs, later continuous

Maximum edit rate (e.g. edits per minute):

Bot flag requested: (Y/N): Y

Programming language(s):

Zache (talk) 15:08, 12 April 2024 (UTC)

Discussion
What is the use of such hashes? --EugeneZelenko (talk) 14:47, 13 April 2024 (UTC)
One can use them to compare the similarity of pictures: checking how much the identifiers differ makes it possible to detect duplicates and to match photos across repositories. We have used image hashes to prevent duplicates when uploading files and to prevent the wrong photos from being updated when reuploading photos from Finna with better quality and/or updating metadata. --Zache (talk) 16:31, 13 April 2024 (UTC)
Such hashes make much more sense as part of the Commons database. --EugeneZelenko (talk) 14:26, 14 April 2024 (UTC)
In SDC they are file metadata, and SPARQL in particular would be an easy way to query and share the hashes for external use. I.e., they are part of the metadata for the files. Zache (talk) 14:52, 14 April 2024 (UTC)
Also, even if the information were added to the Wikimedia Commons database (there are good technical reasons why one would like to use an external service instead of adding this to the MediaWiki core), I would like to note that we are populating SDC values from the Commons internal database using bots. Most notable in this context are the SHA-1 checksum, MIME type, image width, and image height. (Commons:Structured data/Modeling/Meta) And yes, there would probably be better ways to do this, but currently using bots is the preferred method. --Zache (talk) 06:42, 18 April 2024 (UTC)
Is there any community discussion that such data shall be generated at large scale? Krd 06:53, 18 April 2024 (UTC)
I am not aware of any wider discussion. Current discussions, to my knowledge, are related to Fæ's User:Fæ/Imagehash and village pump discussions 1 and 2. In my structured data property proposal in 2021, there were no follow-up comments in Wikimedia Commons. Phabricator has some tickets (for example, phab:T121797) related to image hashing.
Also, just for background, I am running ImageHash-Toolforge, which has approximately 25% of Wikimedia Commons bitmap images (jpg, tiff, png) indexed with phash and dhash. I also made a Wikimania lightning talk proposal for it. (Proposals are currently under review.) My current idea was to proceed gradually when adding values to SDC, and my current personal need was to add hashes to European and Estonian photos before the Wikimedia Hackathon, Tallinn, in May so they would be available there. (See my question in Commons_talk:Bots/Requests#Extending_FinnaUploadBot.)
However, if you think I should start a village pump discussion or a discussion on the Structured Data talk pages, I am happy to do so. --Zache (talk) 07:49, 18 April 2024 (UTC)
Please do. Krd 05:48, 21 April 2024 (UTC)
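Note on the comparison mentioned in this discussion: checking how much two hash values differ is commonly done with a Hamming distance (the number of differing bits). The following is only a minimal Python sketch, assuming the P9310 and P12563 values are available as hexadecimal strings of equal length. The function names, the sample values and the 4-bit threshold are hypothetical and are not taken from the bot's code.

```python
def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Return the number of differing bits between two hex-encoded hashes."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def looks_like_same_image(hex_a: str, hex_b: str, max_bits: int = 4) -> bool:
    """Treat two images as likely duplicates when only a few bits differ.

    max_bits is an illustrative threshold (assumption), not a recommended value.
    """
    return hamming_distance(hex_a, hex_b) <= max_bits


if __name__ == "__main__":
    # Made-up sample hashes that differ in a single bit.
    print(hamming_distance("f0f0f0f0f0f0f0f0", "f0f0f0f0f0f0f0f1"))
```

A small distance suggests the same picture at a different size or quality; a large distance suggests different pictures. Where to draw the line depends on the hash type and the collection.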

APPERbot (talk · contribs)

Operator: Wurgl (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: One-time job: remove 550+ language links to nonexistent pages in de-WP, see Commons:Village_pump/Technical#Interwikilinks_to_german_wikipedia_with_no_corresponding_german_article

Update the graphic File:Normdatenentwicklung-de-wikipedia.svg four times a year; it is already generated on the tools servers.

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run):

Maximum edit rate (e.g. edits per minute): 10 (like in deWP)

Bot flag requested: (Y/N): Y

Programming language(s): PHP

Wurgl (talk) 21:22, 9 April 2024 (UTC)

Discussion
I have been running this bot on deWP since January 2017. The bot currently has 1,715,969 edits, but only 927,842 of them were made since I took control of it.
For this (first) one-time job, see https://persondata.toolforge.org/data/common_diff.txt for a diff showing what the bot would do. --Wurgl (talk) 21:22, 9 April 2024 (UTC)
Please make a test run. Please create the bot's user page. --EugeneZelenko (talk) 14:55, 10 April 2024 (UTC)

DaxBot (talk · contribs)

Operator: DaxServer (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Task #3 - Upload images from the w:Capella Space Open Data collection of w:Synthetic-aperture radar captures. The dataset is released under CC-BY-4.0 https://www.capellaspace.com/gallery/. The radar data is duplicated across a few formats (example), but each copy has the same PNG thumbnail preview image associated. I've de-duplicated on the PNGs and have no influence on which of the duplicates is selected and linked from Commons. Uploads are tracked under Category:Files from Capella Space uploaded by DaxBot.

Automatic or manually assisted: Automatic (manually started)

Edit type (e.g. Continuous, daily, one time run): Quarterly

Maximum edit rate (e.g. edits per minute): 1-3

Bot flag requested: (Y/N): N

Programming language(s): n8n (https://n8n.io/) workflow with JavaScript snippets, uploading via the MediaWiki API

-- DaxServer (talk) 12:33, 5 April 2024 (UTC)

Discussion

Svetlov Artem Bot (talk · contribs)

Operator: Svetlov Artem (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Change date=2004-07-17 12:00:00 to date={{Taken on|2004-07-17 12:00:00|location=Russia}} (or a Russian subregion as the location) for photos in manually selected categories.
  • Create categories like Category:Tula_Oblast_photographs_taken_on_2004-07-17 and Category:Russia_photographs_taken_on_2004-07-17 if they do not exist.
  • Remove [[Category:Russia photographs taken on 2004-07-17]] from the file description if the edit succeeds.
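
The text transformation in the first and third tasks above can be illustrated with a short Python sketch. It is not the bot's actual code: it assumes the date sits alone on a `|date=` line of the file description, works on a plain wikitext string, and leaves fetching and saving pages (e.g. via pywikibot) out entirely. The names DATE_PARAM and rewrite_description are hypothetical.

```python
import re

# Matches a template line such as "|date=2004-07-17 12:00:00" (time optional).
DATE_PARAM = re.compile(
    r"^([ \t]*\|[ \t]*date[ \t]*=[ \t]*)"
    r"(\d{4}-\d{2}-\d{2}(?: \d{2}:\d{2}:\d{2})?)[ \t]*$",
    re.MULTILINE,
)


def rewrite_description(wikitext: str, location: str = "Russia") -> str:
    """Wrap a bare date in {{Taken on}} and drop the now-redundant category."""
    match = DATE_PARAM.search(wikitext)
    if match is None:
        return wikitext  # no bare date found: leave the page untouched
    timestamp = match.group(2)
    wrapped = (match.group(1) + "{{Taken on|" + timestamp
               + "|location=" + location + "}}")
    new_text = wikitext[:match.start()] + wrapped + wikitext[match.end():]
    # Remove the explicit date category only after the date was wrapped.
    category = re.compile(
        r"\[\[Category:" + re.escape(location)
        + r"[ _]photographs[ _]taken[ _]on[ _]"
        + re.escape(timestamp[:10]) + r"\]\]\n?"
    )
    return category.sub("", new_text)
```

Pages without a matching bare date come back unchanged, so a run over a category would skip files whose date was already wrapped or is written in another format.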


Automatic or manually assisted: Automatic and unsupervised, on categories manually selected by the bot operator

Edit type (e.g. Continuous, daily, one time run): multiple manual runs

Maximum edit rate (e.g. edits per minute): 10

Bot flag requested: (Y/N): Y

Programming language(s): pywikibot

Svetlov Artem Bot (talk) 18:26, 25 March 2024 (UTC)

Discussion
Please leave the bot account exclusively for your bot's edits.
Looks good to me. Some files have duplicate {{Taken with}} tags, for example File:Zukovskiy industrial railway 2022-10 1664723936.JPG. Would it be possible to remove the dupes on the fly when dealing with the files? --Achim55 (talk) 18:49, 28 March 2024 (UTC)
Some templates were inserted manually. Taken on and Taken with are completely different templates: one is for the date, the other for the camera. I cannot automatically edit the Taken with template for now; it would require a complicated investigation of EXIF camera tags, which always differ from the real camera names. Svetlov Artem (talk) 08:57, 4 April 2024 (UTC)

What is the reason to do such edits, in which way does it improve the file page? Is there any community discussion about doing this at large scale? --Krd 06:58, 18 April 2024 (UTC)

I was asked to do this by user @MasterRus21thCentury; could you comment? https://commons.wikimedia.org/wiki/User_talk:Svetlov_Artem#c-MasterRus21thCentury-20231026174700-Category:Russia_photographs_taken_on_2008-05-01
Such categories are useful for discovering photos in a series: if someone photographed a train, bus or village house, one can see other photographs from the next village or train line taken on the same day by the same user. Svetlov Artem (talk) 16:23, 18 April 2024 (UTC)

NinoBot (talk · contribs)

Operator: Ignacio Rodríguez (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Handling and cleaning up the {{Book}} template on Spanish book files, for compatibility with Spanish Wikisource.
  • Changing author names in database form (LastName, FirstName, 1889-1934) into {{Creator}} templates where applicable.
  • Simple tasks such as changing {{Description}} to {{Book}} on book files.
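
The second task above (database-form names to creator templates) could be sketched in Python roughly as follows. This is an illustration under assumptions, not the bot's code: it assumes the target form is `{{Creator:FirstName LastName}}`, it does not check that such a Creator page exists, and the names DB_NAME and to_creator_template are hypothetical. Anything that does not fit the simple pattern returns None so it can be handled by hand, in line with the manually assisted mode.

```python
import re
from typing import Optional

# "LastName, FirstName" optionally followed by ", 1889-1934" (death year optional).
DB_NAME = re.compile(
    r"^\s*(?P<last>[^,]+?)\s*,\s*(?P<first>[^,]+?)\s*"
    r"(?:,\s*\d{4}-(?:\d{4})?)?\s*$"
)


def to_creator_template(raw: str) -> Optional[str]:
    """Convert a database-form author name into a creator template string.

    Returns None when the input does not match the expected pattern.
    """
    match = DB_NAME.match(raw)
    if match is None:
        return None
    return "{{Creator:" + match.group("first") + " " + match.group("last") + "}}"


if __name__ == "__main__":
    print(to_creator_template("García Lorca, Federico, 1898-1936"))
```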

Automatic or manually assisted: Manually assisted

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 10 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): pywikibot

Ignacio Rodríguez (talk) 16:36, 8 March 2024 (UTC)

Discussion

GeertivpBot (talk · contribs)

Operator: Geertivp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Add missing SDC depict statements on media files (File namespace)
  • Add missing Wikidata Infobox template to Category pages (Category namespace)

Automatic or manually assisted: Automatically, but monitored

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 8 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): Pywikibot, Python scripts are on GitHub:

Test runs are here.

Geert Van Pamel (talk) 22:29, 3 January 2024 (UTC)

Discussion