Commons:Bots/Requests


Shortcut: COM:BRFA

If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot


Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

Operator: Mdann52 (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Fixing SVG maps uploaded by a given user which no longer display due to an SVG engine change. Please see phab:T367645, w:Wikipedia:Teahouse#Infobox probelem and w:Wikipedia:Teahouse#Us couties on article map for impact/discussion. Example fixes are at 1, 2 and 3 (and in my recent uploads).


Automatic or manually assisted: Automatic script; all changes reviewed prior to upload.

Edit type (e.g. Continuous, daily, one time run): One Time Run

Maximum edit rate (e.g. edits per minute): 5 epm max

Bot flag requested: (Y/N): Y

Programming language(s): Python, pywikibot, code available on request

--Mdann52talk to me! 17:16, 19 June 2024 (UTC)[reply]

Discussion

Looking at your example 1, it appears that you removed the <g id="state_outline"> tag and its matching </g>, and moved the <g stroke="black" fill="none" stroke-linejoin="round" stroke-width="119" clip-path="url(#state_clip_path)"> tag so that instead of following the </clipPath> it now precedes the <clipPath id="state_clip_path">. But just after that </clipPath> there is a <use xlink:href="#state_outline" fill="white" stroke-width="524" /> which now has nothing to refer to as the state_outline id no longer exists in the doc. --Redrose64 (talk; at English Wikipedia) 20:04, 19 June 2024 (UTC)

I've corrected the code to handle this now (the edit there was manual), and it now puts the <g> following the <clipPath> section. It doesn't seem to make a difference anyway due to how it is rendered, but happy to get the bot to loop back through the manual ones if needed to fix this. --Mdann52talk to me! 20:34, 19 June 2024 (UTC)[reply]
@Mdann52 I already have a bot that uploads this. See more in Phab. Nux (talk··dyskusja) 00:19, 21 June 2024 (UTC)[reply]
@Nux: are you planning to file a BRFA? I'll withdraw this one if so. Or are you happy for me to do so semi-manually using your tool from an alt account, just to avoid spamming the RC feed? --Mdann52talk to me! 06:00, 21 June 2024 (UTC)
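
Note on the dangling reference described above: a check of this kind could be run on each rewritten SVG before upload. The following is only a minimal sketch using the Python standard library, not the bot's actual code (which is not shown here); the function name `dangling_references` and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xlink:href under its expanded namespace name.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def dangling_references(svg_text):
    """Return the ids that are referenced in the SVG but never defined."""
    root = ET.fromstring(svg_text)
    defined = {el.get("id") for el in root.iter() if el.get("id")}
    referenced = set()
    for el in root.iter():
        # <use href="#id"> and the older xlink:href form
        for attr in ("href", XLINK_HREF):
            target = el.get(attr, "")
            if target.startswith("#"):
                referenced.add(target[1:])
        # url(#id) references such as clip-path="url(#state_clip_path)"
        for value in el.attrib.values():
            referenced.update(re.findall(r"url\(#([^)]+)\)", value))
    return sorted(referenced - defined)


# Hypothetical reduced document reproducing the situation in example 1:
# the state_outline group is gone, but the <use> still points at it.
broken = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <g stroke="black" clip-path="url(#state_clip_path)"/>
  <clipPath id="state_clip_path"><rect width="1" height="1"/></clipPath>
  <use xlink:href="#state_outline" fill="white" stroke-width="524"/>
</svg>"""

print(dangling_references(broken))  # ['state_outline']
```

An empty result would not prove the file renders correctly; it only rules out this one class of error.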

Operator: Matrix (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Replacing all transclusions of {{GPLv3}} with {{GPLv3+}}

Automatic or manually assisted: Automatic, unsupervised

Edit type (e.g. Continuous, daily, one time run): One-time run

Maximum edit rate (e.g. edits per minute): 20 edits/min

Bot flag requested: (Y/N): Y

Programming language(s): JWB (will use AWB if JWB is not suitable)

Matrix(!) {user - talk? - uselesscontributions} 17:13, 16 June 2024 (UTC)[reply]

Discussion

Previous discussions about this merger can be found at Template talk:GPLv3, and Commons:Village pump/Technical/Archive/2024/03. —Matrix(!) {user - talk? - uselesscontributions} 17:13, 16 June 2024 (UTC)[reply]
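
For illustration, the replacement requested above could be expressed as a single regular expression. This is a sketch in Python rather than the JWB/AWB rule the operator would actually use; the pattern, the function name `migrate_gplv3`, and the handling of parameters are assumptions. It deliberately leaves existing {{GPLv3+}} transclusions alone and does not follow template redirects or a `Template:` prefix.

```python
import re

# Matches "{{GPLv3" only when the template name ends there, i.e. when it is
# followed by "}}" or by "|" (parameters). "{{GPLv3+}}" does not match.
GPLV3_TAG = re.compile(r"\{\{\s*[Gg]PLv3\s*(?=[|}])")


def migrate_gplv3(wikitext):
    """Replace transclusions of {{GPLv3}} with {{GPLv3+}}."""
    return GPLV3_TAG.sub("{{GPLv3+", wikitext)


print(migrate_gplv3("== Licensing ==\n{{GPLv3}}"))
```

Because already-migrated text no longer matches, running the replacement twice gives the same result as running it once.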

https://meta.wikimedia.org/wiki/User:AkbarBot

Operator: Akbarali (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Upload files in bulk to Wikimedia Commons
  • Add descriptions, captions and file names

Automatic or manually assisted:

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 8 edits per minute

Bot flag requested: (Y/N): y

Programming language(s): Pywikibot, Python scripts are on PAWS https://hub-paws.wmcloud.org/hub/spawn-pending/Akbarali

Akbarali (talk) 13:57, 11 June 2024 (UTC)[reply]

Discussion

Operator: Fl.schmitt (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: HandleCommonsOnOSMBot tries to add {{Object location}} and {{On OSM}} templates for Commons Files that are used on OpenStreetMap (using attributes wikimedia_commons or image). Insofar, HandleCommonsOnOSMBot relies on the work of Usage Bot and goes through the media listed on the Files used on OpenStreetMap pages (see also this bot request).

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): 12-15

Bot flag requested: (Y/N): Y

Programming language(s): Python (pywikibot)

Fl.schmitt (talk) 20:38, 28 May 2024 (UTC)[reply]

Discussion
Thanks for doing this. Maybe it's worth tracking images for which the bot tried to retrieve coordinates, but for some reason can't. This avoids re-trying them if the bot runs again or restarts. The "Usage Bot" that maintains the lists does update them though. Enhancing999 (talk) 22:12, 28 May 2024 (UTC)[reply]
Good idea - this would require some sort of blacklist, I think. Maybe reviewing the bot logs regularly would be sufficient? Fl.schmitt (talk) 05:30, 29 May 2024 (UTC)
Maybe it's a non-issue. You could just run it for a while and see how it goes. If needed, add some logic later. Enhancing999 (talk) 07:20, 29 May 2024 (UTC)[reply]
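
If the tracking idea above were added later, one simple form would be a small file of titles that failed, consulted before each attempt. This is a hypothetical sketch only: the class `SkipList`, the JSON file and the reason strings are illustrative assumptions, not part of HandleCommonsOnOSMBot.

```python
import json
from pathlib import Path


class SkipList:
    """Remember files for which no coordinates could be retrieved."""

    def __init__(self, path):
        self.path = Path(path)
        self.failed = {}
        if self.path.exists():
            # Reload failures recorded by earlier runs.
            self.failed = json.loads(self.path.read_text(encoding="utf-8"))

    def should_skip(self, title):
        return title in self.failed

    def record_failure(self, title, reason):
        # Persist immediately so a restart does not lose the entry.
        self.failed[title] = reason
        self.path.write_text(
            json.dumps(self.failed, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
```

A bot loop would call `should_skip(title)` before querying OSM and `record_failure(title, reason)` when the lookup fails; entries would need to be cleared by hand if the OSM data later improves.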

Please make a small test run. --Krd 08:06, 29 May 2024 (UTC)[reply]

@Krd: Test run finished with five edits. Captchas were a little bit annoying :-) Fl.schmitt (talk) 17:26, 29 May 2024 (UTC)[reply]
Looks good to me. Krd 17:31, 29 May 2024 (UTC)[reply]
I take it the coordinates match. I'm a bit hesitant about {{On OSM}}. Apparently others use it too [1], but the layout and wording don't seem ideal for files. Maybe a new template could be made for the file namespace (so that the exact wording for this use can be changed easily). At File:1347 Matterhorn.jpg, I added some stuff with "other fields", which blends it into the template. Enhancing999 (talk) 21:54, 29 May 2024 (UTC)
@Enhancing999 - using the other fields parameter for {{On OSM}} sounds interesting. I've just modified "Altes Kirchle" voller Kunstschätze. 01.jpg manually, moving the {{On OSM}} inside a {{InFi}} and adding it to the other fields parameter. For me, this looks ok. Anyway, since the bot's main effort is adding the coordinates, maybe the best option is to "shelve" the {{On OSM}} question and restrict the bot to the location template. Fl.schmitt (talk) 07:23, 30 May 2024 (UTC)
Somehow it still looks overly highlighted. I'd still keep the ids around somewhere. Enhancing999 (talk) 11:15, 30 May 2024 (UTC)
@Enhancing999 ah, ok - now I think I got it. What about "Altes Kirchle" voller Kunstschätze. 01.jpg now? I've created a new template for textual OSM links ({{OSMLink}}) without any other fancy stuff which should fit nicely into the `other_fields` parameter using {{Information field}}. Thus we have a link, a category and the OSM id. Fl.schmitt (talk) 12:08, 30 May 2024 (UTC)
Looks ok. I'd include {{Information field}} directly in {{OSMLink}} Enhancing999 (talk) 12:10, 30 May 2024 (UTC)[reply]
Oh, good idea! Done :-) Maybe i'll find a way to add the OSM icon next to the link, so it's marked as "external" link. Fl.schmitt (talk) 12:28, 30 May 2024 (UTC)[reply]
Better no icon, this is misleading because it behaves differently from the location link icon. Fl.schmitt (talk) 12:49, 30 May 2024 (UTC)[reply]
  • File:'La Brabançonne' (15607113426).jpg has depicts statement, so coordinates and OpenStreetMap identifier should be taken from Wikidata. --EugeneZelenko (talk) 14:24, 31 May 2024 (UTC)[reply]
    Interesting point that shortly puzzled me, but it's due to Schlurcherbot re-copying the coordinates since the test run. Enhancing999 (talk) 14:55, 31 May 2024 (UTC)[reply]
    @EugeneZelenko - thank you for your feedback, but - to be honest - I would generally expect OpenStreetMap to provide "better" (more reliable and more precise) coordinates than Wikidata, especially in an urban area. OpenStreetMap doesn't have coordinates assigned to media, but vice versa, media assigned to quite precise geographical information. In the case of La Brabançonne, Wikidata has 50°51'4"N, 4°22'5"E while OSM offers 50° 50′ 56.18″ N, 4° 22′ 05.97″ E. Please compare yourself on Google Maps or OpenStreetMap (just c&p both values in the respective search boxes): in the real world, those values may amount to a difference of ca. 240 meters ("beeline") according to Google Maps! For me, that example shows why it may be quite useful for Commons and Wikidata if we could make use of OSM data. Fl.schmitt (talk) 15:24, 31 May 2024 (UTC)[reply]
    Obviously, fixing coordinates on Wikidata would benefit not only Commons, but other projects. --EugeneZelenko (talk) 15:29, 31 May 2024 (UTC)[reply]
    Exactly! But to fix them, we need the OSM data. It seems you're asking for a different bot than mine: HandleWikidataOnOsm, evaluating the wikidata attribute of OSM nodes/ways/relations. That would be very nice, but it doesn't help for Commons media lacking a depicts attribute / a wikidata object. That's the case for many of the Files used on OpenStreetMap: those hiking fingerposts and wayside shrines usually don't have wikidata, they don't have "depicts", and a precise location would be useful anyway. Fl.schmitt (talk) 15:50, 31 May 2024 (UTC)
    So fix mismatches in the source (Wikidata), not individual re-users of shared data. --EugeneZelenko (talk) 13:34, 1 June 2024 (UTC)[reply]
    File depicts statements seem to have undergone quite a lot of deletions recently. I'm not sure if the link "file depicts - Wikidata item location coordinates" to "File image depicted coordinates (per OSM)" is 1:1.
    The proposal here is merely for geocoding photos. Enhancing999 (talk) 13:53, 1 June 2024 (UTC)[reply]
    The whole point of this thread is to define a proper process for geocoding photos when a depicts statement is available. --EugeneZelenko (talk) 13:59, 2 June 2024 (UTC)
    That's easy: the proper process is to ignore it. There are over 2.2 million OSM nodes with a unique wikidata link (that's the set of wikidata items where precise coordinates are available), while there are ca. 220,000 wikimedia_commons references on OSM. Of those 220,000, only a small subset has a "depicts" statement. Thus, handling that case (which is quite difficult) would have a very limited benefit, while most of the task (handling the remaining 2.2 million wikidata entries) is still to be done. This is why I said that we clearly need OSM data, but it's a bot task on its own to reconcile Wikidata with OSM. Fl.schmitt (talk) 06:27, 4 June 2024 (UTC)
    Postscriptum: "ignore it" refers to the "depicts" statement, not to the photo... Fl.schmitt (talk) 07:58, 4 June 2024 (UTC)[reply]
    Is it really a huge task for a bot? --EugeneZelenko (talk) 14:33, 4 June 2024 (UTC)
    It doesn't matter if it's a huge task for a bot. What matters is that it's a huge task for me to create such a bot. You're requesting a completely new feature which wasn't part of the initial bot work request. I'm not some sort of AI that delivers nice code on keypress. I'm not a Python professional, nor did I have any experience using Pywikibot before starting the project "HandleCommonsOnOSMBot". I simply picked a bot work request and tried to offer a working solution to improve Wikimedia content, which took me hours of work. Others may do such things faster, but I'm no professional developer. So, yes, it's a huge task, at least for me. Fl.schmitt (talk) 16:47, 4 June 2024 (UTC)
    One more point: The OSM wiki explicitly prohibits copying data from OSM into Wikidata. There's no similar restriction regarding other Wikimedia wikis, as far as i see. So, you're requesting a different, new feature not only in technical terms but also in legal terms - maybe it's even simply prohibited. Fl.schmitt (talk) 17:13, 4 June 2024 (UTC)[reply]
    It's reasonable to make a bot as good as possible, even if it takes some time to implement. If there is a legal problem, the bot could create a list of Wikidata items that require coordinate corrections, and later you could fix them manually. --EugeneZelenko (talk) 14:30, 5 June 2024 (UTC)
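
    The "ca. 240 meters" difference quoted above for La Brabançonne can be checked from the two coordinate pairs given in the thread. The sketch below uses the haversine formula on a spherical Earth (mean radius 6,371 km, an assumption), so it gives great-circle distance rather than whatever Google Maps computes; the function names are illustrative.

```python
import math


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Values as quoted in the discussion (both N / E).
wikidata = (dms_to_decimal(50, 51, 4), dms_to_decimal(4, 22, 5))
osm = (dms_to_decimal(50, 50, 56.18), dms_to_decimal(4, 22, 5.97))

distance = haversine_m(*wikidata, *osm)
print(f"{distance:.0f} m")
```

    Almost all of the offset is in latitude (7.82 arc-seconds), which is consistent with a figure on the order of 240 m.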
I wonder how long the bot should wait for new additions to the list?
Maybe Commons:Files_used_on_OpenStreetMap/177 should only be processed once there is Commons:Files_used_on_OpenStreetMap/178. Enhancing999 (talk) 17:08, 5 June 2024 (UTC)[reply]
I'm not sure how Usage Bot works in Detail, esp. if it modifies existing galleries of externally-used media or just adds new galleries. But I think handling that case isn't necessary. The bot has to check every single file in those galleries for the actions required anyway (adding Location, adding OSMLink or both). Since this requires read operations but no write/update, it doesn't affect the bot's edit rate, so I think it isn't a real problem if the bot visits a certain gallery multiple times. Fl.schmitt (talk) 09:45, 12 June 2024 (UTC)[reply]
I was rather thinking of OSM being sufficiently stable ... but then I have no idea of their quality control mechanisms. After the initial run, it might want to process new entries with some delay. Enhancing999 (talk) 10:00, 12 June 2024 (UTC)
I doubt that there are any formal / technical quality control mechanisms, at least regarding the wikimedia_commons attribute. Just take a look at the way the references to Commons are set: the OSM wiki advises to just set the File:... or Category:... name as the attribute value, but there are many entries with full URLs. Additionally, AFAIK there's no mechanism to update the attribute value after the File or Category was renamed. So, I think it's no mistake to check those galleries multiple times, maybe also logging renamed media that requires an update on OSM. Fl.schmitt (talk) 13:04, 15 June 2024 (UTC)
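
The mixed attribute values mentioned above (plain File:/Category: names versus full URLs) could be reduced to one form before lookup. The following is a sketch under assumptions: the function name `normalise_commons_value` is hypothetical, and only URLs of the shape https://commons.wikimedia.org/wiki/... are recognised; other URL forms return None and would need to be logged.

```python
from urllib.parse import unquote, urlparse


def normalise_commons_value(value):
    """Return a 'File:...' or 'Category:...' title, or None if unrecognised."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parsed = urlparse(value)
        if (parsed.netloc != "commons.wikimedia.org"
                or not parsed.path.startswith("/wiki/")):
            return None
        # Percent-decode the page title taken from the URL path.
        value = unquote(parsed.path[len("/wiki/"):])
    # Underscores and spaces are equivalent in page titles.
    value = value.replace("_", " ")
    prefix, sep, name = value.partition(":")
    prefix, name = prefix.strip().capitalize(), name.strip()
    if sep and prefix in ("File", "Category") and name:
        return f"{prefix}:{name}"
    return None


print(normalise_commons_value(
    "https://commons.wikimedia.org/wiki/File:1347_Matterhorn.jpg"))
# File:1347 Matterhorn.jpg
```

This does not detect renamed files; that would still require checking the title against Commons itself.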

Operator: Emijrp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: adding SDC to images, following Commons:Structured data/Properties table, example [2]

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): continuous

Maximum edit rate (e.g. edits per minute): 1 edit/sec

Bot flag requested: (Y/N): No

Programming language(s): Python

emijrp (talk) 20:37, 15 May 2024 (UTC)[reply]

Discussion
What is the point of duplicating data that could be extracted from the image directly: height, width, file size, checksum? --EugeneZelenko (talk) 15:06, 16 May 2024 (UTC)
Honestly I don't know. According to the table, over 8 million images have width as a SDC property, that's why I included it. But it seems you can call the value from SPARQL with schema:width for example. @Schlurcher: do you know more? Thanks. emijrp (talk) 16:20, 16 May 2024 (UTC)[reply]
As I explained in Commons:Requests for comment/Technical needs survey/UploadWizardSDC this should not be done by bots and there should be a technical implementation for this. --Schlurcher (talk) 11:35, 17 May 2024 (UTC)[reply]
I am not sure if I understand. Your bot is right now adding SDC, specifically the data types disputed by EugeneZelenko. Or am I wrong? emijrp (talk) 15:30, 17 May 2024 (UTC)[reply]

I don't know if somebody wants to comment further. To be clear, I can exclude the properties considered duplicates, though as we can see other bots are adding them anyway. Regards. emijrp (talk) 12:38, 26 May 2024 (UTC)[reply]

Operator:

Bot's tasks for which permission is being sought: adding pHash checksum (P9310) and Imagehash difference hash (P12563) values to the photos.

Documentation for the hashes
Example images with P9310 and P12563 values

First targets are photos from Europeana, Estonia, Finland, Sweden and Flickr, but the long-term target is to add imagehashes to all Commons photos. Currently we have used FinnaUploadBot for Finna images. The reason for the new account is to have a dedicated account and service for the non-Finna-related edits.

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): first batch jobs, later continuous

Maximum edit rate (e.g. edits per minute):

Bot flag requested: (Y/N): Y

Programming language(s):

Zache (talk) 15:08, 12 April 2024 (UTC)[reply]

Discussion
What is the use of such hashes? --EugeneZelenko (talk) 14:47, 13 April 2024 (UTC)
One can use them to compare the similarity of pictures by checking how much the identifiers differ to detect duplicates and match photos in different repositories. We have used image hashes to prevent duplicates when uploading files and to prevent the wrong photos from being updated when reuploading photos from Finna with better quality and/or updating metadata. --Zache (talk) 16:31, 13 April 2024 (UTC)[reply]
Such hashes make much more sense as part of Commons database. --EugeneZelenko (talk) 14:26, 14 April 2024 (UTC)[reply]
In SDC they are file metadata, and SPARQL in particular would be an easy way to query and share the hashes for external use. I.e. they are part of the metadata for the files. Zache (talk) 14:52, 14 April 2024 (UTC)
Also, even if the information were added to the Wikimedia Commons database (there are good technical reasons why one would like to use an external service instead of adding this to the MediaWiki core), I would like to note that we are populating SDC values from the Commons internal database using bots. Most notable in this context are the SHA-1 checksum, mime type, image width, and image height. (Commons:Structured data/Modeling/Meta) And yes, there would probably be better ways to do this, but currently using bots is the preferred method. --Zache (talk) 06:42, 18 April 2024 (UTC)
Is there any community discussion that such data shall be generated at large scale? Krd 06:53, 18 April 2024 (UTC)[reply]
I am not aware of any wider discussion. Current discussions, to my knowledge, are related to Fæ's User:Fæ/Imagehash and village pump discussions 1 and 2. In my structured data property proposal in 2021, there were no follow-up comments in Wikimedia Commons. Phabricator has some tickets (for example, phab:T121797) related to image hashing.
Also, just for background, I am running ImageHash-Toolforge, which has approximately 25% of Wikimedia Commons bitmap images (jpg, tiff, png) indexed with phash and dhash. I also made a Wikimania lightning talk proposal for it. (Proposals are currently under review.) My current idea was to proceed gradually when adding values to SDC, and my current personal need was to add hashes to European and Estonian photos before the Wikimedia Hackathon, Tallinn, in May so they would be available there. (see my question in Commons_talk:Bots/Requests#Extending_FinnaUploadBot).
However, if you think I should do the village pump discussion or the discussion on the Structured Data talk pages, I am happy to start these. --Zache (talk) 07:49, 18 April 2024 (UTC)[reply]
Please do. Krd 05:48, 21 April 2024 (UTC)[reply]
Now I made a village pump proposal --Zache (talk) 16:44, 17 May 2024 (UTC)[reply]
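
To illustrate the comparison described above ("checking how much the identifiers differ"): two perceptual hashes are usually compared by counting differing bits (Hamming distance). The sketch below assumes the hashes are available as hexadecimal strings of equal length; the function names and the threshold of 10 bits are illustrative assumptions, not values used by ImageHash-Toolforge.

```python
def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def probably_same_image(hash_a, hash_b, max_bits=10):
    """Treat two images as near-duplicates if few bits differ (assumed threshold)."""
    return hamming_distance(hash_a, hash_b) <= max_bits


# Made-up 64-bit hashes that differ only in the lowest bit.
print(hamming_distance("ffff0000ffff0000", "ffff0000ffff0001"))  # 1
```

A distance of 0 means identical hashes; what counts as "close enough" depends on the hash type and has to be tuned against real images.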

Operator: Svetlov Artem (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Change date=2004-07-17 12:00:00 to date={{Taken on|2004-07-17 12:00:00|location=Russia}} or Russian subregions for photos in manually selected categories.
  • Create categories like Category:Tula_Oblast_photographs_taken_on_2004-07-17 and Category:Russia_photographs_taken_on_2004-07-17 if they do not exist.
  • Remove [[Category:Russia photographs taken on 2004-07-17]] from the file description if the edit succeeds.
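
The text changes in the task list above can be sketched as two regular-expression steps. This is not the operator's pywikibot script; the function names, the accepted date formats and the assumption that category names use spaces (not underscores) in the wikitext are all illustrative.

```python
import re

# A bare ISO date, optionally with a time, as the whole value of |date=.
DATE_FIELD = re.compile(
    r"(\|\s*date\s*=\s*)(\d{4}-\d{2}-\d{2}(?: \d{2}:\d{2}:\d{2})?)[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def wrap_date(wikitext, location):
    """Wrap a bare date value in {{Taken on|...|location=...}}."""
    return DATE_FIELD.sub(
        lambda m: f"{m.group(1)}{{{{Taken on|{m.group(2)}|location={location}}}}}",
        wikitext,
    )


def remove_dated_category(wikitext, location, date):
    """Drop the explicit '<location> photographs taken on <date>' category."""
    name = re.escape(f"{location} photographs taken on {date}")
    pattern = re.compile(
        r"\[\[\s*Category\s*:\s*" + name + r"\s*(?:\|[^\]]*)?\]\]\n?"
    )
    return pattern.sub("", wikitext)


page = "|date=2004-07-17 12:00:00\n|source=own"
print(wrap_date(page, "Russia"))
```

A value that is already wrapped in a template starts with "{{" and therefore does not match, so re-running the first step leaves the page unchanged.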


Automatic or manually assisted: Automatic unsupervised on manually set categories by bot operator

Edit type (e.g. Continuous, daily, one time run): multiple manual run

Maximum edit rate (e.g. edits per minute): 10

Bot flag requested: (Y/N): Y

Programming language(s): pywikibot

Svetlov Artem Bot (talk) 18:26, 25 March 2024 (UTC)[reply]

Discussion
Please leave the bot account exclusively for your bot's edits.
Looks good to me. Some files have duplicate {{Taken with}} tags like File:Zukovskiy industrial railway 2022-10 1664723936.JPG for example. Would it be possible to remove the dupes on the fly when dealing with the files? --Achim55 (talk) 18:49, 28 March 2024 (UTC)[reply]
Some templates were inserted manually. Taken on and Taken with are completely different templates: one is for the date, the other for the camera. I cannot automatically edit the Taken with template at the moment; it needs a complicated investigation of EXIF camera tags, which always differ from real camera names. Svetlov Artem (talk) 08:57, 4 April 2024 (UTC)

What is the reason to do such edits, in which way does it improve the file page? Is there any community discussion about doing this at large scale? --Krd 06:58, 18 April 2024 (UTC)[reply]

I was asked by user @MasterRus21thCentury, may you comment? https://commons.wikimedia.org/wiki/User_talk:Svetlov_Artem#c-MasterRus21thCentury-20231026174700-Category:Russia_photographs_taken_on_2008-05-01
Such categories are useful for discovering photos in a series: if someone photographed a train, bus or village house, it is possible to see other photographs from the next village or train line taken on the same day by the same user. Svetlov Artem (talk) 16:23, 18 April 2024 (UTC)
I still don't see how this has valid use cases. Please start a community discussion and please report the result. Krd 07:12, 18 May 2024 (UTC)[reply]

Operator: Geertivp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Add missing SDC depict statements on media files (File namespace)
  • Add missing Wikidata Infobox template to Category pages (Category namespace)

Automatic or manually assisted: Automatically, but monitored

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 8 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): Pywikibot, Python scripts are on GitHub:

Test runs are here.

Geert Van Pamel (talk) 22:29, 3 January 2024 (UTC)[reply]

Discussion