Commons:Bots/Requests

From Wikimedia Commons, the free media repository

Shortcut: COM:BRFA


If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot

Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

Michelemassa5 (talk · contribs)

Operator: Michelemassa5 (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

Automatic or manually assisted:

Edit type (e.g. Continuous, daily, one time run):

Maximum edit rate (e.g. edits per minute):

Bot flag requested: (Y/N):

Programming language(s):

Michelemassa5 (talk) 13:07, 20 May 2018 (UTC)

Discussion

Revibot (talk · contribs) (4)

Operator: -revi (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Fix Category:Pages using ISBN magic links.

Automatic or manually assisted: Automatic, unsupervised

Edit type (e.g. Continuous, daily, one time run): one time run, then probably once a month

Maximum edit rate (e.g. edits per minute): 6 epm

Bot flag requested: (Y/N): N

Programming language(s): mw:Manual:Pywikibot/replace.py

  • Regex for replace.py will be -regex "ISBN\s+((97(8|9))?\s?-?([0-9]\s?-?){9}([0-9Xx]))([\s\D])" "{{ISBN|\1}}\6"

I didn't run a test, yet. — regards, Revi 11:55, 18 May 2018 (UTC)
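The regex proposed above can be tried out offline before any test run. The following is a minimal sketch that applies the same pattern and replacement with Python's re module, used here only as a stand-in for replace.py's -regex mode; the function name and the sample strings are made up for illustration. One thing the sketch makes visible: because the pattern ends with ([\s\D]), an ISBN at the very end of the text, with no character after it, is left unchanged.

```python
import re

# Pattern and replacement copied from the request above. Python's re module
# stands in for replace.py's -regex mode; this is not the bot's own code.
ISBN_PATTERN = re.compile(
    r"ISBN\s+((97(8|9))?\s?-?([0-9]\s?-?){9}([0-9Xx]))([\s\D])"
)
ISBN_REPLACEMENT = r"{{ISBN|\1}}\6"


def fix_isbn_magic_links(text):
    """Wrap bare 'ISBN <number>' magic links in the {{ISBN}} template."""
    return ISBN_PATTERN.sub(ISBN_REPLACEMENT, text)


if __name__ == "__main__":
    # Illustrative sample strings (not taken from real pages).
    for sample in (
        "See ISBN 978-3-16-148410-0 for details.",
        "Cited as ISBN 0-306-40615-2.",
        "Already fixed: {{ISBN|0306406152}} here.",
        "Ends with ISBN 0306406152",
    ):
        print(fix_isbn_magic_links(sample))
```

Text that already uses {{ISBN|...}} is not touched, since the pattern requires whitespace directly after "ISBN".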

Discussion

Pi bot (talk · contribs) 2

Operator: Mike Peel (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Trim duplicate external ID information from categories where those IDs are shown in {{Wikidata Infobox}}

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Weekly

Maximum edit rate (e.g. edits per minute): 10 edits per minute

Bot flag requested: (Y/N): Y (Pi bot already has this)

Programming language(s): pywikibot. source code

This bot looks for categories that use both {{Wikidata infobox}} and specified external ID templates where both are displaying the same link (and/or templates like {{mainw}} without parameters or only links to Wikipedia articles that are shown in the infobox). When it finds such a case, then it removes the external ID template (and the other templates where possible), as well as extra whitespace, before saving the page. It is currently coded to look at National Heritage List for England number (P1216) and Category:Listed buildings in England with known IDs, but this will be expanded for other IDs in the future. The motivation is that we don't need to have duplicate links, and it's better to use the ones from Wikidata via the infobox to avoid cluttering the category with multiple templates. It links in with a proposed Wikidata bot that will use the IDs to find new category sitelinks from Wikidata, and probably a future Wikidata bot that will copy the IDs from Commons to Wikidata.
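The trimming step described above can be illustrated on plain wikitext. This is a rough sketch under stated assumptions, not the bot's actual source: the template name "Example ID" is hypothetical, the Wikidata value is passed in as a plain string instead of being fetched, and pages with more than one copy of the ID template are skipped, in the conservative spirit of the request.

```python
import re

# Matches {{Wikidata Infobox}} with or without parameters, case-insensitively.
INFOBOX_RE = re.compile(r"\{\{\s*wikidata[ _]infobox\s*(\||\}\})", re.IGNORECASE)


def trim_duplicate_id(wikitext, id_template, wikidata_id):
    """Remove {{id_template|<id>}} when the infobox already shows the same ID.

    id_template is a hypothetical template name; wikidata_id is the value
    the infobox would display, supplied by the caller (an assumption here).
    """
    if not INFOBOX_RE.search(wikitext):
        return wikitext  # no infobox on the page, so nothing is duplicated
    pattern = re.compile(
        r"\{\{\s*" + re.escape(id_template) + r"\s*\|\s*([^|{}]+?)\s*\}\}\n?"
    )
    found = pattern.findall(wikitext)
    # Be conservative: act only on exactly one template with a matching ID.
    if len(found) != 1 or found[0] != wikidata_id:
        return wikitext
    return pattern.sub("", wikitext, count=1)


if __name__ == "__main__":
    page = "{{Wikidata Infobox}}\n{{Example ID|1234}}\n\n[[Category:Foo]]"
    print(trim_duplicate_id(page, "Example ID", "1234"))
```

Only the template and its own trailing newline are removed, so the blank line before the categories stays in place.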

Example edits: [1], [2], [3], [4].

Thanks. Mike Peel (talk) 23:24, 14 May 2018 (UTC)

Discussion

  • Looks OK to me. I would also suggest considering removal of the description and/or image, as in Category:Queen's House. --EugeneZelenko (talk) 03:37, 15 May 2018 (UTC)
    • @EugeneZelenko: That was this edit, which I did manually. It would be possible to look for images and descriptions in the category wikitext and remove them, but it's much more tricky to check that the information in the description is already in the infobox, so I think that's better done by hand to make sure we don't lose information. I'm generally trying to be very conservative with the bot edits, so complex cases with multiple IDs are also skipped. Thanks. Mike Peel (talk) 11:35, 15 May 2018 (UTC)
      Does the bot keep track of such elements for further automatic or manual clean-up? --EugeneZelenko (talk) 17:37, 15 May 2018 (UTC)
      @EugeneZelenko: No, my plan is to continue using categories to select targets for auto-cleanup, and possibly migrate to using SQL queries in the future. Thanks. Mike Peel (talk) 21:40, 15 May 2018 (UTC)
  • Basic edits look good. It is, however, standard on Commons to have a new line before categories and another before interwikis. I would not remove them. --Schlurcher (talk) 05:46, 15 May 2018 (UTC)
    • @Schlurcher: OK, I've removed the line that looks for double \n's. That may leave extra whitespace at the top, but we can see how it goes. I'm hoping the manual interwikis won't be there much longer! New example edits: [5], [6]. Thanks. Mike Peel (talk) 11:35, 15 May 2018 (UTC)

Frettiebot (talk · contribs)

Operator: Frettie (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

This bot adds the Wikidata Infobox to Czech community pages (using Mike Peel's script). Sometimes there will be other tasks from the Czech community. I am requesting a bot flag because my bot (Frettiebot) was blocked after a few edits. This was a mistake: on the Czech-language Wikipedia and on Wikidata I have a bot flag, which has been fine for thousands of edits, but unfortunately I don't have one on Commons. Sorry.

Bot's tasks for which permission is being sought: Wikidata Infobox for Czech articles

Automatic or manually assisted: Semi-automatic; I start the bot manually and watch what it does.

Edit type (e.g. Continuous, daily, one time run): One time / continuous

Maximum edit rate (e.g. edits per minute): 30 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): Python

Frettie (talk) 16:16, 9 May 2018 (UTC)

Discussion

  • Comment: This isn't a problem from my perspective, but I should note that User:Pi bot will run through these categories eventually. It's currently going through Category:Art as a test of a big category, and I'll start it from Category:CommonsRoot soon (maybe next week), but I'm focusing on adding ~600,000 commons sitelinks through d:Wikidata:Requests for permissions/Bot/Pi bot 2 first. Thanks. Mike Peel (talk) 16:54, 9 May 2018 (UTC)
    • Perfect, good work!--Frettie (talk) 18:57, 9 May 2018 (UTC)
      • Just to note: Pi bot's run to add the Commons sitelinks on Wikidata is now complete, and it's now focusing on infoboxes here, starting from CommonsRoot. Thanks. Mike Peel (talk) 23:17, 11 May 2018 (UTC)
  • (Edit conflict) Does this run the exact same script as Commons:Bots/Requests/Pi_bot_1? If so, could you explain the rationale for having multiple bots for the same task? If not, could you explain the differences? --Zhuyifei1999 (talk) 16:55, 9 May 2018 (UTC)
    • If there are several similar bots for one task, the work will go faster. There are no differences.--Frettie (talk) 18:57, 9 May 2018 (UTC)
      If there happens to be some bug in the script, will you be able to fix it? --Zhuyifei1999 (talk) 20:07, 9 May 2018 (UTC)
      • I think so, but if there is a mistake in Mike's script, it will be corrected by Mike and I will adopt the fix. --Frettie (talk) 10:08, 11 May 2018 (UTC)
The general expectation is that you are able to correct mistakes in Mike's script if necessary. Please confirm. --Schlurcher (talk) 17:28, 11 May 2018 (UTC)
As a practical test, I've made a few changes to the script recently, @Frettie: can you explain what I've changed and why? Thanks. Mike Peel (talk) 23:17, 11 May 2018 (UTC)
I like your original Python code, with targetcats being filled in a loop; it's interesting. Now you keep a set of seen categories (like a history view) and a list of "everything I want to browse in future from my root category". While there is some active category that is not "seen" (a category is only marked as seen once it becomes "active" for the next iteration), the script goes through that category. Maybe the for loop with the subcategory generator will be slow if there are too many categories (but that's OK; running too fast is not good in this case).--Frettie (talk) 16:35, 12 May 2018 (UTC)
The new category walker is mostly designed for speed, since checking if a given subcategory is included in a list of categories is much slower than checking if it's in a set - although I also preferred the older version as it was more straightforward! I also changed a few things in the 'addtemplate' function, though, can you see why? Thanks. Mike Peel (talk) 23:31, 14 May 2018 (UTC)
Hi Mike, there is now a check for whether the Wikidata item exists; if it doesn't, the following lines are skipped (return 0), and this happens earlier. Lists without a category's main topic at the target link are not processed. Returning 0/1 helps to count the modified categories. Sorry for the late reaction.--Frettie (talk) 08:25, 21 May 2018 (UTC)
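The set-based category walker discussed in this thread can be sketched roughly as follows. This is not the script's actual code: the function names are invented for illustration, and subcategories_of is a hypothetical callable standing in for whatever subcategory generator the bot uses. The point is only that testing "sub not in seen" against a set takes roughly constant time, whereas the same test against a list scans every element.

```python
from collections import deque


def walk_categories(root, subcategories_of):
    """Visit every category reachable from root exactly once.

    subcategories_of is a hypothetical callable returning the direct
    subcategories of a category name.
    """
    seen = {root}          # set membership test is O(1) on average
    queue = deque([root])  # categories still waiting to be processed
    order = []
    while queue:
        category = queue.popleft()
        order.append(category)
        for sub in subcategories_of(category):
            if sub not in seen:  # also guards against category loops
                seen.add(sub)
                queue.append(sub)
    return order


if __name__ == "__main__":
    # Toy category graph with a loop back to the root (made-up names).
    graph = {"Root": ["A", "B"], "A": ["C"], "B": ["C", "Root"], "C": []}
    print(walk_categories("Root", lambda name: graph.get(name, [])))
```

Marking a category as seen when it is queued, not when it is processed, keeps a category that is reachable by two paths from being queued twice.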
  • I generally don't mind multiple bots for the same task if they are programmed to some extent independently. This has the merit that they can supplement each other. As I understand it, this is a second bot running exactly the same script. There is less benefit in having this as long as the other bot is well maintained and running. During the request for Commons:Bots/Requests/Pi_bot_1 we allowed a higher edit rate than normal to accommodate the number of edits. So, please provide a rationale for this request beyond the bots being faster together. Also note that during Commons:Bots/Requests/Pi_bot_1 there was some consideration of the server load these edits might put on Wikidata (not Commons). With two independent bots acting, this will be more challenging to monitor. --Schlurcher (talk) 23:33, 9 May 2018 (UTC)
    • Yes, this will be challenging to monitor, but (maybe) it's better to edit some known categories (like Czech topics).--Frettie (talk) 10:08, 11 May 2018 (UTC)
Do you see any concerns with edits from User:Pi bot that require deeper understanding of the categories? --Schlurcher (talk) 17:28, 11 May 2018 (UTC)
    • To clarify, it was the server load increase due to refreshing pages on Commons that use Wikidata information that I was worried about, so that's somewhere between server load on commons and on wikidata. The main limitation I've found with the bot code so far is that it takes time to work through the categories and find ones to add the infobox to (since it checks each category in series), so it helps to have multiple copies running with different target categories. Plus, it helps to have someone else here that knows how to run the code in case I'm not available in the future (since it'll be useful to keep running this after the main run to catch new categories that the infobox can be added to). Thanks. Mike Peel (talk) 23:17, 11 May 2018 (UTC)

FinnaUploadBot (talk · contribs)

Operator: Zache (talk · contributions · Number of edits · recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Uploads images from Finna.fi (2016 press release), which is an online catalog and API for Finnish museums, libraries and archives and part of Finland's national digital library service. Finna aggregates its data from the organizations' own databases so that there is a single endpoint. It also relays data to downstream partners such as Europeana. The basic operation is that a user gives a finna-id as a parameter; the tool then reads the CC0-licensed metadata for the photo and checks whether the photo is licensed as CC0, CC-BY or CC-BY-SA. If everything is OK, the tool uploads the photo.
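The license gate described above could look roughly like this. It is a sketch only: the tool itself is written in PHP, the record layout (a dict with a "license" key) is a hypothetical stand-in for whatever the metadata actually contains, and the normalization of license strings is an assumption made for the example.

```python
# Licenses named in the request as acceptable for upload.
ALLOWED_LICENSES = {"CC0", "CC-BY", "CC-BY-SA"}


def normalize_license(raw):
    """Bring spellings such as 'cc by-sa' or 'CC_BY' into one form."""
    return raw.strip().upper().replace(" ", "-").replace("_", "-")


def may_upload(record):
    """Return True only if the record carries one of the allowed licenses.

    record is a hypothetical metadata dict; the key name 'license' is an
    assumption for this sketch, not a documented field.
    """
    raw = record.get("license")
    if not raw:
        return False  # missing license information: do not upload
    return normalize_license(raw) in ALLOWED_LICENSES


if __name__ == "__main__":
    for rec in ({"license": "cc by-sa"}, {"license": "CC-BY-NC"}, {}):
        print(rec, may_upload(rec))
```

Anything not explicitly on the list, including a missing license, is rejected, so an unexpected value fails closed.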

The request is for larger uploads, such as thousands of photos, and for keeping the tool available for single uploads. Here are example uploads, and there is a discussion about the tool at COM:AN/U FinnaUploadBot (link to archive), where it was said that it will need bot permission.

Automatic or manually assisted:

Automatic: the user needs to say manually what to upload, but there is no confirmation for each diff.

Edit type (e.g. Continuous, daily, one time run): on demand

Maximum edit rate (e.g. edits per minute): 3 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): PHP; currently the logic is built on top of https://github.com/legoktm/harej-bots/blob/master/botclasses.php Zache (talk) 05:38, 22 April 2018 (UTC)

Discussion

Quote from COM:CRT#Country-specific_laws: 'The safest way to apply international copyright law is to consider the laws of all the relevant jurisdictions and then use the most restrictive combination of laws to determine whether something is copyrighted or not.' If we followed that, the license would be CC-BY. --Zache (talk) 09:09, 29 April 2018 (UTC)
This Commons:Project scope/Precautionary principle is also official policy of Wikimedia Commons --Zache (talk) 21:06, 5 May 2018 (UTC)
That discussion is now archived to Commons:Administrators' noticeboard/User problems/Archive 69#User:FinnaUploadBot.   — Jeff G. ツ please ping or talk to me 05:44, 14 May 2018 (UTC)
  • Comment: Zache, how do you plan to address the concerns raised by Majora? Thank you. Wikicology (talk) 19:55, 22 April 2018 (UTC)
From a technological point of view, it is quite easy to handle. If we don't want to automatically upload images that are old enough to be in the public domain but are licensed as something other than PD or CC0, then you can just add a test for it and, if the test fails, not upload the image.
From a legal point of view, i.e. whether Majora's concern about the license is valid, it is fuzzier, because it is a valid argument that GLAMs have rights related to their digital reproductions of out-of-copyright works based on Finland's copyright law. So one can't just say that the City of Helsinki or the Ministry of Education and Culture of Finland is fraudulently slapping restricted licenses on PD content when they are opening it at all. It is, however, unclear whether the GLAMs' argument for the rights would hold if it were tested in court. The GLAMs we are currently speaking about are in the front line of opening data in Finland. The en:Finnish National Board of Antiquities (part of the Ministry of Education) has opened 200,000 photos and is doing Wiki Loves Monuments with WMFI. The Helsinki City Museum, whose photos I was uploading, opened 50,000 hi-res photos, and the city as a whole is doing a lot of open source work with OpenStreetMap, data, and APIs. So the position is that the open source community in Finland is pointing out current problems, giving them reasons to do the right thing, and seeing how far we can get. Last February the National Gallery of Finland opened their digital reproductions of out-of-copyright works under CC0 (after years of work), which is kind of a test for other museums to see whether it is a good idea or not. --Zache (talk) 09:21, 23 April 2018 (UTC)
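The extra test mentioned in the comment above, which holds back images that are old enough to be in the public domain but carry a license other than PD or CC0, might be sketched like this. The function name is hypothetical, and the 50-year threshold is an assumption borrowed from the "older than 50 years" figure used elsewhere in this discussion; it is a parameter so that it can be changed.

```python
# Licenses treated as equivalent to "no restrictions" for this check.
PD_LIKE = {"PD", "CC0"}


def passes_age_test(photo_year, license_code, current_year, pd_age_years=50):
    """Return False for old photos that carry a non-PD, non-CC0 license.

    pd_age_years=50 is an assumption for this sketch, not a legal rule
    stated by the tool's author.
    """
    old_enough = current_year - photo_year > pd_age_years
    if old_enough and license_code.strip().upper() not in PD_LIKE:
        return False  # likely PD content under a restrictive license: skip
    return True


if __name__ == "__main__":
    print(passes_age_test(1910, "CC-BY", 2018))
    print(passes_age_test(1910, "CC0", 2018))
    print(passes_age_test(2000, "CC-BY", 2018))
```

Images that fail the test are simply not uploaded automatically; they could still be reviewed and uploaded by hand.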
Pinging Zhuyifei1999 and Fae for insights. Regards. T Cells (talk) 09:39, 24 April 2018 (UTC)

I asked the Helsinki City Museum yesterday why their license is CC-BY for digital reproductions of out-of-copyright works and not public domain / CC0. I haven't got an answer yet. However, the relevant pieces of the Copyright Act of Finland, as far as I know, are in Chapter 5 - Rights related to copyright, under Section 49 - Producer of a catalogue and a database (607/2015) and Section 49a - Photographer. Section 49 protects the amount of work which is used to make the database but not single items in it. One can copy some photos manually, but downloading significant parts of the database using bots is prohibited. Section 49a protects photographs that are not works of art, and the requirement for getting the protection is that the photograph has been taken. The argument (which I copied to COM:AN/U for Majora) is that digital reproductions of out-of-copyright works are new photographs and thus are covered by 49a. Currently, there are no decisions on this argument for 49a by the Copyright Council of Finland or by a court. Kuvasto's opinion, when it was asked by fiwiki, was that a museum can make and sell copies of the photographs but making digital reproductions doesn't extend the original protection. --Zache (talk) 09:08, 26 April 2018 (UTC)

Helsinki City Museum's answer was that they are moving to CC0 for out-of-copyright works in their 2019 update, but not before, for implementation reasons. This was in their plans before I asked about it. It was also OK for them to change the licenses for en:Signe Brander and fi:Ivan Timiriasew to CC0 or {{PD-Finland}} if it helps us. They also noted that most of the museums in Finland are in internal discussions about whether opening the archives is a good idea at all, and that the best practice is to go forward step by step and not scare GLAM people off.
I also updated the copyright status of reproductions of out-of-copyright works in Finland to COM:CRT#Finland (diff)
@~riley, Majora: Otherwise, the question about CC-BY vs CC0 seems to have stalled. Just FYI: WMFI has a Wikigap event on 2018-05-08 where one planned track would be to add photos of historical women from Finna to articles. A second event related to this topic is a rephotography trip to Suomenlinna, where we would be using photos from SA-kuvat (military archives), the Helsinki City Museum and the Finnish National Board of Antiquities, which are currently all under CC-BY and older than 50 years. The rephotography trip is also for testing our WLM rephotography contest ideas. WMFI's plan for WLM is to hold a national rephotography contest of old historical buildings as part of WLM. For then-and-now photo pairs the old pictures are needed too. --Zache (talk) 08:25, 29 April 2018 (UTC)

Organizations in Europe are legally entitled to add a layer of copyright to material they make available through neighbouring rights. This is not a practice we want to encourage, but are we going to exclude them from Commons? This is not an isolated case. I personally would like to praise these institutions for the work they have done opening their materials. Sadly, they are following the national recommendations for open licensing. There's work to do, but don't you think we could cheer and let them experience the positive outcomes of open licensing rather than put them down. If this was a court case, they might still win. – Susanna Ånäs (Susannaanas) (talk) 16:32, 27 April 2018 (UTC)

How long is this procedure going to take? @Romaine, Jean-Fred: This is blocking the development for Wiki Loves Monuments 2018 in Finland, where old openly licensed images will be rephotographed. – Susanna Ånäs (Susannaanas) (talk) 18:42, 13 May 2018 (UTC)
@Majora: Could you please comment if there are issues remaining? Thank you. --Krd 05:22, 14 May 2018 (UTC)
@Krd: My apologies for not responding sooner. My concerns were always twofold. One, Zache was running an unauthorized bot on Commons. This solves that. Two, that they were misrepresenting the license that these photos are actually under. Section 49, as linked to above, is irrelevant as I read it. That has to do with databases. Think SQL. Think MediaWiki. Now, whether or not the "catalogue" part of that paragraph includes this type of information is highly debatable, and from a programming standpoint, which I think is what this section is actually referring to, it would make more sense if it was referring to something different entirely. Section 49a is precisely why we have {{PD-Finland50}}. Zache's editing of the copyright information for Finland (COM:CRT#Reproductions of out-of-copyright works) seems to be a misrepresentation or misunderstanding and should be removed. Section 49 appears to be referencing something completely different, as indicated by it further stating that it applies to "tables" or "programs". Again, this is indicative that that section is talking about something completely unrelated to the issue at hand. If the museum has not responded as to why they have put the photos under a non-PD license then there does still seem to be a problem here that needs to be resolved. Please note, these photos are probably fine for Commons anyways and they will all have to undergo license review. If you just want to approve this and let the reviewers sort it out (a large undertaking but not unprecedented) then I wouldn't have a problem with that. But the changes to the CRT page need to be undone. Preferably by someone who has the extra authority (i.e. the mop) to make those changes stick. --Majora (talk) 00:36, 15 May 2018 (UTC)
@Majora: Just to be clear:
  1. Section 49 is much broader in Finland. Traditional examples of a catalogue, table or database in Finland are the list of name days or the phone book. It doesn't depend on the specific implementation or format of the data.
  2. Helsinki City Museum did answer. The CC-BY license could be used because they owned the copies, and they already had CC0 for out-of-copyright works in their internal pipeline, which they hoped would solve the issue for us too.
--Zache (talk) 05:15, 15 May 2018 (UTC)

┌─────────────────────────────────┘
Who says that section 49 is much broader exactly? You? If there are no specific court cases on the matter, which you have already said there are not above, then we have to interpret the law as it is written. We don't have the luxury of just saying whatever we want or believing what anyone else says. The way that section is written does not apply to this situation and without a court case saying that it is much broader than written we can't just go with what you say. That isn't how it works here.

Second, please stop saying CC0. There is absolutely no indication whatsoever anywhere on finna.fi that the images are under CC0. CC0 is a completely separate and distinct copyright license with its own legal code. It is not a catch-all term for public domain images. The fact that you continue to use CC0 in this manner shows that you don't understand that and therefore you do not understand copyright licensing. If the museum really did use the term "CC0" then they were either parroting back exactly what you sent them or they too do not understand what they are talking about. Either explanation reduces my confidence that the copyright license is correct from reasonable doubt to nil. I have zero confidence whatsoever that finna.fi is using the correct copyright license here. Whether or not the bot is going to be approved is beyond my pay grade, but only because the images are probably fine for hosting here and they all have to be license reviewed anyways. Otherwise I would strongly oppose the approval of this bot since there seem to be several gross misunderstandings here. --Majora (talk) 20:16, 15 May 2018 (UTC)

"Who says that section 49 is much broader exactly": the Copyright Council of Finland. As an example, there is statement TN:2000:9, where the digitized flags in the catalog weren't protected by section 1 or 5 as independent works, but the collection of flags was protected by section 49. (summary, full text) The Copyright Council statement for name days is TN 2013:8 and for the phone book is TN 1987:16.--Zache (talk) 00:10, 16 May 2018 (UTC)
And about the rest: you don't need to trust my interpretation of section 49a. You can always check the copyright status of digital reproductions in different EU countries from the outofcopyright.eu map. --Zache (talk) 01:31, 16 May 2018 (UTC)