Commons talk:Bots

Bot flag

After this discussion on the village pump, I would like to point out something that has always bothered me about the bot flag: Some people think that the flag should be required to operate a bot. However, the flag is effectively a statement of trust: bot edits can be hidden from the watchlist, etc., because they are assumed to be OK. Thus, a new bot should be required to run for a while without the bot flag, so people can see and check the edits it makes. Of course, before granting the flag, someone should have a look at the contributions anyway... but some sort of "trial" period should be required, so problems can be discussed and resolved before the bot flag is granted. I think there should be a statement to that effect on the page...

What do you think? -- Duesentrieb(?!) 22:19, 2 April 2006 (UTC)

Making up policy is fun! I added it, plus a bit more. pfctdayelise (translate?) 04:02, 3 April 2006 (UTC)
Some bots should never have bot flags – Wikibooks:en:User:Uncle G's 'bot is a good example. Only bots that repeat simple tasks and would otherwise flood Special:Recentchanges need a flag. --Kernigh 22:09, 16 April 2006 (UTC)

Special:Listusers list of bots

Why does each bot link to Commons:Administrators? Perhaps we should change the link to this page somehow? pfctdayelise (translate?) 07:57, 3 April 2006 (UTC)

All flags (sysop, bureaucrat, checkuser, bot, developer ...) always link to the same page, in this case Commons:Administrators. I know of no way to change the link. Even if there are multiple flags, there is only one link. --22:09, 16 April 2006 (UTC)

Remove requests

Should we remove requests for bot assistance once the task has been completed and the requester notified? Some on the list have been completed and are now just getting in the way of the others. Lcarsdata (Talk) 19:00, 5 October 2006 (UTC)

There have been no replies to this for nearly two weeks, so I am going to go ahead and archive the sections. If anyone disagrees, revert me. Lcarsdata (Talk) 19:17, 17 October 2006 (UTC)

Trial period

I'm somewhat confused by the instructions. If I'm starting a bot on Commons do I add it to the "To be checked" list or do I apply for a bot flag and then get added or how? /Lokal_Profil 22:43, 6 May 2007 (UTC)

I don't know why we have that list. It's pretty strange. I don't think it is updated very often.
If your bot is going to do hundreds or thousands of edits, and you're using it in a systematic and regular way, then it is best to do a small trial run (< 200 edits) and apply for a bot flag. If you're just using your bot every now and then, for not many edits in a burst, then I wouldn't bother too much. --pfctdayelise (说什么?) 01:33, 7 May 2007 (UTC)
Ok. I'll start small anyway and if I feel that the edits become too many then I'll apply for a bot flag. BTW, it would probably be good to add a comment/link to Commons:AutoWikiBrowser/CheckPage (or its talk page) somewhere on this page. I didn't find out about AWB being limited on Commons until I tried running AWB and it wouldn't. /Lokal_Profil 10:27, 7 May 2007 (UTC)
I have never used AWB and I didn't even know that page existed. Can you explain a bit more please? pfctdayelise (说什么?) 16:01, 7 May 2007 (UTC)
GmaxwellYonatan first created it as a test. The first time I ever got a mention of AWB ever working here was from Lar on IRC. I have been using it a lot ever since, and my name and my bot's name are on that list. About the trial period(s), perhaps we should do trials on the bot request?  V60 干什么? · VContribs 17:18, 7 May 2007 (UTC)
AWB allows one to make repetitive edits a lot faster. On a regular account each edit has to be accepted manually, and from a bot account one can set it to do these edits automatically. What some wikis such as Commons and en.wiki have done is that they only allow AWB to be run from accounts which have been added to the list on Commons:AutoWikiBrowser/CheckPage. I didn't know that this could be done until AWB wouldn't run on Commons and I ended up on that page. Anyhow, to apply to be added one adds one's name to Commons talk:AutoWikiBrowser/CheckPage and hopes for the best. The reason why I thought a link to that page would be good from here is that for a bot AWB can be a user-friendly alternative to pywiki, and in many cases it can be used instead of a bot. Also, if a bot is running AWB it has to be added to that list before one can even start one's trial period.
The second reason for mentioning it here is that the page should probably be on the watchlist of the same people who watch "Requests for bot approval". Which translates into: could someone please take a look at the requests lodged there? =) /Lokal_Profil 18:12, 7 May 2007 (UTC)

Some recent changes

I added a comment on Vishwin's latest changes (which I reverted) on Commons_talk:Administrators/Requests_and_votes#Bot_status. -- Bryan (talk to me) 20:30, 7 May 2007 (UTC)

Reorganising

After going through some of the bots 'To be checked' I feel this page is in need of reorganising. If there are no objections I plan on creating a table which lists all the bots, their status (active/inactive), their owner and her/his status, what they do and whether or not they have a flag. Lcarsdata 21:13, 8 May 2007 (UTC)

No objection. In fact, I strongly encourage you to do it, since all of these pages need a thorough cleanup anyway.  V60 干什么? · VContribs 21:17, 8 May 2007 (UTC)
Um. Why don't we just scrap the table altogether? Why do we need such detailed info? Who cares? I think we should only list the bots that take requests, or whose owners are willing to take requests. pfctdayelise (说什么?) 10:19, 9 May 2007 (UTC)
OK, since you dislike this idea I will move the table to my userspace and make this page how you suggested. Lcarsdata 14:25, 9 May 2007 (UTC)

Edits per minute

Just noticed that the page doesn't specify a limit for how many edits per minute (with or without botflag). /Lokal_Profil 23:10, 3 July 2007 (UTC)

Do we really need it? Like many others who have said what bot flags are for, it is a statement of trust. There are bots like SieBot and VshBot doing different tasks at different rates. (zelzany - framed) 23:18, 3 July 2007 (UTC)
It's just that this tends to be one of the standard bits of info on the WP:BOT page. Also, to a certain extent, if one is using pywiki this is a parameter that will be set. But I'm mainly thinking about users without a botflag not wanting to pollute RC. /Lokal_Profil 00:28, 4 July 2007 (UTC)
Our RC is not as routinely patrolled as enwp, because more important is patrolling NI, not RC. Possibly we should have a limit on upload rate, not editing rate. Isn't there an option to ignore bot edits, anyway? I don't really see the need to impose an arbitrary limit. (Let's try and borrow the best elements from en.wp rather than just ALL the elements...) --pfctdayelise (说什么?) 05:49, 4 July 2007 (UTC)
Speaking in terms of the perl upload script, the rate of uploads is controlled by the size of the file. So it is very hard to set an upload limit. (zelzany - framed) 16:05, 4 July 2007 (UTC)
I support Pfctdayelise's and Vishwin60's POV. I made a lot of bot uploads and especially when copying small files (maps come to mind) from other Wikipedias, I upload up to 25 images per minute with the bot. The recats that User:SieBot does, go up to 50 changes per minute. Cheers! Siebrand 16:09, 4 July 2007 (UTC)
OK, just wondering. /Lokal_Profil 16:28, 4 July 2007 (UTC)
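
For operators who want to cap their own edit rate regardless of any policy limit, a throttle takes only a few lines. The sketch below is an illustration only: the class name, the figure of 10 edits per minute and the injectable clock are assumptions made for the example, not a Commons rule and not pywikipedia's own throttle.

```python
import time


class EditThrottle:
    """Keep at least 60 / edits_per_minute seconds between two edits."""

    def __init__(self, edits_per_minute, clock=time.monotonic, sleep=time.sleep):
        if edits_per_minute <= 0:
            raise ValueError("edits_per_minute must be positive")
        self.min_interval = 60.0 / edits_per_minute
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_edit = None   # time of the previous edit, None before the first one

    def wait(self):
        """Block until the next edit is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_edit is not None:
            delay = max(0.0, self.min_interval - (now - self._last_edit))
        if delay > 0:
            self._sleep(delay)
        self._last_edit = now + delay
        return delay
```

A bot loop would call `throttle.wait()` immediately before each save. Because upload time depends on file size, as noted above, a throttle like this bounds the edit rate from above but does not make it constant.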

Question

I run a bot on enWiki and upload a lot of maps here in conjunction with that bot. I have been using the Commonist upload tool for my uploads but given the sheer number of maps I am uploading along with the slight (but important) differences in descriptions even that is something of an onerous task. I am interested in writing a customized script to do bulk uploads for me and automatically process the descriptions. I'm not 100% clear on the bot policy here but I am wondering if something like this would require a bot flag/approval? Arkyan 21:39, 20 July 2007 (UTC)

Yes, you're gonna need a bot flag for uploading a lot of files (~60+). (O - RLY?) 21:44, 20 July 2007 (UTC)
Thank ye for the quick answer. I'll get a proposal put together. Arkyan 21:47, 20 July 2007 (UTC)

active/inactive

I just noticed this list of active/inactive bots. It's woefully out of date. Some enterprising admin wannabe could go clean it up (by going through the list of bot RfBs and the list of those users with bot flags and then viewing contribs to see which bots are active, and maybe also write a quick blurb on what the bot does) if they wanted to :) .... going forward should this be a 'crat task to update when the bot flag is given? ++Lar: t/c 22:24, 5 December 2007 (UTC)

List of flagged bots by last contribution:
Hope that helps, -- Bryan (talk to me) 12:11, 6 December 2007 (UTC)
I don't think that list was ever complete, let alone up to date (in fact I'm kind of mystified about what the point of it is :)) --pfctdayelise (说什么?) 08:29, 7 December 2007 (UTC)

Where do these bots run?

Do these bots run on the server? What are the chances of getting a bot to run on the server for maintenance purposes? Patstuart (talk) 19:29, 21 February 2008 (UTC)

No bots run on the main servers. Many bots run on the toolserver, while some bots run at people's home. -- Bryan (talk to me) 10:30, 23 February 2008 (UTC)

Commons:Welcome log

Could some of the bot owners back up User:SieBot for Commons:Welcome log? Thank you. --EugeneZelenko 17:03, 25 February 2008 (UTC)

Page re-write

I have re-written the page to provide more guidance to bot operators. As discussed elsewhere I have made it clear that permission should be sought for all bots, not only those that need a bot flag (the subpage Commons:Bots/Requests for flags will shortly be moved to remove the reference to flag). There are a couple of innovations borrowed from en:W, namely: (a) a section on maximum bot speed, and (b) a requirement that certain information about the bot should be shown on the bot's userpage. At the moment, very few bot pages have that info and it's irritating to follow a link back to the bot's userpage and there to find nothing at all about the bot's purpose in life. --MichaelMaggs (talk) 08:29, 1 May 2009 (UTC)
Also, I have made it clear now that any bot permission granted is only for the tasks specified in the bot request. --MichaelMaggs (talk) 18:25, 1 May 2009 (UTC)

The last part looks like a quite significant change in the guidelines (unless I was mistaken about the former rules), I'd love to see a consensus around it... --Eusebius (talk) 18:58, 1 May 2009 (UTC)
Yes, that would be good. For comparison, the policy on meta is "If you intend to expand the scope of the bot beyond that which was supported by the community, make note of this at the relevant discussion page to ensure that there is no opposition." The policy on en.W is "Should a bot operator wish to modify or extend the operation of a bot, they should ensure that they do so in compliance with this policy. Small changes, for example to fix problems or improve the operation of a particular task, are unlikely to be an issue, but larger changes should not be implemented without some discussion. Completely new tasks usually require a separate approval request. Bot operators may wish to create a separate bot account for each task. Accounts performing automated tasks without prior approval may be summarily blocked by any administrator." --MichaelMaggs (talk) 19:36, 1 May 2009 (UTC)
Are you sure you have the workforce here on Bots to enforce such a policy? On the other side of the issue, I am not a very prolific coder, but honestly, if I had to formally ask for permission before developing any new functionality, it would simply make me stop working on bots (in spite of the fact that I fully understand the interest of this policy, especially in the light of some debatable or "innovative" bot behaviours). --Eusebius (talk) 19:46, 1 May 2009 (UTC)
Well, it seems to me it goes without saying that a bot granted permission to do one task should not start doing another task without its owner coming back to seek approval for the new task. There are many instances in the bot archive where that has happened. Maybe the wording could be softened to allow for variations on existing tasks, but entirely new tasks should in my view obtain permission, as on en Wiki. --MichaelMaggs (talk) 20:43, 1 May 2009 (UTC)
How should we take into account "non-creative" tasks, such as changing the layout of one's own image pages, or running another kind of pre-existing pywikipedia script with a given configuration, running another AWB query, which will do something totally different from what the bot account was initially allowed to do? Also, I feel like there is a little confusion, in the authorization process, between the authorization of a functionality (which could be given on the basis of a manual, Wizard-of-Oz-like demo, without an actual implementation, but which should be subject, in some cases, to some kind of community approval) and the confidence vote for its implementation (the bot flag, which requires actual testing and validation and remains pretty technical). --Eusebius (talk) 21:22, 1 May 2009 (UTC)

I support the general theme of this rewrite, and thank MichaelMaggs for having taken it on (as well as reorganizing some other pages, nice work). It seems reasonable for the community to ask that when a largely previously not done task is contemplated, new approval should be sought. Tweaks to existing tasks, not needed, but big things? yes. Eusebius asks good questions about how to validate intent vs. execution, which maybe need some smithing in the policy wording but is there really disagreement with the principle that "a bot granted permission to do one task should not start doing another task without its owner coming back to seek approval for the new task" ??? ++Lar: t/c 03:48, 2 May 2009 (UTC)

In principle, I'm ok with it, but I would not necessarily come back here for approval. I would go to the relevant place, depending on the audience: VP, AN, COM:L, Geocoding, sub-projects talk pages, etc. Intuitively, I would come here only for pure maintenance tasks, I guess, technical jobs that are not done by any bot so far but that have little semantics (or really non-controversial semantics). Or, I would come here for approval on a particular way of doing a task that has been approved elsewhere (in case the technical means might be a particular burden on the DB servers, for instance).
Also, do we assume that the requests made on Bots/Requests are by default checked by whoever-has-authority-to-authorize-a-bot, and that any bot operator can do the job if it is technically well done? --Eusebius (talk) 08:14, 2 May 2009 (UTC)
Bot requests (whether or not a flag is required) are approved after community discussion by a bureaucrat. Unfortunately, the community doesn't tend to get very involved. The question about getting permission for new tasks is an interesting one, and of course depends on what we mean by "task" (the English Wikipedia policy ignores this point). Many bots are capable of being used in various ways, for example a bot which changes all occurrences of "xxx" to "yyy" in one field of the Information template could easily be used to change "aaa" to "bbb" in some other field. I wouldn't call that a new task. Perhaps the wording should be softened to ask for a bot to be re-submitted only if its functionality has been changed to carry out some significant new task? It's not easy to define a specific cut-off point, but bot owners are generally sensible and should be able to decide for themselves whether the new code they have added falls within the italic text. --MichaelMaggs (talk) 12:47, 2 May 2009 (UTC)

I noticed a sudden change of policy. It's rather strange to make such a big policy change without first getting consensus. The requirement that a bot should list all its tasks is not really practical either. Sure, it would be nice to have some of the regular tasks listed, but a lot of things are not regular. This is too much modeled after the bureaucratic way enwp deals with bots and seems to be written by someone not actually running bots. I'm going to revert this change of policy unless you can point out consensus here at Commons. Multichill (talk) 12:55, 2 May 2009 (UTC)

I will repeat what I said at the discussion page there... Policy on Commons is descriptive and we've been requesting/requiring that people seek community approval for some time now. Getting the policy wording updated to match practice seems a good thing, but I'm not sure that it is a requirement that it be discussed again. But in any case, MM's suggestion of where to discuss this seems apt. Please be prepared to explain WHY you think that asking the community for approval of something (that is potentially very disruptive and potentially will require a lot of work to fix, should it go awry) is a bad thing. Most wikis either require bot approval, or are so small that the global bot policy is a better fit. ++Lar: t/c 13:05, 2 May 2009 (UTC)
Then I'll add this. I strongly suggest you not revert anything. That policy is descriptive of how crats do things already. And therefore, absent a consensus for change, the status quo ante should remain in effect. ++Lar: t/c 13:09, 2 May 2009 (UTC)
Anonymous users are potential vandals; that's no reason to make them register.
All bots running on Wikimedia Commons must have advance permission to do so. Permission is needed whether or not the bot requires a bot flag.
This is not current practice. People use unflagged upload bots all the time.
Permission will be granted only in respect of the specific tasks that are listed in the bot request. Bots must not be used to carry out tasks for which permission has not been granted. If the bot operator later wishes to use the bot for a different or a broader range of tasks, a new request must be filed in advance for those new tasks.
This is also not current practice. Bot operators add tasks all the time. This is just a copy of enwp policy. Commons is not enwp. It looks like you guys are mixing things up.
You're severely limiting the room for manoeuvre of current bot operators and potential future bot operators. This is not the implementation of current practice; no, this is a big change of policy. Multichill (talk) 13:46, 2 May 2009 (UTC)
People who use unflagged (and unapproved) bots are not doing so in accordance with current practice, and may find their bots blocked. Our approval process is lightweight but it nevertheless is there, and has been for quite a while, going back to the very earliest adminships. This policy merely codifies and regularizes what we have been doing for a very long while now... It's not new. But supposing it's new, you haven't given a valid reason to oppose it. ++Lar: t/c 13:51, 2 May 2009 (UTC)
Indeed. You only have to look through old bot requests at Commons:Bots/Archive to see that it has for a long time been standard practice for requests to be closed as "approved" even where no bot flag was needed or even sought. As part of the request, the requester has to say whether a bot flag is being sought (y/n), a question that would be pointless if no approval was needed if the answer was n. Please note that I have softened the wording now to make it clearer what is meant by "task". --MichaelMaggs (talk) 14:07, 2 May 2009 (UTC)
Let's please try to not be as rigid as en.wp's bot policies are; the day that we start blocking bots just because there was no previous approval, and not because they are doing harm is the day I start looking for another project to contribute to. And yes, I have a bot that made almost 7.000.000 edits in all Wikimedia projects together. Siebrand 14:45, 2 May 2009 (UTC)
I don't think anyone is talking about blocking bots "just because there was no previous approval". However, if a bot is found running unapproved, and the author is unwilling to run through our very lightweight process to gain approval (it's, after all, a process in which if there are no objections, that's that) and even unwilling to answer questions about what the bot does or why, then that bot is subject to having permission to operate revoked, for the good of the project, until the work can be evaluated, especially if it's at all controversial. That's not new policy. It's just common sense. Your work is considerable, and highly appreciated. And I can't possibly imagine you ever being unwilling to answer questions about what your bots do... you've always been a shining example to the rest of us. ++Lar: t/c 15:09, 2 May 2009 (UTC)
I agree. I can't see any of our respected bot-writers here suddenly finding that they face refusal. One of the factors taken into account in bot requests is the identity and experience of the bot creator, and if Siebrand or Multichill made an application to cover some new task it's highly unlikely in practice that the community would say no. Nobody wants to turn Commons into enWiki, least of all me. --MichaelMaggs (talk) 16:18, 2 May 2009 (UTC)
There's a big difference between questioning the activity of a bot when something does not look ok, and having to request permission for every new feature. From what I understand of the rewriting of the page, we're talking about the latter, and this is why I'm ill at ease with it. --Eusebius (talk) 15:25, 2 May 2009 (UTC)
The wording now reads "Bots must not be used to carry out different tasks for which permission has not been granted. Of course, bot operators are not expected to re-apply every time they want to implement a small alteration, but if the bot's functionality has been changed to carry out some significant new task then a new request should be filed." That sounds quite unexceptional to me. After all, it is the bot and not the bot owner that is being granted approval. The grant of approval for a bot is not a general approval for its creator to add any and all functionality to it without further community involvement. --MichaelMaggs (talk) 16:11, 2 May 2009 (UTC)
You want me to request permission for every new task I add to my bots? Never did that here and have no intention of doing that. Multichill (talk) 18:56, 4 May 2009 (UTC)
Yes, we would like you to ask about every major new task that you are contemplating. Not minor tweaks but if someone has a bot now that was approved to welcome users, it's reasonable to ask for approval before it starts, for example, changing categories, is it not? If not, why exactly is it unreasonable? ++Lar: t/c 20:55, 5 May 2009 (UTC)

Can't we do this in a less bureaucratic way? Just because that's what the user group is called doesn't mean you have to enact it. We should find a better system instead of quibbling over the precise meaning of "major task" or whatever. This isn't enwiki, and we should start acting like it. Sometimes I wonder whether some people notice the logo in the top left corner...  — Mike.lifeguard 01:16, 6 May 2009 (UTC)

What do you suggest? I'm not wedded to any particular wording or implementation. Let's hear your proposal. ++Lar: t/c 01:23, 7 May 2009 (UTC)
As I understand it, Multichill's position effectively leads to approval not of bots themselves but rather of bot creators. Thus, an approved creator is able to do almost any bot task without coming back for further community approval. That worries me (not that Multichill himself would mess anything up, but others might, and I would prefer not to have one rule for him and another for everybody else). --MichaelMaggs (talk) 07:52, 7 May 2009 (UTC)
It's been the better part of two weeks without further discussion... this seems pretty clearly the way things ought to be to me. Does anyone have a reasoned objection as to why the community should not have approval rights for significant or major new tasks? ++Lar: t/c 19:46, 21 May 2009 (UTC)

Removing autocategorization from Creator pages

Hi, lately I have been doing a lot of maintenance work and cleanup of Creator templates. One persistent problem with those templates is that many of them automatically add categories, which is very confusing for users not used to it, makes it close to impossible to recategorize images and prevents people from developing deeper category schemas. I was thinking about ways to remove autocategorization from those templates and add categories directly to the files themselves. Are there bots particularly suitable for that task? I can provide a list of categories filled by autocategorizing creator templates. I was thinking about maybe using user:CommonsDelinker to append the string category:foo to all images from category:foo for each category from that list. There are about 3k such categories. Is there a better way to do it, or is user:CommonsDelinker the correct tool for the job? --Jarekt (talk) 13:42, 3 August 2009 (UTC)

You can just move the images from the category to the same category. Something like "category.py move -from:foo -to:foo". Multichill (talk) 15:06, 4 August 2009 (UTC)
Thanks. I never tried py tools but I will give them a try. --Jarekt (talk) 17:04, 4 August 2009 (UTC)
"category.py move -from:foo -to:foo" did not work. category.py could not deal with categories added by template. Quite ironic. I tried

add_text.py -cat:Photographs_by_P._Dittrich -text:"[[Category:Photographs by P. Dittrich]]" -except:"[cC]ategory:Photographs by P. Dittrich" -summary:"Categorize"

and the script did a single edit before it crashed with ValueError: need more than 2 values to unpack. I do not know much about pywikipedia, but at first sight it does not seem to work well. I will try some more. --Jarekt (talk) 04:17, 5 August 2009 (UTC)

Oh right, move won't work. "add" does work, but should be changed to work fully automatically.
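
The core of what the add_text.py call above is meant to do (append an explicit category link unless the page text already contains one) can be sketched as a pure text function. This is a standalone illustration, not pywikipedia code: the function name and the sample page text are assumptions. A category supplied only by a template never appears in the wikitext, which is why a text check like this still adds the explicit link.

```python
import re


def add_category(wikitext, category):
    """Return (new_text, changed); append the category link only if absent."""
    # Treat spaces and underscores in the category name as equivalent.
    parts = re.split(r"[ _]+", category.strip())
    name = r"[ _]+".join(re.escape(part) for part in parts)
    # Matches [[Category:Name]] and [[category:Name|sortkey]].
    pattern = re.compile(
        r"\[\[\s*[Cc]ategory\s*:\s*" + name + r"\s*(\|[^\]]*)?\]\]"
    )
    if pattern.search(wikitext):
        return wikitext, False
    new_text = wikitext.rstrip("\n") + "\n[[Category:" + category + "]]\n"
    return new_text, True
```

Running it twice over the same text is harmless: the second call finds the link added by the first and reports no change, which is what makes a fully automatic run safe to restart after a crash.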

Rotatebot

Please block Rotatebot. There isn't much harm in letting it continue, but I don't know how long it will take for Luxo to return. He will probably fix it easily once he gets back, but there isn't much use in keeping it running. It re-started rotating when a partial fix to mw was applied. In the meantime, some will have to learn how to rotate images .. -- User:Docu at 06:19, 25 September 2009 (UTC)

Docu removed this section, saying "solved". I think rather than outright removing a section from a talk page which is not his own user talk page, it's better to document what the resolution of the problem is and leave the section in place, for the benefit of all, so I've restored it. I'd like to hear HOW the matter was solved. ++Lar: t/c 15:11, 27 September 2009 (UTC)

Category deletion

Is there a bot that currently does category renaming and deletion? To rename a series of categories (see discussion), we probably need to do this. While category-bot or SieBot could do the renaming part, they aren't able to delete categories. Shall I register an additional bot account and make a bot permission request? -- User:Docu at 07:52, 25 September 2009 (UTC)

I made a request at Commons:Bots/Requests/Category-bot-helper. -- User:Docu at 09:28, 26 September 2009 (UTC)
I support the general idea. However shouldn't an admin run a bot that deletes things? There is precedent for a paired set of bots, one that does most of the work and one that does the admin-ish cleanup tasks, see Commons:Administrators/Requests/EuseBotHelper. ++Lar: t/c 17:14, 26 September 2009 (UTC)
Yep. I too think only an admin should run an admin bot. Kanonkas // talk // e-mail // 17:30, 26 September 2009 (UTC)
I added it to both request pages. In the meantime, Multichill volunteered to do that too. -- User:Docu at 19:58, 26 September 2009 (UTC)

Wikitext created by Special:Contributions/Flickr_upload_bot

As it's not entirely clear who maintains it, I'm posting this here.

The wiki source text for the author created by the above bot isn't ideal. E.g. for File:Cacatua galerita -Melbourne -eating seeds-8c.jpg, the author field looks as follows:

{{#if:36199303@N04|[http://flickr.com/photos/36199303@N04 Geoff Penaluna]|{{#if:|[{{{2}}} Geoff Penaluna]|Geoff Penaluna}}}} {{#if:|from {{{location}}}|}}{{#ifeq: {{NAMESPACE}}|File|[[Category:Flickr]]}}

Template functions shouldn't be used outside Template namespace. -- User:Docu at 11:58, 13 October 2009 (UTC)

It's not flinfo so I guess you should contact User:bryan. This can easily be solved with recursive substitution. So {{Flickr author|some parameter}} should be changed to {{subst:Flickr author|some parameter|subst=subst:}}. Multichill (talk) 18:32, 13 October 2009 (UTC)


Usage of the HIDDEN tag

Some bots and tools use the __HIDDEN__ tag for a purpose other than the one it is meant for, mainly for ignoring categories that are hidden (e.g. pages will be categorized as 'not categorized' if there are only hidden categories on them). I think it would be more logical to create a category for this usage, so bots and tools can retrieve the list of categories to ignore from it, and the hidden tag can be used for what it is really for (displaying or hiding a category, not telling bots/tools to ignore it). Any comments from bot operators about the feasibility of this? Esby (talk) 13:27, 12 February 2010 (UTC)

The use of the tag isn't really defined by bot operators, but by Commons:Categories etc. You may want to bring up this question there. For a suggestion to improve the display of various types of categories, see usability ideas -- User:Docu at 13:37, 12 February 2010 (UTC)
Well, the usage of the tag is defined by the way you use it. I see nothing in Commons:Categories that talks about it. This is not linked to interface usability; let me explain further: we have a policy that says that user categories must be categorized in the corresponding category; so far, so good. We have a template that categorizes the category and hides it. Now, if the category 'owner' does not want his category to be hidden, he can categorize his category and not hide it, in accordance with the given policy. Here, for instance, the categorization bot will fail, because it will see a visible category and will think the image was categorized when it is not. This is true for user categories, but it can apply to any category we might want to make visible to the user while a real categorization of the images concerned has not been made yet. In other words, the problem arises when the visible category being added does not make the image 'categorized' from an efficiency point of view. Hence the question for bot owners (or tool maintainers) before trying to suggest a change that we are not yet technically ready to react to correctly. Esby (talk) 14:09, 12 February 2010 (UTC)
Indeed, it seems that it's primarily defined through {{usercat}}. The proposal would do away with the hidden part as all categories would be sorted by specific types.
In the current system, the problem with making any non-topical categories visible, is that people stop categorizing images once there are some visible ones. Pictures featured somewhere used to have strings of categories of the type "featured here" and "featured there" to the point that most FP were poorly categorized.
If you put your images in a gallery instead, you can still see it listed under "usage". -- User:Docu at 15:34, 12 February 2010 (UTC)

Category talk:Images by Lilyu. Rocket000 (talk) 01:44, 13 February 2010 (UTC)

The debate there doesn't solve the problems about FP, DP, and important non-user categories being hidden. Esby (talk) 11:05, 13 February 2010 (UTC)
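
The proposal at the top of this thread (let bots read the list of categories to ignore from a dedicated tracking category instead of inferring it from the hidden flag) comes down to a simple check on the bot side. A minimal sketch, assuming the bot has already fetched both lists; the function name is hypothetical and how the ignore list is retrieved is left open:

```python
def needs_categorization(page_categories, ignored_categories):
    """True if the page has no category outside the ignore list."""
    # Normalise underscores so "Images_by_Lilyu" and "Images by Lilyu" compare equal.
    ignored = {c.replace("_", " ").strip() for c in ignored_categories}
    topical = [
        c for c in page_categories
        if c.replace("_", " ").strip() not in ignored
    ]
    return not topical
```

With this approach a user category could stay visible and still be skipped by categorization bots, as long as it is a member of the ignore list.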

Detecting duplicates during image uploads

I've noticed that a lot of image uploading bots don't check for duplicates before uploading an image. This is one of the reasons why many free-license images from Flickr have 2 or 3 dups on Commons under different names. I've added a function called imagematches to botclasses.php that should make detecting dups easy (if your bot is a PHP bot). It takes the sha1 hash of your local image and finds all the other images on Commons (or wherever) that have the same hash. Here's an example script:

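<?php
// Load the bot framework that provides the wikipedia class.
require_once 'botclasses.php';
$commons = new wikipedia;
// Point the bot at the Commons API.
$commons->url = 'http://commons.wikimedia.org/w/api.php';
// SHA-1 hash of the local file, as a hex string.
$hash = sha1(file_get_contents("localimage.jpg"));
echo "Number of copies already on Commons: ".$commons->imagematches($hash);
?>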

Happy bot-writing. Kaldari (talk) 19:09, 12 February 2010 (UTC)
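For bots that are not written in PHP, the same check can be sketched in Python. This is only a minimal sketch: it assumes the MediaWiki API's list=allimages query with its aisha1 parameter, the helper names (sha1_hex, duplicate_query_url, count_matches) are made up for illustration, and the actual HTTP request is left to whatever client the bot already uses.

```python
import hashlib
from urllib.parse import urlencode

# Assumed endpoint; point this at whichever wiki the bot uploads to.
API_URL = "https://commons.wikimedia.org/w/api.php"


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the file contents as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


def duplicate_query_url(file_hash: str, api_url: str = API_URL) -> str:
    """Build an API URL that lists files whose SHA-1 equals file_hash."""
    params = {
        "action": "query",
        "list": "allimages",
        "aisha1": file_hash,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def count_matches(api_response: dict) -> int:
    """Count the files listed in a decoded allimages JSON response."""
    return len(api_response.get("query", {}).get("allimages", []))
```

A bot would read the local file in binary mode, pass the bytes to sha1_hex, fetch the URL from duplicate_query_url, and skip the upload if count_matches on the decoded response is greater than zero.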


Bot policy / upload.py[edit]

As one of the main file upload tools is currently dead and unlikely to be fixed, I re-wrote Commons:Tools#Python_Wikipedia_Bot describing an alternative. Can I add that if fewer than 100 files per day (maybe x per month) are uploaded, it's not necessary to run this under a separate bot account or request prior approval? -- User:Docu at 09:08, 10 April 2010 (UTC)

For now I added: " If each upload is checked prior to upload (see verifyDescription option below), it isn't considered a bot. " -- User:Docu at 14:17, 11 April 2010 (UTC)


Python bots: cosmetic_changes.py[edit]

At Commons:Tools/pywiki file description cleanup, there is a light version available for Commons. It fixes some of the more frequent changes needed for localization to work. For discussion, please use the page referenced there. -- User:Docu at 22:40, 22 April 2010 (UTC)

Want to react to events on the wiki? Try Recentchanges via XMPP[edit]

Hi all! For a long time I have wanted a decent push interface for RC events. Parsing messages on IRC is unreliable, and polling the API sucks (and is also unreliable, see bugzilla:24782).

So, I have written XMLRC and set up a prototype on the Toolserver - have a look at meta:Recentchanges via XMPP for details. Basically, you point any Jabber client to the chat room enwiki@conference.jabber.toolserver.org to see the change events, like on IRC. However, if you use a client aware of the extra data attached to the messages, like rcclient.py, you will get all the information you can get from the API (in fact, you can get the exact same XML tag).

Try it out and let me know what you think! -- Daniel Kinzler (WMDE) (talk) 08:41, 18 August 2010 (UTC)

Bot to change author's name[edit]

I am wondering if there is a bot which would change an author's name on all the files they have uploaded? I wish to change the name under which my images appear. James Heilman, MD (talk) 05:27, 26 October 2010 (UTC)

User:Boing-boing unclear name[edit]

Can someone change the name of that bot so that it has a recognizable name? Thx --Sanandros (talk) 11:06, 9 November 2010 (UTC)

Need a bot to add Own work to my images, source field in the description[edit]

Per this, it seems that I uploaded hundreds of images taken by myself, with author=myself, but apparently skipped filling in the source field. Is there a bot that can go over my file uploads, find those that have author=myself (or Piotrus/User:Piotrus) and add {{own}} to the source field? --Piotr Konieczny aka Prokonsul Piotrus Talk 22:00, 7 December 2010 (UTC)

Most of them should be done now. -- RE rillke questions? 18:54, 16 March 2012 (UTC)

File history[edit]

I was wondering if there is a tool/pywiki library for analysing the edit history of a file page. Essentially I'd like to do a rough analysis of some images to see if any information has been manually added to the pages after the upload. The motivation is a list of images originally uploaded by a user, and then possibly edited, but later reuploaded as part of a batch upload (with better file descriptions and filenames). /Lokal_Profil 13:54, 5 July 2011 (UTC)

Request for a bot[edit]

Hi, can someone create an undercategorised files bot? Some files, while categorised, are undercategorised: they are categorised in a way that is tangential to their subject. For example, I've been categorising a batch of user uploads taken in the greater Kolkata area. The images are of schools, police stations, roads etc. of the area; however, they are all categorised in the (month)(year) in India format, which, whilst applicable, means that they are not connected to the categorisation tree by subject, and because they are categorised they do not show up in the uncategorised files maintenance cats.

What I have in mind is a bot that will search for files with only a single category in non subject categories and add them to Category:Undercategorised files.--KTo288 (talk) 14:37, 18 May 2012 (UTC)

retrospective request to get advance permission?[edit]

Second section says "All bots running on Wikimedia Commons must have advance permission to do so". But under blocking of bots it says "though in practice unless the bot is doing harm the operator should normally be asked to submit a retrospective bot request". How is it possible to "submit a retrospective bot request" to get "advance permission"? AzaToth 23:37, 9 June 2012 (UTC)

Bot edit rates, what kinds of bots can be exempted from 10s/edit?[edit]

In the bot speed section, it is mentioned that the maximum bot edit rate is once every 10 secs, unless the bot is doing something urgent like reverting vandalism. It recently came to my attention that a bot has been approved to operate at edit rates of 1 s/edit, but it does not appear to me that the bot in question is approved for doing any kind of urgent edits. So, there seems to be a discrepancy between the bot guideline and the approval. I see four possibilities:

  • Somehow it was overlooked in the review that the edit rate exceeded the normal maximum by a factor of 10.
  • The bot is approved to do urgent tasks, but I just do not see them as urgent. So maybe I am just misunderstanding what can be perceived as "urgent".
  • Other reasons exist for allowing higher edit rates, but these are just not described in the guideline. For instance, if the user is a "real pro" and highly trusted bot operator (based on prior history), a bot can be approved to run "on nitro"? If so, I think it would be helpful to add that to the guideline, such that it is transparent.
  • A matter of 'crat discretion in the decision making. No need to explicitly mention all possible exceptions in the guideline. If it seems sensible for the deciding 'crat, then why not?

Or maybe a fifth reason? --Slaunger (talk) 19:14, 5 July 2012 (UTC)

What problem are you trying to solve? If the bot is malfunctioning, please block it.
This is independent of the question whether it operates at the rate of 1 edit per minute or 1 per second. If the edits are productive, it's generally a non-issue. --  Docu  at 19:53, 5 July 2012 (UTC)
I am not trying to solve a problem, but trying to understand. My question is not so much related to the specific bot. If the edit rate does not matter, then why is a specific limit mentioned in the guideline? I know that years ago, it was partially due to a concern that the servers running Commons could be overloaded if several bots were running at high edit rates concurrently. But is this still a relevant concern? Personally (because I prefer to see bot edits in my watchlist), I also think there is a good reason to limit the edit rate, simply to avoid sudden spamming of watchlists, but also as a precautionary principle, such that a bot can be blocked in case it does "wrong" edits before it has completed thousands of edits (which should then be reverted, leading to an equal amount of watchlist spamming). --Slaunger (talk) 20:07, 5 July 2012 (UTC)
If edit rate needs to be enforced reasons for this should be technical first. I am in tech IRC channels all the time and I have not heard bot edit rates to be an issue. To be honest I have never heard this to be a problem and always felt bot edit rates as an expression of how often the bot runs. ie if the task is to be run all the time, or with intervals of hour(s), week(s), month(s), year(s). I could be wrong about this though...
You needn't worry about wrong edits (assuming these are edits where the bot makes incorrect edits and breaks pages/templates etc). In such a case bot should be blocked and bot operator has the responsibility to fix these. If it is disagreement of the actual edits of the bot, these can be reverted too. Mass reverts however should be avoided if possible as a compromise can perhaps be reached. I am guessing you refer to the recent candidate tagging of mine which is in discussion as you well know. I wouldn't classify such edits as "wrong" since pages aren't broken. Such is more of a content dispute. Bots should cease making edits the moment the content dispute is raised though. After all there is no emergency.
As for the watchlist remark, I kind of do not see what difference it makes. I do not use watchlists so I wouldn't know. Personally, had I been watching bot edits (I don't think I would ever watch for bot edits), I would have liked them to be clustered together so that other edits aren't littered in among them. Hence I would prefer quick bursts over spread-out edits. Also, say you are sleeping when the bot is running: the edits will be over by the time you wake up and you will see the same list. It sounds to me like you are having a UI/interface issue with watchlists, which is better addressed through MediaWiki improvements or JavaScript hacks.
-- とある白い猫 ちぃ? 22:41, 5 July 2012 (UTC)
Yes, it is the recent tagging by your bot that has triggered the question, but my question is not really about that specific case, but about the reason(s) for having the stated 10 s/edit bot edit rate in the current guidelines. The edit rate is something I am very cautious about when running my own bot, as it has been my understanding that editing too fast may stress the servers and lead to increased response times for "normal" users. I am trying to gauge if that is a myth. I just checked the stats for edits on Commons. In May 2012, there were about 2 million edits in total. The average edit rate was therefore 2000000 / (30 * 24 * 60) = 46 edits / minute, or on average 1.3 seconds between each edit. Now if a bot is run at an edit rate of 1 sec per edit, this amounts to more than the average edit rate from all other users combined. So for each such bot run concurrently at this high edit rate you add +100% to the normal average load of the servers (ballpark numbers). I would be surprised if that is not of concern to the server administrators; I would expect that they do not want bots in general to run at such high speeds, as the server load can get large bursts and, if several such high-speed bots happen to run concurrently, it can lead to server overload. At least that would be my concern. If a bot edits at 10 secs/edit, it adds about 10% to the average load. This seems to me like a safer limit. --Slaunger (talk) 06:19, 6 July 2012 (UTC)
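The ballpark arithmetic in the comment above can be reproduced with a few lines of Python. The inputs (2 million edits, a 30-day month) are the ones quoted in the comment; the function names are made up for illustration. With those inputs the average works out to about 46 edits per minute, a bot at 1 s/edit adds roughly 130% of that average, and a bot at 10 s/edit adds roughly 13%, in line with the rounded "+100%" and "10%" figures.

```python
def average_edits_per_minute(total_edits: int, days: int) -> float:
    """Average edit rate over a period of the given length in days."""
    return total_edits / (days * 24 * 60)


def seconds_between_edits(edits_per_minute: float) -> float:
    """Mean gap between consecutive edits at the given rate."""
    return 60.0 / edits_per_minute


def bot_share_of_average(bot_seconds_per_edit: float, average_rate: float) -> float:
    """Bot edit rate as a fraction of the site-wide average rate."""
    bot_rate = 60.0 / bot_seconds_per_edit
    return bot_rate / average_rate


average = average_edits_per_minute(2_000_000, 30)
print(f"average rate: {average:.1f} edits/minute")
print(f"mean gap: {seconds_between_edits(average):.2f} s")
print(f"bot at 1 s/edit: {bot_share_of_average(1, average):.0%} of average")
print(f"bot at 10 s/edit: {bot_share_of_average(10, average):.0%} of average")
```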
(ec) As the system is designed, I think it can handle roughly 100 to 200 or so changes per minute (In fact, I have never seen more than 80 transactions per minute but I am not spending my life on monitoring that). Cat-a-lot and bots create a high peak load. The system can handle that and is some sort of auto regulating. Such high peak loads however make the system less responsive for normal user operations, why there exist some throttle requirements for bots. A bot that runs on 1 per second consumes around 33 to 66 % of the overall system capacity. --Foroa (talk) 06:26, 6 July 2012 (UTC)
  • There is no way my bot was editing at the rate you are suggesting. My bot's statistics never reached two digits on average during any run I had. I do not believe AWB can ever run that fast. On the last run I placed a 15 second cap between edits, so that is: (time it takes to load the page) + (time it takes to modify the wiki-text (what you see when you hit the edit link)) + (15 secs) + (time it takes to save). The bot's latest edits (that tagged candidates) hence shouldn't exceed 4 per minute. Prior to the cap I introduced due to your concerns, I had not seen my bot do more than 8 per minute on average. -- とある白い猫 ちぃ? 22:12, 6 July 2012 (UTC)
  • There are plenty of examples where your bot has been running at rates exceeding the normal 6 edits / minute rate, for instance 18 edits / minute around 6am on July 1, 17 edits / minute around 05:40 on July 1, 30 edits / minute around 06:00 on June 25, and I could go on. However, your bot has been approved to edit at up to 60 edits / minute, so it is not formally an abuse of privileges. I just think that at these edit rates the overall responsiveness of the system for other users begins to be affected, and I think that is a bad thing, as the overall user experience regarding responsiveness ought to have higher priority than non-urgent bot edits. I am not trying to blame you for anything in that respect, as you have crystal clear bot approval for that. I am just not convinced bots should be approved to edit at such high rates unless there are very good reasons to do so. --Slaunger (talk) 12:39, 7 July 2012 (UTC)
  • Huh, interesting. The cumulative edit rate (the rate at the end of the run) is what I was referring to. There can be higher bursts every now and then in a lengthy run, but there are plenty of much slower rates so that it evens out. I can introduce a higher time period between edits when handling stuff like assessment template tagging. Not because the wiki would break if I don't, but because you simply asked. After all, there is no emergency. I would not be able to monitor the bot's activity as actively though, since I don't want to spend a full day staring at a screen. -- とある白い猫 ちぃ? 15:35, 7 July 2012 (UTC)
  • Odd, you use the pywiki framework as well for these large jobs, don't you? When I run my bot it always by default limits the transient edit rate to no more than 6 edits per minute. This can be seen from its contributions by checking that the number of edits in a minute never exceeds six. And I do think it is the transient edit rate that matters. The edit rate can be overridden by specifying an explicit throttle parameter, but this is something you do consciously. But even your cumulative edit rate exceeds the 6 edits / minute limit for some runs. E.g., in your latest big run, which began at 5:33 on July 1, 842 edits were completed until 6:32, where the bot was apparently stopped for a few minutes, and then started again with a much lower throttle (about 3 edits / minute). In that one hour period the average edit rate was 14 edits per minute, 2-3 times larger than the default. Now again, this did not exceed the edit rate your bot is approved for, but I do not understand where you get the idea that cumulative edit rates are always kept below the 6 edits / minute limit. --Slaunger (talk) 20:48, 7 July 2012 (UTC)
  • Is there a report of server load being a problem? I am reading 0 second lag currently and there are more than 60 edits per second. I have asked devs and no load related problems are reported for commons. Edit speed is more of a concern for policy and less of a technical issue in that regard based on what I can gather. I was able to learn that a newly registered user or IP can make 8 edits per minute. In other words 60/8 = 7.5 secs should be between edits for the user to be able to consistently make edits without hitting the limit. The proposed 1 edit per 10 secs limit is higher than what IPs get and hence this limit doesn't make sense to me. Server admins put no cap on bots. They would have if there was a problem. -- とある白い猫 ちぃ? 23:46, 6 July 2012 (UTC)
  • The others know more about this than I do. The only time I've noticed a problem was when a script was uploading a 50Mb file every two seconds, other attempted uploads sometimes timed out. But that's probably quite different to making frequent text edits. --99of9 (talk) 12:04, 6 July 2012 (UTC)
    • Upload server and text server are different computers I believe. Lag on image server would not affect the text server. -- とある白い猫 ちぃ? 23:46, 6 July 2012 (UTC)
  • What was the purpose of the too-frequent edits by the bot in question? I glanced at the most recent 500 edits and it looks like the bot didn't make more than 6 edits per minute. --EugeneZelenko (talk) 15:02, 6 July 2012 (UTC)
    • You will need to be more specific than that. Can you provide a diff? I'd be happy to explain. -- とある白い猫 ちぃ? 22:12, 6 July 2012 (UTC)
    • Go back to 5:33 on July 1 for an example, but really, this is not a question about the specific bot, because it has been approved to run so fast. The question is rather why such a high edit rate has been approved, and whether it is of any concern that bots edit much faster than the default edit rate. --Slaunger (talk) 20:59, 7 July 2012 (UTC)
  • Special:Contributions/FrescoBot seems to be editing at a pace of 6 edits per min. Given the task it is doing (fixing geograph uploads), it seems way too slow. I think we should ask its operator to have it edit at a faster rate. --  Docu  at 09:06, 7 July 2012 (UTC)
    • In other words, we should ask its operator to go beyond the limit that is established in the rules?... Am I missing something here? Why is there still no clear answer to the very simple question asked by Slaunger? In my humble opinion, and after looking at the numbers provided in this discussion, such a limit seems perfectly reasonable. But that is apparently not the opinion of the experts and admins here... Alvesgaspar (talk) 09:20, 7 July 2012 (UTC)
      • It's not really a limit, it's a default value. When I evaluate a new bot request, it's actually quite helpful. --  Docu  at 09:34, 7 July 2012 (UTC)
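The distinction made above between the transient rate and the cumulative rate is easier to see with a concrete throttle. The sketch below is not pywikipedia's actual throttle; EditThrottle is a hypothetical class that enforces a minimum gap between consecutive edits, which caps the transient rate (6 edits/minute gives a 10 second gap, and the 8 edits/minute mentioned for new users gives 7.5 seconds). The clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time


class EditThrottle:
    """Enforce a minimum interval between consecutive edits."""

    def __init__(self, edits_per_minute=6.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / edits_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_edit = None  # time at which the previous edit was allowed

    def wait(self) -> float:
        """Block until the next edit is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._last_edit is not None:
            delay = max(0.0, self._last_edit + self.min_interval - now)
        if delay > 0:
            self._sleep(delay)
        self._last_edit = now + delay
        return delay
```

A bot loop would call throttle.wait() immediately before saving each page. Because the gap is enforced per edit, no single minute can contain more than the configured number of edits, whereas a cap on the cumulative average alone would still allow short bursts.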

General questions not related to a specific bot[edit]

Excuse me for starting a new subsection, but I feel that my question has been completely misunderstood and replies (with a few exceptions) driven off in completely tangential directions :-).

  1. In the bot policy it is stated: "A bot carrying out non-urgent tasks should not normally edit more frequently than once every 10 seconds." The sentence contains the word "normally", indicating that there could be cases where a bot carrying out non-urgent tasks is allowed to edit faster than that. What could be such cases?
  2. On average, about 1 edit is done on Commons every second by all users in total, of course with rates fluctuating over the time of day. It has been stated above that the servers autoregulate to a max of about 2-3 edits every second by slowing down in completing requests, a slowdown which will be felt by all users. Given that, it seems sensible to me that bot edit rates comply with one edit every 10 secs (one bot taking 10% of average load, or 3-5% of maximum load), unless it is an urgent situation, to avoid generally reduced responsiveness for all users and to average out the server load if several bots are running concurrently. Yet, we have examples of bots being approved to run at up to 60 edits every minute for carrying out non-urgent tasks. If a bot is approved to carry out non-urgent edits faster than once every 10 seconds, then what can the objective be for allowing such exemptions? --Slaunger (talk) 12:22, 7 July 2012 (UTC)
  • In answer to question 1, I would consider such circumstances as: (1) A very large job that we want done in a reasonable time period (e.g. an operation on all files would currently take over four years at that rate); (2) Edits which are thought to "cost" less server time than a typical edit (e.g. small uploads cost less than large uploads... similar divergences may exist within text edits); (3) Edits which for some reason require the operator to be present, but do not realistically take 10 seconds to evaluate (I don't want to force an operator to sit there unnecessarily slow-clicking); ... maybe others if someone brings good cause. --99of9 (talk) 12:49, 7 July 2012 (UTC)
  • Responding to question 2, I am less conservative than you for a few reasons (but I'm really not up on the facts about the server): (1) The guideline was written over 3 years ago, I expect Moore's law or software developers have improved server performance since then (if not, why not?). (2) A system that only operates flawlessly when operating under double the average load is dangerously underprovisioned. So if the figures are accurate, I'm concerned and the WMF sysops should kick up a gear, but my faith in WMF sysops means that I doubt the figures. (3) If sysops see bot loads increasing, they will be thinking about how to accommodate this. (4) There are lots more bots and lots more edits on en-wiki, so surely commons could have the same if we needed it. --99of9 (talk) 13:15, 7 July 2012 (UTC)
  • Thanks for the very good answers. They make pretty good sense to me. If the 10 secs / edit guideline today is a bit outdated it may be helpful to update it to a higher value. I have until now taken the value quite seriously because it was there. As a bot operator it is very convenient to have the possibility to edit faster than once every 10 secs, and if this is not a problem, then perhaps a mass permission should be given to, e.g., double the rate to 12 edits/minute. --Slaunger (talk) 13:34, 7 July 2012 (UTC)
  • I'd like to note that most Wikimedia wikis share resources. Bringing down commons with bot edits would require far more than our current max edits mentioned here. I suggest technical concerns to be aimed at Wikimedia staff that manages the actual servers. I am sorry but I do not believe my bot (or any bot) was ever taking 10% of the entire load capacity on commons. If you can establish server loads to be an issue we can establish and enforce limits based on that. My inquiries to the server admins did not prompt any immediate problems. Historically server admins always stated that they'd prefer performance concerns be left to them. If there is a problem that impacts servers, they can put restrictions on server end with trivial ease.
  • It is very hard for me to consider this independent of the de-flagging thread for my bot where edit rate is/was one of the arguments to this end. I realize this may not be your intent and I know you do not trust me but timing of this inquiry seems a little problematic to me.
-- とある白い猫 ちぃ? 14:51, 7 July 2012 (UTC)
  • The COM:AN/U thread did trigger my post, but my intentions are not to corner you, but rather to highlight and clarify the discrepancy between written guidelines and actual practice. See detailed response on your talk page. --Slaunger (talk) 20:54, 7 July 2012 (UTC)

A standard rate of about 12 edits/minute seems conservative enough. -- Basilicofresco (msg) 15:56, 7 July 2012 (UTC)

I think it is good to clarify some aspects. Wikimedia is a massively parallel system that works well. The problem is that, for reasons of data consistency, the main tables (one for each namespace) cannot be changed in parallel: changes to those tables normally have to be serialised and done one by one (and the results then broadcast to all servers that use them). So, this limits the maximum capacity of the system, in terms of transaction rate, to 100 to 200 per minute. Whether the changes are small or big plays no significant role. There seems to be no significant peak load difference between Commons and en:wikipedia, which seems to confirm my impression that both systems often run close to their maximum capacity. This does not mean that the servers are overloaded; they can only process that many changes per second, even if you triple the number of processors in the servers. Background tasks and job queues do not really have anything to do with the transaction rate.
Some statistics gathered over three hours on July 9, 2012
To me, it is clear that the system is much more often overloaded than a year ago and waiting times are frequently several seconds. For some files (delinker), I often have to wait minutes, but I guess this is related to other problems. As I stated before, if there are too many change requests from users and bots, they all have to wait a bit longer before completion and that's the way the system is auto-regulated.
Counting on the law of Moore to predict a faster system is a bit naive. The Law of Moore is still achieved by the chip makers, mainly by addition of additional cores (parallelism), which does not help for our problem, on the contrary. In terms of sheer processing power, processors gained only marginally last years, they gained essentially on caches, communication, OS and virtualisation, and memory accesses. I would not be surprised to see that the current server configuration has gained zero transaction rate capability increase the last 3 years because the processing power increase has been neutralised by the fact that software becomes more sophisticated and that many more processors have to be orchestrated.
So, in conclusion, this limit of 6 changes per minute was established for very good reasons, and we had better not change it without precise factual data. And anyway, whatever one specifies, the system can only go at a certain rate, and I suppose that the system is well designed, so that no single application can saturate it. --Foroa (talk) 18:08, 7 July 2012 (UTC)
Thanks, Foroa, for sharing your knowledge regarding this. Well, I am increasingly confused then. Different users have very different views on what is OK and what is not. I think we need some more factual input from users who know more about the inner workings server-side. I have asked User:Kaldari to comment here, as I know this user is a WMF developer. Maybe Kaldari knows more, or knows who to ask for some facts. The mere fact that you explicitly have to override, e.g., the pywikipedia framework with explicit throttle parameters to get it to run at rates higher than one edit per 10 secs could indicate that the default should only be deviated from if there are urgent reasons. Yes, the servers will keep on working, but when loaded by very frequent bot edits, responsiveness goes down for all users. --Slaunger (talk) 20:06, 7 July 2012 (UTC)
Bot vs user edit rates on July 20. User edit rates do not seem to fall even when a single bot (SieBot) operates at up to 60 edits/minute.
It makes one wonder where these numbers come from.
BTW GeographBot uploaded at a rate of 70 images per minute .. --  Docu  at 23:42, 7 July 2012 (UTC)
To provide further input to this case I have written a script for collecting edit rate data and presenting them as graphs for users, individual bots editing faster than 6 edits/minute, other bots and the accumulated bot edit rate. Over three hours there are three bots which operate at more than 6 edits/minute, one of them at up to 30 edits/minute. It seems that user edit rates do not go down significantly while the 30 edits / minute bot is running. I think it would be interesting to extend the graphs further back in time, to see what happens when more than one high speed bot competes for resources concurrently. However, that requires some more work on the script, as I am only allowed to go through 5000 recent user edits at a time, so I would have to "glue" something together to extend the analysis. If interest is shown I am willing to spend a little more time on it, although further development may have to wait a few hours due to vacation. -- Preceding unsigned comment added by Slaunger (talk • contribs) 23:13, 9 July 2012 (UTC)
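The script itself is not shown in the discussion, so the following is only a sketch of the kind of per-minute bucketing such an analysis needs, not Slaunger's code. It assumes edit timestamps in the ISO 8601 form the MediaWiki API returns (for example 2012-07-01T05:33:12Z); the function names are hypothetical, and fetching the edits and plotting the result are left out.

```python
from collections import Counter
from datetime import datetime


def edits_per_minute(timestamps):
    """Count edits in each calendar minute.

    timestamps: iterable of strings such as '2012-07-01T05:33:12Z'.
    Returns a Counter keyed by datetime truncated to the minute.
    """
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        counts[moment.replace(second=0)] += 1
    return counts


def peak_rate(counts):
    """Highest number of edits seen in any single minute (0 if empty)."""
    return max(counts.values(), default=0)


def minutes_over_limit(counts, limit=6):
    """Sorted list of minutes in which the edit count exceeded the limit."""
    return sorted(minute for minute, n in counts.items() if n > limit)
```

Running this separately on the timestamps of each bot and of all non-bot users gives the per-minute series that can then be compared or plotted side by side.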
Could you perhaps create a graph for the past 30, 60, 90, 180, 360 days? I'd really like to see how human edits and bot edits compare. I realize that is a lot of data and perhaps this can be run on the toolserver better. -- とある白い猫 ちぃ? 02:16, 10 July 2012 (UTC)
Would you add SieBot, GeographBot, MultichillBotT and CommonsDelinker? --  Docu  at 06:21, 10 July 2012 (UTC)
User and bot edit rates July 1-27, 2012

I have now worked on the analysis script, making it possible to accumulate data over larger stretches of time. The new graph shows data for a complete day. One bot, SieBot, has three bursts of about 30 minutes of activity with edit rates peaking at 60 edits/minute. However, the accumulated edit rate from users does not seem to fall during these bursts, indicating that the responsiveness of the system is not affected, or not affected to a degree that affects the productivity of normal users. However, this example only shows one high-speed bot operating at one instance in time, where the overall user activity is also not so high. I will try to dig deeper in the data and try to find cases where several high speed bots operate at the same time. --Slaunger (talk) 22:08, 24 July 2012 (UTC)

Zoom on July 14-15, 2012 period with high bot activity.
[I am not involved with ops, take the following with salt] In regards to Foroa's comments above: all namespaces use the same table. The namespace is just a field in the page table. (Well, it's a little bit more complicated than that. The information for a single page or edit is spread across several tables. The actual text of the page is actually stored on an entirely separate server from the rest of the info about the page.) Additionally, as far as I know, MySQL (with InnoDB) uses row-level locking, hence multiple changes to a table can happen at once provided they do not touch the same row. I imagine the guidelines for API usage (wait for one edit to finish before starting the next one), together with using the maxlag URL parameter to cause bots to auto-stop editing if the replication lag (the time it takes for changes to the database to propagate to other servers) gets too high, are sufficient. However, one should definitely talk to ops folks instead of making guesses, as guesses about optimization without measuring usually turn out to be incorrect. In all probability this is a bunch of worry over nothing. Bawolff (talk) 18:12, 30 July 2012 (UTC)
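As a rough illustration of the maxlag approach mentioned above, here is a minimal Python sketch. It assumes the documented MediaWiki behaviour that a request carrying maxlag=N is refused with an error whose code is "maxlag" and a Retry-After header when replication lag exceeds N seconds. The helper names are hypothetical, the default of 5 seconds is just a common choice, and the HTTP call itself is left to the bot's own client.

```python
def with_maxlag(params: dict, maxlag_seconds: int = 5) -> dict:
    """Return a copy of the API parameters with the maxlag parameter set."""
    out = dict(params)
    out["maxlag"] = str(maxlag_seconds)
    return out


def is_maxlag_error(response: dict) -> bool:
    """True if a decoded JSON API response reports a maxlag refusal."""
    return response.get("error", {}).get("code") == "maxlag"


def retry_delay(headers: dict, default: float = 5.0) -> float:
    """Seconds to wait before retrying, taken from Retry-After if usable."""
    try:
        return float(headers.get("Retry-After"))
    except (TypeError, ValueError):
        return default
```

A bot would add maxlag to every write request, and on a maxlag error sleep for retry_delay(...) seconds and repeat the same request, so that it backs off automatically whenever the database replicas fall behind.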

Load measurement[edit]

Great. Obviously, we don't see the most important metric: the response time for the normal user, but this can only be measured by instrumenting some code (it is probably done somewhere, but we don't have access). Maybe we can agree on some time frame where we all try to put a maximum load on the system, especially the high rate bots. Autocategorisation bots are important too, as they sometimes run for long hours ... --Foroa (talk) 07:10, 10 July 2012 (UTC)

Well, "normal" user is vague. Logged out users don't even reach PHP code most of the time. Profiling tables and load graphs are publicly available. Bawolff (talk) 18:12, 30 July 2012 (UTC)
p.s. Along with even more pretty graphs. Bawolff (talk) 17:28, 31 July 2012 (UTC)
So in layman's terms, bot edits do not cause a problem? Is this the conclusion or is it the opposite? -- とある白い猫 ちぃ? 04:08, 8 August 2012 (UTC)

In need of a bot[edit]

Who do I talk to about getting a bot to run a monthly (heck, annual might do) date-based category creation routine? -mattbuck (Talk) 13:56, 14 September 2012 (UTC)

Wikimedia Bot Study[edit]

In case anybody would like to learn more about bots and how they are used on Wikimedia projects, here is an interesting Ph.D. dissertation on the subject. --Jarekt (talk) 19:47, 21 September 2012 (UTC)

Curious about creating and operating an internationalization bot[edit]

I'm curious about creating and operating an internationalization bot here on Commons (something that converts ==License== to =={{int:license-header}}==, "own work" to {{own}}, and the like), but I don't know where to begin or whom to ask. I have a limited background in Python, so I am most curious about a pywikipediabot, but I would be open to other ideas as well (although I have no experience in programming languages other than Python). Maybe I'm not ready for this challenge yet, but I would like to at least learn the basics about bots and (if the code for some of the bots is freely available) take a look at the Python code, at the very least. Thanks so much and take care! Michael Barera (talk) 20:29, 1 January 2013 (UTC)

You don't need to know Python to operate a bot. I used Pywikipediabot before, and I have a very limited knowledge of Python. Yann (talk) 21:03, 1 January 2013 (UTC)
So how would you suggest I get started with it? Michael Barera (talk) 21:31, 1 January 2013 (UTC)
I would start with en:Wikipedia:AutoWikiBrowser, which is a good tool for simple conversions. So install it, go to Commons:AutoWikiBrowser/CheckPage and ask for access to it. Then you can learn how to use it while running in the "manual" mode, and then you can ask for a bot flag. --Jarekt (talk) 12:49, 5 February 2013 (UTC)
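For anyone who does want to see what the two conversions mentioned at the top of this thread look like in Python, here is a minimal sketch using only the standard re module. It is not taken from any existing bot: the patterns and the internationalize function are illustrative assumptions, matching "Licensing" as well as "License" is an added guess, and it only handles a level-2 heading and a source field whose entire value is "own work". A real bot would need many more cases and would still have to read and save pages through a framework such as pywikipediabot or AWB.

```python
import re

# "== License ==" or "== Licensing ==" on a line of its own (any case).
LICENSE_HEADING = re.compile(
    r"^==[ \t]*Licens(?:e|ing)[ \t]*==[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# "|source=own work" where "own work" is the whole value of the field.
OWN_WORK_SOURCE = re.compile(
    r"(\|[ \t]*source[ \t]*=[ \t]*)own work[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def internationalize(wikitext: str) -> str:
    """Replace English-only markup with its localizable equivalent."""
    text = LICENSE_HEADING.sub("=={{int:license-header}}==", wikitext)
    text = OWN_WORK_SOURCE.sub(r"\1{{own}}", text)
    return text
```

The same two patterns could equally be entered as find-and-replace rules in AutoWikiBrowser, which is why it was suggested as a starting point.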

How to stop a bot?

How to stop a bot that spams thousands of file description pages with a completely useless template? This template belongs to the user discussion pages. It's completely pointless to put it on file description pages if everything works fine. Please revert this. Thanks. --TMg 10:31, 5 February 2013 (UTC)

It is probably part of Commons:Bots/Work_requests#Convert_all_interlaced_JPGs. You can check about it there. But I assume that this is a temporary maintenance template used to tag problematic images, which are often not handled correctly by our software. Once the issue is corrected the tag will be removed. But I agree that we could have put that tag in the talk page of the file. --Jarekt (talk) 12:40, 5 February 2013 (UTC)
There are hundreds of thousands of progressive JPEG files at Commons. Only a tiny, tiny fraction of them (the very large ones) have a problem. All others are fine. There is no need to tag a non-existing issue with a pointless template just for educational purposes. Currently, the bot has stopped. For future cases: What should I do if I think a bot misbehaves? How to stop it? --TMg 19:53, 5 February 2013 (UTC)
For AWB based bots all you have to do is to write anything on their talk page. For other bots you need an admin to temporarily block it. Many bots have clear instructions on their user pages on how to stop them. See for example my User:JarektBot. --Jarekt (talk) 20:49, 5 February 2013 (UTC)

Chat about Bot Policy and Global Bots on various WP/WM projects

Hello. My name is Randall Livingstone, and I am a faculty member at Endicott College (Massachusetts, US). Over the last two years, I have been conducting research on bots and bot operators on WP/WM projects for my dissertation (which was published under CC last year...check it out here if you'd like). Some of you may already know me from that project...for others, hello!

I am continuing to learn about the bot community for a current project and am looking to chat with anyone involved with bots and bot policy on non-English WP versions or non-en.wikipedia projects (well, really anyone who wants to talk about bots...I'm looking for all perspectives). Specifically, I am looking to understand how bot policies and bot approvals vary between projects, and how Meta Bot Policy and the global bot flag are recognized/not recognized by different projects.

I am bound by my English-only language skills, however. If you are interested/willing to participate in English, we could set up an online chat, videochat, phone call, email conversation, or start an on-wiki conversation...whatever works for you. Although this is continuing research, I am ideally looking to chat with contributors sometime in the next few weeks. Please let me know via my en.wp Talk page or email me here if interested, and thank you in advance.

This research has been approved by the Institutional Review Boards at both the University of Oregon and Endicott College. (Feel free to request a copy of those protocols).

UOJComm (talk) 01:06, 25 February 2013 (UTC)

Bot speed

Can we please remove this? This is a solution without a problem. It is an artificial limit with no benefit whatsoever. If the task is making 1000 edits, this limit would make it take 60*1000=60000 seconds, or 1000 minutes, which is 16.67 hours. Bots are meant to be used for bulk or periodic edits. This issue was discussed at length in August. With 1 edit per 10 seconds, editing 1000 pages would take 2.78 hours if we take the lowest limit mentioned on the page. -- とある白い猫 ちぃ? 15:29, 6 April 2013 (UTC)

Your first calculation assumes an edit rate of 1/minute - no bot has been required to go this slow. However, per the evidence presented in the August discussion, I have doubled the speed limit to 1 edit per 5 seconds (12 per minute). This is also consistent with the maximum speed on meta m:Bot policy. Consider this a case of BRD - if anyone notices a problem with bots slowing down performance, feel free to revert this change, and we can discuss it further. --99of9 (talk) 12:36, 22 May 2013 (UTC)
Thanks, but this doesn't completely resolve my question. Why do we even care about the speed? Performance is not our worry. Even without an artificial limit, bots already have a reasonable restriction on the server end. It's higher than that of a human editor but wouldn't break the wiki either. -- とある白い猫 ちぃ? 17:34, 23 May 2013 (UTC)
If appropriately authorized techs can back up your assertion that it's their problem not ours, and we can go at whatever speed we like, I'd be happy to argue for meta policy to be changed, but I'd need to see it in print from them. --99of9 (talk) 09:06, 24 May 2013 (UTC)
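
For reference, the durations discussed in this thread can be recomputed with a few lines of Python. This is only a sketch of the arithmetic; the function name is made up for illustration, and the three intervals are the ones mentioned above (1 edit per 60, 10 and 5 seconds).

```python
def batch_duration_hours(edits: int, seconds_per_edit: float) -> float:
    """Hours needed for `edits` edits at one edit every `seconds_per_edit` seconds."""
    return edits * seconds_per_edit / 3600


if __name__ == "__main__":
    # Intervals taken from the discussion: 1/minute, 1 per 10 s, 1 per 5 s.
    for interval in (60, 10, 5):
        hours = batch_duration_hours(1000, interval)
        print(f"{interval:>2} s/edit: {hours:.2f} h")
    # 60 s/edit: 16.67 h
    # 10 s/edit: 2.78 h
    #  5 s/edit: 1.39 h
```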

Cropbot needs to be fixed

Cropbot hasn't worked since June 29, and no matter how much I complain on the talk page, nothing is done.[1] So out of frustration, I've taken it here, as I don't know where else to ask. Can someone other than the operator perhaps fix it? FunkMonk (talk) 05:44, 2 September 2014 (UTC)

If Cropbot is not working, you can try CropTool. --Steinsplitter (talk) 07:39, 2 September 2014 (UTC)

Source code & license

I think a link to source code and license would be a good addition to bot info. Palosirkka (talk) 12:41, 23 October 2014 (UTC)

Fixing a bunch of mistakes in a batch upload: bot yes or no?

Hi, WMDE managed a GLAM project resulting in an upcoming batch upload of about 500 files. Due to some technical issues, the file descriptions will have one specific error each, so WMDE is planning to correct them with one distinct bot run. Do they need an existing bot to do that, or should they apply for a new bot, or would you consider an informal permission to run a bot under a user account, with oversight by a Commons admin, as sufficient? --h-stt !? 16:27, 13 November 2014 (UTC)

Which change? Maybe I can do it with my bot. (I think no bot flag is needed for 500 edits.) --Steinsplitter (talk) 17:15, 13 November 2014 (UTC)
You may also consider just using VisualFileChange. --McZusatz (talk) 02:08, 14 November 2014 (UTC)

New bot picture

{{edit request}} Some (en-GB, de, more, maybe all (?)) subpages with translations need a touch (null-edit) by a translation administrator to show the new bot image; the old image had to be deleted. –Be..anyone 💩 11:13, 15 April 2016 (UTC)

I do not think anything can be done manually. User:FuzzyBot has to come and update the page, as this is not an issue with a cache or database that can be fixed by a touch; the source code of Commons:Bots/pl and the other subpages has the old image, and although I can edit those pages, I cannot save them. --Jarekt (talk) 12:55, 15 April 2016 (UTC)
Oops, even a TA cannot fix this? I think Brion was right, the complete translation business has to be trashed and rewritten from scratch. @Rillke: update my i18n rant list, please. :tongue: –Be..anyone 💩 04:58, 16 April 2016 (UTC)
✓ Done Apparently a TA must mark it for translation --Zhuyifei1999 (talk) 08:10, 16 April 2016 (UTC)

API change will break some bots

Quick note: Any script or bot that is using http:// to access the API, rather than https:// is going to break in June, because of changes to the API. You can find more information in this e-mail message. I'm directly contacting active, high-volume bot owners whose accounts have been using http:// recently. Please {{ping}} me if you have more questions. Whatamidoing (WMF) (talk) 17:48, 20 May 2016 (UTC)

I see warnings are going out about http support being withdrawn for API requests. I'd rather not update my pywikibot-core if I don't have to, can anyone advise if version 3.0-dev always uses https for its API calls? I'm also presuming that all my compat stuff has to be migrated, unless someone has a handy fix documented. Thanks -- (talk) 06:44, 20 May 2016 (UTC)
My request was moved without my permission. It was not originally written here, neither was it in reply to Whatamidoing. It was intended as a general question for fellow Commons bot writers. Thanks -- (talk) 11:29, 21 May 2016 (UTC)
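
For scripts that build their own API URLs, the fix is usually just a change of scheme. Below is a minimal sketch using only Python's standard library; the helper name `force_https` is hypothetical, and how a given bot framework stores its API URL is not covered here (check your framework's own configuration for that).

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return `url` with an http:// scheme upgraded to https://.

    Other schemes are left untouched. An explicit ":80" port, if present,
    is NOT rewritten by this sketch and would need separate handling.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


if __name__ == "__main__":
    # Example URL for illustration; the query string is arbitrary.
    print(force_https("http://commons.wikimedia.org/w/api.php?action=query"))
```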

Uploading files

Is it necessary to use an alternate account with bot flag for reuploading files with upload.py, and does this depend on the number of files? Jc86035 (talk) Use {{re|Jc86035}} to reply to me 15:24, 10 April 2017 (UTC)

In regard to high-speed manual editing

All the recent edits here (adding descriptions), https://commons.wikimedia.org/wiki/Special:Contributions/ShakespeareFan00, are manual.

However, following a dispute I was involved in on English Wikipedia, I felt it was reasonable to ask if I need to re-activate my dormant second account (Sfan00_IMG) and request a bot flag for it, based on the sheer speed of editing I am achieving.

No further edits will be made until feedback is obtained. ShakespeareFan00 (talk) 08:06, 24 May 2017 (UTC)

@ShakespeareFan00: The maximum number of edits per minute you have achieved today (UTC) is only 3. I'm not sure of the threshold here for manual edits, but on enwiki it is 6, so I think you're safe until an Admin or Bureaucrat weighs in. See also User talk:~riley#Commons:Bots.2FRequests.2FJeffGBot_2.   — Jeff G. ツ 13:47, 26 May 2017 (UTC)
@ShakespeareFan00: Wait, you wrote that on the 24th; you did manage 7 once that day.   — Jeff G. ツ 14:09, 26 May 2017 (UTC)
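
For anyone who wants to check such figures themselves, here is a rough sketch of one way to get a "maximum edits per minute" number from a list of edit timestamps. It counts edits per calendar minute, which is an assumption; the figures quoted above may have been obtained differently (for example with a sliding 60-second window). The function name and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def max_edits_per_minute(timestamps):
    """Largest number of edits falling within a single calendar minute."""
    # Truncate each timestamp to the minute, then count per bucket.
    per_minute = Counter(
        ts.replace(second=0, microsecond=0) for ts in timestamps
    )
    return max(per_minute.values(), default=0)


if __name__ == "__main__":
    # Hypothetical timestamps, not taken from any real contribution list.
    edits = [
        datetime(2017, 5, 24, 7, 0, 5),
        datetime(2017, 5, 24, 7, 0, 40),
        datetime(2017, 5, 24, 7, 1, 10),
    ]
    print(max_edits_per_minute(edits))  # 2
```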