Commons:Bots/Requests/OgreBot 3

Operator: Magog the Ogre (talk)


Background:

  • As of late, the file upload bots have been doing an especially lazy job of properly transferring files from Wikipedia, and their owners have not made fixes when requested. As a result, various repetitions, improperly transcluded templates, etc. are added to the page, which serves only to confuse the reader (e.g. [1] [2]).

Cleanup of files moved to Commons via bot. I have already coded most of the following functionality, and I hope to be able to add the rest as well:

Extended content
  • Lastly: I would like permission to add functionality to the bot when requested. The list above is only what I currently know I want to implement.
  • An important note: each change to the page will be classified as either "major" or "minor". If all the changes to a page are minor, then the edit will not be performed. This will prevent nuisance edits from the bot.
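
The major/minor rule above can be sketched as follows. This is only an illustration under stated assumptions: the bot itself is written in PHP, and the two rules shown (`collapse_blank_lines`, `drop_duplicate_lines`) and the `clean_page` helper are hypothetical names invented for this sketch, not the bot's actual code.

```python
import re
from typing import Callable, List, Optional, Tuple

# A rule returns (new_text, is_major), or None if it changed nothing.
Rule = Callable[[str], Optional[Tuple[str, bool]]]


def collapse_blank_lines(text: str) -> Optional[Tuple[str, bool]]:
    """Hypothetical MINOR rule: squeeze runs of 3+ newlines down to two."""
    new = re.sub(r"\n{3,}", "\n\n", text)
    return (new, False) if new != text else None


def drop_duplicate_lines(text: str) -> Optional[Tuple[str, bool]]:
    """Hypothetical MAJOR rule: drop a non-empty line that repeats the line before it."""
    lines = text.split("\n")
    kept = [
        line
        for i, line in enumerate(lines)
        if i == 0 or line == "" or line != lines[i - 1]
    ]
    new = "\n".join(kept)
    return (new, True) if new != text else None


def clean_page(text: str, rules: List[Rule]) -> Optional[str]:
    """Apply every rule in order; return the new text only if some change was major.

    Returning None means "do not save": either nothing changed, or every
    change was minor and the edit would be a nuisance edit.
    """
    current = text
    any_major = False
    for rule in rules:
        result = rule(current)
        if result is None:
            continue
        current, is_major = result
        any_major = any_major or is_major
    return current if any_major else None


RULES: List[Rule] = [collapse_blank_lines, drop_duplicate_lines]
```

With these rules, a page that only has surplus blank lines is skipped, while a page with a repeated line is saved with both the major and the minor fix applied in a single edit.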

How the bot will find the pages

  • The bot will go through the upload history from a certain date onward, and edit only the description pages of those files. The bot will pull the text from the SQL on the Toolserver, so as not to bog down Wikimedia's servers unnecessarily. Alternatively, I am considering a run which will edit every page which transcludes {{Original upload log}}, because they are the most prone to error; however, I will not do this unless I have success on the first run of the bot, because there are over 300,000 transclusions and that would take up a lot of time and resources.

Bot's tasks for which permission is being sought: See above.

Automatic or manually assisted: Automatic, unsupervised. A few test runs will be done first, with the number of files edited on each test run increasing as I become more sure there are no bugs or omissions.

Edit type (e.g. Continuous, daily, one time run): Originally meant to be a one-time run, but could perhaps run on a daily basis or more often to continually fix ugly uploads.

Maximum edit rate (eg edits per minute): Unthrottled. In fact, I plan to run several concurrent instances of the program in order to be able to make edits faster, in view of the fact that there will be a lot of pages edited.

Bot flag requested: (Y/N): Y

Programming language(s): PHP

Magog the Ogre (talk) 04:21, 27 May 2012 (UTC)

Discussion

I don't know if such edits (replacing e.g. Name by Name at German Wikipedia) were done in error or not, but they should be stopped. I only agree if at en.wikipedia is there before (example). In this and other cases, the way of attribution chosen by the author was changed. --Leyo 21:15, 28 May 2012 (UTC)
You are correct; I am removing that. Magog the Ogre (talk) 04:49, 29 May 2012 (UTC)
✓ Done - I think you'll find the recent batch of updates much more agreeable (albeit after I fixed several kinks in the peachy framework that parsed templates incorrectly). Magog the Ogre (talk) 07:03, 29 May 2012 (UTC)
Thank you.
What about adding CommonsHelper to {{Transferred from}} or at least converting [http://tools.wikimedia.de/~magnus/commonshelper.php CommonsHelper] to [//toolserver.org/~magnus/commonshelper.php CommonsHelper] (see diff)? However, IMO the information template should only contain information that is relevant for a reader or reusers. The fact that a file was first uploaded to Wikipedia and then transferred to Commons by User:XY is not relevant in this sense. Hence, it is sufficient that this information is contained in the original upload log and in the text next to the thumb. --Leyo 08:00, 29 May 2012 (UTC)
OK, I can remove the http:, although I will make it a minor edit. In which cases do you want it to remove the http:? For toolserver and wikimedia links? Also, I would be willing to remove the transferred information, in the case that a) there is other relevant information present (so it doesn't end up with an empty field and leave a {{Nosource}} included), and b) it doesn't include who transferred the file, unless it is User:Boteas, because that's also irrelevant. I would not feel comfortable removing the information altogether; perhaps it could be moved down into the original upload log field, except in situation (a) mentioned above (again, to avoid the nosource template). Magog the Ogre (talk) 17:36, 29 May 2012 (UTC)
Removing http: should not be the only change in an edit. I agree on case a), but not on case b), especially when the transferring user is mentioned next to the thumb (i.e. in newer transfers). --Leyo 23:14, 29 May 2012 (UTC)
I guess I see your point - if the initiator is listed in the upload history (which often won't be the case for CommonsHelper; File Upload Bot gets it). However, will the community support it, or will I get a ton of blowback for removing it? Magog the Ogre (talk) 00:15, 30 May 2012 (UTC)
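
For illustration, the http: removal discussed above could look roughly like the sketch below. It is written in Python rather than the bot's PHP, and the host list is an assumption: the thread only asks whether toolserver and wikimedia links should be covered and does not settle it. The name `make_protocol_relative` is hypothetical. The sketch only strips the scheme; it does not rewrite the old tools.wikimedia.de host to toolserver.org.

```python
import re

# ASSUMPTION: only toolserver.org and wikimedia.org (with any subdomain) are
# rewritten; the discussion leaves the exact set of hosts open.
_HOSTS = r"(?:[a-z0-9-]+\.)*(?:toolserver\.org|wikimedia\.org)"

# Match the start of a bracketed external link, e.g. "[http://toolserver.org/x Label]".
# The lookahead requires the URL to end at whitespace or "]", so a host such as
# "toolserver.org.example.com" is not matched by accident.
_LINK = re.compile(
    r"\[https?:(//" + _HOSTS + r"(?:/[^\s\]]*)?)(?=[\s\]])",
    re.IGNORECASE,
)


def make_protocol_relative(wikitext: str) -> str:
    """Drop the http:/https: scheme from bracketed links to the assumed hosts."""
    return _LINK.sub(r"[\1", wikitext)
```

Since this change is to be treated as minor, it would only be saved together with at least one major change to the same page.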
The bot is not adding the {{Transferred from}} template correctly. See [5], [6], [7] etc. --Sreejith K (talk) 11:34, 1 June 2012 (UTC)
Thank you. Magog the Ogre (talk) 12:13, 1 June 2012 (UTC)
Above I stated I wasn't going to run it anymore. However, while I'm still adding features, I'd like to keep running it in order to test it. It's only making a limited number of edits, which I am supervising. This doesn't seem to me like it's a problem; correct me if I'm wrong. However, I would still like the bot flag when possible. Magog the Ogre (talk) 00:42, 2 June 2012 (UTC)

If there are no objections, I think bot status should be granted. --EugeneZelenko (talk) 14:35, 6 June 2012 (UTC)