Commons:Bots/Requests/Category redirect bot

Category redirect bot

Operator: O (висчвын)

Automatic or Manually Assisted: Automatic

Programming Language(s): Python (category_redirect.py)

Edit period(s) (e.g. Continuous, daily, one time run): Four (00:00, 06:00, 12:00, 18:00 GMT) or two times (00:00, 12:00 GMT) a day

Bot flag requested: (Y/N): Y

Functions: Fixes non-empty category redirects

Discussion

Test run looks OK for me. --EugeneZelenko (talk) 14:50, 22 July 2008 (UTC)
Looks good to me too, and what a useful bot! Thanks, Patrícia msg 13:57, 25 July 2008 (UTC)
  • No objections to approval. —Giggy 06:50, 26 July 2008 (UTC)
  • Oh yes please!  — Mike.lifeguard | @en.wb 12:59, 26 July 2008 (UTC)
  • Excellent! I've been having RocketBot clear out these once in awhile, but a dedicated bot is just what it needs. Will it also do hash-marked category redirects, or just {{category redirect}} ones? Also, how often will it run? Miss that part on the userpage. Rocket000 (talk) 23:22, 26 July 2008 (UTC)
P.S. Do you think it would be good to convert the 2-3k hard category redirects into soft ones? They tend to fill up too. Rocket000 (talk) 23:27, 26 July 2008 (UTC)
  • Strong Oppose unless you have it run **5** times a day, instead of 4. Or 3, but not 2, as even numbers are evil... :) Ok, seriously, this seems a very very useful task, trial run looks good. No objections. Per Rocket... if you can also see what can be done about the hard catredirs too that would be neato. ++Lar: t/c 16:45, 28 July 2008 (UTC)
  • I just thought of something... Having a bot that pretty much continuously moves images like this kinda gives everyone the same power that admins have with commanding SieBot via {{move cat}}. Anyone can create a bad/disruptive redirect, and it's highly likely the bot will get to it before it gets noticed. Essentially, slapping a {{seecat}} on a category page is a way to rename the cat. This wouldn't be that bad if it were like moving pages, but moving a category entails editing every single member. (beans alert) Imagine if someone added a cat redirect template to some category that had 1000s of members, right before the bot ran. The bot can do a lot before someone discovers what it's doing and hits the big red button. Sure, the bot can just as easily fix this, but that's a lot of unnecessary edits. If this were en.wp, I would say we need another system; however, on Commons, I would say the likelihood of this being used for evil is much lower but still something to consider (even if not for vandalism purposes, it can be abused in edit-wars and other situations). Maybe some safeguards can be put in place. For example, run the bot only once a day, set it to ignore categories over X amount of members or wait for the bot operator's approval, or (if technically possible) only move images in category redirects that have existed for at least X amount of time. BTW, don't worry about the hard redirects. I've got something in the works to make them feed into Category:Non-empty category redirects just like the soft ones. The pywikipedia script that's being used won't even have to be altered. :) Rocket000 (talk) 21:30, 28 July 2008 (UTC)
    I hadn't thought of the potential for abuse - I suspect it is not going to be a major issue. If so, the bot should then see if the user who made the last edit has +sysop and request confirmation if not. For now, I don't think that will be necessary, but perhaps we should revisit this if there are issues.  — Mike.lifeguard | @en.wb 22:49, 28 July 2008 (UTC)
    During the test, a user informed me that the bot moved skyline images to a car model category. When I visited that category redirect, I found that there were two {{category redirect}}s, and fixed the redirect as well as the images. Nothing major here, but just wanted to point out that users do care about what gets categorised. --O (висчвын) 22:57, 28 July 2008 (GMT)
    Well, regardless, I wasn't implying any opposition to the bot. It's just something to be on the lookout for. Rocket000 (talk) 17:08, 29 July 2008 (UTC)
  • Can the bot produce a log of moved images on the category_talk page? That would make it easier to revert or restore where the redirects are from lower tree branches, and also where category names have alternative meanings. Gnangarra 03:39, 31 July 2008 (UTC)
    That could (perhaps) also be done with the bot leaving two time stamps so the bot's history could be efficiently accessed if the need to undo exists. -- carol (talk) 12:28, 31 July 2008 (UTC)
    I think an easier way of reverting a whole category move is to simply redirect it the other way. Only takes one edit. If you're talking about a few images, the category they came from will be linked in the history log similarly to how it is when any other bot moves categories. However, this should not be used for those purposes; this is merely to keep the redirects empty. If it does start to get abused to the point where logs are useful, then stronger restrictions like I listed above will need to be taken. Rocket000 (talk) 12:08, 1 August 2008 (UTC)
  • Hmm, I kind of missed this request. I wrote this bot a couple of weeks ago after a request by Rocket000 (source). You should be careful when running automatically for the reasons pointed out above. I'm now busy converting hard redirects to soft redirects so your bot will probably have a lot of work the next run. Multichill (talk) 17:06, 1 August 2008 (UTC)
  • Well, my plan for hash-marked redirects didn't work, but that's ok since they've all been converted thanks to Multichill. Rocket000 (talk) 23:09, 5 August 2008 (UTC)
  • I feel that one has to rethink the whole move request and handling process, starting from move requests (that hang there for several months) towards the redirect. Move requests and redirects do not have much credibility if they don't work. Moreover, as I've seen several times, galleries and categories are virtually deleted by placing a redirect on them. --Foroa (talk) 10:40, 6 August 2008 (UTC)
    • Could something similar to what is done with non-empty redirect and disambiguation categories be done with Category:Categories requiring diffusion? The difference here is that the categories should contain categories but nothing else, otherwise they are not empty. --Foroa (talk) 15:44, 6 August 2008 (UTC)
      • I think we just got a new feature that will allow this. Take a look at a category, you see it now says something like (4 C, 1 P, 222 F). Instead of just the total, it now breaks it down into categories, pages, and files. No magic words yet, but hopefully we get another extremely useful one like {{PAGESINCAT}}. Alternatively, we can simply do it all with bots. I have been (very) gradually building a list (a series of regexes) for categories that should never contain media. For example, upload bots love to put images in Category:Hidden categories for some reason. I just emptied about 800 images from there. The only thing is with some other categories, it would be better to relocate the image into a subcategory instead of simply removing it, something bots can't do. Believe me, bots really suck at categorizing. They're the reason these categories get filled in the first place. Rocket000 (talk) 18:53, 6 August 2008 (UTC)
        • I know. I spent several days emptying non empty redirect and disambiguation categories. Bots should never categorise in redirected categories (and that should be easy). But it is hard to imagine that bots can avoid general and disambiguation cats. That's why such a non empty category is handy: you can try to maintain this list as empty as possible. Once it gets over a few hundreds, then it becomes hopeless and one gives up. (same with uncategorised articles and categories that are now of a manageable size). --Foroa (talk) 06:13, 7 August 2008 (UTC)
  • I once made a spec for this type of bot, mitigating a few risks. Please let us know if these are things you have thought of. (Commons:CategoryRedirectBot). User:Filnik is supposed to have written a(n almost) complete implementation. I think it would be a good idea to not have this bot run unattended until all edits are caused by actions of trusted users. Cheers! Siebrand 15:17, 8 August 2008 (UTC)
I guess that is about what we mean (there is something unclear about the {{Category redirect}} in the last paragraph of the introduction). In the meantime, we now have the Category:Non-empty category redirects too, so with this preselection, the bot could run more frequently without killing the system (the sooner the user gets feedback, the quicker he will learn ;). I guess that a protection scheme, such as with {{Rename}}, could be used.
I think that before executing any move, the bot should check if the destination is OK and not redirected itself. It should possibly not move if the destination has a move request outstanding or is a disambiguation category (TBD). It would help a lot if tools like hotcat did not show such redirect categories; people just learn the wrong category names. --Foroa (talk) 16:50, 8 August 2008 (UTC)
I asked O to please comment on Siebrand's concern. This request has been open for quite a while, and if no further concerns appear, I believe we can flag the bot. Patrícia msg 10:43, 19 August 2008 (UTC)
I concur. ++Lar: t/c 11:10, 20 August 2008 (UTC)
Re trusted users and whatnot, it may be a relatively minor issue, as users who are active in the categorisation business pay pretty good attention to what is in categories. Those users usually notify the user(s) involved in operating the bot that some content is in the wrong category. For those who are wondering, the current script (in the pywikipedia repository) only fixes category redirects marked as such. --O (висчвын) 20:55, 20 August 2008 (GMT)
I implemented a waiting period of 7 days. This should prevent most vandals, edit wars and other nasty stuff from happening. Fire it up! Multichill (talk) 21:00, 21 August 2008 (UTC)
Could I suggest to run the bot one or more times a day but only execute the move when there are fewer than 7 images and/or not more than one category in it. This will block vandals while giving quick return to the users that miscategorised items. Moreover, it is easier to detect hijacking attempts in a list that remains virtually empty. Note that, to have a correct "non-empty cat" display, the redirect template has to be evaluated after each transfer. That can be obtained by a dummy edit of the moved cat or with a dummy edit of the redirect template at the end of the batch. --Foroa (talk) 06:10, 22 August 2008 (UTC)
↑↑ That would be perfect because that category sure fills up fast... It would also be nice to have an override function for running it manually. This would be good for those (like me) that would run the script after checking all the categories with over 7 images. If the bot runs 3-4 times a day, it's very unlikely these categories will exceed 7 unless someone's trying to rename a cat this way (which is fine, but needs checking). Rocket000(talk) 08:14, 22 August 2008 (UTC)
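The size threshold with a manual override suggested in the two comments above could look roughly like the sketch below. This is only an illustration, not the pywikipedia script: the `RedirectedCategory` structure, the `should_move` function, and the constants are hypothetical names, and the "and/or" in the suggestion is read here as "and" (fewer than 7 files and at most one subcategory).

```python
from dataclasses import dataclass

# Hypothetical thresholds taken from the suggestion above.
MAX_FILES = 7          # move only when there are fewer than 7 files
MAX_SUBCATEGORIES = 1  # and not more than one subcategory


@dataclass
class RedirectedCategory:
    """Member counts of a redirected category (illustrative structure)."""
    title: str
    file_count: int
    subcategory_count: int


def should_move(cat: RedirectedCategory, override: bool = False) -> bool:
    """Return True if the bot may empty this redirected category."""
    if override:
        # A human has checked the category and is running the script manually.
        return True
    return (cat.file_count < MAX_FILES
            and cat.subcategory_count <= MAX_SUBCATEGORIES)
```

With this rule, a handful of miscategorised files would be moved on the next run, while a category with thousands of members would wait until someone checks it and runs the script with `override=True`.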
FWIW, I have an implementation of a category-redirect fixing bot that I have been running on en: for a couple of months now, and occasionally on other wikis. It is much more general than Multichill's implementation and could be localized to work here on Commons. (Among other things, it converts hard-redirected categories into soft redirects, fixes double-redirects, detects redirect loops, and avoids moving any pages into categories that don't exist.) I am planning to put it in the Pywikipediabot SVN after testing (and I'm warning Multichill about this now in case s/he has any concerns, since it would overwrite the existing Commons-only implementation). --R'n'B (talk) 15:59, 22 August 2008 (UTC)
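As a rough illustration of the checks listed in the comment above (following double redirects, detecting redirect loops, and refusing to move pages into categories that don't exist), here is a minimal sketch. It is not R'n'B's or Multichill's code: `resolve_target` is a hypothetical helper, and the redirect map and the set of existing categories are assumed to have been collected beforehand.

```python
from typing import Dict, Optional, Set


def resolve_target(start: str,
                   redirects: Dict[str, str],
                   existing: Set[str]) -> Optional[str]:
    """Follow soft redirects from `start` and return the final target.

    Returns None when `start` is not a redirect, when the chain loops,
    or when the final target does not exist.
    """
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            return None  # redirect loop detected
        seen.add(current)
    if current == start:
        return None  # not a redirect at all
    return current if current in existing else None
```

For a chain A → B → C where only C is a real category, this returns C, so members of A can go straight to the final destination instead of being moved twice.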
  • Problem: Check out Category:Non-empty category redirects. Someone redirected all "Cities and villages in Spain" categories to "municipalities of Spain" without any type of discussion. And there's a lot there. Rocket000(talk) 18:56, 25 August 2008 (UTC)
    • Individual case—the user manually moved all of the content. Furthermore, Filnik did write a script which utilises the CheckPage, however it is not in subversion yet, but on Botwiki. --O (висчвын) 20:45, 25 August 2008 (GMT)
      • The code was changed a lot, don't know if it still works here. If the redirect is disputed you should tag those pages as {{CatDispute}}. Multichill (talk) 21:03, 25 August 2008 (UTC)
        • I think you guys are missing the point. Some user is trying to mass rename categories by using {{category redirect}}. Nothing was manually moved. All the content is still there. But if we had a bot running, it wouldn't be. It doesn't matter if this specific case is disputed or not (I don't know or care) but it demonstrates how easily the system can be abused. Rocket000(talk) 15:03, 26 August 2008 (UTC)
          • The cooldown period (currently 6 days) is in the code exactly for this reason; to give other editors a chance to react if one user uses {{category redirect}} inappropriately. The bot will not move anything out of these categories until they have remained untouched for six days. --R'n'B (talk) 19:13, 26 August 2008 (UTC)
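The cooldown rule described above comes down to comparing the time of the last edit to the redirect page with the current time. A minimal sketch, assuming the last-edit timestamp has already been fetched (the `cooldown_elapsed` name and the `COOLDOWN` constant are illustrative, not taken from the actual script):

```python
from datetime import datetime, timedelta, timezone

# Assumed value; the discussion mentions both six and seven days.
COOLDOWN = timedelta(days=6)


def cooldown_elapsed(last_edit: datetime,
                     now: datetime,
                     cooldown: timedelta = COOLDOWN) -> bool:
    """Return True if the redirect page has been untouched long enough."""
    return now - last_edit >= cooldown


# Usage: a redirect last edited on 20 August is not yet eligible on 25 August.
now = datetime(2008, 8, 25, 12, 0, tzinfo=timezone.utc)
print(cooldown_elapsed(datetime(2008, 8, 20, 12, 0, tzinfo=timezone.utc), now))
```

Because any edit to the redirect page resets the timestamp, reverting a bad redirect (or tagging it as disputed) within the window keeps the bot away from it.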

If the bot is still going to have a 7-day delay (or some other time frame), can it keep a list of queued ones? Then they can be checked centrally before they are completed.  — Mike.lifeguard | @en.wb 21:35, 25 August 2008 (UTC)
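A central list of queued redirects, as asked for above, could be generated from the same last-edit timestamps the delay check needs. The sketch below is only one possible shape: `queued_report` and the wikitext line format are assumptions, and writing the lines to a wiki page is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def queued_report(last_edits: Dict[str, datetime],
                  now: datetime,
                  delay: timedelta = timedelta(days=7)) -> List[str]:
    """Return wikitext list lines for redirects still inside the delay.

    `last_edits` maps a category title (without namespace) to the time
    its redirect page was last edited.
    """
    lines = []
    # Oldest edits first, so the next categories to be processed lead the list.
    for title, last_edit in sorted(last_edits.items(), key=lambda item: item[1]):
        eligible = last_edit + delay
        if eligible > now:
            lines.append(
                f"* [[:Category:{title}]] "
                f"(eligible from {eligible:%Y-%m-%d %H:%M} UTC)"
            )
    return lines
```

Redirects whose delay has already passed are omitted, since the bot would process them on its next run anyway.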

FYI, this bot has been flagged (by Bastique), but the discussion has been left open for further comments. Since there haven't been any edits for some days, I suggest that further suggestions be made at O's talk page from now on, instead. Patrícia msg 14:56, 2 September 2008 (UTC)