Commons:Bots/Requests/MerlIwBot

MerlIwBot (talk · contribs)

Operator: w:de:User:Merlissimo

Bot's tasks for which permission is being sought: interwikis

Automatic or manually assisted: Automatic unsupervised

Edit type (e.g. Continuous, daily, one time run): continuous

Maximum edit rate (eg edits per minute): 4-6

Bot flag requested: (Y/N): y

Programming language(s): Java (own framework, same as used by User:MerlLinkBot and User:MerlBot), pywikipedia

Already bot flag on: dewiki, ruwiki, nlwiki, plwiki, frwiki, ukwiki, cswiki, itwiki, some more requested

On dewiki my bot has been creating reports like de:Wikipedia:WikiProjekt_Interwikilinks#Defekte_Commons-Links and de:Wikipedia:WikiProjekt_Interwikilinks/Commonslinks for a long time now. Thousands of wrong Commons links have been fixed by humans there. The bot should sync changes from dewiki to Commons, but will also check other languages. Merlissimo (talk) 12:58, 24 March 2011 (UTC)

Discussion

Test run looks OK for me. --EugeneZelenko (talk) 14:47, 24 March 2011 (UTC)


  • At Category:Extra Flugzeugbau, there is an interwiki to en:Extra_Flugzeugbau. The article this leads to includes [[de:Extra Aircraft]]. Shouldn't the bot add that to Category:Extra Flugzeugbau rather than just remove a link? --  Docu  at 18:09, 24 March 2011 (UTC)
    The bot removed the interwiki to dewiki because it was pointing to a nonexistent page. But when searching for a replacement, the bot found two possibilities: de:Extra_Aircraft and de:Extra_EA-400. de:Extra_Aircraft was found by following other interwikis, and both have commonscat links to this Commons category. This conflict cannot be solved automatically and must be solved by running the bot in manually assisted mode. Merlissimo (talk) 19:24, 24 March 2011 (UTC)
    Sounds good. As there is already a bot operating in gallery namespace (namespace 0), I wonder whether it wouldn't be preferable for MerlIwBot to operate in category namespace, e.g. Category:Henri, comte de Chambord should be linked to the corresponding WP articles. --  Docu  at 02:26, 30 March 2011 (UTC)
    I would like to work on all namespaces (including project and template). My framework is designed to be a good addition to the already running pywikipedia bots. Neither framework is perfect: there are many cases which cannot be handled by pywikipedia automatically, and my bot won't find all new interwikis. But the edits they do make are correct (like your example above: removing the interwiki to the dead page de:Walter Extra was correct, but replacing it with de:Extra Aircraft would be perfect).
    So both frameworks should be used at the same time to get the best result. Merlissimo (talk) 09:42, 30 March 2011 (UTC)
    Given your explanation above, I'm fine with the way it resolved de:Walter Extra. Operating in project and template namespaces shouldn't be a problem, but since category namespace at Commons is the equivalent of article namespace at Wikipedia, operating in gallery namespace can create more problems than it resolves. As there is already a bot working there, I think it would be more helpful if you'd operate primarily in category namespace (and template, Commons namespaces). --  Docu  at 05:39, 31 March 2011 (UTC)
    If you want me to skip namespace 0, I can do so.
    Just for info, here is the result of my last scan, from March 24th: Merlissimo (talk) 11:15, 31 March 2011 (UTC)
ns | pages with dead interwikis | dead interwikis in total
0  | 4296  | 5361
4  | 298   | 500
10 | 245   | 360
12 | 2     | 7
14 | 21928 | 30747
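The dead-link handling described for de:Walter Extra earlier in this thread can be sketched roughly as follows. This is only a minimal illustration in Python, not the bot's actual Java code: the names `Candidate` and `resolve_dead_link`, the action labels, and the rule that a single unambiguous candidate is used as the replacement are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A possible replacement for a dead interwiki (hypothetical type)."""
    title: str   # e.g. "de:Extra Aircraft"
    source: str  # how it was found, e.g. "interwiki" or "commonscat" (assumed labels)


def resolve_dead_link(dead_link, candidates):
    """Decide what to do with an interwiki that points to a nonexistent page.

    Returns a tuple (action, target):
      ("remove", None)             no replacement was found
      ("replace", title)           exactly one distinct replacement was found
      ("remove_and_flag", titles)  several distinct replacements: the dead link
                                   is removed, and the conflict is left for a
                                   manually assisted run
    """
    # The same page may be found via several routes; count it only once.
    distinct = sorted({c.title for c in candidates})
    if not distinct:
        return ("remove", None)
    if len(distinct) == 1:
        return ("replace", distinct[0])
    return ("remove_and_flag", distinct)


# The case from this thread: two different candidates, so no automatic replacement.
found = [
    Candidate("de:Extra Aircraft", "interwiki"),
    Candidate("de:Extra Aircraft", "commonscat"),
    Candidate("de:Extra EA-400", "commonscat"),
]
print(resolve_dead_link("de:Walter Extra", found))
```

Under these assumptions the dead link is always removed, and a replacement is only written when there is no ambiguity, which matches the behaviour discussed for Category:Extra Flugzeugbau.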
Sounds good. One explanation for some of the numbers may be that the other bot has problems dealing with categories.
BTW how do you resolve conflicts between namespaces? Ideally, a category at Commons would link to articles at WP. If there is no article, but a category, then it would link to the WP category.
Does your bot have a "hint" feature? I think with pywikipediabot, one could use hint to attempt to link articles in another language. If so, could you use this option to run through the subcategories of Category:People by name and try to link en_wiki articles?
We are renaming some categories in Category:Ships by name. This would need quite a few fixes at WP. I attempt to add interwikis here whenever possible, so that articles can be found there. --  Docu  at 10:02, 2 April 2011 (UTC)
At the moment my bot uses the databases of the reports linked above as hints; they contain a complete list of all commons(cat) links from dewiki. So it has no feature to go through all subcategories. I am still improving my bot and some day it will use hints from other languages as well. But I would like to start with dewiki because this function is already tested and I know the quality of commonscat links from dewiki. I think there is no other Wikipedia having so few wrong links, because humans work on a database report every week.
Is there a general rule that article links are preferred in category namespace? Help:Category#Creating a new category does not contain this information. The pywikipediabot script mostly adds category links. That's why I configured my bot to use the category hints first if both exist. Only interwikis on gallery namespace are limited to article namespace. Merlissimo (talk) 21:04, 5 April 2011 (UTC)
It's the solution that emerged last time the question was discussed. The problem is that links to categories at Wikipedia are generally useless for users looking for additional information.
With "hints" I meant links that can't be completed based on existing ones. Obviously it should attempt to complete existing ones.
BTW, in addition to gallery namespace, I think another namespace that should be excluded is file namespace. --  Docu  at 11:56, 9 April 2011 (UTC)
Ok, changing my bot to prefer article namespace is easy. Preferring category namespace was just the way pywikipediabot does it. Perhaps you could add this recommendation to Help:Category#Creating a new category.
Having interwikis in file namespace does not really make sense in general, I think. My bot also won't touch any user space. Who is the bot operating in gallery namespace at the moment? Is it possible for the operator to focus on these 5300+ broken interwikis? I could create a list of those on toolserver webspace. Merlissimo (talk) 12:24, 9 April 2011 (UTC)
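The namespace rules agreed on at this point in the thread (skip gallery, file and user pages; prefer an article link over a category link when both hints exist) could be expressed along these lines. This is a hedged sketch in Python rather than the bot's own code: the function name `choose_target`, the constant name, and the inclusion of user talk (namespace 3) under "user space" are assumptions; the namespace numbers are the standard MediaWiki ones.

```python
# Commons namespaces the bot would leave alone, per the discussion:
# 0 = gallery, 2 = User, 3 = User talk (assumed to count as user space), 6 = File.
SKIPPED_COMMONS_NAMESPACES = {0, 2, 3, 6}


def choose_target(commons_ns, article_hint, category_hint):
    """Pick the Wikipedia page a Commons page should link to (hypothetical helper).

    commons_ns    -- namespace number of the Commons page being edited
    article_hint  -- title of a matching Wikipedia article, or None
    category_hint -- title of a matching Wikipedia category, or None
    """
    if commons_ns in SKIPPED_COMMONS_NAMESPACES:
        return None  # the bot does not edit this namespace at all
    # Prefer the article; fall back to the category only if no article exists.
    return article_hint or category_hint


# A Commons category (namespace 14) with both kinds of hint links to the article.
print(choose_target(14, "en:Example", "en:Category:Example"))
```

Falling back to the category hint only when no article is known mirrors the preference described above, where a Commons category links to the WP category only if there is no article.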
Sounds good. Eventually, I will try to expand the documentation.
The other bot is User:Emijrpbot, but broken interwikis are probably the smallest problem we have with gallery namespace. There are quite a few pages that are in a terrible state, even though interwikis get updated once in a while.
BTW If you feel like it, you might want to remove existing interwikis from File namespace. There are quite few left there, but we might want to bring this up on WP first.
I will ask a Bureaucrat to flag your bot. --  Docu  at 04:41, 10 April 2011 (UTC)
No issues. ✓approved. Thanks to the bot operator for being cooperative and responsive to concerns. –Juliancolton | Talk 12:30, 10 April 2011 (UTC)