Commons:Bots/Requests/Krdbot

From Wikimedia Commons, the free media repository

Krdbot (talk · contribs)

Operator: Krd (talk)

Bot's tasks for which permission is being sought: Krdbot is already active at dewiki, where, besides several other tasks, it helps to automatically update and maintain lists and articles on cultural heritage monuments in Austria. For the WLM project we would now also like to update file description pages and categories of monument pictures on Commons.

The relevant code for Commons hasn't been completed yet, so there are no test edits here so far.

Krdbot has had a bot flag on dewiki since February 2011 and has made around 28,000 edits without being blocked. (There are several deleted edits because one of its tasks consists of issuing speedy deletion requests for pages no longer in use.)

Update ID and coordinate templates and categories on files and categories of Austrian cultural heritage monuments, based on monument tables taken from dewiki. (clarified --Krd (talk) 05:16, 7 October 2011 (UTC))

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): daily

Maximum edit rate (eg edits per minute): max. 1 edit per minute

Bot flag requested: (Y/N): Y

Programming language(s): Perl

Krd (talk) 15:33, 12 August 2011 (UTC)

Discussion

  • For Commons, it might be interesting to create categories for monuments and cross-reference them to de-Wikipedia articles. Please advise us when you are ready to do a test run. --  Docu  at 15:41, 12 August 2011 (UTC)

Many files from WLM and WLA projects need a LOT of help fixing their descriptions and formatting, so a bot focusing on those files is very welcome. I think most people would like to see some sample edits first before making a decision. --Jarekt (talk) 16:01, 12 August 2011 (UTC)

First test edit has been made [1]. There are currently 36087 monuments listed, of which 9268 have a picture, so we expect around 9k edits for this first task. --Krd (talk) 19:54, 12 August 2011 (UTC)
Could you add a more detailed description of the bot's task above (the de Wikipedia part isn't that important)? How do you select the images to be tagged? --  Docu  at 11:08, 13 August 2011 (UTC)
As employer: The bot searches for pictures that are used in the monument lists (e.g. Liste der denkmalgeschützten Objekte in Wilhelmsburg) and adds the monument template and the object location if needed (e.g. File:Schloss Kreisbach Gutshof.jpg). If the "commonscat" parameter is set on the wiki, the bot adds the picture to that category and tags the category with the same tag (e.g. Category:Schloss Kreisbach). I hope that helps, greets --AleXXw talk!•me@de.wp 13:43, 13 August 2011 (UTC)
This kind of use is standard, e.g. {{Béns Andorra}} or {{Rijksmonument}}. I only speak English and German, so feel free to add some languages ;) --AleXXw talk!•me@de.wp 15:01, 13 August 2011 (UTC)
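The selection logic described above can be sketched roughly as follows. This is only an illustration in Python, not Krdbot's actual code (the bot is written in Perl and its source is not published). The `ListEntry` record, the helper names, the object ID 12345 and the coordinates are hypothetical. The template name is the one used in the test edits; the regular expressions are a simplification of real wikitext parsing.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Template name as used in the test edits; its parameter layout is an assumption.
ID_TEMPLATE = "Denkmalgeschütztes Objekt Österreich"
OBJECT_LOCATION_TEMPLATES = ("Object location", "Object location dec")


@dataclass
class ListEntry:
    """One row of a dewiki monument list (hypothetical structure)."""
    object_id: str
    filename: str
    lat: Optional[float] = None
    lon: Optional[float] = None
    commonscat: Optional[str] = None


def has_template(wikitext: str, name: str) -> bool:
    """True if the page already calls {{name}} (simplified check)."""
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*[|}]"
    return re.search(pattern, wikitext, flags=re.IGNORECASE) is not None


def categories(wikitext: str) -> set:
    """Category names on the page, with underscores treated as spaces."""
    found = re.findall(r"\[\[\s*Category\s*:\s*([^\]|]+)", wikitext,
                       flags=re.IGNORECASE)
    return {name.replace("_", " ").strip() for name in found}


def plan_actions(wikitext: str, entry: ListEntry) -> List[str]:
    """Decide which additions a file page still needs; never removes anything."""
    actions = []
    if not has_template(wikitext, ID_TEMPLATE):
        actions.append("add_id")
    has_coords = entry.lat is not None and entry.lon is not None
    if has_coords and not any(has_template(wikitext, t)
                              for t in OBJECT_LOCATION_TEMPLATES):
        actions.append("add_object_location")
    if entry.commonscat and entry.commonscat not in categories(wikitext):
        actions.append("add_category")
    return actions
```

A file that already carries the ID template, an object location and the category yields an empty action list, so the bot would skip it.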

First of all I think that such a bot could be quite useful. But second, I'm still missing a clear definition of what the bot is going to do. The bot will create a strong interdependency between names in de:WP and Commons by using IDs (which should be unproblematic) and category names. We all know that categories tend to be renamed, files move in and out, and files are assigned by mistake. So these lifecycle problems should at least be sketched. It is not clear to me what the bot will do when there is no 1:1 mapping of monuments and categories here, e.g.: Category:Church_Tribuswinkel. So the maintenance aspect of the structures created by the bot is quite important for me. These things should be worked on, but to gain experience it is necessary to start with something. --Herzi Pinki (talk) 12:47, 14 August 2011 (UTC)

There have been a few more test edits, and it seems the code has stabilised now. Please let me know if additional information is required. --Krd (talk) 16:51, 31 August 2011 (UTC)
More edits do help. However, it is still not clear to me how your bot knows which images show which object. Do you only add it to the images used in the monument lists, or do you use the list to guess the category name on Commons and add it to all images from that category? Maybe you can include the source code somewhere - such questions are often more easily answered by reading the source code. Also I think {{Denkmalgeschütztes Objekt Österreich}} should be renamed to {{Monument Austria}}, or a single "monument" template should be created to allow easier internationalization; however, if that happens we can always add redirects. Some comments about edits:
  • in this edit {{Object location}} template should be placed just below {{Information}} - that is where we place such templates. Also I am not certain it is necessary since file is already geocoded using {{location}} template.
  • in this edit new template is placed above {{Information}} template. It should probably go inside, as with other edits.
Otherwise it looks fine. --Jarekt (talk) 19:25, 31 August 2011 (UTC)
Hi Jarekt. The information which images show which object is strictly based on the dewiki monument lists.
The source code is not very readable and therefore not to be published to the world, but if necessary I could hand out an excerpt of the most important parts to a limited number of people.
The two edits you mention have been manually corrected afterwards and fixed in the program code. {{Object location}} will be grouped with {{location}} if available, or placed above {{Information}} otherwise. The latter is the current consensus among my colleagues, but this placement should perhaps be discussed again.
If you would like to see more edits, please let me know. Thank you. --Krd (talk) 19:56, 31 August 2011 (UTC)
I started a discussion about template placement on Template_talk:Location. So far all votes are for "below" option. user:Dschwen offered to share the code he used in his bot to accomplish that, in case you can use it. Greetings. --Jarekt (talk) 18:03, 1 September 2011 (UTC)
Ok, I made another test edit with location placed below. --Krd (talk) 18:19, 1 September 2011 (UTC)
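For illustration, the "below" placement discussed in this thread could be sketched like this. It is a rough Python sketch and is neither Krdbot's nor Dschwen's actual code. The function names are hypothetical, the coordinates in the usage note are invented, and the brace counting is a simplification that ignores triple-brace parameters and `<nowiki>` sections.

```python
import re
from typing import Optional, Tuple


def template_span(wikitext: str, name: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first {{name ...}} call, honouring nested templates."""
    match = re.search(r"\{\{\s*" + re.escape(name) + r"\b", wikitext,
                      flags=re.IGNORECASE)
    if match is None:
        return None
    depth = 0
    i = match.start()
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return (match.start(), i)
        else:
            i += 1
    return None  # unbalanced braces: leave the page for manual handling


def place_object_location(wikitext: str, call: str) -> str:
    """Group the call with {{Location}} if present, else put it just below {{Information}}."""
    for anchor in ("Location", "Information"):
        span = template_span(wikitext, anchor)
        if span is not None:
            end = span[1]
            return wikitext[:end] + "\n" + call + wikitext[end:]
    # Neither template found: fall back to the top of the page.
    return call + "\n" + wikitext
```

With a page that contains only `{{Information ...}}`, calling `place_object_location(page, "{{Object location dec|48.1|15.6}}")` puts the new line directly after the closing braces of `{{Information}}`, even if the description itself contains nested templates such as `{{de|...}}`.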

Independent of the details of the code, which still might need some tweaks, I feel that it is safe to grant the bot flag to user:Krdbot. --Jarekt (talk) 18:05, 1 September 2011 (UTC)

Hi all! Is there any possibility of getting the bot flag soon? WLM has been running for a week now and we're getting about 100 pictures a day. For many of them we have to add the ID etc. manually... Greets --AleXXw 23:00, 7 September 2011 (UTC)

From the description at the beginning of the request, it's still not clear what the bot operator intends to do. This despite a request to complete this.
As the template in the sample edits has now been changed (diff), it seems now that this has been converted into a request to link your personal toolserver account from all images about monuments in Austria. While such links may be useful from category descriptions, I don't think it's appropriate to place this on all uploaders' images. --  Docu  at 18:24, 11 September 2011 (UTC)
Hi Docu. The toolserver link is just for the duration of WLM; there is no other way to link to the right wiki list. It is very helpful for adding the new pictures to the right list, checking the ID, checking the location... The bot's work is quite simple: if a picture is linked in a de.wp list, it adds the ID template and, if available, the object location and the category (we have our own field "commonscat" in our lists). Greets --AleXXw 21:23, 12 September 2011 (UTC)
There is a link "File usage" on each image. This should list German Wikipedia as well. No need to replicate this in the image description.
It might be worth creating corresponding Commons categories from your structure on toolserver and linking these with the template. The bot could then add the images to these categories. --  Docu  at 06:25, 14 September 2011 (UTC)
OK, I described it wrongly... What I meant was that it helps with checking new pictures that are not yet in a list (e.g. this File: no coordinates, no categories, description something like "wayside chapel"...). There are already corresponding categories on Commons (Cultural heritage monuments in <State>/<District>/<Town>, Cultural heritage monuments in Austria without ID, Cultural heritage monuments in Austria with wrong ID,...) but without the flag we have to add everything by hand ;) greets --AleXXw 07:20, 14 September 2011 (UTC)
With "category", I meant categories for specific monuments, such as Category:Palais Herberstorff. You can add the image in there and the category will allow to find links to toolserver and Wikipedia. --  Docu  at 11:17, 14 September 2011 (UTC)
Sorry, I misunderstood you :) We now have more than 600 categories for more than 36,000 objects, we're working as fast as we can... But if the toolserver link is the main obstacle to getting the flag, we can remove it. It will work, it's just more c&p action for everyone working on it... greets --AleXXw 18:43, 14 September 2011 (UTC)
Would it be a problem for you to create corresponding categories directly and add the template there? --  Docu  at 20:12, 14 September 2011 (UTC)
Unfortunately that is not possible. There are already many untagged categories, and the object name in the list is often not appropriate as a category name. Greets --AleXXw 20:22, 14 September 2011 (UTC)
Duplicate categories wouldn't be much of an issue. You can use "Object name (locality)" as category names. I can help you tag existing categories.
Otherwise it would just build an alternate category scheme without much benefit to Commons. --  Docu  at 20:30, 14 September 2011 (UTC)

Unfortunately that will not work; for some objects it doesn't make much sense to create categories - some examples:

  • Object 128352: Wienflussregulierung und -verbauung von Wien I. bis Wien XIV. (samt Brücken, Geländern und sonstigen baulichen Bestandteilen) - some dozen bridges all over Vienna, most of which already have their own categories
  • Object 63957: Kath. Pfarrkirche hll. Drei Könige, Friedhof mit Totenkapelle, Kriegerdenkmal, the category name is Pfarrkirche Elmen (most people don't know the patronage of most of the churches)
  • There are 10 objects just called Wohnhaus in St. Pölten, 14 Scheune in Mörbisch am See, ~61 Bürgerhaus in Rust and ~195 Bürgerhaus in one of the 7 Steyr-lists...

Can you imagine the problem? If needed, there are hundreds of other examples ;) greets --AleXXw 22:02, 14 September 2011 (UTC)

Even in your current approach, you would have had to think about how the categories for all these monuments need to be named and link them (at least that's what you already seem to be doing from toolserver). So directly creating corresponding categories shouldn't be much more than you already intended to do.
We could flag for manual definition (or review) all categories where the name exceeds a given length. This would avoid problems with the two bulky ones you mention. I don't see why we couldn't have a category for Object 128352. It would just contain primarily subcategories for specific bridges.
Categories would obviously only be created once images are available. The Wohnhaus, Scheune, Bürgerhaus samples you quote mostly don't have any images. See here, how to name such categories. --  Docu  at 04:26, 15 September 2011 (UTC)
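The proposal above ("Object name (locality)" as the category name, with over-long names flagged for manual definition or review) could be sketched like this. It is a Python sketch under stated assumptions, not an agreed procedure. The 60-character threshold, the function names and the record layout are hypothetical. Flagging name collisions is an extra assumption, added because of the many identical "Wohnhaus" and "Bürgerhaus" titles mentioned earlier.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def propose_name(title: str, locality: str) -> str:
    """Build a category name in the suggested "Object name (locality)" form."""
    return f"{title} ({locality})"


def plan_categories(records: Iterable[Tuple[str, str, str]],
                    max_len: int = 60) -> Dict[str, Tuple[str, bool]]:
    """Map object ID -> (proposed name, needs_manual_review).

    records holds (object_id, title, locality) tuples. A name is flagged
    when it is longer than max_len (hypothetical threshold) or when several
    objects would end up with the same name.
    """
    names = {oid: propose_name(title, loc) for oid, title, loc in records}
    counts = Counter(names.values())
    return {
        oid: (name, len(name) > max_len or counts[name] > 1)
        for oid, name in names.items()
    }
```

Only unflagged names would be candidates for automatic creation; flagged ones, such as the long title of Object 128352 or the repeated "Wohnhaus" entries, would be left for a human to name.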
On toolserver we just link to the right wiki list (every village has its own list, most towns have several lists) and show all tagged categories and pictures. We also show all coordinates (list entry, pictures, category) to spot errors. There are already many pictures of Wohnhaus, Bürgerhaus,... E.g. Category:Cultural heritage monuments in Braunau am Inn: these objects are mostly categorized by address, but some of them have their own names (e.g. by builder or former owner). Apart from the fact that not all objects have an address, churches or castles are better categorized by name. In summary: categorisation is very hard to automate, there are too many special cases. But all available categories are linked from both lists and articles, so we could categorize images automatically in these cases. --AleXXw 06:44, 15 September 2011 (UTC)
If you are not comfortable creating new categories, I can do it for you. I'd need: "id", "title", "address", "location", "filename", Wikipedia article name. You could then link the category from toolserver. --  Docu  at 10:55, 15 September 2011 (UTC)
All data is stored on toolserver in the databases p_erfgoed_p (lists), u_krd_p (commonstags) and u_alexxw_p (mapping list to city/district/state), updated once a day. We are already working on creating categories, there is no need to rush anything... But creating categories in this case is not a task for a bot, there are too many exceptional cases. There are ~13,000 pictures linked in the lists, many of which are not tagged or geocoded on Commons. As written earlier, the bot would just add ID tags and coordinates where missing. In some lists there are already commonscats linked (e.g. St. Pölten-Wagram); in these cases the bot would add this category too. Nothing else, I'm not sure what the problem is... As offered above, I can remove the toolserver link... --AleXXw 14:46, 15 September 2011 (UTC)
Would you do an extract from the toolserver database and post it to Category talk:Cultural heritage monuments in Austria/list, I will check what I can do to create the categories. BTW, it would also need the Category:Cultural heritage monuments in .... You can skip images that are already in a specific category such as Category:Altstadt 2 (Braunau). --  Docu  at 03:57, 16 September 2011 (UTC)
The toolserver db is a 1:1 copy of the 2407 lists in de:Kategorie:Liste (Kulturdenkmale in Österreich); I'm pretty sure a list with 36645 rows will be too much for one page ;) The major problem is not creating the categories, it's naming them. In many cases there is no way to get the correct name from the lists, because it is not included. How do you decide whether to name the category by address, by title, by article name or by an English translation of the name? One of my favourites: ID 41692, name: "Eisenbahnstrecke, Wiener Vorortelinie - Teilbereich Gersthof mit Station Gersthof" (~Railroad, Vienna suburbs line - section Gersthof with Gersthof station), category name Bahnhof Wien Gersthof; because of notability regulations the article is under Vorortelinie. The object name is kind of "technical", no one would use many of the names given by the federal heritage office... Greets --AleXXw 08:29, 16 September 2011 (UTC)
Commons has lots of space. It's easier to handle one list than 2407 lists. BTW I'd only be interested in entries that actually have a file. I will look into the naming issue. --  Docu  at 10:14, 16 September 2011 (UTC)
It's there as CSV, good luck ;) BTW: please don't add new categories now, we first have to discuss it with the other volunteers! Greets --AleXXw 16:13, 16 September 2011 (UTC)

Looks good, just a few points:

  • With "location", I had in mind the village/town/city of an object, but it's true, we'd also need the coordinates. Would you add the village/town/city as well (e.g. Abfaltersbach for 2316, 2317)? Otherwise it's not possible to determine a category name.
  • BTW, which are the current Category:Cultural heritage monuments in ....-categories that go with it? Could you add that?

Thanks for your help. --  Docu  at 00:43, 17 September 2011 (UTC) (edited)

  • You're welcome, but I still don't know what the connection to this bot flag request is ;) Tonight I added ~150 ID tags by hand, that would have been great work for the bot ;) Your points:
  • There are 4 different sources for village/town/city, which may differ slightly: 1) name of the list 2) name of the city article 3) commonscat 4) official name by Statistik Austria
  • Unfortunately I don't have a list of Cultural heritage monuments in ... categories, but they are named 1) Cultural heritage monuments in Bezirk <district> for districts or Cultural heritage monuments in <town> for statutory cities (statutory cities are both city and district) and 2) Cultural heritage monuments in <village/town/city> in one of the spellings from the point above.
BTW: I'm leaving tonight for a 2-week vacation ;) You should start a discussion on Portal Diskussion:Österreich/Denkmallisten, most of the volunteers should speak English. Greets --AleXXw 10:40, 17 September 2011 (UTC)
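The naming pattern described in the points above can be written down as a small helper. This is only an illustrative Python sketch with hypothetical function names. It does not resolve which of the four spelling sources to use, and "Lilienfeld" in the usage note is just an example input, not a claim that such a category exists.

```python
def district_category(district: str, statutory_city: bool = False) -> str:
    """Category for a district; statutory cities are both city and district."""
    if statutory_city:
        return f"Cultural heritage monuments in {district}"
    return f"Cultural heritage monuments in Bezirk {district}"


def place_category(place: str) -> str:
    """Category for a village, town or city, in whichever spelling was chosen."""
    return f"Cultural heritage monuments in {place}"
```

For example, `district_category("Lilienfeld")` gives "Cultural heritage monuments in Bezirk Lilienfeld", while `place_category("Braunau am Inn")` gives the category name already quoted in this discussion.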

Can you give an update if the bot flag can be granted or which objections there still are? Thank you. --Krd (talk) 17:16, 6 October 2011 (UTC)

Please fill in "Bot's tasks for which permission is being sought:" with a description what you intend to do at Commons. Tasks for Wikipedia need to be requested there, not here. --  Docu  at 04:34, 7 October 2011 (UTC)
Done. --Krd (talk) 05:16, 7 October 2011 (UTC)
Ok for me. (I think it would be useful to create and add monument specific categories as well.) --  Docu  at 05:28, 7 October 2011 (UTC)
There are some unresolved problems with categories, so I don't know yet if that can be done automatically. We have to look into that again as soon as we can start with the project. --Krd (talk) 07:29, 7 October 2011 (UTC)

If there are no further objections, I think bot status should be granted. --EugeneZelenko (talk) 14:53, 7 October 2011 (UTC)