Commons:Bots/Work requests

Shortcuts: COM:BR · COM:BWR

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day.

Adding the Information template to files that don't have it

Hi again :) As part of the File metadata cleanup drive, I'm working to add the {{Information}} template to the ~700,000 files that don't have it, so that the information can be accessed easily. This is a complex undertaking, but there are small tasks we can take on to make incremental progress.

An easy group of files to start with are those like this one, whose description page basically consists of:

== {{int:filedesc}} ==

< Some description >

== {{int:license-header}} ==

{{Self| <some licence(s)> }}

< categories >

In this case, it's relatively easy to add the {{Information}} template:

  • add the information template under == {{int:filedesc}} ==
  • move the existing description to the Information template's Description field
  • add the name of the uploader as the author (since it's their own work)
  • add {{own}} as the source
  • add the date from EXIF data, if available, otherwise leave blank.
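
The steps above can be sketched roughly as follows. This is only a minimal illustration in Python, not the script any bot operator in this thread actually ran: the function name `wrap_in_information` and its parameters are hypothetical, and it assumes the page has exactly the simple layout shown above.

```python
import re

# Matches the simple layout: a filedesc header, a free-text description,
# then the license header and everything after it.
SIMPLE_PAGE = re.compile(
    r"\A\s*== \{\{int:filedesc\}\} ==\s*\n"
    r"(?P<desc>.*?)\n\s*"
    r"(?P<rest>== \{\{int:license-header\}\} ==.*)\Z",
    re.DOTALL,
)


def wrap_in_information(wikitext, uploader, exif_date=""):
    """Return new wikitext with {{Information}} added, or None to skip.

    Hypothetical helper: `uploader` and `exif_date` are assumed to be
    looked up elsewhere (upload log, EXIF metadata).
    """
    if re.search(r"\{\{\s*information\b", wikitext, re.IGNORECASE):
        return None  # the page already has the template
    if not re.search(r"\{\{\s*self\s*\|", wikitext, re.IGNORECASE):
        return None  # only self-licensed files are handled
    match = SIMPLE_PAGE.match(wikitext)
    if match is None:
        return None  # unusual layout: leave it for a human
    description = match.group("desc").strip()
    info = (
        "== {{int:filedesc}} ==\n"
        "{{Information\n"
        f"|description={description}\n"
        f"|date={exif_date}\n"
        "|source={{own}}\n"
        f"|author=[[User:{uploader}|{uploader}]]\n"
        "|permission=\n"
        "|other versions=\n"
        "}}\n\n"
    )
    return info + match.group("rest")
```

Returning `None` for anything that does not match keeps the sketch conservative: pages with an unexpected layout are skipped rather than guessed at.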

This will work for only a subset of the files missing the {{Information}} template, but we have to start somewhere :) (Pinging Multichill, MGA73, Amir and Keegan per previous discussions.) Guillaume (WMF) (talk) 00:50, 2 December 2014 (UTC)

I will fix this. Amir (talk) 12:05, 2 December 2014 (UTC)
I have been adding Category:Media missing infobox template and thinking about this issue. I was also trying to discuss it at VP, see here. I think we should use a divide-and-conquer approach. I would propose the following:
  1. Mark the files by adding them to Category:Media missing infobox template, which will allow everybody to see the files.
  2. Some files likely have an information template but have some syntax errors; those I try to place in Category:Pages using Information template with parsing errors.
  3. I would propose to first give the original uploaders a chance to fix the files. We can do that by writing a standard message which, without any threat of deletion, asks for help with bringing those files up to current standards. We should have one message per uploader with a list of all the files that need infoboxes. Many of the images without infobox templates are from the early days of Commons and many of those people might not be around anymore. We should also advise them on the use of the VisualFileChange gadget or on requesting specific tasks to be done by bots at Commons:Bots/Work requests.
  4. Many files have all the info, just not in the right form, for example File:Orchis militaris flowers.jpg or File:St Germain des Prés fenêtre.jpg. We might be able to recognize some of the patterns used and fill {{information}} based on that.
  5. Some images were moved from Wikipedia, like File:St michaelis.jpg, and have no information about the photographer. We would need to look the information up on EN-WP to find the name of the original uploader.
  6. Some images imply "own" work by the uploader, like File:MaisonHonfleur1.jpg or File:Pinus pinaster female.jpg, but do not actually say it. If the files have EXIF data and templates like {{PD-self}}, {{GFDL}} or {{self}}, I think it might be OK to fill the {{Information}} with {{own}}, the name of the first uploader and the EXIF date.
  7. Some files have home-brewed infobox templates that are not maintained or recognized.
  8. Many {{PD-old}} files should use {{Artwork}} instead of {{Information}}, for example File:Leonardo da Vinci Grotesque Heads.jpg.
  9. I do not know what to do with files like File:Ruins at Delfi.JPG. The user should have been advised that he needs to send the permission to OTRS, but 10 years ago, when he uploaded the image, OTRS mostly dealt with handling emails from the public, not permissions.
Once we deal with a lot of "easy" cases we can assess what is left. --Jarekt (talk) 16:46, 2 December 2014 (UTC)
Thanks, Jarekt! That's a great plan. Should we discuss the details elsewhere or is here ok? Guillaume (WMF) (talk) 18:35, 2 December 2014 (UTC)
I would just keep the discussion here. I was trying to have this discussion on VP and Commons talk:Structured data, but nobody wanted to talk about it, so this place seems better. By the way, Category:Items with OTRS permission missing infobox template seems distinctive enough to warrant a separate category. --Jarekt (talk) 18:58, 2 December 2014 (UTC)
My approach is to fix easy cases and evolve the script as we handle more complex cases. Amir (talk) 19:49, 2 December 2014 (UTC)
Amir, I am slowly working on step #1, adding Category:Media missing infobox template and more specific subdirectories; so far I am ~5% done. You can start with those files or have your own way of generating the list of files with no infoboxes. It should not be hard, as I added {{Infobox template tag}} to all infobox templates (other than {{Information}}), so any file that does not have {{Infobox template tag}} or {{Information}} is likely not to have an infobox. So maybe you want to tackle cases where author, source and possibly date and the description are present and unambiguous to a human reader (case #4); then you can develop regexp rules to detect them and place them in the correct fields. Some of those rules can be "borrowed" from toollabs:add-information. But the bot should skip unusual cases. Many of the uploads are by the same users, who might follow the same pattern, and we could process a few of the more prolific users with a custom set of rules. --Jarekt (talk) 20:37, 2 December 2014 (UTC)
Thank you for your hints, I'll use them and probably work on case #4. Amir (talk) 21:01, 2 December 2014 (UTC)

It would also be helpful to have toollabs:add-information fixed. Currently, it occasionally destroys section headers and other parts of the code. --Leyo 17:14, 2 December 2014 (UTC)

Agreed; I'll reach out to Magnus and follow up here. Guillaume (WMF) (talk) 18:35, 2 December 2014 (UTC)

I finished the script that fixes cases that consist only of language templates (example 1, example 2). Is it okay to start with them? Amir (talk) 09:01, 6 December 2014 (UTC)

That is good; however, are you going to be able to skip cases which are clearly not "own work", like File:FSO ok 1974r.jpg? Also, files with only language templates might have date, author and source which are not the same. Do you attempt to recognize those? --Jarekt (talk) 09:38, 6 December 2014 (UTC)
It skips files where Template:Self is not used, and also files where the language template spans several lines (instead of one). Is that enough? Amir (talk) 11:06, 6 December 2014 (UTC)
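
A skip rule like the one just described could look roughly like this. It is a hedged sketch, not the bot's real code: the function name `is_safe_candidate` is hypothetical, and the regular expression only handles plain language templates such as `{{en|...}}` without nested templates inside them.

```python
import re

# A language template such as {{en|Some text}} or {{de|1=Etwas Text}}.
# Assumption: no nested templates inside the description text.
LANG_TEMPLATE = re.compile(
    r"\{\{\s*([a-z]{2,3}(?:-[a-z]+)?)\s*\|(?:1=)?(.*?)\}\}", re.DOTALL
)


def is_safe_candidate(wikitext, description):
    """True if the page uses {{Self}} and the description consists
    only of single-line language templates."""
    if not re.search(r"\{\{\s*self\s*\|", wikitext, re.IGNORECASE):
        return False  # not marked as own work
    if LANG_TEMPLATE.sub("", description).strip():
        return False  # something besides language templates is present
    texts = [text for _, text in LANG_TEMPLATE.findall(description)]
    if not texts:
        return False  # no language template at all
    return all("\n" not in text.strip() for text in texts)
```

Anything that fails the check is simply left alone, which matches the "skip unusual cases" approach discussed above.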
I think that limiting it to files using {{Self}} is enough. Could you also remove Category:Media missing infobox template, in case the file has it? (You might be doing it already.) Thanks --Jarekt (talk) 17:11, 6 December 2014 (UTC)
I don't remove the self template. Should we remove it? Amir (talk) 21:23, 6 December 2014 (UTC)
I am sorry, I forgot the : and Category:Media missing infobox template did not show up. I meant to remove that category. --Jarekt (talk) 03:30, 7 December 2014 (UTC)
Yes, it does remove them. Amir (talk) 08:46, 7 December 2014 (UTC)

Next step: one-line descriptions: Commons:Bots/Requests/Dexbot_5 Amir (talk) 02:18, 25 December 2014 (UTC)

Btw, I found that there might be some uploads with "self" templates which are not "self" by the uploader because they were transferred from other wikis. See this one for an example. Some big uploaders (bots and users) were busy transferring files in the early days and should at least be exempt when using this method to add information templates. Mvg, Basvb (talk) 12:26, 25 December 2014 (UTC)

An example pattern

@Ladsgroup: @Guillaume (WMF): Last month, under my "normal" account, I went through and cleaned up a couple hundred file pages by hand looking for such patterns. Here's an easy(ish) test case for a bot to take on:

RHaworth has/had a bunch of old (2005/6ish) uploads that need formatting. They're pretty easy to do by hand, but even so there are still 61 files left that need to be completed; I did the other half by hand. The list is on this labs page. I can copy the file names over if need be. Keegan (WMF) (talk) 21:16, 11 December 2014 (UTC)

@Keegan (WMF): on File:All_Saints,_Beeston_Regis.jpg, why did you put User:RHaworth as the author, when the text was clear that it is actually User:Stavros1 (Mark Hobbs)? --99of9 (talk) 23:22, 11 December 2014 (UTC)
@99of9: because I made a mistake there. I've fixed it, thanks for pointing it out :) Keegan (talk) 03:54, 12 December 2014 (UTC)

Dutch Wiktionary pattern

There are 130,000 pronunciation uploads from Wiktionary on Commons; a few thousand (my estimate would be around 7,000-15,000) don't have information templates. Most of these were uploaded with the same pattern. There are uploads by different uploaders with different patterns. In this edit I changed the file description of a file by GerardM (whose uploads have similar patterns); the description was created by me (but could be generated based on the title). The words "eigen opname" (meaning: own recording) are added in slightly different formats; besides that, there is not really a lot of info on the images. This one might be an easy one to add templates to, and also a pretty big one. Mvg, Basvb (talk) 00:13, 14 December 2014 (UTC)

In smaller numbers this holds for other languages as well (en, pt I've seen). Mvg, Basvb (talk) 12:04, 14 December 2014 (UTC)
I will fix this for Dutch pronunciations by the weekend. Amir (talk) 02:23, 18 December 2014 (UTC)

9154 files of Dutch pronunciations didn't have the Information template. Now 8588 more files have it (so 566 still need to be fixed; I'll do that too). Amir (talk) 05:37, 19 December 2014 (UTC)

Nice work, with those numbers we can work very well! Mvg, Basvb (talk) 11:28, 19 December 2014 (UTC)


A list of books (with more than 100 files) which have no infobox template and could probably use some automated adding of the book template. Most books have a few hundred pages, and thus we are looking at a few hundred files per listed book. Basvb (talk) 00:17, 14 December 2014 (UTC)

I can finish the Book categories. I have a system going that adds book templates with unique page numbers so you can page through the files. The only slowdown is that I am creating book templates and often creator templates as I go. --Jarekt (talk) 20:28, 19 December 2014 (UTC)
✓ Done That was a good find and since all the files already use Template:LA2-NSRW, it was also easy to fix. --Jarekt (talk) 05:02, 18 December 2014 (UTC)

I can do those, since I have a script for adding page numbers so one can page through the book, but I would appreciate help with the book templates, like {{L'Odyssée}} or {{Nietzsche's Werke, III}}, since they are the most time-consuming, especially since I do not speak any of the languages. --Jarekt (talk) 05:04, 14 December 2014 (UTC)

Thanks, I can do the templates in a few days. Mvg, Basvb (talk) 12:01, 14 December 2014 (UTC)
Btw, is there any fix for the fact that all (or a lot of) books with book templates end up in Category:Files with no machine-readable source and Category:Files with no machine-readable author? Basvb (talk) 12:03, 14 December 2014 (UTC)
Template:L'Ile des Pingouins is done (I'll add them to the relevant lines of the books from here on). Basvb (talk) 14:05, 14 December 2014 (UTC)
Thanks --Jarekt (talk) 18:13, 14 December 2014 (UTC)
Thank you! I've also left a message to the French Wikisource community to see if they can help to create the book templates. Guillaume (WMF) (talk) 18:21, 15 December 2014 (UTC)
@Guillaume (WMF): Thank you, although my French is limited exactly to the understanding of book covers. I have a question: is it possible to generate, from the data about files without an infobox, a list of users who have uploaded a lot of such files (let's say over 100) or a lot of files in one category? This would help a lot in finding books like these and other patterns which can be fixed easily. Mvg, Basvb (talk) 21:13, 15 December 2014 (UTC)
Basvb: Sorry for the delay on this; I'm still new to SQL queries, so it took me a little time. The list you asked for is now available and I'll set it up to be refreshed every day. You might want to download the file to your computer to avoid encoding errors if you open it in your browser. I see familiar names in the list, like MarcBot (used for many of the books discussed above) and G.dallorto (that you mentioned below), so I'm reasonably confident that it's what you're after. Let me know if I can do anything else! Guillaume (WMF) (talk) 01:22, 17 December 2014 (UTC)
Guillaume (WMF): Thank you very much, that makes it much easier to search for the big fish, which will save the people who work on this file by file a lot of work. If we, for example, fix all files of uploaders with over 1000 uploads (without infobox), then we have the first 200k done. As for the "anything else", I indeed had another idea: depending on how hard it is, a good way to find similar uploads would be lists that are sortable by upload date, but I can be busy with this uploader list for a while. Mvg, Basvb (talk) 09:08, 17 December 2014 (UTC)
Basvb: I've made another query. You can now download a list of the files missing machine-readable metadata, grouped by user, and with the timestamp. Warning: this is a ~40 MB text file so some browsers may have issues with it. I suggest you download it to your computer and open it with a spreadsheet application, so you can reorder the content more easily. For example, I imagine that you could select all the files from a given user, and reorder them by upload date to see if there are patterns. The file isn't being updated for now, but I can set it up if you think it would be useful. Guillaume (WMF) (talk) 19:08, 17 December 2014 (UTC)
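
For anyone who prefers a script to a spreadsheet, the grouping and sorting described above could be sketched as below. The exact layout of the downloadable file is not stated in this thread, so this assumes (hypothetically) one tab-separated row per file with the columns user, file name, timestamp, and a timestamp format that sorts correctly as text, such as `20050601120000`. The function names are made up for the example.

```python
import csv
import io
from collections import defaultdict


def group_by_uploader(tsv_text):
    """Map uploader -> list of (timestamp, filename), oldest first.

    Assumed input: tab-separated rows of user, filename, timestamp.
    """
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        user, filename, timestamp = row
        groups[user].append((timestamp, filename))
    for uploads in groups.values():
        uploads.sort()  # text sort works for fixed-width timestamps
    return dict(groups)


def prolific_uploaders(groups, minimum=100):
    """List (user, file count) pairs with at least `minimum` files,
    largest first."""
    ranked = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
    return [(user, len(uploads)) for user, uploads in ranked if len(uploads) >= minimum]
```

Reading one uploader's files in upload order makes it easier to spot runs of pages that share the same description pattern.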
Guillaume (WMF): Thank you, now I can get to the regex fixing. An update of the file is not really needed (until a big chunk is done). Mvg, Basvb (talk) 20:31, 17 December 2014 (UTC)

When I saw Guillaume's message on WS, I came here to see what I could do to help. I haven't checked all these books, but for the first one I looked at (Category:Lettres de mon moulin), there is already a DjVu file of this same book edition: s:fr:Fichier:Daudet - Lettres de mon moulin.djvu... What is the usual procedure when we come across this kind of thing on Commons? Is it considered a duplicate? Thanks. --Ernest-Mtl (talk) 02:23, 16 December 2014 (UTC)

The DjVu Lettres de mon moulin file is misplaced on Wikisource and has to be uploaded to Commons. There are a number of JPG book pages, such as Gustave Flaubert's Category:Bouvard et Pécuchet, Category:L'Éducation sentimentale and Category:Madame Bovary, that are duplicates of DjVu files; now Wikisource uses the DjVu and the JPGs are no longer useful. --Wuyouyuan (talk) 13:47, 16 December 2014 (UTC)
I do not think we have a policy on that, but I am inclined to let the old files stay, and the book template we use for them can be reused for the DjVu files, as I did with files in Category:Encyclopédie – Planches V1–9 (pages assemblées, DJVU). --Jarekt (talk) 03:19, 16 December 2014 (UTC)
I just realized you are talking about the case of some files on Commons which are the same as files on French Wikisource. It is quite puzzling: why is French Wikisource hosting local files? Either way, we are not going to delete our copy just because one of the projects has a local copy, and we would still try to add metadata to our copy. --Jarekt (talk) 03:27, 16 December 2014 (UTC)
The reason why I was asking the question is that we are actually moving all those files to Commons. Some files from the early days of the projects were simply uploaded to WS. Furthermore, the quality of the DjVu scans is a lot better than the JPGs in the case of Lettres de mon moulin and I would, personally, find it a waste of space to keep individual JPGs of a book that can be accessed in DjVu, especially since the DjVu format here on Commons allows people to save individual pages as JPGs on their computer if they can't open a DjVu file... That's the reason why I was asking what the procedure was here, before doing something that would not have been considered correct. If someone is to waste time on these 200-some JPG files, it won't be me, as I consider these files useless duplicates. --Ernest-Mtl (talk) 15:07, 16 December 2014 (UTC)
In such a case, once the DjVu file is copied you should nominate the JPEGs for deletion as poorer-quality duplicates. --Jarekt (talk) 15:16, 16 December 2014 (UTC)
Yes, except when the JPEGs are not easily available. That was (is?) the case for the Encyclopédie files. Regards, Yann (talk) 17:46, 16 December 2014 (UTC)

One thing that I noticed is that most of the files here are uploads by User:MarcBot for French Wikisource and all (as far as I noticed) were unused and replaced by DjVu files. The book images are often incomplete, concentrating on the pages with "text" and skipping the title pages, tables of contents, etc. We usually do not remove files which are not identical duplicates, but in this case these are truly unusable files, since better versions uploaded later exist. --Jarekt (talk) 04:51, 22 December 2014 (UTC)

Yes. Most files are from MarcBot because I looked at all the uploads from this bot. I don't know what the best plan for the images is. Mvg, Basvb (talk) 13:21, 22 December 2014 (UTC)
Books are all done. --Jarekt (talk) 14:41, 26 February 2015 (UTC)

Over 500 images by G.dallorto

There are over 500 images by G.dallorto without information templates which follow a basic pattern, have dates in EXIF data and mainly have a self license. Seems like a pattern which could be matched. Example edit: here (the media missing information template category should also be removed). Basvb (talk) 00:30, 14 December 2014 (UTC)

My bot will fix this pattern and similar ones. I'm waiting for approval. Amir (talk) 18:58, 14 December 2014 (UTC)
✓ Done Amir (talk) 12:29, 16 December 2014 (UTC)
A lot of files in Category:Società Umanitaria (Milan) aren't done yet. Mvg, Basvb (talk) 19:56, 16 December 2014 (UTC)
Their pattern was a little bit different; my bot fixes them too now and will finish them soon. Amir (talk) 19:48, 17 December 2014 (UTC)
I see now that this user has images with a lot of different patterns (just heavily active), so processing all of those by bot isn't really suitable; they'll just be part of other bot batches where they fit. So let's close this request. Mvg, Basvb (talk) 22:32, 17 December 2014 (UTC)

Over 500 images by Lalupa

over 500 architectural shots with one sentence text description.[1] some with move table.[2] Slowking4Farmbrough's revenge 23:15, 23 December 2014 (UTC)

Thank you! Amir, do you think your bot could take care of those? Guillaume (WMF) (talk) 20:59, 7 January 2015 (UTC)
This user has uploaded 4225 files; I'm investigating. Amir (talk) 14:04, 8 January 2015 (UTC)
I fixed about 400 cases; there are some more complex cases that need to be taken care of. Amir (talk) 04:14, 12 January 2015 (UTC)

Continuing the work

We've made progress, but there are still many files missing machine-readable information.

Hello everyone,

A few days ago I looked at the numbers, and we've managed to fix more than a third of images on all wikis, and almost 100,000 on Commons! This is great :) We should continue the work to make further progress.

Commons is still the wiki with the most files missing machine-readable information (see image). Jarekt, Basvb, Slowking4, Amir: Do you think there are more groups of files with similarly-formatted file pages, or have we reached the point where it's time to expand the scope?

If we're done with the patterns we can find, it may be time to start filling the information template with less precise information. We could use a tracking category for the bot-fixed files, so we can check them later to improve the fields manually. In the meantime, the information will be available thanks to the bots, even if it's not perfect.

Here's how we could do it:

  • Take larger groups of files, like files with a == Summary == section but no info template, and put everything from that section into the "Description" field of the information template.
  • Otherwise, do the same for all content on a file page that isn't a license template or a category (this can obviously be refined).
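
The first bullet could be sketched along these lines. This is an illustrative Python sketch under stated assumptions, not an agreed bot design: the function name `summary_to_information` is hypothetical, only a level-2 `== Summary ==` header is handled, and the section text is copied into the Description field verbatim (so stray `|` or `=` characters in it could still confuse the template and would need handling in a real run).

```python
import re

# A "== Summary ==" section: header line, then everything up to the
# next level-2 header or the end of the page.
SUMMARY = re.compile(
    r"^==\s*Summary\s*==[ \t]*\n(?P<body>.*?)(?=^==[^=]|\Z)",
    re.DOTALL | re.MULTILINE,
)


def summary_to_information(wikitext):
    """Move the Summary section's text into {{Information}}'s
    description field; return the text unchanged if not applicable."""
    if re.search(r"\{\{\s*information\b", wikitext, re.IGNORECASE):
        return wikitext  # already has the template
    match = SUMMARY.search(wikitext)
    if match is None:
        return wikitext  # no Summary section found
    body = match.group("body").strip()
    block = (
        "== Summary ==\n"
        "{{Information\n"
        "|description=" + body + "\n"
        "|date=\n"
        "|source=\n"
        "|author=\n"
        "}}\n\n"
    )
    return wikitext[: match.start()] + block + wikitext[match.end():]
```

The date, source and author fields are deliberately left empty here, which is the "less precise" trade-off being proposed: the text becomes visible in the Description field until someone sorts it into the right fields.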

This approach isn't perfect, since some information like the Author and the Source may be put into the Description field. However, it's better for it to be shown in the Description field than not be shown at all, at least until the file is checked manually.

Do you think this could work? I'm open to other ideas, but I don't think we want to wait years while every file is fixed manually. Guillaume (WMF) (talk) 23:50, 4 February 2015 (UTC)

There's still lots of stuff we could do with pattern matching: see the list of authors. All MarcBot's files are books and easily fixable. I simply had a busy time and my attention shifted. One thing we could do is ask all users with fewer than 10 images to check and add an infobox, as those will be the hardest to cover in automatic attempts. There are around 100k files from ca. 50 uploaders with over 1000 files with missing info. Likely the 200k point is also still. The next 100k point is at 300 images or more. Fixing the files of 250-300 uploaders would divide the number in half, leaving us with around 200k images to fix. I think the files of these 300+ image uploaders can largely be fixed in 10-30 minutes of work per uploader for, let's say, 70% (the other 30% is not easily fixable). That would reduce the number by 150k in around 100 man-hours (the first 75k is only 25 hours of that). After that, stuff will get more time-consuming: the next 100k images are in the 100-300 images/uploader range with around 600 uploaders (which would then come to about 200 man-hours for ca. 75k more fixes). It all depends on how much time we and others would like to invest in this. Other approaches with high performance based on generic patterns outside of user uploads might prove valuable as well. I think going for quick and dirty approaches right now isn't the best idea. Is there really a hurry to fix all of this? Mvg, Basvb (talk) 00:11, 5 February 2015 (UTC)
Hmm, my estimates about the time and the percentage we can easily fix might be a bit too optimistic. But I believe we should keep aiming for high quality. Maybe we just need some more battle troops. I'm btw still trying to figure out how I can force the regex to start matching at the start of the text of a file; this is holding me back from quite a few replacements. Mvg, Basvb (talk) 00:37, 5 February 2015 (UTC)
i would say there is lots of work still for the bots, if you group just by file name, you will see the pattern. and best to do those first and then call on the brute force humans for the last 20%. maybe you would want an uploader list for suggested groupings for the bot operators. Slowking4Farmbrough's revenge 01:35, 5 February 2015 (UTC)
Fair enough :) Thank you for sharing your thoughts! Guillaume (WMF) (talk) 17:05, 10 February 2015 (UTC)
Basvb: I wouldn't say there is a hurry, but there is a choice we can make between two strategies:
  • Slowly work our way through ~425,000 files on Commons, using small bot runs and manual fixing. We don't know how long it will take to fix all the files, and during that time the information can't be accessed by the Mobile apps, MediaViewer, etc.
  • Or, do a massive migration of the remaining ~425,000 files although the infobox matching will be imperfect. Then, we can fix them manually as well, and we don't know how long it will take to clean them up, but during that time we will be able to have some (imperfect) information in Mobile apps, MediaViewer, etc. rather than nothing.
I personally think the second option is best, but if the collective decision is the first one, I respect that.
Regarding your regex issue, have you tried using ^ or \A ("Start of string" anchor)? It might be what you're looking for. Guillaume (WMF) (talk) 17:05, 10 February 2015 (UTC)
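
To illustrate the difference between the two anchors, here is a small sketch using Python's `re` module. Python is used only for illustration; the tools mentioned in this discussion may use other regex engines, and support differs between them (JavaScript, for instance, has no `\A`). The helper names are made up for the example.

```python
import re


def starts_with(text, header):
    """True only if `header` is the very first line of `text`.

    \\A matches at the start of the whole string, whatever the flags.
    """
    pattern = r"\A" + re.escape(header) + r"[ \t]*$"
    return re.search(pattern, text, re.MULTILINE) is not None


def has_line(text, header):
    """True if `header` appears as a whole line anywhere in `text`.

    With re.MULTILINE, ^ matches at the start of every line.
    """
    pattern = r"^" + re.escape(header) + r"[ \t]*$"
    return re.search(pattern, text, re.MULTILINE) is not None
```

So when multiline mode is on, `^` is not enough to pin a match to the top of the page, while `\A` is; without multiline mode, `^` already behaves like `\A` in Python.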

In the meantime, my bot is slowly adding files to Category:Media missing infobox template and its subcategories. I am probably more than 90% done, with ~560k files, so the final number should be less than 600k. I also remove files from those categories once someone adds an infobox template. I noticed some activity in other problem categories resulting from users adding {{Artwork}} and {{Information}} templates. I have not been spending much time on this effort, since I spent the last couple of months on development of the Lua version of {{Other date}}. I can help with some regular-expression-based mass replacements using my AWB-based bot, but most of the time goes into finding the files with patterns. The replacements themselves can be done with VisualFileChange by anybody who knows regular expressions. I was working for a while on adding {{Book}} templates to old books, but I was relying on others to create {{Book}} templates. Another issue I ran into was people reverting my edits which replaced home-brewed versions of {{Information}}-like templates with the real thing. Apparently we do not have a policy about {{Information}} and other infobox templates saying that they are preferable to no template or to home-brewed versions written for a single user. We might need to change that before we start asking others to add infobox templates. --Jarekt (talk) 15:45, 5 February 2015 (UTC)

By the way, looking at File:File metadata cleanup drive impact Oct2014-Jan2015.svg and comparing it to Category:Media missing infobox template suggests that there are many files that do not use any of the standard infobox templates, but do have machine-readable information. It would be interesting to see what those are. --Jarekt (talk) 15:51, 5 February 2015 (UTC)
Jarekt: The difference may be the files with description/author specified in their EXIF metadata, that are currently recognized and used by the CommonsMetadata extension. There is a change pending to not use such data any more, though, because it's so unreliable, so I expect our counts to converge once the change is deployed. Guillaume (WMF) (talk) 17:11, 10 February 2015 (UTC)

I fixed some files of MarcBot, Stahlkocher, and Immanuel Giel. I will work on them more. Amir (talk) 14:55, 7 February 2015 (UTC)

Fixed Sailko and SPUI files. So about 10,300 more files now have machine-readable information. I will go through the list and fix as much as I can. Amir (talk) 10:56, 8 February 2015 (UTC)
Thank you so much Amir! Guillaume (WMF) (talk) 17:05, 10 February 2015 (UTC)

Template updates

In addition to adding templates to the files that never had any infobox templates, there are also cases where a rare infobox template exists and needs to be either replaced with or merged into an equivalent existing template, like one from Commons:Infoboxes, or templates derived from them, like templates in Category:Infobox templates: based on Information template. Some examples can be seen below. Please help me find more such templates used by files in Category:Media missing infobox template and help with conversions or template mergers.

--Jarekt (talk) 05:13, 9 February 2015 (UTC)

Thanks, Jarekt. For user-specific templates, we don't necessarily have to convert them to the official information templates. Since custom templates are discouraged, I personally don't consider them a priority; we don't have to support every possible custom template if people don't use the official infoboxes. Just adding the machine-readable markers to the custom templates would be more than enough, and that's even something we could leave to their authors. Guillaume (WMF) (talk) 16:38, 10 February 2015 (UTC)
Just like home-brewed license templates, which we managed to retire some years back, user-specific infobox templates are just hard to work with or edit, are often abandoned and forgotten by users who retired years ago, and are prone to someone breaking them on purpose or by accident without others noticing it for years. They also often lack internationalization of the field names and proper machine-readable fields. And as with user-specific license templates, I usually try to make sure they are rewritten based on standard infobox templates, so at least field names are translated and metadata is machine-readable, like with this file. --Jarekt (talk) 18:00, 10 February 2015 (UTC)
Yes, I'm not saying we shouldn't convert them; however, some users are clearly attached to their custom templates. If they don't mind, then it's fine to convert the templates or add the markers, but if your efforts get reverted, I don't think there's much more we ought to do for those users. Guillaume (WMF) (talk) 19:26, 10 February 2015 (UTC)
The User:Aka revert is an issue with the COM:OWN concept, but you are right that it is not a priority. I am working on merging Template:Audio upload, Template:Položka namluveného článku and Template:Gesprochener Artikel with Template:Spoken article entry. --Jarekt (talk) 19:59, 10 February 2015 (UTC)
✓ Done converting files using about a dozen home-brewed templates to the {{Information}} template. In the process I rewrote Template:Spoken article to be a much more mainstream template for files with readings of Wikipedia articles. Please help by adding translations to Template:Spoken article/i18n, and someone with a better understanding of machine-readable metadata should verify that it works well. --Jarekt (talk) 13:54, 23 February 2015 (UTC)

And this template. I'm not sure it's okay to just replace the template with Template:Information; if it is, tell me and I will replace it. Amir (talk) 21:52, 24 February 2015 (UTC)

I am not sure either. User:Aka reverted earlier attempts to fix his files, because he does not like the color scheme used for all the other files on Commons. --Jarekt (talk) 14:54, 26 February 2015 (UTC)

And this template. Amir (talk) 21:55, 24 February 2015 (UTC)

I will look into that one. --Jarekt (talk) 14:54, 26 February 2015 (UTC)

Ok, so there are more {{Information}}-like templates that do not use {{Information}}:

  • There were 600 of them earlier on today. ;) --Jarekt (talk) 19:52, 27 February 2015 (UTC)

--Jarekt (talk) 17:52, 27 February 2015 (UTC)


I have raised this matter through the Commons OTRS mail queue. My email address has been reviewed against the source of the files and I have been directed here for resolution.

For the images listed at Category:Images_by_Rob_Lavinsky (there are over 50k files), I am requesting that someone undertake a task on my behalf.

Every image in that category currently directs to the wrong source due to a website relaunch performed recently, and those links direct to a password protected source. Source links should all go to somewhere on, not Many of these photos are currently directing to that need to be redirected systematically to but some direct to and need to be modified to go to

Additionally, we need to modify the descriptions slightly so they aren't exactly the same as the text on our website, because that is hurting us in terms of SEO. This can be done systematically based on information already uploaded, or if there is a better way (if I get a CSV of all of the images/info, I can fix the descriptions and send them back to be uploaded and overwritten)... What can I do to help? Thanks!

I guess we need to create a {{Rob_Lavinsky_source}} template that takes a numeric ID and displays the correct link, and then run a bot to change all files to use that template. I would also replace {{Images by Rob Lavinsky}} with {{Rob Lavinsky cc-by-sa}} and Category:Images by Rob Lavinsky. --Jarekt (discuter) 14:28, 22 January 2015 (UTC)
At the moment is not online while is. For example File:Acanthite-221197.jpg link to works just fine while File:Acanthite-Barite-ma74a.jpg 2 links to are broken. --Jarekt (discuter) 13:09, 26 January 2015 (UTC)
Thanks - the pieces still exist, but there have been multiple changes that occurred over the last ____ years. That acanthite exists, but the image is now at img originally handled the import of this data, but they did it in a way that significantly helps them while hurting our SEO, so we're trying to regain that. Is it easier to just completely delete all images in the category, and then reimport them from our site using our correct current links? Is it even possible to mass-delete? I think there are so many weird programming issues that happened in the past through cheap bandaids that it's hard to figure out the right steps to fix it without just restarting completely.
Deleting and reloading the images in order to fix the links back to and websites is not an option, since all the images would be removed from the Wikipedia articles. Dead links in the "source" field are something that happens all the time, and at least in this case the link is not necessary to verify the license. If there is some clear pattern, in the form of a list of files and what change needs to be made to each, then I think we can help, but otherwise I do not see what else can be done. --Jarekt (discuter) 12:48, 30 January 2015 (UTC)

Correcting language templates

Hi bots, there are many photos on Commons, such as File:Պուշկինի լեռնանցք 19.jpg, whose description contains a language template that is obviously incorrect. In this particular case the text contains only Armenian letters, so it cannot be English. Wouldn't it be great to correct such templates wherever none of the letters in the text fit the alphabet of the template's language? Thanks, --Arnd (discuter) 10:11, 23 January 2015 (UTC)

Due to the timeout, I am going to create my own bot for this task. I hope to have your support when I request it. --Arnd (discuter) 13:29, 1 February 2015 (UTC)
@Aschroet: I published at gerrit:183266 a rough Pywikibot script to help with these cleanup activities. It can use langdetect to choose the best match. --Ricordisamoa 17:05, 2 February 2015 (UTC)
@Ricordisamoa: thank you. I hope that I can use it at some point. By the way, since langdetect does not work for my example (Armenian), I looked around the Internet and found this. Besides some cleanup steps at the beginning, it contains an interesting approach, namely langdetect_by_chars, which detects languages (including Armenian) by their characters. Maybe it is worth including that in your script as well. --Arnd (discuter) 18:56, 4 February 2015 (UTC)
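To make the character-based idea concrete, here is a minimal sketch of a check that flags a language template when none of the letters in its text belong to the script expected for that language. It is not the gerrit script nor langdetect_by_chars; the `EXPECTED_SCRIPT` table and both function names are hypothetical, and the script of a letter is simply read from the first word of its Unicode character name.

```python
import unicodedata

# Hypothetical, deliberately tiny mapping from language code to the script
# expected in that language's text. A real bot would need a fuller table.
EXPECTED_SCRIPT = {"en": "LATIN", "hy": "ARMENIAN", "ru": "CYRILLIC", "el": "GREEK"}


def scripts_used(text):
    """Return the set of script names used by the letters in text.

    The script is approximated by the first word of the Unicode character
    name, e.g. "ARMENIAN CAPITAL LETTER PEH" -> "ARMENIAN".
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def template_language_impossible(lang, text):
    """True if text contains letters but none in the script expected for lang."""
    expected = EXPECTED_SCRIPT.get(lang)
    used = scripts_used(text)
    if expected is None or not used:
        # Unknown language or no letters at all: do not flag anything.
        return False
    return expected not in used
```

This only catches the clear-cut cases described above (for example an {{en}} template wrapping purely Armenian text); it cannot tell apart languages that share a script, which is where a statistical detector such as langdetect would still be needed.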

change all pages with {int:filedesc-header}

I sometimes see the wrong == {{int:filedesc-header}} == instead of == {{int:filedesc}} ==, like here [3].

I do not know how big a problem this is (how many pages are affected), but it is always shown as <filedesc-header> and is not auto-translated. Maybe this can be added to the work list of the various "cleanup bots" or tools? Holger1959 (discuter) 00:07, 29 January 2015 (UTC)

I have not seen it before and I do not know how to look for such pages. --Jarekt (discuter) 12:35, 30 January 2015 (UTC)
YaCBot is fixing such things. --Steinsplitter (discuter) 14:10, 30 January 2015 (UTC)
thank you, so @McZusatz: can help maybe? Holger1959 (discuter) 15:13, 30 January 2015 (UTC)
I think it is a long-running process until all these replacements have been performed. In my opinion it makes no sense to request a special treatment of certain files because it just rearranges the order of the renaming but not the final result where all files will be fine. So "all we can do is sit and wait" ;-) --Arnd (discuter) 15:57, 30 January 2015 (UTC)
{{int:filedesc-header}} produces "<filedesc-header>"; if there are any files that use this non-existent message, they should be fixed. I located 50 such files by searching for "filedesc-header". I hope that was all. --Jarekt (discuter) 18:14, 30 January 2015 (UTC)
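For anyone who wants to script this fix, the replacement itself is a one-line substitution. The sketch below is only an illustration of the text transformation, not what YaCBot actually runs; the function name is hypothetical, and fetching and saving pages is left out.

```python
import re

# Matches the non-existent message, tolerating spaces inside the braces.
BAD_HEADER = re.compile(r"\{\{\s*int:filedesc-header\s*\}\}")


def fix_filedesc_header(wikitext):
    """Replace {{int:filedesc-header}} with the existing {{int:filedesc}}."""
    return BAD_HEADER.sub("{{int:filedesc}}", wikitext)
```

Other messages such as {{int:license-header}} are left untouched, since only the exact name filedesc-header is matched.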
@Jarekt: great, many thanks! Holger1959 (discuter) 21:10, 30 January 2015 (UTC)

Change template


Could you change the template {{Mediagrant II|Foto českých obcí}} to {{Mediagrant II|Události}} in the files from Category:Akce Cihelna 2014, please?


--Juandev (discuter) 12:02, 30 January 2015 (UTC)

Why not use VisualFileChange.js? --Arnd (discuter) 12:18, 30 January 2015 (UTC)

Double extension

Hi, could someone with a bot help clean up this list: User:Dispenser/Double extension? These files should be renamed. There are many cases that can be handled by a bot:

  1. when the double extension is the same format (usually JPEG/JPG/jpg/jpeg), and,
  2. when the file name has a meaning.

I think the rest has to be done manually. Thanks in advance, Yann (discuter) 23:22, 31 January 2015 (UTC)

I'll code #1 tomorrow --Zhuyifei1999 (discuter) 13:19, 1 February 2015 (UTC)
✓ Done: Commons:Bots/Requests/YiFeiBot (23) --Zhuyifei1999 (discuter) 09:29, 2 February 2015 (UTC)
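As an illustration of case #1 from the request above (both extensions denote the same format), a new file name could be computed roughly as follows. This is a sketch under stated assumptions, not the code used by YiFeiBot: the alias table, the choice to keep the last extension, and the function name are all hypothetical, and condition #2 (a meaningful file name) still needs a human or a separate heuristic.

```python
import re

# Hypothetical alias table: extensions that denote the same file format.
# Only extensions listed here are considered, so names like "v1.0.0" are ignored.
FORMAT_OF = {
    "jpg": "jpeg", "jpeg": "jpeg",
    "tif": "tiff", "tiff": "tiff",
    "png": "png", "gif": "gif", "svg": "svg", "ogg": "ogg", "pdf": "pdf",
}

DOUBLE_EXT = re.compile(r"^(?P<stem>.+)\.(?P<first>[A-Za-z0-9]+)\.(?P<last>[A-Za-z0-9]+)$")


def collapse_double_extension(filename):
    """Return the name with one extension dropped, or None if not applicable.

    Applies only when the last two extensions are known and denote the same
    format (e.g. ".jpg.JPG"); the final extension is the one that is kept.
    """
    m = DOUBLE_EXT.match(filename)
    if not m:
        return None
    first = FORMAT_OF.get(m.group("first").lower())
    last = FORMAT_OF.get(m.group("last").lower())
    if first is None or first != last:
        return None
    return "{}.{}".format(m.group("stem"), m.group("last"))
```

Names with mixed formats, such as "Map.svg.png", return None on purpose: there the inner extension may carry meaning (a PNG rendering of an SVG), so they belong in the manual pile.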

Personality rights


Would it be possible to add Template:Personality rights to all the pictures included in the subcategories of Category:Rallies in support of the victims of the 2015 Charlie Hebdo shooting? Thank you. JJ Georges (discuter) 21:08, 1 February 2015 (UTC)

As this only makes sense if people are pictured, a visual check is needed. Hence, just do it using VisualFileChange. --Leyo 01:18, 2 March 2015 (UTC)

Uploads by LasPo rocks

User LasPo rocks has uploaded dozens if not hundreds of images from Flickr with an automated tool. Problem is, each of these uploads was only categorised with tag-type cats, hence spamming improper cats. See Special:Contributions/LasPo_rocks. Proposing all these be tagged with Check categories and/or added to a new maintenance cat "Uploads by LasPo rocks needing category review". --Pitke (discuter) 21:16, 2 February 2015 (UTC)

Addendum: between 300 and 400 files. --Pitke (discuter) 15:29, 4 February 2015 (UTC)
@Pitke: ✓ Done, see Category:Uploads by LasPo rocks needing category review --Steinsplitter (discuter) 15:58, 4 February 2015 (UTC)

The San Diego Museum of Art Collection

Hi, could someone with a bot upload The San Diego Museum of Art Collection? The license is -NC-ND, but most of these files are PD-Art. There are more than 1750 files, and it would take ages manually. I am ready to review them once they are uploaded. Regards, Yann (discuter) 16:01, 6 February 2015 (UTC)

@Yann: Would you create a check cat please? And is it just {{PD-Art}} (without license parameter)? --Zhuyifei1999 (discuter) 13:50, 7 February 2015 (UTC)
Thanks for your message. Please add Category:Files from the San Diego Museum of Art to be checked. {{PD-Art|PD-old-100}} should be OK for most of them. Regards, Yann (discuter) 23:01, 8 February 2015 (UTC)
 Ok, running --Zhuyifei1999 (discuter) 07:07, 10 February 2015 (UTC)

Diffuse Category:Saints by name to Category:Saints by name - A...Z


A hierarchy has been created (not by me) to unclog the 1350 subcats of Saints by name. Unfortunately, cat-a-lot does not allow moving subcategories.

Can a bot move categories to the appropriate subcats? Place Clichy 12:37, 9 February 2015 (UTC)

Pics with date 4501-01-01

Hello all, due to a bug (or feature) in Picasa, we have around 550 files on Commons that have this obviously wrong date in their metadata [4]. Could someone please mark them with {{Wrong date}} right after the date? --Arnd (discuter) 13:59, 15 February 2015 (UTC)

✓ Done: 510 being replaced. -- (discuter) 15:55, 23 February 2015 (UTC)
@: Could you please check your bot run? Category:Incorrect date contains fewer than 300 files, which contradicts the claim that 500 files have been changed. Also, some random examples showed me that lots of files have not been changed by your bot. Regards, --Arnd (discuter) 16:16, 23 February 2015 (UTC)
It's fine. They are being replaced, now at 326/510. Probably finish in about an hour. I only have slow bots as I don't have one that is not throttled (I've never asked for anything faster). -- (discuter) 16:27, 23 February 2015 (UTC)
Thanks. Maybe it is also worth marking all media with a date in the future... --Arnd (discuter) 19:27, 23 February 2015 (UTC)
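For reference, the text change requested in this thread (putting {{Wrong date}} right after the bogus date) can be sketched as below. This is not the code that was actually run; the function name is hypothetical, it assumes the template is used without parameters, and it assumes the date appears either bare or followed by a HH:MM:SS time.

```python
import re

# The bogus Picasa date, optionally followed by a time of day.
BAD_DATE = re.compile(r"(4501-01-01(?: \d{2}:\d{2}:\d{2})?)")
ALREADY_MARKED = re.compile(r"\{\{\s*wrong date", re.IGNORECASE)


def mark_wrong_date(wikitext):
    """Append {{Wrong date}} after each 4501-01-01 date, unless already marked."""
    if ALREADY_MARKED.search(wikitext):
        # Assumption: one marker per page is enough, so marked pages are skipped.
        return wikitext
    return BAD_DATE.sub(r"\1 {{Wrong date}}", wikitext)
```

Skipping pages that already carry the template keeps the function safe to run twice over the same file list.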

Report of OTRS verified images which are later deleted

After thinking about the case of Commons:Deletion requests/File:Bling Bling.jpg, which was an OTRS verified image, I realize that we have no system for offering feedback to OTRS volunteers when this happens. Could someone investigate how to produce a monthly report of images deleted against uploader, OTRS ticket and OTRS volunteer? I believe that as I do not have access to admin tools, this is impossible for me to create. Thanks -- (discuter) 15:38, 17 February 2015 (UTC)

Removing “none”

other versions = none is often present but is not very useful information. Is someone willing to start a clean-up task? --Leyo 01:04, 1 March 2015 (UTC)
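As a starting point for such a task, the substitution could look like the sketch below. It is only an illustration: the function name is hypothetical, it assumes the field sits on its own line as in the usual {{Information}} layout, and it empties the value rather than removing the parameter.

```python
import re

# Matches an "other versions" (or "other_versions") field whose whole value is "none".
NONE_FIELD = re.compile(
    r"^([ \t]*\|[ \t]*other[ _]versions[ \t]*=)[ \t]*none[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def blank_none_other_versions(wikitext):
    """Empty the 'other versions' field when its only content is 'none'."""
    return NONE_FIELD.sub(r"\1", wikitext)
```

Values that merely start with "none" (for example "none of the crops are uploaded yet") are left alone, because the pattern requires the line to end right after the word.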