Adding the Information template to files that don't have it

Hi again :) As part of the File metadata cleanup drive, I'm working to add the {{Information}} template to the ~700,000 files that don't have it, so that the information can be accessed easily. This is a complex undertaking, but there are small tasks we can take on to make incremental progress.

An easy group of files to start with are those like this one, whose description page basically consists of:

== {{int:filedesc}} ==

< Some description >

== {{int:license-header}} ==

{{Self| <some licence(s)> }}

< categories >

In this case, it's relatively easy to add the {{Information}} template:

  • add the information template under == {{int:filedesc}} ==
  • move the existing description to the Information template's Description field
  • add the name of the uploader as the author (since it's their own work)
  • add {{own}} as the source
  • add the date from EXIF data, if available, otherwise leave blank.
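The steps above can be sketched as a plain string transformation. This is a minimal sketch, not the bot's actual code: the function name `add_information`, the field layout, and the way the uploader name and EXIF date are passed in are assumptions made for illustration. A real bot would read those values from the upload log and the file metadata.

```python
import re

FILEDESC = "== {{int:filedesc}} =="
LICENSE_HEADER = "== {{int:license-header}} =="


def add_information(wikitext, uploader, exif_date=""):
    """Return the page text with {{Information}} added.

    Returns None when the page does not match the simple pattern, so that
    unusual pages are skipped rather than guessed at.
    """
    # Skip pages that already carry the template.
    if "{{information" in wikitext.lower():
        return None
    # Only treat the file as own work when a {{Self|...}} licence is present.
    if not re.search(r"\{\{\s*self\s*\|", wikitext, re.IGNORECASE):
        return None
    start = wikitext.find(FILEDESC)
    end = wikitext.find(LICENSE_HEADER)
    if start == -1 or end == -1 or end < start:
        return None
    # Everything between the two headings is the existing description.
    description = wikitext[start + len(FILEDESC):end].strip()
    if not description:
        return None
    info = (
        "{{Information\n"
        f"|Description={description}\n"
        f"|Date={exif_date}\n"  # left blank when no EXIF date is available
        "|Source={{own}}\n"
        f"|Author=[[User:{uploader}|{uploader}]]\n"
        "}}"
    )
    return wikitext[:start] + FILEDESC + "\n" + info + "\n\n" + wikitext[end:]
```

Returning None for anything that does not fit keeps the sketch conservative: pages with extra sections or a different licence template are left for a later, more specific rule.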

This will work for only a subset of the files missing the {{Information}} template, but we have to start somewhere :) (Pinging Multichill, MGA73, Amir and Keegan per previous discussions.) Guillaume (WMF) (discussion) 00:50, 2 December 2014 (UTC)

I will fix this. Amir (discussion) 12:05, 2 December 2014 (UTC)
I have been adding Category:Media missing infobox template and thinking about this issue. I was also trying to discuss it at VP, see here. I think we should use a divide-and-conquer approach. I would propose the following:
  1. Mark the files by adding them to Category:Media missing infobox template, which will allow everybody to see the files.
  2. Some files likely have an information template but with syntax errors; those I try to place in Category:Pages using Information template with parsing errors.
  3. I would propose to first give the original uploaders a chance to fix the files. We can do that by writing a standard message which, without any threat of deletion, asks for help with bringing those files up to current standards. We should have one message per uploader with a list of all the files that need infoboxes. Many of the images without infobox templates are from the early days of Commons and many of those people might not be around anymore. We should also advise them on the use of the VisualFileChange gadget or on requesting specific tasks to be done by bots at Commons:Bots/Work requests.
  4. Many files have all the info just not in the right form, for example File:Orchis militaris flowers.jpg or File:St Germain des Prés fenêtre.jpg. We might be able to recognize some patterns used and fill {{information}} based on that.
  5. Some images were moved from Wikipedia, like File:St michaelis.jpg, and have no information about the photographer. We would need to look the information up on EN-WP to find the name of the original uploader.
  6. Some images imply "own" work by the uploader, like File:MaisonHonfleur1.jpg or File:Pinus pinaster female.jpg, but do not actually say it. If the files have EXIF data and templates like {{PD-self}} or {{GFDL}}, {{self}}, I think it might be OK to fill the {{Information}} with {{own}} and the name of the first uploader and the EXIF date.
  7. Some files have some home-brewed infobox templates that are not maintained or recognized
  8. Many {{PD-old}} files should use {{Artwork}} instead of {{Information}}, for example File:Leonardo da Vinci Grotesque Heads.jpg.
  9. I do not know what to do with files like File:Ruins at Delfi.JPG. The user should have been advised that he needs to send the permission to OTRS, but 10 years ago, when he uploaded the image, OTRS mostly dealt with handling emails from the public, not permissions.
Once we deal with a lot of "easy" cases we can assess what is left. --Jarekt (discussion) 16:46, 2 December 2014 (UTC)
Thanks, Jarekt! That's a great plan. Should we discuss the details elsewhere or is here ok? Guillaume (WMF) (discussion) 18:35, 2 December 2014 (UTC)
I would just keep the discussion here. I was trying to have this discussion on VP and Commons talk:Structured data, but nobody wanted to talk about it, so this place seems better. By the way, the files in Category:Items with OTRS permission missing infobox template seem distinctive enough to warrant a separate category. --Jarekt (discussion) 18:58, 2 December 2014 (UTC)
My approach is to fix easy cases and evolve the script as we handle more complex cases. Amir (discussion) 19:49, 2 December 2014 (UTC)
Amir, I am slowly working on step #1, adding Category:Media missing infobox template and more specific subcategories; so far I am ~5% done. You can start with those files or have your own way of generating the list of files with no infoboxes. It should not be hard, as I added {{Infobox template tag}} to all infobox templates (other than {{Information}}), so any file that does not have {{Infobox template tag}} or {{Information}} is likely not to have an infobox. So maybe you want to tackle cases where author, source and possibly date and the description are present and unambiguous to a human reader (case #4); then you can develop regexp rules to detect them and place them in the correct fields. Some of those rules can be "borrowed" from toollabs:add-information. But the bot should skip unusual cases. Many of the uploads are by the same users, who might follow the same pattern, and we could process a few of the more prolific users with a custom set of rules. --Jarekt (discussion) 20:37, 2 December 2014 (UTC)
Thank you for your hints, I'll use them and probably work on case #4. Amir (discussion) 21:01, 2 December 2014 (UTC)

It would also be helpful to have toollabs:add-information fixed. Currently, it occasionally destroys section headers and other parts of the code. --Leyo 17:14, 2 December 2014 (UTC)

Agreed; I'll reach out to Magnus and follow up here. Guillaume (WMF) (discussion) 18:35, 2 December 2014 (UTC)

I finished the script that fixes cases that consist only of language templates (example 1, example 2). Is it okay to start with them? Amir (discussion) 09:01, 6 December 2014 (UTC)

That is good; however, are you going to be able to skip cases which are clearly not "own work", like File:FSO ok 1974r.jpg? Also, files with only language templates might have a date, author and source which are not the same. Do you attempt to recognize those? --Jarekt (discussion) 09:38, 6 December 2014 (UTC)
It skips when the template:Self is not used and if the language template consists of several lines (instead of one). Is that enough? Amir (discussion) 11:06, 6 December 2014 (UTC)
I think that limiting it to files using {{Self}} is enough. Could you also remove Category:Media missing infobox template, in case the file has it? (you might be doing it already). Thanks --Jarekt (discussion) 17:11, 6 December 2014 (UTC)
I don't remove self template. Should we remove it? Amir (discussion) 21:23, 6 December 2014 (UTC)
I am sorry, I forgot : and Category:Media missing infobox template did not show up. I meant to remove that category. --Jarekt (discussion) 03:30, 7 December 2014 (UTC)
Yes, It does remove them Amir (discussion) 08:46, 7 December 2014 (UTC)
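The skip rules agreed on in this exchange can be sketched as follows. This is only an illustration under assumptions: the names `is_safe_to_process` and `remove_tracking_category` are hypothetical, and the regular expression for a language template (a two- or three-letter code, optionally with a subtag) is a guess at what such a check might look like, not the rule Dexbot actually uses.

```python
import re

# {{Self|...}} must be present for the file to be treated as own work.
SELF_TEMPLATE = re.compile(r"\{\{\s*self\s*\|", re.IGNORECASE)
# A language template that opens and closes on the same line, e.g. {{en|...}}.
LANGUAGE_LINE = re.compile(r"^\{\{\s*[a-z]{2,3}(?:-[a-z]+)?\s*\|[^\n]*\}\}$")
TRACKING_CATEGORY = re.compile(
    r"\[\[\s*Category\s*:\s*Media missing infobox template\s*\]\]\n?",
    re.IGNORECASE,
)


def is_safe_to_process(description, wikitext):
    """True when {{Self}} is used and every description line is a
    single-line language template; anything else is skipped."""
    if not SELF_TEMPLATE.search(wikitext):
        return False
    lines = [line.strip() for line in description.splitlines() if line.strip()]
    return bool(lines) and all(LANGUAGE_LINE.match(line) for line in lines)


def remove_tracking_category(wikitext):
    """Drop Category:Media missing infobox template once the page is fixed."""
    return TRACKING_CATEGORY.sub("", wikitext)
```

A language template that spans several lines fails the check because neither its first nor its last line is a complete template on its own.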

Next step: One line long descriptions: Commons:Bots/Requests/Dexbot_5 Amir (discussion) 02:18, 25 December 2014 (UTC)

Btw I found that there might be some uploads with "self"-templates which are not "self" by the uploader because they were transferred from other wikis. See this one for an example. Some big uploaders (bots and users) were busy with file transfers in the early days and should at least be exempt when using this method to add information templates. Mvg, Basvb (discussion) 12:26, 25 December 2014 (UTC)

An example pattern

@Ladsgroup:@Guillaume (WMF):Last month under my "normal" account I went through and cleaned up a couple hundred file pages by hand looking for such patterns. Here's an easy(ish) test case for a bot to take on:

RHaworth has/had a bunch of old (2005/6ish) uploads that need formatting. They're pretty easy to do by hand, but even so there's still 61 files left that need completed; I did the other half by hand. The list is on this labs page. I can copy the file names over if need be. Keegan (WMF) (discussion) 21:16, 11 December 2014 (UTC)

@Keegan (WMF): on File:All_Saints,_Beeston_Regis.jpg why did you put User:RHaworth as the author, when the text was clear that it is actually User:Stavros1 (Mark Hobbs)? --99of9 (discussion) 23:22, 11 December 2014 (UTC)
@99of9: because I made a mistake there. I've fixed it, thanks for pointing it out :) Keegan (discussion) 03:54, 12 December 2014 (UTC)

Dutch wiktionary pattern

There are 130,000 pronunciation uploads from Wiktionary on Commons; a few thousand (my estimate would be around 7000-15000) don't have information templates. Most of these are uploaded with the same pattern. There are uploads by different uploaders with different patterns. In this edit I change the file description of a file by GerardM (whose uploads have similar patterns); the description was created by me (but could be generated based on the title). The words "eigen opname" (meaning: own recording) are added in slightly different formats; besides that there is not really a lot of info on the images. This one might be an easy one to add templates to and also a pretty big one. Mvg, Basvb (discussion) 00:13, 14 December 2014 (UTC)

In smaller numbers this holds for other languages as well (en, pt I've seen). Mvg, Basvb (discussion) 12:04, 14 December 2014 (UTC)
I will fix this for Dutch pronunciations by the weekend Amir (discussion) 02:23, 18 December 2014 (UTC)

9154 files of Dutch pronunciations didn't have the Information template. Now 8588 more files have it (so 566 still need to be fixed; I'll do that too). Amir (discussion) 05:37, 19 December 2014 (UTC)

Nice work, with those numbers we can work very well! Mvg, Basvb (discussion) 11:28, 19 December 2014 (UTC)


A list of books (with more than 100 files) which have no infobox template and could probably use some automated adding of the book-template. Most books have a few hundred pages and thus we are looking at a few hundred files per listed book. Basvb (discussion) 00:17, 14 December 2014 (UTC)

I can finish the Book categories. I have a system going that adds book templates with unique page numbers so you can page through the files. The only slow down is that I am creating book templates and often creator templates as I go. --Jarekt (discussion) 20:28, 19 December 2014 (UTC)
✓ Done That was a good find and since all the files already use Template:LA2-NSRW, it was also easy to fix. --Jarekt (discussion) 05:02, 18 December 2014 (UTC)

I can do those, since I have some script for adding page numbers so one can page through the book, but I would appreciate help with the book templates, like {{L'Odyssée}} or {{Nietzsche's Werke, III}}, since they are the most time consuming especially since I do not speak any of the languages. --Jarekt (discussion) 05:04, 14 December 2014 (UTC)

Thanks, I can do the templates in a few days. Mvg, Basvb (discussion) 12:01, 14 December 2014 (UTC)
Btw, is there any fix for the fact that all (or a lot) books with booktemplates end up in Category:Files with no machine-readable source and Category:Files with no machine-readable author? Basvb (discussion) 12:03, 14 December 2014 (UTC)
Template:L'Ile des Pingouins is done (I'll add them to the relevant lines of the books from here on). Basvb (discussion) 14:05, 14 December 2014 (UTC)
:) Thank you --Jarekt (discussion) 18:13, 14 December 2014 (UTC)
Thank you! I've also left a message to the French Wikisource community to see if they can help to create the book templates. Guillaume (WMF) (discussion) 18:21, 15 December 2014 (UTC)
@Guillaume (WMF): Thank you, although my French is limited exactly to the understanding of book covers. I have a question: is it possible to generate, from the data about files without infobox, which users have uploaded a lot of files (let's take over 100) or a lot of files in one category? This would help a lot in finding books like these and other patterns which can be fixed easily. Mvg, Basvb (discussion) 21:13, 15 December 2014 (UTC)
Basvb: Sorry for the delay on this; I'm still new to SQL queries, so it took me a little time. The list you asked for is now available and I'll set it up to be refreshed every day. You might want to download the file to your computer to avoid encoding errors if you open it in your browser. I see familiar names in the list, like MarcBot (used for many of the books discussed above) and G.dallorto (that you mentioned below), so I'm reasonably confident that it's what you're after. Let me know if I can do anything else! Guillaume (WMF) (discussion) 01:22, 17 December 2014 (UTC)
Guillaume (WMF): Thank you very much, that makes it much easier to search for the big fish, which will save the people who work on this file by file a lot of work. If we, for example, fix all files of uploaders with over 1000 uploads (without infobox), then we have the first 200k done. As for anything else, I did have another idea, depending on how hard it is: a good way to find similar uploads would be lists that are sortable by upload date, but I can stay busy with this uploader list for a while. Mvg, Basvb (discussion) 09:08, 17 December 2014 (UTC)
Basvb: I've made another query. You can now download a list of the files missing machine-readable metadata, grouped by user, and with the timestamp. Warning: this is a ~40 MB text file so some browsers may have issues with it. I suggest you download it to your computer and open it with a spreadsheet application, so you can reorder the content more easily. For example, I imagine that you could select all the files from a given user, and reorder them by upload date to see if there are patterns. The file isn't being updated for now, but I can set it up if you think it would be useful. Guillaume (WMF) (discussion) 19:08, 17 December 2014 (UTC)
Guillaume (WMF): Thank you, now I can get to the regexfixing. Update of the file is not really needed (until a big chunk is done). Mvg, Basvb (discussion) 20:31, 17 December 2014 (UTC)

When I saw Guillaume's message on WS, I came here to see what I could do to help. I haven't checked all these books, but for the first one I looked at (Category:Lettres de mon moulin), there is already a DjVu file of this same book edition: s:fr:Fichier:Daudet - Lettres de mon moulin.djvu... What is the usual procedure when we come across this kind of thing on Commons? Is it considered a duplicate? Thanks. --Ernest-Mtl (discussion) 02:23, 16 December 2014 (UTC)

The DJVU Lettres de mon moulin file is misplaced on Wikisource and has to be uploaded to Commons. There are a number of JPG book pages, such as Gustave Flaubert Category:Bouvard et Pécuchet, Category:L'Éducation sentimentale, Category:Madame Bovary, that are duplicates of DJVU files; Wikisource now uses the DJVU and the JPGs are no longer useful. --Wuyouyuan (discussion) 13:47, 16 December 2014 (UTC)
I do not think we have a policy on that but I am inclined to let the old files stay and the book template we use for them can be reused for the DjVu files, as I did with files in Category:Encyclopédie – Planches V1–9 (pages assemblées, DJVU). --Jarekt (discussion) 03:19, 16 December 2014 (UTC)
I just realized you are talking about the case of some files on Commons which are the same as files on French Wikisource. It is quite puzzling: why is French Wikisource hosting local files? Either way, we are not going to delete our copy just because one of the projects has a local copy, and we would still try to add metadata to our copy. --Jarekt (discussion) 03:27, 16 December 2014 (UTC)
The reason why I was asking the question is that we are actually moving all those files to Commons. Some files from the early days of the projects were simply uploaded to WS. Furthermore, the quality of the DjVu scans is a lot better than that of the JPGs in the case of Lettres de mon moulin, and I would personally find it a waste of space to keep individual JPGs of a book that can be accessed in DjVu, especially since the DjVu format here on Commons allows people to save individual pages as JPGs on their computer if they can't open a DjVu file... That's the reason why I was asking what the procedure was here, before doing something that would not have been considered correct. If someone is to spend time on these 200-some JPG files, it won't be me, as I consider these files useless duplicates. --Ernest-Mtl (discussion) 15:07, 16 December 2014 (UTC)
In such a case once the DjVu file is copied you should nominate the jpegs for deletion as poorer quality duplicates. --Jarekt (discussion) 15:16, 16 December 2014 (UTC)
Yes, except when the JPEG are not easily available. That was (is?) the case for the Encyclopédie files. Regards, Yann (discussion) 17:46, 16 December 2014 (UTC)

One thing that I noticed is that most of the files here are uploads by User:MarcBot for French Wikisource, and all (as far as I noticed) were unused and replaced by DjVu files. The book images are often incomplete, concentrating on the pages with "text" and skipping the title pages, tables of contents, etc. We usually do not remove files which are not identical duplicates, but in this case these are truly unusable files, since better versions uploaded later exist. --Jarekt (discussion) 04:51, 22 December 2014 (UTC)

Yes. The fact that most files are from MarcBot is because I looked at all the uploads from this bot. I don't know what's the best plan for the images. Mvg, Basvb (discussion) 13:21, 22 December 2014 (UTC)
Books are all done. --Jarekt (discussion) 14:41, 26 February 2015 (UTC)

Over 500 images by G.dallorto

There are over 500 images by G.dallorto without information templates which have some basic pattern, dates in exif-data and mainly have a self-license. Seems like a pattern which could be matched. Example edit: here (media missing information template cat should also be removed). Basvb (discussion) 00:30, 14 December 2014 (UTC)

My bot will fix this pattern and similar. I'm waiting for approval. Amir (discussion) 18:58, 14 December 2014 (UTC)
✓ Done Amir (discussion) 12:29, 16 December 2014 (UTC)
A lot of files in Category:Società Umanitaria (Milan) aren't done yet. Mvg, Basvb (discussion) 19:56, 16 December 2014 (UTC)
Their pattern was a little bit different; my bot fixes them too now and will finish them soon. Amir (discussion) 19:48, 17 December 2014 (UTC)
I see now that this user has images with a lot of different patterns (just heavily active), so processing all of those by bot isn't really suitable; they'll just be part of other bot batches where they fit. So let's close this request. Mvg, Basvb (discussion) 22:32, 17 December 2014 (UTC)

over 500 images by Lalupa

over 500 architectural shots with one sentence text description.[1] some with move table.[2] Slowking4Farmbrough's revenge 23:15, 23 December 2014 (UTC)

Thank you! Amir, do you think your bot could take care of those? Guillaume (WMF) (discussion) 20:59, 7 January 2015 (UTC)
This user has uploaded 4225 files, I'm investigating Amir (discussion) 14:04, 8 January 2015 (UTC)
I fixed about 400 cases, there are some more complex cases that need to be taken care of. Amir (discussion) 04:14, 12 January 2015 (UTC)

Continuing the work

We've made progress, but there are still many files missing machine-readable information.

Hello everyone,

A few days ago I looked at the numbers, and we've managed to fix more than a third of images on all wikis, and almost 100,000 on Commons! This is great :) We should continue the work to make further progress.

Commons is still the wiki with the most files missing machine-readable information (see image). Jarekt, Basvb, Slowking4, Amir: Do you think there are more groups of files with similarly-formatted file pages, or have we reached the point where it's time to expand the scope?

If we're done with the patterns we can find, it may be time to start filling the information template with less precise information. We could use a tracking category for the bot-fixed files, so we can check them later to improve the fields manually. In the meantime, the information will be available thanks to the bots, even if it's not perfect.

Here's how we could do it:

  • Take larger groups of files, like files with a == Summary == section but no info template, and put everything from that section into the "Description" field of the information template.
  • Otherwise, do the same for all content on a file page that isn't a license template or a category (this can obviously be refined).
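The second, broader fallback can be sketched like this. It is a rough illustration only: the short list of licence templates in `LICENSE`, the function name `fallback_information`, and the tracking category name are all hypothetical placeholders, and a real run would need a far more complete licence list plus escaping of `|` and `=` characters inside the description.

```python
import re

# Illustrative subset of licence templates; a real list would be much longer.
LICENSE = re.compile(
    r"\{\{\s*(?:self|GFDL|PD-self|PD-old|cc-[^|{}]*)\s*(?:\|[^{}]*)?\}\}",
    re.IGNORECASE,
)
CATEGORY = re.compile(r"\[\[\s*Category\s*:[^\[\]]*\]\]", re.IGNORECASE)
HEADING = re.compile(r"^==.*==[ \t]*$", re.MULTILINE)
# Hypothetical tracking category so the bot-filled pages can be reviewed later.
TRACKING = "[[Category:Files with bot-filled description to be checked]]"


def fallback_information(wikitext):
    """Put everything that is not a licence template, a category or a
    section heading into the Description field."""
    licenses = [m.group(0) for m in LICENSE.finditer(wikitext)]
    categories = [m.group(0) for m in CATEGORY.finditer(wikitext)]
    rest = HEADING.sub("", CATEGORY.sub("", LICENSE.sub("", wikitext)))
    description = "\n".join(
        line.strip() for line in rest.splitlines() if line.strip()
    )
    parts = [
        "== {{int:filedesc}} ==",
        "{{Information",
        "|Description=" + description,
        "|Date=",
        "|Source=",
        "|Author=",
        "}}",
        "",
        "== {{int:license-header}} ==",
        *licenses,
        "",
        *categories,
        TRACKING,
    ]
    return "\n".join(parts)
```

As noted in the proposal, author and source details end up inside Description with this approach; the tracking category is what makes the later manual pass possible.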

This approach isn't perfect, since some information like the Author and the Source may be put into the Description field. However, it's better for it to be shown in the Description field than not be shown at all, at least until the file is checked manually.

Do you think this could work? I'm open to other ideas, but I don't think we want to wait years before we fix every file manually only. Guillaume (WMF) (discussion) 23:50, 4 February 2015 (UTC)

There's still lots of stuff we could do with pattern matching: see the list of authors. All MarcBot's files are books and easily fixable. I simply had a busy time and my attention shifted. One thing we could do is ask all users with fewer than 10 images to check and add an infobox, as those will be the hardest to cover in automatic attempts. There are around 100k files from ca 50 uploaders with over 1000 files with missing info. Likely the 200k point is also still. The next 100k point is at 300 images or more. Fixing the files of 250-300 uploaders would divide the number in half, leaving us with around 200k images to fix. I think the files of these authors with 300+ images can largely be fixed in 10-30 minutes of work per uploader for, let's say, 70% (the other 30% is not easily fixable). That would reduce the number by 150k in around 100 man-hours (the first 75k is only 25 hours of that). After that, things will get more time-consuming: the next 100k images are in the 100-300 images/uploader range with around 600 uploaders (which would then come to about 200 man-hours for ca 75k more fixes). It all depends on how much time we and others would like to invest in this. Other approaches with high performance based on generic patterns outside of per-user uploads might prove valuable as well. I think going for quick and dirty approaches right now isn't the best idea. Is there really a hurry to fix all of this? Mvg, Basvb (discussion) 00:11, 5 February 2015 (UTC)
Hmm, my estimates about time and percentage we can easily fix might be a bit too optimistic. But I believe we should try and keep to aim for high quality. Maybe we just need some more battle troops. I'm btw still trying to figure out how I can force the regex to start matching at the start of the text of a file, this is holding me back from quite some replacements. Mvg, Basvb (discussion) 00:37, 5 February 2015 (UTC)
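The back-of-envelope figures in the comment above can be reproduced with a few lines of arithmetic. This is a sketch under assumptions: it reads the "10-30 minutes" as time per uploader and uses the 20-minute midpoint, and the helper name `estimate` is made up for illustration. The uploader and file counts are the rounded figures quoted in the discussion.

```python
def estimate(uploaders, files_covered, minutes_per_uploader, fixable_percent=70):
    """Return (hours of work, files fixed) for one group of uploaders."""
    hours = uploaders * minutes_per_uploader / 60
    fixed = files_covered * fixable_percent // 100
    return hours, fixed


groups = [
    ("uploaders with 300+ files", 300, 200_000),
    ("uploaders with 100-300 files", 600, 100_000),
]

for label, uploaders, files in groups:
    low, _ = estimate(uploaders, files, 10)
    mid, fixed = estimate(uploaders, files, 20)
    high, _ = estimate(uploaders, files, 30)
    print(f"{label}: about {fixed} files fixed in {low:.0f}-{high:.0f} hours "
          f"(midpoint {mid:.0f})")
```

Under these assumptions the first group comes to 140,000 files at a midpoint of 100 hours and the second to 70,000 files at a midpoint of 200 hours, which is close to the 150k and 75k figures quoted.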
i would say there is lots of work still for the bots, if you group just by file name, you will see the pattern. and best to do those first and then call on the brute force humans for the last 20%. maybe you would want an uploader list for suggested groupings for the bot operators. Slowking4Farmbrough's revenge 01:35, 5 February 2015 (UTC)
Fair enough :) Thank you for sharing your thoughts! Guillaume (WMF) (discussion) 17:05, 10 February 2015 (UTC)
Basvb: I wouldn't say there is a hurry, but there is a choice we can make between the two strategies:
  • slowly work our way through ~425,000 files on Commons, using small bot runs and manual fixing. We don't know how long it will take to fix all the files, and during that time the information can't be accessed by the Mobile apps, MediaViewer, etc.
  • Or, do a massive migration of the remaining ~425,000 files although the infobox matching will be imperfect. Then, we can fix them manually as well, and we don't know how long it will take to clean them up, but during that time we will be able to have some (imperfect) information in Mobile apps, MediaViewer, etc. rather than nothing.
I personally think the second option is best, but if the collective decision is the first one, I respect that.
Regarding your regex issue, have you tried using ^ or \A ("Start of string" anchor)? It might be what you're looking for. Guillaume (WMF) (discussion) 17:05, 10 February 2015 (UTC)
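The difference between the two anchors is easiest to see side by side. The snippet below uses Python's `re` module purely as an illustration; the sample page text is made up, and whether `\A` is available depends on the regex engine behind the tool being used.

```python
import re

page = "Old mill near the river\nAuthor: J. Doe\nDate: 2004"

# With re.MULTILINE, ^ matches at the start of every line.
every_line = re.findall(r"^\w+", page, flags=re.MULTILINE)

# \A matches only at the very start of the text, whatever flags are set.
start_only = re.findall(r"\A\w+", page, flags=re.MULTILINE)

# Without re.MULTILINE, ^ behaves like \A.
default_caret = re.findall(r"^\w+", page)

print(every_line)     # ['Old', 'Author', 'Date']
print(start_only)     # ['Old']
print(default_caret)  # ['Old']
```

So if a replacement tool applies a multiline flag by default, `\A` is the safer choice for "start of the file page", provided the engine supports it.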

In the meantime my bot is slowly adding files to Category:Media missing infobox template and its subcategories. I am probably more than 90% done, with ~560k files, so the final number should be less than 600k. I also remove files from those categories once someone adds an infobox template. I noticed some activity in other problem categories resulting from users adding {{Artwork}} and {{Information}} templates. I have not been spending much time on this effort, since I spent the last couple of months on the development of a Lua version of {{Other date}}. I can help with some regular-expression-based mass replacements using my AWB-based bot, but most of the time goes into finding the files with patterns. The replacements themselves can be done with VisualFileChange by anybody who knows regular expressions. I was working for a while on adding {{Book}} templates to old books, but I was relying on others to create {{Book}} templates. Another issue I ran into was people reverting my edits, which replaced home-brewed versions of {{Information}}-like templates with the real thing. Apparently we do not have a policy about {{Information}} and other infobox templates saying that they are preferable to no template or to home-brewed versions written for a single user. We might need to change that before we start asking others to add infobox templates. --Jarekt (discussion) 15:45, 5 February 2015 (UTC)

By the way, looking at File:File metadata cleanup drive impact Oct2014-Jan2015.svg and comparing it to Category:Media missing infobox template suggests that there are many files that do not use any of the standard infobox templates, but do have machine-readable information. It would be interesting to see what those are. --Jarekt (discussion) 15:51, 5 February 2015 (UTC)
Jarekt: The difference may be the files with description/author specified in their EXIF metadata, that are currently recognized and used by the CommonsMetadata extension. There is a change pending to not use such data any more, though, because it's so unreliable, so I expect our counts to converge once the change is deployed. Guillaume (WMF) (discussion) 17:11, 10 February 2015 (UTC)

I fixed some files of MarcBot, Stahlkocher, and Immanuel Giel. I will work on them more Amir (discussion) 14:55, 7 February 2015 (UTC)

Fixed Sailko and SPUI files. So about 10,300 more files now have machine-readable information. I will go through the list and fix as much as I can. Amir (discussion) 10:56, 8 February 2015 (UTC)
Thank you so much Amir! Guillaume (WMF) (discussion) 17:05, 10 February 2015 (UTC)

Template updates

In addition to adding templates to the files that never had any infobox template, there are also cases where a rare infobox template exists and needs to be either replaced by or merged with an equivalent existing template, like one from Commons:Infoboxes, or templates derived from them, like templates in Category:Infobox templates: based on Information template. Some examples can be seen below. Please help me find more such templates used by files in Category:Media missing infobox template and help with conversions or template mergers.

--Jarekt (discussion) 05:13, 9 February 2015 (UTC)

Thanks, Jarekt. For user-specific templates, we don't necessarily have to convert them to the official information templates. Since custom templates are discouraged, I personally don't consider them a priority; we don't have to support every possible custom template if people don't use the official infoboxes. Just adding the machine-readable markers to the custom templates would be more than enough, and that's even something we could leave to their authors. Guillaume (WMF) (discussion) 16:38, 10 February 2015 (UTC)
Just like home-brewed license templates, which we managed to retire some years back, user-specific infobox templates are hard to work with or edit, are often abandoned and forgotten by users who retired years ago, and are prone to someone breaking them on purpose or by accident without others noticing it for years. They also often lack internationalization of the field names and proper machine-readable fields. And as with user-specific license templates, I usually try to make sure they are rewritten based on standard infobox templates, so at least field names are translated and metadata is machine-readable, like with this file. --Jarekt (discussion) 18:00, 10 February 2015 (UTC)
Yes, I'm not saying we shouldn't convert them, however some users are clearly attached to their custom templates. If they don't mind, then it's fine to convert the templates or add the markers, but if your efforts get reverted, I don't think there's much more we ought to do for those users. Guillaume (WMF) (discussion) 19:26, 10 February 2015 (UTC)
User:Aka revert is an issue with COM:OWN concept, but you are right that it is not a priority. I am working on merging Template:Audio upload, Template:Položka namluveného článku and Template:Gesprochener Artikel with Template:Spoken article entry. --Jarekt (discussion) 19:59, 10 February 2015 (UTC)
✓ Done converting files using about a dozen home-brewed templates to the {{Information}} template. In the process I rewrote Template:Spoken article to be a much more mainstream template for files with readings of Wikipedia articles. Please help by adding translations to Template:Spoken article/i18n, and someone with a better understanding of machine-readable metadata should verify that it works well. --Jarekt (discussion) 13:54, 23 February 2015 (UTC)

And this template. I'm not sure it's okay to just replace the template with Template:Information, if it's so, tell me to replace it. Amir (discussion) 21:52, 24 February 2015 (UTC)

I am not sure either. User:Aka reverted earlier attempts to fix his files, because he does not like the color scheme used for all the other files on Commons. --Jarekt (discussion) 14:54, 26 February 2015 (UTC)

And this template Amir (discussion) 21:55, 24 February 2015 (UTC)

I will look into that one --Jarekt (discussion) 14:54, 26 February 2015 (UTC)

Ok so there are more {{Information}}-like templates that do not use {{Information}}:

  • There were 600 of them earlier on today. ;) --Jarekt (discussion) 19:52, 27 February 2015 (UTC)

--Jarekt (discussion) 17:52, 27 February 2015 (UTC)

More templates to process:

--Jarekt (discussion) 02:54, 9 March 2015 (UTC)

Personality rights


Would it be possible to add Template:Personality rights to all the pictures included in the subcategories of Category:Rallies in support of the victims of the 2015 Charlie Hebdo shooting? Thank you. JJ Georges (discussion) 21:08, 1 February 2015 (UTC)

As this only makes sense if people are pictured, a visual check is needed. Hence, just do it using VisualFileChange. --Leyo 01:18, 2 March 2015 (UTC)

The San Diego Museum of Art Collection

Hi, could someone with a bot upload The San Diego Museum of Art Collection? The license is -NC-ND, but most of these files are PD-Art. There are more than 1750 files, and it would take ages manually. I am ready to review them once they are uploaded. Regards, Yann (discussion) 16:01, 6 February 2015 (UTC)

@Yann: Would you create a check cat please? And is it just {{PD-Art}} (without license parameter)? --Zhuyifei1999 (discussion) 13:50, 7 February 2015 (UTC)
Thanks for your message. Please add Category:Files from the San Diego Museum of Art to be checked. {{PD-Art|PD-old-100}} should be OK for most of them. Regards, Yann (discussion) 23:01, 8 February 2015 (UTC)
Ok, running --Zhuyifei1999 (discussion) 07:07, 10 February 2015 (UTC)
Thanks Zhuyifei1999. I will take care of images of Indian art, but help would be great for others. Regards, Yann (discussion) 16:38, 8 March 2015 (UTC)

Removing “none”

"other versions = none" is often present in file descriptions, but it is not very useful information. Is someone willing to start a clean-up task? --Leyo 01:04, 1 March 2015 (UTC)

It is already listed under com:regex, so c:user:YaCBot will take care of those soon. --McZusatz (discussion) 04:01, 6 March 2015 (UTC)
What about such cases? --Leyo 10:47, 6 March 2015 (UTC)
1, 2, ... --McZusatz (discussion) 18:32, 7 March 2015 (UTC)
Well, OK. But it will take months until they are all cleaned up. Can't they be prioritized? --Leyo 19:41, 8 March 2015 (UTC)
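For illustration, the kind of replacement discussed in this thread could be sketched as follows. This is only a minimal sketch in Python: the pattern is an assumption written for this example and is not the actual entry listed at com:regex, and the function name is hypothetical. It blanks the field only when the value is exactly "none" and leaves any longer value untouched.

```python
import re

# Assumption: illustrative pattern only, NOT the real com:regex entry.
# Matches an "other versions" / "other_versions" field whose whole value is
# "none" (optionally followed by a period), up to the end of the line,
# the next parameter, or the end of the template.
NONE_VALUE = re.compile(
    r"(\|\s*other[ _]versions\s*=)[ \t]*none\.?[ \t]*(?=\r?\n|\||\}\})",
    re.IGNORECASE,
)


def blank_none_other_versions(wikitext: str) -> str:
    """Return wikitext with a bare 'none' removed from the other versions field."""
    # Keep the parameter name and '=' (group 1), drop only the value.
    return NONE_VALUE.sub(r"\1", wikitext)


if __name__ == "__main__":
    sample = "{{Information\n|description = A cat\n|other versions = none\n}}\n"
    print(blank_none_other_versions(sample))
```

The field itself is kept empty rather than deleted, so the rest of the template is not disturbed; values such as "none of the above" are not matched.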

NYPL images update (over 81550 files)

See Template talk:NYPL-image-DigitalID. In a nutshell, the website is migrating to a new site, and the old "DigitalId" can no longer be used for permalinks on the new site (only through search). So far, the following algorithm has been worked out for providing direct links to the new site:

  1. find all NYPL images with an old "DigitalId"
  2. get the migration link from
    these links are currently generated automatically; as an example, the old link is redirected to
  3. put the "struc_id"/"strucID" in the second parameter of {{NYPL-image-DigitalID}} (the template has been updated)
    this may be the second unnamed parameter or a parameter named "struc_id"
  4. purge the template cache
  5. check that pages exist for the new links provided by the template

(My "bot skills" are not enough for the job.) --Kaganer (discussion) 11:42, 16 March 2015 (UTC)

PS: An additional benefit of this work is that it would find all NYPL images with an incorrect DigitalId, so that they can be updated manually. --Kaganer (discussion) 11:56, 16 March 2015 (UTC)
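Steps 1 and 3 of the list above could be sketched roughly as follows. This is only a minimal sketch in Python, not a finished bot: the function name, the lookup table and the ID values are hypothetical, and the lookup is assumed to have been built beforehand from the migration links of step 2. Templates that already have a second parameter are left alone, and any DigitalId without a known "struc_id" is returned for manual checking, as mentioned in the PS.

```python
import re

# Matches {{NYPL-image-DigitalID|<id>}} with exactly one parameter, so
# templates that already carry a second parameter are not touched.
TEMPLATE = re.compile(
    r"\{\{\s*NYPL-image-DigitalID\s*\|\s*([^|{}]+?)\s*\}\}",
    re.IGNORECASE,
)


def add_struc_id(wikitext, lookup):
    """Add the struc_id as the second unnamed parameter.

    lookup is a hypothetical dict {DigitalId: struc_id}, assumed to be
    prepared from the migration links (step 2).
    Returns (new_wikitext, list_of_unresolved_DigitalIds).
    """
    unresolved = []

    def repl(match):
        digital_id = match.group(1)
        struc_id = lookup.get(digital_id)
        if struc_id is None:
            # No migration data: keep the template as is, report for manual review.
            unresolved.append(digital_id)
            return match.group(0)
        return "{{NYPL-image-DigitalID|%s|%s}}" % (digital_id, struc_id)

    return TEMPLATE.sub(repl, wikitext), unresolved


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    text = "Source: {{NYPL-image-DigitalID|1234567}}"
    print(add_struc_id(text, {"1234567": "7654321"}))
```

Purging the template cache and checking the resulting links (steps 4 and 5) are not covered by this sketch.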

If the NYPL intends to switch off support for the old format, is there a link to where this is explained? I am concerned that the link format suggested may not actually be permanent links either. -- (discussion) 11:59, 16 March 2015 (UTC)
See the yellow banner at the top of any page reached through old-format links (see this link as an example): "Digital Collections will replace Digital Gallery in March 2015." The old format is currently not supported on the new site; the new permalinks do not include "DigitalId" as part of the URL. I have no further information. --Kaganer (discussion) 12:05, 16 March 2015 (UTC)
Template talk:NYPL-image-DigitalID has been updated so that it provides two links: a permalink for the old site and a keyword search for the new site. After "struc_id" is added, these are automatically replaced by the new permalink (as a "migration link"); see example. --Kaganer (discussion) 12:09, 16 March 2015 (UTC)
Here's a bit of analysis which puts me off the suggestion of using the slightly complex way of presenting a URL with "strucID", which in the API is "RecordID". Neither of these provides a true permanent link; only the UUID does. Note that it is free to set up a login to use the API, which includes getting a token if you want to run a bot.
Potential IDs are:
  1. ImageID (local_image_id): 54671 (which is displayed in the new gallery as well as the old)
    The ImageID can be used to find the UUID or return an (apparently broken) link to the full MODS catalog entry by calling (login required)
  2. RecordID (local_hades): 118546
    Using the RecordID the API can automatically return the UUID, for example by calling (login required)
  3. UUID: 510d47d9-7c02-a3d9-e040-e00a18064a99
    From which a true permanent URL can be made:
I previously used the ImageID to uniquely identify NYPL maps in the filename (more than 10,000 files). There is no particular reason to change any of these identifiers, as they will all continue to work with the API and be referenced in the new gallery.
Under the new gallery system (which effectively hides the image behind a zoomify type viewer) there is no way to download the high resolution TIFF for the image. When I uploaded the NYPL maps at full resolution, this was by using the API to find high resolution TIFF links. For the example image given above, the links to a high resolution image are now missing from the API, which makes me believe the NYPL have decided to automatically levy (quite high) charges per image at high resolution.
I find this new secure lock-down of high resolution images by the NYPL sad. It should be possible to de-zoomify an image, but any volunteer who attempts to do so may be liable to prosecution for systematically by-passing their site "security". -- (discussion) 14:53, 16 March 2015 (UTC)