User talk:Fæ

From Wikimedia Commons, the free media repository
Notice If you want to see Python source code that supports some of my projects, go to Github and help yourself. The code is not written with reuse in mind... -- (talk) 15:57, 15 May 2018 (UTC)
Archives: 2017, 2018

Thank you!

Dear Fæ,

I just wanted to say ‘Thank you!’ for your incredible, tireless work, which saves millions of images, many of them very valuable, for the benefit of all of us and maybe even for the benefit of future generations.

Thank you very much! --Aristeas (talk) 14:58, 23 February 2018 (UTC)

Thanks for the feedback! -- (talk) 09:04, 20 March 2018 (UTC)

Category:Faebot analysed duplicates ready for review

Just a heads-up that the category has stopped repopulating for some reason. Kelly (talk) 19:34, 1 February 2018 (UTC)

@Kelly: Thanks, rerunning. It was more of a test routine rather than established maintenance, so it remains a manual instigation. I'm unsure how useful or interesting others find it. -- (talk) 13:13, 27 February 2018 (UTC)
I was visiting it periodically - I thought it was quite useful. Kelly (talk) 13:16, 27 February 2018 (UTC)
@Kelly: I have turned this into a crontab task for my desktop, rather than running it manually. I'm not putting this on labs, due to the very narrow scope it currently has. It is currently set to kick off just once a day, then run for up to 8 hours, depending on how long it takes to work through the backlog, which can take a while as it has no memory of the last run. If it appears to halt for 24 hours, or anything else unexpected happens, drop me a note. Thanks -- (talk) 09:12, 8 March 2018 (UTC)
Will do, thank you very much! Kelly (talk) 10:17, 8 March 2018 (UTC)

User:Fæ/Imagehash and LOC images

Fæ, any chance of running the bot against the LOC images recently uploaded? (Bain, Harris/Ewing, etc.) Just at first glance it looks as if we may have quite a few versions/copies of some of the images that could do with being linked together via "other version" and probably some that can be deleted as well. All the best - Kelly (talk) 22:21, 7 March 2018 (UTC)

It's a non-trivial thing to set up and adds hash info hidden in the page source text. Best done once all the current uploads are complete. I'll look at doing it, but in a month or two.
The current uploads should spot earlier upload matches with the same IDs or SHAs, and add them to the gallery. For example File:Keene Fitzpatrick - Princeton LCCN2014694395.tif was uploaded a few minutes ago, and includes a thumbnail link back to the jpeg version uploaded in 2012. -- (talk) 22:31, 7 March 2018 (UTC)
No rush, obviously. I've been doing some categorization on the newly uploaded photos, and it looks like a lot of matches are being spotted during the upload process. A few are not - for instance, File:Anita Stewart, Lallie Charles - Lallie Charles LCCN2014681265.jpg and File:Anitastewart 4.jpg. There are also some circumstances where the same photo has found its way here via a different source, such as the photos in Category:Hirata Tōsuke. Many thanks. Kelly (talk) 22:42, 7 March 2018 (UTC)
I'll examine these as part of looking at the set up. However, the 'sample space' for hashes ought to return no more than one or two hundred thousand files to test. At that size the hash generation will take several days. Because processing time increases as the square of the volume of the sample space when trying to pair up files using the hashes, I doubt that whole categories where LOC files might appear can be included.
Even a generalized url search (like loc.gov) would yield too many files (>600,000), so the 'space' will probably have to be limited to finding something like "loc.gov" plus a collection ID like "ggbain" (which at the moment returns 36,000 files). If the keywords are absent from a duplicate, and the duplicate is not identical by SHA1 value, my script will not pair it up as a duplicate.
Let's park for the moment. I may have better solutions to try when I get to it, and it might be worth writing up as another experiment. -- (talk) 09:39, 8 March 2018 (UTC)
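As a rough illustration of why the sample space matters, the sketch below counts pairwise comparisons for the two search sizes quoted above. It assumes a naive all-pairs comparison of hashes, which may not be how the actual script works; the function name is hypothetical.

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n files: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# File counts are the ones quoted in the discussion; the labels are descriptive only.
for label, n in [("loc.gov + ggbain search", 36_000), ("general loc.gov search", 600_000)]:
    print(f"{label}: {n:,} files -> {pair_count(n):,} pairwise comparisons")
```

Under this assumption, growing the sample from 36,000 to 600,000 files (about 17 times more files) multiplies the number of comparisons by roughly 278.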
Sounds good to me. Kelly (talk) 00:54, 13 March 2018 (UTC)

Analysis

Hash matching

Apart from drilling into categories, which in this case were added two weeks after the LOC upload, there are no identification codes that could automatically connect the alternates. Even the given names do not match sufficiently for automation. As the scan has defects and the LOC image has writing on the negative border, the image hashes may not be identical. This case might not be picked up by an image hash project that was limited to finding equal thumbnail hash values.

LOC code matching

Examining the history of the above example duplicate, the 2009 jpeg has the identical resolution to the full size TIFF uploaded this year. The LOC does not make available a jpeg version of the full size TIFF, but instead supplies two lower resolution jpeg versions. However, the EXIF data of the 2009 upload clearly shows that the jpeg was "officially" created by the LOC, so it was presumably originally made available by the LOC.

LOC uses the LCCN inconsistently for apparent historical reasons. This means that not all batch uploads use LCCNs and instead default back to the collection digital ID like ggbain.01270.

Based on LOC's summary, the LCCN is (almost always) formed of <year, 4 digits> + <serial number, 6 digits with leading zeros>. The LCCN uses the year the central record was made.
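A minimal sketch of the LCCN layout described above, assuming the plain 10-digit form (4-digit year followed by a 6-digit serial). The function name is hypothetical, and identifiers that do not fit the pattern (the "almost always" cases, or digital IDs like ggbain.01270) simply return None.

```python
import re

# Assumed layout from the summary above: <year, 4 digits><serial, 6 digits>.
LCCN_PATTERN = re.compile(r"^(\d{4})(\d{6})$")


def parse_lccn(lccn: str):
    """Split an LCCN into (year, serial), or return None if it does not fit."""
    match = LCCN_PATTERN.match(lccn.strip())
    if match is None:
        return None
    year, serial = match.groups()
    return int(year), int(serial)


# The LCCN from the Keene Fitzpatrick filename mentioned earlier on this page.
print(parse_lccn("2014694395"))
print(parse_lccn("ggbain.01270"))
```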

At the time of writing, projects in User:Fæ/LOC use the LCCN to search for existing files. Only when the MARC record is being used (i.e. no LCCN is available) does a "pnp" value, created from the MARC record, get used to search for existing matches on Commons. The pnp is deduced by searching the MARC record for a url matching "loc.pnp.*\d{4}.*"; this is probably always in field 856 (access method 4 - http). Searching this way avoids matching urls for the collection rather than the item. A technical explanation of MARC 856 is here.
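The url test can be sketched as follows. The regular expression is the one quoted above, unchanged. Everything else is an assumption for illustration: the input is taken to be a plain list of url strings already pulled from the MARC record, the sample urls are illustrative, and taking the last path segment as the pnp value is a guess at how the value is derived.

```python
import re

# The pattern quoted in the discussion, used as-is.
PNP_URL = re.compile(r"loc.pnp.*\d{4}.*")


def find_pnp(urls):
    """Return the trailing identifier of the first item-level url, or None.

    Collection-level urls carry no run of four digits, so they do not match.
    """
    for url in urls:
        if PNP_URL.search(url):
            return url.rstrip("/").rsplit("/", 1)[-1]
    return None


# Illustrative urls: a collection link followed by an item link.
sample = [
    "http://hdl.loc.gov/loc.pnp/pp.ggbain",
    "http://hdl.loc.gov/loc.pnp/ggbain.01270",
]
print(find_pnp(sample))
```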

However, the pnp value is always created, as it is used for both types of upload to display and link in the {{LOC-image}} credit template. The credit template is used this way for historical reasons, not because the LCCN is less good where it exists.

Housekeeping task

Rather than retrospectively changing the batch upload process to do a cross-image search with pnp, it is more convenient to stick to creating a specific housekeeping task to add to or create image page galleries to cross-link both old jpg files and newer uploads. It would not compare the images directly, only find matches by identification codes. The George Grantham Bain Collection is 55% uploaded, but as we have one example of a missing match, and the collection contains over 20,000 items/jpegs, it makes for a good test category.

The proposed housekeeping task does the following:

  1. Pull a list of all jpegs in a LOC collection category
  2. For each image, extract the pnp value from the image page
  3. Do a general Commons search for pnp value matches, like insource:/ggbain.01270/ insource:LOC
  4. Check whether the image pages cross-link to each other, and add in any additions
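Steps 3 and 4 can be sketched without touching the wiki at all. The query string follows the example in step 3; the cross-link check assumes the matches and the existing gallery links have already been fetched into plain Python structures. Both function names and the sample file titles are hypothetical, and this is not the code the task actually runs.

```python
def build_search_query(pnp: str) -> str:
    """Build a Commons search string in the form shown in step 3."""
    return f"insource:/{pnp}/ insource:LOC"


def missing_cross_links(matches, linked):
    """For each matched file, list the other matches its page does not link to yet.

    `matches` is a list of file titles sharing one pnp value.
    `linked` maps a file title to the set of titles its page already links to.
    """
    missing = {}
    for title in matches:
        already = linked.get(title, set())
        others = [m for m in matches if m != title and m not in already]
        if others:
            missing[title] = others
    return missing


print(build_search_query("ggbain.01270"))
# Hypothetical titles: the old jpeg links to the new TIFF, but not the reverse.
print(missing_cross_links(["Old.jpg", "New.tif"], {"Old.jpg": {"New.tif"}}))
```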

Some initial passive results, returning pnp matches to jpegs in PD-Bain where images are not already cross-linked and the 'seed' image may not be my upload. The first image of each set is the source, and there may be more than one match, which need not be in PD-Bain:

Sample highlights:

Now running. Initial changes:


There will be a few exceptions, such as Flanders-F4-1912.jpg, where there is no information template or no 'other_versions' included. To automate this would be a pain. However, these will be linked to in the galleries of the associated files, in this example "Flanders" monoplane LCCN2014697249.tif. A marginal case is Hattie the elephant, where there are 5 jpegs along with the TIFF; these variations are due to crops. Should any image be matched more than 7 times, this will be interpreted as a search problem and skipped.

A known "bug" is where a gallery exists and the linked file is presented using odd characters. These may be included twice in the gallery as searching for matches is made too complex. Example diff and Example from the Hattie the elephant variations with bad characters. As playing about with possible html encoding variations is ridiculously complex, these are being ignored as fringe cases.

Lastly, this runs slowly. I am not using multithreading for this task and it probably does not need it to be effective. I'll add a note about it on the LOC project page. As the Bain collection is still being uploaded, the housekeeping is officially starting with Popular Graphic Arts, Recent Changes list.

✓ Done -- (talk) 14:29, 16 March 2018 (UTC)

Reopened

This was mostly okay. However there are some unique image numbers which match to a collection rather than variations of the same image. An example is ppmsca.13709. As a result some of the generated galleries are naff and link to images which may only be loosely connected to the 'seed' image.

To get around this, I am testing out the matches with an additional pHash and dHash test. The intention will be to re-run and either create better galleries or replace galleries under the previous scheme. May take a day or two to finish testing. @Geagea: FYI.
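For readers unfamiliar with dHash, here is a minimal sketch of the idea. It assumes each image has already been reduced to an 8 row by 9 column grid of grayscale values (normally done with an imaging library, not shown), and the distance cut-off is a hypothetical value, not the one used in the housekeeping run. pHash works on frequency content instead and is not shown.

```python
def dhash_bits(grid):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `grid` is assumed to be rows of grayscale values, already shrunk to
    8 rows x 9 columns, which yields 8 * 8 = 64 bits.
    """
    return [1 if left > right else 0
            for row in grid
            for left, right in zip(row, row[1:])]


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical cut-off: pairs further apart than this are treated as different images.
MAX_DISTANCE = 10


def probably_same_image(grid_a, grid_b, max_distance=MAX_DISTANCE):
    """Compare two pre-shrunk grids by the Hamming distance of their dHashes."""
    return hamming_distance(dhash_bits(grid_a), dhash_bits(grid_b)) <= max_distance
```

Because small defects or border writing flip only a few bits, comparing by distance rather than by exact equality tolerates the kind of variation described under "Hash matching".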

Testing ✓ Done, housekeeping running. -- (talk) 16:11, 27 March 2018 (UTC)

Congratulations 🎉, again

I just saw that you had passed the six million edit mark. I think that probably almost half of the free files on Wikimedia Commons were uploaded by you. ;-) :-) --Donald Trung 『徵國單』 (Talk 💬) (WikiProject Numismatics 💴) (Articles 📚) 11:55, 8 March 2018 (UTC)

UK Parliamentary portraits

You'll no doubt recall importing MPs official portraits, last summer. The Lords' pics are now available:

https://pds.blog.parliament.uk/2017/07/21/mp-official-portraits-open-source-images/

(ignore date in URL; post has been updated today). Please can you do the same again? If so, please note the naming structure used in Category:Official United Kingdom Parliamentary photographs 2017. After Friday, I'll have some free time to add them to Wikipedia articles/ Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:53, 20 March 2018 (UTC)

I may get distracted, but hopefully the last script should be reusable, so it should be done in a day or two. I might even post it on github if I find the time to trim out dead code. -- (talk) 20:08, 20 March 2018 (UTC)
Upload underway. Here is a petscan report of the new files. The only wrinkle was handling TLS, which seems a new security feature since last year and rather gets in the way of openness. The source code is at github if anyone asks.
✓  Done -- (talk) 09:47, 23 March 2018 (UTC)
Thanks, Fæ, that's great. What's TLS? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:13, 24 March 2018 (UTC)
Transport Layer Security -- (talk) 17:42, 24 March 2018 (UTC)

A barnstar for you!

The Original Barnstar
Hello Fæ! I recently found the large and very important collections of artworks you've uploaded by George Cruikshank and Gustave Doré. I had to buy books of their works when I was studying illustration, satirical art, and life art in college, and what you've uploaded for free use is much more extensive and larger with more detail. As an artist, being able to download and examine these works is really priceless! I put Cruikshank on the high level of William Hogarth, and Doré really has no equal, especially in the beautiful, painstaking detail of his backgrounds. His London series alone is so full of detail I'll probably spend months on them. You uploaded some works by Hablot K. Browne too, the illustrator of several Dickens novels, and a very interesting and talented artist in his own right. I also downloaded the Hogarth and Bruegel collections. I'll be poring over these for years, and it's only a 5 gig collection all told. Thanks so much for these hugely valuable artworks. Anyone who says Commons is crap has to be an absolute philistine... Jenny 04:45, 11 April 2018 (UTC)
Thanks for the feedback! It's encouraging to read of how some of the images I upload are so useful for research, and enjoyed. :-) -- (talk) 09:14, 11 April 2018 (UTC)

Xeno-canto

Example of new Xeno-canto uploads in mp3, the Mallard

Hi Fae, I notice that back in 2013 you uploaded audio recordings of birds from the Xeno-canto website (https://www.xeno-canto.org/) to Commons (see the cat here and your notes here). I would like to encourage you to upload another batch.

I edit bird articles on English Wikipedia and try to add audio recordings. If I search the Xeno-canto site with the query 'lic:BY-SA' I obtain 5783 results from 1310 species. If I restrict the search to recordings of the highest quality using 'q:A' ("loud and clear") I obtain 2208 results from 759 species. At the moment Commons has 678 recordings and not all are high quality.

I've recently edited articles on antbirds (family Thamnophilidae - 236 species). If I search Xeno-canto with the query 'lic:BY-SA Thamnophilidae q:A' I obtain 121 results from 44 species. Commons has a single recording (here), and the quality is only "B". I notice that a large fraction of the Xeno-canto files with the BY-SA license were recorded by Niels Krabbe. Many thanks - Aa77zz (talk) 11:02, 1 April 2018 (UTC)

It's a nice project. Since that time Commons allows mp3s, so the process for uploading can be simplified. I'll look at revisiting the project and decide what the best format might be. It may take a few weeks to get around to it. -- (talk) 11:06, 1 April 2018 (UTC)
Many thanks for your reply. When you're done I'll begin work including each file in the corresponding article. There are more than 10,000 species of bird so this is a long term project. - Aa77zz (talk) 17:21, 1 April 2018 (UTC)

(Dearchived) @Aa77zz: Uploads now running. You can list the new files using Petscan.

I have overwritten the source code with the new version. Published at https://github.com/faebug/batchuploads/blob/master/batchXenocanto.py ✓ Done -- (talk) 13:30, 11 April 2018 (UTC)

Many thanks for all your good work. I'm travelling at the moment but when I get back to editing I'll start adding audio files to the articles. - Aa77zz (talk) 09:05, 12 April 2018 (UTC)

Photographs by Albert Percy Godber

Dear Fae

I noted some high-quality photographs by Albert Percy Godber and would like to add some additional ones to an article about the Piha Tramway. Are you coincidentally interested in doing a batch upload of the 2415 photographs in the Alexander Turnbull Library? --NearEMPTiness (talk) 08:36, 22 April 2018 (UTC)

I'll take a look, today or tomorrow. -- (talk) 15:54, 22 April 2018 (UTC)
Thanks, I just noted that this link might be helpful. --NearEMPTiness (talk) 06:07, 23 April 2018 (UTC)
Based on a look via a tablet, it may be important to take a route that gets all the high resolution IIP images. The download link gives a significantly poorer jpeg version; which I think is what has been uploaded already in some cases. I may be able to put this together later today so long as the natlib website stays up. I did start getting down-for-maintenance messages earlier.
As there are a couple of stages to this, I'm running it as a project at User:Fæ/Project_list/natlibnz. If you have other comments, I suggest keeping them here as the project page is already fairly technical. -- (talk) 10:49, 23 April 2018 (UTC)
Thank you very much, indeed. I am positively impressed by what you have uploaded. --NearEMPTiness (talk) 07:18, 25 April 2018 (UTC)
@NearEMPTiness: The uploads are being overwhelmed with "internal server errors". It may be that after 300 full size images downloaded, this has triggered an anti-bot process. I will be away from my desktop for a few days, so will let it rest and look again next week. I see #308 was oddly corrupted, I may look at that again once the other problem is solved. Thanks -- (talk) 08:07, 26 April 2018 (UTC)
Adding #307 to be checked. -- (talk) 08:22, 26 April 2018 (UTC)
Thanks for the update. I have already added categories to most of the photos and embedded some into the relevant articles. Some of them are really interesting and useful, indeed. --NearEMPTiness (talk) 09:09, 26 April 2018 (UTC)

Update on #308. This was caused by downloading giving up half way and moving on to the next image, possibly down to a manual restart. The result was half of one image and half of the next. -- (talk) 13:49, 1 May 2018 (UTC)

The project has been added to the GLAM dashboard, reports are indexed here. -- (talk) 11:03, 3 May 2018 (UTC)

Thank you very much for your efforts so far. Now I have four questions:
  • Do you know of a tool with which I can check which of the uploaded files have no manually added category?
No, but User:Fæ/Project list/natlibnz/improvement is updated daily. -- (talk) 12:29, 8 May 2018 (UTC)
  • Do you know of a tool with which I can check which of the uploaded files have a red category?
No, but I think they get added to a maintenance category, which can then be queried using petscan. -- (talk) 12:29, 8 May 2018 (UTC)
  • Do you have a tool to automatically remove the black frames, particularly the very large ones?
I suggest using the standard croptool and try the magic border locator. It is possible to auto-crop if the type of border is reliably very similar, however it's probably not worth creating such a thing for 2,000 files. -- (talk) 12:29, 8 May 2018 (UTC)
  • If cropping/rotating is done manually, is it better to save the cropped version over the same file or as a new file? --NearEMPTiness (talk) 12:06, 8 May 2018 (UTC)
If the crop is obvious, I would overwrite the original. -- (talk) 12:29, 8 May 2018 (UTC)

✓ Done Uploads now completed, with 2014 photographs counted. -- (talk) 12:29, 8 May 2018 (UTC)

Thank you very much, indeed.--NearEMPTiness (talk) 14:03, 8 May 2018 (UTC)

Library of Congress maps

CIA map of Congo-Zaire boundary, 1972

Somebody just posted this zoomable detailed birds-eye view of Liverpool onto my twitter feed, which I thought was rather nice. (Nicer and nicer, the more you zoom in, in fact):

https://www.loc.gov/resource/g5754l.ct007678/

I think quite a lot of LoC maps have been uploaded over the years; and I see you've uploaded a lot of their prints and photographs.

Not that you're probably looking for a project, but would it be good to upload the whole lot of the LoC's online maps (about 27,000 https://www.loc.gov/maps/ ), less those we have already?

They look like they have a certain amount of cataloguing exposed on the website, that might go quite a way towards categorisation. Jheald (talk) 19:55, 28 April 2018 (UTC)

Thanks. I'll look into it, maybe Tuesday. It is a bit constraining to try on a mobile. -- (talk) 20:08, 28 April 2018 (UTC)
Started looking at this. I think it's a separate project from the LOC prints project due to the nature of the new website and iterative items, i.e. not just zoomable maps but atlases as books with images in a gallery. LOC seems to have no collection ID but relies on subject terms, but this needs some analysis on the JSON output to work out the best approach.
No especial hurry, so this is something I'll poke at in slow time, probably creating a project page when I start getting in to it. -- (talk) 10:02, 2 May 2018 (UTC)

I have got as far as the initial query loop and parsing. There are 27,199 items with 17,928 being single images. I'll probably set up the single images for uploads as they can sit in a bucket category. The other items will probably all need auto-categorization, which will need much more effort to make "non-controversial".

Now running, project page will be at User:Fæ/LOC maps. All uploads via the iiif service will be selected as PNG. Getting 503 errors at the moment (technical difficulties), this may or may not be transient. -- (talk) 20:48, 8 May 2018 (UTC)

@Jheald: Is there a visual mapping tool that can take the Commons image as a zoomable overlay while you pin it to a current map? I know I can do this with OSM, but it's too fiddly to be practical as a way to find the coords of an old map. -- (talk) 07:41, 15 May 2018 (UTC)

Yes there is, the Wikimaps Warper.
See eg File:Portsmouth point OS25 inch to mile1858.PNG for a map where it has been done -- follow the link in the button at the bottom, which is provided by {{map}}; or File:Map of Africa from Encyclopaedia Britannica 1890.jpg for a map which has been uploaded to the warper, but not yet warped.
It's a bit clunky (or was, the last time I looked at it a couple of years ago), because first one has to install a copy of the map image in the map warper's own private filespace.
Also note that it's a slow old process adding the coordinates to a large number of maps -- of 50,000 maps found in the BL 19th century book images on Flickr, it's taken three years to georeference 60% (progress page), and that's using the BL's georeferencer which is a bit slicker. Plus even with the coordinates it can still be tricky to identify what the map is actually of, especially for things like railways, rivers, battlefields, or pages out of larger map-books (see Category:MC_upload_prep_pages for UK state of play).
But I do think it is something well worth linking up, and encouraging people to do. Jheald (talk) 08:27, 15 May 2018 (UTC)
I recall playing with it a while back, but it looks little different now. Unfortunately clunky, and the OAuth login for Commons fails to work for me today in Chrome, even though my OSM account does. In theory Google Earth has a neat interface for zooming/rotating overlays (and I assume that is still available), but my old macmini is a bit slow for it. -- (talk) 09:23, 15 May 2018 (UTC)

Size

Alexandra 1875, uploaded at "large" image size

Hi Fæ, I found a number of interesting scans in Category:Ship plans of the Royal Museums Greenwich you have uploaded. The filesizes and therefore quality of the scans differ dramatically from some 50 MB down to less than 200 KB. Is that a result of your uploading process, which I assume to be fully automated, or is that something the individual is responsible for, who first put those scans online? In other words: can anything be done to get higher quality versions of the low quality files for Wikipedia? Alexpl (talk) 09:34, 5 May 2018 (UTC)

Normally the largest possible resolution is uploaded from archives. There may be reasons to have both older scans as well as newer ones from the same archive. If you have specific collections in mind it may be worth writing to the museum curators to ask whether higher resolution scans could be published for a Wikipedia project. If the issues are technical, it would be worth providing me the specific example to take a second look at.
In the case of the Library of Congress, we have been uploading from that source for so long, that we sometimes have digital scans that the library has upgraded in the years since first upload.
By the way, the RMG was a 2017 project, you can read the project page at User:Fæ/Project list/RMG. -- (talk) 10:30, 5 May 2018 (UTC)
I find it hard to keep track of new projects and usually get surprised by finding stuff like this. :) I drop in to do some categorizing work from time to time, that's about it. The images of HMS Alexandra [1][2][3] seemed nice at first for use in an article, but I can't even read the text on those tiny scans. Alexpl (talk) 10:54, 5 May 2018 (UTC)
Refreshing my memory, I took a deeper look at the second example. The true source image is http://collections.rmg.co.uk/mediaLib/3295/media-3295854/large.jpg and the version uploaded is the largest the RMG has made available to the public. It's a bit sad, as the original artefact is six foot wide, so the RMG's archives must have a very large version of this photograph.
I have put out a tweet https://twitter.com/Faewik/status/992722889198723072 to ask for free stuff, though I doubt they will bend their policy of asking for money for better quality images. -- (talk) 11:10, 5 May 2018 (UTC)
Ok, I will keep an eye open. And, btw, thank you for all the time and effort you have put in doing all the uploads. Alexpl (talk) 11:34, 5 May 2018 (UTC)

File:V.W. Peters, letter, 1935.10.7, Songdo, Korea, to Father, Rosemead, California, USA (Peters 351007 letter~1).jpg

Commons:Deletion requests/File:V.W. Peters, letter, 1935.10.7, Songdo, Korea, to Father, Rosemead, California, USA (Peters 351007 letter~1).jpg 184.100.58.172 07:40, 10 May 2018 (UTC)

File:Bird-life; a guide to the study of our common birds (1898) (14726756076).jpg

This image looks great! I tried poking around Flickr to find some person to discuss the editing of these images with; maybe it was just a crop of the cu version, though.... Apparently, there is no guarantee of a person there as there is none of that here, either. So, this is a ping of sorts.--RaboKarbakian (talk) 13:30, 10 May 2018 (UTC)

Well, I am still a person. If you want to improve the quality of the image consider saving as a separate file. -- (talk) 08:00, 12 May 2018 (UTC)

File:ТУ4-224х, СССР, Удмуртская АССР, станция Камбарка (Trainpix 175349).jpg

Commons:Deletion requests/File:ТУ4-224х, СССР, Удмуртская АССР, станция Камбарка (Trainpix 175349).jpg Kleeblatt187 (talk) 09:51, 12 May 2018 (UTC)


@Kleeblatt187: Please avoid blitzing my talk page with notices when you could easily create one group DR for all these. I am on mobile access only, per the notice at the top of the page, and there is no way I am able to reply to, or read, so many in one day. -- (talk) 09:58, 12 May 2018 (UTC)
I'm sorry, I have already stopped a few minutes ago when I saw the huge list on your talk page. To be honest I wasn't aware of this side effect until then. I was not sure whether a group deletion request was appropriate as the reason is different in most cases (original photographer and source, where the scan is from), also I wasn't sure whether everything by С. Чекалкин is a copyvio, that's why I started to check everything manually. It seems to be a few files in Category:Photographs from trainpix.org, but definitely not all of them. I will not resume for the next few days. Regards, --Kleeblatt187 (talk) 10:20, 12 May 2018 (UTC)

Una and the red cross knight, and other tales from Spenser's Faery Queene (1905)

Hello. Back in 2015 you imported many images from this work, found on Flickr. But not all of them, as far as I can tell. Do you want (or would it be better) to import this image and delete this one that I uploaded today? Thanks ~ DanielTom (talk) 23:03, 13 May 2018 (UTC)

The IA is the better source. If Faebot cannot uncrop, then you can do it manually by zooming in on the page on IA and download locally when at maximum, probably five times larger. -- (talk) 03:23, 14 May 2018 (UTC)
✓  Done -- (talk) 20:26, 14 May 2018 (UTC)

Photo of the doors of Milan duomo

Dear Luca,

Just want to inform you that I used your beautiful photo in my video/ vlog. You can watch in YouTube at GkJmdx6yrAo Thank you for sharing the photo.

Best Regards S. Toha Stef Toha (talk) 09:14, 14 May 2018 (UTC)

Bain

If you have time, when you upload, don't forget to add a category for the person to the Wikimedia images and link them to Wikidata. I just added categories to about 6 images. See: Ernst_Scholz at Wikidata and Category:Ernst_Scholz to see how they are linked. It will ensure that the image is found in the future. RAN (talk) 18:15, 16 May 2018 (UTC)

Yes, I know how it works. Nobody has come up with a solution for automating matching LoC bio data with Wikidata in a reliable way. As there would be something like 10,000 records to handle, it remains a non-starter. BTW, the Bain collection has been fully uploaded, so there is none that we are waiting on. -- (talk) 18:17, 16 May 2018 (UTC)
I have been doing one at a time. Are you going in through Flickr Commons which has more identifying information and a more reliable date? The current tranche of images is from 1922, with occasional misfiled ones. Either way, you have done great work. I will work on categorizing them with the biographical information from Flickr Commons. RAN (talk) 18:23, 16 May 2018 (UTC)
That's an interesting example. I have not read of the LoC using information fed back from Flickr. There's lots of things I could potentially do with Flickr, but it would have to be worth the time doing the programming. -- (talk) 18:35, 16 May 2018 (UTC)

Overwrite

Please do not overwrite original files like File:Hiddingsel, Ortsansicht -- 2014 -- 9237 -- Ausschnitt.jpg. Upscaling is not an improvement. If necessary I'll improve the image from the RAW data. Thank you. --XRay talk 18:41, 16 May 2018 (UTC)

@XRay: Sure. Out of interest, do you know why the crop lost resolution? I understand why a crop might be small if the parent was upgraded, but in a case like this one, it's unclear why the crop is smaller than it needs to be. -- (talk) 18:52, 16 May 2018 (UTC)
The crop is made from the RAW data, not from the JPEG data. If I take a crop from an image, there may be other improvements too, especially if the crop is made later. Improvements may include perspective correction, and this may cause a minor loss of resolution. I'm using Adobe Lightroom and I've made the crop as a virtual copy from the original file. --XRay talk 19:06, 16 May 2018 (UTC)
Thanks. The routine I have for looking at larger re-crops handles perspective correction, so long as it's a normal homographic transform (quadrilateral linearly transforming to a rectangle). It would be possible to look at other elements, such as light level adjustments from the original, however there are just as many cases where the default auto-crop is actually better than whatever the original crop uploader has done.
Anyway, the current process is reliant on a manual review, so I just have to be more cautious about approving overwrites where the size improvement was not that much and the crop appears to have digital enhancement.
The project page says a fair amount, but as well as a side by side comparison, I see the following data:
61882    289 × 405    Hidilyn Diaz poses with President Rodrigo Duterte (cropped).jpg 
       2,048 × 1,365  Hidilyn Diaz poses with President Rodrigo Duterte.jpg
     Half size precheck 
     Processing took 0.4 seconds
     Processing took 16.7 seconds
     64% larger in resolution
        289 × 405 (0.714) → 475 × 666 (0.713) 
     https://commons.wikimedia.org/wiki/File%3AHidilyn_Diaz_poses_with_President_Rodrigo_Duterte_%28cropped%29.jpg 
     Overwrite? (rtn / n) >>
So, the visual check, resolution increase and relative proportions are normally sufficient to make a good decision in more than 99% of cases. -- (talk) 19:22, 16 May 2018 (UTC)
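The figures in the sample output can be reproduced with a few lines of arithmetic. This sketch assumes "64% larger in resolution" refers to the linear (width) gain and that the bracketed values are width divided by height; the function name is hypothetical and this is not the review script itself.

```python
def upscale_report(old_size, new_size):
    """Return (percent gain in width, old aspect ratio, new aspect ratio)."""
    (old_w, old_h), (new_w, new_h) = old_size, new_size
    gain_percent = round((new_w / old_w - 1) * 100)
    return gain_percent, round(old_w / old_h, 3), round(new_w / new_h, 3)


# Sizes taken from the sample output above.
gain, old_ratio, new_ratio = upscale_report((289, 405), (475, 666))
print(f"{gain}% larger in resolution")
print(f"289 × 405 ({old_ratio}) → 475 × 666 ({new_ratio})")
```

Near-identical ratios before and after suggest the new crop covers the same region, which is the "relative proportions" part of the check.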
I think overwriting without asking the author is never a good choice. Anyway, I'll check the image within the next few days. The difference in resolution compared with your crop is too large. Maybe I should check the resolution of the crop from the RAW data. So there may be a happy end for the crop and your overwriting. --XRay talk 19:36, 16 May 2018 (UTC)
Keep in mind that it's actually rare for an extracted image to be created by the photographer. In fact the vast majority of images where upscaling has been identified (themselves less than 1% of the category for derived images) are transfers from Wikipedia, mostly several years old. I'm still unsure why that is, or why the majority can at least be doubled in resolution. I'll think about whether to add uploader names to the data that gets presented, perhaps just showing where they match. -- (talk) 19:41, 16 May 2018 (UTC)
Maybe people with inadequate graphics knowledge or software, making crops from screenshots or thumbnails instead of the actual image files?—Odysseus1479 (talk) 07:58, 17 May 2018 (UTC)
Possible, it just seems an odd thing to do to me. Here's an example upscaled moments ago, where Castagna appears to have taken the Flickr original and cropped out a watermark before using the uploadwizard, but in the process made the image a fraction of what is possible from the Commons hosted parent. It could be that in these specific cases, folks go to Flickr and download the default size, rather than seeking out the "original" or just sticking to the Commons parent. -- (talk) 08:12, 17 May 2018 (UTC)
I've uploaded my photograph with the full resolution (of the crop), and it looks like there was a mistake before. The former photograph shouldn't have had the lower resolution. So: thanks for raising the issue. --XRay talk 15:19, 17 May 2018 (UTC)

File:Short face. Pastel by W. Langdon Kihn. Wellcome V0018123.jpg

Commons:Deletion requests/File:Short face. Pastel by W. Langdon Kihn. Wellcome V0018123.jpg Storkk (talk) 19:28, 18 May 2018 (UTC)

File:Misty (Unsplash).jpg

Commons:Deletion requests/File:Misty (Unsplash).jpg Pi.1415926535 (talk) 03:05, 19 May 2018 (UTC)

File:Santa Monica, United States (Unsplash YrsIR805qjg).jpg

Commons:Deletion requests/File:Santa Monica, United States (Unsplash YrsIR805qjg).jpg Pi.1415926535 (talk) 03:11, 19 May 2018 (UTC)

Name Removal

Please remove the operator's name, B Lena, from these three images, pages, and file names. This is for his security and that of the personnel on this mission. MSRT DAS-1

Thank you.


120619-N-GN377-023.jpg 120628-N-GN377-001.jpg 120623-N-GN377-133.jpg Electricsurge81 (talk) 01:55, 20 May 2018 (UTC)

File:The Forum (1886) (14801589043).jpg

Hi, the file states 1886 but the book viewer states 1919. Is there a specific reason for that? Since you were the uploader, you might be able to answer my question. Thank you for your time. :-) Lotje (talk) 05:49, 20 May 2018 (UTC)