Commons talk:WikiProject Postcards/Archive/2020


Start of the project

Great work Stefan! Well Done!


It is a great idea to use the Locator Tool, so that everybody knows where the view is from. I tried to familiarize myself with the Locator Tool, but it was not an easy job. If you have some more easy-to-digest information, please send it to me/us. Or insert a locator into one of my postcards and I will try to do the same with all the others.

Do you think it would help if there was a Template:Postcard?


The English of your directions needs a bit of editing. If you intend to do it, this is fine. If you want me to do some editing on it, I will do it with much pleasure!


Keep up the Good Work! Many thanks for your effort. Actia Nicopolis (talk) 20:42, 18 January 2020 (UTC) (Nikos)

Thanks for your work, Stefan. I will try to apply your suggested standards when uploading other postcards. Culex (talk) 21:50, 20 January 2020 (UTC)

Need Categories

@William Ellison: In the last few days I have added many images (over 1000) to the Category:Postcard. I found these postcards outside of the Category:Postcard with a search for images that contain the German word "Ansichtskarte", "Postkarte" or "Ansichtspostkarte" and have no postcard category. About 5000 more images, found with the German keywords "Ansichtskarte" or "Postkarte" alone, are still waiting.

Why? My idea is to assign new categories to these postcards via a script. I will scan these images with a new bot and read the structured data, for example "Dresden", and then add "Category:Postcards of Dresden" to the image. This will only work automatically by location, not by subject. I still have to program and test this script. For now, please let the postcards stay in the Category:Postcards. If you want to help, you can add structured data or set the better category manually. --sk (talk) 04:09, 22 April 2020 (UTC)
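The location-to-category step described above could be sketched roughly as follows. This is only an illustration, not the actual bot: the function names are hypothetical, and reading the structured data from Commons is left out, so the locations are assumed to arrive as plain strings.

```python
def location_category(location):
    """Build the location-based category name for one postcard."""
    name = " ".join(location.split())  # collapse stray whitespace
    if not name:
        raise ValueError("location must not be empty")
    return f"Category:Postcards of {name}"


def suggest_categories(locations):
    """Return one category per distinct location, keeping first-seen order."""
    seen = set()
    result = []
    for location in locations:
        category = location_category(location)
        if category not in seen:
            seen.add(category)
            result.append(category)
    return result
```

A human would still need to check that the suggested category exists and fits the image before it is saved.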

Uploading to Commons

When you upload a file, among other items, Commons requires that you provide the name of the AUTHOR for copyright reasons and that you provide CATEGORIES for organisational reasons.

Author

For postcards the situation is a bit ambiguous. The photographic image is the copyright of the photographer; the published postcard is the copyright of the publisher. On postcards we often have:

  1. The name of both the photographer and the publisher. In this case what do we give as AUTHOR?
  2. Only the name of the publisher, who is clearly not the photographer. Are we to suppose that the photographer has sold his copyright to the publisher, or should the photographer be noted as {{anonymous}}? For example, there are hundreds of postcards with LL (Léroy & Lévy).
  3. The majority of old postcards (circa 1900) seem to have been sold by the photographer, who was also the publisher, so no problem.
  4. Many postcards have neither the name of the photographer nor that of the publisher, so in this case the author is clearly {{anonymous}}, but how should such anonymous postcards be categorised? (At the moment I am working on the Category:Postcards of Gironde and a companion Category:Postcard publishers in Gironde, and I created a special Category:Postcards published by Anonymous (Gironde) for the files with no information, with a logo I have not been able to identify, or with an unreadable photographer/publisher name, but I am by no means sure that it is the best choice.)

When known, should a postcard file be attributed to the category of the photographer and that of the publisher?

The licence to be attributed to the postcard: It seems to me that the section on the project page should be expanded, especially for postcards published after 1925 and when the copyright holder is anonymous.

Category

A postcard file will be in many categories, depending upon the author, the publisher, the date, the subject of the image, etc.

As far as this project is concerned I think that we should be very clear about the organisation of the category hierarchy.

The goal of a category hierarchy is not just to provide a database for computers to work with, but to provide a clear and logical pathway that will enable a human to find a specific postcard file that she is looking for.

In my opinion, the Category:Postcards should only contain very general organisational categories like:

  • Postcards by location
  • Postcards by publisher
  • Postcards by photographer
  • Postcards by date
  • Postcards by subject
  • Postcards by type

and probably a few other general categories.

The hotch-potch of categories and files now in this category needs to be cleaned up. (@Stefan Kühn, @Leyo: Sorry for the confusion I made with my naïve changes, which have since been largely corrected, I hope.)

Going down to the second and third levels of the category hierarchy involves much the same type of questioning, which gets more and more detailed as we approach the actual postcard files. However, this post is only intended to start the discussion.

Best wishes William Ellison (talk) 07:45, 23 April 2020 (UTC)

@William Ellison: Please, not so many questions at once. With such a long text we cannot discuss everything well at the same time. Also, please do not use a hierarchy in the topics, so that we can discuss them better. I have made some top-level headings to better organise the discussion. -- sk (talk) 07:41, 24 April 2020 (UTC)

Who is the author of a postcard?

IMHO: We cannot clearly say who the author is if both the publisher and the photographer are named on the postcard. So we write both as author! Like on a book with several authors. -- sk (talk) 07:41, 24 April 2020 (UTC)

Licence guide for postcards

Yes, this is also a big wish of mine. But I think this is a big minefield. Maybe we start a table with a row for every country. Then we have clear cases: "author is known", "photograph is known", "author is unknown". -- sk (talk) 07:41, 24 April 2020 (UTC)

Category

I am working hard to clean up the Category:Postcards. At the beginning it was a big chaos. Please feel free to make new, better subcategories. Maybe we can start, like in Commons:WikiProject Aviation, with some categories such as Category:Postcards with unidentified publishers, Category:Postcards files (check needed), Category:People associated with postcards. -- sk (talk) 07:41, 24 April 2020 (UTC)

I have created a new Category:Unidentified logos on postcards. -- sk (talk) 15:52, 28 April 2020 (UTC)

Optical character recognition, automatic processing.

Many postcards have a short text with the location of the photograph, the editor's name and sometimes a logo. There are not many different logos, and the text very often uses the same vocabulary: street name, city name, etc. Given that there are many postcards, some of them already uploaded on Commons, I wonder if there would be any interest in scanning postcards with "optical character recognition" software. It would help with uploading images in bulk, and could also add extra metadata to already uploaded images. Of course, the result should be reviewed by a human, but a lot of time would be gained. There are Python software libraries which can be of help. Some research work has even been done on the handwritten messages of postcards: https://www.aclweb.org/anthology/L18-1038.pdf ! Rc1959 (talk) 07:20, 26 April 2020 (UTC)

@Rc1959: Thanks for this paper. - For me OCR is one big step for the future. Therefore I say, please upload both sides of a postcard. I have tried en:Tesseract_(software) myself under Linux with good results, but only for printed text, not for handwritten text. - Maybe this is a good project for a student or similar. We can also ask the authors of this paper for help. -- sk (talk) 15:56, 28 April 2020 (UTC)
@Sk: Indeed, there is a Python module called pytesseract, based on the same software:

# file_path is the path to a scanned postcard image
from PIL import Image
import pytesseract

pil_image = Image.open(file_path)
ocr_text = pytesseract.image_to_string(pil_image, lang='eng')

I used 'eng', without creating any trained data yet. But on top of language-specific training, it could be worth training it with words strongly related to the domain: "Straße, Platz, rue, avenue" plus location names, and possibly existing street names. This could maybe enhance the OCR quality. I was also thinking about checking the status of already uploaded cards, but indeed, this is a project in itself. Rc1959 (talk) 16:49, 28 April 2020 (UTC)
@Rc1959: I think the first step is to fill the Category:Backs of postcards. At the moment many images do not have this category. Then we have to define a correct way to link both images together (front side and address side). And then we can work with OCR. -- sk (talk) 18:45, 28 April 2020 (UTC)
@Rc1959: Today I read this article about Google Lens. We should also try this technology for OCR of handwritten text. -- sk (talk) 15:39, 8 May 2020 (UTC)
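As a rough illustration of the domain-vocabulary idea raised above: training Tesseract itself is not shown here. Instead, this sketch swaps in a simpler technique, correcting the OCR output afterwards by fuzzy-matching each token against a word list with Python's standard difflib module. The vocabulary, the function names and the 0.8 cutoff are assumptions chosen for the example, and the matching is case-sensitive.

```python
import difflib

# Hypothetical domain vocabulary; a real list would hold street and place names.
VOCABULARY = ["Straße", "Platz", "rue", "avenue", "Dresden", "Nanterre"]


def correct_token(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace an OCR token with its closest vocabulary word, if close enough."""
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(ocr_text, vocabulary=VOCABULARY, cutoff=0.8):
    """Apply correct_token to every whitespace-separated token."""
    return " ".join(correct_token(t, vocabulary, cutoff) for t in ocr_text.split())
```

For example, correct_text("avenuc de Dresdcn") would give "avenue de Dresden", while words with no close match are left unchanged. The corrected text should still be reviewed by a human.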

Detection of duplicates

The same postcard might be uploaded several times. More exactly, several copies of the same postcard may be scanned by several people. For example, look at these two links; one of these postcards has a stamp: https://fr.wikipedia.org/wiki/Route_des_Fusill%C3%A9s-de-la-R%C3%A9sistance#/media/Fichier:Nanterre.Route_de_Charles_X.jpg https://www.cpa-bastille91.com/wp-content/uploads/2012/10/92-Nanterre-route-de-CharlesX-582x388.jpg It might be useful to detect these duplicates, because they do not carry exactly the same information (stamp, etc.). Rc1959 (talk) 07:04, 28 April 2020 (UTC)

If you detect duplicates on Commons, you can mark one in the other picture's description as an other version. If they are exactly identical, then we can delete the one with the lower resolution. But if the postcard went through the post (with stamps, handwriting, ...), then the two are never exact duplicates. Maybe on the image side, but not on the address side. -- sk (talk) 11:19, 1 May 2020 (UTC)
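Since two scans of the same card are rarely identical byte for byte, one common approach to finding such near-duplicates is a perceptual hash. The following is a minimal sketch of an "average hash", not a method anyone in this discussion has used. It assumes each image has already been reduced to a small grayscale grid (for example 8x8 values from 0 to 255) by some image library. The function names and the 10 % threshold are hypothetical.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def probably_same_card(pixels_a, pixels_b, max_fraction=0.1):
    """Guess whether two grids show the same card (assumed threshold: 10 % of bits)."""
    hash_a = average_hash(pixels_a)
    hash_b = average_hash(pixels_b)
    return hamming_distance(hash_a, hash_b) <= max_fraction * len(hash_a)
```

A match found this way is only a candidate: a stamp or handwriting on one copy changes the hash slightly, so a human should confirm before files are linked as other versions.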

Old postcards

IMHO we can change all categories with "Old", like "Old postcards of ...", into "Postcards of ..."! A postcard is old from the moment the picture for it is taken. So we never have a good boundary to say this is an old postcard and this is a new one. - At the moment we have 405 categories with "Old postcards" and 20544 with "Postcards of". - What do you think? -- sk (talk) 11:28, 1 May 2020 (UTC)

Support, user:Stefan Kühn, unless philatelists have a definition for "old". Compare the header in category:Old maps--Estopedist1 (talk) 05:12, 2 May 2020 (UTC)
@Stefan Kühn: Yes, I agree with that. To me, it seems that the term "old postcard" implies a period and, by extension, a printing process. It seems difficult to define a clear boundary. Nevertheless, would it be interesting to create categories by printing process? Like collotype, rotogravure, offset? --Jjjiijjj (talk) 10:00, 21 May 2020 (UTC)Jjjiiijjj

We also have 20x Category:Historical postcards .... --sk (talk) 06:25, 21 May 2020 (UTC)
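If the rename were agreed, the new category names could be derived mechanically. The sketch below is only an assumption about how that might look: the function name is hypothetical, it handles just the "Old" and "Historical" prefixes mentioned in this thread, and it does not move any category on Commons.

```python
import re

# Matches the start of names like "Category:Old postcards of ..." (assumed pattern).
_PREFIX = re.compile(r"^Category:(?:Old|Historical) postcards\b", re.IGNORECASE)


def normalized_category(name):
    """Return the category name without the 'Old' or 'Historical' qualifier."""
    return _PREFIX.sub("Category:Postcards", name, count=1)
```

Names that do not start with one of these prefixes are returned unchanged.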

Leather

Is this material leather?

If yes, then we can use Category:Leather postcards, but I am not sure. -- sk (talk) 17:27, 6 May 2020 (UTC)

@Stefan Kühn: I think you are right, but to be sure you can ask the postcards' holder, Cornell University Library--Estopedist1 (talk) 05:09, 7 May 2020 (UTC)

Translation

I think we should invest time to translate the project page.

Before I start I will ask for ideas for a better structure of the page. Or is it all ok? -- sk (talk) 18:37, 25 May 2020 (UTC)

Postcards up for deletion

Hi there. Can someone from this project please take a look at Commons:Deletion requests/File:Binghamton Court House on old post cards.jpg and share their thoughts on keeping or deletion? I know sometimes US postcards fall under the "no known copyright" realm, for example. Thank you. Missvain (talk) 23:09, 3 December 2020 (UTC)