User talk:Dominic/2020

From Wikimedia Commons, the free media repository

File is slightly broken at the bottom. Can't find the original. Can you have a look and correct this? Multichill (talk) 15:26, 15 February 2020 (UTC)

@Multichill: The original TIFF is also uploaded and linked from the image... Carl Lindberg (talk) 16:00, 15 February 2020 (UTC)
File:History of Olney Presbyterian Church Gastonia North Carolina 1793-1947 - DPLA - 7518d84185f6eb6552d060c4bfad190b (page 40).jpg has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

pandakekok9 09:21, 25 March 2020 (UTC)


Notification about possible deletion

Some contents have been listed at Commons:Deletion requests so that the community can discuss whether they should be kept or not. We would appreciate it if you could go to voice your opinion about this at their entry.

If you created these pages, please note that the fact that they have been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with them, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Bobamnertiopsis (talk) 23:30, 12 April 2020 (UTC)

uploading blank pages

Hi, your bot recently uploaded hundreds of completely blank pages, many of them duplicates - see Special:ListDuplicatedFiles. I'm currently busy overwriting them with smaller versions - e.g. File:Counties of Clay and Owen, Indiana- Historical and biographical - DPLA - 410eee5fac529aeeb5b0253ce735e207 (page 610).jpg. Can you improve your bot so that, ideally, it does not upload completely blank pages, or instead uses a white 1x1 square? Thx. --JuTa 09:46, 7 May 2020 (UTC)

@JuTa: Why are you overwriting files like this? These are pages that are blank in the original work, and so they were digitized that way. This is not a bot error, and you can verify at the linked source. Dominic (talk) 20:54, 7 May 2020 (UTC)
Well, I can delete them as duplicates if you prefer that. --JuTa 23:00, 7 May 2020 (UTC)
Blank pages are normally fine -- if it's a scan of a page of a book, well that is what the book had and it's just part of the scan. We don't want to go pulling out blank pages from .djvu files, etc. On the other hand... that one you mention was pure white; not really a scan, was it? I guess that is why they got marked as duplicates, since they effectively are. On the other hand... those compress rather well. There is not much point in making a smaller overwrite -- it would be better to revert those, since they will kind of ruin looking at book pages in sequence. I would imagine that blank pages are just as much a backing for Wikisource (to show it was blank in the original). However... I do wonder about the pure white images. Those do seem odd. And having them in the duplicates categories is probably the main issue. Carl Lindberg (talk) 00:42, 8 May 2020 (UTC)
My guess was that the ones that look pure white are that way because they were scanned in black-and-white mode with any edges cropped out. It's hard to know exactly, but I would not think they are artificially inserting white pages using the same file, rather than scanning. Dominic (talk) 13:16, 8 May 2020 (UTC)
I'm sure they are scanning normally, but the resulting image will be basically identical in that case, and apparently the duplicate-image-checking stuff is finding them. Carl Lindberg (talk) 14:35, 8 May 2020 (UTC)
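
To illustrate the point above about identical scans being caught by the duplicate check, here is a minimal Python sketch. It assumes the check groups files by a hash of their bytes (SHA-1 is used here as an assumption); the function name `group_duplicates` and the sample file names and contents are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_duplicates(files):
    """Group file names whose byte content has the same SHA-1 digest.

    `files` maps a file name to its raw bytes. Only groups with more
    than one member are returned, each sorted by name.
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha1(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical stand-ins: two "pure white" pages with identical bytes
# and one page with different content.
pages = {
    "page 610.jpg": b"\xff" * 1024,
    "page 612.jpg": b"\xff" * 1024,
    "page 611.jpg": b"scanned text page",
}
duplicates = group_duplicates(pages)
```

Under that assumption, two separately scanned blank pages only count as duplicates when their files are byte-for-byte identical, which fits the pure white images described in this thread.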

Notification about possible deletion

Some contents have been listed at Commons:Deletion requests so that the community can discuss whether they should be kept or not. We would appreciate it if you could go to voice your opinion about this at their entry.

If you created these pages, please note that the fact that they have been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with them, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Affected:

Binksternet (talk) 23:18, 10 August 2020 (UTC)
File:Grand Rounds Scenic Byway - Rabbit On Bell Sculpture in Sculpture Garden - NARA - 7718710.jpg has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Dumelow (talk) 04:46, 13 August 2020 (UTC)

File:TENNESSEE-NEAR NASHVILLE - NARA - 543979.jpg has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

77.10.206.231 07:40, 29 September 2020 (UTC)

File:TENNESSEE-NEAR NASHVILLE - NARA - 543979.tif has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

77.10.206.231 07:41, 29 September 2020 (UTC)

Notification about possible deletion

Some contents have been listed at Commons:Deletion requests so that the community can discuss whether they should be kept or not. We would appreciate it if you could go to voice your opinion about this at their entry.

If you created these pages, please note that the fact that they have been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with them, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Affected:


Yours sincerely, – BMacZero (🗩) 15:53, 8 October 2020 (UTC)

File:President Nixon meets with China's Communist Party Leader, Mao Tse- Tung, 02-29-1972 - NARA - 194759.tif has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

芄蘭 (talk) 14:24, 15 October 2020 (UTC)

global IP block exemption

Hello. User:US National Archives bot appears inactive. Is the global IP block exemption still needed? --Krd 12:12, 27 October 2020 (UTC)

Your recent bot uploads

Hi, I noticed that your bot is uploading a lot of files with incorrect licenses - e.g. File:Secretary Mel Martinez in Miami, Florida - DPLA - 868ccfb02c71d174afcf93f783e12409.jpg. It was definitely not created before 1925, as the current license requires. Please stop your current upload and fix this for what are likely thousands of images.

Btw: You are creating a lot of duplicates as well - see Special:ListDuplicatedFiles. Please try to reduce this. regards --JuTa 11:24, 26 November 2020 (UTC)

(tpw) I've amended the one example, though as this can be fixed by using USGov rather than US, a mass fix seems easy enough. It may be better to test for the source link in addition to a date test. There may be more specific options for the license.
Duplicates might be related to an API issue. If you are still running the uploads it's worth a proper review to ensure you understand the root cause and why this problem did not happen with past uploads. -- (talk) 11:39, 26 November 2020 (UTC)
PS: If I don't see a reaction or a stop of the bot's uploads, I will likely block the bot this afternoon. regards --JuTa 11:42, 26 November 2020 (UTC)
I have now blocked your bot User:DPLA bot. The licenses of past uploads and your upload procedure need to be fixed before continuing mass uploads. Give me a ping if you require an unblock. --JuTa 14:23, 26 November 2020 (UTC)
Hi JuTa, it is morning here in my time zone, and I've just seen your messages. I am unclear what the issue is with using {{PD-US}} when an institution specifically states that the rights statement is "No Copyright - United States". Is there a Commons policy that states that an upload must have a certain level of granularity in the PD tag it uses? This is the first I am hearing of this demand.
We are applying {{PD-US}} for all records described with No Copyright - United States by the source, because this is the technically correct mapping of that rights statement to a Commons copyright license. We are not using {{PD-US-expired}}, so what you said about the tag we are using requiring the material be published before 1925 is incorrect. PD-US simply states that the work is in the public domain, not that it is necessarily due to the date of publication, and we are applying this based on the determination of the source institution. We are using this general license, so it seems you have blocked the bot on a misunderstanding of the license template in question.
The more specific PD-USGov tag you are suggesting is based on your reading of the image's metadata as a human, but there is no way a bot could determine that was correct based on the data provided, and if I were to apply that license carte blanche to all uploads, others would then rightly complain that some were incorrect, and that is why the general license is being applied. Dominic (talk) 14:49, 26 November 2020 (UTC)
Well, {{PD-US}} is generally used for pre-1925 works from the US, which is not the case for recent governmental works. Can you check the past uploads, correct the licenses to {{PD-USGov}}, and modify your upload process to reflect this pls. --JuTa 14:54, 26 November 2020 (UTC)
This is not what the template either states or means. I will quote from the documentation: This is the most general "public domain in the US" license tag. If at all possible, please use a more specific tag, such as {{PD-US-expired}} for US publication before 1925. Your suggestion that using this completely valid template is wrong because I am using it for the general purpose for which it is intended seems completely unwarranted. I've stated why I am not using {{PD-USGov}}—there is not a machine-readable way to determine if the work is created by the US federal government, and so I am using the general tag, which is valid and allowed under Commons policy, in order to avoid errors. I do not understand under what basis you are choosing to require I use a copyright license with a level of granularity that I would not be able to automate, when I am instead using a license valid for all cases. Dominic (talk) 15:04, 26 November 2020 (UTC)
@JuTa: PD-US is used when it is not known the precise reason for U.S. public domain status, just that it is PD -- it could be before 1925, or it could be published without notice, or could be not renewed. It does not imply that it was published before 1925, though that is one of the possible reasons, as plainly stated on the tag. It sounds like these particular records may not be PD-USGov so PD-US is the more appropriate license for a bot. If the specific reason is apparent from reading the information, then change to that, but it's not possible for a bot. Carl Lindberg (talk) 15:05, 26 November 2020 (UTC)
Hmm, even the linked Commons:Hirtle chart does not tell us anything about governmental works, but just about what expires when. There are very recent governmental images for which one cannot find any suitable information about why the images are in the PD. But you are starting to convince me.
Anyhow, there are 2 other issues I would like to raise:
  • The really high number of duplicates should be significantly reduced.
  • The categorization of all your uploads is very bad, just 2 standard cats for the source of the images, but no cats for what is depicted. How do you plan to fix that?
regards --JuTa 15:14, 26 November 2020 (UTC)
I would just like to point out that the bot was operating completely within the scope of what it was originally approved to do, and within Commons policy. If you have preferences about how its uploads should be formatted, I would be happy to discuss them, but I do not think initiating blocks over non-emergency stylistic preferences is the way to go about that—especially when you are now bringing up things that were not even part of the original basis for the block. Is the categorization basic? Yes, and I hope to improve on that with future development, but it appears you are now insisting on maintaining an emergency block on the bot for doing what it was approved to do in the way it was approved to do it.
Let me just also point out that, from everything I have seen when we discussed previously, the duplicates amount to a complete non-issue. It appears you dislike the practice of including pages which are blank in the original source document, and were digitized that way. But this is common enough practice on Commons, and I don't know on what basis (especially what policy) you are suggesting it is not allowed. And these are mostly months old, from what I can tell. I suggest you seek to change Commons policy, if that is what you are really after, rather than resorting to blocking my account. The few dozen other duplicates I can see in Special:ListDuplicatedFiles amount to a few instances of human error by the source institutions, and, considering we have now done well over a million uploads, the rate of duplicates is almost none. Dominic (talk) 15:30, 26 November 2020 (UTC)
It looks like I have to accept the license issue, though I'm not happy with it. So I'm going to unblock your bot.
I'm sure that I have by now processed about 200 duplicates (out of 500 processed so far, with 1000 still waiting for a few days); that is much more than "a few dozen". You (and others) are currently creating duplicates more quickly than I am able to process them.
How do you plan to address the categorization issue I raised? --JuTa 15:48, 26 November 2020 (UTC)
Your bot is now unblocked - and working again right away. --JuTa 15:52, 26 November 2020 (UTC)
Thanks for doing that. Regarding the other issues you are concerned with around duplicates and categorization, I am certainly committed to having a discussion around that and seeing how we can come up with an amicable solution. I will also say that we are working on this grant-funded project with the continuing assistance of software developers (which is how we have gotten to this point of uploading a million files this year), and we are continuing to develop and iterate on our custom code. There is always room for improvement, and this bot is actively maintained. We are planning future tweaks, such as implementing structured data statements and other changes that might help with categorization, and we could apply them retroactively to all past uploads. We can also consider what other quality filters are possible within the limitations of the data available to us. For now, I will note that it is a holiday here in the United States, and hope that you let me take a break from this conversation so I can return to my family! Thanks again. Dominic (talk) 16:06, 26 November 2020 (UTC)

Sounds good so far. To explain the dupe issue: I regularly work on the duplicate files, and normally get between 100 and 500 within 3 days, which I am able to process. Now the rate is about 1500 within 3 days, and I got concerned that I will not be able to process them as quickly as they "arrive". Anyhow: Have a nice holiday. --JuTa 16:11, 26 November 2020 (UTC)
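
The license mapping discussed in this thread can be sketched in Python as follows. This is only an illustration, not the bot's actual code: the table `RIGHTS_TO_TEMPLATE` and the function `license_template` are hypothetical names, and the only entry is the one mapping stated in the thread ("No Copyright - United States" to {{PD-US}}).

```python
# Hypothetical lookup table; the single entry is the mapping described
# in the thread. Any other rights statements are deliberately left out.
RIGHTS_TO_TEMPLATE = {
    "No Copyright - United States": "PD-US",
}


def license_template(rights_statement):
    """Return the wikitext license tag for a source rights statement.

    Returns None when the statement is not in the table, so a caller
    can skip the record instead of guessing a more specific tag.
    """
    name = RIGHTS_TO_TEMPLATE.get(rights_statement.strip())
    return "{{" + name + "}}" if name else None


tag = license_template("No Copyright - United States")
```

A lookup keyed only on the rights statement reflects the argument made above: the general tag can be applied automatically, while choosing a more specific tag such as {{PD-USGov}} would need information the metadata does not provide in machine-readable form.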


Notification about possible deletion

Some contents have been listed at Commons:Deletion requests so that the community can discuss whether they should be kept or not. We would appreciate it if you could go to voice your opinion about this at their entry.

If you created these pages, please note that the fact that they have been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with them, such as a copyright issue. Please see Commons:But it's my own work! for a guide on how to address these issues.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Bobamnertiopsis (talk) 18:48, 29 November 2020 (UTC)