Commons:Village pump/Proposals


Shortcuts: COM:VP/P • COM:VPP

Welcome to the Village pump proposals section

This page is used for proposals relating to the operations, technical issues, and policies of Wikimedia Commons; it is distinguished from the main Village pump, which handles community-wide discussion of all kinds. The page may also be used to advertise significant discussions taking place elsewhere, such as on the talk page of a Commons policy. Recent sections with no replies for 30 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; the latest archive is Commons:Village pump/Proposals/Archive/2022/06.

Please note
  • One of Wikimedia Commons’ basic principles is: "Only free content is allowed." Please do not ask why unfree material is not allowed on Wikimedia Commons or suggest that allowing it would be a good thing.
  • Have you read the FAQ?

 
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 5 days and sections whose most recent comment is older than 30 days.

File usage on openstreetmap.org, The following page uses this file

In addition to the listing "File usage on Commons", a listing "File usage on openstreetmap.org" would be very useful from my point of view, especially so that such uses can be taken into account in deletion decisions.

Support to enable listing of all InstantCommons uses.   — Jeff G. please ping or talk to me 05:59, 11 June 2022 (UTC)
  • I do not think that this is technically possible. The only way to do something like this would be a bot going through the OSM database and adding a template to the file page. --GPSLeo (talk) 06:43, 11 June 2022 (UTC)
    Thanks for your feedback. Yes, I had already thought of a bot that searches for "wikimedia_commons" in the OSM database. Then perhaps the link to the parent node (example here: https://www.openstreetmap.org/node/3094456337#map=18/52.35720/12.69014) could be determined, and the necessary link to the file (here: File:Beobachtungsturm Strengsee, Seitensicht.jpg) is also available. It would only be necessary to find someone with the time to implement this . . . — Preceding unsigned comment added by Molgreen (talk • contribs) 09:00, 11 June 2022 (UTC)
    I think having such a bot edit the file page would be disruptive: people get annoyed enough at watchlist notifications for changes to structured data; notifications for changes to the usage of a file would be even worse. But a bot could maintain a user gallery or several that contained files used in OSM, and those would appear in the list of local uses.
    However, I would then wonder why this should be limited to OSM. Other sites can use Commons images, either through InstantCommons as Jeff mentions, or by just recording URLs in a database (e.g. MusicBrainz). Do we in principle want to know about all external uses? Or maybe just ones in free-content projects? --bjh21 (talk) 10:16, 11 June 2022 (UTC)
    I agree that a bot scanning for links to deleted or renamed files would be useful, but that is not something to discuss here; it is a topic for the OSM forums. It is also questionable whether direct linking of the files is needed. Most objects in OSM where a photo makes sense have a Wikidata item to which the photo is linked. --GPSLeo (talk) 10:49, 11 June 2022 (UTC)
    From my point of view, it would be interesting to record all uses. But that would probably be very controversial (I suspect)? OSM is a special project for me; I would even call it a partner project. I am now active in both worlds (the Wikiverse and OSM), and I think that OSM applies quality standards similarly high to those of the Wikiverse. --Molgreen (talk) 10:58, 11 June 2022 (UTC)
    @GPSLeo: To be clear, I did not suggest a bot scanning for links to deleted or renamed files. As you say, such a bot, and the question of whether the wikimedia_commons key should exist at all, are both matters for the OSM community. My suggestion was of a way to implement what Molgreen proposed without spamming people's watchlists. --bjh21 (talk) 11:02, 11 June 2022 (UTC)
Comment To give an idea of scale, OSM taginfo says there are 67,390 distinct values of the wikimedia_commons key on OSM at present. --bjh21 (talk) 10:22, 11 June 2022 (UTC)
Hello bjh21, this is very interesting. Thank you for the link! (There is still much that is possible . . .) --Molgreen (talk) 11:02, 11 June 2022 (UTC)
(I mean that use of "wikimedia_commons" is apparently only just beginning. Hopefully it will still grow a great deal.) --Molgreen (talk) 11:24, 11 June 2022 (UTC)
Further comment I've written a trivial script to convert data from Overpass into a gallery and created User:Bjh21/files used on OSM containing files referenced by OSM in and around London. This means that if you visit File:Halfway II Heaven, Trafalgar Square, WC2 (3614629275).jpg, for instance, that gallery appears in the list of uses of the file. I think I could fairly easily run a bot to maintain a collection of such galleries for the whole world if there were a consensus in favour of that. --bjh21 (talk) 11:37, 11 June 2022 (UTC)
@Bjh21: That is a very elegant solution and I can see it being very useful, both for us and for our OSM colleagues. Would there be some way to display the results on a map, also? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:45, 11 June 2022 (UTC)
@Pigsonthewing: I could probably arrange to add maps, but that's a lot more complicated and not really within scope. If you want you can get the same set of objects on an interactive map by typing wikimedia_commons~"^File:" into the overpass turbo Query Wizard. --bjh21 (talk) 20:01, 11 June 2022 (UTC)
@Bjh21: Thank you very much! A great solution from my point of view as well. The key thing here is that the usage is automatically displayed on the file page. This works great. One question: what are the chances that the solution will be implemented permanently and for all these files? --Molgreen (talk) 17:45, 11 June 2022 (UTC)
@Molgreen: From a technical point of view I don't think there's anything stopping me implementing this. I've tested it without the geographical restriction and the Overpass query runs in a couple of minutes, generating a 4 MB wiki page with 42,000 pictures on it. I'd probably want to carve it up into smaller subpages, but doing that and wrapping it up into a bot that runs daily or weekly would be easy. The only likely obstacles are organisational. I'd like to allow for some more opinions here so I can be confident we've got some kind of consensus. Then I'll go through the procedure for getting permission to run a bot. --bjh21 (talk) 20:55, 11 June 2022 (UTC)
@Bjh21: It would be very nice if that would work. Either way, thanks again for your effort up to here. --Molgreen (talk) 15:45, 12 June 2022 (UTC)
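
For readers wondering what "convert data from Overpass into a gallery" and "carve it up into smaller subpages" might look like, here is a minimal sketch in Python. It is not bjh21's actual script, which is not shown in this discussion. It assumes the Overpass result has already been fetched as JSON and parsed into a list of elements, each with an optional "tags" dictionary; the function names and the page size of 500 are hypothetical.

```python
from typing import Dict, Iterable, List


def commons_files(elements: Iterable[Dict], key: str = "wikimedia_commons") -> List[str]:
    """Collect the distinct 'File:...' values of `key` from Overpass-style elements.

    Mirrors the filter wikimedia_commons~"^File:" mentioned in the thread:
    values that do not start with "File:" (e.g. categories) are skipped.
    """
    seen = set()
    for element in elements:
        value = element.get("tags", {}).get(key, "")
        if value.startswith("File:"):
            seen.add(value)
    return sorted(seen)


def gallery_pages(files: List[str], per_page: int = 500) -> List[str]:
    """Split the file list into chunks and wrap each chunk in <gallery> wikitext."""
    pages = []
    for start in range(0, len(files), per_page):
        chunk = files[start:start + per_page]
        pages.append("<gallery>\n" + "\n".join(chunk) + "\n</gallery>")
    return pages


if __name__ == "__main__":
    # Hypothetical sample data in the shape of an Overpass JSON "elements" list.
    sample = [
        {"type": "node", "id": 1, "tags": {"wikimedia_commons": "File:B.jpg"}},
        {"type": "way", "id": 2, "tags": {"wikimedia_commons": "Category:Towers"}},
        {"type": "node", "id": 3, "tags": {"wikimedia_commons": "File:A.jpg"}},
    ]
    for page in gallery_pages(commons_files(sample), per_page=1):
        print(page)
```

Because each file appears in a gallery on a wiki page, MediaWiki lists that page under the file's local uses without the bot ever touching the file page itself.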

I have requested permission to run a bot to do this: Commons:Bots/Requests/Usage Bot. --bjh21 (talk) 15:31, 19 June 2022 (UTC)

I now have permission to run the bot. I've decided to start with the simple case of files used on Wikitech, so I've created Commons:Files used on Wikitech and some subpages. The bot has made its first edit bringing the galleries up to date. I think there should be a central page describing this system and with a talk page for discussing it. I'm not quite sure what it should be called, though. Maybe Commons:Tracking external file usage? --bjh21 (talk) 22:15, 27 July 2022 (UTC)

Commons:Files used on OpenStreetMap now exists and is populated. The bot needs a bit of fettling still, but what Molgreen suggested seven weeks ago is now in place. Thank you everyone! --bjh21 (talk) 17:50, 1 August 2022 (UTC)

A very good solution from my point of view. Many thanks to bjh21 --Molgreen (talk) 18:28, 1 August 2022 (UTC)

Would a backlink also be conceivable?

Hi bjh21, would it be conceivable to also have a backlink from each image in the list (Commons:Files used on OpenStreetMap) to the node in OSM that uses that particular image? --Molgreen (talk) 19:27, 5 August 2022 (UTC)

@Molgreen There is one, indirectly: click on the link on the gallery page saying "~1 use(s)". That takes you to Taginfo, where the "overpass turbo" link (top right) takes you to an Overpass query that returns the relevant feature, and then click on the feature on the map and you'll get a link to the main OSM site. I could possibly have a link directly to overpass turbo, but adding one extra link will make the page about 30% bigger, so I'd prefer not to. --bjh21 (talk) 21:20, 5 August 2022 (UTC)
@bjh21 yes, thank you, this is what I was looking for. A direct link would be nicer of course, but I can understand that you don't want that. Many thanks and greetings --Molgreen (talk) 04:42, 6 August 2022 (UTC)
PS: Maybe you could add a sentence in the description:
"Images in this gallery are referenced by the wikimedia_commons key on OpenStreetMap. The list was last updated on 31 July 2022 by Usage Bot. The link below each file leads to a page on OpenStreetMap Taginfo for the particular value of wikimedia_commons. That page in turn has links to other OSM sites including overpass turbo for generating maps and Level0 for editing features. There you can also see where the images are used. This page contains information from OpenStreetMap, which is made available here under the Open Database License (ODbL)." --Molgreen (talk) 04:42, 6 August 2022 (UTC)
@Molgreen: The header is transcluded from Commons:Files used on OpenStreetMap/header so you can edit it yourself. Be bold! --bjh21 (talk) 08:50, 6 August 2022 (UTC)
@bjh21 Thanks and sorry: I wanted to be "bold" :-) but if I change that, I would be the editor of the list. That would be "presumptuous" :-) --Molgreen (talk) 10:01, 6 August 2022 (UTC)
@Molgreen: Are you worried about the {{REVISIONUSER}} in the header? That gets expanded based on the top-level page, so even if you edit the header, the lists will all continue to say "Usage Bot" at the top. --bjh21 (talk) 10:43, 6 August 2022 (UTC)
@bjh21 Thanks. I was "bold" :-) --Molgreen (talk) 11:06, 6 August 2022 (UTC)

Should we track image=* as well?

Since this topic hasn't been archived yet, I'll raise this here: Commons:Files used on OpenStreetMap currently tracks usage in wikimedia_commons=*. But references to Commons files can also appear in the image=* key. Lots of MapComplete themes render pictures from image=*. I've coded an extension to the Usage Bot to support reading multiple keys, and it finds over 85,000 more files to add to the galleries, taking the total number of galleries to 133. Since the galleries are in the Commons namespace, I'd like to get consensus before I enable tracking of image=*. So should I enable it? --bjh21 (talk) 18:42, 9 August 2022 (UTC)
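
To illustrate what "reading multiple keys" could involve, here is a rough Python sketch. It is not the Usage Bot's code. It assumes, purely for illustration, that a value counts as a Commons reference if it is either a bare "File:..." title or a URL under https://commons.wikimedia.org/wiki/; the real bot may accept other forms. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import unquote

# Assumption: only this URL form is treated as a Commons file link.
COMMONS_PREFIX = "https://commons.wikimedia.org/wiki/"


def file_reference(value: str) -> Optional[str]:
    """Return a normalised 'File:...' title, or None if the value is not one."""
    value = value.strip()
    if value.startswith(COMMONS_PREFIX):
        # Turn the URL path back into a page title: decode %xx and underscores.
        value = unquote(value[len(COMMONS_PREFIX):]).replace("_", " ")
    return value if value.startswith("File:") else None


def files_by_key(tags: Dict[str, str],
                 keys: Tuple[str, ...] = ("wikimedia_commons", "image")) -> Dict[str, str]:
    """Map each tracked key to the Commons file it references, if any."""
    found = {}
    for key in keys:
        ref = file_reference(tags.get(key, ""))
        if ref is not None:
            found[key] = ref
    return found


if __name__ == "__main__":
    # Hypothetical OSM tags for one feature.
    tags = {
        "name": "Example tower",
        "wikimedia_commons": "File:Example tower.jpg",
        "image": "https://commons.wikimedia.org/wiki/File:Example_tower_at_night.jpg",
    }
    print(files_by_key(tags))
```

Values of image=* that point elsewhere (for example to another website) are simply ignored in this sketch.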

Policy status for Commons:Harassment

The page has said that it is a proposed policy for three years, with no complaints. Let's make it official, like en:WP:Harassment.   — Jeff G. please ping or talk to me 10:54, 6 July 2022 (UTC)

What are examples of harassment on the Commons that have not been addressed, or not adequately addressed, by existing policy? In the absence thereof, how is this not w:WP:CREEP? Why is a policy that was copy and pasted from en.wiki being presented here for approval with only superficial localisation? "WP" shortcut prefixes, for example, were simply swapped to "COM" resulting in redlinks like "COM:HUSH"--or missed altogether, like the still-remaining "WP:HNE". The page is replete with links, see alsos, and references to en.wiki. This is the Commons. Эlcobbola talk 14:19, 6 July 2022 (UTC)
What is the benefit of making it official? What is the benefit of its existence? Mateusz Konieczny (talk) 09:45, 28 July 2022 (UTC)
@Mateusz Konieczny: The benefit is that, as a policy, it could be more easily and safely used as a reporting and blocking rationale.   — Jeff G. please ping or talk to me 13:14, 28 July 2022 (UTC)
Are you sure that it is needed, that people are currently not getting blocked for harassment, and that this would change if the page were made official? Mateusz Konieczny (talk) 13:24, 28 July 2022 (UTC)
@Mateusz Konieczny: Having that as policy would have helped in my ANU complaint about Debjyoti Gorai, among others.   — Jeff G. please ping or talk to me 13:32, 28 July 2022 (UTC)
Can you link the relevant discussion? (BTW, if you want to make something a policy because it would help in your complaint about another user, it would be nice to mention that from the start.) Mateusz Konieczny (talk) 13:38, 28 July 2022 (UTC)
@Mateusz Konieczny: I was writing of COM:ANU#Debjyoti Gorai, but there is also COM:HD#Need to talk to an Indian Commons admin.   — Jeff G. please ping or talk to me 13:54, 28 July 2022 (UTC)
The user got no support in the linked discussions, and no one questioned your suggestions. I don't see how the suggested policy would help. –LPfi (talk) 08:50, 29 July 2022 (UTC)
@LPfi: And yet, the user remains unblocked after over 39 hours. Note: I made this proposal 23 days ago.   — Jeff G. please ping or talk to me 10:06, 29 July 2022 (UTC)
The one-month block seems kind of weak, all things considered. I'm not saying that having this as policy at the time would have made the block longer, but I do think there's a direct correlation between the existence of behavioral guidelines and how seriously administrators take inappropriate behavior, as well as how quickly they will act on it. It's also a lot easier for someone to excuse themselves and argue in favor of their position if the breach in conduct is only wrong because of implicit norms that aren't stated anywhere. Maybe Debjyoti Gorai would have been less inclined to beat the dead horse if we could have pointed to specific sentences in Commons:Harassment that they were violating instead of just doing a bunch of vague finger-wagging about them being a drama queen or whatever. --Adamant1 (talk) 23:42, 1 August 2022 (UTC)
@Adamant1: Here's a vague finger-wave you can take to the bank (not directed at you): Alienating Admins is bad for your account's health.   — Jeff G. please ping or talk to me 00:00, 2 August 2022 (UTC)
True. Debjyoti Gorai definitely didn't do themselves any favors in that regard. --Adamant1 (talk) 00:21, 2 August 2022 (UTC)

Duplicate detection: Commons Upload Wizard and Commons app

For a few weeks now I have taken a liking to the Commons:Mobile app. I plan to publish parts of my media archive later (in a few years) with the Commons Upload Wizard, and I am afraid that I will then accidentally publish images more than once. The Commons Upload Wizard reliably detects its "own" duplicates. However, it cannot detect when the same image has already been uploaded with the Commons mobile app. (Possibly this is due to differing metadata?) I have also already pointed this out here. Molgreen (talk) 15:24, 17 July 2022 (UTC)

Yes, the photos differ by a few bytes and therefore cannot be detected as duplicates. So the problem that uploads from the app are not detected when they are uploaded again in the Wizard unfortunately cannot be solved so easily. That a photo can be uploaded twice from the app, even though it is the bit-identical file, is probably indeed a bug. --GPSLeo (talk) 15:46, 17 July 2022 (UTC)
Can anyone tell me whether I can work around this by selecting all EXIF tags in the settings of the Commons:Mobile app? --Molgreen (talk) 16:14, 17 July 2022 (UTC)
I tried it with all EXIF tags selected in the settings of the "Commons app": even then the "Commons Upload Wizard" does not recognise that it is a duplicate:
  • Image uploaded with all EXIF tags via the "Commons app"
  • Duplicate uploaded with all EXIF tags via the "Commons Upload Wizard" (the Upload Wizard does not detect the "Commons app" duplicate)

  • Why not just use the mobile version of the MediaWiki Upload Wizard? The mobile app is notoriously bad at the basic functions it provides and is inferior to any browser-based web interface. Just use the mobile upload wizard and the issues unique to the mobile app will disappear (for you, not for those unfortunate enough to use the mobile app). --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 16:18, 17 July 2022 (UTC)
    Thanks for the hint. I will try it soon. --Molgreen (talk) 16:49, 17 July 2022 (UTC)
    The "Nearby" feature mentioned here
    Hello Donald Trung 『徵國單』, thank you very much for the tip about the mobile version of the MediaWiki Upload Wizard (UploadWizard). I have tried it. The upload works very well, as expected. At the same time, the Commons app also has good features such as "Nearby" / "Photos needed".— Preceding unsigned comment added by Molgreen (talk • contribs)
    I am using this mobile app and I am perfectly fine with it. Mateusz Konieczny (talk) 09:46, 28 July 2022 (UTC)
    • Comment: wouldn't it be wise to propose that the mobile app become a web wrapper until they can make their version of the Upload Wizard equal to or better than the one we web users have (saying this as a user who currently only uses his mobile telephone 📞 to edit)? --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 16:20, 17 July 2022 (UTC)
      • Why not use both? The feature sets overlap but some features work better or are only available in one of the two. One user can choose to use one method to contribute to Commons in one situation, and another in another situation. whym (talk) 12:41, 21 July 2022 (UTC)
    Thank you, yes, that's exactly what I'm practicing successfully by now.
    It remains very unfortunate for me that the two upload wizards do not recognize each other's duplicates. --Molgreen (talk) 15:14, 21 July 2022 (UTC)
    @Molgreen Do you mean UploadWizard (the web version) successfully detected those duplicates with different metadata? If so, I wonder why. As far as I know, both UploadWizard and the mobile app rely on hash values to detect duplicates. The two Schwanenteich images have different hash values, so both should fail to detect them as duplicates. (It's "sha1" in the result.) whym (talk) 22:28, 21 July 2022 (UTC)
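
    As a small illustration of why hash-based detection misses files that "differ by a few bytes", here is a minimal Python sketch. It only demonstrates the general behaviour of SHA-1 on file contents and makes no claim about how UploadWizard or the mobile app are implemented; the byte strings stand in for real image files and are made up.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_exact_duplicate(a: bytes, b: bytes) -> bool:
    """Two files count as duplicates here only if every byte is identical."""
    return sha1_hex(a) == sha1_hex(b)


if __name__ == "__main__":
    # Made-up stand-ins for an image and a copy whose metadata was rewritten.
    original = b"<metadata: camera A><pixel data>"
    same_pixels_new_metadata = b"<metadata: camera a><pixel data>"

    print(is_exact_duplicate(original, original))                  # identical bytes
    print(is_exact_duplicate(original, same_pixels_new_metadata))  # one byte differs
```

    A detector that should catch the second case would have to compare something other than the raw file bytes, for example the image data with metadata stripped.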
    @whym thank you for your feedback. In my experience:
    • UploadWizard reliably recognizes its own duplicates.
    • The Commons:Mobile app does not recognize any duplicates (see Commons:Mobile_app/Feedback#Feedback_from_Molgreen_for_version_4.0.1~66f8f97d0_3).
    • Commons:Mobile app and UploadWizard do not recognize duplicates of each other. This may be technically difficult. But I would really like it, because unfortunately I've accidentally uploaded the same file twice via Commons:Mobile app and UploadWizard. --Molgreen (talk) 05:51, 22 July 2022 (UTC)
      There seem to be two issues. 1) I wonder if something other than the two upload tools might have changed the file (or the metadata of the file). For example, you might have used Google Photos to transfer the file, and Google Photos might have modified/normalized something in EXIF. If so, another duplicate detection method (one that ignores EXIF) might be the solution. That would be a feature request to the MediaWiki developers. 2) File:Schwanenteich_im_Annatal.jpg and File:Schwanenteich_im_Annatal_4.jpg are completely identical, so if the mobile app showed no warning about duplicates, that is a software bug. I think this needs to be fixed by the mobile app developers. whym (talk) 03:28, 23 July 2022 (UTC)
    Support --Molgreen (talk) 09:52, 23 July 2022 (UTC)

    Another test

    To be on the safe side, I did another test. It seems to be very complicated:

    • in the following order:
      • I downloaded Schwanenteich im Annatal.jpg to the download folder of my smartphone
      • now something is different: the duplicate is detected, but I can still upload
      • Upload wizard prevents uploading

    Another photo

    • 20220708-Test.jpg
    • App detects the duplicate, but I can upload anyway
    • Upload wizard does not recognize the duplicate from the app (same source each time: Google Photos, Camera)

    Mass move of files containing space before the file extension

    I came across File:The Heart of a Hero (1916) .webm this morning, and I do remember seeing other webm files in the past that also had spaces before ".webm". I was wondering if it might be a good idea to use a bot to do a mass move of files with this specific issue, since it is an obvious, straightforward, and unambiguous error IMO. PseudoSkull (talk) 12:50, 21 July 2022 (UTC)

    Support with redirects.   — Jeff G. please ping or talk to me 23:18, 21 July 2022 (UTC)
    Redirects break links from outside Wikimedia, which is why file moves are discouraged unless really necessary (or maybe recently uploaded). This check definitely should be part of any uploading interface to try and prevent the situation in the first place, but there can be some consequences to fixing existing ones. Carl Lindberg (talk) 13:10, 28 July 2022 (UTC)
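
    For anyone who wants to gauge how such a check could work, whether in a bot or in an upload interface, here is a minimal Python sketch. It is only an illustration of the pattern described above (whitespace directly before the final extension); the function name is hypothetical and nothing here reflects an existing Commons tool.

```python
import re
from typing import Optional

# Whitespace immediately before the final ".ext" at the end of the title.
_SPACE_BEFORE_EXT = re.compile(r"\s+(\.[A-Za-z0-9]+)$")


def fixed_title(title: str) -> Optional[str]:
    """Return the title with the stray space removed, or None if it is already fine."""
    new_title = _SPACE_BEFORE_EXT.sub(r"\1", title)
    return new_title if new_title != title else None


if __name__ == "__main__":
    for title in ("File:The Heart of a Hero (1916) .webm", "File:Ordinary name.jpg"):
        print(repr(title), "->", repr(fixed_title(title)))
```

    The same function could be used at upload time to reject or correct a name before the file exists, which avoids the redirect problem raised above.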

    Block mass upload tools that don't prevent upload of duplicates

    Automated uploads of duplicates have gotten entirely out of hand. Any automated tool that uploads to Commons should not be permitted to do so until it prevents upload of duplicates.   — Jeff G. please ping or talk to me 11:43, 30 July 2022 (UTC)

    I meant exact duplicates of existing files (duplicate), of deleted files (duplicate-archive), and of old versions of deleted files (duplicate-version), as per bjh21's post below.   — Jeff G. please ping or talk to me 10:23, 31 July 2022 (UTC)
    I definitely support this. In addition to preventing duplicates with the same hash value, the tools should ask people to do at least a short search to check whether the files are already on Commons. --GPSLeo (talk) 12:09, 30 July 2022 (UTC)
    I'd say that this is probably not as good an idea as it seems. A large number of images of museums, archaeological sites, works of ancient architecture, etc. come from Flickr, and Flickr2Commons is the best tool to import them with. Generally speaking it is able to find duplicates, but it doesn't detect all of them. The issue here isn't that people are allowed to use Flickr2Commons; rather, it's that the tool is not properly maintained and updated. Wouldn't it simply be easier to note that the tool in its current form has a bug and then have someone fix that bug than to prevent tens of thousands of educational uploads simply because a few of them might be duplicates, which are usually tagged as duplicates and then deleted and redirected in a semi-automated way by a few admins who specialise in this? This just puts more burdens on the uploader who simply found a good, high-quality educational image with a free license and then attempts to upload it. This is not behaviour that we should be discouraging by adding extra steps. Simply ping user "@Magnus Manske: " or whoever is maintaining it now and let them fix these issues.
    Also, the wording of the proposal can lead to all upload tools being blocked if they have this issue, which simply shifts the outcome from the tool being fixed to the tool not being used at all. Flickr is probably the most important photography website on the internet (also note that places like Twitter, Instagram, and Facebook all remove metadata and shrink the files, so even if we were able to import from them they would still be inferior), so we should be finding ways to make importing free images from Flickr easier, not more difficult. We used to have users like "" that did massive imports from Flickr using Flickr2Commons, and a large number of their images are used in Wikipedia articles. I'd say run the Flickr2Commons tool on any previously imported album and it detects basically all duplicates; it's quite rare for Flickr2Commons to actually not detect a duplicate if you're using the tool correctly, so it could be that Flickr duplicates aren't being found for other reasons that simply don't arise with images that were previously imported using this tool. This requires a bug fix, not a ban. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 12:25, 30 July 2022 (UTC)
    See Category:Duplicate as an example. Problems with duplicates are not exceptions. Sometimes duplicates that were uploaded on Flickr were not detected in the same batch. Tools must either be properly maintained and not create unnecessary problems, or not be allowed to be used. --EugeneZelenko (talk) 13:53, 30 July 2022 (UTC)
    I think this proposal needs a clearer definition of "automated" and "duplicate". The upload API can issue warnings for exact duplicates of existing files (duplicate), of deleted files (duplicate-archive), and of old versions of deleted files (duplicate-version). Meanwhile, CSD F8 allows for speedy deletions of exact and scaled-down duplicates of existing files, and CSD G4 for speedy deletions of duplicates of certain deleted files. It might be reasonable to require tools not to silently overrule certain upload warnings. Requiring detection of scaled-down duplicates would be unfeasible since Commons doesn't have any useful facility to search for such things. Requiring automated tools to ask a human to search for duplicates, as suggested by GPSLeo, would be difficult for bots. Should Geograph Update Bot wake me up every Sunday morning with a list of files that I need to search for before it can upload them? That might be feasible, since the bot only uploaded 32 files this week. For GeographBot, which uploads about six files per minute, it would be completely impractical. --bjh21 (talk) 10:09, 31 July 2022 (UTC)
    Yes, such a search cannot be done by bots, but this problem should be discussed in the task approval. In cases where many duplicates are expected, some extra mitigations are needed. Every bot should definitely be required to respect the duplicate warnings. --GPSLeo (talk) 18:51, 31 July 2022 (UTC)
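
    To make "respect the duplicate warnings" concrete, here is a minimal Python sketch of the check a tool could run on the warnings it gets back before deciding whether to go ahead. The three warning names are the ones bjh21 lists above; everything else is an assumption for illustration: that the warnings arrive as a dictionary keyed by warning name, and the function names themselves. It does not call the real API.

```python
from typing import Dict, List

# Warning names taken from the discussion above.
DUPLICATE_WARNINGS = ("duplicate", "duplicate-archive", "duplicate-version")


def duplicate_warnings(warnings: Dict[str, object]) -> List[str]:
    """Return which duplicate-related warnings are present, in a fixed order."""
    return [name for name in DUPLICATE_WARNINGS if name in warnings]


def may_upload(warnings: Dict[str, object]) -> bool:
    """A tool following the proposed rule proceeds only without duplicate warnings."""
    return not duplicate_warnings(warnings)


if __name__ == "__main__":
    # Hypothetical warning payloads, shaped as name -> details.
    print(may_upload({}))
    print(may_upload({"duplicate": ["Existing file.jpg"]}))
    print(duplicate_warnings({"duplicate-version": [], "duplicate": ["Existing file.jpg"]}))
```

    Under this sketch a tool that hits one of these warnings would stop or hand the case to a human rather than silently overruling it; other warnings are left to the tool's own policy.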
    • Comment Personally, I'm kind of on the fence about this. On the one hand I'm not a super fan of duplicate files, but on the other I've gotten into a few disagreements about whether Commons actually allows duplicate images/files or not. And from those discussions it seems like no one really cares. So while I'd support this in practice, I think it would have to be implemented in conjunction with a wider "no duplicate images/files" policy in general. Otherwise, it just seems weird to single out bots for doing something that isn't even against the rules. --Adamant1 (talk) 22:45, 1 August 2022 (UTC)
    • Obviously, bots increase the magnitude of the problem. Also, mass uploaders did not always care enough about categorization and descriptions, so such duplicates waste time beyond administrators' actions. --EugeneZelenko (talk) 14:09, 2 August 2022 (UTC)
    • Well, regarding "Also, mass uploaders did not always care enough about categorization and descriptions, so such duplicates waste time beyond administrators' actions.": I'd say that the content of the media is more important than its categorisation (which the MediaWiki Upload Wizard itself notes as optional). There are users here who spend their entire time categorising images and improving categorisation; are they "wasting their time"? I'd say that it's better to have an uncategorised educational image here than to have nothing at all. I have often found high-quality images in top categories like "Coins" or "Coins of Randomcountry" that should have been in more specific categories like "Coins of King Monarchpants XI of the Longgone Empire", but having the image here and then categorising it is better than not having the image and not being able to illustrate the subject with a free image at all. Don't get me wrong, discoverability is important, but categorisation is secondary to the content itself, especially since some users are pushing for a full abolition of MediaWiki categories once Wikidata-based Structured Data on Wikimedia Commons (SDC) items become the norm. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 09:59, 11 August 2022 (UTC)
    • Comment: well, I thought that I had already written this, but apparently I forgot and only (partially) addressed it in the main Village Pump. The issue with this proposal is that in its current wording it can technically ban the MediaWiki Upload Wizard. While browsing the files at "Category:Duplicates" I found that most of those in a random sample came from the MediaWiki Upload Wizard and not Flickr2Commons. Running my own experiment (example images at the Village Pump post that inspired this proposal) I found that Flickr2Commons actually does quite a good job of preventing duplicates from being uploaded. So what types of duplicates don't get filtered out?
    Well, that's the issue: these are files that are always impossible to filter out because they are technically not duplicates, that is, files that have different EXIF data because some websites edit it. Suppose "User:1" imports files from Freefileswebhost.website, and all these images are good educational content and get used on various Wikipedia articles, but as it turns out these images have edited EXIF data and were originally taken from Flickr. Now "User:2" imports all these same images with the "correct" (original) EXIF data (as Flickr doesn't edit EXIF data, while other websites like Meta's Facebook, Meta's Instagram, Twitter, etc. all do). Is Flickr2Commons then at fault for not recognising that these files were already imported here from Freefileswebhost.website? Obviously not, as no tool could have recognised this.
    Had "User:2" used the MediaWiki Upload Wizard's Flickr import tool, the exact same issue would have occurred: the MediaWiki Upload Wizard wouldn't have recognised them as duplicates and they would still have been uploaded.
    Sometimes uploaders only categorise media in specific categories. I came across an old 19th-century French photographer who had made lots of images of Egypt. Someone not familiar with how Egyptian categories work could have added them only to categories like "Pyramids in Egypt" or "Egypt in the 19th century", while the person who uploads the slightly different "duplicates" might have looked at the category for the specific pyramid by name. Are we asking users to literally go through hundreds of categories every time before they upload an image?
    More often than not human error, or rather human ignorance, is at play here. A vague blanket ban will only prevent educational images from being uploaded, and uploading these images one by one is tedious and takes far more time than simply using the mass-categorisation and mass-deletion tools that admins have access to. The only way this would work is if we said that the free time of an admin is more valuable than the free time of a content contributor, and that doesn't seem like a wise judgement to make if the number one (#1) mission of Wikimedia Commons is providing free educational content. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 09:59, 11 August 2022 (UTC)
    Category:Duplicates contained ~9.000 technical duplicates about a week ago and still contains more than 3.00. So this is a completely preventable problem. Preventing it would also save content contributors valuable time. --EugeneZelenko (talk) 14:16, 11 August 2022 (UTC)

    Subcategories of Category:Cultural history

    I am looking over the top-level categories by century/decade and there seems to be a large inconsistency on what falls under "Culture". I see cultural organizations, cultural events, entertainment, religion, education, art, cosplay, fashion. Is there a prior discussion on how to organize things at the top level downward? Ricky81682 (talk) 20:23, 31 July 2022 (UTC)

    Just a general comment, but from what I've seen, categories become completely worthless once you get up to a certain level of semantic abstraction like this one, because people just use the categories as random file dumps. Category:History is kind of the same way. There are so many different kinds of images in that category that it's essentially meaningless. Category:Music is another good example: 99% of the images in that category are COPYVIO SPAM images of non-notable musicians. Personally, I'd love to see such categories gotten rid of. Same goes for a category like this one. Everything is "cultural" and everything is also "history." So what's the point in the category? There isn't really one. Conversely, is an image of a random person standing next to a tree "music"? Obviously not. --Adamant1 (talk) 22:31, 1 August 2022 (UTC)
    They serve two roles: 1) If they aren't too overpopulated with irrelevant content, somebody can find files they are able to put into useful categories (or delete as out-of-scope copyvios). The person leaning towards a tree in Category:Music is probably a musician and somebody might recognise them. 2) They can be used to find relevant subcategories. Everything is culture, including a boat, but a category hierarchy starting from there might be about heritage ships, concerts aboard a ship or somesuch. Sometimes it is difficult to guess at the relevant category, and then one strategy is to start high (low?) enough in the tree. –LPfi (talk) 09:31, 9 August 2022 (UTC)

    Google Earth-#- Wikipedia

    On Google Earth I found Kauern in Germany, at 50º43´08,59 North 12º04´40,85 East, in the district of Greiz near Lunzig. There I also found a cross-reference to Wikipedia; however, this reference has nothing to do with the nearby place, but instead refers to the place of the same name at 50º50´38,39 North 12º08´39,09 East, between Gera and Ronneburg. I am not criticising the article, but the marker for the cross-reference should be put where it belongs. As I do not know how to report this error to Google Earth, I am turning to you.

    Eckhard Bartz Eckhard Bartz (talk) 16:37, 7 August 2022 (UTC)

    The coordinates used by en-Wikipedia and Wikidata were indeed incorrect, confusing the ortsteil in Langenwetzendorf and the gemeinde near Ronneburg. These have now been corrected, but I have no idea if and when Google will synchronise their information. --HyperGaruda (talk) 06:07, 9 August 2022 (UTC)