Commons:Village pump/Proposals

From Wikimedia Commons, the free media repository

Shortcuts: COM:VP/P· COM:VPP

Welcome to the Village pump proposals section

This page is used for proposals relating to the operations, technical issues, and policies of Wikimedia Commons; it is distinguished from the main Village pump, which handles community-wide discussion of all kinds. The page may also be used to advertise significant discussions taking place elsewhere, such as on the talk page of a Commons policy. Recent sections with no replies for 30 days and sections tagged with {{section resolved|1=~~~~}} may be archived; for old discussions, see the archives.

Please note
  • One of Wikimedia Commons’ basic principles is: "Only free content is allowed." Please do not ask why unfree material is not allowed on Wikimedia Commons or suggest that allowing it would be a good thing.
  • Have you read the FAQ?

 
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose most recent comment is older than 30 days.

Back up Google Ngrams

Google Ngrams.

You can do fun stuff with it, comparing word frequency like automobile vs car vs taxi and police car vs police automobile. And what's also really great about it: it's free!

Available under a Creative Commons Attribution 3.0 Unported License to be exact. But Google being a for-profit company, they have no obligation to host these files forever. They could be available for a long time, or gone tomorrow.

I'm not sure if this is strictly within our current scope. Educational, yes, totally, but COM:SCOPE isn't really clear about datasets like these. "representative merely of raw text, e.g. ASCII files" kind of suggests it falls outside of our current scope. But that line seems to have been written with text that should go into articles in mind, not gigabytes of tab-separated data. Apparently we do have mw:Help:Tabular Data (I learned something today), but I'd also like to see the original files on Commons. (I also don't know whether tabular data would even be suitable for this.)
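To make concrete what "tab-separated data" means for the word-frequency comparisons mentioned above, here is a minimal sketch that sums yearly counts for a few words. The column layout (ngram, year, match count, volume count) is an assumption about the downloadable files, the function name `yearly_counts` is hypothetical, and the sample rows are invented for illustration only.

```python
import csv
import io
from collections import defaultdict


def yearly_counts(tsv_text, words):
    """Sum match counts per year for the requested words.

    Assumes each row is: ngram <TAB> year <TAB> match_count <TAB> volume_count.
    """
    wanted = {w.lower() for w in words}
    totals = defaultdict(lambda: defaultdict(int))
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or truncated rows
        ngram, year, match_count = row[0].lower(), int(row[1]), int(row[2])
        if ngram in wanted:
            totals[ngram][year] += match_count
    return {word: dict(years) for word, years in totals.items()}


# Invented sample rows, not real Ngrams figures.
sample = (
    "car\t1990\t100\t10\n"
    "automobile\t1990\t40\t8\n"
    "car\t1991\t120\t11\n"
    "taxi\t1990\t5\t2\n"
)
counts = yearly_counts(sample, ["car", "automobile"])
```

The real files are gzipped and far larger, so an actual script would stream them (for example with `gzip.open`) rather than load a whole file into memory.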

This will probably involve:

  • Declaring that at least for Google Ngrams, we are making an exception for COM:SCOPE.
  • Temporarily enabling .gz uploads for either bots, administrators or a particular user who will upload it.
  • Adding storage.googleapis.com at least temporarily to wgCopyUploadsDomains.json for upload_by_url. - Alexis Jazz ping plz 03:43, 17 October 2019 (UTC)

Back up Google Ngrams: votes

  • Support - Alexis Jazz ping plz 03:29, 17 October 2019 (UTC)
  • Support.--Vulphere 13:36, 17 October 2019 (UTC)
  • Support - Commons has more useless files than these (Ngrams are not useless, BTW), so why shouldn't we host these? -- Eatcha (talk) 17:22, 17 October 2019 (UTC)
  • Support, these Ngrams seem very useful. Though not all of them are equally useful, it's better to save them now than regret it tomorrow. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 20:30, 18 October 2019 (UTC)
  • Support - I'm sure people out there interested in this sort of thing will find these useful. –Davey2010Talk 20:56, 18 October 2019 (UTC)
  • Conditional support if this is generalised to "Allow dataset uploads with prior community agreement". Oppose if this is a Google-only exception. ℺ Gone Postal ( ) 04:37, 30 October 2019 (UTC)
  • Oppose - First off, these are mostly zipped and gzipped CSV files. Shouldn't those go somewhere like CommonsArchive or Internet Archive rather than here? Second, Google is more than capable of backing up their own files and generally gives a year or two of notice before discontinuing any services. What's the point of creating a 2nd (3rd?) back-up on Commons when there is no reason to believe that Google is discontinuing Ngrams any time soon? Surely, there are more productive uses of our time and space. Finally, hosting data files is out of scope for Commons. According to Commons:Project scope, "Every file must be a media file." Kaldari (talk) 21:52, 7 November 2019 (UTC)
  • Support - Tibet Nation (talk) 18:09, 29 December 2019 (UTC)
  • Oppose I agree with Kaldari. Commons is for media files, and a backup already exists in the Internet Archive.--So9q (talk) 10:51, 16 January 2020 (UTC)
  • Oppose - I agree with Kaldari and am of the opinion that this is out of scope. The Internet Archive is a better place to host a backup of datasets like this. GFJ (talk) 12:34, 27 January 2020 (UTC)

Back up Google Ngrams: discussion

@El Grafo: Doesn't even include the 2012 dataset. But why shouldn't Commons and archive.org both retain a copy? There is no technical or copyright reason why we couldn't. - Alexis Jazz ping plz 16:21, 17 October 2019 (UTC)
@Alexis Jazz: alright, but I think Commonsarchive would be a much better place for this. --El Grafo (talk) 16:17, 31 October 2019 (UTC)
We'd need to buff up its storage space significantly (a few orders of magnitude probably?) to store this. --Zhuyifei1999 (talk) 16:40, 31 October 2019 (UTC)
@Gone Postal: What's the difference? Prior community agreement will always be needed due to file formats. I guess you mean datasets should be added to COM:SCOPE? I agree with that, I'll create a proposal for that as well. Consider this proposal the finding of community agreement to upload Google Ngrams. - Alexis Jazz ping plz 15:36, 30 October 2019 (UTC)
@Alexis Jazz: Ok, fair enough. I just want to be sure that if I were to spend lots of resources and collect a good dataset about word usage on forums, newsgroups (yes, they still exist), etc., then it would be treated in a similar manner to Google's. ℺ Gone Postal ( ) 13:21, 31 October 2019 (UTC)
  • @Kaldari: El Grafo suggested CommonsArchive above. According to Zhuyifei1999, the storage space of CommonsArchive would have to be buffed up considerably. It only has 317 files at the moment and it is unofficial, so I wouldn't be surprised if CommonsArchive itself disappeared at some point. The Ngram files are in an open format, so there's no problem there. We just couldn't allow gzip for everyone because a gzip file can contain anything. As for Google backing it up themselves: we shouldn't rely on Google. Why import anything from anywhere? Others can take care of themselves right? And I'm not watching the Ngrams download page all the time. They could give a notice a year in advance and I would probably miss it. As for time, if someone runs a script to perform the upload (which shouldn't be exceptionally hard, I can provide a nice clean list of links if needed), I will take care of descriptions/categories/etc. - Alexis Jazz ping plz 21:21, 22 November 2019 (UTC)
Sorry for the misclicked 'delete', User:Alexis Jazz DMacks (talk) 23:00, 22 November 2019 (UTC)
@Kaldari: any comment? Our space is barely a concern, WMF doesn't even seem interested in permanently deleting obvious copyvios and abuse. (I doubt even the Wikizero abuse has been permanently deleted) Our time is also no real concern, I will do the categorizing and everything else for the file pages. Running an upload script is not really time consuming either, it just runs in the background. Uploading to CommonsArchive would cost a lot of time though, because I don't have any tools there and services would need to be reconfigured to add space. - Alexis Jazz ping plz 19:52, 22 December 2019 (UTC)
@Alexis Jazz: I concede every point to you except "we shouldn't rely on Google". Google has a gzillion times more resources than us and isn't going anywhere, so why not just let them host it? Is there a Wikimedia project that is actively wanting to use this data? Kaldari (talk) 00:51, 23 December 2019 (UTC)
@Kaldari: Google as a company, sure. But any particular Google service or a part of any given service? No. Remember links to other videos on YouTube, usually on end cards? Vanished. Just gone. Even if Ngrams remain available, there's no guarantee the downloads will. Or they could be protected by a CAPTCHA at any point, making it much harder to get them.
Your second question: Ngrams can certainly be used for Wiktionary. Dutch Wiktionary already has a header "gangbaarheid" (prevalence) which currently contains a link to the official word list (if the word is in the word list) and statistics from a research project (if the word was included in that). In the case of w:wikt:amai, only 58% of Dutch people recognized it while 90% of Flemish people did, which makes sense because that word is rarely used in the Netherlands. Adding a statistic to indicate how common any given word is (and how its use progressed over time) would add value. - Alexis Jazz ping plz 01:12, 23 December 2019 (UTC)
@Alexis Jazz: Sure, it could add value, but I'm not convinced anyone is actually going to use this data on our projects. If there was an actual plan to use it or even better, some existing uses, I would be more open to hosting it. Another issue (that you already mentioned) is that data files are not within the scope of Commons. According to Commons:Project scope, "Every file must be a media file." There are countless data files out there on the internet that could have educational value and are under open licenses (for example, data supplements for open license scientific papers). Are we going to start hosting all of them just because they might disappear one day? That sounds like a job for the Internet Archive, not Wikimedia Commons. Kaldari (talk) 23:19, 14 January 2020 (UTC)
@Kaldari: Chicken and egg. I may work on bringing something to Wiktionary (I must admit, with my schedule and everything, it could take quite some time, hard to say) but if Commons doesn't host the raw data, I probably won't. Because I don't want to rely on Google and see it all vanish one day. As for open data from scientific papers, I say: sure, let's host it! Why not? Why does data have to be restricted to the data: namespace? Or do you want to get rid of the data: namespace as well? - Alexis Jazz ping plz 17:15, 15 January 2020 (UTC)

We don't have enough videos: some proposals to address the shortage of video files on Wikimedia Commons

According to Special:MediaStatistics, fewer than 1% of our files are videos; by size, video accounts for 4.97% of our total storage.

Q Why do we need more video?

  • A Visual stimulation grabs users' attention. Our goal of developing and maintaining open-content, wiki-based projects and providing the full contents of those projects to the public free of charge is incomplete without educational videos.

Q How does a normal user upload video (at present)?

  • A They either use FFmpeg (or any other video conversion tool) or upload via Commons:video2commons. Please note that new users are not aware of video2commons, and it is also restricted to autoconfirmed users for obvious reasons. Users editing on tablets/phones/phablets/cheap computers can't use FFmpeg, as it is too slow on those devices.

Q Are we ignoring this problem?

  • A You know better than I do.

Q Would it break WMF's servers?

  • A Commons:video2commons runs on WMF servers and doesn't break them. WMF has enough processing power to handle the conversion.


Please feel free to oppose any of the following proposals, but don't forget to provide a realistic solution to increase the number of videos on Wikimedia Commons.

Proposal 1: Support conversion of MP4 files to an open format like WebM & delete/nuke the MP4 file after uploading the transcoded WebM file

  • Most video cameras and smartphones record in MP4. The uploader is required to convert the video to WebM; otherwise, they can't upload it to Commons.
  • Conversion takes a lot of time (sometimes even days) and requires expensive computers, etc.
  • It should be understood that videos will be recorded in MP4 no matter what we do; the conversion can be done either on the uploader's computer or on WMF's servers.
  • By doing the conversion on WMF servers we can increase the number of video files. And by deleting the MP4 file, we are not undermining WMF's goal of supporting open source.
  • By not allowing the conversion on WMF servers we are certainly hampering the upload of many educational videos.
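For readers unfamiliar with the conversion step that uploaders currently have to run themselves, the following is a minimal sketch of an FFmpeg invocation that re-encodes an MP4 file as WebM (VP9 video, Opus audio). The helper name `webm_command`, the file names, and the quality value are illustrative assumptions, not settings used by Commons or video2commons.

```python
import shlex


def webm_command(source, target, crf=32):
    """Build an ffmpeg argument list for an MP4 -> WebM (VP9 + Opus) transcode.

    The crf default is an arbitrary illustrative value; lower means higher quality.
    """
    return [
        "ffmpeg",
        "-i", source,            # input file, e.g. an H.264 MP4 from a phone
        "-c:v", "libvpx-vp9",    # free video codec
        "-crf", str(crf),        # constant-quality target
        "-b:v", "0",             # required alongside -crf for constant-quality VP9
        "-c:a", "libopus",       # free audio codec
        target,
    ]


cmd = webm_command("holiday clip.mp4", "holiday clip.webm")
print(shlex.join(cmd))  # shell-safe form of the command, for copy and paste
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed; on a long or high-resolution clip this is the step that can take hours.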

Votes for Proposal 1

  • Support As the suggester -- Eatcha (talk) 10:08, 8 November 2019 (UTC)
  • Yes, please. Many content creators can't upload educational videos to Commons because it doesn't support uploading of MP4 files. Masum Reza📞 10:51, 8 November 2019 (UTC)
  • Yes, please. I tried uploading videos before discovering Video2Commons but gave up until I found out about the tool. A lot of newbies or just general users who want to upload video files to Wikimedia Commons will be empowered to do so if this proposal gets accepted. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 14:43, 11 November 2019 (UTC)
  • I Support whichever of these proposals allows uploads of MP4. As it happens, the US government normally releases video in this format, and I have more than once downloaded a video and tried to upload it, only to remember that the format isn't allowed. I'm not really qualified to comment on the technical nuances. GMGtalk 20:34, 30 November 2019 (UTC)
  • Support same as GMG, proposal 1 or 2; it mostly depends on what developers think is more doable, because that's usually the bottleneck. In fact, it may not matter what we vote anyway. We can't force any developer to work on it. - Alexis Jazz ping plz 20:50, 30 November 2019 (UTC)
  • Oppose The procedure is too complex. It would create tons of additional backlog. – Kwj2772 (talk) 22:44, 1 December 2019 (UTC)
  • Oppose - MP4 patents will eventually expire (around 2027?). We might as well keep the files around so we can use them once that happens. Plus we may want to re-transcode from the original sources at some point due to better codecs, fixes to transcoding bugs, etc. Kaldari (talk) 23:19, 31 January 2020 (UTC)
  • Oppose. Unless the WMF can't afford the storage space, there's no reason to delete the uploaded source files. They should be kept for future use, mostly re-encoding the VP8 and VP9 versions if such a need arises, generating versions for future formats such as AV1, and making future editing of the uploaded video files viable with less quality loss (see Commons:Lossy and lossless for why). Proposal 2 should be the short-term solution, and Proposal 3 the long-term strategy. -- Veikk0.ma (talk) 14:42, 6 February 2020 (UTC)
  • Support allowing the upload of MP4. No position on whether we keep or delete the MP4 after conversion. Doc James (talk · contribs · email) 03:31, 10 February 2020 (UTC)
  • Yes, please - The first half is a sure-shot yes. Can someone quantify the cost per annum of an MP4-to-WebM set-up? There are many free online converters that support themselves via ads, so I don't think there is any chance the costs will run into millions of dollars. The second half can be debated. However, the first half is a definite yes.--Pratik.pks (talk) 03:08, 11 February 2020 (UTC)
  • Pratikpks I don't have stats for total minutes of video, but https://cloud.qencode.com/pricing indicates that VP9 + WebM is the costliest combination supported on Commons. AV1 (in WebM) is not supported, but if we supported it, that would be the costliest combination possible. If we assume that every single video on Commons is 5 minutes long, it would cost about $1 million (VP9 + WebM). For AV1 + WebM it would be about $2.5 million. Based on Special:MediaStatistics. PS: all these videos were uploaded over many years. -- Eatcha (talk) 05:33, 11 February 2020 (UTC)
  • Thanks for quantifying the amounts, Eatcha. So VP9 + WebM would cost about $1 million to convert the roughly 82,000 videos uploaded to date. If we apply this conversion to future uploads and assume 10,000 videos uploaded per year, it will cost about $125K; if easier MP4-to-WebM conversion bumps uploads to 20,000 videos per year, it will cost about $250K. This pricing is based on an external provider; if we do it on in-house servers, we could save around 50%, right? That would reduce the cost to $62.5K - $125K per annum. Does that sound reasonable? Any further upward/downward adjustments in pricing? --Pratik.pks (talk) 02:45, 19 February 2020 (UTC)
  • Pratikpks $62.5K - $125K per annum seems reasonable to me considering total donations, but it depends on WMF. Is it reasonable to them? -- Eatcha (talk) 03:49, 19 February 2020 (UTC)
  • I don't have any numbers myself, but my gut feeling is that the numbers cited above are significantly higher than the true cost (especially if excluding any sort of salary of the people setting it up, and given that these servers already exist regardless of what happens with this proposal). I don't think we should make any assumptions about the cost of this proposal. If it's unreasonably expensive, WMF will tell us. Bawolff (talk) 11:13, 20 February 2020 (UTC)
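The back-of-the-envelope estimate in this thread can be reproduced with a short calculation. This is only a sketch of the arithmetic quoted above: the $1 million total, the 82,000 videos, the yearly upload counts and the 50% in-house saving are the commenters' assumptions, not measured costs, and the function name `annual_cost` is hypothetical.

```python
def annual_cost(total_cost, total_videos, videos_per_year, in_house_factor=1.0):
    """Scale a one-off cost estimate for all existing videos to a yearly upload rate."""
    per_video = total_cost / total_videos
    return per_video * videos_per_year * in_house_factor


external_10k = annual_cost(1_000_000, 82_000, 10_000)
external_20k = annual_cost(1_000_000, 82_000, 20_000)
in_house_10k = annual_cost(1_000_000, 82_000, 10_000, in_house_factor=0.5)
in_house_20k = annual_cost(1_000_000, 82_000, 20_000, in_house_factor=0.5)

for label, value in [
    ("external, 10k videos/year", external_10k),
    ("external, 20k videos/year", external_20k),
    ("in-house, 10k videos/year", in_house_10k),
    ("in-house, 20k videos/year", in_house_20k),
]:
    print(f"{label}: ${value:,.0f}")
```

Dividing exactly gives roughly $12.20 per video, so the yearly figures come out slightly below the rounded $125K and $250K quoted in the thread (about $122K and $244K, or $61K and $122K with the assumed in-house saving).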

Discussion for Proposal 1

With regard to the upload of a non-free mp4, these do not have to be visible on Wikimedia Commons, so the proposal as written is unnecessarily complex or negative.

The upload process puts the file on a WMF server where there are some checks and information like EXIF data is parsed in order to create the page that later appears on Commons. A video pre-processing stage can occur at that point, which can logically either reject the video file due to parsing errors, unrecognized codecs, any other type of error, or successfully release the webm version. The mp4 original could be retained in a file archive as a potential future reference, including the advantage of being able to automatically re-process the video should the pre-processing software be upgraded or indeed being able to release the mp4 should Commons policies change or the nature of the copyrights for that format change. -- (talk) 11:08, 8 November 2019 (UTC)

The software, as written, mostly assumes that the initial verification stage is relatively fast, where transcoding can take a long time (possibly even hours for a long HD video). So making MediaWiki work that way, might involve some effort. That said, i think the effect of what you are saying is good, and perhaps can be accomplished more efficiently with slightly different technical implementation. Bawolff (talk) 02:47, 23 December 2019 (UTC)
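The pre-processing stage described in the two comments above can be sketched as a small decision function. This is a hypothetical illustration of the proposed flow (reject on error, publish a WebM, retain the original in an archive), not how MediaWiki or UploadWizard actually work; all names and the codec sets are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical codec sets for illustration only.
FREE_CODECS = {"vp8", "vp9", "av1"}
ACCEPTED_SOURCE_CODECS = FREE_CODECS | {"h264"}


@dataclass
class Outcome:
    accepted: bool
    published: Optional[str]  # file name shown on the file page
    archived: Optional[str]   # original kept out of public view for later re-transcoding
    reason: str = ""


def preprocess(filename, codec, parses_ok):
    """Decide what happens to an uploaded video before a file page is created."""
    if not parses_ok:
        return Outcome(False, None, None, "parsing error")
    if codec not in ACCEPTED_SOURCE_CODECS:
        return Outcome(False, None, None, "unrecognized codec")
    if codec in FREE_CODECS:
        return Outcome(True, filename, None)  # already free: publish as uploaded
    stem = filename.rsplit(".", 1)[0]
    # Non-free source: publish the transcoded WebM, keep the original archived.
    return Outcome(True, stem + ".webm", filename)


result = preprocess("talk.mp4", "h264", parses_ok=True)
```

As noted above, the transcode itself can take hours, so a real implementation would more plausibly queue the work and publish the file page once the job finishes rather than block the upload request.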

Technical question: When I upload a long, high-resolution video, I use the CPU capacity of a Wikimedia server. Is there any way for the upload website to use my own CPU for video conversion? --D-Kuru (talk) 16:07, 5 January 2020 (UTC)

Theoretically, yes. The problem is, the technology for doing so is mainly used to mine Bitcoin on unsuspecting users' systems (one study found that literally half the use of WebAssembly was to mine cryptocurrency). I'm not completely familiar with WebAssembly, but if there is a way to compile existing conversion code for the web, it's still going to be a lot of infrastructure that has to be written. If there's not, or it doesn't work well enough, doing it would be a huge project. So I wouldn't hold my breath for it.--Prosfilaes (talk) 02:10, 7 January 2020 (UTC)
So this is an interesting question. There are a couple of points I'd like to make:
  • Wikimedia has lots of servers, at least several hundred. Video scaling is a very small part of what they do. Part of the issue here, of course, is not just that transcoding video takes a lot of CPU, but that it takes a lot of CPU in a row (some parallelization is possible, but you can't just split it up amongst 10 different computers to take a tenth of the time).
  • Long ago there used to be a Firefox extension called Firefogg, which integrated with UploadWizard and caused videos to be converted as part of the upload.
  • WebAssembly can be used for this sort of thing. In fact, the reverse of that is kind of how we play videos/audio on iPhones where there is no other option [1]
  • Generally speaking, if we are creating transcoded assets (i.e. the resized videos), we would want to do it on Wikimedia servers for quality assurance reasons. We wouldn't want a vandal to upload a 640p version that is something different from the original file.
Generally speaking I think w:WP:PERF applies here to a certain extent. Bawolff (talk) 11:08, 20 February 2020 (UTC)

Proposal 2: Allow uploading of MP4 files, only provide transcoded WebM files to download/stream.

  • Keep MP4 files, but don't allow downloading or streaming of them.
  • Instead, stream the WebM version of the MP4 file and allow the WebM version to be downloaded.
  • It doesn't promote MP4, as we are disabling download and streaming. Users have no way to get the MP4 file that was uploaded.
  • On August 26, 2010, MPEG LA announced that royalties won't be charged for H.264-encoded Internet video that is free to end users (the H.264 patents expire in 2027).

Votes for Proposal 2

  • Support -- Eatcha (talk) 10:09, 8 November 2019 (UTC)
  • Support Kaldari (talk) 16:00, 12 November 2019 (UTC)
  • Support – Kwj2772 (talk) 20:22, 30 November 2019 (UTC)
  • Support, this would make uploading video files a lot easier as it is the de facto standard. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 19:36, 22 December 2019 (UTC)
  • Support I support this. I don't think the free content movement is well served by making it difficult for people to convert their works to free formats. I am strongly opposed to serving content in patented formats, but freeing content from patented formats seems like an obvious good. Additionally, the Debian project seems to view MP4/H.264 as free enough to distribute transcoding software, which in my mind satisfies the requirement that Wikimedia's software stack should be forkable. Bawolff (talk) 02:52, 23 December 2019 (UTC)
  • Support - Tibet Nation (talk) 18:25, 29 December 2019 (UTC)
  • Support Keep, transcode and make available when the patent issue is gone.--So9q (talk) 10:47, 16 January 2020 (UTC)
  • Support, with the exception of the proposal (all of these proposals, actually) conflating the MP4 container format and non-free video coding formats. A better wording would be "non-free MP4 files" or, even better, specifying the video coding formats in question. MP4 is not non-free. Fully free MP4 files (AV1 + Opus) are possible and are actually served by YouTube, though I'm not sure whether Commons currently accepts them as uploads. -- Veikk0.ma (talk) 15:14, 6 February 2020 (UTC)
  • Support Good idea too. Doc James (talk · contribs · email) 03:32, 10 February 2020 (UTC)
  • Support This is extremely useful. —Justin (koavf)TCM 04:34, 10 February 2020 (UTC)
  • Support This is a brilliant idea, and a great balance between functionality and values. --Pratik.pks (talk) 03:05, 11 February 2020 (UTC)
  • Support Is there a holding pen somewhere where the MP4 files can be saved until the patent expires? Archive.org maybe? Victorgrigas (talk) 19:06, 11 February 2020 (UTC)

Discussion for Proposal 2

There is no advantage in "displaying" an MP4 that reusers cannot access and viewers cannot see. For this reason it's really the same as proposal 1 in my eyes, as the choice of whether to keep an original video in the filearchive is the same issue. -- (talk) 11:12, 8 November 2019 (UTC)

I think the implied difference is that under this proposal the MP4 files would more easily be made available for streaming and downloading in 2027 (once all the patents have expired), without requiring undeletion of all the source files. Kaldari (talk) 16:00, 12 November 2019 (UTC)
Also, saving the file in a MediaWiki way allows re-transcoding if there is some bug in the transcoding process or we need to transcode to new formats later, like AV1 (it's always best to transcode from the original where possible). Bawolff (talk) 02:49, 23 December 2019 (UTC)

Proposal 3: Use the AV1 codec instead of VP9/VP8 as the main codec (AV1 is a newer and better free codec)

  • AV1 is a newer and better free codec than VP9. AV1 can use WebM, MKV, and MP4 as a container.
  • It was developed as a successor to VP9 by the Alliance for Open Media (AOMedia).
  • Tests from Netflix showed that, based on measurements with PSNR and VMAF at 720p, AV1 was about 25% more efficient than VP9 (libvpx).
  • Similar conclusions with respect to quality were drawn from a test conducted by Moscow State University researchers, where VP9 was found to require 31% and HEVC 22% more bitrate than AV1 for the same level of quality.
  • Facebook showed about 40% bitrate savings over VP9 when using a constant quality encoding mode.

Votes for Proposal 3

  • Support -- Eatcha (talk) 10:09, 8 November 2019 (UTC)
  • Support --D-Kuru (talk) 16:04, 5 January 2020 (UTC)
  • Support Victorgrigas (talk) 19:05, 11 February 2020 (UTC)
  • Support --Vulphere 03:00, 19 February 2020 (UTC)

Discussion for Proposal 3

  • There are a lot of arguments here without any evidence. Masum Reza📞 10:20, 8 November 2019 (UTC)

@Masum Reza📞 Here you are:

There does not need to be a proposal agreed for this to proceed as a task on Phabricator. As this is a fairly technical point, and there may be operational issues that are not known here, but might be known to the Phabricator community, it makes sense to start the Phabricator discussion anyway. -- (talk) 11:11, 8 November 2019 (UTC)

  • I think this should be left to the multimedia devs to decide. There's lots of technical considerations involved (Client support, maturity of implementations, different performance requirements for encoding) as well as technical work that someone has to do to make it happen. Bawolff (talk) 02:16, 23 December 2019 (UTC). Addendum: I would add, I doubt this proposal would address the issue. Most users do not know what AV1 is, lack of AV1 support is not the reason we have few videos. Bawolff (talk) 02:54, 23 December 2019 (UTC)
NO ACTION:

Snowball per objections below MorganKevinJ(talk) 01:50, 20 November 2019 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Proposal 4: Integrate an open-source video conversion suite into the Commons Android/iOS app

  • It should facilitate the uploading of videos directly from smartphones, as users wouldn't need to download an extra app to convert video. (It's going to be very slow, BTW, because phones have less processing power.)

Votes for Proposal 4

  • Support -- Eatcha (talk) 10:10, 8 November 2019 (UTC)
  • Oppose The idea is good, but we want high-quality content, not fast-encoded content. That is not possible on mobile devices. Even on a high-end desktop most encodings cannot run in real time. --GPSLeo (talk) 09:46, 9 November 2019 (UTC)
  • Oppose I tried running this on a OnePlus 7TP, and it can't even transcode the videos recorded using the device itself. And of course, you can fry eggs after transcoding 4K videos. Transcoding must be done on WMF servers. --Eatcha (talk) 15:15, 9 November 2019 (UTC)
  • Oppose Legal hell. (actually not so much "hell" as "it's going to cost millions of dollars in patent licensing") - Alexis Jazz ping plz 16:03, 9 November 2019 (UTC)

Discussion for Proposal 4

Devolving processing to the client side may make sense, or not. I would like to see some trials of what the outcomes or (dis-)benefits to the user experience are. Expecting users to wait for an hour for a 4 minute video to be pre-processed and hogging resources on their phone during that time, would not be a workable scenario. -- (talk) 11:16, 8 November 2019 (UTC)


The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Specifying categorizing by file format

After discussion here (Commons:Administrators'_noticeboard#Massive upmerging and deletion), user:Themightyquill proposed clarifying the Commons:Categories policy on categorizing by file format. The proposal is:

--Estopedist1 (talk) 06:29, 28 January 2020 (UTC)

  • I am not against this, but I do not like it. I already think that we have lots of "intersection categories" which should be handled by other tools. We can currently intersect categories with search: open up a search and, under advanced, put in multiple categories like so. However, this is non-trivial and error-prone. But doing it all by hand will create another level of problems. We already have huge trees of categories by subject; we will basically duplicate them all by file type. We also have huge trees by date (where I am trying to help out right now). Currently I am trying not to touch countries which are sometimes written with "the" and sometimes without. It is a horrible mess and any change propagates upwards. But if we also have to propagate every change horizontally into intersections... I pity whoever this task falls upon. What we do need is a tool that allows users to intersect easily and quickly. ℺ Gone Postal ( ) 06:49, 28 January 2020 (UTC)
@Gone Postal: you are the information visionary I would like to be. This proposal is an essential step to clean up our category tree and to get rid of e.g. category:JPG flags by country, category:PNG flags by country, category:Ogg files of classical music for piano--Estopedist1 (talk) 14:44, 29 January 2020 (UTC)
@Estopedist1: Wow, thanks for the compliment :) I do not, however, think that abandoning these categories will happen now. And until we get good tools (not external ones that take several seconds to complete before you realise that they return wrong results, and not something where you have to write an SQL-like query) we will be unable to abandon these categories. ℺ Gone Postal ( ) 18:52, 29 January 2020 (UTC)
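The search-based category intersection mentioned in this thread can be sketched as follows. The `incategory:` keyword is part of the wiki's search syntax; the helper names and the two category names are assumptions chosen for illustration, and `incategory:` matches only files placed directly in a category, not in its subcategories.

```python
from urllib.parse import urlencode


def intersection_query(*categories):
    """Build a search string matching files that are in all of the given categories."""
    return " ".join(f'incategory:"{name}"' for name in categories)


def search_url(query, base="https://commons.wikimedia.org/w/index.php"):
    """Turn a search string into a Special:Search URL."""
    return base + "?" + urlencode({"title": "Special:Search", "search": query})


# Illustrative category names only.
query = intersection_query("Flags of Estonia", "PNG files")
url = search_url(query)
print(query)
print(url)
```

This shows why the approach is described as non-trivial: the user has to know the exact category names and the keyword syntax, which is the gap an easy intersection tool would fill.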

Definition of educational

There are objects of which we have many thousands of pictures, and a large share of these images are of acceptable or good quality. But it would be difficult to find an educational use where this many images are needed. And no one would forbid uploading new images of these objects, and none of these images gets deleted just because it is one of many similar ones. The current policy says that images are in scope if they are used in another Wikimedia project or if they can be used for educational purposes. Educational is defined as "providing knowledge; instructional or informative".
To have many pictures of one object in scope, I would propose adding "showing the real world with every detail" to the definition of educational. I think that would bring many images of the same subject into scope while preventing mass upload of fictional or promotional content. Of course this should also not include media violating someone's human rights. --GPSLeo (talk) 20:48, 28 January 2020 (UTC)

By default the Vulcan philosophy of Infinite Diversity in Infinite Combinations has applied.
It's already sort of in the policies, though guidelines can always improve their wording. -- (talk) 20:57, 28 January 2020 (UTC)
I disagree; "showing the real world with every detail" is already part of educational use. Showing hundreds of largely similar images of the Tower Bridge (which I'm guessing this is in response to) is not showing unique details of anything, and I'd argue we should be pushing back at new uploads of the same object in the same way. A thousand pictures of the Tower Bridge, each mostly not overlapping, showing the Bridge at detail you can't get from File:Tower Bridge from Shad Thames.jpg would be awesome, and I don't think anyone would object to that.
Basically, I don't think the rule adds anything at all, and while it appears to me to be about a subject on which you and I disagree, I don't think it clarifies the disagreement for the larger audience. (I'd also say it's largely moot; nobody, including me, is rushing to delete files from those categories I complained about.)--Prosfilaes (talk) 21:31, 28 January 2020 (UTC)
I also do not think that this would make any major change, but with these words some misunderstandings and ambiguities about "educational" and the project scope could be eliminated. I would disagree that there are enough images of some objects. I see no reason that we cannot store a huge number of images; storage is not very expensive. One example of how images that look useless can still have a use: I used File:Silberteich 18.jpg and File:Silberteich 19.jpg to find the exact location of the WLE finalist File:Silberteich.jpg. --GPSLeo (talk) 11:04, 29 January 2020 (UTC)
If we're going to toss the flood gates open, why not start with ceasing to delete photos people upload of themselves and their bands and the like? That's contentious and frustrating to the uploaders, and they can be neatly stuck away in the category tree and forgotten at little cost.
One of my hiking buddies complained that one of his friends would post photos of their hiking trips, but instead of a selected few, he'd post 300+ images, making them extremely tedious to go through. When I'm in the mood, I can rack up hundreds of photos in a couple days. If I uploaded them all, I think it would worsen the quality of Commons and make it harder to find useful images. I have hundreds of pictures of the desert cottontail; I've uploaded one, not great quality, video, because it's the only video on Commons where you can actually see how the cottontail moves. And if I were to encourage myself and others, it would not be to upload the same tourist pictures; a picture of a local park, as of yet undocumented on Commons, is worth way more than another of the Tower Bridge.--Prosfilaes (talk) 21:41, 29 January 2020 (UTC)
As I said, there are use cases for many pictures of the same object. To find the best ones for the most common uses, we should rather look at improving the quality image assessment procedures. --GPSLeo (talk) 09:30, 30 January 2020 (UTC)
I think adding mediocre, poorly categorized images quickly overwhelms the value of any of those use cases. It takes work to handle them.--Prosfilaes (talk) 01:13, 7 February 2020 (UTC)
If I may add a further opinion: I do agree with Prosfilaes that for objects where we already have hundreds or thousands of highly similar and redundant images, adding even more should be discouraged unless new ones are in some aspect of superior quality or show previously-unseen details. Similarly, I believe that deleting redundant images does serve a purpose. Among hundreds or thousands of images showing the same object, naturally some will be of better, others of worse quality. Now, if an article editor or any other user is in need of an image, having to look through an endless number of images just to find the better ones is difficult and highly time-consuming. On the other hand, if redundant images are continuously sorted out, with inferior ones being deleted and the better ones kept, it is far more easy to find the images that one actually wants to see. GFJ (talk) 19:33, 29 January 2020 (UTC)
There's another option between ignoring and deleting, which I'd call curating. E.g., pick the best photos of the tower bridge, from various observation points and weather conditions. Then put {{Superseded}} on any inferior image that matches one of these, and put all such images in a "superseded" subcategory. --ghouston (talk) 02:50, 30 January 2020 (UTC)
The curation used to be on galleries, not in categories. Pick the best handful of images to illustrate a subject and put them in a gallery. Entry points to topics should prefer the galleries, if we go that route. Other than potentially having different links from Wikidata, seems like that should be just as easy (if not easier) than curating categories. Carl Lindberg (talk) 03:55, 31 January 2020 (UTC)
  • I think that this is a solution to a small problem, which will have serious negative consequences. Here is a hypothetical educational usage: I want to illustrate the difference of winter weather in some village. What I am searching for (in this hypothetical) is an outside photo of any object that was made in December of each year. Your approach would delete many such photos, simply because the objects on them are already illustrated elsewhere, but I was not searching for those objects, I was looking at everything around them (is there snow, how much of it, what kind of snow, etc). So by trying to make it slightly easier for a user to find a good image, you have just made it completely impossible (unless the user is an admin and is able to look at deleted images, but then they have lost all their categorisation, etc). On that ground I am in strong opposition to this proposal. ℺ Gone Postal ( ) 05:39, 7 February 2020 (UTC)
I don't think anyone suggested deleting images that are not actually redundant, such as images that show the same object, but in different seasons/backgrounds/landscapes. The image of a village in summer is certainly not redundant to an image of the same village in winter and both should absolutely be kept, I very much agree with you there. Where I believe that deletions are useful is only with "real" redundancy. If we already have 20 images of an object in the same season, with the same background (including general weather, vegetation, etc.), then we don't need 20 more. In this scenario, with real redundancy that takes backgrounds, etc. into account, deleting inferior images is highly useful. GFJ (talk) 20:17, 7 February 2020 (UTC)
Ok, I have typed an answer, and then deleted it. As it stands, I do not believe that this is a constructive proposal, and your response actually points to the problem with it. Anybody who wants to delete a file will say "Yes, of course we allow all educational media here, but this is a 'real' duplicate." We already have a policy about deleting scaled down duplicates, and everything else is a step in the wrong direction. ℺ Gone Postal ( ) 20:50, 7 February 2020 (UTC)
Have a look at Category:ZooSphere Gymnopleurus sericeifrons. These are my uploads, I think none is redundant, but they all have the same background and every photograph is one of a large set of exactly the same insect. -- (talk) 10:05, 19 February 2020 (UTC)
Once again, this appears to boil down to how one defines "redundancy". The images from the above category indeed show the same dung beetle, but from different angles, thereby highlighting different details. Like you, I would not consider such images redundant. If the angles were the same, then yes - but that's not the case. Obviously, exhaustively defining "redundancy" is nearly impossible and also not very helpful, different users will always have different criteria for what they consider redundant. Luckily, with Commons:DR we do have a system for peer review. I am not in favor of this proposal. What I do want to achieve with my comments is to make the point that even if it may not be too often, there indeed can be cases where non-identical images are so redundant that deletions become useful to make it easier to find the actually good ones. However, as said before, the threshold for redundancy must of course be high. Here's an example: would most of you say that this image is redundant to this one, but of significantly lower quality? GFJ (talk) 12:45, 19 February 2020 (UTC)
WRT example, no they are not redundant. Both are a bit pants if the intention is to illustrate the game though. -- (talk) 12:49, 19 February 2020 (UTC)

FastCCI - Enhancement - Only images from Wikidata

Add a new option that selects only images that are used on Wikidata. This would allow obtaining only one/better/more representative image of each category (within a given parent category) without overloading the browser's memory (which is one of the big problems for many users)--JotaCartas (talk) 01:06, 31 January 2020 (UTC)

Commons:VideoCutTool

Our video cut trim tool is working fairly well (still improvements needed of course).

Wondering about people's thoughts on adding this to the left sidebar for video files, similar to how we have the "CropTool" for images? Doc James (talk · contribs · email) 03:18, 10 February 2020 (UTC)

I would love to have a direct link to this tool from the target video. Having this on the side bar would be great. Thanks -- Eatcha (talk) 03:25, 10 February 2020 (UTC)
Usually to alter the sidebar one edits MediaWiki:Sidebar. But I do not see the tools listed. Doc James (talk · contribs · email) 04:16, 10 February 2020 (UTC)
How many times has it been used?
Independently of whether this gets accepted or not, would it be an idea to bundle some helpful video tools together, and offer it as optional? Then people who want to work with video can just turn them on at once. Effeietsanders (talk) 04:47, 10 February 2020 (UTC)
Probably only been used a dozen times. It is just getting functional. What other video tools do we have to bundle? Doc James (talk · contribs · email) 05:45, 10 February 2020 (UTC)

I would love to have a video editing tool as part of Commons seen under/next to videos, happy to help with things if needed to make it easily available to people. I have no opinion as to whether it should go in the sidebar or underneath the video itself, as long as it's obvious the tool is available (also not sure if there are different requirements for either). John Cummings (talk) 10:00, 10 February 2020 (UTC)

Does anyone know how to do this? User:Steinsplitter? Doc James (talk · contribs · email) 04:45, 20 February 2020 (UTC)

Restrict Computer-aided tagging to autopatrol users until the major problems are solved

The Computer-aided tagging tool is producing many bad edits, especially by new users. (Discussions on: Village pump & Commons talk:Structured data/Computer-aided tagging) There are many feature requests to solve these problems and the AI has to become better too. Because of these problems this tool should be restricted to trusted and experienced users with autopatrol rights for now. When the tool works very well it could be opened to everyone, even IP users. --GPSLeo (talk) 19:30, 15 February 2020 (UTC)

I agree, though this doesn't go far enough. I have an RfC in draft. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:50, 16 February 2020 (UTC)
AI has to become better - We will be making it better by doing the verification job for Google (it's not bad). Reinforcement based learning, rejected tags will not be shown again. How do you think they improve these results? -- Eatcha (talk) 10:46, 16 February 2020 (UTC)
I agree with @Pigsonthewing: that this tool is a nightmare. Tags are of no use, we need precise depict statements only. Please could you ping me Andy once the RfC is open? :-) --Vojtěch Dostál (talk) 13:06, 17 February 2020 (UTC)

Proposal to implement blocking by abuse filters

Following the scoping discussion #Time for abuse filters to block (temporary and permanent) (permalink), a formal proposal for consideration.

One of the standard abilities for abuse filters in mediawiki is to allow blocking of accounts or IP addresses (Block the user and/or IP address from editing) based on criteria in a filter. It has not been something that we have typically needed over the earlier years as we haven't had persistent vandalism or spam. Things have changed, and it is the time for us to move to having blocking functionality available.

[technical detail https://noc.wikimedia.org/conf/highlight.php?file=abusefilter.php and setting $wgAbuseFilterActions['block'] = true;]

If that occurs we also need to define a default period for blocks. I suggest that the default would at least demonstrate that we are looking for a minimal approach, so let that be the most gentle setting. Note, though, that this would just be a default, and a dropdown with other values will always be present for selection.

To have this change made at Commons, we would need to demonstrate a consensus of the community, and lodge a phabricator site request. Noting that this is a technical change, not a policy change to what we block, or to the blocking policy. Accordingly I propose:

  • Wikimedia Commons moves to have enabled the ability to block through its abuse filters.
  • Default periods for blocks to be 2 hours for user accounts, and 2 hours for IP addresses.

I also note that if consensus is reached that Commons administrators will need to work to operational guidance and that is being developed in a separate section, and is outside of the scope of this technical request, and will have a separate consensus.  — billinghurst sDrewth 12:58, 15 February 2020 (UTC)

Support

  • Support as proposer  — billinghurst sDrewth 12:59, 15 February 2020 (UTC)
  • Support --Herby talk thyme 13:30, 15 February 2020 (UTC)
  • Support Christian Ferrer (talk) 19:27, 16 February 2020 (UTC)
  • Support Kaldari (talk) 00:30, 17 February 2020 (UTC)
  • Support unfortunately because there's not enough admins.--BevinKacon (talk) 12:43, 22 February 2020 (UTC)
  • Support But don't allow non technical users to touch the filters Eatcha (talk) 14:57, 22 February 2020 (UTC)
  • Support I've actually thought about making this very proposal before. This can be extremely useful in dealing with LTAs. In fact, my relatively recent dealings with an LTA was what made me think about wanting this. However, there would have to be some restrictions. My original filter has been modified to be much more broad at this point to capture more of the LTA's attacks. Currently, there are false positives. Not many. But even a few being caught by a blocking filter would be too many in my mind. For that reason, admins who use this functionality need to be aware of what they are doing and be willing to monitor their filters extremely closely to ensure that any false positives are dealt with quickly and the filter is modified to preclude them. The filter debugging page can be used to quickly undo errant blocks under such a scenario. A blocking filter is a nuclear option and it should be treated as such. I do believe that it is an option that we should have and I do believe that in very limited circumstances it would be a massive benefit to be able to do but the ramifications of its use need to be fully understood by those that use it and the consequences for misuse should amount to admin abuse. --Majora (talk) 17:33, 23 February 2020 (UTC)

Oppose

  1. The referenced consensus is weak, two supports and general discussion is not convincing. This type of systems decision can and should be made on convincing reports and analysis. We do not have to implement the filter in order to do testing, we can simply run a test of the proposed filter against past contributions and analyse what the impact would be, both positive impact for reducing disruption to this project, and negative impact for possible good-faith contributors. Without this, it is unclear what a "minimal approach" is, or how it would be measured. So, let's have some test reports so the community can vote against more than hypotheticals. -- (talk) 13:13, 15 February 2020 (UTC)
  2. Oppose I have had bad personal experience with vandalism-detecting filters on en.wiki. I do not know what kind of edits are considered vandalism by the bot. I have seen no analysis yet of the proposed filters. What if a quarter of the blocks turn out to be false positives? I do not know, and I feel that nobody knows. After a test run and analysis my vote can change. Taivo (talk) 19:00, 15 February 2020 (UTC)
    @Taivo: Every edit you have been making is already going through every active abuse filter, there is no change involved here. The suggested change is an action that comes from an abuse filter. From your 167k edits, maybe you can explain and relate on your experiences with abuse filters affecting your editing here, I can see about 42 interactions in the logs.

    With regard to the processes, I covered that separately below, and our process would not be getting that criteria, that is why we test and manage. We already know what is happening here. I gave specific links to meta's logs (abuse and block) where there is the process in place and it can be demonstrated what is happening. I perfectly understand a cautious approach, and that is what is being proposed.  — billinghurst sDrewth 09:41, 16 February 2020 (UTC)

"Every edit you have been making is already going through every active abuse filter"
Technically that's true, but many filters exclude administrators and even more exclude patrollers/autopatrollers. So an admin never tripping an abuse filter doesn't mean much. - Alexis Jazz ping plz 12:36, 18 February 2020 (UTC)
  1. Oppose on procedural grounds. This should go at VP and not here. VP has twice the page watchers and is the appropriate place for seeking community consensus. GMGtalk 02:44, 16 February 2020 (UTC)
    @GreenMeansGo: This seems to be resolved now. Kaldari (talk) 00:30, 17 February 2020 (UTC)
  2. Oppose, abuse filters themselves are open to abuse: if an admin doesn't like certain behaviour it can be blocked, and having an automated blocking process will only create a less collegial working environment for everyone. Imagine being a newbie doing his/her first edits and then getting permanently blocked because you triggered some filter. You probably won't even know what filter you triggered, and your only impression of this website is that you're not welcome (for whatever reason); this is something that already happens on other Wikimedia websites. What's worse is when we allow some users (sysops, rollbackers, autopatrolled, etc.) to do some edits but block others for the exact same edits; this just creates an even more unfair system where some users are more disadvantaged than others. We need more human eyes on admin actions, not less. Countless free files already get deleted because someone who doesn't properly understand "Commons:licensing" tags some image as "unsourced" or "speedy" and then an admin just deletes it. So why would we make an already imperfect system even more imperfect? Also, unblocking is a nightmare. It's already difficult to get unblocked now, let alone if you're blocked by an automated system and don't even know what you did wrong, and saying that you understand why you've been blocked and won't do it again is a prerequisite for unblocking, but sysops can still decide to leave the block in place. This will just create a whole lot more problems than it solves. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 10:58, 23 February 2020 (UTC)
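
The dry run asked for in the first oppose above, replaying a proposed rule against past contributions before it is allowed to block anyone, could look roughly like the sketch below. Everything in it is assumed for illustration: the `PastEdit` record, the manual `was_abuse` label, and the toy rule are hypothetical and are neither AbuseFilter syntax nor real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PastEdit:
    user: str
    added_text: str
    was_abuse: bool  # hypothetical label assigned by a human reviewer


def dry_run(rule, edits):
    """Replay `rule` over past edits and count what a blocking filter would do."""
    edits = list(edits)
    hits = [e for e in edits if rule(e)]
    false_positives = [e for e in hits if not e.was_abuse]
    missed = [e for e in edits if e.was_abuse and not rule(e)]
    return {
        "would_block": len(hits),
        "false_positives": len(false_positives),
        "missed_abuse": len(missed),
        # Share of would-be blocks hitting good-faith edits (0.0 if nothing matched).
        "false_positive_share": len(false_positives) / len(hits) if hits else 0.0,
    }


def toy_rule(edit):
    """Illustrative rule only: flag edits that add a typical spam phrase."""
    return "buy cheap" in edit.added_text.lower()


if __name__ == "__main__":
    # Made-up sample edits, not taken from any log.
    sample = [
        PastEdit("SpamBot1", "Buy cheap watches at example.invalid", True),
        PastEdit("SpamBot2", "BUY CHEAP pills now", True),
        PastEdit("Newbie", "Shop sign reading 'Buy cheap tickets here'", False),
        PastEdit("Vandal", "asdfasdf", True),
    ]
    print(dry_run(toy_rule, sample))
```

A report of this shape would give voters the false-positive share that the second oppose asks about, instead of a hypothetical.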

Neutral

Comment

In response to . Umm, I referenced no consensus, this is the discussion for consensus. I mentioned a scoping discussion.

With regard to your request for analysis, there is plenty of evidence of spambots active here, and those attempting to be active here. We have manually been blocking these for years, and this is to stop having to do this manually. This proposal does not change what we are blocking; the only change is moving the processing from manual to automated. This becomes about ensuring that the filters are targeted appropriately, and tuned appropriately for their use, and to agreed measures; some here are close, though they would need tuning to go the next steps. I linked to some of those active blocking filters at Meta, which would be similar, though not exact, and which were performing well on 700+ wikis covered by global abuse filters.  — billinghurst sDrewth 14:27, 15 February 2020 (UTC)

Suggested guidance followed at #Draft of operational guidance for use of blocking by abuse filters. Feel welcome to make suggestions, or ask for clarifications to be made.  — billinghurst sDrewth 14:31, 15 February 2020 (UTC)
@Billinghurst: As I indicated previously, if you are seeking broad community consensus for site-wide changes, you need to transfer these discussions to the village pump. AN is a place for requesting administrator assistance, not a place for building community consensus, and having this discussion here instead of there is out of order. GMGtalk 14:34, 15 February 2020 (UTC)
When I scanned the referenced discussion, it read as a proposal with votes. You mention "general agreement" a few lines in, but the title "Time for abuse filters to block" I read literally. If you want to discount that discussion as no evidence of consensus, that's fine.
However in line with GMG's point, the history here is (1) run a proposal for "general agreement" that people vote on, (2) run a proposal to "implement" that is laid out as a vote, (3) run a proposal for "we would need to demonstrate a consensus of the community", which this presumably is not.
That's 2 votes more than we actually need and seems exhausting for the limited numbers of volunteers that will be interested and know what we are talking about. -- (talk) 20:09, 15 February 2020 (UTC)

Fæ, I wrote the following to the subject line "Time_for_abuse_filters_to_block_(temporary_and_permanent)"

Hi. To me we have some persistent LTAs and enough spambots getting caught against filters, that I think that it is time that we consider the ability to apply blocks with spam filters, either short term application or permanent. The blocking ability is now in place in numbers of wikis, and has been for a number of years and it is not seen as problematic or out of control. If we did go down the path, we would want to look at some concepts and practice around what would, and how would we apply temporary or permanent blocks, though as we already have a good blocking policy and application of that, then it is not about novel concepts of why we are blocking. If there is a general feeling of agreement, then I will put forward a more specific plan. …

So please don't selectively quote or misrepresent what has been said. I said that I would come forward with a proposal, and I have done so. I did not call for votes, and nobody counted votes; they expressed opinions as guidance to my opening statement. I would also like to address the contradiction in some of the argument. It is indicated that this is a limited scope argument for a limited set of people interested and knowing about what we are talking. Yet it is also argued that the conversation should be at another forum where it would be of less interest and less relevance, with a smaller knowledge base, so how does that work? This is an administrator-only action, and there are numbers of administrators who keep away from the area, so how is that going to progress in a more inexperienced forum?  — billinghurst sDrewth 10:08, 16 February 2020 (UTC)

The contributors affected by this change are not limited to administrators. It is weird to limit the discussion or consensus to those in the sysop group, when it is everyone that will be affected by it. From what you are saying here, I don't understand why you are replying to me, because I am not an administrator, so by the logic above, I have no say here on what happens. -- (talk) 13:09, 16 February 2020 (UTC)
Info Comment made before the move from COM:AN -- (talk) 20:02, 16 February 2020 (UTC)
Comment Hard to reply to this. I think the time period should be reduced to 1 hour, that's typically enough for a human to look at it. In my experience even humans have a hard time correctly identifying abuse. Also, these automated blocks should virtually always be partial blocks: user talk should never be blocked, unless user talk page abuse is specifically the target of the filter. I've experienced more than once the situation where I got blocked on a site by some automated process and as a result I also couldn't complain about my block, because I was blocked! And of course, how do we decide what kind of thing will be eligible for an automatic block? - Alexis Jazz ping plz 12:36, 18 February 2020 (UTC)
@Alexis Jazz: To my best knowledge, abuse filters can't partially block users, but blocking talk page access can be modified. I think talk page access should never be blocked using abuse filter.
We will most probably hide these filters so that people won't be able to bypass them. That's especially needed for LTAs. When we're sure that the filter is working properly, an administrator can bring it up on administrators' noticeboard. If there is at least one support and no oppose or there is general consensus, we can enable the blocking feature. These blocks should surely be monitored routinely. I can't think of any other proper way, because logs and code of a hidden filter is not available to non-admins. Maybe we should create an "edit filter managers" user group here as well so that skilled non-admins can work on filters too. Ahmadtalk 15:42, 18 February 2020 (UTC)
@Ahmad252: well I do remember one time (can't remember the exact details now, doesn't really matter) when I wasn't able to contribute to a WMF project. Somebody had misconfigured the abuse filter and basically everybody who wasn't a trusted user was blocked from editing. They hadn't gotten any complaints, because doh, nobody who was affected could file any. That's something we should really avoid. - Alexis Jazz ping plz 15:46, 18 February 2020 (UTC)
  • Comment - If abuse filters are going to block users, I think there must be a bot that notifies users of their blocks properly so that they will know how to appeal their block. Abuse filter doesn't do that, so the blocked user wouldn't know how they can appeal. Ahmadtalk 15:44, 23 February 2020 (UTC)

Maximum block length

This proposal is to set a fixed maximum length for indefinite account blocks, after which administrators are encouraged to remove the block and consider re-applying for another 4-year term if there are specific realistic reasons to expect that removing the block could cause damage to the owner of the account, or to the goals of this project. The maximum block length is proposed to be 4 years.

Context and qualifications

Wikimedia Commons has no standard for a maximum length of edit blocks, no equivalent of the English Wikipedia "standard offer" and though there are blocks that have lasted for more than a decade, there is no such thing as a Wikimedia Commons account ban. This proposal cannot overturn a WMF office action, though for users that are banned on other accounts, reviewing a long term indef Wikimedia Commons block may result in recommending a global ban or asking for a WMF office action to replace the Wikimedia Commons block.

The benefits of adding this to COM:Blocking policy would be to remove blocks that are cosmetic or where the original rationale for the block may have been superseded by later policies and systems, for example, the WMF office lock. There would be no need for the account owner to run an unblock request for this review to occur. -- (talk) 16:41, 21 February 2020 (UTC)

Maximum block length, votes

  • Support as proposer. -- (talk) 16:41, 21 February 2020 (UTC)
  • Oppose No relevant example has been shown. Creation of tons of log entries for unblocking users that don't request anything is useless admin work. If at all, we should create a policy for new blocks only, and have a well elaborated list of possible exceptions. --Krd 17:09, 21 February 2020 (UTC)
  • Oppose Mainly per Krd's first sentence; it seems like work with no obvious benefit. Christian Ferrer (talk) 21:54, 21 February 2020 (UTC)
  • Oppose But mostly on technical grounds. Removing blocks from users who haven't even logged in in ages seems like a waste of time. For deceased users, this may be considered if the account is globally locked to prevent abuse. (no need to continue shitting on someone after they passed away) Also, 4 years is too long if we are hoping anyone will come back. 2 years tops. - Alexis Jazz ping plz 03:25, 22 February 2020 (UTC)
  • Oppose per Krd, but I do like Christian Ferrer's idea below. -- CptViraj (📧) 03:42, 22 February 2020 (UTC)
  • Oppose any users who want to return will already have requested it, otherwise you are free to contact them on their talk page or email to see if they are even still active.--BevinKacon (talk) 12:40, 22 February 2020 (UTC)
  • Oppose They should create a new account and have a new start, much better than living the life of an indef blocked user. Ask yourselves, would you trust a new user who appears really nice or a user with past problems? Thanks for reading my lines.-- Eatcha (talk) 15:02, 22 February 2020 (UTC)
  • Support, I support the idea behind the proposal, just not its technical implementation. For example this person was blocked for 8 (eight) years and has now returned as a net positive for Wikimedia Commons. It is not uncommon for 10 (ten) year old blocks to still be enforced because evasion of the original block by itself is considered to be abusive and these blocks are factually impossible to lift (plenty of admins believe that users shouldn't receive more chances than what they "originally had"); we basically have a culture that says "respect old blocks, regardless of whether there is a 0% (zero percent) chance of the abuse repeating". For years I had thought about proposing "a maximum sanction length", for example at "Commons:Editing restrictions" two (2) extant entries read "They may not use any alternate accounts. They may upload a maximum of three images per week. All images they upload must be their own work. They must give a link to all online sites where they have previously published images they have uploaded. They will not reupload any previously deleted images." These restrictions basically say that they are forever "second-class users" who may not engage in non-disruptive behaviour which is considered normal for other users because of something that happened 7 (seven) years ago. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 11:10, 23 February 2020 (UTC)
  • Oppose Would create further workload for administrators, to the point it would be unworkable. Indef users are welcome to use the request for unblock, if it has been quite some time since they were blocked. Bidgee (talk) 11:25, 23 February 2020 (UTC)
    Huh? How would looking at a maximum of 24 account blocks, most of which will be obvious decisions, be "unworkable"? One admin could do it in about an hour and a half. -- (talk) 11:42, 23 February 2020 (UTC)
  • Oppose The vast majority of people I block are either spammers or vandals - anything other than indef would be silly. --Herby talk thyme 12:18, 23 February 2020 (UTC)
  • Oppose Pointless mission-creep, sorry. No need for it. Rodhullandemu (talk) 12:46, 23 February 2020 (UTC)
  • Oppose Looks like a lot of completely redundant work. --jdx Re: 14:26, 23 February 2020 (UTC)

Maximum block length, discussion

There may be standard exceptions, such as sock accounts where the puppeteer's primary account is the one that should be reviewed. However, these types of exception should be added as guidelines that supplement block policy. Self-blocks or self-requested blocks are ultimately meaningless after 4 years and probably should not be considered an exception, as the account is effectively retired anyway. Blocks on accounts such as those blocked as 'unapproved bot' are again redundant after 4 years, unless the account owner is a known active puppeteer, or is known to intend to resume disruptive behaviour.

Examples of accounts that would be reviewed are User:Juiced lemon (harassment 2007), User:Mbdortmund (deceased 2012), User:Saibo (self block "Commons" 2013). Accounts like User:DcoetzeeBot, which were blocked in alignment with a WMF office action but were never locked by the WMF, should be passed back for the WMF to decide if they wish to block the account. At the time of this proposal, there were just 24 accounts that had been indef blocked for more than 4 years, so doing reviews to consider whether re-blocks are necessary and justifiable is not a significant burden on Administrator time. There are no examples I have seen so far where the account could not be unblocked or the indef block replaced with a global ban or WMF lock. -- (talk) 16:41, 21 February 2020 (UTC)

Question What problem is this trying to address? What is the benefit of unblocking indef blocked accounts after 4 years (or any other fixed length of time)? Why unblock accounts if the user has not requested it? World's Lamest Critic (talk) 18:55, 21 February 2020 (UTC)

Comment What about a maximum duration (of, say, 2 years) when blocking accounts that have actively contributed for at least 6 months and aren't socks? As a way to still allow indef for vandalism-only accounts etc. I think lifting blocks for deceased users is not a good idea. They can't return, ever. If anything, we should perhaps have a special status for such accounts, or simply apply a global lock to prevent any possible future abuse of the account. That would also allow removing any project specific blocks. It would be mostly cosmetic, but calling someone unwelcome after they died is, meh, not nice I guess. - Alexis Jazz ping plz 20:45, 21 February 2020 (UTC)

  • So... why shouldn't we just wait for the user to request unblock? Administrators are required to notify blocked users. Block notification templates also tell users about appealing their block. If someone wants to return, they can appeal. In my opinion, many never consider returning to Commons after, say, 4 years of being blocked. If someone hasn't appealed their block in the last 4 years, I think there is a very low possibility that they ever come back. Ahmadtalk 22:09, 21 February 2020 (UTC)
I don't think it all that unlikely that they'll come back, but I suspect if they haven't created a new account in four years, they aren't going to reopen their old ones, and anyone blocked for four years who has been operating under other accounts is not going to do anything good with a suddenly unblocked account.--Prosfilaes (talk) 22:16, 21 February 2020 (UTC)
@Ahmad252: because in practice I sometimes see admins who feel personally hurt or who refuse to give a user another chance as a result of thinking "once a bad user, always a bad user". I can't even recall any unblock request after a long period of time that was granted. The only actual result from this is that we encourage anyone who gets indefblocked to sock, because requesting to be unblocked is futile. - Alexis Jazz ping plz 22:30, 21 February 2020 (UTC)
@Prosfilaes: Some will come back, but I think they should request unblock first. Unblocking all blocked users will create too many log entries; I'd say it isn't worth it if the user isn't going to come back. As Christian Ferrer said below, we can alternatively accept unblock requests after 4 years automatically.
@Alexis Jazz: I see. I counted them using Jarry1250's template transclusion count; ~ 309 granted requests, 1107 declined (question: if 1107+309 is 1416, then why does Category:Reviewed requests for unblock return 1362 requests?). However, that's just a number; I know it doesn't mean much. Anyways, I think Christian Ferrer's suggestion below is a better choice. In that case, the user will make an unblock request, so that we know that they are still willing to edit. Ideally, however, I'd say we should discuss this more, and try to solve the main problem: unblock requests being declined for no clear reason. I don't know much about this (maybe because I'm somewhat new), but I'd like such a discussion to be started. Ahmadtalk 22:50, 21 February 2020 (UTC)
@Ahmad252: I'm not familiar with "Jarry1250's template transclusion count", link? I do see that hastemplate:"unblock granted" gives 308 results, but the majority of those would have been granted less than a month after being blocked. So your chances of being unblocked after a year or so still seem slim to none. Actually, 308 granted unblock requests.. that seems incredibly low. With ~1400 unblock requests total it doesn't seem terrible, but that number also seems low. I think people either don't know how to request unblock or have no faith (and rightfully so IMHO..) that it'll be granted in many cases.
Two examples that immediately come to my mind are Amitie 10g and Slowking4. Amitie 10g made mistakes, I won't argue about that. But as the Dutch say "waar er twee vechten hebben er twee schuld" which roughly translates to "where two people are fighting, both are guilty". And Ellin Beltz isn't fully free from blame either. Amitie 10g was wrong in the way he treated her, but the complaints weren't hollow. And it's quite a long time ago. Even a request to just get access to the File: namespace again was declined, instead a threat was made to revoke his talk page access. Admins seem to expect people to grovel or something, and even if one did, I doubt their unblock request would be granted.
Slowking4 is a different matter entirely. Slowking4 is a disrupter, no question about that. And sometimes he went too far. But we need people who speak up when they spot a problem! He requested to be unblocked and IIRC that was refused mainly because he wrote m:User:Slowking4/wikicommons has cancer. It's just a list, the title is the primary reason the unblock was declined AFAIK. Now, where I consider myself a bit of a wordsmith, Slowking4 can be more of a butcher. But I have said pretty much the same things and worse as Slowking4 has, I just don't say "x has cancer" and don't resort to disruption, at least not as often as Slowking4 did. Also, as you can tell from that page, Jcb was an important factor that bothered Slowking4. That's not going to be a source of conflict anymore.. - Alexis Jazz ping plz 04:03, 22 February 2020 (UTC)
@Alexis Jazz: That's here. Thanks for examples. I understand (I'd prefer to not go into details here), but I was thinking... What happens after the (semi-)automatic unblock? What if the unblocked user makes a little mistake? I know that blocks must be preventative rather than punitive, but we've already blocked them for 2/4/n years. Should we just block the user indefinitely in this case, or block them for another two/four/n years? Ahmadtalk 15:39, 23 February 2020 (UTC)
@Ahmad252: Unless a user clearly has nothing but bad intentions, I think they should never be blocked for more than 2 years. And if the unblocked user makes a little mistake, that should just result in a warning or short block, as usual. If they continue with the bad behaviour that got them blocked in the first place, leeway would be limited. They would be reminded/warned, but if nothing changes it's another 2 year block. But this would obviously be to be judged on a case-by-case basis. - Alexis Jazz ping plz 18:27, 23 February 2020 (UTC)
  • Comment Despite my oppose vote above, I like the idea of a new chance for blocked users, but I absolutely don't like the systematic aspect. As an alternative, taking things from the other end, maybe an unblock request made 4 years after the block should be accepted automatically. Christian Ferrer (talk) 22:20, 21 February 2020 (UTC)
I'd say reduce that to 2 years and accept any neutral/empty or positive unblock request. Obviously, if someone tries "{{unblock|I want to break stuff!}}", there's no point in granting that. - Alexis Jazz ping plz 22:38, 21 February 2020 (UTC)
  • I personally think that Alexis Jazz and Eatcha said it best: it would be better if blocks could "non-technically" expire. In that case "clean starts" can start again, and then they won't be bothered by people who hate them and will stalk their edits to see if they repeat any mistake they made in the past. If a new user registers an account and makes a few mistakes, most people will try to talk to them and teach them what is and isn't acceptable. But if a previously blocked user makes a mistake, it doesn't matter that they were previously blocked for uploading copyright violations; massively categorising images in a wrong manner now means that they should be blocked because "disruption is disruption". Vandals will be vandals and they are blocked on sight, but a young child blocked for constantly uploading selfies and then returning in a few years to upload high quality images that are within scope should not be punished for "socking". --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 11:17, 23 February 2020 (UTC)
  • There are many cases where an indefinite block is appropriate. Here we're talking about active contributors, often established contributors. I agree that in practice it's often better to block "only" for 5 years, because we have no idea how the person might be after such a long time. Indefinite blocks work only if we implicitly assume that the person may come back with a completely different identity and improve their conduct so drastically that nobody will notice they're the same person (or not without looking very hard). Some speedy process to consider unblocks could be interesting: it doesn't need to be super bureaucratic, maybe it's enough to ask that the user requests the unblock, at least N other users endorse it (1? 2?), and an admin administers the unblock. We could make a list of users who've been indefinitely blocked for several years and who are still active on some other Wikimedia wiki, just to get an idea. Nemo 16:21, 23 February 2020 (UTC)