Commons:Village pump/Proposals/Archive/2016/04

From Wikimedia Commons, the free media repository
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Default action for the search field


Any search in the Vector top right search box should lead to a search, even when the exact page name exists.

Previous discussions

A previous discussion, with a Whoopi Goldberg sample, led to a consensus to modify the behavior.

Meanwhile, Vector has been deployed. It changes the behavior of this search box, adding a "jump to" feature. We would lose it if we accepted the proposal (i.e. the suggestion would still be shown, but when you select the page and press Enter, you would reach a search page).

Yet, this jump feature is probably very little used (I need to demonstrate it and explain its value at each edit-a-thon or workshop I attend, whenever someone tells me they have difficulty finding the exact title of a help page). --Dereckson (talk) 12:32, 4 April 2016 (UTC)


  • Oppose. I use this box to navigate on Commons help pages. For example, I write "COM:LIC" to reach Commons:Licensing, or "com:vis", then "help:vis" to reach Help:VisualFileChange.js. The proposed behavior would kill this "jump to" feature. Furthermore, the concern could be addressed if, on gallery pages, we add a visible template with a link to the category for more pictures. --Dereckson (talk) 12:32, 4 April 2016 (UTC)
  • Symbol oppose vote.svg Oppose per Dereckson. --Thibaut120094 (talk) 13:29, 4 April 2016 (UTC)
  • I find it pretty annoying when I search for content and end up at a "gallery" page that fits my search term (as that gallery is usually hopelessly outdated, completely unrelated or both). On the other hand, I use the "jump to" feature quite often in the way Dereckson describes above. Not only for COM: and help:, but also for quickly reaching Templates or Categories. I'd hate to see that disabled, so I Symbol oppose vote.svg Oppose just switching that off. What about the following: If the search term starts with a namespace or a common abbreviation thereof (such as Commons:, COM:, Category:, Template: etc.), the behaviour stays as it is. Otherwise, always display search results instead of supposedly matching gallery pages. Or at least display categories instead of galleries. Galleries are dead, the Goldberg Example still holds: currently 18 files in Category:Whoopi Goldberg vs. 5 in Whoopi Goldberg. --El Grafo (talk) 13:52, 4 April 2016 (UTC)
  • Symbol oppose vote.svg Oppose per Dereckson, the search box should open a search page, not a gallery. -- Rillke(q?) 14:14, 4 April 2016 (UTC)
    @Rillke: so you actually support to change the status quo? --El Grafo (talk) 15:14, 4 April 2016 (UTC)
    Actually Rillke, as El Grafo highlights, you have the choice between jump + go to gallery if a gallery page exists, or no jump but search. So the dilemma is more "Allow quickly jumping to a page, or favour search?" --Dereckson (talk) 18:21, 5 April 2016 (UTC)
    I use Monobook, I want no "I'm feeling lucky" (jump) search effect, I like some galleries, and I hate the search as is (minimal requirement: content vs. technical vs. all). Makes sense? –Be..anyone 💩 06:03, 17 April 2016 (UTC)
  • Pictogram-voting-question.svg Question @Dereckson: I recently asked for help because (I use the Monobook skin, fwiw) each time I used the search box to enter, for example, COM:L, I ended up on the "Did you mean" page, which was pretty annoying for me, so I had to add some code to my common.js to actually make the search box work as I desired. I'd like it if, when I write a valid page title or a shortcut, the system sent me to the page (i.e. if I write COM:L, the system opens Commons:Licensing). Is that what you're proposing? Thanks. --MarcoAurelio 14:22, 4 April 2016 (UTC)
    If you use vector instead, typing COM:L will already bring you to Commons:Licensing. The original 2009 proposal aims at this: When you type "Clown" you will end up at Clown (at least with vector). But gallery pages like this are mostly useless, so it would be better to end up at a search results page instead. In 2009, many people agreed to that (and personally, I still agree today). The reason people are opposing now is that COM:L → Commons:Licensing wouldn't work anymore. Also, typing "Commons:Licensing" would lead you to a search results page instead of Commons:Licensing. At least that's how I understand it. --El Grafo (talk) 14:53, 4 April 2016 (UTC)
    Hi @El Grafo, thanks for your reply. Yes, I'd hate to lose the COM:L ↔ Commons:Licensing jump. That was the reason for this thread at the VP some weeks ago. --MarcoAurelio 17:01, 4 April 2016 (UTC)
    Wow... GoButton=true; I feel literally hours of past frustration over months just evaporating. Thanks, MarcoAurelio! Storkk (talk) 22:45, 16 April 2016 (UTC)
  • Pictogram voting comment.svg Comment I suppose "simply" moving all gallery content to a dedicated Gallery: namespace would solve this dilemma. It's kind of stupid to have them as our "main namespace" anyway … (running for cover now) --El Grafo (talk) 15:11, 4 April 2016 (UTC)
IMO long overdue.    FDMS  4    16:26, 4 April 2016 (UTC)
That's a good idea to solve that. --Dereckson (talk) 18:22, 5 April 2016 (UTC)
Just wondering, if galleries are moved to a "Gallery" namespace, what should be in mainspace instead? --Zhuyifei1999 (talk) 06:17, 17 April 2016 (UTC)
@Zhuyifei1999: Do we have to use it for something? I'd say: Just abandon it. The more important question would probably be: Can a move like this realistically be done? Lots of pages would have to be moved and lots of things might break (templates on other projects, stuff on Wikidata …). --El Grafo (talk) 12:26, 18 April 2016 (UTC)
I've never seen such a move done before. Depending on how "used" our galleries are, such a move may be more or less easy --Zhuyifei1999 (talk) 12:54, 18 April 2016 (UTC)
Well, most of the uses on other projects probably come through templates like en:Template:Commons – shouldn't be too much of a problem to change them. On Wikidata, d:Property:P935 would need to be changed. The site links in the "Other sites" section of the right sidebar on individual entries might be more of a problem – but most of them should probably point to categories anyway *coughh*. --El Grafo (talk) 13:47, 19 April 2016 (UTC)
How about just deleting galleries which don't add anything over the category? I think it's OK to have them redirect to a category as well. It used to be more common to maintain them, and they could certainly be the Commons "article" for a subject -- a curated selection of images on a topic. On the larger question, even Google has a separate "I'm feeling Lucky" button if you want to go straight to an article. Maybe if there is a colon in the search term, we assume the user is searching for a particular page and go that way if there's an exact match, but if just a bare word, then perhaps default to a search result? Carl Lindberg (talk) 13:38, 18 April 2016 (UTC)
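
The compromise suggested by El Grafo and Carl Lindberg above (jump only when the term carries a namespace prefix and an exact page exists, otherwise show search results) could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: the prefix list, the function name `route_query` and the `page_exists` callback are all assumptions made up for the example.

```python
from typing import Callable, Tuple

# Hypothetical prefix list; a real wiki would read it from its namespace
# and shortcut configuration rather than hard-coding it.
NAMESPACE_PREFIXES = ("commons:", "com:", "category:", "template:", "help:")


def route_query(query: str, page_exists: Callable[[str], bool]) -> Tuple[str, str]:
    """Decide whether a search-box entry jumps to a page or opens search results.

    Returns ("jump", title) only when the term starts with a known namespace
    prefix AND a page with exactly that title exists; otherwise ("search", term).
    """
    term = query.strip()
    has_prefix = term.lower().startswith(NAMESPACE_PREFIXES)
    if has_prefix and page_exists(term):
        return ("jump", term)
    # Bare words such as "Clown" always go to search results,
    # even if a gallery page with that exact title exists.
    return ("search", term)


# Toy stand-in for the wiki's title lookup (assumption for the example).
pages = {"COM:LIC", "Commons:Licensing", "Clown"}
exists = lambda title: title in pages

print(route_query("COM:LIC", exists))
print(route_query("Clown", exists))
```

With this rule, shortcuts such as COM:LIC keep working as a jump, while a bare word that happens to match a gallery title lands on the results page.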

Hide button for editing change tags on history pages

Please see Commons:Administrators' noticeboard#Hide button for editing change tags on history pages. Poké95 06:21, 15 April 2016 (UTC)

Archived to Commons:Administrators' noticeboard/Archive 57#Hide button for editing change tags on history pages. -- Rillke(q?) 05:46, 20 April 2016 (UTC)
@Rillke: Does it mean that we can now file a report at Phabricator? Poké95 01:01, 22 April 2016 (UTC)
Either we do so (phabricator report) for taking away the changetags rights for non-admins or we locally simply hide the button through MediaWiki:Common.css. If we do the former, we should probably make clear that the decision is solely because there is currently no tag that is relevant to anyone and that we do not generally want to disallow everyone to change tags. This is important to avoid bureaucracy if a tag is created (e.g. a check-for-vandalism tag) that everyone might have to apply. -- Rillke(q?) 08:40, 22 April 2016 (UTC)

Tabular data storage for Commons!

During the last hackathon I created a new on-wiki tabular storage described in T120452, similar to CSV and TSV formats. It allows any user to create a page, e.g. "Data:List of interesting facts.tabular" (demo), and keep it as a table, rather than wiki text. Tabular storage allows strings, numbers, Booleans (true/false), and "localized strings" – a string that has different value depending on the language. Additionally, tabular data stores metadata, such as description (localized) and license. More metadata can be added as needed.

Tabular storage greatly simplifies storing data for lists, tables, and graphs. Graphs may directly access tabular data, and on-wiki tables and lists can be created by using simple Lua scripts. This storage is fundamentally different from Wikidata, because it works with "blobs" (batches) of data, whereas Wikidata works with tiny "facts". Wikidata technology is simply not suited for large storage such as the list of the most expensive paintings, the shoe size comparisons table, or data to plot Moscow subway growth graph.

After a long discussion, it seems Commons is the best fit for such data. Commons community already has good experience with international multi-licensed content. The current proposal is to create the data namespace on Commons, and use it from all of the wikis.

Feel free to experiment with it at Note that you can view it with different languages, e.g.

Technical notes: When storing, the data is validated and stored as JSON, so there are no delimiter problems common to the traditional CSV/TSV files. At this point, the wiki editor shows tabular data as JSON, but very soon I hope to have a CSV/TSV editor to simplify copy/pasting, and afterwards – a full scale spreadsheet table editor. Eventually, I would also like to implement Q number support, allowing direct links to Wikidata. --Yurik (talk) 19:43, 19 April 2016 (UTC)
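
To make the idea of typed, validated tabular pages more concrete, here is a minimal sketch in Python. It is not the extension's actual format or code: the JSON field names (`license`, `description`, `columns`, `rows`), the type names and the helper functions are assumptions invented for the example. It only illustrates the behaviour described in the proposal: every column has a declared type, invalid cells are rejected, and a "localized string" resolves to a different value depending on the language.

```python
import json
from typing import List

# Hypothetical page content; the field names are illustrative only and are
# NOT claimed to match the real on-wiki schema.
PAGE = """
{
  "license": "CC0-1.0",
  "description": {"en": "Sample facts", "fr": "Faits"},
  "columns": [
    {"name": "label", "type": "localized"},
    {"name": "year", "type": "number"},
    {"name": "verified", "type": "boolean"}
  ],
  "rows": [
    [{"en": "Line 1", "fr": "Ligne 1"}, 1935, true],
    [{"en": "Line 2"}, 1938, false]
  ]
}
"""


def _is_localized(value) -> bool:
    # A localized string is modelled as a non-empty {language code: text} map.
    return (isinstance(value, dict) and bool(value)
            and all(isinstance(k, str) and isinstance(v, str) for k, v in value.items()))


CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "localized": _is_localized,
}


def validate(page: dict) -> List[str]:
    """Return a list of error messages; an empty list means the page may be saved."""
    errors = []
    columns = page.get("columns", [])
    for r, row in enumerate(page.get("rows", [])):
        if len(row) != len(columns):
            errors.append(f"row {r}: expected {len(columns)} cells, got {len(row)}")
            continue
        for col, cell in zip(columns, row):
            check = CHECKS.get(col["type"])
            if check is None or not check(cell):
                errors.append(f"row {r}, column {col['name']!r}: not a valid {col['type']}")
    return errors


def localize(value: dict, lang: str, fallback: str = "en") -> str:
    """Pick the text for `lang`, else the fallback language, else any available one."""
    if lang in value:
        return value[lang]
    if fallback in value:
        return value[fallback]
    return next(iter(value.values()))


page = json.loads(PAGE)
print(validate(page))
print(localize(page["rows"][1][0], "fr"))
```

Because the cells are JSON values rather than delimited text, a comma or tab inside a string needs no escaping, which is the delimiter problem mentioned in the technical notes.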

Symbol support vote.svg Support Strongly support this, as well as arbitrary (not just as CSV backend) JSON data storage --Ilya (talk) 21:44, 19 April 2016 (UTC)
Thanks Ilya, JSON support is fairly easy to add, but we should be very clear of what will be the use cases - arbitrary json is harder to work with in a generic manner. --Yurik (talk) 21:51, 19 April 2016 (UTC)
If it's as simple as the TemplateData I should grok it. –Be..anyone 💩 23:44, 19 April 2016 (UTC)
Symbol support vote.svg Support useful and no significant risk. Ijon (talk) 01:05, 20 April 2016 (UTC)
Symbol support vote.svg Support this might make a better solution for list articles, such as "fellows of royal society". Slowking4Richard Arthur Norton's revenge 03:13, 20 April 2016 (UTC)
Symbol support vote.svg Support Would be a great way to have 3D molecular data (X-ray coordinates, etc.) that could be accessed by external renderers to generate images, or if we eventually get mw:Extension:Jmol or similar rendering gadget on-wiki. DMacks (talk) 03:53, 20 April 2016 (UTC)
Symbol support vote.svg Support I disagree on the most expensive paintings list, because theoretically each of those paintings is notable enough to have a Wikipedia article and therefore can be on Wikidata with a "significant event=sale @ sale price" which can then later be queried to produce such a list. That said, this would be great to have in order to crowdsource indexes to existing Wikisource books (such as all the mentions of all the artists in Vasari on Italian Wikisource, etc). Jane023 (talk) 06:06, 20 April 2016 (UTC)
Symbol support vote.svg Support This feature is useful. -- Poké95 06:20, 20 April 2016 (UTC)
Pictogram-voting-question.svg Question Still some open questions for me: What about meta data? You'd need a way to list your sources etc. Commons:Structured data is probably still far away, but it seems they are still working on this and it should probably be kept in mind. How would we handle categories with that? Good thing in general, though! --El Grafo (talk) 10:15, 20 April 2016 (UTC)
The real thing is COM:MRD, the "structural" page appears to be WMF "what is a wiki" spin. –Be..anyone 💩 11:13, 20 April 2016 (UTC)
Yes, COM:MRD is what matters now. If done right, Commons:Structured data could become so much more than that, but we'll have to wait and see. --El Grafo (talk) 12:34, 20 April 2016 (UTC)
El Grafo, please see the first paragraph - the meta data is stored on the page as well. We should probably start with the "license" and "info" tags at first (see demo), but it will be very easy to add any other data that community needs, such as "source", etc. Suggestions are welcome :) --Yurik (talk) 18:14, 20 April 2016 (UTC)
Ugh, sorry, totally overlooked that … --El Grafo (talk) 19:08, 20 April 2016 (UTC)
Pictogram-voting-question.svg Question To me it seems like a form of wikitable that can be accessed by other tools. I Symbol support vote.svg Support the idea in general but I do not understand the details. What namespace will this data go into: gallery, file or some new, not yet created namespace? I expect we might have many such pages. Is it like a file, in that it should be accompanied by metadata similar to files (author, source, license) but without the separate file part? Or is it more like a wikitable in a Wikipedia page, where the license is set and authors can be looked up in the history? Will we be storing tabular data we copied from some places on the web? Or is it more for stuff generated by Wikipedia users? Do we need to alter COM:SCOPE? --Jarekt (talk) 12:26, 20 April 2016 (UTC)
Based on the example given above it looks like the plan is to create a new Data: namespace. Makes sense to me. No idea about the other questions though, would be very interested to hear about that as well! --El Grafo 12:38, 20 April 2016 (UTC)
Jarekt, El Grafo, yes, new Data namespace, with the possibility of having different data types, e.g. tabular, geojson, or generic json, depending on the page name extension. The .tabular is similar to wikitable, but unlike wiki text, it is "structured" instead of "free form", and therefore machine-readable. This means that each column must have a well defined data type, such as number, boolean (true/false), or a string. The tabular pages will not allow saving of invalid data. The data types will allow wikidata Q numbers in the future as well - this way when they display, they can automatically link to the original item, and use Wikidata translations. We can define any meta fields as well, and they can be made machine-readable from the start too - suggestions are welcome. Unrelated to this, we can even go as far as creating a new "Data:*.file" pages to store all the machine-readable meta data for the files, unless wikidata wants to handle this usecase. --Yurik (talk) 02:34, 21 April 2016 (UTC)
Symbol support vote.svg Support Well-conceived, useful, cross-wiki. I have wanted this! I'll benefit from clear documentation on the variations of JSON and CSV and which ones are most appropriate for commons. 100% strong support! -- econterms 13:00, 20 April 2016 (UTC)
Symbol support vote.svg Support Although I think in a future we should move only some data to Wikidata. I think some data should remain here and other moved, and the moving process should be gradual. Here we should have more flexible and a-semantic data, and in Wikidata semantic data.--Eloy (talk) 15:32, 20 April 2016 (UTC)
Pictogram-voting-question.svg Question Would translation happen on the Commons or individual wiki side? Also do you have a beautiful illustration of how this could work with maps? czar 16:19, 20 April 2016 (UTC)
Czar, translations will happen on Commons. For the "localized" string type, it will be stored inside the tabular page. Once WikiData Q-numbers are implemented, their translations will stay at Wikidata. There will be no "Data" namespace on other wikis, because there won't be any local data pages stored there. Only Lua modules and graphs (maps?) will be able to use that data directly. --Yurik (talk) 18:14, 20 April 2016 (UTC)
Symbol support vote.svg Support I wonder if it would also be possible to actually upload CSV files directly to commons, and use them in the same way that the Wikibased tables are used. --Denny (talk) 17:10, 20 April 2016 (UTC)
Symbol support vote.svg Support Seems useful. Yann (talk) 17:13, 20 April 2016 (UTC)
Pictogram-voting-question.svg Question will diff function need to be updated? (e.g. show row and column numbers of cells) — Preceding unsigned comment added by Erik Zachte (talk • contribs)
It could be, but even now JSON diffs are already easy to read because they are shown after being normalized. Take a look in the demo page. But yes, it's a nice to have :) --Yurik (talk) 18:14, 20 April 2016 (UTC)
Symbol support vote.svg Support Excellent idea. Full support. Will this be usable to generate heat maps? Doc James (talk · contribs · email) 18:25, 20 April 2016 (UTC)
Symbol support vote.svg Support --Geraki TLG 18:55, 20 April 2016 (UTC)
Of course, Symbol support vote.svg Support. But we should think about some naming structure - sports editors could potentially use the Data namespace for this kind of data storage, which can be used in all projects later. --Edgars2007 (talk) 20:24, 20 April 2016 (UTC)
Symbol support vote.svg Support Sounds good to me :) Jean-Fred (talk) 22:09, 20 April 2016 (UTC)
Pictogram voting comment.svg Comment Traditionally, Commons has been for media in concrete fixed visual, audio, or audio-visual form, but not really for abstract data which could be visualized in any number of ways (as discussed in the archives of Commons talk:File types and elsewhere). But I guess if you're establishing a separate namespace, then that's kind of a whole new ballgame (past proposals have generally been to allow the upload of LibreOffice .ODS files or similar)... AnonMoos (talk) 08:50, 21 April 2016 (UTC)
Pictogram voting comment.svg Comment I'm fine with it, but why NOT in a Data namespace on Wikidata, btw? Because I'm a bit worried about increasing confusion for the 'general public'. --TheDJ (talkcontribs) 09:36, 21 April 2016 (UTC)
TheDJ, I thought so too, but Lydia Pintscher (WMDE) from Wikidata gave some interesting reasons why Wikidata is not a good place. --Yurik (talk) 11:23, 21 April 2016 (UTC)
Symbol support vote.svg Support, unless the DJ's suggestion of a Data namespace on Wikidata is deemed more appropriate. Also, be mindful of W3C standards for CSV data. Andy Mabbett (talk) 10:57, 21 April 2016 (UTC)
Pigsonthewing, thanks for the link. It seems that standard applies more to the proper CSV/TSV files, plus the schema to validate them. For our use case, I think Wikidata's data types are more appropriate in the long run, but it might take some time to implement them. --Yurik (talk) 11:23, 21 April 2016 (UTC)
Yurik Presumably (hopefully!) it will be possible to download such files as "proper" .csv or .tsv files. If so, they should comply with the W3C standards. Andy Mabbett (talk) 14:39, 21 April 2016 (UTC)
Pigsonthewing, agree, should be possible to do this relatively easily, at least for the export. Wouldn't be the first priority though - only after editing is improved. --Yurik (talk) 17:04, 21 April 2016 (UTC)
Symbol support vote.svg Support looks very helpful. --Derzno (talk) 09:15, 23 April 2016 (UTC)
Symbol support vote.svg Support I can see the potential. Natuur12 (talk) 19:22, 23 April 2016 (UTC)
Symbol support vote.svg Support looks OK. --Steinsplitter (talk) 19:51, 23 April 2016 (UTC)
Symbol oppose vote.svg Oppose A good feature but wrong place. This would be a nice extension on Wikidata.--Avron (talk) 07:56, 24 April 2016 (UTC)
@Avron: Hello, but however, if we put this feature on Wikidata, there will be 7 disadvantages. See m:User:Yurik/Storing data for details. Thanks, Poké95 09:46, 24 April 2016 (UTC)
I don't understand the use case. I don't see a general difference between your examples, most expensive paintings or Moscow subway growth, and the examples on Wikidata, Mountains over 8000 elevation or Population in Europe after 1960. So when you say "Wikidata technology is simply not suited for large storage such as the list of the most expensive paintings, the shoe size comparisons table, or data to plot Moscow subway growth graph", I just can't believe it. At the moment, in my opinion, it is an attempt to obtain a dumb table while bypassing Wikidata, which could do the job in a better and smarter way. But please convince me with a comprehensible use case. --Avron (talk) 12:27, 24 April 2016 (UTC)
Symbol support vote.svg Support would be great for open data and wikimedia projects. Mike Linksvayer (talk) 20:08, 24 April 2016 (UTC)
Symbol oppose vote.svg Oppose The editing interface sucks. Not Excel / Open Office compatible (delete 'em on enwiki). I'd like to see this developed as it's clear that this data will never meet Wikidata's unofficial notability criteria.

There's also POV and COI issues, I saw a solar installer comparison table with stated and a very good educated guess on actual deliverable (e.g. Generic Square vs custom hexagonal cell). I'm uncertain if the always-short-staffed Commons is ready to battle with corporate FUD departments. Wikidata escaped this by being a useless (to SEOs) copy of Wikipedia.

Finally, there's the question of notability: Are price lists allowed? Is unverified information allowed to be published (e.g. amateur water quality testing)? Personally identifiable information (e.g. public tax records)? How about very niche things? Dispenser (talk) 14:44, 25 April 2016 (UTC)

Dispenser, I agree that a proper UI is needed for this feature for it to become truly user-friendly, as well as some easy way to import/export into CSV text. But editing interface is a "nice to have", whereas "storing structured shared data" is a base requirement, so I would like to release early, release often, rather than build something shiny, and only then realize that it is not what community needs. --Yurik (talk) 17:47, 25 April 2016 (UTC)
Symbol oppose vote.svg Oppose Out of scope. "The aim of Wikimedia Commons is to provide a media file repository", but tabular data are not media files. I also have a hard time with the claim that Meta:User:Yurik/Storing data is a long discussion when all but one sentence was contributed by the proposer as of the time the proposal was posted.--Ahecht (talk) 17:32, 25 April 2016 (UTC)
Ahecht, this was a long discussion. I simply summed it up on my page. See also T124569 and T120452.
As for the scope argument - one of the main use cases I see for this feature is to draw graphs. At the moment, most graphs contain data internally, making it very hard to edit by most users. Also this causes graphs to be non-localized, and to be duplicated across multiple projects. Commons was created to help with exactly that problem. For reuse and simplicity of data editing, the data is split up from the graph into two parts - the pure data and the transformation of that data into an image or an interactive object. So data is basically a required component to draw such charts and graphs. And BTW, any SVG has the same dilemma - because they contain "what" to draw (data), and "how" to draw it (style). --Yurik (talk) 18:09, 25 April 2016 (UTC)
Yurik, the scope is crucial in this proposal, so please take a look at my posting about possibilities within Wikidata. --Avron (talk) 15:01, 26 April 2016 (UTC)
I have no problem with graphs being generated from data stored somewhere on Wikimedia servers, but that data itself should come from somewhere like wikidata (and yes, I understand the current limitations on putting it there, but the scope disparity on Wikidata is much smaller than the scope disparity here). The SVG is a bit of a red herring, since it is one file that contains both the data and the style comingled, and in most cases the "data" is canvas coordinates useful only for visual presentation. What you're proposing is arbitrary data that can be reused in multiple formats, only one of which is graphical, that would better serve the project by being accessible outside the context of just media. --Ahecht (talk) 16:28, 26 April 2016 (UTC)
Symbol support vote.svg Support Was just thinking Commons should do this the other day! I can think of a lot of sources of valuable, freely-licensed data that would be great for Commons. -IagoQnsi (talk) 18:25, 25 April 2016 (UTC)
Symbol support vote.svg Support My support is very weak, but I’m not against. I still believe it would be much better idea to adapt WikiData to be able to store large datasets instead of keeping data in Commons. However, if Commons is the service to store such data, it’s important to provide some visualization tools to embed the data into articles in a useful way: charts, maps etc. Otherwise the whole concept makes no sense to me. --Wikimpan (talk) 20:04, 25 April 2016 (UTC)
Symbol oppose vote.svg Oppose, Commons is for storing media files, not data files. Data files can be stored on Wikidata (a free, collaborative, multilingual, secondary database, collecting structured data to provide support for Wikipedia, Wikimedia Commons, the other Wikimedia projects) or Meta-Wiki (for coordinating amongst the Wikimedia projects). --shizhao (talk) 00:37, 26 April 2016 (UTC)
Shizhao, please take a look at my reply to Ahecht above. --Yurik (talk) 02:41, 26 April 2016 (UTC)
GA candidate.svg Weak support, no indication that it will be really as simple as TemplateData, but Brion likes it. –Be..anyone 💩 03:18, 26 April 2016 (UTC)
Symbol oppose vote.svg Oppose full per Shizhao. --Alchemist-hp (talk) 08:41, 26 April 2016 (UTC)
Alchemist-hp, please take a look at my reply to Ahecht above, and let me know if it makes sense. --Yurik (talk) 14:36, 26 April 2016 (UTC)
Symbol oppose vote.svg Oppose Arbitrary data is out of scope for commons. Maps and diagrams may use their corresponding more specialized file types --Zhuyifei1999 (talk) 08:24, 27 April 2016 (UTC)
Zhuyifei1999, there is a fundamental problem with storing maps and diagrams as files. Both are based on some other data (diagrams - on some statistics, maps - on some central repository of geographical shapes like OSM). Which means that whenever underlying data changes, you have an enormous task of updating all the diagrams and maps, and re-uploading them, instead of simply updating a single changed data item and have MediaWiki automatically regenerate images for you. --Yurik (talk) 14:18, 27 April 2016 (UTC)
@Yurik: IMO, arbitrary data do not seem to fit for MediaWiki extensions to generate maps/diagrams (because of the arbitrariness). If this were to provide integrated translations for maps/diagrams in "File" namespace, so as single changed data item (eg. a data point position change) goes to the file, and MediaWiki generates all the translations for the file based on the data in the new namespace (just as how Timed Texts provides subtitles), I would have no problems with it. (Yes, SVG translating is a pain right now.) Non-media-related data is out of scope and should go to WikiData. --Zhuyifei1999 (talk) 16:31, 27 April 2016 (UTC)
Obviously, this feature would involve altering the scope of either Wikidata or Commons, somewhat. It would be the creation of a new namespace on one project or the other with slightly different material than has existed previously. But if this opens up really good new possibilities for educational use elsewhere, it needs to go somewhere. It is not helpful if both projects say "it should go on the other one". There could be some new licensing weirdnesses, since most current media doesn't have collaborative editors and the license of additional edits could be an issue. But, probably like SVGs, additional edits would probably be assumed to be under the same license as the original documented one. Most of the time though, the data would not be copyrightable (though the selection and arrangement of the column titles actually *can* be), and there are the EU sui generis database rights which could be involved for EU data. But... probably all things that can be handled, and the licensing stuff is probably more easily dealt with on Commons. Carl Lindberg (talk) 14:24, 27 April 2016 (UTC)
Agree, Clindberg, this will be an expansion of scope, and I feel Commons community is much more versed in licensing issues than Wikidata community (which never dealt with any non-PD data AFAIK). --Yurik (talk) 15:21, 27 April 2016 (UTC)
BA candidate.svg Weak oppose I can see why we might be the best community to expand our scope, however I can foresee major sourcing issues that may require policy and even cultural changes. We do deal with some factual information here, what springs to mind is mostly maps and flags, and these can be enormous sources of contention. For maps, my impression is that we kind of ignore the problem, essentially foisting factual disputes onto the map maker to solve (see, e.g. Special:Diff/192128695/194080394: where TUBS is somehow expected to adjudicate a contentious border dispute). For flags, perhaps Fry1989 could weigh in with some good file histories to gawk at. If we host factual information, we will need to adopt some kind of w:WP:V/w:WP:NOR/w:WP:RS policies, and that seems to be a big ask, IMO. Storkk (talk) 16:19, 27 April 2016 (UTC)
I apologise, but I don't understand what this table does. I am confused. Fry1989 eh? 16:30, 27 April 2016 (UTC)
The proposal is to allow Commons to store tabular data (think spreadsheets) that can be used in sister wikis, and it would look like this. I think this has the potential to cause us major headaches of the type that up until now we only see in various corners that deal with "facts" like exactly what a flag looks like. I called you out specifically because I have seen you advocate in the past for correct versions of flags, so you have some experience dealing with these types of things, and I know there are files out there with many dozens of reverts. I don't think we have policies in place to manage the disputes that this new "file format" would seem to specifically invite, and I think enacting those policies might not be a great idea. Storkk (talk) 16:45, 27 April 2016 (UTC)
I am sorry but I am still confused. Fry1989 eh? 23:14, 28 April 2016 (UTC)
Symbol support vote.svg Support We should thoroughly specify what is in scope and this time impose some naming (title) and minimal description policies so it is easy to find out what the data is about. Looking at images, a quick glance is sufficient; with data it might be harder. Administrators must be entitled to delete non-compliant material without bureaucracy (restoring is always possible). Tabular-Namespace edits and creations must be restricted to logged-in users like we do for file uploads, otherwise it's going to become a mess when this data is used by multiple projects. Licensing: I think we should go for a default license that doesn't require attribution, like CC-0. Tabular data in multiple languages is great and if WikiData isn't building support for it, we can take it. I do not expect too much maintenance work if we have good and strong policies. There isn't much we can lose; if the feature is used, a new community (likely different people than those curating media files) will appear, if it isn't much used, it can be removed and turned off. The only thing I am concerned about is that newcomers wonder why the media repository stores (tabular) data and WikiData doesn't and how they can edit these data. That must be simple, pleasing and intuitive. -- Rillke(q?) 21:17, 27 April 2016 (UTC)
Symbol oppose vote.svg Oppose per above. Better (≠ perfect) fit for Wikidata.    FDMS  4    13:00, 28 April 2016 (UTC)
Symbol support vote.svg Support Very valuable. Long overdue. Totally inappropriate for Wikidata, which is a Graph database, not a place for data blobs. Jheald (talk) 22:11, 1 May 2016 (UTC)
Symbol support vote.svg Support per m:DataNamespace, which basically lays out the exact same proposal. Excellent to see this moving forward and being implemented in Commons, it totally makes sense as the destination for this project. People interested in the why not wikidata angle may find this discussion useful.--DarTar (talk) 18:55, 2 May 2016 (UTC)
So one has to read complicated-to-understand docs and talks to understand what Wikimedia projects are for. I'd prefer if just their name would be a good hint. -- Rillke(q?) 22:05, 5 May 2016 (UTC)
Symbol support vote.svg Support Aubrey (talk) 07:45, 6 May 2016 (UTC)
Symbol oppose vote.svg Oppose, per Dispenser: Special:diff/194486152 + before allowing this type of upload we need a fully developed environment for this data in Commons regarding policies (scope etc.), categories, maintenance tools (like filters etc.), help & orientation pages (for uploaders and maintainers), etc. Btw: what about references to support the data, or can just everybody throw his data to Commons? But the main question is: who in Commons will be able to additionally monitor this kind of stuff? Not only the uploads themselves but also all later modifications (typical edit by IP: "sales": 2.000 --> 200.000)? And: they (companies, marketing, spammers, POV/COI users etc.) will also abuse this system in the medium term, providing [fake/false] data for their "interests". IMHO, Commons in the past already suffered from some mass-oriented "features" like Wikipedia Zero, cross-wiki uploads via local Visual Editor, mobile uploads, or whatever --> all mostly either grabbed from the Internet or out of project scope (often detected only months or years later). Btw: currently, around +/- 30-40 % of daily deletion requests at Commons are already related only to "out of project scope", mostly involving "Commbook"-uploads from spammers and user pics from gals & guys who (will) never touch a wiki article, vomiting a user page on "their" wiki, thinking Commons = Facebook. The concept of "data uploads" may be interesting but ignores (among other things) the completely under-staffed [maintainer] user base in Commons, and instead of trying to keep a (+/-) quality database of "free-use images, sound, and other media files" the whole thing is already turning (also regarding e.g. thousands of images grabbed & uploaded from social media) more and more into a random web hoster. Gunnex (talk) 20:05, 8 May 2016 (UTC)
Symbol support vote.svg Support -- Michael F. Schönitzer 14:22, 14 May 2016 (UTC)
Symbol oppose vote.svg Oppose per Commons:Project_scope#Excluded educational content. Everything here is international, not everything that is international has to be here on Commons. --Martin H. (talk) 19:16, 14 May 2016 (UTC)
Symbol support vote.svg Support I am not quite sure I understand this. I think that I would support having the option to present tables in this way on all Wikimedia projects that present tables. If Commons is being proposed as a first place to put this information, then that seems appropriate, because Commons already hosts sets of data in free text which ought to be in tables. Examples include information which has to be presented in multiple languages, and where free text would be repetitive but a table communicates more clearly. From Commons I would hope that this could lead to a standard being available in many places, including Wikipedias, and perhaps universal connection with Wikidata. Blue Rasberry (talk) 16:47, 27 May 2016 (UTC)

Implementation details

I added the implementation details page, where we can iron out the exact features needed, e.g. what metadata fields we actually need, more data types, and some technical concerns. Thanks for the great support! --Yurik (talk) 02:34, 21 April 2016 (UTC)


One reason why putting such files on Wikidata might be better is that all of its content is fully PD; but then, that might be a reason to prefer Commons; or we could insist that all data files on Commons are PD. Andy Mabbett (talk) 14:43, 21 April 2016 (UTC)

It might make sense to only allow PD / {{CC-zero}} content. The data will likely be {{PD-ineligible}} anyway and that would allow merging and mixing of the data. --Jarekt (talk) 19:35, 21 April 2016 (UTC)
I would support PD-required licensing for data files on Commons. Even blob data is just data. However, is there any issue with possibly importing data from elsewhere that is currently labeled as free-use but not PD? Pi.1415926535 (talk) 23:35, 21 April 2016 (UTC)

I am beginning to think that we should require all blob data to be PD (CC-0) licensed, in which case it actually might be better hosted at IANAL, but apparently data could only exist under "database license" other than CC0, and most of the time any use of such data will have to also specify the license, which in our case might be fairly hard - imagine a table where each cell has a copyright link, or any graph usage would also need to add that link. Requiring PD-only data would solve all such cases. I will wait for our lawyer friends to elaborate on this. --Yurik (talk) 19:44, 29 April 2016 (UTC)

Please do not forget the community will have to curate this data. While some aspects may look fine from the juridical perspective, they might cause headaches, uncertainty and an absurd overhead to re-users and to those curating the data.
PD ≠ CC-0. The former is a copyright status, the latter is a license chosen by a copyright holder. What might be interesting to have evaluated by professionals is at what point a work starts to be considered eligible for protection as a database, and whether we could have some simple rules that might even be enforceable by computer programs (e.g. MediaWiki). -- Rillke(q?) 21:45, 29 April 2016 (UTC)
Rillke, I agree it needs to be curated, but how is this different from writing a table in an article and validating that? Plus having the same data shown in multiple languages should (in theory) improve the quality of that data. Thanks for clarifying re PD vs CC0. I guess my words should be changed to "only allow data that is in the Public Domain, and if someone owns a copyright on some data, they need to release it under CC0 license". Yurik (talk) 16:49, 30 April 2016 (UTC)
It is different in that the table is used in multiple projects and the data source isn't obvious to re-users (changes and authors likely do not appear in the revision history of an article). -- Rillke(q?) 17:30, 30 April 2016 (UTC)
CC-0 is trying to be as close to PD as possible. In some jurisdictions, it is likely the same, if you take PD to mean "no known copyright restrictions" which is basically what we do. But... things can still be "interesting"; as even if data is not covered by copyright (thus "public domain"), do we respect the EU sui generis database right? That generally applies only to databases made by Europeans, and only in Europe. And there are things like the Open Database License (ODbL), used by, which I think covers both copyright (where it exists) and the sui generis right. Do we allow those? Questions like that. Carl Lindberg (talk) 20:08, 30 April 2016 (UTC)
The UK's standard Open Government Licence (OGL) is similar to CC-BY [1]. It would be a shame to exclude all of this.
As for putting blobs on Wikidata, I think that would be a mistake. Wikidata is a knowledge graph database, and that is quite enough. Arbitrary data blobs don't belong there. A sensible place is Commons, like other sorts of "thing" designed to be used by multiple projects. Jheald (talk) 22:08, 1 May 2016 (UTC)
CC0 would keep things simple and is the right place to start. Allowing data blobs first published elsewhere under conditional free licenses to be uploaded might not be a terrible thing to do, but could be added later after the feature is well used and if there's a compelling reason for such uploads. Mike Linksvayer (talk) 16:13, 3 May 2016 (UTC)