Commons talk:Structured data

From Wikimedia Commons, the free media repository

Another example of why structured data is needed for everyday Commons users

Hi. Since I am here "frustrated", working with another Commons user, User:LigaDue, at an edit-a-thon, I'd like to share another example of why we really need structured metadata on Commons sooner or later. See for example here. We have no idea which standard to use for the title of a new category. The same happens for generic building exteriors. Is the right title "Exterior of...", "- outside" or "-exterior"? Of course, we can keep looking and compare different countries, but it's such a wasteful process. We spend too much time trying to answer these questions when we clean up files. It would be nice if we could link to the Wikidata item (or something similar) for these concepts; in these cases it would be so much faster and less ambiguous than compiling statistics on the strings commonly used in titles. Maybe some people know these things, but if you need a real-life example to show to those who might think the current structure of Commons is reasonably easy to manage, well, this is one. Bye--Alexmar983 (talk) 09:50, 2 June 2018 (UTC)

  • No, that’s an example of how important terminological standardization is for a smooth and transparent workflow. You give no explanation of how structured data will magically avoid the same growing pains that category names did/do. -- Tuválkin 16:21, 2 June 2018 (UTC)
  • +1 to Tuvalkin here. I'm all for structured data, but this is not a problem it would help solve. - Jmabel ! talk 17:14, 2 June 2018 (UTC)
    • Humm.. Not sure I've understood how structured data would work on Commons. But if it's a system where you say "exterior" (AKA "outside" AKA whatever, all defined in the same term) + "church" + "Italy", I believe it can indeed facilitate the process.-- Darwin Ahoy! 17:19, 2 June 2018 (UTC)
You describe with clear, unambiguous information that this image is 1) a building/church/palace and 2) an exterior/an interior. At this point you know it is an intersection of the concepts "exterior" and "building". You can of course run a query with this information, which basically eliminates the need for a categorization system to look for files. But if you still need new categories for other reasons (which is true), you can create them quickly and have a bot change the title string whenever you want. Every time there are enough images with one of these intersections, the bot can create the category, and of course it does so with the standardized title we agree on, even with basic standardized descriptions, standardized navboxes and so on. Every time a certain combination becomes common, you can expand the categorization tree with a click. If I have to spend my afternoon getting frustrated creating manual categories that I have no idea whether I am doing right, and someone else probably has to overwrite what I am doing again and again, then I would prefer to spend it converting existing categories and description information into metadata, or reviewing similar information suggested by a bot. It's information management in any case, but with metadata the value of my effort is much bigger, and so is its flexibility, and I would like my effort to be more fruitful. Manual categorization can always grow more confused, whereas a metadata architecture basically grows more sophisticated. For many of us who work with both Commons and Wikidata, the need to handle Commons files the way we handle Wikidata items is something we are starting to feel. It took us two or three years to get a solid and robust metadata architecture on Wikidata, but at least when we run a query to search for something, it works quite well, and it's improving.
Our manual categorization system here is not so efficient, and we feel the frustration that we are not progressing towards something that works better but simply adding partial and not always coherent patches here and there. ---Alexmar983 (talk) 18:05, 2 June 2018 (UTC)
  • And then imagine trying to do this kind of upload, when your first language is something other than English. Even if we standardize the vocabulary per @Tuvalkin:, the grammatical intersection of multiple concepts could be overwhelming. Sadads (talk) 12:21, 5 June 2018 (UTC)
  • No need to imagine: English is not my native language and I do a lot of categorization in Commons. It works. Your vapourware does not. So, go ahead and keep wasting your time and talent and WMF donation money on it, but keep your hands off Commons categories. (By the way: Category:Pedro Mexia was just created; everything works, except creating links to pt:Pedro Mexia via Wikidata because reasons.) -- Tuválkin 20:04, 5 June 2018 (UTC)
Just had a look at Category:Pedro Mexia, and the link to pt:Pedro Mexia is there just fine. Jean-Fred (talk) 06:37, 6 June 2018 (UTC)
  • It takes time to transclude. Yet when I tried to manually create the reciprocal link, I was faced with a gobbledegook error message, which is not what one should get from a UI when trying to do something that's already done. -- Tuválkin 11:32, 6 June 2018 (UTC)

  • Sadads, Tuválkin: categories are... in English. How is it possible that people are worried about metadata in English and not about categories? It's more complicated to find the names of categories in English (or miscategorized files with descriptions in other languages) than to use a system integrated with Wikidata that can provide labels in different languages automatically, which is what Wikidata already does. It's not even a problem; actually, metadata increases multilingual flexibility because it standardizes the handling of key concepts. It will remain vaporware not because of a lack of tools to do it but because of resistance. Like "keep your hands off Commons categories"... it's weird, because even if you keep manual categories as much as you please, metadata can be used in parallel, and basically it can be used to make categories in a much more efficient way. It's written above in my example: it's not a complicated automation, you only need to invest in metadata for files, which people like me are happy to do instead of battling with these strings and category trees. The grammatical intersection of multiple concepts could be overwhelming? Yes, as with categories. You don't see it much in their case because they are made manually. But that's not a balanced solution! Metadata automation does not automatically increase detail; it simply forces you to clearly define which levels of detail are OK. You can have them on demand in a personal query (which is good) or decide which level of categorization to provide. And that's a good thing, a responsible thing. Currently, you simply have excessive detail here and there in any case, mostly hidden in a bunch of categorization holes. This is a poor scenario, because it literally means you have no idea what to expect from the category tree.
Of course, if we had started years ago it would have taken less effort to adopt metadata, but it's never too late to see things from a functional perspective instead of projecting fears. I mean, I care about money too; precisely, I care about this huge amount of wasted time, which is also indirectly money. A lot of money. The metadata investment is already years late and we are far beyond the key steps of literacy, as far as I can see. Not the literacy of newbies, in my experience. New users grasp metadata quickly and they learn Wikidata (and metadata) quite fast. When they arrive at the mediocre, incomplete, uneven and time-consuming architecture of Commons categories, which is not flexible for multilingualism, I can simply link them to these discussions and explain how Commons is currently "protected" by this scenario. So, yes, I can show them how to make a query that lists with pinpoint precision what they need amongst millions of Wikidata items, but not amongst millions of files. In the end, more years of metadata illiteracy will simply leave a new generation of users with a much more expensive bill. Well, not my fault.--Alexmar983 (talk) 03:22, 12 June 2018 (UTC)
  • This is what the theory looks like, and it is what I expected from Wikidata when it appeared a few years back. I immediately thought: yay, we can have language-independent categorization! (Yes, because unlike what you slyly imply, I’m very much not an Anglocentric monolingual, as my user page hopefully suggests.) But no. This is what Wikidata has been so far: underwhelming in meeting its originally perceived goals and at the same time threatening to take over systematized data from other projects (like geolocation from Commons, infobox data from Wikipedia(s), the chilling announcement of lexical data to endanger Wiktionary, etc.), locking it in a dumbed-down, gamified UI that cannot sustain the kind of workflow “power users” are accustomed to, effectively shutting down the mechanism that allowed Commons (and the other projects) the very build-up of entered data.
But go ahead, maybe it will become a beautiful thing. Just don’t destroy others’ way of contributing, okay? Feel free to diss categories, it’s amusing when you do it, but refrain from pushing to its removal from Commons.
-- Tuválkin 11:11, 12 June 2018 (UTC)
Tuválkin, "refrain from pushing to its removal from Commons", "Just don’t destroy others’ way of contributing": who said that? I did not. Are you talking to me or not? You can go on creating manual categories as much as you please; as far as I care, it's your time... but there are many users who don't get why they still MUST do it this way. And trust me, when you enter not simply a remote hamlet in a European country but entire areas of the rest of the world, just to stick to geography, you feel strongly that this ecosystem is not sustainable. Certainly, it cannot last if you keep pushing in millions of files from other platforms by bot. We have bots for that, but not for categorization based on a semantic architecture, because we keep delaying it in every way (like this reaction), and that's apparently a good thing.
In the future, can I add "interior", "church" and "name of place" using a multilingual menu and let a bot make a category while you handle your manual categories, please?
In my experience, I expected something from Wikidata and its structure; it was not there in the beginning, but now it is there, or it is starting to be. My time was well spent. I am speaking from years of interactions with other users. I could write a lot about that experience, but I will just repeat: I'd rather spend my afternoon investing in a metadata architecture for the future than in this category tree. Not just me. And I would like, when I share this experience, not to see a bunch of users act this way. Read what I have written; don't reply to what you think is written. It's not constructive. Not just for me but for the people I show these pages to later. I don't know if you have noticed, but I did reply to the concerns presented, showing that they are also defects of the current system that you consider the best option or alternative (but again, you can keep that manual system, just don't force us all to use only it forever). People I show this page to might notice that, including the fact that so far the replies were also a little bit vaporware themselves.
A vaporware of fears, but that's what it is. Are you really worried that manual categories will disappear? Because I am not. I want to make manual categories about very sophisticated topics myself, but not about "war memorials in the province of X". It's 2018, please... there are much more delicate topics in which I should invest my time.--Alexmar983 (talk) 13:40, 19 June 2018 (UTC)
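
For readers of this thread, the bot workflow proposed above (tag each file with concepts, treat "exterior", "outside" and so on as one term, and create a category with an agreed title once enough files share the same combination) can be sketched roughly as follows. This is only an illustration under stated assumptions: the alias table, the title template, the threshold and the function names are all hypothetical, and nothing here reflects an actual Commons or Wikidata API.

```python
from collections import defaultdict

# Hypothetical alias table: every spelling maps to one canonical concept.
ALIASES = {
    "exterior": "exterior", "outside": "exterior",
    "interior": "interior", "inside": "interior",
}

# Hypothetical title template that the community would agree on once.
TITLE_TEMPLATE = "{aspect} of {subject} in {place}"

# Assumed threshold: propose a category only when enough files share a combination.
MIN_FILES = 3


def normalize(term):
    """Map an alias to its canonical concept (unknown terms are just lowercased)."""
    key = term.lower()
    return ALIASES.get(key, key)


def propose_categories(files, min_files=MIN_FILES):
    """Group files by (aspect, subject, place) and return {title: [file names]}."""
    groups = defaultdict(list)
    for name, (aspect, subject, place) in files.items():
        groups[(normalize(aspect), subject.lower(), place)].append(name)

    proposals = {}
    for (aspect, subject, place), names in groups.items():
        if len(names) >= min_files:
            title = TITLE_TEMPLATE.format(
                aspect=aspect.capitalize(), subject=subject, place=place
            )
            proposals[title] = sorted(names)
    return proposals


if __name__ == "__main__":
    # Made-up file names and tags, purely for illustration.
    tagged = {
        "a.jpg": ("outside", "church", "Italy"),
        "b.jpg": ("Exterior", "church", "Italy"),
        "c.jpg": ("exterior", "Church", "Italy"),
        "d.jpg": ("interior", "church", "Italy"),
    }
    for title, names in propose_categories(tagged).items():
        print(title, names)
```

Because the title comes from a single template, changing the agreed naming convention would mean editing one string and re-running the bot, which is the flexibility argued for in the thread. The sketch does not address the objection raised in the replies, namely that the concept vocabulary itself still has to be agreed on.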

New feedback request - Depicts

There is a new feedback request available, the draft specifications for Depicts statements. Please have a look over the presentation and leave your thoughts on the talk page. Keegan (WMF) (talk) 18:11, 16 August 2018 (UTC)