Commons talk:Structured data

From Wikimedia Commons, the free media repository
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days.


Converting the GLAM's subjects tags to P180 values?

Do we have an opinion on which granularity we should target in P180 values? It would be possible to translate the subject tags used by Finnish GLAMs into Wikidata items using the Finto ontologies. However, if this is done automatically, it is likely to produce a lot of fairly random tags too.

For example, "wooden house" in the first example photo refers to something in the background that is not a key element. Also, a human would likely add the Wikidata item Ratakadun poliisiasema (Q98432303) instead of the general tag poliisilaitokset ("police departments"), as there is a Wikidata item for that specific police station.

In the second photo we can see that @Apalsola: has already added the values Tervahovi (Q11897001) and barrel (Q10289) + of (P642) = tar (Q186209) to P180. This information is so well defined that it is out of reach of any automatic conversion.

So my question is: when I am adding the tags, should I focus on very specific tags, or should I add all of the tags that I can translate to Wikidata, which would be more general? Also, do we know which strategy would be better for Media search development?


--Zache (talk) 09:29, 8 February 2021 (UTC)

Captions in wrong langcode

For example, in special:diff/435428822 a string of Cyrillic letters was entered as a caption in a Latin-script language (English). Such obvious errors should somehow be detected automatically, or at least tagged for review and correction.--RZuo (talk) 11:56, 25 February 2021 (UTC)

✓ Done @RZuo: That caption was nonsense and obviously not in good faith so I removed it. Thanks for pointing it out! --Sabelöga (talk) 14:55, 25 February 2021 (UTC)
@Sabelöga:  Not done, the question was not to remove that specific caption (anyone can do that), but to prevent/tag such edits automatically.
@RZuo: Wikidata abuse filter #33 does something like this; maybe it could be adapted for Commons. I’m not an admin myself, so I can’t edit abuse filters, but hopefully you can find someone able and willing to help at COM:AN. —Tacsipacsi (talk) 23:17, 26 February 2021 (UTC)
@Sabelöga: It was a good-faith edit, just entered under the wrong langcode.
A request for a filter was submitted but has not been followed up so far: Commons_talk:Abuse_filter/Archive_2021/03#New_AF_request:_purely_non-Latin_alphabets_entered_into_Latin-script-language_captions.--RZuo (talk) 11:47, 14 March 2021 (UTC)
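
As a rough illustration of the kind of check requested in this thread, the sketch below flags captions whose letters are mostly non-Latin although the langcode belongs to a Latin-script language. It is not the logic of Wikidata abuse filter #33 and not an existing Commons filter. The `LATIN_LANGCODES` set and the function names are hypothetical, and the script is guessed from the first word of each character's Unicode name, which is only an approximation.

```python
import unicodedata

# Hypothetical and deliberately incomplete list of Latin-script langcodes.
LATIN_LANGCODES = {"en", "de", "fr", "fi", "sv", "hu"}


def dominant_script(text):
    """Return the most common script among the letters of text, or None.

    The script is approximated by the first word of the Unicode character
    name, e.g. "CYRILLIC" for "CYRILLIC SMALL LETTER PE".
    """
    counts = {}
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        script = name.split(" ", 1)[0] if name else "UNKNOWN"
        counts[script] = counts.get(script, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def needs_review(langcode, caption):
    """Flag a caption in a Latin-script language that is mostly non-Latin."""
    if langcode not in LATIN_LANGCODES:
        return False
    return dominant_script(caption) not in (None, "LATIN")


if __name__ == "__main__":
    print(needs_review("en", "Привет мир"))
    print(needs_review("en", "Tram in Vienna"))
```

A check like this would only tag edits for review; captions that legitimately mix scripts (for example a Latin-script sentence quoting a Cyrillic name) would still need a human decision.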

alt text property proposal

In case you missed the notice on the village pump a couple days ago: there's a Wikidata property proposal for adding an alt text field to the structured data of images. On one hand this would enable all kinds of programmatic workflows, like Toolforge tools for creating or translating alt text for images; would allow using alt text for images outside wikitext; and would probably improve the search result ranking of images. On the other hand it is being argued that alt text by its nature cannot be centralized since it always needs to reflect article context. More opinions on the matter would be welcome. --Tgr (talk) 14:05, 3 March 2021 (UTC)

Query structured data and Wikidata

In order to showcase publicly the stuff we do with SPARQL on (or embed it at some place), it would be necessary to query the author and license from the images. At the moment AFAIK you can display P18 only without any copyright info which makes all the learnings and cool stuff a very private fun thing (if you don't restrict on public domain paintings) :-| – Now that we have all the structured information on Commons, has this been done already? A query to merge Wikidata items with Structured Commons? I understand it's a different Wikibase instance, would it be thinkable at all? Any approaches under way? Thanks! (please ping me for an answer) --Elya (talk) 20:49, 3 March 2021 (UTC)
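
One possible shape for such a combined query, offered only as a sketch: the Python helper below builds a SPARQL string that starts from Commons files, reads depicts (P180), copyright license (P275) and creator (P170) from the structured data, and uses a federated `SERVICE` clause to filter the depicted items by class on Wikidata. The helper name `build_query` is hypothetical. It is an assumption that the Commons query service accepts federation to `https://query.wikidata.org/sparql` and exposes SDC statements under the usual `wdt:` prefix. Creator statements on Commons are often stored without an item value, so `?creator` may come back empty or as a blank node.

```python
import re

# Assumed federation target; whether WCQS allows it is not confirmed here.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(class_qid, limit=10):
    """Build a SPARQL query for files depicting instances of class_qid."""
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a Wikidata item id: {class_qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    lines = [
        "PREFIX wd: <http://www.wikidata.org/entity/>",
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
        "SELECT ?file ?item ?license ?creator WHERE {",
        "  ?file wdt:P180 ?item .",                   # depicts
        "  OPTIONAL { ?file wdt:P275 ?license . }",   # copyright license
        "  OPTIONAL { ?file wdt:P170 ?creator . }",   # creator
        f"  SERVICE <{WIKIDATA_ENDPOINT}> {{",
        f"    ?item wdt:P31 wd:{class_qid} .",        # class filter on Wikidata
        "  }",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Q3305213 is the Wikidata item for "painting".
    print(build_query("Q3305213", limit=5))
```

The string is only built, not sent; running it would still require whatever authentication the query service demands.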

Proposal: Move captions from file information to structured data tab

Currently we have two tabs on files on Commons:

  • File information to display information about the file generally in an infobox template format
  • Structured data to see and edit structured data in a key/value format

Currently the captions are in the "file information" tab. This is weird because these captions are part of the structured data. These captions cause visual clutter. I propose that the captions are moved to the structured data tab. I opened phab:T276718 for that. Please comment here to see if we have community consensus for this. Multichill (talk) 18:33, 7 March 2021 (UTC)

In my opinion, you should set up several tabs and group the content by topic. However, more than 5 to 7 tabs would be confusing.--XRay 💬 19:16, 7 March 2021 (UTC)
I agree with XRay, personally I would (also) prefer to see a separate tab for them, but then also to have more (written) information in that tab. If that ain't an option on the table then I do support the move to the SD tab. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 21:14, 7 March 2021 (UTC)
At least one thing gives me hesitancy about this: I rarely pay attention to the structured data tab because, being a human rather than a bot, I rarely find information there that I can't glean as well or better from the wikitext. However, I really do like to have an easy way to see and remove the incorrect captions that are frequently added to my photos. So perhaps this should be a user option? - Jmabel ! talk 02:57, 8 March 2021 (UTC)
  • In short, is this one more case of trying to move information from wikitext off to the SCD…? -- Tuválkin 13:13, 8 March 2021 (UTC)
    • No it's not. You seem to have completely misunderstood the proposal. This is about a change to how the user interface looks. Multichill (talk) 18:40, 8 March 2021 (UTC)
  • That’s a relief, then. As for any extra tabs displaying info that’s not in the file page’s wikitext, I’m okay with anything, provided I get to hide them as I currently do. -- Tuválkin 19:23, 8 March 2021 (UTC)
Sounds reasonable, as captions are usually more or less redundant with the description of {{Information}}. --Zolo (talk) 18:18, 9 March 2021 (UTC)
Whatever the result of this proposal, IMHO there should be a gadget (opt-in) to show the structured data tab as the main/primary tab, if a (robotic?) user wants it that way. Strakhov (talk) 19:20, 9 March 2021 (UTC)
Agree --Jarekt (talk) 13:14, 15 March 2021 (UTC)
While the captions are indeed structured data, they also represent descriptive information in a format that is more human readable than the rest of the structured data fields, which is why they are in the main "file information" tab. For users who are not fluent in English, the caption may also be the most easily accessible descriptive information available in their language. I do think that clearer modeling of how to use the "caption" field would benefit everybody. If you could say more about how having captions on the main page gets in the way of your workflows or of easily viewing file information, or if they represent some other challenge I don't understand, I'd love to hear about it. We're certainly open to making a change if users would prefer to move the captions to the structured data tab. However, the team has other priorities right now, so I'm not sure when we would be able to focus on it. (I've also said this in phab:T276718.) CBogen (WMF) (talk) 20:43, 15 March 2021 (UTC)
@CBogen (WMF): the caption from the structured data provides more or less the same information as the description field of {{Information}}. Actually, when the field from {{Information}} is left empty, it just shows the structured data caption. Showing the caption above {{Information}} is usually useless. --Zolo (talk) 17:00, 24 March 2021 (UTC)

How can I add SDC to Categories?

Can I define a depicts (P180) for a category? E.g. for single topic categories. How does this technically work? --Herzi Pinki (talk) 13:01, 15 March 2021 (UTC)

SDC is only for files. A category might be connected to a Wikidata item, and you could add depicts there. --Jarekt (talk) 13:12, 15 March 2021 (UTC)

Is it possible or is it planned to have references for facts in SDC?

Hi all

Is it possible or is it planned to have references for facts in SDC? I mean in a similar way to how Wikidata provides a space for references for facts. I realise that a lot of SDC will be self-evident from the image, e.g. the number of sugar cubes or other depictions. However, some facts would benefit from having some kind of reference, e.g. to confirm that an image depicts a specific location or person (this could help with misidentification), and also to identify where structured data has been imported from an external source, e.g. when an image has been imported from a museum website.


John Cummings (talk) 19:39, 15 March 2021 (UTC)

@John Cummings: See phab:T230315

I believe technically you can add them (eg by bot, QuickStatements etc), you just can't see them, apart from in queries. I am definitely one of those who thinks it may often be very useful to record where a statement has come from. Jheald (talk) 14:13, 16 March 2021 (UTC)
Thanks very much Jheald, very helpful to know that other people think this would be useful and it has been documented as a request. John Cummings (talk) 14:50, 16 March 2021 (UTC)
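
To make the idea of a referenced SDC statement concrete, here is a minimal sketch of what such a statement could look like in the standard Wikibase JSON layout, built as a Python dictionary. The helper names are hypothetical, the URL is a placeholder, and using reference URL (P854) as the reference property is an assumption borrowed from Wikidata practice. Whether Commons accepts and preserves this through a given bot or tool is not verified here.

```python
import json


def item_snak(property_id, item_id):
    """Snak whose value is a Wikidata item, e.g. depicts (P180) = barrel."""
    return {
        "snaktype": "value",
        "property": property_id,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "id": item_id},
        },
    }


def url_snak(property_id, url):
    """Snak whose value is a URL (stored as a string datavalue)."""
    return {
        "snaktype": "value",
        "property": property_id,
        "datavalue": {"type": "string", "value": url},
    }


def statement_with_reference(property_id, item_id, reference_url):
    """Statement with one reference made of a single P854 snak."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak(property_id, item_id),
        "references": [
            {
                "snaks": {"P854": [url_snak("P854", reference_url)]},
                "snaks-order": ["P854"],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder URL standing in for a museum record.
    claim = statement_with_reference(
        "P180", "Q10289", "https://example.org/museum-record"
    )
    print(json.dumps(claim, indent=2))
```

As noted above, a statement like this would not be visible in the current interface even if it is stored; it would only show up in queries.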

Project Grant application for SDC support in OpenRefine: feedback and endorsements welcome


Hello everyone! Since 2019, it is possible to add structured data to files on Wikimedia Commons (SDC = Structured Data on Commons). But there are no very advanced and user-friendly tools yet to edit the structured data of very large and very diverse batches of files on Commons. And there is no batch upload tool yet that supports SDC.

The OpenRefine community wants to fill this gap: in the upcoming year, we would like to build brand new features in the open source OpenRefine tool, allowing batch editing and batch uploading SDC :-) As these are major new functionalities in OpenRefine, we have applied for a Project Grant. Your feedback and (if you support this plan) endorsements are very welcome. Thanks in advance, and many greetings – Pintoch (as OpenRefine developer) and SFauconnier (talk) 09:24, 16 March 2021 (UTC) (aka Spinster, as member of the OpenRefine steering committee)

QuickStatements is a batch upload tool which does work for SDC. Strobilomyces (talk) 11:40, 16 March 2021 (UTC)
Hi Strobilomyces, it is correct that you can batch edit SDC with QuickStatements (with workarounds), but the last time I tried to use it, it definitely did not support uploading new files. In the project grant application we do talk about QuickStatements and other batch tools that already exist, what they can and cannot do, and why we think it would be very valuable to have extended SDC functionalities in OpenRefine. SFauconnier (talk) 13:11, 16 March 2021 (UTC)
Hi. Thank you for your answer. Uploading new files and fairly arbitrary wikitext through batch would be very useful for me. I thought that was an independent question from editing SDC. When I looked at the existing tools I thought they were too restrictive on the Wikitext which could be loaded, but perhaps I should look again. I certainly hope that your upload tool will also support Wikitext. Strobilomyces (talk) 16:04, 16 March 2021 (UTC)
Help:Gadget-ACDC is a gadget to add a collection of structured data statements to a set of files. Why do we need a new tool instead of better support of existing ones? --Schlurcher (talk) 16:09, 16 March 2021 (UTC)
I really like ACDC and use it a lot, but it runs in the browser with all the limitations and problems this has. I also used QuickStatements for Wikidata and for my usecases it was good. I never tried OpenRefine. --GPSLeo (talk) 16:45, 16 March 2021 (UTC)
I think that we are comparing apples and oranges here when we compare QuickStatements, SDC and OpenRefine. OpenRefine is a tool for human-assisted or automatic data matching and data conversion which can store the result to Wikidata (or to SDC, I hope). I personally use Python for this, but I can perfectly understand why somebody would like to use some higher-level tool for that. Zache (talk) 12:11, 6 May 2021 (UTC)

Text about Image Annotator

@GFontenelle (WMF): [1] The sentence you added is not targeted for translation.--Afaz (talk) 12:38, 26 March 2021 (UTC)

bad request

File:Tram in Kärntner Ring, at twilight (Vienna, Austria).jpg is linked via SDC with <title of image> (, which gives a bad request. No idea. --Herzi Pinki (talk) 22:38, 26 March 2021 (UTC)

@Herzi Pinki: Where? The link is obviously wrong (it should be, but I can’t find it anywhere on the SDC tab. —Tacsipacsi (talk) 00:09, 28 March 2021 (UTC)
if you edit the above image, you will have below: Wikidata entities used in this page where there currently is a single entry:
exhibiting the described behaviour. best --Herzi Pinki (talk) 05:37, 28 March 2021 (UTC)
@Herzi Pinki: I see. You didn’t mention the edit interface, so I thought you were referring to the page that appears if I click the link, i.e. the view (non-edit) interface. Seems like phab:T250611 and/or phab:T240358. —Tacsipacsi (talk) 20:03, 28 March 2021 (UTC)
sounds as if this is the problem. I thought as the problem is occurring for quite a while, nobody saw it until now. AGF = nobody saw it; ¬ AGF = nobody considered it important enough to care for it. (I had an issue from 2015 which was triaged last week). Thanks for identifying the suitable tasks. (I thought Wikidata entities used in this page to be independent from the view of the page, sorry) --Herzi Pinki (talk) 20:23, 28 March 2021 (UTC)

Geograph restarted as structured data upload

I restarted the Geograph upload, see Special:ListFiles/GeographBot. All metadata is stored as structured data and {{Geograph from structured data}} is used to display it in the wikitext. This way content and presentation are separated. Multichill (talk) 11:17, 10 April 2021 (UTC)

  • Wikitext is not presentation. -- Tuválkin 12:33, 10 April 2021 (UTC)
    • Thank you for this extremely helpful and constructive feedback. Multichill (talk) 12:41, 10 April 2021 (UTC)

SDC at GLAMHack 2021 on 16th and 17th April + can we request a Query Service dump request on a specific day?

Hi all

This coming Friday and Saturday (16th and 17th) GLAMHack 2021 (organised by Beat Estermann) is taking place (it's free).

Please come and take part, play around with content and data and make something fun; you don't have to be technical (I'm not). It's a nice excuse to talk to other Wikimedia people since we can't hang out like normal.

I'm organising a Structured Data on Commons hackathon team for the event, where we will play around with Commons and data and try to make new and exciting things. I'll be providing some basic SDC materials for people who are new to it. Register for free here

One of the things that would be really helpful to make this event successful is if we could request that the dump which the Commons Query Service runs off is updated on the night of the 16th so that any additional content we want to add can be queried on the second day.

Thanks very much John Cummings (talk) 20:13, 10 April 2021 (UTC)

Clothing and costume accessories

Do we have best practices for describing clothing and costume accessories in photos and artworks? "Wears" seems logical but causes an error. Do we prefer "shown with features"?


Thanks - PKM (talk) 00:38, 17 April 2021 (UTC)

I've fixed the constraint of wears (P3828). However, that information shouldn't be at Commons anyway for both mentioned cases but rather at the Wikidata items for the paintings. Best --Marsupium (talk) 08:21, 17 April 2021 (UTC)
Easy enough to remove it. But are we really saying we don't want to be able to search clothing details from portraits in Commons? (This is why SDC confuses me.) - PKM (talk)
For Commons itself, the most useful thing here would be an ImageNote placed precisely on the item of clothing in question. - Jmabel ! talk 15:32, 21 April 2021 (UTC)
A painting can't wear anything! It can depict a person wearing something, but if I look at the current examples for the use of wears (P3828), it is always associated with the actual person and not with a depiction of the person. So in this case it would be "Elisabeth I" wearing the "Three Brothers jewel" on George Gower's "The Ermine Painting". Sorry, no Wikidata item for the "Three Brothers Jewel", so I can't show how this would work. --Wuselig (talk) 18:42, 21 April 2021 (UTC)

Modeling picture taken in a certain municipality

What is the best way to model the relationship between the picture and the municipality? (see [2] for background)--So9q (talk) 06:54, 21 April 2021 (UTC)

Please let me just add a bit of context: the picture depicts a shelter which is in that municipality. The municipality has a Wikidata item, but the shelter does not and probably never will. Thanks all in advance! Syced (talk) 11:20, 21 April 2021 (UTC)
The shelters in Wikimedia Commons will all have a Wikidata item if I get to decide. I have cleared it with the Wikidata community. The only thing we need is an external source of truth that we can link to. One Swedish municipality, Uppsala, has shared its shelters as open data: they can be imported as Wikidata items because they are useful for Wikivoyage. See example use here: (talk) 13:02, 21 April 2021 (UTC)

digital representation of (P6243)

I'm not so sure about using digital representation of (P6243). I think it's the right thing to do with digitized postcards or photographed paintings. It may also be the case with sculptures and statues in museums. However, an uncertainty begins here. For sculptures in public spaces (as in File:Madonna_an_der_Aa_-_Muenster_-_2021.jpg) I find it rather inappropriate. Then one would have to treat building photos (like File:Langer Eugen.jpg) in the same way. Or automobiles. Or persons. And then the question arises whether it applies only if it is the only object, or also if it is a compilation (like File:Bonn-Gronau Post Tower Schürmann-Bau Langer Eugen Luftaufnahme 2015-05.jpg). --XRay 💬 08:40, 3 May 2021 (UTC)

When will WCQS be out of beta?

Do we have any idea when WCQS will be out of beta? I think the main feature I am looking for is being able to query it without OAuth. Zache (talk) 12:14, 6 May 2021 (UTC)