Commons talk:Structured data/Modeling/Depiction

From Wikimedia Commons, the free media repository

Advanced guidelines for the use of depicts on Structured data on Commons

Hello! Since Structured Data on Commons launched in 2019, the community has been working to effectively use the depicts feature, which has been integrated into the Upload Wizard steps. However, there is still a lot of uncertainty and divergent practices.

To build a shared understanding, we produced a guide to depicts on my sandbox page, User:GFontenelle (WMF)/Depicts, and facilitated a discussion about the depicts property use within Structured data on Commons during this year's Wikimania, in the session Structured data on Commons: today and tomorrow, and at the table Structured data on Commons.

We also discussed it in the July issue of the This Month in GLAM newsletter, in the post A conversation about depicts and Structured Data on Commons.

Following all the discussions and after sharing the guidelines with different people in the community, including during Wikimania, I moved the page to the main page Commons:Structured data/Modeling/Depiction. However, some questions, issues, and suggestions are still open.

Open questions and issues

  • Should other statements, such as instance of (P31) and main subject (P921), be prioritized by MediaSearch? Or should they be added as depicts (P180) information?
  • Should 'mark as prominent' be available in the Upload Wizard? And should it be used on MediaSearch to help ranking media files?
  • Should depicts (P180) include 'genre' as well? Or should it be added as a qualifier to the statement that would be equivalent to instance of (P31) on Wikidata?
  • Should Wikimedia Commons duplicate (partially, and guaranteed to be less complete) data that is also on Wikidata? Or should Commons do it, as there are many media files without a Wikidata item as they don't reach the platform's notability standards?


  • Include a "drag and drop" button to Wikimedia Commons to add structured data to files (a suggestion made during Wikimania, on this talk page).

The page is still a work in progress and there's still a lot that can be added and explained. However, I hope the page is already useful in providing some guidance for the use of depicts (P180) on Structured data on Commons. GFontenelle (WMF) (talk) 01:11, 31 August 2021 (UTC)

@GFontenelle (WMF): thanks for starting this page. The depicts statements you added on File:De kunstgalerij van Jan Gildemeester Jansz Rijksmuseum SK-A-4100.jpeg seem redundant as these are already on The Art Gallery of Jan Gildemeester Jansz (Q17337965). We shouldn't encourage users to become monkeys and have them duplicate data. Multichill (talk) 15:14, 28 December 2021 (UTC)
@GFontenelle (WMF): This is actually a direct contradiction with Commons:Depicts#What items not to add so it should be removed. I already did part of it. Multichill (talk) 13:37, 16 January 2022 (UTC)
Hi @Multichill. Thank you for editing the page and making suggestions. Just to clarify, I made the edits on File:De kunstgalerij van Jan Gildemeester Jansz Rijksmuseum SK-A-4100.jpeg in my volunteer capacity only, out of curiosity, because I was trying to experiment and understand which paintings were depicted in the painting, relating them to their physical position in the image. The idea was to later add "relative position within image" (P2677). It was not my intention to "encourage users to become monkeys and have them duplicate data." Later, I ended up using it as a model on the depicts modeling page, as it was a good example of the applies to part (P518) qualifier only -- which is no longer there.
Other than that, I understand what you are saying about not duplicating data and not adding a lot of details to the depicts of an image. However, I ask you to consider MediaSearch. Depending on the way we model the data, the information will not be picked up by MediaSearch and will simply not appear in searches. Just like you, and as someone who had this sort of conversation first on Wikidata, I was at first also hesitant about overcategorization and duplicating data that was already on Wikidata. However, in conversations with people who were involved in the development of MS, I understood that it was important to add more specific data on Commons for the sake of Commons' search (and, later, of Wikipedia's, because of the Structured Data Across Wikimedia project). Of course, it's not good to add "nose" when you already have "face" or "nail" when you already have "hand". However, depending on the image, it's important to have more specific data. That's why, to try to solve the problem, I'm suggesting using the "mark as prominent" option in that section of the Depicts page.
There's also the following aspect to consider (which I believe is very important): "As a feature, ‘mark as prominent’ enhances the accessibility of media files for people with visual disabilities, as it's a structured way to differentiate between elements displayed in an image, especially considering that not all media files on Wikimedia Commons have a Wikidata item (or a notable enough to have one) to be described on a structured and multilingual way over that platform." GFontenelle (WMF) (talk) 16:23, 16 January 2022 (UTC)

Response to question about hierarchical depicts

I want to respond to a question from User:Rhododendrites that was originally posted on the first draft of this page at User_talk:GFontenelle_(WMF)/Depicts:

Sequoia is considered a subclass of taxon (Q16521). Based on past conversations, I think you should be prepared for some pushback here: why is it not hierarchical? If that's a shortcoming of the software, shouldn't we be pressing for a change to the software? Will it ever be possible to link "tree" to "Sequoia"? If so, why would we use limited volunteer labor to create all of these statements which would be made redundant in the future?

It’s not exactly a shortcoming of the software itself. It’s based on the way the ontology has been built in Wikidata by volunteers - modeling decisions made by the Wikidata community. In the long term, the WMF hopes to work with WMDE and the Wikidata community to think of creative ways to solve this problem, but those potential solutions are far in the future (likely years), and it’s impossible to predict at this point how they will be implemented. In the meantime, we are working with the system that we have, and we want to optimize the experience for users searching now. Hopefully, if it becomes possible to link “tree” to “Sequoia” in the future, we can work with the community to run bots or something similar that can fix or reduce the duplication. CBogen (WMF) (talk) 15:47, 9 September 2021 (UTC)

Unstucking digital representation of (P6243)

The discussions about how to use digital representation of (P6243) happened at d:Property talk:P6243 some time ago, but no action came out of them. Instead of breaking down the problem, more things kept getting added. Let's break it down to something simple so we can at least get started with the easy cases. For all images:

  1. If an image contains an artwork for which we have an item on Wikidata, we'll add depicts (P180). example 1 & example 2.
  2. If an image contains an artwork (2D or 3D) for which we have an item on Wikidata and that artwork is clearly the main subject of the photo, we'll add depicts (P180) & main subject (P921). example 1 & example 2.
  3. If an image is of a 2D artwork for which we have an item on Wikidata and the image matches with the {{PD-art}} policies, we'll add depicts (P180), main subject (P921) & digital representation of (P6243). example 1 & example 2.
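
For illustration only, the three rules above can be read as a small decision function. This is a sketch, not an existing tool: `ImageFacts` and `properties_to_add` are hypothetical names, the four inputs are judgments a person (or some tool) would have to supply, and the nesting assumes that rule 3 only applies when the artwork is also the main subject.

```python
from dataclasses import dataclass
from typing import List, Optional

DEPICTS = "P180"
MAIN_SUBJECT = "P921"
DIGITAL_REPRESENTATION_OF = "P6243"


@dataclass
class ImageFacts:
    """Hypothetical per-file facts; none of these are read from Commons here."""
    artwork_item: Optional[str]  # Wikidata QID of the artwork shown, if one exists
    is_main_subject: bool        # the artwork is clearly the main subject of the photo
    is_2d_artwork: bool          # the artwork is two-dimensional
    meets_pd_art: bool           # the image matches the {{PD-art}} policies


def properties_to_add(facts: ImageFacts) -> List[str]:
    """Return the property IDs that should point at facts.artwork_item."""
    if facts.artwork_item is None:
        return []                   # no Wikidata item: none of the rules apply
    props = [DEPICTS]               # rule 1
    if facts.is_main_subject:
        props.append(MAIN_SUBJECT)  # rule 2
        # Rule 3, assuming "is of a 2D artwork" implies main subject.
        if facts.is_2d_artwork and facts.meets_pd_art:
            props.append(DIGITAL_REPRESENTATION_OF)
    return props
```

A photo of a sculpture that fills the frame would get P180 and P921 but not P6243, because the 2D check fails.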

I left the other cases like crops and books out for now. Let's worry about those later.

The {{Artwork}} logic should be updated to work like this:

  1. Look for the wikidata field. If found, use that one to fill the infobox (like now)
  2. Look for digital representation of (P6243). If found, use that one to fill the infobox (like now)
  3. Look for main subject (P921). If one preferred value is found and the found item on Wikidata has instance of (P31) and not subclass of (P279), use that one to fill the infobox and make the infobox behave like {{Art Photo}}.
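
The lookup order above could be sketched as follows. This is Python for readability only and is not the template's actual code; `pick_infobox_item`, its parameters, and the mode labels are hypothetical, and the two callbacks stand in for whatever lookup would check the item on Wikidata.

```python
from typing import Callable, List, Optional, Tuple


def pick_infobox_item(
    wikidata_field: Optional[str],
    p6243_value: Optional[str],
    p921_preferred: List[str],
    has_p31: Callable[[str], bool],
    has_p279: Callable[[str], bool],
) -> Tuple[Optional[str], str]:
    """Return (QID, mode) where mode is 'artwork', 'art_photo' or 'none'."""
    # Step 1: an explicit wikidata field wins.
    if wikidata_field:
        return wikidata_field, "artwork"
    # Step 2: otherwise fall back to digital representation of (P6243).
    if p6243_value:
        return p6243_value, "artwork"
    # Step 3: exactly one preferred main subject (P921) that is an
    # instance (has P31) and not a class (has no P279).
    if len(p921_preferred) == 1:
        qid = p921_preferred[0]
        if has_p31(qid) and not has_p279(qid):
            return qid, "art_photo"  # behave like {{Art Photo}}
    return None, "none"
```

Requiring exactly one preferred value keeps the infobox from guessing when a file has several candidate main subjects.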

That way we have a nice way to show information about the object in the image. I created tracker categories like Category:Artworks digital representation of sculpture to be able to do clean up. This also makes cases like File:Noord-oostgevel - West-Terschelling - 20254738 - RCE.jpg possible. Multichill (talk) 15:43, 28 December 2021 (UTC)

Commons_talk:Structured_data/Modeling#SDC_for_postcards mentions another problem with P6243 (use with classes). --- Jura1 (talk) 16:26, 28 December 2021 (UTC)

I started updating this page, see Commons:Structured_data/Modeling/Depiction#Works_of_art. Multichill (talk) 13:38, 16 January 2022 (UTC)

High level classification for files

It would be useful for querying if there were a higher-level classification for the files and not just depiction, i.e. something like Wikidata's instance of (P31). Values could be, for example:

  • Photograph of the object
  • Portrait
  • Aerial photograph
  • Landscape photograph
  • Indoor photograph


Anyway, something that can be easily derived from categories plus machine-learning confirmation. -- Zache (talk) 09:16, 7 July 2022 (UTC)

@Zache: For paintings we use genre for this kind of broad classification. Multichill (talk) 18:46, 7 July 2022 (UTC)
So we could just use genre (P136) to store that value for art AND extend its use beyond artwork too? (My use case is mainly photographs of physical locations and, to some extent, portraits.) -- Zache (talk) 09:30, 8 July 2022 (UTC)

Adoption of guidelines

There is an ongoing discussion to vote on Commons:Depiction guidelines. Please give your opinion here. Anthere (talk) 14:53, 8 November 2022 (UTC)

The discussion has been archived (without clear consensus) at Commons:Village_pump/Proposals/Archive/2022/10#Adopt_Commons:Depiction_guidelines_as_a_guideline. Ainali (talk) 13:58, 19 February 2023 (UTC)

Level of detail

I was pointed to the page Commons:Structured_data/Modeling/Depiction#Level_of_detail. I find such an idea understandable, but also surprising. The rule suggests adding Catholic cathedral (Q56242215) and/or tourist attraction (Q570116) if Cologne Cathedral (Q4176) is already present. These are the instance of (P31) values of Cologne Cathedral (Q4176). (Similar with subclass of (P279). Example: ) IMO it is not good to add these depicts statements, especially since it contradicts COM:DEPICTS. instance of (P31) and subclass of (P279) are well known. Why not extend the search engine with the statements of instance of (P31) and subclass of (P279) for each (preferred) item of depicts (P180)? It would work immediately and everywhere without any changes to depicts (P180). --XRay 💬 17:04, 4 May 2023 (UTC)

Oh yes, the reason for my comment: Revision of 753453537. The bot has set lighthouse (Q39715), but Eastern Channel Pile Light (Q5330051) already exists. --XRay 💬 17:06, 4 May 2023 (UTC)
An additional comment. IMO improving the search engine is a good way; adding superfluous statements leads to confusion. If Eastern Channel Pile Light (Q5330051) exists as a depicts statement, what statements should be added? lighthouse (Q39715), signal tower (Q21550989), lit beacon (Q110113149), beacon (Q7321258), service (Q7406919), light source (Q1146001), appliance (Q1183543), tower (Q12518)? And if there are a lot of statements, should lots of superfluous statements be added? --XRay 💬 17:23, 4 May 2023 (UTC)
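
As a rough illustration of the search-side alternative proposed in this thread (expanding each depicts value through instance of (P31) and subclass of (P279) at query time instead of storing extra statements), here is a minimal sketch. It does not describe how MediaSearch works: the two dictionaries are a toy stand-in for Wikidata, the lighthouse-to-tower edge is assumed for the example, and `expand_depicts`, `file_matches`, and `max_depth` are hypothetical names.

```python
from collections import deque

# Toy stand-in for Wikidata; edges are illustrative, not real data lookups.
INSTANCE_OF = {"Q5330051": ["Q39715"]}  # Eastern Channel Pile Light -> lighthouse
SUBCLASS_OF = {"Q39715": ["Q12518"]}    # lighthouse -> tower (assumed edge)


def expand_depicts(qid, max_depth=3):
    """Return qid plus the classes reachable from it within max_depth hops."""
    seen = {qid}
    order = [qid]
    # First hop may follow P31 (for instances) or P279 (for classes).
    first_hop = INSTANCE_OF.get(qid, []) + SUBCLASS_OF.get(qid, [])
    queue = deque((parent, 1) for parent in first_hop)
    while queue:
        current, depth = queue.popleft()
        if current in seen or depth > max_depth:
            continue
        seen.add(current)
        order.append(current)
        # Later hops only climb the class hierarchy via P279.
        for parent in SUBCLASS_OF.get(current, []):
            queue.append((parent, depth + 1))
    return order


def file_matches(search_qid, depicts_values):
    """True if any stored depicts value implies the searched item."""
    return any(search_qid in expand_depicts(value) for value in depicts_values)
```

With this toy data, a file that only stores Q5330051 as depicts would still match a search for Q39715. The `max_depth` cap is one possible way to avoid matching very generic classes far up the hierarchy.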