User:Multichill/Commons Wikidata roadmap

This page describes the Commons Wikidata roadmap as I see it. It may change over time. It contains several items, sorted in the order I think they will happen. The actual order of deployment might change.

Centralize Commons category links

Almost every Wikipedia has a Commons category template that links articles and categories to their equivalents at Commons. This is the same problem as with interwiki links: we have to maintain it on every Wikipedia, and that's quite a lot of work. To centralize this, the property Commons category (P373) has been created. It is a property to ease the transition and it will disappear over time.

Enable Wikidata interwiki

Interwiki links will be enabled to link galleries with articles and categories with categories. This way Commons can really get integrated and get more closely connected with the other projects.

Link topics and categories

Linking categories and articles

On Wikidata we already have a property main category topic (P301) to link category items with article items. An inverse property topic's main category (P910) has been created to link the article items with the category items.

Based on the data already in main category topic (P301) and Commons category (P373), we can start populating this property, create the missing Commons category items and link them.

In the end we have:

A bot should be used to keep everything here consistent: if an item A has a "topic's main category" (P910) claim linking to a (category) item B, the interwiki link on item B should be exactly the same as the Commons category (P373) claim on item A.
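A minimal sketch of the check such a bot could run. The item records are simplified, hypothetical dictionaries rather than real Wikidata JSON, and the sketch assumes that Commons category (P373) stores the category name without the "Category:" prefix while the interwiki link on the category item stores the full page title.

```python
from typing import Optional

# Hypothetical, simplified item records; real Wikidata items are much richer.
ITEMS = {
    "Q100": {"P373": "Boston", "P910": "Q8307979"},
    "Q8307979": {"sitelinks": {"commonswiki": "Category:Boston"}},
}


def check_commons_link(items: dict, item_id: str) -> Optional[str]:
    """Return a description of the mismatch for item_id, or None if consistent."""
    item = items[item_id]
    category_id = item.get("P910")        # topic's main category
    commons_category = item.get("P373")   # Commons category
    if category_id is None or commons_category is None:
        return None  # nothing to compare for this item

    # Interwiki link to Commons on the linked category item (may be missing).
    sitelink = items.get(category_id, {}).get("sitelinks", {}).get("commonswiki")
    expected = f"Category:{commons_category}"
    if sitelink != expected:
        return (
            f"{item_id}: P373 implies {expected!r}, "
            f"but {category_id} links to {sitelink!r}"
        )
    return None


if __name__ == "__main__":
    # Prints None when the two values agree.
    print(check_commons_link(ITEMS, "Q100"))
```

A real bot would then have to decide which side to trust when the two disagree; the sketch only reports the mismatch.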

Mind you that Commons category (P373) is still used in Wikipedia because getting the interwiki link from a linked item is not possible yet.

Model intersections

We want to move away from the current category system to a better system, but we don't want to throw away years of work. We need to model intersected categories on Wikidata to make it easier to move. An intersected category is a category that intersects two or more topics. Take for example Category:Churches in Haarlem. This is an intersection of Haarlem (Q9920) and church (Q16970). We need to model this because at some point in time we want to add these claims directly to images (probably with a bot) and this also helps with the search engine.
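To make the idea concrete, here is a rough sketch of how an intersected category could be represented. The class name, field names and the helper function are hypothetical; the roadmap does not yet say which property will hold the intersection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntersectionCategory:
    """A Commons category modeled as the intersection of several topics."""
    commons_category: str
    topics: frozenset  # Wikidata item ids, e.g. {"Q9920", "Q16970"}


CHURCHES_IN_HAARLEM = IntersectionCategory(
    commons_category="Category:Churches in Haarlem",
    topics=frozenset({"Q9920", "Q16970"}),  # Haarlem, church
)


def topics_for_file(category: IntersectionCategory) -> list:
    """Topics a bot could later add directly to a file in this category."""
    return sorted(category.topics)


if __name__ == "__main__":
    print(topics_for_file(CHURCHES_IN_HAARLEM))
```

With this in place, a file in Category:Churches in Haarlem could get one claim per topic instead of a single opaque category.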

New search prototype

By this point categories and articles have been properly linked and intersections have been modeled. We now have a lot of interesting data that should be used to build a new prototype search engine for images.

  • Every image has one or more categories
  • Every category links to one or more topics
    1. One, if it's connected one-to-one with an article via main category topic (P301)
    2. More than one, if it's an intersection linked with <new property>

The images with their connected topics should be fed to a search engine. How the different categories and items relate to each other should be fed to the search engine too. Faceted search would probably be a really nice replacement for the current static category system. We should build this as a prototype (probably on labs) so we can get a feel for what works (and what doesn't) without disturbing the current system.
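The chain image → category → topic can be sketched as a small in-memory index with faceted lookup. This is only an illustration of the idea, not a proposal for the actual engine: the file names and mappings are made up, and a real prototype would sit on a proper search backend.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: files, their categories, and the categories' topics.
IMAGE_CATEGORIES = {
    "File:Example church Haarlem.jpg": ["Category:Churches in Haarlem"],
    "File:Example station Haarlem.jpg": ["Category:Haarlem"],
    "File:Example church elsewhere.jpg": ["Category:Churches"],
}
CATEGORY_TOPICS = {
    "Category:Churches in Haarlem": ["Q9920", "Q16970"],  # intersection
    "Category:Haarlem": ["Q9920"],                        # one-to-one
    "Category:Churches": ["Q16970"],                      # one-to-one
}


def build_index(image_categories, category_topics):
    """Map every topic to the set of images reachable through a category."""
    index = defaultdict(set)
    for image, categories in image_categories.items():
        for category in categories:
            for topic in category_topics.get(category, []):
                index[topic].add(image)
    return index


def search(index, topics):
    """Return the images that are connected to all of the given topics."""
    sets = [index.get(topic, set()) for topic in topics]
    return set.intersection(*sets) if sets else set()


def facets(index, results):
    """Count, per topic, how many of the result images are connected to it."""
    counts = Counter()
    for topic, images in index.items():
        hits = len(images & results)
        if hits:
            counts[topic] = hits
    return counts


if __name__ == "__main__":
    idx = build_index(IMAGE_CATEGORIES, CATEGORY_TOPICS)
    haarlem = search(idx, ["Q9920"])
    print(sorted(haarlem))
    print(facets(idx, haarlem))
```

The facet counts are what would let a user narrow "Haarlem" down to "churches in Haarlem" without the category having to exist.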

Tracking & purging

Tracking is recording where something is used, and purging is regenerating a page because something it uses has changed. Right now this is done for templates. Take for example {{Information}}. The usage of this template is tracked here. Say I change the color of the template: all the pages using it have to be regenerated to show {{Information}} with the new color. This is the purging part.

For Wikidata this tracking and purging has to be implemented too. So if there is a change to a sitelink, claim, etc, all the pages using that data should be purged. Otherwise we would be looking at old incorrect data. Say for example we rename Category:Boston to Category:Boston, Massachusetts. We update d:Q8307979. d:Q100 contains a claim that the main category is d:Q8307979. This claim is used on all Wikipedia articles to generate a Commons category link. All these articles need to be purged so they point to Category:Boston, Massachusetts.
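The mechanism can be sketched as a usage table plus a change hook. Everything here is hypothetical (class name, method names, page titles); it only shows the bookkeeping, not how MediaWiki or Wikibase would implement it.

```python
from collections import defaultdict


class UsageTracker:
    """Records which pages use which entity, and purges them on change."""

    def __init__(self):
        self._usage = defaultdict(set)  # entity id -> pages using its data
        self.purged = []                # pages queued for regeneration

    def track(self, page, entity_id):
        """Tracking: remember that `page` renders data from `entity_id`."""
        self._usage[entity_id].add(page)

    def on_entity_changed(self, entity_id):
        """Purging: queue every page that uses the changed entity."""
        pages = sorted(self._usage.get(entity_id, ()))
        self.purged.extend(pages)
        return pages


if __name__ == "__main__":
    tracker = UsageTracker()
    # An article about Boston reads the claim on Q100 and, through it,
    # the sitelink on Q8307979, so it depends on both items.
    tracker.track("en:Boston", "Q100")
    tracker.track("en:Boston", "Q8307979")
    tracker.track("nl:Boston", "Q8307979")
    # Renaming the Commons category changes Q8307979.
    print(tracker.on_entity_changed("Q8307979"))
```

Note that the article has to be tracked against the category item as well as its own item, otherwise the rename in the example would never reach it.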

Redesign file pages

The current file pages are a mix of data and representations of that data. With the data being moved to the Wikibase repository we can focus on a complete redesign. Assuming we have almost all data in a structured form, how would we design the new pages? We should completely rethink the way we show the information to the users. Mockups should be made, community discussion should happen, etc.

Preferably we should just have one template on the page that shows what is needed. So no {{Information}}, {{Book}}, {{Artwork}}, {{Creator}}, {{Institution}} and license tags. The template should give the right representation based on the claims added to the file. We should of course use the lessons learned from these templates in the new system.

Wikibase on Commons

Up until now we have had one instance of Wikibase (the one on Wikidata). Storing the data about every file on Wikidata is probably not a good idea, so a local instance should be deployed to store that data. The two Wikibase instances should work together and we should avoid duplication. We will probably have very few items here, because for most things we would just link to Wikidata items. We'll have an object for every file and for every user. We'll have a number of string type properties which can be upgraded to item type properties. Say for example a painting made by Rembrandt might first have a string-based author claim "Rembrandt", and this can later be changed to an author claim linking it to Rembrandt (Q5598). This is how we work with {{Creator}} templates now.
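The string-to-item upgrade could look roughly like this. The types, the property name "author" and the lookup table are hypothetical stand-ins; the real Wikibase data model has its own value types.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class StringValue:
    text: str


@dataclass(frozen=True)
class ItemValue:
    item_id: str


@dataclass(frozen=True)
class Claim:
    prop: str
    value: Union[StringValue, ItemValue]


# Hypothetical lookup from known creator names to Wikidata items.
KNOWN_CREATORS = {"Rembrandt": "Q5598"}


def upgrade(claim: Claim, lookup: dict) -> Claim:
    """Turn a string claim into an item claim when the string is recognized."""
    if isinstance(claim.value, StringValue) and claim.value.text in lookup:
        return replace(claim, value=ItemValue(lookup[claim.value.text]))
    return claim  # unknown strings and existing item claims are left alone


if __name__ == "__main__":
    print(upgrade(Claim("author", StringValue("Rembrandt")), KNOWN_CREATORS))
```

Leaving unrecognized strings untouched means no information is lost while the lookup table is still incomplete.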

We should be adding claims based on the current categorization so we can slowly start replacing that.

We should probably focus on getting this working for new uploads first, then focus on important files and in the end, based on that experience, do the bulk of the files. This is going to involve a lot of bot work.

For more information, see Commons:Wikidata for media info.

New search

We gained a lot of experience with the search prototype. This experience should be used to replace/improve the current Commons search engine. Without this we can't get rid of categories.

See also