User talk:Multichill/Generating creator templates

From Wikimedia Commons, the free media repository

I think part of a solution comes from Wikipedia itself and from the structured semantic databases built on it: DBpedia and Freebase. I would use them as the source for creator templates. Although they are not CC0, they are openly licensed. Wikidata may be a future option.

We've done some very limited cross-referencing with DBpedia in Europeana to link our creator/dc:creator fields with DBpedia resources. One such cross-reference is for Rembrandt van Rijn. In Europeana (and the API) you can therefore search with http://europeana.eu/portal/search.html?query=enrichment_agent_term%3A+%22http%3A%2F%2Fdbpedia.org%2Fresource%2FRembrandt%22 to find all the works cross-referenced in this way. Note the false positives!
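
The search URL above can also be built programmatically. What follows is only a minimal Python sketch: it assumes nothing beyond the portal address and the enrichment_agent_term field visible in that link, and the helper name europeana_agent_search_url is hypothetical.

```python
from urllib.parse import urlencode

# Portal search endpoint, copied from the link quoted in the comment above.
PORTAL_SEARCH = "http://europeana.eu/portal/search.html"


def europeana_agent_search_url(dbpedia_resource: str) -> str:
    """Build a portal search URL for works enriched with a DBpedia resource.

    Hypothetical helper: it only reproduces the query shape of the example link.
    """
    query = f'enrichment_agent_term: "{dbpedia_resource}"'
    # urlencode percent-encodes the colon, quotes and slashes, and turns the space into '+'.
    return PORTAL_SEARCH + "?" + urlencode({"query": query})


url = europeana_agent_search_url("http://dbpedia.org/resource/Rembrandt")
print(url)
```

Swapping in another DBpedia resource URI should give the equivalent search for that creator, provided Europeana has cross-referenced it.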

If you go to http://dbpedia.org/page/Rembrandt you'll find all the data needed to fill a creator template. If other large aggregators besides Europeana started using DBpedia in this way (to cross-reference and search on), it would also be easy to link to further sources for works by that creator. Freebase already does this: http://www.freebase.com/view/en/rembrandt (check the right-hand sidebar).
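
To illustrate how such data could end up in a creator template, here is a rough Python sketch. It does not query DBpedia: the values are typed in by hand, the function name render_creator is hypothetical, and the parameter names (Name, Birthdate, Birthloc and so on) are my assumption of the usual {{Creator}} fields, so check them against the template documentation before relying on them.

```python
def render_creator(fields: dict) -> str:
    """Render a {{Creator}} template call from a dict of parameter values.

    Empty values are skipped so the output only contains known data.
    """
    lines = ["{{Creator"]
    for key, value in fields.items():
        if value:
            lines.append(f" | {key} = {value}")
    lines.append("}}")
    return "\n".join(lines)


# Hand-entered example values; the parameter names are assumptions, not a verified schema.
rembrandt = {
    "Name": "Rembrandt",
    "Birthdate": "1606-07-15",
    "Birthloc": "Leiden",
    "Deathdate": "1669-10-04",
    "Deathloc": "Amsterdam",
    "Workloc": "",  # unknown in this example, so it is left out of the output
    "Homecat": "Rembrandt",
}

wikitext = render_creator(rembrandt)
print(wikitext)
```

A real tool would fill the dict from a DBpedia or Freebase lookup and would still need human review of the result.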

Other possible sources for "authority files" are VIAF (though here the license terms are fuzzy) and the authority files of the various National Libraries of Europe (e.g. http://openbiblio.net/2012/02/02/linked-data-at-the-biblioteca-nacional-de-espana/ ). DivadH (talk) 13:23, 26 February 2012

There are many possibilities. Since I do a lot of maintenance of Creator templates, I wrote User:JarektBot/Commons creator maintenance.py. I am hoping to add to it:
  • Adding/updating name={{LangSwitch}} with interwiki links to articles, defaulting to the Commons gallery if one exists. One would probably run interwiki.py on Category:Creator template home categories first.
  • Copying/merging {{Authority control}} data from Commons categories and/or DE Wiki (or en Wiki if they start using it). BTW, we could use a general tool similar to interwiki.py for synchronizing Authority control data between related Wikipedia pages and Commons.
  • Using VIAF to create new {{Authority control}} links.
  • Scraping dates, places, nationalities, occupations, gender, etc. from Commons categories, Wikipedia pages, DBpedia, etc.
I was also thinking about code that uses all of the above to create new templates for artists we already have categories for. I also think it would be a good idea either to create a database with creator info (and possibly info from Category:People_by_name) or to figure out a way to extract equivalent data from dumps.wikimedia or the toolserver. Currently, each time I work on matching a list of artists to Commons Creator templates, I create a spreadsheet by scanning all the pages. Such a DB could be an essential part of future batch upload tools, by matching museum records to our creators. --Jarekt (talk) 16:02, 27 February 2012 (UTC)
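
The matching step described above (museum records against existing Creator templates) could look roughly like the Python sketch below. This is not the code in Commons creator maintenance.py. The function names, the "Surname, Given" handling and the sample lists are all assumptions made for illustration, and exact matching on a normalized name is only the simplest possible strategy.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Reduce a personal name to a comparison key.

    Assumed rules: flip "Surname, Given" to "Given Surname", strip accents,
    lowercase, and collapse whitespace.
    """
    if "," in name:
        surname, given = name.split(",", 1)
        name = f"{given} {surname}"
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())


def build_index(creator_titles):
    """Map normalized names to Creator page titles."""
    prefix = "Creator:"
    index = {}
    for title in creator_titles:
        bare = title[len(prefix):] if title.startswith(prefix) else title
        index[normalize_name(bare)] = title
    return index


def match_records(record_names, index):
    """Return {record name: Creator title or None} using exact key lookup."""
    return {name: index.get(normalize_name(name)) for name in record_names}


# Illustrative sample data, not taken from a real page scan or museum export.
creators = ["Creator:Rembrandt", "Creator:Albrecht Dürer", "Creator:Johannes Vermeer"]
records = ["Dürer, Albrecht", "VERMEER, Johannes", "Unknown Master"]

matches = match_records(records, build_index(creators))
for record, title in matches.items():
    print(record, "->", title)
```

Records that come back as None would still need manual review or fuzzier matching, for example against alternative names or authority control identifiers.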