Commons:Machine-readable data

From Wikimedia Commons, the free media repository

On Wikimedia Commons, much of the metadata (including license and author) is not machine-readable. Because it is entered as free text on the file description page itself, the built-in MediaWiki API cannot return it as structured data. There are plans to move the metadata into the database[1], but this is unlikely to happen soon.

To make up for this, Wikimedia Commons uses a set of standard templates that have been made machine-readable to some extent through HTML elements. Some scripts already make use of this. Note that this data is also available on any wiki that uses Wikimedia Commons: there, it appears in the HTML of the File: page just like other local data.
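As a rough illustration of how a script might obtain the rendered HTML of a File: page, the sketch below builds a request URL for the MediaWiki `action=parse` API module. The function name `rendered_page_url` and the example file title are hypothetical, and the Commons endpoint is assumed; the code only constructs the URL and does not perform the request.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard api.php entry point on Wikimedia Commons.
API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def rendered_page_url(title):
    """Build a URL asking the API for the rendered HTML of a page.

    `title` is a full page title including the namespace, e.g. "File:Example.jpg".
    """
    params = {
        "action": "parse",   # render the page
        "page": title,       # page to render
        "prop": "text",      # return only the rendered HTML
        "format": "json",    # JSON envelope around the HTML
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = rendered_page_url("File:Example.jpg")
```

The HTML returned by such a request can then be searched for the tagged elements described in the rest of this page.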

Machine readable data

Machine readable data set by infobox templates

There are several standard infobox templates that tag their individual elements so that the information can be parsed. Several different styles of tags are used:

  • Microformat tags follow industry standards and can be parsed by existing tools.
  • <td> id attributes (identifiers) are custom markers that allow more complete tagging, but they have to be read by custom tools. Most general-purpose infoboxes have a two-column structure: column #1 holds the name of the field and column #2 holds its value.
    • Traditionally, <td> id attributes were used to tag the name cell in the first column of a row. To get the data, you need to read the contents of the following <td> cell in the second column.
    • The {{Creator}} and {{Institution}} templates have a more complicated structure, so the cells holding the actual data are tagged directly; these attributes are marked with a magenta background.
| Template | Fieldname | Description | <td> id attribute | Microformat | Comment |
|---|---|---|---|---|---|
| {{Information}} | description | description of the image | fileinfotpl_desc | hProduct.description | |
| {{Information}} | date | date the image was taken | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by {{date}} template |
| {{Information}} | source | source of the image | fileinfotpl_src | | |
| {{Information}} | author | author of the image | fileinfotpl_aut | | |
| {{Information}} | permission | permission/license for the image | fileinfotpl_perm | | |
| {{Information}} | other versions | other versions of the image | fileinfotpl_ver | | |
| {{Artwork}} | description | description of the artwork | fileinfotpl_desc | hProduct.description | |
| {{Artwork}} | date | date the artwork was created | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by {{date}} template |
| {{Artwork}} | source | source of the image | fileinfotpl_src | | |
| {{Artwork}} | artist | creator of the artwork | fileinfotpl_aut | "hProduct.fn value" | |
| {{Artwork}} | permission | permission/license for the image and artwork | fileinfotpl_perm | | |
| {{Artwork}} | other versions | other versions of the image | fileinfotpl_ver | | |
| {{Artwork}} | title | title of the artwork | fileinfotpl_art_title | hProduct.fn | |
| {{Artwork}} | object type | artwork object type | fileinfotpl_art_object_type | | |
| {{Artwork}} | medium | technique or medium of the artwork | fileinfotpl_art_medium | | |
| {{Artwork}} | dimensions | dimensions of the artwork | fileinfotpl_art_dimensions | | |
| {{Artwork}} | gallery | institution holding the artwork | fileinfotpl_art_gallery | | |
| {{Artwork}} | location | location of the artwork within the institution | fileinfotpl_art_location | hProduct.locality | |
| {{Artwork}} | accession number | accession number of the artwork | fileinfotpl_art_id | hProduct.identifier | |
| {{Artwork}} | object history | object history of the artwork | fileinfotpl_art_object_history | | |
| {{Artwork}} | exhibition history | exhibition history of the artwork | fileinfotpl_art_exhibition_history | | |
| {{Artwork}} | credit line | credit line of the artwork | fileinfotpl_art_credit_line | | |
| {{Artwork}} | inscriptions | inscriptions on the artwork | fileinfotpl_art_inscriptions | | |
| {{Artwork}} | notes | notes about the artwork | fileinfotpl_art_notes | | |
| {{Artwork}} | references | references related to the artwork | fileinfotpl_art_references | | |
| {{Book}} | Author | author of the book | fileinfotpl_author | | |
| {{Book}} | Editor | editor of the book | fileinfotpl_book_editor | | |
| {{Book}} | Translator | translator of the book | fileinfotpl_book_translator | | |
| {{Book}} | Illustrator | illustrator of the book | fileinfotpl_book_illustrator | | |
| {{Book}} | Title | title of the book | fileinfotpl_book_title | | |
| {{Book}} | Subtitle | subtitle of the book | fileinfotpl_book_subtitle | | |
| {{Book}} | Series title | series title of the book | fileinfotpl_book_series-title | | |
| {{Book}} | Authority control | authority control data | fileinfotpl_book_authority | | |
| {{Book}} | Publisher | publisher of the book | fileinfotpl_book_publisher | | |
| {{Book}} | Printer | printer of the book | fileinfotpl_book_printer | | |
| {{Book}} | Year of publication | date or year of publication of the book | fileinfotpl_date | | |
| {{Book}} | Place of publication | place or city of publication of the book | fileinfotpl_book_place-of-publication | | |
| {{Book}} | Language | language of the book | fileinfotpl_book_language | | |
| {{Book}} | Description | description of the book | fileinfotpl_desc | | |
| {{Creator}} | Name | name of creator | | vCard.fn | |
| {{Creator}} | Alternative names | alternative names of creator | fileinfotpl_creator_alt-name_value | vCard.nickname | |
| {{Creator}} | Description | nationality and occupation(s) of creator | fileinfotpl_creator_desc_value | vCard.note | |
| {{Creator}} | Date of death | date of death of creator | fileinfotpl_creator_deathdate_value | | |
| {{Creator}} | Date of birth | date of birth of creator | fileinfotpl_creator_birthdate_value | vCard.bday | |
| {{Creator}} | Location of birth/death | location of death of creator | fileinfotpl_creator_deathloc_value | | |
| {{Creator}} | Location of birth | location of birth of creator | fileinfotpl_creator_birthloc_value | | |
| {{Creator}} | Work period | work period of creator | fileinfotpl_creator_work-period_value | | |
| {{Creator}} | Work location | work location of creator | fileinfotpl_creator_work-location_valuev | | |
| {{Creator}} | Image | image of creator | fileinfotpl_creator_image | | |
| {{Creator}} | Authority control | authority control related to creator | fileinfotpl_creator_authority_value | | |
| {{FileContentsByBot}} | (various) | depends, see {{FileContentsByBot}} | (various) | hproduct-by-bot | large data set that is still growing, see {{FileContentsByBot}} |
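
The two ways of tagging cells described in this section can be read with a small HTML parser. The following is a minimal sketch using only Python's standard library. The names `InfoboxParser` and `extract_fields` and the sample HTML are made up for illustration. The rule that ids ending in `_value` sit on the value cell itself, while every other `fileinfotpl_` id sits on the name cell, is an assumption drawn from the id names listed here, not a documented guarantee.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Collect the text of table cells tagged with fileinfotpl_* ids."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # id attribute -> text of the value cell
        self._pending = None   # id seen on a name cell; value is the next cell
        self._current = None   # id of the value cell currently being read
        self._depth = 0        # nesting depth of cells inside the value cell
        self._chunks = []

    def _start_value(self, cell_id):
        self._current = cell_id
        self._depth = 1
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("td", "th"):
            return
        if self._current is not None:
            self._depth += 1   # nested table cell inside the value cell
            return
        cell_id = dict(attrs).get("id") or ""
        if cell_id.startswith("fileinfotpl_"):
            if cell_id.endswith("_value"):
                # Assumption: the id is on the value cell itself.
                self._start_value(cell_id)
            else:
                # Traditional scheme: the id is on the name cell.
                self._pending = cell_id
        elif self._pending is not None:
            self._start_value(self._pending)
            self._pending = None

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                text = " ".join("".join(self._chunks).split())
                self.fields[self._current] = text
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_fields(html):
    parser = InfoboxParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical markup, loosely modelled on the structure described above.
SAMPLE = """
<table>
<tr><td id="fileinfotpl_desc">Description</td><td>A <b>red</b> barn</td></tr>
<tr><td id="fileinfotpl_aut">Author</td><td>Jane Doe</td></tr>
<tr><td>Born</td><td id="fileinfotpl_creator_birthdate_value">1 May 1900</td></tr>
</table>
"""

fields = extract_fields(SAMPLE)
```

Markup inside a value cell is reduced to plain text here; a script that needs links or nested templates would have to keep the inner HTML instead.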

Machine readable data set by license templates

Introduced in October 2010, this markup uses classes of the form <span class="licensetpl_XXX">. The following classes are defined:

licensetpl
Wraps all of the other elements.
licensetpl_short
Short name of the license: “Public domain”, “CC-BY-SA-3.0”, “CC-by-2.0-FR”, etc.
licensetpl_long
Long name of the license: “Public domain”, “Creative Commons Attribution-Share Alike 3.0”, etc.
licensetpl_attr_req
Whether attribution is required. “true” or “false”.
licensetpl_attr
The required attribution, as free text.
licensetpl_link_req
Whether a link to the license is required. “true” or “false”.
licensetpl_link
The link to the license. “www.creativecommons.org/licenses/by-sa/XXX/YYY”
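
As a minimal sketch of how these classes might be read, the following parser collects the text of each `licensetpl_XXX` span into one dictionary per `licensetpl` wrapper. The names `LicenseParser` and `extract_licenses` and the sample markup are hypothetical; in particular, the sketch assumes the wrapper may be any element, while the individual fields are `<span>` elements as stated above.

```python
from html.parser import HTMLParser

PREFIX = "licensetpl_"


class LicenseParser(HTMLParser):
    """Collect licensetpl_* fields, grouped by their licensetpl wrapper."""

    def __init__(self):
        super().__init__()
        self.licenses = []   # one dict per wrapper element
        self._stack = []     # one entry per open <span>: field name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "licensetpl" in classes:
            self.licenses.append({})   # a new wrapper starts a new record
        if tag != "span":
            return
        field = None
        for name in classes:
            if name.startswith(PREFIX):
                field = name[len(PREFIX):]
        if field is not None:
            if not self.licenses:      # field found outside any wrapper
                self.licenses.append({})
            self.licenses[-1].setdefault(field, "")
        self._stack.append(field)

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost open span that names a field.
        for field in reversed(self._stack):
            if field is not None:
                record = self.licenses[-1]
                record[field] = record.get(field, "") + data
                break


def extract_licenses(html):
    parser = LicenseParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in lic.items()} for lic in parser.licenses]


# Hypothetical markup using the class names listed above.
SAMPLE_LICENSE = """
<div class="licensetpl">
  <span class="licensetpl_short">CC-BY-SA-3.0</span>
  <span class="licensetpl_long">Creative Commons Attribution-Share Alike 3.0</span>
  <span class="licensetpl_attr_req">true</span>
  <span class="licensetpl_link_req">true</span>
</div>
"""

licenses = extract_licenses(SAMPLE_LICENSE)
```

The values of `attr_req` and `link_req` are left as the strings “true” and “false”; converting them to booleans is left to the calling script.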

Templates setting this information

Usage

Scripts using machine-readable data

External tools

See also

Notes

  1. bugzilla:17503