Commons:Batch uploading/Art of Japan in the Rijksmuseum

From Wikimedia Commons, the free media repository

Art of Japan in the Rijksmuseum

A consistent, best-practice upload of 1,679 photographs of Japanese artworks from the Rijksmuseum.

Using this as an initial themed upload might enable a far larger upload of images from the collections using the same code and process.

Technical

The upload will use the GWToolset.

A credit template of {{GWToolset Rijksmuseum}} is available.

RM API
API guide: http://rijksmuseum.github.io

Having examined an initial upload based on an export from Europeana, it seems worth exploring the Rijksmuseum API before proceeding. Reasons include:

  • The RM metadata is available in Dutch and English, and both can be reused on the image page, though in practice most of the "English" text may be Dutch anyway.
  • Some additional fields are available in the RM API that do not appear in the Europeana data, such as exhibition history (places, years). Whether this is helpful for Commons would need exploration.
  • The image is identical whether pulled from the RM API or Europeana, including the EXIF data.
  • A persistent identifier via handle.net should probably be used, rather than the current URL (though including both might be a good option).
  • Dimensions are nicely broken up in the metadata and so could be displayed with fairly intelligent use of Commons templates for dimensions.
  • In the example, "physicalMedium" is given as porselein in the Dutch record, but when the English metadata is requested the same field shows plate (dishes), which is actually the "objectTypes" value. The same inconsistency has been carried over to the Europeana data. Importing the English version from the API may therefore be less useful than limiting this upload to the Dutch metadata, or offering the English only as a suggested translation.
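
The point above about dimensions could be sketched roughly as follows. This is only an illustration, not the code used for the upload: the helper name size_template and the Dutch-to-English mapping are hypothetical, and it assumes that the Commons {{Size}} template accepts named parameters such as unit, height and diameter. The input shape is taken from the "dimensions" field in the example record.

```python
# Assumed mapping from Dutch dimension names in the API to English
# parameter names; only "hoogte" appears in the example record.
DUTCH_TO_ENGLISH = {
    "hoogte": "height",
    "breedte": "width",
    "diepte": "depth",
    "lengte": "length",
    "dikte": "thickness",
    "diameter": "diameter",
}


def size_template(dimensions):
    """Render an API 'dimensions' list as {{Size}} wikitext (hypothetical helper)."""
    if not dimensions:
        return ""
    # Assumes every entry uses the same unit; take it from the first entry.
    unit = (dimensions[0].get("unit") or "").lower()
    parts = ["unit=" + unit]
    for dim in dimensions:
        kind = (dim.get("type") or "").lower()
        name = DUTCH_TO_ENGLISH.get(kind, kind)
        # Dutch records use a decimal comma ("15,9"); normalise to a point.
        value = (dim.get("value") or "").replace(",", ".")
        if name and value:
            parts.append(name + "=" + value)
    return "{{Size|" + "|".join(parts) + "}}"


dutch = [
    {"part": None, "type": "diameter", "value": "15,9", "unit": "CM"},
    {"part": None, "type": "hoogte", "value": "2,9", "unit": "CM"},
]
print(size_template(dutch))  # {{Size|unit=cm|diameter=15.9|height=2.9}}
```

Normalising the decimal comma and the upper-case unit means the Dutch record alone would be enough to fill the template.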

Comparison of the NL and EN API responses for the example image, showing only the fields that differ:

 <element> <value> # First in English then Dutch
 objectTypes  [u'plate (dishes)'] 
 objectTypes  [u'bord (vaatwerk)'] 

 objectCollection  [] 
 objectCollection  [u'keramiek'] 

 scLabelLine  anoniem, 1700 - 1725, plate (dishes) 
 scLabelLine  anoniem, 1700 - 1725, porselein 

 id  en-BK-1968-212 
 id  nl-BK-1968-212 

 materials  [] 
 materials  [u'porselein', u'glazuur'] 

 subTitle  d 15.9cm × h 2.9cm 
 subTitle  d 15,9CM × h 2,9CM 

 dimensions  [{u'part': None, u'type': u'diameter', u'value': u'15.9', u'unit': u'cm'}, {u'part': None, u'type': u'height', u'value': u'2.9', u'unit': u'cm'}] 
 dimensions  [{u'part': None, u'type': u'diameter', u'value': u'15,9', u'unit': u'CM'}, {u'part': None, u'type': u'hoogte', u'value': u'2,9', u'unit': u'CM'}] 

 language  en 
 language  nl 

 physicalMedium  plate (dishes) 
 physicalMedium  porselein 

 acquisition  {u'date': u'1968-01-01T00:00:00Z', u'method': None, u'creditLine': u'B. Westendorp-Osieck Bequest, Amsterdam'} 
 acquisition  {u'date': u'1968-01-01T00:00:00Z', u'method': None, u'creditLine': u'Legaat van mevrouw B. Westendorp-Osieck, Amsterdam'} 
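
A comparison like the one above can be produced with a few lines of Python. The sketch below is a minimal illustration and not the script actually used: it assumes the two API responses have already been parsed into dictionaries, the function name diff_records is hypothetical, and the sample data is a cut-down copy of the example record plus one made-up field that is identical in both languages.

```python
def diff_records(english, dutch):
    """Return {field: (english_value, dutch_value)} for fields that differ."""
    differences = {}
    # Walk the union of keys so a field missing from one record still shows up.
    for field in sorted(set(english) | set(dutch)):
        en_value = english.get(field)
        nl_value = dutch.get(field)
        if en_value != nl_value:
            differences[field] = (en_value, nl_value)
    return differences


english = {
    "id": "en-BK-1968-212",
    "language": "en",
    "materials": [],
    "physicalMedium": "plate (dishes)",
    "objectNumber": "BK-1968-212",  # hypothetical field, identical in both
}
dutch = {
    "id": "nl-BK-1968-212",
    "language": "nl",
    "materials": ["porselein", "glazuur"],
    "physicalMedium": "porselein",
    "objectNumber": "BK-1968-212",  # hypothetical field, identical in both
}

# Print each differing field twice: first the English value, then the Dutch.
for field, (en_value, nl_value) in diff_records(english, dutch).items():
    print(field, en_value)
    print(field, nl_value)
    print()
```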

From this example and the two other images uploaded by DH, the English version of the metadata appears to miss some fields and may be inconsistent. It seems better to stick to the Dutch record only. -- (talk) 14:17, 4 May 2014 (UTC)

Copyright

The choice of license can be based on this statement: "All data and all images made available through the API are either in the public domain or are subject to a CC0 license." This should mean that the photographs themselves are released as CC0, with copyright in the art object being a separate issue (keeping to a cut-off of before the 20th century should mean PD can apply).
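
A date cut-off of this kind could be applied when selecting records. The sketch below is only an illustration under stated assumptions: the latest_year key and the function name is_before_cutoff are hypothetical, since how the latest production year is exposed in the metadata would need checking, and apart from BK-1968-212 (dated 1700 - 1725 in the example) the records are made up.

```python
def is_before_cutoff(latest_year, cutoff=1900):
    """True if the latest possible production year is before the cut-off."""
    # Undated objects fail the check, to stay on the safe side.
    return latest_year is not None and latest_year < cutoff


records = [
    {"objectNumber": "BK-1968-212", "latest_year": 1725},
    {"objectNumber": "EXAMPLE-1", "latest_year": 1950},  # made-up record
    {"objectNumber": "EXAMPLE-2", "latest_year": None},  # made-up, undated
]

eligible = [r["objectNumber"] for r in records if is_before_cutoff(r["latest_year"])]
print(eligible)  # ['BK-1968-212']
```

A different cut-off year can be passed as the second argument if the policy changes.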

Progress

David Haskiya laid the foundation for this batch project; after he ran short on time, Fæ offered to pick it up.

Action Status Where
DH: Pass on background and current xml file.

Three initial example files uploaded.

["BK-1968-205", "BK-1968-213-A", "BK-1968-212"]
Status:    Done Art of Japan in the Rijksmuseum
Fæ: Initial investigation (May 2014):
  • Review comments on template by Jean-Frédéric [1]
  • Check structure of current xml
  • Review metadata potential from Rijksmuseum API
    • API key ✓ Done
    • Manual experiments ✓ Done
    • Automate API import and map to a suitable GWToolset available template ✓ Done
Status:    Done -
Fæ: Run test upload on beta cluster. (May 2014)
  • Pull xml test sets for 10 to 200 artefacts ✓ Done
  • Explore category checking ✓ Done
  • Ask for feedback[2] ✓ Done
  • Follow up on GWT/Artwork template format bugs (filename generation, wiki-code handling in parameters) ✓ Done
Status:    Done -
Fæ: Do real uploads, starting with an initial 'page' of 20-200 files for feedback and checking.

Planned cut-off of 1923 to avoid any possible contention on copyright, for the moment at least.

Status:    Done
2,496 images uploaded
Art of Japan in the Rijksmuseum