English subtitles for clip: File:OpenRefine Commons - editing - retrieve structured data from Commons files.webm
1
00:00:06,560 --> 00:00:12,000
I have an OpenRefine project here, based on files on Wikimedia Commons.

2
00:00:12,000 --> 00:00:13,920
I am interested in finding out

3
00:00:13,920 --> 00:00:16,760
whether the files that I have selected here

4
00:00:16,760 --> 00:00:19,240
already have certain structured data statements.

5
00:00:19,240 --> 00:00:23,720
So that I'm sure that I'm not adding duplicates of them.

6
00:00:23,720 --> 00:00:27,080
And just to explore what is already there.

7
00:00:27,080 --> 00:00:30,240
In practice I'm interested in finding out:

8
00:00:30,240 --> 00:00:33,760
whether these files already have a Depicts statement,

9
00:00:33,760 --> 00:00:36,040
whether they already have a Creator,

10
00:00:36,040 --> 00:00:37,080
and a Collection,

11
00:00:37,080 --> 00:00:41,000
because I would be interested in adding this information to the files.

12
00:00:41,000 --> 00:00:45,960
How do I check in OpenRefine whether these files already have this information?

13
00:00:45,960 --> 00:00:49,760
I can create columns with that structured data.

14
00:00:49,760 --> 00:00:51,320
I do that as follows.

15
00:00:51,320 --> 00:00:56,680
I go to the file column menu.

16
00:00:56,680 --> 00:01:00,120
The files already need to be reconciled with Wikimedia Commons.

17
00:01:00,120 --> 00:01:02,240
And you can see that that has happened

18
00:01:02,240 --> 00:01:05,519
if the column has a dark green line,

19
00:01:05,519 --> 00:01:07,080
if the file names are blue,

20
00:01:07,080 --> 00:01:12,400
and you can click on them and open them in a new tab,

21
00:01:12,400 --> 00:01:15,880
and, if you have the Wikimedia Commons extension installed in OpenRefine,

22
00:01:15,880 --> 00:01:19,360
you should also see thumbnails of the files.

23
00:01:19,360 --> 00:01:23,880
I am selecting the menu of the file column.

24
00:01:23,880 --> 00:01:30,640
I select the option "Edit column..." - "Add columns from reconciled values".
25
00:01:30,640 --> 00:01:36,440
Then OpenRefine will present me with some options of structured data that I can retrieve.

26
00:01:36,440 --> 00:01:41,280
As I said, I was interested in the Collection of the files.

27
00:01:41,280 --> 00:01:44,000
I was also interested in the Creator,

28
00:01:44,000 --> 00:01:47,680
and whether they have a Depicts statement.

29
00:01:47,680 --> 00:01:50,240
In the preview you can already see that.

30
00:01:50,240 --> 00:01:54,400
It shows me in advance some things that I'm interested in.

31
00:01:54,400 --> 00:01:58,160
Let's say... I'm also clicking on Inception.

32
00:01:58,160 --> 00:02:00,720
But then I decide I actually am not interested

33
00:02:00,720 --> 00:02:03,920
to see whether these files have an Inception statement.

34
00:02:03,920 --> 00:02:07,120
Then I can remove this option again.

35
00:02:07,120 --> 00:02:09,199
I click "OK".

36
00:02:09,199 --> 00:02:13,880
And then OpenRefine will load columns with that structured data for me.

37
00:02:13,880 --> 00:02:19,040
And, as you can see, there is no information at all about the collection.

38
00:02:19,040 --> 00:02:22,640
There's no structured data yet around collection.

39
00:02:22,640 --> 00:02:26,960
But all the files already have Creator statements.

40
00:02:26,960 --> 00:02:29,680
But none of them has Depicts statements.
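
The check shown in the clip (does a file already have Depicts, Creator, and Collection statements?) can also be sketched outside OpenRefine. The following is a minimal Python illustration, not what OpenRefine does internally. It assumes you already have a file's structured data as a MediaInfo-style JSON entity, for example as the Commons API's `wbgetentities` action might return it. The function names and the sample entity (including `M12345` and its single Creator value) are hypothetical. The property IDs P180 (Depicts), P170 (Creator), and P195 (Collection) are the ones used on Commons and Wikidata.

```python
# Property IDs for the three statements checked in the clip.
PROPERTIES = {"P180": "Depicts", "P170": "Creator", "P195": "Collection"}


def statement_values(entity, property_id):
    """Return the main values of all statements for one property.

    Assumption: the entity keeps its statements under "statements"
    (or "claims"), keyed by property ID. A file without structured
    data may carry an empty list there instead of a dict.
    """
    statements = entity.get("statements") or entity.get("claims") or {}
    if not isinstance(statements, dict):
        return []
    values = []
    for statement in statements.get(property_id, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            # "somevalue" / "novalue" statements have no datavalue.
            values.append(snak.get("snaktype", "unknown"))
            continue
        value = snak.get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            values.append(value["id"])  # item reference, e.g. "Q42"
        else:
            values.append(str(value))
    return values


def summarize(entity):
    """Map each property label to the values found on the file."""
    return {label: statement_values(entity, pid)
            for pid, label in PROPERTIES.items()}


# Hypothetical sample: a file with a Creator statement only.
sample_entity = {
    "id": "M12345",
    "statements": {
        "P170": [{
            "mainsnak": {
                "snaktype": "value",
                "property": "P170",
                "datavalue": {
                    "type": "wikibase-entityid",
                    "value": {"entity-type": "item", "id": "Q42"},
                },
            },
        }],
    },
}

summary = summarize(sample_entity)
for label, values in summary.items():
    print(label, values if values else "(none)")
```

An empty list for a property corresponds to an empty column in OpenRefine after "Add columns from reconciled values": the file has no statement of that kind yet, so adding one would not create a duplicate.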