English subtitles for clip: File:Aaron Swartz - The Network Transformation.webm

1
00:00:03,200 --> 00:00:08,080
The change in the architecture of the media 
is completely connected to a change in the architecture of control

2
00:00:08,119 --> 00:00:11,559
With the broadcast system you have one person in one station

3
00:00:11,599 --> 00:00:13,719
deciding what gets put out over the airwaves. 

4
00:00:13,800 --> 00:00:17,640
When you have a distributed network like the internet everybody can be a server.

5
00:00:17,679 --> 00:00:20,559
There's no distinction between the broadcaster and the receiver:

6
00:00:20,879 --> 00:00:22,279
every computer does both.

7
00:00:22,399 --> 00:00:27,079
You can take your home laptop and run a server off of it that can distribute movies and music

8
00:00:27,120 --> 00:00:31,640
and webpages and email in the same way that the biggest computers at Google can.

9
00:00:31,679 --> 00:00:35,519
There's no fundamental difference between the computers they have in the racks in their server rooms

11
00:00:35,520 --> 00:00:36,920
and what you have on your desk.

12
00:00:38,039 --> 00:00:39,560
In the old system of broadcasting, 

13
00:00:39,600 --> 00:00:42,800
you were fundamentally limited by the amount of space in the airwaves

14
00:00:42,759 --> 00:00:46,439
You could only send out 10 channels over the airwaves in television

15
00:00:46,439 --> 00:00:48,519
or even with cable you had 500 channels.

16
00:00:48,679 --> 00:00:50,640
On the internet, everybody can have a channel; 

17
00:00:50,759 --> 00:00:53,039
everyone can get a blog or a MySpace page;

18
00:00:53,079 --> 00:00:55,159
everyone has a way of expressing themselves

19
00:00:55,479 --> 00:00:57,719
and so what you see now is not a question of

20
00:00:57,719 --> 00:00:59,119
who gets access to the airwaves, 

21
00:00:59,280 --> 00:01:00,840
it's a question of who gets control

22
00:01:00,840 --> 00:01:02,400
over the ways you find people.

23
00:01:02,799 --> 00:01:05,399
You start seeing power centralising in sites like Google,

24
00:01:05,439 --> 00:01:08,480
these sort of 'gatekeepers' that tell you where on the internet you want to go,

25
00:01:08,599 --> 00:01:11,559
the people who provide you your sources of news and information.

26
00:01:11,599 --> 00:01:15,079
So it's not that only certain people have a license to speak;

27
00:01:15,280 --> 00:01:16,640
now everyone has a license to speak,

28
00:01:16,680 --> 00:01:17,760
it's a question of who gets heard.

29
00:01:19,400 --> 00:01:22,600
So one of the biggest questions we're facing in a world of many speakers is:

30
00:01:22,680 --> 00:01:24,040
how do you find what's good?

31
00:01:24,040 --> 00:01:27,240
Are we gonna go to a system like the old media where you go to CNN

32
00:01:27,359 --> 00:01:29,239
and they pick a handful of people to focus on 

33
00:01:29,280 --> 00:01:30,239
and you read what they say

34
00:01:30,280 --> 00:01:32,040
or are we going to go with something more like the internet

35
00:01:32,200 --> 00:01:35,200
where everybody has a chance of being heard, a more democratic system. 

36
00:01:35,480 --> 00:01:38,480
One of the most interesting technologies for doing something like that

37
00:01:38,439 --> 00:01:39,840
is a system called collaborative filtering,

38
00:01:40,079 --> 00:01:43,640
where everybody expresses their opinions on what they like and what they don't like

39
00:01:43,680 --> 00:01:47,400
and the computer tries to match you up with other people who have similar preferences

40
00:01:47,439 --> 00:01:50,719
and recommend you things that they also like that you didn't know about before.

41
00:01:50,760 --> 00:01:52,800
It's the same kind of system you see on Amazon

42
00:01:52,840 --> 00:01:55,159
where people who bought this book also bought this book.
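
The matching process described above can be sketched in a few lines of Python. This is only an illustrative toy, not the method any particular site uses: the user names, item names, the like/dislike encoding (1 and -1) and the choice of cosine similarity are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical opinions: 1 = liked, -1 = disliked (all names are invented).
ratings = {
    "ann":  {"book_a": 1, "book_b": 1, "book_c": -1},
    "bob":  {"book_a": 1, "book_b": 1, "book_d": 1},
    "cara": {"book_a": -1, "book_c": 1, "book_e": 1},
}

def similarity(a, b):
    """Cosine similarity over the items both people have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Rank items the user has not rated, weighted by how similar the raters are."""
    mine = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = similarity(mine, theirs)
        if sim <= 0:
            continue  # skip people with no overlap or opposite tastes
        for item, value in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + sim * value
    liked = [item for item, score in scores.items() if score > 0]
    return sorted(liked, key=lambda item: -scores[item])

print(recommend("ann", ratings))  # ['book_d']
```

Here ann and bob agree on everything they have both rated, so bob's other favourite is suggested to ann, while cara's opposite tastes are ignored.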

43
00:01:55,159 --> 00:01:57,920
People are trying to experiment with that not only with books 

44
00:01:58,000 --> 00:02:01,079
but with blogs, web pages and news stories all across the internet,

45
00:02:01,120 --> 00:02:04,320
they're trying out ways of finding things that you never would have heard of before 

46
00:02:04,400 --> 00:02:05,719
and bringing them in front of you.

47
00:02:06,200 --> 00:02:08,080
Mass media had this fundamental paradox

48
00:02:08,400 --> 00:02:10,319
because it was aiming at a huge audience 

49
00:02:10,360 --> 00:02:12,760
but it wanted to convince everybody they were an individual

50
00:02:12,759 --> 00:02:15,159
so you see all these ads on television all the time like

51
00:02:15,360 --> 00:02:17,560
'buck the trend, buy these jeans' right?

52
00:02:17,599 --> 00:02:20,039
And it's on a show that 4 million people are watching,

53
00:02:20,280 --> 00:02:23,840
you're not going to buck a trend by doing what 4 million other people are. 

54
00:02:24,000 --> 00:02:26,840
Now that the internet is actually making these niche things possible

55
00:02:27,039 --> 00:02:29,039
the mass media is incredibly threatened

56
00:02:29,400 --> 00:02:32,879
no longer this idea of bucking the crowd and being your own

57
00:02:33,479 --> 00:02:36,280
it's no longer just a theory; you can actually do it on the internet.

58
00:02:36,280 --> 00:02:40,400
And what we're starting to see is tools that take power away from the big conglomerates 

59
00:02:40,560 --> 00:02:42,199
and start to distribute it to small groups.

60
00:02:42,479 --> 00:02:47,239
And so there are a bunch of issues in a system like that: there are questions of funding, you know,

61
00:02:47,280 --> 00:02:50,400
how will these small groups get paid and how will the random blogger be able to live

62
00:02:50,520 --> 00:02:53,400
in a way that an investigative journalist can now

63
00:02:53,599 --> 00:02:55,879
because there's one giant source of advertising.

64
00:02:55,919 --> 00:02:59,799
You know, there are questions of finding people: how will I be able to find the stuff I'm interested in,

65
00:02:59,879 --> 00:03:01,599
and the stuff that's trustworthy and reliable

66
00:03:02,080 --> 00:03:04,719
and so for each of these there are new technologies

67
00:03:04,719 --> 00:03:06,520
people are trying all sorts of different things

68
00:03:06,680 --> 00:03:08,240
and one of the most exciting things about the internet is that

69
00:03:08,280 --> 00:03:12,400
there is still experimentation in this, since everybody can just go up and start a website

70
00:03:12,400 --> 00:03:15,640
with a new piece of technology to try and solve one of these problems.

71
00:03:15,680 --> 00:03:18,640
We're seeing lots of different possibilities, lots of different funding models

72
00:03:18,680 --> 00:03:22,920
lots of different recommendation systems and who knows what will work best.

73
00:03:23,439 --> 00:03:25,759
We have a chance to try it all and see what falls out.

74
00:03:25,800 --> 00:03:27,840
So there are a couple of interesting funding models:

75
00:03:27,879 --> 00:03:30,479
One of course is the sort of standard method of advertising,

76
00:03:30,599 --> 00:03:35,039
you go to a bunch of big corporate sponsors and instead of having them fund a television show

77
00:03:35,080 --> 00:03:36,400
you have them fund your webpage.

78
00:03:36,400 --> 00:03:39,800
But a more interesting one is you do the same thing with niche groups

79
00:03:40,000 --> 00:03:45,039
instead of going to IBM or Ford or some big company and having them buy a banner ad on your site, 

80
00:03:45,120 --> 00:03:47,599
you go to people who actually care about the readers you have.

81
00:03:47,680 --> 00:03:50,719
If you're a design weblog you go to design companies.

82
00:03:50,759 --> 00:03:53,719
If you're a politics weblog, you go to other politicians.

83
00:03:53,879 --> 00:03:58,039
You have a very targeted narrow group of people who are really interested in the subject,

84
00:03:58,039 --> 00:04:01,120
that's an audience advertisers really love.

85
00:04:01,759 --> 00:04:04,560
Another possibility is to turn directly to your readers for support.

86
00:04:04,680 --> 00:04:07,719
So you see blogs say, I wanna go on a trip to New Hampshire 

87
00:04:07,719 --> 00:04:10,400
to cover the American political conventions.

88
00:04:10,439 --> 00:04:11,599
Will you support me?

89
00:04:12,000 --> 00:04:13,400
And the readers pour in money.

90
00:04:13,400 --> 00:04:16,720
These people are very dedicated if they feel like they have a personal connection

91
00:04:16,720 --> 00:04:17,800
with the person writing,

92
00:04:17,800 --> 00:04:20,240
so they are eager to spend money to support it! 

93
00:04:21,000 --> 00:04:23,879
Another thing is that you work simply off of volunteer labor,

94
00:04:23,879 --> 00:04:27,040
you have people who have a day job, who are experts on a subject

95
00:04:27,040 --> 00:04:30,280
and they just enjoy talking about it so they write stuff in their free time

96
00:04:30,360 --> 00:04:31,199
and publish it on the internet.

97
00:04:31,439 --> 00:04:33,920
Or they have readers who read their site and contribute stuff

98
00:04:34,000 --> 00:04:36,399
and it gets compiled into one exciting source.

99
00:04:36,680 --> 00:04:39,040
So I think there are lots of different experiments

100
00:04:39,040 --> 00:04:40,760
and people are trying in lots of different ways.

101
00:04:41,000 --> 00:04:46,480
That's one of the errors you had with television, right, was that television could only provide one level of interest.

102
00:04:48,600 --> 00:04:51,720
It was funded based on advertising, not on how much people cared about the program.

103
00:04:52,360 --> 00:04:56,280
Advertisers were going to pay the same no matter how exciting or how compelling 

104
00:04:56,279 --> 00:04:58,119
or how interested an audience was in the show

105
00:04:58,160 --> 00:05:01,880
so what you ended up with was fairly boring shows that appealed to lots of people

106
00:05:01,920 --> 00:05:04,400
because that's what advertisers wanted

107
00:05:04,399 --> 00:05:05,919
they wanted lots of people watching the shows.

108
00:05:06,160 --> 00:05:08,280
Whereas in a normal market economy what happens is

109
00:05:08,279 --> 00:05:10,279
if you really want something you pay more for it

110
00:05:10,360 --> 00:05:12,040
you just can't do that with television. 

111
00:05:12,279 --> 00:05:15,399
So one of the interesting things about broadcast is that a lot of what you like 

112
00:05:15,399 --> 00:05:17,039
depends on what other people like

113
00:05:17,399 --> 00:05:19,239
there are only so many shows out there

114
00:05:19,240 --> 00:05:20,160
they are all kind of bland

115
00:05:20,199 --> 00:05:21,839
so what happens, you have these megahits

116
00:05:22,000 --> 00:05:26,920
like American Idol or Lost, where everybody at the water cooler is talking about this show,

117
00:05:27,000 --> 00:05:29,879
so you have to watch it because otherwise you can't keep up with them.

118
00:05:30,120 --> 00:05:33,399
Whenever social factors get involved

119
00:05:33,560 --> 00:05:35,639
you have this sort of rich-get-richer process:

120
00:05:35,680 --> 00:05:40,040
one thing takes off because that's what everybody else is doing!

121
00:05:40,360 --> 00:05:43,720
One nice thing about the internet is that it allows for so much more variety

122
00:05:43,720 --> 00:05:47,960
that niche products can get so much more attention and interest

123
00:05:48,000 --> 00:05:50,720
So they've run the numbers and there is this proven mathematical fact

124
00:05:50,759 --> 00:05:54,639
that as long as some percentage of what you care about is whether other people

125
00:05:54,680 --> 00:05:58,040
like it or not, you're gonna end up 
with these patterns of hits and failures.

126
00:05:58,680 --> 00:06:01,720
If you have two things which are equivalent in quality,

127
00:06:01,720 --> 00:06:04,640
and one of them is liked by one more person 
than the other one,

128
00:06:04,680 --> 00:06:06,079
you're going to go to that one.

129
00:06:06,120 --> 00:06:08,399
There's some small chance that you're going to go to that one 

130
00:06:08,399 --> 00:06:09,719
and everybody's going to start going to that one

131
00:06:09,759 --> 00:06:11,120
and all of a sudden you have Harry Potter.

132
00:06:11,160 --> 00:06:14,840
This one book plucked out of nowhere that suddenly becomes this massive mega-hit,

133
00:06:14,879 --> 00:06:18,079
not because it's a hundred million times better written than every other book

134
00:06:18,120 --> 00:06:19,840
but simply because everybody's reading it.
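
The rich-get-richer effect described above can be illustrated with a small simulation. This is a toy model under stated assumptions, not necessarily the analysis the speaker refers to: every item is equally good, and the `social` parameter (a name invented for this sketch) is the probability that a person copies the crowd instead of choosing at random.

```python
import random

def simulate(n_items, n_people, social, seed):
    """Return how many people chose each of n_items equally good items.

    With probability `social` a person picks an item in proportion to its
    current popularity; otherwise they pick uniformly at random.
    """
    rng = random.Random(seed)
    counts = [0] * n_items
    for _ in range(n_people):
        if sum(counts) > 0 and rng.random() < social:
            # Copy the crowd: more popular items are more likely to be chosen.
            choice = rng.choices(range(n_items), weights=counts)[0]
        else:
            # Ignore the crowd: every item has the same chance.
            choice = rng.randrange(n_items)
        counts[choice] += 1
    return counts

def top_share(counts):
    """Fraction of all choices that went to the single most popular item."""
    return max(counts) / sum(counts)

for social in (0.0, 0.9):
    shares = [top_share(simulate(20, 5000, social, seed)) for seed in range(5)]
    average = sum(shares) / len(shares)
    print(f"social={social}: average share of the biggest hit = {average:.2f}")
```

Since all items are identical in quality, any gap between the two printed shares comes only from people following other people's choices.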

135
00:06:21,000 --> 00:06:22,959
And putting stuff on the internet doesn't change that,

136
00:06:23,079 --> 00:06:27,039
you still care about what your friends like, you still wanna read what everybody else is talking about,

137
00:06:27,040 --> 00:06:30,720
you still wanna do what's popular because you think maybe other people have a valid opinion

138
00:06:30,759 --> 00:06:34,159
and maybe you wanna talk to them about it,
maybe you want to join part of this community.

139
00:06:34,199 --> 00:06:37,240
But whatever your reason is, 
as long as you care about other people's opinions 

140
00:06:37,240 --> 00:06:38,720
you're going to end up with these hits. 

142
00:06:39,040 --> 00:06:41,640
You just have this social signifier that everybody cares about.

143
00:06:41,680 --> 00:06:45,040
Everybody's watching American Idol,
doesn't matter how good the show is.

144
00:06:45,079 --> 00:06:49,279
I mean it has to be somewhat decent so people watch it, 
but once everybody's watching it,

145
00:06:49,480 --> 00:06:54,160
and everybody's talking about it, you know, 
it suddenly becomes this megahit for no real reason,

146
00:06:54,199 --> 00:06:56,039
right, just because it's a social phenomenon.

147
00:06:56,079 --> 00:07:00,560
And what television does, it chops off the tail and it throws away all the other shows 

148
00:07:00,600 --> 00:07:03,400
people would like but don't care enough about to be megahits

149
00:07:03,399 --> 00:07:06,599
and instead pours all of its money into these cheap-to-produce shows.

150
00:07:06,680 --> 00:07:08,639
Well you can't get rid of hits, right,

151
00:07:08,720 --> 00:07:11,040
it's a fact that people would want to do what their friends are doing

152
00:07:11,040 --> 00:07:15,480
you can't avoid that but what you can do is say 
there's the whole rest of the world out there

153
00:07:15,519 --> 00:07:18,719
there's a whole rest of what people care about other than what everybody else is doing.

154
00:07:19,000 --> 00:07:22,399
Everybody has their own particular interests, everybody has something that fascinates them

155
00:07:22,399 --> 00:07:25,719
and what the internet does is it allows them to do that,

156
00:07:25,720 --> 00:07:29,040
to get involved and find other people who share these things.

157
00:07:29,079 --> 00:07:33,199
One of the exciting things about Wikipedia 
is that it doesn't just have articles on

158
00:07:33,240 --> 00:07:36,160
you know, the 100 most popular things or the 1000 most popular things;

159
00:07:36,199 --> 00:07:39,240
you can pick the most obscure subject in the world 
and there's an article about it.

160
00:07:39,279 --> 00:07:43,039
Because for EVERYTHING, 
there's someone who cares a great deal about it

161
00:07:43,040 --> 00:07:46,439
and that's what television, 
that's what radio doesn't provide, but the internet does!

162
00:07:46,600 --> 00:07:50,040
It provides a way for you to get in touch 
with those other people who really  

163
00:07:50,079 --> 00:07:52,240
care about this completely obscure thing.

164
00:07:52,480 --> 00:07:56,040
It doesn't just go into the direction of topic, 
it goes into the direction of time.

165
00:07:56,040 --> 00:07:58,480
You can go back in time and find all the shows that have been canceled,

166
00:07:58,519 --> 00:08:00,799
find all the articles that have been deleted,

167
00:08:00,839 --> 00:08:04,039
you can go back and find everything that has been lost in major culture

168
00:08:04,040 --> 00:08:05,439
and it's got a place on the internet,

169
00:08:05,480 --> 00:08:09,640
YouTube music videos from the 70s and the 80s 
that you can't find anywhere these days

170
00:08:09,680 --> 00:08:11,439
you can watch at your leisure.

171
00:08:11,480 --> 00:08:13,640
I think lessening the power of the hits

172
00:08:13,680 --> 00:08:16,720
bringing down the things from the top 
and making it more egalitarian

173
00:08:16,720 --> 00:08:18,040
is something we should always strive for

174
00:08:18,079 --> 00:08:21,039
it may be really difficult, it may not be super possible

175
00:08:21,079 --> 00:08:24,039
but it's something to hope for, to strive for

176
00:08:24,079 --> 00:08:26,719
and what that means is 

177
00:08:26,720 --> 00:08:31,160
throwing away as much as possible, all the things that give you hints about

178
00:08:31,199 --> 00:08:34,039
you should do this because other people like it.

179
00:08:34,039 --> 00:08:37,519
It's very tempting 
when you're building a website or programming system

180
00:08:37,519 --> 00:08:39,840
to start sorting the things that are really popular to the top

181
00:08:39,879 --> 00:08:43,039
but all that does is make it less democratic and less fair.

182
00:08:43,120 --> 00:08:48,240
You have to have continual pressure, 
to try and pull things from the bottom of the tail up,

183
00:08:48,279 --> 00:08:51,399
give everybody a chance to look at everything and if you do that,

184
00:08:51,440 --> 00:08:53,720
maybe you won't get completely rid of hits, 

185
00:08:53,720 --> 00:08:55,639
but you can start to ameliorate some of their problems.

186
00:08:55,679 --> 00:08:59,279
I mean that's one power of data mining: 
that you can start to find obscure subjects

187
00:08:59,399 --> 00:09:01,840
that you wouldn't have found 
simply because they are not popular.

188
00:09:01,879 --> 00:09:03,919
You know, one of the tools of recommendations

189
00:09:03,919 --> 00:09:06,759
can be to pull you to the less popular stuff down on the tail.
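
One way to read the idea above is a picker that deliberately favours items with few views. The sketch below is an assumption-laden illustration, not a description of any real site: the titles, view counts, the `alpha` parameter and the inverse-popularity weighting are all invented for the example.

```python
import random

# Hypothetical view counts (titles and numbers are invented).
views = {"megahit": 1_000_000, "popular": 50_000, "niche": 400, "obscure": 12, "unseen": 0}

def tail_weights(views, alpha=1.0):
    """Weight each item by 1 / (views + 1) ** alpha, so fewer views means more weight.

    alpha = 0 gives every item the same chance, like a random-article button;
    larger alpha leans harder toward the bottom of the tail.
    """
    return {title: 1.0 / (count + 1) ** alpha for title, count in views.items()}

def pick(views, alpha=1.0, rng=random):
    """Choose one title at random, favouring the less viewed ones."""
    weights = tail_weights(views, alpha)
    titles = list(weights)
    return rng.choices(titles, weights=[weights[t] for t in titles])[0]

print(pick(views, alpha=0.5, rng=random.Random(0)))
```

Sorting by popularity would always show "megahit" first; this picker does the opposite and gives the bottom of the tail the best odds.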

190
00:09:06,919 --> 00:09:09,919
The random article button on Wikipedia 
is really cool in this sense:

191
00:09:10,000 --> 00:09:13,240
You can just wake up every day 
and read about some completely random topic

192
00:09:13,240 --> 00:09:17,039
that you never would have heard of except for the fact 
that there's an article on Wikipedia about it,

193
00:09:17,080 --> 00:09:20,040
and boy are there some completely random topics! (interviewer laughs)