google-video 0.5.0

data/AUTHORS ADDED
@@ -0,0 +1 @@
1
+ Walter Korman <shaper@wgks.org>
data/CHANGELOG ADDED
@@ -0,0 +1,3 @@
1
+ * 2006/11/07
2
+
3
+ - [shaper] Initial version.
data/README ADDED
@@ -0,0 +1,84 @@
1
+ = Google Video
2
+
3
+ A Ruby object-oriented interface to the video content available on Google Video at http://video.google.com. Functionality is provided to do things including:
4
+
5
+ * retrieve a list of current top videos (see GoogleVideo::Client#top_videos)
6
+ * search for a list of videos matching a set of search query parameters (see GoogleVideo::Client#video_search)
7
+ * retrieve full detailed information on a specific video (see GoogleVideo::Client#video_details)
8
+
9
+ The RubyForge project is at http://rubyforge.org/projects/google-video.
10
+
11
+ == About
12
+
13
+ As the Google Video web site has no formally exposed API, we make use of the lovely Hpricot[http://code.whytheluckystiff.net/hpricot/] to parse desired data from the Google Video web pages.
14
+
15
+ The Google Video web site is still in beta, so it is likely to change in ways that could impact the proper functionality of this library. There is an initial set of unit tests provided with this library which should give some guidance as to its proper operation, and we will endeavor to update the library in accordance with Google's changes, but no promises can be made, so none can be broken, and hence your mileage may vary.
16
+
17
+ See also the YouTube[http://rubyforge.org/projects/youtube] library for Ruby library access to another popular Google-owned video site. Will these two one day live together in harmonious glory? Will intrepid Google engineers rewrite YouTube to make use of GFS[http://labs.google.com/papers/gfs.html], Bigtable[http://labs.google.com/papers/bigtable.html] and a variety of Googley AJAX love? Only time will tell!
18
+
19
+ == Installing
20
+
21
+ We recommend installing <tt>google-video</tt> via rubygems[http://rubyforge.org/projects/rubygems/] (see also http://www.rubygems.org).
22
+
23
+ Once you have +rubygems+ installed on your system, you can easily install the <tt>google-video</tt> gem by executing:
24
+
25
+ % gem install --include-dependencies google-video
26
+
27
+ <tt>google-video</tt> requires the Hpricot[http://code.whytheluckystiff.net/hpricot/] library for parsing HTML, and the HTMLEntities[http://htmlentities.rubyforge.org/] library for, uh, decoding HTML entities. Both will be auto-installed by the above command if not already present.
28
+
29
+ == Usage
30
+
31
+ Instantiate a GoogleVideo::Client and use its methods (e.g. GoogleVideo::Client#top_videos, GoogleVideo::Client#video_search) to make requests of the Google Video server.
32
+
33
+ Each Client method takes as a parameter its respective request object (e.g. GoogleVideo::VideoSearchRequest) and returns its respective response object (e.g. GoogleVideo::VideoSearchResponse). See method documentation for links and more information.
34
+
35
+ An example program showing some simple requests follows. The script is available in the distribution under <tt>examples/example.rb</tt> along with several others.
36
+
37
+ #!/usr/bin/env ruby
38
+
39
+ require 'rubygems'
40
+ require 'google-video'
41
+ require 'pp'
42
+
43
+ # create a client with which to submit requests
44
+ client = GoogleVideo::Client.new
45
+
46
+ # look up a list of the top 100 videos
47
+ request = GoogleVideo::TopVideosRequest.new
48
+ response = client.top_videos request
49
+ print "Top 5 Videos:\n"
50
+ response.videos[0...5].each { |video| pp(video) }
51
+
52
+ # choose one at random to look up in detail
53
+ index = rand(response.videos.length)
54
+ video = response.videos[index].video
55
+ print "\nRequesting video detail for:\n"
56
+ pp(video)
57
+
58
+ # look up the video's details
59
+ request = GoogleVideo::VideoDetailsRequest.new :video => video
60
+ response = client.video_details request
61
+ print "\nDetail:\n"
62
+ pp(response)
63
+
64
+ # look up a previously identified video by its document id
65
+ previous_doc_id = 8718762874044429036
66
+ request = GoogleVideo::VideoDetailsRequest.new :doc_id => previous_doc_id
67
+ response = client.video_details request
68
+ print "\nDetail on doc id #{previous_doc_id}:\n"
69
+ pp(response)
70
+
71
+ # search for a video on turtles
72
+ query = 'turtles'
73
+ request = GoogleVideo::VideoSearchRequest.new :query => query
74
+ response = client.video_search request
75
+ print "\nResults of video search for #{query}:\n"
76
+ pp(response)
77
+
78
+ == License
79
+
80
+ This library is provided via the GNU LGPL license at http://www.gnu.org/licenses/lgpl.html.
81
+
82
+ == Authors
83
+
84
+ Copyright 2006, Walter Korman <shaper@wgks.org>, http://www.lemurware.com.
data/Rakefile ADDED
@@ -0,0 +1,49 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+ require 'rake/testtask'
4
+ require 'rake/rdoctask'
5
+ require 'rake/gempackagetask'
6
+
7
+ spec = Gem::Specification.new do |s|
8
+ s.name = 'google-video'
9
+ s.version = '0.5.0'
10
+ s.author = 'Walter Korman'
11
+ s.email = 'shaper@wgks.org'
12
+ s.platform = Gem::Platform::RUBY
13
+ s.summary = 'A Ruby object-oriented interface to Google Video content.'
14
+ s.rubyforge_project = 'google-video'
15
+ s.has_rdoc = true
16
+ s.extra_rdoc_files = [ 'README' ]
17
+ s.rdoc_options << '--main' << 'README'
18
+ s.test_files = Dir.glob('test/test_*.rb')
19
+ s.files = Dir.glob("{examples,lib,test}/**/*") + [ 'AUTHORS', 'CHANGELOG', 'README', 'Rakefile', 'TODO' ]
20
+ s.add_dependency("hpricot", ">= 0.4")
21
+ s.add_dependency("htmlentities", ">= 3.0.1")
22
+ end
23
+
24
+ desc 'Run tests'
25
+ task :default => [ :test ]
26
+
27
+ Rake::TestTask.new('test') do |t|
28
+ t.libs << 'test'
29
+ t.pattern = 'test/test_*.rb'
30
+ t.verbose = true
31
+ end
32
+
33
+ desc 'Generate RDoc'
34
+ Rake::RDocTask.new :rdoc do |rd|
35
+ rd.rdoc_dir = 'doc'
36
+ rd.rdoc_files.add 'lib', 'README'
37
+ rd.main = 'README'
38
+ end
39
+
40
+ desc 'Build Gem'
41
+ Rake::GemPackageTask.new spec do |pkg|
42
+ pkg.need_tar = true
43
+ end
44
+
45
+ desc 'Clean up'
46
+ task :clean => [ :clobber_rdoc, :clobber_package ]
47
+
48
+ desc 'Clean up'
49
+ task :clobber => [ :clean ]
data/TODO ADDED
@@ -0,0 +1,14 @@
1
+ 1 add some explicit doc_id-based checks on hard-coded actual values
2
+
3
+ 2 add support for advanced videosearch page (with pagination)
4
+ 2 genericize parsing of rating count
5
+ 2 pull in tags and comment text for video detail requests
6
+ 2 provide constant list of available genres for use in video search.
7
+ 2 report failure if we don't successfully parse the contents rather than returning a non-populated response
8
+ 2 add support for moversshakers page (with pagination)
9
+ 2 make sure we preserve click-through source query url param ("q=<xxx>") in video page urls
10
+ 2 look at whether all_text() is losing some detail text in description text parse
11
+ 2 consider simplifying qualified html query paths in top_videos where we can (e.g. star classes)
12
+
13
+ 3 add support for videocaptioned page (with pagination)
14
+ 3 figure out what the Playlist is all about and improve rdoc around it
@@ -0,0 +1,40 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rubygems'
4
+ require 'google-video'
5
+ require 'pp'
6
+
7
+ # create a client with which to submit requests
8
+ client = GoogleVideo::Client.new
9
+
10
+ # look up a list of the top 100 videos
11
+ request = GoogleVideo::TopVideosRequest.new
12
+ response = client.top_videos request
13
+ print "Top 5 Videos:\n"
14
+ response.videos[0...5].each { |video| pp(video) }
15
+
16
+ # choose one at random to look up in detail
17
+ index = rand(response.videos.length)
18
+ video = response.videos[index].video
19
+ print "\nRequesting video detail for:\n"
20
+ pp(video)
21
+
22
+ # look up the video's details
23
+ request = GoogleVideo::VideoDetailsRequest.new :video => video
24
+ response = client.video_details request
25
+ print "\nDetail:\n"
26
+ pp(response)
27
+
28
+ # look up a previously identified video by its document id
29
+ previous_doc_id = 8718762874044429036
30
+ request = GoogleVideo::VideoDetailsRequest.new :doc_id => previous_doc_id
31
+ response = client.video_details request
32
+ print "\nDetail on doc id #{previous_doc_id}:\n"
33
+ pp(response)
34
+
35
+ # search for a video on turtles
36
+ query = 'turtles'
37
+ request = GoogleVideo::VideoSearchRequest.new :query => query
38
+ response = client.video_search request
39
+ print "\nResults of video search for #{query}:\n"
40
+ pp(response)
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rubygems'
4
+ require 'google-video'
5
+ require 'pp'
6
+
7
+ client = GoogleVideo::Client.new
8
+ request = GoogleVideo::TopVideosRequest.new(ARGV[0] ? { :country => ARGV[0] } : nil)
9
+ response = client.top_videos request
10
+ pp(response)
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rubygems'
4
+ require 'google-video'
5
+ require 'pp'
6
+
7
+ client = GoogleVideo::Client.new
8
+ request = GoogleVideo::VideoDetailsRequest.new :doc_id => ARGV[0].to_i
9
+ response = client.video_details request
10
+ pp(response)
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rubygems'
4
+ require 'google-video'
5
+ require 'pp'
6
+
7
+ client = GoogleVideo::Client.new
8
+ request = GoogleVideo::VideoSearchRequest.new :query => ARGV[0]
9
+ response = client.video_search request
10
+ pp(response)
@@ -0,0 +1,830 @@
1
+ # google-video -- provides OO access to the Google Video web site content
2
+ # Copyright (C) 2006 Walter Korman <shaper@wgks.org>
3
+ #
4
+ # This library is free software; you can redistribute it and/or
5
+ # modify it under the terms of the GNU Lesser General Public
6
+ # License as published by the Free Software Foundation; either
7
+ # version 2.1 of the License, or (at your option) any later version.
8
+ #
9
+ # This library is distributed in the hope that it will be useful,
10
+ # but WITHOUT ANY WARRANTY; without even the implied warranty of
11
+ # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
12
+ # Lesser General Public License for more details.
13
+ #
14
+ # You should have received a copy of the GNU Lesser General Public
15
+ # License along with this library; if not, write to the Free Software
16
+ # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
17
+
18
+ require 'open-uri'
19
+ require 'hpricot'
20
+ require 'htmlentities'
21
+ require 'time'
22
+
23
+ # Extension to Hpricot for our own parsing purposes.
24
+ class Hpricot::Elem
25
+ # add in an easy way to gather all raw text content from within an element.
26
+ # the latest unstable version of hpricot has some useful routines like this
27
+ # already but we'd like to use the gem-installable version for now so we
28
+ # have to go it alone.
29
+ def all_text
30
+ text = ''
31
+ each_child { |c| text += c.to_s if c.is_a?(Hpricot::Text) }
32
+ text
33
+ end
34
+ end
35
+
36
+ module GoogleVideo
37
+ # An exception thrown by the GoogleVideo module should something untoward
38
+ # occur.
39
+ class GoogleVideoException < RuntimeError
40
+ def initialize (message)
41
+ super(message)
42
+ end
43
+ end
44
+
45
+ # Describes a single video file available for viewing on Google Video.
46
+ class Video
47
+ # the prose text describing the video contents.
48
+ attr_reader :description
49
+
50
+ # the google unique identifier.
51
+ attr_reader :doc_id
52
+
53
+ # the duration of the video in prose, e.g. "1hr 49min" or "3min".
54
+ attr_reader :duration
55
+
56
+ # the full url at which the video is available for viewing.
57
+ attr_reader :page_url
58
+
59
+ # only available via a details request: a list of PlaylistEntry objects
60
+ # detailed in the "Playlist" of next-up videos displayed on the video
61
+ # detail page.
62
+ attr_reader :playlist_entries
63
+
64
+ # only available via a details request: the current rank of this video on
65
+ # the site.
66
+ attr_reader :rank
67
+
68
+ # only available via a details request: the change in rank this video has
69
+ # seen since last ranking: if positive, a move up, if negative, a move
70
+ # down.
71
+ attr_reader :rank_delta
72
+
73
+ # the number of ratings this video has received.
74
+ attr_reader :rating_count
75
+
76
+ # the number of stars all of the video ratings currently average out to,
77
+ # e.g. 4.5.
78
+ attr_reader :star_count
79
+
80
+ # the full url to the video thumbnail image.
81
+ attr_reader :thumbnail_image_url
82
+
83
+ # the title text describing the video.
84
+ attr_reader :title
85
+
86
+ # the date at which the video was uploaded.
87
+ attr_reader :upload_date
88
+
89
+ # the name of the user who uploaded the video; not always available.
90
+ attr_reader :upload_user
91
+
92
+ # only available via a details request, and then only for some videos: the
93
+ # domain provided by the user who uploaded the video.
94
+ attr_reader :upload_user_domain
95
+
96
+ # only available via a details request, and then only for some videos: the
97
+ # redirect url through Google Video through which you may reach the
98
+ # uploading user's website.
99
+ attr_reader :upload_user_url
100
+
101
+ # only available via a details request: the url to the video's
102
+ # <tt>.gvp</tt> format video file. see
103
+ # http://en.wikipedia.org/wiki/Google_Video#Google_Video_Player for more
104
+ # information on file formats and the like.
105
+ attr_reader :video_file_url
106
+
107
+ # only available via a details request: a list of VideoFrameThumbnail
108
+ # objects describing zero or more individual frame stills within the
109
+ # video.
110
+ attr_reader :video_frame_thumbnails
111
+
112
+ # only available via a details request: the total number of views this
113
+ # video has received to date.
114
+ attr_reader :view_count
115
+
116
+ # only available via a details request: the number of views this video
117
+ # received yesterday.
118
+ attr_reader :yesterday_view_count
119
+
120
+ # the default width in pixels for the video embed html.
121
+ @@DEFAULT_WIDTH = 400
122
+
123
+ # the default height in pixels for the video embed html.
124
+ @@DEFAULT_HEIGHT = 326
125
+
126
+ # Constructs a Video with the supplied hash mapping attribute names to
127
+ # their respective values.
128
+ def initialize (params)
129
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
130
+
131
+ # pull the doc id out of the page url if we've got one
132
+ if (@page_url)
133
+ @page_url =~ /docid=([^&]+)/
134
+ @doc_id = $1.to_i
135
+ end
136
+ end
137
+
138
+ # Returns HTML suitable for embedding this video in a web page with the
139
+ # video occupying the specified dimensions. The generated html matches
140
+ # that which is provided by the Google Video web site embed instructions.
141
+ def embed_html (width = @@DEFAULT_WIDTH, height = @@DEFAULT_HEIGHT)
142
+ <<edoc
143
+ <embed style="width:#{width}px; height:#{height}px;" id="VideoPlayback" type="application/x-shockwave-flash"
144
+ src="http://video.google.com/googleplayer.swf?docId=#{@doc_id}&hl=en" flashvars=""> </embed>
145
+ edoc
146
+ end
147
+ end
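The constructor above derives `doc_id` from the `docid` query parameter of the page URL. A minimal standalone sketch of that step follows. The helper name `extract_doc_id` and the `videoplay?docid=...` URL shape are assumptions made for illustration and are not part of the library. Unlike the original, which ends up with `0` when nothing matches (because `nil.to_i` is `0`), this sketch returns `nil`.

```ruby
# Hypothetical helper (not part of google-video): pull the numeric docid
# query parameter out of a page URL.
def extract_doc_id(page_url)
  match = /[?&]docid=(-?\d+)/.match(page_url.to_s)
  # Return nil rather than 0 when the URL carries no docid parameter.
  match ? match[1].to_i : nil
end

# The URL shape below is assumed for the example.
url = 'http://video.google.com/videoplay?docid=8718762874044429036&q=turtles'
puts extract_doc_id(url)
puts extract_doc_id('http://video.google.com/').inspect
```

Returning `nil` lets a caller tell a missing id apart from a real one, which may matter when the value is later passed to a details request.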
148
+
149
+ # Provides a miniature snapshot of a point in time within a full video. On
150
+ # the actual Google Video site these are linked to auto-pan the video to the
151
+ # start of the scene represented in the thumbnail, but for us there's no
152
+ # direct link of meaning so we've only got static content which is
153
+ # nevertheless of potential interest and utility.
154
+ class VideoFrameThumbnail
155
+ # the title of the video frame, e.g. "at 10 secs", "at 30 secs", etc.
156
+ attr_reader :title
157
+
158
+ # the full url to the video frame thumbnail image.
159
+ attr_reader :thumbnail_image_url
160
+
161
+ def initialize (params)
162
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
163
+ end
164
+ end
165
+
166
+ # Describes an alternate video as listed in the "Playlist" tab for a
167
+ # particular video's detailed view page.
168
+ class PlaylistEntry
169
+ # the title of the video.
170
+ attr_reader :title
171
+
172
+ # the full url to the video thumbnail image.
173
+ attr_reader :thumbnail_image_url
174
+
175
+ # the duration of the video in prose, e.g. "1hr 49min" or "3min".
176
+ attr_reader :duration
177
+
178
+ # the name of the user who uploaded the video.
179
+ attr_reader :upload_user
180
+
181
+ # the full url at which the video is available for viewing.
182
+ attr_reader :page_url
183
+
184
+ def initialize (params)
185
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
186
+ end
187
+ end
188
+
189
+ # Describes an entry on the "top videos" page.
190
+ class TopVideo
191
+ # the direction the entry moved from the previous day to today: 0 for no
192
+ # change, 1 for moving up, -1 for moving down.
193
+ attr_reader :movement
194
+
195
+ # the entry's rank today.
196
+ attr_reader :rank_today
197
+
198
+ # the entry's rank yesterday.
199
+ attr_reader :rank_yesterday
200
+
201
+ # the entry's video details as a Video object.
202
+ attr_reader :video
203
+
204
+ # Constructs a TopVideo with the supplied hash mapping attribute names to
205
+ # their respective values.
206
+ def initialize (params)
207
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
208
+ end
209
+ end
210
+
211
+ # Describes a request for the current list of top videos on Google Video.
212
+ class TopVideosRequest
213
+ # the list of countries by which one can constrain a search for top videos
214
+ COUNTRIES = {
215
+ 'All' => 'all',
216
+ 'Argentina' => 'arg',
217
+ 'Australia' => 'aus',
218
+ 'Austria' => 'aut',
219
+ 'Brazil' => 'bra',
220
+ 'Canada' => 'can',
221
+ 'Chile' => 'chl',
222
+ 'Denmark' => 'dnk',
223
+ 'Finland' => 'fin',
224
+ 'France' => 'fra',
225
+ 'Germany' => 'deu',
226
+ 'Greece' => 'grc',
227
+ 'Hong Kong' => 'hkg',
228
+ 'India' => 'ind',
229
+ 'Indonesia' => 'idn',
230
+ 'Ireland' => 'irl',
231
+ 'Israel' => 'isr',
232
+ 'Italy' => 'ita',
233
+ 'Japan' => 'jpn',
234
+ 'Kenya' => 'ken',
235
+ 'Malaysia' => 'mys',
236
+ 'Mexico' => 'mex',
237
+ 'Netherlands' => 'nld',
238
+ 'New Zealand' => 'nzl',
239
+ 'Norway' => 'nor',
240
+ 'Peru' => 'per',
241
+ 'Philippines' => 'phl',
242
+ 'Poland' => 'pol',
243
+ 'Russia' => 'rus',
244
+ 'Saudi Arabia' => 'sau',
245
+ 'Singapore' => 'sgp',
246
+ 'South Africa' => 'zaf',
247
+ 'South Korea' => 'kor',
248
+ 'Spain' => 'esp',
249
+ 'Sweden' => 'swe',
250
+ 'Switzerland' => 'che',
251
+ 'Taiwan' => 'twn',
252
+ 'Thailand' => 'tha',
253
+ 'Turkey' => 'tur',
254
+ 'Ukraine' => 'ukr',
255
+ 'United Arab Emirates' => 'are',
256
+ 'United Kingdom' => 'gbr',
257
+ 'United States' => 'usa',
258
+ 'Vietnam' => 'vnm'
259
+ }
260
+
261
+ # optional: the country by which results are to be constrained, defaulting
262
+ # to nil. Specifying 'All' or nil will provide results across all
263
+ # countries.
264
+ attr_reader :country
265
+
266
+ # Constructs a TopVideosRequest with an optional supplied hash mapping
267
+ # attribute names to their respective values.
268
+ def initialize (params = nil)
269
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) } if (params)
270
+
271
+ # validate request parameters
272
+ if @country && !COUNTRIES.include?(@country)
273
+ raise ArgumentError.new("invalid country parameter: #{@country}")
274
+ end
275
+ end
276
+ end
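The validation in this class calls `COUNTRIES.include?(@country)`, and `Hash#include?` tests keys, so the request expects a display name such as `'Germany'` rather than a code such as `'deu'`. The sketch below illustrates that lookup with a three-entry excerpt of the table. The constant `SKETCH_COUNTRIES` and the helper `country_code_for` are hypothetical names, and how the library itself turns the name into a URL parameter is not shown in this section.

```ruby
# A small excerpt of the name => code table, for illustration only.
SKETCH_COUNTRIES = {
  'All' => 'all',
  'Germany' => 'deu',
  'United States' => 'usa'
}.freeze

# Hypothetical helper: resolve a display name to its code, or return nil
# when no country was requested.
def country_code_for(name)
  return nil if name.nil?
  SKETCH_COUNTRIES.fetch(name) do
    raise ArgumentError, "invalid country parameter: #{name}"
  end
end

puts country_code_for('Germany')
```

Passing a code instead of a name (for example `'deu'`) raises `ArgumentError` in this sketch, mirroring the key-based check above.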
277
+
278
+ # Describes a response from Google Video providing a list of current top videos.
279
+ class TopVideosResponse
280
+ # the url with which the request was made of the Google Video service.
281
+ attr_reader :request_url
282
+
283
+ # the list of TopVideo objects comprising the current top videos.
284
+ attr_reader :videos
285
+
286
+ # Constructs a TopVideosResponse with the supplied hash mapping attribute
287
+ # names to their respective values.
288
+ def initialize (params)
289
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
290
+ end
291
+ end
292
+
293
+ # Describes a search request for videos matching the specified parameters on
294
+ # Google Video.
295
+ class VideoSearchRequest
296
+ # the list of valid sort parameters.
297
+ SORT_OPTIONS = [ 'relevance', 'rating', 'date', 'title' ]
298
+
299
+ # the list of valid duration parameters.
300
+ DURATION_OPTIONS = [ 'all', 'short', 'medium', 'long' ]
301
+
302
+ # optional: the sort order for search results: one of 'relevance',
303
+ # 'rating', 'date', 'title', defaulting to 'relevance'.
304
+ attr_reader :sort
305
+
306
+ # optional: the duration by which to filter search results: one of 'all',
307
+ # 'short', 'medium' or 'long', defaulting to 'all'.
308
+ attr_reader :duration
309
+
310
+ # required: the query string by which to search.
311
+ attr_reader :query
312
+
313
+ # optional: the page number of results sought, defaulting to 1.
314
+ attr_reader :page
315
+
316
+ # Constructs a VideoSearchRequest with the supplied hash mapping attribute
317
+ # names to their respective values.
318
+ def initialize (params)
319
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
320
+
321
+ # validate request parameters
322
+ if @sort && !SORT_OPTIONS.include?(@sort)
323
+ raise ArgumentError.new("invalid sort parameter: #{@sort}")
324
+ end
325
+ if @duration && !DURATION_OPTIONS.include?(@duration)
326
+ raise ArgumentError.new("invalid duration parameter: #{@duration}")
327
+ end
328
+ if !@query
329
+ raise ArgumentError.new("invalid request, query parameter required")
330
+ end
331
+ end
332
+ end
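The request and response classes in this file share one construction pattern: copy every entry of a params hash into an instance variable of the same name, then validate. A minimal standalone sketch of that pattern is below. `SketchSearchRequest` is a hypothetical stand-in written for illustration, not the library's class, and it covers only a subset of the attributes.

```ruby
# Hypothetical stand-in for illustration; not the library's class.
class SketchSearchRequest
  SORT_OPTIONS = %w[relevance rating date title].freeze

  attr_reader :query, :sort, :page

  def initialize(params)
    # Copy each hash entry into an instance variable of the same name.
    params.each { |key, value| instance_variable_set("@#{key}", value) }

    # Validate after assignment so every check sees the final values.
    raise ArgumentError, 'query parameter required' unless @query
    if @sort && !SORT_OPTIONS.include?(@sort)
      raise ArgumentError, "invalid sort parameter: #{@sort}"
    end
  end
end

request = SketchSearchRequest.new(:query => 'turtles', :sort => 'date')
puts request.query
```

One consequence of this design is that unknown keys are accepted silently: they become instance variables with no reader, so a misspelled option name goes unnoticed unless validation covers it.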
333
+
334
+ # Describes a response to a VideoSearchRequest.
335
+ class VideoSearchResponse
336
+ # the url with which the request was made of the Google Video service.
337
+ attr_reader :request_url
338
+
339
+ # the 1-based starting index number of the results in this set.
340
+ attr_reader :start_index
341
+
342
+ # the 1-based ending index number of the results in this set.
343
+ attr_reader :end_index
344
+
345
+ # the total number of results in this result set.
346
+ attr_reader :total_result_count
347
+
348
+ # the time taken in seconds to execute the search query (according to Google).
349
+ attr_reader :execution_time
350
+
351
+ # the list of Video objects comprising the search results.
352
+ attr_reader :videos
353
+
354
+ # Constructs a VideoSearchResponse with the supplied hash mapping
355
+ # attribute names to their respective values.
356
+ def initialize (params)
357
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
358
+ end
359
+ end
360
+
361
+ # Describes a request of the Google Video service for details on a specific
362
+ # previously retrieved Video or its stored +doc_id+. Only one of either
363
+ # +video+ or +doc_id+ should be specified in the request.
364
+ class VideoDetailsRequest
365
+ # the Video object whose details are sought.
366
+ attr_reader :video
367
+
368
+ # the +doc_id+ of the video whose details are sought.
369
+ attr_reader :doc_id
370
+
371
+ # Constructs a VideoDetailsRequest with the supplied hash mapping
372
+ # attribute names to their respective values.
373
+ def initialize (params)
374
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
375
+
376
+ # validate request parameters
377
+ if !@video && !@doc_id
378
+ raise ArgumentError.new("invalid request, one of video or doc_id parameter required")
379
+ end
380
+ if @video && @doc_id
381
+ raise ArgumentError.new("invalid request, only one of video or doc_id parameters may be specified")
382
+ end
383
+ if @video && !@video.is_a?(Video)
384
+ raise ArgumentError.new("invalid request, video must be a GoogleVideo::Video")
385
+ end
386
+ if @doc_id && !@doc_id.is_a?(Integer)
387
+ raise ArgumentError.new("invalid request, doc_id must be an Integer")
388
+ end
389
+ end
390
+ end
391
+
392
+ # Describes a response to a VideoDetailsRequest.
393
+ class VideoDetailsResponse
394
+ # the url with which the request was made of the Google Video service.
395
+ attr_reader :request_url
396
+
397
+ # the new Video object providing full detailed information.
398
+ attr_reader :video
399
+
400
+ # Constructs a VideoDetailsResponse with the supplied hash mapping
401
+ # attribute names to their respective values.
402
+ def initialize (params)
403
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
404
+ end
405
+ end
406
+
407
+ # The main client object providing interface methods for retrieving
408
+ # information from the Google Video server.
409
+ class Client
410
+ # the default hostname queried to retrieve google video content.
411
+ @@DEFAULT_HOST = 'video.google.com'
412
+
413
+ # the default user agent submitted with http requests of google video.
414
+ @@DEFAULT_AGENT = 'google-video for Ruby (http://www.rubyforge.org/projects/google-video/)'
415
+
416
+ # Constructs a Client for querying the Google Video server. Optional
417
+ # parameters to be specified as a hash include:
418
+ # * host: optional alternate host name to query instead of the default host.
419
+ # * agent: optional alternate user agent to submit with http requests
420
+ # instead of the default agent.
421
+ def initialize (params = nil)
422
+ @host = @@DEFAULT_HOST
423
+ @agent = @@DEFAULT_AGENT
424
+ params.each { |key, value| instance_variable_set('@' + key.to_s, value) } if params
425
+ end
426
+
427
+ # Runs a search query on Google Video with the parameters specified in the
428
+ # supplied VideoSearchRequest and returns a VideoSearchResponse.
429
+ def video_search (search_request)
430
+ # validate parameters
431
+ if !search_request.is_a?(VideoSearchRequest)
432
+ raise ArgumentError.new("invalid argument, request must be a GoogleVideo::VideoSearchRequest")
433
+ end
434
+
435
+ # gather response data from the server
436
+ url = _search_url(search_request)
437
+ response = _request(url)
438
+ doc = Hpricot(response)
439
+
440
+ # parse the overall search query stats
441
+ regexp_stats = Regexp.new(/([0-9,]+) \- ([0-9,]+)<\/b> of about <b>([0-9,]+)<\/b> \(<b>([0-9.]+)/)
442
+ row = (doc/"#resultsheadertable/tr/td/font").first
443
+ if !regexp_stats.match(row.inner_html)
444
+ raise GoogleVideoException.new("failed to parse search query stats")
445
+ end
446
+ ( start_index, end_index, total_result_count, execution_time ) = [ $1.delete(',').to_i, $2.delete(',').to_i, $3.delete(',').to_i, $4.to_f ]
447
+
448
+ # parse the video results
449
+ videos = []
450
+ rows = doc/"table[@class='searchresult']/tr"
451
+ rows.each do |row|
452
+ # parse the thumbnail image
453
+ thumbnail_image_url = _decode_html((row/"img[@class='searchresultimg']").first.attributes['src'])
454
+
455
+ # parse the title and page url
456
+ a_title = (row/"div[@class='resulttitle']/a").first
457
+ page_url = 'http://' + @host + '/' + _decode_html(a_title.attributes['href'])
458
+ title = _decode_html(a_title.inner_html.strip)
459
+
460
+ # parse the description text
461
+ description = _decode_html((row/"div[@class='snippet']").first.inner_html.strip)
462
+
463
+ # parse the upload username
464
+ span_channel = (row/"span[@class='channel']").first
465
+ channel_html = (span_channel) ? span_channel.inner_html : ''
466
+ channel_html =~ /([^\-]+)/
467
+ upload_user = _clean_string($1)
468
+
469
+ # stars
470
+ star_count = _parse_star_elements(row/"img[@class='star']")
471
+
472
+ # rating count
473
+ span_raters = (row/"span[@id='numOfRaters']").first
474
+ rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
475
+
476
+ # duration
477
+ span_date = (row/"span[@class='date']").first
478
+ date_html = span_date.inner_html
479
+ date_html =~ /([^\-]+) \- (.*)$/
480
+ duration = _clean_string($1)
481
+ upload_date = Time.parse(_clean_string($2))
482
+
483
+ # construct the video object and tack it onto the video result list
484
+ videos << Video.new(:title => title,
485
+ :page_url => page_url,
486
+ :thumbnail_image_url => thumbnail_image_url,
487
+ :description => description,
488
+ :star_count => star_count,
489
+ :rating_count => rating_count,
490
+ :duration => duration,
491
+ :upload_date => upload_date,
492
+ :upload_user => upload_user)
493
+ end
494
+
495
+ # construct the final search response with all info we've gathered
496
+ VideoSearchResponse.new(:request_url => url,
497
+ :start_index => start_index,
498
+ :end_index => end_index,
499
+ :total_result_count => total_result_count,
500
+ :execution_time => execution_time,
501
+ :videos => videos)
502
+ end
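The statistics pattern in `video_search` captures figures that may contain thousands separators (`[0-9,]+`). `String#to_i` stops at the first non-digit, so such a capture needs its commas removed before conversion. The sketch below shows the parsing step in isolation. The helper name `parse_search_stats` and the sample HTML string are assumptions made for illustration; the real page markup may differ.

```ruby
# Pattern mirroring the one used in video_search.
STATS_PATTERN = %r{([0-9,]+) - ([0-9,]+)</b> of about <b>([0-9,]+)</b> \(<b>([0-9.]+)}

# Hypothetical helper: returns [start_index, end_index, total, seconds],
# or nil when the header does not match.
def parse_search_stats(html)
  match = STATS_PATTERN.match(html)
  return nil unless match
  # Strip thousands separators before converting the three counts.
  counts = match.captures[0, 3].map { |s| s.delete(',').to_i }
  counts + [match[4].to_f]
end

# Sample markup is assumed, not copied from the site.
sample = 'Results <b>1 - 10</b> of about <b>12,345</b> (<b>0.07</b> seconds)'
p parse_search_stats(sample)   # prints [1, 10, 12345, 0.07]
p '12,345'.to_i                # prints 12, the truncation to avoid
```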
503
+
504
+ # Looks up detailed information on a specific Video on Google Video with
505
+ # the parameters specified in the supplied VideoDetailsRequest and returns
506
+ # a VideoDetailsResponse.
507
+ def video_details (details_request)
508
+ # validate parameters
509
+ if !details_request.is_a?(VideoDetailsRequest)
510
+ raise ArgumentError.new("invalid argument, request must be a GoogleVideo::VideoDetailsRequest")
511
+ end
512
+
513
+ # gather response data from the server
514
+ url = _video_details_url(details_request)
515
+ response = _request(url)
516
+ doc = Hpricot(response)
517
+
518
+ # parse title
519
+ title = (doc/"div[@id='pvprogtitle']").inner_html.strip
520
+
521
+ # parse description
522
+ font_description = (doc/"div[@id='description']/font").first
523
+ description = (font_description) ? font_description.all_text.strip : ''
524
+ span_wholedescr = (doc/"span[@id='wholedescr']").first
525
+ if (span_wholedescr)
526
+ description += ' ' + span_wholedescr.all_text.strip
527
+ end
528
+ description = _decode_html(description)
529
+
530
+ # parse star count
531
+ span_rating = (doc/"span[@id='communityRating']").first
532
+ star_count = _parse_star_elements(span_rating/"img[@class='star']")
533
+
534
+ # parse rating count
535
+ span_raters = (doc/"span[@id='numOfRaters']").first
536
+ rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
537
+
538
+ # parse upload user, duration, upload date, upload user domain, upload
539
+ # user url. unfortunately this is a bit messy since, unlike much of the
540
+ # rest of google's lovely html, there are no useful id or class names we
541
+ # can hang our hat on. rather, there are anywhere from one to three
542
+ # rows of text, with only the middle row (in the three-row scenario)
543
+ # containing duration and upload date, omnipresent. still, we buckle
544
+ # down and have at it with fervor and tenacity.
545
+ duration_etc_html = (doc/"div[@id='durationetc']").first.inner_html
546
+ duration_parts = duration_etc_html.split(/<br[^>]+>/)
547
+ # see if the first line looks like it has a date formatted ala 'Nov 9, 2006'
548
+ if (duration_parts[0] =~ /\- [A-Za-z]{3} \d+, \d{4}/)
549
+ # first line is duration / upload_date, and there is no upload username
550
+ upload_user = ''
551
+ duration_upload_html = duration_parts[0]
552
+ upload_user_domain = duration_parts[1]
553
+ else
554
+ upload_user = _clean_string(duration_parts[0])
555
+ duration_upload_html = duration_parts[1]
556
+ upload_user_domain = duration_parts[2]
557
+ end
558
+
559
+ # parse the duration and upload date
560
+ ( duration, upload_date ) = duration_upload_html.split(/\-/)
561
+ duration = _clean_string(duration)
562
+ upload_date = Time.parse(_clean_string(upload_date))
563
+
564
+ # parse the upload user url and domain if present
565
+ if (upload_user_domain =~ /<a.*?href="([^"]+)"[^>]+>([^<]+)<\/a>/)
566
+ upload_user_url = 'http://' + @host + _decode_html(_clean_string($1))
567
+ upload_user_domain = _clean_string($2)
568
+ else
569
+ upload_user_url = ''
570
+ upload_user_domain = ''
571
+ end
572
+
573
+ # pull out view count and rank info table row elements
574
+ ( tr_total_views, tr_views_yesterday, tr_rank ) = (doc/"table[@id='statsall']/tr")
575
+
576
+ # parse view count
577
+ tr_total_views.inner_html =~ /<b>([^<]+)<\/b>/
578
+ view_count = _human_number_to_int($1)
579
+
580
+ # parse yesterday's view count
581
+ tr_views_yesterday.inner_html =~ /\s+([0-9,]+) yesterday/
582
+ yesterday_view_count = _human_number_to_int($1)
583
+
584
+ # parse rank
585
+ tr_rank.inner_html =~ /rank ([0-9,]+)/
586
+ rank = _human_number_to_int($1)
587
+
588
+ # parse rank delta
589
+ (tr_rank/"span").first.inner_html =~ /\(([0-9\+\-,]+)\)/
590
+ rank_delta = _human_number_to_int($1)
591
+
592
+ # pull out the url to the video .gvp file if present
593
+ img_download = (doc/"img[@src='/static/btn_download.gif']").first
594
+ if (img_download)
595
+ onclick_html = img_download.attributes['onclick']
596
+ onclick_script = _decode_html(onclick_html)
597
+ onclick_script =~ /onDownloadClick\(([^\)]+)\)/
598
+ video_file_url = ($1 || '').split(",")[1].to_s.gsub(/"/, '')
599
+ else
600
+ video_file_url = ''
601
+ end
602
+
603
+ # pull out the video frame thumbnails
604
+ video_frame_thumbnails = []
605
+ (doc/"img[@class='detailsimage']").each do |frame_image|
606
+ video_frame_thumbnails << _parse_video_frame_thumbnail(frame_image)
607
+ end
608
+
609
+ # pull out the playlist entries
610
+ playlist_entries = []
611
+ table_upnext = (doc/"table[@id='upnexttable']").first
612
+ (table_upnext ? (table_upnext/"tr") : []).each do |tr_playlist|
613
+ playlist_entries << _parse_playlist_entry(tr_playlist)
614
+ end
615
+
616
+ # create the new, fully populated video record
617
+ video = Video.new(:description => description,
618
+ :duration => duration,
619
+ :page_url => url,
620
+ :playlist_entries => playlist_entries,
621
+ :rank => rank,
622
+ :rank_delta => rank_delta,
623
+ :rating_count => rating_count,
624
+ :star_count => star_count,
625
+ :title => title,
626
+ :upload_date => upload_date,
627
+ :upload_user => upload_user,
628
+ :upload_user_domain => upload_user_domain,
629
+ :upload_user_url => upload_user_url,
630
+ :video_file_url => video_file_url,
631
+ :video_frame_thumbnails => video_frame_thumbnails,
632
+ :view_count => view_count,
633
+ :yesterday_view_count => yesterday_view_count)
634
+
635
+ # build and return the response
636
+ VideoDetailsResponse.new(:request_url => url, :video => video)
637
+ end
638
+
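The duration block handled in video_details has either two or three rows, and the first row is only a username when it does not look like a "duration - date" line. The sketch below illustrates that branching on a plain string, without Hpricot. The helper name parse_duration_block and the sample markup are assumptions made for illustration; they are not part of the library.

```ruby
require 'time'

# Hypothetical standalone helper; the library does this inline in video_details.
def parse_duration_block(html)
  parts = html.split(/<br[^>]*>/).map { |part| part.strip }

  # A row such as "5 min - Nov 9, 2006" is the duration / upload date row.
  if parts[0] =~ /- [A-Za-z]{3} \d+, \d{4}/
    upload_user = ''
    duration_line = parts[0]
  else
    # Otherwise the first row is taken to be the upload username.
    upload_user = parts[0].to_s
    duration_line = parts[1].to_s
  end

  # Split on the first hyphen only, so the date text stays in one piece.
  duration, date_text = duration_line.split('-', 2).map { |piece| piece.strip }

  { :upload_user => upload_user,
    :duration    => duration.to_s,
    :upload_date => date_text ? Time.parse(date_text) : nil }
end
```

Calling parse_duration_block with a three-row string yields the username from the first row; with a two-row string the username is left empty.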
639
+ # Looks up top videos on Google Video with the parameters specified in the
640
+ # supplied TopVideosRequest and returns a TopVideosResponse.
641
+ def top_videos (top_request)
642
+ # validate parameters
643
+ if !top_request.is_a?(TopVideosRequest)
644
+ raise ArgumentError.new("invalid argument, request must be a GoogleVideo::TopVideosRequest")
645
+ end
646
+
647
+ # gather response data from the server
648
+ url = _top_videos_url(top_request)
649
+ response = _request(url)
650
+ doc = Hpricot(response)
651
+
652
+ # parse out each of the top video entries
653
+ top_videos = []
654
+ # grab the top 100 table rows
655
+ rows = doc/"table[@class='table-top100']/tr"
656
+ # the first row is just header info, so skip it
657
+ rows.shift
658
+ # there's one video per row, so we iterate over the table row elements
659
+ rows.each do |row|
660
+ # break the table cells into logically-named elements we can manipulate more precisely
661
+ (td_movement, td_rank_today, td_rank_yesterday, td_thumbnail, td_detail) = (row/"td")
662
+
663
+ # parse the rank movement direction
664
+ movement_html = (td_movement/"img").to_html
665
+ if (movement_html =~ /up\.gif/)
666
+ movement = 1
667
+ elsif (movement_html =~ /down\.gif/)
668
+ movement = -1
669
+ else
670
+ movement = 0
671
+ end
672
+
673
+ # parse today's and yesterday's ranks
674
+ rank_today = td_rank_today.inner_html.to_i
675
+ rank_yesterday = td_rank_yesterday.inner_html.to_i
676
+
677
+ # parse the video thumbnail image
678
+ thumbnail_image_url = _decode_html((td_thumbnail/"a/img").first.attributes['src'])
679
+
680
+ # parse the detailed video info
681
+ a_video = (td_detail/"a").first
682
+ page_url = 'http://' + @host + a_video.attributes['href']
683
+
684
+ # title
685
+ title = _decode_html(a_video.inner_html.strip)
686
+
687
+ # stars
688
+ star_count = _parse_star_elements(td_detail/"div[@class='meta']/span/font/img[@class='star']")
689
+
690
+ # rating count
691
+ span_raters = (td_detail/"div[@class='meta']/span/font/span[@id='numOfRaters']").first
692
+ rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
693
+
694
+ # duration
695
+ duration = (td_detail/"div[@class='meta']").first.all_text.gsub(/&nbsp;/, '').strip
696
+
697
+ # description
698
+ description = _decode_html((td_detail).all_text.strip)
699
+
700
+ # construct the video object
701
+ video = Video.new(:title => title,
702
+ :page_url => page_url,
703
+ :thumbnail_image_url => thumbnail_image_url,
704
+ :star_count => star_count,
705
+ :rating_count => rating_count,
706
+ :duration => duration,
707
+ :description => description)
708
+
709
+ # create the top video entry and throw it on the list of top videos
710
+ top_videos << TopVideo.new(:movement => movement,
711
+ :rank_today => rank_today,
712
+ :rank_yesterday => rank_yesterday,
713
+ :video => video)
714
+ end
715
+
716
+ TopVideosResponse.new(:request_url => url,
717
+ :videos => top_videos)
718
+ end
719
+
720
+ private
721
+ # Breakout method used by Client#video_details to make things more
722
+ # manageable, returning a VideoFrameThumbnail constructed from the given
723
+ # image element.
724
+ def _parse_video_frame_thumbnail (frame_image)
725
+ title = frame_image.attributes['title']
726
+ thumbnail_image_url = _decode_html(frame_image.attributes['src'])
727
+ VideoFrameThumbnail.new(:title => title, :thumbnail_image_url => thumbnail_image_url)
728
+ end
729
+
730
+ # Breakout method used by Client#video_details to make things more
731
+ # manageable, returning a PlaylistEntry constructed from the given table
732
+ # row element.
733
+ def _parse_playlist_entry (tr_playlist)
734
+ # pull out the playlist entry table cell elements
735
+ ( td_thumbnail, td_playlist ) = (tr_playlist/"td")
736
+
737
+ # parse thumbnail image
738
+ thumbnail_image_url = _decode_html((td_thumbnail/"img").first.attributes['src'])
739
+
740
+ # parse the page url and title
741
+ a_page = (td_playlist/"a").first
742
+ page_url = 'http://' + @host + a_page.attributes['href']
743
+ title = _decode_html(a_page.attributes['title'])
744
+
745
+ # parse the upload user and duration
746
+ meta_html = (td_playlist/"span[@class='meta']").first.inner_html
747
+ if (meta_html =~ /([^<]+)<br \/>(.*)/)
748
+ upload_user = _clean_string($1)
749
+ duration = _clean_string($2)
750
+ else
751
+ upload_user = ''
752
+ duration = _clean_string(meta_html)
753
+ end
754
+
755
+ PlaylistEntry.new(:title => title,
756
+ :page_url => page_url,
757
+ :upload_user => upload_user,
758
+ :duration => duration,
759
+ :thumbnail_image_url => thumbnail_image_url)
760
+ end
761
+
762
+ def _decode_html (text)
763
+ HTMLEntities.decode_entities(text)
764
+ end
765
+
766
+ def _clean_string (text)
767
+ text ? text.strip : ''
768
+ end
769
+
770
+ def _human_number_to_int (text)
771
+ text ? text.gsub(/,/, '').to_i : 0
772
+ end
773
+
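As a rough illustration of how the comma-stripping helper combines with the rank and rank delta patterns used in video_details, here is a self-contained sketch. The method names without the leading underscore and the sample strings are assumptions; the real page text may differ.

```ruby
# Hypothetical standalone version of the number helper above.
def human_number_to_int(text)
  text ? text.delete(',').to_i : 0
end

# Extracts [rank, rank_delta] from text shaped like "rank 1,234 (+56)".
# Values that cannot be found fall back to 0.
def parse_rank(text)
  rank  = text[/rank ([0-9,]+)/, 1]
  delta = text[/\(([0-9+\-,]+)\)/, 1]
  [human_number_to_int(rank), human_number_to_int(delta)]
end
```

String#to_i accepts a leading sign, which is why a delta such as "+56" or "-3" converts without extra handling.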
774
+ def _parse_star_elements (elements)
775
+ star_count = 0
776
+ elements.each do |star|
777
+ star_src = star.attributes['src']
778
+ if (star_src =~ /starLittle\.gif$/)
779
+ star_count += 1
780
+ elsif (star_src =~ /starLittleHalf\.gif$/)
781
+ star_count += 0.5
782
+ end
783
+ end
784
+ star_count
785
+ end
786
+
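The star tally above can be tried out on plain image paths instead of Hpricot elements. This is a minimal sketch; count_stars and the starLittleEmpty.gif file name are made up for the example, and only the two file names matched in the original method are taken from the source.

```ruby
# Hypothetical helper: sums full and half stars from a list of image src strings.
def count_stars(image_sources)
  image_sources.inject(0) do |total, src|
    case src
    when /starLittle\.gif$/     then total + 1    # full star
    when /starLittleHalf\.gif$/ then total + 0.5  # half star
    else total                                    # anything else is ignored
    end
  end
end

sources = ['/static/starLittle.gif',
           '/static/starLittle.gif',
           '/static/starLittleHalf.gif',
           '/static/starLittleEmpty.gif']
puts count_stars(sources)
```

Because each pattern is anchored to the end of the string, the half-star image is never counted as a full star.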
787
+ def _video_details_url (details_request)
788
+ doc_id = (details_request.video) ? details_request.video.doc_id : details_request.doc_id
789
+ "http://#{@host}/videoplay?docid=#{doc_id}"
790
+ end
791
+
792
+ def _top_videos_url (top_request)
793
+ url = "http://#{@host}/videoranking"
794
+ if top_request.country
795
+ country_code = TopVideosRequest::COUNTRIES[top_request.country]
796
+ url += "?cr=#{country_code}"
797
+ end
798
+ url
799
+ end
800
+
801
+ def _search_url (search_request)
802
+ query = search_request.query
803
+ if (search_request.duration && search_request.duration != 'all')
804
+ query += '+duration:' + search_request.duration
805
+ end
806
+
807
+ url = "http://#{@host}/videosearch?q=#{URI.encode(query)}"
808
+
809
+ if (search_request.page)
810
+ url += '&page=' + URI.encode(search_request.page.to_s)
811
+ end
812
+
813
+ if (search_request.sort)
814
+ url += '&so=' + URI.encode(search_request.sort)
815
+ end
816
+
817
+ url
818
+ end
819
+
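URI.encode, used in _search_url, was removed in Ruby 3.0, so on a current interpreter the same URL could be assembled along the following lines. This is only a sketch under that assumption: build_search_url and its options hash are hypothetical, and the duration filter is joined to the query with a space, which the form encoder turns into a plus sign.

```ruby
require 'uri'

# Hypothetical replacement for _search_url that avoids the removed URI.encode.
def build_search_url(host, query, options = {})
  terms = query.dup
  duration = options[:duration]
  terms << " duration:#{duration}" if duration && duration != 'all'

  # Pairs are encoded in the order given; optional ones are added only when set.
  params = [['q', terms]]
  params << ['page', options[:page].to_s] if options[:page]
  params << ['so', options[:sort].to_s] if options[:sort]

  "http://#{host}/videosearch?" + URI.encode_www_form(params)
end

puts build_search_url('video.google.com', 'funny cats', :duration => 'short', :page => 2)
```

Note that the form encoder also percent-encodes the colon in the duration filter, which the original URI.encode call left as is.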
820
+ def _request (url)
821
+ begin
822
+ content = ''
823
+ open(url, "User-Agent" => @agent) { |f| content = f.read }
824
+ content
825
+ rescue
826
+ raise GoogleVideoException.new("failed to request '#{url}': #{$!}")
827
+ end
828
+ end
829
+ end
830
+ end
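The error wrapping in _request can be exercised without network access by passing the fetch step in as a block. The following is a sketch only: FetchError stands in for GoogleVideoException, and fetch_with_context is a hypothetical name, not a method of this library.

```ruby
# Hypothetical stand-in for GoogleVideoException.
class FetchError < StandardError; end

# Runs the given block with the url and re-raises any failure with the url
# included in the message, mirroring what _request does around open().
def fetch_with_context(url)
  yield url
rescue StandardError => e
  raise FetchError, "failed to request '#{url}': #{e.message}"
end

begin
  fetch_with_context('http://example.invalid/') { |u| raise IOError, 'connection refused' }
rescue FetchError => e
  puts e.message
end
```

Building the message with interpolation rather than string concatenation avoids depending on the exception being implicitly convertible to a String.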