RubyGems - google-video - Versions diffs - 0.5.0 - Mend

google-video 0.5.0

Files changed (16)

data/AUTHORS +1 -0
data/CHANGELOG +3 -0
data/README +84 -0
data/Rakefile +49 -0
data/TODO +14 -0
data/examples/example.rb +40 -0
data/examples/get-top-videos.rb +10 -0
data/examples/get-video-details.rb +10 -0
data/examples/search-videos.rb +10 -0
data/lib/google-video.rb +830 -0
data/test/test_client.rb +22 -0
data/test/test_top_videos.rb +76 -0
data/test/test_video_details.rb +104 -0
data/test/test_video_search.rb +89 -0
data/test/video_test_helper.rb +19 -0
metadata +81 -0

data/AUTHORS ADDED

@@ -0,0 +1 @@
+Walter Korman <shaper@wgks.org>

data/CHANGELOG ADDED

@@ -0,0 +1,3 @@
+* 2006/11/07
+- [shaper] Initial version.

data/README ADDED

@@ -0,0 +1,84 @@
+= Google Video
+A Ruby object-oriented interface to the video content available on Google Video at http://video.google.com.  Functionality is provided to do things including:
+* retrieve a list of current top videos (see GoogleVideo::Client#top_videos)
+* search for a list of videos matching a set of search query parameters (see GoogleVideo::Client#video_search)
+* retrieve full detailed information on a specific video (see GoogleVideo::Client#video_details)
+The RubyForge project is at http://rubyforge.org/projects/google-video.
+== About
+As the Google Video web site has no formally exposed API, we make use of the lovely Hpricot[http://code.whytheluckystiff.net/hpricot/] to parse desired data from the Google Video web pages.
+The Google Video web site is still in beta, so it is likely to change in ways that could impact the proper functionality of this library.  There is an initial set of unit tests provided with this library which should give some guidance as to its proper operation, and we will endeavor to update the library in accordance with Google's changes, but no promises can be made, so none can be broken, and hence your mileage may vary.
+See also the YouTube[http://rubyforge.org/projects/youtube] library for Ruby library access to another popular Google-owned video site.  Will these two one day live together in harmonious glory?  Will intrepid Google engineers rewrite YouTube to make use of GFS[http://labs.google.com/papers/gfs.html], Bigtable[http://labs.google.com/papers/bigtable.html] and a variety of Googley AJAX love?  Only time will tell!
+== Installing
+We recommend installing <tt>google-video</tt> via rubygems[http://rubyforge.org/projects/rubygems/] (see also http://www.rubygems.org).
+Once you have +rubygems+ installed on your system, you can easily install the <tt>google-video</tt> gem by executing:
+  % gem install --include-dependencies google-video
+<tt>google-video</tt> requires the Hpricot[http://code.whytheluckystiff.net/hpricot/] library for parsing HTML, and the HTMLEntities[http://htmlentities.rubyforge.org/] library for, uh, decoding HTML entities.  Both will be auto-installed by the above command if not already present.
+== Usage
+Instantiate a GoogleVideo::Client and use its methods (e.g. GoogleVideo::Client#top_videos, GoogleVideo::Client#video_search) to make requests of the Google Video server.
+Each Client method takes as a parameter its respective request object (e.g. GoogleVideo::VideoSearchRequest) and returns its respective response object (e.g. GoogleVideo::VideoSearchResponse).  See the method documentation for links and more information.
+An example program showing some simple requests follows.  The script is available in the distribution under <tt>examples/example.rb</tt> along with several others.
+  #!/usr/bin/env ruby
+  require 'rubygems'
+  require 'google-video'
+  require 'pp'
+  # create a client with which to submit requests
+  client = GoogleVideo::Client.new
+  # look up a list of the top 100 videos
+  request = GoogleVideo::TopVideosRequest.new
+  response = client.top_videos request
+  print "Top 5 Videos:\n"
+  response.videos[0...5].each { |video| pp(video) }
+  # choose one at random to look up in detail
+  index = rand(response.videos.length)
+  video = response.videos[index].video
+  print "\nRequesting video detail for:\n"
+  pp(video)
+  # look up the video's details
+  request = GoogleVideo::VideoDetailsRequest.new :video => video
+  response = client.video_details request
+  print "\nDetail:\n"
+  pp(response)
+  # look up a previously identified video by its document id
+  previous_doc_id = 8718762874044429036
+  request = GoogleVideo::VideoDetailsRequest.new :doc_id => previous_doc_id
+  response = client.video_details request
+  print "\nDetail on doc id #{previous_doc_id}:\n"
+  pp(response)
+  # search for a video on turtles
+  query = 'turtles'
+  request = GoogleVideo::VideoSearchRequest.new :query => query
+  response = client.video_search request
+  print "\nResults of video search for #{query}:\n"
+  pp(response)
+== License
+This library is provided via the GNU LGPL license at http://www.gnu.org/licenses/lgpl.html.
+== Authors
+Copyright 2006, Walter Korman <shaper@wgks.org>, http://www.lemurware.com.
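
The request objects described under Usage are built from a hash and validated at construction time, so a bad parameter fails before any HTTP request is made. A minimal sketch of that pattern follows; `SketchSearchRequest` is a hypothetical stand-in written for illustration, not a class shipped in the gem, and it covers only the query and sort checks.

```ruby
# Hypothetical stand-in for a request class; not part of google-video.
class SketchSearchRequest
  SORT_OPTIONS = %w[relevance rating date title].freeze

  attr_reader :query, :sort, :page

  # Accepts a hash mapping attribute names to values and validates it
  # up front, so callers learn about bad input immediately.
  def initialize(params)
    params.each { |key, value| instance_variable_set("@#{key}", value) }
    raise ArgumentError, 'query parameter required' unless @query
    if @sort && !SORT_OPTIONS.include?(@sort)
      raise ArgumentError, "invalid sort parameter: #{@sort}"
    end
  end
end

request = SketchSearchRequest.new(:query => 'turtles', :sort => 'date')
puts request.query

begin
  SketchSearchRequest.new(:sort => 'date')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Rescuing ArgumentError around request construction is therefore enough to catch malformed input from a command line or web form.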

data/Rakefile ADDED

@@ -0,0 +1,49 @@
+require 'rubygems'
+require 'rake'
+require 'rake/testtask'
+require 'rake/rdoctask'
+require 'rake/gempackagetask'
+spec = Gem::Specification.new do |s|
+  s.name = 'google-video'
+  s.version = '0.5.0'
+  s.author = 'Walter Korman'
+  s.email = 'shaper@wgks.org'
+  s.platform = Gem::Platform::RUBY
+  s.summary = 'A Ruby object-oriented interface to Google Video content.'
+  s.rubyforge_project = 'google-video'
+  s.has_rdoc = true
+  s.extra_rdoc_files = [ 'README' ]
+  s.rdoc_options << '--main' << 'README'
+  s.test_files = Dir.glob('test/test_*.rb')
+  s.files = Dir.glob("{examples,lib,test}/**/*") + [ 'AUTHORS', 'CHANGELOG', 'README', 'Rakefile', 'TODO' ]
+  s.add_dependency("hpricot", ">= 0.4")
+  s.add_dependency("htmlentities", ">= 3.0.1")
+end
+desc 'Run tests'
+task :default => [ :test ]
+Rake::TestTask.new('test') do |t|
+  t.libs << 'test'
+  t.pattern = 'test/test_*.rb'
+  t.verbose = true
+end
+desc 'Generate RDoc'
+Rake::RDocTask.new :rdoc do |rd|
+  rd.rdoc_dir = 'doc'
+  rd.rdoc_files.add 'lib', 'README'
+  rd.main = 'README'
+end
+desc 'Build Gem'
+Rake::GemPackageTask.new spec do |pkg|
+  pkg.need_tar = true
+end
+desc 'Clean up'
+task :clean => [ :clobber_rdoc, :clobber_package ]
+desc 'Clean up'
+task :clobber => [ :clean ]
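
The Rakefile above targets 2006-era tooling: `rake/rdoctask` and `rake/gempackagetask` were later dropped from Rake in favour of `rdoc/task` and `rubygems/package_task`, and `has_rdoc` and `rubyforge_project` were deprecated in RubyGems. As a hedged sketch, the same gem metadata can be expressed with only the attributes that current RubyGems still accepts; `SKETCH_SPEC` is a name chosen here for illustration.

```ruby
require 'rubygems'

# Same name, version, author, and dependencies as the Rakefile above,
# leaving out the attributes that later RubyGems releases deprecated.
SKETCH_SPEC = Gem::Specification.new do |s|
  s.name = 'google-video'
  s.version = '0.5.0'
  s.author = 'Walter Korman'
  s.email = 'shaper@wgks.org'
  s.summary = 'A Ruby object-oriented interface to Google Video content.'
  s.files = Dir.glob('{examples,lib,test}/**/*')
  s.add_dependency('hpricot', '>= 0.4')
  s.add_dependency('htmlentities', '>= 3.0.1')
end

# List the declared runtime dependencies and their version constraints.
SKETCH_SPEC.dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement}"
end
```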

data/TODO ADDED

@@ -0,0 +1,14 @@
+1 add some explicit doc_id-based checks on hard-coded actual values
+2 add support for advanced videosearch page (with pagination)
+2 genericize parsing of rating count
+2 pull in tags and comment text for video detail requests
+2 provide constant list of available genres for use in video search.
+2 report failure if we don't successfully parse the contents rather than returning a non-populated response
+2 add support for moversshakers page (with pagination)
+2 make sure we preserve click-through source query url param ("q=<xxx>") in video page urls
+2 look at whether all_text() is losing some detail text in description text parse
+2 consider simplifying qualified html query paths in top_videos where we can (e.g. star classes)
+3 add support for videocaptioned page (with pagination)
+3 figure out what the Playlist is all about and improve rdoc around it
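
One of the items above asks for a reported failure when page contents cannot be parsed, rather than a silently unpopulated response. A minimal sketch of one way to do that is shown here; `SketchParseError` and `sketch_capture` are hypothetical names, not part of google-video 0.5.0, and the rank pattern is borrowed from the details parser in lib/google-video.rb.

```ruby
# Hypothetical error class and helper, not part of google-video 0.5.0.
class SketchParseError < RuntimeError; end

# Returns the first capture group, or raises an error naming the field
# that could not be parsed instead of silently yielding nil.
def sketch_capture(pattern, text, field)
  match = pattern.match(text.to_s)
  raise SketchParseError, "failed to parse #{field}" unless match
  match[1]
end

puts sketch_capture(/rank ([0-9,]+)/, 'rank 1,204 (+3)', 'rank')

begin
  sketch_capture(/rank ([0-9,]+)/, '<td>no stats here</td>', 'rank')
rescue SketchParseError => e
  puts e.message
end
```

Routing every regular-expression lookup through a helper like this would turn a markup change on the site into an immediate, named error.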

data/examples/example.rb ADDED

@@ -0,0 +1,40 @@
+#!/usr/bin/env ruby
+require 'rubygems'
+require 'google-video'
+require 'pp'
+# create a client with which to submit requests
+client = GoogleVideo::Client.new
+# look up a list of the top 100 videos
+request = GoogleVideo::TopVideosRequest.new
+response = client.top_videos request
+print "Top 5 Videos:\n"
+response.videos[0...5].each { |video| pp(video) }
+# choose one at random to look up in detail
+index = rand(response.videos.length)
+video = response.videos[index].video
+print "\nRequesting video detail for:\n"
+pp(video)
+# look up the video's details
+request = GoogleVideo::VideoDetailsRequest.new :video => video
+response = client.video_details request
+print "\nDetail:\n"
+pp(response)
+# look up a previously identified video by its document id
+previous_doc_id = 8718762874044429036
+request = GoogleVideo::VideoDetailsRequest.new :doc_id => previous_doc_id
+response = client.video_details request
+print "\nDetail on doc id #{previous_doc_id}:\n"
+pp(response)
+# search for a video on turtles
+query = 'turtles'
+request = GoogleVideo::VideoSearchRequest.new :query => query
+response = client.video_search request
+print "\nResults of video search for #{query}:\n"
+pp(response)

data/examples/get-top-videos.rb ADDED

@@ -0,0 +1,10 @@
+#!/usr/bin/env ruby
+require 'rubygems'
+require 'google-video'
+require 'pp'
+client = GoogleVideo::Client.new
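+# optional first argument: a country name from TopVideosRequest::COUNTRIES, e.g. 'Japan'
+request = GoogleVideo::TopVideosRequest.new(ARGV[0] ? { :country => ARGV[0] } : nil)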
+response = client.top_videos request
+pp(response)
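
TopVideosRequest validates its country against the keys of its COUNTRIES table, so callers pass a display name such as 'Japan' rather than a three-letter code. A minimal sketch of that lookup follows, using a four-entry subset of the table; `sketch_country_code` is a hypothetical helper, and the assumption that the three-letter value is what reaches the request URL is not confirmed by the code shown on this page.

```ruby
# Subset of the COUNTRIES table from lib/google-video.rb.
SKETCH_COUNTRIES = {
  'All' => 'all',
  'Germany' => 'deu',
  'Japan' => 'jpn',
  'United States' => 'usa'
}.freeze

# Hypothetical helper: resolves a display name to its code, treating
# nil as "all countries" and rejecting anything that is not a key.
def sketch_country_code(name)
  return 'all' if name.nil?
  SKETCH_COUNTRIES.fetch(name) do
    raise ArgumentError, "invalid country parameter: #{name}"
  end
end

puts sketch_country_code('Japan')
puts sketch_country_code(nil)
```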

data/examples/get-video-details.rb ADDED

@@ -0,0 +1,10 @@
+#!/usr/bin/env ruby
+require 'rubygems'
+require 'google-video'
+require 'pp'
+client = GoogleVideo::Client.new
+request = GoogleVideo::VideoDetailsRequest.new :doc_id => ARGV[0].to_i
+response = client.video_details request
+pp(response)
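
The script above expects a numeric document id on the command line. When only a page URL is at hand, the id can be recovered with the same `docid=([^&]+)` pattern that Video#initialize applies to `page_url`. The sketch below is a hypothetical helper, and the sample URLs are illustrative; unlike the library code, it returns nil instead of 0 when no id is present.

```ruby
# Hypothetical helper; the regex is the one Video#initialize uses.
def sketch_doc_id(page_url)
  match = page_url.match(/docid=([^&]+)/)
  match && match[1].to_i
end

# Illustrative URLs, not taken from a live response.
puts sketch_doc_id('http://video.google.com/videoplay?docid=8718762874044429036&q=turtles')
puts sketch_doc_id('http://video.google.com/').inspect
```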

data/examples/search-videos.rb ADDED

@@ -0,0 +1,10 @@
+#!/usr/bin/env ruby
+require 'rubygems'
+require 'google-video'
+require 'pp'
+client = GoogleVideo::Client.new
+request = GoogleVideo::VideoSearchRequest.new :query => ARGV[0]
+response = client.video_search request
+pp(response)
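
The search response printed by this script carries `start_index`, `end_index`, `total_result_count`, and `execution_time`, which Client#video_search extracts from the results header with a single regular expression. A minimal sketch of that step is given here, run against an invented sample string shaped to fit the pattern; `sketch_parse_stats` is a hypothetical helper. Commas are removed before conversion because `"12,345".to_i` stops at the comma.

```ruby
# Stats pattern copied from Client#video_search.
SKETCH_STATS = /([0-9,]+) \- ([0-9,]+)<\/b> of about <b>([0-9,]+)<\/b> \(<b>([0-9.]+)/

# Hypothetical helper: returns the four stats as a hash, or nil when
# the markup does not match the pattern.
def sketch_parse_stats(html)
  m = SKETCH_STATS.match(html)
  return nil unless m
  {
    :start_index => m[1].delete(',').to_i,
    :end_index => m[2].delete(',').to_i,
    :total_result_count => m[3].delete(',').to_i,
    :execution_time => m[4].to_f
  }
end

# Invented sample markup, not captured from the real site.
sample = 'Results <b>1 - 10</b> of about <b>12,345</b> (<b>0.07</b> seconds)'
p sketch_parse_stats(sample)
```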

data/lib/google-video.rb ADDED

@@ -0,0 +1,830 @@
+# google-video -- provides OO access to the Google Video web site content
+# Copyright (C) 2006 Walter Korman <shaper@wgks.org>
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+require 'open-uri'
+require 'hpricot'
+require 'htmlentities'
+require 'time'
+# Extension to Hpricot for our own parsing purposes.
+class Hpricot::Elem
+  # add in an easy way to gather all raw text content from within an element.
+  # the latest unstable version of hpricot has some useful routines like this
+  # already but we'd like to use the gem-installable version for now so we
+  # have to go it alone.
+  def all_text
+    text = ''
+    each_child { |c| text += c.to_s if c.is_a?(Hpricot::Text) }
+    text
+  end
+end
+module GoogleVideo
+  # An exception thrown by the GoogleVideo module should something untoward
+  # occur.
+  class GoogleVideoException < RuntimeError
+    def initialize (message)
+      super(message)
+    end
+  end
+  # Describes a single video file available for viewing on Google Video.
+  class Video
+    # the prose text describing the video contents.
+    attr_reader :description
+    # the google unique identifier.
+    attr_reader :doc_id
+    # the duration of the video in prose, e.g. "1hr 49min" or "3min".
+    attr_reader :duration
+    # the full url at which the video is available for viewing.
+    attr_reader :page_url
+    # only available via a details request: a list of PlaylistEntry objects
+    # detailed in the "Playlist" of next-up videos displayed on the video
+    # detail page.
+    attr_reader :playlist_entries
+    # only available via a details request: the current rank of this video on
+    # the site.
+    attr_reader :rank
+    # only available via a details request: the change in rank this video has
+    # seen since last ranking: if positive, a move up, if negative, a move
+    # down.
+    attr_reader :rank_delta
+    # the number of ratings this video has received.
+    attr_reader :rating_count
+    # the number of stars all of the video ratings currently average out to,
+    # e.g. 4.5.
+    attr_reader :star_count
+    # the full url to the video thumbnail image.
+    attr_reader :thumbnail_image_url
+    # the title text describing the video.
+    attr_reader :title
+    # the date at which the video was uploaded.
+    attr_reader :upload_date
+    # the name of the user who uploaded the video; not always available.
+    attr_reader :upload_user
+    # only available via a details request, and then only for some videos: the
+    # domain provided by the user who uploaded the video.  the reader name
+    # matches the :upload_user_domain key set by Client#video_details.
+    attr_reader :upload_user_domain
+    # only available via a details request, and then only for some videos: the
+    # redirect url through Google Video through which you may reach the
+    # uploading user's website.
+    attr_reader :upload_user_url
+    # only available via a details request: the url to the video's
+    # <tt>.gvp</tt> format video file.  see
+    # http://en.wikipedia.org/wiki/Google_Video#Google_Video_Player for more
+    # information on file formats and the like.
+    attr_reader :video_file_url
+    # only available via a details request: a list of VideoFrameThumbnail
+    # objects describing zero or more individual frame stills within the
+    # video.
+    attr_reader :video_frame_thumbnails
+    # only available via a details request: the total number of views this
+    # video has received to date.
+    attr_reader :view_count
+    # only available via a details request: the number of views this video
+    # received yesterday.
+    attr_reader :yesterday_view_count
+    # the default width in pixels for the video embed html.
+    @@DEFAULT_WIDTH = 400
+    # the default height in pixels for the video embed html.
+    @@DEFAULT_HEIGHT = 326
+    # Constructs a Video with the supplied hash mapping attribute names to
+    # their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+      # pull the doc id out of the page url if we've got one
+      if (@page_url)
+        @page_url =~ /docid=([^&]+)/
+        @doc_id = $1.to_i
+      end
+    end
+    # Returns HTML suitable for embedding this video in a web page with the
+    # video occupying the specified dimensions.  The generated html matches
+    # that which is provided by the Google Video web site embed instructions.
+    def embed_html (width = @@DEFAULT_WIDTH, height = @@DEFAULT_HEIGHT)
+      <<edoc
+<embed style="width:#{width}px; height:#{height}px;" id="VideoPlayback" type="application/x-shockwave-flash"
+ src="http://video.google.com/googleplayer.swf?docId=#{@doc_id}&hl=en" flashvars=""> </embed>
+edoc
+    end
+  end
+  # Provides a miniature snapshot of a point in time within a full video.  On
+  # the actual Google Video site these are linked to auto-pan the video to the
+  # start of the scene represented in the thumbnail, but for us there's no
+  # direct link of meaning so we've only got static content which is
+  # nevertheless of potential interest and utility.
+  class VideoFrameThumbnail
+    # the title of the video frame, e.g. "at 10 secs", "at 30 secs", etc.
+    attr_reader :title
+    # the full url to the video frame thumbnail image.
+    attr_reader :thumbnail_image_url
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # Describes an alternate video as listed in the "Playlist" tab for a
+  # particular video's detailed view page.
+  class PlaylistEntry
+    # the title of the video.
+    attr_reader :title
+    # the full url to the video thumbnail image.
+    attr_reader :thumbnail_image_url
+    # the duration of the video in prose, e.g. "1hr 49min" or "3min".
+    attr_reader :duration
+    # the name of the user who uploaded the video.
+    attr_reader :upload_user
+    # the full url at which the video is available for viewing.
+    attr_reader :page_url
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # Describes an entry on the "top videos" page.
+  class TopVideo
+    # the direction the entry moved from the previous day to today: 0 for no
+    # change, 1 for moving up, -1 for moving down.
+    attr_reader :movement
+    # the entry's rank today.
+    attr_reader :rank_today
+    # the entry's rank yesterday.
+    attr_reader :rank_yesterday
+    # the entry's video details as a Video object.
+    attr_reader :video
+    # Constructs a TopVideo with the supplied hash mapping attribute names to
+    # their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # Describes a request for the current list of top videos on Google Video.
+  class TopVideosRequest
+    # the list of countries by which one can constrain a search for top videos
+    COUNTRIES = {
+      'All' => 'all',
+      'Argentina' => 'arg',
+      'Australia' => 'aus',
+      'Austria' => 'aut',
+      'Brazil' => 'bra',
+      'Canada' => 'can',
+      'Chile' => 'chl',
+      'Denmark' => 'dnk',
+      'Finland' => 'fin',
+      'France' => 'fra',
+      'Germany' => 'deu',
+      'Greece' => 'grc',
+      'Hong Kong' => 'hkg',
+      'India' => 'ind',
+      'Indonesia' => 'idn',
+      'Ireland' => 'irl',
+      'Israel' => 'isr',
+      'Italy' => 'ita',
+      'Japan' => 'jpn',
+      'Kenya' => 'ken',
+      'Malaysia' => 'mys',
+      'Mexico' => 'mex',
+      'Netherlands' => 'nld',
+      'New Zealand' => 'nzl',
+      'Norway' => 'nor',
+      'Peru' => 'per',
+      'Philippines' => 'phl',
+      'Poland' => 'pol',
+      'Russia' => 'rus',
+      'Saudi Arabia' => 'sau',
+      'Singapore' => 'sgp',
+      'South Africa' => 'zaf',
+      'South Korea' => 'kor',
+      'Spain' => 'esp',
+      'Sweden' => 'swe',
+      'Switzerland' => 'che',
+      'Taiwan' => 'twn',
+      'Thailand' => 'tha',
+      'Turkey' => 'tur',
+      'Ukraine' => 'ukr',
+      'United Arab Emirates' => 'are',
+      'United Kingdom' => 'gbr',
+      'United States' => 'usa',
+      'Vietnam' => 'vnm'
+    }
+    # optional: the country by which results are to be constrained, given as a
+    # key of COUNTRIES (e.g. 'Japan') and defaulting to nil.  Specifying 'All'
+    # or nil will provide results across all countries.
+    attr_reader :country
+    # Constructs a TopVideosRequest with an optional supplied hash mapping
+    # attribute names to their respective values.
+    def initialize (params = nil)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) } if (params)
+      # validate request parameters
+      if @country && !COUNTRIES.include?(@country)
+        raise ArgumentError.new("invalid country parameter: #{@country}")
+      end
+    end
+  end
+  # Describes a response from Google Video providing a list of current top videos.
+  class TopVideosResponse
+    # the url with which the request was made of the Google Video service.
+    attr_reader :request_url
+    # the list of TopVideo objects comprising the top videos results.
+    attr_reader :videos
+    # Constructs a TopVideosResponse with the supplied hash mapping attribute
+    # names to their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # Describes a search request for videos matching the specified parameters on
+  # Google Video.
+  class VideoSearchRequest
+    # the list of valid sort parameters.
+    SORT_OPTIONS = [ 'relevance', 'rating', 'date', 'title' ]
+    # the list of valid duration parameters.
+    DURATION_OPTIONS = [ 'all', 'short', 'medium', 'long' ]
+    # optional: the sort order for search results: one of 'relevance',
+    # 'rating', 'date', 'title', defaulting to 'relevance'.
+    attr_reader :sort
+    # optional: the duration by which to filter search results: one of 'all',
+    # 'short', 'medium' or 'long', defaulting to 'all'.
+    attr_reader :duration
+    # required: the query string by which to search.
+    attr_reader :query
+    # optional: the page number of results sought, defaulting to 1.
+    attr_reader :page
+    # Constructs a VideoSearchRequest with the supplied hash mapping attribute
+    # names to their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+      # validate request parameters
+      if @sort && !SORT_OPTIONS.include?(@sort)
+        raise ArgumentError.new("invalid sort parameter: #{@sort}")
+      end
+      if @duration && !DURATION_OPTIONS.include?(@duration)
+        raise ArgumentError.new("invalid duration parameter: #{@duration}")
+      end
+      if !@query
+        raise ArgumentError.new("invalid request, query parameter required")
+      end
+    end
+  end
+  # Describes a response to a VideoSearchRequest.
+  class VideoSearchResponse
+    # the url with which the request was made of the Google Video service.
+    attr_reader :request_url
+    # the 1-based starting index number of the results in this set.
+    attr_reader :start_index
+    # the 1-based ending index number of the results in this set.
+    attr_reader :end_index
+    # the total number of results in this result set.
+    attr_reader :total_result_count
+    # the time taken in seconds to execute the search query (according to Google).
+    attr_reader :execution_time
+    # the list of Video objects comprising the search results.
+    attr_reader :videos
+    # Constructs a VideoSearchResponse with the supplied hash mapping
+    # attribute names to their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # Describes a request of the Google Video service for details on a specific
+  # previously retrieved Video or its stored +doc_id+.  Only one of either
+  # +video+ or +doc_id+ should be specified in the request.
+  class VideoDetailsRequest
+    # the Video object whose details are sought.
+    attr_reader :video
+    # the +doc_id+ of the video whose details are sought.
+    attr_reader :doc_id
+    # Constructs a VideoDetailsRequest with the supplied hash mapping
+    # attribute names to their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+      # validate request parameters
+      if !@video && !@doc_id
+        raise ArgumentError.new("invalid request, one of video or doc_id parameter required")
+      end
+      if @video && @doc_id
+        raise ArgumentError.new("invalid request, only one of video or doc_id parameters may be specified")
+      end
+      if @video && !@video.is_a?(Video)
+        raise ArgumentError.new("invalid request, video must be a GoogleVideo::Video")
+      end
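+      # accept any Integer: small ids are not Bignums, and the Bignum constant
+      # no longer exists in current Ruby versions
+      if @doc_id && !@doc_id.is_a?(Integer)
+        raise ArgumentError.new("invalid request, doc_id must be an Integer")
+      end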
+    end
+  end
+  # Describes a response to a VideoDetailsRequest.
+  class VideoDetailsResponse
+    # the url with which the request was made of the Google Video service.
+    attr_reader :request_url
+    # the new Video object providing full detailed information.
+    attr_reader :video
+    # Constructs a VideoDetailsResponse with the supplied hash mapping
+    # attribute names to their respective values.
+    def initialize (params)
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) }
+    end
+  end
+  # The main client object providing interface methods for retrieving
+  # information from the Google Video server.
+  class Client
+    # the default hostname queried to retrieve google video content.
+    @@DEFAULT_HOST = 'video.google.com'
+    # the default user agent submitted with http requests of google video.
+    @@DEFAULT_AGENT = 'google-video for Ruby (http://www.rubyforge.org/projects/google-video/)'
+    # Constructs a Client for querying the Google Video server.  Optional
+    # parameters to be specified as a hash include:
+    # * host: optional alternate host name to query instead of the default host.
+    # * agent: optional alternate user agent to submit with http requests
+    #   instead of the default agent.
+    def initialize (params = nil)
+      @host = @@DEFAULT_HOST
+      @agent = @@DEFAULT_AGENT
+      params.each { |key, value| instance_variable_set('@' + key.to_s, value) } if params
+    end
+    # Runs a search query on Google Video with the parameters specified in the
+    # supplied VideoSearchRequest and returns a VideoSearchResponse.
+    def video_search (search_request)
+      # validate parameters
+      if !search_request.is_a?(VideoSearchRequest)
+        raise ArgumentError.new("invalid argument, request must be a GoogleVideo::VideoSearchRequest")
+      end
+      # gather response data from the server
+      url = _search_url(search_request)
+      response = _request(url)
+      doc = Hpricot(response)
+      # parse the overall search query stats
+      regexp_stats = Regexp.new(/([0-9,]+) \- ([0-9,]+)<\/b> of about <b>([0-9,]+)<\/b> \(<b>([0-9.]+)/)
+      row = (doc/"#resultsheadertable/tr/td/font").first
+      if !regexp_stats.match(row.inner_html)
+        raise GoogleVideoException.new("failed to parse search query stats")
+      end
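+      # strip thousands separators first, since "1,234".to_i would yield 1
+      ( start_index, end_index, total_result_count, execution_time ) = [ $1.delete(',').to_i, $2.delete(',').to_i, $3.delete(',').to_i, $4.to_f ]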
+      # parse the video results
+      videos = []
+      rows = doc/"table[@class='searchresult']/tr"
+      rows.each do |row|
+        # parse the thumbnail image
+        thumbnail_image_url = _decode_html((row/"img[@class='searchresultimg']").first.attributes['src'])
+        # parse the title and page url
+        a_title = (row/"div[@class='resulttitle']/a").first
+        page_url = 'http://' + @host + '/' + _decode_html(a_title.attributes['href'])
+        title = _decode_html(a_title.inner_html.strip)
+        # parse the description text
+        description = _decode_html((row/"div[@class='snippet']").first.inner_html.strip)
+        # parse the upload username
+        span_channel = (row/"span[@class='channel']").first
+        channel_html = (span_channel) ? span_channel.inner_html : ''
+        channel_html =~ /([^\-]+)/
+        upload_user = _clean_string($1)
+        # stars
+        star_count = _parse_star_elements(row/"img[@class='star']")
+        # rating count
+        span_raters = (row/"span[@id='numOfRaters']").first
+        rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
+        # duration
+        span_date = (row/"span[@class='date']").first
+        date_html = span_date.inner_html
+        date_html =~ /([^\-]+) \- (.*)$/
+        duration = _clean_string($1)
+        upload_date = Time.parse(_clean_string($2))
+        # construct the video object and tack it onto the video result list
+        videos << Video.new(:title => title,
+                            :page_url => page_url,
+                            :thumbnail_image_url => thumbnail_image_url,
+                            :description => description,
+                            :star_count => star_count,
+                            :rating_count => rating_count,
+                            :duration => duration,
+                            :upload_date => upload_date,
+                            :upload_user => upload_user)
+      end
+      # construct the final search response with all info we've gathered
+      VideoSearchResponse.new(:request_url => url,
+                              :start_index => start_index,
+                              :end_index => end_index,
+                              :total_result_count => total_result_count,
+                              :execution_time => execution_time,
+                              :videos => videos)
+    end
+    # Looks up detailed information on a specific Video on Google Video with
+    # the parameters specified in the supplied VideoDetailsRequest and returns
+    # a VideoDetailsResponse.
+    def video_details (details_request)
+      # validate parameters
+      if !details_request.is_a?(VideoDetailsRequest)
+        raise ArgumentError.new("invalid argument, request must be a GoogleVideo::VideoDetailsRequest")
+      end
+      # gather response data from the server
+      url = _video_details_url(details_request)
+      response = _request(url)
+      doc = Hpricot(response)
+      # parse title
+      title = (doc/"div[@id='pvprogtitle']").inner_html.strip
+      # parse description
+      font_description = (doc/"div[@id='description']/font").first
+      description = (font_description) ? font_description.all_text.strip : ''
+      span_wholedescr = (doc/"span[@id='wholedescr']").first
+      if (span_wholedescr)
+        description += ' ' + span_wholedescr.all_text.strip
+      end
+      description = _decode_html(description)
+      # parse star count
+      span_rating = (doc/"span[@id='communityRating']").first
+      star_count = _parse_star_elements(span_rating/"img[@class='star']")
+      # parse rating count
+      span_raters = (doc/"span[@id='numOfRaters']").first
+      rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
+      # parse upload user, duration, upload date, upload user domain, upload
+      # user url.  unfortunately this is a bit messy since, unlike much of the
+      # rest of google's lovely html, there are no useful id or class names we
+      # can hang our hat on.  rather, there are anywhere from one to three
+      # rows of text, with only the middle row (in the three-row scenario)
+      # containing duration and upload date, omnipresent.  still, we buckle
+      # down and have at it with fervor and tenacity.
+      duration_etc_html = (doc/"div[@id='durationetc']").first.inner_html
+      duration_parts = duration_etc_html.split(/<br[^>]*>/)
+      # see if the first line looks like it has a date formatted ala 'Nov 9, 2006'
+      if (duration_parts[0] =~ /\-  [A-Za-z]{3} \d+, \d{4}/)
+        # first line is duration / upload_date, and there is no upload username
+        upload_user = ''
+        duration_upload_html = duration_parts[0]
+        upload_user_domain = duration_parts[1]
+      else
+        upload_user = _clean_string(duration_parts[0])
+        duration_upload_html = duration_parts[1]
+        upload_user_domain = duration_parts[2]
+      end
+      # parse the duration and upload date
+      ( duration, upload_date ) = duration_upload_html.split(/\-/)
+      duration = _clean_string(duration)
+      upload_date = Time.parse(_clean_string(upload_date))
+      # parse the upload user url and domain if present
+      if (upload_user_domain =~ /<a.*?href="([^"]+)"[^>]+>([^<]+)<\/a>/)
+        upload_user_url = 'http://' + @host + _decode_html(_clean_string($1))
+        upload_user_domain = _clean_string($2)
+      else
+        upload_user_url = ''
+        upload_user_domain = ''
+      end
+      # pull out view count and rank info table row elements
+      ( tr_total_views, tr_views_yesterday, tr_rank ) = (doc/"table[@id='statsall']/tr")
+      # parse view count
+      tr_total_views.inner_html =~ /<b>([^<]+)<\/b>/
+      view_count = _human_number_to_int($1)
+      # parse yesterday's view count
+      tr_views_yesterday.inner_html =~ /\s+([0-9,]+) yesterday/
+      yesterday_view_count = _human_number_to_int($1)
+      # parse rank
+      tr_rank.inner_html =~ /rank ([0-9,]+)/
+      rank = _human_number_to_int($1)
+      # parse rank delta
+      (tr_rank/"span").first.inner_html =~ /\(([0-9\+\-,]+)\)/
+      rank_delta = _human_number_to_int($1)
+      # pull out the url to the video .gvp file if present
+      img_download = (doc/"img[@src='/static/btn_download.gif']").first
+      if (img_download)
+        onclick_html = img_download.attributes['onclick']
+        onclick_script = _decode_html(onclick_html)
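+        # NOTE: the argument list captured by this match ($1) is not used; the
+        # url is taken from the second comma-separated piece of the whole script
+        onclick_script =~ /onDownloadClick\(([^\)]+)\)/
+        video_file_url = onclick_script.split(",")[1].gsub(/"/, '')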
+      else
+        video_file_url = ''
+      end
+      # pull out the video frame thumbnails
+      video_frame_thumbnails = []
+      (doc/"img[@class='detailsimage']").each do |frame_image|
+        video_frame_thumbnails << _parse_video_frame_thumbnail(frame_image)
+      end
+      # pull out the playlist entries
+      playlist_entries = []
+      table_upnext = (doc/"table[@id='upnexttable']").first
+      (table_upnext/"tr").each do |tr_playlist|
+        playlist_entries << _parse_playlist_entry(tr_playlist)
+      end
+      # create the new, fully populated video record
+      video = Video.new(:description => description,
+                        :duration => duration,
+                        :page_url => url,
+                        :playlist_entries => playlist_entries,
+                        :rank => rank,
+                        :rank_delta => rank_delta,
+                        :rating_count => rating_count,
+                        :star_count => star_count,
+                        :title => title,
+                        :upload_date => upload_date,
+                        :upload_user => upload_user,
+                        :upload_user_domain => upload_user_domain,
+                        :upload_user_url => upload_user_url,
+                        :video_file_url => video_file_url,
+                        :video_frame_thumbnails => video_frame_thumbnails,
+                        :view_count => view_count,
+                        :yesterday_view_count => yesterday_view_count)
+      # build and return the response
+      VideoDetailsResponse.new(:request_url => url, :video => video)
+    end
+    # Looks up top videos on Google Video with the parameters specified in the
+    # supplied TopVideosRequest and returns a TopVideosResponse.
+    def top_videos (top_request)
+      # validate parameters
+      if !top_request.is_a?(TopVideosRequest)
+        raise ArgumentError.new("invalid argument, request must be a GoogleVideo::TopVideosRequest")
+      end
+      # gather response data from the server
+      url = _top_videos_url(top_request)
+      response = _request(url)
+      doc = Hpricot(response)
+      # parse out each of the top video entries
+      top_videos = []
+      # grab the top 100 table rows
+      rows = doc/"table[@class='table-top100']/tr"
+      # the first row is just header info, so skip it
+      rows.shift
+      # there's one video per row, so we iterate over the table row elements
+      rows.each do |row|
+        # break the table cells into logically-named elements we can manipulate more precisely
+        (td_movement, td_rank_today, td_rank_yesterday, td_thumbnail, td_detail) = (row/"td")
+        # parse the rank movement direction
+        movement_html = (td_movement/"img").to_html
+        if (movement_html =~ /up\.gif/)
+          movement = 1
+        elsif (movement_html =~ /down\.gif/)
+          movement = -1
+        else
+          movement = 0
+        end
+        # parse today and yesterday's rank
+        rank_today = td_rank_today.inner_html.to_i
+        rank_yesterday = td_rank_yesterday.inner_html.to_i
+        # parse the video thumbnail image
+        thumbnail_image_url = _decode_html((td_thumbnail/"a/img").first.attributes['src'])
+        # parse the detailed video info
+        a_video = (td_detail/"a").first
+        page_url = 'http://' + @host + a_video.attributes['href']
+        # title
+        title = _decode_html(a_video.inner_html.strip)
+        # stars
+        star_count = _parse_star_elements(td_detail/"div[@class='meta']/span/font/img[@class='star']")
+        # rating count
+        span_raters = (td_detail/"div[@class='meta']/span/font/span[@id='numOfRaters']").first
+        rating_count = (span_raters) ? span_raters.inner_html.to_i : 0
+        # duration
+        duration = (td_detail/"div[@class='meta']").first.all_text.gsub(/&nbsp;/, '').strip
+        # description
+        description = _decode_html((td_detail).all_text.strip)
+        # construct the video object
+        video = Video.new(:title => title,
+                          :page_url => page_url,
+                          :thumbnail_image_url => thumbnail_image_url,
+                          :star_count => star_count,
+                          :rating_count => rating_count,
+                          :duration => duration,
+                          :description => description)
+        # create the top video entry and throw it on the list of top videos
+        top_videos << TopVideo.new(:movement => movement,
+                                   :rank_today => rank_today,
+                                   :rank_yesterday => rank_yesterday,
+                                   :video => video)
+      end
+      TopVideosResponse.new(:request_url => url,
+                            :videos => top_videos)
+    end
+    private
+    # Breakout method used by Client#video_details to make things more
+    # manageable, returning a VideoFrameThumbnail constructed from the given
+    # image element.
+    def _parse_video_frame_thumbnail (frame_image)
+      title = frame_image.attributes['title']
+      thumbnail_image_url = _decode_html(frame_image.attributes['src'])
+      VideoFrameThumbnail.new(:title => title, :thumbnail_image_url => thumbnail_image_url)
+    end
+    # Breakout method used by Client#video_details to make things more
+    # manageable, returning a PlaylistEntry constructed from the given table
+    # row element.
+    def _parse_playlist_entry (tr_playlist)
+      # pull out the playlist entry table cell elements
+      ( td_thumbnail, td_playlist ) = (tr_playlist/"td")
+      # parse thumbnail image
+      thumbnail_image_url = _decode_html((td_thumbnail/"img").first.attributes['src'])
+      # parse the page url and title
+      a_page = (td_playlist/"a").first
+      page_url = 'http://' + @host + a_page.attributes['href']
+      title = _decode_html(a_page.attributes['title'])
+      # parse the upload user and duration
+      meta_html = (td_playlist/"span[@class='meta']").first.inner_html
+      if (meta_html =~ /([^<]+)<br \/>(.*)/)
+        upload_user = _clean_string($1)
+        duration = _clean_string($2)
+      else
+        upload_user = ''
+        duration = _clean_string(meta_html)
+      end
+      PlaylistEntry.new(:title => title,
+                        :page_url => page_url,
+                        :upload_user => upload_user,
+                        :duration => duration,
+                        :thumbnail_image_url => thumbnail_image_url)
+    end
+    def _decode_html (text)
+      HTMLEntities.decode_entities(text)
+    end
+    def _clean_string (text)
+      text ? text.strip : ''
+    end
+    def _human_number_to_int (text)
+      text ? text.gsub(/,/, '').to_i : 0
+    end
+    def _parse_star_elements (elements)
+      star_count = 0
+      elements.each do |star|
+        star_src = star.attributes['src']
+        if (star_src =~ /starLittle\.gif$/)
+          star_count += 1
+        elsif (star_src =~ /starLittleHalf\.gif$/)
+          star_count += 0.5
+        end
+      end
+      star_count
+    end
+    def _video_details_url (details_request)
+      doc_id = (details_request.video) ? details_request.video.doc_id : details_request.doc_id
+      "http://#{@host}/videoplay?docid=#{doc_id}"
+    end
+    def _top_videos_url (top_request)
+      url = "http://#{@host}/videoranking"
+      if top_request.country
+        country_code = TopVideosRequest::COUNTRIES[top_request.country]
+        url += "?cr=#{country_code}"
+      end
+      url
+    end
+    def _search_url (search_request)
+      query = search_request.query
+      if (search_request.duration && search_request.duration != 'all')
+        query += '+duration:' + search_request.duration
+      end
+      url = "http://#{@host}/videosearch?q=#{URI.encode(query)}"
+      if (search_request.page)
+        url += '&page=' + URI.encode(search_request.page.to_s)
+      end
+      if (search_request.sort)
+        url += '&so=' + URI.encode(search_request.sort)
+      end
+      url
+    end
+    def _request (url)
+      begin
+        content = ''
+        open(url, "User-Agent" => @agent) { |f| content = f.read }
+        content
+      rescue
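+        # interpolate the exception; String#+ needs an implicit String conversion, which Exception lacks since Ruby 1.9
+        raise GoogleVideoException.new("failed to request '#{url}': #{$!}")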
+      end
+    end
+  end
+end
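The private helpers `_human_number_to_int` and `_parse_star_elements` above carry most of the numeric parsing, so a small standalone sketch may help show what they return. This is only an illustration, not the library's code: the method names `human_number_to_int` and `count_stars` are hypothetical, and `count_stars` assumes plain `src` strings as input, whereas the library passes Hpricot image elements and reads `attributes['src']`.

```ruby
# Standalone sketch of two parsing helpers, rewritten to take plain strings
# so they can be run without Hpricot or a network connection.

# Converts a human-formatted number such as "1,234,567" to an Integer.
# Returns 0 for nil, mirroring the nil check in the original helper.
def human_number_to_int(text)
  text ? text.gsub(/,/, '').to_i : 0
end

# Counts stars from a list of image src strings (assumed input shape).
# A full star image adds 1, a half star image adds 0.5, anything else adds 0.
def count_stars(srcs)
  srcs.inject(0) do |sum, src|
    if src =~ /starLittle\.gif$/
      sum + 1
    elsif src =~ /starLittleHalf\.gif$/
      sum + 0.5
    else
      sum
    end
  end
end

puts human_number_to_int("1,234,567")
puts human_number_to_int("+12")
puts human_number_to_int(nil)
puts count_stars(["/static/starLittle.gif", "/static/starLittleHalf.gif"])
```

Because `String#to_i` accepts a leading sign, the same conversion also handles signed strings such as the rank delta once the commas are removed.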