google-site-search 0.0.5 → 0.0.7

data/README.rdoc CHANGED
@@ -10,35 +10,56 @@ In the simplest use case it will query your google site search for a term and su
10
10
 
11
11
  Add the following to your projects Gemfile.
12
12
 
13
- gem 'google-site-search', :git => "git@github.com:dvallance/google-site-search.git"
13
+ gem 'google-site-search', :git => "git@github.com:dvallance/google-site-search.git"
14
14
 
15
15
  Require the code if necessary (_note:_ some frameworks like rails are set to auto-require gems for you by default)
16
16
 
17
- require 'google-site-search'
17
+ require 'google-site-search'
18
18
 
19
19
  == Usage
20
20
 
21
21
  The simpliest way to use the gem is by providing just a *search* *query* *term* and your *search* *engine* *unique* *id* code (_e.g._ looks like this +00255077836266642015+:+u-scht7a-8i+ and is located in your google site search control panel)
22
22
 
23
- #just assign the query to an object
24
- search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("microsoft", "00255077836266642015:u-scht7a-8i")
23
+ # just assign the query to an object
24
+ search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("microsoft", "00255077836266642015:u-scht7a-8i")
25
25
 
26
- #object has search attributes like
27
- puts search.next_results_url
28
- puts search.previous_results_url
29
- puts search.xml
30
- puts search.spelling
31
- puts search.spelling_url
32
-
33
- #object has an array of each specific result that contains title, description and its link by default
34
- search.results.each do |result|
35
- puts result.title
36
- puts result.description
37
- puts result.link
38
- end
26
+ # object has search attributes like
27
+ puts search.next_results_url
28
+ puts search.previous_results_url
29
+ puts search.xml
30
+ puts search.spelling
31
+ puts search.spelling_url
32
+
33
+ # object has an array of each specific result that contains title, description and its link by default
34
+ search.results.each do |result|
35
+ puts result.title
36
+ puts result.description
37
+ puts result.link
38
+ end
39
39
 
40
40
  The _query_ method expects a valid url so if you wanted to supply your own you can! However I have created a builder class to help with proper url creation and to help do some of the work for you.
41
41
 
42
+ == Multiple Search
43
+
44
+ Since google only allows a max of 20 returned results I have added a method that will capture up to *n* number of results.
45
+
46
+ # the array will be up to 5 search objects if the query actually has that many results.
47
+ # has soon as a search doesn't have a next_results_url the method stops.
48
+ array_of_search_results = GoogleSiteSearch.query_multiple(5, *YOUR_URL*)
49
+
50
+ == Blocks (and a caching example)
51
+
52
+ Both the query and query_multiple methods can take a block which executes for each Search object found.
53
+
54
+ # here is a rails example using the block
55
+ url = GoogleSiteSearch::UrlBuilder.new("microsoft", "00255077836266642015:u-scht7a-8i")
56
+ @search = Rails.cache.fetch(url) do |search|
57
+ # I can do something with the search objects.
58
+ # possibly custom analytics for our searchs?
59
+ search.url # we can access the object
60
+ end
61
+
62
+
42
63
  == Advanced Usage
43
64
 
44
65
  An important requirement for this gem was to be able to use {structured data}[https://developers.google.com/custom-search/docs/structured_data] for:
@@ -49,15 +70,14 @@ Therefore I allow the developer to supply his own "*Results*" class to the query
49
70
 
50
71
  The default Result class is as follows:
51
72
 
52
- class Result
53
- attr_reader :title, :link, :description
54
-
55
- def initialize(node)
56
- @title = node.find_first("T").content
57
- @link = node.find_first("UE").content
58
- @description = node.find_first("S").content
59
- end
60
- end
73
+ class Result
74
+ attr_reader :title, :link, :description
75
+ def initialize(node)
76
+ @title = node.find_first("T").content
77
+ @link = node.find_first("UE").content
78
+ @description = node.find_first("S").content
79
+ end
80
+ end
61
81
 
62
82
  As you can see it is very simple. Your class simply needs an initialize method that will recieve an xml node, which it can then do with as it pleases. After it is initialized it is added to the _search.results_ array as shown previously.
63
83
 
@@ -67,9 +87,9 @@ See
67
87
 
68
88
  == Pagination
69
89
 
70
- The google search api actually does the work of pagination for us, supplying the next and previous urls. The urls are relative to \http://www.google.com so I added a _paginate_ method to simplify the call.
90
+ The google search api does the work of pagination for us, supplying the next and previous urls. The urls are relative paths and contain the search engine id parameter. Since this is a security concern I strip out the search engine id when I store them in the Search.next_result_url and Search.previous_result_url methods. This makes them safe to put in links on views and it is why you must supply the search engine id again on the paginate call; so full url can be rebuilt for the query call.
71
91
 
72
- search2 = GoogleSiteSearch.query(GoogleSiteSearch.paginate(search1.next_results_url))
92
+ search2 = GoogleSiteSearch.query(GoogleSiteSearch.paginate(search1.next_results_url, "00255077836266642015:u-scht7a-8i"))
73
93
 
74
94
  == Pagination Simple Example
75
95
 
@@ -78,19 +98,19 @@ This works and is fairly straight forward.
78
98
  In your controller:
79
99
 
80
100
  if params[:move]
81
- @search = GoogleSiteSearch.query(GoogleSiteSearch.paginate(params[:move]))
101
+ @search = GoogleSiteSearch.query(GoogleSiteSearch.paginate(params[:move], "00255077836266642015:u-scht7a-8i"))
82
102
  else
83
103
  @search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("microsoft", "00255077836266642015:u-scht7a-8i", :num => 5))
84
104
  end
85
105
 
86
106
  In your view:
87
107
 
88
- <% if @search.previous_results_url %>
89
- <%= link_to "Previous", search_url(:move => @search.previous_results_url) %>
90
- <% end %>
91
- <% if @search.next_results_url %>
92
- <%= link_to "More", search_url(:move => @search.next_results_url) %>
93
- <% end %>
108
+ <% if @search.previous_results_url %>
109
+ <%= link_to "Previous", search_url(:move => @search.previous_results_url) %>
110
+ <% end %>
111
+ <% if @search.next_results_url %>
112
+ <%= link_to "More", search_url(:move => @search.next_results_url) %>
113
+ <% end %>
94
114
 
95
115
  == Escaping
96
116
 
@@ -98,13 +118,11 @@ If you start passing around the url's in parameters you may run into issues if y
98
118
 
99
119
  View adds escape:
100
120
 
101
- <%= link_to "Previous", search_url(:move => CGI::escape(@search.previous_results_url)) %>
121
+ <%= link_to "Previous", search_url(:move => CGI::escape(@search.previous_results_url)) %>
102
122
 
103
123
  Controller unescapes:
104
124
 
105
- @search = GoogleSiteSearch.query(GoogleSiteSearch.paginate(CGI::unescape(params[:move])))
106
-
107
-
125
+ @search = GoogleSiteSearch.query(GoogleSiteSearch.paginate(CGI::unescape(params[:move]), "00255077836266642015:u-scht7a-8i"))
108
126
 
109
127
  == Filtering and Sorting
110
128
 
@@ -116,8 +134,8 @@ Google expects filtering to be on the "search query" itself. However I feel my e
116
134
 
117
135
  From the google reference link above an example filter search query is <b>halloween more:pagemap:document-author:lisamorton</b>
118
136
 
119
- #using the example above would look like this.
120
- search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween", "00255077836266642015:u-scht7a-8i", :filter => "more:pagemap:document-author:lisamorton")
137
+ # using the example above would look like this.
138
+ search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween", "00255077836266642015:u-scht7a-8i", :filter => "more:pagemap:document-author:lisamorton")
121
139
 
122
140
  === Separate Search Term From Filters
123
141
 
@@ -125,20 +143,20 @@ The full "search query" is returned by google's api and stored in the Search obj
125
143
 
126
144
  To separate the search term from the filter use:
127
145
 
128
- search_term, filters = GoogleSiteSearch.separate_search_term_from_filters(@search.search_query)
146
+ search_term, filters = GoogleSiteSearch.separate_search_term_from_filters(@search.search_query)
129
147
 
130
148
  === Sorting
131
149
 
132
150
  Sorting would also be done by specifing a *sort* option.
133
151
 
134
- search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween", "00255077836266642015:u-scht7a-8i", :filter => "more:pagemap:document-author:lisamorton", :sort => "data-sdate")
152
+ search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween", "00255077836266642015:u-scht7a-8i", :filter => "more:pagemap:document-author:lisamorton", :sort => "data-sdate")
135
153
 
136
154
  == Other Params
137
155
 
138
156
  Any <b>[param=value]</b> query string additions you want to add can be assigned like the sorting above. For example to limit the search results return, to 5, would look like...
139
157
 
140
- #get only 5 search results with the filtering and sorting from above still applyed.
141
- search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween more:pagemap:document-author:lisamorton", "00255077836266642015:u-scht7a-8i", :sort => "date-sdate", :num => "5" )
158
+ # get only 5 search results with the filtering and sorting from above still applyed.
159
+ search = GoogleSiteSearch.query(GoogleSiteSearch::UrlBuilder.new("halloween more:pagemap:document-author:lisamorton", "00255077836266642015:u-scht7a-8i", :sort => "date-sdate", :num => "5" )
142
160
 
143
161
  == Author
144
162
 
@@ -16,4 +16,6 @@ Gem::Specification.new do |gem|
16
16
  gem.version = GoogleSiteSearch::VERSION
17
17
  gem.add_dependency("activesupport")
18
18
  gem.add_dependency("libxml-ruby")
19
+ gem.add_dependency("rsmaz")
20
+ gem.add_dependency("rack")
19
21
  end
@@ -6,10 +6,12 @@ require "google-site-search/version"
6
6
  require "google-site-search/url_builder"
7
7
  require "google-site-search/search"
8
8
  require "google-site-search/result"
9
- require "timeout"
10
9
  require "net/http"
10
+ require "rsmaz"
11
+ require "timeout"
11
12
  require "uri"
12
13
  require "xml"
14
+ require "rack/utils"
13
15
 
14
16
  ##
15
17
  # A module to help query and parse the google site search api.
@@ -24,9 +26,28 @@ module GoogleSiteSearch
24
26
 
25
27
  class << self
26
28
 
29
+ # Takes a url, strips out un-required query params, and compresses
30
+ # a string representation. The intent is to have a small string to
31
+ # use as a caching key.
32
+ def caching_key url
33
+ params = Rack::Utils.parse_query(URI.parse(url).query)
34
+ # ei = "Passes on an alphanumeric parameter that decodes the originating SERP where user clicked on a related search". Don't fully understand what it does but it makes my caching less effective.
35
+ params.delete("ei")
36
+ key = params.map{|k,v| k.to_s + v.to_s}.sort.join
37
+ key.blank? ? nil : RSmaz.compress(key)
38
+ end
39
+
27
40
  # Expects the URL returned by Search#next_results_url or Search#previous_results_url.
28
- def paginate url
29
- GOOGLE_SEARCH_URL + url.to_s
41
+ def paginate url, search_engine_id
42
+ raise StandardError, "search_engine_id required" if search_engine_id.blank?
43
+ uri = URI.parse(url.to_s)
44
+ raise StandardError, "url seems to be invalid, parameters expected" if uri.query.blank?
45
+ if uri.relative?
46
+ uri.host = "www.google.com"
47
+ uri.scheme = "http"
48
+ end
49
+ uri.query = uri.query += "&cx=#{search_engine_id}"
50
+ uri.to_s
30
51
  end
31
52
 
32
53
  # See Search - This is a convienence method for creating and querying.
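A minimal sketch of what the new two-argument `paginate` appears to do, written against the Ruby standard library only. `paginate_sketch` is a hypothetical name, not part of the gem. It also raises `ArgumentError` rather than `StandardError` and re-encodes the query with `URI.encode_www_form`, both of which are assumptions that differ from the diffed code, which appends `&cx=...` as a raw string.

```ruby
require "uri"

# Hypothetical stand-in for GoogleSiteSearch.paginate; not the gem's code.
def paginate_sketch(url, search_engine_id)
  raise ArgumentError, "search_engine_id required" if search_engine_id.to_s.empty?

  uri = URI.parse(url.to_s)
  raise ArgumentError, "url has no query parameters" if uri.query.to_s.empty?

  if uri.relative?
    # Relative paths are resolved against the host used in the diff.
    uri = URI::HTTP.build(host: "www.google.com", path: uri.path, query: uri.query)
  end

  # Decode the existing pairs, append cx, and re-encode so the id is escaped.
  params = URI.decode_www_form(uri.query) << ["cx", search_engine_id]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

puts paginate_sketch("/some/path?q=search", "my_key")
# → http://www.google.com/some/path?q=search&cx=my_key
```

Re-encoding means an id containing a colon, like the one in the README, would be percent-escaped in the rebuilt URL, which the string-append version in the diff does not do.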
@@ -44,6 +44,15 @@ module GoogleSiteSearch
44
44
  @result_class = result_class
45
45
  end
46
46
 
47
+ def next_results_url
48
+ @next_results_url
49
+ end
50
+
51
+ def previous_results_url
52
+ @previous_results_url
53
+ end
54
+
55
+
47
56
  # Query's Google API, stores the xml and parses values into itself.
48
57
  def query
49
58
  @xml = GoogleSiteSearch::request_xml(url)
@@ -68,12 +77,21 @@ module GoogleSiteSearch
68
77
  @spelling = spelling_node.try(:content)
69
78
  @spelling_q = spelling_node.try(:attributes).try(:[],:q)
70
79
  @estimated_results_total = doc.find_first("RES/M").try(:content)
71
- @next_results_url = doc.find_first("RES/NB/NU").try(:content)
72
- @previous_results_url = doc.find_first("RES/NB/PU").try(:content)
80
+ @next_results_url = remove_search_engine_id(doc.find_first("RES/NB/NU").try(:content))
81
+ @previous_results_url = remove_search_engine_id(doc.find_first("RES/NB/PU").try(:content))
73
82
  @search_query = doc.find_first("Q").try(:content)
74
83
  rescue Exception => e
75
84
  raise ParsingError, "#{e.message} URL:[#{@url}] XML:[#{@xml}]"
76
85
  end
77
86
  end
87
+
88
+ def remove_search_engine_id url
89
+ return nil if url.blank?
90
+ uri = URI.parse(url)
91
+ params = Rack::Utils::parse_query(uri.query)
92
+ params.delete("cx")
93
+ uri.query = params.map{|k,v| "#{k}=#{v}"}.sort.join("&")
94
+ uri.to_s
95
+ end
78
96
  end
79
97
  end
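One thing worth checking in `remove_search_engine_id` above: the pairs returned by `Rack::Utils.parse_query` are already unescaped, and the method joins them back with plain `"#{k}=#{v}"`, so a value containing `&` or a space may not survive the round trip. The sketch below shows one possible alternative using only the standard library. `strip_cx` is a hypothetical name, and swapping in `URI.decode_www_form` / `URI.encode_www_form` for Rack is an assumption, not what the gem does.

```ruby
require "uri"

# Hypothetical stand-in for Search#remove_search_engine_id; not the gem's code.
def strip_cx(url)
  return nil if url.nil? || url.empty?

  uri = URI.parse(url)
  # Decode the query into [key, value] pairs and drop the search engine id.
  params = URI.decode_www_form(uri.query.to_s).reject { |key, _| key == "cx" }
  # Sort for a stable order (as the diff does) and re-escape the values.
  uri.query = params.empty? ? nil : URI.encode_www_form(params.sort)
  uri.to_s
end

puts strip_cx("/next?cx=my_key&q=search&start=1")  # → /next?q=search&start=1
puts strip_cx("/next?cx=my_key&q=a%26b")           # → /next?q=a%26b
```

The first call reproduces the value expected by the updated `test_search.rb`; the second shows an escaped ampersand staying escaped.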
@@ -1,3 +1,3 @@
1
1
  module GoogleSiteSearch
2
- VERSION = "0.0.5"
2
+ VERSION = "0.0.7"
3
3
  end
@@ -35,8 +35,8 @@
35
35
  <RES SN="3" EN="4">
36
36
  <M>29</M>
37
37
  <NB>
38
- <PU>/previous</PU>
39
- <NU>/next</NU>
38
+ <PU>/previous?start=20&amp;q=search&amp;cx=my_key</PU>
39
+ <NU>/next?cx=my_key&amp;q=search&amp;start=1</NU>
40
40
  </NB>
41
41
  <RG START="1" SIZE="2"/>
42
42
  <RG START="1" SIZE="1"> </RG>
@@ -2,10 +2,46 @@ require_relative 'test_helper'
2
2
 
3
3
  describe GoogleSiteSearch do
4
4
 
5
+
6
+ describe '.caching_key' do
7
+ let :sample_url do
8
+ "http://domain?q=work&ei=dontshow&do=i"
9
+ end
10
+
11
+ let :becomes do
12
+ "doiqwork" #query parameter values sorted and concated.
13
+ end
14
+
15
+ it "properly creates a key" do
16
+ RSmaz.decompress(GoogleSiteSearch.caching_key(sample_url)).must_equal becomes
17
+ end
18
+
19
+ it "raises error on bad uri" do
20
+ -> {GoogleSiteSearch.caching_key(nil)}.must_raise URI::InvalidURIError
21
+ end
22
+
23
+ end
24
+
5
25
  describe '.paginate' do
26
+
27
+ let :valid_url do
28
+ "http://www.valid.com/search?q=mysearch"
29
+ end
30
+
31
+ it "raises an error if the url has no parameters and is therefore invalid" do
32
+ url = "http://www.noparameters.com/"
33
+ -> {GoogleSiteSearch.paginate(url, "my_key")}.must_raise StandardError
34
+ end
35
+
36
+ it "raises an error if a search engine key is nil or blank" do
37
+ -> {GoogleSiteSearch.paginate(valid_url, "")}.must_raise StandardError
38
+ -> {GoogleSiteSearch.paginate(valid_url, nil)}.must_raise StandardError
39
+ end
40
+
6
41
  it 'completes a valid url for the relative path supplied' do
7
- GoogleSiteSearch.paginate("/some/path").must_equal "http://www.google.com/some/path"
42
+ GoogleSiteSearch.paginate("/some/path?q=search","my_key").must_equal "http://www.google.com/some/path?q=search&cx=my_key"
8
43
  end
44
+
9
45
  end
10
46
 
11
47
  describe '.relative_path' do
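The `.caching_key` spec above decompresses the key before comparing it, which suggests the interesting part is the string built before `RSmaz.compress` is applied. A rough sketch of that pre-compression step, using only the standard library, might look like the following. `uncompressed_caching_key` is a hypothetical name, and it uses `URI.decode_www_form` in place of `Rack::Utils.parse_query`; the compression step is deliberately left out.

```ruby
require "uri"

# Hypothetical illustration of the key built before compression; not the gem's code.
def uncompressed_caching_key(url)
  # Parse the query string and drop the "ei" parameter, as the diff does.
  params = URI.decode_www_form(URI.parse(url).query.to_s).reject { |key, _| key == "ei" }
  # Concatenate each name with its value, sort the pieces, and join them.
  key = params.map { |name, value| name + value }.sort.join
  key.empty? ? nil : key
end

puts uncompressed_caching_key("http://domain?q=work&ei=dontshow&do=i")  # → doiqwork
```

Sorting the concatenated pairs makes the key independent of parameter order, so two URLs that differ only in ordering or in their `ei` value map to the same cache entry.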
data/test/test_search.rb CHANGED
@@ -27,13 +27,21 @@ describe Search do
27
27
  end
28
28
 
29
29
  it "contains the next results url" do
30
- search.next_results_url.must_equal "/next"
30
+ search.next_results_url.wont_be :empty?
31
+ end
32
+
33
+ it "next results url removed the search engine id parameter" do
34
+ search.next_results_url.must_equal "/next?q=search&start=1"
31
35
  end
32
36
 
33
37
  it "contains the previous results url" do
34
- search.previous_results_url.must_equal "/previous"
38
+ search.previous_results_url.wont_be :empty?
35
39
  end
36
40
 
41
+ it "next results url removed the search engine id parameter" do
42
+ search.previous_results_url.must_equal "/previous?q=search&start=20"
43
+ end
44
+
37
45
  it "stores the original xml" do
38
46
  search.xml.must_equal xml
39
47
  end
@@ -54,4 +62,5 @@ describe Search do
54
62
  search.spelling_q.must_equal "fake suggestion escaped"
55
63
  end
56
64
  end
65
+
57
66
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: google-site-search
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.5
4
+ version: 0.0.7
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2012-11-21 00:00:00.000000000 Z
12
+ date: 2012-11-25 00:00:00.000000000Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: activesupport
16
- requirement: &22916060 !ruby/object:Gem::Requirement
16
+ requirement: &11771100 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
21
21
  version: '0'
22
22
  type: :runtime
23
23
  prerelease: false
24
- version_requirements: *22916060
24
+ version_requirements: *11771100
25
25
  - !ruby/object:Gem::Dependency
26
26
  name: libxml-ruby
27
- requirement: &22915560 !ruby/object:Gem::Requirement
27
+ requirement: &11770260 !ruby/object:Gem::Requirement
28
28
  none: false
29
29
  requirements:
30
30
  - - ! '>='
@@ -32,7 +32,29 @@ dependencies:
32
32
  version: '0'
33
33
  type: :runtime
34
34
  prerelease: false
35
- version_requirements: *22915560
35
+ version_requirements: *11770260
36
+ - !ruby/object:Gem::Dependency
37
+ name: rsmaz
38
+ requirement: &11761040 !ruby/object:Gem::Requirement
39
+ none: false
40
+ requirements:
41
+ - - ! '>='
42
+ - !ruby/object:Gem::Version
43
+ version: '0'
44
+ type: :runtime
45
+ prerelease: false
46
+ version_requirements: *11761040
47
+ - !ruby/object:Gem::Dependency
48
+ name: rack
49
+ requirement: &11760240 !ruby/object:Gem::Requirement
50
+ none: false
51
+ requirements:
52
+ - - ! '>='
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ type: :runtime
56
+ prerelease: false
57
+ version_requirements: *11760240
36
58
  description: A gem to aid in the consumption of the google site search service; querys
37
59
  the service, populates a result object and has some related helper methods.
38
60
  email:
@@ -70,12 +92,18 @@ required_ruby_version: !ruby/object:Gem::Requirement
70
92
  - - ! '>='
71
93
  - !ruby/object:Gem::Version
72
94
  version: '0'
95
+ segments:
96
+ - 0
97
+ hash: -3596847305660679714
73
98
  required_rubygems_version: !ruby/object:Gem::Requirement
74
99
  none: false
75
100
  requirements:
76
101
  - - ! '>='
77
102
  - !ruby/object:Gem::Version
78
103
  version: '0'
104
+ segments:
105
+ - 0
106
+ hash: -3596847305660679714
79
107
  requirements: []
80
108
  rubyforge_project:
81
109
  rubygems_version: 1.8.10