bento_search 0.9.0 → 1.0.0

data/README.md CHANGED
@@ -2,10 +2,6 @@
2
2
 
3
3
  [![Build Status](https://secure.travis-ci.org/jrochkind/bento_search.png)](http://travis-ci.org/jrochkind/bento_search)
4
4
 
5
- (Fairly robust and stable at this point, but still pre-1.0 release, may
6
- be some breaking api changes before 1.0, but probably not too many, it's
7
- looking pretty good).
8
-
9
5
  bento_search provides an abstraction/normalization layer for querying and
10
6
  displaying results from external search engines, in Ruby on Rails. Requires
11
7
  Rails3 and tested only under ruby 1.9.3.
@@ -88,14 +84,14 @@ may be required for certain engines.
88
84
  results = engine.search("a query")
89
85
  ~~~~
90
86
 
91
- `results` are a BentoSearch::Results object, which acts like an array of
92
- BentoSearch::Item objects, along with some meta-information about the
87
+ `results` are a [BentoSearch::Results](./app/models/bento_search/results.rb) object, which acts like an array of
88
+ [BentoSearch::Item](./app/models/bento_search/results.rb) objects, along with some meta-information about the
93
89
  search itself (pagination keys, etc). BentoSearch::Results and Item fields
94
90
  are standardized accross engines. BentoSearch::Items provide semantic
95
91
  values (title, author, etc.), as available from the particular engine.
96
92
 
97
93
  To see which engines come bundled with BentoSearch, and any special
98
- engine-specific instructions, look at BentoSearch source in `./app/search_engines`
94
+ engine-specific instructions, look at BentoSearch source in [`./app/search_engines/bento_search`](./app/search_engines/bento_search)
99
95
 
100
96
  ### Register engines in global configuration
101
97
 
@@ -211,7 +207,7 @@ An engine instance advertises it's maximum per-page values.
211
207
  bento_search fixes the default per_page at 10.
212
208
 
213
209
  For help creating your UI, you can ask a BentoSearch::Results for
214
- `results.pagination`, which returns a BentoSearch::Results::Pagination
210
+ `results.pagination`, which returns a [BentoSearch::Results::Pagination](app/models/bento_search/results/pagination.rb)
215
211
  object which should be suitable for passing to [kaminari](https://github.com/amatsuda/kaminari)
216
212
  `paginate`, or else have convenient methods to roll your own pagination UI.
217
213
  Kaminari's paginate method:
@@ -259,7 +255,7 @@ that Celluloid uses multi-threading in such a way that you might need
259
255
  to turn Rails config.cache_classes=true even in development.
260
256
 
261
257
 
262
- For more info, see BentoSearch::MultiSearcher.
258
+ For more info, see [BentoSearch::MultiSearcher](./app/models/bento_search/multi_searcher.rb).
263
259
 
264
260
  ### Delayed results loading via AJAX (actually more like AJAHtml)
265
261
 
@@ -289,7 +285,8 @@ link resolver.
289
285
  BentoSearch::Items can have a main link associated with them (generally
290
286
  hyperlinked from title), as well as a list of additional links. Most engines
291
287
  do not provide additional links by default, custom local Decorators would
292
- be used to add them. See wiki for more info on decorators, and BentoSearch::Link
288
+ be used to add them. See [wiki on display customization](https://github.com/jrochkind/bento_search/wiki/Customizing-Results-Display)
289
+ for more info on decorators, and [BentoSearch::Link](app/models/bento_search/link.rb)
293
290
  for fields.
294
291
 
295
292
  ### OpenURL and metadata
@@ -307,8 +304,8 @@ documented at ResultItem#format). As well as how well the #to_openurl routine
307
304
  handles all edge cases (OpenURL can be weird). As edge cases are discovered, they
308
305
  can be solved.
309
306
 
310
- See `./app/item_decorators/bento_search/openurl_add_other_link.rb` for an example
311
- of using item decorators to add a link to your openurl resover to an item when
307
+ See [`./app/item_decorators/bento_search/openurl_add_other_link.rb`](./app/item_decorators/bento_search/openurl_add_other_link.rb)
308
+ for an example of using item decorators to add a link to your openurl resolver to an item when
312
309
  displayed.
313
310
 
314
311
  ### Exporting (eg as RIS) and get by unique_id
@@ -322,8 +319,34 @@ the RIS format, suitable for import into EndNote, Refworks, etc.
322
319
 
323
320
  Accommodating actual exports into the transactional flow of a web app can be
324
321
  tricky, and often requires use of the `result_item#unique_id` and
325
- `engine.get( unique_id )` features. See the wiki at
322
+ `engine.get( unique_id )` features. See the wiki on [exports and #unique_id](https://github.com/jrochkind/bento_search/wiki/Exports-and-the-get-by-unique_id-feature)
323
+
324
+ ### Machine-readable serialization in Atom
325
+
326
+ Any BentoSearch::Results can be translated to an Atom response that is enhanced to
327
+ include nearly all the elements of each BentoSearch::ResultItem, so it serves
328
+ well as a machine-readable api response in general, not just for Atom feed readers.
329
+
330
+ You can use the [`bento_search/atom_results`](./app/views/bento_search/atom_results.atom.builder) view template, perhaps
331
+ in your action method like so:
332
+
333
+ ~~~ruby
334
+ # ...
335
+ respond_to do |format|
336
+ format.html # default view
337
+ format.atom do
338
+ render( :template => "bento_search/atom_results",
339
+ :locals => {
340
+ :atom_results => @results,
341
+ :feed_name => "Acme results",
342
+ :feed_author_name => "MyCorp"
343
+ }
344
+ )
345
+ end
346
+ ~~~
326
347
 
348
+ There are additional details that might matter to you; for more info see the
349
+ [wiki page](https://github.com/jrochkind/bento_search/wiki/Machine-Readable-Serialization-With-Atom).
327
350
 
328
351
  ## Planned Features
329
352
 
@@ -340,7 +363,17 @@ Probably:
340
363
  a normalized cross-engine way.
341
364
 
342
365
  Other needs or suggestions?
343
-
366
+
367
+ ## Backwards compat
368
+
369
+ We are going to try to be strictly backwards compatible with all post 1.0
370
+ releases that do not increment the major version number (semantic versioning).
371
+
372
+ As a general rule, we're going to let our tests enforce this -- if a test has
373
+ to be changed to pass with new code, that's a very strong sign that it is
374
+ not a backwards-compat change, and you should think _very_ carefully to
375
+ be sure it is an exception to this rule before changing any existing tests
376
+ for new functionality.
344
377
 
345
378
  ## Developing
346
379
 
@@ -87,18 +87,14 @@ module BentoSearch
87
87
  self.any_present?(:source_title, :publisher, :start_page)
88
88
  end
89
89
 
90
- # Put together title and subtitle if neccesary.
90
+ # Mix-in a default missing title marker for empty titles
91
+ # (Used to combine title and subtitle when those were different fields)
91
92
  def complete_title
92
- t = self.title
93
- if self.subtitle
94
- t = safe_join([t, ": ", self.subtitle], "")
95
- end
96
-
97
- if t.blank?
98
- t = I18n.translate("bento_search.missing_title")
99
- end
100
-
101
- return t
93
+ if self.title.present?
94
+ self.title
95
+ else
96
+ I18n.translate("bento_search.missing_title")
97
+ end
102
98
  end
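
Reviewer sketch of the simplified `complete_title` behaviour: once the subtitle is folded into `title` upstream, the method only has to substitute a marker for blank titles. This is a minimal stand-alone illustration, not the gem's decorator. `SketchItem` and `MISSING_TITLE` are hypothetical names, and the constant stands in for the `I18n.translate("bento_search.missing_title")` lookup.

```ruby
# Hypothetical stand-in for the item decorator; not bento_search's own class.
# MISSING_TITLE is an assumed marker text replacing the I18n lookup.
MISSING_TITLE = "[title missing]"

SketchItem = Struct.new(:title) do
  # Return the title when it has visible content, otherwise the marker.
  def complete_title
    blank = title.nil? || title.strip.empty?
    blank ? MISSING_TITLE : title
  end
end

puts SketchItem.new("Moby Dick: or, The Whale").complete_title
puts SketchItem.new("   ").complete_title
```

Callers that previously relied on `complete_title` to join title and subtitle would now need the engine to supply the combined string.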
103
99
 
104
100
 
@@ -125,7 +121,7 @@ module BentoSearch
125
121
  return result_elements.join(", ").html_safe
126
122
  end
127
123
 
128
- # A display method, this is like #langauge_str, but will be nil if
124
+ # A display method, this is like #language_str, but will be nil if
129
125
  # the language_code matches the current default locale, used
130
126
  # for printing language only when not "English" normally.
131
127
  #
@@ -152,6 +148,28 @@ module BentoSearch
152
148
 
153
149
  return value.blank? ? nil : value
154
150
  end
151
+
152
+ # A unique opaque identifier for a record may sometimes be
153
+ # required, for instance in Atom.
154
+ #
155
+ # We here provide a really dumb implementation, if and only if
156
+ # the result has an engine_id and unique_id available, (and
157
+ # a #root_url is available) by basically concatenating them to
158
+ # app base url.
159
+ #
160
+ # That's pretty lame, probably not resolvable, but best we
161
+ # can do without knowing details of host app. You may want
162
+ # to over-ride this in a decorator to do something more valid
163
+ # in an app-specific way.
164
+ #
165
+ # yes uri_identifier is like PIN number, deal with it.
166
+ def uri_identifier
167
+ if self.engine_id.present? && self.unique_id.present? && _h.respond_to?(:root_url)
168
+ "#{_h.root_url.chomp("/")}/bento_search_opaque_id/#{CGI.escape self.engine_id}/#{CGI.escape self.unique_id}"
169
+ else
170
+ nil
171
+ end
172
+ end
155
173
 
156
174
 
157
175
  ###################
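
Reviewer sketch of the `uri_identifier` idea in isolation: concatenate the app base URL with URL-escaped engine and record ids, and give up with `nil` when any part is missing. The free-standing function `opaque_uri_identifier` and its explicit `root_url` parameter are assumptions made for illustration; the diff reads `root_url` from view helpers instead.

```ruby
require "cgi"

# Hypothetical free-standing version of the idea. root_url is passed in
# explicitly here rather than taken from Rails view helpers.
def opaque_uri_identifier(root_url, engine_id, unique_id)
  parts = [root_url, engine_id, unique_id]
  return nil if parts.any? { |v| v.nil? || v.empty? }

  # Drop one trailing slash so the joined path has no "//".
  base = root_url.chomp("/")
  "#{base}/bento_search_opaque_id/#{CGI.escape(engine_id)}/#{CGI.escape(unique_id)}"
end

puts opaque_uri_identifier("http://example.org/", "gbs", "abc/123")
```

Escaping matters because a `unique_id` may itself contain slashes, which would otherwise change the path structure of the identifier.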
@@ -13,6 +13,10 @@ module BentoSearch
13
13
  # http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#linkTypes
14
14
  attr_accessor :rel
15
15
 
16
+ # MIME content type may be used for both HTML links and Atom
17
+ # link 'type' attribute
18
+ attr_accessor :type
19
+
16
20
  # Array of strings, used for CSS classes on this link, possibly
17
21
  # for custom styles/images etc. May be used in non-html link
18
22
  # contexts too.
@@ -102,8 +102,10 @@ class BentoSearch::MultiSearcher
102
102
  Rails.logger.error("\nBentoSearch:MultiSearcher caught exception: #{e}\n#{e.backtrace.join(" \n")}")
103
103
  # Make a fake results with caught exception.
104
104
  @results = BentoSearch::Results.new
105
+ self.engine.fill_in_search_metadata_for(@results, self.engine.normalized_search_arguments(search_args))
106
+
105
107
  @results.error ||= {}
106
- @results.error["exception"] = e
108
+ @results.error["exception"] = e
107
109
  end
108
110
  end
109
111
 
@@ -24,13 +24,42 @@ class BentoSearch::Registrar
24
24
  #
25
25
  # The first parameter identifier, eg "gbs", may be used in some
26
26
  # URLs, for AJAX etc.
27
- def register_engine(id, &block)
28
- conf = Confstruct::Configuration.new(&block)
27
+ #
28
+ # You can also pass in a hash or hash-like object (including
29
+ # a configuration object returned by a prior register_engine)
30
+ # instead of or in addition to the block 'dsl' -- this can be used
31
+ # to base one configuration off another, with changes:
32
+ #
33
+ # BentoSearch.register_engine("original", {
34
+ # :engine => "Something",
35
+ # :title => "Original",
36
+ # :shared => "shared"
37
+ # })
38
+ #
39
+ # BentoSearch.register_engine("derived") do |conf|
40
+ # conf.title = "Derived"
41
+ # end
42
+ #
43
+ # Above would not change 'shared' in 'original', but would
44
+ # over-ride 'title' in 'derived', without changing 'title' in
45
+ # 'original'.
46
+ def register_engine(id, conf_data = nil, &block)
47
+ conf = Confstruct::Configuration.new
48
+
49
+ # Make sure we make a deep_copy so any changes don't mutate
50
+ # the original. Confstruct can be unpredictable.
51
+ if conf_data.present?
52
+ conf_data = Confstruct::Configuration.new(conf_data).deep_copy
53
+ end
54
+
55
+ conf.configure(conf_data, &block)
29
56
  conf.id = id.to_s
30
57
 
31
58
  raise ArgumentError.new("Must supply an `engine` class name") unless conf.engine
32
59
 
33
60
  @registered_engine_confs[id] = conf
61
+
62
+ return conf
34
63
  end
35
64
 
36
65
  # Get a configured SearchEngine, using configuration and engine
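
Reviewer sketch of deriving one engine configuration from another, as the new `register_engine(id, conf_data = nil, &block)` signature appears intended to allow. This uses plain Hashes and a `Marshal` round-trip as a stand-in for `Confstruct::Configuration#deep_copy`; `SketchRegistrar` is a hypothetical class, and passing the returned configuration into the second call is my reading of the doc comment, not something the diff shows verbatim.

```ruby
# Plain-Hash stand-in for the registrar; the real code uses Confstruct.
class SketchRegistrar
  def initialize
    @confs = {}
  end

  # conf_data and the block are both optional; the block sees the copy.
  def register_engine(id, conf_data = nil)
    # Marshal round-trip gives a deep copy, so nested values are not shared
    # with the configuration that was passed in.
    conf = conf_data ? Marshal.load(Marshal.dump(conf_data)) : {}
    yield conf if block_given?
    conf[:id] = id.to_s
    raise ArgumentError, "Must supply an `engine` class name" unless conf[:engine]

    @confs[id] = conf
    conf
  end
end

registrar = SketchRegistrar.new
original = registrar.register_engine(
  "original", { engine: "Something", title: "Original", shared: "shared" }
)
derived = registrar.register_engine("derived", original) do |conf|
  conf[:title] = "Derived"
end

puts original[:title]
puts derived[:title]
puts derived[:shared]
```

Returning the configuration from `register_engine` is what makes this chaining possible, which matches the `return conf` added in the hunk.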
@@ -44,12 +44,8 @@ module BentoSearch
44
44
  # * dc.title
45
45
  # * schema.org CreativeWork: 'name'
46
46
  attr_accessor :title
47
-
48
- # When an individual seperate subtitle is available.
49
- # May also be nil with subtitle in "title" field after colon.
50
- #
51
- # *
52
- attr_accessor :subtitle
47
+ # backwards compat, we used to have separate titles and subtitles
48
+ alias_method :complete_title, :title
53
49
 
54
50
  # usually a direct link to the search provider's 'native' page.
55
51
  # Can be changed in actual presentation with a Decorator.
@@ -65,12 +61,24 @@ module BentoSearch
65
61
  @link_is_fulltext = v
66
62
  end
67
63
 
68
- # normalized controlled vocab title, important this is supplied
69
- # if possible for OpenURL generation and other features.
64
+ # Our own INTERNAL controlled vocab for 'format'.
65
+ #
66
+ # Important that this be supplied by engine for maximum
67
+ # success of openurl, ris export, etc.
68
+ #
69
+ # This vocab is based on schema.org CreativeWork 'types',
70
+ # but supplemented with values we needed not present in schema.org.
71
+ # String values are last part of schema.org URLs, symbol values are custom.
72
+ #
73
+ # However, for backwards compat, values that didn't exist in schema.org
74
+ # when we started but later came to exist -- we still use our string
75
+ # values. If you actually want a schema.org url, see #schema_org_type_url
76
+ # which translates as needed.
70
77
  #
71
78
  # schema.org 'type' that's a sub-type of CreativeWork.
72
79
  # should hold a string that, when appended to "http://schema.org/"
73
- # is a valid schema.org type uri, that sub-types CreativeWork. Eg.
80
+ # is a valid schema.org type uri, that sub-types CreativeWork. Ones
81
+ # we have used:
74
82
  # * Article
75
83
  # * Book
76
84
  # * Movie
@@ -93,7 +101,27 @@ module BentoSearch
93
101
  #
94
102
  # Note: We're re-thinking this, might allow uncontrolled
95
103
  # in here instead.
96
- attr_accessor :format
104
+ attr_accessor :format
105
+
106
+ # Translated from internal format vocab at #format. Outputs
107
+ # eg http://schema.org/Book
108
+ # Uses the @@format_to_schema_org hash for mapping from
109
+ # certain internal symbol values to schema org value, where
110
+ # possible.
111
+ #
112
+ # Can return nil if we don't know a schema.org type
113
+ def schema_org_type_url
114
+ if format.kind_of? String
115
+ "http://schema.org/#{format}"
116
+ elsif mapped = @@format_to_schema_org[format]
117
+ "http://schema.org/#{mapped}"
118
+ else
119
+ nil
120
+ end
121
+ end
122
+ @@format_to_schema_org = {
123
+ :report => "Article",
124
+ }
97
125
 
98
126
  # uncontrolled presumably english-language format string.
99
127
  # if supplied will be used in display in place of controlled
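
Reviewer sketch of the `schema_org_type_url` translation: String formats are treated as the last segment of a schema.org URL, Symbol formats go through a lookup table, and anything unmapped yields `nil`. The free function and the `FORMAT_TO_SCHEMA_ORG` constant are hypothetical stand-ins for the method and the `@@format_to_schema_org` class variable; only the `:report => "Article"` entry comes from the diff.

```ruby
# Assumed mapping table; only :report => "Article" appears in the diff.
FORMAT_TO_SCHEMA_ORG = { report: "Article" }.freeze

# Hypothetical free-function version of the method added in this hunk.
def schema_org_type_url(format)
  case format
  when String
    "http://schema.org/#{format}"
  when Symbol
    mapped = FORMAT_TO_SCHEMA_ORG[format]
    mapped && "http://schema.org/#{mapped}"
  end
end

puts schema_org_type_url("Book")
puts schema_org_type_url(:report)
p schema_org_type_url(:serial)
```

A `case` with no matching branch returns `nil`, so a missing format needs no explicit `else`.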
@@ -110,10 +138,10 @@ module BentoSearch
110
138
  # Manually set language_str will over-ride display string calculated from
111
139
  # language_code.
112
140
  #
113
- # Consumers can look at language_code or language_str regardless (although
114
- # either or both may be nil). You can get a language_list gem obj from
115
- # language_obj, and use to normalize to a
116
- # 2- or 3-letter from language_code that could be either.
141
+ # Consumers that want a language code can use #language_iso_639_1 or
142
+ # #language_iso_639_3 (either may be nil), or #language_str for uncontrolled
143
+ # string. If engine just sets one of these, internals take care of filling
144
+ # out the others.
117
145
  attr_accessor :language_code
118
146
  attr_writer :language_str
119
147
  def language_str
@@ -131,6 +159,16 @@ module BentoSearch
131
159
  @language_obj ||= LanguageList::LanguageInfo.find( self.language_code )
132
160
  end
133
161
 
162
+ # Two letter ISO language code, or nil
163
+ def language_iso_639_1
164
+ language_obj.try { |l| l.iso_639_1 }
165
+ end
166
+
167
+ # Three letter ISO language code, or nil
168
+ def language_iso_639_3
169
+ language_obj.try {|l| l.iso_639_3 }
170
+ end
171
+
134
172
  # year published. a ruby int
135
173
  # PART of:.
136
174
  # * schema.org CreativeWork "datePublished", year portion
@@ -213,18 +213,11 @@ module BentoSearch
213
213
  arguments = normalized_search_arguments(*arguments)
214
214
 
215
215
  results = search_implementation(arguments)
216
-
217
-
218
- # standard result metadata
219
- results.start = arguments[:start] || 0
220
- results.per_page = arguments[:per_page]
221
-
222
- results.search_args = arguments
223
- results.engine_id = configuration.id
224
216
 
217
+ fill_in_search_metadata_for(results, arguments)
218
+
225
219
  results.timing = (Time.now - start_t)
226
220
 
227
- results.display_configuration = configuration.for_display
228
221
  results.each do |item|
229
222
  # We copy some configuration info over to each Item, as a convenience
230
223
  # to display logic that may have to decide what to do given only an item,
@@ -249,14 +242,27 @@ module BentoSearch
249
242
  failed.error ||= {}
250
243
  failed.error[:exception] = e
251
244
 
252
- failed.search_args = arguments
253
- failed.engine_id = configuration.id
254
- failed.display_configuration = configuration.for_display
255
245
  failed.timing = (Time.now - start_t)
246
+
247
+ fill_in_search_metadata_for(failed, arguments)
256
248
 
257
249
 
258
250
  return failed
259
251
  end
252
+
253
+ # SOME of the elements of Results to be returned that SearchEngine implementation
254
+ # fills in automatically post-search. Extracted into a method for DRY in
255
+ # error handling to try to fill these in even in errors. And *possible*
256
+ # experimental use in other classes for same thing is why method is
257
+ # public, see MultiSearcher.
258
+ def fill_in_search_metadata_for(results, normalized_arguments)
259
+ results.search_args = normalized_arguments
260
+ results.start = normalized_arguments[:start] || 0
261
+ results.per_page = normalized_arguments[:per_page]
262
+
263
+ results.engine_id = configuration.id
264
+ results.display_configuration = configuration.for_display
265
+ end
260
266
 
261
267
 
262
268
  # Take the arguments passed into #search, which can be flexibly given
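
Reviewer sketch of why `fill_in_search_metadata_for` was extracted: the same metadata assignment runs on both the success path and the error path, so a failed search still carries its arguments and engine id. `SketchResults`, `fill_in_search_metadata`, and `sketch_search` are hypothetical names, and the engine id is passed in directly rather than read from a configuration object.

```ruby
# Hypothetical minimal results object; the real BentoSearch::Results has more.
SketchResults = Struct.new(:search_args, :start, :per_page, :engine_id, :error)

# Shared metadata step, used for both successful and failed searches.
def fill_in_search_metadata(results, normalized_arguments, engine_id)
  results.search_args = normalized_arguments
  results.start = normalized_arguments[:start] || 0
  results.per_page = normalized_arguments[:per_page]
  results.engine_id = engine_id
  results
end

# Runs the block as the "search implementation"; on any error, returns a
# results object with the exception recorded and the metadata still filled in.
def sketch_search(args, engine_id)
  results = yield(args)
  fill_in_search_metadata(results, args, engine_id)
rescue StandardError => e
  failed = SketchResults.new
  failed.error = { exception: e }
  fill_in_search_metadata(failed, args, engine_id)
end

ok = sketch_search({ per_page: 10 }, "gbs") { |_args| SketchResults.new }
failed = sketch_search({ start: 20, per_page: 10 }, "gbs") { |_args| raise "boom" }

puts ok.start
puts failed.start
puts failed.error[:exception].message
```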
@@ -113,8 +113,7 @@ module BentoSearch
113
113
 
114
114
  item.unique_id = item_response["id"]
115
115
 
116
- item.title = v_info["title"]
117
- item.subtitle = v_info["subtitle"]
116
+ item.title = format_title(v_info)
118
117
  item.publisher = v_info["publisher"]
119
118
  # previewLink gives you your search results highlighted, preferable
120
119
  # if it exists.
@@ -255,6 +254,14 @@ module BentoSearch
255
254
  end
256
255
  return nil
257
256
  end
257
+
258
+ def format_title(v_info)
259
+ title = v_info["title"]
260
+ if v_info["subtitle"]
261
+ title = "#{title}: #{v_info["subtitle"]}"
262
+ end
263
+ return title
264
+ end
258
265
 
259
266
  end
260
267
  end
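
Reviewer sketch of the title and subtitle join used by the engines now that `subtitle` is no longer a separate item field. One deliberate difference from the hunk above, labeled here as my assumption: a blank subtitle string is treated as absent, so the result never ends in a dangling `": "`.

```ruby
# Stand-alone variant of format_title. v_info is a Hash shaped like the
# "title"/"subtitle" keys read in the diff; treating blank subtitles as
# absent is an assumption of this sketch.
def format_title(v_info)
  title = v_info["title"]
  subtitle = v_info["subtitle"]
  subtitle.to_s.strip.empty? ? title : "#{title}: #{subtitle}"
end

puts format_title({ "title" => "Walden", "subtitle" => "Life in the Woods" })
puts format_title({ "title" => "Walden", "subtitle" => "" })
```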
@@ -99,7 +99,7 @@ class BentoSearch::GoogleSiteSearchEngine
99
99
  10
100
100
  end
101
101
 
102
- def self.required_configuation
102
+ def self.required_configuration
103
103
  [:api_key, :cx]
104
104
  end
105
105
 
@@ -121,11 +121,7 @@ class BentoSearch::SummonEngine
121
121
 
122
122
  item.unique_id = first_if_present doc_hash["ID"]
123
123
 
124
- item.title = handle_highlighting( first_if_present doc_hash["Title"] )
125
- item.custom_data["raw_title"] = handle_highlighting( first_if_present(doc_hash["Title"]) , :strip => true)
126
-
127
- item.subtitle = handle_highlighting( first_if_present doc_hash["Subtitle"] )# TODO is this right?
128
- item.custom_data["raw_subtitle"] = handle_highlighting( first_if_present(doc_hash["Subtitle"]), :strip => true )
124
+ item.title = format_title(doc_hash)
129
125
 
130
126
  item.link = doc_hash["link"]
131
127
  # Don't understand difference between hasFullText and
@@ -356,6 +352,18 @@ class BentoSearch::SummonEngine
356
352
  :strip => options[:strip]
357
353
  )
358
354
  end
355
+
356
+ # combine title and subtitle into one title.
357
+ def format_title(doc_hash)
358
+ title = first_if_present doc_hash["Title"]
359
+ subtitle = first_if_present doc_hash["Subtitle"]
360
+
361
+ if subtitle.present?
362
+ title = "#{title}: #{subtitle}"
363
+ end
364
+
365
+ return handle_highlighting( title )
366
+ end
359
367
 
360
368
  def self.required_configuration
361
369
  [:access_id, :secret_key]