bento_search 0.9.0 → 1.0.0
- data/README.md +47 -14
- data/app/item_decorators/bento_search/standard_decorator.rb +30 -12
- data/app/models/bento_search/link.rb +4 -0
- data/app/models/bento_search/multi_searcher.rb +3 -1
- data/app/models/bento_search/registrar.rb +31 -2
- data/app/models/bento_search/result_item.rb +52 -14
- data/app/models/bento_search/search_engine.rb +18 -12
- data/app/search_engines/bento_search/google_books_engine.rb +9 -2
- data/app/search_engines/bento_search/google_site_search_engine.rb +1 -1
- data/app/search_engines/bento_search/summon_engine.rb +13 -5
- data/app/search_engines/bento_search/worldcat_sru_dc_engine.rb +1 -1
- data/app/views/bento_search/_atom_item.atom.builder +135 -0
- data/app/views/bento_search/atom_results.atom.builder +61 -0
- data/lib/bento_search.rb +4 -2
- data/lib/bento_search/version.rb +1 -1
- data/test/dummy/config/routes.rb +3 -1
- data/test/dummy/log/test.log +45768 -0
- data/test/support/atom.xsd.xml +240 -0
- data/test/unit/openurl_creator_test.rb +2 -4
- data/test/unit/register_engine_test.rb +59 -0
- data/test/unit/summon_engine_test.rb +1 -5
- data/test/view/atom_results_test.rb +281 -0
- metadata +11 -5
data/README.md
CHANGED
@@ -2,10 +2,6 @@

 [![Build Status](https://secure.travis-ci.org/jrochkind/bento_search.png)](http://travis-ci.org/jrochkind/bento_search)

-(Fairly robust and stable at this point, but still pre-1.0 release, may
-be some breaking api changes before 1.0, but probably not too many, it's
-looking pretty good).
-
 bento_search provides an abstraction/normalization layer for querying and
 displaying results from external search engines, in Ruby on Rails. Requires
 Rails3 and tested only under ruby 1.9.3.
@@ -88,14 +84,14 @@ may be required for certain engines.
 results = engine.search("a query")
 ~~~~

-`results` are a BentoSearch::Results object, which acts like an array of
-BentoSearch::Item objects, along with some meta-information about the
+`results` are a [BentoSearch::Results](./app/models/bento_search/results.rb) object, which acts like an array of
+[BentoSearch::Item](./app/models/bento_search/results.rb) objects, along with some meta-information about the
 search itself (pagination keys, etc). BentoSearch::Results and Item fields
 are standardized accross engines. BentoSearch::Items provide semantic
 values (title, author, etc.), as available from the particular engine.

 To see which engines come bundled with BentoSearch, and any special
-engine-specific instructions, look at BentoSearch source in `./app/search_engines`
+engine-specific instructions, look at BentoSearch source in [`./app/search_engines/bento_search`](./app/search_engines/bento_search)

 ### Register engines in global configuration

@@ -211,7 +207,7 @@ An engine instance advertises it's maximum per-page values.
 bento_search fixes the default per_page at 10.

 For help creating your UI, you can ask a BentoSearch::Results for
-`results.pagination`, which returns a BentoSearch::Results::Pagination
+`results.pagination`, which returns a [BentoSearch::Results::Pagination](app/models/bento_search/results/pagination.rb)
 object which should be suitable for passing to [kaminari](https://github.com/amatsuda/kaminari)
 `paginate`, or else have convenient methods for roll your own pagination UI.
 Kaminari's paginate method:
@@ -259,7 +255,7 @@ that Celluloid uses multi-threading in such a way that you might need
 to turn Rails config.cache_classes=true even in development.


-For more info, see BentoSearch::MultiSearcher.
+For more info, see [BentoSearch::MultiSearcher](./app/models/bento_search/multi_searcher.rb).

 ### Delayed results loading via AJAX (actually more like AJAHtml)

@@ -289,7 +285,8 @@ link resolver.
 BentoSearch::Items can have a main link associated with them (generally
 hyperlinked from title), as well as a list of additional links. Most engines
 do not provide additional links by default, custom local Decorators would
-be used to add them. See wiki
+be used to add them. See [wiki on display cusotmization](https://github.com/jrochkind/bento_search/wiki/Customizing-Results-Display)
+for more info on decorators, and [BentoSearch::Link](app/models/bento_search/link.rb)
 for fields.

 ### OpenURL and metadata
@@ -307,8 +304,8 @@ documented at ResultItem#format). As well as how well the #to_openurl routine
 handles all edge cases (OpenURL can be weird). As edge cases are discovered, they
 can be solved.

-See `./app/item_decorators/bento_search/openurl_add_other_link.rb`
-of using item decorators to add a link to your openurl resover to an item when
+See [`./app/item_decorators/bento_search/openurl_add_other_link.rb`](./app/item_decorators/bento_search/openurl_add_other_link.rb)
+for an example of using item decorators to add a link to your openurl resover to an item when
 displayed.

 ### Exporting (eg as RIS) and get by unique_id
@@ -322,8 +319,34 @@ the RIS format, suitable for import into EndNote, Refworks, etc.

 Accomodating actual exports into the transactional flow of a web app can be
 tricky, and often requires use of the `result_item#unique_id` and
-`engine.get( unique_id )` features. See the wiki
+`engine.get( unique_id )` features. See the wiki on [exports and #unique_id](https://github.com/jrochkind/bento_search/wiki/Exports-and-the-get-by-unique_id-feature)
+
+### Machine-readable serialization in Atom
+
+Translation of any BentoSearch::Results to an Atom response that is enhanced to
+include nearly all the elements of each BentoSearch::ResultItem, so can serves
+well as machine-readable api response in general, not just for Atom feed readers.
+
+You can use the [`bento_search/atom_results`](./app/views/bento_search/atom_results.atom.builder) view template, perhaps
+in your action method like so:
+
+~~~ruby
+# ...
+respond_to do |format|
+  format.html # default view
+  format.atom do
+    render( :template => "bento_search/atom_results",
+            :locals => {
+              :atom_results => @results,
+              :feed_name => "Acme results",
+              :feed_author_name => "MyCorp"
+            }
+    )
+  end
+~~~

+There are additional details that might matter to you, for more info see the
+[wiki page](https://github.com/jrochkind/bento_search/wiki/Machine-Readable-Serialization-With-Atom)

 ## Planned Features

@@ -340,7 +363,17 @@ Probably:
 a normalized cross-engine way.

 Other needs or suggestions?
-
+
+## Backwards compat
+
+We are going to try to be strictly backwards compatible with all post 1.0
+releases that do not increment the major version number (semantic versioning).
+
+As a general rule, we're going to let our tests enforce this -- if a test has
+to be changed to pass with new code, that's a very strong sign that it is
+not a backwards-compat change, and you should think _very_ carefully to
+be sure it is an exception to this rule before changing any existing tests
+for new functionality.

 ## Developing

data/app/item_decorators/bento_search/standard_decorator.rb
CHANGED
@@ -87,18 +87,14 @@ module BentoSearch
       self.any_present?(:source_title, :publisher, :start_page)
     end

-    #
+    # Mix-in a default missing title marker for empty titles
+    # (Used to combine title and subtitle when those were different fields)
     def complete_title
-
-
-
-
-
-      if t.blank?
-        t = I18n.translate("bento_search.missing_title")
-      end
-
-      return t
+      if self.title.present?
+        self.title
+      else
+        I18n.translate("bento_search.missing_title")
+      end
     end


@@ -125,7 +121,7 @@ module BentoSearch
       return result_elements.join(", ").html_safe
     end

-
+    # A display method, this is like #langauge_str, but will be nil if
     # the language_code matches the current default locale, used
     # for printing language only when not "English" normally.
     #
@@ -152,6 +148,28 @@ module BentoSearch

       return value.blank? ? nil : value
     end
+
+    # A unique opaque identifier for a record may sometimes be
+    # required, for instance in Atom.
+    #
+    # We here provide a really dumb implementation, if and only if
+    # the result has an engine_id and unique_id available, (and
+    # a #root_url is available) by basically concatenating them to
+    # app base url.
+    #
+    # That's pretty lame, probably not resolvable, but best we
+    # can do without knowing details of host app. You may want
+    # to over-ride this in a decorator to do something more valid
+    # in an app-specific way.
+    #
+    # yes uri_identifier is like PIN number, deal with it.
+    def uri_identifier
+      if self.engine_id.present? && self.unique_id.present? && _h.respond_to?(:root_url)
+        "#{_h.root_url.chomp("/")}/bento_search_opaque_id/#{CGI.escape self.engine_id}/#{CGI.escape self.unique_id}"
+      else
+        nil
+      end
+    end


     ###################
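The `uri_identifier` method added in this file joins the app root URL, a fixed path segment, and the CGI-escaped engine and record ids. The following is a minimal standalone sketch of that concatenation outside Rails. `opaque_uri` is a hypothetical helper name, and the root URL is passed in explicitly instead of being read from `_h.root_url`.

```ruby
require "cgi"

# Hypothetical standalone version of the concatenation in #uri_identifier.
# Returns nil when any part is missing, as the decorator method does.
def opaque_uri(root_url, engine_id, unique_id)
  return nil if [root_url, engine_id, unique_id].any? { |v| v.nil? || v.empty? }

  "#{root_url.chomp("/")}/bento_search_opaque_id/#{CGI.escape(engine_id)}/#{CGI.escape(unique_id)}"
end

puts opaque_uri("http://example.org/", "gbs", "abc 123")
```

Escaping each id keeps characters such as `/` or spaces inside an id from changing the path structure of the resulting URI.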
data/app/models/bento_search/link.rb
CHANGED
@@ -13,6 +13,10 @@ module BentoSearch
     # http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#linkTypes
     attr_accessor :rel

+    # MIME content type may be used for both HMTL links and Atom
+    # link 'type' attribute
+    attr_accessor :type
+
     # Array of strings, used for CSS classes on this link, possibly
     # for custom styles/images etc. May be used in non-html link
     # contexts too.
data/app/models/bento_search/multi_searcher.rb
CHANGED
@@ -102,8 +102,10 @@ class BentoSearch::MultiSearcher
         Rails.logger.error("\nBentoSearch:MultiSearcher caught exception: #{e}\n#{e.backtrace.join(" \n")}")
         # Make a fake results with caught exception.
         @results = BentoSearch::Results.new
+        self.engine.fill_in_search_metadata_for(@results, self.engine.normalized_search_arguments(search_args))
+
         @results.error ||= {}
-        @results.error["exception"] = e
+        @results.error["exception"] = e
       end
     end

data/app/models/bento_search/registrar.rb
CHANGED
@@ -24,13 +24,42 @@ class BentoSearch::Registrar
   #
   # The first parameter identifier, eg "gbs", may be used in some
   # URLs, for AJAX etc.
-
-
+  #
+  # You can also pass in a hash or hash-like object (including
+  # a configuration object returned by a prior register_engine)
+  # instead of or in addition to the block 'dsl' -- this can be used
+  # to base one configuration off another, with changes:
+  #
+  # BentoSearch.register_engine("original", {
+  # :engine => "Something",
+  # :title => "Original",
+  # :shared => "shared"
+  # })
+  #
+  # BentoSearch.register_engine("derived") do |conf|
+  # conf.title = "Derived"
+  # end
+  #
+  # Above would not change 'shared' in 'original', but would
+  # over-ride 'title' in 'derived', without changing 'title' in
+  # 'original'.
+  def register_engine(id, conf_data = nil, &block)
+    conf = Confstruct::Configuration.new
+
+    # Make sure we make a deep_copy so any changes don't mutate
+    # the original. Confstruct can be unpredictable.
+    if conf_data.present?
+      conf_data = Confstruct::Configuration.new(conf_data).deep_copy
+    end
+
+    conf.configure(conf_data, &block)
     conf.id = id.to_s

     raise ArgumentError.new("Must supply an `engine` class name") unless conf.engine

     @registered_engine_confs[id] = conf
+
+    return conf
   end

   # Get a configured SearchEngine, using configuration and engine
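The deep copy in `register_engine` is what keeps a derived configuration from mutating the one it was based on. The sketch below illustrates only that idea with plain Ruby hashes. It rests on several assumptions: `TinyRegistrar` is a hypothetical stand-in for the registrar, `Marshal` replaces Confstruct's `deep_copy`, and the block receives a Hash, not a Confstruct configuration.

```ruby
# Hypothetical stand-in for the registrar, using plain Hashes instead of
# Confstruct so the example runs without any gems.
class TinyRegistrar
  def initialize
    @confs = {}
  end

  # Registers a configuration under +id+. +conf_data+ may be a Hash
  # (including one returned by an earlier call); it is deep-copied so
  # later changes never leak back into the original.
  def register_engine(id, conf_data = nil)
    conf = conf_data ? Marshal.load(Marshal.dump(conf_data)) : {}
    yield conf if block_given?
    conf[:id] = id.to_s

    raise ArgumentError, "Must supply an `engine` class name" unless conf[:engine]

    @confs[id] = conf
    conf
  end
end

registrar = TinyRegistrar.new
original = registrar.register_engine("original", {
  :engine => "Something",
  :title  => "Original",
  :shared => "shared"
})

# Base a second configuration on the first, overriding only the title.
derived = registrar.register_engine("derived", original) do |conf|
  conf[:title] = "Derived"
end

puts original[:title]
puts derived[:title]
```

Here the prior configuration is passed explicitly as the second argument, which is the "hash or hash-like object" form the comment in the diff describes.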
data/app/models/bento_search/result_item.rb
CHANGED
@@ -44,12 +44,8 @@ module BentoSearch
     # * dc.title
     # * schema.org CreativeWork: 'name'
     attr_accessor :title
-
-
-    # May also be nil with subtitle in "title" field after colon.
-    #
-    # *
-    attr_accessor :subtitle
+    # backwards compat, we used to have separate titles and subtitles
+    alias_method :complete_title, :title

     # usually a direct link to the search provider's 'native' page.
     # Can be changed in actual presentation with a Decorator.
@@ -65,12 +61,24 @@ module BentoSearch
       @link_is_fulltext = v
     end

-    #
-    #
+    # Our own INTERNAL controlled vocab for 'format'.
+    #
+    # Important that this be supplied by engine for maximum
+    # success of openurl, ris export, etc.
+    #
+    # This vocab is based on schema.org CreativeWork 'types',
+    # but supplemented with values we needed not present in schema.org.
+    # String values are last part of schema.org URLs, symbol values are custom.
+    #
+    # However, for backwards compat, values that didn't exist in schema.org
+    # when we started but later came to exist -- we still use our string
+    # values. If you actually want a schema.org url, see #schema_org_type_url
+    # which translates as needed.
     #
     # schema.org 'type' that's a sub-type of CreativeWork.
     # should hold a string that, when appended to "http://schema.org/"
-    # is a valid schema.org type uri, that sub-types CreativeWork.
+    # is a valid schema.org type uri, that sub-types CreativeWork. Ones
+    # we have used:
     # * Article
     # * Book
     # * Movie
@@ -93,7 +101,27 @@ module BentoSearch
     #
     # Note: We're re-thinking this, might allow uncontrolled
     # in here instead.
-    attr_accessor :format
+    attr_accessor :format
+
+    # Translated from internal format vocab at #format. Outputs
+    # eg http://schema.org/Book
+    # Uses the @@format_to_schema_org hash for mapping from
+    # certain internal symbol values to schema org value, where
+    # possible.
+    #
+    # Can return nil if we don't know a schema.org type
+    def schema_org_type_url
+      if format.kind_of? String
+        "http://schema.org/#{format}"
+      elsif mapped = @@format_to_schema_org[format]
+        "http://schema.org/#{mapped}"
+      else
+        nil
+      end
+    end
+    @@format_to_schema_org = {
+      :report => "Article",
+    }

     # uncontrolled presumably english-language format string.
     # if supplied will be used in display in place of controlled
@@ -110,10 +138,10 @@ module BentoSearch
     # Manually set language_str will over-ride display string calculated from
     # language_code.
     #
-    # Consumers
-    # either
-    #
-    #
+    # Consumers that want a language code can use #language_iso_639_1 or
+    # #language_iso_639_2 (either may be null), or #language_str for uncontrolled
+    # string. If engine just sets one of these, internals take care of filling
+    # out the others. r
     attr_accessor :language_code
     attr_writer :language_str
     def language_str
@@ -131,6 +159,16 @@ module BentoSearch
       @language_obj ||= LanguageList::LanguageInfo.find( self.language_code )
     end

+    # Two letter ISO language code, or nil
+    def language_iso_639_1
+      language_obj.try { |l| l.iso_639_1 }
+    end
+
+    # Three letter ISO language code, or nil
+    def language_iso_639_3
+      language_obj.try {|l| l.iso_639_3 }
+    end
+
     # year published. a ruby int
     # PART of:.
     # * schema.org CreativeWork "datePublished", year portion
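The `schema_org_type_url` method added to this file treats String formats as schema.org type names, looks Symbol formats up in a mapping, and returns nil otherwise. A minimal sketch of that lookup in isolation follows. `SketchItem` is a hypothetical class that models only the `format` field, and it uses a frozen constant where the diff uses the `@@format_to_schema_org` class variable.

```ruby
# Hypothetical minimal item; only the `format` field from the diff is modeled.
class SketchItem
  # Internal symbol formats mapped to a schema.org type name.
  FORMAT_TO_SCHEMA_ORG = { :report => "Article" }.freeze

  attr_accessor :format

  def initialize(format = nil)
    @format = format
  end

  # String formats are already schema.org type names; symbols are looked
  # up in the mapping; anything else has no known schema.org type (nil).
  def schema_org_type_url
    if format.is_a?(String)
      "http://schema.org/#{format}"
    elsif (mapped = FORMAT_TO_SCHEMA_ORG[format])
      "http://schema.org/#{mapped}"
    end
  end
end

puts SketchItem.new("Book").schema_org_type_url
puts SketchItem.new(:report).schema_org_type_url
```

A constant avoids depending on the order in which a class variable is assigned relative to the method definition, though either works at call time.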
data/app/models/bento_search/search_engine.rb
CHANGED
@@ -213,18 +213,11 @@ module BentoSearch
       arguments = normalized_search_arguments(*arguments)

       results = search_implementation(arguments)
-
-
-      # standard result metadata
-      results.start = arguments[:start] || 0
-      results.per_page = arguments[:per_page]
-
-      results.search_args = arguments
-      results.engine_id = configuration.id

+      fill_in_search_metadata_for(results, arguments)
+
       results.timing = (Time.now - start_t)

-      results.display_configuration = configuration.for_display
       results.each do |item|
         # We copy some configuraton info over to each Item, as a convenience
         # to display logic that may have decide what to do given only an item,
@@ -249,14 +242,27 @@ module BentoSearch
       failed.error ||= {}
       failed.error[:exception] = e

-      failed.search_args = arguments
-      failed.engine_id = configuration.id
-      failed.display_configuration = configuration.for_display
       failed.timing = (Time.now - start_t)
+
+      fill_in_search_metadata_for(failed, arguments)


       return failed
     end
+
+    # SOME of the elements of Results to be returned that SearchEngine implementation
+    # fills in automatically post-search. Extracted into a method for DRY in
+    # error handling to try to fill these in even in errors. And *possible*
+    # experimental use in other classes for same thing is why method is
+    # public, see MultiSearcher.
+    def fill_in_search_metadata_for(results, normalized_arguments)
+      results.search_args = normalized_arguments
+      results.start = normalized_arguments[:start] || 0
+      results.per_page = normalized_arguments[:per_page]
+
+      results.engine_id = configuration.id
+      results.display_configuration = configuration.for_display
+    end


     # Take the arguments passed into #search, which can be flexibly given
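The point of extracting `fill_in_search_metadata_for` is that a failed search still returns a results object carrying its search arguments, paging values and engine id. The sketch below shows that pattern with plain Structs. `SketchResults`, `SketchConfig` and `SketchEngine` are hypothetical stand-ins, and the block passed to `search` takes the place of an engine's `search_implementation`.

```ruby
# Hypothetical stand-ins for BentoSearch::Results and an engine configuration.
SketchResults = Struct.new(:search_args, :start, :per_page, :engine_id, :error)
SketchConfig  = Struct.new(:id)

class SketchEngine
  def initialize(configuration)
    @configuration = configuration
  end

  # Copies the normalized arguments and engine id onto a results object.
  def fill_in_search_metadata_for(results, normalized_arguments)
    results.search_args = normalized_arguments
    results.start       = normalized_arguments[:start] || 0
    results.per_page    = normalized_arguments[:per_page]
    results.engine_id   = @configuration.id
    results
  end

  # Runs the block as the search; on failure returns an error-carrying
  # results object that still has its metadata filled in.
  def search(arguments)
    results = yield(arguments)
    fill_in_search_metadata_for(results, arguments)
  rescue StandardError => e
    failed = SketchResults.new
    failed.error = { :exception => e }
    fill_in_search_metadata_for(failed, arguments)
  end
end

engine = SketchEngine.new(SketchConfig.new("gbs"))
ok     = engine.search({ :query => "cats", :per_page => 10 }) { |_args| SketchResults.new }
failed = engine.search({ :query => "cats", :start => 20, :per_page => 10 }) { raise "boom" }

puts ok.engine_id
puts failed.error[:exception].message
```

Because both paths call the same helper, display code can rely on `engine_id` and `per_page` being present whether or not the search succeeded.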
data/app/search_engines/bento_search/google_books_engine.rb
CHANGED
@@ -113,8 +113,7 @@ module BentoSearch

         item.unique_id = item_response["id"]

-        item.title = v_info
-        item.subtitle = v_info["subtitle"]
+        item.title = format_title(v_info)
         item.publisher = v_info["publisher"]
         # previewLink gives you your search results highlighted, preferable
         # if it exists.
@@ -255,6 +254,14 @@ module BentoSearch
       end
       return nil
     end
+
+    def format_title(v_info)
+      title = v_info["title"]
+      if v_info["subtitle"]
+        title = "#{title}: #{v_info["subtitle"]}"
+      end
+      return title
+    end

   end
 end
data/app/search_engines/bento_search/summon_engine.rb
CHANGED
@@ -121,11 +121,7 @@ class BentoSearch::SummonEngine

       item.unique_id = first_if_present doc_hash["ID"]

-      item.title =
-      item.custom_data["raw_title"] = handle_highlighting( first_if_present(doc_hash["Title"]) , :strip => true)
-
-      item.subtitle = handle_highlighting( first_if_present doc_hash["Subtitle"] )# TODO is this right?
-      item.custom_data["raw_subtitle"] = handle_highlighting( first_if_present(doc_hash["Subtitle"]), :strip => true )
+      item.title = format_title(doc_hash)

       item.link = doc_hash["link"]
       # Don't understand difference between hasFullText and
@@ -356,6 +352,18 @@ class BentoSearch::SummonEngine
       :strip => options[:strip]
     )
   end
+
+  # combine title and subtitle into one title,
+  def format_title(doc_hash)
+    title = first_if_present doc_hash["Title"]
+    subtitle = first_if_present doc_hash["Subtitle"]
+
+    if subtitle.present?
+      title = "#{title}: #{subtitle}"
+    end
+
+    return handle_highlighting( title )
+  end

   def self.required_configuration
     [:access_id, :secret_key]
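Both engines now fold the subtitle into `title` through their own `format_title` method, joining the two with `": "` when a subtitle exists. A minimal sketch of that shared rule without Rails follows. `combine_title` is a hypothetical helper, and it uses a plain nil-or-blank check where the Summon engine uses `present?` and the Google Books engine tests only truthiness.

```ruby
# Hypothetical helper mirroring the two engine-specific #format_title methods:
# joins a title and an optional subtitle into a single display title.
def combine_title(title, subtitle)
  return title if subtitle.nil? || subtitle.strip.empty?

  "#{title}: #{subtitle}"
end

puts combine_title("Ruby", "A Guide")
puts combine_title("Ruby", nil)
```

Since `ResultItem#subtitle` is removed in this release, a combined string like this is the only place a subtitle survives on the item.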