worldcat 0.0.1

Files changed (4)
  1. data/CHANGELOG.rdoc +4 -0
  2. data/README.rdoc +74 -0
  3. data/lib/worldcat.rb +331 -0
  4. metadata +110 -0
@@ -0,0 +1,4 @@
1
+ == 0.0.1
2
+
3
+ Initial version of worldcat gem.
4
+
@@ -0,0 +1,74 @@
1
+ = WorldCat Search API
2
+
3
+ A WorldCat API for Ruby to interact with WorldCat search webservices.
4
+ http://www.worldcat.org
5
+
6
+ == Usage
7
+
8
+ require 'worldcat'
9
+
10
+ client = WorldCat.new '[api_key]'
11
+
12
+ Get an Atom or RSS response from an OpenSearch query
13
+
14
+ atom = client.open_search :query => "Civil War"
15
+ puts atom.feed.title
16
+ puts atom.entries.first.author
17
+
18
+ Get MARC XML or Dublin Core from an SRU CQL query
19
+
20
+ cql = 'srw.kw="civil war" and (srw.su="antietam" or srw.su="sharpsburg")'
21
+
22
+ records = client.sru_search :query => cql, :format => "marcxml"
23
+ for record in records
24
+ # print out field 245 subfield a
25
+ puts record['245']['a']
26
+ end
27
+
28
+ If you'd like to use another implementation, the raw response is available:
29
+
30
+ client.raw_response
31
+
32
+ A faster way?
33
+
34
+ rss = WorldCat.new.open_search :q => "Globalization", :format => "rss", :wskey => '[api_key]'
35
+
36
+ For more information, please have a look at the documentation or the test cases.
37
+
38
+ == Installation
39
+
40
+ gem install worldcat
41
+
42
+ == Why?
43
+
44
+ The 'wcapi' gem falls short on several points, which justifies a new implementation:
45
+
46
+ * It is better to use an existing Ruby RSS implementation (here, SimpleRSS) to get Atom or RSS responses.
47
+ * It is better to use the Ruby MARC implementation to get MARC XML or Dublin Core responses from an SRU CQL search or other searches.
48
+ * Unit testing is great.
49
+
50
+ == What this API can do
51
+
52
+ * Send searches in OpenSearch or SRU CQL syntax.
53
+ * Receive OpenSearch responses in RSS or Atom format (both are a SimpleRSS object).
54
+ * Receive SRU responses in an array of MARC::Record or Dublin Core (REXML::Document).
55
+ * Receive a MARC::Record for a single OCLC record.
56
+ * Receive a REXML::Document for geographically-sorted library holdings information.
57
+ * Receive an HTML-formatted String for standard bibliographic citation formats (APA, Chicago, Harvard, MLA, and Turabian).
58
+
59
+ == To do
60
+
61
+ * Use SRU gem to get response from sru_search.
62
+
63
+ == Contribution
64
+
65
+ Feel free to fork and send me a pull request for changes or fixes, or simply a message with any suggestions.
66
+
67
+ == See
68
+
69
+ * {WorldCat webservices}[http://www.worldcat.org/affiliate/tools?atype=wcapi]
70
+ * {Ruby MARC documentation}[http://marc.rubyforge.org/]
71
+ * {Ruby Simple RSS documentation}[http://simple-rss.rubyforge.org/]
72
+
73
+ Vivien Didelot <vivien.didelot@gmail.com>
74
+ http://github.com/v0n/worldcat
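The README's SRU example passes a hand-written CQL string. As a minimal sketch of how such a string could be assembled, the snippet below uses a hypothetical `cql_clause` helper (it is not part of the worldcat gem) and assumes the terms contain no double quotes, since it does no escaping.

```ruby
# Hypothetical helper, NOT part of the worldcat gem: builds a CQL clause
# for one index from one or more terms, OR-ing multiple terms together.
# Assumption: terms contain no double quotes (no escaping is done here).
def cql_clause(index, *terms)
  parts = terms.map { |t| %(#{index}="#{t}") }
  parts.size > 1 ? "(#{parts.join(' or ')})" : parts.first
end

keyword = cql_clause("srw.kw", "civil war")
subject = cql_clause("srw.su", "antietam", "sharpsburg")

# Same query string as the one used in the README usage section.
cql = [keyword, subject].join(" and ")
puts cql
```

The resulting string could then be passed as `:query => cql` to `sru_search`, as the README shows.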
@@ -0,0 +1,331 @@
1
+ # Simple WorldCat Search Ruby API
2
+ # http://oclc.org/developer/services/WCAPI
3
+ #
4
+ # Author:: Vivien Didelot 'v0n' <vivien.didelot@gmail.com>
5
+
6
+ require 'rubygems' # needed by simple-rss
7
+ require 'open-uri' # used to fetch responses
8
+ require 'simple-rss' # used for Atom and RSS format
9
+ require 'marc' # used for MARC records
10
+ require 'rexml/document' # used for many XML purposes
11
+ require 'json' # used for JSON format
12
+
13
+ # The WorldCat class methods use WorldCat webservices.
14
+ # Options are given as a hash and Symbol keys may be:
15
+ # * the same names as the GET parameters,
16
+ # * Ruby naming convention (i.e. underscore),
17
+ # * or aliases if available.
18
+ #
19
+ # Note: aliases have priority.
20
+ #
21
+ # For a complete list of parameters, see documentation here:
22
+ # http://oclc.org/developer/documentation/worldcat-search-api/parameters
23
+
24
+ # The WorldCat class, used to interact with the WorldCat search webservices.
25
+ class WorldCat
26
+
27
+ # A specific WorldCat error class.
28
+ class WorldCatError < StandardError
29
+ def initialize(details = nil)
30
+ @details = details
31
+ end
32
+ end
33
+
34
+ # The WorldCat webservices API key.
35
+ attr_writer :api_key
36
+
37
+ # The raw response from WorldCat.
38
+ attr_reader :raw_response
39
+
40
+ # The raw url used to fetch the response.
41
+ attr_reader :raw_url
42
+
43
+ # The constructor.
44
+ # The API key can be given here or later.
45
+ def initialize(api_key = nil)
46
+ @api_key = api_key
47
+ @raw_url = nil
48
+ @raw_response = nil
49
+ end
50
+
51
+ # OpenSearch method.
52
+ #
53
+ # Aliases:
54
+ # * :query is an alias for :q
55
+ # * :max is an alias for :count
56
+ # * :citation_format is an alias for :cformat
57
+ #
58
+ # This method returns a SimpleRSS object. You can see the usage on:
59
+ # http://simple-rss.rubyforge.org/
60
+ def open_search(options)
61
+ # Check aliases
62
+ options.keys.each do |k|
63
+ case k
64
+ when :query then options[:q] = options.delete(k)
65
+ when :max then options[:count] = options.delete(k)
66
+ when :citation_format then options[:cformat] = options.delete(k)
67
+ end
68
+ end
69
+
70
+ fetch("search/opensearch", options)
71
+ #TODO diagnostic
72
+
73
+ # Add tags
74
+ SimpleRSS.feed_tags << :"opensearch:totalResults"
75
+ SimpleRSS.feed_tags << :"opensearch:startIndex"
76
+ SimpleRSS.feed_tags << :"opensearch:itemsPerPage"
77
+ SimpleRSS.item_tags << :"dc:identifier"
78
+ SimpleRSS.item_tags << :"oclcterms:recordIdentifier"
79
+
80
+ SimpleRSS.parse @raw_response
81
+ #TODO rescue SimpleRSS Error? (i.e. response too small)
82
+ end
83
+
84
+ # SRU search method.
85
+ #
86
+ # aliases:
87
+ # * :q is an alias for :query
88
+ # * :format is an alias for :record_schema
89
+ # and its value can match "marc" or "dublin", or can be the exact value. e.g.
90
+ # :format => :marcxml
91
+ # * :citation_format is an alias for :cformat
92
+ # * :start is an alias for :start_record
93
+ # * :count and :max are aliases for :maximum_records
94
+ #
95
+ # this method returns an array of MARC::Record objects for marc format
96
+ # (you can see the usage on http://marc.rubyforge.org),
97
+ # or a REXML::Document for Dublin Core format.
98
+ def sru_search(options)
99
+ #TODO add other control_tags?
100
+
101
+ # Check aliases
102
+ options.keys.each do |k|
103
+ case k
104
+ when :q then options[:query] = options.delete(k)
105
+ when :count, :max then options[:maximum_records] = options.delete(k)
106
+ when :start then options[:start_record] = options.delete(k)
107
+ when :citation_format then options[:cformat] = options.delete(k)
108
+ when :format
109
+ format = options.delete(k).to_s
110
+ if format =~ /marc/ then format = "info:srw/schema/1/marcxml" end
111
+ if format =~ /dublin/ then format = "info:srw/schema/1/dc" end
112
+ options[:record_schema] = format
113
+ end
114
+ end
115
+
116
+ fetch("search/sru", options)
117
+ xml_diagnostic
118
+
119
+ format = options[:record_schema]
120
+ if format.nil? || format == "info:srw/schema/1/marcxml"
121
+ marc_to_array
122
+ else
123
+ REXML::Document.new @raw_response
124
+ end
125
+ end
126
+
127
+ # Library locations method.
128
+ #
129
+ # aliases:
130
+ # * :start is an alias for :start_library
131
+ # * :count and :max are aliases for :maximum_libraries
132
+ # * :latitude is an alias for :lat
133
+ # * :longitude is an alias for :lon
134
+ # * libtype can be given as text value as well. e.g.:
135
+ # :libtype => :academic
136
+ # * record identifier should be given as type => id. e.g.:
137
+ # :isbn => "014330223X"
138
+ #
139
+ # this method returns a REXML::Document for XML format,
140
+ # or a Hash for JSON format.
141
+ def library_locations(options)
142
+ url_comp = "content/libraries/"
143
+
144
+ # Check aliases
145
+ options.keys.each do |k|
146
+ case k
147
+ when :count, :max then options[:maximum_libraries] = options.delete(k)
148
+ when :start then options[:start_library] = options.delete(k)
149
+ when :latitude then options[:lat] = options.delete(k)
150
+ when :longitude then options[:lon] = options.delete(k)
151
+ when :format then options.delete(k) if options[k].to_s == "xml"
152
+ when :libtype
153
+ libtype = options[k].to_s
154
+ options[k] = 1 if libtype == "academic"
155
+ options[k] = 2 if libtype == "public"
156
+ options[k] = 3 if libtype == "government"
157
+ options[k] = 4 if libtype == "other"
158
+ when :oclc then url_comp << options.delete(k).to_s
159
+ when :isbn then url_comp << "isbn/" << options.delete(k).to_s
160
+ when :issn then url_comp << "issn/" << options.delete(k).to_s
161
+ when :sn then url_comp << "sn/" << options.delete(k).to_s
162
+ end
163
+ end
164
+
165
+ if options.has_key? :format
166
+ fetch(url_comp, options)
167
+ json_diagnostic
168
+ response = JSON.parse(@raw_response)
169
+ else
170
+ fetch(url_comp, options)
171
+ xml_diagnostic
172
+ response = REXML::Document.new(@raw_response)
173
+ end
174
+
175
+ response
176
+ end
177
+
178
+ # Single Bibliographic Record.
179
+ #
180
+ # aliases:
181
+ # * record identifier should be given as type => id. e.g.:
182
+ # :isbn => "014330223X"
183
+ #
184
+ # this method returns a MARC::Record.
185
+ def single_record(options)
186
+ url_comp = "content/"
187
+
188
+ # Check aliases
189
+ options.keys.each do |k|
190
+ case k
191
+ when :oclc then url_comp << options.delete(k).to_s
192
+ when :isbn then url_comp << "isbn/" << options.delete(k).to_s
193
+ when :issn then url_comp << "issn/" << options.delete(k).to_s
194
+ end
195
+ end
196
+
197
+ fetch(url_comp, options)
198
+ xml_diagnostic
199
+ marc_to_array.first
200
+ end
201
+
202
+ # Library Catalog URL for a Record.
203
+ #
204
+ # aliases:
205
+ # * record identifier should be given as type => id. e.g.:
206
+ # :isbn => "014330223X"
207
+ #
208
+ # this method returns a REXML::Document.
209
+ def library_catalog_url(options)
210
+ url_comp = "content/libraries/"
211
+
212
+ # Check aliases
213
+ options.keys.each do |k|
214
+ case k
215
+ when :oclc then url_comp << options.delete(k).to_s
216
+ when :isbn then url_comp << "isbn/" << options.delete(k).to_s
217
+ end
218
+ end
219
+
220
+ #TODO get diagnostic for "no holdings found" instead of raising it.
221
+ fetch(url_comp, options)
222
+ xml_diagnostic
223
+ REXML::Document.new(@raw_response)
224
+ end
225
+
226
+ # Formatted Citations.
227
+ #
228
+ # aliases:
229
+ # * :citation_format is an alias for :cformat
230
+ # * record identifier should be given as:
231
+ # :oclc => [oclc_number]
232
+ #
233
+ # this method returns an HTML-formatted String.
234
+ def formatted_citations(options)
235
+ url_comp = "content/citations/"
236
+
237
+ # Check aliases
238
+ options.keys.each do |k|
239
+ case k
240
+ when :citation_format then options[:cformat] = options.delete(k)
241
+ when :oclc then url_comp << options.delete(k).to_s
242
+ end
243
+ end
244
+
245
+ fetch(url_comp, options)
246
+ if options.has_key? :cformat
247
+ xml_diagnostic
248
+ else
249
+ str_diagnostic
250
+ end
251
+
252
+ @raw_response
253
+ end
254
+
255
+ private
256
+
257
+ # Helper method to convert a MARC::XMLReader into an array of records.
258
+ # That's easier to use and better because of the bug
259
+ # that makes the REXML reader empty after the first #each call.
260
+ def marc_to_array
261
+ reader = MARC::XMLReader.new(StringIO.new(@raw_response))
262
+ records = Array.new
263
+ reader.each { |record| records << record }
264
+
265
+ records
266
+ end
267
+
268
+ # Method to fetch the raw response from WorldCat webservices.
269
+ def fetch(url_comp, options)
270
+ # Use the API key attribute or the one provided.
271
+ options = {:wskey => @api_key}.merge options
272
+
273
+ url = "http://www.worldcat.org/webservices/catalog/" << url_comp << "?"
274
+ url << options.map { |k, v| "#{camelize(k)}=#{parse_value(v)}" }.join("&")
275
+ @raw_url = URI.escape(url)
276
+
277
+ begin
278
+ open @raw_url do |raw|
279
+ @raw_response = raw.read
280
+ end
281
+ rescue OpenURI::HTTPError => e
282
+ if e.message =~ /status=UNAUTHENTICATED/
283
+ raise WorldCatError.new(e.message), "Authentication failure"
284
+ else raise e
285
+ end
286
+ end
287
+ end
288
+
289
+ def str_diagnostic
290
+ # May be something like: "info:srw/diagnostic/1/65Record does not exist"
291
+ if @raw_response =~ /(info:srw\/diagnostic\/\d+\/\d+)(.*)/
292
+ raise WorldCatError.new, $2
293
+ end
294
+ end
295
+
296
+ # Check for diagnostics of XML responses from WorldCat.
297
+ def xml_diagnostic
298
+ xml = REXML::Document.new @raw_response
299
+ d = xml.elements['diagnostics'] || xml.root.elements['diagnostics']
300
+ unless d.nil?
301
+ d = d.elements.first
302
+ details = d.elements["details"]
303
+ details = details.text unless details.nil?
304
+ message = d.elements["message"].text
305
+
306
+ raise WorldCatError.new(details), message
307
+ end
308
+ end
309
+
310
+ # Check for diagnostics of JSON responses from WorldCat.
311
+ def json_diagnostic
312
+ json = JSON.parse(@raw_response)
313
+ if json.has_key? "diagnostic"
314
+ details = json["diagnostic"].first["details"]
315
+ message = json["diagnostic"].first["message"]
316
+ raise WorldCatError.new(details), message
317
+ end
318
+ end
319
+
320
+ # Helper function to camelize a string or symbol
321
+ # to match WorldCat services parameters.
322
+ def camelize(key)
323
+ key.to_s.gsub(/_(\w)/) { |m| m.sub('_', '').capitalize }
324
+ end
325
+
326
+ # Helper function to parse an array, number or string
327
+ # to match WorldCat services parameters.
328
+ def parse_value(value)
329
+ value.is_a?(Array) ? value.join(',') : value.to_s
330
+ end
331
+ end
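The private helpers `camelize` and `parse_value` turn an options hash into GET parameters inside `fetch`. The standalone sketch below reproduces that transformation outside the class so it can be run on its own. One deliberate difference, labeled as an assumption: it percent-encodes each value with `URI.encode_www_form_component` rather than calling `URI.escape` on the whole URL as the gem does. The `build_query` name is illustrative only.

```ruby
require 'uri'

# Same transformation as the private WorldCat#camelize helper:
# :maximum_records becomes "maximumRecords".
def camelize(key)
  key.to_s.gsub(/_(\w)/) { $1.upcase }
end

# Same behavior as the private WorldCat#parse_value helper:
# arrays are joined with commas, everything else is stringified.
def parse_value(value)
  value.is_a?(Array) ? value.join(',') : value.to_s
end

# Hypothetical helper (not in the gem). Assumption: each value is
# form-encoded individually instead of escaping the complete URL.
def build_query(options)
  options.map { |k, v|
    "#{camelize(k)}=#{URI.encode_www_form_component(parse_value(v))}"
  }.join("&")
end

puts build_query(:wskey => "KEY", :maximum_records => 5, :start_record => 1)
```

Encoding values one at a time keeps characters such as `&` or `=` inside a value from being read as parameter separators, which escaping the whole URL in one pass may not guarantee.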
metadata ADDED
@@ -0,0 +1,110 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: worldcat
3
+ version: !ruby/object:Gem::Version
4
+ prerelease: false
5
+ segments:
6
+ - 0
7
+ - 0
8
+ - 1
9
+ version: 0.0.1
10
+ platform: ruby
11
+ authors:
12
+ - Vivien Didelot
13
+ autorequire:
14
+ bindir: bin
15
+ cert_chain: []
16
+
17
+ date: 2010-09-17 00:00:00 +10:00
18
+ default_executable:
19
+ dependencies:
20
+ - !ruby/object:Gem::Dependency
21
+ name: simple-rss
22
+ prerelease: false
23
+ requirement: &id001 !ruby/object:Gem::Requirement
24
+ none: false
25
+ requirements:
26
+ - - ">="
27
+ - !ruby/object:Gem::Version
28
+ segments:
29
+ - 1
30
+ - 2
31
+ - 3
32
+ version: 1.2.3
33
+ type: :runtime
34
+ version_requirements: *id001
35
+ - !ruby/object:Gem::Dependency
36
+ name: marc
37
+ prerelease: false
38
+ requirement: &id002 !ruby/object:Gem::Requirement
39
+ none: false
40
+ requirements:
41
+ - - ">="
42
+ - !ruby/object:Gem::Version
43
+ segments:
44
+ - 0
45
+ - 3
46
+ - 3
47
+ version: 0.3.3
48
+ type: :runtime
49
+ version_requirements: *id002
50
+ - !ruby/object:Gem::Dependency
51
+ name: json
52
+ prerelease: false
53
+ requirement: &id003 !ruby/object:Gem::Requirement
54
+ none: false
55
+ requirements:
56
+ - - ">="
57
+ - !ruby/object:Gem::Version
58
+ segments:
59
+ - 1
60
+ - 4
61
+ - 6
62
+ version: 1.4.6
63
+ type: :runtime
64
+ version_requirements: *id003
65
+ description:
66
+ email: vivien.didelot@gmail.com
67
+ executables: []
68
+
69
+ extensions: []
70
+
71
+ extra_rdoc_files: []
72
+
73
+ files:
74
+ - lib/worldcat.rb
75
+ - README.rdoc
76
+ - CHANGELOG.rdoc
77
+ has_rdoc: true
78
+ homepage:
79
+ licenses: []
80
+
81
+ post_install_message:
82
+ rdoc_options: []
83
+
84
+ require_paths:
85
+ - lib
86
+ required_ruby_version: !ruby/object:Gem::Requirement
87
+ none: false
88
+ requirements:
89
+ - - ">="
90
+ - !ruby/object:Gem::Version
91
+ segments:
92
+ - 0
93
+ version: "0"
94
+ required_rubygems_version: !ruby/object:Gem::Requirement
95
+ none: false
96
+ requirements:
97
+ - - ">="
98
+ - !ruby/object:Gem::Version
99
+ segments:
100
+ - 0
101
+ version: "0"
102
+ requirements: []
103
+
104
+ rubyforge_project:
105
+ rubygems_version: 1.3.7
106
+ signing_key:
107
+ specification_version: 3
108
+ summary: A Ruby API for the WorldCat Search webservices
109
+ test_files: []
110
+
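The YAML above is the serialized gem specification. For readers more used to the Ruby form, here is a sketch of a gemspec that would declare the same name, version, files and runtime dependencies. The author's actual gemspec file is not shown in this diff, so this is a reconstruction from the metadata rather than the original source.

```ruby
require 'rubygems'

# Reconstructed from the metadata above; the original .gemspec may differ.
spec = Gem::Specification.new do |s|
  s.name    = "worldcat"
  s.version = "0.0.1"
  s.summary = "A Ruby API for the WorldCat Search webservices"
  s.authors = ["Vivien Didelot"]
  s.email   = "vivien.didelot@gmail.com"
  s.files   = ["lib/worldcat.rb", "README.rdoc", "CHANGELOG.rdoc"]

  # Runtime dependencies with the minimum versions listed in the metadata.
  s.add_dependency "simple-rss", ">= 1.2.3"
  s.add_dependency "marc",       ">= 0.3.3"
  s.add_dependency "json",       ">= 1.4.6"
end

spec.runtime_dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement}"
end
```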