patentscope 0.0.1

data/.gitignore ADDED
@@ -0,0 +1,26 @@
1
+ /support/*
2
+ /spec/cassettes/*
3
+
4
+ *.gem
5
+ *.rbc
6
+ .bundle
7
+ .config
8
+ .yardoc
9
+ Gemfile.lock
10
+ InstalledFiles
11
+ _yardoc
12
+ coverage
13
+ doc/
14
+ lib/bundler/man
15
+ pkg
16
+ rdoc
17
+ spec/reports
18
+ test/tmp
19
+ test/version_tmp
20
+ tmp
21
+
22
+ # Ignore gems vendored by bundle install --path
23
+ /vendor/ruby
24
+
25
+ # Ignore binstubs generated by bundle install --binstubs
26
+ /bin
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ 1.9.3-p484
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in patentscope.gemspec
4
+ gemspec
data/License.txt ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2013-2014 Chong-Yee Khoo
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,257 @@
1
+ # patentscope gem
2
+ Gem to allow easy access to data from the WIPO Patentscope Web Service.
3
+
4
+ ## Introduction
5
+
6
+ The Patentscope gem allows easy access, with Ruby, to data provided by the PATENTSCOPE Web Service of the World Intellectual Property Organization (WIPO).
7
+
8
+ As provided by WIPO, the PATENTSCOPE Web Service is available through a SOAP interface. The documentation provided by WIPO uses Java.
9
+
10
+ The Patentscope gem, on the other hand, provides a simple Ruby interface to the PATENTSCOPE Web Service. The gem gives access to each of the functions available from the SOAP interface.
11
+
12
+ ## About the WIPO PATENTSCOPE Web Service
13
+
14
+ From the [PATENTSCOPE Web Service](http://www.wipo.int/patentscope/en/data/products.html) site:
15
+
16
+ "Includes:
17
+
18
+ * Bibliographic data for all published international applications (XML format);
19
+ * Images for all published international applications (TIFF format);
20
+ * Full-text description and claims (OCR output) for all international applications published in English, French, German, Spanish and Russian, as well as Japanese and Korean (soon available) (PDF format).
21
+
22
+ Available on the Internet on the day of publication. Programmatic access ... to the documents available in the document tab of the PATENTSCOPE search engine ([example](http://www.wipo.int/pctdb/en/wo.jsp?WO=2009120859&IA=US2009038389&DISPLAY=DOCS)). This set makes it possible to integrate access to PATENTSCOPE in an IT architecture, to retrieve the International Application Status Report (IASR) and to parse it on the fly and to download, within the framework of the [authorized uses policy](http://www.wipo.int/patentscope/en/data/terms.html); documents by batch. The formats of the documents are the same as the formats of the documents available via the web site, i.e. TIFF, XML for all documents and a text-based PDF OCR for most pamphlets."
23
+
24
+ The PATENTSCOPE Web Service is available from the World Intellectual Property Organization (WIPO) through a [paid subscription](http://www.wipo.int/patentscope/en/data/forms/web_service.jsp). The current cost of a subscription is 600 Swiss Francs per calendar year.
25
+
26
+ If you [ask nicely](mailto:patentscope@wipo.int?subject=Request%20for%20Trial%20Trial%20to%20PATENTSCOPE%20Web%20Service), the folks at WIPO might give you a trial account.
27
+
28
+ ## Installation
29
+
30
+ Add this line to your application's Gemfile:
31
+
32
+ gem 'patentscope'
33
+
34
+ And then execute:
35
+
36
+ $ bundle
37
+
38
+ Or install it yourself as:
39
+
40
+ $ gem install patentscope
41
+
42
+ ## Usage
43
+
44
+ ### Configuration
45
+ Run the configuration block first to set the credentials for the PATENTSCOPE Web Service.
46
+
47
+ Patentscope.configure do |config|
48
+ config.username = 'username'
49
+ config.password = 'password'
50
+ end
51
+
52
+ ### Configuring from Environment Variables
53
+
54
+ It is most convenient to store the PATENTSCOPE Web Service username and password credentials as environment variables.
55
+
56
+ If these are stored as `PATENTSCOPE_WEBSERVICE_USERNAME` and `PATENTSCOPE_WEBSERVICE_PASSWORD` respectively, you can simply use
57
+
58
+ Patentscope.configure_from_env
59
+
60
+ to load the credentials into the configuration in a single step.
61
+
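Before relying on the environment, it can help to check that both variables are actually present. The following is a minimal sketch, not part of the gem: `PATENTSCOPE_ENV_KEYS` and `missing_patentscope_env` are hypothetical names introduced here for illustration, and only the two variable names come from the text above.

```ruby
# Hypothetical helper, not part of the patentscope gem.
# The two variable names are the ones documented above.
PATENTSCOPE_ENV_KEYS = %w[
  PATENTSCOPE_WEBSERVICE_USERNAME
  PATENTSCOPE_WEBSERVICE_PASSWORD
].freeze

# Returns the names of the credential variables that are unset or blank.
# Accepts any Hash-like object so it can be exercised without touching ENV.
def missing_patentscope_env(env = ENV)
  PATENTSCOPE_ENV_KEYS.select { |key| env[key].to_s.strip.empty? }
end
```

If `missing_patentscope_env` returns an empty array, calling `Patentscope.configure_from_env` should pick up both values. Judging from the configuration source in this release, `configure_from_env` returns `false` and leaves things untouched when a configuration already exists, so call `Patentscope.reset_configuration` first if the credentials need to be reloaded.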
62
+ ### Querying and Resetting Configuration
63
+
64
+ The `configured?` class method returns a boolean indicating whether the configuration has been set. This doesn't necessarily mean that the credentials are valid, only that they have been set.
65
+
66
+ Patentscope.configured? #=> true
67
+
68
+ Use the `username` method of the `configuration` object to obtain the username set in the configuration.
69
+
70
+ Patentscope.configuration.username #=> 'username'
71
+
72
+ The `password` method of the `configuration` object returns the password set in the configuration.
73
+
74
+ Patentscope.configuration.password #=> 'password'
75
+
76
+ The `reset_configuration` class method resets the configuration.
77
+
78
+ Patentscope.reset_configuration
79
+ Patentscope.configuration #=> nil
80
+
81
+ ### List of Available Methods
82
+
83
+ * `get_iasr`
84
+ * `get_available_documents`
85
+ * `get_document_content`
86
+ * `get_document_ocr_content`
87
+ * `get_document_table_of_contents`
88
+ * `get_document_content_page`
89
+ * `wsdl`
90
+
91
+ ### Getting the International Application Status Report (`get_iasr`)
92
+
93
+ This is possibly the most useful of all the functions provided by this gem.
94
+
95
+ The `get_iasr` class method returns an International Application Status Report in XML format for the specified application number. The IASR document is essentially a bibliographic summary of the PCT application in XML format.
96
+
97
+ The `get_iasr` method takes an International Application number, with or without the PCT prefix and with or without slashes.
98
+
99
+ Patentscope.get_iasr('SG2003000062')
100
+ Patentscope.get_iasr('SG2003/000062')
101
+ Patentscope.get_iasr('PCTSG2003000062')
102
+ Patentscope.get_iasr('PCT/SG2003/000062')
103
+
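To illustrate what "with or without the PCT prefix and with or without slashes" amounts to, here is a rough sketch of a normaliser that maps all four forms above to the same string. It is not the gem's own implementation: `normalize_ia_number` is a hypothetical name, and the pattern (two-letter office code, four-digit year, six-digit serial) is an assumption based only on the examples shown here.

```ruby
# Hypothetical helper, not the gem's implementation.
# Assumes the IA number shape seen in the examples above:
# two letters, a four-digit year and a six-digit serial.
def normalize_ia_number(input)
  compact = input.to_s.upcase.gsub(%r{[\s/]}, '') # drop slashes and whitespace
  compact = compact.sub(/\APCT/, '')              # drop an optional PCT prefix

  # WO publication numbers have the same shape but are not IA numbers.
  if compact.start_with?('WO')
    raise ArgumentError, "WO publication numbers are not supported: #{input.inspect}"
  end
  unless compact =~ /\A[A-Z]{2}\d{10}\z/
    raise ArgumentError, "not a recognisable IA number: #{input.inspect}"
  end

  compact
end
```

With this sketch, `normalize_ia_number('PCT/SG2003/000062')` and `normalize_ia_number('SG2003000062')` both yield `'SG2003000062'`.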
104
+ Example output for SG2003000062:
105
+
106
+ Patentscope.get_iasr('SG2003000062')
107
+ #=>
108
+ <?xml version="1.0"?>
109
+ <wo-international-application-status>
110
+ <wo-bibliographic-data produced-by="IB" dtd-version="0.2" lang="EN" date-produced="20140108">
111
+ <publication-reference>
112
+ <document-id lang="EN">
113
+ <country>WO</country>
114
+ <doc-number>2009/105044</doc-number>
115
+ <kind>A1</kind>
116
+ <date>20090827</date>
117
+ </document-id>
118
+ </publication-reference>
119
+ <wo-publication-info>...
120
+
121
+ The PATENTSCOPE Web Service doesn't allow us to access documents using WO publication numbers. Calling `Patentscope.get_iasr('WO2003/080231')` for example will fail.
122
+
123
+ ### Getting a List of Available Documents (`get_available_documents`)
124
+
125
+ The `get_available_documents` class method returns the list of available documents for the specified application number.
126
+
127
+ Patentscope.get_available_documents('SG2009000062')
128
+ # =>
129
+ <?xml version="1.0"?>
130
+ <doc ocrPresence="no" docType="RO101" docId="id00000008679651"/>
131
+
132
+ ### Getting the Binary Content of a Document (`get_document_content`)
133
+
134
+ The `get_document_content` class method returns the binary content of the document for the specified document id.
135
+
136
+ Patentscope.get_document_content('090063618004ca88')
137
+ #=>
138
+ <?xml version="1.0"?>
139
+ <documentContent>UEsDBBQACAAIAIyMOy0AAAAAAAAAAAAAAAAKAAAAMDAwMDAxLnRpZsy7ezxU2 ...
140
+
141
+ ### Getting the Text of a Document in PDF Format (`get_document_ocr_content`)
142
+
143
+ The `get_document_ocr_content` class method returns the binary content of the document for the specified document id, in text-based PDF format (high quality OCR).
144
+
145
+ Patentscope.get_document_ocr_content('id00000015801579')
146
+ #=>
147
+ <?xml version="1.0"?>
148
+ <documentContent>JVBERi0xLjQKJeLjz9MKOCAwIG9iago8PC9EZWNvZGVQYXJtcyA8PC9CbG ...
149
+
150
+ ### Getting a List of Page IDs for a Document (`get_document_table_of_contents`)
151
+
152
+ The `get_document_table_of_contents` class method returns the list of page ids for the specified document id.
153
+
154
+ Patentscope.get_document_table_of_contents('090063618004ca88')
155
+ #=>
156
+ <?xml version="1.0"?>
157
+ <content>000001.tif</content>
158
+
159
+ ### Getting the Binary Content for a Document and Page (`get_document_content_page`)
160
+
161
+ The `get_document_content_page` class method returns the binary content for the specified document and page ids.
162
+
163
+ Patentscope.get_document_content_page('090063618004ca88', '000001.tif')
164
+ #=>
165
+ <?xml version="1.0"?>
166
+ <pageContent>SUkqAAgAAAASAP4ABAABAAAAAAAAAAABAwABAAAA</pageContent>
167
+
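The content returned by the three content methods above is Base64 text wrapped in an XML element. Decoding the leading characters of the samples gives a ZIP signature (`UEsDBBQA...`), a `%PDF-1.4` header (`JVBERi0xLjQK...`) and a little-endian TIFF header (`SUkqAAgA...`) respectively. The sketch below shows one way to unwrap and identify such a payload. `extract_payload`, `sniff_format` and `MAGIC_BYTES` are hypothetical names, and the regular expression is a deliberately simple stand-in for a real XML parser such as Nokogiri.

```ruby
# Hypothetical helpers, not part of the patentscope gem.

# Leading bytes of the formats seen in the sample outputs above.
MAGIC_BYTES = {
  zip:  "PK\x03\x04".b,
  pdf:  '%PDF'.b,
  tiff: "II*\x00".b # little-endian TIFF only
}.freeze

# Pulls the Base64 text out of a single wrapper element and decodes it.
# A regular expression keeps the sketch dependency-free; a real XML
# parser would be more robust.
def extract_payload(xml, element)
  name = Regexp.escape(element)
  match = xml.match(%r{<#{name}>(.*?)</#{name}>}m)
  raise ArgumentError, "no <#{element}> element found" unless match

  match[1].unpack('m').first # 'm' is the Base64 directive
end

# Guesses the format of decoded bytes from their first few bytes.
def sniff_format(bytes)
  found = MAGIC_BYTES.find { |_name, magic| bytes.start_with?(magic) }
  found ? found.first : :unknown
end
```

For example, `extract_payload(Patentscope.get_document_content_page(doc_id, page_id), 'pageContent')` would give bytes that could be written to a `.tif` file opened in binary mode.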
168
+ ### Getting a WSDL Document for the Web Service (`wsdl`)
169
+
170
+ The `wsdl` method returns a WSDL document for the PATENTSCOPE Web Service.
171
+
172
+ Patentscope.wsdl
173
+ #=>
174
+ <?xml version='1.0' encoding='UTF-8'?>
175
+ ...
176
+ <wsdl:definitions...
177
+
178
+ ### Errors
179
+
180
+ * NoCredentialsError - attempting to use the client when no credentials were entered
181
+ * WrongCredentialsError - attempting to access the PATENTSCOPE Webservice with incorrect credentials
182
+ * NoAppNumberError - no application number was entered, or unable to convert number
183
+ * NoDocIDError - no document id was entered
184
+ * NoPageIDError - no page id was entered
185
+ * BusinessError - PATENTSCOPE Webservice returned a "business error"
186
+
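One way to deal with these errors is to wrap each call and turn the exceptions into a result the caller can inspect. The following is only a sketch: `fetch_or_explain` is a hypothetical helper, and the three error classes are re-declared as stand-ins so that the snippet runs without the gem installed (with the gem loaded they already exist under the same names).

```ruby
# Stand-in definitions so the sketch runs on its own; the gem defines
# these classes itself as subclasses of StandardError.
module Patentscope
  class NoCredentialsError < StandardError; end
  class WrongCredentialsError < StandardError; end
  class BusinessError < StandardError; end
end

# Hypothetical helper: runs the block and returns [:ok, value] on success
# or [:error, message] for the errors handled here.
def fetch_or_explain
  [:ok, yield]
rescue Patentscope::NoCredentialsError
  [:error, 'no credentials set; run Patentscope.configure first']
rescue Patentscope::WrongCredentialsError
  [:error, 'the web service rejected the username or password']
rescue Patentscope::BusinessError => e
  [:error, "the web service reported a business error: #{e.message}"]
end
```

A call would then look like `status, value = fetch_or_explain { Patentscope.get_iasr('SG2003000062') }`. Any other exception still propagates to the caller.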
187
+ ## Disclaimer
188
+
189
+ The Patentscope gem is not an official product of the World Intellectual Property Organization. No warranty is attached to the provision of this gem. Please use this gem at your own risk.
190
+
191
+ You are solely responsible for ensuring that any uses are suitable, authorized and otherwise legal. In particular, you are solely responsible for maintaining a subscription at WIPO and adhering to WIPO's terms of use for the PATENTSCOPE Web Service.
192
+
193
+ Please note the [Terms and Conditions](http://www.wipo.int/patentscope/en/data/terms.html) relating to use of the PATENTSCOPE Web Service, especially Section 3.2 ("authorized uses", i.e., fewer than 10 retrieval related actions per minute from an individual IP address of a subscriber).
194
+
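To stay under the limit quoted above, a client-side throttle can space out calls. The sketch below is not part of the gem: `RequestThrottle` is a hypothetical class that allows at most nine calls in any 60-second window and sleeps when the window is full. The clock and sleeper are injectable only so the behaviour can be checked without waiting. Please check the current terms yourself, as the limit assumed here is simply the figure mentioned above.

```ruby
# Hypothetical helper, not part of the patentscope gem.
class RequestThrottle
  WINDOW_SECONDS = 60.0

  # max_per_window defaults to 9, i.e. "fewer than 10" per minute.
  def initialize(max_per_window: 9,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @max_per_window = max_per_window
    @clock = clock
    @sleeper = sleeper
    @timestamps = []
  end

  # Runs the block, first waiting until the oldest recorded call has left
  # the window if the window is already full.
  def throttle
    forget_old_requests
    while @timestamps.size >= @max_per_window
      @sleeper.call(WINDOW_SECONDS - (@clock.call - @timestamps.first))
      forget_old_requests
    end
    @timestamps << @clock.call
    yield
  end

  private

  # Drops timestamps that are at least one full window old.
  def forget_old_requests
    now = @clock.call
    @timestamps.reject! { |stamp| now - stamp >= WINDOW_SECONDS }
  end
end
```

Usage would be along the lines of `limiter = RequestThrottle.new` followed by `limiter.throttle { Patentscope.get_iasr(number) }` for each retrieval. The throttle only counts calls made through one object in one process, so several processes sharing an IP address would need a shared limit.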
195
+ ## Support
196
+
197
+ I am happy to help with any queries in relation to the Patentscope gem. Please file an issue on the GitHub repo and I will try my best to help.
198
+
199
+ **I am unable to answer any queries on the PATENTSCOPE Web Service itself**.
200
+
201
+ For support on the PATENTSCOPE Web Service, please see the resources in the section below, visit the [PATENTSCOPE Forum](http://wipo-patentscope-forum.2944828.n2.nabble.com) or send an email to WIPO at patentscope@wipo.int
202
+
203
+ ## Resources
204
+
205
+ ### PATENTSCOPE Web Service
206
+ * [PATENTSCOPE Web Service](http://www.wipo.int/patentscope/en/data/products.html)
207
+ * [Terms and Conditions](http://www.wipo.int/patentscope/en/data/terms.html)
208
+ * [Subscription Form](http://www.wipo.int/patentscope/en/data/forms/web_service.jsp)
209
+ * [News](http://www.wipo.int/patentscope/en/news/pctdb/)
210
+ * [PCT PATENTSCOPE Web-services for Offices](http://www.wipo.int/edocs/mdocs/pct/en/wipo_pct_mow_12/wipo_pct_mow_12_ref_pctpatentscope.pdf) (Presentation by WIPO)
211
+
212
+ ### PATENTSCOPE Search System
213
+ * [PATENTSCOPE](http://www.wipo.int/patentscope/en/)
214
+ * [Search Interface](http://patentscope.wipo.int/search/)
215
+ * [Webinars](http://www.wipo.int/patentscope/en/webinar/)
216
+ * [Forum](http://wipo-patentscope-forum.2944828.n2.nabble.com)
217
+
218
+ ## Contact
219
+ Comments and bug reports are welcome.
220
+
221
+ ## Contributing
222
+ Feel free to drop us a line to let us know you would like to work on something or if you have an idea. Otherwise, fork, code, commit, push and create a pull request, *viz*:
223
+
224
+ 1. Create a fork of the repo from http://github.com/cantab/patentscope.
225
+ 2. Create your feature branch (`git checkout -b new-feature`).
226
+ 3. Write some tests (in RSpec, if you please).
227
+ 4. Write the code that allows the tests to pass.
228
+ 5. Commit your changes (`git commit -am 'Add some feature'`).
229
+ 6. Push to the branch (`git push origin new-feature`).
230
+ 7. Create a new [Pull Request](https://help.github.com/articles/using-pull-requests).
231
+
232
+ More details on how to contribute can be found in this great Thoughtbot blog post, [8 (new) steps for fixing other people's code](http://robots.thoughtbot.com/8-new-steps-for-fixing-other-peoples-code).
233
+
234
+ ## License
235
+
236
+ Copyright (c) 2013-2014 Chong-Yee Khoo. All rights reserved.
237
+
238
+ MIT License
239
+
240
+ Permission is hereby granted, free of charge, to any person obtaining
241
+ a copy of this software and associated documentation files (the
242
+ "Software"), to deal in the Software without restriction, including
243
+ without limitation the rights to use, copy, modify, merge, publish,
244
+ distribute, sublicense, and/or sell copies of the Software, and to
245
+ permit persons to whom the Software is furnished to do so, subject to
246
+ the following conditions:
247
+
248
+ The above copyright notice and this permission notice shall be
249
+ included in all copies or substantial portions of the Software.
250
+
251
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
252
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
253
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
254
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
255
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
256
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
257
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1,27 @@
1
+ require 'bundler/gem_tasks'
2
+ require 'rspec/core/rake_task'
3
+
4
+ desc 'Run RSpec examples (all)'
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ namespace :spec do
9
+
10
+ desc 'Run RSpec core examples'
11
+
12
+ RSpec::Core::RakeTask.new(:core) do |task|
13
+ task.pattern = "./spec/**/*_spec.rb"
14
+ task.rspec_opts = '--tag core'
15
+ end
16
+
17
+ desc 'Run additional RSpec examples (e.g., integration specs)'
18
+
19
+ RSpec::Core::RakeTask.new(:more) do |task|
20
+ task.pattern = "./spec/**/*_spec.rb"
21
+ task.rspec_opts = '--tag more'
22
+ end
23
+
24
+ end
25
+
26
+ task default: :spec
27
+ task test: :spec
@@ -0,0 +1,52 @@
1
+ module Patentscope
2
+
3
+ require 'patentscope/version'
4
+ require 'patentscope/client'
5
+ require 'patentscope/configuration'
6
+ require 'patentscope/webservice'
7
+ require 'patentscope/webservice_soap_builder'
8
+ require 'patentscope/pct_doc_number'
9
+
10
+ class NoCredentialsError < StandardError; end
11
+ class WrongCredentialsError < StandardError; end
12
+ class BusinessError < StandardError; end
13
+ class WrongNumberFormatError < StandardError; end
14
+
15
+ class << self
16
+
17
+ def wsdl
18
+ webservice.wsdl
19
+ end
20
+
21
+ def get_available_documents(ia_number)
22
+ webservice.get_available_documents(ia_number: ia_number)
23
+ end
24
+
25
+ def get_document_content(doc_id)
26
+ webservice.get_document_content(doc_id: doc_id)
27
+ end
28
+
29
+ def get_document_ocr_content(doc_id)
30
+ webservice.get_document_ocr_content(doc_id: doc_id)
31
+ end
32
+
33
+ def get_iasr(ia_number)
34
+ webservice.get_iasr(ia_number: ia_number)
35
+ end
36
+
37
+ def get_document_table_of_contents(doc_id)
38
+ webservice.get_document_table_of_contents(doc_id: doc_id)
39
+ end
40
+
41
+ def get_document_content_page(doc_id, page_id)
42
+ webservice.get_document_content_page(doc_id: doc_id, page_id: page_id)
43
+ end
44
+
45
+ private
46
+
47
+ def webservice
48
+ Webservice.new
49
+ end
50
+ end
51
+ end
52
+
@@ -0,0 +1,32 @@
1
+ require 'net/http'
2
+ require 'nokogiri'
3
+
4
+ module Patentscope
5
+
6
+ class Client
7
+ attr_reader :username, :password
8
+
9
+ USER_AGENT_STRING = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/536.30.1 (KHTML, like Gecko) Version/6.0.5 Safari/536.30.1"
10
+
11
+ def initialize(args = {})
12
+ @username = args[:username]
13
+ @password = args[:password]
14
+ end
15
+
16
+ def get_url(url)
17
+ open(url, "User-Agent" => USER_AGENT_STRING, http_basic_authentication: [username, password]).read
18
+ end
19
+
20
+ def post_url(url, content_type = 'text/html', body = '')
21
+ uri = URI.parse(url)
22
+ http = Net::HTTP.new(uri.host, uri.port)
23
+ request = Net::HTTP::Post.new(uri.request_uri)
24
+ request.basic_auth(username, password)
25
+ request["User-Agent"] = USER_AGENT_STRING
26
+ request["Content-Type"] = content_type
27
+ request.body = body
28
+ response = http.request(request)
29
+ response.body
30
+ end
31
+ end
32
+ end
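As written, `post_url` builds a `Net::HTTP` object without enabling TLS, so it appears to assume a plain `http` endpoint. The following sketch shows how the same request could be assembled with TLS switched on for `https` URLs. `build_post` is a hypothetical function, not part of the gem. It only prepares the connection object and request, without sending anything, and the `example.invalid` URL in the usage note is a placeholder.

```ruby
require 'net/http'
require 'uri'

# Hypothetical variant of the setup done in Client#post_url.
# Returns the connection object and the request without sending it.
def build_post(url, username, password, content_type = 'text/html', body = '')
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https') # enable TLS only for https URLs

  request = Net::HTTP::Post.new(uri.request_uri)
  request.basic_auth(username, password)
  request['Content-Type'] = content_type
  request.body = body

  [http, request]
end
```

Sending would then be `http, request = build_post('https://example.invalid/service', user, pass, 'text/xml', soap_body)` followed by `http.request(request).body`, mirroring the last two lines of `post_url`.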
@@ -0,0 +1,42 @@
1
+ module Patentscope
2
+
3
+ class << self
4
+ attr_accessor :configuration
5
+
6
+ def configure
7
+ self.configuration ||= Configuration.new
8
+ yield(configuration) if block_given?
9
+ end
10
+
11
+ def configure_from_env
12
+ if self.configuration
13
+ return false
14
+ else
15
+ self.configuration = Configuration.new
16
+ self.configuration.username = ENV['PATENTSCOPE_WEBSERVICE_USERNAME']
17
+ self.configuration.password = ENV['PATENTSCOPE_WEBSERVICE_PASSWORD']
18
+ return true
19
+ end
20
+ end
21
+
22
+ def configured?
23
+ configuration &&
24
+ configuration.username &&
25
+ configuration.password
26
+ end
27
+
28
+ def reset_configuration
29
+ self.configuration = nil
30
+ end
31
+ end
32
+
33
+ class Configuration
34
+ attr_accessor :username, :password
35
+
36
+ def initialize
37
+ @username = ''
38
+ @password = ''
39
+ end
40
+ end
41
+
42
+ end