epo-ops 0.1.6 → 0.2.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 14d4527d80492bc9c65e4b9119155f8efbd5f306
4
- data.tar.gz: 0ef4446294943bb1bcaf0af3bc34d3443c80d1fa
3
+ metadata.gz: e40ce1dbaed083f38c18e4e0f40f66f80381d270
4
+ data.tar.gz: 7a69709812e5be2687ff4ccae2d7aa69372e3bf2
5
5
  SHA512:
6
- metadata.gz: 09e15e977dc99780a2519fec1b87328205b57aabc86fc02cfa26b755eae6680fc2c3cda513c53d357dac42a69315603a9b0108f0ceeab2b4d9459301ee5b9478
7
- data.tar.gz: 2cb69673d63c1a3250a17002b110ada59db528837ae6adf36c9d2f728a8ed918e941f305c30ef3038787c2b7ed88811de5900e6bf5108d76ff181b2bfb03ff54
6
+ metadata.gz: 6092ddb5f55b92f9156c5360f1436d3242468c00f2e40b21fd9fe50240f4c333ce7d8562022cdb7af9eb6aca59236c22171b6b2d72a4dd628bb9558422c75bbe
7
+ data.tar.gz: f26525ce447f0882581228ea0bdc52c86a01aa90bd2b2be6f76e1b52c4f92ec00e59d3b6d3ffa785eb105ecb8122b6126ec7ade6f81bba20d4c9efdfaf701859
data/.gitignore CHANGED
@@ -35,3 +35,6 @@ Gemfile.lock
35
35
 
36
36
  # unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
37
37
  .rvmrc
38
+
39
+ # IDEs
40
+ /.idea/
data/.travis.yml CHANGED
@@ -1,5 +1,5 @@
1
1
  language: ruby
2
2
  rvm:
3
- - 2.2.7
4
- - 2.3.4
5
- - 2.4.1
3
+ - "2.0.0"
4
+ - "2.1.0"
5
+ - "2.2.0"
data/README.md CHANGED
@@ -4,15 +4,19 @@
4
4
  # epo-ops
5
5
  Ruby interface to the EPO Open Patent Services (OPS).
6
6
 
7
- You can play around with the API [here](https://developers.epo.org/).
8
- Documentation of it can be found [here](https://www.epo.org/searching-for-patents/technical/espacenet/ops.html) under `Downloads`.
7
+ [Documentation can be found here](http://www.rubydoc.info/gems/epo-ops/)
8
+
9
+ The EPO provides a [playground](https://developers.epo.org/) where you can try
10
+ out the methods, as well as [documentation](https://www.epo.org/searching-for-patents/technical/espacenet/ops.html)
11
+ of the different endpoints and detailed usage (see the 'Downloads' section).
9
12
 
10
13
  # Usage
11
14
 
12
- ## Authentification
13
- In order to use this gem you need to register at the EPO for OAuth
14
- [here](https://developers.epo.org/user/register).
15
+ ## Authentication
16
+ In order to use this gem you need to register at the [EPO for
17
+ OAuth](https://developers.epo.org/user/register).
15
18
  Use your credentials by configuring
19
+
16
20
  ```ruby
17
21
  Epo::Ops.configure do |conf|
18
22
  conf.consumer_key = "YOUR_KEY"
@@ -20,15 +24,48 @@ Epo::Ops.configure do |conf|
20
24
  end
21
25
  ```
22
26
 
23
- ## What works up to now
24
- * Search the EPO OPS register with `Epo::Ops::Register.search(query)`; use `Epo::Ops::SearchQueryBuilder` to build an appropriate request.
25
- * Get bibliographic info from the register, both for application and publication references (which you may retrieve with the search).
26
- * Bulk searching for all patents on a given date wih `Epo::Ops::Register::Bulk.all_queries(date)`. Note that patents are usually published on Wednesdays, if you find some on another weekday, please let us know.
27
- This method currently returns all queries necessary to find all patents with `Epo::Ops::Register.search`
27
+ ## Quickstart
28
+ ### Search for Patents
29
+
30
+ Get references to all Patents on a given date and IPC-class:
31
+
32
+ ```ruby
33
+ Epo::Ops::Register.search("A", Date.new(2016,2 ,3))
34
+ # or for all ipc classes
35
+ Epo::Ops::Register.search(nil, Date.new(2016,2 ,3))
36
+ ```
37
+
38
+ You can now retrieve the bibliographic entries of all these:
39
+
40
+ ```ruby
41
+ references = Epo::Ops::Register.search(nil, Date.new(2016,2 ,3))
42
+ references.map { |ref| Epo::Ops::Register.biblio(ref) }
43
+ ```
44
+ Each call returns an object that helps with parsing the result. See the documentation
45
+ for more information.
46
+
47
+ Note that both operations take a considerable amount of time. Also you may not
48
+ want to develop and test with many of these requests, as they can quite quickly
49
+ exceed the API limits. Also note that these methods use the `application`
50
+ endpoint.
51
+
52
+ ## Custom Retrieval
28
53
 
29
- ### #search
30
- Use the `SearchQueryBuilder` to set up the queries. By default structs are returned that should make it easier to work with the results, but with the `raw`-flag set to true you may also retrieve the resulting hash and parse it yourself.
31
- The results have the method `#epodoc_reference` which perfectly fits into `#biblio`
54
+ ### #raw_search
55
+ This allows you to build your own CQL query, as
56
+ described in the official documentation. With the second parameter set
57
+ to true you can get the raw result as a nested Hash, if you want to
58
+ parse it yourself.
32
59
 
33
- ### #biblio
34
- With `Epo::Ops::Register.biblio(reference_id)` you can retrieve the bibliographic entry for the given patent (see OPS documentation). By default it searches the `/application/` endpoint, but you may set `publication` as the second parameter. Make sure the `reference_id` matches the given type. The last optional parameter allows you to set another format the id, but the default `epodoc` is strongly advised. This format is also provided from search results with `#epodoc_reference`.
60
+ ```ruby
61
+ Epo::Ops::Register.raw_search("q=pd=20160203 and ic=D&Range=1-100", true)
62
+ ```
63
+
64
+ ### #raw_biblio
65
+ If you do not want to retrieve via the `application` endpoint (say you want
66
+ `publication`) this method gives you more fine-grained control. Make sure the
67
+ `reference_id` you use matches the type.
68
+
69
+ ```ruby
70
+ Epo::Ops::Register.raw_biblio('EP1000000', 'publication')
71
+ ```
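The raw query string shown in the README (`q=pd=20160203 and ic=D&Range=1-100`) can be assembled from a date and an optional IPC class. The following is a minimal sketch of that assembly; the helper name `cql_query` and its default range of 1-100 are assumptions made for illustration and are not part of the gem.

```ruby
require 'date'

# Hypothetical helper, not part of epo-ops: joins the optional
# publication date (pd) and IPC class (ic) terms with " and ",
# then appends the requested result range.
def cql_query(ipc_class, date, first = 1, last = 100)
  terms = []
  terms << "pd=#{date.strftime('%Y%m%d')}" if date
  terms << "ic=#{ipc_class}" if ipc_class
  "q=#{terms.join(' and ')}&Range=#{first}-#{last}"
end

puts cql_query('D', Date.new(2016, 2, 3))
puts cql_query(nil, Date.new(2016, 2, 3), 101, 200)
```

A string built this way has the same shape as the argument passed to `raw_search` in the README example.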
data/epo-ops.gemspec CHANGED
@@ -21,9 +21,14 @@ Gem::Specification.new do |spec|
21
21
  spec.add_development_dependency 'bundler', '~> 1.3'
22
22
  spec.add_development_dependency 'rake', '~> 10.5'
23
23
  spec.add_development_dependency 'minitest', '~> 5.8'
24
+ spec.add_development_dependency 'mocha'
24
25
  spec.add_development_dependency 'vcr', '~> 2.9'
25
26
  spec.add_development_dependency 'webmock', '~> 1.22'
26
27
  spec.add_development_dependency 'simplecov'
27
28
 
29
+ spec.add_development_dependency 'redis'
30
+ spec.add_development_dependency 'connection_pool'
31
+
28
32
  spec.add_dependency 'oauth2', '~> 1.1'
33
+ spec.add_dependency 'httparty', '~> 0.13'
29
34
  end
data/lib/epo/ops.rb CHANGED
@@ -2,6 +2,8 @@ require 'epo/ops/version'
2
2
  require 'epo/ops/token_store'
3
3
  require 'epo/ops/register'
4
4
  require 'epo/ops/search_query_builder'
5
+ require 'epo/ops/ipc_class_hierarchy_loader'
6
+ require 'epo/ops/ipc_class_util'
5
7
 
6
8
  module Epo
7
9
  module Ops
@@ -27,7 +27,7 @@ module Epo
27
27
 
28
28
  # @return [String] The URL at which you can query the original document.
29
29
  def url
30
- @url ||= "https://ops.epo.org/#{Epo::Ops::API_VERSION}/rest-services/register/application/epodoc/#{application_nr}"
30
+ @url ||= "https://ops.epo.org/3.1/rest-services/register/application/epodoc/#{application_nr}"
31
31
  end
32
32
 
33
33
  # @return [String] the english title of the patent @note Titles are
@@ -3,9 +3,20 @@ require 'epo/ops/error'
3
3
 
4
4
  module Epo
5
5
  module Ops
6
+
7
+ # This is a wrapper for OAuth
6
8
  class Client
7
9
  # @return [OAuth2::Response]
8
10
  def self.request(verb, url, options = {})
11
+ do_request(verb, url, options)
12
+ rescue Error::AccessTokenExpired
13
+ Epo::Ops.config.token_store.reset
14
+ do_request(verb, url, options)
15
+ end
16
+
17
+ private
18
+
19
+ def self.do_request(verb, url, options = {})
9
20
  token = Epo::Ops.config.token_store.token
10
21
  response = token.request(verb, URI.encode(url), options)
11
22
  fail Error.from_response(response) unless response.status == 200
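The change above retries a request exactly once after resetting the token store when the access token has expired. A self-contained sketch of that retry-once pattern follows; `AccessTokenExpired`, `ResettableStore` and `with_token_retry` are stand-ins invented for this example, not the gem's classes.

```ruby
# Stand-in error class for illustration only.
class AccessTokenExpired < StandardError; end

# Stand-in token store that only counts how often it was reset.
class ResettableStore
  attr_reader :resets

  def initialize
    @resets = 0
  end

  def reset
    @resets += 1
  end
end

# Runs the block; if it signals an expired token, resets the store
# and runs the block one more time. A second failure propagates.
def with_token_retry(store)
  yield
rescue AccessTokenExpired
  store.reset
  yield
end

store = ResettableStore.new
result = with_token_retry(store) do
  # Fails until the store has been reset once.
  raise AccessTokenExpired if store.resets.zero?
  :ok
end
puts result
```

Because the rescue clause belongs to the method body, an error raised by the second attempt is not caught again, which keeps the retry bounded to one.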
data/lib/epo/ops/error.rb CHANGED
@@ -32,6 +32,8 @@ module Epo
32
32
  ServiceUnavailable = Class.new(ServerError)
33
33
  # Raised when EPO returns the HTTP status code 504
34
34
  GatewayTimeout = Class.new(ServerError)
35
+ # AccessToken has expired
36
+ AccessTokenExpired = Class.new(ClientError)
35
37
 
36
38
  ERRORS = {
37
39
  400 => Epo::Ops::Error::BadRequest,
@@ -59,6 +61,8 @@ module Epo
59
61
 
60
62
  if code == 403 && FORBIDDEN_MESSAGES[message]
61
63
  FORBIDDEN_MESSAGES[message].new(message, response.headers, code)
64
+ elsif code == 400 && response.headers['www-authenticate'] && response.headers['www-authenticate'].include?('Access Token expired')
65
+ Error::AccessTokenExpired.new('Access Token expired', response.headers, code)
62
66
  else
63
67
  ERRORS[code].new(message, response.headers, code)
64
68
  end
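The new branch distinguishes an expired token from an ordinary 400 response by inspecting the `www-authenticate` header. A reduced sketch of that decision is shown below; it returns symbols instead of the gem's error classes, and the function name `classify_failure` is hypothetical.

```ruby
# Hypothetical classifier: symbols stand in for the gem's error classes.
def classify_failure(code, headers)
  # to_s guards against a missing header.
  challenge = headers['www-authenticate'].to_s
  if code == 400 && challenge.include?('Access Token expired')
    :access_token_expired
  elsif code >= 500
    :server_error
  else
    :client_error
  end
end

puts classify_failure(400, 'www-authenticate' => 'Bearer error="Access Token expired"')
puts classify_failure(400, {})
```

The header value in the usage lines is made up for the example; only the substring `Access Token expired` comes from the diff.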
@@ -0,0 +1,148 @@
1
+ module Epo
2
+ module Ops
3
+ # The hierarchy is a flat Hash, that helps finding all known ipc subclasses
4
+ # of a given class. It was parsed from the WIPO. It does not support all
5
+ # levels, as it would (currently unnecessarily) blow up this hash. It only
6
+ # finds the first two sub class levels, e.g. A45F.
7
+ class IpcClassHierarchy
8
+ Hierarchy = { 'A' => %w(A01 A21 A22 A23 A24 A41 A42 A43 A44 A45 A46 A47 A61 A62 A63 A99),
9
+ 'A01' => %w(A01B A01C A01D A01F A01G A01H A01J A01K A01L A01M A01N A01P),
10
+ 'A21' => %w(A21B A21C A21D),
11
+ 'A22' => %w(A22B A22C),
12
+ 'A23' => %w(A23B A23C A23D A23F A23G A23J A23K A23L A23N A23P),
13
+ 'A24' => %w(A24B A24C A24D A24F),
14
+ 'A41' => %w(A41B A41C A41D A41F A41G A41H),
15
+ 'A42' => %w(A42B A42C),
16
+ 'A43' => %w(A43B A43C A43D),
17
+ 'A44' => %w(A44B A44C),
18
+ 'A45' => %w(A45B A45C A45D A45F),
19
+ 'A46' => %w(A46B A46D),
20
+ 'A47' => %w(A47B A47C A47D A47F A47G A47H A47J A47K A47L),
21
+ 'A61' => %w(A61B A61C A61D A61F A61G A61H A61J A61K A61L A61M A61N A61P A61Q),
22
+ 'A62' => %w(A62B A62C A62D),
23
+ 'A63' => %w(A63B A63C A63D A63F A63G A63H A63J A63K),
24
+ 'A99' => ['A99Z'],
25
+ 'B' => %w(B01 B02 B03 B04 B05 B06 B07 B08 B09 B21 B22 B23 B24 B25 B26 B27 B28 B29 B30 B31 B32 B33 B41 B42 B43 B44 B60 B61 B62 B63 B64 B65 B66 B67 B68 B81 B82 B99),
26
+ 'B01' => %w(B01B B01D B01F B01J B01L),
27
+ 'B02' => %w(B02B B02C),
28
+ 'B03' => %w(B03B B03C B03D),
29
+ 'B04' => %w(B04B B04C),
30
+ 'B05' => %w(B05B B05C B05D),
31
+ 'B06' => ['B06B'],
32
+ 'B07' => %w(B07B B07C),
33
+ 'B08' => ['B08B'],
34
+ 'B09' => %w(B09B B09C),
35
+ 'B21' => %w(B21B B21C B21D B21F B21G B21H B21J B21K B21L),
36
+ 'B22' => %w(B22C B22D B22F),
37
+ 'B23' => %w(B23B B23C B23D B23F B23G B23H B23K B23P B23Q),
38
+ 'B24' => %w(B24B B24C B24D),
39
+ 'B25' => %w(B25B B25C B25D B25F B25G B25H B25J),
40
+ 'B26' => %w(B26B B26D B26F),
41
+ 'B27' => %w(B27B B27C B27D B27F B27G B27H B27J B27K B27L B27M B27N),
42
+ 'B28' => %w(B28B B28C B28D),
43
+ 'B29' => %w(B29B B29C B29D B29K B29L),
44
+ 'B30' => ['B30B'],
45
+ 'B31' => %w(B31B B31C B31D B31F),
46
+ 'B32' => ['B32B'],
47
+ 'B33' => ['B33Y'],
48
+ 'B41' => %w(B41B B41C B41D B41F B41G B41J B41K B41L B41M B41N),
49
+ 'B42' => %w(B42B B42C B42D B42F),
50
+ 'B43' => %w(B43K B43L B43M),
51
+ 'B44' => %w(B44B B44C B44D B44F),
52
+ 'B60' => %w(B60B B60C B60D B60F B60G B60H B60J B60K B60L B60M B60N B60P B60Q B60R B60S B60T B60V B60W),
53
+ 'B61' => %w(B61B B61C B61D B61F B61G B61H B61J B61K B61L),
54
+ 'B62' => %w(B62B B62C B62D B62H B62J B62K B62L B62M),
55
+ 'B63' => %w(B63B B63C B63G B63H B63J),
56
+ 'B64' => %w(B64B B64C B64D B64F B64G),
57
+ 'B65' => %w(B65B B65C B65D B65F B65G B65H),
58
+ 'B66' => %w(B66B B66C B66D B66F),
59
+ 'B67' => %w(B67B B67C B67D),
60
+ 'B68' => %w(B68B B68C B68F B68G),
61
+ 'B81' => %w(B81B B81C),
62
+ 'B82' => %w(B82B B82Y),
63
+ 'B99' => ['B99Z'],
64
+ 'C' => %w(C01 C02 C03 C04 C05 C06 C07 C08 C09 C10 C11 C12 C13 C14 C21 C22 C23 C25 C30 C40 C99),
65
+ 'C01' => %w(C01B C01C C01D C01F C01G),
66
+ 'C02' => ['C02F'],
67
+ 'C03' => %w(C03B C03C),
68
+ 'C04' => ['C04B'],
69
+ 'C05' => %w(C05B C05C C05D C05F C05G),
70
+ 'C06' => %w(C06B C06C C06D C06F),
71
+ 'C07' => %w(C07B C07C C07D C07F C07G C07H C07J C07K),
72
+ 'C08' => %w(C08B C08C C08F C08G C08H C08J C08K C08L),
73
+ 'C09' => %w(C09B C09C C09D C09F C09G C09H C09J C09K),
74
+ 'C10' => %w(C10B C10C C10F C10G C10H C10J C10K C10L C10M C10N),
75
+ 'C11' => %w(C11B C11C C11D),
76
+ 'C12' => %w(C12C C12F C12G C12H C12J C12L C12M C12N C12P C12Q C12R),
77
+ 'C13' => %w(C13B C13K),
78
+ 'C14' => %w(C14B C14C),
79
+ 'C21' => %w(C21B C21C C21D),
80
+ 'C22' => %w(C22B C22C C22F),
81
+ 'C23' => %w(C23C C23D C23F C23G),
82
+ 'C25' => %w(C25B C25C C25D C25F),
83
+ 'C30' => ['C30B'],
84
+ 'C40' => ['C40B'],
85
+ 'C99' => ['C99Z'],
86
+ 'D' => %w(D01 D02 D03 D04 D05 D06 D07 D21 D99),
87
+ 'D01' => %w(D01B D01C D01D D01F D01G D01H),
88
+ 'D02' => %w(D02G D02H D02J),
89
+ 'D03' => %w(D03C D03D D03J),
90
+ 'D04' => %w(D04B D04C D04D D04G D04H),
91
+ 'D05' => %w(D05B D05C),
92
+ 'D06' => %w(D06B D06C D06F D06G D06H D06J D06L D06M D06N D06P D06Q),
93
+ 'D07' => ['D07B'],
94
+ 'D21' => %w(D21B D21C D21D D21F D21G D21H D21J),
95
+ 'D99' => ['D99Z'],
96
+ 'E' => %w(E01 E02 E03 E04 E05 E06 E21 E99),
97
+ 'E01' => %w(E01B E01C E01D E01F E01H),
98
+ 'E02' => %w(E02B E02C E02D E02F),
99
+ 'E03' => %w(E03B E03C E03D E03F),
100
+ 'E04' => %w(E04B E04C E04D E04F E04G E04H),
101
+ 'E05' => %w(E05B E05C E05D E05F E05G),
102
+ 'E06' => %w(E06B E06C),
103
+ 'E21' => %w(E21B E21C E21D E21F),
104
+ 'E99' => ['E99Z'],
105
+ 'F' => %w(F01 F02 F03 F04 F15 F16 F17 F21 F22 F23 F24 F25 F26 F27 F28 F41 F42 F99),
106
+ 'F01' => %w(F01B F01C F01D F01K F01L F01M F01N F01P),
107
+ 'F02' => %w(F02B F02C F02D F02F F02G F02K F02M F02N F02P),
108
+ 'F03' => %w(F03B F03C F03D F03G F03H),
109
+ 'F04' => %w(F04B F04C F04D F04F),
110
+ 'F15' => %w(F15B F15C F15D),
111
+ 'F16' => %w(F16B F16C F16D F16F F16G F16H F16J F16K F16L F16M F16N F16P F16S F16T),
112
+ 'F17' => %w(F17B F17C F17D),
113
+ 'F21' => %w(F21H F21K F21L F21S F21V F21W F21Y),
114
+ 'F22' => %w(F22B F22D F22G),
115
+ 'F23' => %w(F23B F23C F23D F23G F23H F23J F23K F23L F23M F23N F23Q F23R),
116
+ 'F24' => %w(F24B F24C F24D F24F F24H F24J),
117
+ 'F25' => %w(F25B F25C F25D F25J),
118
+ 'F26' => ['F26B'],
119
+ 'F27' => %w(F27B F27D),
120
+ 'F28' => %w(F28B F28C F28D F28F F28G),
121
+ 'F41' => %w(F41A F41B F41C F41F F41G F41H F41J),
122
+ 'F42' => %w(F42B F42C F42D),
123
+ 'F99' => ['F99Z'],
124
+ 'G' => %w(G01 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12 G21 G99),
125
+ 'G01' => %w(G01B G01C G01D G01F G01G G01H G01J G01K G01L G01M G01N G01P G01Q G01R G01S G01T G01V G01W),
126
+ 'G02' => %w(G02B G02C G02F),
127
+ 'G03' => %w(G03B G03C G03D G03F G03G G03H),
128
+ 'G04' => %w(G04B G04C G04D G04F G04G G04R),
129
+ 'G05' => %w(G05B G05D G05F G05G),
130
+ 'G06' => %w(G06C G06D G06E G06F G06G G06J G06K G06M G06N G06Q G06T),
131
+ 'G07' => %w(G07B G07C G07D G07F G07G),
132
+ 'G08' => %w(G08B G08C G08G),
133
+ 'G09' => %w(G09B G09C G09D G09F G09G),
134
+ 'G10' => %w(G10B G10C G10D G10F G10G G10H G10K G10L),
135
+ 'G11' => %w(G11B G11C),
136
+ 'G12' => ['G12B'],
137
+ 'G21' => %w(G21B G21C G21D G21F G21G G21H G21J G21K),
138
+ 'G99' => ['G99Z'],
139
+ 'H' => %w(H01 H02 H03 H04 H05 H99),
140
+ 'H01' => %w(H01B H01C H01F H01G H01H H01J H01K H01L H01M H01P H01Q H01R H01S H01T),
141
+ 'H02' => %w(H02B H02G H02H H02J H02K H02M H02N H02P H02S),
142
+ 'H03' => %w(H03B H03C H03D H03F H03G H03H H03J H03K H03L H03M),
143
+ 'H04' => %w(H04B H04H H04J H04K H04L H04M H04N H04Q H04R H04S H04W),
144
+ 'H05' => %w(H05B H05C H05F H05G H05H H05K),
145
+ 'H99' => ['H99Z'] }
146
+ end
147
+ end
148
+ end
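Because the hierarchy is a flat Hash keyed by class, walking it down to the most specific supported level only needs repeated lookups. The sketch below does this on a trimmed excerpt of the table; the constant `HIERARCHY_EXCERPT` and the helper `leaves_below` are assumptions for illustration.

```ruby
# Trimmed excerpt of the table above; the real 'A' entry lists 16 classes.
HIERARCHY_EXCERPT = {
  'A'   => %w(A45 A99),
  'A45' => %w(A45B A45C A45D A45F),
  'A99' => ['A99Z']
}.freeze

# Hypothetical helper: returns the most specific classes reachable
# from ipc_class. A class without an entry is treated as a leaf.
def leaves_below(map, ipc_class)
  children = map[ipc_class]
  return [ipc_class] if children.nil?
  children.flat_map { |child| leaves_below(map, child) }
end

p leaves_below(HIERARCHY_EXCERPT, 'A')
```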
@@ -0,0 +1,62 @@
1
+ require 'httparty'
2
+ require 'epo/ops/ipc_class_util'
3
+
4
+ module Epo
5
+ module Ops
6
+ # Usually this should only be used internally.
7
+ # Loads the Hierarchy from the WIPO.
8
+ # This is used to update IpcClassHierarchy manually.
9
+ # At the beginning of the year the WIPO publishes a new list of IPC classes.
10
+ # The IpcClassHierarchy should then be updated. Make sure that the url is
11
+ # correct!
12
+ class IpcClassHierarchyLoader
13
+ # loads data from the WIPO
14
+ # @return [Hash]
15
+ def self.load
16
+ load_url
17
+ end
18
+
19
+ private
20
+
21
+ def self.load_url
22
+ url = 'http://www.wipo.int/ipc/itos4ipc/ITSupport_and_download_area/20160101/IPC_scheme_title_list/EN_ipc_section_#letter_title_list_20160101.txt'
23
+
24
+ # There is a file for every letter A-H
25
+ ('A'..'H').inject({}) do |mem, letter|
26
+ # Fetch the file from the server
27
+ response = HTTParty.get(url.gsub('#letter', letter), http_proxyaddr: proxy[:addr], http_proxyport: proxy[:port])
28
+ file = response.body
29
+ mem.merge! process_file(file)
30
+ end
31
+ end
32
+
33
+ def self.process_file(file)
34
+ # Process every line (There is a line for every class entry, name and description are separated by a \t)
35
+ file.each_line.inject(Hash.new { |h, k| h[k] = [] }) do |mem, line|
36
+ next mem if line.to_s.strip.empty?
37
+ ipc_class_generic, description = line.split("\t")
38
+
39
+ # Some entries in the files have the same ipc class, the first line is
40
+ # just some kind of headline, the second is the description we want.
41
+ ipc_class = Epo::Ops::IpcClassUtil.parse_generic_format(ipc_class_generic)
42
+ if ipc_class.length == 3
43
+ mem[ipc_class[0]] << ipc_class
44
+ elsif ipc_class.length == 4
45
+ mem[ipc_class[0, 3]] << ipc_class
46
+ end
47
+ mem
48
+ end
49
+ end
50
+
51
+ def self.proxy
52
+ # configure proxy
53
+ proxy_addr = nil
54
+ proxy_port = nil
55
+ unless ENV['http_proxy'].to_s.strip.empty?
56
+ proxy_addr, proxy_port = ENV['http_proxy'].gsub('http://', '').gsub('/', '').split(':')
57
+ end
58
+ { addr: proxy_addr, port: proxy_port }
59
+ end
60
+ end
61
+ end
62
+ end
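The grouping step of the loader can be tried without network access. The following sketch groups tab-separated lines by their parent class; the helper name `group_by_parent` and the sample titles are made up, and the codes are assumed to be already in short form (the real files use the longer generic format). It uses `each_with_object`, so skipping a blank line with a bare `next` cannot replace the accumulator with `nil`, which is a known pitfall with `inject`.

```ruby
# Hypothetical helper: maps each parent class to its direct children.
# Three-character codes are filed under their section letter,
# four-character codes under their three-character class.
def group_by_parent(text)
  text.each_line.each_with_object(Hash.new { |h, k| h[k] = [] }) do |line, mem|
    next if line.strip.empty?
    code = line.split("\t").first.strip
    case code.length
    when 3 then mem[code[0]] << code
    when 4 then mem[code[0, 3]] << code
    end
  end
end

# Sample input with invented titles.
sample = "A01\tSAMPLE CLASS TITLE\nA01B\tSAMPLE SUBCLASS\n\nA01C\tANOTHER SUBCLASS\n"
p group_by_parent(sample)
```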
@@ -0,0 +1,73 @@
1
+ require 'epo/ops/ipc_class_hierarchy'
2
+
3
+ module Epo
4
+ module Ops
5
+ # Utility functions to work on Strings representing ipc classes.
6
+ class IpcClassUtil
7
+
8
+ # @return [Array] \['A', 'B', …, 'H'\]
9
+ def self.main_classes
10
+ %w( A B C D E F G H )
11
+ end
12
+
13
+ # check if the given ipc_class is valid as OPS search parameter
14
+ # @param [String] ipc_class an ipc class
15
+ # @return [Boolean]
16
+ def self.valid_for_search?(ipc_class)
17
+ ipc_class.match(/\A[A-H](\d{2}([A-Z](\d{1,2}\/\d{2,3})?)?)?\z/)
18
+ end
19
+
20
+ # There is a generic format for ipc classes that does not have
21
+ # the / as delimiter and leaves space for additions. This parses
22
+ # it into the format the register search understands
23
+ # @param [String] generic ipc class in generic format
24
+ # @return [String] reformatted ipc class
25
+ # @example
26
+ # parse_generic_format('A01B0003140000') #=> 'A01B3/14'
27
+ def self.parse_generic_format(generic)
28
+ ipc_class = generic
29
+ if ipc_class.length > 4
30
+ match = ipc_class.match(/([A-Z]\d{2}[A-Z])(\d{4})(\d{6})$/)
31
+ ipc_class = match[1] + (match[2].to_i).to_s + '/' + process_number(match[3])
32
+ end
33
+ ipc_class
34
+ end
35
+
36
+ # @param [String] ipc_class an ipc_class
37
+ # @return [Array] List of all ipc classes one level more specific.
38
+ # @example
39
+ # children('A') #=> ['A01', 'A21', 'A22', 'A23', ...]
40
+ # children('A62') #=> ['A62B', 'A62C', 'A62D']
41
+ # @raise [InvalidIpcClassError] if parameter is not a valid ipc class in
42
+ # the format EPO understands
43
+ # @raise [LevelNotSupportedError] for parameters with ipc class depth >= 3
44
+ # e.g. 'A62B' cannot be split further. It is currently not necessary to
45
+ # do so, it would only blow up the gem, and you do not want to query for
46
+ # all classes at the lowest level, as it takes too many requests.
47
+ def self.children(ipc_class)
48
+ return main_classes if ipc_class.nil?
49
+ valid = valid_for_search?(ipc_class)
50
+ fail InvalidIpcClassError, ipc_class unless valid
51
+ map = IpcClassHierarchy::Hierarchy
52
+ fail LevelNotSupportedError, ipc_class unless map.key? ipc_class
53
+ map[ipc_class]
54
+ end
55
+
56
+ # An ipc class in invalid format was given, or none at all.
57
+ class InvalidIpcClassError < StandardError; end
58
+ # It is currently not supported to split by the most specific class level.
59
+ # This would result in a large amount of requests.
60
+ class LevelNotSupportedError < StandardError; end
61
+
62
+ private
63
+
64
+ def self.process_number(number)
65
+ result = number.gsub(/0+$/, '')
66
+ result += '0' if result.length == 1
67
+ result = '00' if result.length == 0
68
+
69
+ result
70
+ end
71
+ end
72
+ end
73
+ end
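The conversion from the generic 14-character format and the search-format check can be exercised in isolation. The sketch below re-implements both under the same rules as the code above; the names `to_search_format`, `subgroup` and `SEARCHABLE` are hypothetical, and unlike the original it raises `ArgumentError` when a long input does not fit the expected layout.

```ruby
# Pattern copied from valid_for_search? above.
SEARCHABLE = /\A[A-H](\d{2}([A-Z](\d{1,2}\/\d{2,3})?)?)?\z/

# Drops trailing zeros from the six-digit subgroup, keeping at least two digits.
def subgroup(digits)
  digits.sub(/0+\z/, '').ljust(2, '0')
end

# Converts e.g. 'A01B0003140000' to 'A01B3/14'; short codes pass through.
def to_search_format(generic)
  return generic if generic.length <= 4
  m = generic.match(/\A([A-Z]\d{2}[A-Z])(\d{4})(\d{6})\z/)
  raise ArgumentError, "unexpected format: #{generic}" unless m
  "#{m[1]}#{m[2].to_i}/#{subgroup(m[3])}"
end

converted = to_search_format('A01B0003140000')
puts converted
puts SEARCHABLE.match?(converted)
```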
@@ -3,140 +3,115 @@ require 'epo/ops/client'
3
3
  require 'epo/ops/util'
4
4
  require 'epo/ops/bibliographic_document'
5
5
  require 'epo/ops/logger'
6
+ require 'epo/ops/ipc_class_util'
6
7
 
7
8
  module Epo
8
9
  module Ops
9
- # Access to the {http://ops.epo.org/3.2/rest-services/register register}
10
+ # Access to the {http://ops.epo.org/3.1/rest-services/register register}
10
11
  # endpoint of the EPO OPS API.
11
12
  #
12
13
  # By now you can search and retrieve patents by using the type `application`
13
14
  # in the `epodoc` format.
14
15
  #
15
16
  # Search queries are limited by size, not following these limits
16
- # will result in errors.
17
+ # will result in errors. You should probably use {.search} which handles the
18
+ # limits itself.
19
+ #
20
+ # For more fine-grained control use {.raw_search} and {.raw_biblio}
17
21
  #
18
22
  # @see Limits
19
23
  # @see SearchQueryBuilder
20
24
  class Register
21
- # Helper class that assists in building the queries necessary to search
22
- # for more patents than possible with one query respecting the given
23
- # limits.
25
+ # A helper method which creates queries that take API limits into account.
26
+ # @param patent_count [Integer] number of overall results expected.
27
+ # See {.published_patents_count}
24
28
  #
25
- # @see Limits
26
- class Bulk
27
- # Helper method returning all unique register references on a given
28
- # date. This is the same as executing all queries from {.all_queries}
29
- # and making the results unique.
30
- #
31
- # @note Patents may have more than one IPC class, they would appear
32
- # more than once, this method filters these by `doc_number`
33
- def self.all_register_references(date)
34
- begin
35
- queries = Bulk.all_queries(date)
36
- search_entries = queries.flat_map do |query|
37
- Register.search(query)
38
- end
39
- rescue ::Epo::Ops::Error::NotFound
40
- return []
41
- end
42
- search_entries.map(&:application_reference)
43
- .uniq(&:doc_number)
29
+ # @return [Array] of Strings, each a query to put into {Register.raw_search}
30
+ # @see Epo::Ops::Limits
31
+ def self.split_by_size_limits(ipc_class, date, patent_count)
32
+ max_interval = Limits::MAX_QUERY_INTERVAL
33
+ (1..patent_count).step(max_interval).map do |start|
34
+ range_end = [start + max_interval - 1, patent_count].min
35
+ Epo::Ops::SearchQueryBuilder.build(ipc_class, date, start, range_end)
44
36
  end
37
+ end
45
38
 
46
- # Build the queries to search for all patents on a given date.
47
- #
48
- # The offset of EPOs register search may at max be 2000, if more patents
49
- # are published on one day the queries must be split; here across the
50
- # first level of ipc classification.
51
- # At time of this writing they are mostly below 1000, there should be
52
- # plenty of space for now.
53
- #
54
- # In case the limits change, they can be found in {Epo::Ops::Limits}
55
- # Should there be more than 2000 patents in one class, a message will
56
- # be logged, please file an Issue if that happens.
57
- #
58
- # @return [Array] containing all queries to put into {Register.search}.
59
- # @note The queries are split by IPC-classes if necessary; Patents may
60
- # have more than one, you might get multiple references to the same
61
- # patent.
62
- # @see .all_register_references
63
- def self.all_queries(date)
64
- overall_count = published_patents_count(date)
65
- if overall_count > Limits::MAX_QUERY_RANGE
66
- patent_count_by_ipc_classes(date).flat_map do |ipc_class, count|
67
- builder = SearchQueryBuilder.new
68
- .publication_date(date.year, date.month, date.day)
69
- .and
70
- .ipc_class(ipc_class)
71
- split_by_size_limits(builder, count)
72
- end
73
- else
74
- builder = SearchQueryBuilder.new
75
- .publication_date(date.year, date.month, date.day)
76
- split_by_size_limits(builder, overall_count)
77
- end
39
+ # Makes the requests to find how many patents are in each top
40
+ # level ipc class on a given date.
41
+ #
42
+ # @param date [Date] date on which patents should be counted
43
+ # @return [Hash] Hash ipc_class => count (ipc_class A-H)
44
+ def self.patent_counts_per_ipc_class(date)
45
+ %w( A B C D E F G H ).inject({}) do |mem, icc|
46
+ mem[icc] = published_patents_counts(icc, date)
47
+ mem
78
48
  end
49
+ end
79
50
 
80
- # @return [Hash] For all top level IPC classes (A-H) => count
81
- def self.patent_count_by_ipc_classes(date)
82
- ipc_classes = %w(A B C D E F G H)
83
- ipc_classes.inject({}) do |mem, ipcc|
84
- mem[ipcc] = published_patents_count(date, ipcc)
85
- if mem[ipcc] > Limits::MAX_QUERY_RANGE
86
- Logger.log("IPC class #{ipcc} has more than #{Epo::Ops::Limits::MAX_QUERY_RANGE} on #{date}. They can not all be retrieved. Please file this as an issue!")
87
- end
88
- mem
89
- end
90
- end
51
+ # @param date [Date]
52
+ # @param ipc_class [String] up to now should only be between A-H
53
+ # @return [Integer] number of patents with given parameters
54
+ def self.published_patents_counts(ipc_class = nil, date = nil)
55
+ query = SearchQueryBuilder.build(ipc_class, date, 1, 2)
56
+ minimum_result_set = Register.raw_search(query, true)
57
+ return 0 if minimum_result_set.empty?
58
+ minimum_result_set['world_patent_data']['register_search']['total_result_count'].to_i
59
+ end
91
60
 
92
- # Splits the queries build by `query_builder` by the allowed intervals.
93
- #
94
- # @param query_builder [SearchQueryBuilder] with all settings made, but
95
- # not built yet.
96
- # @param patent_count [Integer] number of overall results expected.
97
- # See {.published_patents_count}
98
- #
99
- # @return [Array] of Strings, each a query to put into {Register.search}
100
- def self.split_by_size_limits(query_builder, patent_count)
101
- max_interval = Limits::MAX_QUERY_INTERVAL
102
- (1..patent_count).step(max_interval).map do |start|
103
- query_builder.build(start, [start + max_interval - 1, patent_count].min)
104
- end
105
- end
61
+ # Search method returning all unique register references on a given
62
+ # date, with optional ipc_class.
63
+ # @note This method does more than one query; it may happen that you
64
+ # exceed your API limits
65
+ # @return [Array] Array of {SearchEntry}
66
+ def self.search(ipc_class = nil, date = nil)
67
+ queries = all_queries(ipc_class, date)
68
+ search_entries = queries.flat_map { |query| raw_search(query) }
69
+ search_entries.uniq { |se| se.application_reference.epodoc_reference }
70
+ end
106
71
 
107
- # makes a minimum request to find out how many patents are published on
108
- # that date
109
- #
110
- # @return [Integer] number of patents on that date.
111
- def self.published_patents_count(date, ipc_class = nil)
112
- query = SearchQueryBuilder.new
113
- query.publication_date(date.year, date.month, date.day)
114
- query.and.ipc_class(ipc_class) if ipc_class
115
- query = query.build(1, 2)
116
- minimum_result_set = Register.search(query, true)
117
- return 0 if minimum_result_set.empty?
118
- minimum_result_set['world_patent_data']['register_search']['total_result_count'].to_i
72
+ # @return [Array] Array of Strings containing queries applicable to
73
+ # {Register.raw_search}.
74
+ # builds all queries necessary to find all patent references on a given
75
+ # date.
76
+ def self.all_queries(ipc_class = nil, date = nil)
77
+ count = published_patents_counts(ipc_class, date)
78
+ if count > Limits::MAX_QUERY_RANGE
79
+ IpcClassUtil.children(ipc_class).flat_map { |ic| all_queries(ic, date) }
80
+ else
81
+ split_by_size_limits(ipc_class, date, count)
119
82
  end
120
83
  end
121
84
 
122
85
  # @param query A query built with {Epo::Ops::SearchQueryBuilder}
123
- # @param raw if `true` the result will be the raw response as a nested hash.
124
- # if false(default) the result will be parsed further, returning a list of [SearchEntry]
86
+ # @param raw if `true` the result will be the raw response as a nested
87
+ # hash. if false(default) the result will be parsed further, returning a
88
+ # list of [SearchEntry]
125
89
  # @return [Array] containing {SearchEntry}
126
- def self.search(query, raw = false)
127
- hash = Client.request(:get, register_api_string + query).parsed
90
+ def self.raw_search(query, raw = false)
91
+ hash = Client.request(:get, register_api_string + 'search?' + query).parsed
128
92
  return parse_search_results(hash) unless raw
129
93
  hash
94
+ rescue Epo::Ops::Error::NotFound
95
+ []
96
+ end
97
+
98
+ # @param search_entry [SearchEntry] a search entry which should be
99
+ # retrieved.
100
+ # @return [BibliographicDocument] a parsed document.
101
+ def self.biblio(search_entry)
102
+ raw_biblio(search_entry.application_reference.epodoc_reference)
130
103
  end
131
104
 
132
- # @param format epodoc is a format defined by the EPO for a
105
+ # @param reference_id [String] identifier for document. Format similar to
106
+ # EP1000000
107
+ # @param format [String] epodoc is a format defined by the EPO for a
133
108
  # document id. see their documentation.
134
- # @param type may be `application` or `publication` make sure that the
135
- # `reference_id` is matching
136
- # @param raw flag if the result should be returned as a raw Hash or
137
- # parsed as {BibliographicDocument}
109
+ # @param type [String] may be `application` or `publication` make sure
110
+ # that the `reference_id` is matching
111
+ # @param raw [Boolean] flag if the result should be returned as a raw Hash
112
+ # or parsed as {BibliographicDocument}
138
113
  # @return [BibliographicDocument, Hash]
139
- def self.biblio(reference_id, type = 'application', format = 'epodoc', raw = false)
114
+ def self.raw_biblio(reference_id, type = 'application', format = 'epodoc', raw = false)
140
115
  request = "#{register_api_string}#{type}/#{format}/#{reference_id}/biblio"
141
116
  result = Client.request(:get, request).parsed
142
117
  raw ? result : BibliographicDocument.new(result)
@@ -170,7 +145,7 @@ module Epo
170
145
  end
171
146
 
172
147
  def self.register_api_string
173
- "/#{Epo::Ops::API_VERSION}/rest-services/register/"
148
+ '/3.1/rest-services/register/'
174
149
  end
175
150
  end
176
151
  end
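The query planning in `all_queries` and `split_by_size_limits` combines two ideas: descend into child classes while a class holds more results than can be paged through, then cut the remaining count into fixed-size ranges. Here is a sketch of both steps on canned counts. The limit values 100 and 2000 are assumptions (taken from the README example range and the removed comment about a maximum offset of 2000), and `split_ranges` and `plan` are hypothetical names.

```ruby
MAX_QUERY_INTERVAL = 100  # assumed page size
MAX_QUERY_RANGE = 2000    # assumed maximum reachable offset

# Cuts 1..total into consecutive [first, last] pairs of at most `interval` items.
def split_ranges(total, interval = MAX_QUERY_INTERVAL)
  (1..total).step(interval).map do |start|
    [start, [start + interval - 1, total].min]
  end
end

# Returns [ipc_class, first, last] triples. `counts` and `children`
# are plain Hashes standing in for the API call and the hierarchy.
def plan(ipc_class, counts, children)
  count = counts.fetch(ipc_class, 0)
  if count > MAX_QUERY_RANGE
    children.fetch(ipc_class).flat_map { |child| plan(child, counts, children) }
  else
    split_ranges(count).map { |first, last| [ipc_class, first, last] }
  end
end

p split_ranges(250)
p plan(nil, { nil => 2500, 'A' => 150, 'B' => 0 }, { nil => %w(A B) })
```

Since a patent can carry several IPC classes, results gathered per class may overlap, which is why `search` above deduplicates by `epodoc_reference`.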
@@ -5,39 +5,34 @@ module Epo
5
5
  module Ops
6
6
  # This Builder helps creating a search query using
7
7
  # {https://www.loc.gov/standards/sru/cql/ CQL} (Common Query Language or
8
- # Contextual Query Language) with the identifies specified by the EPO in
9
- # the OPS Documentation chapter 4.2 ( {https://www.epo.org/searching-for-patents/technical/espacenet/ops.html Link}
10
- # - use tab Downloads and see file 'OPS version 3.2 documentation').
11
- # Dont use a builder twice ;)
8
+ # Contextual Query Language) with the identifiers specified by the EPO in
9
+ # the OPS Documentation chapter 4.2 ({https://www.epo.org/searching-for-patents/technical/espacenet/ops.html Link}
10
+ # - use tab Downloads and see file 'OPS version 3.1 documentation').
12
11
  class SearchQueryBuilder
13
- def initialize
14
- @query = 'search?q='
12
+ # Build the query with the given parameters. Invalid ranges are fixed
13
+ # automatically and you will be notified about the changes
14
+ # @return [String]
15
+ def self.build(ipc_class, date, range_start = 1, range_end = nil)
16
+ validated_range = validate_range range_start, range_end
17
+ "q=#{build_params(ipc_class, date)}&Range=#{validated_range[0]}-#{validated_range[1]}"
15
18
  end
16
19
 
17
- def publication_date(year, month, day)
18
- @query << "pd=#{('%04d' % year) << ('%02d' % month) << ('%02d' % day)}"
19
- self
20
- end
20
+ private
21
21
 
22
- def and
23
- @query << ' and '
24
- self
22
+ def self.build_params(ipc_class, date)
23
+ [build_date(date), build_class(ipc_class)].compact.join(' and ')
25
24
  end
26
25
 
27
- def ipc_class(ipc_class)
28
- @query << "ic=#{ipc_class}"
29
- # TODO: ipc_class richtig formatieren
30
- self
26
+ def self.build_date(date)
27
+ if date
28
+ "pd=#{('%04d' % date.year)}"\
29
+ "#{('%02d' % date.month)}"\
30
+ "#{('%02d' % date.day)}"
31
+ end
31
32
  end
32
33
 
33
- # builds the search query ready to put into the register API. The
34
- # parameters are validated with {#validate_range}.
35
- # This does not change the query, several calls will allow you to
36
- # create the same queries for different ranges.
37
- def build(range_start = 1, range_end = nil)
38
- range_end ||= range_start + Limits::MAX_QUERY_INTERVAL - 1
39
- validated_range = validate_range range_start, range_end
40
- @query + "&Range=#{validated_range[0]}-#{validated_range[1]}"
34
+ def self.build_class(ipc_class)
35
+ "ic=#{ipc_class}" if ipc_class
41
36
  end
42
37
 
43
38
  # Fixes the given range so that it meets the EPO API's rules. The range
@@ -46,11 +41,11 @@ module Epo
46
41
  # distance covered.
47
42
  # @see Epo::Ops::Limits
48
43
  # @return array with two elements: [range_start, range_end]
49
- def validate_range(range_start, range_end)
44
+ def self.validate_range(range_start, range_end)
50
45
  if range_start > range_end
51
46
  range_start, range_end = range_end, range_start
52
47
  Logger.log('range_start was bigger than range_end, swapped values')
53
- elsif range_start == range_end || range_end - range_start > Limits::MAX_QUERY_INTERVAL - 1
48
+ elsif range_end - range_start > Limits::MAX_QUERY_INTERVAL - 1
54
49
  range_end = range_start + Limits::MAX_QUERY_INTERVAL - 1
55
50
  Logger.log("range invalid, set to: #{[range_start, range_end]}")
56
51
  end
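
Note on the hunks above: the following is a minimal, standalone sketch of how the class-level `build` and `validate_range` appear to fit together on the 0.2.5 side. It is not the gem's code. `MAX_QUERY_INTERVAL = 100` and the method name `build_query` are assumptions made for illustration (the real `Limits::MAX_QUERY_INTERVAL` value is not shown in this diff), and the default for a missing `range_end` is borrowed from the removed 0.1.6 `build`.

```ruby
require 'date'

# Assumed stand-in for Epo::Ops::Limits::MAX_QUERY_INTERVAL; the real value
# is not part of this diff.
MAX_QUERY_INTERVAL = 100

# Mirrors the branch structure shown above: swap a reversed range, or else
# shrink a range that covers more than MAX_QUERY_INTERVAL entries.
def validate_range(range_start, range_end)
  if range_start > range_end
    range_start, range_end = range_end, range_start
  elsif range_end - range_start > MAX_QUERY_INTERVAL - 1
    range_end = range_start + MAX_QUERY_INTERVAL - 1
  end
  [range_start, range_end]
end

# Hypothetical helper that joins the optional date and IPC class filters with
# ' and ' and appends the validated range.
def build_query(ipc_class, date, range_start = 1, range_end = nil)
  # Assumption: fall back to a full-width range when no end is given,
  # as the removed 0.1.6 instance method did.
  range_end ||= range_start + MAX_QUERY_INTERVAL - 1
  first, last = validate_range(range_start, range_end)
  params = [
    (format('pd=%04d%02d%02d', date.year, date.month, date.day) if date),
    ("ic=#{ipc_class}" if ipc_class)
  ].compact.join(' and ')
  "q=#{params}&Range=#{first}-#{last}"
end

puts build_query('A01B', Date.new(2016, 3, 15))
# prints "q=pd=20160315 and ic=A01B&Range=1-100"
```

Because `compact` drops a `nil` filter before joining, passing only a date or only an IPC class yields a query without a dangling ` and `.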
@@ -8,10 +8,15 @@ module Epo
8
8
  #
9
9
  class TokenStore
10
10
  def token
11
- return generate_token if !@token || @token.expired?
11
+ @token = generate_token if !@token || @token.expired?
12
+
12
13
  @token
13
14
  end
14
15
 
16
+ def reset
17
+ @token = nil
18
+ end
19
+
15
20
  protected
16
21
 
17
22
  def generate_token
@@ -19,10 +24,11 @@ module Epo
19
24
  Epo::Ops.config.consumer_key,
20
25
  Epo::Ops.config.consumer_secret,
21
26
  site: 'https://ops.epo.org/',
22
- token_url: "/#{Epo::Ops::API_VERSION}/auth/accesstoken",
27
+ token_url: '/3.1/auth/accesstoken',
23
28
  raise_errors: false
24
29
  )
25
- @token = client.client_credentials.get_token
30
+
31
+ client.client_credentials.get_token
26
32
  end
27
33
  end
28
34
  end
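
Note on the TokenStore hunks above: on the 0.2.5 side `token` stores the result of `generate_token` in `@token`, and `reset` clears it. The sketch below illustrates that memoize-and-reset pattern in isolation. `FakeToken`, `MemoizingTokenStore`, the `ttl` parameter and the `generated` counter are hypothetical names for illustration only; the gem itself works with `OAuth2::AccessToken` objects obtained from the EPO.

```ruby
# Hypothetical token object standing in for OAuth2::AccessToken.
FakeToken = Struct.new(:value, :expires_at) do
  def expired?
    Time.now >= expires_at
  end
end

# Illustrative store: caches one token and replaces it once it has expired
# or after an explicit reset.
class MemoizingTokenStore
  attr_reader :generated

  def initialize(ttl = 60)
    @ttl = ttl
    @generated = 0
  end

  def token
    @token = generate_token if !@token || @token.expired?

    @token
  end

  def reset
    @token = nil
  end

  protected

  # Returns a fresh token and leaves the caching to #token.
  def generate_token
    @generated += 1
    FakeToken.new("token-#{@generated}", Time.now + @ttl)
  end
end

store = MemoizingTokenStore.new
first = store.token
second = store.token   # served from the cached instance
store.reset
third = store.token    # a new token is generated after the reset
```

Keeping the assignment in `token` rather than in `generate_token` means a subclass can override `generate_token` and simply return a value, which is how the Redis-backed store in the next hunk uses `super`.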
@@ -13,20 +13,32 @@ module Epo
13
13
 
14
14
  def token
15
15
  token = nil
16
- @redis.conn do |conn|
16
+ @redis.with do |conn|
17
17
  token = conn.get("epo_token_#{id}")
18
18
  end
19
19
 
20
20
  token.present? ? OAuth2::AccessToken.new(client, token) : generate_token
21
21
  end
22
22
 
23
+ def reset
24
+ @redis.with do |conn|
25
+ conn.del("epo_token_#{id}")
26
+ end
27
+ end
28
+
23
29
  private
24
30
 
31
+ def id
32
+ Digest::MD5.hexdigest(Epo::Ops.config.consumer_key + Epo::Ops.config.consumer_secret)
33
+ end
34
+
25
35
  def generate_token
26
- super
27
- Sidekiq.redis do |conn|
36
+ token = super
37
+
38
+ @redis.with do |conn|
28
39
  conn.set("epo_token_#{id}", token.token, ex: token.expires_in, nx: true)
29
40
  end
41
+
30
42
  token
31
43
  end
32
44
  end
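
Note on the Redis store hunk above: the added `id` method derives the key suffix from the configured credentials, so each key/secret pair gets its own cache entry. A minimal sketch of that key derivation follows; `token_cache_key` is a hypothetical helper name, and the credential strings are made-up sample values.

```ruby
require 'digest'

# Hypothetical helper: builds the Redis key in the "epo_token_<md5>" form
# used above, from a consumer key and secret.
def token_cache_key(consumer_key, consumer_secret)
  "epo_token_#{Digest::MD5.hexdigest(consumer_key + consumer_secret)}"
end

key_a = token_cache_key('sample-key', 'sample-secret')
key_b = token_cache_key('other-key', 'sample-secret')

# Plain concatenation has no separator, so these two pairs share one key.
same_1 = token_cache_key('ab', 'c')
same_2 = token_cache_key('a', 'bc')
```

The digest keeps the raw credentials out of the key name. Since the two strings are concatenated without a delimiter, different splits of the same characters map to the same key, which is unlikely to matter for real credentials but is worth knowing.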
@@ -1,6 +1,5 @@
1
1
  module Epo
2
2
  module Ops
3
- VERSION = '0.1.6'.freeze
4
- API_VERSION = '3.2'.freeze
3
+ VERSION = '0.2.5'.freeze
5
4
  end
6
5
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: epo-ops
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.6
4
+ version: 0.2.5
5
5
  platform: ruby
6
6
  authors:
7
7
  - Max Kießling
@@ -10,7 +10,7 @@ authors:
10
10
  autorequire:
11
11
  bindir: exe
12
12
  cert_chain: []
13
- date: 2017-06-26 00:00:00.000000000 Z
13
+ date: 2016-03-15 00:00:00.000000000 Z
14
14
  dependencies:
15
15
  - !ruby/object:Gem::Dependency
16
16
  name: bundler
@@ -54,6 +54,20 @@ dependencies:
54
54
  - - "~>"
55
55
  - !ruby/object:Gem::Version
56
56
  version: '5.8'
57
+ - !ruby/object:Gem::Dependency
58
+ name: mocha
59
+ requirement: !ruby/object:Gem::Requirement
60
+ requirements:
61
+ - - ">="
62
+ - !ruby/object:Gem::Version
63
+ version: '0'
64
+ type: :development
65
+ prerelease: false
66
+ version_requirements: !ruby/object:Gem::Requirement
67
+ requirements:
68
+ - - ">="
69
+ - !ruby/object:Gem::Version
70
+ version: '0'
57
71
  - !ruby/object:Gem::Dependency
58
72
  name: vcr
59
73
  requirement: !ruby/object:Gem::Requirement
@@ -96,6 +110,34 @@ dependencies:
96
110
  - - ">="
97
111
  - !ruby/object:Gem::Version
98
112
  version: '0'
113
+ - !ruby/object:Gem::Dependency
114
+ name: redis
115
+ requirement: !ruby/object:Gem::Requirement
116
+ requirements:
117
+ - - ">="
118
+ - !ruby/object:Gem::Version
119
+ version: '0'
120
+ type: :development
121
+ prerelease: false
122
+ version_requirements: !ruby/object:Gem::Requirement
123
+ requirements:
124
+ - - ">="
125
+ - !ruby/object:Gem::Version
126
+ version: '0'
127
+ - !ruby/object:Gem::Dependency
128
+ name: connection_pool
129
+ requirement: !ruby/object:Gem::Requirement
130
+ requirements:
131
+ - - ">="
132
+ - !ruby/object:Gem::Version
133
+ version: '0'
134
+ type: :development
135
+ prerelease: false
136
+ version_requirements: !ruby/object:Gem::Requirement
137
+ requirements:
138
+ - - ">="
139
+ - !ruby/object:Gem::Version
140
+ version: '0'
99
141
  - !ruby/object:Gem::Dependency
100
142
  name: oauth2
101
143
  requirement: !ruby/object:Gem::Requirement
@@ -110,6 +152,20 @@ dependencies:
110
152
  - - "~>"
111
153
  - !ruby/object:Gem::Version
112
154
  version: '1.1'
155
+ - !ruby/object:Gem::Dependency
156
+ name: httparty
157
+ requirement: !ruby/object:Gem::Requirement
158
+ requirements:
159
+ - - "~>"
160
+ - !ruby/object:Gem::Version
161
+ version: '0.13'
162
+ type: :runtime
163
+ prerelease: false
164
+ version_requirements: !ruby/object:Gem::Requirement
165
+ requirements:
166
+ - - "~>"
167
+ - !ruby/object:Gem::Version
168
+ version: '0.13'
113
169
  description: This gem allows simple access to the European Patent Offices (EPO) Open
114
170
  Patent Services (OPS) using their XML-API
115
171
  email:
@@ -129,6 +185,9 @@ files:
129
185
  - lib/epo/ops/bibliographic_document.rb
130
186
  - lib/epo/ops/client.rb
131
187
  - lib/epo/ops/error.rb
188
+ - lib/epo/ops/ipc_class_hierarchy.rb
189
+ - lib/epo/ops/ipc_class_hierarchy_loader.rb
190
+ - lib/epo/ops/ipc_class_util.rb
132
191
  - lib/epo/ops/limits.rb
133
192
  - lib/epo/ops/logger.rb
134
193
  - lib/epo/ops/rate_limit.rb
@@ -157,7 +216,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
157
216
  version: '0'
158
217
  requirements: []
159
218
  rubyforge_project:
160
- rubygems_version: 2.6.11
219
+ rubygems_version: 2.5.1
161
220
  signing_key:
162
221
  specification_version: 4
163
222
  summary: Ruby interface to the European Patent Office API (OPS)