rospatent 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c95c34be848baa76a4286b680399c4b76a10b5d4e699776c9d6d735ccbcf4e07
-   data.tar.gz: fdc55a2649fbaa37d4af103929a1cd25747aae7a409512e052da187b832197ef
+   metadata.gz: f0fd09ca7a6d13e73a4ec4af7f9e06e4e4a5c640607b551c614289898d83a177
+   data.tar.gz: 76cca6feb25bf64b8c069059470f87e7aef159331f93a0fccb2e7147df69cbab
  SHA512:
-   metadata.gz: cdf8239372f22f997aaca78ec400574a6ad26e470f2827e8c465aba6e6a15958c6f3caf6efa602245d58be3f7612929da720cfb77ffe12a871bb77878b084bc1
-   data.tar.gz: d852ecfd4103fb1d4bd22b6bc64b1c22f5e63ea8329c01dabd554e8503f647b7f049680fc35ff43d5b30e5e3ca7d94e3f6daf83ea6446d3f6d20e4debe51ed54
+   metadata.gz: 1022920b2f35d2db3558c6df5668db8421df46bbdba2ee13a40d5119a36ca5e3e4a4fb33d8432a10f088fa8dfcb9d1333a7a9b05f4c1ba287ddc8887cf3f32b8
+   data.tar.gz: 8ce447fae3280ee2d0a4ff83ab481c8a47321bddcb2cbd310c557d01700635ea271079dcda98f4993d312573d3ec29404a02377743396c2b38ac2f60498d5593
data/CHANGELOG.md CHANGED
@@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [1.2.0] - 2025-06-04
+
+ ### Added
+ - Binary data support for `patent_media` and `patent_media_by_id` methods to properly handle PDF, image, and other media files
+ - New `binary` parameter for `get` method to distinguish between JSON and binary responses
+ - New `handle_binary_response` method for processing binary API responses with proper error handling
+ - Russian patent number formatting with automatic zero-padding to 10 digits
+ - New `format_publication_number` private method for consistent number formatting across media methods
+
+ ### Fixed
+ - API endpoint paths for `classification_search` and `classification_code` methods now include trailing slashes to prevent 404 errors
+ - Binary data corruption issue when downloading patent media files (PDFs, images) through media endpoints
+
+ ### Changed
+ - Enhanced test coverage for binary data handling and publication number formatting
+ - Updated README documentation for classification search examples and dataset display conventions
+ - Improved error handling consistency for binary vs JSON responses
+
  ## [1.1.0] - 2025-06-04

  ### Added
data/README.md CHANGED
@@ -230,20 +230,20 @@ Search within patent classification systems (IPC and CPC) and get detailed infor
  ```ruby
  # Search for classification codes related to rockets in IPC
  ipc_results = client.classification_search("ipc", query: "ракета", lang: "ru")
- puts "Found #{ipc_results['total']} IPC codes"
+ puts "Found #{ipc_results.size} IPC codes"

- ipc_results["hits"]&.each do |hit|
-   puts "#{hit['code']}: #{hit['description']}"
+ ipc_results&.each do |result|
+   puts "#{result['Code']}: #{result['Description']}"
  end

  # Search for rocket-related codes in CPC using English
  cpc_results = client.classification_search("cpc", query: "rocket", lang: "en")

  # Get detailed information about a specific classification code
- code_info = client.classification_code("ipc", code: "F02K9/00", lang: "ru")
- puts "Code: #{code_info['code']}"
- puts "Description: #{code_info['description']}"
- puts "Hierarchy: #{code_info['hierarchy']&.join(' → ')}"
+ code, info = client.classification_code("ipc", code: "F02K9/00", lang: "ru")&.first
+ puts "Code: #{code}"
+ puts "Description: #{info&.first['Description']}"
+ puts "Hierarchy: #{info&.map{|level| level['Code']}&.join(' → ')}"

  # Get CPC code information in English
  cpc_info = client.classification_code("cpc", code: "B63H11/00", lang: "en")
@@ -279,9 +279,11 @@ pdf_data = client.patent_media_by_id(
  )

  # Get available datasets
- datasets = client.datasets_tree
- datasets.each do |dataset|
-   puts "#{dataset['id']}: #{dataset['name']}"
+ datasets.each do |category|
+   puts "Category: #{category['name_en']}"
+   category.children.each do |dataset|
+     puts " #{dataset['id']}: #{dataset['name_en']}"
+   end
  end
  ```

@@ -617,6 +619,7 @@ The library uses **Faraday** as the HTTP client with redirect support for all en
  - **Similar Patents by Text**: Occasionally returns `503 Service Unavailable` (a server-side issue, not a client implementation issue)
  ⚠️ **Documentation inconsistencies**:
  - **Similar Patents**: According to the documentation, the array of hits is named `hits`, but the real implementation uses the name `data`
+ - **Available Datasets**: The `name` key in the real implementation has the localization suffix — `name_ru`, `name_en`

  All core functionality works perfectly and is production-ready with a unified HTTP approach.

@@ -920,20 +923,20 @@ end
  ```ruby
  # Поиск классификационных кодов, связанных с ракетами в IPC
  ipc_results = client.classification_search("ipc", query: "ракета", lang: "ru")
- puts "Найдено #{ipc_results['total']} кодов IPC"
+ puts "Найдено #{ipc_results.size} кодов IPC"

- ipc_results["hits"]&.each do |hit|
-   puts "#{hit['code']}: #{hit['description']}"
+ ipc_results&.each do |result|
+   puts "#{result['Code']}: #{result['Description']}"
  end

  # Поиск кодов, связанных с ракетами в CPC на английском
  cpc_results = client.classification_search("cpc", query: "rocket", lang: "en")

  # Получение подробной информации о конкретном классификационном коде
- code_info = client.classification_code("ipc", code: "F02K9/00", lang: "ru")
- puts "Код: #{code_info['code']}"
- puts "Описание: #{code_info['description']}"
- puts "Иерархия: #{code_info['hierarchy']&.join(' → ')}"
+ code, info = client.classification_code("ipc", code: "F02K9/00", lang: "ru")&.first
+ puts "Код: #{code}"
+ puts "Описание: #{info&.first['Description']}"
+ puts "Иерархия: #{info&.map{|level| level['Code']}&.join(' → ')}"

  # Получение информации о коде CPC на английском
  cpc_info = client.classification_code("cpc", code: "B63H11/00", lang: "en")
@@ -970,8 +973,11 @@ pdf_data = client.patent_media_by_id(

  # Получение доступных датасетов
  datasets = client.datasets_tree
- datasets.each do |dataset|
-   puts "#{dataset['id']}: #{dataset['name']}"
+ datasets.each do |category|
+   puts "Категория: #{category['name_ru']}"
+   category.children.each do |dataset|
+     puts " #{dataset['id']}: #{dataset['name_ru']}"
+   end
  end
  ```

@@ -1096,6 +1102,7 @@ end
  - **Поиск похожих патентов по тексту**: Иногда возвращает `503 Service Unavailable` (проблема сервера, не клиентской реализации)
  ⚠️ **Неточности документации**:
  - **Поиск похожих патентов**: Массив совпадений в документации назван `hits`, фактическая реализация использует `data`
+ - **Перечень датасетов**: Ключ `name` в фактической реализации содержит признак локализации — `name_ru`, `name_en`

  Вся основная функциональность реализована и готова для продакшена.

@@ -199,7 +199,7 @@ module Rospatent
        @logger.log_cache("miss", cache_key)

        # Make the API request
-       result = get("/patsearch/v0.2/datasets/tree")
+       result = get("/patsearch/v0.2/datasets/tree", {})

        # Cache the result for longer since datasets don't change often
        @cache.set(cache_key, result, ttl: 3600) # Cache for 1 hour
@@ -230,19 +230,22 @@ module Rospatent
        # Format publication date
        formatted_date = validated_date.strftime("%Y/%m/%d")

+       # Format publication number with appropriate padding
+       formatted_number = format_publication_number(validated_number, validated_country)
+
        # Construct the path
        path = "/media/#{validated_collection}/#{validated_country}/" \
-              "#{validated_doc_type}/#{formatted_date}/#{validated_number}/" \
+              "#{validated_doc_type}/#{formatted_date}/#{formatted_number}/" \
               "#{validated_filename}"

-       # Make a GET request to retrieve the media file
-       get(path)
+       # Get binary data
+       get(path, {}, binary: true)
      end

-     # Simplified method to retrieve media data by patent ID and collection ID
-     # @param document_id [String] The patent document ID (e.g., "RU134694U1_20131120")
-     # @param collection_id [String] Dataset/collection identifier (e.g., "National")
-     # @param filename [String] Media file name (e.g., "document.pdf")
+     # Retrieve media using simplified patent ID format
+     # @param document_id [String] Patent document ID (e.g., "RU134694U1_20131120")
+     # @param collection_id [String] Collection identifier (e.g., "National")
+     # @param filename [String] Filename to retrieve (e.g., "document.pdf")
      # @return [String] Binary content of the requested file
      # @raise [Rospatent::Errors::InvalidRequestError] If document_id format is invalid
      #   or parameters are missing
@@ -258,9 +261,12 @@ module Rospatent
        # Format the date from YYYYMMDD to YYYY/MM/DD
        formatted_date = id_parts[:date].gsub(/^(\d{4})(\d{2})(\d{2})$/, '\1/\2/\3')

+       # Format publication number with appropriate padding
+       formatted_number = format_publication_number(id_parts[:number], id_parts[:country_code])
+
        # Call the base method with extracted components
        patent_media(validated_collection, id_parts[:country_code], id_parts[:doc_type],
-                    formatted_date, id_parts[:number], validated_filename)
+                    formatted_date, formatted_number, validated_filename)
      end

      # Extract and parse the abstract content from a patent document
@@ -334,7 +340,7 @@ module Rospatent
        }

        # Make a POST request to the classification search endpoint
-       result = post("/patsearch/v0.2/classification/#{validated_classifier}/search", payload)
+       result = post("/patsearch/v0.2/classification/#{validated_classifier}/search/", payload)

        # Cache the result
        @cache.set(cache_key, result, ttl: 1800) # Cache for 30 minutes
@@ -374,7 +380,7 @@ module Rospatent
        }

        # Make a POST request to the classification code endpoint
-       result = post("/patsearch/v0.2/classification/#{validated_classifier}/code", payload)
+       result = post("/patsearch/v0.2/classification/#{validated_classifier}/code/", payload)

        # Cache the result for longer since classification codes don't change often
        @cache.set(cache_key, result, ttl: 3600) # Cache for 1 hour
@@ -386,8 +392,9 @@ module Rospatent
      # Execute a GET request to the API
      # @param endpoint [String] API endpoint
      # @param params [Hash] Query parameters (optional)
-     # @return [Hash] Response data
-     def get(endpoint, params = {})
+     # @param binary [Boolean] Whether to expect binary response (default: false)
+     # @return [Hash, String] Response data (Hash for JSON, String for binary)
+     def get(endpoint, params = {}, binary: false)
        start_time = Time.now
        request_id = generate_request_id

@@ -395,8 +402,12 @@ module Rospatent
        @request_count += 1

        response = connection.get(endpoint, params) do |req|
-         req.headers["Accept"] = "application/json"
-         req.headers["Content-Type"] = "application/json"
+         if binary
+           req.headers["Accept"] = "*/*"
+         else
+           req.headers["Accept"] = "application/json"
+           req.headers["Content-Type"] = "application/json"
+         end
          req.headers["X-Request-ID"] = request_id
        end

@@ -406,7 +417,11 @@ module Rospatent
        @logger.log_response("GET", endpoint, response.status, duration,
                             response_size: response.body&.bytesize, request_id: request_id)

-       handle_response(response, request_id)
+       if binary
+         handle_binary_response(response, request_id)
+       else
+         handle_response(response, request_id)
+       end
      rescue Faraday::Error => e
        @logger.log_error(e, { endpoint: endpoint, params: params, request_id: request_id })
        handle_error(e)
@@ -642,6 +657,42 @@ module Rospatent
        end
      end

+     # Process binary API response (for media files)
+     # @param response [Faraday::Response] Raw response from the API
+     # @param request_id [String] Request ID for tracking
+     # @return [String] Binary response data
+     # @raise [Rospatent::Errors::ApiError] If the response is not successful
+     def handle_binary_response(response, request_id = nil)
+       return response.body if response.success?
+
+       # For binary endpoints, error responses might still be JSON
+       error_msg = begin
+         data = JSON.parse(response.body)
+         data["error"] || data["message"] || "Unknown error"
+       rescue JSON::ParserError
+         "Binary request failed"
+       end
+
+       # Create specific error types based on status code
+       case response.status
+       when 401
+         raise Errors::AuthenticationError, "#{error_msg} [Request ID: #{request_id}]"
+       when 404
+         raise Errors::NotFoundError.new("#{error_msg} [Request ID: #{request_id}]", response.status)
+       when 422
+         errors = extract_validation_errors(response)
+         raise Errors::ValidationError.new(error_msg, errors)
+       when 429
+         retry_after = response.headers["Retry-After"]&.to_i
+         raise Errors::RateLimitError.new(error_msg, response.status, retry_after)
+       when 503
+         raise Errors::ServiceUnavailableError.new("#{error_msg} [Request ID: #{request_id}]",
+                                                   response.status)
+       else
+         raise Errors::ApiError.new(error_msg, response.status, response.body, request_id)
+       end
+     end
+
      # Handle connection errors
      # @param error [Faraday::Error] Connection error
      # @raise [Rospatent::Errors::ConnectionError] Wrapped connection error
@@ -695,5 +746,18 @@ module Rospatent
      def generate_request_id
        "req_#{Time.now.to_f}_#{rand(10_000)}"
      end
+
+     # Pad publication number with leading zeros for specific countries
+     # @param number [String] Publication number to pad
+     # @param country_code [String] Country code (e.g., "RU")
+     # @return [String] Padded publication number
+     def format_publication_number(number, country_code)
+       # Russian patents require 10-digit publication numbers
+       if country_code == "RU" && number.length < 10
+         number.rjust(10, "0")
+       else
+         number
+       end
+     end
    end
  end
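
The zero-padding rule introduced by `format_publication_number` can be checked in isolation. The sketch below copies the method body from the hunk above into a standalone top-level method (in the gem it is a private method of the client, so this is not how it would be called in practice) and applies it to a few inputs. The number `134694` is taken from the `RU134694U1_20131120` example ID in the doc comments; the `"US"` case is an illustrative assumption.

```ruby
# Standalone copy of the padding rule from the diff, for illustration only.
# In the gem this is a private method on the client, not a top-level method.
def format_publication_number(number, country_code)
  # Russian patents require 10-digit publication numbers
  if country_code == "RU" && number.length < 10
    number.rjust(10, "0")
  else
    number
  end
end

puts format_publication_number("134694", "RU")     # short RU number is left-padded with zeros
puts format_publication_number("0000134694", "RU") # already 10 digits, returned unchanged
puts format_publication_number("134694", "US")     # other country codes are not padded
```

Note that the method expects `number` to be a `String`; an `Integer` argument would fail on `length`.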
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Rospatent
-   VERSION = "1.1.0"
+   VERSION = "1.2.0"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: rospatent
  version: !ruby/object:Gem::Version
-   version: 1.1.0
+   version: 1.2.0
  platform: ruby
  authors:
  - Aleksandr Dryzhuk
@@ -140,8 +140,8 @@ licenses:
  metadata:
    homepage_uri: https://github.com/amdest/rospatent
    source_code_uri: https://github.com/amdest/rospatent
-   changelog_uri: https://github.com/amdest/rospatent/blob/main/CHANGELOG.md
-   documentation_uri: https://github.com/amdest/rospatent/blob/main/README.md
+   changelog_uri: https://github.com/amdest/rospatent/blob/master/CHANGELOG.md
+   documentation_uri: https://github.com/amdest/rospatent/blob/master/README.md
    bug_tracker_uri: https://github.com/amdest/rospatent/issues
    wiki_uri: https://github.com/amdest/rospatent/wiki
    rubygems_mfa_required: 'true'
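
The 1.2.0 changelog entry states that `patent_media` and `patent_media_by_id` now return raw binary content. A minimal sketch of writing such a return value to disk without re-encoding it follows. `save_media` is a hypothetical helper (not part of the gem), and `pdf_data` is a stand-in byte string rather than a real API response.

```ruby
require "tmpdir"

# Hypothetical helper (not part of the gem): write raw bytes unchanged.
# File.binwrite opens the file in binary mode, so no newline or encoding
# conversion is applied to the data. It returns the number of bytes written.
def save_media(path, data)
  File.binwrite(path, data)
end

# Stand-in for the String that a media call would return.
pdf_data = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

Dir.mktmpdir do |dir|
  path = File.join(dir, "document.pdf")
  written = save_media(path, pdf_data)
  puts "wrote #{written} bytes"
  puts "round-trip intact: #{File.binread(path) == pdf_data}"
end
```

Reading the file back with `File.binread` and comparing it to the original bytes is a quick way to confirm nothing was altered on the way to disk.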