rospatent 1.3.2 → 1.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 7a67e6d87f80561b12663934f8f521318ffba4a91bfcc2d9e8f0da69a37bb941
-   data.tar.gz: 144209fec90c09ef12ec3b8bfbe8ce25ef869956d42a4c99932c0ee8b8cd2152
+   metadata.gz: 415bcf677cca43f775fb9ddca9d7c11093a7f2bce4cc0298be375ffe1f1912fc
+   data.tar.gz: e7c9111a4230c5624137f380511e248e8b177dfd30fa583133c4788fc0e09917
  SHA512:
-   metadata.gz: f04c183c8b4896ebce31c73adfb59cf5134256feb075d881138e1e54547241216abda98c63500106a8649c6a6b2b18562cf40acddeed6d926b08162887de9dca
-   data.tar.gz: 0d3f2c86cc5cbfb58c69d0db1100e796adb42664251cd1812118889604a897b4c43b6986a0104660b6ce4cc8c3ea39adab1c27ffbbf0fa815f733da5466e1f72
+   metadata.gz: c74bd899f8263fee6097bb7157d6a628f922ac4a43e36264639892b5e1d9359850b4b1cba869711a764d4ec4d55dabd81efa9432b51507bbae2c0f30c0af40e8
+   data.tar.gz: e5a9b87840af75812374e881e00184527c5c82a5d0d7589c1512c838524b958063e200366d44a70b0f1e8bf020227a3e5a72ed99abafad3bd65e0e357927422c
data/CHANGELOG.md CHANGED
@@ -5,6 +5,38 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [1.4.0] - 2025-06-09
+
+ ### Added
+ - **Optional Filename Parameter**: Enhanced `patent_media` and `patent_media_by_id` methods with optional filename parameter
+   - Auto-generated filenames using formatted publication number + ".pdf" extension (e.g., "0000134694.pdf")
+   - Leverages existing `format_publication_number` logic for reliable API compatibility
+   - Provides convenient default behavior while maintaining full backward compatibility
+ - **Safe Binary File Handling**: New `save_binary_file` method for proper encoding of binary data
+   - Ensures binary data (PDFs, images, etc.) is written correctly with ASCII-8BIT encoding
+   - Prevents encoding conversion errors when saving media files
+   - Includes comprehensive documentation and examples
+
+ ### Fixed
+ - **Binary Data Encoding Issues**: Enhanced `handle_binary_response` method to properly handle binary data encoding
+   - Forces ASCII-8BIT encoding to prevent UTF-8 conversion errors
+   - Resolves `Encoding::UndefinedConversionError` when saving PDF and image files
+   - Maintains proper binary data integrity throughout the request lifecycle
+
+ ### Changed
+ - **Enhanced Media Method Documentation**: Updated documentation with auto-generated filename examples
+   - Added examples showing both convenience (auto-filename) and explicit filename approaches
+   - Enhanced code examples in both English and Russian README sections
+   - Improved documentation structure with clear usage recommendations
+ - **Comprehensive Test Coverage**: Enhanced test suite with auto-generated filename functionality
+   - Added specific tests for optional filename parameter behavior
+   - Updated existing tests to accommodate new method signatures
+   - Enhanced integration tests with fallback filename strategies
+ - **README Improvements**: Updated Key Features section to remove hardcoded test counts
+   - Replaced specific numbers with qualitative descriptions of testing coverage
+   - Improved maintainability by removing numbers that become outdated quickly
+   - Enhanced focus on testing quality rather than raw metrics
+
  ## [1.3.2] - 2025-06-07

  ### Added
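The `Encoding::UndefinedConversionError` named in the 1.4.0 "Fixed" entry can be reproduced without the gem. The following is a minimal sketch, assuming the failure arises when ASCII-8BIT bytes are written through an IO whose external encoding is UTF-8; the sample bytes, helper names, and file names are hypothetical and not part of rospatent.

```ruby
require "tmpdir"

# Hypothetical stand-in for a downloaded PDF: a header plus non-ASCII bytes,
# tagged as ASCII-8BIT (binary) via String#b.
SAMPLE_BYTES = "%PDF-1.4\n\xFF\xD8\xFE".b

# Write through an IO with a UTF-8 external encoding. Ruby tries to transcode
# the binary string to UTF-8 and raises on the first non-ASCII byte.
# Returns the rescued error, or nil if the write succeeded.
def text_mode_write_error(path, data)
  File.open(path, "w:UTF-8") { |file| file.write(data) }
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Write and read back in binary mode, where no transcoding takes place.
def binary_round_trip(path, data)
  File.binwrite(path, data)
  File.binread(path)
end

Dir.mktmpdir do |dir|
  error = text_mode_write_error(File.join(dir, "text_mode.pdf"), SAMPLE_BYTES)
  copy = binary_round_trip(File.join(dir, "binary_mode.pdf"), SAMPLE_BYTES)

  puts "UTF-8 IO write raised: #{error.class}"
  puts "binary write preserved bytes: #{copy == SAMPLE_BYTES}"
end
```

Whether a plain `File.write` fails in a given application depends on its encoding defaults, which is why a binary-mode write is the safer choice for media files.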
data/README.md CHANGED
@@ -15,7 +15,7 @@ A comprehensive Ruby client for the Rospatent patent search API with advanced fe
  - 📊 **Structured Logging** - JSON/text logging with request/response tracking
  - 🚀 **Batch Operations** - Process multiple patents concurrently
  - ⚙️ **Environment-Aware** - Different configurations for dev/staging/production
- - 🧪 **Comprehensive Testing** - 232 tests with 483 assertions, comprehensive integration testing
+ - 🧪 **Comprehensive Testing** - Extensive unit and integration test coverage with robust error handling validation
  - 📚 **Excellent Documentation** - Detailed examples and API documentation

  ## Installation
@@ -583,23 +583,48 @@ end
  ### Media and Documents

  ```ruby
- # Download patent PDF
+ # ✅ Recommended: Download patent PDF with auto-generated filename
+ # Automatically uses the formatted publication number (e.g., "0000134694.pdf")
+ pdf_data = client.patent_media(
+   "National", # collection_id
+   "RU", # country_code
+   "U1", # doc_type
+   "2013/11/20", # pub_date
+   "134694" # pub_number (filename auto-generated)
+ )
+ client.save_binary_file(pdf_data, "patent.pdf")
+
+ # ✅ Alternative: Download with explicit filename
  pdf_data = client.patent_media(
    "National", # collection_id
    "RU", # country_code
    "U1", # doc_type
    "2013/11/20", # pub_date
    "134694", # pub_number
-   "document.pdf" # filename
+   "document.pdf" # explicit filename
  )
- File.write("patent.pdf", pdf_data)
+ client.save_binary_file(pdf_data, "patent_explicit.pdf")

- # Simplified method using patent ID
+ # Simplified method using patent ID (auto-generated filename)
  pdf_data = client.patent_media_by_id(
    "RU134694U1_20131120",
-   "National",
-   "document.pdf"
+   "National" # filename auto-generated as "0000134694.pdf"
+ )
+ client.save_binary_file(pdf_data, "patent_by_id.pdf")
+
+ # ✅ Or with explicit filename for specific files
+ image_data = client.patent_media_by_id(
+   "RU134694U1_20131120",
+   "National",
+   "image.png" # explicit filename for non-PDF files
  )
+ client.save_binary_file(image_data, "patent_image.png")
+
+ # ✅ Safe file saving options:
+ File.binwrite("patent.pdf", pdf_data) # Manual binary write
+
+ # ❌ Avoid: File.write can cause encoding errors with binary data
+ # File.write("patent.pdf", pdf_data) # This may fail!
  ```

  ## Advanced Features
@@ -1105,7 +1130,7 @@ $ bundle exec rake release
  - 📊 **Structured Logging** - JSON/text logging with request/response tracking
  - 🚀 **Batch Operations** - parallel processing of multiple patents
  - ⚙️ **Adaptive Environments** - different configurations for development/staging/production
- - 🧪 **Comprehensive Testing** - 232 tests with 483 assertions, comprehensive integration testing
+ - 🧪 **Comprehensive Testing** - Extensive unit and integration test coverage with error handling validation
  - 📚 **Excellent Documentation** - detailed examples and API documentation

  ## Installation
@@ -1669,23 +1694,48 @@ end
  ### Media Files and Documents

  ```ruby
- # Download patent PDF
+ # ✅ Recommended: Download patent PDF with an auto-generated filename
+ # Automatically uses the formatted publication number (e.g., "0000134694.pdf")
+ pdf_data = client.patent_media(
+   "National", # collection_id
+   "RU", # country_code
+   "U1", # doc_type
+   "2013/11/20", # pub_date
+   "134694" # pub_number (filename is generated automatically)
+ )
+ client.save_binary_file(pdf_data, "patent.pdf")
+
+ # ✅ Alternative: Download with an explicitly specified filename
  pdf_data = client.patent_media(
    "National", # collection_id
    "RU", # country_code
    "U1", # doc_type
    "2013/11/20", # pub_date
    "134694", # pub_number
-   "document.pdf" # filename
+   "document.pdf" # explicit filename
  )
- File.write("patent.pdf", pdf_data)
+ client.save_binary_file(pdf_data, "patent_explicit.pdf")

- # Simplified method using patent ID
+ # Simplified method using patent ID (auto-generated filename)
  pdf_data = client.patent_media_by_id(
    "RU134694U1_20131120",
-   "National",
-   "document.pdf"
+   "National" # filename is generated automatically as "0000134694.pdf"
+ )
+ client.save_binary_file(pdf_data, "patent_by_id.pdf")
+
+ # ✅ Or with an explicit filename for specific files
+ image_data = client.patent_media_by_id(
+   "RU134694U1_20131120",
+   "National",
+   "image.png" # explicit filename for non-PDF files
  )
+ client.save_binary_file(image_data, "patent_image.png")
+
+ # ✅ Safe file saving options:
+ File.binwrite("patent.pdf", pdf_data) # Manual binary write
+
+ # ❌ Avoid: File.write can cause encoding errors with binary data
+ # File.write("patent.pdf", pdf_data) # This may not work!
  ```

  ## Advanced Features
@@ -214,18 +214,23 @@ module Rospatent
      # @param doc_type [String] Document type code (e.g., "U1")
      # @param pub_date [String, Date] Publication date in format YYYY/MM/DD
      # @param pub_number [String] Publication number
-     # @param filename [String] Media file name (e.g., "document.pdf")
-     # @return [String] Binary content of the requested file
+     # @param filename [String, nil] Media file name (optional, defaults to "<formatted_number>.pdf")
+     # @return [String] Binary content with ASCII-8BIT encoding
      # @raise [Rospatent::Errors::InvalidRequestError] If any required parameter is missing
+     # @example Retrieve and save a PDF with auto-generated filename
+     #   pdf_data = client.patent_media("National", "RU", "U1", "2013/11/20", "134694")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
+     # @example Retrieve and save a specific file
+     #   pdf_data = client.patent_media("National", "RU", "U1", "2013/11/20", "134694", "document.pdf")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
      def patent_media(collection_id, country_code, doc_type, pub_date, pub_number,
-                      filename)
+                      filename = nil)
        # Validate and normalize inputs
        validated_collection = validate_required_string(collection_id, "collection_id")
        validated_country = validate_required_string(country_code, "country_code", max_length: 2)
        validated_doc_type = validate_required_string(doc_type, "doc_type", max_length: 3)
        validated_date = validate_required_date(pub_date, "pub_date")
        validated_number = validate_required_string(pub_number, "pub_number")
-       validated_filename = validate_required_string(filename, "filename")

        # Format publication date
        formatted_date = validated_date.strftime("%Y/%m/%d")
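The optional `filename` parameter documented in this hunk falls back to `"<formatted_number>.pdf"`. Here is a rough sketch of that fallback, assuming the padding implied by the changelog example ("134694" becomes "0000134694.pdf"); `pad_publication_number` and `media_filename` are hypothetical names, and the gem's real `format_publication_number` may apply country-specific rules not visible in this diff.

```ruby
# Hypothetical stand-in for the gem's format_publication_number.
# Assumption: zero-pad to 10 digits, inferred only from the "0000134694.pdf" example.
def pad_publication_number(number, width: 10)
  number.to_s.rjust(width, "0")
end

# Return the caller's filename when one is given; otherwise derive
# "<padded number>.pdf" from the publication number.
def media_filename(pub_number, filename = nil)
  return filename unless filename.nil?

  "#{pad_publication_number(pub_number)}.pdf"
end

puts media_filename("134694")              # default derived from the number
puts media_filename("134694", "image.png") # explicit filename wins
```

Keeping `nil` as the sentinel (rather than an empty string) means that an explicitly passed empty filename can still be rejected by validation instead of silently falling back to the default.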
@@ -233,6 +238,13 @@ module Rospatent
        # Format publication number with appropriate padding
        formatted_number = format_publication_number(validated_number, validated_country)

+       # Generate default filename if not provided
+       validated_filename = if filename.nil?
+                              "#{formatted_number}.pdf"
+                            else
+                              validate_required_string(filename, "filename")
+                            end
+
        # Construct the path
        path = "/media/#{validated_collection}/#{validated_country}/" \
               "#{validated_doc_type}/#{formatted_date}/#{formatted_number}/" \
@@ -245,15 +257,23 @@ module Rospatent
      # Retrieve media using simplified patent ID format
      # @param document_id [String] Patent document ID (e.g., "RU134694U1_20131120")
      # @param collection_id [String] Collection identifier (e.g., "National")
-     # @param filename [String] Filename to retrieve (e.g., "document.pdf")
-     # @return [String] Binary content of the requested file
+     # @param filename [String, nil] Filename to retrieve (optional, defaults to "<formatted_number>.pdf")
+     # @return [String] Binary content with ASCII-8BIT encoding
      # @raise [Rospatent::Errors::InvalidRequestError] If document_id format is invalid
      #   or parameters are missing
-     def patent_media_by_id(document_id, collection_id, filename)
+     # @example Retrieve and save a PDF with auto-generated filename
+     #   pdf_data = client.patent_media_by_id("RU134694U1_20131120", "National")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
+     # @example Retrieve and save a specific file
+     #   pdf_data = client.patent_media_by_id("RU134694U1_20131120", "National", "document.pdf")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
+     def patent_media_by_id(document_id, collection_id, filename = nil)
        # Validate inputs
        validated_id = validate_patent_id(document_id)
        validated_collection = validate_required_string(collection_id, "collection_id")
-       validated_filename = validate_required_string(filename, "filename")
+
+       # Validate filename if provided
+       validated_filename = filename ? validate_required_string(filename, "filename") : nil

        # Parse the patent ID to extract components
        id_parts = parse_patent_id(validated_id)
@@ -261,12 +281,10 @@ module Rospatent
        # Format the date from YYYYMMDD to YYYY/MM/DD
        formatted_date = id_parts[:date].gsub(/^(\d{4})(\d{2})(\d{2})$/, '\1/\2/\3')

-       # Format publication number with appropriate padding
-       formatted_number = format_publication_number(id_parts[:number], id_parts[:country_code])
-
        # Call the base method with extracted components
+       # If no filename provided, patent_media will generate default using format_publication_number
        patent_media(validated_collection, id_parts[:country_code], id_parts[:doc_type],
-                    formatted_date, formatted_number, validated_filename)
+                    formatted_date, id_parts[:number], validated_filename)
      end

      # Extract and parse the abstract content from a patent document
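To illustrate what `patent_media_by_id` hands to `patent_media`, the sketch below splits an ID such as "RU134694U1_20131120" into its parts and reformats the date with the same `gsub` substitution used in the hunk. The regular expression and the `split_patent_id` helper are assumptions made for illustration; the gem's own `parse_patent_id` and `validate_patent_id` are not shown in this diff and may accept other formats.

```ruby
# Hypothetical pattern: two-letter country code, numeric publication number,
# document type (a letter plus optional digits), underscore, YYYYMMDD date.
PATENT_ID_PATTERN =
  /\A(?<country_code>[A-Z]{2})(?<number>\d+)(?<doc_type>[A-Z]\d*)_(?<date>\d{8})\z/

def split_patent_id(document_id)
  match = PATENT_ID_PATTERN.match(document_id)
  raise ArgumentError, "unrecognized patent ID: #{document_id.inspect}" unless match

  {
    country_code: match[:country_code],
    number: match[:number], # left unpadded, as in the 1.4.0 code path
    doc_type: match[:doc_type],
    # Same substitution as in the diff: YYYYMMDD -> YYYY/MM/DD
    date: match[:date].gsub(/^(\d{4})(\d{2})(\d{2})$/, '\1/\2/\3')
  }
end

parts = split_patent_id("RU134694U1_20131120")
puts parts.inspect
```

Passing the raw number through, as 1.4.0 does, leaves padding to a single place (`patent_media`), so the default filename and the request path are built from the same formatted value.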
@@ -496,6 +514,29 @@ module Rospatent
        }
      end

+     # Save binary data to a file with proper encoding handling
+     # This method ensures that binary data (PDFs, images, etc.) is written correctly
+     # @param binary_data [String] Binary data returned from patent_media methods
+     # @param file_path [String] Path where to save the file
+     # @return [Integer] Number of bytes written
+     # @raise [SystemCallError] If file cannot be written
+     # @example Save a PDF file with auto-generated filename
+     #   pdf_data = client.patent_media_by_id("RU134694U1_20131120", "National")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
+     # @example Save a specific file
+     #   pdf_data = client.patent_media_by_id("RU134694U1_20131120", "National", "document.pdf")
+     #   client.save_binary_file(pdf_data, "patent.pdf")
+     def save_binary_file(binary_data, file_path)
+       validate_required_string(binary_data, "binary_data")
+       validate_required_string(file_path, "file_path")
+
+       # Ensure data is properly encoded as binary
+       data_to_write = binary_data.dup.force_encoding(Encoding::ASCII_8BIT)
+
+       # Write in binary mode to prevent any encoding conversions
+       File.binwrite(file_path, data_to_write)
+     end
+
      private

      # Validate search parameters
@@ -662,10 +703,15 @@ module Rospatent
      # Process binary API response (for media files)
      # @param response [Faraday::Response] Raw response from the API
      # @param request_id [String] Request ID for tracking
-     # @return [String] Binary response data
+     # @return [String] Binary response data with proper encoding
      # @raise [Rospatent::Errors::ApiError] If the response is not successful
      def handle_binary_response(response, request_id = nil)
-       return response.body if response.success?
+       if response.success?
+         # Ensure binary data is properly encoded as ASCII-8BIT to prevent encoding issues
+         binary_data = response.body.dup
+         binary_data.force_encoding(Encoding::ASCII_8BIT)
+         return binary_data
+       end

        # For binary endpoints, error responses might still be JSON
        error_msg = begin
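The `dup` plus `force_encoding` pattern in `handle_binary_response` only changes the encoding tag on a copy; it neither transcodes bytes nor touches the original string. A small sketch of that behavior follows, where `as_binary` and the sample body are hypothetical stand-ins for the gem's code and a real Faraday response body.

```ruby
# Return a binary-tagged copy of the given string without altering its bytes
# or mutating the caller's object.
def as_binary(body)
  body.dup.force_encoding(Encoding::ASCII_8BIT)
end

# Hypothetical response body: PDF-like bytes that arrived tagged as UTF-8.
body = "%PDF\xFF\xD8".dup.force_encoding(Encoding::UTF_8)
binary = as_binary(body)

puts "original tag:   #{body.encoding}"
puts "copy tag:       #{binary.encoding}"
puts "same bytes:     #{binary.bytes == body.bytes}"
puts "original valid: #{body.valid_encoding?}"
puts "copy valid:     #{binary.valid_encoding?}"
```

Because `force_encoding` mutates its receiver, working on a copy also keeps the call safe when the source string is frozen or shared with other code.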
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Rospatent
-   VERSION = "1.3.2"
+   VERSION = "1.4.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: rospatent
  version: !ruby/object:Gem::Version
-   version: 1.3.2
+   version: 1.4.0
  platform: ruby
  authors:
  - Aleksandr Dryzhuk
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2025-06-07 00:00:00.000000000 Z
+ date: 2025-06-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: faraday