mindee 4.7.2 → 4.8.0

Files changed (53)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +8 -0
  3. data/README.md +6 -6
  4. data/Rakefile +1 -0
  5. data/lib/mindee/errors/mindee_http_error_v2.rb +23 -4
  6. data/lib/mindee/errors/mindee_http_unknown_error_v2.rb +18 -0
  7. data/lib/mindee/http/endpoint.rb +1 -0
  8. data/lib/mindee/parsing/v2/error_item.rb +21 -0
  9. data/lib/mindee/parsing/v2/error_response.rb +18 -3
  10. data/lib/mindee/parsing/v2/inference_result.rb +4 -0
  11. data/lib/mindee/parsing/v2/rag_metadata.rb +17 -0
  12. data/lib/mindee/version.rb +1 -1
  13. data/mindee.gemspec +1 -0
  14. data/sig/mindee/errors/mindee_http_error_v2.rbs +4 -1
  15. data/sig/mindee/errors/mindee_http_unknown_error_v2.rbs +9 -0
  16. data/sig/mindee/http/endpoint.rbs +1 -0
  17. data/sig/mindee/parsing/v2/error_item.rbs +13 -0
  18. data/sig/mindee/parsing/v2/error_response.rbs +3 -0
  19. data/sig/mindee/parsing/v2/inference_result.rbs +1 -0
  20. data/sig/mindee/parsing/v2/rag_metadata.rbs +13 -0
  21. metadata +22 -34
  22. data/docs/advanced_file_operations.md +0 -109
  23. data/docs/getting_started.md +0 -257
  24. data/docs/global_products/barcode_reader_v1.md +0 -125
  25. data/docs/global_products/bill_of_lading_v1.md +0 -276
  26. data/docs/global_products/business_card_v1.md +0 -194
  27. data/docs/global_products/cropper_v1.md +0 -123
  28. data/docs/global_products/delivery_notes_v1.md +0 -168
  29. data/docs/global_products/driver_license_v1.md +0 -212
  30. data/docs/global_products/expense_receipts_v5.md +0 -415
  31. data/docs/global_products/financial_document_v1.md +0 -615
  32. data/docs/global_products/international_id_v2.md +0 -264
  33. data/docs/global_products/invoice_splitter_v1.md +0 -127
  34. data/docs/global_products/invoices_v4.md +0 -576
  35. data/docs/global_products/multi_receipts_detector_v1.md +0 -131
  36. data/docs/global_products/nutrition_facts_v1.md +0 -399
  37. data/docs/global_products/passport_v1.md +0 -207
  38. data/docs/global_products/resume_v1.md +0 -384
  39. data/docs/global_products/universal.md +0 -113
  40. data/docs/global_products.md +0 -6
  41. data/docs/loading_a_document.md +0 -330
  42. data/docs/localized_products/bank_account_details_v2.md +0 -158
  43. data/docs/localized_products/bank_check_v1.md +0 -205
  44. data/docs/localized_products/bank_statement_fr_v2.md +0 -269
  45. data/docs/localized_products/carte_grise_v1.md +0 -475
  46. data/docs/localized_products/energy_bill_fra_v1.md +0 -342
  47. data/docs/localized_products/french_healthcard_v1.md +0 -142
  48. data/docs/localized_products/idcard_fr_v2.md +0 -284
  49. data/docs/localized_products/ind_passport_v1.md +0 -307
  50. data/docs/localized_products/payslip_fra_v3.md +0 -344
  51. data/docs/localized_products/us_healthcare_cards_v1.md +0 -258
  52. data/docs/localized_products/us_mail_v3.md +0 -152
  53. data/docs/localized_products.md +0 -6
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 2279aaa45f25ab506ab861d139c93029ddcffe5899ca448041fbbe02ef422058
-   data.tar.gz: 8a531ca5073b775a5b4afdb6e97f001129fb9d1aa4cf3f4f31633580cd9c16aa
+   metadata.gz: 6bbe7fd9450d6e87f07b3b96c636ef8715200370503eb97d65db46225b6ccf8e
+   data.tar.gz: 36b67f8ca703982d40048e4d7e9a3b20e4204c7a36386028b3bd11da79560a20
  SHA512:
-   metadata.gz: d3b4a6fb9da9a9b8bd3d14b8b06d7c77e328c7eaa060910578f4e345ebc0a732d2b5c4b6e9117621caa1ff7b5f2633624f383833b80c8e7f81faa16236f9d1ca
-   data.tar.gz: 4be80490fb4f63677954c770458d9bcfdd237bb7fbd1f72fed286b3414046987c0281d6cb2867490e4261e7c423e3ee8d37a24b7e986de77bf97abfbe96950c4
+   metadata.gz: '08dd6e3e40ea73f874647ab537f2539342aca107bb1d324c04fa8b8de1fa803f5893117bcaf0e840dd6210b55e6e7c163897bb6cba6a36389b44daeab6a886e2'
+   data.tar.gz: 6028ec9f0d38e9e7840c33c98a517d8db02aef34058b8ae517c560cdc0b88524a0e2ab4a472dfe4e1f12ef027fce5bd2153cebd628e2c66f29ac3c8733c1cec3
data/CHANGELOG.md CHANGED
@@ -1,5 +1,13 @@
  # Mindee Ruby API Library Changelog
 
+ ## v4.8.0 - 2025-11-18
+ ### Changes
+ * :sparkles: add support for better errors
+ * :sparkles: add support for RAG metadata
+ ### Fixes
+ * :recycle: harmonize test structure with other libraries
+
+
  ## v4.7.2 - 2025-10-13
  ### Changes
  * :recycle: harmonize getting page count from a local input source
data/README.md CHANGED
@@ -7,26 +7,26 @@ Quickly and easily connect to Mindee's API services using Ruby.
  ## Mindee API Versions
  This client library has support for both Mindee platform versions.
 
- ### Latest - V2
- This is the new platform located here:
+ ### V2 - Latest
+ This is the latest platform located here:
 
  https://app.mindee.com
 
  It uses **API version 2**.
 
  Consult the
- **[Latest Documentation](https://docs.mindee.com/integrations/client-libraries-sdk)**
+ **[V2 Documentation](https://docs.mindee.com/integrations/client-libraries-sdk)**
 
 
- ### Legacy - V1
- This is the legacy platform located here:
+ ### V1
+ This is the platform located here:
 
  https://platform.mindee.com/
 
  It uses **API version 1**.
 
  Consult the
- **[Legacy Documentation](https://developers.mindee.com/docs/ruby-getting-started)**
+ [V1 Documentation](https://docs.mindee.com/v1/libraries/ruby-sdk)
 
  ## Additional Information
 
data/Rakefile CHANGED
@@ -23,6 +23,7 @@ end
  desc 'Run integration tests'
  RSpec::Core::RakeTask.new(:integration) do |t|
    t.pattern = 'spec/**/*_integration.rb'
+   t.rspec_opts = ['--require', 'integration_helper']
  end
 
  Rake::Task[:doc].enhance do
data/lib/mindee/errors/mindee_http_error_v2.rb CHANGED
@@ -1,25 +1,44 @@
  # frozen_string_literal: true
 
  require_relative 'mindee_error'
+ require_relative '../parsing/v2/error_item'
 
  module Mindee
    module Errors
      # API V2 HttpError
      class MindeeHTTPErrorV2 < MindeeError
-       # @return [Integer]
+       # @return [Integer] The HTTP status code returned by the server.
        attr_reader :status
-       # @return [String]
+       # @return [String] A human-readable explanation specific to the occurrence of the problem.
        attr_reader :detail
+       # @return [String] A short, human-readable summary of the problem.
+       attr_reader :title
+       # @return [String] A machine-readable code specific to the occurrence of the problem.
+       attr_reader :code
+       # @return [Array<ErrorItem>] A list of explicit error details.
+       attr_reader :errors
 
        # @param http_error [Hash, Parsing::V2::ErrorResponse]
        def initialize(http_error)
          if http_error.is_a?(Parsing::V2::ErrorResponse)
            http_error = { 'detail' => http_error.detail,
-                          'status' => http_error.status }
+                          'status' => http_error.status,
+                          'title' => http_error.title,
+                          'code' => http_error.code,
+                          'errors' => http_error.errors }
          end
          @status = http_error['status']
          @detail = http_error['detail']
-         super("HTTP error: #{@status} - #{@detail}")
+         @title = http_error['title']
+         @code = http_error['code']
+         @errors = if http_error.key?('errors')
+                     http_error['errors'].map do |error|
+                       Parsing::V2::ErrorItem.new(error)
+                     end
+                   else
+                     []
+                   end
+         super("HTTP #{@status} - #{@title} :: #{@code} - #{@detail}")
        end
      end
    end
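To make the new error shape concrete, here is a minimal, self-contained sketch of how a V2 error payload could map onto the exception fields and message format shown in this file. The `SketchErrorItem` and `SketchHTTPErrorV2` classes and the payload values are hypothetical stand-ins written for illustration; they are not the gem's own classes, and the example does not load the `mindee` gem.

```ruby
# Stand-in for Parsing::V2::ErrorItem (illustration only).
class SketchErrorItem
  attr_reader :pointer, :detail

  def initialize(raw)
    @pointer = raw['pointer']
    @detail = raw['detail']
  end
end

# Stand-in for MindeeHTTPErrorV2 (illustration only).
class SketchHTTPErrorV2 < StandardError
  attr_reader :status, :detail, :title, :code, :errors

  def initialize(payload)
    @status = payload['status']
    @detail = payload['detail']
    @title = payload['title']
    @code = payload['code']
    # Same branching as the diff: wrap each entry when the key is present.
    @errors = if payload.key?('errors')
                payload['errors'].map { |raw| SketchErrorItem.new(raw) }
              else
                []
              end
    super("HTTP #{@status} - #{@title} :: #{@code} - #{@detail}")
  end
end

# Hypothetical payload; every value here is invented for the example.
payload = {
  'status' => 422,
  'title' => 'Invalid fields in form',
  'code' => '422-001',
  'detail' => 'One or more fields failed validation.',
  'errors' => [{ 'pointer' => '#/model_id', 'detail' => 'Field is required.' }]
}

begin
  raise SketchHTTPErrorV2.new(payload)
rescue SketchHTTPErrorV2 => e
  puts e.message
  e.errors.each { |item| puts "#{item.pointer}: #{item.detail}" }
end
```

A caller that previously matched on the old `HTTP error: <status> - <detail>` message text would need to adjust, since the message now also carries the title and code.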
data/lib/mindee/errors/mindee_http_unknown_error_v2.rb ADDED
@@ -0,0 +1,18 @@
+ # frozen_string_literal: true
+
+ require_relative 'mindee_error'
+
+ module Mindee
+   module Errors
+     # Unknown HTTP error for the V2 API.
+     class MindeeHTTPUnknownErrorV2 < MindeeHTTPErrorV2
+       def initialize(http_error)
+         super({ 'detail' => "Couldn't deserialize server error. Found: #{http_error}",
+                 'status' => -1,
+                 'title' => 'Unknown Error',
+                 'code' => '000-000',
+                 'errors' => nil })
+       end
+     end
+   end
+ end
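One detail in this file may be worth checking against the parent constructor in `mindee_http_error_v2.rb`: the hash passed to `super` contains the `'errors'` key with a `nil` value, while the parent branches on `key?('errors')` before calling `map`. The sketch below uses a hypothetical payload and plain Ruby only, with no gem code, to show how `Hash#key?` behaves in that case and how `Kernel#Array` could serve as a nil-tolerant alternative. Whether the released gem guards against this elsewhere is not visible in this diff.

```ruby
# Hypothetical payload shaped like the hash built in the hunk above.
payload = { 'status' => -1, 'errors' => nil }

# Hash#key? reports that the key exists even though its value is nil.
key_present = payload.key?('errors')

# Calling map on that nil value raises NoMethodError.
map_failed = begin
  payload['errors'].map { |raw| raw['detail'] }
  false
rescue NoMethodError
  true
end

# Kernel#Array converts nil to [], so the same map call has nothing to iterate.
details = Array(payload['errors']).map { |raw| raw['detail'] }

puts key_present # prints true
puts map_failed  # prints true
p details        # prints []
```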
data/lib/mindee/http/endpoint.rb CHANGED
@@ -7,6 +7,7 @@ require_relative '../version'
  require_relative 'response_validation'
 
  module Mindee
+   # Mindee internal HTTP module.
    module HTTP
      # API key's default environment key name.
      API_KEY_ENV_NAME = 'MINDEE_API_KEY'
data/lib/mindee/parsing/v2/error_item.rb ADDED
@@ -0,0 +1,21 @@
+ # frozen_string_literal: true
+
+ module Mindee
+   module Parsing
+     module V2
+       # Individual error item.
+       class ErrorItem
+         # @return [String, nil] A JSON Pointer to the location of the body property.
+         attr_reader :pointer
+         # @return [String, nil] Explicit information on the issue.
+         attr_reader :detail
+
+         # @param server_response [Hash] Raw JSON parsed into a Hash.
+         def initialize(server_response)
+           @pointer = server_response['pointer']
+           @detail = server_response['detail']
+         end
+       end
+     end
+   end
+ end
data/lib/mindee/parsing/v2/error_response.rb CHANGED
@@ -5,21 +5,36 @@ module Mindee
      module V2
        # Encapsulates information returned by the API when an error occurs.
        class ErrorResponse
-         # @return [Integer] HTTP status code.
+         # @return [Integer] The HTTP status code returned by the server.
          attr_reader :status
-         # @return [String] Error detail.
+         # @return [String] A human-readable explanation specific to the occurrence of the problem.
          attr_reader :detail
+         # @return [String] A short, human-readable summary of the problem.
+         attr_reader :title
+         # @return [String] A machine-readable code specific to the occurrence of the problem.
+         attr_reader :code
+         # @return [Array<ErrorItem>] A list of explicit error details.
+         attr_reader :errors
 
          # @param server_response [Hash] Raw JSON parsed into a Hash.
          def initialize(server_response)
            @status = server_response['status']
            @detail = server_response['detail']
+           @title = server_response['title']
+           @code = server_response['code']
+           @errors = if server_response.key?('errors')
+                       server_response['errors'].map do |error|
+                         ErrorItem.new(error)
+                       end
+                     else
+                       []
+                     end
          end
 
          # String representation.
          # @return [String]
          def to_s
-           "Error\n=====\n:Status: #{@status}\n:Detail: #{@detail}"
+           "HTTP #{@status} - #{@title} :: #{@code} - #{@detail}"
          end
 
          # Hash representation
data/lib/mindee/parsing/v2/inference_result.rb CHANGED
@@ -2,6 +2,7 @@
 
  require_relative 'field/inference_fields'
  require_relative 'raw_text'
+ require_relative 'rag_metadata'
 
  module Mindee
    module Parsing
@@ -12,6 +13,8 @@ module Mindee
          attr_reader :fields
          # @return [Mindee::Parsing::V2::RawText, nil] Optional extra data.
          attr_reader :raw_text
+         # @return [Mindee::Parsing::V2::RAGMetadata, nil] Optional RAG metadata.
+         attr_reader :rag
 
          # @param server_response [Hash] Hash version of the JSON returned by the API.
          def initialize(server_response)
@@ -20,6 +23,7 @@ module Mindee
            @fields = Field::InferenceFields.new(server_response['fields'])
 
            @raw_text = server_response['raw_text'] ? RawText.new(server_response['raw_text']) : nil
+           @rag = (V2::RAGMetadata.new(server_response['rag']) if server_response.key?('rag') && server_response['rag'])
          end
 
          # String representation.
data/lib/mindee/parsing/v2/rag_metadata.rb ADDED
@@ -0,0 +1,17 @@
+ # frozen_string_literal: true
+
+ module Mindee
+   module Parsing
+     module V2
+       # Metadata about the RAG operation.
+       class RAGMetadata
+         # The UUID of the matched document used during the RAG operation.
+         attr_accessor :retrieved_document_id
+
+         def initialize(server_response)
+           @retrieved_document_id = server_response.fetch('retrieved_document_id', nil)
+         end
+       end
+     end
+   end
+ end
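As a rough illustration of how the optional `rag` field behaves, the sketch below mirrors the guard used in `inference_result.rb` (`key?('rag') && server_response['rag']`) with a stand-in class. `SketchRAGMetadata`, `parse_rag`, and the UUID value are hypothetical names and data introduced for this example; the code does not load the `mindee` gem.

```ruby
# Stand-in for Parsing::V2::RAGMetadata (illustration only).
class SketchRAGMetadata
  attr_accessor :retrieved_document_id

  def initialize(raw)
    # fetch with a default returns nil when the key is absent.
    @retrieved_document_id = raw.fetch('retrieved_document_id', nil)
  end
end

# Hypothetical helper: returns metadata only when 'rag' is present and non-nil.
def parse_rag(result)
  SketchRAGMetadata.new(result['rag']) if result.key?('rag') && result['rag']
end

# Hypothetical result hashes; the UUID is a made-up example value.
with_rag = { 'rag' => { 'retrieved_document_id' => '123e4567-e89b-12d3-a456-426614174000' } }
null_rag = { 'rag' => nil }
no_rag = {}

[with_rag, null_rag, no_rag].each do |result|
  rag = parse_rag(result)
  puts(rag ? rag.retrieved_document_id : 'no RAG metadata')
end
```

Under this guard, both a missing key and an explicit `nil` leave `rag` as `nil`, so callers would check for `nil` before reading `retrieved_document_id`.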
data/lib/mindee/version.rb CHANGED
@@ -3,7 +3,7 @@
  # Mindee
  module Mindee
    # Current version.
-   VERSION = '4.7.2'
+   VERSION = '4.8.0'
 
    # Finds and return the current platform.
    # @return [Symbol, Hash[String | Symbol, Regexp], Nil?]
data/mindee.gemspec CHANGED
@@ -33,6 +33,7 @@ Gem::Specification.new do |spec|
    spec.add_dependency 'origamindee', '~> 4.0'
    spec.add_dependency 'pdf-reader', '~> 2.14'
 
+   spec.add_development_dependency 'openssl', '~> 3.3.2'
    spec.add_development_dependency 'prism', '~> 1.3'
    spec.add_development_dependency 'rake', '~> 13.2'
    spec.add_development_dependency 'rbs', '~> 3.6'
data/sig/mindee/errors/mindee_http_error_v2.rbs CHANGED
@@ -3,9 +3,12 @@ module Mindee
    module Errors
      # API V2 HttpError
      class MindeeHTTPErrorV2 < MindeeError
-
        attr_reader detail: String
        attr_reader status: Integer
+       attr_reader code: String
+       attr_reader title: String
+       attr_reader errors: Array[Parsing::V2::ErrorItem]
+
        def initialize: (Hash[String, untyped] | Parsing::V2::ErrorResponse) -> void
      end
    end
data/sig/mindee/errors/mindee_http_unknown_error_v2.rbs ADDED
@@ -0,0 +1,9 @@
+ # lib/mindee/errors/mindee_http_unknown_error_v2.rb
+ module Mindee
+   module Errors
+     # Unknown HTTP error for the V2 API.
+     class MindeeHTTPUnknownErrorV2 < MindeeHTTPErrorV2
+       def initialize: (Hash[String|Symbol, untyped]) -> void
+     end
+   end
+ end
data/sig/mindee/http/endpoint.rbs CHANGED
@@ -23,6 +23,7 @@ module Mindee
        def document_queue_req_post: (Input::Source::LocalInputSource | Input::Source::URLInputSource, ParseOptions) -> Net::HTTPResponse
        def document_queue_req_get: (String) -> Net::HTTPResponse
        def check_api_key: -> void
+       def configure_ssl: (Net::HTTP) -> void
      end
    end
  end
data/sig/mindee/parsing/v2/error_item.rbs ADDED
@@ -0,0 +1,13 @@
+ # lib/mindee/parsing/v2/error_item.rb
+ module Mindee
+   module Parsing
+     module V2
+       class ErrorItem
+         attr_reader pointer: String
+         attr_reader detail: String|nil
+
+         def initialize: (Hash[String|Symbol, untyped]) -> void
+       end
+     end
+   end
+ end
data/sig/mindee/parsing/v2/error_response.rbs CHANGED
@@ -5,6 +5,9 @@ module Mindee
        class ErrorResponse
          attr_reader detail: String
          attr_reader status: Integer
+         attr_reader code: String
+         attr_reader title: String
+         attr_reader errors: Array[ErrorItem]
          def initialize: (Hash[String | Symbol, untyped]) -> void
 
          def as_hash: -> Hash[Symbol, String | Integer]
data/sig/mindee/parsing/v2/inference_result.rbs CHANGED
@@ -5,6 +5,7 @@ module Mindee
        class InferenceResult
          attr_reader fields: Field::InferenceFields
          attr_reader raw_text: RawText?
+         attr_reader rag: RAGMetadata?
 
          def initialize: (Hash[String | Symbol, untyped]) -> void
          def to_s: -> String
data/sig/mindee/parsing/v2/rag_metadata.rbs ADDED
@@ -0,0 +1,13 @@
+ # lib/mindee/parsing/v2/rag_metadata.rb
+
+ module Mindee
+   module Parsing
+     module V2
+       class RAGMetadata
+         attr_accessor retrieved_document_id: string | nil
+
+         def initialize: (Hash[String | Symbol, untyped]) -> void
+       end
+     end
+   end
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: mindee
  version: !ruby/object:Gem::Version
-   version: 4.7.2
+   version: 4.8.0
  platform: ruby
  authors:
  - Mindee, SA
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2025-10-13 00:00:00.000000000 Z
+ date: 2025-11-18 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: base64
@@ -86,6 +86,20 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '2.14'
+ - !ruby/object:Gem::Dependency
+   name: openssl
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 3.3.2
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 3.3.2
  - !ruby/object:Gem::Dependency
    name: prism
    requirement: !ruby/object:Gem::Requirement
@@ -209,7 +223,6 @@ files:
  - bin/cli_products.rb
  - bin/console
  - bin/mindee.rb
- - docs/advanced_file_operations.md
  - docs/code_samples/bank_account_details_v1.txt
  - docs/code_samples/bank_account_details_v2.txt
  - docs/code_samples/bank_check_v1.txt
@@ -247,37 +260,6 @@ files:
  - docs/code_samples/us_mail_v3_async.txt
  - docs/code_samples/workflow_execution.txt
  - docs/code_samples/workflow_polling.txt
- - docs/getting_started.md
- - docs/global_products.md
- - docs/global_products/barcode_reader_v1.md
- - docs/global_products/bill_of_lading_v1.md
- - docs/global_products/business_card_v1.md
- - docs/global_products/cropper_v1.md
- - docs/global_products/delivery_notes_v1.md
- - docs/global_products/driver_license_v1.md
- - docs/global_products/expense_receipts_v5.md
- - docs/global_products/financial_document_v1.md
- - docs/global_products/international_id_v2.md
- - docs/global_products/invoice_splitter_v1.md
- - docs/global_products/invoices_v4.md
- - docs/global_products/multi_receipts_detector_v1.md
- - docs/global_products/nutrition_facts_v1.md
- - docs/global_products/passport_v1.md
- - docs/global_products/resume_v1.md
- - docs/global_products/universal.md
- - docs/loading_a_document.md
- - docs/localized_products.md
- - docs/localized_products/bank_account_details_v2.md
- - docs/localized_products/bank_check_v1.md
- - docs/localized_products/bank_statement_fr_v2.md
- - docs/localized_products/carte_grise_v1.md
- - docs/localized_products/energy_bill_fra_v1.md
- - docs/localized_products/french_healthcard_v1.md
- - docs/localized_products/idcard_fr_v2.md
- - docs/localized_products/ind_passport_v1.md
- - docs/localized_products/payslip_fra_v3.md
- - docs/localized_products/us_healthcare_cards_v1.md
- - docs/localized_products/us_mail_v3.md
  - examples/auto_invoice_splitter_extraction.rb
  - examples/auto_multi_receipts_detector_extraction.rb
  - lib/mindee.rb
@@ -287,6 +269,7 @@ files:
  - lib/mindee/errors/mindee_error.rb
  - lib/mindee/errors/mindee_http_error.rb
  - lib/mindee/errors/mindee_http_error_v2.rb
+ - lib/mindee/errors/mindee_http_unknown_error_v2.rb
  - lib/mindee/errors/mindee_input_error.rb
  - lib/mindee/extraction.rb
  - lib/mindee/extraction/multi_receipts_extractor.rb
@@ -366,6 +349,7 @@ files:
  - lib/mindee/parsing/universal/universal_object_field.rb
  - lib/mindee/parsing/v2.rb
  - lib/mindee/parsing/v2/common_response.rb
+ - lib/mindee/parsing/v2/error_item.rb
  - lib/mindee/parsing/v2/error_response.rb
  - lib/mindee/parsing/v2/field.rb
  - lib/mindee/parsing/v2/field/base_field.rb
@@ -384,6 +368,7 @@ files:
  - lib/mindee/parsing/v2/job.rb
  - lib/mindee/parsing/v2/job_response.rb
  - lib/mindee/parsing/v2/job_webhook.rb
+ - lib/mindee/parsing/v2/rag_metadata.rb
  - lib/mindee/parsing/v2/raw_text.rb
  - lib/mindee/parsing/v2/raw_text_page.rb
  - lib/mindee/pdf.rb
@@ -571,6 +556,7 @@ files:
  - sig/mindee/errors/mindee_error.rbs
  - sig/mindee/errors/mindee_http_error.rbs
  - sig/mindee/errors/mindee_http_error_v2.rbs
+ - sig/mindee/errors/mindee_http_unknown_error_v2.rbs
  - sig/mindee/errors/mindee_input_error.rbs
  - sig/mindee/extraction/multi_receipts_extractor.rbs
  - sig/mindee/geometry/min_max.rbs
@@ -635,6 +621,7 @@ files:
  - sig/mindee/parsing/universal/universal_list_field.rbs
  - sig/mindee/parsing/universal/universal_object_field.rbs
  - sig/mindee/parsing/v2/common_response.rbs
+ - sig/mindee/parsing/v2/error_item.rbs
  - sig/mindee/parsing/v2/error_response.rbs
  - sig/mindee/parsing/v2/field/base_field.rbs
  - sig/mindee/parsing/v2/field/field_confidence.rbs
@@ -652,6 +639,7 @@ files:
  - sig/mindee/parsing/v2/job.rbs
  - sig/mindee/parsing/v2/job_response.rbs
  - sig/mindee/parsing/v2/job_webhook.rbs
+ - sig/mindee/parsing/v2/rag_metadata.rbs
  - sig/mindee/parsing/v2/raw_text.rbs
  - sig/mindee/parsing/v2/raw_text_page.rbs
  - sig/mindee/pdf/extracted_pdf.rbs
data/docs/advanced_file_operations.md DELETED
@@ -1,109 +0,0 @@
- ---
- title: Advanced File Operations
- category: 622b805aaec68102ea7fcbc2
- slug: ruby-advanced-file-operations
- parentDoc: 6294d97ee723f1008d2ab28e
- ---
-
- > ❗️ Disclaimer: the file operations listed below do not directly manipulate the files you will pass to the library,
- > They will instead create a copy before applying any operations, which means that the file you send may not be an exact copy of the file the server will receive.
- > To avoid any unexpected or unwanted result, you can save a copy of the created file locally to inspect it visually before sending it.
-
- ## Image compression
-
- The compression functionality for image files (JPEG, PNG, etc.) via the `compress!` method available on a
- LocalInputSource. This method allows you to reduce file size by specifying quality and dimension constraints.
-
- Example:
-
- ```rb
- # Compress an image with custom parameters.
- input_source.compress!(quality: 85, max_width: 1024, max_height: 768)
- ```
- > ⚠️ Warning: Compression alters the original image data.
- > We strongly advise you inspect a compressed file before sending it:
- > ```rb
- > # Compress using a quality of 50%:
- > input_source.compress!(quality: 50)
- > input_source.write_to_file('path/to/my/compressed/file_50.jpg')
- > ```
-
- For reference, here's what the following levels of compression on this image will look like:
-
- **Original:**
- ![Invoice sample](https://github.com/mindee/client-lib-test-data/blob/main/products/invoices/default_sample.jpg?raw=true)
-
- **85% compressed:**
- ![85% sample](https://github.com/mindee/client-lib-test-data/blob/main/file_operations/compression/compressed_ruby_85.jpg?raw=true)
-
- **50% compressed:**
- ![50% sample](https://github.com/mindee/client-lib-test-data/blob/main/file_operations/compression/compressed_ruby_50.jpg?raw=true)
-
- **10% compressed:**
- ![10% sample](https://github.com/mindee/client-lib-test-data/blob/main/file_operations/compression/compressed_ruby_10.jpg?raw=true)
-
-
- ## PDF operations
-
- PDF operations include both compression and fixing features.
- These are specifically designed to handle challenges associated with PDF files, such as large file sizes and formatting
- issues.
-
- ### PDF compression
-
- > 🧪 PDF compression is an **experimental** feature that rasterizes each page of the PDF (similar to how images are
- > compressed) to reduce its overall size.
- >
- > Because the process involves re-rendering the PDF’s contents, some source text may be lost or rendered differently.
- > Use this feature with caution.
-
-
- ```rb
- # Load a local input source.
- input_file_path = "path/to/your/file.pdf"
- output_file_path = "path/to/the/compressed/file.pdf"
- pdf_input = Mindee::Input::Source::PathInputSource.new(input_file_path)
-
- # We advise you test the quality value yourself, as results may vary greatly depending on the input file
- pdf_input.compress!(quality: 50)
-
- # Write the output file locally for visual checking:
- File.write(output_file_path, pdf_input.io_stream.read)
- ```
-
- > 🚧 Be warned that the source text (the text embedded in the PDF itself) might not render properly,
- > and so source PDFs will be ignored by default.
- >
- > You can bypass this using:
-
- ```rb
- pdf_input.compress!(quality: 50, force_source_text: true)
- ```
-
- Or alternatively, you can try to approximate the re-rendering of the source-text using:
-
- ```rb
- pdf_input.compress!(quality: 50, force_source_text: true, disable_source_text: false)
- ```
-
- ### PDF Repair
-
- The PDF repair feature attempts to rescue PDFs with invalid or broken header information.
- This can sometimes help when files get rejected by the server.
-
- Example:
- ```rb
- # Load a PDF file with the repair_pdf flag enabled.
- input_source = mindee_client.source_from_file(file, "document.pdf", repair_pdf: true)
- ```
-
- > ⚠️ Warning: PDF fixing alters the input file by re-writing header information.
- > Use this feature only when required, as it might affect the integrity of the document file.
-
- ---
-
- Feel free to expand these examples and adjust the parameters as needed for your projects. For further details on
- authentication and usage, you can refer to the [Getting Started Guide](getting_started.md).
-
- # Questions?
- [Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-2d0ds7dtz-DPAF81ZqTy20chsYpQBW5g)