mindee 4.7.2 → 4.8.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +8 -0
- data/README.md +6 -6
- data/Rakefile +1 -0
- data/lib/mindee/errors/mindee_http_error_v2.rb +23 -4
- data/lib/mindee/errors/mindee_http_unknown_error_v2.rb +18 -0
- data/lib/mindee/http/endpoint.rb +1 -0
- data/lib/mindee/parsing/v2/error_item.rb +21 -0
- data/lib/mindee/parsing/v2/error_response.rb +18 -3
- data/lib/mindee/parsing/v2/inference_result.rb +4 -0
- data/lib/mindee/parsing/v2/rag_metadata.rb +17 -0
- data/lib/mindee/version.rb +1 -1
- data/mindee.gemspec +1 -0
- data/sig/mindee/errors/mindee_http_error_v2.rbs +4 -1
- data/sig/mindee/errors/mindee_http_unknown_error_v2.rbs +9 -0
- data/sig/mindee/http/endpoint.rbs +1 -0
- data/sig/mindee/parsing/v2/error_item.rbs +13 -0
- data/sig/mindee/parsing/v2/error_response.rbs +3 -0
- data/sig/mindee/parsing/v2/inference_result.rbs +1 -0
- data/sig/mindee/parsing/v2/rag_metadata.rbs +13 -0
- metadata +22 -34
- data/docs/advanced_file_operations.md +0 -109
- data/docs/getting_started.md +0 -257
- data/docs/global_products/barcode_reader_v1.md +0 -125
- data/docs/global_products/bill_of_lading_v1.md +0 -276
- data/docs/global_products/business_card_v1.md +0 -194
- data/docs/global_products/cropper_v1.md +0 -123
- data/docs/global_products/delivery_notes_v1.md +0 -168
- data/docs/global_products/driver_license_v1.md +0 -212
- data/docs/global_products/expense_receipts_v5.md +0 -415
- data/docs/global_products/financial_document_v1.md +0 -615
- data/docs/global_products/international_id_v2.md +0 -264
- data/docs/global_products/invoice_splitter_v1.md +0 -127
- data/docs/global_products/invoices_v4.md +0 -576
- data/docs/global_products/multi_receipts_detector_v1.md +0 -131
- data/docs/global_products/nutrition_facts_v1.md +0 -399
- data/docs/global_products/passport_v1.md +0 -207
- data/docs/global_products/resume_v1.md +0 -384
- data/docs/global_products/universal.md +0 -113
- data/docs/global_products.md +0 -6
- data/docs/loading_a_document.md +0 -330
- data/docs/localized_products/bank_account_details_v2.md +0 -158
- data/docs/localized_products/bank_check_v1.md +0 -205
- data/docs/localized_products/bank_statement_fr_v2.md +0 -269
- data/docs/localized_products/carte_grise_v1.md +0 -475
- data/docs/localized_products/energy_bill_fra_v1.md +0 -342
- data/docs/localized_products/french_healthcard_v1.md +0 -142
- data/docs/localized_products/idcard_fr_v2.md +0 -284
- data/docs/localized_products/ind_passport_v1.md +0 -307
- data/docs/localized_products/payslip_fra_v3.md +0 -344
- data/docs/localized_products/us_healthcare_cards_v1.md +0 -258
- data/docs/localized_products/us_mail_v3.md +0 -152
- data/docs/localized_products.md +0 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6bbe7fd9450d6e87f07b3b96c636ef8715200370503eb97d65db46225b6ccf8e
+  data.tar.gz: 36b67f8ca703982d40048e4d7e9a3b20e4204c7a36386028b3bd11da79560a20
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: '08dd6e3e40ea73f874647ab537f2539342aca107bb1d324c04fa8b8de1fa803f5893117bcaf0e840dd6210b55e6e7c163897bb6cba6a36389b44daeab6a886e2'
+  data.tar.gz: 6028ec9f0d38e9e7840c33c98a517d8db02aef34058b8ae517c560cdc0b88524a0e2ab4a472dfe4e1f12ef027fce5bd2153cebd628e2c66f29ac3c8733c1cec3
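The checksums.yaml entries are hex digests of the two archives packed inside the .gem file. A minimal sketch of how such a value could be checked with Ruby's standard Digest library follows; the helper name `digest_matches?` and the sample bytes are illustrative and not part of the gem or its tooling.

```ruby
require 'digest'

# Hypothetical helper: compare the SHA-256 hex digest of some bytes with an expected value.
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# In real use, the bytes would come from File.binread('data.tar.gz') inside an
# unpacked .gem archive, and the expected value from checksums.yaml.
sample = 'hello'
expected = Digest::SHA256.hexdigest(sample)
puts digest_matches?(sample, expected)       # true
puts digest_matches?("#{sample}!", expected) # false
```

The SHA512 entries can be checked the same way with `Digest::SHA512`.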
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,13 @@
 # Mindee Ruby API Library Changelog
 
+## v4.8.0 - 2025-11-18
+### Changes
+* :sparkles: add support for better errors
+* :sparkles: add support for RAG metadata
+### Fixes
+* :recycle: harmonize test structure with other libraries
+
+
 ## v4.7.2 - 2025-10-13
 ### Changes
 * :recycle: harmonize getting page count from a local input source
data/README.md
CHANGED
@@ -7,26 +7,26 @@ Quickly and easily connect to Mindee's API services using Ruby.
 ## Mindee API Versions
 This client library has support for both Mindee platform versions.
 
-###
-This is the
+### V2 - Latest
+This is the latest platform located here:
 
 https://app.mindee.com
 
 It uses **API version 2**.
 
 Consult the
-**[
+**[V2 Documentation](https://docs.mindee.com/integrations/client-libraries-sdk)**
 
 
-###
-This is the
+### V1
+This is the platform located here:
 
 https://platform.mindee.com/
 
 It uses **API version 1**.
 
 Consult the
-
+[V1 Documentation](https://docs.mindee.com/v1/libraries/ruby-sdk)
 
 ## Additional Information
 
data/Rakefile
CHANGED
data/lib/mindee/errors/mindee_http_error_v2.rb
CHANGED
@@ -1,25 +1,44 @@
 # frozen_string_literal: true
 
 require_relative 'mindee_error'
+require_relative '../parsing/v2/error_item'
 
 module Mindee
   module Errors
     # API V2 HttpError
     class MindeeHTTPErrorV2 < MindeeError
-      # @return [Integer]
+      # @return [Integer] The HTTP status code returned by the server.
       attr_reader :status
-      # @return [String]
+      # @return [String] A human-readable explanation specific to the occurrence of the problem.
       attr_reader :detail
+      # @return [String] A short, human-readable summary of the problem.
+      attr_reader :title
+      # @return [String] A machine-readable code specific to the occurrence of the problem.
+      attr_reader :code
+      # @return [Array<ErrorItem>] A list of explicit error details.
+      attr_reader :errors
 
       # @param http_error [Hash, Parsing::V2::ErrorResponse]
       def initialize(http_error)
         if http_error.is_a?(Parsing::V2::ErrorResponse)
           http_error = { 'detail' => http_error.detail,
-                         'status' => http_error.status
+                         'status' => http_error.status,
+                         'title' => http_error.title,
+                         'code' => http_error.code,
+                         'errors' => http_error.errors }
         end
         @status = http_error['status']
         @detail = http_error['detail']
-
+        @title = http_error['title']
+        @code = http_error['code']
+        @errors = if http_error.key?('errors')
+                    http_error['errors'].map do |error|
+                      Parsing::V2::ErrorItem.new(error)
+                    end
+                  else
+                    []
+                  end
+        super("HTTP #{@status} - #{@title} :: #{@code} - #{@detail}")
       end
     end
   end
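The new `title`, `code`, and `errors` readers make it possible to branch on a structured error instead of parsing the message string. A minimal sketch of that idea follows. It uses stand-in classes (`SketchHTTPErrorV2`, `SketchErrorItem`) that mirror the attributes and message format shown in this diff so the example runs without the gem; the payload values are invented, and `Array(...)` is a simplification that is not taken from the gem.

```ruby
# Stand-in for Parsing::V2::ErrorItem (illustrative only).
SketchErrorItem = Struct.new(:pointer, :detail)

# Stand-in mirroring the readers and message format of MindeeHTTPErrorV2.
class SketchHTTPErrorV2 < StandardError
  attr_reader :status, :detail, :title, :code, :errors

  def initialize(payload)
    @status = payload['status']
    @detail = payload['detail']
    @title = payload['title']
    @code = payload['code']
    # Simplification: Array(nil) is [], so a missing or nil list becomes empty.
    @errors = Array(payload['errors']).map { |e| SketchErrorItem.new(e['pointer'], e['detail']) }
    super("HTTP #{@status} - #{@title} :: #{@code} - #{@detail}")
  end
end

# Hypothetical payload; every value here is made up for illustration.
payload = {
  'status' => 422, 'title' => 'Invalid fields', 'code' => '422-001',
  'detail' => 'One or more fields failed validation.',
  'errors' => [{ 'pointer' => '/model_id', 'detail' => 'Unknown model.' }]
}

begin
  raise SketchHTTPErrorV2.new(payload)
rescue SketchHTTPErrorV2 => e
  puts e.message
  e.errors.each { |item| puts "  #{item.pointer}: #{item.detail}" }
end
```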
data/lib/mindee/errors/mindee_http_unknown_error_v2.rb
ADDED
@@ -0,0 +1,18 @@
+# frozen_string_literal: true
+
+require_relative 'mindee_error'
+
+module Mindee
+  module Errors
+    # Unknown HTTP error for the V2 API.
+    class MindeeHTTPUnknownErrorV2 < MindeeHTTPErrorV2
+      def initialize(http_error)
+        super({ 'detail' => "Couldn't deserialize server error. Found: #{http_error}",
+                'status' => -1,
+                'title' => 'Unknown Error',
+                'code' => '000-000',
+                'errors' => nil })
+      end
+    end
+  end
+end
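One detail worth checking when reading these two error classes together: the unknown-error class passes `'errors' => nil`, while the parent initializer, as shown in this diff, guards with `key?('errors')` before calling `map`. `Hash#key?` returns true for a key whose value is nil, so that guard alone does not rule out a `nil.map` call. The sketch below reproduces only that Ruby behaviour in isolation and shows one possible defensive alternative; `build_items` is a hypothetical name and none of this code is taken from the gem.

```ruby
payload = { 'errors' => nil }

# key? only checks that the key is present, not that its value is non-nil.
puts payload.key?('errors') # true

# Same guard shape as in the diff: calling map on nil raises NoMethodError.
guarded = begin
  payload.key?('errors') ? payload['errors'].map(&:itself) : []
rescue NoMethodError
  :no_method_error
end
puts guarded.inspect

# Hypothetical defensive variant: Array(nil) is [], so nil maps to an empty list.
def build_items(payload)
  Array(payload['errors']).map(&:itself)
end
puts build_items(payload).inspect
```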
data/lib/mindee/http/endpoint.rb
CHANGED
data/lib/mindee/parsing/v2/error_item.rb
ADDED
@@ -0,0 +1,21 @@
+# frozen_string_literal: true
+
+module Mindee
+  module Parsing
+    module V2
+      # Individual error item.
+      class ErrorItem
+        # @return [String, nil] A JSON Pointer to the location of the body property.
+        attr_reader :pointer
+        # @return [String, nil] Explicit information on the issue.
+        attr_reader :detail
+
+        # @param server_response [Hash] Raw JSON parsed into a Hash.
+        def initialize(server_response)
+          @pointer = server_response['pointer']
+          @detail = server_response['detail']
+        end
+      end
+    end
+  end
+end
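ErrorItem is built from a Hash with `pointer` and `detail` keys, both of which may be nil. A minimal sketch of how a JSON error body might be turned into such items follows; the JSON text is a made-up example of the shape implied by the attribute comments, and `SketchItem` is a stand-in so the code runs without the gem.

```ruby
require 'json'

# Stand-in with the same two readers as Parsing::V2::ErrorItem.
class SketchItem
  attr_reader :pointer, :detail

  def initialize(server_response)
    @pointer = server_response['pointer']
    @detail = server_response['detail']
  end
end

# Made-up error body; a real server response may differ.
body = <<~JSON
  {
    "status": 422,
    "errors": [
      { "pointer": "/webhook_ids/0", "detail": "Not a valid UUID." },
      { "detail": "No pointer given for this item." }
    ]
  }
JSON

# fetch with a default keeps the mapping safe when the key is absent.
items = JSON.parse(body).fetch('errors', []).map { |raw| SketchItem.new(raw) }
items.each { |item| puts "#{item.pointer || '(no pointer)'}: #{item.detail}" }
```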
data/lib/mindee/parsing/v2/error_response.rb
CHANGED
@@ -5,21 +5,36 @@ module Mindee
     module V2
       # Encapsulates information returned by the API when an error occurs.
       class ErrorResponse
-        # @return [Integer] HTTP status code.
+        # @return [Integer] The HTTP status code returned by the server.
         attr_reader :status
-        # @return [String]
+        # @return [String] A human-readable explanation specific to the occurrence of the problem.
         attr_reader :detail
+        # @return [String] A short, human-readable summary of the problem.
+        attr_reader :title
+        # @return [String] A machine-readable code specific to the occurrence of the problem.
+        attr_reader :code
+        # @return [Array<ErrorItem>] A list of explicit error details.
+        attr_reader :errors
 
         # @param server_response [Hash] Raw JSON parsed into a Hash.
         def initialize(server_response)
           @status = server_response['status']
           @detail = server_response['detail']
+          @title = server_response['title']
+          @code = server_response['code']
+          @errors = if server_response.key?('errors')
+                      server_response['errors'].map do |error|
+                        ErrorItem.new(error)
+                      end
+                    else
+                      []
+                    end
         end
 
         # String representation.
         # @return [String]
         def to_s
-          "
+          "HTTP #{@status} - #{@title} :: #{@code} - #{@detail}"
         end
 
         # Hash representation
data/lib/mindee/parsing/v2/inference_result.rb
CHANGED
@@ -2,6 +2,7 @@
 
 require_relative 'field/inference_fields'
 require_relative 'raw_text'
+require_relative 'rag_metadata'
 
 module Mindee
   module Parsing
@@ -12,6 +13,8 @@ module Mindee
         attr_reader :fields
         # @return [Mindee::Parsing::V2::RawText, nil] Optional extra data.
         attr_reader :raw_text
+        # @return [Mindee::Parsing::V2::RAGMetadata, nil] Optional RAG metadata.
+        attr_reader :rag
 
         # @param server_response [Hash] Hash version of the JSON returned by the API.
         def initialize(server_response)
@@ -20,6 +23,7 @@ module Mindee
           @fields = Field::InferenceFields.new(server_response['fields'])
 
           @raw_text = server_response['raw_text'] ? RawText.new(server_response['raw_text']) : nil
+          @rag = (V2::RAGMetadata.new(server_response['rag']) if server_response.key?('rag') && server_response['rag'])
         end
 
         # String representation.
data/lib/mindee/parsing/v2/rag_metadata.rb
ADDED
@@ -0,0 +1,17 @@
+# frozen_string_literal: true
+
+module Mindee
+  module Parsing
+    module V2
+      # Metadata about the RAG operation.
+      class RAGMetadata
+        # The UUID of the matched document used during the RAG operation.
+        attr_accessor :retrieved_document_id
+
+        def initialize(server_response)
+          @retrieved_document_id = server_response.fetch('retrieved_document_id', nil)
+        end
+      end
+    end
+  end
+end
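The `@rag` assignment in inference_result.rb wraps a modifier `if` in parentheses, so the attribute is nil whenever the `rag` key is absent or null, and RAGMetadata itself tolerates a missing `retrieved_document_id`. A small sketch of that Ruby idiom follows, with `SketchRAG` and `build_rag` as stand-in names and an invented UUID value.

```ruby
# Stand-in for Parsing::V2::RAGMetadata (illustrative only).
class SketchRAG
  attr_accessor :retrieved_document_id

  def initialize(server_response)
    # fetch with a nil default: a missing key yields nil instead of raising KeyError.
    @retrieved_document_id = server_response.fetch('retrieved_document_id', nil)
  end
end

# Same shape as the expression in the diff: a parenthesized modifier-if
# evaluates to nil when its condition is false.
def build_rag(server_response)
  (SketchRAG.new(server_response['rag']) if server_response.key?('rag') && server_response['rag'])
end

with_rag = build_rag({ 'rag' => { 'retrieved_document_id' => '12345678-1234-1234-1234-123456789abc' } })
puts with_rag.retrieved_document_id
puts build_rag({ 'rag' => nil }).inspect # nil
puts build_rag({}).inspect               # nil
```

Callers reading `rag` on an inference result would therefore need a nil check, for example with the safe-navigation operator (`result.rag&.retrieved_document_id`).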
data/lib/mindee/version.rb
CHANGED
data/mindee.gemspec
CHANGED
@@ -33,6 +33,7 @@ Gem::Specification.new do |spec|
   spec.add_dependency 'origamindee', '~> 4.0'
   spec.add_dependency 'pdf-reader', '~> 2.14'
 
+  spec.add_development_dependency 'openssl', '~> 3.3.2'
   spec.add_development_dependency 'prism', '~> 1.3'
   spec.add_development_dependency 'rake', '~> 13.2'
   spec.add_development_dependency 'rbs', '~> 3.6'
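The new development dependency uses the pessimistic version operator: `~> 3.3.2` accepts releases from 3.3.2 up to, but not including, 3.4.0. This can be checked with RubyGems' own `Gem::Requirement` class; the list of candidate versions below is arbitrary.

```ruby
require 'rubygems'

requirement = Gem::Requirement.new('~> 3.3.2')

# Check a few candidate versions against the constraint.
%w[3.3.1 3.3.2 3.3.9 3.4.0].each do |version|
  ok = requirement.satisfied_by?(Gem::Version.new(version))
  puts "#{version}: #{ok}"
end
```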
data/sig/mindee/errors/mindee_http_error_v2.rbs
CHANGED
@@ -3,9 +3,12 @@ module Mindee
   module Errors
     # API V2 HttpError
     class MindeeHTTPErrorV2 < MindeeError
-
       attr_reader detail: String
       attr_reader status: Integer
+      attr_reader code: String
+      attr_reader title: String
+      attr_reader errors: Array[Parsing::V2::ErrorItem]
+
       def initialize: (Hash[String, untyped] | Parsing::V2::ErrorResponse) -> void
     end
   end
data/sig/mindee/http/endpoint.rbs
CHANGED
@@ -23,6 +23,7 @@ module Mindee
       def document_queue_req_post: (Input::Source::LocalInputSource | Input::Source::URLInputSource, ParseOptions) -> Net::HTTPResponse
       def document_queue_req_get: (String) -> Net::HTTPResponse
       def check_api_key: -> void
+      def configure_ssl: (Net::HTTP) -> void
     end
   end
 end
data/sig/mindee/parsing/v2/error_response.rbs
CHANGED
@@ -5,6 +5,9 @@ module Mindee
       class ErrorResponse
         attr_reader detail: String
         attr_reader status: Integer
+        attr_reader code: String
+        attr_reader title: String
+        attr_reader errors: Array[ErrorItem]
         def initialize: (Hash[String | Symbol, untyped]) -> void
 
         def as_hash: -> Hash[Symbol, String | Integer]
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: mindee
 version: !ruby/object:Gem::Version
-  version: 4.
+  version: 4.8.0
 platform: ruby
 authors:
 - Mindee, SA
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-
+date: 2025-11-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: base64
@@ -86,6 +86,20 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '2.14'
+- !ruby/object:Gem::Dependency
+  name: openssl
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 3.3.2
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 3.3.2
 - !ruby/object:Gem::Dependency
   name: prism
   requirement: !ruby/object:Gem::Requirement
@@ -209,7 +223,6 @@ files:
 - bin/cli_products.rb
 - bin/console
 - bin/mindee.rb
-- docs/advanced_file_operations.md
 - docs/code_samples/bank_account_details_v1.txt
 - docs/code_samples/bank_account_details_v2.txt
 - docs/code_samples/bank_check_v1.txt
@@ -247,37 +260,6 @@ files:
 - docs/code_samples/us_mail_v3_async.txt
 - docs/code_samples/workflow_execution.txt
 - docs/code_samples/workflow_polling.txt
-- docs/getting_started.md
-- docs/global_products.md
-- docs/global_products/barcode_reader_v1.md
-- docs/global_products/bill_of_lading_v1.md
-- docs/global_products/business_card_v1.md
-- docs/global_products/cropper_v1.md
-- docs/global_products/delivery_notes_v1.md
-- docs/global_products/driver_license_v1.md
-- docs/global_products/expense_receipts_v5.md
-- docs/global_products/financial_document_v1.md
-- docs/global_products/international_id_v2.md
-- docs/global_products/invoice_splitter_v1.md
-- docs/global_products/invoices_v4.md
-- docs/global_products/multi_receipts_detector_v1.md
-- docs/global_products/nutrition_facts_v1.md
-- docs/global_products/passport_v1.md
-- docs/global_products/resume_v1.md
-- docs/global_products/universal.md
-- docs/loading_a_document.md
-- docs/localized_products.md
-- docs/localized_products/bank_account_details_v2.md
-- docs/localized_products/bank_check_v1.md
-- docs/localized_products/bank_statement_fr_v2.md
-- docs/localized_products/carte_grise_v1.md
-- docs/localized_products/energy_bill_fra_v1.md
-- docs/localized_products/french_healthcard_v1.md
-- docs/localized_products/idcard_fr_v2.md
-- docs/localized_products/ind_passport_v1.md
-- docs/localized_products/payslip_fra_v3.md
-- docs/localized_products/us_healthcare_cards_v1.md
-- docs/localized_products/us_mail_v3.md
 - examples/auto_invoice_splitter_extraction.rb
 - examples/auto_multi_receipts_detector_extraction.rb
 - lib/mindee.rb
@@ -287,6 +269,7 @@ files:
 - lib/mindee/errors/mindee_error.rb
 - lib/mindee/errors/mindee_http_error.rb
 - lib/mindee/errors/mindee_http_error_v2.rb
+- lib/mindee/errors/mindee_http_unknown_error_v2.rb
 - lib/mindee/errors/mindee_input_error.rb
 - lib/mindee/extraction.rb
 - lib/mindee/extraction/multi_receipts_extractor.rb
@@ -366,6 +349,7 @@ files:
 - lib/mindee/parsing/universal/universal_object_field.rb
 - lib/mindee/parsing/v2.rb
 - lib/mindee/parsing/v2/common_response.rb
+- lib/mindee/parsing/v2/error_item.rb
 - lib/mindee/parsing/v2/error_response.rb
 - lib/mindee/parsing/v2/field.rb
 - lib/mindee/parsing/v2/field/base_field.rb
@@ -384,6 +368,7 @@ files:
 - lib/mindee/parsing/v2/job.rb
 - lib/mindee/parsing/v2/job_response.rb
 - lib/mindee/parsing/v2/job_webhook.rb
+- lib/mindee/parsing/v2/rag_metadata.rb
 - lib/mindee/parsing/v2/raw_text.rb
 - lib/mindee/parsing/v2/raw_text_page.rb
 - lib/mindee/pdf.rb
@@ -571,6 +556,7 @@ files:
 - sig/mindee/errors/mindee_error.rbs
 - sig/mindee/errors/mindee_http_error.rbs
 - sig/mindee/errors/mindee_http_error_v2.rbs
+- sig/mindee/errors/mindee_http_unknown_error_v2.rbs
 - sig/mindee/errors/mindee_input_error.rbs
 - sig/mindee/extraction/multi_receipts_extractor.rbs
 - sig/mindee/geometry/min_max.rbs
@@ -635,6 +621,7 @@ files:
 - sig/mindee/parsing/universal/universal_list_field.rbs
 - sig/mindee/parsing/universal/universal_object_field.rbs
 - sig/mindee/parsing/v2/common_response.rbs
+- sig/mindee/parsing/v2/error_item.rbs
 - sig/mindee/parsing/v2/error_response.rbs
 - sig/mindee/parsing/v2/field/base_field.rbs
 - sig/mindee/parsing/v2/field/field_confidence.rbs
@@ -652,6 +639,7 @@ files:
 - sig/mindee/parsing/v2/job.rbs
 - sig/mindee/parsing/v2/job_response.rbs
 - sig/mindee/parsing/v2/job_webhook.rbs
+- sig/mindee/parsing/v2/rag_metadata.rbs
 - sig/mindee/parsing/v2/raw_text.rbs
 - sig/mindee/parsing/v2/raw_text_page.rbs
 - sig/mindee/pdf/extracted_pdf.rbs
data/docs/advanced_file_operations.md
DELETED
@@ -1,109 +0,0 @@
----
-title: Advanced File Operations
-category: 622b805aaec68102ea7fcbc2
-slug: ruby-advanced-file-operations
-parentDoc: 6294d97ee723f1008d2ab28e
----
-
-> ❗️ Disclaimer: the file operations listed below do not directly manipulate the files you will pass to the library,
-> They will instead create a copy before applying any operations, which means that the file you send may not be an exact copy of the file the server will receive.
-> To avoid any unexpected or unwanted result, you can save a copy of the created file locally to inspect it visually before sending it.
-
-## Image compression
-
-The compression functionality for image files (JPEG, PNG, etc.) via the `compress!` method available on a
-LocalInputSource. This method allows you to reduce file size by specifying quality and dimension constraints.
-
-Example:
-
-```rb
-# Compress an image with custom parameters.
-input_source.compress!(quality: 85, max_width: 1024, max_height: 768)
-```
-> ⚠️ Warning: Compression alters the original image data.
-> We strongly advise you inspect a compressed file before sending it:
-> ```rb
-> # Compress using a quality of 50%:
-> input_source.compress!(quality: 50)
-> input_source.write_to_file('path/to/my/compressed/file_50.jpg')
-> ```
-
-For reference, here's what the following levels of compression on this image will look like:
-
-**Original:**
-
-
-**85% compressed:**
-
-
-**50% compressed:**
-
-
-**10% compressed:**
-
-
-
-## PDF operations
-
-PDF operations include both compression and fixing features.
-These are specifically designed to handle challenges associated with PDF files, such as large file sizes and formatting
-issues.
-
-### PDF compression
-
-> 🧪 PDF compression is an **experimental** feature that rasterizes each page of the PDF (similar to how images are
-> compressed) to reduce its overall size.
->
-> Because the process involves re-rendering the PDF’s contents, some source text may be lost or rendered differently.
-> Use this feature with caution.
-
-
-```rb
-# Load a local input source.
-input_file_path = "path/to/your/file.pdf"
-output_file_path = "path/to/the/compressed/file.pdf"
-pdf_input = Mindee::Input::Source::PathInputSource.new(input_file_path)
-
-# We advise you test the quality value yourself, as results may vary greatly depending on the input file
-pdf_input.compress!(quality: 50)
-
-# Write the output file locally for visual checking:
-File.write(output_file_path, pdf_input.io_stream.read)
-```
-
-> 🚧 Be warned that the source text (the text embedded in the PDF itself) might not render properly,
-> and so source PDFs will be ignored by default.
->
-> You can bypass this using:
-
-```rb
-pdf_input.compress!(quality: 50, force_source_text: true)
-```
-
-Or alternatively, you can try to approximate the re-rendering of the source-text using:
-
-```rb
-pdf_input.compress!(quality: 50, force_source_text: true, disable_source_text: false)
-```
-
-### PDF Repair
-
-The PDF repair feature attempts to rescue PDFs with invalid or broken header information.
-This can sometimes help when files get rejected by the server.
-
-Example:
-```rb
-# Load a PDF file with the repair_pdf flag enabled.
-input_source = mindee_client.source_from_file(file, "document.pdf", repair_pdf: true)
-```
-
-> ⚠️ Warning: PDF fixing alters the input file by re-writing header information.
-> Use this feature only when required, as it might affect the integrity of the document file.
-
----
-
-Feel free to expand these examples and adjust the parameters as needed for your projects. For further details on
-authentication and usage, you can refer to the [Getting Started Guide](getting_started.md).
-
-# Questions?
-[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-2d0ds7dtz-DPAF81ZqTy20chsYpQBW5g)