mindee 3.1.0 → 3.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +21 -0
- data/README.md +57 -7
- data/bin/mindee.rb +160 -83
- data/docs/bank_account_details_v2.md +137 -0
- data/docs/bank_check_v1.md +179 -0
- data/docs/barcode_reader_v1.md +104 -0
- data/docs/carte_vitale_v1.md +123 -0
- data/docs/code_samples/barcode_reader_v1.txt +19 -0
- data/docs/code_samples/cropper_v1.txt +16 -0
- data/docs/code_samples/idcard_fr_v2.txt +19 -0
- data/docs/code_samples/invoice_splitter_v1_async.txt +6 -54
- data/docs/code_samples/multi_receipts_detector_v1.txt +19 -0
- data/docs/code_samples/us_w9_v1.txt +16 -0
- data/docs/cropper_v1.md +97 -0
- data/docs/custom_v1.md +101 -0
- data/docs/expense_receipts_v5.md +306 -0
- data/docs/financial_document_v1.md +384 -0
- data/docs/{ruby-getting-started.md → getting_started.md} +22 -6
- data/docs/idcard_fr_v2.md +253 -0
- data/docs/invoice_splitter_v1.md +85 -0
- data/docs/invoices_v4.md +369 -0
- data/docs/license_plates_v1.md +91 -0
- data/docs/multi_receipts_detector_v1.md +105 -0
- data/docs/passport_v1.md +186 -0
- data/docs/proof_of_address_v1.md +207 -0
- data/docs/us_driver_license_v1.md +268 -0
- data/docs/us_w9_v1.md +207 -0
- data/lib/mindee/client.rb +95 -16
- data/lib/mindee/geometry/quadrilateral.rb +5 -0
- data/lib/mindee/http/.rubocop.yml +8 -0
- data/lib/mindee/http/endpoint.rb +14 -6
- data/lib/mindee/http/error.rb +104 -0
- data/lib/mindee/http.rb +1 -0
- data/lib/mindee/input/sources.rb +84 -15
- data/lib/mindee/parsing/common/api_response.rb +11 -1
- data/lib/mindee/parsing/common/inference.rb +2 -2
- data/lib/mindee/parsing/common/ocr/ocr.rb +1 -0
- data/lib/mindee/parsing/common.rb +0 -1
- data/lib/mindee/parsing/standard/company_registration_field.rb +1 -1
- data/lib/mindee/parsing/standard/locale_field.rb +1 -1
- data/lib/mindee/parsing/standard/payment_details_field.rb +1 -1
- data/lib/mindee/parsing/standard/position_field.rb +10 -3
- data/lib/mindee/parsing/standard/{text_field.rb → string_field.rb} +1 -1
- data/lib/mindee/parsing/standard.rb +1 -1
- data/lib/mindee/pdf/pdf_processing.rb +2 -1
- data/lib/mindee/product/barcode_reader/barcode_reader_v1.rb +37 -0
- data/lib/mindee/product/barcode_reader/barcode_reader_v1_document.rb +44 -0
- data/lib/mindee/product/barcode_reader/barcode_reader_v1_page.rb +32 -0
- data/lib/mindee/product/cropper/cropper_v1.rb +37 -0
- data/lib/mindee/product/cropper/cropper_v1_document.rb +13 -0
- data/lib/mindee/product/cropper/cropper_v1_page.rb +49 -0
- data/lib/mindee/product/custom/custom_v1.rb +1 -0
- data/lib/mindee/product/eu/license_plate/license_plate_v1.rb +1 -0
- data/lib/mindee/product/eu/license_plate/license_plate_v1_document.rb +2 -2
- data/lib/mindee/product/financial_document/financial_document_v1.rb +1 -0
- data/lib/mindee/product/financial_document/financial_document_v1_document.rb +24 -24
- data/lib/mindee/product/fr/bank_account_details/bank_account_details_v1.rb +1 -0
- data/lib/mindee/product/fr/bank_account_details/bank_account_details_v1_document.rb +6 -6
- data/lib/mindee/product/fr/bank_account_details/bank_account_details_v2.rb +1 -0
- data/lib/mindee/product/fr/bank_account_details/bank_account_details_v2_document.rb +6 -6
- data/lib/mindee/product/fr/carte_vitale/carte_vitale_v1.rb +1 -0
- data/lib/mindee/product/fr/carte_vitale/carte_vitale_v1_document.rb +6 -6
- data/lib/mindee/product/fr/id_card/id_card_v1.rb +1 -0
- data/lib/mindee/product/fr/id_card/id_card_v1_document.rb +16 -16
- data/lib/mindee/product/fr/id_card/id_card_v2.rb +39 -0
- data/lib/mindee/product/fr/id_card/id_card_v2_document.rb +107 -0
- data/lib/mindee/product/fr/id_card/id_card_v2_page.rb +53 -0
- data/lib/mindee/product/invoice/invoice_v4.rb +1 -0
- data/lib/mindee/product/invoice/invoice_v4_document.rb +24 -24
- data/lib/mindee/product/invoice_splitter/invoice_splitter_v1.rb +1 -0
- data/lib/mindee/product/invoice_splitter/invoice_splitter_v1_document.rb +5 -3
- data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1.rb +37 -0
- data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1_document.rb +35 -0
- data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1_page.rb +32 -0
- data/lib/mindee/product/passport/passport_v1.rb +1 -0
- data/lib/mindee/product/passport/passport_v1_document.rb +16 -16
- data/lib/mindee/product/proof_of_address/proof_of_address_v1.rb +1 -0
- data/lib/mindee/product/proof_of_address/proof_of_address_v1_document.rb +14 -14
- data/lib/mindee/product/receipt/receipt_v4_document.rb +6 -6
- data/lib/mindee/product/receipt/receipt_v5.rb +1 -0
- data/lib/mindee/product/receipt/receipt_v5_document.rb +12 -12
- data/lib/mindee/product/us/bank_check/bank_check_v1.rb +1 -0
- data/lib/mindee/product/us/bank_check/bank_check_v1_document.rb +8 -8
- data/lib/mindee/product/us/driver_license/driver_license_v1.rb +1 -0
- data/lib/mindee/product/us/driver_license/driver_license_v1_document.rb +28 -28
- data/lib/mindee/product/us/w9/w9_v1.rb +39 -0
- data/lib/mindee/product/us/w9/w9_v1_document.rb +15 -0
- data/lib/mindee/product/us/w9/w9_v1_page.rb +102 -0
- data/lib/mindee/product.rb +5 -0
- data/lib/mindee/version.rb +5 -1
- data/lib/mindee.rb +47 -0
- metadata +43 -9
- data/docs/ruby-api-builder.md +0 -123
- data/docs/ruby-invoice-ocr.md +0 -271
- data/docs/ruby-passport-ocr.md +0 -165
- data/docs/ruby-receipt-ocr.md +0 -196
- data/lib/mindee/parsing/common/error.rb +0 -24
|
@@ -0,0 +1,384 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: Financial Document OCR Ruby
|
|
3
|
+
---
|
|
4
|
+
The Ruby OCR SDK supports the [Financial Document API](https://platform.mindee.com/mindee/financial_document).
|
|
5
|
+
|
|
6
|
+
Using the [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/financial_document/default_sample.jpg), we are going to illustrate how to extract the data that we want using the OCR SDK.
|
|
7
|
+

|
|
8
|
+
|
|
9
|
+
# Quick-Start
|
|
10
|
+
```rb
|
|
11
|
+
require 'mindee'
|
|
12
|
+
|
|
13
|
+
# Init a new client
|
|
14
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
|
15
|
+
|
|
16
|
+
# Load a file from disk
|
|
17
|
+
input_source = mindee_client.source_from_path('/path/to/the/file.ext')
|
|
18
|
+
|
|
19
|
+
# Parse the file
|
|
20
|
+
result = mindee_client.parse(
|
|
21
|
+
input_source,
|
|
22
|
+
Mindee::Product::FinancialDocument::FinancialDocumentV1
|
|
23
|
+
)
|
|
24
|
+
|
|
25
|
+
# Print a full summary of the parsed data in RST format
|
|
26
|
+
puts result.document
|
|
27
|
+
|
|
28
|
+
# Print the document-level parsed data
|
|
29
|
+
# puts result.document.inference.prediction
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
**Output (RST):**
|
|
33
|
+
```rst
|
|
34
|
+
########
|
|
35
|
+
Document
|
|
36
|
+
########
|
|
37
|
+
:Mindee ID: 81c1d637-3a84-41d9-b40a-f72ca2a58826
|
|
38
|
+
:Filename: default_sample.jpg
|
|
39
|
+
|
|
40
|
+
Inference
|
|
41
|
+
#########
|
|
42
|
+
:Product: mindee/financial_document v1.1
|
|
43
|
+
:Rotation applied: Yes
|
|
44
|
+
|
|
45
|
+
Prediction
|
|
46
|
+
==========
|
|
47
|
+
:Locale: en; en; USD;
|
|
48
|
+
:Invoice Number:
|
|
49
|
+
:Reference Numbers:
|
|
50
|
+
:Purchase Date: 2014-07-07
|
|
51
|
+
:Due Date: 2014-07-07
|
|
52
|
+
:Total Net: 40.48
|
|
53
|
+
:Total Amount: 53.82
|
|
54
|
+
:Taxes:
|
|
55
|
+
+---------------+--------+----------+---------------+
|
|
56
|
+
| Base | Code | Rate (%) | Amount |
|
|
57
|
+
+===============+========+==========+===============+
|
|
58
|
+
| | TAX | | 3.34 |
|
|
59
|
+
+---------------+--------+----------+---------------+
|
|
60
|
+
:Supplier Payment Details:
|
|
61
|
+
:Supplier name: LOGANS
|
|
62
|
+
:Supplier Company Registrations:
|
|
63
|
+
:Supplier Address: 2513 s stemmons freeway lewisville tx 75067
|
|
64
|
+
:Supplier Phone Number: 9724596042
|
|
65
|
+
:Customer name:
|
|
66
|
+
:Customer Company Registrations:
|
|
67
|
+
:Customer Address:
|
|
68
|
+
:Document Type: EXPENSE RECEIPT
|
|
69
|
+
:Purchase Subcategory: restaurant
|
|
70
|
+
:Purchase Category: food
|
|
71
|
+
:Total Tax: 3.34
|
|
72
|
+
:Tip and Gratuity: 10.00
|
|
73
|
+
:Purchase Time: 20:20
|
|
74
|
+
:Line Items:
|
|
75
|
+
+--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
|
|
76
|
+
| Description | Product code | Quantity | Tax Amount | Tax Rate (%) | Total Amount | Unit Price |
|
|
77
|
+
+======================================+==============+==========+============+==============+==============+============+
|
|
78
|
+
| TAX | | | | | 3.34 | |
|
|
79
|
+
+--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
|
|
80
|
+
|
|
81
|
+
Page Predictions
|
|
82
|
+
================
|
|
83
|
+
|
|
84
|
+
Page 0
|
|
85
|
+
------
|
|
86
|
+
:Locale: en; en; USD;
|
|
87
|
+
:Invoice Number:
|
|
88
|
+
:Reference Numbers:
|
|
89
|
+
:Purchase Date: 2014-07-07
|
|
90
|
+
:Due Date: 2014-07-07
|
|
91
|
+
:Total Net: 40.48
|
|
92
|
+
:Total Amount: 53.82
|
|
93
|
+
:Taxes:
|
|
94
|
+
+---------------+--------+----------+---------------+
|
|
95
|
+
| Base | Code | Rate (%) | Amount |
|
|
96
|
+
+===============+========+==========+===============+
|
|
97
|
+
| | TAX | | 3.34 |
|
|
98
|
+
+---------------+--------+----------+---------------+
|
|
99
|
+
:Supplier Payment Details:
|
|
100
|
+
:Supplier name: LOGANS
|
|
101
|
+
:Supplier Company Registrations:
|
|
102
|
+
:Supplier Address: 2513 s stemmons freeway lewisville tx 75067
|
|
103
|
+
:Supplier Phone Number: 9724596042
|
|
104
|
+
:Customer name:
|
|
105
|
+
:Customer Company Registrations:
|
|
106
|
+
:Customer Address:
|
|
107
|
+
:Document Type: EXPENSE RECEIPT
|
|
108
|
+
:Purchase Subcategory: restaurant
|
|
109
|
+
:Purchase Category: food
|
|
110
|
+
:Total Tax: 3.34
|
|
111
|
+
:Tip and Gratuity: 10.00
|
|
112
|
+
:Purchase Time: 20:20
|
|
113
|
+
:Line Items:
|
|
114
|
+
+--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
|
|
115
|
+
| Description | Product code | Quantity | Tax Amount | Tax Rate (%) | Total Amount | Unit Price |
|
|
116
|
+
+======================================+==============+==========+============+==============+==============+============+
|
|
117
|
+
| TAX | | | | | 3.34 | |
|
|
118
|
+
+--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
# Field Types
|
|
122
|
+
## Standard Fields
|
|
123
|
+
These fields are generic and used in several products.
|
|
124
|
+
|
|
125
|
+
### Basic Field
|
|
126
|
+
Each prediction object contains a set of fields that inherit from the generic `Field` class.
|
|
127
|
+
A typical `Field` object will have the following attributes:
|
|
128
|
+
|
|
129
|
+
* **value** (`String`, `Float`, `Integer`, `Boolean`): corresponds to the field value. Can be `nil` if no value was extracted.
|
|
130
|
+
* **confidence** (`Float`, `nil`): the confidence score of the field prediction.
|
|
131
|
+
* **bounding_box** (`Mindee::Geometry::Quadrilateral`, `nil`): contains the relative coordinates of exactly 4 vertices (points) forming a right rectangle that contains the field in the document.
|
|
132
|
+
* **polygon** (`Mindee::Geometry::Polygon`, `nil`): contains the relative coordinates of the vertices (`Point`) of a polygon containing the field in the image.
|
|
133
|
+
* **page_id** (`Integer`, `nil`): the ID of the page; `nil` at document level.
|
|
134
|
+
* **reconstructed** (`Boolean`): indicates whether or not an object was reconstructed (not extracted as the API gave it).
|
|
135
|
+
|
|
136
|
+
|
|
137
|
+
Aside from the previous attributes, all basic fields have access to a `to_s` method that can be used to print their value as a string.
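
Because `value` and `confidence` can both be `nil`, it can help to guard every access. The following is a minimal sketch, not SDK code: `DemoField` is a hypothetical stand-in for a parsed field, `trusted_value` is an illustrative helper, and the confidence scores and the `0.8` threshold are made up.

```ruby
# Hypothetical stand-in for an SDK field object; only the attributes used here.
DemoField = Struct.new(:value, :confidence, :page_id, keyword_init: true)

# Return the value only if it was extracted and meets the confidence threshold.
def trusted_value(field, min_confidence: 0.8)
  return nil if field.value.nil?
  return nil if field.confidence.nil? || field.confidence < min_confidence

  field.value
end

supplier = DemoField.new(value: 'LOGANS', confidence: 0.93, page_id: 0) # made-up score
customer = DemoField.new(value: nil, confidence: 0.0, page_id: nil)

p trusted_value(supplier)
p trusted_value(customer)
p trusted_value(supplier, min_confidence: 0.95)
```

With a real response, the same helper could be applied to any object exposing `value` and `confidence`.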
|
|
138
|
+
|
|
139
|
+
|
|
140
|
+
### Amount Field
|
|
141
|
+
The amount field `AmountField` only has one constraint: its **value** is a `Float` (or `nil`).
|
|
142
|
+
|
|
143
|
+
|
|
144
|
+
### Classification Field
|
|
145
|
+
The classification field `ClassificationField` does not implement all the basic `Field` attributes. It only implements **value**, **confidence** and **page_id**.
|
|
146
|
+
|
|
147
|
+
> Note: a classification field's `value` is always a `String`.
|
|
148
|
+
|
|
149
|
+
|
|
150
|
+
### Company Registration Field
|
|
151
|
+
Aside from the basic `Field` attributes, the company registration field `CompanyRegistrationField` also implements the following:
|
|
152
|
+
|
|
153
|
+
* **type** (`String`): the type of company.
|
|
154
|
+
|
|
155
|
+
### Date Field
|
|
156
|
+
Aside from the basic `Field` attributes, the date field `DateField` also implements the following:
|
|
157
|
+
|
|
158
|
+
* **date_object** (`Date`): an accessible representation of the value as a Ruby `Date` object.
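
In Ruby, a value like this maps naturally onto the standard library's `Date` class. The sketch below assumes `date_object` behaves like a plain `Date`. It rebuilds one from the purchase date string shown in the sample output, and the comparison date is arbitrary.

```ruby
require 'date'

# Purchase date string taken from the sample output above.
raw_value = '2014-07-07'

# Assumption: date_object is equivalent to parsing the ISO 8601 string.
date_object = Date.iso8601(raw_value)

# Date arithmetic yields a Rational number of days; to_i makes it an Integer.
reference_day = Date.new(2014, 7, 10) # arbitrary comparison date
days_elapsed = (reference_day - date_object).to_i

puts date_object.year
puts date_object.strftime('%d/%m/%Y')
puts days_elapsed
```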
|
|
159
|
+
|
|
160
|
+
### Locale Field
|
|
161
|
+
The locale field `LocaleField` only implements the **value**, **confidence** and **page_id** base `Field` attributes, but it comes with its own:
|
|
162
|
+
|
|
163
|
+
* **language** (`String`): ISO 639-1 language code (e.g.: `en` for English). Can be `nil`.
|
|
164
|
+
* **country** (`String`): ISO 3166-1 alpha-2 or ISO 3166-1 alpha-3 code for countries (e.g.: `GBR` or `GB` for "Great Britain"). Can be `nil`.
|
|
165
|
+
* **currency** (`String`): ISO 4217 code for currencies (e.g.: `USD` for "US Dollars"). Can be `nil`.
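
Since all three locale attributes may be `nil`, a shape check can be useful before storing them. This is a rough sketch under stated assumptions: `DemoLocale` is a hypothetical stand-in for the locale object, and the patterns only check the form of the codes (two lowercase letters, two or three uppercase letters, three uppercase letters), not whether the codes are actually assigned.

```ruby
# Hypothetical stand-in for the locale object described above.
DemoLocale = Struct.new(:language, :country, :currency, keyword_init: true)

# Shape checks only; these do not prove a code is officially assigned.
LOCALE_PATTERNS = {
  language: /\A[a-z]{2}\z/,   # ISO 639-1 shape
  country: /\A[A-Z]{2,3}\z/,  # ISO 3166-1 alpha-2 or alpha-3 shape
  currency: /\A[A-Z]{3}\z/    # ISO 4217 shape
}.freeze

# Return the names of the attributes that are present but badly shaped.
def malformed_locale_parts(locale)
  bad = LOCALE_PATTERNS.reject do |name, pattern|
    value = locale[name]
    value.nil? || value.match?(pattern)
  end
  bad.keys
end

sample = DemoLocale.new(language: 'en', country: nil, currency: 'USD')
p malformed_locale_parts(sample)
```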
|
|
166
|
+
|
|
167
|
+
### Payment Details Field
|
|
168
|
+
Aside from the basic `Field` attributes, the payment details field `PaymentDetailsField` also implements the following:
|
|
169
|
+
|
|
170
|
+
* **account_number** (`String`): number of an account, expressed as a string. Can be `nil`.
|
|
171
|
+
* **iban** (`String`): International Bank Account Number. Can be `nil`.
|
|
172
|
+
* **routing_number** (`String`): routing number of an account. Can be `nil`.
|
|
173
|
+
* **swift** (`String`): the account holder's bank's SWIFT Business Identifier Code (BIC). Can be `nil`.
|
|
174
|
+
|
|
175
|
+
### String Field
|
|
176
|
+
The string field `StringField` only has one constraint: its **value** is a `String` (or `nil`).
|
|
177
|
+
|
|
178
|
+
### Taxes Field
|
|
179
|
+
#### Tax
|
|
180
|
+
Aside from the basic `Field` attributes, the tax field `TaxField` also implements the following:
|
|
181
|
+
|
|
182
|
+
* **rate** (`Float`): the tax rate applied to an item, expressed as a percentage. Can be `nil`.
|
|
183
|
+
* **code** (`String`): tax code (or equivalent, depending on the origin of the document). Can be `nil`.
|
|
184
|
+
* **base** (`Float`): base amount used for the tax. Can be `nil`.
|
|
185
|
+
|
|
186
|
+
> Note: currently `TaxField` is not used on its own, and is accessed through a parent `Taxes` object, an array-like structure.
|
|
187
|
+
|
|
188
|
+
#### Taxes (Array)
|
|
189
|
+
The `Taxes` field represents an array-like collection of `TaxField` objects. As it is the representation of several objects, it has access to a custom `to_s` method that can render a `TaxField` object as a table line.
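
To make the idea of rendering each tax as a table line concrete, here is a minimal sketch. It is not the SDK's `to_s` implementation: `DemoTax` and `tax_line` are hypothetical, and the column order (base, code, rate, amount) simply mirrors the sample output above.

```ruby
# Hypothetical stand-in for a tax entry; value holds the tax amount.
DemoTax = Struct.new(:value, :rate, :code, :base, keyword_init: true)

# Render one tax as a table line, leaving cells empty for nil attributes.
def tax_line(tax)
  cells = [tax.base, tax.code, tax.rate, tax.value].map do |cell|
    cell.nil? ? '' : cell.to_s
  end
  "| #{cells.join(' | ')} |"
end

# Single tax line from the sample output: only code and amount were extracted.
taxes = [DemoTax.new(value: 3.34, rate: nil, code: 'TAX', base: nil)]

taxes.each { |tax| puts tax_line(tax) }

# Sum the amounts, treating missing values as zero.
tax_total = taxes.sum { |tax| tax.value || 0.0 }
puts tax_total
```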
|
|
190
|
+
|
|
191
|
+
## Specific Fields
|
|
192
|
+
Fields which are specific to this product; they are not used in any other product.
|
|
193
|
+
|
|
194
|
+
### Line Items Field
|
|
195
|
+
List of line item details.
|
|
196
|
+
|
|
197
|
+
A `FinancialDocumentV1LineItem` implements the following attributes:
|
|
198
|
+
|
|
199
|
+
* `description` (String): The item description.
|
|
200
|
+
* `product_code` (String): The product code referring to the item.
|
|
201
|
+
* `quantity` (Float): The item quantity.
|
|
202
|
+
* `tax_amount` (Float): The item tax amount.
|
|
203
|
+
* `tax_rate` (Float): The item tax rate in percentage.
|
|
204
|
+
* `total_amount` (Float): The item total amount.
|
|
205
|
+
* `unit_price` (Float): The item unit price.
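
Line items often come back partially filled, as in the sample above where only the description and total were extracted. The sketch below shows one possible way to fall back on `quantity * unit_price` when `total_amount` is missing. `DemoLineItem` and `effective_total` are hypothetical names, and the `Coffee` item is invented for illustration.

```ruby
# Hypothetical stand-in exposing a subset of the attributes listed above.
DemoLineItem = Struct.new(:description, :quantity, :unit_price, :total_amount,
                          keyword_init: true)

# Prefer the extracted total; otherwise derive it when both factors exist.
def effective_total(item)
  return item.total_amount unless item.total_amount.nil?
  return nil if item.quantity.nil? || item.unit_price.nil?

  (item.quantity * item.unit_price).round(2)
end

items = [
  DemoLineItem.new(description: 'TAX', total_amount: 3.34),                # from the sample
  DemoLineItem.new(description: 'Coffee', quantity: 2.0, unit_price: 1.75) # invented
]

items.each do |item|
  puts "#{item.description}: #{effective_total(item).inspect}"
end
```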
|
|
206
|
+
|
|
207
|
+
# Attributes
|
|
208
|
+
The following fields are extracted for Financial Document V1:
|
|
209
|
+
|
|
210
|
+
## Purchase Category
|
|
211
|
+
**category** ([ClassificationField](#classification-field)): The purchase category among predefined classes.
|
|
212
|
+
|
|
213
|
+
```rb
|
|
214
|
+
puts result.document.inference.prediction.category.value
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
## Customer Address
|
|
218
|
+
**customer_address** ([StringField](#string-field)): The address of the customer.
|
|
219
|
+
|
|
220
|
+
```rb
|
|
221
|
+
puts result.document.inference.prediction.customer_address.value
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
## Customer Company Registrations
|
|
225
|
+
**customer_company_registrations** (Array<[CompanyRegistrationField](#company-registration-field)>): List of company registrations associated with the customer.
|
|
226
|
+
|
|
227
|
+
```rb
|
|
228
|
+
for customer_company_registrations_elem in result.document.inference.prediction.customer_company_registrations do
|
|
229
|
+
puts customer_company_registrations_elem.value
|
|
230
|
+
end
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
## Customer Name
|
|
234
|
+
**customer_name** ([StringField](#string-field)): The name of the customer.
|
|
235
|
+
|
|
236
|
+
```rb
|
|
237
|
+
puts result.document.inference.prediction.customer_name.value
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
## Purchase Date
|
|
241
|
+
**date** ([DateField](#date-field)): The date the purchase was made.
|
|
242
|
+
|
|
243
|
+
```rb
|
|
244
|
+
puts result.document.inference.prediction.date.value
|
|
245
|
+
```
|
|
246
|
+
|
|
247
|
+
## Document Type
|
|
248
|
+
**document_type** ([ClassificationField](#classification-field)): One of: 'INVOICE', 'CREDIT NOTE', 'CREDIT CARD RECEIPT', 'EXPENSE RECEIPT'.
|
|
249
|
+
|
|
250
|
+
```rb
|
|
251
|
+
puts result.document.inference.prediction.document_type.value
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
## Due Date
|
|
255
|
+
**due_date** ([DateField](#date-field)): The date on which the payment is due.
|
|
256
|
+
|
|
257
|
+
```rb
|
|
258
|
+
puts result.document.inference.prediction.due_date.value
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
## Invoice Number
|
|
262
|
+
**invoice_number** ([StringField](#string-field)): The invoice number or identifier.
|
|
263
|
+
|
|
264
|
+
```rb
|
|
265
|
+
puts result.document.inference.prediction.invoice_number.value
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
## Line Items
|
|
269
|
+
**line_items** (Array<[FinancialDocumentV1LineItem](#line-items-field)>): List of line item details.
|
|
270
|
+
|
|
271
|
+
```rb
|
|
272
|
+
for line_items_elem in result.document.inference.prediction.line_items do
|
|
273
|
+
  puts line_items_elem.description
|
|
274
|
+
end
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
## Locale
|
|
278
|
+
**locale** ([LocaleField](#locale-field)): The locale detected on the document.
|
|
279
|
+
|
|
280
|
+
```rb
|
|
281
|
+
puts result.document.inference.prediction.locale.value
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
## Reference Numbers
|
|
285
|
+
**reference_numbers** (Array<[StringField](#string-field)>): List of reference numbers, including PO number.
|
|
286
|
+
|
|
287
|
+
```rb
|
|
288
|
+
for reference_numbers_elem in result.document.inference.prediction.reference_numbers do
|
|
289
|
+
puts reference_numbers_elem.value
|
|
290
|
+
end
|
|
291
|
+
```
|
|
292
|
+
|
|
293
|
+
## Purchase Subcategory
|
|
294
|
+
**subcategory** ([ClassificationField](#classification-field)): The purchase subcategory among predefined classes for transport and food.
|
|
295
|
+
|
|
296
|
+
```rb
|
|
297
|
+
puts result.document.inference.prediction.subcategory.value
|
|
298
|
+
```
|
|
299
|
+
|
|
300
|
+
## Supplier Address
|
|
301
|
+
**supplier_address** ([StringField](#string-field)): The address of the supplier or merchant.
|
|
302
|
+
|
|
303
|
+
```rb
|
|
304
|
+
puts result.document.inference.prediction.supplier_address.value
|
|
305
|
+
```
|
|
306
|
+
|
|
307
|
+
## Supplier Company Registrations
|
|
308
|
+
**supplier_company_registrations** (Array<[CompanyRegistrationField](#company-registration-field)>): List of company registrations associated with the supplier.
|
|
309
|
+
|
|
310
|
+
```rb
|
|
311
|
+
for supplier_company_registrations_elem in result.document.inference.prediction.supplier_company_registrations do
|
|
312
|
+
puts supplier_company_registrations_elem.value
|
|
313
|
+
end
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
## Supplier name
|
|
317
|
+
**supplier_name** ([StringField](#string-field)): The name of the supplier or merchant.
|
|
318
|
+
|
|
319
|
+
```rb
|
|
320
|
+
puts result.document.inference.prediction.supplier_name.value
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
## Supplier Payment Details
|
|
324
|
+
**supplier_payment_details** (Array<[PaymentDetailsField](#payment-details-field)>): List of payment details associated with the supplier.
|
|
325
|
+
|
|
326
|
+
```rb
|
|
327
|
+
for supplier_payment_details_elem in result.document.inference.prediction.supplier_payment_details do
|
|
328
|
+
puts supplier_payment_details_elem.value
|
|
329
|
+
end
|
|
330
|
+
```
|
|
331
|
+
|
|
332
|
+
## Supplier Phone Number
|
|
333
|
+
**supplier_phone_number** ([StringField](#string-field)): The phone number of the supplier or merchant.
|
|
334
|
+
|
|
335
|
+
```rb
|
|
336
|
+
puts result.document.inference.prediction.supplier_phone_number.value
|
|
337
|
+
```
|
|
338
|
+
|
|
339
|
+
## Taxes
|
|
340
|
+
**taxes** (Array<[TaxField](#taxes-field)>): List of tax line information.
|
|
341
|
+
|
|
342
|
+
```rb
|
|
343
|
+
for taxes_elem in result.document.inference.prediction.taxes do
|
|
344
|
+
puts taxes_elem.to_s
|
|
345
|
+
end
|
|
346
|
+
```
|
|
347
|
+
|
|
348
|
+
## Purchase Time
|
|
349
|
+
**time** ([StringField](#string-field)): The time the purchase was made.
|
|
350
|
+
|
|
351
|
+
```rb
|
|
352
|
+
puts result.document.inference.prediction.time.value
|
|
353
|
+
```
|
|
354
|
+
|
|
355
|
+
## Tip and Gratuity
|
|
356
|
+
**tip** ([AmountField](#amount-field)): The total amount of tip and gratuity.
|
|
357
|
+
|
|
358
|
+
```rb
|
|
359
|
+
puts result.document.inference.prediction.tip.value
|
|
360
|
+
```
|
|
361
|
+
|
|
362
|
+
## Total Amount
|
|
363
|
+
**total_amount** ([AmountField](#amount-field)): The total amount paid: includes taxes, tips, fees, and other charges.
|
|
364
|
+
|
|
365
|
+
```rb
|
|
366
|
+
puts result.document.inference.prediction.total_amount.value
|
|
367
|
+
```
|
|
368
|
+
|
|
369
|
+
## Total Net
|
|
370
|
+
**total_net** ([AmountField](#amount-field)): The net amount paid: does not include taxes, fees, and discounts.
|
|
371
|
+
|
|
372
|
+
```rb
|
|
373
|
+
puts result.document.inference.prediction.total_net.value
|
|
374
|
+
```
|
|
375
|
+
|
|
376
|
+
## Total Tax
|
|
377
|
+
**total_tax** ([AmountField](#amount-field)): The total amount of taxes.
|
|
378
|
+
|
|
379
|
+
```rb
|
|
380
|
+
puts result.document.inference.prediction.total_tax.value
|
|
381
|
+
```
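
In the sample output above, the amounts happen to add up: 40.48 (net) + 3.34 (tax) + 10.00 (tip) = 53.82 (total). The sketch below turns that observation into a sanity check. It is an assumption that this relation holds for a given document, since fees or discounts could break it, and `totals_consistent?` is a hypothetical helper rather than part of the SDK.

```ruby
# Check whether net + tax + tip matches the total, within a small tolerance
# to absorb floating-point rounding. Missing components count as zero.
def totals_consistent?(total_net:, total_tax:, tip:, total_amount:, tolerance: 0.005)
  return false if total_amount.nil?

  parts_sum = [total_net, total_tax, tip].compact.sum
  (parts_sum - total_amount).abs <= tolerance
end

# Values copied from the sample output.
puts totals_consistent?(total_net: 40.48, total_tax: 3.34, tip: 10.00, total_amount: 53.82)

# A mismatching total is flagged.
puts totals_consistent?(total_net: 40.48, total_tax: 3.34, tip: 10.00, total_amount: 60.00)
```

With a real response, the four arguments would come from the `value` of the corresponding fields.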
|
|
382
|
+
|
|
383
|
+
# Questions?
|
|
384
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
|
@@ -1,8 +1,7 @@
|
|
|
1
|
-
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
You can view the source code on [GitHub](https://github.com/mindee/mindee-api-ruby).
|
|
1
|
+
---
|
|
2
|
+
title: Ruby Getting Started
|
|
3
|
+
---
|
|
4
|
+
This guide will help you get the most out of the Mindee Ruby client library to easily extract data from your documents.
|
|
6
5
|
|
|
7
6
|
## Installation
|
|
8
7
|
|
|
@@ -62,6 +61,7 @@ gem install mindee@<version>
|
|
|
62
61
|
```
|
|
63
62
|
|
|
64
63
|
## Usage
|
|
64
|
+
|
|
65
65
|
Using Mindee's APIs can be broken down into the following steps:
|
|
66
66
|
|
|
67
67
|
1. [Initialize a Client](#initializing-the-client)
|
|
@@ -202,7 +202,6 @@ result = mindee_client.parse(
|
|
|
202
202
|
)
|
|
203
203
|
```
|
|
204
204
|
|
|
205
|
-
|
|
206
205
|
## Sending a File
|
|
207
206
|
To send a file to the API, we need to specify how to process the document.
|
|
208
207
|
This will determine which API endpoint is used and how the API return will be handled internally by the library.
|
|
@@ -302,5 +301,22 @@ response.document.inference.pages.each do |page|
|
|
|
302
301
|
end
|
|
303
302
|
```
|
|
304
303
|
|
|
304
|
+
## 🧪 Experimental Features
|
|
305
|
+
|
|
306
|
+
### PDF repair
|
|
307
|
+
|
|
308
|
+
Some PDF files might appear fine on your computer, but can be rejected by the server.
|
|
309
|
+
This _experimental_ feature attempts to fix the file's header information before sending it to the server.
|
|
310
|
+
|
|
311
|
+
> ⚠️ **Warning**: This feature copies your file and then **alters** it. The original file will be left alone, but the copy might get partially corrupted, and improperly parsed as a result. Use at your own discretion.
|
|
312
|
+
|
|
313
|
+
To enable it, simply set the `fix_pdf` flag to `true` during source creation:
|
|
314
|
+
|
|
315
|
+
```rb
|
|
316
|
+
input_source = mindee_client.source_from_file(input_file, "name-of-my-file.ext", fix_pdf: true)
|
|
317
|
+
```
|
|
318
|
+
|
|
319
|
+
Note: This only works for local files; files sent by URL will not be processed.
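
This guide does not show how the repair works internally. As a rough illustration of what fixing header information can mean, the sketch below drops any bytes that precede the `%PDF-` marker in a copy of the data. `strip_leading_garbage` is a hypothetical helper, not the library's implementation, and the sample bytes are invented.

```ruby
# Hypothetical helper: return a binary copy of the data that starts at the
# "%PDF-" marker. The input string itself is never modified.
def strip_leading_garbage(bytes)
  data = bytes.b # binary (ASCII-8BIT) copy
  marker_index = data.index('%PDF-'.b)

  # Nothing to do if there is no marker or it is already at the start.
  return data if marker_index.nil? || marker_index.zero?

  data.byteslice(marker_index, data.bytesize - marker_index)
end

damaged = "\x00\x01junk%PDF-1.7\nbody" # invented example bytes
repaired = strip_leading_garbage(damaged)

puts repaired.start_with?('%PDF-')
puts damaged.bytesize - repaired.bytesize
```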
|
|
320
|
+
|
|
305
321
|
## Questions?
|
|
306
322
|
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
|
@@ -0,0 +1,253 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: FR Carte Nationale d'Identité OCR Ruby
|
|
3
|
+
---
|
|
4
|
+
The Ruby OCR SDK supports the [Carte Nationale d'Identité API](https://platform.mindee.com/mindee/idcard_fr).
|
|
5
|
+
|
|
6
|
+
Using the [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/idcard_fr/default_sample.jpg), we are going to illustrate how to extract the data that we want using the OCR SDK.
|
|
7
|
+

|
|
8
|
+
|
|
9
|
+
# Quick-Start
|
|
10
|
+
```rb
|
|
11
|
+
require 'mindee'
|
|
12
|
+
|
|
13
|
+
# Init a new client
|
|
14
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
|
15
|
+
|
|
16
|
+
# Load a file from disk
|
|
17
|
+
input_source = mindee_client.source_from_path('/path/to/the/file.ext')
|
|
18
|
+
|
|
19
|
+
# Parse the file
|
|
20
|
+
result = mindee_client.parse(
|
|
21
|
+
input_source,
|
|
22
|
+
Mindee::Product::FR::IdCard::IdCardV2
|
|
23
|
+
)
|
|
24
|
+
|
|
25
|
+
# Print a full summary of the parsed data in RST format
|
|
26
|
+
puts result.document
|
|
27
|
+
|
|
28
|
+
# Print the document-level parsed data
|
|
29
|
+
# puts result.document.inference.prediction
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
**Output (RST):**
|
|
33
|
+
```rst
|
|
34
|
+
########
|
|
35
|
+
Document
|
|
36
|
+
########
|
|
37
|
+
:Mindee ID: d33828f1-ef7e-4984-b9df-a2bfaa38a78d
|
|
38
|
+
:Filename: default_sample.jpg
|
|
39
|
+
|
|
40
|
+
Inference
|
|
41
|
+
#########
|
|
42
|
+
:Product: mindee/idcard_fr v2.0
|
|
43
|
+
:Rotation applied: Yes
|
|
44
|
+
|
|
45
|
+
Prediction
|
|
46
|
+
==========
|
|
47
|
+
:Nationality:
|
|
48
|
+
:Card Access Number: 175775H55790
|
|
49
|
+
:Document Number:
|
|
50
|
+
:Given Name(s): Victor
|
|
51
|
+
Marie
|
|
52
|
+
:Surname: DAMBARD
|
|
53
|
+
:Alternate Name:
|
|
54
|
+
:Date of Birth: 1994-04-24
|
|
55
|
+
:Place of Birth: LYON 4E ARRONDISSEM
|
|
56
|
+
:Gender: M
|
|
57
|
+
:Expiry Date: 2030-04-02
|
|
58
|
+
:Mrz Line 1: IDFRADAMBARD<<<<<<<<<<<<<<<<<<075025
|
|
59
|
+
:Mrz Line 2: 170775H557903VICTOR<<MARIE<9404246M5
|
|
60
|
+
:Mrz Line 3:
|
|
61
|
+
:Date of Issue: 2015-04-03
|
|
62
|
+
:Issuing Authority: SOUS-PREFECTURE DE BELLE (02)
|
|
63
|
+
|
|
64
|
+
Page Predictions
|
|
65
|
+
================
|
|
66
|
+
|
|
67
|
+
Page 0
|
|
68
|
+
------
|
|
69
|
+
:Document Type: OLD
|
|
70
|
+
:Document Sides: RECTO & VERSO
|
|
71
|
+
:Nationality:
|
|
72
|
+
:Card Access Number: 175775H55790
|
|
73
|
+
:Document Number:
|
|
74
|
+
:Given Name(s): Victor
|
|
75
|
+
Marie
|
|
76
|
+
:Surname: DAMBARD
|
|
77
|
+
:Alternate Name:
|
|
78
|
+
:Date of Birth: 1994-04-24
|
|
79
|
+
:Place of Birth: LYON 4E ARRONDISSEM
|
|
80
|
+
:Gender: M
|
|
81
|
+
:Expiry Date: 2030-04-02
|
|
82
|
+
:Mrz Line 1: IDFRADAMBARD<<<<<<<<<<<<<<<<<<075025
|
|
83
|
+
:Mrz Line 2: 170775H557903VICTOR<<MARIE<9404246M5
|
|
84
|
+
:Mrz Line 3:
|
|
85
|
+
:Date of Issue: 2015-04-03
|
|
86
|
+
:Issuing Authority: SOUS-PREFECTURE DE BELLE (02)
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
# Field Types
|
|
90
|
+
## Standard Fields
|
|
91
|
+
These fields are generic and used in several products.
|
|
92
|
+
|
|
93
|
+
### Basic Field
|
|
94
|
+
Each prediction object contains a set of fields that inherit from the generic `Field` class.
|
|
95
|
+
A typical `Field` object will have the following attributes:
|
|
96
|
+
|
|
97
|
+
* **value** (`String`, `Float`, `Integer`, `Boolean`): corresponds to the field value. Can be `nil` if no value was extracted.
|
|
98
|
+
* **confidence** (`Float`, `nil`): the confidence score of the field prediction.
|
|
99
|
+
* **bounding_box** (`Mindee::Geometry::Quadrilateral`, `nil`): contains the relative coordinates of exactly 4 vertices (points) forming a right rectangle that contains the field in the document.
|
|
100
|
+
* **polygon** (`Mindee::Geometry::Polygon`, `nil`): contains the relative coordinates of the vertices (`Point`) of a polygon containing the field in the image.
|
|
101
|
+
* **page_id** (`Integer`, `nil`): the ID of the page; `nil` at document level.
|
|
102
|
+
* **reconstructed** (`Boolean`): indicates whether or not an object was reconstructed (not extracted as the API gave it).
|
|
103
|
+
|
|
104
|
+
|
|
105
|
+
Aside from the previous attributes, all basic fields have access to a `to_s` method that can be used to print their value as a string.
|
|
106
|
+
|
|
107
|
+
|
|
108
|
+
### Classification Field
|
|
109
|
+
The classification field `ClassificationField` does not implement all the basic `Field` attributes. It only implements **value**, **confidence** and **page_id**.
|
|
110
|
+
|
|
111
|
+
> Note: a classification field's `value` is always a `String`.
|
|
112
|
+
|
|
113
|
+
### Date Field
|
|
114
|
+
Aside from the basic `Field` attributes, the date field `DateField` also implements the following:
|
|
115
|
+
|
|
116
|
+
* **date_object** (`Date`): an accessible representation of the value as a Ruby `Date` object.
|
|
117
|
+
|
|
118
|
+
### String Field
|
|
119
|
+
The string field `StringField` only has one constraint: its **value** is a `String` (or `nil`).
|
|
120
|
+
|
|
121
|
+
## Page-Level Fields
|
|
122
|
+
Some fields are constrained to the page level, and so will not be retrievable through the document.
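
The sketch below illustrates what page-level access looks like, using stand-in objects rather than a real API response. The `Demo*` structs and `page_level_summary` are hypothetical. The assumption is that each page exposes a `prediction` carrying the page-level fields, with pages obtained from something like `result.document.inference.pages`.

```ruby
# Hypothetical stand-ins mimicking the shape of a parsed response.
DemoClassification = Struct.new(:value, keyword_init: true)
DemoPagePrediction = Struct.new(:document_type, :document_side, keyword_init: true)
DemoPage = Struct.new(:page_id, :prediction, keyword_init: true)

# Build one summary string per page from its page-level fields.
def page_level_summary(pages)
  pages.map do |page|
    prediction = page.prediction
    "Page #{page.page_id}: #{prediction.document_type.value}, " \
      "#{prediction.document_side.value}"
  end
end

# Values copied from the "Page 0" section of the sample output.
pages = [
  DemoPage.new(
    page_id: 0,
    prediction: DemoPagePrediction.new(
      document_type: DemoClassification.new(value: 'OLD'),
      document_side: DemoClassification.new(value: 'RECTO & VERSO')
    )
  )
]

puts page_level_summary(pages)
```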
|
|
123
|
+
|
|
124
|
+
# Attributes
|
|
125
|
+
The following fields are extracted for Carte Nationale d'Identité V2:
|
|
126
|
+
|
|
127
|
+
## Alternate Name
|
|
128
|
+
**alternate_name** ([StringField](#string-field)): The alternate name of the card holder.
|
|
129
|
+
|
|
130
|
+
```rb
|
|
131
|
+
puts result.document.inference.prediction.alternate_name.value
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
## Issuing Authority
|
|
135
|
+
**authority** ([StringField](#string-field)): The name of the issuing authority.
|
|
136
|
+
|
|
137
|
+
```rb
|
|
138
|
+
puts result.document.inference.prediction.authority.value
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
## Date of Birth
|
|
142
|
+
**birth_date** ([DateField](#date-field)): The date of birth of the card holder.
|
|
143
|
+
|
|
144
|
+
```rb
|
|
145
|
+
puts result.document.inference.prediction.birth_date.value
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
## Place of Birth
|
|
149
|
+
**birth_place** ([StringField](#string-field)): The place of birth of the card holder.
|
|
150
|
+
|
|
151
|
+
```rb
|
|
152
|
+
puts result.document.inference.prediction.birth_place.value
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
## Card Access Number
|
|
156
|
+
**card_access_number** ([StringField](#string-field)): The card access number (CAN).
|
|
157
|
+
|
|
158
|
+
```rb
|
|
159
|
+
puts result.document.inference.prediction.card_access_number.value
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
## Document Number
|
|
163
|
+
**document_number** ([StringField](#string-field)): The document number.
|
|
164
|
+
|
|
165
|
+
```rb
|
|
166
|
+
puts result.document.inference.prediction.document_number.value
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
## Document Sides
|
|
170
|
+
[📄](#page-level-fields "This field is only present on individual pages.")**document_side** ([ClassificationField](#classification-field)): The sides of the document which are visible.
|
|
171
|
+
|
|
172
|
+
```rb
|
|
173
|
+
result.document.inference.pages.each do |page|
|
|
174
|
+
  puts page.prediction.document_side.value
|
|
175
|
+
end
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
## Document Type
|
|
179
|
+
[📄](#page-level-fields "This field is only present on individual pages.")**document_type** ([ClassificationField](#classification-field)): The document type or format.
|
|
180
|
+
|
|
181
|
+
```rb
|
|
182
|
+
result.document.inference.pages.each do |page|
|
|
183
|
+
  puts page.prediction.document_type.value
|
|
184
|
+
end
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
## Expiry Date
|
|
188
|
+
**expiry_date** ([DateField](#date-field)): The expiry date of the identification card.
|
|
189
|
+
|
|
190
|
+
```rb
|
|
191
|
+
puts result.document.inference.prediction.expiry_date.value
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
## Gender
|
|
195
|
+
**gender** ([StringField](#string-field)): The gender of the card holder.
|
|
196
|
+
|
|
197
|
+
```rb
|
|
198
|
+
puts result.document.inference.prediction.gender.value
|
|
199
|
+
```
|
|
200
|
+
|
|
201
|
+
## Given Name(s)
|
|
202
|
+
**given_names** (Array<[StringField](#string-field)>): The given name(s) of the card holder.
|
|
203
|
+
|
|
204
|
+
```rb
|
|
205
|
+
for given_names_elem in result.document.inference.prediction.given_names do
|
|
206
|
+
puts given_names_elem.value
|
|
207
|
+
end
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
## Date of Issue
|
|
211
|
+
**issue_date** ([DateField](#date-field)): The date of issue of the identification card.
|
|
212
|
+
|
|
213
|
+
```rb
|
|
214
|
+
puts result.document.inference.prediction.issue_date.value
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
## Mrz Line 1
|
|
218
|
+
**mrz1** ([StringField](#string-field)): The Machine Readable Zone, first line.
|
|
219
|
+
|
|
220
|
+
```rb
|
|
221
|
+
puts result.document.inference.prediction.mrz1.value
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
## Mrz Line 2
|
|
225
|
+
**mrz2** ([StringField](#string-field)): The Machine Readable Zone, second line.
|
|
226
|
+
|
|
227
|
+
```rb
|
|
228
|
+
puts result.document.inference.prediction.mrz2.value
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
## Mrz Line 3
|
|
232
|
+
**mrz3** ([StringField](#string-field)): The Machine Readable Zone, third line.
|
|
233
|
+
|
|
234
|
+
```rb
|
|
235
|
+
puts result.document.inference.prediction.mrz3.value
|
|
236
|
+
```
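
The MRZ lines repeat some of the data extracted elsewhere, which makes them handy for cross-checking. The sketch below slices the second line of the sample using an assumed layout for the older French ID card format: 36 characters, with given names at positions 14 to 27, birth date (YYMMDD) at 28 to 33, and sex at 35. `parse_mrz_line2` is a hypothetical helper, and check digits are deliberately not validated.

```ruby
# Hypothetical helper for line 2 of an older-format French ID card MRZ.
# Assumed layout (1-based): 14-27 given names, 28-33 birth date, 35 sex.
def parse_mrz_line2(line)
  return nil unless line.is_a?(String) && line.length == 36

  {
    given_names: line[13, 14].split('<').reject(&:empty?), # '<' is the filler
    birth_date: line[27, 6],                               # YYMMDD
    gender: line[34]
  }
end

# Second MRZ line copied from the sample output.
parsed = parse_mrz_line2('170775H557903VICTOR<<MARIE<9404246M5')

p parsed[:given_names]
p parsed[:birth_date]
p parsed[:gender]
```

The extracted birth date and gender can then be compared with the `birth_date` and `gender` fields of the prediction.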
|
|
237
|
+
|
|
238
|
+
## Nationality
|
|
239
|
+
**nationality** ([StringField](#string-field)): The nationality of the card holder.
|
|
240
|
+
|
|
241
|
+
```rb
|
|
242
|
+
puts result.document.inference.prediction.nationality.value
|
|
243
|
+
```
|
|
244
|
+
|
|
245
|
+
## Surname
|
|
246
|
+
**surname** ([StringField](#string-field)): The surname of the card holder.
|
|
247
|
+
|
|
248
|
+
```rb
|
|
249
|
+
puts result.document.inference.prediction.surname.value
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
# Questions?
|
|
253
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|