mindee 3.1.1 → 3.3.0

Files changed (109)
  1. checksums.yaml +4 -4
  2. data/.gitignore +1 -0
  3. data/.rubocop.yml +1 -0
  4. data/CHANGELOG.md +26 -0
  5. data/README.md +57 -7
  6. data/bin/mindee.rb +160 -83
  7. data/docs/bank_account_details_v2.md +137 -0
  8. data/docs/bank_check_v1.md +179 -0
  9. data/docs/barcode_reader_v1.md +104 -0
  10. data/docs/carte_grise_v1.md +454 -0
  11. data/docs/carte_vitale_v1.md +123 -0
  12. data/docs/code_samples/barcode_reader_v1.txt +19 -0
  13. data/docs/code_samples/carte_grise_v1.txt +19 -0
  14. data/docs/code_samples/cropper_v1.txt +16 -0
  15. data/docs/code_samples/idcard_fr_v2.txt +19 -0
  16. data/docs/code_samples/invoice_splitter_v1_async.txt +6 -54
  17. data/docs/code_samples/multi_receipts_detector_v1.txt +19 -0
  18. data/docs/code_samples/us_w9_v1.txt +16 -0
  19. data/docs/cropper_v1.md +97 -0
  20. data/docs/custom_v1.md +109 -0
  21. data/docs/expense_receipts_v5.md +306 -0
  22. data/docs/financial_document_v1.md +384 -0
  23. data/docs/{ruby-getting-started.md → getting_started.md} +22 -6
  24. data/docs/idcard_fr_v2.md +253 -0
  25. data/docs/invoice_splitter_v1.md +85 -0
  26. data/docs/invoices_v4.md +338 -0
  27. data/docs/license_plates_v1.md +91 -0
  28. data/docs/multi_receipts_detector_v1.md +105 -0
  29. data/docs/passport_v1.md +186 -0
  30. data/docs/proof_of_address_v1.md +207 -0
  31. data/docs/us_driver_license_v1.md +268 -0
  32. data/docs/us_w9_v1.md +207 -0
  33. data/lib/mindee/client.rb +95 -16
  34. data/lib/mindee/geometry/quadrilateral.rb +5 -0
  35. data/lib/mindee/http/.rubocop.yml +8 -0
  36. data/lib/mindee/http/endpoint.rb +19 -7
  37. data/lib/mindee/http/error.rb +104 -0
  38. data/lib/mindee/http.rb +1 -0
  39. data/lib/mindee/input/sources.rb +83 -14
  40. data/lib/mindee/parsing/common/api_response.rb +12 -1
  41. data/lib/mindee/parsing/common/document.rb +4 -1
  42. data/lib/mindee/parsing/common/inference.rb +2 -2
  43. data/lib/mindee/parsing/common/ocr/ocr.rb +1 -0
  44. data/lib/mindee/parsing/common.rb +0 -1
  45. data/lib/mindee/parsing/custom/list_field.rb +7 -5
  46. data/lib/mindee/parsing/standard/base_field.rb +1 -1
  47. data/lib/mindee/parsing/standard/company_registration_field.rb +1 -1
  48. data/lib/mindee/parsing/standard/locale_field.rb +1 -1
  49. data/lib/mindee/parsing/standard/payment_details_field.rb +1 -1
  50. data/lib/mindee/parsing/standard/position_field.rb +10 -3
  51. data/lib/mindee/parsing/standard/{text_field.rb → string_field.rb} +1 -1
  52. data/lib/mindee/parsing/standard.rb +1 -1
  53. data/lib/mindee/pdf/pdf_processing.rb +2 -1
  54. data/lib/mindee/product/barcode_reader/barcode_reader_v1.rb +37 -0
  55. data/lib/mindee/product/barcode_reader/barcode_reader_v1_document.rb +44 -0
  56. data/lib/mindee/product/barcode_reader/barcode_reader_v1_page.rb +32 -0
  57. data/lib/mindee/product/cropper/cropper_v1.rb +37 -0
  58. data/lib/mindee/product/cropper/cropper_v1_document.rb +13 -0
  59. data/lib/mindee/product/cropper/cropper_v1_page.rb +49 -0
  60. data/lib/mindee/product/custom/custom_v1.rb +1 -0
  61. data/lib/mindee/product/eu/license_plate/license_plate_v1.rb +1 -0
  62. data/lib/mindee/product/eu/license_plate/license_plate_v1_document.rb +2 -2
  63. data/lib/mindee/product/financial_document/financial_document_v1.rb +1 -0
  64. data/lib/mindee/product/financial_document/financial_document_v1_document.rb +26 -26
  65. data/lib/mindee/product/fr/bank_account_details/bank_account_details_v1.rb +1 -0
  66. data/lib/mindee/product/fr/bank_account_details/bank_account_details_v1_document.rb +6 -6
  67. data/lib/mindee/product/fr/bank_account_details/bank_account_details_v2.rb +1 -0
  68. data/lib/mindee/product/fr/bank_account_details/bank_account_details_v2_document.rb +6 -6
  69. data/lib/mindee/product/fr/carte_grise/carte_grise_v1.rb +39 -0
  70. data/lib/mindee/product/fr/carte_grise/carte_grise_v1_document.rb +235 -0
  71. data/lib/mindee/product/fr/carte_grise/carte_grise_v1_page.rb +34 -0
  72. data/lib/mindee/product/fr/carte_vitale/carte_vitale_v1.rb +1 -0
  73. data/lib/mindee/product/fr/carte_vitale/carte_vitale_v1_document.rb +6 -6
  74. data/lib/mindee/product/fr/id_card/id_card_v1.rb +1 -0
  75. data/lib/mindee/product/fr/id_card/id_card_v1_document.rb +16 -16
  76. data/lib/mindee/product/fr/id_card/id_card_v2.rb +39 -0
  77. data/lib/mindee/product/fr/id_card/id_card_v2_document.rb +107 -0
  78. data/lib/mindee/product/fr/id_card/id_card_v2_page.rb +53 -0
  79. data/lib/mindee/product/invoice/invoice_v4.rb +2 -2
  80. data/lib/mindee/product/invoice/invoice_v4_document.rb +115 -155
  81. data/lib/mindee/product/invoice/invoice_v4_line_item.rb +54 -30
  82. data/lib/mindee/product/invoice_splitter/invoice_splitter_v1.rb +1 -0
  83. data/lib/mindee/product/invoice_splitter/invoice_splitter_v1_document.rb +5 -3
  84. data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1.rb +37 -0
  85. data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1_document.rb +35 -0
  86. data/lib/mindee/product/multi_receipts_detector/multi_receipts_detector_v1_page.rb +32 -0
  87. data/lib/mindee/product/passport/passport_v1.rb +1 -0
  88. data/lib/mindee/product/passport/passport_v1_document.rb +16 -16
  89. data/lib/mindee/product/proof_of_address/proof_of_address_v1.rb +1 -0
  90. data/lib/mindee/product/proof_of_address/proof_of_address_v1_document.rb +14 -14
  91. data/lib/mindee/product/receipt/receipt_v4_document.rb +6 -6
  92. data/lib/mindee/product/receipt/receipt_v5.rb +1 -0
  93. data/lib/mindee/product/receipt/receipt_v5_document.rb +12 -12
  94. data/lib/mindee/product/us/bank_check/bank_check_v1.rb +1 -0
  95. data/lib/mindee/product/us/bank_check/bank_check_v1_document.rb +8 -8
  96. data/lib/mindee/product/us/driver_license/driver_license_v1.rb +1 -0
  97. data/lib/mindee/product/us/driver_license/driver_license_v1_document.rb +28 -28
  98. data/lib/mindee/product/us/w9/w9_v1.rb +39 -0
  99. data/lib/mindee/product/us/w9/w9_v1_document.rb +15 -0
  100. data/lib/mindee/product/us/w9/w9_v1_page.rb +102 -0
  101. data/lib/mindee/product.rb +6 -0
  102. data/lib/mindee/version.rb +5 -1
  103. data/lib/mindee.rb +45 -1
  104. metadata +48 -9
  105. data/docs/ruby-api-builder.md +0 -123
  106. data/docs/ruby-invoice-ocr.md +0 -271
  107. data/docs/ruby-passport-ocr.md +0 -165
  108. data/docs/ruby-receipt-ocr.md +0 -196
  109. data/lib/mindee/parsing/common/error.rb +0 -24
@@ -0,0 +1,384 @@
1
+ ---
2
+ title: Financial Document OCR Ruby
3
+ ---
4
+ The Ruby OCR SDK supports the [Financial Document API](https://platform.mindee.com/mindee/financial_document).
5
+
6
+ Using the [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/financial_document/default_sample.jpg), we are going to illustrate how to extract the data that we want using the OCR SDK.
7
+ ![Financial Document sample](https://github.com/mindee/client-lib-test-data/blob/main/products/financial_document/default_sample.jpg?raw=true)
8
+
9
+ # Quick-Start
10
+ ```rb
11
+ require 'mindee'
12
+
13
+ # Init a new client
14
+ mindee_client = Mindee::Client.new(api_key: 'my-api-key')
15
+
16
+ # Load a file from disk
17
+ input_source = mindee_client.source_from_path('/path/to/the/file.ext')
18
+
19
+ # Parse the file
20
+ result = mindee_client.parse(
21
+ input_source,
22
+ Mindee::Product::FinancialDocument::FinancialDocumentV1
23
+ )
24
+
25
+ # Print a full summary of the parsed data in RST format
26
+ puts result.document
27
+
28
+ # Print the document-level parsed data
29
+ # puts result.document.inference.prediction
30
+ ```
31
+
32
+ **Output (RST):**
33
+ ```rst
34
+ ########
35
+ Document
36
+ ########
37
+ :Mindee ID: 81c1d637-3a84-41d9-b40a-f72ca2a58826
38
+ :Filename: default_sample.jpg
39
+
40
+ Inference
41
+ #########
42
+ :Product: mindee/financial_document v1.1
43
+ :Rotation applied: Yes
44
+
45
+ Prediction
46
+ ==========
47
+ :Locale: en; en; USD;
48
+ :Invoice Number:
49
+ :Reference Numbers:
50
+ :Purchase Date: 2014-07-07
51
+ :Due Date: 2014-07-07
52
+ :Total Net: 40.48
53
+ :Total Amount: 53.82
54
+ :Taxes:
55
+ +---------------+--------+----------+---------------+
56
+ | Base | Code | Rate (%) | Amount |
57
+ +===============+========+==========+===============+
58
+ | | TAX | | 3.34 |
59
+ +---------------+--------+----------+---------------+
60
+ :Supplier Payment Details:
61
+ :Supplier Name: LOGANS
62
+ :Supplier Company Registrations:
63
+ :Supplier Address: 2513 s stemmons freeway lewisville tx 75067
64
+ :Supplier Phone Number: 9724596042
65
+ :Customer Name:
66
+ :Customer Company Registrations:
67
+ :Customer Address:
68
+ :Document Type: EXPENSE RECEIPT
69
+ :Purchase Subcategory: restaurant
70
+ :Purchase Category: food
71
+ :Total Tax: 3.34
72
+ :Tip and Gratuity: 10.00
73
+ :Purchase Time: 20:20
74
+ :Line Items:
75
+ +--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
76
+ | Description | Product code | Quantity | Tax Amount | Tax Rate (%) | Total Amount | Unit Price |
77
+ +======================================+==============+==========+============+==============+==============+============+
78
+ | TAX | | | | | 3.34 | |
79
+ +--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
80
+
81
+ Page Predictions
82
+ ================
83
+
84
+ Page 0
85
+ ------
86
+ :Locale: en; en; USD;
87
+ :Invoice Number:
88
+ :Reference Numbers:
89
+ :Purchase Date: 2014-07-07
90
+ :Due Date: 2014-07-07
91
+ :Total Net: 40.48
92
+ :Total Amount: 53.82
93
+ :Taxes:
94
+ +---------------+--------+----------+---------------+
95
+ | Base | Code | Rate (%) | Amount |
96
+ +===============+========+==========+===============+
97
+ | | TAX | | 3.34 |
98
+ +---------------+--------+----------+---------------+
99
+ :Supplier Payment Details:
100
+ :Supplier Name: LOGANS
101
+ :Supplier Company Registrations:
102
+ :Supplier Address: 2513 s stemmons freeway lewisville tx 75067
103
+ :Supplier Phone Number: 9724596042
104
+ :Customer Name:
105
+ :Customer Company Registrations:
106
+ :Customer Address:
107
+ :Document Type: EXPENSE RECEIPT
108
+ :Purchase Subcategory: restaurant
109
+ :Purchase Category: food
110
+ :Total Tax: 3.34
111
+ :Tip and Gratuity: 10.00
112
+ :Purchase Time: 20:20
113
+ :Line Items:
114
+ +--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
115
+ | Description | Product code | Quantity | Tax Amount | Tax Rate (%) | Total Amount | Unit Price |
116
+ +======================================+==============+==========+============+==============+==============+============+
117
+ | TAX | | | | | 3.34 | |
118
+ +--------------------------------------+--------------+----------+------------+--------------+--------------+------------+
119
+ ```
120
+
121
+ # Field Types
122
+ ## Standard Fields
123
+ These fields are generic and used in several products.
124
+
125
+ ### Basic Field
126
+ Each prediction object contains a set of fields that inherit from the generic `Field` class.
127
+ A typical `Field` object will have the following attributes:
128
+
129
+ * **value** (`String`, `Float`, `Integer`, `Boolean`): corresponds to the field value. Can be `nil` if no value was extracted.
130
+ * **confidence** (`Float`, `nil`): the confidence score of the field prediction.
131
+ * **bounding_box** (`Mindee::Geometry::Quadrilateral`, `nil`): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
132
+ * **polygon** (`Mindee::Geometry::Polygon`, `nil`): contains the relative vertex coordinates (`Point`) of a polygon containing the field in the image.
133
+ * **page_id** (`Integer`, `nil`): the ID of the page, is `nil` when at document-level.
134
+ * **reconstructed** (`Boolean`): indicates whether or not an object was reconstructed (not extracted as the API gave it).
135
+
136
+
137
+ Aside from the previous attributes, all basic fields have access to a `to_s` method that can be used to print their value as a string.
138
+
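As a minimal sketch of how the `nil` cases above might be handled, the snippet below uses a stand-in `FieldSketch` struct rather than the library's own classes. The struct, the `trusted_value` helper, the `0.8` threshold, and the empty-string behaviour of `to_s` are all assumptions made for illustration.

```rb
# Stand-in for a parsed field. This is NOT the library's class, only a
# hypothetical struct carrying a few of the attributes listed above.
FieldSketch = Struct.new(:value, :confidence, :page_id, keyword_init: true) do
  # Assumed behaviour: an empty string when nothing was extracted.
  def to_s
    value.nil? ? '' : value.to_s
  end
end

# Hypothetical helper: returns the value only when it exists and its
# confidence reaches the (arbitrary) threshold.
def trusted_value(field, threshold: 0.8)
  return nil if field.value.nil? || field.confidence.nil?

  field.confidence >= threshold ? field.value : nil
end

supplier = FieldSketch.new(value: 'LOGANS', confidence: 0.95, page_id: 0)
customer = FieldSketch.new(value: nil, confidence: 0.0, page_id: nil)

puts trusted_value(supplier).inspect
puts trusted_value(customer).inspect
```

Checking `value` for `nil` before using it avoids a `NoMethodError` on fields that were not found in the document, such as the empty customer fields in the sample output.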
139
+
140
+ ### Amount Field
141
+ The amount field `AmountField` only has one constraint: its **value** is a `Float` (or `nil`).
142
+
143
+
144
+ ### Classification Field
145
+ The classification field `ClassificationField` does not implement all the basic `Field` attributes. It only implements **value**, **confidence** and **page_id**.
146
+
147
+ > Note: a classification field's `value` is always a `String`.
148
+
149
+
150
+ ### Company Registration Field
151
+ Aside from the basic `Field` attributes, the company registration field `CompanyRegistrationField` also implements the following:
152
+
153
+ * **type** (`String`): the type of company.
154
+
155
+ ### Date Field
156
+ Aside from the basic `Field` attributes, the date field `DateField` also implements the following:
157
+
158
+ * **date_object** (`Date`): an accessible representation of the value as a Ruby `Date` object.
159
+
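To illustrate what a `Date` representation makes possible, here is a small sketch using only Ruby's standard library. It assumes the raw value is an ISO 8601 string, as in the sample output (`2014-07-07`); the variable names and the later date are hypothetical and nothing here calls the Mindee library.

```rb
require 'date'

# The raw string as it appears in the sample output above.
raw_value = '2014-07-07'

# Build a Date from the ISO 8601 string, similar in spirit to `date_object`.
date_object = Date.iso8601(raw_value)

# A Date supports formatting and arithmetic that a plain String does not.
formatted = date_object.strftime('%d/%m/%Y')
later = Date.new(2014, 8, 6) # hypothetical later date, for illustration only
days_between = (later - date_object).to_i

puts date_object.year
puts formatted
puts days_between
```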
160
+ ### Locale Field
161
+ The locale field `LocaleField` only implements the **value**, **confidence** and **page_id** base `Field` attributes, but it comes with its own:
162
+
163
+ * **language** (`String`): ISO 639-1 language code (e.g.: `en` for English). Can be `nil`.
164
+ * **country** (`String`): ISO 3166-1 alpha-2 or ISO 3166-1 alpha-3 code for countries (e.g.: `GBR` or `GB` for "Great Britain"). Can be `nil`.
165
+ * **currency** (`String`): ISO 4217 code for currencies (e.g.: `USD` for "US Dollars"). Can be `nil`.
166
+
167
+ ### Payment Details Field
168
+ Aside from the basic `Field` attributes, the payment details field `PaymentDetailsField` also implements the following:
169
+
170
+ * **account_number** (`String`): number of an account, expressed as a string. Can be `nil`.
171
+ * **iban** (`String`): International Bank Account Number. Can be `nil`.
172
+ * **routing_number** (`String`): routing number of an account. Can be `nil`.
173
+ * **swift** (`String`): the account holder's bank's SWIFT Business Identifier Code (BIC). Can be `nil`.
174
+
175
+ ### String Field
176
+ The string field `StringField` only has one constraint: its **value** is a `String` (or `nil`).
177
+
178
+ ### Taxes Field
179
+ #### Tax
180
+ Aside from the basic `Field` attributes, the tax field `TaxField` also implements the following:
181
+
182
+ * **rate** (`Float`): the tax rate applied to an item, expressed as a percentage. Can be `nil`.
183
+ * **code** (`String`): tax code (or equivalent, depending on the origin of the document). Can be `nil`.
184
+ * **base** (`Float`): base amount used for the tax. Can be `nil`.
185
+
186
+ > Note: currently `TaxField` is not used on its own, and is accessed through a parent `Taxes` object, an array-like structure.
187
+
188
+ #### Taxes (Array)
189
+ The `Taxes` field represents an array-like collection of `TaxField` objects. As it is the representation of several objects, it has access to a custom `to_s` method that can render a `TaxField` object as a table line.
190
+
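The sketch below shows one way an array-like collection of tax lines could be processed. `TaxSketch` is a hypothetical stand-in struct, not the library's `TaxField`, and it assumes that `value` holds the tax amount, as the "Amount" column of the sample output suggests.

```rb
# Hypothetical stand-in for a tax line; not the library's TaxField class.
TaxSketch = Struct.new(:value, :rate, :code, :base, keyword_init: true)

# Mirrors the single tax line shown in the sample output above.
taxes = [
  TaxSketch.new(value: 3.34, rate: nil, code: 'TAX', base: nil)
]

# Sum the amounts, treating a missing value as zero.
total = taxes.sum { |tax| tax.value || 0.0 }

# Keep only the lines for which a rate was actually extracted.
with_rate = taxes.reject { |tax| tax.rate.nil? }

puts format('%.2f', total)
puts with_rate.length
```

Because `rate`, `code`, and `base` can each be `nil`, filtering or defaulting them before doing arithmetic keeps the loop from raising on partially extracted tax lines.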
191
+ ## Specific Fields
192
+ Fields which are specific to this product; they are not used in any other product.
193
+
194
+ ### Line Items Field
195
+ List of line item details.
196
+
197
+ A `FinancialDocumentV1LineItem` implements the following attributes:
198
+
199
+ * `description` (String): The item description.
200
+ * `product_code` (String): The product code referring to the item.
201
+ * `quantity` (Float): The item quantity.
202
+ * `tax_amount` (Float): The item tax amount.
203
+ * `tax_rate` (Float): The item tax rate in percentage.
204
+ * `total_amount` (Float): The item total amount.
205
+ * `unit_price` (Float): The item unit price.
206
+
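A line item exposes the named attributes listed above rather than a single value, so printing one usually means picking the attributes of interest. The following sketch uses a hypothetical `LineItemSketch` struct and a hypothetical `describe_line_item` helper; neither is part of the library.

```rb
# Hypothetical stand-in with the attributes listed above; not the library's class.
LineItemSketch = Struct.new(
  :description, :product_code, :quantity, :tax_amount,
  :tax_rate, :total_amount, :unit_price,
  keyword_init: true
)

# Builds a one-line summary, falling back to placeholders for missing attributes.
def describe_line_item(item)
  amount = item.total_amount.nil? ? 'n/a' : format('%.2f', item.total_amount)
  "#{item.description || '(no description)'}: #{amount}"
end

# Mirrors the single line item in the sample output above.
item = LineItemSketch.new(description: 'TAX', total_amount: 3.34)
puts describe_line_item(item)
```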
207
+ # Attributes
208
+ The following fields are extracted for Financial Document V1:
209
+
210
+ ## Purchase Category
211
+ **category** ([ClassificationField](#classification-field)): The purchase category among predefined classes.
212
+
213
+ ```rb
214
+ puts result.document.inference.prediction.category.value
215
+ ```
216
+
217
+ ## Customer Address
218
+ **customer_address** ([StringField](#string-field)): The address of the customer.
219
+
220
+ ```rb
221
+ puts result.document.inference.prediction.customer_address.value
222
+ ```
223
+
224
+ ## Customer Company Registrations
225
+ **customer_company_registrations** (Array<[CompanyRegistrationField](#company-registration-field)>): List of company registrations associated to the customer.
226
+
227
+ ```rb
228
+ for customer_company_registrations_elem in result.document.inference.prediction.customer_company_registrations do
229
+ puts customer_company_registrations_elem.value
230
+ end
231
+ ```
232
+
233
+ ## Customer Name
234
+ **customer_name** ([StringField](#string-field)): The name of the customer.
235
+
236
+ ```rb
237
+ puts result.document.inference.prediction.customer_name.value
238
+ ```
239
+
240
+ ## Purchase Date
241
+ **date** ([DateField](#date-field)): The date the purchase was made.
242
+
243
+ ```rb
244
+ puts result.document.inference.prediction.date.value
245
+ ```
246
+
247
+ ## Document Type
248
+ **document_type** ([ClassificationField](#classification-field)): One of: 'INVOICE', 'CREDIT NOTE', 'CREDIT CARD RECEIPT', 'EXPENSE RECEIPT'.
249
+
250
+ ```rb
251
+ puts result.document.inference.prediction.document_type.value
252
+ ```
253
+
254
+ ## Due Date
255
+ **due_date** ([DateField](#date-field)): The date on which the payment is due.
256
+
257
+ ```rb
258
+ puts result.document.inference.prediction.due_date.value
259
+ ```
260
+
261
+ ## Invoice Number
262
+ **invoice_number** ([StringField](#string-field)): The invoice number or identifier.
263
+
264
+ ```rb
265
+ puts result.document.inference.prediction.invoice_number.value
266
+ ```
267
+
268
+ ## Line Items
269
+ **line_items** (Array<[FinancialDocumentV1LineItem](#line-items-field)>): List of line item details.
270
+
271
+ ```rb
272
+ for line_items_elem in result.document.inference.prediction.line_items do
273
+ puts line_items_elem.description
274
+ end
275
+ ```
276
+
277
+ ## Locale
278
+ **locale** ([LocaleField](#locale-field)): The locale detected on the document.
279
+
280
+ ```rb
281
+ puts result.document.inference.prediction.locale.value
282
+ ```
283
+
284
+ ## Reference Numbers
285
+ **reference_numbers** (Array<[StringField](#string-field)>): List of Reference numbers, including PO number.
286
+
287
+ ```rb
288
+ for reference_numbers_elem in result.document.inference.prediction.reference_numbers do
289
+ puts reference_numbers_elem.value
290
+ end
291
+ ```
292
+
293
+ ## Purchase Subcategory
294
+ **subcategory** ([ClassificationField](#classification-field)): The purchase subcategory among predefined classes for transport and food.
295
+
296
+ ```rb
297
+ puts result.document.inference.prediction.subcategory.value
298
+ ```
299
+
300
+ ## Supplier Address
301
+ **supplier_address** ([StringField](#string-field)): The address of the supplier or merchant.
302
+
303
+ ```rb
304
+ puts result.document.inference.prediction.supplier_address.value
305
+ ```
306
+
307
+ ## Supplier Company Registrations
308
+ **supplier_company_registrations** (Array<[CompanyRegistrationField](#company-registration-field)>): List of company registrations associated to the supplier.
309
+
310
+ ```rb
311
+ for supplier_company_registrations_elem in result.document.inference.prediction.supplier_company_registrations do
312
+ puts supplier_company_registrations_elem.value
313
+ end
314
+ ```
315
+
316
+ ## Supplier Name
317
+ **supplier_name** ([StringField](#string-field)): The name of the supplier or merchant.
318
+
319
+ ```rb
320
+ puts result.document.inference.prediction.supplier_name.value
321
+ ```
322
+
323
+ ## Supplier Payment Details
324
+ **supplier_payment_details** (Array<[PaymentDetailsField](#payment-details-field)>): List of payment details associated to the supplier.
325
+
326
+ ```rb
327
+ for supplier_payment_details_elem in result.document.inference.prediction.supplier_payment_details do
328
+ puts supplier_payment_details_elem.value
329
+ end
330
+ ```
331
+
332
+ ## Supplier Phone Number
333
+ **supplier_phone_number** ([StringField](#string-field)): The phone number of the supplier or merchant.
334
+
335
+ ```rb
336
+ puts result.document.inference.prediction.supplier_phone_number.value
337
+ ```
338
+
339
+ ## Taxes
340
+ **taxes** (Array<[TaxField](#taxes-field)>): List of tax lines information.
341
+
342
+ ```rb
343
+ for taxes_elem in result.document.inference.prediction.taxes do
344
+ puts taxes_elem.to_s
345
+ end
346
+ ```
347
+
348
+ ## Purchase Time
349
+ **time** ([StringField](#string-field)): The time the purchase was made.
350
+
351
+ ```rb
352
+ puts result.document.inference.prediction.time.value
353
+ ```
354
+
355
+ ## Tip and Gratuity
356
+ **tip** ([AmountField](#amount-field)): The total amount of tip and gratuity.
357
+
358
+ ```rb
359
+ puts result.document.inference.prediction.tip.value
360
+ ```
361
+
362
+ ## Total Amount
363
+ **total_amount** ([AmountField](#amount-field)): The total amount paid: includes taxes, tips, fees, and other charges.
364
+
365
+ ```rb
366
+ puts result.document.inference.prediction.total_amount.value
367
+ ```
368
+
369
+ ## Total Net
370
+ **total_net** ([AmountField](#amount-field)): The net amount paid: does not include taxes, fees, and discounts.
371
+
372
+ ```rb
373
+ puts result.document.inference.prediction.total_net.value
374
+ ```
375
+
376
+ ## Total Tax
377
+ **total_tax** ([AmountField](#amount-field)): The total amount of taxes.
378
+
379
+ ```rb
380
+ puts result.document.inference.prediction.total_tax.value
381
+ ```
382
+
383
+ # Questions?
384
+ [Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
@@ -1,8 +1,7 @@
1
- This guide will help you get started with the Mindee Ruby OCR SDK to easily extract data from your documents.
2
-
3
- The Ruby client supports [Invoice](https://developers.mindee.com/docs/ruby-invoice-ocr), [receipt](https://developers.mindee.com/docs/ruby-receipt-ocr), [passport](https://developers.mindee.com/docs/ruby-passport-ocr), OCR APIs and [custom-built API](https://developers.mindee.com/docs/ruby-api-builder) from the API Builder.
4
-
5
- You can view the source code on [GitHub](https://github.com/mindee/mindee-api-ruby).
1
+ ---
2
+ title: Ruby Getting Started
3
+ ---
4
+ This guide will help you get the most out of the Mindee Ruby client library to easily extract data from your documents.
6
5
 
7
6
  ## Installation
8
7
 
@@ -62,6 +61,7 @@ gem install mindee@<version>
62
61
  ```
63
62
 
64
63
  ## Usage
64
+
65
65
  Using Mindee's APIs can be broken down into the following steps:
66
66
 
67
67
  1. [Initialize a Client](#initializing-the-client)
@@ -202,7 +202,6 @@ result = mindee_client.parse(
202
202
  )
203
203
  ```
204
204
 
205
-
206
205
  ## Sending a File
207
206
  To send a file to the API, we need to specify how to process the document.
208
207
  This will determine which API endpoint is used and how the API return will be handled internally by the library.
@@ -302,5 +301,22 @@ response.document.inference.pages.each do |page|
302
301
  end
303
302
  ```
304
303
 
304
+ ## 🧪 Experimental Features
305
+
306
+ ### PDF repair
307
+
308
+ Some PDF files might appear fine on your computer, but can be rejected by the server.
309
+ This _experimental_ feature attempts to fix the file's header information before sending it to the server.
310
+
311
+ > ⚠️ **Warning**: This feature copies your file and then **alters** it. The original file will be left alone, but the copy might get partially corrupted, and improperly parsed as a result. Use at your own discretion.
312
+
313
+ To enable it, simply set the `fix_pdf` flag to `true` during source creation:
314
+
315
+ ```rb
316
+ input_source = mindee_client.source_from_file(input_file, "name-of-my-file.ext", fix_pdf: true)
317
+ ```
318
+
319
+ Note: This only works for local files; files sent by URL will not be processed.
320
+
305
321
  ## Questions?
306
322
  [Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
@@ -0,0 +1,253 @@
1
+ ---
2
+ title: FR Carte Nationale d'Identité OCR Ruby
3
+ ---
4
+ The Ruby OCR SDK supports the [Carte Nationale d'Identité API](https://platform.mindee.com/mindee/idcard_fr).
5
+
6
+ Using the [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/idcard_fr/default_sample.jpg), we are going to illustrate how to extract the data that we want using the OCR SDK.
7
+ ![Carte Nationale d'Identité sample](https://github.com/mindee/client-lib-test-data/blob/main/products/idcard_fr/default_sample.jpg?raw=true)
8
+
9
+ # Quick-Start
10
+ ```rb
11
+ require 'mindee'
12
+
13
+ # Init a new client
14
+ mindee_client = Mindee::Client.new(api_key: 'my-api-key')
15
+
16
+ # Load a file from disk
17
+ input_source = mindee_client.source_from_path('/path/to/the/file.ext')
18
+
19
+ # Parse the file
20
+ result = mindee_client.parse(
21
+ input_source,
22
+ Mindee::Product::FR::IdCard::IdCardV2
23
+ )
24
+
25
+ # Print a full summary of the parsed data in RST format
26
+ puts result.document
27
+
28
+ # Print the document-level parsed data
29
+ # puts result.document.inference.prediction
30
+ ```
31
+
32
+ **Output (RST):**
33
+ ```rst
34
+ ########
35
+ Document
36
+ ########
37
+ :Mindee ID: d33828f1-ef7e-4984-b9df-a2bfaa38a78d
38
+ :Filename: default_sample.jpg
39
+
40
+ Inference
41
+ #########
42
+ :Product: mindee/idcard_fr v2.0
43
+ :Rotation applied: Yes
44
+
45
+ Prediction
46
+ ==========
47
+ :Nationality:
48
+ :Card Access Number: 175775H55790
49
+ :Document Number:
50
+ :Given Name(s): Victor
51
+ Marie
52
+ :Surname: DAMBARD
53
+ :Alternate Name:
54
+ :Date of Birth: 1994-04-24
55
+ :Place of Birth: LYON 4E ARRONDISSEM
56
+ :Gender: M
57
+ :Expiry Date: 2030-04-02
58
+ :Mrz Line 1: IDFRADAMBARD<<<<<<<<<<<<<<<<<<075025
59
+ :Mrz Line 2: 170775H557903VICTOR<<MARIE<9404246M5
60
+ :Mrz Line 3:
61
+ :Date of Issue: 2015-04-03
62
+ :Issuing Authority: SOUS-PREFECTURE DE BELLE (02)
63
+
64
+ Page Predictions
65
+ ================
66
+
67
+ Page 0
68
+ ------
69
+ :Document Type: OLD
70
+ :Document Sides: RECTO & VERSO
71
+ :Nationality:
72
+ :Card Access Number: 175775H55790
73
+ :Document Number:
74
+ :Given Name(s): Victor
75
+ Marie
76
+ :Surname: DAMBARD
77
+ :Alternate Name:
78
+ :Date of Birth: 1994-04-24
79
+ :Place of Birth: LYON 4E ARRONDISSEM
80
+ :Gender: M
81
+ :Expiry Date: 2030-04-02
82
+ :Mrz Line 1: IDFRADAMBARD<<<<<<<<<<<<<<<<<<075025
83
+ :Mrz Line 2: 170775H557903VICTOR<<MARIE<9404246M5
84
+ :Mrz Line 3:
85
+ :Date of Issue: 2015-04-03
86
+ :Issuing Authority: SOUS-PREFECTURE DE BELLE (02)
87
+ ```
88
+
89
+ # Field Types
90
+ ## Standard Fields
91
+ These fields are generic and used in several products.
92
+
93
+ ### Basic Field
94
+ Each prediction object contains a set of fields that inherit from the generic `Field` class.
95
+ A typical `Field` object will have the following attributes:
96
+
97
+ * **value** (`String`, `Float`, `Integer`, `Boolean`): corresponds to the field value. Can be `nil` if no value was extracted.
98
+ * **confidence** (`Float`, `nil`): the confidence score of the field prediction.
99
+ * **bounding_box** (`Mindee::Geometry::Quadrilateral`, `nil`): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
100
+ * **polygon** (`Mindee::Geometry::Polygon`, `nil`): contains the relative vertex coordinates (`Point`) of a polygon containing the field in the image.
101
+ * **page_id** (`Integer`, `nil`): the ID of the page, is `nil` when at document-level.
102
+ * **reconstructed** (`Boolean`): indicates whether or not an object was reconstructed (not extracted as the API gave it).
103
+
104
+
105
+ Aside from the previous attributes, all basic fields have access to a `to_s` method that can be used to print their value as a string.
106
+
107
+
108
+ ### Classification Field
109
+ The classification field `ClassificationField` does not implement all the basic `Field` attributes. It only implements **value**, **confidence** and **page_id**.
110
+
111
+ > Note: a classification field's `value` is always a `String`.
112
+
113
+ ### Date Field
114
+ Aside from the basic `Field` attributes, the date field `DateField` also implements the following:
115
+
116
+ * **date_object** (`Date`): an accessible representation of the value as a Ruby `Date` object.
117
+
118
+ ### String Field
119
+ The string field `StringField` only has one constraint: its **value** is a `String` (or `nil`).
120
+
121
+ ## Page-Level Fields
122
+ Some fields are constrained to the page level, and so cannot be retrieved through the document-level prediction.
123
+
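As a rough sketch of what reading page-level fields involves, the snippet below walks a list of pages and reads each page's own prediction. The three structs are hypothetical stand-ins shaped after the sample output (`Document Type: OLD`, `Document Sides: RECTO & VERSO`); they are not the library's classes, and the exact accessor names in the library are an assumption here.

```rb
# Hypothetical stand-ins; not the library's classes.
ClassificationSketch = Struct.new(:value, keyword_init: true)
PagePredictionSketch = Struct.new(:document_type, :document_side, keyword_init: true)
PageSketch = Struct.new(:page_id, :prediction, keyword_init: true)

# One page, mirroring "Page 0" of the sample output above.
pages = [
  PageSketch.new(
    page_id: 0,
    prediction: PagePredictionSketch.new(
      document_type: ClassificationSketch.new(value: 'OLD'),
      document_side: ClassificationSketch.new(value: 'RECTO & VERSO')
    )
  )
]

# Page-level fields are read per page, not from the document-level prediction.
summary = pages.map do |page|
  "Page #{page.page_id}: #{page.prediction.document_type.value}, " \
    "#{page.prediction.document_side.value}"
end

puts summary
```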
124
+ # Attributes
125
+ The following fields are extracted for Carte Nationale d'Identité V2:
126
+
127
+ ## Alternate Name
128
+ **alternate_name** ([StringField](#string-field)): The alternate name of the card holder.
129
+
130
+ ```rb
131
+ puts result.document.inference.prediction.alternate_name.value
132
+ ```
133
+
134
+ ## Issuing Authority
135
+ **authority** ([StringField](#string-field)): The name of the issuing authority.
136
+
137
+ ```rb
138
+ puts result.document.inference.prediction.authority.value
139
+ ```
140
+
141
+ ## Date of Birth
142
+ **birth_date** ([DateField](#date-field)): The date of birth of the card holder.
143
+
144
+ ```rb
145
+ puts result.document.inference.prediction.birth_date.value
146
+ ```
147
+
148
+ ## Place of Birth
149
+ **birth_place** ([StringField](#string-field)): The place of birth of the card holder.
150
+
151
+ ```rb
152
+ puts result.document.inference.prediction.birth_place.value
153
+ ```
154
+
155
+ ## Card Access Number
156
+ **card_access_number** ([StringField](#string-field)): The card access number (CAN).
157
+
158
+ ```rb
159
+ puts result.document.inference.prediction.card_access_number.value
160
+ ```
161
+
162
+ ## Document Number
163
+ **document_number** ([StringField](#string-field)): The document number.
164
+
165
+ ```rb
166
+ puts result.document.inference.prediction.document_number.value
167
+ ```
168
+
169
+ ## Document Sides
170
+ [📄](#page-level-fields "This field is only present on individual pages.")**document_side** ([ClassificationField](#classification-field)): The sides of the document which are visible.
171
+
172
+ ```rb
173
+ result.document.inference.pages.each do |page|
174
+ puts page.prediction.document_side.value
175
+ end
176
+ ```
177
+
178
+ ## Document Type
179
+ [📄](#page-level-fields "This field is only present on individual pages.")**document_type** ([ClassificationField](#classification-field)): The document type or format.
180
+
181
+ ```rb
182
+ result.document.inference.pages.each do |page|
183
+ puts page.prediction.document_type.value
184
+ end
185
+ ```
186
+
187
+ ## Expiry Date
188
+ **expiry_date** ([DateField](#date-field)): The expiry date of the identification card.
189
+
190
+ ```rb
191
+ puts result.document.inference.prediction.expiry_date.value
192
+ ```
193
+
194
+ ## Gender
195
+ **gender** ([StringField](#string-field)): The gender of the card holder.
196
+
197
+ ```rb
198
+ puts result.document.inference.prediction.gender.value
199
+ ```
200
+
201
+ ## Given Name(s)
202
+ **given_names** (Array<[StringField](#string-field)>): The given name(s) of the card holder.
203
+
204
+ ```rb
205
+ for given_names_elem in result.document.inference.prediction.given_names do
206
+ puts given_names_elem.value
207
+ end
208
+ ```
209
+
210
+ ## Date of Issue
211
+ **issue_date** ([DateField](#date-field)): The date of issue of the identification card.
212
+
213
+ ```rb
214
+ puts result.document.inference.prediction.issue_date.value
215
+ ```
216
+
217
+ ## Mrz Line 1
218
+ **mrz1** ([StringField](#string-field)): The Machine Readable Zone, first line.
219
+
220
+ ```rb
221
+ puts result.document.inference.prediction.mrz1.value
222
+ ```
223
+
224
+ ## Mrz Line 2
225
+ **mrz2** ([StringField](#string-field)): The Machine Readable Zone, second line.
226
+
227
+ ```rb
228
+ puts result.document.inference.prediction.mrz2.value
229
+ ```
230
+
231
+ ## Mrz Line 3
232
+ **mrz3** ([StringField](#string-field)): The Machine Readable Zone, third line.
233
+
234
+ ```rb
235
+ puts result.document.inference.prediction.mrz3.value
236
+ ```
237
+
238
+ ## Nationality
239
+ **nationality** ([StringField](#string-field)): The nationality of the card holder.
240
+
241
+ ```rb
242
+ puts result.document.inference.prediction.nationality.value
243
+ ```
244
+
245
+ ## Surname
246
+ **surname** ([StringField](#string-field)): The surname of the card holder.
247
+
248
+ ```rb
249
+ puts result.document.inference.prediction.surname.value
250
+ ```
251
+
252
+ # Questions?
253
+ [Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)