mindee 1.2.0 → 2.0.0
- checksums.yaml +4 -4
- data/.gitignore +1 -1
- data/.rubocop.yml +2 -2
- data/.yardopts +4 -0
- data/CHANGELOG.md +21 -0
- data/README.md +46 -23
- data/Rakefile +6 -1
- data/bin/mindee.rb +70 -61
- data/docs/ruby-api-builder.md +131 -0
- data/docs/ruby-getting-started.md +265 -0
- data/docs/ruby-invoice-ocr.md +261 -0
- data/docs/ruby-passport-ocr.md +156 -0
- data/docs/ruby-receipt-ocr.md +170 -0
- data/lib/mindee/client.rb +128 -93
- data/lib/mindee/document_config.rb +22 -154
- data/lib/mindee/geometry.rb +105 -8
- data/lib/mindee/http/endpoint.rb +80 -0
- data/lib/mindee/input/pdf_processing.rb +106 -0
- data/lib/mindee/input/sources.rb +97 -0
- data/lib/mindee/input.rb +3 -0
- data/lib/mindee/parsing/document.rb +31 -0
- data/lib/mindee/parsing/error.rb +22 -0
- data/lib/mindee/parsing/inference.rb +53 -0
- data/lib/mindee/parsing/page.rb +46 -0
- data/lib/mindee/parsing/prediction/base.rb +30 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/amount.rb +5 -1
- data/lib/mindee/{fields → parsing/prediction/common_fields}/base.rb +16 -5
- data/lib/mindee/{fields → parsing/prediction/common_fields}/company_registration.rb +0 -0
- data/lib/mindee/{fields/datefield.rb → parsing/prediction/common_fields/date.rb} +0 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/locale.rb +0 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/payment_details.rb +0 -0
- data/lib/mindee/parsing/prediction/common_fields/position.rb +39 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/tax.rb +7 -2
- data/lib/mindee/parsing/prediction/common_fields/text.rb +12 -0
- data/lib/mindee/parsing/prediction/common_fields.rb +11 -0
- data/lib/mindee/parsing/prediction/custom/custom_v1.rb +58 -0
- data/lib/mindee/{fields/custom_docs.rb → parsing/prediction/custom/fields.rb} +5 -5
- data/lib/mindee/parsing/prediction/eu/license_plate/license_plate_v1.rb +34 -0
- data/lib/mindee/parsing/prediction/fr/bank_account_details/bank_account_details_v1.rb +40 -0
- data/lib/mindee/parsing/prediction/fr/carte_vitale/carte_vitale_v1.rb +49 -0
- data/lib/mindee/parsing/prediction/fr/id_card/id_card_v1.rb +84 -0
- data/lib/mindee/parsing/prediction/invoice/invoice_line_item.rb +58 -0
- data/lib/mindee/parsing/prediction/invoice/invoice_v4.rb +216 -0
- data/lib/mindee/parsing/prediction/passport/passport_v1.rb +184 -0
- data/lib/mindee/parsing/prediction/receipt/receipt_v4.rb +84 -0
- data/lib/mindee/parsing/prediction/shipping_container/shipping_container_v1.rb +38 -0
- data/lib/mindee/parsing/prediction/us/bank_check/bank_check_v1.rb +70 -0
- data/lib/mindee/parsing/prediction.rb +12 -0
- data/lib/mindee/parsing.rb +4 -0
- data/lib/mindee/version.rb +1 -1
- data/mindee.gemspec +2 -1
- metadata +57 -24
- data/lib/mindee/documents/base.rb +0 -35
- data/lib/mindee/documents/custom.rb +0 -65
- data/lib/mindee/documents/financial_doc.rb +0 -135
- data/lib/mindee/documents/invoice.rb +0 -162
- data/lib/mindee/documents/passport.rb +0 -163
- data/lib/mindee/documents/receipt.rb +0 -109
- data/lib/mindee/documents.rb +0 -7
- data/lib/mindee/endpoint.rb +0 -105
- data/lib/mindee/fields/orientation.rb +0 -26
- data/lib/mindee/fields.rb +0 -11
- data/lib/mindee/inputs.rb +0 -153
- data/lib/mindee/response.rb +0 -27
@@ -0,0 +1,265 @@
|
|
1
|
+
This guide will help you get started with the Mindee Ruby OCR SDK to easily extract data from your documents.
|
2
|
+
|
3
|
+
The Ruby client supports the [invoice](https://developers.mindee.com/docs/ruby-invoice-ocr), [receipt](https://developers.mindee.com/docs/ruby-receipt-ocr), and [passport](https://developers.mindee.com/docs/ruby-passport-ocr) OCR APIs, as well as [custom-built APIs](https://developers.mindee.com/docs/ruby-api-builder) from the API Builder.
|
4
|
+
|
5
|
+
You can view the source code on [GitHub](https://github.com/mindee/mindee-api-ruby).
|
6
|
+
|
7
|
+
## Installation
|
8
|
+
|
9
|
+
### Requirements
|
10
|
+
The following Ruby versions are tested and supported: 2.6, 2.7, 3.0, 3.1, 3.2
|
11
|
+
|
12
|
+
### Standard Installation
|
13
|
+
To quickly get started with the Ruby OCR SDK, add this line to your application's Gemfile:
|
14
|
+
|
15
|
+
```ruby
|
16
|
+
gem 'mindee'
|
17
|
+
```
|
18
|
+
And then execute:
|
19
|
+
|
20
|
+
```shell
|
21
|
+
bundle install
|
22
|
+
```
|
23
|
+
Or you can install it like this:
|
24
|
+
|
25
|
+
```shell
|
26
|
+
gem install mindee
|
27
|
+
```
|
28
|
+
Finally, Ruby away!
|
29
|
+
|
30
|
+
### Development Installation
|
31
|
+
If you'll be modifying the source code, you'll need to install the required libraries to get started.
|
32
|
+
|
33
|
+
We recommend using [Bundler](https://bundler.io/).
|
34
|
+
|
35
|
+
1. First clone the repo.
|
36
|
+
|
37
|
+
```shell
|
38
|
+
git clone git@github.com:mindee/mindee-api-ruby.git
|
39
|
+
```
|
40
|
+
|
41
|
+
2. Navigate to the cloned directory and install all required libraries.
|
42
|
+
|
43
|
+
```shell
|
44
|
+
cd mindee-api-ruby
|
45
|
+
bundle install
|
46
|
+
```
|
47
|
+
|
48
|
+
## Updating the Library
|
49
|
+
It is important to always check the version of the Mindee OCR SDK you are using, as new and updated
|
50
|
+
features won’t work on older versions.
|
51
|
+
|
52
|
+
To get the latest version of your OCR SDK:
|
53
|
+
|
54
|
+
```shell
|
55
|
+
gem install mindee
|
56
|
+
```
|
57
|
+
|
58
|
+
To install a specific version of Mindee:
|
59
|
+
|
60
|
+
```shell
|
61
|
+
gem install mindee -v <version>
|
62
|
+
```
|
63
|
+
|
64
|
+
## Usage
|
65
|
+
Using Mindee's APIs can be broken down into the following steps:
|
66
|
+
|
67
|
+
1. [Initialize a `Client`](#initializing-the-client)
|
68
|
+
2. [Load a File](#loading-a-document-file)
|
69
|
+
3. [Send the File](#sending-a-file) to Mindee's API
|
70
|
+
4. [Process the Result](#process-the-result) in some way
|
71
|
+
|
72
|
+
Let's take a deep dive into how this works.
|
73
|
+
|
74
|
+
## Initializing the Client
|
75
|
+
The `Client` centralizes document configurations in a single object.
|
76
|
+
|
77
|
+
The `Client` requires your [API key](https://developers.mindee.com/docs/make-your-first-request#create-an-api-key).
|
78
|
+
|
79
|
+
You can either pass it directly to the constructor or set it through an environment variable.
|
80
|
+
|
81
|
+
|
82
|
+
### Pass the API key directly
|
83
|
+
```ruby
|
84
|
+
# Init a new client, passing the key directly
|
85
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
86
|
+
```
|
87
|
+
|
88
|
+
### Set the API key in the environment
|
89
|
+
API keys should be set as environment variables, especially for any production deployment.
|
90
|
+
|
91
|
+
The following environment variable will set the global API key:
|
92
|
+
```shell
|
93
|
+
MINDEE_API_KEY=my-api-key
|
94
|
+
```
|
95
|
+
|
96
|
+
Then in your code:
|
97
|
+
```ruby
|
98
|
+
# Init a new client without an API key
|
99
|
+
mindee_client = Mindee::Client.new
|
100
|
+
```
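
The two options above can also be combined in application code. The sketch below is only an illustration, not the library's own lookup logic: `resolve_api_key` is a hypothetical helper that prefers an explicit key and falls back to the `MINDEE_API_KEY` variable.

```ruby
# Hypothetical helper (not part of the Mindee gem): pick the API key from an
# explicit argument first, then from the MINDEE_API_KEY environment variable.
def resolve_api_key(explicit_key = nil, env = ENV)
  key = explicit_key || env['MINDEE_API_KEY']
  raise ArgumentError, 'No Mindee API key provided' if key.nil? || key.strip.empty?

  key
end

# A plain Hash stands in for ENV so the example runs anywhere.
puts resolve_api_key('my-api-key', {})
puts resolve_api_key(nil, { 'MINDEE_API_KEY' => 'key-from-env' })
```

Failing early with a clear error when no key is found tends to be easier to debug than a rejected request later on.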
|
101
|
+
|
102
|
+
### Setting the Request Timeout
|
103
|
+
The request timeout can be set using an environment variable:
|
104
|
+
```shell
|
105
|
+
MINDEE_REQUEST_TIMEOUT=200
|
106
|
+
```
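
If your own code needs the same setting, it can read the variable in a similar way. This is a minimal sketch with a hypothetical helper name; the fallback value of 120 is an assumption made for the example, not a documented default of the library.

```ruby
# Hypothetical helper: read MINDEE_REQUEST_TIMEOUT as an integer.
# The fallback of 120 is an assumption for this sketch only.
def request_timeout(env = ENV, fallback = 120)
  Integer(env.fetch('MINDEE_REQUEST_TIMEOUT', fallback))
end

# A plain Hash stands in for ENV so the example runs anywhere.
puts request_timeout({ 'MINDEE_REQUEST_TIMEOUT' => '200' })
puts request_timeout({})
```

`Integer(...)` raises an `ArgumentError` for a non-numeric value, so a typo in the variable is caught immediately rather than silently ignored.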
|
107
|
+
|
108
|
+
|
109
|
+
## Loading a Document File
|
110
|
+
Before being able to send a document to the API, it must first be loaded.
|
111
|
+
|
112
|
+
You don't need to worry about different MIME types: the library will take care of handling
|
113
|
+
all supported types automatically.
|
114
|
+
|
115
|
+
Once a document is loaded, interacting with it is done in exactly the same way, regardless
|
116
|
+
of how it was loaded.
|
117
|
+
|
118
|
+
There are a few different ways of loading a document file, depending on your use case:
|
119
|
+
|
120
|
+
* [Path](#path)
|
121
|
+
* [File Object](#file-object)
|
122
|
+
* [Base64](#base64)
|
123
|
+
* [Bytes](#bytes)
|
124
|
+
|
125
|
+
### Path
|
126
|
+
Load a file directly from disk. Requires an absolute path, as a string.
|
127
|
+
|
128
|
+
```ruby
|
129
|
+
result = mindee_client.doc_from_path("/path/to/the/invoice.jpg").parse(Mindee::Prediction::InvoiceV4)
|
130
|
+
|
131
|
+
# Print a full summary of the parsed data in RST format
|
132
|
+
puts result
|
133
|
+
```
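
Since an absolute path is required, a relative path can be resolved first with the standard library. A minimal sketch, where `absolute_document_path` and the directory names are hypothetical:

```ruby
require 'pathname'

# Turn a possibly relative path into an absolute one, resolved against base_dir.
def absolute_document_path(path, base_dir = Dir.pwd)
  File.expand_path(path, base_dir)
end

# Hypothetical locations, used only for illustration.
path = absolute_document_path('invoices/invoice.jpg', '/home/user/documents')
puts path
puts Pathname.new(path).absolute?
```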
|
134
|
+
|
135
|
+
### File Object
|
136
|
+
A normal Ruby file object with a path. Must be in binary mode.
|
137
|
+
|
138
|
+
**Note**: The original filename is required when calling the method.
|
139
|
+
|
140
|
+
```ruby
|
141
|
+
result = nil
|
142
|
+
File.open(INVOICE_FILE, 'rb') do |fo|
|
143
|
+
  result = mindee_client.doc_from_file(fo, "invoice.jpg").parse(Mindee::Prediction::InvoiceV4)
|
144
|
+
end
|
145
|
+
|
146
|
+
# Print a full summary of the parsed data in RST format
|
147
|
+
puts result
|
148
|
+
```
|
149
|
+
|
150
|
+
### Base64
|
151
|
+
Load file contents from a base64-encoded string.
|
152
|
+
|
153
|
+
**Note**: The original filename is required when calling the method.
|
154
|
+
|
155
|
+
```ruby
|
156
|
+
b64_string = "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLD...."
|
157
|
+
result = mindee_client.doc_from_b64string(b64_string, "receipt.jpg").parse(Mindee::Prediction::ReceiptV4)
|
158
|
+
|
159
|
+
# Print a full summary of the parsed data in RST format
|
160
|
+
puts result
|
161
|
+
```
|
162
|
+
|
163
|
+
### Bytes
|
164
|
+
Requires raw bytes.
|
165
|
+
|
166
|
+
**Note**: The original filename is required when calling the method.
|
167
|
+
|
168
|
+
```ruby
|
169
|
+
raw_bytes = "%PDF-1.3\n%\xbf\xf7\xa2\xfe\n1 0 ob...".b
|
170
|
+
result = mindee_client.doc_from_bytes(raw_bytes, "invoice.pdf").parse(Mindee::Prediction::InvoiceV4)
|
171
|
+
|
172
|
+
# Print a full summary of the parsed data in RST format
|
173
|
+
puts result
|
174
|
+
```
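
For the Base64 and Bytes loaders, the input can be produced from a file using only core Ruby. The sketch below assumes nothing about the Mindee gem; `read_document` is a hypothetical helper and the temporary file is a stand-in for a real document.

```ruby
require 'tempfile'

# Read a file as raw bytes and as a Base64 string, using core Ruby only.
def read_document(path)
  raw_bytes = File.binread(path)       # binary (ASCII-8BIT) string
  b64_string = [raw_bytes].pack('m0')  # Base64 without line breaks
  [raw_bytes, b64_string]
end

# Stand-in file so the sketch runs on its own; use a real document in practice.
Tempfile.create(['sample', '.pdf']) do |file|
  file.binmode
  file.write("%PDF-1.3\n".b)
  file.flush
  raw_bytes, b64_string = read_document(file.path)
  puts raw_bytes.bytesize
  puts b64_string
end
```

`File.binread` avoids any encoding conversion, which matters for binary formats such as PDF or JPEG.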
|
175
|
+
|
176
|
+
## Sending a File
|
177
|
+
To send a file to the API, we need to specify how to process the document.
|
178
|
+
This will determine which API endpoint is used and how the API return will be handled internally by the library.
|
179
|
+
|
180
|
+
More specifically, we need to set a `Mindee::Prediction` class as the first parameter of the `parse` method.
|
181
|
+
|
182
|
+
This is because the `parse` method's return type depends on its first argument.
|
183
|
+
|
184
|
+
Each document type available in the library has a corresponding class, which inherits from the base `Mindee::Prediction` class.
|
185
|
+
This is detailed in each document-specific guide.
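
The idea of a class argument deciding the return type can be illustrated in isolation. The classes and the `demo_parse` method below are hypothetical stand-ins written for this sketch; they are not the Mindee classes and do not call the API.

```ruby
# Hypothetical stand-ins (NOT the Mindee classes) showing how a class passed
# as an argument decides what kind of object comes back.
class DemoPrediction
  attr_reader :raw

  def initialize(raw)
    @raw = raw
  end
end

class DemoInvoice < DemoPrediction
  def total
    raw['total_amount']
  end
end

class DemoPassport < DemoPrediction
  def surname
    raw['surname']
  end
end

# The first argument selects the class used to wrap the raw data.
def demo_parse(prediction_class, raw)
  prediction_class.new(raw)
end

invoice = demo_parse(DemoInvoice, { 'total_amount' => 2608.20 })
passport = demo_parse(DemoPassport, { 'surname' => 'PUDARSAN' })
puts invoice.class
puts passport.surname
```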
|
186
|
+
|
187
|
+
### Off-the-Shelf Documents
|
188
|
+
Simply setting the correct class is enough:
|
189
|
+
```ruby
|
190
|
+
result = doc.parse(Mindee::Prediction::InvoiceV4)
|
191
|
+
```
|
192
|
+
|
193
|
+
### Custom Documents
|
194
|
+
The endpoint to use must also be set; this is done in the second argument of the `parse` method:
|
195
|
+
```ruby
|
196
|
+
result = doc.parse(Mindee::Prediction::CustomV1, endpoint_name: 'wnine')
|
197
|
+
```
|
198
|
+
|
199
|
+
This is because the `CustomV1` class is enough to handle the return processing, but the actual endpoint needs to be specified.
|
200
|
+
|
201
|
+
## Process the Result
|
202
|
+
The response object is common to all documents, including custom documents. The main properties are:
|
203
|
+
|
204
|
+
* `id` — Mindee ID of the document
|
205
|
+
* `name` — Filename sent to the API
|
206
|
+
* `inference` — [Inference](#inference)
|
207
|
+
|
208
|
+
### Inference
|
209
|
+
Groups the page-level predictions together with the predictions for the entire document.
|
210
|
+
|
211
|
+
* `prediction` — [Document level prediction](#document-level-prediction)
|
212
|
+
* `pages` — [Page level prediction](#page-level-prediction)
|
213
|
+
|
214
|
+
#### Document level prediction
|
215
|
+
The `prediction` attribute is a `Prediction` object specific to the type of document being processed.
|
216
|
+
It contains the data extracted from the entire document, all pages combined.
|
217
|
+
|
218
|
+
It's possible to have the same field in various pages, but at the document level,
|
219
|
+
only the highest confidence field data will be shown (this is all done automatically at the API level).
|
220
|
+
|
221
|
+
```ruby
|
222
|
+
# as an object, complete
|
223
|
+
pp result.inference.prediction
|
224
|
+
|
225
|
+
# as a string, summary in RST format
|
226
|
+
puts result.inference.prediction
|
227
|
+
```
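
The "highest confidence wins" rule is applied by the API, not by your code. Purely as an illustration of the rule, here is a sketch using a hypothetical `DemoPageField` struct and made-up confidence scores:

```ruby
# Hypothetical stand-in for one field as read on one page.
DemoPageField = Struct.new(:page_id, :value, :confidence)

# Keep the candidate with the highest confidence score.
def best_candidate(candidates)
  candidates.max_by(&:confidence)
end

# Made-up values: the same field read on two different pages.
candidates = [
  DemoPageField.new(0, '2608.20', 0.91),
  DemoPageField.new(1, '2608.00', 0.64)
]
best = best_candidate(candidates)
puts "page #{best.page_id}: #{best.value}"
```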
|
228
|
+
|
229
|
+
#### Page level prediction
|
230
|
+
The `pages` attribute is a list of page objects, each holding the `Prediction` for one page.
|
231
|
+
|
232
|
+
Each page element contains the data extracted for a particular page of the document.
|
233
|
+
The order of the elements in the array matches the order of the pages in the document.
|
234
|
+
|
235
|
+
All response objects have this property, regardless of the number of pages.
|
236
|
+
Single page documents will have a single entry.
|
237
|
+
|
238
|
+
Iteration is done like any Ruby array:
|
239
|
+
```ruby
|
240
|
+
result.inference.pages.each do |page|
|
241
|
+
  # as an object, complete
|
242
|
+
  pp page.prediction
|
243
|
+
|
244
|
+
  # as a string, summary in RST format
|
245
|
+
  puts page.prediction
|
246
|
+
end
|
247
|
+
```
|
248
|
+
|
249
|
+
#### Page Orientation
|
250
|
+
The orientation field is only available at the page level as it describes whether the page image should be rotated to be upright.
|
251
|
+
|
252
|
+
If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:
|
253
|
+
|
254
|
+
* 0 degrees: the page is already upright
|
255
|
+
* 90 degrees: the page must be rotated clockwise to be upright
|
256
|
+
* 270 degrees: the page must be rotated counterclockwise to be upright
|
257
|
+
|
258
|
+
```ruby
|
259
|
+
result.inference.pages.each do |page|
|
260
|
+
  puts page.orientation.value
|
261
|
+
end
|
262
|
+
```
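
To act on the orientation value, the three outputs listed above can be mapped to an action. A minimal sketch; `rotation_hint` is a hypothetical helper and the wording of the hints is ours:

```ruby
# Map the orientation value (in degrees) to the action described above.
def rotation_hint(degrees)
  case degrees
  when 0 then 'already upright'
  when 90 then 'rotate clockwise'
  when 270 then 'rotate counterclockwise'
  else raise ArgumentError, "unexpected orientation: #{degrees.inspect}"
  end
end

[0, 90, 270].each { |degrees| puts "#{degrees}: #{rotation_hint(degrees)}" }
```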
|
263
|
+
|
264
|
+
## Questions?
|
265
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
@@ -0,0 +1,261 @@
|
|
1
|
+
The Ruby OCR SDK supports the [invoice API](https://developers.mindee.com/docs/invoice-ocr) for extracting data from invoices.
|
2
|
+
|
3
|
+
Using the sample below, we are going to illustrate how to extract the data that we want using the OCR SDK.
|
4
|
+
|
5
|
+
![sample invoice](https://raw.githubusercontent.com/mindee/client-lib-test-data/main/invoice/invoice_1p.jpg)
|
6
|
+
|
7
|
+
## Quick Start
|
8
|
+
```ruby
|
9
|
+
require 'mindee'
|
10
|
+
|
11
|
+
# Init a new client, specifying an API key
|
12
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
13
|
+
|
14
|
+
# Send the file
|
15
|
+
result = mindee_client.doc_from_path('/path/to/the/file.ext').parse(Mindee::Prediction::InvoiceV4)
|
16
|
+
|
17
|
+
# Print a summary of the document prediction in RST format
|
18
|
+
puts result.inference.prediction
|
19
|
+
```
|
20
|
+
|
21
|
+
Output:
|
22
|
+
```shell
|
23
|
+
:Locale: en; en; CAD;
|
24
|
+
:Document type: INVOICE
|
25
|
+
:Invoice number: 14
|
26
|
+
:Reference numbers: AD29094
|
27
|
+
:Invoice date: 2018-09-25
|
28
|
+
:Invoice due date: 2018-09-25
|
29
|
+
:Supplier name: TURNPIKE DESIGNS CO.
|
30
|
+
:Supplier address: 156 University Ave, Toronto ON, Canada M5H 2H7
|
31
|
+
:Supplier company registrations:
|
32
|
+
:Supplier payment details:
|
33
|
+
:Customer name: JIRO DOI
|
34
|
+
:Customer address: 1954 Bloor Street West Toronto, ON, M6P 3K9 Canada
|
35
|
+
:Customer company registrations:
|
36
|
+
:Taxes: 193.20 8.00%
|
37
|
+
:Total net: 2415.00
|
38
|
+
:Total taxes: 193.20
|
39
|
+
:Total amount: 2608.20
|
40
|
+
|
41
|
+
:Line Items:
|
42
|
+
====================== ======== ========= ========== ================== ====================================
|
43
|
+
Code QTY Price Amount Tax (Rate) Description
|
44
|
+
====================== ======== ========= ========== ================== ====================================
|
45
|
+
1.00 65.00 65.00 Platinum web hosting package Down...
|
46
|
+
3.00 2100.00 2100.00 2 page website design Includes ba...
|
47
|
+
1.00 250.00 250.00 Mobile designs Includes responsiv...
|
48
|
+
====================== ======== ========= ========== ================== ====================================
|
49
|
+
```
|
50
|
+
|
51
|
+
**Note:** Line item descriptions are truncated here only for display purposes.
|
52
|
+
The full text is available in the [details](#line-items).
|
53
|
+
|
54
|
+
## Fields
|
55
|
+
Each prediction object contains a set of different fields.
|
56
|
+
Each `Field` object contains at a minimum the following attributes:
|
57
|
+
|
58
|
+
* `value` (String or Float depending on the field type): corresponds to the field value. Can be `nil` if no value was extracted.
|
59
|
+
* `confidence` (Float): the confidence score of the field prediction.
|
60
|
+
* `bounding_box` (Array< Array< Float > >): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
|
61
|
+
* `polygon` (Array< Array< Float > >): contains the relative vertex coordinates (points) of a polygon containing the field in the image.
|
62
|
+
* `reconstructed` (Boolean): True if the field was reconstructed or computed using other fields.
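
Relative coordinates usually need to be scaled before drawing on an image. The sketch below assumes that "relative" means each coordinate is a fraction (0.0 to 1.0) of the page width or height; `to_pixels`, the sample points, and the image size are all made up for illustration.

```ruby
# Convert relative [x, y] points to pixel coordinates.
# Assumption: each coordinate is a fraction (0.0 to 1.0) of the page width/height.
def to_pixels(points, width, height)
  points.map { |x, y| [(x * width).round, (y * height).round] }
end

# Made-up bounding box on a hypothetical 1000 x 800 pixel image.
bounding_box = [[0.1, 0.2], [0.5, 0.2], [0.5, 0.25], [0.1, 0.25]]
pp to_pixels(bounding_box, 1000, 800)
```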
|
63
|
+
|
64
|
+
|
65
|
+
## Attributes
|
66
|
+
Depending on the field type, there might be additional attributes that will be extracted in the `Invoice` object.
|
67
|
+
|
68
|
+
Using the above sample, the following are the basic fields that can be extracted:
|
69
|
+
|
70
|
+
- [Orientation](#orientation)
|
71
|
+
- [Customer Information](#customer-information)
|
72
|
+
- [Dates](#dates)
|
73
|
+
- [Locale and Currency](#locale)
|
74
|
+
- [Payment Information](#payment-information)
|
75
|
+
- [Supplier Information](#supplier-information)
|
76
|
+
- [Taxes](#taxes)
|
77
|
+
- [Totals](#totals)
|
78
|
+
- [Line Items](#line-items)
|
79
|
+
|
80
|
+
|
81
|
+
### Customer Information
|
82
|
+
**`customer_name`** (Field): Customer's name
|
83
|
+
|
84
|
+
```ruby
|
85
|
+
puts result.inference.prediction.customer_name.value
|
86
|
+
```
|
87
|
+
|
88
|
+
**`customer_address`** (Field): Customer's postal address
|
89
|
+
|
90
|
+
```ruby
|
91
|
+
puts result.inference.prediction.customer_address.value
|
92
|
+
```
|
93
|
+
|
94
|
+
**`customer_company_registrations`** (Array< CompanyRegistration >): Customer's company registrations
|
95
|
+
|
96
|
+
```ruby
|
97
|
+
result.inference.prediction.customer_company_registrations.each do |registration|
|
98
|
+
  puts registration
|
99
|
+
end
|
100
|
+
```
|
101
|
+
|
102
|
+
### Dates
|
103
|
+
Date fields:
|
104
|
+
|
105
|
+
* contain the `date_object` attribute, which is a standard Ruby [date object](https://ruby-doc.org/stdlib-2.7.1/libdoc/date/rdoc/Date.html)
|
106
|
+
* have a `value` attribute which is the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) representation of the date.
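
Because `value` is an ISO 8601 string, it can also be parsed with Ruby's standard `Date` class when only the string is at hand. A minimal sketch, using a hypothetical `days_between` helper and the dates from the sample output:

```ruby
require 'date'

# Number of days between two ISO 8601 date strings.
def days_between(issued_iso, due_iso)
  (Date.iso8601(due_iso) - Date.iso8601(issued_iso)).to_i
end

# In the sample invoice, the invoice date and due date are the same day.
puts days_between('2018-09-25', '2018-09-25')
puts Date.iso8601('2018-09-25').strftime('%d %B %Y')
```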
|
107
|
+
|
108
|
+
The following date fields are available:
|
109
|
+
|
110
|
+
**`date`**: Date the invoice was issued
|
111
|
+
|
112
|
+
```ruby
|
113
|
+
puts result.inference.prediction.date.value
|
114
|
+
```
|
115
|
+
|
116
|
+
**`due_date`**: Payment due date of the invoice.
|
117
|
+
|
118
|
+
```ruby
|
119
|
+
puts result.inference.prediction.due_date.value
|
120
|
+
```
|
121
|
+
|
122
|
+
### Locale
|
123
|
+
**`locale`** [Locale]: Locale information.
|
124
|
+
|
125
|
+
* `locale.language` (String): Language code in [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1) format as seen on the document.
|
126
|
+
```ruby
|
127
|
+
puts result.inference.prediction.locale.language
|
128
|
+
```
|
129
|
+
|
130
|
+
* `locale.currency` (String): Currency code in [ISO 4217](https://en.wikipedia.org/wiki/ISO_4217) format as seen on the document.
|
131
|
+
```ruby
|
132
|
+
puts result.inference.prediction.locale.currency
|
133
|
+
```
|
134
|
+
|
135
|
+
* `locale.country` (String): Country code in [ISO 3166-1](https://en.wikipedia.org/wiki/ISO_3166-1) alpha-2 format as seen on the document.
|
136
|
+
```ruby
|
137
|
+
puts result.inference.prediction.locale.country
|
138
|
+
```
|
139
|
+
|
140
|
+
### Supplier Information
|
141
|
+
|
142
|
+
**`supplier_name`**: Supplier name as written in the invoice (logo or supplier info).
|
143
|
+
|
144
|
+
```ruby
|
145
|
+
puts result.inference.prediction.supplier_name.value
|
146
|
+
```
|
147
|
+
|
148
|
+
**`supplier_address`**: Supplier address as written in the invoice.
|
149
|
+
|
150
|
+
```ruby
|
151
|
+
puts result.inference.prediction.supplier_address.value
|
152
|
+
```
|
153
|
+
|
154
|
+
**`supplier_payment_details`** (Array< PaymentDetails >): List of invoice's supplier payment details.
|
155
|
+
Each object in the list contains extra attributes:
|
156
|
+
|
157
|
+
* `iban` (String)
|
158
|
+
```ruby
|
159
|
+
# Show the IBAN of the first payment
|
160
|
+
puts result.inference.prediction.supplier_payment_details[0].iban
|
161
|
+
```
|
162
|
+
|
163
|
+
* `swift` (String)
|
164
|
+
```ruby
|
165
|
+
# Show the SWIFT of the first payment
|
166
|
+
puts result.inference.prediction.supplier_payment_details[0].swift
|
167
|
+
```
|
168
|
+
|
169
|
+
* `routing_number` (String)
|
170
|
+
```ruby
|
171
|
+
# Show the routing number of the first payment
|
172
|
+
puts result.inference.prediction.supplier_payment_details[0].routing_number
|
173
|
+
```
|
174
|
+
|
175
|
+
* `account_number` (String)
|
176
|
+
```ruby
|
177
|
+
# Show the account number of the first payment
|
178
|
+
puts result.inference.prediction.supplier_payment_details[0].account_number
|
179
|
+
```
|
180
|
+
|
181
|
+
**`supplier_company_registrations`** (Array< CompanyRegistration >):
|
182
|
+
List of detected supplier's company registration numbers.
|
183
|
+
Each object in the list contains an extra attribute:
|
184
|
+
|
185
|
+
* `type` (String): Type of company registration number among predefined categories.
|
186
|
+
```ruby
|
187
|
+
# Show the type of the first registration
|
188
|
+
puts result.inference.prediction.supplier_company_registrations[0].type
|
189
|
+
```
|
190
|
+
|
191
|
+
* `value` (String): Value of the company identifier
|
192
|
+
```ruby
|
193
|
+
# Show the value of the first registration
|
194
|
+
puts result.inference.prediction.supplier_company_registrations[0].value
|
195
|
+
```
|
196
|
+
|
197
|
+
### Taxes
|
198
|
+
**`taxes`** (Array< TaxField >): Contains tax fields as seen on the invoice.
|
199
|
+
|
200
|
+
* `value` (Float): The tax amount.
|
201
|
+
```ruby
|
202
|
+
# Show the amount of the first tax
|
203
|
+
puts result.inference.prediction.taxes[0].value
|
204
|
+
```
|
205
|
+
|
206
|
+
* `code` (String): The tax code (for example HST or GST in Canada; city tax or state tax in the US).
|
207
|
+
```ruby
|
208
|
+
# Show the code of the first tax
|
209
|
+
puts result.inference.prediction.taxes[0].code
|
210
|
+
```
|
211
|
+
|
212
|
+
* `rate` (Float): The tax rate.
|
213
|
+
```ruby
|
214
|
+
# Show the rate of the first tax
|
215
|
+
puts result.inference.prediction.taxes[0].rate
|
216
|
+
```
|
217
|
+
|
218
|
+
### Totals
|
219
|
+
|
220
|
+
**`total_amount`** (Field): Total amount including taxes.
|
221
|
+
|
222
|
+
```ruby
|
223
|
+
puts result.inference.prediction.total_amount.value
|
224
|
+
```
|
225
|
+
|
226
|
+
**`total_net`** (Field): Total amount excluding taxes.
|
227
|
+
|
228
|
+
```ruby
|
229
|
+
puts result.inference.prediction.total_net.value
|
230
|
+
```
|
231
|
+
|
232
|
+
**`total_tax`** (Field): Total tax value from tax lines.
|
233
|
+
|
234
|
+
```ruby
|
235
|
+
puts result.inference.prediction.total_tax.value
|
236
|
+
```
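
The totals and taxes can be cross-checked against each other. The sketch below uses the numbers from the sample output (net 2415.00, tax 193.20 at 8.00%, total 2608.20); both helper methods are hypothetical, and a small tolerance is used because the values are floats.

```ruby
# Net plus tax should equal the total, within a rounding tolerance.
def totals_consistent?(net, tax, total, tolerance = 0.005)
  (net + tax - total).abs < tolerance
end

# The tax amount should match net * rate / 100, within a rounding tolerance.
def tax_matches_rate?(net, rate_percent, tax, tolerance = 0.005)
  (net * rate_percent / 100.0 - tax).abs < tolerance
end

# Values copied from the sample output above.
puts totals_consistent?(2415.00, 193.20, 2608.20)
puts tax_matches_rate?(2415.00, 8.00, 193.20)
```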
|
237
|
+
|
238
|
+
### Line Items
|
239
|
+
|
240
|
+
**`line_items`** (Array<InvoiceLineItem>): Line items details.
|
241
|
+
Each object in the list contains:
|
242
|
+
|
243
|
+
* `product_code` (String)
|
244
|
+
* `description` (String)
|
245
|
+
* `quantity` (Float)
|
246
|
+
* `unit_price` (Float)
|
247
|
+
* `total_amount` (Float)
|
248
|
+
* `tax_rate` (Float)
|
249
|
+
* `tax_amount` (Float)
|
250
|
+
* `confidence` (Float)
|
251
|
+
* `page_id` (Integer)
|
252
|
+
* `polygon` (Polygon)
|
253
|
+
|
254
|
+
```ruby
|
255
|
+
result.inference.prediction.line_items.each do |line_item|
|
256
|
+
  pp line_item
|
257
|
+
end
|
258
|
+
```
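
Line items lend themselves to simple sanity checks. The sketch below uses a hypothetical `DemoLineItem` struct holding a subset of the attributes listed above, filled with the values read from the sample output (descriptions left truncated as shown there). It sums the amounts and flags rows where quantity times unit price differs from the amount.

```ruby
# Hypothetical stand-in carrying a subset of the line item attributes.
DemoLineItem = Struct.new(:description, :quantity, :unit_price, :total_amount)

# Rows as read from the sample output above.
SAMPLE_ITEMS = [
  DemoLineItem.new('Platinum web hosting package Down...', 1.0, 65.0, 65.0),
  DemoLineItem.new('2 page website design Includes ba...', 3.0, 2100.0, 2100.0),
  DemoLineItem.new('Mobile designs Includes responsiv...', 1.0, 250.0, 250.0)
].freeze

# Sum of the amounts of all line items.
def line_items_total(items)
  items.sum(&:total_amount)
end

# Items where quantity * unit price does not match the extracted amount.
def inconsistent_items(items, tolerance = 0.005)
  items.select { |item| (item.quantity * item.unit_price - item.total_amount).abs >= tolerance }
end

puts line_items_total(SAMPLE_ITEMS)
inconsistent_items(SAMPLE_ITEMS).each { |item| puts "check: #{item.description}" }
```

In the sample, the amounts add up to the total net of 2415.00, while the second row's quantity does not agree with its price and amount, which is the kind of extraction quirk such a check can surface.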
|
259
|
+
|
260
|
+
## Questions?
|
261
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
@@ -0,0 +1,156 @@
|
|
1
|
+
The Ruby OCR SDK supports the [passport API](https://developers.mindee.com/docs/passport-ocr) for extracting data from passports.
|
2
|
+
|
3
|
+
Using the sample below, we are going to illustrate how to extract the data that we want using the OCR SDK.
|
4
|
+
|
5
|
+
![sample passport](https://raw.githubusercontent.com/mindee/client-lib-test-data/main/passport/passport.jpeg)
|
6
|
+
|
7
|
+
## Quick Start
|
8
|
+
```ruby
|
9
|
+
require 'mindee'
|
10
|
+
|
11
|
+
# Init a new client, specifying an API key
|
12
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
13
|
+
|
14
|
+
# Send the file
|
15
|
+
result = mindee_client.doc_from_path('/path/to/the/file.ext').parse(Mindee::Prediction::PassportV1)
|
16
|
+
|
17
|
+
# Print a summary of the document prediction in RST format
|
18
|
+
puts result.inference.prediction
|
19
|
+
```
|
20
|
+
|
21
|
+
Output:
|
22
|
+
```shell
|
23
|
+
:Full name: HENERT PUDARSAN
|
24
|
+
:Given names: HENERT
|
25
|
+
:Surname: PUDARSAN
|
26
|
+
:Country: GBR
|
27
|
+
:ID Number: 707797979
|
28
|
+
:Issuance date: 2012-04-22
|
29
|
+
:Birth date: 1995-05-20
|
30
|
+
:Expiry date: 2017-04-22
|
31
|
+
:MRZ 1: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<
|
32
|
+
:MRZ 2: 7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
|
33
|
+
:MRZ: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
|
34
|
+
```
|
35
|
+
|
36
|
+
## Fields
|
37
|
+
Each prediction object contains a set of different fields.
|
38
|
+
Each `Field` object contains at a minimum the following attributes:
|
39
|
+
|
40
|
+
* `value` (String or Float depending on the field type): corresponds to the field value. Can be `nil` if no value was extracted.
|
41
|
+
* `confidence` (Float): the confidence score of the field prediction.
|
42
|
+
* `bounding_box` (Array< Array< Float > >): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
|
43
|
+
* `polygon` (Array< Array< Float > >): contains the relative vertex coordinates (points) of a polygon containing the field in the image.
|
44
|
+
* `reconstructed` (Boolean): True if the field was reconstructed or computed using other fields.
|
45
|
+
|
46
|
+
|
47
|
+
## Attributes
|
48
|
+
Depending on the field type specified, additional attributes can be extracted from the `Passport` object.
|
49
|
+
|
50
|
+
Using the above sample, the following are the basic fields that can be extracted:
|
51
|
+
|
52
|
+
- [Orientation](#orientation)
|
53
|
+
- [Birth Place](#birth-place)
|
54
|
+
- [Country](#country)
|
55
|
+
- [Dates (Expiry, Issuance, Birth)](#dates)
|
56
|
+
- [Gender](#gender)
|
57
|
+
- [Given Names](#given-names)
|
58
|
+
- [ID Number](#id)
|
59
|
+
- [Machine Readable Zone](#machine-readable-zone)
|
60
|
+
- [Surname](#surname)
|
61
|
+
|
62
|
+
### Birth Place
|
63
|
+
|
64
|
+
**`birth_place`** (Field): Passport owner's birthplace.
|
65
|
+
|
66
|
+
```ruby
|
67
|
+
puts result.inference.prediction.birth_place.value
|
68
|
+
```
|
69
|
+
|
70
|
+
### Country
|
71
|
+
**`country`** (Field): Passport country in [ISO 3166-1 alpha-3 code format](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3) (3-letter code).
|
72
|
+
|
73
|
+
```ruby
|
74
|
+
puts result.inference.prediction.country.value
|
75
|
+
```
|
76
|
+
|
77
|
+
### Dates
|
78
|
+
Date fields:
|
79
|
+
|
80
|
+
* contain the `date_object` attribute, which is a standard Ruby [date object](https://ruby-doc.org/stdlib-2.7.1/libdoc/date/rdoc/Date.html)
|
81
|
+
* have a `value` attribute which is the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) representation of the date.
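
A common use of these dates is checking whether a passport has expired. A minimal sketch with a hypothetical `expired?` helper, working from the ISO 8601 string and using the expiry date of the sample passport:

```ruby
require 'date'

# True when the ISO 8601 expiry date lies before the reference date.
def expired?(expiry_iso, on = Date.today)
  Date.iso8601(expiry_iso) < on
end

# The sample passport expires on 2017-04-22.
puts expired?('2017-04-22', Date.new(2016, 1, 1))
puts expired?('2017-04-22', Date.new(2018, 1, 1))
```

Passing the reference date explicitly keeps the check deterministic, which is convenient in tests.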
|
82
|
+
|
83
|
+
The following date fields are available:
|
84
|
+
|
85
|
+
**`expiry_date`**: Passport expiry date.
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
puts result.inference.prediction.expiry_date.value
|
89
|
+
```
|
90
|
+
|
91
|
+
**`issuance_date`**: Passport date of issuance.
|
92
|
+
|
93
|
+
```ruby
|
94
|
+
puts result.inference.prediction.issuance_date.value
|
95
|
+
```
|
96
|
+
|
97
|
+
**`birth_date`**: Passport owner's date of birth.
|
98
|
+
|
99
|
+
```ruby
|
100
|
+
puts result.inference.prediction.birth_date.value
|
101
|
+
```
|
102
|
+
|
103
|
+
### Gender
|
104
|
+
|
105
|
+
**`gender`** (Field): Passport owner's gender (M / F).
|
106
|
+
|
107
|
+
```ruby
|
108
|
+
puts result.inference.prediction.gender.value
|
109
|
+
```
|
110
|
+
|
111
|
+
### Names
|
112
|
+
|
113
|
+
**`given_names`** (Array< Field >): List of passport owner's given names.
|
114
|
+
|
115
|
+
```ruby
|
116
|
+
result.inference.prediction.given_names.each do |name|
|
117
|
+
  puts name
|
118
|
+
end
|
119
|
+
```
|
120
|
+
|
121
|
+
**`surname`** (Field): Passport owner's surname.
|
122
|
+
|
123
|
+
```ruby
|
124
|
+
puts result.inference.prediction.surname.value
|
125
|
+
```
|
126
|
+
|
127
|
+
### ID
|
128
|
+
|
129
|
+
**`id_number`** (Field): Passport identification number.
|
130
|
+
|
131
|
+
```ruby
|
132
|
+
puts result.inference.prediction.id_number.value
|
133
|
+
```
|
134
|
+
|
135
|
+
### Machine-Readable Zone
|
136
|
+
|
137
|
+
**`mrz1`** (Field): First line of the passport's machine-readable zone.
|
138
|
+
|
139
|
+
```ruby
|
140
|
+
puts result.inference.prediction.mrz1.value
|
141
|
+
```
|
142
|
+
|
143
|
+
**`mrz2`** (Field): Second line of the passport's machine-readable zone.
|
144
|
+
|
145
|
+
```ruby
|
146
|
+
puts result.inference.prediction.mrz2.value
|
147
|
+
```
|
148
|
+
|
149
|
+
**`mrz`** (Field): Full machine-readable zone of the passport, reconstructed from `mrz1` and `mrz2`.
|
150
|
+
|
151
|
+
```ruby
|
152
|
+
puts result.inference.prediction.mrz.value
|
153
|
+
```
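
The second MRZ line can be used to cross-check other extracted fields. The sketch below is not part of the SDK: `split_mrz2` is a hypothetical helper that slices the line at the fixed positions of the standard 44-character passport (TD3) layout, applied to the `mrz2` value from the sample output.

```ruby
# Slice the second MRZ line of a passport (TD3 layout, 44 characters).
# Check digits (positions 10, 20 and 28) are skipped in this sketch.
def split_mrz2(line)
  {
    id_number: line[0, 9].delete('<'),
    nationality: line[10, 3],
    birth_date: line[13, 6],  # YYMMDD
    gender: line[20],
    expiry_date: line[21, 6]  # YYMMDD
  }
end

# Second MRZ line from the sample output above.
pp split_mrz2('7077979792GBR9505209M1704224<<<<<<<<<<<<<<00')
```

The sliced values line up with the ID number, country, birth date, and expiry date shown in the sample output, so a mismatch between the two would be a reason to review the document manually.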
|
154
|
+
|
155
|
+
## Questions?
|
156
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|