mindee 1.1.2 → 2.0.0
- checksums.yaml +4 -4
- data/.gitignore +1 -1
- data/.rubocop.yml +2 -2
- data/.yardopts +4 -0
- data/CHANGELOG.md +25 -0
- data/Gemfile +0 -7
- data/README.md +52 -21
- data/Rakefile +6 -1
- data/bin/mindee.rb +70 -61
- data/docs/ruby-api-builder.md +131 -0
- data/docs/ruby-getting-started.md +265 -0
- data/docs/ruby-invoice-ocr.md +261 -0
- data/docs/ruby-passport-ocr.md +156 -0
- data/docs/ruby-receipt-ocr.md +170 -0
- data/lib/mindee/client.rb +128 -93
- data/lib/mindee/document_config.rb +22 -154
- data/lib/mindee/geometry.rb +105 -8
- data/lib/mindee/http/endpoint.rb +80 -0
- data/lib/mindee/input/pdf_processing.rb +106 -0
- data/lib/mindee/input/sources.rb +97 -0
- data/lib/mindee/input.rb +3 -0
- data/lib/mindee/parsing/document.rb +31 -0
- data/lib/mindee/parsing/error.rb +22 -0
- data/lib/mindee/parsing/inference.rb +53 -0
- data/lib/mindee/parsing/page.rb +46 -0
- data/lib/mindee/parsing/prediction/base.rb +30 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/amount.rb +5 -1
- data/lib/mindee/{fields → parsing/prediction/common_fields}/base.rb +16 -5
- data/lib/mindee/{fields → parsing/prediction/common_fields}/company_registration.rb +0 -0
- data/lib/mindee/{fields/datefield.rb → parsing/prediction/common_fields/date.rb} +0 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/locale.rb +0 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/payment_details.rb +0 -0
- data/lib/mindee/parsing/prediction/common_fields/position.rb +39 -0
- data/lib/mindee/{fields → parsing/prediction/common_fields}/tax.rb +7 -2
- data/lib/mindee/parsing/prediction/common_fields/text.rb +12 -0
- data/lib/mindee/parsing/prediction/common_fields.rb +11 -0
- data/lib/mindee/parsing/prediction/custom/custom_v1.rb +58 -0
- data/lib/mindee/{fields/custom_docs.rb → parsing/prediction/custom/fields.rb} +5 -5
- data/lib/mindee/parsing/prediction/eu/license_plate/license_plate_v1.rb +34 -0
- data/lib/mindee/parsing/prediction/fr/bank_account_details/bank_account_details_v1.rb +40 -0
- data/lib/mindee/parsing/prediction/fr/carte_vitale/carte_vitale_v1.rb +49 -0
- data/lib/mindee/parsing/prediction/fr/id_card/id_card_v1.rb +84 -0
- data/lib/mindee/parsing/prediction/invoice/invoice_line_item.rb +58 -0
- data/lib/mindee/parsing/prediction/invoice/invoice_v4.rb +216 -0
- data/lib/mindee/parsing/prediction/passport/passport_v1.rb +184 -0
- data/lib/mindee/parsing/prediction/receipt/receipt_v4.rb +84 -0
- data/lib/mindee/parsing/prediction/shipping_container/shipping_container_v1.rb +38 -0
- data/lib/mindee/parsing/prediction/us/bank_check/bank_check_v1.rb +70 -0
- data/lib/mindee/parsing/prediction.rb +12 -0
- data/lib/mindee/parsing.rb +4 -0
- data/lib/mindee/version.rb +1 -1
- data/mindee.gemspec +11 -5
- metadata +105 -30
- data/lib/mindee/documents/base.rb +0 -35
- data/lib/mindee/documents/custom.rb +0 -65
- data/lib/mindee/documents/financial_doc.rb +0 -135
- data/lib/mindee/documents/invoice.rb +0 -162
- data/lib/mindee/documents/passport.rb +0 -163
- data/lib/mindee/documents/receipt.rb +0 -109
- data/lib/mindee/documents.rb +0 -7
- data/lib/mindee/endpoint.rb +0 -105
- data/lib/mindee/fields/orientation.rb +0 -26
- data/lib/mindee/fields.rb +0 -11
- data/lib/mindee/inputs.rb +0 -153
- data/lib/mindee/response.rb +0 -27
@@ -0,0 +1,265 @@
|
|
1
|
+
This guide will help you get started with the Mindee Ruby OCR SDK to easily extract data from your documents.
|
2
|
+
|
3
|
+
The Ruby client supports the [invoice](https://developers.mindee.com/docs/ruby-invoice-ocr), [receipt](https://developers.mindee.com/docs/ruby-receipt-ocr), and [passport](https://developers.mindee.com/docs/ruby-passport-ocr) OCR APIs, as well as [custom-built APIs](https://developers.mindee.com/docs/ruby-api-builder) from the API Builder.
|
4
|
+
|
5
|
+
You can view the source code on [GitHub](https://github.com/mindee/mindee-api-ruby).
|
6
|
+
|
7
|
+
## Installation
|
8
|
+
|
9
|
+
### Requirements
|
10
|
+
The following Ruby versions are tested and supported: 2.6, 2.7, 3.0, 3.1, 3.2
|
11
|
+
|
12
|
+
### Standard Installation
|
13
|
+
To get started quickly with the Ruby OCR SDK, add this line to your application's Gemfile:
|
14
|
+
|
15
|
+
```ruby
|
16
|
+
gem 'mindee'
|
17
|
+
```
|
18
|
+
And then execute:
|
19
|
+
|
20
|
+
```shell
|
21
|
+
bundle install
|
22
|
+
```
|
23
|
+
Or you can install it like this:
|
24
|
+
|
25
|
+
```shell
|
26
|
+
gem install mindee
|
27
|
+
```
|
28
|
+
Finally, Ruby away!
|
29
|
+
|
30
|
+
### Development Installation
|
31
|
+
If you'll be modifying the source code, you'll need to install the required libraries to get started.
|
32
|
+
|
33
|
+
We recommend using [Bundler](https://bundler.io/).
|
34
|
+
|
35
|
+
1. First clone the repo.
|
36
|
+
|
37
|
+
```shell
|
38
|
+
git clone git@github.com:mindee/mindee-api-ruby.git
|
39
|
+
```
|
40
|
+
|
41
|
+
2. Navigate to the cloned directory and install all required libraries.
|
42
|
+
|
43
|
+
```shell
|
44
|
+
cd mindee-api-ruby
|
45
|
+
bundle install
|
46
|
+
```
|
47
|
+
|
48
|
+
## Updating the Library
|
49
|
+
It is important to always check the version of the Mindee OCR SDK you are using, as new and updated
|
50
|
+
features won’t work on older versions.
|
51
|
+
|
52
|
+
To get the latest version of your OCR SDK:
|
53
|
+
|
54
|
+
```shell
|
55
|
+
gem install mindee
|
56
|
+
```
|
57
|
+
|
58
|
+
To install a specific version of Mindee:
|
59
|
+
|
60
|
+
```shell
|
61
|
+
gem install mindee -v <version>
|
62
|
+
```
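When the gem is declared in a Gemfile instead of being installed by hand, a version constraint is a common way to avoid picking up a new major version by accident. The sketch below uses RubyGems' `Gem::Requirement` class to show which versions a pessimistic constraint would accept. The `~> 2.0` constraint and the `accepted?` helper are illustrative assumptions, not a recommendation taken from the Mindee documentation.

```ruby
require 'rubygems'

# '~> 2.0' is the pessimistic operator: ">= 2.0" and "< 3.0".
REQUIREMENT = Gem::Requirement.new('~> 2.0')

# Hypothetical helper: true when the version string satisfies the requirement.
def accepted?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

%w[1.1.2 2.0.0 2.4.1 3.0.0].each do |version|
  puts "#{version}: #{accepted?(REQUIREMENT, version)}"
end
```

In a Gemfile, the same constraint would be written as `gem 'mindee', '~> 2.0'`.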
|
63
|
+
|
64
|
+
## Usage
|
65
|
+
Using Mindee's APIs can be broken down into the following steps:
|
66
|
+
|
67
|
+
1. [Initialize a `Client`](#initializing-the-client)
|
68
|
+
2. [Load a File](#loading-a-document-file)
|
69
|
+
3. [Send the File](#sending-a-file) to Mindee's API
|
70
|
+
4. [Process the Result](#process-the-result) in some way
|
71
|
+
|
72
|
+
Let's take a deep dive into how this works.
|
73
|
+
|
74
|
+
## Initializing the Client
|
75
|
+
The `Client` centralizes document configurations in a single object.
|
76
|
+
|
77
|
+
The `Client` requires your [API key](https://developers.mindee.com/docs/make-your-first-request#create-an-api-key).
|
78
|
+
|
79
|
+
You can either pass it directly to the constructor or set it through an environment variable.
|
80
|
+
|
81
|
+
|
82
|
+
### Pass the API key directly
|
83
|
+
```ruby
|
84
|
+
# Init a new client, passing the key directly
|
85
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
86
|
+
```
|
87
|
+
|
88
|
+
### Set the API key in the environment
|
89
|
+
API keys should be set as environment variables, especially for any production deployment.
|
90
|
+
|
91
|
+
The following environment variable will set the global API key:
|
92
|
+
```shell
|
93
|
+
MINDEE_API_KEY=my-api-key
|
94
|
+
```
|
95
|
+
|
96
|
+
Then in your code:
|
97
|
+
```ruby
|
98
|
+
# Init a new client without passing a key; it is read from MINDEE_API_KEY
|
99
|
+
mindee_client = Mindee::Client.new
|
100
|
+
```
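An application may want to fail early with a clear message when no key is available. The following is a minimal sketch of that idea: it prefers an explicitly supplied key, falls back to `MINDEE_API_KEY`, and raises otherwise. `resolve_api_key` is a hypothetical helper, not part of the Mindee gem, and it does not claim to mirror how the library looks up the key internally.

```ruby
# Hypothetical helper: not part of the Mindee gem.
# Returns the explicit key if given, else the MINDEE_API_KEY environment value.
def resolve_api_key(explicit_key = nil, env = ENV)
  key = explicit_key || env['MINDEE_API_KEY']
  if key.nil? || key.strip.empty?
    raise ArgumentError, 'No API key given and MINDEE_API_KEY is not set'
  end

  key
end

# The env argument accepts any Hash-like object, which keeps the helper testable.
puts resolve_api_key(nil, { 'MINDEE_API_KEY' => 'my-api-key' })
```

The result could then be passed to the constructor shown earlier, for example `Mindee::Client.new(api_key: resolve_api_key)`.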
|
101
|
+
|
102
|
+
### Setting the Request Timeout
|
103
|
+
The request timeout can be set using an environment variable:
|
104
|
+
```shell
|
105
|
+
MINDEE_REQUEST_TIMEOUT=200
|
106
|
+
```
|
107
|
+
|
108
|
+
|
109
|
+
## Loading a Document File
|
110
|
+
Before being able to send a document to the API, it must first be loaded.
|
111
|
+
|
112
|
+
You don't need to worry about different MIME types: the library will take care of handling
|
113
|
+
all supported types automatically.
|
114
|
+
|
115
|
+
Once a document is loaded, interacting with it is done in exactly the same way, regardless
|
116
|
+
of how it was loaded.
|
117
|
+
|
118
|
+
There are a few different ways of loading a document file, depending on your use case:
|
119
|
+
|
120
|
+
* [Path](#path)
|
121
|
+
* [File Object](#file-object)
|
122
|
+
* [Base64](#base64)
|
123
|
+
* [Bytes](#bytes)
|
124
|
+
|
125
|
+
### Path
|
126
|
+
Load a file directly from disk. Requires an absolute path, as a string.
|
127
|
+
|
128
|
+
```ruby
|
129
|
+
result = mindee_client.doc_from_path("/path/to/the/invoice.jpg").parse(Mindee::Prediction::InvoiceV4)
|
130
|
+
|
131
|
+
# Print a full summary of the parsed data in RST format
|
132
|
+
puts result
|
133
|
+
```
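Since an absolute path is required, a relative path has to be expanded first. This is a small sketch using Ruby's standard `File.expand_path`; the helper name and the example directories are hypothetical.

```ruby
# Hypothetical helper: expands a possibly relative path against a base directory.
# An already absolute path is returned unchanged.
def absolute_document_path(path, base_dir = Dir.pwd)
  File.expand_path(path, base_dir)
end

puts absolute_document_path('invoices/invoice.jpg', '/srv/app')
puts absolute_document_path('/path/to/the/invoice.jpg', '/srv/app')
```

The expanded string can then be handed to `doc_from_path` as in the example above.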
|
134
|
+
|
135
|
+
### File Object
|
136
|
+
A normal Ruby file object with a path. Must be in binary mode.
|
137
|
+
|
138
|
+
**Note**: The original filename is required when calling the method.
|
139
|
+
|
140
|
+
```ruby
|
141
|
+
result = nil
|
142
|
+
File.open(INVOICE_FILE, 'rb') do |fo|
|
143
|
+
result = mindee_client.doc_from_file(fo, "invoice.jpg").parse(Mindee::Prediction::InvoiceV4)
|
144
|
+
end
|
145
|
+
|
146
|
+
# Print a full summary of the parsed data in RST format
|
147
|
+
puts result
|
148
|
+
```
|
149
|
+
|
150
|
+
### Base64
|
151
|
+
Load file contents from a base64-encoded string.
|
152
|
+
|
153
|
+
**Note**: The original filename is required when calling the method.
|
154
|
+
|
155
|
+
```ruby
|
156
|
+
b64_string = "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLD...."
|
157
|
+
result = mindee_client.doc_from_b64string(b64_string, "receipt.jpg").parse(Mindee::Prediction::ReceiptV4)
|
158
|
+
|
159
|
+
# Print a full summary of the parsed data in RST format
|
160
|
+
puts result
|
161
|
+
```
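If the Base64 string is not supplied by another system, it can be produced from a file with core Ruby. The sketch below reads the file in binary mode and encodes it without line breaks; `file_to_b64` is a hypothetical helper, and the four bytes written to the temporary file are only a stand-in for real image data.

```ruby
require 'tempfile'

# Encodes a file's binary content as a single-line Base64 string.
# 'm0' is the Array#pack directive for Base64 without line breaks.
def file_to_b64(path)
  [File.binread(path)].pack('m0')
end

# Demo with a throwaway file so the snippet runs anywhere.
Tempfile.create(['receipt', '.jpg']) do |file|
  file.binmode
  file.write("\xFF\xD8\xFF\xE0".b) # first bytes of a typical JPEG header
  file.flush
  puts file_to_b64(file.path)
end
```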
|
162
|
+
|
163
|
+
### Bytes
|
164
|
+
Requires raw bytes.
|
165
|
+
|
166
|
+
**Note**: The original filename is required when calling the method.
|
167
|
+
|
168
|
+
```ruby
|
169
|
+
raw_bytes = "%PDF-1.3\n%\xbf\xf7\xa2\xfe\n1 0 ob...".b
|
170
|
+
result = mindee_client.doc_from_bytes(raw_bytes, "invoice.pdf").parse(Mindee::Prediction::InvoiceV4)
|
171
|
+
|
172
|
+
# Print a full summary of the parsed data in RST format
|
173
|
+
puts result
|
174
|
+
```
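In Ruby, raw bytes are held in a `String` with binary encoding, which is what `String#b` and `File.binread` return. The sketch below shows a sanity check that might be run before sending data; `pdf_bytes?` is a hypothetical helper and is not something the Mindee library requires.

```ruby
# Hypothetical helper: true when the string is binary-encoded and starts
# with the PDF magic bytes.
def pdf_bytes?(data)
  data.encoding == Encoding::BINARY && data.start_with?('%PDF-'.b)
end

raw_bytes = "%PDF-1.3\n%\xBF\xF7\xA2\xFE\n".b
puts pdf_bytes?(raw_bytes)

# A regular (UTF-8) string literal is not considered raw bytes by this check.
puts pdf_bytes?("%PDF-1.3\n")
```

Reading a file with `File.binread(path)` yields a binary-encoded string directly.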
|
175
|
+
|
176
|
+
## Sending a File
|
177
|
+
To send a file to the API, we need to specify how to process the document.
|
178
|
+
This will determine which API endpoint is used and how the API return will be handled internally by the library.
|
179
|
+
|
180
|
+
More specifically, we need to set a `Mindee::Prediction` class as the first parameter of the `parse` method.
|
181
|
+
|
182
|
+
This is because the `parse` method's return type depends on its first argument.
|
183
|
+
|
184
|
+
Each document type available in the library has a corresponding class, which inherits from the base `Mindee::Prediction` class.
|
185
|
+
This is detailed in each document-specific guide.
|
186
|
+
|
187
|
+
### Off-the-Shelf Documents
|
188
|
+
Simply setting the correct class is enough:
|
189
|
+
```ruby
|
190
|
+
result = doc.parse(Mindee::Prediction::InvoiceV4)
|
191
|
+
```
|
192
|
+
|
193
|
+
### Custom Documents
|
194
|
+
The endpoint to use must also be set; this is done in the second argument of the `parse` method:
|
195
|
+
```ruby
|
196
|
+
result = doc.parse(Mindee::Prediction::CustomV1, endpoint_name: 'wnine')
|
197
|
+
```
|
198
|
+
|
199
|
+
This is because the `CustomV1` class is enough to handle the return processing, but the actual endpoint needs to be specified.
|
200
|
+
|
201
|
+
## Process the Result
|
202
|
+
The response object is common to all documents, including custom documents. The main properties are:
|
203
|
+
|
204
|
+
* `id` — Mindee ID of the document
|
205
|
+
* `name` — Filename sent to the API
|
206
|
+
* `inference` — [Inference](#inference)
|
207
|
+
|
208
|
+
### Inference
|
209
|
+
Groups the page-level predictions together with the predictions for the entire document.
|
210
|
+
|
211
|
+
* `prediction` — [Document level prediction](#document-level-prediction)
|
212
|
+
* `pages` — [Page level prediction](#page-level-prediction)
|
213
|
+
|
214
|
+
#### Document level prediction
|
215
|
+
The `prediction` attribute is a `Prediction` object specific to the type of document being processed.
|
216
|
+
It contains the data extracted from the entire document, all pages combined.
|
217
|
+
|
218
|
+
It's possible to have the same field in various pages, but at the document level,
|
219
|
+
only the highest confidence field data will be shown (this is all done automatically at the API level).
|
220
|
+
|
221
|
+
```ruby
|
222
|
+
# as an object, complete
|
223
|
+
pp result.inference.prediction
|
224
|
+
|
225
|
+
# as a string, summary in RST format
|
226
|
+
puts result.inference.prediction
|
227
|
+
```
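To make the "highest confidence wins" idea concrete, here is a sketch of picking one value among per-page candidates. It only illustrates the principle: the actual merging happens on the API side and its exact rules are not described here. `PageField` and `best_candidate` are hypothetical names, and the candidate values are made up.

```ruby
# Hypothetical stand-in for a field read on one page; not a Mindee class.
PageField = Struct.new(:page_id, :value, :confidence)

# Returns the candidate with the highest confidence, ignoring empty values.
# Returns nil when no candidate has a value.
def best_candidate(candidates)
  candidates.reject { |c| c.value.nil? }.max_by(&:confidence)
end

CANDIDATES = [
  PageField.new(0, 'INV-14', 0.62),
  PageField.new(1, '14', 0.97),
  PageField.new(2, nil, 0.99)
].freeze

best = best_candidate(CANDIDATES)
puts "page #{best.page_id}: #{best.value}"
```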
|
228
|
+
|
229
|
+
#### Page level prediction
|
230
|
+
The `pages` attribute is a list of page objects, each exposing a `prediction` attribute.
|
231
|
+
|
232
|
+
Each page element contains the data extracted for a particular page of the document.
|
233
|
+
The order of the elements in the array matches the order of the pages in the document.
|
234
|
+
|
235
|
+
All response objects have this property, regardless of the number of pages.
|
236
|
+
Single page documents will have a single entry.
|
237
|
+
|
238
|
+
Iteration is done like any Ruby array:
|
239
|
+
```ruby
|
240
|
+
result.inference.pages.each do |page|
|
241
|
+
# as an object, complete
|
242
|
+
pp page.prediction
|
243
|
+
|
244
|
+
# as a string, summary in RST format
|
245
|
+
puts page.prediction
|
246
|
+
end
|
247
|
+
```
|
248
|
+
|
249
|
+
#### Page Orientation
|
250
|
+
The orientation field is only available at the page level as it describes whether the page image should be rotated to be upright.
|
251
|
+
|
252
|
+
If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:
|
253
|
+
|
254
|
+
* 0 degrees: the page is already upright
|
255
|
+
* 90 degrees: the page must be rotated clockwise to be upright
|
256
|
+
* 270 degrees: the page must be rotated counterclockwise to be upright
|
257
|
+
|
258
|
+
```ruby
|
259
|
+
result.inference.pages.each do |page|
|
260
|
+
puts page.orientation.value
|
261
|
+
end
|
262
|
+
```
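The three documented values can be turned into an explicit instruction for whatever displays or rotates the page. This is a minimal sketch; `rotation_instruction` and the wording of the messages are hypothetical and not part of the library.

```ruby
# The three orientation values documented above, mapped to a description.
ROTATION_INSTRUCTIONS = {
  0 => 'already upright',
  90 => 'rotate clockwise to be upright',
  270 => 'rotate counterclockwise to be upright'
}.freeze

# Hypothetical helper: raises on any value outside the documented set.
def rotation_instruction(degrees)
  ROTATION_INSTRUCTIONS.fetch(degrees) do
    raise ArgumentError, "unexpected orientation value: #{degrees.inspect}"
  end
end

[0, 90, 270].each { |degrees| puts "#{degrees}: #{rotation_instruction(degrees)}" }
```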
|
263
|
+
|
264
|
+
## Questions?
|
265
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
@@ -0,0 +1,261 @@
|
|
1
|
+
The Ruby OCR SDK supports the [invoice API](https://developers.mindee.com/docs/invoice-ocr) for extracting data from invoices.
|
2
|
+
|
3
|
+
Using the sample below, we are going to illustrate how to extract the data that we want using the OCR SDK.
|
4
|
+
|
5
|
+

|
6
|
+
|
7
|
+
## Quick Start
|
8
|
+
```ruby
|
9
|
+
require 'mindee'
|
10
|
+
|
11
|
+
# Init a new client, specifying an API key
|
12
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
13
|
+
|
14
|
+
# Send the file
|
15
|
+
result = mindee_client.doc_from_path('/path/to/the/file.ext').parse(Mindee::Prediction::InvoiceV4)
|
16
|
+
|
17
|
+
# Print a summary of the document prediction in RST format
|
18
|
+
puts result.inference.prediction
|
19
|
+
```
|
20
|
+
|
21
|
+
Output:
|
22
|
+
```shell
|
23
|
+
:Locale: en; en; CAD;
|
24
|
+
:Document type: INVOICE
|
25
|
+
:Invoice number: 14
|
26
|
+
:Reference numbers: AD29094
|
27
|
+
:Invoice date: 2018-09-25
|
28
|
+
:Invoice due date: 2018-09-25
|
29
|
+
:Supplier name: TURNPIKE DESIGNS CO.
|
30
|
+
:Supplier address: 156 University Ave, Toronto ON, Canada M5H 2H7
|
31
|
+
:Supplier company registrations:
|
32
|
+
:Supplier payment details:
|
33
|
+
:Customer name: JIRO DOI
|
34
|
+
:Customer address: 1954 Bloor Street West Toronto, ON, M6P 3K9 Canada
|
35
|
+
:Customer company registrations:
|
36
|
+
:Taxes: 193.20 8.00%
|
37
|
+
:Total net: 2415.00
|
38
|
+
:Total taxes: 193.20
|
39
|
+
:Total amount: 2608.20
|
40
|
+
|
41
|
+
:Line Items:
|
42
|
+
====================== ======== ========= ========== ================== ====================================
|
43
|
+
Code QTY Price Amount Tax (Rate) Description
|
44
|
+
====================== ======== ========= ========== ================== ====================================
|
45
|
+
1.00 65.00 65.00 Platinum web hosting package Down...
|
46
|
+
3.00 2100.00 2100.00 2 page website design Includes ba...
|
47
|
+
1.00 250.00 250.00 Mobile designs Includes responsiv...
|
48
|
+
====================== ======== ========= ========== ================== ====================================
|
49
|
+
```
|
50
|
+
|
51
|
+
**Note:** Line item descriptions are truncated here only for display purposes.
|
52
|
+
The full text is available in the [details](#line-items).
|
53
|
+
|
54
|
+
## Fields
|
55
|
+
Each prediction object contains a set of different fields.
|
56
|
+
Each `Field` object contains at a minimum the following attributes:
|
57
|
+
|
58
|
+
* `value` (String or Float depending on the field type): corresponds to the field value. Can be `nil` if no value was extracted.
|
59
|
+
* `confidence` (Float): the confidence score of the field prediction.
|
60
|
+
* `bounding_box` (Array< Array< Float > >): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
|
61
|
+
* `polygon` (Array< Array< Float > >): contains the relative vertex coordinates (points) of a polygon containing the field in the image.
|
62
|
+
* `reconstructed` (Boolean): True if the field was reconstructed or computed using other fields.
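Because `bounding_box` and `polygon` hold relative coordinates, drawing them on an image requires scaling by the image size. The sketch below assumes each point is an `[x, y]` pair in the range 0.0 to 1.0, which is an assumption about the layout rather than something stated above; `to_pixels` and the sample box are hypothetical.

```ruby
# Converts relative [x, y] points (assumed to lie in 0.0..1.0)
# to integer pixel coordinates for an image of the given size.
def to_pixels(points, image_width, image_height)
  points.map { |x, y| [(x * image_width).round, (y * image_height).round] }
end

# Made-up rectangle, in the same Array< Array< Float > > shape as bounding_box.
SAMPLE_BOX = [[0.1, 0.2], [0.5, 0.2], [0.5, 0.25], [0.1, 0.25]].freeze

to_pixels(SAMPLE_BOX, 1000, 2000).each { |x, y| puts "(#{x}, #{y})" }
```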
|
63
|
+
|
64
|
+
|
65
|
+
## Attributes
|
66
|
+
Depending on the field type, there might be additional attributes that will be extracted in the `Invoice` object.
|
67
|
+
|
68
|
+
Using the above sample, the following are the basic fields that can be extracted:
|
69
|
+
|
70
|
+
- [Orientation](#orientation)
|
71
|
+
- [Customer Information](#customer-information)
|
72
|
+
- [Dates](#dates)
|
73
|
+
- [Locale and Currency](#locale)
|
74
|
+
- [Payment Information](#payment-information)
|
75
|
+
- [Supplier Information](#supplier-information)
|
76
|
+
- [Taxes](#taxes)
|
77
|
+
- [Totals](#totals)
|
78
|
+
- [Line Items](#line-items)
|
79
|
+
|
80
|
+
|
81
|
+
### Customer Information
|
82
|
+
**`customer_name`** (Field): Customer's name
|
83
|
+
|
84
|
+
```ruby
|
85
|
+
puts result.inference.prediction.customer_name.value
|
86
|
+
```
|
87
|
+
|
88
|
+
**`customer_address`** (Field): Customer's postal address
|
89
|
+
|
90
|
+
```ruby
|
91
|
+
puts result.inference.prediction.customer_address.value
|
92
|
+
```
|
93
|
+
|
94
|
+
**`customer_company_registrations`** (Array< CompanyRegistration >): Customer's company registrations
|
95
|
+
|
96
|
+
```ruby
|
97
|
+
result.inference.prediction.customer_company_registrations.each do |registration|
|
98
|
+
puts registration
|
99
|
+
end
|
100
|
+
```
|
101
|
+
|
102
|
+
### Dates
|
103
|
+
Date fields:
|
104
|
+
|
105
|
+
* contain the `date_object` attribute, which is a standard Ruby [date object](https://ruby-doc.org/stdlib-2.7.1/libdoc/date/rdoc/Date.html)
|
106
|
+
* have a `value` attribute which is the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) representation of the date.
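When only the `value` string has been kept (for example after storing it as text), it can be turned back into a `Date` with the standard library. This sketch uses the invoice date from the sample output; `parse_iso_date` is a hypothetical helper that also tolerates a missing value.

```ruby
require 'date'

# Parses an ISO 8601 date string; returns nil when no value was extracted.
def parse_iso_date(value)
  value.nil? ? nil : Date.iso8601(value)
end

invoice_date = parse_iso_date('2018-09-25') # invoice date from the sample output
puts invoice_date.year
puts invoice_date.strftime('%d/%m/%Y')
```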
|
107
|
+
|
108
|
+
The following date fields are available:
|
109
|
+
|
110
|
+
**`date`**: Date the invoice was issued
|
111
|
+
|
112
|
+
```ruby
|
113
|
+
puts result.inference.prediction.date.value
|
114
|
+
```
|
115
|
+
|
116
|
+
**`due_date`**: Payment due date of the invoice.
|
117
|
+
|
118
|
+
```ruby
|
119
|
+
puts result.inference.prediction.due_date.value
|
120
|
+
```
|
121
|
+
|
122
|
+
### Locale
|
123
|
+
**`locale`** (Locale): Locale information.
|
124
|
+
|
125
|
+
* `locale.language` (String): Language code in [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1) format as seen on the document.
|
126
|
+
```ruby
|
127
|
+
puts result.inference.prediction.locale.language
|
128
|
+
```
|
129
|
+
|
130
|
+
* `locale.currency` (String): Currency code in [ISO 4217](https://en.wikipedia.org/wiki/ISO_4217) format as seen on the document.
|
131
|
+
```ruby
|
132
|
+
puts result.inference.prediction.locale.currency
|
133
|
+
```
|
134
|
+
|
135
|
+
* `locale.country` (String): Country code in [ISO 3166-1](https://en.wikipedia.org/wiki/ISO_3166-1) alpha-2 format as seen on the document.
|
136
|
+
```ruby
|
137
|
+
puts result.inference.prediction.locale.country
|
138
|
+
```
|
139
|
+
|
140
|
+
### Supplier Information
|
141
|
+
|
142
|
+
**`supplier_name`**: Supplier name as written in the invoice (logo or supplier Info).
|
143
|
+
|
144
|
+
```ruby
|
145
|
+
puts result.inference.prediction.supplier_name.value
|
146
|
+
```
|
147
|
+
|
148
|
+
**`supplier_address`**: Supplier address as written in the invoice.
|
149
|
+
|
150
|
+
```ruby
|
151
|
+
puts result.inference.prediction.supplier_address.value
|
152
|
+
```
|
153
|
+
|
154
|
+
**`supplier_payment_details`** (Array< PaymentDetails >): List of the invoice's supplier payment details.
|
155
|
+
Each object in the list contains extra attributes:
|
156
|
+
|
157
|
+
* `iban` (String)
|
158
|
+
```ruby
|
159
|
+
# Show the IBAN of the first payment
|
160
|
+
puts result.inference.prediction.supplier_payment_details[0].iban
|
161
|
+
```
|
162
|
+
|
163
|
+
* `swift` (String)
|
164
|
+
```ruby
|
165
|
+
# Show the SWIFT of the first payment
|
166
|
+
puts result.inference.prediction.supplier_payment_details[0].swift
|
167
|
+
```
|
168
|
+
|
169
|
+
* `routing_number` (String)
|
170
|
+
```ruby
|
171
|
+
# Show the routing number of the first payment
|
172
|
+
puts result.inference.prediction.supplier_payment_details[0].routing_number
|
173
|
+
```
|
174
|
+
|
175
|
+
* `account_number` (String)
|
176
|
+
```ruby
|
177
|
+
# Show the account number of the first payment
|
178
|
+
puts result.inference.prediction.supplier_payment_details[0].account_number
|
179
|
+
```
|
180
|
+
|
181
|
+
**`supplier_company_registrations`** (Array< CompanyRegistration >):
|
182
|
+
List of detected supplier's company registration numbers.
|
183
|
+
Each object in the list contains an extra attribute:
|
184
|
+
|
185
|
+
* `type` (String): Type of company registration number among predefined categories.
|
186
|
+
```ruby
|
187
|
+
# Show the type of the first registration
|
188
|
+
puts result.inference.prediction.supplier_company_registrations[0].type
|
189
|
+
```
|
190
|
+
|
191
|
+
* `value` (String): Value of the company identifier
|
192
|
+
```ruby
|
193
|
+
# Show the value of the first registration
|
194
|
+
puts result.inference.prediction.supplier_company_registrations[0].value
|
195
|
+
```
|
196
|
+
|
197
|
+
### Taxes
|
198
|
+
**`taxes`** (Array< TaxField >): Contains tax fields as seen on the invoice.
|
199
|
+
|
200
|
+
* `value` (Float): The tax amount.
|
201
|
+
```ruby
|
202
|
+
# Show the amount of the first tax
|
203
|
+
puts result.inference.prediction.taxes[0].value
|
204
|
+
```
|
205
|
+
|
206
|
+
* `code` (String): The tax code (for example HST or GST for Canada, City Tax or State Tax for the US).
|
207
|
+
```ruby
|
208
|
+
# Show the code of the first tax
|
209
|
+
puts result.inference.prediction.taxes[0].code
|
210
|
+
```
|
211
|
+
|
212
|
+
* `rate` (Float): The tax rate.
|
213
|
+
```ruby
|
214
|
+
# Show the rate of the first tax
|
215
|
+
puts result.inference.prediction.taxes[0].rate
|
216
|
+
```
|
217
|
+
|
218
|
+
### Totals
|
219
|
+
|
220
|
+
**`total_amount`** (Field): Total amount including taxes.
|
221
|
+
|
222
|
+
```ruby
|
223
|
+
puts result.inference.prediction.total_amount.value
|
224
|
+
```
|
225
|
+
|
226
|
+
**`total_net`** (Field): Total amount excluding taxes.
|
227
|
+
|
228
|
+
```ruby
|
229
|
+
puts result.inference.prediction.total_net.value
|
230
|
+
```
|
231
|
+
|
232
|
+
**`total_tax`** (Field): Total tax value from tax lines.
|
233
|
+
|
234
|
+
```ruby
|
235
|
+
puts result.inference.prediction.total_tax.value
|
236
|
+
```
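A common sanity check on extracted totals is that net plus tax equals the total amount. The sketch below applies it to the figures from the sample output, with a small tolerance because the values are floats. `totals_consistent?` and the tolerance are hypothetical choices, not library behavior.

```ruby
# Checks that net + tax matches the total, within a rounding tolerance.
# Returns false when any of the three values is missing.
def totals_consistent?(total_net, total_tax, total_amount, tolerance = 0.005)
  return false if [total_net, total_tax, total_amount].any?(&:nil?)

  ((total_net + total_tax) - total_amount).abs <= tolerance
end

# Values from the sample output: 2415.00 net, 193.20 tax, 2608.20 total.
puts totals_consistent?(2415.00, 193.20, 2608.20)
```

The three arguments would typically come from `total_net.value`, `total_tax.value` and `total_amount.value`, any of which can be `nil`.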
|
237
|
+
|
238
|
+
### Line items
|
239
|
+
|
240
|
+
**`line_items`** (Array<InvoiceLineItem>): Line items details.
|
241
|
+
Each object in the list contains:
|
242
|
+
|
243
|
+
* `product_code` (String)
|
244
|
+
* `description` (String)
|
245
|
+
* `quantity` (Float)
|
246
|
+
* `unit_price` (Float)
|
247
|
+
* `total_amount` (Float)
|
248
|
+
* `tax_rate` (Float)
|
249
|
+
* `tax_amount` (Float)
|
250
|
+
* `confidence` (Float)
|
251
|
+
* `page_id` (Integer)
|
252
|
+
* `polygon` (Polygon)
|
253
|
+
|
254
|
+
```ruby
|
255
|
+
result.inference.prediction.line_items.each do |line_item|
|
256
|
+
pp line_item
|
257
|
+
end
|
258
|
+
```
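Line items can also be cross-checked against the totals. In the sample output, the three line amounts add up to the net total of 2415.00. The sketch below reproduces that sum with a stand-in struct, so it runs without the library; `LineItem` here is hypothetical and only carries two of the attributes listed above.

```ruby
# Hypothetical stand-in exposing two of the documented attributes.
LineItem = Struct.new(:description, :total_amount)

# Sums the line amounts, treating a missing amount as zero.
def line_items_total(line_items)
  line_items.sum { |item| item.total_amount || 0.0 }
end

# Amounts from the sample output; descriptions are truncated there as well.
SAMPLE_ITEMS = [
  LineItem.new('Platinum web hosting package Down...', 65.00),
  LineItem.new('2 page website design Includes ba...', 2100.00),
  LineItem.new('Mobile designs Includes responsiv...', 250.00)
].freeze

puts line_items_total(SAMPLE_ITEMS)
```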
|
259
|
+
|
260
|
+
## Questions?
|
261
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|
@@ -0,0 +1,156 @@
|
|
1
|
+
The Ruby OCR SDK supports the [passport API](https://developers.mindee.com/docs/passport-ocr) for extracting data from passports.
|
2
|
+
|
3
|
+
Using the sample below, we are going to illustrate how to extract the data that we want using the OCR SDK.
|
4
|
+
|
5
|
+

|
6
|
+
|
7
|
+
## Quick Start
|
8
|
+
```ruby
|
9
|
+
require 'mindee'
|
10
|
+
|
11
|
+
# Init a new client, specifying an API key
|
12
|
+
mindee_client = Mindee::Client.new(api_key: 'my-api-key')
|
13
|
+
|
14
|
+
# Send the file
|
15
|
+
result = mindee_client.doc_from_path('/path/to/the/file.ext').parse(Mindee::Prediction::PassportV1)
|
16
|
+
|
17
|
+
# Print a summary of the document prediction in RST format
|
18
|
+
puts result.inference.prediction
|
19
|
+
```
|
20
|
+
|
21
|
+
Output:
|
22
|
+
```shell
|
23
|
+
:Full name: HENERT PUDARSAN
|
24
|
+
:Given names: HENERT
|
25
|
+
:Surname: PUDARSAN
|
26
|
+
:Country: GBR
|
27
|
+
:ID Number: 707797979
|
28
|
+
:Issuance date: 2012-04-22
|
29
|
+
:Birth date: 1995-05-20
|
30
|
+
:Expiry date: 2017-04-22
|
31
|
+
:MRZ 1: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<
|
32
|
+
:MRZ 2: 7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
|
33
|
+
:MRZ: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
|
34
|
+
```
|
35
|
+
|
36
|
+
## Fields
|
37
|
+
Each prediction object contains a set of different fields.
|
38
|
+
Each `Field` object contains at a minimum the following attributes:
|
39
|
+
|
40
|
+
* `value` (String or Float depending on the field type): corresponds to the field value. Can be `nil` if no value was extracted.
|
41
|
+
* `confidence` (Float): the confidence score of the field prediction.
|
42
|
+
* `bounding_box` (Array< Array< Float > >): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
|
43
|
+
* `polygon` (Array< Array< Float > >): contains the relative vertex coordinates (points) of a polygon containing the field in the image.
|
44
|
+
* `reconstructed` (Boolean): True if the field was reconstructed or computed using other fields.
|
45
|
+
|
46
|
+
|
47
|
+
## Attributes
|
48
|
+
Depending on the field type specified, additional attributes can be extracted from the `Passport` object.
|
49
|
+
|
50
|
+
Using the above sample, the following are the basic fields that can be extracted:
|
51
|
+
|
52
|
+
- [Orientation](#orientation)
|
53
|
+
- [Birth Place](#birth-place)
|
54
|
+
- [Country](#country)
|
55
|
+
- [Dates (Expiry, Issuance, Birth)](#dates)
|
56
|
+
- [Gender](#gender)
|
57
|
+
- [Given Names](#names)
|
58
|
+
- [ID Number](#id)
|
59
|
+
- [Machine Readable Zone](#machine-readable-zone)
|
60
|
+
- [Surname](#names)
|
61
|
+
|
62
|
+
### Birth Place
|
63
|
+
|
64
|
+
**`birth_place`** (Field): Passport owner's birthplace.
|
65
|
+
|
66
|
+
```ruby
|
67
|
+
puts result.inference.prediction.birth_place.value
|
68
|
+
```
|
69
|
+
|
70
|
+
### Country
|
71
|
+
**`country`** (Field): Passport country in [ISO 3166-1 alpha-3 code format](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3) (3-letter code).
|
72
|
+
|
73
|
+
```ruby
|
74
|
+
puts result.inference.prediction.country.value
|
75
|
+
```
|
76
|
+
|
77
|
+
### Dates
|
78
|
+
Date fields:
|
79
|
+
|
80
|
+
* contain the `date_object` attribute, which is a standard Ruby [date object](https://ruby-doc.org/stdlib-2.7.1/libdoc/date/rdoc/Date.html)
|
81
|
+
* have a `value` attribute which is the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) representation of the date.
|
82
|
+
|
83
|
+
The following date fields are available:
|
84
|
+
|
85
|
+
**`expiry_date`**: Passport expiry date.
|
86
|
+
|
87
|
+
```ruby
|
88
|
+
puts result.inference.prediction.expiry_date.value
|
89
|
+
```
|
90
|
+
|
91
|
+
**`issuance_date`**: Passport date of issuance.
|
92
|
+
|
93
|
+
```ruby
|
94
|
+
puts result.inference.prediction.issuance_date.value
|
95
|
+
```
|
96
|
+
|
97
|
+
**`birth_date`**: Passport owner's date of birth.
|
98
|
+
|
99
|
+
```ruby
|
100
|
+
puts result.inference.prediction.birth_date.value
|
101
|
+
```
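A typical use of the expiry date is checking whether the passport is still valid. This sketch works on the ISO 8601 `value` string and takes the reference date as a parameter so the result is reproducible; `expired?` is a hypothetical helper.

```ruby
require 'date'

# Returns true when the expiry date is strictly before the reference date,
# false otherwise, and nil when no expiry date was extracted.
def expired?(expiry_value, today = Date.today)
  return nil if expiry_value.nil?

  Date.iso8601(expiry_value) < today
end

# Expiry date from the sample output, checked against a fixed reference date.
puts expired?('2017-04-22', Date.new(2020, 1, 1))
```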
|
102
|
+
|
103
|
+
### Gender
|
104
|
+
|
105
|
+
**`gender`** (Field): Passport owner's gender (M / F).
|
106
|
+
|
107
|
+
```ruby
|
108
|
+
puts result.inference.prediction.gender.value
|
109
|
+
```
|
110
|
+
|
111
|
+
### Names
|
112
|
+
|
113
|
+
**`given_names`** (Array< Field >): List of passport owner's given names.
|
114
|
+
|
115
|
+
```ruby
|
116
|
+
result.inference.prediction.given_names.each do |name|
|
117
|
+
puts name
|
118
|
+
end
|
119
|
+
```
|
120
|
+
|
121
|
+
**`surname`** (Field): Passport owner's surname.
|
122
|
+
|
123
|
+
```ruby
|
124
|
+
puts result.inference.prediction.surname.value
|
125
|
+
```
|
126
|
+
|
127
|
+
### ID
|
128
|
+
|
129
|
+
**`id_number`** (Field): Passport identification number.
|
130
|
+
|
131
|
+
```ruby
|
132
|
+
puts result.inference.prediction.id_number.value
|
133
|
+
```
|
134
|
+
|
135
|
+
### Machine-Readable Zone
|
136
|
+
|
137
|
+
**`mrz1`** (Field): First line of the passport's machine-readable zone.
|
138
|
+
|
139
|
+
```ruby
|
140
|
+
puts result.inference.prediction.mrz1.value
|
141
|
+
```
|
142
|
+
|
143
|
+
**`mrz2`** (Field): Second line of the passport's machine-readable zone.
|
144
|
+
|
145
|
+
```ruby
|
146
|
+
puts result.inference.prediction.mrz2.value
|
147
|
+
```
|
148
|
+
|
149
|
+
**`mrz`** (Field): Full machine-readable zone, reconstructed from `mrz1` and `mrz2`.
|
150
|
+
|
151
|
+
```ruby
|
152
|
+
puts result.inference.prediction.mrz.value
|
153
|
+
```
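The second MRZ line carries check digits that can be used to validate an extraction. The sketch below implements the ICAO Doc 9303 check-digit scheme (repeating weights 7, 3, 1, result modulo 10) and applies it to the second line of the sample output. The field offsets assume the two-line passport layout, and the helper names are hypothetical; this validation is not described as a feature of the Mindee library.

```ruby
MRZ_WEIGHTS = [7, 3, 1].freeze

# Numeric value of one MRZ character: digits as-is, A-Z as 10-35, '<' as 0.
def mrz_char_value(char)
  case char
  when '0'..'9' then char.ord - '0'.ord
  when 'A'..'Z' then char.ord - 'A'.ord + 10
  when '<' then 0
  else raise ArgumentError, "invalid MRZ character: #{char.inspect}"
  end
end

# Weighted sum of the character values, modulo 10.
def mrz_check_digit(field)
  weighted = field.chars.each_with_index.map do |char, index|
    mrz_char_value(char) * MRZ_WEIGHTS[index % 3]
  end
  weighted.sum % 10
end

# Second MRZ line from the sample output.
SAMPLE_MRZ2 = '7077979792GBR9505209M1704224<<<<<<<<<<<<<<00'

# [label, field start, field length, position of its check digit]
[['ID number', 0, 9, 9], ['Birth date', 13, 6, 19], ['Expiry date', 21, 6, 27]].each do |label, start, length, check_pos|
  field = SAMPLE_MRZ2[start, length]
  valid = mrz_check_digit(field) == SAMPLE_MRZ2[check_pos].to_i
  puts "#{label}: #{field} valid=#{valid}"
end
```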
|
154
|
+
|
155
|
+
## Questions?
|
156
|
+
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-1jv6nawjq-FDgFcF2T5CmMmRpl9LLptw)
|