cloudmersive-ocr-api-client 1.2.9
- checksums.yaml +7 -0
- data/Gemfile +7 -0
- data/README.md +111 -0
- data/Rakefile +8 -0
- data/cloudmersive-ocr-api-client.gemspec +45 -0
- data/docs/ImageOcrApi.md +128 -0
- data/docs/ImageToTextResponse.md +9 -0
- data/docs/OcrPageResult.md +10 -0
- data/docs/PdfOcrApi.md +67 -0
- data/docs/PdfToTextResponse.md +9 -0
- data/docs/PreprocessingApi.md +118 -0
- data/git_push.sh +55 -0
- data/lib/cloudmersive-ocr-api-client.rb +45 -0
- data/lib/cloudmersive-ocr-api-client/api/image_ocr_api.rb +144 -0
- data/lib/cloudmersive-ocr-api-client/api/pdf_ocr_api.rb +85 -0
- data/lib/cloudmersive-ocr-api-client/api/preprocessing_api.rb +135 -0
- data/lib/cloudmersive-ocr-api-client/api_client.rb +389 -0
- data/lib/cloudmersive-ocr-api-client/api_error.rb +38 -0
- data/lib/cloudmersive-ocr-api-client/configuration.rb +209 -0
- data/lib/cloudmersive-ocr-api-client/models/image_to_text_response.rb +199 -0
- data/lib/cloudmersive-ocr-api-client/models/ocr_page_result.rb +209 -0
- data/lib/cloudmersive-ocr-api-client/models/pdf_to_text_response.rb +199 -0
- data/lib/cloudmersive-ocr-api-client/version.rb +15 -0
- data/spec/api/image_ocr_api_spec.rb +62 -0
- data/spec/api/pdf_ocr_api_spec.rb +49 -0
- data/spec/api/preprocessing_api_spec.rb +59 -0
- data/spec/api_client_spec.rb +226 -0
- data/spec/configuration_spec.rb +42 -0
- data/spec/models/image_to_text_response_spec.rb +48 -0
- data/spec/models/ocr_page_result_spec.rb +54 -0
- data/spec/models/pdf_to_text_response_spec.rb +48 -0
- data/spec/spec_helper.rb +111 -0
- metadata +255 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 85c1038fa352c0cdbe9fb398c454ee32dc127a205b1b4c5592a33295684c12b8
  data.tar.gz: c591c0122cb4524ece02f29940e3904c918056862ff51e9b7e199f19e8edd4c0
SHA512:
  metadata.gz: 54fe2313a8c568122ad9431f404c09e850718cf8122e22a9ce5aebbca42b1f705a857acaeb899e697c9c29bdc6d39cb060ee0f2a4aa91820f1bfd49f777e1193
  data.tar.gz: 11805dc8dc8f85845849320479c9745a91ce57f06108262a00e3fe46fe2eb916cf9ed06a95e0f6675f8fcf680a0abd1ef5b27e04dc36302f9b8b2e182c83ef1d
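The SHA256 entries in checksums.yaml can be compared against files taken from a downloaded gem. The following is a minimal sketch, assuming the `.gem` archive has already been unpacked so that `metadata.gz` and `data.tar.gz` are available on disk; the `checksum_matches?` helper is hypothetical and not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: compare a file's SHA256 digest with an expected hex string.
# The comparison is case-insensitive on the expected value.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

For example, `checksum_matches?('metadata.gz', '85c1038fa352c0cdbe9fb398c454ee32dc127a205b1b4c5592a33295684c12b8')` would be expected to return true for an untampered copy of this release.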
data/Gemfile
ADDED
data/README.md
ADDED
@@ -0,0 +1,111 @@
# cloudmersive-ocr-api-client

CloudmersiveOcrApiClient - the Ruby gem for the ocrapi

The powerful Optical Character Recognition (OCR) APIs let you convert scanned images of pages into recognized text.

This SDK is automatically generated by the [Swagger Codegen](https://github.com/swagger-api/swagger-codegen) project:

- API version: v1
- Package version: 1.2.9
- Build package: io.swagger.codegen.languages.RubyClientCodegen

## Installation

### Build a gem

To build the Ruby code into a gem:

```shell
gem build cloudmersive-ocr-api-client.gemspec
```

Then either install the gem locally:

```shell
gem install ./cloudmersive-ocr-api-client-1.2.9.gem
```
(for development, run `gem install --dev ./cloudmersive-ocr-api-client-1.2.9.gem` to install the development dependencies)

or publish the gem to a gem hosting service, e.g. [RubyGems](https://rubygems.org/).

Finally add this to the Gemfile:

    gem 'cloudmersive-ocr-api-client', '~> 1.2.9'

### Install from Git

If the Ruby gem is hosted at a git repository: https://github.com/GIT_USER_ID/GIT_REPO_ID, then add the following in the Gemfile:

    gem 'cloudmersive-ocr-api-client', :git => 'https://github.com/GIT_USER_ID/GIT_REPO_ID.git'

### Include the Ruby code directly

Include the Ruby code directly using `-I` as follows:

```shell
ruby -Ilib script.rb
```

## Getting Started

Please follow the [installation](#installation) procedure and then run the following code:
```ruby
# Load the gem
require 'cloudmersive-ocr-api-client'

# Setup authorization
CloudmersiveOcrApiClient.configure do |config|
  # Configure API key authorization: Apikey
  config.api_key['Apikey'] = 'YOUR API KEY'
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
  #config.api_key_prefix['Apikey'] = 'Bearer'
end

api_instance = CloudmersiveOcrApiClient::ImageOcrApi.new

image_file = File.new("/path/to/file.txt") # File | Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

opts = {
language: "language_example" # String | Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)
}

begin
  #Convert a photo of a document into text
  result = api_instance.image_ocr_photo_to_text(image_file, opts)
  p result
rescue CloudmersiveOcrApiClient::ApiError => e
  puts "Exception when calling ImageOcrApi->image_ocr_photo_to_text: #{e}"
end

```

## Documentation for API Endpoints

All URIs are relative to *https://api.cloudmersive.com*

Class | Method | HTTP request | Description
------------ | ------------- | ------------- | -------------
*CloudmersiveOcrApiClient::ImageOcrApi* | [**image_ocr_photo_to_text**](docs/ImageOcrApi.md#image_ocr_photo_to_text) | **POST** /ocr/photo/toText | Convert a photo of a document into text
*CloudmersiveOcrApiClient::ImageOcrApi* | [**image_ocr_post**](docs/ImageOcrApi.md#image_ocr_post) | **POST** /ocr/image/toText | Convert a scanned image into text
*CloudmersiveOcrApiClient::PdfOcrApi* | [**pdf_ocr_post**](docs/PdfOcrApi.md#pdf_ocr_post) | **POST** /ocr/pdf/toText | Converts an uploaded image in common formats such as JPEG, PNG into text via Optical Character Recognition.
*CloudmersiveOcrApiClient::PreprocessingApi* | [**preprocessing_unrotate**](docs/PreprocessingApi.md#preprocessing_unrotate) | **POST** /ocr/preprocessing/image/unrotate | Detect and unrotate a document image
*CloudmersiveOcrApiClient::PreprocessingApi* | [**preprocessing_unskew**](docs/PreprocessingApi.md#preprocessing_unskew) | **POST** /ocr/preprocessing/image/unskew | Detect and unskew a photo of a document


## Documentation for Models

- [CloudmersiveOcrApiClient::ImageToTextResponse](docs/ImageToTextResponse.md)
- [CloudmersiveOcrApiClient::OcrPageResult](docs/OcrPageResult.md)
- [CloudmersiveOcrApiClient::PdfToTextResponse](docs/PdfToTextResponse.md)


## Documentation for Authorization


### Apikey

- **Type**: API key
- **API key parameter name**: Apikey
- **Location**: HTTP header
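The Getting Started example in the README opens the image with `File.new` and never closes it. The sketch below shows one way to wrap the call so the file handle is released afterwards. The `ocr_photo` helper is hypothetical; `api` is assumed to be a `CloudmersiveOcrApiClient::ImageOcrApi` instance, though any object responding to `image_ocr_photo_to_text(file, opts)` would work.

```ruby
# Hypothetical helper: run photo OCR on the file at `path` and close the file afterwards.
# `api` is assumed to respond to image_ocr_photo_to_text(file, opts), as
# CloudmersiveOcrApiClient::ImageOcrApi does in the README example.
def ocr_photo(api, path, language: nil)
  opts = {}
  opts[:language] = language if language # leave the key out to keep the documented default (ENG)

  # Binary mode avoids newline or encoding translation of the image bytes.
  File.open(path, 'rb') do |image_file|
    api.image_ocr_photo_to_text(image_file, opts)
  end
end
```

A call might look like `ocr_photo(CloudmersiveOcrApiClient::ImageOcrApi.new, '/path/to/photo.jpg', language: 'DEU')`, with any `ApiError` rescued by the caller as in the README.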
data/Rakefile
ADDED
data/cloudmersive-ocr-api-client.gemspec
ADDED
@@ -0,0 +1,45 @@
# -*- encoding: utf-8 -*-
#
=begin
#ocrapi

#The powerful Optical Character Recognition (OCR) APIs let you convert scanned images of pages into recognized text.

OpenAPI spec version: v1

Generated by: https://github.com/swagger-api/swagger-codegen.git
Swagger Codegen version: unset

=end

$:.push File.expand_path("../lib", __FILE__)
require "cloudmersive-ocr-api-client/version"

Gem::Specification.new do |s|
  s.name = "cloudmersive-ocr-api-client"
  s.version = CloudmersiveOcrApiClient::VERSION
  s.platform = Gem::Platform::RUBY
  s.authors = ["Cloudmersive"]
  s.email = [""]
  s.homepage = "https://www.cloudmersive.com/ocr-api"
  s.summary = "Convert scanned documents and images to text."
  s.description = "Convert scanned images of documents into rich text."
  s.license = "Apache 2.0"
  s.required_ruby_version = ">= 1.9"

  s.add_runtime_dependency 'typhoeus', '~> 1.0', '>= 1.0.1'
  s.add_runtime_dependency 'json', '~> 2.1', '>= 2.1.0'

  s.add_development_dependency 'rspec', '~> 3.6', '>= 3.6.0'
  s.add_development_dependency 'vcr', '~> 3.0', '>= 3.0.1'
  s.add_development_dependency 'webmock', '~> 1.24', '>= 1.24.3'
  s.add_development_dependency 'autotest', '~> 4.4', '>= 4.4.6'
  s.add_development_dependency 'autotest-rails-pure', '~> 4.1', '>= 4.1.2'
  s.add_development_dependency 'autotest-growl', '~> 0.2', '>= 0.2.16'
  s.add_development_dependency 'autotest-fsevent', '~> 0.2', '>= 0.2.12'

  s.files = Dir['./**/*']
  s.test_files = `find spec/*`.split("\n")
  s.executables = []
  s.require_paths = ["lib"]
end
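Each dependency in the gemspec pairs a pessimistic constraint (`~>`) with an explicit lower bound. A short sketch using RubyGems' own `Gem::Requirement` shows which versions such a pair admits; the `admits?` helper and the sample version strings are illustrative only.

```ruby
require 'rubygems'

# Same constraint pair the gemspec declares for typhoeus.
TYPHOEUS_REQ = Gem::Requirement.new('~> 1.0', '>= 1.0.1')

# Hypothetical helper: does the requirement accept this version string?
def admits?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts admits?(TYPHOEUS_REQ, '1.0.1') # true
puts admits?(TYPHOEUS_REQ, '1.4.0') # true: "~> 1.0" allows any 1.x release
puts admits?(TYPHOEUS_REQ, '1.0.0') # false: below the ">= 1.0.1" floor
puts admits?(TYPHOEUS_REQ, '2.0.0') # false: "~> 1.0" stops before 2.0
```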
data/docs/ImageOcrApi.md
ADDED
@@ -0,0 +1,128 @@
# CloudmersiveOcrApiClient::ImageOcrApi

All URIs are relative to *https://api.cloudmersive.com*

Method | HTTP request | Description
------------- | ------------- | -------------
[**image_ocr_photo_to_text**](ImageOcrApi.md#image_ocr_photo_to_text) | **POST** /ocr/photo/toText | Convert a photo of a document into text
[**image_ocr_post**](ImageOcrApi.md#image_ocr_post) | **POST** /ocr/image/toText | Convert a scanned image into text


# **image_ocr_photo_to_text**
> ImageToTextResponse image_ocr_photo_to_text(image_file, opts)

Convert a photo of a document into text

Converts an uploaded photo of a document in common formats such as JPEG, PNG into text via Optical Character Recognition. This API is intended to be run on photos of documents, e.g. taken with a smartphone and supports cases where other content, such as a desk, are in the frame and the camera is crooked. If you want to OCR a scanned image, use the image/toText API call instead as it is designed for scanned images.

### Example
```ruby
# load the gem
require 'cloudmersive-ocr-api-client'
# setup authorization
CloudmersiveOcrApiClient.configure do |config|
  # Configure API key authorization: Apikey
  config.api_key['Apikey'] = 'YOUR API KEY'
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
  #config.api_key_prefix['Apikey'] = 'Bearer'
end

api_instance = CloudmersiveOcrApiClient::ImageOcrApi.new

image_file = File.new("/path/to/file.txt") # File | Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

opts = {
language: "language_example" # String | Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)
}

begin
  #Convert a photo of a document into text
  result = api_instance.image_ocr_photo_to_text(image_file, opts)
  p result
rescue CloudmersiveOcrApiClient::ApiError => e
  puts "Exception when calling ImageOcrApi->image_ocr_photo_to_text: #{e}"
end
```

### Parameters

Name | Type | Description | Notes
------------- | ------------- | ------------- | -------------
**image_file** | **File**| Image file to perform OCR on. Common file formats such as PNG, JPEG are supported. |
**language** | **String**| Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish) | [optional]

### Return type

[**ImageToTextResponse**](ImageToTextResponse.md)

### Authorization

[Apikey](../README.md#Apikey)

### HTTP request headers

- **Content-Type**: multipart/form-data
- **Accept**: application/json, text/json, application/xml, text/xml



# **image_ocr_post**
> ImageToTextResponse image_ocr_post(image_file, opts)

Convert a scanned image into text

Converts an uploaded image in common formats such as JPEG, PNG into text via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smart phone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.

### Example
```ruby
# load the gem
require 'cloudmersive-ocr-api-client'
# setup authorization
CloudmersiveOcrApiClient.configure do |config|
  # Configure API key authorization: Apikey
  config.api_key['Apikey'] = 'YOUR API KEY'
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
  #config.api_key_prefix['Apikey'] = 'Bearer'
end

api_instance = CloudmersiveOcrApiClient::ImageOcrApi.new

image_file = File.new("/path/to/file.txt") # File | Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

opts = {
language: "language_example", # String | Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)
  preprocessing: "preprocessing_example" # String | Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).
}

begin
  #Convert a scanned image into text
  result = api_instance.image_ocr_post(image_file, opts)
  p result
rescue CloudmersiveOcrApiClient::ApiError => e
  puts "Exception when calling ImageOcrApi->image_ocr_post: #{e}"
end
```

### Parameters

Name | Type | Description | Notes
------------- | ------------- | ------------- | -------------
**image_file** | **File**| Image file to perform OCR on. Common file formats such as PNG, JPEG are supported. |
**language** | **String**| Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish) | [optional]
**preprocessing** | **String**| Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended). | [optional]

### Return type

[**ImageToTextResponse**](ImageToTextResponse.md)

### Authorization

[Apikey](../README.md#Apikey)

### HTTP request headers

- **Content-Type**: multipart/form-data
- **Accept**: application/json, text/json, application/xml, text/xml
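Both ImageOcrApi methods take their optional parameters through an `opts` hash, and `image_ocr_post` documents two preprocessing modes. The sketch below builds such a hash and rejects an unknown preprocessing value before any request is made. The `build_ocr_opts` helper is hypothetical, and the exact-case comparison against `Auto` and `None` is an assumption; the docs do not say whether the service is case-sensitive.

```ruby
# Preprocessing modes listed in the image_ocr_post parameter table.
PREPROCESSING_MODES = %w[Auto None].freeze

# Hypothetical helper: build the opts hash for image_ocr_post, leaving out
# anything not given so the documented defaults (ENG, Auto) apply.
def build_ocr_opts(language: nil, preprocessing: nil)
  if preprocessing && !PREPROCESSING_MODES.include?(preprocessing)
    raise ArgumentError, "preprocessing must be one of #{PREPROCESSING_MODES.join(', ')}"
  end

  opts = {}
  opts[:language] = language if language
  opts[:preprocessing] = preprocessing if preprocessing
  opts
end
```

The result can be passed straight through, for example `api_instance.image_ocr_post(image_file, build_ocr_opts(language: 'FRA', preprocessing: 'None'))`.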
data/docs/ImageToTextResponse.md
ADDED
@@ -0,0 +1,9 @@
# CloudmersiveOcrApiClient::ImageToTextResponse

## Properties
Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**mean_confidence_level** | **Float** | Confidence level rating of the OCR operation; ratings above 80% are strong. | [optional]
**text_result** | **String** | Converted text string from the image input. | [optional]
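The property table for ImageToTextResponse says ratings above 80% are strong, which suggests gating downstream use of `text_result` on `mean_confidence_level`. A minimal sketch follows, using a `Struct` stand-in so it runs without the gem. The `confident_text` helper is hypothetical, and it assumes the level is a fraction between 0.0 and 1.0; the docs do not state the scale, so the threshold is a parameter.

```ruby
# Stand-in for CloudmersiveOcrApiClient::ImageToTextResponse (same two properties).
FakeImageResult = Struct.new(:mean_confidence_level, :text_result)

# Hypothetical helper: return the recognized text only when the confidence is
# strictly above the threshold. Assumes a 0.0..1.0 scale; pass threshold: 80
# if the service turns out to report a percentage.
def confident_text(result, threshold: 0.8)
  level = result.mean_confidence_level
  return nil if level.nil? || level <= threshold

  result.text_result
end
```

With the real client, `confident_text(api_instance.image_ocr_post(image_file, opts))` would return nil for low-confidence scans, which the caller could route to manual review.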
data/docs/OcrPageResult.md
ADDED
@@ -0,0 +1,10 @@
# CloudmersiveOcrApiClient::OcrPageResult

## Properties
Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**page_number** | **Integer** | Page number of the page that was OCR-ed, starting with 1 for the first page in the PDF file | [optional]
**mean_confidence_level** | **Float** | Confidence level rating of the OCR operation; ratings above 80% are strong. | [optional]
**text_result** | **String** | Converted text string from the image input. | [optional]
data/docs/PdfOcrApi.md
ADDED
@@ -0,0 +1,67 @@
# CloudmersiveOcrApiClient::PdfOcrApi

All URIs are relative to *https://api.cloudmersive.com*

Method | HTTP request | Description
------------- | ------------- | -------------
[**pdf_ocr_post**](PdfOcrApi.md#pdf_ocr_post) | **POST** /ocr/pdf/toText | Converts an uploaded image in common formats such as JPEG, PNG into text via Optical Character Recognition.


# **pdf_ocr_post**
> PdfToTextResponse pdf_ocr_post(image_file, opts)

Converts an uploaded image in common formats such as JPEG, PNG into text via Optical Character Recognition.

### Example
```ruby
# load the gem
require 'cloudmersive-ocr-api-client'
# setup authorization
CloudmersiveOcrApiClient.configure do |config|
  # Configure API key authorization: Apikey
  config.api_key['Apikey'] = 'YOUR API KEY'
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
  #config.api_key_prefix['Apikey'] = 'Bearer'
end

api_instance = CloudmersiveOcrApiClient::PdfOcrApi.new

image_file = File.new("/path/to/file.txt") # File | Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

opts = {
language: "language_example", # String | Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)
  preprocessing: "preprocessing_example" # String | Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).
}

begin
  #Converts an uploaded image in common formats such as JPEG, PNG into text via Optical Character Recognition.
  result = api_instance.pdf_ocr_post(image_file, opts)
  p result
rescue CloudmersiveOcrApiClient::ApiError => e
  puts "Exception when calling PdfOcrApi->pdf_ocr_post: #{e}"
end
```

### Parameters

Name | Type | Description | Notes
------------- | ------------- | ------------- | -------------
**image_file** | **File**| Image file to perform OCR on. Common file formats such as PNG, JPEG are supported. |
**language** | **String**| Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish) | [optional]
**preprocessing** | **String**| Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended). | [optional]

### Return type

[**PdfToTextResponse**](PdfToTextResponse.md)

### Authorization

[Apikey](../README.md#Apikey)

### HTTP request headers

- **Content-Type**: multipart/form-data
- **Accept**: application/json, text/json, application/xml, text/xml
data/docs/PdfToTextResponse.md
ADDED
@@ -0,0 +1,9 @@
# CloudmersiveOcrApiClient::PdfToTextResponse

## Properties
Name | Type | Description | Notes
------------ | ------------- | ------------- | -------------
**successful** | **BOOLEAN** | | [optional]
**ocr_pages** | [**Array<OcrPageResult>**](OcrPageResult.md) | | [optional]
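A PdfToTextResponse carries one OcrPageResult per page, so callers usually need to stitch the pages back together. The sketch below does that with `Struct` stand-ins so it runs without the gem. The `pdf_text` helper is hypothetical, and treating `successful` as "true when OCR completed" is an assumption, since the property table gives no description for it.

```ruby
# Stand-ins for the generated models, with the same property names.
FakePage = Struct.new(:page_number, :mean_confidence_level, :text_result)
FakePdfResult = Struct.new(:successful, :ocr_pages)

# Hypothetical helper: join page texts in page order, or return nil when the
# response is not marked successful (assumed meaning of the flag).
def pdf_text(response)
  return nil unless response.successful

  (response.ocr_pages || [])
    .sort_by { |page| page.page_number.to_i }
    .map { |page| page.text_result.to_s }
    .join("\n\n")
end
```

With the real client this would be called as `pdf_text(api_instance.pdf_ocr_post(image_file, opts))`.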
@@ -0,0 +1,118 @@
|
|
1
|
+
# CloudmersiveOcrApiClient::PreprocessingApi
|
2
|
+
|
3
|
+
All URIs are relative to *https://api.cloudmersive.com*
|
4
|
+
|
5
|
+
Method | HTTP request | Description
|
6
|
+
------------- | ------------- | -------------
|
7
|
+
[**preprocessing_unrotate**](PreprocessingApi.md#preprocessing_unrotate) | **POST** /ocr/preprocessing/image/unrotate | Detect and unrotate a document image
|
8
|
+
[**preprocessing_unskew**](PreprocessingApi.md#preprocessing_unskew) | **POST** /ocr/preprocessing/image/unskew | Detect and unskew a photo of a document
|
9
|
+
|
10
|
+
|
11
|
+
# **preprocessing_unrotate**
|
12
|
+
> Object preprocessing_unrotate(image_file)
|
13
|
+
|
14
|
+
Detect and unrotate a document image
|
15
|
+
|
16
|
+
Detect and unrotate an image of a document (e.g. that was scanned at an angle). Great for document scanning applications; once unrotated, this image is perfect for converting to PDF using the Convert API or optical character recognition using the OCR API.
|
17
|
+
|
18
|
+
### Example
|
19
|
+
```ruby
|
20
|
+
# load the gem
|
21
|
+
require 'cloudmersive-ocr-api-client'
|
22
|
+
# setup authorization
|
23
|
+
CloudmersiveOcrApiClient.configure do |config|
|
24
|
+
  # Configure API key authorization: Apikey
|
25
|
+
  config.api_key['Apikey'] = 'YOUR API KEY'
|
26
|
+
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
|
27
|
+
  # config.api_key_prefix['Apikey'] = 'Bearer'
|
28
|
+
end
|
29
|
+
|
30
|
+
api_instance = CloudmersiveOcrApiClient::PreprocessingApi.new
|
31
|
+
|
32
|
+
image_file = File.new("/path/to/image.png") # File | Image file to perform OCR on. Common file formats such as PNG and JPEG are supported.
|
33
|
+
|
34
|
+
|
35
|
+
begin
|
36
|
+
  # Detect and unrotate a document image
|
37
|
+
  result = api_instance.preprocessing_unrotate(image_file)
|
38
|
+
  p result
|
39
|
+
rescue CloudmersiveOcrApiClient::ApiError => e
|
40
|
+
  puts "Exception when calling PreprocessingApi->preprocessing_unrotate: #{e}"
|
41
|
+
end
|
42
|
+
```
|
43
|
+
|
44
|
+
### Parameters
|
45
|
+
|
46
|
+
Name | Type | Description | Notes
|
47
|
+
------------- | ------------- | ------------- | -------------
|
48
|
+
**image_file** | **File**| Image file to perform OCR on. Common file formats such as PNG and JPEG are supported. |
|
49
|
+
|
50
|
+
### Return type
|
51
|
+
|
52
|
+
**Object**
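
The declared return type is only `Object`, so how to persist the processed image is not specified here. As a minimal sketch, assuming the result arrives as a String of encoded image bytes (an assumption to confirm against an actual response), a hypothetical helper `save_image_result` could write it to disk in binary mode:

```ruby
require 'tmpdir'

# Hypothetical helper: writes the result to disk in binary mode.
# Assumes `result` is a String holding the encoded image bytes; the docs only
# declare the return type as Object, so verify this before relying on it.
def save_image_result(result, path)
  raise ArgumentError, 'expected a String of image bytes' unless result.is_a?(String)

  File.binwrite(path, result)
  path
end

# Stand-in bytes (the PNG file signature) instead of a real API response.
fake_result = "\x89PNG\r\n\x1A\n".b
out_path = File.join(Dir.tmpdir, 'unrotated.png')
save_image_result(fake_result, out_path)
puts File.size(out_path)
```

Binary mode matters here because text-mode writes can alter line-ending bytes on some platforms and corrupt image data.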
|
53
|
+
|
54
|
+
### Authorization
|
55
|
+
|
56
|
+
[Apikey](../README.md#Apikey)
|
57
|
+
|
58
|
+
### HTTP request headers
|
59
|
+
|
60
|
+
- **Content-Type**: multipart/form-data
|
61
|
+
- **Accept**: application/json, text/json, application/xml, text/xml
|
62
|
+
|
63
|
+
|
64
|
+
|
65
|
+
# **preprocessing_unskew**
|
66
|
+
> Object preprocessing_unskew(image_file)
|
67
|
+
|
68
|
+
Detect and unskew a photo of a document
|
69
|
+
|
70
|
+
Detect and unskew a photo of a document (e.g. taken on a cell phone) into a perfectly square image. Great for document scanning applications; once unskewed, this image is perfect for converting to PDF using the Convert API or optical character recognition using the OCR API.
|
71
|
+
|
72
|
+
### Example
|
73
|
+
```ruby
|
74
|
+
# load the gem
|
75
|
+
require 'cloudmersive-ocr-api-client'
|
76
|
+
# setup authorization
|
77
|
+
CloudmersiveOcrApiClient.configure do |config|
|
78
|
+
  # Configure API key authorization: Apikey
|
79
|
+
  config.api_key['Apikey'] = 'YOUR API KEY'
|
80
|
+
  # Uncomment the following line to set a prefix for the API key, e.g. 'Bearer' (defaults to nil)
|
81
|
+
  # config.api_key_prefix['Apikey'] = 'Bearer'
|
82
|
+
end
|
83
|
+
|
84
|
+
api_instance = CloudmersiveOcrApiClient::PreprocessingApi.new
|
85
|
+
|
86
|
+
image_file = File.new("/path/to/image.png") # File | Image file to perform OCR on. Common file formats such as PNG and JPEG are supported.
|
87
|
+
|
88
|
+
|
89
|
+
begin
|
90
|
+
  # Detect and unskew a photo of a document
|
91
|
+
  result = api_instance.preprocessing_unskew(image_file)
|
92
|
+
  p result
|
93
|
+
rescue CloudmersiveOcrApiClient::ApiError => e
|
94
|
+
  puts "Exception when calling PreprocessingApi->preprocessing_unskew: #{e}"
|
95
|
+
end
|
96
|
+
```
|
97
|
+
|
98
|
+
### Parameters
|
99
|
+
|
100
|
+
Name | Type | Description | Notes
|
101
|
+
------------- | ------------- | ------------- | -------------
|
102
|
+
**image_file** | **File**| Image file to perform OCR on. Common file formats such as PNG and JPEG are supported. |
|
103
|
+
|
104
|
+
### Return type
|
105
|
+
|
106
|
+
**Object**
|
107
|
+
|
108
|
+
### Authorization
|
109
|
+
|
110
|
+
[Apikey](../README.md#Apikey)
|
111
|
+
|
112
|
+
### HTTP request headers
|
113
|
+
|
114
|
+
- **Content-Type**: multipart/form-data
|
115
|
+
- **Accept**: application/json, text/json, application/xml, text/xml
|
116
|
+
|
117
|
+
|
118
|
+
|