convertorio-sdk 1.2.0 → 1.2.1
- checksums.yaml +4 -4
- data/README.md +79 -0
- data/lib/convertorio.rb +3 -1
- metadata +4 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 48843d3f36c609b819ed906a746e6af91312c44c487d8e25ca8f184a136156eb
+  data.tar.gz: b4e88e74f5e2376ce9f1b843dfacae6f08a0f349f325d2b5ed06eac3106e1e1a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 48a0845b64716f21e4165136326bd4e138ad452fecd1a9f7adf7c50f009bf13b032bd343668216444516898645d895f96615d617aacd503aa7704a14223b5a3f
+  data.tar.gz: 8fee683d9f857964130080f31a5cce03af56fe5336e160ec6a55f2ca6565baa1961dec20d0acd37b00daf8b52ebd73c7a8915870952bcf21db4b4422cfc10ce0
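The digests recorded in checksums.yaml can be checked locally. The following is a minimal sketch, assuming the `.gem` archive (a plain tar file) has already been unpacked so that `metadata.gz` and `data.tar.gz` sit in one directory. The helper name `verify_checksums` and its parameters are hypothetical and are not part of RubyGems or the SDK.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper: compares the files in `dir` against a
# checksums.yaml-style document and returns the entries that differ.
# `dir` is assumed to hold the unpacked gem members.
def verify_checksums(checksums_yaml, dir)
  algorithms = { 'SHA256' => Digest::SHA256, 'SHA512' => Digest::SHA512 }
  expected = YAML.safe_load(checksums_yaml)

  mismatches = []
  expected.each do |algo_name, files|
    digest_class = algorithms.fetch(algo_name)
    files.each do |file_name, expected_hex|
      # Digest::Class.file streams the file instead of loading it whole.
      actual_hex = digest_class.file(File.join(dir, file_name)).hexdigest
      mismatches << "#{algo_name} #{file_name}" unless actual_hex == expected_hex
    end
  end
  mismatches
end
```

An empty return value means every listed file matched. Any other value names the algorithm and file whose digest differs.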
data/README.md
CHANGED
@@ -272,6 +272,85 @@ The SDK supports conversion between all formats supported by Convertorio:
 - EPS
 - JXL (JPEG XL)
 
+**✨ AI-Powered OCR:**
+- Extract text from any image format
+- Powered by advanced AI technology
+- Support for printed and handwritten text
+- JSON or TXT output formats
+
+## 🤖 AI-Powered OCR
+
+Extract text from images with state-of-the-art AI accuracy.
+
+### Quick OCR Example
+
+```ruby
+result = client.convert_file(
+  input_path: './form.jpg',
+  target_format: 'ocr',
+  output_path: './form.txt',
+  conversion_metadata: {
+    ocr_format: 'txt',
+    ocr_instructions: 'Extract all text preserving formatting'
+  }
+)
+
+puts "Tokens used: #{result[:tokens_used]}"
+```
+
+### OCR Features
+
+- **High Accuracy**: Powered by advanced AI for state-of-the-art text recognition
+- **Multiple Languages**: Automatic language detection and support
+- **Flexible Output**: Choose between `txt` (plain text) or `json` (structured data)
+- **Custom Instructions**: Guide the AI to extract specific information
+- **Handwriting Support**: Recognizes both printed and handwritten text
+- **Table Recognition**: Preserves table structure in extracted text
+- **Token-Based Billing**: Pay only for what you use, with transparent token counts
+
+### OCR Options
+
+| Option | Type | Values | Description |
+|--------|------|--------|-------------|
+| `ocr_format` | string | `txt`, `json` | Output format (default: `txt`) |
+| `ocr_instructions` | string | Any text | Custom instructions to guide extraction |
+
+### OCR Use Cases
+
+- 📄 **Invoice Processing**: Extract structured data from invoices and receipts
+- 📝 **Form Digitization**: Convert paper forms to digital data
+- 📋 **Document Archival**: Make scanned documents searchable
+- 🏷️ **Label Reading**: Extract text from product labels and tags
+- ✍️ **Handwriting Recognition**: Digitize handwritten notes and documents
+
+### Complete OCR Example
+
+```ruby
+require 'convertorio'
+require 'json'
+
+client = Convertorio::Client.new(api_key: 'your_api_key_here')
+
+# Extract text as JSON with custom instructions
+result = client.convert_file(
+  input_path: './invoice.jpg',
+  target_format: 'ocr',
+  output_path: './invoice.json',
+  conversion_metadata: {
+    ocr_format: 'json',
+    ocr_instructions: 'Extract merchant name, date, items with prices, and total amount'
+  }
+)
+
+puts 'OCR completed!'
+puts "Tokens used: #{result[:tokens_used]}"
+puts "Output saved to: #{result[:output_path]}"
+
+# Read the extracted text
+extracted_data = JSON.parse(File.read('./invoice.json'))
+puts extracted_data
+```
+
 ## Advanced Conversion Options
 
 You can control various aspects of the conversion process by passing a `conversion_metadata` hash:
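The OCR options added to the README accept only `txt` or `json` for `ocr_format`, with `txt` as the default. A caller might want to check this before sending a request. The sketch below is a hypothetical helper, not part of the SDK: `build_ocr_metadata` and its keyword names are assumptions, and it only builds the hash that the README passes as `conversion_metadata`.

```ruby
# Hypothetical helper (not provided by convertorio-sdk): builds the
# conversion_metadata hash for an OCR request and rejects formats that
# the README's "OCR Options" table does not list.
def build_ocr_metadata(ocr_format: 'txt', instructions: nil)
  allowed = %w[txt json] # values taken from the options table
  unless allowed.include?(ocr_format)
    raise ArgumentError, "ocr_format must be one of: #{allowed.join(', ')}"
  end

  metadata = { ocr_format: ocr_format }
  # ocr_instructions is optional, so it is included only when given.
  metadata[:ocr_instructions] = instructions unless instructions.nil?
  metadata
end
```

The returned hash could then be passed as `conversion_metadata:` in a `convert_file` call such as the ones shown in the README examples.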
data/lib/convertorio.rb
CHANGED
@@ -34,6 +34,7 @@ module Convertorio
         'Authorization' => "Bearer #{@api_key}",
         'Content-Type' => 'application/json'
       )
+      self.class.default_timeout 60
     end
 
     # Register a callback for events
@@ -177,7 +178,8 @@ module Convertorio
         target_format: target_format.downcase,
         file_size: File.size(final_output_path),
         processing_time: job_result['processing_time_ms'],
-        download_url: job_result['download_url']
+        download_url: job_result['download_url'],
+        tokens_used: job_result['tokens_used']
       }
 
       emit(:complete, conversion_result)
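The result hash now carries `tokens_used`, read directly from `job_result['tokens_used']`. A Ruby hash lookup returns `nil` for a missing key, so the value would be `nil` whenever the job result omits that field. Whether the API omits it for non-OCR conversions is an assumption here, not something the diff states. A minimal sketch of a caller guarding against that case, using a hypothetical `describe_tokens` helper:

```ruby
# Hypothetical helper (not part of the SDK): formats the token count from
# a conversion result, tolerating a missing or nil :tokens_used entry.
def describe_tokens(result)
  tokens = result[:tokens_used]
  return 'Tokens used: n/a' if tokens.nil?

  "Tokens used: #{tokens}"
end
```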
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: convertorio-sdk
 version: !ruby/object:Gem::Version
-  version: 1.2.
+  version: 1.2.1
 platform: ruby
 authors:
 - Convertorio
@@ -93,9 +93,9 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '3.18'
-description: Ruby SDK for the Convertorio API. Convert
-
-  ICO, and more.
+description: Ruby SDK for the Convertorio API. Convert files between 20+ formats with
+  AI-powered OCR for text extraction. Supports JPG, PNG, WebP, AVIF, HEIC, GIF, BMP,
+  TIFF, ICO, PDF, and more.
 email:
 - support@convertorio.com
 executables: []