activestorage-ocr 0.1.2 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 407f4e4098cfdf7804094b783efa4e9dd6a8ef18ede70a8bf571eae693a17fa7
4
- data.tar.gz: 368ec4b42ffa94ca71e6b6df4677d0af87dadd9ab14bf95e57a44aa2904e6a2d
3
+ metadata.gz: 9ac32a71c542c906c4a3d29ab10e1592a2fd08e79914f447beaa4216b9f3e321
4
+ data.tar.gz: e89646596ca0d7662d48f61633744453153426518e7c509951d9da632ae42571
5
5
  SHA512:
6
- metadata.gz: 8cf7e677131db26961ca31684f932f4aff67a85c3b16ba1bb460ec49b1e74f172e64bd1be777bd0a0620df8756c62bdc858d9409857e27da48eb166bc0735669
7
- data.tar.gz: 25b5ae857880af8f0e3a11e4874f15896f4451685e8cae45a0dce2cf6e1c9a8b5ac15f89f1e80920fa9f2bd3ce8bfcfb9a62ef38d99e57b4c977de5b8d65b30c
6
+ metadata.gz: f18b15c1cdc22d5816cf5f7d3c6d2b0cb9b1f89a60f1f78f20f5e020a34e0ca98c5f97108aec75904be08ed84b84937d801d0b2e09415279f028e5f384db439c
7
+ data.tar.gz: cb1422a3f9b30b81a9336ca9c69cc76cef8e17d52dc68711587ca80f26ad525bd26c875e7a0cc7aaf9460101c126c2fd9424216951fdc74b4c6e7673b436e3a5
data/README.md CHANGED
@@ -3,16 +3,18 @@
3
3
  [![CI](https://github.com/Cause-of-a-Kind/activestorage-ocr/actions/workflows/ci.yml/badge.svg)](https://github.com/Cause-of-a-Kind/activestorage-ocr/actions/workflows/ci.yml)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
5
 
6
- OCR for Rails Active Storage attachments, powered by Rust and [ocrs](https://github.com/robertknight/ocrs).
6
+ OCR for Rails Active Storage attachments, powered by Rust.
7
7
 
8
8
  ## Overview
9
9
 
10
- `activestorage-ocr` provides optical character recognition (OCR) for files stored with Active Storage. It uses a high-performance Rust server with the pure-Rust `ocrs` OCR engine, eliminating the need for third-party OCR services or system-level dependencies.
10
+ `activestorage-ocr` provides optical character recognition (OCR) for files stored with Active Storage. It uses a high-performance Rust server with your choice of OCR engine, eliminating the need for third-party OCR services.
11
11
 
12
12
  **Key Features:**
13
- - **Pure Rust** - No Tesseract or system dependencies required
14
- - **Self-contained** - Models download automatically on first run (~50MB)
15
- - **Fast** - Processes images in ~150ms
13
+ - **Two OCR Engines** - Choose the right tool for the job:
14
+ - **ocrs** (default) - Pure Rust, zero system dependencies
15
+ - **leptess** - Tesseract-based, statically linked (no system Tesseract needed)
16
+ - **Self-contained** - Pre-built binaries with no system dependencies
17
+ - **Per-request engine selection** - Use different engines for different files
16
18
  - **Automatic** - OCR runs automatically when files are uploaded via Active Storage
17
19
 
18
20
  **Supported Formats:**
@@ -24,6 +26,24 @@ OCR for Rails Active Storage attachments, powered by Rust and [ocrs](https://git
24
26
  - **Ruby gem** provides seamless Rails integration
25
27
  - Simple HTTP/JSON protocol for easy debugging
26
28
 
29
+ ### Choosing an Engine
30
+
31
+ | Engine | Binary Size | Notes |
32
+ |--------|-------------|-------|
33
+ | `ocrs` | ~9 MB | Pure Rust, modern neural network approach |
34
+ | `leptess` | ~13 MB | Tesseract-based, well-established OCR engine |
35
+ | `all` | ~15 MB | Includes both engines for comparison |
36
+
37
+ **Which engine should I use?**
38
+
39
+ Start with `ocrs` (the default). It works well for most documents and has a smaller binary. If results aren't satisfactory, try `leptess`—it uses Tesseract, a mature OCR engine that's been around for decades.
40
+
41
+ Neither engine is universally better—performance varies by image. The `all` variant lets you compare both engines on your actual documents to see which works best for your use case.
42
+
43
+ **Platform Support:**
44
+ - **Linux (x86_64):** Fully supported
45
+ - **macOS / Windows:** ocrs-only variant works; Tesseract variants require building from source
46
+
27
47
  ## Requirements
28
48
 
29
49
  - Ruby 3.2+
@@ -45,6 +65,18 @@ bundle install
45
65
  bin/rails activestorage_ocr:install
46
66
  ```
47
67
 
68
+ By default, this installs the `ocrs` engine (~9 MB binary). To install with Tesseract support:
69
+
70
+ ```bash
71
+ # Install both engines (recommended for comparing results)
72
+ bin/rails activestorage_ocr:install variant=all
73
+
74
+ # Or install only the Tesseract-based engine
75
+ bin/rails activestorage_ocr:install variant=leptess
76
+ ```
77
+
78
+ **No system dependencies required.** The Tesseract OCR engine and training data are statically linked into the binary.
79
+
48
80
  **3. Add the OCR server to your `Procfile.dev`:**
49
81
 
50
82
  ```procfile
@@ -111,6 +143,15 @@ result.confidence # => 0.95
111
143
 
112
144
  # Extract text from an Active Storage attachment
113
145
  result = client.extract_text(document.file)
146
+
147
+ # Per-request engine selection (requires all-engines variant)
148
+ result = client.extract_text(document.file, engine: :ocrs) # Pure Rust engine
149
+ result = client.extract_text(document.file, engine: :leptess) # Tesseract engine
150
+
151
+ # Compare engines side-by-side
152
+ comparison = client.compare(document.file)
153
+ comparison[:ocrs].confidence # => 0.95
154
+ comparison[:leptess].confidence # => 0.92
114
155
  ```
115
156
 
116
157
  ## Configuration
@@ -162,10 +203,11 @@ In your `Dockerfile`, run the generator during the build to install the binary:
162
203
  RUN bundle install
163
204
 
164
205
  # Install OCR server binary to bin/dist/
165
- RUN bundle exec rails activestorage_ocr:install path=./bin/dist
206
+ # Use variant=all to include both ocrs and Tesseract engines (~15 MB)
207
+ RUN bundle exec rails activestorage_ocr:install variant=all path=./bin/dist
166
208
 
167
- # Alternatively, run the full generator:
168
- # RUN bundle exec rails generate activestorage_ocr:install
209
+ # Or for the smaller ocrs-only variant (~9 MB):
210
+ # RUN bundle exec rails activestorage_ocr:install path=./bin/dist
169
211
  ```
170
212
 
171
213
  Use foreman to manage both processes:
@@ -261,6 +303,12 @@ bin/rails activestorage_ocr:health
261
303
  # Install the OCR server binary for your platform
262
304
  bin/rails activestorage_ocr:install
263
305
 
306
+ # Install with Tesseract support (all-engines variant)
307
+ bin/rails activestorage_ocr:install variant=all
308
+
309
+ # Install to a custom path
310
+ bin/rails activestorage_ocr:install variant=all path=./bin/dist
311
+
264
312
  # Start the OCR server (for manual testing)
265
313
  bin/rails activestorage_ocr:start
266
314
 
@@ -269,6 +317,9 @@ bin/rails activestorage_ocr:health
269
317
 
270
318
  # Show binary info (platform, path, version)
271
319
  bin/rails activestorage_ocr:info
320
+
321
+ # Compare both OCR engines on a file
322
+ bin/rails activestorage_ocr:compare file=/path/to/image.png
272
323
  ```
273
324
 
274
325
  ## API Endpoints
@@ -278,8 +329,10 @@ The Rust server exposes these HTTP endpoints:
278
329
  | Endpoint | Method | Description |
279
330
  |----------|--------|-------------|
280
331
  | `/health` | GET | Health check |
281
- | `/info` | GET | Server info and supported formats |
282
- | `/ocr` | POST | Extract text from uploaded file |
332
+ | `/info` | GET | Server info, available engines, and supported formats |
333
+ | `/ocr` | POST | Extract text using default engine |
334
+ | `/ocr/ocrs` | POST | Extract text using ocrs engine |
335
+ | `/ocr/leptess` | POST | Extract text using Tesseract engine |
283
336
 
284
337
  ### Example with curl
285
338
 
@@ -287,9 +340,16 @@ The Rust server exposes these HTTP endpoints:
287
340
  # Health check
288
341
  curl http://localhost:9292/health
289
342
 
290
- # OCR an image
343
+ # Server info (shows available engines)
344
+ curl http://localhost:9292/info
345
+
346
+ # OCR with default engine
291
347
  curl -X POST http://localhost:9292/ocr \
292
348
  -F "file=@document.png;type=image/png"
349
+
350
+ # OCR with specific engine (requires all-engines variant)
351
+ curl -X POST http://localhost:9292/ocr/leptess \
352
+ -F "file=@document.png;type=image/png"
293
353
  ```
294
354
 
295
355
  ## Development
@@ -297,22 +357,38 @@ curl -X POST http://localhost:9292/ocr \
297
357
  ### Building from source
298
358
 
299
359
  ```bash
300
- # Build the Rust server
360
+ # Build with default engine (ocrs only)
301
361
  cd rust
302
362
  cargo build --release
303
363
 
364
+ # Build with all engines (ocrs + Tesseract)
365
+ cargo build --release --features all-engines
366
+
367
+ # Build with specific engine only
368
+ cargo build --release --features engine-leptess
369
+
304
370
  # The binary will be at rust/target/release/activestorage-ocr-server
305
371
  ```
306
372
 
373
+ For local development with a Rails app, you can install a locally-built binary:
374
+
375
+ ```bash
376
+ # In your Rails app directory
377
+ bin/rails activestorage_ocr:install source=/path/to/activestorage-ocr/rust/target/release/activestorage-ocr-server path=./bin/dist
378
+ ```
379
+
307
380
  ### Running tests
308
381
 
309
382
  ```bash
310
383
  # Ruby unit tests
311
384
  bundle exec rake test
312
385
 
313
- # Rust tests
386
+ # Rust tests (default engine)
314
387
  cd rust && cargo test
315
388
 
389
+ # Rust tests (all engines)
390
+ cd rust && cargo test --features all-engines -- --test-threads=1
391
+
316
392
  # Integration tests (requires server running)
317
393
  cd rust && ./target/release/activestorage-ocr-server &
318
394
  cd test/sandbox && RAILS_ENV=test bin/rails test
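The per-engine endpoints in the README table can also be reached from plain Ruby, without the gem's client. What follows is a minimal sketch, assuming the server listens on `localhost:9292` as in the curl examples. `build_ocr_request` is a hypothetical helper, not part of the gem, and the request is only built here, not sent.

```ruby
require "net/http"
require "stringio"
require "uri"

# Hypothetical helper (not part of the gem): builds a multipart POST for one
# of the documented endpoints: /ocr, /ocr/ocrs, or /ocr/leptess.
def build_ocr_request(io, filename:, content_type:, engine: nil)
  path = engine ? "/ocr/#{engine}" : "/ocr"
  request = Net::HTTP::Post.new(path)
  # Same form field name ("file") as the curl examples in the README.
  request.set_form(
    [["file", io, { filename: filename, content_type: content_type }]],
    "multipart/form-data"
  )
  request
end

request = build_ocr_request(
  StringIO.new("fake image bytes"), # stand-in for File.open("document.png", "rb")
  filename: "document.png",
  content_type: "image/png",
  engine: :leptess
)
puts "#{request.method} #{request.path}"

# To actually send it (assumes a running server with the leptess engine):
#   uri = URI("http://localhost:9292")
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Per the README, the engine-specific paths require the all-engines variant of the server binary.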
@@ -15,6 +15,7 @@ module ActiveStorage
15
15
  # After analysis, blobs will have the following metadata:
16
16
  # * +ocr_text+ - The extracted text
17
17
  # * +ocr_confidence+ - Confidence score (0.0 to 1.0)
18
+ # * +ocr_engine+ - The OCR engine used ("ocrs" or "leptess")
18
19
  # * +ocr_processed_at+ - ISO 8601 timestamp
19
20
  #
20
21
  # == Example
@@ -20,11 +20,23 @@ module ActiveStorage
20
20
  # * linux-x86_64 (Linux x86_64)
21
21
  # * linux-aarch64 (Linux ARM64)
22
22
  #
23
+ # == Binary Variants
24
+ #
25
+ # * :ocrs - Pure Rust OCR engine only (~15MB, no system dependencies)
26
+ # * :leptess - Tesseract OCR engine only (~50-80MB, no system dependencies)
27
+ # * :all - Both engines included (~80-100MB)
28
+ #
23
29
  # == Usage
24
30
  #
25
- # # Install the binary for the current platform
31
+ # # Install the default (ocrs) binary for the current platform
26
32
  # ActiveStorage::Ocr::Binary.install!
27
33
  #
34
+ # # Install the leptess variant
35
+ # ActiveStorage::Ocr::Binary.install!(variant: :leptess)
36
+ #
37
+ # # Install all-engines variant
38
+ # ActiveStorage::Ocr::Binary.install!(variant: :all)
39
+ #
28
40
  # # Check if binary is installed
29
41
  # ActiveStorage::Ocr::Binary.installed? # => true
30
42
  #
@@ -38,6 +50,22 @@ module ActiveStorage
38
50
  # Name of the server binary.
39
51
  BINARY_NAME = "activestorage-ocr-server"
40
52
 
53
+ # Available binary variants with their download suffix and descriptions.
54
+ VARIANTS = {
55
+ ocrs: {
56
+ suffix: "",
57
+ description: "Pure Rust OCR engine (fast, no system dependencies)"
58
+ },
59
+ leptess: {
60
+ suffix: "-leptess",
61
+ description: "Tesseract OCR engine (better for messy images)"
62
+ },
63
+ all: {
64
+ suffix: "-all",
65
+ description: "All OCR engines included"
66
+ }
67
+ }.freeze
68
+
41
69
  class << self
42
70
  # Detects the current platform.
43
71
  #
@@ -107,17 +135,47 @@ module ActiveStorage
107
135
  ActiveStorage::Ocr::VERSION
108
136
  end
109
137
 
110
- # Returns the download URL for the current platform.
138
+ # Returns the download URL for the current platform and variant.
139
+ #
140
+ # ==== Parameters
141
+ #
142
+ # * +variant+ - The binary variant (:ocrs, :leptess, or :all)
111
143
  #
112
144
  # ==== Returns
113
145
  #
114
146
  # GitHub releases URL for the platform-specific tarball.
115
- def download_url
147
+ def download_url(variant: :ocrs)
148
+ variant = variant.to_sym
149
+ validate_variant!(variant)
116
150
  tag = "v#{version}"
117
- filename = "activestorage-ocr-server-#{platform}.tar.gz"
151
+ suffix = VARIANTS[variant][:suffix]
152
+ filename = "activestorage-ocr-server#{suffix}-#{platform}.tar.gz"
118
153
  "https://github.com/#{GITHUB_REPO}/releases/download/#{tag}/#{filename}"
119
154
  end
120
155
 
156
+ # Lists available binary variants.
157
+ #
158
+ # ==== Returns
159
+ #
160
+ # Array of variant names.
161
+ def available_variants
162
+ VARIANTS.keys
163
+ end
164
+
165
+ # Returns info about a specific variant.
166
+ #
167
+ # ==== Parameters
168
+ #
169
+ # * +variant+ - The variant name (:ocrs, :leptess, or :all)
170
+ #
171
+ # ==== Returns
172
+ #
173
+ # Hash with :suffix and :description keys.
174
+ def variant_info(variant)
175
+ validate_variant!(variant)
176
+ VARIANTS[variant]
177
+ end
178
+
121
179
  # Downloads and installs the binary.
122
180
  #
123
181
  # Downloads from GitHub releases and extracts to the specified directory.
@@ -126,6 +184,7 @@ module ActiveStorage
126
184
  #
127
185
  # * +force+ - If true, reinstalls even if already installed
128
186
  # * +path+ - Custom installation directory (defaults to gem's bin directory)
187
+ # * +variant+ - The binary variant to install (:ocrs, :leptess, or :all)
129
188
  #
130
189
  # ==== Returns
131
190
  #
@@ -134,7 +193,9 @@ module ActiveStorage
134
193
  # ==== Raises
135
194
  #
136
195
  # RuntimeError if the download fails.
137
- def install!(force: false, path: nil)
196
+ # ArgumentError if an invalid variant is specified.
197
+ def install!(force: false, path: nil, variant: :ocrs)
198
+ validate_variant!(variant)
138
199
  target_dir = path || install_dir
139
200
  target_path = File.join(target_dir, BINARY_NAME)
140
201
 
@@ -145,15 +206,18 @@ module ActiveStorage
145
206
 
146
207
  FileUtils.mkdir_p(target_dir)
147
208
 
148
- puts "Downloading activestorage-ocr-server for #{platform}..."
209
+ variant_desc = VARIANTS[variant][:description]
210
+ puts "Downloading activestorage-ocr-server (#{variant_desc}) for #{platform}..."
149
211
 
150
- uri = URI(download_url)
212
+ url = download_url(variant: variant)
213
+ uri = URI(url)
151
214
  response = fetch_with_redirects(uri)
152
215
 
153
216
  unless response.is_a?(Net::HTTPSuccess)
217
+ feature_flag = variant == :ocrs ? "engine-ocrs" : (variant == :leptess ? "engine-leptess" : "all-engines")
154
218
  raise "Failed to download binary: #{response.code} #{response.message}\n" \
155
- "URL: #{download_url}\n" \
156
- "You may need to build from source: cd rust && cargo build --release"
219
+ "URL: #{url}\n" \
220
+ "You may need to build from source: cd rust && cargo build --release --features #{feature_flag}"
157
221
  end
158
222
 
159
223
  extract_binary(response.body, target_path)
@@ -163,6 +227,18 @@ module ActiveStorage
163
227
 
164
228
  private
165
229
 
230
+ # Validates the variant parameter.
231
+ #
232
+ # ==== Raises
233
+ #
234
+ # ArgumentError if the variant is not valid.
235
+ def validate_variant!(variant)
236
+ variant = variant.to_sym
237
+ unless VARIANTS.key?(variant)
238
+ raise ArgumentError, "Invalid variant: #{variant}. Valid variants: #{VARIANTS.keys.join(', ')}"
239
+ end
240
+ end
241
+
166
242
  # Fetches a URL, following redirects.
167
243
  def fetch_with_redirects(uri, limit = 10)
168
244
  raise "Too many redirects" if limit == 0
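The variant handling above reduces to a suffix lookup spliced into the tarball name. The sketch below reproduces that mapping in isolation. `VARIANT_SUFFIXES`, `normalize_variant`, and `tarball_name` are illustrative names, not the gem's API, and the helper returns the normalized symbol so that string input such as `"all"` can be used for later lookups as well.

```ruby
# Suffix table mirroring the :suffix values of the VARIANTS constant in the diff.
VARIANT_SUFFIXES = { ocrs: "", leptess: "-leptess", all: "-all" }.freeze

# Hypothetical helper: validates the variant and returns it as a Symbol,
# so callers can pass either :all or "all".
def normalize_variant(variant)
  key = variant.to_sym
  return key if VARIANT_SUFFIXES.key?(key)

  raise ArgumentError,
        "Invalid variant: #{variant}. Valid variants: #{VARIANT_SUFFIXES.keys.join(', ')}"
end

# Builds the release tarball filename for a platform/variant pair,
# following the pattern used by download_url in the diff.
def tarball_name(platform, variant: :ocrs)
  suffix = VARIANT_SUFFIXES.fetch(normalize_variant(variant))
  "activestorage-ocr-server#{suffix}-#{platform}.tar.gz"
end

puts tarball_name("linux-x86_64")
puts tarball_name("linux-x86_64", variant: "all")
```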
@@ -17,6 +17,12 @@ module ActiveStorage
17
17
  # # Extract text from a file path
18
18
  # result = client.extract_text_from_path("/path/to/image.png")
19
19
  #
20
+ # # Use a specific OCR engine
21
+ # result = client.extract_text(document.file, engine: :leptess)
22
+ #
23
+ # # Compare results from both engines
24
+ # comparison = client.compare(document.file)
25
+ #
20
26
  # # Check server health
21
27
  # client.healthy? # => true
22
28
  #
@@ -37,6 +43,9 @@ module ActiveStorage
37
43
  # ==== Parameters
38
44
  #
39
45
  # * +blob+ - An ActiveStorage::Blob instance
46
+ # * +engine+ - OCR engine to use (:ocrs or :leptess). Defaults to configured engine.
47
+ # * +preprocess+ - Preprocessing preset (:none, :minimal, :default, :aggressive).
48
+ # Defaults to configured preset.
40
49
  #
41
50
  # ==== Returns
42
51
  #
@@ -46,9 +55,9 @@ module ActiveStorage
46
55
  #
47
56
  # * ConnectionError - if the server is unreachable
48
57
  # * ServerError - if the server returns an error
49
- def extract_text(blob)
58
+ def extract_text(blob, engine: nil, preprocess: nil)
50
59
  blob.open do |file|
51
- extract_text_from_file(file, blob.content_type, blob.filename.to_s)
60
+ extract_text_from_file(file, blob.content_type, blob.filename.to_s, engine: engine, preprocess: preprocess)
52
61
  end
53
62
  end
54
63
 
@@ -59,6 +68,9 @@ module ActiveStorage
59
68
  # * +path+ - Path to the file
60
69
  # * +content_type+ - MIME type (auto-detected if not provided)
61
70
  # * +filename+ - Filename to send (defaults to basename of path)
71
+ # * +engine+ - OCR engine to use (:ocrs or :leptess). Defaults to configured engine.
72
+ # * +preprocess+ - Preprocessing preset (:none, :minimal, :default, :aggressive).
73
+ # Defaults to configured preset.
62
74
  #
63
75
  # ==== Returns
64
76
  #
@@ -68,12 +80,12 @@ module ActiveStorage
68
80
  #
69
81
  # * ConnectionError - if the server is unreachable
70
82
  # * ServerError - if the server returns an error
71
- def extract_text_from_path(path, content_type: nil, filename: nil)
83
+ def extract_text_from_path(path, content_type: nil, filename: nil, engine: nil, preprocess: nil)
72
84
  content_type ||= Marcel::MimeType.for(Pathname.new(path))
73
85
  filename ||= File.basename(path)
74
86
 
75
87
  File.open(path, "rb") do |file|
76
- extract_text_from_file(file, content_type, filename)
88
+ extract_text_from_file(file, content_type, filename, engine: engine, preprocess: preprocess)
77
89
  end
78
90
  end
79
91
 
@@ -86,6 +98,9 @@ module ActiveStorage
86
98
  # * +file+ - An IO object (File, StringIO, etc.)
87
99
  # * +content_type+ - MIME type of the file
88
100
  # * +filename+ - Filename to send to the server
101
+ # * +engine+ - OCR engine to use (:ocrs or :leptess). Defaults to configured engine.
102
+ # * +preprocess+ - Preprocessing preset (:none, :minimal, :default, :aggressive).
103
+ # Defaults to configured preset.
89
104
  #
90
105
  # ==== Returns
91
106
  #
@@ -95,8 +110,12 @@ module ActiveStorage
95
110
  #
96
111
  # * ConnectionError - if the server is unreachable
97
112
  # * ServerError - if the server returns an error
98
- def extract_text_from_file(file, content_type, filename)
99
- response = connection.post("/ocr") do |req|
113
+ def extract_text_from_file(file, content_type, filename, engine: nil, preprocess: nil)
114
+ target_engine = engine || @config.engine
115
+ target_preprocess = preprocess || @config.preprocess
116
+ endpoint = ocr_endpoint_for(target_engine, target_preprocess)
117
+
118
+ response = connection.post(endpoint) do |req|
100
119
  req.body = {
101
120
  file: Faraday::Multipart::FilePart.new(
102
121
  file,
@@ -111,6 +130,57 @@ module ActiveStorage
111
130
  raise ConnectionError, "Failed to connect to OCR server: #{e.message}"
112
131
  end
113
132
 
133
+ # Compares OCR results from both engines.
134
+ #
135
+ # Runs OCR on the same file using both ocrs and leptess engines,
136
+ # allowing you to compare accuracy and performance.
137
+ #
138
+ # ==== Parameters
139
+ #
140
+ # * +blob+ - An ActiveStorage::Blob instance
141
+ #
142
+ # ==== Returns
143
+ #
144
+ # A Hash with :ocrs and :leptess keys, each containing a Result object.
145
+ #
146
+ # ==== Raises
147
+ #
148
+ # * ConnectionError - if the server is unreachable
149
+ # * ServerError - if the server returns an error
150
+ def compare(blob)
151
+ ocrs_result = extract_text(blob, engine: :ocrs)
152
+ leptess_result = extract_text(blob, engine: :leptess)
153
+
154
+ {
155
+ ocrs: ocrs_result,
156
+ leptess: leptess_result
157
+ }
158
+ end
159
+
160
+ # Compares OCR results from both engines using a file path.
161
+ #
162
+ # ==== Parameters
163
+ #
164
+ # * +path+ - Path to the file
165
+ # * +content_type+ - MIME type (auto-detected if not provided)
166
+ # * +filename+ - Filename to send (defaults to basename of path)
167
+ #
168
+ # ==== Returns
169
+ #
170
+ # A Hash with :ocrs and :leptess keys, each containing a Result object.
171
+ def compare_from_path(path, content_type: nil, filename: nil)
172
+ content_type ||= Marcel::MimeType.for(Pathname.new(path))
173
+ filename ||= File.basename(path)
174
+
175
+ ocrs_result = extract_text_from_path(path, content_type: content_type, filename: filename, engine: :ocrs)
176
+ leptess_result = extract_text_from_path(path, content_type: content_type, filename: filename, engine: :leptess)
177
+
178
+ {
179
+ ocrs: ocrs_result,
180
+ leptess: leptess_result
181
+ }
182
+ end
183
+
114
184
  # Checks if the OCR server is healthy.
115
185
  #
116
186
  # ==== Returns
@@ -143,6 +213,29 @@ module ActiveStorage
143
213
 
144
214
  private
145
215
 
216
+ # Returns the OCR endpoint path for the given engine and preprocess preset.
217
+ #
218
+ # ==== Parameters
219
+ #
220
+ # * +engine+ - Engine name (:ocrs or :leptess)
221
+ # * +preprocess+ - Preprocessing preset (:none, :minimal, :default, :aggressive)
222
+ #
223
+ # ==== Returns
224
+ #
225
+ # The endpoint path string with query parameter (e.g., "/ocr?preprocess=default")
226
+ def ocr_endpoint_for(engine, preprocess)
227
+ base = case engine.to_sym
228
+ when :ocrs
229
+ "/ocr"
230
+ when :leptess
231
+ "/ocr/leptess"
232
+ else
233
+ raise ArgumentError, "Unknown engine: #{engine}"
234
+ end
235
+
236
+ "#{base}?preprocess=#{preprocess}"
237
+ end
238
+
146
239
  # Returns the Faraday connection, creating it if necessary.
147
240
  def connection
148
241
  @connection ||= Faraday.new(url: @config.server_url) do |f|
@@ -166,7 +259,9 @@ module ActiveStorage
166
259
  text: data[:text],
167
260
  confidence: data[:confidence],
168
261
  processing_time_ms: data[:processing_time_ms],
169
- warnings: data[:warnings] || []
262
+ warnings: data[:warnings] || [],
263
+ engine: data[:engine],
264
+ preprocessing: data[:preprocessing]
170
265
  )
171
266
  end
172
267
  end
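The endpoint selection in `ocr_endpoint_for` can be sketched on its own as a table lookup plus a query string. This is an illustration under assumptions, not the gem's code: `ENGINE_PATHS` and `ocr_endpoint` are hypothetical names, and `URI.encode_www_form` is swapped in for the string interpolation used in the diff so that the preset value is always URL-encoded.

```ruby
require "uri"

# Engine-to-path mapping as in the diff: :ocrs uses the default /ocr route.
ENGINE_PATHS = { ocrs: "/ocr", leptess: "/ocr/leptess" }.freeze

# Hypothetical stand-in for ocr_endpoint_for: returns the path plus an
# encoded preprocess query parameter.
def ocr_endpoint(engine, preprocess)
  base = ENGINE_PATHS.fetch(engine.to_sym) do
    raise ArgumentError, "Unknown engine: #{engine}"
  end
  "#{base}?#{URI.encode_www_form(preprocess: preprocess)}"
end

puts ocr_endpoint(:ocrs, :default)
puts ocr_endpoint("leptess", :aggressive)
```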
@@ -9,6 +9,8 @@ module ActiveStorage
9
9
  # ActiveStorage::Ocr.configure do |config|
10
10
  # config.server_url = "http://localhost:9292"
11
11
  # config.timeout = 60
12
+ # config.engine = :leptess # Use Tesseract engine instead of default ocrs
13
+ # config.preprocess = :aggressive # Use aggressive preprocessing
12
14
  # end
13
15
  #
14
16
  # == Environment Variables
@@ -16,8 +18,16 @@ module ActiveStorage
16
18
  # * +ACTIVESTORAGE_OCR_SERVER_URL+ - OCR server URL (default: http://localhost:9292)
17
19
  # * +ACTIVESTORAGE_OCR_TIMEOUT+ - Request timeout in seconds (default: 30)
18
20
  # * +ACTIVESTORAGE_OCR_OPEN_TIMEOUT+ - Connection timeout in seconds (default: 5)
21
+ # * +ACTIVESTORAGE_OCR_ENGINE+ - OCR engine to use: ocrs (default) or leptess
22
+ # * +ACTIVESTORAGE_OCR_PREPROCESS+ - Preprocessing preset: none, minimal, default, aggressive
19
23
  #
20
24
  class Configuration
25
+ # Valid OCR engine names
26
+ VALID_ENGINES = %i[ocrs leptess].freeze
27
+
28
+ # Valid preprocessing preset names
29
+ VALID_PREPROCESS = %i[none minimal default aggressive].freeze
30
+
21
31
  # The URL of the OCR server.
22
32
  attr_accessor :server_url
23
33
 
@@ -30,6 +40,16 @@ module ActiveStorage
30
40
  # Array of MIME types that the analyzer will process.
31
41
  attr_accessor :content_types
32
42
 
43
+ # The OCR engine to use (:ocrs or :leptess).
44
+ # Default is :ocrs (pure Rust, no dependencies).
45
+ # Use :leptess for Tesseract-based OCR (better for messy images).
46
+ attr_reader :engine
47
+
48
+ # The preprocessing preset to use (:none, :minimal, :default, :aggressive).
49
+ # Default is :default.
50
+ # Use :none to skip preprocessing, :aggressive for poor quality images.
51
+ attr_reader :preprocess
52
+
33
53
  # Creates a new Configuration with default values.
34
54
  #
35
55
  # Defaults are read from environment variables if set.
@@ -38,6 +58,45 @@ module ActiveStorage
38
58
  @timeout = ENV.fetch("ACTIVESTORAGE_OCR_TIMEOUT", 30).to_i
39
59
  @open_timeout = ENV.fetch("ACTIVESTORAGE_OCR_OPEN_TIMEOUT", 5).to_i
40
60
  @content_types = default_content_types
61
+ self.engine = ENV.fetch("ACTIVESTORAGE_OCR_ENGINE", "ocrs").to_sym
62
+ self.preprocess = ENV.fetch("ACTIVESTORAGE_OCR_PREPROCESS", "default").to_sym
63
+ end
64
+
65
+ # Set the OCR engine to use.
66
+ #
67
+ # ==== Parameters
68
+ #
69
+ # * +value+ - Engine name (:ocrs or :leptess)
70
+ #
71
+ # ==== Raises
72
+ #
73
+ # * +ArgumentError+ if an invalid engine name is provided
74
+ def engine=(value)
75
+ value = value.to_sym
76
+ unless VALID_ENGINES.include?(value)
77
+ raise ArgumentError, "Invalid engine: #{value}. Valid engines: #{VALID_ENGINES.join(', ')}"
78
+ end
79
+
80
+ @engine = value
81
+ end
82
+
83
+ # Set the preprocessing preset to use.
84
+ #
85
+ # ==== Parameters
86
+ #
87
+ # * +value+ - Preset name (:none, :minimal, :default, :aggressive)
88
+ #
89
+ # ==== Raises
90
+ #
91
+ # * +ArgumentError+ if an invalid preset name is provided
92
+ def preprocess=(value)
93
+ value = value.to_sym
94
+ unless VALID_PREPROCESS.include?(value)
95
+ raise ArgumentError,
96
+ "Invalid preprocess preset: #{value}. Valid presets: #{VALID_PREPROCESS.join(', ')}"
97
+ end
98
+
99
+ @preprocess = value
41
100
  end
42
101
 
43
102
  # Returns the default list of supported content types.
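Because the initializer routes the environment value through the validating setter, a misspelled `ACTIVESTORAGE_OCR_ENGINE` raises when the configuration object is built rather than at the first OCR request. The sketch below shows that behaviour in isolation. `EngineSettings` is a hypothetical class used only for illustration, and it accepts an injectable `env` hash (an assumption, not in the diff) so it can be exercised without touching the real `ENV`.

```ruby
# Hypothetical, trimmed-down stand-in for the Configuration class in the diff.
class EngineSettings
  VALID_ENGINES = %i[ocrs leptess].freeze

  attr_reader :engine

  # env defaults to the process environment; a plain Hash works for testing.
  def initialize(env = ENV)
    self.engine = env.fetch("ACTIVESTORAGE_OCR_ENGINE", "ocrs")
  end

  # Validating setter: accepts Strings or Symbols, stores a Symbol.
  def engine=(value)
    value = value.to_sym
    unless VALID_ENGINES.include?(value)
      raise ArgumentError, "Invalid engine: #{value}. Valid engines: #{VALID_ENGINES.join(', ')}"
    end

    @engine = value
  end
end

puts EngineSettings.new({}).engine
puts EngineSettings.new({ "ACTIVESTORAGE_OCR_ENGINE" => "leptess" }).engine

begin
  EngineSettings.new({ "ACTIVESTORAGE_OCR_ENGINE" => "tesseract" })
rescue ArgumentError => e
  puts e.message
end
```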
@@ -13,6 +13,7 @@ module ActiveStorage
13
13
  # * +activestorage_ocr:start+ - Start the OCR server
14
14
  # * +activestorage_ocr:health+ - Check if the server is responding
15
15
  # * +activestorage_ocr:info+ - Show binary and platform information
16
+ # * +activestorage_ocr:compare+ - Compare OCR results from different engines
16
17
  #
17
18
  class Railtie < Rails::Railtie
18
19
  # Registers the OCR analyzer with Active Storage.
@@ -29,10 +30,23 @@ module ActiveStorage
29
30
  # Defines rake tasks for server management.
30
31
  rake_tasks do
31
32
  namespace :activestorage_ocr do
32
- desc "Install the OCR server binary (optional: path=./bin/dist)"
33
+ desc "Install the OCR server binary (variant=ocrs|leptess|all, path=./bin/dist, source=local_binary_path)"
33
34
  task :install do
34
35
  path = ENV["path"]
35
- ActiveStorage::Ocr::Binary.install!(path: path)
36
+ variant = (ENV["variant"] || "ocrs").to_sym
37
+ source = ENV["source"]
38
+
39
+ if source
40
+ # Local development: copy from a local path instead of downloading
41
+ target_dir = path || ActiveStorage::Ocr::Binary.install_dir
42
+ target_path = File.join(target_dir, ActiveStorage::Ocr::Binary::BINARY_NAME)
43
+ FileUtils.mkdir_p(target_dir)
44
+ FileUtils.cp(source, target_path)
45
+ FileUtils.chmod(0o755, target_path)
46
+ puts "Copied #{source} to #{target_path}"
47
+ else
48
+ ActiveStorage::Ocr::Binary.install!(path: path, variant: variant)
49
+ end
36
50
  end
37
51
 
38
52
  desc "Check OCR server health"
@@ -42,7 +56,11 @@ module ActiveStorage
42
56
  puts "OCR server is healthy"
43
57
  info = client.server_info
44
58
  puts " Version: #{info[:version]}"
45
- puts " Supported formats: #{info[:supported_formats].join(', ')}"
59
+ puts " Default engine: #{info[:default_engine]}"
60
+ puts " Available engines:"
61
+ info[:available_engines].each do |engine|
62
+ puts " - #{engine[:name]}: #{engine[:description]}"
63
+ end
46
64
  else
47
65
  puts "OCR server is not responding"
48
66
  exit 1
@@ -54,7 +72,8 @@ module ActiveStorage
54
72
  binary = ActiveStorage::Ocr::Binary.binary_path
55
73
  unless File.executable?(binary)
56
74
  puts "Binary not found. Installing..."
57
- ActiveStorage::Ocr::Binary.install!
75
+ variant = (ENV["variant"] || "ocrs").to_sym
76
+ ActiveStorage::Ocr::Binary.install!(variant: variant)
58
77
  end
59
78
 
60
79
  config = ActiveStorage::Ocr.configuration
@@ -72,6 +91,59 @@ module ActiveStorage
72
91
  puts "Binary path: #{ActiveStorage::Ocr::Binary.binary_path}"
73
92
  puts "Installed: #{ActiveStorage::Ocr::Binary.installed?}"
74
93
  puts "Version: #{ActiveStorage::Ocr::Binary.version}"
94
+ puts ""
95
+ puts "Available variants:"
96
+ ActiveStorage::Ocr::Binary::VARIANTS.each do |name, info|
97
+ puts " #{name}: #{info[:description]}"
98
+ end
99
+ end
100
+
101
+ desc "Compare OCR engines on a file (file=/path/to/image.png)"
102
+ task compare: :environment do
103
+ file_path = ENV["file"]
104
+ unless file_path
105
+ puts "Usage: rake activestorage_ocr:compare file=/path/to/image.png"
106
+ exit 1
107
+ end
108
+
109
+ unless File.exist?(file_path)
110
+ puts "File not found: #{file_path}"
111
+ exit 1
112
+ end
113
+
114
+ client = ActiveStorage::Ocr::Client.new
115
+ puts "Comparing OCR engines on: #{file_path}"
116
+ puts ""
117
+
118
+ begin
119
+ comparison = client.compare_from_path(file_path)
120
+
121
+ comparison.each do |engine, result|
122
+ puts "#{engine.to_s.upcase}:"
123
+ puts " Text length: #{result.text.length} characters"
124
+ puts " Confidence: #{(result.confidence * 100).round(1)}%"
125
+ puts " Processing time: #{result.processing_time_ms}ms"
126
+ if result.warnings.any?
127
+ puts " Warnings: #{result.warnings.join(', ')}"
128
+ end
129
+ puts ""
130
+ end
131
+
132
+ # Summary
133
+ faster = comparison.min_by { |_, r| r.processing_time_ms }
134
+ higher_conf = comparison.max_by { |_, r| r.confidence }
135
+ puts "Summary:"
136
+ puts " Faster engine: #{faster[0]} (#{faster[1].processing_time_ms}ms)"
137
+ puts " Higher confidence: #{higher_conf[0]} (#{(higher_conf[1].confidence * 100).round(1)}%)"
138
+ rescue ActiveStorage::Ocr::ConnectionError => e
139
+ puts "Error: #{e.message}"
140
+ puts "Make sure the OCR server is running with both engines enabled."
141
+ exit 1
142
+ rescue ActiveStorage::Ocr::ServerError => e
143
+ puts "Server error: #{e.message}"
144
+ puts "This engine may not be available. Check server configuration."
145
+ exit 1
146
+ end
75
147
  end
76
148
  end
77
149
  end
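The summary step of the `compare` task relies on `min_by` and `max_by` returning a `[key, value]` pair when called on a Hash. A minimal sketch of that step follows. `SampleResult` is a hypothetical stand-in for the gem's Result object, the confidence values are taken from the README example, and the timings are made up purely for illustration.

```ruby
# Hypothetical stand-in for ActiveStorage::Ocr::Result with only the
# two fields the summary needs.
SampleResult = Struct.new(:confidence, :processing_time_ms, keyword_init: true)

# Illustrative data only: timings are invented, not measurements.
comparison = {
  ocrs: SampleResult.new(confidence: 0.95, processing_time_ms: 150),
  leptess: SampleResult.new(confidence: 0.92, processing_time_ms: 420)
}

# On a Hash, min_by/max_by yield [key, value] and return the winning pair.
faster_name, faster = comparison.min_by { |_, r| r.processing_time_ms }
best_name, best = comparison.max_by { |_, r| r.confidence }

puts "Faster engine: #{faster_name} (#{faster.processing_time_ms}ms)"
puts "Higher confidence: #{best_name} (#{(best.confidence * 100).round(1)}%)"
```

Destructuring the pair into two names avoids the `faster[0]` / `faster[1]` indexing used in the task and reads the same way.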
@@ -27,6 +27,13 @@ module ActiveStorage
27
27
  # Array of warning messages from the OCR server.
28
28
  attr_reader :warnings
29
29
 
30
+ # The OCR engine that processed this result (e.g., "ocrs" or "leptess").
31
+ attr_reader :engine
32
+
33
+ # Preprocessing statistics (Hash with :preset, :total_time_ms, :steps).
34
+ # nil if preprocessing was skipped.
35
+ attr_reader :preprocessing
36
+
30
37
  # Creates a new Result.
31
38
  #
32
39
  # ==== Parameters
@@ -35,11 +42,15 @@ module ActiveStorage
35
42
  # * +confidence+ - Confidence score (0.0 to 1.0)
36
43
  # * +processing_time_ms+ - Processing time in milliseconds
37
44
  # * +warnings+ - Array of warning messages (optional)
38
- def initialize(text:, confidence:, processing_time_ms:, warnings: [])
45
+ # * +engine+ - The OCR engine used (optional)
46
+ # * +preprocessing+ - Preprocessing stats hash (optional)
47
+ def initialize(text:, confidence:, processing_time_ms:, warnings: [], engine: nil, preprocessing: nil)
39
48
  @text = text
40
49
  @confidence = confidence
41
50
  @processing_time_ms = processing_time_ms
42
51
  @warnings = warnings
52
+ @engine = engine
53
+ @preprocessing = preprocessing
43
54
  end
44
55
 
45
56
  # Returns whether OCR successfully extracted text.
@@ -51,6 +62,16 @@ module ActiveStorage
51
62
  !text.nil? && !text.empty?
52
63
  end
53
64
 
65
+ # Returns the preprocessing time in milliseconds, or 0 if not preprocessed.
66
+ def preprocessing_time_ms
67
+ preprocessing&.dig(:total_time_ms) || 0
68
+ end
69
+
70
+ # Returns the preprocessing preset used, or nil if not preprocessed.
71
+ def preprocessing_preset
72
+ preprocessing&.dig(:preset)
73
+ end
74
+
54
75
  # Converts the result to a Hash.
55
76
  #
56
77
  # ==== Returns
@@ -61,7 +82,9 @@ module ActiveStorage
61
82
  text: text,
62
83
  confidence: confidence,
63
84
  processing_time_ms: processing_time_ms,
64
- warnings: warnings
85
+ warnings: warnings,
86
+ engine: engine,
87
+ preprocessing: preprocessing
65
88
  }
66
89
  end
67
90
 
@@ -71,11 +94,12 @@ module ActiveStorage
71
94
  #
72
95
  # ==== Returns
73
96
  #
74
- # A Hash with +:ocr_text+, +:ocr_confidence+, and +:ocr_processed_at+.
97
+ # A Hash with +:ocr_text+, +:ocr_confidence+, +:ocr_engine+, and +:ocr_processed_at+.
75
98
  def to_metadata
76
99
  {
77
100
  ocr_text: text,
78
101
  ocr_confidence: confidence,
102
+ ocr_engine: engine,
79
103
  ocr_processed_at: Time.now.utc.iso8601
80
104
  }
81
105
  end
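The two preprocessing helpers depend on the stats hash using Symbol keys, which is consistent with the `data[:text]` style lookups elsewhere in the diff. The sketch below isolates that logic as free functions. The function names and the sample hash are illustrative assumptions, not server output.

```ruby
require "json"

# Returns the total preprocessing time, or 0 when preprocessing was skipped (nil).
def preprocessing_time_ms(preprocessing)
  preprocessing&.dig(:total_time_ms) || 0
end

# Returns the preset name, or nil when preprocessing was skipped.
def preprocessing_preset(preprocessing)
  preprocessing&.dig(:preset)
end

# Illustrative payload; the field names follow the Result documentation in the diff.
payload = '{"preset":"default","total_time_ms":12,"steps":[]}'
stats = JSON.parse(payload, symbolize_names: true)

puts preprocessing_time_ms(stats)
puts preprocessing_preset(stats)
puts preprocessing_time_ms(nil)

# With String keys the Symbol lookups miss and the helper falls back to 0.
puts preprocessing_time_ms(JSON.parse(payload))
```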
@@ -2,6 +2,6 @@
2
2
 
3
3
  module ActiveStorage
4
4
  module Ocr
5
- VERSION = "0.1.2"
5
+ VERSION = "0.3.0"
6
6
  end
7
7
  end
@@ -19,14 +19,20 @@ module ActiveStorage
19
19
  # OCR support for Rails Active Storage attachments.
20
20
  #
21
21
  # This module provides optical character recognition (OCR) for files stored
22
- # with Active Storage using a high-performance Rust server with the pure-Rust
23
- # +ocrs+ OCR engine.
22
+ # with Active Storage using a high-performance Rust server.
23
+ #
24
+ # == OCR Engines
25
+ #
26
+ # Two engines are available:
27
+ # * +:ocrs+ - Pure Rust engine (default). Fast, no system dependencies.
28
+ # * +:leptess+ - Tesseract-based engine. Better for noisy/messy images.
24
29
  #
25
30
  # == Configuration
26
31
  #
27
32
  # ActiveStorage::Ocr.configure do |config|
28
33
  # config.server_url = "http://localhost:9292"
29
34
  # config.timeout = 30
35
+ # config.engine = :leptess # Use Tesseract instead of default ocrs
30
36
  # end
31
37
  #
32
38
  # == Basic Usage
@@ -35,6 +41,16 @@ module ActiveStorage
35
41
  # result = ActiveStorage::Ocr.extract_text(document.file)
36
42
  # result.text # => "Extracted text..."
37
43
  # result.confidence # => 0.95
44
+ # result.engine # => "ocrs"
45
+ #
46
+ # # Use a specific engine for one request
47
+ # client = ActiveStorage::Ocr::Client.new
48
+ # result = client.extract_text(document.file, engine: :leptess)
49
+ #
50
+ # # Compare both engines
51
+ # comparison = client.compare(document.file)
52
+ # comparison[:ocrs].text # => "Text from ocrs..."
53
+ # comparison[:leptess].text # => "Text from leptess..."
38
54
  #
39
55
  # == Error Handling
40
56
  #
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: activestorage-ocr
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Michael Rispoli