@luii/node-tesseract-ocr 2.0.13 → 2.3.2

package/README.md CHANGED
@@ -16,24 +16,23 @@ Native C++ addon for Node.js that exposes Tesseract OCR (`libtesseract-dev`) to
16
16
  - [Enums](#enums)
17
17
  - [Types](#types)
18
18
  - [Tesseract API](#tesseract-api)
19
- - [Example](#example)
20
19
  - [License](#license)
21
- - [Special Thanks](#special-thanks)
22
20
 
23
21
  ## Features
24
22
 
25
23
  - Native bindings to Tesseract (prebuilds via `pkg-prebuilds`)
26
24
  - Access to Tesseract enums and configuration from TypeScript
27
25
  - Progress callback and multiple output formats
26
+ - Lazy download of missing traineddata (configurable)
28
27
 
29
28
  ## Prerequisites
30
29
 
31
30
  - nodejs
32
31
  - node-addon-api
33
32
  - C++ build toolchain (e.g. `build-essential`)
34
- - libtesseract-dev
33
+ - libtesseract-dev (exactly `5.5.2`)
35
34
  - libleptonica-dev
36
- - Tesseract training data (eng, deu, ...)
35
+ - Tesseract training data (eng, deu, ...), or let the library download it on first use
37
36
 
38
37
  > See [Install](#install)
39
38
 
@@ -44,6 +43,15 @@ sudo apt update
44
43
  sudo apt install -y nodejs npm build-essential pkg-config libtesseract-dev libleptonica-dev tesseract-ocr-eng
45
44
  ```
46
45
 
46
+ Verify the required Tesseract version:
47
+
48
+ ```bash
49
+ pkg-config --modversion tesseract
50
+ # expected: 5.5.2
51
+ ```
52
+
53
+ If your distro ships another version, install/build `tesseract 5.5.2` and ensure `pkg-config` resolves that installation.
54
+
47
55
  ```bash
48
56
  git clone git@github.com:luii/node-tesseract-ocr.git
49
57
  cd node-tesseract-ocr
@@ -59,7 +67,9 @@ Install additional languages as needed, for example:
59
67
  sudo apt install -y tesseract-ocr-deu tesseract-ocr-eng tesseract-ocr-jpn
60
68
  ```
61
69
 
62
- If you install traineddata files manually, make sure `NODE_TESSERACT_DATAPATH` points to the directory that contains them (for example `/usr/share/tesseract-ocr/5/tessdata`).
70
+ If you install traineddata files manually, make sure `TESSDATA_PREFIX` points to the directory that contains them (for example `/usr/share/tessdata`).
71
+
72
+ If traineddata is missing, this package will download it lazily during `init` by default. You can control this behavior via `ensureTraineddata`, `cachePath`, and `dataPath`.
63
73
 
64
74
  ## Build
65
75
 
@@ -73,12 +83,14 @@ npm run build:release
73
83
 
74
84
  ## Start
75
85
 
76
- Set `NODE_TESSERACT_DATAPATH` to your traineddata directory (usually `/usr/share/tesseract-ocr/5/tessdata`).
86
+ Set `TESSDATA_PREFIX` to your traineddata directory (usually `/usr/share/tesseract-ocr/5/tessdata` or `/usr/share/tessdata`).
77
87
 
78
88
  ```sh
79
- env NODE_TESSERACT_DATAPATH=/usr/share/tesseract-ocr/5/tessdata node path/to/your/app.js
89
+ env TESSDATA_PREFIX=/usr/share/tessdata node path/to/your/app.js
80
90
  ```
81
91
 
92
+ If you prefer automatic downloads, you can skip setting `TESSDATA_PREFIX`; missing traineddata is then downloaded into the default cache directory on first use.
93
+
82
94
  ## Scripts
83
95
 
84
96
  ```bash
@@ -86,9 +98,6 @@ env NODE_TESSERACT_DATAPATH=/usr/share/tesseract-ocr/5/tessdata node path/to/you
86
98
  npm run build:debug
87
99
  npm run build:release
88
100
 
89
- # Build precompiled binaries for distribution
90
- npm run prebuild
91
-
92
101
  # Run the JS example (builds debug first)
93
102
  npm run example:recognize
94
103
 
@@ -100,8 +109,73 @@ npm run test:js:watch
100
109
 
101
110
  ## Examples
102
111
 
112
+ ### Run Included Example
113
+
103
114
  ```sh
104
- env NODE_TESSERACT_DATAPATH=/usr/share/tesseract-ocr/5/tessdata npm run example:recognize
115
+ env TESSDATA_PREFIX=/usr/share/tessdata npm run example:recognize
116
+ ```
117
+
118
+ ### Basic OCR (Local Traineddata)
119
+
120
+ You can find a similar example in the `examples/` folder of the project.
121
+
122
+ ```ts
123
+ import fs from "node:fs";
124
+ import Tesseract from "node-tesseract-ocr";
125
+
126
+ process.env.TESSDATA_PREFIX = "/usr/share/tessdata/";
127
+
128
+ async function main() {
129
+ const tesseract = new Tesseract();
130
+ await tesseract.init({
131
+ langs: ["eng"],
132
+ });
133
+
134
+ const buffer = fs.readFileSync("example1.png");
135
+ await tesseract.setImage(buffer);
136
+ await tesseract.recognize((info) => {
137
+ console.log(`Progress: ${info.percent}%`);
138
+ });
139
+
140
+ const text = await tesseract.getUTF8Text();
141
+ console.log(text);
142
+
143
+ await tesseract.end();
144
+ }
145
+
146
+ main().catch((err) => {
147
+ console.error(err);
148
+ process.exit(1);
149
+ });
150
+ ```
151
+
152
+ ### Lazy Traineddata Download (Default)
153
+
154
+ ```ts
155
+ import fs from "node:fs";
156
+ import Tesseract from "node-tesseract-ocr";
157
+
158
+ async function main() {
159
+ const tesseract = new Tesseract();
160
+ await tesseract.init({
161
+ langs: ["eng"],
162
+ ensureTraineddata: true,
162
+ dataPath: "./tessdata-local",
164
+ });
165
+
166
+ const buffer = fs.readFileSync("example1.png");
167
+ await tesseract.setImage(buffer);
168
+ await tesseract.recognize();
169
+ const text = await tesseract.getUTF8Text();
170
+ console.log(text);
171
+
172
+ await tesseract.end();
173
+ }
174
+
175
+ main().catch((err) => {
176
+ console.error(err);
177
+ process.exit(1);
178
+ });
105
179
  ```
106
180
 
107
181
  ## Public API
@@ -151,13 +225,17 @@ Full list of page segmentation modes from Tesseract.
151
225
 
152
226
  #### `TesseractInitOptions`
153
227
 
154
- | Field | Type | Optional | Default | Description |
155
- | ----------------------- | ----------------------------------------------------------------------------------------------------- | -------- | ----------- | --------------------------------------- |
156
- | `lang` | [`Language[]`](#availablelanguages) | Yes | `undefined` | Languages to load as an array. |
157
- | `oem` | [`OcrEngineMode`](#ocrenginemode) | Yes | `undefined` | OCR engine mode. |
158
- | `vars` | `Partial<Record<keyof ConfigurationVariables, ConfigurationVariables[keyof ConfigurationVariables]>>` | Yes | `undefined` | Variables to set. |
159
- | `configs` | `Array<string>` | Yes | `undefined` | Tesseract config files to apply. |
160
- | `setOnlyNonDebugParams` | `boolean` | Yes | `undefined` | If true, only non-debug params are set. |
228
+ | Field | Type | Optional | Default | Description |
229
+ | ----------------------- | ----------------------------------------------------------------------------------------------------- | -------- | -------------------------------------- | --------------------------------------- |
230
+ | `langs` | [`Language[]`](#availablelanguages) | Yes | `undefined` | Languages to load as an array. |
231
+ | `oem` | [`OcrEngineMode`](#ocrenginemode) | Yes | `undefined` | OCR engine mode. |
232
+ | `vars` | `Partial<Record<keyof ConfigurationVariables, ConfigurationVariables[keyof ConfigurationVariables]>>` | Yes | `undefined` | Variables to set. |
233
+ | `configs` | `Array<string>` | Yes | `undefined` | Tesseract config files to apply. |
234
+ | `setOnlyNonDebugParams` | `boolean` | Yes | `undefined` | If true, only non-debug params are set. |
235
+ | `ensureTraineddata` | `boolean` | Yes | `true` | Download missing traineddata lazily. |
236
+ | `cachePath` | `string` | Yes | `~/.cache/node-tesseract-ocr/tessdata` | Cache directory for downloads. |
237
+ | `dataPath` | `string` | Yes | `TESSDATA_PREFIX` or `cachePath` | Directory used by Tesseract for data. |
238
+ | `progressCallback` | `(info: TrainingDataDownloadProgress) => void` | Yes | `undefined` | Download progress callback. |
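
The defaults in the table above suggest a lookup order for the traineddata directory. The following is a minimal sketch of that order, not the package's actual implementation: `resolveDataPath` and `DataPathOptions` are hypothetical names, and the precedence (explicit `dataPath`, then `TESSDATA_PREFIX`, then `cachePath`) is an assumption read off the `Default` column.

```typescript
// Local stand-in for the path-related fields of TesseractInitOptions.
interface DataPathOptions {
  cachePath?: string;
  dataPath?: string;
}

// Hypothetical helper (not exported by the package): returns the directory
// Tesseract would read traineddata from, under the assumed precedence.
function resolveDataPath(
  options: DataPathOptions,
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  // Documented default for `cachePath`, with `~` expanded to `homeDir`.
  const cachePath =
    options.cachePath ?? `${homeDir}/.cache/node-tesseract-ocr/tessdata`;
  // An empty TESSDATA_PREFIX is treated as unset (an assumption).
  const fromEnv = env["TESSDATA_PREFIX"] || undefined;
  // Assumed order: explicit option, then environment, then cache directory.
  return options.dataPath ?? fromEnv ?? cachePath;
}
```

In a real application, `env` would typically be `process.env` and `homeDir` the result of `os.homedir()`; they are passed in here so the sketch stays free of Node-specific imports.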
161
239
 
162
240
  #### `TesseractSetRectangleOptions`
163
241
 
@@ -180,6 +258,18 @@ Full list of page segmentation modes from Tesseract.
180
258
  | `bottom` | `number` | No | n/a | Bottom coordinate of current element bbox. |
181
259
  | `left` | `number` | No | n/a | Left coordinate of current element bbox. |
182
260
 
261
+ #### `TesseractProcessPagesStatus`
262
+
263
+ | Field | Type | Optional | Default | Description |
264
+ | ----------------- | --------- | -------- | ------- | ----------------------------------------------------- |
265
+ | `active` | `boolean` | No | n/a | Whether a multipage session is currently active. |
266
+ | `healthy` | `boolean` | No | n/a | Whether the renderer is healthy. |
267
+ | `processedPages` | `number` | No | n/a | Number of pages already processed in this session. |
268
+ | `nextPageIndex` | `number` | No | n/a | Zero-based index that will be used for the next page. |
269
+ | `outputBase` | `string` | No | n/a | Effective output base used by the PDF renderer. |
270
+ | `timeoutMillisec` | `number` | No | n/a | Timeout per page in milliseconds (`0` = unlimited). |
271
+ | `textonly` | `boolean` | No | n/a | Whether text-only PDF mode is enabled. |
272
+
183
273
  #### `DetectOrientationScriptResult`
184
274
 
185
275
  | Field | Type | Optional | Default | Description |
@@ -199,13 +289,92 @@ new Tesseract();
199
289
 
200
290
  Creates a new Tesseract instance.
201
291
 
292
+ #### Initialization Requirements
293
+
294
+ Call `init(...)` once before using OCR/engine-dependent methods.
295
+
296
+ Methods that do **not** require `init(...)`:
297
+
298
+ - `version()`
299
+ - `isInitialized()`
300
+ - `setInputName(...)`
301
+ - `getInputName()`
302
+ - `abortProcessPages()`
303
+ - `getProcessPagesStatus()`
304
+ - `document.abort()`
305
+ - `document.status()`
306
+ - `init(...)`
307
+ - `end()`
308
+
309
+ Methods that **require** `init(...)`:
310
+
311
+ - `setInputImage(...)`
312
+ - `getInputImage()`
313
+ - `getSourceYResolution()`
314
+ - `getDataPath()`
315
+ - `setOutputName(...)`
316
+ - `clearPersistentCache()`
317
+ - `clearAdaptiveClassifier()`
318
+ - `setImage(...)`
319
+ - `getThresholdedImage()`
320
+ - `getThresholdedImageScaleFactor()`
321
+ - `setPageMode(...)`
322
+ - `setRectangle(...)`
323
+ - `setSourceResolution(...)`
324
+ - `recognize(...)`
325
+ - `detectOrientationScript()`
326
+ - `meanTextConf()`
327
+ - `allWordConfidences()`
328
+ - `getPAGEText(...)`
329
+ - `getLSTMBoxText(...)`
330
+ - `getBoxText(...)`
331
+ - `getWordStrBoxText(...)`
332
+ - `getOSDText(...)`
333
+ - `getUTF8Text()`
334
+ - `getHOCRText(...)`
335
+ - `getTSVText(...)`
336
+ - `getUNLVText()`
337
+ - `getALTOText(...)`
338
+ - `getInitLanguages()`
339
+ - `getLoadedLanguages()`
340
+ - `getAvailableLanguages()`
341
+ - `setDebugVariable(...)`
342
+ - `setVariable(...)`
343
+ - `getIntVariable(...)`
344
+ - `getBoolVariable(...)`
345
+ - `getDoubleVariable(...)`
346
+ - `getStringVariable(...)`
347
+ - `clear()`
348
+ - `beginProcessPages(...)`
349
+ - `addProcessPage(...)`
350
+ - `finishProcessPages()`
351
+ - `document.begin(...)`
352
+ - `document.addPage(...)`
353
+ - `document.finish()`
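
Because most methods in the second list fail without a prior `init(...)`, a small guard can be convenient. This is a minimal sketch built only on the documented `isInitialized()` and `init(...)` signatures; `ensureInitialized` and `InitLike` are hypothetical names, and the options type is left generic rather than importing `TesseractInitOptions`.

```typescript
// Local stand-in for the two instance methods the guard needs.
interface InitLike<O> {
  isInitialized(): Promise<boolean>;
  init(options: O): Promise<void>;
}

// Hypothetical helper: calls init(...) only when the engine is not ready yet.
// Resolves to true if init(...) was called, false if it was skipped.
async function ensureInitialized<O>(
  engine: InitLike<O>,
  options: O,
): Promise<boolean> {
  if (await engine.isInitialized()) {
    return false;
  }
  await engine.init(options);
  return true;
}
```

A caller could run `await ensureInitialized(tesseract, { langs: ["eng"] })` before `setImage(...)`, so repeated entry points do not have to track engine state themselves.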
354
+
355
+ #### version
356
+
357
+ Returns the currently loaded libtesseract version string.
358
+
359
+ ```ts
360
+ version(): Promise<string>
361
+ ```
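
Since the prerequisites ask for exactly `5.5.2`, the string from `version()` can be checked at startup. The sketch below is illustrative only: `matchesRequiredVersion` is a hypothetical helper, and it assumes the reported string begins with a `major.minor.patch` triple, which this README does not specify.

```typescript
// Hypothetical helper: compares the leading "major.minor.patch" part of a
// reported version string with the required release.
function matchesRequiredVersion(reported: string, required = "5.5.2"): boolean {
  // Accept an optional leading "v" and ignore any suffix such as "-dev".
  const match = /^\s*v?(\d+\.\d+\.\d+)/.exec(reported);
  return match !== null && match[1] === required;
}
```

Typical use would be `matchesRequiredVersion(await tesseract.version())`, followed by a warning or an early exit when it returns `false`.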
362
+
363
+ #### isInitialized
364
+
365
+ Returns whether `init(...)` has already completed successfully and has not been reset via `end()`.
366
+
367
+ ```ts
368
+ isInitialized(): Promise<boolean>
369
+ ```
370
+
202
371
  #### init
203
372
 
204
- Initializes Tesseract with language, engine mode, configs, and variables.
373
+ Initializes the OCR engine with language, OEM, configs, and variables.
205
374
 
206
- | Name | Type | Optional | Default | Description |
207
- | ------- | ----------------------------------------------- | -------- | ------- | ----------------------- |
208
- | options | [`TesseractInitOptions`](#tesseractinitoptions) | No | n/a | Initialization options. |
375
+ | Name | Type | Optional | Default | Description |
376
+ | --------- | ----------------------------------------------- | -------- | ------- | ----------------------- |
377
+ | `options` | [`TesseractInitOptions`](#tesseractinitoptions) | No | n/a | Initialization options. |
209
378
 
210
379
  ```ts
211
380
  init(options: TesseractInitOptions): Promise<void>
@@ -213,56 +382,282 @@ init(options: TesseractInitOptions): Promise<void>
213
382
 
214
383
  #### initForAnalysePage
215
384
 
216
- Initializes for layout analysis only.
385
+ Initializes the engine in analysis-only mode.
217
386
 
218
387
  ```ts
219
388
  initForAnalysePage(): Promise<void>
220
389
  ```
221
390
 
222
- #### analysePage
391
+ #### analyseLayout
223
392
 
224
- Runs the layout analysis.
393
+ Runs page layout analysis on the current image.
225
394
 
226
- | Name | Type | Optional | Default | Description |
227
- | ----------------- | ------- | -------- | ------- | ------------------------------- |
228
- | mergeSimilarWords | boolean | No | n/a | Whether to merge similar words. |
395
+ | Name | Type | Optional | Default | Description |
396
+ | ------------------- | --------- | -------- | ------- | ------------------------------------------- |
397
+ | `mergeSimilarWords` | `boolean` | No | n/a | Merge similar words during layout analysis. |
229
398
 
230
399
  ```ts
231
- analysePage(mergeSimilarWords: boolean): Promise<void>
400
+ analyseLayout(mergeSimilarWords: boolean): Promise<void>
401
+ ```
402
+
403
+ #### setInputName
404
+
405
+ Sets the source/input name used by renderer/training APIs.
406
+
407
+ | Name | Type | Optional | Default | Description |
408
+ | ----------- | -------- | -------- | ------- | ------------------------------------------ |
409
+ | `inputName` | `string` | No | n/a | Input name used by renderer/training APIs. |
410
+
411
+ ```ts
412
+ setInputName(inputName: string): Promise<void>
413
+ ```
414
+
415
+ #### getInputName
416
+
417
+ Returns the current input name from engine state.
418
+
419
+ ```ts
420
+ getInputName(): Promise<string>
421
+ ```
422
+
423
+ #### setInputImage
424
+
425
+ Sets the encoded source image buffer.
426
+
427
+ | Name | Type | Optional | Default | Description |
428
+ | -------- | -------- | -------- | ------- | ---------------------------- |
429
+ | `buffer` | `Buffer` | No | n/a | Encoded source image buffer. |
430
+
431
+ ```ts
432
+ setInputImage(buffer: Buffer): Promise<void>
433
+ ```
434
+
435
+ #### getInputImage
436
+
437
+ Returns the current input image bytes.
438
+
439
+ ```ts
440
+ getInputImage(): Promise<Buffer>
441
+ ```
442
+
443
+ #### getSourceYResolution
444
+
445
+ Returns source image Y resolution (DPI).
446
+
447
+ ```ts
448
+ getSourceYResolution(): Promise<number>
449
+ ```
450
+
451
+ #### getDataPath
452
+
453
+ Returns the active tessdata path from the engine.
454
+
455
+ ```ts
456
+ getDataPath(): Promise<string>
457
+ ```
458
+
459
+ #### setOutputName
460
+
461
+ Sets the output base name for renderer-based outputs.
462
+
463
+ | Name | Type | Optional | Default | Description |
464
+ | ------------ | -------- | -------- | ------- | -------------------------------------- |
465
+ | `outputName` | `string` | No | n/a | Output base name for renderer outputs. |
466
+
467
+ ```ts
468
+ setOutputName(outputName: string): Promise<void>
469
+ ```
470
+
471
+ #### clearPersistentCache
472
+
473
+ Clears global library-level caches (for example dictionaries).
474
+
475
+ ```ts
476
+ clearPersistentCache(): Promise<void>
477
+ ```
478
+
479
+ #### clearAdaptiveClassifier
480
+
481
+ Cleans adaptive classifier state between pages/documents.
482
+
483
+ ```ts
484
+ clearAdaptiveClassifier(): Promise<void>
485
+ ```
486
+
487
+ #### setImage
488
+
489
+ Sets the image used by OCR recognition.
490
+
491
+ | Name | Type | Optional | Default | Description |
492
+ | -------- | -------- | -------- | ------- | ------------------------ |
493
+ | `buffer` | `Buffer` | No | n/a | Image data used for OCR. |
494
+
495
+ ```ts
496
+ setImage(buffer: Buffer): Promise<void>
497
+ ```
498
+
499
+ #### getThresholdedImage
500
+
501
+ Returns thresholded image bytes from Tesseract internals.
502
+
503
+ ```ts
504
+ getThresholdedImage(): Promise<Buffer>
505
+ ```
506
+
507
+ #### getThresholdedImageScaleFactor
508
+
509
+ Returns scale factor for thresholded/component images.
510
+
511
+ ```ts
512
+ getThresholdedImageScaleFactor(): Promise<number>
232
513
  ```
233
514
 
234
515
  #### setPageMode
235
516
 
236
- Sets the page segmentation mode.
517
+ Sets the page segmentation mode (PSM).
237
518
 
238
- | Name | Type | Optional | Default | Description |
239
- | ---- | ------------------------------------------------ | -------- | ------- | ----------------------- |
240
- | psm | [`PageSegmentationMode`](#pagesegmentationmodes) | No | n/a | Page segmentation mode. |
519
+ | Name | Type | Optional | Default | Description |
520
+ | ----- | ----------------------------------------------- | -------- | ------- | ----------------------- |
521
+ | `psm` | [`PageSegmentationMode`](#pagesegmentationmode) | No | n/a | Page segmentation mode. |
241
522
 
242
523
  ```ts
243
524
  setPageMode(psm: PageSegmentationMode): Promise<void>
244
525
  ```
245
526
 
527
+ #### setRectangle
528
+
529
+ Restricts recognition to the given rectangle.
530
+
531
+ | Name | Type | Optional | Default | Description |
532
+ | --------- | --------------------------------------------------------------- | -------- | ------- | ------------------ |
533
+ | `options` | [`TesseractSetRectangleOptions`](#tesseractsetrectangleoptions) | No | n/a | Region definition. |
534
+
535
+ ```ts
536
+ setRectangle(options: TesseractSetRectangleOptions): Promise<void>
537
+ ```
538
+
539
+ #### setSourceResolution
540
+
541
+ Sets the source resolution in PPI.
542
+
543
+ | Name | Type | Optional | Default | Description |
544
+ | ----- | -------- | -------- | ------- | ------------------------- |
545
+ | `ppi` | `number` | No | n/a | Source resolution in PPI. |
546
+
547
+ ```ts
548
+ setSourceResolution(ppi: number): Promise<void>
549
+ ```
550
+
551
+ #### document
552
+
553
+ Facade for multipage PDF/document processing lifecycle.
554
+
555
+ ```ts
556
+ document: {
557
+ begin(options: TesseractBeginProcessPagesOptions): Promise<void>;
558
+ addPage(buffer: Buffer, filename?: string): Promise<void>;
559
+ finish(): Promise<string>;
560
+ abort(): Promise<void>;
561
+ status(): Promise<TesseractProcessPagesStatus>;
562
+ }
563
+ ```
564
+
565
+ #### document.begin
566
+
567
+ Starts a multipage processing session.
568
+
569
+ | Name | Type | Optional | Default | Description |
570
+ | --------- | ----------------------------------- | -------- | ------- | --------------------------- |
571
+ | `options` | `TesseractBeginProcessPagesOptions` | No | n/a | Multipage renderer options. |
572
+
573
+ ```ts
574
+ document.begin(options: TesseractBeginProcessPagesOptions): Promise<void>
575
+ ```
576
+
577
+ #### document.addPage
578
+
579
+ Adds an encoded page to the active session.
580
+
581
+ | Name | Type | Optional | Default | Description |
582
+ | ---------- | -------- | -------- | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
583
+ | `buffer` | `Buffer` | No | n/a | Encoded page image buffer. |
584
+ | `filename` | `string` | Yes | `undefined` | Optional source filename/path passed to Tesseract `ProcessPage` for this page. Tesseract/Leptonica may open this file internally and use it as the source image for parts of PDF rendering. If output pages look wrong (for example inverted or visually corrupted), pass a real image path here to force a stable source image path for that page. |
585
+
586
+ ```ts
587
+ document.addPage(buffer: Buffer, filename?: string): Promise<void>
588
+ ```
589
+
590
+ #### document.finish
591
+
592
+ Finalizes the active session and returns output PDF path.
593
+
594
+ ```ts
595
+ document.finish(): Promise<string>
596
+ ```
597
+
598
+ #### document.abort
599
+
600
+ Aborts and resets the active multipage session.
601
+
602
+ ```ts
603
+ document.abort(): Promise<void>
604
+ ```
605
+
606
+ #### document.status
607
+
608
+ Returns the current multipage session status (active flag, page counters, and effective renderer settings).
609
+
610
+ ```ts
611
+ document.status(): Promise<TesseractProcessPagesStatus>
612
+ ```
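
The `document` methods above form a begin/add/finish lifecycle with `abort()` as the reset path. The following sketch shows one way to drive it, assuming only the documented signatures; `renderPages` and `DocumentLike` are hypothetical names, the options type is generic because the fields of `TesseractBeginProcessPagesOptions` are not listed here, and `Uint8Array` stands in for `Buffer` (which is a `Uint8Array` subclass in Node.js).

```typescript
// Local stand-in for the documented `document` facade.
interface DocumentLike<O> {
  begin(options: O): Promise<void>;
  addPage(buffer: Uint8Array, filename?: string): Promise<void>;
  finish(): Promise<string>;
  abort(): Promise<void>;
}

// Hypothetical helper: renders all pages in one session and returns the
// output path reported by finish(). On any failure the session is aborted
// so the instance can start a fresh one later.
async function renderPages<O>(
  doc: DocumentLike<O>,
  options: O,
  pages: readonly Uint8Array[],
): Promise<string> {
  await doc.begin(options);
  try {
    for (const page of pages) {
      await doc.addPage(page);
    }
    return await doc.finish();
  } catch (err) {
    await doc.abort();
    throw err;
  }
}
```

Pages are added sequentially on purpose: the status type exposes a single `nextPageIndex`, which suggests the session expects one page at a time.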
613
+
614
+ #### getProcessPagesStatus
615
+
616
+ Returns the current multipage session status from the instance API.
617
+
618
+ ```ts
619
+ getProcessPagesStatus(): Promise<TesseractProcessPagesStatus>
620
+ ```
621
+
622
+ #### setDebugVariable
623
+
624
+ Sets a debug configuration variable.
625
+
626
+ | Name | Type | Optional | Default | Description |
627
+ | ------- | -------------------------------------------------------------- | -------- | ------- | --------------- |
628
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
629
+ | `value` | `SetVariableConfigVariables[keyof SetVariableConfigVariables]` | No | n/a | Variable value. |
630
+
631
+ ```ts
632
+ setDebugVariable(
633
+ name: keyof SetVariableConfigVariables,
634
+ value: SetVariableConfigVariables[keyof SetVariableConfigVariables],
635
+ ): Promise<boolean>
636
+ ```
637
+
246
638
  #### setVariable
247
639
 
248
- Sets a Tesseract variable. Returns `false` if the lookup failed.
640
+ Sets a regular configuration variable.
249
641
 
250
- | Name | Type | Optional | Default | Description |
251
- | ----- | -------------------------------------------------------------- | -------- | ------- | --------------- |
252
- | name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
253
- | value | SetVariableConfigVariables\[keyof SetVariableConfigVariables\] | No | n/a | Variable value. |
642
+ | Name | Type | Optional | Default | Description |
643
+ | ------- | -------------------------------------------------------------- | -------- | ------- | --------------- |
644
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
645
+ | `value` | `SetVariableConfigVariables[keyof SetVariableConfigVariables]` | No | n/a | Variable value. |
254
646
 
255
647
  ```ts
256
- setVariable(name: keyof SetVariableConfigVariables, value: SetVariableConfigVariables[keyof SetVariableConfigVariables]): Promise<boolean>
648
+ setVariable(
649
+ name: keyof SetVariableConfigVariables,
650
+ value: SetVariableConfigVariables[keyof SetVariableConfigVariables],
651
+ ): Promise<boolean>
257
652
  ```
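
`setVariable` resolves to a boolean, so the result is worth checking when several variables are applied in a row. This is a minimal sketch, assuming `false` means the variable was not accepted (the 2.0.13 README described it as a failed lookup). `applyVariables` and `VariableSetter` are hypothetical names, and plain strings stand in for the `SetVariableConfigVariables` key and value types.

```typescript
// Local stand-in for the documented setVariable method.
interface VariableSetter {
  setVariable(name: string, value: string): Promise<boolean>;
}

// Hypothetical helper: applies name/value pairs in order and returns the
// names for which setVariable resolved to false.
async function applyVariables(
  engine: VariableSetter,
  vars: ReadonlyArray<readonly [string, string]>,
): Promise<string[]> {
  const rejected: string[] = [];
  for (const [name, value] of vars) {
    const accepted = await engine.setVariable(name, value);
    if (!accepted) {
      rejected.push(name);
    }
  }
  return rejected;
}
```

For example, `applyVariables(tesseract, [["tessedit_char_whitelist", "0123456789"]])` would resolve to an empty array when the variable is accepted.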
258
653
 
259
654
  #### getIntVariable
260
655
 
261
- Reads an integer variable from Tesseract.
656
+ Reads a configuration variable as integer.
262
657
 
263
- | Name | Type | Optional | Default | Description |
264
- | ---- | -------------------------------- | -------- | ------- | -------------- |
265
- | name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
658
+ | Name | Type | Optional | Default | Description |
659
+ | ------ | ---------------------------------- | -------- | ------- | -------------- |
660
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
266
661
 
267
662
  ```ts
268
663
  getIntVariable(name: keyof SetVariableConfigVariables): Promise<number>
@@ -270,11 +665,11 @@ getIntVariable(name: keyof SetVariableConfigVariables): Promise<number>
270
665
 
271
666
  #### getBoolVariable
272
667
 
273
- Reads a boolean variable from Tesseract. Returns `0` or `1`.
668
+ Reads a configuration variable as boolean (`0`/`1`).
274
669
 
275
- | Name | Type | Optional | Default | Description |
276
- | ---- | -------------------------------- | -------- | ------- | -------------- |
277
- | name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
670
+ | Name | Type | Optional | Default | Description |
671
+ | ------ | ---------------------------------- | -------- | ------- | -------------- |
672
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
278
673
 
279
674
  ```ts
280
675
  getBoolVariable(name: keyof SetVariableConfigVariables): Promise<number>
@@ -282,11 +677,11 @@ getBoolVariable(name: keyof SetVariableConfigVariables): Promise<number>
282
677
 
283
678
  #### getDoubleVariable
284
679
 
285
- Reads a double variable from Tesseract.
680
+ Reads a configuration variable as double.
286
681
 
287
- | Name | Type | Optional | Default | Description |
288
- | ---- | -------------------------------- | -------- | ------- | -------------- |
289
- | name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
682
+ | Name | Type | Optional | Default | Description |
683
+ | ------ | ---------------------------------- | -------- | ------- | -------------- |
684
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
290
685
 
291
686
  ```ts
292
687
  getDoubleVariable(name: keyof SetVariableConfigVariables): Promise<number>
@@ -294,139 +689,175 @@ getDoubleVariable(name: keyof SetVariableConfigVariables): Promise<number>
294
689
 
295
690
  #### getStringVariable
296
691
 
297
- Reads a string variable from Tesseract.
692
+ Reads a configuration variable as string.
298
693
 
299
- | Name | Type | Optional | Default | Description |
300
- | ---- | -------------------------------- | -------- | ------- | -------------- |
301
- | name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
694
+ | Name | Type | Optional | Default | Description |
695
+ | ------ | ---------------------------------- | -------- | ------- | -------------- |
696
+ | `name` | `keyof SetVariableConfigVariables` | No | n/a | Variable name. |
302
697
 
303
698
  ```ts
304
699
  getStringVariable(name: keyof SetVariableConfigVariables): Promise<string>
305
700
  ```
306
701
 
307
- #### setImage
702
+ #### recognize
308
703
 
309
- Sets the image from a Buffer.
704
+ Runs OCR recognition (optionally with progress callback).
310
705
 
311
- | Name | Type | Optional | Default | Description |
312
- | ------ | ------ | -------- | ------- | ----------- |
313
- | buffer | Buffer | No | n/a | Image data. |
706
+ | Name | Type | Optional | Default | Description |
707
+ | ------------------ | ------------------------------------- | -------- | ----------- | ---------------------- |
708
+ | `progressCallback` | `(info: ProgressChangedInfo) => void` | Yes | `undefined` | OCR progress callback. |
314
709
 
315
710
  ```ts
316
- setImage(buffer: Buffer): Promise<void>
711
+ recognize(progressCallback?: (info: ProgressChangedInfo) => void): Promise<void>
317
712
  ```
318
713
 
319
- #### setRectangle
714
+ #### detectOrientationScript
715
+
716
+ Detects orientation and script with confidence values.
717
+
718
+ ```ts
719
+ detectOrientationScript(): Promise<DetectOrientationScriptResult>
720
+ ```
320
721
 
321
- Sets the image region using coordinates and size.
722
+ #### meanTextConf
322
723
 
323
- | Name | Type | Optional | Default | Description |
324
- | ------- | --------------------------------------------------------------- | -------- | ------- | ------------------ |
325
- | options | [`TesseractSetRectangleOptions`](#tesseractsetrectangleoptions) | No | n/a | Region definition. |
724
+ Returns mean text confidence.
326
725
 
327
726
  ```ts
328
- setRectangle(options: TesseractSetRectangleOptions): Promise<void>
727
+ meanTextConf(): Promise<number>
329
728
  ```
330
729
 
331
- #### setSourceResolution
730
+ #### allWordConfidences
332
731
 
333
- Sets the source resolution in PPI.
732
+ Returns all word confidences for current recognition result.
733
+
734
+ ```ts
735
+ allWordConfidences(): Promise<number[]>
736
+ ```
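
The array from `allWordConfidences()` is easier to act on once it is summarized. The sketch below is illustrative: `summarizeConfidences` is a hypothetical helper, the threshold of `60` is an arbitrary example value, and it assumes confidences are on the 0-100 scale that the previous README revision stated for `meanTextConf()`.

```typescript
interface ConfidenceSummary {
  count: number;    // number of words
  mean: number;     // arithmetic mean of all confidences
  min: number;      // lowest single-word confidence
  lowCount: number; // words below the threshold
}

// Hypothetical helper: condenses per-word confidences into a few numbers.
function summarizeConfidences(
  confidences: readonly number[],
  threshold = 60,
): ConfidenceSummary {
  if (confidences.length === 0) {
    return { count: 0, mean: 0, min: 0, lowCount: 0 };
  }
  let sum = 0;
  let min = Infinity;
  let lowCount = 0;
  for (const c of confidences) {
    sum += c;
    if (c < min) min = c;
    if (c < threshold) lowCount += 1;
  }
  return { count: confidences.length, mean: sum / confidences.length, min, lowCount };
}
```

A caller might pass `await tesseract.allWordConfidences()` after `recognize()` and route pages with a high `lowCount` to manual review.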
737
+
738
+ #### getPAGEText
334
739
 
335
- | Name | Type | Optional | Default | Description |
336
- | ---- | ------ | -------- | ------- | ---------------- |
337
- | ppi | number | No | n/a | Pixels per inch. |
740
+ Returns PAGE XML output.
741
+
742
+ | Name | Type | Optional | Default | Description |
743
+ | ------------------ | ------------------------------------- | -------- | ----------- | ---------------------------------- |
744
+ | `progressCallback` | `(info: ProgressChangedInfo) => void` | Yes | `undefined` | PAGE generation progress callback. |
745
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
338
746
 
339
747
  ```ts
340
- setSourceResolution(ppi: number): Promise<void>
748
+ getPAGEText(
749
+ progressCallback?: (info: ProgressChangedInfo) => void,
750
+ pageNumber?: number,
751
+ ): Promise<string>
341
752
  ```
342
753
 
343
- #### recognize
754
+ #### getLSTMBoxText
344
755
 
345
- Starts OCR and calls the callback with progress info.
756
+ Returns LSTM box output.
346
757
 
347
- | Name | Type | Optional | Default | Description |
348
- | ---------------- | ------------------------------------------------------------- | -------- | ------- | ------------------ |
349
- | progressCallback | (info: [`ProgressChangedInfo`](#progresschangedinfo)) => void | No | n/a | Progress callback. |
758
+ | Name | Type | Optional | Default | Description |
759
+ | ------------ | -------- | -------- | ----------- | -------------------- |
760
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
350
761
 
351
762
  ```ts
352
- recognize(progressCallback: (info: ProgressChangedInfo) => void): Promise<void>
763
+ getLSTMBoxText(pageNumber?: number): Promise<string>
353
764
  ```
354
765
 
355
- #### getUTF8Text
766
+ #### getBoxText
767
+
768
+ Returns classic box output.
356
769
 
357
- Returns recognized text as UTF-8.
770
+ | Name | Type | Optional | Default | Description |
771
+ | ------------ | -------- | -------- | ----------- | -------------------- |
772
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
358
773
 
359
774
  ```ts
360
- getUTF8Text(): Promise<string>
775
+ getBoxText(pageNumber?: number): Promise<string>
361
776
  ```
362
777
 
363
- #### getHOCRText
778
+ #### getWordStrBoxText
364
779
 
365
- Returns HOCR output. Optional progress callback and page number.
780
+ Returns WordStr box output.
366
781
 
367
- | Name | Type | Optional | Default | Description |
368
- | ---------------- | ------------------------------------------------------------- | -------- | --------- | ---------------------- |
369
- | progressCallback | (info: [`ProgressChangedInfo`](#progresschangedinfo)) => void | Yes | undefined | Progress callback. |
370
- | pageNumber | number | Yes | undefined | Page number (0-based). |
782
+ | Name | Type | Optional | Default | Description |
783
+ | ------------ | -------- | -------- | ----------- | -------------------- |
784
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
371
785
 
372
786
  ```ts
373
- getHOCRText(
374
- progressCallback?: (info: ProgressChangedInfo) => void,
375
- pageNumber?: number,
376
- ): Promise<string>
787
+ getWordStrBoxText(pageNumber?: number): Promise<string>
377
788
  ```
378
789
 
379
- #### getTSVText
790
+ #### getOSDText
380
791
 
381
- Returns TSV output.
792
+ Returns OSD text output.
793
+
794
+ | Name | Type | Optional | Default | Description |
795
+ | ------------ | -------- | -------- | ----------- | -------------------- |
796
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
382
797
 
383
798
  ```ts
384
- getTSVText(): Promise<string>
799
+ getOSDText(pageNumber?: number): Promise<string>
385
800
  ```
386
801
 
387
- #### getUNLVText
802
+ #### getUTF8Text
388
803
 
389
- Returns UNLV output.
804
+ Returns recognized UTF-8 text.
390
805
 
391
806
  ```ts
392
- getUNLVText(): Promise<string>
807
+ getUTF8Text(): Promise<string>
393
808
  ```
394
809
 
395
- #### getALTOText
810
+ #### getHOCRText
396
811
 
397
- Returns ALTO output. Optional progress callback and page number.
812
+ Returns hOCR output.
398
813
 
399
- | Name | Type | Optional | Default | Description |
400
- | ---------------- | ------------------------------------------------------------- | -------- | --------- | ---------------------- |
401
- | progressCallback | (info: [`ProgressChangedInfo`](#progresschangedinfo)) => void | Yes | undefined | Progress callback. |
402
- | pageNumber | number | Yes | undefined | Page number (0-based). |
814
+ | Name | Type | Optional | Default | Description |
815
+ | ------------------ | ------------------------------------- | -------- | ----------- | ---------------------------------- |
816
+ | `progressCallback` | `(info: ProgressChangedInfo) => void` | Yes | `undefined` | hOCR generation progress callback. |
817
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
403
818
 
404
819
  ```ts
405
- getALTOText(
820
+ getHOCRText(
406
821
  progressCallback?: (info: ProgressChangedInfo) => void,
407
822
  pageNumber?: number,
408
823
  ): Promise<string>
409
824
  ```
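Reviewer sketch of how the two optional `getHOCRText` parameters might be used together. Only the `getHOCRText` signature comes from the README. The `HocrSource` interface, the `hocrForPage` helper and the stub are hypothetical names introduced for illustration, and the `percent` field on `ProgressChangedInfo` is assumed from the example removed later in this diff.

```ts
// Structural subset of the documented API; not the package's own type.
interface ProgressChangedInfo {
  percent: number; // assumed field, as used in the example removed from the old README
}

interface HocrSource {
  getHOCRText(
    progressCallback?: (info: ProgressChangedInfo) => void,
    pageNumber?: number,
  ): Promise<string>;
}

// Hypothetical helper: requests hOCR for one 0-based page and records progress values.
async function hocrForPage(
  source: HocrSource,
  pageNumber: number,
): Promise<{ hocr: string; progress: number[] }> {
  const progress: number[] = [];
  const hocr = await source.getHOCRText((info) => {
    progress.push(info.percent);
  }, pageNumber);
  return { hocr, progress };
}

// Stub for illustration only; a real engine would be initialized and given an image first.
const stubSource: HocrSource = {
  async getHOCRText(progressCallback, pageNumber) {
    progressCallback?.({ percent: 50 });
    progressCallback?.({ percent: 100 });
    return `<div class="ocr_page" id="page_${(pageNumber ?? 0) + 1}"></div>`;
  },
};
```

Because the helper depends only on a structural interface, a real instance should be usable in place of the stub, provided its `getHOCRText` matches the signature shown in the README.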
410
825
 
411
- #### detectOrientationScript
826
+ #### getTSVText
827
+
828
+ Returns TSV output.
412
829
 
413
- Detects orientation and script with confidences. Returns [`DetectOrientationScriptResult`](#detectorientationscriptresult).
830
+ | Name | Type | Optional | Default | Description |
831
+ | ------------ | -------- | -------- | ----------- | -------------------- |
832
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
414
833
 
415
834
  ```ts
416
- detectOrientationScript(): Promise<DetectOrientationScriptResult>
835
+ getTSVText(pageNumber?: number): Promise<string>
417
836
  ```
418
837
 
419
- #### meanTextConf
838
+ #### getUNLVText
420
839
 
421
- Mean text confidence (0-100).
840
+ Returns UNLV output.
422
841
 
423
842
  ```ts
424
- meanTextConf(): Promise<number>
843
+ getUNLVText(): Promise<string>
844
+ ```
845
+
846
+ #### getALTOText
847
+
848
+ Returns ALTO XML output.
849
+
850
+ | Name | Type | Optional | Default | Description |
851
+ | ------------ | -------- | -------- | ----------- | -------------------- |
852
+ | `pageNumber` | `number` | Yes | `undefined` | 0-based page number. |
853
+
854
+ ```ts
855
+ getALTOText(pageNumber?: number): Promise<string>
425
856
  ```
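Reviewer sketch of how the new `pageNumber` parameter on `getTSVText` and `getALTOText` might be used to export one page in both formats. The two method signatures come from the README. `PageOutputs`, `exportPage` and the stub are hypothetical. The calls are awaited one after another because the README does not say whether concurrent calls on one instance are safe.

```ts
// Structural subset of the documented API; not the package's own type.
interface PageOutputs {
  getTSVText(pageNumber?: number): Promise<string>;
  getALTOText(pageNumber?: number): Promise<string>;
}

// Hypothetical helper: fetches TSV and ALTO XML for the same 0-based page, sequentially.
async function exportPage(
  engine: PageOutputs,
  pageNumber: number,
): Promise<{ tsv: string; alto: string }> {
  const tsv = await engine.getTSVText(pageNumber);
  const alto = await engine.getALTOText(pageNumber);
  return { tsv, alto };
}

// Stub for illustration only; it echoes the requested page instead of running OCR.
const stubEngine: PageOutputs = {
  async getTSVText(pageNumber) {
    return `tsv:${pageNumber ?? 0}`;
  },
  async getALTOText(pageNumber) {
    return `alto:${pageNumber ?? 0}`;
  },
};
```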
426
857
 
427
858
  #### getInitLanguages
428
859
 
429
- Returns [`Language`](#availablelanguages) in raw Tesseract format (e.g. "deu+eng").
860
+ Returns languages used during initialization (for example `deu+eng`).
430
861
 
431
862
  ```ts
432
863
  getInitLanguages(): Promise<string>
@@ -434,7 +865,7 @@ getInitLanguages(): Promise<string>
434
865
 
435
866
  #### getLoadedLanguages
436
867
 
437
- Returns [`Language[]`](#availablelanguages) in raw Tesseract format.
868
+ Returns languages currently loaded in the engine.
438
869
 
439
870
  ```ts
440
871
  getLoadedLanguages(): Promise<Language[]>
@@ -442,7 +873,7 @@ getLoadedLanguages(): Promise<Language[]>
442
873
 
443
874
  #### getAvailableLanguages
444
875
 
445
- Returns [`Language[]`](#availablelanguages) in raw Tesseract format.
876
+ Returns languages available from tessdata.
446
877
 
447
878
  ```ts
448
879
  getAvailableLanguages(): Promise<Language[]>
@@ -450,7 +881,7 @@ getAvailableLanguages(): Promise<Language[]>
450
881
 
451
882
  #### clear
452
883
 
453
- Clears internal state.
884
+ Clears internal recognition state/results.
454
885
 
455
886
  ```ts
456
887
  clear(): Promise<void>
@@ -458,49 +889,12 @@ clear(): Promise<void>
458
889
 
459
890
  #### end
460
891
 
461
- Ends the instance.
892
+ Releases native resources and ends the instance.
462
893
 
463
894
  ```ts
464
895
  end(): Promise<void>
465
896
  ```
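Since `end()` is described as releasing native resources, callers may want it to run even when recognition fails. Below is a reviewer sketch of one way to guarantee that with `try`/`finally`. Only the `end(): Promise<void>` signature comes from the README. `Endable`, `withEngine` and the stub are hypothetical names.

```ts
// Anything exposing the documented end() method.
interface Endable {
  end(): Promise<void>;
}

// Hypothetical helper: runs `work`, then always calls end(), even if `work` throws.
async function withEngine<E extends Endable, T>(
  engine: E,
  work: (engine: E) => Promise<T>,
): Promise<T> {
  try {
    return await work(engine);
  } finally {
    await engine.end();
  }
}

// Stub for illustration only; it counts how often end() was called.
const stubInstance = {
  endCalls: 0,
  async end(): Promise<void> {
    this.endCalls += 1;
  },
};
```

With a real instance, `work` would contain the `setImage`, `recognize` and `get...Text` calls, and the error from a failed step would still propagate to the caller after `end()` has run.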
466
897
 
467
- ## Example
468
-
469
- You can find a similar example in the `examples/` folder of the project
470
-
471
- ```ts
472
- import fs from "node:fs";
473
- import Tesseract, { OcrEngineModes } from "node-tesseract-ocr";
474
-
475
- async function main() {
476
- const tesseract = new Tesseract();
477
- await tesseract.init({
478
- lang: ["eng"],
479
- oem: OcrEngineModes.OEM_LSTM_ONLY,
480
- });
481
-
482
- const buffer = fs.readFileSync("example1.png");
483
- await tesseract.setImage(buffer);
484
- await tesseract.recognize((info) => {
485
- console.log(`Progress: ${info.percent}%`);
486
- });
487
-
488
- const text = await tesseract.getUTF8Text();
489
- console.log(text);
490
-
491
- await tesseract.end();
492
- }
493
-
494
- main().catch((err) => {
495
- console.error(err);
496
- process.exit(1);
497
- });
498
- ```
499
-
500
898
  ## License
501
899
 
502
900
  Apache-2.0. See [`LICENSE.md`](/LICENSE.md) for full terms.
503
-
504
- ## Special Thanks
505
-
506
- - **Stunt3000**