extractia-sdk 1.0.6 → 1.2.0

package/README.md CHANGED
@@ -1,9 +1,34 @@
1
1
  # Extractia SDK
2
2
 
3
- A JavaScript SDK for interacting with the Extractia API.
3
+ JavaScript SDK for the [Extractia](https://extractia.info) document-extraction API.
4
+ Works in Node.js ≥ 18 and modern browsers via the provided IIFE build.
4
5
 
5
- > **Note:** This SDK is designed for the implementation of the [extractia.info](https://extractia.info) public API.
6
- > You must have an Extractia account and a valid API token to use it.
6
+ > **Requires** an Extractia account and a valid API token.
7
+ > Generate one at **Settings → API Keys** in the Extractia dashboard.
8
+
9
+ ---
10
+
11
+ ## Table of Contents
12
+
13
+ 1. [Installation](#installation)
14
+ 2. [Quick Start](#quick-start)
15
+ 3. [Authentication](#authentication)
16
+ 4. [Error Handling](#error-handling)
17
+ 5. [API Reference](#api-reference)
18
+ - [Profile & Webhook](#profile--webhook)
19
+ - [Templates](#templates)
20
+ - [Documents](#documents)
21
+ - [Processing Images](#processing-images)
22
+ - [AI Features](#ai-features)
23
+ - [Exports](#exports)
24
+ - [OCR Tools](#ocr-tools)
25
+ - [Credits & Analytics](#credits--analytics)
26
+ - [Sub-Users](#sub-users)
27
+ 6. [TypeScript](#typescript)
28
+ 7. [Rate Limits & Quotas](#rate-limits--quotas)
29
+ 8. [Changelog](#changelog)
30
+
31
+ ---
7
32
 
8
33
  ## Installation
9
34
 
@@ -11,58 +36,982 @@ A JavaScript SDK for interacting with the Extractia API.
11
36
  npm install extractia-sdk
12
37
  ```
13
38
 
14
- ## Usage
39
+ Or using yarn / pnpm:
40
+
41
+ ```sh
42
+ yarn add extractia-sdk
43
+ pnpm add extractia-sdk
44
+ ```
45
+
46
+ **Browser (IIFE build):**
47
+
48
+ ```html
49
+ <script src="https://unpkg.com/extractia-sdk/dist/extractia-sdk.browser.js"></script>
50
+ <script>
51
+ ExtractiaSDK.default.setToken("YOUR_API_TOKEN");
52
+ </script>
53
+ ```
54
+
55
+ ---
56
+
57
+ ## Quick Start
15
58
 
16
59
  ```js
17
- import { setToken, getMyProfile, getTemplates, processImage } from 'extractia-sdk';
60
+ import {
61
+ setToken,
62
+ suggestFields,
63
+ createTemplate,
64
+ processImage,
65
+ } from "extractia-sdk";
18
66
 
19
- // Set your API token
20
- setToken('YOUR_API_TOKEN');
67
+ // 1. Authenticate
68
+ setToken("ext_YOUR_API_TOKEN_HERE");
21
69
 
22
- // Get user profile
23
- const profile = await getMyProfile();
70
+ // 2. Let AI suggest fields for your document type
71
+ const fields = await suggestFields(
72
+ "Invoice",
73
+ "header data plus all line items with product and quantity",
74
+ );
24
75
 
25
- // Get all templates
26
- const templates = await getTemplates();
76
+ // 3. Create a template from the suggestions
77
+ const template = await createTemplate({ label: "Invoice", fields });
78
+
79
+ // 4. Process an image
80
+ import { readFileSync } from "fs";
81
+ const base64 = readFileSync("./invoice.png").toString("base64");
82
+ const doc = await processImage(template.id, base64);
83
+
84
+ console.log(JSON.parse(doc.rawJson));
85
+ // → { "Vendor": "Acme Corp", "Total": "1500.00", "Line Items": [...] }
86
+ ```
87
+
88
+ ---
89
+
90
+ ## Authentication
91
+
92
+ Every SDK method requires a valid API token. Call `setToken` **once** at
93
+ application startup — it is stored in module scope and attached automatically as
94
+ a `Bearer` header on every subsequent request.
95
+
96
+ ```js
97
+ import { setToken } from "extractia-sdk";
27
98
 
28
- // Process an image with a template
29
- const result = await processImage(templateId, base64Image);
99
+ setToken(process.env.EXTRACTIA_TOKEN);
30
100
  ```
31
101
 
102
+ > **Security**: Never hard-code your token in client-side code. Use environment
103
+ > variables or a secrets manager. Tokens can be rotated from the dashboard at any
104
+ > time.
105
+
106
+ ---
107
+
108
+ ## Error Handling
109
+
110
+ The SDK maps every HTTP error to a typed exception. Catch the specific subclass
111
+ you need, or catch the base `ExtractiaError` for a generic fallback.
112
+
113
+ ```js
114
+ import {
115
+ processImage,
116
+ AuthError,
117
+ TierError,
118
+ RateLimitError,
119
+ NotFoundError,
120
+ ExtractiaError,
121
+ } from "extractia-sdk";
122
+
123
+ try {
124
+ const doc = await processImage(templateId, base64Image);
125
+ } catch (err) {
126
+ if (err instanceof AuthError) {
127
+ // 401 — token is missing, expired, or revoked
128
+ console.error("Authentication failed:", err.message);
129
+ } else if (err instanceof TierError) {
130
+ // 402/403 — monthly quota exhausted or plan doesn't allow this action
131
+ console.error("Upgrade your plan:", err.message);
132
+ } else if (err instanceof RateLimitError) {
133
+ // 429 — too many requests in a short window
134
+ console.warn("Rate limited. Retrying in 60s...");
135
+ await new Promise((r) => setTimeout(r, 60_000));
136
+ } else if (err instanceof NotFoundError) {
137
+ // 404 — template or document does not exist
138
+ console.error("Not found:", err.message);
139
+ } else if (err instanceof ExtractiaError) {
140
+ // Fallback for any other API error
141
+ console.error(`API error [${err.status}]:`, err.message);
142
+ } else {
143
+ throw err; // Re-throw network / unexpected errors
144
+ }
145
+ }
146
+ ```
147
+
148
+ ### Error class hierarchy
149
+
150
+ | Class | HTTP status | When thrown |
151
+ | ---------------- | ----------- | ------------------------------------------------- |
152
+ | `ExtractiaError` | any | Base class; fallback for unexpected codes |
153
+ | `AuthError` | 401 | Missing, expired, or revoked token |
154
+ | `ForbiddenError` | 403 | Unconfirmed account or sub-user permission denied |
155
+ | `TierError` | 402 / 403 | Monthly document quota exhausted, or the plan does not allow the action |
156
+ | `RateLimitError` | 429 | Too many requests in time window |
157
+ | `NotFoundError` | 404 | Template or document not found |
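The `RateLimitError` branch in the example above waits a fixed 60 seconds. A capped exponential backoff is a common alternative. The following is a minimal sketch, not part of the SDK: `backoffDelayMs` and `retryWithBackoff` are hypothetical helper names, and the `isRetryable` predicate is supplied by the caller.

```js
// Hypothetical helpers, not part of extractia-sdk.

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Calls `fn` until it succeeds, `isRetryable(err)` returns false,
// or `maxRetries` retries have been used. The last error is re-thrown.
async function retryWithBackoff(fn, isRetryable, maxRetries = 3, baseMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Under these assumptions, `retryWithBackoff(() => processImage(templateId, base64Image), (err) => err instanceof RateLimitError)` would retry only on HTTP 429 and let every other error propagate to the caller.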
158
+
159
+ ---
160
+
32
161
  ## API Reference
33
162
 
34
- ### Authentication
163
+ ### Profile & Webhook
164
+
165
+ #### `setToken(token)`
166
+
167
+ Sets the Bearer token used for all requests. Must be called before any other method.
168
+
169
+ | Parameter | Type | Required | Description |
170
+ | --------- | -------- | -------- | ---------------------------------- |
171
+ | `token` | `string` | ✅ | Your Extractia API token (`ext_…`) |
172
+
173
+ ```js
174
+ setToken("ext_abc123");
175
+ ```
176
+
177
+ ---
178
+
179
+ #### `getMyProfile()`
180
+
181
+ Returns the profile and usage metrics of the authenticated user.
182
+
183
+ **Returns:** `Promise<AppUserProfile>`
35
184
 
36
- - `setToken(token)` – Set the API token for requests.
37
- - `getMyProfile()` – Get the authenticated user's profile.
38
- - `updateWebhook(url)` – Update the webhook URL for the user.
185
+ ```js
186
+ const profile = await getMyProfile();
187
+ console.log(profile.email); // 'user@example.com'
188
+ console.log(profile.formTemplatesCount); // 5
189
+ console.log(profile.documentsCount); // 42
190
+ ```
191
+
192
+ ---
193
+
194
+ #### `updateWebhook(url)`
195
+
196
+ Updates the webhook URL. After each successful extraction, Extractia
197
+ sends a `POST` to this URL with the document payload.
198
+
199
+ | Parameter | Type | Required | Description |
200
+ | --------- | -------- | -------- | ---------------------------------------------- |
201
+ | `url` | `string` | ✅ | Webhook URL (pass empty string `""` to remove) |
202
+
203
+ **Returns:** `Promise<void>`
204
+
205
+ ```js
206
+ await updateWebhook("https://myapp.example.com/hooks/extractia");
207
+ await updateWebhook(""); // remove webhook
208
+ ```
209
+
210
+ **Webhook POST payload:**
211
+
212
+ ```json
213
+ {
214
+ "documentId": "abc123",
215
+ "templateId": "tpl456",
216
+ "rawJson": "{ \"Total\": \"150.00\" }",
217
+ "createdAt": "2025-01-05T10:30:00Z"
218
+ }
219
+ ```
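On the receiving side, note that `rawJson` in the payload above is a JSON-encoded string, so it has to be parsed a second time. A minimal sketch of a parser follows. `parseWebhookBody` is a hypothetical helper, not part of the SDK, and it assumes the four fields shown in the sample payload are always present as strings.

```js
// Hypothetical helper, not part of extractia-sdk.
// `body` is the raw request body string received by your webhook endpoint.
function parseWebhookBody(body) {
  const payload = JSON.parse(body);
  // Assumption: every delivery carries the four string fields from the sample.
  for (const key of ["documentId", "templateId", "rawJson", "createdAt"]) {
    if (typeof payload[key] !== "string") {
      throw new TypeError(`webhook payload is missing string field "${key}"`);
    }
  }
  return {
    documentId: payload.documentId,
    templateId: payload.templateId,
    createdAt: new Date(payload.createdAt),
    // rawJson is itself JSON text, so it needs its own parse step.
    fields: JSON.parse(payload.rawJson),
  };
}
```

The returned `fields` object can then be read by field label, for example `fields["Total"]`.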
220
+
221
+ ---
39
222
 
40
223
  ### Templates
41
224
 
42
- - `getTemplates()` – Retrieve all templates.
43
- - `getTemplateById(id)` – Get a template by its ID.
44
- - `getTemplateByName(name)` – Get a template by its name.
45
- - `createTemplate(template)` – Create a new template.
46
- - `updateTemplate(id, template)` – Update an existing template.
47
- - `deleteTemplate(id)` – Delete a template.
48
- - `deleteAllTemplateDocuments(id)` – Delete all documents for a template.
225
+ A **template** defines the fields to extract from a document.
226
+
227
+ **Supported field types:** `TEXT` | `NUMBER` | `PERCENTAGE` | `DATE` | `BOOLEAN` | `EMAIL` | `PHONE` | `ADDRESS` | `CURRENCY` | `LIST`
228
+
229
+ ---
230
+
231
+ #### `getTemplates()`
232
+
233
+ Returns all templates owned by the authenticated user.
234
+
235
+ ```js
236
+ const templates = await getTemplates();
237
+ templates.forEach((t) => console.log(t.id, t.label));
238
+ ```
239
+
240
+ ---
241
+
242
+ #### `getTemplateById(id)`
243
+
244
+ Returns a single template by its ID. Throws `NotFoundError` if missing.
245
+
246
+ ```js
247
+ const template = await getTemplateById("tpl_abc123");
248
+ ```
249
+
250
+ ---
251
+
252
+ #### `getTemplateByName(name)`
253
+
254
+ Returns a template matched by its label name. Throws `NotFoundError` if missing.
255
+
256
+ ```js
257
+ const invoice = await getTemplateByName("Invoice");
258
+ ```
259
+
260
+ ---
261
+
262
+ #### `suggestFields(templateName, extractionContext?)`
263
+
264
+ Uses AI to suggest extraction field definitions for a given document type.
265
+ Results can be passed directly to `createTemplate`.
266
+
267
+ **Consumes AI credits.**
268
+
269
+ | Parameter | Type | Required | Description |
270
+ | ------------------- | -------- | -------- | ----------------------------------------------------------- |
271
+ | `templateName` | `string` | ✅ | Document type name (e.g. `"Invoice"`, `"Driver's License"`) |
272
+ | `extractionContext` | `string` | — | Natural-language hint about what to extract or detect |
273
+
274
+ ```js
275
+ // Basic usage
276
+ const fields = await suggestFields("Receipt");
277
+
278
+ // With context — AI will only return what you describe
279
+ const fields = await suggestFields(
280
+ "Purchase Order",
281
+ "I need the supplier name, PO number, and all ordered products with quantity and unit price",
282
+ );
283
+
284
+ const template = await createTemplate({ label: "Purchase Order", fields });
285
+ ```
286
+
287
+ The returned array follows the `FormField` shape:
288
+
289
+ ```json
290
+ [
291
+ { "label": "PO Number", "type": "TEXT", "required": true },
292
+ { "label": "Supplier", "type": "TEXT", "required": true },
293
+ {
294
+ "label": "Order Items",
295
+ "type": "LIST",
296
+ "required": false,
297
+ "listLabel": "Product"
298
+ }
299
+ ]
300
+ ```
301
+
302
+ ---
303
+
304
+ #### `createTemplate(template)`
305
+
306
+ Creates a new template.
307
+
308
+ ```js
309
+ const template = await createTemplate({
310
+ label: "Purchase Order",
311
+ fields: [
312
+ { label: "PO Number", type: "TEXT", required: true },
313
+ { label: "Vendor", type: "TEXT", required: true },
314
+ { label: "Total Amount", type: "CURRENCY", required: true },
315
+ { label: "Order Date", type: "DATE", required: false },
316
+ {
317
+ label: "Line Items",
318
+ type: "LIST",
319
+ required: false,
320
+ listLabel: "Product",
321
+ },
322
+ ],
323
+ });
324
+ console.log(template.id); // 'tpl_newid'
325
+ ```
326
+
327
+ ---
328
+
329
+ #### `updateTemplate(id, template)`
330
+
331
+ Updates an existing template's label and/or fields.
332
+
333
+ ```js
334
+ const updated = await updateTemplate("tpl_abc123", {
335
+ fields: [
336
+ { label: "Vendor", type: "TEXT", required: true },
337
+ { label: "Total", type: "CURRENCY", required: true },
338
+ { label: "Due Date", type: "DATE", required: false },
339
+ ],
340
+ });
341
+ ```
342
+
343
+ ---
344
+
345
+ #### `deleteTemplate(id)`
346
+
347
+ Deletes a template. Returns a `409 Conflict` error if the template has
348
+ associated documents — call `deleteAllTemplateDocuments` first.
349
+
350
+ ```js
351
+ await deleteAllTemplateDocuments("tpl_abc123");
352
+ await deleteTemplate("tpl_abc123");
353
+ ```
354
+
355
+ ---
356
+
357
+ #### `deleteAllTemplateDocuments(id)`
358
+
359
+ Deletes all documents associated with a template in one call.
360
+
361
+ ```js
362
+ await deleteAllTemplateDocuments("tpl_abc123");
363
+ ```
364
+
365
+ ---
49
366
 
50
367
  ### Documents
51
368
 
52
- - `getDocumentsByTemplateId(templateId, options)` – Get documents for a template.
53
- - `options`:
54
- ```js
369
+ #### `getDocumentsByTemplateId(templateId, options?)`
370
+
371
+ Returns a paginated list of documents for a given template (newest first by default).
372
+
373
+ | Option | Type | Default | Description |
374
+ | -------------- | --------- | ------- | ------------------------------------------------- |
375
+ | `preconformed` | `boolean` | — | Filter: reviewed (`true`) or unreviewed (`false`) |
376
+ | `index` | `number` | `0` | Zero-based page index (10 docs per page) |
377
+ | `sort` | `string` | `"-1"` | Sort direction: `"1"` = ASC, `"-1"` = DESC |
378
+ | `includeImage` | `boolean` | `false` | Include base64 source image in results |
379
+
380
+ ```js
381
+ const page = await getDocumentsByTemplateId("tpl_abc123", {
382
+ preconformed: false, // only unreviewed
383
+ index: 0,
384
+ });
385
+
386
+ console.log(`Page 1 of ${page.totalPages}`);
387
+ for (const doc of page.content) {
388
+ const fields = JSON.parse(doc.rawJson);
389
+ console.log("Total:", fields["Total Amount"]);
390
+ }
391
+ ```
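Because results arrive 10 documents per page, reading a whole template means walking the `index` option until `totalPages` is reached. The loop below is a minimal sketch of that idea. `collectAllDocuments` is a hypothetical helper, not part of the SDK, and it relies only on the `content` and `totalPages` properties used in the example above.

```js
// Hypothetical helper, not part of extractia-sdk.
// `fetchPage(index)` must resolve to an object with `content` (array)
// and `totalPages` (number).
async function collectAllDocuments(fetchPage) {
  const documents = [];
  let index = 0;
  let totalPages = 1; // replaced by the real value after the first request
  while (index < totalPages) {
    const page = await fetchPage(index);
    documents.push(...page.content);
    totalPages = page.totalPages;
    index += 1;
  }
  return documents;
}
```

A call such as `collectAllDocuments((index) => getDocumentsByTemplateId("tpl_abc123", { index }))` would issue one request per page. For large templates the export functions may be a better fit.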
392
+
393
+ ---
394
+
395
+ #### `getDocumentById(templateId, docId, options?)`
396
+
397
+ Returns a single document by template and document ID.
398
+
399
+ | Option | Type | Default | Description |
400
+ | -------------- | --------- | ------- | ------------------------------- |
401
+ | `includeImage` | `boolean` | `false` | Include the base64 source image |
402
+
403
+ ```js
404
+ const doc = await getDocumentById("tpl_abc123", "doc_xyz789", {
405
+ includeImage: true,
406
+ });
407
+ console.log(doc.imageBase64); // the original scan
408
+ ```
409
+
410
+ ---
411
+
412
+ #### `getRecentDocuments(size?)`
413
+
414
+ Returns the `size` most recent documents across **all templates**. Useful for dashboard feeds.
415
+
416
+ ```js
417
+ const recent = await getRecentDocuments(5);
418
+ recent.forEach((doc) => console.log(doc.createdAt, doc.formTemplateId));
419
+ ```
420
+
421
+ ---
422
+
423
+ #### `deleteDocument(documentId)`
424
+
425
+ Permanently deletes a single document.
426
+
427
+ ```js
428
+ await deleteDocument("doc_xyz789");
429
+ ```
430
+
431
+ ---
432
+
433
+ #### `updateDocumentStatus(docId, status)`
434
+
435
+ Updates the workflow status of a document. You define the status values — common
436
+ choices are `"PENDING"`, `"REVIEWED"`, `"APPROVED"`, `"REJECTED"`.
437
+
438
+ ```js
439
+ await updateDocumentStatus("doc_xyz789", "APPROVED");
440
+ ```
441
+
442
+ ---
443
+
444
+ #### `updateDocumentNotes(docId, notes)`
445
+
446
+ Saves reviewer annotations on a document. Pass `""` to clear.
447
+
448
+ ```js
449
+ await updateDocumentNotes(
450
+ "doc_xyz789",
451
+ "Verified against original: amounts match.",
452
+ );
453
+ await updateDocumentNotes("doc_xyz789", ""); // clear notes
454
+ ```
455
+
456
+ ---
457
+
458
+ #### `updateDocumentData(docId, data, options?)`
459
+
460
+ Corrects or overwrites the extracted field data programmatically. Optionally
461
+ marks the document as preconformed (reviewed) in the same call.
462
+
463
+ | Option | Type | Default | Description |
464
+ | -------------- | --------- | ------- | ----------------------------------- |
465
+ | `preconformed` | `boolean` | `false` | Mark document as reviewed/confirmed |
466
+
467
+ ```js
468
+ // Fix a wrong extraction and confirm
469
+ await updateDocumentData(
470
+ "doc_xyz789",
471
+ { "Total Amount": 1250.0, "Invoice Number": "INV-1042" },
472
+ { preconformed: true },
473
+ );
474
+ ```
475
+
476
+ ---
477
+
478
+ #### `bulkPreconform(ids)`
479
+
480
+ Marks multiple documents as reviewed/confirmed in a single API call. Returns
481
+ the count of documents actually updated.
482
+
483
+ ```js
484
+ const { updated } = await bulkPreconform(["doc_1", "doc_2", "doc_3"]);
485
+ console.log(`${updated} documents confirmed`);
486
+ ```
487
+
488
+ ---
489
+
490
+ ### Processing Images
491
+
492
+ #### `processImage(templateId, base64Image)`
493
+
494
+ Processes a **single image** — ideal for one-page documents.
495
+
496
+ Max image size: **5 MB** decoded. Supported formats: PNG, JPEG, WEBP, BMP.
497
+
498
+ ```js
499
+ // Node.js
500
+ import { readFileSync } from "fs";
501
+ const base64 = readFileSync("./invoice.jpg").toString("base64");
502
+ const doc = await processImage("tpl_invoice_id", base64);
503
+ const fields = JSON.parse(doc.rawJson);
504
+ console.log("Vendor:", fields["Vendor"]);
505
+ console.log("Total:", fields["Total Amount"]);
506
+ ```
507
+
508
+ ```js
509
+ // Browser — convert a File input to base64
510
+ function fileToBase64(file) {
511
+ return new Promise((resolve, reject) => {
512
+ const reader = new FileReader();
513
+ reader.onload = () => resolve(reader.result.split(",")[1]);
514
+ reader.onerror = reject;
515
+ reader.readAsDataURL(file);
516
+ });
517
+ }
518
+
519
+ const file = document.querySelector("#fileInput").files[0];
520
+ const base64 = await fileToBase64(file);
521
+ const doc = await processImage(templateId, base64);
522
+ ```
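Since the 5 MB limit applies to the decoded image, it can be checked before uploading by computing the decoded size from the base64 length. This is a minimal sketch under stated assumptions: `decodedBase64Bytes` and `isWithinImageLimit` are hypothetical helpers, not part of the SDK, and "5 MB" is assumed to mean 5 × 1024 × 1024 bytes.

```js
// Hypothetical helpers, not part of extractia-sdk.
const MAX_DECODED_BYTES = 5 * 1024 * 1024; // assumption: "5 MB" means 5 MiB

// Size in bytes of the data a base64 string decodes to, without decoding it.
function decodedBase64Bytes(base64) {
  const clean = base64.replace(/\s/g, ""); // ignore line breaks and spaces
  const padding = clean.endsWith("==") ? 2 : clean.endsWith("=") ? 1 : 0;
  return Math.floor((clean.length * 3) / 4) - padding;
}

function isWithinImageLimit(base64, maxBytes = MAX_DECODED_BYTES) {
  return decodedBase64Bytes(base64) <= maxBytes;
}
```

Checking `isWithinImageLimit(base64)` before calling `processImage` avoids sending a request that the size limit would reject anyway. The server remains the authority on the exact limit.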
523
+
524
+ ---
525
+
526
+ #### `processImagesMultipage(templateId, base64ImagesArray)`
527
+
528
+ Processes **multiple images** as a single multi-page document. All pages are
529
+ merged into one result. Use this for PDFs split into images or multi-page scans.
530
+
531
+ ```js
532
+ import { readdirSync, readFileSync } from "fs";
533
+ import path from "path";
534
+
535
+ const pages = readdirSync("./scan-pages")
536
+ .sort()
537
+ .map((f) => readFileSync(path.join("./scan-pages", f)).toString("base64"));
538
+
539
+ const doc = await processImagesMultipage("tpl_contract_id", pages);
540
+ const result = JSON.parse(doc.rawJson);
541
+ console.log("Contract Date:", result["Contract Date"]);
542
+ ```
543
+
544
+ ---
545
+
546
+ ### AI Features
547
+
548
+ #### `generateDocumentSummary(docId)`
549
+
550
+ Asks the AI to generate a concise bullet-point summary of a document's
551
+ extracted data. Returns natural language — not JSON.
552
+
553
+ **Consumes AI credits.**
554
+
555
+ ```js
556
+ const { summary } = await generateDocumentSummary("doc_xyz789");
557
+ console.log(summary);
558
+ // • Invoice #1042 was issued by Acme Corp on January 5 2025.
559
+ // • The total amount due is $1,250.00, payable by February 4 2025.
560
+ // • The order includes 3 line items for a total of 15 units.
561
+ ```
562
+
563
+ ---
564
+
565
+ ### Exports
566
+
567
+ Export all documents in a template to a file format for offline processing,
568
+ spreadsheet import, or archiving.
569
+
570
+ #### `exportDocumentsCsv(templateId, options?)`
571
+
572
+ Returns all extracted data as a UTF-8 CSV string with BOM (Excel-compatible).
573
+ Each row is one document; columns are the extracted fields plus `preconformed`
574
+ and `uploadedAt`.
575
+
576
+ | Option | Type | Description |
577
+ | -------- | ---------- | ------------------------------------------------ |
578
+ | `fields` | `string[]` | Optional column subset. Dot-notation for nested. |
579
+
580
+ ```js
581
+ // Export everything
+ import { writeFileSync } from "fs";
+ const csv = await exportDocumentsCsv("tpl_abc123");
+ writeFileSync("invoices.csv", csv);
584
+
585
+ // Export a specific column subset
586
+ const subsetCsv = await exportDocumentsCsv("tpl_abc123", {
587
+ fields: ["Invoice Number", "Vendor", "Total Amount", "Issue Date"],
588
+ });
589
+ ```
590
+
591
+ ---
592
+
593
+ #### `exportDocumentsJson(templateId)`
594
+
595
+ Returns all documents as a plain JSON array. Each element is the extracted
596
+ field map plus three metadata keys:
597
+
598
+ - `_id` — document ID
599
+ - `_preconformed` — whether the document was reviewed
600
+ - `_uploadedAt` — ISO timestamp
601
+
602
+ ```js
603
+ const records = await exportDocumentsJson("tpl_abc123");
604
+ records.forEach((doc) => {
605
+ console.log(doc._id, doc["Invoice Number"], doc["Total Amount"]);
606
+ });
607
+
608
+ // Save to file
609
+ import { writeFileSync } from "fs";
610
+ writeFileSync("invoices.json", JSON.stringify(records, null, 2));
611
+ ```
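When the exported records feed another system, it can help to separate the three metadata keys from the extracted fields. A minimal sketch follows. `splitExportRecord` is a hypothetical helper, not part of the SDK, and it assumes the metadata keys are exactly the three listed above.

```js
// Hypothetical helper, not part of extractia-sdk.
// Splits one exported record into its metadata and its extracted fields.
function splitExportRecord(record) {
  const { _id, _preconformed, _uploadedAt, ...fields } = record;
  return {
    meta: { id: _id, preconformed: _preconformed, uploadedAt: _uploadedAt },
    fields,
  };
}
```

For example, `records.map(splitExportRecord)` yields objects whose `fields` contain only extracted values, which keeps the underscore-prefixed keys out of downstream column lists.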
612
+
613
+ ---
614
+
615
+ ### OCR Tools
616
+
617
+ OCR Tools let you ask AI yes/no questions, classify documents into labels, or
618
+ extract free-form text from any image — without a template.
619
+
620
+ > **Credit consumption:** every OCR Agent run deducts **1 document** from the user's monthly (or add-on) document quota **plus AI credits** based on the token count of the AI analysis (calculated as `ceil(totalTokens / 1000)` credits).
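The credit formula in the note above can be turned into a small estimator for budgeting. This is only a sketch of the stated arithmetic: `estimateAiCredits` is a hypothetical helper, not part of the SDK, and the actual token count is only known after a run.

```js
// Hypothetical helper, not part of extractia-sdk.
// Applies ceil(totalTokens / 1000) from the credit-consumption note.
function estimateAiCredits(totalTokens) {
  if (!Number.isFinite(totalTokens) || totalTokens < 0) {
    throw new RangeError("totalTokens must be a non-negative number");
  }
  return Math.ceil(totalTokens / 1000);
}
```

For instance, a run that uses 2,500 tokens would cost `ceil(2.5) = 3` AI credits, in addition to the 1 document deducted from the quota.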
621
+
622
+ #### Dynamic parameters (`{{?N}}` syntax)
623
+
624
+ Prompts can include numbered placeholders of the form `{{?1}}`, `{{?2}}`, etc.
625
+ At run time you supply the actual values; the AI receives the substituted text.
626
+
627
+ ```js
628
+ // Tool with two dynamic parameters
629
+ const checker = await createOcrTool({
630
+ name: "Property Ownership Check",
631
+ prompt:
632
+ "Does this document appear to be a proof of ownership for the property at {{?1}} in the name of {{?2}}?",
633
+ outputType: "YES_NO",
634
+ parameterDefinitions: [
55
635
  {
56
- preconformed?: boolean,
57
- index?: number, // Paging index (each page contains 10 documents)
58
- sort?: number, // 1 for ascending, -1 for descending
59
- includeImage?: boolean
60
- }
61
- ```
62
- - `deleteDocument(documentId)` – Delete a document.
63
- - `processImage(templateId, base64Image)` – Process a single image.
64
- - `processImagesMultipage(templateId, base64ImagesArray)` – Process multiple images (multipage).
636
+ key: 1,
637
+ label: "Property address",
638
+ description: "Full street address",
639
+ maxChars: 200,
640
+ },
641
+ {
642
+ key: 2,
643
+ label: "Owner name",
644
+ description: "Full legal name of the owner",
645
+ maxChars: 150,
646
+ },
647
+ ],
648
+ });
649
+
650
+ // Run with values
651
+ const result = await runOcrTool(checker.id, imageBase64, {
652
+ params: { 1: "123 Main St, Buenos Aires", 2: "Juan García" },
653
+ });
654
+ ```
655
+
656
+ Each `ParameterDefinition` has:
657
+
658
+ | Field | Type | Required | Description |
659
+ | ------------- | -------- | -------- | ------------------------------------------------------- |
660
+ | `key` | `number` | ✅ | 1-based index matching `{{?N}}` in the prompt |
661
+ | `label` | `string` | ✅ | Human-friendly name shown in the UI and API errors |
662
+ | `description` | `string` | | Optional hint shown to the user when filling values |
663
+ | `maxChars` | `number` | | Character limit for this parameter (1–500, default 200) |
664
+
665
+ **Validation rules:**
666
+
667
+ - Every `{{?N}}` placeholder in the prompt must have a matching `ParameterDefinition`.
668
+ - A `400 Bad Request` is returned if a required parameter is missing or exceeds its `maxChars` limit.
669
+ - The fully substituted prompt must not exceed **3 000 characters** total; adjust individual `maxChars` values accordingly.
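The validation rules above can be mirrored on the client to catch mistakes before a request is sent. The following is a minimal sketch, not the server's implementation. `checkOcrParams` is a hypothetical helper, and it assumes that every placeholder in the prompt needs a non-empty value and that `maxChars` defaults to 200 when omitted, as the table states.

```js
// Hypothetical client-side pre-check, not part of extractia-sdk.
const PLACEHOLDER = /\{\{\?(\d+)\}\}/g;
const MAX_PROMPT_CHARS = 3000;
const DEFAULT_MAX_CHARS = 200;

function checkOcrParams(prompt, parameterDefinitions = [], params = {}) {
  const errors = [];
  const defsByKey = new Map(parameterDefinitions.map((d) => [String(d.key), d]));
  const keysInPrompt = new Set([...prompt.matchAll(PLACEHOLDER)].map((m) => m[1]));

  for (const key of keysInPrompt) {
    const def = defsByKey.get(key);
    if (!def) {
      errors.push(`{{?${key}}} has no matching ParameterDefinition`);
      continue;
    }
    const limit = def.maxChars ?? DEFAULT_MAX_CHARS;
    const value = params[key];
    if (value === undefined || value === "") {
      // Assumption: every placeholder used in the prompt requires a value.
      errors.push(`missing value for "${def.label}"`);
    } else if (String(value).length > limit) {
      errors.push(`"${def.label}" exceeds ${limit} characters`);
    }
  }

  // Substitute the supplied values and check the overall prompt length.
  const substituted = prompt.replace(PLACEHOLDER, (whole, key) =>
    params[key] === undefined ? whole : String(params[key]),
  );
  if (substituted.length > MAX_PROMPT_CHARS) {
    errors.push(`substituted prompt exceeds ${MAX_PROMPT_CHARS} characters`);
  }
  return { ok: errors.length === 0, errors, substituted };
}
```

Running this check before `runOcrTool` gives a readable error list instead of a `400 Bad Request`. The API remains the source of truth for the actual rules.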
670
+
671
+ #### `getOcrTools()`
672
+
673
+ Returns all OCR tool configurations owned by the authenticated user.
674
+
675
+ ```js
676
+ const tools = await getOcrTools();
677
+ tools.forEach((t) => console.log(t.id, t.name, t.outputType));
678
+ ```
679
+
680
+ ---
681
+
682
+ #### `createOcrTool(config)`
683
+
684
+ Creates a new OCR tool configuration.
685
+
686
+ | Field | Type | Required | Description |
687
+ | ---------------------- | ------------------------------- | -------- | --------------------------------------------------------- |
688
+ | `name` | `string` | ✅ | Human-friendly display name |
689
+ | `prompt` | `string` | ✅ | Natural-language instruction for the AI (max 3 000 chars) |
690
+ | `outputType` | `"YES_NO" \| "LABEL" \| "TEXT"` | ✅ | Expected output shape |
691
+ | `labels` | `string[]` | ⚠️ | Required when `outputType === "LABEL"` |
692
+ | `parameterDefinitions` | `ParameterDefinition[]` | | Dynamic parameter definitions for `{{?N}}` placeholders |
693
+
694
+ ```js
695
+ // YES/NO check
696
+ const checker = await createOcrTool({
697
+ name: "Proof of Residence Check",
698
+ prompt:
699
+ "Does this document appear to be a valid proof of residence? Look for an address, official stamp, and the person's name.",
700
+ outputType: "YES_NO",
701
+ });
702
+
703
+ // Document classifier
704
+ const classifier = await createOcrTool({
705
+ name: "Document Type Classifier",
706
+ prompt: "What type of document is this?",
707
+ outputType: "LABEL",
708
+ labels: [
709
+ "invoice",
710
+ "id_card",
711
+ "receipt",
712
+ "proof_of_residence",
713
+ "contract",
714
+ "other",
715
+ ],
716
+ });
717
+
718
+ // Free-form extraction
719
+ const extractor = await createOcrTool({
720
+ name: "Serial Number Extractor",
721
+ prompt:
722
+ "Extract the serial number or product code printed on the label. Return only the code, nothing else.",
723
+ outputType: "TEXT",
724
+ });
725
+ ```
726
+
727
+ ---
728
+
729
+ #### `updateOcrTool(id, config)`
730
+
731
+ Updates an existing OCR tool configuration.
732
+
733
+ ```js
734
+ await updateOcrTool("tool_abc", {
735
+ prompt:
736
+ "Does this document show a current address dated within the last 3 months?",
737
+ });
738
+ ```
739
+
740
+ ---
741
+
742
+ #### `deleteOcrTool(id)`
743
+
744
+ Deletes an OCR tool configuration.
745
+
746
+ ```js
747
+ await deleteOcrTool("tool_abc");
748
+ ```
749
+
750
+ ---
751
+
752
+ #### `runOcrTool(id, base64Image, options?)`
753
+
754
+ Runs an OCR tool against a base64-encoded image. Max image size: **5 MB**.
755
+
756
+ **Consumes 1 document credit + AI credits** (based on token usage).
757
+
758
+ The AI output language matches the language of the prompt / parameter values automatically.
759
+
760
+ | Option | Type | Description |
761
+ | -------- | ------------------------ | ------------------------------------------------ |
762
+ | `params` | `Record<string, string>` | Values for `{{?N}}` placeholders, keyed by `"N"` |
763
+
764
+ ```js
765
+ import { readFileSync } from "fs";
766
+ const image = readFileSync("./id-card.jpg").toString("base64");
767
+
768
+ const result = await runOcrTool("tool_residence_check", image);
769
+ console.log(result.answer); // "YES"
770
+ console.log(result.explanation); // "The document shows a full address, an official municipal stamp, and the applicant's name."
771
+
772
+ // Document classification
773
+ const type = await runOcrTool("tool_classifier", image);
774
+ console.log(type.answer); // "id_card"
775
+
776
+ // With dynamic parameters
777
+ const ownership = await runOcrTool("tool_ownership_check", image, {
778
+ params: { 1: "Av. Corrientes 1234, CABA", 2: "María López" },
779
+ });
780
+ console.log(ownership.answer); // "YES"
781
+
782
+ // Full workflow: classify first, then extract
783
+ const { answer: docType } = await runOcrTool("tool_classifier", invoiceBase64);
784
+ if (docType === "invoice") {
785
+ const doc = await processImage(invoiceTemplateId, invoiceBase64);
786
+ }
787
+ ```
788
+
789
+ **Error responses:**
790
+
791
+ | HTTP | Condition |
792
+ | ---- | ------------------------------------------------------------------------------------------------------ |
793
+ | 400 | Missing required parameter, value exceeds `maxChars`, or prompt exceeds 3 000 chars after substitution |
794
+ | 402 | Document quota exhausted — upgrade plan or purchase add-on docs |
795
+ | 402 | AI credits exhausted — purchase add-on credits |
796
+
797
+ ---
798
+
799
+ ### Credits & Analytics
800
+
801
+ #### `getCreditsBalance()`
802
+
803
+ Returns the current AI-credit balance for the authenticated user.
804
+
805
+ ```js
806
+ const balance = await getCreditsBalance();
807
+ console.log(`Monthly credits: ${balance.monthlyBalance}`);
808
+ console.log(`Add-on credits: ${balance.addonBalance}`);
809
+ console.log(`Total available: ${balance.totalBalance}`);
810
+ ```
811
+
812
+ ---
813
+
814
+ #### `getCreditsHistory(options?)`
815
+
816
+ Returns a paginated history of AI credit consumption events (newest first).
817
+
818
+ | Option | Type | Default | Description |
819
+ | ------ | -------- | ------- | --------------------- |
820
+ | `page` | `number` | `0` | Zero-based page index |
821
+ | `size` | `number` | `20` | Page size (max 100) |
822
+
823
+ ```js
824
+ const history = await getCreditsHistory({ page: 0, size: 10 });
825
+ history.content.forEach((entry) => {
826
+ console.log(entry.timestamp, entry.operation, entry.creditsConsumed);
827
+ });
828
+ ```
829
+
830
+ ---
831
+
832
+ ## Sub-Users
833
+
834
+ > Requires a **Pro or higher** plan. The plan determines the maximum number of sub-users allowed.
835
+
836
+ Sub-users can log in to the web app and operate within the permissions you grant them.
837
+ Available permissions: `"upload"` · `"view"` · `"template"` · `"settings"` · `"export"` · `"api"`
838
+
839
+ ### Document History
840
+
841
+ ```js
842
+ import { getDocumentHistory } from 'extractia-sdk';
843
+
844
+ const log = await getDocumentHistory({ page: 0, size: 20 });
845
+ log.content.forEach((entry) => {
846
+ console.log(entry.templateName, entry.status, entry.uploadDate);
847
+ if (entry.status === 'FAILURE') console.error(entry.errorMessage);
848
+ });
849
+ ```
850
+
851
+ ### List Sub-Users
852
+
853
+ ```js
854
+ import { getSubUsers } from 'extractia-sdk';
855
+
856
+ const users = await getSubUsers();
857
+ // [{ username: 'agent_carlos', permissions: ['upload','view'], suspended: false }]
858
+ ```
859
+
860
+ ### Create a Sub-User
861
+
862
+ ```js
863
+ import { createSubUser } from 'extractia-sdk';
864
+
865
+ const sub = await createSubUser({
866
+ username: 'agent_carlos',
867
+ password: 'SecurePass1',
868
+ permissions: ['upload', 'view'],
869
+ });
870
+ ```
871
+
872
+ **Error codes:**
+
873
+ | Code | Reason |
874
+ |------|--------|
875
+ | `403` | Plan does not allow sub-users or limit reached |
876
+ | `409` | Username already in use |
877
+ | `400` | Missing fields or password matches the main account |
878
+
879
+ ### Update Permissions or Password
880
+
881
+ Only the fields you include are changed. Omit `password` to keep it unchanged.
882
+
883
+ ```js
884
+ import { updateSubUser } from 'extractia-sdk';
885
+
886
+ await updateSubUser('agent_carlos', {
887
+ permissions: ['upload', 'view', 'export'],
888
+ // password: 'NewPass99', ← optional
889
+ });
890
+ ```
891
+
892
+ ### Delete a Sub-User
893
+
894
+ ```js
895
+ import { deleteSubUser } from 'extractia-sdk';
896
+
897
+ await deleteSubUser('agent_carlos');
898
+ ```
899
+
900
+ ### Suspend / Reactivate
901
+
902
+ A suspended sub-user cannot log in. Calling the same method again reactivates them.
903
+
904
+ ```js
905
+ import { toggleSuspendSubUser } from 'extractia-sdk';
906
+
907
+ const state = await toggleSuspendSubUser('agent_carlos');
908
+ console.log(state.suspended); // true | false
909
+ ```
910
+
911
+ ---
912
+
913
+ ## TypeScript
914
+
915
+ The SDK ships with a full `index.d.ts` declaration file — no `@types` package needed.
916
+
917
+ ```ts
918
+ import {
919
+ setToken,
920
+ processImage,
921
+ runOcrTool,
922
+ suggestFields,
923
+ exportDocumentsJson,
924
+ UserDocument,
925
+ OcrRunResult,
926
+ FormField,
927
+ TierError,
928
+ RateLimitError,
929
+ } from "extractia-sdk";
930
+
931
+ setToken(process.env.EXTRACTIA_TOKEN!);
932
+
933
+ async function classifyAndExtract(
934
+ templateId: string,
935
+ ocrToolId: string,
936
+ imagePath: string,
937
+ ): Promise<UserDocument | null> {
938
+ const { readFileSync } = await import("fs");
939
+ const base64 = readFileSync(imagePath).toString("base64");
940
+
941
+ // Classify first
942
+ const check: OcrRunResult = await runOcrTool(ocrToolId, base64);
943
+ if (check.answer !== "YES") {
944
+ console.log("Document rejected:", check.explanation);
945
+ return null;
946
+ }
947
+
948
+ // Extract with template
949
+ return processImage(templateId, base64);
950
+ }
951
+ ```
952
+
953
+ ---
954
+
955
+ ## Rate Limits & Quotas
956
+
957
+ | Limit | Value |
958
+ | ------------------ | ---------------------------------------------------- |
959
+ | Max image size | 5 MB decoded |
960
+ | Processing timeout | 60 seconds |
961
+ | Monthly documents | Depends on plan (Free / Pro / Business / Enterprise) |
962
+ | Active API keys | 10 per account |
963
+
964
+ When the monthly quota is exhausted, processing calls throw a `TierError`.
965
+ Purchase extra document packs or upgrade your plan from the dashboard to continue.
966
+
967
+ ---
968
+
969
+ ## Changelog
970
+
971
+ ### v1.2.0
972
+
973
+ - **New:** `getDocumentHistory(opts?)` — paginated log of all document processing events (successes and failures)
974
+ - **New:** `getSubUsers()` — list all sub-users under your account
975
+ - **New:** `createSubUser({ username, password, permissions })` — create a sub-user (Pro+ plans)
976
+ - **New:** `updateSubUser(username, updates)` — change permissions or password of a sub-user
977
+ - **New:** `deleteSubUser(username)` — permanently remove a sub-user
978
+ - **New:** `toggleSuspendSubUser(username)` — suspend or reactivate a sub-user
979
+ - **Updated:** `getMyProfile()` response now includes `documentsAvailableThisMonth` and `extraDocsAvailable` quota fields
980
+ - Updated TypeScript declarations: `AppUserProfile`, `DocumentAuditEntry`, `SubUser` interfaces; all new function signatures
981
+
982
+ ### v1.1.0
983
+
984
+ - **New:** `suggestFields(templateName, context?)` — AI-powered field suggestions
985
+ - **New:** `getDocumentById(templateId, docId)` — fetch a single document
986
+ - **New:** `getRecentDocuments(size?)` — latest documents across all templates
987
+ - **New:** `generateDocumentSummary(docId)` — AI bullet-point summary of a document
988
+ - **New:** `updateDocumentStatus(docId, status)` — set workflow status
989
+ - **New:** `updateDocumentNotes(docId, notes)` — save reviewer annotations
990
+ - **New:** `updateDocumentData(docId, data, opts?)` — correct extracted data
991
+ - **New:** `bulkPreconform(ids)` — confirm multiple documents in one call
992
+ - **New:** `exportDocumentsCsv(templateId, opts?)` — export to CSV
993
+ - **New:** `exportDocumentsJson(templateId)` — export to JSON array
994
+ - **New:** `getOcrTools()` / `createOcrTool()` / `updateOcrTool()` / `deleteOcrTool()` / `runOcrTool(id, image)` — full OCR Tools API
995
+ - **New:** `getCreditsBalance()` / `getCreditsHistory(opts?)` — AI credits tracking
996
+ - Extended `FieldType` with `BOOLEAN`, `EMAIL`, `PHONE`, `ADDRESS`, `CURRENCY`
997
+ - Full TypeScript declarations updated for all new methods
998
+
999
+ ### v1.0.6
1000
+
1001
+ - Added `processImagesMultipage` for multi-page document support
1002
+ - Added typed error classes (`AuthError`, `TierError`, `RateLimitError`, `NotFoundError`, `ForbiddenError`)
1003
+ - Added TypeScript declaration file (`index.d.ts`)
1004
+ - Added `deleteAllTemplateDocuments` helper
1005
+ - Added browser IIFE build
1006
+
1007
+ ### v1.0.0
1008
+
1009
+ - Initial release: `setToken`, `getMyProfile`, `updateWebhook`, `getTemplates`,
1010
+ `createTemplate`, `updateTemplate`, `deleteTemplate`, `getDocumentsByTemplateId`,
1011
+ `deleteDocument`, `processImage`
1012
+
1013
+ ---
65
1014
 
66
1015
  ## License
67
1016
 
68
- ISC
1017
+ MIT