chub-dev 0.1.0 → 0.1.2-beta.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +55 -0
- package/bin/chub-mcp +2 -0
- package/dist/airtable/docs/database/javascript/DOC.md +1437 -0
- package/dist/airtable/docs/database/python/DOC.md +1735 -0
- package/dist/amplitude/docs/analytics/javascript/DOC.md +1282 -0
- package/dist/amplitude/docs/analytics/python/DOC.md +1199 -0
- package/dist/anthropic/docs/claude-api/javascript/DOC.md +503 -0
- package/dist/anthropic/docs/claude-api/python/DOC.md +389 -0
- package/dist/asana/docs/tasks/DOC.md +1396 -0
- package/dist/assemblyai/docs/transcription/DOC.md +1043 -0
- package/dist/atlassian/docs/confluence/javascript/DOC.md +1347 -0
- package/dist/atlassian/docs/confluence/python/DOC.md +1604 -0
- package/dist/auth0/docs/identity/javascript/DOC.md +968 -0
- package/dist/auth0/docs/identity/python/DOC.md +1199 -0
- package/dist/aws/docs/s3/javascript/DOC.md +1773 -0
- package/dist/aws/docs/s3/python/DOC.md +1807 -0
- package/dist/binance/docs/trading/javascript/DOC.md +1315 -0
- package/dist/binance/docs/trading/python/DOC.md +1454 -0
- package/dist/braintree/docs/gateway/javascript/DOC.md +1278 -0
- package/dist/braintree/docs/gateway/python/DOC.md +1179 -0
- package/dist/chromadb/docs/embeddings-db/javascript/DOC.md +1263 -0
- package/dist/chromadb/docs/embeddings-db/python/DOC.md +1707 -0
- package/dist/clerk/docs/auth/javascript/DOC.md +1220 -0
- package/dist/clerk/docs/auth/python/DOC.md +274 -0
- package/dist/cloudflare/docs/workers/javascript/DOC.md +918 -0
- package/dist/cloudflare/docs/workers/python/DOC.md +994 -0
- package/dist/cockroachdb/docs/distributed-db/DOC.md +1500 -0
- package/dist/cohere/docs/llm/DOC.md +1335 -0
- package/dist/datadog/docs/monitoring/javascript/DOC.md +1740 -0
- package/dist/datadog/docs/monitoring/python/DOC.md +1815 -0
- package/dist/deepgram/docs/speech/javascript/DOC.md +885 -0
- package/dist/deepgram/docs/speech/python/DOC.md +685 -0
- package/dist/deepl/docs/translation/javascript/DOC.md +887 -0
- package/dist/deepl/docs/translation/python/DOC.md +944 -0
- package/dist/deepseek/docs/llm/DOC.md +1220 -0
- package/dist/directus/docs/headless-cms/javascript/DOC.md +1128 -0
- package/dist/directus/docs/headless-cms/python/DOC.md +1276 -0
- package/dist/discord/docs/bot/javascript/DOC.md +1090 -0
- package/dist/discord/docs/bot/python/DOC.md +1130 -0
- package/dist/elasticsearch/docs/search/DOC.md +1634 -0
- package/dist/elevenlabs/docs/text-to-speech/javascript/DOC.md +336 -0
- package/dist/elevenlabs/docs/text-to-speech/python/DOC.md +552 -0
- package/dist/firebase/docs/auth/DOC.md +1015 -0
- package/dist/gemini/docs/genai/javascript/DOC.md +691 -0
- package/dist/gemini/docs/genai/python/DOC.md +555 -0
- package/dist/github/docs/octokit/DOC.md +1560 -0
- package/dist/google/docs/bigquery/javascript/DOC.md +1688 -0
- package/dist/google/docs/bigquery/python/DOC.md +1503 -0
- package/dist/hubspot/docs/crm/javascript/DOC.md +1805 -0
- package/dist/hubspot/docs/crm/python/DOC.md +2033 -0
- package/dist/huggingface/docs/transformers/DOC.md +948 -0
- package/dist/intercom/docs/messaging/javascript/DOC.md +1844 -0
- package/dist/intercom/docs/messaging/python/DOC.md +1797 -0
- package/dist/jira/docs/issues/javascript/DOC.md +1420 -0
- package/dist/jira/docs/issues/python/DOC.md +1492 -0
- package/dist/kafka/docs/streaming/javascript/DOC.md +1671 -0
- package/dist/kafka/docs/streaming/python/DOC.md +1464 -0
- package/dist/landingai-ade/docs/api/DOC.md +620 -0
- package/dist/landingai-ade/docs/sdk/python/DOC.md +489 -0
- package/dist/landingai-ade/docs/sdk/typescript/DOC.md +542 -0
- package/dist/landingai-ade/skills/SKILL.md +489 -0
- package/dist/launchdarkly/docs/feature-flags/javascript/DOC.md +1191 -0
- package/dist/launchdarkly/docs/feature-flags/python/DOC.md +1671 -0
- package/dist/linear/docs/tracker/DOC.md +1554 -0
- package/dist/livekit/docs/realtime/javascript/DOC.md +303 -0
- package/dist/livekit/docs/realtime/python/DOC.md +163 -0
- package/dist/mailchimp/docs/marketing/DOC.md +1420 -0
- package/dist/meilisearch/docs/search/DOC.md +1241 -0
- package/dist/microsoft/docs/onedrive/javascript/DOC.md +1421 -0
- package/dist/microsoft/docs/onedrive/python/DOC.md +1549 -0
- package/dist/mongodb/docs/atlas/DOC.md +2041 -0
- package/dist/notion/docs/workspace-api/javascript/DOC.md +1435 -0
- package/dist/notion/docs/workspace-api/python/DOC.md +1400 -0
- package/dist/okta/docs/identity/javascript/DOC.md +1171 -0
- package/dist/okta/docs/identity/python/DOC.md +1401 -0
- package/dist/openai/docs/chat/javascript/DOC.md +407 -0
- package/dist/openai/docs/chat/python/DOC.md +568 -0
- package/dist/paypal/docs/checkout/DOC.md +278 -0
- package/dist/pinecone/docs/sdk/javascript/DOC.md +984 -0
- package/dist/pinecone/docs/sdk/python/DOC.md +1395 -0
- package/dist/plaid/docs/banking/javascript/DOC.md +1163 -0
- package/dist/plaid/docs/banking/python/DOC.md +1203 -0
- package/dist/playwright-community/skills/login-flows/SKILL.md +108 -0
- package/dist/postmark/docs/transactional-email/DOC.md +1168 -0
- package/dist/prisma/docs/orm/javascript/DOC.md +1419 -0
- package/dist/prisma/docs/orm/python/DOC.md +1317 -0
- package/dist/qdrant/docs/vector-search/javascript/DOC.md +1221 -0
- package/dist/qdrant/docs/vector-search/python/DOC.md +1653 -0
- package/dist/rabbitmq/docs/message-queue/javascript/DOC.md +1193 -0
- package/dist/rabbitmq/docs/message-queue/python/DOC.md +1243 -0
- package/dist/razorpay/docs/payments/javascript/DOC.md +1219 -0
- package/dist/razorpay/docs/payments/python/DOC.md +1330 -0
- package/dist/redis/docs/key-value/javascript/DOC.md +1851 -0
- package/dist/redis/docs/key-value/python/DOC.md +2054 -0
- package/dist/registry.json +2817 -0
- package/dist/replicate/docs/model-hosting/DOC.md +1318 -0
- package/dist/resend/docs/email/DOC.md +1271 -0
- package/dist/salesforce/docs/crm/javascript/DOC.md +1241 -0
- package/dist/salesforce/docs/crm/python/DOC.md +1183 -0
- package/dist/search-index.json +1 -0
- package/dist/sendgrid/docs/email-api/javascript/DOC.md +371 -0
- package/dist/sendgrid/docs/email-api/python/DOC.md +656 -0
- package/dist/sentry/docs/error-tracking/javascript/DOC.md +1073 -0
- package/dist/sentry/docs/error-tracking/python/DOC.md +1309 -0
- package/dist/shopify/docs/storefront/DOC.md +457 -0
- package/dist/slack/docs/workspace/javascript/DOC.md +933 -0
- package/dist/slack/docs/workspace/python/DOC.md +271 -0
- package/dist/square/docs/payments/javascript/DOC.md +1855 -0
- package/dist/square/docs/payments/python/DOC.md +1728 -0
- package/dist/stripe/docs/api/DOC.md +1727 -0
- package/dist/stripe/docs/payments/DOC.md +1726 -0
- package/dist/stytch/docs/auth/javascript/DOC.md +1813 -0
- package/dist/stytch/docs/auth/python/DOC.md +1962 -0
- package/dist/supabase/docs/client/DOC.md +1606 -0
- package/dist/twilio/docs/messaging/python/DOC.md +469 -0
- package/dist/twilio/docs/messaging/typescript/DOC.md +946 -0
- package/dist/vercel/docs/platform/DOC.md +1940 -0
- package/dist/weaviate/docs/vector-db/javascript/DOC.md +1268 -0
- package/dist/weaviate/docs/vector-db/python/DOC.md +1388 -0
- package/dist/zendesk/docs/support/javascript/DOC.md +2150 -0
- package/dist/zendesk/docs/support/python/DOC.md +2297 -0
- package/package.json +22 -6
- package/skills/get-api-docs/SKILL.md +84 -0
- package/src/commands/annotate.js +83 -0
- package/src/commands/build.js +12 -1
- package/src/commands/feedback.js +150 -0
- package/src/commands/get.js +83 -42
- package/src/commands/search.js +7 -0
- package/src/index.js +43 -17
- package/src/lib/analytics.js +90 -0
- package/src/lib/annotations.js +57 -0
- package/src/lib/bm25.js +170 -0
- package/src/lib/cache.js +69 -6
- package/src/lib/config.js +8 -3
- package/src/lib/identity.js +99 -0
- package/src/lib/registry.js +103 -20
- package/src/lib/telemetry.js +86 -0
- package/src/mcp/server.js +177 -0
- package/src/mcp/tools.js +251 -0
@@ -0,0 +1,542 @@
---
name: sdk
description: "TypeScript/JavaScript SDK reference for LandingAI's Agentic Document Extraction (ADE). Includes type definitions, Zod schema validation, async processing, error handling, type guards, and complete API context."
metadata:
  languages: "typescript"
  versions: "2.2.0"
  updated-on: "2026-03-04"
  source: maintainer
  tags: "landingai,ade,typescript,javascript,sdk,zod,document-extraction,parse,extract,split,async"
---

# LandingAI ADE — TypeScript SDK Reference

TypeScript/JavaScript SDK for LandingAI's Agentic Document Extraction.

## Installation

```bash
npm install landingai-ade
# or: yarn add landingai-ade / pnpm add landingai-ade
export VISION_AGENT_API_KEY="v2_..."
```

## Client Setup

```typescript
import { LandingAIADE } from "landingai-ade";

const client = new LandingAIADE(); // Uses VISION_AGENT_API_KEY env var
```

### Constructor Arguments

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `apikey` | `string \| undefined` | env `VISION_AGENT_API_KEY` | API key (note: lowercase) |
| `environment` | `"production" \| "eu"` | `"production"` | Region — `"production"` (US) or `"eu"` |
| `baseURL` | `string \| undefined` | — | Override base URL |
| `timeout` | `number \| undefined` | SDK default | Request timeout in ms |
| `maxRetries` | `number \| undefined` | SDK default | Max retry attempts for transient errors |
| `defaultHeaders` | `Record<string, string>` | — | Custom headers for all requests |
| `fetch` | `typeof global.fetch` | — | Custom fetch implementation |

```typescript
// The three alternatives use distinct names so the snippet compiles as one file.

// EU region
const euClient = new LandingAIADE({ environment: "eu" });

// Pass key directly
const keyedClient = new LandingAIADE({ apikey: "v2_..." });

// Full config
const configuredClient = new LandingAIADE({
  apikey: "v2_...",
  timeout: 60000,
  maxRetries: 3,
});
```

---

## 1. Parse

Converts documents to structured markdown with visual grounding.

### Arguments

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `document` | `Uploadable \| null` | One required | Local file (Buffer, ReadStream, File object) |
| `document_url` | `string \| null` | One required | Remote document URL |
| `model` | `string \| null` | No | Model version (default: `dpt-2-latest`) |
| `split` | `"page" \| null` | No | Split by pages |
| `saveTo` | `string` | No | Directory to save `{filename}_parse_output.json` |

### Returns `ParseResponse`

```
.markdown → string: full document as markdown
.chunks[] → Chunk: {id, type, markdown, grounding: {page, box}}
.grounding → Record<string, Grounding>: bounding boxes and tableCell positions
.splits[] → Split: {chunks[], class, identifier, markdown, pages[]} (only if split="page")
.metadata → Metadata: {filename, page_count, duration_ms, credit_usage, version, job_id, failed_pages}
```

### Example

```typescript
import fs from "fs";

// `client` is the LandingAIADE instance created in Client Setup
const response = await client.parse({
  document: fs.createReadStream("./invoice.pdf"),
  model: "dpt-2-latest",
  saveTo: "./output",
});

console.log(response.markdown);
console.log(`${response.chunks.length} chunks, ${response.metadata.page_count} pages`);

const tables = response.chunks.filter(c => c.type === "table");
```

### Working with Chunks and Grounding

```typescript
// Filter by type and page
const tables = response.chunks.filter(c => c.type === "table");
const page0 = response.chunks.filter(c => c.grounding.page === 0);

// Find table cells with positions
const tableCells = Object.entries(response.grounding)
  .filter(([_, g]) => g.type === "tableCell")
  .map(([id, g]) => ({ id, page: g.page, position: g.position! }));

tableCells.forEach(cell => {
  const { row, col, rowspan, colspan } = cell.position;
  console.log(`Cell (${row},${col}) span ${rowspan}x${colspan}`);
});
```
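
The `tableCell` positions above can be laid out on a 2D grid, which makes row and column spans easier to see. The following is a minimal sketch rather than part of the SDK: `buildCellGrid` and the local `CellGrounding` type are hypothetical names, and it assumes that `position.chunk_id` identifies the table chunk a cell belongs to.

```typescript
// Local stand-ins for the grounding shapes described in this document
// (assumed, not imported from the SDK).
interface TablePosition {
  row: number;
  col: number;
  rowspan: number;
  colspan: number;
  chunk_id: string;
}

interface CellGrounding {
  type: string;
  page: number;
  position?: TablePosition;
}

// Hypothetical helper: place each tableCell ID of one table on a row/column grid.
// A cell with rowspan/colspan > 1 fills every slot it covers; empty slots stay null.
function buildCellGrid(
  grounding: Record<string, CellGrounding>,
  chunkId: string
): (string | null)[][] {
  const cells: { id: string; pos: TablePosition }[] = [];
  for (const [id, g] of Object.entries(grounding)) {
    if (g.type === "tableCell" && g.position && g.position.chunk_id === chunkId) {
      cells.push({ id, pos: g.position });
    }
  }

  // Grid size is the furthest row/column any cell reaches.
  let rows = 0;
  let cols = 0;
  for (const { pos } of cells) {
    rows = Math.max(rows, pos.row + pos.rowspan);
    cols = Math.max(cols, pos.col + pos.colspan);
  }

  const grid: (string | null)[][] = [];
  for (let r = 0; r < rows; r++) {
    grid.push(new Array<string | null>(cols).fill(null));
  }

  for (const { id, pos } of cells) {
    for (let r = pos.row; r < pos.row + pos.rowspan; r++) {
      for (let c = pos.col; c < pos.col + pos.colspan; c++) {
        grid[r][c] = id;
      }
    }
  }
  return grid;
}
```

With a real response this would be called as `buildCellGrid(response.grounding, tableChunk.id)`, provided the SDK's `Grounding` type is structurally compatible with the local stand-in.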

---

## 2. Extract

Extracts structured data from markdown using a JSON schema.

### Arguments

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `schema` | `string` | Yes | JSON schema string (use `zodToJsonSchema()` from `zod-to-json-schema` to generate from Zod models) |
| `markdown` | `Uploadable \| string \| null` | One required | Markdown content, string, or file |
| `markdown_url` | `string \| null` | One required | URL to markdown |
| `model` | `string \| null` | No | Model version (default: `extract-latest`) |
| `saveTo` | `string` | No | Directory to save `{filename}_extract_output.json` |

### Returns `ExtractResponse`

```
.extraction → Record<string, any>: extracted key-value pairs matching schema
.extraction_metadata → Record<string, {references?: string[]}>: chunk references for grounding
.metadata → Metadata: {credit_usage, duration_ms, filename, job_id, version, schema_violation_error}
```

### Using Zod for Schema Validation

```typescript
import fs from "fs";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema"; // separate npm package

const InvoiceSchema = z.object({
  invoice_number: z.string().describe("Invoice number or ID"),
  total_amount: z.number().positive().describe("Total amount"),
  vendor_name: z.string().describe("Vendor name"),
  line_items: z.array(z.object({
    description: z.string(),
    quantity: z.number().int().positive(),
    unit_price: z.number().positive(),
    total: z.number().positive()
  })).optional()
});

// Parse once, extract many
const parsed = await client.parse({ document: fs.createReadStream("./invoice.pdf") });

const response = await client.extract({
  markdown: parsed.markdown,
  schema: JSON.stringify(zodToJsonSchema(InvoiceSchema)),
});

// Validate extracted data against Zod schema (throws a ZodError on mismatch)
const validated = InvoiceSchema.parse(response.extraction);
console.log(`Invoice ${validated.invoice_number}: $${validated.total_amount}`);
```

### Grounding References (Tracing Back to Source)

```typescript
const chunkMap = new Map(parsed.chunks.map(c => [c.id, c]));

Object.entries(response.extraction).forEach(([field, value]) => {
  const refs = response.extraction_metadata[field]?.references;
  if (refs?.length) {
    const chunk = chunkMap.get(refs[0]);
    if (chunk) {
      console.log(`${field}=${value} → page ${chunk.grounding.page}`);
    }
  }
});
```

---

## 3. Split

Classifies and splits mixed documents by type.

### Arguments

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `split_class` | `Array<{name: string, description?: string, identifier?: string}>` | Yes | Classification configuration |
| `markdown` | `Uploadable \| string \| null` | One required | Markdown content or file |
| `markdownUrl` | `string \| null` | One required | URL to markdown |
| `model` | `string \| null` | No | Model version (default: `split-latest`) |
| `saveTo` | `string` | No | Directory to save `{filename}_split_output.json` |

### Returns `SplitResponse`

```
.splits[] → Split: {classification, identifier, markdowns[], pages[], chunks[], class}
.metadata → Metadata: {credit_usage, duration_ms, filename, page_count}
```

### Split → Extract Pipeline

```typescript
const parsed = await client.parse({ document: fs.createReadStream("./mixed_invoices.pdf") });

const splitResponse = await client.split({
  markdown: parsed.markdown,
  split_class: [
    { name: "Invoice", description: "Sales invoice", identifier: "Invoice Number" },
    { name: "Receipt", description: "Payment receipt", identifier: "Receipt Number" },
  ],
});

for (const split of splitResponse.splits) {
  console.log(`${split.classification}: ${split.identifier} (pages ${split.pages})`);
}

// Extract from each split
const schema = JSON.stringify(zodToJsonSchema(InvoiceSchema));
const results = [];
for (const split of splitResponse.splits) {
  const extracted = await client.extract({ markdown: split.markdowns[0], schema });
  results.push({ type: split.classification, id: split.identifier, data: extracted.extraction });
}
```

---

## 4. Parse Jobs (Async, Large Files)

For files >50MB, use asynchronous processing.

### `parseJobs.create()` Arguments

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `document` | `Uploadable \| null` | One required | Local file |
| `document_url` | `string \| null` | One required | Remote document URL |
| `model` | `string \| null` | No | Model version (default: `dpt-2-latest`) |
| `split` | `"page" \| null` | No | Split by pages |
| `output_save_url` | `string \| null` | If ZDR | URL for zero data retention output |

### Returns `ParseJobCreateResponse`

```
.job_id → string: unique job identifier
```

### `parseJobs.get(jobId)` Returns `ParseJobGetResponse`

```
.job_id → string
.status → string: pending|processing|completed|failed|cancelled
.progress → number: 0.0 to 1.0
.failure_reason → string | undefined: error message if failed
.data → ParseResponse | undefined: full result when completed
.output_url → string | undefined: presigned URL if result >1MB (expires 1hr)
.received_at → number: Unix timestamp
.org_id → string
.version → string
.metadata → ParseMetadata | undefined
```

### `parseJobs.list()` Arguments & Returns

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `status` | `string` | No | Filter: `"pending" \| "processing" \| "completed" \| "failed" \| "cancelled"` |
| `page` | `number` | No | Page number (0-indexed) |
| `pageSize` | `number` | No | Items per page |

```
.jobs[] → JobSummary: {job_id, status, progress, received_at, failure_reason}
.has_more → boolean
```

### Example

```typescript
import fs from "fs";

const job = await client.parseJobs.create({
  document: fs.createReadStream("./large.pdf"),
});
console.log(`Job ID: ${job.job_id}`);

while (true) {
  const status = await client.parseJobs.get(job.job_id);
  console.log(`${status.status}: ${(status.progress * 100).toFixed(0)}%`);

  if (status.status === "completed") {
    // ParseResponse; if data is absent, check status.output_url instead
    const result = status.data!;
    break;
  }
  // "cancelled" is also a terminal state; without this check the loop never exits
  if (status.status === "failed" || status.status === "cancelled") {
    throw new Error(`Job ${status.status}: ${status.failure_reason}`);
  }

  await new Promise(r => setTimeout(r, 5000));
}
```
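
The polling loop above can be factored into a reusable helper with an overall deadline, so a stuck job cannot block forever. This is a sketch under stated assumptions, not SDK functionality: `pollJob`, `JobStatus`, and `PollOptions` are hypothetical names, the default interval and timeout are arbitrary, and only the status values listed for `ParseJobGetResponse` are considered.

```typescript
// Minimal structural view of a job status. Field names follow the
// ParseJobGetResponse description in this document.
interface JobStatus<T> {
  status: string;
  failure_reason?: string;
  data?: T;
}

interface PollOptions {
  intervalMs?: number; // delay between polls (assumed default: 5 s)
  timeoutMs?: number;  // overall deadline (assumed default: 10 min)
}

// Hypothetical helper: call getStatus until the job reaches a terminal state.
// Resolves with the completed status; throws on failed, cancelled, or deadline.
async function pollJob<T>(
  getStatus: () => Promise<JobStatus<T>>,
  options: PollOptions = {}
): Promise<JobStatus<T>> {
  const intervalMs = options.intervalMs ?? 5000;
  const timeoutMs = options.timeoutMs ?? 10 * 60 * 1000;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const job = await getStatus();
    if (job.status === "completed") {
      return job;
    }
    if (job.status === "failed" || job.status === "cancelled") {
      throw new Error(`Job ${job.status}: ${job.failure_reason ?? "no reason given"}`);
    }
    // Give up if the next poll would land past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Job still ${job.status} after ${timeoutMs} ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

Taking `getStatus` as a parameter keeps the helper independent of the SDK. A call such as `pollJob(() => client.parseJobs.get(job.job_id))` should work if the SDK's response type is structurally compatible with `JobStatus`.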

---

## Error Handling

### Error Classes

All errors inherit from `LandingAIADEError`. Import from `"landingai-ade"`:

| Exception | HTTP Status | Description |
|-----------|-------------|-------------|
| `BadRequestError` | 400 | Invalid parameters |
| `AuthenticationError` | 401 | Invalid API key |
| `PermissionDeniedError` | 403 | Forbidden |
| `NotFoundError` | 404 | Resource not found |
| `ConflictError` | 409 | Conflict |
| `UnprocessableEntityError` | 422 | Invalid file type or malformed schema |
| `RateLimitError` | 429 | Too many requests |
| `InternalServerError` | 5xx | Server error |
| `APIConnectionError` | — | Network failure |
| `APIConnectionTimeoutError` | — | Request timeout (extends `APIConnectionError`) |

### Retry with Fallback to Jobs

```typescript
import fs from "fs";
import {
  LandingAIADE,
  RateLimitError,
  APIConnectionTimeoutError,
  AuthenticationError,
  APIConnectionError,
} from "landingai-ade";

// ParseResponse is the interface shown under Type Definitions.
// parseLargeFile is not defined in this document: supply your own helper that
// calls client.parseJobs.create() and polls client.parseJobs.get().
declare function parseLargeFile(client: LandingAIADE, filePath: string): Promise<ParseResponse>;

async function robustParse(
  client: LandingAIADE, filePath: string, maxRetries = 3
): Promise<ParseResponse> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.parse({ document: fs.createReadStream(filePath) });
    } catch (error) {
      if (error instanceof RateLimitError) {
        // Exponential backoff: 10 s, 20 s, 40 s, ...
        await new Promise(r => setTimeout(r, Math.pow(2, attempt) * 10000));
      } else if (error instanceof APIConnectionTimeoutError) {
        // Checked before APIConnectionError because it is a subclass of it
        console.log("Timeout — switching to parse jobs");
        return await parseLargeFile(client, filePath);
      } else if (error instanceof AuthenticationError) {
        throw error; // Non-retryable
      } else if (error instanceof APIConnectionError) {
        await new Promise(r => setTimeout(r, 2000));
      } else {
        throw error;
      }
    }
  }
  throw new Error("Failed after retries");
}
```
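
The rate-limit branch above doubles its delay on every attempt with no upper bound. One common refinement is to cap the delay and optionally add jitter so that concurrent clients do not retry in lockstep. The helper below is a sketch of that idea: `backoffDelayMs` is a hypothetical name, and the 60 s cap is an assumed value rather than a documented limit.

```typescript
// Hypothetical helper: exponential backoff with a ceiling and optional jitter.
function backoffDelayMs(
  attempt: number,                 // 0-based retry attempt
  baseMs = 10_000,                 // first delay, matching the 10 s used above
  capMs = 60_000,                  // upper bound (assumed value)
  random: () => number = () => 1   // pass Math.random for full jitter
): number {
  const exponential = baseMs * Math.pow(2, attempt);
  // With the default random() === 1 the result is the plain capped delay.
  return Math.floor(Math.min(capMs, exponential) * random());
}
```

In `robustParse`, the rate-limit branch could then wait `backoffDelayMs(attempt, 10_000, 60_000, Math.random)` milliseconds instead of the uncapped value.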

---

## Type Definitions

```typescript
interface ParseResponse {
  markdown: string;
  chunks: Chunk[];
  grounding: Record<string, Grounding>;
  splits?: Split[];
  metadata: Metadata;
}

interface Chunk {
  id: string;
  type: "text" | "table" | "figure" | "marginalia" | "logo" | "card" | "attestation" | "scan_code";
  markdown: string;
  grounding: { page: number; box: BoundingBox };
}

interface BoundingBox {
  left: number; top: number; right: number; bottom: number; // 0-1 normalized
}

interface Grounding {
  type: string;
  page: number;
  box: BoundingBox;
  position?: TablePosition; // Only for tableCell type
}

interface TablePosition {
  row: number; col: number; rowspan: number; colspan: number; chunk_id: string;
}

interface ExtractResponse {
  extraction: Record<string, any>;
  extraction_metadata: Record<string, { references?: string[] }>;
  metadata: Metadata;
}

interface SplitResponse {
  splits: Split[];
  metadata: Metadata;
}

interface Split {
  chunks: string[];
  class: string;
  classification: string;
  identifier: string;
  markdowns: string[];
  pages: number[];
}

interface Metadata {
  filename: string; org_id: string; page_count: number;
  duration_ms: number; credit_usage: number; version: string;
  job_id: string; failed_pages?: number[];
}
```

### Type Guards

```typescript
// A type predicate (rather than a plain boolean) lets TypeScript narrow chunk.type
function isTableChunk(chunk: Chunk): chunk is Chunk & { type: "table" } {
  return chunk.type === "table";
}

function isTableCell(
  grounding: Grounding
): grounding is Grounding & { position: TablePosition } {
  return grounding.type === "tableCell" && grounding.position !== undefined;
}

// Usage
Object.values(response.grounding).forEach(g => {
  if (isTableCell(g)) {
    console.log(`Cell at (${g.position.row}, ${g.position.col})`);
  }
});
```

---

## API Reference

The following sections provide the complete API context so this document is fully self-contained.

### Base Configuration

| Region | Base URL |
|--------|----------|
| US (default) | `https://api.va.landing.ai/v1/ade` |
| EU | `https://api.va.eu-west-1.landing.ai/v1/ade` |

**Authentication**: All requests require `Authorization: Bearer $VISION_AGENT_API_KEY`
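
For callers that bypass the SDK, the region table and the authentication header combine as in the sketch below. `buildRequest`, `Region`, and `BASE_URLS` are hypothetical names introduced for illustration. The sketch only assembles the URL and headers, and leaves the request body (file upload or URL field) to the caller.

```typescript
type Region = "production" | "eu";

// Base URLs copied from the region table above.
const BASE_URLS: Record<Region, string> = {
  production: "https://api.va.landing.ai/v1/ade",
  eu: "https://api.va.eu-west-1.landing.ai/v1/ade",
};

// Hypothetical helper: assemble the URL and auth header for an ADE endpoint
// such as "parse", "extract", "split", or "parse/jobs".
function buildRequest(region: Region, endpoint: string, apiKey: string) {
  return {
    url: `${BASE_URLS[region]}/${endpoint}`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

The result can be passed to `fetch(url, { method: "POST", headers, body })`. How the body must be encoded is not stated in this document, so check the API documentation before relying on a particular format.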

### Quick Reference

| Endpoint | Method | Path | Model | Input |
|----------|--------|------|-------|-------|
| Parse | POST | `/v1/ade/parse` | `dpt-2-latest` | `document` (file) or `document_url` |
| Extract | POST | `/v1/ade/extract` | `extract-latest` | `markdown` (file/string) or `markdown_url` + `schema` |
| Split | POST | `/v1/ade/split` | `split-latest` | `markdown` (file/string) or `markdown_url` + `split_class` |
| Create Job | POST | `/v1/ade/parse/jobs` | `dpt-2-latest` | `document` or `document_url` |
| Get Job | GET | `/v1/ade/parse/jobs/{id}` | — | — |
| List Jobs | GET | `/v1/ade/parse/jobs` | — | `?status=&page=&pageSize=` |

### Data Types

#### Chunk Types
- `text` — Characters, paragraphs, headings, lists, form fields, checkboxes, code blocks
- `table` — Grid of rows and columns; includes spreadsheets and receipts
- `figure` — Visual/graphical non-text content — images, graphs, flowcharts, diagrams
- `marginalia` — Content in document margins — headers, footers, page numbers, handwritten notes
- `logo` — Logos (DPT-2 only)
- `card` — ID cards and driver's licenses (DPT-2 only)
- `attestation` — Signatures, stamps, and seals (DPT-2 only)
- `scan_code` — QR codes and barcodes (DPT-2 only)

#### Grounding Types
- Chunk grounding: `chunkText`, `chunkTable`, `chunkFigure`, `chunkMarginalia`, `chunkLogo`, `chunkCard`, `chunkAttestation`, `chunkScanCode`
- Structure: `table`, `tableCell` (with position data)

#### Bounding Box
All coordinates normalized 0–1: `{ left, top, right, bottom }`.
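
Because the coordinates are normalized, drawing a box on a rendered page means scaling by that page's pixel size. The sketch below shows one way to do that. `toPixelRect` and `PixelRect` are hypothetical names, and it assumes the origin is the top-left corner of the page, which this document does not state.

```typescript
interface BoundingBox {
  left: number;
  top: number;
  right: number;
  bottom: number; // all values normalized to 0-1
}

interface PixelRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: scale a normalized box to pixel coordinates.
// Edges are rounded first so adjacent boxes share exact pixel boundaries.
function toPixelRect(box: BoundingBox, pageWidth: number, pageHeight: number): PixelRect {
  const x = Math.round(box.left * pageWidth);
  const y = Math.round(box.top * pageHeight);
  const right = Math.round(box.right * pageWidth);
  const bottom = Math.round(box.bottom * pageHeight);
  return { x, y, width: right - x, height: bottom - y };
}
```

For a chunk from a parse response, this would be called as `toPixelRect(chunk.grounding.box, imageWidth, imageHeight)`, using the dimensions of the rendered page image.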

#### Table Cell Position
`{ row, col, rowspan, colspan, chunk_id }` — zero-indexed.

#### Table Chunk Formats

**PDF/Image tables**: Element IDs use `{page}-{base62_seq}`. Grounding object has bounding boxes and `tableCell` entries.

**Spreadsheet tables (XLSX/CSV)**: Element IDs use `{tab_name}-{cell_ref}` (e.g., `Sheet 1-B2`). **Grounding is null** — positions are encoded in IDs.
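
Since spreadsheet positions live in the element IDs, recovering a row and column means parsing the ID. The sketch below assumes `cell_ref` is an A1-style reference (uppercase column letters followed by a 1-based row number), which matches the `Sheet 1-B2` example but is not spelled out here. `parseSheetCellId` and `SheetCellRef` are hypothetical names.

```typescript
interface SheetCellRef {
  tab: string;  // sheet/tab name
  cell: string; // A1-style reference, e.g. "B2"
  row: number;  // zero-indexed
  col: number;  // zero-indexed
}

// Hypothetical helper: split "{tab_name}-{cell_ref}" into its parts.
// Returns null when the ID does not end in an A1-style reference.
function parseSheetCellId(id: string): SheetCellRef | null {
  // The greedy (.*) consumes up to the last hyphen, so tab names that
  // themselves contain hyphens stay intact.
  const match = /^(.*)-([A-Z]+)([1-9][0-9]*)$/.exec(id);
  if (!match) {
    return null;
  }
  const tab = match[1];
  const letters = match[2];
  const digits = match[3];

  // Column letters are bijective base-26: A=1 ... Z=26, AA=27, ...
  let col = 0;
  for (let i = 0; i < letters.length; i++) {
    col = col * 26 + (letters.charCodeAt(i) - 64);
  }
  return { tab, cell: letters + digits, row: Number(digits) - 1, col: col - 1 };
}
```

IDs in the PDF/image format, such as a page number followed by a lowercase sequence, do not match the pattern and yield `null`, so the helper can also serve as a rough format check.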

### Error Codes

| Status | Error Type | Description | Solution |
|--------|------------|-------------|----------|
| 400 | `validation_error` | Invalid parameters | Check request format |
| 401 | `authentication_error` | Invalid API key | Check VISION_AGENT_API_KEY |
| 413 | `payload_too_large` | File too large | Use Parse Jobs API |
| 422 | `unprocessable_entity` | Invalid file type or malformed schema | Validate file format and schema JSON |
| 429 | `rate_limit_error` | Too many requests | Implement backoff |
| 500 | `internal_error` | Server error | Retry with backoff |
| 504 | `timeout_error` | Request timeout | Use Parse Jobs API |
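
The Solution column above groups the status codes into three reactions: fix the request, retry with backoff, or switch to Parse Jobs. A caller could encode that mapping as in the sketch below. `recommendedAction` and the `Action` labels are hypothetical names, and codes not listed in the table map to `"unknown"`.

```typescript
type Action = "fix_request" | "retry_with_backoff" | "use_parse_jobs" | "unknown";

// Hypothetical helper: map an HTTP status to the reaction suggested in the table.
function recommendedAction(status: number): Action {
  switch (status) {
    case 400: // validation_error
    case 401: // authentication_error
    case 422: // unprocessable_entity
      return "fix_request";
    case 429: // rate_limit_error
    case 500: // internal_error
      return "retry_with_backoff";
    case 413: // payload_too_large
    case 504: // timeout_error
      return "use_parse_jobs";
    default:
      return "unknown";
  }
}
```

Keeping the mapping in one function means retry loops can branch on the action rather than repeating status checks.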

### Supported File Types

| Category | Formats | Notes |
|----------|---------|-------|
| **PDF** | PDF | Up to 100 pages; no password-protected files |
| **Images** | JPEG, JPG, PNG, APNG, BMP, DCX, DDS, DIB, GD, GIF, ICNS, JP2, PCX, PPM, PSD, TGA, TIF, TIFF, WEBP | |
| **Text Documents** | DOC, DOCX, ODT | Converted to PDF before parsing |
| **Presentations** | ODP, PPT, PPTX | Converted to PDF before parsing |
| **Spreadsheets** | CSV, XLSX | Up to 10 MB in Playground; no sheet/column/row limits |

> **Note:** Word, PowerPoint, and OpenDocument files are converted to PDF server-side before parsing.

### Model Versions

| Operation | Current Version | Description |
|-----------|----------------|-------------|
| Parse | `dpt-2-latest` | Document parsing and OCR |
| Extract | `extract-latest` | Schema-based extraction |
| Split | `split-latest` | Document classification |

---

## External Links

- [TypeScript SDK Documentation](https://docs.landing.ai/ade/ade-typescript)
- [TypeScript SDK GitHub](https://github.com/landing-ai/ade-typescript)