@ontos-ai/knowhere-sdk 0.2.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Knowhere Team
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,254 @@
1
+ # Knowhere Node.js SDK
2
+
3
+ Official Node.js/TypeScript SDK for the [Knowhere](https://knowhereto.ai) document parsing API.
4
+
5
+ [![npm version](https://badge.fury.io/js/@ontos-ai%2Fknowhere-sdk.svg)](https://www.npmjs.com/package/@ontos-ai/knowhere-sdk)
6
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
+ [![Node.js Version](https://img.shields.io/node/v/@ontos-ai/knowhere-sdk)](https://nodejs.org)
8
+
9
+ ## Features
10
+
11
+ - 🚀 **TypeScript-first** - Full type safety with comprehensive type definitions
12
+ - 📦 **Stream-based uploads** - Efficient handling of large files
13
+ - 🔄 **Automatic retries** - Exponential backoff for transient failures
14
+ - 📊 **Adaptive polling** - Smart waiting for job completion
15
+ - 🎯 **Progressive API** - High-level convenience methods + low-level control
16
+ - ⚡ **Modern JavaScript** - ESM and CommonJS support
17
+
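The feature list mentions automatic retries with exponential backoff. The README does not document the SDK's actual delays or retry conditions, so the following is only a minimal sketch of the general technique. The names `backoffDelay` and `withRetries`, the 500 ms base delay, and the 30 s cap are assumptions made for illustration.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based).
// The delay doubles on each attempt and is capped at `maxMs`.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical helper: run `task`, retrying up to `maxRetries` times
// and sleeping for an exponentially growing delay between attempts.
async function withRetries<T>(
  task: () => Promise<T>,
  maxRetries = 5,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // Give up once the retry budget is spent.
      if (attempt >= maxRetries) throw error;
      const delay = backoffDelay(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A production version would normally retry only errors known to be transient and often adds random jitter to the delay; both are omitted here to keep the sketch short.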
18
+ ## Installation
19
+
20
+ ```bash
21
+ npm install @ontos-ai/knowhere-sdk
22
+ ```
23
+
24
+ **Requirements:**
25
+
26
+ - Node.js >= 20.19.0
27
+ - npm >= 10.0.0
28
+ - TypeScript >= 5.0 (optional, for type checking)
29
+
30
+ ## Quick Start
31
+
32
+ ```typescript
33
+ import Knowhere from '@ontos-ai/knowhere-sdk';
34
+
35
+ // Initialize client
36
+ const client = new Knowhere({
37
+   apiKey: process.env.KNOWHERE_API_KEY,
38
+ });
39
+
40
+ // Parse a document from URL
41
+ const result = await client.parse({
42
+   url: 'https://example.com/document.pdf',
43
+ });
44
+
45
+ // Access parsed content
46
+ console.log(`Found ${result.textChunks.length} text chunks`);
47
+ console.log(`Found ${result.imageChunks.length} images`);
48
+ console.log(`Found ${result.tableChunks.length} tables`);
49
+
50
+ // Work with chunks
51
+ result.textChunks.forEach((chunk) => {
52
+   console.log(chunk.content);
53
+   console.log(chunk.keywords);
54
+   console.log(chunk.summary);
55
+ });
56
+
57
+ // Save results to disk
58
+ await result.save('./output/');
59
+ ```
60
+
61
+ ## Configuration
62
+
63
+ ### Environment Variables
64
+
65
+ ```bash
66
+ KNOWHERE_API_KEY=sk_... # Required
67
+ KNOWHERE_BASE_URL=https://api.knowhereto.ai # Optional
68
+ ```
69
+
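Application code could read the two variables above roughly as follows. This is a sketch rather than documented SDK behavior: `resolveKnowhereConfig` is a hypothetical helper, and using `https://api.knowhereto.ai` as the fallback base URL is an assumption taken from the example value shown above.

```typescript
interface KnowhereEnvConfig {
  apiKey: string;
  baseURL: string;
}

// Hypothetical helper: read the variables listed above from an env-like object.
function resolveKnowhereConfig(
  env: Record<string, string | undefined>,
): KnowhereEnvConfig {
  const apiKey = env.KNOWHERE_API_KEY;
  if (!apiKey) {
    // KNOWHERE_API_KEY is marked as required, so fail early with a clear message.
    throw new Error('KNOWHERE_API_KEY is not set');
  }
  return {
    apiKey,
    // Assumed fallback: the example URL from the README.
    baseURL: env.KNOWHERE_BASE_URL ?? 'https://api.knowhereto.ai',
  };
}
```

The returned object uses the `apiKey` and `baseURL` option names from the Client Options section, so under that assumption it could be passed as `new Knowhere(resolveKnowhereConfig(process.env))`.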
70
+ ### Client Options
71
+
72
+ ```typescript
73
+ const client = new Knowhere({
74
+   apiKey: 'sk_...', // API authentication key
75
+   baseURL: 'https://...', // API base URL
76
+   timeout: 60000, // Request timeout (ms)
77
+   uploadTimeout: 600000, // Upload timeout (ms)
78
+   maxRetries: 5, // Max retry attempts
79
+ });
80
+ ```
81
+
82
+ ## Usage Examples
83
+
84
+ ### Parse from File
85
+
86
+ ```typescript
87
+ // From file path (recommended)
88
+ const result = await client.parse({
89
+   file: './document.pdf',
90
+ });
91
+
92
+ // From Buffer (assumes: import fs from 'node:fs')
93
+ const buffer = await fs.promises.readFile('./document.pdf');
94
+ const bufferResult = await client.parse({
95
+   file: buffer,
96
+   fileName: 'document.pdf',
97
+ });
98
+
99
+ // From Stream
100
+ const stream = fs.createReadStream('./document.pdf');
101
+ const streamResult = await client.parse({
102
+   file: stream,
103
+   fileName: 'document.pdf',
104
+ });
105
+ ```
106
+
107
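+ `fileName` is inferred automatically when `file` is a local file path; it must be provided explicitly when `file` is a `Buffer`, a `Uint8Array`, or a stream that carries no path information.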
108
+
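The file-name rule above can be illustrated with a small helper. This is a sketch of the described behavior, not the SDK's own code: `resolveFileName` and `FileInput` are hypothetical names, and reading a string `path` property from the stream is an assumption based on how `fs.createReadStream()` exposes its source path.

```typescript
import { basename } from 'node:path';

// Hypothetical input type: a path, raw bytes (Buffer extends Uint8Array), or a stream.
type FileInput = string | Uint8Array | NodeJS.ReadableStream;

// Hypothetical helper: decide which file name to send with an upload.
function resolveFileName(file: FileInput, fileName?: string): string {
  // An explicit name always wins.
  if (fileName) return fileName;
  // A local path carries its own name.
  if (typeof file === 'string') return basename(file);
  // Streams from fs.createReadStream() expose the source path as `path`.
  const streamPath: unknown = Reflect.get(file, 'path');
  if (typeof streamPath === 'string') return basename(streamPath);
  // Buffers, Uint8Arrays, and path-less streams have nothing to infer from.
  throw new Error('fileName is required when it cannot be inferred from file');
}
```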
109
+ ### Advanced Options
110
+
111
+ ```typescript
112
+ const result = await client.parse({
113
+   url: 'https://example.com/doc.pdf',
114
+   model: 'advanced', // 'base' | 'advanced'
115
+   ocr: true, // Enable OCR
116
+   docType: 'pdf', // Document type hint
117
+   smartTitleParse: true, // Smart title detection
118
+   summaryImage: true, // Generate image summaries
119
+   summaryTable: true, // Generate table summaries
120
+   summaryText: true, // Generate text summaries
121
+   addFragDesc: 'Custom context', // Additional fragment description
122
+   kbDir: 'project_docs', // Knowledge base directory
123
+   pollInterval: 10000, // Polling interval (ms)
124
+   pollTimeout: 1800000, // Max wait time (ms)
125
+   verifyChecksum: true, // Verify ZIP checksum (default: true)
126
+   webhook: {
127
+     // Webhook for completion
128
+     url: 'https://...',
129
+   },
130
+   onUploadProgress: (progress) => {
131
+     console.log(`Upload: ${progress.percent}%`);
132
+   },
133
+   onPollProgress: (status) => {
134
+     console.log(`Status: ${status.status}`);
135
+   },
136
+ });
137
+ ```
138
+
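`pollInterval` and `pollTimeout` above bound how often and how long the client checks a job. The loop below sketches plain fixed-interval polling with a deadline, to show how the two values interact. It is not the SDK's adaptive polling, and `pollUntilDone`, `PollStatus`, and the `isDone` callback are hypothetical names introduced for this example.

```typescript
// Hypothetical status shape; only the `status` field is used here.
type PollStatus = { status: string };

// Hypothetical helper: call `fetchStatus` every `pollInterval` ms until
// `isDone` accepts the result or `pollTimeout` ms have elapsed.
async function pollUntilDone(
  fetchStatus: () => Promise<PollStatus>,
  isDone: (current: PollStatus) => boolean,
  pollInterval = 10000,
  pollTimeout = 1800000,
): Promise<PollStatus> {
  const deadline = Date.now() + pollTimeout;
  for (;;) {
    const current = await fetchStatus();
    if (isDone(current)) return current;
    // Stop if the next check would land after the deadline.
    if (Date.now() + pollInterval > deadline) {
      throw new Error(`Polling timed out after ${pollTimeout} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollInterval));
  }
}
```

An adaptive variant would change `pollInterval` between iterations, for example lengthening it while the job stays in the same state.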
139
+ ### Low-Level API
140
+
141
+ For granular control over the job lifecycle:
142
+
143
+ ```typescript
144
+ // 1. Create job
145
+ const job = await client.jobs.create({
146
+   sourceType: 'file',
147
+   fileName: 'document.pdf',
148
+   parsingParams: { model: 'advanced', ocrEnabled: true },
149
+ });
150
+
151
+ // 2. Upload file
152
+ await client.jobs.upload(job, {
153
+   file: './document.pdf',
154
+   onProgress: ({ percent }) => console.log(`${percent}%`),
155
+ });
156
+
157
+ // 3. Wait for completion
158
+ const jobResult = await client.jobs.wait(job.jobId, {
159
+   pollInterval: 10000,
160
+ });
161
+
162
+ // 4. Load results
163
+ const result = await client.jobs.load(jobResult);
164
+ ```
165
+
166
+ ### Error Handling
167
+
168
+ ```typescript
169
+ import {
170
+   BadRequestError,
170
+   AuthenticationError,
171
+   RateLimitError,
172
+   PollingTimeoutError,
173
+   JobFailedError,
174
+   ValidationError,
175
+   InvalidStateError,
177
+ } from '@ontos-ai/knowhere-sdk';
178
+
179
+ try {
180
+   const result = await client.parse({ url: '...' });
181
+ } catch (error) {
182
+   if (error instanceof ValidationError) {
183
+     console.error('Invalid parameters:', error.message);
184
+   } else if (error instanceof RateLimitError) {
185
+     // Wait for the suggested delay before retrying
186
+     await new Promise((resolve) => setTimeout(resolve, error.retryAfter * 1000));
187
+   } else if (error instanceof AuthenticationError) {
188
+     console.error('Invalid API key');
189
+   } else if (error instanceof PollingTimeoutError) {
190
+     console.error('Processing timeout');
191
+   } else if (error instanceof JobFailedError) {
192
+     console.error('Job failed:', error.jobResult.error);
193
+   } else if (error instanceof InvalidStateError) {
194
+     console.error('Invalid state:', error.message);
195
+   }
196
+ }
197
+ ```
198
+
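The rate-limit branch above waits but does not show the retry itself. One way to wrap a call so that it is actually retried is sketched below. `retryAfterMs` and `retryOnRateLimit` are hypothetical helpers, and they detect rate limiting by duck-typing a numeric `retryAfter` property (in seconds, as the multiplication by 1000 above implies) instead of importing `RateLimitError`, so that the block stays self-contained.

```typescript
// Hypothetical helper: wait time in ms if `error` carries a numeric
// `retryAfter` (assumed to be in seconds), otherwise undefined.
function retryAfterMs(error: unknown): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined;
  const seconds: unknown = Reflect.get(error, 'retryAfter');
  return typeof seconds === 'number' && seconds >= 0 ? seconds * 1000 : undefined;
}

// Hypothetical helper: run `task`, and when it fails with a rate-limit-style
// error, wait for the suggested delay and try again, up to `maxAttempts` runs.
async function retryOnRateLimit<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const waitMs = retryAfterMs(error);
      // Rethrow errors that are not rate limits, or once attempts run out.
      if (waitMs === undefined || attempt >= maxAttempts) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
  }
}
```

With the SDK imported, the call in the `try` block could then be written as `await retryOnRateLimit(() => client.parse({ url: '...' }))`, and the duck-typed check could be replaced by `error instanceof RateLimitError`.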
199
+ ## Documentation
200
+
201
+ For complete documentation, visit [https://docs.knowhereto.ai](https://docs.knowhereto.ai)
202
+
203
+ ## Examples
204
+
205
+ Check out the [examples](./examples) directory for more usage examples:
206
+
207
+ - [Basic Usage](./examples/basic.ts)
208
+ - [File Upload](./examples/file-upload.ts)
209
+ - [Error Handling](./examples/error-handling.ts)
210
+ - [Low-Level API](./examples/low-level.ts)
211
+
212
+ ## Development
213
+
214
+ ```bash
215
+ # Install dependencies
216
+ npm install
217
+
218
+ # Run tests
219
+ npm test
220
+
221
+ # Run tests with coverage
222
+ npm run test:ci
223
+
224
+ # Lint code
225
+ npm run lint
226
+
227
+ # Format code
228
+ npm run format
229
+
230
+ # Type check
231
+ npm run typecheck
232
+
233
+ # Build
234
+ npm run build
235
+ ```
236
+
237
+ ## Release Workflow
238
+
239
+ See [docs/release-workflow.md](./docs/release-workflow.md) for the
240
+ Changesets-based stable and beta release process.
241
+
242
+ ## License
243
+
244
+ [MIT](./LICENSE)
245
+
246
+ ## Support
247
+
248
+ - 📧 Email: team@knowhereto.ai
249
+ - 🐛 Issues: [GitHub Issues](https://github.com/Ontos-AI/knowhere-node-sdk/issues)
250
+ - 📚 Documentation: [https://docs.knowhereto.ai](https://docs.knowhereto.ai)
251
+
252
+ ## Changelog
253
+
254
+ See [CHANGELOG.md](./CHANGELOG.md) for release history.