vecbox 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Embed Kit
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,377 @@
+ # vecbox v0.1.0
+
+ ![vecbox](./src/images/vecbox.png)
+ [![npm version](https://img.shields.io/npm/v/vecbox.svg)](https://www.npmjs.com/package/vecbox)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+
+ ## Why vecbox?
+
+ **One API, multiple providers.** Switch between OpenAI, Gemini, or run locally with Llama.cpp without changing code.
+ ```typescript
+ // Works with any provider
+ const result = await autoEmbed({ text: 'Hello, world!' });
+ console.log(result.embedding); // [0.1, 0.2, ...]
+ ```
+
+ ## Installation
+ ```bash
+ npm install vecbox
+ ```
+
+ ## Quick Start
+
+ ### Auto-detect (Recommended)
+ ```typescript
+ import { autoEmbed } from 'vecbox';
+
+ const result = await autoEmbed({ text: 'Your text' });
+ // Automatically uses: Llama.cpp (local) → OpenAI → Gemini → ...
+ ```
+
+ ### Specific Provider
+ ```typescript
+ import { embed } from 'vecbox';
+
+ const result = await embed(
+   { provider: 'openai', apiKey: process.env.OPENAI_API_KEY },
+   { text: 'Your text' }
+ );
+ ```
+
+
+ ## Providers
+
+ <details>
+ <summary><b>OpenAI</b></summary>
+
+ ```typescript
+ await embed(
+   {
+     provider: 'openai',
+     model: 'text-embedding-3-small', // or text-embedding-3-large
+     apiKey: process.env.OPENAI_API_KEY
+   },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:** Get API key at [platform.openai.com](https://platform.openai.com)
+
+ </details>
+
+ <details>
+ <summary><b>Google Gemini</b></summary>
+
+ ```typescript
+ await embed(
+   {
+     provider: 'gemini',
+     model: 'gemini-embedding-001',
+     apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY
+   },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:** Get API key at [aistudio.google.com](https://aistudio.google.com)
+
+ </details>
+
+ <details>
+ <summary><b>Llama.cpp (Local)</b></summary>
+
+ ```typescript
+ await embed(
+   { provider: 'llamacpp', model: 'nomic-embed-text-v1.5.Q4_K_M.gguf' },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:**
+ ```bash
+ # 1. Install
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp && make llama-server
+
+ # 2. Download model
+ wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf
+
+ # 3. Run server
+ ./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --embedding --port 8080
+ ```
+
+ </details>
+
+ <details>
+ <summary><b>Anthropic Claude</b></summary>
+
+ ```typescript
+ await embed(
+   {
+     provider: 'claude',
+     model: 'claude-3-sonnet-20240229',
+     apiKey: process.env.ANTHROPIC_API_KEY
+   },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:** Get API key at [console.anthropic.com](https://console.anthropic.com)
+
+ </details>
+
+ <details>
+ <summary><b>Mistral</b></summary>
+
+ ```typescript
+ await embed(
+   {
+     provider: 'mistral',
+     model: 'mistral-embed',
+     apiKey: process.env.MISTRAL_API_KEY
+   },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:** Get API key at [mistral.ai](https://mistral.ai)
+
+ </details>
+
+ <details>
+ <summary><b>DeepSeek</b></summary>
+
+ ```typescript
+ await embed(
+   {
+     provider: 'deepseek',
+     model: 'deepseek-chat',
+     apiKey: process.env.DEEPSEEK_API_KEY
+   },
+   { text: 'Your text' }
+ );
+ ```
+
+ **Setup:** Get API key at [platform.deepseek.com](https://platform.deepseek.com)
+
+ </details>
+
+ ## Common Use Cases
+
+ ### Semantic Search
+ ```typescript
+ // Helper function for cosine similarity
+ function cosineSimilarity(vecA: number[], vecB: number[]): number {
+   const dotProduct = vecA.reduce((sum, val, i) => sum + val * vecB[i], 0);
+   const magnitudeA = Math.sqrt(vecA.reduce((sum, val) => sum + val * val, 0));
+   const magnitudeB = Math.sqrt(vecB.reduce((sum, val) => sum + val * val, 0));
+   return dotProduct / (magnitudeA * magnitudeB);
+ }
+
+ const documents = ['Intro to machine learning', 'A guide to home cooking']; // your corpus
+ const query = await autoEmbed({ text: 'machine learning' });
+ const docs = await Promise.all(
+   documents.map(doc => autoEmbed({ text: doc }))
+ );
+
+ // Find most similar
+ const scores = docs.map(doc =>
+   cosineSimilarity(query.embedding, doc.embedding)
+ );
+ const mostSimilar = scores.indexOf(Math.max(...scores));
+ console.log(`Best match: ${documents[mostSimilar]}`);
+ ```
+
+ ### Text Similarity
+ ```typescript
+ function cosineSimilarity(vecA: number[], vecB: number[]): number {
+   const dotProduct = vecA.reduce((sum, val, i) => sum + val * vecB[i], 0);
+   const magnitudeA = Math.sqrt(vecA.reduce((sum, val) => sum + val * val, 0));
+   const magnitudeB = Math.sqrt(vecB.reduce((sum, val) => sum + val * val, 0));
+   return dotProduct / (magnitudeA * magnitudeB);
+ }
+
+ const [emb1, emb2] = await Promise.all([
+   autoEmbed({ text: 'cat sleeping' }),
+   autoEmbed({ text: 'cat napping' })
+ ]);
+
+ const similarity = cosineSimilarity(emb1.embedding, emb2.embedding);
+ console.log(`Similarity: ${similarity.toFixed(3)}`); // → 0.95 (very similar)
+ ```
194
+
195
+ ### Batch Processing
196
+ ```typescript
197
+ const results = await embed(
198
+ { provider: 'openai', apiKey: 'key' },
199
+ [
200
+ { text: 'Text 1' },
201
+ { text: 'Text 2' },
202
+ { filePath: './doc.txt' }
203
+ ]
204
+ );
205
+ // → { embeddings: [[...], [...], [...]], dimensions: 1536 }
206
+
207
+ console.log(`Processed ${results.embeddings.length} texts`);
208
+ console.log(`Dimensions: ${results.dimensions}`);
209
+ ```
+
+ ### File Processing
+ ```typescript
+ import { readdir } from 'fs/promises';
+ import { join } from 'path';
+ import { embed } from 'vecbox';
+
+ async function embedAllFiles(dirPath: string) {
+   const files = await readdir(dirPath);
+   const textFiles = files.filter(file => file.endsWith('.txt'));
+
+   const inputs = textFiles.map(file => ({
+     filePath: join(dirPath, file)
+   }));
+
+   const results = await embed(
+     { provider: 'llamacpp' },
+     inputs
+   );
+
+   return textFiles.map((file, index) => ({
+     file,
+     embedding: results.embeddings[index]
+   }));
+ }
+
+ const embeddings = await embedAllFiles('./documents');
+ console.log(`Processed ${embeddings.length} files`);
+ ```
+
+ ## API
+
+ ### `autoEmbed(input)`
+
+ Auto-detects the best available provider, in priority order:
+ 1. **Llama.cpp** (local & free)
+ 2. **OpenAI** (if API key available)
+ 3. **Gemini** (if API key available)
+ 4. **Claude** (if API key available)
+ 5. **Mistral** (if API key available)
+ 6. **DeepSeek** (if API key available)
+
+ ```typescript
+ await autoEmbed({ text: string } | { filePath: string })
+ ```
+
+ ### `embed(config, input)`
+
+ Explicit provider selection.
+ ```typescript
+ await embed(
+   { provider, model?, apiKey?, baseUrl?, timeout?, maxRetries? },
+   { text: string } | { filePath: string } | Array
+ )
+ ```
+
+ **Returns:**
+ ```typescript
+ {
+   embedding: number[];
+   dimensions: number;
+   provider: string;
+   model: string;
+   usage?: {
+     promptTokens?: number;
+     totalTokens?: number;
+   };
+ }
+ ```
+
+ ### `getSupportedProviders()`
+
+ Returns the list of available providers.
+ ```typescript
+ import { getSupportedProviders } from 'vecbox';
+
+ const providers = getSupportedProviders();
+ // → ['openai', 'gemini', 'claude', 'mistral', 'deepseek', 'llamacpp']
+ ```
+
+ ### `createProvider(config)`
+
+ Create a provider instance for advanced usage.
+ ```typescript
+ import { createProvider } from 'vecbox';
+
+ const provider = createProvider({
+   provider: 'openai',
+   model: 'text-embedding-3-small',
+   apiKey: 'your-key'
+ });
+
+ const isReady = await provider.isReady();
+ if (isReady) {
+   const result = await provider.embed({ text: 'Hello' });
+ }
+ ```
+
+ ## Environment Variables
+ ```bash
+ # .env file
+ OPENAI_API_KEY=sk-...
+ GOOGLE_GENERATIVE_AI_API_KEY=...
+ ANTHROPIC_API_KEY=sk-ant-...
+ MISTRAL_API_KEY=...
+ DEEPSEEK_API_KEY=...
+ ```
+
+ ## Error Handling
+
+ ```typescript
+ import { autoEmbed } from 'vecbox';
+
+ try {
+   const result = await autoEmbed({ text: 'Hello' });
+   console.log(result.embedding);
+ } catch (error) {
+   const message = error instanceof Error ? error.message : String(error);
+   if (message.includes('API key')) {
+     console.error('Please set up your API keys in .env');
+   } else if (message.includes('not ready')) {
+     console.error('Provider is not available');
+   } else if (message.includes('network')) {
+     console.error('Network connection failed');
+   } else {
+     console.error('Embedding failed:', message);
+   }
+ }
+ ```
+
+ ## TypeScript Support
+
+ Full TypeScript support with type definitions:
+ ```typescript
+ import {
+   autoEmbed,
+   embed,
+   getSupportedProviders,
+   createProvider,
+   type EmbedConfig,
+   type EmbedInput,
+   type EmbedResult
+ } from 'vecbox';
+
+ // Full type safety
+ const config: EmbedConfig = {
+   provider: 'openai',
+   model: 'text-embedding-3-small'
+ };
+
+ const input: EmbedInput = {
+   text: 'Your text here'
+ };
+
+ const result: EmbedResult = await embed(config, input);
+ ```
+
+ ## License
+
+ MIT © Embedbox Team
+
+ ## Links
+
+ - [npm](https://www.npmjs.com/package/vecbox)
+ - [GitHub](https://github.com/embedbox/embedbox)
+ - [Documentation](https://embedbox.dev)
+
+ ---
+
+ **vecbox v0.1.0** - One API, multiple providers. Simple embeddings.