vectra-js 0.9.6 → 0.9.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.github/FUNDING.yml +4 -0
- package/.github/workflows/npm-publish.yml +3 -4
- package/README.md +392 -537
- package/RELEASE_NOTES.md +15 -0
- package/docs/assets/vectraArch.png +0 -0
- package/examples/chromadb.js +96 -0
- package/examples/pg-prisma.js +119 -0
- package/examples/postgress.js +115 -0
- package/package.json +4 -3
- package/src/backends/gemini.js +15 -8
- package/src/backends/openrouter.js +2 -2
- package/src/backends/postgres_store.js +191 -0
- package/src/config.js +1 -1
- package/src/core.js +174 -130
- package/src/observability.js +0 -6
- package/src/processor.js +32 -2
- package/src/webconfig_server.js +1 -1
package/README.md
CHANGED
@@ -1,656 +1,511 @@

# Vectra (Node.js)

**Vectra** is a **production-grade, provider-agnostic Node.js SDK** for building **end-to-end Retrieval-Augmented Generation (RAG)** systems. It is designed for teams that need **flexibility, extensibility, correctness, and observability** across embeddings, vector databases, retrieval strategies, and LLM providers—without locking into a single vendor.




[](https://sonarcloud.io/summary/new_code?id=iamabhishek-n_vectra-js)

If you find this project useful, consider supporting it:<br>
[](https://github.com/iamabhishek-n/vectra-js/stargazers)
[](https://github.com/sponsors/iamabhishek-n)
[](https://www.buymeacoffee.com/iamabhishekn)

## Table of Contents

* [1. Overview](#1-overview)
* [2. Design Goals & Philosophy](#2-design-goals--philosophy)
* [3. Feature Matrix](#3-feature-matrix)
* [4. Installation](#4-installation)
* [5. Quick Start](#5-quick-start)
* [6. Core Concepts](#6-core-concepts)
  * [Providers](#providers)
  * [Vector Stores](#vector-stores)
  * [Chunking](#chunking)
  * [Retrieval](#retrieval)
  * [Reranking](#reranking)
  * [Metadata Enrichment](#metadata-enrichment)
  * [Query Planning & Grounding](#query-planning--grounding)
  * [Conversation Memory](#conversation-memory)
* [7. Configuration Reference (Usage‑Driven)](#7-configuration-reference-usage-driven)
* [8. Ingestion Pipeline](#8-ingestion-pipeline)
* [9. Querying & Streaming](#9-querying--streaming)
* [10. Conversation Memory](#10-conversation-memory)
* [11. Evaluation & Quality Measurement](#11-evaluation--quality-measurement)
* [12. CLI](#12-cli)
  * [Ingest & Query](#ingest--query)
  * [WebConfig (Config Generator UI)](#webconfig-config-generator-ui)
  * [Observability Dashboard](#observability-dashboard)
* [13. Observability & Callbacks](#13-observability--callbacks)
* [14. Database Schemas & Indexing](#14-database-schemas--indexing)
* [15. Extending Vectra](#15-extending-vectra)
* [16. Architecture Overview](#16-architecture-overview)
* [17. Development & Contribution Guide](#17-development--contribution-guide)
* [18. Production Best Practices](#18-production-best-practices)

---

## 1. Overview

Vectra provides a **fully modular RAG pipeline**:

```
Load → Chunk → Embed → Store → Retrieve → Rerank → Plan → Ground → Generate → Stream
```

<p align="center">
  <img src="https://vectra.thenxtgenagents.com/vectraArch.png" alt="Vectra SDK Architecture" width="900">
</p>

<p align="center">
  <em>Vectra SDK – End-to-End RAG Architecture</em>
</p>

Every stage is **explicitly configurable**, validated at runtime, and observable.

### Key Characteristics

* Provider‑agnostic LLM & embedding layer
* Multiple vector backends (Postgres, Chroma, Qdrant, Milvus)
* Advanced retrieval strategies (HyDE, Multi‑Query, Hybrid RRF, MMR)
* Unified streaming interface
* Built‑in evaluation & observability
* CLI + SDK parity

---

## 2. Design Goals & Philosophy

### Explicitness over Magic

Vectra avoids hidden defaults. Chunking, retrieval, grounding, memory, and generation behavior are always explicit.

### Production‑First

Index helpers, rate limiting, embedding cache, observability, and evaluation are first‑class features.

### Provider Neutrality

Swapping OpenAI → Gemini → Anthropic → Ollama requires **no application code changes**.

### Extensibility

Every major subsystem (providers, vector stores, callbacks) is interface‑driven.

---

## 3. Feature Matrix

### Providers

* **Embeddings**: OpenAI, Gemini, Ollama, HuggingFace
* **Generation**: OpenAI, Gemini, Anthropic, Ollama, OpenRouter, HuggingFace
* **Streaming**: Unified async generator

### Vector Stores

* PostgreSQL (Prisma + pgvector)
* PostgreSQL (native `pg` driver)
* ChromaDB
* Qdrant
* Milvus

### Retrieval Strategies

* Naive cosine similarity
* HyDE (Hypothetical Document Embeddings)
* Multi‑Query expansion
* Hybrid semantic + lexical (RRF)
* MMR diversification

---

## 4. Installation

### Library

```bash
npm install vectra-js @prisma/client
# or
pnpm add vectra-js @prisma/client
```

Optional backends:

```bash
npm install chromadb
```

### CLI

```bash
npm i -g vectra-js
# or
pnpm add -g vectra-js
```

---

## 5. Quick Start

```js
const { VectraClient, ProviderType } = require('vectra-js');
const { PrismaClient } = require('@prisma/client');

// Prisma client backing the vector store (see the schema in section 14)
const prisma = new PrismaClient();

const client = new VectraClient({
  embedding: {
    provider: ProviderType.OPENAI,
    apiKey: process.env.OPENAI_API_KEY,
    modelName: 'text-embedding-3-small'
  },
  llm: {
    provider: ProviderType.GEMINI,
    apiKey: process.env.GOOGLE_API_KEY,
    modelName: 'gemini-1.5-pro-latest'
  },
  database: {
    type: 'prisma',
    clientInstance: prisma,
    tableName: 'Document'
  }
});

await client.ingestDocuments('./docs');
const res = await client.queryRAG('What is the vacation policy?');
console.log(res.answer);
```

---

## 6. Core Concepts

### Providers

Providers implement embeddings, generation, or both. Vectra normalizes outputs and streaming across providers.

### Vector Stores

Vector stores persist embeddings and metadata. They are fully swappable via config.

### Chunking

* **Recursive**: Character‑aware, separator‑aware splitting
* **Agentic**: LLM‑driven semantic propositions (best for policies, legal docs)

### Retrieval

Controls recall vs precision using multiple strategies.

### Reranking

Optional LLM‑based reordering of retrieved chunks.

### Metadata Enrichment

Optional per‑chunk summaries, keywords, and hypothetical questions generated at ingestion time.
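
Enrichment is toggled through the `metadata` config block; a minimal sketch:

```js
// Generates summary, keywords, and hypothetical_questions for each chunk at ingestion time
metadata: { enrichment: true }
```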

### Query Planning & Grounding

Controls how context is assembled and how strictly answers must be grounded in retrieved text.
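
Both behaviors are configured explicitly; a sketch of the two config blocks:

```js
// Token budget, summary preference, and citation inclusion for context assembly
queryPlanning: { tokenBudget: 2048, preferSummariesBelow: 1024, includeCitations: true },
// Extractive snippet grounding; strict: true restricts answers to grounded snippets
grounding: { enabled: true, strict: false, maxSnippets: 4 }
```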

### Conversation Memory

Persist multi‑turn chat history across sessions.

---

## 7. Configuration Reference (Usage‑Driven)

> All configuration is validated using **Zod** at runtime.

### Embedding

```js
embedding: {
  provider: ProviderType.OPENAI,
  apiKey: process.env.OPENAI_API_KEY,
  modelName: 'text-embedding-3-small',
  dimensions: 1536
}
```

Use `dimensions` when using pgvector to avoid runtime mismatches.

---

### LLM

```js
llm: {
  provider: ProviderType.GEMINI,
  apiKey: process.env.GOOGLE_API_KEY,
  modelName: 'gemini-1.5-pro-latest',
  temperature: 0.3,
  maxTokens: 1024
}
```

Used for:

* Answer generation
* HyDE & Multi‑Query
* Agentic chunking
* Reranking
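
Any supported provider can fill this role; two sketches, OpenRouter (optional attribution headers via `defaultHeaders`) and Ollama (local, offline development; no API key, default `baseUrl` is `http://localhost:11434`):

```js
llm: {
  provider: ProviderType.OPENROUTER,
  apiKey: process.env.OPENROUTER_API_KEY,
  modelName: 'openai/gpt-4o',
  defaultHeaders: { 'HTTP-Referer': 'https://your.app', 'X-Title': 'Your App' }
}
```

```js
// Ollama runs locally, so apiKey is not required
llm: { provider: ProviderType.OLLAMA, modelName: 'llama3' }
```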

---

### Database

```js
database: {
  type: 'prisma',
  clientInstance: prisma,
  tableName: 'Document'
}
```

Supports Prisma, Chroma, Qdrant, Milvus.
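
For Postgres backends, an optional `columnMap` maps SDK fields onto your existing column names; a sketch:

```js
database: {
  type: 'prisma',
  clientInstance: prisma,
  tableName: 'Document',
  columnMap: {           // map SDK fields to your DB columns
    content: 'text',     // text column
    vector: 'embedding', // pgvector column
    metadata: 'meta'     // JSON metadata column
  }
}
```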

---

### Chunking

```js
chunking: {
  strategy: ChunkingStrategy.RECURSIVE,
  chunkSize: 1000,
  chunkOverlap: 200
}
```

Agentic chunking:

```js
chunking: {
  strategy: ChunkingStrategy.AGENTIC,
  agenticLlm: {
    provider: ProviderType.OPENAI,
    apiKey: process.env.OPENAI_API_KEY,
    modelName: 'gpt-4o-mini'
  }
}
```

---

### Retrieval

```js
retrieval: { strategy: RetrievalStrategy.HYBRID }
```

HYBRID is recommended for production.
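
HyDE and Multi‑Query rewrite the query with an LLM and therefore take a `retrieval.llmConfig`; MMR exposes `mmrLambda` (relevance/diversity tradeoff, default 0.5) and `mmrFetchK` (candidate pool size, default 20). A sketch of a Multi‑Query setup:

```js
retrieval: {
  strategy: RetrievalStrategy.MULTI_QUERY,
  // LLM used only for query rewriting
  llmConfig: {
    provider: ProviderType.OPENAI,
    apiKey: process.env.OPENAI_API_KEY,
    modelName: 'gpt-4o-mini'
  }
}
```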

---

### Reranking

```js
reranking: {
  enabled: true,
  windowSize: 20,
  topN: 5
}
```
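
The reranker considers `windowSize` candidates and keeps the top `topN`. A dedicated reranking LLM can be supplied through `llmConfig`; a sketch:

```js
reranking: {
  enabled: true,
  windowSize: 20, // docs considered before reranking
  topN: 5,        // docs kept after reranking
  llmConfig: {
    provider: ProviderType.ANTHROPIC,
    apiKey: process.env.ANTHROPIC_API_KEY,
    modelName: 'claude-3-haiku'
  }
}
```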

---

### Memory

```js
memory: { enabled: true, type: 'in-memory', maxMessages: 20 }
```

Redis and Postgres are supported.
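
Both persistent backends take an instantiated client; sketches of the two shapes:

```js
// Redis-backed memory (keyPrefix defaults to 'vectra:chat:')
memory: {
  enabled: true,
  type: 'redis',
  redis: { clientInstance: redis, keyPrefix: 'vectra:chat:' },
  maxMessages: 20
}
```

```js
// Postgres-backed memory (tableName defaults to 'ChatMessage')
memory: {
  enabled: true,
  type: 'postgres',
  postgres: {
    clientInstance: prisma,
    tableName: 'ChatMessage',
    columnMap: { sessionId: 'sessionId', role: 'role', content: 'content', createdAt: 'createdAt' }
  },
  maxMessages: 20
}
```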

---

### Observability

```js
observability: {
  enabled: true,
  sqlitePath: 'vectra-observability.db'
}
```
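
Additional options cover multi-project setups and what gets recorded; a fuller sketch:

```js
observability: {
  enabled: true,
  sqlitePath: 'vectra-observability.db', // SQLite file backing the dashboard
  projectId: 'my-project',               // multi-project support (default: 'default')
  trackMetrics: true,                    // latency and other metrics
  trackTraces: true,                     // detailed workflow traces
  sessionTracking: true                  // chat session history
}
```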

---

## 8. Ingestion Pipeline

```js
await client.ingestDocuments('./documents');
```

Supports files or directories.

Formats: PDF, DOCX, XLSX, TXT, Markdown
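
Document management helpers complement ingestion; a sketch of listing, updating, and deleting by id or metadata filter:

```js
// List recent documents by metadata filter
const docs = await client.listDocuments({ filter: { docTitle: 'Employee Handbook' }, limit: 50 });

// Update content/metadata in place
await client.updateDocuments([
  { id: docs[0].id, content: 'Updated content', metadata: { docTitle: 'Employee Handbook' } }
]);

// Delete by ids, or by filter
await client.deleteDocuments({ ids: docs.map(d => d.id) });
await client.deleteDocuments({ filter: { absolutePath: '/abs/path/to/file.pdf' } });
```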

---

## 9. Querying & Streaming

```js
const res = await client.queryRAG('Refund policy?');
```

Streaming:

```js
const stream = await client.queryRAG('Draft email', null, true);
for await (const chunk of stream) process.stdout.write(chunk.delta || '');
```
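
Non‑streaming calls resolve to `{ answer, sources }`; stream chunks have the shape `{ delta, finish_reason, usage }`. A metadata filter can be passed as the second argument; a sketch:

```js
// Restrict retrieval to chunks matching a metadata filter
const scoped = await client.queryRAG('Vacation policy', { docTitle: 'Employee Handbook' });
console.log(scoped.sources); // metadata of the retrieved chunks
```

For structured answers, the `generation` block enables JSON output and inline citations:

```js
generation: { outputFormat: 'json', structuredOutput: 'citations' } // answer is a JSON object (string on fallback)
```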

---

## 10. Conversation Memory

Pass a `sessionId` to maintain context across turns.

---

## 11. Evaluation & Quality Measurement

```js
await client.evaluate([{ question: 'Capital of France?', expectedGroundTruth: 'Paris' }]);
```

Metrics:

* Faithfulness
* Relevance

---

## 12. CLI

### Ingest & Query

```bash
vectra ingest ./docs --config=./config.json
vectra query "What is our leave policy?" --config=./config.json --stream
```
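
If `vectra-js` is installed in the current project, the CLI can also run without a global install:

```bash
# Uses the local project bin if vectra-js is installed
npx vectra ingest ./docs --config=./config.json
```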

---

### WebConfig (Config Generator UI)

```bash
vectra webconfig
```

**WebConfig** launches a local web UI that:

* Guides you through building a valid `vectra.config.json`
* Validates all options interactively
* Prevents misconfiguration

This is ideal for:

* First‑time setup
* Non‑backend users
* Sharing configs across teams

---

### Observability Dashboard

```bash
vectra dashboard
```

The **Observability Dashboard** is a local web UI backed by SQLite that visualizes:

* Ingestion latency
* Query latency
* Retrieval & generation traces
* Chat sessions

It helps you:

* Debug RAG quality issues
* Understand latency bottlenecks
* Monitor production‑like workloads

---

## 13. Observability & Callbacks

### Observability

Tracks metrics, traces, and sessions automatically when enabled.

### Callbacks

Lifecycle hooks:

* Ingestion
* Chunking
* Embedding
* Retrieval
* Reranking
* Generation
* Errors
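
Handlers go in the `callbacks` config array; `LoggingCallbackHandler` and `StructuredLoggingCallbackHandler` ship with the SDK, and the hook names follow an `on<Stage>Start`/`on<Stage>End` pattern (`onIngestStart`, `onIngestEnd`, `onIngestSummary`, `onChunkingStart`, `onEmbeddingStart`, `onRetrievalStart`, `onRetrievalEnd`, `onRerankingStart`, `onRerankingEnd`, `onGenerationStart`, `onGenerationEnd`, `onError`). A sketch, assuming the handler is exported from the package root:

```js
const { StructuredLoggingCallbackHandler } = require('vectra-js'); // assumed export path

const config = {
  // ...
  callbacks: [new StructuredLoggingCallbackHandler()]
};
```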

---

## 14. Database Schemas & Indexing

```prisma
model Document {
  id        String   @id @default(uuid())
  content   String
  metadata  Json
  vector    Unsupported("vector")?
  createdAt DateTime @default(now())
}
```
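
For Postgres + Prisma, the store exposes an index helper that creates the ivfflat vector index, the GIN full‑text index, and an optional `tsvector` trigger; a sketch of calling it after ingestion:

```js
// Postgres + Prisma only: create ivfflat and GIN FTS indexes
if (client.vectorStore.ensureIndexes) {
  await client.vectorStore.ensureIndexes();
}
```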

---

## 15. Extending Vectra

### Custom Vector Store

```js
const { VectorStore } = require('vectra-js/interfaces');

class MyStore extends VectorStore {
  async addDocuments() { /* persist chunks + embeddings */ }
  async similaritySearch() { /* return matching documents */ }
}
```
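
Custom LLM providers follow the same pattern: implement `embedDocuments`, `embedQuery`, `generate`, and `generateStream`, make streaming yield `{ delta, finish_reason, usage }` chunks, and wire the provider in via `llm.provider`. A sketch; the base-class name and import path mirror the `VectorStore` interface above and are assumptions:

```js
const { LLMProvider } = require('vectra-js/interfaces'); // assumed export

class MyProvider extends LLMProvider {
  async embedDocuments(texts) { /* return one vector per text */ }
  async embedQuery(text) { /* return a single vector */ }
  async generate(prompt) { /* return a completion */ }
  async *generateStream(prompt) {
    // streaming must yield { delta, finish_reason, usage }
    yield { delta: '', finish_reason: null, usage: null };
  }
}
```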

---

## 16. Architecture Overview

* `VectraClient`: orchestrator
* Typed config schema
* Interface‑driven providers & stores
* Unified streaming abstraction

---

## 17. Development & Contribution Guide

* Node.js 18+
* pnpm recommended
* Install: `pnpm install`
* Lint: `pnpm run lint`

---

## 18. Production Best Practices

* Match embedding dimensions to pgvector
* Prefer HYBRID retrieval
* Enable observability in staging
* Evaluate before changing chunk sizes

---

**Vectra scales cleanly from local prototypes to production‑grade RAG platforms.**
|