@ambicuity/kindx 0.1.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,12 +1,60 @@
1
- # KINDX -- On-Device Document Intelligence Engine
1
+ ```
2
+ ██╗ ██╗██╗███╗ ██╗██████╗ ██╗ ██╗
3
+ ██║ ██╔╝██║████╗ ██║██╔══██╗╚██╗██╔╝
4
+ █████╔╝ ██║██╔██╗ ██║██║ ██║ ╚███╔╝
5
+ ██╔═██╗ ██║██║╚██╗██║██║ ██║ ██╔██╗
6
+ ██║ ██╗██║██║ ╚████║██████╔╝██╔╝ ██╗
7
+ ╚═╝ ╚═╝╚═╝╚═╝ ╚═══╝╚═════╝ ╚═╝ ╚═╝
8
+ ```
9
+
10
+ # KINDX — Enterprise-Grade On-Device Knowledge Infrastructure
11
+
12
+ [![MCP-Compatible](https://img.shields.io/badge/MCP-Compatible-6f42c1?style=flat-square&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMTYiIGhlaWdodD0iMTYiIHZpZXdCb3g9IjAgMCAxNiAxNiIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj48cmVjdCB3aWR0aD0iMTYiIGhlaWdodD0iMTYiIHJ4PSIyIiBmaWxsPSIjNmY0MmMxIi8+PC9zdmc+)](https://modelcontextprotocol.io)
13
+ [![Local-First](https://img.shields.io/badge/Local--First-Privacy%20Guaranteed-22c55e?style=flat-square)](https://github.com/ambicuity/KINDX)
14
+ [![Node.js](https://img.shields.io/badge/Node.js-22%2B-339933?style=flat-square&logo=node.js&logoColor=white)](https://nodejs.org)
15
+ [![TypeScript](https://img.shields.io/badge/TypeScript-Strict-3178C6?style=flat-square&logo=typescript&logoColor=white)](https://www.typescriptlang.org)
16
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](./LICENSE)
17
+ [![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/ambicuity/KINDX/badge)](https://scorecard.dev/viewer/?uri=github.com/ambicuity/KINDX)
18
+
19
+ **Knowledge Infrastructure for AI Agents.** KINDX is a high-performance, local-first backend for Agentic Context Injection — enabling AI agents to perform deterministic, privacy-preserving Contextual Retrieval over enterprise corpora without a single byte leaving the edge.
20
+
21
+ KINDX combines BM25 full-text retrieval, vector semantic retrieval, and LLM re-ranking — all running locally via `node-llama-cpp` with GGUF models. It is designed to be called by agents, not typed by humans.
22
+
23
+ > Read the progress log in the [CHANGELOG](./CHANGELOG.md).
24
+
25
+ ---
26
+
27
+ ## Why KINDX?
28
+
29
+ The local RAG ecosystem is fragmenting: LanceDB is moving to multimodal ML infrastructure, Chroma is moving to managed cloud, Orama is moving to the browser. **KINDX is the only tool that stays on the desktop and speaks the agent's native language.**
30
+
31
+ | Capability | KINDX | LanceDB | Chroma | Orama | Khoj |
32
+ |---|:---:|:---:|:---:|:---:|:---:|
33
+ | **Local-first / Air-gapped** | ✅ | ✅ | ❌ | ✅ | ✅ |
34
+ | **MCP Server (agent protocol)** | ✅ | ❌ | ❌ | ❌ | ❌ |
35
+ | **On-device GGUF inference** | ✅ | ❌ | ❌ | ❌ | Partial |
36
+ | **Hybrid BM25 + Vector + Rerank** | ✅ | Partial | Partial | ✅ | ❌ |
37
+ | **Structured agent output (JSON/CSV/XML)** | ✅ | ❌ | ❌ | ❌ | ❌ |
38
+ | **CLI-first / `child_process` invocable** | ✅ | ❌ | ❌ | ❌ | ❌ |
39
+
40
+ KINDX is the only product in this category that combines local-first privacy, first-class MCP support, on-device GGUF inference, structured pipeline output, and CLI invocability — making it the ideal Memory Node for MCP-compatible autonomous agents (Claude Code, Cursor, Continue.dev, AutoGPT, and beyond).
41
+
42
+ ---
43
+
44
+ ## The Three Pillars
2
45
 
3
- A local-first search engine for everything you need to remember. Index your markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Designed for agentic workflows.
46
+ ### 1. Deterministic Privacy
47
+ Every inference — embedding, reranking, query expansion — runs on local GGUF models via `node-llama-cpp`. Sensitive documents never leave the edge. There is no telemetry, no API call, no cloud dependency.
4
48
 
5
- KINDX combines BM25 full-text search, vector semantic search, and LLM re-ranking -- all running locally via node-llama-cpp with GGUF models.
49
+ ### 2. Agent-Native Design
50
+ KINDX is architected for `child_process` invocation from autonomous agents (AutoGPT, OpenDevin, Claude Code, LangGraph). The `--json`, `--files`, `--csv`, and `--xml` output flags produce structured payloads for agent consumption. The MCP server provides tight protocol-level integration.
6
51
 
7
- You can read more about KINDX's progress in the [CHANGELOG](./CHANGELOG.md).
52
+ ### 3. Hybrid Precision (Neural-Symbolic Retrieval)
53
+ Position-Aware Blending merges BM25 symbolic retrieval with neural vector similarity and LLM cross-encoder reranking. Reciprocal Rank Fusion (RRF, k=60) combined with a top-rank bonus keeps exact-match signals from being diluted during fusion. See the [Architecture](#architecture) section for the full pipeline specification.
54
+
55
+ ---
8
56
 
9
- ## Quick Start
57
+ ## Quick Start — Local-First Agentic Stack
10
58
 
11
59
  ```bash
12
60
  # Install globally (Node or Bun)
@@ -14,72 +62,99 @@ npm install -g @ambicuity/kindx
14
62
  # or
15
63
  bun install -g @ambicuity/kindx
16
64
 
17
- # Or run directly
65
+ # Or invoke without installing
18
66
  npx @ambicuity/kindx ...
19
67
  bunx @ambicuity/kindx ...
20
68
 
21
- # Create collections for your notes, docs, and meeting transcripts
69
71
+ # Register collections
22
72
  kindx collection add ~/notes --name notes
23
73
  kindx collection add ~/Documents/meetings --name meetings
24
74
  kindx collection add ~/work/docs --name docs
25
75
 
26
- # Add context to help with search results
27
- kindx context add kindx://notes "Personal notes and ideas"
28
- kindx context add kindx://meetings "Meeting transcripts and notes"
29
- kindx context add kindx://docs "Work documentation"
76
+ # Annotate collections with semantic context
77
+ kindx context add kindx://notes "Personal documents and ideation corpus"
78
+ kindx context add kindx://meetings "Meeting transcripts and decision records"
79
+ kindx context add kindx://docs "Engineering documentation corpus"
30
80
 
31
- # Generate embeddings for semantic search
81
+ # Build the vector index from corpus
32
82
  kindx embed
33
83
 
34
- # Search across everything
35
- kindx search "project timeline" # Fast keyword search
36
- kindx vsearch "how to deploy" # Semantic search
37
- kindx query "quarterly planning process" # Hybrid + reranking (best quality)
84
+ # Contextual Retrieval — choose retrieval mode
85
+ kindx search "project timeline" # BM25 full-text retrieval (fast)
86
+ kindx vsearch "how to deploy" # Neural vector retrieval
87
+ kindx query "quarterly planning process" # Hybrid + reranking (highest precision)
38
88
 
39
- # Get a specific document
89
+ # Neural Extraction — retrieve a specific document
40
90
  kindx get "meetings/2024-01-15.md"
41
91
 
42
- # Get a document by docid (shown in search results)
92
+ # Neural Extraction by docid (shown in retrieval results)
43
93
  kindx get "#abc123"
44
94
 
45
- # Get multiple documents by glob pattern
95
+ # Bulk Neural Extraction via glob pattern
46
96
  kindx multi-get "journals/2025-05*.md"
47
97
 
48
- # Search within a specific collection
98
+ # Scoped Contextual Retrieval within a collection
49
99
  kindx search "API" -c notes
50
100
 
51
- # Export all matches for an agent
101
+ # Corrective feedback (Phase 1)
102
+ kindx feedback --irrelevant --query "deploy k8s" --chunk "#abc123:2"
103
+ kindx feedback --relevant --query "deploy k8s" --chunk "#abc123:2"
104
+ kindx feedback list --query "deploy"
105
+
106
+ # Export full match set for agent pipeline
52
107
  kindx search "API" --all --files --min-score 0.3
108
  ```

+ > **Pro-tip (Small Collections):** For collections under ~100 documents, `kindx search` (BM25) is incredibly fast and often sufficient. The query expansion and reranking overhead of `kindx query` is best suited for larger, noisier corporate datasets.
54
111
 
55
- ### Using with AI Agents
112
+ ---
113
+
114
+ ## Agent-Native Integration
56
115
 
57
- KINDX's `--json` and `--files` output formats are designed for agentic workflows:
116
+ KINDX's primary interface is structured output for agent pipelines. Treat CLI invocations as RPC calls.
58
117
 
59
118
  ```bash
60
- # Get structured results for an LLM
119
+ # Structured JSON payload for LLM context injection
61
120
  kindx search "authentication" --json -n 10
62
121
 
63
- # List all relevant files above a threshold
122
+ # Filepath manifest above relevance threshold for agent file consumption
64
123
  kindx query "error handling" --all --files --min-score 0.4
65
124
 
66
- # Retrieve full document content
125
+ # Full document content for agent context window
67
126
  kindx get "docs/api-reference.md" --full
68
127
  ```
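
From the agent side, the invocations above reduce to building an argv and parsing stdout. A minimal Python sketch, assuming only the documented flags (`--json`, `-n`, `-c`); the JSON schema is not specified here, so the result is returned as whatever `json.loads` yields:

```python
import json
import shutil
import subprocess

def build_argv(query, n=10, collection=None):
    """Build the CLI invocation; flags mirror the README (--json, -n, -c)."""
    argv = ["kindx", "search", query, "--json", "-n", str(n)]
    if collection:
        argv += ["-c", collection]
    return argv

def kindx_search(query, n=10, collection=None):
    """Treat the CLI invocation as an RPC call and parse its JSON payload."""
    if shutil.which("kindx") is None:
        raise FileNotFoundError("kindx CLI not found on PATH")
    proc = subprocess.run(build_argv(query, n, collection),
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```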
69
128
 
70
- ### MCP Server
129
+ > **Pro-tip (Agentic Performance):** Prefer `kindx query` over `kindx search` for open-ended agent instructions. The query expansion and LLM re-ranking pipeline surfaces semantically adjacent documents that keyword retrieval misses.
130
+
131
+ > **Pro-tip (Context Window Budgeting):** Use `--min-score 0.4` with `--files` to produce a ranked manifest, then `multi-get` only the top-k assets. This two-phase pattern prevents context window overflow while preserving retrieval precision.
132
+
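The two-phase budgeting pattern can be sketched in Python. The manifest layout follows the documented `--files` format (`docid,score,filepath,context`); the sample lines below are hypothetical, not real CLI output:

```python
def parse_files_manifest(text):
    """Parse --files lines: docid,score,filepath,context (context may contain commas)."""
    rows = []
    for line in text.strip().splitlines():
        docid, score, filepath, context = line.split(",", 3)
        rows.append({"docid": docid, "score": float(score),
                     "filepath": filepath, "context": context})
    return rows

def top_k_paths(text, k):
    """Phase 1: rank the manifest. Phase 2 would `kindx multi-get` only these paths."""
    ranked = sorted(parse_files_manifest(text), key=lambda r: r["score"], reverse=True)
    return [r["filepath"] for r in ranked[:k]]

# Hypothetical output of `kindx query ... --all --files --min-score 0.4`
manifest = """\
#a1,0.91,docs/errors.md,Engineering documentation corpus
#b2,0.55,notes/retry.md,Personal documents and ideation corpus
#c3,0.47,docs/logging.md,Engineering documentation corpus"""
```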
133
+ ### Typed SDK Packages
134
+
135
+ KINDX now includes typed client packages and integration scaffolding:
136
+
137
+ - `@ambicuity/kindx-schemas` — shared Zod schemas for KINDX MCP/HTTP request and response contracts.
138
+ - `@ambicuity/kindx-client` — TypeScript client for `/query` and MCP tool calls (`get`, `multi_get`, `status`, `kindx_feedback`, and memory tools).
139
+ - `python/kindx-langchain` — installable Python retriever wrapper for LangChain-style document retrieval.
140
+ - [`reference/integrations/agent-templates.md`](reference/integrations/agent-templates.md) — tested MCP configuration templates for OpenDevin, Goose, and Claude Code.
141
+
142
+ ---
143
+
144
+ ## MCP Server
71
145
 
72
- Although the tool works perfectly fine when you just tell your agent to use it on the command line, it also exposes an MCP (Model Context Protocol) server for tighter integration.
146
+ KINDX exposes a Model Context Protocol (MCP) server for tool-call integration with any MCP-compatible agent runtime.
73
147
 
74
- Tools exposed:
75
- - `kindx_search` -- Fast BM25 keyword search (supports collection filter)
76
- - `kindx_vector_search` -- Semantic vector search (supports collection filter)
77
- - `kindx_deep_search` -- Deep search with query expansion and reranking (supports collection filter)
78
- - `kindx_get` -- Retrieve document by path or docid (with fuzzy matching suggestions)
79
- - `kindx_multi_get` -- Retrieve multiple documents by glob pattern, list, or docids
80
- - `kindx_status` -- Index health and collection info
148
+ **Registered Tools:**
149
+ - `kindx_search` — BM25 Contextual Retrieval (supports collection filter)
150
+ - `kindx_vector_search` — Neural vector Contextual Retrieval (supports collection filter)
151
+ - `kindx_deep_search` — Hybrid Neural-Symbolic retrieval with query expansion and reranking (supports collection filter)
152
+ - `kindx_get` — Neural Extraction by path or docid (with fuzzy matching fallback)
153
+ - `kindx_multi_get` — Bulk Neural Extraction by glob pattern, list, or docids
154
+ - `kindx_status` — Index health and collection inventory
155
+ - `kindx_feedback` — Store relevance feedback (`relevant` / `irrelevant`) for query+chunk pairs
81
156
 
82
- Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json`):
157
+ **Claude Desktop configuration** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
83
158
 
84
159
  ```json
85
160
  {
@@ -92,28 +167,32 @@ Claude Desktop configuration (`~/Library/Application Support/Claude/claude_deskt
92
167
  }
93
168
  ```
94
169
 
95
- #### HTTP Transport
170
+ ### HTTP Transport
96
171
 
97
- By default, KINDX's MCP server uses stdio (launched as a subprocess by each client). For a shared, long-lived server that avoids repeated model loading, use the HTTP transport:
172
+ By default, the MCP server uses stdio (launched as a subprocess per client). For a shared, long-lived server that avoids repeated model loading across agent sessions, use the HTTP transport:
98
173
 
99
174
  ```bash
100
- # Foreground (Ctrl-C to stop)
175
+ # Foreground
101
176
  kindx mcp --http # localhost:8181
102
177
  kindx mcp --http --port 8080 # custom port
103
178
 
104
- # Background daemon
105
- kindx mcp --http --daemon # start, writes PID to ~/.cache/kindx/mcp.pid
106
- kindx mcp stop # stop via PID file
107
- kindx status # shows "MCP: running (PID ...)" when active
179
+ # Persistent daemon
180
+ kindx mcp --http --daemon # writes PID to ~/.cache/kindx/mcp.pid
181
+ kindx mcp stop # terminate via PID file
182
+ kindx status # reports "MCP: running (PID ...)"
108
183
  ```
109
184
 
110
- The HTTP server exposes two endpoints:
111
- - `POST /mcp` -- MCP Streamable HTTP (JSON responses, stateless)
112
- - `GET /health` -- liveness check with uptime
185
+ Endpoints:
186
+ - `POST /mcp` — MCP Streamable HTTP (JSON, stateless)
187
+ - `GET /health` — liveness probe with uptime
113
188
 
114
- LLM models stay loaded in VRAM across requests. Embedding/reranking contexts are disposed after 5 min idle and transparently recreated on the next request (~1s penalty, models remain loaded).
189
+ LLM models remain resident in VRAM across requests. Embedding and reranking contexts are disposed after 5 min idle and transparently recreated on next request (~1 s penalty, models remain warm).
115
190
 
116
- Point any MCP client at `http://localhost:8181/mcp` to connect.
191
+ Point any MCP client at `http://localhost:8181/mcp`.
192
+
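A stateless tool call needs nothing beyond the standard library. This sketch assumes the JSON-RPC 2.0 `tools/call` envelope defined by the MCP specification and the `kindx_search` tool listed above; the argument key names (`query`, `n`) and header requirements are assumptions that may differ by server version:

```python
import json
import urllib.request

def tool_call_request(tool, arguments, request_id=1):
    """Build a JSON-RPC 2.0 tools/call envelope as used by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def post_mcp(body, url="http://localhost:8181/mcp"):
    """POST one stateless request (requires a running `kindx mcp --http`)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json, text/event-stream"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = tool_call_request("kindx_search", {"query": "authentication", "n": 5})
```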
193
+ > **Pro-tip (Multi-Agent Deployments):** Run `kindx mcp --http --daemon` once at agent-cluster startup. All child agents share a single warm model context, eliminating per-invocation model load overhead (~3–8 s per cold start).
194
+
195
+ ---
117
196
 
118
197
  ## Architecture
119
198
 
@@ -164,7 +243,7 @@ graph TB
164
243
  CAT --> SQLite
165
244
  ```
166
245
 
167
- ### Hybrid Search Pipeline
246
+ ### Hybrid Retrieval Pipeline
168
247
 
169
248
  ```mermaid
170
249
  flowchart TD
@@ -219,10 +298,10 @@ flowchart TD
219
298
 
220
299
  ### Score Normalization and Fusion
221
300
 
222
- #### Search Backends
301
+ #### Retrieval Backends
223
302
 
224
303
  - **BM25 (FTS5)**: `Math.abs(score)` normalized via `score / 10`
225
- - **Vector search**: `1 / (1 + distance)` cosine similarity
304
+ - **Vector retrieval**: `1 / (1 + distance)` cosine similarity
226
305
 
227
306
  #### Fusion Strategy
228
307
 
@@ -231,15 +310,17 @@ The `query` command uses Reciprocal Rank Fusion (RRF) with position-aware blendi
231
310
  1. **Query Expansion**: Original query (x2 for weighting) + 1 LLM variation
232
311
  2. **Parallel Retrieval**: Each query searches both FTS and vector indexes
233
312
  3. **RRF Fusion**: Combine all result lists using `score = Sum(1/(k+rank+1))` where k=60
234
- 4. **Top-Rank Bonus**: Documents ranking #1 in any list get +0.05, #2-3 get +0.02
313
+ 4. **Top-Rank Bonus**: documents ranking #1 in any list get +0.05, #2-3 get +0.02
235
314
  5. **Top-K Selection**: Take top 30 candidates for reranking
236
- 6. **Re-ranking**: LLM scores each document (yes/no with logprobs confidence)
315
+ 6. **Re-ranking**: LLM scores each asset (yes/no with logprobs confidence)
237
316
  7. **Position-Aware Blending**:
238
317
  - RRF rank 1-3: 75% retrieval, 25% reranker (preserves exact matches)
239
318
  - RRF rank 4-10: 60% retrieval, 40% reranker
240
319
  - RRF rank 11+: 40% retrieval, 60% reranker (trust reranker more)
241
320
 
242
- Why this approach: Pure RRF can dilute exact matches when expanded queries don't match. The top-rank bonus preserves documents that score #1 for the original query. Position-aware blending prevents the reranker from destroying high-confidence retrieval results.
321
+ **Design rationale:** Pure RRF can dilute exact matches when expanded queries don't match. The top-rank bonus preserves documents that score #1 for the original query. Position-aware blending prevents the reranker from overriding high-confidence retrieval signals.
322
+
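The fusion math above condenses into a short sketch: score normalization, RRF with the top-rank bonus, and the position-aware blend. Zero-based rank indexing and tie handling are assumptions not fixed by the spec:

```python
def norm_bm25(score):
    """FTS5 BM25 scores are negative; take abs then divide by 10 (per the docs)."""
    return abs(score) / 10

def norm_vector(distance):
    """Convert a vector distance into a similarity in (0, 1]."""
    return 1 / (1 + distance)

def rrf_fuse(result_lists, k=60):
    """RRF: score = sum(1/(k+rank+1)) over every list a doc appears in,
    plus the top-rank bonus (+0.05 for #1, +0.02 for #2-3)."""
    scores = {}
    for results in result_lists:
        for rank, doc in enumerate(results):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
            if rank == 0:
                scores[doc] += 0.05
            elif rank in (1, 2):
                scores[doc] += 0.02
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def blend(rrf_rank, retrieval_score, rerank_score):
    """Position-aware blending: trust retrieval at the top, the reranker below."""
    if rrf_rank <= 3:
        w = 0.75
    elif rrf_rank <= 10:
        w = 0.60
    else:
        w = 0.40
    return w * retrieval_score + (1 - w) * rerank_score
```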
323
+ ---
243
324
 
244
325
  ## Requirements
245
326
 
@@ -253,33 +334,53 @@ Why this approach: Pure RRF can dilute exact matches when expanded queries don't
253
334
  brew install sqlite
254
335
  ```
255
336
 
337
+ ### WSL2 GPU Support (Windows)
338
+
339
+ If you are running KINDX inside WSL2 with an NVIDIA GPU, `node-llama-cpp` might fall back to the slow, non-conformant Vulkan translation layer (`dzn`), causing `vsearch` and `query` to take 60-90 seconds or crash.
340
+
341
+ To enable native CUDA GPU acceleration, install the CUDA toolkit runtime libraries (do *not* install the driver meta-packages; WSL2 passes the driver through from Windows):
342
+
343
+ ```bash
344
+ wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
345
+ sudo dpkg -i cuda-keyring_1.1-1_all.deb
346
+ sudo apt-get update
347
+ sudo apt-get install cuda-toolkit-13-1 # or cuda-toolkit-12-6
348
+ ```
349
+
256
350
  ### GGUF Models (via node-llama-cpp)
257
351
 
258
352
  KINDX uses three local GGUF models (auto-downloaded on first use):
259
353
 
260
- - `embeddinggemma-300M-Q8_0` -- embedding model
261
- - `qwen3-reranker-0.6b-q8_0` -- cross-encoder reranker
262
- - `kindx-query-expansion-1.7B-q4_k_m` -- query expansion (fine-tuned)
354
+ - `embeddinggemma-300M-Q8_0` — embedding model
355
+ - `qwen3-reranker-0.6b-q8_0` — cross-encoder reranker
356
+ - `kindx-query-expansion-1.7B-q4_k_m` — query expansion (fine-tuned)
263
357
 
264
358
  Models are downloaded from HuggingFace and cached in `~/.cache/kindx/models/`.
265
359
 
360
+ > **Pro-tip (Air-Gapped Deployments):** Pre-download all three GGUF files and place them in `~/.cache/kindx/models/`. KINDX resolves models from the local cache first; no network access is required at runtime.
361
+
266
362
  ### Custom Embedding Model
267
363
 
268
- Override the default embedding model via the `KINDX_EMBED_MODEL` environment variable. This is useful for multilingual corpora (e.g. Chinese, Japanese, Korean) where embeddinggemma-300M has limited coverage.
364
+ Override the default embedding model via the `KINDX_EMBED_MODEL` environment variable. Required for multilingual corpora (CJK, Arabic, etc.) where `embeddinggemma-300M` has limited coverage.
269
365
 
270
366
  ```bash
271
- # Use Qwen3-Embedding-0.6B for better multilingual (CJK) support
272
- export KINDX_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf"
367
+ # Use Qwen3-Embedding-0.6B for multilingual corpus (CJK) support
368
+ export KINDX_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-Q8_0.gguf"
273
369
 
274
- # After changing the model, re-embed all collections:
370
+ # Force re-embed all documents after model switch
275
371
  kindx embed -f
276
372
  ```
277
373
 
278
374
  Supported model families:
279
- - **embeddinggemma** (default) -- English-optimized, small footprint
280
- - **Qwen3-Embedding** -- Multilingual (119 languages including CJK), MTEB top-ranked
281
375
 
282
- Note: When switching embedding models, you must re-index with `kindx embed -f` since vectors are not cross-compatible between models. The prompt format is automatically adjusted for each model family.
376
+ | Model | Use Case |
377
+ |---|---|
378
+ | `embeddinggemma` (default) | English-optimized, minimal footprint |
379
+ | `Qwen3-Embedding` | Multilingual (119 languages including CJK), MTEB top-ranked |
380
+
381
+ > **Note:** Switching embedding models requires full re-indexing (`kindx embed -f`). Vectors are model-specific and not cross-compatible. The prompt format is automatically adjusted per model family.
382
+
383
+ ---
283
384
 
284
385
  ## Installation
285
386
 
@@ -298,18 +399,110 @@ npm install
298
399
  npm link
299
400
  ```
300
401
 
301
- ## Usage
402
+ ### Troubleshooting: Permission Errors (`EACCES`)
403
+
404
+ If you see `npm error code EACCES` when running `npm install -g`, your system npm is configured to write to a directory owned by root (e.g. `/usr/local/lib/node_modules`). **Do not use `sudo npm install -g`** — this is a security risk.
405
+
406
+ The recommended fix is to use a Node version manager so that npm writes to a user-owned prefix:
407
+
408
+ **Option 1 — `nvm` (most common)**
409
+
410
+ ```bash
411
+ # Install nvm
412
+ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.2/install.sh | bash
413
+ # Restart your shell, then:
414
+ nvm install --lts
415
+ nvm use --lts
416
+ npm install -g @ambicuity/kindx
417
+ ```
302
418
 
303
- ### Collection Management
419
+ **Option 2 — `mise` (polyglot version manager)**
304
420
 
305
421
  ```bash
306
- # Create a collection from current directory
422
+ # Install mise
423
+ curl https://mise.run | sh
424
+ # Restart your shell, then:
425
+ mise use -g node@lts
426
+ npm install -g @ambicuity/kindx
427
+ ```
428
+
429
+ **Option 3 — configure a user-writable npm prefix**
430
+
431
+ ```bash
432
+ mkdir -p ~/.npm-global
433
+ npm config set prefix ~/.npm-global
434
+ # Add to your shell profile (~/.zshrc or ~/.bashrc):
435
+ export PATH="$HOME/.npm-global/bin:$PATH"
436
+ # Then:
437
+ npm install -g @ambicuity/kindx
438
+ ```
439
+
440
+ After any of the above, `kindx --version` should print the installed version.
441
+
442
+
+ ---
+
443
+ ## Usage Reference
444
+
445
+ ### Command Index
446
+
447
+ Top-level commands:
448
+
449
+ ```bash
450
+ kindx query <query> # Hybrid search with expansion + reranking
451
+ kindx search <query> # BM25 full-text search
452
+ kindx vsearch <query> # Vector similarity search
453
+ kindx get <file> [--from N] # Retrieve one document (optionally from line offset)
454
+ kindx multi-get <pattern> # Retrieve many documents by glob/list/docid
455
+ kindx embed # Generate or refresh embeddings
456
+ kindx pull # Download/check the default local models
457
+ kindx update # Re-index configured collections
458
+ kindx watch # Keep the index fresh in the background
459
+ kindx status # Report index, collection, and MCP health
460
+ kindx cleanup # Clear cache/orphaned rows and vacuum the DB
461
+ kindx mcp # Start the MCP server (stdio by default)
462
+ kindx migrate <target> <path> # Import from Chroma or OpenCLAW
463
+ kindx skill install # Install the packaged Claude skill locally
464
+ kindx --skill # Print the packaged skill markdown
465
+ kindx --version # Print the installed CLI version
466
+ ```
467
+
468
+ KINDX opens SQLite indexes with `journal_mode=WAL` and `busy_timeout=5000`, so background writers
469
+ (for example `kindx watch`) and MCP readers can run concurrently with fewer lock conflicts.
470
+
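Those two pragmas are plain SQLite and can be verified directly; the path below is a stand-in for `~/.cache/kindx/index.sqlite`:

```python
import os
import sqlite3
import tempfile

# WAL needs a real file, not :memory:
path = os.path.join(tempfile.mkdtemp(), "index.sqlite")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")   # readers don't block the writer
conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s on a locked database
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
conn.close()
```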
471
+ Collection subcommands:
472
+
473
+ ```bash
474
+ kindx collection add <path> [--name NAME] [--mask GLOB]
475
+ kindx collection list
476
+ kindx collection show <name>
477
+ kindx collection remove <name>
478
+ kindx collection rename <old> <new>
479
+ kindx collection update-cmd <name> [command]
480
+ kindx collection include <name>
481
+ kindx collection exclude <name>
482
+ ```
483
+
484
+ Context and MCP subcommands:
485
+
486
+ ```bash
487
+ kindx context add [path] "text"
488
+ kindx context list
489
+ kindx context rm <path>
490
+
491
+ kindx mcp --http
492
+ kindx mcp --http --daemon
493
+ kindx mcp stop
494
+ ```
495
+
496
+ ### Collection Management
497
+
498
+ ```bash
499
+ # Register a collection from current directory
307
500
  kindx collection add . --name myproject
308
501
 
309
- # Create a collection with explicit path and custom glob mask
502
+ # Register with explicit path and glob mask
310
503
  kindx collection add ~/Documents/notes --name notes --mask "**/*.md"
311
504
 
312
- # List all collections
505
+ # List all registered collections
313
506
  kindx collection list
314
507
 
315
508
  # Remove a collection
@@ -318,137 +511,204 @@ kindx collection remove myproject
318
511
  # Rename a collection
319
512
  kindx collection rename myproject my-project
320
513
 
321
- # List files in a collection
514
+ # Show collection details and current settings
515
+ kindx collection show my-project
516
+
517
+ # Configure a pre-refresh command
518
+ kindx collection update-cmd my-project "git pull --ff-only"
519
+
520
+ # Include or exclude a collection from default queries
521
+ kindx collection include my-project
522
+ kindx collection exclude archive
523
+
524
+ # List documents within a collection
322
525
  kindx ls notes
323
526
  kindx ls notes/subfolder
324
527
  ```
325
528
 
326
- ### Generate Vector Embeddings
529
+ ### YAML Configuration
530
+
531
+ By default, collection settings are stored in `~/.config/kindx/index.yml`. The config directory is resolved as `KINDX_CONFIG_DIR` (if set), then `XDG_CONFIG_HOME/kindx` (if set), then `~/.config/kindx`. Named indexes use `~/.config/kindx/{indexName}.yml` (default: `index.yml`). You can edit this file directly to configure `ignore` patterns and glob rules for files that should be skipped during indexing and search.
532
+
533
+ #### Example
534
+
535
+ ```yaml
536
+ collections:
537
+ docs:
538
+ path: ~/work/docs
539
+ pattern: "**/*.md"
540
+ ignore:
541
+ - "archive/**"
542
+ - "sessions/**"
543
+ - "**/*.draft.md"
544
+ ```
545
+
546
+ #### How it works
547
+
548
+ - `pattern` defines which files are included
549
+ - `ignore` excludes matching files and directories
550
+ - A file must match `pattern` **and not match any `ignore` rule** to be indexed
551
+ - Ignored files are skipped during indexing and will not appear in search results
552
+
553
+ #### Notes
554
+
555
+ - `ignore` is configured in YAML (no CLI support currently)
556
+ - Patterns are evaluated relative to the collection `path`.
557
+ - By default, `node_modules`, `.git`, `.cache`, `vendor`, `dist`, and `build` are already ignored; `ignore` adds custom exclusions.
558
+ - After editing `index.yml`, run `kindx update` to re-index with the updated rules.
559
+
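The rule composes as "matches `pattern` and matches no `ignore` entry". A rough Python sketch using `fnmatch`, which only approximates full `**` glob semantics; the real matcher may differ on edge cases:

```python
from fnmatch import fnmatch

def is_indexed(relpath, pattern, ignore):
    """Indexed iff relpath matches `pattern` and no `ignore` rule matches."""
    if not fnmatch(relpath, pattern):
        return False
    return not any(fnmatch(relpath, rule) for rule in ignore)

# Rules from the example config above
ignore_rules = ["archive/**", "sessions/**", "**/*.draft.md"]
```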
560
+ ### Vector Index Generation
327
561
 
328
562
  ```bash
329
563
  # Embed all indexed documents (900 tokens/chunk, 15% overlap)
330
564
  kindx embed
331
565
 
332
- # Force re-embed everything
566
+ # Force re-embed entire corpus
333
567
  kindx embed -f
334
568
  ```
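
The chunking parameters above (900 tokens per chunk, 15% overlap) imply a stride of `size - size * overlap`. An illustrative sketch over an already-tokenized sequence; the real chunker's tokenizer and boundary handling may differ:

```python
def chunk(tokens, size=900, overlap_frac=0.15):
    """Fixed-size chunks with fractional overlap: stride = size * (1 - overlap)."""
    step = max(1, int(size * (1 - overlap_frac)))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks
```

With the defaults, each 900-token chunk shares 135 tokens (15%) with its predecessor.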
335
569
 
336
570
  ### Context Management
337
571
 
338
- Context adds descriptive metadata to collections and paths, helping search understand your content.
572
+ Context annotations add semantic metadata to collections and paths, improving Contextual Retrieval precision.
339
573
 
340
574
  ```bash
341
- # Add context to a collection (using kindx:// virtual paths)
342
- kindx context add kindx://notes "Personal notes and ideas"
343
- kindx context add kindx://docs/api "API documentation"
575
+ # Annotate a collection (using kindx:// virtual paths)
576
+ kindx context add kindx://notes "Personal documents and ideation corpus"
577
+ kindx context add kindx://docs/api "API and integration documentation corpus"
344
578
 
345
- # Add context from within a collection directory
346
- cd ~/notes && kindx context add "Personal notes and ideas"
347
- cd ~/notes/work && kindx context add "Work-related notes"
579
+ # Annotate from within a corpus directory
580
+ cd ~/notes && kindx context add "Personal documents and ideas"
581
+ cd ~/notes/work && kindx context add "Work-related knowledge corpus"
348
582
 
349
- # Add global context (applies to all collections)
350
- kindx context add / "Knowledge base for my projects"
583
+ # Add global context (applies across all collections)
584
+ kindx context add / "Enterprise knowledge base for agent context injection"
351
585
 
352
- # List all contexts
586
+ # List all context annotations
353
587
  kindx context list
354
588
 
355
- # Remove context
589
+ # Remove context annotation
356
590
  kindx context rm kindx://notes/old
357
591
  ```
358
592
 
359
- ### Search Commands
593
+ ### Contextual Retrieval Commands
360
594
 
361
595
  ```
362
596
  +------------------------------------------------------------+
363
- | Search Modes |
597
+ | Retrieval Modes |
364
598
  +----------+-------------------------------------------------+
365
- | search | BM25 full-text search only |
366
- | vsearch | Vector semantic search only |
367
- | query | Hybrid: FTS + Vector + Query Expansion + Rerank |
599
+ | search | BM25 full-text retrieval only |
600
+ | vsearch | Neural vector retrieval only |
601
+ | query | Hybrid: FTS + Vector + Expansion + Rerank |
368
602
  +----------+-------------------------------------------------+
369
603
  ```
370
604
 
371
605
  ```bash
372
- # Full-text search (fast, keyword-based)
606
+ # Full-text Contextual Retrieval (fast, keyword-based)
373
607
  kindx search "authentication flow"
374
608
 
375
- # Vector search (semantic similarity)
609
+ # Neural vector Contextual Retrieval (semantic similarity)
376
610
  kindx vsearch "how to login"
377
611
 
378
- # Hybrid search with re-ranking (best quality)
612
+ # Hybrid Neural-Symbolic retrieval with re-ranking (highest precision)
379
613
  kindx query "user authentication"
380
614
  ```
381
615
 
382
- ### Options
616
+ ### CLI Options
383
617
 
384
618
  ```bash
385
- # Search options
619
+ # Retrieval options
386
620
  -n <num> # Number of results (default: 5, or 20 for --files/--json)
387
- -c, --collection # Restrict search to a specific collection
388
- --all # Return all matches (use with --min-score to filter)
389
- --min-score <num> # Minimum score threshold (default: 0)
390
- --full # Show full document content
391
- --line-numbers # Add line numbers to output
621
+ -c, --collection # Restrict retrieval to a specific collection
622
+ --all # Return all matches (combine with --min-score to filter)
623
+ --min-score <num> # Minimum relevance threshold (default: 0)
624
+ --full # Return full document content
625
+ --line-numbers # Annotate output with line numbers
392
626
  --explain # Include retrieval score traces (query, JSON/CLI output)
393
627
  --index <name> # Use named index
394
628
 
395
- # Output formats (for search and multi-get)
629
+ # Structured output formats (for agent pipeline consumption)
396
630
  --files # Output: docid,score,filepath,context
397
- --json # JSON output with snippets
631
+ --json # JSON payload with snippets
398
632
  --csv # CSV output
399
633
  --md # Markdown output
400
634
  --xml # XML output
401
635
 
402
- # Get options
403
- kindx get <file>[:line] # Get document, optionally starting at line
636
+ # Neural Extraction options
637
+ kindx get <file>[:line] # Extract document, optionally from line offset
404
638
  -l <num> # Maximum lines to return
405
639
  --from <num> # Start from line number
406
640
 
407
- # Multi-get options
408
- -l <num> # Maximum lines per file
409
- --max-bytes <num> # Skip files larger than N bytes (default: 10KB)
641
+ # Bulk Neural Extraction options
642
+ -l <num> # Maximum lines per asset
643
+ --max-bytes <num> # Skip assets larger than N bytes (default: 10KB)
410
644
  ```
411
645
 
412
646
  ### Index Maintenance
413
647
 
414
648
  ```bash
415
- # Show index status and collections with contexts
649
+ # Report index health and collection inventory
416
650
  kindx status
417
651
 
418
652
  # Re-index all collections
419
653
  kindx update
420
654
 
421
- # Re-index with git pull first (for remote repos)
655
+ # Re-index with upstream git pull (for remote corpus repos)
422
656
  kindx update --pull
423
657
 
424
- # Get document by filepath (with fuzzy matching suggestions)
658
+ # Download/check the default local models
659
+ kindx pull
660
+
661
+ # Force re-download the default models
662
+ kindx pull --refresh
663
+
664
+ # Watch one or more collections for changes
665
+ kindx watch
666
+ kindx watch notes docs
667
+
668
+ # Neural Extraction by filepath (with fuzzy matching fallback)
425
669
  kindx get notes/meeting.md
426
670
 
427
- # Get document by docid (from search results)
671
+ # Neural Extraction by docid (from retrieval results)
428
672
  kindx get "#abc123"
429
673
 
430
- # Get document starting at line 50, max 100 lines
674
+ # Extract document starting at line 50, max 100 lines
431
675
  kindx get notes/meeting.md:50 -l 100
432
676
 
433
- # Get multiple documents by glob pattern
677
+ # Bulk Neural Extraction via glob pattern
434
678
  kindx multi-get "journals/2025-05*.md"
435
679
 
436
- # Get multiple documents by comma-separated list (supports docids)
680
+ # Bulk Neural Extraction via comma-separated list (supports docids)
437
681
  kindx multi-get "doc1.md, doc2.md, #abc123"
438
682
 
439
- # Limit multi-get to files under 20KB
683
+ # Limit bulk extraction to assets under 20KB
440
684
  kindx multi-get "docs/*.md" --max-bytes 20480
441
685
 
442
- # Output multi-get as JSON for agent processing
686
+ # Export bulk extraction as JSON for agent processing
443
687
  kindx multi-get "docs/*.md" --json
444
688
 
445
- # Clean up cache and orphaned data
689
+ # Purge cache and orphaned index data
446
690
  kindx cleanup
691
+
692
+ # Import an existing Chroma or OpenCLAW corpus
693
+ kindx migrate chroma /path/to/chroma.sqlite3
694
+ kindx migrate openclaw /path/to/openclaw/repo
447
695
  ```
448
696
 
697
+ ### Claude Skill Packaging
698
+
699
+ ```bash
700
+ # Print the packaged skill markdown
701
+ kindx --skill
702
+
703
+ # Install the packaged skill into ~/.claude/commands/
704
+ kindx skill install
705
+ ```
706
+
707
+ ---
708
+
449
709
  ## Data Storage
450
710
 
451
- Index stored in: `~/.cache/kindx/index.sqlite`
711
+ Index stored at: `~/.cache/kindx/index.sqlite`
452
712
 
453
713
  ### Schema
454
714
 
@@ -499,15 +759,33 @@ erDiagram
499
759
  content_vectors ||--|| vectors_vec : embeds
500
760
  ```
501
761
 
762
+ ---
763
+
502
764
  ## Environment Variables
503
765
 
504
766
  | Variable | Default | Description |
505
767
  |----------|---------|-------------|
506
768
  | `KINDX_EMBED_MODEL` | `embeddinggemma-300M` | Override embedding model (HuggingFace URI) |
507
- | `KINDX_EXPAND_CONTEXT_SIZE` | `2048` | Context window for query expansion |
769
+ | `KINDX_EXPAND_CONTEXT_SIZE` | `2048` | Context window for query expansion LLM |
770
+ | `KINDX_RERANK_CONTEXT_SIZE` | `4096` | Context window for reranking contexts |
771
+ | `KINDX_LOW_VRAM` | (auto) | Force low-VRAM policy on/off (`1`/`0`) |
772
+ | `KINDX_VRAM_BUDGET_MB` | (unset) | Optional GPU budget in MB; constrains context + parallelism |
773
+ | `KINDX_LOW_VRAM_THRESHOLD_MB` | `6144` | Auto low-VRAM threshold based on free GPU memory |
774
+ | `KINDX_LOW_VRAM_EMBED_PARALLELISM` | `2` | Max embedding context parallelism in low-VRAM mode |
775
+ | `KINDX_LOW_VRAM_RERANK_PARALLELISM` | `1` | Max rerank context parallelism in low-VRAM mode |
776
+ | `KINDX_LOW_VRAM_EXPAND_CONTEXT_SIZE` | `1024` | Expansion context size cap in low-VRAM mode |
777
+ | `KINDX_LOW_VRAM_RERANK_CONTEXT_SIZE` | `1024` | Rerank context size cap in low-VRAM mode |
508
778
  | `KINDX_CONFIG_DIR` | `~/.config/kindx` | Configuration directory override |
509
779
  | `XDG_CACHE_HOME` | `~/.cache` | Cache base directory |
510
- | `NO_COLOR` | (unset) | Disable terminal colors |
780
+ | `NO_COLOR` | (unset) | Disable ANSI terminal colors |
781
+ | `KINDX_LLM_BACKEND` | `local` | Set to `remote` to use an OpenAI-compatible API instead of local GPU |
782
+ | `KINDX_OPENAI_BASE_URL` | `http://127.0.0.1:11434/v1` | URL for the Remote API backend (e.g. Ollama, LM Studio) |
783
+ | `KINDX_OPENAI_API_KEY` | (unset) | API key for the Remote API backend if required |
784
+ | `KINDX_OPENAI_EMBED_MODEL` | `nomic-embed-text` | Model name to pass for `/v1/embeddings` |
785
+ | `KINDX_OPENAI_GENERATE_MODEL` | `llama3.2` | Model name to pass for `/v1/chat/completions` (query expansion) |
786
+ | `KINDX_OPENAI_RERANK_MODEL` | (unset) | Model name to pass for `/v1/rerank` (if supported by backend) |
787
+
788
+ ---
511
789
 
512
790
  ## How It Works
513
791
 
@@ -515,8 +793,8 @@ erDiagram
515
793
 
516
794
  ```mermaid
517
795
  flowchart LR
518
- COL["Collection Config"] --> GLOB["Glob Pattern Scan"]
519
- GLOB --> MD["Markdown Files"]
796
+ COL["Collection Config"] --> GLOB["Glob Pattern Scan"]
797
+ GLOB --> MD["Markdown Documents"]
520
798
  MD --> PARSE["Parse Title + Hash Content"]
521
799
  PARSE --> DOCID["Generate docid (6-char hash)"]
522
800
  DOCID --> SQL["Store in SQLite"]
@@ -525,11 +803,11 @@ flowchart LR
525
803
 
526
804
  ### Embedding Flow
527
805
 
528
- Documents are chunked into ~900-token pieces with 15% overlap using smart boundary detection:
806
+ Documents are chunked into ~900-token pieces with 15% overlap using smart boundary detection:
529
807
 
530
808
  ```mermaid
531
809
  flowchart LR
532
- DOC["Document"] --> CHUNK["Smart Chunk (~900 tokens)"]
810
+ DOC["Document"] --> CHUNK["Smart Chunk (~900 tokens)"]
533
811
  CHUNK --> FMT["Format: title | text"]
534
812
  FMT --> LLM["node-llama-cpp embedBatch"]
535
813
  LLM --> STORE["Store Vectors in sqlite-vec"]
@@ -539,9 +817,9 @@ flowchart LR
539
817
 
540
818
  ### Smart Chunking
541
819
 
542
- Instead of cutting at hard token boundaries, KINDX uses a scoring algorithm to find natural markdown break points. This keeps semantic units (sections, paragraphs, code blocks) together.
820
+ Instead of cutting at hard token boundaries, KINDX uses a scoring algorithm to find natural markdown break points. This keeps semantic units (sections, paragraphs, code blocks) together within a single chunk.
543
821
 
544
- Algorithm:
822
+ **Algorithm:**
545
823
  1. Scan document for all break points with scores
546
824
  2. When approaching the 900-token target, search a 200-token window before the cutoff
547
825
  3. Score each break point: `finalScore = baseScore x (1 - (distance/window)^2 x 0.7)`
@@ -549,7 +827,7 @@ Algorithm:
549
827
 
550
828
  The squared distance decay means a heading 200 tokens back (score ~30) still beats a simple line break at the target (score 1), but a closer heading wins over a distant one.
551
829
 
552
- Code Fence Protection: Break points inside code blocks are ignored -- code stays together. If a code block exceeds the chunk size, it is kept whole when possible.
830
+ **Code Fence Protection:** Break points inside code blocks are ignored; code stays together. If a code block exceeds the chunk size, it is kept whole when possible.
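The distance-decayed scoring above can be sketched in TypeScript (function and variable names here are illustrative, not the actual KINDX source):

```typescript
// Sketch of the break-point scoring rule:
//   finalScore = baseScore * (1 - (distance / window)^2 * 0.7)
// Candidates farther from the chunk target are penalized quadratically,
// so a strong break (e.g. a heading) can still win at a distance.
function scoreBreakPoint(baseScore: number, distance: number, window: number): number {
  return baseScore * (1 - (distance / window) ** 2 * 0.7);
}

// A heading (base score 30) 200 tokens back, with a 200-token search window:
const heading = scoreBreakPoint(30, 200, 200);   // ≈ 9
// A plain line break (base score 1) exactly at the target:
const lineBreak = scoreBreakPoint(1, 0, 200);    // 1
console.log(heading > lineBreak);                // the heading still wins
```

Even at the far edge of the window, the penalty bottoms out at a 0.7 reduction, which is why a heading's base score of 30 still beats a line break at the target, as described above.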
553
831
 
554
832
  ### Model Configuration
555
833
 
@@ -561,18 +839,20 @@ const DEFAULT_RERANK_MODEL = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-re
561
839
  const DEFAULT_GENERATE_MODEL = "hf:ambicuity/kindx-query-expansion-1.7B-gguf/kindx-query-expansion-1.7B-q4_k_m.gguf";
562
840
  ```
563
841
 
842
+ ---
843
+
564
844
  ## Contributing
565
845
 
566
- See [CONTRIBUTING.md](./CONTRIBUTING.md) for the full contribution guide.
846
+ See [CONTRIBUTING.md](./CONTRIBUTING.md) for the full contribution guide and the KINDX Specification.
567
847
 
568
848
  ## Security
569
849
 
570
- See [SECURITY.md](./SECURITY.md) for reporting vulnerabilities.
850
+ See [SECURITY.md](./SECURITY.md) for vulnerability disclosure.
571
851
 
572
852
  ## License
573
853
 
574
- MIT -- see [LICENSE](./LICENSE) for details.
854
+ MIT; see [LICENSE](./LICENSE) for details.
575
855
 
576
856
  ---
577
857
 
578
- Maintained by [Ritesh Rana](https://github.com/ambicuity) -- `contact@riteshrana.engineer`
858
+ Maintained by [Ritesh Rana](https://github.com/ambicuity), `contact@riteshrana.engineer`