@jcode.labs/mimir 0.4.2 → 0.4.3
- package/README.md +16 -531
- package/dist/version.d.ts +1 -1
- package/dist/version.js +1 -1
- package/package.json +3 -3
package/README.md
CHANGED
|
@@ -1,552 +1,37 @@
|
|
|
1
|
-
# Mimir
|
|
1
|
+
# Mimir Core Package
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
[](https://www.npmjs.com/package/@jcode.labs/mimir)
|
|
6
|
-
[](https://github.com/jcode-works/jcode-mimir/blob/main/LICENSE)
|
|
3
|
+
`@jcode.labs/mimir` is the core Mimir package: CLI, library, MCP server, bundled agent skills, and
|
|
4
|
+
synthetic examples for sovereign local RAG.
|
|
7
5
|
|
|
8
|
-
|
|
6
|
+
**Full documentation:** https://github.com/jcode-works/jcode-mimir#readme
|
|
9
7
|
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
vectors locally with LanceDB, and can use either built-in local-hash retrieval or optional
|
|
13
|
-
Transformers.js semantic embeddings. Mimir core returns cited retrieval context; answer synthesis
|
|
14
|
-
belongs to the AI agent, LLM, or local model runtime you choose around it.
|
|
8
|
+
This npm README is intentionally short because package READMEs are displayed separately on npm. The
|
|
9
|
+
GitHub root README is the canonical product documentation.
|
|
15
10
|
|
|
16
|
-
|
|
17
|
-
research documents in a private local folder, index them locally, then let any compatible AI agent or
|
|
18
|
-
LLM workflow retrieve grounded context for summaries, briefs, audits, and decision support without
|
|
19
|
-
shipping the dataset to a hosted RAG service.
|
|
20
|
-
|
|
21
|
-
Created by Jean-Baptiste Thery and published under the JCode Labs npm scope.
|
|
22
|
-
|
|
23
|
-
Built by Jean-Baptiste Thery, freelance full-stack/AI tooling engineer at JCode Labs.
|
|
24
|
-
|
|
25
|
-
## Open Source
|
|
26
|
-
|
|
27
|
-
Mimir is a public open-source project under the MIT License. It is designed to be
|
|
28
|
-
inspectable, forkable, and usable without a JCode Labs account.
|
|
29
|
-
|
|
30
|
-
Contributions are welcome through pull requests. Start with
|
|
31
|
-
[`CONTRIBUTING.md`](https://github.com/jcode-works/jcode-mimir/blob/main/CONTRIBUTING.md).
|
|
32
|
-
Security reports should stay private and follow the policy in
|
|
33
|
-
[`SECURITY.md`](https://github.com/jcode-works/jcode-mimir/blob/main/SECURITY.md).
|
|
34
|
-
|
|
35
|
-
## Sponsors
|
|
36
|
-
|
|
37
|
-
Mimir stays MIT open source. Sponsorship helps fund maintenance, issue triage,
|
|
38
|
-
documentation, and practical agent-workflow improvements.
|
|
39
|
-
|
|
40
|
-
Sponsor the project through [GitHub Sponsors](https://github.com/sponsors/jb-thery).
|
|
41
|
-
|
|
42
|
-
Suggested GitHub Sponsors tiers:
|
|
43
|
-
|
|
44
|
-
- EUR 5/month: support the project.
|
|
45
|
-
- EUR 15/month: active sponsor.
|
|
46
|
-
- EUR 49/month: priority on issues and questions.
|
|
47
|
-
- EUR 199/month: company sponsor and light advisory support.
|
|
48
|
-
|
|
49
|
-
## Status
|
|
50
|
-
|
|
51
|
-
Early public package. APIs may evolve before `1.0.0`.
|
|
52
|
-
|
|
53
|
-
## Documentation
|
|
54
|
-
|
|
55
|
-
- [Getting started](https://github.com/jcode-works/jcode-mimir/blob/main/docs/getting-started.md):
|
|
56
|
-
install Mimir and complete the first local search.
|
|
57
|
-
- [CLI reference](https://github.com/jcode-works/jcode-mimir/blob/main/docs/cli-reference.md):
|
|
58
|
-
every `kb` and `mimir-tts` command with practical usage notes.
|
|
59
|
-
- [Troubleshooting](https://github.com/jcode-works/jcode-mimir/blob/main/docs/troubleshooting.md):
|
|
60
|
-
common install, indexing, retrieval, audio, and release issues.
|
|
61
|
-
- [Security hardening](https://github.com/jcode-works/jcode-mimir/blob/main/SECURITY-HARDENING.md):
|
|
62
|
-
offline operation, threat model, secure deletion limits, and release verification.
|
|
63
|
-
|
|
64
|
-
## What Mimir Is For
|
|
65
|
-
|
|
66
|
-
- Build a local RAG knowledge base inside any repository.
|
|
67
|
-
- Analyze confidential datasets while keeping raw files and generated indexes local.
|
|
68
|
-
- Give Claude, Codex, Cursor, internal assistants, or other MCP-compatible tools the same private
|
|
69
|
-
retrieval layer.
|
|
70
|
-
- Retrieve grounded local evidence through CLI, library calls, MCP tools, or the bundled agent
|
|
71
|
-
skills so your chosen AI agent can produce cited summaries.
|
|
72
|
-
- Optionally create listenable MP3 or WAV summaries with `kb audio`, `@jcode.labs/mimir-tts`, and
|
|
73
|
-
the bundled `mimir-audio-summary` skill.
|
|
74
|
-
|
|
75
|
-
Mimir is not a hosted SaaS, not a remote vector database, and not a certified high-assurance system.
|
|
76
|
-
For regulated or state-grade environments, pair it with encrypted disks, controlled machines, release
|
|
77
|
-
verification, and an external security review.
|
|
78
|
-
|
|
79
|
-
## Use Cases
|
|
80
|
-
|
|
81
|
-
Mimir is useful whenever the source material should stay local but an AI agent still needs grounded
|
|
82
|
-
context.
|
|
83
|
-
|
|
84
|
-
| Use case | Example questions |
|
|
85
|
-
| --- | --- |
|
|
86
|
-
| Understand a code repository | "Where is authentication implemented?", "What depends on this module?", "Summarize the payment flow." |
|
|
87
|
-
| Understand architecture | "What services exist?", "What are the data boundaries?", "Which components are risky to change?" |
|
|
88
|
-
| Analyze specifications | "What does the technical spec require?", "Which requirements are still unclear?", "Generate an implementation checklist." |
|
|
89
|
-
| Work through a request for proposal or tender | "What are the mandatory constraints?", "Which documents prove compliance?", "What risks should be clarified?" |
|
|
90
|
-
| Study courses and training material | "Summarize chapter three.", "Create revision questions.", "Compare these two concepts." |
|
|
91
|
-
| Analyze a book or long report | "Extract the main thesis.", "Find recurring arguments.", "Create a chapter-by-chapter brief." |
|
|
92
|
-
| Build an internal knowledge base | "What is the policy for incident review?", "Who owns this process?", "Which source says that?" |
|
|
93
|
-
| Prepare meetings or decisions | "Give me a one-page briefing.", "What is missing before deciding?", "List action items and evidence." |
|
|
94
|
-
| Ask questions over offline documents | "Which files mention local-only operation?", "What evidence supports this claim?" |
|
|
95
|
-
| Generate audio briefings | "Create a listenable high-quality or offline summary of the current dossier." |
|
|
96
|
-
|
|
97
|
-
## Requirements
|
|
98
|
-
|
|
99
|
-
- Node.js 20+
|
|
100
|
-
- pnpm, npm, yarn or bun
|
|
101
|
-
- No model runtime is required for the default `embeddingProvider: "local-hash"` mode.
|
|
102
|
-
- Optional semantic embeddings use Transformers.js with local model files under `.mimir/models` by
|
|
103
|
-
default.
|
|
104
|
-
- Generated answers are intentionally outside Mimir core. Use Claude, Codex, OpenAI, a local model
|
|
105
|
-
MCP server, or another trusted model runtime to synthesize from Mimir's cited context.
|
|
106
|
-
- Optional audio summaries use the separate `@jcode.labs/mimir-tts` workspace package. For the
|
|
107
|
-
highest quality, install the external `edge-tts` CLI and render Edge MP3 output with
|
|
108
|
-
`fr-FR-DeniseNeural`. For confidential or air-gapped content, use the Transformers.js WAV path
|
|
109
|
-
with `--engine transformers --offline`; it does not require Python, ffmpeg, Piper, XTTS, or a
|
|
110
|
-
local server.
|
|
111
|
-
|
|
112
|
-
## Install From npm
|
|
113
|
-
|
|
114
|
-
The package is public. Users do not need a JCode Labs account or npm token to install it.
|
|
115
|
-
|
|
116
|
-
With pnpm:
|
|
11
|
+
## Install
|
|
117
12
|
|
|
118
13
|
```bash
|
|
119
14
|
pnpm add -D @jcode.labs/mimir
|
|
120
15
|
```
|
|
121
16
|
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
```bash
|
|
125
|
-
npm install --save-dev @jcode.labs/mimir
|
|
126
|
-
```
|
|
127
|
-
|
|
128
|
-
Maintainer tokens are only needed to publish new versions.
|
|
129
|
-
|
|
130
|
-
## Install From Source Checkout
|
|
131
|
-
|
|
132
|
-
```bash
|
|
133
|
-
git clone git@github.com:jcode-works/jcode-mimir.git
|
|
134
|
-
cd jcode-mimir
|
|
135
|
-
pnpm install
|
|
136
|
-
pnpm build
|
|
137
|
-
```
|
|
138
|
-
|
|
139
|
-
For local development:
|
|
140
|
-
|
|
141
|
-
```bash
|
|
142
|
-
pnpm add -D file:../jcode-mimir/packages/mimir
|
|
143
|
-
```
|
|
144
|
-
|
|
145
|
-
Before creating an npm tarball later, run:
|
|
146
|
-
|
|
147
|
-
```bash
|
|
148
|
-
pnpm build
|
|
149
|
-
pnpm --dir packages/mimir pack
|
|
150
|
-
```
|
|
151
|
-
|
|
152
|
-
## Use In Any Repository
|
|
153
|
-
|
|
154
|
-
Initialize the local project config:
|
|
17
|
+
## Quick Commands
|
|
155
18
|
|
|
156
19
|
```bash
|
|
157
20
|
pnpm exec kb init
|
|
158
21
|
pnpm exec kb doctor
|
|
159
|
-
```
|
|
160
|
-
|
|
161
|
-
Add private documents under `private/`, then run:
|
|
162
|
-
|
|
163
|
-
```bash
|
|
164
|
-
pnpm exec kb ingest
|
|
165
|
-
pnpm exec kb doctor
|
|
166
|
-
pnpm exec kb search "vendor invoice status"
|
|
167
|
-
pnpm exec kb ask "What do the documents prove?"
|
|
168
|
-
pnpm exec kb audit
|
|
169
|
-
pnpm exec kb security-audit
|
|
170
|
-
pnpm exec kb status
|
|
171
|
-
```
|
|
172
|
-
|
|
173
|
-
With npm, use `npx` after installing the package:
|
|
174
|
-
|
|
175
|
-
```bash
|
|
176
|
-
npx kb init
|
|
177
|
-
npx kb doctor
|
|
178
|
-
npx kb ingest
|
|
179
|
-
npx kb doctor
|
|
180
|
-
npx kb search "vendor invoice status"
|
|
181
|
-
npx kb ask "What do the documents prove?"
|
|
182
|
-
npx kb audit
|
|
183
|
-
npx kb security-audit
|
|
184
|
-
npx kb status
|
|
185
|
-
```
|
|
186
|
-
|
|
187
|
-
## Choose A Retrieval Mode
|
|
188
|
-
|
|
189
|
-
Mimir has two embedding modes.
|
|
190
|
-
|
|
191
|
-
### Default Local Hash Retrieval
|
|
192
|
-
|
|
193
|
-
Use this when you want a fully local, no-model smoke test or a dependency-light setup. Retrieval is
|
|
194
|
-
lexical/hash-based, not semantic.
|
|
195
|
-
|
|
196
|
-
`.kb/config.json`:
|
|
197
|
-
|
|
198
|
-
```json
|
|
199
|
-
{
|
|
200
|
-
"embeddingProvider": "local-hash"
|
|
201
|
-
}
|
|
202
|
-
```
|
|
203
|
-
|
|
204
|
-
Commands:
|
|
205
|
-
|
|
206
|
-
```bash
|
|
207
|
-
pnpm exec kb ingest
|
|
208
|
-
pnpm exec kb search "offline retrieval approval"
|
|
209
|
-
pnpm exec kb ask "What evidence supports offline operation?"
|
|
210
|
-
```
|
|
211
|
-
|
|
212
|
-
`kb ask` always returns cited retrieved passages instead of a generated synthesis. You can pass those
|
|
213
|
-
passages to any LLM or agent you trust.
|
|
214
|
-
|
|
215
|
-
### Optional Semantic Embeddings With Transformers.js
|
|
216
|
-
|
|
217
|
-
Use this when you want better semantic retrieval while keeping Mimir core free of an LLM server.
|
|
218
|
-
|
|
219
|
-
`.kb/config.json`:
|
|
220
|
-
|
|
221
|
-
```json
|
|
222
|
-
{
|
|
223
|
-
"embeddingProvider": "transformers",
|
|
224
|
-
"embeddingModel": "mixedbread-ai/mxbai-embed-xsmall-v1",
|
|
225
|
-
"embeddingModelPath": ".mimir/models",
|
|
226
|
-
"transformersAllowRemoteModels": false
|
|
227
|
-
}
|
|
228
|
-
```
|
|
229
|
-
|
|
230
|
-
Commands:
|
|
231
|
-
|
|
232
|
-
```bash
|
|
233
|
-
pnpm exec kb ingest
|
|
234
|
-
pnpm exec kb ask "Which passages support offline operation?"
|
|
235
|
-
```
|
|
236
|
-
|
|
237
|
-
Keep `transformersAllowRemoteModels` false for confidential or air-gapped work and preload model
|
|
238
|
-
files into `embeddingModelPath`. Set it to true only when you explicitly allow Transformers.js to
|
|
239
|
-
download model files from Hugging Face.
|
|
240
|
-
|
|
241
|
-
## Dependency Footprint
|
|
242
|
-
|
|
243
|
-
Mimir can run retrieval without a model runtime. Some runtime dependencies remain because they own
|
|
244
|
-
core features:
|
|
245
|
-
|
|
246
|
-
| Dependency | Why it remains |
|
|
247
|
-
| --- | --- |
|
|
248
|
-
| @huggingface/transformers | optional local semantic embeddings |
|
|
249
|
-
| LanceDB | local vector storage and nearest-neighbor retrieval |
|
|
250
|
-
| MCP SDK | MCP server for compatible agents |
|
|
251
|
-
| fast-glob | safe source-file discovery |
|
|
252
|
-
| unpdf, html-to-text, yaml, fflate | document parsing for PDF, HTML, YAML, Office/OpenDocument ZIP files |
|
|
253
|
-
| commander, zod, picocolors | CLI, config validation, readable terminal output |
|
|
254
|
-
|
|
255
|
-
Removing more dependencies is possible only by dropping features or replacing them with smaller
|
|
256
|
-
internal implementations. The current low-friction path is dependency-light at runtime for users who
|
|
257
|
-
choose `local-hash`, while preserving richer parsing, MCP support, and optional semantic embeddings.
|
|
258
|
-
|
|
259
|
-
## Example Test Workspace
|
|
260
|
-
|
|
261
|
-
This repository includes a synthetic example under
|
|
262
|
-
[`examples/sovereign-rag-demo`](./examples/sovereign-rag-demo). It can be used to test ingestion,
|
|
263
|
-
retrieval, `security-audit`, and custom text extensions without using private documents.
|
|
264
|
-
|
|
265
|
-
From a local checkout:
|
|
266
|
-
|
|
267
|
-
```bash
|
|
268
|
-
pnpm build
|
|
269
|
-
cd examples/sovereign-rag-demo
|
|
270
|
-
node ../../dist/cli.js security-audit
|
|
271
|
-
node ../../dist/cli.js ingest
|
|
272
|
-
node ../../dist/cli.js search "offline retrieval approval"
|
|
273
|
-
node ../../dist/cli.js audit
|
|
274
|
-
```
|
|
275
|
-
|
|
276
|
-
The example uses the default local-hash retrieval mode, so it can run without downloading an
|
|
277
|
-
embedding or chat model.
|
|
278
|
-
|
|
279
|
-
## Typical Workflows
|
|
280
|
-
|
|
281
|
-
### Understand A Codebase
|
|
282
|
-
|
|
283
|
-
```bash
|
|
284
|
-
pnpm exec kb init
|
|
285
|
-
printf "src\nREADME.md\ndocs\n" >> .kb/sources.txt
|
|
286
|
-
pnpm exec kb ingest
|
|
287
|
-
pnpm exec kb search "authentication flow"
|
|
288
|
-
pnpm exec kb ask "Explain the architecture and cite the relevant files."
|
|
289
|
-
```
|
|
290
|
-
|
|
291
|
-
### Analyze Specifications Or A Course
|
|
292
|
-
|
|
293
|
-
```bash
|
|
294
22
|
pnpm exec kb ingest
|
|
295
|
-
pnpm exec kb
|
|
296
|
-
pnpm exec kb ask "
|
|
297
|
-
```
|
|
298
|
-
|
|
299
|
-
### Work Offline
|
|
300
|
-
|
|
301
|
-
```bash
|
|
302
|
-
pnpm exec kb security-audit --strict
|
|
303
|
-
pnpm exec kb ingest
|
|
304
|
-
pnpm exec kb search "incident review policy"
|
|
305
|
-
pnpm exec kb ask "What does the local evidence prove?"
|
|
306
|
-
```
|
|
307
|
-
|
|
308
|
-
Use `embeddingProvider: "local-hash"` for a no-model offline workflow. Use
|
|
309
|
-
`embeddingProvider: "transformers"` with preloaded model files for semantic offline retrieval.
|
|
310
|
-
Generated answers should come from a trusted external agent or model runtime.
|
|
311
|
-
|
|
312
|
-
### Generate An Audio Briefing
|
|
313
|
-
|
|
314
|
-
Mimir includes a plug-and-play text-to-speech path for listenable summaries. For the same quality
|
|
315
|
-
path as the global Voice Forge skill, install `edge-tts` and render MP3:
|
|
316
|
-
|
|
317
|
-
```bash
|
|
318
|
-
pnpm exec kb audio --doctor
|
|
319
|
-
pipx install edge-tts
|
|
320
|
-
pnpm exec kb audio /tmp/MIMIR-SUMMARY-project.txt \
|
|
321
|
-
--engine edge \
|
|
322
|
-
--out .mimir/audio/project-summary.mp3
|
|
323
|
-
```
|
|
324
|
-
|
|
325
|
-
The Edge path uses the online Microsoft Edge TTS service through the `edge-tts` CLI. Use it only
|
|
326
|
-
when sending the narration text to that service is acceptable.
|
|
327
|
-
|
|
328
|
-
By default, `kb audio` uses the Transformers.js WAV path. For confidential or air-gapped work,
|
|
329
|
-
preload Transformers.js-compatible model files and render WAV offline:
|
|
330
|
-
|
|
331
|
-
```bash
|
|
332
|
-
pnpm exec kb audio /tmp/MIMIR-SUMMARY-project.txt \
|
|
333
|
-
--engine transformers \
|
|
334
|
-
--offline \
|
|
335
|
-
--model-path .mimir/models/tts \
|
|
336
|
-
--out .mimir/audio/project-summary.wav
|
|
337
|
-
```
|
|
338
|
-
|
|
339
|
-
The standalone package can also be installed directly:
|
|
340
|
-
|
|
341
|
-
```bash
|
|
342
|
-
pnpm add -D @jcode.labs/mimir-tts
|
|
343
|
-
pnpm exec mimir-tts render /tmp/MIMIR-SUMMARY-project.txt \
|
|
344
|
-
--engine edge \
|
|
345
|
-
--out .mimir/audio/project-summary.mp3
|
|
346
|
-
```
|
|
347
|
-
|
|
348
|
-
## Agent Skills And MCP
|
|
349
|
-
|
|
350
|
-
Mimir ships with portable agent skills and a standard MCP server.
|
|
351
|
-
|
|
352
|
-
Install the agent kit into a repository:
|
|
353
|
-
|
|
354
|
-
```bash
|
|
23
|
+
pnpm exec kb search "your question"
|
|
24
|
+
pnpm exec kb ask "your question"
|
|
355
25
|
pnpm exec kb install-skill
|
|
356
26
|
```
|
|
357
27
|
|
|
358
|
-
|
|
359
|
-
|
|
360
|
-
```plain text
|
|
361
|
-
.mimir/skills/mimir/SKILL.md
|
|
362
|
-
.mimir/skills/mimir-audio-summary/SKILL.md
|
|
363
|
-
.mimir/mcp.json
|
|
364
|
-
.mimir/README.md
|
|
365
|
-
```
|
|
366
|
-
|
|
367
|
-
Agents that support skill folders can load `.mimir/skills/mimir/` for deep local RAG usage.
|
|
368
|
-
Load `.mimir/skills/mimir-audio-summary/` only when an optional spoken summary is needed.
|
|
369
|
-
Other agents can read the generated `.mimir/README.md` and use the MCP config snippet.
|
|
370
|
-
|
|
371
|
-
Start the MCP server from the repository root:
|
|
372
|
-
|
|
373
|
-
```bash
|
|
374
|
-
pnpm exec kb serve-mcp
|
|
375
|
-
```
|
|
376
|
-
|
|
377
|
-
MCP tools exposed:
|
|
378
|
-
|
|
379
|
-
- `mimir_status`
|
|
380
|
-
- `mimir_search`
|
|
381
|
-
- `mimir_ask`
|
|
382
|
-
- `mimir_audit`
|
|
383
|
-
- `mimir_security_audit`
|
|
384
|
-
|
|
385
|
-
This MCP layer is the recommended way to let any compatible LLM or agent query the same local
|
|
386
|
-
knowledge base. The LLM does not need to know about LanceDB or the raw file layout; it asks Mimir for
|
|
387
|
-
ranked passages or cited context and uses the returned citations.
|
|
388
|
-
|
|
389
|
-
Print the bundled skill path from the installed package:
|
|
390
|
-
|
|
391
|
-
```bash
|
|
392
|
-
pnpm exec kb skill-path
|
|
393
|
-
```
|
|
394
|
-
|
|
395
|
-
## Data Boundary
|
|
396
|
-
|
|
397
|
-
The package code lives in `node_modules` or in this repository. Project data stays in the
|
|
398
|
-
repository where you run the CLI:
|
|
399
|
-
|
|
400
|
-
```plain text
|
|
401
|
-
your-project/
|
|
402
|
-
private/ # raw documents to ingest
|
|
403
|
-
.kb/config.json # local config
|
|
404
|
-
.kb/sources.txt # optional extra source paths
|
|
405
|
-
.kb/storage/ # generated LanceDB index
|
|
406
|
-
.kb/access.log # metadata-only access log
|
|
407
|
-
```
|
|
408
|
-
|
|
409
|
-
The package never ships project documents. `kb init` adds gitignore entries for `.kb/`
|
|
410
|
-
and `private/**`, and `kb install-skill` keeps `.mimir/` ignored as generated local agent
|
|
411
|
-
state.
|
|
412
|
-
|
|
413
|
-
## Confidentiality Defaults
|
|
414
|
-
|
|
415
|
-
Mimir is designed for private repositories and sensitive local evidence.
|
|
416
|
-
|
|
417
|
-
- Zero telemetry: no analytics or document content is sent to JCode Labs.
|
|
418
|
-
- No LLM generation in core: Mimir returns cited context for the agent/runtime you choose.
|
|
419
|
-
- Local-hash by default: no model runtime is required for the default retrieval path.
|
|
420
|
-
- Transformers.js remote model loading is disabled by default.
|
|
421
|
-
- Redaction before indexing: common secrets and identifiers are redacted before chunks are
|
|
422
|
-
embedded and stored.
|
|
423
|
-
- Metadata-only access logs: query hashes and action metadata are logged, not raw queries.
|
|
424
|
-
- MCP is read-focused and bounded by `mcpMaxTopK`.
|
|
425
|
-
- Generated local state is ignored by Git.
|
|
426
|
-
|
|
427
|
-
Run:
|
|
428
|
-
|
|
429
|
-
```bash
|
|
430
|
-
pnpm exec kb security-audit --strict
|
|
431
|
-
```
|
|
432
|
-
|
|
433
|
-
Remove the generated vector index:
|
|
434
|
-
|
|
435
|
-
```bash
|
|
436
|
-
pnpm exec kb destroy-index --yes
|
|
437
|
-
```
|
|
438
|
-
|
|
439
|
-
For air-gapped operation, release verification, secure deletion limits, and threat model details,
|
|
440
|
-
read
|
|
441
|
-
[`SECURITY-HARDENING.md`](https://github.com/jcode-works/jcode-mimir/blob/main/SECURITY-HARDENING.md).
|
|
442
|
-
|
|
443
|
-
## Supported Files
|
|
444
|
-
|
|
445
|
-
Mimir supports common text, document, data, config, log, and source-code files out of the box:
|
|
446
|
-
|
|
447
|
-
- Markdown: `.md`, `.mdx`
|
|
448
|
-
- Text: `.txt`, `.text`
|
|
449
|
-
- JSON: `.json`
|
|
450
|
-
- YAML: `.yaml`, `.yml`
|
|
451
|
-
- CSV/TSV: `.csv`, `.tsv`
|
|
452
|
-
- HTML: `.html`, `.htm`
|
|
453
|
-
- PDF: `.pdf`
|
|
454
|
-
- Office/OpenDocument: `.docx`, `.pptx`, `.xlsx`, `.odt`, `.ods`, `.odp`
|
|
455
|
-
- Rich text: `.rtf`
|
|
456
|
-
- Line data and logs: `.jsonl`, `.ndjson`, `.log`
|
|
457
|
-
- XML feeds and documents: `.xml`, `.rss`, `.atom`
|
|
458
|
-
- Config and data files: `.toml`, `.ini`, `.conf`, `.cfg`, `.properties`, `.sql`
|
|
459
|
-
- Source code: `.ts`, `.tsx`, `.js`, `.jsx`, `.py`, `.go`, `.rs`, `.java`, `.rb`, `.php`,
|
|
460
|
-
`.cs`, `.c`, `.cpp`, `.h`, `.css`
|
|
461
|
-
|
|
462
|
-
Custom UTF-8 text extensions can be enabled without changing code:
|
|
463
|
-
|
|
464
|
-
```json
|
|
465
|
-
{
|
|
466
|
-
"includeExtensions": [".transcript", ".evidence"]
|
|
467
|
-
}
|
|
468
|
-
```
|
|
469
|
-
|
|
470
|
-
Or through:
|
|
471
|
-
|
|
472
|
-
```bash
|
|
473
|
-
KB_INCLUDE_EXTENSIONS=".transcript,.evidence" pnpm exec kb ingest
|
|
474
|
-
```
|
|
475
|
-
|
|
476
|
-
Images, scans, audio/video files, old proprietary Office binaries such as `.doc`, and other formats
|
|
477
|
-
that are not listed should be OCRed, transcribed, converted, or exported to text/PDF/HTML first.
|
|
478
|
-
Mimir intentionally avoids pretending that every binary format can be indexed safely without
|
|
479
|
-
extraction logic.
|
|
480
|
-
|
|
481
|
-
## Config
|
|
482
|
-
|
|
483
|
-
`.kb/config.json`:
|
|
484
|
-
|
|
485
|
-
```json
|
|
486
|
-
{
|
|
487
|
-
"rawDir": "private",
|
|
488
|
-
"storageDir": ".kb/storage",
|
|
489
|
-
"sourcesFile": ".kb/sources.txt",
|
|
490
|
-
"accessLogPath": ".kb/access.log",
|
|
491
|
-
"embeddingModelPath": ".mimir/models",
|
|
492
|
-
"tableName": "chunks",
|
|
493
|
-
"embeddingProvider": "local-hash",
|
|
494
|
-
"embeddingModel": "mixedbread-ai/mxbai-embed-xsmall-v1",
|
|
495
|
-
"transformersAllowRemoteModels": false,
|
|
496
|
-
"redaction": {
|
|
497
|
-
"enabled": true,
|
|
498
|
-
"builtIn": true,
|
|
499
|
-
"patterns": []
|
|
500
|
-
},
|
|
501
|
-
"accessLog": true,
|
|
502
|
-
"mcpMaxTopK": 10,
|
|
503
|
-
"topK": 5,
|
|
504
|
-
"chunkSize": 1200,
|
|
505
|
-
"chunkOverlap": 150,
|
|
506
|
-
"includeExtensions": []
|
|
507
|
-
}
|
|
508
|
-
```
|
|
509
|
-
|
|
510
|
-
Environment overrides:
|
|
511
|
-
|
|
512
|
-
- `KB_RAW_DIR`
|
|
513
|
-
- `KB_STORAGE_DIR`
|
|
514
|
-
- `KB_SOURCES_FILE`
|
|
515
|
-
- `KB_ACCESS_LOG_PATH`
|
|
516
|
-
- `KB_EMBEDDING_PROVIDER`
|
|
517
|
-
- `KB_EMBEDDING_MODEL`
|
|
518
|
-
- `KB_EMBEDDING_MODEL_PATH`
|
|
519
|
-
- `KB_TRANSFORMERS_ALLOW_REMOTE_MODELS`
|
|
520
|
-
- `KB_REDACTION_ENABLED`
|
|
521
|
-
- `KB_REDACTION_BUILT_IN`
|
|
522
|
-
- `KB_ACCESS_LOG`
|
|
523
|
-
- `KB_MCP_MAX_TOP_K`
|
|
524
|
-
- `KB_TOP_K`
|
|
525
|
-
- `KB_CHUNK_SIZE`
|
|
526
|
-
- `KB_CHUNK_OVERLAP`
|
|
527
|
-
- `KB_INCLUDE_EXTENSIONS`
|
|
528
|
-
|
|
529
|
-
## Library API
|
|
530
|
-
|
|
531
|
-
```ts
|
|
532
|
-
import { ingest, search, ask } from "@jcode.labs/mimir"
|
|
533
|
-
|
|
534
|
-
await ingest({ rebuild: true })
|
|
535
|
-
const results = await search("vendor invoice status")
|
|
536
|
-
const answer = await ask("What documents support the project timeline?")
|
|
537
|
-
```
|
|
538
|
-
|
|
539
|
-
## Privacy
|
|
28
|
+
## Entry Points
|
|
540
29
|
|
|
541
|
-
-
|
|
542
|
-
-
|
|
543
|
-
-
|
|
544
|
-
-
|
|
545
|
-
- Access logs store query hashes, not raw queries.
|
|
546
|
-
- The vector index is stored locally.
|
|
547
|
-
- Raw private documents should stay in the target repository's ignored `private/` folder.
|
|
548
|
-
- Do not put secrets or scans inside this package repository.
|
|
30
|
+
- CLI: `kb`
|
|
31
|
+
- Library import: `@jcode.labs/mimir`
|
|
32
|
+
- MCP server: `pnpm exec kb serve-mcp`
|
|
33
|
+
- Bundled skills: `pnpm exec kb install-skill`
|
|
549
34
|
|
|
550
35
|
## License
|
|
551
36
|
|
|
552
|
-
MIT
|
|
37
|
+
MIT (c) Jean-Baptiste Thery.
|
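The Config section removed from the package README in this diff pairs a `.kb/config.json` example with a list of `KB_*` environment overrides. The sketch below shows one way such layering could work: defaults first, then the config file, then the environment. It is not Mimir's actual loader. `resolveConfig`, the `KbConfig` subset, and the parsing rules (only the string `"true"` enables a boolean, blank numeric values are ignored) are assumptions made for illustration; the default values and the comma-separated form of `KB_INCLUDE_EXTENSIONS` come from the removed README text.

```ts
// Hypothetical sketch: not Mimir's real config loader.
// Only a subset of the documented keys is modeled here.
type KbConfig = {
  rawDir: string;
  embeddingProvider: string;
  transformersAllowRemoteModels: boolean;
  topK: number;
  chunkSize: number;
  chunkOverlap: number;
  includeExtensions: string[];
};

// Defaults copied from the `.kb/config.json` example in the removed README section.
const defaults: KbConfig = {
  rawDir: "private",
  embeddingProvider: "local-hash",
  transformersAllowRemoteModels: false,
  topK: 5,
  chunkSize: 1200,
  chunkOverlap: 150,
  includeExtensions: [],
};

type Env = Record<string, string | undefined>;

// Parse a numeric override; fall back when it is unset, blank, or not a number.
function numberFrom(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Assumed precedence: environment over config file over defaults.
function resolveConfig(fileConfig: Partial<KbConfig>, env: Env): KbConfig {
  const merged: KbConfig = { ...defaults, ...fileConfig };
  const remote = env.KB_TRANSFORMERS_ALLOW_REMOTE_MODELS;
  const extensions = env.KB_INCLUDE_EXTENSIONS;
  return {
    rawDir: env.KB_RAW_DIR ?? merged.rawDir,
    embeddingProvider: env.KB_EMBEDDING_PROVIDER ?? merged.embeddingProvider,
    // Assumption: only the literal string "true" turns the flag on.
    transformersAllowRemoteModels:
      remote === undefined ? merged.transformersAllowRemoteModels : remote === "true",
    topK: numberFrom(env.KB_TOP_K, merged.topK),
    chunkSize: numberFrom(env.KB_CHUNK_SIZE, merged.chunkSize),
    chunkOverlap: numberFrom(env.KB_CHUNK_OVERLAP, merged.chunkOverlap),
    includeExtensions:
      extensions === undefined
        ? merged.includeExtensions
        : extensions.split(",").map((ext) => ext.trim()).filter((ext) => ext.length > 0),
  };
}
```

In a Node.js script, `process.env` could be passed as the second argument. Taking the environment as a parameter keeps the function easy to test without mutating global state.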
package/dist/version.d.ts
CHANGED
|
@@ -1,2 +1,2 @@
|
|
|
1
|
-
export declare const VERSION = "0.4.2";
|
|
1
|
+
export declare const VERSION = "0.4.3";
|
|
2
2
|
//# sourceMappingURL=version.d.ts.map
|
package/dist/version.js
CHANGED
|
@@ -1,2 +1,2 @@
|
|
|
1
|
-
export const VERSION = "0.4.2";
|
|
1
|
+
export const VERSION = "0.4.3";
|
|
2
2
|
//# sourceMappingURL=version.js.map
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@jcode.labs/mimir",
|
|
3
|
-
"version": "0.4.2",
|
|
3
|
+
"version": "0.4.3",
|
|
4
4
|
"description": "Mimir: open-source sovereign local RAG for confidential datasets and AI agents.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"license": "MIT",
|
|
@@ -28,7 +28,7 @@
|
|
|
28
28
|
"url": "git+https://github.com/jcode-works/jcode-mimir.git",
|
|
29
29
|
"directory": "packages/mimir"
|
|
30
30
|
},
|
|
31
|
-
"homepage": "https://github.com/jcode-works/jcode-mimir
|
|
31
|
+
"homepage": "https://github.com/jcode-works/jcode-mimir#readme",
|
|
32
32
|
"bugs": {
|
|
33
33
|
"url": "https://github.com/jcode-works/jcode-mimir/issues"
|
|
34
34
|
},
|
|
@@ -65,7 +65,7 @@
|
|
|
65
65
|
"unpdf": "^1.4.0",
|
|
66
66
|
"yaml": "^2.8.1",
|
|
67
67
|
"zod": "^4.1.13",
|
|
68
|
-
"@jcode.labs/mimir-tts": "0.4.
|
|
68
|
+
"@jcode.labs/mimir-tts": "0.4.3"
|
|
69
69
|
},
|
|
70
70
|
"devDependencies": {
|
|
71
71
|
"@types/html-to-text": "^9.0.4",
|
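The `package.json` hunks above move the package from 0.4.2 to 0.4.3 and pin `@jcode.labs/mimir-tts` to the exact version `0.4.3`, with no range operator. A release script could check both facts mechanically. The sketch below is a hypothetical check, not part of the published package: `parseVersion`, `isPatchBump`, and `pinMatches` are illustrative names, and the parser deliberately accepts only plain `x.y.z` strings.

```ts
// Hypothetical release check; the Manifest type models only fields visible in this diff.
type Manifest = {
  name: string;
  version: string;
  dependencies: Record<string, string>;
};

// Accepts plain x.y.z only; prerelease and build metadata are out of scope here.
const PLAIN_VERSION = /^(\d+)\.(\d+)\.(\d+)$/;

function parseVersion(version: string): [number, number, number] {
  const match = PLAIN_VERSION.exec(version);
  if (match === null) {
    throw new Error(`not a plain x.y.z version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when only the patch component changed, and by exactly one.
function isPatchBump(from: string, to: string): boolean {
  const [fromMajor, fromMinor, fromPatch] = parseVersion(from);
  const [toMajor, toMinor, toPatch] = parseVersion(to);
  return fromMajor === toMajor && fromMinor === toMinor && toPatch === fromPatch + 1;
}

// True when the dependency is pinned to exactly the manifest's own version.
function pinMatches(manifest: Manifest, dependency: string): boolean {
  return manifest.dependencies[dependency] === manifest.version;
}

// Values taken from the new side of the diff.
const released: Manifest = {
  name: "@jcode.labs/mimir",
  version: "0.4.3",
  dependencies: { "@jcode.labs/mimir-tts": "0.4.3" },
};

const patchBump = isPatchBump("0.4.2", released.version);
const pinned = pinMatches(released, "@jcode.labs/mimir-tts");
```

A range such as `^0.4.3` would fail `pinMatches` by design, since the check compares strings exactly rather than resolving semver ranges.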