mindkeg-mcp 0.2.0 → 0.4.0

package/README.md CHANGED
@@ -32,16 +32,42 @@ Unlike traditional RAG systems that chunk large documents, Mind Keg stores **pre
32
32
  - API key authentication with per-repository access control
33
33
  - SQLite storage (zero dependencies, zero config)
34
34
  - Import/export for backup and migration
35
+ - **Enterprise security**: encryption at rest, audit logging, TTL/data retention, Prometheus monitoring, rate limiting, content integrity verification
35
36
 
36
37
  ## Quick Start
37
38
 
38
- ### Install
39
+ ### One-command setup
40
+
41
+ ```bash
42
+ npx mindkeg-mcp init
43
+ ```
44
+
45
+ This auto-detects your agent (Claude Code, Cursor, Windsurf), writes the MCP config, copies agent instructions, and runs a health check. That's it — open your agent and start coding.
46
+
47
+ **Options:**
48
+
49
+ ```bash
50
+ npx mindkeg-mcp init --agent cursor # Target a specific agent
51
+ npx mindkeg-mcp init --no-instructions # Skip copying AGENTS.md
52
+ npx mindkeg-mcp init --no-health-check # Skip the health check
53
+ ```
54
+
55
+ `init` is idempotent — safe to run multiple times. It merges with existing configs and never overwrites.
56
+
57
+ ### Manual setup
58
+
59
+ If you prefer to configure manually, or need HTTP mode:
60
+
61
+ <details>
62
+ <summary>Click to expand manual setup instructions</summary>
63
+
64
+ #### Install
39
65
 
40
66
  ```bash
41
67
  npm install -g mindkeg-mcp
42
68
  ```
43
69
 
44
- ### Create an API key
70
+ #### Create an API key
45
71
 
46
72
  ```bash
47
73
  mindkeg api-key create --name "My Laptop"
@@ -49,13 +75,11 @@ mindkeg api-key create --name "My Laptop"
49
75
  # mk_abc123...
50
76
  ```
51
77
 
52
- ### Connect your AI agent
78
+ #### Connect your AI agent
53
79
 
54
80
  Mind Keg works with any MCP-compatible AI coding agent. Choose your setup:
55
81
 
56
- #### Claude Code (stdio)
57
-
58
- Add to `~/.claude.json` or your project's MCP settings:
82
+ **Claude Code** — Add to `~/.claude.json` or your project's `.claude/mcp.json`:
59
83
 
60
84
  ```json
61
85
  {
@@ -71,9 +95,7 @@ Add to `~/.claude.json` or your project's MCP settings:
71
95
  }
72
96
  ```
73
97
 
74
- #### Cursor
75
-
76
- Add to your Cursor MCP settings (`.cursor/mcp.json` or global settings):
98
+ **Cursor** — Add to `.cursor/mcp.json` or global settings:
77
99
 
78
100
  ```json
79
101
  {
@@ -89,9 +111,7 @@ Add to your Cursor MCP settings (`.cursor/mcp.json` or global settings):
89
111
  }
90
112
  ```
91
113
 
92
- #### Windsurf
93
-
94
- Add to your Windsurf MCP configuration (`~/.codeium/windsurf/mcp_config.json`):
114
+ **Windsurf** — Add to `~/.codeium/windsurf/mcp_config.json`:
95
115
 
96
116
  ```json
97
117
  {
@@ -107,9 +127,7 @@ Add to your Windsurf MCP configuration (`~/.codeium/windsurf/mcp_config.json`):
107
127
  }
108
128
  ```
109
129
 
110
- #### HTTP mode (any MCP client)
111
-
112
- For agents that connect via HTTP instead of stdio:
130
+ **HTTP mode (any MCP client):**
113
131
 
114
132
  ```bash
115
133
  MINDKEG_API_KEY=mk_your_key mindkeg serve --http
@@ -130,11 +148,9 @@ MINDKEG_API_KEY=mk_your_key mindkeg serve --http
130
148
  }
131
149
  ```
132
150
 
133
- #### Other MCP-compatible agents
134
-
135
- Mind Keg works with any agent that supports the [Model Context Protocol](https://modelcontextprotocol.io) — including Codex CLI, Gemini CLI, GitHub Copilot, and more. Use the stdio config above adapted to your agent's MCP settings format.
151
+ **Other MCP-compatible agents** — Mind Keg works with any agent that supports the [Model Context Protocol](https://modelcontextprotocol.io) — including Codex CLI, Gemini CLI, GitHub Copilot, and more. Use the stdio config above adapted to your agent's MCP settings format.
136
152
 
137
- ### Add Mind Keg instructions to your repository
153
+ #### Add Mind Keg instructions to your repository
138
154
 
139
155
  Copy `templates/AGENTS.md` to the root of any repository where you want agents to use Mind Keg.
140
156
 
@@ -142,10 +158,13 @@ Copy `templates/AGENTS.md` to the root of any repository where you want agents t
142
158
 
143
159
  > **Claude Code only**: Claude Code doesn't auto-load `AGENTS.md` natively. Add `@AGENTS.md` to your `CLAUDE.md` to bridge it.
144
160
 
161
+ </details>
162
+
145
163
  ## MCP Tools
146
164
 
147
165
  | Tool | Description |
148
166
  |----------------------|------------------------------------------------------|
167
+ | `get_context` | Prime an agent session with all relevant learnings — ranked, scoped, and budget-controlled |
149
168
  | `store_learning` | Store a new atomic learning (repo, workspace, or global scope) |
150
169
  | `search_learnings` | Semantic/keyword search for relevant learnings |
151
170
  | `update_learning` | Update content, category, or tags |
@@ -158,6 +177,14 @@ Copy `templates/AGENTS.md` to the root of any repository where you want agents t
158
177
  ## CLI Commands
159
178
 
160
179
  ```bash
180
+ # Quick setup (auto-detects agent, writes config, copies instructions)
181
+ mindkeg init
182
+ mindkeg init --agent cursor
183
+
184
+ # Database statistics
185
+ mindkeg stats
186
+ mindkeg stats --json
187
+
161
188
  # Start in stdio mode (for local agent connections)
162
189
  mindkeg serve --stdio
163
190
 
@@ -173,9 +200,25 @@ mindkeg api-key revoke <prefix>
173
200
  # Database
174
201
  mindkeg migrate
175
202
 
203
+ # Near-duplicate detection (backfill existing learnings)
204
+ mindkeg dedup-scan
205
+ mindkeg dedup-scan --dry-run
206
+
176
207
  # Backup and restore
177
208
  mindkeg export --output backup.json
178
209
  mindkeg import backup.json --regenerate-embeddings
210
+
211
+ # Data retention
212
+ mindkeg purge --older-than 90 # Purge learnings older than 90 days
213
+ mindkeg purge --repository /path/repo # Purge all learnings for a repo
214
+ mindkeg purge --all --confirm # Purge everything (requires --confirm)
215
+
216
+ # Encryption at rest
217
+ mindkeg encrypt-db # Encrypt existing database (requires MINDKEG_ENCRYPTION_KEY)
218
+ mindkeg decrypt-db # Decrypt existing database (requires MINDKEG_ENCRYPTION_KEY)
219
+
220
+ # Integrity backfill
221
+ mindkeg backfill-integrity # Compute SHA-256 hashes for legacy learnings
179
222
  ```
180
223
 
181
224
  ## Configuration
@@ -213,24 +256,131 @@ export MINDKEG_EMBEDDING_PROVIDER=none
213
256
 
214
257
  Disables semantic search and falls back to SQLite FTS5 full-text search — all other features work identically.
215
258
 
259
+ ## Enterprise Security
260
+
261
+ Mind Keg 0.4.0 ships a suite of security features suitable for corporate and regulated environments.
262
+
263
+ ### Encryption at Rest
264
+
265
+ Encrypt `content` and `embedding` fields using AES-256-GCM. All other fields (category, tags, timestamps) remain plaintext.
266
+
267
+ ```bash
268
+ # Generate a 256-bit key
269
+ node -e "console.log(require('crypto').randomBytes(32).toString('base64'))"
270
+
271
+ export MINDKEG_ENCRYPTION_KEY=<your-base64-key>
272
+ mindkeg serve --stdio
273
+ ```
274
+
275
+ To encrypt an existing database in-place:
276
+
277
+ ```bash
278
+ MINDKEG_ENCRYPTION_KEY=<key> mindkeg encrypt-db
279
+ # Creates a backup automatically before operating
280
+ ```
281
+
282
+ > **Note**: FTS5 keyword search does not work when encryption is enabled. Use FastEmbed or OpenAI embedding providers for search.
283
+
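Reviewer sketch, not part of the package: the added text says `content` and `embedding` are encrypted with AES-256-GCM using a base64 256-bit key. The following is a minimal illustration of field-level AES-256-GCM with Node's built-in `crypto` module. The function names `encryptField` and `decryptField` and the `iv | tag | ciphertext` storage layout are assumptions made for this example; the diff does not show how Mind Keg actually lays out encrypted fields.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12;  // 96-bit nonce, the size recommended for GCM
const TAG_BYTES = 16; // GCM authentication tag length

// Encrypt one field value. Output layout (assumed): base64(iv | tag | ciphertext).
function encryptField(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_BYTES); // fresh nonce for every value
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Decrypt a value produced by encryptField. Throws if the data or tag was modified.
function decryptField(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// In a real deployment the key would be decoded from MINDKEG_ENCRYPTION_KEY (base64);
// a random 32-byte key is used here so the sketch runs on its own.
const key = randomBytes(32);
const sealed = encryptField("Prefer pnpm over npm in this repo", key);
const opened = decryptField(sealed, key);
```

Because every value gets a random nonce, identical plaintexts encrypt to different ciphertexts, which is consistent with the note above that FTS5 keyword search is unavailable once encryption is enabled.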
284
+ ### Audit Logging
285
+
286
+ All MCP tool invocations are written to a structured JSON lines audit log (SIEM-compatible).
287
+
288
+ ```bash
289
+ export MINDKEG_AUDIT_LOG=~/.mindkeg/audit.jsonl # default
290
+ # Or: MINDKEG_AUDIT_LOG=stderr (write to stderr alongside app logs)
291
+ # Or: MINDKEG_AUDIT_LOG=none (disable)
292
+ ```
293
+
294
+ Each audit entry contains: `timestamp` (ISO 8601), `action`, `actor` (API key prefix), `resource_id`, `result`, `client` transport metadata. Sensitive fields (`content`, `embedding`) are never logged.
295
+
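Reviewer sketch, not part of the package: to make the entry format concrete, the following builds one JSON line from the fields listed above and drops `content` and `embedding` before serializing. The `AuditEntry` interface, the `result` values, the shape of `client`, and the helper `toAuditLine` are hypothetical; the real schema may differ.

```typescript
// Hypothetical entry shape derived from the field list in the README text.
interface AuditEntry {
  timestamp: string;            // ISO 8601
  action: string;               // e.g. the MCP tool name
  actor: string;                // API key prefix, never the full key
  resource_id: string | null;
  result: "success" | "error";  // assumed values
  client: { transport: "stdio" | "http" }; // assumed metadata shape
}

// Fields the README says are never logged.
const SENSITIVE_FIELDS = ["content", "embedding"];

// Serialize an entry as a single JSON line, filtering sensitive fields defensively.
function toAuditLine(entry: AuditEntry & Record<string, unknown>): string {
  const safe: Record<string, unknown> = {};
  for (const [field, value] of Object.entries(entry)) {
    if (!SENSITIVE_FIELDS.includes(field)) {
      safe[field] = value;
    }
  }
  return JSON.stringify(safe);
}

const line = toAuditLine({
  timestamp: new Date(0).toISOString(),
  action: "store_learning",
  actor: "mk_abc123",
  resource_id: null,
  result: "success",
  client: { transport: "stdio" },
  content: "this value must not reach the log",
});
```

One object per line with no embedded newlines is what lets a SIEM ingest the file with a plain line-oriented JSON parser.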
296
+ ### TTL and Data Retention
297
+
298
+ Set a global default TTL or a per-learning TTL to automatically expire old entries.
299
+
300
+ ```bash
301
+ export MINDKEG_DEFAULT_TTL_DAYS=365 # Expire all learnings after 1 year by default
302
+ export MINDKEG_PURGE_INTERVAL_HOURS=24 # Run purge every 24 hours (default)
303
+ ```
304
+
305
+ Per-learning TTL overrides the global default:
306
+
307
+ ```json
308
+ { "content": "...", "ttl_days": 30 }
309
+ ```
310
+
311
+ Manual purge:
312
+
313
+ ```bash
314
+ mindkeg purge --older-than 180 --confirm
315
+ ```
316
+
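Reviewer sketch, not part of the package: the override rule can be illustrated as follows. It assumes a per-learning `ttl_days` of `null` falls back to the global default, that no TTL at either level means the learning never expires, and that the clock starts at `updated_at` (the Data Model table in this diff says TTL expiry anchors to that field). The name `isExpired` and the inclusive boundary are choices made for this example.

```typescript
interface LearningTtlFields {
  updated_at: string;      // ISO 8601 timestamp
  ttl_days: number | null; // per-learning override, null = use the global default
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when the learning has outlived its effective TTL.
function isExpired(
  learning: LearningTtlFields,
  defaultTtlDays: number | null, // e.g. parsed from MINDKEG_DEFAULT_TTL_DAYS
  now: Date,
): boolean {
  const ttlDays = learning.ttl_days ?? defaultTtlDays;
  if (ttlDays === null) {
    return false; // assumed: no TTL configured anywhere means keep forever
  }
  const expiresAtMs = Date.parse(learning.updated_at) + ttlDays * MS_PER_DAY;
  return now.getTime() >= expiresAtMs;
}

const now = new Date("2025-03-01T00:00:00Z");
const shortLived = { updated_at: "2025-01-01T00:00:00Z", ttl_days: 30 };
const usesDefault = { updated_at: "2025-01-01T00:00:00Z", ttl_days: null };

const shortLivedExpired = isExpired(shortLived, 365, now);   // 30-day override applies
const usesDefaultExpired = isExpired(usesDefault, 365, now); // 365-day default applies
```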
317
+ ### Monitoring
318
+
319
+ HTTP transport exposes Prometheus-compatible endpoints:
320
+
321
+ ```
322
+ GET /health → JSON: { status, version, uptime, database }
323
+ GET /metrics → Prometheus text format
324
+ ```
325
+
326
+ Both endpoints are unauthenticated by default. Set `MINDKEG_METRICS_AUTH=true` to require API key auth.
327
+
328
+ Metrics exposed: `mindkeg_learnings_total`, `mindkeg_tool_invocations_total`, `mindkeg_tool_duration_seconds`, `mindkeg_errors_total`, `mindkeg_uptime_seconds`, `mindkeg_search_latency_seconds`.
329
+
330
+ ### Rate Limiting
331
+
332
+ HTTP transport enforces per-API-key token bucket rate limits with separate write and read buckets.
333
+
334
+ ```bash
335
+ export MINDKEG_RATE_LIMIT_WRITE_RPM=100 # default: 100 write req/min per key
336
+ export MINDKEG_RATE_LIMIT_READ_RPM=300 # default: 300 read req/min per key
337
+ ```
338
+
339
+ Returns HTTP 429 with `Retry-After` header when exceeded. stdio transport is not rate-limited.
340
+
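Reviewer sketch, not part of the package: a token bucket with separate write and read buckets per API key could look like the following. The class names, the decision to start each bucket full, and the way the `Retry-After` value is computed are assumptions made for this example; only the per-key, per-direction bucket structure and the default limits come from the README text.

```typescript
// One bucket that refills continuously at `capacity` tokens per minute.
class TokenBucket {
  private readonly capacity: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, nowMs: number) {
    this.capacity = capacity;
    this.tokens = capacity; // assumed: buckets start full
    this.lastRefillMs = nowMs;
  }

  // Returns 0 when the request is allowed, otherwise whole seconds to wait.
  tryConsume(nowMs: number): number {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    const refill = (elapsedMs * this.capacity) / 60000;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    const msUntilNextToken = ((1 - this.tokens) * 60000) / this.capacity;
    return Math.ceil(msUntilNextToken / 1000);
  }
}

type RequestKind = "write" | "read";

// Keeps independent write and read buckets for every API key.
class RateLimiter {
  private readonly limits: Record<RequestKind, number>;
  private readonly buckets = new Map<string, TokenBucket>();

  constructor(writeRpm: number, readRpm: number) {
    this.limits = { write: writeRpm, read: readRpm };
  }

  // Returns 0 if allowed; otherwise a value suitable for a Retry-After header.
  check(apiKey: string, kind: RequestKind, nowMs: number): number {
    const bucketId = `${apiKey}:${kind}`;
    let bucket = this.buckets.get(bucketId);
    if (bucket === undefined) {
      bucket = new TokenBucket(this.limits[kind], nowMs);
      this.buckets.set(bucketId, bucket);
    }
    return bucket.tryConsume(nowMs);
  }
}

// Defaults from the README: 100 write and 300 read requests per minute per key.
const limiter = new RateLimiter(100, 300);
const firstWrite = limiter.check("mk_abc123", "write", Date.now());
```

A non-zero return value would map to an HTTP 429 response whose `Retry-After` header carries that number of seconds.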
341
+ ### Supply Chain Security
342
+
343
+ - npm packages published with `--provenance` (Sigstore attestation via GitHub Actions)
344
+ - CycloneDX SBOM generated and uploaded as a release asset on every GitHub release
345
+ - Cosign signatures for npm tarballs uploaded as release assets
346
+
347
+ ### Content Integrity
348
+
349
+ SHA-256 integrity hashes are computed and stored for every learning on write. Verify on demand:
350
+
351
+ ```json
352
+ { "query": "...", "verify_integrity": true }
353
+ ```
354
+
355
+ Each result includes `integrity_valid: true | false | null` (`null` for legacy learnings without a stored hash).
356
+
357
+ Backfill integrity hashes for existing learnings:
358
+
359
+ ```bash
360
+ mindkeg backfill-integrity
361
+ ```
362
+
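Reviewer sketch, not part of the package: the three-state `integrity_valid` result can be illustrated with Node's built-in `crypto` module. The Data Model table describes the hash as covering "canonical fields" without listing them, so the field set (`content`, `category`, `tags`), the canonical serialization, and the function names below are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Assumed canonical field set; the diff does not say which fields are hashed.
interface CanonicalFields {
  content: string;
  category: string;
  tags: string[];
}

// Hash a stable serialization so the same logical learning always yields the same digest.
function computeIntegrityHash(fields: CanonicalFields): string {
  const canonical = JSON.stringify([
    fields.content,
    fields.category,
    [...fields.tags].sort(), // tag order should not affect the hash (assumption)
  ]);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Mirrors integrity_valid: true, false, or null for legacy rows without a stored hash.
function verifyIntegrity(fields: CanonicalFields, storedHash: string | null): boolean | null {
  if (storedHash === null) {
    return null;
  }
  return computeIntegrityHash(fields) === storedHash;
}

const learning = { content: "Run migrations before tests", category: "conventions", tags: ["db", "ci"] };
const storedHash = computeIntegrityHash(learning);

const untouched = verifyIntegrity(learning, storedHash);
const modified = verifyIntegrity({ ...learning, content: "Skip migrations" }, storedHash);
const legacy = verifyIntegrity(learning, null);
```

Under these assumptions, `mindkeg backfill-integrity` would amount to computing and storing the hash for every row where it is currently `null`.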
216
363
  ## Data Model
217
364
 
218
365
  Each learning contains:
219
366
 
220
- | Field | Type | Notes |
221
- |--------------|-------------------|------------------------------------------------|
222
- | `id` | UUID | Auto-generated |
223
- | `content` | string (max 500) | The atomic learning text |
224
- | `category` | enum | One of 6 categories |
225
- | `tags` | string[] | Free-form labels |
226
- | `repository` | string or null | Repo path; null = workspace or global |
227
- | `workspace` | string or null | Workspace path; null = repo-specific or global |
228
- | `group_id` | UUID or null | Link related learnings |
229
- | `source` | string | Who created this (e.g., "claude-code") |
230
- | `status` | enum | `active` or `deprecated` |
231
- | `stale_flag` | boolean | Agent-flagged as potentially outdated |
232
- | `created_at` | ISO 8601 | Auto-set on creation |
233
- | `updated_at` | ISO 8601 | Auto-updated on modification |
367
+ | Field | Type | Notes |
368
+ |-------------------|-------------------|-------------------------------------------------------------|
369
+ | `id` | UUID | Auto-generated |
370
+ | `content` | string (max 500) | The atomic learning text (sanitized on write) |
371
+ | `category` | enum | One of 6 categories |
372
+ | `tags` | string[] | Free-form labels |
373
+ | `repository` | string or null | Repo path; null = workspace or global |
374
+ | `workspace` | string or null | Workspace path; null = repo-specific or global |
375
+ | `group_id` | UUID or null | Link related learnings |
376
+ | `source` | string | Who created this (e.g., "claude-code") |
377
+ | `status` | enum | `active` or `deprecated` |
378
+ | `stale_flag` | boolean | Agent-flagged as potentially outdated |
379
+ | `ttl_days` | integer or null | Per-learning TTL; overrides global `MINDKEG_DEFAULT_TTL_DAYS` |
380
+ | `source_agent` | string or null | Agent name for provenance tracking |
381
+ | `integrity_hash` | string or null | SHA-256 hash of canonical fields for tamper detection |
382
+ | `created_at` | ISO 8601 | Auto-set on creation |
383
+ | `updated_at` | ISO 8601 | Auto-updated on modification; TTL expiry anchors to this |
234
384
 
235
385
  ## Scoping
236
386
 
@@ -295,15 +445,20 @@ Mind Keg works fully offline by default. FastEmbed provides free, local semantic
295
445
 
296
446
  ```
297
447
  CLI (Commander.js)
298
- └── serve / api-key / migrate / export / import
448
+ └── init / stats / serve / api-key / migrate / export / import / dedup-scan
449
+ purge / encrypt-db / decrypt-db / backfill-integrity
299
450
 
300
451
  src/
301
452
  index.ts Entry point, stdio + HTTP transports
302
453
  server.ts MCP server + tool registration
303
454
  config.ts Config loading (env vars → defaults)
455
+ audit/ Structured JSON lines audit logger
304
456
  auth/ API key generation + validation middleware
305
- tools/ One file per MCP tool (8 tools)
306
- services/ LearningService + EmbeddingService
457
+ crypto/ AES-256-GCM field encryption
458
+ monitoring/ Prometheus metrics + /health endpoint
459
+ security/ Content sanitization, integrity hashing, rate limiter
460
+ tools/ One file per MCP tool (9 tools) + shared tool-utils
461
+ services/ LearningService + EmbeddingService + PurgeService
307
462
  storage/ StorageAdapter interface + SQLite impl
308
463
  models/ Zod schemas + TypeScript types
309
464
  utils/ Logger (pino → stderr) + error classes