gnosys-mcp 0.3.2 → 1.1.0

Files changed (53)
  1. package/README.md +521 -165
  2. package/dist/cli.js +615 -21
  3. package/dist/cli.js.map +1 -1
  4. package/dist/index.js +244 -21
  5. package/dist/index.js.map +1 -1
  6. package/dist/lib/ask.d.ts +63 -0
  7. package/dist/lib/ask.d.ts.map +1 -0
  8. package/dist/lib/ask.js +227 -0
  9. package/dist/lib/ask.js.map +1 -0
  10. package/dist/lib/config.d.ts +154 -0
  11. package/dist/lib/config.d.ts.map +1 -0
  12. package/dist/lib/config.js +265 -0
  13. package/dist/lib/config.js.map +1 -0
  14. package/dist/lib/dashboard.d.ts +55 -0
  15. package/dist/lib/dashboard.d.ts.map +1 -0
  16. package/dist/lib/dashboard.js +184 -0
  17. package/dist/lib/dashboard.js.map +1 -0
  18. package/dist/lib/embeddings.d.ts +85 -0
  19. package/dist/lib/embeddings.d.ts.map +1 -0
  20. package/dist/lib/embeddings.js +213 -0
  21. package/dist/lib/embeddings.js.map +1 -0
  22. package/dist/lib/graph.d.ts +50 -0
  23. package/dist/lib/graph.d.ts.map +1 -0
  24. package/dist/lib/graph.js +118 -0
  25. package/dist/lib/graph.js.map +1 -0
  26. package/dist/lib/hybridSearch.d.ts +67 -0
  27. package/dist/lib/hybridSearch.d.ts.map +1 -0
  28. package/dist/lib/hybridSearch.js +211 -0
  29. package/dist/lib/hybridSearch.js.map +1 -0
  30. package/dist/lib/import.js +1 -1
  31. package/dist/lib/import.js.map +1 -1
  32. package/dist/lib/ingest.d.ts +6 -2
  33. package/dist/lib/ingest.d.ts.map +1 -1
  34. package/dist/lib/ingest.js +25 -24
  35. package/dist/lib/ingest.js.map +1 -1
  36. package/dist/lib/llm.d.ts +84 -0
  37. package/dist/lib/llm.d.ts.map +1 -0
  38. package/dist/lib/llm.js +380 -0
  39. package/dist/lib/llm.js.map +1 -0
  40. package/dist/lib/maintenance.d.ts +114 -0
  41. package/dist/lib/maintenance.d.ts.map +1 -0
  42. package/dist/lib/maintenance.js +476 -0
  43. package/dist/lib/maintenance.js.map +1 -0
  44. package/dist/lib/retry.d.ts +24 -0
  45. package/dist/lib/retry.d.ts.map +1 -0
  46. package/dist/lib/retry.js +60 -0
  47. package/dist/lib/retry.js.map +1 -0
  48. package/dist/lib/store.d.ts +4 -0
  49. package/dist/lib/store.d.ts.map +1 -1
  50. package/dist/lib/store.js +46 -5
  51. package/dist/lib/store.js.map +1 -1
  52. package/package.json +13 -4
  53. package/prompts/synthesize.md +21 -0
package/README.md CHANGED
@@ -2,8 +2,6 @@
2
2
  <img src="docs/logo.svg" alt="Gnosys" width="200">
3
3
  </p>
4
4
 
5
- <p align="center"><strong>LLM-native persistent memory for AI agents.</strong></p>
6
-
7
5
  <p align="center">
8
6
  <a href="https://www.npmjs.com/package/gnosys-mcp"><img src="https://img.shields.io/npm/v/gnosys-mcp.svg" alt="npm version"></a>
9
7
  <a href="https://github.com/proticom/gnosys-mcp/actions"><img src="https://github.com/proticom/gnosys-mcp/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
@@ -13,156 +11,179 @@
13
11
 
14
12
  ---
15
13
 
16
- Gnosys gives AI agents long-term memory that survives across sessions. Memories are atomic markdown files with structured frontmatter, stored in plain directories, versioned by git, and searchable via FTS5. No database, no vector store, no external services — just files.
14
+ ### Gnosys: Persistent Memory for AI Agents (and Universal Transparent Knowledge Engine)
17
15
 
18
- Gnosys works as an MCP server (for Cursor, Claude Desktop, or any MCP-compatible agent) and as a standalone CLI.
16
+ **Gnosys** gives LLMs and humans a knowledge layer that survives across sessions and scales to real-world datasets.
19
17
 
20
- ## Quick Start
18
+ Every piece of knowledge is stored as an atomic Markdown file with rich YAML frontmatter inside a `.gnosys/` directory. Git versions every change. SQLite FTS5 delivers instant keyword search. The entire folder is a fully functional Obsidian vault for browsing, wikilinking, graphing, and editing.
21
19
 
22
- ```bash
23
- # Install globally from npm
24
- npm install -g gnosys-mcp
20
+ It runs as a CLI and a complete MCP server that drops straight into Cursor, Claude Desktop, Claude Code, or any MCP client.
25
21
 
26
- # Initialize a store in your project
27
- cd /path/to/your/project
28
- gnosys init
22
+ **Beyond agents**: Gnosys turns any structured dataset into a connected, versioned knowledge graph.
23
+ NVD/CVE Database: 200k+ vulnerabilities auto-linked to packages, exploits, patches, and supersession history. Ask "which of our dependencies have active unpatched criticals?"
24
+ USDA FoodData Central: ~8k foods atomized with wikilinks to nutrients and substitutions. Ask "high-protein, low-sodium, high-potassium alternatives to X?"
29
25
 
30
- # Add your first memory
31
- gnosys add "We decided to use PostgreSQL for the main database because of its JSON support and mature ecosystem"
26
+ No vector DBs. No black boxes. No external services. Just files, Git, and Obsidian — the way knowledge should be.
32
27
 
33
- # Find memories later
34
- gnosys discover "database selection"
35
- ```
28
+ ---
36
29
 
37
- ## How It Works
30
+ ## Why Gnosys?
38
31
 
39
- A Gnosys store is a `.gnosys/` directory inside your project. It contains markdown files organized by category:
32
+ Most "memory for LLMs" solutions use vector databases, embeddings, or proprietary services. They're opaque: you can't see what the model remembers, can't edit it, can't version it, can't share it.
40
33
 
41
- ```
42
- your-project/
43
- .gnosys/
44
- decisions/
45
- use-postgresql.md
46
- jwt-over-sessions.md
47
- architecture/
48
- three-layer-design.md
49
- concepts/
50
- memory-decay.md
51
- .gnosys/ # internal config
52
- tags.json # tag registry
53
- reinforcement.log
54
- CHANGELOG.md
34
+ Gnosys takes a different approach: every memory is a plain Markdown file with YAML frontmatter. The entire knowledge base is a Git repository and an Obsidian vault. You can read it, edit it, version it, grep it, and back it up with the tools you already use.
35
+
36
+ **What makes it different:**
37
+
38
+ - **Transparent** — every memory is a human-readable `.md` file. No embeddings, no binary blobs.
39
+ - **Freeform Ask** — ask natural-language questions and get synthesized answers with Obsidian wikilink citations from the entire vault.
40
+ - **Hybrid Search** — combines FTS5 keyword search with semantic embeddings via Reciprocal Rank Fusion (RRF).
41
+ - **Versioned** — Git auto-commits every write. Full history, rollback, and diff support.
42
+ - **Obsidian-native** — the `.gnosys/` folder is a real vault. Graph view, wikilinks, tags, backlinks — all work.
43
+ - **MCP-first** — drops into Cursor, Claude Desktop, Claude Code, Codex, or any MCP client with one config line.
44
+ - **Bulk import** — CSV, JSON, JSONL. Import entire datasets (USDA, NVD, your internal docs) in seconds.
45
+ - **Layered stores** — project, personal, global, and optional read-only stores stacked by precedence.
46
+ - **Zero infrastructure** — no databases, no Docker (unless you want it), no cloud services. Just `npm install`.
47
+
48
+ ---
49
+
50
+ ## Real-World Use Cases
51
+
52
+ ### USDA FoodData Central — 100 foods imported in 0.6s
53
+
54
+ ![USDA import: 100 Foundation Foods with nutrient data, wikilinks to food categories](docs/screenshots/usda-import-result.png)
55
+
56
+ ```bash
57
+ gnosys import usda-foods.json \
58
+ --format json \
59
+ --mapping '{"title":"title","category":"category","content":"content","tags":"tags","relevance":"relevance"}' \
60
+ --mode structured --skip-existing
55
61
  ```
56
62
 
57
- Each memory is a markdown file with YAML frontmatter:
63
+ Each food becomes an atomic memory with nutrient data and `[[wikilinks]]` to food categories:
58
64
 
59
65
  ```yaml
60
66
  ---
61
- id: deci-001
62
- title: "Use PostgreSQL for Main Database"
63
- category: decisions
67
+ title: "Almond butter, creamy"
68
+ category: usda-foods
64
69
  tags:
65
- domain: [database, backend]
66
- type: [decision]
67
- relevance: "database selection postgres sql json storage persistence"
68
- author: human+ai
69
- authority: declared
70
- confidence: 0.9
71
- created: 2026-03-01
72
- modified: 2026-03-01
73
- status: active
74
- supersedes: null
70
+ domain: [food, nutrition, usda]
71
+ relevance: "almond butter creamy food nutrition usda fdc nutrient diet dietary protein"
75
72
  ---
73
+ # Almond butter, creamy
76
74
 
77
- # Use PostgreSQL for Main Database
75
+ **Food Category:** [[General]]
78
76
 
79
- We chose PostgreSQL over MySQL and SQLite because...
77
+ ## Key Nutrients (per 100g)
78
+ - Protein (g): 20.4 G
79
+ - Total Fat (g): 55.7 G
80
+ - Calcium (mg): 264 MG
81
+ - Potassium (mg): 699 MG
80
82
  ```
81
83
 
82
- Key fields:
83
-
84
- - **relevance**: A keyword cloud that powers `discover`. Describe the contexts where this memory would be useful.
85
- - **confidence**: 0–1 score. How certain is this knowledge? Observations might be 0.6; declared decisions are 0.9.
86
- - **authority**: Who established this? `declared` (human decided), `observed` (AI noticed), `imported` (from external source), `inferred` (AI deduced).
87
- - **status**: `active`, `archived`, or `superseded`. Superseded memories link to their replacement via `superseded_by`.
84
+ ### NVD/CVE Database — 20 vulnerabilities with CVSS scores and affected products
88
85
 
89
- ## Using with Obsidian
86
+ ![NVD import: CVEs with CVSS scores, severity tags, wikilinks to affected products](docs/screenshots/nvd-import-result.png)
90
87
 
91
- A Gnosys store is a valid Obsidian vault. Open your `.gnosys/` directory in Obsidian and you get full browsing, graph view, tag filtering, and search — with zero configuration. This is the recommended way for humans to browse and explore the knowledge base.
88
+ ```bash
89
+ gnosys import nvd-cves.json \
90
+ --format json \
91
+ --mapping '{"title":"title","category":"category","content":"content","tags":"tags","relevance":"relevance"}' \
92
+ --mode structured --skip-existing
93
+ ```
92
94
 
93
- 1. Open Obsidian
94
- 2. Click "Open folder as vault"
95
- 3. Select your project's `.gnosys/` directory
96
- 4. Browse, search, and explore your memories visually
95
+ Each CVE links to affected packages via wikilinks:
97
96
 
98
- Edits made in Obsidian are picked up automatically by Gnosys (the filesystem is the source of truth).
97
+ ```yaml
98
+ ---
99
+ title: CVE-1999-0095
100
+ tags:
101
+ domain: [cve, vulnerability, security, high]
102
+ relevance: "cve-1999-0095 cve vulnerability security nvd patch exploit high eric_allman sendmail"
103
+ ---
104
+ # CVE-1999-0095
99
105
 
100
- ## CLI Reference
106
+ The debug command in Sendmail is enabled, allowing attackers to execute commands as root.
101
107
 
102
- ```bash
103
- npm install -g gnosys-mcp
108
+ **CVSS Score:** 10.0 (HIGH)
109
+ **Affected:** [[eric_allman/sendmail]]
104
110
  ```
105
111
 
106
- ### Core Commands
112
+ See [DEMO.md](DEMO.md) for the full step-by-step walkthrough.
107
113
 
108
- **`gnosys init [--directory <dir>]`**
109
- Initialize a new `.gnosys` store. Creates the directory structure, default tag registry, and a git repository.
114
+ ---
110
115
 
111
- **`gnosys add <input> [--author human|ai|human+ai] [--authority declared|observed] [--store project|personal|global]`**
112
- Add a memory using natural language. An LLM structures your input into an atomic memory with proper frontmatter, category, and tags. Requires `ANTHROPIC_API_KEY`.
116
+ ## Quick Start
113
117
 
114
- **`gnosys add-structured --title <title> --category <category> --content <content> [--tags <json>] [--relevance <keywords>] [--confidence <n>]`**
115
- Add a memory with explicit fields. No LLM needed — you provide the structure directly.
118
+ ```bash
119
+ # Install
120
+ npm install -g gnosys-mcp
116
121
 
117
- **`gnosys discover <query> [--limit <n>]`**
118
- Find relevant memories by keyword. Searches relevance clouds, titles, and tags. Returns metadata only (no file contents). This is the primary entry point for agents starting a task.
122
+ # Initialize a store in your project
123
+ cd your-project
124
+ gnosys init
119
125
 
120
- **`gnosys search <query> [--limit <n>]`**
121
- Full-text search across all memories. Returns matching paths with context snippets.
126
+ # Add a memory (uses LLM to structure it — needs Anthropic key or Ollama)
127
+ gnosys add "We chose PostgreSQL over MySQL for its JSON support and mature ecosystem"
122
128
 
123
- **`gnosys read <path>`**
124
- Read a specific memory. Supports layer-prefixed paths: `project:decisions/auth.md`.
129
+ # Or add without an LLM
130
+ gnosys add-structured --title "Use PostgreSQL" --category decisions \
131
+ --content "Chosen for JSON support and mature ecosystem" \
132
+ --relevance "database postgres sql json storage"
125
133
 
126
- **`gnosys list [--category <cat>] [--tag <tag>] [--store <store>]`**
127
- List all memories, optionally filtered.
134
+ # Find memories later
135
+ gnosys discover "database selection"
128
136
 
129
- **`gnosys update <path> [--title <t>] [--status active|archived|superseded] [--confidence <n>] [--relevance <kw>] [--supersedes <id>] [--superseded-by <id>] [--content <md>]`**
130
- Update a memory's frontmatter or content. Handles supersession cross-linking automatically.
137
+ # Full-text search
138
+ gnosys search "PostgreSQL"
139
+ ```
131
140
 
132
- **`gnosys reinforce <memoryId> --signal useful|not_relevant|outdated [--context <why>]`**
133
- Signal whether a memory was helpful. `useful` resets decay; `not_relevant` logs routing feedback; `outdated` flags for review.
141
+ ---
134
142
 
135
- **`gnosys stale [--days <n>] [--limit <n>]`**
136
- Find memories not modified within N days (default: 90). Useful for periodic review.
143
+ ## Installation
137
144
 
138
- **`gnosys commit-context <context> [--dry-run] [--store <store>]`**
139
- Extract atomic memories from a conversation context. Checks existing memories for duplicates — only adds what's genuinely new. Use `--dry-run` to preview without writing. Requires `ANTHROPIC_API_KEY`.
145
+ ### npm (recommended)
140
146
 
141
- **`gnosys tags`**
142
- List all tags in the registry, grouped by category.
147
+ ```bash
148
+ npm install -g gnosys-mcp
149
+ ```
143
150
 
144
- **`gnosys tags-add --category <cat> --tag <tag>`**
145
- Add a new tag to the registry.
151
+ ### Docker
146
152
 
147
- **`gnosys stores`**
148
- Show all active stores with their layers, paths, and write permissions.
153
+ ```bash
154
+ # Build the image
155
+ docker build -t gnosys .
149
156
 
150
- **`gnosys serve`**
151
- Start the MCP server in stdio mode (used by editors and agent runtimes).
157
+ # Initialize a store
158
+ docker run -v $(pwd):/data gnosys init
152
159
 
153
- ### Getting Help
160
+ # Import data
161
+ docker run -v $(pwd):/data gnosys import data.json --format json \
162
+ --mapping '{"name":"title","type":"category","notes":"content"}' \
163
+ --mode structured
164
+
165
+ # Start the MCP server
166
+ docker run -v $(pwd):/data gnosys serve
167
+ ```
168
+
169
+ Or with Docker Compose:
154
170
 
155
171
  ```bash
156
- gnosys --help # List all commands
157
- gnosys help <command> # Detailed help for a command
158
- gnosys <command> --help # Same as above
172
+ # Start the MCP server (mounts current directory)
173
+ docker compose up
174
+
175
+ # Run any CLI command
176
+ docker compose run gnosys search "my query"
177
+ docker compose run gnosys import data.json --format json --mapping '...'
159
178
  ```
160
179
 
180
+ ---
181
+
161
182
  ## MCP Server Setup
162
183
 
163
184
  ### Claude Desktop
164
185
 
165
- Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or the equivalent config on your platform:
186
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
166
187
 
167
188
  ```json
168
189
  {
@@ -170,9 +191,7 @@ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
170
191
  "gnosys": {
171
192
  "command": "npx",
172
193
  "args": ["gnosys-mcp"],
173
- "env": {
174
- "ANTHROPIC_API_KEY": "your-key-here"
175
- }
194
+ "env": { "ANTHROPIC_API_KEY": "your-key-here" }
176
195
  }
177
196
  }
178
197
  }
@@ -180,7 +199,7 @@ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
180
199
 
181
200
  ### Cursor
182
201
 
183
- Add to your Cursor MCP settings (`.cursor/mcp.json` in your project or global config):
202
+ Add to `.cursor/mcp.json`:
184
203
 
185
204
  ```json
186
205
  {
@@ -188,9 +207,7 @@ Add to your Cursor MCP settings (`.cursor/mcp.json` in your project or global co
188
207
  "gnosys": {
189
208
  "command": "npx",
190
209
  "args": ["gnosys-mcp"],
191
- "env": {
192
- "ANTHROPIC_API_KEY": "your-key-here"
193
- }
210
+ "env": { "ANTHROPIC_API_KEY": "your-key-here" }
194
211
  }
195
212
  }
196
213
  }
@@ -204,7 +221,7 @@ claude mcp add gnosys npx gnosys-mcp
204
221
 
205
222
  ### Codex
206
223
 
207
- Add to `~/.codex/config.toml` or `.codex/config.toml` in your project:
224
+ Add to `.codex/config.toml`:
208
225
 
209
226
  ```toml
210
227
  [mcp.gnosys]
@@ -215,95 +232,361 @@ command = ["npx", "gnosys-mcp"]
215
232
  ANTHROPIC_API_KEY = "your-key-here"
216
233
  ```
217
234
 
218
- ### OpenCode
219
-
220
- Add to `~/.config/opencode/opencode.json` or `opencode.json` in your project:
221
-
222
- ```json
223
- {
224
- "mcp": {
225
- "gnosys": {
226
- "type": "local",
227
- "command": ["npx", "gnosys-mcp"],
228
- "env": {
229
- "ANTHROPIC_API_KEY": "your-key-here"
230
- }
231
- }
232
- }
233
- }
234
- ```
235
-
236
235
  ### MCP Tools
237
236
 
238
- The MCP server exposes the same operations as the CLI:
239
-
240
237
  | Tool | Description |
241
238
  |------|-------------|
242
- | `gnosys_discover` | Find relevant memories by keyword |
239
+ | `gnosys_discover` | Find relevant memories by keyword (start here) |
243
240
  | `gnosys_read` | Read a specific memory |
244
241
  | `gnosys_search` | Full-text search across stores |
242
+ | `gnosys_hybrid_search` | Hybrid keyword + semantic search (RRF fusion) |
243
+ | `gnosys_semantic_search` | Semantic similarity search (embeddings) |
244
+ | `gnosys_ask` | Ask a question, get a synthesized answer with citations |
245
+ | `gnosys_reindex` | Rebuild semantic embeddings from all memories |
245
246
  | `gnosys_list` | List memories with optional filters |
246
247
  | `gnosys_add` | Add a memory (LLM-structured) |
247
- | `gnosys_add_structured` | Add a memory (explicit fields) |
248
+ | `gnosys_add_structured` | Add with explicit fields (no LLM) |
248
249
  | `gnosys_update` | Update frontmatter or content |
249
250
  | `gnosys_reinforce` | Signal usefulness of a memory |
250
- | `gnosys_stale` | Find stale memories |
251
- | `gnosys_commit_context` | Extract and commit memories from context |
252
- | `gnosys_tags` | List tag registry |
253
- | `gnosys_tags_add` | Add a tag to the registry |
251
+ | `gnosys_commit_context` | Extract memories from conversation context |
252
+ | `gnosys_import` | Bulk import from CSV, JSON, or JSONL |
254
253
  | `gnosys_init` | Initialize a new store |
254
+ | `gnosys_maintain` | Run vault maintenance (decay, dedup, consolidation) |
255
+ | `gnosys_dashboard` | System dashboard (memory count, health, graph, LLM status) |
256
+ | `gnosys_reindex_graph` | Build/rebuild the wikilink graph |
255
257
  | `gnosys_stores` | Show active stores |
258
+ | `gnosys_tags` | List tag registry |
259
+
260
+ ---
261
+
262
+ ## How It Works
263
+
264
+ A Gnosys store is a `.gnosys/` directory inside your project:
265
+
266
+ ```
267
+ your-project/
268
+ .gnosys/
269
+ decisions/
270
+ use-postgresql.md
271
+ architecture/
272
+ three-layer-design.md
273
+ usda-foods/
274
+ almond-butter-creamy.md
275
+ nvd-cves/
276
+ cve-2024-1234.md
277
+ gnosys.json # configuration
278
+ .config/tags.json # tag registry
279
+ CHANGELOG.md
280
+ .git/ # auto-versioned
281
+ ```
282
+
283
+ Each memory is an atomic Markdown file with YAML frontmatter:
284
+
285
+ ```yaml
286
+ ---
287
+ id: deci-001
288
+ title: "Use PostgreSQL for Main Database"
289
+ category: decisions
290
+ tags:
291
+ domain: [database, backend]
292
+ type: [decision]
293
+ relevance: "database selection postgres sql json storage persistence"
294
+ author: human+ai
295
+ authority: declared
296
+ confidence: 0.9
297
+ created: 2026-03-01
298
+ status: active
299
+ supersedes: null
300
+ ---
301
+ # Use PostgreSQL for Main Database
302
+
303
+ We chose PostgreSQL over MySQL and SQLite because...
304
+ ```
305
+
306
+ Key fields:
307
+
308
+ - **relevance** — keyword cloud powering `discover`. Think: what would someone search to find this?
309
+ - **confidence** — 0–1 score. Observations: 0.6. Firm decisions: 0.9.
310
+ - **authority** — who established this? `declared`, `observed`, `imported`, `inferred`.
311
+ - **status** — `active`, `archived`, or `superseded`. Superseded memories link to replacements.
312
+
313
+ ---
314
+
315
+ ## LLM Providers & Configuration
316
+
317
+ Gnosys features a **System of Cognition (SOC)** — five LLM providers behind a single interface. Switch between cloud and local with one command:
318
+
319
+ ```bash
320
+ # Switch providers
321
+ gnosys config set provider anthropic # Cloud (default)
322
+ gnosys config set provider ollama # Local via Ollama
323
+ gnosys config set provider groq # Fast cloud inference
324
+ gnosys config set provider openai # OpenAI-compatible
325
+ gnosys config set provider lmstudio # Local via LM Studio
326
+
327
+ # Route tasks to different providers
328
+ gnosys config set task structuring ollama llama3.2
329
+ gnosys config set task synthesis anthropic claude-sonnet-4-20250514
330
+
331
+ # View the full SOC dashboard
332
+ gnosys dashboard
333
+
334
+ # Check all provider connectivity
335
+ gnosys doctor
336
+ ```
337
+
338
+ ### Supported Providers
339
+
340
+ | Provider | Type | Default Model | API Key Env Var |
341
+ |----------|------|---------------|-----------------|
342
+ | **Anthropic** | Cloud | claude-sonnet-4-20250514 | `ANTHROPIC_API_KEY` |
343
+ | **Ollama** | Local | llama3.2 | — (runs locally) |
344
+ | **Groq** | Cloud | llama-3.3-70b-versatile | `GROQ_API_KEY` |
345
+ | **OpenAI** | Cloud | gpt-4o-mini | `OPENAI_API_KEY` |
346
+ | **LM Studio** | Local | default | — (runs locally) |
347
+
348
+ All providers implement the same `LLMProvider` interface. Cloud providers use API keys (set via env var or `gnosys.json`). Local providers (Ollama, LM Studio) just need the service running.
349
+
350
+ ### Task-Based Model Routing
351
+
352
+ Use different models for different tasks — a cheap/fast model for structuring imports and a powerful model for synthesis:
353
+
354
+ ```json
355
+ {
356
+ "llm": {
357
+ "defaultProvider": "anthropic",
358
+ "anthropic": { "model": "claude-sonnet-4-20250514" },
359
+ "ollama": { "model": "llama3.2", "baseUrl": "http://localhost:11434" },
360
+ "groq": { "model": "llama-3.3-70b-versatile" },
361
+ "openai": { "model": "gpt-4o-mini", "baseUrl": "https://api.openai.com/v1" },
362
+ "lmstudio": { "model": "default", "baseUrl": "http://localhost:1234/v1" }
363
+ },
364
+ "taskModels": {
365
+ "structuring": { "provider": "ollama", "model": "llama3.2" },
366
+ "synthesis": { "provider": "anthropic", "model": "claude-sonnet-4-20250514" }
367
+ }
368
+ }
369
+ ```
370
+
371
+ A default `gnosys.json` is created during `gnosys init`. Validation is handled by Zod — invalid configs produce clear error messages. Legacy `defaultLLMProvider` and `defaultModel` fields are auto-migrated to the new `llm` structure.
372
+
373
+ ---
374
+
375
+ ## Using with Obsidian
376
+
377
+ The `.gnosys/` directory is a fully valid Obsidian vault. Open it and get graph view, wikilinks, backlinks, tag search, and visual editing with zero configuration.
378
+
379
+ 1. Open Obsidian → "Open folder as vault" → select `.gnosys/`
380
+ 2. Browse categories as folders, explore the graph view
381
+ 3. Wikilinks between memories (e.g., `[[eric_allman/sendmail]]` in CVE data) create navigable connections
382
+ 4. Edits made in Obsidian are picked up automatically (filesystem is source of truth)
383
+
384
+ ---
385
+
386
+ ## Bulk Import
387
+
388
+ Import any structured dataset into atomic memories:
389
+
390
+ ```bash
391
+ # JSON with field mapping
392
+ gnosys import foods.json --format json \
393
+ --mapping '{"description":"title","foodCategory":"category","notes":"content"}' \
394
+ --mode structured
395
+
396
+ # CSV
397
+ gnosys import data.csv --format csv \
398
+ --mapping '{"name":"title","type":"category","notes":"content"}'
399
+
400
+ # JSONL (one record per line)
401
+ gnosys import events.jsonl --format jsonl \
402
+ --mapping '{"event":"title","type":"category","details":"content"}'
403
+
404
+ # With LLM enrichment (generates keyword clouds, better structure)
405
+ gnosys import data.json --mode llm --concurrency 3
406
+
407
+ # Preview without writing
408
+ gnosys import data.json --dry-run
409
+
410
+ # Resume interrupted imports
411
+ gnosys import data.json --skip-existing
412
+
413
+ # Slice a large dataset
414
+ gnosys import large.json --limit 500 --offset 1000
415
+ ```
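The `--mapping` examples above read as "source field → memory field". The sketch below illustrates that reading only. It is written in Python for brevity and is not taken from the Gnosys source; `apply_mapping` is a hypothetical helper, and dropping unmapped source fields is an assumption.

```python
import json


def apply_mapping(record: dict, mapping: dict) -> dict:
    """Rename source fields to memory fields (hypothetical helper).

    Assumption: keys of `mapping` are source field names and values are
    target memory fields, as the README examples suggest. Source fields
    without a mapping entry are dropped in this sketch.
    """
    return {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }


# Same mapping string as the CSV example above.
mapping = json.loads('{"name":"title","type":"category","notes":"content"}')
record = {"name": "Almond butter", "type": "usda-foods", "notes": "High in protein", "extra": 1}
memory_fields = apply_mapping(record, mapping)
```

Here `memory_fields` holds only `title`, `category`, and `content`; the real importer may treat unmapped fields differently.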
416
+
417
+ ---
418
+
419
+ ## Freeform Asking
420
+
421
+ Ask natural-language questions and get synthesized answers with citations from the entire vault. Gnosys retrieves relevant memories via hybrid search, then uses your LLM to synthesize a cited response.
422
+
423
+ ```bash
424
+ # First, build the semantic index (downloads ~80 MB model on first run)
425
+ gnosys reindex
426
+
427
+ # Ask a question about your USDA data
428
+ gnosys ask "What are the best high-protein low-sodium food alternatives?"
429
+
430
+ # Ask about CVEs
431
+ gnosys ask "Which vulnerabilities allow remote code execution?"
432
+
433
+ # Use keyword-only mode (no embeddings needed)
434
+ gnosys ask "What do we know about cheddar cheese?" --mode keyword
435
+ ```
436
+
437
+ Answers include Obsidian wikilink citations like `[[almond-butter-creamy.md]]` so you can click through to the source memories. If the initial search doesn't find enough context, a "deep query" follow-up search automatically expands the context.
438
+
439
+ ### Hybrid Search
440
+
441
+ Three search modes are available:
442
+
443
+ ```bash
444
+ # Hybrid (default): combines keyword + semantic with RRF fusion
445
+ gnosys hybrid-search "high protein low sodium"
446
+
447
+ # Semantic only: finds conceptually related memories
448
+ gnosys semantic-search "healthy meal alternatives"
449
+
450
+ # Keyword only: classic FTS5 full-text search
451
+ gnosys hybrid-search "cheddar cheese protein" --mode keyword
452
+ ```
453
+
454
+ The embedding model (`all-MiniLM-L6-v2`) is lazy-loaded — it's only downloaded the first time you run `gnosys reindex` or a semantic search. Embeddings are stored as a regeneratable sidecar in SQLite, never the source of truth.
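For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique the hybrid mode is described as using. It is a Python illustration, not Gnosys's own code: the constant `k = 60` is a commonly used default and an assumption here, and the file names are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, with ranks starting at 1. `k` = 60 is an assumed default, not a
    value confirmed by the README.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the keyword (FTS5) and semantic searches.
keyword_hits = ["a.md", "b.md", "c.md"]
semantic_hits = ["b.md", "d.md", "a.md"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Documents that appear in both lists (`a.md`, `b.md`) rise above those found by only one search, which is the point of fusing rankings rather than raw scores.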
455
+
456
+ ---
256
457
 
257
458
  ## Layered Stores
258
459
 
259
- Gnosys supports multiple stores stacked in precedence order, so project-specific knowledge can override personal defaults, which can override shared organizational knowledge.
460
+ Multiple stores stacked by precedence:
260
461
 
261
462
  | Layer | Source | Writable | Use Case |
262
463
  |-------|--------|----------|----------|
263
- | **Project** | `.gnosys/` in project root | Yes (default) | Project-specific decisions and architecture |
264
- | **Optional** | `GNOSYS_STORES` env var | Read-only | Shared reference knowledge |
464
+ | **Project** | `.gnosys/` in project root | Yes (default) | Project-specific knowledge |
465
+ | **Optional** | `GNOSYS_STORES` env var | Read-only | Shared reference data |
265
466
  | **Personal** | `GNOSYS_PERSONAL` env var | Yes (fallback) | Cross-project personal knowledge |
266
- | **Global** | `GNOSYS_GLOBAL` env var | Explicit only | Organization-wide shared knowledge |
467
+ | **Global** | `GNOSYS_GLOBAL` env var | Explicit only | Org-wide shared knowledge |
267
468
 
268
- Writes go to the project store by default. Global writes require `--store global` to prevent accidental changes to shared knowledge.
469
+ ```bash
470
+ export GNOSYS_PERSONAL="$HOME/.gnosys-personal"
471
+ export GNOSYS_GLOBAL="/shared/team/.gnosys"
472
+ export GNOSYS_STORES="/path/to/reference-data"
473
+ ```
269
474
 
270
- ### Environment Variables
475
+ ---
271
476
 
272
- ```bash
273
- # Optional: API key for LLM-powered features (add, commit-context)
274
- export ANTHROPIC_API_KEY="sk-ant-..."
477
+ ## Auto Memory Maintenance
275
478
 
276
- # Optional: Personal knowledge store (cross-project)
277
- export GNOSYS_PERSONAL="$HOME/.gnosys-personal"
479
+ The vault stays clean and useful forever without manual babysitting. Agents can run for months without the memory turning into a mess.
278
480
 
279
- # Optional: Organization-wide shared knowledge
280
- export GNOSYS_GLOBAL="/shared/team/.gnosys"
481
+ ### How It Works
281
482
 
282
- # Optional: Additional read-only stores (colon-separated)
283
- export GNOSYS_STORES="/path/to/store1:/path/to/store2"
284
- ```
483
+ **Confidence Decay:** Every memory's confidence decays exponentially over time based on how recently it was used. The formula: `decayed = base_confidence × e^(-0.005 × days_since_reinforced)`. At this rate, an unreinforced memory loses ~50% confidence after 139 days.
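As a quick check of the arithmetic, the decay formula can be evaluated directly. The Python sketch below uses only the formula and rate quoted above; the function names are illustrative and not part of Gnosys.

```python
import math

DECAY_RATE = 0.005  # per-day rate quoted in the README


def decayed_confidence(base_confidence: float, days_since_reinforced: float) -> float:
    """decayed = base_confidence * e^(-0.005 * days_since_reinforced)."""
    return base_confidence * math.exp(-DECAY_RATE * days_since_reinforced)


def half_life_days(rate: float = DECAY_RATE) -> float:
    """Days until confidence halves: ln(2) / rate."""
    return math.log(2) / rate
```

With a rate of 0.005 the half-life is ln(2) / 0.005, roughly 138.6 days, which matches the "~50% after 139 days" figure.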
484
+
485
+ **Automatic Reinforcement:** Every time a memory appears in search results, ask synthesis, or import — its `reinforcement_count` increments and `last_reinforced` resets. This happens automatically in `gnosys_ask`, `gnosys_hybrid_search`, and all search-based tools.
486
+
487
+ **Duplicate Detection:** Uses semantic similarity (cosine > 0.85) combined with title word overlap (Jaccard > 0.4) to flag potential duplicates. Both conditions must pass to reduce false positives.
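The two-condition check can be sketched as follows. This is a Python illustration using the thresholds quoted above; the tokenization (lowercased word characters) and all function names are assumptions, not the package's actual implementation.

```python
import math
import re

COSINE_THRESHOLD = 0.85   # from the README
JACCARD_THRESHOLD = 0.4   # from the README


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def title_jaccard(title_a, title_b):
    """Jaccard overlap of title word sets (assumed tokenization)."""
    words_a = set(re.findall(r"\w+", title_a.lower()))
    words_b = set(re.findall(r"\w+", title_b.lower()))
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def is_potential_duplicate(emb_a, title_a, emb_b, title_b):
    """Flag a pair only when BOTH thresholds are exceeded."""
    return (
        cosine_similarity(emb_a, emb_b) > COSINE_THRESHOLD
        and title_jaccard(title_a, title_b) > JACCARD_THRESHOLD
    )
```

Requiring both conditions means two memories with near-identical embeddings but unrelated titles are not flagged, which is how the README describes reducing false positives.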
488
+
489
+ **Auto-Consolidation:** When duplicates are confirmed, the LLM merges both memories into a single comprehensive one. Originals are marked `status: superseded` with a pointer to the merged version.
490
+
491
+ ### Running Maintenance
492
+
493
+ ```bash
494
+ # See what would change (safe, no modifications)
495
+ gnosys maintain --dry-run
285
496
 
286
- You can also place your API key in `~/.config/gnosys/.env`:
497
+ # Apply all changes automatically
498
+ gnosys maintain --auto-apply
287
499
 
500
+ # Background mode: runs every 6 hours alongside the MCP server
501
+ gnosys serve --with-maintenance
288
502
  ```
289
- ANTHROPIC_API_KEY=sk-ant-...
503
+
504
+ ### Scheduling with cron (Linux/Mac)
505
+
506
+ ```bash
507
+ # Run maintenance daily at 3am
508
+ 0 3 * * * cd /path/to/project && npx gnosys maintain --auto-apply >> /var/log/gnosys-maintain.log 2>&1
290
509
  ```
291
510
 
292
- ## Agent Integration Guide
511
+ ### Scheduling with Task Scheduler (Windows)
512
+
513
+ Create a basic task that runs daily:
514
+ - Program: `npx`
515
+ - Arguments: `gnosys maintain --auto-apply`
516
+ - Start in: `C:\path\to\project`
293
517
 
294
- ### Recommended Agent Workflow
518
+ ### MCP Tool
295
519
 
296
- 1. **Start of session**: Call `gnosys_discover` with keywords about the current task to load relevant context.
297
- 2. **During work**: When making decisions or learning something new, call `gnosys_add` to persist it.
298
- 3. **When things change**: Use `gnosys_update` with `supersedes` to create clean revision chains.
299
- 4. **End of session**: Call `gnosys_commit_context` with a summary of the conversation to extract and persist novel knowledge before context is lost.
520
+ The `gnosys_maintain` MCP tool lets agents trigger maintenance programmatically with dry-run and auto-apply options.
300
521
 
301
- ### Memory Quality Tips
522
+ ### Doctor Health Report
523
+
524
+ `gnosys doctor` now includes a Maintenance Health section showing stale count, average confidence (raw and decayed), reinforcement stats, and never-reinforced memories.
525
+
526
+ ---
302
527
 
303
- - Write **atomic memories** — one decision, fact, or insight per file.
304
- - Use **specific relevance keywords** — think about what someone would search for to find this memory.
305
- - Set **confidence scores honestly** a hunch is 0.5, a firm decision is 0.9.
306
- - Use **supersession** instead of editing — when a decision changes, create a new memory that `supersedes` the old one. This preserves the history of why things changed.
528
+ ## Comparison
529
+
530
+ | Feature | **Gnosys** | NotebookLM | gnosis-mcp | Official MCP Memory |
531
+ |---------|-----------|------------|------------|-------------------|
532
+ | Storage | Markdown files + Git | Google proprietary | SQLite | JSON file |
533
+ | Transparent/editable | ✅ Plain `.md` files | ❌ Opaque | ❌ Binary DB | ✅ But flat JSON |
534
+ | Version history | ✅ Full Git history | ❌ | ❌ | ❌ |
535
+ | Obsidian vault | ✅ Native | ❌ | ❌ | ❌ |
536
+ | Bulk import | ✅ CSV/JSON/JSONL | ❌ Manual | ❌ | ❌ |
537
+ | MCP server | ✅ Native | ❌ | ✅ | ✅ |
538
+ | CLI | ✅ Full-featured | ❌ | ❌ | ❌ |
539
+ | Layered stores | ✅ 4 layers | ❌ | ❌ | ❌ |
540
+ | Wikilinks | ✅ Auto-generated | ❌ | ❌ | ❌ |
541
+ | Search | Hybrid: FTS5 + semantic + RRF | Proprietary | Basic SQL | None |
542
+ | Freeform Q&A | ✅ `gnosys_ask` with citations | ✅ Built-in | ❌ | ❌ |
543
+ | Self-hosted | ✅ | ❌ | ✅ | ✅ |
544
+ | LLM providers | 5 (Anthropic, Ollama, Groq, OpenAI, LM Studio) | Proprietary | No LLM | No LLM |
545
+ | Wikilink graph | ✅ Persistent JSON graph | ❌ | ❌ | ❌ |
546
+ | System dashboard | ✅ Pretty CLI + MCP tool | ❌ | ❌ | ❌ |
547
+ | Auto maintenance | ✅ Decay, dedup, consolidation | ❌ | ❌ | ❌ |
548
+ | Docker support | ✅ | ❌ | ❌ | ❌ |
549
+ | Price | Free / MIT | Free tier, then paid | Free | Free |
550
+
551
+ ---
552
+
553
+ ## CLI Reference
554
+
555
+ ```bash
556
+ gnosys --help # List all commands
557
+ gnosys init # Initialize a new store
558
+ gnosys add "raw input" # Add memory via LLM
559
+ gnosys add-structured ... # Add memory with explicit fields
560
+ gnosys discover "keywords" # Find relevant memories (metadata only)
561
+ gnosys search "query" # Full-text search with snippets
562
+ gnosys hybrid-search "q" # Hybrid keyword + semantic search
563
+ gnosys semantic-search "q" # Semantic similarity search
564
+ gnosys ask "question" # Ask a question, get cited answer
565
+ gnosys reindex # Build/rebuild semantic embeddings
566
+ gnosys read <path> # Read a specific memory
567
+ gnosys list # List all memories
568
+ gnosys update <path> ... # Update a memory
569
+ gnosys reinforce <id> ... # Signal memory usefulness
570
+ gnosys stale # Find stale memories
571
+ gnosys commit-context "..." # Extract memories from conversation
572
+ gnosys import <file> ... # Bulk import data
573
+ gnosys maintain # Run vault maintenance (dry run by default)
574
+ gnosys maintain --dry-run # Preview changes without modifying
575
+ gnosys maintain --auto-apply # Apply all maintenance automatically
576
+ gnosys dashboard # Pretty system dashboard
577
+ gnosys dashboard --json # Dashboard as JSON
578
+ gnosys reindex-graph # Build/rebuild wikilink graph
579
+ gnosys config show # Show SOC (System of Cognition) configuration
580
+ gnosys config set provider <name> # Set default provider
581
+ gnosys config set task <task> <provider> <model> # Route task
582
+ gnosys doctor # Full system health check (all providers)
583
+ gnosys tags # List tag registry
584
+ gnosys stores # Show active stores
585
+ gnosys serve # Start MCP server (stdio)
586
+ gnosys serve --with-maintenance # MCP server + maintenance every 6h
587
+ ```
588
+
589
+ ---
307
590
 
308
591
  ## Development
309
592
 
@@ -324,11 +607,84 @@ src/
324
607
  lib/
325
608
  store.ts # Core: read/write/update memory files
326
609
  search.ts # FTS5 search and discovery
610
+ embeddings.ts # Lazy semantic embeddings (all-MiniLM-L6-v2)
611
+ hybridSearch.ts # Hybrid search with RRF fusion
612
+ ask.ts # Freeform Q&A with LLM synthesis + citations
613
+ llm.ts # LLM abstraction — System of Cognition (5 providers)
614
+ maintenance.ts # Auto-maintenance: decay, dedup, consolidation, reinforcement
615
+ dashboard.ts # Aggregated system dashboard
616
+ graph.ts # Persistent wikilink graph (graph.json)
327
617
  tags.ts # Tag registry management
328
- ingest.ts # LLM-powered structuring of raw input
618
+ ingest.ts # LLM-powered structuring (with retry logic)
619
+ import.ts # Bulk import engine (CSV, JSON, JSONL)
620
+ config.ts # gnosys.json loader with Zod validation
621
+ retry.ts # Exponential backoff for LLM calls
329
622
  resolver.ts # Layered multi-store resolution
623
+ lensing.ts # Memory lensing (filtered views)
624
+ history.ts # Git history and rollback
625
+ timeline.ts # Knowledge evolution timeline
626
+ wikilinks.ts # Obsidian wikilink graph
627
+ bootstrap.ts # Bootstrap from source code
628
+ prompts/
629
+ synthesize.md # System prompt template for ask engine
330
630
  ```
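
The "RRF fusion" noted for `hybridSearch.ts` refers to Reciprocal Rank Fusion, which merges ranked lists by summing `1 / (k + rank)` for each document across the lists. The sketch below shows the general technique only. The function name, the use of plain string ids, and `k = 60` (a commonly used default) are assumptions, not details taken from the Gnosys source.

```typescript
// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
// Each list is ordered best-first; k dampens the influence of top ranks.
// k = 60 is a conventional default, not a value confirmed for Gnosys.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  const fused: Array<[string, number]> = [];
  scores.forEach((score, id) => fused.push([id, score]));
  // Highest fused score first.
  return fused.sort((x, y) => y[1] - x[1]);
}

// Example: one keyword ranking and one semantic ranking over the same store.
const keywordHits = ["a", "b", "c"];
const semanticHits = ["b", "d", "a"];
const fusedHits = reciprocalRankFusion([keywordHits, semanticHits]);
```

Because RRF uses only rank positions, it needs no score normalization between the keyword and semantic result lists, which is one reason it is a common choice for hybrid search.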
331
631
 
632
+ ---
633
+
634
+ ## Benchmarks
635
+
636
+ Measured on our demo vault (120 memories: 100 USDA foods + 20 NVD CVEs):
637
+
638
+ | Metric | Gnosys | NotebookLM | gnosis-mcp |
639
+ |--------|--------|------------|------------|
640
+ | Import 100 records | 0.6s (structured) | Manual upload | N/A |
641
+ | Cold start (first load) | 0.3s | ~5s (cloud) | ~0.1s |
642
+ | Keyword search | <10ms (FTS5) | Cloud-dependent | SQLite |
643
+ | Hybrid search (keyword + semantic) | ~50ms | N/A | N/A |
644
+ | Reindex 120 embeddings | ~8s (first run: model download ~80 MB) | N/A | N/A |
645
+ | Maintenance dry-run (120 memories) | ~2s | N/A | N/A |
646
+ | Graph reindex (120 memories) | <1s | N/A | N/A |
647
+ | Storage per memory | ~1 KB `.md` file | Opaque | SQLite row |
648
+ | Embedding storage | ~0.3 MB for 120 memories | Cloud | N/A |
649
+ | LLM providers | 5 (Anthropic, Ollama, Groq, OpenAI, LM Studio) | 1 (Google) | 0 |
650
+ | Offline capable | ✅ (Ollama / LM Studio) | ❌ | ✅ |
651
+ | Test suite | 143 tests, 0 errors | N/A | N/A |
652
+
653
+ All benchmarks were run on Apple M-series hardware with Node.js 20+. Import speed depends on mode: `structured` bypasses the LLM entirely, while LLM-enriched imports are bounded by provider latency.
654
+
655
+ ---
656
+
657
+ ## Community & Next Steps
658
+
659
+ Gnosys is open source (MIT) and actively developed. Here's how to get involved:
660
+
661
+ **Get started fast:**
662
+ - **Cursor template:** Add Gnosys to any Cursor project with one MCP config line (see [MCP Server Setup](#mcp-server-setup))
663
+ - **Docker:** `docker build -t gnosys . && docker compose up` for containerized deployment
664
+ - **Demo vault:** See [DEMO.md](DEMO.md) for a full walkthrough with USDA + NVD data
665
+
666
+ **Contribute:**
667
+ - [GitHub Discussions](https://github.com/proticom/gnosys-mcp/discussions) — share ideas, ask questions, show what you've built
668
+ - [Issues](https://github.com/proticom/gnosys-mcp/issues) — bug reports and feature requests
669
+ - PRs welcome — especially for new import connectors, LLM providers, and Obsidian plugins
670
+
671
+ **What's next (v1.2+):**
672
+ - Obsidian community plugin for native vault integration
673
+ - VS Code extension for in-editor memory reinforcement
674
+ - Docker Hub published image for one-line deployment
675
+ - Multi-agent memory sharing protocol
676
+ - Graph visualization in the dashboard
677
+
678
+ ---
679
+
680
+ ## Roadmap
681
+
682
+ See the [6-phase roadmap](https://gnosys.ai/roadmap) for what's next.
683
+
684
+ **Have ideas?** [Join the discussion →](https://github.com/proticom/gnosys-mcp/discussions)
685
+
686
+ ---
687
+
332
688
  ## License
333
689
 
334
- MIT
690
+ MIT — [LICENSE](LICENSE)