mneme-cli 0.4.0__tar.gz → 0.5.1__tar.gz

Files changed (51)
  1. {mneme_cli-0.4.0/mneme/templates/workspace → mneme_cli-0.5.1}/AGENTS.md +75 -2
  2. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/CHANGELOG.md +87 -0
  3. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/CLAUDE.md +1 -1
  4. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/CODER.md +1 -1
  5. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/EXAMPLES.md +2 -2
  6. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/FEATURES.md +14 -4
  7. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/PKG-INFO +149 -21
  8. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/README.md +143 -17
  9. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/__init__.py +2 -2
  10. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/config.py +6 -13
  11. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/core.py +1813 -806
  12. mneme_cli-0.5.1/mneme/search.py +318 -0
  13. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/server.py +4 -4
  14. mneme_cli-0.5.1/mneme/templates/workspace/.gitignore +9 -0
  15. {mneme_cli-0.4.0 → mneme_cli-0.5.1/mneme/templates/workspace}/AGENTS.md +2 -2
  16. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/README.md +2 -2
  17. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/ui.html +19 -19
  18. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme_cli.egg-info/SOURCES.txt +2 -0
  19. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/pyproject.toml +4 -4
  20. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_agent_loop.py +4 -1
  21. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_bug_regressions.py +20 -15
  22. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_core.py +611 -209
  23. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_ingest_csv.py +1 -1
  24. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_profile.py +1 -1
  25. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_schema_search.py +23 -5
  26. mneme_cli-0.5.1/tests/test_search.py +142 -0
  27. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_tornado_lint.py +1 -1
  28. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/test_trace.py +1 -1
  29. mneme_cli-0.4.0/mneme/templates/workspace/.gitignore +0 -9
  30. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/LICENSE +0 -0
  31. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/MANIFEST.in +0 -0
  32. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/__main__.py +0 -0
  33. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/eu-mdr.md +0 -0
  34. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/iso-13485.md +0 -0
  35. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/mappings/dds.json +0 -0
  36. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/mappings/requirements.json +0 -0
  37. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/mappings/risk-register.json +0 -0
  38. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/mappings/test-cases.json +0 -0
  39. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/profiles/mappings/user-needs.json +0 -0
  40. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/inbox/.gitkeep +0 -0
  41. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/index.md +0 -0
  42. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/log.md +0 -0
  43. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/profiles/README.md +0 -0
  44. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/profiles/mappings/.gitkeep +0 -0
  45. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/schema/entities.json +0 -0
  46. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/schema/graph.json +0 -0
  47. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/schema/tags.json +0 -0
  48. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/sources/.gitkeep +0 -0
  49. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/mneme/templates/workspace/wiki/_templates/page.md +0 -0
  50. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/setup.cfg +0 -0
  51. {mneme_cli-0.4.0 → mneme_cli-0.5.1}/tests/__init__.py +0 -0
@@ -69,7 +69,7 @@ A mneme workspace is a directory. Its shape is stable across versions:
69
69
  graph.json relationship graph
70
70
  tags.json tag registry
71
71
  traceability.json trace links between pages
72
- memvid/ optional .mv2 archives (semantic search)
72
+ search.db SQLite FTS5 search index (rebuilt from wiki)
73
73
  profiles/ workspace-local profiles and CSV mappings
74
74
  mappings/ JSON column mappings for ingest-csv
75
75
  exports/ JSON / markdown exports
@@ -107,7 +107,7 @@ mneme tornado --client <client> # batch from inbox/
107
107
  ```
108
108
 
109
109
  `ingest` is atomic: it writes the wiki page, updates the schema, and
110
- advances the Memvid archive in one operation. `ingest-csv` produces one
110
+ indexes the page in SQLite FTS5 in one operation. `ingest-csv` produces one
111
111
  wiki page per row, with trace links derived from the mapping. `tornado`
112
112
  is a bulk inbox processor — it auto-detects page type and routes CSVs
113
113
  through `ingest-csv`, everything else through `ingest`.
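To make the "indexes the page in SQLite FTS5" step concrete, here is a minimal sketch of an FTS5 index with BM25 ranking and Porter stemming, using only the stdlib `sqlite3` module. The table name `pages`, its columns, and the helpers `build_index` / `search` are illustrative assumptions, not the contents of `mneme/search.py`. It also assumes the local SQLite build has FTS5 compiled in, which is typical but not guaranteed.

```python
import sqlite3


def build_index(pages):
    """Build an in-memory FTS5 index from (path, title, body) tuples.

    Hypothetical schema: the real search.db layout is not shown in this diff.
    """
    conn = sqlite3.connect(":memory:")
    # 'porter unicode61' wraps the default tokenizer with Porter stemming,
    # so "monitoring" and "monitored" reduce to the same stem.
    conn.execute(
        "CREATE VIRTUAL TABLE pages USING fts5("
        "path UNINDEXED, title, body, tokenize='porter unicode61')"
    )
    conn.executemany(
        "INSERT INTO pages (path, title, body) VALUES (?, ?, ?)", pages
    )
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return (path, highlighted snippet, bm25 score) rows, best match first."""
    # bm25() yields smaller values for better matches, so ascending order
    # puts the most relevant page first. Column 2 is `body`.
    rows = conn.execute(
        "SELECT path, snippet(pages, 2, '<b>', '</b>', '...', 8), bm25(pages) "
        "FROM pages WHERE pages MATCH ? ORDER BY bm25(pages) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


conn = build_index([
    ("wiki/parkiwatch/tremor-study.md", "Tremor study",
     "Patients were monitored for tremors daily."),
    ("wiki/parkiwatch/risk-register.md", "Risk register",
     "Battery overheating risk."),
])
for path, snippet, score in search(conn, "monitoring"):
    print(path, snippet, score)
```

Because the index can be rebuilt from the wiki pages at any time, a sketch like this treats the database as a disposable cache rather than a source of truth.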
@@ -207,6 +207,79 @@ baseline, your current wiki page, and a fresh ingest of the new source.
207
207
  If there are conflicts, the page is left with merge markers. Edit them
208
208
  out manually, then run `resync-resolve`.
209
209
 
210
+ ### 3.6 TAG — agent-driven tagging
211
+
212
+ ```bash
213
+ mneme tags suggest <client>/<page> # build tag packet
214
+ mneme tags suggest <client>/<page> --json # raw dict
215
+ mneme tags apply <client>/<page> --add t1,t2 --remove t3
216
+
217
+ # Bulk variants -- packet up to N pages in one round-trip
218
+ mneme tags bulk-suggest --client <c> --filter req- --limit 50 --out packet.md
219
+ mneme tags bulk-apply response.json # response: {"pages": [{wiki_path, add, remove}, ...]}
220
+ ```
221
+
222
+ `mneme tags suggest` builds a **tag packet**: the page content, current
223
+ tags, the workspace tag taxonomy (every existing tag with usage counts),
224
+ active profile guidance, and a ready-to-paste prompt instructing you to
225
+ choose 3–7 tags. Mneme does **not** propose tags itself — content
226
+ understanding is your job. The packet gives you all the context you need.
227
+
228
+ Your contract when consuming a tag packet:
229
+
230
+ 1. **Prefer existing tags** from the taxonomy when they fit. Consistency
231
+ matters more than novelty — `iso-13485` should not become `iso13485`
232
+ on the next page.
233
+ 2. **Add new tags only** when no existing tag captures the concept.
234
+ 3. Follow the format: lowercase, hyphenated (`risk-management`, not
235
+ `Risk Management`).
236
+ 4. Do not propose generic tags (`summary`, `overview`, `report`).
237
+ 5. Do not add the client slug — it is auto-applied.
238
+ 6. Output JSON: `{"tags": ["existing-a", "existing-b"], "new_tags": ["proposed-c"]}`.
239
+
240
+ `mneme tags apply` is **atomic**: it rewrites the wiki page frontmatter,
241
+ updates `schema/tags.json`, re-syncs the page to the FTS5 index, and
242
+ appends a log entry — all in one operation. Search picks up the new tags
243
+ immediately. Use `--add` and/or `--remove`, comma-separated.
244
+
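The contract above can be checked mechanically before calling `mneme tags apply`. The sketch below is one possible agent-side helper, not part of mneme: the names `tags_to_add` and `apply_command` and the exact tag pattern are assumptions. It parses the JSON response, enforces the lowercase-hyphenated format, drops generic tags and the client slug, and builds the argument list for the `--add` form documented above.

```python
import json
import re

# Generic tags the contract says not to propose.
GENERIC_TAGS = {"summary", "overview", "report"}
# Assumed reading of "lowercase, hyphenated": letters and digits joined by hyphens.
TAG_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def tags_to_add(response_text, client_slug):
    """Return the ordered, de-duplicated tags from an agent response."""
    data = json.loads(response_text)
    candidates = list(data.get("tags", [])) + list(data.get("new_tags", []))
    accepted = []
    for tag in candidates:
        if tag in GENERIC_TAGS or tag == client_slug:
            continue  # generic tags are skipped; the client slug is auto-applied
        if not TAG_PATTERN.match(tag):
            raise ValueError(f"tag is not lowercase-hyphenated: {tag!r}")
        if tag not in accepted:
            accepted.append(tag)
    return accepted


def apply_command(page, tags):
    """Build the argv for `mneme tags apply <client>/<page> --add t1,t2`."""
    return ["mneme", "tags", "apply", page, "--add", ",".join(tags)]


response = '{"tags": ["clinical-trial", "iso-13485"], "new_tags": ["bradykinesia-detection"]}'
print(apply_command("parkiwatch/research-paper", tags_to_add(response, "parkiwatch")))
```

Rejecting a malformed tag outright, rather than silently rewriting it, keeps the taxonomy decision with the agent that proposed it.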
245
+ Existing taxonomy ops:
246
+
247
+ ```bash
248
+ mneme tags list # all tags + counts
249
+ mneme tags merge <old> <new> # rename across all pages
250
+ ```
251
+
252
+ ### 3.7 ENTITY — agent-driven classification
253
+
254
+ ```bash
255
+ mneme entity suggest --client <c> --limit 50 # classification packet
256
+ mneme entity apply --id iso-13485 --type standard # one at a time
257
+ mneme entity bulk-apply classifications.json # batch
258
+ ```
259
+
260
+ `mneme entity suggest` builds an **entity packet**: every `unknown`-typed
261
+ entity, the workspace's current type distribution, the valid type
262
+ vocabulary, an example wiki page per entity, and a prompt. The agent
263
+ returns a JSON array of `{id, type}` objects which `bulk-apply` writes
264
+ back atomically. Same philosophy as tags: mneme stays deterministic, the
265
+ LLM does the classification.
266
+
267
+ Valid types: `standard`, `company`, `person`, `product`, `technology`,
268
+ `concept`, `brand`, `unknown`.
269
+
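An agent can validate its own classification output against the type vocabulary before handing it to `bulk-apply`. This is a hedged sketch: the helper `validate_classifications` is hypothetical, and the only facts taken from the text above are the `{id, type}` array shape and the list of valid types.

```python
import json

# Type vocabulary as listed above.
VALID_TYPES = {
    "standard", "company", "person", "product",
    "technology", "concept", "brand", "unknown",
}


def validate_classifications(text):
    """Parse a JSON array of {id, type} objects and reject malformed entries."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of {id, type} objects")
    seen = set()
    for item in items:
        if not isinstance(item, dict) or set(item) != {"id", "type"}:
            raise ValueError(f"entry must have exactly 'id' and 'type': {item!r}")
        if item["type"] not in VALID_TYPES:
            raise ValueError(f"invalid type for {item['id']!r}: {item['type']!r}")
        if item["id"] in seen:
            raise ValueError(f"duplicate entity id: {item['id']!r}")
        seen.add(item["id"])
    return items


checked = validate_classifications(
    '[{"id": "iso-13485", "type": "standard"}, {"id": "parkiwatch", "type": "product"}]'
)
print(len(checked), "classifications ready for bulk-apply")
```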
270
+ ### 3.8 HOME — generated landing page
271
+
272
+ ```bash
273
+ mneme home --client <c> # wiki/<c>/HOME.md
274
+ mneme home --all-clients # wiki/HOME.md (cross-client)
275
+ ```
276
+
277
+ Generates an Obsidian-friendly navigation hub with Dataview queries
278
+ (group by type, by ID prefix like `REQ-*` / `DDS-*`, top tags) and a
279
+ plain-markdown `<details>` fallback so the page is useful outside
280
+ Obsidian. Run after a large ingest, or whenever the wiki's shape
281
+ changes meaningfully.
282
+
210
283
  ---
211
284
 
212
285
  ## 4. Profiles and the writing-style contract
@@ -4,6 +4,93 @@ All notable changes to this project are documented here.
4
4
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
5
5
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
6
 
7
+ ## [0.5.0] - 2026-04-13
8
+
9
+ ### Breaking Changes
10
+
11
+ - **Replaced memvid-sdk with SQLite FTS5.** The `memvid/` directory and `.mv2`
12
+ archives are no longer used. Search is now powered by a local `search.db`
13
+ file using BM25 ranking with Porter stemming. **Zero external dependencies**
14
+ for search — `sqlite3` is in the Python stdlib.
15
+ - `mneme repair` now rebuilds the FTS5 index instead of memvid archives.
16
+ - `mneme drift` reports `unindexed` / `orphaned` / `stale` instead of
17
+ `missing_from_memvid` / `orphan_frames`.
18
+ - `get_stats()` returns a `search` key (page_count, db_size_bytes,
19
+ search_latency_ms) instead of `memvid`.
20
+ - `sync_page_to_memvid()` renamed to `sync_page_to_index()`. Returns
21
+ `bool` (indexed) instead of `int` (frame count).
22
+ - Removed `chunk_body()`, `_sanitize_memvid_query()`, and all chunking
23
+ config (`MAX_CHUNK_SIZE`, `MIN_CHUNK_SIZE`, `MAX_CHUNKS_PER_INGEST`,
24
+ `CHUNK_COMMIT_BATCH`).
25
+ - Removed `MEMVID_DIR`, `MASTER_MV2`, `PER_CLIENT_DIR` config constants.
26
+ Replaced by `SEARCH_DB`.
27
+
28
+ ### Added
29
+
30
+ - **`mneme reindex`** command — rebuild search index from wiki pages.
31
+ - **`ingest-dir --recursive` / `-r`** — recurse into subdirectories.
32
+ - **`ingest-dir --preserve-structure`** — mirror source directory structure
33
+ in wiki subdirectories (avoids dedup collisions between same-basename files
34
+ in different directories).
35
+ - **`ingest-csv --delimiter`** flag with auto-detection via `csv.Sniffer`.
36
+ - **`.xlsx` ingest support** — install with `pip install "mneme-cli[xlsx]"`.
37
+ Sheets are rendered as markdown tables.
38
+ - **`mneme trace matrix --csv [--out FILE]`** — export the trace matrix as
39
+ CSV for QMS audits and DHF inclusion.
40
+ - **`graph.json` auto-populated** during ingest from wiki page wikilinks
41
+ and `related` frontmatter.
42
+ - **`stats` relationship count** now includes traceability.json links, not
43
+ just graph.json edges.
44
+ - **log.md rotation** — entries beyond `LOG_MAX_ENTRIES` (default 500) are
45
+ archived to `log-archive-YYYY-MM-DD.md`.
46
+
47
+ ### Fixed
48
+
49
+ - `mneme status` crash (UnboundLocalError on `log_content`).
50
+ - CSV ingest crash on `None` cells (`row.get()` returning None).
51
+ - Duplicate ingest detection now uses full source path, not just filename
52
+ (two `INSTRUCTIONS.md` files in different directories now both ingest).
53
+
54
+ ### Removed
55
+
56
+ - `memvid-sdk` dependency.
57
+ - `MNEME_NO_MEMVID` env var (no longer needed — FTS5 is always available).
58
+ - Chunking logic (`chunk_body`, `MAX_CHUNK_SIZE`, frame management).
59
+ - Tantivy-reserved-word query sanitizer (FTS5 has different syntax).
60
+
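The `--delimiter` auto-detection entry mentions `csv.Sniffer`. As a rough illustration of how such detection can work (the function name `detect_delimiter`, the candidate set, and the comma fallback are assumptions, not mneme's actual behavior):

```python
import csv


def detect_delimiter(sample, candidates=",;\t|", default=","):
    """Guess the delimiter of a CSV text sample, falling back to `default`."""
    try:
        # Restricting the candidates keeps the heuristic from picking
        # characters such as '-' that recur inside IDs like REQ-001.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer raises csv.Error when no candidate is consistent,
        # for example on single-column files.
        return default


sample = "id;title\nREQ-1;Foo\nREQ-2;Bar\n"
delimiter = detect_delimiter(sample)
rows = list(csv.DictReader(sample.splitlines(), delimiter=delimiter))
print(delimiter, rows[0]["title"])
```

An explicit `--delimiter` flag remains useful because the sniffer is a heuristic and can misjudge short or irregular samples.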
61
+ ## [0.5.1] - 2026-04-14
62
+
63
+ ### Added
64
+ - **`mneme entity suggest` / `entity apply` / `entity bulk-apply`** — agent-driven
65
+ entity classification (same packet pattern as `tags suggest`). Mneme builds a
66
+ packet of unclassified entities + the workspace type taxonomy + example pages;
67
+ the LLM agent classifies; mneme writes the types back atomically.
68
+ - **`mneme tags bulk-suggest` / `tags bulk-apply`** — operate on many pages at
69
+ once. `bulk-suggest --client X --filter req- --limit 50` packets up to 50
70
+ matching pages; agent returns one JSON file; `bulk-apply` runs all the changes
71
+ with per-page error tolerance. Critical for tagging workspaces of hundreds of
72
+ pages.
73
+ - **`mneme home --client <slug>` / `--all-clients`** — generates a `HOME.md`
74
+ navigation hub with Obsidian Dataview queries (group by type, by ID prefix
75
+ like REQ-*/DDS-*, top tags) plus a plain-markdown `<details>` fallback for
76
+ non-Obsidian viewers.
77
+ - **`mneme ingest-dir --preserve-structure`** — mirrors source directory
78
+ hierarchy into wiki subdirectories. `sources/client/REQUIREMENTS/req-001.md`
79
+ becomes `wiki/client/requirements/req-001.md` instead of flattening. Also
80
+ resolves same-basename-different-directory collisions naturally.
81
+ - **`mneme resync` auto-detects subpath** from a source's location under
82
+ `sources/<client>/`, so resyncs of preserve-structure ingests target the
83
+ correct nested wiki page instead of creating a duplicate flat one.
84
+ - **Progress bar** for `ingest-dir` and `ingest-csv` long loops. TTY-aware
85
+ (in-place updates) with non-TTY fallback (periodic line output) so CI logs
86
+ stay readable.
87
+
88
+ ### Fixed
89
+ - `mneme status` crashed with `UnboundLocalError` because a local `status` in
90
+ the `agent show` branch shadowed the function name throughout `main()`.
91
+ - `wiki/HOME.md` and `wiki/<client>/HOME.md` are now skipped during HOME
92
+ generation so re-running is idempotent.
93
+
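The `--preserve-structure` mapping described above can be sketched as a pure path transformation. The helper `wiki_path_for` is hypothetical; lowercasing the directory names and forcing a `.md` suffix are assumptions inferred from the single `REQUIREMENTS/req-001.md` example, not confirmed behavior.

```python
from pathlib import PurePosixPath


def wiki_path_for(source_path, client):
    """Map sources/<client>/<subdirs>/<file> to wiki/<client>/<subdirs>/<file>.md."""
    src = PurePosixPath(source_path)
    # Raises ValueError if the source does not live under sources/<client>/.
    rel = src.relative_to(PurePosixPath("sources") / client)
    # Assumption: directory names are lowercased, the file stem is kept as-is.
    subdirs = [part.lower() for part in rel.parts[:-1]]
    return PurePosixPath("wiki", client, *subdirs, rel.stem + ".md")


print(wiki_path_for("sources/client/REQUIREMENTS/req-001.md", "client"))
```

Because the subpath is derived from the source location alone, the same function could serve both ingest and resync, which is how same-basename files in different directories end up on distinct wiki pages.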
7
94
  ## [Unreleased]
8
95
 
9
96
  ### Added
@@ -1,4 +1,4 @@
1
- # Mnemosyne - Wiki Protocol
1
+ # mneme - Wiki Protocol
2
2
 
3
3
  > **If you are an LLM agent driving mneme**, read [AGENTS.md](AGENTS.md) **first**. It is the canonical agent protocol: the agent loop, the standard task templates, the sub-agent spawning patterns, and the hard rules you must never violate. CLAUDE.md (this file) describes the wiki layer; AGENTS.md describes the agent's job.
4
4
 
@@ -1,4 +1,4 @@
1
- # CODER.md - Developer Guide for Mnemosyne
1
+ # CODER.md - Developer Guide for mneme
2
2
 
3
3
  This is the engineering reference for anyone building on or extending mneme. Read this before writing code.
4
4
 
@@ -1,4 +1,4 @@
1
- # EXAMPLES.md - Mnemosyne Usage Guide
1
+ # EXAMPLES.md - mneme Usage Guide
2
2
 
3
3
  Real-world workflows, core concepts, and entity types used in mneme.
4
4
 
@@ -256,7 +256,7 @@ cp ~/old-qms/*.pdf inbox/
256
256
  # Let tornado figure it out
257
257
  mneme tornado --client cardio-monitor
258
258
 
259
- # === Mnemosyne Tornado ===
259
+ # === mneme Tornado ===
260
260
  #
261
261
  # Scanning inbox/... found 50 files
262
262
  #
@@ -1,4 +1,4 @@
1
- # Mnemosyne - Feature Roadmap
1
+ # mneme - Feature Roadmap
2
2
 
3
3
  ## Current Features (v0.4.0)
4
4
 
@@ -17,13 +17,14 @@
17
17
  | `mneme new` | Scaffold a new workspace from the bundled template (preferred over `init`) |
18
18
  | `mneme init` | Scaffold a workspace in cwd (legacy) |
19
19
  | `mneme --workspace <dir>` / `MNEME_HOME=<dir>` | Run any command against a specific workspace |
20
- | `mneme ingest` | Atomic ingest: source -> wiki + Memvid + schema |
20
+ | `mneme ingest` | Atomic ingest: source -> wiki + FTS5 index + schema |
21
21
  | `mneme resync` | Diff-aware re-ingest: 3-way merge (baseline / wiki / fresh ingest) via `git merge-file` |
22
22
  | `mneme resync-resolve` | Mark a conflicted resync page as resolved after editing out markers |
23
23
  | `mneme ingest-dir` | Batch ingest all files from a directory |
24
24
  | `mneme search` | Dual-layer search with `--client` scoping |
25
25
  | `mneme lint` | Health check: orphan pages, dead links, stale pages, citations, schema drift, coverage |
26
- | `mneme sync` | Sync wiki pages to Memvid |
26
+ | `mneme sync` | Sync wiki pages to FTS5 search index |
27
+ | `mneme reindex` | Rebuild search index from wiki pages |
27
28
  | `mneme drift` | Detect layer desynchronization |
28
29
  | `mneme stats` | Health overview |
29
30
  | `mneme repair` | Fix corrupted archives and schema |
@@ -31,6 +32,15 @@
31
32
  | `mneme recent` | Show last N activity log entries |
32
33
  | `mneme tags list` | List all tags with page counts |
33
34
  | `mneme tags merge` | Merge one tag into another across all pages |
35
+ | `mneme tags suggest <page>` | Build a *tag packet* for an LLM agent (page content + taxonomy + prompt) |
36
+ | `mneme tags apply <page> --add t1,t2 --remove t3` | Atomic tag update: rewrites frontmatter, updates schema/tags.json, re-syncs FTS5 index |
37
+ | `mneme tags bulk-suggest --client X --filter req- --limit 50` | Bulk tag packet for many pages in one round-trip |
38
+ | `mneme tags bulk-apply response.json` | Apply tag changes from an agent JSON response (per-page error tolerance) |
39
+ | `mneme entity suggest --client X` | Agent-driven entity-classification packet (entities + taxonomy + example pages) |
40
+ | `mneme entity apply --id <id> --type <type>` | Set one entity's type atomically |
41
+ | `mneme entity bulk-apply classifications.json` | Bulk classify entities from a JSON file |
42
+ | `mneme home --client X` / `--all-clients` | Generate a `HOME.md` navigation hub with Dataview queries and plain-markdown fallback |
43
+ | `mneme ingest-dir --preserve-structure` | Mirror source directory hierarchy into wiki subdirectories (resync auto-detects matching subpath) |
34
44
  | `mneme diff` | Git-aware diff for a wiki page |
35
45
  | `mneme snapshot` | Versioned zip archive of a client + git tag |
36
46
  | `mneme dedupe` | Detect near-duplicate wiki pages |
@@ -54,7 +64,7 @@
54
64
  | `mneme scan-repo` | Scan code repo, compare against QMS docs, find gaps |
55
65
  | `mneme tornado` | Inbox processor: auto-detect type/client, ingest, archive to sources |
56
66
  | `mneme ingest-csv` | CSV ingest: one row = one wiki page, with column-to-frontmatter mapping and auto trace links |
57
- | `mneme demo clean` | Remove all demo content: demo-retail client, demo/ folder, schema entries, memvid manifest, index/log entries |
67
+ | `mneme demo clean` | Remove all demo content: demo-retail client, demo/ folder, schema entries, search index entries, index/log entries |
58
68
 
59
69
  ### Web UI (localhost:3141)
60
70
 
@@ -1,7 +1,7 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: mneme-cli
3
- Version: 0.4.0
4
- Summary: Mnemosyne - CLI tool that turns documents into a searchable second brain. Ingest once, query forever.
3
+ Version: 0.5.1
4
+ Summary: mneme - CLI tool that turns documents into a searchable second brain. Ingest once, query forever.
5
5
  Author-email: Tolis Moustaklis <apostolos.moustaklis@gmail.com>
6
6
  License-Expression: MIT
7
7
  Project-URL: Homepage, https://github.com/tolism/mneme
@@ -9,7 +9,7 @@ Project-URL: Repository, https://github.com/tolism/mneme
9
9
  Project-URL: Issues, https://github.com/tolism/mneme/issues
10
10
  Project-URL: Documentation, https://github.com/tolism/mneme#readme
11
11
  Project-URL: Changelog, https://github.com/tolism/mneme/blob/main/CHANGELOG.md
12
- Keywords: knowledge-management,second-brain,cli,wiki,memvid,llm,qms,obsidian,traceability
12
+ Keywords: knowledge-management,second-brain,cli,wiki,sqlite,fts5,llm,qms,obsidian,traceability
13
13
  Classifier: Development Status :: 3 - Alpha
14
14
  Classifier: Environment :: Console
15
15
  Classifier: Intended Audience :: Developers
@@ -28,19 +28,21 @@ Classifier: Topic :: Text Processing :: Markup :: Markdown
28
28
  Requires-Python: >=3.9
29
29
  Description-Content-Type: text/markdown
30
30
  License-File: LICENSE
31
- Requires-Dist: memvid-sdk>=2.0.0
32
31
  Requires-Dist: portalocker>=2.0.0
33
32
  Provides-Extra: pdf
34
33
  Requires-Dist: pymupdf>=1.23.0; extra == "pdf"
34
+ Provides-Extra: xlsx
35
+ Requires-Dist: openpyxl>=3.1.0; extra == "xlsx"
35
36
  Provides-Extra: all
36
37
  Requires-Dist: pymupdf>=1.23.0; extra == "all"
38
+ Requires-Dist: openpyxl>=3.1.0; extra == "all"
37
39
  Provides-Extra: release
38
40
  Requires-Dist: build>=1.0.0; extra == "release"
39
41
  Requires-Dist: twine>=5.0.0; extra == "release"
40
42
  Dynamic: license-file
41
43
 
42
44
  <p align="center">
43
- <img src="https://raw.githubusercontent.com/tolism/mneme/main/assets/logo.png" alt="Mnemosyne" width="400">
45
+ <img src="https://raw.githubusercontent.com/tolism/mneme/main/assets/logo.png" alt="mneme" width="400">
44
46
  </p>
45
47
 
46
48
  <h1 align="center"></h1>
@@ -163,15 +165,25 @@ One installed CLI serves many projects — each workspace is just a directory.
163
165
  | `mneme search "<query>"` | Search across all layers |
164
166
  | `mneme draft --doc-type <t> --section <s> --client <c>` | Build a *write packet* for an LLM agent to produce one section |
165
167
  | `mneme validate writing-style <page>` | Build a *review packet* for an LLM agent to grade a page |
168
+ | `mneme tags suggest <page>` | Build a *tag packet* for an LLM agent to choose tags |
169
+ | `mneme tags apply <page> --add t1,t2 --remove t3` | Atomic tag update (frontmatter + schema + search index) |
170
+ | `mneme tags bulk-suggest --client X --filter req- --limit 50` | Build one *bulk packet* covering many pages |
171
+ | `mneme tags bulk-apply response.json` | Apply tag changes from an agent JSON response |
172
+ | `mneme entity suggest --client X` | Build an *entity-classification packet* for an LLM agent |
173
+ | `mneme entity apply --id <id> --type <type>` | Set one entity's type atomically |
174
+ | `mneme entity bulk-apply classifications.json` | Bulk classify many entities |
175
+ | `mneme home --client X` / `--all-clients` | Generate a `HOME.md` navigation hub (Dataview + fallback) |
176
+ | `mneme ingest-dir --recursive --preserve-structure` | Mirror source directory hierarchy into the wiki |
166
177
  | `mneme agent plan --goal "..." --doc-type <t> --client <c>` | Generate a deterministic TODO plan from the active profile |
167
178
  | `mneme agent next-task` | Return the next ready task in the active plan |
168
179
  | `mneme agent task-done <id>` | Mark a task as done |
169
- | `mneme sync` | Sync wiki to Memvid memory |
180
+ | `mneme sync` | Sync wiki pages to FTS5 search index |
181
+ | `mneme reindex` | Rebuild search index from wiki pages |
170
182
  | `mneme drift` | Detect layer desynchronization |
171
183
  | `mneme stats` | Health overview |
172
184
  | `mneme repair` | Fix corrupted archives |
173
185
 
174
- **Formats:** `.md`, `.txt`, `.pdf`
186
+ **Formats:** `.md`, `.txt`, `.pdf`, `.xlsx` (with `pip install "mneme-cli[xlsx]"`)
175
187
 
176
188
  ---
177
189
 
@@ -206,6 +218,121 @@ Mneme generates the plan deterministically from the active profile's section_not
206
218
 
207
219
  ---
208
220
 
221
+ ## End-to-end example: from raw documents to a tagged, searchable, validated knowledge base
222
+
223
+ A realistic walkthrough showing how the human, the CLI, and the LLM agent collaborate. Suppose you're building a knowledge base for **Parkiwatch**, a medical device for Parkinson's monitoring.
224
+
225
+ ### Step 1 — Scaffold a workspace (human, one-time)
226
+
227
+ ```bash
228
+ mneme new ~/projects/parkiwatch --name Parkiwatch --client parkiwatch --profile eu-mdr
229
+ cd ~/projects/parkiwatch
230
+ ```
231
+
232
+ Creates the workspace tree, sets the EU MDR writing-style profile, and initializes empty schema files.
233
+
234
+ ### Step 2 — Ingest source material (human)
235
+
236
+ ```bash
237
+ # Drop a folder of source documents into inbox/, then bulk-process
238
+ cp -r ~/Downloads/parkinson-research/* inbox/
239
+ mneme tornado --client parkiwatch
240
+
241
+ # Or ingest individual files
242
+ mneme ingest research-paper.pdf parkiwatch
243
+ mneme ingest-csv risk-register.csv parkiwatch --mapping risk-register
244
+ mneme ingest spec-table.xlsx parkiwatch # .xlsx renders sheets as markdown tables
245
+ mneme ingest-dir docs/ parkiwatch --recursive # walk subdirectories
246
+ ```
247
+
248
+ What happens per ingest: source file → wiki page in `wiki/parkiwatch/` → frontmatter with auto-extracted entities → entry in `index.md` → row in the FTS5 search DB → log entry.
249
+
250
+ ### Step 3 — Tag the new pages (LLM agent)
251
+
252
+ The new pages have only the auto-applied `parkiwatch` client tag. The agent now adds meaningful tags:
253
+
254
+ ```bash
255
+ # For each new page, the agent runs:
256
+ mneme tags suggest parkiwatch/research-paper > /tmp/packet.md
257
+ ```
258
+
259
+ The packet contains the page body, the current tag taxonomy (every tag in the workspace + usage counts), and a ready-to-paste prompt. **The LLM reads the packet** — it understands the content and decides on tags, preferring existing taxonomy entries when they fit. The LLM's response is JSON:
260
+
261
+ ```json
262
+ {"tags": ["clinical-trial", "iso-13485"], "new_tags": ["bradykinesia-detection"]}
263
+ ```
264
+
265
+ The agent then runs:
266
+
267
+ ```bash
268
+ mneme tags apply parkiwatch/research-paper \
269
+ --add clinical-trial,iso-13485,bradykinesia-detection
270
+ ```
271
+
272
+ Atomic operation: rewrites the wiki page frontmatter, updates `schema/tags.json`, re-indexes the page in FTS5 (so search picks up the new tags immediately), appends a log entry. **Repeat for every page** — the taxonomy grows, and subsequent pages tend to reuse existing tags (consistency).
273
+
274
+ ### Step 4 — Search the knowledge base (anyone)
275
+
276
+ ```bash
277
+ mneme search "bradykinesia" # BM25 + Porter stemming
278
+ mneme search "clinical evaluation" --client parkiwatch # client-scoped
279
+ ```
280
+
281
+ Sub-millisecond. Returns the page title, snippet (with `<b>highlights</b>`), tags, and BM25 score.
282
+
283
+ ### Step 5 — Produce a regulatory deliverable (LLM agent driving the agent loop)
284
+
285
+ ```bash
286
+ # Generate a deterministic plan from the active profile
287
+ mneme agent plan --goal "produce a Design Validation Report" \
288
+ --doc-type design-validation-report \
289
+ --client parkiwatch
290
+ # → 15 tasks: 11 section drafts + assemble + harmonize + review + submission-check
291
+
292
+ # Walk the plan
293
+ mneme agent next-task
294
+ # → Task: section-purpose-and-scope
295
+ # next_command: mneme draft --doc-type design-validation-report \
296
+ # --section purpose-and-scope --client parkiwatch
297
+
298
+ mneme draft --doc-type design-validation-report \
299
+ --section purpose-and-scope --client parkiwatch \
300
+ --query "purpose scope intended use" \
301
+ --out /tmp/write-packet.md
302
+
303
+ # The LLM reads /tmp/write-packet.md (which includes wiki search hits as evidence,
304
+ # the profile's writing-style rules, and a write prompt) and produces the section.
305
+ # The agent writes the section to wiki/parkiwatch/design-validation-report.md.
306
+
307
+ mneme agent task-done section-purpose-and-scope
308
+
309
+ # ... repeat for each section ...
310
+
311
+ # After all sections drafted:
312
+ mneme harmonize --client parkiwatch --fix # mechanical vocabulary swap
313
+ mneme validate writing-style parkiwatch/design-validation-report > /tmp/review.md
314
+ # The LLM reads /tmp/review.md, critiques every section, applies fixes in place
315
+ mneme agent task-done review-page
316
+
317
+ # Submission readiness
318
+ mneme validate consistency --client parkiwatch # cross-doc version checks
319
+ mneme trace gaps parkiwatch # find broken trace chains
320
+ mneme trace matrix parkiwatch --csv --out trace-matrix.csv # for the DHF
321
+ mneme snapshot parkiwatch # versioned audit zip
322
+ ```
323
+
324
+ ### Who does what
325
+
326
+ | Layer | Responsibility |
327
+ |---|---|
328
+ | **Human** | Drops sources, runs commands, reviews diffs, ships the deliverable |
329
+ | **mneme CLI** | Deterministic infrastructure: parses files, builds packets, indexes, traces, harmonizes vocabulary, generates plans, atomic state updates |
330
+ | **LLM agent** | All reasoning: classifying entities, choosing tags, drafting prose, grading writing style, deciding when a chain is complete |
331
+
332
+ mneme never calls an LLM. The LLM never bypasses mneme's atomic operations. They meet at the packet boundary.
333
+
334
+ ---
335
+
209
336
  ## How It Works
210
337
 
211
338
  ```
@@ -218,9 +345,9 @@ Mneme generates the plan deterministically from the active profile's section_not
218
345
  | Frontmatter, citations, [[wikilinks]]
219
346
  | You read and browse here
220
347
  |
221
- +---> Memory Layer (.mv2 archive)
222
- | Smart Frames, semantic embeddings
223
- | Machines query here (<5ms)
348
+ +---> Search Index (SQLite FTS5)
349
+ | BM25 ranking, Porter stemming
350
+ | Sub-millisecond queries, zero dependencies
224
351
  |
225
352
  +---> Schema Layer (JSON)
226
353
  entities.json - people, companies, products
@@ -228,9 +355,9 @@ Mneme generates the plan deterministically from the active profile's section_not
228
355
  tags.json - taxonomy
229
356
  ```
230
357
 
231
- Every `mneme ingest` writes both layers atomically. `mneme drift` catches desync. `mneme repair` fixes it.
358
+ Every `mneme ingest` writes the wiki page and updates the search index atomically. `mneme drift` catches desync. `mneme reindex` rebuilds the index from wiki pages.
232
359
 
233
- **Memvid is optional.** Without it, mneme runs as a wiki-only knowledge base with text search. Add `memvid-sdk` when you outgrow grep.
360
+ **Zero external dependencies for search.** SQLite FTS5 is built into Python's stdlib: no install, no API key, no capacity limit.
234
361
 
235
362
  ---
236
363
 
@@ -405,14 +532,15 @@ See `EXAMPLES.md` Example 13 for a full walkthrough with a real Parkiwatch scena
405
532
 
406
533
  ## When You Need This
407
534
 
408
- | Scale | Wiki alone | Wiki + Memvid |
409
- |---|---|---|
410
- | 5 docs | Plenty | Overkill |
411
- | 50 docs | Fine | Starting to help |
412
- | 500 docs | Grep takes 2-3s, misses semantic matches | 2ms, cross-client connections |
413
- | 5,000 docs | Unusable | Still 2ms |
535
+ | Scale | Search performance |
536
+ |---|---|
537
+ | 5 docs | Sub-millisecond |
538
+ | 50 docs | Sub-millisecond |
539
+ | 500 docs | Sub-millisecond, BM25 ranked |
540
+ | 5,000 docs | A few ms, still ranked by relevance |
541
+ | 50,000 docs | Tens of ms |
414
542
 
415
- Start wiki-only. Add the memory layer when search gets slow.
543
+ SQLite FTS5 scales transparently. No tuning, no capacity limits.
416
544
 
417
545
  ---
418
546
 
@@ -423,7 +551,7 @@ mneme/
423
551
  sources/ Raw documents (immutable, never modified)
424
552
  wiki/ Markdown knowledge pages (Obsidian-compatible)
425
553
  schema/ entities.json, graph.json, tags.json
426
- memvid/ .mv2 memory archives
554
+ search.db SQLite FTS5 search index
427
555
  core.py Engine (ingest, search, sync, drift, repair)
428
556
  config.py Configuration
429
557
  server.py Web dashboard
@@ -489,7 +617,7 @@ password = pypi-AgENd... # from https://test.pypi.org/manage/account/to
489
617
  This project builds on two foundational ideas:
490
618
 
491
619
  - **LLM Wiki pattern** by [Andrej Karpathy](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) -- the insight that LLMs should build and maintain a persistent, compounding wiki instead of re-deriving answers from raw documents on every query
492
- - **Memvid** by [Olow304/memvid](https://github.com/Olow304/memvid) -- single-file AI memory with sub-millisecond retrieval, no vector DB required
620
+ - **SQLite FTS5** -- the world's most-deployed embedded database, with built-in BM25 full-text search
493
621
  - **Original implementation** -- [tashisleepy/knowledge-engine](https://github.com/tashisleepy/knowledge-engine) -- the first version that fused both patterns into a dual-layer bridge
494
622
 
495
623
  ---