mind-palace-graph 0.3.0

Files changed (51)
  1. package/INSTALL.md +387 -0
  2. package/README.md +602 -0
  3. package/dist/api.d.ts +682 -0
  4. package/dist/api.js +660 -0
  5. package/dist/api.js.map +1 -0
  6. package/dist/cli.d.ts +95 -0
  7. package/dist/cli.js +856 -0
  8. package/dist/cli.js.map +1 -0
  9. package/dist/format.d.ts +16 -0
  10. package/dist/format.js +199 -0
  11. package/dist/format.js.map +1 -0
  12. package/dist/fuzzy.d.ts +45 -0
  13. package/dist/fuzzy.js +150 -0
  14. package/dist/fuzzy.js.map +1 -0
  15. package/dist/index.d.ts +9 -0
  16. package/dist/index.js +528 -0
  17. package/dist/index.js.map +1 -0
  18. package/dist/mcp-server.d.ts +24 -0
  19. package/dist/mcp-server.js +187 -0
  20. package/dist/mcp-server.js.map +1 -0
  21. package/dist/mind-palace.d.ts +148 -0
  22. package/dist/mind-palace.js +780 -0
  23. package/dist/mind-palace.js.map +1 -0
  24. package/dist/nodes.d.ts +57 -0
  25. package/dist/nodes.js +220 -0
  26. package/dist/nodes.js.map +1 -0
  27. package/dist/pagination.d.ts +41 -0
  28. package/dist/pagination.js +63 -0
  29. package/dist/pagination.js.map +1 -0
  30. package/dist/palace-format.d.ts +30 -0
  31. package/dist/palace-format.js +146 -0
  32. package/dist/palace-format.js.map +1 -0
  33. package/dist/rg.d.ts +34 -0
  34. package/dist/rg.js +288 -0
  35. package/dist/rg.js.map +1 -0
  36. package/dist/sources.d.ts +87 -0
  37. package/dist/sources.js +457 -0
  38. package/dist/sources.js.map +1 -0
  39. package/dist/tokens.d.ts +35 -0
  40. package/dist/tokens.js +95 -0
  41. package/dist/tokens.js.map +1 -0
  42. package/dist/types.d.ts +236 -0
  43. package/dist/types.js +8 -0
  44. package/dist/types.js.map +1 -0
  45. package/package.json +67 -0
  46. package/skills/mpg-context/SKILL.md +556 -0
  47. package/skills/mpg-context/references/anti-patterns.md +133 -0
  48. package/skills/mpg-context/references/integration.md +123 -0
  49. package/skills/mpg-context/references/mind-palace.md +217 -0
  50. package/skills/mpg-context/references/multi-agent.md +147 -0
  51. package/skills/mpg-context/references/sources.md +120 -0
package/README.md ADDED
@@ -0,0 +1,602 @@
1
+ # mpg — node-centric context retrieval for LLM harnesses
2
+
3
+ `mpg` is a CLI tool for retrieving **token-budgeted context nodes** from
4
+ files, command output, URLs, and stdin, designed to be consumed directly by
5
+ LLM harnesses.
6
+
7
+ The differentiator: a search returns **nodes** (a match + sized pre/post
8
+ context), not files or lines. Each node is sized in **tokens**, not lines,
9
+ and you can cap the **number of nodes** and the **total token budget**
10
+ independently. The depth of context is adjusted by `effort` rather than by
11
+ blindly loading more text.
12
+
13
+ Plus a persistent **mind palace** — named, addressable stashes of
14
+ search results that compose, intersect, prune, and form a graph. mpg is
15
+ how agents browse and trim long-term memory, not just how they grep.
16
+
17
+ ## Headline numbers vs alternatives
18
+
19
+ Pulled from the in-repo benchmark suite (`bench/` + [`BENCHMARKS.md`](./BENCHMARKS.md)).
20
+
21
+ | workload | mpg config | result vs alternatives |
22
+ | :--- | :--- | :--- |
23
+ | Literal recall on a markdown/JSON memory corpus | `--effort scan --clip 30` | 100% recall / 100% precision / **377 tokens** vs ripgrep's 1,197 (**3.2× cheaper than rg**) |
24
+ | Typo-tolerant search (drop / insert / swap / sub) | `--fuzzy` | **100%** recall vs ripgrep's 0%, ~12× cheaper than per-file embedding retrieval |
25
+ | Topic compaction at fixed token budget | `--effort scan --clip 30 --max-tokens N` | 67% downstream-Q&A pass vs LLM summarization's 33% — at **zero LLM input tokens** |
26
+ | Multi-turn agent convergence | `mpg_search` + `mpg_stash` | 24% fewer input tokens, **half the tool calls and turns** vs read+grep control |
27
+ | Mind palace set semantics | `--mp-compose` / `--mp-intersect` / `--mp-except` | 17/17 micro assertions pass |
28
+
29
+ See [`BENCHMARKS.md`](./BENCHMARKS.md) for the full scorecard across eight
30
+ tiers (micro, meso, conversational, semantic, typo, compaction, macro,
31
+ multi-turn) and the raw cell-by-cell breakdowns. Where mpg loses (single-
32
+ keyword greps, CLI cold start), the numbers are reported honestly.
33
+
34
+ ## Command reference
35
+
36
+ Every flag, grouped by category. The shape of every command is:
37
+
38
+ ```
39
+ mpg [<pattern>] [options]
40
+ ```
41
+
42
+ `<pattern>` is a ripgrep regex (or a literal string with `-F`). It is
43
+ required for searches and stash-producing operations, and omitted for
44
+ pure palace operations (`--mp-list`, `--mp-get`, `--mp-drop`, `--mp-link`,
45
+ `--mp-related`, `--mp-graph`, `--mp-prune-*`, `--ls`).
46
+
47
+ ### Sources — where to search
48
+
49
+ | Flag | Example | What it does |
50
+ | :--- | :--- | :--- |
51
+ | `-i, --in <path>...` | `mpg "TODO" --in src/ test/` | One or more files, dirs, or globs. Greedy: consumes non-flag args. Dirs recurse. |
52
+ | `--in @<file>` | `mpg "TODO" --in @files.txt` | Read path list from a file (one per line, `#` comments). |
53
+ | `--in @-` | `ls *.ts \| mpg "TODO" --in @-` | Read path list from stdin. |
54
+ | `--in a,b,c` | `mpg "TODO" --in src/,test/` | Comma-separated path list. |
55
+ | trailing paths | `mpg "TODO" src/ test/` | rg-style positionals; equivalent to `--in`. |
56
+ | `--cmd <cmd>` | `mpg "error" --cmd "git log --oneline -100"` | Search the stdout of a shell command. |
57
+ | `--stdin` | `cat README.md \| mpg "install"` | Search piped stdin (auto-detected when piped). |
58
+ | `-u, --url <url>` | `mpg "deprecated" -u https://example.com/docs` | Fetch URL body and search it. |
59
+
60
+ ### Node sizing — control context width and density
61
+
62
+ | Flag | Example | What it does |
63
+ | :--- | :--- | :--- |
64
+ | `-b, --before <tokens>` | `--before 800` | Tokens of context before each match. Default: 500. |
65
+ | `-a, --after <tokens>` | `--after 800` | Tokens of context after each match. Default: 500. |
66
+ | `-n, --max-nodes <n>` | `--max-nodes 20` | Hard cap on nodes returned. Default: 30. |
67
+ | `--max-tokens <n>` | `--max-tokens 8000` | Total token budget across all nodes. |
68
+ | `--strategy fill\|deep` | `--strategy deep` | Spend `--max-tokens` on more nodes (`fill`) or deeper per node (`deep`). |
69
+ | `-e, --effort <preset>` | `--effort scan` | Bundles: **`scan`** (20t/uncapped, index mode), **`quick`** (200t/10n, **default**), `normal` (500t/30n), `deep` (2000t/100n). |
70
+ | `--clip <N>` | `--clip 30` | **Sub-line snippet mode.** Drops line context; trims the match line to N chars on each side of the matched span (with ellipsis markers). Combine with `--effort scan` for the cheapest possible index. |
71
+ | `--sort <mode>` | `--sort recent` | Order nodes by source file mtime: `recent` (newest first), `oldest`, `default` (rg's order). Pairs with `scan` for a time-ordered memory index. |
72
+ | `--window-curve <mode>` | `--window-curve log` | Per-node window decays across ranks: `flat` (default), `linear` (full → ~10% at last rank), `log` (`full / log2(rank+2)`). Combine with `--sort recent` for "rich on what just changed, tight on older history." |
73
+ | `--fuzzy` | `--fuzzy` | Typo-tolerant search. Trigram-union driver + Levenshtein post-filter (edit distance ≤ 2). Handles drop / insert / substitute / swap typos. Skipped when the pattern already has regex metacharacters. |
74
+
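To make the `--window-curve` decay concrete, here is a minimal sketch of how a per-node window could be computed from a node's rank. The `log` branch follows the formula quoted in the table (`full / log2(rank+2)`, with rank starting at 0). The `linear` interpolation, the rounding, and the function name `windowForRank` are assumptions for illustration, not mpg's actual code.

```typescript
type WindowCurve = "flat" | "linear" | "log";

// rank is 0-based: rank 0 is the first node (for example, the most recent one).
function windowForRank(full: number, rank: number, total: number, curve: WindowCurve): number {
  if (curve === "log") {
    // Documented formula: full / log2(rank + 2), so rank 0 keeps the full window.
    return Math.round(full / Math.log2(rank + 2));
  }
  if (curve === "linear") {
    // Assumed interpolation: 100% at rank 0 down to roughly 10% at the last rank.
    const t = total > 1 ? rank / (total - 1) : 0;
    return Math.round(full * (1 - 0.9 * t));
  }
  return full; // flat: every node gets the same window
}
```

With a 500-token window, the `log` curve under these assumptions gives the third node (rank 2) half the budget of the first.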
75
+ ### Output
76
+
77
+ | Flag | Example | What it does |
78
+ | :--- | :--- | :--- |
79
+ | `-f, --format <fmt>` | `--format json` | `llm` (default), `markdown`, `json`, `text`. |
80
+ | `--color` / `--no-color` | `--no-color` | Force or disable ANSI color. Auto by default. |
81
+
82
+ ### Search options (forwarded to ripgrep)
83
+
84
+ | Flag | Example | What it does |
85
+ | :--- | :--- | :--- |
86
+ | `-I, --ignore-case` | `-I` | Case-insensitive match. |
87
+ | `-w, --word` | `-w` | Match whole words only. |
88
+ | `-F, --fixed-strings` | `-F` | Treat pattern as a literal string, not a regex. |
89
+ | `-U, --multiline` | `-U` | Allow patterns to span lines. |
90
+ | `--hidden` | `--hidden` | Include hidden files and dirs. |
91
+ | `--no-ignore` | `--no-ignore` | Don't respect `.gitignore`. |
92
+ | `--include <glob>` | `--include '*.ts'` | Only files matching glob (repeatable). |
93
+ | `--exclude <glob>` | `--exclude '*.test.ts'` | Skip files matching glob (repeatable). |
94
+ | `--type <lang>` | `--type ts` | ripgrep file-type filter (`ts`, `rust`, `py`, ...). |
95
+
96
+ ### Pagination
97
+
98
+ | Flag | Example | What it does |
99
+ | :--- | :--- | :--- |
100
+ | `--page <n>` | `--page 1` | Return only the Nth page (1-indexed). Paginates nodes (search / `--mp-get`) or stashes (`--mp-list`). |
101
+ | `--page-size <n>` | `--page-size 5` | Items per page. Defaults: 10 for nodes, 20 for stashes. |
102
+ | `--all` | `--all` | Disable pagination, return everything. |
103
+
104
+ ### Mind palace — instantiable short-term memory
105
+
106
+ A palace is a JSON file (default `./.mpg/mind-palace.json`) that holds
107
+ named **stashes** of search results. Stashes are addressable: future
108
+ searches can use them as inputs.
109
+
110
+ | Flag | Example | What it does |
111
+ | :--- | :--- | :--- |
112
+ | `--mp-stash <name> <note>` | `mpg "TODO" --in src/ --mp-stash auth "Auth TODOs"` | Run the search, save the result under `name`. Merges into an existing stash; pass `--mp-replace` to overwrite. |
113
+ | `--mp-stash-note <note>` | `--mp-stash-note "extra context"` | Set the note separately. |
114
+ | `--mp-stash-tag <tag>` / `--mp-tag <tag>` | `--mp-tag p0 --mp-tag auth` | Tag a stash (repeatable). |
115
+ | `--mp-replace` | `--mp-replace` | Overwrite an existing stash rather than merging. |
116
+ | `--mp-ttl <duration>` | `--mp-ttl 2h` | Auto-expire this stash after the duration (e.g. `30m`, `2h`, `7d`). |
117
+ | `--mp-list` | `mpg --mp-list` | List all stashes (with relative timestamps). |
118
+ | `--mp-list-tag <tag>` | `mpg --mp-list --mp-list-tag p0` | Filter list by tag (repeatable). |
119
+ | `--mp-get <name>` | `mpg --mp-get auth` | Show the full contents of one stash. |
120
+ | `--mp-drop <name>` | `mpg --mp-drop auth` | Remove a stash. |
121
+ | `--mp-from <name>` | `mpg "rate.limit" --mp-from auth` | Re-run a fresh search, scoped to the files in a stash. |
122
+ | `--mp-compose <a> <b>...` | `mpg "error" --mp-compose auth perf` | Run a search across the **union** of multiple stashes' files. |
123
+ | `--mp-except <a>` / `--mp-except <a> <b>...` | `mpg "TODO" --mp-except deprecated` | Search files NOT in the listed stash(es). |
124
+ | `--mp-intersect <a> <b>...` | `mpg "TODO" --mp-intersect auth perf` | Search files in **all** the listed stashes (set intersection). |
125
+ | `--mp-path <file>` | `--mp-path .mpg/task-42.json` | Use an isolated palace file. Also: `MPG_MIND_PALACE` env var. |
126
+ | `--mp-stash-locations` | `--mp-stash-locations` | Save only file:line pointers, drop context text (lean stashes). |
127
+
128
+ ### Pruning — keep the palace from growing unbounded
129
+
130
+ | Flag | Example | What it does |
131
+ | :--- | :--- | :--- |
132
+ | `--mp-prune-older-than <dur>` | `--mp-prune-older-than 7d` | Remove stashes not updated within the duration. |
133
+ | `--mp-prune-keep <n>` | `--mp-prune-keep 10` | Keep only the N most recently updated stashes. |
134
+ | `--mp-prune-tag <tag>` | `--mp-prune-tag temp` | Remove all stashes carrying the tag. |
135
+ | `--mp-prune-expired` | `--mp-prune-expired` | Remove stashes whose `--mp-ttl` has elapsed. |
136
+ | `--mp-prune-all` | `--mp-prune-all --mp-prune-confirm` | Clear the entire palace. `--mp-prune-confirm` required. |
137
+ | `--mp-prune-dry-run` | `--mp-prune-older-than 7d --mp-prune-dry-run` | Show what *would* be pruned, don't delete. **Use this first.** |
138
+
139
+ ### Relationships — make the *graph* in mind-palace-graph real
140
+
141
+ | Flag | Example | What it does |
142
+ | :--- | :--- | :--- |
143
+ | `--mp-link <from> <to> <type> [note]` | `mpg --mp-link auth perf depends-on "shared db"` | Create a directed edge. Types: `depends-on`, `related-to`, `see-also`, `parent-of`, `child-of`, `supersedes`, or any custom string. |
144
+ | `--mp-unlink <from> <to>` | `mpg --mp-unlink auth perf` | Remove a relationship. |
145
+ | `--mp-related <name>` | `mpg --mp-related auth` | Show all stashes connected to `name` (inbound + outbound). |
146
+ | `--mp-graph <name> [depth]` | `mpg --mp-graph auth 3` | Traversal graph from `name` up to `[depth]` (default 3). |
147
+
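The depth-limited traversal behind `--mp-graph <name> [depth]` can be pictured as a breadth-first walk over the stash edges. The sketch below is illustrative only: the `Edge` shape is hypothetical, and following edges in both directions (as `--mp-related` is described as doing) is an assumption about how the graph view behaves.

```typescript
// Hypothetical edge shape; the palace file's real schema may differ.
interface Edge {
  from: string;
  to: string;
  type: string;
}

// Breadth-first walk from `start`, following edges in both directions and
// stopping after `depth` hops. Returns each reached stash with its hop count.
function traverse(edges: Edge[], start: string, depth = 3): Map<string, number> {
  const seen = new Map<string, number>([[start, 0]]);
  let frontier: string[] = [start];
  for (let hop = 1; hop <= depth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const name of frontier) {
      for (const e of edges) {
        const neighbor = e.from === name ? e.to : e.to === name ? e.from : undefined;
        if (neighbor !== undefined && !seen.has(neighbor)) {
          seen.set(neighbor, hop);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return seen;
}
```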
148
+ ### Discovery & meta
149
+
150
+ | Flag | Example | What it does |
151
+ | :--- | :--- | :--- |
152
+ | `--ls` / `--tree` | `mpg --ls --in src/` | List/tree all searchable files under the given paths and exit. |
153
+ | `-h, --help` | `mpg --help` | Show inline help. |
154
+ | `-v, --version` | `mpg --version` | Print version. |
155
+
156
+ ### Environment variables
157
+
158
+ | Variable | Effect |
159
+ | :--- | :--- |
160
+ | `MPG_MIND_PALACE` | Override default palace path (`./.mpg/mind-palace.json`). |
161
+ | `MPG_PATTERN` | Default pattern if none is passed positionally. |
162
+
163
+ ### Common recipes (copy-paste)
164
+
165
+ ```bash
166
+ # Cheapest first-touch index — "browse my recent memory"
167
+ # 3.2x cheaper than rg at 100/100 recall/precision on memory-system content.
168
+ mpg "JWT|Bearer|ProviderContext" --in . \
169
+ --effort scan --clip 30 --sort recent --page 1 --page-size 10
170
+
171
+ # Typo-tolerant search (catches drop/insert/substitute/swap, edit dist <= 2)
172
+ mpg "PrvderiContext" --in . --fuzzy --effort scan --clip 30
173
+
174
+ # Topic compaction at a hard token budget (zero LLM cost)
175
+ mpg "auth|JWT|Bearer|ProviderContext" --in conductor/tracks \
176
+ --effort scan --clip 30 --sort recent --window-curve log \
177
+ --max-tokens 2000 --format llm > auth-compaction.md
178
+
179
+ # Quick recon
180
+ mpg "auth" --in . --effort quick --max-nodes 5
181
+
182
+ # Deep grounding for a final answer
183
+ mpg "session" --in src/auth/ --effort deep --max-tokens 16000
184
+
185
+ # Stash + tag + TTL
186
+ mpg "TODO" --in src/auth/ --mp-stash auth-todos "Auth TODOs" \
187
+ --mp-tag auth --mp-tag p0 --mp-ttl 7d
188
+
189
+ # Compose two stashes, re-search across their union
190
+ mpg "error" --mp-compose auth-todos perf-hotspots
191
+
192
+ # Re-search scoped to one stash's files
193
+ mpg "rate.limit" --mp-from auth-todos
194
+
195
+ # Link stashes into a graph, then traverse it
196
+ mpg --mp-link auth-todos perf-hotspots depends-on "shared db layer"
197
+ mpg --mp-graph auth-todos 3
198
+
199
+ # Prune safely
200
+ mpg --mp-prune-older-than 7d --mp-prune-dry-run # preview
201
+ mpg --mp-prune-older-than 7d # commit
202
+
203
+ # Use an isolated palace for one task
204
+ MPG_MIND_PALACE=./.mpg/task-42.json mpg "TODO" --in src/ --mp-stash t42 "..."
205
+
206
+ # Programmatic JSON for a harness
207
+ mpg "TODO" --in src/ --format json --page 1 --page-size 5
208
+ ```
209
+
210
+ ### Exit codes
211
+
212
+ | Code | Meaning |
213
+ | ---: | :--- |
214
+ | 0 | Matches found (or palace operation succeeded) |
215
+ | 1 | No matches (matches ripgrep's convention) |
216
+ | 2 | Bad arguments |
217
+ | 3 | ripgrep not installed |
218
+ | 4 | Mind palace error (unknown stash, etc.) |
219
+ | 99 | Unexpected error |
220
+
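A harness that shells out to `mpg` will usually want to branch on these codes. The following sketch maps the table above to named outcomes. The type and function names are hypothetical, and treating every unlisted code like `99` is an assumption.

```typescript
type MpgOutcome =
  | "ok"
  | "no-matches"
  | "bad-arguments"
  | "ripgrep-missing"
  | "palace-error"
  | "unexpected";

// Map a process exit code to the meaning listed in the exit-code table.
function classifyExit(code: number): MpgOutcome {
  switch (code) {
    case 0: return "ok";
    case 1: return "no-matches";
    case 2: return "bad-arguments";
    case 3: return "ripgrep-missing";
    case 4: return "palace-error";
    default: return "unexpected"; // 99 and anything not listed
  }
}

// "No matches" is a normal result, not a failure: a harness may simply
// retry with a broader pattern instead of surfacing an error.
function isFailure(code: number): boolean {
  const outcome = classifyExit(code);
  return outcome !== "ok" && outcome !== "no-matches";
}
```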
221
+ ---
222
+
223
+ ## Why
224
+
225
+ Most context tools are file-centric (`@filename`) or line-centric
226
+ (`grep -C N`). For an LLM harness, this is wasteful:
227
+
228
+ - A 500-line file might be 8,000 tokens, but the LLM only needs 200 tokens
229
+ of context around the actual match.
230
+ - `grep -C 50` gives 50 *lines* of context, regardless of how long those
231
+ lines are. One symbol-dense line is 10 tokens; one long paragraph is 80.
232
+ - Without a node cap, a single regex can flood the context with thousands
233
+ of hits.
234
+
235
+ `mpg` fixes this:
236
+
237
+ | Knob | What it does |
238
+ | :--- | :--- |
239
+ | `--before N` / `--after N` | Tokens of context around each match |
240
+ | `--max-nodes N` | Cap on the number of hits returned |
241
+ | `--max-tokens N` | Total token budget across all nodes |
242
+ | `--strategy fill\|deep` | How to use the budget (more nodes vs deeper per node) |
243
+ | `--effort scan\|quick\|normal\|deep\|auto` | Preset that bundles the above |
244
+ | `--in`, `--cmd`, `--url`, `--stdin` | Multi-source inputs |
245
+ | `--format llm\|markdown\|json\|text` | Output format (default: `llm`) |
246
+
247
+ The `llm` output format is the default and the point: clear delimiters,
248
+ file:line attribution, match highlighting, and a summary footer. Paste
249
+ the entire output into an LLM context and it knows exactly where every
250
+ snippet came from.
251
+
252
+ ## Install
253
+
254
+ Requires [Node 20+](https://nodejs.org) and [ripgrep](https://github.com/BurntSushi/ripgrep).
255
+
256
+ ```bash
257
+ npm install -g mind-palace-graph
258
+ # or from source:
259
+ git clone https://github.com/JadeZaher/mind-palace-graph.git
260
+ cd mind-palace-graph && npm install && npm run build && npm link
261
+ ```
262
+
263
+ For Claude / Gemini / coding agents: load `skills/mpg-context/SKILL.md`
264
+ into your system prompt or tool descriptions. It provides a decision tree
265
+ for effort levels, mind palace patterns, pagination, and error recovery.
266
+
267
+ Verify:
268
+
269
+ ```bash
270
+ mpg --version
271
+ mpg --help
272
+ ```
273
+
274
+ ## Quickstart
275
+
276
+ ```bash
277
+ # Find TODOs in src/, with 500 tokens of context, up to 20 nodes
278
+ mpg "TODO" --in src/ --max-nodes 20
279
+
280
+ # Multiple paths in one flag (greedy, like git add or curl)
281
+ mpg "TODO" --in src/ test/ docs/
282
+
283
+ # Trailing positional paths (rg-style)
284
+ mpg "TODO" src/ test/
285
+
286
+ # Directory: recurses into all files automatically
287
+ mpg "TODO" --in src/auth/
288
+
289
+ # Read path list from a file (one per line, # comments allowed)
290
+ mpg "TODO" --in @filelist.txt
291
+
292
+ # Read path list from stdin
293
+ printf 'src/\ntest/\n' | mpg "TODO" --in @-
294
+
295
+ # Comma-separated paths
296
+ mpg "TODO" --in src/,test/,docs/
297
+
298
+ # Quick recon: narrow context, 5 nodes
299
+ mpg "auth" --in . --effort quick --max-nodes 5
300
+
301
+ # Deep dive: wide context, capped at 16k tokens
302
+ mpg "session" --in src/auth/ --effort deep --max-tokens 16000
303
+
304
+ # Search the output of a command
305
+ mpg "error" --cmd "git log --oneline -100"
306
+
307
+ # Pipe content in
308
+ cat README.md | mpg "install"
309
+
310
+ # JSON for programmatic harness integration
311
+ mpg "TODO" --in src/ --format json
312
+
313
+ # Markdown for pasting into a doc or chat
314
+ mpg "TODO" --in src/ --format markdown
315
+ ```
316
+
317
+ The `--in` flag is greedy: it consumes every non-flag argument that
318
+ follows it, so `--in src/ test/ docs/` is equivalent to three separate
319
+ `--in` flags. To pass a path that starts with `-`, prefix it with `./`
320
+ (so `./-weird-name`) or use the `@file` syntax.
321
+
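The greedy behaviour described above can be sketched as a small argument scanner. This is not mpg's parser; `collectGreedy` is a hypothetical helper that stops at the next token beginning with `-` and also splits comma-separated values, which is one plausible reading of the rules in this section.

```typescript
// Collect the values of a greedy flag: every following argument up to the
// next token that starts with "-". Comma-separated values are split as well.
function collectGreedy(argv: string[], flag: string): string[] {
  const values: string[] = [];
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== flag) continue;
    for (let j = i + 1; j < argv.length; j++) {
      const arg = argv[j];
      if (arg === undefined || arg.charAt(0) === "-") break;
      for (const part of arg.split(",")) {
        if (part.length > 0) values.push(part);
      }
    }
  }
  return values;
}
```

Under this rule a path such as `./-weird-name` is accepted because it no longer starts with `-`.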
322
+ ## Output format: `llm`
323
+
324
+ The default. Designed to be both human-readable and directly consumable
325
+ by an LLM harness:
326
+
327
+ ```text
328
+ <mpg result pattern="TODO" nodes=4 tokens=~566 effort=normal strategy=fill>
329
+
330
+ --- NODE 1 of 4 | src/auth/login.ts:8 | ~196 tokens ---
331
+ 1 import { User } from './types';
332
+ 2 import { db } from './db';
333
+ 3 import { logger } from '../../utils/logger';
334
+ 4
335
+ 5 // Authentication flow for the public API.
336
+ 6 // Validates the user credentials, then issues a short-lived session token.
337
+ 7 export async function login(user: User, password: string) {
338
+ 8 >> // **TODO**: add rate limiting per IP+user to prevent brute force
339
+ 9 const valid = await db.users.verifyPassword(user.id, password);
340
+ 10 if (!valid) {
341
+ 11 logger.warn(`failed login for ${user.id}`);
342
+ 12 return null;
343
+ 13 }
344
+ 14 const session = await db.sessions.create({ userId: user.id });
345
+ 15 return session;
346
+ 16 }
347
+
348
+ --- NODE 2 of 4 | src/auth/session.ts:8 | ~166 tokens ---
349
+ ...
350
+
351
+ --- TOTAL ---
352
+ 4 nodes | ~566 tokens | 3 sources | 30ms
353
+ </mpg result>
354
+ ```
355
+
356
+ A harness can paste the entire `<mpg result>...</mpg result>` block into the
357
+ LLM's context, and the model immediately knows:
358
+
359
+ - The pattern being searched (`pattern="TODO"`)
360
+ - The total cost in tokens (`tokens=~566`)
361
+ - The source attribution of every snippet (`src/auth/login.ts:8`)
362
+ - The matched substring (highlighted with `>>` and `**bold**`)
363
+
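If a harness needs the attribution programmatically rather than as text, the node delimiter lines can be parsed with a regular expression. The pattern below is inferred only from the sample output in this section (`--- NODE 1 of 4 | src/auth/login.ts:8 | ~196 tokens ---`). It is a sketch, not a format guarantee, and the names are hypothetical.

```typescript
interface NodeHeader {
  index: number;
  total: number;
  file: string;
  line: number;
  tokens: number;
}

// Pattern inferred from the sample delimiter line; other variants may exist.
const NODE_HEADER = /^--- NODE (\d+) of (\d+) \| (.+):(\d+) \| ~(\d+) tokens ---$/;

// Returns the parsed header, or null when the line is not a node delimiter.
function parseNodeHeader(line: string): NodeHeader | null {
  const m = NODE_HEADER.exec(line.trim());
  if (m === null) return null;
  return {
    index: Number(m[1]),
    total: Number(m[2]),
    file: String(m[3]),
    line: Number(m[4]),
    tokens: Number(m[5]),
  };
}
```

For anything beyond quick scripts, `--format json` is likely the more robust integration point.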
364
+ ## Effort presets
365
+
366
+ | Preset | before | after | max-nodes | Use when |
367
+ | :--- | ---: | ---: | ---: | :--- |
368
+ | `quick` | 200 | 200 | 10 | Initial recon, "is this term in the codebase?" |
369
+ | `normal` | 500 | 500 | 30 | Default. Good for most LLM context windows. |
370
+ | `deep` | 2000 | 2000 | 100 | Final-answer grounding, large context windows. |
371
+ | `auto` | 500 | 500 | 30 | Same as `normal` for now; future: heuristic sizing. |
372
+
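Preset resolution amounts to a table lookup plus overrides. The sketch below copies the `quick`, `normal`, `deep`, and `auto` values from the table above. The assumption that explicit flags (for example `--max-nodes 5`) win over the preset is inferred from the recipes, and `resolveSizing` is a hypothetical name.

```typescript
interface Sizing {
  before: number;
  after: number;
  maxNodes: number;
}

type Effort = "quick" | "normal" | "deep" | "auto";

// Values for these four presets are copied from the table above.
const PRESETS: Record<Effort, Sizing> = {
  quick: { before: 200, after: 200, maxNodes: 10 },
  normal: { before: 500, after: 500, maxNodes: 30 },
  deep: { before: 2000, after: 2000, maxNodes: 100 },
  auto: { before: 500, after: 500, maxNodes: 30 },
};

// Assumed precedence: explicit flags override whatever the preset supplies.
function resolveSizing(effort: Effort, overrides: Partial<Sizing> = {}): Sizing {
  return { ...PRESETS[effort], ...overrides };
}
```

Under that assumption, `resolveSizing("quick", { maxNodes: 5 })` corresponds to `--effort quick --max-nodes 5`: 200 tokens on each side, at most 5 nodes.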
373
+ ## Token estimation
374
+
375
+ `mpg` uses a simple `chars/4` heuristic for token estimation. This is
376
+ fast and dependency-free, and accurate enough to make *budgeting*
377
+ decisions (sizing context windows, capping output). It is not a substitute
378
+ for a real tokenizer when billing accuracy matters.
379
+
380
+ Token counts are always approximate: the `tokens` field in JSON output is an
381
+ estimate, and the `llm` format prefixes every count with `~`.
382
+
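As a rough illustration of the `chars/4` heuristic, the sketch below estimates tokens and keeps whole lines until a budget is spent. The function names, the rounding up, and the line-by-line trimming policy are assumptions for illustration rather than mpg's implementation.

```typescript
// Approximate token count as characters / 4, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep whole lines, in order, until adding the next one would exceed the budget.
function takeLinesWithinBudget(lines: string[], budget: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const line of lines) {
    const cost = estimateTokens(line + "\n");
    if (used + cost > budget) break;
    kept.push(line);
    used += cost;
  }
  return kept;
}
```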
383
+ ## Mind palace
384
+
387
+ The mind palace is the LLM's **addressable short-term memory** for
388
+ search results. It works like RAM that the LLM can write to, read from,
389
+ and compose across multiple invocations.
390
+
391
+ The metaphor: while investigating a codebase, the LLM builds up a set of
392
+ named "stashes". Each stash holds a search result with a note. Stashes
393
+ can be used as inputs to *future* searches (so the LLM can scope a new
394
+ search to the files it cared about before) and can be composed together.
395
+
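The way stashes scope later searches comes down to set operations on file lists. The following sketch shows the union, intersection, and difference that `--mp-compose`, `--mp-intersect`, and `--mp-except` are described as performing. Representing a stash as a plain array of paths, and the helper names, are simplifications made for illustration.

```typescript
// Remove duplicates while preserving first-seen order.
function unique(files: string[]): string[] {
  return files.filter((f, i) => files.indexOf(f) === i);
}

// --mp-compose: files that appear in ANY of the stashes (union).
function composeFiles(stashes: string[][]): string[] {
  const all: string[] = [];
  for (const s of stashes) all.push(...s);
  return unique(all);
}

// --mp-intersect: files that appear in ALL of the stashes.
function intersectFiles(stashes: string[][]): string[] {
  const [first, ...rest] = stashes;
  if (first === undefined) return [];
  return unique(first).filter((f) => rest.every((s) => s.indexOf(f) !== -1));
}

// --mp-except: candidate files minus those found in any listed stash.
function exceptFiles(candidates: string[], stashes: string[][]): string[] {
  return unique(candidates).filter((f) => stashes.every((s) => s.indexOf(f) === -1));
}
```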
396
+ ### The lifecycle
397
+
398
+ | Operation | CLI | What it does |
399
+ | :--- | :--- | :--- |
400
+ | **Instantiate** | `--mp-stash <name> <note>` | Run the current search, save the result under `name` with `note`. |
401
+ | **Read** | `--mp-from <name>` | Re-run a search, but only in the files stashed under `name`. |
402
+ | **Compose** | `--mp-compose <a> <b> ...` | Re-run a search across the union of multiple stashes' file lists. |
403
+ | **Inspect** | `--mp-list [--mp-list-tag t]` | See all stashes, optionally filtered by tag. |
404
+ | **Inspect** | `--mp-get <name>` | Show the full contents of one stash. |
405
+ | **Free** | `--mp-drop <name>` | Remove a stash from the palace. |
406
+
407
+ Stashes default to **merge on duplicate name** (dedup by file:line);
408
+ pass `--mp-replace` to overwrite. Tag stashes with `--mp-tag <t>`
409
+ (repeatable) and filter the list with `--mp-list-tag <t>`.
410
+
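Merge-on-duplicate with dedup by file:line can be sketched as a keyed map. The `StashNode` shape is hypothetical, and letting the newer node win when both sides share a file:line is an assumption; the actual merge policy may differ.

```typescript
// Hypothetical node shape: only the fields needed for deduplication.
interface StashNode {
  file: string;
  line: number;
  text?: string;
}

// Merge incoming nodes into an existing stash, keeping one node per file:line.
// Assumption: when both sides have the same file:line, the incoming node wins.
function mergeNodes(existing: StashNode[], incoming: StashNode[]): StashNode[] {
  const byKey = new Map<string, StashNode>();
  for (const node of existing.concat(incoming)) {
    byKey.set(`${node.file}:${node.line}`, node);
  }
  return Array.from(byKey.values());
}
```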
411
+ ### Storage
412
+
413
+ A palace is a JSON file. Default location: `./.mpg/mind-palace.json`
414
+ (project-scoped). The LLM can have **multiple isolated palaces** by
415
+ pointing `--mp-path <file>` at a different file — one palace per task
416
+ or per session. Override at runtime with `MPG_MIND_PALACE=<file>`.
417
+
418
+ ### Example: multi-step investigation
419
+
420
+ ```bash
421
+ # 1. The LLM starts by stashing "auth" issues
422
+ mpg "TODO" --in src/auth/ --mp-stash auth-issues "Auth TODOs to fix" \
423
+ --mp-tag auth --mp-tag p0
424
+
425
+ # 2. Then "performance" hotspots from a different search
426
+ mpg "performance|slow|TODO" --in src/ --effort deep \
427
+ --mp-stash perf-hotspots "Performance concerns" --mp-tag perf
428
+
429
+ # 3. The LLM wants to search the files from EITHER stash: compose them (union)
430
+ mpg "TODO" --mp-compose auth-issues perf-hotspots
431
+
432
+ # 4. The LLM wants to re-search "rate" but only in files that had TODOs
433
+ mpg "rate.limit" --mp-from auth-issues
434
+
435
+ # 5. The LLM is done with auth-issues, frees the slot
436
+ mpg --mp-drop auth-issues
437
+ ```
438
+
439
+ The mind palace is **persistent** across `mpg` invocations within the
440
+ same project (the JSON file lives on disk) but **logical** — a fresh
441
+ palace can be created instantly by pointing `--mp-path` elsewhere.
442
+
443
+ ### Pruning & TTL
444
+
445
+ The palace can grow unbounded. mpg provides several ways to prune:
446
+
447
+ | Prune operation | CLI flag |
448
+ | :--- | :--- |
449
+ | By age | `--mp-prune-older-than 7d` — stashes not updated in 7 days |
450
+ | By count | `--mp-prune-keep 10` — keep only the 10 most recently updated |
451
+ | By tag | `--mp-prune-tag temp` — remove all stashes tagged `temp` |
452
+ | All | `--mp-prune-all --mp-prune-confirm` — clear entire palace |
453
+ | Expired TTL | Auto-pruned on every `--mp-list` / `--mp-get` |
454
+
455
+ `--mp-prune-dry-run` shows what WOULD be pruned without deleting.
456
+
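A dry run is easiest to reason about when selection and deletion are separate steps. The sketch below only selects: it returns the names that the age, count, and tag rules would remove and leaves deletion to the caller. The `StashMeta` and `PruneRule` shapes are hypothetical, and combining several rules as a union is an assumption.

```typescript
interface StashMeta {
  name: string;
  updatedAt: number; // epoch milliseconds
  tags: string[];
}

interface PruneRule {
  olderThanMs?: number; // like --mp-prune-older-than
  keep?: number;        // like --mp-prune-keep
  tag?: string;         // like --mp-prune-tag
}

// Returns the names that WOULD be removed, newest first; nothing is deleted here.
function selectPrunable(stashes: StashMeta[], rule: PruneRule, now: number): string[] {
  const doomed = new Set<string>();
  const newestFirst = stashes.slice().sort((a, b) => b.updatedAt - a.updatedAt);
  newestFirst.forEach((s, i) => {
    if (rule.olderThanMs !== undefined && now - s.updatedAt > rule.olderThanMs) doomed.add(s.name);
    if (rule.keep !== undefined && i >= rule.keep) doomed.add(s.name);
    if (rule.tag !== undefined && s.tags.indexOf(rule.tag) !== -1) doomed.add(s.name);
  });
  return newestFirst.filter((s) => doomed.has(s.name)).map((s) => s.name);
}
```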
457
+ TTL stashes auto-expire:
458
+
459
+ ```bash
460
+ mpg "debug_stmt" --in src/ --mp-stash temp-findings "Temp" \
461
+ --mp-ttl 2h --mp-tag temp
462
+ ```
463
+
464
+ Relative timestamps are shown in all listings (`just now`, `3m ago`, `2d ago`).
465
+
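Durations such as `30m`, `2h`, and `7d` are simple to parse. The sketch below handles only those three units, because they are the only ones shown in this README; whether mpg accepts others is not stated. Measuring the TTL from the time a stash was saved is also an assumption.

```typescript
const UNIT_MS: Record<string, number> = { m: 60000, h: 3600000, d: 86400000 };

// Parse durations such as "30m", "2h", "7d" into milliseconds.
function parseDuration(input: string): number {
  const m = /^(\d+)([mhd])$/.exec(input.trim());
  if (m === null) throw new Error(`unrecognized duration: ${input}`);
  const unit = UNIT_MS[String(m[2])];
  if (unit === undefined) throw new Error(`unrecognized unit in: ${input}`);
  return Number(m[1]) * unit;
}

// A stash is treated as expired once its TTL has fully elapsed.
function isExpired(stashedAt: number, ttl: string, now: number): boolean {
  return now - stashedAt >= parseDuration(ttl);
}
```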
466
+ ## Pagination
467
+
468
+ For finer-grained traversal of large result sets, `mpg` supports
469
+ opt-in pagination. The LLM can page through nodes in a search, stashes
470
+ in `--mp-list`, or nodes within a stash in `--mp-get`.
471
+
472
+ ```bash
473
+ # Page through a large search result
474
+ mpg "TODO" --in src/ --page 1 --page-size 5
475
+ mpg "TODO" --in src/ --page 2 --page-size 5
476
+
477
+ # Browse a large mind palace 20 stashes at a time
478
+ mpg --mp-list --page 1 --page-size 20
479
+
480
+ # Browse a stash's nodes 5 at a time
481
+ mpg --mp-get auth-issues --page 2 --page-size 5
482
+ ```
483
+
484
+ The LLM format annotates the result with pagination metadata:
485
+
486
+ ```text
487
+ <mpg result pattern="TODO" nodes=6 tokens=~816 effort=normal
488
+ page=1 of 3 page_size=2 total_items=6>
489
+ ```
490
+
491
+ The JSON format includes a `pagination` block:
492
+
493
+ ```json
494
+ {
495
+ "nodes": [ ... 2 items ... ],
496
+ "pagination": {
497
+ "page": 1,
498
+ "page_size": 2,
499
+ "total_items": 6,
500
+ "total_pages": 3,
501
+ "has_next": true,
502
+ "has_prev": false
503
+ }
504
+ }
505
+ ```
506
+
507
+ `--all` disables pagination (returns everything). Pagination is off
508
+ by default for backwards compatibility; the LLM harness should pass
509
+ `--page 1` in its tool wrapper to enable it.
510
+
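The `pagination` block can be derived from three inputs: the item list, the 1-indexed page, and the page size. Here is a minimal sketch that produces the same fields as the JSON example above. Clamping out-of-range pages to the nearest valid page is an assumption, and `paginate` is a hypothetical helper rather than the package's export.

```typescript
interface Pagination {
  page: number;
  page_size: number;
  total_items: number;
  total_pages: number;
  has_next: boolean;
  has_prev: boolean;
}

// Slice one 1-indexed page out of `items` and build the metadata block.
function paginate<T>(items: T[], page: number, pageSize: number): { items: T[]; pagination: Pagination } {
  const totalPages = Math.max(1, Math.ceil(items.length / pageSize));
  const current = Math.min(Math.max(1, page), totalPages); // assumed clamping
  const start = (current - 1) * pageSize;
  return {
    items: items.slice(start, start + pageSize),
    pagination: {
      page: current,
      page_size: pageSize,
      total_items: items.length,
      total_pages: totalPages,
      has_next: current < totalPages,
      has_prev: current > 1,
    },
  };
}
```

A harness can keep requesting the next page while `has_next` is true.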
511
+ ## Programmatic API
512
+
513
+ For TS/Node harnesses that prefer to embed `mpg` rather than shell
514
+ out, the `mind-palace-graph` package exports a programmatic API:
515
+
516
+ ```ts
517
+ import { search, stash, listStashes, toolDefinition } from "mind-palace-graph";
518
+
519
+ const result = await search({
520
+ pattern: "TODO",
521
+ in: ["src/"],
522
+ effort: "quick",
523
+ page: 1,
524
+ pageSize: 5,
525
+ });
526
+
527
+ // Stash the result for later composition.
528
+ await stash(result, {
529
+ name: "auth-issues",
530
+ note: "Auth TODOs to review",
531
+ tags: ["auth", "p0"],
532
+ });
533
+
534
+ // Browse stashes.
535
+ const all = listStashes();
536
+
537
+ // Expose to OpenAI / Anthropic function calling:
538
+ // pass `toolDefinition` in the request's `tools` array, adapted to the provider's schema.
539
+ ```
540
+
541
+ The API mirrors the CLI 1:1 — every flag has a corresponding option.
542
+
543
+ ## Path spec syntax
544
+
545
+ `--in` accepts any of:
546
+
547
+ | Form | Meaning |
548
+ | :--- | :--- |
549
+ | `--in path/to/file` | A single file |
550
+ | `--in path/to/dir` | A directory; recurses into all files |
551
+ | `--in '**/*.ts'` | A glob (single or multiple wildcards) |
552
+ | `--in src/ test/ docs/` | Multiple paths in one flag (greedy) |
553
+ | `--in src/,test/,docs/` | Comma-separated paths |
554
+ | `--in @list.txt` | Read paths from a file (one per line, `#` comments) |
555
+ | `--in @-` | Read paths from stdin (one per line, `#` comments) |
556
+ | `mpg PATTERN path/ ...` | Trailing positionals also act as paths (rg-style) |
557
+
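The `@list.txt` and `@-` forms both reduce to parsing a newline-separated list. The sketch below is one plausible reading of "one per line, `#` comments": blank lines are skipped and lines beginning with `#` are ignored. Whether trailing inline comments are supported is not stated, so they are left untouched here. `parsePathList` is a hypothetical name.

```typescript
// Parse the contents of a path list: one path per line, blank lines skipped,
// lines starting with "#" treated as comments.
function parsePathList(contents: string): string[] {
  return contents
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#");
}
```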
558
+ ## Architecture
559
+
560
+ ```
561
+ src/
562
+ cli.ts hand-rolled arg parser + effort preset resolution
563
+ types.ts shared types (Node, Source, Result, etc.)
564
+ tokens.ts token estimation + line trimming to budget
565
+ rg.ts ripgrep wrapper (rg --json)
566
+ sources.ts source resolution: file / glob / command / stdin / url
567
+ nodes.ts match → context node construction
568
+ format.ts llm / markdown / json / text output
569
+ mind-palace.ts stash / drop / list / compose / except / intersect
570
+ palace-format.ts llm-friendly formatters for palace output
571
+ pagination.ts page-through-the-results utility
572
+ api.ts programmatic API (search, stash, toolDefinition)
573
+ index.ts orchestrator (entry point)
574
+ ```
575
+
576
+ `mpg` does not reimplement grep. It shells out to `rg --json` for the
577
+ actual search, which is a fast, well-tested regex engine
578
+ and provides structured match data. Everything else — node building,
579
+ context sizing, output formatting — is in-process TypeScript.
580
+
581
+ ## Development
582
+
583
+ ```bash
584
+ npm run dev # run with tsx (no build step)
585
+ npm run build # compile to dist/
586
+ npm test # run smoke tests
587
+ ```
588
+
600
+ ## License
601
+
602
+ MIT.