@lh8ppl/claude-memory-kit 0.3.2 → 0.3.3
- package/README.md +6 -4
- package/package.json +1 -1
- package/src/auto-extract.mjs +16 -5
- package/src/import-claude-md.mjs +7 -2
- package/src/index-rebuild.mjs +91 -3
- package/src/inject-context.mjs +16 -6
- package/src/lazy-compress.mjs +81 -0
- package/src/mcp-server.mjs +9 -1
- package/src/read-core.mjs +65 -3
- package/src/remember-core.mjs +15 -15
- package/src/sanitize.mjs +30 -0
- package/src/search.mjs +119 -4
- package/src/session-end-tasks.mjs +40 -3
- package/src/subcommands.mjs +39 -5
- package/template/.claude/skills/memory-search/SKILL.md +20 -0
- package/template/CLAUDE.md.template +1 -0
package/README.md
CHANGED
|
@@ -47,11 +47,13 @@ cmk doctor # verify, then restart Claude Code
|
|
|
47
47
|
Inside Claude Code:
|
|
48
48
|
|
|
49
49
|
```text
|
|
50
|
-
/plugin marketplace add LH8PPL/claude-memory-kit
|
|
51
|
-
/plugin install claude-memory-kit
|
|
50
|
+
/plugin marketplace add LH8PPL/claude-memory-kit # add this repo as a plugin source (once per machine)
|
|
51
|
+
/plugin install claude-memory-kit # install the global machinery — hooks + skills (once per machine)
|
|
52
|
+
cd ~/my-project # the project you want memory in — bootstrap scaffolds into the CURRENT dir
|
|
53
|
+
/claude-memory-kit:bootstrap # scaffold this project's context/ memory tree (once per project)
|
|
52
54
|
```
|
|
53
55
|
|
|
54
|
-
|
|
56
|
+
The first two commands are **global** (per machine); `bootstrap` is **per project** — run it again (after a `cd`) in each project. The plugin bundles the hooks + the `bootstrap`, `memory-write`, and `memory-search` skills, so it's complete without the npm CLI (add the CLI later only if you want `cmk search` / `cmk doctor` / cron).
|
|
55
57
|
|
|
56
58
|
## CLI
|
|
57
59
|
|
|
@@ -62,7 +64,7 @@ Most-used commands (full list via `cmk --help`):
|
|
|
62
64
|
| `cmk install` | Scaffold `context/` + the `memory-write`/`memory-search` skills + `.gitignore` + CLAUDE.md block + wire hooks (`--no-hooks` for scaffold-only) |
|
|
63
65
|
| `cmk doctor` | Run HC-1..HC-8 health checks, surface repair commands |
|
|
64
66
|
| `cmk repair --hooks` / `--locks` / `--index` / `--all` | Idempotent self-repair |
|
|
65
|
-
| `cmk search "<query>" [--mode keyword\|semantic\|hybrid] [--scope facts\|transcripts]` | Search memory — by meaning with the embedder (hybrid default after `--with-semantic`); `--scope transcripts` = the raw session record |
|
|
67
|
+
| `cmk search "<query>" [--mode keyword\|semantic\|hybrid] [--scope facts\|transcripts\|decisions]` | Search memory — by meaning with the embedder (hybrid default after `--with-semantic`); `--scope transcripts` = the raw session record; `--scope decisions` = the decision journal (history / "what did we reject") |
|
|
66
68
|
| `cmk get <id…>` / `cmk timeline <id>` / `cmk cite <id>` / `cmk recent-activity` | Read the index back — full fact bodies + provenance, sequential context around an observation, a canonical citation link, recent changes (the CLI side of the `mk_*` MCP read tools) |
|
|
67
69
|
| `cmk roll --scope now\|today\|recent` | Manually trigger a compression pipeline |
|
|
68
70
|
| `cmk register-crons [--dry-run] [--unregister]` | Register daily + weekly jobs with cron / launchd / Task Scheduler |
|
package/package.json
CHANGED
package/src/auto-extract.mjs
CHANGED
|
@@ -53,6 +53,7 @@ import { hashContent } from './content-hash.mjs';
|
|
|
53
53
|
import { memoryWrite } from './memory-write.mjs';
|
|
54
54
|
import { writeFact } from './write-fact.mjs';
|
|
55
55
|
import { buildRichFactBody, slugifyFact } from './rich-fact.mjs';
|
|
56
|
+
import { sanitizeForTitle } from './sanitize.mjs';
|
|
56
57
|
import { HaikuTimeoutError } from './compressor.mjs';
|
|
57
58
|
import { pidIsAlive } from './lock-discipline.mjs';
|
|
58
59
|
import { nowIso } from './audit-log.mjs';
|
|
@@ -638,9 +639,16 @@ function routeMedium({ candidate, projectRoot, ts }) {
|
|
|
638
639
|
// Direct-to-fact-store (NOT the review queue the terse medium-trust path uses):
|
|
639
640
|
// the point of Task 103 is AUTOMATIC native-parity capture — native writes its
|
|
640
641
|
// fact files with no approval step, so parity requires the same. The fact store
|
|
641
|
-
// is searchable-but-not-full-trust-injected, writeFact
|
|
642
|
-
//
|
|
643
|
-
// later explicit `cmk remember` (trust:high) supersedes. See design §6.4.
|
|
642
|
+
// is searchable-but-not-full-trust-injected, writeFact screens the body +
|
|
643
|
+
// frontmatter (Poison_Guard + home-path sanitize + schema + INDEX/reindex), and
|
|
644
|
+
// a later explicit `cmk remember` (trust:high) supersedes. See design §6.4.
|
|
645
|
+
//
|
|
646
|
+
// CAVEAT (F-V0.3.3-2): writeFact does NOT sanitize the SLUG/filename — the slug
|
|
647
|
+
// is `slugifyFact(title)` derived HERE, before writeFact runs. So the title MUST
|
|
648
|
+
// be routed through sanitizeForTitle first, or a home path in Haiku's candidate
|
|
649
|
+
// title (auto-extract runs every turn, no user action) leaks the username into a
|
|
650
|
+
// COMMITTED filename. This was the same bug as cmk remember — the old comment
|
|
651
|
+
// here wrongly assumed "writeFact already sanitizes" the whole write.
|
|
644
652
|
//
|
|
645
653
|
// trust:medium / write_source:auto-extract marks it as a Haiku synthesis
|
|
646
654
|
// (proposal-grade), below the explicit-high tier. The body is built by the SAME
|
|
@@ -652,11 +660,14 @@ function routeRichFact({ candidate, projectRoot, ts }) {
|
|
|
652
660
|
why: candidate.why,
|
|
653
661
|
how: candidate.how,
|
|
654
662
|
});
|
|
663
|
+
// Sanitize the title BEFORE deriving the slug (F-V0.3.3-2) — writeFact won't
|
|
664
|
+
// catch a home path in the slug/filename. One helper, same as cmk remember.
|
|
665
|
+
const title = sanitizeForTitle(candidate.title);
|
|
655
666
|
return writeFact({
|
|
656
667
|
tier: 'P',
|
|
657
668
|
type: candidate.type,
|
|
658
|
-
slug: slugifyFact(
|
|
659
|
-
title
|
|
669
|
+
slug: slugifyFact(title),
|
|
670
|
+
title,
|
|
660
671
|
body,
|
|
661
672
|
writeSource: 'auto-extract',
|
|
662
673
|
trust: 'medium',
|
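The ordering fix above (sanitize the title, then derive the slug) can be illustrated with a minimal sketch. The two helpers below are hypothetical stand-ins: the real `sanitizeForTitle` and `slugifyFact` live in `sanitize.mjs` and `rich-fact.mjs`, and their implementations are not shown in this hunk, so the regexes and the 60-character cap here are assumptions chosen only to make the leak visible.

```javascript
// Hypothetical stand-in for sanitizeForTitle (sanitize.mjs); assumed behavior only.
function sanitizeForTitleSketch(text) {
  return String(text)
    // drop <private>...</private> spans entirely
    .replace(/<private>[\s\S]*?<\/private>/gi, '')
    // abstract POSIX home paths: /home/<name> or /Users/<name> becomes ~
    .replace(/\/(?:home|Users)\/[^\/\s]+/g, '~')
    .trim();
}

// Hypothetical stand-in for slugifyFact (rich-fact.mjs); assumed behavior only.
function slugifySketch(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse every non-alphanumeric run to one dash
    .replace(/^-+|-+$/g, '')     // trim leading/trailing dashes
    .slice(0, 60);
}

const rawTitle = 'Build output lives in /home/alice/work/dist';

// Slug derived from the raw title: the username ends up in the filename.
const unsafeSlug = slugifySketch(rawTitle);

// Slug derived after sanitizing: the home directory is abstracted first.
const safeSlug = slugifySketch(sanitizeForTitleSketch(rawTitle));

console.log({ unsafeSlug, safeSlug });
```

The point of the sketch is the call order, not the regexes: once the slug has been computed from an unsanitized title, no later body or frontmatter sanitization can change the filename.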
package/src/import-claude-md.mjs
CHANGED
|
@@ -32,7 +32,7 @@ import { appendAuditEntry, nowIso, REASON_CODES } from './audit-log.mjs';
|
|
|
32
32
|
import { ERROR_CATEGORIES, errorResult } from './result-shapes.mjs';
|
|
33
33
|
import { writeFact } from './write-fact.mjs';
|
|
34
34
|
import { slugifyFact } from './rich-fact.mjs';
|
|
35
|
-
import { sanitizeHomePaths } from './sanitize.mjs';
|
|
35
|
+
import { sanitizeHomePaths, sanitizeForTitle } from './sanitize.mjs';
|
|
36
36
|
import { parse as parseFrontmatter } from './frontmatter.mjs';
|
|
37
37
|
|
|
38
38
|
const DEFAULT_FILE = 'CLAUDE.md';
|
|
@@ -280,7 +280,12 @@ export async function importClaudeMd({
|
|
|
280
280
|
// absolute --file argument (the D-51 name-privacy class).
|
|
281
281
|
const sourceFileField = sanitizeHomePaths(fileRel);
|
|
282
282
|
for (const p of proposals) {
|
|
283
|
-
|
|
283
|
+
// Sanitize BEFORE deriving the title/slug (F-V0.3.3-2) — p.text is the RAW
|
|
284
|
+
// rule text (the `sanitized` above feeds only the dedup key/id), and the
|
|
285
|
+
// slug becomes the committed filename, which writeFact won't sanitize. A
|
|
286
|
+
// CLAUDE.md rule mentioning C:\Users\you\ would otherwise leak the username
|
|
287
|
+
// into the imported fact's filename. Same one helper as the other slug sites.
|
|
288
|
+
const title = sanitizeForTitle(p.text).split('\n')[0].slice(0, 80);
|
|
284
289
|
let slug = slugifyFact(title);
|
|
285
290
|
if (usedSlugs.has(`${p.type}/${slug}`)) slug = `${slug}-l${p.line}`;
|
|
286
291
|
usedSlugs.add(`${p.type}/${slug}`);
|
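As a rough sketch of the import loop above: each proposal's text is sanitized, cut to its first line and 80 characters, slugged, and then given a line-number suffix if the same type already used that slug. `sanitizeSketch` and `slugSketch` are hypothetical stand-ins for `sanitizeForTitle` and `slugifyFact`, whose real implementations are not part of this hunk; only the collision handling mirrors the diff.

```javascript
// Hypothetical stand-in for sanitizeForTitle: abstracts Windows home paths only.
function sanitizeSketch(text) {
  return String(text).replace(/[A-Za-z]:\\Users\\[^\\\s]+/g, '~');
}

// Hypothetical stand-in for slugifyFact.
function slugSketch(title) {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

function assignSlugs(proposals) {
  const usedSlugs = new Set();
  return proposals.map((p) => {
    // Sanitize first, then keep the first line, capped at 80 characters.
    const title = sanitizeSketch(p.text).split('\n')[0].slice(0, 80);
    let slug = slugSketch(title);
    // A repeat within the same type gets the source line number as a suffix.
    if (usedSlugs.has(`${p.type}/${slug}`)) slug = `${slug}-l${p.line}`;
    usedSlugs.add(`${p.type}/${slug}`);
    return { type: p.type, title, slug };
  });
}

const results = assignSlugs([
  { type: 'feedback', line: 3, text: 'Never edit C:\\Users\\you\\secrets.env' },
  { type: 'feedback', line: 9, text: 'Never edit C:\\Users\\you\\secrets.env' },
]);

console.log(results.map((r) => r.slug));
```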
package/src/index-rebuild.mjs
CHANGED
|
@@ -295,6 +295,40 @@ function parseSource(source, { projectRoot, userDir }) {
|
|
|
295
295
|
|
|
296
296
|
// --- DB write helpers -------------------------------------------------
|
|
297
297
|
|
|
298
|
+
// Bug 1 (2026-06-16, fact P-UCG4RKNL): the kit dual-writes a fact to BOTH the
|
|
299
|
+
// MEMORY.md scratchpad bullet AND its granular archive file, both carrying the
|
|
300
|
+
// SAME content-addressed id. `observations.id` is a global PRIMARY KEY, so a
|
|
301
|
+
// plain INSERT of the second source's row collided (`UNIQUE constraint failed:
|
|
302
|
+
// observations.id`) and aborted the whole reindex. The fix is id-keyed upsert
|
|
303
|
+
// with deterministic ARCHIVE-BEATS-SCRATCHPAD precedence — validated against
|
|
304
|
+
// three markdown-first analogs that all key replacement on the id, never the
|
|
305
|
+
// file (TencentDB `ON CONFLICT(record_id) DO UPDATE`; basic-memory
|
|
306
|
+
// resolve-permalink precedence + partial unique index; memweave content-hash
|
|
307
|
+
// dedup). See docs/research/2026-06-16-index-uniqueness-id-vs-file-scoped-delete.md.
|
|
308
|
+
//
|
|
309
|
+
// Two precedence-keyed paths, order-INDEPENDENT (the source walk order must not
|
|
310
|
+
// change the surviving row):
|
|
311
|
+
// - fact (granular archive = the canonical Why/How home) → explicit
|
|
312
|
+
// DELETE-by-id then INSERT: always wins, overwriting any scratchpad row for
|
|
313
|
+
// the id.
|
|
314
|
+
// - scratchpad (the hot working-copy bullet) → ON CONFLICT(id) DO NOTHING:
|
|
315
|
+
// inserts only when no row exists yet; never overwrites a fact row.
|
|
316
|
+
// Whichever is walked first, the fact row is the one that survives.
|
|
317
|
+
//
|
|
318
|
+
// FTS5 CORRECTNESS (the self-review catch): the fact path uses an explicit
|
|
319
|
+
// DELETE-by-id, NOT `INSERT OR REPLACE`. `observations_fts` is an
|
|
320
|
+
// external-content FTS5 table whose only safe delete path is the
|
|
321
|
+
// `obs_after_delete` trigger firing the 'delete' SENTINEL with the OLD row's
|
|
322
|
+
// column values (index-db.mjs §4.4.3 comment). `INSERT OR REPLACE` reuses the
|
|
323
|
+
// conflicting row's rowid, so its internal delete+insert leaves the OLD
|
|
324
|
+
// scratchpad body orphaned in the FTS index (it keeps MATCH-ing with no backing
|
|
325
|
+
// row — silent stale-hit corruption). An explicit `DELETE FROM observations
|
|
326
|
+
// WHERE id = ?` fires obs_after_delete cleanly (sentinel removes the old terms),
|
|
327
|
+
// then the plain INSERT fires obs_after_insert. This is the same delete-then-
|
|
328
|
+
// insert pattern every other writer in the kit uses against this table.
|
|
329
|
+
|
|
330
|
+
const DELETE_OBSERVATION_BY_ID_SQL = `DELETE FROM observations WHERE id = ?`;
|
|
331
|
+
|
|
298
332
|
const INSERT_OBSERVATION_SQL = `
|
|
299
333
|
INSERT INTO observations
|
|
300
334
|
(id, tier, source_file, source_line, source_sha1, heading_path, body,
|
|
@@ -304,6 +338,16 @@ VALUES
|
|
|
304
338
|
@write_source, @trust, @created_at, @superseded_by, @deleted_at)
|
|
305
339
|
`;
|
|
306
340
|
|
|
341
|
+
const INSERT_SCRATCHPAD_OBSERVATION_SQL = `
|
|
342
|
+
INSERT INTO observations
|
|
343
|
+
(id, tier, source_file, source_line, source_sha1, heading_path, body,
|
|
344
|
+
write_source, trust, created_at, superseded_by, deleted_at)
|
|
345
|
+
VALUES
|
|
346
|
+
(@id, @tier, @source_file, @source_line, @source_sha1, @heading_path, @body,
|
|
347
|
+
@write_source, @trust, @created_at, @superseded_by, @deleted_at)
|
|
348
|
+
ON CONFLICT(id) DO NOTHING
|
|
349
|
+
`;
|
|
350
|
+
|
|
307
351
|
const UPSERT_FILE_SQL = `
|
|
308
352
|
INSERT INTO files (path, mtime, sha1, indexed_at)
|
|
309
353
|
VALUES (@path, @mtime, @sha1, @indexed_at)
|
|
@@ -322,10 +366,48 @@ const DELETE_OBSERVATIONS_FOR_PATH_SQL = `DELETE FROM observations WHERE source_
|
|
|
322
366
|
*/
|
|
323
367
|
function replaceObservationsForFile(db, { source, observations, mtime, sha1, projectRoot, userDir, now }) {
|
|
324
368
|
const source_file = relativeSource(source.path, { projectRoot, userDir });
|
|
369
|
+
// File-scoped delete clears THIS file's own rows so a re-index of a changed
|
|
370
|
+
// file is idempotent. It only matches rows whose source_file is this path, so
|
|
371
|
+
// a fact's row (source_file = context/memory/*.md) is untouched when the
|
|
372
|
+
// scratchpad (context/MEMORY.md) is re-indexed, and vice versa — the
|
|
373
|
+
// cross-file id collision is handled by the precedence-keyed insert below,
|
|
374
|
+
// NOT by this delete (Bug 1).
|
|
325
375
|
db.prepare(DELETE_OBSERVATIONS_FOR_PATH_SQL).run(source_file);
|
|
326
|
-
|
|
327
|
-
for
|
|
328
|
-
|
|
376
|
+
// Archive-beats-scratchpad precedence (Bug 1): a fact row wins the id by
|
|
377
|
+
// explicitly deleting any existing row for that id first (firing the FTS
|
|
378
|
+
// 'delete' sentinel cleanly) then inserting; a scratchpad row yields via
|
|
379
|
+
// ON CONFLICT(id) DO NOTHING. Within a FULL pass (reindexFull, or a
|
|
380
|
+
// reindexBoot that re-walks both sources) this is order-independent — the
|
|
381
|
+
// fact row always wins (listObservationSources walks scratchpad-before-facts
|
|
382
|
+
// per tier, but either order lands the same surviving row).
|
|
383
|
+
//
|
|
384
|
+
// INCREMENTAL caveat (skill-review I1): on the mtime-skip boot path / the
|
|
385
|
+
// single-file watcher path, only the CHANGED source is re-processed. If a
|
|
386
|
+
// fact file is removed while its scratchpad twin (same id) is untouched, the
|
|
387
|
+
// orphan-prune drops the fact row and the skipped scratchpad's DO-NOTHING
|
|
388
|
+
// insert never re-fires — so the id momentarily vanishes from search until
|
|
389
|
+
// the scratchpad is next edited (which re-inserts it). `cmk forget` does NOT
|
|
390
|
+
// hit this: it tombstones the fact AND scrubs the scratchpad bullet in the
|
|
391
|
+
// same op (forget.mjs scrubAllScratchpads), so the only window is a manual
|
|
392
|
+
// hand-`rm` of a context/memory/*.md leaving the bullet behind — a rare,
|
|
393
|
+
// self-healing transition, documented + tested rather than resurrected.
|
|
394
|
+
//
|
|
395
|
+
// The DELETE-by-id is UNQUALIFIED (no tier/source_file filter) by design and
|
|
396
|
+
// safe: ids are content-addressed WITH the tier as a prefix (`P-`/`L-`/`U-`),
|
|
397
|
+
// so a P-tier and U-tier fact can never share an id — no cross-tier delete is
|
|
398
|
+
// possible. (Defended by the P/U-same-content tier test below.)
|
|
399
|
+
if (source.kind === 'fact') {
|
|
400
|
+
const deleteById = db.prepare(DELETE_OBSERVATION_BY_ID_SQL);
|
|
401
|
+
const insert = db.prepare(INSERT_OBSERVATION_SQL);
|
|
402
|
+
for (const obs of observations) {
|
|
403
|
+
deleteById.run(obs.id);
|
|
404
|
+
insert.run(obs);
|
|
405
|
+
}
|
|
406
|
+
} else {
|
|
407
|
+
const insert = db.prepare(INSERT_SCRATCHPAD_OBSERVATION_SQL);
|
|
408
|
+
for (const obs of observations) {
|
|
409
|
+
insert.run(obs);
|
|
410
|
+
}
|
|
329
411
|
}
|
|
330
412
|
db.prepare(UPSERT_FILE_SQL).run({
|
|
331
413
|
path: source_file,
|
|
@@ -370,6 +452,9 @@ export function reindexBoot({ projectRoot, userDir, db, now }) {
|
|
|
370
452
|
userDir,
|
|
371
453
|
now: ts,
|
|
372
454
|
});
|
|
455
|
+
// observationsAffected counts insert-ATTEMPTS, not net rows: a fact that
|
|
456
|
+
// displaces a same-id scratchpad row (Bug 1 precedence) is net-zero but
|
|
457
|
+
// counts as one here. It's a "work done" metric, not a row-count invariant.
|
|
373
458
|
return result.observations.length;
|
|
374
459
|
});
|
|
375
460
|
|
|
@@ -537,6 +622,9 @@ export function reindexFull({ projectRoot, userDir, db, now }) {
|
|
|
537
622
|
userDir,
|
|
538
623
|
now: ts,
|
|
539
624
|
});
|
|
625
|
+
// observationsAffected counts insert-ATTEMPTS, not net rows: a fact that
|
|
626
|
+
// displaces a same-id scratchpad row (Bug 1 precedence) is net-zero but
|
|
627
|
+
// counts as one here. It's a "work done" metric, not a row-count invariant.
|
|
540
628
|
return result.observations.length;
|
|
541
629
|
});
|
|
542
630
|
|
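The archive-beats-scratchpad precedence described in the comments above can be modelled without a database. This is a simplified sketch: a `Map` keyed by id stands in for the `observations` table and its primary key, and it says nothing about the FTS5 trigger behavior, which only the real SQLite path exercises. The `reindex` helper and the sample bodies are illustrative assumptions.

```javascript
// In-memory stand-in for the observations table, keyed by id (the PRIMARY KEY).
function applySource(table, source) {
  for (const obs of source.observations) {
    if (source.kind === 'fact') {
      // Archive row always wins: delete any existing row for the id, then insert.
      table.delete(obs.id);
      table.set(obs.id, { ...obs, kind: 'fact' });
    } else if (!table.has(obs.id)) {
      // Scratchpad row yields: the equivalent of ON CONFLICT(id) DO NOTHING.
      table.set(obs.id, { ...obs, kind: 'scratchpad' });
    }
  }
}

// Walk the sources in the given order against a fresh table.
function reindex(sources) {
  const table = new Map();
  for (const source of sources) applySource(table, source);
  return table;
}

const fact = {
  kind: 'fact',
  observations: [{ id: 'P-UCG4RKNL', body: 'full Why/How body' }],
};
const scratchpad = {
  kind: 'scratchpad',
  observations: [{ id: 'P-UCG4RKNL', body: 'terse bullet' }],
};

// Either walk order leaves the same surviving row.
const scratchpadFirst = reindex([scratchpad, fact]);
const factFirst = reindex([fact, scratchpad]);

console.log(scratchpadFirst.get('P-UCG4RKNL').kind, factFirst.get('P-UCG4RKNL').kind);
```

The sketch also makes the incremental caveat easy to see: if the fact row is later deleted and only the fact source is re-processed, nothing re-inserts the scratchpad row until the scratchpad itself is walked again.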
package/src/inject-context.mjs
CHANGED
|
@@ -35,7 +35,7 @@ import { join } from 'node:path';
|
|
|
35
35
|
import { homedir } from 'node:os';
|
|
36
36
|
import { SCRATCHPADS_BY_TIER, resolveTierRoot, ID_PATTERN } from './tier-paths.mjs';
|
|
37
37
|
import { nowIso } from './audit-log.mjs';
|
|
38
|
-
import { detectStaleness } from './lazy-compress.mjs';
|
|
38
|
+
import { detectStaleness, isJournalStale } from './lazy-compress.mjs';
|
|
39
39
|
import { isProvenanceCommentLine, parseBulletProvenance } from './provenance.mjs';
|
|
40
40
|
import { listConflictQueue } from './conflict-queue.mjs';
|
|
41
41
|
import { listReviewQueue } from './review-queue.mjs';
|
|
@@ -787,18 +787,28 @@ export function injectContext({
|
|
|
787
787
|
let lazyTrigger = null;
|
|
788
788
|
try {
|
|
789
789
|
const verdict = detectStaleness({ projectRoot, now: ts });
|
|
790
|
-
|
|
791
|
-
|
|
790
|
+
// Task 159 (D-169): journal-staleness is an INDEPENDENT spawn trigger — the
|
|
791
|
+
// detached lazy worker syncs DECISIONS.md unconditionally, so a session that's
|
|
792
|
+
// compress-fresh (or cron-active) but has new un-journaled decisions must
|
|
793
|
+
// still spawn, else the journal never renders without a clean SessionEnd
|
|
794
|
+
// (the Task-105/D-75 no-clean-exit class). Cron handles compress but NOT the
|
|
795
|
+
// journal, so cron-active + a stale journal SHOULD spawn (compress skips
|
|
796
|
+
// inside, the journal syncs). It is NOT a competing detectStaleness verdict
|
|
797
|
+
// (one verdict → one compress dispatch; folding journal in would suppress
|
|
798
|
+
// compress work — the separately-correct-jointly-broken class).
|
|
799
|
+
const journalStale = isJournalStale(projectRoot);
|
|
800
|
+
lazyTrigger = { verdict: verdict.action, reason: verdict.reason, journalStale };
|
|
801
|
+
const compressStale =
|
|
792
802
|
verdict.action === 'stale-now' ||
|
|
793
803
|
verdict.action === 'stale-daily' ||
|
|
794
|
-
verdict.action === 'stale-weekly'
|
|
795
|
-
) {
|
|
804
|
+
verdict.action === 'stale-weekly';
|
|
805
|
+
if (compressStale || journalStale) {
|
|
796
806
|
const spawner = typeof testSpawnLazy === 'function' ? testSpawnLazy : spawnLazyCompress;
|
|
797
807
|
const spawnResult = spawner(projectRoot, compressLazyPath);
|
|
798
808
|
lazyTrigger = { ...lazyTrigger, ...spawnResult };
|
|
799
809
|
}
|
|
800
810
|
} catch (err) {
|
|
801
|
-
// detectStaleness should be defensive; if
|
|
811
|
+
// detectStaleness / isJournalStale should be defensive; if they throw, log + continue.
|
|
802
812
|
lazyTrigger = { verdict: 'error', error: err?.message ?? String(err) };
|
|
803
813
|
}
|
|
804
814
|
|
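The spawn condition introduced in this hunk reduces to a small predicate. The sketch below is a simplification: the three `stale-*` action names come from the diff, while `'fresh'` and `'cron-active'` are assumed labels for the non-stale verdicts, used here only as sample inputs.

```javascript
// Verdict actions that request compress work (names taken from the hunk above).
const COMPRESS_STALE_ACTIONS = new Set(['stale-now', 'stale-daily', 'stale-weekly']);

// The worker spawns when compress is stale OR the journal is stale.
// The journal flag is a separate boolean, not another verdict value.
function shouldSpawnLazyWorker(verdictAction, journalStale) {
  const compressStale = COMPRESS_STALE_ACTIONS.has(verdictAction);
  return compressStale || journalStale === true;
}

// 'fresh' and 'cron-active' are assumed names for non-stale verdicts.
console.log(shouldSpawnLazyWorker('fresh', false));       // nothing to do
console.log(shouldSpawnLazyWorker('cron-active', true));  // journal only
console.log(shouldSpawnLazyWorker('stale-daily', false)); // compress only
```

Keeping the journal flag outside the verdict is what lets a single spawn cover both kinds of work: the worker syncs the journal unconditionally and then applies its own compress gates.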
package/src/lazy-compress.mjs
CHANGED
|
@@ -44,6 +44,7 @@ import {
|
|
|
44
44
|
import { dailyDistill } from './daily-distill.mjs';
|
|
45
45
|
import { weeklyCurate } from './weekly-curate.mjs';
|
|
46
46
|
import { compressSession } from './compress-session.mjs';
|
|
47
|
+
import { syncDecisionsJournal } from './decisions-journal.mjs';
|
|
47
48
|
|
|
48
49
|
const DEFAULT_DAILY_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours
|
|
49
50
|
const DEFAULT_WEEKLY_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
|
|
@@ -132,6 +133,61 @@ function recentMdMtimeMs(projectRoot) {
|
|
|
132
133
|
}
|
|
133
134
|
}
|
|
134
135
|
|
|
136
|
+
const MEMORY_REL = ['context', 'memory'];
|
|
137
|
+
const DECISIONS_MD_REL = ['context', 'DECISIONS.md'];
|
|
138
|
+
|
|
139
|
+
/**
|
|
140
|
+
* Task 159 (D-169): is the decision journal behind the captured decision facts?
|
|
141
|
+
*
|
|
142
|
+
* INDEPENDENT of compress staleness — a compress-fresh session can still have
|
|
143
|
+
* new `type:project` decision facts that aren't yet rendered into DECISIONS.md.
|
|
144
|
+
* So this is its OWN boolean (NOT a competing detectStaleness verdict, which can
|
|
145
|
+
* only return ONE action and would suppress compress work). Used as an ADDITIONAL
|
|
146
|
+
* spawn condition in inject-context, and the journal is synced unconditionally
|
|
147
|
+
* inside runLazyCompress.
|
|
148
|
+
*
|
|
149
|
+
* **O(1) — runs inline on EVERY SessionStart, so it must compose with the 500ms
|
|
150
|
+
* NFR-1 budget.** It uses `context/memory/INDEX.md` as the freshness proxy:
|
|
151
|
+
* `write-fact.mjs` rewrites INDEX.md on every fact write, so `INDEX.md` mtime ≥
|
|
152
|
+
* the newest fact file always (verified). Comparing two file mtimes is O(1) — vs
|
|
153
|
+
* stat-every-fact, which was ~130ms on a 307-fact corpus and grew linearly (a
|
|
154
|
+
* self-review find; that approach would blow the budget on a large repo).
|
|
155
|
+
*
|
|
156
|
+
* Stale ⇔ a `project_*.md` fact exists (short-circuit on the first one — no stat)
|
|
157
|
+
* AND (DECISIONS.md is missing OR older than INDEX.md). Trade-off: INDEX.md
|
|
158
|
+
* covers ALL fact types, so a feedback-only write can flag the journal stale →
|
|
159
|
+
* one spurious detached sync (~175ms, idempotent, never a correctness issue) —
|
|
160
|
+
* acceptable for an O(1) check on the hot SessionStart path. Defensive: any
|
|
161
|
+
* throw → false (never block SessionStart on a stat error).
|
|
162
|
+
*
|
|
163
|
+
* @param {string} projectRoot
|
|
164
|
+
* @returns {boolean}
|
|
165
|
+
*/
|
|
166
|
+
export function isJournalStale(projectRoot) {
|
|
167
|
+
if (!projectRoot) return false;
|
|
168
|
+
try {
|
|
169
|
+
const memDir = join(projectRoot, ...MEMORY_REL);
|
|
170
|
+
if (!existsSync(memDir)) return false;
|
|
171
|
+
// Any project (decision) fact at all? Short-circuit on the first — no stat,
|
|
172
|
+
// just the dirent name. No project facts → nothing to journal → not stale.
|
|
173
|
+
const hasDecisionFact = readdirSync(memDir).some(
|
|
174
|
+
(name) => name.startsWith('project_') && name.endsWith('.md'),
|
|
175
|
+
);
|
|
176
|
+
if (!hasDecisionFact) return false;
|
|
177
|
+
const journalPath = join(projectRoot, ...DECISIONS_MD_REL);
|
|
178
|
+
if (!existsSync(journalPath)) return true; // facts exist, journal missing → stale
|
|
179
|
+
// INDEX.md is the O(1) freshness proxy (rewritten on every fact write). If
|
|
180
|
+
// it's absent (pre-index repo), fall back to "facts exist + journal exists"
|
|
181
|
+
// → treat as fresh (a reindex will create INDEX.md; the session-end sync
|
|
182
|
+
// covers the journal regardless).
|
|
183
|
+
const indexPath = join(memDir, 'INDEX.md');
|
|
184
|
+
if (!existsSync(indexPath)) return false;
|
|
185
|
+
return statSync(indexPath).mtimeMs > statSync(journalPath).mtimeMs;
|
|
186
|
+
} catch {
|
|
187
|
+
return false;
|
|
188
|
+
}
|
|
189
|
+
}
|
|
190
|
+
|
|
135
191
|
/**
|
|
136
192
|
* Cheap inline staleness check. Runs in <5ms — one stat + a few existsSync.
|
|
137
193
|
*
|
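The decision table inside `isJournalStale` can be separated from its filesystem calls. The sketch below is a hypothetical refactoring for illustration, not code from the package: directory names and mtimes are passed in as plain values (with `null` meaning "file missing"), so each branch, including the documented spurious-sync trade-off, can be checked directly.

```javascript
// Pure version of the staleness rule; inputs replace readdirSync/statSync.
// journalMtimeMs / indexMtimeMs are null when the file does not exist.
function journalStaleness({ factNames, journalMtimeMs, indexMtimeMs }) {
  const hasDecisionFact = factNames.some(
    (name) => name.startsWith('project_') && name.endsWith('.md'),
  );
  if (!hasDecisionFact) return false;      // nothing to journal
  if (journalMtimeMs == null) return true; // facts exist, journal missing
  if (indexMtimeMs == null) return false;  // no INDEX.md proxy: treat as fresh
  return indexMtimeMs > journalMtimeMs;    // INDEX.md newer than the journal
}

// A feedback-only write bumps INDEX.md, so the journal is flagged stale even
// though no project fact changed: the accepted spurious-sync case.
const spurious = journalStaleness({
  factNames: ['project_a.md', 'feedback_x.md'],
  journalMtimeMs: 5,
  indexMtimeMs: 10,
});

console.log(spurious);
```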
|
@@ -254,6 +310,31 @@ export async function runLazyCompress({
|
|
|
254
310
|
});
|
|
255
311
|
}
|
|
256
312
|
|
|
313
|
+
// Task 159 (D-169): sync the decision journal UNCONDITIONALLY, before any
|
|
314
|
+
// compress gate. This is the SessionStart fallback path for sessions that never
|
|
315
|
+
// cleanly closed (Claude Code fires SessionEnd only on clean window-close — the
|
|
316
|
+
// Task-105/D-75 class), where the primary session-end sync never ran. It must
|
|
317
|
+
// run regardless of the compress verdict (cooldown / cron-active / fresh) — a
|
|
318
|
+
// cron-active or compress-fresh session can still have new decisions. Cheap pure
|
|
319
|
+
// file I/O (~175ms), idempotent (a no-change run rewrites nothing), best-effort
|
|
320
|
+
// (syncDecisionsJournal has its own try/catch + soft-error return). It does NOT
|
|
321
|
+
// touch the Haiku cooldown — that gate is for the LLM compress passes only.
|
|
322
|
+
// Door 4: log the outcome to lazy-compress.log so a silent fallback-path
|
|
323
|
+
// failure (e.g. a DECISIONS.md permissions error) leaves a trace — the rest of
|
|
324
|
+
// this function is fully NDJSON-observable, and the journal sync must be too.
|
|
325
|
+
const journalResult = syncDecisionsJournal({ projectRoot, now: ts });
|
|
326
|
+
writeLazyLogEntry({
|
|
327
|
+
projectRoot,
|
|
328
|
+
entry: {
|
|
329
|
+
ts,
|
|
330
|
+
scope: 'journal-sync',
|
|
331
|
+
action: journalResult?.error ? 'error' : journalResult?.written ? 'written' : 'no-change',
|
|
332
|
+
written: journalResult?.written ?? false,
|
|
333
|
+
appended: journalResult?.appended ?? 0,
|
|
334
|
+
...(journalResult?.error ? { error: journalResult.error } : {}),
|
|
335
|
+
},
|
|
336
|
+
});
|
|
337
|
+
|
|
257
338
|
// Cooldown gate up front — composes with shared 120s marker.
|
|
258
339
|
if (isCooldownActive({ projectRoot, now: ts, cooldownMs })) {
|
|
259
340
|
const duration_ms = Date.now() - t0;
|
package/src/mcp-server.mjs
CHANGED
|
@@ -155,6 +155,7 @@ function makeMkSearch({ db, semanticBackend, projectRoot }) {
|
|
|
155
155
|
db, query,
|
|
156
156
|
mode: wantMode,
|
|
157
157
|
scope,
|
|
158
|
+
projectRoot, // Task 156: the decisions scope reads context/DECISIONS.md
|
|
158
159
|
tier,
|
|
159
160
|
since,
|
|
160
161
|
limit,
|
|
@@ -181,6 +182,13 @@ function makeMkSearch({ db, semanticBackend, projectRoot }) {
|
|
|
181
182
|
function makeMkGet({ db }) {
|
|
182
183
|
// Thin adapter over the shared read core (read-core.getObservations) — the
|
|
183
184
|
// SAME logic the CLI `cmk get` calls (ADR-0014 parity).
|
|
185
|
+
//
|
|
186
|
+
// D-163 (BINDING): mk_get is tombstone-BLIND and must stay that way. It calls
|
|
187
|
+
// getObservations WITHOUT `includeTombstoned`, so a forgotten fact returns
|
|
188
|
+
// `not found`. Tombstone recovery is a HUMAN-only verb (`cmk get
|
|
189
|
+
// --include-tombstoned`); the agent must NEVER recover a fact the user
|
|
190
|
+
// forgot (resurfacing a deleted fact is the worst memory-product failure).
|
|
191
|
+
// Do NOT add an includeTombstoned param to this tool.
|
|
184
192
|
return async ({ ids }) => ({
|
|
185
193
|
content: [{ type: 'text', text: JSON.stringify(getObservations(db, ids), null, 2) }],
|
|
186
194
|
});
|
|
@@ -563,7 +571,7 @@ export function buildMcpServer({ projectRoot, userDir, db, semanticBackend }) {
|
|
|
563
571
|
inputSchema: {
|
|
564
572
|
query: z.string().min(1).describe('search query'),
|
|
565
573
|
mode: z.enum(['keyword', 'semantic', 'hybrid']).optional(),
|
|
566
|
-
scope: z.enum(['facts', 'transcripts']).optional().describe("'facts' (default) = curated memory; 'transcripts' = the raw session record — the
|
|
574
|
+
scope: z.enum(['facts', 'transcripts', 'decisions']).optional().describe("'facts' (default) = curated memory; 'transcripts' = the raw session record (LAST-RESORT — only when curated memory has no answer); 'decisions' = the append-only decision journal (context/DECISIONS.md) — use for decision-HISTORY / evolution / 'what did we reject / why did X change' queries (it returns superseded + retracted entries the live fact store no longer carries)"),
|
|
567
575
|
tier: z.enum(['U', 'P', 'L']).optional(),
|
|
568
576
|
since: z.string().optional().describe('ISO 8601 timestamp'),
|
|
569
577
|
limit: z.number().int().positive().max(1000).optional(),
|
package/src/read-core.mjs
CHANGED
|
@@ -6,7 +6,10 @@
|
|
|
6
6
|
// surfaces, one implementation. Pure (db + args in, plain data out); the MCP
|
|
7
7
|
// adapter wraps the result in a content envelope, the CLI adapter prints it.
|
|
8
8
|
|
|
9
|
+
import { existsSync, readFileSync } from 'node:fs';
|
|
10
|
+
import { join } from 'node:path';
|
|
9
11
|
import { ID_PATTERN } from './tier-paths.mjs';
|
|
12
|
+
import { parse as parseFrontmatter } from './frontmatter.mjs';
|
|
10
13
|
|
|
11
14
|
const GET_COLUMNS =
|
|
12
15
|
'id, body, heading_path, source_file, source_line, tier, trust, ' +
|
|
@@ -15,17 +18,76 @@ const GET_COLUMNS =
|
|
|
15
18
|
/**
|
|
16
19
|
* Fetch full observation rows by id. An invalid-format or missing id becomes
|
|
17
20
|
* a `{ id, error }` entry (the array stays positionally aligned with `ids`).
|
|
21
|
+
*
|
|
22
|
+
* Task 155 (D-163) — opt-in tombstone recovery. By DEFAULT this is live-only:
|
|
23
|
+
* a forgotten id (its index row pruned by Task 110, the body moved to
|
|
24
|
+
* `context/memory/archive/tombstones/<id>.md`) returns `not found`. The
|
|
25
|
+
* automatic recall surfaces (the SessionStart snapshot, `mk_search`, `mk_get`)
|
|
26
|
+
* MUST stay on this default — a deleted fact must remain invisible to the agent
|
|
27
|
+
* (resurfacing it is the worst memory-product failure). ONLY an explicit
|
|
28
|
+
* HUMAN-driven `cmk get --include-tombstoned` opts in, passing
|
|
29
|
+
* `{ includeTombstoned: true, projectRoot }`; on a live miss it then reads the
|
|
30
|
+
* tombstone file directly and returns its body marked `tombstoned: true`.
|
|
31
|
+
*
|
|
32
|
+
* @param {object} [opts]
|
|
33
|
+
* @param {boolean} [opts.includeTombstoned=false] human-only recovery opt-in
|
|
34
|
+
* @param {string} [opts.projectRoot] required when includeTombstoned (to find the archive)
|
|
18
35
|
*/
|
|
19
|
-
export function getObservations(db, ids) {
|
|
36
|
+
export function getObservations(db, ids, { includeTombstoned = false, projectRoot } = {}) {
|
|
20
37
|
const stmt = db.prepare(`SELECT ${GET_COLUMNS} FROM observations WHERE id = ?`);
|
|
21
38
|
return ids.map((id) => {
|
|
22
39
|
if (!ID_PATTERN.test(id)) return { id, error: 'invalid id format' };
|
|
23
40
|
const row = stmt.get(id);
|
|
24
|
-
if (
|
|
25
|
-
|
|
41
|
+
if (row) return row; // a LIVE hit always wins — recovery is a miss-only fallback
|
|
42
|
+
// Live miss. Recovery is opt-in AND needs projectRoot to locate the archive.
|
|
43
|
+
if (includeTombstoned && projectRoot) {
|
|
44
|
+
const recovered = readTombstone(projectRoot, id);
|
|
45
|
+
if (recovered) return recovered;
|
|
46
|
+
}
|
|
47
|
+
return { id, error: 'not found' };
|
|
26
48
|
});
|
|
27
49
|
}
|
|
28
50
|
|
|
51
|
+
/**
|
|
52
|
+
* Read a tombstoned fact's body + deletion provenance from
|
|
53
|
+
* `<projectRoot>/context/memory/archive/tombstones/<id>.md`. Returns a row-like
|
|
54
|
+
* object marked `tombstoned: true`, or null if no tombstone exists for the id.
|
|
55
|
+
* Read-only; never un-tombstones (that would be a separate `restore` verb).
|
|
56
|
+
*/
|
|
57
|
+
function readTombstone(projectRoot, id) {
|
|
58
|
+
// SAFETY: `id` is interpolated into the path, but every caller reaches here
|
|
59
|
+
// ONLY after getObservations' `ID_PATTERN.test(id)` gate (anchored
|
|
60
|
+
// /^[PUL]-[base32]{8}$/ — no `.`/`/`/`\`), so it cannot path-traverse out of
|
|
61
|
+
// the tombstones dir. Do NOT call readTombstone before that validation.
|
|
62
|
+
const tombPath = join(
|
|
63
|
+
projectRoot, 'context', 'memory', 'archive', 'tombstones', `${id}.md`,
|
|
64
|
+
);
|
|
65
|
+
if (!existsSync(tombPath)) return null;
|
|
66
|
+
const { frontmatter, body } = parseFrontmatter(readFileSync(tombPath, 'utf8'));
|
|
67
|
+
const fm = frontmatter ?? {};
|
|
68
|
+
// `tombstoned: true` is the SOLE discriminator for recovered-vs-live — a live
|
|
69
|
+
// row never carries it. Consumers must key off this, NOT off `deleted_at`
|
|
70
|
+
// presence (a live row can carry a null deleted_at too). A malformed/garbled
|
|
71
|
+
// tombstone still returns its raw body + null provenance (graceful degrade —
|
|
72
|
+
// a human recovering is precisely the case where something went wrong).
|
|
73
|
+
return {
|
|
74
|
+
id,
|
|
75
|
+
body: body ?? '',
|
|
76
|
+
heading_path: fm.title ?? null,
|
|
77
|
+
source_file: `context/memory/archive/tombstones/${id}.md`,
|
|
78
|
+
source_line: 1, // synthetic — the tombstone file has no meaningful source line
|
|
79
|
+
tier: fm.tier ?? null,
|
|
80
|
+
trust: fm.trust ?? null,
|
|
81
|
+
write_source: fm.write_source ?? null,
|
|
82
|
+
created_at: fm.created_at ?? fm.at ?? null,
|
|
83
|
+
superseded_by: fm.superseded_by ?? null,
|
|
84
|
+
deleted_at: fm.deleted_at ?? null,
|
|
85
|
+
deleted_reason: fm.deleted_reason ?? null,
|
|
86
|
+
deleted_by: fm.deleted_by ?? null,
|
|
87
|
+
tombstoned: true,
|
|
88
|
+
};
|
|
89
|
+
}
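The two safety notes in `readTombstone` (validate the id before building the path, and key off `tombstoned` rather than `deleted_at`) can be illustrated with a small sketch. This is not the package's code: `ID_PATTERN_SKETCH` assumes an uppercase alphanumeric alphabet in place of the real base32 set, which this diff does not show, and `isSafeId` / `splitRecovered` are hypothetical helpers.

```javascript
// Assumed stand-in for the package's ID_PATTERN. The real base32 alphabet is
// not visible in this diff, so [0-9A-Z] is used purely for illustration.
const ID_PATTERN_SKETCH = /^[PUL]-[0-9A-Z]{8}$/;

// An anchored pattern whose character class has no '.', '/' or '\' cannot
// match a traversal string, so a validated id is safe to interpolate.
function isSafeId(id) {
  return typeof id === 'string' && ID_PATTERN_SKETCH.test(id);
}

// Hypothetical consumer: key off `tombstoned === true`, never off the mere
// presence of `deleted_at`, which a live row may also carry as null.
function splitRecovered(rows) {
  const live = [];
  const recovered = [];
  for (const row of rows) {
    (row.tombstoned === true ? recovered : live).push(row);
  }
  return { live, recovered };
}

const rows = [
  { id: 'P-S79MJHFN', deleted_at: null },
  // A garbled tombstone degrades to null provenance but stays tombstoned.
  { id: 'P-AAAAAAAA', deleted_at: null, tombstoned: true },
];
const { live, recovered } = splitRecovered(rows);
console.log(isSafeId('P-S79MJHFN'), isSafeId('../../etc/passwd'));
console.log(live.map((r) => r.id), recovered.map((r) => r.id));
```

Both sample rows carry `deleted_at: null`, so a check on that field alone could not tell them apart; only the `tombstoned` flag does.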
|
|
90
|
+
|
|
29
91
|
/** The canonical Markdown citation link for an id. Pure (no DB). */
|
|
30
92
|
export function citeLink(id) {
|
|
31
93
|
if (!ID_PATTERN.test(id)) return { ok: false, error: 'id must match ID_PATTERN' };
|
package/src/remember-core.mjs
CHANGED
|
@@ -18,7 +18,7 @@
|
|
|
18
18
|
|
|
19
19
|
import { resolve as resolvePath } from 'node:path';
|
|
20
20
|
import { hashContent } from './content-hash.mjs';
|
|
21
|
-
import {
|
|
21
|
+
import { sanitizeForTitle } from './sanitize.mjs';
|
|
22
22
|
import { writeFact as defaultWriteFact } from './write-fact.mjs';
|
|
23
23
|
import { buildRichFactBody, slugifyFact } from './rich-fact.mjs';
|
|
24
24
|
|
|
@@ -54,16 +54,15 @@ export function rememberRich(text, options = {}, deps = {}) {
|
|
|
54
54
|
const projectRoot = deps.projectRoot ?? resolvePath(process.cwd());
|
|
55
55
|
const write = deps.writeFact ?? defaultWriteFact;
|
|
56
56
|
|
|
57
|
-
//
|
|
58
|
-
//
|
|
59
|
-
//
|
|
60
|
-
//
|
|
61
|
-
//
|
|
62
|
-
//
|
|
63
|
-
|
|
64
|
-
const
|
|
65
|
-
|
|
66
|
-
: '';
|
|
57
|
+
// Sanitize BEFORE deriving/slicing the title — the slug is `slugifyFact(title)`,
|
|
58
|
+
// so anything still in the title here lands in the committed FILENAME + INDEX,
|
|
59
|
+
// which writeFact's later body/title sanitization can't undo. sanitizeForTitle
|
|
60
|
+
// (the ONE shared helper — sanitize.mjs) strips <private> + abstracts home
|
|
61
|
+
// paths, the two cut-gate findings (v0.3.1 + F-V0.3.3-2). The body itself keeps
|
|
62
|
+
// its <private> redaction via the headline below; home paths in the body are
|
|
63
|
+
// abstracted by writeFact downstream.
|
|
64
|
+
const headline = sanitizeForTitle(text);
|
|
65
|
+
const safeTitle = options.title ? sanitizeForTitle(options.title) : '';
|
|
67
66
|
const title = safeTitle || headline.split('\n')[0].slice(0, 80);
|
|
68
67
|
const body = buildRichFactBody({ text: headline, why: options.why, how: options.how });
|
|
69
68
|
// `links` arrives as an ARRAY from the MCP tool (z.array) and as a
|
|
@@ -97,10 +96,11 @@ export function rememberRich(text, options = {}, deps = {}) {
|
|
|
97
96
|
|
|
98
97
|
/** The title rememberRich() will derive for `text`/`options` (for caller messages). */
|
|
99
98
|
export function richFactTitle(text, options = {}) {
|
|
100
|
-
// Mirror rememberRich
|
|
101
|
-
//
|
|
102
|
-
|
|
103
|
-
|
|
99
|
+
// Mirror rememberRich EXACTLY (the SAME sanitizeForTitle helper) so the preview
|
|
100
|
+
// a caller echoes never carries <private> content or the username, and stays
|
|
101
|
+
// identical to the title rememberRich actually derives + stores.
|
|
102
|
+
const safeTitle = options.title ? sanitizeForTitle(options.title) : '';
|
|
103
|
+
return safeTitle || sanitizeForTitle(text).split('\n')[0].slice(0, 80);
|
|
104
104
|
}
|
|
105
105
|
|
|
106
106
|
/**
|
package/src/sanitize.mjs
CHANGED
|
@@ -13,6 +13,8 @@
|
|
|
13
13
|
// (local, gitignored) — machine-specific absolute paths are the whole point
|
|
14
14
|
// of the local tier, so they stay verbatim there.
|
|
15
15
|
|
|
16
|
+
import { sanitizePrivacyTags } from './privacy.mjs';
|
|
17
|
+
|
|
16
18
|
// Each pattern matches an absolute home-directory prefix up to (but not
|
|
17
19
|
// including) the next path separator / whitespace / quote, so the remainder
|
|
18
20
|
// of the path is preserved. Username char class excludes separators, spaces,
|
|
@@ -37,3 +39,31 @@ export function sanitizeHomePaths(text) {
|
|
|
37
39
|
for (const re of HOME_PATH_PATTERNS) out = out.replace(re, '~');
|
|
38
40
|
return out;
|
|
39
41
|
}
|
|
42
|
+
|
|
43
|
+
/**
|
|
44
|
+
* Sanitize a string that is about to become a fact TITLE — and therefore the
|
|
45
|
+
* fact's SLUG (`slugifyFact(title)`) and committed FILENAME + INDEX.md link.
|
|
46
|
+
*
|
|
47
|
+
* THE INVARIANT (F-V0.3.3-2, cut-blocker): a slug is derived from the title
|
|
48
|
+
* BEFORE `writeFact` runs, and `writeFact` only sanitizes the body + the
|
|
49
|
+
* frontmatter `title:` field — NOT the slug/filename. So anything still in the
|
|
50
|
+
* title at slug-derivation time leaks into the COMMITTED FILENAME, which no
|
|
51
|
+
* downstream sanitization can undo. Every caller that derives a slug from
|
|
52
|
+
* user/Haiku text MUST route the title through THIS helper first, so the leak
|
|
53
|
+
* class is closed in ONE place instead of being re-missed per call site
|
|
54
|
+
* (cmk remember had it; auto-extract had the same bug — the comment there even
|
|
55
|
+
* wrongly claimed "writeFact already sanitizes").
|
|
56
|
+
*
|
|
57
|
+
* Two transforms, both required, privacy-first:
|
|
58
|
+
* - sanitizePrivacyTags: strip `<private>…</private>` (v0.3.1 — a later
|
|
59
|
+
* 80-char title slice that severs the closing tag defeats writeFact's regex).
|
|
60
|
+
* - sanitizeHomePaths: `C:\Users\<you>` → `~` (F-V0.3.3-2 — the username leak).
|
|
61
|
+
* Privacy-first is the safe order: the private span (which may itself contain a
|
|
62
|
+
* home path) is removed wholesale before homepath-sanitize ever sees a fragment.
|
|
63
|
+
*
|
|
64
|
+
* @param {string} s
|
|
65
|
+
* @returns {string} the redacted + abstracted, trimmed string (safe to slug)
|
|
66
|
+
*/
|
|
67
|
+
export function sanitizeForTitle(s) {
|
|
68
|
+
return sanitizeHomePaths(sanitizePrivacyTags(String(s).trim()));
|
|
69
|
+
}
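To see why the title has to be sanitized before the 80-character slice, consider the sketch below. The two helpers are hypothetical stand-ins for `sanitizePrivacyTags` and `sanitizeHomePaths` (their real patterns are not part of this diff), and `sanitizeForTitleSketch` only mirrors the ordering described in the doc comment above.

```javascript
// Hypothetical stand-ins: the real helpers live in privacy.mjs / sanitize.mjs
// and their exact patterns are assumptions here.
const stripPrivate = (s) => s.replace(/<private>[\s\S]*?<\/private>/g, '');
const abstractHome = (s) =>
  s
    .replace(/[A-Za-z]:\\Users\\[^\\\/\s"']+/g, '~')
    .replace(/\/(?:home|Users)\/[^\/\s"']+/g, '~');

// Privacy first, then home paths, mirroring the order the doc comment requires.
function sanitizeForTitleSketch(s) {
  return abstractHome(stripPrivate(String(s).trim()));
}

const text =
  'Deploy key lives in <private>' +
  'x'.repeat(90) +
  ' /home/alice/.ssh/id</private> see /home/alice/app/README';

// Wrong order: slicing first severs the closing tag, so the paired-tag
// regex no longer matches and the opening tag plus payload survive.
const sliceFirst = stripPrivate(text.slice(0, 80));

// Right order: the whole private span is removed before any slicing.
const sanitizeFirst = sanitizeForTitleSketch(text).slice(0, 80);

console.log(sliceFirst.includes('<private>'));
console.log(sanitizeFirst);
```

Because the slug is derived from this title, whatever survives in `sliceFirst` would end up in a committed filename, which is the leak the shared helper is meant to close.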
|
package/src/search.mjs
CHANGED
|
@@ -42,6 +42,8 @@
|
|
|
42
42
|
// hybrid + semantic paths. Production callers (the `cmk search` CLI in
|
|
43
43
|
// subcommands.mjs) pass undefined; v0.1.x lands the real backend.
|
|
44
44
|
|
|
45
|
+
import { existsSync, readFileSync } from 'node:fs';
|
|
46
|
+
import { join } from 'node:path';
|
|
45
47
|
import { ERROR_CATEGORIES, errorResult } from './result-shapes.mjs';
|
|
46
48
|
import { VALID_TIERS } from './tier-paths.mjs';
|
|
47
49
|
|
|
@@ -58,9 +60,16 @@ const MAX_LIMIT = 1000;
|
|
|
58
60
|
// index (L1, the default). 'transcripts' = the SEPARATE raw-transcript
|
|
59
61
|
// chunk index (the L3 last-resort tier) — reached ONLY when explicitly
|
|
60
62
|
// asked, so raw history never pollutes curated results.
|
|
63
|
+
// Task 156 (D-168) — 'decisions' = the append-only decision journal
|
|
64
|
+
// (context/DECISIONS.md). Deliberately NOT FTS-indexed (a derived view,
|
|
65
|
+
// skipped like INDEX.md), so this scope scans the markdown file directly. It
|
|
66
|
+
// is the recall path for decision-HISTORY / "what did we reject / why did X
|
|
67
|
+
// change" queries — the journal carries the retract/supersede trail the flat
|
|
68
|
+
// fact store no longer holds. Keyword-only (the journal is not embedded).
|
|
61
69
|
export const SEARCH_SCOPES = Object.freeze({
|
|
62
70
|
FACTS: 'facts',
|
|
63
71
|
TRANSCRIPTS: 'transcripts',
|
|
72
|
+
DECISIONS: 'decisions',
|
|
64
73
|
});
|
|
65
74
|
|
|
66
75
|
const TRUST_ORDINAL = Object.freeze({
|
|
@@ -117,8 +126,12 @@ function validateInput(opts) {
|
|
|
117
126
|
}
|
|
118
127
|
}
|
|
119
128
|
const scope = opts.scope ?? SEARCH_SCOPES.FACTS;
|
|
120
|
-
if (
|
|
121
|
-
|
|
129
|
+
if (
|
|
130
|
+
scope !== SEARCH_SCOPES.FACTS &&
|
|
131
|
+
scope !== SEARCH_SCOPES.TRANSCRIPTS &&
|
|
132
|
+
scope !== SEARCH_SCOPES.DECISIONS
|
|
133
|
+
) {
|
|
134
|
+
errors.push(`scope: must be one of facts/transcripts/decisions (got ${JSON.stringify(scope)})`);
|
|
122
135
|
}
|
|
123
136
|
if (scope === SEARCH_SCOPES.TRANSCRIPTS) {
|
|
124
137
|
// Chunks carry no tier/trust/created_at — rejecting these is more honest
|
|
@@ -133,6 +146,26 @@ function validateInput(opts) {
|
|
|
133
146
|
}
|
|
134
147
|
}
|
|
135
148
|
}
|
|
149
|
+
if (scope === SEARCH_SCOPES.DECISIONS) {
|
|
150
|
+
// The journal is a flat markdown file, not the index: it carries no
|
|
151
|
+
// tier/trust/created_at columns and isn't embedded. Reject those filters +
|
|
152
|
+
// semantic/hybrid modes (same explicit-vs-configured honesty as transcripts).
|
|
153
|
+
for (const [key, label] of [
|
|
154
|
+
['tier', 'tier'],
|
|
155
|
+
['minTrust', 'minTrust'],
|
|
156
|
+
['since', 'since'],
|
|
157
|
+
]) {
|
|
158
|
+
if (opts[key] !== undefined) {
|
|
159
|
+
errors.push(`${label}: not supported under the decisions scope (journal entries carry no ${label})`);
|
|
160
|
+
}
|
|
161
|
+
}
|
|
162
|
+
if (mode !== SEARCH_MODES.KEYWORD) {
|
|
163
|
+
errors.push(`mode: only keyword is supported under the decisions scope (the journal is not embedded)`);
|
|
164
|
+
}
|
|
165
|
+
if (typeof opts.projectRoot !== 'string' || opts.projectRoot.length === 0) {
|
|
166
|
+
errors.push('projectRoot: required for the decisions scope (to locate context/DECISIONS.md)');
|
|
167
|
+
}
|
|
168
|
+
}
|
|
136
169
|
return { errors, mode, scope };
|
|
137
170
|
}
|
|
138
171
|
|
|
@@ -394,6 +427,87 @@ function flattenSnippet(s) {
|
|
|
394
427
|
return flat.length > TRANSCRIPT_SNIPPET_MAX ? flat.slice(0, TRANSCRIPT_SNIPPET_MAX) + '…' : flat;
|
|
395
428
|
}
|
|
396
429
|
|
|
430
|
+
// --- Decisions-scope keyword backend (Task 156, the decision journal) ---
|
|
431
|
+
|
|
432
|
+
// The journal entry shape (decisions-journal.mjs buildDecisionEntry):
|
|
433
|
+
// <!-- decision:P-XXXXXXXX -->
|
|
434
|
+
// ### <title> (a retracted entry carries _(retracted DATE)_)
|
|
435
|
+
// **When:** <date> · **Fact:** `<id>`
|
|
436
|
+
// **Why:** <why> (optional)
|
|
437
|
+
// Entries are separated by the machine marker; we split on it, match the query
|
|
438
|
+
// as a case-insensitive substring over the entry text, and report the retract
|
|
439
|
+
// marker so recall can answer "did this change / what did we reject".
|
|
440
|
+
const DECISION_MARKER_RE = /<!--\s*decision:([PUL]-[^\s]+)\s*-->/g;
|
|
441
|
+
const DECISIONS_SNIPPET_MAX = 240;
|
|
442
|
+
|
|
443
|
+
function runDecisionsKeywordSearch(_db, opts) {
|
|
444
|
+
const file = join(opts.projectRoot, 'context', 'DECISIONS.md');
|
|
445
|
+
if (!existsSync(file)) return []; // no journal yet → empty, not an error
|
|
446
|
+
const content = readFileSync(file, 'utf8');
|
|
447
|
+
|
|
448
|
+
// Split the body into entry spans keyed by the decision marker. Each span runs
|
|
449
|
+
// from its marker to the next marker (or EOF). A marker is an entry boundary
|
|
450
|
+
// ONLY at line-start — the writer (buildDecisionEntry) always emits it first
|
|
451
|
+
// on its own line, so a marker QUOTED inside a Why/body (a meta-decision about
|
|
452
|
+
// the journal format, or a fact citing another's marker) does NOT false-split
|
|
453
|
+
// the entry (skill-review I2). DECISION_MARKER_RE is module-level /g + reset
|
|
454
|
+
// here; the function is fully synchronous (no await between reset and the
|
|
455
|
+
// loop), so there is no shared-state re-entrancy hazard.
|
|
456
|
+
const markers = [];
|
|
457
|
+
let m;
|
|
458
|
+
DECISION_MARKER_RE.lastIndex = 0;
|
|
459
|
+
while ((m = DECISION_MARKER_RE.exec(content)) !== null) {
|
|
460
|
+
const atLineStart = m.index === 0 || content[m.index - 1] === '\n';
|
|
461
|
+
if (atLineStart) markers.push({ id: m[1], start: m.index });
|
|
462
|
+
}
|
|
463
|
+
|
|
464
|
+
const needle = opts.query.trim().toLowerCase();
|
|
465
|
+
const hits = [];
|
|
466
|
+
for (let i = 0; i < markers.length; i++) {
|
|
467
|
+
const start = markers[i].start;
|
|
468
|
+
const end = i + 1 < markers.length ? markers[i + 1].start : content.length;
|
|
469
|
+
const block = content.slice(start, end);
|
|
470
|
+
// Strip the plumbing (the `<!-- decision:ID -->` marker + the `### ` heading
|
|
471
|
+
// hashes) BEFORE matching, so the query matches the human signal (title /
|
|
472
|
+
// When / Why) — NOT the literal word "decision" inside every marker comment
|
|
473
|
+
// (the self-review false-positive: searching "decision" matched all entries
|
|
474
|
+
// via their markers). Uses a FRESH regex (not the shared module-level
|
|
475
|
+
// DECISION_MARKER_RE) so the loop's .exec lastIndex isn't clobbered.
|
|
476
|
+
const cleaned = block
|
|
477
|
+
.replace(/<!--\s*decision:[PUL]-[^\s]+\s*-->/g, '')
|
|
478
|
+
.replace(/^#{1,6}\s+/gm, '');
|
|
479
|
+
if (!cleaned.toLowerCase().includes(needle)) continue;
|
|
480
|
+
|
|
481
|
+
// The line offset of the marker = source_line drill-back into DECISIONS.md.
|
|
482
|
+
const sourceLine = content.slice(0, start).split('\n').length;
|
|
483
|
+
// Retracted-tag detection mirrors the WRITER's contract: the tag sits on its
|
|
484
|
+
// own line DIRECTLY after the `### ` heading (decisions-journal.mjs §2), so
|
|
485
|
+
// scope the check there — NOT a raw-block substring, which would mislabel an
|
|
486
|
+
// active entry whose Why merely MENTIONS "_(retracted" (skill-review I1).
|
|
487
|
+
const headingIdx = block.indexOf('### ');
|
|
488
|
+
const afterHeading =
|
|
489
|
+
headingIdx === -1 ? '' : block.slice(block.indexOf('\n', headingIdx) + 1);
|
|
490
|
+
const retracted = afterHeading.startsWith('_(retracted');
|
|
491
|
+
hits.push({
|
|
492
|
+
id: markers[i].id,
|
|
493
|
+
snippet: flattenSnippet(cleaned).slice(0, DECISIONS_SNIPPET_MAX),
|
|
494
|
+
source_file: 'context/DECISIONS.md',
|
|
495
|
+
source_line: sourceLine,
|
|
496
|
+
retracted,
|
|
497
|
+
// `score` is POSITIONAL (the marker index), NOT an FTS relevance rank —
|
|
498
|
+
// the journal is chronological, so a lower score = an earlier decision.
|
|
499
|
+
// Don't fuse/sort this against the facts/transcripts scopes' rank scores.
|
|
500
|
+
score: i,
|
|
501
|
+
});
|
|
502
|
+
// NB: `limit` is a CHRONOLOGICAL head, not a relevance top-N — it returns
|
|
503
|
+
// the first N matches in journal (oldest→newest) order, so a strongly
|
|
504
|
+
// relevant decision far down a long journal can be cut. Acceptable: the
|
|
505
|
+
// journal is bounded and chronological by design (M1, deliberate).
|
|
506
|
+
if (hits.length >= (opts.limit ?? DEFAULT_LIMIT)) break;
|
|
507
|
+
}
|
|
508
|
+
return hits;
|
|
509
|
+
}
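The splitting and retract-detection rules above can be tried on an in-memory journal. The sketch below is a simplified variant, not the shipped function: it uses a line-anchored regex with `matchAll` instead of the shared `/g` regex plus `lastIndex` reset, it skips the snippet and `limit` handling, and the sample journal layout (in particular the placement of the `_(retracted …)_` line) is an assumption based on the comments in this diff.

```javascript
// Simplified scan over a journal string. `scanJournal` is a hypothetical name.
function scanJournal(content, query) {
  // `^` with the m flag only matches markers at line start, so a marker
  // quoted inside a Why line never opens a new entry.
  const markerRe = /^<!--\s*decision:([PUL]-\S+)\s*-->/gm;
  const markers = [...content.matchAll(markerRe)].map((m) => ({ id: m[1], start: m.index }));
  const needle = query.trim().toLowerCase();
  const hits = [];
  markers.forEach((marker, i) => {
    const end = i + 1 < markers.length ? markers[i + 1].start : content.length;
    const block = content.slice(marker.start, end);
    // Strip marker comments and heading hashes before matching.
    const cleaned = block
      .replace(/<!--\s*decision:[PUL]-\S+\s*-->/g, '')
      .replace(/^#{1,6}\s+/gm, '');
    if (!cleaned.toLowerCase().includes(needle)) return;
    // Retracted only if the tag is the line directly after the heading.
    const headingIdx = block.indexOf('### ');
    const afterHeading =
      headingIdx === -1 ? '' : block.slice(block.indexOf('\n', headingIdx) + 1);
    hits.push({ id: marker.id, retracted: afterHeading.startsWith('_(retracted'), score: i });
  });
  return hits;
}

// Assumed sample journal; ids and dates are made up for illustration.
const journal = [
  '<!-- decision:P-AAAAAAAA -->',
  '### Use Postgres',
  '_(retracted 2025-01-10)_',
  '**When:** 2025-01-02 · **Fact:** `P-AAAAAAAA`',
  '**Why:** relational data',
  '',
  '<!-- decision:P-BBBBBBBB -->',
  '### Use SQLite',
  '**When:** 2025-01-10 · **Fact:** `P-BBBBBBBB`',
  '**Why:** simpler; the older entry got an _(retracted tag, see <!-- decision:P-AAAAAAAA -->',
].join('\n');

console.log(scanJournal(journal, 'postgres'));
console.log(scanJournal(journal, 'sqlite'));
console.log(scanJournal(journal, 'decision'));
```

The second entry quotes both a marker and the `_(retracted` text inside its Why line, yet it is neither split in two nor reported as retracted, and a query for the word "decision" does not match on marker plumbing alone.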
|
|
510
|
+
|
|
397
511
|
// --- Reciprocal-rank fusion (hybrid mode) -----------------------------
|
|
398
512
|
|
|
399
513
|
/**
|
|
@@ -445,8 +559,9 @@ export function search(opts = {}) {
|
|
|
445
559
|
// Scope dispatch (Task 104.2): the transcripts scope swaps the keyword
|
|
446
560
|
// backend; semantic/hybrid use the caller-prepared backend exactly like
|
|
447
561
|
// the facts scope (prepareSemanticBackend({scope}) embeds the right table).
|
|
448
|
-
|
|
449
|
-
|
|
562
|
+
let keywordBackend = runKeywordSearch;
|
|
563
|
+
if (scope === SEARCH_SCOPES.TRANSCRIPTS) keywordBackend = runTranscriptKeywordSearch;
|
|
564
|
+
else if (scope === SEARCH_SCOPES.DECISIONS) keywordBackend = runDecisionsKeywordSearch;
|
|
450
565
|
|
|
451
566
|
// Semantic + hybrid require an injected backend. Production v0.1.0
|
|
452
567
|
// passes undefined → error with the not-yet-shipped hint. A future
|
package/src/session-end-tasks.mjs
CHANGED
|
@@ -35,6 +35,7 @@
|
|
|
35
35
|
import { compressSession } from './compress-session.mjs';
|
|
36
36
|
import { autoPersona } from './auto-persona.mjs';
|
|
37
37
|
import { graduateAllScratchpads } from './graduate-session.mjs';
|
|
38
|
+
import { syncDecisionsJournal } from './decisions-journal.mjs';
|
|
38
39
|
|
|
39
40
|
/**
|
|
40
41
|
* Run the two independent SessionEnd Haiku passes concurrently.
|
|
@@ -45,7 +46,7 @@ import { graduateAllScratchpads } from './graduate-session.mjs';
|
|
|
45
46
|
* @param {() => object} opts.makeBackend - factory returning a fresh CompressorBackend
|
|
46
47
|
* per call (each concurrent pass gets its own instance — no shared state).
|
|
47
48
|
* @param {string} [opts.now] - ISO timestamp override (tests).
|
|
48
|
-
* @returns {Promise<{compressOutcome: PromiseSettledResult, personaOutcome: PromiseSettledResult, graduationOutcome: PromiseSettledResult}>}
|
|
49
|
+
* @returns {Promise<{compressOutcome: PromiseSettledResult, personaOutcome: PromiseSettledResult, graduationOutcome: PromiseSettledResult, journalOutcome: PromiseSettledResult}>}
|
|
49
50
|
*/
|
|
50
51
|
export async function runSessionEndTasks({ projectRoot, userDir, makeBackend, now }) {
|
|
51
52
|
const [compressOutcome, personaOutcome] = await Promise.allSettled([
|
|
@@ -75,7 +76,30 @@ export async function runSessionEndTasks({ projectRoot, userDir, makeBackend, no
|
|
|
75
76
|
graduationOutcome = { status: 'rejected', reason: err };
|
|
76
77
|
}
|
|
77
78
|
|
|
78
|
-
|
|
79
|
+
// Task 159 (D-169): auto-sync the decision journal. This is what makes
|
|
80
|
+
// DECISIONS.md "automatic" (D-164) — Task 147 BUILT the append logic but wired
|
|
81
|
+
// it to ONLY the manual `cmk digest`, so the journal never populated on its own.
|
|
82
|
+
// Same shape as the graduation sweep: SEQUENTIAL, pure local file I/O (reads the
|
|
83
|
+
// type:project fact files auto-extract wrote per-turn → rewrites DECISIONS.md),
|
|
84
|
+
// no Haiku/network (~175ms), no hook-ceiling risk, wrapped so a throw can't reject
|
|
85
|
+
// the hook. DISJOINT from compress (sessions/ tree) + persona (user-tier) +
|
|
86
|
+
// graduation (persona scratchpads) — nothing else in the block touches DECISIONS.md,
|
|
87
|
+
// so no lock contention. Session-end is the natural "this session's decisions
|
|
88
|
+
// landed → render them" boundary (squad's session-end Scribe instinct, made
|
|
89
|
+
// deterministic — the kit's typed-fact substrate needs no LLM to merge).
|
|
90
|
+
// syncDecisionsJournal is already best-effort (its own try/catch returns
|
|
91
|
+
// {written:false,error}); the wrapper here guards the unexpected synchronous throw.
|
|
92
|
+
let journalOutcome;
|
|
93
|
+
try {
|
|
94
|
+
journalOutcome = {
|
|
95
|
+
status: 'fulfilled',
|
|
96
|
+
value: syncDecisionsJournal({ projectRoot, now }),
|
|
97
|
+
};
|
|
98
|
+
} catch (err) {
|
|
99
|
+
journalOutcome = { status: 'rejected', reason: err };
|
|
100
|
+
}
|
|
101
|
+
|
|
102
|
+
return { compressOutcome, personaOutcome, graduationOutcome, journalOutcome };
|
|
79
103
|
}
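The wrapper around `syncDecisionsJournal` reports a synchronous call in the same shape `Promise.allSettled` produces. A minimal sketch of that pattern follows; `settleSync` is a hypothetical helper name, and the `{ written, appended }` value only mimics the fields the summary code reads.

```javascript
// Run a synchronous task and report it in the PromiseSettledResult shape,
// so sync and async outcomes can be summarized by the same code.
function settleSync(task) {
  try {
    return { status: 'fulfilled', value: task() };
  } catch (err) {
    return { status: 'rejected', reason: err };
  }
}

// A task that succeeds and one that throws unexpectedly.
const ok = settleSync(() => ({ written: true, appended: 2 }));
const failed = settleSync(() => {
  throw new Error('disk full');
});

console.log(ok.status, ok.value);
console.log(failed.status, failed.reason.message);
```

Keeping the same `status` / `value` / `reason` fields means the caller never needs a separate code path for the one sequential step.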
|
|
80
104
|
|
|
81
105
|
/**
|
|
@@ -86,7 +110,7 @@ export async function runSessionEndTasks({ projectRoot, userDir, makeBackend, no
|
|
|
86
110
|
* @param {{compressOutcome: PromiseSettledResult, personaOutcome: PromiseSettledResult}} outcomes
|
|
87
111
|
* @returns {string[]}
|
|
88
112
|
*/
|
|
89
|
-
export function summarizeSessionEnd({ compressOutcome, personaOutcome, graduationOutcome }) {
|
|
113
|
+
export function summarizeSessionEnd({ compressOutcome, personaOutcome, graduationOutcome, journalOutcome }) {
|
|
90
114
|
const lines = [];
|
|
91
115
|
|
|
92
116
|
if (compressOutcome.status === 'fulfilled') {
|
|
@@ -123,5 +147,18 @@ export function summarizeSessionEnd({ compressOutcome, personaOutcome, graduatio
|
|
|
123
147
|
}
|
|
124
148
|
}
|
|
125
149
|
|
|
150
|
+
// journalOutcome is optional (Task 159) — pre-159 callers render no journal line.
|
|
151
|
+
if (journalOutcome) {
|
|
152
|
+
if (journalOutcome.status === 'fulfilled') {
|
|
153
|
+
const j = journalOutcome.value ?? {};
|
|
154
|
+
lines.push(
|
|
155
|
+
`cmk-compress-session: journal (written: ${j.written ?? false}, appended: ${j.appended ?? 0})\n`,
|
|
156
|
+
);
|
|
157
|
+
} else {
|
|
158
|
+
const e = journalOutcome.reason;
|
|
159
|
+
lines.push(`cmk-compress-session: journal sync failed: ${e?.message ?? e}\n`);
|
|
160
|
+
}
|
|
161
|
+
}
|
|
162
|
+
|
|
126
163
|
return lines;
|
|
127
164
|
}
|
package/src/subcommands.mjs
CHANGED
|
@@ -416,7 +416,19 @@ async function runSearch(queryParts, options) {
|
|
|
416
416
|
let mode = explicitMode ?? resolveDefaultSearchMode({ projectRoot });
|
|
417
417
|
// Task 104.2 — the L3 raw tier: `--scope transcripts` searches the
|
|
418
418
|
// separate transcript-chunk index (synthetic T: ids; no tier/trust).
|
|
419
|
+
// Task 156 — `--scope decisions` scans context/DECISIONS.md (the decision
|
|
420
|
+
// journal) for decision-history / "what did we reject" recall.
|
|
419
421
|
const scope = options?.scope ?? 'facts';
|
|
422
|
+
// Task 156 / v0.3.3 cut-gate-16: the `decisions` scope is keyword-only BY
|
|
423
|
+
// DESIGN — it scans the flat `context/DECISIONS.md` journal, which is NOT
|
|
424
|
+
// embedded (no vec table). So it can never go through the semantic backend.
|
|
425
|
+
// Coerce to keyword BEFORE the semantic block, silently — a user who has the
|
|
426
|
+
// hybrid default (from `--with-semantic`) must not see a scary
|
|
427
|
+
// "unknown-scope:decisions" warning (configured default) or hard exit-2
|
|
428
|
+
// (explicit --mode) for using a real, shipped scope. The recall just works.
|
|
429
|
+
if (scope === 'decisions') {
|
|
430
|
+
mode = SEARCH_MODES.KEYWORD;
|
|
431
|
+
}
|
|
420
432
|
let semanticBackend;
|
|
421
433
|
if (mode === SEARCH_MODES.SEMANTIC || mode === SEARCH_MODES.HYBRID) {
|
|
422
434
|
const { prepareSemanticBackend } = await import('./semantic-backend.mjs');
|
|
@@ -443,6 +455,7 @@ async function runSearch(queryParts, options) {
|
|
|
443
455
|
query,
|
|
444
456
|
mode,
|
|
445
457
|
scope,
|
|
458
|
+
projectRoot, // Task 156: the decisions scope reads context/DECISIONS.md
|
|
446
459
|
minTrust: options?.minTrust,
|
|
447
460
|
tier: options?.tier,
|
|
448
461
|
since: options?.since,
|
|
@@ -464,9 +477,15 @@ async function runSearch(queryParts, options) {
|
|
|
464
477
|
for (const hit of r.results) {
|
|
465
478
|
// Plain-text output suitable for terminal piping. Snippet uses
|
|
466
479
|
// FTS5's <b>...</b> markers; preserved as-is so callers can pipe
|
|
467
|
-
// to a TUI that renders them OR strip via sed.
|
|
468
|
-
//
|
|
469
|
-
|
|
480
|
+
// to a TUI that renders them OR strip via sed. Hits with no tier/trust
|
|
481
|
+
// (raw transcript chunks; decision-journal entries) show the scope's
|
|
482
|
+
// label instead — 'transcript' for the L3 raw tier, 'decision' for the
|
|
483
|
+
// journal (Task 156), plus a `(retracted)` marker so the "what did we
|
|
484
|
+
// reject" trail is visible at a glance.
|
|
485
|
+
let provenance;
|
|
486
|
+
if (hit.tier) provenance = `${hit.tier}/${hit.trust}`;
|
|
487
|
+
else if (r.scope === 'decisions') provenance = hit.retracted ? 'decision (retracted)' : 'decision';
|
|
488
|
+
else provenance = 'transcript';
|
|
470
489
|
console.log(
|
|
471
490
|
`${hit.id}\t${provenance}\t${hit.source_file}:${hit.source_line}\t${hit.snippet}`,
|
|
472
491
|
);
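The provenance column logic above can be read as a small pure function. The sketch below restates it outside the CLI loop; `provenanceLabel` and `formatHit` are hypothetical names, and the sample tier, trust, and id values are assumptions chosen only to exercise each branch.

```javascript
// Pick the provenance column for one hit, following the branch order in the
// diff: tier/trust when present, otherwise a scope-specific label.
function provenanceLabel(hit, scope) {
  if (hit.tier) return `${hit.tier}/${hit.trust}`;
  if (scope === 'decisions') return hit.retracted ? 'decision (retracted)' : 'decision';
  return 'transcript';
}

// Build the tab-separated line the CLI prints for a hit.
function formatHit(hit, scope) {
  return [
    hit.id,
    provenanceLabel(hit, scope),
    `${hit.source_file}:${hit.source_line}`,
    hit.snippet,
  ].join('\t');
}

// Sample hits; all field values are illustrative.
const factHit = { id: 'P-S79MJHFN', tier: 'project', trust: 'high', source_file: 'a.md', source_line: 3, snippet: 'x' };
const journalHit = { id: 'P-AAAAAAAA', retracted: true, source_file: 'context/DECISIONS.md', source_line: 1, snippet: 'Use Postgres' };
const chunkHit = { id: 'T:1', source_file: 't.jsonl', source_line: 9, snippet: 'y' };

console.log(formatHit(factHit, 'facts'));
console.log(formatHit(journalHit, 'decisions'));
console.log(formatHit(chunkHit, 'transcripts'));
```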
|
|
@@ -512,10 +531,19 @@ export function withReadDb(fn, deps = {}) {
|
|
|
512
531
|
}
|
|
513
532
|
}
|
|
514
533
|
|
|
515
|
-
export function runGet(ids,
|
|
534
|
+
export function runGet(ids, options = {}, _command, deps = {}) {
|
|
516
535
|
const log = deps.log ?? console.log;
|
|
517
536
|
const list = Array.isArray(ids) ? ids : [ids];
|
|
518
|
-
|
|
537
|
+
// Task 155 (D-163): `--include-tombstoned` is the HUMAN-only recovery opt-in.
|
|
538
|
+
// It's a CLI flag ONLY — the MCP mk_get tool never exposes it, so automatic
|
|
539
|
+
// recall stays tombstone-blind. projectRoot is resolved the same way
|
|
540
|
+
// withReadDb does, so the tombstone-file fallback can find the archive.
|
|
541
|
+
const includeTombstoned = options.includeTombstoned === true;
|
|
542
|
+
const projectRoot = deps.projectRoot ?? resolvePath(process.cwd());
|
|
543
|
+
const rows = withReadDb(
|
|
544
|
+
(db) => getObservations(db, list, { includeTombstoned, projectRoot }),
|
|
545
|
+
deps,
|
|
546
|
+
);
|
|
519
547
|
log(JSON.stringify(rows, null, 2));
|
|
520
548
|
// All-missing/invalid → exit 2 (lets a script tell "nothing matched" from a hit).
|
|
521
549
|
if (rows.length > 0 && rows.every((r) => r.error)) process.exitCode = 2;
|
|
@@ -1999,6 +2027,12 @@ export const subcommands = [
|
|
|
1999
2027
|
description: 'fetch full observation bodies + provenance by ID (parity with the mk_get MCP tool)',
|
|
2000
2028
|
milestone: 108,
|
|
2001
2029
|
argSpec: [{ flags: '<ids...>', description: 'one or more citation IDs (e.g. P-S79MJHFN)' }],
|
|
2030
|
+
optionSpec: [
|
|
2031
|
+
{
|
|
2032
|
+
flags: '--include-tombstoned',
|
|
2033
|
+
description: 'also recover forgotten (tombstoned) facts from the archive — human-only; the AI never reads tombstones',
|
|
2034
|
+
},
|
|
2035
|
+
],
|
|
2002
2036
|
action: runGet,
|
|
2003
2037
|
},
|
|
2004
2038
|
{
|
package/template/.claude/skills/memory-search/SKILL.md
CHANGED
|
@@ -63,6 +63,26 @@ Hits are raw turn excerpts (dialogue + the tools the agent ran), keyed
|
|
|
63
63
|
whole turns. If something found here is durably useful, say so in the
|
|
64
64
|
summary so the caller can capture it as a proper fact.
|
|
65
65
|
|
|
66
|
+
## Decision HISTORY — the `decisions` scope
|
|
67
|
+
|
|
68
|
+
For "what did we DECIDE about X" a normal fact search (steps 1-3) is enough —
|
|
69
|
+
the decision fact carries its own **Why**. But when the question is about how a
|
|
70
|
+
decision **evolved**, what we **reject**ed or moved away from, or **why X
|
|
71
|
+
changed** ("did we ever consider Y?", "weren't we using Postgres?", "what did
|
|
72
|
+
we decide and did it change?"), search the **decision journal** — the
|
|
73
|
+
append-only `context/DECISIONS.md`, which keeps superseded + retracted entries
|
|
74
|
+
the live fact store no longer carries:
|
|
75
|
+
|
|
76
|
+
- MCP: `mk_search` with `scope: "decisions"`.
|
|
77
|
+
- CLI: `cmk search "<topic>" --scope decisions`
|
|
78
|
+
|
|
79
|
+
Hits are decision entries keyed by their fact id, labelled `decision` (or
|
|
80
|
+
`decision (retracted)` for a reversed one). The retracted/superseded entries
|
|
81
|
+
ARE the answer to "what did we reject" — surface them explicitly, with the
|
|
82
|
+
date, so the caller sees the trail. Use this scope IN ADDITION to the fact
|
|
83
|
+
ladder when the question has a history/evolution axis; the fact search answers
|
|
84
|
+
the "current decision", the journal answers "how it got there".
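A script consuming `cmk search "<topic>" --scope decisions` could pull out the rejected decisions from the tab-separated output. The sketch below assumes the line layout printed by the CLI in this release (`id`, provenance, `file:line`, snippet, separated by tabs); `parseSearchLine` is a hypothetical helper, and the sample lines are made up.

```javascript
// Parse one tab-separated line of `cmk search` output into fields.
function parseSearchLine(line) {
  const [id, provenance, location, ...rest] = line.split('\t');
  // Split on the LAST colon so a path containing ':' keeps its prefix.
  const cut = location.lastIndexOf(':');
  return {
    id,
    provenance,
    source_file: location.slice(0, cut),
    source_line: Number(location.slice(cut + 1)),
    // Rejoin in case the snippet itself contained a tab.
    snippet: rest.join('\t'),
    retracted: provenance === 'decision (retracted)',
  };
}

// Illustrative output lines, not captured from a real run.
const sample = [
  'P-AAAAAAAA\tdecision (retracted)\tcontext/DECISIONS.md:1\tUse Postgres',
  'P-BBBBBBBB\tdecision\tcontext/DECISIONS.md:7\tUse SQLite',
];

const rejected = sample.map(parseSearchLine).filter((hit) => hit.retracted);
console.log(rejected);
```

Filtering on the exact `decision (retracted)` label keeps the "what did we reject" trail separate from decisions that are still current.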
|
|
85
|
+
|
|
66
86
|
## When the query is vague
|
|
67
87
|
|
|
68
88
|
If you cannot form a concrete query, look at recent activity first, then
|
package/template/CLAUDE.md.template
CHANGED
|
@@ -29,6 +29,7 @@ The `cmk doctor` health checks verify each layer is wired correctly: install int
|
|
|
29
29
|
The snapshot injected at session start is a **bounded hot index, not everything** — there is a deeper, queryable archive. When a question is "what did we decide / what's our X / how does the user work / what's the setup / **how is this project structured or built / where does X live / what's the architecture**," **query your memory instead of re-deriving the answer from scratch** — the structure is a recorded decision, recall it before re-reading the files to reconstruct it:
|
|
30
30
|
|
|
31
31
|
- **`cmk search "<topic>"`** — find any captured fact (decisions, preferences, config, lessons) across the project + user tiers.
|
|
32
|
+
- **`cmk search "<topic>" --scope decisions`** — the append-only **decision journal** (`context/DECISIONS.md`). Use it for decision **history / evolution** — "what did we reject", "did X change", "why did we move away from Y" — it keeps superseded + retracted decisions the live fact store drops. (A plain `cmk search` answers "what's the current decision"; this answers "how it got there".)
|
|
32
33
|
- **`context/memory/<type>_<slug>.md`** — the granular fact archive with full **Why / How** rationale (`context/memory/INDEX.md` lists them).
|
|
33
34
|
- **`~/.claude-memory-kit/` (`USER.md` / `HABITS.md` / `LESSONS.md`)** — how this user works across *all* their projects.
|
|
34
35
|
|