akm-cli 0.9.0-beta.2 → 0.9.0-beta.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/CHANGELOG.md +248 -0
  2. package/dist/assets/templates/html/default.html +78 -0
  3. package/dist/assets/templates/html/health.html +560 -0
  4. package/dist/assets/templates/html/vendor/echarts.min.js +45 -0
  5. package/dist/cli/shared.js +21 -5
  6. package/dist/cli.js +36 -5
  7. package/dist/commands/health/html-report.js +448 -0
  8. package/dist/commands/health.js +97 -6
  9. package/dist/commands/improve/consolidate.js +15 -2
  10. package/dist/commands/improve/extract.js +38 -2
  11. package/dist/commands/improve/improve-auto-accept.js +27 -1
  12. package/dist/commands/improve/improve.js +167 -53
  13. package/dist/commands/improve/reflect-noise.js +0 -0
  14. package/dist/commands/improve/reflect.js +25 -0
  15. package/dist/commands/proposal/drain.js +73 -6
  16. package/dist/commands/proposal/proposal-cli.js +22 -10
  17. package/dist/commands/proposal/proposal.js +12 -1
  18. package/dist/commands/proposal/validators/proposals.js +361 -338
  19. package/dist/commands/remember.js +6 -2
  20. package/dist/core/config/config-schema.js +5 -0
  21. package/dist/core/logs-db.js +304 -0
  22. package/dist/core/state-db.js +107 -14
  23. package/dist/indexer/db/db.js +2 -2
  24. package/dist/indexer/passes/memory-inference.js +61 -22
  25. package/dist/integrations/harnesses/claude/session-log.js +16 -4
  26. package/dist/llm/client.js +15 -0
  27. package/dist/llm/usage-persist.js +77 -0
  28. package/dist/llm/usage-telemetry.js +103 -0
  29. package/dist/output/context.js +3 -2
  30. package/dist/output/html-render.js +73 -0
  31. package/dist/output/shapes/helpers.js +17 -1
  32. package/dist/output/text/helpers.js +69 -1
  33. package/dist/scripts/migrate-storage.js +65 -14
  34. package/dist/scripts/migrations/import-fs-improve-runs-to-db.js +14 -2
  35. package/dist/tasks/runner.js +99 -16
  36. package/dist/workflows/db.js +4 -0
  37. package/package.json +2 -1
package/CHANGELOG.md CHANGED
@@ -6,6 +6,93 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
  ## [Unreleased]
 
+ ## [0.9.0-beta.3] - 2026-06-12
+
+ Stabilization batch closing the remaining 0.9.0 milestone: DB-locking and
+ improve-pipeline perf backports, extract/reflect gate fixes, SQLite-first
+ proposal and log storage, `--format html` output, and per-stage LLM telemetry.
+
+ ### Added
+
+ - **`--format html` output with per-command templates** (#582). `akm health
+ --format html` renders the full interactive health report (ECharts inlined by
+ default, or via CDN with `AKM_ECHARTS=cdn`); every other command falls back to
+ a dark-mode default template that pretty-prints its JSON. A global `--output
+ <path>` flag writes the rendered HTML to a file instead of stdout. Token
+ replacement only — no template engine. The standalone health-report skill is
+ now folded into core.
+ - **Per-stage LLM telemetry** (#576). Every `chatCompletion` call now records
+ tokens (prompt/completion/total/reasoning), wall-time, model, and
+ finish_reason as an `llm_usage` event, attributed to the pipeline stage via an
+ ambient `AsyncLocalStorage` context (`withLlmStage`) set once per phase — no
+ `stage` parameter threaded through call sites. `akm health` exposes per-stage
+ token and time aggregates. Telemetry is best-effort and can never fail a run;
+ capture is forward-only.
+ - **Per-proposal gate decision + confidence** (#577). When a proposal passes
+ through the auto-accept/triage gate, its outcome (`auto-accepted` /
+ `deferred` / `auto-rejected`), reason, confidence, measured value, and the
+ thresholds in effect are persisted on the proposal (in the SQLite metadata).
+ `akm proposal show`/`list` surface them with reconstructable comparisons
+ (e.g. `0.72 < 0.90`), so tooling can explain *why* each proposal is pending
+ instead of relying on a run-level aggregate. Forward-only; legacy proposals
+ render `unknown`.
+
+ ### Fixed
+
+ - **`SQLITE_BUSY` / "database is locked" under concurrent runs** (#584, #585,
+ #589). `busy_timeout` raised from 5 s to 30 s on every SQLite open path
+ (index.db and state.db); the improve maintenance pass now closes its index.db
+ handle before each reindex (which opens its own writer to the same WAL file);
+ and the post-loop purge reuses the long-lived events connection instead of
+ opening a second state.db writer. Together these eliminate all observed
+ lock failures from overlapping cron improve runs. (Backports of 0.8.8.)
+ - **Extract gate ignored the active profile's `extract.enabled: false`** (#593,
+ #594). The session-extraction gate hardcoded the `default` profile, so a
+ non-default profile (e.g. a quick pass) ran extract anyway — 300–600 s of
+ redundant work per run when a dedicated extract task also exists. The gate
+ now resolves `extract` against the active improve profile. (Backport of
+ 0.8.11.)
+ - **Memory inference burned LLM calls on already-derived parents** (#588). The
+ primary pass now checks for the `<parent>.derived.md` child on disk *before*
+ the LLM/cache call, and opportunistically marks the parent processed so it
+ never re-pends. Previously ~55 % of the inference budget was spent
+ rediscovering children that already existed.
+ - **Reflect no longer queues empty-diff or cosmetic-only proposals** (#580).
+ A deterministic, LLM-free noise gate diffs each candidate against the current
+ asset; byte-identical edits are dropped and changes that are pure formatting
+ (whitespace reflow, hard-wrap changes, code-fence language hints, YAML scalar
+ re-folding) are suppressed, each recorded via summary events so suppression
+ rates are visible in `akm health`.
+
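Editorial sketch (not part of the diff): the reflect noise gate described in the last entry above can be pictured as a deterministic three-way classification. The function names are hypothetical, and the sketch covers only whitespace reflow and hard-wrap changes; code-fence language hints and YAML scalar re-folding would need their own normalization steps.

```javascript
// Hypothetical helpers; a simplified stand-in for the gate described in #580.
function normalizeWhitespace(text) {
  // Collapse every run of whitespace (spaces, tabs, newlines) to one space.
  return text.replace(/\s+/g, " ").trim();
}

function classifyEdit(current, candidate) {
  // Exactly the same string: nothing to propose.
  if (current === candidate) return "identical";
  // Same words, different wrapping or spacing: cosmetic only.
  if (normalizeWhitespace(current) === normalizeWhitespace(candidate)) return "cosmetic";
  return "substantive";
}
```

Under these assumptions only a `"substantive"` result would be queued as a proposal; the other two outcomes would be counted in a summary event.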
+ ### Added
+
+ - **`minContentChars` pre-LLM extract gate** (#595, #596). Sessions whose raw
+ size is below `profiles.improve.<name>.processes.extract.minContentChars`
+ (default 10 — only truly empty sessions/journal files) skip the extract LLM
+ call entirely. Gates on raw input size, not post-noise-filter size.
+ (Backports of 0.8.12–0.8.14.)
+ - **Structured logs database** (#579). Task and run log lines now land in a
+ dedicated `logs.db` (WAL, 30 s busy_timeout) keyed by task, run, stream, and
+ time, with retention/purge wired into the existing purge pass and `ATTACH`
+ support for joining log lines to `state.db` rows (e.g. a failed
+ `task_history` row to its log output). The scattered-log audit and per-source
+ keep/move/drop decisions are documented in `docs/technical/logs-audit.md`.
+
+ ### Changed
+
+ - **Proposals are now stored canonically in SQLite** (#578). The previously
+ bypassed `proposals` table in state.db is the single source of truth; all
+ proposal commands (`list`/`show`/`diff`/`accept`/`reject`/`revert`/`drain`),
+ the improve auto-accept gate, and health metrics read and write it through
+ one storage layer. Pending file-based proposals are imported on first read;
+ `akm proposal *` UX is unchanged. Design and migration notes live in
+ `docs/technical/proposal-storage.md`.
+ - **Improve planning no longer does per-ref DB lookups or per-ref skip events**
+ (#591, #592). Eligible refs carry a pre-resolved `filePath`, removing a
+ serial async lookup per ref (~500 s on 9 k-ref stashes), and the
+ profile-filtered skip loop emits one summary event with a count instead of
+ thousands of rows. (Backports of 0.8.9–0.8.10.)
+
  ## [0.9.0-beta.2] - 2026-06-09
 
  ### Fixed
@@ -254,6 +341,167 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
  `migrate-storage` change is pinned by a sha256 + file-mode fixture-stash
  differential test.
 
+ ## [0.8.14] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm extract` minContentChars default lowered from 500 to 10.** The 500-char
+ threshold used inputCount (raw session size) but analysis showed 209 of 218
+ candidate-producing sessions had inputCount < 500 — tiny agent sessions (22–368
+ chars) regularly yield 1–5 candidates. The only reliably skippable sessions are
+ empty ones (0 chars, journal files). Default lowered to 10 to catch only
+ truly empty sessions while preserving all signal-bearing content. Closes #597.
+
+ ## [0.8.13] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm extract` minContentChars gate filtered all sessions.** The threshold was
+ checked against `filtered.stats.outputCount` (post-noise-filter chars), but the
+ pre-filter strips so much boilerplate that even signal-bearing sessions end up
+ below 500 chars of output. All 75 sessions in the first post-deploy run were
+ filtered, dropping candidates from 4–13 to 0. Fix: gate on `inputCount` (raw
+ session size) instead — a session with < 500 raw chars has nothing worth
+ extracting regardless of what the pre-filter produces. Closes #596.
+
+ ## [0.8.12] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm extract` calling the LLM for noise sessions that never yield candidates.**
+ 96% of processed sessions (72/75 measured) produced zero candidates, consuming
+ ~330 s of LLM time per run. The pre-filter had no minimum content threshold —
+ sessions as short as 50 chars were sent to the LLM regardless. A new
+ `minContentChars` gate (default 500) skips the LLM call when post-filter
+ content falls below threshold, cutting extract LLM calls by ~95% on typical
+ stashes. Configurable via `profiles.improve.<name>.processes.extract.minContentChars`.
+ Closes #595.
+
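Editorial sketch (not part of the diff): taken together, the 0.8.12 to 0.8.14 entries above converge on a gate that compares the raw input size against a small threshold. The function name and return shape below are hypothetical; `inputCount`, `outputCount`, and the default of 10 come from the entries.

```javascript
// Default stated in the 0.8.14 entry.
const DEFAULT_MIN_CONTENT_CHARS = 10;

// Hypothetical gate: `stats` mirrors the inputCount/outputCount names above.
function extractGate(stats, minContentChars = DEFAULT_MIN_CONTENT_CHARS) {
  // Gate on raw input size (the 0.8.13 fix), not on post-noise-filter size
  // (the 0.8.12 behaviour that filtered every session).
  const skip = stats.inputCount < minContentChars;
  return { skip, reason: skip ? "below-min-content-chars" : null };
}
```

With the default, a 22-char session still reaches the LLM, while an empty journal file is skipped; passing `500` as the second argument reproduces the over-aggressive 0.8.13 threshold.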
+ ## [0.8.11] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm improve --profile <name>` ignored profile's `extract.enabled: false` setting.**
+ The session-extraction gate in the preparation stage called
+ `isLlmFeatureEnabled(config, "session_extraction")`, which hardcodes a lookup
+ against `profiles.improve.default.processes.extract.enabled`. Any non-default
+ profile that set `extract.enabled: false` (e.g. `quick-shredder`) was silently
+ ignored, causing the extract pass to run regardless. The fix adds a
+ `resolveProcessEnabled("extract", improveProfile)` check so the active
+ resolved profile gates the pass correctly. Closes #593.
+
+ ## [0.8.10] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm improve` taking 8–10 minutes per run due to O(n) DB writes for
+ profile-filtered refs.** When a profile disables reflect and distill for
+ certain asset types, `collectEligibleRefs` marks those refs as
+ `profile_filtered_all_passes`. The caller then emitted one `improve_skipped`
+ event per ref — a sequential DB write for each. On a ~9 000-ref stash this
+ was ~500 s of SQLite writes before any consolidation or memory inference
+ began. The fix collapses the per-ref loop into a single summary event
+ carrying a `count` field, eliminating ~9 000 sequential writes per run.
+ Closes #590.
+
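Editorial sketch (not part of the diff): the 0.8.10 change amounts to replacing N event writes with one. The event shape and the `skipReason` property below are assumptions; only the `improve_skipped` name, the `profile_filtered_all_passes` marker, and the `count` field appear in the entry, and the entry does not say what the summary event is called.

```javascript
// Hypothetical: build the events to persist for profile-filtered refs.
function summarizeSkips(refs) {
  const filtered = refs.filter((ref) => ref.skipReason === "profile_filtered_all_passes");
  if (filtered.length === 0) return [];
  // One summary row with a count, instead of one row per ref.
  return [
    { type: "improve_skipped", reason: "profile_filtered_all_passes", count: filtered.length },
  ];
}
```

The number of writes is then independent of stash size, which is where the reported ~500 s went.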
+ ## [0.8.9] - 2026-06-11
+
+ ### Fixed
+
+ - **`akm improve` validation pass was O(n) in stash size, causing ~510 s overhead
+ on large stashes.** For every indexed ref, the preparation phase called
+ `findAssetFilePath()` — an async round-trip to the index DB followed by a
+ filesystem probe — serially inside a `for…await` loop. With ~9 000 indexed
+ refs at ~55 ms each, this loop consumed the entire 600–900 s run budget before
+ any reflect, triage, or memory-inference work began. The fix threads
+ `filePath` from the planning stage (`collectEligibleRefs`) through
+ `ImproveEligibleRef` so the validation pass and the disk-existence guard can
+ use the pre-resolved path directly. The async lookup is retained only as a
+ fallback for refs that enter via a narrow scope (e.g. `--scope ref:foo`).
+ Closes #587.
+
+ ## [0.8.8] - 2026-06-11
+
+ ### Fixed
+
+ - **SQLite `SQLITE_BUSY` errors under concurrent improve runs.** `busy_timeout`
+ was set to 5 000 ms in all three database open paths (`openDatabase`,
+ `openExistingDatabase`, `openStateDatabase`). Under a busy cron schedule — or
+ when a reindex triggered by memory inference ran concurrently with an event
+ write — the 5 s window was routinely exhausted, producing "database is locked"
+ failures. Raised to 30 000 ms across all three paths so transient lock
+ contention is retried for up to 30 s before surfacing as an error.
+
+ ## [0.8.7] - 2026-06-09
+
+ ### Fixed
+
+ - **`incrementalSince` duration strings were silently ignored.** Values like
+ `"30m"`, `"24h"`, `"7d"` were passed raw to `narrowToIncrementalCandidates`,
+ which compared them against ISO timestamps via string sort. All `2026-...`
+ timestamps are lexicographically less than `"30m"` (`'2' < '3'`) and `"24h"`
+ (`"20" < "24"`), so `isChanged()` always returned `false` and the candidate
+ pool was silently emptied rather than filtered to the window. The fix adds
+ `parseSinceToIso()`, which resolves human duration strings to absolute ISO
+ timestamps before comparison. Values that already look like ISO timestamps
+ are passed through unchanged.
+
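Editorial sketch (not part of the diff): the 0.8.7 bug and fix can be reproduced in a few lines. The real `parseSinceToIso()` signature is not shown in this diff, so the version below is an assumption that handles only the `m`/`h`/`d` suffixes quoted in the entry and takes an injectable `now` so the result is deterministic.

```javascript
// Assumed shape of the helper: duration string in, absolute ISO timestamp out.
function parseSinceToIso(since, now = new Date()) {
  const match = /^(\d+)([mhd])$/.exec(since);
  // Anything that is not a duration is assumed to already be an ISO timestamp.
  if (!match) return since;
  const unitMs = { m: 60_000, h: 3_600_000, d: 86_400_000 }[match[2]];
  return new Date(now.getTime() - Number(match[1]) * unitMs).toISOString();
}

const modifiedAt = "2026-06-09T12:00:00.000Z";
const now = new Date("2026-06-09T12:10:00.000Z");

// The bug: string comparison against the raw duration. "2" sorts before "3",
// so no 2026 timestamp is ever "after" "30m".
const buggy = modifiedAt > "30m";

// The fix: resolve the duration first, then compare two ISO strings, which
// sort chronologically when both are in UTC with the same format.
const fixed = modifiedAt > parseSinceToIso("30m", now);
```

Here `buggy` is `false` and `fixed` is `true`, because the asset was modified 10 minutes before `now`, well inside the 30-minute window.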
+ ## [0.8.6] - 2026-06-09
+
+ ### Added
+
+ - **`consolidate.incrementalSince` profile config field.** Setting
+ `incrementalSince: "7d"` (or any duration string) in the `consolidate` block
+ of an improve profile narrows the candidate pool to memories modified within
+ that window plus their top-5 graph neighbours, keeping each pass focused on
+ recent changes. This makes it practical to run consolidation more often than
+ once per day (e.g. via `akm-improve-consolidate` every 4 h) without
+ re-scanning the full pool every time. The nightly default profile leaves this
+ unset (full-pool sweep, same as before). The `incrementalSince` option already
+ existed in `akmConsolidate()` but was hardcoded off at the call site; the
+ field is now surfaced in the config schema and read from the profile.
+
+ ## [0.8.5] - 2026-06-09
+
+ ### Fixed
+
+ - **Consolidation starved merge recall; the memory pool grew unbounded.** Commit
+ `633ece41` made the `incrementalSince` narrowing unconditional, so every
+ consolidation run only judged memories changed since the last run plus their
+ immediate vector-neighbors. Stale-but-unmerged duplicate clusters were never
+ re-examined, so the eligible pool grew monotonically and never shrank, and
+ contradiction detection (which rides on the consolidation pass) went dark.
+ Consolidation only runs on the nightly default-profile pass (`quick`/`frequent`
+ disable it), so a full-pool sweep is correct and affordable; the override is
+ removed. `lastConsolidateTs` still gates whether the pass runs.
+
+ ## [0.8.4] - 2026-06-08
+
+ ### Fixed
+
+ - **`akm tasks sync` ignored schedule changes.** Sync classified any task already
+ present in the OS scheduler as "unchanged" without comparing its installed
+ entry, so editing a task's `schedule:` in the `.yml` never reached the crontab —
+ the only way to apply a new schedule was to `remove` and re-`add` the task. The
+ same gap affected `tasks enable`/`disable`, which merely toggled the existing
+ cron line's comment and so re-enabled a stale schedule. Sync now compares the
+ backend's installed signature against the signature the current definition would
+ produce and reinstalls on drift (reported in a new `updated[]` field);
+ `enable`/`disable` reinstall from the current `.yml` instead of toggling in
+ place. Backends that can't cheaply read their installed form fall back to an
+ idempotent reinstall, so the fix is correct on launchd/schtasks too. The cron
+ backend gains `expectedSignature()` and a signature on each `list()` entry.
+
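Editorial sketch (not part of the diff): the drift check in the 0.8.4 entry above compares two signatures per task. The data shapes and the `added`/`unchanged` buckets are assumptions for illustration; only `updated[]` and the idea of comparing installed against expected signatures come from the entry.

```javascript
// Hypothetical shapes: `installed` maps task id -> signature currently in the
// scheduler; `expected` maps task id -> signature the current .yml would produce.
function classifySync(installed, expected) {
  const result = { added: [], updated: [], unchanged: [] };
  for (const [id, signature] of Object.entries(expected)) {
    if (!Object.prototype.hasOwnProperty.call(installed, id)) {
      result.added.push(id);
    } else if (installed[id] !== signature) {
      // Present but different: this is the drift the old sync missed.
      result.updated.push(id);
    } else {
      result.unchanged.push(id);
    }
  }
  return result;
}
```

Before the fix, every task found in the scheduler would have landed in the "unchanged" bucket regardless of its signature.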
+ ### Added
+
+ - **`akm improve --skip-if-locked`.** When another improve run already holds the
+ lock, the run logs and exits 0 with a no-op result (`skipped.reason:
+ "lock-held"`) instead of failing with the "already running" config error
+ (exit 78). Intended for high-frequency scheduled runs (e.g. an every-30-min
+ `quick` pass) that would otherwise pile up exit-78 failures whenever a longer
+ run overlaps them. Default off — the hard error is preserved for interactive
+ use. The result is still recorded so the skip is auditable.
+
  ## [0.8.3] - 2026-06-08
 
  ### Fixed
package/dist/assets/templates/html/default.html ADDED
@@ -0,0 +1,78 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>akm %%COMMAND%%</title>
+ <style>
+ *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
+
+ :root {
+ --bg: #0d1117;
+ --surface: #161b22;
+ --border: #30363d;
+ --text: #e6edf3;
+ --muted: #8b949e;
+ --accent: #58a6ff;
+ }
+
+ body {
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Noto Sans', sans-serif;
+ background: var(--bg);
+ color: var(--text);
+ font-size: 14px;
+ line-height: 1.6;
+ padding: 24px;
+ }
+
+ header {
+ max-width: 980px;
+ margin: 0 auto 16px;
+ display: flex;
+ align-items: baseline;
+ gap: 12px;
+ flex-wrap: wrap;
+ }
+ header .logo { font-size: 20px; font-weight: 700; color: var(--accent); letter-spacing: -0.5px; }
+ header .command { font-family: ui-monospace, SFMono-Regular, Menlo, monospace; color: var(--muted); }
+
+ main { max-width: 980px; margin: 0 auto; }
+
+ pre {
+ background: var(--surface);
+ border: 1px solid var(--border);
+ border-radius: 8px;
+ padding: 16px 20px;
+ overflow-x: auto;
+ font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
+ font-size: 13px;
+ white-space: pre-wrap;
+ word-break: break-word;
+ }
+
+ footer {
+ max-width: 980px;
+ margin: 16px auto 0;
+ color: var(--muted);
+ font-size: 12px;
+ display: flex;
+ justify-content: space-between;
+ gap: 12px;
+ flex-wrap: wrap;
+ }
+ </style>
+ </head>
+ <body>
+ <header>
+ <span class="logo">akm</span>
+ <span class="command">%%COMMAND%%</span>
+ </header>
+ <main>
+ <pre>%%CONTENT_JSON%%</pre>
+ </main>
+ <footer>
+ <span>akm — Agent Knowledge Management</span>
+ <span>Generated %%GENERATED_AT%%</span>
+ </footer>
+ </body>
+ </html>
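Editorial sketch (not part of the diff): the template above relies on `%%NAME%%` tokens, which the changelog describes as "token replacement only". The code in `package/dist/output/html-render.js` is not shown in this section, so the function names below are hypothetical, and the HTML escaping step is an assumption about how untrusted JSON would safely reach the `<pre>` element.

```javascript
// Hypothetical helper: escape text before placing it inside HTML.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical renderer: replace each %%NAME%% token with its value.
// Unknown tokens are left in place so a missing value is easy to spot.
function renderTemplate(template, tokens) {
  return template.replace(/%%([A-Z_]+)%%/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(tokens, name) ? tokens[name] : whole
  );
}

const html = renderTemplate("<title>akm %%COMMAND%%</title><pre>%%CONTENT_JSON%%</pre>", {
  COMMAND: escapeHtml("health"),
  CONTENT_JSON: escapeHtml(JSON.stringify({ ok: true, note: "<b>" }, null, 2)),
});
```

Using a replacer function rather than a replacement string means `$` sequences in the inserted content are never interpreted as special patterns by `String.prototype.replace`.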