npm - akm-cli - Versions diffs - 0.9.0-beta.5 → 0.9.0-beta.9 - Mend

akm-cli 0.9.0-beta.5 → 0.9.0-beta.9


Files changed (30)

package/CHANGELOG.md +119 -0
package/dist/cli.js +7 -0
package/dist/commands/feedback-cli.js +42 -37
package/dist/commands/graph/graph.js +75 -71
package/dist/commands/health.js +10 -2
package/dist/commands/improve/consolidate.js +24 -4
package/dist/commands/improve/distill.js +26 -5
package/dist/commands/improve/extract-prompt.js +1 -1
package/dist/commands/improve/improve-auto-accept.js +6 -0
package/dist/commands/improve/improve-profiles.js +4 -0
package/dist/commands/improve/improve.js +753 -465
package/dist/commands/improve/proactive-maintenance.js +113 -0
package/dist/commands/improve/reflect.js +6 -0
package/dist/commands/proposal/proposal.js +5 -0
package/dist/commands/proposal/validators/proposals.js +67 -54
package/dist/commands/read/curate.js +17 -0
package/dist/commands/sources/stash-cli.js +10 -2
package/dist/core/config/config-schema.js +25 -0
package/dist/core/paths.js +3 -0
package/dist/core/state-db.js +46 -1
package/dist/indexer/db/db.js +97 -11
package/dist/indexer/ensure-index.js +152 -17
package/dist/indexer/index-writer-lock.js +99 -0
package/dist/indexer/indexer.js +114 -111
package/dist/integrations/harnesses/claude/session-log.js +1 -1
package/dist/llm/client.js +23 -4
package/dist/scripts/migrate-storage.js +90 -13
package/dist/scripts/migrations/import-fs-improve-runs-to-db.js +8 -1
package/dist/sources/providers/tar-utils.js +16 -8
package/package.json +2 -2

package/CHANGELOG.md CHANGED

@@ -6,6 +6,125 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ## [Unreleased]
+## [0.9.0-beta.9] - 2026-06-14
+Restore and instrument `akm improve` steady-state output. The reflect/distill
+self-improvement lanes had been near-zero in steady state because the
+signal-delta eligibility gate was the only lane (cache "no-access = no-work"
+pathology) and the high-retrieval fallback was structurally dead. This release
+revives proactive improvement, adds attribution + a measurement/kill-criterion
+system so the lane must prove its value, and right-sizes reflect budgets to
+their task timeouts.
+### Added
+- **Proactive maintenance selector** (`proactiveMaintenance` improve process):
+  due-gated, composite-priority (`importance × log(1+retrievalFreq) ×
+  recencyDecay / log(size)`), bounded rotating top-N reflect/distill over
+  stale/never-reflected assets. **Disabled by default**; enable per profile.
+- **Eligibility attribution**: every reflect/distill proposal is stamped
+  `eligibilitySource ∈ {signal-delta, high-retrieval, proactive, scope,
+  unknown}` on `reflect_invoked`/`distill_invoked`/`promoted` events and the
+  proposal record, so outcomes are sliceable by lane.
+- **Measurement system** under `scripts/akm-eval/`: a real-query retrieval suite
+  generated from `usage_events`, and `akm-eval-proactive-verdict` — a read-only
+  kill-criterion runner comparing the proactive lane (treatment) vs due-but-
+  untouched assets (control). Emits PASS/FAIL/INCONCLUSIVE and recommends
+  disabling the lane on FAIL. New `proactive_selected` event +
+  `proactiveSelected`/`proactiveDueTotal`/`proactiveNeverReflected` fields on
+  `improve_completed`.
+### Fixed
+- Revived the P0-A high-retrieval fallback: genuinely zero-feedback assets were
+  routed to the fully-skipped branch one phase before the fallback could see
+  them, so frequently-retrieved-but-never-rated assets were never improved.
+- `getRetrievalCounts` now normalizes bare vs `origin//`-prefixed refs (it was
+  dropping ~half the retrieval signal) and counts `curate` events
+  (`akm curate` now records per-item `entry_ref`).
+- The fully-skipped `no_new_signal` branch emitted one `improve_skipped` event
+  per ref (~11K writes/run, ~400K rows/day) — a contributor to 900s improve
+  timeouts and state.db bloat. Collapsed into one aggregated counted event.
+## [0.9.0-beta.8] - 2026-06-13
+Fix multi-process SQLite contention in `index.db` and harden concurrent proposal
+queue mutations.
+### Changed
+- Added a global `index.db` writer lease used by foreground indexing,
+  background auto-index, improve maintenance index writers, graph updates, and
+  feedback writes.
+- Replaced the racy background index PID-file dedup flow with lease-based
+  coordination and explicit handoff to the spawned worker.
+- `akm feedback` now uses blocking index preparation and writes under the same
+  `index.db` lease, avoiding self-inflicted `database is locked` failures.
+- Proposal queue create/archive/gate-decision mutations now run under
+  `BEGIN IMMEDIATE` state.db transactions so concurrent processes serialize on
+  live queue state.
+## [0.9.0-beta.7] - 2026-06-13
+Fix the `akm improve` regression introduced by background `ensureIndex`.
+### Changed
+- Added an explicit `ensureIndex` mode so callers choose `background` or
+  `blocking` behavior directly instead of relying on hidden environment state.
+- `akm improve` now uses blocking index preparation before collecting eligible
+  refs, restoring the post-upgrade empty-index recovery path.
+- Removed the `AKM_INDEX_INLINE` test-only override so tests exercise the same
+  index behavior model as production.
+## [0.9.0-beta.6] - 2026-06-12
+Pipeline optimization: new per-process config fields wire up the consolidation
+and improve pipeline knobs exposed by the optimization report — incremental
+consolidation, pool caps, distill gating, and memory inference throttling.
+### Added
+- **`consolidate.incrementalSince`** — profile config field that narrows the
+  consolidation candidate pool to memories modified within the given window
+  (e.g. `"1h"`, `"4h"`) plus their graph neighbours. Enables frequent
+  consolidation passes (e.g. `quick-shredder` every 15 min) without full-pool
+  sweeps. Absent = full-pool sweep (correct for nightly runs).
+- **`consolidate.limit`** — hard cap on memories processed per consolidation
+  pass, applied after incremental narrowing. Prevents runaway full-pool sweeps
+  in the nightly default profile.
+- **`consolidate.neighborsPerChanged`** — configurable graph-neighbour count
+  per changed memory during incremental consolidation (was hardcoded to 5).
+  `quick-shredder` sets this to 3 for a 40% candidate reduction per burst.
+- **`distill.requirePlannedRefs`** — when `true`, the distill process is
+  skipped entirely for distill-only refs when the reflect phase produced zero
+  planned refs. Eliminates hundreds of `distill-skipped` events on quiet passes
+  where all refs are on reflect cooldown.
+- **`memoryInference.minPendingCount`** — minimum pending split-parent memory
+  count below which the inference pass is skipped entirely (zero LLM calls).
+  Prevents lock acquisition on passes where there is nothing to infer.
+- **`reflect.limit`** — per-process ref limit for the reflect/distill loop,
+  applied as the improve run limit when no CLI `--limit` is given.
+- **New `reflect-distill` improve profile** — dedicated reflect + distill +
+  memoryInference + triage profile for the every-4h `akm-improve-frequent`
+  task. `reflect.limit: 25` bounds LLM cost per pass.
+### Changed
+- **`quick-shredder` profile tuned**: `incrementalSince` `4h` → `1h`,
+  `maxChunkSize` 25 → 35, added `minPoolSize: 10`, `neighborsPerChanged: 3`,
+  `memoryInference.minPendingCount: 5`. All `profile: "qwen-9b-shredder"`
+  process references removed — falls back to default LLM.
+- **`default` improve profile** (nightly): extract disabled (dedicated
+  `akm-extract` task runs at 01:48), consolidate gets `limit: 500`,
+  reflect gets `limit: 100` and `allowedTypes`, distill gets
+  `requirePlannedRefs: true`, triage enabled at 50 accepts/run,
+  graphExtraction explicitly enabled.
+- **Cron schedule optimised**: extract reverted to `8,28,48 * * * *` (3×/hr),
+  quick-shredder shifted to `4,19,34,49` (4-min extract gap), health-report
+  shifted to `:03` (avoids `:00` collision), `akm-improve-frequent` re-enabled
+  at `45 */4` with `reflect-distill` profile.
 ## [0.9.0-beta.3] - 2026-06-12
 Stabilization batch closing the remaining 0.9.0 milestone: DB-locking and
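
The 0.9.0-beta.9 entry gives the proactive maintenance priority as `importance × log(1+retrievalFreq) × recencyDecay / log(size)` but does not define the individual terms. The sketch below is one possible reading, not the package's implementation: the field names, the 30-day half-life used for `recencyDecay`, the use of bytes for `size`, and the floor that keeps `log(size)` positive are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the composite priority named in the changelog.
// Field names, the half-life decay, and the size floor are assumptions.
const DAY_MS = 86_400_000;
const HALF_LIFE_DAYS = 30; // assumed; the changelog does not give a value

function compositePriority(asset, nowMs = Date.now()) {
    // Assumed form of recencyDecay: halves every HALF_LIFE_DAYS since last retrieval.
    const ageDays = Math.max(0, nowMs - asset.lastRetrievedMs) / DAY_MS;
    const recencyDecay = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
    // Floor the size at 2 so log(size) stays positive; log(1) would be 0.
    const sizeTerm = Math.log(Math.max(asset.sizeBytes, 2));
    return (asset.importance * Math.log(1 + asset.retrievalFreq) * recencyDecay) / sizeTerm;
}

// Bounded top-N selection: score every asset, sort descending, keep the first n.
function selectTopN(assets, n, nowMs = Date.now()) {
    return [...assets]
        .map((asset) => ({ ref: asset.ref, priority: compositePriority(asset, nowMs) }))
        .sort((a, b) => b.priority - a.priority)
        .slice(0, n);
}
```

Under this reading an asset with zero retrievals scores 0 because `log(1 + 0)` is 0, so the selector would never pick it ahead of an asset that has been retrieved at least once.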

package/dist/cli.js CHANGED

@@ -536,6 +536,13 @@ const EXIT_HEALTH_WARN = EXIT_CODES.HEALTH_WARN;
 // The wrapper sets `AKM_NODE_ENTRY=1` to opt into the startup block. The test
 // harness never sets it, so importing cli.ts under Bun stays inert as before.
 if (import.meta.main || process.env.AKM_NODE_ENTRY === "1") {
+    // Mark that this process is the real akm CLI: its `process.argv[1]` is the
+    // akm entrypoint, so the background auto-reindex may safely re-invoke it as a
+    // detached child. Hosts that merely import this module (the in-process test
+    // harness, library embeddings) never reach this block, so they fall back to
+    // an inline reindex instead of spawning the wrong program. See
+    // `ensureIndex` in src/indexer/ensure-index.ts.
+    process.env.AKM_CLI_ENTRY = "1";
     // citty reads process.argv directly and does not accept a custom argv array,
     // so we must replace process.argv with the normalized version before runMain.
     process.argv = normalizeShowArgv(process.argv);
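
The comment added above says that only a process marked with `AKM_CLI_ENTRY` may re-invoke itself as a detached reindex child, and that other hosts fall back to an inline reindex. The consuming logic lives in `ensureIndex`, which this hunk does not show, so the helper below is a hypothetical illustration of that decision. Only the environment variable and the `background`/`blocking` mode names come from the diff and changelog; the function name and the returned labels are invented for the example.

```javascript
// Hypothetical helper: the real decision is made inside `ensureIndex`, which
// this hunk does not show. Only AKM_CLI_ENTRY and the two mode names come
// from the diff and changelog.
function chooseReindexStrategy(env, mode = "background") {
    // Blocking callers reindex in-process and wait for the result.
    if (mode === "blocking") return "inline";
    // Only the real CLI can safely re-invoke process.argv[1] as a detached child.
    return env.AKM_CLI_ENTRY === "1" ? "spawn-detached" : "inline";
}

console.log(chooseReindexStrategy({ AKM_CLI_ENTRY: "1" })); // prints "spawn-detached"
console.log(chooseReindexStrategy({})); // prints "inline"
console.log(chooseReindexStrategy({ AKM_CLI_ENTRY: "1" }, "blocking")); // prints "inline"
```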

package/dist/commands/feedback-cli.js CHANGED

@@ -14,6 +14,7 @@ import { appendEvent } from "../core/events.js";
 import { warn } from "../core/warn.js";
 import { applyFeedbackToUtilityScore, closeDatabase, findEntryIdByRef, getEntryFilePathById, openExistingDatabase, } from "../indexer/db/db.js";
 import { ensureIndex } from "../indexer/ensure-index.js";
+import { withIndexWriterLease } from "../indexer/index-writer-lock.js";
 import { resolveSourceEntries } from "../indexer/search/search-source.js";
 import { countFeedbackSignals, insertUsageEvent } from "../indexer/usage/usage-events.js";
 // ── Tag validation ────────────────────────────────────────────────────────────
@@ -203,47 +204,51 @@ export const feedbackCommand = defineCommand({
                 ...(validatedTags.length > 0 ? { tags: validatedTags } : {}),
             };
             const metadataStr = Object.keys(metadataObj).length > 1 ? JSON.stringify(metadataObj) : undefined;
-            // Auto-index when stale so the index is current before recording feedback.
-            const sources = resolveSourceEntries();
-            if (sources.length > 0) {
-                await ensureIndex(sources[0].path);
-            }
-            let utilityResult;
-            const db = openExistingDatabase();
-            try {
-                const entryId = findEntryIdByRef(db, ref);
-                if (entryId === undefined) {
-                    throw new UsageError(`Ref "${ref}" is not in the index. ` +
-                        "Run 'akm search' to verify the asset exists, then 'akm index' if it was recently added.");
+            const utilityResult = await withIndexWriterLease({ purpose: "feedback-write" }, async () => {
+                // Feedback is itself an index.db writer, so it must not spawn a detached
+                // reindex and then compete with it for the same database file.
+                const sources = resolveSourceEntries();
+                if (sources.length > 0) {
+                    await ensureIndex(sources[0].path, { mode: "blocking" });
                 }
-                // Persist the feedback signal into usage_events. For positive signals,
-                // the EMA utility score is updated immediately on the next read path.
-                // For negative signals, the score is adjusted the next time `akm index`
-                // runs — the signal is durable in the DB but does NOT suppress ranking
-                // in search results until after reindexing.
-                insertUsageEvent(db, {
-                    event_type: "feedback",
-                    entry_ref: ref,
-                    entry_id: entryId,
-                    signal,
-                    metadata: metadataStr,
-                });
-                // Apply feedback-derived utility score adjustment immediately so that
-                // positive/negative signals influence search ranking without requiring
-                // a full reindex. We query the total accumulated feedback counts from
-                // usage_events so the delta reflects the entire signal history.
-                // Uses MemRL bounded-step EMA (F-5 / #386, arXiv:2601.03192).
+                let scopedUtilityResult;
+                const db = openExistingDatabase();
                 try {
-                    const { pos, neg } = countFeedbackSignals(db, entryId);
-                    utilityResult = applyFeedbackToUtilityScore(db, entryId, pos, neg);
+                    const entryId = findEntryIdByRef(db, ref);
+                    if (entryId === undefined) {
+                        throw new UsageError(`Ref "${ref}" is not in the index. ` +
+                            "Run 'akm search' to verify the asset exists, then 'akm index' if it was recently added.");
+                    }
+                    // Persist the feedback signal into usage_events. For positive signals,
+                    // the EMA utility score is updated immediately on the next read path.
+                    // For negative signals, the score is adjusted the next time `akm index`
+                    // runs — the signal is durable in the DB but does NOT suppress ranking
+                    // in search results until after reindexing.
+                    insertUsageEvent(db, {
+                        event_type: "feedback",
+                        entry_ref: ref,
+                        entry_id: entryId,
+                        signal,
+                        metadata: metadataStr,
+                    });
+                    // Apply feedback-derived utility score adjustment immediately so that
+                    // positive/negative signals influence search ranking without requiring
+                    // a full reindex. We query the total accumulated feedback counts from
+                    // usage_events so the delta reflects the entire signal history.
+                    // Uses MemRL bounded-step EMA (F-5 / #386, arXiv:2601.03192).
+                    try {
+                        const { pos, neg } = countFeedbackSignals(db, entryId);
+                        scopedUtilityResult = applyFeedbackToUtilityScore(db, entryId, pos, neg);
+                    }
+                    catch {
+                        // best-effort — feedback recording succeeds even if utility update fails
+                    }
                 }
-                catch {
-                    // best-effort — feedback recording succeeds even if utility update fails
+                finally {
+                    closeDatabase(db);
                 }
-            }
-            finally {
-                closeDatabase(db);
-            }
+                return scopedUtilityResult;
+            });
             appendEvent({
                 eventType: "feedback",
                 ref,
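
The hunk above moves the whole feedback write inside `withIndexWriterLease` and returns the utility result out of the callback. The lease itself is defined in `index-writer-lock.js`, which is not shown here, so the following is only an in-process stand-in that mimics the calling shape: callers run one at a time, in arrival order, and each caller receives its callback's return value. The name `withWriterLeaseSketch` and the promise-chain mechanism are assumptions; the real lease coordinates separate processes, which this sketch does not attempt.

```javascript
// In-process stand-in for a writer lease: callbacks run one at a time, in order.
// The real withIndexWriterLease coordinates across processes; this does not.
let tail = Promise.resolve();
const leaseLog = [];

function withWriterLeaseSketch({ purpose }, fn) {
    const run = tail.then(() => {
        leaseLog.push(purpose); // record who holds the lease, for illustration only
        return fn();
    });
    // Swallow rejections on the chain so a failed holder does not block later callers.
    tail = run.catch(() => {});
    return run;
}

async function demo() {
    const order = [];
    const slow = withWriterLeaseSketch({ purpose: "feedback-write" }, async () => {
        order.push("feedback:start");
        await new Promise((resolve) => setTimeout(resolve, 10));
        order.push("feedback:end");
        return "utility-result";
    });
    const fast = withWriterLeaseSketch({ purpose: "graph-update" }, async () => {
        order.push("graph:start");
        return "graph-result";
    });
    const results = await Promise.all([slow, fast]);
    return { order, results };
}
```

In this sketch the second callback does not start until the first has finished, even though the first awaits a timer, which is the serialization property the lease is meant to provide for `index.db` writers.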

package/dist/commands/graph/graph.js CHANGED

@@ -12,6 +12,7 @@ import { closeDatabase, findEntryIdByRef, getEntryById, getEntryRefRowsForStashR
 import { loadStoredGraphSnapshot } from "../../indexer/db/graph-db.js";
 import { listRelatedPathsForFile } from "../../indexer/graph/graph-boost.js";
 import { runGraphExtractionPass } from "../../indexer/graph/graph-extraction.js";
+import { withIndexWriterLease } from "../../indexer/index-writer-lock.js";
 import { lookup } from "../../indexer/indexer.js";
 import { findSourceForPath, resolveSourceEntries } from "../../indexer/search/search-source.js";
 import { resolveAssetPath } from "../../indexer/walk/path-resolver.js";
@@ -375,85 +376,88 @@ export async function akmGraphUpdate(options) {
         }
     }
     const scoped = Array.isArray(options.refs) && options.refs.length > 0;
-    let candidatePaths;
-    if (scoped && options.refs) {
-        // Resolve each ref to an absolute file path via the index DB.
-        const dbPath = getDbPath();
-        let db;
-        const resolvedPaths = new Set();
-        try {
-            db = openDatabase(dbPath);
-            for (const ref of options.refs) {
-                const trimmed = ref.trim();
-                if (!trimmed)
-                    continue;
-                const entryId = findEntryIdByRef(db, trimmed);
-                if (entryId === undefined) {
-                    warn(`[graph] ref not found in index, skipping: ${trimmed}`);
-                    continue;
+    return withIndexWriterLease({ purpose: "graph-update" }, async () => {
+        let candidatePaths;
+        if (scoped && options.refs) {
+            // Resolve each ref to an absolute file path while the writer lease is held
+            // so the scoped graph write sees the same index snapshot it resolved from.
+            const dbPath = getDbPath();
+            let db;
+            const resolvedPaths = new Set();
+            try {
+                db = openDatabase(dbPath);
+                for (const ref of options.refs) {
+                    const trimmed = ref.trim();
+                    if (!trimmed)
+                        continue;
+                    const entryId = findEntryIdByRef(db, trimmed);
+                    if (entryId === undefined) {
+                        warn(`[graph] ref not found in index, skipping: ${trimmed}`);
+                        continue;
+                    }
+                    const row = getEntryById(db, entryId);
+                    if (!row?.filePath) {
+                        warn(`[graph] could not resolve path for ref, skipping: ${trimmed}`);
+                        continue;
+                    }
+                    resolvedPaths.add(row.filePath);
                 }
-                const row = getEntryById(db, entryId);
-                if (!row?.filePath) {
-                    warn(`[graph] could not resolve path for ref, skipping: ${trimmed}`);
-                    continue;
-                }
-                resolvedPaths.add(row.filePath);
             }
+            finally {
+                if (db)
+                    closeDatabase(db);
+            }
+            if (resolvedPaths.size === 0) {
+                warn("[graph] none of the provided refs resolved to indexed paths — no extraction performed.");
+                return {
+                    shape: "graph-update",
+                    ok: true,
+                    filesExtracted: 0,
+                    entitiesUpserted: 0,
+                    relationsUpserted: 0,
+                    durationMs: 0,
+                    scoped: true,
+                };
+            }
+            candidatePaths = resolvedPaths;
         }
-        finally {
-            if (db)
-                closeDatabase(db);
-        }
-        if (resolvedPaths.size === 0) {
-            warn("[graph] none of the provided refs resolved to indexed paths — no extraction performed.");
+        const extractionFn = options.graphExtractionFn ?? runGraphExtractionPass;
+        const passOptions = candidatePaths ? { candidatePaths } : {};
+        let db;
+        const startMs = Date.now();
+        try {
+            db = openDatabase(getDbPath());
+            const onProgress = (event) => {
+                if (!event.currentPath)
+                    return;
+                const file = path.basename(event.currentPath);
+                warn(`[graph] extracting ${event.processed}/${event.total} ${file}`);
+            };
+            const result = await extractionFn({
+                config,
+                sources,
+                signal: undefined,
+                db,
+                reEnrich: false,
+                onProgress,
+                options: passOptions,
+            });
+            const durationMs = Date.now() - startMs;
             return {
                 shape: "graph-update",
                 ok: true,
-                filesExtracted: 0,
-                entitiesUpserted: 0,
-                relationsUpserted: 0,
-                durationMs: 0,
-                scoped: true,
+                filesExtracted: result.quality.extractedFiles,
+                entitiesUpserted: result.quality.entityCount,
+                relationsUpserted: result.quality.relationCount,
+                durationMs,
+                scoped,
             };
         }
-        candidatePaths = resolvedPaths;
-    }
-    const extractionFn = options.graphExtractionFn ?? runGraphExtractionPass;
-    const passOptions = candidatePaths ? { candidatePaths } : {};
-    let db;
-    const startMs = Date.now();
-    try {
-        db = openDatabase(getDbPath());
-        const onProgress = (event) => {
-            if (!event.currentPath)
-                return;
-            const file = path.basename(event.currentPath);
-            warn(`[graph] extracting ${event.processed}/${event.total} ${file}`);
-        };
-        const result = await extractionFn({
-            config,
-            sources,
-            signal: undefined,
-            db,
-            reEnrich: false,
-            onProgress,
-            options: passOptions,
-        });
-        const durationMs = Date.now() - startMs;
-        return {
-            shape: "graph-update",
-            ok: true,
-            filesExtracted: result.quality.extractedFiles,
-            entitiesUpserted: result.quality.entityCount,
-            relationsUpserted: result.quality.relationCount,
-            durationMs,
-            scoped,
-        };
-    }
-    finally {
-        if (db)
-            closeDatabase(db);
-    }
+        finally {
+            if (db)
+                closeDatabase(db);
+        }
+    });
 }
 async function resolveGraphTarget(ref, source) {
     const parsedRef = parseAssetRef(ref);

package/dist/commands/health.js CHANGED

@@ -760,11 +760,19 @@ function computeWallTimeStats(durationsMs, byPhase) {
 }
 function buildImproveSkipSummary(events) {
     const skipReasons = {};
+    let skipped = 0;
     for (const event of events) {
         const reason = typeof event.metadata?.reason === "string" && event.metadata.reason.trim() ? event.metadata.reason : "unknown";
-        skipReasons[reason] = (skipReasons[reason] ?? 0) + 1;
+        // Aggregated skip events (e.g. `no_new_signal`, `profile_filtered_all_passes`)
+        // carry a `count` of the refs they represent in a single row instead of one
+        // event per ref. Honor that count so the skip histogram reflects the true
+        // number of skipped refs; per-ref events without a count contribute 1.
+        const rawCount = event.metadata?.count;
+        const count = typeof rawCount === "number" && Number.isFinite(rawCount) && rawCount > 0 ? rawCount : 1;
+        skipReasons[reason] = (skipReasons[reason] ?? 0) + count;
+        skipped += count;
     }
-    return { skipped: events.length, skipReasons };
+    return { skipped, skipReasons };
 }
 function probeStateDbRoundTrip(stateDbPath) {
     const before = readEvents({}, { dbPath: stateDbPath }).nextOffset;
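
To see how the count-aware summary behaves, the example below copies the new `buildImproveSkipSummary` body from the hunk above and runs it on a few made-up events. The sample events are illustrative only: the `cooldown` reason and the specific counts are assumptions, not values taken from the package.

```javascript
// Copy of the updated function from the hunk above, for a standalone run.
function buildImproveSkipSummary(events) {
    const skipReasons = {};
    let skipped = 0;
    for (const event of events) {
        const reason = typeof event.metadata?.reason === "string" && event.metadata.reason.trim() ? event.metadata.reason : "unknown";
        // Aggregated events carry a count; anything missing or invalid counts as 1.
        const rawCount = event.metadata?.count;
        const count = typeof rawCount === "number" && Number.isFinite(rawCount) && rawCount > 0 ? rawCount : 1;
        skipReasons[reason] = (skipReasons[reason] ?? 0) + count;
        skipped += count;
    }
    return { skipped, skipReasons };
}

// Made-up sample events (the "cooldown" reason is hypothetical).
const summary = buildImproveSkipSummary([
    { metadata: { reason: "no_new_signal", count: 11000 } }, // one aggregated row
    { metadata: { reason: "cooldown" } }, // per-ref event without a count
    { metadata: {} }, // no reason: bucketed as "unknown"
    { metadata: { reason: "no_new_signal", count: -3 } }, // invalid count: treated as 1
]);
console.log(summary.skipped); // prints 11003
console.log(summary.skipReasons); // { no_new_signal: 11001, cooldown: 1, unknown: 1 }
```

With the previous `events.length` logic the same input would have reported 4 skipped refs, which is why the histogram needed to honor `count` once the per-ref events were collapsed.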

package/dist/commands/improve/consolidate.js CHANGED

@@ -809,7 +809,7 @@ export async function akmConsolidate(opts = {}) {
         };
     }
     if (opts.incrementalSince) {
-        memories = narrowToIncrementalCandidates(memories, opts.incrementalSince, warnings);
+        memories = narrowToIncrementalCandidates(memories, opts.incrementalSince, warnings, opts.neighborsPerChanged);
         if (memories.length === 0) {
             return {
                 schemaVersion: 1,
@@ -828,6 +828,27 @@ export async function akmConsolidate(opts = {}) {
             };
         }
     }
+    if (opts.limit !== undefined && memories.length > opts.limit) {
+        // Order oldest-modified-first before capping so the limit selects the
+        // stalest memories rather than a fixed head of the (rowid-ordered) DB
+        // query. Consolidation rewrites surviving files, bumping their mtime, so
+        // processed memories drift to the back of the queue and the cap rotates
+        // across the whole corpus over successive runs instead of revisiting the
+        // same slice every time. Fail-open to 0 (front of queue) when a file can
+        // no longer be stat'd.
+        const mtimeOf = (m) => {
+            try {
+                return fs.statSync(m.filePath).mtimeMs;
+            }
+            catch {
+                return 0;
+            }
+        };
+        const mtimeCache = new Map(memories.map((m) => [m.filePath, mtimeOf(m)]));
+        memories = [...memories].sort((a, b) => (mtimeCache.get(a.filePath) ?? 0) - (mtimeCache.get(b.filePath) ?? 0));
+        warnings.push(`Consolidation: pool capped at ${opts.limit} of ${memories.length} memories (limit option, oldest-modified first).`);
+        memories = memories.slice(0, opts.limit);
+    }
     // Consolidation always uses the HTTP LLM client directly — never the agent
     // CLI. The agent CLI is for interactive agent sessions (reflect, propose);
     // structured JSON generation works better and faster via HTTP.
@@ -2004,7 +2025,7 @@ function parseSinceToIso(since) {
     const multiplier = { m: 60_000, h: 3_600_000, d: 86_400_000 }[m[2]];
     return new Date(Date.now() - parseInt(m[1], 10) * multiplier).toISOString();
 }
-export function narrowToIncrementalCandidates(memories, since, warnings) {
+export function narrowToIncrementalCandidates(memories, since, warnings, neighborsPerChanged = 5) {
     const sinceIso = parseSinceToIso(since);
     const isChanged = (m) => {
         try {
@@ -2019,7 +2040,6 @@ export function narrowToIncrementalCandidates(memories, since, warnings) {
         return [];
     if (changed.length === memories.length)
         return memories;
-    const NEIGHBORS_PER_CHANGED = 5;
     const byName = new Map(memories.map((m) => [m.name, m]));
     const keep = new Set(changed.map((m) => m.name));
     let db;
@@ -2029,7 +2049,7 @@ export function narrowToIncrementalCandidates(memories, since, warnings) {
             const id = findEntryIdByRef(db, `memory:${m.name}`);
             if (id === undefined)
                 continue;
-            for (const hit of getNeighborsByEntryId(db, id, NEIGHBORS_PER_CHANGED + 1)) {
+            for (const hit of getNeighborsByEntryId(db, id, neighborsPerChanged + 1)) {
                 if (hit.id === id)
                     continue;
                 const entry = getEntryById(db, hit.id);
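
The comment on the new `limit` handling claims that sorting oldest-modified-first makes the cap rotate across the corpus, because processed files get a newer mtime. The sketch below reproduces that ordering with an in-memory mtime table instead of `fs.statSync`, so the rotation can be observed without touching the filesystem. The helper name `capOldestFirst`, the injected `mtimeOf` callback, and the sample file names are assumptions for illustration.

```javascript
// Same ordering idea as the hunk above, with mtimes injected instead of stat'd.
function capOldestFirst(memories, limit, mtimeOf) {
    if (limit === undefined || memories.length <= limit) return memories;
    // Cache mtimes once so the comparator does not re-read them per comparison.
    const cache = new Map(memories.map((m) => [m.filePath, mtimeOf(m)]));
    return [...memories]
        .sort((a, b) => (cache.get(a.filePath) ?? 0) - (cache.get(b.filePath) ?? 0))
        .slice(0, limit);
}

// Hypothetical corpus of three memories with increasing mtimes.
const mtimes = new Map([["a.md", 1], ["b.md", 2], ["c.md", 3]]);
const pool = [{ filePath: "a.md" }, { filePath: "b.md" }, { filePath: "c.md" }];
const mtimeOf = (m) => mtimes.get(m.filePath) ?? 0; // missing files sort to the front

const firstPass = capOldestFirst(pool, 2, mtimeOf).map((m) => m.filePath);
// Simulate consolidation rewriting the processed files, which bumps their mtime.
mtimes.set("a.md", 10);
mtimes.set("b.md", 11);
const secondPass = capOldestFirst(pool, 2, mtimeOf).map((m) => m.filePath);

console.log(firstPass); // [ 'a.md', 'b.md' ]
console.log(secondPass); // [ 'c.md', 'a.md' ]
```

The second pass starts with `c.md`, the file the first pass left out, which is the rotation the comment describes. Note that the rotation depends on consolidation actually rewriting the files it processes; a memory that is processed but left unchanged on disk would keep its old mtime and be selected again.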

package/dist/commands/improve/distill.js CHANGED

@@ -586,7 +586,7 @@ similarLessons) {
  * @param reason    - Human-readable rejection reason.
  * @param extraMeta - Optional additional metadata for the event.
  */
-function writeQualityRejection(stash, inputRef, lessonRef, content, score, reason, extraMeta = {}) {
+function writeQualityRejection(stash, inputRef, lessonRef, content, score, reason, extraMeta = {}, eligibilitySource) {
     // D-5 / #388: reviewNeeded flag selects "review_needed" vs "quality_rejected" outcome.
     const outcome = extraMeta.reviewNeeded ? "review_needed" : "quality_rejected";
     const rejectDir = path.join(stash, ".akm", "distill-rejected");
@@ -602,6 +602,9 @@ function writeQualityRejection(stash, inputRef, lessonRef, content, score, reaso
             score,
             reason,
             ...extraMeta,
+            // Attribution tagging: stamp the eligibility lane so distill_invoked can be
+            // sliced by lane downstream. See EligibilitySource.
+            ...(eligibilitySource ? { eligibilitySource } : {}),
         },
     });
     return {
@@ -629,6 +632,12 @@ export async function akmDistill(options) {
     // Validate the ref shape up front so a typo never reaches the LLM.
     const parsedInputRef = parseAssetRef(inputRef);
     const targetKind = options.proposalKind ?? "lesson";
+    // Attribution tagging: spread into every distill_invoked event's metadata so
+    // the lane that selected this asset is recorded uniformly across all outcome
+    // branches. Empty object when no lane was supplied (direct `akm distill`).
+    const eligMeta = options.eligibilitySource
+        ? { eligibilitySource: options.eligibilitySource }
+        : {};
     // Recursive-distillation guard. Distill produces *lessons* from non-lesson
     // sources (memory, skill, knowledge, etc.). Calling distill on an existing
     // lesson would derive `lesson:lesson-<name>-lesson-lesson` (double `-lesson`
@@ -650,6 +659,7 @@ export async function akmDistill(options) {
                 lessonRef: skippedRef,
                 message: "distill refuses lesson inputs — lessons are the distilled form, not a source",
                 skipReason: "recursive_lesson_input",
+                ...eligMeta,
             },
         });
         return {
@@ -766,6 +776,7 @@ export async function akmDistill(options) {
                             outcome: "skipped",
                             lessonRef: promotion.knowledgeRef,
                             message: "D-1: LLM resolved destination conflict as NOOP — existing content kept",
+                            ...eligMeta,
                         },
                     });
                     return {
@@ -814,9 +825,9 @@ export async function akmDistill(options) {
             if (!judgeResult.pass) {
                 if (judgeResult.reviewNeeded) {
                     // Uncertainty band (2.5–3.5): queue as review_needed instead of rejecting.
-                    return writeQualityRejection(stash, inputRef, promotion.knowledgeRef, resolvedPromotionContent, judgeResult.score, judgeResult.reason, { reviewNeeded: true });
+                    return writeQualityRejection(stash, inputRef, promotion.knowledgeRef, resolvedPromotionContent, judgeResult.score, judgeResult.reason, { reviewNeeded: true }, options.eligibilitySource);
                 }
-                return writeQualityRejection(stash, inputRef, promotion.knowledgeRef, resolvedPromotionContent, judgeResult.score, judgeResult.reason);
+                return writeQualityRejection(stash, inputRef, promotion.knowledgeRef, resolvedPromotionContent, judgeResult.score, judgeResult.reason, {}, options.eligibilitySource);
             }
             // Normalize 1-5 judge score to [0, 1]. Score of -1 means pass-through
             // (no LLM / timeout / parse failure) — leave confidence undefined so
@@ -834,6 +845,8 @@ export async function akmDistill(options) {
                 ...(Object.keys(knowledgeParsed.data).length > 0 ? { frontmatter: knowledgeParsed.data } : {}),
             },
             ...(knowledgeJudgeConfidence !== undefined ? { confidence: knowledgeJudgeConfidence } : {}),
+            // Attribution tagging: persist the eligibility lane on the proposal.
+            ...(options.eligibilitySource ? { eligibilitySource: options.eligibilitySource } : {}),
         }, options.ctx);
         if (isProposalSkipped(proposalResult)) {
             appendEvent({
@@ -844,6 +857,7 @@ export async function akmDistill(options) {
                     lessonRef: promotion.knowledgeRef,
                     message: proposalResult.message,
                     skipReason: proposalResult.reason,
+                    ...eligMeta,
                 },
             });
             return {
@@ -867,6 +881,7 @@ export async function akmDistill(options) {
                 proposalId: proposal.id,
                 ...(options.sourceRun !== undefined ? { sourceRun: options.sourceRun } : {}),
                 ...(exclusionSet.size > 0 ? { filteredFeedbackCount } : {}),
+                ...eligMeta,
             },
         });
         return {
@@ -979,6 +994,7 @@ export async function akmDistill(options) {
                 lessonRef: effectiveLessonRef,
                 proposalKind: effectiveProposalKind,
                 ...(exclusionSet.size > 0 ? { filteredFeedbackCount } : {}),
+                ...eligMeta,
             },
         });
         return {
@@ -1203,6 +1219,7 @@ export async function akmDistill(options) {
                 proposalKind: effectiveProposalKind,
                 findingKinds: findings.map((f) => f.kind),
                 ...(exclusionSet.size > 0 ? { filteredFeedbackCount } : {}),
+                ...eligMeta,
             },
         });
         const message = findings.map((f) => f.message).join("\n");
@@ -1224,9 +1241,9 @@ export async function akmDistill(options) {
                 return writeQualityRejection(stash, inputRef, effectiveLessonRef, content, judgeResult.score, judgeResult.reason, {
                     reviewNeeded: true,
                     ...(exclusionSet.size > 0 ? { filteredFeedbackCount, feedbackFullyFiltered } : {}),
-                });
+                }, options.eligibilitySource);
             }
-            return writeQualityRejection(stash, inputRef, effectiveLessonRef, content, judgeResult.score, judgeResult.reason, exclusionSet.size > 0 ? { filteredFeedbackCount, feedbackFullyFiltered } : {});
+            return writeQualityRejection(stash, inputRef, effectiveLessonRef, content, judgeResult.score, judgeResult.reason, exclusionSet.size > 0 ? { filteredFeedbackCount, feedbackFullyFiltered } : {}, options.eligibilitySource);
         }
         // Normalize 1-5 judge score to [0, 1]. Score of -1 means pass-through
         // (no LLM / timeout / parse failure) — leave confidence undefined so
@@ -1256,6 +1273,8 @@ export async function akmDistill(options) {
             frontmatter: frontmatterWithSources,
         },
         ...(lessonJudgeConfidence !== undefined ? { confidence: lessonJudgeConfidence } : {}),
+        // Attribution tagging: persist the eligibility lane on the proposal.
+        ...(options.eligibilitySource ? { eligibilitySource: options.eligibilitySource } : {}),
     }, options.ctx);
     if (isProposalSkipped(proposalResult2)) {
         appendEvent({
@@ -1266,6 +1285,7 @@ export async function akmDistill(options) {
                 lessonRef: effectiveLessonRef,
                 message: proposalResult2.message,
                 skipReason: proposalResult2.reason,
+                ...eligMeta,
             },
         });
         return {
@@ -1290,6 +1310,7 @@ export async function akmDistill(options) {
             ...(options.sourceRun !== undefined ? { sourceRun: options.sourceRun } : {}),
             ...(exclusionSet.size > 0 ? { filteredFeedbackCount } : {}),
             ...(descriptionSwapped > 0 ? { descriptionSwapped } : {}),
+            ...eligMeta,
         },
     });
     return {
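
The distill.js hunks above attach `eligibilitySource` to proposals, and spread `eligMeta` into event metadata, only when the option is set, so untagged runs keep their old payload shape. Below is a minimal sketch of that conditional-spread pattern. The definition of `eligMeta` is outside this excerpt, so `buildEligMeta` and `buildEventMetadata` are hypothetical helpers that assume it mirrors the proposal-side expression, and the lane value `"proactive"` is a placeholder.

```javascript
// Hypothetical helper: assumes eligMeta is built the same way as the
// proposal-side spread shown in the diff (its real definition is not shown).
function buildEligMeta(options) {
  return options.eligibilitySource
    ? { eligibilitySource: options.eligibilitySource }
    : {};
}

// Hypothetical helper: merges the optional lane tag into event metadata.
function buildEventMetadata(base, options) {
  return { ...base, ...buildEligMeta(options) };
}

// "proactive" and "lesson:example" are placeholder values, not taken from the package.
const tagged = buildEventMetadata({ lessonRef: "lesson:example" }, { eligibilitySource: "proactive" });
const untagged = buildEventMetadata({ lessonRef: "lesson:example" }, {});

console.log(tagged.eligibilitySource);        // proactive
console.log("eligibilitySource" in untagged); // false
```

Spreading an empty object leaves the key absent rather than writing `eligibilitySource: undefined`, which keeps serialized events free of the field for callers that never set it.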

package/dist/commands/improve/extract-prompt.js CHANGED

@@ -55,7 +55,7 @@ export const EXTRACT_JSON_SCHEMA = {
                         type: "string",
                         minLength: 20,
                         maxLength: 400,
-                        description: "One-sentence summary of the candidate. Must be a complete sentence; do not end mid-clause.",
+                        description: "One-sentence summary of the candidate. Must be a complete sentence in active voice. Do NOT start with 'When', 'If', 'How', 'Use', or 'Avoid'. Do NOT end with ':', ';', or ','. Do NOT use heading-fragment text ('Summary', 'Overview', 'Key finding:'). Minimum 20 characters, maximum 400 characters.",
                     },
                     when_to_use: {
                         type: "string",
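
The new `description` text states several rules in prose, while only `minLength` and `maxLength` are enforced by the schema fields shown in this hunk. A rough sketch of how those prose rules could be checked after extraction follows. `checkSummary` is a hypothetical function, not part of akm-cli. The case-insensitive matching and the reading of "heading-fragment text" as a leading label are assumptions, and the "complete sentence in active voice" rule is not checked.

```javascript
// Rules transcribed from the schema description in the diff.
const BANNED_OPENERS = ["when", "if", "how", "use", "avoid"];
const BANNED_ENDINGS = [":", ";", ","];
// Assumption: a heading fragment is one of the listed labels standing alone
// or followed by a colon at the start of the summary.
const HEADING_FRAGMENT = /^(summary|overview|key finding)\s*(:|$)/i;

// Hypothetical checker: returns a list of rule violations (empty when clean).
function checkSummary(summary) {
  const text = summary.trim();
  const problems = [];
  if (text.length < 20) problems.push("shorter than 20 characters");
  if (text.length > 400) problems.push("longer than 400 characters");
  const firstWord = (text.match(/^[A-Za-z]+/) || [""])[0].toLowerCase();
  if (BANNED_OPENERS.includes(firstWord)) problems.push("banned opening word");
  if (BANNED_ENDINGS.some((ch) => text.endsWith(ch))) problems.push("banned trailing punctuation");
  if (HEADING_FRAGMENT.test(text)) problems.push("heading-fragment text");
  return problems;
}

console.log(checkSummary("The indexer acquires a writer lock before rebuilding."));
console.log(checkSummary("When the index is stale, rebuild it:"));
```

The first call returns an empty list; the second reports both the opening word and the trailing colon.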

package/dist/commands/improve/improve-auto-accept.js CHANGED Viewed

@@ -72,6 +72,12 @@ export async function runAutoAcceptGate(candidates, cfg, promoteFn = promoteProp
                     confidence,
                     threshold: effectiveThreshold,
                     phase: cfg.phase,
+                    // Attribution tagging: carry the eligibility lane from the proposal
+                    // record onto the auto-accept promoted event so the lane survives to
+                    // accept time even when promotion happens in a later run.
+                    ...(promotion.proposal.eligibilitySource !== undefined
+                        ? { eligibilitySource: promotion.proposal.eligibilitySource }
+                        : {}),
                 },
             }, cfg.eventsCtx ?? {});
             info(`[improve] auto-accepted ${promotion.ref} (${cfg.phase}; confidence=${confidence.toFixed(2)} >= threshold=${effectiveThreshold.toFixed(2)})`);
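
The auto-accept hunk reads the lane from the stored proposal record rather than from the current run's options, so a proposal created in one run keeps its tag when promoted in a later one. Here is a simplified sketch of that flow. `gateCandidate`, the `"promoted"` event type, and the sample values are hypothetical, and the real gate does more than this threshold comparison.

```javascript
// Hypothetical gate: returns a promoted-event object when confidence meets
// the threshold, otherwise null. Event shape is an assumption for illustration.
function gateCandidate(promotion, confidence, threshold, phase) {
  if (confidence < threshold) return null;
  return {
    type: "promoted",
    ref: promotion.ref,
    metadata: {
      confidence,
      threshold,
      phase,
      // Same guard as the diff: copy the lane only if the proposal recorded one.
      ...(promotion.proposal.eligibilitySource !== undefined
        ? { eligibilitySource: promotion.proposal.eligibilitySource }
        : {}),
    },
  };
}

// A proposal persisted by an earlier run (placeholder values).
const stored = { ref: "lesson:example", proposal: { id: "p1", eligibilitySource: "proactive" } };
const legacy = { ref: "lesson:older", proposal: { id: "p0" } };

const accepted = gateCandidate(stored, 0.9, 0.8, "distill");
const rejected = gateCandidate(stored, 0.5, 0.8, "distill");
const acceptedLegacy = gateCandidate(legacy, 0.9, 0.8, "distill");

console.log(accepted.metadata.eligibilitySource);                // proactive
console.log(rejected);                                           // null
console.log("eligibilitySource" in acceptedLegacy.metadata);     // false
```

Proposals written before the field existed still promote cleanly, because the guard omits the key instead of emitting an undefined value.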