npm - @lumoai/cli - Versions diffs - 1.43.0 → 1.45.0 - Mend

@lumoai/cli 1.43.0 → 1.45.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/assets/skill/references/verify.md +5 -1
package/dist/cli/src/commands/task-status.js +161 -4
package/dist/cli/src/lib/evidence-display.js +106 -0
package/dist/shared/src/acceptance-evidence.js +19 -0
package/package.json +1 -1

package/assets/skill/references/verify.md CHANGED

@@ -112,11 +112,13 @@ what's unmet and why (the exact failure tails), and how many rounds are left.
 - **Header** — task identifier/title/status + `verification round N/3` (round 0 = never verified) + an escalation warning when the machine loop is exhausted.
 - **Machine verification rollup** (LUM-470) — directly under the `Criteria` header, one line `Machine verification: N machine-verified / M human override (of T MACHINE criteria)` over the active MACHINE criteria, aligned with the web read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion, so the terminal rollup never reads as all-human when a checkpointer actually verified the work.
-- **Criteria** — every criterion as `<glyph> <id>  [TYPE]  SOURCE@rN statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its checkpointer and latest verdict line (evidence pointer on pass, failure tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
+- **Criteria** — every criterion as `<glyph> <id>  [TYPE]  SOURCE@rN statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its checkpointer and latest verdict line (failure tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
   - A passing **MACHINE** criterion's verdict line carries a machine-state tag derived from the read model's `machinePassed` flag, NOT the latest verdict (LUM-470): `· machine-verified` when a checkpointer actually passed it (even after a human later signs the task off), or `· human override (no machine pass)` when it passes only on a human sign-off with no machine run underneath. This keeps the terminal honest with web — a machine-verified criterion that a human co-signed no longer reads as a plain human pass.
+  - A verdict's **evidence is drillable** (LUM-563), rendered as an indented `↳ evidence:` line under the verdict (PASS _and_ FAIL) instead of the inert raw pointer that used to ride the verdict line — so a conclusion points at real proof you can act on, not just the `check:` command: a `cmd:` pointer prints the actual command + exit code (`ran \`…\` → exit N · re-run to reproduce`), a `file:` pointer prints a terminal-clickable `path:line`, and a `commit:` pointer prints a navigable web URL (`<repo>/commit/<hash>`, resolved from the local git `origin` remote) or a `git show <hash>` fallback when no remote resolves. A criterion that **requires evidence but has none recorded yet** (e.g. a HUMAN evidence criterion before sign-off) renders an explicit `↳ evidence: pending — no reference recorded yet` (fail-closed) instead of a bare, dead `[evidence]` tag.
   - A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion was changed after that verdict (reworded, or its checkpointer was swapped so the recorded evidence ran a different command). The pass still counts as met (a stale pass does not block DONE — render-only signal), but it vouches for an older version — **re-run `lumo verify` to re-confirm against the current criterion.** This is the habit whenever you edit a MACHINE criterion's checkpointer mid-task: change the check, then re-verify so the green is honest.
 - **History** — one line per recorded round: `rN · timestamp · X PASS / Y FAIL`.
 - **Last round failures** — the most recent round's FAIL verdicts with their rejection reasons (why the last round bounced).
+- **Cost** (LUM-560) — Principle 1: the costs a human should weigh, on the same report as the verdict instead of scattered across the web delivery card and `task lineage`. Three lines: **Tokens** (total input+output+cache across the task's sessions), **Active time** (non-idle agent seconds — Σ per-turn `STOP − prompt`, LUM-487), and **Rework rounds** (verify rounds that recorded a FAIL). Read from the **same** server-side source the web delivery card consumes (`retrospectiveRepository.loadActuals`), so the two reports cannot drift. Token cost is **fail-closed**: when no session usage was recorded it prints `Tokens: not recorded (no session usage captured)`, kept distinct from a measured `0` (not measured vs. spent 0, aligned with LUM-559). Carried in `--json` as `cost { tokenCost, activeTimeSec, reworkRounds }` (`tokenCost: null` = not measured). Omitted only against an older server that doesn't emit the field.
 - **Struggle / rework / outstanding** (LUM-561) — the anti-mum-and-deaf block: **always printed when the contract exists, even on a clean 0-unmet task** so a passing task still shows its scars instead of wiping them to a single PASS count. Lists, when present:
   - **rework rounds** — verify rounds that had a FAIL;
   - **send-backs** — criteria sent back by a human/agent verdict (a MACHINE verify-loop FAIL is not a send-back), with their open/resolved lifecycle, preserved even for since-removed criteria;
@@ -126,6 +128,8 @@ what's unmet and why (the exact failure tails), and how many rounds are left.
   When the trail is genuinely empty it states the **basis** (`None recorded — N rounds run, 0 FAIL, no send-backs, no reopens, no leftover follow-ups`); when nothing has been verified yet it says so (`No verification has run yet — cannot confirm there were no difficulties`) rather than rendering an implicitly-clean slate. Carried in `--json` as `struggleTrail` (incl. `pullRequests` + `reopens`).
+- **Trend** (LUM-562) — Principle 7, trend not snapshot: the _movement_ of the key quantities across the task's attempts, not a single snapshot. Where History/Cost/Struggle list current values, this shows direction: **Pass rate** across verification rounds (`r1 60% → r2 100% (↑ +40pts)`), **Cost/session** across the task's sessions (`4.2K → 1.1K tokens (↓), 5.3K total` — per-session spend from the same source as the **Cost** total, so the trajectory's points sum to it), and **Rework** accrual (`3 accrued — 1 FAIL round, 1 reopen, +1 PR cycle (↑ from 0)`). **Honest about a single point:** with only one round and one session every quantity is one data point, so it prints `Single attempt so far — no trajectory yet (a trend needs ≥2 rounds or sessions)` rather than drawing a fake arrow off one value (closes the named "1-round pass = single point" gap). When nothing was verified and no cost was measured it says `No verification rounds or measured cost yet — nothing to trend`. Carried in `--json` as `trend { passRate[], cost[], rework{} }`. Omitted only against an older server.
 - **Next actions** — the unmet criteria (latest verdict is not a pass: failed or never verified, HUMAN ones included). This list IS the plan — recomputed from the event log on every read, never maintained separately. Empty + rounds recorded = awaiting human adjudication.
 - **Open boundary crossings** (LUM-448) — a trailing safety block when the task has ≥1 OPEN (undispositioned) forbidden-action crossing: a count, then one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>` (highest-severity first), each followed by a read-only **attribution** line `↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469 — who/what crossed; any dimension that couldn't be resolved server-side prints `unknown`, never a fabricated value), then a pointer to the web acceptance panel. Silent when there are none, so it never overshadows the criteria.
   - **Read-only awareness** — this surfaces crossings detected elsewhere (LUM-426/435/442); there is no CLI path to disposition or clear one. Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer cannot clear its own crossing from the terminal.
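
The fail-closed Cost contract described in the verify.md hunk above distinguishes three cases: the field is absent (older server), `tokenCost` is `null` (not measured), and `tokenCost` is a measured number that may be `0`. The sketch below shows one way a script reading the `--json` output might keep those cases apart. The field names come from the diff; the function name `describeTokenCost`, the sample objects, and the assumption that `cost` sits at the top level of the JSON report are hypothetical.

```javascript
// Minimal sketch, not the CLI's own code: classify the `cost` object from
// a `--json` report. `describeTokenCost` and the sample reports are
// hypothetical; only the field names are taken from the diff.
function describeTokenCost(report) {
    const cost = report.cost;
    // Older server: the field is omitted entirely, so claim nothing.
    if (!cost)
        return 'cost unavailable (older server)';
    // Fail-closed: null means "not measured", which is not the same as 0.
    if (cost.tokenCost == null)
        return 'tokens not recorded';
    return `tokens measured: ${cost.tokenCost}`;
}

const samples = [
    {},
    { cost: { tokenCost: null, activeTimeSec: 0, reworkRounds: 0 } },
    { cost: { tokenCost: 0, activeTimeSec: 42, reworkRounds: 1 } },
];
for (const s of samples)
    console.log(describeTokenCost(s));
// prints:
//   cost unavailable (older server)
//   tokens not recorded
//   tokens measured: 0
```

The `== null` comparison matches both `null` and `undefined`, which is the same check `pushCost` uses in the task-status.js hunk, so a measured `0` is never mistaken for missing data.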

package/dist/cli/src/commands/task-status.js CHANGED

@@ -7,6 +7,7 @@ const api_1 = require("../lib/api");
 const resolve_bound_task_1 = require("../lib/resolve-bound-task");
 const sanitize_1 = require("../lib/sanitize");
 const open_crossings_1 = require("../lib/open-crossings");
+const evidence_display_1 = require("../lib/evidence-display");
 /** One-line a possibly-multiline crossing detail and cap it so the safety block
  *  stays a glance, never a wall (it must not overshadow the criteria). */
 const CROSSING_DETAIL_CAP = 160;
@@ -85,9 +86,6 @@ function formatTaskStatus(data, extras = {}) {
             lines.push(`      ✗ FAIL@r${v.round}${why}`);
         }
         else {
-            const evidencePart = v.evidencePointer
-                ? ` · ${(0, sanitize_1.sanitizeField)(v.evidencePointer)}`
-                : '';
             // LUM-470: tag a passing MACHINE criterion by the read model's
             // machinePassed flag, not the latest verdict — a criterion a checkpointer
             // verified reads as machine-verified even after a human signs off, and a
@@ -97,13 +95,26 @@ function formatTaskStatus(data, extras = {}) {
                     ? ' · machine-verified'
                     : ' · human override (no machine pass)'
                 : '';
-            lines.push(`      ✓ ${v.verdict}@r${v.round}${evidencePart}${machineTag}`);
+            lines.push(`      ✓ ${v.verdict}@r${v.round}${machineTag}`);
             // LUM-457: a pass that vouches for a pre-edit version of the criterion —
             // render-only downgrade, the criterion still counts met.
             if (c.verdictStale || c.checkMismatch) {
                 lines.push('      ⚠ pre-edit version — criterion changed since this check; re-run `lumo verify` to re-confirm');
             }
         }
+        // LUM-563: drill the verdict's evidence into a navigable / re-runnable
+        // reference instead of the inert raw pointer that used to ride the verdict
+        // line — a commit URL (or `git show` fallback), a clickable path:line, or
+        // the actual command + exit. Rendered for PASS and FAIL alike so a failure
+        // is just as reproducible. When a criterion requires evidence but none is
+        // recorded yet, say so explicitly (fail-closed) rather than leaving the
+        // bare `[evidence]` tag pointing at nothing.
+        if (v?.evidencePointer) {
+            lines.push(`        ↳ evidence: ${(0, sanitize_1.sanitizeField)((0, evidence_display_1.formatEvidenceDrilldown)(v.evidencePointer, extras.repoWebUrl ?? null))}`);
+        }
+        else if (c.evidenceRequired) {
+            lines.push('        ↳ evidence: pending — no reference recorded yet');
+        }
         // LUM-511 Phase 5: send-back lifecycle (was this criterion's send-back
         // resolved, and by which PR).
         const sb = c.sendBackResolution;
@@ -134,7 +145,9 @@ function formatTaskStatus(data, extras = {}) {
             }
         }
     }
+    pushCost(lines, data);
     pushStruggleTrail(lines, data);
+    pushTrend(lines, data);
     lines.push('');
     if (data.nextActions.length === 0) {
         lines.push(data.currentRound > 0
@@ -166,6 +179,57 @@ function formatTaskStatus(data, extras = {}) {
     pushOpenCrossings(lines, extras);
     return lines.join('\n') + '\n';
 }
+/** Compact a token count for the terminal — 1_200_000 → "1.2M", 850_000 →
+ *  "850K", 0 → "0". Uses the SAME Intl compact formatter the web card's
+ *  fmtCompact uses (notation:'compact', maximumFractionDigits:1) so the same
+ *  number reads identically in both surfaces — no presentation drift (LUM-560). */
+const TOKEN_FMT = new Intl.NumberFormat('en-US', {
+    notation: 'compact',
+    maximumFractionDigits: 1,
+});
+function fmtTokens(n) {
+    return TOKEN_FMT.format(n);
+}
+/** Active (non-idle) seconds → a compact "2h 14m" / "3m 5s" / "12s". 0 → "0s". */
+function fmtDuration(totalSec) {
+    const sec = Math.max(0, Math.round(totalSec));
+    if (sec === 0)
+        return '0s';
+    const h = Math.floor(sec / 3600);
+    const m = Math.floor((sec % 3600) / 60);
+    const s = sec % 60;
+    const parts = [];
+    if (h > 0)
+        parts.push(`${h}h`);
+    if (m > 0)
+        parts.push(`${m}m`);
+    // Show seconds only when the duration is under an hour (keeps long runs tidy).
+    if (s > 0 && h === 0)
+        parts.push(`${s}s`);
+    return parts.join(' ');
+}
+/**
+ * Append the honest "Cost" section (LUM-560) — Principle 1: surface the costs a human
+ * should weigh (token spend, active time, machine rework) on the same report as
+ * the acceptance verdict, instead of leaving them scattered across the web
+ * delivery card and `task lineage`. Every number is the server's, read from the
+ * same loadActuals source as the web card (no drift). Token cost is fail-closed:
+ * a null reads as an explicit "not recorded" line, never a silent or fake 0, so
+ * "not measured" (no session usage) stays distinct from a measured "spent 0". Skipped only
+ * when the server didn't emit the field (older server) — never fabricated.
+ */
+function pushCost(lines, data) {
+    const cost = data.cost;
+    if (!cost)
+        return; // older server: can't fabricate cost, so don't claim any.
+    lines.push('');
+    lines.push('Cost:');
+    lines.push(cost.tokenCost == null
+        ? '  Tokens: not recorded (no session usage captured)'
+        : `  Tokens: ${fmtTokens(cost.tokenCost)}`);
+    lines.push(`  Active time: ${fmtDuration(cost.activeTimeSec)} (non-idle)`);
+    lines.push(`  Rework rounds: ${cost.reworkRounds}${cost.reworkRounds === 0 ? ' (no machine rework)' : ''}`);
+}
 /**
  * Append the honest "Struggle / rework / outstanding" section (LUM-561) — the
  * anti-mum-and-deaf block (kills a silent "Nothing outstanding"). It is ALWAYS
@@ -239,6 +303,95 @@ function pushStruggleTrail(lines, data) {
         }
     }
 }
+/** Direction glyph for a delta — ↑ up, ↓ down, → flat. */
+function arrow(delta) {
+    return delta > 0 ? '↑' : delta < 0 ? '↓' : '→';
+}
+/** Pass rate of one round as an integer percent (0 verdicts → 0%). */
+function passPct(p) {
+    const total = p.passed + p.failed;
+    return total === 0 ? 0 : Math.round((p.passed / total) * 100);
+}
+/**
+ * Append the "Trend" section (LUM-562, Principle 7: trend, not snapshot) — the *movement* of the
+ * task's key quantities across its attempts, not a single-point snapshot. Where
+ * History lists each round's PASS/FAIL, Cost shows the running total, and
+ * Struggle lists the scars, this shows direction: pass rate climbing, cost per
+ * session rising/falling, rework accruing. The honesty rule (mirrors the
+ * struggle trail): a single data point is NOT a trajectory — with one round and
+ * one session the section says so outright rather than drawing a fake arrow off
+ * a single value, closing the named "1-round pass = single point" gap. The
+ * per-session cost reuses the same source as the Cost section, so the
+ * trajectory's points sum to the Cost total (no in-report drift). Skipped only
+ * when the server didn't emit the field (older server) — never fabricated.
+ */
+function pushTrend(lines, data) {
+    const trend = data.trend;
+    if (!trend)
+        return; // older server: no trend data — don't invent one.
+    const rounds = trend.passRate;
+    const cost = trend.cost;
+    const rw = trend.rework;
+    lines.push('');
+    // Nothing to trend at all — no verification, no measured cost.
+    if (rounds.length === 0 && cost.length === 0) {
+        lines.push('Trend:');
+        lines.push('  No verification rounds or measured cost yet — nothing to trend. Run `lumo verify`.');
+        return;
+    }
+    // A trajectory needs ≥2 points in at least one dimension. With a single
+    // round AND a single session, every quantity is one data point — say so
+    // plainly instead of implying a flat/healthy trend (the named gap).
+    const hasTrajectory = rounds.length >= 2 || cost.length >= 2;
+    if (!hasTrajectory) {
+        lines.push('Trend:');
+        lines.push('  Single attempt so far — no trajectory yet (a trend needs ≥2 rounds or sessions).');
+        const bits = [];
+        if (rounds.length === 1)
+            bits.push(`pass rate r${rounds[0].round} ${passPct(rounds[0])}%`);
+        if (cost.length === 1)
+            bits.push(`cost ${fmtTokens(cost[0].total)} tokens`);
+        bits.push(rw.total === 0 ? 'rework none' : `rework ${rw.total}`);
+        lines.push(`  ${bits.join(' · ')}`);
+        return;
+    }
+    const sessionWord = cost.length === 1 ? 'session' : 'sessions';
+    lines.push(`Trend (${rounds.length} rounds · ${cost.length} ${sessionWord}):`);
+    // Pass rate across rounds — the climb (or fall) of the verify loop.
+    if (rounds.length >= 2) {
+        const seq = rounds.map(r => `r${r.round} ${passPct(r)}%`).join(' → ');
+        const delta = passPct(rounds[rounds.length - 1]) - passPct(rounds[0]);
+        const sign = delta > 0 ? `+${delta}` : `${delta}`;
+        lines.push(`  Pass rate: ${seq}  (${arrow(delta)} ${sign}pts)`);
+    }
+    else if (rounds.length === 1) {
+        lines.push(`  Pass rate: r${rounds[0].round} ${passPct(rounds[0])}% (1 round — single point)`);
+    }
+    // Cost per session across the task's sessions — the spend trajectory.
+    if (cost.length >= 2) {
+        const first = cost[0].total;
+        const last = cost[cost.length - 1].total;
+        const total = cost.reduce((a, c) => a + c.total, 0);
+        lines.push(`  Cost/session: ${fmtTokens(first)} → ${fmtTokens(last)} tokens  (${arrow(last - first)}), ${fmtTokens(total)} total over ${cost.length} sessions`);
+    }
+    else if (cost.length === 1) {
+        lines.push(`  Cost: ${fmtTokens(cost[0].total)} tokens (1 session — single point)`);
+    }
+    // Rework accrual — monotonic from a clean baseline of 0.
+    if (rw.total === 0) {
+        lines.push('  Rework: none accrued (0) — clean across every attempt');
+    }
+    else {
+        const parts = [];
+        if (rw.reworkRounds > 0)
+            parts.push(`${rw.reworkRounds} FAIL round${rw.reworkRounds === 1 ? '' : 's'}`);
+        if (rw.reopens > 0)
+            parts.push(`${rw.reopens} reopen${rw.reopens === 1 ? '' : 's'}`);
+        if (rw.extraPrCycles > 0)
+            parts.push(`+${rw.extraPrCycles} PR cycle${rw.extraPrCycles === 1 ? '' : 's'}`);
+        lines.push(`  Rework: ${rw.total} accrued — ${parts.join(', ')}  (↑ from 0)`);
+    }
+}
 /**
  * Append the OPEN boundary-crossings safety block (LUM-448) — a count, one line
  * per crossing with its severity + category + clipped detail, and a pointer to
@@ -355,5 +508,9 @@ async function taskStatus(identifier, options = {}) {
     process.stdout.write(formatTaskStatus(data, {
         openCrossings: crossingsResult,
         dispositionUrl: (0, open_crossings_1.dispositionUrl)(base, creds.workspaceSlug ?? 'lumo', data.task.identifier),
+        // LUM-563: resolve the local repo's web base so a `commit:` evidence
+        // pointer drills to a navigable URL; null (not a git repo / no origin)
+        // falls back to a `git show` hint in the renderer.
+        repoWebUrl: (0, evidence_display_1.resolveRepoWebUrl)(),
     }));
 }
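
As a worked example of the pass-rate line that `pushTrend` builds in the hunk above, the sketch below copies `arrow` and `passPct` from the diff and wraps the pass-rate branch in a standalone helper. The helper name `passRateLine` and the sample round data are hypothetical; the `round`, `passed`, and `failed` field names are the ones the diff reads.

```javascript
// Copied from the diff: direction glyph for a delta.
function arrow(delta) {
    return delta > 0 ? '↑' : delta < 0 ? '↓' : '→';
}
// Copied from the diff: one round's pass rate as an integer percent.
function passPct(p) {
    const total = p.passed + p.failed;
    return total === 0 ? 0 : Math.round((p.passed / total) * 100);
}
// Hypothetical wrapper around the >=2-rounds branch of pushTrend.
function passRateLine(rounds) {
    const seq = rounds.map(r => `r${r.round} ${passPct(r)}%`).join(' → ');
    // The delta compares the last round against the first, in points.
    const delta = passPct(rounds[rounds.length - 1]) - passPct(rounds[0]);
    const sign = delta > 0 ? `+${delta}` : `${delta}`;
    return `Pass rate: ${seq}  (${arrow(delta)} ${sign}pts)`;
}

// Hypothetical data: 3 of 5 criteria pass in round 1, all 5 in round 2.
console.log(passRateLine([
    { round: 1, passed: 3, failed: 2 },
    { round: 2, passed: 5, failed: 0 },
]));
// prints: Pass rate: r1 60% → r2 100%  (↑ +40pts)

// A regression reads just as plainly.
console.log(passRateLine([
    { round: 1, passed: 4, failed: 0 },
    { round: 2, passed: 2, failed: 2 },
]));
// prints: Pass rate: r1 100% → r2 50%  (↓ -50pts)
```

Note that the delta is taken between the first and last rounds only, so intermediate dips show up in the sequence but not in the summary arrow.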

package/dist/cli/src/lib/evidence-display.js ADDED

@@ -0,0 +1,106 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.normalizeRepoWebUrl = normalizeRepoWebUrl;
+exports.resolveRepoWebUrl = resolveRepoWebUrl;
+exports.formatEvidenceDrilldown = formatEvidenceDrilldown;
+const child_process_1 = require("child_process");
+const acceptance_evidence_1 = require("../../../shared/src/acceptance-evidence");
+/**
+ * Drill-down rendering for acceptance evidence pointers (LUM-563).
+ *
+ * A verdict's evidencePointer is a structured token (`cmd:…#exit=N`,
+ * `file:path:line`, `commit:hash`). Printed raw it satisfies Norman's gulf of
+ * evaluation only halfway: you can read it but not act on it. These helpers
+ * turn each shape into a navigable / re-runnable line — a web commit URL when
+ * the git remote resolves (else a `git show` fallback), a terminal-clickable
+ * `path:line`, or the actual command + exit you can re-run to reproduce.
+ *
+ * No third-party deps; sanitization is applied by the caller (task-status.ts)
+ * before anything reaches the terminal.
+ */
+/**
+ * Normalize a git remote URL to its https web base (no trailing `.git`), or
+ * null when it doesn't look like a remote we can navigate. Fails closed: a
+ * shape we can't confidently map yields null so the caller drops to a
+ * non-URL fallback rather than printing a broken link.
+ *
+ * Handles the three remote forms git emits:
+ *   - scp-style  `git@host:owner/repo.git`
+ *   - https      `https://host/owner/repo.git`
+ *   - ssh url    `ssh://git@host/owner/repo.git`
+ */
+function normalizeRepoWebUrl(remote) {
+    const raw = remote.trim();
+    if (!raw)
+        return null;
+    let host;
+    let path;
+    const scp = /^[\w.-]+@([\w.-]+):(.+)$/.exec(raw);
+    if (scp) {
+        host = scp[1];
+        path = scp[2];
+    }
+    else {
+        let url;
+        try {
+            url = new URL(raw);
+        }
+        catch {
+            return null;
+        }
+        if (url.protocol !== 'https:' && url.protocol !== 'ssh:')
+            return null;
+        host = url.host;
+        path = url.pathname;
+    }
+    const cleanPath = path
+        .replace(/^\/+/, '')
+        .replace(/\.git$/, '')
+        .replace(/\/+$/, '');
+    // A navigable repo path is at minimum owner/repo — reject a bare host.
+    if (!host || !cleanPath || !cleanPath.includes('/'))
+        return null;
+    return `https://${host}/${cleanPath}`;
+}
+/**
+ * Resolve the current repo's web base by reading `origin`'s remote URL. Best
+ * effort and side-effecting (runs git in the cwd), so it lives outside the pure
+ * formatter and is injected via TaskStatusExtras. Null when not a git repo /
+ * no origin / unparseable — the renderer then uses the `git show` fallback.
+ */
+function resolveRepoWebUrl(cwd = process.cwd()) {
+    try {
+        const r = (0, child_process_1.spawnSync)('git', ['remote', 'get-url', 'origin'], {
+            cwd,
+            encoding: 'utf8',
+            timeout: 5_000,
+        });
+        if (r.status !== 0 || !r.stdout)
+            return null;
+        return normalizeRepoWebUrl(r.stdout);
+    }
+    catch {
+        return null;
+    }
+}
+/**
+ * Render the actionable evidence body for one pointer (the text after
+ * `↳ evidence: `). Unrecognised shapes fall back to the raw pointer so data is
+ * never hidden, only upgraded when we know how.
+ */
+function formatEvidenceDrilldown(pointer, repoWebUrl) {
+    const parsed = (0, acceptance_evidence_1.parseEvidencePointer)(pointer);
+    if (!parsed)
+        return pointer;
+    if (parsed.kind === 'cmd') {
+        return `ran \`${parsed.command}\` → exit ${parsed.exitCode} · re-run to reproduce`;
+    }
+    if (parsed.kind === 'file') {
+        return `${parsed.path}:${parsed.line}`;
+    }
+    // commit
+    const short = parsed.hash.slice(0, 10);
+    return repoWebUrl
+        ? `commit ${short} · ${repoWebUrl}/commit/${parsed.hash}`
+        : `commit ${short} · view with \`git show ${parsed.hash}\``;
+}
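
To illustrate the three remote forms that `normalizeRepoWebUrl` is documented to handle, the sketch below reproduces the function from the added file and runs it on sample inputs. The host and repository names (`github.com`, `acme/widgets`) are made-up examples, not values from the package.

```javascript
// Reproduced from the diff so the example runs on its own.
function normalizeRepoWebUrl(remote) {
    const raw = remote.trim();
    if (!raw)
        return null;
    let host;
    let path;
    // scp-style: user@host:owner/repo.git
    const scp = /^[\w.-]+@([\w.-]+):(.+)$/.exec(raw);
    if (scp) {
        host = scp[1];
        path = scp[2];
    }
    else {
        let url;
        try {
            url = new URL(raw);
        }
        catch {
            return null;
        }
        // Only https and ssh URLs are mapped; anything else fails closed.
        if (url.protocol !== 'https:' && url.protocol !== 'ssh:')
            return null;
        host = url.host;
        path = url.pathname;
    }
    const cleanPath = path
        .replace(/^\/+/, '')
        .replace(/\.git$/, '')
        .replace(/\/+$/, '');
    // A bare host without owner/repo is rejected.
    if (!host || !cleanPath || !cleanPath.includes('/'))
        return null;
    return `https://${host}/${cleanPath}`;
}

// Hypothetical remotes covering each documented form plus two rejections.
const remotes = [
    'git@github.com:acme/widgets.git',        // scp-style
    'https://github.com/acme/widgets.git\n',  // https, with trailing newline
    'ssh://git@github.com/acme/widgets.git',  // ssh url
    'https://github.com',                     // bare host
    'http://github.com/acme/widgets',         // unsupported protocol
];
for (const r of remotes)
    console.log(JSON.stringify(r), '=>', normalizeRepoWebUrl(r));
// The first three all yield "https://github.com/acme/widgets";
// the last two yield null.
```

Because `url.host` includes any explicit port, an `ssh://` remote that specifies one would carry that port into the https URL; the diff does not say whether that case is expected to arise.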

package/dist/shared/src/acceptance-evidence.js CHANGED

@@ -2,6 +2,7 @@
 Object.defineProperty(exports, "__esModule", { value: true });
 exports.EVIDENCE_POINTER_MAX = exports.EVIDENCE_POINTER_FORMAT_HINT = exports.EVIDENCE_POINTER_PATTERNS = void 0;
 exports.isValidEvidencePointer = isValidEvidencePointer;
+exports.parseEvidencePointer = parseEvidencePointer;
 exports.buildCmdEvidencePointer = buildCmdEvidencePointer;
 /**
  * Evidence-pointer grammar for acceptance verification (LUM-343).
@@ -29,6 +30,24 @@ function isValidEvidencePointer(value) {
 }
 /** Max stored pointer length — mirrors the column-level cap in validation. */
 exports.EVIDENCE_POINTER_MAX = 2_000;
+/**
+ * Decompose an evidence pointer into its parts, or null if it doesn't parse.
+ * The grammar is the single source of truth (EVIDENCE_POINTER_PATTERNS); this
+ * mirrors those exact shapes — a `#` may appear inside a `cmd:` command, so the
+ * exit marker is matched greedily from the END, never the first `#`.
+ */
+function parseEvidencePointer(value) {
+    const commit = /^commit:([0-9a-f]{7,40})$/.exec(value);
+    if (commit)
+        return { kind: 'commit', hash: commit[1] };
+    const file = /^file:([^\s]+):(\d+(?:-\d+)?)$/.exec(value);
+    if (file)
+        return { kind: 'file', path: file[1], line: file[2] };
+    const cmd = /^cmd:([\s\S]+)#exit=(\d+)$/.exec(value);
+    if (cmd)
+        return { kind: 'cmd', command: cmd[1], exitCode: Number(cmd[2]) };
+    return null;
+}
 /**
  * Build a `cmd:` evidence pointer for an executed checkpointer. The command
  * is truncated so the suffix (`#exit=N`) always survives the length cap —
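
The comment on `parseEvidencePointer` notes that a `#` may appear inside a `cmd:` command, so the exit marker is matched from the end. The sketch below copies the function from the hunk above and runs it on sample pointers to show that behaviour; the pointer strings themselves are invented for illustration.

```javascript
// Copied from the diff so the example is self-contained.
function parseEvidencePointer(value) {
    const commit = /^commit:([0-9a-f]{7,40})$/.exec(value);
    if (commit)
        return { kind: 'commit', hash: commit[1] };
    const file = /^file:([^\s]+):(\d+(?:-\d+)?)$/.exec(value);
    if (file)
        return { kind: 'file', path: file[1], line: file[2] };
    // The greedy [\s\S]+ consumes everything up to the LAST `#exit=N`.
    const cmd = /^cmd:([\s\S]+)#exit=(\d+)$/.exec(value);
    if (cmd)
        return { kind: 'cmd', command: cmd[1], exitCode: Number(cmd[2]) };
    return null;
}

// Hypothetical pointers: one per shape, plus a malformed one.
const pointers = [
    'cmd:grep "#TODO" src/a.js#exit=1', // `#` inside the command
    'cmd:echo "#exit=9"#exit=0',        // a decoy exit marker in the command
    'file:src/app.ts:10-20',            // line range
    'commit:abc1234',                   // 7-character short hash
    'commit:XYZ',                       // not hex, too short
];
for (const p of pointers)
    console.log(JSON.stringify(parseEvidencePointer(p)));
// prints:
//   {"kind":"cmd","command":"grep \"#TODO\" src/a.js","exitCode":1}
//   {"kind":"cmd","command":"echo \"#exit=9\"","exitCode":0}
//   {"kind":"file","path":"src/app.ts","line":"10-20"}
//   {"kind":"commit","hash":"abc1234"}
//   null
```

The `line` field stays a string because it can hold a range such as `10-20`, while `exitCode` is converted to a number.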

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@lumoai/cli",
-  "version": "1.43.0",
+  "version": "1.45.0",
   "description": "Lumo CLI — manage tasks and sessions from the terminal",
   "license": "MIT",
   "author": "cli@uselumo.ai",