@lumoai/cli 1.34.0 → 1.35.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -79,7 +79,7 @@ The command catalog below is a **map**: it lists every command grouped by domain
79
79
  **Verification (machine acceptance loop)** — see [verify.md](references/verify.md)
80
80
 
81
81
  - `lumo verify [task] [--timeout <seconds>]` — run every MACHINE criterion's checkpointer locally, report one structured PASS/FAIL verdict per criterion to the server, print next actions. Defaults to the session-bound task. Round cap 3: an all-pass round moves the task to IN_REVIEW (agent stops there); a round-3 fail escalates to a human (stop retrying). **Run this before claiming a task is done.**
82
- - `lumo task status [task] [--json]` — read-only acceptance self-check (no LLM, milliseconds): the contract with each criterion's latest verdict (REVIEW_ADDED provenance visible), verification history, current round, last round's failure reasons, `nextActions` = the unmet criteria (the declarative "what's next" — no separate plan), and any OPEN (undispositioned) boundary crossings (count + per crossing category/severity/detail + a read-only attribution line `↳ by model=…·agent=…·session=…` naming who/what crossed, `unknown` when unresolved — LUM-469; `--json` adds an `openCrossings[]` field, each entry carrying an `attribution` object) — read-only awareness, disposition stays web + human-only (LUM-448). Defaults to the session-bound task; `--json` emits a versioned payload (`version` field). **Run it first when resuming a task in a new session or after a verification round was rejected.**
82
+ - `lumo task status [task] [--json]` — read-only acceptance self-check (no LLM, milliseconds): the contract with each criterion's latest verdict (REVIEW_ADDED provenance visible), verification history, current round, last round's failure reasons, `nextActions` = the unmet criteria (the declarative "what's next" — no separate plan), and any OPEN (undispositioned) boundary crossings (count + per crossing category/severity/detail + a read-only attribution line `↳ by model=…·agent=…·session=…` naming who/what crossed, `unknown` when unresolved — LUM-469; `--json` adds an `openCrossings` field, each entry carrying an `attribution` object) — read-only awareness, disposition stays web + human-only (LUM-448). The crossings check fails closed (LUM-480): if the read errors, the block prints `⚠ Boundary-crossing check failed` instead of staying silent, and `--json` sets `openCrossings: null` (distinct from `[]` = a successful read with zero open — treat `null` as "could not confirm", not "safe"). Defaults to the session-bound task; `--json` emits a versioned payload (`version` field). **Run it first when resuming a task in a new session or after a verification round was rejected.**
83
83
  - `lumo verdict [task] --pass | --pass-with-followup | --fail` — acceptance verdicts (LUM-422). `--pass` / `--pass-with-followup` open the browser to the human verdict bar focused on the passing action (a deep link — **records nothing**; a passing data row is only ever a human's own click). `--fail --reason <enum> [--note <text>] [--criterion <id>…]` records an **AGENT send-back** (verifierType=AGENT, verdict hard-coded FAIL) and bounces the task to IN_PROGRESS. Defaults to the session-bound task. **An unresolved send-back (machine/AGENT/human FAIL) blocks the agent/CLI DONE transition with 409** — clear it (re-verify) before `task update --status done`.
84
84
 
85
85
  **Artifacts & Figma** — see [artifacts-figma.md](references/artifacts-figma.md)
@@ -126,7 +126,7 @@ The command catalog below is a **map**: it lists every command grouped by domain
126
126
 
127
127
  - `lumo session attach <id>` — bind this session to a task (then run `task context`). **Lifetime lock**: re-attaching to the same task is a no-op; attaching to a _different_ task is refused with 409 — start a new Claude Code session instead. No `--force`, no `session detach`.
128
128
  - `lumo session status` — show current binding
129
- - `lumo session wrap [--yes] [--dry-run] [--used <indices>]` — end-of-session panel: progress comment + memory review + fragment-usage vote (`--used`, LUM-300) + blocked-tag prompt, then a read-only reminder when the bound task has ≥1 OPEN boundary crossing still undispositioned (silent when none — no wrap-up noise; pointer is web + human-only, LUM-448). Usage is now also audited automatically when a task reaches DONE (evidence-gated, true-only — confident fragments marked used, the rest left NULL); `session wrap --used` remains the manual override and takes precedence for a session.
129
+ - `lumo session wrap [--yes] [--dry-run] [--used <indices>]` — end-of-session panel: progress comment + memory review + fragment-usage vote (`--used`, LUM-300) + blocked-tag prompt, then a read-only reminder when the bound task has ≥1 OPEN boundary crossing still undispositioned (silent only on a genuine empty read — no wrap-up noise; a crossings-check failure prints a "could not confirm" warning instead of staying silent, LUM-480; pointer is web + human-only, LUM-448). Usage is now also audited automatically when a task reaches DONE (evidence-gated, true-only — confident fragments marked used, the rest left NULL); `session wrap --used` remains the manual override and takes precedence for a session.
130
130
  - Git-suggest at session start (suggests `session attach`, never auto-binds) + Layer-2 project-memory review — see the reference
131
131
 
132
132
  **Worktrees (local dev tooling)** — see [worktree.md](references/worktree.md)
@@ -181,12 +181,16 @@ nothing to prompt, the section prints "(no content)".
181
181
  finish, `session wrap` prints a one-shot read-only reminder **if** the bound
182
182
  task has ≥1 OPEN (undispositioned) boundary crossing: `⚠ N open boundary
183
183
  crossing(s) on LUM-N still undispositioned:` then a line per crossing `• [SEVERITY]
184
- CATEGORY` and a web pointer. When there are none — or the session is unbound —
185
- it prints **nothing** (truly silent, not a "(no content)" line), so a clean task
186
- adds no wrap-up noise. **Awareness only:** it points at the web acceptance panel;
187
- there is **no CLI path** to disposition or clear a crossing. Disposition stays
188
- web + human-only (LUM-426/435/422) an agent/CLI bearer cannot clear its own
189
- crossing from the terminal.
184
+ CATEGORY` and a web pointer. When the read genuinely comes back empty — or the
185
+ session is unbound — it prints **nothing** (truly silent, not a "(no content)"
186
+ line), so a clean task adds no wrap-up noise. **But a crossings-check failure is
187
+ not silent (LUM-480):** if the read errors (network / server), it prints
188
+ `⚠ Could not check boundary crossings on LUM-N (network/server error) unable
189
+ to confirm whether any are still undispositioned`, so a failed safety check never
190
+ masquerades as "0 open / safe". **Awareness only:** it points at the web
191
+ acceptance panel; there is **no CLI path** to disposition or clear a crossing.
192
+ Disposition stays web + human-only (LUM-426/435/422) — an agent/CLI bearer cannot
193
+ clear its own crossing from the terminal.
190
194
 
191
195
  ```bash
192
196
  lumo session wrap # interactive: preview each section, choose per-section
@@ -125,10 +125,23 @@ what's unmet and why (the exact failure tails), and how many rounds are left.
125
125
 
126
126
  - Header: task identifier/title/status + `verification round N/3` (round 0 =
127
127
  never verified) + an escalation warning when the machine loop is exhausted.
128
+ - **Machine verification rollup** (LUM-470) — directly under the `Criteria`
129
+ header, one line `Machine verification: N machine-verified / M human override
130
+ (of T MACHINE criteria)` over the active MACHINE criteria, aligned with the web
131
+ read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion,
132
+ so the terminal rollup never reads as all-human when a checkpointer actually
133
+ verified the work.
128
134
  - **Criteria** — every criterion as `<glyph> <id> [TYPE] SOURCE@rN
129
135
  statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its
130
136
  checkpointer and latest verdict line (evidence pointer on pass, failure
131
137
  tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
138
+ - A passing **MACHINE** criterion's verdict line carries a machine-state tag
139
+ derived from the read model's `machinePassed` flag, NOT the latest verdict
140
+ (LUM-470): `· machine-verified` when a checkpointer actually passed it (even
141
+ after a human later signs the task off), or `· human override (no machine
142
+ pass)` when it passes only on a human sign-off with no machine run underneath.
143
+ This keeps the terminal honest with web — a machine-verified criterion that
144
+ a human co-signed no longer reads as a plain human pass.
132
145
  - A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion
133
146
  was changed after that verdict (reworded, or its checkpointer was swapped so
134
147
  the recorded evidence ran a different command). The pass still counts as met
@@ -154,19 +167,31 @@ statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with it
154
167
  **Read-only awareness** — this surfaces crossings detected elsewhere
155
168
  (LUM-426/435/442); there is no CLI path to disposition or clear one.
156
169
  Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer
157
- cannot clear its own crossing from the terminal.
170
+ cannot clear its own crossing from the terminal. **The check fails closed
171
+ (LUM-480):** if the crossings read itself errors (network / server / parse),
172
+ the block prints `⚠ Boundary-crossing check failed (network/server error) —
173
+ could not confirm whether any are undispositioned` instead of staying silent.
174
+ Silence means a successful read with zero open crossings, never a failed
175
+ check — a hiccup can no longer masquerade as "all clear".
158
176
 
159
177
  ### --json contract
160
178
 
161
179
  `--json` emits the full read model with a top-level `version` field
162
180
  (currently `1`). The schema is versioned: breaking shape changes bump the
163
181
  major; additive fields don't. Pin on `version` when scripting against it.
182
+ Each criterion carries `machinePassed` (boolean — a checkpointer currently
183
+ vouches for it; LUM-456/470), and the payload carries a top-level
184
+ `machineVerification` aggregate `{ total, machineVerified, humanOverridden }`
185
+ over the active MACHINE criteria — read these, not `latestVerdict` alone, to
186
+ tell a machine-verified criterion from a human override.
164
187
  The open boundary crossings ride along as an additive top-level
165
- `openCrossings[]` (each `{ id, category, severity, detail, attribution }`,
188
+ `openCrossings` (each entry `{ id, category, severity, detail, attribution }`,
166
189
  where `attribution` is `{ workspaceMemberId, sessionId, agent, worktreeBranch,
167
190
  model }` with every field nullable — null = unknown, never fabricated; LUM-469;
168
- the array length is the count; empty when none) — same read-only awareness, no
169
- write path.
191
+ the array length is the count) — same read-only awareness, no write path.
192
+ **`openCrossings` is `null` when the crossings check failed (LUM-480)** —
193
+ distinct from `[]`, which is a successful read with zero open crossings. Script
194
+ consumers must treat `null` as "unknown / could not confirm", not "safe".
170
195
 
171
196
  `status` reads; `verify` judges. Running status never starts a round, never
172
197
  escalates, and never changes task state — loop rules (cap 3, IN_REVIEW on
@@ -46,6 +46,13 @@ function formatTaskStatus(data, extras = {}) {
46
46
  }
47
47
  lines.push('');
48
48
  lines.push(`Criteria (${data.criteria.length} total, ${data.nextActions.length} unmet):`);
49
+ // LUM-470: honest machine-verification rollup over the active MACHINE criteria
50
+ // (same read model as web, LUM-456) — so the terminal rollup never reads as
51
+ // all-human when a checkpointer actually verified the work.
52
+ const mv = data.machineVerification;
53
+ if (mv.total > 0) {
54
+ lines.push(`Machine verification: ${mv.machineVerified} machine-verified / ${mv.humanOverridden} human override (of ${mv.total} MACHINE criteria)`);
55
+ }
49
56
  for (const c of data.criteria) {
50
57
  const glyph = c.latestVerdict == null
51
58
  ? '○'
@@ -81,7 +88,16 @@ function formatTaskStatus(data, extras = {}) {
81
88
  const evidencePart = v.evidencePointer
82
89
  ? ` · ${(0, sanitize_1.sanitizeField)(v.evidencePointer)}`
83
90
  : '';
84
- lines.push(` ✓ ${v.verdict}@r${v.round}${evidencePart}`);
91
+ // LUM-470: tag a passing MACHINE criterion by the read model's
92
+ // machinePassed flag, not the latest verdict — a criterion a checkpointer
93
+ // verified reads as machine-verified even after a human signs off, and a
94
+ // pass with no machine run underneath reads as a human override.
95
+ const machineTag = c.verifierType === 'MACHINE'
96
+ ? c.machinePassed
97
+ ? ' · machine-verified'
98
+ : ' · human override (no machine pass)'
99
+ : '';
100
+ lines.push(` ✓ ${v.verdict}@r${v.round}${evidencePart}${machineTag}`);
85
101
  // LUM-457: a pass that vouches for a pre-edit version of the criterion —
86
102
  // render-only downgrade, the criterion still counts met.
87
103
  if (c.verdictStale || c.checkMismatch) {
@@ -146,7 +162,17 @@ function formatTaskStatus(data, extras = {}) {
146
162
  * a crossing from the terminal (LUM-426/435/422).
147
163
  */
148
164
  function pushOpenCrossings(lines, extras) {
149
- const open = extras.openCrossings ?? [];
165
+ const result = extras.openCrossings;
166
+ if (!result)
167
+ return;
168
+ // LUM-480: a failed check is NOT "0 open / safe" — say so explicitly rather
169
+ // than rendering an empty (implicitly-clear) block.
170
+ if (result.status === 'error') {
171
+ lines.push('');
172
+ lines.push('⚠ Boundary-crossing check failed (network/server error) — could not confirm whether any are undispositioned.');
173
+ return;
174
+ }
175
+ const open = result.crossings;
150
176
  if (open.length === 0)
151
177
  return;
152
178
  lines.push('');
@@ -224,20 +250,24 @@ async function taskStatus(identifier, options = {}) {
224
250
  }
225
251
  const data = (await res.json());
226
252
  // Read-only awareness (LUM-448): surface the task's OPEN boundary crossings
227
- // via the existing LUM-435 endpoint. Best-effort fetchOpenCrossings already
228
- // swallows failures to an empty list, so this supplementary safety signal can
229
- // never block the primary acceptance status. The resolved taskId (the
230
- // identifier the status was fetched for) is the key here.
231
- const openCrossings = await (0, open_crossings_1.fetchOpenCrossings)(base, creds.token, taskId);
253
+ // via the existing LUM-435 endpoint. fetchOpenCrossings returns a result that
254
+ // distinguishes a check FAILURE from a genuine 0-open read (LUM-480), so this
255
+ // supplementary safety signal never masquerades a hiccup as "all clear" yet
256
+ // it still never throws, so it can't block the primary acceptance status. The
257
+ // resolved taskId (the identifier the status was fetched for) is the key here.
258
+ const crossingsResult = await (0, open_crossings_1.fetchOpenCrossings)(base, creds.token, taskId);
232
259
  if (options.json) {
233
260
  // JSON.stringify escapes control chars (…), so the payload is safe
234
261
  // to emit raw — and consumers get byte-faithful server data.
235
- // The open crossings ride alongside as an additive field (count = length).
262
+ // openCrossings rides alongside as an additive field: an array on success
263
+ // (count = length), or `null` when the check failed (LUM-480) — distinct
264
+ // from `[]`, which is a successful read with zero open crossings.
265
+ const openCrossings = crossingsResult.status === 'ok' ? crossingsResult.crossings : null;
236
266
  process.stdout.write(JSON.stringify({ ...data, openCrossings }, null, 2) + '\n');
237
267
  return;
238
268
  }
239
269
  process.stdout.write(formatTaskStatus(data, {
240
- openCrossings,
270
+ openCrossings: crossingsResult,
241
271
  dispositionUrl: (0, open_crossings_1.dispositionUrl)(base, creds.workspaceSlug ?? 'lumo', data.task.identifier),
242
272
  }));
243
273
  }
@@ -7,13 +7,20 @@ const sanitize_1 = require("../../lib/sanitize");
7
7
  const open_crossings_1 = require("../../lib/open-crossings");
8
8
  /**
9
9
  * Build the wrap-up reminder for a task's OPEN boundary crossings (LUM-448).
10
- * Returns the reminder string when there is ≥1 open crossing, and `null` when
11
- * there are none — the caller prints nothing on null, so a clean task makes NO
12
- * noise at wrap time. Read-only awareness: the reminder points at the
13
- * human-only web disposition panel and offers no way to clear a crossing from
14
- * the terminal (LUM-426/435/422).
10
+ * Returns the reminder string when there is ≥1 open crossing, and `null` only on
11
+ * a genuine empty read — the caller prints nothing on null, so a clean task
12
+ * makes NO noise at wrap time. A check FAILURE (LUM-480) is NOT silent: it
13
+ * returns a warning so a failed safety check never reads as "0 open / safe".
14
+ * Read-only awareness: the reminder points at the human-only web disposition
15
+ * panel and offers no way to clear a crossing from the terminal (LUM-426/435/422).
15
16
  */
16
- function formatCrossingReminder(taskIdentifier, open, url) {
17
+ function formatCrossingReminder(taskIdentifier, result, url) {
18
+ if (result.status === 'error') {
19
+ return (`⚠ Could not check boundary crossings on ${taskIdentifier} ` +
20
+ `(network/server error) — unable to confirm whether any are still ` +
21
+ `undispositioned. Review in the web panel: ${url}\n`);
22
+ }
23
+ const open = result.crossings;
17
24
  if (open.length === 0)
18
25
  return null;
19
26
  const n = open.length;
@@ -28,14 +35,15 @@ function formatCrossingReminder(taskIdentifier, open, url) {
28
35
  }
29
36
  /**
30
37
  * Resolve the session's bound task and surface its OPEN boundary crossings as a
31
- * wrap-up reminder (LUM-448), or `null` when the session is unbound or nothing
32
- * is open. Pure read `fetchOpenCrossings` hits only the LUM-435 GET endpoint
33
- * and there is no disposition write path here.
38
+ * wrap-up reminder (LUM-448), or `null` when the session is unbound or the read
39
+ * genuinely came back empty. A crossings-check failure yields a warning, not
40
+ * silence (LUM-480). Pure read `fetchOpenCrossings` hits only the LUM-435 GET
41
+ * endpoint and there is no disposition write path here.
34
42
  */
35
43
  async function openCrossingReminder(creds) {
36
44
  const taskIdentifier = await (0, resolve_bound_task_1.resolveBoundTaskIdentifier)(creds.apiUrl, creds.token);
37
45
  if (!taskIdentifier)
38
46
  return null;
39
- const open = await (0, open_crossings_1.fetchOpenCrossings)(creds.apiUrl, creds.token, taskIdentifier);
40
- return formatCrossingReminder(taskIdentifier, open, (0, open_crossings_1.dispositionUrl)(creds.apiUrl, creds.workspaceSlug, taskIdentifier));
47
+ const result = await (0, open_crossings_1.fetchOpenCrossings)(creds.apiUrl, creds.token, taskIdentifier);
48
+ return formatCrossingReminder(taskIdentifier, result, (0, open_crossings_1.dispositionUrl)(creds.apiUrl, creds.workspaceSlug, taskIdentifier));
41
49
  }
@@ -32,9 +32,13 @@ function normalizeSeverity(s) {
32
32
  * way to clear a crossing — disposition stays web + human-only
33
33
  * (LUM-426/435/422).
34
34
  *
35
- * Best-effort: any transport / HTTP / parse failure yields an empty list, so a
36
- * supplementary safety signal can never break the caller's primary output
37
- * (the acceptance status, the wrap-up panel).
35
+ * Fails *closed*, not open (LUM-480): any transport / non-ok HTTP / parse
36
+ * failure returns `{ status: 'error', reason }` so the caller can say "check
37
+ * failed could not confirm" instead of mistaking a hiccup for "0 open / all
38
+ * clear". A successful read returns `{ status: 'ok', crossings }`; the list is
39
+ * empty only when the server genuinely reports no open crossings. Either way the
40
+ * supplementary safety signal never throws into the caller's primary output (the
41
+ * acceptance status, the wrap-up panel) — failure is a value, not an exception.
38
42
  */
39
43
  async function fetchOpenCrossings(apiUrl, token, taskIdentifier) {
40
44
  const url = `${(0, api_1.trimTrailingSlash)(apiUrl)}/api/tasks/${encodeURIComponent(taskIdentifier)}/boundary-crossings`;
@@ -42,20 +46,23 @@ async function fetchOpenCrossings(apiUrl, token, taskIdentifier) {
42
46
  try {
43
47
  res = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
44
48
  }
45
- catch {
46
- return [];
49
+ catch (err) {
50
+ return {
51
+ status: 'error',
52
+ reason: err instanceof Error ? err.message : 'network error',
53
+ };
47
54
  }
48
55
  if (!res.ok)
49
- return [];
56
+ return { status: 'error', reason: `HTTP ${res.status}` };
50
57
  let data;
51
58
  try {
52
59
  data = (await res.json());
53
60
  }
54
61
  catch {
55
- return [];
62
+ return { status: 'error', reason: 'invalid response body' };
56
63
  }
57
64
  const rows = Array.isArray(data.crossings) ? data.crossings : [];
58
- return rows
65
+ const crossings = rows
59
66
  .filter(c => c.disposition == null)
60
67
  .map(c => ({
61
68
  id: c.id,
@@ -65,6 +72,7 @@ async function fetchOpenCrossings(apiUrl, token, taskIdentifier) {
65
72
  attribution: normalizeAttribution(c.attribution),
66
73
  }))
67
74
  .sort((a, b) => SEVERITY_RANK[b.severity] - SEVERITY_RANK[a.severity]);
75
+ return { status: 'ok', crossings };
68
76
  }
69
77
  /**
70
78
  * The web deep link where a HUMAN dispositions crossings. Disposition is
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@lumoai/cli",
3
- "version": "1.34.0",
3
+ "version": "1.35.0",
4
4
  "description": "Lumo CLI — manage tasks and sessions from the terminal",
5
5
  "license": "MIT",
6
6
  "author": "cli@uselumo.ai",