@lumoai/cli 1.42.0 → 1.44.0

@@ -117,6 +117,16 @@ what's unmet and why (the exact failure tails), and how many rounds are left.
  - A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion was changed after that verdict (reworded, or its checkpointer was swapped so the recorded evidence ran a different command). The pass still counts as met (a stale pass does not block DONE — render-only signal), but it vouches for an older version — **re-run `lumo verify` to re-confirm against the current criterion.** This is the habit whenever you edit a MACHINE criterion's checkpointer mid-task: change the check, then re-verify so the green is honest.
  - **History** — one line per recorded round: `rN · timestamp · X PASS / Y FAIL`.
  - **Last round failures** — the most recent round's FAIL verdicts with their rejection reasons (why the last round bounced).
+ - **Cost** (LUM-560) — Rule 1: the costs a human should weigh, on the same report as the verdict instead of scattered across the web delivery card and `task lineage`. Three lines: **Tokens** (total input+output+cache across the task's sessions), **Active time** (non-idle agent seconds — Σ per-turn `STOP − prompt`, LUM-487), and **Rework rounds** (verify rounds that recorded a FAIL). Read from the **same** server-side source the web delivery card consumes (`retrospectiveRepository.loadActuals`), so the two reports cannot drift. Token cost is **fail-closed**: when no session usage was recorded it prints `Tokens: not recorded (no session usage captured)`, kept distinct from a measured `0` ("not measured" vs. "spent 0", aligned with LUM-559). Carried in `--json` as `cost { tokenCost, activeTimeSec, reworkRounds }` (`tokenCost: null` = not measured). Omitted only against an older server that doesn't emit the field.
+ - **Struggle / rework / outstanding** (LUM-561) — the anti-mum-and-deaf block: **always printed when the contract exists, even on a clean 0-unmet task** so a passing task still shows its scars instead of wiping them to a single PASS count. Lists, when present:
+   - **rework rounds** — verify rounds that had a FAIL;
+   - **send-backs** — criteria sent back by a human/agent verdict (a MACHINE verify-loop FAIL is not a send-back), with their open/resolved lifecycle, preserved even for since-removed criteria;
+   - **leftover follow-ups** — criteria whose latest verdict is `PASS_WITH_FOLLOWUP`;
+   - **PR iterations** — when the task has >1 PR (the dominant rework signal when the verify loop ran once but the work churned across many follow-up PRs — e.g. LUM-557: ~10 PRs vs 1 verify round); a single PR is the happy path and is not flagged;
+   - **reopens** — backward `IN_REVIEW/DONE → IN_PROGRESS/TODO` transitions (from the `STATUS_CHANGED` log): the task reached review/done and got bounced, a rework that leaves no FAIL verdict.
+
+   When the trail is genuinely empty it states the **basis** (`None recorded — N rounds run, 0 FAIL, no send-backs, no reopens, no leftover follow-ups`); when nothing has been verified yet it says so (`No verification has run yet — cannot confirm there were no difficulties`) rather than rendering an implicitly-clean slate. Carried in `--json` as `struggleTrail` (incl. `pullRequests` + `reopens`).
+
  - **Next actions** — the unmet criteria (latest verdict is not a pass: failed or never verified, HUMAN ones included). This list IS the plan — recomputed from the event log on every read, never maintained separately. Empty + rounds recorded = awaiting human adjudication.
  - **Open boundary crossings** (LUM-448) — a trailing safety block when the task has ≥1 OPEN (undispositioned) forbidden-action crossing: a count, then one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>` (highest-severity first), each followed by a read-only **attribution** line `↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469 — who/what crossed; any dimension that couldn't be resolved server-side prints `unknown`, never a fabricated value), then a pointer to the web acceptance panel. Silent when there are none, so it never overshadows the criteria.
  - **Read-only awareness** — this surfaces crossings detected elsewhere (LUM-426/435/442); there is no CLI path to disposition or clear one. Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer cannot clear its own crossing from the terminal.
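The fail-closed rule for the `cost` field can be sketched from a consumer's point of view. This is a minimal illustration, assuming the `--json` report has already been parsed into an object; the field names `cost` and `tokenCost` come from the text above, while `describeTokenCost` and its return strings are hypothetical and not part of the CLI.

```javascript
// Hypothetical helper (not part of @lumoai/cli): classify the `cost` field of
// an already-parsed `--json` report into the three states the docs describe.
function describeTokenCost(report) {
  // Older servers omit `cost` entirely, so nothing can be claimed.
  if (report.cost == null) return 'unavailable';
  // `tokenCost: null` means no session usage was captured (not measured).
  if (report.cost.tokenCost == null) return 'not measured';
  // Any number, including 0, is a real measurement.
  return `measured: ${report.cost.tokenCost}`;
}

console.log(describeTokenCost({}));
console.log(describeTokenCost({ cost: { tokenCost: null, activeTimeSec: 0, reworkRounds: 0 } }));
console.log(describeTokenCost({ cost: { tokenCost: 0, activeTimeSec: 12, reworkRounds: 0 } }));
```

The point of the three-way split is that a consumer should never coerce `null` to `0` (for example with `tokenCost || 0`), since that would erase the distinction the report is designed to preserve.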
@@ -102,10 +102,14 @@ function formatLineageMarkdown(data) {
      lines.push(`**Status**: ${data.task.status}`);
      lines.push('');
      if (data.groups.length === 0) {
-         lines.push('_No lineage edges recorded yet. Lineage is captured when a ' +
-             "bound session consumes this task's context; once that happens " +
-             '(and a PR merges / the task closes), the causal trail and its cost ' +
-             'will appear here._');
+         // LUM-559: an empty report is fail-closed: it means no session ever bound
+         // to this task, so neither its cost nor its causal trail could be measured.
+         // Say "not measured" explicitly; never let the blank read as "zero cost".
+         lines.push('_Cost not measured — no session was ever bound to this task, so its ' +
+             'cost and causal trail could not be recorded. This is "not measured", ' +
+             'not "zero cost". Cost and lineage are captured once a session binds ' +
+             '(`lumo session attach <id>`) and consumes the context; bind before ' +
+             'working so future runs are recorded._');
          lines.push('');
          return lines.join('\n');
      }
@@ -146,6 +150,16 @@ function formatLineageMarkdown(data) {
          }
          const summary = outcomeSummary(fragmentOutcomeCounts(g.fragments));
          lines.push(`**Fragments** (${g.fragments.length}${summary ? `: ${summary}` : ''}):`);
+         // LUM-559: an edgeless cost group is a session that spent tokens but
+         // recorded no fragments (it bound after session-start). Its empty trail is
+         // "not captured", NOT "the session used nothing" — say so, and skip the
+         // per-fragment usage legend that has nothing to annotate.
+         if (g.fragments.length === 0) {
+             lines.push('_Causal fragments not captured — this session bound after start, so ' +
+                 'its cost is recorded but its fragment trail is not._');
+             lines.push('');
+             continue;
+         }
          lines.push('_✓ used · · abstained · ✗ unused (manual)_');
          for (const f of g.fragments) {
              const tag = f.disclosure === 'INDEX'
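Taken together, the two LUM-559 hunks separate three lineage states that previously could all read as "empty". The sketch below only names those states; it assumes the same `data.groups[].fragments` shape the formatter reads, and `classifyLineage` with its labels is a hypothetical illustration, not code from the package.

```javascript
// Hypothetical illustration of the three states the LUM-559 hunks distinguish.
// The real formatter emits Markdown; this only returns a label per case.
function classifyLineage(data) {
  // No groups at all: no session ever bound, so cost was never measured.
  if (data.groups.length === 0) return ['cost not measured'];
  return data.groups.map(g =>
    g.fragments.length === 0
      // The session bound late: cost is recorded, the fragment trail is not.
      ? 'fragments not captured'
      : `${g.fragments.length} fragment(s) captured`);
}

console.log(classifyLineage({ groups: [] }));
console.log(classifyLineage({ groups: [{ fragments: [] }, { fragments: [{}, {}] }] }));
```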
@@ -134,6 +134,8 @@ function formatTaskStatus(data, extras = {}) {
              }
          }
      }
+     pushCost(lines, data);
+     pushStruggleTrail(lines, data);
      lines.push('');
      if (data.nextActions.length === 0) {
          lines.push(data.currentRound > 0
@@ -165,6 +167,130 @@ function formatTaskStatus(data, extras = {}) {
      pushOpenCrossings(lines, extras);
      return lines.join('\n') + '\n';
  }
+ /** Compact a token count for the terminal — 1_200_000 → "1.2M", 850_000 →
+  * "850K", 0 → "0". Uses the SAME Intl compact formatter the web card's
+  * fmtCompact uses (notation:'compact', maximumFractionDigits:1) so the same
+  * number reads identically in both surfaces — no presentation drift (LUM-560). */
+ const TOKEN_FMT = new Intl.NumberFormat('en-US', {
+     notation: 'compact',
+     maximumFractionDigits: 1,
+ });
+ function fmtTokens(n) {
+     return TOKEN_FMT.format(n);
+ }
+ /** Active (non-idle) seconds → a compact "2h 14m" / "3m 5s" / "12s". 0 → "0s". */
+ function fmtDuration(totalSec) {
+     const sec = Math.max(0, Math.round(totalSec));
+     if (sec === 0)
+         return '0s';
+     const h = Math.floor(sec / 3600);
+     const m = Math.floor((sec % 3600) / 60);
+     const s = sec % 60;
+     const parts = [];
+     if (h > 0)
+         parts.push(`${h}h`);
+     if (m > 0)
+         parts.push(`${m}m`);
+     // Show seconds only when the duration is under an hour (keeps long runs tidy).
+     if (s > 0 && h === 0)
+         parts.push(`${s}s`);
+     return parts.join(' ');
+ }
+ /**
+  * Append the honest "Cost" section (LUM-560) — Rule 1: surface the costs a human
+  * should weigh (token spend, active time, machine rework) on the same report as
+  * the acceptance verdict, instead of leaving them scattered across the web
+  * delivery card and `task lineage`. Every number is the server's, read from the
+  * same loadActuals source as the web card (no drift). Token cost is fail-closed:
+  * a null reads as an explicit "not recorded" line, never a silent or fake 0, so
+  * "not measured" (no session usage) stays distinct from a measured "spent 0".
+  * Skipped only when the server didn't emit the field (older server) — never
+  * fabricated.
+  */
+ function pushCost(lines, data) {
+     const cost = data.cost;
+     if (!cost)
+         return; // older server: can't fabricate cost, so don't claim any.
+     lines.push('');
+     lines.push('Cost:');
+     lines.push(cost.tokenCost == null
+         ? ' Tokens: not recorded (no session usage captured)'
+         : ` Tokens: ${fmtTokens(cost.tokenCost)}`);
+     lines.push(` Active time: ${fmtDuration(cost.activeTimeSec)} (non-idle)`);
+     lines.push(` Rework rounds: ${cost.reworkRounds}${cost.reworkRounds === 0 ? ' (no machine rework)' : ''}`);
+ }
+ /**
+  * Append the honest "Struggle / rework / outstanding" section (LUM-561) — the
+  * anti-mum-and-deaf block (kills a silent "Nothing outstanding"). It is ALWAYS
+  * rendered when the contract exists, even on a clean 0-unmet task: a passing
+  * task that bounced, was sent back, or left a follow-up behind still shows its
+  * scars. When the trail is genuinely empty the block states the *basis* for
+  * that ("none — N rounds run, 0 FAIL …"), or, when nothing has been verified,
+  * that absence cannot be confirmed — never a bare clean slate. Skipped only
+  * when the server didn't emit the field (older server).
+  */
+ function pushStruggleTrail(lines, data) {
+     const trail = data.struggleTrail;
+     if (!trail)
+         return; // older server: can't fabricate, so don't claim "none".
+     lines.push('');
+     lines.push('Struggle / rework / outstanding:');
+     // A single PR is the happy path; >1 PR is the iteration signal. Reopens (a
+     // bounce after IN_REVIEW/DONE) always count. These two catch the rework that
+     // lives in PR cycles / status flips rather than in FAIL verdicts (LUM-561
+     // follow-up — without them a 10-PR task like LUM-557 reads as one hiccup).
+     const prIterated = trail.pullRequests.length > 1;
+     const empty = trail.reworkRounds.length === 0 &&
+         trail.sendBacks.length === 0 &&
+         trail.followUps.length === 0 &&
+         trail.reopens.length === 0 &&
+         !prIterated;
+     if (empty) {
+         if (data.currentRound === 0) {
+             // Honest fail-open: nothing was verified, so we cannot claim there were
+             // no difficulties — say so rather than render an implicitly-clean slate.
+             lines.push(' No verification has run yet — cannot confirm there were no difficulties. Run `lumo verify`.');
+         }
+         else {
+             const rounds = `${data.currentRound} verification round${data.currentRound === 1 ? '' : 's'}`;
+             lines.push(` None recorded — ${rounds} run, 0 FAIL, no send-backs, no reopens, no leftover follow-ups.`);
+         }
+         return;
+     }
+     if (trail.reworkRounds.length > 0) {
+         const parts = trail.reworkRounds.map(r => `round ${r.round} (${r.failed} FAIL)`);
+         lines.push(` Rework rounds: ${parts.join(', ')}`);
+     }
+     if (trail.sendBacks.length > 0) {
+         lines.push(' Send-backs:');
+         for (const s of trail.sendBacks) {
+             const lifecycle = s.status === 'resolved'
+                 ? `resolved (sent back r${s.failedAtRound}${s.resolvedAtRound != null ? ` → r${s.resolvedAtRound}` : ''})`
+                 : `open (sent back r${s.failedAtRound})`;
+             lines.push(` • ${(0, sanitize_1.sanitizeField)(s.statement)} — ${lifecycle}`);
+         }
+     }
+     if (trail.followUps.length > 0) {
+         lines.push(' Follow-ups left behind:');
+         for (const f of trail.followUps) {
+             lines.push(` • ${(0, sanitize_1.sanitizeField)(f.statement)} — flagged r${f.round}`);
+         }
+     }
+     // PR iteration — the dominant rework signal when the verify loop only ran
+     // once but the work churned across many PRs (LUM-557). Only when >1 PR.
+     if (prIterated) {
+         const PR_CAP = 12;
+         const nums = trail.pullRequests.map(p => `#${p.number}`);
+         const shown = nums.slice(0, PR_CAP).join(', ');
+         const overflow = nums.length > PR_CAP ? `, +${nums.length - PR_CAP} more` : '';
+         lines.push(` PR iterations: ${trail.pullRequests.length} PRs (${shown}${overflow})`);
+     }
+     if (trail.reopens.length > 0) {
+         lines.push(` Reopened ${trail.reopens.length}× (bounced back after review/done):`);
+         for (const r of trail.reopens) {
+             lines.push(` • ${r.from} → ${r.to}`);
+         }
+     }
+ }
  /**
   * Append the OPEN boundary-crossings safety block (LUM-448) — a count, one line
   * per crossing with its severity + category + clipped detail, and a pointer to
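To make the two formatters in this hunk easier to review, the standalone sketch below copies `fmtTokens` and `fmtDuration` as they appear in the diff and runs them on a few inputs. The sample values are chosen for illustration only; the token outputs assume a Node.js runtime with `en-US` locale data for `Intl.NumberFormat`.

```javascript
// Copied from the hunk above so the snippet runs on its own.
const TOKEN_FMT = new Intl.NumberFormat('en-US', {
  notation: 'compact',
  maximumFractionDigits: 1,
});
function fmtTokens(n) {
  return TOKEN_FMT.format(n);
}
function fmtDuration(totalSec) {
  const sec = Math.max(0, Math.round(totalSec));
  if (sec === 0) return '0s';
  const h = Math.floor(sec / 3600);
  const m = Math.floor((sec % 3600) / 60);
  const s = sec % 60;
  const parts = [];
  if (h > 0) parts.push(`${h}h`);
  if (m > 0) parts.push(`${m}m`);
  // Seconds are dropped once the duration reaches an hour.
  if (s > 0 && h === 0) parts.push(`${s}s`);
  return parts.join(' ');
}

console.log(fmtTokens(1200000)); // 1.2M
console.log(fmtTokens(850000));  // 850K
console.log(fmtDuration(8040));  // 2h 14m
console.log(fmtDuration(185));   // 3m 5s
console.log(fmtDuration(3605));  // 1h (the 5 seconds are not shown)
console.log(fmtDuration(0.4));   // 0s (rounds to zero)
```

One consequence worth noting when reading the Cost block: any duration that rounds to zero prints `0s`, and durations of an hour or more lose their seconds component by design.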
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@lumoai/cli",
-   "version": "1.42.0",
+   "version": "1.44.0",
    "description": "Lumo CLI — manage tasks and sessions from the terminal",
    "license": "MIT",
    "author": "cli@uselumo.ai",