@lumoai/cli 1.41.0 → 1.42.0

@@ -1,11 +1,11 @@
1
1
  # lumo verify — machine verification loop
2
2
 
3
3
  `lumo verify` is the machine half of the acceptance system (Acceptance v1,
4
- LUM-343). It executes every **MACHINE** criterion's checkpointer in the local
5
- repo, reports one structured PASS/FAIL verdict per criterion to the server,
6
- and prints what to do next. The judge lives server-side: round numbering, the
7
- 3-round cap, escalation, and the IN_REVIEW transition all happen there
8
- (execution on the client, adjudication on the server).
4
+ LUM-343): it executes every **MACHINE** criterion's checkpointer in the local
5
+ repo, POSTs one structured PASS/FAIL verdict per criterion, and prints what to
6
+ do next. Execution is on the client; adjudication is server-side: round
7
+ numbering, the **3-round cap**, escalation, and the **IN_REVIEW** transition all
8
+ happen on the server.
9
9
 
10
10
  ## The claim-done rule
11
11
 
@@ -13,10 +13,10 @@ and prints what to do next. The judge lives server-side: round numbering, the
13
13
  touching its status — run `lumo verify`.** The loop replaces "I read the code
14
14
  and it looks done" with executed evidence.
15
15
 
16
- ```
17
- lumo verify # session-bound task
18
- lumo verify LUM-42 # explicit task (overrides the session binding)
19
- lumo verify --timeout 900 # per-checkpointer timeout in seconds (default 600)
16
+ ```bash
17
+ lumo verify # session-bound task
18
+ lumo verify LUM-42 # explicit task (overrides the session binding)
19
+ lumo verify --timeout 900 # per-checkpointer timeout in seconds (default 600)
20
20
  ```
21
21
 
22
22
  ## What one round does
@@ -28,84 +28,60 @@ lumo verify --timeout 900 # per-checkpointer timeout in seconds (default 600)
28
28
  criterion at round = previous max + 1 and mirrors each verdict as a
29
29
  TaskActivity event.
30
30
  4. Prints the round outcome:
31
- - **All PASS** → the task transitions to **IN_REVIEW** (existing state
32
- machine + TASK_IN_REVIEW notification). **Stop here.** Human
33
- adjudication and any HUMAN criteria take over; never set DONE yourself.
34
- - **Any FAIL** → task status is untouched; the unmet criteria are printed
35
- as next actions (statement, checkpointer, failure tail). Fix and re-run.
36
- - **Round 3 still failing** → the loop escalates: a human is notified
37
- (AGENT_VERIFY, requires action) and further `lumo verify` rounds are
38
- rejected with 409. **Stop retrying**; fix only what the human directs.
31
+
32
+ | Round outcome | Effect | What to do |
33
+ | ------------------------- | --------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
34
+ | **All PASS** | Task transitions to **IN_REVIEW** (existing state machine + TASK_IN_REVIEW notification) | **Stop here.** Human adjudication + any HUMAN criteria take over; **never set DONE yourself** |
35
+ | **Any FAIL** | Task status untouched; unmet criteria printed as next actions (statement, checkpointer, failure tail) | Fix and re-run |
36
+ | **Round 3 still failing** | Loop escalates: a human is notified (AGENT_VERIFY, requires action); further `lumo verify` rounds are rejected with **409** | **Stop retrying**; fix only what the human directs |
39
37
 
40
38
  Exit code 0 = all passed (or nothing to run); 1 = failures, escalation, or
41
39
  errors.
42
40
 
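The round outcomes above can be sketched as a small decision function. This is an illustrative model only, not the CLI's or the server's code: the names `RoundOutcome` and `round_outcome` are hypothetical, and the real adjudication happens server-side. The 3-round cap, the IN_REVIEW transition, and the exit codes are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ROUNDS = 3  # hard round budget stated in the doc


@dataclass
class RoundOutcome:
    new_status: Optional[str]  # None means the task status is untouched
    escalated: bool            # True when a human is notified
    exit_code: int             # 0 = all passed or nothing to run, 1 otherwise


def round_outcome(verdicts: List[str], round_number: int) -> RoundOutcome:
    """Hypothetical model of one round; each verdict is 'PASS' or 'FAIL'."""
    if not verdicts:
        # HUMAN-only contract: nothing to run, no server write, exit 0.
        return RoundOutcome(None, False, 0)
    if all(v == "PASS" for v in verdicts):
        # All PASS: the task moves to IN_REVIEW and never further.
        return RoundOutcome("IN_REVIEW", False, 0)
    # Any FAIL leaves the status untouched; a failing round 3 escalates.
    return RoundOutcome(None, round_number >= MAX_ROUNDS, 1)
```

For example, `round_outcome(["PASS", "FAIL"], 3)` reports an escalation with exit code 1 and leaves the status alone, which is the point at which the text says to stop retrying.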
43
41
  ## Verdict semantics (what the CLI sends)
44
42
 
45
- - checkpointer exits 0 → `PASS` with evidence `cmd:<command>#exit=0`
46
- - non-zero exit → `FAIL`, reason = output tail, enum `CRITERION_UNMET`
47
- - spawn failure / timeout → `FAIL`, enum `CHECK_EXECUTION_ERROR`
43
+ | Checkpointer result | Verdict | Detail |
44
+ | ----------------------- | ------- | -------------------------------------------- |
45
+ | exits 0 | `PASS` | evidence `cmd:<command>#exit=0` |
46
+ | non-zero exit | `FAIL` | reason = output tail, enum `CRITERION_UNMET` |
47
+ | spawn failure / timeout | `FAIL` | enum `CHECK_EXECUTION_ERROR` |
48
48
 
49
- evidencePointer is **not free text** — the server only accepts
50
- `commit:<hash>`, `file:<path>:<line>`, or `cmd:<command>#exit=<code>`.
51
- Verdicts are PASS|FAIL only; the agent path cannot write HUMAN verdicts or
52
- `PASS_WITH_FOLLOWUP` (red line — those enter via human-initiated UI paths
53
- only).
49
+ - evidencePointer is **not free text** — the server accepts only `commit:<hash>`, `file:<path>:<line>`, or `cmd:<command>#exit=<code>`.
50
+ - Verdicts are PASS|FAIL only; **the agent path cannot write HUMAN verdicts or `PASS_WITH_FOLLOWUP`** (red line — those enter via human-initiated UI paths only).
54
51
 
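A minimal sketch of the verdict mapping and the evidence-pointer shapes described above. The three prefixes and the enum names come from the text; the regular expressions are an assumption about how strict the server is, and `verdict_for` and `is_valid_evidence_pointer` are hypothetical names, not part of the CLI.

```python
import re
from typing import Optional, Tuple

# The three accepted evidencePointer shapes named in the doc.
# The exact server-side grammar is not documented; these patterns are guesses.
EVIDENCE_PATTERNS = (
    re.compile(r"^commit:[0-9a-fA-F]+$"),
    re.compile(r"^file:.+:\d+$"),
    re.compile(r"^cmd:.+#exit=-?\d+$"),
)


def is_valid_evidence_pointer(pointer: str) -> bool:
    """True when the pointer matches one of the three structured shapes."""
    return any(p.match(pointer) for p in EVIDENCE_PATTERNS)


def verdict_for(command: str, exit_code: Optional[int]) -> Tuple[str, str]:
    """Map a checkpointer result to (verdict, detail).

    exit_code is None when the process could not be spawned or timed out.
    """
    if exit_code is None:
        return ("FAIL", "CHECK_EXECUTION_ERROR")
    if exit_code == 0:
        return ("PASS", f"cmd:{command}#exit=0")
    return ("FAIL", "CRITERION_UNMET")
```

Free text such as `"looks good to me"` fails `is_valid_evidence_pointer`, which mirrors the rule that evidence is a structured pointer rather than a comment.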
55
52
  ## Edge cases
56
53
 
57
- - **No contract yet** → error pointing at `lumo task criteria set`; draft the
58
- contract first (criteria.md golden rule).
59
- - **HUMAN-only contract (zero MACHINE criteria)** → nothing to run; the CLI
60
- says so and suggests handing off for human review
61
- (`lumo task update <id> --status in_review`). No server write happens.
62
- - **A round must cover every MACHINE criterion**: the CLI always runs all of
63
- them; the server rejects partial rounds.
64
- - Criteria added during review (`REVIEW_ADDED`) appear in the contract and
65
- are picked up automatically by the next round.
66
- - **Session bound to a different task (LUM-459)** → the server returns 409,
67
- which the command surfaces as an error. No advisory is printed; the verify
68
- round is rejected outright.
69
- - **Provably-unbound session** → the server includes `bindingAdvisory: 'unbound'`
70
- in the round response, and the command prints:
71
- `⚠ Working unbound — this verify ran from a Claude Code session not attached to the task.`
72
- The run is recorded as a `SESSION_BINDING_MISSING` boundary crossing visible in
73
- `lumo task status` open crossings. Run `lumo session attach <LUM-N>` before the
74
- next verify to bind the session.
75
- - **Unconfirmed session binding** → `bindingAdvisory: 'unconfirmed'` causes a
76
- softer advisory: `⚠ Could not confirm this session is attached to the task.`
77
- Same remediation: `lumo session attach <LUM-N>`.
54
+ | Case | Behavior |
55
+ | ----------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
56
+ | **No contract yet** | Error pointing at `lumo task criteria set`; draft the contract first (criteria.md golden rule). |
57
+ | **HUMAN-only contract** (zero MACHINE criteria) | Nothing to run; CLI says so and suggests `lumo task update <id> --status in_review` for human review. No server write happens. |
58
+ | **Partial round** | A round must cover every MACHINE criterion; the CLI always runs all of them and the server rejects partial rounds. |
59
+ | **`REVIEW_ADDED` criteria** | Criteria added during review appear in the contract and are picked up automatically by the next round. |
60
+ | **Session bound to a different task** (LUM-459) | Server returns 409, surfaced as an error. No advisory printed; the verify round is rejected outright. |
61
+ | **Provably-unbound session** | Response carries `bindingAdvisory: 'unbound'`; prints `⚠ Working unbound — this verify ran from a Claude Code session not attached to the task.` Recorded as a `SESSION_BINDING_MISSING` boundary crossing (visible in `lumo task status` open crossings). Run `lumo session attach <LUM-N>` before the next verify. |
62
+ | **Unconfirmed session binding** | `bindingAdvisory: 'unconfirmed'` → softer advisory `⚠ Could not confirm this session is attached to the task.` Same remediation: `lumo session attach <LUM-N>`. |
78
63
 
79
64
  ## Round discipline
80
65
 
81
66
  Rounds are a hard budget of 3, not a retry loop. Between rounds, actually fix
82
- the failures — re-running without changes burns a round and (at round 3)
83
- pages a human. A FAIL round never changes task status; only an all-pass round
84
- moves it (to IN_REVIEW, never further).
67
+ the failures — re-running without changes burns a round and (at round 3) pages a
68
+ human. **A FAIL round never changes task status; only an all-pass round moves it
69
+ (to IN_REVIEW, never further).**
85
70
 
86
71
  ## Review-time drift habits (gap findings)
87
72
 
88
- A problem discovered during acceptance/review that the contract does NOT
89
- cover is a **gap finding** — record it in the contract, never just fix it
90
- silently:
91
-
92
- 1. **Append it on the spot.** Transcribe the human's finding as a criterion
93
- via `lumo task criteria set <task> --file <desired-final-list> --human` —
94
- the review-added semantics: the gap surfaced at review time, at the
95
- current round.
96
- 2. **Tag why the contract drifted** with `--cause
97
- <NEW_INFO|SCOPE_CHANGE|DRAFT_BLIND_SPOT|GRANULARITY|OTHER>`. Gap findings
98
- are usually `DRAFT_BLIND_SPOT` (the draft missed it) or `NEW_INFO`
99
- (information that didn't exist at drafting time).
100
- 3. **Then bounce.** The appended criterion shows up in `lumo task status`
101
- nextActions and the next verify round picks it up automatically — no
102
- side-channel to-do list.
103
-
104
- How to read drift: information-lag and requirement-movement drift
105
- (`NEW_INFO`, `SCOPE_CHANGE`) is healthy — don't optimize it away.
106
- `DRAFT_BLIND_SPOT` clusters feed back into the drafting guide. **Zero drift
107
- across many tasks is a red flag, not a trophy** — it usually means contracts
108
- are too thin or state only sure-win clauses that can never be found wanting.
73
+ A problem discovered during acceptance/review that the contract does NOT cover is
74
+ a **gap finding** — record it in the contract, never just fix it silently:
75
+
76
+ 1. `lumo task criteria set <task> --file <desired-final-list> --human`: append it on the spot, transcribing the human's finding as a criterion. Review-added semantics: the gap surfaced at review time, at the current round.
77
+ 2. `--cause <NEW_INFO|SCOPE_CHANGE|DRAFT_BLIND_SPOT|GRANULARITY|OTHER>`: tag why the contract drifted. Gap findings are usually `DRAFT_BLIND_SPOT` (the draft missed it) or `NEW_INFO` (info that didn't exist at drafting time).
78
+ 3. `lumo task status`: then bounce. The appended criterion shows up in nextActions and the next verify round picks it up automatically. No side-channel to-do list.
79
+
80
+ How to read drift: information-lag and requirement-movement drift (`NEW_INFO`,
81
+ `SCOPE_CHANGE`) is healthy; don't optimize it away. `DRAFT_BLIND_SPOT` clusters
82
+ feed back into the drafting guide. **Zero drift across many tasks is a red flag,
83
+ not a trophy** — it usually means contracts are too thin or state only sure-win
84
+ clauses that can never be found wanting.
109
85
 
110
86
  ## lumo task status — the read half (self-check entry point)
111
87
 
@@ -114,7 +90,7 @@ are too thin or state only sure-win clauses that can never be found wanting.
114
90
  nothing and burns no round. Defaults to the session-bound task; an explicit
115
91
  identifier overrides.
116
92
 
117
- ```
93
+ ```bash
118
94
  lumo task status # session-bound task
119
95
  lumo task status LUM-42 # explicit task
120
96
  lumo task status --json # versioned machine-readable payload
@@ -122,108 +98,58 @@ lumo task status --json # versioned machine-readable payload
122
98
 
123
99
  ### When to run it
124
100
 
125
- **Status-first recovery:** run it FIRST — before re-reading code or
126
- planning — whenever you:
101
+ **Status-first recovery:** run it FIRST — before re-reading code or planning —
102
+ whenever you:
127
103
 
128
104
  - resume a task in a new session (yours or another agent's earlier work);
129
105
  - come back after a verification round was rejected (`lumo verify` failed);
130
- - were told the task bounced in review (REVIEW_ADDED criteria may have been
131
- appended at the round they surfaced — they show up here automatically).
106
+ - were told the task bounced in review (REVIEW_ADDED criteria may have been appended at the round they surfaced — they show up here automatically).
132
107
 
133
108
  It answers "where does the loop stand": what already passed (don't redo it),
134
109
  what's unmet and why (the exact failure tails), and how many rounds are left.
135
110
 
136
111
  ### What it prints
137
112
 
138
- - Header: task identifier/title/status + `verification round N/3` (round 0 =
139
- never verified) + an escalation warning when the machine loop is exhausted.
140
- - **Machine verification rollup** (LUM-470) directly under the `Criteria`
141
- header, one line `Machine verification: N machine-verified / M human override
142
- (of T MACHINE criteria)` over the active MACHINE criteria, aligned with the web
143
- read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion,
144
- so the terminal rollup never reads as all-human when a checkpointer actually
145
- verified the work.
146
- - **Criteria** — every criterion as `<glyph> <id> [TYPE] SOURCE@rN
147
- statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its
148
- checkpointer and latest verdict line (evidence pointer on pass, failure
149
- tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
150
- - A passing **MACHINE** criterion's verdict line carries a machine-state tag
151
- derived from the read model's `machinePassed` flag, NOT the latest verdict
152
- (LUM-470): `· machine-verified` when a checkpointer actually passed it (even
153
- after a human later signs the task off), or `· human override (no machine
154
- pass)` when it passes only on a human sign-off with no machine run underneath.
155
- This keeps the terminal honest with web — a machine-verified criterion that
156
- a human co-signed no longer reads as a plain human pass.
157
- - A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion
158
- was changed after that verdict (reworded, or its checkpointer was swapped so
159
- the recorded evidence ran a different command). The pass still counts as met
160
- (a stale pass does not block DONE — render-only signal), but it vouches for
161
- an older version — **re-run `lumo verify` to re-confirm against the current
162
- criterion.** This is the habit whenever you edit a MACHINE criterion's
163
- checkpointer mid-task: change the check, then re-verify so the green is honest.
113
+ - **Header**: task identifier/title/status + `verification round N/3` (round 0 = never verified) + an escalation warning when the machine loop is exhausted.
114
+ - **Machine verification rollup** (LUM-470) — directly under the `Criteria` header, one line `Machine verification: N machine-verified / M human override (of T MACHINE criteria)` over the active MACHINE criteria, aligned with the web read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion, so the terminal rollup never reads as all-human when a checkpointer actually verified the work.
115
+ - **Criteria**: every criterion as `<glyph> <id> [TYPE] SOURCE@rN statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its checkpointer and latest verdict line (evidence pointer on pass, failure tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
116
+ - A passing **MACHINE** criterion's verdict line carries a machine-state tag derived from the read model's `machinePassed` flag, NOT the latest verdict (LUM-470): `· machine-verified` when a checkpointer actually passed it (even after a human later signs the task off), or `· human override (no machine pass)` when it passes only on a human sign-off with no machine run underneath. This keeps the terminal honest with web: a machine-verified criterion that a human co-signed no longer reads as a plain human pass.
117
+ - A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion was changed after that verdict (reworded, or its checkpointer was swapped so the recorded evidence ran a different command). The pass still counts as met (a stale pass does not block DONE — render-only signal), but it vouches for an older version — **re-run `lumo verify` to re-confirm against the current criterion.** This is the habit whenever you edit a MACHINE criterion's checkpointer mid-task: change the check, then re-verify so the green is honest.
164
118
  - **History** — one line per recorded round: `rN · timestamp · X PASS / Y FAIL`.
165
- - **Last round failures** — the most recent round's FAIL verdicts with their
166
- rejection reasons (why the last round bounced).
167
- - **Next actions** — the unmet criteria (latest verdict is not a pass:
168
- failed or never verified, HUMAN ones included). This list IS the plan —
169
- it is recomputed from the event log on every read, never maintained
170
- separately. Empty + rounds recorded = awaiting human adjudication.
171
- - **Open boundary crossings** (LUM-448) — a trailing safety block when the
172
- task has ≥1 OPEN (undispositioned) forbidden-action crossing: a count, then
173
- one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>`
174
- (highest-severity first), each followed by a read-only **attribution** line
175
- `↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469 —
176
- who/what crossed; any dimension that couldn't be resolved server-side prints
177
- `unknown`, never a fabricated value), then a pointer to the web acceptance
178
- panel. Silent when there are none, so it never overshadows the criteria.
179
- **Read-only awareness** — this surfaces crossings detected elsewhere
180
- (LUM-426/435/442); there is no CLI path to disposition or clear one.
181
- Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer
182
- cannot clear its own crossing from the terminal. **The check fails closed
183
- (LUM-480):** if the crossings read itself errors (network / server / parse),
184
- the block prints `⚠ Boundary-crossing check failed (network/server error) —
185
- could not confirm whether any are undispositioned` instead of staying silent.
186
- Silence means a successful read with zero open crossings, never a failed
187
- check — a hiccup can no longer masquerade as "all clear".
119
+ - **Last round failures** — the most recent round's FAIL verdicts with their rejection reasons (why the last round bounced).
120
+ - **Next actions**: the unmet criteria (latest verdict is not a pass: failed or never verified, HUMAN ones included). This list IS the plan; it is recomputed from the event log on every read, never maintained separately. Empty + rounds recorded = awaiting human adjudication.
121
+ - **Open boundary crossings** (LUM-448): a trailing safety block shown when the task has ≥1 OPEN (undispositioned) forbidden-action crossing. It prints a count, then one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>` (highest-severity first), each followed by a read-only **attribution** line `↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469: who/what crossed; any dimension that couldn't be resolved server-side prints `unknown`, never a fabricated value), then a pointer to the web acceptance panel. Silent when there are none, so it never overshadows the criteria.
122
+ - **Read-only awareness**: this surfaces crossings detected elsewhere (LUM-426/435/442); there is no CLI path to disposition or clear one. Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer cannot clear its own crossing from the terminal.
123
+ - **The check fails closed (LUM-480):** if the crossings read itself errors (network / server / parse), the block prints `⚠ Boundary-crossing check failed (network/server error) — could not confirm whether any are undispositioned` instead of staying silent. Silence means a successful read with zero open crossings, never a failed check — a hiccup can no longer masquerade as "all clear".
188
124
 
189
125
  ### Responding to an open crossing — `lumo crossing explain` (LUM-542)
190
126
 
191
- When `lumo task status` surfaces an OPEN crossing you believe is a false
192
- positive — or you simply want to leave a rationale for the human reviewer —
193
- append a self-explanation ("申辩") to it:
127
+ When `lumo task status` surfaces an OPEN crossing you believe is a false positive
128
+ — or you simply want to leave a rationale for the human reviewer — append a
129
+ self-explanation ("申辩") to it:
194
130
 
195
- ```
131
+ ```bash
196
132
  lumo crossing explain <id> --note "this was a generated fixture, not a hand-edited migration"
197
133
  ```
198
134
 
199
135
  This is the **inverse** of dispositioning, but it is the agent/CLI path
200
- (bearer-only; a clerk/human caller is refused) and it can **only append** an
201
- append-only note — it **never clears the crossing or unblocks Done**
202
- (disposition stays web + human-only, LUM-448). The note is shown to the human
203
- reviewer at disposition time, kept for later review, and explicitly labeled
204
- _agent self-report · unverified_. `<id>` must be a crossing on the
205
- **session-bound task** (resolved from `$CLAUDE_CODE_SESSION_ID`; cross-task
206
- targets and unbound/mismatched sessions are rejected). Earlier explanations are
207
- immutable — a correction is a new note.
136
+ (bearer-only; a clerk/human caller is refused). Behavior:
137
+
138
+ - it can **only append** an append-only note — it **never clears the crossing or unblocks Done** (disposition stays web + human-only, LUM-448);
139
+ - the note is shown to the human reviewer at disposition time, kept for later review, and explicitly labeled _agent self-report · unverified_;
140
+ - `<id>` must be a crossing on the **session-bound task** (resolved from `$CLAUDE_CODE_SESSION_ID`; cross-task targets and unbound/mismatched sessions are rejected);
141
+ - earlier explanations are immutable — a correction is a new note.
208
142
 
209
143
  ### --json contract
210
144
 
211
- `--json` emits the full read model with a top-level `version` field
212
- (currently `1`). The schema is versioned: breaking shape changes bump the
213
- major; additive fields don't. Pin on `version` when scripting against it.
214
- Each criterion carries `machinePassed` (boolean — a checkpointer currently
215
- vouches for it; LUM-456/470), and the payload carries a top-level
216
- `machineVerification` aggregate `{ total, machineVerified, humanOverridden }`
217
- over the active MACHINE criteria; read these, not `latestVerdict` alone, to
218
- tell a machine-verified criterion from a human override.
219
- The open boundary crossings ride along as an additive top-level
220
- `openCrossings` (each entry `{ id, category, severity, detail, attribution }`,
221
- where `attribution` is `{ workspaceMemberId, sessionId, agent, worktreeBranch,
222
- model }` with every field nullable — null = unknown, never fabricated; LUM-469;
223
- the array length is the count) — same read-only awareness, no write path.
224
- **`openCrossings` is `null` when the crossings check failed (LUM-480)** —
225
- distinct from `[]`, which is a successful read with zero open crossings. Script
226
- consumers must treat `null` as "unknown / could not confirm", not "safe".
145
+ `--json` emits the full read model with a top-level `version` field (currently
146
+ `1`). The schema is versioned: breaking shape changes bump the major; additive
147
+ fields don't. Pin on `version` when scripting against it.
148
+
149
+ - each criterion carries `machinePassed` (boolean — a checkpointer currently vouches for it; LUM-456/470);
150
+ - the payload carries a top-level `machineVerification` aggregate `{ total, machineVerified, humanOverridden }` over the active MACHINE criteria — read these, not `latestVerdict` alone, to tell a machine-verified criterion from a human override;
151
+ - open boundary crossings ride along as an additive top-level `openCrossings`, each entry `{ id, category, severity, detail, attribution }` where `attribution` is `{ workspaceMemberId, sessionId, agent, worktreeBranch, model }` with every field nullable — null = unknown, never fabricated (LUM-469); the array length is the count. Same read-only awareness, no write path;
152
+ - **`openCrossings` is `null` when the crossings check failed (LUM-480)**, distinct from `[]` (a successful read with zero open crossings). Script consumers must treat `null` as "unknown / could not confirm", **not** "safe".
227
153
 
228
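A hedged sketch of a script consumer for the `--json` payload described above, written in Python for illustration. The field names (`version`, `machineVerification`, `openCrossings`) come from the text; the function name, the returned dictionary, and the choice to reject unknown versions outright are assumptions. It treats `openCrossings: null` as "unknown", never as "safe".

```python
import json
from typing import Any, Dict

SUPPORTED_VERSION = 1  # the doc states `version` is currently 1


def summarize_status(payload_text: str) -> Dict[str, Any]:
    """Summarize a status payload (hypothetical helper, not part of the CLI)."""
    payload = json.loads(payload_text)
    if payload.get("version") != SUPPORTED_VERSION:
        # Pin on `version`: a different major may have a different shape.
        raise ValueError(f"unsupported payload version: {payload.get('version')!r}")

    # null means the crossings check failed; [] means a successful read
    # with zero open crossings. A missing key is also treated as unknown
    # here, which is a conservative assumption.
    crossings = payload.get("openCrossings")
    if crossings is None:
        crossing_state = "unknown"
    elif len(crossings) == 0:
        crossing_state = "clear"
    else:
        crossing_state = "open"

    aggregate = payload["machineVerification"]
    return {
        "crossing_state": crossing_state,
        "open_crossings": None if crossings is None else len(crossings),
        "machine_verified": aggregate["machineVerified"],
        "human_overridden": aggregate["humanOverridden"],
        "total_machine": aggregate["total"],
    }
```

A caller gating on safety would proceed only when `crossing_state == "clear"`, so a failed check cannot pass as an empty list.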
154
  `status` reads; `verify` judges. Running status never starts a round, never
229
155
  escalates, and never changes task state — loop rules (cap 3, IN_REVIEW on
@@ -232,10 +158,10 @@ all-pass, human-only DONE) live entirely in `lumo verify` and the server.
232
158
  ## lumo verdict — the three verdict channels (LUM-422)
233
159
 
234
160
  `lumo verify` is the MACHINE channel. `lumo verdict` covers the other two — the
235
- HUMAN pass and the AGENT send-back — under one red line: **no passing data row
236
- is ever agent-produced.**
161
+ HUMAN pass and the AGENT send-back — under one red line: **no passing data row is
162
+ ever agent-produced.**
237
163
 
238
- ```
164
+ ```bash
239
165
  lumo verdict --pass
240
166
  lumo verdict LUM-42 --pass
241
167
  lumo verdict --fail --reason CRITERION_UNMET --note "the retry path is still missing"
@@ -256,62 +182,56 @@ verifierType=AGENT (a channel distinct from MACHINE and HUMAN, so "machine
256
182
  all-pass but human FAIL" stays an uncontaminated signal). The verdict is
257
183
  hard-coded FAIL — there is no agent path to a passing verdict. It:
258
184
 
259
- - requires `--reason` (case-insensitive): `CRITERION_UNMET | EVIDENCE_INSUFFICIENT
260
- | CHECK_EXECUTION_ERROR | SCOPE_MISMATCH | OTHER` (the agent pays the
261
- structured tax a human send-back is spared);
262
- - takes an optional `--note`, posted as a task comment (@mentions and images for
263
- free) and summarized onto the verdict row;
264
- - takes repeatable `--criterion <id>` to narrow the send-back; omitted, it fans
265
- out to the whole contract;
266
- - records at round = the current max (not a new round) and bounces the task back
267
- to IN_PROGRESS, with the unmet criteria surfacing through `lumo task status`.
185
+ - `--reason <enum>` required (case-insensitive): `CRITERION_UNMET | EVIDENCE_INSUFFICIENT | CHECK_EXECUTION_ERROR | SCOPE_MISMATCH | OTHER` — the agent pays the structured tax a human send-back is spared;
186
+ - `--note <text>` optional, posted as a task comment (@mentions and images for free) and summarized onto the verdict row;
187
+ - `--criterion <id>` repeatable, narrows the send-back; omitted, it fans out to the whole contract;
188
+ - `round` = the current max (not a new round); bounces the task back to IN_PROGRESS, with the unmet criteria surfacing through `lumo task status`.
268
189
 
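The `--reason` rule above (required, case-insensitive, one of five values) can be sketched as a small normalizer. The enum values are the ones listed in the text; `normalize_reason` is a hypothetical helper, and the error messages are illustrative rather than the CLI's actual output.

```python
from typing import Optional

# The five send-back reasons listed in the doc.
SEND_BACK_REASONS = frozenset({
    "CRITERION_UNMET",
    "EVIDENCE_INSUFFICIENT",
    "CHECK_EXECUTION_ERROR",
    "SCOPE_MISMATCH",
    "OTHER",
})


def normalize_reason(raw: Optional[str]) -> str:
    """Return the canonical upper-case reason, or raise if missing or unknown."""
    if raw is None or not raw.strip():
        raise ValueError("--reason is required for an agent send-back")
    candidate = raw.strip().upper()
    if candidate not in SEND_BACK_REASONS:
        raise ValueError(f"unknown reason: {raw!r}")
    return candidate
```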
269
190
  ### The DONE gate
270
191
 
271
192
  Once any criterion's latest verdict is FAIL — machine, AGENT, or human — moving
272
- the task to DONE on the agent/CLI path is refused with **409** and the
273
- unresolved items listed. Clear the send-back (fix + re-verify, or a human PASS)
274
- before `lumo task update <id> --status done`. A task with no criteria, or whose
275
- criteria were never adjudicated, transitions freely — the gate only blocks an
276
- actual send-back, never an un-adjudicated criterion. When the machine loop has
277
- left a task IN_REVIEW with no send-back standing, the agent may move it to DONE
278
- directly; a human-PASS row is a provable manual override, not a required ticket.
193
+ the task to DONE on the agent/CLI path is refused with **409** and the unresolved
194
+ items listed. Clear the send-back (fix + re-verify, or a human PASS) before
195
+ `lumo task update <id> --status done`.
196
+
197
+ - A task with no criteria, or whose criteria were never adjudicated, transitions freely — **the gate only blocks an actual send-back, never an un-adjudicated criterion.**
198
+ - When the machine loop has left a task IN_REVIEW with no send-back standing, the agent may move it to DONE directly; a human-PASS row is a provable manual override, not a required ticket.
279
199
 
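The DONE gate above reduces to one predicate over each criterion's latest verdict. A minimal sketch, assuming a hypothetical mapping from criterion id to `'PASS'`, `'FAIL'`, or `None` (never adjudicated); the real check runs on the server.

```python
from typing import Dict, List, Optional


def blocking_criteria(latest_verdicts: Dict[str, Optional[str]]) -> List[str]:
    """Ids whose latest verdict is FAIL, from any channel (machine, AGENT, human)."""
    return sorted(cid for cid, verdict in latest_verdicts.items() if verdict == "FAIL")


def can_move_to_done(latest_verdicts: Dict[str, Optional[str]]) -> bool:
    """Only a standing send-back blocks DONE.

    A task with no criteria, or with criteria that were never adjudicated
    (None), passes the gate.
    """
    return not blocking_criteria(latest_verdicts)
```

An empty contract and a contract of un-adjudicated criteria both return `True`, matching the statement that the gate blocks only an actual send-back.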
280
200
  ## When a defect appears — fix in place, don't spin off a new task
281
201
 
282
- On a send-back **or** a self-review finding: if the issue falls under any
283
- existing acceptance criterion of **this** task, fix it in place and re-run
284
- `lumo verify`. **Do not** `lumo task create` for it. New tasks are only for work
285
- genuinely _outside_ this task's acceptance contract. Creating a task (and PR)
286
- for in-scope rework launders a first-attempt failure — it bypasses the DONE
287
- gate's send-back protection and corrupts the flywheel signal.
288
-
289
- This is now enforced: when you're mid-task, `lumo task create` refuses the bare
290
- form and makes you declare intent — `--rework-of <id>` (it redirects you back to
291
- fix the existing task and creates nothing) or `--new-scope` (genuinely new,
292
- separate work). If the send-back reveals the **contract itself** was wrong,
293
- amend it on this task (`lumo task criteria set`) rather than opening a new
294
- task — see criteria.md.
295
-
296
- **Hard rule:** while THIS task has an unresolved send-back (any criterion's
297
- latest verdict is FAIL — the same condition that blocks DONE), `lumo task create`
298
- is refused with 409 **even with `--new-scope`**. A standing send-back means the
299
- task can't be completed yet; resolve it (fix + `lumo verify`, or amend the
300
- contract) before opening any new work. `--rework-of` still redirects you to it.
202
+ On a send-back **or** a self-review finding: if the issue falls under any existing
203
+ acceptance criterion of **this** task, fix it in place and re-run `lumo verify`.
204
+ **Do not** `lumo task create` for it. New tasks are only for work genuinely
205
+ _outside_ this task's acceptance contract. Creating a task (and PR) for in-scope
206
+ rework launders a first-attempt failure — it bypasses the DONE gate's send-back
207
+ protection and corrupts the flywheel signal.
208
+
209
+ This is enforced: when you're mid-task, `lumo task create` refuses the bare form
210
+ and makes you declare intent:
211
+
212
+ - `lumo task create --rework-of <id>`: redirects you back to fix the existing task and creates nothing;
213
+ - `lumo task create --new-scope`: genuinely new, separate work.
214
+
215
+ If the send-back reveals the **contract itself** was wrong, amend it on this task
216
+ (`lumo task criteria set`) rather than opening a new task — see criteria.md.
217
+
218
+ **Hard rule:** while THIS task has an unresolved send-back (any criterion's latest
219
+ verdict is FAIL, the same condition that blocks DONE), `lumo task create` is
220
+ refused with **409 even with `--new-scope`**. A standing send-back means the task
221
+ can't be completed yet; resolve it (fix + `lumo verify`, or amend the contract)
222
+ before opening any new work. `--rework-of` still redirects you to it.
301
223
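The `lumo task create` rules above can be read as a short decision table. The sketch below is an illustration under assumptions: the function name and the returned labels are hypothetical, and giving `--rework-of` precedence when both flags are passed is a guess, since the text does not cover that case.

```python
from typing import Optional


def task_create_decision(
    mid_task: bool,
    has_standing_send_back: bool,
    rework_of: Optional[str] = None,
    new_scope: bool = False,
) -> str:
    """Return one of: 'redirect', 'refuse_409', 'refuse_declare_intent', 'create'."""
    if rework_of is not None:
        # --rework-of always redirects to the existing task and creates nothing.
        return "redirect"
    if has_standing_send_back:
        # Hard rule: refused with 409 even with --new-scope.
        return "refuse_409"
    if new_scope:
        return "create"
    if mid_task:
        # The bare form is refused mid-task; intent must be declared.
        return "refuse_declare_intent"
    return "create"
```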
 
302
224
  ## Human-reported defects, once a task is submitted
303
225
 
304
226
  When someone reports a defect in conversation, your action depends on whether the
305
227
  task has **ever entered IN_REVIEW**:
306
228
 
307
- - **Not yet** (still your first working pass) → just fix it and continue. No
308
- verdict needed: nothing was claimed complete, so there's nothing to contradict.
309
- - **Already submitted** (entered IN_REVIEW / DONE / merged) → **do not silently
310
- fix and re-pass.** Either record your own send-back (`lumo verdict --fail`,
311
- noting it was human-reported; this is _your_ honest concurrence, not a forged
312
- human verdict), or ask the reporter to record a human FAIL via the web UI /
313
- Slack (the only channel that can attribute it to a human). If the defect is a
314
- **new requirement** not covered by any criterion, first transcribe it with
315
- `lumo task criteria set --human`, then proceed. You can never write a human
316
- _verdict_ — the terminal can't prove a human is behind the command
317
- (attribution integrity, not anti-forgery).
229
+ - **Not yet** (still your first working pass) → just fix it and continue. No verdict needed — nothing was claimed complete, so there's nothing to contradict.
230
+ - **Already submitted** (entered IN_REVIEW / DONE / merged) → **do not silently fix and re-pass.** Either:
231
+ - record your own send-back with `lumo verdict --fail` (noting it was human-reported; this is _your_ honest concurrence, not a forged human verdict), or
232
+ - ask the reporter to record a human FAIL via the web UI / Slack (the only channel that can attribute it to a human).
233
+
234
+ If the defect is a **new requirement** not covered by any criterion, first
235
+ transcribe it with `lumo task criteria set --human`, then proceed. You can never
236
+ write a human _verdict_: the terminal can't prove a human is behind the command
237
+ (attribution integrity, not anti-forgery).
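
As a rough illustration of the branching in this section, the sketch below maps the two facts that matter (has the task ever entered IN_REVIEW, and is the defect covered by an existing criterion) to the steps the text prescribes. The function and field names are hypothetical; only the quoted `lumo` commands come from the documentation.

```javascript
// Hypothetical helper: returns the documented next steps for a human-reported defect.
// Field names are assumptions; the command strings are the ones named in the docs.
function nextStepsForReportedDefect({ everEnteredReview, coveredByCriterion }) {
  // First working pass: nothing was claimed complete, so no verdict is needed.
  if (!everEnteredReview) return ['fix it and continue'];
  const steps = [];
  // A new requirement is transcribed into the contract before anything else.
  if (!coveredByCriterion) steps.push('lumo task criteria set --human');
  // Never fix silently: record a send-back, or have the reporter record a human FAIL.
  steps.push('lumo verdict --fail (or ask the reporter to record a human FAIL via web UI / Slack)');
  return steps;
}

console.log(nextStepsForReportedDefect({ everEnteredReview: true, coveredByCriterion: false }));
```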
@@ -24,42 +24,20 @@ lumo worktree add LUM-267 --no-fetch # branch off existing origin/main
24
24
  lumo worktree add LUM-267 --verify # run npx jest baseline after setup
25
25
  ```
26
26
 
27
- Flags: `--base <ref>` (branch off a ref other than origin/main; skips the
28
- fetch), `--no-fetch` (skip `git fetch origin main`), `--verify` (run `npx jest`
29
- in the new worktree). Errors if the target dir already exists; reuses the branch
30
- if it already exists (adds the worktree without `-b`).
31
-
32
- **The prisma gotcha it warns about:** the generated client lives in the shared
33
- (symlinked) `node_modules`, so a `prisma generate` in one worktree clobbers the
34
- client every parallel worktree depends on. Verify with jest (SWC mocks Prisma);
35
- do `generate + tsc` atomically once at the end.
36
-
37
- **The jest gotcha it warns about:** always run jest from the worktree root (`cd`
38
- in first); `cli/` has no jest config, and running from the main checkout hits
39
- the `cli/package.json` haste collision and silently runs the wrong tests.
40
-
41
- **The husky hooks it copies in:** husky owns git hooks via
42
- `core.hooksPath = .husky/_`, a relative path git resolves against each
43
- worktree's own root. That `_` shim dir is **untracked** (husky regenerates it on
44
- the main checkout's `npm install`/`prepare`), so a fresh worktree would lack it
45
- and git would **silently skip every hook** — pre-commit (lint-staged) and
46
- commit-msg (LUM-405 drift-check) included. Since this repo has no GitHub CI,
47
- husky is the only deterministic quality gate, so `add` copies `.husky/_` into the
48
- new worktree (copy, not symlink — the shim is tiny and the hooks it invokes
49
- resolve relative to the worktree). If the main checkout itself has no `.husky/_`
50
- (husky never installed there), `add` prints a warning telling you to run
51
- `npm install` in the main checkout to regenerate it, rather than skipping
52
- silently.
53
-
54
- **Never run `npm install` / `npm ci` inside a worktree.** The worktree shares
55
- the main checkout's `node_modules` via a symlink; npm does not respect the link
56
- — it deletes it and reifies a full standalone `node_modules` (thousands of
57
- packages, ~1 min, and the shared prisma-client is gone). Run installs only in
58
- the main checkout, then re-create the symlink if npm replaced it. (Older npm
59
- could instead plant a self-referential `node_modules/node_modules` in the shared
60
- tree, which hard-panics Turbopack's `next build`; the `prebuild`/`predev`/
61
- `preanalyze` guard — `scripts/fix-nodemodules-selflink.ts` — removes that link
62
- automatically, but the rule above avoids the whole mess.)
27
+ | Flag | Notes |
28
+ | -------------- | --------------------------------------------------------------- |
29
+ | `--base <ref>` | Branch off a ref other than origin/main (skips the fetch). |
30
+ | `--no-fetch` | Skip `git fetch origin main` (branch off existing origin/main). |
31
+ | `--verify` | Run `npx jest` baseline in the new worktree after setup. |
32
+
33
+ Errors if the target dir already exists; reuses the branch if it already exists (adds the worktree without `-b`).
34
+
35
+ ### Gotchas (two it warns about, two it can't fix)
36
+
37
+ - **`prisma generate` clobbers all worktrees.** The generated client lives in the shared (symlinked) `node_modules`, so a `generate` in one worktree overwrites the client every parallel worktree depends on. Verify with jest (SWC mocks Prisma); do `generate + tsc` atomically once at the end.
38
+ - **Run jest from the worktree root** (`cd` in first). `cli/` has no jest config; running from the main checkout hits the `cli/package.json` haste collision and silently runs the wrong tests.
39
+ - **Husky hooks are copied in for you.** Husky owns hooks via `core.hooksPath = .husky/_`, resolved relative to each worktree's root. That `_` shim is **untracked** (regenerated on the main checkout's `npm install`/`prepare`), so a fresh worktree would lack it and git would **silently skip every hook** — pre-commit (lint-staged) and commit-msg (LUM-405 drift-check). With no GitHub CI, husky is the only deterministic quality gate, so `add` copies `.husky/_` in (copy, not symlink). If the main checkout has no `.husky/_`, `add` warns you to `npm install` there rather than skipping silently.
40
+ - **Never `npm install` / `npm ci` inside a worktree.** npm doesn't respect the `node_modules` symlink: it deletes it and reifies a full standalone tree (~1 min, shared prisma-client gone). Install only in the main checkout, then re-create the symlink if npm replaced it. (Older npm could plant a self-referential `node_modules/node_modules` that hard-panics Turbopack's `next build`; the `prebuild`/`predev`/`preanalyze` guard, `scripts/fix-nodemodules-selflink.ts`, removes it, but this rule avoids the mess.)
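
One way to tell whether npm has replaced the shared link is to inspect `node_modules` without following it. The sketch below is an assumption about how such a check could look, not how `lumo worktree list` implements it; `nodeModulesLinkStatus` and the `LINKED` / `STANDALONE` labels are hypothetical (only `MISSING` appears in the docs).

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical check: classify a worktree's node_modules entry.
function nodeModulesLinkStatus(worktreeRoot) {
  const target = path.join(worktreeRoot, 'node_modules');
  let stat;
  try {
    // lstat inspects the entry itself and does not follow a symlink.
    stat = fs.lstatSync(target);
  } catch (err) {
    if (err.code === 'ENOENT') return 'MISSING';
    throw err;
  }
  // A real directory here suggests npm reified a standalone tree.
  return stat.isSymbolicLink() ? 'LINKED' : 'STANDALONE';
}

console.log(nodeModulesLinkStatus(process.cwd()));
```

A `STANDALONE` result in a worktree would be the cue to remove the directory and re-create the symlink from the main checkout.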
63
41
 
64
42
  ## `lumo worktree rm <LUM-N>`
65
43
 
@@ -83,3 +61,10 @@ check when jest/tsc misbehaves). Read-only; runnable from anywhere in the repo.
83
61
  ```bash
84
62
  lumo worktree list
85
63
  ```
64
+
65
+ ## When to suggest worktrees
66
+
67
+ - The user wants to work on a task **in isolation** from the current workspace — a parallel branch they can build/test without disturbing uncommitted work in the main checkout.
68
+ - Starting a second task while one is in flight, or running independent tasks concurrently (one worktree per task — never two tasks in one worktree).
69
+ - Before executing an implementation plan that needs a clean, dedicated branch off `origin/main`.
70
+ - A `MISSING` node_modules link in `lumo worktree list` is the first thing to check when jest/tsc misbehaves in a worktree.
@@ -0,0 +1,76 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.memoryFold = memoryFold;
4
+ const config_1 = require("../lib/config");
5
+ const api_1 = require("../lib/api");
6
+ const resolve_1 = require("../lib/resolve");
7
+ const resolve_bound_task_1 = require("../lib/resolve-bound-task");
8
+ const resolve_project_1 = require("../lib/resolve-project");
9
+ const sanitize_1 = require("../lib/sanitize");
10
+ function flatten(content) {
11
+ if (content === null || typeof content !== 'object')
12
+ return '';
13
+ return Object.values(content)
14
+ .map(v => (Array.isArray(v) ? v.join(' ') : typeof v === 'string' ? v : ''))
15
+ .filter(Boolean)
16
+ .join(' — ');
17
+ }
18
+ async function memoryFold(refArg, options) {
19
+ // Folding runs automatically (daily cron). The CLI is preview-only.
20
+ if (!options.dryRun) {
21
+ console.error('Topic folding runs automatically (daily). This command only previews it.\n' +
22
+ 'Run `lumo memory fold --dry-run` to see what the next fold pass would do.');
23
+ return 1;
24
+ }
25
+ const creds = (0, config_1.readCredentials)();
26
+ if (!creds) {
27
+ console.error('Error: not logged in. Run `lumo auth login` first.');
28
+ return 1;
29
+ }
30
+ const apiUrl = (0, api_1.resolveAuthedApiUrl)(creds.apiUrl);
31
+ const base = (0, api_1.trimTrailingSlash)(apiUrl);
32
+ let projectId;
33
+ if (refArg) {
34
+ projectId = await (0, resolve_1.resolveProjectId)(base, creds.token, refArg);
35
+ }
36
+ else {
37
+ const bound = await (0, resolve_bound_task_1.resolveBoundTaskIdentifier)(apiUrl, creds.token);
38
+ if (!bound) {
39
+ console.error('Error: no <project-ref> given and no task bound to this session.\n' +
40
+ 'Pass a project, or run `lumo session attach <LUM-N>`.');
41
+ return 1;
42
+ }
43
+ const r = await (0, resolve_project_1.resolveBoundProjectId)(apiUrl, creds.token, bound);
44
+ if (!r.ok) {
45
+ console.error(`Error: ${r.error}`);
46
+ return 1;
47
+ }
48
+ projectId = r.id;
49
+ }
50
+ let res;
51
+ try {
52
+ res = await fetch(`${base}/api/projects/${encodeURIComponent(projectId)}/memories/fold-candidates`, { headers: { Authorization: `Bearer ${creds.token}` } });
53
+ }
54
+ catch (err) {
55
+ console.error(`Error: could not reach Lumo API at ${apiUrl} (${err instanceof Error ? err.message : String(err)})`);
56
+ return 1;
57
+ }
58
+ if (!res.ok) {
59
+ console.error(`Error: fold preview failed (HTTP ${res.status})`);
60
+ return 1;
61
+ }
62
+ const { proposals } = (await res.json());
63
+ if (proposals.length === 0) {
64
+ process.stdout.write('No fold proposals — nothing coarse to collapse right now.\n');
65
+ return;
66
+ }
67
+ process.stdout.write(`${proposals.length} fold proposal(s) — DRY RUN, nothing written:\n\n`);
68
+ for (const p of proposals) {
69
+ process.stdout.write(`[${(0, sanitize_1.sanitizeField)(p.category)}] coarse card folds ${p.sourceIds.length} cards` +
70
+ (p.excludedIds.length
71
+ ? ` (${p.excludedIds.length} left ACTIVE by the coverage gate)`
72
+ : '') +
73
+ `:\n ${(0, sanitize_1.sanitizeField)(flatten(p.coarseContent))}\n` +
74
+ ` folds: ${p.sourceIds.map(s => (0, sanitize_1.sanitizeField)(s)).join(', ')}\n\n`);
75
+ }
76
+ }
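
To make the preview line easier to picture, here is the `flatten` helper from the file above run on a made-up card. The helper body follows the compiled source (with the separator written as a Unicode escape); the `sample` object is purely hypothetical, since the real shape of `coarseContent` is not shown in this diff.

```javascript
// Copy of the compiled `flatten` helper, reproduced for illustration.
function flatten(content) {
  if (content === null || typeof content !== 'object') return '';
  return Object.values(content)
    // Arrays are space-joined, strings kept, everything else dropped.
    .map(v => (Array.isArray(v) ? v.join(' ') : typeof v === 'string' ? v : ''))
    .filter(Boolean)
    .join(' \u2014 '); // same separator character as the source
}

// Hypothetical card content; field names are assumptions.
const sample = { title: 'Auth subsystem', tags: ['login', 'tokens'], weight: 3, note: '' };
console.log(flatten(sample));
```

Non-string, non-array values (such as `weight`) and empty strings are dropped, so only the title and the joined tags survive in the printed summary.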
@@ -70,6 +70,7 @@ const memory_rm_1 = require("./commands/memory-rm");
70
70
  const memory_show_1 = require("./commands/memory-show");
71
71
  const memory_sync_1 = require("./commands/memory-sync");
72
72
  const memory_push_1 = require("./commands/memory-push");
73
+ const memory_fold_1 = require("./commands/memory-fold");
73
74
  const task_artifact_add_1 = require("./commands/task-artifact-add");
74
75
  const task_criteria_set_1 = require("./commands/task-criteria-set");
75
76
  const task_criteria_list_1 = require("./commands/task-criteria-list");
@@ -533,6 +534,11 @@ memoryCmd
533
534
  .option('--dir <path>', 'Memory dir (default: ~/.claude/projects/<cwd>/memory)')
534
535
  .option('--dry-run', 'List what would be pushed without sending')
535
536
  .action(wrap((opts) => (0, memory_push_1.memoryPush)(opts)));
537
+ memoryCmd
538
+ .command('fold [project-ref]')
539
+ .description('Preview the autonomous topic-fold pass (folding itself runs daily, automatically). Requires --dry-run.')
540
+ .option('--dry-run', 'preview proposed subsystem cards + which fine cards they fold; writes nothing')
541
+ .action(wrap((ref, opts) => (0, memory_fold_1.memoryFold)(ref, opts)));
536
542
  const milestoneCmd = program
537
543
  .command('milestone')
538
544
  .description('Inspect milestones from the terminal');