@lumoai/cli 1.41.0 → 1.42.0
- package/assets/skill/SKILL.md +47 -41
- package/assets/skill/references/criteria.md +97 -252
- package/assets/skill/references/doc-editing.md +146 -0
- package/assets/skill/references/docs.md +49 -278
- package/assets/skill/references/memory.md +29 -0
- package/assets/skill/references/onboarding.md +4 -4
- package/assets/skill/references/sessions.md +84 -97
- package/assets/skill/references/task-context.md +65 -149
- package/assets/skill/references/task-deps.md +113 -0
- package/assets/skill/references/tasks.md +2 -133
- package/assets/skill/references/verify.md +121 -201
- package/assets/skill/references/worktree.md +21 -36
- package/dist/cli/src/commands/memory-fold.js +76 -0
- package/dist/cli/src/index.js +6 -0
- package/package.json +1 -1
@@ -1,11 +1,11 @@
 # lumo verify — machine verification loop
 
 `lumo verify` is the machine half of the acceptance system (Acceptance v1,
-LUM-343)
-repo,
-
-3-round cap
-
+LUM-343): it executes every **MACHINE** criterion's checkpointer in the local
+repo, POSTs one structured PASS/FAIL verdict per criterion, and prints what to
+do next. Execution is on the client; adjudication is server-side — round
+numbering, the **3-round cap**, escalation, and the **IN_REVIEW** transition all
+happen on the server.
 
 ## The claim-done rule
 
@@ -13,10 +13,10 @@ and prints what to do next. The judge lives server-side: round numbering, the
 touching its status — run `lumo verify`.** The loop replaces "I read the code
 and it looks done" with executed evidence.
 
-```
-lumo verify
-lumo verify LUM-42
-lumo verify --timeout 900
+```bash
+lumo verify # session-bound task
+lumo verify LUM-42 # explicit task (overrides the session binding)
+lumo verify --timeout 900 # per-checkpointer timeout in seconds (default 600)
 ```
 
 ## What one round does
@@ -28,84 +28,60 @@ lumo verify --timeout 900 # per-checkpointer timeout in seconds (default 600)
 criterion at round = previous max + 1 and mirrors each verdict as a
 TaskActivity event.
 4. Prints the round outcome:
-
-
-
-
-
-
-(AGENT_VERIFY, requires action) and further `lumo verify` rounds are
-rejected with 409. **Stop retrying**; fix only what the human directs.
+
+| Round outcome | Effect | What to do |
+| --- | --- | --- |
+| **All PASS** | Task transitions to **IN_REVIEW** (existing state machine + TASK_IN_REVIEW notification) | **Stop here.** Human adjudication + any HUMAN criteria take over; **never set DONE yourself** |
+| **Any FAIL** | Task status untouched; unmet criteria printed as next actions (statement, checkpointer, failure tail) | Fix and re-run |
+| **Round 3 still failing** | Loop escalates: a human is notified (AGENT_VERIFY, requires action); further `lumo verify` rounds are rejected with **409** | **Stop retrying**; fix only what the human directs |
 
 Exit code 0 = all passed (or nothing to run); 1 = failures, escalation, or
 errors.
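The round-outcome table and the exit-code rule can be read as a small decision function. The sketch below is only an illustration of that table: real adjudication happens on the server, and the names `next_step`, `exit_code`, and `MAX_ROUNDS` are hypothetical, not part of the CLI.

```python
from typing import List, Tuple

MAX_ROUNDS = 3  # the 3-round cap from the table above


def next_step(verdicts: List[str], round_number: int) -> Tuple[str, str]:
    """Map one round's MACHINE verdicts to (effect on task, what to do).

    verdicts holds "PASS" or "FAIL", one entry per MACHINE criterion.
    """
    if not verdicts:
        # HUMAN-only contract: nothing to run, no server write.
        return ("UNCHANGED", "nothing to run")
    if all(v == "PASS" for v in verdicts):
        return ("IN_REVIEW", "stop; a human adjudicates, never set DONE yourself")
    if round_number >= MAX_ROUNDS:
        return ("ESCALATED", "stop retrying; fix only what the human directs")
    return ("UNCHANGED", "fix the failures and re-run")


def exit_code(verdicts: List[str]) -> int:
    """0 when everything passed or nothing ran, 1 on any failure.

    Escalation and execution errors also exit 1; they are not modeled here.
    """
    return 0 if all(v == "PASS" for v in verdicts) else 1
```

Note that a failing round never maps to a status change here, which mirrors the rule that only an all-pass round moves the task.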
 
 ## Verdict semantics (what the CLI sends)
 
-
-
-
+| Checkpointer result | Verdict | Detail |
+| --- | --- | --- |
+| exits 0 | `PASS` | evidence `cmd:<command>#exit=0` |
+| non-zero exit | `FAIL` | reason = output tail, enum `CRITERION_UNMET` |
+| spawn failure / timeout | `FAIL` | enum `CHECK_EXECUTION_ERROR` |
 
-evidencePointer is **not free text** — the server only
-
-Verdicts are PASS|FAIL only; the agent path cannot write HUMAN verdicts or
-`PASS_WITH_FOLLOWUP` (red line — those enter via human-initiated UI paths
-only).
+- evidencePointer is **not free text** — the server accepts only `commit:<hash>`, `file:<path>:<line>`, or `cmd:<command>#exit=<code>`.
+- Verdicts are PASS|FAIL only; **the agent path cannot write HUMAN verdicts or `PASS_WITH_FOLLOWUP`** (red line — those enter via human-initiated UI paths only).
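To make the verdict mapping and the three evidence-pointer shapes concrete, here is a minimal sketch. The regular expressions are guesses at a grammar the source does not spell out, and the field names `verdict`, `reasonEnum`, and `reason` are hypothetical; only `evidencePointer` and the two enum values come from the text above.

```python
import re
from typing import Optional

# One pattern per accepted shape. The real server-side grammar is not shown
# in the source, so treat these as assumptions.
EVIDENCE_PATTERNS = [
    re.compile(r"^commit:[0-9a-f]{7,40}$"),
    re.compile(r"^file:.+:\d+$"),
    re.compile(r"^cmd:.+#exit=-?\d+$"),
]


def is_valid_evidence(pointer: str) -> bool:
    """True when the pointer matches one of the three structured shapes."""
    return any(p.match(pointer) for p in EVIDENCE_PATTERNS)


def to_verdict(command: str, exit_status: Optional[int], output_tail: str = "") -> dict:
    """Translate a checkpointer result into a verdict payload.

    exit_status is None when the command failed to spawn or timed out.
    """
    if exit_status is None:
        return {"verdict": "FAIL", "reasonEnum": "CHECK_EXECUTION_ERROR"}
    if exit_status == 0:
        return {"verdict": "PASS", "evidencePointer": f"cmd:{command}#exit=0"}
    return {"verdict": "FAIL", "reasonEnum": "CRITERION_UNMET", "reason": output_tail}
```

The function has no branch that yields anything other than PASS or FAIL, which is the point of the red line: a HUMAN verdict or `PASS_WITH_FOLLOWUP` cannot be produced on this path.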
 
 ## Edge cases
 
-
-
-
-
-
-
-
--
-
-- **Session bound to a different task (LUM-459)** → the server returns 409,
-which the command surfaces as an error. No advisory is printed; the verify
-round is rejected outright.
-- **Provably-unbound session** → the server includes `bindingAdvisory: 'unbound'`
-in the round response, and the command prints:
-`⚠ Working unbound — this verify ran from a Claude Code session not attached to the task.`
-The run is recorded as a `SESSION_BINDING_MISSING` boundary crossing visible in
-`lumo task status` open crossings. Run `lumo session attach <LUM-N>` before the
-next verify to bind the session.
-- **Unconfirmed session binding** → `bindingAdvisory: 'unconfirmed'` causes a
-softer advisory: `⚠ Could not confirm this session is attached to the task.`
-Same remediation: `lumo session attach <LUM-N>`.
+| Case | Behavior |
+| --- | --- |
+| **No contract yet** | Error pointing at `lumo task criteria set`; draft the contract first (criteria.md golden rule). |
+| **HUMAN-only contract** (zero MACHINE criteria) | Nothing to run; CLI says so and suggests `lumo task update <id> --status in_review` for human review. No server write happens. |
+| **Partial round** | A round must cover every MACHINE criterion; the CLI always runs all of them and the server rejects partial rounds. |
+| **`REVIEW_ADDED` criteria** | Criteria added during review appear in the contract and are picked up automatically by the next round. |
+| **Session bound to a different task** (LUM-459) | Server returns 409, surfaced as an error. No advisory printed; the verify round is rejected outright. |
+| **Provably-unbound session** | Response carries `bindingAdvisory: 'unbound'`; prints `⚠ Working unbound — this verify ran from a Claude Code session not attached to the task.` Recorded as a `SESSION_BINDING_MISSING` boundary crossing (visible in `lumo task status` open crossings). Run `lumo session attach <LUM-N>` before the next verify. |
+| **Unconfirmed session binding** | `bindingAdvisory: 'unconfirmed'` → softer advisory `⚠ Could not confirm this session is attached to the task.` Same remediation: `lumo session attach <LUM-N>`. |
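The two advisory cases share one remediation, which a wrapper script could derive from the round response. This is a sketch under assumptions: the response is taken to be a JSON object already parsed into a dict, and `binding_remediation` is a hypothetical helper name. The different-task case is a 409 error rather than an advisory, so it would never reach this function.

```python
from typing import Optional


def binding_remediation(round_response: dict, task_id: str) -> Optional[str]:
    """Return the command to run before the next verify, or None if no advisory."""
    advisory = round_response.get("bindingAdvisory")
    if advisory in ("unbound", "unconfirmed"):
        # Both advisories point at the same fix: attach the session to the task.
        return f"lumo session attach {task_id}"
    return None
```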
 
 ## Round discipline
 
 Rounds are a hard budget of 3, not a retry loop. Between rounds, actually fix
-the failures — re-running without changes burns a round and (at round 3)
-
-
+the failures — re-running without changes burns a round and (at round 3) pages a
+human. **A FAIL round never changes task status; only an all-pass round moves it
+(to IN_REVIEW, never further).**
 
 ## Review-time drift habits (gap findings)
 
-A problem discovered during acceptance/review that the contract does NOT
-
-
-
-
-
-
-
-
-
-
-
-3. **Then bounce.** The appended criterion shows up in `lumo task status`
-nextActions and the next verify round picks it up automatically — no
-side-channel to-do list.
-
-How to read drift: information-lag and requirement-movement drift
-(`NEW_INFO`, `SCOPE_CHANGE`) is healthy — don't optimize it away.
-`DRAFT_BLIND_SPOT` clusters feed back into the drafting guide. **Zero drift
-across many tasks is a red flag, not a trophy** — it usually means contracts
-are too thin or state only sure-win clauses that can never be found wanting.
+A problem discovered during acceptance/review that the contract does NOT cover is
+a **gap finding** — record it in the contract, never just fix it silently:
+
+1. `lumo task criteria set <task> --file <desired-final-list> --human` — append it on the spot, transcribing the human's finding as a criterion. Review-added semantics: the gap surfaced at review time, at the current round.
+2. `--cause <NEW_INFO|SCOPE_CHANGE|DRAFT_BLIND_SPOT|GRANULARITY|OTHER>` — tag why the contract drifted. Gap findings are usually `DRAFT_BLIND_SPOT` (the draft missed it) or `NEW_INFO` (info that didn't exist at drafting time).
+3. `lumo task status` — then bounce: the appended criterion shows up in nextActions and the next verify round picks it up automatically. No side-channel to-do list.
+
+How to read drift: information-lag and requirement-movement drift (`NEW_INFO`,
+`SCOPE_CHANGE`) is healthy — don't optimize it away. `DRAFT_BLIND_SPOT` clusters
+feed back into the drafting guide. **Zero drift across many tasks is a red flag,
+not a trophy** — it usually means contracts are too thin or state only sure-win
+clauses that can never be found wanting.
 
 ## lumo task status — the read half (self-check entry point)
 
@@ -114,7 +90,7 @@ are too thin or state only sure-win clauses that can never be found wanting.
 nothing and burns no round. Defaults to the session-bound task; an explicit
 identifier overrides.
 
-```
+```bash
 lumo task status # session-bound task
 lumo task status LUM-42 # explicit task
 lumo task status --json # versioned machine-readable payload
@@ -122,108 +98,58 @@ lumo task status --json # versioned machine-readable payload
 
 ### When to run it
 
-**Status-first recovery:** run it FIRST — before re-reading code or
-
+**Status-first recovery:** run it FIRST — before re-reading code or planning —
+whenever you:
 
 - resume a task in a new session (yours or another agent's earlier work);
 - come back after a verification round was rejected (`lumo verify` failed);
-- were told the task bounced in review (REVIEW_ADDED criteria may have been
-appended at the round they surfaced — they show up here automatically).
+- were told the task bounced in review (REVIEW_ADDED criteria may have been appended at the round they surfaced — they show up here automatically).
 
 It answers "where does the loop stand": what already passed (don't redo it),
 what's unmet and why (the exact failure tails), and how many rounds are left.
 
 ### What it prints
 
-- Header
-
-- **
-
-(
-read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion,
-so the terminal rollup never reads as all-human when a checkpointer actually
-verified the work.
-- **Criteria** — every criterion as `<glyph> <id> [TYPE] SOURCE@rN
-statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its
-checkpointer and latest verdict line (evidence pointer on pass, failure
-tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
-- A passing **MACHINE** criterion's verdict line carries a machine-state tag
-derived from the read model's `machinePassed` flag, NOT the latest verdict
-(LUM-470): `· machine-verified` when a checkpointer actually passed it (even
-after a human later signs the task off), or `· human override (no machine
-pass)` when it passes only on a human sign-off with no machine run underneath.
-This keeps the terminal honest with web — a machine-verified criterion that
-a human co-signed no longer reads as a plain human pass.
-- A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion
-was changed after that verdict (reworded, or its checkpointer was swapped so
-the recorded evidence ran a different command). The pass still counts as met
-(a stale pass does not block DONE — render-only signal), but it vouches for
-an older version — **re-run `lumo verify` to re-confirm against the current
-criterion.** This is the habit whenever you edit a MACHINE criterion's
-checkpointer mid-task: change the check, then re-verify so the green is honest.
+- **Header** — task identifier/title/status + `verification round N/3` (round 0 = never verified) + an escalation warning when the machine loop is exhausted.
+- **Machine verification rollup** (LUM-470) — directly under the `Criteria` header, one line `Machine verification: N machine-verified / M human override (of T MACHINE criteria)` over the active MACHINE criteria, aligned with the web read model (LUM-456). Printed whenever the contract has ≥1 MACHINE criterion, so the terminal rollup never reads as all-human when a checkpointer actually verified the work.
+- **Criteria** — every criterion as `<glyph> <id> [TYPE] SOURCE@rN statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its checkpointer and latest verdict line (evidence pointer on pass, failure tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
+- A passing **MACHINE** criterion's verdict line carries a machine-state tag derived from the read model's `machinePassed` flag, NOT the latest verdict (LUM-470): `· machine-verified` when a checkpointer actually passed it (even after a human later signs the task off), or `· human override (no machine pass)` when it passes only on a human sign-off with no machine run underneath. This keeps the terminal honest with web — a machine-verified criterion that a human co-signed no longer reads as a plain human pass.
+- A pass can carry a **`⚠ pre-edit version`** note (LUM-457): the criterion was changed after that verdict (reworded, or its checkpointer was swapped so the recorded evidence ran a different command). The pass still counts as met (a stale pass does not block DONE — render-only signal), but it vouches for an older version — **re-run `lumo verify` to re-confirm against the current criterion.** This is the habit whenever you edit a MACHINE criterion's checkpointer mid-task: change the check, then re-verify so the green is honest.
 - **History** — one line per recorded round: `rN · timestamp · X PASS / Y FAIL`.
-- **Last round failures** — the most recent round's FAIL verdicts with their
-
-- **
-
-
-separately. Empty + rounds recorded = awaiting human adjudication.
-- **Open boundary crossings** (LUM-448) — a trailing safety block when the
-task has ≥1 OPEN (undispositioned) forbidden-action crossing: a count, then
-one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>`
-(highest-severity first), each followed by a read-only **attribution** line
-`↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469 —
-who/what crossed; any dimension that couldn't be resolved server-side prints
-`unknown`, never a fabricated value), then a pointer to the web acceptance
-panel. Silent when there are none, so it never overshadows the criteria.
-**Read-only awareness** — this surfaces crossings detected elsewhere
-(LUM-426/435/442); there is no CLI path to disposition or clear one.
-Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer
-cannot clear its own crossing from the terminal. **The check fails closed
-(LUM-480):** if the crossings read itself errors (network / server / parse),
-the block prints `⚠ Boundary-crossing check failed (network/server error) —
-could not confirm whether any are undispositioned` instead of staying silent.
-Silence means a successful read with zero open crossings, never a failed
-check — a hiccup can no longer masquerade as "all clear".
+- **Last round failures** — the most recent round's FAIL verdicts with their rejection reasons (why the last round bounced).
+- **Next actions** — the unmet criteria (latest verdict is not a pass: failed or never verified, HUMAN ones included). This list IS the plan — recomputed from the event log on every read, never maintained separately. Empty + rounds recorded = awaiting human adjudication.
+- **Open boundary crossings** (LUM-448) — a trailing safety block when the task has ≥1 OPEN (undispositioned) forbidden-action crossing: a count, then one line per crossing `• [SEVERITY] CATEGORY — <clipped detail>` (highest-severity first), each followed by a read-only **attribution** line `↳ by model=<m> · agent=<type>[/branch] · session=<8-char prefix>` (LUM-469 — who/what crossed; any dimension that couldn't be resolved server-side prints `unknown`, never a fabricated value), then a pointer to the web acceptance panel. Silent when there are none, so it never overshadows the criteria.
+- **Read-only awareness** — this surfaces crossings detected elsewhere (LUM-426/435/442); there is no CLI path to disposition or clear one. Disposition stays web + human-only (LUM-426/435/422): an agent/CLI bearer cannot clear its own crossing from the terminal.
+- **The check fails closed (LUM-480):** if the crossings read itself errors (network / server / parse), the block prints `⚠ Boundary-crossing check failed (network/server error) — could not confirm whether any are undispositioned` instead of staying silent. Silence means a successful read with zero open crossings, never a failed check — a hiccup can no longer masquerade as "all clear".
 
 ### Responding to an open crossing — `lumo crossing explain` (LUM-542)
 
-When `lumo task status` surfaces an OPEN crossing you believe is a false
-
-
+When `lumo task status` surfaces an OPEN crossing you believe is a false positive
+— or you simply want to leave a rationale for the human reviewer — append a
+self-explanation ("申辩") to it:
 
-```
+```bash
 lumo crossing explain <id> --note "this was a generated fixture, not a hand-edited migration"
 ```
 
 This is the **inverse** of dispositioning, but it is the agent/CLI path
-(bearer-only; a clerk/human caller is refused)
-
-(disposition stays web + human-only, LUM-448)
-reviewer at disposition time, kept for later review, and explicitly labeled
-
-
-targets and unbound/mismatched sessions are rejected). Earlier explanations are
-immutable — a correction is a new note.
+(bearer-only; a clerk/human caller is refused). Behavior:
+
+- it can **only append** an append-only note — it **never clears the crossing or unblocks Done** (disposition stays web + human-only, LUM-448);
+- the note is shown to the human reviewer at disposition time, kept for later review, and explicitly labeled _agent self-report · unverified_;
+- `<id>` must be a crossing on the **session-bound task** (resolved from `$CLAUDE_CODE_SESSION_ID`; cross-task targets and unbound/mismatched sessions are rejected);
+- earlier explanations are immutable — a correction is a new note.
 
 ### --json contract
 
-`--json` emits the full read model with a top-level `version` field
-
-
-
-vouches for it; LUM-456/470)
-`machineVerification` aggregate `{ total, machineVerified, humanOverridden }`
-
-
-The open boundary crossings ride along as an additive top-level
-`openCrossings` (each entry `{ id, category, severity, detail, attribution }`,
-where `attribution` is `{ workspaceMemberId, sessionId, agent, worktreeBranch,
-model }` with every field nullable — null = unknown, never fabricated; LUM-469;
-the array length is the count) — same read-only awareness, no write path.
-**`openCrossings` is `null` when the crossings check failed (LUM-480)** —
-distinct from `[]`, which is a successful read with zero open crossings. Script
-consumers must treat `null` as "unknown / could not confirm", not "safe".
+`--json` emits the full read model with a top-level `version` field (currently
+`1`). The schema is versioned: breaking shape changes bump the major; additive
+fields don't. Pin on `version` when scripting against it.
+
+- each criterion carries `machinePassed` (boolean — a checkpointer currently vouches for it; LUM-456/470);
+- the payload carries a top-level `machineVerification` aggregate `{ total, machineVerified, humanOverridden }` over the active MACHINE criteria — read these, not `latestVerdict` alone, to tell a machine-verified criterion from a human override;
+- open boundary crossings ride along as an additive top-level `openCrossings`, each entry `{ id, category, severity, detail, attribution }` where `attribution` is `{ workspaceMemberId, sessionId, agent, worktreeBranch, model }` with every field nullable — null = unknown, never fabricated (LUM-469); the array length is the count. Same read-only awareness, no write path;
+- **`openCrossings` is `null` when the crossings check failed (LUM-480)** — distinct from `[]` (a successful read with zero open crossings). Script consumers must treat `null` as "unknown / could not confirm", **not** "safe".
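A script consuming `lumo task status --json` might pin the version and keep the `null` versus `[]` distinction explicit, roughly as follows. This is a sketch, not the CLI's own code: `summarize_status` and the keys of its return value are hypothetical, and the payload fields used are only those named in the bullets above.

```python
import json

SUPPORTED_VERSION = 1  # the version this script was written against


def summarize_status(raw: str) -> dict:
    """Parse a status payload and report crossing state without guessing."""
    payload = json.loads(raw)
    if payload.get("version") != SUPPORTED_VERSION:
        raise ValueError(f"unsupported status schema version: {payload.get('version')}")

    crossings = payload.get("openCrossings")
    if crossings is None:
        crossing_state = "unknown"  # check failed (or field absent): never "safe"
    elif len(crossings) == 0:
        crossing_state = "clear"    # successful read, zero open crossings
    else:
        crossing_state = "open"

    aggregate = payload.get("machineVerification") or {}
    return {
        "crossing_state": crossing_state,
        "open_crossings": None if crossings is None else len(crossings),
        "machine_verified": aggregate.get("machineVerified", 0),
        "human_overridden": aggregate.get("humanOverridden", 0),
    }
```

Treating a missing `openCrossings` key the same as `null` is a conservative choice made for this sketch; it errs toward "could not confirm" rather than "all clear".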
 
 `status` reads; `verify` judges. Running status never starts a round, never
 escalates, and never changes task state — loop rules (cap 3, IN_REVIEW on
@@ -232,10 +158,10 @@ all-pass, human-only DONE) live entirely in `lumo verify` and the server.
 ## lumo verdict — the three verdict channels (LUM-422)
 
 `lumo verify` is the MACHINE channel. `lumo verdict` covers the other two — the
-HUMAN pass and the AGENT send-back — under one red line: **no passing data row
-
+HUMAN pass and the AGENT send-back — under one red line: **no passing data row is
+ever agent-produced.**
 
-```
+```bash
 lumo verdict --pass
 lumo verdict LUM-42 --pass
 lumo verdict --fail --reason CRITERION_UNMET --note "the retry path is still missing"
@@ -256,62 +182,56 @@ verifierType=AGENT (a channel distinct from MACHINE and HUMAN, so "machine
 all-pass but human FAIL" stays an uncontaminated signal). The verdict is
 hard-coded FAIL — there is no agent path to a passing verdict. It:
 
--
-
-
--
-free) and summarized onto the verdict row;
-- takes repeatable `--criterion <id>` to narrow the send-back; omitted, it fans
-out to the whole contract;
-- records at round = the current max (not a new round) and bounces the task back
-to IN_PROGRESS, with the unmet criteria surfacing through `lumo task status`.
+- `--reason <enum>` required (case-insensitive): `CRITERION_UNMET | EVIDENCE_INSUFFICIENT | CHECK_EXECUTION_ERROR | SCOPE_MISMATCH | OTHER` — the agent pays the structured tax a human send-back is spared;
+- `--note <text>` optional, posted as a task comment (@mentions and images for free) and summarized onto the verdict row;
+- `--criterion <id>` repeatable, narrows the send-back; omitted, it fans out to the whole contract;
+- `round` = the current max (not a new round); bounces the task back to IN_PROGRESS, with the unmet criteria surfacing through `lumo task status`.
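The case-insensitive `--reason` enum could be normalized before a send-back is recorded, along these lines. This is a sketch: `normalize_reason` and `SEND_BACK_REASONS` are hypothetical names, and only the five enum values come from the list above.

```python
SEND_BACK_REASONS = {
    "CRITERION_UNMET",
    "EVIDENCE_INSUFFICIENT",
    "CHECK_EXECUTION_ERROR",
    "SCOPE_MISMATCH",
    "OTHER",
}


def normalize_reason(value: str) -> str:
    """Upper-case the value and reject anything outside the enum."""
    reason = value.strip().upper()
    if reason not in SEND_BACK_REASONS:
        raise ValueError(f"unknown --reason: {value!r}")
    return reason
```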
 
 ### The DONE gate
 
 Once any criterion's latest verdict is FAIL — machine, AGENT, or human — moving
-the task to DONE on the agent/CLI path is refused with **409** and the
-
-
-
-actual send-back, never an un-adjudicated criterion
-left a task IN_REVIEW with no send-back standing, the agent may move it to DONE
-directly; a human-PASS row is a provable manual override, not a required ticket.
+the task to DONE on the agent/CLI path is refused with **409** and the unresolved
+items listed. Clear the send-back (fix + re-verify, or a human PASS) before
+`lumo task update <id> --status done`.
+
+- A task with no criteria, or whose criteria were never adjudicated, transitions freely — **the gate only blocks an actual send-back, never an un-adjudicated criterion.**
+- When the machine loop has left a task IN_REVIEW with no send-back standing, the agent may move it to DONE directly; a human-PASS row is a provable manual override, not a required ticket.
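The gate condition reduces to one predicate over each criterion's latest verdict. The sketch below assumes a hypothetical mapping from criterion id to `"PASS"`, `"FAIL"`, or `None` (never adjudicated); `done_gate` is an illustrative name, and the real check runs on the server.

```python
from typing import Dict, List, Optional


def done_gate(latest_verdicts: Dict[str, Optional[str]]) -> List[str]:
    """Return the criterion ids that block DONE; an empty list means the gate is open.

    Only a latest verdict of FAIL blocks. A criterion with no verdict (None)
    does not, and neither does a task with no criteria at all.
    """
    return sorted(cid for cid, verdict in latest_verdicts.items() if verdict == "FAIL")
```

For example, `done_gate({"c1": "FAIL", "c2": None})` lists only `c1`, matching the rule that an un-adjudicated criterion never blocks.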
 
 ## When a defect appears — fix in place, don't spin off a new task
 
-On a send-back **or** a self-review finding: if the issue falls under any
-
-
-
-
-
-
-This is
-
-
-
-
-
-
-
-
-
-
-
+On a send-back **or** a self-review finding: if the issue falls under any existing
+acceptance criterion of **this** task, fix it in place and re-run `lumo verify`.
+**Do not** `lumo task create` for it. New tasks are only for work genuinely
+_outside_ this task's acceptance contract. Creating a task (and PR) for in-scope
+rework launders a first-attempt failure — it bypasses the DONE gate's send-back
+protection and corrupts the flywheel signal.
+
+This is enforced: when you're mid-task, `lumo task create` refuses the bare form
+and makes you declare intent:
+
+- `lumo task create --rework-of <id>` — redirects you back to fix the existing task and creates nothing;
+- `lumo task create --new-scope` — genuinely new, separate work.
+
+If the send-back reveals the **contract itself** was wrong, amend it on this task
+(`lumo task criteria set`) rather than opening a new task — see criteria.md.
+
+**Hard rule:** while THIS task has an unresolved send-back (any criterion's latest
+verdict is FAIL — the same condition that blocks DONE), `lumo task create` is
+refused with **409 even with `--new-scope`**. A standing send-back means the task
+can't be completed yet; resolve it (fix + `lumo verify`, or amend the contract)
+before opening any new work. `--rework-of` still redirects you to it.
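The `lumo task create` rules above can be summarized as an ordered set of checks. This sketch only restates the prose: `task_create_decision` and its string results are hypothetical, and the precedence shown (redirect first, then the hard rule, then the bare-form refusal) is an assumption drawn from "`--rework-of` still redirects you to it".

```python
from typing import Optional


def task_create_decision(
    mid_task: bool,
    has_standing_send_back: bool,
    rework_of: Optional[str] = None,
    new_scope: bool = False,
) -> str:
    """Decide what a task-create request leads to while working on a task."""
    if rework_of is not None:
        # Redirects back to the existing task and creates nothing.
        return f"redirect:{rework_of}"
    if has_standing_send_back:
        # Hard rule: refused with 409 even when --new-scope is given.
        return "refuse:409"
    if mid_task and not new_scope:
        # Bare form is refused; the caller must declare intent.
        return "refuse:declare-intent"
    return "create"
```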
 
 ## Human-reported defects, once a task is submitted
 
 When someone reports a defect in conversation, your action depends on whether the
 task has **ever entered IN_REVIEW**:
 
-- **Not yet** (still your first working pass) → just fix it and continue. No
-
--
-
-
-
-
-
-
-_verdict_ — the terminal can't prove a human is behind the command
-(attribution integrity, not anti-forgery).
+- **Not yet** (still your first working pass) → just fix it and continue. No verdict needed — nothing was claimed complete, so there's nothing to contradict.
+- **Already submitted** (entered IN_REVIEW / DONE / merged) → **do not silently fix and re-pass.** Either:
+  - record your own send-back `lumo verdict --fail` (noting it was human-reported — this is _your_ honest concurrence, not a forged human verdict), or
+  - ask the reporter to record a human FAIL via the web UI / Slack (the only channel that can attribute it to a human).
+
+If the defect is a **new requirement** not covered by any criterion, first
+transcribe it with `lumo task criteria set --human`, then proceed. You can never
+write a human _verdict_ — the terminal can't prove a human is behind the command
+(attribution integrity, not anti-forgery).
package/assets/skill/references/worktree.md
@@ -24,42 +24,20 @@ lumo worktree add LUM-267 --no-fetch # branch off existing origin/main
 lumo worktree add LUM-267 --verify # run npx jest baseline after setup
 ```
 
-
-
-
-
-
-
-
-
-
-
-
-in first)
-
-
-**The husky hooks it copies in:** husky owns git hooks via
-`core.hooksPath = .husky/_`, a relative path git resolves against each
-worktree's own root. That `_` shim dir is **untracked** (husky regenerates it on
-the main checkout's `npm install`/`prepare`), so a fresh worktree would lack it
-and git would **silently skip every hook** — pre-commit (lint-staged) and
-commit-msg (LUM-405 drift-check) included. Since this repo has no GitHub CI,
-husky is the only deterministic quality gate, so `add` copies `.husky/_` into the
-new worktree (copy, not symlink — the shim is tiny and the hooks it invokes
-resolve relative to the worktree). If the main checkout itself has no `.husky/_`
-(husky never installed there), `add` prints a warning telling you to run
-`npm install` in the main checkout to regenerate it, rather than skipping
-silently.
-
-**Never run `npm install` / `npm ci` inside a worktree.** The worktree shares
-the main checkout's `node_modules` via a symlink; npm does not respect the link
-— it deletes it and reifies a full standalone `node_modules` (thousands of
-packages, ~1 min, and the shared prisma-client is gone). Run installs only in
-the main checkout, then re-create the symlink if npm replaced it. (Older npm
-could instead plant a self-referential `node_modules/node_modules` in the shared
-tree, which hard-panics Turbopack's `next build`; the `prebuild`/`predev`/
-`preanalyze` guard — `scripts/fix-nodemodules-selflink.ts` — removes that link
-automatically, but the rule above avoids the whole mess.)
+| Flag | Notes |
+| -------------- | --------------------------------------------------------------- |
+| `--base <ref>` | Branch off a ref other than origin/main (skips the fetch). |
+| `--no-fetch` | Skip `git fetch origin main` (branch off existing origin/main). |
+| `--verify` | Run `npx jest` baseline in the new worktree after setup. |
+
+Errors if the target dir already exists; reuses the branch if it already exists (adds the worktree without `-b`).
+
+### Gotchas (two it warns about, two it can't fix)
+
+- **`prisma generate` clobbers all worktrees.** The generated client lives in the shared (symlinked) `node_modules`, so a `generate` in one worktree overwrites the client every parallel worktree depends on. Verify with jest (SWC mocks Prisma); do `generate + tsc` atomically once at the end.
+- **Run jest from the worktree root** (`cd` in first). `cli/` has no jest config; running from the main checkout hits the `cli/package.json` haste collision and silently runs the wrong tests.
+- **Husky hooks are copied in for you.** Husky owns hooks via `core.hooksPath = .husky/_`, resolved relative to each worktree's root. That `_` shim is **untracked** (regenerated on the main checkout's `npm install`/`prepare`), so a fresh worktree would lack it and git would **silently skip every hook** — pre-commit (lint-staged) and commit-msg (LUM-405 drift-check). With no GitHub CI, husky is the only deterministic quality gate, so `add` copies `.husky/_` in (copy, not symlink). If the main checkout has no `.husky/_`, `add` warns you to `npm install` there rather than skipping silently.
+- **Never `npm install` / `npm ci` inside a worktree.** npm doesn't respect the `node_modules` symlink — it deletes it and reifies a full standalone tree (~1 min, shared prisma-client gone). Install only in the main checkout, then re-create the symlink if npm replaced it. (Older npm could plant a self-referential `node_modules/node_modules` that hard-panics Turbopack's `next build`; the `prebuild`/`predev`/`preanalyze` guard `scripts/fix-nodemodules-selflink.ts` removes it, but this rule avoids the mess.)
 
 ## `lumo worktree rm <LUM-N>`
 
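The husky gotcha in the hunk above comes down to whether the untracked `.husky/_` shim directory exists under a worktree's root. A minimal Node sketch of that check follows. `hooksShimPresent` is a hypothetical helper written for illustration, not part of the `lumo` CLI, and it assumes `core.hooksPath` is `.husky/_` as the text states.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not part of the lumo CLI): reports whether the
// untracked husky shim directory exists under a given worktree root.
// Assumes core.hooksPath = .husky/_ as described in the hunk above.
function hooksShimPresent(worktreeRoot) {
  const shim = path.join(worktreeRoot, '.husky', '_');
  try {
    return fs.statSync(shim).isDirectory();
  } catch {
    // Path does not exist: git would silently skip every hook here.
    return false;
  }
}
```

Calling `hooksShimPresent(process.cwd())` from a worktree root gives a quick yes/no before the first commit in that worktree.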
@@ -83,3 +61,10 @@ check when jest/tsc misbehaves). Read-only; runnable from anywhere in the repo.
 ```bash
 lumo worktree list
 ```
+
+## When to suggest worktrees
+
+- The user wants to work on a task **in isolation** from the current workspace — a parallel branch they can build/test without disturbing uncommitted work in the main checkout.
+- Starting a second task while one is in flight, or running independent tasks concurrently (one worktree per task — never two tasks in one worktree).
+- Before executing an implementation plan that needs a clean, dedicated branch off `origin/main`.
+- A `MISSING` node_modules link in `lumo worktree list` is the first thing to check when jest/tsc misbehaves in a worktree.
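The last bullet above points at the `node_modules` link as the first thing to inspect. The sketch below classifies that entry for a given worktree root. `nodeModulesStatus` is a hypothetical helper, and apart from `MISSING` (which the text mentions) the labels are invented for illustration and are not claimed to match what `lumo worktree list` prints.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not part of the lumo CLI): classifies a worktree's
// node_modules entry. Only the MISSING label comes from the text above;
// LINKED and STANDALONE are illustrative names.
function nodeModulesStatus(worktreeRoot) {
  const entry = path.join(worktreeRoot, 'node_modules');
  let stat;
  try {
    // lstat inspects the link itself rather than following it.
    stat = fs.lstatSync(entry);
  } catch {
    return 'MISSING';
  }
  if (stat.isSymbolicLink()) return 'LINKED';
  // A real directory here suggests npm replaced the symlink with its own tree.
  return 'STANDALONE';
}
```

A `STANDALONE` result is the state the "never `npm install` inside a worktree" rule is meant to prevent.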
package/dist/cli/src/commands/memory-fold.js
@@ -0,0 +1,76 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.memoryFold = memoryFold;
+const config_1 = require("../lib/config");
+const api_1 = require("../lib/api");
+const resolve_1 = require("../lib/resolve");
+const resolve_bound_task_1 = require("../lib/resolve-bound-task");
+const resolve_project_1 = require("../lib/resolve-project");
+const sanitize_1 = require("../lib/sanitize");
+function flatten(content) {
+    if (content === null || typeof content !== 'object')
+        return '';
+    return Object.values(content)
+        .map(v => (Array.isArray(v) ? v.join(' ') : typeof v === 'string' ? v : ''))
+        .filter(Boolean)
+        .join(' — ');
+}
+async function memoryFold(refArg, options) {
+    // Folding runs automatically (daily cron). The CLI is preview-only.
+    if (!options.dryRun) {
+        console.error('Topic folding runs automatically (daily). This command only previews it.\n' +
+            'Run `lumo memory fold --dry-run` to see what the next fold pass would do.');
+        return 1;
+    }
+    const creds = (0, config_1.readCredentials)();
+    if (!creds) {
+        console.error('Error: not logged in. Run `lumo auth login` first.');
+        return 1;
+    }
+    const apiUrl = (0, api_1.resolveAuthedApiUrl)(creds.apiUrl);
+    const base = (0, api_1.trimTrailingSlash)(apiUrl);
+    let projectId;
+    if (refArg) {
+        projectId = await (0, resolve_1.resolveProjectId)(base, creds.token, refArg);
+    }
+    else {
+        const bound = await (0, resolve_bound_task_1.resolveBoundTaskIdentifier)(apiUrl, creds.token);
+        if (!bound) {
+            console.error('Error: no <project-ref> given and no task bound to this session.\n' +
+                'Pass a project, or run `lumo session attach <LUM-N>`.');
+            return 1;
+        }
+        const r = await (0, resolve_project_1.resolveBoundProjectId)(apiUrl, creds.token, bound);
+        if (!r.ok) {
+            console.error(`Error: ${r.error}`);
+            return 1;
+        }
+        projectId = r.id;
+    }
+    let res;
+    try {
+        res = await fetch(`${base}/api/projects/${encodeURIComponent(projectId)}/memories/fold-candidates`, { headers: { Authorization: `Bearer ${creds.token}` } });
+    }
+    catch (err) {
+        console.error(`Error: could not reach Lumo API at ${apiUrl} (${err instanceof Error ? err.message : String(err)})`);
+        return 1;
+    }
+    if (!res.ok) {
+        console.error(`Error: fold preview failed (HTTP ${res.status})`);
+        return 1;
+    }
+    const { proposals } = (await res.json());
+    if (proposals.length === 0) {
+        process.stdout.write('No fold proposals — nothing coarse to collapse right now.\n');
+        return;
+    }
+    process.stdout.write(`${proposals.length} fold proposal(s) — DRY RUN, nothing written:\n\n`);
+    for (const p of proposals) {
+        process.stdout.write(`[${(0, sanitize_1.sanitizeField)(p.category)}] coarse card folds ${p.sourceIds.length} cards` +
+            (p.excludedIds.length
+                ? ` (${p.excludedIds.length} left ACTIVE by the coverage gate)`
+                : '') +
+            `:\n ${(0, sanitize_1.sanitizeField)(flatten(p.coarseContent))}\n` +
+            ` folds: ${p.sourceIds.map(s => (0, sanitize_1.sanitizeField)(s)).join(', ')}\n\n`);
+    }
+}
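To make the behavior of `flatten` in the file above concrete, here is a small standalone sketch that reuses the same logic on a made-up payload. The field names (`category`, `sourceIds`, `excludedIds`, `coarseContent`) follow the diff; the sample values and the `weight` key are invented for illustration, and the actual shape of `coarseContent` returned by the server is an assumption.

```javascript
// Separator written as a Unicode escape; it is the same character the diff joins with.
const SEP = ' \u2014 ';

// Same logic as flatten() in memory-fold.js: keep strings, space-join arrays,
// drop everything else, then join the surviving parts with the separator.
function flatten(content) {
  if (content === null || typeof content !== 'object') return '';
  return Object.values(content)
    .map(v => (Array.isArray(v) ? v.join(' ') : typeof v === 'string' ? v : ''))
    .filter(Boolean)
    .join(SEP);
}

// Hypothetical proposal: keys mirror the diff, values are invented.
const sample = {
  category: 'auth',
  sourceIds: ['m1', 'm2'],
  excludedIds: [],
  coarseContent: {
    title: 'Token refresh',
    points: ['rotates hourly', 'stored in keychain'],
    weight: 3, // not a string or array, so flatten() drops it
  },
};

console.log(flatten(sample.coarseContent));
console.log(JSON.stringify(flatten(null))); // non-object input yields an empty string
```

Note that numeric or nested-object values are silently discarded, so only string and array fields of a coarse card would appear in the preview line.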
package/dist/cli/src/index.js
CHANGED
@@ -70,6 +70,7 @@ const memory_rm_1 = require("./commands/memory-rm");
 const memory_show_1 = require("./commands/memory-show");
 const memory_sync_1 = require("./commands/memory-sync");
 const memory_push_1 = require("./commands/memory-push");
+const memory_fold_1 = require("./commands/memory-fold");
 const task_artifact_add_1 = require("./commands/task-artifact-add");
 const task_criteria_set_1 = require("./commands/task-criteria-set");
 const task_criteria_list_1 = require("./commands/task-criteria-list");
@@ -533,6 +534,11 @@ memoryCmd
     .option('--dir <path>', 'Memory dir (default: ~/.claude/projects/<cwd>/memory)')
     .option('--dry-run', 'List what would be pushed without sending')
     .action(wrap((opts) => (0, memory_push_1.memoryPush)(opts)));
+memoryCmd
+    .command('fold [project-ref]')
+    .description('Preview the autonomous topic-fold pass (folding itself runs daily, automatically). Requires --dry-run.')
+    .option('--dry-run', 'preview proposed subsystem cards + which fine cards they fold; writes nothing')
+    .action(wrap((ref, opts) => (0, memory_fold_1.memoryFold)(ref, opts)));
 const milestoneCmd = program
     .command('milestone')
     .description('Inspect milestones from the terminal');