@lumoai/cli 1.27.0 → 1.29.0
- package/dist/cli/src/commands/doc-section-edit.js +113 -0
- package/dist/cli/src/commands/doc-show.js +48 -1
- package/dist/cli/src/commands/doc-update.js +22 -1
- package/dist/cli/src/index.js +34 -7
- package/dist/cli/src/lib/git-task.js +9 -17
- package/dist/cli/src/lib/markdown-sections.js +12 -0
- package/dist/shared/src/markdown-sections.js +162 -0
- package/dist/shared/src/task-identifier.js +51 -0
- package/package.json +1 -1
- package/assets/skill/SKILL.md +0 -160
- package/assets/skill/references/artifacts-figma.md +0 -124
- package/assets/skill/references/criteria.md +0 -139
- package/assets/skill/references/docs.md +0 -339
- package/assets/skill/references/memory.md +0 -103
- package/assets/skill/references/milestones.md +0 -244
- package/assets/skill/references/onboarding.md +0 -102
- package/assets/skill/references/sessions.md +0 -222
- package/assets/skill/references/sprints.md +0 -157
- package/assets/skill/references/task-context.md +0 -136
- package/assets/skill/references/tasks.md +0 -357
- package/assets/skill/references/verify.md +0 -124
@@ -1,124 +0,0 @@
-# lumo verify — machine verification loop
-
-`lumo verify` is the machine half of the acceptance system (Acceptance v1,
-LUM-343). It executes every **MACHINE** criterion's checkpointer in the local
-repo, reports one structured PASS/FAIL verdict per criterion to the server,
-and prints what to do next. The judge lives server-side: round numbering, the
-3-round cap, escalation, and the IN_REVIEW transition all happen there
-(execution on the client, adjudication on the server).
-
-## The claim-done rule
-
-**Before claiming a task is complete — in conversation, in a wrap-up, or by
-touching its status — run `lumo verify`.** The loop replaces "I read the code
-and it looks done" with executed evidence.
-
-```
-lumo verify                 # session-bound task
-lumo verify LUM-42          # explicit task (overrides the session binding)
-lumo verify --timeout 900   # per-checkpointer timeout in seconds (default 600)
-```
-
-## What one round does
-
-1. Loads the task's acceptance contract and picks out MACHINE criteria.
-2. Runs each checkpointer locally (shell, cwd = current directory), one at a
-   time, echoing PASS/FAIL as it goes.
-3. POSTs the structured verdicts; the server records one VerificationRun per
-   criterion at round = previous max + 1 and mirrors each verdict as a
-   TaskActivity event.
-4. Prints the round outcome:
-   - **All PASS** → the task transitions to **IN_REVIEW** (existing state
-     machine + TASK_IN_REVIEW notification). **Stop here.** Human
-     adjudication and any HUMAN criteria take over; never set DONE yourself.
-   - **Any FAIL** → task status is untouched; the unmet criteria are printed
-     as next actions (statement, checkpointer, failure tail). Fix and re-run.
-   - **Round 3 still failing** → the loop escalates: a human is notified
-     (AGENT_VERIFY, requires action) and further `lumo verify` rounds are
-     rejected with 409. **Stop retrying**; fix only what the human directs.
-
-Exit code 0 = all passed (or nothing to run); 1 = failures, escalation, or
-errors.
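As a reading aid, the round outcome and exit-code rules in the removed text can be modelled as one pure function. This is only a sketch: `roundOutcome`, the `Outcome` labels, and `ROUND_CAP` are hypothetical names, and the real adjudication (round numbering, the cap, escalation) happens on the server, not in client code like this.

```typescript
// Hypothetical model of the round outcome rules; not the CLI's actual code.
type Verdict = "PASS" | "FAIL";
type Outcome = "NOTHING_TO_RUN" | "IN_REVIEW" | "RETRY" | "ESCALATED";

const ROUND_CAP = 3; // the 3-round cap stated in the removed document

function roundOutcome(
  round: number,
  verdicts: Verdict[],
): { outcome: Outcome; exitCode: 0 | 1 } {
  // Zero MACHINE criteria: nothing runs, and the exit code is still 0.
  if (verdicts.length === 0) return { outcome: "NOTHING_TO_RUN", exitCode: 0 };
  // Every criterion passed: the task moves to IN_REVIEW and the agent stops.
  if (verdicts.every((v) => v === "PASS")) return { outcome: "IN_REVIEW", exitCode: 0 };
  // A failing round at the cap escalates to a human; later rounds are rejected.
  if (round >= ROUND_CAP) return { outcome: "ESCALATED", exitCode: 1 };
  // Otherwise the task status is untouched and the failures become next actions.
  return { outcome: "RETRY", exitCode: 1 };
}
```

Treating the round number as an input mirrors the split described in the text: the client never decides which round it is in, it only learns it from the server.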
-
-## Verdict semantics (what the CLI sends)
-
-- checkpointer exits 0 → `PASS` with evidence `cmd:<command>#exit=0`
-- non-zero exit → `FAIL`, reason = output tail, enum `CRITERION_UNMET`
-- spawn failure / timeout → `FAIL`, enum `CHECK_EXECUTION_ERROR`
-
-evidencePointer is **not free text** — the server only accepts
-`commit:<hash>`, `file:<path>:<line>`, or `cmd:<command>#exit=<code>`.
-Verdicts are PASS|FAIL only; the agent path cannot write HUMAN verdicts or
-`PASS_WITH_FOLLOWUP` (red line — those enter via human-initiated UI paths
-only).
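The verdict mapping above is small enough to sketch directly. The types and names below (`CheckResult`, `AgentVerdict`, `failureEnum`, `toVerdict`, `isEvidencePointer`) are hypothetical, and the regular expression is only a guess at what "not free text" might mean in practice; the server's actual validation rules are not shown in this diff.

```typescript
// Hypothetical shapes; field names other than evidencePointer are assumptions.
type CheckResult =
  | { kind: "exited"; command: string; exitCode: number; outputTail: string }
  | { kind: "spawn_failure" | "timeout"; command: string };

type AgentVerdict =
  | { verdict: "PASS"; evidencePointer: string }
  | {
      verdict: "FAIL";
      failureEnum: "CRITERION_UNMET" | "CHECK_EXECUTION_ERROR";
      reason?: string;
    };

// Maps one checkpointer result to the PASS/FAIL verdict described in the text.
function toVerdict(result: CheckResult): AgentVerdict {
  if (result.kind !== "exited") {
    // The command never produced an exit code (spawn failure or timeout).
    return { verdict: "FAIL", failureEnum: "CHECK_EXECUTION_ERROR" };
  }
  if (result.exitCode === 0) {
    return { verdict: "PASS", evidencePointer: `cmd:${result.command}#exit=0` };
  }
  return { verdict: "FAIL", failureEnum: "CRITERION_UNMET", reason: result.outputTail };
}

// Assumed approximation of the three accepted evidence pointer forms.
const EVIDENCE_POINTER = /^(?:commit:[0-9a-f]{7,40}|file:.+:\d+|cmd:.+#exit=\d+)$/;

function isEvidencePointer(value: string): boolean {
  return EVIDENCE_POINTER.test(value);
}
```

Note that neither type has room for a HUMAN verdict or `PASS_WITH_FOLLOWUP`, which matches the restriction on the agent path.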
-
-## Edge cases
-
-- **No contract yet** → error pointing at `lumo task criteria set`; draft the
-  contract first (criteria.md golden rule).
-- **HUMAN-only contract (zero MACHINE criteria)** → nothing to run; the CLI
-  says so and suggests handing off for human review
-  (`lumo task update <id> --status in_review`). No server write happens.
-- **A round must cover every MACHINE criterion** — the CLI always runs all of
-  them; the server rejects partial rounds.
-- Criteria added during review (`REVIEW_ADDED`) appear in the contract and
-  are picked up automatically by the next round.
-
-## Round discipline
-
-Rounds are a hard budget of 3, not a retry loop. Between rounds, actually fix
-the failures — re-running without changes burns a round and (at round 3)
-pages a human. A FAIL round never changes task status; only an all-pass round
-moves it (to IN_REVIEW, never further).
-
-## lumo task status — the read half (self-check entry point)
-
-`lumo task status [task] [--json]` is the read-only counterpart of the loop
-(LUM-344): pure read, milliseconds, no LLM, never writes — running it costs
-nothing and burns no round. Defaults to the session-bound task; an explicit
-identifier overrides.
-
-```
-lumo task status            # session-bound task
-lumo task status LUM-42     # explicit task
-lumo task status --json     # versioned machine-readable payload
-```
-
-### When to run it
-
-**Status-first recovery:** run it FIRST — before re-reading code or
-planning — whenever you:
-
-- resume a task in a new session (yours or another agent's earlier work);
-- come back after a verification round was rejected (`lumo verify` failed);
-- were told the task bounced in review (REVIEW_ADDED criteria may have been
-  appended at the round they surfaced — they show up here automatically).
-
-It answers "where does the loop stand": what already passed (don't redo it),
-what's unmet and why (the exact failure tails), and how many rounds are left.
-
-### What it prints
-
-- Header: task identifier/title/status + `verification round N/3` (round 0 =
-  never verified) + an escalation warning when the machine loop is exhausted.
-- **Criteria** — every criterion as `<glyph> <id> [TYPE] SOURCE@rN
-  statement` (✓ latest verdict passed / ✗ failed / ○ no verdict yet) with its
-  checkpointer and latest verdict line (evidence pointer on pass, failure
-  tail on fail). `REVIEW_ADDED@rN` provenance is visible per row.
-- **History** — one line per recorded round: `rN · timestamp · X PASS / Y FAIL`.
-- **Last round failures** — the most recent round's FAIL verdicts with their
-  rejection reasons (why the last round bounced).
-- **Next actions** — the unmet criteria (latest verdict is not a pass:
-  failed or never verified, HUMAN ones included). This list IS the plan —
-  it is recomputed from the event log on every read, never maintained
-  separately. Empty + rounds recorded = awaiting human adjudication.
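The "recomputed from the event log on every read" idea can be illustrated with a small fold over verdict events. Everything here is an assumption made for illustration: the `VerdictEvent` shape and the function names are hypothetical, and the real read model is computed server-side from data this diff does not show.

```typescript
// Hypothetical event shape; the real event log is not part of this diff.
interface VerdictEvent {
  criterionId: string;
  round: number;
  verdict: "PASS" | "FAIL";
}

// Keeps only the most recent verdict per criterion (highest round wins).
function latestVerdicts(events: VerdictEvent[]): Map<string, VerdictEvent> {
  const latest = new Map<string, VerdictEvent>();
  for (const event of events) {
    const previous = latest.get(event.criterionId);
    if (previous === undefined || event.round >= previous.round) {
      latest.set(event.criterionId, event);
    }
  }
  return latest;
}

// Glyphs as described: passed, failed, or no verdict yet.
function glyph(latest: VerdictEvent | undefined): string {
  if (latest === undefined) return "○";
  return latest.verdict === "PASS" ? "✓" : "✗";
}

// Next actions = criteria whose latest verdict is not a pass
// (failed or never verified), derived fresh on every call.
function nextActions(criterionIds: string[], events: VerdictEvent[]): string[] {
  const latest = latestVerdicts(events);
  return criterionIds.filter((id) => latest.get(id)?.verdict !== "PASS");
}
```

Because nothing is stored between calls, the list cannot drift from the recorded verdicts, which is the property the text relies on when it calls this list the plan.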
-
-### --json contract
-
-`--json` emits the full read model with a top-level `version` field
-(currently `1`). The schema is versioned: breaking shape changes bump the
-major; additive fields don't. Pin on `version` when scripting against it.
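Pinning on `version` could look roughly like the following when a script consumes the `--json` output. Only the top-level `version` field and its current value `1` come from the text; `parseStatusPayload` and `SUPPORTED_VERSION` are hypothetical names, and all other fields are deliberately left untyped because their shape is not documented here.

```typescript
// The only schema fact taken from the text: a top-level `version`, currently 1.
const SUPPORTED_VERSION = 1;

// Parses raw `--json` output and refuses payloads with an unexpected version.
function parseStatusPayload(raw: string): Record<string, unknown> {
  const payload: unknown = JSON.parse(raw);
  if (typeof payload !== "object" || payload === null || Array.isArray(payload)) {
    throw new Error("status payload is not a JSON object");
  }
  const record = payload as Record<string, unknown>;
  if (record.version !== SUPPORTED_VERSION) {
    // A different version signals a breaking shape change: stop, don't guess.
    throw new Error(`unsupported status payload version: ${String(record.version)}`);
  }
  // Additive fields are tolerated: unknown keys are passed through untouched.
  return record;
}
```

Failing loudly on a version mismatch is the point of pinning: a script that keeps reading a reshaped payload would otherwise misreport where the loop stands.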
-
-`status` reads; `verify` judges. Running status never starts a round, never
-escalates, and never changes task state — loop rules (cap 3, IN_REVIEW on
-all-pass, human-only DONE) live entirely in `lumo verify` and the server.